--- tags: 進階電腦系統理論與實作 --- # 2020q3 Homework6 (kcalc) contributed by < `RinHizakura` > > [I09: kcalc](https://hackmd.io/@sysprog/2020-kcalc#I09-kcalc) > [GitHub](https://github.com/RinHizakura/kcalc-fixed) ## Character Device > [Character device drivers ](https://linux-kernel-labs.github.io/refs/heads/master/labs/device_drivers.html) ## 實作原理 :::danger 僅先研讀後續更動程式可能需要瞭解的關鍵部份,細節如果未來有需要再回頭補充 ::: ### `struct expr` ```cpp struct expr { int type; union { struct { float value; } num; struct { float *value; } var; struct { vec_expr_t args; } op; struct { struct expr_func *f; vec_expr_t args; void *context; } func; } param; }; ``` `struct expr` 是程式中關鍵的資料結構,在 `param` union 底下定義包含: * 儲存數值的 `num` 對應真實的數值 `value` * 儲存變數的 `var`,一般時候 `value` 會指向一個 `struct expr` 結構 * 表示 operation 的 `op`,其底下的 `vec_expr_t` 是一個 `struct expr`,也就是說,儲存 `struct expr` 的 buffer 底下會存在另個 buffer,形成了樹狀的結構 * 表示 function 的 `func`,其內容有儲存函式相關資訊的 `f`、儲存 function 需要變數的 `args`、以及如果有必要可以用來配置該函式所需額外空間的 `context` ### `expr_create` 為了容易理解,先從 [MathEX](https://github.com/jserv/MathEX) 而非原始作業下手,並且使用 test-simple.c 的 expression `x = 40, add(2, x)` 搭配 gdb 觀察程式。 將中斷設在 503 行的 `int n = expr_next_token(s, len, &flags);` #### Break 1: n = 1, *tok = 'x' 第一個中斷時,取出 token `x`,並且會進入 721 行的分支中,將 `id` 設成 `tok`,`idn = 1`。 ```c=719 } else { if (n > 0 && !isdigit(*tok)) { /* Valid identifier, a variable or a function */ id = tok; idn = n; } else { ... ``` #### Break 2: n = 1, *tok = ' ' 空白符符號,執行到 538 行 ```c=538 if (isspace(*tok)) { continue; } ``` #### Break 3: n = 1, *tok = '=' 執行到 564 行 ```c=564 else if ((v = expr_var(vars, id, idn))) { vec_push(&es, expr_varref(v)); paren = EXPR_PAREN_FORBIDDEN; } id = NULL; idn = 0; } ``` * `expr_var` 檢查變數 `x` 是否已經存在 `expr_var_list` 型態的 `vars` 中,若無,則建立 `x` 的 `expr_var` 節點並串到 `expr_var_list` 上 * `expr_varref` 得到一個 `expr` 節點,屬於 union 中的 `var` ,且 `value` 會設為 `v` 的 value 值 * `expr_var` 初始的 `v` value 值目前是 0 * 經 `vec_push` 後放進 `es` 結構中,察看目前 `es` 的狀態為 ``` (gdb) p es $17 = {buf = 0x55555555d2c0, len = 1, cap = 1} (gdb) p *(struct expr_var*)es.buf[0].param.var $33 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} ``` ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 1" shape=box]; p2 [label="cap = 1" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_VAR" shape=box]; p3 [label="param.var.value" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="struct expr_var" shape=plaintext]; t2 [label="value=0" shape=box]; p4 [label="name=x" shape=box]; {rank=same;} } buf->l1 p3->l2 } ``` #### Break 4: (n = 1) (*tok = '=') ```cpp else if (expr_op(tok, n, -1) != OP_UNKNOWN) { enum expr_type op = expr_op(tok, n, -1); struct expr_string o2 = { NULL, 0 }; ... for (;;) { ... enum expr_type type2 = expr_op(o2.s, o2.n, -1); if (!(type2 != OP_UNKNOWN && expr_prec(op, type2))) { struct expr_string str = { tok, n }; vec_push(&os, str); break; } ... } } ``` 接著會執行到 687 行: * `expr_op(tok, n, -1)` 得到 `OP_ASSIGN`,也就是確認 token 是 `=` operator * 建立一個 `expr_string` 結構的 `str`,內容為長度為 `n = 1` 的 `tok = "="`,放進 os 結構中 :::warning 注意到放進的 `tok` 是一個 reference 因此實際去取(`$ p tok`)不會只拿到 "=",這裡僅是用程式邏輯的角度解釋,後續的解釋也會依據此思考 ::: 可以看到目前 os 的狀態: ``` (gdb) p os.buf[0] $44 = {s = 0x55555555a00a "= 40, add(2, x)", n = 1} (gdb) p os $45 = {buf = 0x55555555d2f0, len = 1, cap = 1} ``` #### Break 5: (n = 1) (*tok = ' ') #### Break 6: (n = 2) (tok = "40") ```c=684 else if (!isnan(num = expr_parse_number(tok, n))) { vec_push(&es, expr_const(num)); paren_next = EXPR_PAREN_FORBIDDEN; } ``` * 得到 `num = 40`,放進 `es` 結構中 ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 2" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_VAR" shape=box]; p3 [label="param.var.value" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="struct expr_var" shape=plaintext]; t2 [label="value=0" shape=box]; p4 [label="name=x" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=yellow label ="es.buf[1]"; l3[label="struct expr" shape=plaintext]; t3 [label="type=OP_CONST" shape=box]; p5 [label="param.num.value=40" shape=box]; {rank=same;} } buf->l1->l3 p3->l2 } ``` ``` (gdb) p es $52 = {buf = 0x55555555d310, len = 2, cap = 2} (gdb) p es.buf[1].param.num $57 = {value = 40} ``` #### Break 7: (n = 1) (*tok = ',') ```c=687 else if (expr_op(tok, n, -1) != OP_UNKNOWN) { enum expr_type op = expr_op(tok, n, -1); struct expr_string o2 = { NULL, 0 }; if (vec_len(&os) > 0) { o2 = vec_peek(&os); } for (;;) { if (n == 1 && *tok == ',' && vec_len(&os) > 0) { struct expr_string str = vec_peek(&os); if (str.n == 1 && *str.s == '{') { struct expr e = vec_pop(&es); vec_push(&vec_peek(&as).args, e); break; } } enum expr_type type2 = expr_op(o2.s, o2.n, -1); if (!(type2 != OP_UNKNOWN && expr_prec(op, type2))) { struct expr_string str = { tok, n }; vec_push(&os, str); break; } if (expr_bind(o2.s, o2.n, &es) == -1) { goto cleanup; } (void)vec_pop(&os); if (vec_len(&os) > 0) { o2 = vec_peek(&os); else { o2.n = 0; } } } ``` * 得到 `op = expr_op(tok, n, -1) = OP_COMMA` * `vec_peek` 會得到最後一個放進的 `expr_string`,因此 `o2` 是長度為 `n = 1` 的 `tok = "="` * 執行到 `expr_bind(o2.s, o2.n, &es)`,根據 `o2` 代表的 operator 處理。以這裡的 `=` 為例,會將 `es` 倒數的兩個 `expr` 合併在一起,並放回 `es` 中,以 `param.op.args` 屬性存在(想像成是 bottom up 將兩個 `expr` 向上組合出樹狀結構) * 則此時我們知道變數 x 所代表的數值為 40 * `vec_pop` 將 `os` 的最後一個 `expr_string` pop 出來 此時 es 的狀態如下: ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 1" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=green label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=green label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } buf->l1 p3->l2 buf2->l3->l5 p6->l4 } ``` ``` (gdb) p es $67 = {buf = 0x55555555d310, len = 1, cap = 2} (gdb) p *(struct expr_var*)es.buf[0].param.op.args.buf[0].param.var $76 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} (gdb) p es.buf[0].param.op.args.buf[1].param.num.value $78 = 40 ``` * 繼續執行至 703 行的分支中,建立一個 `expr_string` 結構的 `str`,內容為長度為 `n = 1` 的 `tok = ","`,放進 os 結構中 #### Break 8: (n = 1) (*tok = ' ') #### Break 9: (n = 3) (tok = "add") ```c=719 else { if (n > 0 && !isdigit(*tok)) { /* Valid identifier, a variable or a function */ id = tok; idn = n; } } ``` * 執行到 719 行,`id = "add"`、`idn = 3` #### Break 10: (n = 1) (*tok = '(') 執行到 556 行的分支 ```cpp if (idn > 0) { ... if ((idn == 1 && id[0] == '$') || has_macro || expr_func(funcs, id, idn)) { struct expr_string str = { id, (int)idn }; vec_push(&os, str); paren = EXPR_PAREN_EXPECTED; } else { goto cleanup; /* invalid function name */ } ... id = NULL; idn = 0; } ``` * ,建立一個 `expr_string` 結構的 `str`,內容為長度為 `n = 3` 的 `tok = "add"`,放進 os 結構中 * `paren` 被 assign `EXPR_PAREN_EXPECTED` ```c=572 if (n == 1 && *tok == '(') { if (paren == EXPR_PAREN_EXPECTED) { struct expr_string str = { "{", 1 }; vec_push(&os, str); struct expr_arg arg = { vec_len(&os), vec_len(&es), vec_init() }; vec_push(&as, arg); } else if (paren == EXPR_PAREN_ALLOWED) { struct expr_string str = { "(", 1 }; vec_push(&os, str); } else { goto cleanup; // Bad call } } ``` * 接著,執行至第 572 行的分支,因為 `paren` 剛剛被設成 `EXPR_PAREN_EXPECTED`,因此會進入第 573 行的分支 * 將長度為 1 的 token `"{"` 放進 os 結構中 * 產生一個包含 `os` `es` 長度以及 `expr` 結構的 `expr_arg` 結構,並放進 `as` 中 #### Break 11: (n = 1) (*tok = '2') ```c=684 else if (!isnan(num = expr_parse_number(tok, n))) { vec_push(&es, expr_const(num)); paren_next = EXPR_PAREN_FORBIDDEN; } ``` * 得到 `num = 2`,放進 `es` 結構中 ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 2" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=green label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=green label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } subgraph cluster7{ style=filled; color=yellow label ="es.buf[1]"; l6[label="sturct expr" shape=plaintext]; t5 [label="type=OP_CONST" shape=box]; p9 [label= "param.num.value=2" shape=box]; {rank=same;} } buf->l1->l6 p3->l2 buf2->l3->l5 p6->l4 } ``` ``` (gdb) p es $107 = {buf = 0x55555555d310, len = 2, cap = 2} (gdb) p es.buf[1].param.num $110 = {value = 2} ``` #### Break 12: (n = 1) (*tok = ',') ```c=687 else if (expr_op(tok, n, -1) != OP_UNKNOWN) { enum expr_type op = expr_op(tok, n, -1); struct expr_string o2 = { NULL, 0 }; if (vec_len(&os) > 0) { o2 = vec_peek(&os); } for (;;) { if (n == 1 && *tok == ',' && vec_len(&os) > 0) { struct expr_string str = vec_peek(&os); if (str.n == 1 && *str.s == '{') { struct expr e = vec_pop(&es); vec_push(&vec_peek(&as).args, e); break; } } enum expr_type type2 = expr_op(o2.s, o2.n, -1); if (!(type2 != OP_UNKNOWN && expr_prec(op, type2))) { struct expr_string str = { tok, n }; vec_push(&os, str); break; } if (expr_bind(o2.s, o2.n, &es) == -1) { goto cleanup; } (void)vec_pop(&os); if (vec_len(&os) > 0) { o2 = vec_peek(&os); else { o2.n = 0; } } } ``` * 得到 `op = expr_op(tok, n, -1) = OP_COMMA` * `vec_peek` 會得到最後一個放進的 `expr_string`,因此 `o2` 是長度為 `n = 1` 的 `tok = "{"` * 執行到 696 行的分支,此時會 pop 出 `es` 結構中的最後一個節點,也就是之前得到的數字 2 ``` (gdb) p e.param.num $117 = {value = 2} ``` * 則 es 現在變回 ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 1" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=green label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=green label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } buf->l1 p3->l2 buf2->l3->l5 p6->l4 } ``` * 將其取代 `as` 結構的最後一個元素 ``` (gdb) p as.buf[0].args.buf[0].param.num $123 = {value = 2} ``` #### Break 13: (n = 1) (*tok = ' ') #### Break 14: (n = 1) (*tok = 'x') ```c=719 else { if (n > 0 && !isdigit(*tok)) { /* Valid identifier, a variable or a function */ id = tok; idn = n; } } ``` * 執行到 719 行,`id = "x"`、`idn = 1` #### Break 15: (n = 1) (*tok = ')') ```cpp if (idn > 0) { ... } else if ((v = expr_var(vars, id, idn))) { vec_push(&es, expr_varref(v)); paren = EXPR_PAREN_FORBIDDEN; } id = NULL; idn = 0; } ``` * 執行到 543 行,並進入 564 行的分支中 * 取出變數後,將其放入 `es` 中 ``` (gdb) p *(struct expr_var *) v $132 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} ``` * 此時 es 的狀態為: ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 2" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=green label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=green label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } subgraph cluster7{ style=filled; color=yellow label ="es.buf[1]"; l6[label="sturct expr" shape=plaintext]; t5 [label="type=OP_VAR" shape=box]; p9 [label= "param.var.value" shape=box]; {rank=same;} } buf->l1->l6 p3->l2 buf2->l3->l5 p6->l4 p9->l4 } ``` ``` (gdb) p es $133 = {buf = 0x55555555d310, len = 2, cap = 2} (gdb) p *(struct expr_var *) es.buf[1].param.var $138 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} ``` * 將 `paren` 設為 `EXPR_PAREN_FORBIDDEN` ```c=587 else if (n == 1 && *tok == ')') { int minlen = (vec_len(&as) > 0 ? vec_peek(&as).oslen : 0); ... struct expr_string str = vec_pop(&os); if (str.n == 1 && *str.s == '{') { str = vec_pop(&os); struct expr_arg arg = vec_pop(&as); if (vec_len(&es) > arg.eslen) { vec_push(&arg.args, vec_pop(&es)); } ... ``` * 執行到 587 行,並進入 600 行的分支中 * `str = vec_pop(&os)` 得到 `str` 為 `"add"` * `arg = vec_pop(&as)` 得到 `arg` 結構 * `vec_push(&arg.args, vec_pop(&es))` 會將 `es` 的最後一個節點( `x` )放進 `arg.args` 的 buffer 中,則目前 arg.args 的狀態為: ``` (gdb) p arg.args.buf[0].param.num $150 = {value = 2} (gdb) p *(struct expr_var *)arg.args.buf[1].param.var $151 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} ``` * 且 es 變成: ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 1" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=green label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=green label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } buf->l1 p3->l2 buf2->l3->l5 p6->l4 } ``` ```cpp ... else { int i = 0; int found = -1; struct macro m; vec_foreach(&macros, m, i) { if (strlen(m.name) == (size_t)str.n && strncmp(m.name, str.s, str.n) == 0) { found = i; } } ... else { struct expr_func *f = expr_func(funcs, str.s, str.n); struct expr bound_func = expr_init(); bound_func.type = OP_FUNC; bound_func.param.func.f = f; bound_func.param.func.args = arg.args; if (f->ctxsz > 0) { void *p = calloc(1, f->ctxsz); if (!p) { goto cleanup; /* allocation failed */ } bound_func.param.func.context = p; } vec_push(&es, bound_func); } ``` * 接著,執行到 624 行的分支,因為 `add` 並非定義的 macro,因此會執行到 666 行,得到一個 `expr_func` 結構,並放進 `es` 結構中 ```graphviz digraph graphname{ rankdir=LR subgraph cluster1{ label ="es"; l0 [label="vec_expr_t" shape=plaintext] buf [label="buf" shape=box]; p1 [label="len = 2" shape=box]; p2 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster2{ style=filled; color=yellow label ="es.buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=pink label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=pink label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } subgraph cluster7{ style=filled; color=yellow label ="es.buf[1]"; l6[label="sturct expr" shape=plaintext]; t5 [label="type=OP_FUNC" shape=box]; p9 [label= "param.func.f" shape=box]; p10 [label= "param.func.args" shape=box]; {rank=same;} } subgraph cluster8{ label =""; l7[label="sturct expr_func" shape=plaintext]; n1 [label="name=\"add\"" shape=box]; {rank=same;} } subgraph cluster9{ label =""; l8[label="vec_expr_t" shape=plaintext]; buf3 [label="buf" shape=box]; p11 [label= "len = 2" shape=box]; p12 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster10{ style=filled; color=green label ="buf[1]"; l9[label="sturct expr" shape=plaintext]; t6 [label="type=OP_VAR" shape=box]; p13 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster11{ style=filled; color=green label ="buf[0]"; l10[label="sturct expr" shape=plaintext]; t7 [label="type=OP_CONST" shape=box]; p14 [label= "param.num.value=2" shape=box]; {rank=same;} } buf->l1->l6 p3->l2 buf2->l3->l5 p6->l4 p9->l7 p10->l8 buf3->l10->l9 p13->l4 } ``` 至此,字串已經處理完畢,因此會跳出處理字串的 `for(;;)` 迴圈。 ```cpp=735 while (vec_len(&os) > 0) { struct expr_string rest = vec_pop(&os); if (rest.n == 1 && (*rest.s == '(' || *rest.s == ')')) { goto cleanup; // Bad paren } if (expr_bind(rest.s, rest.n, &es) == -1) { goto cleanup; } } result = (struct expr *)calloc(1, sizeof(struct expr)); if (result) { if (vec_len(&es) == 0) { result->type = OP_CONST; } else { *result = vec_pop(&es); } } ``` 繼續往下執行,執行至 735 行時,因為 `os` 中尚有未處理的 operator 會有一次 `expr_bind` 將 `es` 的末兩個元素做合併,則此時的 es 狀態經合併後為: ```graphviz digraph graphname{ rankdir=LR subgraph cluster2{ style=filled; color=green label ="buf[0]"; l1[label="struct expr" shape=plaintext]; t1 [label="type=OP_ASSIGN" shape=box]; p3 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster3{ label =""; l2[label="vec_expr_t" shape=plaintext]; buf2 [label="buf" shape=box]; p4 [label= "len = 2" shape=box]; p5 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster4{ style=filled; color=pink label ="buf[0]"; l3[label="sturct expr" shape=plaintext]; t2 [label="type=OP_VAR" shape=box]; p6 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster5{ label =""; l4[label="struct expr_var" shape=plaintext]; t3 [label="value=0" shape=box]; p7 [label="name=x" shape=box]; {rank=same;} } subgraph cluster6{ style=filled; color=pink label ="buf[1]"; l5[label="sturct expr" shape=plaintext]; t4 [label="type=OP_CONST" shape=box]; p8 [label= "param.num.value=40" shape=box]; {rank=same;} } subgraph cluster7{ style=filled; color=green label ="buf[1]"; l6[label="sturct expr" shape=plaintext]; t5 [label="type=OP_FUNC" shape=box]; p9 [label= "param.func.f" shape=box]; p10 [label= "param.func.args" shape=box]; {rank=same;} } subgraph cluster8{ label =""; l7[label="sturct expr_func" shape=plaintext]; n1 [label="name=\"add\"" shape=box]; {rank=same;} } subgraph cluster9{ label =""; l8[label="vec_expr_t" shape=plaintext]; buf3 [label="buf" shape=box]; p11 [label= "len = 2" shape=box]; p12 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster10{ style=filled; color=lightgray label ="buf[1]"; l9[label="sturct expr" shape=plaintext]; t6 [label="type=OP_VAR" shape=box]; p13 [label= "param.var.value" shape=box]; {rank=same;} } subgraph cluster11{ style=filled; color=lightgray label ="buf[0]"; l10[label="sturct expr" shape=plaintext]; t7 [label="type=OP_CONST" shape=box]; p14 [label= "param.num.value=2" shape=box]; {rank=same;} } subgraph cluster12{ label ="es"; l11 [label="vec_expr_t" shape=plaintext] buf4 [label="buf" shape=box]; p15 [label="len = 1" shape=box]; p16 [label="cap = 2" shape=box]; {rank=same;} } subgraph cluster13{ style=filled; color=yellow label ="es.buf[0]"; l12[label="struct expr" shape=plaintext]; t8 [label="type=OP_COMMA" shape=box]; p17 [label="param.op.args" shape=box]; {rank=same;} } subgraph cluster14{ label =""; l13[label="vec_expr_t" shape=plaintext]; bufx [label="buf" shape=box]; p18 [label= "len = 2" shape=box]; p19 [label="cap = 2" shape=box]; {rank=same;} } p3->l2 buf2->l3->l5 p6->l4 p9->l7 p10->l8 buf3->l10->l9 p13->l4 l11->l13 bufx->l12 p17->l1 t1->l6 } ``` ``` (gdb) p *(struct expr_var*)(es.buf[0].param.op.args.buf[0].param.op.args.buf[0].param.var).value $178 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} (gdb) p es.buf[0].param.op.args.buf[0].param.op.args.buf[1].param.num.value $179 = 40 (gdb) p es.buf[0].param.op.args.buf[1].param.func.args.buf[0].param.num.value $214 = 2 (gdb) p *(struct expr_var *)(es.buf[0].param.op.args.buf[1].param.func.args.buf[1].param.var).value $218 = {value = 0, next = 0x0, name = 0x55555555d2b0 "x"} ``` 最後,取出 `es` 頂端的元素位址作為 `*result` 並回傳。 ### `expr_eval` 對於此樹狀結構,`expr_eval()` 會從 root 開始處理對應的操作。 (暫不細究) ### `expr_parse_number` 可以從 `expr_parse_number` 了解 kcalc-fixed 如何定義定點數: ```cpp static uint64_t expr_parse_number(const char *s, size_t len) { fixedp num = {0}; int frac = 0; int dot = 0; unsigned int digits = 0; for (unsigned int i = 0; i < len; i++) { if (s[i] == '.' && dot == 0) { dot = i + 1; continue; } if (isdigit(s[i])) { if (dot) frac++; else digits++; } else return NAN_INT; } ``` * 解析字串所代表的數字,其中 `dot` 表示小數點是第幾個位置 * `digit` 表示整數部份有幾位,`frac` 表示小數部份有幾位 ```cpp static int pow10[] = { 1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000, }; if (dot) { uint32_t ipt = 0, tmp; uint32_t mask = pow10[frac]; int i = 31; sscanf(s, "%u.%u", &tmp, &ipt); while (ipt && i) { ipt <<= 1; if (ipt >= mask) { num.frac |= 1 << i; ipt %= mask; } i--; } } if (digits) sscanf(s, "%u", &num.inte); return (digits > 0 ? num.data : NAN_INT); } ``` * 對於小數點部份,想成是把它表示成二進位的表示法,例如 $0.75_{dec} = 0.11000...0_{binary}$,小數點後統一用 32 位元表示 * 則以上述例子來說,我們如何把 75 表示成 11000...0 呢?考慮整數表示的 32 bits 從左到右該為 1 還是 0,可以遞迴判斷: * `ipt <<= 1;if (ipt >= mask)`: 75 是否大於等於 50( `75 * 2 >= 100?` )? * `num.frac |= 1 << i`: 如果是則先填 1,否則填 0 * `ipt %= mask`: 捨去已經逼近的距離,以這裡為例 75 - 50 = 25 * 則下一個遞迴 `ipt <<= 1;if (ipt >= mask)` 判斷 25 是否大於等於 25(`25 * 2 >= 50?` )...以此類推 * 整數部份則直接保留其二進位表示 ## 檢驗數值的機制 為了方便進行實驗,檢驗我們的定點數實作是否有錯,因此調整程式如下: ```cpp static ssize_t dev_write(struct file *filep, const char *buffer, size_t len, loff_t *offset) { ... calc(); /* return the result of calc() for error checking */ return result; } ``` 有點投機的讓 `dev_write` 的回傳變成 `calc()` 得到的結果 ```cpp #define CALC_DEV "/dev/calc" #define RED "\x1B[31m" #define GRN "\x1B[32m" #define RESET "\x1B[0m" int fd; void test_expr(char *s, double expect) { if (expect > INT_MAX || expect < INT_MIN) expect = INFINITY; ssize_t value = write(fd, s, strlen(s)); char buffer[BUF_SIZE]; fixedp ipt = {0}; ipt.data = value; double result; if (ipt.data == NAN_INT) result = NAN; else if (ipt.data == INF_INT) result = INFINITY; else result = (int) ipt.inte + ((double) ipt.frac / UINT32_MAX); if ((isnan(expect) && isnan(result)) || (isinf(expect) && isinf(result))) printf(GRN "[PASS]" RESET ": %s == +-%f\n", s, result); else if (fabs(result - expect) < 0.00001f) printf(GRN "[PASS]" RESET ": %s == %f\n", s, result); else printf(RED "[FAIL]" RESET ": %s expect %f but get %f\n", s, expect, result); } int main() { fd = open(CALC_DEV, O_RDWR); if (fd < 0) { perror("Failed to open character device"); exit(1); } printf("\nStart bench......\n"); ... /* write something like: * test_expr("sqrt(1)", sqrt(1)); */ printf("Done.\n\n"); close(fd); return 0; } ``` 則我們可以直接對 `/dev/calc` 做寫入操作,並檢驗其回傳與在浮點數計算下的結果,就可以用來實驗自己的實作正確性了! 這個作法的好處是我們可以執行程式計算出預期的正確結果,而不需要手算正確答案。補充幾個設計的用意: * 對於超出定點數可以表達的範圍,我們假設預期結果該是 INF,因此檢查使做出相應的調整 * 在輸出的地方可以看到使用了 [ANSI escape code](https://en.wikipedia.org/wiki/ANSI_escape_code) 來使輸出有顏色,方便觀察結果,可以對照 [CSI sequences](https://en.wikipedia.org/wiki/ANSI_escape_code#CSI_sequences) 理解指令的使用: * `\x1b` 代表 escape 的 ESC,表示 escape code 的開始 * 31 對應 color table 的紅色、32 對應綠色 * 最後的 m 表示依據前面的指令做設定 ## 擴充運算器功能 ### sqrt 透過 MathEX 的 API,首先註冊一個 sqrt function。 ```cpp static struct expr_func user_funcs[] = { {"nop", user_func_nop, user_func_nop_cleanup, 0}, {"sqrt", user_func_sqrt, user_func_nop_cleanup, 0}, {NULL, NULL, NULL, 0}, }; ``` 因為 `user_func_sqrt` 中沒有額外的記憶體配置,因此 `cleanup` 就借用 `nop` 的,在 `cleanup` 階段不做任何額外處理。 ```cpp noinline uint64_t user_func_sqrt(struct expr_func *f, vec_expr_t args, void *c) { int64_t ix0 = expr_eval(&vec_nth(&args, 0)); /* for 0 or NAN or INF, just return itself*/ if (ix0 == 0 || ix0 == NAN_INT || ix0 == INF_INT) return ix0; /* first, scale our number between 1 to 4 */ int lz = __builtin_clzll(ix0); /* for negative number */ if (lz == 0) return NAN_INT; /* for range from 0 to 30 */ if (lz <= 30) { if (lz & 1) lz++; ix0 >>= (30 - lz); } /* for range from 31 to 63 */ else { if (!(lz & 1)) lz++; ix0 <<= (lz - 31); } int64_t s0, q, t, r; /* generate sqrt(x) bit by bit */ q = s0 = 0; /* [q] = sqrt(x) */ r = 0x400000000; /* r = moving bit from right to left */ while (r != 0) { t = s0 + r; // t = s_i + 2^(-(i+1)) if (t <= ix0) { // t <= y_i ? s0 = t + r; // s_{i+1} = s_i + 2^(-i) ix0 -= t; // y_{i+1} = yi - t q += r; // q_{i+1} = q_{i}+ 2^(-i-1) } ix0 += ix0; r >>= 1; } if (lz < 31) return (q >> 1) << ((30 - lz) >> 1); return (q >> 1) >> ((lz - 31) >> 1); } ``` 實作函式如上,可以看到其實是借用了 [quiz5: 測驗 2](https://hackmd.io/w6CB0mKwRE6zG1TvQaz3wg?view#%E6%B8%AC%E9%A9%97-2) 的方法,針對我們所定義的定點數結構做修改。計算方式的細節就請參考連結,這裡僅補充一些與該連結有差異的部份: * 針對特例: `0` / `NAN` / `INF` 進行處理 * 一樣是把定點數拆成 $X \times 2^Y$ 的形式,$1 \leq X < 4$、且 $Y$ 為偶數 * 因此針對 exponent $Y$ 的計算,定點數的處理是透過 `__builtin_clzll` 來求得,適當的調整使得滿足前述的條件 * 則開根號的結果為 $\sqrt x \times 2^{Y/2}$,對應後續的 return 計算 先隨便測試一些簡單的測資,確認大方向應該是沒有錯誤的: ``` Start bench...... [PASS]: sqrt(0) == 0.000000 [PASS]: sqrt(10) == 3.162278 [PASS]: sqrt(20) == 4.472136 [PASS]: sqrt(30) == 5.477226 [PASS]: sqrt(40) == 6.324555 [PASS]: sqrt(50) == 7.071068 [PASS]: sqrt(60) == 7.745967 [PASS]: sqrt(70) == 8.366600 [PASS]: sqrt(80) == 8.944272 [PASS]: sqrt(90) == 9.486833 [PASS]: sqrt(100.000000) == 10.000000 [PASS]: sqrt(100.099998) == 10.004999 [PASS]: sqrt(100.199997) == 10.009995 [PASS]: sqrt(100.299995) == 10.014989 [PASS]: sqrt(100.399994) == 10.019980 [PASS]: sqrt(100.499992) == 10.024968 [PASS]: sqrt(100.599991) == 10.029955 [PASS]: sqrt(100.699989) == 10.034938 [PASS]: sqrt(100.799988) == 10.039920 [PASS]: sqrt(100.899986) == 10.044899 [PASS]: sqrt(100.999985) == 10.049875 Done. ``` 下一步,需要考慮在此實作方法的有效數字範圍,以及是否有可能出現產生特例的數字。 * 此作法對於哪個範圍外的數字會出現錯誤(非誤差)? * 此作法對於哪個範圍外的數字會出現可能不能接受的誤差? * 為了滿足條件而把數字挪到 1 至 4 間,這表示顯然在數字大的時候,小數的部份可能會有大量的捨去,這可能導致精確度十分低 * 在 [quiz5: 測驗 2](https://hackmd.io/w6CB0mKwRE6zG1TvQaz3wg?view#%E6%B8%AC%E9%A9%97-2) 的方法中,最後操作浮點數進行 rounding,但是此處我想還是儘量避免使用浮點數,這是否會導致精確度不佳? #### 有效的數字範圍 先考慮整數部份的情況下,數字的上限是有號 int32 可以表示的上限 `INT_MAX=2147483647` ```cpp test_expr("INT_MAX=2147483647, sqrt(INT_MAX)",sqrt(INT_MAX)); test_expr("INT_MAX=2147483647, sqrt(INT_MAX+1)",sqrt(INT_MAX+1.0f)); ``` ``` [PASS]: INT_MAX=2147483647, sqrt(INT_MAX) == 46340.950001 [FAIL]: INT_MAX=2147483647, sqrt(INT_MAX+1) expect 46340.950012 but get inf ``` 如果考慮小數點精確的問題,以 INT_MAX 為例,leading zero 等於 1,依據我們設計的演算法,會對數字 right shift 28 位元,等於會把定點數用來表示小數點的 32 位元中,最後的 28 位元去除。因此小數點位數在 $2^{-5}$ 以下有差異的數字會被當成相同數字處理。 ``` [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.500000) == 46340.950005 [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.250000) == 46340.950001 [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.125000) == 46340.950001 [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.062500) == 46340.950001 [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.031250) == 46340.950001 ... [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.000004) == 46340.950001 [PASS]: INT_MAX=2147483647,sqrt(INT_MAX+0.000002) == 46340.950001 ``` 不過因為整數部份太大,所以捨棄小數點部份導致的誤差可能是可以接受的? #### 精準度瑕疵 比較 sqrt(2) 的計算結果,並且觀察至小數點後第 11 位。 ``` double sqrt(2) == 1.41421356237 fixed sqrt(2) == 1.41421356225 ``` 可以看到因為精度不足的原因,加上我們的 sqrt 的實作瑕疵,對最後一個 bit 是沒有考量如何 rounding 的,在後面的位數中出現了偏差,需要思考如何對原始的實作再進行調整。 ### sigma 首先,同樣先註冊 sigma: ```cpp static struct expr_func user_funcs[] = { {"nop", user_func_nop, user_func_nop_cleanup, 0}, {"sqrt", user_func_sqrt, user_func_nop_cleanup, 0}, {"sigma", user_func_sigma, user_func_nop_cleanup, 0}, {NULL, NULL, NULL, 0}, }; ``` 接著,設計 sigma 的實作: ```cpp noinline uint64_t user_func_sigma(struct expr_func *f, vec_expr_t args, void *c) { struct expr *v = &vec_nth(&args, 0); int64_t lower = expr_eval(&vec_nth(&args, 2)); int64_t upper = expr_eval(&vec_nth(&args, 3)); /* return if bad function call */ if (lower > upper) return NAN_INT; int64_t sum = 0; int max_iter = INT_MAX; for (int64_t i = lower; i <= upper && max_iter > 0; i += (1UL << 32), max_iter--) { (*(struct expr_var *) (v->param.var.value)).value = i; int64_t tmp = expr_eval(&vec_nth(&args, 1)); if(tmp == NAN_INT || tmp == INF_INT) return tmp; int64_t new_sum; if(__builtin_add_overflow(sum ,tmp, &new_sum)){ pr_info("calc: sigma, overflow occur\n"); return INF_INT; } sum = new_sum; } if(max_iter <= 0) return NAN_INT; return sum; } ``` 從前面章節對 `expr_create` 的觀察,我們可以知道對於同名稱的變數,其樹狀結構上的節點會指向相同的 reference,我們可以善用這點來設計 sigma,使得 sigma 可以根據使用者定義的算式來計算結果。舉例來說,`"sigma(n, 2 * n, 1, 10)"` 這個 expression,對應的數學式為: $$ \sum\limits_{n = 1}^{10}{2n} $$ 也就是說: * 第一個參數為 $\Sigma$ 中的 index * 第二個參數為根據該 index 設計的算式 * 第三個參數為 lower bound * 第四個參數為 upper bound 則回顧函式的設計: * 我們取出第 0 個參數(假設是 `n`)的 `struct expr` 結構,且在迴圈中調整其數值 * 則因為第一個參數的算式中 `n` 的 `struct expr` 也會 reference 到同一個 `sturct expr_var`,因此透過 `expr_eval` 去 evalute 這個參數時就可以得到不同 index 的代入結果 * 迴圈中使用 `max_iter` 限制迴圈的 iteration 數量,這是為了避免 `lower` 的 overflow 可能導致迴圈無法終止,程式卡死在 kernel 中 * 使用 gcc 中提供的 [`__builtin_add_overflow`](https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html),進行 overflow 的檢查並對應處理 接著,在測試程式中加入以下 macro ```cpp #define SIGMA(ans, i, expr, start, end) \ ans = 0; \ for (i = start; i <= end; i++) { \ ans += expr; \ } ``` 則我們可以透過類似以下的程式來測試結果: ```cpp int64_t n; double ans; SIGMA(ans, n, 2 * n, 1, 10); test_expr("sigma(n, 2 * n, 1, 10)", ans); ``` 隨便測試幾個測資,確認大方向上沒有問題: ```diff Start bench...... [PASS]: sigma(n, n, 1, 10) == 55.000000 [PASS]: sigma(n, 2 * n, 1, 10) == 110.000000 [PASS]: sigma(n, 3 * n + 2, 20, 33) == 1141.000000 [PASS]: sigma(n, n * 1.1 + 5, 10, 30) == 567.000000 [PASS]: sigma(n, sqrt(n), 1, 10) == 22.468278 [PASS]: INT_MAX=2147483647, sigma(n, n, INT_MAX - 1, INT_MAX) == +-inf [PASS]: INT_MIN=-2147483648, sigma(n, n, INT_MIN, INT_MIN+1) == +-inf Done. ``` #### 有效的數字範圍 對於 $\Sigma$ 而言,最需要小心的點應該是加到超出定點數可以表達的上限或下限。從前面的測資可以看到,我們所設計的 `sigma` 會自動去將 overflow 一併以回傳 `INF_INT` 處理。 #### 誤差累積 考慮以下測資: ``` [FAIL]: x = 0.01, sigma(n, x, 1, 50000) expect 500.000000 but get 499.999989 ``` 可以看到達 50000 個 iteration 時,原本的實作的精度問題明顯的被顯示出來。從程式去觀察 `0.01` 被轉換成定點數再轉換回浮點數時的實際數字,可以看到其實會是 `0.00999999978`,其中的誤差是造成上述測資出現錯誤的原因。 ## 修正錯誤結果 ### 乘法 overflow 考慮會造成 overflow 的測資 `INT_MAX * 2`,可以看到非預期的計算結果。 ``` [FAIL]: INT_MAX = 2147483647, INT_MAX * 2 expect inf but get -2.000000 ``` 為此,我們要修正程式的 `mult` ```cpp static uint64_t mult(uint64_t a, uint64_t b) { fixedp fa = {.data = a}, fb = {.data = b}; /* (a + b) * (c + d) = ac + ad + bc + bd */ int32_t tmp; if(__builtin_mul_overflow(fa.inte, fb.inte, &tmp)){ pr_info("calc: mult, overflow occur\n"); return INF_INT; } fixedp result = {.inte = tmp}; uint64_t ad = (GET_NUM(a) >> 32) * GET_FRAC(b); uint64_t bc = GET_FRAC(a) * (GET_NUM(b) >> 32); return result.data + ad + bc; } ``` 注意到 `tmp` 使用的型態是 `int32_t`,這是因為雖然數字是透過 `uint32_t` 儲存,但是從 `eval.c` 的轉換中我們知道其中還是有 sign bit 的概念,所以 overflow 的處理需要從有號數型態去考量。 但是上面的修改還不足夠,考慮以下測資。 ``` [FAIL]: INT_MAX = 2147483647, INT_MAX * 1.5 expect inf but get 2.000000 ``` 對照原本的修改,不難看出是因為 overflow 的檢查不夠周全,因此需要再對程式進行更正。 ```cpp static uint64_t mult(uint64_t a, uint64_t b) { fixedp fa = {.data = a}, fb = {.data = b}; /* (a + b) * (c + d) = ac + ad + bc + bd */ int32_t tmp; if(__builtin_mul_overflow(fa.inte, fb.inte, &tmp)){ pr_info("calc: mult, overflow occur\n"); return INF_INT; } fixedp result = {.inte = tmp}; uint64_t ad = (GET_NUM(a) >> 32) * GET_FRAC(b); uint64_t bc = GET_FRAC(a) * (GET_NUM(b) >> 32); int64_t ans; if(__builtin_add_overflow(ad, bc, &ans)){ pr_info("calc: mult, overflow occur\n"); return INF_INT; } if(__builtin_add_overflow(result.data, ans, &ans)){ pr_info("calc: mult, overflow occur\n"); return INF_INT; } return (uint64_t)ans; } ``` 經此修改後,測試以下的測資,包含應該如同從前運作,以及檢查出 overflow 者,並得到結果: ``` [PASS]: INT_MAX = 2147483647, INT_MAX * 2 == +-inf [PASS]: INT_MAX = 2147483647, INT_MAX * 1.5 == +-inf [PASS]: 9 * 5 * 7 == 315.000000 [PASS]: 1.5 * 4 * 100 == 600.000000 ``` ### 乘法的實作錯誤 閱讀程式時會注意到,註解表示乘法的實作是拆解成 `(a + b) * (c + d)` 也就是分別處理兩個數字的整數部份與小數部份之相乘結果再總和起來。但是仔細閱讀原本的設計會發現兩個數字的小數相乘部份似乎被遺漏掉了,我們可以透過幾個測資來驗證這點。 ``` [FAIL]: 3.5 * 3.5 expect 12.250000 but get 12.000000 [FAIL]: 3.5 * 3.75 expect 13.125000 but get 12.750000 [FAIL]: 1.1 * 1.7 expect 1.870000 but get 1.800000 [FAIL]: 0.5 * 0.5 expect 0.250000 but get 0.000000 ``` 對此,需要把小數部份相乘的結果也補充回去。針對小數部份,要求出正確的結果,目前想到的方法是透過 shift 的方式來求得。其概念為,例如 $(a \times 0.75)$ 的結果可以表示為: $$ a \times 0.5 + a \times 0.25 $$ 詳細請見以下的程式碼求出 `bd` 的部份: ```cpp static uint64_t mult(uint64_t a, uint64_t b) { fixedp fa = {.data = a}, fb = {.data = b}; /* (a + b) * (c + d) = ac + ad + bc + bd */ int32_t tmp; if (__builtin_mul_overflow(fa.inte, fb.inte, &tmp)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } fixedp result = {.inte = tmp}; uint64_t ad = (GET_NUM(a) >> 32) * GET_FRAC(b); uint64_t bc = GET_FRAC(a) * (GET_NUM(b) >> 32); uint64_t bd = 0; uint32_t frac_a = GET_FRAC(a); uint32_t frac_b = GET_FRAC(b); /* calculate 'bd' by moving bit */ while (frac_b != 0) { int shift = __builtin_clz(frac_b) + 1; frac_a >>= shift; bd += frac_a; frac_b <<= shift; } int64_t ans; if (__builtin_add_overflow(ad, bc, &ans)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } if (__builtin_add_overflow(ans, result.data, &ans)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } if (__builtin_add_overflow(ans, bd, &ans)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } return (uint64_t) ans; } ``` 則可以得到正確結果: ``` [PASS]: 3.5 * 3.5 == 12.250000 [PASS]: 3.5 * 3.75 == 13.125000 [PASS]: 1.1 * 1.7 == 1.870000 [PASS]: 0.5 * 0.5 == 0.250000 ``` 換一下可能更複雜的測資再確認一下,順利通過,happy! ``` [PASS]: 1.123 * 1.789 == 2.009047 [PASS]: 12345.6789 * 9876.5432 == 121932630.989172 [PASS]: 99.81 * 88.001 == 8783.379810 ``` ### 負數乘法的處理 經 [YLowy](https://hackmd.io/@YLowy) 同學的提醒(:pray::pray:),注意到在上面的實作中,我們並沒有仔細考慮對正負號乘法的處理。當乘法的兩個數字有其一是負數時,前面作法的程式邏輯會當成是 overflow 處理。因此,我們需要再對其做相應的修改。這裡比較首先要了解的議題是,到底 kcalc-fixed 的定點數是如何表示負數的呢? 回顧 `expr_parse_number` ,事實上,此函數所接受的數字字串只會有數字本身,而不包含該數字前面的任何 unary operator (例如 `-`) 的。因此這個函式相當於是先把一個表達正數的字串轉換成對應的正定點數表示。 然而,在進入定義的運算函數前,如這裡的 `mult` 時,為了得到正確的結果,如果輸入的兩個數字是帶有 `-` 的,就需要替換成正確的定點數表示。舉例來說,`-3 * -5` 這個 expression 在呼叫 `mult` 時,應該對應到 `mult(-3 的定點數編碼,-5 的定點數編碼)`。 而這點在原本的程式中已經有對應的處理,因為根據優先權,parser 會先檢查到 `expr_eval` 中的 case `OP_UNARY_MINUS`,因此執行 `minus(0, expr_eval(&e->param.op.args.buf[0]))` 將計算定點數成負定點數。觀察 minus 的實作的話,先不考慮處理 underflow 等特例,unary `-` 就只是用 0 減去原始的定點數編碼。 ```cpp static uint64_t minus(uint64_t a, uint64_t b) { return a - b; } ``` 例如以下的範例 ``` 3.25 = 0x00000003 40000000 -> -3.25 = (0 - 3.25) = 0xFFFFFFFC C0000000 ``` 這個表示符合依然 `eval.c` 中的轉換運算 ``` -3 = 0xFFFFFFFC C0000000 -> (int) 0xFFFFFFFC + ((double)C0000000 / UINT32_MAX) -> -4 + 0.75 = -3.25 ``` :::info 註: unsinged int 的運算沒有 overflow, 所以例如 `(unsigned int) 0 - (unsigned int) 3` 的結果是被明確定義的,根據 C 語言規格書 6.2.5 第 9 點: > A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type 結果會是 (- 3 + UINT_MAX + 1) ::: 至此,我們可以理解到要把定點數做正負之間的轉換,其實就是直接加上 `-`,沒有其他的複雜的操作需要處理。有了這個認識之後,就可以來實作正確處理正負數乘法的 `mult` 了! ```cpp static uint64_t mult(uint64_t a, uint64_t b) { /* take the sign bit of fixed-point representation */ uint32_t sign_bit_a = a >> 63; uint32_t sign_bit_b = b >> 63; uint64_t final_sign_bit = sign_bit_a ^ sign_bit_b; /* convert both a and b to postive * you can fix them to a branchless expression */ a = sign_bit_a ? -a : a; b = sign_bit_b ? -b : b; fixedp fa = {.data = a}, fb = {.data = b}; /* (a + b) * (c + d) = ac + ad + bc + bd */ int32_t tmp; if (__builtin_mul_overflow((int64_t)fa.inte, (int64_t)fb.inte, &tmp)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } fixedp result = {.inte = tmp}; int64_t ad = (GET_NUM(a) >> 32) * GET_FRAC(b); int64_t bc = GET_FRAC(a) * (GET_NUM(b) >> 32); int64_t bd = 0; uint32_t frac_a = GET_FRAC(a); uint32_t frac_b = GET_FRAC(b); /* calculate 'bd' by moving bit */ while (frac_b != 0) { int shift = __builtin_clz(frac_b) + 1; frac_a >>= shift; bd += frac_a; frac_b <<= shift; } int64_t ans; if (__builtin_add_overflow(ad, bc, &ans)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } if (__builtin_add_overflow(ans, result.data, &ans)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } if (__builtin_add_overflow(ans, bd, &ans)) { pr_info("calc: mult, overflow occur\n"); return INF_INT; } return (uint64_t)(final_sign_bit ? -ans : ans); } ``` :::warning 注意到這裡對初始化的型態為 signed 或 unsigned 重新調整,該使用有號還是無號數可不是隨便設計的!要考量預期可以檢查到 overflow 的情境、使用 right shift 的結果 ::: 類似於原來的實作,但是為了方便處理有號數的乘法,我先將 `a` 和 `b` 的 sign bit 提出並轉換成正數,然後先對正數的乘法進行處理後(方法則同前面的章節說明),並於返回時再根據乘法的正負號規則轉換成正確的結果。 ```diff [PASS]: sqrt(2) == 1.41421356225 [PASS]: INT_MAX = 2147483647, INT_MAX * 2 == +-inf [PASS]: INT_MAX = 2147483647, INT_MAX * 1 == 2147483647.00000000000 [PASS]: INT_MIN = -2147483648, INT_MIN * 2 == +-inf - [FAIL]: INT_MIN = -2147483648, INT_MIN * 1 expect -2147483648.000000 but get inf [PASS]: -3.5 * 3.75 == -13.12499999980 [PASS]: -0.5 * -0.25 == 0.12500000003 [PASS]: 0.25 * -1.9 == -0.47499999974 [PASS]: 12345.6789 * 9876.5432 == 121932630.98917217553 [PASS]: 12345.6789 * -9876.5432 == -121932630.98917217553 ``` 透過一些簡單的測資確認實作的方向正確,可以看到大致的測資皆符合。但是其中 `INT_MIN * 1` 的結果出現錯誤。不難想像其原因是因為我們將數字都先轉成正數處理,但是合法的 -2147483648 轉成正數卻超過了其只能表示到的 2147483647,因此出現非預期的結果,顯然我們需要針對此特例做額外的處理。 ### 加法 overflow ``` [FAIL]: INT_MAX = 2147483647, INT_MAX + INT_MAX expect inf but get -2.000000 ``` 對於加法的處理與乘法同理: ```cpp static uint64_t plus(uint64_t a, uint64_t b) { int64_t ans; if (__builtin_add_overflow((int64_t)a, (int64_t)b, &ans)) { pr_info("calc: add, overflow occur\n"); return INF_INT; } return (uint64_t) ans; } ```