2017q3 Homework3 (simulator)

# 2017q3 Homework3 (simulator) contributed by <`tina0405`> >記得固定格式喔！ >[name=課程助教][color=red] ### 背景知識 - [x] 閱讀 [現代處理器設計](http://hackfoldr.org/cpu) 教材和觀看裡頭所有錄影,記錄你的心得和疑惑 * 影片 [淺談虛擬機器設計與實作](https://www.youtube.com/watch?v=izy3TUBAnio&feature=youtu.be) * 資料來源： [Wikipedia](https://en.wikipedia.org/wiki/Cross_compiler) :::danger 是 Wikipedia，而非 wiki，兩者不同 :notes: jserv ::: * cross compiler: is a compiler capable of creating executable code for a platform other than the one on which the compiler is running * 例子： a compiler that runs on a Windows 7 PC but generates code that runs on Android smartphone is a cross compiler. * 所需背景知識 * turing complete * turing machine * compiler * linker * LLVM 現在已不可以稱作 Low Level Virtual Machine * 具有完整編譯器架構 * 一開始在 Frontend 資訊會比較不明確（如投影片中例子是 int），也許還會有註解，在 IR (Intermediate Representation) 就會更清楚說明 32-bits,到了 Backend 甚至會直接告訴你偏移量 ~~~ Frontend LLVM Backend C/C++ -> IR -> X86 組語 ~~~ * 補：為什麼我們需要 IR？（[你所不知道的 C 語言：編譯器和最佳化原理篇](https://hackmd.io/s/Hy72937Me) 提到） * 因為如果今天有很多種 Frontend (比如：C/C++) 要轉成各個 Backend 的 target （比如：ARM/X86）會有很多不同的組合，如果我們中間有一個 IR ，就大家可以專心在一件事身上 * 利用 Brainfuck instruction 來得到 C 的執行檔： * 產生 C 的執行檔或 ELF ，可以透過 Brainfuck instruction （不同類型的指令）來產生 ~~~ Brainfuck instruction | Brainfuck Compiler / \ ELF C 的執行檔 ~~~ * Brainfuck 看似只有 8 種符號，（後半段提及）但若是實現 Nested Loop （巢狀迴圈）或複雜結構卻相對複雜 | Character | C | Meaning | | -------- | -------- | -------- | | > | ++p | increment the data pointer (to point to the next cell to the right) | | < | - -p |decrement the data pointer (to point to the next cell to the left)| | + | ++*p |increment (increase by one) the byte at the data pointer.| | - | - -*p |decrement (decrease by one) the byte at the data pointer.| | . | putchar(*p)|output the byte at the data pointer.| | ， | *p=putchar() |accept one byte of input, storing its value in the byte at the data pointer.| | [ | while(*p){ |if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command.| | ] | } |if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command.| * 舉例 ~~~ brainfunction: [-] C: Clean while(cell[0]) { --cell[0]; } ~~~ * 影片中 Interpreter 和 Static Compiler 的比較 * 參考資料 [compiler interpreters assemblers 比較](http://www.microcontrollertips.com/compilers-translators-interpreters-assemblers/) / [wiki](https://en.wikipedia.org/wiki/Just-in-time_compilation) * Interpreter: translates code like a compiler but reads the code and immediately executes on that code, and therefore is initially faster than a compiler(中間不產 object file) * Compiler: Compilers convert high-level language code to machine (object) code in one session * Assembler： An assembler translates a program written in assembly language into machine language * 影片中提到的 JIT compiler（ Just-in-time compilation ） * IN-TIME：及時（特性:收到 work 就馬上去做，因為有 delay 才有相對的 in-time） * REAL-TIME：即時（特性：在 deadline 以前完成 work） * 而通常程序有兩種進行方式是上述的： Dynamic Interpreter 和 Static Compiler，但 JIT compiler 集合兩種特性： * dynamic translation, is compilation done during execution of a program – at run time – rather than prior to execution. * 然後呈現的結果是 Dynamic（JIT compiler）不做最佳化的效率贏過 Static Compiler 做最佳化 * 碎形（fractal） ### 研究程式碼 - [ ] 研究給定的 [full-stack-hello](https://github.com/sysprog21/full-stack-hello) - [ ] 學習其中 ISA 和模擬器實作，並在 GitHub 上 fork [full-stack-hello](https://github.com/sysprog21/full-stack-hello) - [x] 對照研讀 [你所不知道的 C 語言：編譯器和最佳化原理篇](https://hackmd.io/s/Hy72937Me) 教材和觀看直播錄影，以得知編譯器運作原理 * 搭配影片 [你所不知道的 C 語言：編譯器和最佳化原理 (上)](https://www.youtube.com/watch?v=XY4YkuX_a3k) / [(下)](https://www.youtube.com/watch?v=J2lZmh4PXTc) * 編譯器的限制 * 太多 #if 的 macro 會導致不能處理（行號錯亂） * 有些嵌入式系統的限制（不准使用太多的 STACK 或 MALLOC） * 如果我們理解編譯器，就可以從中修改 * 如果不理解最佳化的話，在對實驗開最佳化可能造成錯誤 * [Compiler design] * DFA （DETERMINISTIC FINITE AUTOMATA） * 每一個 input 的集合一定要對應到一個 status * 每一個 input 的集合只能要對應到一個 status * NFA （NONDETERMINISTIC FINITE AUTOMATA） * NFA 轉 DFA 的所付出的成本會比直接找 DFA 來的低 * 簡易的 NFA 轉 DFA ~~~ ~~~ * shell 會執行 fork + execve() * 在系統呼叫程式時，都會有一個 main 的程式進入點 [(source)](http://refspecs.linuxbase.org/LSB_2.0.1/LSB-PDA/LSB-PDA/baselib---libc-start-main-.html) ~~~ __libc_start_main : function shall initialize the process, call the main function with appropriate arguments, and handle the return from main. __libc_start_main is not in the source standard; it is only in the binary standard. ~~~ * 並不是每一個都是 Compiler -> Assembler -> Linker 是因為除錯方便和有效的模組化（有些沒有 Assembler ，直接由 Compiler 產生的 object code -> linker） * gcc : compiler driver * ld : GNU linker * (DSL) domain specific language * 原來的 vm 所支援的功能 ~~~ Usage: as_exec [-w] [-x] [-o <out_file>] <in_file> -w Assemble <in_file> and write to an ELF file, see -o below -o if -w is specifed, <out_file> is used to store the object code -x Load <in_file> and execute it <in_file> the file name to be used by commands above ~~~ ### 實作 Fibonacci 數列 - [ ] 閱讀 [你所不知道的C語言：遞迴呼叫篇](https://hackmd.io/s/rJ8BOjGGl) 和觀看對應講座錄影，用 [full-stack-hello](https://github.com/sysprog21/full-stack-hello) 的指令實作 Fibonacci 數列 - [x] 需要包含 recursive 和 non-recursive 實作，分別置於檔案 `tests/fib-recursive.s` 和 `tests/fib-iterative.s` - [x] 修改 `vm.c` (和對應的原始程式碼)，允許執行時期由使用者 (和 `Makefile`) 帶入參數，如 `--input`，這數值應該透過暫存器作為上述 Fibonacci 數列程式的 `Fib(N)` - [ ] 研究現有 [full-stack-hello](https://github.com/sysprog21/full-stack-hello) 的自動測試程式，整合上述兩種 Fibonacci 程式實作，並且比較效能 (需要包含 vm 內部的 cycle count) * [Fibonacci number](https://en.wikipedia.org/wiki/Fibonacci_number)　：　the Fibonacci numbers are the numbers in the following integer sequence, called the Fibonacci sequence * 定義：　every number after the first two is the sum of the two preceding ones * 像這樣　1,1,2,3,5,8,13,21,24.. * 但現在的使用，有時候會給 0 當初始值　0,1,1,2,3,5,8,13，24... * Fib(N) 就是要求出數列中的第 N 個為多少： * 補：目前的 N 是輸入的 N 減 2 ，還沒扣 R_1 R_2 * line8 到 line13 相當於 for 迴圈 #### iteration ~~~= or $0 $0 #1 ;R_1 : initial the first number or $0 $1 #2 ;R_2 : initial the second number or $0 $0 #3 ;R_3 for temp or $0 $0 #0 ;R_0 for input jz #0 #12 ;if 0,print 0 or $0 #2 #3 ;temp = R_2 add #1 #2 #2 ;R_2 = R_1 + R_2 or $0 #3 #1 ;R_1 = temp sub #0 $1 #0 ;input-1 jnz #0 #5 ;jump back until it become 0 print #2 halt print $0 halt ~~~ #### recursive * 一開始如果用 C 去想，會長這樣 ~~~c= int fib(int x) { if (x == 0) return 0; if (x == 1) return 1; return fib(x-1)+fib(x-2); } ~~~ * 拿 5 為例:我們是一直減直到找到 1 ， 0 才 return ~~~ fin(5) / \ fin(4) fin(3) / \ / \ fin(3) fin(2) fin(2) fin(1) / \ / \ / \ / \ fin(2)fin(1)fin(1)fin(0)..... / \ fin(1) fin(0) ~~~ * 但如果我只想用 1 個 register 記錄 N 回傳回去時要 +2 因為做了兩次減 1 像這樣 fib(x-1)+fib(x-2) ~~~= or $0 $0 #1 ;r_1=output or $2 $0 #0 ;r_0=input call #5 print #1 halt xor #0 $1 #2 ;r_2=compare register jnz #2 #9 ;jump,if!=1 add #1 $1 #1 ;if==1,output add ret xor #0 $0 #2 jnz #2 #12 ;jump,if!=1 ret ;if==0,return sub #0 $1 #0 call #5 sub #0 $1 #0 call #5 add #0 $2 #0 ret ~~~ * 關於 `--input` 的設計，我想要他和 `-w` 一起打包成一個 elf ，所以不和 `-x` 一起使用 ~~~c ... } else if (!strcmp(argv[i], "--input")) { if (!argv[i + 1]) FATAL(-1, "Fib(N) need int as a parameter , see -h\n"); if (req == LOAD_ELF_AND_EVAL) FATAL(-1, "-x and --input used together, see -h\n"); int Fib_N = atoi(argv[++i]); } ... ~~~ :::danger `driver.c` 本身不該知道特定程式的存在，這些錯誤訊息應該移到組合語言程式中。`driver.c` 只檢查輸入的型態。注意看 [atoi](http://www.cplusplus.com/reference/cstdlib/atoi/) 的回傳值，我們可能會分不清楚 `0` 來自使用者輸入，抑或是錯誤的表達式 :notes: jserv ::: >>了解了，謝謝老師提點! [name=tina0405] * 先處理如果有 `--input` ，將他的參數放進 register_0 * 這邊我預設 register_0 就是要給 `--input` 參數放置的 * 我使用 sprintf ，如果有 `--input` 如同一開始就有一行組語是在初始化暫存器 * 傳進去的 input 是我再 driver.c 定義，沒用到為 -1 * 20 的空間是包括原本字串長度和可能的數字長度 ~~~c= void assemble_from_fd(vm_env *env, int fd, int input) { ... if (input != -1) { char str[20]; /* INT_MAX = 2147483647 */ sprintf(str, "or #0 $%d #0", input); assemble_line(env, str); } ... } ~~~ * 再來思考如何區分 atoi 的錯誤值（預設是0）和使用者真的輸入 0 * 思考的方向是利用 ASCII code，在 48~57 之間為數字 * 有錯誤的話傳送 error signal 再組語中印出 * 錯誤訊號存在 rigister 5 ~~~c= } else if (!strcmp(argv[i], "--input")) { ... for (int k = 0; k < strlen(argv[i]); k++) { if (argv[i][k] < 48 || argv[i][k] > 57) { error_N = 1; /*error signal for assembly*/ break; } } ... ~~~ ### 製圖和解讀 - [ ] 除了 recursive 和 iterative 版本，應該一併考慮以下 Fibonacci 數列的實作方式，並善用 GNU plot 製圖和解讀 - [x] Tail recursion - [x] Q-Matrix - [x] Fast doubling #### tail recursive * [參考 Tail recursion](http://www.geeksforgeeks.org/tail-recursion-fibonacci/) : A recursive function is tail recursive when the recursive call is the last thing executed by the function. ~~~c= // A tail recursive function to // calculate n th fibnacci number int fib(int n, int a = 0, int b = 1) { if (n == 0) return a; if (n == 1) return b; return fib(n-1, b, a+b); } ~~~ * 例子： n = 5 帶入，就是這樣層層 call 下去，再 return 回來 ~~~ n a b 5 0 1 4 1 1 3 1 2 2 2 3 1 3 5 ~~~ * 我將 rigister 3 設計為 temp ，他可以用來比較是否不為 0 要 jump 也可以做 result 因為在每一次 ret 時，他當下的任務就完成了，可以被蓋掉 ~~~ or $0 $0 #1 ;a = 0 or $1 $0 #2 ;b = 1 or $0 $0 #3 ;tmp = 0 call #8 print #3 halt xor #0 $0 #3 ;compare with 0 jnz #3 #12 or #1 $0 #3 ret xor #0 $1 #3 ;compare with 1 jnz #3 #16 or #2 $0 #3 ret sub #0 $1 #0 or #1 $0 #3 or #2 $0 #1 add #2 #3 #2 call #8 ret ~~~ #### Fibonacci Q-Matrix 參考 [source](http://mathworld.wolfram.com/FibonacciQ-Matrix.html) $Q=\left [ \begin{array}{cc} F_2 & F_1 \\ F_1 & F_0 \end{array} \right ]=\left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right ]$ * 然後變成 n 次方 $Q^n=\left [ \begin{array}{cc} F_{n+1} & F_n \\ F_n & F_{n-1} \end{array} \right ]$ * 其實這個 Q-Matrix 就是利用矩症乘法的特性，又要乘的係數剛好都是 1 所以有累加的效果 * 矩陣的實作因為有 4 個元素，但是是 3 個值，用 3 個 rigister 表示 $F_0$ $F_1$ $F_2$ 其中 $F_1$ 會是結果 * 然後我們做矩陣的乘法，例子： $\left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right ]$$\left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right ]=\left[ \begin{array}{cc} 2 & 1 \\ 1 & 1 \end{array} \right ]$ * 所以我們每次只做 4 件事，但至少要兩個暫存器來存被蓋掉之前的值 $1\times F_2+1\times F_1=F_3$ 蓋掉舊的 $F_2$ $1\times F_1+1\times F_0=F_2$ 蓋掉舊的 $F_1$ $1\times F_2+0\times F_1=F_2$ 蓋掉舊的 $F_1$ $1\times F_1+0\times F_0=F_1$ 蓋掉舊的 $F_0$ * 原本 4 行的運算式，用組語表示就必須一直考量暫存器是否會被蓋掉： ~~~ or $0 $0 #1 ;F0 = 0 or $1 $0 #2 ;F1 = 1 or $1 $0 #3 ;F2 = 1 or $0 $0 #4 ;tmp or $0 $0 #6 ;tmp or $0 $0 #7 ;tmp or $0 $0 #8 ;tmp xor #0 $0 #4 jnz #4 #13 print #1 ;input == 0 ,print result halt or #3 $0 #7 ;record F2 mul $1 #3 #4 ;1*F2 mul $1 #2 #6 ;1*F1 add #4 #6 #3 ;1*F2+1*F1=F2 or #2 $0 #8 ;record F1 mul $1 #2 #4 ;1*F1 mul $1 #1 #6 ;1*F0 add #4 #6 #2 ;1*F1+1*F0=F1 mul $1 #7 #4 ;1*F2 mul $0 #8 #6 ;0*F1 add #4 #6 #3 ;1*F2+0*F1=F1 mul $1 #8 #4 ;1*F1 mul $0 #1 #6 ;0*F0 add #4 #6 #1 ;1*F1+0*F0=F0 sub #0 $1 #0 ;input sub 1 jmp #9 ~~~ * 在 Q-matrix 還有一個特性：之後可以用這些特性來擴大 fibonacci 的允許範圍 $Q^m Q^{n-1}=Q^{n+m-1}$ #### Fast doubling * 參考了這篇證明 [Fibonacci Fast Doubling Proof](https://math.stackexchange.com/questions/1124590/need-help-understanding-fibonacci-fast-doubling-proof) * 我們可以利用以上證明寫出下列的算式： $F(2n)=2F(n+1)F(n)-F(n)^2$ $F(2n+1)=F(n+1)^2+F(n)^2$ * 把奇數偶數分開處理 ~~~ or $0 $0 #1 ;R_1 : initial the first number or $0 $1 #2 ;R_2 : initial the second number or $0 $0 #3 ;R_3 for temp or $0 $0 #4 ;R_4 for n or $0 $0 #6 ;F(n) or $0 $0 #7 ;F(n+1) or $0 $0 #8 ;R_8 for temp mod #0 $2 #8 jnz #8 #13 ;if even div #0 $2 #4 ;register_4 = n jmp #24 sub #0 $1 #4 ;if odd div #4 $2 #4 ;register_4 = n jmp #24 jnz #4 #18 ;if 0,print 0 ret or $0 #2 #3 ;temp = R_2 add #1 #2 #2 ;R_2 = R_1 + R_2 or $0 #3 #1 ;R_1 = temp sub #4 $1 #4 ;input-1 jnz #4 #18 ;jump back until it become 0 ret call #16 or #1 $0 #6 ;register_6 = F(n) add #4 $1 #4 ;n-1 call #16 or #1 $0 #7 ;register_7 = F(n-1) jnz #8 #36 mul $2 #6 #3 ;even mul #3 #7 #3 ;2*F(n)*F(n+1)-F(n)*F(n) mul #6 #6 #1 sub #3 #1 #3 print #3 halt mul #6 #6 #3 ;odd mul #7 #7 #1 ;F(n+1)*F(n+1)+F(n)*F(n) add #3 #1 #3 print #3 halt ~~~ ### 加入追蹤 / 除錯用途的指令 - [ ] 模擬器的程式碼，加入追蹤 / 除錯用途的指令 (instruction) - [ ] 探討模擬器的運作機制並說明自己修改的方式 ### 提點 * 注意 Fib(63) 之後的數值會超過 2^32^ 可表達的範圍，需要設計有效的數值表示機制來處理 (如用兩個 32-bit 暫存器保存 2^64^ 數值) * 觀察 [Add label support](https://github.com/jserv/full-stack-hello/pull/30) 的實作方式，試著在[full-stack-hello](https://github.com/sysprog21/full-stack-hello) 給定的組譯器中新增 label 的支援