2017q3 Homework3 (simulator)

contributed by <tina0405>

記得固定格式喔！
課程助教

背景知識

閱讀現代處理器設計教材和觀看裡頭所有錄影,記錄你的心得和疑惑

影片淺談虛擬機器設計與實作
資料來源： Wikipedia

是 Wikipedia，而非 wiki，兩者不同

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

cross compiler: is a compiler capable of creating executable code for a platform other than the one on which the compiler is running
- 例子： a compiler that runs on a Windows 7 PC but generates code that runs on Android smartphone is a cross compiler.
所需背景知識
- turing complete
- turing machine
- compiler
- linker
LLVM 現在已不可以稱作 Low Level Virtual Machine
- 具有完整編譯器架構
- 一開始在 Frontend 資訊會比較不明確（如投影片中例子是 int），也許還會有註解，在 IR (Intermediate Representation) 就會更清楚說明 32-bits,到了 Backend 甚至會直接告訴你偏移量
```
Frontend         LLVM         Backend
C/C++      ->     IR   ->     X86 組語
```
- 補：為什麼我們需要 IR？（你所不知道的 C 語言：編譯器和最佳化原理篇提到）
  - 因為如果今天有很多種 Frontend (比如：C/C++) 要轉成各個 Backend 的 target （比如：ARM/X86）會有很多不同的組合，如果我們中間有一個 IR ，就大家可以專心在一件事身上
利用 Brainfuck instruction 來得到 C 的執行檔：
產生 C 的執行檔或 ELF ，可以透過 Brainfuck instruction （不同類型的指令）來產生

Brainfuck instruction
          |
 Brainfuck Compiler
     /        \
   ELF    C 的執行檔

Brainfuck 看似只有 8 種符號，（後半段提及）但若是實現 Nested Loop （巢狀迴圈）或複雜結構卻相對複雜

Character	C	Meaning
>	++p	increment the data pointer (to point to the next cell to the right)
<	- -p	decrement the data pointer (to point to the next cell to the left)
+	++*p	increment (increase by one) the byte at the data pointer.
-	- -*p	decrement (decrease by one) the byte at the data pointer.
.	putchar(*p)	output the byte at the data pointer.
，	*p=putchar()	accept one byte of input, storing its value in the byte at the data pointer.
[	while(*p){	if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command.
]	}	if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command.

舉例

brainfunction: [-]
C: Clean
while(cell[0])
{
   --cell[0];
}

影片中 Interpreter 和 Static Compiler 的比較
- 參考資料 compiler interpreters assemblers 比較 / wiki
- Interpreter: translates code like a compiler but reads the code and immediately executes on that code, and therefore is initially faster than a compiler(中間不產 object file)
- Compiler: Compilers convert high-level language code to machine (object) code in one session
- Assembler： An assembler translates a program written in assembly language into machine language
影片中提到的 JIT compiler（ Just-in-time compilation ）
- IN-TIME：及時（特性:收到 work 就馬上去做，因為有 delay 才有相對的 in-time）
- REAL-TIME：即時（特性：在 deadline 以前完成 work）
而通常程序有兩種進行方式是上述的： Dynamic Interpreter 和 Static Compiler，但 JIT compiler 集合兩種特性：
- dynamic translation, is compilation done during execution of a program – at run time – rather than prior to execution.
然後呈現的結果是 Dynamic（JIT compiler）不做最佳化的效率贏過 Static Compiler 做最佳化
碎形（fractal）

研究程式碼

研究給定的 full-stack-hello
學習其中 ISA 和模擬器實作，並在 GitHub 上 fork full-stack-hello
對照研讀你所不知道的 C 語言：編譯器和最佳化原理篇教材和觀看直播錄影，以得知編譯器運作原理

搭配影片你所不知道的 C 語言：編譯器和最佳化原理 (上) / (下)
編譯器的限制
- 太多 #if 的 macro 會導致不能處理（行號錯亂）
- 有些嵌入式系統的限制（不准使用太多的 STACK 或 MALLOC）
  - 如果我們理解編譯器，就可以從中修改
- 如果不理解最佳化的話，在對實驗開最佳化可能造成錯誤
[Compiler design]
- DFA （DETERMINISTIC FINITE AUTOMATA）
  - 每一個 input 的集合一定要對應到一個 status
  - 每一個 input 的集合只能要對應到一個 status
- NFA （NONDETERMINISTIC FINITE AUTOMATA）
NFA 轉 DFA 的所付出的成本會比直接找 DFA 來的低
簡易的 NFA 轉 DFA

shell 會執行 fork + execve()

在系統呼叫程式時，都會有一個 main 的程式進入點 (source)

__libc_start_main :
function shall initialize the process, call the 	main function with appropriate 
arguments, and handle the return from main.
__libc_start_main is not in the source standard; 
it is only in the binary standard.

並不是每一個都是 Compiler -> Assembler -> Linker 是因為除錯方便和有效的模組化（有些沒有 Assembler ，直接由 Compiler 產生的 object code -> linker）
gcc : compiler driver
- ld : GNU linker
(DSL) domain specific language
原來的 vm 所支援的功能

Usage: as_exec [-w] [-x] [-o <out_file>] <in_file>
       -w Assemble <in_file> and write to an ELF file, see -o below
       -o if -w is specifed, <out_file> is used to store the object code
       -x Load <in_file> and execute it

       <in_file> the file name to be used by commands above

實作 Fibonacci 數列

閱讀你所不知道的C語言：遞迴呼叫篇和觀看對應講座錄影，用 full-stack-hello 的指令實作 Fibonacci 數列
需要包含 recursive 和 non-recursive 實作，分別置於檔案 tests/fib-recursive.s 和 tests/fib-iterative.s
修改 vm.c (和對應的原始程式碼)，允許執行時期由使用者 (和 Makefile) 帶入參數，如 --input，這數值應該透過暫存器作為上述 Fibonacci 數列程式的 Fib(N)
研究現有 full-stack-hello 的自動測試程式，整合上述兩種 Fibonacci 程式實作，並且比較效能 (需要包含 vm 內部的 cycle count)

Fibonacci number　：　the Fibonacci numbers are the numbers in the following integer sequence, called the Fibonacci sequence
定義：　every number after the first two is the sum of the two preceding ones
- 像這樣　1,1,2,3,5,8,13,21,24..
- 但現在的使用，有時候會給 0 當初始值　0,1,1,2,3,5,8,13，24…
Fib(N) 就是要求出數列中的第 N 個為多少：
- 補：目前的 N 是輸入的 N 減 2 ，還沒扣 R_1 R_2
- line8 到 line13 相當於 for 迴圈

iteration


















or $0 $0 #1  ;R_1 : initial the first number
or $0 $1 #2  ;R_2 : initial the second number
or $0 $0 #3  ;R_3 for temp 
or $0 $0 #0  ;R_0 for input 

jz #0 #12    ;if 0,print 0

or $0 #2 #3  ;temp = R_2
add #1 #2 #2 ;R_2 = R_1 + R_2
or $0 #3 #1  ;R_1 = temp

sub #0 $1 #0 ;input-1
jnz #0 #5    ;jump back until it become 0   
print #2
halt

print $0
halt

recursive

一開始如果用 C 去想，會長這樣









int fib(int x) {
    if (x == 0)
        return 0;

    if (x == 1)
        return 1;

    return fib(x-1)+fib(x-2);
}

拿 5 為例:我們是一直減直到找到 1 ， 0 才 return

		          fin(5)
  			/        \
  		   fin(4)        fin(3)
 		 /       \      /      \
	     fin(3)   fin(2)   fin(2)   fin(1)
            /     \  /     \  /     \  /     \
	fin(2)fin(1)fin(1)fin(0)..... 
  	/    \
    fin(1) fin(0)

但如果我只想用 1 個 register 記錄 N 回傳回去時要 +2 因為做了兩次減 1 像這樣 fib(x-1)+fib(x-2)




















or $0 $0 #1 ;r_1=output
or $2 $0 #0 ;r_0=input
call #5
print #1
halt

xor #0 $1 #2 ;r_2=compare register
jnz #2 #9 ;jump,if!=1
add #1 $1 #1 ;if==1,output add
ret
xor #0 $0 #2
jnz #2 #12 ;jump,if!=1
ret ;if==0,return

sub #0 $1 #0
call #5
sub #0 $1 #0
call #5
add #0 $2 #0
ret

關於 --input 的設計，我想要他和 -w 一起打包成一個 elf ，所以不和 -x 一起使用

...
} else if (!strcmp(argv[i], "--input")) {
    if (!argv[i + 1])
        FATAL(-1, "Fib(N) need int as a parameter , see -h\n");
    if (req == LOAD_ELF_AND_EVAL)
        FATAL(-1, "-x and --input used together, see -h\n");
    int Fib_N = atoi(argv[++i]);
}
...

driver.c 本身不該知道特定程式的存在，這些錯誤訊息應該移到組合語言程式中。driver.c 只檢查輸入的型態。注意看 atoi 的回傳值，我們可能會分不清楚 0 來自使用者輸入，抑或是錯誤的表達式

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

了解了，謝謝老師提點! tina0405

先處理如果有 --input ，將他的參數放進 register_0
- 這邊我預設 register_0 就是要給 --input 參數放置的
- 我使用 sprintf ，如果有 --input 如同一開始就有一行組語是在初始化暫存器
- 傳進去的 input 是我再 driver.c 定義，沒用到為 -1
- 20 的空間是包括原本字串長度和可能的數字長度










void assemble_from_fd(vm_env *env, int fd, int input)
{
...
if (input != -1) {
        char str[20]; /* INT_MAX = 2147483647 */
        sprintf(str, "or #0 $%d #0", input);
        assemble_line(env, str);
    }
...
}

再來思考如何區分 atoi 的錯誤值（預設是0）和使用者真的輸入 0
- 思考的方向是利用 ASCII code，在 48~57 之間為數字
- 有錯誤的話傳送 error signal 再組語中印出
- 錯誤訊號存在 rigister 5









} else if (!strcmp(argv[i], "--input")) {
            ...
            for (int k = 0; k < strlen(argv[i]); k++) {
                if (argv[i][k] < 48 || argv[i][k] > 57) {
                    error_N = 1; /*error signal for assembly*/
                    break;
                }
            }
       ...

製圖和解讀

除了 recursive 和 iterative 版本，應該一併考慮以下 Fibonacci 數列的實作方式，並善用 GNU plot 製圖和解讀
Tail recursion
Q-Matrix
Fast doubling

tail recursive

參考 Tail recursion : A recursive function is tail recursive when the recursive call is the last thing executed by the function.










// A tail recursive function to
// calculate n th fibnacci number
int fib(int n, int a = 0, int b = 1)
{
    if (n == 0)
        return a;
    if (n == 1)
        return b;
    return fib(n-1, b, a+b);
}

例子： n = 5 帶入，就是這樣層層 call 下去，再 return 回來

我將 rigister 3 設計為 temp ，他可以用來比較是否不為 0 要 jump 也可以做 result 因為在每一次 ret 時，他當下的任務就完成了，可以被蓋掉

or $0 $0 #1  ;a = 0
or $1 $0 #2  ;b = 1
or $0 $0 #3  ;tmp = 0

call #8 
print #3 
halt

xor #0 $0 #3 ;compare with 0
jnz #3 #12
or #1 $0 #3
ret

xor #0 $1 #3 ;compare with 1
jnz #3 #16
or #2 $0 #3
ret

sub #0 $1 #0
or #1 $0 #3
or #2 $0 #1
add #2 #3 #2
call #8
ret

Fibonacci Q-Matrix 參考 source

Q = [\begin{array}{cc} F_{2} & F_{1} \\ F_{1} & F_{0} \end{array}] = [\begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array}]

然後變成 n 次方

Q^{n} = [\begin{array}{cc} F_{n + 1} & F_{n} \\ F_{n} & F_{n - 1} \end{array}]

其實這個 Q-Matrix 就是利用矩症乘法的特性，又要乘的係數剛好都是 1 所以有累加的效果
矩陣的實作因為有 4 個元素，但是是 3 個值，用 3 個 rigister 表示
$F_{0}$
$F_{1}$
$F_{2}$ 其中
$F_{1}$ 會是結果
然後我們做矩陣的乘法，例子：

$[\begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array}]$
$[\begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array}] = [\begin{array}{cc} 2 & 1 \\ 1 & 1 \end{array}]$
所以我們每次只做 4 件事，但至少要兩個暫存器來存被蓋掉之前的值

$1 \times F_{2} + 1 \times F_{1} = F_{3}$ 蓋掉舊的
$F_{2}$

$1 \times F_{1} + 1 \times F_{0} = F_{2}$ 蓋掉舊的
$F_{1}$

$1 \times F_{2} + 0 \times F_{1} = F_{2}$ 蓋掉舊的
$F_{1}$

$1 \times F_{1} + 0 \times F_{0} = F_{1}$ 蓋掉舊的
$F_{0}$
原本 4 行的運算式，用組語表示就必須一直考量暫存器是否會被蓋掉：

or $0 $0 #1  ;F0 = 0
or $1 $0 #2  ;F1 = 1
or $1 $0 #3  ;F2 = 1
or $0 $0 #4  ;tmp
or $0 $0 #6  ;tmp
or $0 $0 #7  ;tmp
or $0 $0 #8  ;tmp

xor #0 $0 #4 
jnz #4 #13   
print #1     ;input == 0 ,print result
halt

or  #3 $0 #7 ;record F2
mul $1 #3 #4 ;1*F2
mul $1 #2 #6 ;1*F1
add #4 #6 #3 ;1*F2+1*F1=F2

or  #2 $0 #8 ;record F1
mul $1 #2 #4 ;1*F1
mul $1 #1 #6 ;1*F0
add #4 #6 #2 ;1*F1+1*F0=F1

mul $1 #7 #4 ;1*F2
mul $0 #8 #6 ;0*F1
add #4 #6 #3 ;1*F2+0*F1=F1

mul $1 #8 #4 ;1*F1
mul $0 #1 #6 ;0*F0
add #4 #6 #1 ;1*F1+0*F0=F0

sub #0 $1 #0 ;input sub 1
jmp #9

在 Q-matrix 還有一個特性：之後可以用這些特性來擴大 fibonacci 的允許範圍

$Q^{m} Q^{n - 1} = Q^{n + m - 1}$

Fast doubling

參考了這篇證明 Fibonacci Fast Doubling Proof
我們可以利用以上證明寫出下列的算式：

$F (2 n) = 2 F (n + 1) F (n) - F (n)^{2}$

$F (2 n + 1) = F (n + 1)^{2} + F (n)^{2}$
把奇數偶數分開處理

or $0 $0 #1  ;R_1 : initial the first number
or $0 $1 #2  ;R_2 : initial the second number
or $0 $0 #3  ;R_3 for temp 
or $0 $0 #4  ;R_4 for n
or $0 $0 #6  ;F(n)
or $0 $0 #7  ;F(n+1)
or $0 $0 #8  ;R_8 for temp
mod #0 $2 #8
jnz #8 #13    ;if even
div #0 $2 #4 ;register_4 = n 
jmp #24
sub #0 $1 #4 ;if odd
div #4 $2 #4 ;register_4 = n 
jmp #24
jnz #4 #18   ;if 0,print 0
ret
or $0 #2 #3  ;temp = R_2
add #1 #2 #2 ;R_2 = R_1 + R_2
or $0 #3 #1  ;R_1 = temp
sub #4 $1 #4 ;input-1
jnz #4 #18   ;jump back until it become 0   
ret
call #16     
or #1 $0 #6  ;register_6 = F(n)
add #4 $1 #4 ;n-1
call #16
or #1 $0 #7  ;register_7 = F(n-1)
jnz #8 #36   
mul $2 #6 #3 ;even
mul #3 #7 #3 ;2*F(n)*F(n+1)-F(n)*F(n)
mul #6 #6 #1
sub #3 #1 #3 
print #3
halt
mul #6 #6 #3 ;odd
mul #7 #7 #1 ;F(n+1)*F(n+1)+F(n)*F(n)
add #3 #1 #3
print #3
halt

加入追蹤 / 除錯用途的指令

模擬器的程式碼，加入追蹤 / 除錯用途的指令 (instruction)
探討模擬器的運作機制並說明自己修改的方式

提點

注意 Fib(63) 之後的數值會超過 2³² 可表達的範圍，需要設計有效的數值表示機制來處理 (如用兩個 32-bit 暫存器保存 2⁶⁴ 數值)
觀察 Add label support 的實作方式，試著在full-stack-hello 給定的組譯器中新增 label 的支援