owned this note
owned this note
Published
Linked with GitHub
# 2016q3 Homework5 (jit-compiler)
contributed by <`heathcliffYang`> <`janetwei`>
[Github](https://github.com/janetwei/jit-construct)
## 目標
>>中英文字間請以空白隔開
>>[color=red][name=課程助教]
>>好的
## Code
- [ ] README.md
a Just In Time (JIT) compiler for the brainfuck language.
**JIT compilation:** 意思是在 run time 時編譯,由兩種編譯方法組合(AOT & interpretation)
**Cross compiler :** 可以產生針對另外一個平台 (platform) 的執行檔,而不是現在正在運行的平台。常用於 multiple platforms
> compiler 與 platform 的關係是什麼?
> platform 是指 CPU 架構嗎?
> 在 compiler 查對照的組語的時候嗎?
> compiler 與 platform 的相依性多大?
**ABI :** (Application Binary Interface for the ARM Architercture)==NOT YET==
- [ ] Makefile
==CC 好像沒有被定義?==
> 目前認為是因為此次作業使用不同 compiler 測試結果,因此在 Makefile 裡就沒有定義 CC
```
run-jit-x64: jit-x64
./jit-x64 progs/hello.b && objdump -D -b binary \
-mi386 -Mx86-64 /tmp/jitcode
```
**objdump :** 顯示 object file 的資訊
`-D` : Display the assembler mnemonics for the machine instructions from objfile.
`-b` : 標示 object file 的 object-code format
[參考資料](https://sourceware.org/binutils/docs/binutils/objdump.html)
**.dasc :**
DynASM has two parts: the assembler/preprocessor, written in Lua, and the the linker/encoder, written in C. dynasm.lua is the preprocessor.
**.dasl files** refer to Lua/ASM files
**.dasc files** refer to C/ASM files.
[參考資料](https://luapower.com/dynasm#examples)
**arm-linux-gnueabihf- :** gcc for armhf architecture, `-v`可以看編譯器更詳細資訊
`-mfloat-abi` 有三種值
soft : 不用fpu進行浮點運算,而是用軟體算==待查詢浮點運算的方法==
softfp : armel architecture
hard : armhf architecture, 用 fpu 計算,傳參數也用 fpu 的 float register 傳,省去了轉換,但是中斷的 overhead 高
[參考資料](http://fanli7.net/a/caozuoxitong/Linux/20140810/525891.html)
**QEMU :** 模擬處理器 (virtural processor?), 具有高速與跨平台的特性,分為 User mode & System mode
User mode : ==QEMU能啟動那些為不同中央處理器編譯的Linux程序。 Ex: Wine==
System mode : 能模擬整個電腦系統,從 CPU 到周邊
**Wine :** 讓在 x86 上的 Unix family OS 可以跑 Microsoft Windows 的程式
> ==CPU & platform 的關係==
**Lua :** [參考資料](https://zh.wikipedia.org/wiki/Lua)
- [ ] interpreter
- [ ] compiler-x86
- [ ] compiler-x64
- [ ] compiler-arm
**.b :** brainfuck code
## DynASM
**DynASM** is a preprocessor and tiny runtime library for creating assemblers and JIT compilers in C orC++.
* DynASM is a pre-processing assembler.
* DynASM converts mixed C/Assembler source to plain C code.
* It is written in Lua.
* There is no machine-dependency for the pre-processor itself.
**.dasc**:檔案中包含 C 和 assembly ,藉由 Lua 可以轉成 C code
**`|`**: The code of this line should be pre-processed.
可包含 assembly language instructions 或 directives.
**`dasm_put()`**:copies machine code snippets from the action list and merges them with the arguments
[參考資料](http://luajit.org/dynasm_features.html)
## Optimization
* Origin 的數據:
```
bench jit-x86
progs/awib.b GOOD 119.2ms
progs/mandelbrot.b GOOD 4064.4ms
progs/hanoi.b GOOD 10087.5ms
```
>> 兩個版本?
>> [color=red][name=課程助教]
>> 已經修改過,謝謝助教
1. Contraction
此方法是將多餘運算整併,例如 +++++ ,其實就是+=5,此改善方法雖然直覺簡單,但實質效益也不差,而實作方法可在遇到+ - > < 時即檢查是否為連續出現。
```C
case '+':
count=continuous_count(p);
| add byte [PTR],count
p+=(count-1);
break;
```
![](https://i.imgur.com/qQdq5sg.png)
```C
case '-':
count_dec=continuous_count(p);
| sub byte [PTR],count_dec
p+=(count_dec-1);
break;
```
此為做完`+` `-`運算後所優化的結果
![](https://i.imgur.com/mcPVWbz.png)
因為 hanoi.b 裡面大部份都是 contraction 的`+` `-`運算,所以優化有明顯的改變
將連續的`+` `-` `>` `<`統計後再一次運算所優化的結果
![](https://i.imgur.com/XarueOP.png)
2. check loops
* clear loop:遇到[-],將迴圈歸零,將PTR的值為0
```C
int clear_loop(char *p)
{
if( *(p+1)=='-' && *(p+2)==']' )
return 1;
else
return 0;
}
```
先判別是否為[-],若是則 return 1,其餘 return 0
```C
case '[':
if (top == limit) err("Nesting too deep.");
if(clear_loop(p)){
p+=2;
| mov byte [PTR], 0
break;
}
// Each loop gets two pclabels: at the beginning and end.
// We store pclabel offsets in a stack to link the loop
// begin and end together.
else
{
maxpc += 2;
*top++ = maxpc;
dasm_growpc(&state, maxpc);
| cmp byte [PTR], 0
| je =>(maxpc-2)
|=>(maxpc-1):
break;
}
```
![](https://i.imgur.com/3GAlO3M.png)
之後參考老師的 code 將 clear loop & copy loop & multipile loop 整合
```C
int result = 0;
int check_loops(char *p,int *index,int *mult)
{
int res,offset = 0,_index = 0;
result = 0;
if (*(p+1) != '-') return -1;
p += 2;
while (*p != ']') {
if (*p == '[' || *p == '-' ||
*p == '.' || *p == ',')
return -2;
res = continuous_count(p);
if (*p == '>') offset += res;
else if (*p == '<') offset -= res;
else if (*p == '+') {
for(int i = 0;i < _index;i++)
{
if(index[i] == offset)
{
mult[i] += res;
goto L1;
}
}
index[_index] = offset;
mult[_index] = res;
_index++;
}
L1: p += res;
}
if (offset != 0) return -1;
result += *p;
return _index;
}
```
判斷`[-`完後如果後面接的是`*p == '[' || *p == '-' ||*p == '.' || *p == ','`則會忽略前面的`[-`,因此為了跟 return -1 的做區分,所以 return -2
```C
else if(count == -2)
{
for(int j = *p-2;j < *p ; j++)
{
switch(*p)
{
case '[':
if (top == limit) err("Nesting too deep.");
maxpc += 2;
*top++ = maxpc;
dasm_growpc(&state, maxpc);
| cmp byte [PTR], 0
| je =>(maxpc-2)
|=>(maxpc-1):
break;
case ']':
if (top == pcstack) err("Unmatched ']'");
top--;
| cmp byte [PTR], 0
| jne =>(*top-1)
|=>(*top-2):
break;
}
}
}
```
return -2 的時候,前面兩個的判別跟原本的 code 一樣
```C
count_scan = scan_loop(p, brainfuck_ptr, brainfuck_size);
if (count_scan > 0) {
| add PTR,count_scan
break;
}
else if (count_scan < 0) {
count_scan = 0-count_scan;
| sub PTR,count_scan
break;
}*/
count = check_loops(p , index , mult);
if (count == 0)
{
p+=2;
| mov byte [PTR],0
break;
}
else if(count > 0)
{
for(int i = 1; i < count; i++)
{
if(mult[i] ==1)
{
| add PTR,index[i]
| mov byte [PTR],*brainfuck_ptr
}
else
{
int multi_result=(*brainfuck_ptr)*mult[i];
| add PTR,index[i]
| add byte [PTR],multi_result
}
}
*brainfuck_ptr += result;
}
```
>> check_loops 的優化尚未完成
```
Unmatched ']'
progs/awib.b bad output: expected 3b4f9a78ec3ee32e05969e108916a4affa0c2bba got da39a3ee5e6b4b0d3255bfef95601890afd80709
Unmatched ']'
progs/mandelbrot.b bad output: expected b77a017f811831f0b74e0d69c08b78e620dbda2b got da39a3ee5e6b4b0d3255bfef95601890afd80709
Unmatched ']'
progs/hanoi.b bad output: expected 32cdfe329039ce63531dcd4b340df269d4fd8f7f got da39a3ee5e6b4b0d3255bfef95601890afd80709
```
3. scan loops
在實作scan的過程中,發現一個問題,其實我們缺乏一個目標程式裡的pointer,只有compiler在讀程式的時候的pointer
更詳細的討論,之所以original的版本只有compiler讀程式的pointer也不會出問題的原因是,逐個字元翻譯不需要知道目標程式的目的和狀態 ; 然而如果要優化,就要讓compiler理解目標程式到底要做什麼,所謂的理解就是實際執行一次,逐步記錄目標程式的執行情況,剛好這次的brainfuck要紀錄的程式執行情況簡單很多,就是一串用ASCII code表示的array
```c
int scan_loop(char *p, char *brainfuck_ptr, int brainfuck_size)
{
char *ptr = p;
int right = 0, count_scan = 0;
ptr++;
if (*ptr == '>')
right++;
while (*ptr != ']') {
if (*ptr != '>' || *ptr != '<') {
if (*ptr != '\n')
return 0;
ptr++;
}
ptr++;
}
p = ptr; // at ']'
/* scan right */
if ( right != 0) {
count_scan = (memchr(brainfuck_ptr, 0, brainfuck_size) - brainfuck_ptr);
brainfuck_ptr += count_scan;
return count_scan;
}
/* scan left */
count_scan = (memrchr(brainfuck_ptr, 0, brainfuck_size) - brainfuck_ptr);
brainfuck_ptr += count_scan;
return count_scan;
}
```
switch判斷部份 `'['` , p已經在`scan_loop()`裡面改完
```
count_scan = scan_loop(p, brainfuck_ptr, brainfuck_size);
if (count_scan > 0) {
| add PTR,count_scan
break;
}
else if (count_scan < 0) {
count_scan = 0-count_scan;
| sub PTR,count_scan
break;
}
```
4. 移除換行符號
`$ make bench-jit-x86 | grep GOOD | rev | cut -c 3- >> result_time.csv`
## 背景知識
**just-in-time (JIT) compilation:** 動態編譯
在程式執行時將程式碼轉換成 machine code ,並同時可以進行優化
- [ ]執行程式時分析程式碼,找出熱點,避免 compile 到極少執行的程式碼
- [ ]將程式精簡化,優化程式
### brainfuck
由八種運算符構成:
`>` : ++ptr
`<` : --ptr
`+` : ++*ptr
`-` : --*ptr
`.` : putchar(*ptr)
`,` : *ptr = getchar() -> 從鍵盤讀取一個字符
`[` : while (*ptr){
`]` : }
1. 以byte為單位
2. 被初始化為零的variable
3. 一個指向該variable的pointer
4. 兩個stream用於input & output
5. 當前位置清零 : [-]
**sed :** (stream editor),分析STDIN的資料,再輸出到STDOUT
`-f filename` 可以執行 filename 內的 sed 動作
[參考資料](http://dywang.csie.cyut.edu.tw/dywang/linuxProgram/node36.html)
* [brainfuck optimization strategies](http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.html)
1. define an intermediary representation (IR) which an optimizer can operate
```
BF IR C
+ Add mem[p]++;
- Sub mem[p]--;
> Right p++;
< Left p--;
. Out putchar(mem[p]);
, In mem[p] = getchar();
[ Open while(mem[p]) {
] Add }
```
2. contraction
```
IR C
Add(x) mem[p] += x;
Sub(x) mem[p] -= x;
Right(x) p += x;
Left(x) p -= x;
Out putchar(mem[p]);
In mem[p] = getchar();
Open while(mem[p]) {
Add }
```
3. Clear loop: `[-]` ---> `Clear`
```
IR C
Add mem[p]++;
Sub mem[p]--;
Right p++;
Left p--;
Out putchar(mem[p]);
In mem[p] = getchar();
Open while(mem[p]) {
Add }
Clear mem[p] = 0;
```
4. Copy loop: `[->+>+<<]` ---> `Copy(x)`
```
IR C
Add(x) mem[p] += x;
Sub(x) mem[p] -= x;
Right(x) p += x;
Left(x) p -= x;
Out putchar(mem[p]);
In mem[p] = getchar();
Open while(mem[p]) {
Add }
Clear mem[p] = 0;
Copy(x) mem[p+x] += mem[p];
```
5. Multiplication loops
6. Scan loops : efficiently searching a memory area for occurrences of a particular byte
**`void *memchr(const void *s, int c, size_t n);`** : scan n bytes大小的memory area
**`void *memrchr(const void *s, int c, size_t n);`** : backward from end
return a pointer to the matching byte or NULL
==for the first instance of c 是什麼意思? c 應該是key element?!==
```
ScanLeft p -= (long)((void *)(mem + p) - memrchr(mem, 0, p + 1));
ScanRight p += (long)(memchr(mem + p, 0, sizeof(mem)) - (void *)(mem + p));
```
7. Operation offsets :
```
Add(x, off) mem[p+off] += x;
Sub(x, off) mem[p+off] -= x;
Out(off) putchar(mem[p+off]);
In(off) mem[p+off] = getchar();
Clear(off) mem[p+off] = 0;
Mul(x, y, off) mem[p+x+off] += mem[p+off] * y;
```
[ASCII](https://zh.wikipedia.org/wiki/ASCII)
[Reading note](https://hackmd.io/EwVgHAnAjAphBmBaYBjARgQ0QFhAdgDZEI8ATKRFAZgzQhoAZJgCg===?both)