課程｜編譯器設計

# 編譯器設計 - [CS143-Compiler](https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/) - [自己動手寫編譯器](https://pandolia.net/tinyc/index.html) - [Tools - First / Follow / Predict](http://hackingoff.com/compilers/predict-first-follow-set) - [Tools - LL(1) / LR(1) / SLR(1) / TM](http://jsmachines.sourceforge.net/machines/) - [Rubular](https://rubular.com/) / [Regex101](https://regex101.com/) - dot -Tpng AST_Graph.gv -o graph.png ## 1.Introduction - 程式碼規模： - GCC (4.9) 約有 14.5 M 行 code - Linux kernel (4.1) 約有 22 M 行 code - Compiler 定義 - 狹義：把 **Source program** 轉成 **Machine language** - 廣義：把 **Source language** 轉成等價功能的 **Target language** - Interpreter v.s. Compiler - Interpreter：在不產生 target code 情況下，執行程式碼 - 通常 overhead 比較大 - 避免了 synthesis 部分 - 有比較好的 portability，避免 machine dependent 的處理 - 有效率的 interactive debugging - 寫一個 interpreter 比寫一個 compiler 容易許多 - JVM / JIT - Java Source Code 先 **"編譯"** 成 Intermediate Byte Code - JVM 用 **"直譯"** 方式執行 Byte Code - 若有 Byte Code 反覆執行，JIT 會把該部分 **"編譯"** 成 machine code 加速 - Compiler 步驟 - Analysis 分析 - Lexical analysis - Syntax analysis - 文法，被 Context Free Grammar 所定義 - Semantic Analysis - 語意，通常 Context Sensitive，如變數型態要求，宣告要求 - Static semantics：Rule Based 的語意 - Attribute grammars：在 CTG 後面在加入 attribute，來表示 static semantics 要滿足的條件 - Synthesis 合成 - IR code generation - Optimization - Target code generation - 除了以上六個步驟，過程中還會有兩個 componet - Symbol Table manager：如同資料庫，紀錄每個變數的 value - Error handler and recovery：如同監察院，告訴 programmar 錯誤訊息等 - 做完 Analysis 分析，會產生 AST - Parse Tree = Concrete Syntax Tree - Syntax Tree = Abstract Syntax Tree (AST) - Compiler Phase 區分 - Front End：取決於 Source Language - Scannar - Parser - Type Checker - Middle End - IR Code Generator - Machine Independent Optimzer - Back End：取決於 Target Machine - Machine Dependent Optimzer - Target Code Generator - Imprecise Semantics Challenges ```clike= //PASCAL，如果呼叫後面就會 division zero error if(i != 0 and 1000/i > 10) //JAVA，compiler 不確定是否有涵蓋所有情況 int function(int b){ if(b > 0) return b; else if(b <= 0) return -b; } //C Example unsigned ux; if(ux > -1){ }// will always false ``` - Compiler 重要性 - 和 programming language design 有關 - 和 computer design 有關 - 該技術本身常常用來開發其他軟體 - 可以寫更有效率的 code - All 5 top companies have invested on compiler efforts - Apple ： LLVM/Clang - Amazon ： TVM - Microsoft ： MS Visual - Google ： Tensorflow/XLA - Facebook ： GLOW - Compiler 發展史 - 1945~1960 ： Code Generation - 1960~1975 ： Parsing - 1975~ now ： Code Optimization - now~ 2050 ： AI and Compilers - 第一個 Fortran compiler 花了 18 年時間，我們一學期就可以寫一個簡單的 compiler - Compiler - Compiler - 現在有許多自動生成 compiler 的工具 - **新的機器 (e.g. Itanium) 的第一個 Compiler** - 方法一：Cross Compile - GCC 會把一份 source code，轉成 x86 執行碼 - 1.用 C 寫一個 source code A，可以 Compile c 程式成 Itanium 執行碼 - 2.用 gcc 在本機上，把 A 轉成 x86 執行碼 A.exe - 3.用 A.exe 在本機上，把 A 轉成 Itanium 執行碼 A.ita - 4.把 A.ita 放到 Itanium 機器上，就是一個 C compiler - 方法二：Bootstraping - 先用組合語言寫一個小的 compiler C0 - 用 C0 看得懂的語法寫一個大一點的 compiler C1，並用 C0 編譯 - 用 C1 看得懂的語法寫一個大一點的 compiler C2，並用 C1 編譯 - ... 直到完整的 C compiler - 昂貴 Operation - Memory - Branch - Divide 60 cycle - Fallacy：組合語言有較好的 performance - 現在 compiler 的優化技術，其實非常厲害 - 現在機器架構複雜，寫組合語言需要很深度的瞭解 - 用高階語言有好的 Maintainability，Portability，容易 debug 和修改 code - Fallacy：Compiler 通常是 static - Runtime compile 比較複雜，但是很有用 - java、GPGPU 都用到 runtime compile 的技術 - Runtime Compile 更能做 machine dependent 的優化 - 可以允許 distribute and compile ## 2.A Simple Compiler - 寫一個 compiler，將 AC 轉成 DC - Adding Calculator - [Desktop Calculator](https://en.wikipedia.org/wiki/Dc_(computer_program)) in linux - AC Definition - Types：int / float - Keywords：[f, i, p] - Variables：[a - z] \ [f, i, p] - Token：用 RE 表示 - ![](https://i.imgur.com/c6ImSml.png) - 其中 id 、 inum 、 fnum 需要 attribute - Syntax：用 CFG 表示 (Backus-Naur Form) - ![](https://i.imgur.com/2TQxxnI.png) - Formal Definition 的好處 - 避免 ambiguity - 可以使用 automatic compiler construction 技術 - Ambiguous Grammar - Rule (9-2+4) - String → String + String - String → String – String - String → digit - Association - Left associative operators： + , - , * , / - Right associative operators： A = B = C, A ? B : C ? D : E - 判斷方法：看重複的部分 - (op primary) → left associative - (primary op) → right associative - 轉換方法： - Left Recursion ```python= A → Aα | β # β,βα,βαα,βααα,βαααα,βααααα ``` - 轉成 Right Recursion ```python= A → β R R → α R | λ ``` - Precedence of Operators - 如果要處理加減乘除 ```clike= Expr → expr + term | expr – term | term Term → term * factor | term / factor | factor Factor → digit | (expr) ``` - C 語言裡有超過 10 層 Precedence of Operators - Operator Precedence parsing - 建立 Operator Precedence 的 Table - Top Down Parsing v.s. Bottom Up Parsing - Top Down Parsing： - 容易了解 - 容易直接手刻 - 對 grammar 的限制較多 (e.g. 不可以 left recursion / left factoring) - left recursion：RHS 是由 nonterminal 開頭 - A → Aα | β - left factoring：RHS 的開頭有相同的 terminal - A → αB | αC - 可以使用 Predictive Parsing - Bottom Up Parsing： - 可以處理各式各樣的 grammars - 使用 tool 來建立 parser - 對 grammar 的限制較少 - 現代的 compiler 比較常使用的方式 - Parsing 的一些名詞介紹 - Parsing：檢查 token string 是否滿足 context free grammar，通常建立 parse tree - Top Down Parsing - 從 Start Symbol 開始，做 CFG 的展開 - Recursive Descent Parsing - Top Down Parsing 的一種實作，可能需要 backtrack - 每一個 non-terminal 有一個 parsing procedure，若其 RHS 中 - 有 non-terminal A ：呼叫 parsing procedure A - 有 terminal t ：呼叫 match(t) - Predictive Parsing - Top Down Parsing 的一種實作，不需要 backtrack - 以 A → BCD | EF 為例 ```clike= Function A{ if(lookahead == FIRST(BCD)) {B();C();D();} elseif(lookahead == FIRST(EF)) {E();F();} } ``` - FIRST(x)：表示 x 可以產生的 string 中可能的第一個字元所形成集合 - 不是所有的 language 可以做 predictive parsing - LL(1)：看一個 symbol 可以區分，Left-to-Right, Left-most derivation - LL(2)：看兩個 symbol 可以區分，但表格的大小會變成平方 - Syntax-Directed Compiler - 根據 parser 來 driven - 根據 CFG ，在每個環節增加對應的 semantic rules / action - Semantic Routine： - 不影響 parsing 對 CFG 的檢查 - 把 action symbol 加到 grammar 中，在 CFG 處理的同時，順便做一些 semantic 操作 - 回紀錄一些 semantic record，如 constant value，甚至可以用 union 來記錄 variant record - Single Pass v.s. Multiple Pass - Single Pass： - 只處理程式碼一次 - 在 parse 同時順便 semantic check 和 code generation - Multiple Pass：我們的 project 方式 - 第一次先建立 AST，之後每一次 traverse 做 - semantic check - code generation - optimization - Semantic Analysis 一些過程 - Symbol table - Type checking - 每個 id 都應該事先被宣告 - 可以自動做 type coercion (但要注意 coercion 方向) - 去 AST 從下到上做檢查 - 作業 AC 語法中，加減乘除有關的 grammar - 原本版本，只有加減的 Right Recursion ```python= Stmt = num ET ET = +num ET | -num ET | λ ``` - 新版本，加減乘除的 Right Recursion ```python= Stmt = term ET ET = +term ET | -term ET | λ Term = num TT T = *num TT | /num TT | λ ``` ## 3.Scanner (lexical analyze) - 達到的目的 - 把字元 group 成 token - 把 comment 去除掉 - 檢查是否有 source program 的 character error - 做 macro expansion - Case Study - 字串的 delimiter collision - 字串的 new line - null strings 是否存在 - DO 5 I = 1.25 v.s. DO 5 I = 1, 25 - 左邊指的是變數，右邊指的是迴圈 - 1. 和 .10 是否為合法的 constant - 以上例子顯示出 Formal Specification 的重要 - 避免 ambiguity - 自動 scanner generation - Regular Expressions (RE) - Regular Expressions 不允許遞迴，但 Context Free Grammar 可以 - RE -> DFA - CFG -> PDA - 適合用來描述 token 的 structure - Character Set $V$ - Concatenation (串接) - Alternation (選擇) - Closure (重複) - **Precedence**：Closure > Concatenation > Alternation - Regular Definition ：把 RE 用某個 name 取代 ```clike= letter = A | B | ... | Z digit = 0 | 1 | ... | 9 id = letter (letter|digit)^* digits = digit digit^* ``` - Some Examples - Character class [^abc] - L → [A-Z] , D → [0-9] - comment → // [^\n]*\n - decimal → D+ "." D+ - ID → L ( L|D )* ( _+ (L|D)+ )* - comment → ## ( [#]?[^#] )* ## - 科學記號 → (D+ (\. D+)? (e (+|-)?D+)?) | (D+)? (\. D+) (e (+|-)?D+) ? - Finite Automata - Conversion RE to NFA ： **Thompson’s construction** - ![](https://i.imgur.com/qPgXqUi.png) ![](https://i.imgur.com/Qdsavfg.png) - Conversion NFA to DFA ： **Subset Construction Algorithm** - ![](https://i.imgur.com/ybTYtx4.png) - $A=\{0\},~B=\{0,1\},~C=\{0,2\},~D=\{0,3\}$ - Minimize number of states - 一開始把所有 states 分成兩個 state set，接著劃分出一定要區分出來的 state - 如果有某個 state set 有兩個 state 在讀取某個 input 後，會到不同的 state set，則要把該 state set 分割成兩個 state set - Simultae DFA ```clike= s = S0 c = getchar() while( c!=EOF ){ s = move(s,c) c = getchar() } retrun "YES" if s in F else "No" ``` - Reserved Word - 方法一：在 scannar 去檢查是否是 reserved word - 缺點：State 數量會變很多 - 方法二：把 reserved work 當作 token，在 symbol table 中去處理這些 token - 注意：去查 symbol table 的時候要 hash 而不要 linear search 才會比較快 - token id 的字串要送到 symbol table，必須在 semantic check 的時候做會比較好，因為才知道該 id 在哪一個 scope ( block, procedure ) - Runaway strings - 不知道右括弧在哪，而以為整個程式都是字串 - C Comment 的 REGEX - https://rubular.com/ - https://regex101.com/ ```clike= \/\* (\*(?!\/)|[^*])* \*\/ # 用讀下一字元方式 \/\* ((\*+[^/])|[^*])*(\*)? \*\/ # 讀到 "*" 非 "/" 或是讀到非 "*" ``` - **自動產生工具： [LEX](http://dinosaur.compilertools.net/flex/manpage.html)** - 把一個 lex.l 轉換成 lex.c，之後再 compile 及得 Scanner。 ```clike= ${ #include <stdio.h> int linenumber=0; $} letter [A-Za-z] digit [0-9] ID {letter}({letter}|{digit}|"_")* newline "\n" %% ID printf("%s\n",yytext); newline linenumber += 1; %% int main(int argc, char **argv) { yylex(); } ``` - 除了產生 rule，還可以在 RE match 時做一些 actions - Overlapped 處理 - Longest possible match："<=" 優先於 "<" 和 "=" - Order of rules：比較早定義的 RE 優先 ## 4.Grammar and Parsing - Context Free Grammar - $G=(V_t,V_n,S,P)$ - $V_t$ : **Terminals** - $V_n$ : **Non-terminals** - $S$ : Start Symbol 或是 Goal Symbol - $P$ : Production Rule - $A\rightarrow\lambda$ : 為了 optional 選擇 - Notational Convention - $a,b,c$ : token - $A,B,C$ : non-terminal - $U,V,W$ : any symbol - $\alpha,\beta,\gamma$ : any sentential form string - $u,v,w$ : token string / sentence - $L(G)=\{w\in V_t^*|S\rightarrow^+w\}$ ：可能產生的 token string - $SF(G)=\{\beta\in V^*|S\rightarrow^+\beta\}$ ： Sentential Form，可能產生的中間過程 - 只要文法不是 ambiguious，不同 derivation 出來的 parse tree 都相同，但我們只關注兩種 derivation 順序 - Left-most derivation：每次做 derivation 時，都從 Left-most 開始做 - 每次把 left most nonterminal 展開成 handle - **Top Down Parsing 就是從 $S$ 開始根據 input string 每次做 left most derivation** - Right-most derivation：每次做 derivation 時，都從 Right-most 開始做 - 每次把 handle 規約成 nonterminal - **Bottom Up Parsing 就是從 input string 每次找 handle 規約成 non-terminal，剛好能逆著找到 right most derivation** - **Phrase / Handle** - Phrase of nonterminal ：只用一種 nonterminal 推導出來的 symbol string - 限定只有一種，但沒限定 derivate 幾次 - Simple/Prime Phrase ：用 nonterminal 推導**一次**出來的 symbol string - 不包含更小的 phrase - Handle ： **Left-Mose Simple Phrase** - 通常是用來描述 bottom up derivation 的術語 - Grammar Ambiguity - Operator Associativity ( A+B+C ) - Operator Precedence ( A+B*C ) - Dangling Else ( if E1 then if E2 then S1 else S2 ) - ANSI C 中 else 會跟著 innermost if ，也就是最靠近的 if - 不容易直接從 grammar 的定義解決，但可以在 parsing 時限制 else 要跟著最接近的 if statement - Left factoring - 若 production 的 RHS 開頭有相同 terminals，將其轉成沒有得形式 - e.x. A → αB | αC 需要改寫成 - A → α S' - S' → B | C - Top Down Parsing 必須預處理來避免 Common Left Factor - Name of parsing - LL(1) = Left to Right, Leftmost, 1 lookahead = Top-Down Parsing - LR(1) = Left to Right, Rightmost, 1 lookahead = Bottom-Up Parsing - **FOLLOW and FIRST** - **First** 定義在 **sentential form $\alpha$** 上，看其可 derive 出甚麼 token 開頭的 string - $First(\alpha)=S1\cup S2$ - $S1=\{a\in V_t|\alpha\rightarrow^*a\beta\}$ - $S2=\{\lambda\}~if~\alpha\rightarrow^*\lambda ~else~\phi$ - 對於 terminal $Z$，$First(Z)=\{Z\}$ - 對於 non-terminal $Z$ - $Z\rightarrow Y_1Y_2...Y_k$ ，則 $First(Y_i)~\subseteq~First(Z)$ - $Z\rightarrow \lambda$ ，則 $\lambda~\subseteq~First(Z)$ - **Follow** 定義在 **non-terminal $A$** 上，看其後可接那些 token - $Follow(A)=\{~a\in V_t|S\rightarrow^*...Aa...~\}$ - $ $\in Follow(S)$ - $B\rightarrow \alpha A \beta$ ，則 $First(\beta)-\lambda\in Follow(A)$ - $B\rightarrow \alpha A \beta$ 且 $\lambda\in First(\beta)$，則 $Follow(B)\subseteq Follow(A)$ - $B\rightarrow \alpha A$ ，則 $Follow(B)\subseteq Follow(A)$ ## 5.Top-Down Parsing ( LL(1) Parsing ) - **Predict** 定義在 rule $A\rightarrow X_1X_2..X_m$ 上，看走向該 rule 時，出現的字串開頭 token 可能為何 - 如果 $\lambda\in First(X_1X_2..X_m)$ - $Predict(A\rightarrow X_1X_2..X_m)~=~(First(X_1X_2..X_m)-\lambda)\cup Follow(A)$ - 否則 - $Predict(A\rightarrow X_1X_2..X_m)~=~First(X_1X_2..X_m)$ - **LL(1)表示該文法的 predict sets 彼此是 disjoint 的，可以做 Top Down Parsing** - 方法一：Recursive Descent Parsing - 手刻演算法 - 對每一個 production rule 設計一個 function，可能需要 back tracking - 方法二：自動產生 parser，使用 Parse Table - 程式較小，執行較快，容易更改 grammar - Step 1 : 只要給定文法，就可以自動建立 LL(1) parse table - 首先對於每個 grammar symbol，先後去計算 - (1) First - (2) Follow - (3) Predict - 接著用 Predict Set 建表 - $T:V_n\times V_t\rightarrow P\cap\{error\}$ - 當讀到 non-terminal 時，去 lookahead 看輸入字串，即可知道要配合 grammar 的哪一個 prodcution rule - Step 2 : 根據該 LL(1) parse table，對輸入字串進行 parse - 使文法變成 LL(1) 的方法 - 有些 non-LL(1) 文法可以透過修正而變成 LL(1) 文法 - **non-LL(1) 文法： LL(1) Parse Table 的某些欄位會對應到兩個 prodcution rule** - 問題與作法： - Left Factoring - Left Recursion Elimination - Other transformations may be needed ```clike= S → LABEL STATEMENT LABEL → λ LABEL → ID : STATEMENT → ID = EXP ``` 需要轉換成以下等價文法 ```clike= S → ID SUFFIX SUFFIX → : STATEMENT SUFFIX → = EXP STATEMENT → ID = EXP ``` - Dangling else 需要用 Special Rule 強制 Else match 最接近的 If ```clike= S → Stmt $ Stmt → if Exp else Stmt V V → else Stmt | λ ``` - 以上方法都不適用時，可能還需改寫文法 - 絕大多數的 compiler 都是使用 LR parsing 而不是 LL parsing，因為我們不想修改文法，修改後的文法變得難以理解 - LR parsing 的限制比較少，功能比較強大 ## 6.Bottom-up Parsing ( LR(1) Parsing ) - https://pandolia.net/tinyc/ch11_buttom_up_parse_a.html - https://pandolia.net/tinyc/ch12_buttom_up_parse_b.html - 定義： - 以上下文無關文法 $G$ 為例： - ![](https://i.imgur.com/CZu4a1t.png) - Viable prefix 合法前缀: 文法的 Sentential Form 的某一個前缀 - Configuration 型態 / LR(0) item: - 可能的 parsing 狀態，記錄著已經看到 viable prefix，以及預期要看到甚麼 symbol string - e.g. ![](https://i.imgur.com/5C0ZFYu.png) - 該文法有 13 種可能的 configuration - Configuration set: - 把某些 Configuration 蒐集起來 - Closure: - 對一個 Configuration set $I$ 取 closure，則 - 若有 $A\rightarrow \alpha \bullet B$ ，則必須增加所有的 $B\rightarrow \bullet \gamma$ 到 $I$ 之中 - e.g. Closure Of ![](https://i.imgur.com/cxEkTGA.png) = ![](https://i.imgur.com/mgQ5nDw.png) - State: - 進行過 Closure 運算的 Configuration Set - 整個文法的 Start State = Closure Of ![](https://i.imgur.com/G1TZZxG.png) - 可以透過 $GoTo(I,x)$，產生可能到達的其他 State - e.g. ![](https://i.imgur.com/NV6SATX.png) - Characteristic Finite State Machine - 將文法的所有可能 State 以及 Goto Transition 找出來 - e.g. ![](https://i.imgur.com/ubg40gw.png) - LR(0) Parse Table - e.g. ![](https://i.imgur.com/3FppPLO.png) - 延伸性質： - Conflict： - 若某一個 state，同時出現 shift 以及 reduce，則稱為 shift-reduce conflict - 若某一個 state，同時出現兩種不同的 reduce，則稱為 reduce-reduce conflict - SLR(1) Parser: - 假設某一個 state 有 configuration $A\rightarrow \alpha \bullet$ - 我們只在 look ahead 是 $Follow(A)$ 時才會執行這一個 reduce - 這裡的 $Follow(A)$ 只考慮文法 context free 的特性，並不 precise - SLR(1) 比 LR(0) 解決部分 conflict 問題，但仍有許多 context sensitive 的不足 - ![](https://i.imgur.com/zmGRgLa.png) - SLR(1) 並不擔心 Shift / Reduce Conclict ，因為可以靠 look ahead 解決 - LR(1) Parser: - 上述例子可以解決： - ![](https://i.imgur.com/q31XxoK.png) - 為了解決 SLR(1) 在 context sensitive 的不足部分，每個 configuration 會記錄當下的 **"可能 look ahead"** - Closure( ![](https://i.imgur.com/O9h76Za.png) ) = ![](https://i.imgur.com/raPtMZz.png) - LALR(1) Parser: - LR(1) 很強大，但是 states 數量太多 - 把 LR(1) 中，如果兩 state 只有 look ahead 不同時， merge 成相同的 state - Bottom-Up Parsing 喜歡 left recursion 更勝於 right recursion - 考慮字串 "1, 2, 3" 用不同的文法做 bottom up parsing - Left Recursion ： list -> list, num | num - Right Recursion ： list -> num | num, list - 比較： - **non-LR(0) 文法：不考慮 look ahead 時有 conflict** - Shift-Reduce / Reduce-Reduce - **non-LR(1) 文法：考慮 look ahead 時有 conflict** - Reduce-Reduce - **non-SLR(1) 文法：用 follow 判定 look ahead 時有 conflict** - Reduce-Reduce ## 7. Syntax Directed Compilation - Syntax-Directed Translation - CFG + Semantic Actions - Semantic Actions ：每一個 production rule 後頭要做得 semantic check - Semantic Values ： Semantic check 的對象，看 attribute - Synthesized Attributes ：從 leaf node 往上 propogate 得到的資訊。 - Inherited Attributes ：從 parent node 或是 sibling node 得到的資訊。 - YACC/Bison - 背後是用 LALR(1) 的方法 - 除了 parser stack ，還有 value stack ( 都是存指標 ) - 可以從 scanner 取得 terminal，也可自己定義 nonterminal - 可以用 yyin 指定要 parse 的 input， yyparse() 開始執行 parse - 範例： ```clike= %{ #include <stdio.h> %} %union { int num; char* lexeme; } // scanner 回傳時，可以指定回傳的 type，則 parser 就可以用 $$、$1、... 去接 // scanner 要設定 yylval.lexeme 或是 yylval.num %token T_NUM %left '+' '-' %left '*' '/' %% S : S E '\n' { printf("ans = %d\n", $2); } | /* empty */ { /* empty */ } ; E : E '+' E { $$ = $1 + $3; } | E '-' E { $$ = $1 - $3; } | E '*' E { $$ = $1 * $3; } | E '/' E { $$ = $1 / $3; } | T_NUM { $$ = $1; } | '(' E ')' { $$ = $2; } ; %% int main() { return yyparse(); } ``` - 使用 gv 工具 - dot -Tpng ./AST_graph.gv -o ./graph.png --- ## Ch8 Declaration Processing and Symbol Table - (name, attribute) - 當變數被宣告時，要 collect attributes - 當變數被使用時，要 check attributes - Scope：通常每個 name 對應到 closet containing scope - Visibility Rules - 每個變數只可以被 current scope 及 open scopes 所 access - 若不同 active scope 都宣告同一個變數名稱，則採用 closest containing scope - Parameter 和 function name 都屬於外面的 scope - 進入 function/block 首行時 open_scope() - 離開 function/block 末行時 close_scope() - 方法一：單一大 symbol table 處理不同 scope - hash table 每個位置存 linked list 串起 (name, attribute, scope) - 容易 reference，空間使用率好，但不容易 close scope - 1-pass compiler 時適合用的實作方式 - 但其實不容易實作 - 方法二：每個 scope 有自己獨立的 symbol table - 每個 scope 有自己的 hash table，用 push table 和 pop table 來開關 scope - pop 時不一定要把 symbol table 整個丟掉， multipass 還可以使用 - 容易 open / close scope，但不容易 reference，且空間使用率差 - 如 global variable 或是 undefined variable，要找遍整個不同 scope 的 hash table 才知道 - NameSpace - 一種優化空間的方法，symbol table 中要儲存 name，但是 name 長短不一 - 一種方式是用指標指向一塊動態設立的空間，並把該空間填入 name - 但是如果許多 name 都有重複的字元或字串會很浪費空間 - 節省空間解法： - 共用一個 name pool - 使用到某個 name 時，去 pool 看是否有出現過相同的字串 - 若有則把指標指向該字串 - 沒有則在 pool 末端階上該字串，並把指標指向它 - 因為 name pool 是動態宣告的，指標可以用 (selector, offset, length) 代替 - Forward References - 如果外面的 scope 跟裡面的 scope 都有用相同的變數名稱，而裡面的 scope 有用指標只到相同的變數名稱 - 若是變數宣告比使用還要晚，可能會造成誤以為操作的是外面的變數 - 解法： - 有使用到 pointer 時，必須把相同 scope 的所有變數宣告都先處理過一次 - Yacc/Bison 中的例子 - yylval.lexeme = strdup(yytext); - strdup() 會自動配置一段記憶體，copy 進去，然後回傳 - 若每個 token 是固定型態時 ```clike= %union { int num; char* lexeme; } ``` - 同一個 token 卻有可能不同的形態，如 const ( int, float, string) ```clike= %{ Typedef struct { C_type ctype; Union { int intval; double fval; char *sc; } const_u; } Const_Type; %} %union { int num; char* lexeme; Const_Type *const_rec } ``` - Equivalence - Name Equivalence：同樣的 associated type expressions - Structural Equivalence：有同樣的結構，不容易檢查 - C語言： - name equivalence for structs and unions - structural equivalence for arrays - Efficient test for structural Inequivalence 一種可能可以檢查不相等的方法： - boolean 0, char 1, integer 2, real 3 - pointer 1, array 2, freturns 3 - 則 array(pointer(freturn(char))) → 2 1 3 1 - 把 AST 轉成 DAG 以節省空間 - ![](https://i.imgur.com/KN8cxRQ.png) - **Value Numbering** - 除了原圖外，把每一個 node <op, l, r> 的 hash 值另外記在 hash table - 每次要使用 <op, l, r> node 時，先去 hash table 找 - 找到就直接使用同一個已經存在的 node - 找不到就宣告一個新的 node，並把該 hash 值存入 hash table - 原本是用在 expression 的，若是**用在 statement 時要注意檢查變數是否有被更改** - Common Sub Expression 的節省空間優化 ## Ch9 Semantic Analysis - 常見需要 Semantic analysis 的地方 - If statements - Switch and Case - While, Do and Repeat Loops - For loops - Function Calls - 常見三種 Semantic analysis 種類 - Type Correctness : 檢查 type 是否正確 - Reachability and termination : 檢查該 function 是否會被執行到 - Exception Handling : 檢查 exception 是否有被正常處理 - Exceptions 介紹 - Hardware exception handling/traps - 中斷（Interrupt）：屬於非同步發生的 event，在任何時間都可能發生且與 processer在執行的東西毫無關連，通常它由 I/O devices，處理計時器，或 timers 產生。 - 陷阱（Trap 或 exception）：屬於同步狀態，通常由執行某一特別的指令。可以藉由執行相同資料及狀態下的程式而重複產生。其例有無效記憶體存取或除以零。 - Software exception handling - [從 Java compiler 的角度來看 exceptions](http://teddy-chen-tw.blogspot.com/2011/05/checked-or-unchecked-exceptions-1.html) - Checked exceptions - Try-Throw-Catch 範例 ```clike= void divide1(int a = 5, int b = 4){ try { if(b == 0) throw 0; cout << "a / b = " << double(a) / b << endl; } catch(int err) { cout << "除數為: " << err << endl; } } void divide2(int a = 5, int b = 4) throws 0{ try { if(b == 0) throw 0; cout << "a / b = " << double(a) / b << endl; } } ``` - 概念是 compiler 會幫忙檢查，如果有出現 throw，必定要滿足 Catch or Declare 原則 - Unchecked exceptions - comiler 認為，理論上不應該在程式中去 handle 它 - 需要修改程式碼本身來修復 - Compiler 採取 Conservative Analysis - General 來說，底下問題都是 undecidable 的問題 - 正面表示可能，反面表示一定不可能 - terminateNormally : 表示可能 terminate - isReachable : 表示可能 reachable - isInitialized : 表示可能被 initialized - 如果要做 optimization，可能需要絕對的資訊，compiler 可以多增加檢查 if / else 條件的 code 去檢查 runtime 時的情境。 - 一些 Function call 定義但不太重要 - return → terminate abnormally - no return → terminate normally → issued error message ## Ch10 Code Generation Runtime Support - Storage Allocation - Static : global variable - Stack : function local variable - Heap : malloc - **Dangling Pointer v.s. Memory Leak** - 使用 malloc / free 動態宣告記憶體常見的問題 - Dangling Pointer : 指向的空間已經被回收，但是又去使用該空間 - Memory Leak : 指向的空間尚未回收，而該指標卻不繼續使用該空間 - Memory Layout : 把一個 ELF 檔案 map 到記憶體後，虛擬記憶體擺放的狀態 - Code : 程式碼 - Static data : 初始化 - BSS : 未初始化 - Heap : 從低位置往高位置長 - Stack : 從高位置往低位置長 - 但是 reference array 仍從低往高位置，因此會有 buffer overflow 的漏洞 - Function call 時要記錄到 stack 中的 **Activation record** - 1.Local variables - 2.Parameters - 3.Return value - 有些 expression 算好的暫存值 - 執行這個 function 時的機器狀態，如暫存器等 - Access link 以及 Control link - Calling Sequence Example - Caller 和 Callee 各負責一些事情 - Caller 負責 : `參數`、`Return Address`、`Old Ebp`、`AR空間` - Callee 負責 : `存 register 值`、`存 local variable` - 我們盡可能**把工作交給 Callee 處理：** - 雖然給 Caller 和 Callee 做，實際上的執行 instruction 數量相同 - 考慮 Function call `500 次` - 放在 caller 會使 static code size 放 500 次 - 放在 callee 只要放 1 次 - Code size 會影響讀取程式碼的 cache，會不適合小 memory 的 embedded system - Non-Local Names / Static Chain / Dynamic Chain / Display - 常見一些問題討論 - Local Variable Align 是為了避免 page fault 和 TLB Miss - Stack 中的 local array ，它的 index 也是由低位置到高位置 - 因為要和 heap 中的 array 一致方向，拿到 pointer 才知道如何 index - 這個方向會造成 buffer overflow 問題，也就是可以更改倒 saved ebp 以及 return address 等 - 業界最費時 bug : `Dangleing Reference`、`Memory Leak` - 實作上，可以每個 block 都個別去要一個 Activation Record，也可以只在 function call 時才要一個 Activation Record。 ## Ch11 Code Generation Risk-V - 處理 declaration - 要可以區分 global 或是 local 變數 - global variable: - 直接去對應記憶體拿資料 - local variable: - 要計算每個變數的 offset，並存在 symbol table 中 - 每一個 symbol table 的變數要記錄是否為 global/local variable，以及他的 offset - 要處理 expression 時，每一個 AST node 要記錄他的值存在哪一個 register - 每一個 function 會把某些值存在某些 register 中，當使用到該值要釋放開 register，function 結束時所有 register 都是釋放狀態不會被占用 - 處理 function - Prologue ```clike= sd ra, 0(sp) sd fp, -8(sp) addi fp, sp, -8 addi sp, sp, -16 la ra, _frameSize_main lw ra, 0(ra) sub sp, sp, ra sd t0, 8(sp) sd t1, 16(sp) ``` - Epilogue ```clike= ld t0, 8(sp) ld t1, 16(sp) ld ra, 8(fp) // restore return address addi sp, fp, 8 // pop AR ld fp, 0(fp) // restore caller (old) fp jr ra ``` - Risk V 64 bits Register - https://github.com/riscv/riscv-asm-manual/blob/master/riscv-asm.md - https://content.riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf - 整數：`t0-t6`、`s1-s11`、`a2-a7` - 浮點數：`ft0-ft7`、`fs1-fs11`、`fa2-fa7` - Code Generation + Optimization - 我們常常會分成兩個 pass - 先用比較寬鬆的條件 code generation 產生 code - 再用 optimization 把該 code 去除多餘步驟 - **Register Tracking** - 用一張表，紀錄每個 register 現在是否有 associated 到某個記憶體，某個 temporary 值，以及是否有更改過還未存回 - 可以避免每次都從記憶體把東西載入 register - 如果 temporary 太多，register 不夠用會需要 spill - e.g. `A = b + (c*d+e*f*(g+h*j*(j+k…))))` - 如果 function call 會改到某些變數，需要重新 reload - e.g. `A = b + (….+ function1(…))` - Get Register 常見實做 - 如果有未用到的 register ，就使用它 - 否則，找所有使用中的 register ，spill cost 最小的 ## Ch12 Code Generation Control Structures and Loops - If statement ```c= load data to x9 beqz x9, Lexit1 "if statement" Lexit1: "exit statement" ``` - If else statement ```c= load data to x9 beqz x9, Lelse1 "if statement" j Lexit1 Lelse1: "else statement" Lexit1: "exit statement" ``` - Nested If sturcture - `local_label_number = label_num++` - While statement ```c= _Test1: load data to x9 beqz x9, Lexit1 "code for statement" j _Test1 Lexit1: "exit statement" ``` - For statement ```c= "initial data assignment" _Test1: load data to x9 beqz x9, Lexit1 j _Body1 _Inc1: "increment assignment" j _Test1 _Body1: "code for statement" j _Inc1: _Exit1: "exit statement" ``` - 執行頻率高的 code 放在 then 會比 else 快速 ## Ch13 Code Generation -- Array References / Function Calls / Switch Cases - Array Reference - 假設 A[100] 的 Base 在 B - 存取 A[i] → `B + i*4` - 假設 A[100][200] 的 Base 在 B - 存取 A[i][j] → `B + (200*i+j)*4` - 假設 A[100][200][300] 的 Base 在 B - 存取 A[i][j][k] → `B + (200*300*i+300*j+k)*4` - General：存取 A[n1][n2][n3] 當中的 A[i1][i2]...[ik] - `Base` + `4` * `((i1*n2+i2)*n3+i3)...+ik` - Function Call - 常常要包含四個部分 - 1.save_registers() : - 根據 calling convention 決定是 caller saved 或是 callee saved - 2.passing_param() : - 根據 calling convention 決定參數要放在 register 還是 runtime stack 上 - 3.gen_proc_call(name) : - 通常是 `jal name` - 4.gen_after_return() : - 把參數從 `a0` 或是 `fa0` 拿回來使用 - Caller/Callee saved register - 如果 temporary 存在 caller saved register，則 function call 之前要從 register 存到 stack，function call 之後要從 stack 放回 register - 如果 temporary 存在 callee saved register，則該 function 的 prologue 和 epilogue 會幫忙處理這件事 - Return： - function 結束會有一個 label `_end_name`，擺入 epilogue - restore register、fp、ra - `jr ra` - 一個 function 可能有很多 return，基本上不會把整個 epilogue 放入 - 1.放好 return value 到 `a0` 或 `fa0` - 2.`j _end_name` - 一些 Optimization 技巧 - Procedure In-lining - Using calling convention - Leaf routine optimization - 若讓它只用到 caller saved register，可以省下 load/store 時間 - Non-recursive routine optimization - 可以讓它的 AR 靜態宣告在全域裡 - Switch Case Statement - 該語法的出現，最主要是為了產生有效率的 target code，而非程式碼的整潔 - Cascaded if： - 當成 if else 來解析 - 當種類少的時候，或是有一個情況很常出現時適用 - Search table： - 建一個所有 case 的 sorted list - 去以上 list 二分搜看是否有出現，有的話做對應的操作 - 當 case value 非常分散時適用 - Jump table： - 花大量空間建一個大表，以 case 作為 index，對應到對應操作 - 當 case value 密集的出現在某範圍裏適用 ## Ch14 Caller/Callee Register Convention - [Register Allocation](https://en.wikipedia.org/wiki/Register_allocation) - Why not 1024 registers? - 1.ISA：這樣 register encodeing 就要 10 個 bit，如果是 `add x1, x1, x1`，32 bits instruction 就會不夠用 - 2.這樣要 saving / restoring 的 cost 就會變得很高 - 3.array 和 string 的操作才是程式執行時間的 bottle neck，使用再多的 register 也無法有效解決此問題。 - e.g. `for(int i=0;i<10000;i++) c[i]=a[i]+b[i]` - Caller/Callee Register Convention - Caller Save Registers： - 當要去 call 別的 function 時，有值存在 caller saved register，希望 call 完別的 function 回來時該值不變 - `在 call 別的 function 前後要 save/restore caller saved register` - Callee Save Registers： - 幫被別的 function call 時，別的 function 有值存在 callee saved register，希望自己做完操作後該值不變 - `在 function 的 prologue/epilogue 要 save/restore callee saved register` - 心得： - 基本上，ISA 會是先區分 register 為 caller saved 以及 callee saved - caller saved 用到的時候才 save/restore，用到很多次就會很多記憶體操作 - 有用到 callee saved ，則進出 function 時要 save/restore - ISA 必須權衡，兩種 register 各有多少個 - caller saved register 多：function call 很多時，會多很多次記憶體操作 - callee saved register 多：每個 function 的 overhead 很大 - function 在使用 register 時，應該盡量以**不用額外做 save/restore 的 register 為優先** - **leaf routine optimization：** leaf procedure 盡量使用 caller saved register - 如果需要用到 register 的地方太多，e.g. local variable 、 temporary - temporay 放在 callee saved register - local variable、CSE 放在 caller saved register ## Ch15 Compiler Optimization Analysis - Compiler Optimization 的目標有可能不同 - 執行時間最小 - Code Size 最小 - 最省電 - Compile Time 最小 - 一些 Optimization 問題 - 短的程式碼不一定執行時間就比較短，比如 inlineing 後程式碼變長但變快 - 快的程式碼不一定比較省電，比如 GPU/multi-thread 兩倍的電力不一定加速兩倍 - Compile Time 有時跟 runtime 一樣重要， JIT 的 compile time 要計入 runtime - 不能都用 AI deep learning 做最佳化，仍需專家做 analysis and transformation - Optimization 性質 - Optimization 的過程不應該影響程式的行為與正確性 - 找出最 runtime optimal 的程式碼問題是 NP-hard - Optimization 在某些情況下很好，但在特定情況寫卻可能反而變差 - Sethi-Ullman/Ershov Numbering - bottom up 算出每個 AST node 所需要得最少 register 個數 ```c= Procedure registerNeeds(T){ if(T.kind == ID or T.kind == const) T.regcount = 1; else{ call registerNeeds(T.leftchild); call registerNeeds(T.rightchild); if (T.leftchild.regcount == T.rightchild.regcount) T.regcount = T.rightchild.regcount +1; else T.regcount = MAX (T.leftchild.regcount, T.rightchild.regcount); } } ``` - Common subexpression elimination - 把 AST 中 expression 相同的部分，不重複複製兩份 - 減少 AST 大小，增加 code 效率 - 從 AST 變成 DAG - 可以用 value numbering 的技巧 - Automatic Instruction Selection - 將 AST 建出之後，把 code generation 的過程自動化 ## Ch15 Register Allocation - Allocation and Assignment - Allocation：決定該值是否要 allocate 到 virtual register - Assignment：把 virtual register 真的 map 到 register 上 - Register Allocation 概念 - lifetime 不同變數，可以 assign 到相同的 register 上不會衝突 - 我們可以根據 lifetime 來建一個圖，變成 graph coloring 問題 - graph coloring 問題 - 問題定義： - 每一個點代表一個變數需要被 allocate 到一個 virtual register - 每一個邊，代表相鄰的兩個變數，lifetime 有重疊 - 每一種顏色，代表真正的 register - 我們需要最少的顏色，使得圖上沒有相鄰點同色 - 有理論說 4 色必定可以 coloring，但四色塗色法難以用演算法描述出 - 先隨意讓 n = 4，想要看是否可以塗色，接著依序把圖上 degree < n 的點移除，看是否可以讓圖為空，快速判斷圖是否為 n-colorable： - 可以->n-colorable - 不可以->不一定，但我們可以多用一個顏色再試一次 - [Speculative Register Promotion](https://ieeexplore.ieee.org/abstract/document/1191539) - 情境：我們可不可以把 a->b 放到 register 中? ```c= = *p+1; *q = … = *p+9; ``` - 總的來說，我們不知道 `*p` 是否會改到 `a 或 a->b` 的值 - 我們可以用 `ld.c r1=[p]` 來檢查 r1 是否存的值為 p - 是：這行指令變成 NOP - 否：這行指令變成 `ld.a r1, [p]`，把 p 的值再 load 到 register ## Ch16 Transformation and Analysis - LLVM has 34 analysis passes and 63 transformation passes - Loop Interchange - C 是 row major，好的 loop 順序可以使 cache miss 降低 - 產生好的 spatial locality - 產生 vectorization 的程式碼 - Profile Guided Optimizations - Instrumentation：紀錄整體執行情況 - Performance Monitor Unit (sample)：每格一段時間去觀測 - 使用頻率高 -> optimize for speed - 使用頻率低 -> optimize for space - Inlineing v.s. Cloneing - Inlineing：整體來說比 cloneing 更快，但是 code size 會變很大，因此要根據 profile 來決定是否使用 - Cloneing：function call 若常常傳相同的參數，則可以 copy 一份沒有參數的 procedure 以降低傳遞參數的 overhead - Automatic Parallelization - 有一些工具，但對 general 的程式碼而言，工具找到的平行度遠遠不夠 - Data-Dependence Tests - GCD test - `Z[7*i+3] = Z[3*i+5]+a;` - 檢查是否 `gcd(7,3)|(5-3)` - 是：可能有 dependency - 否：沒有 dependency - Banerjee Test - 檢查 RHS 是否有落再 LHS 的最大和最小值之間 - load/store instruction 會做 address 檢查，所以不行太多 - 一個 cycle 可以做 6 個 arithmetic 運算，但只能做 2 個 memory 運算 - 避免 cache miss 要去記憶體讀資料要很久 - Machine Independent Optimizations - loop-invariant code motion ```c= for (i = m; i < n; i++) for (j = m; j < i; j++) a[n*i + j] = b[j]; ``` - Strength Reduction - `16A -> A<<4` - Common Sub-Expression Elimination - Making use of registers - **Many machine independent optimizations are now machine dependent.** - Data Cache Prefetching - [prefetch 背景知識簡略整理](https://hackmd.io/@Rance/Hy3KRm1CG) - Coverage：有多少比例的 cache miss 被消除 - Accuracy：有多少比例的 prefetch 是有用的 - Timeliness：需要多早進行 prefetch 動作，和 prefetch 所需 cycle 數有關 - Stall-on-Miss = blocking caches = Lock-up caches - Stall-on-Use = non-blocking caches = Lock-up free caches - hardware 需要 MSHR (Miss Status and Handling Register) 支持 - Memory Level Parallelism：有多少個 prefetch 可以同時一起做 - Array Prefetch 在 loop 時，常常會 unrolling 來提高 hit rate，因為 cache line 可以一次抓幾個 byte 進 cache，不應該另外用 if/else 決定哪一個 iteration 要做 prefetch - Pointer Prefetching： - 在處理 tree 結構時，可能在處理某一 pointer 前，會先 prefetch left child 及 right child - 在處理 linked list 時，只 prefetch 下一個 pointer 可能效用不夠大，可能需要 jump pointer 來幫助 prefetch - [Prefetch in C](https://yunmingzhang.wordpress.com/2019/02/12/software-prefetching-in-c-c/) ## Ch16 Code Scheduling - **Data and Control dependences may generate *BUBBLES* in the pipeline** - Dependence DAG - 每個節點是 instruction - 每個 edge 是 instruction 間的 dependency - Flow dependence ： Read after Write - Anti-dependence ： Write after Read - Output-dependence ： Write after Write - 圖上的任何一個 topolical order 都是合法的 schedule 方式，找到其中最快的是一個 NP-hard 問題 - Heuristic 解法： - Anti-dependence 和 output-dependence 可以透過增加 pseudo register 來解決 - 1.List Scheduling - 先找到 critical path - 依序從 root 往 leaf schedule - 1.delay 長的優先 - 2.沒有 stall 優先 - 3.original order 優先 - 4.children 數量多優先 - 2.PrePass Scheduling - 先用 **pseudo register** 做 code scheduling - 再做 register assignment - 3.PostPass Scheduling - 先做 register assignment - 再用 **physical register** 做 code scheduling - 4.Compromised Scheduling - Two pass scheduling - 先用 PrePass Scheduling - 當遇到 register 不夠有很多 spill 指令時，再做 another round 的 PostPass Scheduling - Integrated scheduling - DAG reshaping - Loop Unrolling - Unrolling 之後，會有更多彈性來做 code scheduling - Prefetch 可能更好，在原本 dependency instructions 中可能可以多安插一些指令