# Code Generation

## Definition

- What does Code Generation mean?

Code generation is a mechanism where a compiler takes the source code as an input and converts it into machine code. This machine code is actually executed by the system. **Code generation is generally considered the last phase of compilation, although there are multiple intermediate steps performed before the final executable is produced.** These intermediate steps are used to perform optimization and other relevant processes. After passing multiple phases, a parse tree or an abstract syntax tree is generated and that is the input to the code generator. [Reference](https://www.techopedia.com/definition/6531/code-generation)

>The code generator is the last part of the compiler: it takes a parse tree or an abstract syntax tree as input and outputs machine code.

![](https://i.imgur.com/oGVJXAh.png)

>[Compiler Design - Code Generation](https://www.tutorialspoint.com/compiler_design/compiler_design_code_generation.htm)
>This site has a whole series of `compiler design` articles for reference.

## Compiler vs Linker vs Loader

In brief, the difference between a linker, a loader, and a compiler is this: a compiler converts the source code into object code, a linker combines one or more object files generated by the compiler into a single executable file, and a loader places programs into memory and prepares them for execution.

![](https://i.imgur.com/dBuD3RU.png)

#### Compiler

![](https://i.imgur.com/FjkYbjZ.png)

#### Linker

![](https://i.imgur.com/ijl1tj8.png)

>[What is the Difference Between Linker Loader and Compiler](https://pediaa.com/what-is-the-difference-between-linker-loader-and-compiler/)

## Translation process

![](https://i.imgur.com/pV3S3gH.png)

1. Lexical analysis
    - Converts a stream of characters into `token`s. After conversion, line breaks, whitespace, comments, and so on are gone, and a `symbol table` is produced.

    ![](https://i.imgur.com/fQ8Y6YI.png)
2. Syntactic analysis
    - Produces the `parse tree`; the component that performs this phase is the parser.

    ![](https://i.imgur.com/8JP3ruB.png)
3. Semantic analysis
    - Checks whether the program contains semantic errors.

    ![](https://i.imgur.com/8O0yisG.png)
    ![](https://i.imgur.com/omjPlvg.png)

>[Translators 翻譯程序](http://www.ablmcc.edu.hk/~scy/CIT/compilation.pdf)

:::info
Code generation, to look up:

## Choices of IR

1. three-address representations
2. virtual machine representations
3. linear representations
4. graphical representations
5. DAGs

## Conversion steps

### Instruction selection

#### Determining factors

1. the level of the IR
2. the nature of the instruction-set architecture
3. the desired quality of the generated code

### Register Allocation

### Evaluation Order

## Output type

1. absolute machine-language program
2. relocatable machine-language program
    - (often called an object module)
    - A key problem in code generation is deciding what values to hold in what registers.
    - Certain machines require register-pairs (an even and next odd numbered register) for some operands and results.
        - For example, on such machines a multiplication or division needs two consecutive registers.
3. assembly-language program

## Basic Blocks and Flow Graphs

Split the `intermediate code` into `block`s, then draw all the blocks as a `graph`: each `basic block` becomes a `node`, and an `edge` between two blocks means control can flow from one to the other.

### Basic Block

Find the `leader`s in the `intermediate code`. Each block runs from one leader up to, but not including, the next leader (or the end of the code).

#### Leader

1. The first three-address instruction in the intermediate code is a leader.
2. Any instruction that is the target of a conditional or unconditional jump is a leader.
3. Any instruction that immediately follows a conditional or unconditional jump is a leader.

>Example on p. 550
>![](https://i.imgur.com/3QeOPm0.png) ![](https://i.imgur.com/6Lnljad.png)

## Others

1. Stack-based machines
    - To achieve high performance the top of the stack is typically kept in registers.
2. Just-in-time (JIT) compilers, e.g. for Java

## Key points

1. We need to know instruction costs in order to design good code sequences but, unfortunately, accurate cost information is often difficult to obtain.
    - When deciding which `instruction` to use, it helps to know how long each `instruction` takes, so that the translation can be as efficient as possible.

## Progress

To re-read: `8.4.2`, `8.3`
Read up to: `8.4.4`
:::

## LLVM

LLVM now refers to a compiler project that modularizes the individual parts of a compiler, so you can pick the parts you need and assemble them into your own compiler. The modules share one intermediate language (LLVM IR), which the different `LLVM` modules can optimize in different ways before it is finally converted into assembly language.

![](https://i.imgur.com/A88jlm2.png)

>Reference (with simple examples): [編譯器 LLVM 淺淺玩](https://medium.com/@zetavg/%E7%B7%A8%E8%AD%AF%E5%99%A8-llvm-%E6%B7%BA%E6%B7%BA%E7%8E%A9-42a58c7a7309)
>Overview (The Architecture of Open Source Applications): [LLVM](http://www.aosabook.org/en/llvm.html)
>Documentation: [The LLVM Target-Independent Code Generator](https://llvm.org/docs/CodeGenerator.html)
>[Writing an LLVM Backend](https://llvm.org/docs/WritingAnLLVMBackend.html)

:::info
Going forward, the lab plans to use the `LLVM` compiler to translate everything at the upper layers into `LLVM IR`; our job is to design how `LLVM IR` gets converted into the specified `machine code`.

Keywords:
![](https://i.imgur.com/Nwibi0a.png)
:::

###### tags: `專題`
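
#### Sketch: leaders, basic blocks, and flow-graph edges

The three leader rules from the Basic Blocks and Flow Graphs notes can be tried out in a few lines of Python. This is only a minimal sketch under stated assumptions: the `Instr` class, its `target` and `conditional` fields, and the 0-based instruction indices are hypothetical names made up for illustration. They do not come from the textbook or from LLVM. Jump targets are assumed to be valid indices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """Hypothetical three-address instruction (illustration only)."""
    text: str
    target: Optional[int] = None  # 0-based jump target, None if not a jump
    conditional: bool = False     # True for a conditional jump


def find_leaders(code):
    """Apply the three leader rules; return the sorted leader indices."""
    if not code:
        return []
    leaders = {0}                          # rule 1: first instruction
    for i, ins in enumerate(code):
        if ins.target is not None:
            leaders.add(ins.target)        # rule 2: target of a jump
            if i + 1 < len(code):
                leaders.add(i + 1)         # rule 3: instruction after a jump
    return sorted(leaders)


def basic_blocks(code):
    """Return (start, end) index pairs, end exclusive, one per block."""
    leaders = find_leaders(code)
    ends = leaders[1:] + [len(code)]
    return list(zip(leaders, ends))


def flow_graph(code):
    """Return the blocks plus edges (from_block, to_block) between them."""
    blocks = basic_blocks(code)
    block_of = {start: b for b, (start, _) in enumerate(blocks)}
    edges = set()
    for b, (_, end) in enumerate(blocks):
        last = code[end - 1]
        if last.target is not None:
            edges.add((b, block_of[last.target]))   # jump edge
        falls_through = last.target is None or last.conditional
        if falls_through and end < len(code):
            edges.add((b, block_of[end]))           # fall-through edge
    return blocks, sorted(edges)


example = [
    Instr("i = 1"),
    Instr("t = i * 8"),
    Instr("a[t] = 0"),
    Instr("i = i + 1"),
    Instr("if i <= 10 goto 1", target=1, conditional=True),
    Instr("x = 0"),
]

print(find_leaders(example))   # [0, 1, 5]
print(flow_graph(example))     # ([(0, 1), (1, 5), (5, 6)], [(0, 1), (1, 1), (1, 2)])
```

In this example the loop body (instructions 1 to 4) becomes a single block with an edge back to itself, which is how a loop shows up in the flow graph.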