# Code Generation
## Definition
Code generation is a mechanism where a compiler takes the source code as an input and converts it into machine code. This machine code is actually executed by the system. **Code generation is generally considered the last phase of compilation, although there are multiple intermediate steps performed before the final executable is produced.** These intermediate steps are used to perform optimization and other relevant processes.
After passing through multiple phases, a parse tree or an abstract syntax tree is generated, and that is the input to the code generator. [Reference](https://www.techopedia.com/definition/6531/code-generation)
>The code generator is the last part of the compiler: it takes a parse tree or an abstract syntax tree as input and finally outputs machine code.
>

>[Compiler Design - Code Generation](https://www.tutorialspoint.com/compiler_design/compiler_design_code_generation.htm)
>The site has a series of `compiler design` articles for reference.
>
## Compiler vs Linker vs Loader
In brief, a compiler converts source code into object code, a linker combines one or more object files generated by the compiler into a single executable file, and a loader places programs into memory and prepares them for execution.

#### Compiler

#### Linker

>[What is the Difference Between Linker Loader and Compiler](https://pediaa.com/what-is-the-difference-between-linker-loader-and-compiler/)
>
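## Translation Process

1. Lexical analysis
- Converts a stream of characters into `token`s. After conversion, newlines, whitespace, comments, and so on are gone, and a `symbol table` is produced.

2. Syntactic analysis
- Produces the `parse tree` (this phase is also called parsing).

3. Semantic analysis
- Checks whether the program contains semantic errors.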
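
The lexical analysis step can be sketched as a small tokenizer. This is a minimal sketch, not taken from the referenced material: the token classes in `TOKEN_SPEC`, the `#` comment syntax, and the shape of the symbol table (identifier mapped to an index) are all assumptions chosen for illustration.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("COMMENT", r"#[^\n]*"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for the source string."""
    tokens = []
    symbol_table = {}
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        # Whitespace, newlines, and comments never reach the token stream.
        if kind in ("SKIP", "COMMENT"):
            continue
        # Each distinct identifier gets one entry in the symbol table.
        if kind == "IDENT":
            symbol_table.setdefault(text, len(symbol_table))
        tokens.append((kind, text))
    return tokens, symbol_table
```

Calling `tokenize("x = y + 42  # add")` yields five tokens and a symbol table containing `x` and `y`; the comment and the spaces are dropped.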


>[Translators (translation process)](http://www.ablmcc.edu.hk/~scy/CIT/compilation.pdf)
>
:::info
Code generation, still to look up:
## Choices of IR
1. three-address representations
2. virtual machine representations
3. linear representations
4. graphical representations
5. DAGs
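
As an illustration of the first choice, a three-address representation, here is a minimal sketch that lowers an expression tree into three-address instructions. The nested-tuple input format `(op, left, right)` and the temporary names `t1`, `t2`, ... are assumptions made for this example.

```python
from itertools import count


def to_three_address(expr):
    """Lower a nested-tuple expression to a list of three-address instructions.

    expr is either a variable name / constant (str or int) or a tuple
    (op, left, right). Both the tuple shape and the t1, t2, ... temporary
    names are assumptions made for this sketch.
    """
    code = []
    temp_ids = count(1)

    def lower(node):
        # Leaves (names and constants) are used directly as operands.
        if not isinstance(node, tuple):
            return str(node)
        op, left, right = node
        left_name = lower(left)
        right_name = lower(right)
        # Each operator gets a fresh temporary holding its result.
        temp = f"t{next(temp_ids)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = lower(expr)
    return code, result
```

For `a + b * c`, written as `("+", "a", ("*", "b", "c"))`, this produces `t1 = b * c` followed by `t2 = a + t1`, so every instruction has at most one operator and three addresses.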
## Conversion steps
### Instruction selection
#### Determining factors
1. the level of the IR
2. the nature of the instruction-set architecture
3. the desired quality of the generated code.
### Register Allocation
### Evaluation Order
## output type
1. absolute machine-language
2. relocatable machine-language
- (often called an object module)
- A key problem in code generation is deciding what values to hold in what registers.
- Certain machines require register-pairs (an even and next odd numbered register) for some operands and results.
- For example, multiplication or division needs two consecutive registers.
3. assembly-language program
## Basic Blocks and Flow Graphs
Split the `intermediate code` into `block`s, then draw all the `block`s as a `graph`: each `basic block` becomes a `node`, and an `edge` between two `block`s means control can flow from one to the other.
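
A minimal sketch of building the graph, assuming the code has already been partitioned into blocks. The encoding is hypothetical: each block is a `(name, instructions)` pair, and a jump names its target block directly.

```python
def build_flow_graph(blocks):
    """Return {block_name: [successor names]} for an ordered list of blocks.

    blocks is a list of (name, instructions); each instruction is a tuple
    such as ("op", "x = x + 1"), ("goto", "B1") or ("if", "x < 10", "B1").
    Jump targets are block names. This encoding is hypothetical.
    """
    graph = {}
    for index, (name, instructions) in enumerate(blocks):
        successors = []
        last = instructions[-1]
        # A jump at the end of the block adds an edge to its target.
        if last[0] in ("goto", "if"):
            successors.append(last[-1])
        # Anything except an unconditional jump can fall through.
        if last[0] != "goto" and index + 1 < len(blocks):
            fall_through = blocks[index + 1][0]
            if fall_through not in successors:
                successors.append(fall_through)
        graph[name] = successors
    return graph
```

A block that ends in a conditional jump gets two outgoing edges (the target and the next block), while a block that ends in an unconditional jump gets only one.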
### Basic Block
Find the `leader`s in the `intermediate code`; each block consists of a `leader` and the code up to, but not including, the next `leader`.
#### Leader
1. The first three-address instruction in the intermediate code is a leader.
2. Any instruction that is the target of a conditional or unconditional jump is a leader.
3. Any instruction that immediately follows a conditional or unconditional jump is a leader.
>See the example on p. 550.
>
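
The three leader rules can be sketched as follows. This is a minimal illustration rather than the book's algorithm text: the instruction encoding (tuples whose last element is the index of the jump target) is an assumption.

```python
def find_leaders(code):
    """Return the sorted indices of leader instructions.

    code is a list of tuples: ("op", text), ("goto", target_index) or
    ("if", condition, target_index). This encoding is hypothetical.
    """
    leaders = set()
    if code:
        leaders.add(0)  # Rule 1: the first instruction.
    for index, instruction in enumerate(code):
        if instruction[0] in ("goto", "if"):
            leaders.add(instruction[-1])  # Rule 2: the jump target.
            if index + 1 < len(code):
                leaders.add(index + 1)  # Rule 3: the instruction after a jump.
    return sorted(leaders)


def split_basic_blocks(code):
    """Split code into basic blocks, one per leader."""
    leaders = find_leaders(code)
    # Each block runs from its leader up to (not including) the next leader.
    ends = leaders[1:] + [len(code)]
    return [code[start:end] for start, end in zip(leaders, ends)]
```

For a four-instruction loop whose conditional jump at index 2 targets index 1, the leaders are indices 0, 1, and 3, which gives three blocks.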

>
## Others
1. stack-based machine
To achieve high performance, the top of the stack is typically kept in registers.
2. just-in-time (JIT) Java
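
To make the stack-based machine concrete, here is a minimal sketch that compiles an expression to stack code and interprets it. The instruction names (`push` plus the operator symbols) and the nested-tuple expression format are assumptions; the interpreter uses a plain Python list and does not model keeping the top of the stack in registers.

```python
import operator

# Hypothetical instruction set: "push" plus one instruction per operator.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def compile_to_stack_code(expr):
    """Emit postfix stack-machine instructions for a nested-tuple expression."""
    if not isinstance(expr, tuple):
        return [("push", expr)]
    op, left, right = expr
    # Operands are evaluated first, then the operator consumes them.
    return compile_to_stack_code(left) + compile_to_stack_code(right) + [(op,)]


def run(program):
    """Execute stack code and return the value left on top of the stack."""
    stack = []
    for instruction in program:
        if instruction[0] == "push":
            stack.append(instruction[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[instruction[0]](left, right))
    return stack.pop()
```

Code generation for such a machine is simple because operands never need explicit register names: `2 + 3 * 4` becomes three pushes followed by `*` and `+`.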
## Key points
1. We need to know instruction costs in order to design good code sequences but, unfortunately, accurate cost information is often difficult to obtain.
- When deciding which `instruction` to use, it is best to know in advance how much time each `instruction` takes, so that the conversion can be as efficient as possible.
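
A minimal sketch of cost-driven selection, assuming a cost table is available. The opcodes and the numbers in `COST` are invented for illustration and do not describe any real machine.

```python
# Hypothetical per-instruction costs (for example, cycles); real numbers
# depend on the target machine and are often hard to obtain.
COST = {"LD": 2, "ST": 2, "ADD": 1, "INC": 1, "MUL": 3, "SHL": 1}


def sequence_cost(sequence):
    """Sum the cost of each instruction, keyed by its opcode."""
    return sum(COST[instruction.split()[0]] for instruction in sequence)


def cheapest(candidates):
    """Pick the candidate sequence with the lowest total cost."""
    return min(candidates, key=sequence_cost)
```

Given two candidate sequences for `x = x * 2`, one using `MUL` and one using `SHL`, `cheapest` picks the shift version under this table; with a different table the choice could flip, which is why accurate costs matter.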
## Progress
To re-read:
`8.4.2` `8.3`
Read up to:
`8.4.4`
:::
## LLVM
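LLVM now refers to a compiler project that modularizes each part of a compiler, so you can pick the parts you need and assemble them into your own compiler. The parts share one intermediate language (LLVM IR), which the different `LLVM` modules can optimize in different ways before it is finally converted into assembly language.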

>Reference (with a simple example): [編譯器 LLVM 淺淺玩](https://medium.com/@zetavg/%E7%B7%A8%E8%AD%AF%E5%99%A8-llvm-%E6%B7%BA%E6%B7%BA%E7%8E%A9-42a58c7a7309)
>Overview chapter from The Architecture of Open Source Applications: [LLVM](http://www.aosabook.org/en/llvm.html)
>Documentation: [The LLVM Target-Independent Code Generator](https://llvm.org/docs/CodeGenerator.html)
>[Writing an LLVM Backend](https://llvm.org/docs/WritingAnLLVMBackend.html)
>
:::info
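Going forward, the lab plans to use the `LLVM` compiler to convert everything at the upper level into `LLVM IR`, and our job is to design how to convert `LLVM IR` into the specified `machine code`.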
Key word: 
:::
###### tags: `專題`