# Compiler
###### tags: `CS-course` , `compiler`
---
## NFA example
1.
```graphviz
digraph compiler {
node [shape = circle]
rankdir = LR
label="a(a|b)*a"
s [label = "start" shape=plaintext]
s->0
0->1[label="ε"]
1->2[label="a"]
2->3[label="ε"]
3->4[label="ε"]
3->5[label="ε"]
4->6[label="a"]
5->7[label="b"]
6->8[label="ε"]
7->8[label="ε"]
8->9[label="ε"]
9->10[label="a"]
2->9[label="ε"]
8->3[label="ε"]
10 [shape=doublecircle]
}
```
---
## Syntax Analysis
### Eliminating Immediate Left Recursion
1. 先消除 non-immediate left recursion
* $S \to Ab | Ac \\ A \to Sc \\ \Longrightarrow A \to Abc | Acc$
2. 再消除 immediate left recursion
* $A \to A \alpha | \beta \\ \Longrightarrow \\ A \to \beta A' \\ A' \to \alpha A' | \epsilon$

> 消除 immediate left recursion
1. 先消除一部分 immediate left recursion
* $A \to A \alpha | \beta \\ \Longrightarrow \\ A \to \beta A' \\ A' \to \alpha A' | \epsilon$
2. 消除兩者的 non-immediate left recursion
* $S \to Ab | Ac \\ A \to Sc \\ \Longrightarrow A \to Abc | Acc$
3. 再消除 immediate left recursion
* $A \to A \alpha | \beta \\ \Longrightarrow \\ A \to \beta A' \\ A' \to \alpha A' | \epsilon$

* All Algorithm

---
## Top-Down Parsing
略等於 leftmost 推導,而 bottem-up parsing 則會略等於 rightmost 。
The **FIRST** and **FOLLOW** Functions
* Construction of both top-down and bottom-up parsers is aided by the two functions.
* They allow us to choose which production to apply, based on the next input symbol.
### $FIRST(\alpha)$:
> 從 $\alpha$ 可以推導出第一個 terminal 。(**最左**)
* The set of terminals that begins the strings derived from $\alpha$
* To compute $FIRST(X)$ for all grammar symbols $X$, apply the following rules until no more terminals or $ϵ$ can be added:
* If $X$ is a terminal, then $FIRST(X)$ is just $X$.
* If there is a Production $X → ϵ$, then $FIRST(X)$ contains $ϵ$.
* If there is a Production $X → Y_1Y_2...Y_k$, then $FIRST(X)$ contains $FIRST(Y_1Y_2...Y_k)$.
:::warning
**$FIRST(Y_1Y_2 … Y_k)$ is either**
* (if $FIRST (Y_1)$ doesn't contain $ϵ$) $FIRST (Y_1)$
* (if $FIRST(Y_1)$ contains $ϵ$) everything in $FIRST (Y_1)$ (except for $ϵ$) and everything in $FIRST (Y_2..Y_k)$
* If $FIRST(Y_1)FIRST(Y_2)..FIRST(Y_k)$ all contain $ϵ$, then $FIRST(Y_1Y_2…Y_k)$ contains $ϵ$.
:::
EX:
* $S→ABCDE$,$FIRST(S)=\{a,b,c\}$
* $A→a|ϵ$,$FIRST(A)=\{a,ϵ\}$
* $B→b|ϵ$,$FIRST(B)=\{b,ϵ\}$
* $C→c$,$FIRST(C)=\{c\}$
* $D→d|ϵ$,$FIRST(D)=\{d,ϵ\}$
* $E→e|ϵ$,$FIRST(E)=\{e,ϵ\}$
:::warning
**6/2 更新**
對於 s->sa 會回到 first(s) ,先保留住:

:::
### $FOLLOW(A)$
>A 後面可能的 terminal 。(**最右**)
>$FOLLOW$ 不會有 $ϵ$
* $FOLLOW(A)$: The set of terminals a that can appear immediately to the right of A in some sentential form.
* $a \in FOLLOW(A)$ if there exists a derivation $S \Rightarrow \alpha Aa\beta$
* To compute $FOLLOW(A)$ for all nonterminals A, apply the following rules until nothing can be added:
1. **If S is the start symbol, then put $ (the end of input marker) in $FOLLOW(S)$.**
2. If there is a production $A → \alpha B\beta$, then everything in $FIRST(\alpha)$ except for $ϵ$ is placed in $FOLLOW(B)$.
3. If there is a production $A → \alpha B$, then everything in $FOLLOW(A)$ is in $FOLLOW(B)$.
4. If there is a production $A → \alpha B\beta$, where $FIRST(\beta)$ contains $ϵ$, then everything in $FOLLOW(A)$ is in $FOLLOW(B)$.
看當前 node1 的父節點展開當中,node1 的右邊 node2。依序判斷 node2 的展開選項,判斷是否為 node1 的 follow 。如果 node2 有 $ϵ$ 選項,再往右找。如果找完父節點展開的所有 node 選項,但還沒有找到 follow ,則直接涵蓋 $follow(父節點)$。
EX:
>先往上一層(父節點)找右邊的。
>如果右邊是$ϵ$,直接包含$follow(父節點)$。
* $E \to TE'$,$follow(E) = \{),dollarsign\}$
* $E' \to +TE'|ϵ$
* $T \to FT'$,$follow(T)= \{+, follow(E), follow(E')\}= \{+, ),dollarsign\}$
* $T' \to *FT'|ϵ$
* $F \to (E)|ϵ$

注意最後第二個 symbol ,如果父節點的右側可能會得到 $\epsilon$ 而導致父節點後免沒東西放入 follow() ,則需涵蓋 follow(父節點) 。
---
## Recursive-descent / LL(1)
:::warning
http://jsmachines.sourceforge.net/machines/ll1.html
:::
* parsing **without backtracking**
* Can be constructed for a class of grammars called LL(1)
* The first “L” stands for **scanning the input from left to right**
* The second “L” stands for producing a **leftmost derivation**
* The (1) means **using one input symbol of look ahead at each step**
* **Limitations for LL(1)** grammar
* No left-recursive
* No ambiguous grammar
* No left-factor
:::success
**LL(1)**
從左到右掃描,最左優先推導。
最多只能偷看一個符號。
:::



### LL(1) parsing table
```
| input symobl
--------------------------------------
non-terminal |
variabbles |
|
|
```
> 給 statement ,填表。
### Example of LL(1) parsing table
E $\to$ TE'
E' $\to$ +TE' | $\epsilon$
T $\to$ FT'
T' $\to$ *FT' | $\epsilon$
F $\to$ (E) | **id**
FIRST
* FIRST(E) = FIRST(T) = FIRST(F) = {(,**id**)}
* FIRST(E') = {+, $\epsilon$}
* FIRST(T') = {*, $\epsilon$}
FOLLOW
* FOLLOW(E) = FOLLOW(E') = {),$}
* FOLLOW(T) = FOLLOW(T') = {+,),$}
* FOLLOW(F) = = {+,*,),$}
:::warning
Steps:
1. first
2. if have $\epsilon$ , use follow
:::
E $\to$ TE'
FIRST(TE') = {(,**id**)}
| | id | + | * | ( | ) | $ |
|:---:|:---------:| --- |:---:|:---------:| --- |:---:|
| E | E$\to$TE' | | | E$\to$TE' | | |
| E' | | | | | | |
| T | | | | | | |
| T' | | | | | | |
| F | | | | | | |
E' $\to$ +TE'
FIRST(+TE') = {+}
| | id | + | * | ( | ) | $ |
|:---:| --------- |:-----------:|:---:|:---------:| --- |:---:|
| E | E$\to$TE' | | | E$\to$TE' | | |
| E' | | E'$\to$+TE' | | | | |
| T | | | | | | |
| T' | | | | | | |
| F | | | | | | |
E' $\to$ $\epsilon$
FIRST($\epsilon$) = {$\epsilon$}
FOLLOW(E') = {),$}
| | id | + | * | ( | ) | $ |
|:---:| --------- |:-----------:|:---:|:---------:|:------------------:|:------------------:|
| E | E$\to$TE' | | | E$\to$TE' | | |
| E' | | E'$\to$+TE' | | | E'$\to$ $\epsilon$ | E'$\to$ $\epsilon$ |
| T | | | | | | |
| T' | | | | | | |
| F | | | | | | |
...
| | id | + | * | ( | ) | $ |
|:---:|:------------:|:-----------------:|:------------:|:---------:|:------------------:|:------------------:|
| E | E$\to$TE' | | | E$\to$TE' | | |
| E' | | E'$\to$+TE' | | | E'$\to$ $\epsilon$ | E'$\to$ $\epsilon$ |
| T | T$\to$FT' | | | T$\to$FT' | | |
| T' | | T'$\to$$\epsilon$ | T'$\to$* FT' | | T'$\to$ $\epsilon$ | T'$\to$ $\epsilon$ |
| F | F$\to$**id** | | | F$\to$(E) | | |
An error is detected when
* The terminal on top of the stack does not match the next input symbol
* Nonterminal A is on top of the stack, a is the next input symbol, and M[A, a] is empty
:::danger
For the grammer,
如果 parsing table 中的 index 有 conflict ( which cell has more than one rule ) ,則不是 LL(1) 。
:::
---
## LL(0) Grammar
An LL(0) parser works on a grammar in which there are no decisions to make.
So, in such grammars, every non terminal can have only one production.
### Part - 2 Bottom-Up Parsing
Construct a parse tree for an input string
* Begin at the leaves
* Work up towards the root
A general style of bottom-up parsing is known as shift-reduce parsing (移入-化簡語法分析)
The largest class of grammars for which shift-reduce parsing can be built is the LR grammar
### LR(k) Parser
LR(k) parsing
* an efficient, bottom-up parsing technique
* L: scanning the input from left to right
* R: producing a rightmost derivation in reverse
* k: the number of input symbols of lookahead
Advantages
* Can recognize almost all programming-language *
* The most commonly used nonbacktracking (無回溯) shift-reduce parsing
* Can detect a syntactic error as soon as possible
* The class of grammars that can be parsed using LR methods is a proper superset (父集合) of the class of grammars that can be parsed with LL parsers
* Drawback
* Too much work to construct an LR parser by hand
---
## LR(0) and SLR
### CLOSURE Function
### GOTO Function
* GOTO(I, X)
### LR(0) Automaton
:::warning
**LR(0) automation and SLR parsing table**
http://jsmachines.sourceforge.net/machines/slr.html
:::
* Shift-reduce
1. DFA
2. LR(0) table (parser)
:::warning
**NOTE**
03 - Syntax Analysis (part2)
Parse String using LR(0) parser page.70
---
Use the DFA with LR(0) table
查表可能查步道之後的符號要用什麼,會查到 $r_n$ (reduce) 。遇到這情況請 reduce 成之前標號的狀態。

> 註:每次 reduce 時,tree 都可以長一層。
:::
The parsing table consists of two parts
* A parsing action function ACTION
* A goto function GOTO
Value of ACTION[i, a] for state i and a terminal a
* Shift j: shift input a to the stack, but use state j to represent a
* Reduce A->beta: Reduce beta on the top of the stack to head A
* Accept: accept the input and finish parsing
* Error: discover an error and take some corrective action
The GOTO function
* If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j
### Construct LR(0) parsing tables
* Input: An augmented grammar G’
* Output: The LR(0)-parsing table for G’
* Method
* Construct C={I0, I1, …, In}, the collection of sets of LR(0) items for G’
* State i is constructed from Ii. The parsing actions for state i are determined as follows
* If [A-> alpha * a beta] is in Ii, and GOTO(Ii, a) = Ij, set ACTION[i,a] to “shift j”.
* Here a is a terminal
* If [A-> alpha *] is in Ii, set ACTION[i,a] to “reduce A->alpha” for all terminal a.
* Here A may not be S’
* If [S’->S*] is in Ii, set ACTION[i,$] to “accept”

### Conflict LR(0) parse

* LR conflict, which will **not** occur?
* A. **Shift/Shift**
* This will not, since DFA's state cannot output two same symbol
* B. Shift/Reduce
* C. Reduce/Reduce
### SLR-Parsing Tables
:::warning
**LR(0) automation and SLR parsing table**
http://jsmachines.sourceforge.net/machines/slr.html
:::
* Input: An augmented grammar G’
* Output: The SLR-parsing table for G’
* Method
* Construct C={I0, I1, …, In}, the collection of sets of LR(0) items for G’
* State i is constructed from Ii. The parsing actions for state i are determined as follows
1. If [A-> $\alpha$ * a $\beta$] is in $I_i$, and GOTO(Ii, a) = Ij, set ACTION[i,a] to “shift j”.
* Here a is a terminal
2. If [A-> $\alpha$ *] is in $I_i$, set ACTION[i,a] to **“reduce A-> alpha”** for all a in **FOLLOW(A)**.
* Here A may not be S’
3. If [S’->S*] is in Ii, set ACTION[i,$] to “accept”
* The goto transitions for state i are constructed for all nonterminals A using the rule: If GOTO(Ii, A) = Ij, then GOTO[i, A] = j
* All undefined entries are made “error”
* The initial state of the parser is the one constructed from the set of items containing [S’-> * S]
* The parsing table generated by the algorithm is called the SLR(1) table for G
* An LR parser using the SLR(1) table is said to be SLR(1)
Usually we omit the “(1)” after the “SLR”
* Given an LR parsing table
* $s_i$ means shift and stack state i
* $r_j$ means reduce by the production numbered j
* acc means accept
* blank means error

### Example of SLR

> 註:注意 $I_0$ 以及 $ 得設置。 $是在 $S'\to sS\cdot$ 通常是 $I_1$
> 注意, $S \to \cdot$ ,亦即 $S \to \cdot \epsilon$ = $S \to \cdot \epsilon \epsilon$ = ...
:::success
**NOTE**
* 因此,如果 SLR 有 conflict , LR(0) 一定有 conflict 。
* 而,若 LR(0) 沒有 clonflict ,則 SLR 必定沒有 conflict 。
* **Eevery SLR(1) grammar is unambiguous**
* However, there are **unambiguous grammars that are not SLR(1)**
* Example

:::
LR(1)/SLR(1) ,當中的 1 ,為 reduce 的時候,可以偷看一個 symbol 。
:::warning
LR(1)/SLR(1) 也可以分別簡稱 LR / SLR 。
:::
### Problem 1. not LL(1), is LR(0)?

$I_0SI_1$$
NO conflict, yes
### Problem 2.
### Problem 3. how to search table for "->$\epsilon$"

---
## LR(1) Parser
> LR(0) + follow() => SLR(1)
:::warning
http://jsmachines.sourceforge.net/machines/lr1.html
:::
* Also called **“canonical LR parser (CLR)”**, or CLR(1) parser
* LR(1) parser is built based on **LR(1) items**
* LR(1) items: the general form of an items becomes [A → α • β, a]
* The **lookahead** a has no effect when β ≠ є
* [A → α •, a] calls for a reduction by A → α **if the next symbol is a**
:::info
不嚴謹、有衝突 **LR(0) < SLR < LALR < LR()** 嚴謹
:::

與 LR(0) 唯一不同的是 LR(1) 的 CLOSURE 會去看展開過的子 item 在父 item 得剩餘 symbol => FIRST(...)
例如 :
1. $S \to \cdot C C ,$ $
2. 展開第一個 C 得下一個 item
$S \to \cdot c C$ + FIRST(C$)
3. FIRST(C$) = c/d
=> $S \to \cdot c C, c/d$

* Input: An augmented grammar G’
* Output: LR(1) parsing table functions ACTION and GOTO for G’
* Method
Construct C={I0, I1, …, In}, the collection of sets of LR(1) items for G’
State i is constructed from Ii. The parsing actions for state i are determined as follows
1. If [A→α•aβ, b] is in Ii, and GOTO(Ii, a) = Ij, set ACTION[i,a] to **“shift j**”. Here a is a terminal
2. If [A→α•, a] is in Ii, A ≠ S’, set ACTION[i,a] to “**reduce A→α**”
3. If [S’→S•, $] is in Ii, set ACTION[i,\$] to “**accept**”
* The goto transitions for state i are constructed for all nonterminals A using the rule: If GOTO(Ii, A) = Ij, then GOTO[i, A] = j
All undefined entries are made “error”
The initial state of the parser is the one constructed from the set of items containing [S’→•S,\$]
* The parsing table generated by the algorithm is called the LR(1) table for G
* An LR parser using the LR(1) table is said to be LR(1)
Usually we omit the “(1)” after the “LR”

> 精確地把 reduce 的分開,如 state 4, 7 的 $r_3$ 。
記得,後面得 First() 要放正在看的 symbol 後的其他 symbol :

---
## LALR
* LALR – lookahead LR
* Often used in practice because the tables are much smaller than the LR tables
* Most common syntactic constructs of programming languages can be expressed by an LALR grammar
* Method:
* A LALR(1) parsing table can be obtained by merging the sets of items of a LR(1) table with the same core
(core: item扣除逗號與後面的部份)
* Example. Consider the following sets of LR(1) items
**LA(1) merge state to generate LALR will not get the NFA (重邊與 $\epsilon$) 。**

> $I_3$ 、 $I_6$ 可以 merge ,而因已從其延伸的藍、綠框 $I_x$ 也可以分別 merge 。
Input: An augmented grammar G’
Output: LALR parsing-table functions ACTION and GOTO for G’
Method
- Construct C = {I0, I1, I2, ..., In }, the sets of LR(1) items
- For each core in C, find all sets with that core, and replace these sets by
their union. Let result be C’ = {J0, J1, J2, ..., Jm}
- State i is constructed from Ji, Actions for state i
- If [A→$alpha$•a$beta$, b] is in Ii, and GOTO(Ii, a) = Ij, set ACTION[i,a] to “shift j”. Here a is a terminal
- If [A→$alpha$•, a] is in Ii, A S’, set ACTION[i,a] to “reduce A→$alpha$”
- If [S’→S•, $] is in Ii, set ACTION[i,$] to “accept”
- If Ji = I1 ∪ I2 ∪ ... ∪ Ik, then GOTO(I1,X) = GOTO(I2,X) = ... = GOTO(Ik,X)
- Let K be the union of items having the same core as GOTO(I1, X), then GOTO(J, X) = K
### LALR(1) Conflicts
- 表格只要有 1 格有 2 個以上的選擇。
- No shift-reduce conflicts will be introduced **by merging**
- Because shift depends only on core, not the lookahead
- However, it may produce reduce/reduce conflict
:::warning
reduce/reduce 是因為 merging ,所以他還是有可能有 shift/reduce conflict
:::
Example: Consider the grammar
S → a A d | b B d | a B e | b A e
A → c
B → c
整體上而言,會有下列衝突:
* shift/shift
* shift/reduce
* reduce/reduce
:::success
**嚴謹程度**
LR(0) < SLR < LALR < LR(1)
:::
### LALR(1) Example


> A::first(d\$) = {d}
> B::first(e\$) = {e}
> 選當前來源的後續字串作 first


> NO Conflict
* $I_6$ $I_9$ merge

> **Merge** 需要圖,只有 LR(1) parsing tables 和 grammar 是沒辦法直接作 merge 。
:::warning
所以這個 example 是:
* LALR 不是
* LR(1) 是
:::
:::success
如果有 $\epsilon$ :
在 $S'\to\cdot S,$\$ 在第五行生成 $S\to \cdot ,$\$ 。
LR series 就是 DFA ,沒有 $\epsilon$ 。
:::
---
http://www.cs.nthu.edu.tw/~wkhon/assignments/assign2ans.pdf
https://web.njit.edu/~marvin/cs341/hw/hwsoln05.pdf
https://math.stackexchange.com/questions/353031/w-such-that-it-contains-at-least-3-ones-is-my-approach-to-the-cfg-right
https://github.com/fool2fish/dragon-book-exercise-answers/blob/master/ch04/4.2/4.2.md
:::info
[Generate Predict, First, and Follow Sets from EBNF (Extended Backus Naur Form) Grammar](http://hackingoff.com/compilers/predict-first-follow-set)
:::
---
## C2IR - Project

### Example
* [A small calculator to parse and execute mathematical expressions based on C, Flex, Bison.](https://github.com/BaseMax/SmallCalculator)
* [yacc](https://github.com/Kray-G/kinx/blob/b2ef0e40fb49e9497f71eaa13fe5e456522fc37d/src/kinx.y)
### Reference
* [clang to llvm ir](https://stackoverflow.com/questions/9148890/how-to-make-clang-compile-to-llvm-ir)
```
$ cloc .
10 text files.
10 unique files.
1 file ignored.
github.com/AlDanial/cloc v 1.82 T=0.01 s (1266.3 files/s, 154364.4 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
C/C++ Header 4 177 27 706
yacc 1 28 3 91
make 1 15 2 51
lex 1 6 0 43
C++ 1 11 0 23
C 1 7 0 16
Markdown 1 2 0 11
-------------------------------------------------------------------------------
SUM: 10 246 32 941
-------------------------------------------------------------------------------
```
```
rm -f lex.cpp parser.cpp lex.hpp parser.hpp
rm -f parser.tab.c parser.tab.h parser.output
rm -f lex.o parser.o main.o
rm -f c2ir
flex --header-file=lex.hpp -o lex.cpp lex.l
bison -d -v -o parser.cpp parser.y
g++ -c lex.cpp `llvm-config --cppflags` -std=c++14
g++ -c parser.cpp `llvm-config --cppflags` -std=c++14
g++ -c main.cpp `llvm-config --cppflags` -std=c++14
g++ -o c2ir lex.o parser.o main.o `llvm-config --ldflags` -lpthread -ldl -lz -lncurses -rdynamic `llvm-config --libs`
#include <stdio.h>
extern int puts(const char *str);
extern int printf(const char *str, ...);
int do_math(int a)
{
int b = a + 1;
printf("b = %d", b);
puts(" ");
return b;
}
int main () {
char *string = "Hello World!";
do_math(1);
puts(string);
return 0;
}
LEX_HEADER
LEX_EXTERN
LEX_INT
NIdentifier: int
LEX_IDENTIFIER
NIdentifier: puts
LEX_LPAREN
LEX_CHAR
LEX_ASTERISK
NIdentifier: char
LEX_IDENTIFIER
NIdentifier: str
LEX_RPAREN
NVariableDeclaration
LEX_SEQPOINT
Extern NFunctionDeclaration
NBlock
LEX_EXTERN
LEX_INT
NIdentifier: int
LEX_IDENTIFIER
NIdentifier: printf
LEX_LPAREN
LEX_CHAR
LEX_ASTERISK
NIdentifier: char
LEX_IDENTIFIER
NIdentifier: str
LEX_RPAREN
NVariableDeclaration
LEX_SEQPOINT
Extern NFunctionDeclaration
LEX_INT
NIdentifier: int
LEX_IDENTIFIER
NIdentifier: do_math
LEX_LPAREN
LEX_INT
NIdentifier: int
LEX_IDENTIFIER
NIdentifier: a
LEX_RPAREN
NVariableDeclaration
LEX_LBRACE
LEX_INT
NIdentifier: int
LEX_IDENTIFIER
NIdentifier: b
LEX_EQUAL
LEX_IDENTIFIER
NIdentifier: a
LEX_ADD
LEX_INTEGER
NInteger: 1
NBinaryOperator
NVariableDeclaration
LEX_SEQPOINT
NBlock
LEX_IDENTIFIER
NIdentifier: printf
LEX_LPAREN
LEX_LITERAL
NLiteral: b = %d
LEX_COMMA
LEX_IDENTIFIER
NIdentifier: b
LEX_RPAREN
NMethdCall
LEX_SEQPOINT
NExpressionStatment
LEX_IDENTIFIER
NIdentifier: puts
LEX_LPAREN
LEX_LITERAL
NLiteral:
LEX_RPAREN
NMethdCall
LEX_SEQPOINT
NExpressionStatment
LEX_RETURN
LEX_IDENTIFIER
NIdentifier: b
LEX_SEQPOINT
NReturnStatement
LEX_RBRACE
NFunctionDeclaration
LEX_INT
NIdentifier: int
LEX_IDENTIFIER
NIdentifier: main
LEX_LPAREN
LEX_RPAREN
LEX_LBRACE
LEX_CHAR
LEX_ASTERISK
NIdentifier: char
LEX_IDENTIFIER
NIdentifier: string
LEX_EQUAL
LEX_LITERAL
NLiteral: Hello World!
NVariableDeclaration
LEX_SEQPOINT
NBlock
LEX_IDENTIFIER
NIdentifier: do_math
LEX_LPAREN
LEX_INTEGER
NInteger: 1
LEX_RPAREN
NMethdCall
LEX_SEQPOINT
NExpressionStatment
LEX_IDENTIFIER
NIdentifier: puts
LEX_LPAREN
LEX_IDENTIFIER
NIdentifier: string
LEX_RPAREN
NMethdCall
LEX_SEQPOINT
NExpressionStatment
LEX_RETURN
LEX_INTEGER
NInteger: 0
LEX_SEQPOINT
NReturnStatement
LEX_RBRACE
NFunctionDeclaration
parser program block
0x55c0fc9f1e50
---------------------
Generating code...
Start generate code
After push Block
Generating block
Start of block
Generating code for 20NFunctionDeclaration
Generating function declaration of puts
identifier type: char
identifier type: int
Generating code for 20NFunctionDeclaration
Generating function declaration of printf
identifier type: char
identifier type: int
Generating code for 20NFunctionDeclaration
Generating function declaration of do_math
identifier type: int
identifier type: int
start of arguments
Generating variable declaration of int a
identifier type: int
End of arguments
Generating block
Start of block
Generating code for 20NVariableDeclaration
Generating variable declaration of int b
identifier type: int
NAssignment
Generating assignment of b =
Generating binary operator
Generating identifier a
Generating Integer: 1
L is IntegerTyID
R is IntegerTyID
Generating code for 20NExpressionStatement
Generating method call of printf
Function arguments size not match, calleeF=0, this->arguments=2
Start of callee arg
Generating Literal: b = %d
Generating identifier b
End of callee arg
Generating code for 20NExpressionStatement
Generating method call of puts
Start of callee arg
Generating Literal:
End of callee arg
Generating code for 16NReturnStatement
Generating return statement
Generating identifier b
End of block
Function block created
Generating code for 20NFunctionDeclaration
Generating function declaration of main
identifier type: int
start of arguments
End of arguments
Generating block
Start of block
Generating code for 20NVariableDeclaration
Generating variable declaration of char * string
identifier type: char
NAssignment
Generating assignment of string =
Generating Literal: Hello World!
Generating code for 20NExpressionStatement
Generating method call of do_math
Start of callee arg
Generating Integer: 1
End of callee arg
Generating code for 20NExpressionStatement
Generating method call of puts
Start of callee arg
Generating identifier string
End of callee arg
Generating code for 16NReturnStatement
Generating return statement
Generating Integer: 0
End of block
Function block created
End of block
After code gen
Code is generated.
----------------------
; ModuleID = 'main'
source_filename = "main"
@.str = private constant [4 x i8] c"%d\0A\00"
@string = private unnamed_addr constant [7 x i8] c"b = %d\00", align 1
@string.2 = private unnamed_addr constant [2 x i8] c" \00", align 1
@string.3 = private unnamed_addr constant [13 x i8] c"Hello World!\00", align 1
declare i32 @printf(i8*, ...)
define internal void @echo(i64 %toPrint) {
entry:
%0 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i64 %toPrint)
ret void
}
declare i32 @puts(i8*)
declare i32 @printf.1(i8*)
define i32 @do_math(i32 %a) {
entry:
%0 = alloca i32
store i32 %a, i32* %0
%1 = alloca i32
%arrayPtr = load i32, i32* %0
%2 = load i32, i32* %0
%addtmp = add i32 %2, 1
store i32 %addtmp, i32* %1
%arrayPtr1 = load i32, i32* %1
%3 = load i32, i32* %1
%calltmp = call i32 (i8*, ...) @printf([7 x i8]* @string, i32 %3)
%calltmp2 = call i32 @puts([2 x i8]* @string.2)
%arrayPtr3 = load i32, i32* %1
%4 = load i32, i32* %1
ret i32 %4
}
define i32 @main() {
entry:
%0 = alloca i8*
store [13 x i8]* @string.3, i8** %0
%calltmp = call i32 @do_math(i32 1)
%arrayPtr = load i8*, i8** %0
%1 = load i8*, i8** %0
%calltmp1 = call i32 @puts(i8* %1)
ret i32 0
}
---------------------
Object code wrote to text.o
-------sample--------
clang -S -emit-llvm text.c
cat text.ll
; ModuleID = 'text.c'
source_filename = "text.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [7 x i8] c"b = %d\00", align 1
@.str.1 = private unnamed_addr constant [2 x i8] c" \00", align 1
@.str.2 = private unnamed_addr constant [13 x i8] c"Hello World!\00", align 1
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @do_math(i32 %0) #0 {
%2 = alloca i32, align 4
%3 = alloca i32, align 4
store i32 %0, i32* %2, align 4
%4 = load i32, i32* %2, align 4
%5 = add nsw i32 %4, 1
store i32 %5, i32* %3, align 4
%6 = load i32, i32* %3, align 4
%7 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0), i32 %6)
%8 = call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str.1, i64 0, i64 0))
%9 = load i32, i32* %3, align 4
ret i32 %9
}
declare dso_local i32 @printf(i8*, ...) #1
declare dso_local i32 @puts(i8*) #1
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca i8*, align 8
store i32 0, i32* %1, align 4
store i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str.2, i64 0, i64 0), i8** %2, align 8
%3 = call i32 @do_math(i32 1)
%4 = load i8*, i8** %2, align 8
%5 = call i32 @puts(i8* %4)
ret i32 0
}
attributes #0 = { noinline nounwind optnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 10.0.0-4ubuntu1 "}
-------sample--------
clang -o test text.o
./test
b = 2
Hello World!
rm -f test text.o
```