---
title: Archi 2
---
# Computer Architectures
NTNU 計算機結構
##### [Back to Note Overview](https://reurl.cc/XXeYaE)
##### [Back to Computer Architectures](https://hackmd.io/@NTNUCSIE112/Archi109-2)
{%hackmd @sophie8909/pink_theme %}
###### tags: `NTNU` `CSIE` `必修` `Computer Architectures` `109-2`
<!-- tag順序 [學校] [系 必選] or [學程 學程名(不含學程的 e.g. 大師創業)] [課程] [開課學期]-->
## Ch.02 Instructions: Language of the Computer
### 2.1 Introduction
#### Instruction Set 指令集
+ A repertoire of instruction of a computer
計算機的指令集
+ 不同的電腦會有不同的指令集
+ 但有許多方面會相同
+ 早期的電腦都有非常簡單的指令集
+ 簡化設計
+ 許多現代電腦也有簡單的指令集
#### MIPS Instruction Set MIPS 指令集
- Large share of embedded core market
- Applications in consumer electronics, network/storage equipment, cameras, printers...
### 2.2 Operations of Computer Hardware
#### Arithmetic Operations 算術運算
+ Add and subtract, three operands 加減法、三個操作數
+ Two sources and one destination
兩個來源一個目標
```mips=
add a,b,c #a=b+c
sub a,b,c #a=b-c
```
+ 所有算術運算都是這個格式
+ 【Design Principle 1】Simplicity favors regularity 簡潔有助於規律性
+ 規律性有助於讓實作更簡單
+ Simplicity enables higher performance at lower cost
+ 簡化可以更容易最佳化
##### Example
+ C Code
```c=
a = ( b + c ) - ( d + e )
```
+ MIPS Code
```MIPS=
add t0, b, c #temp t0 = b+c
add t1, d, e #temp t1 = d+e
sub a,t0,t1 # a = t0-t1
```
<!-- week02 end -->
### 2.3 Operands of Computer Hardware
#### Register Operands 暫存器的運算子
- used by **Arithmetic instructions**
- MIPS has 32 * 32-bit register file
- used for frequently accessed data
- Numbered 0 to 31
- word 字節: 32-bit data
- Assembler names 組譯器中的名字
- temporary values 暫時的數值: `$t0`,`$t1`,$\dots$,`$t9`
- saved variables 儲存的變數: `$s0`,`$s1`,$\dots$,`$s7`
- 【Design Principle 2】Smaller is faster 越小越快
- c.f. main memory 主記憶體: million of locations 百萬個位置
##### Example
- C Code
```c=
a=(b+c)-(d+e)
```
a,b,c,d,e 分別存在 `$s0`,`$s1`,`$s2`,`$s3`,`$s4`
- Compiled MIPS code
```MIPS=
add $t0,$s1,$s2
add $t1,$s3,$s4
sub $s0,$t0,$t1
```
### Memory Operands 記憶體的運算
+ Main memory is used for composite data
主記憶體是用於綜合資料
+ Array 陣列、 structure 結構、 dynamic data 動態資料...
+ 用於算術運算(arithmetic operations)
+ Load values from memory into register
將數值從記憶體載入到快取
+ Store result from register to memory
將結果從快取存入到記憶體
+ Memory is byte addressed
記憶體用byte定址
+ Each address identifies an 8-bit/byte
每個位置表示一個 8-bit/byte
+ Words are aligned in memory
字節在記憶體中為對齊的
+ Address nust be a multiple of 4
地址必為四的倍數
+ MIPS is Big Endian
+ Big-Endian 是指資料放進記憶體中的時候,最高位的位元組會放在最低的記憶體位址上
+ Little-Endian 則相反
#### Example 1
- C code
```c=
g=h+A[8];
```
- g in `$s1`, h in `$s2`, base adddddress of A in `$s3`
- Compiled MIPS code
- Index 8 requires offset of 32
- 4bytes per word
```MIPS=
lw $t0, 32($s3) # load word
# 32 is offset
# $s3 is base register
add $s1, $s2, $t0
```
#### Example 2
+ C Code
```c=
A[12] = h + A[8] ;
```
h 在 `$s2` , A的基礎位置在 `$s3`
+ Compiled MIPS Code
+ Index 8 需要 offset of 32
+ Index 12 需要 offset of 48
```mips=
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
```
#### Registers v.s. Memory 暫存器 v.s. 記憶體
+ Registers are faster to acess than memory
暫存器相較記憶體可以更快的存取
+ Operating on memory data requires loads and stores
在記憶體上做運算還需要加載跟保存的時間
+ More instructions to be executed
需要更多指令
+ Compilers must use registers for variable as much as possible
編譯器必須盡可能把暫存器用在變數上
+ Only spill to memory for less frequently used variables
真的沒有空間可以用才會將不常使用的變數放到記憶體中
+ Register optimization is important
暫存器最佳化很重要
#### Immediate Operands 直接操作數
+ Constant data specified in an instruction
指令中指定的常數數據
```mips=
addi $s3,$s3,4 # $s3 = $s3 + 4
```
+ No subtract immediate instruction 沒有減去的直接指令
+ 直接加一個負數(negative constant)
```mips=
addi $s2, $s1, -1
# $s2 = $s1 + (-1)
```
+ 【Design Principle 3】Make the common case faster 讓常見情況處裡的更快
+ Small constants are common
小的常數的運算很常見
+ Immediate operand avoids a load instruction
直接操作可以免除常數加載到暫存器的指令
### 2.4 Signed and Unsigned Numbers
#### Constant Zero 常數零
+ MIPS暫存器 0 (`$zero`) 為常數 0
+ Cannot be overwritten
不能被覆寫
+ Useful for common operations
對常做的操作有用的
+ e.g. move between registers
在暫存器之間搬移資料
```MIPS=
add $t2,$s1,$zero #t2=s1+0
```
#### Unsigned Binary Integers 無符號二進位整數
+ Given an n-bit number
給一個 n-bit 的數字
$x=x_{n-1}2^{n-1}+x_{n-2}2^{n-2}+\cdots+x_12^1+x_02^0$
+ 範圍:$0$ to $+2^n-1$
+ Example
+ $\begin{aligned}
&\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 1011_2
\\&= 0 + \cdots + 1×2^3 + 0×2^2 +1×2^1 +1×2^0
\\&= 0 + \cdots + 8 + 0 + 2 + 1 = 11_{10}
\end{aligned}$
+ Using 32 bits
$0$ to $+4,294,967,295(2^{32}-1)$
#### 2s-Complement Signed Integers 二補數有符號整數
+ 給一個 n-bit 的數字(最高位不一樣)
$x=-x_{n-1}2^{n-1}+x_{n-2}2^{n-2}+\cdots+x_12^1+x_02^0$
+ 範圍:$-2^{n-1}$~$+2^{n-1}-1$
+ Example
+ $\begin{aligned}
&\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1100_2
\\&= –1×2^{31} + 1×2^{30} + … + 1×2^2 +0×2^1 +0×2^0
\\&= –2,147,483,648 + 2,147,483,644 = –4_{10}
\end{aligned}$
+ 使用 32 bits
$–2,147,483,648$ ~ $+2,147,483,647$
+ Bit 31 為符號 bit
+ 1 代表負
+ 0 代表正
+ $-(-2^{n-1})$ 不能表示
+ 非負數字有相同的無符號及二補數表示
+ 一些特殊的數字
+ $0:0000\ 0000 \cdots 0000$
+ $–1:1111\ 1111 \cdots 1111$
+ 最負:$1000\ 0000\ \cdots\ 0000$
+ 最正:$0111\ 1111\ \cdots\ 1111$
#### Signed Negation 整數正負號變換
+ Complement and add 1
+ complement: $0\rightarrow 1,1\rightarrow 0$
$\begin{aligned}
x+\bar x&=1111\cdots 111_2=-1\\
\implies\bar x+1&=-x
\end{aligned}$
+ Example: negate+2
+ $+2=0000\ 0000\cdots 0010_2$
+ $\begin{aligned}-2&=1111\ 1111\cdots 1101_2+1\\&=1111\ 1111\cdots 1110_2\end{aligned}$
#### Sign Extension 有符號擴展
+ Representing a number using more bits
用更多 bits 表示一個數字
+ Preserve the numeric value 保留數字
+ In MIPS instruction set
在 MIPS 指令集中
+ `addi`: extend immediate value 擴展立即數值
+ `lb`,`lh`: extend loaded byte/halfword 擴展加載的
+ `beq`,`bne`: extend the displacement (之後才會介紹)
+ Replicate the sign bit to the left
左邊的 bit 用 sign bit 填滿
+ unsigned values: extend with 0s
+ Examples: 8-bit to 16-bit
+ +2: 0000 0010 → 0000 0000 **0**000 0010
+ -2: 1111 1110 → 1111 1111 **1**111 1110
### 2.5 Representing Instructions in the Computer
#### Represent Instructions
- Instructions are encoded in binary
- Called machine code
- MIPS instructions
- Encoded as 32-bit instruction words
- Small number of formats encoding operation code(opcode), register numbers, ...
- Regularity
- `$t0` - `$t7`: registers 8 - 15
- `$s0` - `$s7`: registers 16 - 23
- `$t8` - `$t9`: registers 24 - 25
#### MIPS R-format Instructions
![](https://i.imgur.com/vyaNxtw.png)
- Instruction fields
- `op`: operation code (opcode)
- `rs`: first source register number
- `rt`: second source register number
- `rd`: destination register number
- `shamt`: shift amount (000000 for now)
- `funct`: function code (extends opcode)
#### Hexadecimal
- Base 16
- Compact representation of bit strings
- 4 bits per hex digit
#### MIPS I-format Instructions
![](https://i.imgur.com/GFgtiMO.png)
- Immediate arithmetic and load/store instructions
- `rt`: destination or source register number
- Constant: $-2^{15}$ to $+2^{15}-1$
- Address: offset added to base address in `rs`
- 【Design Principle 4】Good design demands good compromises
好的設計有好的妥協
- Different formats complicate decoding, but allow 32-bit instructions uniformly
雖然解碼變複雜,但維持所有指令的長度都是32-bit
- Keep formats as similar as possible
保持部分欄位是相似的
(`op`、`rs`、`rt` 的欄位在 R,I-format 一樣)
#### Store Program Computers
![](https://i.imgur.com/Ip7ip4p.png)
- Instructions are represented in binary just like data
- Instructions and data are stored in memory
- Programs can operate on programs
- e.g. compilers, linkers, ...
- Binary compatibility allows compiled programs to work on different computers
- standardized ISAs
### 2.6 Logical Operations
#### Logical Operations
- Instructions for bitwise manipulation
- Useful for extracting and inserting groups of bits in a word
| Operation | C | Java | MIPS |
| ----------- | --- | ---- |:--------- |
| Shift left | << | << | `sll` |
| Shift right | >> | >>> | `srl` |
| Bitwise AND | & | & | `and`, `andi` |
| Bitwise OR | \| | \| | `or`, `ori` |
| Bitwise NOT | ~ | ~ | `nor` |
#### Shift Operations
![](https://i.imgur.com/GeP2uRa.png)
- shamt: how many positions to ahift
- Shift left logical: `sll`
- Shift left and fill with 0 bits
- `sll` by i bits multiplies by $2^i$
- ~Shift right logical: `srl`
- Shift right and fill with 0 bits
- `srl` by i bits divides by $2^i$ *(unsigned only)*
#### AND Operations
- Useful to mask(遮罩) bits in a word
- Select some bits, clear others to 0
- Example:
`and $t0, $t1, $t2`
![](https://i.imgur.com/N6JSQUO.png)
#### OR Operations
- Useful to include bits in a word
- Set some bits to 1, leave others unchanged
設定特定 bit 為 1
- Example:
`or $t0, $t1, $t2`
![](https://i.imgur.com/aO5NSHw.png)
#### NOT Operations
- Useful to invert bits in a word
- Change 0 to 1, and 1 to 0
- MIPS has NOR 3-operand instruction
- Example: a NOR b == NOT (a or b)
`nor $t0, $t1. $zero`
![](https://i.imgur.com/mUoONot.png)
#### Conditional Operation
- Branch to a labeled instruction if a condition is true
- Otherwise, continue sequentially
- `beq rs, rt, L1`
- if (rs == rt) branch to instruction labeled L1
- `bne rs,rt, L1`
- if (rs != rt) branch to instruvciton labeled L1
- `j L1`
- unconditional kump to instruction labeled L1
### 2.7 Instructions for Making Decisions
#### Compiling If Statements
![](https://i.imgur.com/60ZM74d.png)
- C code
```c=
if(i==j) f = g + h;
else f = g - h;
// f, g, ... in $s0, $s1, ...
```
- Compiled MIPS code
```MIPS=
bne $s3, $s4, Else
add $s0, $s1, $s2
j Exit
Else: `sub $s0, $s1, $s2`
Exit: ...
```
<!-- 0309 end -->
#### Compiling Loop Statements
- C Code
```c=
while( save[i] == k )
i += 1;
// i in $s3, k in $s5, address of save in $s6
```
- Compiled MIPS code
```MIPS=
Loop: sll $t1, $s3, 2
#save $s1*4 in $s3
#
add $t1, $t1, $s6
#(add $s3 at save[0] to get save[i])
lw $t0, 0($t1)
bne $t0, $s5, Exit
addi $s3, $s3, 1
j Loop
Exit: ...
```
#### Basic Blocks
- A basic block is a sequence of instructions with ...
- No embedded branches( except at end)
- No branch targets ( expect at beginning)
- A compiler identifies basic blocks for optimization.
- An advanced processor can accelerate execution of basic blocks.
#### More Conditional Operations
- Set result to 1 if a condition is true
- Otherwise, set to 0
- `slt rd, rs, rt`
- `if( rs < rt ) rd = 1; else rd = 0;`
- `slti rt, rs, constant`
- `if( rs < constant ) rt = 1; else rt = 0`
- Use in combination with `beq` `bne`
```MIPS=
slt $t0, $s1, $s2 # if ($s1 < $s2)
bne $t0, $zero, L # branch to L
```
#### Branch Instruction Design
- Why not `blt`, `bqe`?
為何在 MIPS 中沒有設計這兩個指令
- Common case
- 效率更高
- Hardware for <, > ,... is slower than =, $\neq$
- Combine with branch involves more work per instruction, requiring a slower clock.
- All instructions are penalized
因為指令處理的方式,如果有某些指令特別長,那其他沒那麼久的也會跟著一起變慢
#### Signed v.s. Unsigned
- Signed comparison: `slt`, `slti`
- Unsigned comparison: `sltu`, `sltui`
- Example
```MIPS=
# $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
# $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
slt $t0, $s0, $s1 #signed
#-1 < +1, so $t0 = 1
sltu $t0, $s0, $s1 #unsigned
+4294967296 > +1, so $t0 = 0
```
### 2.8 Supporting Procedures in Computer Hardware
#### Procedure Calling
- caller 跟 callee 執行權的轉移
- caller 就是原本在的地方
- callee 就是呼叫 function 的東西
- 因為他們共用 register,所以需要做一些處理避免資料遺失
- 透過 register 溝通
- Steps required
1. Place parameters in registers
- caller 設定要呼叫的prcedure的參數,執行權從 caller 到 callee
2. Transfer control to procedure
3. Acquire storage for procedure
- callee 開始執行程式
- 原本 caller 用的 register 的資料存起來
- 因為 callee 才知道需要多少空間
4. Perform procedure’s operations
5. Place result in register for caller
6. Return to place of call
- 交執行權還給caller
- 在 Step 3 的時候如果有保留的數值,需要在這步之前將數值還原回原本的樣子
#### Register Usage
| | type | register | |
| --------- | ------------------------------ | -------- | -------------------------------------- |
| $v0, $v1 | result values | 2, 3 | procedure calling 中 output |
| $a0 - $a3 | arguments | 4 - 7 | procedure calling 中input |
| $t0 - $t7 | temporary values | 8 - 15 | can be overwritten by callee |
| $s0 - $s7 | saved variables | 16 - 23 | must be saved/ restored by callee |
| $t8 - $t9 | temporary values | 24, 25 | can be overwritten by callee |
| $gp | global pointer for ststic data | 28 | |
| $sp | stack pointer | 29 | procedure calling 中可以隨意使用的 |
| $fp | frame pointer | 30 | |
| $ra | return address | 31 | procedure calling 中儲存 caller 的位置 |
#### Procedure Call Instructions
- Procedure call: jump and link
`jal ProcedureLabel`
- Address of the following instruction put in $ra
- Jump to target address
- Procedure return: jump register
`jr $ra`
- Copy $ra to program counter
- Can also be used for computed jumps
- E.g., for case/switch statements
#### Local Data on the Stack
- Local data is allocated by callee.
- E.g., C automatic variables
- Procedure frame (activation record)
- Used by some compilers to manage stack storage
![](https://i.imgur.com/SRgjFVa.png)
- stack 從最高開始往下放
- `$sp` 指向 stack 最底
#### Leaf Procedure Example
- C code
```c=
// g, h, i, j in $a0, $a1, $a2, $a3
int leaf_example (int g, h, i, j)
{
int f;
// f in $s0
// hence, need to save $s0 on stack
f = (g + h) - (i + j);
// Result in $v0
return f;
}
```
- MIPS code
```MIPS=
###
我也不知道為什麼這邊打一個$ 底下就都可以註解了
###
leaf_example:
# 將 $sp 移出 4 個空間
addi $sp, $sp, -4
# save $s0 on stack
sw $s0, 0($sp)
# Procedure body
# $a0(g) + $a1(h) 暫存在 $t0
add $t0, $a0, $a1
# $a2(i) + $a3(i) 暫存在 $t1
add $t1, $a2, $a3
# $s0(f) = $t0(g+h) - $t1(i+j)
sub $s0, $t0, $t1
# Result
add $v0, $s0, $zero
# Restore $s0
lw $s0, 0($sp)
addi $sp, $sp, 4
# Return
jr $ra
```
#### Non-Leaf Procedures
- Procedures that call other procedures
- For nested(嵌套) call, caller needs to save on the stack …
- Its return address(它的 caller)
- Any arguments and temporaries needed after the call
- Restore from the stack after the call
##### Example
- C Code
```c=
// n in $a0
int fact( int n )
{
if( n < 1 )
return 1;
else
return n * fact( n - 1 );
// result in $v0
}
```
- MIPS
```MIPS=
###
我也不知道為什麼這邊打一個$ 底下就都可以註解了
###
fact:
#把 caller 的東西都保存下來
# adjust stack for 2 items(8 bits)
addi $sp, $sp, -8
# save return address, $ra
sw $ra, 4($sp)
# save 1 argument, $a0
sw $a0, 0($sp)
# test for n < 1
slti $t0, $a0, 1
beq $t0, $zero, L1
# if so, result is 1
addi $v0, $zero, 1
# pop 2 items from stack
addi $sp, $sp, 8
# and return
jr $ra
L1:
# else decrement n
addi $a0, $a0, -1
# recursive call
jal fact
# 把 caller 的東西都放回去
# restore original n
lw $a0, 0($sp)
# restore return address
lw $ra, 4($sp)
# pop 2 items from stack
addi $sp, $sp, 8
# multiply to get result
mul $v0, $a0, $v0
# and return
jr $ra
```
#### Memory Layout
![](https://i.imgur.com/gzZ8W18.png)
- Text: program code
- Static data: global variables
- E.g., static variables in C, constant arrays and strings
- `$gp` is initialized to address allowing ±offsets into this segment.
- Dynamic data: heap
- E.g., malloc in C, new in Java
- Stack: automatic storage
#### Character Data
- Byte-encoded character set
- ASCII: 128 characters = 95 graphic + 33 control
- Latin-1: 256 characters = ASCII + 96 more graphic characters
- Unicode: 32-bit character set
- Used in Java, C++ wide characters, …
- Most of world’s alphabets, plus symbols
- UTF-8, UTF-16: variable-length encodings
#### Byte/Halfword Operations
- Could use bitwise operations
- MIPS byte/halfword load/store
- String processing is a common case.
- Sign extend to 32 bits in `rt`
```MIPS=
# sign extend a byte
lb rt, offset(rs)
# sign extend a halfword
lh rt, offset(rs)
```
- Zero extend to 32 bits in `rt`
```MIPS=
# unsign(zero) extend a byte
lbu rt, offset(rs)
# unsign(zero) extend a halfword
lhu rt, offset(rs)
```
- Store just rightmost byte/halfword
```MIPS=
sb rt, offset(rs)
sh rt, offset(rs)
```
#### String Copy Example 字串拷貝
- C Code
- Null-terminated string
最後一個index為NULL的字串
```c=
// x in $a0, y in $a1
void strcpy(char x[], char y[])
{
// i in $s0
int i;
i = 0;
while( (x[i] = y[i]) != '\0')
{
i += 1;
}
}
```
- MIPS
```MIPS=
# non-leaf 只需要保留 s 系列
strcpy:
addi $sp, $sp, -4 # adjust stack for 1 item
sw $S0, 0($sp) # save $s0
add $s0, $zero, $zero # i - 0
L1:
add $t1, $s0, $a1 # add of y[i] in $t1
lbu $t2, 0($t1) # $t2 = y[i]
add $t3, $s0, $a0 #addr of x[i] in $t3
sb $t2, 0($t3) # x[i] = y[i]
beq $t2, $zero, L2 # exit loop if y[i] == 0
addi $s0, $s0, 1 # i = i +1
j L1 # next iteration of loop
L2:
lw $s0, 0($sp) # restore saved $s0
addi $sp, $sp, 4 # pop 1 item from stack
jr $ra # and return
```
### Communicating with People
### 2.9 MIPS Addressing for 32-Bit Immediates and Addresses
#### 32bit Constant
- Most constant are small 多數的常數都很小
- 16-bit immediate us sufficient 通常都只有16個bits
- For an occasional 32-bit constant
`lui rt, constant`
- Copy 16-bit constant to left 16 bits or `rt`
- Clear right 16 bits of `rt` to 0
#### Branch Addressing
- Branch instructions specify...
- Opcode, two registers, target address
- Most branch targets are near branch.
- Forward or backward
![](https://i.imgur.com/P4bsYy3.png)
- PC-relative addressing (PC (program counter)相對定址)
- Target address = PC + offset * 4 (MIPS一個指令4個 byte)
- PC is already incremented by 4 by this time
#### Jump Addressing
- Jump (`j` and `jal`) targets could be anywere in text segment
- Encode full address in an instruction
![](https://i.imgur.com/kzHV2lQ.png)
- 在 MIPS 中 `op` 都是 6 bits
- (Pseudo) Direct jump addressing
- Target address = PC31...28:(address*4)
4個跟PC借 接下來26個bits 剩下直接x4
##### Target Addressing Example
- Loop code from the earlier example
- Assume Loop at location 80000
![](https://i.imgur.com/vbdUEfQ.png)
#### Branch Far Away
- If a branch target is too far to encode with 16-bit offset assembler rewrites the code
- 當要跳過去的目標位置太遠無法用 16 bit 表達,可以改寫指令使存放 address 的 bit 更大
- e.g. `j` and `jal` 有 26 bits
- Example
```MIPS=
beq $s0, $s1, L1 # 16 bits
bne $s0, $s1, L2 # 16 bits
j L1 # 26 bits
L2: ...po
```
#### Addressing Mode Summary
![](https://i.imgur.com/CQfY3Bf.png)
### 2.10 Parallelism and Instructions: Synchronization
#### Synchronization 同步
- Two processors share an area of memory
- P1 writes, then P2 reads
- Data race if P1 and P2 don't synchronize(race 資料不同)
- Result depends of order of accesses
- Hardware support is required.
- Atomic read/write memory operation
- No other access to a location is allowed between read and write.
- 一段 atomic operation 間不能被其他指令中斷
- Could be a single instruction
- E.g., atomic swap of register ↔ memory
提供資料對換
- Or an atomic pair of instructions
#### Synchronization in MIPS
- Load linked: `ll rt, offset(rs)`
- Store conditional: `sc rt, offset(rs)`
- Succeed if the location is not changed since the `ll`
- Returns 1 in `rt`
- Fail if the location is changed
- Returns 0 in `rt`
- 檢查 `rt` 的值在 `ll` 和 `sc` 間是否被更改過
- Example: atomic swap `$s4` with `M[$s1]` (to test/set lock variable)
```MIPS=
try:add $t0,$zero,$s4 #copy exchange value
ll $t1,0($s1) #load linked
sc $t0,0($s1) #store conditional
beq $t0,$zero,try #branch store fails
add $s4,$zero,$t1 #put load value in $s4
```
### 2.11 Translation and Starting a Program
#### Translation and Startup
![](https://i.imgur.com/RKJvKGj.png)
#### Assembler Pseudoinstructions
- Most assembler instructions represent machine instructions one-to-one.
- Pseudoinstructions: figments of the assembler’s imagination
```MIPS=
move $t0, $t1
# add $t0, $zero, $t1
```
```MIPS=
blt $t0, $t1, L
# slt $at, $t0, $t1
# if( $t0 < $t1 ) $at = 1
# bne $at, $zero, L
```
- `$at` (register 1): assembler temporary
#### Produce Object Module
- An assembler (or compiler) translates a program into machine instructions.
- Provide information for building a complete program
- Header: describe contents of object module
描述檔案中有哪些內容
- Text segment: translated instructions
機器語言碼
- Static data segment: data allocated for the life of the program
整個程式都需要的資料(全域變數
- Relocation information: for contents that depend on absolute location of loaded program
要根據絕對記憶體位置運行的數據的位置
- Symbol table: undefined labels, e.g., global definitions and external references
未定義的標籤(不包含在該檔案中的東西)
- Debug information: for associating with source code
紀錄給方便除錯的東西
#### Link Object Modules
- Produce an executable image
產生一個可執行的映象檔
- Merge segments
把不同的區塊結合在一起
- Resolve labels (determine their address)
把在 library 中的 label 找出來
- Patch location-dependent and external references
- Could leave location dependencies for fixing by a relocating loader
- But with virtual memory, no need to do this
- Program can be loaded into absolute location in virtual memory space.
#### Load a Program
- Load from an image file on a disk into memory
- Read the header to determine segment sizes
- Create a virtual address space
- Copy the text and initialized data into memory
- Or set page table entries so they can be faulted in
- Set up arguments on stack
- Initialize registers (including $sp, $fp, $gp)
- Jump to startup routine
- Copy arguments to $a0, … and call the main routine
- When the main routine returns, do exit() system call
#### Dynamic Linking
- Only link/load a library procedure when it is called
- Require the procedure code to be relocatable
- Avoid image bloat caused by static linking of all (transitively) referenced libraries
- Automatically pick up new library versions
#### Lazy Linkage
![](https://i.imgur.com/kKauns8.png)
#### Start Java Applications
![](https://i.imgur.com/OSMlYaM.png)
- 跨平台
- 編成 JVM
<!-- TBC -->
![h3標題](https://i.imgur.com/wkYdhaY.png)
### 2.12 A C sort Example to Put It All Together
#### C Swap Example
- Illustrate use of assembly instructions for a C bubble sort function
- Swap procedure
```c=
void swap(int v[], int k)
{
int temp;
temp = v[k];
v[k] = v[k+1]
v[k+1] = temp;
}
```
- v in $a0, k in $a1, temp in $t0
#### Procedure Swap
```MIPS=
swap: sll $t1,$a1,2 #$t1=k*4
add $t1,$a0,$t1 #$t1=v+(k*4)
lw $t0, 0($t1) # (address of v[k])
lw $t2, 4($t1) #$t0(temp)=v[k]
sw $t2, 0($t1) #v[k] = $t2(v[k+1])
sw $t0, 4($t1) #v[k+1 = $t0] (temp)
jr $ra #return to calling routine
```
#### C Procedure Sort
- Non-leaf(calls swap)
```c=
// v in $a0, n in $a1
void sort( int v[], int n)
{
// i in $s0, j in $s1
int i, j;
for(i = 0; i < n; i += 1)
{
for( j = i - 1; j <= 0 && v[j] > v[j+1]; j -= 1)
{
// swap 為自定義函式
// 將 v[j] 跟 v[j+1] 交換
swap(v,j);
}
}
}
```
#### Procedure Body
```MIPS=
move $s2, $a0 # save $a0 into $s2
move $s3, $a1 # save $a1 into $s3
move $s0, $zero # i = 0
for1tst: slt $t0, $s0, $s3 # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
beq $t0, $zero, exit1 # go to exit1 if $s0 ≥ $s3 (i ≥ n)
addi $s1, $s0, –1 # j = i – 1
for2tst: slti $t0, $s1, 0 # $t0 = 1 if $s1 < 0 (j < 0)
bne $t0, $zero, exit2 # go to exit2 if $s1 < 0 (j < 0)
sll $t1, $s1, 2 # $t1 = j * 4
add $t2, $s2, $t1 # $t2 = v + (j * 4)
lw $t3, 0($t2) # $t3 = v[j]
lw $t4, 4($t2) # $t4 = v[j + 1]
slt $t0, $t4, $t3 # $t0 = 0 if $t4 ≥ $t3
beq $t0, $zero, exit2 # go to exit2 if $t4 ≥ $t3
move $a0, $s2 # 1st param of swap is v (old $a0)
move $a1, $s1 # 2nd param of swap is j
jal swap # call swap procedure
addi $s1, $s1, –1 # j –= 1
j for2tst # jump to test of inner loop
exit2: addi $s0, $s0, 1 # i += 1
j for1tst # jump to test of outer loop
exit1:
```
#### Full Procedure
```mips=
sort:
addi $sp, $sp, -20 #make room on stack for 5 registers
sw $ra, 16($sp) #save $ra on stack
sw $s3, 12($sp) #save $s3 on stack
sw $s2, 8($sp) #save $s2 on stack
sw $s1, 4($sp) #save $s1 on stack
sw $s0, 0($sp) #save $s3 on stack
... #procedure body in the previous
exit1:
lw $s0, 0($sp) #restore $s0 from stack
lw $s1, 4($sp) #restore $s1 from stack
lw $s2, 8($sp) #restore $s2 from stack
lw $s3, 12($sp) #restore $s3 from stack
lw $s0, 0($sp) #restore $s0 from stack
addi $sp, $sp, 20 #restore stack pointer
jr $ra #return to calling routine
```
#### Lesson Learnt
- Instruction count and CPI are not good performance indicators in isolation.
- 單看指令數跟 CPI 無法評斷效能好壞
- Compiler optimizations are sensitive to algorithms.
- Java/JIT compiled code is significantly faster than JVM interpreted.
- Comparable to optimized C in some cases
- Nothing can fix a dumb algorithm!
- 沒有東西可以拯救智障的演算法
### Arrays versus Pointers
- Array indexing involves …
- Multiply index by element size
- Add to array base address
- Pointers correspond directly to memory addresses
- Can avoid indexing complexity
### Real Stuff: ARM Instructions, x86 Instructions, ARM v8(64-bit) Instructions
### Fallacies and Pitfalls
<!-- This is the end of the NOTE -->