--- title: Archi 2 --- # Computer Architectures NTNU 計算機結構 ##### [Back to Note Overview](https://reurl.cc/XXeYaE) ##### [Back to Computer Architectures](https://hackmd.io/@NTNUCSIE112/Archi109-2) {%hackmd @sophie8909/pink_theme %} ###### tags: `NTNU` `CSIE` `必修` `Computer Architectures` `109-2` <!-- tag順序 [學校] [系 必選] or [學程 學程名(不含學程的 e.g. 大師創業)] [課程] [開課學期]--> ## Ch.02 Instructions: Language of the Computer ### 2.1 Introduction #### Instruction Set 指令集 + A repertoire of instruction of a computer 計算機的指令集 + 不同的電腦會有不同的指令集 + 但有許多方面會相同 + 早期的電腦都有非常簡單的指令集 + 簡化設計 + 許多現代電腦也有簡單的指令集 #### MIPS Instruction Set MIPS 指令集 - Large share of embedded core market - Applications in consumer electronics, network/storage equipment, cameras, printers... ### 2.2 Operations of Computer Hardware #### Arithmetic Operations 算術運算 + Add and subtract, three operands 加減法、三個操作數 + Two sources and one destination 兩個來源一個目標 ```mips= add a,b,c #a=b+c sub a,b,c #a=b-c ``` + 所有算術運算都是這個格式 + 【Design Principle 1】Simplicity favors regularity 簡潔有助於規律性 + 規律性有助於讓實作更簡單 + Simplicity enables higher performance at lower cost + 簡化可以更容易最佳化 ##### Example + C Code ```c= a = ( b + c ) - ( d + e ) ``` + MIPS Code ```MIPS= add t0, b, c #temp t0 = b+c add t1, d, e #temp t1 = d+e sub a,t0,t1 # a = t0-t1 ``` <!-- week02 end --> ### 2.3 Operands of Computer Hardware #### Register Operands 暫存器的運算子 - used by **Arithmetic instructions** - MIPS has 32 * 32-bit register file - used for frequently accessed data - Numbered 0 to 31 - word 字節: 32-bit data - Assembler names 組譯器中的名字 - temporary values 暫時的數值: `$t0`,`$t1`,$\dots$,`$t9` - saved variables 儲存的變數: `$s0`,`$s1`,$\dots$,`$s7` - 【Design Principle 2】Smaller is faster 越小越快 - c.f. main memory 主記憶體: million of locations 百萬個位置 ##### Example - C Code ```c= a=(b+c)-(d+e) ``` a,b,c,d,e 分別存在 `$s0`,`$s1`,`$s2`,`$s3`,`$s4` - Compiled MIPS code ```MIPS= add $t0,$s1,$s2 add $t1,$s3,$s4 sub $s0,$t0,$t1 ``` ### Memory Operands 記憶體的運算 + Main memory is used for composite data 主記憶體是用於綜合資料 + Array 陣列、 structure 結構、 dynamic data 動態資料... + 用於算術運算(arithmetic operations) + Load values from memory into register 將數值從記憶體載入到快取 + Store result from register to memory 將結果從快取存入到記憶體 + Memory is byte addressed 記憶體用byte定址 + Each address identifies an 8-bit/byte 每個位置表示一個 8-bit/byte + Words are aligned in memory 字節在記憶體中為對齊的 + Address nust be a multiple of 4 地址必為四的倍數 + MIPS is Big Endian + Big-Endian 是指資料放進記憶體中的時候,最高位的位元組會放在最低的記憶體位址上 + Little-Endian 則相反 #### Example 1 - C code ```c= g=h+A[8]; ``` - g in `$s1`, h in `$s2`, base adddddress of A in `$s3` - Compiled MIPS code - Index 8 requires offset of 32 - 4bytes per word ```MIPS= lw $t0, 32($s3) # load word # 32 is offset # $s3 is base register add $s1, $s2, $t0 ``` #### Example 2 + C Code ```c= A[12] = h + A[8] ; ``` h 在 `$s2` , A的基礎位置在 `$s3` + Compiled MIPS Code + Index 8 需要 offset of 32 + Index 12 需要 offset of 48 ```mips= lw $t0, 32($s3) # load word add $t0, $s2, $t0 sw $t0, 48($s3) # store word ``` #### Registers v.s. Memory 暫存器 v.s. 記憶體 + Registers are faster to acess than memory 暫存器相較記憶體可以更快的存取 + Operating on memory data requires loads and stores 在記憶體上做運算還需要加載跟保存的時間 + More instructions to be executed 需要更多指令 + Compilers must use registers for variable as much as possible 編譯器必須盡可能把暫存器用在變數上 + Only spill to memory for less frequently used variables 真的沒有空間可以用才會將不常使用的變數放到記憶體中 + Register optimization is important 暫存器最佳化很重要 #### Immediate Operands 直接操作數 + Constant data specified in an instruction 指令中指定的常數數據 ```mips= addi $s3,$s3,4 # $s3 = $s3 + 4 ``` + No subtract immediate instruction 沒有減去的直接指令 + 直接加一個負數(negative constant) ```mips= addi $s2, $s1, -1 # $s2 = $s1 + (-1) ``` + 【Design Principle 3】Make the common case faster 讓常見情況處裡的更快 + Small constants are common 小的常數的運算很常見 + Immediate operand avoids a load instruction 直接操作可以免除常數加載到暫存器的指令 ### 2.4 Signed and Unsigned Numbers #### Constant Zero 常數零 + MIPS暫存器 0 (`$zero`) 為常數 0 + Cannot be overwritten 不能被覆寫 + Useful for common operations 對常做的操作有用的 + e.g. move between registers 在暫存器之間搬移資料 ```MIPS= add $t2,$s1,$zero #t2=s1+0 ``` #### Unsigned Binary Integers 無符號二進位整數 + Given an n-bit number 給一個 n-bit 的數字 $x=x_{n-1}2^{n-1}+x_{n-2}2^{n-2}+\cdots+x_12^1+x_02^0$ + 範圍:$0$ to $+2^n-1$ + Example + $\begin{aligned} &\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 1011_2 \\&= 0 + \cdots + 1×2^3 + 0×2^2 +1×2^1 +1×2^0 \\&= 0 + \cdots + 8 + 0 + 2 + 1 = 11_{10} \end{aligned}$ + Using 32 bits $0$ to $+4,294,967,295(2^{32}-1)$ #### 2s-Complement Signed Integers 二補數有符號整數 + 給一個 n-bit 的數字(最高位不一樣) $x=-x_{n-1}2^{n-1}+x_{n-2}2^{n-2}+\cdots+x_12^1+x_02^0$ + 範圍:$-2^{n-1}$~$+2^{n-1}-1$ + Example + $\begin{aligned} &\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1100_2 \\&= –1×2^{31} + 1×2^{30} + … + 1×2^2 +0×2^1 +0×2^0 \\&= –2,147,483,648 + 2,147,483,644 = –4_{10} \end{aligned}$ + 使用 32 bits $–2,147,483,648$ ~ $+2,147,483,647$ + Bit 31 為符號 bit + 1 代表負 + 0 代表正 + $-(-2^{n-1})$ 不能表示 + 非負數字有相同的無符號及二補數表示 + 一些特殊的數字 + $0:0000\ 0000 \cdots 0000$ + $–1:1111\ 1111 \cdots 1111$ + 最負:$1000\ 0000\ \cdots\ 0000$ + 最正:$0111\ 1111\ \cdots\ 1111$ #### Signed Negation 整數正負號變換 + Complement and add 1 + complement: $0\rightarrow 1,1\rightarrow 0$ $\begin{aligned} x+\bar x&=1111\cdots 111_2=-1\\ \implies\bar x+1&=-x \end{aligned}$ + Example: negate+2 + $+2=0000\ 0000\cdots 0010_2$ + $\begin{aligned}-2&=1111\ 1111\cdots 1101_2+1\\&=1111\ 1111\cdots 1110_2\end{aligned}$ #### Sign Extension 有符號擴展 + Representing a number using more bits 用更多 bits 表示一個數字 + Preserve the numeric value 保留數字 + In MIPS instruction set 在 MIPS 指令集中 + `addi`: extend immediate value 擴展立即數值 + `lb`,`lh`: extend loaded byte/halfword 擴展加載的 + `beq`,`bne`: extend the displacement (之後才會介紹) + Replicate the sign bit to the left 左邊的 bit 用 sign bit 填滿 + unsigned values: extend with 0s + Examples: 8-bit to 16-bit + +2: 0000 0010 → 0000 0000 **0**000 0010 + -2: 1111 1110 → 1111 1111 **1**111 1110 ### 2.5 Representing Instructions in the Computer #### Represent Instructions - Instructions are encoded in binary - Called machine code - MIPS instructions - Encoded as 32-bit instruction words - Small number of formats encoding operation code(opcode), register numbers, ... - Regularity - `$t0` - `$t7`: registers 8 - 15 - `$s0` - `$s7`: registers 16 - 23 - `$t8` - `$t9`: registers 24 - 25 #### MIPS R-format Instructions ![](https://i.imgur.com/vyaNxtw.png) - Instruction fields - `op`: operation code (opcode) - `rs`: first source register number - `rt`: second source register number - `rd`: destination register number - `shamt`: shift amount (000000 for now) - `funct`: function code (extends opcode) #### Hexadecimal - Base 16 - Compact representation of bit strings - 4 bits per hex digit #### MIPS I-format Instructions ![](https://i.imgur.com/GFgtiMO.png) - Immediate arithmetic and load/store instructions - `rt`: destination or source register number - Constant: $-2^{15}$ to $+2^{15}-1$ - Address: offset added to base address in `rs` - 【Design Principle 4】Good design demands good compromises 好的設計有好的妥協 - Different formats complicate decoding, but allow 32-bit instructions uniformly 雖然解碼變複雜,但維持所有指令的長度都是32-bit - Keep formats as similar as possible 保持部分欄位是相似的 (`op`、`rs`、`rt` 的欄位在 R,I-format 一樣) #### Store Program Computers ![](https://i.imgur.com/Ip7ip4p.png) - Instructions are represented in binary just like data - Instructions and data are stored in memory - Programs can operate on programs - e.g. compilers, linkers, ... - Binary compatibility allows compiled programs to work on different computers - standardized ISAs ### 2.6 Logical Operations #### Logical Operations - Instructions for bitwise manipulation - Useful for extracting and inserting groups of bits in a word | Operation | C | Java | MIPS | | ----------- | --- | ---- |:--------- | | Shift left | << | << | `sll` | | Shift right | >> | >>> | `srl` | | Bitwise AND | & | & | `and`, `andi` | | Bitwise OR | \| | \| | `or`, `ori` | | Bitwise NOT | ~ | ~ | `nor` | #### Shift Operations ![](https://i.imgur.com/GeP2uRa.png) - shamt: how many positions to ahift - Shift left logical: `sll` - Shift left and fill with 0 bits - `sll` by i bits multiplies by $2^i$ - ~Shift right logical: `srl` - Shift right and fill with 0 bits - `srl` by i bits divides by $2^i$ *(unsigned only)* #### AND Operations - Useful to mask(遮罩) bits in a word - Select some bits, clear others to 0 - Example: `and $t0, $t1, $t2` ![](https://i.imgur.com/N6JSQUO.png) #### OR Operations - Useful to include bits in a word - Set some bits to 1, leave others unchanged 設定特定 bit 為 1 - Example: `or $t0, $t1, $t2` ![](https://i.imgur.com/aO5NSHw.png) #### NOT Operations - Useful to invert bits in a word - Change 0 to 1, and 1 to 0 - MIPS has NOR 3-operand instruction - Example: a NOR b == NOT (a or b) `nor $t0, $t1. $zero` ![](https://i.imgur.com/mUoONot.png) #### Conditional Operation - Branch to a labeled instruction if a condition is true - Otherwise, continue sequentially - `beq rs, rt, L1` - if (rs == rt) branch to instruction labeled L1 - `bne rs,rt, L1` - if (rs != rt) branch to instruvciton labeled L1 - `j L1` - unconditional kump to instruction labeled L1 ### 2.7 Instructions for Making Decisions #### Compiling If Statements ![](https://i.imgur.com/60ZM74d.png) - C code ```c= if(i==j) f = g + h; else f = g - h; // f, g, ... in $s0, $s1, ... ``` - Compiled MIPS code ```MIPS= bne $s3, $s4, Else add $s0, $s1, $s2 j Exit Else: `sub $s0, $s1, $s2` Exit: ... ``` <!-- 0309 end --> #### Compiling Loop Statements - C Code ```c= while( save[i] == k ) i += 1; // i in $s3, k in $s5, address of save in $s6 ``` - Compiled MIPS code ```MIPS= Loop: sll $t1, $s3, 2 #save $s1*4 in $s3 # add $t1, $t1, $s6 #(add $s3 at save[0] to get save[i]) lw $t0, 0($t1) bne $t0, $s5, Exit addi $s3, $s3, 1 j Loop Exit: ... ``` #### Basic Blocks - A basic block is a sequence of instructions with ... - No embedded branches( except at end) - No branch targets ( expect at beginning) - A compiler identifies basic blocks for optimization. - An advanced processor can accelerate execution of basic blocks. #### More Conditional Operations - Set result to 1 if a condition is true - Otherwise, set to 0 - `slt rd, rs, rt` - `if( rs < rt ) rd = 1; else rd = 0;` - `slti rt, rs, constant` - `if( rs < constant ) rt = 1; else rt = 0` - Use in combination with `beq` `bne` ```MIPS= slt $t0, $s1, $s2 # if ($s1 < $s2) bne $t0, $zero, L # branch to L ``` #### Branch Instruction Design - Why not `blt`, `bqe`? 為何在 MIPS 中沒有設計這兩個指令 - Common case - 效率更高 - Hardware for <, > ,... is slower than =, $\neq$ - Combine with branch involves more work per instruction, requiring a slower clock. - All instructions are penalized 因為指令處理的方式,如果有某些指令特別長,那其他沒那麼久的也會跟著一起變慢 #### Signed v.s. Unsigned - Signed comparison: `slt`, `slti` - Unsigned comparison: `sltu`, `sltui` - Example ```MIPS= # $s0 = 1111 1111 1111 1111 1111 1111 1111 1111 # $s1 = 0000 0000 0000 0000 0000 0000 0000 0001 slt $t0, $s0, $s1 #signed #-1 < +1, so $t0 = 1 sltu $t0, $s0, $s1 #unsigned +4294967296 > +1, so $t0 = 0 ``` ### 2.8 Supporting Procedures in Computer Hardware #### Procedure Calling - caller 跟 callee 執行權的轉移 - caller 就是原本在的地方 - callee 就是呼叫 function 的東西 - 因為他們共用 register,所以需要做一些處理避免資料遺失 - 透過 register 溝通 - Steps required 1. Place parameters in registers - caller 設定要呼叫的prcedure的參數,執行權從 caller 到 callee 2. Transfer control to procedure 3. Acquire storage for procedure - callee 開始執行程式 - 原本 caller 用的 register 的資料存起來 - 因為 callee 才知道需要多少空間 4. Perform procedure’s operations 5. Place result in register for caller 6. Return to place of call - 交執行權還給caller - 在 Step 3 的時候如果有保留的數值,需要在這步之前將數值還原回原本的樣子 #### Register Usage | | type | register | | | --------- | ------------------------------ | -------- | -------------------------------------- | | $v0, $v1 | result values | 2, 3 | procedure calling 中 output | | $a0 - $a3 | arguments | 4 - 7 | procedure calling 中input | | $t0 - $t7 | temporary values | 8 - 15 | can be overwritten by callee | | $s0 - $s7 | saved variables | 16 - 23 | must be saved/ restored by callee | | $t8 - $t9 | temporary values | 24, 25 | can be overwritten by callee | | $gp | global pointer for ststic data | 28 | | | $sp | stack pointer | 29 | procedure calling 中可以隨意使用的 | | $fp | frame pointer | 30 | | | $ra | return address | 31 | procedure calling 中儲存 caller 的位置 | #### Procedure Call Instructions - Procedure call: jump and link `jal ProcedureLabel` - Address of the following instruction put in $ra - Jump to target address - Procedure return: jump register `jr $ra` - Copy $ra to program counter - Can also be used for computed jumps - E.g., for case/switch statements #### Local Data on the Stack - Local data is allocated by callee. - E.g., C automatic variables - Procedure frame (activation record) - Used by some compilers to manage stack storage ![](https://i.imgur.com/SRgjFVa.png) - stack 從最高開始往下放 - `$sp` 指向 stack 最底 #### Leaf Procedure Example - C code ```c= // g, h, i, j in $a0, $a1, $a2, $a3 int leaf_example (int g, h, i, j) { int f; // f in $s0 // hence, need to save $s0 on stack f = (g + h) - (i + j); // Result in $v0 return f; } ``` - MIPS code ```MIPS= ### 我也不知道為什麼這邊打一個$ 底下就都可以註解了 ### leaf_example: # 將 $sp 移出 4 個空間 addi $sp, $sp, -4 # save $s0 on stack sw $s0, 0($sp) # Procedure body # $a0(g) + $a1(h) 暫存在 $t0 add $t0, $a0, $a1 # $a2(i) + $a3(i) 暫存在 $t1 add $t1, $a2, $a3 # $s0(f) = $t0(g+h) - $t1(i+j) sub $s0, $t0, $t1 # Result add $v0, $s0, $zero # Restore $s0 lw $s0, 0($sp) addi $sp, $sp, 4 # Return jr $ra ``` #### Non-Leaf Procedures - Procedures that call other procedures - For nested(嵌套) call, caller needs to save on the stack … - Its return address(它的 caller) - Any arguments and temporaries needed after the call - Restore from the stack after the call ##### Example - C Code ```c= // n in $a0 int fact( int n ) { if( n < 1 ) return 1; else return n * fact( n - 1 ); // result in $v0 } ``` - MIPS ```MIPS= ### 我也不知道為什麼這邊打一個$ 底下就都可以註解了 ### fact: #把 caller 的東西都保存下來 # adjust stack for 2 items(8 bits) addi $sp, $sp, -8 # save return address, $ra sw $ra, 4($sp) # save 1 argument, $a0 sw $a0, 0($sp) # test for n < 1 slti $t0, $a0, 1 beq $t0, $zero, L1 # if so, result is 1 addi $v0, $zero, 1 # pop 2 items from stack addi $sp, $sp, 8 # and return jr $ra L1: # else decrement n addi $a0, $a0, -1 # recursive call jal fact # 把 caller 的東西都放回去 # restore original n lw $a0, 0($sp) # restore return address lw $ra, 4($sp) # pop 2 items from stack addi $sp, $sp, 8 # multiply to get result mul $v0, $a0, $v0 # and return jr $ra ``` #### Memory Layout ![](https://i.imgur.com/gzZ8W18.png) - Text: program code - Static data: global variables - E.g., static variables in C, constant arrays and strings - `$gp` is initialized to address allowing ±offsets into this segment. - Dynamic data: heap - E.g., malloc in C, new in Java - Stack: automatic storage #### Character Data - Byte-encoded character set - ASCII: 128 characters = 95 graphic + 33 control - Latin-1: 256 characters = ASCII + 96 more graphic characters - Unicode: 32-bit character set - Used in Java, C++ wide characters, … - Most of world’s alphabets, plus symbols - UTF-8, UTF-16: variable-length encodings #### Byte/Halfword Operations - Could use bitwise operations - MIPS byte/halfword load/store - String processing is a common case. - Sign extend to 32 bits in `rt` ```MIPS= # sign extend a byte lb rt, offset(rs) # sign extend a halfword lh rt, offset(rs) ``` - Zero extend to 32 bits in `rt` ```MIPS= # unsign(zero) extend a byte lbu rt, offset(rs) # unsign(zero) extend a halfword lhu rt, offset(rs) ``` - Store just rightmost byte/halfword ```MIPS= sb rt, offset(rs) sh rt, offset(rs) ``` #### String Copy Example 字串拷貝 - C Code - Null-terminated string 最後一個index為NULL的字串 ```c= // x in $a0, y in $a1 void strcpy(char x[], char y[]) { // i in $s0 int i; i = 0; while( (x[i] = y[i]) != '\0') { i += 1; } } ``` - MIPS ```MIPS= # non-leaf 只需要保留 s 系列 strcpy: addi $sp, $sp, -4 # adjust stack for 1 item sw $S0, 0($sp) # save $s0 add $s0, $zero, $zero # i - 0 L1: add $t1, $s0, $a1 # add of y[i] in $t1 lbu $t2, 0($t1) # $t2 = y[i] add $t3, $s0, $a0 #addr of x[i] in $t3 sb $t2, 0($t3) # x[i] = y[i] beq $t2, $zero, L2 # exit loop if y[i] == 0 addi $s0, $s0, 1 # i = i +1 j L1 # next iteration of loop L2: lw $s0, 0($sp) # restore saved $s0 addi $sp, $sp, 4 # pop 1 item from stack jr $ra # and return ``` ### Communicating with People ### 2.9 MIPS Addressing for 32-Bit Immediates and Addresses #### 32bit Constant - Most constant are small 多數的常數都很小 - 16-bit immediate us sufficient 通常都只有16個bits - For an occasional 32-bit constant `lui rt, constant` - Copy 16-bit constant to left 16 bits or `rt` - Clear right 16 bits of `rt` to 0 #### Branch Addressing - Branch instructions specify... - Opcode, two registers, target address - Most branch targets are near branch. - Forward or backward ![](https://i.imgur.com/P4bsYy3.png) - PC-relative addressing (PC (program counter)相對定址) - Target address = PC + offset * 4 (MIPS一個指令4個 byte) - PC is already incremented by 4 by this time #### Jump Addressing - Jump (`j` and `jal`) targets could be anywere in text segment - Encode full address in an instruction ![](https://i.imgur.com/kzHV2lQ.png) - 在 MIPS 中 `op` 都是 6 bits - (Pseudo) Direct jump addressing - Target address = PC31...28:(address*4) 4個跟PC借 接下來26個bits 剩下直接x4 ##### Target Addressing Example - Loop code from the earlier example - Assume Loop at location 80000 ![](https://i.imgur.com/vbdUEfQ.png) #### Branch Far Away - If a branch target is too far to encode with 16-bit offset assembler rewrites the code - 當要跳過去的目標位置太遠無法用 16 bit 表達,可以改寫指令使存放 address 的 bit 更大 - e.g. `j` and `jal` 有 26 bits - Example ```MIPS= beq $s0, $s1, L1 # 16 bits bne $s0, $s1, L2 # 16 bits j L1 # 26 bits L2: ...po ``` #### Addressing Mode Summary ![](https://i.imgur.com/CQfY3Bf.png) ### 2.10 Parallelism and Instructions: Synchronization #### Synchronization 同步 - Two processors share an area of memory - P1 writes, then P2 reads - Data race if P1 and P2 don't synchronize(race 資料不同) - Result depends of order of accesses - Hardware support is required. - Atomic read/write memory operation - No other access to a location is allowed between read and write. - 一段 atomic operation 間不能被其他指令中斷 - Could be a single instruction - E.g., atomic swap of register ↔ memory 提供資料對換 - Or an atomic pair of instructions #### Synchronization in MIPS - Load linked: `ll rt, offset(rs)` - Store conditional: `sc rt, offset(rs)` - Succeed if the location is not changed since the `ll` - Returns 1 in `rt` - Fail if the location is changed - Returns 0 in `rt` - 檢查 `rt` 的值在 `ll` 和 `sc` 間是否被更改過 - Example: atomic swap `$s4` with `M[$s1]` (to test/set lock variable) ```MIPS= try:add $t0,$zero,$s4 #copy exchange value ll $t1,0($s1) #load linked sc $t0,0($s1) #store conditional beq $t0,$zero,try #branch store fails add $s4,$zero,$t1 #put load value in $s4 ``` ### 2.11 Translation and Starting a Program #### Translation and Startup ![](https://i.imgur.com/RKJvKGj.png) #### Assembler Pseudoinstructions - Most assembler instructions represent machine instructions one-to-one. - Pseudoinstructions: figments of the assembler’s imagination ```MIPS= move $t0, $t1 # add $t0, $zero, $t1 ``` ```MIPS= blt $t0, $t1, L # slt $at, $t0, $t1 # if( $t0 < $t1 ) $at = 1 # bne $at, $zero, L ``` - `$at` (register 1): assembler temporary #### Produce Object Module - An assembler (or compiler) translates a program into machine instructions. - Provide information for building a complete program - Header: describe contents of object module 描述檔案中有哪些內容 - Text segment: translated instructions 機器語言碼 - Static data segment: data allocated for the life of the program 整個程式都需要的資料(全域變數 - Relocation information: for contents that depend on absolute location of loaded program 要根據絕對記憶體位置運行的數據的位置 - Symbol table: undefined labels, e.g., global definitions and external references 未定義的標籤(不包含在該檔案中的東西) - Debug information: for associating with source code 紀錄給方便除錯的東西 #### Link Object Modules - Produce an executable image 產生一個可執行的映象檔 - Merge segments 把不同的區塊結合在一起 - Resolve labels (determine their address) 把在 library 中的 label 找出來 - Patch location-dependent and external references - Could leave location dependencies for fixing by a relocating loader - But with virtual memory, no need to do this - Program can be loaded into absolute location in virtual memory space. #### Load a Program - Load from an image file on a disk into memory - Read the header to determine segment sizes - Create a virtual address space - Copy the text and initialized data into memory - Or set page table entries so they can be faulted in - Set up arguments on stack - Initialize registers (including $sp, $fp, $gp) - Jump to startup routine - Copy arguments to $a0, … and call the main routine - When the main routine returns, do exit() system call #### Dynamic Linking - Only link/load a library procedure when it is called - Require the procedure code to be relocatable - Avoid image bloat caused by static linking of all (transitively) referenced libraries - Automatically pick up new library versions #### Lazy Linkage ![](https://i.imgur.com/kKauns8.png) #### Start Java Applications ![](https://i.imgur.com/OSMlYaM.png) - 跨平台 - 編成 JVM <!-- TBC --> ![h3標題](https://i.imgur.com/wkYdhaY.png) ### 2.12 A C sort Example to Put It All Together #### C Swap Example - Illustrate use of assembly instructions for a C bubble sort function - Swap procedure ```c= void swap(int v[], int k) { int temp; temp = v[k]; v[k] = v[k+1] v[k+1] = temp; } ``` - v in $a0, k in $a1, temp in $t0 #### Procedure Swap ```MIPS= swap: sll $t1,$a1,2 #$t1=k*4 add $t1,$a0,$t1 #$t1=v+(k*4) lw $t0, 0($t1) # (address of v[k]) lw $t2, 4($t1) #$t0(temp)=v[k] sw $t2, 0($t1) #v[k] = $t2(v[k+1]) sw $t0, 4($t1) #v[k+1 = $t0] (temp) jr $ra #return to calling routine ``` #### C Procedure Sort - Non-leaf(calls swap) ```c= // v in $a0, n in $a1 void sort( int v[], int n) { // i in $s0, j in $s1 int i, j; for(i = 0; i < n; i += 1) { for( j = i - 1; j <= 0 && v[j] > v[j+1]; j -= 1) { // swap 為自定義函式 // 將 v[j] 跟 v[j+1] 交換 swap(v,j); } } } ``` #### Procedure Body ```MIPS= move $s2, $a0 # save $a0 into $s2 move $s3, $a1 # save $a1 into $s3 move $s0, $zero # i = 0 for1tst: slt $t0, $s0, $s3 # $t0 = 0 if $s0 ≥ $s3 (i ≥ n) beq $t0, $zero, exit1 # go to exit1 if $s0 ≥ $s3 (i ≥ n) addi $s1, $s0, –1 # j = i – 1 for2tst: slti $t0, $s1, 0 # $t0 = 1 if $s1 < 0 (j < 0) bne $t0, $zero, exit2 # go to exit2 if $s1 < 0 (j < 0) sll $t1, $s1, 2 # $t1 = j * 4 add $t2, $s2, $t1 # $t2 = v + (j * 4) lw $t3, 0($t2) # $t3 = v[j] lw $t4, 4($t2) # $t4 = v[j + 1] slt $t0, $t4, $t3 # $t0 = 0 if $t4 ≥ $t3 beq $t0, $zero, exit2 # go to exit2 if $t4 ≥ $t3 move $a0, $s2 # 1st param of swap is v (old $a0) move $a1, $s1 # 2nd param of swap is j jal swap # call swap procedure addi $s1, $s1, –1 # j –= 1 j for2tst # jump to test of inner loop exit2: addi $s0, $s0, 1 # i += 1 j for1tst # jump to test of outer loop exit1: ``` #### Full Procedure ```mips= sort: addi $sp, $sp, -20 #make room on stack for 5 registers sw $ra, 16($sp) #save $ra on stack sw $s3, 12($sp) #save $s3 on stack sw $s2, 8($sp) #save $s2 on stack sw $s1, 4($sp) #save $s1 on stack sw $s0, 0($sp) #save $s3 on stack ... #procedure body in the previous exit1: lw $s0, 0($sp) #restore $s0 from stack lw $s1, 4($sp) #restore $s1 from stack lw $s2, 8($sp) #restore $s2 from stack lw $s3, 12($sp) #restore $s3 from stack lw $s0, 0($sp) #restore $s0 from stack addi $sp, $sp, 20 #restore stack pointer jr $ra #return to calling routine ``` #### Lesson Learnt - Instruction count and CPI are not good performance indicators in isolation. - 單看指令數跟 CPI 無法評斷效能好壞 - Compiler optimizations are sensitive to algorithms. - Java/JIT compiled code is significantly faster than JVM interpreted. - Comparable to optimized C in some cases - Nothing can fix a dumb algorithm! - 沒有東西可以拯救智障的演算法 ### Arrays versus Pointers - Array indexing involves … - Multiply index by element size - Add to array base address - Pointers correspond directly to memory addresses - Can avoid indexing complexity ### Real Stuff: ARM Instructions, x86 Instructions, ARM v8(64-bit) Instructions ### Fallacies and Pitfalls <!-- This is the end of the NOTE -->