Chap. 02 - Instructuion Set

# Chap. 02 - Instructuion Set Architecture > 課程內容 : 清華大學開放式課程黃婷婷教授 > 參考書目 : Computer Organization and Design: The Hardware/Software Interface (5th), David A. Patterson, John L. Hennessy > > 其他科目內容請參見 [[Here]](https://hackmd.io/@ChenZE/By4SOO6Jyl) ## Content * [Sec. 2.1 Instruction set architecture](#Sec-21-Instruction-set-architecture) * [1. Recall in C language](#1-Recall-in-C-language) * [2. Components of an ISA](#1-Components-of-an-ISA) * [Sec. 2.2 Operands](#Sec-22-Operands) * [1. Register operands and their organization](#1-Register-operands-and-their-organization) * [1.1 Register](#11-Register) * [1.2 Register architecture](#12-Register-architecture) * [2. Memory operands and data transfer](#2-Memory-operands-and-data-transfer) * [2.1 Data transfer: Memory to Register](#21-Data-transfer-Memory-to-Register) * [2.2 Data transfer instruction](#22-Data-transfer-instruction) * [2.3 Compilation with memory](#23-Compilation-with-memory) * [2.4 Data transfer: Register to memory](#24-Data-transfer-Register-to-memory) * [2.5 Compilation with memory](#25-Compilation-with-memory) * [2.6 Register V.S. Memory](#26-Register-VS-Memory) * [3. Immediate operands](#3-Immediate-operands) * [3.1 The constant zero](#31-The-constant-zero) * [Sec. 2.3 Sign and unsigned number (略)](#Sec-23-Sign-and-unsigned-number-略) * [Sec. 2.4 Representing instructions](#Sec-24-Representing-instructions) * [1. R-format instruction](#1-R-format-instruction) * [2. I-format instruction](#2-I-format-instruction) * [2.1 Example (1)](#21-Example-1) * [2.2 Example (2)](#22-Example-2) * [Sec. 2.5 Operation](#Sec-25-Operation) * [1. Logical](#1-Logical) * [1.1 Shift operations](#11-Shift-operations) * [1.2 AND operation](#12-AND-operation) * [1.3 OR operation](#13-OR-operation) * [1.4 NOT operations](#14-NOT-operations) * [2. Decision making and branches](#2-Decision-making-and-branches) * [2.1 $\mathtt{beq}$: branch equal](#21-mathttbeq-branch-equal) * [2.2 J-format instruction](#22-J-format-instruction) * [2.3 Inequality](#23-Inequality) * [2.4 Signed v.s. Unsigned](#24-Signed-vs-Unsigned) * [Sec. 2.6 Supporting procedures in hardware](#Sec-26-Supporting-procedures-in-hardware) * [1. Procedure calling](#1-Procedure-calling) * [2. Instruction for procedure call](#2-Instruction-for-procedure-call) * [3. Code for callee](#3-Code-for-callee) * [4. Non-leaf procedures](#4-Non-leaf-procedures) * [5. Memory layout](#5-Memory-layout) * [Sec. 2.7 Communicating with people](#Sec-27-Communicating-with-people) * [1. Challenge](#1-Challenge) * [2. Improvenment](#2-Improvenment) * [2.1 Example](#21-Example) * [Sec. 2.8 Addressing for 32-bit immediate and addresses](#Sec-28-Addressing-for-32-bit-immediate-and-addresses) * [1. 32-bit constants](#1-32-bit-constants) * [2. Branch addressing](#2-Branch-addressing) * [2.1 Syntax](#21-Syntax) * [2.2 Improvement](#22-Improvement) * [3. Jump addressing](#3-Jump-addressing) * [3.1 Syntax](#31-Syntax) * [3.2 Improvement](#32-Improvement) * [4. MIPS instruction summary](#4-MIPS-instruction-summary) * [Sec. 2.9 補充: Translating and starting program](#Sec-29-補充-Translating-and-starting-program) * [Sec. 2.10 Sort example](#Sec-210-Sort-example) * [1. C sort example (暫略)](#1-C-sort-example-暫略) * [2. Optimization](#2-Optimization) * [Sec. 2.11 Arrays v.s. pointers](#Sec-211-Arrays-vs-pointers) * [1. Difference](#1-Difference) * [2. Example: clearing and array](#2-Example-clearing-and-array) * [Sec. 2.12 ARM and X86 instruction sets](#Sec-212-ARM-and-X86-instruction-sets) * [1. Instruction](#1-Instruction) * [2. ARM v.s. MIPS](#2-ARM-vs-MIPS) * [3. Fallacies (常見錯誤迷思)](#3-Fallacies-常見錯誤迷思) * [4. Pitfalls (易錯觀念)](#4-Pitfalls-易錯觀念) * [5. Concluding remarks](#5-Concluding-remarks) ## Sec. 2.1 Instruction set architecture ### 1. Recall in C language > Computer architecture = Instruction set architecture + Machine organization 回顧 C 語言(或其他語言)中，程式碼通常可分成以下的幾個區塊 * 運算子 operator : `+`、`-`、`*`、`/`、`%` * 運算元 operand * 變數 variable : `lower`, `upper`, ... * 常數 constant : `0`, `1`, ... * 敘述 assignment statement : `a = b + 1` 當使用 Assembler 組譯成組合語言時，以 `a = b + 5` 為例 ![S__2170886](https://hackmd.io/_uploads/rJ5zNl0mC.jpg) 同樣會有些運算子、運算元等基本要素構成 ### 2. Components of an ISA ISA 的組成包含以下幾點 * Organization of programmable storage，用以表示資料該怎麼放，怎麼組織 * register 暫存器 * memory 記憶體 * mode of addressing and accessing data item and instruction * Data type and data structure，資料型態與結構 * Instruction format，指令的格式，例如長度為 32 bits 或 64 bits，或是每個區塊代表的意義 * Instruction set(或稱 operaion code, op code)，表示可以執行哪些指令以 MIPS 架構為例，指令長度為 32-bits，且總共有三種不同的格式，如下圖 ![S__2170887](https://hackmd.io/_uploads/rJUV8xR7C.jpg) ## Sec. 2.2 Operands ### 1. Register operands and their organization 以下為最基本的 MIPS arithmetic/logic instruction(算數與邏輯指令)，總共包含四個區塊(field) $$ \begin{align*} &1 & &2 & &3 & &4\\ a&dd & $&s0 & $&s1 & $&s2 \end{align*} $$ * 各部名稱 * Field 1 : operation by name, or called OP code (運算子的名稱) * Field 2 : operand getting result, or called destination (儲存結果) * Field 3 : 1st operand for operation, or called "source 1" (來源 1) * Field 4 : 2nd operand for operation, or called "source 1" (來源 2) * 意義 : 把 $s1$ 的內容 $+$ $s2$ 的內容後放到 $s0$ 中 * 以上四個區塊的順序不可改變，且語法結構是固定的 * 目的 : Keep hardward simply via regularity * 每一個指令的長度皆為 32-bits > **Design principle 1** - Simplicity favors regularity > * Regularity makes implementation simpler > * Simplicity enables highers performance at lower cost > * 越簡單，速度越快舉例 : How to do the following C statement ? ```c f = (g + h) - (i - j); ``` 經過編譯器編譯成 MIPS code 後 $$ \begin{align*} &add & &t0, & &g, &h & &// \text{temp} \;\; t0 = g + h\\ &add & &t1, & &i, &j & &// \text{temp} \;\; t1 = i + j\\ &sub & &f, & &t0, &t1 & &//f = t0 - t1\\ \end{align*} $$ #### 1.1 Register 與一般的高階語言不同，組合語言不會使用變數(variables)，組合語言的運算子稱為 Register (暫存器)。是靠近計算單元的一個特殊位置，數量有限，目的是用來儲存暫存的資料。直接焊接在 Hardware 之上，運算通常會在 Register 之中完成。因為是直接以 Hardware 實作，所以優點是速度比記憶體快，且易於實作與編譯以 MIPS 架構為例，總共有 32 個暫存器，因此需要 5 bits 來描述暫存器的位置 $(\because 2^5=32)$ 且每個暫存器的大小為 32 bits (32 bits wide)，表示可以處理 32 bits 的資料量。每 32 bits 的大小稱為 a word in MIPS > Why is 32？ > **Design Principle 2** - Smaller is faster 這 32 個暫存器可以用數字(from $0 to $31)或名稱來稱呼，習慣上＄16 到 $23 被稱為 $s0 到 $s7，作用相當於高階語言中的變數。 $8 到 $15 被稱為 $t0 到 $t7，作用相當於中間值(temporary)，每個暫存器都有他各自的意義與用途(如下圖) ![S__2179092](https://hackmd.io/_uploads/ryfvwpLNA.jpg) 以上述的 C 語言加法為例，若 f 到 j 依序使用 $s0 到 $s4 且中間值使用 $t0 與 $t1，則編譯成組合語言後為 $$ \begin{align*} &add & &$t0, & &$s1, & &$s2 &//\text{temp}\;t0 = g + h\\ &add & &$t1, & &$s3, & &$s4 &//\text{temp}\;\;t1 = i + j\\ &sub & &$s0, & &$t0, & &$t1 &// f = (g+h) - (h+j)\\ \end{align*} $$ #### 1.2 Register architecture 暫存器的架構共有 4 種 1. Accumulator(累加器) 只有一個 Register，做法是將資料從 Memory 載入並加到 Accumulator 之中。例如 : * 1 address : $\text{add} \; A$ 表示 $acc \leftarrow acc + Mem[A]$ * 1 + x address : $\text{add} \; A$ 表示 $acc \leftarrow acc + Mem[A+x]$ 2. Stack 以 Stack 的方式實作，例如 : * 0 address : $\text{add}$ 表示 $tos \leftarrow tos + next$ 3. General Purpose Register * 2 address : $\text{add} \; A, B$ 表示 $EA(A) \leftarrow EA(A) + EA(B)$ * 3 address : $\text{add} \; A, B, C$ 表示 $EA(A) \leftarrow EA(A) + EA(B) + EA(C)$ 4. Load/Store(a special case of GPR) * 3 address : $\text{add}$ $ra, $rb, $rc 表示 $ra $\leftarrow$ $rb + $rc $\text{load}$ $ra, $rb 表示 $ra $\leftarrow$ Mem[$rb] 前兩種方式是早期做法，因為硬體資源貴，所以只有一個暫存器。後兩種方式差別在於第 3 種的資料可放在 Memory 或 Register 但第 4 種資料一定放在 Register 中 ### 2. Memory operands and data transfer > Variables in C is mapping to register, then however **what about large data structure like arrays ?** #### 2.1 Data transfer: Memory to Register 在 MIPS 中，算術運算的指令是在 Register 中完成的，不是直接在 Memory 中。相關的指令包括 $lw, $sw 等，稱為 Data transfer instruction，目的是將資料從 Memory 移動到 Register 中，因此需要一個能夠處理記憶體的運算元因為 Memory address 的大小也是 32-bits 所以無法使用指令來表示記憶體位址(太長)，因此可將 Address 視為一個一維陣列，以一個 pointer 來表示記憶體位址，再使用一個 offset (偏移量，以 byte 為單位)來表示相對距離，就可指出資料的記憶體位址。 ![S__2179096](https://hackmd.io/_uploads/HksHZmwER.jpg) 舉例 : 8($t0) 表示記憶體位址 = $t0 所存的位址 + 8 bytes :::info 因為 MIPS 架構中 Instruction 的長度是 32-bits，而記憶體位址的表示長度也是 32-bits，所以如果使用指令來表示記憶體位址，那其他的 OP code, Register 就會無法表示 ::: #### 2.2 Data transfer instruction $$ \begin{align*} &1 & &2 & &3 & &4\\ &lw & $&t0, & &12 & ($&s0) \end{align*} $$ * 各部名稱 * Filed 1 : OP code * Field 2 : register that will receive **value** (存值，不是位址) * Field 3 : offset in bytes * Field 4 : register containing pointer to memory, called base register typically 舉例 : $$ lw \;\;\; $t0, 12($s0) $$ where $s0 = 1000 * Take the pointer in $s0 then add 12 bytes to it * $s0 + 12 = 1012 * Load word(lw) from the memory pointed into register $t0 * 將記憶體位址 1012 所存的內容存到暫存器 $t0 ![S__2211842](https://hackmd.io/_uploads/r15jbelSR.jpg) #### 2.3 Compilation with memory 舉例 : 假設 $s1 表示變數 g , $s2 表示變數 h , $s3 表示 base，編譯程式碼 `g = h + A[8];` 有兩個動作 * 載入 `A[8]` 的值 * 有 8 個元素，每個元素大小為 4 byte : $8 \times 4 = 32$ (offsets) * 相加因此，程式碼 `g = h + A[8];` 編譯完後的結果為 ```mips lw $t0 32($s3) # load value to t0 add $s1 $s2 $t0 # add s2 and t0 then store into s1 ``` #### 2.4 Data transfer: Register to memory $$ \text{sw}\ \ \ $t0,\ 12($s0) $$ 與 lw 相同，sw 表示 store word，暫存器 \$t0 表示要存到記憶體的值，常數表示 offest，$s0 表示 base register ![S__2211843](https://hackmd.io/_uploads/H1ATilgrA.jpg) #### 2.5 Compilation with memory 假設 \$s2 表示變數 h，\$s3 表示 base，編譯程式碼 `A[12] = h + A[8];` 有三個動作 * 載入 `A[8]` 的值 * 相加 * 儲存 `A[12]` 的值 ```mips lw $t0, 32 $s3 # load A[8] to t0 where offset = 8*4 = 32 add $t0, $s2 $t0 # add to and s2 in t0 sw $t0, 48 $s3 # store t0 into A[12] ``` #### 2.6 Register V.S. Memory * 如果編譯過程中，變數的數量多於暫存器的數量？ * Compiler 會將使用頻率高的變數放在 register * 使用頻率低的變數放在 Memory : spilling * 為何不把所有的變數都放在記憶體中？ * 因為記憶體的存取比較耗時 ### 3. Immediate operands 對於常用的常數，或是會頻繁使用的常數，如果存在記憶體中等需要時再從記憶體載入，時間會太慢，因此會將一些常用到的常數直接寫到指令中，例如 : $$ \text{addi}\ \ \ $29,\ \ \ $29,\ \ \ 4 $$ OP code 後面的 i 表示其中一個 operand 存放在 instruction 之中，因此常數 4 是在指令中，不是存於 register > **Design Principle 3** - Make the common case fast 因此，類此這樣的程式碼 : `f = g + 10;` (in C)經過編譯後會是 $$ \text{addi}\ \ \ $s0,\ \ \ $s1,\ \ \ 10 $$ 另外，由於已經將常用寫到指令之中，所以就不會再有減法的 opcode 了，直接用負數表示減法即可 #### 3.1 The constant zero 由於常數 0 出現頻率極高，依此常數 0 直接寫入到硬體之中，單獨使用一個 register ($zero or $0)來表示 0，且 register $zero 有以下特色 : * 無法被覆寫(overwritten) * 定義於硬體之中 * 通常用於資料的搬移舉例 : (1) 下列指令中，第一個指令不會發生任何事情，因為 $0 不會被覆寫，加完後依舊是 0 (2) 將 $s1 的資料搬到 $t2 中 ```mips addi $0, $0, 5 # nothing happen since $0 can not be overwritten add $t2 $s1 $0 # data movement ``` ## Sec. 2.3 Sign and unsigned number (略) ## Sec. 2.4 Representing instructions 一個指令中有 32-bits :arrow_right: 分割一個 instruction word 變成多個 "fields"，每個 fields 會有各自所代表的意義一般來說，在指令中會定義許多 "fields"。但在 MIPS 的架構中因為要簡單化，所以只定義了 3 種基本的指令型式 * **R-format** : for register * **I-format** : for immediate, $\mathtt{lw}$ and $\mathtt{sw}$ * **J-format** : for jump ### 1. R-format instruction R-format 指令的定義如下 : 主要用於 $+$、$-$、$\times$、$/$ 之類的運算 | $\mathtt{OP code}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{rd}$ | $\mathtt{shamt}$ | $\mathtt{funct}$ | | :-: | :-: | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 5-bits | 5-bits | 6-bits | * $\mathtt{Opcode}$ : 具體描述這個使令是哪一種指令，**但是 R-format 的 opcode 都是由 0 組成**，會進一步用 funct 做區分 * $\mathtt{funct}$ : 與 $\mathtt{opcode}$ 一起組成，表示指令用途 * $\mathtt{rs}$ : Source register * $\mathtt{rt}$ : Target register * $\mathtt{rd}$ : Destination registet，接收計算結果的暫存器 * $\mathtt{shamt}$ : 移動量(?) :::info Instruction 的 binary code 所表示的順序與 compile 後的順序不太一樣(見下面例子) ::: $$ \text{add} \ \ \ $\text{t0}, \ \ $\text{s1}, \ \ $\text{s2} $$ | $\mathtt{OP code}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{rd}$ | $\mathtt{shamt}$ | $\mathtt{funct}$ | | :-: | :-: | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 5-bits | 5-bits | 6-bits | | Special | $s1 | $s2 | $t0 | 0 | add | | 0 | 17 | 18 | 8 | 0 | 32 | | 000000 | 10001 | 10010 | 00100 | 00000 | 100000 | 指令型式 : $00000 | 10001 | 10010 | 00100 | 00000 | 100000_2 = 02323020_{16}$ ### 2. I-format instruction I-format 指令的定義如下 : 主要用於載入(load word)、儲存(store word)、常數運算(addi)之類的運算。與 R-format 相比，兩者有類似的結構，差別在於 I-format 是將 R-format 中後面的 $\mathtt{rd}$, $\mathtt{shamt}$ 和 $\mathtt{funct}$ 合併改成是 $\mathtt{immediate}$ (見 Design Principle 4) :::danger 這邊有一段是解釋為什麼 R-format 後面會有一段 func 需要 field，而不是直接用 OPcode 表示即可，但聽不太懂(?) ::: | $\mathtt{OP code}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{immediate}$ | | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 16-bits | * $\mathtt{Opcode}$ : 具體描述是哪一種 I-format 指令 * $\mathtt{rs}$ : Source register，I-format 中唯一的 operand * $\mathtt{rt}$ : Target register，接收計算結果 * $\mathtt{immediate}$ : 一個常數 * 以 16-bits 表示一個常數，但因為 register 中的資料是以 32-bits 為單位來儲存，所以這個 16-bits 的常數會根據最前面的 sign symbol (0 for positive and 1 for negative) 來擴增 bits (padding 0 or 1)，也就是做 sing-extended > **Design Principle 4 -** Good design demands good compromises > * Different formats complicate decoding, but allow 32-bits instruction uniformly > * Keep formats as similar as possible #### 2.1 Example (1) $$ \begin{align} &\text{addi} & &\text{dst}, & &\text{src}, & &\text{cons}\\ &\text{addi} & &$21 , & &\$22, & &$-50 \end{align} $$ | $\mathtt{OP code}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{immediate}$ | | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 16-bits | | 8 | $22 | $21 | -50 | | 001000 | 10110 | 10101 | 1111111111001110 | 表示 : \$21 = \$22 + (-50) #### 2.2 Example (2) $$ \begin{align} &\text{lw} & &\text{dst}, & &\text{cons}, & &\text{src}\\ &\text{lw} & &$\text{t0}, & &1200, & &$\text{t1} \end{align} $$ | $\mathtt{OP code}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{immediate}$ | | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 16-bits | | 35 | $9 | $8 | 1200 | | 100011 | 01001 | 01000 | 0000010010110000 | Address = 8 + 1200 = 1208 + X04B8 $t0 = X04B8 ## Sec. 2.5 Operation 到目前為止，不論是 register 的個數，還是 instruction 的長度，或是 memory address 都是以 32-bit 為單位，所有指令都是以 32-bit 作為一個整體的 block 進行操作和運算，不過其實也可以將這 32-bits 的 block 視為 32 個單一的 bit 為單位進行運算。 > New perspective : View contents of rerister as **32 bits** rather than as **a 32-bit number** > ### 1. Logical Instruction for bitwise manipulation | Operation | In C | In Java | In MIPS | | :-: | :-: | :-: | :-: | | shift left | << | << | $\mathtt{sll}$ | | shift right | << | << | $\mathtt{srl}$ | | bitwise AND | << | << | $\mathtt{and},\ \mathtt{andi}$ | | bitwise OR | << | << | $\mathtt{or},\ \mathtt{ori}$ | | bitwise NOT | << | << | $\mathtt{nor}$ | :::info bitwise NOT 中，因為 MIPS 的 R-foramt instruction 具有兩個輸入值(source register)，但 NOT 運算只要一個輸入就好，因此在 MIPS 的架構中是使用 $\mathtt{nor}$ 的運算(i.e., $\mathtt{nor}=\mathtt{not}+\mathtt{or}$)來保證具有兩個 source register ::: #### 1.1 Shift operations 指令定義如下 : | $\mathtt{OP code}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{rd}$ | $\mathtt{shamt}$ | $\mathtt{funct}$ | | :-: | :-: | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 5-bits | 5-bits | 6-bits | * $\mathtt{shamt}$ : how many positions to shift * shift left logical * shift left and fill with 0 bits (左移，且空的位置補 0) * $\mathtt{sll}$ by $i$ bits multiplies by $2^i$ (左移的效果等價於 $\times 2$) * shift right logical * shift right and fill with 0 bits (右移，且空的位置補 0) * $\mathtt{srl}$ by $i$ bits divides by $2^i$ (右移的效果等價於 $\div 2$) ##### Syntax $$ \begin{align} &1 & &2 & &3 & &4\\ s&ll & $&\text{t2}, & $&\text{s0}, & &4 \end{align} $$ * Field 1 : OP code * Field 2 : register that will receive value * Field 3 : first operand(register) * Field 4 : shift amount(constant) MIPS 中有三種不同的 shift instructions * $\mathtt{sll}$ (shift left logical) : 往左移動，且空的位置補 0 * $\mathtt{srl}$ (shift right logical) : 往右移動，且空的位置補 0 * $\mathtt{sra}$ (shift right arithmetic) : 往右移動，且空的位置補 sign symbol ##### Example * shift right by 8-bits ($\mathtt{srl}$) * 原始 : **0001 0010 0011 0100 0101 0110** 0111 1000 * 輸出 : ==0000 0000== **0001 0010 0011 0100 0101 0110** (padding 0) * shift right arithmetic by 8-bits ($\mathtt{sra}$) * 原始 : **1001 0010 0011 0100 0101 0110** 0111 1000 * 輸出 : ==1111 1111== **==1==001 0010 0011 0100 0101 0110** (padding sign symbol **1**) 在二元運算中，$\times 4$ 等價於左移 2 位元(Ex: $11_2 \times 100_2 = 1100_2$)，同理，$\times 2^n$ 等價於右移 n 位元。由此可以想見 bitwise shift 的運算算度會比一般乘法速度快上取多，所以在 C 語言中如果是進行 $\times 2^n$ 的運算，通常會編譯成 shift instruction。例如 : * C language : `a *= 8;` * Compile to MIPS instruction : `sll $s0, $s0, 3` #### 1.2 AND operation AND 運算主要用來 **masked** 某些位元，目的在於保留某些 bits (與 1 作 AND)且其餘設為 0 (與 0 作 AND) $$ \text{add}\ \ \ $t0,\ \ $t1,\ \ $t2 $$ $$ \begin{align} $t2 : 0000\ 0000\ 0000\ 0000\ 00\textbf{00}\ \textbf{11}01\ 1100\ 0000\\ $t1 : 0000\ 0000\ 0000\ 0000\ 00\textbf{11}\ \textbf{11}00\ 0000\ 0000\\ $t0 : 0000\ 0000\ 0000\ 0000\ 00\textbf{00}\ \textbf{11}00\ 0000\ 0000\\ \end{align} $$ #### 1.3 OR operation OR 運算主要用來 **include** 某些位元，目的在於將某些 bits 設為 1 (與 1 作 OR)且其餘不變(與 0 作 OR) $$ \text{or}\ \ \ $t0,\ \ $t1,\ \ $t2 $$ $$ \begin{align} $t2 : 0000\ 0000\ 0000\ 0000\ 00\textbf{00}\ \textbf{11}01\ 1100\ 0000\\ $t1 : 0000\ 0000\ 0000\ 0000\ 00\textbf{11}\ \textbf{11}00\ 0000\ 0000\\ $t0 : 0000\ 0000\ 0000\ 0000\ 00\textbf{11}\ \textbf{11}00\ 0000\ 0000\\ \end{align} $$ #### 1.4 NOT operations NOT 運算主要用來 **invert** 某些位元，目的在於將 0 設為 1 且 1 設為 0 $$ \text{nor}\ \ \ $t0,\ \ $t1,\ \ $\text{zero} $$ $$ \begin{align} $t1 : 0000\ 0000\ 0000\ 0000\ 00\textbf{11}\ \textbf{11}00\ 0000\ 0000\\ $t0 : 1111\ 1111\ 1111\ 1111\ 11\textbf{00}\ \textbf{00}11\ 1111\ 1111\\ \end{align} $$ 雖然只要做 NOT 運算即可，但因為 load-store 的指令需要有 2 個輸入，所以改為 NOT + OR 的運算，也就是先與 \$zero 作 OR 再將結果作 NOT 運算 ($\mathtt{NOT}$ a == $\mathtt{NOT}$ (a $\mathtt{OR}$ \$zero) == a $\mathtt{NOR}$ \$zero) ### 2. Decision making and branches 程式語言厲害的地方在於可以進行選擇性結構來做程式區段的跳躍，接下來的指令同樣會進行判斷與區段的跳躍。 #### 2.1 $\mathtt{beq}$: branch equal 語法結構如下 : $$ \begin{align} &1 & &2 & &3 & &4 \\ b&eq & regi&ster1 & regi&ster2 & &L1 \\ \end{align} $$ * Field 1 : OP code * $\mathtt{beq}$ means branch if (register are) equal * In other word :arrow_right: `if (register1 == register2) goto L1` * 類似的 OP code 還有 $\mathtt{bne}$ (branch if not equal) * Field 2 : First register * Field 3 : Second register * Field 4 : Label 以上這些指令($\mathtt{beq}$ 與 $\mathtt{bne}$) 被稱為 conditional branches，表示條件判斷正確的話執行(跳到)後面的敘述 :::info 仔細觀察這個 instruction 的結構會發現，他的結構與 I-format 的結構很像，都是一個 OPcode 和兩個輸入值，因次這類型的 conditional branches 其實是一種 I-format 的 instruction ::: #### 2.2 J-format instruction J-format 是一種 unconditional branch，不用任何的條件判斷就可以直接跳至指定的區段中，定義如下 $$ \begin{align} &1 & &2 \\ &j & la&bel \\ \end{align} $$ Field 1 : OPcode Field 2 : Label, destination 只要看到這個指令，直接跳到 label 的地方繼續執行，不用作任何判斷。此外這個指令也可以使用 $\mathtt{beq}$ 表示 : `beq $0, $0, label` ##### Example (1) ```flow cond=>condition: i == j? ed=>end: Exit add=>operation: f = g + h sub=>operation: f = g - h cond(yes)->add->ed cond(no)->sub->ed ``` C code ```c if (i == j) f = g + h; else f = g - h; ``` 使用 $s0 到 $s4 表示變數 f 到 j，則編譯後的 MIPS 為 : ```assembly= bne $s3, $s4, Else # branch (i != j) add $s0, $s1, $s2 # f = g + h (True) j Exit # go to Exit Else: sub $s0, $s1, $s2 # f = g - h (False) Exit: ``` * 第 1 行表示 : 若 $s3 != $s4 (條件判斷正確)則跳到 Else 的地方繼續執行 * 第 2 行表示 : 若 $s3 == $s4 (條件判斷錯誤)則繼續往後執行 * 第 3 行表示 : 執行完加法後直接跳到 Exit 的地方結束 * 第 4 行表示 : 執行條件中的減法運算 * 第 5 行表示 : 結束 :::warning 第 4 行的 jump 指令一定要加，否則程式執行完 add 指令後會接續執行 sub 指令 ::: ##### Example (2) C code ```c while (save[i] == k) i += 1; ``` 假設變數 i 存放於 $s3，變數 k 存放於 $s5，且 memory address base 存放於 $s6 Compile to MIPS code ```assembly= Loop: sll $t1, $s3, 2 # t1 = i * 4(bytes) add $t1, $t1, $s6 # t1 = addr of save[i] lw $t0, 0($t1) # t0 = save[i] bne $t0, $s5, Exit # if save[i] != k goto Exit addi $s3, $s3, 1 # i = i + 1 j Loop # go to Loop Exit: ``` 1. 必須先存取 Array 中的變數，且因為 Array 中的變數大小是以 4 bytes 為單位，因此使用 `ssl $t1, $s3, 2` 將 \$s3 的值右移兩個 bits $(\times 4)$ 得到 offset (存放在 \$t1) 2. 將 \$t1 中的 offset 加上 memory address base (\$s6) 得到當前 `save[i]` 的位址 3. 使用 `lw $t0, 0($t1)` 來存取 \$t1 中位址的數值。至此完成 `save[i]` 的讀取 4. 判斷 \$t0 (i.e., `save[i]`) 與 \$s5 (i.e., `k`) 的數值是否相同 * 相同(`False`)則接續進行 5. * 不相同(`True`)則跳至 7. 5. 執行 `i += 1` 6. 變數 `i` 加完後直接跳值 `Loop` 標籤的位置繼續進行迴圈 7. 程式結束 #### 2.3 Inequality 到目前為止已經介紹了 equality 的狀況(`==` 或 `!=`)，但 programming language 中尚有包含不等式。在 MIPS 中會使用 `slt` 或 `slti` (set on less than)來表示不等式 $$ \begin{align} &1 & &2 & &3 & &4\\ s&lt & &rd, & &rs, & &rt\\ \end{align} $$ `slt rd, rs, rt` 表示 `if (rs < rt) rd = 1; else rd = 0;` $$ \begin{align} &1 & &2 & &3 & &4\\ s&lti & &rt, & &rs, & cons&tant\\ \end{align} $$ `slti rt, rs, constant` 表示 `if (rs < constant) rt = 1; else rt = 0;` 從語法架構上可發現他與 R-format 類似，此外因為 MIPS 講求越簡單越好，因此在設計上只有 $<$ 的語法，其他的 $\le$, $>$, $\ge$ 等都可以用小於搭配其他等式結構來創建 ##### Example C code ```c if (g < h) goto Less; ``` 假設變數 `g: $s0` 且 `h: $s1` ```assembly= slt $t0, $s0, $s1 # if g < h then $t0 = 1, otherwise $t0 = 0 bne $t0, 0, Less # $t0 != 0 (i.e., g < h, True), goto Less Less: ``` #### 2.4 Signed v.s. Unsigned 在做比較時，會牽扯到正負數的 symbol 問題，若是 signed number 則會使用 `slt`, `slti` 來比較；若是 unsigned number 則會使用 `sltu`, `sltiu` 比較。例如 : `$s0 = 1111 1111 1111 1111 1111 1111 1111 1111` 與 `$s1 = 0000 0000 0000 0000 0000 0000 0000 0001` 在 sign 與 unsigned 是兩種不同的意義，在 sign number 中 \$s0 表示 -1 而 \$s1 表示 1；在 unsigned number 中 \$s0 表示 +4,294,967,295 而 \$s1 表示 +1 ## Sec. 2.6 Supporting procedures in hardware 類似 programming language 中的函數呼叫，講的是程式進行時主控權的轉換，牽涉到程式主控權轉換時需要進行的動作，例如變數的儲存、下一個指令的位址(i.e., program counter)等 ### 1. Procedure calling procedure calling 過程中的需求 * Caller (叫人的) * 將參數放到 register 之中 (place parameters in registers) * 轉移程式的控制權 (transfer control to procedure) * Callee (被叫的) * 取得 procedure 的儲存空間 (acquire storage for procedure) * 執行 procedure 的運算 (perform procedure's operation) * 將結果放在 register 之中 (place result in register for caller) * 返回控制權 (return to place of call) 以下列這段 C code 為例 ```c= sum = leaf_example(a, b, c, d); int leaf_example(int g, h, i ,j){ ing f; f = (g + h) - (i + j); return f; } ``` * Return address : `$ra` * 當程式轉移控制權時，需要紀錄轉移前進行到哪個指令，才能在之後回到該指令的下一步繼續進行 * Procedure address : `Label` * 要跳到哪一個 procedure * Arguments : `$a0`, `$a1`, `$a2`, `$a3` * 用來儲存 arguments 的特殊 register * Return value : `$v0`, `$v1` * 用來儲存返回值的特殊 register * Local variables : `$s0`, `$s1`, ..., `$s7` * 用來儲存 procedure 中局部變數的特殊 register 關於 register 的使用習慣如下圖所示 ![register-convention-mips](https://hackmd.io/_uploads/HyvMO9MdC.jpg) `$8` 到 `$15` 主要是給 caller 使用，`$16` 到 `$23` 主要是給 callee 使用，`$4` 到 `$7` 是紀錄 arguments 的，`$31` 是給 return address 的 register ### 2. Instruction for procedure call 當程式需要執行 child procedure 時需要做兩的動作: (1) 紀錄當前執行到哪個 instruction 並在返還時回到 next instruction ，也就是 program conunter 的概念 (2) 紀錄完後跳到要執行的 child procedure。由此可衍生出 jump and link ($\mathtt{jal}$) 這個指令 : $$ \begin{align} &1 & &2\\ j&al & procudure&\_label \end{align} $$ * address of following instruction put in `$ra` * jumps to target address (i.e., procedure_label) 執行完 child procedure 後，要回到 parent procedure，此時需要加 `$ra` 中所儲存的 return address 傳給 program counter，由此可衍生出 jump reguster ($\mathtt{jr}$) 這個指令 : $$ \begin{align} &1 & &2\\ &jr & &\$ra \end{align} $$ 以前段 C code 中的第一行 `sum = leaf_example(a, b, c, d);` 為例。假設變數 `a` 到 `d` 為 `$s0` 到 `$s3` 且變數 `sum` 對應到 `$s4` ，則 compile to MIPS code 如下 ```assembly= add $a0, $0, $s0 # 將 $s0 的內容搬移到 $a0 add $a1, $0, $s1 # 將 $s1 的內容搬移到 $a1 add $a2, $0, $s2 # 將 $s2 的內容搬移到 $a2 add $a3, $0, $s3 # 將 $s3 的內容搬移到 $a3 jal leaf_example # jump to leaf_example add $s4, $0, $v0 # 將 $v0 的結果搬移到 $s4 ``` ### 3. Code for callee 前一小節主要聚焦在 caller 的指令，也就是以 parent procedure 為主角描述控制權轉移，這小節以 callee 為主。同樣以以下的 C code 為例 ```c= int lead_example(int g, h, i, j){ int f; f = (g + h) - (i + j); return f; } ``` * arguments `g` to `j` are saved in `$a0` to `$a3` * local variable `f` is saved in `$s0` * results of `g + h` and `i + j` are saved in `$t0` and `$t1`, respectively * final result is saved in `$v0` 因為 `s0`, `$t0`, `$t1` 已經在 caller (或其他地方) 使用過了，所以 callee 使用前需要先將這些 register 中原有的值儲存起來，避免被刪除 :arrow_right: 以 stack 的方式存在 memory 中 (如下圖)。注意在 memory 中 stack 的堆疊是往下長的，所以 memory address 是負號，此外會使用另一個 register ($\mathtt{sp}$) 來儲存 stack pointer (stack 的最底層) ![S__2236443_0](https://hackmd.io/_uploads/r1Ea7ifd0.jpg) 上述的 C code 經過 compile 後的 MIPS code 如下 ```assembly= # 在 memory 中以 stack 形式儲存 register 的值 addi $sp, $sp, -12 # save base address to stack pointer sw $s0, 0($sp) # $s0 = $sp + 0, stack in memory sw $t0, 4($sp) # $t0 = $sp + 4, stack in memory sw $t1, 8($sp) # $t1 = $sp + 8, stack in memory # 執行 child procedure add $t0, $a0, $a1 # g + h add $t1, $a2, $a3 # i + j sub $s0, $t0, $t1 # f = (g + h) - (i + j) add $v0, $s0, $0 # move $s0 to $v0 # 從 memory 中載入原先 register 的值 lw $s0, 0($sp) # $s0 = $sp + 0, load value from memory lw $t0, 4($sp) # $t0 = $sp + 4, load value from memory lw $t1, 8($sp) # $t1 = $sp + 8, load value from memory addi $sp, $sp, -12 # 返回控制權 jr $ra ``` 但上述方式會有一些缺點 : memory access 的過程中非常耗時。因此改善方式會將某些不重要的值不再放入 memory stack 之中，避免過多資料搬移而浪費時間。所以 register 在命名上才會有 `$t0` (temporary) 與 `$s0` 的差別。以上述的 MIPS code 為例， `$t0` 與 `$t1` 就不會再存入 memory stack 之中，詳見下方 MIPS code (與上述比較) ```assembly= # 在 memory 中以 stack 形式儲存 register 的值 addi $sp, $sp, 4 # save base address to stack pointer sw $s0, 0($sp) # $s0 = $sp + 0, stack in memory # 執行 child procedure add $t0, $a0, $a1 # g + h add $t1, $a2, $a3 # i + j sub $s0, $t0, $t1 # f = (g + h) - (i + j) add $v0, $s0, $0 # move $s0 to $v0 # 從 memory 中載入原先 register 的值 lw $s0, 0($sp) # $s0 = $sp + 0, load value from memory addi $sp, $sp, 4 # 返回控制權 jr $ra ``` ### 4. Non-leaf procedures 會呼叫其他 procedure 的 procedure 稱為Ｎon-lead procedure，簡單來說就類似於巢狀結構或迴圈的概念，一層一層傳遞下去，同時每一層的 caller 都會被存在 stack 之中，也就是說每一層的 caller 的 return address 都要儲存以此段 C code 為例 ```c int fact(int n){ if (n < 1) return 1; else return n * fact(n-1); } ``` 假設 argument `n` 會被存在 `$a0` 且 result 存在 `$v0` Compile 為 MIPS code 後如下 ```assembly= fact: # (Block: 1) 創建記憶體空間來儲存當前的 "回傳位址" 與 "參數" addi $sp, $sp, -8 # create storage of stack for 2 items sw $ra, 0(sp) # add($ra) = $sp + 0, save return address sw $a0, 4(sp) # add($a0) = $sp + 4, save argument # (Block: 2) 判斷 if (n < 1) slti $t0, $a0, 1 # set $t0 = 1 if n($a0) < 1, oeherwise $to = 0 beq $t0, $zero, L1 # $t0 == 0, goto L1 # (Block: 3) True addi $v0, $zero, 1 # 回傳結果 1 addi $sp, $sp, 8 # 釋放當前的記憶體空間 $jr $ra # 跳到 $ra(前一層的回傳位址), 也就是 "前一層的下個指令" L1: # (Block: 4) False addi $a0, $a0, -1 # n = n - 1 jal fact # recursive call # (Block: 5) lw $a0, 0($sp) # 從記憶體載入當前參數 lw $ra, 4($sp) # 從記憶體載入 "前一層的回傳位址" addi $sp, $sp, 8 # 釋放空間 mul $v0, $a0, $v0 # 後一層結果($v0)與當前參數($a0)相乘 jr $ra # 跳到 $ra(前一層的回傳位址), 也就是 "前一層的下個指令" ``` 細節流程可參考下圖，以 `fact(n=3)` 為例 ![EE4CEBDA-63DB-4C6C-BEC9-D8607166723F](https://hackmd.io/_uploads/BJc6kgHuA.jpg) ![1F3DD9B1-0B72-4A11-BA91-8171CEE05662](https://hackmd.io/_uploads/Hyz0ygruR.jpg) ### 5. Memory layout Procedure 執行的過程中，callee 會分配一些記憶體空間來儲存局部變數，而類似這樣由 callee 所產生的 procedure frame 稱為 activation record (下圖黃色區塊) ![S__2236451](https://hackmd.io/_uploads/HkmaXzSOA.jpg) 以整個記憶體空間的配置來看(如下圖)，大致可分為以下區域 : * Text : 主要存放程式碼 (program code) * Static data : 不會被改變的全域變數，例如 * static variables in C * constant array, strings * Dynamic data : heap * malloc in C * new in Java * Stack : 自動的儲存空間 ![S__2236448](https://hackmd.io/_uploads/HkESVzSdC.jpg) 由上圖可知，前述在分配記憶體空間時是往下長，但並不是所有資料儲存時都是往下儲存的 stack 型式，也有往上的 dynamic data ## Sec. 2.7 Communicating with people ### 1. Challenge 字元(character) 與字串 (string) 等資料與整數型資料不同，無法直接由 binary code 組成，因此這類資料在搬運時不能直接做載入與儲存 ### 2. Improvenment 分為兩種運算 bytes 與 halfword，可進行以下動作 * 資料的儲存(sign) : load bytes/halfword * `lb rt, offset(rs)` * `lh rt, offset(rs)` * sign extend to 32 bits in `rt` * 資料儲存(unsign) * `lbu rt, offset(rs)` * `lhu rt, offset(rs)` * zero extend to 32 bits in `rt` * 資料載入 : store bytes/halfword * `sb rt, offset(rs)` * `sh rt, offset(rs)` * sign extend to 32 bits in `rt` 如下圖所示，其中 F7 是 16 進位表示法，共 8-bits 也就是 1 byte ![S__2236450](https://hackmd.io/_uploads/r1dBkmrOC.jpg) #### 2.1 Example 以下列 C code 為例，進行字元陣列的複製，此處需要注意兩件事 1. C code 中 Array 的引用方式是 call by address，所以直接傳入 address 就好，傳入的 address 會作為 memory address base 放到 register 之中 2. 無法直接從 array y 的 address 搬移資料到 array x 的 address 之中，須先放到 register 做間接的複製 ```c void strcpy (char x[], char y[]){ int i = 0; while ((x[i] = y[i]) != '\0') i += 1; } ``` 假設 array x 與 array y 的 address base 分別存在 $a0, $a1 且變數 i 存在 $s0，則 Compile to MIPS code ```assembly= # 創建空間儲存變數 i addi $sp, $sp, -4 sw $s0, 0($sp) # initialize i add $s0, $zero, $zero # i = 0 # load y[i] L1: add $t1, $s0, $a1 # addr(y[i]): $t1 = $a1 + $s0 lbu $t2, 0($t1) # $t2 = y[i] # x[i] = y[i] add $t3, $s0, $a0 # addr(x[i]): $t3 = $a0 + $s0 sb $t3, 0($t3) # x[i] = y[i] # x[i] != '\0' beq $t2, $zero, L2 # if y[i] == 0, goto L2 addi $s0, $s0, 1 # i = i + 1 j L1 # loop # 刪除儲存空間 L2: lw $s0, 0($sp) # restore saved $s0 addo $sp, $sp, 4 jr $ra ``` ## Sec. 2.8 Addressing for 32-bit immediate and addresses ### 1. 32-bit constants 這個小節要解決的問題是，當要處理的數字或地址過大，無法用 16-bits 表示時有什麼解決辦法使用 load upper immediate 指令 (`lui rt, constant`)，他是一種 I-format 型式的指令，主要做以下兩個動作 : * 複製 16-bits 的常數到 $rt 的左邊 16-bits * 將 $rt 右邊 16-bits 設為 0 例如原始資料是 $$ \text{ori} : 0000 | 0000 | 0011 | 1101 | 0000 | 1001 | 0000 | 0000 $$ 前 16-bits 表示整數 $61$，後 16-bits 表示整數 $2304$ 經過 `lui $s0 61` 後可得 $$ \$\text{s0} : \textbf{0000} | \textbf{0000} | \textbf{0011} | \textbf{1101} | 0000 | 0000 | 0000 | 0000 $$ 再使用 `ori $s0, $s0, 2304` 跟原始資料做 `or` 運算可得 $$ \begin{align} &\$\text{s0} : &0000 | 0000 | 0011 | 1101 | 0000 | 0000 | 0000 | 0000 \\ &\text{ori} : &0000 | 0000 | 0011 | 1101 | 0000 | 1001 | 0000 | 0000\\ &\$\text{s0} : &0000 | 0000 | 0011 | 1101 | 0000 | 1001 | 0000 | 0000\\ \end{align} $$ 如此可將原始資料複製到 `$s0` ### 2. Branch addressing 前面在使用 branch 的相關指令時 (Ex: `beq`)，immediate 都是用 Label 表示跳轉位置，但實際上 Label 也是一個 memory address，此節要說的就是如何表示這個 branch 的 immediate #### 2.1 Syntax 語法上使用 I-format 的型式 (可參考 4.2 I-format instruction) | $\mathtt{opcode}$ | $\mathtt{rs}$ | $\mathtt{rt}$ | $\mathtt{immediate}$ | | :-: | :-: | :-: | :-: | | 6-bits | 5-bits | 5-bits | 16-bits | 此處的 $\mathtt{OP code}$ 指定為 $\mathtt{beq}$ 或 $\mathtt{bne}$ 的兩種。而 $\mathtt{immediate}$ 的用意類似於 $\mathtt{lw}$ 中的 offset，只不過是以 PC-relative addressing 作為當前位址並向上加/下減位移量，也就是說 PC-relative addressing 其實類似 $\mathtt{lw}$ 中的 base memory address。此外，因為 $\mathtt{immediate}$ 只有 16-bits 所以無法表示整個記憶體空間 #### 2.2 Improvement 將 $\mathtt{immediate}$ 指定為 word address。前面的 I-format 指令中(Ex: $\mathtt{lw}$) 的 $\mathtt{immediate}$ 都是使用以 bytes 為主的 bytes address，但 $\mathtt{beq}$ 與 $\mathtt{bne}$ 都是以指令的跳轉為主，所以是使用以 instruction 的數量為單位的 word address。此外，因為每個 word (or instruction) 的大小為 4-bytes，所以如果要轉成該指令所在的記憶體位址，需要在 $\mathtt{immediate}$ 後面補 00，相當於 $\times 4$ bytes。如此一來就可以以 PC 為中心，跳轉 $\pm 2^{15}$ 的 word (或 $\pm 2^{17}$ bytes) 因為 PC 預設會指向當前執行的下一個指令，所以實際上應該是從 PC+4 開始算起 : * If branch not taken : PC = Pc + 4 * If branch taken : PC = (pc + 4) + (immediate \* 4) :::info PC (program counter) 的單位是 byte address ::: ##### Example MIPS code : ```assembly= Loop: beq $9, $0, End # PC (目前執行指令) add $8, $8, $10 # PC + 4 (當前 PC 指向) addi $9, $9, -1 j Loop End: ``` 已知 $\mathtt{beq}$ 的 OPcode = 4，$\mathtt{rs}$ = \$9，$\mathtt{rt}$ = \$0，因為進行到 $\mathtt{beq}$ 指令時，當前的 PC + 4 指向 $\mathtt{add}$ 指令，再經過 3 個指令後可到 End 標籤，因此 `beq $9, $0, End` 指令中的 immediate (或 End 標籤) 應該 = 3 = 0000000000000011 ### 3. Jump addressing 理論上來說，$\mathtt{j}$ 指令 (i.e., jump) 可以跳到記憶體空間的任何地方，但實際上因為 OPcode 佔了 6-bits 的關係，而一個 word (或 instruction) 又只有 32-bits，所以只剩下 $32-6-28$ bits 能夠表示記憶體位址。又因為記憶體位址的長度也是 32-bits 所以在計算上也都是以 32-bits 為主，因此這 28-bits 必須補足剩下的 6-bits #### 3.1 Syntax | $\mathtt{opcode}$ | $\mathtt{target\ address}$ | | :-: | :-: | | 6-bits | 26-bits | :::warning Question - 如何補足剩下的 6-bits ? ::: #### 3.2 Improvement 1. 因為同樣是指令的跳轉，所以此處的 $\mathtt{target\ address}$ 同樣是以 word address 為主，又因為每個 word (或 instruction) 的大小為 4-bytes，因此直接在 $\mathtt{target\ address}$ 後面補上 00 表示 $\times 4$ bytes，如此就變成 26 + 2 = 28-bits 2. 至於剩下的 4-bits 可以直接複製 PC 的前面 4 個 bits 補在 $\mathtt{target\ address}$ 的前面。 * 原因在於，1 個 bit 可以將記憶體空間切成 2 塊，2 個 bits 可以切成 4 塊 * 而每個 program 中的指令通常都是在同一區塊中，因此 PC 的前 4-bits 通常是相同的 * 但前提是 program 必須要在相同的 block 中 (見下圖) 3. 如此可得到 28 + 4 = 32-bits ![S__2244611](https://hackmd.io/_uploads/B1X2-CIuR.jpg) 結果表示如下 : * New PC = PC[28:31] \|| $\mathtt{target\ address}$ \|| 00 * New PC = 4-bits \|| 26-bits \|| 2-bits 總結來說 * 短距離的指令跳轉 : 使用 relative PC address (Ex: $\mathtt{beq}$) * 中距離的指令跳轉 : 使用有限制範圍的 absolute address (Ex: $\mathtt{jal}$) * 長距離的指令跳轉 : 直接使用 $ra (return address)，因為 register 可以儲存 32-bits 的資料 ##### Example ![S__2244613](https://hackmd.io/_uploads/ByV8QCU_R.jpg) > Quesiton : > 1. 黃色螢光筆處的 Exit 標籤所對應的 $\mathtt{immediate}$ 為何是 2 > 2. 紅色螢光筆處的 Loop 標籤所對應的 $\mathtt{immediate}$ 為何是 20000 以上圖來說，當執行到 $\mathtt{bne}$ 時，PC + 4 指向 $\mathtt{addi}$ 指令，再過 2 個 instruction 就可以到 Exit，所以 Exit 標籤所代表的 immediate = 2 接著，因為 Loop 標籤所對應到的 address 為 $80000_{10}$ ，所以 Loop 標籤所對應到的 $\mathtt{immediate}$ 應該是 $80000 / 4$ bytes $= 20000$ ::: info 如果 $\mathtt{beq}$ 或 $\mathtt{bne}$ 指令所要跳轉的位址真的太遠，可以先做一個假的 instruction 到 $\mathtt{jump}$，再利用 $\mathtt{jump}$ 跳遠一點 ::: ### 4. MIPS instruction summary * R-format 1. Register addressing (Ex : $\mathtt{add}$) * I-format 2. Base addressing (Ex : $\mathtt{lw}$) 3. immediate addressing (Ex : $\mathtt{addi}$) 4. PC-relative addressing (Ex : $\mathtt{beq}$) * J-format 5. pseudodirect addressing (Ex : $\mathtt{jal}$) ![4506614C-7ED9-44EA-934A-3F0E8DA1C2E7](https://hackmd.io/_uploads/ryO0nRLd0.jpg) ![6F3374C3-554C-4BBC-A49D-B461A85D0AAC](https://hackmd.io/_uploads/B1k16AIdA.jpg) ## Sec. 2.9 補充: Translating and starting program Assembler pseudo instruction 的概念類似 programming language 中的 pseudo code。並不是真正的 instruction。當使用 pseudo instruction 的時候，只要創建一個 instruction 就必須有 Hardware 與之對應。例如以下的 pseudo instruction ```assembler= move $t0, $t1 blt $t0, $t1, L ``` 實際上並沒有 `move` 與 `blt` 這兩個 instruction，而真正所對應到的 assemblt code 如下 ```assembly= add $t0, $zero, $t1 slt $at, $t0, $t1 bne $at, $zero, L ``` `move` 對應到 `add` 而 `blt` 對應到 `slt` 和 `bne`。此外會發現它寫成真正的 assembly code 時，實際上會需要 3 個 register。多餘的 register 由 `$at` (或 `$1`) 提供，他是一個專門提供給 assembler 使用的 register，可參考上面的 convention register 圖片 ## Sec. 2.10 Sort example ### 1. C sort example (暫略) ### 2. Optimization 一個 complier 的最佳化與評估方式大致上可以分以下幾點 * Instruction count and CPI are not good performance indicators in isolation * Compiler optimizations are sensitive to the algorithm * Nothing cna fix a dumb algorithm (先寫出好的演算法再做最佳化) ## Sec. 2.11 Arrays v.s. pointers ### 1. Difference Array 與 pointer 在 assembly code 的實作上可以使用 index 或 pointer 兩種方式，兩者差異如下 * Array index * index 必須乘上元素大小 (Ex: int 大小為 4 bytes) * 上述大小需再加上 array base memory address * Pointer * 直接對應到 array 的記憶體位置 * 避免複雜的 index ### 2. Example: clearing and array 以下圖為例，左邊是以 array 的 index 做，右邊是直接以 array 的 address 做，兩者功能相同但做法不同。比較後可發現，右邊的 address 版本並沒有用到 invreament by i 且有沒有不斷地 $\times 4$ ![S__2244614](https://hackmd.io/_uploads/rkhX6DwOC.jpg) 因此，一個好的 compiler 必須做到: (1) strength reduced，也就是將 $\times 4$ 改為 shift。(2) induciton variable elimination，也就是將 increament by i 移到迴圈外 ## Sec. 2.12 ARM and X86 instruction sets ### 1. Instruction 目前主流的三種 instruction set 分別是用於手機與嵌入式系統的 ARM、用於個人 PC 的 X86 與 MIPS。雖然 Intel 的 X86 最早發展，但因為用途分佈最多，以及程式相容性的問題，所以目前還是存在。不過因為內容隨時代不斷擴展，所以比較複雜， ![S__2244617](https://hackmd.io/_uploads/SJUSAPD_C.jpg) ### 2. ARM v.s. MIPS 兩者功能與結構類似，且都是屬於 RISC isa (X86 屬於 SISC)，但 ARM 用途最廣。兩主比較如下 : ![S__2244616](https://hackmd.io/_uploads/BJ_81dPOA.jpg) 其中最大的差異在 compare 與 branch 這兩個方式。在 ARM 的 instruction 中，最前方有 4-bits 的 conditional code 來決定該 instruction 是否要執行，但在 MIPS 中做 condition 時需要使用兩個指令才能做條件判斷: (1) $\mathtt{slt}$: 比對結果 (2) $\mathtt{beq}$/$\mathtt{bne}$: 測試結果 + 跳轉 ### 3. Fallacies (常見錯誤迷思) * Powerful instruction $\rightarrow$ higher performance (X) * 越複雜 instruction 越難實作 * 指令通常越好越好 * Use assembly code for high performance (X) * assembly language 非常繁瑣且行數更多 * 越多的行數會導致高更多錯誤率以及更低的生產力 ### 4. Pitfalls (易錯觀念) instruction (或 word) 雖然是連續的，但不代表他在記憶體空間上的位址也是連續的，因為一個 word 的大小是 4 bytes，所以在跳轉指令或是使用記憶體位址表示時，必須 +4 bytes 而不是 +1 byte ### 5. Concluding remarks 總結 ISA 的設計原則 > Design principles - > 1. Simplicity favors regularity (Ex: 結構固定，OPcode 都是 6-bits) > 2. Smaller is faster (Ex: 資料大小是 32-bits 不是 64-bits) > 3. Make the common case fast (Ex: constant 常用，所以直接設一個 immediate format) > 4. Good design demands good compromises (Ex: 不同需求有不同設計) >