# Instructions: Language of the Computer ###### tags: `RISCV` # 目錄 [TOC] ## 2.1 Introduction **stored-program concept** * instructions and data are represtented in binary。 * Instructions and data stored in memory。 **Design Principle** **要背!** 1. Simplicity favors regularity 2. Smaller is faster 3. Good design demands good compromises ## 2.2 Operations of the Computer Hardware :::warning CPU存取register前需要先解碼,才能知道是哪個register,故register數量不能太多,以免造成CPU的clock cycle time 太長。 ::: ## 2.3 Operands of the Computer Hardware ### 32-bit RISC-V Registers ![](https://i.imgur.com/CEmKS3t.png) ### Register Operands A 32-bit RISC-V computer has 32 registers. 32-bit data is called a “word”. 64-bit data is called a “doubleword”. **alignment restriction** : In many architectures, words must start at addresses that are multiples of 4. **spilling registers** :::warning compiler 將常用的變數放在register,其餘的放到記憶體,透過lw和sw使資料在register 和 memory 之間做交換。 而將**不常用**的變數放到memory的動作稱為**spilling register**。 ::: ### Memory Operands Memory is **byte addressed**. **Load** : values from memory into registers. **Store** : result from register to memory. :::danger RISC-V and Intel x86 do not have alignment restrictions, but MIPS does. ::: **Little Endian** : LSB (Least-significant byte) 放置於最低 memory address. **Big Endian** : MSB 放置於最低 memory address. :::warning RISC-V is **Little Endian**. ::: ### 為何沒有 swu、shu、sbu? Ans : 因為 load 時會有**位元擴充問題**,需要**區分正負號**。 而 store 時,memory 是 **byte addressed**,**沒有**位元擴充問題,**不需要**區分正負號。 ### Memory Operands Example ```java /* 程式碼如下 */ g = h + A[8]; // g and h with the registers x20 and x21 A[12] = g; // Base address of A is in x22 /* 轉換成組語為如下 */ lw x9, 32(x22) // Transfer A[8] to a register x9 add x20, x21, x9 sw x20, 48(x22) ``` ### Immediate Operands RISC-V 對於 Arithmetic、Logical、Shift 指令,幾乎都有提供 Immediate 版本,目的為 make the common case fast。 ```java /* 程式碼如下 */ g = g + 4; // g with the register x20 /* 轉換成組語為如下 */ addi x20, x20, 4 // x22 = x22 + 4 ``` ## 2.4 Signed and Unsigned Numbers **two’s complement number** 範圍是 $(-2^{n-1})\ to \ (2^{n-1}-1)$,n為bits數 **Least Significant Bit (LSB)** : refers to the rightmost bit (bit 0) **Most Significant Bit (MSB)** : refers to the leftmost bit (bit 31) 對於 **Unsigned numbers**,第 31 bit 為**正**的 ![](https://i.imgur.com/hQ4PGiW.png) 對於 **Signed numbers**,第 31 bit 為**負**的 ![](https://i.imgur.com/naht70h.png) 2種**快速轉換** two’s complement number 正負號方法。 1. 取補數後再加 1。 2. 由右往左掃,遇到第一個 1 則不動,往左的其他 bit 取補數。 ## 2.5 Representing Instructions in the Computer ![](https://i.imgur.com/ukgrTjY.png) Design Principle 3: **Good design demands good compromises**. ![](https://i.imgur.com/uSIm4OC.png) ![](https://i.imgur.com/oC5p7O9.png) ![](https://i.imgur.com/riNSX8S.png) :::warning In S-format, 分割 immediate field 成 5 bits and 7 bits, 使 rs1 和 rs2 在所有指令格式中都有相同位置, 以降低 hardware complexity. ::: ![](https://i.imgur.com/06xRpfH.png) ## 2.6 Logical Operations ![](https://i.imgur.com/rOsASqw.png) :::warning 沒有 sla,因為 sla 作用與 sll 相同。 ::: ## 2.7 Instructions for Making Decisions ![](https://i.imgur.com/xaqD1ad.png) ```java /* 程式碼如下 */ if (i == j) f = g + h; else f = g - h; // f, g, h, i, j are in x19, x20, x21, x22, x23 /* 轉換成組語為如下 */ bne x22, x23, Else // go to Else if i ≠ j add x19, x20, x21 // f = g + h beq x0, x0, Exit // 一定跳到Exit Else: sub x19, x20, x21 // f = g - h Exit: ``` ```java /* 程式碼如下 */ while (save[i] == k) i += 1; // i, k are in x22, x24, base of the array save is in x25 /* 轉換成組語為如下 */ Loop: slli x10, x22, 2 // 計算位移量 add x10, x10, x25 // x10 = address of save[i] lw x9, 0(x10) // Temp reg x9 = save[i] bne x9, x24, Exit // go to Exit if save[i] ≠ k addi x22, x22, 1 // i = i + 1 beq x0, x0, Loop // go to Loop Exit: ``` ## 2.8 Supporting Procedures in Computer Hardware ```java /* 程式碼如下 */ int leaf_example (int g, int h,int i, int j) { int f; f = (g + h) - (i + j); return f; } // g, h, i, j are in x10, x11, x12, x13 // f in x20, Temporary variables in x5, x6, need to save them. /* 轉換成組語為如下 */ leaf_example: addi sp,sp,-12 // 空出空間,存caller的register sw x5,8(sp) // 存儲caller的register sw x6,4(sp) // 存儲caller的register sw x20,0(sp) // 存儲caller的register /* ---------------- */ add x5,x10,x11 add x6,x12,x13 sub x20,x5,x6 addi x10,x20,0 // 將結果保存在x10 /* ---------------- */ lw x20,0(sp) // 還原caller的register lw x6,4(sp) // 還原caller的register lw x5,8(sp) // 還原caller的register addi sp, sp,12 // 恢復空間 /* ---------------- */ jalr x0,0(x1) // return to the caller ``` ```java /* 程式碼如下 */ int fact (int n) { if (n < 1) return 1; else return n * fact(n - 1); } // Argument n in x10 // Result in x10 /* 轉換成組語為如下 */ fact: addi sp, sp, -8 // Adjust stack for 2 items sw x1, 4(sp) // Save the return address sw x10, 0(sp) // Save the argument n /* ---------------- */ addi x5, x10, -1 // x5 = n - 1 bge x5, x0, L1 // if n >= 1, go to L1 /* ---------------- */ addi x10, x0, 1 // return 1 addi sp, sp, 8 // pop 2 items off stack jalr x0, 0(x1) // return to caller /* ---------------- */ L1: addi x10, x10, -1 // n >= 1: argument gets (n − 1) jal x1, fact // call fact with (n − 1) /* ---------------- */ addi x6, x10, 0 // return from jal: move result of fact(n - 1) to x6: /* ---------------- */ lw x10, 0(sp) // restore argument n lw x1, 4(sp) // restore the return address addi sp, sp, 8 // adjust stack pointer to pop 2 items /* ---------------- */ mul x10, x10, x6 // return n * fact (n − 1) jalr x0, 0(x1) // return to the caller ``` ![](https://i.imgur.com/AwlGUNY.png) ### global pointer C語言的變數分為兩種: 1. *automatic* : declared in a procedure 2. *static* : declared outside all procedures are considered static To simplify **access to static data**, RISC-V compilers reserve a register **x3** for use as the **global pointer**, or **gp**. ### Memory Layout ![](https://i.imgur.com/nLfriVi.png) ## 2.9 Communicating with People 多數網頁使用 Unicode,而不是 ASCII。 ![](https://i.imgur.com/gmHWPr6.png) C語言 用 8 bits 來表示一個 character (ASCII)。 JAVA 用 16 bits 來表示一個 character (Unicode)。 ```java /* 複製字元陣列如下 */ void strcpy (char x[], char y[]) { size_t i; i = 0; while ((x[i]=y[i])!='\0') i += 1; } //arrays x, y are found in x10, x11 /* 轉換成組語為如下 */ strcpy: addi sp,sp,-4 // adjust stack for 1 doubleword sd x19,0(sp) // push x19 add x19,x0,x0 // i=0+0 L1: add x5,x19,x11 // x5 = addr of y[i] lbu x6,0(x5) // x6 = y[i] add x7,x19,x10 // x7 = addr of x[i] sb x6,0(x7) // x[i] = y[i] beq x6,x0,L2 // if y[i] == 0 then exit addi x19,x19, 1 // i = i + 1 (advance to next byte) jal x0,L1 // next iteration of loop L2: ld x19,0(sp) // restore saved x19 addi sp,sp,4 // pop 1 doubleword from stack jalr x0,0(x1) // and return ``` ## 2.10 RISC-V Addressing for Wide Immediates and Addresses ### Loading a 32-Bit Constant 指令: lui rd, constant 用途: 把 20-bits constant 放到 rd 的 左半部,並把右半部的 12 bits 清成 0。 ![](https://i.imgur.com/cwLwgGh.png) --- ### Branch Addressing 指令: bne rs1, rs2, constant 例如: bne x10, x11, 2000 // 2000ten = 0111 1101 0000 範圍: $-2^{11}*2=-4096$ to $(2^{11}-1)*2=4094$ from currenct PC,因為imm目標位置是以half-word為單位。 ![](https://i.imgur.com/bk85VCQ.png) ### Jump Addressing 指令: jal rd, constant 例如: jal x0, 2000 // go to location x0 +2000ten 範圍: $-2^{19}*2=-1MiB$ to $(2^{19}-1)*2=1MiB$ from currenct PC ![](https://i.imgur.com/dTLujex.png) :::info Like most recent computers, RISC-V uses **PC-relative addressing** for both **conditional** branches and **unconditional** jumps, because the destination of these instructions is likely to be close to the branch ::: :::info **For long jumps** (e.g., to 32-bit absolute address), RISC-V allows very long jumps to any 32- bit address with a two-instruction sequence: **lui** (load address[31:12] to temp register), and **jalr** (add the lower address[11:0] and jump to target) ::: :::warning However, the RISC-V architects wanted to support the possibility of instructions that are only **2 bytes (halfwords)** long, so the branch instructions represent the number of halfwords between the branch and the branch target. ::: ### 轉換成 machine code ![](https://i.imgur.com/6xVdtHj.png) ![](https://i.imgur.com/pt63bKH.png) 因為一個指令大小為 4 bytes,且記憶體為 byte-address。 bne 的目的位置為 Exit (12 + 80012),故 imme 欄位為 12。 beq 的目的位置為 Loop (-20 + 80020),故 imme 欄位為 -20。 ### Addressing Mode ![](https://i.imgur.com/Re8BOYU.png) 1. **Immediate addressing** where the operand is a constant within theinstruction itself. 3. **Register addressing** where the operand is a register. 4. **Base or displacement addressing** where the operand is at the memory location whose address is the sum of a register and a constant in the instruction. 5. **PC-relative addressing** where the branch address is the sum of the PC and a constant in the instruction. ![](https://i.imgur.com/Zx0CT7A.png) ![](https://i.imgur.com/j9QaxUP.png) ## 2.11 Parallelism and Instructions: Synchronization **Data race** : Result depends on the order of executions and accesses. ### the special pair of instructions **l.w (load-reserved word)** * lr.w rd,(rs1) * Load from address in rs1 to rd **sc.w (store-conditional word)** * sc.w rd,(rs1),rs2 * Store from rs2 to address in rs1 * Succeeds if location not changed since the lr.d, and returns 0 in rd * Fails if location is changed returns non-zero value in rd ```clike /* Example 1: Impl. atomic swap (to test/set lock variable) */ again: lr.w x10,(x20) sc.w x11,(x20),x23 // x11 = status bne x11,x0,again // branch if store failed addi x23,x10,0 // X23 = loaded value /* Example 2: Impl. lock acquire */ addi x12,x0,1 // copy locked value again: lr.w x10,(x20) // read lock bne x10,x0,again // check if it is 0 yet sc.w x11,(x20),x12 // attempt to store bne x11,x0,again // branch if fails Unlock: sw x0,0(x20) // free lock ``` ## 2.12 Translating and Starting a Program ![](https://i.imgur.com/cJIyaIK.png) | 類型 | UNIX | MS-DOS | | ---------------------------- | ---- | ------ | | C source files | .c | .C | | assembly files | .s | .ASM | | object files | .o | .OBJ | | statically library routines | .a | .LIB | | dynamically library routines | .so | .DLL | | executable files | .out | .EXE | ### Compiler transforms the C program into an **assembly language program**. ### Assembler 1. converts the assembly language instruction into the **machine code**, called **object file**. 2. Handle **pseudoinstructions** that are not implemented by the **target hardware**(ex. RISC-V machine language) but found in assembly language instructions(ex.RISC-V assembler). 3. An **object file** contains the **six parts**: ➢ Header: described contents of object module ➢ Text segment: translated instructions ➢ Static data segment: data allocated for the life of the program ➢ Relocation info: for contents that depend on absolute location of loaded program ➢ Symbol table: global definitions and external references ➢ Debug info: for associating with source code ### Linker 1. The linker uses the **relocation information** and **symbol table** in each object module to resolve all undefined labels. 2. It takes all the **independently** assembled machine language programs and **“stitches”** them together. 3. The linker produces an **executable file** that can be run on a computer, containing no unresolved references. ### Loader the operating system reads **executable file** to **memory** and starts it. The loader follows **6 steps** in **UNIX** systems 1. Reads the executable file header to determine size of the text and data segments. 2. Creates an address space large enough for the text and data. 3. Copies the instructions and data from the executable file into memory. 4. Copies the parameters (if any) to the main program onto the stack. 5. Initializes the processor registers and sets the stack pointer to the first free location. 6. Branches to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program. When the main routine returns, the start-up routine terminates the program with an exit system call. ### Static vs. Dynamically Linking Two ways to link to **external object files:** **1. Static linking** 1. Library routines that are linked to a program **during linking**. 2. 缺點一 : 若 library 有更新,則整個程式需要重新 linking,否則只能使用 old library。 3. 缺點二 : 不見得所有 library 都會使用到,可能使 library 比程式本身還要占空間。 **2. Dynamic linking:** 1. Library routines that are linked to a program **during execution**. 2. Good for **new version** of the library and supports **new hardware** devices. ### Java Virtual Machine (JVM) ![](https://i.imgur.com/NQ7D2B9.png) JVM的目的 : **Machine independence** | 方法 | 速度 | | ---- | ---- | | JVM | 慢 | | JIT | 快 | ## 2.13 A C Sort Example to Put it All Together 講義沒有提到,看原文書 ## 2.14 Arrays versus Pointers 看講義 ## (X) 2.15 Advanced Material: Compiling C and Interpreting Java ## (X) 2.16 Real Stuff: MIPS Instructions ## (X) 2.17 Real Stuff: ARMv7 (32-bit) Instructions ## (X) 2.18 Real Stuff: ARMv8 (64-bit) Instructions ## (X) 2.19 Real Stuff: x86 Instructions ## 2.20 Real Stuff: The Rest of the RISC-V Instruction Set 看講義 ## (X) 2.21 Going Faster: Matrix Multiply in C ## 2.22 Fallacies and Pitfalls **Fallacy**: More powerful instructions --> higher performance * Compilers are good at making fast code from simple instructions * 現行計算機大多採用RISC架構,而非CISC架構。 **Fallacy**: Use assembly code --> high performance * But modern compilers are better at dealing with modern processors * More lines of code --> more errors and less productivity **Pitfall**: Sequential words are not at sequential addresses * 一次增加 **4** (for 32-bit architecture),而不是 1。 **Pitfall**: Using a pointer to an automatic variable outside its defining procedure * 不要把 passing pointer 當作回傳值。 ## 2.23 Concluding Remarks • Design principles 1. Simplicity favors regularity 2. Smaller is faster 3. Good design demands good compromises • Make the common case fast • Layers of software/hardware * Compiler, assembler, hardware • RISC-V: typical of RISC ISAs * c.f. x86 ## (X) 2.24 Historical Perspective and Further Reading ## (X) 2.25 Self-Study ## (X) 2.26 Exercises