# Instruction Format

Instruction formats refer to the way instructions are encoded and represented in machine language. An instruction is made up of:

- Opcode (operation code)
- Operands
  - source operand
  - destination operand

There are several types of instruction formats:

- Three-address and two-address: general-purpose register machine
- One-address: accumulator-based machine
- Zero-address: stack machine

A computer instruction consists of groups of bits called fields. Each field carries a different piece of information, encoded in binary, that the CPU uses to decide what to perform.

![image](https://hackmd.io/_uploads/H1l2dSgIyg.png)

![image](https://hackmd.io/_uploads/rkw0dHlUJx.png)

To summarize, the instruction format has the following components:

| Component | Size (bytes) | Function |
| - | - | - |
| Prefix | 0-4 | Affects the operation of the instruction |
| Opcode | 1-2 | Specifies the operation to perform |
| Addressing Mode | 1 | Supports register or memory operands |
| SIB (optional) | 1 | Present only if the instruction uses scaled indexed addressing |
| Displacement | 0/1/2/4 | Specifies the memory address displacement, if any |
| Immediate | 0/1/2/4 | Holds the immediate operand, if any |

![image](https://hackmd.io/_uploads/rJr2crxUJg.png)

## Opcode

We start with the opcode. As an example, the `ADD` opcode consists of six `0` bits followed by the `d` and `s` bits:

```
000000ds
```

- `d`: direction of the data transfer
  - 0: register to memory (the REG field is the source, R/M is the destination)
  - 1: memory to register (the REG field is the destination, R/M is the source)
- `s`: size of operands (registers and memory locations)
  - 0: adding 8-bit operands
  - 1: adding 16-bit / 32-bit operands

### Addressing Mode (`MOD`, `REG`, `R/M`)

![image](https://hackmd.io/_uploads/HyAVdUeIye.png)

![image](https://hackmd.io/_uploads/HyDEuIl8yg.png)

![image](https://hackmd.io/_uploads/H1DBOLgLkg.png)

### Scale Index Base (SIB)

SIB stands for scale, index, and base, the three fields of the optional byte that follows the MOD-REG-R/M byte.
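
As a rough illustration, a SIB byte can be split into its three fields with shifts and masks. This sketch assumes the usual layout in which scale occupies the top two bits, index the middle three, and base the low three; `decode_sib` is a hypothetical helper name chosen for this example, not part of any library.

```python
def decode_sib(byte):
    """Split a SIB byte into its (scale, index, base) bit fields."""
    scale = (byte >> 6) & 0b11   # top 2 bits
    index = (byte >> 3) & 0b111  # middle 3 bits
    base = byte & 0b111          # low 3 bits
    return scale, index, base


# Example byte chosen for illustration: scale = 10, index = 111, base = 011.
scale, index, base = decode_sib(0b10111011)

# The 2-bit scale field encodes a multiplier of 1, 2, 4, or 8.
factor = 1 << scale

print(f"scale={scale:02b} (x{factor}) index={index:03b} base={base:03b}")
```

The multiplier is computed as a power of two because the two-bit scale field only ever selects 1, 2, 4, or 8.
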
![image](https://hackmd.io/_uploads/HyJ3DAlUkx.png)

The scale value is used when scaled addressing is applied:

| Scale Value | Index * Scale Value |
| - | - |
| 00 | Index * 1 |
| 01 | Index * 2 |
| 10 | Index * 4 |
| 11 | Index * 8 |

The index field indicates which register is used as the index:

| Index | Register |
| - | - |
| 000 | EAX |
| 001 | ECX |
| 010 | EDX |
| 011 | EBX |
| 100 | Illegal |
| 101 | EBP |
| 110 | ESI |
| 111 | EDI |

The base field selects the base register:

| Base | Register |
| - | - |
| 000 | EAX |
| 001 | ECX |
| 010 | EDX |
| 011 | EBX |
| 100 | ESP |
| 101 | Displacement only if MOD = 00; EBP if MOD = 01 or 10 |
| 110 | ESI |
| 111 | EDI |

### Examples

`ADD CL, AL`

- `ADD`: 000000
- d: 0 (register to memory)
- s: 0 (8-bit)
- MOD: 11 (Register addressing)
- REG: 000 (Source field is AL)
- R/M: 001 (Destination field is CL)

This instruction has the machine code: `00000000 11000001`

![image](https://hackmd.io/_uploads/HyJ-q0gIyl.png)

`ADD ECX, EAX`

- `ADD`: 000000
- d: 0 (register to memory)
- s: 1 (32-bit)
- MOD: 11 (Register addressing)
- REG: 000 (Source field is EAX)
- R/M: 001 (Destination field is ECX)

This instruction has the machine code: `00000001 11000001`

![image](https://hackmd.io/_uploads/By805RlIkx.png)

`ADD EDX, DISP`

- `ADD`: 000000
- d: 1 (memory to register)
- s: 1 (32-bit)
- MOD: 00 (Displacement-only addressing when R/M = 101)
- **REG: 010 (Destination field is EDX)**
- R/M: 101 (32-bit displacement addressing)

This instruction has the machine code: `00000011 00010101` followed by the 32-bit displacement

![image](https://hackmd.io/_uploads/Bypp1-ZU1l.png)

`ADD EDI, [EBX]`

- `ADD`: 000000
- d: 1 (memory to register)
- s: 1 (32-bit)
- MOD: 00 (Register indirect addressing, no displacement)
- REG: 111 (Destination field is EDI)
- R/M: 011 (Source field is EBX)

This instruction has the machine code: `00000011 00111011`

![image](https://hackmd.io/_uploads/SJuT1W-Uke.png)

`ADD EAX, [ESI + disp8]`

- `ADD`: 000000
- d: 1 (memory to register)
- s: 1 (32-bit)
- MOD: 01 (one-byte displacement for `disp8`)
- REG: 000 (Destination field is
EAX)
- R/M: 110 (Source field is ESI)
- Disp8 (follows the MOD-REG-R/M byte)

This instruction has the machine code: `00000011 01000110` followed by disp8

![image](https://hackmd.io/_uploads/rJS3gZZUkl.png)

`ADD EBX, [EBP + disp32]`

- `ADD`: 000000
- d: 1 (memory to register)
- s: 1 (32-bit)
- MOD: 10 (four-byte displacement for `disp32`)
- REG: 011 (Destination field is EBX)
- R/M: 101 (Source field is EBP)
- Disp32 (follows the MOD-REG-R/M byte)

This instruction has the machine code: `00000011 10011101` followed by disp32

![image](https://hackmd.io/_uploads/B1iNZZ-81e.png)

`ADD EBP, [disp32 + EAX*1]`

- `ADD`: 000000
- d: 1 (memory to register)
- s: 1 (32-bit)
- MOD: 00 (SIB)
- REG: 101 (Destination field is EBP)
- R/M: 100 (SIB)
- **SIB**
  - **S: 00 (Index * 1)**
  - **I: 000 (EAX)**
  - **B: 101 (Displacement, because MOD = 00)**
- disp32 (follows the SIB byte)

This instruction has the machine code: `00000011 00101100 00000101` followed by disp32

![image](https://hackmd.io/_uploads/BkkdMb-Iyx.png)

`ADD ECX, [EBX + EDI*4]`

- `ADD`: 000000
- d: 1 (memory to register)
- s: 1 (32-bit)
- MOD: 00 (SIB)
- REG: 001 (Destination field is ECX)
- R/M: 100 (SIB)
- SIB
  - S: 10 (Index * 4)
  - I: 111 (EDI)
  - B: 011 (EBX)

Because MOD is 00 and the base is EBX, no displacement follows the SIB byte. This instruction has the machine code: `00000011 00001100 10111011`

![image](https://hackmd.io/_uploads/HJY8QZbU1l.png)

`ADD` **Immediate Instruction**

![image](https://hackmd.io/_uploads/HkU57WbIke.png)
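
The register and memory examples above can be reproduced with a small encoder. This is a minimal sketch rather than a complete assembler: the helper names (`add_opcode`, `modrm`, `sib`, `bits`) and the `REG32` table are hypothetical, introduced only for illustration. It covers just the `000000ds` form of `ADD`, and it assumes the 2-3-3 bit layout for both the MOD-REG-R/M and SIB bytes.

```python
# 3-bit register numbers for the 32-bit registers used in the examples.
REG32 = {"EAX": 0b000, "ECX": 0b001, "EDX": 0b010, "EBX": 0b011,
         "EBP": 0b101, "ESI": 0b110, "EDI": 0b111}


def add_opcode(d, s):
    """Build the ADD opcode byte 000000ds from the d and s bits."""
    return (d << 1) | s


def modrm(mod, reg, rm):
    """Pack MOD (2 bits), REG (3 bits), and R/M (3 bits) into one byte."""
    return (mod << 6) | (reg << 3) | rm


def sib(scale, index, base):
    """Pack scale (2 bits), index (3 bits), and base (3 bits) into one byte."""
    return (scale << 6) | (index << 3) | base


def bits(*byte_values):
    """Format bytes as space-separated 8-bit binary strings."""
    return " ".join(f"{b:08b}" for b in byte_values)


# ADD CL, AL: d = 0, s = 0, MOD = 11, REG = AL (000), R/M = CL (001)
add_cl_al = bits(add_opcode(0, 0), modrm(0b11, 0b000, 0b001))

# ADD EDI, [EBX]: d = 1, s = 1, MOD = 00, REG = EDI, R/M = EBX
add_edi_ebx = bits(add_opcode(1, 1),
                   modrm(0b00, REG32["EDI"], REG32["EBX"]))

# ADD ECX, [EBX + EDI*4]: R/M = 100 selects the SIB byte
add_ecx_sib = bits(add_opcode(1, 1),
                   modrm(0b00, REG32["ECX"], 0b100),
                   sib(0b10, REG32["EDI"], REG32["EBX"]))

print(add_cl_al)    # 00000000 11000001
print(add_edi_ebx)  # 00000011 00111011
print(add_ecx_sib)  # 00000011 00001100 10111011
```

Keeping each byte as its own function mirrors the field tables, so a wrong field value shows up in a single byte of the output.
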