# Assembly Language and System Programming

[Toc]

## Chapter 1 : Basic Concept

### Comparing ASM to High-Level Languages

| Type of Application | High-level Language | Assembly Language |
| -------- | -------- | -------- |
| Business Application Software for a Single Platform | Formal structure makes the code easy to organize and maintain | Less formal structure, so the code is harder to organize and maintain |
| Hardware Device Driver | Hardware access is awkward, or not provided at all | Hardware access is simple and straightforward |
| Business Application Software for Multiple Platforms | Portable; can usually be recompiled for the target system with few changes | Must be recoded for each platform, often using an assembler with a different syntax |
| Embedded Systems and Computer Games Requiring Direct Hardware Access | Produces too much executable code and may not be efficient | Executable code is small and runs quickly |

### Example of Assembly Language

![](https://i.imgur.com/wDpTym5.png)

### Specific Machine Levels

- Level 4 : High-level Language
- Level 3 : Assembly Language
- Level 2 : Instruction Set Architecture (ISA)
- Level 1 : Digital Logic

### Data Representation

- Some terms
    - MSB : most significant bit (leftmost)
    - LSB : least significant bit (rightmost)
- Integer Storage Sizes

    | Storage Type | Range (Low to High) | Powers of 2 |
    | -------- | -------- | -------- |
    | Unsigned Byte | $0$ to $255$ | $0$ to $2^8-1$ |
    | Unsigned Word | $0$ to $65535$ | $0$ to $2^{16}-1$ |
    | Signed Byte | $-128$ to $+127$ | $-2^7$ to $2^7-1$ |
    | Signed Word | $-32768$ to $+32767$ | $-2^{15}$ to $2^{15}-1$ |
- Forming the Two's Complement

    | Step | Binary | Decimal |
    | -------- | -------- | -------- |
    | Starting Value | 0000 0010 | $2$ |
    | Stage 1 : Invert every bit | 1111 1101 | $-3$ |
    | Stage 2 : Add 1 to the value from Stage 1 | 1111 1101<br>+0000 0001 | $-3+1$ |
    | Final : Two's complement representation | 1111 1110 | $-2$ |

## Chapter 2 : x86 Processor Architecture

### Basic Microcomputer Design

- **clock** synchronizes CPU operations
- control unit (CU) **coordinates
sequence of execution steps**
- ALU performs **arithmetic and bitwise processing**

![](https://i.imgur.com/GtvN57u.png)

### Instruction Execution Cycle

- Fetch: get the next instruction from memory (through the cache) into the CPU
- Decode: work out what the instruction does and which operands it needs
- Fetch operands
- Execute
- Store output

![](https://i.imgur.com/38Gqk1O.png)

### IA-32 Processor Architecture

- Modes of operation
    - Real-address mode
        - Single task
        - A program can access any location in memory
          ![](https://i.imgur.com/evZC3Cn.png)
    - Protected mode
        - Multitasking
        - A program can only access the memory partition assigned to it
          ![](https://i.imgur.com/INwMHDo.png)
- Registers
    - Named storage locations inside the CPU, optimized for speed
      ![](https://i.imgur.com/c3w8m9w.png)
    - A register has a different name for each size at which it is accessed
        - *example*
          ![](https://i.imgur.com/oR5D1xp.png)
        - *registers whose lower half has only a 16-bit name*

            | 32-bit | 16-bit |
            | -------- | -------- |
            | ESI | SI |
            | EDI | DI |
            | EBP | BP |
            | ESP | SP |
    - Types of registers
        - General-Purpose
            - EAX – accumulator
            - ECX – loop counter
            - ESP – stack pointer
            - ESI, EDI – index registers
            - EBP – extended frame pointer (stack)
        - Segment
            - CS – code segment
            - DS – data segment
            - SS – stack segment
            - ES, FS, GS – additional segments
        - Others
            - EIP – instruction pointer
            - EFLAGS – status and control flags
                - each flag is a **single binary bit**

## Chapter 3 : Assembly Language Fundamentals

### Example of Code

```assembly
TITLE Add and Subtract, Version 2 (AddSub2r.asm)

INCLUDE Irvine32.inc

.data
val1     DWORD 10000h
val2     DWORD 40000h
val3     DWORD 20000h
myStr    BYTE  "HELLO!"
finalVal DWORD ?
.code
main PROC
L1:
    mov eax, val1       ;get first value
    add eax, val2       ;add second value
    sub eax, val3       ;subtract third value
    mov finalVal, eax   ;store the result
    jmp L1              ;jump back to code label L1 (repeats forever; shown only to illustrate a label)
main ENDP
END main
```

### Instruction

- Label
    - marks the address (offset) of code and data
    - types
        - data label: must be unique
        - code label: target of jump and loop instructions
- Integer Constants
    - optional leading + or - sign
    - digits may be binary, decimal, hexadecimal, or octal
    - common radix characters:
        - **h** – hexadecimal (a constant that starts with a letter needs a leading 0, e.g. ```0A5h```)
        - **d** – decimal
        - **b** – binary
        - **r** – encoded real
- Character and String Constants
    - *example*: 'a', "B", 'minaseinori', "MDFK"
    - embedded quotes: 'suichan is also "cute" today'
- Reserved Words and Identifiers
    - Reserved words cannot be used as identifiers
    - Identifiers
        - 1-247 characters, including digits
        - not case sensitive
        - first character must be a letter, `_`, `@`, `?`, or `$` (never a digit)
- Suggested Coding Standards
    - capitalize only directives and operators
    - descriptive identifier names
    - spaces surrounding arithmetic operators
    - blank lines between procedures
    - Indentation and spacing
        - code and data labels – no indentation
        - executable instructions – indent 4-5 spaces
        - comments: right side of page, aligned vertically
        - 1-3 spaces between instruction and its operands
        - 1-2 blank lines between procedures
- Data Definition
    - *example*

        | Name | Directive | Initializer |
        | -------- | -------- | -------- |
        | val1 | BYTE | 10 |
    - **DUP operator**
        - Use DUP to allocate (create space for) an array or string.
        - Syntax: ```counter DUP (argument)```, e.g. ```BYTE 20 DUP(0)``` reserves 20 bytes, all zero
    - **Little Endian Order**
        - the least significant byte is stored at the lowest address
        - *example*: ```val1 DWORD 12345678h```

            | Offset | Value |
            | -------- | -------- |
            | 0000 | 78h |
            | 0001 | 56h |
            | 0002 | 34h |
            | 0003 | 12h |

## Chapter 4 : Data Transfers, Addressing, and Arithmetic

### moving data

- **MOV**
    - source and destination must be the same size
    - CS, EIP, and IP cannot be the destination; an immediate value cannot be a destination and cannot be moved directly into a segment register
    - *example*: ```mov a, eax``` (**a** must be 32 bits)
- **MOVZX** copies a smaller source into a larger destination and fills the upper bits with zeros
  ![](https://i.imgur.com/7TaBtY9.png)
- **MOVSX** copies a smaller source into a larger destination and fills the upper bits with copies of the sign bit
  ![](https://i.imgur.com/1lWphcE.png)

### direct access

- *example*
    ```assembly
    .data
    arrayB BYTE 10h, 20h, 30h, 40h
    .code
    mov al, arrayB+1    ; AL = 20h
    mov al, [arrayB+1]  ; alternative notation
    ```

### computing

- these instructions **modify only the destination operand**
- **INC** & **DEC**
    - $destination\pm 1$
    - *example*: ```inc eax```
- **ADD** & **SUB**
    - $destination\pm source$
    - *example*: ```sub eax, 5h```
- **NEG**
    - $-destination$
    - *example*: ```neg eax```

### flags

- each flag is **1 bit**
- **set: 1, clear: 0**
- **ZF** (zero flag)
    - set when the **result of an operation is zero** in the **destination** operand
    - *example*
        ```assembly
        mov cx, 1
        sub cx, 1           ; CX = 0, ZF = 1
        mov al, 0FFh
        inc al              ; AL = 0, ZF = 1
        inc al              ; AL = 1, ZF = 0
        ```
- **SF** (sign flag)
    - set when the destination operand is **negative**
    - *example*
        ```assembly
        mov al, 0
        sub al, 1           ;SF = 1
        add al, 2           ;SF = 0
        ```
- **OF** (overflow flag)
    - set when the **signed result** of an operation is **invalid or out of range**
    - *example*
        ```assembly
        ;Example 1
        mov al, +127
        add al, 1           ;OF = 1, AL = 80h (-128)
        ;Example 2 (identical at the binary level, because 7Fh = +127)
        mov al, 7Fh
        add al, 1           ;OF = 1, AL = 80h
        ```
    - hardware viewpoint
      ![](https://i.imgur.com/74T92q8.png)
- **CF** (carry flag)
    - set when the **unsigned result** of an operation is **out of range** (too large for the destination, or below zero after a subtraction)
    - *example*
        ```assembly
        mov al, 0FFh
        add al, 1           ;CF = 1, AL = 00
        mov al, 0
        sub al, 1           ;CF = 1, AL = FF
        ```

### Data-related operator

- **OFFSET**
    - returns the starting address of a variable, measured in bytes from the beginning of its segment
    - the offset is 32 bits in protected mode and 16 bits in real-address mode
    - *example*: ```mov esi, OFFSET bVal```
- **PTR**
    - Overrides the default type of a label (variable)
    - Provides the flexibility to access part of a variable
    - *example*:
        ```assembly
        .data
        myDouble DWORD 12345678h            ;stored as 78h 56h 34h 12h (little endian)
        .code
        mov al, BYTE PTR myDouble           ;AL = 78h
        mov al, BYTE PTR [myDouble+1]       ;AL = 56h
        mov al, BYTE PTR [myDouble+2]       ;AL = 34h
        mov ax, WORD PTR myDouble           ;AX = 5678h
        mov ax, WORD PTR [myDouble+2]       ;AX = 1234h
        ```
- **TYPE**
    - returns the size, in bytes, of a single element of a data declaration
    - *example*:
        ```assembly
        .data
        var1 BYTE ?
        var2 WORD ?
        var3 DWORD ?
        var4 QWORD ?
        .code
        mov eax, TYPE var1  ; 1
        mov eax, TYPE var2  ; 2
        mov eax, TYPE var3  ; 4
        mov eax, TYPE var4  ; 8
        ```
- **LENGTHOF**
    - returns the number of elements in a single data declaration
    - *example*:
        ```assembly
        .data
        array1 WORD 30 DUP(?), 0, 0     ; 32 elements
        .code
        mov ecx, LENGTHOF array1        ; 32
        ```
- **SIZEOF**
    - returns a value that is equivalent to multiplying LENGTHOF by TYPE
    - *example*:
        ```assembly
        .data
        array1 WORD 30 DUP(?), 0, 0     ;64 bytes
        .code
        mov ecx, SIZEOF array1          ;64 bytes
        ```

### JUMP & LOOP

- **LOOP**
    - uses **ECX** as the loop counter: each execution decrements ECX and jumps to the label while ECX is not zero
    - *example*: ```loop L1```
- **JMP**
    - unconditional jump to a label that is usually within the same procedure
    - *example*: ```jmp L1```

### Indirect access

- like a pointer in C/C++
- *example*
    ```assembly
    .data
    val1 BYTE 10h, 20h, 30h
    .code
    mov esi, OFFSET val1
    mov al, [esi]       ;dereference ESI (AL = 10h)
    inc esi
    mov al, [esi]       ;AL = 20h
    inc esi
    mov al, [esi]       ;AL = 30h
    ```

## Chapter 5 : Procedures

### Stack

- *example*
  ![](https://i.imgur.com/k97eT8M.png)
- accessed with **push** & **pop**

### Procedure

- 
  a procedure is a named block of code that can be called from elsewhere in the program, much like a function
- *example*
    ```assembly
    main PROC
        call MySub          ; offset 00000020: pushes return address 00000025, then jumps to MySub
        mov eax, ebx        ; offset 00000025: execution resumes here after ret
        ; ...
    main ENDP

    MySub PROC
        mov eax, edx        ; offset 00000040: first instruction of MySub
        ; ...
        ret                 ; pops 00000025 back into EIP
    MySub ENDP
    ```
- **USES**
    - lists the registers a procedure modifies; the assembler generates **push** at entry and **pop** before **ret** to preserve their values
    - *example*
        ```assembly
        ArraySum PROC USES esi ecx
            ;push esi       ; generated by the assembler
            ;push ecx       ; generated by the assembler
            mov eax,0       ; set the sum to zero
            ; ... (rest of the body not shown in these notes)
            ret             ; pop ecx / pop esi are generated before ret
        ArraySum ENDP
        ```

## Chapter 6 : Conditional Processing

### Boolean

- **AND**
    - syntax: ```AND destination, source```
    - can clear selected bits with a mask (here, clearing bit 5 turns 'a' into 'A')
        ```assembly
        mov al, 'a'         ;AL = 01100001b
        and al, 11011111b   ;AL = 01000001b
        ```
    ![](https://i.imgur.com/hcP2vOJ.png)
- **OR**
    - syntax: ```OR destination, source```
      ![](https://i.imgur.com/Fg5bkaS.png)
- **XOR**
    - syntax: ```XOR destination, source```
      ![](https://i.imgur.com/iHidzKf.png)
- **NOT**
    - syntax: ```NOT destination```
      ![](https://i.imgur.com/N9yb6uB.png)

### Compare and test

- **cmp**
    - compares the destination operand to the source operand
    - performs a nondestructive subtraction (destination - source); only the flags change
    - *syntax*: ```cmp operand_1, operand_2```
    - unsigned compare

        | Case | ZF | CF | Reason |
        | -------- | -------- | -------- | -------- |
        | D = S | 1 | 0 | D - S = 0 |
        | D < S | 0 | 1 | D - S < 0 (a borrow is needed) |
        | D > S | 0 | 0 | D - S > 0 |
    - signed compare (4-bit examples): D > S when ZF = 0 and SF = OF, D < S when SF ≠ OF

        | Case | D | S | SF | OF |
        | -------- | -------- | -------- | -------- | -------- |
        | D > S | 0101(5) | 1110(-2) | 0 | 0 |
        | D > S | 0110(6) | 1110(-2) | 1 | 1 |
        | D < S | 1110(-2) | 0111(7) | 0 | 1 |
        | D < S | 1111(-1) | 0101(5) | 1 | 0 |
- **test**
    - Performs a nondestructive AND
    - No operands are modified, but the Zero flag is affected.
    - *syntax*: ```test operand_1, operand_2```

### Jump

- based on flags

    | Mnemonic | Description | Flag |
    | - | - | - |
    | JZ | jump if zero | ZF = 1 |
    | JNZ | jump if not zero | ZF = 0 |
    | JC | jump if carry | CF = 1 |
    | JNC | jump if not carry | CF = 0 |
    | JO | jump if overflow | OF = 1 |
    | JNO | jump if not overflow | OF = 0 |
    | JS | jump if sign (negative) | SF = 1 |
    | JNS | jump if not sign (non-negative) | SF = 0 |
    | JP | jump if parity (even) | PF = 1 |
    | JNP | jump if not parity (odd) | PF = 0 |
- based on equality

    | Mnemonic | Description |
    | - | - |
    | JE | jump if equal |
    | JNE | jump if not equal |
    | JCXZ | jump if CX = 0 |
    | JECXZ | jump if ECX = 0 |
- based on unsigned comparison

    | Mnemonic | Description |
    | - | - |
    | JA | jump if above |
    | JNBE | jump if not below or equal |
    | JAE | jump if above or equal |
    | JNB | jump if not below |
    | JB | jump if below |
    | JNAE | jump if not above or equal |
    | JBE | jump if below or equal |
    | JNA | jump if not above |
- based on signed comparison

    | Mnemonic | Description |
    | - | - |
    | JG | jump if greater |
    | JNLE | jump if not less or equal |
    | JGE | jump if greater or equal |
    | JNL | jump if not less |
    | JL | jump if less |
    | JNGE | jump if not greater or equal |
    | JLE | jump if less or equal |
    | JNG | jump if not greater |

## Chapter 7 : Integer Arithmetic

### Shift

- types
    - Logical shift (vacated bits are filled with 0)
      ![](https://i.imgur.com/B5Ihjza.png)
    - Arithmetic shift (keeps the sign $\pm$)
      ![](https://i.imgur.com/DCackNz.png)
- *syntax*: ```SHL destination, count```
- instructions
    - **SHL** (SHift Left)
        - logical shift to the left
        - each 1-bit shift multiplies the value by 2
    - **SHR** (SHift Right)
        - the highest bit position is filled with a zero
        - each 1-bit shift divides an unsigned value by 2
          ![](https://i.imgur.com/PZRQ6ol.png)
    - **SAL** (Shift Arithmetic Left)
    - **SAR** (Shift Arithmetic Right)
      ![](https://i.imgur.com/39HfcDA.png)

### Rotation

- no bits are lost
- *syntax*: ```ROL destination, count```
- instructions
    - **ROL** (ROtate Left)
      ![](https://i.imgur.com/xAkuMHW.png)
    - **ROR** (ROtate Right)
      ![](https://i.imgur.com/L3fyCkM.png)
    - 
      **RCL** (Rotate Carry Left)
      ![](https://i.imgur.com/UnOOG9W.png)
    - **RCR** (Rotate Carry Right)
      ![](https://i.imgur.com/MKtkR25.png)

### Multiplication

- **MUL**
    - for unsigned numbers
    - *syntax*: ```mul bx```
      ![](https://i.imgur.com/jtH2JKm.png)
- **IMUL**
    - for signed numbers

### Division

- *syntax*: ```div bx```

    | Dividend | Divisor | Quotient | Remainder |
    | -------- | -------- | -------- | -------- |
    | AX | r/m8 | AL | AH |
    | DX:AX | r/m16 | AX | DX |
    | EDX:EAX | r/m32 | EAX | EDX |

## Chapter 10 : Structures and Macros

- a macro is invoked by name, much like a **function** in a high-level language, but its body is expanded inline instead of being called
- During the assembler's preprocessing step, each macro call is expanded into a copy of the macro
- The expanded code is passed to the assembly step, where it is checked for correctness
- *example*
    ```assembly
    mPutchar MACRO char
        push eax
        mov al, char
        call WriteChar
        pop eax
    ENDM

    .code
    mPutchar 'A'
    ; the call above expands to:
    ;   push eax
    ;   mov al, 'A'
    ;   call WriteChar
    ;   pop eax
    ```
- error handling
    - blank input
        ```assembly
        IFB <row>       ;if row is blank,
            EXITM       ;exit the macro
        ENDIF
        ```
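
The blank-argument check is easier to follow inside a complete macro. The sketch below combines it with the character-printing macro from the earlier example. It assumes MASM with the Irvine32 library; the macro name `mPutcharSafe` and the wording of the warning are made up for illustration and do not come from the original notes. `ECHO` is the MASM directive that prints a message while the file is being assembled.

```assembly
INCLUDE Irvine32.inc

; mPutcharSafe is a hypothetical name used only for this sketch.
mPutcharSafe MACRO char
    IFB <char>                  ; was the argument left blank?
        ECHO Warning: mPutcharSafe was invoked without an argument
        EXITM                   ; stop expanding, so no instructions are generated
    ENDIF
    push eax                    ; preserve EAX
    mov  al, char
    call WriteChar              ; Irvine32 procedure: writes the character in AL
    pop  eax
ENDM

.code
main PROC
    mPutcharSafe 'A'            ; expands to push / mov / call / pop
    mPutcharSafe                ; blank argument: only the ECHO message appears at assembly time
    exit
main ENDP
END main
```

Because the check runs while the macro is being expanded, the second invocation adds no instructions to the program; the warning shows up only in the assembler's output.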