Assembly Language in UNIX

Tools for ASM in UNIX

Compiler `GCC`

-m32: generate 32-bit x86 object code
-masm=intel: emit Intel syntax instead of the default AT&T syntax
-fno-stack-protector: disable the stack protector, which makes the stack frame easier to understand

Assembler `YASM`

A computer program which translates assembly language to machine language

yasm -f elf32: Output x86 object code
yasm -f elf64: Output x86_64 object code

Linker `ld`

A linker takes one or more object files generated by a compiler or an assembler and combines them into a single executable file, library file, or another 'object' file.

ld -m elf_i386: Link x86 object code
ld -m elf_x86_64: Link x86_64 object code

Debugger `gdb` with plugin `gdb_peda`

Installation:

git clone https://github.com/longld/peda.git ~/peda
echo "source ~/peda/peda.py" >> ~/.gdbinit

ASM Characteristics

Low level language
One-to-one mapping from mnemonics to machine code
Assembler: turns assembly code into machine code
Machine and platform-dependent
- Different machines/platforms have different assemblers
- Even assemblers on the same machine/platform could be different

Instruction Execution Cycle

Fetch: Read instruction code from address in PC and place in IR. ( IR ← Memory[PC] )
Decode: Hardware determines what the opcode/function is.
Fetch operands (from memory if necessary): If any operands are memory addresses, known as the effective address, or EA for short, initiate memory read cycles to read them into CPU registers.
Execute: Perform the function of the instruction. If arithmetic or logic instruction, utilize the ALU circuits to carry out the operation on data in registers.
Store output (in memory if necessary): If destination is a memory address, initiate a memory write cycle to transfer the result from the CPU to memory.
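
The five steps above can be pictured as a loop. The following is a minimal Python sketch of a toy machine; the opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the two registers, and the tuple encoding are assumptions made for illustration and do not correspond to x86.

```python
# Toy machine: illustrative opcodes and layout, not a real instruction set.
def run(program, memory):
    """Run (opcode, a, b) tuples until HALT; return the register file."""
    regs = {"A": 0, "B": 0}
    pc = 0
    while True:
        ir = program[pc]          # fetch: IR <- Memory[PC]
        pc += 1
        op, a, b = ir             # decode
        if op == "HALT":
            return regs
        if op == "LOAD":          # fetch operand from memory into a register
            regs[a] = memory[b]
        elif op == "ADD":         # execute in the "ALU"
            regs[a] = (regs[a] + regs[b]) & 0xFFFFFFFF
        elif op == "STORE":       # store output to memory
            memory[b] = regs[a]
        else:
            raise ValueError(f"unknown opcode {op!r}")

memory = {0x10: 7, 0x14: 35, 0x18: 0}
program = [
    ("LOAD", "A", 0x10),
    ("LOAD", "B", 0x14),
    ("ADD", "A", "B"),
    ("STORE", "A", 0x18),
    ("HALT", None, None),
]
regs = run(program, memory)
```

After the run, register `A` and memory cell `0x18` both hold the sum of the two loaded values.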

Basic Execution Environment

Addressable Memory

Long mode (x86_64)
1. Long mode allows the microprocessor to access 64-bit memory space, and access 64-bit long registers.
2. When a computer is powered on, the CPU starts in real mode and begins booting. The 64-bit operating system then checks and switches the CPU into Long mode and then starts new kernel-mode threads running 64-bit code.
3. 256 TB
4. 48-bit virtual address
Protected mode (x86)
1. Protected mode may only be entered after the system software sets up one descriptor table and enables the Protection Enable (PE) bit in the control register 0 (CR0)
2. 4GB
3. 32-bit virtual address
Real-address and Virtual-8086 modes
1. Real mode is characterized by a 20-bit segmented memory address space (giving exactly 1 MiB of addressable memory) and unlimited direct software access to all addressable memory, I/O addresses and peripheral hardware.
2. 1MB space
3. 20-bit address
4. Addressing uses segment:offset pairs
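
As a worked example of segment:offset addressing, the sketch below computes the physical address as segment * 16 + offset. The function name is hypothetical, and wrapping the result to 20 bits is an assumption that models a 20-bit address bus.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address of a 16-bit segment:offset pair, wrapped to 20 bits."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF

# Different segment:offset pairs can name the same physical byte.
a = real_mode_address(0x1234, 0x0010)
b = real_mode_address(0x1235, 0x0000)
```

Both calls give 12350h, which shows that a real-mode address is not unique to one segment:offset pair.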

General Purpose Registers/Access Parts of Registers

Index and Base Registers

An index register is used for modifying operand addresses during the run of a program, typically for doing vector/array operations.
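The same register has different names depending on which part of it is accessed.
Writing to AH, AL, or AX leaves the rest of RAX unchanged; writing to EAX zero-extends the result, clearing the upper 32 bits of RAX.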

64-bit name	32-bit name	16-bit name
RDI	EDI	DI
RSI	ESI	SI
RBP	EBP	BP
RSP	ESP	SP
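
The partial-register behaviour noted before the table can be modelled with bit masks. This is a sketch with hypothetical helper names; it treats RAX as a plain 64-bit integer.

```python
MASK64 = (1 << 64) - 1

def write_al(rax, v):
    return (rax & ~0xFF & MASK64) | (v & 0xFF)            # only bits 0-7 change

def write_ah(rax, v):
    return (rax & ~0xFF00 & MASK64) | ((v & 0xFF) << 8)   # only bits 8-15 change

def write_ax(rax, v):
    return (rax & ~0xFFFF & MASK64) | (v & 0xFFFF)        # only bits 0-15 change

def write_eax(rax, v):
    return v & 0xFFFFFFFF                                 # upper 32 bits become 0

rax = 0x1122334455667788
```

Only the 32-bit write discards the old upper half; the 8-bit and 16-bit writes merge into the existing value.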

Common Format

Basic Element

Integer Constants

Optional leading + or -
Radix Characters:
- h - hexadecimal
- d - decimal
- b - binary
- r - encoded real (?)

Integer Expressions

Operators and precedence levels:

Operator	Name	Precedence Level
()	parentheses	1
+,-	unary plus,minus	2
*,/	multiply,divide	3
MOD	modulus	3
+,-	add,subtract	4

Character and String Constants

Enclose character in single or double quotes
- 'A',"x"
- ASCII character = 1byte
Enclose strings in single or double quotes
- "ABC",'xyx'
- Each character occupies a single byte
Embedded quotes
- 'Hello "World", Charles'

Reserved Words & Identifiers

Reserved words cannot be used as identifiers
Identifiers
- not case sensitive
- first character must be a letter, _, @, ?, or $

Assembler Directives

Commands that are recognized and acted upon by the assembler
- Not part of the Intel instruction set
- Used to declare code and data areas, among other things
Different assemblers have different directives

Instructions

Assembled into machine code by assembler
Executed at runtime by the CPU
An Instruction contains:
- Label (optional)
- Mnemonic (mandatory)
- Operands (depending on the instruction)
- Comment (optional)

Labels

Act as place markers
Data label
- must be unique
- example: myArray (not followed by a colon)
Code label
- target of jump and loop instructions
- example: L1: (followed by a colon)

Mnemonics and Operands

Instruction Mnemonics
- memory aid MOV,ADD,SUB…
Operands
- constant
- constant expression
- register
- memory(datalabel)

Instruction Format Examples

;No operands
    stc                 ;set Carry flag
;One operand
    inc eax             ;register
    inc BYTE PTR [a]    ;memory
;Two operands
    add ebx,ecx         ;ebx = ebx + ecx
    sub BYTE[a],25      ;a=a-25
    add eax, 36*25      ;eax = eax + 36*25

The Assemble-Link-Execute Cycle

Create an ASCII text file (source file).
The assembler reads the source file and produces an object file.
The linker reads the object file and checks to see if the program contains any calls to procedures in a link library.
The operating system loader utility reads the executable file into memory and branches the CPU to the program’s starting address, and the program begins to execute.

Defining Data

Intrinsic (Built-in) Data Types

Type	Usage
BYTE	8-bit unsigned integer. B stands for byte
SBYTE	8-bit signed integer. S stands for signed
WORD	16-bit unsigned integer
SWORD	16-bit signed integer
DWORD	32-bit unsigned integer. D stands for double
SDWORD	32-bit signed integer. SD stands for signed double
FWORD	48-bit integer (Far pointer in protected mode)
QWORD	64-bit integer. Q stands for quad
TBYTE	80-bit (10-byte) integer. T stands for Ten-byte
REAL4	32-bit (4-byte) IEEE short real
REAL8	64-bit (8-byte) IEEE long real
REAL10	80-bit (10-byte) IEEE extended real

Data definition statement

A data definition statement sets aside storage in memory for a variable.
All initializers become binary data in memory

A data definition has the following syntax:

    [name] directive initializer [,initializer]...
    count  DWORD     12345

Legacy Data Directives:

Directive	Usage
DB	8-bit integer
DW	16-bit integer
DD	32-bit integer or real
DQ	64-bit integer or real
DT	define 80-bit (10-byte) integer

Defining BYTE and SBYTE Data

Each of the following defines a single byte of storage

    value1 BYTE 'A'     ; character literal
    value2 BYTE 0       ; smallest unsigned byte
    value3 BYTE 255     ; largest unsigned byte
    value4 SBYTE -128   ; smallest signed byte
    value5 SBYTE +127   ; largest signed byte

A question mark (?) initializer leaves the variable uninitialized.
```
    value6 BYTE ?
```
Multiple Initializers
```
list BYTE 10,20,30,40
```
- Memory layout of a byte sequence.

Defining Strings

A string is implemented as an array of characters
- usually enclosed in quotation marks
- often null-terminated (ends with a null byte containing 0)

Example

greeting1 BYTE "Good afternoon",0
greeting2 BYTE 'Good night',0
greeting1 BYTE 'G','o','o','d'....etc.

;can be divided between multiple lines
greeting1 BYTE "Welcome to the Encryption Demo program "
        BYTE "created by Kip Irvine.",0dh,0ah
        BYTE "If you wish to modify this program, please "
        BYTE "send me a copy.",0dh,0ah,0

End-of-line character sequence:
- 0Dh = carriage return
- 0Ah = line feed

DUP Operator

Use DUP to allocate (create space for) an array or string. Syntax: counter DUP ( argument )

 var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero
 var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
 var3 BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
 var4 BYTE 10,3 DUP(0),20 ; 5 bytes
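
To check the byte counts in the comments above, DUP can be modelled as plain repetition. The `dup` helper is hypothetical, and uninitialized storage (`?`) is not modelled.

```python
def dup(count: int, item: bytes) -> bytes:
    """Model of `count DUP(item)`: repeat the initializer count times."""
    return item * count

var1 = dup(20, b"\x00")                              # 20 DUP(0)
var3 = dup(4, b"STACK")                              # 4 DUP("STACK")
var4 = bytes([10]) + dup(3, b"\x00") + bytes([20])   # 10,3 DUP(0),20
```

`var3` is 20 bytes because the 5-byte string is repeated 4 times, and `var4` is 5 bytes because DUP contributes only the 3 middle zeros.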

Defining WORD and SWORD Data

Define storage for 16-bit integers or double characters, as a single value or multiple values

word1 WORD 65535        ; largest unsigned value
word2 SWORD -32768      ; smallest signed value
word3 WORD ?            ; uninitialized, unsigned
word4 WORD "AB"         ; double characters
myList WORD 1,2,3,4,5   ; array of words
array WORD 5 DUP(?)     ; uninitialized array

Memory layout
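
x86 stores multi-byte values in little-endian order, so the low byte of each word comes first in memory. A small sketch using Python's `struct` module shows the bytes that `myList` would occupy; the variable names are illustrative.

```python
import struct

# "<" selects little-endian, "H" is an unsigned 16-bit word.
my_list = struct.pack("<5H", 1, 2, 3, 4, 5)
one_word = struct.pack("<H", 0x1234)

layout = my_list.hex(" ")
```

Here `layout` is the string `01 00 02 00 03 00 04 00 05 00`, and the single word 1234h is stored as the bytes 34h, 12h.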

Defining DWORD and SDWORD Data

Storage definitions for signed and unsigned 32-bit integers

val1 DWORD 12345678h ; unsigned
val2 SDWORD -2147483648 ; signed
val3 DWORD 20 DUP(?) ; unsigned array

;DD directive can also be used to define doubleword data
val1 DD 12345678h ; unsigned
val2 DD -2147483648 ; signed

;The DWORD can be used to declare a variable that contains the 32-bit offset of another variable.
pVal DWORD val3

;Array of 32-bit
myList DWORD 1,2,3,4,5

Defining QWORD, TBYTE, Real Data

Storage definitions for quadwords, tenbyte values, and real numbers

;storage for 64-bit (8-byte) values
quad1 QWORD 1234567812345678h
;DQ
quad1 DQ 1234567812345678h

Uninitialized Data (BSS Section)

resb – 1-byte
resw – 2-byte
resd – 4-byte
resq – 8-byte
rest – 10-byte
resdq – 16-byte
reso – the same as resdq

buffer: resb 64 ; reserve 64 bytes
wordvar: resw 1 ; reserve a word
realarray resq 10 ; array of ten reals

Symbolic Constants

A symbolic constant (or symbol definition) is created by associating an identifier (a symbol) with an integer expression or some text. Symbols do not reserve storage. They are used only by the assembler when scanning a program, and they cannot change at runtime. The following table summarizes their differences.

Equal-Sign Directive

name = expression
1. expression is a 32-bit integer value (an expression or a constant)
2. name may be redefined later in the program
3. name is called a symbolic constant

COUNT = 500
mov eax, COUNT

Calculating the Sizes of Arrays and Strings

current location counter: $

list BYTE 10,20,30,40
ListSize = ($ - list) ; = 4, the number of bytes since the label
;Divide total number of bytes by 2 (the size of a word)
list WORD 1000h,2000h,3000h,4000h
ListSize = ($ - list) / 2
;Divide total number of bytes by 4 (the size of a doubleword)
list DWORD 1,2,3,4
ListSize = ($ - list) / 4

EQU Directive

Define a symbol as either an integer or text expression.
Cannot be redefined

PI EQU <3.1416>
pressKey EQU <"Press any key to continue...",0>
.data
prompt BYTE pressKey

TEXTEQU Directive

Define a symbol as either an integer or text expression
Called a text macro
Can be redefined

name TEXTEQU <text>
name TEXTEQU textmacro
name TEXTEQU %constExpr

continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
rowSize = 5
.data
prompt1 BYTE continueMsg
count TEXTEQU %(rowSize * 2) ; evaluates the expression
setupAL TEXTEQU <mov al,count>
.code
setupAL ; generates: "mov al,10"

Data Transfers Instructions

Operand Types

Immediate—uses a numeric literal expression
Register—uses a named register in the CPU
Memory—references a memory location
Instruction Operand Notation, 32-Bit Mode.

Operand	Description
reg8	8-bit general-purpose register: AH, AL, BH, BL, CH, CL, DH, DL
reg16	16-bit general-purpose register: AX, BX, CX, DX, SI, DI, SP, BP
reg32	32-bit general-purpose register: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
reg	Any general-purpose register
sreg	16-bit segment register: CS, DS, SS, ES, FS, GS
imm	8-, 16-, or 32-bit immediate value
imm8	8-bit immediate byte value
imm16	16-bit immediate word value
imm32	32-bit immediate doubleword value
reg/mem8	8-bit operand, which can be an 8-bit general register or memory byte
reg/mem16	16-bit operand, which can be a 16-bit general register or memory word
reg/mem32	32-bit operand, which can be a 32-bit general register or memory doubleword
mem	An 8-, 16-, or 32-bit memory operand

[label:]    mnemonic        [operands]              [ ; comment ]
mnemonic
mnemonic    [destination]
mnemonic    [destination],  [source]
mnemonic    [destination],  [source-1],[source-2]

Direct Memory Operands

A direct memory operand is a named reference to storage in memory.
The named reference (label) is automatically dereferenced by the assembler

.data
var1 BYTE 10h
.code
mov al,var1 ; AL = 10h
mov al,[var1] ; AL = 10h

MOV Instruction

Syntax:
```
MOV destination,source
```
Both operands must be the same size.
Both operands cannot be memory operands.

The instruction pointer register (IP, EIP, or RIP) cannot be a destination operand.

MOV reg,reg
MOV mem,reg
MOV reg,mem
MOV mem,imm
MOV reg,imm
mov eax, [ebx + ecx*4 + 4]
mov [0x600004], ebx
inc [0x600008] ; this is invalid: the operand size is ambiguous
inc DWORD PTR [0x600008]
mov [0x600000], [0x600004] ; this is invalid: both operands are memory

XCHG Instruction

XCHG exchanges the values of two operands. At least one operand must be a register. No immediate operands are permitted.

.data
var1 WORD 1000h
var2 WORD 2000h
.code
xchg ax,bx ; exchange 16-bit regs
xchg ah,al ; exchange 8-bit regs
xchg var1,bx ; exchange mem, reg
xchg eax,ebx ; exchange 32-bit regs
xchg var1,var2 ; error: two memory operands

Direct-Offset Operands

A constant offset is added to a data label to produce an effective address (EA). The address is dereferenced to get the value inside its memory location.

.data
arrayB BYTE 10h,20h,30h,40h
.code
mov al,arrayB+1 ; AL = 20h
mov al,[arrayB+1] ; alternative notation

LEA Instruction

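LEA (load effective address) computes the address of the source operand and moves that address, not the value stored there, into the target operand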

lea eax, [0x600000] ; eax = 0x600000
lea eax, [0x600000+4] ; eax = 0x600000 + 4
lea eax, [ebx + 17] ; eax = ebx + 17
lea eax, [ebx + ecx*4 + 4] ; eax = ebx + ecx*4 + 4

Some special uses
- Add a constant to a register
- Quick multiplication by 2, 3, 5, or 9
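
The multiplication trick follows from the addressing form base + index*scale + displacement, where scale is 1, 2, 4, or 8. The sketch below models that calculation; the `lea` function and its keyword names are hypothetical, and results are truncated to 32 bits.

```python
MASK32 = 0xFFFFFFFF

def lea(base=0, index=0, scale=1, disp=0):
    """Effective address base + index*scale + disp, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32

x = 7
times3 = lea(base=x, index=x, scale=2)   # like lea eax, [ebx + ebx*2]
times5 = lea(base=x, index=x, scale=4)   # like lea eax, [ebx + ebx*4]
times9 = lea(base=x, index=x, scale=8)   # like lea eax, [ebx + ebx*8]
```

Using the same register as base and index gives x * (scale + 1), which is why 3, 5, and 9 are available in a single instruction.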

Bitwise Operations

Operation	Description
AND	Boolean AND operation between a source operand and a destination operand.
OR	Boolean OR operation between a source operand and a destination operand.
XOR	Boolean exclusive-OR operation between a source operand and a destination operand.
NOT	Boolean NOT operation on a destination operand.
TEST	Implied boolean AND operation between a source and destination operand, setting the CPU flags appropriately.

Shift and Rotate Instructions

Shift Instruction

Shift left by one bit: value = value * 2
Shift right by one bit: value = value / 2

Logical shift

SHL reg,imm8
SHL mem,imm8
SHL reg,CL
SHL mem,CL

Arithmetic shift

SAL reg/mem, imm8/cl ; identical to SHL
SAR reg/mem, imm8/cl ; shifts right and preserves the sign bit
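
The difference between a logical and an arithmetic right shift is easiest to see on a value with the sign bit set. The following sketch models 8-bit operands; the function names mirror the mnemonics but are only a model, SHR (the logical right shift) is included for comparison, and counts are assumed to be smaller than the operand width.

```python
def shl(value, count, bits=8):
    """Logical/arithmetic shift left: zeros enter from the right."""
    return (value << count) & ((1 << bits) - 1)

def shr(value, count, bits=8):
    """Logical shift right: zeros enter from the left."""
    return (value & ((1 << bits) - 1)) >> count

def sar(value, count, bits=8):
    """Arithmetic shift right: copies of the sign bit enter from the left."""
    mask = (1 << bits) - 1
    value &= mask
    if value & (1 << (bits - 1)):
        value -= 1 << bits          # reinterpret as a negative number
    return (value >> count) & mask  # Python's >> is arithmetic on negatives
```

For F0h (-16 as a signed byte), `shr` gives 78h while `sar` gives F8h (-8), so only SAR halves a signed value.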

Rotate Instructions

Rotate without the carry flag

ROL reg/mem, imm8/cl
ROR reg/mem, imm8/cl

Rotate with the carry flag

RCL reg/mem, imm8/cl
RCR reg/mem, imm8/cl
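
A rotate moves bits that leave one end back in at the other end; the through-carry variants treat CF as one extra bit. This is a sketch on 8-bit values with hypothetical function names, and it does not model how the plain rotates update CF.

```python
def rol(value, count, bits=8):
    """Rotate left: bits shifted out on the left re-enter on the right."""
    mask = (1 << bits) - 1
    value &= mask
    count %= bits
    return ((value << count) | (value >> (bits - count))) & mask

def ror(value, count, bits=8):
    """Rotate right, expressed as a left rotate by the complementary count."""
    return rol(value, bits - (count % bits), bits)

def rcl(value, carry, count, bits=8):
    """Rotate left through carry: CF sits above the most significant bit."""
    mask = (1 << bits) - 1
    wide = ((carry & 1) << bits) | (value & mask)
    wide = rol(wide, count, bits + 1)
    return wide & mask, wide >> bits   # (new value, new CF)
```

With 81h as input, `rol` by 1 yields 03h, whereas `rcl` by 1 with CF = 0 yields 02h and leaves the old top bit in CF.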

Multiplication and Division Instructions

MUL Instruction

In 32-bit mode, MUL (unsigned multiply) instruction multiplies an 8-, 16-, or 32-bit operand by either AL, AX, or EAX.
```
MUL r/m8
MUL r/m16
MUL r/m32
```

Multiplicand	Multiplier	Product
AL	reg/mem8	AX
AX	reg/mem16	DX:AX
EAX	reg/mem32	EDX:EAX
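
The last row of the table can be illustrated with a small model of the 32-bit form: the 64-bit product is split across EDX (high half) and EAX (low half). The function name is hypothetical; the carry flag is reported as set when the high half is non-zero.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Model of MUL r/m32: returns (EDX, EAX, CF)."""
    product = (eax & MASK32) * (operand & MASK32)
    edx, eax = product >> 32, product & MASK32
    cf = int(edx != 0)      # set when the result does not fit in EAX alone
    return edx, eax, cf
```

For example, 12345h * 1000h fits in EAX (EDX = 0, CF = 0), while 80000000h * 2 overflows into EDX (EDX = 1, EAX = 0, CF = 1).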

IMUL Instruction

IMUL (signed integer multiply ) multiplies an 8-, 16-, or 32-bit signed operand by either AL, AX, or EAX

IMUL reg/mem8 ; AX = AL * reg/mem8
IMUL reg/mem16 ; DX:AX = AX * reg/mem16
IMUL reg/mem32 ; EDX:EAX = EAX * reg/mem32

DIV Instruction

The DIV (unsigned divide) instruction performs 8-bit, 16-bit, and 32-bit division on unsigned integers

Dividend	Divisor	Quotient	Remainder
AX	reg/mem8	AL	AH
DX:AX	reg/mem16	AX	DX
EDX:EAX	reg/mem32	EAX	EDX

Example with a 32-bit dividend (DX:AX) and a 16-bit divisor

mov dx,0 ; clear dividend, high
mov ax,8003h ; dividend, low
mov cx,100h ; divisor
div cx ; AX = 0080h, DX = 0003h
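
The same division can be traced with a short model of the 16-bit divisor form. The function name and the Python exceptions standing in for the CPU's divide error are assumptions for illustration.

```python
def div16(dx, ax, divisor):
    """Model of DIV r/m16: DX:AX / divisor -> (AX quotient, DX remainder)."""
    divisor &= 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("divide error: quotient does not fit in AX")
    return quotient, remainder

ax, dx = div16(0x0000, 0x8003, 0x0100)
```

The result matches the comment in the listing: AX = 0080h and DX = 0003h. Forgetting to clear DX first changes the dividend and can make the quotient too large for AX.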

Sign Extension Instructions (CBW, CWD, CDQ)

CBW (convert byte to word) sign-extends AL into AH
CWD (convert word to doubleword) sign-extends AX into DX
CDQ (convert doubleword to quadword) sign-extends EAX into EDX
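
Sign extension copies the top bit of the narrow value into every bit of the wider half. The sketch below models the three instructions on plain integers; the helper names are hypothetical.

```python
def sign_extend(value, from_bits, to_bits):
    """Widen value, filling the new upper bits with copies of the sign bit."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value

def cbw(al):
    return sign_extend(al, 8, 16)                 # new AX

def cwd(ax):
    full = sign_extend(ax, 16, 32)
    return full >> 16, full & 0xFFFF              # (DX, AX)

def cdq(eax):
    full = sign_extend(eax, 32, 64)
    return full >> 32, full & 0xFFFFFFFF          # (EDX, EAX)
```

A negative input such as AL = 9Bh becomes AX = FF9Bh, while a positive input such as 7Fh becomes 007Fh.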

IDIV Instruction

IDIV (signed divide) performs signed integer division

Same syntax and operands as DIV instruction

mov eax,-48
cdq ; extend EAX into EDX
mov ebx,5
idiv ebx ; EAX = -9, EDX = -3
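
IDIV truncates the quotient toward zero, and the remainder takes the sign of the dividend. This differs from floor division in some high-level languages (Python's `//` gives -10 for -48 // 5), so the model below spells out the truncation. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def idiv32(edx, eax, divisor):
    """Model of IDIV r/m32: EDX:EAX / divisor -> (EAX quotient, EDX remainder)."""
    dividend = to_signed(((edx & MASK32) << 32) | (eax & MASK32), 64)
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    q = abs(dividend) // abs(d)          # truncate toward zero
    if (dividend < 0) != (d < 0):
        q = -q
    r = dividend - q * d                 # remainder has the dividend's sign
    if not -2**31 <= q <= 2**31 - 1:
        raise OverflowError("divide error: quotient does not fit in EAX")
    return q & MASK32, r & MASK32

# -48 sign-extended into EDX:EAX, as CDQ would produce it
quotient, remainder = idiv32(0xFFFFFFFF, 0xFFFFFFD0, 5)
```

Interpreted as signed values, the results are -9 and -3, matching the comment in the listing.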

Control Flow Instructions

TEST Instruction

Performs a nondestructive AND operation between each pair of matching bits in two operands
No operands are modified, but the Zero flag is affected.
Example: jump to a label if either bit 0 or bit 1 in AL is set.
```
test al,00000011b
jnz ValueFound
```

CMP Instruction

Compares the destination operand to the source operand
- Nondestructive subtraction of source from destination (destination operand is not changed)

Syntax: CMP destination, source

mov ax,5
cmp ax,10 ; ZF = 0 and CF = 1
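
The flag values in that comment follow from treating CMP as a subtraction whose result is discarded. The sketch below models the 16-bit case; the function name and the flag dictionary are illustrative.

```python
def cmp16(dest, src):
    """Model of CMP r/m16: flags from dest - src, operands unchanged."""
    dest &= 0xFFFF
    src &= 0xFFFF
    result = (dest - src) & 0xFFFF
    return {
        "ZF": int(result == 0),                                   # equal
        "CF": int(dest < src),                                    # unsigned borrow
        "SF": result >> 15,                                       # sign of result
        "OF": ((dest ^ src) & (dest ^ result)) >> 15 & 1,         # signed overflow
    }

flags = cmp16(5, 10)
```

Unsigned jumps such as JA and JB read CF and ZF, while signed jumps such as JG and JL compare SF with OF; for 5 versus 10 both views agree that the destination is smaller.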

Conditional Jumps

Conditional jumps can be divided into four groups:
- Jumps based on specific flag values
- Jumps based on equality between operands or the value of (E)CX
- Jumps based on comparisons of unsigned operands
- Jumps based on comparisons of signed operands
Jumps Based on Specific Flag Values
Jumps Based on Equality
Jumps Based on Unsigned Comparisons
Jumps Based on Signed Comparisons

Conditional Structures

Implementing Compound Expressions

Logical AND Operator

When implementing the logical AND operator, consider that HLLs use short-circuit evaluation

if (al > bl) AND (bl > cl) X = 1

cmp al,bl ; first expression...
ja L1
jmp next
L1: cmp bl,cl ; second expression...
ja L2
jmp next
L2: mov X,1 ; both true: set X to 1
next:

The code can be reduced to five instructions by inverting the condition, changing the initial JA instruction to JBE:

cmp al,bl ; first expression...
jbe next ; quit if false
cmp bl,cl ; second expression
jbe next ; quit if false
mov X,1 ; both are true
next:

Logical OR Operator

When implementing the logical OR operator, consider that HLLs use short-circuit evaluation
We can use "fall-through" logic to keep the code as short as possible:

if (al > bl) OR (bl > cl) X = 1

cmp al,bl ; 1: compare AL to BL
ja L1 ; if true, skip second expression
cmp bl,cl ; 2: compare BL to CL
jbe next ; false: skip next statement
L1: mov X,1 ; true: set X = 1
next:

WHILE Loops

A WHILE loop is really an IF statement followed by the body of the loop, followed by an unconditional jump to the top of the loop.

L1:
    cmp or test
    J<INV>cond L2
    ; do something here
    jmp L1
L2:

while (val1 < val2) { val1++; val2--; }

mov eax,val1 ; copy variable to EAX
beginwhile:
cmp eax,val2 ; if not (val1 < val2)
jnl endwhile ; exit the loop
inc eax ; val1++;
dec val2 ; val2--;
jmp beginwhile ; repeat the loop
endwhile:
mov val1,eax ; save new value for val1

FOR Loop

CPU built-in loops

Use CX/ECX/RCX as the counter

Repeat the loop 10 times

    mov ecx, 10
L1:
    ; do something here
    loop L1
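
LOOP decrements the counter first and jumps only if the result is non-zero, so the body always runs at least once. The model below counts iterations; the function name is hypothetical, and the `bits` parameter exists only so that the wrap-around case can be shown with a small register width.

```python
def loop_iterations(ecx, bits=32):
    """Count how many times a body ending in LOOP runs for a starting counter."""
    mask = (1 << bits) - 1
    ecx &= mask
    iterations = 0
    while True:
        iterations += 1           # loop body
        ecx = (ecx - 1) & mask    # LOOP: decrement the counter (wraps around)
        if ecx == 0:              # fall through once the counter reaches zero
            return iterations
```

Starting with a counter of 0 does not skip the loop: the counter wraps, and the body runs 2^bits times (16 times in a 4-bit model, 2^32 times for ECX).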

C-style FOR loop

for (i = 0; i < 10; i++) { /* … */ }

    mov DWORD [i], 0
L1:
    cmp DWORD [i], 10
    jge L2
    ; do something here
    inc DWORD [i]
    jmp L1
L2:

Stack Operations

Imagine a stack of plates
- plates are only added to the top
- plates are only removed from the top
- LIFO structure

Runtime Stack

Managed by the CPU, using two registers
- SS (stack segment)
- ESP (stack pointer)(SP in Real-address mode)

PUSH Operation

Put a number into the stack
ESP = ESP - sizeof(object)
[ESP] = object

A 32-bit push operation decrements the stack pointer by 4 and copies a value into the location pointed to by the stack pointer.
Same stack after pushing two more integers:
The stack grows downward, from higher addresses toward lower ones. The area below ESP is always available (unless the stack has overflowed).

Syntax

PUSH reg/mem16
PUSH reg/mem32
PUSH imm32

POP Operation

Remove a number from the stack
ESP = ESP + sizeof(top object)

Copies value at stack[ESP] into a register or variable.
Adds n to ESP, where n is either 2 or 4.
- value of n depends on the attribute of the operand receiving the data
pop edx => edx=[esp]
pop dword ptr[a] =>[a]=[esp]
POP only changes the value of ESP; the popped value stays in memory until a later push overwrites that location.
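
The push and pop rules can be traced with a toy model of a 32-bit stack. The class name, the starting ESP of 1000h, and the dictionary used as memory are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF

class Stack32:
    """Toy model of a 32-bit runtime stack; memory is a dict keyed by address."""

    def __init__(self, top=0x1000):       # starting ESP is an arbitrary choice
        self.esp = top
        self.mem = {}

    def push(self, value):
        self.esp -= 4                      # ESP = ESP - 4
        self.mem[self.esp] = value & MASK32    # [ESP] = value

    def pop(self):
        value = self.mem[self.esp]         # read [ESP]
        self.esp += 4                      # ESP = ESP + 4; memory is not erased
        return value

s = Stack32()
s.push(0xA5)
s.push(0x01)
popped = s.pop()
stale = s.mem[0x0FF8]     # the popped slot still holds the old value
s.push(0x02)              # the next push reuses and overwrites that slot
```

After the final push, address 0FF8h holds 2 instead of the stale 1, which is the overwrite described above.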