---
# System prepended metadata

title: Assembly Notes

---

# Assembly Notes

[Reference](https://www.tutorialspoint.com/assembly_programming/index.htm)

## The Assembly Language

Assembly language (ASM) is a low-level language in which each statement maps directly to an instruction that the processor executes. Apart from machine code itself, it is the lowest level at which we can communicate with the machine, since its instructions are symbolic names for machine-language codes.

We will be using x86 ASM in DOSBox through MASM (the Microsoft Macro Assembler).

## DosBox and MASM

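DOSBox is an emulator that recreates an x86 PC running DOS, which lets us run 16-bit tools such as MASM on a modern machine.

After installation, run these commands to mount your MASM directory (here `d:\masm`) as drive `C:` and switch to it: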

```
mount c d:\masm
c:
```

You may check whether DOSBox successfully detected your MASM by typing:
```
MASM
```

Your output should look like this:

![image](https://hackmd.io/_uploads/H1l-T7ym1x.png)

Additionally, you can put the mount commands at the bottom of the DOSBox options file (in the `[autoexec]` section) so that they run automatically whenever you open DOSBox.

## Program Structure

Assembly programs are composed of three sections:
- `DATA SEGMENT`
- `EXTRA SEGMENT`
- `CODE SEGMENT`

### `DATA` Segment

This segment is used for declaring initialized data or constants, such as constant values, messages, and file names. By convention these values stay the same at run time, although nothing stops the program from writing to them.

We declare the data section as:

```masm
DATA SEGMENT
[label] DB "Hello World"
DATA ENDS
```

Note that:
- `DB`: Define Byte (8 bits)
    - characters, strings, and byte arrays
- `DW`: Define Word (16 bits)
    - integers
- `DD`: Define Double Word (32 bits)
    - large integers
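
As a small illustration of the three directives, the fragment below declares one item of each size. The labels (`LETTER`, `GREETING`, `TABLE`, `COUNT`, `BIG`) are made up for this example, and it is only the data segment of a program, not a complete one.

```masm
DATA SEGMENT
LETTER   DB "A"               ; one byte holding the ASCII code 41H
GREETING DB "Hi", 13, 10, "$" ; a string is a run of consecutive bytes
TABLE    DB 5 DUP(0)          ; array of five bytes, each initialized to 0
COUNT    DW 1000              ; 16-bit word (0 to 65535 when unsigned)
BIG      DD 100000            ; 32-bit double word, too large for a DW
DATA ENDS
```

Choosing the directive by the size of the value matters later on, because instructions such as `MOV AL, LETTER` (byte) and `MOV AX, COUNT` (word) must match the declared size.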

### `EXTRA` Segment

This section is used for declaring variables:

```masm
EXTRA SEGMENT
[label] DW 25
EXTRA ENDS
```

### `CODE` Segment

This is where we write the code.

```masm
CODE SEGMENT
ASSUME CS:CODE, DS:DATA, ES:EXTRA
START: MOV AX, DATA ; initialization
       MOV DS, AX
       ; needed if there's an EXTRA segment
       MOV AX, EXTRA
       MOV ES, AX
       ...
CODE ENDS
     END START
```

## Statements/Instructions

ASM programs operate on three types of statements:

- instructions
    - directly tell the processor what to do
    - each instruction consists of an operation code (opcode) and, optionally, operands
- assembler directives or pseudo-ops
    - tell the assembler about various aspects of the assembly process
    - non-executable: they generate no machine instructions
- macros
    - a textual substitution mechanism

## Syntax

ASM statements follow this format:

```
[label] mnemonic [operands] [;comment]
```

Fields in square brackets are optional. An instruction consists of the name of the instruction **to be executed** (the mnemonic), followed by its operands.

```nasm
INC COUNT ;increment memory variable count
MOV TOTAL, 48 ;transfer the value 48 in memory variable TOTAL
ADD AH, BH ;add content of BH into AH register
AND MASK1, 128 ;perform AND operation on variable MASK1 and 128
ADD MARKS, 10 ;add 10 to variable MARKS
MOV AL, 10 ;transfer the value 10 into AL register
```

### Printing Hello World

We use this example as the simplest way to introduce how an ASM program works:

```nasm
DATA SEGMENT
MSG DB "Hello World", "$" ;$ indicates an End of String argument
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA ; program initialization
START: MOV AX, DATA
	   MOV DS, AX
	   
	   MOV DX, OFFSET MSG ; loads offset msg to dx
	   ; after loading MSG's offset into DX, full address of MSG can be
	   ; referenced as DS:DX
	   MOV AH, 09H ; DOS function to print string
	   INT 21H ; DOS interrupt: runs the function selected in AH
	   
	   MOV AH, 4CH ; sets up DOS function for program termination
	   INT 21H ; terminates program (from MOV AH, 4CH)
	   
CODE ENDS
	 END START
```

Save this file as `[filename].asm` and put it into the directory you mounted for MASM, e.g. `D:\MASM`. Then, type:

```
masm [filename].asm
```

This only assembles the source into an object file (`[filename].obj`). We then link it into an executable and run it:

```
link [filename].obj
[filename]
```

The output should look like this:

![image](https://hackmd.io/_uploads/H1CJLIemyx.png)

## General Purpose Registers
| **Register** | **Full Name**        | **Purpose**                                        | **Common Instructions**                                           |
|--------------|----------------------|---------------------------------------------------|--------------------------------------------------------------------|
| **AX**       | Accumulator Register | Used for arithmetic operations, I/O instructions | `MOV AX, value` / `ADD AX, BX` / `MUL CX` / `INT 21H`             |
| **BX**       | Base Register        | Used for addressing memory                        | `MOV BX, OFFSET DATA` / `ADD BX, AX` / `MOV [BX], AL`             |
| **CX**       | Count Register       | Loop control and shift count                      | `MOV CX, count` / `LOOP label` / `SHL AX, CL`                     |
| **DX**       | Data Register        | Used for I/O operations, extended multiplication/division | `MOV DX, value` / `DIV BX` / `OUT DX, AL`                     |

---

## Segment Registers
| **Register** | **Full Name**           | **Purpose**                             | **Common Instructions**                                           |
|--------------|-------------------------|-----------------------------------------|--------------------------------------------------------------------|
| **CS**       | Code Segment            | Points to the code being executed       | `ASSUME CS:CODE` / `MOV AX, CS`                                   |
| **DS**       | Data Segment            | Points to the data segment              | `ASSUME DS:DATA` / `MOV DS, AX`                                   |
| **SS**       | Stack Segment           | Points to the stack segment             | `ASSUME SS:STACK` / `MOV SS, AX`                                  |
| **ES**       | Extra Segment           | Extra pointer for string/data operations | `MOV ES, AX` / `MOV ES:[DI], AL`                                  |

---

## Index and Pointer Registers
| **Register** | **Full Name**       | **Purpose**                              | **Common Instructions**                                           |
|--------------|---------------------|------------------------------------------|--------------------------------------------------------------------|
| **SI**       | Source Index        | Source for string operations             | `MOV AL, [SI]` / `LODSB` / `ADD SI, 1`                            |
| **DI**       | Destination Index   | Destination for string operations        | `MOV [DI], AL` / `STOSB` / `ADD DI, 1`                            |
| **BP**       | Base Pointer        | Points to the base of the stack frame    | `MOV BP, SP` / `MOV AX, [BP+4]`                                   |
| **SP**       | Stack Pointer       | Points to the top of the stack           | `PUSH AX` / `POP BX` / `SUB SP, 2`                                |
| **IP**       | Instruction Pointer | Points to the next instruction           | Automatically updated during execution                            |

---

## Special Purpose Registers
| **Register** | **Purpose**                                        | **Common Instructions**                                           |
|--------------|----------------------------------------------------|--------------------------------------------------------------------|
| **FLAGS**    | Holds status flags for comparisons, arithmetic, etc. | `CMP AX, BX` / `JZ label` / `JC label`                            |

---

## Common Instructions

### Data Transfer
- `MOV dest, src` — Transfers data from `src` to `dest`.
- `PUSH reg/mem` — Pushes a register/memory value onto the stack.
- `POP reg/mem` — Pops the top value of the stack into a register/memory.

### Arithmetic
- `ADD reg/mem, value` — Adds a value to a register/memory.
- `SUB reg/mem, value` — Subtracts a value from a register/memory.
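- `MUL reg` — Unsigned multiply. With an 8-bit operand, `AL` × `reg` goes to `AX`; with a 16-bit operand, `AX` × `reg` goes to `DX:AX`.
- `DIV reg` — Unsigned divide. With an 8-bit operand, `AX` ÷ `reg` leaves the quotient in `AL` and the remainder in `AH`; with a 16-bit operand, `DX:AX` ÷ `reg` leaves the quotient in `AX` and the remainder in `DX`.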
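
To make the register pairing concrete, here is a minimal sketch of a 16-bit `MUL` followed by a 16-bit `DIV`. The numbers are arbitrary and chosen so that the product needs `DX:AX` while the quotient still fits in `AX`; nothing is printed, so the values are easiest to check by stepping through with `DEBUG`.

```masm
CODE SEGMENT
ASSUME CS:CODE
START: MOV AX, 1000
       MOV BX, 200
       MUL BX          ; DX:AX = 1000 * 200 = 200000 -> DX = 0003H, AX = 0D40H

       MOV CX, 7
       DIV CX          ; DX:AX / 7 -> AX = 28571 (quotient), DX = 3 (remainder)

       MOV AH, 4CH     ; terminate
       INT 21H
CODE ENDS
     END START
```

Note that `DIV` uses whatever is in `DX` as the high half of the dividend, so when dividing a plain 16-bit value, `DX` should be cleared first.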

### Logic
- `AND reg, value` — Performs a bitwise AND operation.
- `OR reg, value` — Performs a bitwise OR operation.
- `XOR reg, value` — Performs a bitwise XOR operation.
- `NOT reg` — Inverts all bits in a register.

### Control Flow
- `JMP label` — Jumps unconditionally to `label`.
- `JE/JZ label` — Jumps if equal (or zero flag is set).
- `JNE/JNZ label` — Jumps if not equal (or zero flag is not set).
- `LOOP label` — Decrements `CX` and jumps to `label` if `CX` ≠ 0.

### Stack Operations
- `PUSH reg/mem` — Pushes a value onto the stack.
- `POP reg/mem` — Pops the top value from the stack into a register/memory.

### String Operations
- `MOVSB/MOVSW` — Copies a byte/word from `DS:[SI]` to `ES:[DI]`, then increments (or, if `DF = 1`, decrements) both `SI` and `DI`.
- `LODSB` — Loads a byte from `DS:[SI]` into `AL` and advances `SI`.
- `STOSB` — Stores the byte in `AL` at `ES:[DI]` and advances `DI`.

### Input/Output
- `IN AL, port` — Reads a byte from a hardware port into `AL`.
- `OUT port, AL` — Writes a byte from `AL` to a hardware port.

---

## Quick Tips
- **Combine Registers for Extended Precision**:
  - `AX` and `DX` are often used together for extended multiplication/division.
- **Segment Override Prefix**:
  - Use a segment override such as `MOV AX, ES:[DI]` to specify which segment register to use.
- **Flags Usage**:
  - After arithmetic or comparison (`CMP`), jump instructions (`JE`, `JL`, etc.) depend on the `FLAGS` register. 

Type `DEBUG` to start the DOS debugger, which is handy for inspecting a program step by step:

```
R: view registers
D and E: display and edit memory
T: trace (execute a single instruction)
U: unassemble code at a specific memory address
```

## Often Used DOS Functions (ASM)

DOS functions mainly handle I/O operations like displaying and reading data. This [reference](https://www.philadelphia.edu.jo/academics/qhamarsheh/uploads/Lecture%2021%20MS-DOS%20Function%20Calls%20_INT%2021h_.pdf) provides some great examples of the functions we use most often. We start with the basic pattern for calling a DOS function:

```
MOV AH, (function number) ; select the DOS function
INT 21H ; DOS interrupt (executes the function)
```

### Program Termination

```masm
MOV AH, 4CH
INT 21H
```

This terminates the program and returns control to DOS.

### Various Outputs

- **Output String:** `09H`
- **Output Character:** `02H`, `06H` (ASCII)

Writing a letter A to output:

```masm
MOV AH, 02H
MOV DL, 'A'
INT 21H
```

Printing a character **requires loading the character to the DL register.**

Writing a string to output:

```masm
DATA SEGMENT
MSG DB "Hello World", "$"
DATA ENDS
...
MOV AH, 09H
MOV DX, OFFSET MSG
; LEA DX, MSG has the same effect
INT 21H
```

Printing a string **requires loading the offset address of the string to the DX register** (with `DS` pointing to the string's segment), and the string must end with `"$"`. Alternatively, we can use the instruction `LEA DX, MSG` to load the offset.

## Conditionals

There are some important concepts we need to understand before writing conditionals in ASM.

The first concept is the **usage of Flags.** Conditional statements rely heavily on flag registers to determine comparison outputs:

| Flag Name         | Abbreviation | Description                                                                                  |
|-------------------|--------------|----------------------------------------------------------------------------------------------|
| Carry Flag        | `CF`           | Indicates a carry out from the most significant bit in an addition or a borrow in subtraction.|
| Parity Flag       | `PF`           | Set if the number of 1 bits in the least significant byte is even.                           |
| Auxiliary Carry   | `AF`           | Set when there is a carry or borrow between bit 3 and bit 4 in an arithmetic operation.      |
| Zero Flag         | `ZF`           | Set if the result of an operation is zero.                                                   |
| Sign Flag         | `SF`           | Set if the result of an operation is negative (based on the most significant bit).           |
| Overflow Flag     | `OF`           | Set if there is a signed overflow (e.g., adding two positive numbers yields a negative result).|
| Direction Flag    | `DF`           | Controls string operations; if set, decrements pointers, otherwise increments pointers.     |
| Interrupt Flag    | `IF`           | Controls whether interrupts are enabled (`IF = 1`) or disabled (`IF = 0`).                  |

**ASM conditionals are usually composed of two parts: the comparison instruction and the conditional jump.** We usually place `CMP` right before a jump: it subtracts the second operand from the first and sets the flags accordingly, but discards the result, so neither operand changes. A hint whose meaning will become clear later is that loops built with the `LOOP` instruction run in a `do while` fashion.
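
The difference between `CMP` and `SUB` can be seen in a short sketch like the one below. The values are arbitrary, and nothing is printed: the registers and flags are meant to be inspected with the `R` and `T` commands of `DEBUG`.

```masm
CODE SEGMENT
ASSUME CS:CODE
START: MOV AX, 7
       CMP AX, 7       ; computes 7 - 7 internally: ZF = 1, AX is still 7

       MOV BX, 7
       SUB BX, 7       ; same flags (ZF = 1), but BX is now 0

       MOV AH, 4CH     ; terminate
       INT 21H
CODE ENDS
     END START
```

Because `MOV` does not change any flags, the flags set by `CMP` or `SUB` remain valid until the next arithmetic or logic instruction runs.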

We will explain the variants and usages of comparison jumps in the next section.

### Comparison Jumps

Comparison jumps follow the general syntax of:

```
(jump instruction) label
```

**Check if Zero / Equal**

- `JE`: jump if equal
    - jumps to label if comparisons are equal

```masm
CMP AX, BX
JE label ; if AX == BX
```

- `JZ`: jump if zero
    - jumps to label if `ZF == 1`, i.e. the Zero Flag is set
    - **checks whether the previous instruction produced a result of 0**
    - `JE` and `JZ` are two names for the same instruction

```masm
MOV AX, 5
SUB AX, 5
JZ label ; if AX == 0
```

An implementation that uses both follows:

```masm
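DATA SEGMENT
MSG1 DB "Zero detected", 13, 10, "$" ; to print the msg and a new line
; 13 for carriage return
; 10 for line feed
MSG2 DB "A and B are equal", "$"
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV AX, 5
       SUB AX, 5 ; AX = 0, so ZF = 1
       JZ PRINT ; taken because the result is zero
       JMP CHECK ; only reached when the result is not zero

PRINT: MOV DX, OFFSET MSG1
       MOV AH, 09H
       INT 21H

CHECK: MOV AX, 5 ; reload AX, since MOV AH, 09H overwrote its high byte
       MOV BX, 5
       CMP AX, BX
       JE EQUAL ; taken because AX == BX
       JMP DONE

EQUAL: MOV DX, OFFSET MSG2
       MOV AH, 09H
       INT 21H

DONE:  MOV AH, 4CH
       INT 21H

CODE ENDS
     END START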
```

**Check if Not Zero / Equal**

This operation checks for the complement of zero and equality.

- `JNE`: Jump if not equal

```masm
CMP AX, BX
JNE label ; if AX != BX
```

- `JNZ`: Jump if not zero
    - jumps if `ZF == 0`, i.e. the Zero Flag is not set

```masm
MOV AX, 6
SUB AX, 5
JNZ label ; if AX != 0
```

We can extend the implementation by including `JNE` and `JNZ` comparison jumps:

```masm
DATA SEGMENT
MSG1 DB "Zero detected", 13, 10, "$"
MSG2 DB "A and B are equal", 13, 10, "$"
MSG3 DB "Operation doesn't return a zero", 13, 10, "$"
MSG4 DB "A and B are not equal", 13, 10, "$"
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV AX, 5
       SUB AX, 5 ; AX = 0, ZF = 1
       JZ ZERO
       JMP CHECK2
ZERO:  MOV DX, OFFSET MSG1
       MOV AH, 09H
       INT 21H

CHECK2: MOV AX, 0 ; reload AX, since MOV AH, 09H overwrote its high byte
       MOV BX, 0
       CMP AX, BX
       JE EQUAL ; AX == BX
       JMP CHECK3
EQUAL: MOV DX, OFFSET MSG2
       MOV AH, 09H
       INT 21H

CHECK3: MOV AX, 6
       SUB AX, 5 ; AX = 1, ZF = 0
       JNZ NOT_ZERO
       JMP CHECK4
NOT_ZERO: MOV DX, OFFSET MSG3
       MOV AH, 09H
       INT 21H

CHECK4: MOV AX, 1
       MOV BX, 0
       CMP AX, BX
       JNE NOT_EQUAL ; AX != BX
       JMP DONE
NOT_EQUAL: MOV DX, OFFSET MSG4
       MOV AH, 09H
       INT 21H

DONE:  MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

**Jump if Greater**

- `JG`: Jump **if first operand is greater than second operand**

```masm
CMP AX, BX
JG label ; if AX > BX
```

Alternatively, we can use `JNLE`, which stands for jump if not less or equal. `JG`, `JGE`, `JL` and `JLE` treat the operands as signed numbers, so they depend on the Sign Flag (SF), the Overflow Flag (OF), and the Zero Flag (ZF).

**Jump if Greater or Equal**

- `JGE`: Jump if first operand is greater than or equal to second operand

```masm
CMP AX, BX
JGE label ; if AX >= BX
```

**Jump if Less**

- `JL`: Jump if first operand is less than second operand

```masm
CMP AX, BX
JL label ; if AX < BX
```

**Jump if Less or Equal**

- `JLE`: Jump if first operand is less than or equal to second operand

```masm
CMP AX, BX
JLE label ; if AX <= BX
```
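
`JG`, `JGE`, `JL` and `JLE` interpret the operands as signed numbers. The sketch below contrasts `JG` with `JA` (jump if above), its unsigned counterpart, which these notes do not otherwise cover. The labels and messages are made up for the example; with `BX = 0FFFFH`, only the unsigned message should be printed.

```masm
DATA SEGMENT
MSG_S DB "Signed: BX > 1", 13, 10, "$"
MSG_U DB "Unsigned: BX > 1", 13, 10, "$"
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV BX, 0FFFFH      ; -1 when read as signed, 65535 as unsigned
       CMP BX, 1
       JG SIGNED_GT        ; signed view: -1 > 1 is false, not taken
       JMP CHECK_U
SIGNED_GT: MOV DX, OFFSET MSG_S
       MOV AH, 09H
       INT 21H

CHECK_U: CMP BX, 1         ; BX is not changed by the DOS call
       JA UNSIGNED_GT      ; unsigned view: 65535 > 1, taken
       JMP DONE
UNSIGNED_GT: MOV DX, OFFSET MSG_U
       MOV AH, 09H
       INT 21H

DONE:  MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

The same `CMP` serves both jumps; which jump instruction follows decides whether the bit pattern is treated as signed or unsigned.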

**Jump if Carry / Not Carry**

`JC` and `JNC` are used to determine unsigned overflow:

- `JC`: jump if the operation results in a carry
- `JNC`: jump if no carry occurs

```masm
ADD AX, BX
JC label ; jump if adding BX to AX results in a carry (unsigned overflow)
```
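
As a runnable sketch of the carry check, the program below adds 1 to the largest unsigned 16-bit value and prints a message when the carry flag is set. The message text and the `DONE` label are made up for the example.

```masm
DATA SEGMENT
MSG DB "Carry: the sum does not fit in 16 bits", 13, 10, "$"
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV AX, 0FFFFH   ; largest unsigned 16-bit value
       MOV BX, 1
       ADD AX, BX       ; AX wraps around to 0000H and CF = 1
       JNC DONE         ; skip the message when there is no carry

       MOV DX, OFFSET MSG
       MOV AH, 09H
       INT 21H

DONE:  MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

Changing `0FFFFH` to a smaller value such as `0FFFEH` makes the sum fit, so `JNC` is taken and nothing is printed.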

## Loops

Loops built with the `LOOP` instruction operate in a `do-while` fashion: the block of code executes once before the condition is evaluated. We can appreciate C's abstraction of for loops since they take more effort to write in ASM.

### For Loops

Recall that in C we define for loops in this manner:

```c
for (int i = 0; i < 10; i++){
    //do something
}
```

In ASM, we use Count Register (CX) to keep track of the loop. This example of printing the character `*` shows us how to implement it:

```masm
DATA SEGMENT
CHAR DB "*", "$"
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
	   MOV DS, AX
	   
	   MOV CX, 10 ; set loop counter to 10
	   
PRINT: MOV AH, 02H ; invoke DOS function to print character
	   MOV DL, CHAR ; load CHAR
	   INT 21H
	   LOOP PRINT ; execute loop by jumping to PRINT until CX == 0
	   
	   MOV AH, 4CH ; program termination
	   INT 21H
CODE ENDS
	 END START   
```

Alternatively, we can replace `LOOP PRINT` by decrementing CX using `DEC CX` and jumping while `CX != 0` using `JNZ PRINT`.
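
A minimal sketch of that alternative, written as a complete program, could look like this. The character is loaded as an immediate here instead of from a `CHAR` variable, purely to keep the example short.

```masm
CODE SEGMENT
ASSUME CS:CODE
START: MOV CX, 10        ; loop counter

PRINT: MOV DL, "*"
       MOV AH, 02H       ; DOS function: print the character in DL
       INT 21H
       DEC CX            ; CX = CX - 1, sets ZF when CX reaches 0
       JNZ PRINT         ; repeat while CX != 0

       MOV AH, 4CH       ; program termination
       INT 21H
CODE ENDS
     END START
```

One difference worth knowing: `LOOP` leaves the flags untouched, while `DEC` updates them, which is exactly what lets `JNZ` test the counter.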

### While Loops

We provide a simple declaration of while loops in C:

```c
while (x == 1){
    //do something
}
```

Since while loops need a condition to decide whether or not to stop the loop, we can use the comparison jumps explained previously to define conditions. **An extra label is needed to evaluate the condition:**

```masm
DATA SEGMENT
NUM1 DW 5
NUM2 DW 8
STRING1 DB "Decrementing", 13, 10, "$"
STRING2 DB "Now NUM1 == NUM2", "$"
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV CX, NUM1 ; CX rather than AX, because MOV AH, 09H below would overwrite AX
       MOV BX, NUM2

CONDITION: CMP CX, BX
           JE END_LOOP

LOOP1:  DEC BX
        MOV AH, 09H
        MOV DX, OFFSET STRING1
        INT 21H
        JMP CONDITION
		   
END_LOOP: MOV AH, 09H
		  MOV DX, OFFSET STRING2
		  INT 21H
		   
		  MOV AH, 4CH
		  INT 21H
CODE ENDS
	 END START
```

## Little Note on Square Brackets

[This SO question](https://stackoverflow.com/questions/48608423/what-do-square-brackets-mean-in-x86-assembly) explains well how square brackets work in assembly. In brief, we can say that:

```masm
MOV SI, 1234H
MOV BX, [SI]
```

can be understood as a dereferencing operation such as:

```masm
MOV BX, DS:[1234H]
```

We started by copying 1234H into the SI register. In `MOV BX, [SI]`, SI holds the value 1234H, so this operation copies the value stored at address 1234H (`[SI] = [1234H]`) into the BX register. Note that in 16-bit code only BX, BP, SI and DI may appear inside the brackets, which is why SI is used here rather than AX. The explicit `DS:` in the second form makes MASM treat the constant as an address rather than as an immediate value.

We can see similar operations with square brackets, such as copying the value from AX to the address BX points to (`[BX]`):

```masm
MOV [BX], AX
```

Since BX is enclosed in square brackets, the value of the BX register itself remains the same; instead, the value is copied to the **address BX points to:**

```
 | AX : 01234567 |   --no-->   | AX : 01234567 |
 | BX : 00000008 | --change--> | BX : 00000008 |

ADDRESS           VALUE
00000000          6A43210D   ->   6A43210D 
00000004          51C9A847   ->   51C9A847 
00000008          169B87F1 =====> 01234567 
0000000C          C981A517   ->   C981A517 
00000010          9A16D875   ->   9A16D875 
00000014          54C9815F   ->   54C9815F 
```

BX holds the value `00000008`, so the value of AX is copied to the memory at address `00000008`. It is convenient to think of the values stored in registers and the values stored in memory as separate.

### Addressing Memory with Registers

Here we access the address SI points to (`[SI]`), grab the value stored there, and copy it into the BX register. SI is used because 16-bit code only allows BX, BP, SI and DI inside the brackets:

```masm
MOV BX, [SI]
```

Right now we're accessing the value stored in the **address of the memory, given by the value of SI:**

```
 | SI : 00000008 |    ->     | SI : 00000008 |
 | BX : 01234567 |   ====>   | BX : 169B87F1 |

[No change to memory]
ADDRESS           VALUE
00000000          6A43210D
00000004          51C9A847
00000008          169B87F1
0000000C          C981A517
00000010          9A16D875
00000014          54C9815F
```

It is important to decide where the value you want to copy comes from: with square brackets it is the value stored at a memory address, without them it is the value held in the register itself. The same applies to the destination.
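
Both uses of a register, as an address and as a plain value, show up when walking through an array. The following is a minimal sketch that sums four words; the names `VALUES`, `TOTAL` and `NEXT` are made up for the example.

```masm
DATA SEGMENT
VALUES DW 10, 20, 30, 40   ; four 16-bit values
TOTAL  DW 0
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV SI, OFFSET VALUES ; no brackets: SI receives an address
       MOV CX, 4             ; number of elements
       MOV AX, 0             ; running sum

NEXT:  ADD AX, [SI]      ; brackets: add the word stored at address SI
       ADD SI, 2         ; no brackets: move the address to the next word
       LOOP NEXT

       MOV TOTAL, AX     ; TOTAL = 100

       MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

`ADD SI, 2` changes where SI points, while `ADD AX, [SI]` only reads the memory SI points to, so the array itself is never modified.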

## String Operations

### Often Used Registers (SI, DI, CX)

Strings in ASM are stored as a sequence of bytes / words in consecutive memory locations. ASM provides instructions designed to operate on strings directly. They address the string through two registers: the Source Index (SI) and the Destination Index (DI).

SI holds the source address for operations involving **moving data (second operand) and loading data.** It is convenient to think of SI as the source that you want to retrieve a string from.

DI indexes the destination for **storing data, scanning data, and moving data (first operand) operations.** We can also conveniently say that DI points to the destination you want to store data in.

CX remains a frequently used register: since string operations are iterative, we need a counter to keep track of how many elements along the string are left to process.

### Loop Over Data: `REP` and Variants

ASM provides the prefixes `REP`, `REPE`, `REPZ`, `REPNE`, and `REPNZ` to repeat string instructions. A prefix is written on the same line as the string instruction it repeats; the string instructions themselves are explained later, so we only introduce the prefixes briefly here.

**All repeat operations depend on CX to count iterations,** so we need to load the number of iterations we want to repeat into CX register.

- `REP`: Repeat
    - ends if `CX == 0`
- `REPE` / `REPZ`: Repeat while equal / zero
    - ends if `ZF == 0` or `CX == 0`, otherwise loop
- `REPNE` / `REPNZ`: Repeat while not equal / zero
    - ends if `ZF == 1` or `CX == 0`, otherwise loop
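
As a sketch of how the two stop conditions interact, the program below uses `REPE` with `SCASB` (introduced under **Scanning a String** below) to skip leading spaces and print the first non-space character. The label `PADDED` and its contents are made up for the example.

```masm
DATA SEGMENT
PADDED DB "    word"       ; four leading spaces, 8 bytes in total
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA, ES:DATA
START: MOV AX, DATA
       MOV DS, AX
       MOV ES, AX          ; SCASB compares AL with the byte at ES:DI

       CLD                 ; scan forward
       LEA DI, PADDED
       MOV AL, " "
       MOV CX, 8           ; scan at most 8 bytes
       REPE SCASB          ; repeat while the byte equals AL and CX != 0
       ; stops after comparing "w": ZF = 0, CX = 3, DI points just past "w"

       MOV DL, [DI-1]      ; first non-space character (DS and ES are the same segment)
       MOV AH, 02H
       INT 21H             ; prints "w"

       MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

If the buffer held nothing but spaces, the loop would instead stop because `CX` reached 0 with `ZF` still set, so checking `ZF` afterwards tells the two cases apart.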

### Instructions

**Copying a String**

- `MOVSB` / `MOVSW`: Move bytes / words from source to destination

This instruction is particularly useful for copying strings from one source to a destination. We will be using ES (Extra Segment) to declare another empty array with `DB 14 DUP(?)`: 13 bytes for the characters plus one for the `"$"` terminator.

```masm
DATA SEGMENT
SOURCE DB 'Hello, World!', "$"  ; Source string
DATA ENDS

EXTRA SEGMENT
DEST DB 14 DUP(?) ; empty array: 13 characters + "$"
EXTRA ENDS

CODE SEGMENT
    ASSUME CS:CODE, DS:DATA, ES:EXTRA
START:
    MOV AX, DATA
    MOV DS, AX
    MOV AX, EXTRA
    MOV ES, AX

    CLD ; clear direction flag (increment mode)
    LEA SI, SOURCE
    LEA DI, DEST
    MOV CX, 13
    REP MOVSB ; Copy bytes from DS:SI to ES:DI

    MOV BYTE PTR ES:[DI], "$" ; DI is now just past the copy; terminate it in ES

    MOV AX, EXTRA
    MOV DS, AX ; function 09H prints from DS:DX, so point DS at DEST's segment
    MOV DX, OFFSET DEST
    MOV AH, 09H
    INT 21H

    MOV AH, 4CH                ; Terminate program
    INT 21H
CODE ENDS
    END START
```

**Storing Characters into Buffer**

- `STOSB` / `STOSW`: Store a byte / word into memory location

We say that a buffer is a temporary location to hold data in the memory. Here we use an array as the data structure for the buffer. `STOSB` is mainly used to fill a buffer with characters.

- `LODSB` / `LODSW`: Load a byte / word into AL / AX

Once we have stored data into the buffer, we can use this instruction to load the characters back into AL for further use.

```masm
DATA SEGMENT
BUFFER DB 20 DUP(?)
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA, ES:DATA
START: MOV AX, DATA
	   MOV DS, AX
	   MOV ES, AX
	   
	   LEA DI, BUFFER ; point DI to buffer
	   MOV AL, "#"
	   MOV CX, 20
	   
	   CLD
	   REP STOSB ; store character "#" from AL 20 times
	   
	   MOV CX, 20
	   LEA SI, BUFFER ; point SI to buffer
	   
PRINT: LODSB ; load byte at DS:SI into AL and increment SI
	   MOV DL, AL
	   MOV AH, 02H
	   INT 21H
	   LOOP PRINT
	   
	   MOV AH, 4CH
	   INT 21H
CODE ENDS
	 END START
```

**Scanning a String**

- `SCASB` / `SCASW`: Scan a byte / word

This instruction is particularly useful for checking if a character is in a string:

```masm
DATA SEGMENT
CHAR DB "W"
STRING2 DB "Character not found", "$"
STRING3 DB "Character found", "$"
DATA ENDS

EXTRA SEGMENT
STRING1 DB "Hello world!", "$" ; SCASB scans ES:DI, so the string lives in EXTRA
EXTRA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA, ES:EXTRA
START: MOV AX, DATA
       MOV DS, AX
       MOV AX, EXTRA
       MOV ES, AX

       LEA DI, STRING1
       MOV AL, CHAR
       MOV CX, 12 ; number of characters to scan, without the "$"

       CLD
       REPNE SCASB ; stop at the first match (ZF = 1) or when CX reaches 0

       JNZ NOT_FOUND ; ZF = 0: no byte matched (the scan is case-sensitive)

       LEA DX, STRING3
       MOV AH, 09H
       INT 21H
       JMP PROGRAM_END

NOT_FOUND: LEA DX, STRING2
           MOV AH, 09H
           INT 21H

PROGRAM_END: MOV AH, 4CH
             INT 21H
CODE ENDS
     END START
```

## Pointers

Recall that we use pointers to access memory addresses. The `PTR` operator tells the assembler the type, size and location of the data being referenced.

### Size Pointers

These operators indicate the size of the value we are working on, which can be a byte, a word, or a double word:

- `BYTE PTR`
    - explicitly tell the assembler the operation is accessing a byte-sized value (8 bits)

- `WORD PTR`
    - accessing word-sized value (16 bits), such as storing and moving a word in memory.

- `DWORD PTR`
    - double word (32 bits) data
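
A size operator is needed whenever neither operand reveals the size, for example when storing an immediate value through a register. The sketch below shows the effect of `BYTE PTR` and `WORD PTR` on the same array; `CELLS` is a made-up label, and the memory is easiest to inspect with the `D` command of `DEBUG`.

```masm
DATA SEGMENT
CELLS DB 0AAH, 0BBH, 0CCH, 0DDH
DATA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       MOV BX, OFFSET CELLS
       ; "MOV [BX], 0" alone is ambiguous: one byte or two?
       MOV BYTE PTR [BX], 0      ; writes 1 byte,  CELLS = 00 BB CC DD
       MOV WORD PTR [BX+2], 0    ; writes 2 bytes, CELLS = 00 BB 00 00

       MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

When one operand is a register, as in `MOV [BX], AL`, the register already fixes the size and no `PTR` operator is required.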

### Intra / Inter Segment Access Pointers (Subroutines)

Pointers can access memory within the same segment (Intra) or across different segment (Inter). It is typically useful in subroutine jump and call instructions.

- `FAR PTR`
    - access **inter segment** subroutines
    - composed of a segment and offset (address)

```masm
JMP FAR PTR [BX] ; jump to the far address (segment:offset) stored in memory at [BX]
```

- `NEAR PTR`
    - access **intra segment** subroutines
    - just an offset (without segment)

```masm
CALL NEAR PTR [SI] ; call the subroutine whose offset is stored in memory at [SI]
```

## More examples

### Turning a String to Lowercase

```masm
DATA SEGMENT
SOURCE DB "HELLOWORLD"
DATA ENDS

EXTRA SEGMENT
DEST DB 10 DUP(?)
EXTRA ENDS

CODE SEGMENT
ASSUME CS:CODE, DS:DATA, ES:EXTRA
START: MOV AX, DATA
       MOV DS, AX
       MOV AX, EXTRA
       MOV ES, AX

       CLD
       LEA SI, SOURCE
       LEA DI, DEST
       MOV CX, 10

AGAIN: ADD BYTE PTR [SI], 20H
       ; uses SI to point to the current character and adds 20H,
       ; which turns an uppercase ASCII letter into lowercase
       MOVSB ; copy the converted byte from DS:SI to ES:DI
       DEC CX
       JNZ AGAIN

       ; optional: print the copy
       MOV AX, EXTRA
       MOV DS, AX ; LODSB reads from DS:SI, so point DS at DEST's segment
       MOV CX, 10
       LEA SI, DEST
	   
PRINT: LODSB
	   MOV DL, AL
	   MOV AH, 02H
	   INT 21H
	   LOOP PRINT
	   
	   MOV AH, 4CH
	   INT 21H
CODE ENDS
	 END START
```

### Count Frequency of Character in String

```masm
DATA SEGMENT
STRING DB "HELLOWORLD", "$"
RESULT DW 0
DATA ENDS

CODE SEGMENT
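ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
       MOV DS, AX

       LEA SI, STRING
       MOV AL, "L"

       MOV CX, 10 ; number of characters to check
       MOV BX, 0 ; index into the string

CHECK: CMP BYTE PTR [SI+BX], AL
       JNZ SKIP
       INC RESULT ; RESULT ends up as 3 for "HELLOWORLD"

SKIP: INC BX
      DEC CX
      JNZ CHECK ; repeat until all 10 characters are checked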

OVER: MOV AH, 4CH
	  INT 21H
CODE ENDS
	 END START	   
```

## Stack Segment and Related Operations

We begin to discuss Stack Segment (SS) and Stack Pointer (SP) as well as common instructions used. Recall that a stack is a LIFO data structure with the top element, push and pop operations.

- SS points to the segment in memory used for stack
- SP points to the current top of the stack

Note that the stack grows downward: **pushing a word decrements `SP` by 2 bytes, while popping increments it by 2 bytes.**

Here's an example of pushing items 12 and 34 from AX with SP initially pointing to address 2000H.

![image](https://hackmd.io/_uploads/rkIxUy571g.png)

### Pushing and Popping

- `PUSH`: pushes registers, addresses, or values to stack
    - decrements `SP` by 2 bytes

```masm
MOV AX, 25
PUSH AX
```

- `POP`: removes data from the stack
    - increments `SP` by 2 bytes

```masm
POP BX
```
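
The LIFO behaviour can be seen by pushing two registers and popping them in the same order, which swaps their values. This is a minimal sketch; the stack segment name `STACK_SEG` and its size are assumptions made for the example.

```masm
STACK_SEG SEGMENT STACK
      DW 64 DUP(?)       ; 128 bytes reserved for the stack
STACK_SEG ENDS

CODE SEGMENT
ASSUME CS:CODE, SS:STACK_SEG
START: MOV AX, 1111H
       MOV BX, 2222H

       PUSH AX           ; SP = SP - 2, top of stack = 1111H
       PUSH BX           ; SP = SP - 2, top of stack = 2222H
       POP AX            ; AX = 2222H (last in, first out), SP = SP + 2
       POP BX            ; BX = 1111H, SP is back where it started

       MOV AH, 4CH
       INT 21H
CODE ENDS
     END START
```

Declaring the segment with the `STACK` combine type lets DOS set up `SS` and `SP` when the program is loaded, so the code does not have to initialize them.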

### Usages

Stack is generally essential for procedures and subroutines for **saving the return address and passing parameters.**

**Call Subroutine**

When a `CALL` instruction executes to call a subroutine, the return address is pushed onto the stack. When a `RET` instruction executes, the return address gets popped off the stack and used to continue execution.

```masm
CALL PROCEDURE ; calls procedure, pushes return address to stack
...
PROCEDURE: ...
           RET ; return to caller, pops return address from stack
```

**Save Registers within Subroutine**

The stack can be used to save registers at the beginning of a subroutine and restore them before returning, so that the caller's values are not overwritten.

```masm
PROCEDURE: PUSH AX
           PUSH BX
           ...
           POP BX ; restore in reverse order (LIFO)
           POP AX
           RET
```

This saves the values of AX and BX by pushing them onto the stack. The body of the subroutine can then use both registers freely; before returning to the caller, it pops the values back in reverse order, BX first and then AX.

**Passing Parameters onto The Stack**

We can also pass the **parameters of a subroutine** on the stack. The caller pushes the arguments before the `CALL`, and the subroutine reads them at fixed offsets from BP (for example `[BP+4]`) after copying SP into BP.
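
One common way to do this is sketched below, assuming a near call and a convention in which the caller removes the arguments afterwards. The subroutine name `ADD_TWO`, the stack segment name `STACK_SEG`, and the argument values are made up for the example.

```masm
STACK_SEG SEGMENT STACK
      DW 64 DUP(?)       ; 128 bytes reserved for the stack
STACK_SEG ENDS

CODE SEGMENT
ASSUME CS:CODE, SS:STACK_SEG
START: MOV AX, 3
       PUSH AX           ; first argument
       MOV AX, 4
       PUSH AX           ; second argument
       CALL ADD_TWO      ; near call: pushes the 2-byte return address
       ADD SP, 4         ; caller removes the two arguments (2 x 2 bytes)
       ; AX now holds 7

       MOV AH, 4CH
       INT 21H

ADD_TWO: PUSH BP         ; save the caller's BP
       MOV BP, SP        ; [BP] = saved BP, [BP+2] = return address
       MOV AX, [BP+6]    ; first argument (3)
       ADD AX, [BP+4]    ; second argument (4)
       POP BP            ; restore BP before returning
       RET
CODE ENDS
     END START
```

Addressing through BP rather than SP keeps the offsets stable even if the subroutine pushes more values, and memory operands based on BP use the stack segment by default.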

## Shift and Rotate