Marco Paolieri
  • NEW!
    NEW!  Connect Ideas Across Notes
    Save time and share insights. With Paragraph Citation, you can quote others’ work with source info built in. If someone cites your note, you’ll see a card showing where it’s used—bringing notes closer together.
    Got it
      • Create new note
      • Create a note from template
        • Sharing URL Link copied
        • /edit
        • View mode
          • Edit mode
          • View mode
          • Book mode
          • Slide mode
          Edit mode View mode Book mode Slide mode
        • Customize slides
        • Note Permission
        • Read
          • Only me
          • Signed-in users
          • Everyone
          Only me Signed-in users Everyone
        • Write
          • Only me
          • Signed-in users
          • Everyone
          Only me Signed-in users Everyone
        • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invite by email
        Invitee

        This note has no invitees

      • Publish Note

        Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note No publishing access yet

        Your note will be visible on your profile and discoverable by anyone.
        Your note is now live.
        This note is visible on your profile and discoverable online.
        Everyone on the web can find and read all notes of this public team.

        Your account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

        Your team account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

        Explore these features while you wait
        Complete general settings
        Bookmark and like published notes
        Write a few more notes
        Complete general settings
        Write a few more notes
        See published notes
        Unpublish note
        Please check the box to agree to the Community Guidelines.
        View profile
      • Commenting
        Permission
        Disabled Forbidden Owners Signed-in users Everyone
      • Enable
      • Permission
        • Forbidden
        • Owners
        • Signed-in users
        • Everyone
      • Suggest edit
        Permission
        Disabled Forbidden Owners Signed-in users Everyone
      • Enable
      • Permission
        • Forbidden
        • Owners
        • Signed-in users
      • Emoji Reply
      • Enable
      • Versions and GitHub Sync
      • Note settings
      • Note Insights New
      • Engagement control
      • Make a copy
      • Transfer ownership
      • Delete this note
      • Save as template
      • Insert from template
      • Import from
        • Dropbox
        • Google Drive
        • Gist
        • Clipboard
      • Export to
        • Dropbox
        • Google Drive
        • Gist
      • Download
        • Markdown
        • HTML
        • Raw HTML
    Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
    Create Create new note Create a note from template
    Menu
    Options
    Engagement control Make a copy Transfer ownership Delete this note
    Import from
    Dropbox Google Drive Gist Clipboard
    Export to
    Dropbox Google Drive Gist
    Download
    Markdown HTML Raw HTML
    Back
    Sharing URL Link copied
    /edit
    View mode
    • Edit mode
    • View mode
    • Book mode
    • Slide mode
    Edit mode View mode Book mode Slide mode
    Customize slides
    Note Permission
    Read
    Only me
    • Only me
    • Signed-in users
    • Everyone
    Only me Signed-in users Everyone
    Write
    Only me
    • Only me
    • Signed-in users
    • Everyone
    Only me Signed-in users Everyone
    Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note No publishing access yet

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.

    Your account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

    Your team account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

    Explore these features while you wait
    Complete general settings
    Bookmark and like published notes
    Write a few more notes
    Complete general settings
    Write a few more notes
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    8
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    <style> h2 { counter-increment: h2; } h2:before { content: counter(h2) ". " } .markdown-body { font-family: -apple-system, BlinkMacSystemFont, "SFNS Display", "Roboto", "Helvetica Neue", Helvetica, Arial, sans-serif !important; } </style> # CS356: A short guide to x86-64 assembly I'll try to improve this guide over time. The goal is to collect the essential knowledge about x86-64 assembly and examples of very common patterns (`if`, `while`, `for`, procedure calls) in BombLab. ## The registers We have plenty of registers inside the CPU; they can store 8 bytes each and they are called: - `%rax`, `%rbx`, `%rcx`, `%rdx` (the "a, b, c, d") - `%rsi`, `%rdi` (source index, destination index) - `%rsp`, `%rbp` (stack pointer, base pointer) - `%r8`, `%r9`, .., `%r15` (just numbers here) We can also operate on the least significant 4, 2, or 1 bytes of these registers. To do that, we need to use different names: - `%eax`, `%ebx`, .., `%r8d`, `%r9d`, .., `%r15d` (least significant 4 bytes) - `%ax`, `%bx`, .., `%r8w`, `%r9w`, .., `%r15w` (least significant 2 bytes) - `%al`, `%bl`, .. (but `%sil` and `%dil`), `%r9b`, .., `%r15b` (least significant byte) We also need to use instructions (e.g., `movX` or `addX`) where the suffix `X` (one of `q`, `l`, `w`, `b`) matches the size of the register: ![](https://i.imgur.com/nHMUcng.png) When we operate on just a portion of the register, the rest doesn't change (e.g., `movw %ax, %bx` moves data only into the least significant 2 bytes of `%rbx`). An important exception is that of instructions that end with the `l` suffix: they operate on the least significant 4 bytes but also set the most significant 4 bytes to 0. For example, `movl %eax, %ebx` moves the least significant 4 bytes of `%rax` into the least significant 4 bytes of `%rbx` but also sets the rest of `%rbx` to 0. ## Data Movement and Addressing Modes We can move data using `movb`, `movw`, `movl`, `movq`. - To **move a constant into a register**: `movq $42,%rax`. Note that the constant has a `$` in front; it can also be specified in hex as `$0x2a`. The destination size has to match the suffix (`movq` for `%rax`, `movl` for `%eax`, `movw` for `%ax`, `movb` for `%al`). - To **move a constant into a memory address**: `movq $42,0xFF112233` . Note that the address does *not* have the `$` in front. It is the initial address: the number of bytes modified in memory depends on the suffix (8 bytes with `movq`, 4 with `movl`, 2 with `movw`, 1 with `movb`). - To **move data from one register to another**: `movq %rax,%rbx`. Note that both sizes match the suffix `q` (8 bytes). There are also instructions `movzbw`, `movzbl`, `movzbq`, `movzwl`, `movzwq` and `movsbw`, `movsbl`, `movsbq`, `movswl`, `movswq`, `movslq` to zero/sign extend from one size to another (this is frequent when you cast from one type to another in C, e.g., `int x = 10; long y = (long)x;` will use `movslq` to sign-extend `x` into `y`); `cltq` is a shortcut for `movslq %eax,%rax`. - To **move data from a register to memory**: `movq %rax,0xFF112233`. - To **move data from memory to a register**: `movq 0xFF112233,%rax`. (It's not possible to move data from memory to memory directly; we need to read into a register first, then save to a memory address.) In the examples above, we used the constant `0xFF112233` to specify a memory address. Very often, it's useful to use some form of **indirect addressing**: - `movq %rax,(%rbx)`: moves the 8 bytes of `%rax` to the memory address given by `%rbx` (beware: the register `%rbx` is not changed, the change happens in memory at the address given by `%rbx`). This addressing mode is very common when we use pointers in C: ``` trojan@cs356:~$ cat test.c void f(int *p) { *p = 2; // writes 2 at the address given by p } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: movl $0x2,(%rdi) 0x0000000000000006 <+6>: retq End of assembler dump. ``` - `movq %rax,4(%rbx)`: moves the 8 bytes of `%rax` to the memory address given by `%rbx + 4`. This is common when using pointers to structs in C: ``` trojan@cs356:~$ cat test.c struct Data { char a; int b; }; void f(struct Data *p) { p->b = 2; // writes 2 at the address of the struct plus offset of b } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: movl $0x2,0x4(%rdi) 0x0000000000000007 <+7>: retq End of assembler dump. ``` - `movq %rax,(%rbx,%rcx,4)`: moves the 8 bytes of `%rax` to the memory address given by `%rbx + 4 * %rcx` (the multiplier must be one of 1, 2, 4, 8). This is common when using arrays in C: ``` trojan@cs356:~$ cat test.c void f(int *p, int i) { p[i] = 2; // writes 2 at address p + i * (size of each int) } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: movslq %esi,%rsi 0x0000000000000003 <+3>: movl $0x2,(%rdi,%rsi,4) 0x000000000000000a <+10>: retq End of assembler dump. ``` - `movq %rax,8(%rbx,%rcx,4)` combines all the previous, writing to `8 + %rbx + 4 * %rcx`. ## Arithmetic Operations Arithmetic operations can use the same addressing modes, register names, constants as data movements. The following examples show the variants operating on 8 bytes (suffix `q`), but 4-byte, 2-byte and 1-byte variants (suffixes `l`, `w`, `b`) are also available (similarly to data movement, variants that end with `l` set the most significant 4 bytes to 0). - **Increment**: `incq %rax` is equivalent to `rax++` in C - **Decrement**: `decq %rax` is equivalent to `rax--` in C - **Negation**: `negq %rax` is equivalent to `rax = -rax` in C - **Bitwise Negation**: `notq %rax` is equivalent to `rax = ~rax` in C - **Addition**: `addq %rax,%rbx` is equivalent to `rbx += rax` in C ("add `rax` to `rbx`") - **Subtraction**: `subq %rax,%rbx` is equivalent to `rbx -= rax` ("subtract `rax` from `rbx`") - **Bitwise AND**: `andq %rax,%rbx` is equivalent to `rbx &= rax` in C - **Bitwise OR**: `orq %rax,%rbx` is equivalent to `rbx |= rax` in C - **Bitwise XOR**: `xorq %rax,%rbx` is equivalent to `rbx ^= rax` in C - **Arithmetic Shift**: - `salq %cl,%rax` is equivalent to `rax = rax << cl` in C when `rax` is signed (this shifts in 0's from the right; a constant can be used too, e.g., `salq $2,%rax`) - `sarq %cl,%rax` is equivalent to `rax = rax >> cl` in C when `rax` is signed (this replicates the sign bit from the left; a constant can be used too, e.g., `sarq $2,%rax`) - **Logical Shift**: - `shlq %cl,%rax` is equivalent to `rax = rax << cl` in C when `rax` is unsigned (this fills in 0's from the right; a constant can be used too, e.g., `shlq $2,%rax`) - `shrq %cl,%rax` is equivalent to `rax = rax >> cl` in C when `rax` is unsigned (this fills in 0's from the left; a constant can be used too, e.g., `shrq $2,%rax`) - **Multiplication**: - `imulq %rax,%rbx` is equivalent to `rbx *= rax` (can be used for signed or unsigned, but keeps only the least significant 64 bits of the 128-bit result). - `imulq %rbx` multiplies `%rax` by `%rbx` and stores the result in `%rdx` (most-significant 64 bits) and `%rax` (least-significant 64 bits). The most-significant ones (`%rdx`) are correct only for signed multiplication. Note the implicit assumptions on the use of `%rax` for one of the inputs and `%rdx`:`%rax` for the output. - `mulq %rbx` is like `imulq %rbx` but for the most-significant bits saved in `%rdx` are correct for unsigned multiplication. - **Division**: - `idivq %rbx` computes the signed division of `%rdx`:`%rax` (concatenation of two 64-bit registers) by `%rbx`. The quotient is stored in `%rax` and the remainder in `%rdx`. Similarly, `idivl %ebx` divides "`%edx`:`%eax`" by `%ebx` and stores quotient in `%eax` and remainder in `%edx`; or, `idivw %bx` divides "`%dx`:`%ax`" by `%bx` and stores quotient in `%ax` and remainder in `%dx`. - Very often, we just want to divide a single 64-bit signed integer with `idivq`, but the instruction expects the 128-bit input `%rdx`:`%rax`; in that case, we can use the instruction `cqto` (no inputs) to replicate the sign bit of `%rax` all over `%rdx` before using `idivq`. And for `idivl`, we can use `cltd` to replicate the sign bit of `%eax` into `%edx`; for `idivw`, we can use `cwtd` to replicate the sign bit of `%ax` into `%dx`. - `divq %rbx`is the same but for *unsigned* division of `%rdx`:`%rax` by `%rbx`, storing quotient in `%rax` and reminder in `%rdx` (unsigned division gives a different result than `idivq`). If we only have a register as input (instead of `%rdx`:`%rax`), we can set `%rdx` to zero with `movq $0,%rdx` or `xorq %rdx,%rdx`. - **Load Effective Address**: `leaq 10(%rax,%rbx,4),%rcx` saves `10 + %rax + %rbx * 4` into `%rcx`. This is quite useful to combine simple arithmetic operations into one, or to compute a memory address and store it in a register `%rcx` for later use. Only the `leaq` / `leal` variants exist in x86-64 and they are used quite frequently by the compiler: ``` trojan@cs356:~$ cat test.c int f(int x) { return 9*x + 5; } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: lea 0x5(%rdi,%rdi,8),%eax 0x0000000000000004 <+4>: retq End of assembler dump. ``` ## The FLAGS Register There is also another important register in the CPU, the `FLAGS` register. This register is not modified directly using assembly instructions; instead, the CPU updates the `FLAGS` register **automatically after every operation**, to keep track of special true/false "conditions" associated with its different bits (the "flags"): - `ZF` (zero flag): This flag is set to 1 if the last operation produced a result equal to 0. For example, if `%ax` is `0xFFFF`, then after `addw $1,%ax` we will see `ZF == 1` because `%rax` is equal to `???? ???? ???? 0000`(even when the rest of `%rax` is nonzero). - `SF` (sign flag): This flag is set to 1 if the last operation produced a result with sign bit equal to 1. For example, if `%ax` is `0x7FFF`, then after `addw $1,%ax` we will see `SF == 1` because `%rax` is equal to `???? ???? ???? 8000` (even when the rest of `%rax` is zero). - `OF` (overflow flag): This flag is set to 1 if the last operation produced signed overflow. In the previous example, `%ax` is `0x7FFF` (the largest 2-byte positive integer in 2's complement); after `addw $1,%ax`, the result is `???? ???? ???? 8000` (note that the carry does not propagate out of `%ax`). Since `%ax` is now `0x8000` (negative), this is a case of signed overflow (`p+p=n`). - `OF` (carry flag): This flag is set to 1 if the last operation produced unsigned overflow. If `%ax` is `0xFFFF` (the largest 2-byte unsigned integer), after `addw $1,%ax`, the result is `???? ???? ???? 0000` (again, note that the carry does not propagate out of `%ax`). This is an unsigned overflow because we added 1 to `%ax` (unsigned) and obtained something smaller. The rules above apply to most arithmetic operations, but **there are some exceptions**: - Condition flags are not modified at all by: - Data movement instructions (all `mov` variants) - Load effective address (`leaq` and `leal`) - Bitwise negation `notq` (and `l`, `w`, `b` variants) - `incq` and `decq` (and their `l`, `w`, `b` variants) leave `CF` unchanged, even in case of unsigned overflow (e.g., from `0xFFFF` to `ox0000`). - Bitwise `orq`, `andq`, `xorq` (and `l`, `w`, `b` variants), always set `OF` and `CF` to 0. - Shift operations (`salq`/`shlq`, `sarq`/`shrq` and `l`, `w`, `b` variants), set `CF` to the last bit shifted out, while `OF` is undefined (shifts > 1) or it is modified by 1-bit shifts as: `0` for `sarq`, `MSB(input)` for `shrq`, `MSB(result)^CF` for left shifts (i.e., it's `1` if the MSB changed due to the shift). In addition, these instructions (and their `l`, `w`, `b` variants) **update the FLAGS register without modifying any registers**: - `cmpq %rax,%rbx` sets the flags like `subq %rax,%rbx`, but doesn't modify `%rbx`. - `testq %rax,%rbx` sets the flags like `andq %rax,%rbx`, but doesn't modify `%rbx`. ## Conditional and Unconditional Jumps Why updating all these flags, all the time? To use them for conditional jumps: - `jmp 0xFF112233` in an unconditional jump, which replaces the content of the instruction pointer register `%rip` with the address `0xFF112233`. The CPU will fetch the next instruction at this address, so this is indeed a "jump." - `je 0xFF112233` ("jump if equal") jumps only if `ZF == 1`. - When the instruction before `je` is a comparison like `cmpq %rax,%rbx`, `ZF` is 1 at `je` only if the difference `%rbx - %rax` is zero, i.e., they are equal. - When the instruction before `je` is a same-register test like `testq %rax,%rax`, `ZF` is 1 at `je` only if `%rax` is zero (since `x & x == x`) - `jl 0xFF112233` ("jump if lower") jumps only if `(SF == 1) ^ (OF == 1)`. - When the instruction before `jl` is a comparison like `cmpq %rax,%rbx`, `SF` is 1 at `jl` only if the difference `%rbx - %rax` is negative, i.e., `%rbx` is lower than `%rax`. The second part `^ (OF == 1)` of the condition accounts for signed overflow in the subtraction performed by `cmpq` (in the case, the output sign flag `SF` is wrong, so the XOR flips it). - When the instruction before `jl` is a same-register test like `testq %rax,%rax`, `SF` is 1 at `jl` only if `%rax` is negative (since `x & x == x`, `SF` is the sign flag of `%rax`). You get the idea... `cmpq` or `testq` or other arithmetic instructions produce a change in the `FLAGS` register and then conditional jumps use the flags to decide whether to take the jump or not. Similarly, there are `jle` ("jump if lower or equal"), `jg` ("jump if greater"), `jge` ("jump if greater or equal"), `jne` ("jump if not equal"). There are also the variants: `jb` ("jump if below"), `jbe` ("jump if below or equal"), `ja` ("jump if above"), `jae` ("jump if above or equal"). These account for **unsigned overflow** instead of signed overflow; GCC will emit them if your C code uses, for example, `unsigned int` instead of `int` inside an `if` guard. The instruction `js` ("jump if signed") is equivalent to `jl` or `jb` when it follows `testq %rax,%rbx`: it means "jump if `%rax` is negative" in all these cases, because `testq` always sets the overflow flags `OF` and `CF` to 0. But how can you figure out the meaning of these "compare + jump" combos without thinking about `ZF`, `SF`, `OF`, `CF` flags all the time? It's easy: - `cmp x,y` + `jOP` jumps if `y OP x` (e.g., `cmpq %rax,%rbx` + `jl` jumps if `rbx < rax`). - `test x,x` + `jOP` jumps if `x OP 0` (e.g., `testq %rax,%rax` + `jl` jumps if `rax < 0`). ## Common Patterns ### `if` statement ``` trojan@cs356:~$ cat test.c int f(int x, int y) { if (x < y) return 5; else return 10; } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: cmp %esi,%edi 0x0000000000000002 <+2>: jge 0xa <f+10> 0x0000000000000004 <+4>: mov $0x5,%eax 0x0000000000000009 <+9>: retq 0x000000000000000a <+10>: mov $0xa,%eax 0x000000000000000f <+15>: retq End of assembler dump. trojan@cs356:~$ ``` Here: - GDB is omitting some instruction suffixes (`mov` instead of `movl`, `cmp` instead of `cmpl`) for brevity; but you would need those in your `.s` file to compile it with GCC. - `x` is saved into `%edi`, `y` is saved into `%esi`, the return value is in `%eax`. - `cmp %esi,%edi` + `jge` takes the jump if `%edi >= %esi`, i.e., `x >= y`. If that's the case, we go to `f+10`, where we have `mov $0xa,%eax` and `retq` (returning 10). - If we don't jump, we just move to the next instruction; that happens when `%edi >= %esi`, i.e., `x < y` (the reverse of the previous condition). In this case, we return 5. ### `for` loop ``` trojan@cs356:~$ cat test.c int f(int n) { int total = 0; for (int i=1; i <= n; i++) { total += i; } return total; } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: mov $0x1,%edx 0x0000000000000005 <+5>: mov $0x0,%eax 0x000000000000000a <+10>: cmp %edi,%edx 0x000000000000000c <+12>: jg 0x15 <f+21> 0x000000000000000e <+14>: add %edx,%eax 0x0000000000000010 <+16>: add $0x1,%edx 0x0000000000000013 <+19>: jmp 0xa <f+10> 0x0000000000000015 <+21>: retq End of assembler dump. ``` The C code is adding up all integers from 1 to `n`. But how do we figure that out from the assembly? - As usual, the input `n` is in `%edi`, while the output is returned in `%eax`. - `mov $0x1,%edx` stores 1 into `%edx`; we don't know what this is at this point, but we take note. - `mov $0x0,%eax` stores 0 in `%eax`; sounds like the output value is initialized to 0. - `cmp %edi,%edx` + `jg` jumps if `%edx` (1 at this point) is greater than `%edi` (which is `n`). If we jump, we go to `f+21`, where we simply return what's in `%rax`. If we don't jump, we execute the following instructions until `jmp 0xa`, which will take us to `f+10`, where `cmp` is; in other words, we execute the body of the loop and then we check the condition again. - `add %edx,%eax` adds `%edx` to `%eax`: remember, `%eax` is our result while `%edx` is used in the test `edx > edi`, which makes us jump to `f+21` and return (currently, `%edx` is still 1). - `add $0x1,%edx` adds 1 to `%edx`. - `jmp 0xa` jumps to `f+10` (unconditionally), where we check again whether `edx > edi`. ### `while` loop ``` trojan@cs356:~$ cat test.c int gcd(int a, int b) { // Euclid of Alexandria, 300 BC int tmp; while (b != 0) { tmp = b; b = a % b; a = tmp; } return a; } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas gcd' test.o Dump of assembler code for function gcd: 0x0000000000000000 <+0>: mov %edi,%eax 0x0000000000000002 <+2>: test %esi,%esi 0x0000000000000004 <+4>: je 0xf <gcd+15> 0x0000000000000006 <+6>: cltd 0x0000000000000007 <+7>: idiv %esi 0x0000000000000009 <+9>: mov %esi,%eax 0x000000000000000b <+11>: mov %edx,%esi 0x000000000000000d <+13>: jmp 0x2 <gcd+2> 0x000000000000000f <+15>: retq End of assembler dump. ``` Greatest common divisor, a great algorithm indeed: - As usual, our inputs are in `%edi` (`a`) and `%esi` (`b`); return value in `%eax`. - `mov %edi,%eax` saves `%edi` (`a`) into `%eax` (the return value). - `test %esi,%esi` + `je` jumps to `gcd+15` when `%esi` (`b`) is 0; at `+15`, we just return. Otherwise, we execute the following instructions until `jmp` at `+13` takes us back to this test. - `cltd` replicates the sign bit of `%eax` into all the bits of `%edx`; then, `idiv %esi` divides `%edx`:`%eax` (64 bits) by `%esi` (32 bits) storing quotient in `%eax` and remainder in `%edx`. - `mov %esi,%eax` saves `%esi` (currently `b`) into `%eax` (where we had saved `a`). - `mov %edx,%esi` saves `%edx` (the remainder of the previous division) into `%esi` (currently `b`). - `jmp 0x2` takes us back to the beginning of the loop. So, yeah, the logic of Euclid's algorithm is all there. But it seems somehow easy to say that, because we have the C code as well. If we only have the assembly, it helps to "rewrite it" in more understandable pseudocode. For example: ``` <+0>: eax = edi <+2>: if esi == 0, go to +15 <+6>: edx = replicas of sign bit of eax <+7>: edx = edx_eax % esi eax = edx_eax / esi <+9>: eax = esi <+11>: esi = edx <+13>: go to +2 <+15>: return eax ``` Now, we notice the `go to` pattern, which we can eliminate (for [Dijkstra's sake](https://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.pdf)!): ``` <+0>: eax = edi <+2>: while (esi != 0) { <+6>: edx = replicas of sign bit of eax <+7>: edx = edx_eax % esi eax = edx_eax / esi <+9>: eax = esi <+11>: esi = edx <+13>: } <+15>: return eax ``` We also notice that we're only using the modulo of the division; so, we can simplify the code as: ``` <+0>: eax = edi // result = a <+2>: while (esi != 0) { // while (b != 0) <+7>: edx = edx_eax % esi // tmp = result % b <+9>: eax = esi // result = b <+11>: esi = edx // b = tmp <+13>: } <+15>: return eax // return result ``` ### `for` loop over an array ``` trojan@cs356:~$ cat test.c int f(int *a, int n) { int total = 0; for (int i = 0; i < n; i++) { total += a[i]; } return total; } trojan@cs356:~$ gcc -Og -c test.c trojan@cs356:~$ gdb -batch -ex 'disas f' test.o Dump of assembler code for function f: 0x0000000000000000 <+0>: mov $0x0,%edx 0x0000000000000005 <+5>: mov $0x0,%eax 0x000000000000000a <+10>: cmp %esi,%edx 0x000000000000000c <+12>: jge 0x19 <f+25> 0x000000000000000e <+14>: movslq %edx,%rcx 0x0000000000000011 <+17>: add (%rdi,%rcx,4),%eax 0x0000000000000014 <+20>: add $0x1,%edx 0x0000000000000017 <+23>: jmp 0xa <f+10> 0x0000000000000019 <+25>: retq End of assembler dump. ``` This is just a variant of the `for` loop we've seen, but it also uses the addressing mode `(a,i,4)` to get element $i$ (counting from 0) in an array where each element has size 4, i.e., at byte $a+i\times 4$. - Again, `%rdi` is the first input argument (`a`), `%edi` the second (`n`), `%eax` the return value. - `mov $0x0,%edx` stores 0 into `%edx` (Our counter? We don't know for sure.) - `mov $0x0,%eax` stores 0 into `%eax` (Likely, our result.) - `cmp %esi,%edx` + `jge` jumps to `f+25` when `edx >= esi`; at `f+25`, we just return. Otherwise, we execute the following instructions until `jmp` at `f+23` takes us back to this `cmp`. - `movslq %edx,%rcx` extends `%edx` (our counter?) into 64 bits, saving the result in `%rcx`. - `add (%rdi,%rcx,4),%eax` adds the 4 bytes at `%rdi + %rcx * 4` to `%eax`. - `add $0x1,%edx` increments `%edx` by 1 (now we know it's a counter!) - Then `jmp` goes back to the beginning, to check whether now `edx >= esi` (i.e., `i >= n`); note that **we stay in the for loop if the opposite condition is true** (`i < n`). ### `sscanf` to parse a string This is a BombLab :bomb: classic... most phases work like this: ``` trojan@cs356:~$ cat test.c #include <stdio.h> int phase(char *input) { int x; int y; char str[20]; int count = sscanf(input, "%d %x %[^\n]", &x, &y, str); if (count == 3) { printf("%d %d %s\n", x, y, str); // prints 42 10 test return 0; // means success } else { return 1; // means error/failure } } int main() { return phase("42 a test"); } trojan@cs356:~$ gcc -Og test.c -o test trojan@cs356:~$ gdb -batch -ex 'disas phase' test Dump of assembler code for function phase: 0x0000000000001145 <+0>: sub $0x28,%rsp 0x0000000000001149 <+4>: lea 0x18(%rsp),%rcx 0x000000000000114e <+9>: lea 0x1c(%rsp),%rdx 0x0000000000001153 <+14>: mov %rsp,%r8 0x0000000000001156 <+17>: lea 0xea7(%rip),%rsi # 0x2004 0x000000000000115d <+24>: mov $0x0,%eax 0x0000000000001162 <+29>: callq 0x1040 <__isoc99_sscanf@plt> 0x0000000000001167 <+34>: cmp $0x3,%eax 0x000000000000116a <+37>: je 0x1176 <phase+49> 0x000000000000116c <+39>: mov $0x1,%eax 0x0000000000001171 <+44>: add $0x28,%rsp 0x0000000000001175 <+48>: retq 0x0000000000001176 <+49>: mov %rsp,%rcx 0x0000000000001179 <+52>: mov 0x18(%rsp),%edx 0x000000000000117d <+56>: mov 0x1c(%rsp),%esi 0x0000000000001181 <+60>: lea 0xe88(%rip),%rdi # 0x2010 0x0000000000001188 <+67>: mov $0x0,%eax 0x000000000000118d <+72>: callq 0x1030 <printf@plt> 0x0000000000001192 <+77>: mov $0x0,%eax 0x0000000000001197 <+82>: jmp 0x1171 <phase+44> End of assembler dump. ``` First of all, what is this C program doing? - The function `phase` receives an `input` string (an array of `char`, with `0x00` to mark the end of the string); in assembly, `%rdi` will contain the address of the first character of the string. - It allocates three local variables: `x` (4 bytes), `y` (4 bytes), `str` (20 bytes). These variables are allocated in memory, on the stack, not in some register. - It calls `sscanf` passing: a pointer to the beginning of the input string to parse (`%rdi`); a pointer to the format string `%d %x %[^\n]` (`%rsi`), meaning that an integer in decimal format, one in hex format and a string (without newlines) should be parsed; the addresses `&x`, `&y`, `str` where the parsed values should be saved (`%rdx`, `%rcx`, `r8`). - It checks the return value of `sscanf`: if all three arguments are parsed correctly, the return argument is going to be 3 and the phase returns 0 (no error); otherwise, it returns 1 (error). Now, let's look at the assembly, step-by-step: - `sub $0x28,%rsp` decreases the stack pointer `%rsp` by 40 to make space for local variables. - `lea 0x18(%rsp),%rcx` saves the address of `y` into `%rcx` (the choice of where to place `y` in the allocated space is arbitrary). - `lea 0x1c(%rsp),%rdx` saves the address of `x` into `%rdx`. - `mov %rsp,%r8` saves the address of `str` into `%r8`. - `lea 0xea7(%rip),%rsi` stores the address of the format string (a constant relative to `%rip`) into `%rsi`. - `mov $0x0,%eax` / `callq 0x1040 <__isoc99_sscanf@plt>` saves 0 into `%rax` and calls `sscanf`. - `cmp $0x3,%eax` + `je` jumps to `+49` if `sscanf` has returned 3 into `%rax`; otherwise, continues to the next instruction. - `mov $0x1,%eax` saves 1 into `%eax` (the return value). - `<+44>: add $0x28,%rsp` increases the stack pointer by `0x28` to deallocate the memory that was initially allocated. - `retq` returns. - `<+49>: mov %rsp,%rcx`: we get here if the number of parsed arguments was right; this instruction saves the address of `str` into `%rcx` (4th argument to `printf` later). - `mov 0x18(%rsp),%edx` saves the value of `y` into `%edx` (3rd argument to `printf`). - `mov 0x1c(%rsp),%esi` saves the value of `x` into `%rsi` (2nd argument to `printf`). - `lea 0xe88(%rip),%rdi` saves the address of the format string into `%rdi` (1st argument to `printf`). - `mov $0x0,%eax` / `callq 0x1030 <printf@plt>` saves 0 into `%rax` and calls `printf`. - `mov $0x0,%eax` saves 0 into `%rax` (our return value, no error). - `jmp 0x1171 <phase+44>` jumps to `+44`, where it dellocates memory from the stack and returns.

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password
    or
    Sign in via Google Sign in via Facebook Sign in via X(Twitter) Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    By signing in, you agree to our terms of service.

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully