# Assembly Crash Course Write-up ## Level 1: Set register ``` In this level, you will be working with registers. You will be asked to modify or read from registers. In this level, you will work with registers! Please set the following: rdi = 0x1337 ``` Script: ```asm .intel_syntax noprefix _start: mov rdi, 0x1337 ``` run script: as -o asm.o asm.S && objcopy -O binary --only-section=.text ./asm.o ./asm.bin && cat ./asm.bin | /challenge/run Flag: pwn.college{gczBhQlAVRh2IiUGZqfoo3QHyd3.0FN5EDLwMzN1czW} ## Level 2: Set multiple register ``` In this level, you will be working with registers. You will be asked to modify or read from registers. In this level, you will work with multiple registers. Please set the following: rax = 0x1337 r12 = 0xCAFED00D1337BEEF rsp = 0x31337 ``` Script: ```asm .intel_syntax noprefix _start: mov rax, 0x1337 mov r12, 0xCAFED00D1337BEEF mov rsp, 0x31337 ``` Flag: pwn.college{0snN7_ogePcOilz_25bc8d_f2dM.dBTM4MDLwMzN1czW} ## Level 3: Add to register ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. Many instructions exist in x86 that allow you to perform all the normal math operations on registers and memory. For shorthand, when we say A += B, it really means A = A + B. Here are some useful instructions: add reg1, reg2 <=> reg1 += reg2 sub reg1, reg2 <=> reg1 -= reg2 imul reg1, reg2 <=> reg1 *= reg2 div is more complicated, and we will discuss it later. Note: all regX can be replaced by a constant or memory location. 
Do the following: Add 0x331337 to rdi ``` Script: ```asm .intel_syntax noprefix _start: add rdi, 0x331337 ``` Flag: pwn.college{wlnHEC9FU-nZbDjYjL9Sc0l-CJe.0VN5EDLwMzN1czW} ## Level 4: Linear equation registers ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. Using your new knowledge, please compute the following: f(x) = mx + b, where: m = rdi x = rsi b = rdx Place the result into rax. Note: There is an important difference between mul (unsigned multiply) and imul (signed multiply) in terms of which registers are used. Look at the documentation on these instructions to see the difference. In this case, you will want to use imul. ``` Scipt: ```asm .intel_syntax noprefix _start: imul rdi, rsi mov rax, rdi add rax, rdx ``` Flag: pwn.college{gpuk8Vta3E7MV1LCa-7mLoU2uq3.0lN5EDLwMzN1czW} ## Level 5: Integer division ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result, which is usually rax. Division in x86 is more special than in normal math. Math here is called integer math, meaning every value is a whole number. As an example: 10 / 3 = 3 in integer math. Why? Because 3.33 is rounded down to an integer. 
The relevant instructions for this level are: mov rax, reg1 div reg2 Note: div is a special instruction that can divide a 128-bit dividend by a 64-bit divisor while storing both the quotient and the remainder, using only one register as an operand. How does this complex div instruction work and operate on a 128-bit dividend (which is twice as large as a register)? For the instruction div reg, the following happens: rax = rdx:rax / reg rdx = remainder rdx:rax means that rdx will be the upper 64-bits of the 128-bit dividend and rax will be the lower 64-bits of the 128-bit dividend. You must be careful about what is in rdx and rax before you call div. Please compute the following: speed = distance / time, where: distance = rdi time = rsi speed = rax Note that distance will be at most a 64-bit value, so rdx should be 0 when dividing. ``` Scipt: ```asm .intel_syntax noprefix _start: mov rax, rdi div rsi ``` Flag: pwn.college{kHEbUKaJyWn9knnOz8mNFdQzQKA.01N5EDLwMzN1czW} ## Level 6: Modulo operation ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform a formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. Modulo in assembly is another interesting concept! x86 allows you to get the remainder after a div operation. For instance: 10 / 3 results in a remainder of 1. The remainder is the same as modulo, which is also called the "mod" operator. In most programming languages, we refer to mod with the symbol %. Please compute the following: rdi % rsi Place the value in rax. 
``` Script: ```asm .intel_syntax noprefix _start: mov rax, rdi div rsi mov rax, rdx ``` Flag: pwn.college{Ul19wsAz3Np0ddgHCREM74mDLET.0FO5EDLwMzN1czW} ## Level 7: Set upper byte ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result, which is typically in rax. Another cool concept in x86 is the ability to independently access the lower register bytes. Each register in x86_64 is 64 bits in size, and in the previous levels, we have accessed the full register using rax, rdi, or rsi. We can also access the lower bytes of each register using different register names. For example, the lower 32 bits of rax can be accessed using eax, the lower 16 bits using ax, and the lower 8 bits using al. MSB LSB +----------------------------------------+ | rax | +--------------------+-------------------+ | eax | +---------+---------+ | ax | +----+----+ | ah | al | +----+----+ Lower register bytes access is applicable to almost all registers. Using only one move instruction, please set the upper 8 bits of the ax register to 0x42. ``` Scipt: ```asm .intel_syntax noprefix _start: mov ah, 0x42 ``` Flag: pwn.college{47LzubGMRVn18rrLOVPwyUCiH1c.dFTM4MDLwMzN1czW} ## Level 8: Efficient modulo ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. It turns out that using the div operator to compute the modulo operation is slow! 
We can use a math trick to optimize the modulo operator (%). Compilers use this trick a lot. If we have x % y, and y is a power of 2, such as 2^n, the result will be the lower n bits of x. Therefore, we can use the lower register byte access to efficiently implement modulo! Using only the following instruction(s): mov Please compute the following: rax = rdi % 256 rbx = rsi % 65536 ``` Script: ```asm .intel_syntax noprefix _start: mov al, dil mov bx, si ``` 256 is 2^8 and 8 lower byte of rdi is dil 65536 is 2^16 and 16 lower byte of rsi is si Flag: pwn.college{YpxSdoDgSvaytH3pzUWctG_St--.0VO5EDLwMzN1czW} ## Level 9: Byte extraction ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with bit logic and operations. This will involve heavy use of directly interacting with bits stored in a register or memory location. You will also likely need to make use of the logic instructions in x86: and, or, not, xor. Shifting bits around in assembly is another interesting concept! x86 allows you to 'shift' bits around in a register. Take, for instance, al, the lowest 8 bits of rax. The value in al (in bits) is: rax = 10001010 If we shift once to the left using the shl instruction: shl al, 1 The new value is: al = 00010100 Everything shifted to the left, and the highest bit fell off while a new 0 was added to the right side. You can use this to do special things to the bits you care about. Shifting has the nice side effect of doing quick multiplication (by 2) or division (by 2), and can also be used to compute modulo. 
Here are the important instructions: shl reg1, reg2 <=> Shift reg1 left by the amount in reg2 shr reg1, reg2 <=> Shift reg1 right by the amount in reg2 Note: 'reg2' can be replaced by a constant or memory location. Using only the following instructions: mov, shr, shl Please perform the following: Set rax to the 5th least significant byte of rdi. For example: rdi = | B7 | B6 | B5 | B4 | B3 | B2 | B1 | B0 | Set rax to the value of B4 ``` Script: ```asm .intel_syntax noprefix _start: shr rdi, 32 mov al, dil ``` rdi have 64 byte so B4 is 8 bytes lower of upper 32 bytes, so i shr rdi for 32 bytes and take 8 lower byte of it Flag: pwn.college{oZbdZa4NSNi-rcTe6Srrwe7WKZS.0FMwIDLwMzN1czW} ## Level 10: Bitwise and ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with bit logic and operations. This will involve heavy use of directly interacting with bits stored in a register or memory location. You will also likely need to make use of the logic instructions in x86: and, or, not, xor. Bitwise logic in assembly is yet another interesting concept! x86 allows you to perform logic operations bit by bit on registers. For the sake of this example, say registers only store 8 bits. The values in rax and rbx are: rax = 10101010 rbx = 00110011 If we were to perform a bitwise AND of rax and rbx using the and rax, rbx instruction, the result would be calculated by ANDing each bit pair one by one, hence why it's called bitwise logic. So from left to right: 1 AND 0 = 0 0 AND 0 = 0 1 AND 1 = 1 0 AND 1 = 0 ... 
Finally, we combine the results together to get: rax = 00100010 Here are some truth tables for reference: AND A | B | X ---+---+--- 0 | 0 | 0 0 | 1 | 0 1 | 0 | 0 1 | 1 | 1 OR A | B | X ---+---+--- 0 | 0 | 0 0 | 1 | 1 1 | 0 | 1 1 | 1 | 1 XOR A | B | X ---+---+--- 0 | 0 | 0 0 | 1 | 1 1 | 0 | 1 1 | 1 | 0 Without using the following instructions: mov, xchg, please perform the following: Set rax to the value of (rdi AND rsi) ``` Script: ```asm .intel_syntax noprefix _start: AND rax, 0 OR rax, rdi AND rax, rsi ``` Explain: rax & 0 => rax = 0 rax | rdi = 0 | rdi = rdi rax & rsi = rdi & rsi Flag: pwn.college{IyRPIqjxHS03WZtS1eBX7J2kvRS.0VMwIDLwMzN1czW} ## Level 11: Check even ``` In this level, you will be working with registers. You will be asked to modify or read from registers. We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it is rax. In this level, you will be working with bit logic and operations. This will involve heavy use of directly interacting with bits stored in a register or memory location. You will also likely need to make use of the logic instructions in x86: and, or, not, xor. Using only the following instructions: and or xor Implement the following logic: if x is even then y = 1 else y = 0 Where: x = rdi y = rax ``` Script: ```asm .intel_syntax noprefix _start: AND rdi, 1 XOR rdi, 1 AND rax, 0 OR rax, rdi ``` Explain: - if rdi is odd - rdi & 1 = 1 - rdi ^ 1 = 0 - rax & 0 = 0 - rax | rdi = 0 - if rdi is even - rdi & 1 = 0 - rdi ^ 1 = 1 - rax & 0 = 0 - rax | rdi = 1 Flag: pwn.college{Mh4CHByCpXnTdGBnfjSgQ-0NPNd.0lMwIDLwMzN1czW} ## Level 12: Memory read ``` We will now set some values in memory dynamically before each run. On each run, the values will change. 
This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. Up until now, you have worked with registers as the only way for storing things, essentially variables such as 'x' in math. However, we can also store bytes into memory! Recall that memory can be addressed, and each address contains something at that location. Note that this is similar to addresses in real life! As an example: the real address '699 S Mill Ave, Tempe, AZ 85281' maps to the 'ASU Brickyard'. We would also say it points to 'ASU Brickyard'. We can represent this like: ['699 S Mill Ave, Tempe, AZ 85281'] = 'ASU Brickyard' The address is special because it is unique. But that also does not mean other addresses can't point to the same thing (as someone can have multiple houses). Memory is exactly the same! For instance, the address in memory where your code is stored (when we take it from you) is 0x400000. In x86, we can access the thing at a memory location, called dereferencing, like so: mov rax, [some_address] <=> Moves the thing at 'some_address' into rax This also works with things in registers: mov rax, [rdi] <=> Moves the thing stored at the address of what rdi holds to rax This works the same for writing to memory: mov [rax], rdi <=> Moves rdi to the address of what rax holds. So if rax was 0xdeadbeef, then rdi would get stored at the address 0xdeadbeef: [0xdeadbeef] = rdi Note: Memory is linear, and in x86_64, it goes from 0 to 0xffffffffffffffff (yes, huge). Please perform the following: Place the value stored at 0x404000 into rax. 
Make sure the value in rax is the original value stored at 0x404000. ``` Script: ```asm .intel_syntax noprefix _start: mov rbx, 0x404000 mov rax, [rbx] ``` Flag: pwn.college{oIaEhXMeAwTuX6ThFUG3KUt-UPs.dJTM4MDLwMzN1czW} ## Level 13: Memory write ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. Please perform the following: Place the value stored in rax to 0x404000. ``` Script: ```asm .intel_syntax noprefix _start: mov rbx, 0x404000 mov [rbx], rax ``` Flag: pwn.college{c7y_gSbOjNOlBHLEbJ-fpm0cIdA.dNTM4MDLwMzN1czW} ## Level 14: Memory increment ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. Please perform the following: Place the value stored at 0x404000 into rax. Increment the value stored at the address 0x404000 by 0x1337. 
Make sure the value in rax is the original value stored at 0x404000 and make sure that [0x404000] now has the incremented value. ``` Script: ```asm .intel_syntax noprefix _start: mov rbx, 0x404000 mov rax, [rbx] mov rdi, 0x1337 add [rbx], rdi ``` Flag: pwn.college{sJkdOKzMFJXk_oJj5kapUNvQ1pD.01MwIDLwMzN1czW} ## Level 15: Byte access ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. Recall that registers in x86_64 are 64 bits wide, meaning they can store 64 bits. Similarly, each memory location can be treated as a 64-bit value. We refer to something that is 64 bits (8 bytes) as a quad word. Here is the breakdown of the names of memory sizes: Quad Word = 8 Bytes = 64 bits Double Word = 4 bytes = 32 bits Word = 2 bytes = 16 bits Byte = 1 byte = 8 bits In x86_64, you can access each of these sizes when dereferencing an address, just like using bigger or smaller register accesses: mov al, [address] <=> moves the least significant byte from address to rax mov ax, [address] <=> moves the least significant word from address to rax mov eax, [address] <=> moves the least significant double word from address to rax mov rax, [address] <=> moves the full quad word from address to rax Remember that moving into al does not fully clear the upper bytes. Please perform the following: Set rax to the byte at 0x404000. 
``` Script: ```asm .intel_syntax noprefix _start: mov rbx, 0x404000 mov al, [rbx] ``` Flag: pwn.college{Q9wXssURDLnY6MwZcNkST5Tsilr.dRTM4MDLwMzN1czW} ## Level 16: Memory size access ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, refer to the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. Recall the following: The breakdown of the names of memory sizes: Quad Word = 8 Bytes = 64 bits Double Word = 4 bytes = 32 bits Word = 2 bytes = 16 bits Byte = 1 byte = 8 bits In x86_64, you can access each of these sizes when dereferencing an address, just like using bigger or smaller register accesses: mov al, [address] <=> moves the least significant byte from address to rax mov ax, [address] <=> moves the least significant word from address to rax mov eax, [address] <=> moves the least significant double word from address to rax mov rax, [address] <=> moves the full quad word from address to rax Please perform the following: Set rax to the byte at 0x404000 Set rbx to the word at 0x404000 Set rcx to the double word at 0x404000 Set rdx to the quad word at 0x404000 ``` Script: ```asm .intel_syntax noprefix _start: mov al, [0x404000] mov bx, [0x404000] mov ecx, [0x404000] mov rdx, [0x404000] ``` Flag: pwn.college{UkTiKgyoQXCHwyzc-qY52OOpi__.0FNwIDLwMzN1czW} ## Level 17: Little endian write ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. 
We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. It is worth noting, as you may have noticed, that values are stored in reverse order of how we represent them. As an example, say: [0x1330] = 0x00000000deadc0de If you examined how it actually looked in memory, you would see: [0x1330] = 0xde [0x1331] = 0xc0 [0x1332] = 0xad [0x1333] = 0xde [0x1334] = 0x00 [0x1335] = 0x00 [0x1336] = 0x00 [0x1337] = 0x00 This format of storing things in 'reverse' is intentional in x86, and it's called "Little Endian". For this challenge, we will give you two addresses created dynamically each run. The first address will be placed in rdi. The second will be placed in rsi. Using the earlier mentioned info, perform the following: Set [rdi] = 0xdeadbeef00001337 Set [rsi] = 0xc0ffee0000 Hint: it may require some tricks to assign a big constant to a dereferenced register. Try setting a register to the constant value, then assigning that register to the dereferenced register. ``` Script: ```asm .intel_syntax noprefix _start: mov rax, 0xdeadbeef00001337 mov [rdi], rax mov rax, 0xc0ffee0000 mov [rsi], rax ``` Flag: pwn.college{EUtEVZV7W-gjZ_57u3KvHUYavCz.0VNwIDLwMzN1czW} ## Level 18: Memory sum ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it’s rax. In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. 
If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use. Recall that memory is stored linearly. What does that mean? Say we access the quad word at 0x1337: [0x1337] = 0x00000000deadbeef The real way memory is laid out is byte by byte, little endian: [0x1337] = 0xef [0x1337 + 1] = 0xbe [0x1337 + 2] = 0xad ... [0x1337 + 7] = 0x00 What does this do for us? Well, it means that we can access things next to each other using offsets, similar to what was shown above. Say you want the 5th byte from an address, you can access it like: mov al, [address+4] Remember, offsets start at 0. Perform the following: Load two consecutive quad words from the address stored in rdi. Calculate the sum of the previous steps' quad words. Store the sum at the address in rsi. ``` Script: ```asm .intel_syntax noprefix _start: mov rax, [rdi] mov rbx, [rdi + 8] add rax, rbx mov [rsi], rax ``` Flag: pwn.college{gDPujomJxC_H9x9zTfUOjDrReTe.0lNwIDLwMzN1czW} ## Level 19: Stack subtraction ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with the stack, the memory region that dynamically expands and shrinks. You will be required to read and write to the stack, which may require you to use the pop and push instructions. You may also need to use the stack pointer register (rsp) to know where the stack is pointing. In these levels, we are going to introduce the stack. The stack is a region of memory that can store values for later. To store a value on the stack, we use the push instruction, and to retrieve a value, we use pop. 
The stack is a last in, first out (LIFO) memory structure, and this means the last value pushed is the first value popped. Imagine unloading plates from the dishwasher. Let's say there are 1 red, 1 green, and 1 blue. First, we place the red one in the cabinet, then the green on top of the red, then the blue. Our stack of plates would look like: Top ----> Blue Green Bottom -> Red Now, if we wanted a plate to make a sandwich, we would retrieve the top plate from the stack, which would be the blue one that was last into the cabinet, ergo the first one out. On x86, the pop instruction will take the value from the top of the stack and put it into a register. Similarly, the push instruction will take the value in a register and push it onto the top of the stack. Using these instructions, take the top value of the stack, subtract rdi from it, then put it back. ``` Script: ```asm .intel_syntax noprefix _start: pop rax sub rax, rdi push rax ``` Flag: pwn.college{IWxbh_hNLk0kz4ycCzU7SQMNQJP.01NwIDLwMzN1czW} ## Level 20: Swap stack values ``` We will now set some values in memory dynamically before each run. On each run the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with the stack, the memory region that dynamically expands and shrinks. You will be required to read and write to the stack, which may require you to use the pop and push instructions. You may also need to use the stack pointer register (rsp) to know where the stack is pointing. In this level, we are going to explore the last in first out (LIFO) property of the stack. Using only the following instructions: push pop Swap values in rdi and rsi. 
Example: If to start rdi = 2 and rsi = 5 Then to end rdi = 5 and rsi = 2 ``` Script: ```asm .intel_syntax noprefix _start: push rdi push rsi pop rdi pop rsi ``` Flag: pwn.college{Aqk08TZOSEM8oY1_46-LbonLDp5.0FOwIDLwMzN1czW} ## Level 21: Average stack values ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with the stack, the memory region that dynamically expands and shrinks. You will be required to read and write to the stack, which may require you to use the pop and push instructions. You may also need to use the stack pointer register (rsp) to know where the stack is pointing. In the previous levels, you used push and pop to store and load data from the stack. However, you can also access the stack directly using the stack pointer. On x86, the stack pointer is stored in the special register, rsp. rsp always stores the memory address of the top of the stack, i.e., the memory address of the last value pushed. Similar to the memory levels, we can use [rsp] to access the value at the memory address in rsp. Without using pop, please calculate the average of 4 consecutive quad words stored on the stack. Push the average on the stack. Hint: RSP+0x?? Quad Word A RSP+0x?? Quad Word B RSP+0x?? Quad Word C RSP Quad Word D ``` Script: ```asm .intel_syntax noprefix _start: mov rax, [rsp] add rax, [rsp + 16] add rax, [rsp + 8] add rax, [rsp + 24] mov rbx, 4 div rbx push rax ``` Flag: pwn.college{gWUyKoLJ5fVslePcbDKGEHQw0HX.0VOwIDLwMzN1czW} ## Level 22: Absolute jump ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. 
We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with control flow manipulation. This involves using instructions to both indirectly and directly control the special register rip, the instruction pointer. You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior. Earlier, you learned how to manipulate data in a pseudo-control way, but x86 gives us actual instructions to manipulate control flow directly. There are two major ways to manipulate control flow: Through a jump Through a call In this level, you will work with jumps. There are two types of jumps: Unconditional jumps Conditional jumps Unconditional jumps always trigger and are not based on the results of earlier instructions. As you know, memory locations can store data and instructions. Your code will be stored at 0x400042 (this will change each run). For all jumps, there are three types: Relative jumps: jump + or - the next instruction. Absolute jumps: jump to a specific address. Indirect jumps: jump to the memory address specified in a register. In x86, absolute jumps (jump to a specific address) are accomplished by first putting the target address in a register reg, then doing jmp reg. In this level, we will ask you to do an absolute jump. Perform the following: Jump to the absolute address 0x403000. ``` Script: ```asm .intel_syntax noprefix _start: mov rax, 0x403000 jmp rax ``` Flag: pwn.college{EtXgOFeiECAc9taQdovn0R5BYJS.dVTM4MDLwMzN1czW} ## Level 23: Relative jump ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. In this level, you will be working with control flow manipulation. 
This involves using instructions to both indirectly and directly control the special register rip, the instruction pointer. You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior. Recall that for all jumps, there are three types: Relative jumps Absolute jumps Indirect jumps In this level, we will ask you to do a relative jump. You will need to fill space in your code with something to make this relative jump possible. We suggest using the nop instruction. It's 1 byte long and very predictable. In fact, the assembler that we're using has a handy .rept directive that you can use to repeat assembly instructions some number of times: GNU Assembler Manual Useful instructions for this level: jmp (reg1 | addr | offset) nop Hint: For the relative jump, look up how to use labels in x86. Using the above knowledge, perform the following: Make the first instruction in your code a jmp. Make that jmp a relative jump to 0x51 bytes from the current position. At the code location where the relative jump will redirect control flow, set rax to 0x1. ``` Script: ```asm .intel_syntax noprefix _start: jmp solve .rept 0x51 nop .endr solve: mov rax, 0x1 ``` Explain: I don't know why i can't use **times 0x51 nop** to nop 0x51 time so I search on the internet and use another way to jump 0x51 nop Flag: pwn.college{EPhjhQZ-s8MCw1s9VW7WJhGVvhZ.dZTM4MDLwMzN1czW} The second solution: ```asm .intel_syntax noprefix _start: jmp $+0x53 .rept 0x51 nop .endr mov rax, 0x1 ``` It's jmp + offset. So why it's 0x53 instead of 0x51 ? Because you need to add more 2 bytes for action jmp ## Level 24: Jump trampoline ``` We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax. 

In this level, you will be working with control flow manipulation. This involves using instructions to both indirectly and directly control the special register rip, the instruction pointer. You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior.

Now, we will combine the two prior levels and perform the following:

Create a two jump trampoline:
  Make the first instruction in your code a jmp.
  Make that jmp a relative jump to 0x51 bytes from its current position.
  At 0x51, write the following code:
    Place the top value on the stack into register rdi.
    jmp to the absolute address 0x403000.
```

Script:

```asm
.intel_syntax noprefix
_start:
    jmp solve
    .rept 0x51
    nop
    .endr
solve:
    pop rdi
    mov rax, 0x403000
    jmp rax
```

Flag: pwn.college{svMqi3LLy4MKLn6FYowGUxuHIqJ.0FMxIDLwMzN1czW}

## Level 25: Conditional jump

```
In this level, you will be working with control flow manipulation. This involves using instructions to both indirectly and directly control the special register rip, the instruction pointer. You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior.

We will be testing your code multiple times in this level with dynamic values! This means we will be running your code in a variety of random ways to verify that the logic is robust enough to survive normal use.

We will now introduce you to conditional jumps--one of the most valuable instructions in x86. In higher-level programming languages, an if-else structure exists to do things like:

  if x is even:
    is_even = 1
  else:
    is_even = 0

This should look familiar since it is implementable in only bit-logic, which you've done in a prior level. In these structures, we can control the program's control flow based on dynamic values provided to the program.

Implementing the above logic with jmps can be done like so:

  ; assume rdi = x, rax is output
  ; rdx = rdi mod 2
  mov rax, rdi
  mov rsi, 2
  div rsi
  ; remainder is 0 if even
  cmp rdx, 0
  ; jump to not_even code if it's not 0
  jne not_even
  ; fall through to even code
  mov rbx, 1
  jmp done
  ; jump to this only when not_even
  not_even:
  mov rbx, 0
  done:
  mov rax, rbx
  ; more instructions here

Often though, you want more than just a single 'if-else'. Sometimes you want two if checks, followed by an else. To do this, you need to make sure that you have control flow that 'falls-through' to the next if after it fails. All must jump to the same done after execution to avoid the else.

There are many jump types in x86, it will help to learn how they can be used. Nearly all of them rely on something called the ZF, the Zero Flag. The ZF is set to 1 when a cmp is equal, 0 otherwise.

Using the above knowledge, implement the following:

  if [x] is 0x7f454c46:
    y = [x+4] + [x+8] + [x+12]
  else if [x] is 0x00005A4D:
    y = [x+4] - [x+8] - [x+12]
  else:
    y = [x+4] * [x+8] * [x+12]

Where:
  x = rdi, y = rax.

Assume each dereferenced value is a signed dword. This means the values can start as a negative value at each memory position.

A valid solution will use the following at least once:
  jmp (any variant), cmp
```

Script:

```asm
.intel_syntax noprefix
_start:
    mov eax, [rdi]
    cmp eax, 0x7f454c46
    je first
    cmp eax, 0x00005A4D
    je second
    mov eax, [rdi + 4]
    imul eax, [rdi + 8]
    imul eax, [rdi + 12]
    jmp end
first:
    mov eax, [rdi + 4]
    add eax, [rdi + 8]
    add eax, [rdi + 12]
    jmp end
second:
    mov eax, [rdi + 4]
    sub eax, [rdi + 8]
    sub eax, [rdi + 12]
    jmp end
end:
```

Flag: pwn.college{EGX7CciwbIY05jysJ2Jnk3XiAqY.0VMxIDLwMzN1czW}

## Level 26: Indirect jump

```
In this level, you will work with control flow manipulation. This involves using instructions to indirectly and directly control the special register rip, the instruction pointer.
You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior.

We will be testing your code multiple times in this level with dynamic values! This means we will run your code in various random ways to verify that the logic is robust enough to survive normal use.

The last jump type is the indirect jump, often used for switch statements in the real world. Switch statements are a special case of if-statements that use only numbers to determine where the control flow will go.

Here is an example:

  switch(number):
    0: jmp do_thing_0
    1: jmp do_thing_1
    2: jmp do_thing_2
    default: jmp do_default_thing

The switch in this example works on number, which can either be 0, 1, or 2. If number is not one of those numbers, the default triggers. You can consider this a reduced else-if type structure. In x86, you are already used to using numbers, so it should be no surprise that you can make if statements based on something being an exact number. Additionally, if you know the range of the numbers, a switch statement works very well.

Take, for instance, the existence of a jump table. A jump table is a contiguous section of memory that holds addresses of places to jump.

In the above example, the jump table could look like:

  [0x1337] = address of do_thing_0
  [0x1337+0x8] = address of do_thing_1
  [0x1337+0x10] = address of do_thing_2
  [0x1337+0x18] = address of do_default_thing

Using the jump table, we can greatly reduce the amount of cmps we use. Now all we need to check is if number is greater than 2. If it is, always do:

  jmp [0x1337+0x18]

Otherwise:

  jmp [jump_table_address + number * 8]

Using the above knowledge, implement the following logic:

  if rdi is 0:
    jmp 0x40301e
  else if rdi is 1:
    jmp 0x4030da
  else if rdi is 2:
    jmp 0x4031d5
  else if rdi is 3:
    jmp 0x403268
  else:
    jmp 0x40332c

Please do the above with the following constraints:
  Assume rdi will NOT be negative.
  Use no more than 1 cmp instruction.
  Use no more than 3 jumps (of any variant).
  We will provide you with the number to 'switch' on in rdi.
  We will provide you with a jump table base address in rsi.

Here is an example table:

  [0x40427c] = 0x40301e (addrs will change)
  [0x404284] = 0x4030da
  [0x40428c] = 0x4031d5
  [0x404294] = 0x403268
  [0x40429c] = 0x40332c
```

Script:

```asm
.intel_syntax noprefix
_start:
    cmp rdi, 3
    jle solve
    jmp [rsi + 32]
solve:
    jmp [rsi + rdi * 8]
end:
```

Flag: pwn.college{MYDOkgUupgKkJmuOShY6ZuwJ4w8.0lMxIDLwMzN1czW}

## Level 27: Average loop

```
We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax.

In this level, you will be working with control flow manipulation. This involves using instructions to both indirectly and directly control the special register rip, the instruction pointer. You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior.

In a previous level, you computed the average of 4 integer quad words, which was a fixed amount of things to compute. But how do you work with sizes you get when the program is running?

In most programming languages, a structure exists called the for-loop, which allows you to execute a set of instructions for a bounded amount of times. The bounded amount can be either known before or during the program's run, with "during" meaning the value is given to you dynamically.

As an example, a for-loop can be used to compute the sum of the numbers 1 to n:

  sum = 0
  i = 1
  while i <= n:
    sum += i
    i += 1

Please compute the average of n consecutive quad words, where:
  rdi = memory address of the 1st quad word
  rsi = n (amount to loop for)
  rax = average computed
```

Script:

```asm
.intel_syntax noprefix
_start:
    mov rbx, 0                  # i = 0
    mov rax, 0                  # sum = 0
    cmp rsi, rbx
    je end                      # n == 0: nothing to average, rax stays 0
solve:
    add rax, [rdi + rbx * 8]    # sum += quad word i
    add rbx, 1
    cmp rsi, rbx
    jne solve
    cqo                         # sign-extend rax into rdx:rax, the dividend idiv uses
    idiv rbx                    # rax = sum / n
end:
```

Flag: pwn.college{MgmNbEVVyHMkzJL7SUDYKyy4qTN.01MxIDLwMzN1czW}

## Level 28: Count non zero

```
In this level, you will be working with control flow manipulation. This involves using instructions to both indirectly and directly control the special register rip, the instruction pointer. You will use instructions such as jmp, call, cmp, and their alternatives to implement the requested behavior.

We will be testing your code multiple times in this level with dynamic values! This means we will be running your code in a variety of random ways to verify that the logic is robust enough to survive normal use.

In previous levels, you discovered the for-loop to iterate for a number of times, both dynamically and statically known, but what happens when you want to iterate until you meet a condition? A second loop structure exists called the while-loop to fill this demand. In the while-loop, you iterate until a condition is met.

As an example, say we had a location in memory with adjacent numbers and we wanted to get the average of all the numbers until we find one bigger or equal to 0xff:

  average = 0
  i = 0
  while x[i] < 0xff:
    average += x[i]
    i += 1
  average /= i

Using the above knowledge, please perform the following:

Count the consecutive non-zero bytes in a contiguous region of memory, where:
  rdi = memory address of the 1st byte
  rax = number of consecutive non-zero bytes

Additionally, if rdi = 0, then set rax = 0 (we will check)!

An example test-case, let:
  rdi = 0x1000
  [0x1000] = 0x41
  [0x1001] = 0x42
  [0x1002] = 0x43
  [0x1003] = 0x00

Then: rax = 3 should be set.
```

Script:

```asm
.intel_syntax noprefix
_start:
    xor rax, rax                # count = 0
    cmp rdi, 0
    je end                      # rdi == 0: return 0
solve:
    cmp byte ptr [rdi], 0       # test a single byte, not a quad word
    je end
    inc rax
    inc rdi
    jmp solve
end:
```

Flag: pwn.college{kaJoEseUmXMg182rY5r2nW3hCNF.0FNxIDLwMzN1czW}

## Level 29: String Lower

```
We will be testing your code multiple times in this level with dynamic values! This means we will be running your code in a variety of random ways to verify that the logic is robust enough to survive normal use.

In this level, you will be working with functions! This will involve manipulating the instruction pointer (rip), as well as doing harder tasks than normal. You may be asked to use the stack to store values or call functions that we provide you.

In previous levels, you implemented a while loop to count the number of consecutive non-zero bytes in a contiguous region of memory.

In this level, you will be provided with a contiguous region of memory again and will loop over each performing a conditional operation till a zero byte is reached. All of which will be contained in a function!

A function is a callable segment of code that does not destroy control flow. Functions use the instructions "call" and "ret".

The "call" instruction pushes the memory address of the next instruction onto the stack and then jumps to the value stored in the first argument.

Let's use the following instructions as an example:

  0x1021 mov rax, 0x400000
  0x1028 call rax
  0x102a mov [rsi], rax

  call pushes 0x102a, the address of the next instruction, onto the stack.
  call jumps to 0x400000, the value stored in rax.

The "ret" instruction is the opposite of "call". ret pops the top value off of the stack and jumps to it.

Let's use the following instructions and stack as an example:

                            Stack ADDR    VALUE
  0x103f mov rax, rdx       RSP + 0x8     0xdeadbeef
  0x1042 ret                RSP + 0x0     0x0000102a

Here, ret will jump to 0x102a.

Please implement the following logic:

  str_lower(src_addr):
    i = 0
    if src_addr != 0:
      while [src_addr] != 0x00:
        if [src_addr] <= 0x5a:
          [src_addr] = foo([src_addr])
          i += 1
        src_addr += 1
    return i

foo is provided at 0x403000. foo takes a single argument as a value and returns a value.

All functions (foo and str_lower) must follow the Linux amd64 calling convention (also known as System V AMD64 ABI): System V AMD64 ABI

Therefore, your function str_lower should look for src_addr in rdi and place the function return in rax.

An important note is that src_addr is an address in memory (where the string is located) and [src_addr] refers to the byte that exists at src_addr.

Therefore, the function foo accepts a byte as its first argument and returns a byte.
```

Script:

```asm
.intel_syntax noprefix
_start:
str_lower:
    push rbx                    # rbx is callee-saved, so preserve it
    mov rbx, 0                  # i = 0
    cmp rdi, 0
    jz end                      # src_addr == 0: return 0
_while:
    mov rax, 0
    mov al, byte ptr [rdi]      # rax = current byte, zero-extended
    cmp al, 0x00
    jz end                      # stop at the terminating zero byte
    cmp al, 0x5a
    jbe solve                   # unsigned compare: byte <= 0x5a
    inc rdi
    jmp _while
solve:
    inc rbx                     # i += 1
    push rdi                    # foo may clobber rdi, so save the pointer
    mov rdi, rax                # first argument: the byte
    mov rax, 0x403000           # address of foo
    call rax                    # al = foo(byte)
    pop rdi
    mov byte ptr [rdi], al      # write the result back
    inc rdi                     # src_addr += 1
    jmp _while
end:
    mov rax, rbx                # return i
    pop rbx
    ret
```

I was stuck on this level for nearly 2 days just because I forgot the **ret** at the end of the function. This challenge requires learning how to use the [**Calling Convention**](https://cs61.seas.harvard.edu/site/2018/Asm2/). It is also worth reading about the [**System call**](https://cs61.seas.harvard.edu/site/2019/Section3/) convention, because it is nearly the same as the **Calling Convention**.

Flag: pwn.college{c6Rhq1nFtXsYjTJU2zOAHfvPhPg.0VNxIDLwMzN1czW}

## Level 30: Most Common Byte

```
We will be testing your code multiple times in this level with dynamic values!
This means we will be running your code in a variety of random ways to verify that the logic is robust enough to survive normal use.

In this level, you will be working with functions! This will involve manipulating the instruction pointer (rip), as well as doing harder tasks than normal. You may be asked to use the stack to store values or call functions that we provide you.

In the previous level, you learned how to make your first function and how to call other functions. Now we will work with functions that have a function stack frame.

A function stack frame is a set of pointers and values pushed onto the stack to save things for later use and allocate space on the stack for function variables.

First, let's talk about the special register rbp, the Stack Base Pointer.

The rbp register is used to tell where our stack frame first started. As an example, say we want to construct some list (a contiguous space of memory) that is only used in our function. The list is 5 elements long, and each element is a dword. A list of 5 elements would already take 5 registers, so instead, we can make space on the stack!

The assembly would look like:

  ; setup the base of the stack as the current top
  mov rbp, rsp
  ; move the stack 0x14 bytes (5 * 4) down
  ; acts as an allocation
  sub rsp, 0x14
  ; assign list[2] = 1337
  mov eax, 1337
  mov [rbp-0x8], eax
  ; do more operations on the list ...
  ; restore the allocated space
  mov rsp, rbp
  ret

Notice how rbp is always used to restore the stack to where it originally was. If we don't restore the stack after use, we will eventually run out. In addition, notice how we subtracted from rsp, because the stack grows down. To make the stack have more space, we subtract the space we need. The ret and call still work the same.

Once again, please make function(s) that implement the following:

  most_common_byte(src_addr, size):
    i = 0
    while i <= size-1:
      curr_byte = [src_addr + i]
      [stack_base - curr_byte] += 1
      i += 1

    b = 0
    max_freq = 0
    max_freq_byte = 0
    while b <= 0xff:
      if [stack_base - b] > max_freq:
        max_freq = [stack_base - b]
        max_freq_byte = b
      b += 1

    return max_freq_byte

Assumptions:
  There will never be more than 0xffff of any byte
  The size will never be longer than 0xffff
  The list will have at least one element

Constraints:
  You must put the "counting list" on the stack
  You must restore the stack like in a normal function
  You cannot modify the data at src_addr
```

Script:

```asm
.intel_syntax noprefix
_start:
    push rbp
    mov rbp, rsp
    sub rsp, 0x200              # 256 counters * 2 bytes each
    mov rcx, 0
_clear:                         # zero the counting list before using it
    mov word ptr [rsp + rcx * 2], 0
    inc rcx
    cmp rcx, 0x100
    jl _clear
    mov rcx, 0                  # i = 0
_while1:
    cmp rcx, rsi
    jge _nxt
    movzx r8, byte ptr [rdi + rcx]  # curr_byte, zero-extended
    mov r9, rbp
    sub r9, r8
    sub r9, r8                  # r9 = rbp - 2 * curr_byte
    inc word ptr [r9 - 2]       # counters start just below rbp
    inc rcx
    jmp _while1
_nxt:
    mov r8, 0                   # b = 0
    mov r9, 0                   # max_freq = 0
    mov rax, 0                  # max_freq_byte = 0
_while2:
    cmp r8, 0xff
    jg end
    mov r10, rbp
    sub r10, r8
    sub r10, r8                 # r10 = rbp - 2 * b
    movzx r11, word ptr [r10 - 2]   # count for byte b
    cmp r11, r9
    jle _skip
    mov r9, r11                 # new max_freq
    mov rax, r8                 # new max_freq_byte
_skip:
    inc r8
    jmp _while2
end:
    mov rsp, rbp
    pop rbp
    ret
```

- This challenge was easier than the previous one because I had already learned **enough** from it to solve this one. I only got stuck for a bit because I confused **byte ptr** with **byte**.
- One thing to watch in this challenge is **[stack_base - b]** and **[stack_base - curr_byte]**: the counting list needs more than 0xff bytes of stack. The assumptions allow a byte to occur up to 0xffff times, so each of the 256 counters needs 2 bytes, which gives 256 * 2 = 0x200 bytes.

Flag: pwn.college{gDVrYq2krLjHruX1NnBub-ttIT7.0lNxIDLwMzN1czW}
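
The **byte ptr** confusion mentioned in the notes above comes down to operand size: how many bytes an instruction reads when it touches memory. The following is a minimal standalone sketch of that idea, not part of any challenge solution. The `sample` bytes, the `fail` and `quit` labels, and the file name `demo.S` are all made up for illustration, and it assumes a Linux x86-64 machine with GNU `as` and `ld`.

```asm
.intel_syntax noprefix
.global _start

.data
sample:                              # hypothetical test bytes, not from the challenge
    .byte 0x41, 0x42, 0x43, 0x00, 0x44, 0x45, 0x46, 0x47

.text
_start:
    lea rsi, [rip + sample]

    # 1-byte load: reads only sample[0] and zero-extends it into rax
    movzx eax, byte ptr [rsi]        # rax = 0x41

    # No register operand here, so the size keyword is required.
    # Read as a single byte, sample[3] is zero.
    cmp byte ptr [rsi + 3], 0
    jne fail

    # The same address read as a dword covers 00 44 45 46, which is not zero.
    cmp dword ptr [rsi + 3], 0
    je fail

    mov edi, eax                     # exit status = the byte loaded above
    jmp quit
fail:
    mov edi, 1                       # exit status 1 if either check is wrong
quit:
    mov eax, 60                      # Linux x86-64 exit system call
    syscall
```

Saved as `demo.S`, it can be built and run with `as -o demo.o demo.S && ld -o demo demo.o && ./demo; echo $?`, which should print 65 (0x41). The same size rule is why a byte-counting loop has to compare `byte ptr [rdi]` instead of loading a full quad word.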