# using Radare2 Solve

Try It Yourself: I suggest traversing the help for a while. Google every term you don’t understand. There is a lot of cool functionality that I won’t touch on here, but which might inspire you to try something.

## Automatic Analysis

```
[0x000005e0]> afl
0x00000000    3 73   -> 75   fcn.rsp
0x00000049    1 219          fcn.00000049
0x00000590    3 23           sym._init
0x000005c0    1 8            sym.imp.puts
0x000005c8    1 8            sym.imp.__printf_chk
0x000005d0    1 16           sym.imp.__cxa_finalize
0x000005e0    1 43           entry0
0x00000610    4 50   -> 44   sym.deregister_tm_clones
0x00000650    4 66   -> 57   sym.register_tm_clones
0x000006a0    5 50           sym.__do_global_dtors_aux
0x000006e0    4 48   -> 42   entry1.init
0x00000710    7 58           sym.check_pw
0x0000074a    7 203          main
0x00000820    4 101          sym.__libc_csu_init
0x00000890    1 2            sym.__libc_csu_fini
0x00000894    1 9            sym._fini
```

We only care about main and check_pw. In main, we have 9 local stack variables. Radare names them local_2h, local_3h, et cetera based on their offsets from the stack pointer.

The beginning of our program is pretty familiar. Starting at 0x74a:

```asm
push rbx
sub rsp, 0x10
cmp edi, 2
jne 0x7cc
```

We have the function prologue allocating 16 bytes of memory for our local variables, followed by an if statement. Recall that the DI register holds the first argument, and since this is main, that argument is argc. So, if (argc != 2) jump somewhere. In Radare, look to the left of the jne instruction. You’ll see an arrow coming out of that instruction and running down to 0x7cc, where we see:

```asm
lea rdi, str.Need_exactly_one_argument. ; 0x8a4 ; "Enter Password :" ; const char * s
call sym.imp.puts ; int puts(const char *s)
mov eax, 0xffffffff ; -1
jmp 0x7c6
```

## Visual Flow Analysis

Radare provides a feature known as “visual mode”. To use it, we need to move Radare’s internal cursor to the function we want to analyze with the seek command: s main. You’ll notice that the prompt changes from [0x000005e0]> to [0x0000074a]>, indicating that the current location has moved to the first instruction in the main function.
Then, type VV (visual graph mode). You should see ASCII-art boxes with different parts of the program. Every time a jump instruction appears, the block ends and lines come out, pointing to other blocks. For instance, in the top block (the beginning of the function), the jne instruction that checks the argument count causes a red line to come out to the left and a green one to the right. On the right, you should see a block that looks a bit like this:

```
.---------------------------------------------.
| 0x7cc ;[ga]                                 |
| ; const char * s                            |
| ; 0x8a4                                     |
| ; "Enter Password : "                       |
| lea rdi, str.Need_exactly_one_argument.     |
| call sym.imp.puts;[gh]                      |
| ; -1                                        |
| mov eax, 0xffffffff                         |
| jmp 0x7c6;[gg]                              |
`---------------------------------------------'
```

That’s the block we just analyzed. Use the arrow keys to follow the blue (unconditional) line down to see what happens after this block. You’ll see, at the bottom of the graph, a block at 0x7c6 that is unconditionally jumped to from a number of places in the program:

```asm
add rsp, 0x10
pop rbx
ret
```

This simply frees the stack space, restores rbx, and returns. So this program behaves like the others we’ve looked at: if there aren’t the right number of arguments, it prints a string and exits with an error code (remember, eax was loaded with -1).
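
To make the branch concrete, here is a minimal C sketch of what this first check appears to do. The function name `check_arg_count` is hypothetical, and the message text is an assumption read off the symbol name `str.Need_exactly_one_argument.`; the real binary does this inline at the top of main.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reconstruction of the first branch in main.
 * Returns -1 (0xffffffff in eax) when argc != 2, and 0 otherwise so a
 * caller could carry on to the password check. */
int check_arg_count(int argc)
{
    assert(argc >= 0);                       /* argc is never negative */
    if (argc != 2) {                         /* cmp edi, 2 ; jne 0x7cc */
        /* Message text assumed from the symbol name. */
        puts("Need exactly one argument.");  /* call sym.imp.puts */
        return -1;                           /* mov eax, 0xffffffff */
    }
    return 0;
}
```

Calling `check_arg_count(argc)` at the top of a test main and returning its result when it is non-zero would reproduce the early exit at 0x7c6.
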
If we progress down the red branch from the first block, where execution flows if the jne isn’t taken (that is, if argc is exactly 2: the program name plus one argument), you’ll see these instructions at 0x754:

```asm
mov dword [local_9h], 0x426d416c ; [0x426d416c:4]=-1
mov word [local_dh], 0x4164 ; [0x4164:2]=0xffff
mov byte [local_fh], 0
mov word [local_6h], 0
mov byte [local_8h], 0
mov byte [local_2h], 2
mov byte [local_3h], 3
mov byte [local_4h], 2
mov byte [local_5h], 3
mov byte [local_6h], 5
mov rbx, qword [rsi + 8] ; [0x8:8]=0
mov eax, 0
mov rcx, 0xffffffffffffffff
mov rdi, rbx
repne scasb al, byte [rdi]
cmp rcx, 0xfffffffffffffff8
je 0x7df
```

Mostly all this block does is load a bunch of values into memory. Here, rather than showing the actual addresses, Radare has named each local variable based on its stack offset. Scrolling up to the initial block, we can see that Radare declares local_2h through local_fh as type int (or, at least, that’s what Radare thinks), even though the mov instructions above write them with byte, word, and dword stores.

After loading those values into local variables, it loads the value stored in memory at the address rsi + 8 into rbx. If we recall the SystemV x86_64 calling convention, rsi is the second argument: argv. So the qword at rsi + 8 is argv[1]. It then loads up eax with 0, rcx with 0xffffffffffffffff, and rdi with rbx, the value just loaded from argv[1].

Then it runs repne scasb. This is a weird but fast quirk of x86: it has a native instruction for string length determination. repne means repeat while not equal, and scasb means scan string, byte variant - see an x86 instruction set reference for more info. So, this instruction compares bytes to the value of al (which is zero here), starting at the memory address in rdi and counting up, while decrementing rcx once per byte (remember, C is the counter register). In essence, this instruction is measuring the length of a string. Isn’t x86 a fun time?
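
The behaviour of repne scasb with al = 0 can be sketched in C as below. This is an emulation written for illustration, not code from the binary; the function names `rcx_after_repne_scasb` and `strlen_from_rcx` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulates `repne scasb` with al = 0 and rcx = 0xffffffffffffffff.
 * Each step compares one byte with al, advances rdi and decrements rcx;
 * the loop stops after the first byte equal to al, or when rcx reaches 0. */
uint64_t rcx_after_repne_scasb(const char *rdi)
{
    assert(rdi != NULL);
    const unsigned char al = 0;
    uint64_t rcx = UINT64_MAX;
    while (rcx != 0) {
        unsigned char byte = (unsigned char)*rdi++;
        rcx--;                 /* the terminating byte is counted too */
        if (byte == al)
            break;
    }
    return rcx;
}

/* Recovers the string length from the final counter value: ~rcx is the
 * number of bytes scanned, and one of those is the null terminator. */
size_t strlen_from_rcx(uint64_t rcx)
{
    return (size_t)(~rcx - 1);
}
```

For a six-character input such as "abcdef", seven bytes are scanned, so the counter ends at 0xfffffffffffffff8.
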
Anyway, once the repne scasb operation is done, rcx will hold 0xffffffffffffffff minus the number of bytes scanned (the length of the string plus its terminator), and we can see that the next instruction compares it to 0xfffffffffffffff8. Therefore, if the string is 0xffffffffffffffff - 0xfffffffffffffff8 = 7 bytes long (note that this includes the terminating character), the jump is taken; otherwise it is not. If the jump isn’t taken, the code proceeds to a block at 0x7a8, where the failure string is printed. Therefore, we can immediately determine that any correct passcode will be precisely six bytes (plus the null terminator).

## Functions

What is more interesting is the block that is executed when the jump is taken.

```asm
lea rdx, [local_2h]
lea rsi, [local_9h]
mov rdi, rbx
call sym.check_pw
test eax, eax
je 0x7a8
```

The binary loads the addresses of some local variables, loads argv[1] again (rbx, remember?), and then calls a function: sym.check_pw. Of course, the actual binary just has the offset of the function, but Radare was smart enough to look up that offset in the symbol table and put the name in for us. check_pw sounds pretty promising, as function names go, and we can verify that by continuing: after the call, the program jumps to failure if the function returned zero, and continues on to success if it did not (recall that test eax, eax followed by je jumps if eax is zero).

So what exactly does this function do? First, recall that the SystemV x86_64 calling convention says that rdi, rsi, and rdx (the three registers loaded prior to the call) are the first three arguments to the function. So in C, the call looks like this:

```c
int result = check_pw(argv[1], &local_9h, &local_2h);
if (result == 0) {
    // fail
} else {
    // succeed
}
```

To find out what check_pw actually does, we need to exit visual mode (q followed by q) and seek to it (s sym.check_pw), then look at the flow diagram (VV). It is immediately clear that this function contains a loop.
Unlike the main function, which continues consistently downward no matter which jumps you take, in check_pw a block near the bottom has a jne that jumps up to the top. Looking a little more closely, we can see that there are three opportunities to return. One of them (at 0x73e) returns 0 (failure) and the other two (at 0x744 and 0x748) return 1 (success).

This kind of high-level analysis is far easier with a flow diagram, and is one of the major advantages of using a tool like Radare. When I was getting started with reverse engineering, I drew out flow diagrams by hand, simply because I was unaware that free tools existed. Don’t do that; it’s a waste of time.

The function starts off by loading r8d, the low 32 bits of the general-purpose register r8, with the value 0. It then jumps to the following block (at 0x716):

```asm
movsxd rax, r8d
movzx ecx, byte [rdx + rax]
add cl, byte [rsi + rax]
cmp cl, byte [rdi + rax]
jne 0x73e;[gb]
```

This block sign-extends r8d (which we know is zero) into rax, then loads a single byte from its third argument, indexed by rax (written as eax in the C sketches that follow). Going back to our arg list, this argument is &local_2h, so it’s loading (&local_2h)[0]. It then adds a byte loaded from the second argument at the same index ((&local_9h)[0]), and compares the sum to a byte loaded from its first argument at the same index (argv[1][0]). Remember that this is a loop, so the index will change. In other words:

```c
while (/* something?? */) {
    char temp = arg3[eax] + arg2[eax];
    if (temp != arg1[eax]) {
        return 0; // failure
    }
}
```

If the jump isn’t taken, this code is run (at 0x725):

```asm
add r8d, 1
movsxd rax, r8d
cmp byte [rsi + rax], 0
je 0x744;[gd]
```

This increments the loop counter, then checks if the second argument indexed by the loop counter is zero. If so, it jumps to code (at 0x744) that returns success. Otherwise, it continues looping.
Our updated C code looks like this:

```c
while (arg2[eax] != 0) {
    char temp = arg3[eax] + arg2[eax];
    if (temp != arg1[eax]) {
        return 0; // failure
    }
    eax++;
}
return 1;
```

At this point, it’s pretty easy to see what check_pw is doing: it’s comparing two strings, but it’s modifying each byte of one of the strings before the comparison. Looking at the arguments passed in main, we can see that the program is adding (&local_2h)[eax] to (&local_9h)[eax]. I suggest going back to the main function (exit visual mode; pdf@main) to look at what each of those values will be. Both of these are just locations on the stack. We also know that check_pw will only be run on a string with 6 characters in it, so we only need to look at 6 values.

Here are the values starting at local_2h (you can see them being set in main): 2, 3, 2, 3, 5. That’s only 5 values. What’s going on? If we look again, the stack variables are set starting at offset 0x754:

```asm
mov dword [local_9h], 0x426d416c ; [0x426d416c:4]=-1
mov word [local_dh], 0x4164 ; [0x4164:2]=0xffff
mov byte [local_fh], 0
mov word [local_6h], 0
mov byte [local_8h], 0
mov byte [local_2h], 2
mov byte [local_3h], 3
mov byte [local_4h], 2
mov byte [local_5h], 3
mov byte [local_6h], 5
```

Before the byte-sized values are stored in order from local_2h to local_6h, local_6h (that is, rsp+0x6) is loaded with a word-sized 0 (in x86 terminology a word is 16 bits, for historical reasons). That means that both rsp+0x6 and rsp+0x7 are set to zero, and only rsp+0x6 is later overwritten with 5. Note that Radare didn’t even realize that these values were in an array, let alone tell us what it was initialized to, despite it being entirely static. This is the part of reverse engineering that requires a human brain; the computer knows what’s there, but it can’t know what it’s being used for. Anyway, our table of values starting at local_2h is [2, 3, 2, 3, 5, 0].
These aren’t printable ASCII characters, so presumably the hard-coded password is stored at the other input value: `local_9h`. The mov instruction above is sized for a double word (dword), a 32-bit value. It’s followed by a word-sized value, then a byte-sized zero. That works out to 4+2 = 6 bytes, plus a null terminator, so it’s a good bet that these three locations together form a string. Indeed, if we write out the values with byte separation, it makes sense: 42 6d 41 6c 41 64 00 is a well-formed null-terminated string, with values in the printable ASCII range.

All that’s left is to add the offsets to them, giving us 44 70 43 6f 46 64 00. Translating these to ASCII, we get: DpCoFd.

However, putting this into the binary gives us incorrect ! What’s happening here is that x86 processors are little-endian. That means that the least significant byte of a multi-byte value is stored at the lowest address, so the bytes sit in memory in the reverse of the order in which the immediate is written. This is easily corrected by reversing the byte order within each of local_9h and local_dh. 42 6d 41 6c becomes 6c 41 6d 42 and 41 64 becomes 64 41, making our whole string 6c 41 6d 42 64 41 00. Adding the offsets [2, 3, 2, 3, 5, 0] gives the correct string 6e 44 6f 45 69 41 00, or nDoEiA.

```
Password : nDoEiA
```
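
The whole derivation can be checked with a short C sketch. It is a reconstruction from the disassembly, not the original source: `derive_password` is a hypothetical helper, and `check_pw` is rewritten from the recovered loop as a do-while, which mirrors the order of the two blocks in the graph.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Re-implementation of the loop recovered from check_pw: compare each
 * input byte with key[i] + offsets[i] until the key's terminator. */
int check_pw(const char *input, const char *key, const char *offsets)
{
    int i = 0;
    do {
        if ((char)(offsets[i] + key[i]) != input[i])
            return 0;                  /* block at 0x73e: failure */
        i++;
    } while (key[i] != 0);             /* cmp byte [rsi + rax], 0 */
    return 1;                          /* block at 0x744: success */
}

/* Rebuilds the bytes that main stores on the stack and derives the
 * password from them. out must have room for 7 bytes. */
void derive_password(char out[7])
{
    assert(out != NULL);
    const uint32_t dword_imm = 0x426d416c;  /* mov dword [local_9h], ... */
    const uint16_t word_imm  = 0x4164;      /* mov word [local_dh], ...  */
    const char offsets[6] = {2, 3, 2, 3, 5, 0};
    char key[7];

    /* Little-endian layout spelled out by hand (lowest byte first), so the
     * sketch gives the same answer regardless of the host's byte order. */
    for (int i = 0; i < 4; i++)
        key[i] = (char)((dword_imm >> (8 * i)) & 0xff);
    for (int i = 0; i < 2; i++)
        key[4 + i] = (char)((word_imm >> (8 * i)) & 0xff);
    key[6] = 0;                             /* mov byte [local_fh], 0 */

    for (int i = 0; i < 6; i++)
        out[i] = (char)(key[i] + offsets[i]);
    out[6] = 0;
}
```

Calling `derive_password` on a 7-byte buffer and passing the result to `check_pw` together with the key bytes 6c 41 6d 42 64 41 (which spell lAmBdA) and the offset table should return 1, while the big-endian guess DpCoFd fails on its first byte.
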