uiuctf - syscalls
Welcome to my writeup for the UIUCTF challenge syscalls. This was an exceptional learning experience for me. To be frank, I spent more time on this than I would like to admit, and I barely managed to solve it in time (less than 1 hour before the end of the CTF at 4 AM 😅). In this writeup (more of a walkthrough), I will explain my approach, mistakes, findings, and how I managed to solve it in the end. I will also try my best to explain concepts related to binaries in general and binary exploitation in particular.
Note: informational sections are marked with XTRA:
.init
The challenge description is:
We are also given two files: `syscalls`, which is the target binary, and a `Dockerfile`, which I ignored as usual because I'm too excited to start exploring the binary (thankfully it wasn't required this time 😂).
Running `file` on the binary, we get the following:
So it's an Executable and Linkable Format (ELF) file, i.e. an executable for Linux. We also see that it's 64-bit and stripped, meaning that symbols such as the name of the `main` function have been removed. That's not a big deal, since there aren't many functions anyway, so it's easy to keep track of addresses.
Then we run `checksec` to get an idea of the security mechanisms in place for the binary:
Understanding the Program
Before jumping into Ghidra and decompiling the assembly instructions, it's always a good habit to start by running the program and interacting with it as this helps give a general overview which can help later on when reverse engineering.
Running the program we get the following output and then we are prompted for input:
By entering a simple input we immediately get a segfault
Trying again with a single character, we get an `Illegal instruction` error instead, which is weird: where are these instructions coming from?
By running `dmesg`, a utility that lets us view the kernel's messages and errors, we see that the instruction pointer (ip) was set to `0x00007fffffffdf70` when the segfault happened. This tells us that we started executing instructions on the stack (probably our buffer), which obviously failed since the data there was not valid instructions.
Enough guessing, let's dive into Ghidra 🐲
Static Analysis
After creating a project, loading the binary, and running the default analysis, we see the `entry` function, which passes the `main` function as the first argument to `__libc_start_main`. After renaming it and jumping to it, we see the decompiled `main` function:
In red is the relevant part of the code, while the blue is typical initializations and checks. We see there is a buffer of size 184 which is passed to the first and third functions.
First Function
Logically, the next step is to start dissecting the three functions. Note: from here on out I will rename variables as we go to gradually add semantic meaning to the code and get it closer to the original source code.
The first function prints the hint and then reads 176 bytes from STDIN into the input buffer. Given that we are reading 176 bytes into a 184-byte buffer, we can be sure this is not a buffer overflow situation.
Second Function
Moving on:
Honestly, the decompilation looks daunting, but that's because Ghidra has not identified the struct being used and has instead split it up into 8-byte chunks on the stack; we will get back to that later. For now, we identify two important calls to the `prctl()` function, which is a wrapper for the `prctl` syscall.
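As a rough way to poke at `prctl` from a script, the sketch below only queries state and installs nothing. It assumes a Linux system and that the binary follows the usual seccomp setup, in which `PR_SET_NO_NEW_PRIVS` is set before a filter is installed with `PR_SET_SECCOMP`; the writeup does not name the exact options, so treat that pairing as an assumption. The option numbers come from `<linux/prctl.h>`, and the `prctl` helper is a hypothetical wrapper written for this example.

```python
import ctypes

# Option numbers from <linux/prctl.h>.
PR_GET_SECCOMP = 21        # query the seccomp mode of the calling thread
PR_SET_SECCOMP = 22        # install a seccomp mode or filter (not called here)
PR_SET_NO_NEW_PRIVS = 38   # needed before an unprivileged filter install (not called here)
PR_GET_NO_NEW_PRIVS = 39   # query the no_new_privs bit

libc = ctypes.CDLL(None, use_errno=True)

def prctl(option, arg2=0, arg3=0, arg4=0, arg5=0):
    """Hypothetical thin wrapper around libc's prctl(); returns the raw int result."""
    args = [ctypes.c_ulong(a) for a in (arg2, arg3, arg4, arg5)]
    return libc.prctl(ctypes.c_int(option), *args)

no_new_privs = prctl(PR_GET_NO_NEW_PRIVS)
seccomp_mode = prctl(PR_GET_SECCOMP)
print("no_new_privs:", no_new_privs)
print("seccomp mode:", seccomp_mode, "(0 = off, 1 = strict, 2 = filter)")
```

Run inside a process that has already installed a filter, the second query would be expected to report mode 2.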
Third Function
The third function does what we suspected: it runs the buffer as code (free shellcode).
XTRA: System Calls
In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive or accessing the device's camera), creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.
Before jumping into an example, let's also briefly explain file descriptors, as they will come in handy later.
File Descriptors
File descriptors (FDs) are identifiers for files, pipes, or network sockets. Using this ID, a process can tell the OS which file it is referring to. FDs can be obtained in multiple ways, one of which is the `open` syscall, which takes the path name of the file as one of its arguments.
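To make the FD idea concrete, here is a minimal Python sketch of the open, read, write, close sequence. It is not the shellcode from the challenge; Python's `os` functions are thin wrappers over the corresponding syscalls, and the temporary file is created only so the example is self-contained.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on any existing path.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello from a file descriptor\n")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)   # open(2): path name -> file descriptor
data = os.read(fd, 64)            # read(2): up to 64 bytes from that FD
os.write(1, data)                 # write(2): FD 1 is standard output
os.close(fd)                      # close(2): release the descriptor
os.unlink(path)                   # clean up the temporary file
```

The integer returned by `os.open` is the same kind of value a shellcode would receive in `RAX` after the `open` syscall and then pass on to `read`.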
example
Based on this, we can construct shellcode that prints the file contents to the terminal.
Second Function Continued
Dynamic Analysis
Dumping the Filter from Memory
Pitfall: The Slow Way of Doing Things
Instead of looking for a tool to extract the SECCOMP filter automatically from the running process (spoiler alert: IT EXISTS!), I opted for the manual way: extracting the filter by hand from memory. I made the mistake of relying on Ghidra's offsets for the variables, which turned out to be inconsistent with the disassembled instructions.
When I finally got the correct address of the filters from memory by using `strace`, which gave the address of the struct, I was faced with the task of decoding the filter.
At this point I started looking for a tool to decode the filter bytes I had extracted from memory. The filters are written in the Berkeley Packet Filter (BPF) format, which is used for writing filters for many purposes but was originally made for network packets. I stumbled upon `seccomp-tools`, which offers a suite of tools for reading, dumping, and decoding BPF filters with regard to SECCOMP.
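For anyone curious what decoding by hand involves, the sketch below splits raw filter bytes into classic BPF instructions. Each instruction is an 8-byte `struct sock_filter` with the fields `code` (16 bits), `jt` and `jf` (8 bits each), and `k` (32 bits). The sample bytes are a made-up two-instruction filter, not the one from the challenge, and `decode_filter` is a hypothetical helper; little-endian layout is assumed, as on x86-64.

```python
import struct

# struct sock_filter { __u16 code; __u8 jt; __u8 jf; __u32 k; }  -> 8 bytes
SOCK_FILTER = struct.Struct("<HBBI")

def decode_filter(raw: bytes):
    """Split raw filter bytes into (code, jt, jf, k) tuples."""
    if len(raw) % SOCK_FILTER.size:
        raise ValueError("filter length must be a multiple of 8 bytes")
    return [SOCK_FILTER.unpack_from(raw, off)
            for off in range(0, len(raw), SOCK_FILTER.size)]

# Hypothetical filter, NOT the one from the challenge:
#   code 0x20 = BPF_LD | BPF_W | BPF_ABS, k = 0 -> load seccomp_data.nr (syscall number)
#   code 0x06 = BPF_RET | BPF_K, k = 0x7fff0000 -> SECCOMP_RET_ALLOW
sample = SOCK_FILTER.pack(0x20, 0, 0, 0) + SOCK_FILTER.pack(0x06, 0, 0, 0x7FFF0000)

for code, jt, jf, k in decode_filter(sample):
    print(f"code=0x{code:04x} jt={jt} jf={jf} k=0x{k:08x}")
```

Turning the `code` values into readable mnemonics is the tedious part that `seccomp-tools` automates.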
Using the Correct Tool for the Job
By running `seccomp-tools dump -c ./syscalls`, I was able to get the decoded SECCOMP filters in a well-formatted style without having to worry about dumping the filters by hand.
Pitfall: The Naive Approach
Bypassing the Filter
open -> openat
`openat` opens a file relative to a directory file descriptor, which is fine, as we can use the special current-directory FD (`AT_FDCWD`).
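The directory-relative behaviour can be tried from Python, where passing `dir_fd` to `os.open` makes it use `openat` under the hood. This is only a sketch of the semantics, not the challenge shellcode; the file name `note.txt` is hypothetical, and `AT_FDCWD` is listed purely for reference (its value on Linux is -100).

```python
import os
import tempfile

AT_FDCWD = -100  # special dirfd from <fcntl.h> meaning "relative to the current directory"

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical file name, used only for this demonstration.
    with open(os.path.join(workdir, "note.txt"), "w") as f:
        f.write("opened via openat\n")

    dirfd = os.open(workdir, os.O_RDONLY | os.O_DIRECTORY)  # FD that refers to the directory
    fd = os.open("note.txt", os.O_RDONLY, dir_fd=dirfd)     # resolved relative to dirfd
    content = os.read(fd, 64)
    os.close(fd)
    os.close(dirfd)

print(content.decode(), end="")
```

In shellcode, passing `AT_FDCWD` as the first argument avoids the need to open a directory first, so a relative path behaves as it would with plain `open`.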
read -> preadv2
`preadv2` is similar to `read` but allows reading into multiple buffers instead of one.
For our case we just need one. Another point to consider is that, unlike `read`, `preadv2` does not take a pointer to a `char` buffer; instead, it takes a pointer to an `iovec` struct.
Another point is that we are going to utilize the stack by "allocating" the buffer at a safe offset from `RSP`. Putting this together we get the following:
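For readers who want to experiment outside of assembly, here is a Python sketch of the same vectored-read idea. `os.preadv` takes a list of writable buffers and builds the `iovec` array itself; without flags it maps to `preadv` rather than `preadv2`, which is close enough to show the calling shape. The `pack_iovec` helper and the address passed to it are hypothetical and only illustrate the 16-byte in-memory layout on x86-64.

```python
import os
import struct
import tempfile

def pack_iovec(base_addr: int, length: int) -> bytes:
    """struct iovec { void *iov_base; size_t iov_len; } as 16 bytes on x86-64."""
    return struct.pack("<QQ", base_addr, length)

with tempfile.TemporaryFile() as tmp:
    tmp.write(b"example file contents\n")
    tmp.flush()

    buf = bytearray(64)  # plays the role of the buffer carved out of the stack
    # Offset 0 reads from the start of the file, regardless of the file position.
    nread = os.preadv(tmp.fileno(), [buf], 0)

print(nread, bytes(buf[:nread]))
# Hypothetical address, shown only to illustrate the layout:
print(pack_iovec(0x7FFFFFFFE000, 64).hex())
```

In the shellcode the same two fields have to be written to the stack by hand before the pointer to them is passed to the syscall.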
write -> writev
This is the signature of `writev`:
We can see that, similar to `preadv2`, it takes a pointer to an `iovec` struct, which is handy since we already have one on the stack from above. Moving on, in the filter we can see that `writev` is not blocked; however, there is a condition placed on the FD argument:
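Independently of the filter's FD condition, the gather-write behaviour of `writev` itself can be sketched in Python with `os.writev`. A pipe stands in for the output FD so the result can be read back; the chunk contents are placeholders, and nothing here reflects the specific FD check in the challenge's filter.

```python
import os

# A pipe stands in for the output FD so the written bytes can be read back.
read_end, write_end = os.pipe()

chunks = [b"gathered ", b"from several ", b"buffers\n"]  # placeholder data
written = os.writev(write_end, chunks)  # one syscall, several source buffers
os.close(write_end)

echoed = os.read(read_end, 64)
os.close(read_end)
print(written, echoed)
```

With a single-element list this behaves like an ordinary `write`, which is how one `iovec` can be reused for both the read and the write side.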
Final Payload
With this we have all the ingredients to leak the flag. Putting it all together we get:
FLAG: uiuctf{a532aaf9aaed1fa5906de364a1162e0833c57a0246ab9ffc}
🎊
.fini (Conclusion and Learnings)
This challenge was a difficult one for me but it is undoubtedly the one I learned from the most both in terms of technical knowledge and problem solving / mindset. I hope you enjoyed this writeup, feel free to share with me your feedback and suggestions.
Thank you 👍🏻