uiuctf - syscalls
Welcome to my writeup for the UIUCTF challenge syscalls. This was an exceptional learning experience for me. To be frank, I spent more time on this than I would like to admit, and I barely managed to solve it in time (less than 1 hour before the end of the CTF, at 4 AM 😅). In this writeup (more of a walkthrough), I will explain my approach, mistakes, findings, and how I managed to solve it in the end. I will also try my best to explain concepts related to binaries in general and binary exploitation in particular.
Note: informational sections are marked with XTRA:
.init
The challenge description is:
We are also given two files: syscalls, which is the target binary, and a Dockerfile, which I ignored as usual because I was too excited to start exploring the binary (thankfully it wasn't required this time 😂).
Running file on the binary we get the following:
So it's an Executable and Linkable Format (ELF) file, i.e. an executable file for Linux. We also see that it's 64-bit and stripped, meaning symbols such as the name of the main function have been removed. That's not a big deal, since there aren't many functions anyway and it's easy to keep track of addresses.
Then we run checksec to get an idea of the security mechanisms in place for the binary:
We see it is a Position-Independent Executable (PIE), which means the program can be loaded at different locations in memory (in the virtual address space). This makes it harder to jump around in the binary, as we would have to leak the base address of the binary first. You can read more about PIE here, but we won't have to worry about it in this challenge. The second thing we notice is that the stack is executable 🤨, which means we can run shellcode on the stack directly without having to use techniques like ROP.
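Some of what file and checksec report (64-bit, PIE) comes straight from the first bytes of the ELF header. The following is a minimal Python sketch of that decoding, assuming a little-endian file; the sample header bytes are built by hand for illustration and are not taken from the challenge binary.

```python
import struct

ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_TYPE = {2: "ET_EXEC (fixed address)", 3: "ET_DYN (PIE or shared object)"}


def describe_elf_header(header: bytes) -> dict:
    """Decode a few fields from the first 20 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    # e_type and e_machine are 16-bit fields at offsets 16 and 18.
    # Little-endian is assumed here (EI_DATA == 1), as on x86-64.
    e_type, e_machine = struct.unpack_from("<HH", header, 16)
    return {
        "class": ELF_CLASS.get(ei_class, "unknown"),
        "type": ELF_TYPE.get(e_type, "other"),
        "machine": e_machine,  # 62 is EM_X86_64
    }


# Hypothetical header: magic, 64-bit, little-endian, version 1, padding,
# then e_type = ET_DYN (3) and e_machine = EM_X86_64 (62).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
info = describe_elf_header(sample)
print(info)
```

The executable-stack property is not in these fields; it is recorded in the program headers, which this sketch does not parse.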
Understanding the Program
Before jumping into Ghidra and decompiling the assembly instructions, it's always a good habit to start by running the program and interacting with it as this helps give a general overview which can help later on when reverse engineering.
Running the program we get the following output and then we are prompted for input:
By entering a simple input we immediately get a segfault
Trying again with a single character, we get an Illegal instruction error instead, which is weird: where are these instructions coming from?
By running dmesg, a utility that lets us view the kernel's messages and errors, we see that the instruction pointer (ip) was set to 0x00007fffffffdf70 when the segfault happened. This tells us that we started executing instructions on the stack (probably our buffer), which obviously failed since the data there was not valid instructions.
Enough guessing, let's dive into Ghidra 🐲
Static Analysis
After creating a project, loading the binary, and running the default analysis, we see the entry function, which passes the address of main as the first argument to __libc_start_main. After renaming it and jumping to it, we see the decompiled main function:
In red is the relevant part of the code, while the blue is typical initializations and checks. We see there is a buffer of size 184 which is passed to the first and third functions.
First Function
Logically, the next step is to start dissecting the three functions. Note: from here on out I will rename variables as we go to gradually add semantic meaning to the code and get it closer to the original source code.
The first function prints the hint, and then reads 176 bytes into the input buffer from STDIN (i.e. input). Given that we are reading 176 bytes into a 184-byte buffer, we can be sure this is not a buffer overflow situation.
Second Function
Moving on:
Honestly, the decompilation looks daunting, but that's because Ghidra has not identified the struct being used and has instead split it up into 8-byte chunks on the stack. We will get back to that later. For now, we identify two important calls to the prctl() function, which is a wrapper for the prctl syscall.
Third Function
The third function does as we suspected: it runs the buffer as code (free shellcode).
XTRA: System Calls
Wikipedia defines system calls as:
In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive or accessing the device's camera), creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.
Basically, syscalls are a way for the program to safely interact with various physical devices (e.g. the hard drive for reading/writing files), services, and processes. A list of all Linux syscalls can be found here. Using syscalls is very similar to calling functions: we place the arguments in their respective registers, put the syscall number in rax, then invoke it using the syscall instruction.
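As a rough illustration of that register convention on x86-64 Linux, here is a small Python sketch that maps a syscall number and its arguments to the registers they would occupy. The helper name syscall_registers and the buffer address are made up for this example; the register order (rdi, rsi, rdx, r10, r8, r9) is the standard x86-64 Linux syscall ABI.

```python
# x86-64 Linux syscall convention: the syscall number goes in rax and
# up to six arguments go in these registers, in this order.
ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def syscall_registers(number: int, *args: int) -> dict:
    """Return the register assignment for a syscall (hypothetical helper)."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("a syscall takes at most six arguments")
    regs = {"rax": number}
    regs.update(zip(ARG_REGISTERS, args))
    return regs


# write(1, buf, 5): write is syscall number 1 on x86-64.
# The buffer address is a made-up value for illustration.
layout = syscall_registers(1, 1, 0x7FFFFFFFE000, 5)
for reg, value in layout.items():
    print(f"{reg} = {value:#x}")
```

Note that the fourth argument goes in r10 rather than rcx, which is where the syscall ABI differs from the regular function calling convention.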
Before jumping into an example, let's briefly explain file descriptors, as they will come in handy later.
File Descriptors
File descriptors are identifiers for files, pipes, or network sockets. Using this ID, a process can tell the OS which file it is referring to. FDs can be obtained in multiple ways, one of which is the open syscall, which takes the path name of the file as one of its arguments.
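To make this concrete, here is a minimal Python sketch that obtains a file descriptor and reads through it. Python's os.open, os.read, and os.close are thin wrappers over the corresponding syscalls; the file name and contents are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"hello fd\n")

    # os.open returns a plain integer: the file descriptor.
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, 64)  # read up to 64 bytes through the FD
    finally:
        os.close(fd)

fd_is_small_int = isinstance(fd, int) and fd >= 0
print(fd, data)
```

The descriptor is just a small non-negative integer; 0, 1, and 2 are conventionally STDIN, STDOUT, and STDERR, so a freshly opened file usually gets 3 or higher.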
Example
Back to our example: our goal is to write the contents of a file (for which we have already obtained a file descriptor) to STDOUT (the terminal). We are going to use the sendfile syscall, which has the following signature:
Based on this we can construct the shellcode to print the file contents to terminal
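For a higher-level view of the same syscall, here is a minimal Python sketch using os.sendfile, which wraps sendfile(out_fd, in_fd, offset, count) on Linux. It copies into a second temporary file instead of STDOUT so that the result can be inspected; in the scenario above, out_fd would simply be 1. The file names and contents are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "flag.txt")  # placeholder input file
    dst = os.path.join(tmp, "out.txt")   # stands in for STDOUT
    with open(src, "wb") as f:
        f.write(b"example{not_the_real_flag}\n")

    in_fd = os.open(src, os.O_RDONLY)
    out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        # Copy up to 4096 bytes, starting at offset 0 of in_fd,
        # without the data passing through a user-space buffer.
        sent = os.sendfile(out_fd, in_fd, 0, 4096)
    finally:
        os.close(in_fd)
        os.close(out_fd)

    with open(dst, "rb") as f:
        copied = f.read()

print(sent, copied)
```

The appeal of sendfile for shellcode is that one syscall replaces a read followed by a write, and no buffer has to be set up.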
Second Function Continued
prctl, short for process control, allows the program to control characteristics of its own process (and child processes). The first argument specifies the operation, which is defined in <linux/prctl.h> (link). For the first function call, 38 maps to PR_SET_NO_NEW_PRIVS, which prohibits the process from obtaining new privileges. We can skip this and focus on the second call to prctl, with the option 22 (PR_SET_SECCOMP), which is used to set SECure COMPuting options. The second argument is SECCOMP_MODE_FILTER, which is defined in seccomp.h. The third argument is a pointer to a struct of type struct sock_fprog. By somehow obtaining the SECCOMP filter, we would be able to determine which syscalls are allowed and which conditions are placed on their arguments. I gave up on trying to extract the filter from the Ghidra decompilation, as it was unreliable, and instead resorted to dynamic analysis.
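When staring at raw numbers in a decompiler, a small lookup table helps. The sketch below maps the prctl option and seccomp mode values to their names from <linux/prctl.h> and <linux/seccomp.h>. The helper describe_prctl and the pointer value are made up for illustration. Note that a pointer to a struct sock_fprog is only meaningful with SECCOMP_MODE_FILTER (2); SECCOMP_MODE_STRICT (1) takes no filter.

```python
# Values from <linux/prctl.h> and <linux/seccomp.h>.
PRCTL_OPTIONS = {22: "PR_SET_SECCOMP", 38: "PR_SET_NO_NEW_PRIVS"}
SECCOMP_MODES = {
    0: "SECCOMP_MODE_DISABLED",
    1: "SECCOMP_MODE_STRICT",
    2: "SECCOMP_MODE_FILTER",
}


def describe_prctl(option: int, arg2: int, arg3: int = 0) -> str:
    """Render a prctl call with symbolic names (hypothetical helper)."""
    name = PRCTL_OPTIONS.get(option, f"unknown option {option}")
    if option == 22:
        mode = SECCOMP_MODES.get(arg2, f"unknown mode {arg2}")
        return f"prctl({name}, {mode}, {arg3:#x})"
    return f"prctl({name}, {arg2})"


first_call = describe_prctl(38, 1)
# The pointer value below is a placeholder, not the real address.
second_call = describe_prctl(22, 2, 0x7FFC0000)
print(first_call)
print(second_call)
```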
Dynamic Analysis
Dumping the Filter from Memory
Pitfall: The Slow Way of Doing Things
Instead of looking for a tool to extract the SECCOMP filter automatically from the running process (spoiler alert: IT EXISTS!), I opted for the manual way: extracting the filter by hand from memory. I made the mistake of relying on Ghidra's offsets for the variables, which turned out to be inconsistent with the disassembled instructions.
When I finally got the correct address of the filter in memory by using strace, which gave the address of the struct, I was faced with the task of decoding the filter.
At this point I started looking for a tool to decode the filter bytes I had extracted from memory. The filters are written in the Berkeley Packet Filter (BPF) format, which is used for writing filters for many purposes but was originally made for network packets. I stumbled upon seccomp-tools, which offers a suite of tools for reading, dumping, and decoding BPF filters with regard to SECCOMP.
XTRA: Filter struct description from linux docs
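As a sketch of what decoding by hand involves: in the kernel headers, struct sock_filter is four fields (a 16-bit code, 8-bit jt and jf jump offsets, and a 32-bit k), and struct sock_fprog is an unsigned short instruction count followed by a pointer to the instruction array. The Python below unpacks raw bytes with that layout, assuming x86-64 (little-endian, 8-byte pointers). The two instructions and the pointer value are hand-written for illustration and are not the challenge's filter.

```python
import struct

# struct sock_filter { __u16 code; __u8 jt; __u8 jf; __u32 k; }  -> 8 bytes
SOCK_FILTER = struct.Struct("<HBBI")
# struct sock_fprog { unsigned short len; struct sock_filter *filter; }
# On x86-64 the pointer is 8-byte aligned, so 6 padding bytes follow len.
SOCK_FPROG = struct.Struct("<H6xQ")


def decode_filter(raw: bytes) -> list:
    """Split raw filter bytes into (code, jt, jf, k) tuples."""
    return [
        SOCK_FILTER.unpack_from(raw, off)
        for off in range(0, len(raw), SOCK_FILTER.size)
    ]


# Two hand-written instructions, for illustration only:
#   0x20: load the 32-bit word at offset 0 of seccomp_data (the syscall number)
#   0x06: return k, where 0x7fff0000 is SECCOMP_RET_ALLOW
raw = SOCK_FILTER.pack(0x20, 0, 0, 0) + SOCK_FILTER.pack(0x06, 0, 0, 0x7FFF0000)
insns = decode_filter(raw)

# A matching sock_fprog; the pointer is a made-up address.
fprog = SOCK_FPROG.pack(len(insns), 0x7FFFFFFFE000)

for code, jt, jf, k in insns:
    print(f"code={code:#06x} jt={jt} jf={jf} k={k:#010x}")
```

The padding inside sock_fprog is the kind of alignment detail that is easy to get wrong when reading or crafting the struct byte by byte.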
Using the Correct Tool for the Job
By running seccomp-tools dump -c ./syscalls, I was able to get the decoded SECCOMP filters in a well-formatted style without having to worry about dumping the filters by hand.
Pitfall: The Naive Approach
Due to my inexperience with the Linux kernel, I had the brilliant idea of overwriting the SECCOMP rules by injecting my own filter that allows everything and then invoking the prctl or seccomp syscall to apply the changes. After hours of crafting a payload byte by byte and looking out for struct alignment and so on, I got the errno EINVAL, which according to the man page can be the result of one of the following:
I still wasn't sure if it was because of the 4th or the 5th reason, so I looked for another approach: SECCOMP_SET_MODE_STRICT, which basically sets a predefined filter that only allows read, write, _exit, and sigreturn. This was easier to test since I didn't have to provide pointers or worry about offsets or alignment. When this method also resulted in the same errno, I immediately knew it was because of the 4th reason. In hindsight it makes sense: what would be the point of filters if they could be overwritten after they have been set?
Bypassing the Filter
Now I knew the only way to leak the flag was to somehow bypass the filter. Reading writeups for similar challenges and SECCOMP bypass guides, I realized that there are alternatives to the important syscalls (open, write, read) that are not blocked by the filter. The syscalls explained below are in the order they are executed, and the stack representation carries over across the three syscalls.
open -> openat
openat opens a file relative to a directory file descriptor, which is fine as we can use the current directory FD.
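To illustrate the idea of opening relative to a directory FD, here is a minimal Python sketch. Passing dir_fd to os.open makes Python resolve the path relative to that directory, as openat does. The special value AT_FDCWD (-100 on Linux) tells openat to resolve relative to the current working directory, which is what makes it a drop-in replacement for open. The file name and contents are made up.

```python
import os
import tempfile

AT_FDCWD = -100  # from <fcntl.h> on Linux: "relative to the current directory"

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "flag.txt"), "wb") as f:
        f.write(b"relative open works\n")

    # Get an FD for the directory, then open a file relative to it.
    dir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
    try:
        fd = os.open("flag.txt", os.O_RDONLY, dir_fd=dir_fd)
        try:
            content = os.read(fd, 64)
        finally:
            os.close(fd)
    finally:
        os.close(dir_fd)

# In shellcode the FD argument is a 32-bit int, so -100 is loaded as:
at_fdcwd_as_u32 = AT_FDCWD & 0xFFFFFFFF
print(content, hex(at_fdcwd_as_u32))
```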
read -> preadv2
preadv2 is similar to read but allows for reading into multiple buffers instead of one.
For our case we just need one. Another point to consider is that, unlike read, preadv2 does not take a pointer to a char buffer; instead it takes a pointer to an iovec struct.
Another point is that we are going to utilize the stack by "allocating" the buffer at a safe offset from RSP. Putting this together we get the following:
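As a higher-level illustration of scatter reads (not the shellcode itself), the Python sketch below uses os.preadv, which reads from a given offset into a list of buffers; Python uses preadv2 underneath only when a non-zero flags argument is given. It also shows how a single struct iovec (a base pointer followed by a length) would be laid out in memory on x86-64. The helper pack_iovec, the address, and the file contents are made up for the example.

```python
import os
import struct
import tempfile


def pack_iovec(base: int, length: int) -> bytes:
    """struct iovec { void *iov_base; size_t iov_len; } on x86-64 (16 bytes)."""
    return struct.pack("<QQ", base, length)


with tempfile.TemporaryFile() as f:
    f.write(b"uiuctf{example}\n")  # placeholder contents
    f.flush()

    # Two buffers: the kernel fills the first completely, then the second.
    head, tail = bytearray(7), bytearray(32)
    nread = os.preadv(f.fileno(), [head, tail], 0)

# What one iovec entry looks like in memory; the address is hypothetical.
entry = pack_iovec(0x7FFFFFFFE000, 0x100)
print(nread, bytes(head), bytes(tail[: nread - len(head)]), entry.hex())
```

With a single buffer, as in this challenge, the iovec array has one entry and the iovcnt argument is 1.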
write -> writev
This is the signature of writev:
We can see that, similar to preadv2, it takes a pointer to an iovec struct, which is handy since we already have one on the stack from above. Moving on, in the filter we can see that writev is not blocked; however, there is a condition placed on the FD argument:
If any of the upper 32 bits is 1, then the syscall is permitted. Furthermore, the size of an FD on a 64-bit Linux system is 4 bytes (32 bits), so only the lower 4 bytes of rdi will be considered for the actual file descriptor that is passed to the kernel. Therefore, we are free to change the upper 4 bytes to anything that bypasses the filter, so I set the highest bit to 1 to satisfy this condition.
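The mismatch between what the filter inspects and what the kernel uses can be modelled in a few lines of Python. This is a simplified model of only the upper-bits condition described above (the real filter may contain other branches), and both function names are made up for the example.

```python
MASK32 = 0xFFFFFFFF


def filter_allows_writev(rdi: int) -> bool:
    """Simplified model: the filter permits writev when any upper bit is set."""
    return (rdi >> 32) != 0


def fd_seen_by_kernel(rdi: int) -> int:
    """The FD parameter is a 32-bit int, so only the low 32 bits of rdi count."""
    return rdi & MASK32


plain = 1                # STDOUT with clean upper bits
crafted = (1 << 63) | 1  # highest bit set, low 32 bits still 1

for value in (plain, crafted):
    print(f"rdi={value:#018x} allowed={filter_allows_writev(value)} "
          f"fd={fd_seen_by_kernel(value)}")
```

The crafted value passes the filter's 64-bit comparison while the kernel still treats it as file descriptor 1.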
Final Payload
With this we have all the ingredients to leak the flag. Putting it all together we get:
FLAG: uiuctf{a532aaf9aaed1fa5906de364a1162e0833c57a0246ab9ffc}
🎊
.fini (Conclusion and Learnings)
This challenge was a difficult one for me, but it is undoubtedly the one I learned the most from, both in terms of technical knowledge and problem solving / mindset. I hope you enjoyed this writeup; feel free to share your feedback and suggestions with me.
Thank you 👍🏻