Memory and Stack Frames

Recently I completed a Capture The Flag machine named 'The Necromancer' from the site VulnHub.  If you've never heard of VulnHub before, it is essentially a site that provides 'hands-on' experience in digital security by way of virtual machines designed to be deliberately vulnerable to certain exploits (https://www.vulnhub.com/about/).

One of the 11 flags in 'The Necromancer' requires the use of a memory buffer overflow. In short, a memory buffer overflow occurs when a program attempts to put more data into an allocated section of memory than it can hold (for instance, 50 bytes of data into a 40-byte memory buffer), causing the program to crash or, in some instances, execute malicious code.

Understanding and executing a memory buffer overflow first requires knowledge of memory: how it is allocated and how it is used. This knowledge is crucial before writing any code to execute an attack. In typical fashion I ended up with dozens of pages of notes on these very topics while doing my own study. By collating these notes into one post I hope to help someone else understand computer memory, and also make my own revision that much easier.


Computer memory

Computer memory is a large array of consecutive data cells, or bytes, allocated to a program at run time. One technical caveat: memory can be allocated either statically or dynamically. Static memory is allocated for variables and data structures whose sizes are fixed and known at compile time. Dynamic memory is allocated as needed, while the program runs, for variables and data structures without a fixed size.


Memory Map

On a 32-bit system, memory addresses are represented as hexadecimal numbers, with the highest address being FFFFFFFF and the lowest being 00000000. Within this structure some areas of memory are always allocated to particular resources. For example, the Stack is an area of memory containing function call data that starts in high memory address space and grows down towards lower memory addresses. In contrast, the Heap is an area of memory that starts in low memory address space and grows up towards higher memory addresses.
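As a quick sanity check on those two addresses, here is a small Python sketch that works out how many bytes they span. It assumes a 32-bit address space, which the eight-digit hexadecimal addresses imply; the variable names are my own.

```python
# Assumption: a 32-bit address space, as implied by the 8-hex-digit addresses.
LOWEST_ADDRESS = 0x00000000
HIGHEST_ADDRESS = 0xFFFFFFFF

# Every address in the range names one byte, so the count is inclusive.
addressable_bytes = HIGHEST_ADDRESS - LOWEST_ADDRESS + 1

print(f"{addressable_bytes} bytes")           # 4294967296 bytes
print(f"{addressable_bytes // 2**30} GiB")    # 4 GiB
```

In other words, eight hexadecimal digits are enough to give each of 4 GiB worth of bytes its own address.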




The Stack

The Stack is an area of memory that consists of multiple Stack Frames. A Stack Frame holds several pieces of information, including a function's arguments and local variables, a return value, register values saved from previous functions, the previous function's saved Base Pointer and a Return Address used to continue executing the rest of the program's code after the function exits normally.
Each function call creates its own Stack Frame. Each Stack Frame has a predictable structure, which can be visually represented by the following image.


 
The process of putting new data onto the Stack is called a push while taking data off of the Stack is called a pop. So, we 'push' new data onto the Stack and 'pop' old data off of the Stack. Data on the Stack is processed by 'First In Last Out' (FILO) logic.
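A minimal sketch of push and pop, using an ordinary Python list as a stand-in for the Stack (the real Stack is a region of memory, not a list, so this only illustrates the ordering):

```python
# A Python list used as a stack: append() is 'push', pop() is 'pop'.
stack = []

stack.append("first")    # pushed first, so it will come off last
stack.append("second")
stack.append("third")    # pushed last, so it will come off first

popped_order = [stack.pop(), stack.pop(), stack.pop()]
print(popped_order)      # ['third', 'second', 'first']
```

The first item pushed is the last one popped, which is exactly the 'First In Last Out' behaviour.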

Stack Frames are the focus of memory buffer overflows. A memory buffer overflow seeks to overwrite the Return Address with a known memory address that contains some sort of malicious code, or even just a call to another function that isn't supposed to be accessed.

Overwriting the Return Address is achieved by 'overflowing' the available memory buffer space, which can be thought of as everything between the Return Address and the Stack Pointer. On a 32-bit system the Return Address is 4 bytes wide, so if the space between the Return Address and the Stack Pointer is 500 bytes long and we write 504 bytes as input to the function, the Return Address will be overwritten with the additional 4 bytes of data. When the newly rewritten Return Address is popped from the Stack, the program will attempt to find and execute the instruction at that memory address. If the memory address is invalid, a Segmentation Fault will occur and the program will crash. If the memory address points to a valid instruction, the program will execute whatever code is at that address.
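The idea can be modelled in a few lines of Python. This is a toy sketch only: the 8-byte buffer, the addresses and the function names are all made up for illustration, and the frame is simplified to a buffer followed directly by a 4-byte Return Address (a real frame usually holds other saved values in between).

```python
import struct

BUFFER_SIZE = 8                  # hypothetical local buffer, in bytes
RETURN_ADDRESS_SIZE = 4          # a 32-bit address is 4 bytes wide
ORIGINAL_RETURN = 0x08048400     # made-up address for illustration


def build_frame():
    """Lay out a toy frame: the buffer first, then the return address."""
    frame = bytearray(BUFFER_SIZE)
    frame += struct.pack("<I", ORIGINAL_RETURN)   # little-endian, like x86
    return frame


def unchecked_copy(frame, data):
    """Copy data into the frame from the start of the buffer with no
    bounds check, which is the mistake that makes an overflow possible."""
    frame[0:len(data)] = data


def return_address(frame):
    """Read back the 4 bytes stored just past the buffer."""
    (address,) = struct.unpack_from("<I", frame, BUFFER_SIZE)
    return address


safe = build_frame()
unchecked_copy(safe, b"A" * BUFFER_SIZE)          # fits exactly
print(hex(return_address(safe)))                  # 0x8048400

smashed = build_frame()
payload = b"A" * BUFFER_SIZE + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(smashed, payload)                  # 4 bytes too many
print(hex(return_address(smashed)))               # 0xdeadbeef
```

Input that fits the buffer leaves the Return Address alone, while input that is 4 bytes too long replaces it with a value of the attacker's choosing.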


Function calls

I've mentioned the term 'function' several times already without a proper explanation, so it's time to define exactly what a function call is.

For the sake of argument we will define a function as a block of code that returns a value (some arbitrary kind of data) after the function completes. When a function is called by a program all of its associated data is pushed onto the stack. The function may contain a call to yet another function which is also pushed onto the stack. Each function is then executed in turn and, when complete, popped from the stack. The process can be visually represented as the following (remember the "First In Last Out" logic here):





The function 'F' makes a call to the function 'G', which in turn calls the function 'H'. Using the "First In Last Out" logic that applies to stack frames, 'H' was pushed last, so it is the first to finish and be popped from the stack. Control then returns to 'G', which finishes and is popped next, followed finally by 'F'.
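To watch that order play out, here is a small Python sketch that keeps its own record of which functions are active. The 'traced' helper and the two lists are my own bookkeeping for illustration; they are not how a real call stack is implemented.

```python
call_stack = []      # our own record of which functions are active
events = []          # what happened, in order


def traced(func):
    """Record a push when func is called and a pop when it returns."""
    def wrapper():
        call_stack.append(func.__name__)
        events.append(("push", func.__name__, list(call_stack)))
        result = func()
        call_stack.pop()
        events.append(("pop", func.__name__, list(call_stack)))
        return result
    return wrapper


@traced
def H():
    return "H done"


@traced
def G():
    return H()


@traced
def F():
    return G()


F()
for action, name, snapshot in events:
    print(f"{action:4} {name}  stack is now {snapshot}")
```

The frames are pushed in the order F, G, H and popped in the order H, G, F.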


A closer look at ESP and EIP

To understand how data is pushed and popped within the context of a Stack Frame and how a program knows which instruction to execute next, it is necessary to take a closer look at Stack Pointers and Instruction Pointers.

The Stack Pointer (ESP on 32-bit x86) references the lowest memory address of a given Stack Frame. This can generally be thought of as the 'end' of a Stack Frame. When a value is subtracted from the Stack Pointer we are actually increasing the size of the Stack Frame, making room to push new data onto the frame (remember: Stack Frames grow from high memory addresses to low memory addresses). When a value is added to the Stack Pointer we are decreasing the size of the Stack Frame, as happens when data is popped from the frame.
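The arithmetic is easier to see with numbers. In this Python sketch the starting address is made up, and the frame size is simply the distance between the top of the frame and the Stack Pointer:

```python
FRAME_TOP = 0xBFFFF000      # made-up address where the frame begins
esp = FRAME_TOP             # empty frame: Stack Pointer sits at the top

esp -= 4                    # subtracting moves ESP to a lower address
size_after_sub = FRAME_TOP - esp
print(hex(esp), size_after_sub)     # 0xbfffeffc 4

esp += 4                    # adding moves ESP back up
size_after_add = FRAME_TOP - esp
print(hex(esp), size_after_add)     # 0xbffff000 0
```

Subtracting from the pointer makes the frame bigger, and adding to it makes the frame smaller.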

The Instruction Pointer (EIP on 32-bit x86) is, quite simply, a reference to the memory address of the next instruction to be executed. The processor only ever executes the instruction that the Instruction Pointer is referencing.






In the above image the Instruction Pointer is represented by the => symbol. The instruction about to be executed is sub (subtract) with two operands: $0x4, %esp. This subtracts 4 from the Stack Pointer, making room for new data on the Stack Frame.

Before the sub $0x4, %esp instruction, a number of other instructions were executed that relate to the creation of this particular Stack Frame. The push %ebp instruction pushes the previous Base Pointer onto the Stack, and the mov %esp, %ebp instruction copies the Stack Pointer into the Base Pointer (this allows the Base Pointer to become a kind of reference for other memory addresses in the frame).
This set of instructions (push, mov, sub) makes up a song and dance known as the Function Prologue. The Function Prologue prepares the Stack and Registers for use within the function.

After several other instructions have been executed, including a call to another function called 'wearTalisman', we observe an add instruction with two operands: $0x4, %esp. This adds back the same amount of space previously subtracted from the Stack Pointer. Next we observe two register values being popped from the Stack: pop %ecx and pop %ebp. Each pop restores a register value that was saved earlier and moves the Stack Pointer up again, shrinking the Stack Frame further.

Finally, the ret instruction is executed. It pops the Return Address from the Stack into the Instruction Pointer, and the program continues from that address.
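The whole sequence can be traced with a toy model in Python. This is a simplified sketch rather than a real emulator: the ToyCPU class and every address in it are made up, and it leaves out the %ecx save and restore seen in the disassembly so that only the Base Pointer and Return Address are tracked.

```python
class ToyCPU:
    """A tiny model of ESP, EBP and EIP. Illustration only."""
    WORD = 4                         # bytes per push/pop on 32-bit x86

    def __init__(self, esp, ebp, return_address):
        self.memory = {}             # toy memory: address -> value
        self.esp = esp
        self.ebp = ebp
        self.eip = None
        # The 'call' that got us here already pushed the return address.
        self.push(return_address)

    def push(self, value):
        self.esp -= self.WORD        # the stack grows towards low addresses
        self.memory[self.esp] = value

    def pop(self):
        value = self.memory.pop(self.esp)
        self.esp += self.WORD
        return value

    def prologue(self, local_bytes):
        self.push(self.ebp)          # push %ebp
        self.ebp = self.esp          # mov  %esp, %ebp
        self.esp -= local_bytes      # sub  $local_bytes, %esp

    def epilogue(self, local_bytes):
        self.esp += local_bytes      # add  $local_bytes, %esp
        self.ebp = self.pop()        # pop  %ebp
        self.eip = self.pop()        # ret


cpu = ToyCPU(esp=0xBFFFF000, ebp=0xBFFFF020, return_address=0x08048500)
cpu.prologue(local_bytes=4)
print(hex(cpu.esp), hex(cpu.ebp))    # 0xbfffeff4 0xbfffeff8
cpu.epilogue(local_bytes=4)
print(hex(cpu.esp), hex(cpu.ebp), hex(cpu.eip))
# 0xbffff000 0xbffff020 0x8048500
```

After the final step the Stack Pointer and Base Pointer are back to the caller's values, and the Instruction Pointer holds whatever was stored in the Return Address slot. If that slot had been overwritten, this is the moment the program would jump to the attacker's address.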


Practice, practice, practice! Study, study, study!

Overwriting the Return Address with a memory address that points to malicious code or a desired function is the purpose of a memory buffer overflow. A big part of successfully executing this type of attack is understanding the basic concepts behind computer memory and stack frames. There is so much to learn about these concepts that it is impossible not to learn more and more each time the topic is visited. The attack is something of an art that demands practice, study, and coffee!

