Compilers and Software Security Gaurav S. Kc Programming Systems Lab

Compilers and Software Security

Gaurav S. Kc

gskc@cs.columbia.edu

http://www.cs.columbia.edu/~gskc

Programming Systems Lab

Tuesday, 22nd April 2003

Outline

 Security

 Runtime Management of Processes

 Vulnerabilities and Attack Techniques

 Compilers 4115

 Security Research

 Conclusion

Security

 What does security mean?

– Focus: Security of resources

• No unauthorised access (using authentication)

• Availability for authorised users (no DoS)

– Also: Security of data during transit

• Protection from eavesdropping

• Protection from malformation

• Solutions: PKI for encryption, digital signatures for non-repudiation

Security: Models & Threats

 Social aspects of security failure

– 3Bs: Burglary, Bribery, Brutality

– Social Engineering

 Threats to Security During Transit

– Man-in-the-middle attack

• Identity spoofing / Masquerading

• Packet sniffing

• Communication replay

Threats to Application Security

 Trojan Horses

Malicious security breaking program disguised as something benign like a screen saver or game program

– Keystroke loggers & powerful remote-control utility like Back Orifice

– Abnormal system behaviour, e.g. open server socket, CTRL-ALT-DEL signal handler

– Zombie nodes, awaiting instructions for conducting DDoS

 Computer Viruses

Executable code that, when run by someone, infects or attaches itself to other executable code in a computer in an effort to reproduce itself

– Can be malicious, erase files, lock up systems

– Boot Sector, File, Macro, Multipartite, Polymorphic, Stealth

– Anti-virus: search for known signature in suspect files

Threats to Application Security 2

 Internet Worms

A worm is a self-replicating program that does not alter files, but resides in active memory and duplicates itself by means of computer networks

– Morris Worm (RTM) exploited fingerd, sendmail, weak passwords

– Code Red exploited a (publicised) vulnerability in Microsoft IIS

– Code Red II had a Trojan payload

– Nimda: Swiss Army knife of worms – worm, virus, trojan!

Spread via its own e-mail engine, IIS servers that it scanned, and shared disks on corporate networks.

 Common Trait:

Well-crafted input data can let you take control of a computer

– WinNuke: for rebooting remote Win95 machine :)


Process Runtime







 x86

– 32-bit von Neumann machine

– 2^32 ≈ 4GB memory locations

 Breakdown of process space

Stack

– <= 0xbfffffff, grows downwards

– Environment variables, program parameters

– Automatically allocated stack variables

– Activation records

Heap

– Dynamic allocation

– Explicitly through malloc, free

Program

int main(int argc, char *argv[], char *env[]) {
    return 0;
}

Process address space, from high to low addresses:

0xffffffff
    kernel space
0xbfffffff
    env[], argv[]
    char *env[], char *argv[], int argc
    runtime stack
    runtime heap
    .bss
    .data
    .text
0x08048000
0x00000000

Process Runtime 2









.bss (Block Started by Symbol)

– assembler directive for IBM 704 assembler

– runtime allocation of space

– RWX

– static & global uninitialised data

.data (Data Section)

– compile-time space allocation, and initialisation values

– RWX

– static & global initialised data

.text (Text Section)

– program code

– runtime DLLs

– RO, X

– executable machine code

.rodata

– RO, X

– constants, e.g. const int x = 4; and string literals such as "hello, world"

[Figure: process address-space layout from 0xffffffff (kernel space) through 0xbfffffff (env[], argv[], runtime stack, runtime heap, .bss, .data, .text) down to 0x08048000 and 0x00000000]

Activation Records









Subroutines

– functions and procedures

– abstraction of computation

– structured programming concept

Stack frame, Function frame, Activation frame

– Block of stack space reserved for duration of function

Logical stack frames are crucial for implementing subroutines

– Each frame contains information related to the context of the given function. Grows downwards for each nested invocation.

Reserved registers

– %eip (next instruction), %esp, %ebp (fixed offsets)

Activation Records 2

 Source function

#include <string.h>

#define SIZE 9

void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    strcpy(buffer, s);   /* no bounds check on s */
    return;
}

int main(void) {
    function("yep", 2.f, 93);
    return 0;
}

 Visualisation of the runtime stack frame

Offset        Contents                   Region
16(%ebp)      int x                      function parameters
12(%ebp)      float y                    function parameters
8(%ebp)       char *s                    function parameters
4(%ebp)       ret. addr: 0x0abcdef0      return address
0(%ebp)       old fp: 0x4fedcba8         old frame pointer
-12(%ebp)     int a                      automatic variables
-16(%ebp)     int b                      automatic variables
-40(%ebp)     char buffer[SIZE]          automatic variables
-44(%ebp)     int c                      automatic variables

Registers marked in the figure: PC, FP, SP

Activation Records 3

 Source function

#define SIZE 9

void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    strcpy(buffer, s);
    return;
}

int main(void) {
    function("yep", 2.f, 93);
    return 0;
}

 Assembly equivalent

 Building the stack frame

function:
    pushl %ebp                # prologue
    movl  %esp, %ebp
    subl  $56, %esp
    subl  $8, %esp            # function body
    pushl 8(%ebp)             # s
    leal  -40(%ebp), %eax     # buffer
    pushl %eax
    call  strcpy
    addl  $16, %esp
    leave                     # epilogue
    ret

.LC0:
    .string "yep"
main:
    ...
    pushl $93                 # int x
    pushl $0x40000000         # float y
    pushl $.LC0               # char *s
    call  function
    ...


Vulnerabilities

 C: Low level, high level systems language

 Efficient execution, Usable for real-time solutions

 Pointers and Arrays

– Pointer to (null-terminated?) block of memory

 Lack of bounds checking

– Buffer overflow causes havoc

Attack Techniques

 Criteria for successful attack

– Locate a buffer that has an unsafe operation applied to it

– Well-crafted input data to trigger the overflow

 Buffer overrun vulnerabilities

– Stack-based: Stack-smashing attack

– Heap-based: Function pointers, C++ virtual pointers, Exception handlers (CodeRed)

 FormatString exploits

– %n format converter for *printf family of functions

– writes #bytes output so far to %n argument (int *)

printf("\x70\xf7\xff\xbf%%n"); // 0xbffff770 := 4

Smashing the Stack







To overflow (automatic) stack buffer, one would need:

– Shellcode, i.e. characters representing machine code (obtain from gdb, as)

– Memory location of injected shellcode (typically buffer address)

Can approximate to make up for lack of precise information

– nop instructions at the beginning of the shellcode

– overwrite locations around 0(%ebp) with shellcode address

Targets: suid installed programs. Shellcode: shell, export xterm display

void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    ... ;
    strcpy(buffer, s);
    ...
}

Stacksmashing attack

• Buffer overrun

• Code injection

• Return address overwritten

[Figure: stack frame of function() after the overrun, from high to low addresses: int x, float y, char *s, ret. addr (originally 0x0abcdef0, now overwritten), old fp, int a, int b, char buffer[SIZE] holding the injected exec("/bin/sh") code, int c; PC marker]

Heap-Based Attacks



 C++ Pointer to vtable (C++ object)

– Higher address: virtual pointer, void *vptr

– Lower address: buffer, char buffer[]

 Function pointer (.bss)

– Higher address: function pointer, int (*f)(void)

– Lower address: buffer, char buffer[]

class ABC {
    char buffer[10];
public:
    void set(char *s) { strcpy(buffer, s); }      // unchecked copy next to the vptr
    virtual void print() { cout << buffer; }
};

int main(int argc, char *argv[]) {
    static char buffer[10];
    static int (*f)(void) = (int (*)(void)) exit;

    // gets(buffer);
    strcpy(buffer, argv[1]);                      // unchecked copy next to f
    (*f)();

    ABC *abc = new ABC();
    abc->set(argv[1]);
    abc->print();
}


Compilers 4115

 GCC: GNU Compiler Collection

– Just a wrapper for different phases

• cpp: C preprocessor, program.c → program.i

• cc1: C compiler proper, program.i → program.s

• as: Assembler (a.out, ELF relocatable files), program.s → program.o

• ld: Link editor (ELF executables), program.o → program

GCC

 Command line options

gcc -save-temps (-pipe) -Wall
    -O0 -dr -v -static
    -I$HOME/include -L$HOME/lib
    -lsocket -lm -lpthread

 Standard libraries

/lib/libc.so.6, /lib/ld-linux.so.2

 Standard library header files

/usr/include

Other tools

 GNU Debugger: gdb

 GNU Binutils

– objcopy : add/remove ELF sections

– readelf,objdump : print ELF information

 Miscellaneous

– ldd : list dynamic dependencies (DLLs)

– strace : trace syscall invocations


Security Research











Know thy enemy

– Monitor the attacker’s behaviour and tactics

– In a constrained resource environment

Honeypots

– Illusion of an “easy target” to lure attackers

Jail

– Sandboxed environment using chroot

– All necessary files are available locally

Virtual machines

Sandboxes with limited syscalls

Automatic Defence Mechanisms





Face thy enemy

– Applications fortified with runtime checks

Stackguard, Memguard, .NET cl.exe /GS

– “canary” word to detect Stack-smashing

– READONLY stack frame

– .NET C/C++ compiler protects 0(%ebp),4(%ebp)

 Libsafe, Libverify

– “safe” implementation of standard libraries

– runtime backup/checking of return address

Defence through Diversity

 Code Diversity

– Code randomisation for diversity

– Security through obscurity even for opensource software

– No more: breach once, breach everywhere

 Compiler-based Protection

– Secure the stack data

– Potentially vulnerable heap data

Casper

 Paper: Casper: Compiler-assisted securing of programs at runtime

 Via added runtime checks as part of function invocations

 Add protection code

 Protect what: control data in stack frames

 What from: most stack-smashing attacks

 Available as patches:

• Compiler: gcc-2.95

• Debugger: gdb-5.2.1

Casper in Action

 Similar in nature to Stackguard, but with much smaller overhead

 XOR property: self-inverse, so applying the same mask twice restores the original value. Simplest form of encryption / obfuscation of data

Casper protection

• Mask original return address value when entering function

• Unmask and restore the original return address value when returning from function

• Overwritten value will be “restored” to invalid code address

[Figure: stack frame with int x, float y, char *s, ret. addr: 0x0abcdef0, old fp, int a, int b, char buffer[SIZE], int c, and the PC marker]

Get the Processor Involved





 Paper: Countering Code-Injection Attacks With Instruction-Set Randomization

Machine instruction translation – unique per process

Reversible mapping

 machine instruction ↔ garbage bit sequence

1. Post-compilation stage

• Encode all executable sections with key

• Store codec key in file header

2. Modified von Neumann: fetch, decrypt , decode, execute

• decrypt : “Processor” restores each block of bytes to valid, original instruction

• Injected code gets probabilistically transformed to garbage bit sequence that cannot be decoded

Binary Encryption and Execution

[Figure: SOURCE CODE is compiled into a MACHINE EXECUTABLE FILE, which is encrypted with a key via objcopy into an ENCRYPTED EXECUTABLE FILE; at run time, fetch is followed by decrypt using the key]

Binary Encryption and Execution 2

 Bochs Pentium emulator is the “modified machine”

– Support for hidden register %gav

– Interrupt routine handler saves %gav to process structure





Linux 2.2.14

– Kernel recognises new register

– Support for register in process structure

as and objcopy for program encryption and codec storage (code on the slide "Binary Encryption Code: GNU as")

Future Work







Randomised ISA on real machine

– Programmable Transmeta chips

– Dynamo: Dynamic optimiser of native code

Activation records

– automatically managed, randomised layout

Heap smashing techniques

– break type-system

– corrupt malloc data

Diversified research

– Languages, Compilers: C++, Sun CC, Visual C++

– Other architectures: Solaris, Alpha (DLX ;-)

Conclusion

 Security

– Process Security

 Runtime Management of Processes

– Stack, Heap, Activation Records

 Vulnerabilities and Attack Techniques

– Buffer overrun. Stacksmashing. Pointer overwriting.

 Compilers 4115

– GCC, GDB, Binutils

 Security Research

– Monitoring. Runtime protection

References

1. The Bochs Pentium emulator. http://bochs.sourceforge.net/

2. Aleph One. Smashing The Stack For Fun And Profit. http://www.phrack.org/show.php?p=49&a=14

3. Arash Baratloo, N. Singh, T. Tsai. Transparent Run-Time Defense Against Stack Smashing Attacks

4. Crispin Cowan, M. Barringer, et al. FormatGuard: Automatic Protection From printf format string vulnerabilities

5. Crispin Cowan, Calton Pu, et al. StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks

6. Gaurav S. Kc, Stephen A. Edwards, Gail E. Kaiser, Angelos Keromytis. Casper: Compiler-assisted securing of programs at runtime

7. Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis. Countering Code-Injection Attacks With Instruction-Set Randomization

Optimisation of Tail-Recursion

C source code

Recursive version:

int factorial(int n) {
    if (1 >= n) return 1;
    return n*factorial(n-1);
}
int val = factorial(x);

Tail-recursive version:

int factorial(int n, int v) {
    if (1 >= n) return v;
    return factorial(n-1, v*n);
}
int val = factorial(x, 1);

Assembly

Recursive version:

factorial:
    ...
    pushl n-1
    call factorial
    ...

Tail-recursive version, after optimisation:

factorial:
    ...
    v := v*n
    n := n-1
    goto factorial

x86 Processor

 Dual integer pipeline

 Hidden register

%eip does not always fetch the “next” instruction

Binary Encryption Code: GNU as

if [ ! $1 ] ; then echo "usage: $0 <ELF_executable_image> [key]"; exit; fi
if [ ! $2 ] ; then XOR_KEY="0x$RANDOM"; else XOR_KEY=$2; fi

# file names
NEW_FILE="$1.$XOR_KEY"
ORG_FILE=$1
INTERMEDIATE="$XOR_KEY.o"

# modified binary
OBJCOPY=/home/gskc/usr/binutils-2.13.2/bin/objcopy

# create an intermediate ELF object file with an .xor.stuff section
as -o $INTERMEDIATE <<EOF
.section .xor.stuff
.long $XOR_KEY
EOF

# merge the .xor.stuff section into the specified file
$OBJCOPY --encrypt-xor-key $XOR_KEY --add-section .xor.stuff=$INTERMEDIATE $ORG_FILE $NEW_FILE

# clean up
rm -f $INTERMEDIATE
