ECE 4112 Internetwork Security
Lab: Cryptography

Group Number: _______________
Member Names: _________________________ _________________________

Date Assigned:
Date Due:
Last Edited:

Lab Authored by: Scott Gilliland and Szabolcs Palinko

Please read the entire lab and any extra materials carefully before starting. Be sure to start early enough so that you will have time to complete the lab. Answer ALL questions and be sure you turn in ALL materials listed in the Turn-in Checklist ON or BEFORE the Date Due.

Goal: This lab will introduce you to some basic cryptographic primitives. Cryptography is widely used over the Internet to allow secure communications and trust, and therefore a general understanding of its principles is essential for every network security professional.

Summary: This lab covers symmetric and public-key cryptography, key exchange algorithms, hash functions, and digital signatures. In the second part of the lab, Pretty Good Privacy (PGP), a personal cryptography suite, is introduced.

Background and Theory: The general background and theory of the different cryptographic primitives will be presented in the appropriate sections of the lab.

Lab Scenario: You will use your Linux workstation for this lab. However, in order to answer some of the questions, you will need public Internet access. You can use your own computer for this purpose, but you can also find computers connected to the Internet in the lab.

Section 1: Cryptographic Primitives

Symmetric and public-key cryptography algorithms can be used to securely send messages/data between two parties. The original message is called plain text, and the encrypted message is called cipher text. The sender encrypts the plain text message using a key, and the receiver decrypts the cipher text into the original plain text using a key as well.
Symmetric-Key Cryptography

In symmetric-key cryptography, the key used by the sender to encrypt the message is the same key that is used by the receiver to decrypt the encrypted message; hence the name “symmetric.” The sender and the receiver have to agree on the key before using this method.

Some examples of symmetric cryptography algorithms:
  DES
  AES
  IDEA

Key Exchange Algorithms

If two parties are using symmetric-key cryptography, they have to share the same key. Key sharing can be done safely by meeting in person and exchanging the key. However, this is not feasible in many cases, and definitely not practical in computer networks. There are algorithms, based on mathematical problems believed to be hard, that allow two parties to establish a shared key securely over an insecure network.

Some examples of key exchange algorithms:
  Diffie-Hellman
  Station-to-Station

Exercise

You decided to encrypt your documents with DES and send them to one of your friends. You live in Atlanta, but he is currently located in London. Answer the following questions assuming this situation.

Q 1.1: Do you think it’s secure to send the encrypted documents to your friend as an attachment to an email? Why?

Q 1.2: How would you let your friend know the key required to decode the encrypted data? Which of the following methods would you consider secure for letting him know the key: email, telephone, fax, or mail? Why?

Public-Key Cryptography

As opposed to symmetric-key cryptography, public-key algorithms use two different keys: the key used to encrypt the message is called the public key, and the key used to decrypt a message is called the private key. The private key is kept secret, but the public key can be made publicly available, and anyone can use the public key to encrypt a message and send it to the owner of the private key. However, only the receiver in possession of the private key is able to decrypt the message. Because of this asymmetry, public-key algorithms are also called asymmetric algorithms.
Note that in public-key cryptography a public-private key pair is only good for one-way secure communication. If Alice has her private key and makes her public key available to all her friends, her friends can safely send her documents encrypted with the public key, and only she will be able to decrypt them using her private key. However, if Alice wants to securely send a message to one of her friends, Bob for instance, she needs Bob’s public key and must encrypt the message with it. Only Bob, using his private key, will be able to decrypt the message he receives from Alice.

Although the public and private keys are mathematically related to each other, it is an “extremely hard” mathematical problem to infer the private key from the public key. Public-key algorithms rely on mathematical problems that are widely believed to be hard, so that the private key cannot be calculated (in a reasonable amount of time) from the public key.

Although it would appear that public-key algorithms are superior to symmetric algorithms, they have one drawback: they are generally much more computationally intensive, and therefore slower and more demanding of resources. This is why several protocols, such as SSH, do the following: they use the slow public-key cryptography to securely agree on a fast symmetric algorithm and a key, and then they use the faster symmetric algorithm to exchange data between the communicating parties. Because a symmetric key that stays in use for a long time gives an attacker more time and more cipher text to work with, the communicating parties periodically generate and exchange new symmetric keys using the slow public-key method. The period between new symmetric keys is chosen to be short enough that breaking a symmetric key within that period is infeasible.
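The relationship between the two keys can be made concrete with a toy RSA computation. The Python sketch below is not part of the original lab: the primes, the exponent, and the function names are illustrative assumptions, and the key is far too small (and used without padding) to offer any real security.

```python
# Toy RSA sketch: illustrative only, NOT secure (tiny key, no padding).
# All names and parameters here are assumptions, not part of the lab.

P = 2**61 - 1            # a known Mersenne prime
Q = 2**31 - 1            # another known Mersenne prime
N = P * Q                # public modulus
E = 65537                # public exponent
PHI = (P - 1) * (Q - 1)
D = pow(E, -1, PHI)      # private exponent (modular inverse, Python 3.8+)


def encrypt(message: bytes, e: int = E, n: int = N) -> int:
    """Encrypt with the PUBLIC key (e, n)."""
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy modulus")
    return pow(m, e, n)


def decrypt(ciphertext: int, length: int, d: int = D, n: int = N) -> bytes:
    """Decrypt with the PRIVATE key (d, n)."""
    m = pow(ciphertext, d, n)
    return m.to_bytes(length, "big")


if __name__ == "__main__":
    secret = b"ECE4112"
    c = encrypt(secret)               # anyone with (E, N) can do this
    print(decrypt(c, len(secret)))    # only the holder of D can do this
```

Anyone who knows only N and E can run encrypt, but recovering D requires factoring N into P and Q, which is what becomes infeasible when the primes are hundreds of digits long.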
Some examples of public-key cryptography algorithms:
  RSA
  Elliptic Curve Cryptography

Exercise

You decided to encrypt your documents with RSA and send them to one of your friends (using his public key). You live in Atlanta, but he is currently located in London. Answer the following questions assuming this situation.

Q 1.3: Do you think it’s secure to send the encrypted documents to your friend as an attachment to an email? Why?

Q 1.4: How would your friend let you know the public key required to encrypt the data? Which of the following methods would you consider secure for him letting you know the key: email, telephone, fax, or mail? Why?

Q 1.5: Do you need your friend’s private key, in this case, to securely send him the documents? Why?

Exercise

In this exercise we will set up an ssh connection between a client and a server that uses RSA-based public-key cryptography for authentication. The public key will be stored on the server, and any client machine that possesses the private key will be able to authenticate itself and log in to the server using ssh. With automatic public-private key based authentication, there is no need to provide a password every time the client wants to access the server via ssh.

Log in to the workstation as root. Stop the firewall on the workstation machine to make sure that you will be able to use ssh.

  server$ service iptables stop

Generate the public-private key pair on the workstation machine. The -t rsa option tells the program to generate an RSA key pair.

  server$ ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/Users/szab/.ssh/id_rsa): [Enter]
  Enter passphrase (empty for no passphrase): [Enter]
  Enter same passphrase again: [Enter]
  Your identification has been saved in /Users/szab/.ssh/id_rsa.
  Your public key has been saved in /Users/szab/.ssh/id_rsa.pub.
  The key fingerprint is:
  10:a8:fe:44:ba:9a:c4:20:99:66:bd:01:04:ed:73:80 szab@lawn-128-61-11344.lawn.gatech.edu

The generated public key is stored in the ~/.ssh/id_rsa.pub file, and the private key is stored in the ~/.ssh/id_rsa file. The ssh server recognizes public keys that are stored in the ~/.ssh/authorized_keys file. Therefore we should append the contents of the newly generated public key to this file:

  server$ cd ~/.ssh/
  server$ cat id_rsa.pub >> authorized_keys

Now let’s copy the private key from the workstation (the server) to the virtual RedHat machine’s ssh folder. Make sure that you are logged in to the virtual machine as root. Type the following on the virtual machine (the x in the IP address identifies the server):

  client$ scp 57.35.6.x:~/.ssh/id_rsa ~/.ssh/

Now that we have the private key in the client’s ssh folder and the public key in the server’s ssh folder, we can log in to the server from the client using the public-private keys, without the need to manually enter passwords:

  client$ ssh 57.35.6.x

Q 1.6: Could you log into the server using ssh without entering a password?

Hash Functions

A hash function maps an arbitrary input/message into a fixed length output (called the hash value). Hash functions can be used to generate “fingerprints” of messages. The most important attributes of a hash function include:
  The same input is always mapped to the same output
  It is “extremely hard,” and most of the time impossible, to infer the actual input of the hash function by knowing its output (that’s why hash functions are also referred to as “one way functions”)
  Given a hash value, it is “extremely hard” to find a message that maps to that particular hash value
  It is “extremely hard” to find two inputs that map to the same hash value (that’s why hash functions are said to be “collision free,” or more precisely collision resistant)
  Even a small change (one bit) in the input radically changes the output

Some examples of popular hash functions:
  MD5
  SHA

Exercise

Download the file importantApplication.tar.gz from the NAS to your computer.
Generate the MD5 hash value of the compressed file you have just downloaded.

  $ md5 importantApplication.tar.gz

Now list the contents of the file importantApplication.md5 stored on the NAS (use cat for this purpose). This file contains the MD5 hash value of the application, generated by the administrator who put the application on the NAS for public access.

Q 1.7: Compare the two hash values. Are they the same? Was the file you downloaded modified by an attacker while you were downloading it? How do you know?

On the Internet, many web pages post MD5 (or SHA) checksums alongside downloadable files. Such a checksum can be used to verify that the downloaded file was not corrupted or altered during the download process (by an attacker, for instance) in the same way we did above: you calculate the MD5 hash value of the downloaded file on your computer and compare it to the hash value posted on the website. If the two are the same, you can be confident that you have exactly the same file that the administrator put up on the Internet for download.

Q 1.8: If a website you downloaded from is modified by a hacker, can you rely on the MD5 checksum posted on the site to make sure that the file you downloaded does not contain malicious code? Why?

Exercise

Generate the MD5 hash value (checksum) for the string “GroupXX” (substitute XX with your group number, use two digits, e.g. 01, 02, 12, 23):

  $ md5 -s "GroupXX"

Now generate the checksum for another group (only change one of the digits of your group number):

  $ md5 -s "GroupXY"

Q 1.9: Note that we only changed one character in the input string. How many characters are common in the two checksums? Which attribute of the hash functions mentioned above is related to this result?

Here’s an MD5 checksum for one of the groups:

  68cbc1d9f3ef9ba649564614d6064219

Q 1.10: Which group does this checksum belong to?
Knowing the characteristics of hash functions, do you think it’s possible to get the group number other than by using a brute force approach?

Hash functions are also used in the Linux system for password authentication. When a user selects a password for her account, the password itself is not stored in the system, only its hash value. When the user tries to log in, the system generates the hash value of the password entered and compares it to the hash value stored in the system. If the two are the same, the user is authenticated. With this approach, an attacker cannot simply read the users’ passwords even if he gets root access to the system. The attacker has to perform a brute force or dictionary-based attack to try to guess a password that maps to the hash value stored in the system. This is another reason why selecting complex passwords that are hard to guess is crucial in computer security.

Random Numbers

Most cryptographic algorithms rely on random number generators, and most of them fail to provide security if the “random” numbers they are using can be predicted or inferred by an attacker. However, generating numbers on a deterministic computer that closely resemble truly random numbers is not an easy task at all.

Q 1.11: Do you think using the current CPU time to generate random numbers for a secure algorithm is a viable option (Netscape used to generate random numbers in this manner)? Why?

Q 1.12: Do you think using previous mouse movements and the timing of keyboard strokes is a viable way of generating random numbers for a secure algorithm?

Exercise

Your Linux installation provides two random number generators that are exposed as character devices. You can get random numbers by reading the following devices: /dev/random and /dev/urandom

Let’s read 10*1024 bytes of random data from both of these random number generators and measure the time required for this.
First, generate random data using /dev/urandom:

  $ time dd if=/dev/urandom of=random.txt bs=1024 count=10

Then generate random data using /dev/random:

  $ time dd if=/dev/random of=random.txt bs=1024 count=10

Q 1.13: Which one took longer? Judging from the running times, which one do you think generates higher quality random numbers, random or urandom?

Q 1.14: Look at the man pages of random and urandom. Which one generates higher quality random numbers? Which one would you use in a secure algorithm? Why?

Digital Signatures

There is another very common application of public-key cryptography besides encrypting messages: digital signatures. Digital signatures are similar to regular signatures because they have the following characteristics:
  A signature is bound to the document that is signed and cannot be reused for another document
  The signature is bound to the person who signs the document and cannot be faked by anyone else
  Anyone can verify and prove that a signed document was signed by a particular person

Here is how digital signatures work:
  A user has his/her own private key, and the related public key can be made public
  The user signs the document by encrypting it using his/her private key, and publishes the document along with the signature (the encrypted document)
  Anyone can verify the signature by decrypting the signature using the public key and comparing it to the document. If the document and the decrypted signature are the same, then it is proven that the document was signed by the person to whom the public key belongs.

As we mentioned before, public-key cryptography is very computationally intensive. Also, if we sign the whole document, the signature will have the same size as the original document. Therefore signing large documents in full is not a viable option.
This is the reason why only the hash value of a document (which is limited in size) is signed in practice:
  The user generates the hash value of the document to be signed
  The user encrypts the hash value using his/her private key, and publishes the document along with the signature (the encrypted hash value)
  Anyone can verify the signature by decrypting the signature using the public key and comparing it to the hash value of the document. If the hash value of the document and the decrypted signature are the same, then it is proven that the document was signed by the person to whom the public key belongs

There is another important problem related to public-key cryptography. Let’s assume that you want to securely send a document to Henry and make sure that no one else can read it. You need Henry’s public key for this purpose, and you find his public key posted on a website. You can encrypt your document using the public key you found on the website, and only someone who owns the matching private key will be able to decrypt it. However, it is possible that an attacker posted his own public key, stating that it belongs to Henry. If you use this key, Henry won’t be able to read the documents but the attacker will.

The same thing can happen with websites. Let’s assume that you go to Amazon.com to buy some books. When you enter the checkout process, and before you enter your credit card number, your connection becomes encrypted using public-key cryptography (https:// URLs stand for secure http). When you see https:// in a URL, the public-key cryptography used in the secure http protocol is designed to ensure that no third party is able to decrypt the data traffic between you and the website. The website gives you its public key and you send your messages to the website encrypted with the public key. However, how can you be sure that you are actually communicating with Amazon.com and not with an attacker?
How do you know that the public key belongs to Amazon.com and not to an attacker, and that only Amazon will be able to read your credit card data?

As an answer to these problems, a trust model was introduced to verify that a particular public key belongs to a particular person or entity. There are some authorities (a few well known companies), called certificate authorities, that certify that a particular public key belongs to a particular entity. They do this by signing the (public key, entity) pair and making this signed pair available to the public as a certificate. Anyone who wants to make sure that a particular public key belongs to a particular entity should verify this by relying on certificate authorities. For instance, if the Amazon.com website offers you a public key to use for encrypted data transfer, you should check whether the public key really belongs to Amazon.com or not. If you find a certificate issued and signed by a certificate authority that verifies that the public key belongs to Amazon, then you do not have to worry. If no certificate authority vouches for the public key, you cannot be sure that Amazon.com is at the other end of the connection and not an attacker who wants to steal credit card data by pretending to be Amazon.com.

Of course, you don’t have to verify the public keys manually; browsers do this for you automatically by looking at certificates issued and signed by authorities. Most browsers verify the public keys used in https connections and stay silent if everything seems OK. If the browser can’t verify the public key offered by a website, it warns you.
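The hash-then-sign procedure and the certificate idea described above can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not how real certificates are encoded or validated: the toy RSA key is far too small to be secure, no padding is used, and the names www.example.com, site-public-key, and issue_certificate are hypothetical.

```python
import hashlib

# Toy certificate authority (CA) key pair: illustrative only, NOT secure.
CA_P, CA_Q = 2**61 - 1, 2**31 - 1        # two known Mersenne primes
CA_N = CA_P * CA_Q                        # CA public modulus
CA_E = 65537                              # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))   # CA private exponent


def digest(data: bytes, n: int) -> int:
    """SHA-256 hash of the data, reduced so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, d: int, n: int) -> int:
    """Sign the hash of the data with a private key (d, n)."""
    return pow(digest(data, n), d, n)


def verify(data: bytes, signature: int, e: int, n: int) -> bool:
    """Recover the signed hash with the public key and compare it."""
    return pow(signature, e, n) == digest(data, n)


def issue_certificate(entity: str, public_key: str):
    """The CA signs the (entity, public key) binding."""
    binding = f"{entity}|{public_key}".encode()
    return binding, sign(binding, CA_D, CA_N)


if __name__ == "__main__":
    binding, sig = issue_certificate("www.example.com", "site-public-key")
    print(verify(binding, sig, CA_E, CA_N))
    forged = b"www.example.com|attacker-public-key"
    print(verify(forged, sig, CA_E, CA_N))
```

A browser performs the analogous check with the CA public keys it ships with: if the signature does not match the (entity, public key) binding presented by the site, the certificate is rejected.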
Exercise

Using the Firefox browser, go to the Georgia Tech WebCT webpage and click on the “WebCT Login Page” link:

  http://webct.gatech.edu

Notice that the URL begins with https://, which implies that the connection is now encrypted using public-key cryptography, and you are using the public key offered by the website to encrypt data sent to the site. Click on the lock icon located on the right side of the URL input field in the browser, and then click on the View button to take a look at the details of the certificate that certifies the public key offered by the website. Read through both the General and Details tabs and answer the following questions:

Q 1.15: Which organization issued the certificate? Is this organization a certificate authority?

Q 1.16: For which website and which organization was the certificate issued?

Q 1.17: What public-key algorithm was used to sign this certificate?

Q 1.18: What public-key algorithm is used to securely communicate between your browser and the WebCT page?

Q 1.19: Do you think it’s safe to enter your GT username and password on this page? Can you be sure that the username and password will only be readable by Georgia Tech and no one else?

Using the Firefox browser, go to a webpage that belongs to a European university:

  https://db.bme.hu

When Firefox prompts you, click on “Examine Certificate…” and answer the following questions:

Q 1.20: Which organization issued the certificate? Is this organization a certificate authority?

Q 1.21: For which website and which organization was the certificate issued?

Q 1.22: What public-key algorithm was used to sign this certificate?

Q 1.23: What public-key algorithm is used to securely communicate between the browser and the webpage?

Q 1.24: Could a third party read the data you send to this webpage?

Q 1.25: Can you be sure that the public key offered by the website really belongs to db.bme.hu?
Section 2: Pretty Good Privacy

OpenPGP

OpenPGP is a standard used by programs like PGP and GPG to provide encryption and digital signing. It also provides methods for verifying to whom a public key belongs. PGP was the original software, created by Philip Zimmermann in 1991, and was designed to allow people to generate their own public/private key pairs, and then to encrypt/decrypt or sign/verify documents, mostly emails. This functionality was cloned by the GPG project so that the software would be available under a GNU license. To keep the two programs compatible, both follow the OpenPGP standard.

One of the most difficult problems facing public-key cryptography today is proving that the party presenting a certificate is who they claim to be in the real world. For instance, your computer could go to a web site that presents itself as www.paypal.com. Before you give them your password, however, you want to know that they are really the company PayPal. They could have easily created a public/private key pair under the name paypal, but that doesn’t verify anything. What is needed is for someone you trust to verify that the public key for www.paypal.com is owned by PayPal, the company. They can vouch for that public key by signing it, something no one else can do.

There are two models for doing this: one based on a hierarchy, X.509, and the other based on a web of people trusting each other, often called the “web of trust” model.

X.509

This hierarchical model is what most websites use today. Companies like Verisign and Comodo act as certificate authorities, sitting at the top level of the tree. They sign other companies’ certificates if those companies can prove that they are legally responsible for the domain name the certificate is for.

Web of trust

Under this model, individuals vouch for each other after having proven their identities.
Once enough people have signed other people’s keys, any one person can usually trust another person through a chain of several signatures. For example, say there are three people: Alice, Bob, and Chris. If Alice verifies that Bob’s key actually belongs to Bob and signs his key, and Bob verifies that Chris’s key actually belongs to Chris and signs his key, then Alice can trust that Chris’s listed key is really his by trusting Bob. With more people, the graph becomes more complete, so that trust doesn’t have to go through one person (in this case, Bob).

For a more real-world example, here is the way that two randomly picked people with College of Computing email addresses (David Shea and Todd White) could trust one another:

[Figure: trust path between David Shea and Todd White]

In support of the idea that this method works, here is an interesting statistic: there is a strongly connected set of people (that is, people who can all trust one another) that contains over 34000 keys, with an average distance between keys of just over 6 people.

Exercise

Q 2.1: Using the key lookup server at http://pgp.mit.edu/, find the 8 digit key ID of Chris Klaus, the person with his name on the side of the building you’re in.

Q 2.2: Using a statistics site such as http://pgp.cs.uu.nl/, how many people have signed the key of Derek Atkins, C1B06AF1? How many keys have been signed by this key? (Derek Atkins is one of the most-connected people in the strong set)

Q 2.3: Say that you get an email from the email address of the professor of this class, and you want to verify that it’s really from the professor. The email is signed with a PGP key. What would you need to do to really trust the identity of the sender? Can you really be sure of the identity of a key if you download it from a public key-server?
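The Alice, Bob, and Chris example from the web of trust discussion can be viewed as a search for a chain of signatures in a directed graph. The Python sketch below is only an illustration: the trust_path function and the SIGNATURES table are hypothetical names, and real OpenPGP software applies additional rules (such as how much each signer is trusted) that are not modeled here.

```python
from collections import deque


def trust_path(signatures, start, target):
    """Return the shortest chain of signatures from start to target, or None.

    signatures maps each signer to the set of key owners whose keys
    that signer has verified and signed.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # Breadth-first search guarantees the first path found is shortest.
        for signed in sorted(signatures.get(path[-1], ())):
            if signed not in seen:
                seen.add(signed)
                queue.append(path + [signed])
    return None


# Alice signed Bob's key, and Bob signed Chris's key.
SIGNATURES = {"Alice": {"Bob"}, "Bob": {"Chris"}}

if __name__ == "__main__":
    print(trust_path(SIGNATURES, "Alice", "Chris"))
    print(trust_path(SIGNATURES, "Chris", "Alice"))
```

Signatures are one-directional, so in this sketch Alice can reach Chris through Bob, but Chris has no chain back to Alice until someone signs in the other direction.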
ECE4112 Internetwork Security
Lab: Cryptography

Group Number: _________
Member Names: ___________________ _______________________

Answer Sheet

Section 1

Q 1.1: Do you think it’s secure to send the encrypted documents to your friend as an attachment to an email? Why?

Q 1.2: How would you let your friend know the key required to decode the encrypted data? Which of the following methods would you consider secure for letting him know the key: email, telephone, fax, or mail? Why?

Q 1.3: Do you think it’s secure to send the encrypted documents to your friend as an attachment to an email? Why?

Q 1.4: How would your friend let you know the public key required to encrypt the data? Which of the following methods would you consider secure for him letting you know the key: email, telephone, fax, or mail? Why?

Q 1.5: Do you need your friend’s private key, in this case, to securely send him the documents? Why?

Q 1.6: Could you log into the server using ssh without entering a password?

Q 1.7: Compare the two hash values. Are they the same? Was the file you downloaded modified by an attacker while you were downloading it? How do you know?

Q 1.8: If a website you downloaded from is modified by a hacker, can you rely on the MD5 checksum posted on the site to make sure that the file you downloaded does not contain malicious code? Why?

Q 1.9: Note that we only changed one character in the input string. How many characters are common in the two checksums? Which attribute of the hash functions mentioned above is related to this result?

Q 1.10: Which group does this checksum belong to? Knowing the characteristics of hash functions, do you think it’s possible to get the group number other than using a brute force approach?

Q 1.11: Do you think using the current CPU time to generate random numbers for a secure algorithm is a viable option (Netscape used to generate random numbers in this manner)? Why?
Q 1.12: Do you think using previous mouse movements and the timing of keyboard strokes is a viable way of generating random numbers for a secure algorithm?

Q 1.13: Which one took longer? Judging from the running times, which one do you think generates higher quality random numbers, random or urandom?

Q 1.14: Look at the man pages of random and urandom. Which one generates higher quality random numbers? Which one would you use in a secure algorithm? Why?

Q 1.15: Which organization issued the certificate? Is this organization a certificate authority?

Q 1.16: For which website and which organization was the certificate issued?

Q 1.17: What public-key algorithm was used to sign this certificate?

Q 1.18: What public-key algorithm is used to securely communicate between the browser and the WebCT page?

Q 1.19: Do you think it’s safe to enter your GT username and password on this page? Can you be sure that the username and password will only be readable by Georgia Tech and no one else?

Q 1.20: Which organization issued the certificate? Is this organization a certificate authority?

Q 1.21: For which website and which organization was the certificate issued?

Q 1.22: What public-key algorithm was used to sign this certificate?

Q 1.23: What public-key algorithm is used to securely communicate between the browser and the webpage?

Q 1.24: Could a third party read the data you send to this webpage?

Q 1.25: Can you be sure that the public key offered by the website really belongs to db.bme.hu?

Section 2

Q 2.1: Using the key lookup server at http://pgp.mit.edu/, find the 8 digit key ID of Chris Klaus, the person with his name on the side of the building you’re in.

Q 2.2: Using a statistics site such as http://pgp.cs.uu.nl/, how many people have signed the key of Derek Atkins, C1B06AF1? How many keys have been signed by this key?
(Derek Atkins is one of the most-connected people in the strong set)

Q 2.3: Say that you get an email from the email address of the professor of this class, and you want to verify that it’s really from the professor. The email is signed with a PGP key. What would you need to do to really trust the identity of the sender? Can you really be sure of the identity of a key if you download it from a public key-server?

General Questions

How long did it take you to complete this lab? Was it an appropriate length lab?

What corrections and/or improvements do you suggest for this lab? Please be very specific, and if you add new material give the exact wording and instructions you would give to future students in the new lab handout. You may cross out and edit the text of the lab on previous pages to make minor corrections/suggestions. General suggestions like “add tool xyz to do more capable scanning” will not be awarded extra points even if the statement is totally true. Specific text that could be cut and pasted into this lab, completed exercises, and completed solutions may be awarded additional credit. Thus, if tool xyz adds a capability or an additional or better learning experience for future students, here is what you need to do. You should add that tool to the lab by writing new detailed lab instructions on where to get the tool, how to install it, how to run it, what exactly to do with it in our lab, example outputs, etc. You must prove with what you turn in that you actually did the lab improvement yourself. Screen shots and output hardcopy are a good way to demonstrate that you actually completed your suggested enhancements. The lab addition section must start with the title “Lab Addition”, followed by your addition subject title and a paragraph explaining at a high level what new concept may be learned by adding this to the existing laboratory assignment. After this introductory paragraph, add the details of your lab addition.
Include the lab addition cover sheet from the class web site.