CMPT 408: Theory of Computer Networks
Prof. Funda Ergun
Network Security
October 26 & 28, 2009
Scribe: Matthew Baker

Security

There are four classifications of security (of messages):
1. Privacy – can somebody else read your messages?
2. Authenticity – can somebody else pretend to be you?
3. Message Integrity – how can you know that the message you received is the original?
4. Access – can somebody deny you access to your messages (a denial-of-service attack)?

Our standard model for message security:
- Alice and Bob want to send private messages to each other.
- Trudy does not like Alice and Bob and wants to read/intercept their messages.
- We assume that Trudy is able to hear and/or intercept data that Alice and Bob send to each other.
- Alice and Bob must alter their messages so that Trudy cannot understand them, but Alice and Bob still can.

Cryptography – How do we encrypt and decrypt messages?
Cryptanalysis – How can we (efficiently) break an encryption/decryption system?

Elements of a Cryptosystem

Plaintext – the original message (denoted by m), which anyone can understand
Ciphertext – the encryption of m (denoted by E_k(m))
Encryption – an invertible function E_k(x), where x is the input message and k is some key. By ‘invertible’ we mean that applying a decryption function D_k(y) to a ciphertext (uniquely) yields the plaintext that was encrypted: D_k(E_k(m)) = m
Key – required to encrypt and/or decrypt (denoted by k)
Keyspace – the set of values that k can take

Shared-Key Cryptosystems

Alice and Bob share a secret key that they use to encrypt and decrypt the messages they send to one another.

Substitution Cipher:
- A cryptosystem in which we take each letter of the plaintext and substitute it with the letter k positions further along in the alphabet, wrapping around at the end. For example: if k = 3, then E_3(“A”) = “D”, E_3(“B”) = “E”, …, E_3(“Y”) = “B”.
- To decrypt, shift each letter back k positions in the alphabet.
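The shift cipher just described can be sketched in a few lines of Python. This is a minimal illustration rather than part of the lecture: the names `shift_encrypt` and `shift_decrypt` are made up here, and upper-casing the input and passing non-letters through unchanged are assumptions the notes do not specify.

```python
import string

ALPHABET = string.ascii_uppercase  # "A" .. "Z"


def shift_encrypt(plaintext: str, k: int) -> str:
    """E_k: replace each letter by the one k positions later, wrapping after Z."""
    out = []
    for ch in plaintext.upper():          # assumption: work in upper case only
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)                # assumption: non-letters are left as-is
    return "".join(out)


def shift_decrypt(ciphertext: str, k: int) -> str:
    """D_k: shifting back by k is the same as shifting forward by -k."""
    return shift_encrypt(ciphertext, -k)


if __name__ == "__main__":
    c = shift_encrypt("ABY", 3)
    print(c)                              # the three examples from the notes
    print(shift_decrypt(c, 3))            # round trip back to the plaintext
```

Reducing the index modulo 26 is what makes the shift wrap around, and it is also why decryption can reuse the encryption routine with a negated key.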
Clearly, k can be any integer between 0 and 25. (Notice that k = 26 yields the same result as k = 0 and, more generally, any integer k = x yields the same result as k = (x mod 26) ∈ {0, …, 25}.) This substitution cipher is easily broken by brute force, because an attacker can simply try all 26 values in the keyspace to decrypt a message.

Another Substitution Cipher:
- Create a random permutation of the letters of the alphabet and map each letter in the plaintext to the corresponding letter in the permutation. The key, k, for this cryptosystem is the randomly generated permutation. For example: if k = (P, Q, …, T), then E_k(“A”) = “P”, E_k(“B”) = “Q”, …, E_k(“Z”) = “T”.
- To decrypt, find a letter’s position in k and substitute it with the letter at the same position in the alphabet.

Since k is a permutation of the letters of the alphabet, the size of the keyspace is 26! (about 4 × 10^26), which is too large to break by brute force. This cryptosystem can still be easily broken by frequency analysis. For example: if “R” is the most frequent letter in the ciphertext, then we would expect D_k(“R”) = “E”, since “E” is the most frequently used letter in the English language. Likewise, we can make similar assumptions about groups of letters (“THE” and “ING” are frequently used trigrams in English, for example).

DES (Data Encryption Standard) cryptosystem:
- A cryptosystem that traditionally uses a 64-bit key, of which 56 bits are effective (the other 8 are parity bits).
- DES was eventually broken by brute force.
- To fix DES we increase the size of the key: each additional bit added to the key doubles the keyspace.
- Triple DES (192-bit key, counting parity bits) is still used and is considerably more secure. Double DES (128-bit key, counting parity bits) gains much less than its key length suggests, because of meet-in-the-middle attacks.

The problem with Shared-Key Cryptosystems

How do Alice and Bob agree on a key over a distance without Trudy deriving it? In the real world it is unreasonable to assume that Alice and Bob can physically meet in order to exchange keys.
In order to come up with a solution to this problem we make a cryptographic assumption: we assume that one-way functions exist.

One-way function – A function f is a one-way function if f(x) is easy to compute but computationally hard to invert.
Discrete Log – Let y = f(x, k) = x^k mod p, where p is a large prime. Given y, x, and p, find k (computationally hard). Modular exponentiation is therefore assumed to be a one-way function, with the discrete log as its hard-to-compute inverse.
Primitive Element – g is a primitive element of the finite field Z_p (Z_p = Z mod p = {0, 1, …, p-1}), where p is a prime, if the set {g^1, g^2, …, g^(p-1)} (mod p) contains all of the nonzero elements of Z_p (g is also known as a generator).

For example: 3 is a primitive element of Z_7 = {0, 1, …, 6} (g = 3, p = 7):
3^1 mod 7 = 3    3^4 mod 7 = 4
3^2 mod 7 = 2    3^5 mod 7 = 5
3^3 mod 7 = 6    3^6 mod 7 = 1
∴ {g^1, g^2, …, g^6} contains all of the nonzero elements of Z_7.

An algorithm for generating a secret shared key (the Diffie-Hellman key exchange)
- Alice and Bob openly agree on a prime p and a primitive element g to use.
- Alice picks x ∈ {1, …, p-1} at random and sends g^x mod p to Bob.
- Bob picks y ∈ {1, …, p-1} at random and sends g^y mod p to Alice.
- Alice computes (g^y)^x mod p = g^(xy) mod p.
- Bob computes (g^x)^y mod p = g^(xy) mod p.
- Alice and Bob use g^(xy) mod p as their secret shared key.
- Trudy knows g^x mod p, g^y mod p, g, and p. In order to get the key, Trudy must invert either g^x mod p or g^y mod p (to obtain x or y), but we assumed that this is computationally hard.

Matt’s Bonus Box!
Why do we use finite fields of the integers modulo large primes?
- If p is a prime, then Z_p is guaranteed to have at least one generator element. This is not necessarily true if p is not a prime.
- We use large primes because we do not want x ∈ {1, …, p-1} to be recoverable from g^x mod p by brute force.

Although this algorithm may seem secure, it is still vulnerable to a different type of attack.
- Suppose Alice tries to send g^x mod p to Bob. Trudy can intercept the message and kill it (so that it does not reach Bob).
- Trudy then picks z ∈ {1, …, p-1} and sends g^z mod p to Bob.
- Bob will then try to send g^y mod p to Alice.
- Bob will compute g^(zy) mod p and think that it is the shared key.
- Trudy can also intercept Bob’s message to Alice and kill it.
- Trudy then picks w ∈ {1, …, p-1} and sends g^w mod p to Alice.
- Alice will compute g^(wx) mod p and think that it is the shared key.

In this case, since Trudy picked w and z and knows g^x mod p and g^y mod p, she can compute both keys: (g^x)^w mod p = g^(wx) mod p and (g^y)^z mod p = g^(zy) mod p. Trudy can now intercept messages, kill them, decrypt them, and re-encrypt them without Alice or Bob knowing. This is known as a Man-in-the-middle Attack.

Public Key Cryptography (RSA)

In RSA:
- Alice and Bob each produce two keys; one is posted somewhere publicly and the other is kept secret from everyone.
- Let Pu_A and Pu_B be the public keys of Alice and Bob respectively.
- Let Pr_A and Pr_B be the private keys of Alice and Bob respectively.
- The encryption function in RSA is the same as the decryption function.
- To encrypt a message, use either the private or the public key with the encryption function.
- To decrypt a message, use the encryption function with the key that was not used in the encryption. That is, E_Pr(E_Pu(m)) = m and E_Pu(E_Pr(m)) = m.

How a message is sent:
- Alice wants to send a message m to Bob.
- Alice knows Pu_A, Pr_A, and Pu_B.
- Alice encrypts using Pu_B and sends E_PuB(m) to Bob.
- Bob applies the encryption function using Pr_B and gets the original message.
- Trudy does not know the private key of Bob, so she cannot decrypt the message!

This also gives us an algorithm for Authentication:
- How do we know that Alice sent m?
- Suppose she sends E_PrA(m) instead of the original message.
- Only Alice can produce E_PrA(m).
- So anyone can check that Alice generated E_PrA(m), because they can decrypt it using Alice’s public key!
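
The two uses of the key pair described above (secrecy with Bob's keys, authentication with Alice's keys) can be tried out with a toy version of RSA. The notes do not give the internals of the encryption function, so this sketch assumes the textbook construction: a key is an (exponent, modulus) pair and E raises the message to that exponent modulo n. The helper `make_keypair`, the tiny primes, and the exponent 17 are illustrative assumptions only; real RSA uses very large primes and padding, which are omitted here.

```python
def make_keypair(p: int, q: int, e: int):
    """Hypothetical helper: build (public, private) keys from two primes.

    Assumes textbook RSA: n = p*q, and d is the inverse of e modulo (p-1)(q-1).
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # modular inverse; needs Python 3.8+
    return (e, n), (d, n)


def E(key, m: int) -> int:
    """The single function used for both encryption and decryption."""
    exponent, modulus = key
    return pow(m, exponent, modulus)


# Toy keys (assumption: primes this small are for readability, not security).
Pu_A, Pr_A = make_keypair(61, 53, 17)   # Alice
Pu_B, Pr_B = make_keypair(47, 59, 17)   # Bob

m = 65                                  # message encoded as a number smaller than n

# Secrecy: Alice encrypts with Bob's public key, Bob undoes it with his private key.
c = E(Pu_B, m)
assert E(Pr_B, c) == m

# Authentication: Alice applies her private key, anyone checks with her public key.
s = E(Pr_A, m)
assert E(Pu_A, s) == m
```

Both checks use the same function E and differ only in which key of the pair is applied first, which is the property E_Pr(E_Pu(m)) = m and E_Pu(E_Pr(m)) = m stated in the notes.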