CO634 Cryptography
Using Asymmetric Ciphers
lectures by Carlos A. Perez-Delgado
School of Computing, University of Kent
© 2019

Outline
1 Using Asymmetric Ciphers
  - MathsBite: More on prime numbers
  - Public and private keys
  - Using public key cryptography to establish a session key
  - Distributing public keys

MathsBite: Prime Numbers

A prime number is an integer greater than one that is not divisible by any integer apart from itself and one.

The first few prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23 and 29.

Stallings [2006] Table 8.1 shows all the prime numbers up to 2000 (with one mistake: see the online errata). Also see primes.utm.edu.

MathsBite: Prime Numbers up to 1000

[Figure: the number of primes ≤ n, plotted against n for n up to 1000.]

- Among 1-digit numbers (1-9), there are 4 primes.
- Among 2-digit numbers (10-99), there are 21 primes.
- Among 3-digit numbers (100-999), there are 143 primes.

MathsBite: No Highest Prime Number

Assume that there is a highest prime number. Then we could make a list of all the prime numbers in ascending order: $p_1, p_2, \dots, p_n$.

Consider the number $N = p_1 \times p_2 \times \cdots \times p_n$. This is clearly a multiple of all the numbers on the list (i.e. all the prime numbers).

But now consider the number $N + 1$. That can't be a multiple of any of the numbers on the list, because dividing it by any $p_i$ leaves a remainder of 1. Therefore $N + 1$ must either be a prime number itself, or the product of two or more prime numbers not on the list.

Either way, the assumption that there is a highest prime number leads to a contradiction.

This proof was known to Euclid.

MathsBite: The Prime Number Theorem

For large $n$, the number of prime numbers less than $n$ is approximately $\dfrac{n}{\log_e n}$.

This gives rise to the rule of thumb that, among numbers with $d$ decimal digits, approximately one number in $2.3d$ is prime (since $\log_e 10 \approx 2.3$). For example, among 100-digit numbers, approximately 1 in 230 is prime.
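The digit-by-digit counts quoted earlier, and the quality of the $n / \log_e n$ estimate, can be checked with a small sieve. This is a minimal sketch only; the class and method names (`PrimeCount`, `countPrimesUpTo`) are illustrative and not part of the lecture material.

```java
class PrimeCount {
    /** Counts the primes p with 2 <= p <= n using the sieve of Eratosthenes. */
    static int countPrimesUpTo(int n) {
        if (n < 2) {
            return 0;
        }
        boolean[] composite = new boolean[n + 1];
        int count = 0;
        for (int i = 2; i <= n; i++) {
            if (composite[i]) {
                continue;
            }
            count++;
            // Cross out multiples of i, starting at i*i
            // (smaller multiples were crossed out by smaller primes).
            for (long j = (long) i * i; j <= n; j += i) {
                composite[(int) j] = true;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("1-digit primes: " + countPrimesUpTo(9));
        System.out.println("2-digit primes: " + (countPrimesUpTo(99) - countPrimesUpTo(9)));
        System.out.println("3-digit primes: " + (countPrimesUpTo(999) - countPrimesUpTo(99)));
        // Compare the exact count with the prime number theorem estimate.
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            System.out.printf("n=%d exact=%d estimate=%.0f%n",
                    n, countPrimesUpTo(n), n / Math.log(n));
        }
    }
}
```

The estimate is always somewhat below the exact count in this range; the theorem only says that the ratio of the two tends to 1 as n grows.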
The prime number theorem was conjectured by Gauss in 1792, but not proved for another century.

MathsBite: Tests of Primality

The first polynomial-time (efficient) deterministic algorithm for determining whether a number is prime was published in 2002 (the AKS primality test).

There are, however, many efficient probabilistic tests: they will indicate 'Yes' if n is indeed prime, and will indicate 'No' with very high probability if n is not prime (and the more iterations of the algorithm you carry out, the higher this probability becomes). However, there is always a small probability of getting 'Yes' even with a non-prime number.

Symmetric and Asymmetric Ciphers: Reminder

If the decryption key for a cipher is the same as the encryption key (i.e. $K_D = K_E$) then it is called a symmetric cipher, and we write the key simply as $K$. (If the keys are different, but it is computationally straightforward to compute each from the other, then that too is treated as a symmetric cipher.)

If the decryption and encryption keys are different, and it is computationally impractical to determine one from the other, then the cipher is asymmetric.

Spelling hint: it comes from the Greek: a+syn+metron = not+together+measure.

Reversible Asymmetric Ciphers

Asymmetric ciphers require a key pair $(K_1, K_2)$, rather than a single key. We'll focus on asymmetric ciphers in which either key in the pair can be used for encryption, and the other for decryption.
Specifically:

- Alice can encrypt a message $P$ with $K_1$: $C = E(P, K_1)$, but then Bob (or indeed, anyone) will need to use $K_2$ to decrypt it: $P = D(C, K_2)$.
- Alternatively, Alice can encrypt a message $P$ with $K_2$: $C' = E(P, K_2)$ (yielding different ciphertext from the first method, which is why we've named it $C'$), but then Bob will need to use $K_1$ to decrypt it: $P = D(C', K_1)$.

For many important asymmetric ciphers, the encryption function $E(\cdot, \cdot)$ is identical to the decryption function $D(\cdot, \cdot)$: we'll see later that this is true for RSA in particular.

Public and Private Keys

A typical way of deploying an asymmetric cipher is to make one member of the key pair a public key, available to anyone, and the other member of the pair a private key, kept secret by the owner of the key pair.

We'll use the notation $U_A$ for A's pUblic key, and $V_A$ for A's priVate key.

Key Rings

Let's go back to our n commodity traders from the last lecture. Let's suppose that each trader, A for example, has generated a key pair $(U_A, V_A)$. A keeps $V_A$ secret, but somehow makes sure that every other trader knows his public key $U_A$.

Consequently, each trader will end up with a key ring, containing the public keys of all the other traders, each labelled with its owner.

Let's see what this set-up enables us to do.

Simple Applications

Encryption/decryption: Alice encrypts a message with Bob's public key, and sends it to Bob, who decrypts it with his private key. Only Bob can read the message, because only he knows his private key.

Digital signature: Alice encrypts a message with her own private key and sends it to Bob, who decrypts it with Alice's public key. Since only Alice knows her own private key, Bob can be sure that the message came from Alice. Bob can also be confident that the message hasn't been tampered with en route: otherwise it would decrypt to gobbledegook. (However, anyone eavesdropping can read the message, using Alice's public key.)

Can these ideas be combined?
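To make the "either key encrypts, the other decrypts" property concrete, here is a minimal sketch using textbook RSA, which the slides say is covered later. The tiny primes (61 and 53), the two exponents and the class name are assumptions chosen purely for illustration; numbers this small offer no security at all.

```java
import java.math.BigInteger;

class ToyReversibleCipher {
    // Toy parameters (illustrative only): N = 61 * 53 = 3233, and
    // 17 * 2753 = 46801 = 15 * 3120 + 1, where 3120 = (61 - 1) * (53 - 1).
    static final BigInteger N = BigInteger.valueOf(3233);
    static final BigInteger K1 = BigInteger.valueOf(17);
    static final BigInteger K2 = BigInteger.valueOf(2753);

    /** E and D are the same function here: raise to the key, modulo N. */
    static BigInteger apply(BigInteger message, BigInteger key) {
        return message.modPow(key, N);
    }

    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(65);
        BigInteger c = apply(p, K1);       // C  = E(P, K1)
        BigInteger cPrime = apply(p, K2);  // C' = E(P, K2)
        System.out.println("D(C, K2)  = " + apply(c, K2));       // recovers P
        System.out.println("D(C', K1) = " + apply(cPrime, K1));  // recovers P
    }
}
```

Whichever of the two keys is published plays the role of the public key; the other must then be kept private.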
Simple Applications: Disadvantages

The simple ways of deploying public key cryptography considered in the previous slide have two disadvantages:

- As with symmetric ciphers, it is a bad idea to overuse a single cryptographic key (or key pair): it provides more ciphertext for cryptanalysts to work on, and provides no damage limitation if a key is inadvertently disclosed.
- Encryption/decryption in a good asymmetric cipher tends to be slower than in a good symmetric cipher.

Simple Applications: Better approaches

Encryption/decryption: Use the public/private key pair to establish a session key for Alice and Bob to communicate using a symmetric cipher. We'll consider how to do this next.

Digital signature: Use a cryptographic hash function. We'll deal with these later in the course.

Establishing a Session Key: Simple approach

Suppose trader Alice (A) wants to set up a (symmetric) session key to communicate with trader Bob (B).

A simple approach would be for Alice to generate a random session key $K_s$, and then encrypt this key (along with her own identity) using Bob's public key, and send it to Bob. Specifically, she'd send the following message:

$E(A \,\|\, K_s, U_B)$

where the encryption uses an asymmetric cipher. Alice can be confident that only Bob can decrypt $K_s$.

Establishing a Session Key: Snags with the simple approach

Here again is the message that Bob receives:

$E(A \,\|\, K_s, U_B)$

Two points of concern:

- Although the message says that it originates from Alice, Bob can't be sure that that is so.
- The protocol is vulnerable to replay attacks.

There are various ways in which these concerns can be assuaged. We'll describe one of the simplest ways of doing this next.
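The simple approach can be sketched with Java's standard `java.security` and `javax.crypto` classes. Everything specific below is an assumption made for illustration rather than something the slides prescribe: RSA with OAEP padding as the asymmetric cipher, a 2048-bit key pair for Bob, a one-byte identity for Alice, a 128-bit session key, and the class and method names.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;

class SimpleSessionKeySketch {
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Generates a key pair (U, V) for one trader. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] rsa(int mode, Key key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Alice's side: builds E(A || Ks, UB). */
    static byte[] wrap(byte[] senderId, byte[] sessionKey, PublicKey bobPublic) {
        byte[] payload = Arrays.copyOf(senderId, senderId.length + sessionKey.length);
        System.arraycopy(sessionKey, 0, payload, senderId.length, sessionKey.length);
        return rsa(Cipher.ENCRYPT_MODE, bobPublic, payload);
    }

    /** Bob's side: recovers A || Ks using VB. */
    static byte[] unwrap(byte[] message, PrivateKey bobPrivate) {
        return rsa(Cipher.DECRYPT_MODE, bobPrivate, message);
    }

    public static void main(String[] args) {
        KeyPair bob = newKeyPair();
        byte[] sessionKey = new byte[16];
        new SecureRandom().nextBytes(sessionKey);

        byte[] message = wrap(new byte[] {'A'}, sessionKey, bob.getPublic());
        byte[] recovered = unwrap(message, bob.getPrivate());
        byte[] recoveredKey = Arrays.copyOfRange(recovered, 1, recovered.length);
        System.out.println("claimed sender: " + (char) recovered[0]);
        System.out.println("session key recovered: " + Arrays.equals(sessionKey, recoveredKey));
    }
}
```

Both snags are visible in the sketch: `wrap` needs only Bob's public key, so anyone could have produced a message claiming to come from A, and nothing stops an eavesdropper from sending the same `message` bytes to Bob again later.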
Establishing a Session Key: A better method, 1

In this approach, when Alice wants to set up a session key, she sends Bob a message of the following form, encrypted using Bob's public key:

$E(A \,\|\, E(B \,\|\, t_A \,\|\, K_s, V_A), U_B)$

where:

- $E(B \,\|\, t_A \,\|\, K_s, V_A)$ is an inner message encrypted with Alice's private key. Since only Alice knows this key, this has the effect of signing the information contained within it.
- $t_A$ is a timestamp, recording the time the message was composed according to Alice's computer clock.
- $K_s$ is the random session key created by Alice, as before.

Establishing a Session Key: A better method, 2

On receipt of the message from Alice, Bob decrypts it with his private key, to yield:

$A \,\|\, E(B \,\|\, t_A \,\|\, K_s, V_A)$

Now:

- Bob uses Alice's identity, included in the message, to look up Alice's public key.
- Bob uses Alice's public key to decrypt the inner message; Bob knows that this inner message must have come from Alice, since only she could have encrypted it.
- Bob checks that the first part of the message, B, does indeed refer to himself. (Otherwise, it could be that this message was originally sent by Alice to someone else, who has now forwarded it to Bob in an attempt to impersonate Alice!)
- Bob checks that the timestamp $t_A$ is reasonably recent, and that he has not previously encountered it in setting up a session key. This helps to defeat replay attacks.

Alice and Bob can now proceed to communicate using $K_s$.

Distributing Public Keys

In all the preceding discussion, we assumed that each of our n traders was already in possession of the public keys of all the other traders. But how does this come about?

This is outside the scope of these cryptography lectures, but may come about alongside authentication (see elsewhere in this course).
Also see for example Stallings [2006] Sec. 10.1 for further information.

However, there's one pitfall you need to be aware of: the man-in-the-middle attack.

The Man-in-the-Middle Attack: Alice sends key to Bob

Suppose Alice decides to let Bob know her public key by emailing it to him. It's a public key, after all, so what could be the harm in that?

Here's what Alice and Bob think is happening:

    A --- U_A ---> B

However, sneaky Darth intercepts this communication. He creates his own key pair $(U_X, V_X)$ (different from his regular public and private keys), and forwards Alice's message on to Bob, having substituted $U_X$ for Alice's public key $U_A$.

Here's what is really happening:

    A --- U_A ---> D --- U_X ---> B

The Man-in-the-Middle Attack: Bob sends information to Alice

Suppose Bob now proceeds to send a message $P$ to Alice, encrypted using the key he's just received.

Here's what Bob and Alice think is happening:

    A <--- E(P, U_A) --- B

But here is what is really happening:

    A <--- E(P, U_A) --- D <--- E(P, U_X) --- B

Darth is intercepting the message, decrypting it with $V_X$, noting its contents, re-encrypting it with $U_A$, and forwarding it to Alice!

CO634 Cryptography
From Alphabets to Binary

Outline
1 From Alphabets to Binary
  - MathsBite: Powers Modulo n
  - More on Exclusive-OR
  - The One-Time Pad in Binary
  - Unconditional and Computational Security

MathsBite: Powers modulo n

In ordinary arithmetic, if $a$ is an integer and $j$ is a positive integer, we use the notation $a^j$ to denote the product of $j$ copies of $a$:

$a^1 = a$
$a^2 = a \times a$
$a^3 = a \times a \times a$
$a^4 = a \times a \times a \times a$
...

$a$ is called the base, and $j$ the exponent.

We can define a similar operation in the world of arithmetic modulo $n$: multiply together $j$ copies of $a$, but then reduce the result modulo $n$ (i.e. take the remainder when $a^j$ is divided by $n$).
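For large exponents it is wasteful to compute $a^j$ in full before reducing. A common alternative, not described on the slide, is square-and-multiply, which reduces modulo n after every multiplication so that intermediate values stay small. The sketch below is illustrative only: the name `powMod` is hypothetical, and it assumes n is below 2^31 so that products fit in a `long`. For big numbers, Java's `BigInteger.modPow` provides the same operation.

```java
class ModPower {
    /** Returns a^j mod n for j >= 1 and 1 <= n < 2^31, using square-and-multiply. */
    static long powMod(long a, long j, long n) {
        if (j < 1 || n < 1 || n >= (1L << 31)) {
            throw new IllegalArgumentException("need j >= 1 and 1 <= n < 2^31");
        }
        long result = 1 % n;
        long base = Math.floorMod(a, n);
        while (j > 0) {
            if ((j & 1) == 1) {
                // This bit of the exponent is set: fold the current power in.
                result = (result * base) % n;
            }
            base = (base * base) % n; // square for the next bit of the exponent
            j >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Successive powers of 2 modulo 9.
        for (int j = 1; j <= 10; j++) {
            System.out.print(powMod(2, j, 9) + " ");
        }
        System.out.println();
    }
}
```

The loop runs once per bit of the exponent, so even exponents hundreds of bits long need only a few hundred multiplications.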
MathsBite: Powers Modulo 9

For example, let's take a look at powers modulo 9. Each row lists the successive powers of one base $a$ (the first column, $a^1$, is $a$ itself):

    a^1  a^2  a^3  a^4  a^5  a^6  a^7  a^8  a^9  a^10
     0    0    0    0    0    0    0    0    0    0
     1    1    1    1    1    1    1    1    1    1
     2    4    8    7    5    1    2    4    8    7
     3    0    0    0    0    0    0    0    0    0
     4    7    1    4    7    1    4    7    1    4
     5    7    8    4    2    1    5    7    8    4
     6    0    0    0    0    0    0    0    0    0
     7    4    1    7    4    1    7    4    1    7
     8    1    8    1    8    1    8    1    8    1

Prime Numbers

A prime number is an integer greater than one that is not divisible by any integer apart from itself and one.

The first few prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23 and 29.

Stallings [2006] Table 8.1 shows all the prime numbers up to 2000 (with one mistake: see the online errata).

MathsBite: Theorem 3 (Fermat)

If $a$ is any integer and $p$ is a prime number, then:

$$a^{p-1} \bmod p = \begin{cases} 0 & \text{if } a \bmod p = 0 \\ 1 & \text{otherwise} \end{cases}$$

For a proof, see Stallings [2006] Sec. 8.2.

Corollary: If $a$ is any integer and $p$ is a prime number, then:

$$a^p \bmod p = a \bmod p$$

XOR Applied to Bit Patterns

In the last lecture we met the exclusive-OR (XOR) operation $\oplus$ as applied to single 0/1 bits.

We can also apply it as an operation on (equal-length) bit-strings, e.g. bytes. We simply apply the operation to each bit in turn. For example:

      0 1 0 0 1 1 1 0
    ⊕ 1 0 1 1 0 1 0 0
    -----------------
      1 1 1 1 1 0 1 0

Question: Evaluate

      1 0 1 1 0 1 0 0
    ⊕ 1 1 1 1 1 0 1 0

Properties of XOR

If $A$ and $B$ are bit-strings and $A \oplus B = C$ then:

- $B \oplus A = C$ (fairly obviously).
- But also: $A = B \oplus C$, and $B = A \oplus C$.

So, given any two of $A$, $B$ or $C$, we can XOR them together to get the third.

This fact is used in some RAID arrays of three disks: roughly speaking, each sector on the third disk holds the XOR of the data in the corresponding sectors on the first two disks.
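These identities are easy to check mechanically. The sketch below XORs two equal-length byte arrays and then rebuilds each operand from the other two values, in the spirit of the three-disk example. The class and method names, and the second byte of each array, are illustrative assumptions; the first bytes are the bit patterns used in the example above.

```java
import java.util.Arrays;

class XorDemo {
    /** Bitwise XOR of two equal-length byte arrays. */
    static byte[] xor(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("bit-strings must have equal length");
        }
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] a = {(byte) 0b01001110, (byte) 0x12}; // "disk 1"
        byte[] b = {(byte) 0b10110100, (byte) 0x34}; // "disk 2"
        byte[] c = xor(a, b);                        // "disk 3" holds A xor B

        // Lose any one of the three and it can be rebuilt from the other two.
        System.out.println("A rebuilt from B and C: " + Arrays.equals(xor(b, c), a));
        System.out.println("B rebuilt from A and C: " + Arrays.equals(xor(a, c), b));
    }
}
```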
The One-Time Pad Revisited: Moving to binary

All data on computers is stored as a sequence of 0/1 bits, usually further aggregated into bytes, blocks etc., so in computer cryptography we naturally think of encrypting the binary data directly, rather than working in terms of alphabetic letters etc.

Let's think how the one-time pad would work with an 'alphabet' consisting just of the 'letters' 0 and 1.

The One-Time Pad Revisited

Suppose this is the start of our plaintext (Bredon Hill by A. E. Housman). The first row shows the characters of the poem, and below that their binary representation (using ASCII encoding).

    I       n       (space) s       u       m       m       e       r
    010010010110111000100000011100110111010101101101011011010110010101110010

Our key now needs to be simply a random sequence of bits, for example:

    101101011110001001010100010010101000111111000110000110111101101011110100

To encrypt the message, using the usual rule for Vigenère ciphers, we add each 'letter' of the plaintext to the corresponding 'letter' of the key, and reduce the result modulo the number of letters in the alphabet, i.e. 2. In other words we simply XOR the plaintext with the key, getting:

    111111001000110001110100001110011111101010101011011101101011111110000110

How do we decrypt the message?

The One-Time Pad Revisited: Randomness of ciphertext

Take a look at the ciphertext from the previous slide:

    111111001000110001110100001110011111101010101011011101101011111110000110

This looks pretty much like a random sequence of bits . . .

. . . and not just to casual observation. Provided that the key is truly random, the ciphertext is statistically indistinguishable from a random sequence of bits (i.e. a sequence in which each bit is 1 with probability $\tfrac{1}{2}$, and is statistically independent of all preceding bits).
This makes it impossible for cryptanalysts to get a handle on the code. Frequency analysts, eat your heart out!

Problems with the One-Time Pad

As we've seen, a one-time pad is effectively unbreakable provided the key is kept secure (and, as the name implies, never reused).

The big problem is how to distribute the key securely.

For example, suppose that a company's depot sends 100 MB of commercially sensitive data to head office each day over the internet, using a one-time pad. Head office won't be able to read the data unless it has the same key as the depot, so somehow an average of 100 MB of random bits must be transmitted from head office to the depot (or vice versa) each day. How do you do that?

A secondary issue is that it is not trivial to generate long sequences of genuinely random data.

Unconditional Security

An encryption scheme is said to be unconditionally secure if the ciphertext it generates provides an adversary with no information about the corresponding plaintext, even if the adversary has unlimited computing resources, and access to unlimited amounts of ciphertext. (cf. Menezes et al. [1997] Sec. 1.13.3; Stallings [2006] uses a weaker formulation.)

In the 1940s Claude Shannon established that the only unconditionally secure encryption scheme is a one-time pad, or something very like it: in particular, the key must contain at least as many bits as there are bits in the plaintext, and these bits must be randomly chosen.

Computational Security

An encryption scheme is said to be computationally secure if either:

- The computational cost of breaking the cipher exceeds the value of the encrypted information, or
- The time required to break the cipher exceeds the useful lifetime of the encrypted information.

(Stallings [2006] p. 34).
Notice that a cipher that is computationally secure now may not be secure in the future, owing (a) to advances in computing technology, and (b) to theoretical advances in cryptanalysis.

CO634 Cryptography
Using Block Ciphers

Outline
1 Using Block Ciphers
  - Block Ciphers
  - Block Cipher Modes of Operation
  - MathsBite: Greatest Common Divisor (GCD)

Stream and Block Ciphers

In a stream cipher, the plaintext is converted into ciphertext one bit, byte or character at a time. With the exception of transposition ciphers, all of the ciphers we've looked at so far have been stream ciphers.

In a block cipher, the plaintext (represented in binary) is gathered into blocks of $b$ bits, where $b$ is called the block size. Typical block sizes are 64 bits (now getting less common), or 128 or 256 bits. Each block of plaintext is then converted into a block of ciphertext of the same length.

If the number of bits in the plaintext is not an exact multiple of $b$, the last block is padded out (say with spaces, zero bits, or random bits) before conversion.

The Encryption Function

[Figure: the encryption function E takes a plaintext block P and an encryption key K_E, and produces a ciphertext block C.]

$C = E(P, K_E)$

The Decryption Function

[Figure: the decryption function D takes a ciphertext block C and a decryption key K_D, and produces a plaintext block P.]

$P = D(C, K_D)$

Encryption and Decryption Functions: Necessary properties

It is necessary that, for a given key value, the encryption maps different plaintext messages onto different ciphertexts, i.e. $E$ must be a one-to-one function. In other words, if $P_1 \neq P_2$ then we must have $E(P_1, K_E) \neq E(P_2, K_E)$.

Obviously we want the decryption function to undo the effect of the encryption function.
This means that for any plaintext block $P$, we must have:

$P = D(E(P, K_E), K_D)$

The above conditions automatically imply that the decryption function $D$ will also be one-to-one: different ciphertexts are mapped onto different plaintexts. However, if encryption is not a function this is not strictly necessary.

Symmetric and Asymmetric Ciphers

If the decryption key for a cipher is the same as the encryption key (i.e. $K_D = K_E$) then it is called a symmetric cipher, and we write the key simply as $K$. All the ciphers we've looked at so far are symmetric, and we'll continue to focus on them for now.

If the decryption and encryption keys are different, and it is computationally impractical to determine one from the other, then the cipher is asymmetric. Asymmetric ciphers are a relatively recent invention (mid-1970s), and will be dealt with later in the course.

Diffusion and Confusion

Two desirable properties of a cipher (originally proposed by Claude Shannon [1949]) are as follows:

Diffusion: Localised changes to the plaintext should result in non-localised changes to the ciphertext. Ideally, in a block cipher, if a single bit of plaintext is changed, about half the bits in the output block should change (and which bits these are should seem random to the cryptanalyst).

Confusion: Somewhat difficult to explain, but roughly: it should be very difficult to glean information about the key by examining the statistical properties of the ciphertext. Remember this was conspicuously not true of a simple substitution cipher, where letter frequencies in the ciphertext can lead us straight to the key.

Block Cipher Modes of Operation

Suppose that our plaintext consists of a series of blocks:

$P_1 \,\|\, P_2 \,\|\, P_3 \,\|\, \dots$

We want to use a block cipher to convert this to a series of enciphered blocks:

$C_1 \,\|\, C_2 \,\|\, C_3 \,\|\, \dots$
It may seem obvious how to do this, but in fact there are several possible modes of operation. We'll look at three: Electronic Code Book (ECB), Cipher Block Chaining (CBC) and Counter Mode (CTR).

Electronic Code Book (ECB)

[Figure: ECB mode. Each plaintext block P1, P2, P3 is passed separately through the encryption function with the same key K, giving C1, C2, C3.]

    Encryption              Decryption
    C1 = E(P1, K)           P1 = D(C1, K)
    C2 = E(P2, K)           P2 = D(C2, K)
    C3 = E(P3, K)           P3 = D(C3, K)
    ...                     ...

Electronic Code Book (ECB): Pros and Cons

Pros:
- Easy to implement.
- Encryption/decryption can be carried out for more than one block in parallel.
- Random access: any block in a message can be decrypted without having to decrypt earlier blocks.

Cons:
- A particular plaintext block will always encrypt to the same ciphertext block.

Cipher Block Chaining (CBC)

[Figure: CBC mode. Each plaintext block is XORed with the previous ciphertext block (with V_I for the first block) before passing through the encryption function with key K, giving C1, C2, C3.]

$V_I$ is called the initialisation vector, and is $b$ bits long (i.e. the same length as a block).

Cipher Block Chaining (CBC): Encryption and decryption equations

    Encryption                  Decryption
    C1 = E(P1 ⊕ V_I, K)         P1 = D(C1, K) ⊕ V_I
    C2 = E(P2 ⊕ C1, K)          P2 = D(C2, K) ⊕ C1
    C3 = E(P3 ⊕ C2, K)          P3 = D(C3, K) ⊕ C2
    ...                         ...

Cipher Block Chaining (CBC): Initialisation vector

The initialisation vector $V_I$ must be known to both sender and receiver, along with the key $K$.

Different values of $V_I$ should be used for different messages.

Secrecy is not as important for $V_I$ as for the key. Ask yourself:

- What can go wrong if the adversary discovers $V_I$ in transmission from sender to receiver?
- What can go wrong if the adversary is able to modify $V_I$ in transmission from sender to receiver?

Cipher Block Chaining (CBC): Pros and Cons

Pros:
- If a particular plaintext block occurs in different messages, or more than once in the same message, it will probably encrypt into different ciphertext blocks each time it occurs.
- Plenty of diffusion: if a single bit of the plaintext is changed, it changes the ciphertext for that block and all subsequent blocks in the message.
- Random access: any block in a message can be decrypted without having to decrypt earlier blocks.

Cons:
- Encryption cannot be carried out for more than one block in parallel (though decryption can).

Question: what happens if a single bit of ciphertext gets altered in transmission from sender to receiver?

Counter Mode (CTR)

[Figure: CTR mode. The counter values V_I, V_I + 1 mod 2^b and V_I + 2 mod 2^b are each passed through the encryption function with key K; the outputs are XORed with P1, P2, P3 to give C1, C2, C3.]

Counter Mode (CTR): Encryption and decryption equations

    Encryption                              Decryption
    C1 = E(V_I + 0 mod 2^b, K) ⊕ P1         P1 = E(V_I + 0 mod 2^b, K) ⊕ C1
    C2 = E(V_I + 1 mod 2^b, K) ⊕ P2         P2 = E(V_I + 1 mod 2^b, K) ⊕ C2
    C3 = E(V_I + 2 mod 2^b, K) ⊕ P3         P3 = E(V_I + 2 mod 2^b, K) ⊕ C3
    ...                                     ...

Counter Mode (CTR): Similarity to one-time pad

If we implemented a binary one-time pad blockwise, it would work like this:

    Encryption          Decryption
    C1 = K1 ⊕ P1        P1 = K1 ⊕ C1
    C2 = K2 ⊕ P2        P2 = K2 ⊕ C2
    C3 = K3 ⊕ P3        P3 = K3 ⊕ C3
    ...                 ...

where $K_1, K_2, \dots$ are the blocks of the (random) key.

So CTR mode is essentially like a binary one-time pad, except that the key blocks are not completely random, being computed from $K$ and $V_I$.

Counter Mode (CTR): Pros and Cons

Pros:
- If a particular plaintext block occurs in different messages, or more than once in the same message, it will probably encrypt into different ciphertext blocks each time it occurs.
- Encryption/decryption can be carried out for more than one block in parallel.
- Random access: any block in a message can be decrypted without having to decrypt earlier blocks.

Cons:
- An adversary who knows at least something about the format of the plaintext, and can modify the ciphertext in transit, may be able to modify the content of the message.
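This weakness can be sketched in a few lines. In the code below, the message "PAY 100", the position of the amount, and all class and method names are hypothetical; a random byte array stands in for the keystream blocks E(V_I + i, K), since the tampering works the same way whatever the keystream is.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

class CtrTamperSketch {
    /** XOR of two equal-length byte arrays (CTR encryption and decryption alike). */
    static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    /** The attacker's step: needs the ciphertext and a guess at one plaintext character, not the key. */
    static byte[] tamper(byte[] ciphertext, int index, char guessedOld, char wantedNew) {
        byte[] forged = ciphertext.clone();
        forged[index] ^= (byte) (guessedOld ^ wantedNew);
        return forged;
    }

    public static void main(String[] args) {
        byte[] plaintext = "PAY 100".getBytes(StandardCharsets.US_ASCII);
        // Stand-in for the CTR keystream; the attacker never sees this.
        byte[] keystream = new byte[plaintext.length];
        new SecureRandom().nextBytes(keystream);
        byte[] ciphertext = xor(plaintext, keystream);

        // The attacker knows the message format and guesses that index 4 holds '1'.
        byte[] forged = tamper(ciphertext, 4, '1', '9');

        String received = new String(xor(forged, keystream), StandardCharsets.US_ASCII);
        System.out.println(received);
    }
}
```

The receiver decrypts the forged ciphertext to "PAY 900" with no sign that anything is wrong, which is why encryption in this mode needs to be paired with a separate integrity check.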
(A similar vulnerability exists with the one-time pad.)

Altering an Encrypted Message . . . without knowing the key

Suppose I know that a bank transaction is sent as plaintext of the form:

    P       a       y       (space) A       (space) $       1
    0101000001100001011110010010000001000001001000000010010000110001

Suppose also that the CTR-mode counter for this block encrypts to:

    1111010100111100111100010001010100101010101111001011000010011101

so the resulting ciphertext block is:

    1010010101011101100010000011010101101011100111001001010010101100

What happens if I invert the last bit of the message?

Greatest Common Divisor

The greatest common divisor (GCD) of two positive integers $a$ and $b$ is, as the name implies, the largest integer that will divide exactly into both $a$ and $b$.

For example:

    gcd(6, 9) = 3
    gcd(10, 5) = 5
    gcd(8, 15) = 1

The Euclidean Algorithm

An algorithm for determining the GCD of two numbers $a$ and $b$ appears in Euclid's Elements of circa 300 BC, and is named after him. However, it appears to date back at least to 375 BC, and is one of the earliest algorithms known.

Roughly it works like this: starting with the pair of numbers $(a, b)$, repeatedly replace the larger number by its remainder when divided by the smaller number. Carry on until one of the numbers in the pair is zero: then the other number will be the GCD of $a$ and $b$.

For example to find the GCD of 1241 and 833:

    1241   833
     408   833
     408    17
       0    17

so the GCD is 17.
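Written out as divisions with remainder, the same example reads as follows. This is only a restatement of the steps in the example; each remainder replaces the larger number of the pair.

```latex
\begin{align*}
1241 &= 1 \times 833 + 408 \\
 833 &= 2 \times 408 + 17 \\
 408 &= 24 \times 17 + 0
\end{align*}
```

The last non-zero remainder, 17, is the GCD.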
The Euclidean Algorithm in Java

    public static int gcd(int a, int b) {
        int hi, lo;
        if (a >= b) {
            hi = a;
            lo = b;
        } else {
            hi = b;
            lo = a;
        }
        if (lo < 0)
            throw new IllegalArgumentException("gcd: args must be non-negative");
        if (hi == 0)
            throw new IllegalArgumentException("gcd(0, 0) is undefined");
        // Main algorithm:
        while (lo > 0) {
            int newlo = hi % lo;
            hi = lo;
            lo = newlo;
        }
        return hi;
    }

The Euclidean Algorithm: Correctness

Why does this algorithm work? How do we know it spits out the correct answer?

Suppose $a > b$. Then:

- Fact 1: If integer $q$ divides $a$ (denoted as $q \mid a$) and $q$ divides $b$ ($q \mid b$) then $q$ divides $a \bmod b$ ($q \mid a \bmod b$).
- Fact 2: If integer $q$ divides $b$ ($q \mid b$) and $q$ divides $a \bmod b$ ($q \mid a \bmod b$) then $q$ divides $a$ ($q \mid a$).

Exercise 1: Show that this is true.

Exercise 2: Show that Fact 1 together with Fact 2 imply the algorithm is correct.

The Euclidean Algorithm: Complexity

Working out gcd(a, b) using the Euclidean algorithm requires a number of iterations no greater than 4.8 times the number of digits in the smaller number, and usually much less than this.

The complexity of the algorithm is $O(\log a \times \log b)$, so it is entirely feasible to use it for numbers hundreds of bits/digits long.

There are gcd algorithms with lower complexities, but none is known which is less than logarithmic in the larger number.

Relatively Prime Numbers

Two numbers $a$ and $b$ are said to be relatively prime if their GCD is 1.

As we saw in an earlier slide, 8 and 15 are relatively prime (though neither of them is a prime number). On the other hand, 7 and 14 are not relatively prime, even though 7 is prime, because gcd(7, 14) = 7.

Several authors use the notation $a \perp b$ to signify that $a$ and $b$ are relatively prime.

CO634 Cryptography
The Data Encryption Standard (DES)

Outline
1 The Data Encryption Standard (DES)
  - DES: Background and outline
  - Attacks on DES
  - Beyond DES

The Data Encryption Standard (DES)

- Grew out of LUCIFER, an encryption algorithm developed by Horst Feistel at IBM in 1971, and used by Lloyds Bank to secure ATMs (holes in the wall) against telephone fraud. [1]
- In 1973, the (US) National Bureau of Standards (since renamed the National Institute of Standards and Technology, NIST) asked for proposals for a national cipher standard.
- IBM submitted a proposal developed from LUCIFER in consultation with the (US) National Security Agency (NSA), amongst others. This was the winning proposal.
- In 1977, the National Bureau of Standards adopted DES as Federal Information Processing Standard 46.
- In 1999, NIST indicated that plain DES should not be used for new systems. We'll see why later.

[1] See Hung at https://dspace.mit.edu/bitstream/1721.1/28754/1/59822564.pdf. Stallings and various web sources refer instead to Lloyd's of London, i.e. the insurance market, which seems less likely.

DES: Early Concerns

- Key size reduced from LUCIFER's 128 bits to 56 bits, apparently so DES could be implemented on a single chip.
- NSA proposed changes to the design of the S-boxes. Were these to ensure that the US government could decrypt traffic?

Product Ciphers

If two or more encryption functions are applied in turn, the result is called a product cipher.

[Figure: a three-stage product cipher. The plaintext block P passes through encryption functions E1, E2 and E3 in turn, with encryption keys K1, K2 and K3, to give the ciphertext block C.]

$C = E_3(E_2(E_1(P, K_1), K_2), K_3)$

Often the different stages (rounds, as they're called) use the same encryption function, i.e. $E_1, E_2, \dots$ are all the same function; only the key changes from round to round.

Often the keys $K_1, K_2, \dots$ for each round are derived from a master key; $K_1, K_2, \dots$
are called subkeys.

Feistel Cipher Structure

Many block ciphers (including DES and Blowfish) are implemented as product ciphers in which each round operates in the way shown in the figure:

- The input block is divided into two halves;
- The left-hand half is XORed with some function of the right-hand half;
- The two halves are then swapped.

[Figure: two Feistel rounds. The plaintext block is split into halves LP and RP. In each round, the round function F, keyed by K1 and then K2, is applied to the right-hand half (R1, then R2) and the result is XORed into the other half. After the final swap, the halves LC and RC form the ciphertext block.]

This structure was proposed by Horst Feistel (and used in LUCIFER).

Typically far more than two rounds are used. After all the rounds are complete, the two halves are swapped once more (or equivalently, the swap in the last round is omitted).

Feistel Cipher Structure: Encryption

Suppose there are $n$ rounds, and let $R_i$ ($i = 1, \dots, n$) be the input to the round function in the $i$th encryption round. Then we have:

$R_1 = R_P$
$R_2 = L_P \oplus F(R_1, K_1)$

Then for $i = 2, \dots, n - 1$:

$R_{i+1} = R_{i-1} \oplus F(R_i, K_i)$

and the ciphertext is given by:

$L_C = R_{n-1} \oplus F(R_n, K_n)$
$R_C = R_n$

Feistel Cipher Structure: Decryption

By rearranging the equations on the previous slide (using the properties of $\oplus$) we get:

$R_n = R_C$
$R_{n-1} = L_C \oplus F(R_n, K_n)$

Then for $i = n - 1, \dots, 2$:

$R_{i-1} = R_{i+1} \oplus F(R_i, K_i)$

and the plaintext is given by:

$L_P = R_2 \oplus F(R_1, K_1)$
$R_P = R_1$

So decryption is exactly like encryption, except that the subkeys are used in reverse order.

Anatomy of DES

[Figure: anatomy of DES. The plaintext block (64 bits) has its bits permuted, passes through 16 Feistel rounds and a final swap, and then the reverse bit permutation gives the ciphertext block (64 bits). Subkey generation derives the subkeys (48 bits) used by the rounds from the key (56 bits).]

The DES Round Function

This diagram is drawn right-to-left to match the Feistel diagram.
[Figure: DES round function, with these stages: rearrange the bits of the half-block (32 bits), duplicating 16 of them, to give 48 bits; bring in the subkey (48 bits); break into 6-bit segments; S-boxes S1 to S8; aggregate the 4-bit segments; permute bits; half-block (32 bits).]
Each S-box (S = substitution) converts a 6-bit input into a 4-bit output, using (in effect) a look-up table. (See Stallings [2006] Table 3.3 for details.)

S-Box S1
Input    Output
000000   1110
000001   0000
000010   0100
000011   1111
000100   1101
000101   0111
000110   0001
000111   0100
001000   0010
001001   1110
001010   1111
...      ...
The other S-boxes perform different, but similarly complicated, substitutions. (You’re not expected to remember these numbers!)

Attacks on DES: Brute-force attacks
A brute-force attack on a cipher means trying all possible keys in turn until you find one that turns the ciphertext into intelligible plaintext.
There are $2^{56} \approx 7 \times 10^{16}$ possible DES keys, and vulnerability to brute-force attack was always a concern.
In 1998, the Electronic Frontier Foundation (EFF) announced that it had broken a DES encryption using a specially built machine (‘Deep Crack’). The machine cost less than $250,000, and decryption took three days.
The following year EFF decrypted a message in less than 24 hours, by distributing computations to 100,000 PCs over the Internet.
That same year, NIST recommended that plain DES should not be used for new systems.

Attacks on DES: Analytical attacks
An analytical attack is one that analyses the properties of an encryption algorithm, and uses this together with other information available to the cryptanalyst (e.g. ciphertext, or plaintext/ciphertext pairs) to break a cipher with less computational effort than is required for a brute-force attack.
Note that it isn’t necessary for an analytical attack to lead the cryptanalyst to a specific key: it is still useful if it can significantly reduce the set of possible keys. This set can then be searched exhaustively, as in a brute-force attack.
For example ...

Attacks on DES: Timing attacks
... it has been shown (Hevia and Kiwi [1999]) that in certain implementations of DES, the time taken to encrypt a message depends on the number of bits in the key that are equal to one.
So if the cryptanalyst were in a position to observe the time taken to perform the encryption, he could use this to determine the number of one bits in the key, and then search only over the appropriate keys.
The worst case (from the cryptanalyst’s point of view!) is when there are 28 1-bits, in which case there are still about $8 \times 10^{15}$ keys to be searched. But even that is a tenfold reduction.
How can the cryptographer defeat this sort of attack?

Attacks on DES: Other analytical attacks
Plenty of research (see Stallings [2006] Sec. 3.4 for details), but nothing groundbreaking.
Or at least, nothing published!

Double DES (not as strong as you might think)
What about using a two-stage product cipher, with DES at each stage? In other words, compute the ciphertext as follows:
$C = E_{DES}(E_{DES}(P, K_1), K_2)$
where $E_{DES}$ is the DES encryption function. The two keys together amount to 112 bits, so this looks as if it would be much more secure than a single DES encryption.
Unfortunately, if the cryptanalyst has at least two plaintext/ciphertext pairs (i.e. two plaintext blocks and the corresponding ciphertext blocks), then he can attack the code using a meet-in-the-middle attack, using little more effort than a brute-force attack on single DES. (For details, see Stallings [2006] p. 177.)
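To see why two plaintext/ciphertext pairs are enough, here is a minimal Java sketch of a meet-in-the-middle attack. It does not use DES: the cipher below is a hypothetical toy (16-bit block, 8-bit key) invented purely for illustration, and the class and method names are assumptions, not anything from Stallings. The point is the shape of the attack: tabulate E(P, K1) for every K1, then for every K2 check whether D(C, K2) meets a tabulated value, so the work is roughly two single-key searches rather than one search over all key pairs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy meet-in-the-middle attack on double encryption (illustrative cipher, NOT DES). */
final class MeetInTheMiddleSketch {

    // Hypothetical toy block cipher: 16-bit block, 8-bit key (0..255). Not secure.
    static int encrypt(int block, int key) {
        int x = (block + key) & 0xFFFF;
        x = ((x << 5) | (x >>> 11)) & 0xFFFF;   // rotate left by 5 within 16 bits
        return x ^ (key * 257);                 // key * 257 repeats the key byte twice
    }

    // Inverse of encrypt: undo the XOR, the rotation and the addition in reverse order.
    static int decrypt(int block, int key) {
        int x = block ^ (key * 257);
        x = ((x >>> 5) | (x << 11)) & 0xFFFF;   // rotate right by 5 within 16 bits
        return (x - key) & 0xFFFF;
    }

    static int doubleEncrypt(int block, int k1, int k2) {
        return encrypt(encrypt(block, k1), k2);
    }

    /** Returns every {k1, k2} consistent with both known plaintext/ciphertext pairs. */
    static List<int[]> attack(int p1, int c1, int p2, int c2) {
        // Forward half: tabulate the middle value E(p1, k1) for each possible k1.
        Map<Integer, List<Integer>> middle = new HashMap<>();
        for (int k1 = 0; k1 < 256; k1++) {
            middle.computeIfAbsent(encrypt(p1, k1), m -> new ArrayList<>()).add(k1);
        }
        // Backward half: for each k2, see whether D(c1, k2) meets a tabulated value.
        List<int[]> found = new ArrayList<>();
        for (int k2 = 0; k2 < 256; k2++) {
            List<Integer> candidates = middle.get(decrypt(c1, k2));
            if (candidates == null) {
                continue;
            }
            for (int k1 : candidates) {
                // The second pair weeds out matches that arose by chance.
                if (doubleEncrypt(p2, k1, k2) == c2) {
                    found.add(new int[] {k1, k2});
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int k1 = 0x3A, k2 = 0xC5;               // the "unknown" keys
        int p1 = 0x1234, p2 = 0xBEEF;           // two known plaintext blocks
        List<int[]> keys = attack(p1, doubleEncrypt(p1, k1, k2),
                                  p2, doubleEncrypt(p2, k1, k2));
        for (int[] pair : keys) {
            System.out.printf("k1=%02X k2=%02X%n", pair[0], pair[1]);
        }
    }
}
```

With 8-bit keys the attack performs about 2 × 256 toy encryptions and decryptions plus table look-ups, instead of the 65,536 trials needed to try every key pair. Because the toy cipher is weak, more than one key pair may survive both checks; the true pair is always among them.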

Triple DES with Two Keys
The remedy is to use one more stage: this is called triple DES (3DES). One approach is to use just two keys, as follows:
[Figure: the plaintext block P passes through the DES encryption function (key K1), the DES decryption function (key K2) and the DES encryption function (key K1), giving the ciphertext block C.]
$C = E_{DES}(D_{DES}(E_{DES}(P, K_1), K_2), K_1)$
where $D_{DES}$ is the DES decryption function.
The reason for using the decryption function in the second stage is so that a 3DES chip can also be used to encrypt/decrypt simple DES: just set $K_1 = K_2$.
This approach forms the basis of standards ANSI X9.17 and ISO 8732 (not examinable).

Triple DES with Three Keys
Cryptanalytic research is making some headway on 3DES with two keys (see Stallings [2006] p. 179 for details). Consequently it is now recommended to use three keys:
$C = E_{DES}(D_{DES}(E_{DES}(P, K_1), K_2), K_3)$
(It is still possible to encrypt/decrypt simple-DES traffic by setting the three keys equal.)

The Advanced Encryption Standard (AES)
A snag with DES is that (reflecting 1970s technology) it was designed for implementation in dedicated hardware. It does not lend itself to efficient software implementation. In 3DES, the efficiency problems are compounded.
NIST issued a call for proposals in 1997 for an improved standard which would avoid this and other problems. The result was the Advanced Encryption Standard, published as a standard in 2001.
AES is a symmetric block cipher with a block size of 128 bits and a key size of 128, 192 or 256 bits. Further details are beyond the scope of this course.

CO634 Cryptography
Hash Functions and MACs
lectures by Carlos A.
Perez-Delgado, School of Computing, University of Kent, © 2019

Outline
1 Hash Functions and MACs
Introduction
Cryptographic hash functions and MACs
Implementing cryptographic hashes

Digital Signatures Using Asymmetric Encryption
We saw in an earlier lecture that if Alice wanted to send a message to Bob, she could in effect sign the document by encrypting it with her private key.
This would convince Bob that the message came from Alice. Also, if Bob stores away Alice’s ciphertext, he can subsequently use it to prevent Alice denying that she sent the message. (Well, that’s provided he can prove that he’s using Alice’s public key to decrypt it.)
However, encrypting and decrypting a long document can be time-consuming, and if all we want is a digital signature, there are more efficient ways of achieving this, using cryptographic hash functions or message authentication codes (MACs).

Preventing Alteration in Transit
We also saw that if Alice encrypts a message with her private key, this provides some protection against Eve altering the message in transit, because the altered message would probably decrypt to gobbledygook.
This is certainly true if the plaintext consists of ordinary English text, for example, or some highly structured type of document.
But there are other types of data where it is much harder to distinguish gobbledygook from meaningful data. What if any sequence of bits was a meaningful message? Then it would be impossible to determine whether Eve had tampered with the message.
(Eve wouldn’t be able to control what the modified message meant, but possibly her objective is simply to spread confusion.)

Preventing Alteration in Transit: Possible remedies
One remedy would be for Alice to include some sort of checksum at the end of each message before encrypting it with her private key.
After decryption, Bob would verify that the decrypted checksum agreed with the decrypted message: if it didn’t, he’d know that the message had somehow been altered in transit.
However, as we shall shortly see, another method is for Alice to use a MAC, or to use a cryptographic hash function and sign the hash with her private key.
Either method means that it is unnecessary for Alice to encrypt the entire message with her private key. (Although, if the message is confidential, she may still want to encrypt it with Bob’s public key.)
Another name for a cryptographic hash function is a modification detection code (MDC).

Hash Functions
A hash function is a function that takes an input of arbitrary length, and produces an output of fixed length. The output is often referred to as the hash of the input.
Compilers often use hash functions to manage program identifiers. These identifiers are strings of arbitrary length. Whenever the compiler’s lexical analyser encounters an identifier in the program, it will apply a hash function to it to yield, say, a 16-bit integer.
Then, to see whether the identifier has been seen before in the program, the compiler need only consider identifiers with the same hash value, of which there will be relatively few: often none or one. This saves making numerous string comparison operations, which are relatively time consuming.
This is the basis of the data structure known as a hash table.

Desirable Properties of Hash Functions
When used as the basis for a hash table, it is desirable that a hash function has the following properties:
The length (in bits) of the hash value needs to be relatively short, e.g. 8 bits or 16 bits.
After all, the hash table may need to allocate some memory for every possible hash value, and 16 bits means that there needs to be space for 65536 entries in the table.
When evaluated over various possible inputs (e.g. program inputs), the hash function should yield output values that are evenly spread across the range of possible values. (A hash table wouldn’t be very efficient if almost all inputs hashed down to 17, say.)
The hash function must be easy to compute.

Hash Functions in Cryptography
Hash functions are also useful in cryptography. However, to defeat brute-force attacks, it is necessary for the hash values to be longer than those used for hash tables: 160 bits is an absolute minimum at present for legacy applications, and 256 bits are required for new applications (ENISA, Nov 2014).
Examples of cryptographic hash functions are MD5 (128 bits, thus obsolete), NIST’s Secure Hash Algorithm (SHA) family (SHA, also known as SHA-0; SHA-1; SHA-2; and SHA-3), with 160, 224, 256, 384 and 512 bit variants, and Whirlpool (512 bits). MD5 and SHA variants of fewer than 256 bits are obsolescent. SHA-0 and SHA-1 are considered obsolete.
A MAC is essentially a cryptographic hash function whose computation requires the use of a (secret) key. Examples of MACs are HMAC and CMAC. Some caution is needed in using CMAC.
See Stallings [2006] Ch. 12 for (non-examinable) details of SHA, Whirlpool, HMAC and CMAC.

An Application: Mirroring Open-Source Software
Repositories of open source software such as sourceforge.net often encourage users to download software from a nearby mirror (such as www.mirrorservice.org) rather than directly from the repository itself. This eases the load on the central server.
But what if a site purports to be a mirror of sourceforge.net, but actually modifies some of the mirrored packages to include viruses or spyware? In other words, it is what is sometimes called a Trojan mirror.
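To make the difference between a plain cryptographic hash and a MAC concrete, here is a small Java sketch using the standard `java.security.MessageDigest` and `javax.crypto.Mac` classes. The class name, helper names and sample key are assumptions chosen for illustration; SHA-256 and HMAC-SHA-256 are used simply as one current example of each kind of function. Anyone can recompute the hash of a downloaded package, whereas the MAC can only be computed or checked by someone who holds the secret key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative helpers: an unkeyed hash (SHA-256) and a keyed MAC (HMAC-SHA-256). */
final class HashAndMacSketch {

    // Render bytes as lowercase hexadecimal, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Unkeyed cryptographic hash: the output is always 256 bits (64 hex digits). */
    static String sha256Hex(String message) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return toHex(md.digest(message.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Keyed MAC: the same message gives a different tag under a different key. */
    static String hmacSha256Hex(String message, byte[] key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return toHex(mac.doFinal(message.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "shared secret (example only)".getBytes(StandardCharsets.UTF_8);
        System.out.println("SHA-256: " + sha256Hex("abc"));
        System.out.println("HMAC   : " + hmacSha256Hex("abc", key));
    }
}
```

In practice the input would be the bytes of a package file rather than a short string, and the user would compare the hex string against the value published by the central repository.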

Combatting Trojan Mirrors: Using a plain hash
One way of stopping this is for the central repository to publish on its website a hash (e.g. an MD5 hash, usually called an md5sum) of each of the packages it contains.
A user will:
1. download a package from a mirror,
2. compute its md5sum, and
3. compare the value he gets with the value published by the central repository (NB: not by the mirror!).
If the results are different, the user knows the package has been modified en route.
This imposes little load on the central server.

Combatting Trojan Mirrors: Using a signed hash
Another way is for the central repository to compute a hash of each package, and then sign the hash by encrypting it with the repository’s private key. The packages and the signed hashes are distributed to the mirrors.
A user will:
1. download a package and the corresponding signed hash from a mirror,
2. compute the package’s hash value (e.g. md5sum),
3. decrypt the signed hash value using the central repository’s public key, and
4. compare the two hash values.
Apart from downloading the central repository’s public key (which only needs to be done once), this imposes no load at all on the central server.

Cryptographic Hash Functions: Desirable properties
Let’s denote a hash function by $H(\cdot)$. There are more exacting requirements on a cryptographic hash function than one used for a hash table:
One-way: Given a hash value h, it must be computationally impractical to find a plaintext message P such that $H(P) = h$.
Weak collision resistance: Given a plaintext message P, it must be computationally impractical to find another message $P'$ such that $H(P) = H(P')$.
Strong collision resistance: Better still, it must be computationally impractical to find any two messages P and $P'$ that yield the same hash value: $H(P) = H(P')$.
None of these requirements can be met unless the hash values are long enough (128+ bits), as we’ve already remarked.
But this is a necessary condition, not a sufficient one.

Iterating a Compression Function
[Figure: the initialisation vector VI = h0 and block P1 feed a compression function, giving h1; h1 and P2 feed a compression function, giving h2; h2 and P3 feed a compression function, giving h3.]
A common approach to implementing cryptographic hash functions is to divide up the plaintext into fixed-size blocks:
$P = P_1 \parallel P_2 \parallel P_3 \parallel \ldots \parallel P_n$
(If the block size is b bits, then the plaintext is padded out to be a multiple of b bits beforehand.) For example, MD5, SHA-1 and SHA-256 all use 512-bit blocks.
Successive blocks are then processed using a compression function, which takes in a hash value $h_{i-1}$ and a plaintext block $P_i$, and computes a new hash value $h_i$.
The initial hash value $h_0$ is set to some fixed value (e.g. zero), also called the initialisation vector VI (cf. cipher block chaining mode).

Iterating a Compression Function, 2
[Figure: $h_{n-1}$ and $P_n$ feed the compression function, giving $h = h_n$.]
If the message contains n blocks, then the hash value h of the entire message is equal to $h_n$: the hash value that emerges after processing the last block.

MD-Strengthening
[Figure: $h_{n-1}$ and $P_n$ feed a compression function, giving $h_n$; $h_n$ and $P_{count}$ feed a final compression function, giving h.]
However, it is a good idea to append a final block $P_{count}$, which contains the total length in bits of the plaintext message (represented in binary).
Adding this extra block is called Merkle-Damgård strengthening, or MD-strengthening. (Incidentally, the MD in MD5 stands not for Merkle-Damgård, nor for Modification Detection, but for Message Digest.)

Iterated Compression Functions: Collision resistance
Suppose that a compression function $h_{out} = f(P_{in}, h_{in})$ is strongly collision resistant, in the sense that for any input hash value $h_{in}$, it is computationally impractical to find any two input blocks $P_{in}$ and $P'_{in}$ such that $f(P_{in}, h_{in}) = f(P'_{in}, h_{in})$.
Then it can be shown that, provided MD-strengthening is used, the hash function obtained by iterating $f(P_{in}, h_{in})$ is also strongly collision resistant. (But the converse is not true.)
So the problem of designing a collision resistant cryptographic hash can be reduced to the problem of finding a collision resistant compression function. MD5 and SHA both rely on this fact.

Implementing Compression Functions
The design of collision resistant compression functions is beyond the scope of this course, but typically they are implemented in a similar way to symmetric block ciphers such as DES, with multiple rounds. For example SHA-512 uses 80 rounds.
See Stallings [2006] Ch. 12 or Menezes et al. [1997] §9.4 if you’re interested.

CO634 Cryptography
Introduction
lectures by Carlos A. Perez-Delgado, School of Computing, University of Kent, © 2019

Notes for the lectures on Cryptography given by Carlos Perez-Delgado in CO634, 2019 onwards.
These notes are provided for the convenience of students, to avoid the need for excessive notetaking during lectures. However, the lectures may differ in content and/or sequence from what is presented here. Also note that these notes are subject to change as the term goes on; in particular, notes for lectures yet to be delivered are only provisional.
The slides have been modified and updated by Carlos Perez-Delgado from the lecture notes and supporting infrastructure of Eerke Boiten, Andrew Runnalls, and Dan Grundy, all in their roles as lecturers on CO634 at the University of Kent. Their copyright is acknowledged.

Outline
1 Introduction
MathsBite: Modular Arithmetic
Introduction
Substitution Ciphers

MathsBite: a mod n
If a is an integer and n is a positive integer, then a mod n (read as “a modulo n”) stands for the remainder when a is divided by n. n is called the modulus.
So, for example 27 mod 16 = 11, 0 mod 18 = 0, 12 mod 5 = 2.
The remainder is always in the range from 0 to n - 1, even if a is negative, so for example -27 mod 16 = 5, -12 mod 5 = 3.
a       = ... -6 -5 -4 -3 -2 -1  0  1  2  3  4  5  6 ...
a mod 5 = ...  4  0  1  2  3  4  0  1  2  3  4  0  1 ...

MathsBite: a mod n in Java
In C (and other languages derived from it, such as Java), the modulus is evaluated using the % operator, e.g. 27%16. (In VB it is just Mod.)
Beware, however, that % may give unexpected answers if either or both operands are negative. In Java, for example, a%b has the same sign as a, so -7%4 evaluates to -3, not +1 as you might expect.
To calculate a mod n correctly for any value of a (but n must still be positive), use Java code such as:
((a % n) + n) % n
(The shorter form a%n + ( a >= 0 ? 0 : n ) is not quite right: when a is a negative multiple of n, such as a = -8 with n = 4, it yields n instead of 0. From Java 8 onwards, Math.floorMod(a, n) does the same job.)

MathsBite: Congruence modulo n
Two integers a and b are said to be congruent modulo n (or “equal modulo n”) if
(a mod n) = (b mod n)
i.e. if a and b give the same remainder when divided by n.
This relationship is written in various ways, for example:
$a \equiv b \pmod{n}$
$a \stackrel{n}{=} b$

MathsBite: Arithmetic modulo n
Arithmetic modulo n (where n is a positive integer) works much like ordinary arithmetic, except that the only numbers are 0, ..., n - 1. For example, in arithmetic modulo 7, the only numbers are 0, 1, 2, 3, 4, 5 and 6.
Addition and subtraction are defined in the ordinary way, except that we replace the ordinary result with its remainder modulo n: it’s as if the numbers ‘wrap around’ from n - 1 to 0.
So for example, in arithmetic modulo 7:
2 + 2 = 4
5 + 2 = 0
6 + 3 + 2 = 4
1 - 1 = 0
2 - 4 = 5
4 + 2 - 3 = 3

MathsBite: Precedence of mod
In these notes the operator mod is given a low precedence: lower than + and -, for example, and certainly lower than ×, so that
a + b mod n
means the same as
(a + b) mod n
(Note that in Java etc., % has the same precedence as *.)
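The two Java pitfalls above (the sign of `%` and its precedence) can be checked with a few lines of code. This is a minimal sketch: the class name `ModSketch` and the helper `mod` are made up for illustration, and the helper uses the double-remainder form `((a % n) + n) % n`, which also returns 0 when a is a negative multiple of n.

```java
/** Illustrative helper for "a mod n" with a result in 0..n-1. */
final class ModSketch {

    // n must be positive; assumes n is small enough that (a % n) + n cannot overflow an int.
    static int mod(int a, int n) {
        return ((a % n) + n) % n;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 4);                // -3: Java's % takes the sign of a
        System.out.println(mod(-7, 4));            // 1
        System.out.println(mod(-8, 4));            // 0
        System.out.println(Math.floorMod(-7, 4));  // 1: library equivalent (Java 8+)

        // Precedence: in Java, % binds as tightly as *, unlike "mod" in these notes.
        System.out.println(2 + 7 % 4);             // 5, parsed as 2 + (7 % 4)
        System.out.println((2 + 7) % 4);           // 1, the meaning of "2 + 7 mod 4" here
    }
}
```

So when translating a formula such as a + b mod n from the notes into Java, the sum needs explicit parentheses.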

MathsBite: Theorem 1
If a and b are any integers, and n is a positive integer, then:
a + b mod n = (a mod n) + (b mod n) mod n
a - b mod n = (a mod n) - (b mod n) mod n
For example
1815 + 2028 mod 10 = (1815 mod 10) + (2028 mod 10) mod 10
                   = 5 + 8 mod 10
                   = 13 mod 10
                   = 3
and
8025 - 9454 mod 5 = (8025 mod 5) - (9454 mod 5) mod 5
                  = 0 - 4 mod 5
                  = 1

MathsBite: Theorem 1
We can write a as $nq_a + r_a$ where $q_a$ is an integer and $r_a = a \bmod n$. ($q_a$ is the quotient when a is divided by n, and $r_a$ is the remainder.) Similarly we can write b in the form $nq_b + r_b$, where $r_b = b \bmod n$. Then:
a + b mod n = $nq_a + r_a + nq_b + r_b$ mod n
            = $n(q_a + q_b) + r_a + r_b$ mod n
But $n(q_a + q_b)$ is an exact multiple of n, so adding it to $r_a + r_b$ won’t affect the remainder when we divide by n, so
$n(q_a + q_b) + r_a + r_b$ mod n = $r_a + r_b$ mod n
Hence
a + b mod n = $r_a + r_b$ mod n
            = (a mod n) + (b mod n) mod n
which proves the first equation of the theorem. The second equation is proved in the same way.

MathsBite: Arithmetic modulo 2
Arithmetic modulo 1 isn’t very interesting, because the only number is 0, and the result of any calculation is therefore quite easy to guess!
In arithmetic modulo 2, the only numbers are 0 and 1, and:
0 + 0 = 0    0 - 0 = 0
0 + 1 = 1    0 - 1 = 1
1 + 0 = 1    1 - 0 = 1
1 + 1 = 0    1 - 1 = 0
So addition and subtraction do the same thing. You’ve probably met this operation before: it’s called exclusive-OR (XOR). Later on we’ll be using the symbol ‘$\oplus$’ for this operation.

Cryptology and Cryptography
Cryptology means the study of codes, i.e. methods for converting plaintext into ciphertext (encryption), and for recovering the plaintext from the ciphertext (decryption). Cryptology embraces two branches:
Cryptanalysis: Methods for ‘breaking’ codes. At the very least the cryptanalyst will have a certain amount of ciphertext at his disposal.
Better, he will have a number of ciphertext messages for which he knows (or is able to guess) the corresponding plaintext. Better still, he will have some ciphertext messages for which he chose the plaintext. He may also know something about the sort of code in use (Kerckhoffs’s principle).
Cryptography: Devising codes that are resistant to cryptanalysis and, ideally, can be proved to resist cryptanalysis.

If Crypto is the Answer, what is the Question?
watermarking; digital signatures; symmetric authentication, e.g. through Needham-Schroeder; hashing; key distribution and exchange; asymmetric authentication and encryption; integrity; confidentiality

The Scope of Cryptography
Suppose that in the course of a business transaction, Alice needs to send a message to Bob. Depending on the context, these are some of the things we might want to ensure using cryptography:
Confidentiality/secrecy: Even if the communication medium is insecure, nobody other than Bob should be able to read Alice’s message.
Authentication: Bob should be able to satisfy himself that the message does indeed come from Alice.
Integrity: Bob should be able to check that Alice’s message has not been modified during transmission, either accidentally or deliberately.
Source nonrepudiation: Alice should not be able to deny sending the message.
Destination nonrepudiation: Bob should not be able to deny receiving the message.
Signatures: Bob should be able to convince a third party (e.g. Charlie) that the message originated from Alice. This is a stronger version of authentication.
(cf. Stallings [2006] p. 319)

Books
If you want a textbook to support the course, I suggest William Stallings, ‘Cryptography and Network Security’, published by Pearson/Prentice Hall, any edition starting from the Fourth Edition [2006], particularly Chapters 0–3, 6, 7, 9, 10 (excluding 10.2 and 10.3), 11, 12 (excluding 12.2 and 12.4), and 13. Reading guides for the 5th edition and other textbooks are on the course webpage.
A definitive reference book is ‘The Handbook of Applied Cryptography’ by A.J. Menezes, Paul van Oorschot, and Scott A. Vanstone, published by CRC Press. It is available online at www.cacr.math.uwaterloo.ca/hac.

Caesar Cipher
Julius Caesar used a cipher in which each letter of the (ancient) Latin alphabet was replaced by the letter three places on:
Plain:  A B C D E F G H I K L M
Cipher: D E F G H I K L M N O P

Plain:  N O P Q R S T V X Y Z
Cipher: Q R S T V X Y Z A B C
So, for example VENI VIDI VICI would be enciphered as ZHQM ZMGM ZMFM.

Caesar Cipher: Modular Arithmetic Interpretation
If we represent each letter by a number: A → 0, B → 1, ..., Z → 22, then we can describe the Caesar cipher by saying that it transforms each plaintext letter p into a ciphertext letter
c = p + k mod 23
where k = 3.
k is called the key. Different ciphers can be obtained by choosing different values of k.
Decryption can be performed using the equation:
p = c - k mod 23

A General Substitution Cipher
Instead of moving on a fixed number of places in the alphabet, how about shuffling up the letters of the (modern English) alphabet in an arbitrary way? For example:
Plain:  A B C D E F G H I J K L M
Cipher: Y B L S H X I J N W D C O

Plain:  N O P Q R S T U V W X Y Z
Cipher: P R U G K Z F T V A M E Q
So MEET ME IN ST LOUIS gets encrypted as OHHF OH NP ZF CRTNZ

Substitution Ciphers: Vulnerable to letter frequency analysis
The following features of English make substitution ciphers vulnerable to cryptanalysis:
Widely different letter frequencies. E, T, and A are particularly common; J, X, Q and Z are rare.
A single-letter word is very likely to be A or I.
Some letters are rarely or never doubled: A, H, I, J, Q, U, V, W, X, Y.
Other natural languages (and indeed computer languages) exhibit similar characteristic features.
In Italian, for example, words (especially long words) usually end in a vowel.

CO634 Cryptography
Introduction
lectures by Carlos A. Perez-Delgado, School of Computing, University of Kent, © 2019

Outline
1 Introduction
Recap: Modular Arithmetic
MathsBite: Modular Multiplication
Recap: Substitution Ciphers
Vigenère Ciphers
Transposition Ciphers
One-time Pads
Traffic Analysis

Reminder: a mod n
In the last Crypto lecture we learned about modular arithmetic.
If a is an integer and n is a positive integer, then a mod n (read as “a modulo n”) stands for the remainder when a is divided by n. n is called the modulus.
So, for example 27 mod 16 = 11, 0 mod 18 = 0, 12 mod 5 = 2.

MathsBite: Congruence modulo n
Two integers a and b are said to be congruent modulo n (or “equal modulo n”) if
(a mod n) = (b mod n)
i.e. if a and b give the same remainder when divided by n.
This relationship is written in various ways, for example:
$a \equiv b \pmod{n}$
$a \stackrel{n}{=} b$

MathsBite: Theorem 1
If a and b are any integers, and n is a positive integer, then:
a + b mod n = (a mod n) + (b mod n) mod n
a - b mod n = (a mod n) - (b mod n) mod n
For example
1815 + 2028 mod 10 = (1815 mod 10) + (2028 mod 10) mod 10
                   = 5 + 8 mod 10
                   = 13 mod 10
                   = 3
and
8025 - 9454 mod 5 = (8025 mod 5) - (9454 mod 5) mod 5
                  = 0 - 4 mod 5
                  = 1

You already knew Modular Arithmetic! ... You just didn’t know you knew it
Think of the time of day.
What time (hour) is it right now?
What time will it be exactly 10 days from now?
What time will it be exactly 193,198,622 days from now?
What about 24001 hours from now?
What about 1000 hours from now?
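The clock questions are all arithmetic modulo 24, and can be answered mechanically. The sketch below is illustrative only: the class and method names are made up, the current hour is assumed to be 9 for the sake of the example, and the number of hours ahead is assumed to be non-negative. It reduces each term modulo 24 before adding, which Theorem 1 says is allowed.

```java
/** Illustrative clock arithmetic: hours of the day are numbers modulo 24. */
final class ClockSketch {

    /** Hour of day (0..23) reached hoursAhead hours after currentHour; hoursAhead >= 0. */
    static int hourAfter(int currentHour, long hoursAhead) {
        // Theorem 1: (a + b) mod 24 = ((a mod 24) + (b mod 24)) mod 24
        return (int) ((currentHour % 24 + hoursAhead % 24) % 24);
    }

    public static void main(String[] args) {
        int now = 9; // assumed current hour, for illustration only
        System.out.println(hourAfter(now, 10L * 24));          // 9: whole days change nothing
        System.out.println(hourAfter(now, 193_198_622L * 24)); // 9: however many days
        System.out.println(hourAfter(now, 24_001));            // 10: 24001 mod 24 = 1
        System.out.println(hourAfter(now, 1_000));             // 1: 1000 mod 24 = 16
    }
}
```

The number of hours is held in a `long` because 193,198,622 days is more hours than an `int` can represent.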

MathsBite: Multiplication modulo n
Modular multiplication is similar to modular sums: we carry out the multiplication in the ordinary way, but then replace the result with its remainder modulo n.
So for example, in arithmetic modulo 7:
2 × 2 = 4
5 × 2 = 3
6 × 3 × 2 = 1

MathsBite: Times Tables Modulo 7
Remember that in arithmetic modulo n, the only numbers are 0, ..., n - 1. So, in arithmetic modulo 7, for example, we can write out all the multiplication tables quite briefly:
×   0 1 2 3 4 5 6
0   0 0 0 0 0 0 0
1   0 1 2 3 4 5 6
2   0 2 4 6 1 3 5
3   0 3 6 2 5 1 4
4   0 4 1 5 2 6 3
5   0 5 3 1 6 4 2
6   0 6 5 4 3 2 1

MathsBite: Times Tables Modulo 9
×   0 1 2 3 4 5 6 7 8
0   0 0 0 0 0 0 0 0 0
1   0 1 2 3 4 5 6 7 8
2   0 2 4 6 8 1 3 5 7
3   0 3 6 0 3 6 0 3 6
4   0 4 8 3 7 2 6 1 5
5   0 5 1 6 2 7 3 8 4
6   0 6 3 0 6 3 0 6 3
7   0 7 5 3 1 8 6 4 2
8   0 8 7 6 5 4 3 2 1

MathsBite: Theorem 2
If a and b are any integers, and n is a positive integer, then:
a × b mod n = (a mod n) × (b mod n) mod n
Proof: (not examinable) As in the proof of Theorem 1, we can write a as $nq_a + r_a$ where $q_a$ is an integer and $r_a = a \bmod n$. ($q_a$ is the quotient when a is divided by n, and $r_a$ is the remainder.) Similarly we can write b in the form $nq_b + r_b$, where $r_b = b \bmod n$.
Then:
a × b mod n = $(nq_a + r_a) \times (nq_b + r_b)$ mod n
            = $n^2 q_a q_b + n(q_a r_b + q_b r_a) + r_a r_b$ mod n
            = $r_a r_b$ mod n
            = (a mod n) × (b mod n) mod n

An Application of Theorems 1 and 2
486 mod 9 = 4 × 10 × 10 + 8 × 10 + 6 mod 9
          = (4 × 10 × 10 mod 9) + (8 × 10 mod 9) + (6 mod 9) mod 9   (using Theorem 1)
          = [(4 mod 9) × (10 mod 9) × (10 mod 9) mod 9] + [(8 mod 9) × (10 mod 9) mod 9] + [6 mod 9] mod 9   (using Theorem 2)
          = [4 × 1 × 1] + [8 × 1] + [6] mod 9
          = 4 + 8 + 6 mod 9
          = 18 mod 9
          = 0
In this way we can show in general that an integer is divisible by 9 if and only if the sum of its digits is divisible by 9.
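The digit-sum rule can be checked empirically. The following Java sketch (class and method names are illustrative, and it handles non-negative numbers only) sums the decimal digits of a number and compares the result, modulo 9, with the number itself.

```java
/** Illustrative check of the rule: n is divisible by 9 iff its digit sum is. */
final class DigitSumSketch {

    /** Sum of the decimal digits of a non-negative number. */
    static int digitSum(long n) {
        int sum = 0;
        while (n > 0) {
            sum += (int) (n % 10); // last decimal digit
            n /= 10;               // drop that digit
        }
        return sum;
    }

    static boolean divisibleBy9ViaDigits(long n) {
        return digitSum(n) % 9 == 0;
    }

    public static void main(String[] args) {
        System.out.println(digitSum(486));              // 18
        System.out.println(divisibleBy9ViaDigits(486)); // true
        System.out.println(divisibleBy9ViaDigits(487)); // false
    }
}
```

Since 10 mod 9 = 1, the same argument gives the stronger statement that n and its digit sum always leave the same remainder modulo 9, not merely that they are divisible by 9 together.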
Recap: Caesar Cipher

Julius Caesar used a cipher in which each letter of the (ancient) Latin alphabet was replaced by the letter three places on:

Plain:  A B C D E F G H I K L M N O P Q R S T V X Y Z
Cipher: D E F G H I K L M N O P Q R S T V X Y Z A B C

So, for example VENI VIDI VICI would be enciphered as ZHQM ZMGM ZMFM.

Caesar Cipher: Modular Arithmetic Interpretation

If we represent each letter by a number: A → 0, B → 1, ..., Z → 22, then we can describe the Caesar cipher by saying that it transforms each plaintext letter p into a ciphertext letter

c = p + k mod 23

where k = 3. k is called the key. Different ciphers can be obtained by choosing different values of k. Decryption can be performed using the equation:

p = c − k mod 23

A General Substitution Cipher

Instead of moving on a fixed number of places in the alphabet, how about shuffling up the letters of the (modern English) alphabet in an arbitrary way? For example:

Plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher: Y B L S H X I J N W D C O P R U G K Z F T V A M E Q

So MEET ME IN ST LOUIS gets encrypted as OHHF OH NP ZF CRTNZ

Letter Frequencies in English

Figure 2.5: Relative Frequency of Letters in English Text (from Stallings [2006]). Relative frequency (%) by letter:

A  8.167   B  1.492   C  2.782   D  4.253   E 12.702   F  2.228   G  2.015
H  6.094   I  6.996   J  0.153   K  0.772   L  4.025   M  2.406   N  6.749
O  7.507   P  1.929   Q  0.095   R  5.987   S  6.327   T  9.056   U  2.758
V  0.978   W  2.360   X  0.150   Y  1.974   Z  0.074

Substitution Ciphers: Vulnerable to letter frequency analysis

The following features of English make substitution ciphers vulnerable to cryptanalysis:

- Widely different letter frequencies. E, T, and A are particularly common; J, X, Q and Z are rare.
- A single-letter word is very likely to be A or I.
- Some letters are rarely or never doubled: A, H, I, J, Q, U, V, W, X, Y.

Other natural languages (and indeed computer languages) exhibit similar characteristic features. In Italian, for example, words (especially long words) usually end in a vowel.
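To see why a fixed letter-to-letter mapping leaks frequency information, here is a small Python sketch. It uses the cipher alphabet from the general substitution example above; the function names substitute and letter_frequencies are illustrative only, and the message is far too short for a real frequency attack.

```python
import string
from collections import Counter

# Cipher alphabet from the general substitution example: A->Y, B->B, C->L, ...
CIPHER_ALPHABET = "YBLSHXIJNWDCOPRUGKZFTVAMEQ"
ENCRYPT_TABLE = str.maketrans(string.ascii_uppercase, CIPHER_ALPHABET)


def substitute(plaintext):
    """Encrypt with the fixed substitution; spaces and punctuation pass through."""
    return plaintext.upper().translate(ENCRYPT_TABLE)


def letter_frequencies(text):
    """Return {letter: percentage} for the letters A-Z that occur in text."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    counts = Counter(letters)
    return {ch: 100 * n / len(letters) for ch, n in counts.most_common()}


plaintext = "MEET ME IN ST LOUIS"
ciphertext = substitute(plaintext)
print(ciphertext)  # OHHF OH NP ZF CRTNZ

# The frequency profile survives encryption; only the labels change.
# E is the most common plaintext letter, and its image H is the most common ciphertext letter.
print(letter_frequencies(plaintext))
print(letter_frequencies(ciphertext))
```

With a long ciphertext, a cryptanalyst would compare such a profile against a table of English letter frequencies to guess the mapping one letter at a time.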
Another Look at the Caesar Cipher

Let’s think about applying a Caesar-type cipher to modern English (with a 26-letter alphabet); we’ll ignore spaces and punctuation. As before, we assign numbers to letters: A → 0, B → 1, ..., Z → 25. Let L(m) denote the letter corresponding to number m, so for example L(0) = A, L(4) = E, L(25) = Z.

Represent our plaintext message M as

M = L(m_1) ∥ L(m_2) ∥ ··· ∥ L(m_j)

where m_1, m_2, ..., m_j are numbers corresponding to the successive letters of the message, and ∥ denotes concatenation. Then a Caesar-type cipher forms the corresponding ciphertext C as follows:

C = L(m_1 + k mod 26) ∥ L(m_2 + k mod 26) ∥ ··· ∥ L(m_j + k mod 26)

The Problem with Substitution Ciphers ... and an approach to solving it

The problem with substitution ciphers is that there is a fixed mapping from plaintext letters to ciphertext letters. Consequently statistical and other properties of the source language (e.g. English) get preserved in the ciphertext, and can be used by cryptanalysts to determine what this mapping is.
But suppose we change the Caesar cipher by using different keys for successive letters of the message:

C = L(m_1 + k_1 mod 26) ∥ L(m_2 + k_2 mod 26) ∥ ··· ∥ L(m_j + k_j mod 26)

where each key k_1, k_2, ..., k_j is in the range from 0 to 25. This approach is potentially much more secure.

Vigenère Ciphers

Since each of the keys k_1, k_2, ..., k_j is in the range 0 to 25, it corresponds to a letter under our mapping. An idea that emerged in the 16th century (commonly, but not entirely accurately, attributed to Blaise de Vigenère) was to use a word or phrase to represent the sequence of key values. For example, the word ‘UNCOPYRIGHTABLE’ represents the key sequence 20, 13, 2, 14, 15, 24, 17, 8, 6, 7, 19, 0, 1, 11, 4. If necessary the word or phrase would be repeated up to the length of the plaintext to be enciphered.
Here’s an example using this keyword:

Plaintext:   M  E  E  T  M  E  I  N  S  T  L  O  U  I  S
Key:        20 13  2 14 15 24 17  8  6  7 19  0  1 11  4
Ciphertext:  G  R  G  H  B  C  Z  V  Y  A  E  O  V  T  W

Notice how the double letter EE in the plaintext does not map into a double letter in the ciphertext: in fact the three occurrences of the letter E in the plaintext each map into a different letter in the ciphertext.

Cryptanalysis of Vigenère Ciphers, 1

Although Vigenère ciphers are more resistant to cryptanalysis than ordinary substitution ciphers, Charles Babbage realised that they are by no means impossible to break, especially if the keyword/keyphrase is short.
Consider the following example, which is based on the keyword ‘KING’:

Plaintext:   T  H  E  S  U  N  A  N  D  T  H  E
Key:        10  8 13  6 10  8 13  6 10  8 13  6
Ciphertext:  D  P  R  Y  E  V  N  T  N  B  U  K

Plaintext:   M  A  N  I  N  T  H  E  M  O  O  N
Key:        10  8 13  6 10  8 13  6 10  8 13  6
Ciphertext:  W  I  A  O  X  B  U  K  W  W  B  T

(Example from Simon Singh ‘The Cracking Codebook’ p. 64)

Cryptanalysis of Vigenère Ciphers

Consider the following ciphertext:

WUBEFIQLZURMVOFEHMYMWTIXCGTMPIFKRZUPMVOI
RQMMWOZMPULMBNYVQQQMVMVJLEYMHFEFNZPSDLPP
SDLPEVQMWCXYMDAVQEEFIQCAYTQOWCXYMWMSEMEF
CFWYEYQETRLIQYCGMTWCWFBSMYFPLRXTQYEEXMRU
LUKSGWFPTLRQAERLUVPMVYQYCXTWFQLMTELSFJPQ
EHMOZCIWCIWFPZSLMAEZIQVLQMZVPPXAWCSMZMOR
VGVVQSZETRLQZPBJAZVQIYXEWWOICCGDWHQMMVOW
SGNTJPFPPAYBIYBJUTWRLQKLLLMDPYVACDCFQNZP
IFPPKSDVPTIDGXMQQVEBMQALKEZMGCVKUZKIZBZL
IUAMMVZ

Counting the first letter as offset 0, several sequences repeat:

- EFIQ occurs at offsets 3 and 98. Spacing: 95.
- PSDLP occurs at offsets 74 and 79. Spacing: 5.
- WCXYM occurs at offsets 88 and 108. Spacing: 20.
- ETRL occurs at offsets 127 and 247. Spacing: 120.

All four spacings are multiples of 5. Suppose the keyword is 5 letters long. Then the letters at offsets 0, 5, 10, ... are encoded with the first letter of the keyword, those at offsets 1, 6, 11, ... with the second letter, and so on, up to those at offsets 4, 9, 14, ... which are encoded with the last letter.

(Example from Simon Singh ‘The Cracking Codebook’ p. 65)

Cryptanalysis of Vigenère Ciphers, 2

If a particular sequence of letters occurs repeatedly in the ciphertext, this may well correspond to a repeated sequence of plaintext letters (‘THE’ perhaps!) encoded with the same part of the keyword. By looking at the spacing of these repetitions, you can make a good guess at the length of the keyword. Suppose you have determined that the keyword is, say, four letters long.
Then you can pick the ciphertext apart into those letters that have been encoded with the first letter of the keyword, those that have been encoded with the second, and so on. Letter frequency analysis will then enable you to decipher the message, and determine the keyword.

Even if the keyphrase is very long, the fact that it is itself an English (or whatever) phrase opens the door to statistical cryptanalysis.

Transposition Cipher

The idea of a transposition cipher is to create the ciphertext by rearranging the letters in the plaintext.
At its simplest, you could for example form the ciphertext by putting all the odd-numbered letters of the plaintext first (in order), followed by all the even-numbered letters. So for example MEETMEINSTLOUIS would encrypt as:

MEMISLUSETENTOI

Transposition Cipher as used by Special Operations Executive (SOE)

Suppose we want to encrypt the message ‘THIS IS NOT THE END. IT IS NOT EVEN THE BEGINNING OF THE END. BUT IT IS, PERHAPS, THE END OF THE BEGINNING.’.

First we choose a keyword or key phrase, e.g. ‘GREENPEACE’, and use this to define a permutation by looking for the order in which its letters occur in the alphabet:

G  R  E  E  N  P  E  A  C  E
7 10  3  4  8  9  5  1  2  6

SOE Transposition Cipher, 2

Now write the plaintext rowwise under the key:

G  R  E  E  N  P  E  A  C  E
7 10  3  4  8  9  5  1  2  6
T  H  I  S  I  S  N  O  T  T
H  E  E  N  D  I  T  I  S  N
O  T  E  V  E  N  T  H  E  B
E  G  I  N  N  I  N  G  O  F
T  H  E  E  N  D  B  U  T  I
T  I  S  P  E  R  H  A  P  S
T  H  E  E  N  D  O  F  T  H
E  B  E  G  I  N  N  I  N  G

Finally, form the ciphertext by reading out the plaintext columnwise, taking columns in the order given by the key:

OIHGU AFITS EOTPT NIEEI ESEES NVNEP EGNTT NBHON
TNBFI SHGTH OETTT EIDEN NENIS INIDR DNHET GHIHB

(This I think is how it worked, based mainly on Leo Marks’s book Between Silk and Cyanide.)
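The single transposition step described above can be sketched in a few lines of Python. This follows the slide's description only (number the keyword letters alphabetically, ties broken left to right, then read the columns in that order); it is not a claim about how SOE procedures worked in every detail, and the names key_order and transpose_encrypt are invented for this sketch.

```python
def key_order(keyword):
    """Column indices in read-out order: alphabetical by key letter, ties left to right."""
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))


def transpose_encrypt(plaintext, keyword):
    """Write the letters rowwise under the keyword, then read the columns in key order."""
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    width = len(keyword)
    # letters[i::width] is column i of the grid, read from top to bottom.
    columns = ["".join(letters[i::width]) for i in range(width)]
    return "".join(columns[i] for i in key_order(keyword))


message = ("THIS IS NOT THE END. IT IS NOT EVEN THE BEGINNING OF THE END. "
           "BUT IT IS, PERHAPS, THE END OF THE BEGINNING.")
ciphertext = transpose_encrypt(message, "GREENPEACE")

# Print in groups of five letters, as on the slide.
print(" ".join(ciphertext[i:i + 5] for i in range(0, len(ciphertext), 5)))
```

For GREENPEACE, key_order returns the columns under A, C, the four Es from left to right, then G, N, P and R, which matches the numbering 1 to 10 in the table.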
SOE Transposition Cipher, 3

The transposition would be carried out twice, using different keywords for each transposition.

Usually the keywords would be drawn from a code poem memorised by the agent.

The start of the message would consist of indicator groups to show which words from the poem had been used for the encipherment.
It was common for agents in the field to misnumber their keywords, leading to so-called ‘indecipherables’.

Violette Szabo’s Code Poem ... or was it?

The life that I have
Is all that I have
And the life that I have
Is yours

The love that I have
Of the life that I have
Is yours and yours and yours

A sleep I shall have
A rest I shall have
Yet death will be but a pause

For the peace of my years
In the long green grass
Will be yours and yours and yours

Another SOE Code Poem

Tickle my wallypad
Tongue my zonker
And make an oaktree
Out of a conker.

(Leo Marks ‘Between Silk and Cyanide’ p. 90)

Improving Vigenère Ciphers

We can make Vigenère ciphers much more secure if:

- we use a key as long as the message;
- choose the letters of the key completely randomly, as if drawn from an urn like a lottery, except that each letter is put back in the urn after being drawn;
- use the key only once.

Such a cipher is called a one-time pad: it is unbreakable provided the key is secure.
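As a rough illustration of the step from a Vigenère cipher to a one-time pad, the Python sketch below uses the same letter-by-letter addition modulo 26 for both; only the choice of key differs. The function names are invented here, the input is assumed to be upper-case A to Z with no spaces, and using Python's secrets module for the random key is an assumption of this sketch, not something the slides prescribe.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # A -> 0, B -> 1, ..., Z -> 25


def shift_encrypt(plaintext, key):
    """Add key letters to plaintext letters modulo 26; a short key is repeated."""
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(key[i % len(key)])) % 26]
        for i, p in enumerate(plaintext)
    )


def shift_decrypt(ciphertext, key):
    """Subtract key letters from ciphertext letters modulo 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key[i % len(key)])) % 26]
        for i, c in enumerate(ciphertext)
    )


def random_pad(length):
    """A key of the given length, each letter drawn independently and uniformly."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Vigenère: a keyword, repeated if the message is longer.
print(shift_encrypt("MEETMEINSTLOUIS", "UNCOPYRIGHTABLE"))  # GRGHBCZVYAEOVTW

# One-time pad: a fresh random key as long as the message, used once and then discarded.
message = "MEETMEINSTLOUIS"
pad = random_pad(len(message))
assert shift_decrypt(shift_encrypt(message, pad), pad) == message
```

The arithmetic is identical in both cases; the security difference comes entirely from how the key is chosen and how often it is used.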
One-time Pad Example
The story so far

Alice has represented a secret message in an alphabet of 27 letters, namely A...Z plus space. After encipherment using a one-time pad, the message came out as:

|ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS|

which Alice then transmitted in Morse code to Bob. Their enemies Darth and Mallory manage to intercept the message during transmission, and are trying to decrypt it.

(from Stallings [2006] p. 48)

One-time Pad Example
Now read on ...

After a while, Darth suddenly shouts ‘Eureka!’.
He’s figured out that Alice must have used the key

|PXLMVMSYDOFUYRVZWC TNLEBNECVGDUPAHFZZLMNYIH|

and that the message decrypts as

|MR MUSTARD WITH THE CANDLESTICK IN THE HALL|

At the same time, Mallory also shouts ‘Eureka’. He’s figured out that Alice must have used the key

|MFUGPMIYDGAXGOUFHKLLLMHSQDQOGTEWBQFGYOVUHWT|

and that the message decrypts as

|MISS SCARLET WITH THE KNIFE IN THE LIBRARY |

Which of them is right?
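For the one-time pad example with Darth and Mallory, a short Python sketch shows why neither can be sure: for any candidate plaintext of the right length there is a key that produces the intercepted ciphertext. The numbering A = 0, ..., Z = 25, space = 26 with addition modulo 27 is an assumption of this sketch (it does reproduce Darth's key exactly), and the function names are illustrative.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # 27 symbols; space = 26 is an assumed numbering


def subtract(text, other):
    """Symbol-wise (text - other) modulo 27; both strings must have the same length."""
    if len(text) != len(other):
        raise ValueError("strings must have the same length")
    return "".join(
        ALPHABET[(ALPHABET.index(t) - ALPHABET.index(o)) % 27]
        for t, o in zip(text, other)
    )


def decrypt(ciphertext, key):
    """plaintext = ciphertext - key (mod 27)."""
    return subtract(ciphertext, key)


def key_for(ciphertext, plaintext):
    """The unique key that would turn this plaintext into this ciphertext."""
    return subtract(ciphertext, plaintext)


CIPHERTEXT = "ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS"
darth = "MR MUSTARD WITH THE CANDLESTICK IN THE HALL"
mallory = "MISS SCARLET WITH THE KNIFE IN THE LIBRARY "

for guess in (darth, mallory):
    key = key_for(CIPHERTEXT, guess)
    assert decrypt(CIPHERTEXT, key) == guess  # every guess has a consistent key
    print(key)
```

Since every 43-symbol message is consistent with some key, and all keys are equally likely, the ciphertext alone gives no way to prefer one guess over the other. Under the assumed numbering, the key derived for Mallory's plaintext begins PFTGPM, which differs from the key quoted above in its first and third letters; this may be a misprint carried over from the quoted source.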
Traffic Analysis

Beware that even if eavesdroppers cannot crack your cipher, they may still gain useful intelligence from:

- who sent a message to whom,
- when they sent the message,
- and how long the message was.

This is called traffic analysis.

Recap

1 Introduction
Recap: Modular Arithmetic
MathsBite: Modular Multiplication
Recap: Substitution Ciphers
Vigenère Ciphers
Transposition Ciphers
One-time Pads
Traffic Analysis

CO634 Cryptography
Key Management for Symmetric Ciphers
lectures by Carlos A. Perez-Delgado
School of Computing, University of Kent
© 2019

Outline

1 Key Management for Symmetric Ciphers
MathsBite: Multiplicative Inverse Modulo n
How many keys are needed?
Decentralised key control
Using a key distribution centre

MathsBite: Multiplicative Inverses in Ordinary Maths

In ordinary maths, for example:

21 × 5 = 105
21 × 5 × (1/5) = 21

1/5 is called the multiplicative inverse of 5 because multiplying by 1/5 undoes the effect of multiplying by 5.

Clearly, for b to be the multiplicative inverse of a, we must have a × b = 1, so that multiplying by a and then by b is the same as multiplying by 1.

In the maths of the real numbers, the multiplicative inverse of a number a is a⁻¹ = 1/a, and every number except 0 has a multiplicative inverse.

MathsBite: Multiplicative Inverses Modulo 9

Let’s look again at arithmetic modulo 9, in which (remember?) the only numbers are 0, 1, ..., 8. Here are the times tables we looked at in a previous lecture:

× | 0 1 2 3 4 5 6 7 8
--+------------------
0 | 0 0 0 0 0 0 0 0 0
1 | 0 1 2 3 4 5 6 7 8
2 | 0 2 4 6 8 1 3 5 7
3 | 0 3 6 0 3 6 0 3 6
4 | 0 4 8 3 7 2 6 1 5
5 | 0 5 1 6 2 7 3 8 4
6 | 0 6 3 0 6 3 0 6 3
7 | 0 7 5 3 1 8 6 4 2
8 | 0 8 7 6 5 4 3 2 1

Notice, for example, that 2 × 5 = 1, so 5 is the multiplicative inverse of 2 and vice versa. But 3 doesn’t have a multiplicative inverse, and neither does 6.
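Reading inverses off the times table amounts to a brute-force search for a product equal to 1. A minimal Python sketch of that search follows; the function name inverses is made up for illustration, and the search is only sensible for small n.

```python
from math import gcd


def inverses(n):
    """Map each a in 1..n-1 to its multiplicative inverse modulo n, where one exists."""
    return {
        a: b
        for a in range(1, n)
        for b in range(1, n)
        if (a * b) % n == 1
    }


print(inverses(9))  # {1: 1, 2: 5, 4: 7, 5: 2, 7: 4, 8: 8}
print(inverses(7))  # {1: 1, 2: 4, 3: 5, 4: 2, 5: 3, 6: 6}

# 3 and 6 are missing modulo 9. The numbers that do have an inverse
# are exactly those sharing no common factor with 9.
assert set(inverses(9)) == {a for a in range(1, 9) if gcd(a, 9) == 1}
```

Each entry can be checked against the table: for instance 4 × 7 = 28, which leaves remainder 1 modulo 9.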
MathsBite: Theorem 4
If $0 < a < n$ and $a$ is relatively prime to $n$, then $a$ has a multiplicative inverse modulo $n$.
We omit the proof.
The multiplicative inverse can be calculated efficiently using a variation of the Euclidean algorithm, called the extended Euclidean algorithm.

MathsBite: Corollary to Theorem 4
Corollary: If $p$ is a prime number and $0 < a < p$, then $a$ has a multiplicative inverse modulo $p$.
This is illustrated in the times tables modulo 7:

 × | 0 1 2 3 4 5 6
---+--------------
 0 | 0 0 0 0 0 0 0
 1 | 0 1 2 3 4 5 6
 2 | 0 2 4 6 1 3 5
 3 | 0 3 6 2 5 1 4
 4 | 0 4 1 5 2 6 3
 5 | 0 5 3 1 6 4 2
 6 | 0 6 5 4 3 2 1

The Extended Euclidean Algorithm
Suppose we want to calculate the multiplicative inverse of 543 modulo 997.
First think of the steps we would go through to work out gcd(543, 997) using the Euclidean algorithm. In each row $R_i$ below, from $R_3$ onwards, we work out the remainder when the number in row $R_{i-2}$ is divided by the number in row $R_{i-1}$:

Row   Calculation     Number
R1                    997
R2                    543
R3    R1 − R2         454
R4    R2 − R3         89
R5    R3 − 5 × R4     9
R6    R4 − 9 × R5     8
R7    R5 − R6         1

From this we conclude that gcd(543, 997) = 1, so 543 will certainly have an inverse modulo 997.

The Extended Euclidean Algorithm, 2
Now work through the calculation again, but as we go along express each entry in the Number column in the form $997 \times m + 543 \times n$:

Row   Calculation     Number = 997 × m + 543 × n
R1                    997 = 997 × 1 + 543 × 0
R2                    543 = 997 × 0 + 543 × 1
R3    R1 − R2         454 = 997 × 1 + 543 × (−1)
R4    R2 − R3         89 = 997 × (−1) + 543 × 2
R5    R3 − 5 × R4     9 = 997 × 6 + 543 × (−11)
R6    R4 − 9 × R5     8 = 997 × (−55) + 543 × 101
R7    R5 − R6         1 = 997 × 61 + 543 × (−112)

In other words:
$543 \times (-112) = 997 \times (-61) + 1$
At this stage we have determined that $543 \times (-112) \equiv 1 \pmod{997}$, so we’re well on our way. It remains to find a multiplicative inverse in the range 0 to 996. Take $997 - 112 = 885$.
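The row-by-row calculation for 543 modulo 997 can be sketched in Python as follows. Each tuple plays the role of one row R1, R2, ... and holds the number together with its coefficients m and n. The function names `extended_gcd` and `mod_inverse` are illustrative choices, not names used in the lecture.

```python
def extended_gcd(a, b):
    """Return (g, m, n) such that g = gcd(a, b) = a*m + b*n."""
    # Each row is (number, m, n) with number == a*m + b*n.
    prev, curr = (a, 1, 0), (b, 0, 1)
    while curr[0] != 0:
        q = prev[0] // curr[0]                      # how many times curr fits into prev
        nxt = tuple(x - q * y for x, y in zip(prev, curr))
        prev, curr = curr, nxt
    return prev


def mod_inverse(a, n):
    """Multiplicative inverse of a modulo n, in the range 0..n-1."""
    g, _, coeff = extended_gcd(n, a)
    if g != 1:
        raise ValueError("a is not relatively prime to n, so no inverse exists")
    return coeff % n                                # shift a negative coefficient into range


print(extended_gcd(997, 543))   # (1, 61, -112), matching row R7
print(mod_inverse(543, 997))    # 885
```

Reducing the coefficient −112 modulo 997 is the same step as adding 997 to it by hand.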
Key Distribution
Suppose that Alice and Bob wish to communicate secretly using a symmetric cipher. Then there must be an encryption key $K$ that is known only to Alice and Bob.
Either Alice can create the key and send it to Bob, or vice versa, but in either case it needs to be sent by some secure means: if an eavesdropper Darth is able to hear the key en route, then Darth will be able to decrypt all communications using the key.

Communication between Many Parties
Now suppose that there is a group of $n$ traders dealing in a particular commodity (electronic components perhaps).
Each trader wants to be able to communicate with any other trader securely, without any of the remaining $n - 2$ traders listening in.
Each of the $n$ traders will need a separate key for each of the $n - 1$ other traders, but each key works both ways, so we need $n(n-1)/2$ keys in all.

Session Keys
The more a cryptographic key is used, the more ciphertext based on that key an eavesdropper will be able to collect, thus aiding cryptanalysis.
Even if the cipher is computationally secure, it may happen that the key is inadvertently disclosed to an adversary by some other means: we want to try to limit the amount of traffic that the adversary will then be able to decrypt.
For these reasons, it is considered good practice to use a session key, i.e. a cryptographic key that is used only for the duration of a particular communication session. (A communication session may not necessarily correspond to a TCP session.)
But how do we distribute session keys?
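To get a feel for how quickly the pairwise key count n(n−1)/2 grows with the number of traders, here is a small Python sketch. The trader names and the helper `pairwise_key_count` are made up for illustration.

```python
from itertools import combinations


def pairwise_key_count(n):
    """Number of distinct keys when every pair among n parties shares one key."""
    return n * (n - 1) // 2


traders = ["T1", "T2", "T3", "T4", "T5"]
pairs = list(combinations(traders, 2))      # every unordered pair needs its own key
print(len(pairs), pairwise_key_count(5))    # 10 10
print(pairwise_key_count(1000))             # 499500
```

With 1000 traders, nearly half a million keys would each have to be delivered by some secure means.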
Decentralised Key Control
Master keys
In this approach, each pair of traders shares a secret master key that they use to set up session keys, and for no other purpose.
So that means $n(n-1)/2$ master keys in all.
Because the master keys are used only to set up session keys, there will be little ciphertext based on the master key for a cryptanalyst to work on.

Nonces
In UK prison slang, ‘nonce’ means a child molester: that isn’t what we mean here! (This usage is strictly British English, and not used in America.)
Instead think of the phrase “for the nonce”, which means “for the time being” or, more relevantly, “for this one occasion”.
In cryptography, a nonce is a number or bit-string generated to identify a particular transaction uniquely: this helps to defeat replay attacks.
It is desirable (but not in every case essential) that nonces are generated in a way that is hard for an adversary to predict.

Decentralised Key Control
Establishing a session key
Suppose that A wants to establish a session key to communicate with B. They share a master key $K_{AB}$. Here’s the procedure:
1. A creates a nonce $N_A$ and sends it to B.
2. B creates a session key $K_s$, forms a message comprising $N_A$ and $K_s$, encrypts this message using $K_{AB}$ and sends it to A.
3. A decrypts the message to obtain the session key, and checks that the nonce is identical to the one he sent.
A now knows that the session key must have come from B (since only A and B know $K_{AB}$), and that the message is not a replay.
(cf. Menezes et al. [1997] Sec. 12.17)

Key Distribution Centre
If there are a large number of traders, requiring $n(n-1)/2$ master keys may be impractical.
A different approach is to have session keys provided by a central key distribution centre (KDC).
The idea is that each of our $n$ traders has a secret master key which the trader uses to communicate with the KDC, and for no other purpose.
When trader A wants to start a communication with trader B, he applies to the KDC for a session key.
It’s quite difficult to devise a protocol for doing this which is secure against active attacks: we’ll see one possibility next.

Otway-Rees Protocol
Suppose that A wants to get a session key to communicate with B. One way of doing this is called the Otway-Rees protocol, and requires four messages to set it up:
1. A message from A to B;
2. A message from B to the KDC;
3. A reply from the KDC to B;
4. A reply from B to A.
(cf. Menezes et al. [1997] Sec. 12.29)

Otway-Rees Protocol
A’s message to B
A’s message to B has the form:
$N \| A \| B \| E(N_A \| N \| A \| B, K_A)$
where:
$N$ is a nonce identifying the transaction. (It isn’t necessary that this nonce be unpredictable.)
$A$ is A’s identity.
$B$ is B’s identity.
$N_A$ is another nonce, known only to A (and later to the KDC).
$E(N_A \| N \| A \| B, K_A)$ is the result of encrypting $N_A \| N \| A \| B$ with A’s master key.

Otway-Rees Protocol
B’s message to the KDC
B’s message to the KDC has the form:
$N \| A \| B \| E(N_A \| N \| A \| B, K_A) \| E(N_B \| N \| A \| B, K_B)$
where the first part is identical to the message B received from A, and $N_B$ is yet another nonce, this one known only to B (and later to the KDC).
$E(N_B \| N \| A \| B, K_B)$ is the result of encrypting $N_B \| N \| A \| B$ with B’s master key.

Otway-Rees Protocol
KDC checks the message
So the KDC receives a message of the form:
$N \| A \| B \| E(N_A \| N \| A \| B, K_A) \| E(N_B \| N \| A \| B, K_B)$
It now processes the message as follows:
1. The identity of A is in the message in cleartext; the KDC uses this to extract A’s master key $K_A$ from its database.
2. The KDC now uses $K_A$ to decrypt the part of the message encrypted with $K_A$.
It checks that the encrypted values of $N$, $A$ and $B$ agree with the values in cleartext, and extracts the nonce $N_A$.
3. The identity of B is in the message in cleartext; the KDC uses this to extract B’s master key $K_B$ from its database.
4. The KDC now uses $K_B$ to decrypt the part of the message encrypted with $K_B$. It checks that the encrypted values of $N$, $A$ and $B$ agree with the values in cleartext, and extracts the nonce $N_B$.

Otway-Rees Protocol
The KDC’s reply to B
The KDC now generates a random session key $K_s$, composes the message
$E(N_A \| K_s, K_A) \| E(N_B \| K_s, K_B)$
and sends it to B. Here:
$E(N_A \| K_s, K_A)$ contains A’s nonce $N_A$ and the session key $K_s$, both encrypted with A’s master key $K_A$.
$E(N_B \| K_s, K_B)$ contains B’s nonce $N_B$ and the session key $K_s$, both encrypted with B’s master key $K_B$.

Otway-Rees Protocol
B’s checks, and reply to A
So B receives a message of the form:
$E(N_A \| K_s, K_A) \| E(N_B \| K_s, K_B)$
B now decrypts the second part of the message, checks that the decrypted nonce agrees with the value of $N_B$ he sent to the KDC, and remembers the session key $K_s$.
B then sends the first part of the message on to A.

Otway-Rees Protocol
A’s checks
So finally A receives a message of the form:
$E(N_A \| K_s, K_A)$
A now decrypts this message, checks that the decrypted nonce agrees with the value of $N_A$ that he sent (encrypted) to B, and remembers the session key $K_s$.
A and B can now proceed to communicate using the session key $K_s$.

Using a KDC: Pros and Cons
+ Only the $n$ master keys need to be distributed by secure means (i.e. without using the insecure network, maybe by post or courier);
- If an adversary manages to infiltrate the KDC, all communications are compromised.
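Looking back at the four Otway-Rees messages, the flow of nonces and keys can be traced with a small Python sketch. It is only a symbolic model under loud assumptions: `Sealed`, `E` and `D` are hypothetical stand-ins that record which key a message was "encrypted" under and provide no real secrecy, all three parties are simulated inside one function, and the nonce and key lengths are arbitrary.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Sealed:
    """Symbolic stand-in for E(fields, key). This is NOT real encryption."""
    key: bytes
    fields: tuple


def E(fields, key):
    """Model encryption: seal the fields under the given key."""
    return Sealed(key, tuple(fields))


def D(box, key):
    """Model decryption: only the matching key opens the box."""
    if box.key != key:
        raise ValueError("wrong key")
    return box.fields


def otway_rees(master_keys, a="A", b="B"):
    """Run the four messages once; return the session key as seen by A and by B."""
    k_a, k_b = master_keys[a], master_keys[b]   # each shared only with the KDC

    # Message 1 (A to B): N || A || B || E(N_A || N || A || B, K_A)
    n, n_a = secrets.token_hex(4), secrets.token_hex(8)
    msg1 = (n, a, b, E((n_a, n, a, b), k_a))

    # Message 2 (B to KDC): message 1 followed by E(N_B || N || A || B, K_B)
    n_b = secrets.token_hex(8)
    msg2 = msg1 + (E((n_b, n, a, b), k_b),)

    # KDC: look up keys by cleartext identity, decrypt, compare with cleartext
    n_clear, a_clear, b_clear, part_a, part_b = msg2
    kdc_n_a, *inner_a = D(part_a, master_keys[a_clear])
    kdc_n_b, *inner_b = D(part_b, master_keys[b_clear])
    if inner_a != [n_clear, a_clear, b_clear] or inner_b != inner_a:
        raise ValueError("encrypted fields disagree with cleartext")

    # Message 3 (KDC to B): E(N_A || K_s, K_A) || E(N_B || K_s, K_B)
    k_s = secrets.token_bytes(16)
    msg3 = (E((kdc_n_a, k_s), master_keys[a_clear]),
            E((kdc_n_b, k_s), master_keys[b_clear]))

    # B checks its nonce, keeps K_s, and forwards the first part to A (message 4)
    nonce_b, k_s_at_b = D(msg3[1], k_b)
    if nonce_b != n_b:
        raise ValueError("B: nonce mismatch")
    msg4 = msg3[0]

    # A checks its nonce and keeps K_s
    nonce_a, k_s_at_a = D(msg4, k_a)
    if nonce_a != n_a:
        raise ValueError("A: nonce mismatch")
    return k_s_at_a, k_s_at_b


k_at_a, k_at_b = otway_rees({"A": b"master-key-A", "B": b"master-key-B"})
print(k_at_a == k_at_b)   # True
```

A real implementation would replace `E` and `D` with an authenticated symmetric cipher and would run A, B and the KDC as separate parties exchanging the messages over a network.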
“what good would it do after all to develop impenetrable cryptosystems, if their users were forced to share their keys with a KDC that could be compromised either by burglary or subpoena?” (Diffie [1988])

CO634 Cryptography
The RSA Cipher
lectures by Carlos A. Perez-Delgado
School of Computing, University of Kent
© 2019

Outline
1 The RSA Cipher
  Implementing an asymmetric cipher
  A half-baked approach
  The fundamental idea of RSA
  Practical details of RSA

Timeline
c. 1900 BC: First known use of something resembling encryption, in ancient Egypt.
c. 1500 BC: First recorded definite use of encryption, in Mesopotamia.
mid 1960s: Public-key cryptography said to have been discovered at NSA (Simmons, 1993).
1970: First documented introduction of public-key cryptography, in a classified report at UK Government Communications Headquarters (GCHQ).
1973: Clifford Cocks at GCHQ formulates a cipher essentially similar to RSA.
1976: Diffie and Hellman publish the concept of public key cryptography.
1977: Rivest, Shamir and Adleman invent the algorithm that bears their name.

General Approach
Treat plaintext blocks as numbers
We treat a block of plaintext as a number. For example, we looked in an earlier lecture at the ASCII representation of the string “In summer” as the following bitstring:

I        n        (space)  s        u        m        m        e        r
01001001 01101110 00100000 01110011 01110101 01101101 01101101 01100101 01110010

But this bitstring can equally be interpreted as an integer whose decimal value is 1354547786872408335730.
This is what we do in RSA.

General Approach
Encryption and decryption
Encryption:
$C = P^e \bmod n$
so the encryption key consists of the pair $(e, n)$.
Obviously $n$ must be larger than the largest possible value of a plaintext block, i.e. $n \ge 2^b$, if $b$ is the number of bits in a block.
Decryption:
$P = C^d \bmod n$
so the decryption key consists of the pair $(d, n)$.

General Approach
Requirements
Decryption must undo encryption, i.e.
we must have
$P = (P^e)^d \bmod n = P^{e \times d} \bmod n$
It must be computationally impractical to determine the plaintext or the decryption key, even given large amounts of ciphertext, either by brute-force or cryptanalytic attacks.
It must be computationally impractical to determine $d$ given $e$ and $n$.

Reminder: Theorem 3 (Fermat)
If $i$ is any integer and $p$ is a prime number, then:
$i^{p-1} \bmod p = \begin{cases} 0 & \text{if } i \bmod p = 0 \\ 1 & \text{otherwise} \end{cases}$

Reminder: Theorem 4
If $0 < a < n$ and $a$ is relatively prime to $n$, then $a$ has a multiplicative inverse modulo $n$.

A Half-Baked Approach
Idea:
Choose $n$ to be equal to a (large) prime number $p$.
Choose $e$ to be some number ($0 < e < p - 1$) relatively prime to $p - 1$, so that Theorem 4 applies.
Choose $d$ to be the multiplicative inverse of $e$ modulo $p - 1$, i.e.:
$e \times d \bmod (p - 1) = 1$
This implies that for some integer $k$,
$e \times d = k(p - 1) + 1$
(after all, that’s what ‘modulo $p - 1$’ means!)

A Half-Baked Approach
Decryption undoes encryption
Now:
$P^{e \times d} \bmod p = P^{k(p-1)+1} \bmod p$
$= [P^{k(p-1)} \times P] \bmod p$
$= [(P^{p-1})^k \times P] \bmod p$
Assume for the moment that $P \neq 0$. Then by Fermat’s theorem, $P^{p-1} \bmod p$ will be equal to 1, so:
$P^{e \times d} \bmod p = [1^k \times P] \bmod p$
$= P \bmod p$
$= P$ since $0 < P < p$
In the special case where $P = 0$, it is also obvious that
$0 = 0^{e \times d} \bmod p$

A Half-Baked Approach
How are we doing?
Decryption must undo encryption, i.e. we must have $P = (P^e)^d \bmod n = P^{e \times d} \bmod n$.
Achieved!
It must be computationally impractical to determine the plaintext or the decryption key, even given large amounts of ciphertext, either by brute-force or cryptanalytic attacks.
Maybe.
It must be computationally impractical to determine $d$ given $e$ and $n$.
Oops! Given $e$ and $n$, it is easy to determine the multiplicative inverse of $e$ modulo $n - 1$, i.e. $d$, using the extended Euclidean algorithm.
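The weakness of the half-baked approach is easy to see in a few lines of Python. In this sketch the prime modulus 2^89 − 1 (a known Mersenne prime) and the exponent 65537 are illustrative choices rather than values from the lecture, and the plaintext is the “In summer” block treated as a number. The three-argument `pow` with exponent −1 computes a modular inverse and needs Python 3.8 or later.

```python
# Half-baked scheme: the modulus n is a single prime p (illustrative values).
p = 2**89 - 1                 # a known Mersenne prime
e = 65537                     # relatively prime to p - 1
d = pow(e, -1, p - 1)         # inverse of e modulo p - 1 (Python 3.8+)

P = int.from_bytes(b"In summer", "big")   # plaintext block as a number
C = pow(P, e, p)                          # encryption: C = P^e mod n
assert pow(C, d, p) == P                  # decryption undoes encryption

# The adversary sees only the public key (e, n) and the ciphertext C.
n = p
attacker_d = pow(e, -1, n - 1)            # the same calculation the key owner did
stolen = pow(C, attacker_d, n)
print(stolen.to_bytes(9, "big"))          # b'In summer'
```

Nothing secret was needed to compute `attacker_d`, which is why the third requirement fails.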
RSA
The fundamental idea
Choose $n$ to be the product of two large prime numbers, $p$ and $q$, i.e. $n = pq$.
Let $\phi = (p - 1) \times (q - 1)$.
Choose $e$ to be relatively prime to $\phi$, so that Theorem 4 applies.
Choose $d$ to be the multiplicative inverse of $e$ modulo $\phi$, i.e.:
$e \times d \bmod (p - 1)(q - 1) = 1$
Again, this implies that for some integer $k$,
$e \times d = k(p - 1)(q - 1) + 1$

RSA
Decryption undoes encryption, 1
First (cf. Stallings [2006] App. 9A) we’ll prove an intermediate result. Assume $P \bmod p \neq 0$.
$P^{e \times d} \bmod p = P^{k(p-1)(q-1)+1} \bmod p$
$= [P^{k(p-1)(q-1)} \times P] \bmod p$
$= [(P^{p-1})^{k(q-1)} \times P] \bmod p$
$= [1^{k(q-1)} \times P] \bmod p$ using Fermat’s theorem
$= P \bmod p$
The same result is obvious if $P \bmod p = 0$ (for example, if $P = 0$), since then
$P^{e \times d} \bmod p = 0 = P \bmod p$

RSA
Decryption undoes encryption, 2
Exactly the same equations apply with $p$ and $q$ interchanged. Assume $P \bmod q \neq 0$:
$P^{e \times d} \bmod q = P^{k(p-1)(q-1)+1} \bmod q$
$= [P^{k(p-1)(q-1)} \times P] \bmod q$
$= [(P^{q-1})^{k(p-1)} \times P] \bmod q$
$= [1^{k(p-1)} \times P] \bmod q$ using Fermat’s theorem
$= P \bmod q$
The same result is obvious if $P \bmod q = 0$ (for example, if $P = 0$), since then
$P^{e \times d} \bmod q = 0 = P \bmod q$

RSA
Decryption undoes encryption, 3
We’ve shown that, for any $P$,
$P^{e \times d} \bmod p = P \bmod p$
This implies that $P^{e \times d} - P$ must be a multiple of $p$.
Similarly, we’ve shown that, for any $P$,
$P^{e \times d} \bmod q = P \bmod q$
This in turn implies that $P^{e \times d} - P$ must be a multiple of $q$.
But $p$ and $q$ are distinct prime numbers. If $P^{e \times d} - P$ is a multiple of each of them, it must be a multiple of their product, i.e. of $n = pq$.
Consequently:
$P^{e \times d} \bmod n = P \bmod n = P$ (since $0 \le P < n$)
so decryption undoes encryption, as required.

RSA
How are we doing?
Decryption must undo encryption, i.e. we must have $P = (P^e)^d \bmod n = P^{e \times d} \bmod n$.
Achieved!
It must be computationally impractical to determine the plaintext or the decryption key, even given large amounts of ciphertext, either by brute-force or cryptanalytic attacks.
Believed to be true, provided $p$, $q$ and $e$ are suitably chosen.
It must be computationally impractical to determine $d$ given $e$ and $n$.
Given $e$ and $\phi$, it would still be easy for an adversary to determine the multiplicative inverse of $e$ modulo $\phi$. But $\phi$ isn’t part of the public key: only $n$ is. Can an adversary determine $\phi$ from $n$?

Factorisation
To determine $\phi = (p - 1)(q - 1)$ from $n$, an adversary would first need to determine the prime factors $p$ and $q$ of $n$. (Well, there are other ways, but they’re no easier: cf. Stallings [2006] p. 275.)
Factorising $n$ like this is considered to be a computationally hard problem.
RSA Laboratories have challenged researchers to factorise values of $n$ consisting of various numbers of decimal digits. The $20000 prize for factoring a 193-digit (576-bit) number was claimed on 2005/11/02.
As of 2007/11/14, there was a $30000 prize waiting for you if you could factorise this 212-digit (704-bit) number:
74037563479561712828046796097429573142593188889231
28908493623263897276503402826627689199641962511784
39958943305021275853701189680982867331732731089309
00552505116877063299072396380786710086096962537934
650563796359
Unfortunately, the prizes were withdrawn late in 2007.
Values of $n$ containing at least 1024 bits (about 300 digits) look like being proof against factorisation using current computing technology and the fastest known factorisation algorithms. However, mathematics and technology move on!

Setting up Key Pairs
Here’s what we need to do to create a public key $(e, n)$ and the corresponding private key $(d, n)$:
1. Find two prime numbers $p$ and $q$, each long enough (more later). Non-examinable details: In Stallings [2006], see p. 276 for some further constraints that $p$ and $q$ should satisfy.
2. Evaluate $n = pq$, and $\phi = (p - 1)(q - 1)$.
3. Choose a number $e$ that is relatively prime to $\phi$. Common choices of $e$ are 17 and 65537. Because these numbers have only two 1-bits, working out $P^e \bmod n$ (using the algorithm we looked at earlier in the lecture) is relatively quick. Also 17 and 65537 are both prime numbers.
4. Use the extended Euclidean algorithm to work out $d$ from $e$ and $\phi$.

Avoiding Some Attacks
Don’t share n
Different key pairs should use different values of $n$. This is because given $(e, n)$ and $(d, n)$ for one key pair, it is possible to use this information to factorise $n$, and hence crack any other cipher that uses the same value of $n$.
Padding
It is a good idea to pad out the plaintext block with some random bits before encrypting it. If the same message has to be sent to several recipients, the random bits should be different for each recipient. (This is sometimes called salting the plaintext.) This avoids attacks described in Stallings [2006] p. 273 and p. 279, and another described in Menezes et al. [1997] §8.2.2(iii).
Timing Attacks
Suppose that an adversary can observe the time it takes for a recipient to decrypt certain ciphertexts chosen by the adversary. It has been shown (cf. Stallings [2006] p. 277) that if decryption is implemented in the obvious way, then the adversary can use the observed times to help infer $d$.
The obvious way to avoid this is to adjust the algorithm so that it takes constant time; however, this can degrade its performance appreciably.

Recommended Key Sizes
Size of modulus $n = pq$ according to the ENISA “Algorithms, Key Sizes and Parameters” report (November 2014):
Legacy applications: 1024 bits
Future applications: 3072 bits
The report also notes that RSA, even with padding (RSA-OAEP), is not normally used for encrypting large amounts of text; signature schemes and key establishment, however, are built on RSA.
Key exponents $d$, $e$: always $d \ge \sqrt{n}$, and
Legacy applications: $e \ge 3$ or $e \ge 65537$
Future applications: $e \ge 65537$

Our Quantum Future
Factoring is considered hard for classical computers (i.e.
there is no known poly-time algorithm).
Factoring is known to be easy for quantum computers, i.e. there is a known poly-time quantum algorithm (Shor’s).
What is missing is quantum computers that can run such an algorithm.
Various companies are now engaged in building large-scale universal quantum computers.
The consensus among experts is that with high probability, quantum computers will be able to break RSA-2048 by 2025 to 2030, and RSA-4096 by 2035.
At this later point, further increasing the key-size would be pointless.
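Returning to the key set-up steps under “Setting up Key Pairs”, a toy-sized Python sketch may help fix the ideas. The primes 61 and 53 and the helper name `make_key_pair` are illustrative assumptions: real keys use primes hundreds of digits long, together with the padding discussed under “Avoiding Some Attacks”. Python’s built-in `pow(e, -1, phi)` (3.8 or later) stands in for the extended Euclidean algorithm.

```python
from math import gcd


def make_key_pair(p, q, e):
    """Return ((e, n), (d, n)) for primes p and q; a toy sketch, no padding."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p - 1)(q - 1)")
    d = pow(e, -1, phi)          # multiplicative inverse of e modulo phi
    return (e, n), (d, n)


public, private = make_key_pair(61, 53, 17)
print(public, private)           # (17, 3233) (2753, 3233)

P = 65                           # a plaintext block, 0 <= P < n
C = pow(P, public[0], public[1])       # C = P^e mod n
print(pow(C, private[0], private[1]))  # 65
```

With such a small modulus an adversary could factorise n = 3233 by hand, recover (p − 1)(q − 1) and hence d, which is why the key sizes recommended earlier matter.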