Chapter 21 Public-Key Cryptography and Message Authentication Hash Functions • A hash function H accepts a variable-length block of data M as input and produces a fixed-size hash value o h = H(M) o Principal object is data integrity • Cryptographic hash function o An algorithm for which it is computationally infeasible to find either: (a) a data object that maps to a pre-specified hash result (the oneway property) (b) two data objects that map to the same hash result (the collisionfree property) Secure Hash Algorithm (SHA) • SHA was originally developed by NIST • Published as FIPS 180 in 1993 • Was revised in 1995 as SHA-1 o Produces 160-bit hash values • NIST issued revised FIPS 180-2 in 2002 o o o o Adds 3 additional versions of SHA SHA-256, SHA-384, SHA-512 With 256/384/512-bit hash values Same basic structure as SHA-1 but greater security • In 2005 NIST announced the intention to phase out approval of SHA-1 and move to a reliance on the other SHA versions by 2010 N ¥ 1024 bits L bits 128 bits Message 1024 bits 1024 bits M1 IV = H0 512 F 1024 bits M2 1024 MN 1024 + H1 1024 + F L 100..0 H2 F + HN = hash code + = word-by-word addition mod 2 64 Figure 21.2 Message Digest Generation Using SHA-512 SHA-512 Constants Each round t makes use of a 64-bit value Wt, derived from the current 1024bit block being processed (Mi). Each round also makes use of an additive constant Kt, where 0 ≤ t ≤ 79 indicates one of the 80 rounds. These words represent the first sixty-four bits of the fractional parts of the cube roots of the first eighty prime numbers. Each round takes as input the 512-bit buffer value abcdefgh, and updates the contents of the buffer. At input to the first round, the buffer has the value of the intermediate hash value, Hi–1. Mi Hi–1 message schedule a b c W0 The constants provide a “randomized” set of 64-bit patterns, which should eliminate any regularities in the input data. The operations performed during a round consist of circular shifts, and primitive Boolean functions based on AND, OR, NOT, and XOR. The output of the eightieth round is added to the input to the first round (Hi–1) to produce Hi. The addition is done independently for each of the eight words in the buffer with each of the corresponding words in Hi–1, using addition modulo 264. e f g h 64 K0 Round 0 a b c Wt d e f g h Kt Round t a W79 d b c d e f g h Round 79 K79 ++++++++ Hi Figure 21.3 SHA-512 Processing of a Single 1024-Bit Block The SHA-512 algorithm has the property that every bit of the hash code is a function of every bit of the input. The complex repetition of the basic function F produces results that are well mixed; that is, it is unlikely that two messages chosen at random, even if they exhibit similar regularities, will have the same hash code. Unless there is some hidden weakness in SHA-512, which has not so far been published, the difficulty of coming up with two messages having the same message digest is on the order of 2256 operations, while the difficulty of finding a message with a given digest is on the order of 2512 operations. SHA-3 SHA-1 has not yet been "broken” •No one has demonstrated a technique for producing collisions in a practical amount of time •Considered to be insecure and has been phased out for SHA-2 NIST announced in 2007 a competition for the SHA-3 next generation NIST hash function •Winning design was announced by NIST in October 2012 •SHA-3 is a cryptographic hash function that is intended to complement SHA-2 as the approved standard for a wide range of applications SHA-2 shares the same structure and mathematical operations as its predecessors so this is a cause for concern •Because it will take years to find a suitable replacement for SHA-2 should it become vulnerable, NIST decided to begin the process of developing a new hash standard Requirements: • Must support hash value lengths of 224, 256,384, and 512 bits • Algorithm must process small blocks at a time instead of requiring the entire message to be buffered in memory before processing it 1. It must be possible to replace SHA-2 with SHA-3 in any application by a simple drop-in substitution. Therefore, SHA-3 must support hash value lengths of 224, 256, 384, and 512 bits. 2. SHA-3 must preserve the online nature of SHA-2. That is, the algorithm must process comparatively small blocks (512 or 1024 bits) at a time instead of requiring that the entire message be buffered in memory before processing it. Message Authentication Code (MAC) • Also known as a keyed hash function • Typically used between two parties that share a secret key to authenticate information exchanged between those parties Takes as input a secret key and a data block and produces a hash value (MAC) which is associated with the protected message • If the integrity of the message needs to be checked, the MAC function can be applied to the message and the result compared with the associated MAC value • An attacker who alters the message will be unable to alter the associated MAC value without knowledge of the secret key HMAC • • • Interest in developing a MAC derived from a cryptographic hash code o Cryptographic hash functions generally execute faster o Library code is widely available o SHA-1 was not deigned for use as a MAC because it does not rely on a secret key Issued as RFC2014 Has been chosen as the mandatory-toimplement MAC for IP security o Used in other Internet protocols such as Transport Layer Security (TLS) and Secure Electronic Transaction (SET) To preserve the original performance of the hash function without incurring a significant degradation To use, without modifications, available hash functions To allow for easy replaceability of the embedded hash function in case faster or more secure hash functions are found or required To use and handle keys in a simple way To have a well-understood cryptographic analysis of the strength of the authentication mechanism based on reasonable assumptions on the embedded hash function 1. Append zeros to the left end of K to create a bbit string K+ (e.g., if K is of length 160 bits and b = 512, then K will be appended with 44 zero bytes 0x00). K+ 2. XOR (bitwise exclusive-OR) K+ with ipad to produce the b-bit block Si. ipad b bits b bits Y0 Y1 YL–1 Si 3. Append M to Si. 4. Apply H to the stream generated in step 3. b bits IV K+ n bits Hash n bits opad H(Si || M) 5. XOR K+ with opad to produce the b-bit block So. b bits 6. Append the hash result from step 4 to So. 7. Apply H to the stream generated in step 6 and output the result. pad to b bits So IV n bits Hash n bits Note that the XOR with ipad results in flipping one-half of the bits of K . Similarly, the XOR with opad results in flipping one-half of the bits of K , but a different set of bits. In effect, by passing S i and S o through the hash algorithm, we have pseudorandomly generated two keys from K . HMAC(K, M) Figure 21.4 HMAC Structure Security of HMAC • Security depends on the cryptographic strength of the underlying hash function • For a given level of effort on messages generated by a legitimate user and seen by the attacker, the probability of successful attack on HMAC is equivalent to one of the following attacks on the embedded hash function: o Either attacker computes output even with random secret IV • Brute force key O(2n), or use birthday attack o Or attacker finds collisions in hash function even when IV is random and secret • ie. find M and M' such that H(M) = H(M') • Birthday attack O( 2n/2) • MD5 secure in HMAC since only observe Digital Signature • Operation is similar to that of the MAC • The hash value of a message is encrypted with a user’s private key • Anyone who knows the user’s public key can verify the integrity of the message • An attacker who wishes to alter the message would need to know the user’s private key • Implications of digital signatures go beyond just message authentication Public Key Infrastructure (PKI) The RA ensures that the public key is bound to the individual to which it is assigned in a way that ensures non-repudiation. The term trusted third party (TTP) may also be used for certificate authority (CA). The term PKI is sometimes erroneously used to denote public key algorithms, which do not require the use of a CA. There are three main approaches to getting this trust: Certificate Authorities (CAs), Web of Trust (WoT), and Simple public key infrastructure (SPKI). The primary role of the CA is to digitally sign and publish the public key bound to a given user. This is done using the CA's own private key, so that trust in the user key relies on one's trust in the validity of the CA's key. The mechanism that binds keys to users is called the Registration Authority (RA), which may or may not be separate from the CA. The key-user binding is established, depending on the level of assurance the binding has, by software or under human supervision PKI .. Other Hash Function Uses Commonly used to create a one-way password file Can be used for intrusion and virus detection When a user enters a password, the hash of that password is compared to the stored hash value for verification Store H(F) for each file on a system and secure the hash values One can later determine if a file has been modified by recomputing H(F) This approach to password protection is used by most operating systems An intruder would need to change F without changing H(F) Can be used to construct a pseudorandom function (PRF) or a pseudorandom number generator (PRNG) A common application for a hash-based PRF is for the generation of symmetric keys Birthday Attack • In probability theory, the birthday problem or birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. • By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367 (since there are 366 possible birthdays, including February 29). • Q. How many people do I need to ensure with 99.9% that some pair share a birthday? Birthday Attack • It turns out that 99.9% probability is reached with just 70 people, and 50% probability with 23 people. • These conclusions include the assumption that each day of the year (except February 29) is equally probable for a birthday. • The mathematics behind this problem led to a wellknown cryptographic attack called the birthday attack, which uses this probabilistic model to reduce the complexity of cracking a hash function. https://www.youtube.com/watch?v=OBpEFqjkPO8 Birthday Attacks • For a collision resistant attack, an adversary wishes to find two messages or data blocks that yield the same hash function o The effort required is explained by a mathematical result referred to as the birthday paradox • How the birthday attack works: o The source (A) is prepared to sign a legitimate message x by appending the appropriate m-bit hash code and encrypting that hash code with A’s private key o Opponent generates 2m/2 variations x’ of x, all with essentially the same meaning, and stores the messages and their hash values o Opponent generates a fraudulent message y for which A’s signature is desired o Two sets of messages are compared to find a pair with the same hash o The opponent offers the valid variation to A for signature which can then be attached to the fraudulent variation for transmission to the intended recipient • Because the two variations have the same hash code, they will produce the same signature and the opponent is assured of success even though the encryption key is not known A Letter in 237 Variation Alice, Bob and Eve • The three fictitious characters involved in secret transmissions traditionally go by the names of Alice and Bob with Eve, the eavesdropper, intercepting their messages and generally causing mischief. • Perhaps because of the name, Eve is usually regarded as the evil figure in the drama although this is quite unfair: ……as Alice and Bob could be hatching plots of their own and Eve represents a benign intelligence service striving to protect citizens from the conspiratorial schemes of the other pair. Secure Key Exchange • Transmission of a secure message from Alice to Bob does not in itself necessitate the exchange of the key to a cipher, for they can proceed as follows. 1. Alice writes her plaintext message for Bob, and places it in a box that she secures with her own padlock. Only Alice has the key to this lock. 2. She then posts the box to Bob, who of course cannot open it. Bob however then adds a second padlock to the box, for which he alone possesses the key. 3. The box is then returned to Alice, who then removes her own lock, and sends the box for a second time to Bob. 4. This time Bob may unlock the box and read Alice’s message, secure in the knowledge that Eve could not have peeked at the contents during delivery process. Secure Key Exchange • In this way a secret message may be securely sent on an insecure channel without Alice and Bob ever exchanging keys. (Eve still could of course simply steal the box, then neither she nor Bob would know Alice’s message—this corresponds to a physical attack on Alice and Bob’s communications medium.) • This thought experiment shows that there is no law that says that a key must exchange hands in the exchange of secure messages. This represented a fresh way of looking at an age old problem. •So…… Is it possible for Alice and Bob to set up a secure cipher between them without ever meeting one another or making use of a third party to act as a go between? • After all, the practical problem that had dogged cipher applications from the beginning was that of key exchange—the initial transfer of the key to the cipher between the interested parties. Paint Can Example As their secret key, Alice and Bob are going to manufacture an exact colour shade of paint. 1. Each takes one litre of white paint and mixes it with another litre of paint of a colour that only they know: Alice might use her own secret shade of scarlet, Bob his own peculiar blue. 2. They then arrange a rendezvous to exchange paint cans: Alice handing Bob two litres of pink paint, Bob giving Alice a two-litre pot of pale blue. They may even taunt their relentless adversary Eve by inviting her to their tryst and giving her an exact replica of each of the two-litre cans of colored paint. 3. Alice and Bob return home. Alice takes Bob’s can and mixes with it one litre of her special scarlet paint. At the other end, Bob mixes in a litre of his blue into the can that Alice gave to Bob. Both Alice and Bob now have threelitre mixtures of a particular shade of purple, consisting of 1 litre each of white, scarlet, and blue, and it is this exact shade that is the secret key to their cipher. Paint Can Example RSA Public-Key Encryption • By Rivest, Shamir & Adleman of MIT in 1977 • Best known and widely used public-key algorithm • Uses exponentiation of integers modulo a prime • Encrypt: C = Me mod n • Decrypt: M = Cd mod n = (Me)d mod n = M • Both sender and receiver know values of n and e • Only receiver knows value of d • Public-key encryption algorithm with public key PU = {e, n} and private key PR = {d, n} One Way Function • In 1970’s a number of people hit on this and realized its potential importance. • However, to bring the idea to fruition involved the invention of a trapdoor function. Each user would need such a function f that would be in principle available to everyone who could then calculate its values f (x ). •However, the owner of the function, Alice, would know something vital about it that allowed her to decipher and recover x from the value of f (x ). •What is more, other people, even though they knew how to calculate f (x ), must not be able to deduce this key piece of information however hard they try. •The names usually associated with the discovery and development of public key cryptography are Diffie, Hellman and Merkle along with Rivest, Shamir and Adleman from whose initials the name RSA codes derives. RSA in action • The principal ingredient of Alice’s RSA private key is a very large pair of prime numbers, p and q . (In real life these numbers are up to 200 digits in length.) • In order to use Alice’s public key however, Bob does not need p and q but rather the product, n of these two primes: pq = n. This represents the first step in the process. • The next key step however is to invent a trapdoor function f (x ) that can be calculated as long as we possess n but has the property that, given the number f (x ), it is a practical impossibility to recover x without the two magic numbers p and q . • Practical experience had shown that recovering p and q from n took a prohibitive amount of computing power. Prime numbers • Since any message can be translated into a string of numbers, the problem comes down to how Bob may securely send a particular number, let us call it M for message, to Alice without Eve finding out its value. •Alice’s private key is based on two prime numbers, p and q that only she knows. •In this toy example, which is quite representative of the real situation, we shall use the small primes p = 23 and q = 47. • The publicly known product of these two numbers is n = 23 × 47 = 1081. •(Remember that In practice of course, p and q are huge and in any case all this is happening behind the scenes and is done invisibly on behalf of any real life Bob and Alice.) Modular Arithmetic • The approach is to mask the value of M using modular arithmetic, that is to say clock arithmetic in this case based on a clock whose face is numbered by 0, 1, 2, · · · , n -1. • What Alice leaves in the public domain is the number n and also another number, e for encoding messages meant for her. • What Bob sends to Alice is not of course M itself (for if he did then Eve would be liable to overhear) but rather the remainder when Me is divided by n. • For example, if Bob’s message was M = 77 and if the encoding number that Alice tells people to use is e = 15, then Bob, or rather his computer, would calculate the remainder when 7715 was divided by n = 1081. This remainder turns out to be 646. Public Key Encryption And so Bob sends to Alice his disguised message in the form of the enciphered message 646. Eve will presumably intercept this message and know that Bob’s message is encoded as 646 when using Alice’s public key which she knows as well as anyone consists of n = 1081 and e = 15. But how can the original message be teased back out? For Alice, who knows that 1081 = 23 × 47, this is quite straight-forward. For, once in possession of the prime factors of n, it is possible to determine a decoding number d which is found using the values of p , q and e . It turns out in this case that a suitable value for the decoding number is d = 135. Alice’s computer then works out the remainder when 646135 is divided by n = 1081, and the underlying mathematics ensures that the answer will be the original message M = 77. RSA Key Ingredient A key ingredient in the method is the value of the number (p - 1)(q - 1), which is denoted by φ(n), and in this case we see that φ(1081) = 22 × 46 = 1012. The encoding number e that Alice chooses in her public key cannot be completely arbitrary but must have no factor in common with φ(n). The prime factors of 1012 are seen to be 2, 11 and 23 so that e must not be a multiple of any of these three primes. This is only a very mild restriction and Alice’s particular choice of e = 15 = 3 × 5 is perfectly all right. The decoding number d is chosen, and this is always possible, so that the product ed leaves a remainder of 1 when divided by (p - 1)(q - 1). The message number M itself needs to be less than n but in practice this is no restriction as the size of n in real applications is so monstrous it can accommodate all the values of M enough to cover any real message we would ever wish to send. Public Key Encryption To see all this in action we may illustrate with an example featuring even smaller numbers that the one earlier. For instance let us take p = 3 and q = 11 so that n = pq = 33 and φ(n) = (p - 1)(q - 1) = 2 × 10 = 20. Alice then publishes n = 33 and suppose she sets e = 7, which is permissible, as 7 has no factor in common with 20. The number d then has to be chosen so that ed = 7d leaves a remainder of 1 when divided by 20. By inspection we see a solution is d = 3, for then 7d = 21. Public Key Encryption Now Alice has her little RSA cipher all set up. If Bob wants to send the message M = 6, then he computes Me = 67 = 279, 936, divides this number by 33 to find that the remainder is 30, and so Bob would send the number 30 over an open channel. Alice would receive Bob’s 30 and decipher its real meaning by calculating 303 = 27, 000. Division by n = 33 then gives her 27, 000 = 33 × 818 + 6. Again it is only the remainder 6 that is of interest as that is Bob’s plaintext message. Key Generation p and q both prime, p ¹ q Select p, q Calculate n = p ´ q Calculate f(n) = (p – 1)(q – 1) Select integer e gcd(f(n), e) = 1; 1 < e < f(n) Calculate d de mod f(n) = 1 Public key KU = {e, n} Private key KR = {d, n} Encryption Plaintext: M<n Ciphertext: C = Me (mod n) Decryption Ciphertext: C Plaintext: M = Cd (mod n) Figure 21.5 The RSA Algorithm Decryption Encryption plaintext 88 7 88 mod 187 = 11 PU = 7, 187 ciphertext 11 23 11 mod 187 = 88 PR = 23, 187 Figure 21.6 Example of RSA Algorithm plaintext 88 Security of RSA Brute force • Involves trying all possible private keys Mathematical attacks • There are several approaches, all equivalent in effort to factoring the product of two primes Timing attacks • These depend on the running time of the decryption algorithm Chosen ciphertext attacks • This type of attack exploits properties of the RSA algorithm Diffie-Hellman Key Exchange • First published public-key algorithm • By Diffie and Hellman in 1976 along with the exposition of public key concepts • Used in a number of commercial products • Practical method to exchange a secret key securely that can then be used for subsequent encryption of messages • Security relies on difficulty of computing discrete logarithms Diffie Hellman Key Exchange • Diffie and Hellman had a good idea but the challenge was to produce a mathematical version of the paint mixing exchange. • Crucially, the operations involved must commute with one another: when mixing paint, the final outcome depends only on the ratio of the colours we use and not on the order in which the paints are mixed together. • The enciphering processes must likewise be able to slip past one another to produce the same overall effect. • The idea of repeated exponentiation was successfully used by Diffie and Hellman to allow Alice and Bob to use a method akin to this to create a mutual key that any outsider could recreate only with the utmost difficulty. • Their method exploited the added ingredient of modular arithmetic Diffie Hellman…. Alice and Bob choose a base number, for the purposes of the example we take it to be 2, and once again Alice and Bob choose one number each known only to them personally. This time we even insist that they select ordinary positive integers: let us say Alice chooses a = 7 and Bob goes for b = 9. However there is now to be an extra ingredient, another number p , which is also assumed to lie in the public domain: let us suppose that p = 47. Alice now computes 2a but the number she transmits is the remainder when this number is divided by p . In this case she finds 27 = 128 = 2 × 47 + 34, so the number 34 is sent over an insecure channel to Bob. Similarly Bob computes 2b = 29 = 512 = 10 × 47 + 42, and transmits 42 to Alice. Simple Key Encryption • What Alice now does in the security of her own home is calculate the remainder when 42a is divided by p , while Bob calculates the remainder when p is divided into 34 . • Alice and Bob will both end up with the same number, the same key, as in each case the net result will be the remainder when 2ab is divided by p. • Alice will find that the remainder when 427 is divided by 47 is 37, and so will Bob when he divides 349 by 47. • Alice and Bob have now created a shared key, the number 37. •Eve on the other hand is left frustrated. Her mathematical problem is this; she does not know the values of a or b but she does know that 2a and 2b leave respective remainders of 42 and 34 when divided by 47. • The key is to find the remainder when 2ab is divided by 47. Global Public Elements q prime number a a < q and a a primitive root of q User A Key Generation Select private XA XA < q Calculate public YA YA = a XA mod q User B Key Generation Select private XB XB < q Calculate public YB YB = a XB mod q Generation of Secret Key by User A X K = (YB) A mod q Generation of Secret Key by User B X K = (YA) B mod q Figure 21.7 The Diffie-Hellman Key Exchange Algorithm Have •Prime number q = 353 •Primitive root = 3 A and B each compute their public keys •A computes YA = 397 mod 353 = 40 •B computes YB = 3233 mod 353 = 248 Then exchange and compute secret key: •For A: K = (YB)XA mod 353 = 24897 mod 353 = 160 •For B: K = (YA)XB mod 353 = 40233 mod 353 = 160 Attacker must solve: •3a mod 353 = 40 which is hard •Desired answer is 97, then compute key as B does Alice Bob Alice and Bob share a prime q and a, such that a < q and a is a primitive root of q Alice and Bob share a prime q and a, such that a < q and a is a primitive root of q Alice generates a private key XA such that XA < q Bob generates a private key XB such that XB < q Alice calculates a public key YA = aXA mod q YA YB Bob calculates a public key YB = aXB mod q Alice receives Bob’s public key YB in plaintext Bob receives Alice’s public key YA in plaintext Alice calculates shared secret key K = (YB)XA mod q Bob calculates shared secret key K = (YA)XB mod q Figure 21.8 Diffie-Hellman Key Exchange Man-in-the-Middle Attack • Attack is: • All subsequent communications compromised 1. Darth generates private keys XD1 and XD2, and their public keys YD1 and YD2 2. Alice transmits YA to Bob 3. Darth intercepts YA and transmits YD1 to Bob. Darth also calculates K2 4. Bob receives YD1 and calculates K1 5. Bob transmits XA to Alice 6. Darth intercepts XA and transmits YD2 to Alice. Darth calculates K1 7. Alice receives YD2 and calculates K2 Other Public-Key Algorithms Digital Signature Standard (DSS) Elliptic-Curve Cryptography (ECC) • • FIPS PUB 186 • • Seen in standards such as IEEE P1363 • Makes use of SHA-1 and the Digital Signature Algorithm (DSA) Equal security for smaller bit size than RSA Originally proposed in 1991, revised in 1993 due to security concerns, and another minor revision in 1996 • Confidence level in ECC is not yet as high as that in RSA • Based on a mathematical construct known as the elliptic curve • • Cannot be used for encryption or key exchange Uses an algorithm that is designed to provide only the digital signature function Code Breaking Jobs GCHQ Current Vacancies Lab 7. Web Application Attack vectors 7.1 Abusing File Upload on a Vulnerable Web Server 7.2 Cross-site Request Forgery 7.3 SQL & Cross-Site Scripting Vulnerabilities 7.3.1 SQL Injection Vulnerabilities 7.3.3 Testing Apps to Find SQL Injection Vulnerabilities 7.4 Cross Site Scripting (XSS) Reflected Attack The End