Cryptographic Hashing: Blockcipher-Based Constructions, Revisited
Tom Shrimpton
Portland State University

Results from CRYPTO 2004
• “Near-collisions” in SHA-0 [Biham]
• Collisions in SHA-0 [Joux, rump session]
• Collisions in reduced-round SHA-1 [Biham, rump session]
• Collisions in MD4, MD5, RIPEMD, HAVAL-128 [Wang et al., rump session]
• Multicollisions in iterated constructions [Joux]

Today
• What are these objects?
• What cryptographic properties do we like for them to have?
• How do we build them (particularly, from a blockcipher)?
• What do we currently understand about proofs, models, bounds on efficiency, etc.?
• A call to action!

What are cryptographic hash functions?
File → Hash (e.g., md5sum, SHA-1) → Cryptographic “Fingerprint”

SHA-1 [NIST]
Input: message blocks M_1, M_2, ..., M_m (512 bits each). Output: 160 bits.
(<< and >> denote rotations of 32-bit words; + is addition mod 2^32.)

for i = 1 to m do
    W_t = t-th word of M_i                                     for 0 ≤ t ≤ 15
    W_t = (W_{t-3} ⊕ W_{t-8} ⊕ W_{t-14} ⊕ W_{t-16}) << 1       for 16 ≤ t ≤ 79
    A ← H0^{i-1};  B ← H1^{i-1};  C ← H2^{i-1};  D ← H3^{i-1};  E ← H4^{i-1}
    for t = 0 to 79 do
        T ← (A << 5) + g_t(B, C, D) + E + K_t + W_t
        E ← D;  D ← C;  C ← B >> 2;  B ← A;  A ← T
    end
    H0^i ← A + H0^{i-1};  H1^i ← B + H1^{i-1};  H2^i ← C + H2^{i-1};
    H3^i ← D + H3^{i-1};  H4^i ← E + H4^{i-1}
end
return H0^m H1^m H2^m H3^m H4^m

Today
• What are these objects? ✓
• What cryptographic properties do we like for them to have?
• How do we build them (particularly, from a blockcipher)?
• What do we currently understand about proofs, models, bounds on efficiency, etc.?
• A call to action!

[Slide: the terms “collision-intractable”, “one-way function”, and “collision resistance” scattered among question marks]

A motivating quote, and a “fact”
“2nd-preimage resistance: it is computationally infeasible to find any second-input which has the same output as any specified input, i.e., given x, to find a 2nd-preimage x’ ≠ x such that h(x) = h(x’).” [MOV]
• How are inputs specified?
• How is h selected?
“Fact: Collision resistance implies 2nd-preimage resistance of hash functions” [MOV]
This “fact” depends on how you answer the above questions!

A cryptographic property (quite informal)
1.
Preimage resistance: given a hash function and a hash output, it is hard to invert that output
   BAD: H(M) = M mod 701

Preimage resistance (intuition, but slightly more formal)
H: 𝒦 × Strings → {0,1}^n
   𝒦: a finite, nonempty set
   Strings: set of strings ⊆ {0,1}*
   n: the hash length
keyed-SHA1: {0,1}^160 × {0,1}* → {0,1}^160
SHA1 is one particular function from this family
[Diagram: H_K maps M and M’ in Strings to Y ∈ {0,1}^n; a further label reads {0,1}^m. The direction from Y back to a preimage is marked: This direction is “hard” for any “reasonable” adversary]

Preimage resistance: a definition (formal)
[Formula: the advantage is the probability of an event in a probabilistic game. The slide annotates its parts, in order:]
• “name of game”
• random key
• random domain pt
• hash the domain pt
• A runs, returns domain pt
• event: did A win (find preimage)?

A formal framework [RS04]

Preimage       fixed range point    random range point
fixed key              -                  aPre
random key           ePre                 Pre

• ePre: every range point is hard to invert
• aPre: every hash function in the family is hard to invert
• “a” = “always”, “e” = “everywhere”

More cryptographic properties
1. Preimage resistance ✓
   given a hash function and a hash output, it is hard to invert that output
2. Second-preimage resistance
   given a hash function and a first input, it is hard to find a second input that collides with the first
3. Collision resistance
   given a hash function, it is hard to find two colliding inputs

Preimage           fixed range point     random range point
fixed key                  -                   aPre
random key               ePre                  Pre

Second Preimage    fixed domain point    random domain point
fixed key                  -                   aSec
random key               eSec                  Sec

Collision
fixed key                  -
random key               Coll

(eSec is also known as UOWHF.)

Our results [RS04]
[Diagram: relations among Coll, aSec, eSec, Sec, aPre, ePre, and Pre. Legend: conventional implication; provisional implication; no arrow = separation]

What about near-collisions?
[Diagram: H_K maps M and M’ in Strings to Y and Y’ in {0,1}^n such that Y ≈ Y’]
This should be “hard” for any “reasonable” adversary
(Hmm.. what does this mean now?)

Research project #1
Continue definitional work
• What’s the “right” definition for the task?
• How do we make it formal?

Today
• What are these objects? ✓
• What cryptographic properties do we like for them to have? ✓
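• How do we build them (particularly, from a blockcipher)?
• What do we currently understand about proofs, models, bounds on efficiency, etc.?
• A call to action!

How to do this?
H: 𝒦 × Strings → {0,1}^n
   Strings: arbitrary length string
   {0,1}^n: n-bit string

Merkle-Damgard construction [Me89],[Da89]
[Diagram: message blocks M1, M2, M3 feed three copies of the compression function f; wires are labeled n and k]
   h1 = f(IV, M1),  h2 = f(h1, M2),  h3 = f(h2, M3) = H(M)
• f: compression function
• IV: fixed initial value
• h1, h2: chaining values
MD Theorem: if f is CR, then so is H

[SHA-1 as a Merkle-Damgard iteration. Each M_i is 512 bits; the chaining value H0..4^{i-1} and the output are 160 bits. << and >> denote rotations of 32-bit words; + is addition mod 2^32.]

for i = 1 to m do
    W_t = t-th word of M_i                                     for 0 ≤ t ≤ 15
    W_t = (W_{t-3} ⊕ W_{t-8} ⊕ W_{t-14} ⊕ W_{t-16}) << 1       for 16 ≤ t ≤ 79
    A ← H0^{i-1};  B ← H1^{i-1};  C ← H2^{i-1};  D ← H3^{i-1};  E ← H4^{i-1}
    for t = 0 to 79 do
        T ← (A << 5) + g_t(B, C, D) + E + K_t + W_t
        E ← D;  D ← C;  C ← B >> 2;  B ← A;  A ← T
    end
    H0^i ← A + H0^{i-1};  H1^i ← B + H1^{i-1};  H2^i ← C + H2^{i-1};
    H3^i ← D + H3^{i-1};  H4^i ← E + H4^{i-1}
end
return H0^m H1^m H2^m H3^m H4^m

Why build hash functions from blockciphers?
Economy of primitives: “Do as much as possible with as little as possible”
(late 70s-early 90s): DES
   – weak keys cause design difficulties
   – small blocksize → easier wins for adversary
(now): AES has changed the playing field
   – no known weak keys
   – bigger blocksize → harder wins for adversary

Blockcipher-based compression function #1 (CBC) [Akl83]
Is this collision-resistant?
[Diagram: CBC-style chaining from IV through E under key K, shown for two different two-block messages; recoverable labels include M2, E_K(IV), and E_K(0). Both chains end in the same value: E_K(E_K(0)) = E_K(E_K(0))]

Attempt #2 [PGV93]
How about this?
[Diagram: a second chaining scheme, shown for two different messages; recoverable labels include IV, M2, E_1(1), and E_0(0). Both chains end in the same value: IV = IV]

[Slide: 12 provably-secure compression functions]

Davies-Meyer compression function [PGV93],[BRS02]
[Diagram: M_i is the key of E, h_{i-1} is its input, and h_{i-1} is fed forward into the output]
   h_i = E_{M_i}(h_{i-1}) ⊕ h_{i-1}

SHA-0, SHA-1 are blockcipher-based hash functions!
• The 80-step loop is a blockcipher with a 512-bit key (M_i) and a 160-bit block
• The final additions are the Davies-Meyer feedforward

for i = 1 to m do
    W_t = t-th word of M_i                                     for 0 ≤ t ≤ 15
    W_t = (W_{t-3} ⊕ W_{t-8} ⊕ W_{t-14} ⊕ W_{t-16}) << 1       for 16 ≤ t ≤ 79
    A ← H0^{i-1};  B ← H1^{i-1};  C ← H2^{i-1};  D ← H3^{i-1};  E ← H4^{i-1}
    for t = 0 to 79 do                                          [blockcipher]
        T ← (A << 5) + g_t(B, C, D) + E + K_t + W_t
        E ← D;  D ← C;  C ← B >> 2;  B ← A;  A ← T
    end
    H0^i ← A + H0^{i-1};  H1^i ← B + H1^{i-1};  H2^i ← C + H2^{i-1};   [Davies-Meyer feedforward]
    H3^i ← D + H3^{i-1};  H4^i ← E + H4^{i-1}

Collision resistance in the “ideal cipher” model
[Diagram: forward oracle E, queried on (K, x)]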
[Diagram, continued: inverse oracle E^{-1}, queried on (K, y); the oracles answer E_K(x) and E_K^{-1}(y); adversary A outputs M, M’]
• Model blockcipher as a random permutation for each key
• Adv_H^{coll}(A) = Pr[ A^{E, E^{-1}} finds a collision in H^E ]
• Computationally unbounded adversary
• Adv_H^{coll}(q) = max_A { Adv_H^{coll}(A) }, over adversaries A making at most q queries
• Only counted resource is oracle queries

Why such a strong model?
• PRP assumption isn’t enough in general [Simon]
• Specifically, for each of the 12 there is a PRP that makes collisions easy [Hopwood][Wagner]
• More importantly, PRP is the wrong tool
   Security depends on a random, secret key

Research project #2
Find new models and/or assumptions
• What properties does a blockcipher need for hashing?
• How can we abstract them to models/assumptions?
• Can we prove things?

Moving theory towards practice
[Diagram: two chained blockcipher calls, E keyed by M_i taking h_{i-1} to h_i, then E keyed by M_{i+1} taking h_i to h_{i+1}; a callout reads “Expensive operations”]

Secure rate-1, fixed-key constructions?
No secure rate-1, fixed-key constructions [BCS 04]
[Diagram: M_i and h_{i-1} (n bits each) enter f1, whose n-bit output passes through E_K and then f2 to give the n-bit h_i]
In the black-box model:
• compression function: collision after 2 blockcipher calls
• iterated function: collisions in Θ(n + lg(n)) calls

Research project #3
Find secure, fixed-key, rate < 1, iterated constructions
(some progress being made)

128 bits too small? Cascaded constructions!
G_{(K1,K2)}(M) = H_{K1}(M) || H_{K2}(M)
• H_{K1}(M): n bits, n/2 bits of CR
• H_{K2}(M): n bits, n/2 bits of CR
• G_{(K1,K2)}(M): n bits of CR?
Joux: for MD constructions, No!

Multicollisions
[Diagram: Merkle-Damgard chain over M1, M2, ..., Mm with n-bit chaining values: IV → f → h1 → f → h2 → … → h_{m-1} → f → h_m = H(M)]
For m·2^{n/2} work, we can make 2^m messages that collide

Collisions in cascaded constructions
For G_{(K1,K2)}(M) = H_{K1}(M) || H_{K2}(M), with 160 bits from each half:
1. Create 2^81-way multicollision under H_{K1}
2. Hash these messages under H_{K2}
Collision in G for work O(2^80) << O(2^160)

What about MDC-2?
[Diagram: M_i enters two blockcipher calls E, one on chain h_{i-1} → h_i and one on chain g_{i-1} → g_i]

Huge opportunities for research
• Continue definitional work
   – Formalize “near collisions”, etc.
   – What are the right properties for specific tasks?
• Flesh out the theoretical landscape
   – Ideal cipher model: proofs
   – PRP assumption: no proofs
• Find secure, fixed-key, rate < 1, iterated scheme
• Analysis of MDC-2
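
The multicollision attack on iterated constructions from the talk can be tried out on a toy hash. The sketch below is an illustration only, under loudly stated assumptions: the compression function f is SHA-256 truncated to 24 bits (a stand-in chosen so that a birthday collision costs only about 2^12 trials), there is no length padding, and all names (f, md_hash, find_block_collision, joux_multicollision) are hypothetical. It finds one colliding block pair per position, so k collisions give 2^k messages with the same hash.

```python
import hashlib
from itertools import product

N_BYTES = 3  # toy chaining value of 24 bits (assumption: small enough to attack quickly)
IV = b"\x00" * N_BYTES


def f(h: bytes, m: bytes) -> bytes:
    """Toy compression function: SHA-256 of (chaining value || block), truncated."""
    return hashlib.sha256(h + m).digest()[:N_BYTES]


def md_hash(blocks, iv: bytes = IV) -> bytes:
    """Plain Merkle-Damgard iteration over a list of blocks (no length padding)."""
    h = iv
    for m in blocks:
        h = f(h, m)
    return h


def find_block_collision(h: bytes):
    """Birthday search for two distinct blocks that collide under f from state h."""
    seen = {}
    counter = 0
    while True:
        m = counter.to_bytes(4, "big")
        out = f(h, m)
        if out in seen:
            # seen[out] was produced by an earlier, different counter value
            return seen[out], m, out
        seen[out] = m
        counter += 1


def joux_multicollision(k: int, iv: bytes = IV):
    """Chain k single-block collisions, giving 2**k messages with one common hash."""
    pairs = []
    h = iv
    for _ in range(k):
        m0, m1, h = find_block_collision(h)
        pairs.append((m0, m1))
    # Choosing either block at each position always follows the same chaining values.
    messages = [list(choice) for choice in product(*pairs)]
    return messages, h
```

For example, joux_multicollision(4) returns 16 distinct four-block messages and their common hash value, for roughly four birthday searches of work rather than the cost of a generic 16-way collision.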