Hash Function Design
Overview of the basic components in SHA-3 competition

Daniel Joščák
daniel.joscak@i.cz
S.ICZ a.s., Hvězdova 1689/2a, 140 00 Prague 4; Faculty of Mathematics and Physics, Charles University, Prague

Abstract

In this article we give an overview of the basic building blocks used in the design of new hash functions submitted to the SHA-3 competition. We briefly present the currently widely used hash functions MD5, SHA-1, SHA-2 and RIPEMD-160. At the end we consider several properties of the candidates and give examples of candidates in the SHA-3 competition.

Keywords: SHA-3 competition, hash functions.

1 Introduction

In 2004 a group of researchers led by Xiaoyun Wang (Shandong University, China) presented real collisions in MD5 and other hash functions at the rump session of the Crypto conference, and they explained the method in [10]. In 2005 the same group presented a collision attack on SHA-1 in [9], and since then a lot of progress has been made in collision-finding algorithms. Although there is no specific reason to believe that a practical attack on any of the SHA-2 family of hash functions is imminent, a successful collision attack on an algorithm in the SHA-2 family could have catastrophic effects for digital signatures.

In reaction to this situation the National Institute of Standards and Technology (NIST) created a public competition for a new hash algorithm standard, SHA-3 [1]. In addition to the obvious requirements on a hash function (i.e. collision resistance, first and second preimage resistance, …), NIST expects SHA-3 to have a security strength that is at least as good as that of the hash algorithms in the SHA-2 family, and expects this security strength to be achieved with significantly improved efficiency. NIST also desires that the SHA-3 hash functions be designed so that a possibly successful attack on the SHA-2 hash functions is unlikely to be applicable to SHA-3. The submission deadline for new designs was October 31, 2008.
51 algorithms were accepted as first-round candidates of the competition. Many new ideas appeared in the submissions, but the candidates also share several common properties. We try to summarize the common building blocks that appeared and to categorize the submissions according to them. The information about NIST's organization of the SHA-3 competition, the speed of the algorithms and the current state of attacks is taken from, and can be found at, the NIST web page [1] and the projects eBASH [5] and SHA-3 Zoo [4]. A very good comparison and categorization of the candidates can be found in [7].

Security and Protection of Information 2009

2 Desired properties

In this section we briefly present definitions of the properties that good hash functions, and candidates for the SHA-3 algorithm, must have.

Collision resistant: a hash function H is collision resistant if it is hard to find two distinct inputs that hash to the same output (that is, two distinct inputs m1 and m2 such that H(m1) = H(m2)). Every hash function with more inputs than outputs will necessarily have collisions. Consider the hash function SHA-256, which produces 256 bits of output from an arbitrarily large input. Since it must generate one of 2^{256} outputs for each member of a much larger set of inputs, the pigeonhole principle guarantees that some inputs will hash to the same output. Collision resistance does not mean that no collisions exist, only that they are hard to find. The birthday paradox sets an upper bound on collision resistance: if a hash function produces N bits of output, an attacker can expect to find a collision after performing only about 2^{N/2} hash operations, simply by hashing inputs until two outputs happen to match. If there is an easier method than this brute-force attack, it is considered a flaw in the hash function.

First preimage resistant: a hash function H is said to be first preimage resistant (sometimes only preimage resistant) if, given h, it is hard to find any m such that h = H(m).
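The birthday bound on collision resistance described above can be observed on a deliberately weakened hash. The following Python sketch truncates SHA-256 to n bits (a toy construction assumed here purely for illustration; the function names are hypothetical) and hashes the messages "0", "1", "2", ... until two outputs match. For n = 24, a collision typically appears after a few thousand evaluations, on the order of 2^{12}, rather than 2^{24}.

```python
import hashlib

def truncated_hash(message: bytes, n_bits: int) -> int:
    """Toy hash: the top n_bits of SHA-256(message), as an integer."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def find_collision(n_bits: int):
    """Birthday search: hash the messages 0, 1, 2, ... until two outputs match."""
    seen = {}  # output value -> message that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        h = truncated_hash(message, n_bits)
        if h in seen:
            # Two distinct messages with the same truncated hash.
            return seen[h], message, counter + 1
        seen[h] = message
        counter += 1

if __name__ == "__main__":
    m1, m2, evaluations = find_collision(24)
    print(m1, m2, evaluations)
```

The dictionary trades memory for time; the search is guaranteed to stop after at most 2^{n} + 1 evaluations by the pigeonhole principle.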
Second preimage resistant: a hash function H is said to be second preimage resistant if, given an input m1, it is hard to find another input m2 (not equal to m1) such that H(m1) = H(m2).

A preimage attack differs from a collision attack in that a fixed hash value or message is being attacked, and in its complexity. Optimally, a preimage attack on an n-bit hash function takes on the order of 2^{n} operations to be successful.

Resistant to length-extension attacks: given H(m) and the length of m, but not m itself, an attacker must not be able to calculate H(m || m') for a suitably chosen m', where || denotes concatenation.

Efficiency: the computation of a hash function must be efficient, i.e. speed matters. Hash functions are widely deployed in many applications and it is important to have fast implementations on different architectures. At the first SHA-3 conference, NIST announced that it would initially focus on Intel Architecture 32-bit (IA-32) and Advanced Micro Devices 64-bit (AMD64), but that performance on other platforms would not be overlooked. NIST also asked whether the candidates remain secure when submitters adjust their tunable parameters so that they run as fast as SHA-256 and SHA-512 on IA-32 and AMD64; if a candidate does not, its chances in the competition are lower. Memory requirements and code size are also very important for implementations on various embedded systems such as smart cards.

HMAC construction: the hash function must have at least one construction to support HMAC (or an alternative MAC construction) as a pseudorandom function (PRF), i.e. it must be hard to distinguish HMAC_K based on H from a random function.

3 Current hash functions

We briefly describe the four best-known and most widely used hash algorithms to show the evolution of hash functions. All of the functions use the same message padding scheme (appending a "1" bit, then zeroes, and then the length of the message, so that the length of the padded message is a multiple of the block size of the compression function).
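The padding rule just described can be sketched in Python as follows. This is a minimal byte-oriented sketch under stated assumptions: 64-byte blocks and a 64-bit big-endian length field, as in SHA-1 and SHA-256 (MD5 and RIPEMD-160 store the length in little-endian order, and SHA-384/SHA-512 use 128-byte blocks with a 128-bit length field). The function name md_pad is hypothetical.

```python
import struct

def md_pad(message: bytes, block_size: int = 64) -> bytes:
    """Pad a byte-aligned message in the style of SHA-1/SHA-256."""
    bit_length = 8 * len(message)
    # The single "1" bit followed by seven "0" bits.
    padded = message + b"\x80"
    # Zero bytes until only the 8-byte length field is missing from the last block.
    zeros = (block_size - 8 - len(padded)) % block_size
    padded += b"\x00" * zeros
    # Message length in bits as a 64-bit big-endian integer.
    padded += struct.pack(">Q", bit_length)
    return padded

if __name__ == "__main__":
    print(md_pad(b"abc").hex())
```

Note that a 56-byte message already needs a second block, because the "1" bit and the length field no longer fit into the first one.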
All of the functions use the Merkle-Damgård construction built from a compression function, which is shown in Figure 1. All but RIPEMD-160 use the Davies-Meyer construction of a compression function from a block cipher. And all of the functions use very simple register instructions: the logical operators or, and, xor in a simple nonlinear function, modular addition, shift and rotation. The functions mainly differ (apart from the obvious lengths of the registers, message blocks and outputs) in the complexity of the message expansion function and the step function, which are parts of the compression function. The newer the function, the more complex its message expansion and step function.

Figure 1: Merkle-Damgård construction (starting from IV, the message blocks M1, M2, ..., Mn are fed one by one into the compression function f to produce the output).

3.1 MD5

MD5 was designed by Ron Rivest in 1991. It is the successor of MD4 and its output is 128 bits long. The message expansion is very simple: the identity and permutations of the message-block registers. The step function is shown in Figure 2. The first cryptanalysis appeared in 1993 [6]. Real collisions have been known since 2004 [10]. It is no longer recommended to use this function for cryptographic purposes.

Figure 2: MD5 step function, F is a simple nonlinear function (taken from Wikipedia).

3.2 SHA-1

The specification was published in 1995 as the Secure Hash Standard, FIPS PUB 180-1, by NIST. The output of the function has a length of 160 bits. It is the successor of SHA-0, which was withdrawn by NSA shortly after its publication and superseded by the revised version. SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function; this was done, according to NSA, to correct a flaw in the original algorithm which reduced its cryptographic security. It is the most common hash function used today.

In 2005 a collision attack on SHA-1 was presented in [9].
No real collisions have been found to date, but the complexity of the attack is claimed to be roughly 2^{61}. It is not recommended to use this function for new applications.

Figure 3: SHA-1 step function, F is a simple nonlinear function (taken from Wikipedia).

3.3 SHA-2

SHA-2 is a family of four hash functions: SHA-224, SHA-256, SHA-384 and SHA-512. The algorithms were first published in the draft FIPS PUB 180-2 in 2001. The 384-bit and 512-bit versions use different constants, 64-bit registers and 1024-bit message blocks in their compression functions. Otherwise they are the same. The SHA-2 functions have the same construction properties as SHA-1, but no successful application of the previous attacks on SHA-1 or MD5 to them has been published. This is believed to be due to their more complex message expansion and step function. Nowadays users are strongly encouraged to move to these functions.

Figure 4: SHA-2 step function; Ch, Ma, Σ0 and Σ1 are less trivial functions (taken from Wikipedia).

3.4 RIPEMD-160

RIPEMD-160 is a 160-bit cryptographic hash function designed by H. Dobbertin, A. Bosselaers and B. Preneel. It is intended to be used as a secure replacement for the 128-bit hash functions MD4 and MD5. The speed of the algorithm is similar to the speed of SHA-1, but the structure of the algorithm is different, as shown in Figure 5. It uses a balanced Feistel network known from the theory of block ciphers. There are no known successful attacks on RIPEMD-160, and the function is, together with the SHA-2 family, recommended by ETSI 102176-1.

Figure 5: RIPEMD compression function.

4 Building blocks

In this section we provide a list of common building blocks that appeared in the SHA-3 competition. The list may not be complete and the candidates may share some other common properties. For each building block we tried to summarize its pros and cons and to give some examples of candidates that use that design strategy.
Links to the documentation of the candidates can be found at the NIST web site [1].

4.1 Feedback Shift Register (FSR)

Linear and nonlinear feedback shift registers are often used in stream ciphers. Because of their good pseudorandom properties, easy implementation in hardware and well-known theory, they are good candidates for use as a building block in a compression function.
Pros: efficiency in HW, known theory from stream ciphers, easy to implement.
Cons: implementation in SW may be slow, possible drawbacks of stream ciphers such as long initialization.
Examples: MD6, Shabal, Essence, NaSHA.

4.2 Feistel Network

A Feistel network is a general method for transforming any function into a permutation. The strategy has been used in the design of many block ciphers, and because hash functions are often based on a block cipher it is used there as well. A Feistel network works as follows: take a block of length n bits and divide it into two parts, called L and R. A round of the cipher is calculated from the previous round by setting L_i = R_{i-1} and R_i = L_{i-1} XOR f(R_{i-1}, K_i), where K_i is the subkey used in the i-th round and f is an arbitrary round function. If L and R are of the same size, the Feistel network is said to be balanced; if they are not, it is said to be unbalanced.
Pros: theory and proofs from block ciphers.
Cons: cannot be generalized.
Examples: ARIRANG, BLAKE, Chi, CRUNCH, DynamicSHA2, JH, Lesamnta, Sarmal, SIMD, Skein, TIB3.

4.3 Final Output Transformation

A method used in some of the hash functions to prevent the length-extension attack.
Pros: helps to prove properties and serves as a countermeasure against the length-extension attack.
Cons: two different transformations are needed (compression function and output transformation).
Examples: Cheetah, Chi, CRUNCH, ECHO, ECOH, Grøstl, Keccak, LANE, Luffa, Lux, Skein, Vortex.
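The Feistel round from Section 4.2 can be sketched in Python as follows. The round function f below is an arbitrary toy function chosen only for illustration (it is not taken from any candidate), and the names feistel_encrypt and feistel_decrypt are hypothetical. The sketch is meant to show the key property: the network as a whole is invertible even though f itself is not.

```python
MASK32 = 0xFFFFFFFF

def f(right: int, subkey: int) -> int:
    """Toy, non-invertible round function on 32-bit words (illustrative only)."""
    return ((right * right) + subkey) & MASK32

def feistel_encrypt(left: int, right: int, subkeys):
    """Balanced Feistel network: L_i = R_{i-1}, R_i = L_{i-1} XOR f(R_{i-1}, K_i)."""
    for k in subkeys:
        left, right = right, left ^ f(right, k)
    return left, right

def feistel_decrypt(left: int, right: int, subkeys):
    """Undo the rounds in reverse order; f is only ever evaluated forwards."""
    for k in reversed(subkeys):
        left, right = right ^ f(left, k), left
    return left, right

if __name__ == "__main__":
    keys = [1, 2, 3, 4]
    c = feistel_encrypt(0x01234567, 0x89ABCDEF, keys)
    print([hex(x) for x in c], [hex(x) for x in feistel_decrypt(*c, keys)])
```

Both halves are 32 bits wide here, so this is a balanced network in the terminology above.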
4.4 Message expansion

A method for preparing the message blocks to be an input for the steps of the compression function, similar to key expansion in block ciphers.
Pros: theory from block ciphers, where it is known as key expansion.
Cons: cannot be generalized.
Examples: ARIRANG, BLAKE, Cheetah, Chi, CRUNCH, ECOH, Edon-R, Hamsi, Khichidi-1, LANE, Lesamnta, SANDstorm, Shabal, SHAvite-3, SIMD, Skein, TIB3.

4.5 S-box

Used for substitution to obscure the relationship between the key (message block) and the ciphertext (value of the intermediate chaining variable). Because of the widespread use and well-known properties of AES, the majority of hash functions submitted to the first round use S-boxes from AES.
Pros: theory from block ciphers (key expansion), speed in HW.
Cons: often implemented as look-up tables, which can be viewed as a door to possible side-channel attacks.
Examples: Cheetah, Chi, CRUNCH, ECHO, ECOH, Grøstl, Hamsi, JH, Khichidi-1, LANE, Lesamnta, Luffa, Lux, SANDstorm, Sarmal, SHAvite-3, SWIFFTX, TIB3. (33 out of 51 candidates use S-boxes.)

4.6 Wide Pipes

A countermeasure to prevent multi-collisions and multi-preimages of the Joux type [8]. A wide-pipe design means that the intermediate chaining variable is kept longer than the hash output, e.g. 512 bits for a 256-bit hash.
Pros: prevents multi-collisions.
Cons: it is more complex and less efficient to produce a chaining variable of double length that still has the good properties of a chaining variable.
Examples: ARIRANG, BMW, Chi, ECHO, Edon-R, Grøstl, JH, Keccak, Lux, MD6, SIMD.

4.7 MDS Matrices

Good diffusion properties in the theory of block ciphers are often achieved by using Maximum Distance Separable (MDS) matrices. These matrices might also be helpful in the design of hash functions.
Pros: mathematical background and proven diffusion properties.
Cons: memory requirements.
Examples: ARIRANG, Cheetah, ECHO, Fugue, Grøstl, JH, LANE, Lux, Sarmal, Vortex.
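The diffusion property of an MDS matrix can be illustrated with the MixColumns matrix of AES, a well-known 4x4 MDS matrix over GF(2^8). It is used here only as a familiar example, not as a description of any particular candidate, and the helper names are hypothetical. Because the branch number of this matrix is 5, an input column with a single nonzero byte always yields an output column in which all four bytes are nonzero, which the sketch below checks exhaustively.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce back to 8 bits
    return result

# The AES MixColumns matrix (circulant, first row 2 3 1 1).
MIX_COLUMNS = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]

def mix_column(column):
    """Multiply a 4-byte column by the MixColumns matrix over GF(2^8)."""
    out = []
    for row in MIX_COLUMNS:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

def min_output_weight_for_single_byte_inputs() -> int:
    """Smallest number of nonzero output bytes over all inputs with one nonzero byte."""
    best = 4
    for position in range(4):
        for value in range(1, 256):
            column = [0, 0, 0, 0]
            column[position] = value
            weight = sum(1 for b in mix_column(column) if b != 0)
            best = min(best, weight)
    return best

if __name__ == "__main__":
    print(min_output_weight_for_single_byte_inputs())
```

A table-driven implementation would be faster but needs more memory, which matches the drawback listed above.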
4.8 Tree structure

A tree structure of hashing is an intuitive approach which takes advantage of parallelism from independent compression function threads and serves as a countermeasure against current attacks on the Merkle-Damgård construction.
Pros: parallelism, resistance against current attacks on SHA-1 and MD5.
Cons: memory requirements and "modes" of operation.
Example: MD6.

4.9 Sponge structure

A sponge works by "absorbing" the message and then "squeezing" the state to produce an output.
Absorbing works as follows:
• Initialize state
• XOR some of the message into the state
• Apply compression function
• XOR some more of the message into the state
• Apply compression function
…
Squeezing works as follows:
• Apply compression function
• Extract some output
• Apply compression function
• Extract some output
• Apply compression function
…
Examples: Keccak, Luffa.

4.10 Merkle-Damgård like structure

Structures very similar to the Merkle-Damgård construction of hash functions are still very popular. The Merkle-Damgård construction is shown in Figure 1; the suggested techniques use various ways of chaining intermediate variables or context.
Pros: known structure, speed.
Cons: previous attacks, multi-collisions and the length-extension attack have to be prevented.
Examples: ARIRANG, CRUNCH, Cheetah, Chi, LANE, Sarmal.

5 Conclusion

We have tried to present an up-to-date overview of the design of hash functions. We showed the traditional design techniques and presented some of the building blocks of the algorithms submitted to the SHA-3 competition, along with their pros and cons.

References

[1] National Institute of Standards and Technology: Cryptographic Hash Project. http://csrc.nist.gov/groups/ST/hash/index.html
[2] National Institute of Standards and Technology: SHA-3 First Round Candidates. http://csrc.nist.gov/groups/ST/hash/sha-3/Round1/submissions_rnd1.html
[3] Souradyuti Paul.
First SHA-3 conference organized by NIST. http://csrc.nist.gov/groups/ST/hash/sha3/Round1/Feb2009/documents/Soura_TunableParameters.pdf
[4] IAIK Graz: SHA-3 ZOO. http://ehash.iaik.tugraz.at/index.php?title=The_SHA3_Zoo&oldid=3035
[5] Daniel J. Bernstein and Tanja Lange (editors): eBACS: ECRYPT Benchmarking of Cryptographic Systems. http://bench.cr.yp.to, accessed 27 March 2009.
[6] Bert den Boer and Antoon Bosselaers: Collisions for the Compression Function of MD5. pp. 293-304. ISBN 3-540-57600-2.
[7] Ewan Fleischmann, Christian Forler and Michael Gorski: Classification of the SHA-3 Candidates. Cryptology ePrint Archive: Report 511/2008, http://eprint.iacr.org/, version 0.81, 16 February 2009.
[8] A. Joux: Multicollisions in iterated hash functions. Application to cascaded constructions. Proceedings of Crypto 2004, LNCS 3152, pages 306-316.
[9] Wang X., Yin Y. L., and Yu H.: Finding collisions in the full SHA-1. In Victor Shoup, editor, Advances in Cryptology - CRYPTO '05, volume 3621 of Lecture Notes in Computer Science, pages 17-36. Springer, 14-18 August 2005.
[10] Wang X. and Yu H.: How to Break MD5 and Other Hash Functions. In Ronald Cramer, editor, Advances in Cryptology - EUROCRYPT 2005, volume 3494 of Lecture Notes in Computer Science, pages 19-35. Springer, 2005.
