$ time

advertisement
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)
“Chinese” Attacks on Hashes
1. Background
A hash function is a function that takes a variable-size input and returns a fixed-size
string, which is called the hash value. The hash value is relatively easy to compute for
any given input. A hash function is one-way and collision-free, which are key properties
for our topic.
MD4 is a hash function developed in 1990. It is based on the basic arithmetic and logical
operations. Since its publication, several other hash functions have been designed using
MD4 as the basis, including MD5, HAVAL, RIPEMD, SHA-0, SHA-1, SHA-256, etc.
These hash functions all follow the same design principal as well as have similar
structures as MD4. These hash functions play very important roles in the digital
signatures, data integrity, and many other cryptographic protocols. They not only ensure
the information security, but also improve the efficiency. Among them, MD5 and SHA-1
are most widely used today.
Since the appearance of all these hash functions, people have been trying to develop
techniques to perform collision search attacks on them. Developing new hash functions
and breaking them are like two sides of a coin. They both help people develop better hash
functions to better serve their purposes in the cryptographic world.
There have been significant developments in the area in the past several years. However,
the real breakthroughs came in 2004 and 2005. Dr. Xiaoyun Wang and her Chinese crew
published a series of papers to break MD4, MD5, HAVAL, RIPEMD, and SHA-1 [2-7].
Although A. Joux [8] broke SHA-0 with the time complexity for finding a collision in
about 2 51 SHA-0 operations, their result was significantly better.
Below are the best results by “Chinese” attacks on those hash functions:






The time complexity for finding a collision for MD4 is about 2 8 MD4 operations
[5].
The time complexity for finding a collision for MD5 is to find the first blocks
with about 2 39 MD5 operations, and the second blocks with about 2 32 MD5
operations [6].
The time complexity for finding a collision for HAVAL-128 is about 2 7 HAVAL128 operations [7].
The time complexity for finding a collision for RIPEMD is about 2 18 RIPEMD
operations [5].
The time complexity for finding a collision for SHA-0 is about 2 39 SHA-0
operations [1].
The time complexity for finding a collision for SHA-1 is about 2 69 SHA-1
operations announced in [2] and later on, the result was improved to about
2 63 SHA-1 operations [9].
1
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)
2. “Chinese” Collision attacks
The “Chinese” collision attacks on hash functions are precise differential attacks in which
the differential path is more restrictive since it depends on the message difference as well
as the specific values of the message bits involved. They do not use the exclusive-or as a
measure of difference, but instead use modular integer subtraction as the measure. It’s
called a modular differential by the Chinese crew.
The attacks have some principles applicable to all MD4 family hash functions. Such
attacks first pick a favorable message differential between two messages M and M’ such
that these two messages have a higher probability of having the same hash value.
Depending on the message differential, some number of probabilistic conditions must be
met. It uses a technique called “message modification” to eliminate some of these
conditions which appear in the early stages of the hash compression function. This
reduction leads to a better overall complexity for the collision attack.
Basically, the attacks include three steps:
1) Find a collision differential for which M and M’ probably produce a collision.
Taking MD4 as an example,
M = M’ −M = (m 0 ,m 1 , ...... ,m 15 )
m 1 = 2 31 , m 2 = 2 31 − 2 28 , m 12 = −2 16
mi = 0, 0 ≤ i ≤ 15, i  1, 2, 12.
The reason why the collision differential is selected is not clearly specified in the
article [5], but it’s stated in [7] that based on significant amount of analysis, it’s
believed to be fairly easy for M and M’ to produce a collision with the collision
differential chosen.
2) Derive a set of sufficient conditions which ensure the collision differential to hold.
For example, the MD4 compression function has three rounds. Each round uses a
different nonlinear Boolean function defined as follows:
F(X, Y, Z) = (X  Y)  (X  Z)
G(X, Y, Z) = (X  Y)  (X  Z)  (Y  Z)
H(X, Y, Z) = X  Y  Z
Some properties of these three nonlinear Boolean functions are very helpful for
determining sufficient conditions for the differential paths that are used in the
collision search attack on MD4. The sufficient conditions that ensure all the
characteristics to hold can be verified by these properties. Refer to [5] for details.
2
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)
3) For any random message M, make some modification to M such that almost all
the sufficient conditions hold. This is done by two types of message modification
techniques, which are termed as “single-step modification” and “multi-step
modification”.
Taking MD4 as an example, it is believed that the first two rounds of MD4 is not
one-way, and for the single-step modification, it is to modify M such that all the
conditions in round 1 hold. For the multi-step modification, it is to modify M so
that some bits of M get changed in round 2 to fulfill more conditions while all the
conditions in round 1 remain hold. This greatly improves the probability that M
and M’ may produce a collision. Refer to [5] for details.
3. Results for MD4 and MD5 attacks
Practically, it's computationally feasible to apply “Chinese” attacks on MD4 and MD5
hash functions by employing the computational power of a personal computer. I have C
programs implementing the MD4 algorithm in [5] and MD5 algorithm in [6]. They are
running under the Unix/Linux environment. I used cygwin to mimic the Linux
environment on my WinXP system on a Pentium4 3.40G machine. I used gcc to compile
the programs and used -Os and appropriate -mcpu and -mtune flags for my machine to
achieve the optimal performance.
Below are two runs for each MD4 and MD5 program respectively. As it shows, it takes
about 5 seconds to find a collision for MD4 attacks and about 1 hour for MD5 attacks.
These results clearly demonstrate that both MD4 and MD5 are severely broken and thus
should not be used any longer.
$ time ./md4.exe
unsigned int m0[16] = {
0x45051308, 0x81730a12, 0xd56ad03c, 0x71628a68,
0xa54f00e7, 0xb32a6311, 0x0e13c786, 0xb48eae4b,
0x4656581e, 0x18a6deab, 0x9b50d7b2, 0x0cfc6be7,
0xb42bdf1e, 0x814dcfbb, 0xb776931d, 0xb27bcba6
};
unsigned int m1[16] = {
0x45051308, 0x01730a12, 0x456ad03c, 0x71628a68,
0xa54f00e7, 0xb32a6311, 0x0e13c786, 0xb48eae4b,
0x4656581e, 0x18a6deab, 0x9b50d7b2, 0x0cfc6be7,
0xb42adf1e, 0x814dcfbb, 0xb776931d, 0xb27bcba6
};
real 0m4.506s
user 0m4.496s
sys 0m0.020s
3
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)
$ time ./md4.exe
unsigned int m0[16] = {
0xcbe53b38, 0x5032eef7, 0xb019844f, 0xebd1e372,
0x98d286ff, 0x76430bc9, 0xd2a8b026, 0xc2c5c353,
0x6d4d2c65, 0xa011e2ac, 0x61eec40e, 0x434b154e,
0xebafa851, 0xa9601efa, 0xc48a9a59, 0x578bbb57
};
unsigned int m1[16] = {
0xcbe53b38, 0xd032eef7, 0x2019844f, 0xebd1e372,
0x98d286ff, 0x76430bc9, 0xd2a8b026, 0xc2c5c353,
0x6d4d2c65, 0xa011e2ac, 0x61eec40e, 0x434b154e,
0xebaea851, 0xa9601efa, 0xc48a9a59, 0x578bbb57
};
real 0m5.467s
user 0m5.477s
sys 0m0.010s
$ time ./md5.exe
block #1 done
block #2 done
unsigned int m0[32] = {
0x1929aa4b, 0xd8844d17, 0x8e5e1527, 0x34b458d0,
0xacc2f035, 0xdbe6e5f1, 0x234533f1, 0x6c716baf,
0x05352d45, 0x61f48bfd, 0x769ae8c3, 0x8fda4316,
0x754098e2, 0x8c4005d8, 0xc26ca7b4, 0x22f12708,
0x06dea041, 0xe664ec4e, 0x1d72b3a0, 0x03bdc431,
0x47d0fc1c, 0x4c7bdc4e, 0x76648928, 0xbea20bd6,
0x5079c739, 0x4dae799f, 0xbbd34dfa, 0xdc019c7f,
0x9d18d4e3, 0x253ba683, 0xed5a754e, 0x5a8cd41a,
};
unsigned int m1[32] = {
0x1929aa4b, 0xd8844d17, 0x8e5e1527, 0x34b458d0,
0x2cc2f035, 0xdbe6e5f1, 0x234533f1, 0x6c716baf,
0x05352d45, 0x61f48bfd, 0x769ae8c3, 0x8fdac316,
0x754098e2, 0x8c4005d8, 0x426ca7b4, 0x22f12708,
0x06dea041, 0xe664ec4e, 0x1d72b3a0, 0x03bdc431,
0xc7d0fc1c, 0x4c7bdc4e, 0x76648928, 0xbea20bd6,
0x5079c739, 0x4dae799f, 0xbbd34dfa, 0xdc011c7f,
0x9d18d4e3, 0x253ba683, 0x6d5a754e, 0x5a8cd41a,
};
real 68m56.807s
user 68m42.968s
sys 0m0.070s
4
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)
$ time ./md5.exe
block #1 done
block #2 done
unsigned int m0[32] = {
0x5c625d50, 0x7c27ef85, 0x24835757, 0x9b4b0d82,
0xff777f20, 0x0a0777b0, 0xe2ff34b4, 0xd3302c20,
0x0534ad31, 0x2033e5f7, 0x7fbadc53, 0x12c25f81,
0x5edab51a, 0x746c590d, 0xad958bb0, 0xfbe53434,
0x70fd8d2f, 0x2e073782, 0x22c1af9c, 0xe4b8fb96,
0xc53a1137, 0x0e0f9f6a, 0x66dd690e, 0x8f422950,
0x9e36a841, 0x05f2ddb9, 0x6aa211be, 0x9c429c6b,
0x5a3c7cea, 0x661fa395, 0x1668028a, 0x7c61522d,
};
unsigned int m1[32] = {
0x5c625d50, 0x7c27ef85, 0x24835757, 0x9b4b0d82,
0x7f777f20, 0x0a0777b0, 0xe2ff34b4, 0xd3302c20,
0x0534ad31, 0x2033e5f7, 0x7fbadc53, 0x12c2df81,
0x5edab51a, 0x746c590d, 0x2d958bb0, 0xfbe53434,
0x70fd8d2f, 0x2e073782, 0x22c1af9c, 0xe4b8fb96,
0x453a1137, 0x0e0f9f6a, 0x66dd690e, 0x8f422950,
0x9e36a841, 0x05f2ddb9, 0x6aa211be, 0x9c421c6b,
0x5a3c7cea, 0x661fa395, 0x9668028a, 0x7c61522d,
};
real 56m53.298s
user 56m52.757s
sys 0m0.030s
4. What does it mean and what to do about it?
Well, it means that hash functions such as MD5 are no longer useful as digital signature
hashes. It is no longer the case that you can believe that a person’s signed document is
identical to your version of that document, even if the checksum matches.
SHA-1 is the best we’ve got at the moment and has the time complexity of the order 2 63 ,
which appears to be best standing against collision attacks. While this seems like a huge
number, distributed searches using many computers across the Internet have solved
problems that were twice as complex. In other words, there exist large collections of
computers distributed over the Internet that are capable of finding SHA-1 collisions.
To exploit a collision attack, an adversary would typically begin by constructing two
messages with the same hash where one message appears legitimate. The attacker could
then try to get you to digitally sign the legitimate message. He would then claim that you
actually signed the malicious message, and prove this claim by showing that your
signature matches the malicious message. There is a similar concern for systems involved
5
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)
with the signed code and certificates that an adversary might be able to construct a valid
signature or certificate request that had a corresponding hash collision with a malicious
signature or certificate request.
In terms of practical security, the major concern about these new attacks is that it might
lead to more efficient attacks and thus a migration to stronger hashes is believed to be
mandatory. These attacks broke the strong collision resistance, not pre-image (one-way)
resistance though. That means that it is still infeasible for an attacker to generate a
particular input to a hash function that is guaranteed to produce a particular output.
Because of this, many of the applications that use cryptographic hashes, such as HMACrelated protocols, password storage or document signing, are only minimally affected by
the collision attacks. In the case of document signing, for example, an attacker could not
simply fake a signature from an existing document ─ the attacker would have to fool the
private key holder into signing a pre-selected document. Reversing password encryption
is not made possible by the attacks either. Constructing a password that works for a given
account requires a pre-image attack.
Practically, these collision attacks suggest the acceleration of upgrading systems that use
hash functions. Three viable approaches for improving the application security are:
1) Replace the hash function with a stronger one. The most commonly suggested
approach is to simply employ SHA-2 hash functions, possibly truncating the
output to 160 bits for backward compatibility with SHA-1 in which case extra
special care must be taken to prevent a “man in the middle” attack from
downgrading a SHA-2 session into a more vulnerable SHA-1 session.
Moreover, all new applications and protocols must be designed to have better
collision resistance. These applications and protocols need to be able to
accommodate new hash standards as they are developed.
2) Alter the protocol so that it no longer requires that the hash function be collision
resistant. A recent proposal suggests adding randomness to hash functions [11].
To implement this, the application must have a good source of randomness and
must alter the protocol.
3) Implement simple message pre-processing to convert plaintext messages into a
form that makes all existing collision attacks inapplicable. This approach can be
accomplished with minimal code change [12]. This practical alternative is
appealing for applications which want to extend the secure life of SHA-1.
The bottom line:




Don’t use MD4.
Don’t use MD5.
Don’t use HAVAL.
Don’t use RIPEMD.
6
CSE P 590 TU: Practical Aspects of Modern Cryptography Winter 2006 Project
Bing Wu (9735592)





Don’t use SHA-0.
While SHA1 is showing signs of significant problems in some areas, SHA1
remains stronger than others. However, avoid using SHA-1 if possible because it
is next up to be cracked.
Use SHA-2 hash functions for now and wait for more collision-resistant hash
functions. The SHA-2 standard is currently resisting known SHA-1 attacks.
Theoretical attacks against SHA-2 hash functions may take a few years to turn
into practical attacks. However, they are all potentially vulnerable as well because
of their same design principal as other MD4 family hash functions.
VSH is about the best generally published hash function [10], but it needs to have
more peer review before it can be seriously considered.
Use alternative approaches described in [11-12].
5. Conclusion
The “Chinese” attacks on hashes are remarkable in the cryptographic area. It makes
people eagerly upgrade their systems to employ better hash functions as well as develop
new and more collision-resistant hash functions to better serve their cryptographic
purposes. This will greatly help us achieve a more secure digital world.
References
1) Xiaoyun Wang, Hongbo Yu, Yiqun Lisa Yin, Efficient Collision Search Attacks on
SHA-0, Crypto'05.
2) Xiaoyun Wang, Yiqun Lisa Yin, Hongbo Yu, Finding Collisions in the Full SHA-1,
Crypto'05.
3) Xiaoyun Wang, Yiqun Lisa Yin, Hongbo Yu, Collision Search Attacks on SHA1,
2005.
4) Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu, Collisions for Hash
Functions MD4, MD5, HAVAL-128, RIPEMD, Crypto'04.
5) Xiaoyun Wang, Xuejia Lai, Dengguo Feng, Hui Chen, Xiuyuan Yu, Cryptanalysis of
the Hash Functions MD4 and RIPEMD, Eurocrypto’05.
6) Xiaoyun Wang, Hongbo Yu, How to Break MD5 and Other Hash
Functions, Eurocrypto’05.
7) Xiaoyun Wang, Dengguo Feng, Xiuyuan Yu, An Attack on Hash Function HAVAL128, Science in China Series E.
8) A. Joux, Collisions for SHA-0, Rump session of Crypto’04, August 2004.
9) Xiaoyun Wang, Andrew Yao, Frances Yao, New Collision Search for SHA-1, Rump
session of Crypto’05, August 2005.
10) Scott Contini, Arjen K. Lenstra, Ron Steinfeld, VSH, an efficient and provable
collision resistant hash function, Rump session of Crypto’05, August 2005.
11) S. Halevi, and H. Krawczyk, Strengthening Digital Signatures via Randomized
Hashing, Internet Draft, 2005.
12) Szydlo M. and Yin Y., Collision-Resistant usage of MD5 and SHA-1 via Message
Preprocessing, IACR Eprint archive 2005 #248.
7
Download