Birthday Attacks

Suppose that a 64-bit hash code is used. One might think that this is quite secure.
For example, if an encrypted hash code C is transmitted with the corresponding
unencrypted message M then an opponent would need to find an M' such that
H(M') = H(M) to substitute another message and fool the receiver. On average,
the opponent would have to try about 2^63 messages to find one that matches the
hash code of the intercepted message.
However, a different sort of attack is possible, based on the birthday paradox.
Yuval proposed the following strategy.
The source, A, is prepared to "sign" a message by appending the appropriate m-bit hash code and encrypting that hash code with A's private key.
1. The opponent generates 2^(m/2) variations on the message, all of which convey
essentially the same meaning. The opponent prepares an equal number of
messages, all of which are variations on the fraudulent message to be substituted
for the real one.
2. The two sets of messages are compared to find a pair of messages that
produces the same hash code. The probability of success, by the birthday
paradox, is greater than 0.5. If no match is found, additional valid and fraudulent
messages are generated until a match is made.
3. The opponent offers the valid variation to A for signature. This signature can
then be attached to the fraudulent variation for transmission to the intended
recipient. Because the two variations have the same hash code, they will produce
the same signature; the opponent is assured of success even though the
encryption key is not known.
Thus, if a 64-bit hash code is used, the level of effort required is only on the order
of 2^32.
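The strategy above can be tried at toy scale. The following is a minimal sketch, not Yuval's original procedure: it assumes a 24-bit hash obtained by truncating SHA-256, two invented example messages, and a simple variation scheme (one or two spaces in each word gap). The names toy_hash, variation and find_colliding_pair are hypothetical.

```python
import hashlib

M_BITS = 24  # toy hash length (assumption); real hash codes are far longer

# Hypothetical example messages, invented for this sketch.
VALID = "I agree to pay Bob the sum of ten dollars on the first day of next month as discussed"
FRAUD = "I agree to pay Bob the sum of ten thousand dollars on the first day of next month as discussed"


def toy_hash(message, m_bits=M_BITS):
    """Truncate SHA-256 to m_bits bits to stand in for a short hash code."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - m_bits)


def variation(words, index):
    """Build variation number `index`: bit k of index puts one or two spaces in gap k."""
    parts = [words[0]]
    for k, word in enumerate(words[1:]):
        gap = "  " if (index >> k) & 1 else " "
        parts.append(gap + word)
    return "".join(parts)


def find_colliding_pair(valid, fraud, m_bits=M_BITS):
    """Grow both sets of variations in step until a valid/fraudulent pair shares a hash."""
    valid_words, fraud_words = valid.split(), fraud.split()
    limit = 1 << (min(len(valid_words), len(fraud_words)) - 1)
    seen_valid, seen_fraud = {}, {}  # hash value -> first variation with that hash
    for i in range(limit):
        v, f = variation(valid_words, i), variation(fraud_words, i)
        hv, hf = toy_hash(v, m_bits), toy_hash(f, m_bits)
        seen_valid.setdefault(hv, v)
        seen_fraud.setdefault(hf, f)
        if hv in seen_fraud:
            return v, seen_fraud[hv], i + 1
        if hf in seen_valid:
            return seen_valid[hf], f, i + 1
    return None  # ran out of variations without a match


if __name__ == "__main__":
    result = find_colliding_pair(VALID, FRAUD)
    if result:
        v, f, tried = result
        print("match after", tried, "variations per set; 2^(m/2) =", 2 ** (M_BITS // 2))
        print(repr(v))
        print(repr(f))
```

With a 24-bit hash, a match is expected after a few thousand variations per set, which is the 2^(m/2) behaviour described above. A signature computed over the hash of the valid variation would verify equally well for the fraudulent one.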
The generation of many variations that convey the same meaning is not difficult.
For example, the opponent could insert a number of “space-space-backspace”
character sequences between words throughout the document. Variations could then
be generated by substituting “space-backspace-space” in selected instances.
Alternatively, the opponent could simply reword the message but retain the
meaning.
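The space and backspace trick can be sketched as follows. This is an illustration only: the function names are hypothetical, and render uses a simplified display model in which a backspace erases the previous character. A message with k word gaps yields 2^k variations that all display identically.

```python
from itertools import product

SPACE, BACKSPACE = " ", "\b"
FORM_A = SPACE + SPACE + BACKSPACE  # "space-space-backspace"
FORM_B = SPACE + BACKSPACE + SPACE  # "space-backspace-space"


def variations(message):
    """Yield every variation: each word gap independently uses FORM_A or FORM_B."""
    words = message.split()
    for choice in product((FORM_A, FORM_B), repeat=len(words) - 1):
        yield words[0] + "".join(gap + word for gap, word in zip(choice, words[1:]))


def render(text):
    """Simplified display model (assumption): a backspace erases the previous character."""
    out = []
    for ch in text:
        if ch == BACKSPACE:
            if out:
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    message = "pay Bob ten dollars"  # hypothetical example message
    all_variations = list(variations(message))
    print(len(all_variations), "distinct byte strings, all displayed as:", render(all_variations[0]))
```

Since the byte strings differ, their hash codes differ as well, which is all the opponent needs in order to search for a match.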
The conclusion to be drawn from this is that the length of the
hash code should be substantial.
To summarize, for a hash code of length m, the level of effort required, as we have
seen, is proportional to the following:
Preimage resistant: 2^m
Second preimage resistant: 2^m
Collision resistant: 2^(m/2)
If collision resistance is required (and this is desirable for a general-purpose
secure hash code), then the value 2^(m/2) determines the strength of the hash code
against brute-force attacks.
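The figures above can be checked numerically. The sketch below assumes hash values behave like independent uniform random m-bit numbers, and uses the standard approximation 1 - exp(-n^2 / 2^m) for the chance that two sets of n messages each contain a matching pair. The function names are hypothetical.

```python
import math


def brute_force_effort(m_bits):
    """Approximate number of hash evaluations for each brute-force attack on an m-bit hash."""
    return {
        "preimage": 2.0 ** m_bits,
        "second preimage": 2.0 ** m_bits,
        "collision": 2.0 ** (m_bits / 2),
    }


def cross_match_probability(n_each, m_bits):
    """Approximate chance that some message in one set of n_each messages
    hashes to the same value as some message in a second set of n_each."""
    return -math.expm1(-(n_each ** 2) / 2 ** m_bits)


if __name__ == "__main__":
    for m in (64, 128, 160):
        effort = brute_force_effort(m)
        p = cross_match_probability(2 ** (m // 2), m)
        print(m, "bits: collision effort about", effort["collision"], "match probability", round(p, 3))
```

With n = 2^(m/2) messages in each set the approximation gives 1 - 1/e, about 0.63, which agrees with the claim that the probability of success is greater than 0.5.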
Message Digest 5
History of MD5
• The MD5 message-digest algorithm is a widely used cryptographic hash
function producing a 128-bit (16-byte) hash value, typically expressed in
text format as a 32-digit hexadecimal number.
• MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash
function, MD4. The source code in RFC 1321 contains a "by attribution"
RSA license.
• In 1996 a flaw was found in the design of MD5. While it was not deemed a
fatal weakness at the time, cryptographers began recommending the use
of other algorithms, such as SHA-1, which has since been found to be
vulnerable as well.
• In 2004 it was shown that MD5 is not collision resistant. As such, MD5 is
not suitable for applications like SSL certificates or digital signatures that
rely on this property for digital security.
• MD5 is one in a series of message digest algorithms designed by Professor
Ronald Rivest of MIT (Rivest, 1992). When analytic work indicated that
MD5's predecessor MD4 was likely to be insecure, Rivest designed MD5 in
1991 as a secure replacement. (Hans Dobbertin did indeed later find
weaknesses in MD4.)
• In 1993, Den Boer and Bosselaers gave an early, although limited, result
of finding a "pseudo-collision" of the MD5 compression function; that is,
two different initialization vectors which produce an identical digest.
• In 1996, Dobbertin announced a collision of the compression function of
MD5 (Dobbertin, 1996). While this was not an attack on the full MD5 hash
function, it was close enough for cryptographers to recommend switching
to a replacement, such as SHA-1 or RIPEMD-160.
PROCESSING
• MD5 processes a variable-length message into a fixed-length output of
128 bits. The input message is broken up into 512-bit blocks
(sixteen 32-bit words).
• The message is padded so that its length is divisible by 512. The padding
works as follows: first a single bit, 1, is appended to the end of the
message. This is followed by as many zeros as are required to bring the
length of the message up to 64 bits fewer than a multiple of 512.
• The remaining bits are filled up with 64 bits representing the length of the
original message, modulo 2^64.
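The padding rule can be sketched for byte-oriented messages as follows. The function name md5_pad is hypothetical. The little-endian encoding of the 64-bit length is not stated above; it follows RFC 1321. Appending the byte 0x80 corresponds to appending a single 1 bit followed by seven 0 bits.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 64 bytes (512 bits) in the MD5 manner."""
    bit_length = (8 * len(message)) % 2 ** 64  # original length in bits, modulo 2^64
    padded = message + b"\x80"                 # a single 1 bit, then seven 0 bits
    # Zero bytes until the length is 8 bytes (64 bits) short of a multiple of 64 bytes.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # The final 64 bits hold the original length (little-endian, per RFC 1321).
    padded += bit_length.to_bytes(8, "little")
    return padded


if __name__ == "__main__":
    for n in (0, 3, 55, 56, 64):
        print(n, "message bytes ->", len(md5_pad(b"a" * n)), "padded bytes")
```

Note that a 56-byte message needs a second block: the 1 bit no longer fits together with the 64-bit length field in the first block.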
• The main MD5 algorithm operates on a 128-bit state, divided into four 32-bit words, denoted A, B, C, and D. These are initialized to certain fixed
constants. The main algorithm then uses each 512-bit message block in
turn to modify the state.
• The processing of a message block consists of four similar stages, termed
rounds; each round is composed of 16 similar operations based on a nonlinear function F, modular addition, and left rotation. There are four
possible functions F; a different one is used in each round:
F(B, C, D) = (B ∧ C) ∨ (¬B ∧ D)
G(B, C, D) = (B ∧ D) ∨ (C ∧ ¬D)
H(B, C, D) = B ⊕ C ⊕ D
I(B, C, D) = C ⊕ (B ∨ ¬D)
⊕, ∧, ∨ and ¬ denote the XOR, AND, OR and NOT operations respectively.
Figure: One MD5 operation. MD5 consists of 64 of these operations, grouped in four
rounds of 16 operations. F is a nonlinear function; one function is used in each
round. M_i denotes a 32-bit block of the message input, and K_i denotes a 32-bit
constant, different for each operation.
<<<s denotes a left bit rotation by s places;
s varies for each operation.
⊞ denotes addition modulo 2^32.