# K i

advertisement ```Hash Functions
Ramki Thurimella
What is a hash function?




Also known as message digest or fingerprint
Compression: A function that maps arbitrarily long
binary strings to fixed length binary strings
Ease of Computation : Given a hash function and an
input it should be easy to calculate the output
Cryptographic hash functions have more security
requirements compared to the ones used to create
hash tables
2
Applications

Data Integrity


Downloading of large files often include the MD5
digest to verify the integrity of the file
Proof of Ownership

Publish the digest of a document describing a
patent idea in the New York Times


Newspaper charges based on the number of characters
Digital Signatures

Digitally sign the digest of a message instead of
the message to save computational time
3
Properties

Collision resistance




For two distinct messages m1 and m2, h(m1) ≠ h(m2)
By pigeon-hole principle, this is not always possible since
the domain is infinite and the range is finite
The security requirement merely requires that for a h and
m1, some other m2 cannot be found easily, i.e.
computationally infeasible, such that both messages hash
to the same value
One-way property


They are used in pseudorandom number generators, e.g.
generate several keys from one shared secret key
Learning one value should not make learning other values
easy
4
Attacks





Ideally a hash function should be a random
mapping
Attack: A non-generic method of distinguishing the
hash function from an ideal hash function
Non-generic means that the attack exploits the
specific weaknesses of the hash function
Since there is no key, exhaustive search cannot be
performed
Generic attack: Birthday attack
5
Birthday Attack





Expected number of collisions after picking t
pairs:
tC * (1/2n) ≈ t2/2n
2
To get 1 collision, choose t = 2(n/2)
This attack is useful only for collisions
Sometimes a pre-image (given h(m) finding
m) is required, e.g. privacy scenario
Generic attack for pre-image requires 2n
work
6
A non-generic attack
• An insecure hash function
1. Pad m (specifics are not important)
2. Create 128-bit blocks m1,m2,…,mk
3. H0= 128-bit block of all zeros
4. Hi = AES ( Hi-1  mi)
5. Hk is the result
• Attack
– Pick m and assume it splits into m1 &amp; m2
– Let m1 = m1 (+) H1 and m2’ = H2(+) m2 (+) H1
– Assume that m1’ &amp; m2’ are the result of splitting m’
7
Non-Generic Attack (cont.)

Show that m &amp; m’ map to the same value
(Exercise 5.5, pp. 88)
8
The MD5 Function





Specifies the compression function to use
Processes 512-bit message blocks
Produces a 128-bit digest
Is derived from MD4 to address security
concerns
Created by Ronald Rivest and described in
RFC 1321
9
MD5
There are four possible
functions F; a different one
is used in each round:
http://en.wikipedia.org/wiki/File:MD5.svg
10
In the previous Figure






MD5 consists of 64 of these operations, grouped in
four rounds of 16 operations
F is a nonlinear function; one function is used in
each round
Mi denotes a 32-bit block of the message input
Ki denotes a 32-bit constant, different for each
operation
&lt;&lt;&lt;s denotes a left bit rotation
by s places; s varies for each operation
denotes addition modulo 232
11
The SHA-1 Function






Processes 512-bit message blocks
Produces a 160-bit digest
Is also derived from MD4
A slight modification of SHA-0 to fix a
“technical error”
Created by NIST with the aid of the NSA
Defined in FIPS PUB 180-2
12
SHA-1
http://en.wikipedia.org/wiki/File:SHA-1.svg
13
One iteration within the SHA-1
compression function




A, B, C, D and E are 32-bit words of the
state
F is a nonlinear function that varies
Wt is the expanded message word of round
t
Kt is the round constant of round t
14
Examples

% echo &quot;0000 0000&quot; | openssl dgst -sha512 | wc

MD5(M) = 0x93b885adfe0da089cdf634904fd59f71

MD5(M’) =



0x55a54008ad1ba589aa210d2629c1df41
SHA-1(M) = 0x5ba93c9db0cff93f52b521d7420e43f6eda2784f
SHA-1(M’)= 0xbf8b4530d8d246dd74ac53a13471bba17941dff7
On average changing 1 input bit will change 50% of
the output bits

Avalanche condition
15
Weaknesses of hash functions

Length extension




Let m = m1,m2,…,mk
Choose m’ = m1,m2,…,mk,mk+1
Notice that h(m’) = h’(h(m),mk+1)
Why is this a problem?



Alice sends Bob m
Authenticates by sending h(X || m) where X is a secret
known only to Bob
Eve can append to m and update the authentication
code
16
Current State of Hash
Functions

Researchers try to break hash functions in 1 of 2
ways







Finding two messages (usually single message blocks)
hash to the same digest
Find the message that creates the digest of all zeros or all
ones (essentially finding a preimage)
Collisions and preimages can be found for
MD4
Collisions can be constructed for MD5 and SHA-0
Theoretic collisions discovered for SHA-1
SHA-256 &amp; SHA-512 has no known attacks
17
```