The Birthday Paradox
July 2011
Definition
Birthday attacks are a class of brute-force techniques that target cryptographic hash functions. The goal is to find two different inputs to a hash function that produce the same output.
The Birthday Problem
What is the probability that at least two of k randomly selected people have the same birthday? (Same month and day, but not necessarily the same year.)
The Birthday Paradox
How large must k be so that the probability is greater than 50 percent?
The answer is 23
It is a paradox in the sense that a mathematical truth contradicts common intuition.
Birthday paradox in our class
What is the chance that at least two people in our class of 43 have the same birthday?
Approximate solution:
$$p \approx 1 - e^{-k^2/(2N)} = 1 - e^{-43^2/(2 \cdot 365)} \approx 0.92$$
where k = 43 people and N = 365 choices
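The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name `approx_shared` is chosen here for illustration and does not come from the slides.

```python
import math

def approx_shared(k: int, n: int = 365) -> float:
    """Approximate probability that at least two of k people share a birthday,
    using p ~ 1 - exp(-k^2 / (2n)) with n equally likely days."""
    return 1.0 - math.exp(-(k * k) / (2.0 * n))

# Class of 43 people, 365 possible birthdays
print(f"k = 43: p is about {approx_shared(43):.2f}")
```

Calling the function with other class sizes shows how quickly the probability climbs as k grows.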
Birthday Calendar Wall
Equivalence to our hashing space: each of the 365 days plays the role of one possible hash value
Jan 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Feb 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
Mar 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Apr 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
May 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Jun 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Jul 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Aug 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Sep 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Oct 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Nov 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Dec 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Calculating the Probability-1
Assumptions
Nobody was born on February 29
People's birthdays are equally distributed over the other 365 days of the year
Calculating the Probability-2
In a room of k people
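q: the probability that all k people have different birthdays
$$q = \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdot \frac{362}{365} \cdots \frac{365 - k + 1}{365} = \frac{365!}{(365 - k)!\,365^k}$$
p: the probability that at least two of them have the same birthday
$$p = 1 - q \ge 0.5 \;\Rightarrow\; k \ge 23$$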
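The exact product for q can be evaluated directly. The sketch below assumes the two slide assumptions (no February 29, uniform birthdays); the helper names `prob_all_distinct` and `smallest_group` are illustrative only.

```python
def prob_all_distinct(k: int, days: int = 365) -> float:
    """q = (days/days) * ((days-1)/days) * ... * ((days-k+1)/days)."""
    q = 1.0
    for i in range(k):
        q *= (days - i) / days
    return q

def smallest_group(target: float = 0.5, days: int = 365) -> int:
    """Smallest k for which p = 1 - q exceeds the target probability."""
    k = 1
    while 1.0 - prob_all_distinct(k, days) <= target:
        k += 1
    return k

print("smallest k with p > 0.5:", smallest_group())
print(f"p for k = 22: {1.0 - prob_all_distinct(22):.4f}")
print(f"p for k = 23: {1.0 - prob_all_distinct(23):.4f}")
```

The loop multiplies one factor per person, so it avoids the very large factorials in the closed form.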
Calculating the Probability-3
[Figure: "Shared Birthday Probabilities" (The Birthday Problem). Probability of a shared birthday, from 0% to 100%, plotted against Number of People, from 1 to 100.]
Collision Search-1
For collision search, select distinct inputs x_i for i = 1, 2, ..., k and check for a collision among the h(x_i) values, where n is the number of possible hash values
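The probability that no collision is found after selecting k inputs is
$$p_{\text{no collision}} = \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\left(1 - \frac{3}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)$$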
(In the birthday paradox, k is the number of randomly selected people, a collision is a shared birthday, and n = 365.)
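The collision search described above can be sketched on a deliberately small hash space. The example below is an illustration under stated assumptions, not a method from the slides: it truncates SHA-256 to a hypothetical `bits`-bit value, and it uses counter values as the distinct inputs, relying on the hash output behaving as if it were random.

```python
import hashlib

def truncated_hash(data: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256, giving n = 2**bits values."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int):
    """Hash distinct inputs x_1, x_2, ... until two share a truncated hash.

    Returns (first_input, second_input, number_of_inputs_hashed).
    Terminates after at most 2**bits + 1 inputs by the pigeonhole principle.
    """
    seen = {}  # maps truncated hash value -> input that produced it
    i = 0
    while True:
        x = i.to_bytes(8, "big")  # distinct inputs: an 8-byte counter
        h = truncated_hash(x, bits)
        if h in seen:
            return seen[h], x, i + 1
        seen[h] = x
        i += 1

a, b, tries = find_collision(16)
print(f"collision after {tries} inputs: {a.hex()} and {b.hex()}")
```

With a 16-bit hash there are n = 65536 possible values, so the number of inputs needed should be on the order of a few hundred rather than tens of thousands.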
Collision Search-2
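For large n, use the approximation $1 - x \approx e^{-x}$ when x is small, for example $1 - \frac{1}{n} \approx e^{-1/n}$:
$$p_{\text{no collision}} = \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \approx e^{-1/n}\, e^{-2/n} \cdots e^{-(k-1)/n}$$
$$= e^{-(1 + 2 + 3 + \cdots + (k-1))/n} = e^{-k(k-1)/(2n)}$$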
Collision Search-3
When k is large, the percentage difference between k and k-1 is small, and we may approximate $k - 1 \approx k$.
$$p_{\text{no collision}} \approx e^{-k(k-1)/(2n)} \approx e^{-k^2/(2n)}$$
$$p_{\text{at least one collision}} \approx 1 - e^{-k^2/(2n)}$$
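To get a feel for how much the two approximation steps cost, the exact product can be compared numerically with both exponential forms. This is a rough sketch; the function names are illustrative, and the sample values of k and n are chosen here, not taken from the slides.

```python
import math

def p_no_collision_exact(k: int, n: int) -> float:
    """Exact product (1 - 1/n)(1 - 2/n)...(1 - (k-1)/n)."""
    p = 1.0
    for i in range(1, k):
        p *= 1.0 - i / n
    return p

def p_no_collision_approx(k: int, n: int) -> float:
    """First approximation: exp(-k(k-1) / (2n))."""
    return math.exp(-k * (k - 1) / (2.0 * n))

def p_no_collision_coarse(k: int, n: int) -> float:
    """Second approximation, with k - 1 replaced by k: exp(-k^2 / (2n))."""
    return math.exp(-(k * k) / (2.0 * n))

for k, n in [(23, 365), (300, 2**16)]:
    print(f"k={k}, n={n}: exact={p_no_collision_exact(k, n):.4f}, "
          f"approx={p_no_collision_approx(k, n):.4f}, "
          f"coarse={p_no_collision_coarse(k, n):.4f}")
```

The three values should agree more closely as n grows, which is the regime that matters for hash functions.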
Collision Search-4
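$$p \approx 1 - e^{-k^2/(2n)} \;\Rightarrow\; e^{-k^2/(2n)} = 1 - p \;\Rightarrow\; e^{k^2/(2n)} = \frac{1}{1 - p}$$
$$\frac{k^2}{2n} = \ln\!\left(\frac{1}{1 - p}\right) \;\Rightarrow\; k = \sqrt{2n \cdot \ln\!\left(\frac{1}{1 - p}\right)}$$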
For the birthday case, the value of k that makes the probability closest to 1/2 is 23:
$$k = \sqrt{2n \ln 2} \approx 1.1774\sqrt{n} = 1.1774 \cdot \sqrt{365} \approx 22.49$$
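The closed form for k is easy to evaluate for any target probability. The sketch below assumes the approximation derived above; `inputs_for_probability` is a name chosen for illustration.

```python
import math

def inputs_for_probability(p: float, n: float) -> float:
    """Approximate number of inputs k giving collision probability p
    among n equally likely values: k = sqrt(2 n ln(1 / (1 - p)))."""
    return math.sqrt(2.0 * n * math.log(1.0 / (1.0 - p)))

print(f"birthday case, p = 0.5: k is about {inputs_for_probability(0.5, 365):.2f}")
print(f"constant sqrt(2 ln 2) is about {math.sqrt(2.0 * math.log(2.0)):.4f}")
```

Rounding the result up to the next whole person gives the familiar answer of 23.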
Attack Prevention
The important property is the length in bits of the message digest produced by the hash function.
For an m-bit hash, the cardinality n of the hash function's output space is $n = 2^m$
For a 0.5 probability of collision with an m-bit hash, the expected number of operations k before finding a collision is very close to $k \approx \sqrt{n} = 2^{m/2}$
m should be large enough that computing about $2^{m/2}$ hash values is not feasible!
Q&A