The maths behind RSA
Why, thanks to the beauty of prime numbers, RSA is secure (at least for the moment)

R. Hayden (rh@doc.ic.ac.uk)
Advanced Maths Lectures, Department of Computing, Imperial College London
February 2009

Outline: Introduction; RSA; Security and implementation

Section: Introduction (Trapdoor one-way functions)

Public key cryptography

- Asymmetric cryptography uses two keys:
  - Public key: widely distributed
  - Private key: kept secret by its user
- The keys are mathematically related, but neither the cleartext nor the private key is, hopefully, practically computable given just the public key

Trapdoor one-way functions

- Nice idea, but how can we implement such a scheme?
- We need a function which is:
  - Easy to compute
  - Hard to invert without special information
  - Easy to invert with the special information
- Can you think of any?
- Let's start by considering just functions whose inverses are hard to compute (one-way functions) ...

The discrete logarithm problem

- What about, for some $a \in \mathbb{R}$, $f : x \mapsto a^x$? Its inverse is a logarithm, and
  $\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots$
  is easy to compute, so $f$ is not one-way
- For a discrete group $(G, *)$, the discrete logarithm $\log_b(g)$ for $b, g \in G$ is the least $k \in \mathbb{Z}_{\ge 0}$ with:
  $g = b^k := \underbrace{b * \ldots * b}_{k \text{ times}}$
- In cyclic groups, $\exists b \in G \; \forall g \in G \; \exists k \in \mathbb{Z}_{\ge 0} \; [g = b^k]$

The group $\mathbb{Z}_p^\times$

- $\mathbb{Z}_p^\times := (\{1, \ldots, p - 1\}, \times_p)$
- Multiplication modulo $p$, e.g. $p = 7$:

    ×7 | 1 2 3 4 5 6
    ---+------------
     1 | 1 2 3 4 5 6
     2 | 2 4 6 1 3 5
     3 | 3 6 2 5 1 4
     4 | 4 1 5 2 6 3
     5 | 5 3 1 6 4 2
     6 | 6 5 4 3 2 1

- Is it a group? What were the requirements of a group again?
- Closure: if $1 \le a, b \le p - 1$ then by definition $0 \le (ab \bmod p) \le p - 1$. If $ab \equiv 0 \pmod{p}$, then $p$ divides $ab$ and thus, $p$ being prime, $p$ divides $a$ or $b$, a contradiction. So $1 \le (ab \bmod p) \le p - 1$.
- Associativity: inherited from ordinary multiplication of integers, which is associative.

DLP (2)

- Thought to be hard, i.e. no polynomial-time algorithm has been found; the naïve approach takes exponential time
- Assuming hardness, DLP is one-way, but is there a way of introducing a trapdoor?
- Yes: we will see the Elgamal scheme, specifically in the case of subgroups of the group of points on an elliptic curve over a finite field (next week)
- This week, we focus on a different but related trapdoor one-way problem, RSA

Section: RSA (Definition; Subgroups and Lagrange's theorem; Fermat's little theorem)

RSA

- First public key algorithm which also works for signing
- Discovered in 1973 by Clifford Cocks, a mathematician working at GCHQ, the UK intelligence agency. It was top secret, published only internally, and revealed in 1997
- First publicly described in 1977 by Ron Rivest, Adi Shamir and Leonard Adleman (independently discovered)

Key generation

- Each user chooses two primes $p$ and $q$ and computes $n = pq$ and $\sigma = (p - 1)(q - 1)$
- Discard $p$ and $q$
- Choose $e$ and $d$ such that $ed \equiv 1 \pmod{\sigma}$
- Public key: $(e, n)$
- Private key: $d$

Message representation

- Represent the message to be encrypted as an integer $m < n$, e.g. for ASCII, interpret the message as a number in base 256
- Split the message into chunks if necessary, but usually we just encrypt a key for a symmetric algorithm
- Also, $m$ should be coprime to $n$ (we will see why). Only $p + q - 1$ of the integers $0 \le m < n$ are not coprime to $n$: $0, p, 2p, \ldots, (q - 1)p, q, 2q, \ldots$
, $(p - 1)q$
- Their proportion is: $\frac{p + q - 1}{pq} \approx \frac{1}{p} + \frac{1}{q}$
- Can just add padding characters if necessary

Encryption and decryption

- If the message is $m$, compute the ciphertext $c$ as: $c \equiv m^e \pmod{n}$
- The message can be recovered (decrypted) by computing: $c^d \equiv m \pmod{n}$
- Can a computer calculate modular exponents quickly?
- To find $a^b \bmod c$, expand $b$ in base 2, e.g. $b = 1493 = 1024 + 256 + 128 + 64 + 16 + 4 + 1$
- And use $a^{2^k} = (a^{2^{k-1}})^2 \bmod c$: compute the repeated squares, then multiply together those that correspond to the binary digits set in $b$

Why does decryption work?

- We need to show $c^d \equiv m \pmod{n}$. We will use:
  Theorem (Fermat's Little Theorem). $a^{p-1} \equiv 1 \pmod{p}$ for any prime $p$ and integer $1 \le a \le p - 1$.
- We will give an elegant, group-theoretic proof. Note that we're working in the group $\mathbb{Z}_p^\times$ we saw earlier.
- We need a bit more group theory first though...

Subgroups: definition

Definition (Subgroup). If $(G, *)$ is a group and $H \subseteq G$, we say $(H, *|_H)$ is a subgroup if it is also a group itself.

Subgroups: examples

Using the multiplication table modulo 7 from earlier, the following are all subgroups of $\mathbb{Z}_7^\times$:

- $\{1, 6\}$
- $\{1, 2, 4\}$

These are not:

- $\{1, 5\}$ (not closed: $5 \times_7 5 = 4$)
- $\{1, 2, 4, 6\}$ (not closed: $2 \times_7 6 = 5$)

Subgroup generated by an element

For any $g \in G$, consider the subset: $\langle g \rangle := \{1, g, g^2, \ldots$
$, g^{k-1} : g^k = 1,\ g^j \neq 1 \text{ for } 1 \le j \le k - 1\}$

- Such a $k$ always exists (the group is finite), because eventually some $g^a = g^b$ for $a > b$; then $g^a = g^{a-b} \cdot g^b = g^{a-b} \cdot g^a \Rightarrow g^{a-b} = 1$.
- Any $g^n$ is contained in here because $n = mk + r$ for $0 \le r < k$, so $g^n = g^r$. So $\langle g \rangle$ is the set of all powers of $g$, and so is closed.
- Associative ($G$ is), contains the identity by definition.
- Inverses: take any $g^j$. Consider the powers $g^j, g^{2j}, \ldots$; eventually some $g^{aj} = g^{bj}$ for $a > b$, in which case $g^{aj} = g^{(a-b)j} \cdot g^{aj}$, so $g^{(a-b)j} = 1$ and $g^{(a-b-1)j}$ is the inverse of $g^j$.
- So $\langle g \rangle$ is a subgroup and $k$ is called the order of $g$

Examples of generated subgroups

In $\mathbb{Z}_7^\times$ (multiplication table modulo 7 as before):

- $\langle 2 \rangle = \{1, 2, 4\}$
- $\langle 3 \rangle = \{1, 3, 2, 6, 4, 5\}$
- $\langle 6 \rangle = \{1, 6\}$

Lagrange's theorem

Theorem (Lagrange's Theorem). If $(H, *|_H)$ is a subgroup of a finite group $(G, *)$, then $|H|$ divides $|G|$.

In particular, the order of any element $g \in G$ must divide $|G|$ (because $\langle g \rangle$ is a subgroup).

Proof of Lagrange

Proof. Let $H$ be a subgroup of $G$. For $g \in G$, define the left coset $gH := \{gh : h \in H\}$. The left cosets are a partition of $G$, i.e.:

- Each $g \in G$ is in some coset (namely $gH$, since $1 \in H$)
- If $g_1, g_2 \in G$, then $g_1 H \cap g_2 H = \emptyset$ or $g_1 H = g_2 H$.

Furthermore, $f_g : H \to gH$ for any $g \in G$, defined by $f_g(h) = gh$, is surjective by the definition of $gH$ and injective because $gh_1 = gh_2 \Rightarrow h_1 = h_2$, so it is a bijection. So $|gH| = |H|$ for all $g \in G$. But $G$ is the disjoint union of the distinct cosets, thus $|G| = k|H|$ for some $k > 0$.
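The generated subgroups and the coset argument above can be checked numerically for small primes. The following is a minimal Python sketch, not part of the original slides; the function names `generated_subgroup` and `left_cosets` are illustrative choices, and the code assumes $p$ is prime and $1 \le g \le p - 1$ (otherwise the loop need not terminate).

```python
def generated_subgroup(g, p):
    """Return [1, g, g^2, ..., g^(k-1)] in Z_p^x, where k is the order of g.

    Assumes p is prime and 1 <= g <= p - 1, so some power of g equals 1.
    """
    elements = [1]
    x = g % p
    while x != 1:
        elements.append(x)
        x = (x * g) % p
    return elements


def left_cosets(H, p):
    """Return the set of distinct left cosets gH of H in Z_p^x."""
    return {frozenset((g * h) % p for h in H) for g in range(1, p)}


p = 7
group_order = p - 1
for g in range(1, p):
    H = generated_subgroup(g, p)
    cosets = left_cosets(H, p)
    # Lagrange: the order of g divides the order of the group.
    assert group_order % len(H) == 0
    # Every coset has |H| elements, and the distinct cosets partition the group.
    assert all(len(coset) == len(H) for coset in cosets)
    assert sum(len(coset) for coset in cosets) == group_order
    assert set().union(*cosets) == set(range(1, p))
    print(f"<{g}> = {H}, order {len(H)}, {len(cosets)} cosets")
```

For $p = 7$ this reproduces the examples from the slides, e.g. $\langle 3 \rangle$ is the whole group and $\langle 6 \rangle = \{1, 6\}$ has three cosets.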

Proof of Fermat's Little Theorem

Theorem (Fermat's Little Theorem). $a^{p-1} \equiv 1 \pmod{p}$ for any prime $p$ and integer $1 \le a \le p - 1$.

Proof. Let $k$ be the order of $a$ in the group $\mathbb{Z}_p^\times$. Then by Lagrange, $k$ divides the order of the group, $p - 1$, so $p - 1 = km$ for some $m$, and:

$$a^{p-1} \equiv a^{km} \equiv (a^k)^m \equiv 1^m \equiv 1 \pmod{p}$$

A more general result (not needed by us) is $x^{\sigma(N)} \equiv 1 \pmod{N}$ for $x$ and $N$ coprime, where $\sigma(N)$ is Euler's function (more commonly written $\varphi(N)$), the number of positive integers strictly smaller than $N$ coprime to $N$. It has an elegant group-theoretic proof too.

Back to decryption in RSA

We wanted to show $c^d \equiv m \pmod{n}$.

$$\begin{aligned}
c^d &\equiv m^{ed} \pmod{n} && \text{by defn. of } c \\
    &\equiv m^{1 + k(p-1)(q-1)} \pmod{n} && \text{by defn. of } e, d \\
    &\equiv m \cdot m^{k(p-1)(q-1)} \pmod{n = pq}
\end{aligned}$$

Now apply FLT twice (recall we chose $m$ coprime to $n$):

$$(m^{k(p-1)})^{q-1} \equiv 1 \pmod{q}$$
$$(m^{k(q-1)})^{p-1} \equiv 1 \pmod{p}$$

$p$ and $q$ are coprime, so $m^{k(p-1)(q-1)} \equiv 1 \pmod{n}$. Then:

$$c^d \equiv m \cdot 1 \equiv m \pmod{n}$$

Section: Security and implementation

Evidence for security

- Public key: $(e, n)$, private key: $d$
- If we can factor $n = pq$, we can break RSA:
  - Can compute $\sigma = (p - 1)(q - 1)$.
  - Can find $d$ satisfying $ed \equiv 1 \pmod{\sigma}$, i.e. $ed - 1 = k\sigma$, using Euclid's algorithm (cheap)
- Only need $\sigma$ to break RSA. Might there be an easier way than factoring to get $\sigma$?
- No. If we know $\sigma = (p - 1)(q - 1)$, we know $pq - p - q + 1 = n - (p + q) + 1$ and thus we know $p + q$. The quadratic $x^2 - (p + q)x + pq = 0$ has roots $p$ and $q$, which we can find using the quadratic formula.
- This is not a proof that breaking RSA is as hard as factoring though...

Integer factorisation

- Integer factorisation is thought to have no polynomial-time algorithm, but this is not proven.
- Factoring RSA-640 took $\approx 30$ years of single-CPU time (5 calendar months actual). 4096-bit keys are the norm
- There are sub-exponential integer factorisation algorithms though (number field sieves); this scares some people
- Estimated factoring time for the best algorithm: $K \exp(b^{1/3} \log^{2/3}(b))$ for number of bits $b$
- So a 4096-bit key should take this many times longer than RSA-640:
  $$\frac{\exp(4096^{1/3} \log^{2/3}(4096))}{\exp(640^{1/3} \log^{2/3}(640))} \approx 10^{15}$$
- The best ECDLP algorithm is fully exponential; the largest key size publicly broken is 109 bits (next 2 weeks!)

Computational feasibility

- 4096-bit $p$ and $q$. By the prime number theorem, the $n$th prime number is approximately equal to $n \ln n$. So if we pick a random 4096-bit integer, the average distance to a prime is less than:
  $$(n + 1) \ln(n + 1) - n \ln n \approx \ln(n + 1) \approx 4096 \cdot \ln 2 \approx 3000$$
- There is a polynomial-time algorithm to check primality
- Finding $e$ and $d$ such that $ed \equiv 1 \pmod{\sigma}$ is an application of Euclid's algorithm, which is polynomial time
- Modular exponentiation is polynomial time if we use the base 2 idea from earlier
- So the RSA algorithm is computationally feasible

References

For further information on RSA and crypto in general:

- Bruce Schneier's Applied Cryptography (comprehensive, includes maths but aimed at CS people; lots of source code too!)
- William Stallings' Cryptography and Network Security (some mathematical detail, used in NS course)
- The sci.crypt FAQ: http://www.faqs.org/faqs/cryptography-faq
- Steven Levy's Crypto (light, pop-sci)

Next couple of weeks' lectures on EC crypto!
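
As a closing illustration, the steps of the lecture (key generation with Euclid's algorithm, square-and-multiply exponentiation, the encrypt/decrypt round trip, and recovering $p$ and $q$ from $\sigma$) can be sketched in a few lines of Python. This is a toy sketch under stated assumptions, not the lecturer's code: the function names are hypothetical, and the primes 61 and 53, the exponent $e = 17$ and the message $m = 65$ are illustrative values far too small for real use.

```python
from math import gcd, isqrt


def mod_exp(a, b, c):
    """Compute a^b mod c by square-and-multiply on the binary digits of b."""
    result = 1
    square = a % c                  # holds a^(2^k) mod c for k = 0, 1, 2, ...
    while b > 0:
        if b & 1:                   # this binary digit of b is set
            result = (result * square) % c
        square = (square * square) % c
        b >>= 1
    return result


def extended_euclid(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_euclid(b, a % b)
    return g, y, x - (a // b) * y


def make_keys(p, q, e):
    """Return ((e, n), d) with e*d = 1 mod sigma, sigma = (p-1)(q-1)."""
    n = p * q
    sigma = (p - 1) * (q - 1)
    g, d, _ = extended_euclid(e, sigma)
    if g != 1:
        raise ValueError("e must be coprime to sigma")
    return (e, n), d % sigma


def factor_from_sigma(n, sigma):
    """Recover p and q from n and sigma via x^2 - (p+q)x + pq = 0."""
    s = n - sigma + 1               # p + q
    root = isqrt(s * s - 4 * n)     # |p - q|, square root of the discriminant
    return (s - root) // 2, (s + root) // 2


# Toy parameters (assumptions for illustration only).
(e, n), d = make_keys(61, 53, 17)
m = 65
assert gcd(m, n) == 1               # the lecture asks for m coprime to n
c = mod_exp(m, e, n)                # encryption: c = m^e mod n
assert mod_exp(c, d, n) == m        # decryption: c^d mod n gives m back
print("n =", n, "d =", d, "c =", c)
print("factors from sigma:", factor_from_sigma(n, 60 * 52))
```

The last line shows the point made on the security slide: anyone who learns $\sigma$ can recover the factors of $n$ with one integer square root, so computing $\sigma$ is no easier than factoring.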