Remote Timing Attacks are Practical
David Brumley, dbrumley@stanford.edu
Dan Boneh, dabo@crypto.stanford.edu
[Modified by Somesh Jha]

Various Types of Attacks
• Cryptanalysis
  – Look at carefully chosen plaintext/ciphertexts
  – Differential and linear cryptanalysis
  – Differential Cryptanalysis of the Data Encryption Standard by Eli Biham and Adi Shamir
• Side channel attacks
  – Timing attacks
  – Differential power analysis
  – Look at characteristics, such as time for decryption and power consumption

Overview
• Main result: RSA in OpenSSL is vulnerable to a new timing attack:
  – Attacker can extract RSA private key by measuring web server response time.
• Exploiting OpenSSL’s timing vulnerability:
  – One process can extract keys from another.
  – Insecure VM can attack secure VM.
    • Breaks VM isolation.
  – Extract web server key remotely.
    • Our attack works across Stanford campus.

Why are timing attacks against OpenSSL interesting?
• Many OpenSSL Applications
  – mod_SSL (Apache+mod_SSL has 28% of HTTPS market)
  – stunnel (Secure TCP/IP servers)
  – sNFS (Secure NFS)
  – Many more
• Timing attacks mostly applied to smartcards [K’96]
  – K’96: Paul Kocher, Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems, Advances in Cryptology, 1996.
  – Never applied to complex systems
  – Most crypto libraries do not defend:
    • libgcrypt, cryptlib, ...
    • Mozilla NSS only one we found to explicitly defend by default
• OpenSSL uses well-known algorithms

Outline
• RSA Overview and data dependencies
• Present timing attack
• Results against OpenSSL 0.9.7
• Defenses

RSA Algorithm
• RSA decryption: g^d mod N = m
  – d is private decryption exponent, N is public modulus
• Chinese remaindering (CRT) uses factors directly. N = pq, and d1 and d2 are pre-computed from d:
  1. m1 = g^d1 mod q
  2. m2 = g^d2 mod p
  3. combine m1 and m2 to yield m (mod N)
• Goal: learn factors of N.
  – Kocher’s [K’96] attack fails when CRT is used.
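The three CRT steps on the RSA Algorithm slide can be sketched in a few lines. This is a toy illustration only, not OpenSSL's code: the function name `crt_decrypt`, the textbook-sized primes, and the use of Python's built-in `pow` are all assumptions made for the example.

```python
def crt_decrypt(g, d, p, q):
    """Decrypt g with RSA-CRT, following the slide's numbering (m1 mod q, m2 mod p)."""
    d1 = d % (q - 1)                      # pre-computed from d
    d2 = d % (p - 1)                      # pre-computed from d
    m1 = pow(g, d1, q)                    # step 1: m1 = g^d1 mod q
    m2 = pow(g, d2, p)                    # step 2: m2 = g^d2 mod p
    # step 3: combine so that m = m1 (mod q) and m = m2 (mod p)
    h = ((m1 - m2) * pow(p, -1, q)) % q   # modular inverse needs Python 3.8+
    return m2 + p * h

# Toy parameters (hypothetical, far too small for real use).
p, q = 61, 53
N = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))
msg = 65
g = pow(msg, e, N)                        # ciphertext sent to the server

recovered = crt_decrypt(g, d, p, q)
print(recovered == msg, recovered == pow(g, d, N))
```

Both exponentiations work modulo a secret factor, which is why the running time can leak information about q and p rather than about d.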
RSA Decryption Time Variance
• Two reasons for decryption time variance:
  1. Multiplication algorithm used
     • OpenSSL uses two different mult. algorithms
  2. Modular reduction steps
     • modular reduction goal: given u, compute u mod q
     • Occasional extra steps in OpenSSL’s reduction algorithm
• There are MANY:
  – multiplications by input g
  – modular reductions by factor q (and p)

Reduction Timing Dependency
• Modular reduction: given u, compute u mod q.
  – OpenSSL uses Montgomery reductions [M’85].
  – M’85: Peter Montgomery, Modular Multiplication without Trial Division, Mathematics of Computation, 44(170), 1985.
• Time variance in Montgomery reduction:
  – One extra step at end of reduction algorithm with probability
    Pr[extra step] ≈ (g mod q) / 2q [S’00]
[Figure: decryption time vs. value of ciphertext, annotated with Pr[extra step] ≈ (g mod q) / 2q; axis marks at q, 2q, p]

Multiplication Timing Dependency
• Two algorithms in OpenSSL:
  – Karatsuba (fast): Multiplying two numbers of equal length
  – Normal (slow): Multiplying two numbers of different length
• To calc x·g mod q OpenSSL does:
  – When x is the same length as (g mod q), use Karatsuba mult.
  – Otherwise, use Normal multiplication

OpenSSL Multiplication Summary
[Figure: decryption time vs. value of ciphertext g; Karatsuba multiplication for g < q, Normal multiplication for g > q]

Data Dependency Summary
• Decryption value g < q
  – Montgomery effect: longer decryption time
  – Multiplication effect: shorter decryption time
• Decryption value g > q
  – Montgomery effect: shorter decryption time
  – Multiplication effect: longer decryption time
Opposite effects! But one will always dominate.

Previous Timing Attacks
• Kocher’s attack does not apply to RSA-CRT.
• Schindler’s attack does not work directly on OpenSSL for two reasons:
  – OpenSSL uses sliding windows instead of square and multiply
  – OpenSSL uses two mult. algorithms.
Neither known timing attack works on OpenSSL.
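The Montgomery effect from the Reduction Timing Dependency slide can be observed with a small experiment. This is a sketch under stated assumptions, not OpenSSL's implementation: `mont_mul`, `extra_step_rate`, the modulus `Q` (an arbitrary odd 64-bit number standing in for the secret factor q), and the radix `R = 2^64` are all hypothetical. The sketch compares the measured rate with c / (2R), where c = g mod q is fed directly into the multiplication; since q < R < 2q here, this has the same order of magnitude as the slide's (g mod q) / 2q.

```python
import random

def mont_mul(a, b, q, R, q_neg_inv):
    """Return (a*b*R^-1 mod q, flag); flag is True when the extra subtraction ran."""
    t = a * b
    m = (t * q_neg_inv) % R
    u = (t + m * q) // R          # exact division: t + m*q is a multiple of R
    if u >= q:                    # the data-dependent "extra step"
        return u - q, True
    return u, False

def extra_step_rate(c, q, R, trials=20000, seed=1):
    """Fraction of products x*c (x random below q) that need the extra step."""
    rng = random.Random(seed)
    q_neg_inv = (-pow(q, -1, R)) % R      # -q^-1 mod R, needs Python 3.8+
    hits = sum(mont_mul(rng.randrange(q), c, q, R, q_neg_inv)[1]
               for _ in range(trials))
    return hits / trials

Q = 0xC96F5A3B7D1E4F21            # hypothetical odd modulus standing in for q
R = 1 << 64                       # Montgomery radix assumed for this sketch

for label, g in [("g just below q", Q - 1), ("g just above q", Q + 1)]:
    c = g % Q
    print(label, "measured:", extra_step_rate(c, Q, R),
          "predicted:", c / (2 * R))
```

Under these assumptions the rate of extra steps is high when g sits just below q and collapses once g passes q, which is the discontinuity the attack measures.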
Outline
• RSA Overview and data dependencies during decryption
• Present timing attack
• Results against OpenSSL 0.9.7
• Defenses

Timing Attack: High Level
Assume we have i-1 top bits of q. Goal: find i-th bit of q.
1) Set g = q for the top i-1 bits, and 0 elsewhere.
2) g_hi = g, but with the i-th bit 1. Then g < g_hi.
   – g < q < g_hi ⇒ i-th bit of q is 0.
   – g < g_hi < q ⇒ i-th bit of q is 1.
Goal: decide if g < q < g_hi or g < g_hi < q.
[Figure: the 2 cases for g_hi. Decryption time (# reductions, mult routine) vs. value of ciphertext, with g, q, and the two candidate positions of g_hi marked]

Timing Attack
High Level Attack:
1) Suppose g = q for the top i-1 bits, and 0 elsewhere.
2) g_hi = g, but with the i-th bit 1. Then g < g_hi.
   Goal: decide if g < q < g_hi or g < g_hi < q.
3) Sample decryption time for g and g_hi:
   t1 = DecryptTime(g)
   t2 = DecryptTime(g_hi)
4) Time diff creates 0-1 gap:
   – If |t1 - t2| is large: g and g_hi straddle q (g < q < g_hi), so bit i is 0.
   – Else: g and g_hi don’t straddle q (g < g_hi < q), so bit i is 1.

Small time difference: g < g_hi < q
[Figure: decryption time (# reductions, mult routine) vs. value of ciphertext; g and g_hi both below q, |t1 – t2| small, 0-1 gap small]

Large time difference: g < q < g_hi
[Figure: decryption time (# reductions, mult routine) vs. value of ciphertext; g below q and g_hi above q, |t1 – t2| large, 0-1 gap large]

Timing Attack Details
• We know what is “large” and “small” from attack on previous bits.
• Decrypting just g does not work because of sliding windows
  – Decrypt a neighborhood of values near g
  – Will increase diff. between large and small values ⇒ larger 0-1 gap
• Only need to recover top half bits of q [C’97]
• Attack requires only 2 hours, about 1.4 million queries, to recover server’s private key.

The Zero-One Gap
[Figure: zero-one gap]

How does this work with SSL?
How do we get the server to decrypt our g?

Normal SSL Session Startup
Participants: Regular Client, USENIX SSL Server
1. ClientHello
2. ServerHello (send public key)
3. ClientKeyExchange (r^e mod N)
Result: Encrypted with computed shared master secret

Attacking Session Startup
Participants: Attack Client, USENIX SSL Server
1. ClientHello
2. ServerHello (send public key)
3. Record time t1; send guess g or g_hi
4. Alert
5. Record time t2; compute t2 – t1

Outline
• RSA Overview and data dependencies during decryption
• Present timing attack
• Results against OpenSSL 0.9.7
• Defenses

Attack extracts RSA private key
[Figure: zero-one gap, annotated “Montgomery reductions dominate” and “Multiplication routine dominates”]

Attack works on the network
[Figure: similar timing on WAN vs. LAN]

Attack Summary
• Attack successful, even on a WAN
• Attack requires only 350,000 – 1,400,000 decryption queries.
• Attack requires only 2 hours to extract server’s private key.

Outline
• RSA Overview and data dependencies during decryption
• Present timing attack
• Results against OpenSSL 0.9.7
• Defenses

RSA Blinding
• Decrypt random number related to g:
  1. Compute x’ = g·r^e mod N, r is random
  2. Decrypt x’ to get m’
  3. Calculate m = m’/r mod N
• Since r is random, the decryption time should be random
• 2-10% performance penalty

Blinding Works!

Conclusion
• We developed a timing attack based on multiplication and reduction timings
• Attack works against real OpenSSL-based servers on regular PCs.
• Lesson: Crypto libraries should always defend against timing attacks.
  – OpenSSL 0.9.7b enables blinding by default.

Questions?
Thanks for listening!
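As a follow-up to the RSA Blinding slide, the three blinding steps can be sketched as below. This is a minimal arithmetic illustration, assuming toy parameters and a hypothetical helper `blinded_decrypt`; it is not OpenSSL's blinding code, and Python's `pow` makes no constant-time guarantees.

```python
import secrets
from math import gcd

def blinded_decrypt(g, d, e, N):
    """Decrypt g so that the private-key operation only sees a randomized value."""
    while True:
        r = secrets.randbelow(N - 2) + 2      # random r in [2, N-1]
        if gcd(r, N) == 1:                    # r must be invertible mod N
            break
    x_blind = (g * pow(r, e, N)) % N          # step 1: x' = g * r^e mod N
    m_blind = pow(x_blind, d, N)              # step 2: m' = x'^d = m * r mod N
    return (m_blind * pow(r, -1, N)) % N      # step 3: m = m' / r mod N

# Toy parameters (hypothetical, far too small for real use).
p, q = 61, 53
N = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))
msg = 65
g = pow(msg, e, N)

print(blinded_decrypt(g, d, e, N) == msg)
```

Because the value that reaches the exponentiation is g·r^e rather than g, an attacker who chooses g no longer controls how close the decrypted input lies to q.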