www.iaik.tugraz.at

ARMageddon: Cache Attacks on Mobile Devices
Moritz Lipp, Daniel Gruss, Raphael Spreitzer, Clémentine Maurice, Stefan Mangard
Graz University of Technology
August 11, 2016, Usenix Security 2016

Daniel Gruss, Graz University of Technology, August 11, 2016, Usenix Security 2016

TLDR
- powerful cache attacks (like Flush+Reload) on x86
- why not on ARM?
We identified and solved challenges systematically to:
- make all cache attack techniques applicable to ARM
- monitor user activity
- attack weak Android crypto
- show that ARM TrustZone leaks through the cache

What is a cache attack? (1)
[Figure: histogram of cache hits and cache misses; x-axis: access time in CPU cycles; y-axis: number of accesses]

What is a cache attack? (2)
[Figure: no text recovered]

Cache attack techniques
Most important techniques:
- Flush+Reload
- Prime+Probe
Both work on the last-level cache → across cores

Flush+Reload
[Figure: attacker address space, cache, victim address space]
- step 0: attacker maps shared library → shared memory, shared in cache
- step 1: attacker flushes the shared line
- step 2: victim loads data while performing encryption
- step 3: attacker reloads data → fast access if the victim loaded the line
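The Flush+Reload steps above can be sketched with a toy cache model. This is a simulation for illustration only, not attack code: the `ToyCache` class, the latency constants, the threshold, and the address `SHARED_LINE` are all assumptions made up for this sketch, and a real attack would time actual memory loads instead.

```python
class ToyCache:
    """Toy shared cache: only tracks which line addresses are cached."""
    HIT_CYCLES = 50    # assumed latency, not a measured value
    MISS_CYCLES = 250  # assumed latency, not a measured value

    def __init__(self):
        self.lines = set()

    def load(self, addr):
        """Load addr and return the simulated access time in cycles."""
        hit = addr in self.lines
        self.lines.add(addr)
        return self.HIT_CYCLES if hit else self.MISS_CYCLES

    def flush(self, addr):
        """Remove addr from the cache (stands in for a flush instruction)."""
        self.lines.discard(addr)


def flush_reload_round(cache, shared_addr, victim_accesses, threshold=150):
    """One attack round; returns True if the attacker infers a victim access."""
    cache.flush(shared_addr)           # step 1: attacker flushes the shared line
    if victim_accesses:                # step 2: victim may load the line
        cache.load(shared_addr)
    reload_time = cache.load(shared_addr)  # step 3: attacker reloads and times it
    return reload_time < threshold     # fast reload => victim loaded the line


cache = ToyCache()
SHARED_LINE = 0x8840  # hypothetical address inside a shared library
victim_pattern = [True, False, True, True, False]
observed = [flush_reload_round(cache, SHARED_LINE, v) for v in victim_pattern]
print(observed)  # prints [True, False, True, True, False]
```

In this model the attacker recovers the victim's access pattern exactly because the two latencies are far apart; on real hardware the hit and miss distributions overlap and the threshold has to be calibrated per device.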
Prime+Probe
[Figure: attacker address space, cache, victim address space; probe outcomes labeled "fast access" and "slow access"]
- step 0: attacker fills the cache (prime)
- step 1: victim evicts cache lines while performing encryption
- step 2: attacker probes data to determine if the set was accessed

Caches on Intel CPUs
[Figure: cores 0 to 3, each with private L1 and L2; L3 slices 0 to 3]
- last-level cache (L3): shared, inclusive
= shared memory is shared in cache, across cores!
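The Prime+Probe steps listed for that technique can be illustrated with a small set-associative cache model. This is a sketch under stated assumptions: the geometry (`NUM_SETS`, `WAYS`, `LINE_SIZE`), the latencies, the base addresses, and the LRU replacement policy are all chosen for simplicity and are not taken from the talk (which notes that ARM uses pseudo-random replacement).

```python
from collections import OrderedDict

LINE_SIZE = 64     # assumed cache line size in bytes
NUM_SETS = 8       # assumed number of cache sets in the toy model
WAYS = 4           # assumed associativity
HIT, MISS = 50, 250  # assumed latencies in cycles, not measured values


class ToySetAssocCache:
    """Toy set-associative cache with LRU replacement (a simplification)."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def load(self, addr):
        """Load addr and return the simulated access time."""
        line = addr // LINE_SIZE
        cache_set = self.sets[line % NUM_SETS]
        if line in cache_set:
            cache_set.move_to_end(line)      # mark as most recently used
            return HIT
        if len(cache_set) >= WAYS:
            cache_set.popitem(last=False)    # evict least recently used line
        cache_set[line] = True
        return MISS


def eviction_set(set_index, base=0x100000):
    """WAYS attacker addresses that all map to the given cache set."""
    return [base + (set_index + i * NUM_SETS) * LINE_SIZE for i in range(WAYS)]


def prime(cache, addrs):
    for addr in addrs:
        cache.load(addr)


def probe(cache, addrs):
    """Total time to re-access the eviction set; slow means the set was touched."""
    return sum(cache.load(addr) for addr in addrs)


cache = ToySetAssocCache()
attacker_sets = [eviction_set(i) for i in range(NUM_SETS)]
for addrs in attacker_sets:                       # step 0: prime every set
    prime(cache, addrs)
secret_index = 5                                  # hypothetical secret-dependent index
cache.load(0x200000 + secret_index * LINE_SIZE)   # step 1: victim access
timings = [probe(cache, addrs) for addrs in attacker_sets]  # step 2: probe
slow_sets = [i for i, t in enumerate(timings) if t > WAYS * HIT]
print(slow_sets)  # prints [5]
```

Unlike Flush+Reload, this needs no shared memory: the attacker only learns which cache set the victim touched, which is coarser but still reveals the secret-dependent index in this model.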
Caches on ARM Cortex-A CPUs
[Figure: cores 0 to 3, each with private L1; one shared L2]
- last-level cache (L2): shared but not inclusive
= shared memory not in L2 is not shared in cache
Challenge #1: non-inclusive caches

Modern ARM SoCs
- big.LITTLE architecture (A53 + A57) → multiple CPUs with no shared cache
Challenge #2: no shared cache

Cache maintenance
Instructions to enforce memory coherency
- x86: unprivileged clflush
- up to ARMv7-A: n/a
- ARMv8-A: kernel can unlock a flush instruction for userspace
Challenge #3: no flush instruction

Cache eviction
targeted cache eviction on ARM can be complicated:
- existing approaches introduce much noise
- pseudo-random replacement policy
- unclear how randomness affects existing approaches
Challenge #4: perform fast & reliable cache eviction

Timing measurements
- x86: rdtsc provides unprivileged access to cycle count
- ARM: existing attacks require access to privileged mode cycle counter
Challenge #5: find unprivileged highly accurate timing sources

Challenges
- #1: non-inclusive caches
- #2: no shared cache
- #3: no flush
- #4: random eviction
- #5: no unprivileged timing

Solving #1: non-inclusive caches
Attacking instruction-inclusive data-non-inclusive caches
[Figure: core 0 and core 1, each with L1I and L1D; sets in the L2 unified cache]

Solving #1: non-inclusive caches
What about entirely non-inclusive caches?
- cache coherency protocol fetches data from remote cores instead of DRAM → remote cache hits
[Figure: core 0 and core 1, each with L1I and L1D; sets in the L2 unified cache; DRAM; eviction path]

Solving #1: non-inclusive caches
[Figure: histogram of hit and miss timings, same core and cross-core; x-axis: measured access time in CPU cycles (OnePlus One); y-axis: number of accesses]

Solving #2: no shared cache
Multiple CPUs with no shared cache
- again: cache coherency protocol fetches data from remote CPUs instead of DRAM
- keep local L2 filled to increase probability of remote L1/L2 eviction
- timing difference between local and remote still small enough

Solving #3: no flush
- idea: replace flush instruction with cache eviction
- Flush+Reload → Evict+Reload (works on x86)
- but: cache eviction is slow and can be unreliable
- unless you know how to evict
- central idea of our Rowhammer.js paper

Solving #4: random eviction
(on the Alcatel One Touch Pop 2)

unique addr.   # accesses   Cycles    Eviction rate
48             48           6 517     70.8%
800            800          142 876   99.1%
23             50           6 209     100.0%
22             102          5 101     100.0%
21             96           4 275     99.9%

Solving #5: no unprivileged timing
Comparison of 4 different measurement techniques
- performance counter (privileged)
- perf_event_open (syscall, unprivileged)
- clock_gettime (unprivileged)
- thread counter (multithreaded, unprivileged)

Solving #5: no unprivileged timing
[Figure: histogram of hit and miss timings for PMCCNTR, syscall (×.25), clock_gettime (×.15), and counter thread (×.05); x-axis: measured access time (scaled); y-axis: number of accesses]

Flush+Flush on the Samsung Galaxy S6
[Figure: histogram of flush timings, address cached vs. address not cached; x-axis: measured execution time in CPU cycles; y-axis: number of cases]

Prime+Probe on the Alcatel One Touch Pop 2
[Figure: histogram, victim access vs. no victim access; x-axis: execution time in CPU cycles; y-axis: cases (%)]

Covert channels on Android

Work                             Type                       Bandwidth [bps]   Error rate
Ours (Samsung Galaxy S6)         Flush+Reload, cross-core   1 140 650         1.10%
Ours (Samsung Galaxy S6)         Flush+Reload, cross-CPU    257 509           1.83%
Ours (Samsung Galaxy S6)         Flush+Flush, cross-core    178 292           0.48%
Ours (Alcatel One Touch Pop 2)   Evict+Reload, cross-core   13 618            3.79%
Ours (OnePlus One)               Evict+Reload, cross-core   12 537            5.00%
Marforio et al.                  Type of Intents            4 300             -
Marforio et al.                  UNIX socket discovery      2 600             -
Schlegel et al.                  File locks                 685               -
Schlegel et al.                  Volume settings            150               -
Schlegel et al.                  Vibration settings         87                -

Cache template attacks (CTA)
[Figure: cache template matrix for libinput.so (on an Alcatel One Touch Pop 2); events: key, longpress, swipe, tap, text; addresses: 0x840, 0x880, 0x3280, 0x7700, 0x8080, 0x8100, 0x8140, 0x8840, 0x8880, 0x8900, 0x8940, 0x8980, 0x11000, 0x11040, 0x11080]

Cache template attacks (CTA)
[Figure: cache template matrix for the default AOSP keyboard (on a Samsung Galaxy S6); inputs: alphabet, enter, space, backspace; addresses: 0x66380, 0x66340, 0x60580, 0x60340, 0x60280, 0x58480, 0x57280, 0x56940, 0x45140]

CTA: taps and swipes
[Figure: access time over time in seconds, with tap and swipe events marked; measured on an Alcatel One Touch Pop 2]

CTA: taps and swipes
[Figure: access time over time in seconds, with tap and swipe events marked; measured on a Samsung Galaxy S6]

CTA: taps and swipes
[Figure: access time over time in seconds, with tap and swipe events marked; measured on a OnePlus One]

CTA: distinguishing keys
[Figure: access time over time in seconds; series: key, space; typed input: "this is a message"]

Bouncy Castle
- a widely used crypto library
  - WhatsApp, ...
- uses a T-table implementation

Attacking Bouncy Castle
[Figure: two panels of address vs. plaintext byte values (0x00 to 0xF0); Evict+Reload (Alcatel) vs. Flush+Reload (Samsung)]

Attacking Bouncy Castle with Prime+Probe (Alcatel)
[Figure: offset (0x240 to 0x3C0) vs. plaintext byte values (0x00 to 0xF0)]

Leakage from ARM TrustZone (RSA signatures)
[Figure: probing time in CPU cycles per set number (250 to 350); series: valid key 1, valid key 2, valid key 3, invalid key]

Conclusions
- all the powerful cache attacks applicable to smartphones
- monitor user activity with high accuracy
- derive crypto keys
- ARM TrustZone leaks through the cache
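As a rough companion to the eviction numbers under "Solving #4: random eviction", the sketch below estimates how often an access pattern evicts a target line from a single cache set with random replacement. It is a toy model and does not reproduce the measured values from the talk: the associativity `WAYS`, the purely random replacement policy, the trial count, and the `sliding_window` pattern are assumptions made for illustration.

```python
import random

WAYS = 16  # assumed associativity of the toy cache set


def target_evicted(pattern, rng):
    """Run one trial: True if the access pattern evicts the target line."""
    # The set starts full: the target line plus unrelated lines.
    ways = ["target"] + [f"other{i}" for i in range(WAYS - 1)]
    for addr in pattern:
        if addr not in ways:                  # miss: replace a random way
            ways[rng.randrange(WAYS)] = addr
    return "target" not in ways


def eviction_rate(pattern, trials=2000, seed=1):
    """Fraction of trials in which the target was evicted."""
    rng = random.Random(seed)                 # seeded for repeatable results
    return sum(target_evicted(pattern, rng) for _ in range(trials)) / trials


def once_through(n):
    """Access n unique congruent addresses, each exactly once."""
    return list(range(n))


def sliding_window(n, window, repeats):
    """Access n addresses in overlapping windows, each window `repeats` times."""
    pattern = []
    for start in range(n - window + 1):
        pattern += list(range(start, start + window)) * repeats
    return pattern


rate_few = eviction_rate(once_through(8))
rate_many = eviction_rate(once_through(48))
rate_window = eviction_rate(sliding_window(21, 6, 1))
print(f"8 unique addresses:  {rate_few:.3f}")
print(f"48 unique addresses: {rate_many:.3f}")
print(f"sliding window:      {rate_window:.3f}")
```

With random replacement no finite pattern guarantees eviction in this model, which is why an attacker has to trade the number of accesses (and thus cycles) against the achieved eviction rate.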