Practical Timing Side Channel Attacks
Against Kernel Space ASLR
Fall 2014
Instructor: Kun Sun, Ph.D.
Outline
Background
  Side Channel Attack
  Kernel Space ASLR
Use timing-based side channel to defeat kernel space ASLR (a heavy engineering work)
How to prevent/mitigate this side channel attack?
Side Channel
Side channel vs. covert channel
  Side channel: incidental, unintentional leakage
  Covert channel: intentional, between colluding parties
A side channel is a communication channel based on the use of system resources not normally intended for communication between the subjects (processes) in the system.
Side Channel Types
Computer systems abound with side channels.
Storage channels
  Use system variables and attributes (other than time) to signal information
  Classic example: resource exhaustion channel
Timing channels
  Vary the amount of time required to complete a task to signal information
  Classic example: load sensing channel
Side channels are typically noisy, but information theory techniques can be used to achieve error-free communication.
Resource Exhaustion Channel
Given a 5 MB pool of dynamically allocated memory
High process
  bit = 1 → request 5 MB of memory
  bit = 0 → request 0 MB of memory
Low process
  Request 5 MB of memory
  If allocated, then bit = 0; otherwise bit = 1
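The protocol on this slide can be sketched as a small simulation. This is only an illustrative model: the `MemoryPool` class and the `send_bit`, `receive_bit`, and `transmit` helpers are hypothetical names, and a real allocator would be noisier than this toy one.

```python
POOL_MB = 5  # shared pool size from the slide


class MemoryPool:
    """Toy allocator: grants a request only if enough memory is free."""

    def __init__(self, size_mb):
        self.free_mb = size_mb

    def request(self, mb):
        if mb <= self.free_mb:
            self.free_mb -= mb
            return True
        return False

    def release(self, mb):
        self.free_mb += mb


def send_bit(pool, bit):
    """High process: grab the whole pool for 1, nothing for 0."""
    wanted = POOL_MB if bit == 1 else 0
    pool.request(wanted)
    return wanted


def receive_bit(pool):
    """Low process: a failed 5 MB request means the sender holds the pool."""
    if pool.request(POOL_MB):
        pool.release(POOL_MB)
        return 0
    return 1


def transmit(bits):
    """Send each bit through the pool and collect what the receiver decodes."""
    pool = MemoryPool(POOL_MB)
    received = []
    for bit in bits:
        held = send_bit(pool, bit)
        received.append(receive_bit(pool))
        pool.release(held)  # sender frees its memory before the next round
    return received
```

Calling `transmit([1, 0, 1])` passes each bit through the shared pool. The two processes never exchange a message directly.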
Load Sensing Channel
High process
  bit = 1 → enter computation-intensive loop
  bit = 0 → go to sleep
Low process
  Perform a task with known computational requirements
  If completed quickly, then bit = 0; otherwise bit = 1
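A minimal sketch of the load sensing idea, using a simulated clock instead of real timing so that it runs deterministically. The fair-share scheduling model and the `TASK_WORK` and `THRESHOLD` values are assumptions made for illustration only.

```python
TASK_WORK = 100.0   # CPU-time units the low process's task needs (assumed)
THRESHOLD = 150.0   # elapsed time above this is decoded as bit 1 (assumed)


def elapsed_time(work, sender_busy):
    """Fair-share model: a busy sender leaves the receiver half of the CPU."""
    share = 0.5 if sender_busy else 1.0
    return work / share


def send_and_receive(bits):
    """Decode each bit from how long the receiver's fixed task appears to take."""
    decoded = []
    for bit in bits:
        sender_busy = (bit == 1)   # bit 1: compute loop, bit 0: sleep
        t = elapsed_time(TASK_WORK, sender_busy)
        decoded.append(1 if t > THRESHOLD else 0)
    return decoded
```

On a real system the elapsed time would fluctuate, so the receiver would need repeated measurements or error-correcting codes to decode reliably.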
Mitigating Side Channels
Auditing for storage-based channels
Time normalization for timing-based channels
Slow the channel down to reduce the transmission data rate
ASLR Deployment
Windows: user space and kernel space ASLR since Vista
Linux: PaX patches (not in the mainline kernel)
Mac OS X since 10.5
Mobile OSes including Android and iOS
User Space ASLR vs. Kernel Space ASLR
Shacham et al. showed that user mode ASLR on 32-bit architectures leaves only 16 bits of randomness, which is not enough to withstand brute-force attacks.
However, such brute-force attacks are not applicable to kernel space ASLR.
  When an attacker exploits a vulnerability in kernel code, a wrong offset crashes the whole system, so the attacker has only one attempt.
Windows Kernel Space ASLR
Applies to both 32-bit and 64-bit Vista, Windows 7, and Windows 8.
Details were obtained by reverse engineering.
During the boot process, the Windows loader allocates kernel_region for the kernel image and the hardware abstraction layer (HAL).
The entropy is 6 bits: 32 × 2 = 64 = 2^6.
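To put the 16-bit user space figure and the 6-bit kernel figure side by side, the following sketch computes the search space and the chance of a correct blind guess. The helper names are hypothetical. The expected-guess formula assumes the layout stays fixed between attempts and that no candidate is tried twice.

```python
def guesses_needed(entropy_bits):
    """Return (search space size, expected guesses for a fixed layout)."""
    space = 2 ** entropy_bits
    # Trying candidates one by one without repetition finds a uniformly
    # random target after (space + 1) / 2 attempts on average.
    return space, (space + 1) / 2


def single_shot_success(entropy_bits):
    """Chance that the first and only guess is correct."""
    return 1 / 2 ** entropy_bits
```

For example, `guesses_needed(16)` covers the user space case, where retries are possible, and `single_shot_success(6)` covers the kernel case, where a wrong guess crashes the system. Even with only 64 candidate positions, a blind kernel exploit usually fails on its single attempt, which is why the attacker needs a side channel.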
Timing-based Side Channel Attack
Goal: show how a local attacker with restricted rights can mount a timing-based side channel attack against the memory management system to deduce information about the privileged, ASLR-protected kernel space.
Attack Model


System is protected by ASLR and
Local adversary can run user-space
applications, but cannot execute privileged
kernel mode code.
Challenges in Exploiting the Attack
Timing attacks against a modern CPU are complicated.
  Performance optimizations include hardware prefetching, speculative execution, multi-core architectures, and branch prediction.
The attack targets modern CPUs and recent OSes.
  Previous work focused on older processors without such optimizations.
TLB and Cache

Translation lookaside buffer (TLB): caches virtual-to-physical address translations

Virtually indexed, physically tagged
http://www.cs.princeton.edu/courses/archive/fall04/cos471/lectures/19-VirtualMemory.pdf
Intel i7 Memory Architecture plus Clock Latency
General Approach

Two steps:
1. Set the system in a specific state from user mode.
2. Measure the duration of a certain memory access operation. The time span of this operation then (possibly) reveals certain information about the kernel space layout.
One example:
  Cache a system call handler by executing the system call.
  Access a designated set of user space addresses and execute the system call again.
  If the system call takes longer than expected, the user space accesses evicted the system call handler code, which reveals address information about the system code.
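The evict-then-time example can be illustrated with a toy cache model. Everything here is an assumption made for the sketch: the cache dimensions, the hit and miss costs, LRU replacement, and the convention that the handler sits at a cache line number below 16000. A real attack measures clock cycles on hardware instead.

```python
from collections import OrderedDict

WAYS = 4              # assumed associativity of the toy cache
NUM_SETS = 16         # assumed number of sets
HIT, MISS = 10, 100   # assumed access costs


class ToyCache:
    """Set-associative cache with LRU replacement, indexed by line number."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, line):
        s = self.sets[line % NUM_SETS]
        if line in s:
            s.move_to_end(line)
            return HIT
        if len(s) >= WAYS:
            s.popitem(last=False)   # evict the least recently used line
        s[line] = True
        return MISS


def find_handler_set(cache, run_syscall):
    """Evict one set at a time; report the set whose eviction slows the call."""
    for target in range(NUM_SETS):
        run_syscall()                          # step 1: cache the handler
        for k in range(1, WAYS + 1):           # fill the target set with user lines
            cache.access(target + k * NUM_SETS * 1000)
        if run_syscall() == MISS:              # step 2: timed system call
            return target
    return None
```

With `cache = ToyCache()` and a stand-in system call such as `lambda: cache.access(37)`, the search reports the set that line 37 maps to. The attacker learns part of the handler's address without ever reading kernel memory.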

Handling Noise
Circumvent performance optimizations
  E.g., test memory in non-consecutive order to circumvent memory prefetching.
OS scheduler, shared cache
  The two steps require 5000 clock cycles
  An OS time slice lasts several ms
  Measurements with no interrupt between the two steps are therefore guaranteed to be found
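One common way to exploit the fact that most runs are not interrupted is to repeat the measurement and keep the fastest sample. The sketch below simulates this with a seeded random generator. The jitter range, the interrupt probability, and the spike sizes are invented for illustration and are not measured values.

```python
import random


def noisy_measurement(true_cycles, rng, interrupt_prob=0.05):
    """One simulated timing: small jitter plus an occasional large spike."""
    sample = true_cycles + rng.randint(0, 20)    # measurement jitter
    if rng.random() < interrupt_prob:            # preempted mid-measurement
        sample += rng.randint(10_000, 100_000)
    return sample


def robust_time(true_cycles, rng, repeats=50):
    """Repeat the measurement and keep the fastest run."""
    return min(noisy_measurement(true_cycles, rng) for _ in range(repeats))
```

Taking the minimum discards any run that was preempted, since an interrupt can only add time to a measurement.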
Three Attacks
Attacks on L1/L2/L3 caches and TLB/PS caches
  Cache probing
  Double page fault
  Cache preloading
Cache Probing
Based on multiple memory addresses being mapped into the same cache set and competing for its available slots
n-way set associative mode
  All available slots are grouped into sets of size n, and each memory chunk can be stored in any slot of one particular set.
  E.g., with 32-bit addresses, a typical 8 MB L3 cache is 16-way set associative. It consists of (8,192 × 1,024)/64 = 131,072 single slots that are grouped into 131,072/16 = 8,192 different sets.
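The arithmetic above can be checked with a few lines of code. The 64 in the formula is taken to be the cache line size in bytes. The `set_index` helper assumes plain modulo indexing, which is a simplification: some CPUs additionally hash the address to select a cache slice.

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return (number of line slots, number of sets)."""
    slots = size_bytes // line_bytes
    return slots, slots // ways


def set_index(phys_addr, line_bytes, num_sets):
    """Plain modulo indexing: drop the line-offset bits, then wrap by set count."""
    return (phys_addr // line_bytes) % num_sets
```

Under this model, two addresses that are `line_bytes * num_sets` bytes apart land in the same set and therefore compete for the same 16 slots.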
Reverse-engineering the Sandy Bridge hash function for L3 parallel access among multiple cores
Double Page Fault
Based on an observation about Intel CPUs
  When the page translation succeeds but the access permission check fails (e.g., when kernel space is accessed from user mode), a TLB entry is still created.
Does not work on AMD CPUs
Can be easily fixed by a hardware redesign
Double Page Fault


First page fault: for each kernel space page p, access p from user mode. The resulting page fault is handled by the operating system and forwarded to the exception handler of the process.
  When p refers to an allocated page, the MMU creates a TLB entry for p although the subsequent permission check fails.
  When p refers to an unallocated page, the translation fails, so the MMU does not create a TLB entry for p.
Second page fault: access p again. If p refers to an allocated kernel page, the page fault is delivered faster due to the TLB hit.
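The two-fault procedure can be sketched against a toy MMU model. The cycle counts are placeholders chosen for illustration, and `ToyMMU` and `probe_allocation` are hypothetical names. A real attack would time actual page faults from inside an exception handler.

```python
TLB_HIT_CYCLES = 100    # assumed cost of a fault whose translation is cached
TLB_MISS_CYCLES = 140   # assumed cost when a page-table walk is needed


class ToyMMU:
    """Models the Intel behavior from the slide: a TLB entry on failed permission checks."""

    def __init__(self, allocated_pages):
        self.allocated = set(allocated_pages)
        self.tlb = set()

    def user_access(self, page):
        """User-mode access to a kernel page: always faults, returns its cost."""
        if page in self.tlb:
            return TLB_HIT_CYCLES
        if page in self.allocated:
            self.tlb.add(page)   # translation succeeded, so an entry is cached
        return TLB_MISS_CYCLES


def probe_allocation(mmu, pages):
    """Fault on each page twice; a fast second fault marks it as allocated."""
    result = {}
    for page in pages:
        mmu.user_access(page)            # first fault warms the TLB
        second = mmu.user_access(page)   # second fault is the one timed
        result[page] = second == TLB_HIT_CYCLES
    return result
```

The returned dictionary maps each probed page to `True` or `False`, which gives the attacker an allocation map of the probed kernel range.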
Cache Preloading


The double page fault attack estimates the allocation map of the kernel space, but does not determine the concrete base addresses at which the kernel and drivers are loaded.
Three steps in cache preloading:
1. Flush all caches (i.e., address translation and instruction/data caches) to start with a clean state.
2. Preload the address translation caches by indirectly calling into kernel code, for example by issuing a sysenter operation.
3. Intentionally generate a page fault by jumping to some kernel space address and measure the time. If the faulting address lies in the same memory range as the preloaded kernel memory, the fault takes a shorter time.
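The three steps can be mimicked with a toy system model. It assumes, purely for illustration, that the system call caches the translation for one aligned range of `range_pages` pages and that fault costs take only two values. `ToySystem` and `find_preloaded_pages` are hypothetical names.

```python
FAST, SLOW = 100, 140   # assumed fault costs with / without a cached translation


class ToySystem:
    """Hides the handler location; only fault timing is observable."""

    def __init__(self, handler_page, range_pages):
        self._handler_range = handler_page // range_pages
        self._range_pages = range_pages
        self._cached_range = None

    def flush_caches(self):
        self._cached_range = None

    def syscall(self):
        self._cached_range = self._handler_range

    def fault_time(self, page):
        hit = page // self._range_pages == self._cached_range
        return FAST if hit else SLOW


def find_preloaded_pages(system, candidates):
    """Return the candidate pages whose fault was fast after preloading."""
    fast = []
    for page in candidates:
        system.flush_caches()                  # step 1: clean state
        system.syscall()                       # step 2: preload via kernel entry
        if system.fault_time(page) == FAST:    # step 3: timed fault
            fast.append(page)
    return fast
```

The pages that come back fast lie in the same range as the system call handler, which narrows down where the kernel code is loaded.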
Testbed
Tested on 32-/64-bit systems running Windows 7/Linux
Tested on different CPUs + VM:
  Intel i7-870 (Nehalem/Lynnfield, Quad-Core)
  Intel i7-950 (Nehalem/Bloomfield, Quad-Core)
  Intel i7-2600 (Sandy Bridge, Quad-Core)
  AMD Athlon II X3 455 (Triple-Core)
  VMware Player 4.0.2 on Intel i7-870 (with VT-x)
Results
Mitigation Approaches
Root cause: the memory hierarchy is a shared resource
Also, not enough randomization: only 64 different slots for the Windows kernel
Option one: split caches for user and kernel
  Shortcoming: checking overhead, reduced maximum cache size
Option two: suppress the creation of TLB entries on successful address translation if an access violation happens
Option three: timing normalization; modify the execution time of the page fault handler
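Option three can be illustrated with a one-line model of padding. The cycle values used in the usage note are made up for the example, and `normalized_fault_time` is a hypothetical helper, not an actual kernel interface.

```python
def normalized_fault_time(raw_cycles, budget_cycles):
    """Pad the handler so every fault appears to take at least budget_cycles."""
    return max(raw_cycles, budget_cycles)
```

With a budget of 200 cycles, a fault that really took 100 cycles and one that took 140 cycles both appear to take 200, so the difference that the attacks rely on disappears. If the budget is set below the slowest path, faults that exceed it remain distinguishable, so the budget has to cover the worst case at the price of slower page faults overall.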
Future Work
Test Mac OS X
Test Android
Obtain the physical address of a certain memory location from user mode
Questions?