Software fault isolation with API integrity
and multi-principal modules
Yandong Mao, Haogang Chen (MIT CSAIL),
Dong Zhou (Tsinghua University IIIS),
Xi Wang, Nickolai Zeldovich, Frans Kaashoek (MIT CSAIL)
Kernel security is important
• Kernel is fully privileged
• Kernel compromises are devastating
• Remote attacker takes control over the whole
machine
• Local user gains root privilege
Linux kernel is vulnerable
• Vulnerabilities in Linux are routinely discovered
• CVE 2010: 145 vulnerabilities in Linux kernel
• Many exploits attack kernel modules
• 67% of Linux kernel vulnerabilities (CVE 2010)
• This talk focuses on vulnerabilities in kernel
modules
Threat
• Module programmer makes mistake
• Attacker exploits mistake to mount attacks
• Example: buffer overflow, set current UID to root
Module
Privilege escalation!
Kernel memory
Module
memory
UID
One approach: type-safe
languages
• Write kernel and modules in Java, C#
• No reference to UID object => cannot directly change
UID
• Attacker cannot synthesize references
Module
Most kernels are not written
in a type-safe language!
UID
Software Fault Isolation
(SFI[SOSP93])
Cannot bypass
SFI check
Module
char *p = 0xf7;
sfi_check_memory(p);
*p = 0;
Module
memory
UID
SFI Runtime
void sfi_check_memory(p) {
if p not in “Module memory”
stop_module();
}
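The check above can be sketched in plain C as a bounds test on a single contiguous region. This is a minimal sketch, not LXFI's or SFI's actual code: `struct module_region`, `sfi_in_module_memory`, `sfi_store_byte`, and `sfi_demo` are hypothetical names, and a real runtime would stop the module rather than return `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the one region a module may write to. */
struct module_region {
    uintptr_t base;
    size_t    len;
};

/* True if the n bytes starting at p lie entirely inside the region. */
bool sfi_in_module_memory(const struct module_region *r,
                          const void *p, size_t n)
{
    uintptr_t addr = (uintptr_t)p;
    if (n > r->len)
        return false;
    /* Written without addr + n so the test cannot overflow. */
    return addr >= r->base && addr - r->base <= r->len - n;
}

/* Guarded one-byte store: the write happens only if the check passes.
 * Assumption: returning false stands in for stop_module(). */
bool sfi_store_byte(const struct module_region *r, char *p, char v)
{
    if (!sfi_in_module_memory(r, p, 1))
        return false;
    *p = v;
    return true;
}

/* Self-check: returns 0 if in-region writes succeed and others are refused. */
int sfi_demo(void)
{
    static char module_mem[16];
    static char uid = 42;  /* stands in for kernel data outside the module */
    struct module_region r = { (uintptr_t)module_mem, sizeof module_mem };

    if (!sfi_store_byte(&r, &module_mem[3], 7)) return 1;
    if (module_mem[3] != 7) return 2;
    if (sfi_store_byte(&r, &uid, 0)) return 3;   /* must be refused */
    if (uid != 42) return 4;                     /* and left untouched */
    return 0;
}
```

Calling `sfi_demo()` from a test harness should return 0: the store into `module_mem` goes through, the store aimed at `uid` does not.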
Memory safety is insufficient
for stopping attacks!
• Challenge: module needs to call kernel functions
Core Kernel
void spin_lock_init(spinlock_t *lock) {
lock->v = 0;
}
Module
memory
UID
Spin_module
spinlock_t mylock;
spin_lock_init(&mylock);
Problem: API abuse
• Attacker tricks fully-privileged kernel code into overwriting
UID
Core Kernel
Spin_module
void spin_lock_init(spinlock_t *lock) {
lock->v = 0;
}
spin_lock_init(&cur_proc->uid);
Privilege escalation!
Module
memory
UID
Challenge: lack of API integrity
• Kernel APIs are not written defensively
• Assume the calling module obeys implicit rules
• Do not check arguments, permissions, etc
• Problem: modules cannot be trusted to follow rules
• Module can trick kernel into performing unexpected
actions
• Ideal system would enforce rules for kernel API
• Analogy: system call code assumes nothing about caller,
checks every assumption
State of the art
for protecting APIs
• SFI[SOSP93]: memory safety
• XFI[OSDI06]: no argument checks
• BGI[SOSP09]: manually wrap functions, make kernel
defensive when kernel code invokes callbacks
• Error-prone and time-consuming
• Works if kernel code is well-structured (not Linux)
Our approach: annotation language
• Helps enforce two types of API integrity:
• Argument integrity: programmer controls what
arguments a module can pass to functions
• Callback integrity: kernel invokes callback only if the
module could have invoked callback directly
• Allows programmers to specify principals for privilege
separation within a module
• Less error-prone than manual wrapping, applicable to
complex APIs such as those in Linux
Contributions
• LXFI: software fault isolation system for Linux kernel
modules
• Annotation language for
• Argument integrity
• Callback integrity
• Privilege separation within a module
• Evaluation
• Few annotations for 10 Linux kernel modules
• Stop three real exploits
• 2-4X CPU overhead for netperf
Goals for annotation language
• Enforce argument integrity, callback integrity
and privilege separation within a module
• Minimize programmer effort, e.g.:
• Few annotations
• Avoid data structure and API changes
• Compatible with C
Preventing module exploits
Programmer
annotates core
kernel
Compile time
Runtime
LXFI translates
annotations
to runtime checks
LXFI performs
checks
If annotations capture all implicit
rules, compromised module
cannot violate rules to gain
additional privileges.
Using compiler plugins;
Provide safe default: reject a
module if it calls an
unannotated API
Consulting a dynamic table
of capabilities for each
module
Design of annotation language
• Argument integrity annotations
• Using the spin_lock_init example
• Callback integrity annotations
• Not discussed; see paper
• Privilege separation annotations
• Using dm_crypt (real Linux kernel module)
Enforce argument integrity
• spin_lock_init: three annotations are required
Part        Syntax            Description
Capability  write(ptr,size)   Write [ptr,ptr+size]
Action      check(cap)        Check cap
Location    pre(action)       Perform action before function call
Example: enforce argument
integrity for spin_lock_init
Core Kernel
void spin_lock_init(spinlock_t *lock)
pre(check(write(lock, sizeof(spinlock_t))))
Spin_module
capability table
LXFI Runtime
write(mylock, 8)
Module
memory
……
lxfi_check_write(mylock, 8);
spin_lock_init(mylock);
……
lxfi_check_write(&cur_proc->uid, 8);
spin_lock_init(&cur_proc->uid);
Privilege escalation
prevented
UID
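A minimal sketch of how a per-module table of write capabilities could back the check above. It is illustrative only: `struct cap_table`, `cap_grant_write`, `cap_check_write`, `checked_lock_init`, and `struct demo_lock` are hypothetical names, the table is a fixed-size array, and a failed check returns `false` where a real runtime would stop the module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CAPS 16

/* One write(ptr, size) capability. */
struct write_cap { uintptr_t ptr; size_t size; };

/* Hypothetical per-module capability table (fixed size for brevity). */
struct cap_table { struct write_cap caps[MAX_CAPS]; size_t n; };

/* Record write(ptr, size); returns false if the table is full. */
bool cap_grant_write(struct cap_table *t, const void *ptr, size_t size)
{
    if (t->n == MAX_CAPS)
        return false;
    t->caps[t->n++] = (struct write_cap){ (uintptr_t)ptr, size };
    return true;
}

/* True if a single capability covers all size bytes starting at ptr. */
bool cap_check_write(const struct cap_table *t, const void *ptr, size_t size)
{
    uintptr_t a = (uintptr_t)ptr;
    for (size_t i = 0; i < t->n; i++) {
        const struct write_cap *c = &t->caps[i];
        if (size <= c->size && a >= c->ptr && a - c->ptr <= c->size - size)
            return true;
    }
    return false;
}

/* Stand-in for spinlock_t. */
struct demo_lock { long v; };

/* Check first, then run the body of the init function. */
bool checked_lock_init(const struct cap_table *t, struct demo_lock *lock)
{
    if (!cap_check_write(t, lock, sizeof *lock))
        return false;   /* assumption: a real runtime stops the module here */
    lock->v = 0;
    return true;
}

/* Self-check: returns 0 if the owned lock is initialized and uid is protected. */
int argcheck_demo(void)
{
    struct cap_table t = { .n = 0 };
    struct demo_lock mylock = { 1 };
    struct demo_lock uid = { 1000 };   /* stands in for cur_proc->uid */

    if (!cap_grant_write(&t, &mylock, sizeof mylock)) return 1;
    if (!checked_lock_init(&t, &mylock) || mylock.v != 0) return 2;
    if (checked_lock_init(&t, &uid) || uid.v != 1000) return 3;
    return 0;
}
```

The second call in `argcheck_demo` mirrors the attack: the module holds no capability covering `uid`, so the call is refused before any write happens.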
Where does the capability come
from?
• Granted on allocation
• Two more annotations are required
Part        Syntax            Description
Capability  write(ptr,size)   Write [ptr,ptr+size]
Action      check(cap)        Check cap
Action      copy(cap)         Grant a copy of cap
Location    pre(action)       Perform action before function call
Location    post(action)      Perform action after function return
Example: grant spinlock
Core Kernel
Spin_module
void *kmalloc(size)
post(copy(write(return, size)))
LXFI Runtime
……
spinlock_t *mylock = kmalloc(8);
lxfi_copy_write(mylock, 8);
capability table
write(mylock, 8)
What happens when memory is
freed?
• Need to revoke capability to safely reuse memory
• Strawman: revoke capability from caller
• Insufficient! Other modules may have copies of
capability
Part        Syntax            Description
Capability  write(ptr,size)   Write [ptr,ptr+size]
Action      check(cap)        Check cap
Action      copy(cap)         Grant a copy of cap
Action      transfer(cap)     Revoke cap from all modules, and grant
Location    pre(action)       Perform action before function call
Location    post(action)      Perform action after function return
After transfer(cap): no other copies of the capability remain
Example: safely free a spinlock
Core Kernel
Spin_module
LXFI Runtime
void kfree(void *p)
pre(transfer(write(p, no_size)))
lxfi_transfer_write(mylock, -1);
……
kfree(mylock);
capability table
write(mylock, 8)
other_module
capability table
write(mylock, 8)
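The allocate/free lifecycle can be sketched as follows, assuming one capability table per module and a transfer that first checks the caller's capability and then revokes it from every table. All names (`tables`, `cap_copy_write`, `cap_revoke_everywhere`, `checked_alloc`, `checked_free`) are hypothetical, user-space `malloc`/`free` stand in for `kmalloc`/`kfree`, and the core kernel that would receive the transferred capability is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CAPS 8
#define NUM_MODULES 2

struct write_cap { uintptr_t addr; size_t size; bool used; };
struct cap_table { struct write_cap caps[MAX_CAPS]; };

/* One table per module; static storage starts zeroed (all slots unused). */
struct cap_table tables[NUM_MODULES];

/* copy(write(addr, size)): add a capability to one module's table. */
bool cap_copy_write(struct cap_table *t, uintptr_t addr, size_t size)
{
    for (size_t i = 0; i < MAX_CAPS; i++) {
        if (!t->caps[i].used) {
            t->caps[i] = (struct write_cap){ addr, size, true };
            return true;
        }
    }
    return false;
}

bool cap_has_write(const struct cap_table *t, uintptr_t addr)
{
    for (size_t i = 0; i < MAX_CAPS; i++)
        if (t->caps[i].used && t->caps[i].addr == addr)
            return true;
    return false;
}

/* The revoking half of transfer: drop the capability from every module. */
void cap_revoke_everywhere(uintptr_t addr)
{
    for (size_t m = 0; m < NUM_MODULES; m++)
        for (size_t i = 0; i < MAX_CAPS; i++)
            if (tables[m].caps[i].addr == addr)
                tables[m].caps[i].used = false;
}

/* Allocation wrapper in the spirit of post(copy(write(return, size))). */
void *checked_alloc(struct cap_table *caller, size_t size)
{
    void *p = malloc(size);
    if (p != NULL && !cap_copy_write(caller, (uintptr_t)p, size)) {
        free(p);
        return NULL;
    }
    return p;
}

/* Free wrapper in the spirit of pre(transfer(write(p, no_size))). */
bool checked_free(struct cap_table *caller, void *p)
{
    uintptr_t addr = (uintptr_t)p;
    if (!cap_has_write(caller, addr))
        return false;              /* caller never owned this memory */
    cap_revoke_everywhere(addr);   /* no stale copies survive the free */
    free(p);
    return true;
}

/* Self-check: returns 0 if no module keeps the capability after the free. */
int transfer_demo(void)
{
    void *lock = checked_alloc(&tables[0], 8);
    if (lock == NULL) return 1;
    uintptr_t addr = (uintptr_t)lock;
    if (!cap_copy_write(&tables[1], addr, 8)) return 2;  /* other_module's copy */
    if (!checked_free(&tables[0], lock)) return 3;
    if (cap_has_write(&tables[0], addr) || cap_has_write(&tables[1], addr))
        return 4;
    return 0;
}
```

Revoking only from the caller would leave the copy in `tables[1]` alive, which is the strawman the slide rejects.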
Why is spin_module able to call
spin_lock_init, kmalloc, kfree?
• Call capability
• Granted initially according to the module’s symbol table
• Trust module author not to call unnecessary functions
• Dynamically granted when a callback function is passed
Part        Syntax            Description
Capability  write(ptr,size)   Write [ptr,ptr+size]
Capability  call(a)           Call a
Action      copy(cap)         Grant a copy of cap
Action      check(cap)        Check cap
Action      transfer(cap)     Revoke cap from all modules, and grant
Location    pre(action)       Perform action before function call
Location    post(action)      Perform action after function return
Core Kernel
void *kmalloc(size)
post(copy(write(return, size)))
void spin_lock_init(spinlock_t *lock)
pre(check(write(lock, sizeof(spinlock_t))))
void kfree(void *p)
pre(transfer(write(p, no_size)))
LXFI Runtime
……
Spin_module
capability table
call(kmalloc)
call(spin_lock_init)
call(kfree)
spinlock_t *mylock = kmalloc(8);
lxfi_copy_write(mylock, 8);
……
lxfi_check_write(mylock, 8);
spin_lock_init(mylock);
……
lxfi_check_write(&cur_proc->uid, 8);
spin_lock_init(&cur_proc->uid);
……
lxfi_transfer_write(mylock, -1);
kfree(mylock);
No way for compromised
spin_module to gain root privilege
• SFI ensures memory safety
• Call capabilities ensure only 3 functions are
allowed
• None of the functions can modify UID because:
• kmalloc never modifies allocated memory
• spin_lock_init can only be called with
writable memory (from kmalloc)
• kfree ensures no capabilities remain after
free
• spin_module cannot modify UID!
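Call capabilities can be pictured as an allow-list of function addresses, filled once from the module's imports and consulted before each call into the kernel. The sketch below assumes that model; `kernel_fn`, `struct call_caps`, `grant_initial_calls`, `check_call`, and the `k_*` stand-in functions are hypothetical and do not reflect the real kernel signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*kernel_fn)(void);

/* Stand-ins for kernel exports; distinct bodies keep their addresses distinct. */
int k_kmalloc(void)        { return 1; }
int k_spin_lock_init(void) { return 2; }
int k_kfree(void)          { return 3; }
int k_set_uid(void)        { return 4; }  /* hypothetical function the module never imports */

#define MAX_CALLS 8

/* call(a) capabilities held by one module. */
struct call_caps { kernel_fn allowed[MAX_CALLS]; size_t n; };

/* Initial grant, modelled on "according to the module's symbol table". */
void grant_initial_calls(struct call_caps *c,
                         const kernel_fn *imports, size_t count)
{
    c->n = 0;
    for (size_t i = 0; i < count && c->n < MAX_CALLS; i++)
        c->allowed[c->n++] = imports[i];
}

/* True if the module holds call(target). */
bool check_call(const struct call_caps *c, kernel_fn target)
{
    for (size_t i = 0; i < c->n; i++)
        if (c->allowed[i] == target)
            return true;
    return false;
}

/* Self-check: returns 0 if imported functions pass and others are refused. */
int call_demo(void)
{
    const kernel_fn imports[] = { k_kmalloc, k_spin_lock_init, k_kfree };
    struct call_caps caps;

    grant_initial_calls(&caps, imports, sizeof imports / sizeof imports[0]);
    if (!check_call(&caps, k_kfree)) return 1;
    if (check_call(&caps, k_set_uid)) return 2;
    return 0;
}
```

With only three entries in the list, the argument on the slide reduces to inspecting what those three functions can do with the capabilities the module holds.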
Privilege separation within a module
• dm_crypt: transparent encryption service for block devices
• This example requires a third type of capability
Part        Syntax            Description
Capability  write(ptr,size)   Write [ptr,ptr+size]
Capability  call(a)           Call a
Capability  ref(a, t)         Pass argument a as type t
Action      copy(cap)         Grant a copy of cap
Action      check(cap)        Check cap
Action      transfer(cap)     Revoke cap from all principals, and grant
Location    pre(action)       Perform action before function call
Location    post(action)      Perform action after function return
Privilege separation
User space
write(“/etc/secret.txt”, “foo”)
Kernel space
int bdev_write(block_device *dev,
const char * data, …)
pre(check(ref(block_device), dev))
Core Kernel
write(enc_disk, “foo”, …)
dm_crypt
capability table
LXFI Runtime
ref(block_device, enc_disk->bdev)
Writing block device does
not require writing to
memory of enc_disk->bdev.
lxfi_check_ref(block_device, enc_disk->bdev)
bdev_write(enc_disk->bdev, E(“foo”), …)
Privilege separation
read(…)
User space
Kernel space
int bdev_write(block_device *dev,
const char * data, …)
pre(check(ref(block_device), dev))
Core Kernel
LXFI Runtime
dm_crypt
capability table
capability table
ref(block_device, enc_disk->bdev)
ref(block_device, enc_usb->bdev)
ref(block_device, enc_usb->bdev)
Decrypt
lxfi_check_ref(block_device, enc_disk->bdev)
bdev_write(enc_disk->bdev, “/etc/pwd”, “foo”)
/etc/pwd: rootpwd=foo
How to define principals
• Associate a principal with every instance a
module supports (e.g. block device in dm_crypt)
• Problem: how to specify and name principals?
• Recall goal: minimize changes to existing data
structures
• Idea: re-use address of data structure as the
name of the principal
• Can typically identify principal from one of the
function arguments
Specifying principals
Part        Syntax            Description
Capability  write(ptr,size)   Write [ptr,ptr+size]
Capability  ref(a, t)         Pass a as t
Capability  call(a)           Call a
Action      copy(cap)         Grant a copy of cap
Action      check(cap)        Check cap
Action      transfer(cap)     Revoke cap from all principals, and grant
Location    pre(action)       Perform action before function call
Location    post(action)      Perform action after function return
Principal   principal(ptr)    Run with privileges of principal ptr
Privilege separation
User space
Kernel space
struct dm_type {
int (*map)(struct dm_target *di);
principal(di)
};
Core Kernel
lxfi_set_princ(enc_usb)
dm_crypt.map(enc_usb)
LXFI Runtime
dm_crypt
capability table
capability table
write(enc_disk->bdev, 100)
write(enc_usb->bdev, 100)
Decrypt
lxfi_check_write(enc_disk->bdev, 100)
bdev_write(enc_disk->bdev, “/etc/pwd”, “foo”)
/etc/pwd: rootpwd=foo
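A minimal sketch of per-principal capability tables, assuming each principal is named by the address of the instance it serves and that the runtime switches the current principal before entering the module. It models ref capabilities rather than the write capabilities drawn on this slide, and every name in it (`struct principal`, `struct module_state`, `princ_lookup`, `princ_grant_ref`, `set_principal`, `check_ref`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PRINCIPALS 4
#define MAX_REFS 4

/* A principal is named by the address of the object it represents. */
struct principal {
    const void *name;
    const void *refs[MAX_REFS];   /* objects this principal may pass on */
    size_t nrefs;
};

struct module_state {
    struct principal princ[MAX_PRINCIPALS];
    size_t nprinc;
    struct principal *current;    /* principal the module is running as */
};

/* Find the principal with this name, creating it on first use. */
struct principal *princ_lookup(struct module_state *m, const void *name)
{
    for (size_t i = 0; i < m->nprinc; i++)
        if (m->princ[i].name == name)
            return &m->princ[i];
    if (m->nprinc == MAX_PRINCIPALS)
        return NULL;
    struct principal *p = &m->princ[m->nprinc++];
    p->name = name;
    p->nrefs = 0;
    return p;
}

bool princ_grant_ref(struct principal *p, const void *obj)
{
    if (p == NULL || p->nrefs == MAX_REFS)
        return false;
    p->refs[p->nrefs++] = obj;
    return true;
}

/* Analogue of setting the principal before calling into the module. */
bool set_principal(struct module_state *m, const void *name)
{
    m->current = princ_lookup(m, name);
    return m->current != NULL;
}

/* The check consults only the current principal's capabilities. */
bool check_ref(const struct module_state *m, const void *obj)
{
    const struct principal *p = m->current;
    if (p == NULL)
        return false;
    for (size_t i = 0; i < p->nrefs; i++)
        if (p->refs[i] == obj)
            return true;
    return false;
}

/* Self-check: returns 0 if the USB principal cannot reach the disk's device. */
int principal_demo(void)
{
    int disk_bdev = 0, usb_bdev = 0;   /* stand-ins for block devices */
    int enc_disk = 0, enc_usb = 0;     /* stand-ins for per-device instances */
    struct module_state m = { .nprinc = 0, .current = NULL };

    if (!princ_grant_ref(princ_lookup(&m, &enc_disk), &disk_bdev)) return 1;
    if (!princ_grant_ref(princ_lookup(&m, &enc_usb), &usb_bdev)) return 2;

    if (!set_principal(&m, &enc_usb)) return 3;
    if (!check_ref(&m, &usb_bdev)) return 4;
    if (check_ref(&m, &disk_bdev)) return 5;   /* cross-device access refused */
    return 0;
}
```

Even though one module holds capabilities for both devices, code running on behalf of the USB instance sees only the USB principal's table.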
Principal name aliasing
• Problem: Kernel identifies an LXFI principal by multiple
addresses
int e1000_probe(struct pci_dev *pcidev) {
struct net_device *ndev = alloc_etherdev(...);
ndev->pcidev = pcidev;
lxfi_princ_alias(pcidev, ndev);
...
}
int e1000_xmit(struct net_device *dev) {
…
}
• Insert code into module to create alias
• The same principal now has multiple names
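One way to picture aliasing is a small map from each extra name to a canonical principal name, consulted whenever the principal is looked up. This is a sketch under that assumption, not LXFI's implementation: `struct alias_table`, `princ_resolve`, and `princ_alias` are hypothetical, and the argument order (existing name first, new name second) is assumed from the `lxfi_princ_alias(pcidev, ndev)` call above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ALIASES 8

/* Maps one extra name to the canonical name of its principal. */
struct alias { const void *name; const void *canonical; };
struct alias_table { struct alias a[MAX_ALIASES]; size_t n; };

/* Resolve a name; names without an alias entry map to themselves. */
const void *princ_resolve(const struct alias_table *t, const void *name)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->a[i].name == name)
            return t->a[i].canonical;
    return name;
}

/* Make new_name refer to the same principal as existing.
 * Returns 0 on success, -1 if the table is full. */
int princ_alias(struct alias_table *t, const void *existing,
                const void *new_name)
{
    if (t->n == MAX_ALIASES)
        return -1;
    t->a[t->n].name = new_name;
    t->a[t->n].canonical = princ_resolve(t, existing);
    t->n++;
    return 0;
}

/* Self-check: returns 0 if both names resolve to one principal. */
int alias_demo(void)
{
    int pcidev = 0, ndev = 0, other = 0;   /* stand-ins for kernel objects */
    struct alias_table t = { .n = 0 };

    if (princ_alias(&t, &pcidev, &ndev) != 0) return 1;
    if (princ_resolve(&t, &ndev) != &pcidev) return 2;
    if (princ_resolve(&t, &pcidev) != &pcidev) return 3;
    if (princ_resolve(&t, &other) != &other) return 4;
    return 0;
}
```

After the alias is created, a callback that receives only the `net_device` pointer resolves to the same principal as one that receives the `pci_dev` pointer.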
Other annotation language
features
Part        Syntax             Description
Capability  write(ptr,size)    Write [ptr,ptr+size]
Capability  ref(a, t)          Pass a as t
Capability  call(a)            Call a
Capability  cap_iterator(obj)  A function that iterates over all capabilities of obj
Action      copy(cap)          Grant a copy of cap
Action      check(cap)         Check cap
Action      transfer(cap)      Revoke cap from all principals, grant cap
Action      if(c-expr) action  Perform action only if c-expr
Location    pre(action)        Perform action before function call
Location    post(action)       Perform action after function return
Principal   principal(ptr)     Run with privileges of principal ptr (global, shared)
• cap_iterator: save annotation effort for complex objects that need
multiple capabilities
• if(c-expr): express conditional actions, such as granting a privilege
if the return value is OK
• Global: principal with full privilege
• Shared: principal with minimal privilege
Implementation
• Linux 2.6.36, x64, single-core
• gcc plugin: kernel rewriting for callback integrity
• Clang/LLVM plugin: module rewriting
• Annotation propagation saves effort by inferring
annotations of module functions
Example: annotation propagation
//from linux/include/pci_driver.h
struct pci_driver {
int (*probe)(struct pci_dev *pcidev)
principal(pcidev)
pre(copy(ref(struct pci_dev), pcidev))
};
LXFI propagates annotation on
probe to modules
//linux/drivers/net/e1000/e1000_main.c
int e1000_probe(struct pci_dev *pcidev) {
….
}
struct pci_driver e1000_driver = {
.probe = e1000_probe
};
//linux/drivers/net/ixgbe/ixgbe_main.c
int ixgbe_probe(struct pci_dev *pcidev) {
….
}
struct pci_driver ixgbe_driver = {
.probe = ixgbe_probe
};
Evaluation
• Security
• Annotation effort
• Performance overhead
Security
• Test LXFI with three real privilege escalation
exploits
Exploit   CVE ID                                        Violated Property   Unmodified Linux   LXFI
CAN_BCM   CVE-2010-2959                                 Memory Safety
Econet    CVE-2010-3849, CVE-2010-3850, CVE-2010-4258   API Integrity
RDS       CVE-2010-3904                                 API Integrity
• Stopping real attacks requires API integrity
Annotation effort
• Annotate kernel APIs for 10 modules, one at a
time
• Count:
• # of annotated core kernel functions a module
calls
• # of function pointer declarations a module
exports to core kernel
Sharing reduces annotation effort
Category             Module         # Functions        # Function Pointers
                                    All     Unique     All     Unique
net device driver    e1000          81      49         52      47
sound device driver  snd-intel8x0   59      27         12      2
sound device driver  snd-ens1370    48      13         12      2
net protocol driver  rds            77      30         42      26
net protocol driver  can            53      7          7       3
net protocol driver  can-bcm        51      15         17      1
net protocol driver  econet         54      15         20      3
block device driver  dm-crypt       50      24         24      14
block device driver  dm-zero        6       3          2       0
block device driver  dm-snapshot    55      16         28      18
Total: 334 functions, 155 function pointers
LXFI performance
• netperf, 1 Gigabit e1000 network card, LAN
• Stresses LXFI
Test            Throughput (Stock)    Throughput (LXFI)     CPU % (Stock)   CPU % (LXFI)
TCP_STREAM TX   836 M bits/sec        828 M bits/sec        13%             48%
UDP_STREAM TX   3.1 M/3.1 M pkt/sec   2.0 M/2.0 M pkt/sec   54%             100%
UDP_STREAM TX throughput: ~30% decrease
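One plausible way to relate these figures to the quoted 2-4X CPU overhead is to compare CPU usage per unit of throughput. This is an assumed reading, not a method stated on the slides, and `cpu_per_unit` and `overhead_ratio` are hypothetical helper names.

```c
#include <assert.h>

/* CPU percentage consumed per unit of throughput. */
double cpu_per_unit(double cpu_percent, double throughput)
{
    return cpu_percent / throughput;
}

/* How many times more CPU LXFI spends per unit of work than stock Linux. */
double overhead_ratio(double stock_cpu, double stock_tput,
                      double lxfi_cpu, double lxfi_tput)
{
    return cpu_per_unit(lxfi_cpu, lxfi_tput) /
           cpu_per_unit(stock_cpu, stock_tput);
}

/* Figures taken from the netperf results: CPU % and throughput
 * (M bits/sec for TCP_STREAM TX, M pkt/sec for UDP_STREAM TX). */
double tcp_stream_tx_overhead(void)
{
    return overhead_ratio(13.0, 836.0, 48.0, 828.0);
}

double udp_stream_tx_overhead(void)
{
    return overhead_ratio(54.0, 3.1, 100.0, 2.0);
}
```

Under this reading both tests land between 2X and 4X: TCP_STREAM TX because CPU use rises at nearly unchanged throughput, UDP_STREAM TX because the CPU saturates while throughput drops.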
CPU time of LXFI actions for
netperf
• Room for improvement
[Chart: CPU time by LXFI action. Categories: Capability action, Mem-write check, Function Entry, Function Exit, Indirect call check. Callout: 80%]
Future work
• Improve performance
• Faster capability management such as BGI’s
• Extend annotation language to enforce other
types of API integrity
• Perhaps based on Singularity’s contracts
Related work
• Type-safe kernels: Singularity [MSR-TR05]
• LXFI provides similar guarantees in C
• Good support for revocation (transfer) and
principals
• Software fault isolation
• LXFI extends existing SFI systems (SFI, XFI,
BGI) with annotation language
Conclusion
• Extend SFI with annotation language for:
• Argument integrity
• Callback integrity
• Principals
• LXFI: Prototype for Linux
• Annotated 10 kernel modules
• Prevented 3 real privilege escalation exploits
• 2-4X CPU overhead when stressing with netperf
Q&A