AC Chen 2012/09/18 @ ADL


CFIMon: Detecting Violation of Control Flow Integrity using Performance Counters

Yubin Xia, Yutao Liu, Haibo Chen, Binyu Zang in DSN 2012


Outline

• Introduction

• Performance Monitoring Units (PMU)

• CFI Enforcement by CFIMon

• Implementation

• Experiment

• Performance

• Conclusion


INTRODUCTION


Motivation

• Many classes of security exploits usually involve introducing abnormal control flow transfers

– Code-injection attacks

– Code-reuse attacks

• return-into-libc (RILC)

• return-oriented programming (ROP)

• jump-oriented programming (JOP)

• Countermeasures

– non-executable stacks

– StackGuard

– safe C library

– heuristic means

– ….

– usually designed for a specific problem


Some General Solutions…?

• Control flow integrity (CFI) [Abadi et al.]

– statically rewrites a program + dynamic inlined guards

• suffers from coverage problems

• Control flow locking [Tyler Bletsch et al.]

– recompiles a program

• difficult to apply to legacy applications

• Architectural support to validate or enforce control flow integrity [Shi et al.]

– requires redesigning existing processors


In this Paper…

• CFIMon: detects a set of attacks that cause abnormal control flow transfers

– without changes to existing hardware, source code or binaries

– leverage the hardware support for performance counters to monitor the control flow integrity (CFI)


PERFORMANCE MONITORING UNITS (PMU)

Hardware support for performance monitoring


Performance Monitoring Units (PMU)

• perfmon


2 Working Modes of PMU

• Interrupt-based mode (basic mode)

– lacks precise instruction pointer information

• the reported IP may be up to tens of instructions away from the actual IP (instruction pointer) causing the event

• Precision mode

– improves the precision and flexibility of PMUs

– e.g. techniques used in Intel CPU:

• PEBS: Precise Event-Based Sampling

• BTS: Branch Trace Store

• LBR: Last Branch Record

• Event Filtering

• Conditional Counting


Precision Mode of Intel CPU

---Branch Trace Store (BTS) Mechanism

• Records all control transfers precisely into a predefined buffer

– jump, call, return, interrupt and exception

– also record the addresses of branch source and target

• Let a monitor get the trace in a batch

– an interrupt will be delivered when the buffer is nearly full

• Obtain all the branch information of a running application, help users locate the vulnerabilities
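A minimal sketch of how a monitor might turn a drained BTS buffer into <source,target> pairs. It assumes each record is three little-endian 64-bit fields (branch source, branch target, flags), 24 bytes in total; the function name parse_bts_buffer and the sample addresses are made up for illustration and are not taken from CFIMon.

```python
import struct

# Assumed record layout: source, target, flags, each an unsigned 64-bit
# little-endian value (24 bytes per record).
BTS_RECORD = struct.Struct("<QQQ")

def parse_bts_buffer(buf):
    """Return a list of (source, target) pairs from a raw BTS buffer."""
    # Ignore a trailing partial record, if any.
    usable = len(buf) - len(buf) % BTS_RECORD.size
    return [(src, dst) for src, dst, _flags in BTS_RECORD.iter_unpack(buf[:usable])]

# Two made-up records standing in for a buffer filled by the hardware.
raw = BTS_RECORD.pack(0x400A10, 0x400B00, 0) + BTS_RECORD.pack(0x400B2F, 0x400A15, 0)
trace = parse_bts_buffer(raw)
```

Because the records arrive in a batch, the monitor can process the whole list at once each time the buffer-nearly-full interrupt is delivered.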


CFI ENFORCEMENT BY CFIMON

Offline Analysis and Online Detection


Main Idea

• The CFI of an application can be maintained if we can

– get a legal set of branch target addresses for every branch

– check whether the target address of every branch is within the corresponding legal set at runtime


Branch Classification in X86 ISA

---Direct Branch & Its Target Address

• Direct Branch (safe branch)

– Direct jump

• jnz c2ef0 <__write>

– Direct call

• callq 34df0 <abort>

• Since the code is read-only and cannot be modified at runtime, both direct jumps and direct calls are considered safe


Branch Classification in X86 ISA

---Indirect Branch & Its Target Address

• Indirect Branch (unsafe branch)

– Indirect jump

• jmpq *%rdx

• not possible to gain the whole target address set just by static analysis, so the set is collected by dynamic training

– Indirect call

• callq *%rax

• a call can only transfer control to the start of a function

• its target address could be obtained by statically scanning the binary code of the application and the libraries it uses

– Return

• retq

• in general, the target address of a return has to be the instruction next to a call

• its target address could also be obtained by scanning the binary code


CFIMon: 2 Phases

• Offline phase

– build a legal set of target addresses for each branch instruction

• Online phase

– diagnose possible attacks with legal sets following a number of rules

• determine the status of the branch as legal, illegal or suspicious


Offline Analysis

--- obtain legal set: ret_set, call_set

• Scans the binary of application and dynamic libraries to get

– ret_set

• contains all addresses of the instructions next to each call

• special cases

– call_set

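• contains all addresses of the first instruction of each function

[Figure: example code for a function int add(int a, int b). The statement following the call add(3,4) is marked as belonging to ret_set; the first instruction of a function is marked as belonging to call_set.]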
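The two static sets can be sketched as follows, assuming a disassembler has already produced a list of (address, size, mnemonic) tuples and a list of function start addresses. The function name build_static_sets, the input format, and the sample listing are hypothetical; the slides do not say how CFIMon scans binaries.

```python
def build_static_sets(instructions, function_starts):
    """Build ret_set and call_set from a disassembly listing.

    instructions: iterable of (address, size_in_bytes, mnemonic) tuples.
    function_starts: iterable of function entry addresses.
    """
    # ret_set: the address of the instruction right after each call,
    # i.e. the call's address plus its encoded length.
    ret_set = {addr + size for addr, size, mnemonic in instructions
               if mnemonic.startswith("call")}
    # call_set: the first instruction of every function.
    call_set = set(function_starts)
    return ret_set, call_set

# Made-up listing: a 5-byte call at 0x1001 is followed by 0x1006.
listing = [
    (0x1000, 1, "push"),
    (0x1001, 5, "callq"),
    (0x1006, 2, "mov"),
    (0x1008, 1, "retq"),
]
ret_set, call_set = build_static_sets(listing, [0x1000, 0x2000])
```

The special cases mentioned for ret_set are not modelled here.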


Offline Analysis

--- obtain legal set: train_set

• Use training runs to collect the branch trace (recorded by BTS) and build, for each indirect jump, the legal set of targets

– train_set

– there could be corner cases which are not covered by training

• these are considered suspicious during online checking
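A minimal sketch of the training step, assuming the trace is a list of (source, target) pairs and that the addresses of indirect jump instructions are already known. The name build_train_set and the sample trace are illustrative only.

```python
from collections import defaultdict

def build_train_set(trace, indirect_jumps):
    """Map each indirect jump address to the set of targets seen in training."""
    train_set = defaultdict(set)
    for source, target in trace:
        if source in indirect_jumps:
            train_set[source].add(target)
    return dict(train_set)

# Made-up trace: the jump at 0x10 is observed going to two targets;
# the branch at 0x20 is not an indirect jump and is ignored.
training_trace = [(0x10, 0x100), (0x10, 0x200), (0x20, 0x300), (0x10, 0x100)]
train_set = build_train_set(training_trace, indirect_jumps={0x10})
```

A target that never shows up in the training trace is simply absent from the set, which is why such branches are treated as suspicious rather than illegal at runtime.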


Online Detection

[Flowchart: each <source,target> pair is classified as legal, illegal or suspicious. The check first asks whether <source> is a direct branch; if not, it switches into different cases based on <source>: a return is checked against ret_set (with the special cases), an indirect call against call_set, and an indirect jump against train_set. The state of a branch is then decided depending on <target>, and suspicious branches are handled by the slide-window mechanism.]
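The decision flow can be sketched roughly as below. The exact yes/no edges of the flowchart are not recoverable here, so the outcome assigned to each failed check is an assumption: a return outside ret_set (and not a special case) is treated as illegal, an indirect call outside call_set as illegal, and an indirect jump outside train_set as suspicious. The names Status and classify are hypothetical.

```python
from enum import Enum

class Status(Enum):
    LEGAL = "legal"
    ILLEGAL = "illegal"
    SUSPICIOUS = "suspicious"

def classify(kind, source, target, ret_set, call_set, train_set, special_case=False):
    """Classify one <source,target> pair; kind names the branch type of source."""
    if kind == "direct":
        return Status.LEGAL                      # code is read-only, target is fixed
    if kind == "return":
        if target in ret_set or special_case:    # e.g. longjmp or signal return
            return Status.LEGAL
        return Status.ILLEGAL                    # assumed outcome
    if kind == "indirect_call":
        # assumed outcome for a target that is not a function start
        return Status.LEGAL if target in call_set else Status.ILLEGAL
    if kind == "indirect_jump":
        seen = train_set.get(source, set())
        return Status.LEGAL if target in seen else Status.SUSPICIOUS
    raise ValueError(f"unknown branch kind: {kind}")

# Made-up sets for a quick check.
ret_set, call_set, train_set = {0x1006}, {0x2000}, {0x3000: {0x3100}}
status = classify("return", 0x2050, 0x9999, ret_set, call_set, train_set)
```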


Slide-Window Mechanism

---For Suspicious Branches

• The diagnose module makes a flexible decision depending on the pattern of the branches

– maintain a window of the states of the n most recent branches

– tolerate at most m suspicious branches among those n; raise an alarm once the count exceeds m
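The rule can be sketched with a fixed-length queue. The class name SlideWindow is hypothetical, and the defaults n = 20 and m = 3 simply mirror the window size and tolerance quoted in the GPSd evaluation.

```python
from collections import deque

class SlideWindow:
    """Track the states of the n most recent branches."""

    def __init__(self, n=20, m=3):
        self.recent = deque(maxlen=n)   # oldest state drops out automatically
        self.m = m

    def observe(self, suspicious):
        """Record one branch; return True if an alarm should be raised."""
        self.recent.append(bool(suspicious))
        return sum(self.recent) > self.m

window = SlideWindow(n=20, m=3)
alarms = [window.observe(True) for _ in range(4)]
```

With these values the first three suspicious branches are tolerated and the fourth one within the same window triggers the alarm; isolated suspicious branches age out as later branches push them out of the window.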


IMPLEMENTATION


Implementation

• Debian-6 with kernel version 2.6.34

– 2GB 1066MHz main memory

– Intel Core i5 processor with 4 cores

• CFIMon is implemented on top of perf_events

– a unified kernel extension in Linux for user-level performance monitoring


CFIMon---Mainly 2 Components

• A kernel extension

– operate the performance samples

– monitor signals

– provide the interfaces to user-level tool

• A user-level tool with 2 modules

– diagnose module

• check the control flow integrity

• receives information from the OS to solve special cases such as signal handling

– control module

• initialize the environment

• launch and synchronize with an application


Architecture

[Figure: CFIMon architecture, showing the user-level tool with its 2 modules and the kernel extension.]


CFIMon---Monitoring

• The user-level tool is the parent process of the application process, executed as a monitoring process

– use ptrace to synchronize with the application process

– runs the security check at critical points

• e.g. when the child process makes the exec system call


EVALUATION

Evaluate the detection ability of CFIMon


Experimental Samples

• Use several real-world applications as well as 2 demo programs to detect

– Code-Injection Attacks

– Return-to-libc Attacks

– Return-oriented Programming (Samba, GPSd, and Wuftpd-2.6.0 excluded)


Evaluation for Code-Injection Attacks

• Use the Metasploit framework to generate a NOP sled before the injected code

– attack each application with injected code 5 times to test the false negatives

– CFIMon detects all these attacks as expected

• report a security alarm

• For example, code-injection attack of Samba

– a heap overflow in function lsa_trans_name overwrites the function pointer destructor

– CFIMon detects this attack since the resulting branches have never appeared in the train_set


Evaluation for Return-to-libc Attacks

• CFIMon successfully detects all these attacks without experiencing false negatives

• Return-to-libc Attack of GPSd (ver. 2.7)

– format string vulnerability in function gpsd_report

– allows remote attackers to execute arbitrary libc function (e.g. system ) via certain GPS requests (via tcp port 2947 )

– CFIMon marks it and the following branches as suspicious since the branches have never appeared in the train_set

– an alarm is triggered since the number of suspicious branches quickly exceeds the threshold

[Figure: slide window over the suspicious branches (addr. of system, addr. of …), with window size = 20 and at most 3 suspicious branches tolerated.]


Evaluation for Return-oriented Programming Attacks

• As in the other evaluations, CFIMon successfully detects all these attacks without experiencing false negatives

• Return-oriented Programming Attack of Squid (ver. 2.5-STABLE1)

– stack overflow bug in its helper module, ntlm, during authentication

– the stack is smashed by supplying an arbitrary password of at most 300 bytes in function ntlm_check_auth

– violates the rules of CFIMon which enforces that the target address of a return instruction must be the one next to a call


PERFORMANCE

Overhead evaluation


Performance Evaluation

• Quantitatively evaluate the performance of CFIMon using several real-world applications

– Apache

– Exim

– Memcached

– Wu-ftpd


Overhead Results

• Memory overhead is negligible

– since the size of the tables ( ret_set, call_set and train_set) is quite small

• Performance overhead

– average overhead of pure BTS is 5.2%

– average overhead of CFIMon is only 6.1%


CONCLUSION


Conclusion

• The proposed CFIMon leveraged the branch trace store (BTS) mechanism to detect violation of control flow integrity

• The performance results show that CFIMon can be applied to some real-world server applications on off-the-shelf systems in daily use


Q & A


Return-Without-Call

There are several cases in which the calling convention may be violated:

– setjmp / longjmp

• Instead of returning to its own caller, the longjmp returns to the caller of setjmp (also a legal address)

– Unix signal handling

• Instead of returning to the caller (OS), the handler returns to the interrupted process

• modify the OS to let the monitor omit the alarm when a signal handler returns


Calling Convention

[Figure: stack layout from high to low addresses, holding the stack frames of A(), B(), C() and D() in that order.]


setjmp/longjmp

[Figure: setjmp/longjmp example; output shown: second main]


Precision Mode of Intel CPU

---PEBS, BTS

• PEBS (Precise Event-Based Sampling)

– Precise Performance Counter

– atomic-freeze: records the exact IP address

• BTS (Branch Trace Store)

– to capture all control transfer events

• jump, call, return, interrupt and exception

– also record the addresses of branch source and target

– enables the monitoring of the whole control flow of an application


Precision Mode of Intel CPU

---LBR, Event Filtering, Conditional Counting

• LBR (Last Branch Record)

– to record the most recent branches into a register stack

– the size of the register stack is small

• Event Filtering

– to filter out events that are not of interest

– currently only available for LBR, not BTS

• Conditional Counting

– to separate user-level events from kernel-level ones

– only increments the counter while the processor is running at a specific privilege level

• e.g. “only counting when at user mode”
