Presenter: Chuong Ngo
Comprehensive Kernel Instrumentation via Dynamic Binary Translation
Peter Feiner, Angela Demke Brown, Ashvin Goel
University of Toronto
No parents, uncles, or girlfriends were killed during the creation of this presentation

THE ORIGIN STORY
STARTING IN MEDIAS RES

DBT is the Answer!
Emulation of one instruction set by another through translation of binary code during execution.
More practical than static binary translation.
  ◦ Simplifies identification of executable code.
  ◦ Amortization of translation overhead costs over time.

…and I Remember Everything!

The Answer to What?
Ports
  ◦ Abandonware
Analysis
Bug finding
Security

Assemble!
Pin
DynamoRIO
Valgrind
User Level

Power Level < 9K
JIFL
PinOS

All the way from Earth-1610 via Cataclysm

IT’S A BIRD! IT’S A PLANE! IT’S DRK!

But Who Hides Behind the Mask?
4 Goals for kernel DBT framework:
  ◦ Full coverage of kernel code.
  ◦ No direct overhead for user level code.
  ◦ Preserve original concurrency and execution interleaving.
  ◦ Be transparent.
DynamoRIO for the kernel.

DynamoRIO Flashback!
Code cache
Control transfer instructions (CTIs) return control to dispatcher
Direct branching patches
Next Executing Tail
Client callbacks

Well Victor…I’ve been thinking.
All kernel entry points point to dispatcher.
  ◦ Shadow descriptor table
Self-contained dispatcher
  ◦ Custom heap allocator
  ◦ “Pull” I/O model
CPU-private data
Interrupts delayed in code cache, disabled in dispatcher.
Exceptions use restored native states.

A Carbonadium Skeleton

DRK Initialization
Allocates memory for heap
  ◦ Checks all processors for successful memory mapping.
  ◦ Must be within 2 GB of text and data segments.
Individual CPU initialization
  ◦ Allocate CPU resources
  ◦ All kernel entry points to dispatcher
  ◦ All interrupts redirected

DRK Normal Operations
Dispatcher creates and caches code fragment.
Context switches to the code fragment.
Determine target of control transfer instruction and dispatch.
Kernel exit points executed via native instructions.
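
Sketch: the dispatch loop
The "DRK Normal Operations" steps can be illustrated with a toy model. This is a minimal sketch and not DRK or DynamoRIO code: the guest "program" is a dictionary of basic blocks, and every name (Dispatcher, make_sum_program, the state keys) is hypothetical. It shows only the create-and-cache step, the return to the dispatcher at each CTI, and an instruction-count client. It does not model direct branch patching, interrupts, or native kernel exits.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy model: a block is (instructions, function giving next target).
State = Dict[str, int]
Instr = Callable[[State], None]
NextTarget = Callable[[State], Optional[int]]
Block = Tuple[List[Instr], NextTarget]


class Dispatcher:
    """Toy dispatcher: builds fragments on demand and caches them."""

    def __init__(self, program: Dict[int, Block]) -> None:
        self.program = program
        self.code_cache: Dict[int, NextTarget] = {}
        self.translations = 0  # number of fragments created
        self.instr_count = 0   # maintained by the instruction-count client

    def _translate(self, addr: int) -> NextTarget:
        instrs, next_target = self.program[addr]
        self.translations += 1

        def fragment(state: State) -> Optional[int]:
            for instr in instrs:
                self.instr_count += 1  # client instrumentation
                instr(state)
            # The CTI ending the block hands its target back to the dispatcher.
            return next_target(state)

        return fragment

    def run(self, entry: int, state: State) -> State:
        target: Optional[int] = entry
        while target is not None:  # None stands in for a kernel exit point
            fragment = self.code_cache.get(target)
            if fragment is None:  # cache miss: create and cache the fragment
                fragment = self._translate(target)
                self.code_cache[target] = fragment
            target = fragment(state)
        return state


def make_sum_program() -> Dict[int, Block]:
    """Guest program that sums 1..n (assumes n >= 1) into state['acc']."""
    def set_acc(s: State) -> None: s["acc"] = 0
    def set_i(s: State) -> None: s["i"] = 1
    def add(s: State) -> None: s["acc"] += s["i"]
    def inc(s: State) -> None: s["i"] += 1
    return {
        0: ([set_acc, set_i], lambda s: 1),
        1: ([add, inc], lambda s: 1 if s["i"] <= s["n"] else None),
    }


if __name__ == "__main__":
    d = Dispatcher(make_sum_program())
    final = d.run(0, {"n": 5})
    print(final["acc"], d.translations, d.instr_count)
```

The loop block runs many times but is translated once, which is the amortization argument from the "DBT is the Answer!" slide. In this sketch every block still returns to the dispatcher. The slides note that direct branches are patched, which would let fragments jump to each other without that round trip.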
You Can’t Escape This Timeline!
Exceptions run native
  ◦ Native state must be restored.
Interrupts are delayed and emulated.
  ◦ Other interrupts are disabled.
  ◦ Captured interrupt executed between block dispatches.

How did--? This… you… What are you?
HOW DOES IT STACK UP?

I’ve always found hardware to be more reliable
Test System: Dell Optiplex 980
  ◦ 8 GB RAM
  ◦ 4x Intel Core i7s at 2.8 GHz, no hyperthreading
2 Clients:
  ◦ Null Client
  ◦ Instruction Count

Filebench

I’m the best at what I do?

There’s a whole new master of magnetism in town!

I know everything. I can’t help it.

With great power…
4 Goals for kernel DBT framework:
  ◦ Full coverage of kernel code.
  ◦ No direct overhead for user level code.
  ◦ Preserve original concurrency and execution interleaving.
  ◦ Be transparent.

I’ll be there…around every corner
Full coverage of kernel code.
Preserve original concurrency and execution interleaving.

Fastest man alive with a limp
No direct overhead for user level code.
  ◦ Increased cache and TLB misses.

The cosmic rays…what did they do to us?
Be transparent.
  ◦ No code cache consistency.
  ◦ Shadow descriptor tables readable via hardware registers.
  ◦ Page table inconsistencies.
  ◦ CPU-private data.

…comes great responsibility.
4 Goals for kernel DBT framework:
  ◦ Full coverage of kernel code.
  ◦ No direct overhead for user level code.
  ◦ Preserve original concurrency and execution interleaving.
  ◦ Be transparent.

This was the world that I had created.
DRK APPLICATIONS

DRK’s Shadow Memory
Storing metadata about memory used.
Ported UMBRA.
  ◦ Simple indirect mapping.
  ◦ Copy-on-write.
  ◦ 10x overhead vs. native.

KAddrcheck
Memory addressability checking tool.
Scans slab allocator’s data structures to locate all pages and freelists.
  ◦ Triggers shadow memory allocations.
Addressability checks run on every memory access.

Stackcheck
Stack overflow guard
  ◦ Checks for addressability errors.
  ◦ Kills calling thread and continues.
Modified KAddrcheck
Resolves overflow without system crash.

Triumph!
DRK is a kernel-level DBT.
DynamoRIO “port”.
Heavy implementation.
Missing a number of features.
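
Backup sketch: shadow memory and addressability checks
The idea behind the "DRK’s Shadow Memory" and "KAddrcheck" slides can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the ported tool: it keeps one flag byte per application byte, uses a plain dictionary keyed by page number as a stand-in for an indirect mapping, and allocates shadow pages lazily. ShadowMemory, on_alloc, on_free, check_access, and PAGE_SIZE are all hypothetical names.

```python
from typing import Dict, List

PAGE_SIZE = 4096  # assumed granularity for this sketch only


class ShadowMemory:
    """Toy shadow memory: one flag byte (1 = addressable) per application byte."""

    def __init__(self) -> None:
        self.pages: Dict[int, bytearray] = {}  # page number -> shadow page
        self.errors: List[int] = []            # first bad address of each failed check

    def _set(self, addr: int, size: int, flag: int) -> None:
        for a in range(addr, addr + size):
            # Shadow pages are created lazily, on first use.
            page = self.pages.setdefault(a // PAGE_SIZE, bytearray(PAGE_SIZE))
            page[a % PAGE_SIZE] = flag

    def on_alloc(self, addr: int, size: int) -> None:
        """Mark a freshly allocated object as addressable."""
        self._set(addr, size, 1)

    def on_free(self, addr: int, size: int) -> None:
        """Mark a freed object as unaddressable again."""
        self._set(addr, size, 0)

    def check_access(self, addr: int, size: int) -> bool:
        """Run before a memory access; records the first unaddressable byte."""
        for a in range(addr, addr + size):
            page = self.pages.get(a // PAGE_SIZE)
            if page is None or page[a % PAGE_SIZE] == 0:
                self.errors.append(a)
                return False
        return True


if __name__ == "__main__":
    sm = ShadowMemory()
    sm.on_alloc(0x1000 - 4, 8)             # object straddling a page boundary
    print(sm.check_access(0x1000 - 4, 8))  # access inside the object
    print(sm.check_access(0x1000 + 4, 1))  # one byte past the end
```

In an instrumentation client, a call like check_access would be inserted before every load and store, which is why the slides report a large overhead for shadow memory relative to native execution.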