Virtualization and Security
Prof. 강병훈 (Brent ByungHoon Kang)
TA. 이호준 (Hojoon Lee)
{brentkang, hojoon.lee}@kaist.ac.kr
CySec Lab (Cyber Security Systems Research Lab), KAIST GSIS (Graduate School of Information Security)

Content
• Hypervisor Internals
  o CPU Virtualization
  o Memory Virtualization
• Research Trends in Virtualization and Security
  o Hypervisor Security
  o Virtual Machine Introspection
  o Security and Privacy for Cloud
  o Side Channel Attack Issues
• Virtual Machine Introspection (VMI) Techniques Demo

Virtual Machine Introspection - Application
• Reduction of Semantic Gap
• Isolation: External Monitor & Scanner
• Flexibility: Building on-demand security functions
• Can run general tools such as ptrace, strace, netstat
[Figure: Example of VMI application. Labels: Security VM (safe from rootkit), Guest VM (rootkit), forwarding of extracted semantics, Hypervisor.]

Trustworthy Cloud Service
• Defense against external threats is not enough; trustworthy operation is also required
• Security solutions are already available
• Trustworthy operation also needs to be guaranteed
• Customer data protection

Trustworthy Cloud Service
• Privacy violated by Google employee

Trustworthy Cloud Service: Prevention
• De-privileged Control VM
• TCB: Hypervisor
[Figure: As-is vs. to-be. Labels: Administrator (remote access), Guest VMs, Control VM, Hypervisor, non-root mode, root mode, hypervisor access control.]

Trustworthy Cloud Service: Detection
• Guarantees the service provider's legitimate access to guest VMs
• Audit log collection
  o Information flow tracking among Hypervisor ↔ Control VM ↔ Guest VM
  o Log-collecting hooks patched on cloud infrastructure
[Figure labels: Guest VMs, Control VM, Hypervisor; log-collecting hook injection, information flow tracking, integrity check; isolated environment.]
[Figure labels: Log DB; audit log analysis; VM → Admin: virus scan, VMI; Admin → VM: guest OS patch.]

Hypervisor Internals
CPU Virtualization

What is a Virtual Machine Monitor
▪ Virtual Machine Monitor = Hypervisor
▪ Software layer to monitor virtual machines
  ▪ Resource management
  ▪ Scheduling virtual machines
▪ Why virtualization?
  ▪ IBM mainframes in the 1960s
  ▪ Server consolidation: to use the remaining resources
  ▪ Isolation between virtual machines

What is a Virtual Machine Monitor
▪ Type 1
  ▪ Runs directly on hardware
  ▪ Xen, VMware ESX Server
▪ Type 2
  ▪ Runs on a host OS
  ▪ VirtualBox, KVM, VMware Workstation

What is a Virtual Machine Monitor
▪ Xen (Type 1)
  ▪ Runs directly on hardware
  ▪ Managed through Dom0, the privileged control domain
▪ KVM (Type 2)
  ▪ Runs as a part of the host kernel
  ▪ Each VM is a host process; each VCPU is a host thread

CPU Virtualization
• Modern hypervisors use hardware-assisted virtualization based on Intel VT-x (and AMD SVM) technology
• We will go through both traditional and HW-assisted virtualization techniques to understand virtualization

CPU Virtualization: The Rings of Privilege
• Privileged mode (Ring 0)
  o Contexts in this mode can execute "privileged instructions"
• Non-privileged mode (Ring 1 ~ Ring 3)
  o Contexts in this mode cannot execute privileged instructions
• Operating systems run in privileged mode (Ring 0) and user applications run in non-privileged mode (Ring 3)
• Then WHERE should a hypervisor be placed?
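The ring rule above can be sketched as a toy model. This is only an illustration under stated assumptions, not how real hardware is implemented: `ToyCpu`, `GeneralProtectionFault`, and the instruction names in `PRIVILEGED` are hypothetical names invented for this sketch. It shows that a privileged instruction succeeds in Ring 0 and faults in any other ring, which is the behavior a trap-based hypervisor relies on.

```python
from dataclasses import dataclass

# Hypothetical instruction labels used only for this sketch.
# On x86, HLT, LGDT, and writes to CR3 are privileged instructions.
PRIVILEGED = {"HLT", "LGDT", "MOV_TO_CR3"}


class GeneralProtectionFault(Exception):
    """Raised when a privileged instruction runs outside Ring 0."""


@dataclass
class ToyCpu:
    ring: int  # current privilege level: 0 (most privileged) to 3

    def execute(self, instruction: str) -> str:
        # Privileged instructions are only allowed in Ring 0.
        if instruction in PRIVILEGED and self.ring != 0:
            raise GeneralProtectionFault(f"{instruction} at ring {self.ring}")
        return f"executed {instruction} at ring {self.ring}"
```

In this model, `ToyCpu(ring=0).execute("HLT")` succeeds, while `ToyCpu(ring=3).execute("HLT")` raises `GeneralProtectionFault`; an unprivileged instruction such as the made-up `"ADD"` runs in either ring.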
Traditional Hypervisor Implementation
• Hypervisor in privileged layer (kernel mode)
• Guest OS in non-privileged layer (user mode)
• When the guest OS executes a privileged instruction, a trap is generated and the hypervisor emulates the instruction (Trap-and-Emulate)
• Problem: some sensitive instructions do not cause traps and instead behave differently depending on whether they run in Ring 0 or Ring 3 (e.g., POPF)

Traditional Hypervisor Implementation: Software-only CPU Virtualization
• Binary Translation (VMware)
  o Scan guest OS code to locate "non-virtualizable" instructions and substitute them with an equivalent set of virtualizable instructions
• Hypercall (Xen)
  o Modify guest OS source code (e.g., Linux)
  o Implement a hypercall interface that replaces privileged instructions with explicit requests to the hypervisor

Traditional Hypervisor Implementation: Binary Translation
• Non-virtualizable instructions are replaced with either
  o An equivalent set of non-problematic instructions
  o A trap to the hypervisor

Traditional Hypervisor Implementation: Paravirtualization
• Implements virtualization through OS source code modification
• Maximizes the cooperation between OS and hypervisor
• Achieves high performance

Traditional Hypervisor Implementation: Paravirt Operation
• Linux provides a unified paravirtualization API called Paravirt Ops
• Hypervisor vendors use Paravirt Ops to implement paravirtualization features
• The Linux kernel can be compiled to replace sensitive operations with paravirt hypercalls (using compiler directives)

Intel VT-x: VMX Enables Hardware-assisted Virtualization
• Intel introduced Virtual Machine Extensions (VMX) to achieve efficient virtualization on the x86 architecture
• Splits Ring 0 into two modes
  o Root mode: hypervisor mode
  o Non-root mode: guest OS kernel mode
• Separate execution mode for the hypervisor
• Allows the guest OS kernel to run in Ring 0

Intel VT-x: Motivation
• To complement non-virtualizable aspects of the x86 architecture
• Enable efficient virtualization on the x86 architecture
• Introduction of a new privilege mode for the hypervisor (root mode in Ring 0)
• Deal with non-virtualizable instructions
• Suppress excessive trapping
• Eliminate software emulation and enable virtualization without guest OS modifications (paravirtualization, binary translation)
• Deprivileging the guest OS is no longer needed

Hardware-assisted Virtualization
• A VM exit occurs automatically for traps that need to be handled by the hypervisor
• Eliminates the need for binary translation or paravirtualization (guest OS modification)
• Intel VT-x and AMD SVM add virtualization extensions to Intel/AMD x86 processors
• Splits the privileged layer (Ring 0) into non-root / root mode
• Generates traps for instructions that require hypervisor interposition (VM exit / VM entry)

Hardware-assisted Virtualization
[Figure: VM0 and VM1 each run apps on a guest OS (Guest OS0, Guest OS1) above the VM Monitor on the physical host hardware; a VM exit transfers control to the VM Monitor and a VM entry returns it to the guest.]
• With VMX, sensitive instructions that require hypervisor intervention cause a VM exit
• The hypervisor handles the event and issues a VM entry to resume the guest VM

Hardware-assisted Virtualization: VMCS (Virtual Machine Control Structure)
• Data structure that manages VMX operations
• Stores guest OS state and VMX configuration

The VMCS consists of six logical groups:
• Guest-state area: Processor state saved into the guest-state area on VM exits and loaded on VM entries.
• Host-state area: Processor state loaded from the host-state area on VM exits.
• VM-execution control fields: Fields controlling processor operation in VMX non-root operation.
• VM-exit control fields: Fields that control VM exits.
• VM-entry control fields: Fields that control VM entries.
• VM-exit information fields: Read-only fields that receive information on VM exits, describing the cause and the nature of the VM exit.

Virtual Machine Scheduling
• In an OS kernel, processes are scheduled
• In a hypervisor, VCPUs are scheduled

Memory Virtualization

Three Levels of Address Space
• Virtual address
• Physical address
• Machine address

Three Levels of Address Space
Memory virtualization is needed to
• Allocate and isolate memory to VMs
• Reflect VM address mappings and permissions in the hypervisor's page tables
• Provide VMs with access to machine memory

Shadow Page Table (SPT)
• SPT was the most widely used memory virtualization method before Intel Extended Page Tables (EPT) / AMD NPT
• Key techniques in SPT
  o CR3 access traps: guest writes to CR3 (to perform context switches) are trapped to the hypervisor and emulated
  o Guest kernel page tables are write-protected: guest kernel writes to its page tables are trapped and emulated by the hypervisor
• SPT maps virtual addresses directly to machine addresses
[Figure: the virtual CR3 points to the guest page tables; the real CR3 points to the corresponding shadow page tables.]
Optimizing SPT: Lazy Pull-through
• Trapping and emulating every guest kernel page table update causes too many VM exits
• VM exits are essentially context switches and cause performance overhead (e.g., due to TLB flushing)
• Lazy pull-through
  o Let L1 (leaf page tables, a.k.a. PTEs in Linux) go out of sync, and sync them when a page fault actually occurs
  o Emulate writes to L4, L3, and L2 each time
[Figure: L4, L3, and L2 are emulated at every write; L1 is out of sync (lazy pull-through).]

Issues with SPT
• Positives
  o Handle page faults in the same way as emulated TLBs
  o Fast guest context switching
• Page table consistency
  o Guest may not need to invalidate the TLB on writes to off-line page tables
  o Need to trace writes to shadow page tables to invalidate entries
• Memory bloat
  o Caching guest page tables takes memory
  o Need to determine when the guest has reused page tables

Intel Extended Page Tables (EPT)
• Intel EPT adds another page table pointer called the "EPT Base Pointer"
• Another (nested) layer of page table walking
• EPT page table entry structure
• TLB filling with two page tables (guest page table + EPT)
[Figure: Address translation with nested paging (EPT). Labels: virtual address, TLB, machine address, guest page table, PhysMap by VMM; steps 1 to 3.]
[Figure: Guest and EPT structure.]

Intel Extended Page Tables (EPT)
• Positives
  o Simplifies monitor design
  o No need for page protection calculus
• Negatives
  o Guest page table is in physical address space
  o Need to walk the physical map multiple times
    - Need physical-to-machine mapping to walk the guest page table
    - Need physical-to-machine mapping for the original virtual address
• Other memory virtualization hardware assists
  o Monitor mode has its own address space
  o No need to hide the monitor

State-of-the-Art VM-based Research

Virtual Machine Introspection (VMI)
VM introspection systems often aim to
• Detect or prevent kernel-level malware (e.g., rootkits)
• Identify malicious programs running in a VM
• Ensure integrity or secrecy of processes and files

Virtual Machine Introspection (VMI)
VMI systems normally have
• An introspection VM
• A monitored user VM
• VM monitoring mechanisms
[Figure labels: Disk, Memory, Network, H/W events, CPU registers.]

Virtual Machine Introspection (VMI)
• VMware VMsafe API
• LibVMI (open source)

Out-of-VM Monitoring vs In-VM Monitoring
[Figure: Out-of-VM: the monitor runs in a Secure VM next to the Guest VM, above the hypervisor. In-VM: the Secure VM's monitor additionally has an in-guest component inside the Guest VM.]
• Out-of-VM monitoring: the monitor resides outside of the VM
  o Benefits
    - Complete isolation provides maximum security
    - Simpler design
  o Drawbacks
    - Heavy performance overhead due to context switching
    - Monitoring granularity
• In-VM monitoring: the monitor has an agent inside the VM
  o Adopted for various reasons
    - Protecting in-VM hooks (Lares, Oakland '08)
    - Reducing performance overhead (SIM, CCS '09)
    - Narrowing the semantic gap by executing in-VM code (SYRINGE, RAID '12)

Out-of-VM Monitoring Techniques: Event-triggered Monitoring Techniques
• VMI techniques leverage HW-assisted virtualization features to implement event-triggered techniques
• VM exit: a mechanism, triggered by certain events, that forces the running guest process or thread into hypervisor context
• VM exit trigger conditions
  o Execution of sensitive instructions that need hypervisor intervention
  o Modification of sensitive register values
  o Page faults (non-existent or disallowed memory access)
  o Other faults (e.g., general protection fault)

Out-of-VM Monitoring Techniques (Cont'd): Event-triggered Monitoring Techniques (Cont'd)
• Virtualized memory: guest VM memory is managed by the hypervisor
  o Shadow Page Table
  o Extended Page Tables
• Memory monitoring
  o Mark the pages that we want to monitor as non-writable
  o When the guest attempts to write to such a page, a VM exit occurs and the hypervisor's VM exit handler is invoked
  o One can implement inspection functions in VM exit handlers

Out-of-VM Monitoring Techniques (Cont'd): Memory Access
• For a physical memory region P in the guest VM, there is a machine memory region M
• Map M as P' in the monitor VM's memory space
[Figure: P in the guest VM physical memory space and P' in the monitor VM physical memory space both map to M in machine memory, through the hypervisor.]

In-VM Monitoring: Lares
• Places an in-VM hook that is protected by the hypervisor
• Security augmentation for the traditional kernel hooks typically used by AV software

In-VM Monitoring (Cont'd): SYRINGE
• Allows an out-of-VM monitor to inject a function call request into the VM
• Executes in-VM kernel code and returns the results back to the monitor
• Goal: alleviate the semantic gap by reusing in-VM kernel code

Semantic Gap
• The hypervisor's visibility into the monitored VM is limited to hardware-level abstractions
  o Register values
  o CPU events
  o Physical memory contents
• Security policies involve high-level abstractions
  o Processes
  o High-level events
  o Data structures

Semantic Gap
• Such discrepancies in the level of information create the semantic gap
• VMI development requires reconstruction of low-level data into high-level data structures.
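As a rough illustration of that reconstruction step, the sketch below walks a linked list of process records inside a raw memory dump and turns it into a list of (pid, name) pairs. Everything about the layout is an assumption made for this example: the `RECORD` format, the `END` marker, and the helper names `list_processes` and `build_demo_memory` are hypothetical and do not correspond to any real kernel structure or VMI library.

```python
import struct
from typing import List, Tuple

# Hypothetical record layout (NOT a real kernel structure):
#   4-byte little-endian pid, 16-byte NUL-padded name,
#   4-byte offset of the next record.
RECORD = struct.Struct("<I16sI")
END = 0xFFFFFFFF  # hypothetical end-of-list marker


def list_processes(memory: bytes, head_offset: int) -> List[Tuple[int, str]]:
    """Rebuild a high-level process list from raw memory bytes."""
    processes = []
    seen = set()  # guards against cycles in a corrupted list
    offset = head_offset
    while offset != END and offset not in seen:
        seen.add(offset)
        pid, raw_name, next_offset = RECORD.unpack_from(memory, offset)
        name = raw_name.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        processes.append((pid, name))
        offset = next_offset
    return processes


def build_demo_memory() -> bytes:
    """Create a fake 128-byte memory dump holding two linked records."""
    memory = bytearray(128)
    RECORD.pack_into(memory, 64, 1, b"init", 8)     # head record, links to offset 8
    RECORD.pack_into(memory, 8, 42, b"sshd", END)   # last record
    return bytes(memory)
```

Calling `list_processes(build_demo_memory(), 64)` returns `[(1, 'init'), (42, 'sshd')]`. A real introspection tool needs the actual guest kernel's structure offsets and a guest-physical-to-machine address translation, which this toy deliberately leaves out.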
Bridging Semantic Gap: Virtuoso
• Automatically generates introspection programs based on dynamic learning from a trusted VM

Hypervisor-based Dynamic Analysis
• FireEye Dynamic Threat Intelligence Cloud™

Hypervisor-based Dynamic Analysis: Ether (CCS 2008)
• Monitor
  o The instructions executed by a guest process
  o Memory writes a guest process performs
  o System calls a guest process makes
• Analysis
  o Coarse-grained tracing: system calls
  o Fine-grained tracing: memory writes and instructions

Hypervisor-based Dynamic Analysis
• Guarantee that a debug exception occurs after every instruction by setting a flag called the trap flag
• Upon handling a debug exception, Ether sets the trap flag again for the next instruction, thereby inducing a debug exception after every instruction
• Prevent the analysis target from detecting Ether's presence

Hypervisor-based Dynamic Analysis
• Hiding the trap flag
  o PUSHF, INT
  o What if the program itself sets the trap flag? Detected through the POPF instruction
• Page table modifications
• SYSENTER_EIP_MSR
  o Save any value the target attempts to write into SYSENTER_EIP_MSR
[Figure: Setting the trap flag.]

Security and Privacy for Cloud
• Guest kernel is compromised
• Hypervisor is trusted
• Need to protect user applications
[Figure: app.1 ... app.N run on the Guest OS, above the Hypervisor and Hardware; Network.]

Security and Privacy for Cloud: InkTag
• Uses 2 EPTs to separate trusted and untrusted page tables
• The compromised OS is forced to use the untrusted EPT
  o Only encrypted application memory is exposed
• The hypervisor allows a trusted application to use the trusted EPT
  o It can read/write its own memory
[Figure labels: Trusted EPT, Untrusted EPT.]

Security and Privacy for Cloud: InkTag
• Paraverification: audit the kernel's responses in system call handling
[Figure: Paraverification flow. Labels: HAP; mmap(file=…, token=4; 0x7fcb; MMU_REGISTER; Guest OS; pte_update(addr=0x7fcb; pte_update(addr…; token=4; Hypervisor.]

Security and Privacy for Cloud: TCB - Nested Virtualization
Threat Model
• Compromised hypervisor
• Abnormal access to guest VM
[Figure labels: app.1 ..., Encrypted Guest OS.]
[Figure labels: app.6, app.12 ... app.N; Encrypted Guest OS; data exfiltration; Control VM; Hypervisor; Nested Hypervisor; Hardware; control flow (Guest OS → Hypervisor); control flow (Hypervisor → Guest OS); Intel TPM/TXT integrity check.]
• Interposed security operations in the nested hypervisor
  o I/O verification
  o Memory mapping verification/protection
  o CPU value hiding/protection

Hypervisor for Security
• The hypervisor is itself software and can also be a target of software attacks
• The Xen 0wning Trilogy showed possible hypervisor subversion via DMA

Q&A