Virtual Switching Without a Hypervisor for a More Secure Cloud
Xin Jin (Princeton University)
Joint work with Eric Keller (UPenn) and Jennifer Rexford (Princeton)

Public Cloud Infrastructure
• Cloud providers offer computing resources on demand to multiple "tenants"
• Benefits:
  – Public (anyone can use it)
  – Economies of scale (lower cost)
  – Flexibility (pay-as-you-go)

Server Virtualization
[Figure: multiple VMs running on a hypervisor on top of the hardware]
• Multiple VMs run on the same server
• Benefits
  – Efficient use of server resources
  – Backward compatibility
• Examples
  – Xen
  – KVM
  – VMware

Network Virtualization
[Figure: VMs attached to a software switch running in the virtualization layer]
• Software switches
  – Run in the hypervisor or the control VM (Dom0)
• Benefits: flexible control at the "edge"
  – Access control
  – Resource and name-space isolation
  – Efficient communication between co-located VMs
• Examples
  – Open vSwitch
  – VMware's vSwitch
  – Cisco's Nexus 1000V

Security: a major impediment to moving to the cloud!
Let's take a look at where the vulnerabilities are...

Vulnerabilities in Server Virtualization
[Figure: guest VMs 1-3 running on top of the hypervisor and hardware]
• The hypervisor is quite complex
• Large amount of code → bugs (see NIST's National Vulnerability Database)
• The hypervisor is an attack surface (buggy, vulnerable) → malicious customers can attack the hypervisor

Vulnerabilities in Network Virtualization
[Figure: guest VMs connected through virtual interfaces to the software switch in Dom0, above the hypervisor, hardware, and physical NIC]
• The software switch runs in the control VM (Dom0)
• The hypervisor is involved in communication
• The software switch is coupled with the control VM → e.g., a software switch crash can lead to a complete system crash

Dom0 Disaggregation [e.g., SOSP'11]
[Figure: Dom0 split into several single-purpose service VMs alongside the guest VMs]
• Disaggregate the control VM (Dom0) into smaller, single-purpose, independent components
• A malicious customer can still attack the hypervisor!

NoHype [ISCA'10, CCS'11]
[Figure: guest VMs with physical device drivers access a virtualized physical NIC directly; Dom0 only emulates and manages]
• Pre-allocate memory and cores
• Use hardware-virtualized I/O devices
• The hypervisor is used only to boot up and shut down guest VMs
• Eliminates the hypervisor attack surface
• But what if I want to use a software switch?

Software Switching in NoHype
[Figure: the software switch runs inside a guest VM, so traffic between guest VMs is bounced through the virtualized physical NIC]
• Packets are bounced through the physical NIC
• This consumes excessive bandwidth on the PCI bus and the physical NIC!

Our Solution Overview
[Figure: guest VMs connect through virtual interfaces to a software switch running in a dedicated switch domain (DomS); Dom0 only emulates and manages]
• Eliminate the hypervisor attack surface
• Enable software switching in an efficient way

Eliminate the Hypervisor-Guest Interaction
[Figure: the guest VM's virtual interface and the software switch exchange packets through two shared-memory FIFO buffers, with each side polling]
• Shared memory
  – Two FIFO buffers for communication
• Polling only
  – No event channel is used, so the hypervisor is never involved (see the sketch below)
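To make the design above concrete, here is a minimal sketch of a polled, single-producer/single-consumer FIFO over shared memory: the sender publishes packets by advancing a head index, the receiver drains them by advancing a tail index, and neither side ever signals an event channel, so the hypervisor stays off the data path. This is an illustrative user-space sketch only, not the actual prototype (which implements the polling/FIFO mechanism in a Linux kernel module); all names, sizes, and structure layouts here are assumptions.

```c
/*
 * Illustrative sketch (not the actual prototype): a single-producer /
 * single-consumer FIFO in shared memory, drained by polling instead of
 * Xen event channels, so the hypervisor is not involved in the data path.
 * All names and sizes are hypothetical.
 */
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define RING_SLOTS 256        /* 256 slots of one 4 KB page each, ~1 MB total */
#define SLOT_SIZE  4096

struct fifo {
    _Atomic uint32_t head;    /* written only by the producer (sender)   */
    _Atomic uint32_t tail;    /* written only by the consumer (receiver) */
    uint32_t len[RING_SLOTS];
    uint8_t  slot[RING_SLOTS][SLOT_SIZE];
};

/* Producer: copy one packet into the ring; returns 0 if the ring is full. */
static int fifo_send(struct fifo *f, const void *pkt, uint32_t n)
{
    uint32_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS || n > SLOT_SIZE)
        return 0;

    uint32_t i = head % RING_SLOTS;
    memcpy(f->slot[i], pkt, n);
    f->len[i] = n;
    /* Release ordering publishes the copy before the new head becomes visible. */
    atomic_store_explicit(&f->head, head + 1, memory_order_release);
    return 1;
}

/* Consumer: take one packet out of the ring; returns its length, or 0 if empty. */
static uint32_t fifo_recv(struct fifo *f, void *buf)
{
    uint32_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&f->head, memory_order_acquire);

    if (tail == head)
        return 0;

    uint32_t i = tail % RING_SLOTS;
    uint32_t n = f->len[i];
    memcpy(buf, f->slot[i], n);
    atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
    return n;
}

/* Receive loop in the switch domain: drain the ring, then sleep for the polling period. */
static void poll_loop(struct fifo *rx, useconds_t period_us)
{
    uint8_t buf[SLOT_SIZE];
    for (;;) {
        while (fifo_recv(rx, buf) > 0) {
            /* hand the packet to the software switch here */
        }
        usleep(period_us);    /* e.g. 1000 us, the 1 ms period used in the evaluation */
    }
}
```

The only knob this leaves on the data path is the polling period: the evaluation below fixes it at 1 ms when varying the FIFO size, then varies it from 0.05 ms to 2 ms, trading CPU time for throughput.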
Limit Damage from a Compromised Switch
[Figure: the software switch moves out of Dom0 into a separate switch domain (DomS), decoupled from both Dom0 and the hypervisor]
• Decouple the software switch from Dom0
  – Introduce a switch domain (DomS)
• Decouple the software switch from the hypervisor
  – Eliminate the hypervisor attack surface

Preliminary Prototype
[Figure: native Xen (software switch in Dom0, hypervisor on the data path) vs. our solution (software switch in DomS, connected to the guest VM by polled FIFOs)]
• Prototype based on
  – Xen 4.1: used only to boot up and shut down VMs
  – Linux 3.1: kernel module implementing the polling/FIFO mechanism
  – Open vSwitch 1.3

Preliminary Evaluation
[Figure: the same native Xen vs. our solution setup as in the prototype]
• Evaluate the throughput between DomS and a guest VM, compared with native Xen
• Traffic measurement: Netperf
• Configuration: each VM has 1 core and 1 GB of RAM

Evaluation of Throughput
[Figure: two charts — throughput (Mbps) vs. FIFO size (4 to 512 pages, 1 page = 4 KB), and throughput (Mbps) vs. polling period (0.05 to 2 ms)]
• FIFO size
  – Polling period fixed at 1 ms
  – High throughput is reached with just 256 FIFO pages (only 1 MB)
• Polling period
  – The shorter the polling period, the higher the throughput
  – CPU resource consumption? → future work

Comparison with Native Xen
[Figure: throughput (Mbps) vs. message size (64 B to 16 KB) for our solution and native Xen]
• Outperforms native Xen when the message size is smaller than 8 KB
• Future work: incorporate more optimizations

Conclusion and Future Work
• Trend toward software switching in the cloud
• Security of the hypervisor and Dom0 is a big concern
• Improve security by enabling software switching without hypervisor involvement
• Future work
  – Detection and remediation of DomS compromise

Thanks! Q&A