Slide 1: Design and Implementation of a Consolidated Middlebox Architecture
Vyas Sekar, Sylvia Ratnasamy, Michael Reiter, Norbert Egi, Guangyu Shi

Slide 2: Need for Network Evolution
• New applications
• Evolving threats
• Performance, security, and compliance requirements
• Policy constraints
• New devices

Slide 3: Network Evolution Today: Middleboxes!
Data from a large enterprise: >80K users across tens of sites. Network security appliances alone are a $10 billion market.

  Type of appliance    Number
  Firewalls            166
  NIDS                 127
  Media gateways       110
  Load balancers        67
  Proxies               66
  VPN gateways          45
  WAN optimizers        44
  Voice gateways        11
  Total middleboxes    636
  Total routers        ~900

Slide 4: Key "Pain Points"
• Specialized boxes increase capital expenses and device sprawl.
• Narrow management interfaces increase operating expenses.
• "Point" solutions limit extensibility and flexibility.

Slide 5: Outline
• Motivation
• High-level idea: Consolidation
• System design
• Implementation and Evaluation

Slide 6: Key Idea: Consolidation
Two levels of consolidation, corresponding to two sources of inefficiency:
1. Consolidate the platform: run middlebox applications on shared hardware.
2. Consolidate management: a logically centralized, network-wide controller.

Slide 7: Consolidation at the Platform Level
Today: independent, specialized boxes (proxy, firewall, IDS/IPS, application filter). Instead, decouple hardware and software and run on commodity hardware (e.g., PacketShader, RouteBricks, ServerSwitch, SwitchBlade). Consolidation reduces capital expenses and sprawl.

Slide 8: Consolidation Reduces CapEx
Multiplexing benefit = (sum of per-appliance peak utilizations) / (peak of total utilization). Provisioning one consolidated box for the peak of the aggregate workload needs less capacity than provisioning each appliance for its own peak.

Slide 9: Consolidation Enables Extensibility
Applications (VPN, web mail, IDS, proxy, firewall) share lower-level modules such as protocol parsers and session management. Reusable modules contribute 30–80% of the processing.

Slide 10: Management Consolidation Enables Flexible Resource Allocation
Today, all processing happens at the logical "ingress": for traffic from N1 to N3, the entire workload P runs at N1, overloading it. Distributing the work network-wide (e.g., 0.4P at N1, 0.3P at N2, 0.3P at N3) reduces load imbalance.

Slide 11: Outline
• Motivation
• High-level idea: Consolidation
• CoMb: System design
• Implementation and Evaluation

Slide 12: CoMb System Overview
• Network-wide controller: logically centralized (e.g., NOX, 4D).
• General-purpose hardware (e.g., PacketShader, RouteBricks, ServerSwitch).
• Existing work targets a simple, homogeneous, routing-like workload; middleboxes are complex and heterogeneous, which creates new opportunities.

Slide 13: CoMb Management Layer
Goal: balance load across the network by leveraging multiplexing, reuse, and distribution. The controller takes policy constraints, resource requirements, routing, and traffic as inputs, and assigns processing responsibilities to nodes.

Slide 14: Capturing Reuse with HyperApps
A hyperapp is the union of the applications that must run on a traffic class, with modules shared between applications counted once. Example footprints (CPU units, memory units): IDS costs (3, 1) and Proxy costs (1, 4), but they share a common module costing (1, 1).
• HTTP = IDS & Proxy: footprint 1+2 = 3 CPU, 1+3 = 4 memory
• UDP = IDS: 3 CPU, 1 memory
• NFS = Proxy: 1 CPU, 4 memory
Without hyperapps, we would need per-packet policy checks and explicit reuse dependencies; with hyperapps, policy and dependencies are implicit.

Slide 15: Modeling Processing Coverage
HTTP traffic from N1 to N3 must run the IDS before the Proxy (IDS < Proxy). The question the management layer answers: what fraction of the HTTP traffic from N1 to N3 should each node process (e.g., 0.4 at N1, 0.3 at N2, 0.3 at N3)? A sketch of both ideas in code follows.
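The following is a minimal, illustrative sketch in Python (not CoMb's actual code) of the two ideas above: computing a hyperapp's footprint by counting modules shared between applications once, and splitting a class's processing across the nodes on its path. The cost numbers loosely follow the example on slide 14; the module names (session_mgmt, ids_core, proxy_core) and function names are hypothetical.

```python
# Illustrative sketch: hyperapp footprints with module reuse, plus
# per-node footprints under a given processing-coverage split.

# Per-module resource costs: (CPU units, memory units).
MODULES = {
    "session_mgmt": (1, 1),   # module shared by IDS and Proxy ("common")
    "ids_core":     (2, 0),   # IDS-specific processing
    "proxy_core":   (0, 3),   # Proxy-specific caching state
}

# Each application is the set of modules it runs.
APPS = {
    "IDS":   {"session_mgmt", "ids_core"},
    "Proxy": {"session_mgmt", "proxy_core"},
}

# Policy: which applications each traffic class must traverse.
POLICY = {
    "HTTP": ["IDS", "Proxy"],  # hyperapp = IDS & Proxy
    "UDP":  ["IDS"],
    "NFS":  ["Proxy"],
}

def hyperapp_footprint(apps):
    """Union the modules of all apps, counting shared modules once."""
    modules = set().union(*(APPS[a] for a in apps))
    cpu = sum(MODULES[m][0] for m in modules)
    mem = sum(MODULES[m][1] for m in modules)
    return cpu, mem

for cls, apps in POLICY.items():
    cpu, mem = hyperapp_footprint(apps)
    print(f"{cls}: hyperapp {apps} -> {cpu} CPU, {mem} mem")

# Slide 15's coverage question: split HTTP from N1 to N3 across nodes.
coverage = {"N1": 0.4, "N2": 0.3, "N3": 0.3}   # fractions sum to 1
cpu, mem = hyperapp_footprint(POLICY["HTTP"])
for node, frac in coverage.items():
    print(f"{node}: {frac * cpu:.1f} CPU, {frac * mem:.1f} mem")
```

Running it reproduces the slide's footprints: HTTP's hyperapp costs (3, 4) rather than the (4, 5) that running IDS and Proxy back-to-back would cost, because the shared module is counted once.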
Slide 16: Network-wide Optimization
Minimize the maximum load, subject to:
• Processing coverage for each class of traffic: the fractions of processed traffic add up to 1.
• Load on each node: the sum of its hyperapp responsibilities across paths.
No explicit dependency or policy constraints are needed. The result is a simple, tractable linear program whose solutions come very close (< 0.1%) to the theoretical optimum. (One possible formulation is sketched in the backup note after slide 27.)

Slide 17: CoMb System Overview (recap)
• Network-wide controller: logically centralized (e.g., NOX, 4D).
• General-purpose hardware (e.g., PacketShader, RouteBricks, ServerSwitch).
• Existing work targets a simple, homogeneous, routing-like workload; middleboxes are complex and heterogeneous, which creates new opportunities.

Slide 18: CoMb Platform
Traffic arrives at the NIC, is classified by traffic class (e.g., HTTP), passes through a policy shim (PShim) that enforces application order (e.g., IDS < Proxy), and reaches the applications (IDS, ..., Proxy) running across cores (Core1 ... Core4). Challenges at each layer:
• Applications: performance, parallelization, isolation.
• Policy shim: lightweight, parallelizable.
• NIC and classification: no contention, fast classification.

Slide 19: Parallelizing Application Instances
Two ways to map applications (M1, M2, M3) to cores:
• App-per-core: each application gets its own core (M1 on Core1, M2 on Core2, M3 on Core3).
  – Incurs inter-core communication and more work for the PShim.
  + No in-core context switches.
• HyperApp-per-core: each core runs a full hyperapp (M1+M2 on Core1, M2+M3 on Core2). ✔
  + Keeps data structures core-local; better for reuse.
  – Incurs context switches and needs replicas (e.g., of M2).
HyperApp-per-core is better or comparable; contention does not seem to matter.

Slide 20: CoMb Platform Design
Each core runs its assigned hyperapps above a per-core PShim (e.g., Core1: HyperApp1 = M1, M2; Core2: HyperApp2 = M3 and HyperApp3 = M1, M4; Core3: HyperApp4 = M5 plus a replica of HyperApp3), fed by per-core NIC queues (Q1 ... Q5). The NIC hardware balances the workload across queues, so processing is parallel and core-local with contention-free network I/O.

Slide 21: Outline
• Motivation
• High-level idea: Consolidation
• System design: Making consolidation practical
• Implementation and Evaluation

Slide 22: Implementation
• Network-wide management: solved using CPLEX.
• Policy shim: kernel-mode Click.
• Extensible apps: protocol and session logic ported from Bro, running in Click.
• Standalone apps: attached via memory-mapped or virtual interfaces.
• Testbed: 8-core Intel Xeon with an Intel 82599 NIC.

Slide 23: Consolidation Is Practical
• Low overhead for existing applications.
• The controller takes < 1.6 s for a 52-node topology.
• 5x better than VM-based consolidation.

Slide 24: Benefits: Reduction in Maximum Load
MaxLoad_Today / MaxLoad_Consolidated: consolidation reduces the maximum load by 2.5-25X.

Slide 25: Benefits: Reduction in Provisioning Cost
Provisioning_Today / Provisioning_Consolidated: consolidation reduces provisioning cost by 1.8-2.5X.

Slide 26: Discussion
• Changes the traditional vendor business model.
  – Already happening (e.g., "virtual appliances").
  – The benefits imply someone will do it!
  – Vendors may already have extensible stacks internally.
• Isolation.
  – Currently relies on process-level isolation.
  – Can we get reuse despite isolation?

Slide 27: Conclusions
• Network evolution occurs via middleboxes.
• Today: narrow "point" solutions
  – High CapEx, OpEx, and device sprawl.
  – Inflexible, difficult to extend.
• Our proposal: a consolidated architecture
  – Reduces CapEx, OpEx, and device sprawl.
  – Extensible, general-purpose.
• More opportunities: isolation; APIs (H/W-Apps, Management-Apps, App Stack).
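Backup: a sketch of the network-wide linear program from slide 16, reconstructed from the slide's description rather than taken from the paper. The notation is mine, not the authors': f_{c,n} is the fraction of traffic class c processed at node n, path(c) its route, T_c its traffic volume, r_{H(c)} the per-unit footprint of its hyperapp H(c), Cap_n a node's capacity, and λ the maximum normalized load being minimized.

```latex
\begin{align*}
\min_{\{f_{c,n}\},\, \lambda} \quad & \lambda \\
\text{s.t.} \quad
  & \sum_{n \in \mathrm{path}(c)} f_{c,n} = 1
      && \forall c \quad \text{(coverage: fractions sum to 1)} \\
  & \sum_{c :\, n \in \mathrm{path}(c)} T_c \, f_{c,n} \, r_{H(c)}
      \;\le\; \lambda \, \mathrm{Cap}_n
      && \forall n \quad \text{(load: hyperapp responsibilities per path)} \\
  & f_{c,n} \ge 0
\end{align*}
```

Because responsibilities are expressed per hyperapp along each path, policy order (e.g., IDS < Proxy) and reuse dependencies need no explicit constraints, which is what keeps the program small and tractable.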