SDN(++) and Interesting Use-Cases
Lecture 18
Aditya Akella

Two Parts
• SDN for L3-L7 services
  – What is different?
  – Why can't we use SDN/OpenFlow as is?
• New use cases
  – Live Network Migration
  – What else?

Toward Software-Defined Middlebox Networking
Aaron Gember, Prathmesh Prabhu, Zainab Ghadiyali, Aditya Akella
University of Wisconsin-Madison

Middlebox Deployment Models
• Arbitrary middlebox placement
• New forms of middlebox deployment (VMs, ETTM [NSDI 2011], CoMB [NSDI 2012])

Live Data Center Migration
• Move between software-defined data centers (Data Center A to Data Center B)
• Existing VM and network migration methods are unsuitable when the underlying substrate changes
• Need: programmatic control over middlebox state

Middlebox Scaling
• Add or remove middlebox VMs based on load
• Cloning a VM copies logic, policy, and internal state
  – Unsuitable for scaling down, or for some scale-up scenarios
• Need: fine-grained control over middlebox state

Our Contributions
• Classify middlebox state, and discuss what should be controlled
• Abstractions and interfaces
  – Representing state
  – Manipulating where state resides
  – Announcing state-related events
• Control logic design sketches

Software-Defined Middlebox Networking
• Today: standalone appliances (e.g., an IPS), each configured on its own
• SDN-like vision: control applications on a logically centralized controller manage the middleboxes

Key Issues
1. How is the logic divided?
2. Where is state manipulated?
3. What interfaces are exposed?

Middlebox State
• Configuration input plus detailed internal records
• Examples: flow records (Src: HostA, Proto: TCP, Port: 22, State: ESTAB, Seq#: 3423), server mappings (Server: B, CPU: 50%), cached content (Hash: 34225, Content: ABCDE), and configuration (Balance method: Round Robin, Cache size: 100)
• Takeaway: significant state diversity

Classification of State
• Action state: determines how traffic is handled (e.g., Src: HostA, Proto: TCP, Port: 22 mapped to Server: B); comes in many forms
• Supporting state: internal and dynamic records that the actions depend on (e.g., Server B at 50% CPU, State: ESTAB, Seq#: 3423, cached content)
• Tuning state: only affects performance, not correctness (e.g., Balance method: Round Robin, Cache size: 100)

How to Represent State?
• State may be per flow (Src: HostA, Proto: TCP, Port: 22, State: ESTAB, Seq#: 3423), may be shared across flows (Server: B, CPU: 50%; cached content), may be an opaque binary blob, or may be expressed in a policy language
• Challenges: significant diversity, unknown structure
• Opportunity: commonality among middlebox operations

State Representation
• Key: protocol header field/value pairs (Field1 = Value1 … FieldN = ValueN) identifying the traffic subset to which the state applies
• Action: a transformation function that changes parts of the packet to new constants (Offset1 → Const1 … OffsetN → ConstN)
• Supporting: an opaque binary blob
• Caveats: only suitable for per-flow state; not fully vendor-independent

How to Manipulate State?
• Today: only some state can be controlled
  – Constrains flexibility and sophistication
• Other extreme: manipulate all state at the controller
  – Removes too much functionality from the middleboxes

State Manipulation
• The controller determines where state resides (e.g., across IPS 1 and IPS 2); the middleboxes create and update the state
• Control over state placement requires:
  1. A broad operations interface
  2. Exposing state-related events

Operations Interface
• get(Filter): e.g., get(SrcIP = 10.10.0.0/16, DPort = 22) returns matching records such as Key {SrcIP = 10.10.54.41, DstIP = 10.20.1.23, SPort = 12983, DPort = 22}, Action {*}, Supporting {State = ESTAB}
• add(Key, Action, Supporting): e.g., add(Key {DstIP = 10.20.1.0/24}, Action {DROP}) installs a firewall rule (Source: *, Destination: 10.20.1.0/24, Proto: TCP, Other: *, Action: DROP)
• remove(Filter, …)
• Need atomic blocks of operations
• Potential for invalid manipulations of state
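To make the state representation and operations interface above concrete, here is a minimal Python sketch of a controller-side view of one middlebox's state. It is illustrative only: the names (StateRecord, MiddleboxStateAPI, and so on) are assumptions, not an interface defined in the slides, and real keys would need richer matching than shown here.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class StateRecord:
    # Key: header field/value pairs identifying the traffic subset this state applies to.
    key: Dict[str, str]
    # Action: packet offset -> new constant (the transformation function).
    # Note: actions like DROP do not fit this encoding cleanly (see "Open Questions" below).
    action: Dict[int, bytes] = field(default_factory=dict)
    # Supporting: opaque, vendor-specific binary blob.
    supporting: bytes = b""

def _matches(flt: Dict[str, str], key: Dict[str, str]) -> bool:
    """True if every filter field matches the record's key (IP prefixes allowed)."""
    for f, v in flt.items():
        if f not in key:
            return False
        if "/" in v:  # treat values like "10.10.0.0/16" as IP prefixes
            if ipaddress.ip_address(key[f]) not in ipaddress.ip_network(v):
                return False
        elif key[f] != v:
            return False
    return True

class MiddleboxStateAPI:
    """Hypothetical controller-side handle on one middlebox's state table."""
    def __init__(self) -> None:
        self.records: List[StateRecord] = []

    def get(self, flt: Dict[str, str]) -> List[StateRecord]:
        return [r for r in self.records if _matches(flt, r.key)]

    def add(self, record: StateRecord) -> None:
        self.records.append(record)

    def remove(self, flt: Dict[str, str]) -> None:
        self.records = [r for r in self.records if not _matches(flt, r.key)]

# Usage mirroring the slide's example:
fw = MiddleboxStateAPI()
fw.add(StateRecord(key={"SrcIP": "10.10.54.41", "DstIP": "10.20.1.23",
                        "SPort": "12983", "DPort": "22"},
                   supporting=b"State=ESTAB"))
established = fw.get({"SrcIP": "10.10.0.0/16", "DPort": "22"})  # returns one record
```

The atomic blocks of operations and the guards against invalid manipulations flagged on the slide would sit above this per-record interface.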
Events Interface
• Middleboxes (e.g., a firewall) raise events to the controller
• Triggers:
  – Created or updated state
  – State required to complete an operation
• Contents:
  – Key
  – A copy of the packet?
  – A copy of the new state?
• Must balance visibility and overhead

Conclusion
• Need fine-grained, centralized control over middlebox state to support rich scenarios
• Challenges: state diversity, unknown semantics
• Proposal: represent state as Key (Field1 = Value1 …), Action (Offset1 → Const1 …), and Supporting (binary blob), manipulated through get/add/remove operations

Open Questions
• Encoding supporting state and other action state?
• Preventing invalid state manipulations?
• Exposing events with sufficient detail?
• Maintaining operation during state changes?
• Designing a variety of control logics?
• Providing middlebox fault tolerance?

Related Work
• Simple Middlebox Control (SIMCO) protocol [RFC 4540]
• Modeling middleboxes [IEEE Network 2008]
• Stratos: middleboxes in clouds [UW-Madison TR]
• ETTM: middleboxes in hypervisors [NSDI 2011]
• CoMB: consolidated middleboxes [NSDI 2012]
• Efficiently migrating virtual middleboxes [SIGCOMM 2012 poster]
• LIME: live migration of an entire network [HotNets 2012]

Live Migration of an Entire Network (and its Hosts)
Eric Keller, Soudeh Ghorbani, Matthew Caesar, Jennifer Rexford
HotNets 2012

Virtual Machine Migration
• Widely supported, to help:
  – Consolidate to save energy
  – Relocate to improve performance

But Applications Look Like This
• Many VMs working together

And Rely on the Network
• Networks have increasing amounts of state: configured, learned, and software-defined

Ensemble Migration
• Joint (virtual) host and (virtual) network migration
• No re-learning, no re-configuring, no re-calculating
• Capitalize on redundancy

Some Use Cases
1. Moving between cloud providers
  – Customer driven: for cost, performance, etc.
  – Provider driven: offload when too full
2. Moving to a smaller set of servers
  – Reduce energy consumption (turn off servers, reduce cooling)
3. Troubleshooting
  – Migrate the ensemble to infrastructure dedicated to testing (special equipment)

Goal: General Management Tool
• Automated migration according to some objective, plus easy manual input
• Building blocks: monitoring, automation driven by the objective, manual migration requests, and ensemble migration

LIME: Live Migration of Ensembles
• Tenants control their own virtual topologies
• LIME sits beneath tenant control: migration orchestration and migration primitives, with an API to the operator/automation
• Built on network virtualization over a software-defined network and virtualized servers
• Migration is transparent to tenant control

Why Transparent?

Separate Out Functionality
• Tenant control operates on a virtual topology above a network virtualization layer
• Migration orchestration and migration primitives slot in between, invisible to the tenant

Multi-tenancy
• Tenants see only their virtual topologies
• The infrastructure operator runs the migration orchestration, migration primitives, and network virtualization

How to Live Migrate an Ensemble
• Can we base it on VM migration? The standard pre-copy recipe (sketched below):
  – Iteratively copy state
  – Freeze the VM
  – Copy the last delta of state
  – Un-freeze the VM on the new server
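Before seeing why this recipe breaks down for a whole network, here is a minimal sketch of the classic pre-copy loop it refers to. The hypervisor API used here (src.read_pages, dst.resume, and so on) is hypothetical, purely to make the four steps concrete.

```python
def precopy_migrate(vm, src, dst, max_rounds=30, dirty_threshold=1000):
    """Classic pre-copy VM migration: iterate, freeze, copy the last delta, resume."""
    dst.allocate(vm)                          # reserve memory/CPU on the destination
    dirty = src.all_pages(vm)                 # round 0: every page counts as "dirty"

    # 1. Iteratively copy state while the VM keeps running.
    for _ in range(max_rounds):
        dst.write_pages(vm, src.read_pages(vm, dirty))
        dirty = src.pages_dirtied_since_last_round(vm)
        if len(dirty) < dirty_threshold:      # remaining delta is small enough
            break

    # 2. Freeze the VM (downtime begins).
    src.pause(vm)

    # 3. Copy the last delta of state (memory plus CPU/device state).
    dst.write_pages(vm, src.read_pages(vm, dirty))
    dst.restore_device_state(vm, src.save_device_state(vm))

    # 4. Un-freeze the VM on the new server (downtime ends).
    dst.resume(vm)
    src.destroy(vm)
```

The slides that follow ask what happens when this per-VM recipe is applied to an ensemble, to the whole network at once, or to each switch in turn.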
Applying to Ensemble
• Iterative copy, freeze and copy, resume: the whole ensemble at once
• Complex to implement; downtime potentially large

Applying to Whole Network
• Iterative copy, freeze and copy, resume: the entire network at once
• Lots of packet loss; lots of "backhaul" traffic

Applying to Each Switch
• Iterative copy, freeze and copy, resume: one switch at a time
• Bursts of packet loss; even more "backhaul" traffic; long total time

A Better Approach
• Clone the network
• Migrate the VMs individually (or in groups)

Clone the Network
• Copy the network state to the new location, then run both copies in cloned operation
• Migrate the VMs while the clones run
• Minimizes backhaul traffic
• No packet loss associated with the network (the network is always operational)

Consistent View of a Switch
• Application view: a single Switch_A; physical reality: two clones, Switch_A_0 and Switch_A_1, beneath the migration orchestration, migration primitives, and network virtualization layers
• Goal: the same guarantees as a migration-free network; preserve application semantics

Sources of Inconsistency
• Migration-free baseline: packet 0 and packet 1 traverse the same physical switch; with cloning, they may traverse different clones (Switch_A_0 vs. Switch_A_1)
1. Local changes on a switch (e.g., a rule deleted after an idle timeout on one clone but not the other)
2. Updates from the controller (e.g., Install(R_new) reaches the clones at different times, so packet 0 matches R_new on Switch_A_0 while packet 1 does not on Switch_A_1)
3. Events to the controller (e.g., with "forward and send to controller" rules, Packet-in(pkt 1) may reach the controller before Packet-in(pkt 0))

Consistency in LIME
• Present a single Switch_A to applications while the clones Switch_A_0 and Switch_A_1 exist
• Emulate hardware functions and combine information from the clones
• Restrict the use of some features
• Use a commit protocol for controller updates (see the sketch at the end of these notes)

Conclusions and Future Work
• LIME is a general and efficient migration layer
• Hope: future SDNs are made migration-friendly
• Develop models and prove correctness?
  – End hosts and network
  – "Observational equivalence"
• Develop a general migration framework
  – Control over grouping, order, and approach?
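Finally, to make the "use a commit protocol" point from the Consistency in LIME slide concrete, below is a minimal two-phase sketch of how a migration layer might apply a controller update to both clones of a switch. The classes and structure are assumptions for illustration, not LIME's actual mechanism; in particular, a real system must also coordinate the moment the rule becomes active (e.g., by briefly holding affected traffic or restricting features, as the slide notes).

```python
class SwitchClone:
    """One physical replica (e.g., Switch_A_0) of the virtual Switch_A."""
    def __init__(self, name):
        self.name = name
        self.staged, self.active = {}, {}

    def prepare(self, rule_id, rule) -> bool:
        self.staged[rule_id] = rule       # stage the rule; do not match packets on it yet
        return True                       # acknowledge the prepare

    def commit(self, rule_id) -> None:
        self.active[rule_id] = self.staged.pop(rule_id)

    def abort(self, rule_id) -> None:
        self.staged.pop(rule_id, None)

class ClonedSwitchProxy:
    """Presents a single Switch_A to the tenant controller; fans updates out to the clones."""
    def __init__(self, clones):
        self.clones = clones

    def install(self, rule_id, rule) -> bool:
        # Phase 1: every clone must acknowledge the staged rule.
        if not all(c.prepare(rule_id, rule) for c in self.clones):
            for c in self.clones:
                c.abort(rule_id)
            return False
        # Phase 2: activate the rule on every clone, so the tenant never
        # observes one clone with the rule and the other without it.
        for c in self.clones:
            c.commit(rule_id)
        return True

# Usage: the tenant thinks it is installing R_new on a single Switch_A.
switch_a = ClonedSwitchProxy([SwitchClone("Switch_A_0"), SwitchClone("Switch_A_1")])
switch_a.install("R_new", {"match": "dst 10.0.0.0/8", "action": "forward:2"})
```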