Software Defined Networking
COMS 6998-10, Fall 2014
Instructor: Li Erran Li (lierranli@cs.columbia.edu)
http://www.cs.columbia.edu/~lierranli/coms 6998-10SDNFall2014/
10/13/2014: SDN Updates

Midterm Date
• Two options
  – Oct 27
  – Nov 10
• A quick show of hands
• Result: midterm on Nov 10

Outline
• Announcement: project proposal due Oct 23
• Review of Previous Lecture
  – Verification of network properties
  – Verification of controller correctness
  – Verification of software data plane
• SDN Updates
  – Per Switch Consistent and Minimal Updates
  – Network-wide Updates
    • Consistent Updates
    • Congestion-Free Updates
    • Network Partition

Review of Previous Lecture (Cont’d)
• NetPlumber graph:
  – Creates a dependency graph of all forwarding rules in the network and uses it to verify policy
  – Nodes: forwarding rules in the network
  – Directed edges: next-hop dependency of rules and intra-table dependency
[Figure: Switch 1 and Switch 2, with rules R1 and R2]

Review of Previous Lecture (Cont’d)
[Figure: example NetPlumber graph. Where is the missing edge?]
Source: P. Kazemian, Stanford

Review of Previous Lecture (Cont’d)
[Figure: example NetPlumber graph]

Review of Previous Lecture (Cont’d)
The System
Frenetic DSL: domain-specific language
• predicates and policies
• monitoring
• mac learning
• network address translation
implemented using Frenetic
implemented using Ox
Ox
OCaml embedding
• predicates and policies
• queries
OCaml OpenFlow Platform
• similar to Nox, Pox, Floodlight, etc.
Source: Nate Foster, Cornell

Review of Previous Lecture (Cont’d)
Certified NetKAT Controller
• Each level of abstraction formalized in Coq
• Machine-checked proofs that the transformations between levels preserve semantics
• Code extracted to OCaml and deployed with real switch hardware
[Figure labels: NetKAT, Compiler, Optimizer, Flow tables, Run-time system, OpenFlow messages]

Review of Previous Lecture (Cont’d)
Verification of software dataplane
• Proof of bounded execution property
  – No more than X instructions per packet
• Proof of crash-freedom property
  – No packet will cause the pipeline to abort
• Need to handle
  – Pipelines
  – Loops (may exceed time constraint)
  – Data structures (can crash)

[Figure: pipeline of m elements (IP forwarding, intrusion detection, application acceleration); elements do not share mutable state; verification time ∼ 2^(n·m)]
[Figure: the same pipeline, with an element containing "assert(src != dst);"; elements do not share mutable state; verification time ∼ m·2^n]

Review of Previous Lecture (Cont’d)
Pipeline decomposition
• Rule: pipeline structure
  – distinct packet-processing elements
  – do not share mutable state
• Effect: compose at the element level
  – can reduce #paths from ∼ 2^(n·m)
  – to ∼ m·2^n

IP options
[Figure: loop over m options (option #1, option #2, ..., option #m); verification time ∼ n^m]
[Figure: the same loop with little state sharing across iterations; verification time ∼ m·n]

Review of Previous Lecture (Cont’d)
Loop decomposition
• Rule: “mini-pipeline” structure
  – little state shared across iterations
  – made explicit by the programmer
• Effect: compose at the iteration level
  – can reduce #paths from ∼ n^m
  – to ∼ m·n

Review of Previous Lecture (Cont’d)
Verified data structures
• Use pre-allocated arrays
  – no dynamic memory (de)allocation
  – hash table, longest prefix match
• Trade memory for “verifiability”
  – at least as fast (array lookups)
  – but larger memory footprint (pre-allocation)

Updates Happen
Network Updates
• Maintenance
• Failures
• ACL Updates
Desired Invariants
• No black-holes
• No loops
• No security violations

Update one Switch
Distributed programming: non-atomic table updates
[Figure: flow tables (columns: Priority, Predicate, Action) related by ⊆, illustrating update re-ordering. Tables shown, in order: an empty table; {10, SSH, Drop}; {5, dst_ip = H1, Fwd 1}; {10, SSH, Drop; 5, dst_ip = H1, Fwd 1}; {5, dst_ip = H1, Fwd 1; 5, dst_ip = H2, Fwd 2}; {10, SSH, Drop; 5, dst_ip = H1, Fwd 1; 5, dst_ip = H2, Fwd 2}]
Source: Nate Foster, Cornell

Update one Switch (Cont’d)
• Solution: insert barrier messages to enforce partial ordering of rule updates
• Question: what is the algorithm?
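One possible answer to the question above, as a minimal sketch and not necessarily the algorithm intended in the lecture: treat the ordering requirements as a dependency graph over rule updates, group the updates into layers with a topological sort, and place a barrier between consecutive layers. This relies on the OpenFlow barrier semantics, where a switch finishes processing everything received before the barrier before it processes anything after it. The function names, the ("FLOW_MOD", id) / ("BARRIER", None) tuples, and the example constraint that the SSH drop rule must precede the forwarding rules are all hypothetical, chosen for illustration.

```python
from collections import defaultdict


def schedule_with_barriers(updates, must_precede):
    """Group rule updates into batches that can be sent back to back.

    updates: iterable of hashable update ids (hypothetical names).
    must_precede: (before, after) pairs; every id must appear in updates.
    Returns a list of batches; a barrier belongs between consecutive batches.
    """
    updates = list(updates)
    successors = defaultdict(set)
    indegree = {u: 0 for u in updates}
    for before, after in must_precede:
        if after not in successors[before]:
            successors[before].add(after)
            indegree[after] += 1

    batches = []
    scheduled = 0
    # Updates with no unmet prerequisite form the first batch.
    ready = sorted(u for u, d in indegree.items() if d == 0)
    while ready:
        batches.append(ready)
        scheduled += len(ready)
        next_ready = []
        for u in ready:
            for v in successors[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    next_ready.append(v)
        ready = sorted(next_ready)

    if scheduled != len(updates):
        raise ValueError("ordering constraints contain a cycle")
    return batches


def to_messages(batches):
    """Flatten batches into a message sequence with barriers in between."""
    messages = []
    for i, batch in enumerate(batches):
        messages.extend(("FLOW_MOD", u) for u in batch)
        if i < len(batches) - 1:
            messages.append(("BARRIER", None))
    return messages


# Assumed constraint for illustration: install the SSH drop rule first.
example_batches = schedule_with_barriers(
    ["drop_ssh", "fwd_h1", "fwd_h2"],
    [("drop_ssh", "fwd_h1"), ("drop_ssh", "fwd_h2")],
)
example_messages = to_messages(example_batches)
```

Updates inside one batch have no ordering constraint between them, so the switch may apply them in any order; only the barriers are needed to enforce the partial order. Layering by depth uses as few barriers as the longest dependency chain requires.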
Minimal Number of Updates
• Minimum dependency can be inferred from rule patterns

Old
  Rule  Pattern      Priority
  A     <1, 2, *>    5
  B     <*, 2, 3>    4
  C     <1, *, 4>    4
  D     <1, *, 3>    3
  E     <*, *, 4>    3
  F     <*, *, 3>    2

New
  Rule  Pattern      Priority
  A     <1, 2, *>    5
  B     <*, 2, 3>    4
  G     <*, 2, 4>    3
  C     <1, *, 4>    2
  D     <1, *, 3>    3
  E     <*, *, 4>    1
  F     <*, *, 3>    2

[Figure: rule dependency graphs over A, B, C, D, E, F (old) and A, B, C, D, E, F, G (new)]

Minimal Number of Updates (Cont’d)
• With the minimum dependency graph, one can always generate a minimum-size flow-table update if priority values are continuous

Old table: as on the previous slide.

New
  Rule  Pattern      Priority
  A     <1, 2, *>    5
  B     <*, 2, 3>    4
  G     <*, 2, 4>    3 -> 4.5
  C     <1, *, 4>    2 -> 4
  D     <1, *, 3>    3
  E     <*, *, 4>    1 -> 3
  F     <*, *, 3>    2

[Figure: minimum dependency graph over A, B, C, D, E, F, G]
Take-away: minimum dependency helps eliminate priority updates

How to obtain minimum dependency
• Restored from the prioritized flow table after compilation
  – Incurs complicated header space computation
• Constructed along with compilation
  – Rule dependency can be recursively inferred from the policy composition process
  – Incurs little additional overhead over compilation
  – See paper for the algorithm

Maintaining Priority Value Distribution
• Discrete priority values
  – Integers in the range [0, 65535] for OpenFlow
  – If a new rule is inserted between adjacent priority values, existing rules have to be shifted to make room for it
• Problem statement
  – Assign priority values to priority levels
  – Objective: minimize the estimated number of priority shifts
• Online strategy
  – Unknown future policy update sequence

  Pattern   Priority
  <1, 2>    5 -> 3
  <2, *>    4 -> 2.5
  <1, *>    3 -> 2
  <*, 2>    3 -> 2
  <3, *>    2 -> 1.5
  <*, *>    1

Network Updates Are Hard
Source: M. Reitblatt, Cornell

Network Update Abstractions
Goal
• Tools for whole-network update
Approach
• Develop update abstractions
• Endow them with strong semantics
• Engineer efficient implementations

Example: Distributed Access Control
Security Policy (columns: Src, Traffic, Action): Web → Allow; Non-web → Drop; Any → Allow
[Figure: traffic entering at switch I and passing through switches F1, F2, F3]

Naive Update
[Figure: the same topology and security policy, with an update order across F1, F2, F3 and I]

Use an Abstraction!
[Figure: security policy, UPDATE, ✓ marks]

Per-Switch Atomic Update?
[Figure: the same topology and security policy]

Per-Packet Consistent Updates
Per-Packet Consistent Update: each packet is processed with the old or the new configuration, but not a mixture of the two.
[Figure: the security policy table, with "Obeys policy" marked for both configurations]

Universal Property Preservation
Theorem: Per-packet consistent updates preserve all trace properties.
Trace Property: any property of a single packet’s path through the network.
Examples of Trace Properties: Loop freedom, access control, waypointing ...
Trace Property Verification Tools: NetPlumber, ConfigChecker ...
Source: M. Reitblatt, Cornell

Formal Verification
Corollary: To check an invariant, verify the old and new configurations.
[Figure: an analyzer checks the security policy (✓) on each of the two configurations]
Verification Tools
• Anteater [SIGCOMM ’11]
• NetPlumber [SIGCOMM ’13]
• ConfigChecker [ICNP ’09]

Mechanisms

2-Phase Update
Overview
• Runtime instruments configurations
• Edge rules stamp packets with version
• Forwarding rules match on version
Algorithm (2-Phase Update)
1. Install new rules on internal switches, leave old configuration in place
2. Install edge rules that stamp with the new version number
[Figure: update(config, topo): calculate rules, generate messages]

2-Phase Update in Action
[Figure: traffic through switches I, F1, F2, F3]

Optimized Mechanisms
Optimizations
• Extension: strictly adds paths
• Retraction: strictly removes paths
• Subset: affects small # of paths
• Topological: affects small # of switches
Runtime
• Automatically optimizes
• Power of using abstraction
[Figure: update(config, topo): calculate rules, generate messages]

Subset Optimization
[Figure: traffic through switches I, F1, F2, F3]

Correctness
Question: How do we convince ourselves these mechanisms are correct?
Solution: built an operational semantics, formalized our mechanisms and proved them correct
Example: 2-Phase Update
1. Install new rules on internal switches, leave old configuration in place (unobservable)
2. Install edge rules that stamp with the new version number (one-touch)
Theorem: Unobservable + one-touch = per-packet.
Source: M. Reitblatt, Cornell

Implementation
• Runtime
  – NOX Library
  – OpenFlow 1.0
  – 2.5k lines of Python
  – update(config, topology)
  – Uses VLAN tags for versions
  – Automatically applies optimizations
• Verification Tool
  – Checks OpenFlow configurations
  – Computation Tree Logic (CTL) specification language to specify legal paths (e.g. waypoints) a packet may take
    • CTL is a branching-time temporal logic
  – Uses NuSMV model checker

Evaluation
Question: How much extra rule space is required?
• Setup
  – Mininet VM
• Applications
  – Routing and Multicast
• Scenarios
  – Adding/removing hosts
  – Adding/removing links
  – Both at the same time
• Topologies
  – Fattree
  – Small-world
  – Waxman

Results: Routing Application
[Figure: results for the Fattree, Small-world and Waxman topologies]

Conclusion
• Update abstractions
  – Per-packet
  – Per-flow
• Mechanisms
  – 2-Phase Update
  – Optimizations
• Formal model
  – Network operational semantics
  – Universal property preservation

DCN is constantly in flux
[Figure: switches (upgrade, reboot, new switch) and traffic flows]
Source: J. Liu, Yale

DCN is constantly in flux
[Figure: switches, traffic flows, virtual machines]

Network updates are painful for operators
Bob, an operator, has to carry out a switch upgrade.
Two weeks before the update, Bob has to (complex planning):
• Coordinate with application owners
• Prepare a detailed update plan
• Review and revise the plan with colleagues
On the night of the update, Bob executes the plan by hand, but (unexpected performance degradation):
• Application alerts are triggered unexpectedly
• Switch failures force him to backpedal several times
Eight hours later, Bob is still stuck with the update (laborious process):
• No sleep overnight
• Numerous application complaints
• No quick fix in sight

Congestion-free DCN update is the key
• Applications want network updates to be seamless
  – Reachability
  – Low network latency (propagation, queuing)
  – No packet drops
  [Annotation: Congestion]
• Congestion-free updates are hard
  – Many switches are involved
  – Multi-step plan
  – Different scenarios have distinct requirements
  – Interactions between network and traffic demand changes

A Clos network with ECMP
[Figure: Clos network with CORE, AGG and ToR layers; all switches use Equal-Cost Multi-Path (ECMP); link capacity 1000; ToR flows of 600 split 300/300; bottleneck link load 620 + 150 + 150 = 920]

Switch upgrade: a naïve solution triggers congestion
[Figure: draining AGG1 with link capacity 1000; the bottleneck link load rises from 620 + 150 + 150 = 920 to 620 + 300 + 150 = 1070]

Switch upgrade: a smarter solution seems to be working
[Figure: draining AGG1 with weighted ECMP (a 500/100 split); the bottleneck link load becomes 620 + 300 + 50 = 970 instead of 1070]

Traffic distribution transition
[Figure: initial traffic distribution (congestion-free, splits of 300/300) and final traffic distribution (congestion-free, splits of 0/600 and 500/100)]
Transition? Simple? NO! Asynchronous switch updates.

Asynchronous changes can cause transient congestion
When ToR1 is changed but ToR5 is not yet:
[Figure: draining AGG1 with link capacity 1000; ToR1 sends 600 on one path while ToR5 still splits 300/300; bottleneck link load 620 + 300 + 150 = 1070]

Solution: introducing an intermediate step
[Figure: initial (splits of 300/300), intermediate (splits of 200/400 and 450/150) and final (splits of 0/600 and 500/100) traffic distributions; each transition is congestion-free regardless of asynchrony]

How zUpdate performs congestion-free update
[Figure: the operator gives zUpdate an update scenario and update requirements; zUpdate takes the data center network from the current traffic distribution through intermediate traffic distributions to the target traffic distribution]

Key technical issues
• Describing traffic distribution
• Representing update requirements
• Defining conditions for congestion-free transition
• Computing an update plan
• Implementing an update plan

Describing traffic distribution
• l^f_{v,u}: flow f’s load on link (v, u)
[Figure: flow f of 600 entering at ToR s1, with AGG switches s2, s3 and CORE switches s4, s5; l^f_{s1,s2} = 300 and l^f_{s2,s4} = 150]

Representing update requirements
• Drain s2. Constraint: no flow to s2
• When s2 recovers. Constraint: ECMP equal split
[Figure: flow f at ToR s1, AGG switches s2 and s3, CORE switches s4 and s5]

Switch asynchronization exponentially inflates the possible load values
Transition from old traffic distribution to new traffic distribution
[Figure: flow f from ingress switch 1 to egress switch 8 through switches 2 to 7; load l^f_{7,8} on link (7,8)]
Asynchronous updates can result in 2^5 possible load values on link (7,8) during the transition. In large networks, it is infeasible to check every possible load value against link capacity.

Two-phase commit reduces the possible load values to two
Transition from old traffic distribution to new traffic distribution
[Figure: the same network, with a version flip at ingress switch 1]
• With two-phase commit, f’s load on link (7,8) has only two possible values throughout a transition

Flow asynchronization exponentially inflates the possible load values
[Figure: flows f1 and f2 through switches 0 to 8; the load on link (7,8) is f1 + f2]
Asynchronous updates to N independent flows can result in 2^N possible load values on link (7,8)

Handling flow asynchronization
[Figure: flows f1 and f2 through switches 0 to 8]
The load on link (7,8) has four potential values, but it is no more than the sum of f1’s maximum potential value and f2’s maximum potential value.

Computing congestion-free transition plan
Linear programming
• Constant: current traffic distribution
• Variables: intermediate traffic distributions, target traffic distribution
• Constraints
  – Congestion-free
  – Update requirements
  – Deliver all traffic
  – Flow conservation

Implementing an update plan
• Computation time
• Switch table size limit
• Update overhead
• Failure during transition
• Traffic demand variation
[Figure: critical flows (flows traversing bottleneck links) use weighted ECMP; other flows use ECMP]

Evaluations
• Testbed experiments
• Large-scale trace-driven simulations

Testbed setup
[Figure: testbed with CORE, AGG and ToR layers and a traffic generator; switch: Arista 7050; link: 10Gbps; labels "ToR6,7: 6.2Gbps", "ToR5: 6Gbps", "ToR8: 6Gbps"; AGG1 is drained]

zUpdate achieves congestion-free switch upgrade
[Figure: initial (3Gbps splits), intermediate (2Gbps/4Gbps and 4.5Gbps/1.5Gbps splits) and final (0/6Gbps and 5Gbps/1Gbps splits) traffic distributions; plot of real-time link utilization against time (sec) for links CORE1-AGG3 and CORE3-AGG4]

One-step update causes transient congestion
[Figure: initial (3Gbps splits) and final (0/6Gbps and 5Gbps/1Gbps splits) traffic distributions; plot of real-time link utilization against time (sec) for links CORE1-AGG3 and CORE3-AGG4]

Conclusion
• Switch and flow asynchrony can cause severe congestion during DCN updates
• zUpdate provides congestion-free DCN updates
  – Novel algorithms to compute update plan
  – Practical implementation on commodity switches
  – Evaluations in real DCN topology and update scenarios

Network Partition
• Out-of-band control network
• Routing and forwarding based on addresses
• Policy specification using end-host names
• Controller only aware of local name-address bindings

Network Partition
• Consider a policy isolating A from B. A control network partition occurs.
• Only possible choices
  – Let all packets through (including from A to B) (bad for correctness)
  – Drop all packets (including from A to D) (bad for availability)

Solution to Network Partition
• Network can label packets with sender’s identity
  – Route based on identity instead of address
• Inband control

Other Update Problems
• Consistent updates in the controller itself
• Minimize rule space used
• Scheduling updates to optimize completion time

Questions?