Logical Switching with NSX
Elver Sena Sosa
© 2014 VMware Inc. All rights reserved.

Agenda
• VXLAN Overview
  – Version Dependency
  – Protocol and Frame Format
• NSX for vSphere VXLAN Enhancements
• System Architecture
• VXLAN Communications
  – Replication Modes

CONFIDENTIAL

NSX Logical Switching
[Diagram: Logical Switch 1, Logical Switch 2 and Logical Switch 3 (L2) running on VMware NSX over an L3 network]
Challenges
• Multi-tenant segmentation
• VM Mobility requires L2 everywhere
• Large L2 Physical Network Sprawl – STP Issues
• HW Memory (MAC, FIB) Table Limits
Benefits
• Scalable Multi-tenancy across data center
• Enabling L2 over L3 Infrastructure
• Overlay Based on VXLAN
• Logical Switches span across Physical Hosts and Network Switches
LOGICAL SWITCHING – Using VXLAN to scale the Network

VXLAN Version Dependencies
• vCloud Networking and Security 5.5 uses the existing VXLAN implementation from 5.1
• This presentation is focused on NSX for vSphere
  – vSphere 5.5 is required to leverage new capabilities introduced with NSX for vSphere
  – vSphere 5.1 is supported by NSX for vSphere using the previous implementation
  – Upgrades from 5.1 are supported

VXLAN Protocol Overview
• Ethernet in IP overlay network
  – Entire L2 frame encapsulated in UDP
  – 50+ bytes of overhead
• 24 bit VXLAN Network Identifier
  – 16 M logical networks
• VXLAN can cross Layer 3 network boundaries
• Overlay between ESXi hosts
  – VMs do NOT see VXLAN ID
• VTEP (VXLAN Tunnel End Point)
  – VMkernel interface which serves as the endpoint for encapsulation/de-encapsulation of VXLAN traffic
• Technology submitted to IETF for standardization
  – With Cisco, Citrix, Red Hat, Broadcom, Arista and Others

VXLAN Frame Format
• VXLAN Header format updated for NSX for vSphere
• REPLICATE_LOCALLY bit used in Unicast and Hybrid modes
• When a VTEP receives a frame with REPLICATE_LOCALLY set it must re-inject the frame into the transport network
VXLAN header fields, in order:
  VXLAN Flags   8 bits   (RRRRILRR)
  RSVD          24 bits
  VXLAN NI      24 bits
  RSVD          8 bits

NSX for vSphere
VXLAN Enhancements – Data Plane
• Support for multiple VXLAN vmknics per host to provide additional options for uplink load balancing
• DSCP & CoS tags copied from the inner frame to the outer VXLAN encapsulation header
• Support for Guest VLAN tagging
• vMotion callback
• Dedicated TCP/IP stack for VXLAN
• Ready for VXLAN hardware offloading to network adapters (in future)

NSX for vSphere
VXLAN Enhancements – Control Plane
• A highly available and secure control plane to distribute VXLAN network information to ESXi hosts
• Removes dependency on multicast routing/PIM in the physical network
• Suppresses broadcast traffic in VXLAN networks
  – ARP Directory Service & Cache

System Architecture
NSX Manager
• Pushes VIBs
• UI / API end point
• Controller Info
• VXLAN info to controller
Controller Cluster
• VTEP Table
• MAC Table
• ARP Table
• UTEP/MTEP per VXLAN
[Diagram labels: TCP over SSL, AMQP, vsfwd, socket, User, Kernel, netcpa, vmklink, VDS, VXLAN, Routing, ESXi Host]

Cluster Preparation for VXLAN
• Cluster preparation broken into two steps
  – Install – installs kernel modules for VXLAN, Routing and DFW
  – Configure – creates VTEP interfaces
• MTU – VXLAN adds 50 bytes of encapsulation. Set MTU to 1600 bytes.
• Segment ID – needed for Multicast and Hybrid mode VXLAN
• Transport Zone – defines scope of VXLAN.
  Can span one or more vSphere Clusters

VXLAN Communication Modes

VXLAN Control Plane Modes
• Three Control Plane modes supported in NSX for vSphere
  – Unicast
  – Hybrid
  – Multicast
• Controller selects one VTEP per remote segment from the VTEP table to act as a proxy
  – In Unicast Mode this VTEP is referred to as the UTEP (Unicast Tunnel End Point)
  – In Hybrid Mode this VTEP is referred to as the MTEP (Multicast Tunnel End Point)
• If a UTEP or MTEP leaves a VNI the Controller will select a new proxy within the segment and update the participating VTEPs
• Optimized Replication – VTEPs perform software replication of BUM traffic to local VTEPs and one UTEP/MTEP per remote segment

VXLAN NSX for vSphere – Multicast Mode
[Diagram: VM1 to VM4 on VXLAN 5001, on four vSphere hosts sharing one vSphere Distributed Switch. VXLAN Transport Subnet A 10.20.10.0/24: VTEP1 10.20.10.10, VTEP2 10.20.10.11. VXLAN Transport Subnet B 10.20.11.0/24: VTEP3 10.20.11.10, VTEP4 10.20.11.11. VXLAN Transport Network: L2 IGMP within each subnet, L3 PIM between subnets. Legend: Multicast Traffic.]

VXLAN NSX for vSphere – Unicast Mode
[Diagram: same VMs, hosts, subnets and VTEP addresses as above, plus a Controller Cluster. Two of the four hosts are labeled UTEP, the other two VTEP. Legend: Unicast Traffic.]

VXLAN NSX for vSphere – Hybrid Mode
[Diagram: same VMs, hosts, subnets and VTEP addresses as above, plus a Controller Cluster. Two of the four hosts are labeled MTEP, the other two VTEP. VXLAN Transport Network: L2 IGMP within each subnet. Legend: Multicast Traffic, Unicast Traffic.]
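
Sketch: proxy selection and optimized replication
The Control Plane Modes slide says the Controller picks one proxy VTEP per segment, and that a source VTEP replicates BUM traffic to its local VTEPs plus one UTEP/MTEP per remote segment. The Python below is a minimal illustration of that rule, using the addresses from the diagrams. The function names, the /24 segment size and the "lowest IP wins" selection rule are assumptions made for the example, not the Controller's documented behavior.

```python
import ipaddress
from collections import defaultdict


def segment_of(vtep_ip, prefix_len=24):
    """Return the transport subnet a VTEP IP belongs to (prefix length is assumed)."""
    return str(ipaddress.ip_interface(f"{vtep_ip}/{prefix_len}").network)


def select_proxies(vtep_table, prefix_len=24):
    """Pick one proxy (UTEP or MTEP) per segment.

    Hypothetical rule: the numerically lowest VTEP IP in each segment.
    """
    by_segment = defaultdict(list)
    for vtep in vtep_table:
        by_segment[segment_of(vtep, prefix_len)].append(vtep)
    return {
        seg: min(members, key=ipaddress.ip_address)
        for seg, members in by_segment.items()
    }


def replication_targets(source_vtep, vtep_table, proxies, prefix_len=24):
    """List the VTEPs a source sends a BUM frame to:
    every other VTEP in its own segment, plus one proxy per remote segment."""
    local = segment_of(source_vtep, prefix_len)
    targets = [
        vtep for vtep in vtep_table
        if vtep != source_vtep and segment_of(vtep, prefix_len) == local
    ]
    for seg, proxy in sorted(proxies.items()):
        if seg != local:
            targets.append(proxy)
    return targets


vteps = ["10.20.10.10", "10.20.10.11", "10.20.11.10", "10.20.11.11"]
proxies = select_proxies(vteps)
print(replication_targets("10.20.10.10", vteps, proxies))
```

With the four VTEPs from the diagrams, VTEP1 sends one copy to VTEP2 (local) and one to the proxy in Subnet B. If that proxy leaves the VNI, calling select_proxies again on the remaining VTEPs models the Controller choosing a new proxy within the segment.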
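
Sketch: VXLAN header layout
The Frame Format slide lists the header as 8 flag bits, 24 reserved bits, a 24-bit VXLAN NI and 8 reserved bits, with flags RRRRILRR. The Python below packs and unpacks those 8 bytes as an illustration. It assumes the flag string is written most significant bit first, so I is 0x08 and L (REPLICATE_LOCALLY) is 0x04; the function names are hypothetical.

```python
import struct

FLAG_I = 0x08  # 'I' in RRRRILRR: VNI field is valid
FLAG_L = 0x04  # 'L' in RRRRILRR: REPLICATE_LOCALLY (bit position assumed MSB-first)


def pack_vxlan_header(vni, replicate_locally=False):
    """Build the 8-byte header: flags(8) + RSVD(24) + VNI(24) + RSVD(8)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags = FLAG_I | (FLAG_L if replicate_locally else 0)
    # Word 0 carries the flags in its top byte; word 1 carries the VNI in its top 3 bytes.
    return struct.pack("!II", flags << 24, vni << 8)


def unpack_vxlan_header(data):
    """Decode the first 8 bytes of a VXLAN header into a dict."""
    word0, word1 = struct.unpack("!II", data[:8])
    flags = word0 >> 24
    return {
        "vni": word1 >> 8,
        "vni_valid": bool(flags & FLAG_I),
        "replicate_locally": bool(flags & FLAG_L),
    }


header = pack_vxlan_header(5001, replicate_locally=True)
print(header.hex(), unpack_vxlan_header(header))
```

The 24-bit VNI field is what gives the 16 M logical networks mentioned in the protocol overview (2**24 = 16,777,216).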
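
Sketch: MTU arithmetic
The Cluster Preparation slide states that VXLAN adds 50 bytes of encapsulation and recommends an MTU of 1600. The Python below works through where the 50 bytes come from for an IPv4 transport network and how much headroom 1600 leaves. The header sizes are the standard ones for Ethernet, IPv4, UDP and VXLAN; the function name and the 1500-byte guest MTU default are assumptions for the example.

```python
OUTER_IPV4 = 20  # outer IPv4 header, no options
OUTER_UDP = 8    # outer UDP header
VXLAN_HDR = 8    # VXLAN header
INNER_ETH = 14   # inner Ethernet header carried inside the tunnel
VLAN_TAG = 4     # optional 802.1Q tag on the inner frame (guest VLAN tagging)


def required_transport_mtu(guest_mtu=1500, guest_vlan_tag=False):
    """Smallest transport-network MTU that carries a full guest frame unfragmented."""
    inner_frame = guest_mtu + INNER_ETH + (VLAN_TAG if guest_vlan_tag else 0)
    return OUTER_IPV4 + OUTER_UDP + VXLAN_HDR + inner_frame


for tagged in (False, True):
    needed = required_transport_mtu(guest_vlan_tag=tagged)
    print(f"guest VLAN tag={tagged}: need {needed}, headroom at 1600 = {1600 - needed}")
```

Under these assumptions a 1500-byte guest MTU needs 1550 bytes on the transport network, or 1554 with a guest VLAN tag, so 1600 covers both cases with room to spare.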