VMware® Network
Virtualization Design Guide
Technical White Paper
January 2013
Table of Contents
Intended Audience
Overview
Components of the VMware Network Virtualization Solution
vSphere Distributed Switch
Logical Network (VXLAN)
vCloud Networking and Security Edge
vCloud Networking and Security Manager
vCloud Director
VXLAN Technology Overview
Standardization Effort
Encapsulation
VXLAN Packet Flow
Intra-VXLAN Packet Flow
Inter-VXLAN Packet Flow
Network Virtualization Design Considerations
Physical Network
Network Topologies with L2 Configuration in the Access Layer
Network Topologies with L3 Configuration in the Access Layer
Logical Network
Scenario 1 – Greenfield Deployment: Logical Network with a Single Physical L2 Domain
Scenario 2 – Logical Network: Multiple Physical L2 Domains
Scenario 3 – Logical Network: Multiple Physical L2 Domains with vMotion
Scenario 4 – Logical Network: Stretched Clusters Across Two Datacenters
Managing IP Addresses in Logical Networks
Scaling Network Virtualization
Consumption Models
In vCloud Director
In vCloud Networking and Security Manager
Using API
Troubleshooting and Monitoring
Network Health Check
VXLAN Connectivity Check – Unicast and Broadcast Tests
Monitoring Logical Flows – IPFIX
Port Mirroring
Conclusion
Intended Audience
This document is targeted toward virtualization and network architects interested in deploying VMware®
network virtualization solutions.
Overview
The IT industry has gained significant efficiency and flexibility as a direct result of virtualization. Organizations
are moving toward a virtual datacenter (VDC) model, and flexibility, speed, scale and automation are central to
their success. Although compute and memory resources are pooled and automated, networks and network
services, such as security, have not kept pace. Traditional network and security operations not only reduce
efficiency but also limit the ability of businesses to rapidly deploy, scale and protect applications. VMware
vCloud® Networking and Security™ offers a network virtualization solution to overcome these challenges.
[Figure: server virtualization (applications and workloads in virtual machines on a server hypervisor, decoupled from physical compute and memory; requirement: x86) shown side by side with network virtualization (L2, L3, L4-7 network services in virtual networks on a network virtualization platform, decoupled from the physical network; requirement: IP transport).]
Figure 1. Server and Network Virtualization Analogy
Figure 1 draws an analogy between compute and network virtualization. Just as VMware vSphere® abstracts
compute capacity from the server hardware to create virtual pools of resources, network virtualization abstracts
the network into a generalized pool of network capacity. The unified pool of network capacity can then be
optimally segmented into logical networks directly attached to specific applications. Customers can create
logical networks that span physical boundaries, optimizing compute resource utilization across clusters and
pods. Unlike legacy architectures, logical networks can be scaled without reconfiguring the underlying physical
hardware. Customers can also integrate network services—such as firewalls, VPNs and load balancers—and
deliver them exactly where they are needed. “Single pane of glass” management for all these services further
reduces the cost and complexity of datacenter operations.
The VMware network virtualization solution addresses the following key needs in today’s datacenter:
•Increasing compute utilization by pooling compute clusters
•Enabling noncontiguous cluster expansion
•Leveraging capacity across multiple racks in the datacenter
•Overcoming IP-addressing challenges when moving workloads
•Avoiding VLAN sprawl in large environments
•Enabling multitenancy at scale without encountering VLAN scale limitations
By adopting network virtualization, customers can effectively address these issues as well as realize the
following business benefits:
•Drive faster provisioning of network and services, enabling business agility
•Improve infrastructure utilization, leading to significant CapEx savings
  – Increase compute utilization by 30 percent by efficiently pooling compute resources
  – Increase network utilization by 40 percent due to compute pooling and improved traffic management
•Decouple logical networks from physical networks, providing complete flexibility
•Isolate and segment network traffic at scale
•Provide multitenancy without increasing the administrative burden
•Automate repeatable network and service provisioning workflows, translating to 30 percent or more in
OpEx savings on network operations alone
Components of the VMware Network
Virtualization Solution
There are several components bundled in the vCloud Networking and Security suite, plus several components of
the core vSphere layer, used to deploy VMware network virtualization:
1. VMware vSphere Distributed Switch™ 5.1 (VDS)
2. VMware vSphere logical network (VXLAN)
3. VMware vCloud Networking and Security Edge™ 5.1
4. VMware vCloud Networking and Security Manager™ 5.1
5. VMware vCloud Director® 5.1 (not part of the vCloud Networking and Security suite)
6. VMware vCenter Server™ 5.1 (not part of the vCloud Networking and Security suite; shown as part of
item 4 in Figure 2)
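[Figure: numbered solution components. (1) vSphere Distributed Switch connecting virtual machines over the physical IP network; (2) logical network (VXLAN); (3) VMware L3 Edge; (4) vShield Manager/vCenter; (5) vCloud Director (VCD).]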
Figure 2. VMware VXLAN Solution Components
vSphere Distributed Switch
VDS abstracts the physical network and provides access-level switching in the vSphere hypervisor. It is central to
network virtualization because it enables logical networks that are independent of physical constructs such as
VLAN. Keep in mind the following key points:
•VDS facilitates massive scale, with support for up to 500 physical hosts.
•Multiple features such as Port Mirroring, NetFlow/IPFIX, Configuration Backup and Restore, Network Health
Check, QoS, LACP, and so on, provide a comprehensive toolkit for traffic management, monitoring and
troubleshooting within a virtual network.
For specific feature details, refer to the What’s New in VMware vSphere 5.1 – Networking white paper at
http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Network-Technical-Whitepaper.pdf.
Logical Network (VXLAN)
VMware network virtualization is built using Virtual eXtensible Local Area Network (VXLAN) overlay networking
technology, an industry standard that VMware developed jointly with major networking vendors. The logical
network enables the following capabilities:
•Creation of a flexible logical layer 2 (L2) overlay network on top of existing IP networks and existing
physical network infrastructure, without the need to rearchitect any of the datacenter networks
•Communication (east–west and north–south) while maintaining isolation between tenants
•Application workloads that remain agnostic of the overlay network, because all L2-to-VXLAN
translations are performed transparently in the host
See the following sections for more details on VXLAN technology, architecture components and packet flows.
vCloud Networking and Security Edge
vCloud Networking and Security Edge serves as a VXLAN gateway, translating traffic between the logical
network and a physical VLAN- or IP-based network. In addition, it provides services to the logical network such
as DHCP, NAT, routing (static routing), firewall, VPN and load balancing. It is deployed in a virtual appliance form
factor, supports full active–standby HA functionality and can support up to 9Gbps of traffic.
The following are key points to consider for vCloud Networking and Security Edge VXLAN gateway and network
services offered in network virtualization:
•It acts as an L3 gateway to translate between VXLAN and physical networks and is primarily used for
north–south traffic.
•It provides inter-VXLAN routing.
•Each VXLAN segment requires a separate vCloud Networking and Security Edge interface to ensure isolation.
•It is available in three sizes: compact, full and x-large; it offers options to scale up for higher performance or
scale out using multiple virtual appliances.
•vCloud Networking and Security Edge firewall services can be applied on a per–VXLAN segment basis.
•In multitenant deployments, individual pools of IP per tenant can be provided using vCloud Networking and
Security Edge DHCP services.
vCloud Networking and Security Manager
vCloud Networking and Security Manager is the centralized network and security management component of
the vCloud Networking and Security product suite. It is installed from an open virtualization appliance (OVA) file
as a virtual machine by using VMware vSphere Client™.
Keep in mind the following important points about vCloud Networking and Security Manager:
•Using the vCloud Networking and Security Manager user interface, administrators can install, configure and
maintain network and network services components.
•vCloud Networking and Security Manager exposes APIs that can be used to integrate with existing cloud
management systems or called from scripts. These are also known as northbound APIs.
•vCloud Director requires vCloud Networking and Security Manager to offer simple workflows for consumption
of virtual networks and services.
•The VMware vCenter Server™ plug-in for vCloud Networking and Security Manager enables customers to perform
VXLAN configuration from vCenter Server as part of the Network Virtualization tab.
vCloud Director
The vCloud Director virtual datacenter container is a highly automatable abstraction of the pooled virtual
infrastructure. Network virtualization is fully integrated in vCloud Director workflows, enabling rapid
self-service provisioning within the context of the application workload. vCloud Director uses vCloud Networking
and Security Manager in the backend to provision network virtualization elements. vCloud Director is not part
of vCloud Networking and Security; it is a separately purchased component. It is not mandatory for deploying a
network virtualization solution, but it is highly recommended to achieve the complete operational flexibility and
agility discussed previously. See the “Consumption Models” section for all available consumption choices for VMware
network virtualization.
VXLAN Technology Overview
Standardization Effort
VXLAN is an Internet Engineering Task Force (IETF) Internet draft formulated in collaboration with leading
networking vendors including Cisco, Arista and Broadcom. It provides a framework for creating L2 overlay
networks over L3 networks. Each L2 overlay network is called a VXLAN segment (or “virtual wire”) and is
uniquely identified by a 24-bit segment ID. This enables customers to create up to 16 million unique VXLAN
segments, each of which is an isolated logical network.
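The 16 million figure follows directly from the 24-bit width of the segment ID. A quick check of the arithmetic is sketched below; the comparison with the 12-bit 802.1Q VLAN ID is an assumption added for context and is not a figure taken from this guide.

```python
# Size of the VXLAN segment ID space compared with the 802.1Q VLAN ID space.
VXLAN_ID_BITS = 24  # segment ID width stated in this guide
VLAN_ID_BITS = 12   # 802.1Q VLAN ID width (assumption: not stated in this guide)

vxlan_segments = 2 ** VXLAN_ID_BITS
usable_vlans = 2 ** VLAN_ID_BITS - 2  # VLAN IDs 0 and 4095 are reserved

print(vxlan_segments)  # 16777216
print(usable_vlans)    # 4094
```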
Encapsulation
VXLAN makes use of an encapsulation or tunneling method to carry the L2 overlay network traffic on top of
L3 networks. A special kernel module running on the vSphere hypervisor host along with a vmknic acts as the
virtual tunnel endpoint (VTEP). Each VTEP is assigned a unique IP address that is configured on the
vmknic virtual adapter associated with the VTEP.
The VTEP on the vSphere host handles all encapsulation and deencapsulation of traffic for all virtual machines
running on that host. A VTEP encapsulates the MAC and IP packets from the virtual machines with a
VXLAN+UDP+IP header and sends the packet out as an IP unicast or multicast packet. The latter mode is used
for broadcast and unknown destination MAC frames originated by the virtual machines that must be sent across
the physical IP network.
Figure 3 shows the VXLAN frame format. The original packet between the virtual machines communicating on
the same VXLAN segment is encapsulated with an outer Ethernet header, an outer IP header, an outer UDP
header and a VXLAN header. The encapsulation is done by the source VTEP and is sent out to the destination
VTEP. At the destination VTEP, the packet is stripped of its outer header and is passed on to the destination
virtual machine if the segment ID in the packet is valid.
[Figure: frame layout, in order. VXLAN encapsulation: Outer MAC DA | Outer MAC SA | Outer 802.1Q | Outer IP DA | Outer IP SA | Outer UDP | VXLAN Header (8 bytes). Original Ethernet frame: Inner MAC DA | Inner MAC SA | Optional Inner 802.1Q | Original Ethernet Payload | CRC.]
Figure 3. VXLAN Frame Format
The destination MAC address in the outer Ethernet header can be the MAC address of the destination VTEP or
that of an intermediate L3 router. The outer IP header represents the corresponding source and destination
VTEP IPs. The association of the virtual machine’s MAC to the VTEP’s IP is discovered via source learning. More
details on the forwarding table are provided in the “VXLAN Packet Flow” section. The outer UDP header
contains source port, destination port and checksum information. The source port of the UDP header is a hash of
the inner Ethernet frame’s header. This is done to enable a level of entropy for ECMP/load balancing of the
virtual machine–to–virtual machine traffic across the VXLAN overlay. The VXLAN header is an 8-byte field that
has 8 bits of flags to indicate whether the VXLAN Network Identifier (VNI) is valid, 24 bits for the VXLAN Segment
ID/VXLAN VNI and the remaining 32 bits reserved.
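The header layout and source-port hashing described above can be sketched in a few lines of Python. This is a minimal illustration, not the vSphere kernel module's implementation. The bit positions (flags, 24 reserved bits, 24-bit VNI, 8 reserved bits, with 0x08 as the "VNI valid" flag) follow the IETF VXLAN specification. The CRC32 hash and the 49152–65535 port range are assumptions chosen for the example.

```python
import struct
import zlib

VNI_VALID_FLAG = 0x08  # "I" bit of the 8-bit flags field (per the IETF VXLAN specification)


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte header: flags(8) | reserved(24) | VNI(24) | reserved(8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI, or raise if the header is malformed or not flagged valid."""
    if len(header) != 8:
        raise ValueError("VXLAN header must be exactly 8 bytes")
    first_word, second_word = struct.unpack("!II", header)
    if not (first_word >> 24) & VNI_VALID_FLAG:
        raise ValueError("VNI valid flag is not set")
    return second_word >> 8


def udp_source_port(inner_ethernet_header: bytes) -> int:
    """Hash the inner frame header into a source port so ECMP can spread flows.

    Assumption: CRC32 and the dynamic port range are illustrative choices.
    """
    low, high = 49152, 65535
    return low + zlib.crc32(inner_ethernet_header) % (high - low + 1)


header = pack_vxlan_header(5001)
print(header.hex())                 # 0800000000138900
print(unpack_vxlan_header(header))  # 5001
```

Because the source port depends only on the inner header, all packets of one virtual machine–to–virtual machine flow hash to the same port and therefore take the same path, while different flows can be spread across equal-cost links.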
VXLAN Packet Flow
The following flow pattern describes the handling of ARP on a VXLAN segment. For the purposes of discussion,
assume a typical ARP request from a virtual machine (MAC1) connected to logical L2 network VXLAN 5001:
•Figure 4 shows two virtual machines connected to a logical L2 network. The virtual machines don’t detect any
difference in communicating to the external world. They continue to use standard IP protocol to communicate
with the destination. The traffic flows through the VTEP interface defined on the host.
•Each logical L2 network is associated with an IP multicast group. In this example, VXLAN 5001 is associated
with IP multicast group address (239.1.1.1), and both vSphere hosts (VTEPs) have joined that multicast group.
•The ARP broadcast frame from the virtual machine is encapsulated within an IP multicast frame by the VTEP
on which the virtual machine is running.
•The multicast frame is then sent to the multicast group associated with a logical L2 network segment ID.
•The multicast frame is received by the target VTEPs. The destination VTEPs then validate the logical L2
network segment ID, deencapsulate the packet, and forward it if there are virtual machines on that host that
are connected to this L2 network.
•The destination virtual machine then responds to the ARP request with a unicast packet. The VTEP on the host
on which this destination virtual machine is running establishes a point-to-point tunnel with the VTEP where
the virtual machine MAC1 is hosted.
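The steps above can be sketched as a small Python model of a VTEP's forwarding decision. The class and method names are hypothetical, and the model ignores the actual packet bytes; it only shows when the outer destination is the multicast group and when it is a learned unicast VTEP IP.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass
class Vtep:
    """Hypothetical model of one host's VTEP (not a VMware API)."""
    ip: str
    segment_groups: Dict[int, str]  # segment ID -> IP multicast group
    forwarding_table: Dict[Tuple[int, str], str] = field(default_factory=dict)

    def outer_destination(self, segment_id: int, dst_mac: str) -> str:
        """Pick the outer IP destination for a frame leaving this host."""
        known = self.forwarding_table.get((segment_id, dst_mac))
        if dst_mac != BROADCAST_MAC and known is not None:
            return known  # unicast to the VTEP hosting dst_mac
        return self.segment_groups[segment_id]  # broadcast or unknown unicast

    def receive(self, segment_id: int, src_mac: str, src_vtep_ip: str) -> bool:
        """Validate the segment ID, then learn the sender's MAC-to-VTEP mapping."""
        if segment_id not in self.segment_groups:
            return False  # no virtual machines on this segment: drop
        self.forwarding_table[(segment_id, src_mac)] = src_vtep_ip
        return True


host_a = Vtep("10.20.10.10", {5001: "239.1.1.1"})
host_b = Vtep("10.20.10.11", {5001: "239.1.1.1"})

arp_request_dst = host_a.outer_destination(5001, BROADCAST_MAC)
host_b.receive(5001, "MAC1", host_a.ip)
arp_reply_dst = host_b.outer_destination(5001, "MAC1")

print(arp_request_dst)  # 239.1.1.1
print(arp_reply_dst)    # 10.20.10.10
```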
NOTE:
•The number of multicast groups supported in the physical infrastructure dictates whether there can be a
one-to-one mapping to logical L2 network segment IDs. However, in the scenario where there are more logical
networks than multicast groups, mapping of multiple logical networks to one multicast group is supported.
•Multicast frames are generated only when a broadcast packet is detected on the logical L2 network or when the
VTEP’s forwarding table has no virtual machine MAC-to-VTEP IP mapping for the destination MAC
address (an unknown unicast packet). This is similar to the transparent bridging operation of L2
switches or bridges, where a frame is flooded if no entry in the MAC forwarding table
matches its destination MAC address. After the virtual machine MAC address–to–VTEP IP address
entry has been discovered and added to the forwarding table, any future traffic to
that particular virtual machine is handled by the source host VTEP, which establishes a point-to-point (stateless)
tunnel to the destination VTEP where the virtual machine is hosted.
•The IP multicast protocol acts as a control plane that helps build the forwarding table with virtual machine
MAC address and VTEP IP address mapping. Figure 4 shows the packet encapsulation and a forwarding table
entry in one of the VTEPs.
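The many-to-one mapping of logical networks to multicast groups mentioned in the first note can be illustrated as follows. The round-robin assignment and the function name are assumptions made for this sketch; the guide does not describe how vCloud Networking and Security Manager actually allocates groups.

```python
import ipaddress
from typing import Dict, Iterable


def map_segments_to_groups(segment_ids: Iterable[int], first_group: str,
                           group_count: int) -> Dict[int, str]:
    """Assign each segment ID a multicast group, reusing groups when they run out.

    Assumption: simple round-robin over a contiguous group range.
    """
    base = ipaddress.IPv4Address(first_group)
    if not base.is_multicast:
        raise ValueError("first_group must be an IPv4 multicast address")
    if group_count < 1:
        raise ValueError("at least one multicast group is required")
    groups = [str(base + offset) for offset in range(group_count)]
    return {seg: groups[i % group_count] for i, seg in enumerate(segment_ids)}


# Six logical networks sharing three multicast groups.
mapping = map_segments_to_groups(range(5001, 5007), "239.1.1.1", 3)
print(mapping[5001])  # 239.1.1.1
print(mapping[5004])  # 239.1.1.1 (shared with segment 5001)
```

When two segments share a group, a VTEP may receive multicast frames for a segment it does not host; the segment ID check on receipt is what keeps the segments isolated.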
VTEP MAC addresses are detected during the multicast packet exchange that occurs when a virtual machine is
connected to a virtual wire. No standard ARP request is sent out from the VXLAN kernel module to detect the
VTEP MAC address, so there is no proxy ARP configuration requirement on the first hop router.
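[Figure: two virtual machines (MAC 1 and MAC 2) on VXLAN 5001, attached to a vSphere Distributed Switch spanning two vSphere hosts with VTEP IPs 10.20.10.10 and 10.20.10.11, connected through the L2/L3 network infrastructure. The original frame (L2 | IP | Payload) is encapsulated on the wire as L2 | IP | UDP | VXLAN | L2 | IP | Payload. Forwarding table entry: VM MAC = MAC1, VTEP IP = 10.20.10.10, Segment ID = 5001.]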
Figure 4. VXLAN Encapsulation and Forwarding Table Example
The next part of this section describes packet flow in the following VXLAN deployments:
1) Intra-VXLAN packet flow; that is, two virtual machines on the same logical L2 network
2) Inter-VXLAN packet flow; that is, two virtual machines on two different logical L2 networks
Intra-VXLAN Packet Flow
Figure 5 shows two traffic flows:
•A virtual machine is communicating with another virtual machine on the same logical L2 network
(red dotted line).
•A virtual machine is communicating with an external device on the Internet (green dotted line).
[Figure: two virtual machines (192.168.1.10 and 192.168.1.11) on VXLAN Blue (192.168.1.0/24). The vCloud Networking and Security Edge gateway has interface 192.168.1.1 on VXLAN Blue and interface 172.26.10.10 on the external network (172.26.10.0/24), which leads to the Internet. Legend: virtual machine–to–virtual machine communication; virtual machine–to–Internet communication.]
Figure 5. VXLAN Traffic Flow – Same Logical L2 and External Traffic
In the case of virtual machine–to–virtual machine communication on the same logical L2 network, the following
two traffic flow examples illustrate possibilities that are dependent on where the virtual machines are deployed:
1)Both virtual machines are on the same vSphere host.
2)The virtual machines are on two different vSphere hosts.
In the first case, traffic remains on one vSphere host. In the second case, the virtual machine packet is
encapsulated with a new UDP header by the source VTEP on one vSphere host and is sent through the
external IP network infrastructure to the destination VTEP on another vSphere host. In this process, the external
switches and routers do not see the virtual machines’ IP addresses (192.168.1.10/192.168.1.11) or MAC
addresses because they are carried inside the encapsulated payload.
In the scenario where the virtual machine is communicating with the external world, as shown by the green
dotted line, it first will send the traffic to gateway IP address 192.168.1.1; the vCloud Networking and Security
Edge gateway will send unencapsulated traffic over its external-facing interface to the Internet.
Inter-VXLAN Packet Flow
In the example shown in Figure 6, there are two logical L2 networks, VXLAN Blue and VXLAN Orange. The
virtual machines connected to these networks are isolated from each other. The two networks are assigned
two different IP subnets, 192.168.1.0/24 and 192.168.2.0/24. The vCloud Networking and Security Edge
gateway acts as the router/gateway between these two isolated logical L2 networks.
The traffic flow between the two virtual machines on different logical networks depends on where the virtual
machines and vCloud Networking and Security Edge gateway appliance are deployed. The following are
possible scenarios:
1) All the virtual machines and the vCloud Networking and Security Edge gateway are on the same
vSphere host.
2) The virtual machines are on different vSphere hosts, and the vCloud Networking and Security Edge
gateway appliance is deployed on one of the vSphere hosts.
3) All the virtual machines and the vCloud Networking and Security Edge gateway appliance are on
different vSphere hosts.
The first case is simple to describe because the traffic remains on the same host. The virtual machines direct the
traffic to the respective gateway IP address of the logical network subnets 192.168.1.1 and 192.168.2.1. The vCloud
Networking and Security Edge gateway receives the traffic on the different interfaces and, based on the firewall
rule, makes the routing decision between the two different interfaces.
The second and third cases of traffic flow involve the encapsulated packets that traverse the physical network
infrastructure before they reach the vCloud Networking and Security Edge gateway, which then routes the
packet to the appropriate destination.
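The difference between intra-VXLAN and inter-VXLAN forwarding comes down to a subnet comparison made by the sending virtual machine. A minimal sketch using the addresses from this example is shown below; the `SEGMENTS` table and `next_hop` function are hypothetical names introduced for illustration.

```python
import ipaddress

# Gateway interface of the Edge on each logical network (addresses from the example).
SEGMENTS = {
    "VXLAN Blue": ipaddress.ip_interface("192.168.1.1/24"),
    "VXLAN Orange": ipaddress.ip_interface("192.168.2.1/24"),
}


def next_hop(src_ip: str, dst_ip: str) -> str:
    """Describe where a virtual machine sends a packet for dst_ip."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for name, gateway in SEGMENTS.items():
        if src in gateway.network:
            if dst in gateway.network:
                return f"direct on {name}"         # same logical L2 network
            return f"via Edge gateway {gateway.ip}"  # routed by the Edge
    raise ValueError(f"{src_ip} is not on a known logical network")


print(next_hop("192.168.1.10", "192.168.1.11"))  # direct on VXLAN Blue
print(next_hop("192.168.1.10", "192.168.2.10"))  # via Edge gateway 192.168.1.1
print(next_hop("192.168.2.10", "172.26.10.10"))  # via Edge gateway 192.168.2.1
```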
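[Figure: virtual machines 192.168.1.10 and 192.168.1.11 on VXLAN Blue (192.168.1.0/24) and virtual machine 192.168.2.10 on VXLAN Orange (192.168.2.0/24). The vCloud Networking and Security Edge gateway has interfaces 192.168.1.1 and 192.168.2.1 on the two logical networks and interface 172.26.10.10 on the external network (172.26.10.0/24), which leads to the Internet. Legend: virtual machine–to–virtual machine communication between two VXLANs.]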
Figure 6. VXLAN Traffic Flow – Different Logical L2
Network Virtualization Design Considerations
VMware network virtualization can be deployed on top of existing datacenter networks. In this section, we
discuss how the logical networks using VXLANs can be deployed over common datacenter network topologies.
We first discuss requirements for the physical network, followed by logical network deployment options.
Physical Network
The physical datacenter network varies across different customer environments in terms of which network
topology they use in their datacenter. Hierarchical network design provides the required high availability and
scalability to the datacenter network. This section assumes that the reader has some background in various
network topologies utilizing traditional L3 and L2 network configurations. Readers are encouraged to look at the
design guides from the physical network vendor of choice. We will examine some common physical network
topologies and how to enable network virtualization in them.
Network Topologies with L2 Configuration in the Access Layer
In this topology, access layer switches connect to the aggregation layer over an L2 network. Aggregation
switches are the VLAN termination points, as shown in Figure 7. Spanning Tree Protocol (STP) is traditionally
used to avoid loops. Routing protocols run between aggregation and core layers.
[Figure: virtual machines consume logical L2 networks over the VXLAN fabric, with a vSphere Distributed Switch deployed across Rack 1 through Rack 10. Both racks use VLAN 100 (single subnet). Access layer: enable IGMP snooping; STP and L2 trunks toward the aggregation layer. Aggregation layer: enable IGMP querier; L3 links and routing toward the core layer.]
Figure 7. Datacenter Design – L2 Configuration in Access Layer with STP
In such deployments with a single subnet (VLAN 100) configured on different racks, enabling network
virtualization based on VXLAN requires the following:
•Enable IGMP snooping on the L2 switches.
•Enable the IGMP querier feature on one of the L2/L3 switches in the aggregation layer.
•Increase the end-to-end MTU by a minimum of 50 bytes to accommodate the VXLAN encapsulation headers. The recommended
size is 1,550 bytes or jumbo frames.
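The 50-byte figure can be reproduced by adding up the headers that encapsulation places in front of the original frame. The sketch below assumes standard header sizes: an untagged outer Ethernet header, an outer IPv4 header without options, UDP, and the 8-byte VXLAN header. The 1,500-byte virtual machine MTU is also an assumption, since the guide does not state it.

```python
# Bytes added in front of the original Ethernet frame by VXLAN encapsulation.
# Assumption: standard header sizes, IPv4 outer header without options.
OUTER_HEADER_BYTES = {
    "outer Ethernet": 14,
    "outer IPv4": 20,
    "outer UDP": 8,
    "VXLAN": 8,
}


def required_transport_mtu(vm_mtu: int = 1500) -> int:
    """Return the minimum physical MTU that carries vm_mtu without fragmentation."""
    return vm_mtu + sum(OUTER_HEADER_BYTES.values())


print(sum(OUTER_HEADER_BYTES.values()))  # 50
print(required_transport_mtu())          # 1550
```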
To overcome slower convergence times and lower link utilization limitations of STP, most datacenter networks
today use technologies such as Cisco vPC/VSS (or MLAG, MCE, SMLT, and so on). From the VXLAN design
perspective, there is no change to the previously stated requirements.
When the physical topology has an access layer with multiple subnets configured (for example, VLAN 100 in
Rack 1 and VLAN 200 in Rack 10 in Figure 8), the aggregation layer must have Protocol-Independent Multicast
(PIM) enabled to ensure that multicast routes across multiple subnets are exchanged.
All the VXLAN requirements previously discussed apply to leaf and spine datacenter architectures as well.
Network Topologies with L3 Configuration in the Access Layer
In this topology, access layer switches connect to the aggregation layer over an L3 network. Access switches are
the VLAN termination points, as shown in Figure 8. Key advantages of this design are better utilization of all the
links using Equal-Cost Multipathing (ECMP) and elimination of STP.
From the VXLAN deployment perspective, the following requirements must be met:
•Enable PIM on access switches.
•Ensure that during the VXLAN preparation process, no VLAN is configured. This ensures that a VDS doesn’t
perform VLAN tagging, also called virtual switch tagging (VST) mode.
•Increase the end-to-end MTU by a minimum of 50 bytes to accommodate the VXLAN encapsulation headers. The recommended
size is 1,550 bytes or jumbo frames.
[Figure: virtual machines consume logical L2 networks over the VXLAN fabric, with a vSphere Distributed Switch deployed across Rack 1 through Rack 10. L3 access layer: enable PIM; L3 links, routing and ECMP toward the aggregation layer, which connects to the core layer.]
Figure 8. Datacenter Design – L3 Configuration in Access Layer with ECMP
Logical Network
After the physical network has been prepared, logical networks are deployed with VXLAN, with no ongoing
changes to the physical network. The logical network design differs based on the customer’s needs and the type
of compute, network and storage components they have in the datacenter. The following aspects of the virtual
infrastructure should be taken into account before deploying logical networks:
•A cluster is a collection of vSphere hosts and associated virtual machines with shared resources. One cluster
can have a maximum of 32 vSphere hosts.
•A VDS is the datacenter-wide virtual switch that can span across up to 500 hosts in the datacenter. Best
practice is to use one VDS across all clusters to enable simplified design and cluster-wide VMware vSphere
vMotion® migration.
•With VXLAN, a new traffic type is added to the vSphere host: VXLAN transport traffic. As a best practice, the
new VXLAN traffic type should be isolated from other virtual infrastructure traffic types. This can be achieved
by assigning a separate VLAN during the VXLAN preparation process.
•A VMware vSphere ESXi™ host’s infrastructure traffic, including vMotion migration, VMware vSphere Fault
Tolerance, management, and so on, is not encapsulated and is independent of the VXLAN-based logical
network. These traffic types should be isolated from each other, and enough bandwidth should be allocated to
them. As of this release, VMware does not support placing infrastructure traffic such as vMotion migration
on VXLAN-based virtual networks. Only virtual machine traffic is supported on logical networks.
•To support vMotion migrations of workloads between clusters, all clusters should have access to all storage
resources.
•The link aggregation method configured on the vSphere hosts also impacts how VXLAN transport traffic
traverses the host NICs. The VDS VXLAN port group’s teaming can be configured as failover, LACP active
mode, LACP passive mode or static EtherChannel.
a. When LACP or static EtherChannel is configured, the upstream physical switch must have an equivalent
port channel or EtherChannel configured.
b. Also, if LACP is used, the physical switch must have 5-tuple hash distribution enabled.
c. Virtual port ID and load-based teaming are not supported with VXLAN.
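The teaming rules in the last bullet can be expressed as a small pre-deployment check. This is only a sketch: the policy names and the function below are hypothetical labels for illustration and do not correspond to vSphere setting names or APIs.

```python
from typing import List

# Hypothetical policy labels for this sketch (not product setting names).
SUPPORTED_POLICIES = {"failover", "lacp_active", "lacp_passive", "static_etherchannel"}
UNSUPPORTED_POLICIES = {"virtual_port_id", "load_based"}
PORT_CHANNEL_POLICIES = {"lacp_active", "lacp_passive", "static_etherchannel"}


def check_vxlan_teaming(policy: str, switch_port_channel: bool = False,
                        switch_five_tuple_hash: bool = False) -> List[str]:
    """Return a list of problems with a VXLAN port group teaming choice."""
    if policy in UNSUPPORTED_POLICIES:
        return [f"{policy} is not supported with VXLAN"]
    if policy not in SUPPORTED_POLICIES:
        return [f"unknown teaming policy: {policy}"]
    problems = []
    if policy in PORT_CHANNEL_POLICIES and not switch_port_channel:
        problems.append("upstream switch needs an equivalent port channel or EtherChannel")
    if policy.startswith("lacp") and not switch_five_tuple_hash:
        problems.append("upstream switch needs 5-tuple hash distribution enabled")
    return problems


print(check_vxlan_teaming("failover"))                 # []
print(check_vxlan_teaming("load_based"))               # ['load_based is not supported with VXLAN']
print(check_vxlan_teaming("lacp_active", True, True))  # []
```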
The design is discussed next for the following three deployment types:
•Greenfield deployment – A datacenter built from scratch.
•Brownfield deployment – An existing operational datacenter with virtualization.
•Stretched cluster – Two datacenters separated by a short distance.
Scenario 1 – Greenfield Deployment: Logical Network with a Single Physical L2 Domain
In a greenfield deployment, the recommended design is to have a single VDS stretching across all the compute
clusters within the same vCenter Server. All hosts in the VDS are placed on the same L2 subnet (single VLAN on
all uplinks). In Figure 9, VLAN 10 spanning the racks is switched, not routed, creating a single L2 subnet.
This single subnet serves as the VXLAN transport subnet, and each host receives an IP address from this subnet,
used in VXLAN encapsulation. Multicast and other requirements are met based on the physical network
topology. Refer to the L2 configuration in the access layer shown in Figure 7 for details on multicast-related
configuration.
[Figure 9 diagram: virtual machines attached to logical L2 networks VXLAN 5001 and VXLAN 5002 over the VXLAN fabric. One vSphere Distributed Switch spans the vSphere hosts of Cluster 1 (Rack 1) and Cluster 2 (Rack 10), both on VLAN 10. Legend: VTEP, vwire5001 port group, vwire5002 port group, switch.]
Figure 9. Greenfield Deployment – One VDS
Keep in mind the following key points while deploying:
•The VDS VXLAN port group must be in the same VLAN across all hosts in all clusters. This configuration is
handled through the vCloud Networking and Security Manager plug-in in vCenter Server.
•VDS, VLAN, teaming and MTU settings must be provided as part of the VXLAN configuration process.
•A VTEP IP address is assigned either via DHCP or statically via vCenter Server.
•Virtual machines communicating outside the logical network (to the Internet or to nonlogical networks within
the datacenter) require a VXLAN gateway.
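Because the MTU is one of the settings supplied during VXLAN configuration, it can help to estimate the transport MTU that the encapsulation requires. The sketch below assumes an IPv4 outer header and uses the standard header sizes of the VXLAN encapsulation format; the function names are illustrative and not part of any VMware interface.

```python
# Bytes added in front of the guest IP packet when it is carried in VXLAN over IPv4.
INNER_ETHERNET = 14  # the guest's own Ethernet header travels inside the tunnel
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20
INNER_VLAN_TAG = 4   # only present if the inner frame carries an 802.1Q tag


def required_transport_mtu(guest_mtu: int = 1500, inner_vlan_tag: bool = False) -> int:
    """Smallest IP MTU the transport network must support for a given guest MTU."""
    overhead = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4
    if inner_vlan_tag:
        overhead += INNER_VLAN_TAG
    return guest_mtu + overhead


def mtu_is_sufficient(transport_mtu: int, guest_mtu: int = 1500) -> bool:
    """True if the transport MTU covers the worst case (tagged inner frame)."""
    return transport_mtu >= required_transport_mtu(guest_mtu, inner_vlan_tag=True)
```

With a guest MTU of 1500, a transport MTU of 1500 fails this check, while 1600 passes with headroom.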
vMotion Boundary
The vMotion boundary, or the workload migration limit, in VXLAN deployment is dictated by the following
two criteria:
1)vMotion migration is limited to hosts managed by a single vCenter Server instance.
2)vMotion migration is not possible across two VDS.
In this scenario where all the hosts are part of the same VDS, vMotion migration will work across all hosts as long
as the shared storage requirement is satisfied across the two clusters.
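The vMotion boundary criteria can be summarized as a simple predicate. This is a sketch under the assumptions stated in the text (same vCenter Server, same VDS, shared storage); the `Host` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical inventory record for one vSphere host."""
    name: str
    vcenter: str
    vds: str
    datastores: frozenset


def can_vmotion(src: Host, dst: Host) -> bool:
    """Apply the boundary criteria: one vCenter Server, one VDS, shared storage."""
    same_vcenter = src.vcenter == dst.vcenter
    same_vds = src.vds == dst.vds
    shared_storage = bool(src.datastores & dst.datastores)
    return same_vcenter and same_vds and shared_storage


rack1 = Host("esx-rack1", "vc-1", "vds-1", frozenset({"ds-shared"}))
rack10 = Host("esx-rack10", "vc-1", "vds-1", frozenset({"ds-shared"}))
other_vds = Host("esx-other", "vc-1", "vds-2", frozenset({"ds-shared"}))
```

Here `can_vmotion(rack1, rack10)` holds because both hosts share the VDS and a datastore, while `can_vmotion(rack1, other_vds)` does not because the hosts are on different VDS instances.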
Scenario 2 – Logical Network: Multiple Physical L2 Domains
In brownfield deployments, clusters are typically deployed with multiple VDS, one per cluster. Each VDS is on a
different subnet, terminated on an aggregation router. Logical L2 networks can span across these subnet
boundaries. The main difference as compared to scenario 1 is that VXLAN transport traffic is routed instead of
being switched in the same subnet. Multicast and ECMP requirements are dependent on the physical topology.
Refer to the L3 configuration in the access layer shown in Figure 10 for details on multicast-related configuration.
[Figure 10 diagram: virtual machines attached to logical L2 networks VXLAN 5001 and VXLAN 5002 over the VXLAN fabric. Two vSphere Distributed Switches are shown, one for Cluster 1 (Rack 1, VLAN 10) and one for Cluster 2 (Rack 10, VLAN 20), with the two VLANs joined through a router. Legend: VTEP, vwire5001 port group, vwire5002 port group, switch, router.]
Figure 10. Brownfield Deployment – Two VDS
Keep in mind the following key points while deploying:
•VTEPs in different subnets can route traffic to each other.
•A VTEP IP address is assigned either via DHCP or statically via vCenter.
•Applications running in virtual machines cannot detect the physical topology; the virtual machines on a logical network remain in the same subnet.
•Virtual machines communicating outside the logical network (to the Internet or to nonlogical networks within
the datacenter) require a VXLAN gateway. (See appendix 2 for packet flows.)
vMotion Boundary
In this two-VDS VXLAN deployment, the vMotion boundary is limited to one VDS. The workloads deployed on a
logical L2 network cannot be moved to a host connected to a different VDS. However, if workload placement
alone is the goal, this design enables the choice of any cluster for the deployment of a workload, even if the
clusters are on different physical VLANs.
Scenario 3 – Logical Network: Multiple Physical L2 Domains with vMotion
If vMotion migration across clusters is an important requirement, the following modified design should be used.
Here, a single VDS spans across multiple clusters, enabling vMotion migration across clusters. The following are
some of the key differences in this design:
•No VLAN ID is configured during the VXLAN preparation. The VDS will not perform VLAN tagging for the
VXLAN traffic going out on the uplinks (no VST).
•Dedicated uplinks are required on the hosts to carry untagged VXLAN traffic.
•The physical-switch ports, where the host uplinks are connected, are configured as access ports with appropriate
VLAN. For example, as shown in Figure 11, access switch ports of cluster 1 are configured with VLAN 10; those
of cluster 2 are configured with VLAN 20.
[Figure 11 diagram: virtual machines attached to logical L2 networks VXLAN 5001 and VXLAN 5002 over the VXLAN fabric. A single vSphere Distributed Switch spans Cluster 1 (Rack 1) and Cluster 2 (Rack 10), with no VST on either cluster. The physical access switches use VLAN 10 and VLAN 20, joined through a router. Legend: VTEP, vwire5001 port group, vwire5002 port group, switch, router.]
Figure 11. Brownfield Deployment – Single VDS to Enable vMotion Migration
Because the storage network is parallel and independent of a logical network, it is assumed that both clusters
can reach the shared storage. Standard vMotion migration distance limitations and single vCenter requirements
still apply. Because the moved virtual machine is still in the same logical L2 network, no IP readdressing is
necessary, even though the physical hosts might be on different subnets.
Scenario 4 – Logical Network: Stretched Clusters Across Two Datacenters
Stretched clusters offer the ability to balance workloads between two datacenters. A stretched cluster design
pools resources in the two datacenters and enables nondisruptive workload mobility, including migration of
services between geographically adjacent sites. Virtual machine–to–virtual machine traffic is
within the same logical L2 network, enabling L2 adjacency across datacenters. The virtual machine–to–virtual
machine traffic dynamics are the same as those previously cited. In this section, we will discuss the impact of this
design on north–south traffic (virtual machine communicating outside the logical L2 network) because that is
the main difference as compared to previous scenarios.
Figure 12 shows two sites, site A and site B, with two hosts deployed in each site along with the storage and the
replication setup. Here all hosts are managed by a single vCenter Server and are part of the same VDS. In
general, for stretched cluster design, the following requirements must be met:
•The two datacenters must be managed by one vCenter Server because the VXLAN scope is limited to a single
vCenter Server.
•vMotion support requires that the datacenters have a common stretched VDS (as in scenario 3). A multiple
VDS design, discussed in scenario 2, can also be used, but vMotion migration will not work.
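[Figure 12 diagram: a stretched cluster on one vSphere Distributed Switch spanning site A and site B across a WAN. Each site has its own IP network, Internet connection and storage (Storage A and Storage B, with a read/write LUN and a read-only LUN linked over FC/IP). A virtual machine on VXLAN 5002 is shown before and after vMotion.]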
Figure 12. Stretched Cluster
In this design, the vCloud Networking and Security Edge gateway is pinned to one of the datacenters (site A in
this example). In the vCloud Networking and Security 5.1 release, each VXLAN segment can have only one
vCloud Networking and Security Edge gateway. This has the following implications:
•All north–south traffic from the second datacenter (site B) in the same VXLAN (5002) must transit the
vCloud Networking and Security Edge gateway in the first datacenter (site A).
•Also, when a virtual machine is moved from site A to site B, all north–south traffic returns to site A before
reaching the Internet or other physical networks in the datacenter.
•Storage must support a “campus cluster” configuration.
These implications raise obvious concerns regarding bandwidth consumption and latency, so an active–active
multidatacenter design is not recommended. This design is mainly targeted toward the following scenarios:
•Datacenter migrations that require no IP address changes on the virtual machines. After the migration has
been completed, the vCloud Networking and Security Edge gateway can be moved to the new datacenter,
requiring a change in external IP addresses on the vCloud Networking and Security Edge only. If all virtual
machines have public IP addresses and are not behind vCloud Networking and Security Edge gateway network
address translation (NAT), more changes are needed.
•Deployments that require limited north–south traffic. Because virtual machine–to–virtual machine traffic does
not cross the vCloud Networking and Security Edge gateway, the stretched cluster limitation does not
apply to it.
These scenarios also benefit from elastic pooling of resources and initial workload placement flexibility. If virtual
machines are in different VXLANs, the limitations do not apply.
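The effect of pinning the Edge gateway to one site can be illustrated with a small path model. This is only a sketch of the reasoning in this section; the site labels and hop names are hypothetical.

```python
def north_south_path(vm_site: str, edge_site: str) -> list[str]:
    """List the hops for traffic leaving the logical network.

    The only Edge gateway for the VXLAN segment lives in edge_site, so a
    virtual machine in another site must first cross the inter-site link.
    """
    path = [f"vm@{vm_site}"]
    if vm_site != edge_site:
        path.append("inter-site link")
    path.append(f"edge@{edge_site}")
    path.append("external network")
    return path


def crosses_sites(vm_site: str, edge_site: str) -> bool:
    """True when north-south traffic consumes inter-site bandwidth."""
    return "inter-site link" in north_south_path(vm_site, edge_site)
```

A virtual machine moved from site A to site B changes its path from three hops to four, which is the bandwidth and latency cost described above.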
Managing IP Addresses in Logical Networks
In a large cloud environment with multiple tenants, IP address management is a critical task. In this section, we
will focus on IP address management of the virtual machines deployed on the VXLAN logical L2 network. Each
logical L2 network created with VXLAN is a separate L2 broadcast domain. This L2 broadcast domain can be
associated with a separate subnet using a private IP space or publicly routable IP space. Depending on whether
private IP space or publicly routable IP space is used for the assignment to the logical networks, customers must
choose either the NAT or the non-NAT option on the vCloud Networking and Security Edge gateway. So the IP
address assignment depends on whether the virtual machine is connected to a logical L2 network through a
NAT or non-NAT configuration. Let’s take a look at the example with the following two deployments:
1) Using the NAT and DHCP services of the vCloud Networking and Security Edge gateway
2) Not using the NAT and DHCP services of the vCloud Networking and Security Edge gateway
With Network Address Translation
In deployments where customers have limited IP address space, NAT is used to provide address translation from
private IP space to the limited public IP addresses. By utilizing vCloud Networking and Security Edge gateway
services, customers can provide individual tenants with the ability to create their own pool of private IP
addresses, which ultimately get mapped to the publicly routable external IP address of the external vCloud
Networking and Security Edge gateway interface.
Figure 13 shows a three-tenant deployment, with each tenant virtual machine connected to separate logical L2
networks. The blue, green and purple virtual wires (VXLAN segments) are connected to the three internal
interfaces of the vCloud Networking and Security Edge gateway; the external interface of the vCloud
Networking and Security Edge is connected to the Internet via a datacenter router.
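[Figure 13 diagram: three logical networks attached to the internal interfaces of one vCloud Networking and Security Edge gateway with standard NAT configuration and DHCP service:
VXLAN 5000, 192.168.1.0/24, gateway interface 192.168.1.1, virtual machines 192.168.1.10 and 192.168.1.11
VXLAN 5001, 192.168.2.0/24, gateway interface 192.168.2.1, virtual machine 192.168.2.10
VXLAN 5002, 192.168.3.0/24, gateway interface 192.168.3.1, virtual machine 192.168.3.10
The gateway external interface 172.26.10.1 connects to the external network 172.26.10.0/24 and the Internet.]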
Figure 13. NAT and DHCP Configuration on vCloud Networking and Security Edge Gateway
The following are some configuration details of the vCloud Networking and Security Edge gateway:
•Blue, green and purple virtual wires (VXLAN segments) are associated with separate port groups on a VDS.
Internal interfaces of the vCloud Networking and Security Edge gateway connect to these port groups.
•The vCloud Networking and Security Edge gateway interface connected to the blue virtual wire is configured
with IP 192.168.1.1.
•DHCP service is enabled on this internal interface of the vCloud Networking and Security Edge gateway by
providing a pool of IP addresses, for example, 192.168.1.10 to 192.168.1.50.
•All the virtual machines connected to the blue virtual wire receive an IP address from the DHCP service
configured on Edge or on the same subnet.
•The NAT configuration on the external interface of the vCloud Networking and Security Edge gateway allows
virtual machines on a virtual wire to communicate with devices on the external network. This communication is
allowed only when the requests are initiated by the virtual machines connected to the internal interface of the
vCloud Networking and Security Edge.
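Before a DHCP pool is entered for an internal interface, the pool can be checked against the interface subnet. The sketch below uses Python's standard `ipaddress` module with the blue virtual wire values from the example; the function name is illustrative, and the rule that the pool should exclude the interface address is an assumption rather than a documented product constraint.

```python
import ipaddress


def validate_dhcp_pool(interface_cidr: str, pool_start: str, pool_end: str) -> bool:
    """Check that a DHCP pool fits the subnet of an Edge internal interface."""
    iface = ipaddress.ip_interface(interface_cidr)
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    in_subnet = start in iface.network and end in iface.network
    ordered = start <= end
    # Assumption: the gateway's own address should not be handed out to clients.
    excludes_gateway = not (start <= iface.ip <= end)
    return in_subnet and ordered and excludes_gateway
```

For the blue virtual wire, `validate_dhcp_pool("192.168.1.1/24", "192.168.1.10", "192.168.1.50")` passes, while a pool taken from 192.168.2.0/24 would be rejected.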
In situations where overlapping IP and MAC address support is required, one vCloud Networking and Security
Edge gateway per tenant is recommended. Figure 14 shows an overlapping IP address deployment with two
tenants and two separate vCloud Networking and Security Edge gateways.
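[Figure 14 diagram: two tenants, each with its own vCloud Networking and Security Edge gateway. Tenant 1 uses VXLAN 5000 and tenant 2 uses VXLAN 5001; both logical networks use 10.10.1.0/24 with gateway interface 10.10.1.1, and the virtual machines shown have the addresses 10.10.1.10, 10.10.1.11 and 10.10.1.10. The gateway external interfaces 10.10.20.1 and 10.10.10.1 connect to the external network 10.10.0.0/16 and the IP core.]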
Figure 14. Overlapping IP and MAC Addresses
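The following sketch illustrates why one gateway per tenant makes overlapping addresses workable: each gateway translates its own copy of 10.10.1.0/24 to a distinct external address. The class is hypothetical, and the pairing of external addresses to tenants is an assumption, because the figure does not make that pairing explicit here.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantEdge:
    """Hypothetical model of one tenant's Edge gateway performing source NAT."""
    tenant: str
    segment_id: int
    internal_net: str
    external_ip: str

    def source_nat(self, vm_ip: str) -> str:
        """Return the external address that outbound traffic from vm_ip uses."""
        if ipaddress.ip_address(vm_ip) not in ipaddress.ip_network(self.internal_net):
            raise ValueError(f"{vm_ip} is not on the logical network of {self.tenant}")
        return self.external_ip


# Assumed pairing of external addresses to tenants, for illustration only.
tenant1 = TenantEdge("tenant-1", 5000, "10.10.1.0/24", "10.10.10.1")
tenant2 = TenantEdge("tenant-2", 5001, "10.10.1.0/24", "10.10.20.1")
```

Both tenants can own a virtual machine at 10.10.1.10, yet `tenant1.source_nat("10.10.1.10")` and `tenant2.source_nat("10.10.1.10")` yield different external addresses, so the external network never sees the overlap.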
Without Network Address Translation
Customers who are not limited by routable IP addresses, have virtual machines with public IP addresses or do
not want to deploy NAT can use static routing on vCloud Networking and Security Edge.
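[Figure 15 diagram: three logical networks with routable addresses attached to the internal interfaces of one vCloud Networking and Security Edge gateway:
VXLAN 5000, 172.26.1.0/24, gateway interface 172.26.1.1, virtual machines 172.26.1.10 and 172.26.1.11
VXLAN 5001, 172.26.2.0/24, gateway interface 172.26.2.1, virtual machine 172.26.2.10
VXLAN 5002, 172.26.3.0/24, gateway interface 172.26.3.1, virtual machine 172.26.3.10
The gateway external interface 172.26.10.1 connects to the external network 172.26.10.0/24 and the Internet.]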
Figure 15. Routable IP Assignments to the Logical Networks
In the deployment shown in Figure 15, the vCloud Networking and Security Edge gateway is not configured with
the DHCP and NAT services. However, static routes are set up between different interfaces of the vCloud
Networking and Security Edge gateway.
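The forwarding decision in the non-NAT deployment can be sketched as a longest-prefix match over a static route table. The subnets come from Figure 15; the interface labels and the default route through the external interface are assumptions made for illustration.

```python
import ipaddress

# Static routes keyed by prefix. Interface names are hypothetical labels.
STATIC_ROUTES = {
    "172.26.1.0/24": "vxlan-5000",
    "172.26.2.0/24": "vxlan-5001",
    "172.26.3.0/24": "vxlan-5002",
    "0.0.0.0/0": "external",  # assumed default route toward the datacenter router
}


def outgoing_interface(destination: str) -> str:
    """Pick the interface whose prefix is the longest match for the destination."""
    dst = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in STATIC_ROUTES
        if dst in ipaddress.ip_network(prefix)
    ]
    best = max(matches, key=lambda net: net.prefixlen)
    return STATIC_ROUTES[str(best)]
```

Traffic to 172.26.3.10 leaves through the interface on VXLAN 5002, while any destination outside the three logical subnets falls through to the external interface.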
Other Network Services
•In a multitenant environment, vCloud Networking and Security Edge firewall can also be used to segment
intertenant and intratenant traffic.
•vCloud Networking and Security Edge load balancer can be used for load balancing external to internal Web
traffic, for example, when multiple Web servers are deployed on the logical network. Static routes must be
configured on the upstream router to properly route inbound traffic to the vCloud Networking and Security
Edge external interface.
•vCloud Networking and Security Edge also provides DNS relay functionality to resolve domain names. DNS
relay configuration should point to an existing DNS in the physical network. Alternatively, a DNS server can be
deployed in the logical network itself.
Scaling Network Virtualization
In this section, we present the design considerations that can be followed for the different components while
planning the scaling of VXLAN networks and associated network services. The following key components and
parameters should be taken into account:
1)VDS:
•One vCenter Server can have 128 VDS.
•One VDS can span across 500 hosts.
•One VDS can support 10,000 port groups. Because a new port group is created for every logical L2
network, this number dictates the number of L2 logical networks that can be created.
2)vCloud Networking and Security Edge gateway:
•Each vCloud Networking and Security Edge gateway can have a maximum of 10 interfaces and can
be configured to connect to an internal or external network. The number of logical networks requiring
gateway services determines the number of gateway instances that must be deployed based on the
10-interfaces-per-gateway maximum. For example, if one interface per gateway is connected to an
external network (leaving 9 for internal networks), the number of gateway instances required for
90 logical L2 networks would be 90/9—that is, 10 vCloud Networking and Security Edge gateway devices.
•The gateway is available in three different sizes, based on capacity.
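The gateway sizing arithmetic above generalizes to any number of logical networks. A minimal sketch, using the 10-interface maximum stated in the text and a hypothetical function name:

```python
import math


def gateways_required(
    logical_networks: int,
    external_interfaces_per_gateway: int = 1,
    max_interfaces: int = 10,
) -> int:
    """Number of Edge gateways needed to give every logical network an interface."""
    internal = max_interfaces - external_interfaces_per_gateway
    if internal <= 0:
        raise ValueError("no interfaces left for internal networks")
    # Round up: a partially used gateway is still a whole gateway.
    return math.ceil(logical_networks / internal)
```

`gateways_required(90)` reproduces the example of 10 gateways; one more logical network raises the count to 11.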
3)VXLAN Traffic:
•The planned virtual machine consolidation ratio should take into consideration the amount of virtual
machine traffic that VTEP must handle.
•Meet the bandwidth requirements for the VXLAN traffic by assigning a sufficient number of NICs to it.
To optimally utilize the uplinks, use link aggregation methods on the physical switches.
4)Multicast:
•Each VXLAN logical network is uniquely identified by a combination of a number called segment ID
(determined from a range defined by the user) and the configured multicast group. The multicast
group–to–VXLAN segment ID mapping is handled by the vCloud Networking and Security Manager.
There is no need to have one-to-one mapping between the segment ID and the multicast group. In case
of a limited number of multicast groups, vCloud Networking and Security Manager maps multiple logical
networks (segment IDs) to one multicast group.
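When there are fewer multicast groups than logical networks, several segment IDs share one group. The sketch below shows one possible many-to-one assignment using round-robin; this is an assumed policy for illustration only, since the text does not describe how vCloud Networking and Security Manager chooses the mapping.

```python
import ipaddress


def map_segments_to_groups(segment_ids, first_group: str, group_count: int) -> dict:
    """Assign each segment ID a multicast group from a contiguous range.

    Round-robin assignment is an assumption, not the Manager's documented algorithm.
    """
    if group_count < 1:
        raise ValueError("at least one multicast group is required")
    base = ipaddress.IPv4Address(first_group)
    return {
        segment: str(base + (index % group_count))
        for index, segment in enumerate(segment_ids)
    }
```

With four segment IDs starting at 5000 and two groups starting at 239.1.1.1, segments 5000 and 5002 share the first group and segments 5001 and 5003 share the second.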
Consumption Models
After the VXLAN configuration has been completed, customers can create and consume logical L2 networks on
demand. Depending on the type of vCloud Networking and Security bundle purchased, they have the following
three options:
1)Use the vCloud Director interface.
2)Use the vCloud Networking and Security Manager interface.
3) Use REST APIs offered by vCloud Networking and Security products.
In vCloud Director
vCloud Director creates a VXLAN network pool implicitly for each provider VDC backed by VXLAN prepared
clusters. The total number of logical networks that can be created using a VXLAN network pool is determined by
the configuration at the time of VXLAN fabric preparation. A cloud administrator can in turn distribute this total
number to the various organization VDCs backed by the provider VDC. The quota allocated to an organization
VDC determines the number of logical networks (organization VDC/ VMware vSphere vApp™ networks) backed
by VXLAN that can be created in that organization VDC.
In vCloud Networking and Security Manager
Customers who do not have a vCloud Director deployment can consume the logical L2 networks through the
vCloud Networking and Security Manager Web interface or through the vSphere Client network virtualization
plug-in.
Using API
In addition to vCloud Director and vCloud Networking and Security Manager, vCloud Networking and Security
components can be managed using APIs provided by VMware. For detailed information on how to use the APIs,
refer to the vCloud Networking and Security 5.1 API Programming Guide at
https://www.vmware.com/pdf/vshield_51_api.pdf.
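As a rough illustration of API-driven consumption, the sketch below builds (but does not send) an authenticated XML request with Python's standard library. The resource path and XML element names are supplied by the caller; the values shown in the usage lines are assumptions that must be verified against the API Programming Guide, and the host name and credentials are placeholders.

```python
import base64
import urllib.request
from xml.etree import ElementTree as ET


def build_create_network_request(
    manager_host: str,
    resource_path: str,
    root_element: str,
    fields: dict,
    user: str,
    password: str,
) -> urllib.request.Request:
    """Build an HTTPS POST carrying an XML body and HTTP basic authentication."""
    spec = ET.Element(root_element)
    for tag, value in fields.items():
        ET.SubElement(spec, tag).text = value
    body = ET.tostring(spec, encoding="utf-8")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{manager_host}{resource_path}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )


# Assumed path and element names; check them in the API guide before use.
request = build_create_network_request(
    "vsm.example.com",
    "/api/2.0/vdn/scopes/vdnscope-1/virtualwires",
    "virtualWireCreateSpec",
    {"name": "web-tier", "tenantId": "tenant-1"},
    "admin",
    "placeholder-password",
)
```

The request object can be inspected or passed to `urllib.request.urlopen` once the path, body and certificate handling have been confirmed for the target environment.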
Troubleshooting and Monitoring
The following are some of the important tools that customers should use to troubleshoot and monitor the
VXLAN network. These tools provide the required visibility into the encapsulated VXLAN traffic and also help
manage the overall logical network infrastructure.
Network Health Check
Network Health Check enables proactive reports on virtual and physical network configuration inconsistencies,
reducing operational costs involved in troubleshooting and fixing errors. It checks for the following three
parameters:
•VLAN IDs
•MTU settings
•Teaming configuration
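The three parameters can be compared in a few lines once the virtual and physical settings are known. This sketch only illustrates the kind of inconsistency involved and does not describe how Network Health Check is implemented; the `PortConfig` record and the teaming labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortConfig:
    """Hypothetical summary of one side of a host uplink."""
    vlan_ids: frozenset
    mtu: int
    teaming: str


def find_mismatches(virtual: PortConfig, physical: PortConfig) -> list[str]:
    """Compare VLAN IDs, MTU and teaming between the VDS and the physical switch."""
    mismatches = []
    missing = virtual.vlan_ids - physical.vlan_ids
    if missing:
        mismatches.append(f"VLAN IDs not allowed on physical port: {sorted(missing)}")
    if physical.mtu < virtual.mtu:
        mismatches.append(f"physical MTU {physical.mtu} is below virtual MTU {virtual.mtu}")
    if virtual.teaming != physical.teaming:
        mismatches.append(f"teaming differs: {virtual.teaming} vs {physical.teaming}")
    return mismatches
```

An empty result means the two sides agree on all three parameters; each entry otherwise names one inconsistency to fix.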
VXLAN Connectivity Check – Unicast and Broadcast Tests
The unicast and broadcast tests available through the vCloud Networking and Security Manager enable
customers to test the configuration across the virtual and physical infrastructure. They also enable verification
that all VTEP configurations are correct and that each VTEP can reach other VTEPs. A gateway address on VTEP
is required for this functionality to work. A VTEP IP address must be assigned using DHCP to configure the
gateway, because static IP configuration on VTEP via vCenter Server does not enable gateways to be
configured. Proxy ARP on upstream gateway/router is not a requirement.
Monitoring Logical Flows – IPFIX
NetFlow v10/IPFIX on VDS enables vendors to predefine custom NetFlow records. A new VXLAN template has
been predefined to monitor traffic flows in logical networks. With this template, customers can monitor VXLAN
flows at virtual machine–level granularity.
Port Mirroring
VDS provides multiple standard port mirroring features such as SPAN, RSPAN and ERSPAN that help in detailed
traffic analysis.
Conclusion
The VMware network virtualization solution addresses the current challenges with the physical network
infrastructure and brings flexibility, agility and scale through VXLAN-based logical networks. Along with the
ability to create on-demand logical networks using VXLAN, the vCloud Networking and Security Edge gateway
helps customers deploy various logical network services such as firewall, DHCP, NAT and load balancing on
these networks. The operational tools provided as part of the solution help in the troubleshooting and
monitoring of these overlay networks.
VMware, Inc. 3401 Hillview Avenue Palo Alto CA 94304 USA Tel 877-486-9273 Fax 650-427-5001 www.vmware.com
Copyright © 2013 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed
at http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be
trademarks of their respective companies. Item No: VMW-WP-NETWORK-VIRT-GUIDE-USLET-101
Docsource: OIC - 12VM008.07