June 4, 2012
As data centers are forced to accommodate rapidly growing volumes of information, their capacity to process, store, and transfer data is struggling to keep pace. A Gartner survey [1] conducted in November 2010 found that 47% of representatives from 1,004 large enterprises in eight countries ranked data growth as one of their top three challenges, followed by system performance and scalability at 37%, and network congestion and connectivity architecture at 36%.
Growing demand driven by data proliferation, new applications, and federated applications is exhausting the capacities of many data centers. According to the Uptime Institute's 2011 Data Center Industry Survey [2], more than a third (36%) of data center facilities will run out of space, power, or cooling between 2011 and 2012.
These findings were based on a survey of 525 data center executives. Of the respondents reaching full capacity, 40% plan to build a new data center, while 62% plan to address the problem via server consolidation. Server virtualization is the primary mechanism for implementing server consolidation, but it introduces new sources of server-to-server traffic that pose challenges for the scalability of the data center network. Server virtualization also requires that the bandwidth of each physical server's network connection scale in proportion to the increased demands for network I/O from multiple virtual machines sharing the same physical server port(s).
As IT managers re-architect or upgrade their data centers, their focus will be on assuring the ability to scale computing performance and networking capacity within the facility's constraints of space, power, and cooling.
The goal of this white paper is to help data center owners and designers quantify the value of 40 Gigabit Ethernet switch port density and scalability as part of a Total Cost of Ownership (TCO) approach to evaluating alternative data center network designs and implementations with different brands of switching products.
[1] http://www.informationweek.com/news/hardware/data_centers/229600034
[2] http://gigaom2.files.wordpress.com/2011/06/inaugural-uptime-institute-annual-data-center-survey-fact-sheet.pdf
Definition of Symbols

P: the number of 40 GbE ports per aggregation switch
m: the over-subscription ratio
S: the number of aggregation switches
Ccore: the number of 40 GbE ports per aggregation switch used to connect to the LAN core
Cagg: the number of 40 GbE ports per aggregation switch used to connect to other aggregation switches
Cacc: the number of connections between TOR switches
P - Ccore - Cagg: the number of 40 GbE ports per aggregation switch available for connections to the access layer
4 x m x (P - Ccore - Cagg): the number of 10 GbE access layer ports available for server connections per aggregation switch
4 x S x m x (P - Ccore - Cagg): for a two tier LAN design with multiple aggregation switches, the total number of available server ports
This white paper is part of an on-going series of documents and webinars from Extreme Networks that focus on quantifying the value of the key attributes of a data center LAN switch.
The widespread adoption of server virtualization within the data center is introducing new traffic patterns that must be accommodated by the data center network. In addition to traditional client/server traffic flows (“north-south” traffic), virtual machine migrations and access to networked storage give rise to a significant amount of server-to-server (“east-west”) traffic within PODs and even across the entire data center. Virtual machine migrations among physical servers are used to support a number of critical data center functions, including load balancing, capacity expansion, disaster recovery, and high availability. Some of the impacts that server virtualization has had on the data center network include:
Flatter Networks
Virtual machine migrations are typically performed within the boundaries of a VLAN.
Extending VLANs across wider segments of the data center requires the elimination or circumvention of Layer 3 boundaries between server PODs. Flatter networks can help lower costs by reducing configuration complexity and by reducing the number of devices that must be managed. Another benefit of flatter data center networks is reduced end-to-end latency due to fewer hops between source and destination nodes.
Higher Bandwidth Server Connectivity
Servers with multi-core processors can support a large and growing number of virtual machines, increasing the demand for network I/O capacity. The trend to multi-core, multi-virtual machine (VM) servers is accelerating the need to transition server connectivity from GbE to 10 GbE. Adding to that pressure is the fact that the next generation of servers emerging in 2012 is expected to feature dual-port 10 GbE LAN on Motherboard (LOM) interfaces based on 10GBASE-T. 10 GbE server connectivity will also provide support for unified storage networking based on NAS, iSCSI, and FCoE.
I/O Virtualization
I/O virtualization supports a number of virtual Ethernet connections over a single physical port. This allows the assignment of a virtual GbE port to each VM, a best practice recommended by VMware.
The combination of flatter network designs and the transition of server connectivity from GbE to dual 10 GbE will place much greater emphasis on the scalability and port density of 10 GbE and 40 GbE data center switches.
The scalability of a LAN architecture is determined by the number of server ports that can be supported with a given level of redundancy and over-subscription at different points within the LAN topology. Many data center LANs being deployed today are based on a two tier design that provides high levels of redundancy and low over-subscription levels for server-to-server traffic. Two tier LAN designs are frequently implemented with Top of Rack (TOR) access switches in conjunction with chassis-based aggregation switches. The aggregation switches are connected to the LAN core and to the Internet, but all the server-to-server traffic within the data center flows only through the two tiers of access and aggregation switches.
Figure 1 shows a general model for two tier switched LANs that takes into account both connections for redundancy and connections to the LAN core. It is assumed that all servers are attached to the access/TOR switches via 10 GbE ports. Any inter-switch links at the access layer are assumed to be 10 GbE, and all other inter-switch links (i.e., inter-aggregation, access-to-aggregation, and aggregation-to-core) are assumed to be 40 GbE. If a given model of switch does not yet support 40 GbE, a LAG with four 10 GbE member links could be substituted. It should be noted, however, that a 40 GbE link is preferable to a LAG of four 10 GbE links, because a single 40 GbE link avoids the issues that can occur when attempting to load balance traffic that consists of a small number of high volume flows.
[Figure 1 diagram: an aggregation layer of switches with P ports each, of which Ccore connect to the LAN core and Cagg interconnect aggregation switches, leaving P - Ccore - Cagg downlinks per switch; an access layer of TOR switches with Cacc inter-TOR connections; and 4m(P - Ccore - Cagg) 10 GbE server ports below]
Figure 1: Scalability Model for Two Tier Data Center LANs
This model can be applied equally well to two tier LANs based on multi-chassis LAGs and to two tier fat trees. The model focuses on P, the number of 40 GbE ports per aggregation switch, and on the number of ports required to make connections both within and among network tiers. In the model, Ccore is the number of 40 GbE ports per aggregation switch used to connect to the LAN core, while Cagg is the number of 40 GbE ports per aggregation switch used to connect to other aggregation switches (e.g., for ISL/VSL). There may also be 10 GbE inter-switch links within the access/TOR tier to support virtual switch/router functions such as multi-chassis LAG (MLAG) or VRRP.
The access/TOR switches may be over-subscribed, with more switch bandwidth allocated to server connections than is provided from the access tier to the aggregation tier. Throughout this white paper, the over-subscription ratio is referred to as m and is given by:

m = (bandwidth allocated to server access) / (bandwidth allocated to access-to-aggregation connectivity)
A typical high density TOR switch has 48 10 GbE ports for server connectivity and four 40 GbE ports for inter-switch connectivity. Where servers are single-attached to these TOR switches, m is equal to (48 x 10)/(4 x 40) = 3. Where the servers are dual-attached to a pair of TOR switches with active-passive redundancy, m = 3, but the effective over-subscription ratio is 1.5:1 because only one of the pair of server ports is active at any given time. Where the servers are dual-attached to a pair of TOR switches with active-active MLAG redundancy, the requirement for inter-switch connections (Cacc) between the TOR switches means there are two fewer 10 GbE ports per TOR switch available for server connectivity, and the over-subscription ratio is m = (46 x 10)/(4 x 40) = 2.88.
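To make these ratios concrete, the following Python sketch (an illustration of the arithmetic above, not any vendor's tooling; the port counts and speeds are parameters) computes m for the three attachment scenarios:

```python
def oversubscription(server_ports, server_gbps, uplink_ports, uplink_gbps):
    """Over-subscription ratio m = server-facing bandwidth / uplink bandwidth."""
    return (server_ports * server_gbps) / (uplink_ports * uplink_gbps)

# Single-attached servers: 48 x 10 GbE server ports, 4 x 40 GbE uplinks
m_single = oversubscription(48, 10, 4, 40)   # 3.0

# Active-passive dual attachment: same m, but only one of each server's
# two ports carries traffic, so the effective ratio is halved
m_effective = m_single / 2                   # 1.5

# Active-active MLAG: two 10 GbE ports per TOR are consumed by the
# inter-TOR (Cacc) link, leaving 46 server-facing ports
m_mlag = oversubscription(46, 10, 4, 40)     # 2.875, i.e., ~2.88
```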
As shown in Figure 1, the number of 40 GbE ports per aggregation switch available for connections to the access layer is equal to P - Ccore - Cagg, and the number of 10 GbE access layer ports available for server connections per aggregation switch is equal to 4 x m x (P - Ccore - Cagg). For a two tier LAN design with multiple aggregation switches, the number of available server ports is 4 x S x m x (P - Ccore - Cagg), where S is the number of aggregation switches.
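Expressed as code, the scaling relationship is a one-liner; this sketch uses the symbol names from the model (the factor of 4 reflects that one 40 GbE downlink carries the bandwidth of four 10 GbE ports):

```python
def max_server_ports(P, Ccore, Cagg, m, S=1):
    """Maximum 10 GbE server ports: 4 x S x m x (P - Ccore - Cagg).
    For fat trees, S is also capped by the number of 40 GbE uplink
    ports on the TOR switch (e.g., S <= 4 for a 48 x 10 GbE +
    4 x 40 GbE TOR switch)."""
    return 4 * S * m * (P - Ccore - Cagg)

# Example: two 192-port aggregation switches, 2 core uplinks and 2
# inter-switch links each, TOR over-subscription m = 3
print(max_server_ports(P=192, Ccore=2, Cagg=2, m=3, S=2))  # 4512
```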
It should be noted that the model presented in Figure 1 is based on a single aggregation switch, and the factor S needs to be included to account for an aggregation tier with multiple aggregation switches. For fat trees, the number of aggregation (spine) switches is limited by the equal cost forwarding capabilities of the switches (16 paths is a typical limit), as well as by the port density P. The port configuration of the access/TOR switch also imposes limitations on the number of aggregation/spine switches that can be configured. For example, for a TOR switch with 48 10 GbE ports and four 40 GbE ports, the number of 40 GbE aggregation switches is limited to four. Scaling beyond S = 4 requires a denser access switch with more 40 GbE ports, and more 10 GbE ports as well, in order to maintain a desired maximum over-subscription ratio. The ultimate fat tree scalability is attained where the 10 GbE/40 GbE access switch has the same switching capacity as the aggregation/spine switches.
With these caveats, the model takes into account redundancy and scalability for various Layer
2 and Layer 3 two-tier network designs as summarized in Table 1.
Parameter                | 2 Tier Layer 2           | 2 Tier Layer 2 MLAG                          | 2 Tier Layer 2 Fat Tree                      | 2 Tier Layer 3 Fat Tree
Redundancy               | none                     | full                                         | full                                         | full
Ccore                    | variable                 | variable                                     | variable                                     | variable
Cagg                     | 0                        | ISL/VSL: 2 per agg switch                    | 0                                            | 0
Cacc                     | 0                        | active/passive: 0; active/active: 2 per TOR  | active/passive: 0; active/active: 2 per TOR  | 2 per TOR (active/passive or active/active)
Max 10 GbE server ports  | 4Sm(P-Ccore-Cagg), S = 1 | 4Sm(P-Ccore-Cagg), S = 2                     | 4Sm(P-Ccore-Cagg), S = # of agg switches     | 4Sm(P-Ccore-Cagg), S = # of agg switches
Scaling                  | Larger P, m              | Larger P, m                                  | Larger P, m, S                               | Larger P, m, S
Table 1: Scalability of Two Tier 10/40 GbE Data Center LANs
As highlighted in Table 1, the only way that the scalability of the data center LAN can be increased is by increasing the:

- Number of aggregation switches
- Number of 40 GbE ports per aggregation switch
- Level of over-subscription
As stated earlier, a typical initial design process might start by identifying the required number of server ports, the required redundancy, and an upper limit on the over-subscription ratio. As shown in Figure 2, calculating the required number of 40 GbE ports per aggregation switch to meet these requirements is accomplished by inverting the scaling formula. An IT organization can apply the formula as follows:
1. Determine the required number of server ports.
2. Select the desired network type from Table 1. This will determine Cagg.
3. Select an access/TOR switch model. This, together with the network type, will determine Cacc and m.
4. Select the desired Ccore. This will determine the over-subscription ratio for client/server traffic via the core.
5. Calculate the required port density of the aggregation switch:

P = ((# of server ports) / (4 x S x m)) + Ccore + Cagg
Figure 2: Required Aggregation Switch Port Density for Two Tier 10/40 GbE Data Center LANs
To exemplify the formula shown in Figure 2, consider the following network parameters:

Number of server ports = 4,512
m = 3
S = 2
Ccore = 2
Cagg = 2

The formula in Figure 2 indicates that, in order to support these network parameters, an aggregation switch with 192 40 GbE ports is required.
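The same calculation can be run directly as code; a minimal sketch of the Figure 2 formula (the function name is ours, chosen for readability):

```python
def required_aggregation_ports(server_ports, S, m, Ccore, Cagg):
    """Figure 2: P = (# of server ports) / (4 x S x m) + Ccore + Cagg."""
    return server_ports / (4 * S * m) + Ccore + Cagg

print(required_aggregation_ports(4512, S=2, m=3, Ccore=2, Cagg=2))  # 192.0
```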
Figure 3 shows an example of a data center network that provides fully redundant Layer 2 server-to-server connectivity based on 94 TOR switches, each having 48 10 GbE ports and four 40 GbE ports, plus a pair of high density aggregation switches with 192 40 GbE ports each. The topology is an MLAG Layer 2 network with over-subscribed TOR switches. Each of the 2,256 servers is connected to two TOR switches in an active/passive mode. The same configuration could also support 4,512 single-attached servers. With active/passive redundancy, the over-subscription of access switches for server-to-server traffic is 1.5:1.
[Figure 3 diagram: two aggregation switches with 192 40 GbE ports each, interconnected via ISL/VSL and with four 40 GbE links to the LAN core; 94 TOR switches, each connected by an M-LAG of two 40 GbE links to each aggregation switch; 4,512 10 GbE server ports supporting 2,256 dual-attached active/passive servers (over-subscription 1.5:1) or 4,512 single-attached servers (over-subscription 3:1)]
Figure 3: Redundant Two Tier Network Configuration (Active/Passive Server Attachment)
For active-active server connectivity, each pair of TOR switches would need to be configured as a virtual switch with a pair of inter-TOR 10 GbE links for the ISL/VSL connectivity required by the virtual switch, as shown in Figure 4. This would reduce the number of servers per TOR switch from 24 to 23 and the number of dual-attached servers to 2,072. With active/active redundant MLAG server connectivity, the over-subscription ratio for server-to-server traffic is 2.88:1.
[Figure 4 diagram: two aggregation switches with 192 40 GbE ports each, interconnected via ISL/VSL and with four 40 GbE links to the LAN core; 94 TOR switches, each connected by an M-LAG of two 40 GbE links to each aggregation switch; 4,144 10 GbE server ports supporting 2,072 dual-attached active/active servers (over-subscription 2.88:1)]
Figure 4: Redundant Two-Tier Network Configuration (Active/Active Server Attachment)
Building a comparable network with essentially the same number of 10 GbE server ports and similar over-subscription ratios using similar TOR switches and an aggregation switch with half the density (i.e., 96 40 GbE ports) requires some design changes. Comparing the two designs provides an illustration of the effect that the density of the aggregation switch can have on the network design and the resulting TCO.
One possibility would be to build a Layer 2 fat tree network using four aggregation switches in the spine/aggregation layer and the same number of TOR switches (94) as the leaf/access switches. However, most TOR switches do not yet support Layer 2 equal cost multi-path forwarding alternatives other than some form of MLAG. One workaround is to move the Layer 3 boundary from the aggregation switch to the TOR switch and build a Layer 3 fat tree with OSPF ECMP providing the multi-path functionality. Figure 5 shows what this could look like. Here the ISL links are at the TOR level rather than the aggregation level, and the server connections can be made active/active without affecting the topology. With active/passive redundancy, the over-subscription of aggregation switches for server-to-server traffic is 1.44:1, while with active/active redundant server connectivity, the over-subscription ratio is 2.88:1. Note that Layer 2 and Layer 3 fat trees based on switches with the same port densities at the aggregation and access levels have the same physical topology.
[Figure 5 diagram: four aggregation switches with 96 40 GbE ports each and eight 40 GbE links to the LAN core; 94 TOR switches, each with one 40 GbE link to each aggregation switch; 4,144 10 GbE server ports supporting 2,072 dual-attached servers (active/passive over-subscription 1.44:1; active/active over-subscription 2.88:1) or 4,144 single-attached servers]
Figure 5: Redundant Two-Tier, Layer 3 Fat Tree
If a TCO comparison is made of the two networks shown in Figures 4 and 5, some of the differences to consider are the:

- CAPEX and OPEX differences of four switches vs. two at the aggregation level, including switch cost, power capacity requirements, rack space requirements, annual power, annual routine administration, and annual service contract costs
- Difference in the number of server ports per TOR switch
- Differences in over-subscription ratios to the core
- Eight links vs. four links to the LAN core needed for redundancy
- Administrative cost and complexity differences of 98 Layer 3 devices vs. two Layer 3 devices
In addition, in a Layer 3 fat tree, there is a need for pre-standard VXLAN, NV-GRE or some other virtual networking solution to enable VM migration across Layer 3 boundaries.
This example shows some of the complexities that can be encountered in comparing the
TCOs of competing data center switching solutions that are based on switches of different port densities, as well as somewhat different functionality.
A typical design process for a data center LAN would include an analysis of the client/server and the server-to-server traffic flows, resulting in a preferred topology as well as the required number of server access ports, permissible over-subscription, required redundancy, and required connectivity of the data center to the campus LAN and/or Internet.
Adding redundancy at the server access level (e.g., by taking advantage of dual port 10 GbE
LOMs) and connecting each server to two physical access switches doubles the number of server access ports required. Similarly, adding redundancy at other layers of the hierarchy can essentially double the number of ports required at those levels.
The formula in the preceding section can be used to estimate the total number of 10/40 GbE ports that are required by the design and the number of aggregation and access switches.
These estimates can then be used to develop Total Cost of Ownership (TCO) comparisons between different models of switch or different two tier switch topologies.
The equations in Figure 6 provide a simple example of a TCO analysis for a greenfield data center network requiring all new switches, floor space, and power/cooling facilities. This analysis focuses on the comparison of competing models of switch in a similar topology. For simplicity, costs of cabling and some other costs are not considered.
CAPEX(Aggregation) = [(power/port) x (# of ports) + (power/chassis) x (# of switches)] x ($ cost/kW of power/cooling capacity)
                   + [(racks/switch) x (# of switches) x (18 sq ft/rack)] x ($ cost/sq ft)
                   + [(racks/switch) x (# of switches)] x ($ cost/rack)
                   + [(# of switches) x ($ cost/switch)]

OPEX(Aggregation) = [(# of switches) x (kW/switch) x (8,760 hrs/year) x ($ cost/kWhr)]
                  + [(# of switches) x ($ cost of annual routine admin/switch)]
                  + [(# of switches) x ($ annual service contract/switch)]

CAPEX(TOR) = [(power/switch) x (# of switches)] x ($ cost/kW of power/cooling capacity)
           + [(racks/switch) x (# of switches) x (18 sq ft/rack)] x ($ cost/sq ft)
           + [(racks/switch) x (# of switches)] x ($ cost/rack)
           + [(# of switches) x ($ cost/switch)]

OPEX(TOR) = [(# of switches) x (kW/switch) x (8,760 hrs/year) x ($ cost/kWhr)]
          + [(# of switches) x ($ cost of annual routine admin/switch)]
          + [(# of switches) x ($ annual service contract/switch)]

TCO = CAPEX(Aggregation) + CAPEX(TOR) + NPV[OPEX(Aggregation) + OPEX(TOR)]
Figure 6: A Simple TCO Model
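The Figure 6 equations translate directly into a short Python sketch. The function and parameter names below are ours, chosen for readability; the arithmetic follows Figure 6 as written:

```python
HOURS_PER_YEAR = 8760
SQ_FT_PER_RACK = 18  # rack footprint assumed by the Figure 6 model

def capex(n_switches, cost_per_switch, racks_per_switch, total_power_kw,
          cost_per_kw_capacity, cost_per_sq_ft, cost_per_rack):
    """CAPEX per Figure 6. total_power_kw is the provisioned power for all
    switches of this type: (power/port x # of ports) + (power/chassis x
    # of switches) for aggregation, or (power/switch x # of switches) for TOR."""
    return (total_power_kw * cost_per_kw_capacity
            + racks_per_switch * n_switches * SQ_FT_PER_RACK * cost_per_sq_ft
            + racks_per_switch * n_switches * cost_per_rack
            + n_switches * cost_per_switch)

def opex(n_switches, kw_per_switch, cost_per_kwhr, admin_per_switch,
         service_per_switch):
    """Annual OPEX per Figure 6: energy + routine admin + service contract."""
    return n_switches * (kw_per_switch * HOURS_PER_YEAR * cost_per_kwhr
                         + admin_per_switch + service_per_switch)

def npv(annual_cost, years=5, rate=0.04):
    """Net present value of a constant annual cost stream."""
    return sum(annual_cost / (1 + rate) ** t for t in range(1, years + 1))

def tco(capex_agg, capex_tor, opex_agg, opex_tor, years=5, rate=0.04):
    """TCO = CAPEX(Aggregation) + CAPEX(TOR) + NPV[OPEX(Agg) + OPEX(TOR)]."""
    return capex_agg + capex_tor + npv(opex_agg + opex_tor, years, rate)
```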
According to the Uptime Institute [3], the cost of provisioning a kilowatt of UPS power/cooling capacity for a Tier II data center is $12,500, while the corresponding figure for a Tier III data center is $23,000. Their estimate for the corresponding cost per square foot of rack space is $300. The cost of a data center rack is estimated to be $3,500. The cost per kWhr is estimated at $0.073, the cost of annual routine maintenance per Layer 2 switch at $360, and the cost of annual routine maintenance per Layer 3 switch at $720.

[3] Uptime Institute White Paper: Cost Model: $/kW and $/sq ft of Computer Floor
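As an illustrative use of the functions above, these cost parameters can be plugged in for the TOR tier of the Figure 4 design. The annual service contract cost per switch is not given in the text, so a placeholder figure is assumed here purely to make the example runnable:

```python
COST_PER_KW = 12_500           # $/kW of Tier II power/cooling capacity
COST_PER_SQ_FT = 300           # $/sq ft of rack space
COST_PER_RACK = 3_500          # $/rack
COST_PER_KWHR = 0.073          # $/kWhr
ADMIN_L2, ADMIN_L3 = 360, 720  # annual routine admin per switch

# TOR tier: 94 Summit X670V switches at $25,000, 1/44 of a rack and
# 257 W (0.257 kW) each; the service cost is an assumed placeholder
capex_tor = capex(94, 25_000, 1/44, 94 * 0.257,
                  COST_PER_KW, COST_PER_SQ_FT, COST_PER_RACK)
opex_tor = opex(94, 0.257, COST_PER_KWHR, ADMIN_L2, service_per_switch=1_000)
```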
Based on the TCO model presented in Figure 6 and the above cost parameters, a TCO comparison was made between two data center LAN designs. One design is the network shown in Figure 4, implemented with two 8-slot Extreme Black Diamond X-8 switches with 192 wire speed 40 GbE ports each and 94 Extreme Summit X670V TOR switches. The second design, shown in Figure 5, is based on a competing aggregation switch (Switch X) with 96 wire speed 40 GbE ports and on TOR switches of the same port density as the X670V. The comparison is based on a greenfield data center network with a requirement of 4,144 total server ports, active/active dual 10 GbE server connectivity, and an over-subscription ratio for server-to-server traffic of 3 or less.
Parameter                  | BD X-8   | Summit X670V | Switch X (Aggregation) | Switch X (TOR, L3)
Number of switches         | 2        | 94           | 4                      | 94
Cost/switch                | $802,900 | $25,000      | n/a                    | $44,000
44U racks/switch           | 14.5/44  | 1/44         | 25/44                  | 1/44
Power/chassis (or /switch) | 2.52 kW  | 257 W        | 2.82 kW                | 199 W
Power/port                 | n/a      | n/a          | n/a                    | n/a
Annual service             | n/a      | n/a          | n/a                    | n/a
Table 2: TCO Parameters for Comparison of Two Network Solutions for 4,144 Server Ports
[Table 3 rows: Switches, Racks, Tier II Power Capacity, Space, Annual Power, Routine Admin, Service, TCO; columns: Extreme, Switch X, Difference]
Table 3: Results of TCO Comparison of Two Tier Network with Aggregation Switches of Different Densities
In Table 3, the TCO reflects an OPEX NPV calculated based on a 5-year lifetime and an interest rate of 4%.
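For reference, the NPV of a constant annual OPEX stream over 5 years at 4% works out to roughly 4.45 times the annual cost:

```python
# Discount factor for a constant annual cost over 5 years at 4% interest
factor = sum(1 / 1.04 ** t for t in range(1, 6))
print(round(factor, 3))  # 4.452, so NPV(OPEX) ~= 4.452 x annual OPEX
```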
An estimate of the capital cost of a new or upgraded data center LAN needs to take into account more factors than simply the cost per port of the LAN switches being considered. A preferred approach is to perform a TCO analysis based on a preliminary network design, taking into account the total number of server access ports required, the limits on over-subscription for server-to-server traffic, the port density of the switches, and the total number of switches and switch ports required.
As has been shown, the port density of the switch can have a significant effect on the LAN topology, the level of over-subscription, the total number of switch ports, and the network power consumed in order to support a required number of server access ports. This underscores the fact that IT organizations that are updating their data center LANs should place greater emphasis on the port density of LAN switches as a differentiating factor among competing products.
Beyond switch cost, other significant CAPEX factors include the cost of the power/cooling capacity required to support the switches deployed, and the physical density of the switches, which affects the rack space and square footage consumed by network hardware.
The port density of a switch, together with the number of switches required, can also directly affect OPEX in terms of routine administrative costs, power consumption, and maintenance contract costs.