Data Center Solutions Guide
Abstract: The following document provides a Virtualized Data
Center Solution Guide with the architectural components that
tie the network, compute, storage and management together.
A SOLUTION WHITE PAPER
Table of Contents
1. Introduction
1.1 DATA CENTER CHALLENGES
1.2 EXTREME SOLUTION
1.3 ARCHITECTURAL COMPONENTS
2. Network Management and Service Orchestration
2.1 OVERVIEW
2.2 NETSIGHT
2.3 ONEFABRIC CONNECT APIS
2.4 ORCHESTRATION
2.5 DEVOPS
2.6 CLI SCRIPTING
3 Network Abstraction
3.1 ONECONTROLLER
3.2 NETWORK ACCESS CONTROL
3.3 ANALYTICS-AS-A-SERVICE
4 Network Infrastructure
4.1 HIGH AVAILABILITY
4.2 MULTIPATH
4.3 REDUNDANCY
4.4 LOGICAL SEPARATION
4.5 QUALITY OF SERVICE (QOS)
4.6 ELASTICITY
4.7 SECURITY
4.8 TOR AND EOR DESIGNS
4.9 DATA CENTER INTERCONNECT (DCI)
4.10 MANAGEMENT
5 Data Center Infrastructure Elements
5.1 SERVER VIRTUALIZATION
5.2 STORAGE
5.3 FIREWALLS
5.4 SERVICE CHAINING
Introduction
1.1 DATA CENTER CHALLENGES
Online delivery and consumption models for business and consumer services are
evolving, for both cloud services and traditional IT services. Demand for these services
and for application availability has changed the requirements for data centers. Early
motivations for data center change were massive cost reduction and redundancy;
the focus has now shifted to agility and the ability to meet requirements across the
different cloud models that deliver the new business and consumer services.
Today’s highly distributed wired and wireless networks are designed for increased
flexibility, scale and reliability. Additionally, enterprises of all sizes increasingly
turn to outsourced services of all types – SaaS, PaaS, IaaS, and more. Securely
delivering new end-to-end services and applications across these environments
often results in increased complexity, compromise, and costs. Customers face
challenges that require a data center to be:
Simpler:
1. Managing a virtualized environment is complicated. Both provisioning network
devices and gaining visibility into the traffic on those devices are difficult,
time-consuming, and require too many tools.
2. The data center needs to become more dynamic and automated via technologies
like SDN, but that is also perceived as being complex.
3. Traffic isn't optimized, either within the data center or between interconnected
data centers.
Faster:
1. Rolling out new applications, services and other network changes is inefficient,
ineffective and takes too long.
2.The data center can’t scale to accommodate the speed and performance
demanded by the explosive growth of new applications and devices.
3.The data center is expensive in terms of OPEX; operators are spending too
much time on basic maintenance tasks and not enough on leveraging this
valuable business asset.
Smarter:
1. The data center needs improved availability/redundancy/reliability in order to
eliminate expensive downtime. This also means improved security in order to
eliminate external disruptions.
2. Operators need better analytics into network usage so they can leverage this
business intelligence to assure Service Level Agreements (SLAs) and to improve
the business overall.
Data center LANs are constantly evolving. Business pressures are forcing IT
organizations to adopt new application delivery models. Edge computing models are
transitioning from applications at the edge to virtualized desktops in the data center.
The evolution of the data center from centralized servers to a private cloud is well
underway and will be augmented by hybrid and public cloud computing services.
With data center traffic becoming less client-server centric and more server-server
centric, new data center topologies are emerging. Yesterday’s heavily segmented data
center is becoming less physically segmented and more virtually segmented. Virtual
segmentation allows sharing the same physical infrastructure in the most efficient
manner, leading to both capital and operational expense (CAPEX/OPEX) savings.
Virtual segmentation accelerates the time to spin up new business applications.
1.2 EXTREME SOLUTION
With businesses demanding a broader variety of IT-driven services, overcoming these
constraints has become a priority for IT leadership. Leveraging Extreme Networks
OneFabric Connect and Software-Defined Architecture (SDA), organizations
overcome these challenges with a unified platform for security, virtualization,
manageability, mobility and convergence that enables more reliable provisioning and
delivery of new services and application on a more dynamic IT infrastructure.
With Extreme Networks OneFabric Connect and SDN architecture, the network tier
becomes as dynamic, automated and modifiable as the storage and compute tiers,
providing a simple, fast, and smart networking solution that delivers the benefits of:
• Simplified end-to-end automation that makes network deployment,
management and ongoing operations more cost effective
• Faster provisioning that supports any application while providing flexibility for
deploying the operator’s choice of best-of-breed applications, solutions and
vendors
• Intelligent orchestration compatible with existing systems to take advantage
of present network infrastructures and protect an organization’s existing
investments
Extreme Networks provides the foundation for open, standards-based and
comprehensive SDN platforms and integrated ecosystems. OneFabric Connect
provides an open, programmable and centrally managed foundation for
implementing SDN on any network, as our open, standards-based Software-Defined
Architecture provides a number of key innovations and capabilities, including
fully integrated management, access control, and application analytics for flexibly
deploying new SDN solutions. These solutions operate across heterogeneous network
infrastructures to enable seamless migrations to new applications and services
without compromise.
Figure 1: Extreme Networks Software Defined Architecture
1.3 ARCHITECTURAL COMPONENTS
The architectural components of the data center as described in this document are
key to any data center design. The Extreme Data Center solution provides secure,
scalable infrastructure with the ability to expand or shrink resources based on
business needs. The ability to quickly provision resources to meet specific business
needs enables application agility, which in turn brings advantages such as faster time
to market and reduced TCO. This document should be used as a general guideline and
can readily be adapted to the target customer's data center requirements, from K-12
school data centers
to large enterprise data centers to Hadoop clusters to Infrastructure-As-A-Service
(IaaS) platforms.
This document will walk through the layers and architectural components and
describe how Extreme Networks addresses each of those requirements. Derived from the
business objectives and the requirements of the applications hosted in the data
center (see the Business Applications in Figure 1), the common design goals include:
• Application availability
• Performance, scale and responsiveness
• Security against attacks/hacks
• Visibility, workload manageability, upgradability
• Resource utilization
2. Network Management and Service
Orchestration
2.1 OVERVIEW
IT organizations need simplified data center management: a single-pane-of-glass
management system that is intelligent, highly automated, and integrated with the
entire data center ecosystem. Simplicity in configuring and deploying the infrastructure
and environment, provisioning, and centralized management is critical for all types of
data centers. It reduces time to deployment and reduces operational expenditures.
Customers want the single pane of glass to interface with the network elements, and
they want different ways of doing it through open APIs without vendor lock-in. They
also need to be able to develop on top of the platform and integrate with 3rd party
solutions, on top of a data center that is brownfield (has existing infrastructure).
Organizations need to dynamically optimize network resources and investments
to fit changing business strategies, and create an open foundation for delivering
new services, increased efficiencies, reduced costs, and sustained advantage
without compromise. Extreme achieves a simplified and consistent user experience
through the integrated infrastructure management solutions described below, which
enable collaboration and automation and ultimately provide optimized data center IT
administration and operations.
A single-pane-of-glass management system adds value in three areas:
Business alignment
• Transform complex network data into business-centric, actionable information
• Centralize and simplify the definition, management, and enforcement of policies
such as guest access or personal devices
• Easily integrate with business applications through Software Defined Networking
for operational efficiency
Operational efficiency
• Reduce Datacenter IT administrative effort with the automation of routine tasks
and web-based dashboard
• Streamline management with the integration of wired and wireless networks
• Easily enforce policies network-wide for QoS, bandwidth, etc.
• Troubleshoot with the convenience of a smartphone or tablet
• Integrate with enterprise management platforms
• Integrate with other network services and service chaining
Security
• Protect corporate data with centralized monitoring, control, and real-time
response
• Enhance existing investments in network security
• Preserve LAN/WLAN network integrity with unified policies
2.2 NETSIGHT
A Network Management System (NMS) is essential to provide centralized visibility
and granular control of enterprise network resources end to end. Next generation
data centers place higher demands on the FCAPS capabilities that a solid NMS
provides, as the NMS now goes beyond traditional configuration and management
responsibilities. To meet SDN platform requirements and integrate communication
between the infrastructure elements and OneController, the NMS provides a seamless
migration path for existing ecosystem partners: it can collect topology, host, and
statistical information from OneController, provide visualization for provisioning
and monitoring, and manage security applications and services.
Today, Extreme’s centralized network management application NetSight provides
capabilities common to many NMS solutions and is distinctive for granularity that
reaches beyond ports and VLANs down to individual users, applications, and
protocols. No matter how many moves, adds, or changes occur in your environment,
NetSight keeps everything in view and under control through role-based access
controls. One click can equal a thousand actions when you manage your network with
Extreme Networks. NetSight can even manage hardware beyond Extreme Networks
switching, routing, and wireless hardware, enabling standards-based control of other
vendors’ network equipment.
NetSight OneView: This screen shows devices and MLAG specific information
2.3 ONEFABRIC CONNECT APIS
With the OneFabric Connect API, business applications are directly controlled
from OneFabric Control Center and Extreme Networks NetSight Advanced
management application.
Figure X – OneFabric Solution Architecture
The result is a complete solution that provides innovative features including:
Increased Agility and Flexibility
• Control managed and unmanaged BYOD devices within the same infrastructure,
with unified single-pane-of-glass visibility
• Easily deploy and manage new applications, devices, users and services
• Leverage pre-defined integrations with other IT systems to enable features like
user and location-based URL filtering
Lower Operational Costs
• Automate provisioning and control of network services by IT systems inside as
well as outside the network management domain
• Discover, track, and document all network-connected assets in real-time
• Automate onboarding and provisioning of network services for any device
Improved Visibility, Control, and Security
• Enforce policies based on context at the network layer for more comprehensive
control
• View application usage and threat detection information to quarantine users and
devices
• Gain insights into asset information for increased visibility, as well as search and
location capabilities for any user and device on the network
• Ensure mobile device compliance for more accurate policy enforcement
decisions at the network layer
With Extreme Networks OneFabric Connect, organizations can integrate a variety of
systems and applications, either by using predefined integrations that allow programmatic
control of VM, MDM, CMDB, analytics, web filtering and firewall systems, or by simply
and easily adding customer-defined integrations via the existing APIs.
Pre-defined integrations and Technology Solution Integration Partners include:
CONVERGENCE
• Microsoft Lync
• Polycom CMA
• Avaya Easy Management
DATACENTER AND CLOUD
• VMware vSphere (vCenter and/or ESX)
• VMware View
• Microsoft Hyper-V
• Microsoft SCVMM
• Citrix XenServer with XenCenter
• Citrix XenDesktop
MANAGEMENT AND IT OPERATIONS
• FNT Command
• Microsoft SCCM
• CA ITSM
MOBILITY
• AirWatch
• MobileIron
• JAMF Casper
• Fiberlink MaaS360
SECURITY
• Palo Alto
• iBoss Client
• IF-MAP
• Lightspeed Systems
• McAfee EPO
Extreme Networks Data Center Manager (DCM), part of OneFabric Control Center,
provides IT administrators with a transparent, cross-functional service provisioning
and orchestration tool that bridges the divide between the server, networking, and
storage teams and provides a single integrated view of virtual server and network
environments. By enabling the unification and automation of the physical and virtual
network provisioning, Data Center Manager enables networks to benefit from the
high availability required for mission critical application and data performance. DCM
delivers numerous benefits to IT teams, including the ability to:
• Automate physical and virtual switching environments to streamline data center
network provisioning
• Create consistent configurations throughout the network fabric for predictable
behaviors and simplified troubleshooting
• Increase coordination and improve workflow between network, server, and
storage teams within IT
• Gain granular visibility into traffic flows and real-time and historical data to
simplify incorporation of VMs into the network, improve visibility and control,
and enable simplified auditing of the network via policy-based management
• Unify management through an easily extensible architecture that supports a
variety of hypervisor technologies and vSwitches, including VMware, Citrix, and
Microsoft
Specific DCM Integrations
• VMware vSphere (vCenter and/or ESX)
• VMware View
• Microsoft Windows Server 2008 R2 with Hyper-V support
• Microsoft SCVMM 2008 R2
• Citrix XenServer with XenCenter
• Citrix XenDesktop
For more information on these and other pre-defined integrations and Technology
Solution Integration Partners, please go to:
http://www.extremenetworks.com/partners/tsp/
2.4 ORCHESTRATION
Customers use a myriad of data center orchestration solutions that need to
seamlessly integrate into the rest of the ecosystem, without vendor lock-in. They want
to rapidly automate service delivery and application provisioning, and to simplify
data center operations, managing the infrastructure elements together. Datacenter
customers want a best-of-breed multi-vendor environment, and vendors that
embrace integration with other vendors are the most appealing.
There are some proprietary orchestration solutions, like VMware's vCenter
Orchestrator, and some open-source cloud computing solutions, like OpenStack.
The solutions geared towards cloud data centers can be modular and address cloud
ecosystem requirements for provisioning network, compute, and storage, as
well as centralized directory, billing, templates, etc.
OpenStack has features presented in an abstract view across many physical devices,
and some of these features require dynamic reconfigurations of the devices involved.
In addition, OpenStack may use multiple network configurations at the same time. This
dynamic nature poses one of the greater challenges when OpenStack is connected to a
physical network. OpenStack provides an internal, virtual node-to-node network and
also can provide physical break-out points into the LAN, along with tenant separation
within the internal virtual network. These different networks typically overlap and
interweave with each other dynamically, but will have to be established across static,
physical network equipment.
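As an illustration of the kind of dynamic, tenant-scoped provisioning described above, the following sketch uses the openstacksdk Python library to create an isolated tenant network and subnet. The cloud profile name, network names, and CIDR are hypothetical placeholders, and the sketch is not specific to Extreme equipment.

# Illustrative sketch (not Extreme-specific): creating an isolated tenant
# network and subnet with the openstacksdk library. The "mycloud" profile,
# names, and CIDR are hypothetical placeholders.
import openstack

# Connect using credentials defined in clouds.yaml under the "mycloud" profile.
conn = openstack.connect(cloud="mycloud")

# Create a tenant network; Neutron keeps it isolated from other tenants.
network = conn.network.create_network(name="tenant-a-net")

# Attach an IPv4 subnet; the physical fabric must carry this traffic once a
# break-out point (router/gateway) into the LAN is provisioned.
subnet = conn.network.create_subnet(
    network_id=network.id,
    ip_version=4,
    cidr="10.10.0.0/24",
    name="tenant-a-subnet",
)

print(f"Created {network.name} with subnet {subnet.cidr}")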
Extreme Networks recognizes these challenges and offers solutions to enable
automation and dynamic configuration of network equipment through various means
of configuration. Extreme advocates open, standards-based solutions like OpenStack,
which can flexibly be supported through Extreme's open APIs in OneFabric Connect
or through our open, standards-based controller, OneController. OpenStack will
leverage OneController for overlay and underlay management. Thus, Extreme
ensures that any business applications developed on these open platforms will be
deployable with Extreme.
2.5 DEVOPS
To manage the large numbers of data center servers and VMs that typically run
identical applications and services, the DevOps community uses tools like
Puppet, Chef, Salt, and Ansible. These tools provide a programmatic way to perform
configuration tasks. Although traditionally in the compute admin domain, these
tools are useful for managing network infrastructure as well, so the same
administrative tools can manage compute, network, and even storage under one
umbrella. They can maintain switches and verify their configuration by having the
switches check in with a centralized server; to the tool, the switch looks like just
another device. Extreme Networks readily supports these DevOps tools, which are
based on open source code with vendor-specific interfaces.
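The sketch below illustrates, in plain Python, the declarative check-in model these tools apply to switches: the device reports its state and the tool computes the actions needed to converge on the desired state. The data and helper function are hypothetical and simplified for illustration.

# Minimal sketch of the declarative model DevOps tools apply to switches:
# a device "checks in", reports its state, and the tool computes what must
# change to reach the desired state. All data here is hypothetical.
desired_state = {
    "vlans": {10, 20, 30},          # VLANs that should exist on the switch
    "jumbo_frames": True,
}

def reconcile(reported_state: dict) -> list[str]:
    """Return a list of human-readable actions needed to converge the device."""
    actions = []
    reported_vlans = reported_state.get("vlans", set())
    for vlan in sorted(desired_state["vlans"] - reported_vlans):
        actions.append(f"create VLAN {vlan}")
    for vlan in sorted(reported_vlans - desired_state["vlans"]):
        actions.append(f"remove VLAN {vlan}")
    if reported_state.get("jumbo_frames") != desired_state["jumbo_frames"]:
        actions.append("enable jumbo frames")
    return actions

# A switch checking in with its current (drifted) configuration:
print(reconcile({"vlans": {10, 40}, "jumbo_frames": False}))
# ['create VLAN 20', 'create VLAN 30', 'remove VLAN 40', 'enable jumbo frames']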
2.6 CLI SCRIPTING
Extreme Networks platforms support CLI scripting that could be used to generate
automated sequences, embedded as part of an overall automated workflow in
the data center. To streamline deployment and administration of the data center
network, data center IT administrators can leverage ExtremeXOS automated switch
management capabilities. The CLI-based scripting, with TCL and python support,
allows users to significantly automate switch management through support of
variables and functions that users customize for handling special events. ExtremeXOS
has a flexible framework in which selected trigger events, tied directly to the Event
Monitoring System (EMS), activate dynamic profiles, for example when a user or device
connects to a switch port. These profiles contain script commands that make dynamic
changes to the switch configuration and can be used for general manageability of the
network or to enforce policies.
For example, scripts can be triggered based on movement of Virtual Machines or
MAC addresses that can then adapt a port’s configuration to match that of a VM.
Another example is where a script can be triggered when a storage array is detected.
The script can be used to configure the switch and network for storage traffic
including things like enabling jumbo frame support, assigning the storage traffic to a
certain traffic class and setting up bandwidth guarantees for that traffic class.
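As a rough illustration of the storage-array example, the following Python sketch shows the kind of logic such a triggered script might contain. The exsh module import and the CLI command strings are assumptions based on typical ExtremeXOS syntax and should be verified against the target ExtremeXOS release.

# Hedged sketch of the storage-array example as an ExtremeXOS Python script.
# The exsh module and the CLI command strings are assumed/approximate.
try:
    import exsh                        # on-switch ExtremeXOS Python module (assumed)
    def run(cmd):
        exsh.clicmd(cmd)               # execute a CLI command on the switch
except ImportError:
    def run(cmd):                      # off-switch fallback: just show the commands
        print("[dry-run]", cmd)

def provision_storage_port(port, storage_vlan="iscsi"):
    """Prepare a switch port for storage traffic when a storage array is detected."""
    run("enable jumbo-frame ports " + port)                                # jumbo frame support
    run("configure vlan " + storage_vlan + " add ports " + port + " tagged")
    # Assigning the storage traffic to a traffic class and guaranteeing bandwidth
    # would typically be done with a QoS profile; the exact commands are platform-
    # and release-dependent, so they are omitted here.

provision_storage_port("1:10")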
Figure: Data Center Automation and Customization
3 Network Abstraction
Data centers also have challenges with sub-optimal traffic flows and workloads
that fail to meet application requirements for low latency or resource isolation.
Elastic resourcing and the dynamic nature of traffic flows also make traffic patterns
unpredictable. To optimize the network infrastructure to ensure that the underlay
is performing optimally, minimizing delay, and providing flexible movement of
workloads, there needs to be intelligent route selection (i.e., traffic engineering) and
intelligent VM placement.
Abstraction is key to achieving the agility, manageability, and elasticity in the data
center. Abstraction of the network removes it as a bottleneck, enables VM-VM
reachability regardless of location, and provides the ability to rapidly react to business
application needs.
Networks can be abstracted into network overlays with a corresponding network
underlay. Overlay technologies like VXLAN enable VMs to communicate with one
another while maintaining isolation. Ultimately, they allow data centers to meet larger
scalability requirements for logical network domains and reduce time and cost to
deploying new services through network virtualization. Underlay technologies are the
traditional network pieces that comprise the network fabric. Underlay solutions must
be aware of the network elements, perform intelligent traffic engineering and service insertion,
provide tenant-based QoS, manage malicious behavior, etc.
Overlays cannot be agnostic of the underlays; there should be feedback mechanisms
between the overlay and underlay for optimized performance. Networks today are
underutilized, and this abstraction helps use those resources better. In addition, the
network abstraction, overlays, and underlays must be managed by a centralized
system that has a view of the entire domain. Analogous to the role of the hypervisor
on the server, which abstracts compute resources and carves them up for the host VMs,
the network needs the role of a controller that abstracts the network fabric resources
and carves them up for tenant VMs. This means the network needs a centralized
controller that will enable new services; Extreme Networks OneController is
described in the next section.
3.1 ONECONTROLLER
Data centers need a single platform to tie together network management, network
access control, network optimization, and advanced application analytics. The single
platform can also tie together a heterogeneous, brownfield data center that has
deployed multiple vendors, white box and black box, plus enable the developing
Network Function Virtualization (NFV) solutions. A single platform promotes
community-led innovation on top of that platform when it is standards-based and
comprehensive. When it can be deployed ready to integrate with existing and
multi-vendor hardware and software network environments, it preserves customer
investments and avoids vendor lock-in.
Extreme Networks’ OneController is based on a hardened OpenDaylight (ODL)
controller, preserving the integrity of the open API provided by ODL while extending
data center orchestration, automation and provisioning to the entire network
under a single pane of glass. OneController has multiple APIs for interfacing to the
infrastructure elements and can provide overlay and underlay functionality.
The architecture is highly available and redundant, and it scales both horizontally,
to support additional devices, and vertically, by being lightweight. For redundancy,
multiple OneControllers can be deployed in Active/Active or Active/Standby mode;
a network management system can provision and manage the multiple instances and
perform life-cycle management. The software
development kit (SDK) and developer community will enable customers to evolve
the network to keep pace with emerging security, wireless, and converged SDN
infrastructure. The result is a simpler development platform for the data center.
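As a hedged illustration of this northbound, RESTCONF-style API, the sketch below reads the operational topology from an OpenDaylight-based controller. The controller address, credentials, port, and URL path are assumptions typical of OpenDaylight releases of this generation; OneController specifics may differ, so treat this purely as an illustration of the API style.

# Hedged sketch: reading the operational topology over RESTCONF from an
# OpenDaylight-based controller. Address, port, path, and credentials are
# assumptions; adjust them for the actual deployment.
import requests

CONTROLLER = "https://onecontroller.example.com:8181"   # hypothetical address
URL = CONTROLLER + "/restconf/operational/network-topology:network-topology"

resp = requests.get(URL, auth=("admin", "admin"), verify=False, timeout=10)
resp.raise_for_status()

# The response follows the IETF network-topology model used by OpenDaylight.
for topology in resp.json()["network-topology"]["topology"]:
    nodes = topology.get("node", [])
    print(f"Topology {topology['topology-id']}: {len(nodes)} nodes")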
3.2 NETWORK ACCESS CONTROL
Data Center security needs to be ingrained in every device and also abstracted out
to manage security from a holistic network perspective. Datacenter IT administrators
need to ensure that only the right users have access to the right information from
the right place at the right time including time of day, location, authentication types,
device and OS type, and end system and user groups. They need to perform multi-user,
multi-method authentication, vulnerability assessment and assisted remediation,
and to choose whether or not to restrict access for guests/contractors to public
Internet services only—and how to handle authenticated internal users/devices
that do not pass the security posture assessment. Businesses need the flexibility to
balance user productivity and security.
Extreme’s solution is a centrally deployed and managed appliance called Network
Access Control (NAC). NAC has unique capabilities to take action on the data center.
It has visibility into end hosts and the network elements and can react dynamically
to changes. For example, when a vMotion moves a VM from one rack to another,
NetSight can authenticate the VM, provision the VLAN and policy associated with
that VM on the new top-of-rack switch, and remove them from the old top-of-rack
switch. This VM tracking capability is a result of integration between NetSight,
NAC, and the network elements; DCM on top of that allows for automation and
orchestration with the hypervisor elements.
NAC rules to authenticate VMs and provision VLANs and policies associated with the VM
3.3 ANALYTICS-AS-A-SERVICE
Applications hosted in the Data Center are critical to customers and can be
business-impacting. They vary from commercial off-the-shelf applications to
customized applications to unique homegrown applications, and they can be
extremely complex and demand a certain SLA level. As a result, data center IT
administrators need visibility into these applications and how they are being used
and to be able to analyze the application impact on the data center infrastructure
and vice versa, without deploying new or intrusive infrastructure elements to be
able to do this analysis.
From a business perspective, an in-depth view into the real-time and historical
network and applications also provides valuable information for up front budget
planning when implementing new applications for the business while also
ensuring security compliance for approved applications. This saves both time
and money for the business when the critical applications are running at the best
possible performance.
From a data center IT administrator perspective, visibility into network and
application performance allows IT to pinpoint and resolve performance issues in
the infrastructure, whether they are caused by the network, application, or server. It is
useful to get total application visibility of inter-VM traffic within the data center, even
between VMs on the same server. By eliminating unnecessary application delay, users
become more efficient and can focus on the important aspects of their jobs.
Analytics go beyond just application visibility. Analytics-as-a-Service is a more
encompassing concept: it provides pervasive, actionable visibility, coalesces data
from multiple sources, and then serves that data in a form consumable by other
applications such as security, performance, and multi-tenancy. Analytics-as-a-Service
is also an enabler for intelligent VM placement via topology awareness. Intelligent
VM placement goes beyond evaluating server workload and resource availability
(CPU, memory) and beyond placement of services to co-exist with users within the
same geographical data center. There needs to be placement of services within the
same rack or even same server as the clients to meet the application SLAs that are
becoming more and more stringent. Analytics are also becoming a more critical
component of Big Data environments.
Extreme’s Purview platform delivers a network powered application analytics and
optimization solution that captures and analyzes context-based application traffic
to deliver meaningful intelligence - about applications, users, locations and devices.
Purview uses deep packet inspection (DPI) technology with a rich set of application
fingerprinting techniques to detect internally hosted applications (SAP, SOA traffic,
Exchange, SQL, etc.), public cloud applications (Salesforce, Google, Email, YouTube,
P2P, file sharing, etc.), and social media applications (Facebook, Twitter, etc.) at Layer
7 of the OSI model, enabling guarantees for a quality user experience for business
critical applications.
Purview includes over 14,000 application fingerprints and new fingerprints are
continually added. Application fingerprints are XML files that are developed by
Extreme or they can be developed by users themselves to provide visibility to custom
applications that may be used by an organization. Application detection does not
stop with signature-based fingerprints though. To detect applications that try to
obscure themselves (like P2P and others) Purview also includes heuristics (behavioral
detection) based fingerprints to ensure the applications are detected appropriately.
Through its robust fingerprinting technology, Purview is able to identify an application
regardless of whether it runs on well-known ports or uses non-standard ports.
Purview is enabled by the CoreFlow2 ASIC, in Extreme S-Series and/or K-Series
switches, which identifies new flows and sends a few packets for every new flow to
the Purview engine. Application fingerprinting takes place in the Purview engine and
is then combined with non-sampled NetFlow data collected from the CoreFlow2
powered switch for the duration of the flow, which allows the Purview engine to
process traffic at unprecedented scale with no performance degradation to the
network itself. The Purview Engine determines the application, extracts application
context information such as URL, certificate information, browser version, device
hardware and OS. It measures response times, aggregates the data, adds additional
context derived from identity and access control (optional) like user, role, device type
and identity, location, and then sends it to NetSight for storage.
OneFabric Control Center, which is part of NetSight, provides complete application
management and reporting through dashboards and detailed reporting for Purview.
For example, when Purview is deployed with Extreme, information such as user,
role, device type, and location is integrated with the application flows. Further,
taking advantage of the OneFabric Connect API and SDN architecture allows simple
integration with other IT applications such as analytics or Big Data processing
engines via XML/SOAP, as well as real-time notifications using syslog.
Purview can also be integrated with technologies that provide VM-to-VM traffic for
VMs residing on the same hypervisor. For example, Purview integrates with Ixia’s
Phantom vTap to extend application visibility from physical to virtual networking
across the entire data center. Administrators can mirror traffic from the VMs, sending
traffic of interest to Purview, and then they have a complete view of the data center
for total visibility, security, and control.
4 Network Infrastructure
The network fabric provides interconnectivity between servers, storage, security
devices, and the rest of the IT infrastructure. This section describes the requirements
of that data center network. The hardware and software should perform well and
address these requirements regardless of whether the data center is a small deployment
or a high-scale deployment, considering typical data center modeling parameters like
the number of physical servers, virtual machines, VLANs, racks, tenants, etc.
4.1 HIGH AVAILABILITY
High Availability (HA) is crucial to data center networks. Data center failure costs
include both lost revenue and lost business credibility. System availability is simply
calculated as "system uptime" divided by "total time":
Availability = MTBF / (MTBF + MTTR), where MTBF is Mean Time Between Failures
and MTTR is Mean Time To Repair.
AVAILABILITY    DOWNTIME PER YEAR
99.000%         3 days, 15 hours, 36 minutes
99.500%         1 day, 19 hours, 48 minutes
99.900%         8 hours, 46 minutes
99.950%         4 hours, 23 minutes
99.990%         53 minutes
99.999%         5 minutes
99.9999%        30 seconds
The table above shows availability percentages and the corresponding downtime per year.
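The downtime figures above follow directly from the availability percentage; the short Python sketch below reproduces them.

# Quick check of the downtime figures in the table: annual downtime implied
# by an availability percentage (365.25 days per year).
def downtime_per_year(availability_pct: float) -> float:
    """Return expected downtime in minutes per year."""
    minutes_per_year = 365.25 * 24 * 60
    return (1 - availability_pct / 100.0) * minutes_per_year

for a in (99.0, 99.5, 99.9, 99.95, 99.99, 99.999, 99.9999):
    print(f"{a:>8}% -> {downtime_per_year(a):8.1f} minutes/year")
# 99.999% ("five nines") works out to roughly 5.3 minutes of downtime per year.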
Typically, network architects expect to see 4 or 5 “nines” system availability. Each
additional “9” can raise deployment costs significantly. To achieve a data center with
near zero down time, data center IT administrators need to consider both system/
application resiliency and network resiliency. For connectivity itself, there are two
aspects to consider:
System-level resiliency: increasing availability by using reliable, robust hardware
and software designed specifically for HA, and minimizing the MTTR through
resilient hardware.
One must also consider data center site redundancy through warm standby or hot
standby, and Disaster Recovery (DR) scenarios.
In warm standby, the primary data center will be active and provide services while a
secondary data center will be in standby. The advantage to warm standby is simplicity
of design, configuration and maintenance. However, the disadvantages are no load sharing
between the two sites, which leads to underutilization of resources; an unacceptable
delay in the event that a manual cutover is required; and difficulty verifying that the
"warm" failover to the secondary site is fully functional, since it is not exercised during
normal operation.
In hot standby, both the primary and secondary data centers provide services in a load
sharing manner, optimizing resource utilization. The disadvantage to this scenario is that
it is significantly more complex, requiring the active management of two active data
centers and implementation of bi-directional data mirroring, or synchronous replication,
which results in additional overhead and requires more bandwidth between the two sites.
4.2 MULTIPATH
Extreme Networks connectivity solutions provide the ability to compress the traditional
3-tier network into a physical 2-tier network by virtualizing the routing and switching
functions into a single tier (the middle tier). Virtualized routing provides for greater
resiliency and fewer switches dedicated to just connecting switches. Reducing the
number of uplinks (switch hops) in the data center improves application performance as
it reduces latency throughout the fabric. The aggregation and core are merged into a
single layer by virtualizing the router function in the data center LAN switch.
Two-tier Data Center Design
Switches are typically deployed in pairs with redundant links inter-connecting them
for resiliency. While this definitely satisfies the desired high availability, it does
introduce the possibility of loops within the environment. In an effort to avoid these
loops, traditional Layer 2 loop prevention protocols like Spanning Tree Protocol (STP)
were developed. However, STP has many limitations such as inefficient utilization of
links and high convergence times. Modern network fabric designs steer away from
STP. Depending on the size of the deployment and other requirements, customers
can consider several options as described below.
4.2.1 SPINE-LEAF WITH DEVICE-LEVEL AGGREGATION
One can address both the performance as well as the resiliency requirements of
small to medium virtualized data centers by extending the link-level redundancy
capabilities of link aggregation and adding support for device-level redundancy. This can
be accomplished by allowing one end of the link-aggregated port group to be
dual-homed into two different devices to provide device-level redundancy.
With device-level aggregation, the aggregated devices present themselves as a single
entity, and the remote device uses regular link aggregation. The devices leverage
a Layer 2 meshed network fabric for interconnectivity. The upstream switches
now work together to create the perception of a common link aggregated group
so that the downstream switch doesn't see anything different from a link aggregation
perspective, even though the link aggregated ports are now distributed across
multiple switches. This enables all links to be utilized; no links are blocked as they
would be in STP.
The design below shows device-level aggregation used at the data center LAN
access layer providing connectivity for both applications and IP storage – iSCSI or
NFS attached.
STP versus MLAG/VSB
STP:
• STP is configured per VLAN, so links are put into a blocking state based on the
STP algorithm
• Effective bandwidth for the switch is only 40G
MLAG:
• All links are active in an MLAG topology
• MLAG uses special blocking logic that prevents L2 loops
• Effective bandwidth is 160G
This two-tier leaf-spine architecture provides high performance and resiliency in an
easy deployment architecture, by extending the link-level redundancy capabilities
of link aggregation and adding support for device-level redundancy. With an
active-active model, it can load share for full utilization of network bandwidth. It also
has fast failover convergence performance.
The spine is composed of high performance, high port density switches. The spine
switches can be modular and support any combination of interface modules. They will
have connections to:
• Upstream WAN device or another data center’s spine switch
• Peer spine switch
• Every leaf switch for full-mesh
• Storage devices (if not at the leaf layer)
• Firewalls
The leaf is composed of highly resilient switches. It provides intra-rack connectivity,
functioning as top of rack switch. The leaf switches will have connections to:
• Upstream spine switches
• Peer leaf switch via the ISC
• Storage devices (if not at the spine layer)
The device-level redundancy on the BDX8 and Summit X670 is provided via the
feature Multi-Switch Link Aggregation (MLAG), and on the S-series and 7100 series is
provided via the feature Virtual Switch Bonding (VSB).
Extreme Networks MLAG feature allows devices to see a pair of physical switches as a
single logical switch. A device can connect via link aggregation to two MLAG switches
and to the connecting device, they look like a single switch. This functionality provides
redundancy at any layer, at the server access layer or at the aggregation layer. It
dynamically provisions trunked server connectivity using IEEE 802.1AX/802.3ad
link aggregation protocols. Dynamic trunk provisioning can lower OPEX overhead
in comparison to static server NIC teaming. In virtualized configurations, assigning
virtual hosts to an aggregated link provides better application performance and
reduces the need for hypervisor network configuration.
MLAG peers have a dedicated Inter Switch Connection (ISC) control VLAN that
is used exclusively for inter-MLAG peer control traffic and should not be
provisioned to carry any user data traffic. Data traffic, however, can traverse the ISC
port using other user-defined data VLANs.
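The following sketch generates a mirror-image MLAG configuration for one member of a switch pair, covering the ISC control VLAN, the MLAG peer definition, and a shared server-facing LAG. The command strings approximate ExtremeXOS MLAG syntax, and the names, ports, and addresses are placeholders; verify the exact commands against the ExtremeXOS release in use.

# Hedged sketch: generating per-switch commands for one member of an MLAG pair.
# Command strings approximate ExtremeXOS syntax; values are placeholders.
def mlag_peer_config(isc_port, isc_ip, peer_ip, server_port, mlag_id):
    """Return the commands for one member of an MLAG pair."""
    return [
        # Dedicated ISC control VLAN between the two MLAG peers.
        "create vlan isc",
        "configure vlan isc add ports " + isc_port + " tagged",
        "configure vlan isc ipaddress " + isc_ip + "/30",
        # Define the MLAG peer relationship over the ISC.
        'create mlag peer "peer1"',
        'configure mlag peer "peer1" ipaddress ' + peer_ip,
        # Present the server-facing LAG as one logical link across both peers.
        "enable sharing " + server_port + " grouping " + server_port + " lacp",
        'enable mlag port ' + server_port + ' peer "peer1" id ' + str(mlag_id),
    ]

# The second switch of the pair would use the mirror-image ISC addresses.
for cmd in mlag_peer_config("49", "10.1.1.1", "10.1.1.2", "1", 10):
    print(cmd)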
This diagram shows MLAG configuration between leaf layer down to server.
This diagram shows the MLAG configuration between the spine and leaf layers: on the
spine side, LAG 1:1 on each spine switch combines into MLAG ID 1; on the leaf side,
LAG 49 on LEAF01 and LEAF02 combines into MLAG ID 1, with QSFP+ and QSFP+
splitter cables providing the physical connectivity.
Extreme also supports an MLAG switch having one or two MLAG peers. The design
in this document focuses on any given switch having just one MLAG peer, but it is
possible for one switch to have two MLAG peers as in a linear daisy chain of ISCs.
Customers can split the downlink hosts or switches between the peers such that if
one of the switches has a failure, only a subset of hosts or switches would lose half
their bandwidth – the remainder connected to the other two MLAG peers would not
be impacted. All the basic MLAG functionality and traffic forwarding rules apply to
one or two MLAG peers.
4.2.1.2 VIRTUAL SWITCH BONDING
Extreme Networks Virtual Switch Bonding (VSB) is a feature similar to MLAG in that
other switches see the bonded switches as one, but VSB additionally enables the chassis
to be managed as a single entity. Instead of managing two devices each with N ports,
VSB allows administrators to manage one device with 2xN ports. All features are
seamlessly available across both VSB switches: it is a single-router, single-switch
architecture with a single configuration, and all features are distributed.
The VSB switches are connected via a high-speed link between chassis. Depending on
the model, the S-Series can use either multiple ordinary 10G ports or multiple dedicated
VSB ports to form this link, allowing two chassis to be fully virtualized into a single
entity. 7100-Series virtual switch bonding allows up to eight switches to form a
single entity.
4.2.2 SPB
For larger data centers with more servers, customers can deploy IEEE Shortest Path
Bridging (SPB), a standards-based protocol. SPB is plug-and-play, leveraging the IS-IS
link state protocol for building a global view of the switch topology and to control
the Layer 2 data plane. SPB builds shortest path trees for each node to every other
node within the domain. These unique shortest path trees ensure efficient usage of
available links within the mesh by always using the shortest path between any two
nodes in the domain. Where multiple equal cost paths exist, the protocol provides
Equal Cost Multipath (ECMP) algorithms to further distribute the load and efficiently
utilize equal path links through the network.
Fully meshed data center designs leveraging SPB provide load-sharing through the
efficient use of multiple paths through network. They improve the resiliency of the
networks because they:
• Have the ability to use all available physical connectivity
• Enable fast restoration of connectivity after failure
• Restrict failures so only directly affected traffic is impacted during restoration; all
surrounding traffic continues unaffected
• Enable rapid restoration of broadcast and multicast connectivity simultaneously
Shortest Path Bridging (SPB) IEEE 802.1aq was developed as an evolution of
the various Spanning Tree protocols. SPB’s IEEE 802.1 heritage ensures full
interoperability with existing RSTP/MSTP topologies; in fact, SPB leverages the
spanning tree state machine for controlling forwarding on a per-shortest-path-tree
basis. Shortest Path Bridging comes in two versions:
• SPBV, using 802.1Q VLAN translation data plane forwarding
• SPBM using 802.1ah MAC-in-MAC encapsulation for data plane forwarding
SPB can be used in conjunction with Fabric Routing and/or Routing-as-a-Service to
push routing to the edge, to optimize east/west and north/south traffic flows.
4.3 REDUNDANCY
4.3.1 LAG AND LACP
The link aggregation (LAG) feature allows customers to increase bandwidth and
availability by using a group of ports to share the traffic load between parallel links to
the same peer device. It provides redundancy through multiple connections, and they
should be load sharing (active-active). This applies for both connectivity between
switches and connectivity between switch and other devices such as hypervisors. As
the consolidation of servers increases, so does the need for resiliency.
ExtremeXOS software supports dynamic load sharing which includes the Link
Aggregation Control Protocol (LACP) and Health Check Link Aggregation. The Link
Aggregation Control Protocol is used to dynamically determine if link aggregation
is possible and then to automatically configure the aggregation. LACP is part of
the IEEE 802.3ad standard and allows the switch to dynamically reconfigure the
link aggregation groups (LAGs). The LAG is enabled only when LACP detects that
the remote device is also using LACP and is able to join the LAG. Health Check Link
Aggregation is used to create a link aggregation group that monitors a particular
TCP/IP address and TCP port. Static load sharing is also supported but is susceptible
to configuration error.
4.3.1.1 LOAD SHARING ALGORITHMS
Depending on the traffic characteristics within the fabric, the appropriate load-sharing
algorithm should be selected to ensure proper distribution of traffic across
the fabric. These load-sharing algorithms need to be explicitly configured
by the customer based on the nature of the traffic patterns that exist within the
infrastructure. Administrators can configure the egress-link selection algorithm to
factor in different traffic components such as Layer 2 source and destination MAC
addresses, Layer 3 source and destination IP addresses (IPv4 or IPv6),
Layer 4 TCP or UDP source and destination port numbers, MPLS labels, etc.
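Conceptually, these algorithms hash the selected traffic components so that each flow consistently maps to one member link while flows in aggregate spread across the group. The Python sketch below illustrates the idea with a full Layer 3/Layer 4 5-tuple; it is a conceptual illustration only, not the switch's actual hashing implementation.

# Conceptual sketch of hash-based egress-link selection over a 5-tuple.
import hashlib

def egress_link(src_ip, dst_ip, proto, src_port, dst_port, num_links):
    """Pick a member link index for a flow based on its 5-tuple."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links

# Packets of the same flow always map to the same link; different flows spread out.
print(egress_link("10.0.0.5", "10.0.1.7", 6, 49152, 443, num_links=4))
print(egress_link("10.0.0.6", "10.0.1.7", 6, 49153, 443, num_links=4))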
LACP should be used when configuring the LAG between the spine and leaf switches
and for the ISC LAG. Where supported by the virtualization platform, customers should
also configure LACP on the switch edge ports and enable it on the servers.
4.3.2 VRRP
4.3.2.1 OVERVIEW
Virtual Router Redundancy Protocol (VRRP) allows multiple switches to provide
redundant routing services to users. VRRP is used to eliminate the single point of
failure associated with manually configuring a default gateway address on each host
in a network. Without using VRRP, if the configured default gateway fails, you must
reconfigure each host on the network to use a different router as the default gateway.
VRRP provides a redundant path for the hosts. Using VRRP, if the default gateway
fails, the backup router assumes forwarding responsibilities.
When a VRRP router instance becomes active, the master router issues a gratuitous
ARP response that contains the VRRP router MAC address for each VRRP router
IP address. The VRRP MAC address for a VRRP router instance is an IEEE 802 MAC
address in the following hexadecimal format: 00-00-5E-00-01-<vrid>. The master
also always responds to ARP requests for VRRP router IP addresses with an ARP
response containing the VRRP MAC address. Hosts on the network use the VRRP
router MAC address when they send traffic to the default gateway.
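The virtual MAC described above is derived directly from the Virtual Router ID (VRID), as the small sketch below shows.

# Derive the IPv4 VRRP virtual MAC address (00-00-5E-00-01-<vrid>) from a VRID.
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC address for a given VRID (1-255)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return f"00-00-5E-00-01-{vrid:02X}"

print(vrrp_virtual_mac(10))   # 00-00-5E-00-01-0A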
4.3.2.2 ACTIVE/ACTIVE VRRP
VRRP in Active/Active mode allows both switches of an MLAG pair to simultaneously
act as the default gateway for the subnet.
4.3.2.3 FABRIC ROUTING
Central to all data center designs is the need for optimized traffic routing within the
data center as well as between data centers. Extreme Networks leverages
standards-based VRRP to provide a single virtual router gateway shared across multiple
physical devices for redundancy and Layer 3 resiliency. Normally, traffic that needs to be
Layer 3-routed is sent to the aggregation switches running VRRP, which results in
suboptimal east/west data center traffic flows.
Traffic flows without Fabric Routing
Extreme Networks Fabric Routing is an enhancement to VRRP that optimizes the
flow of east/west traffic within the data center by allowing the closest router to
forward the data regardless of VRRP mastership. In a Fabric Routing enabled domain,
the traffic is routed by the first switch/router directly to the destination regardless of
the VRRP state it is in. This creates a distributed, nearest hop routing within the fabric
that optimizes throughput, latency and traffic flows (minimizing traffic load through
the fabric). No new protocols are required for Fabric Routing; fabric routing-enabled
devices are fully compatible with standards-based devices.
Traffic flows with Fabric Routing
4.3.2.4 VRRP TRACKING
There are certain cases wherein a VRRP master may have to relinquish its VRRP
master status due to failure of an uplink connection. This uplink connection may be
the Data Center’s link to the Internet. In order to add additional intelligence to such
switches, VRRP tracking should be implemented. EXOS supports three VRRP tracking
modes in ANY or ALL logic:
• VLAN Tracking: track active VLANs, e.g. VLANs that go to the Core
• Route Table Tracking: track specified routes in the routing table, e.g. route to
Core next hop
• Ping Tracking: track connectivity using a simple ping to any outside responder,
e.g. IP address of Core next hop
If the tracking condition fails, VRRP behaves as though it is locally disabled and
relinquishes master status.
4.3.3 ROUTING-AS-A-SERVICE
As described above, Fabric Routing optimizes east/west data center traffic by
pushing routing functionality to the edge of the network so that inter-VLAN traffic
can be switched at the edge. In an SPB deployment, a new Extreme Networks feature
called Routing-as-a-Service achieves similar edge routing value to Fabric Routing while
eliminating Layer 3 routing protocols, including VRRP.
In an SPB deployment, eliminating VRRP may be desirable since VRRP has its own
drawbacks, independent of SPB. VRRP is a chatty protocol that sends advertisements
once per second, so it can be fairly resource intensive, and scaling becomes an issue
if there are many interfaces that require it. Routing-as-a-Service preserves the VRRP
properties of virtual IP addressing (anycast addressing) and router redundancy without
actually using VRRP, while utilizing the best path attributes of SPB. It interoperates
with any traditional switch and can be
positioned in various Layer 2 configurations including Virtual Switch Bonding (VSB)
and redundantly attaching to Rapid and Multiple Spanning Trees (RSTP/MSTP).
Traditional routing
Routing-as-a-Service requires only knowledge of the VLANs within the SPB domain
and traffic can take the optimal path to the destination using MAC resolutions
inherent in SPB. A SPB device that has the routing service enabled on one or more
interfaces will forward traffic to any destination IP addresses that match locally
connected subnets. It will respond to ARP requests using a virtual MAC derived from
the interface configuration. When it cannot resolve the IP destination, it redirects the
packet towards the SPB device capable of doing full routing.
Routing-as-a-Service
With SPB, it is possible to know the topology of participating devices as SPB uses
IS-IS to compute all paths to all devices. It is also possible to know the whereabouts
of all hosts, i.e. the precise access devices where the hosts attach to the network. If
every device has all VLANs configured, then all subnets in the domain are already
locally attached and traffic can be routed directly to their destination via SPB. Hosts
on different VLANs (and thus different IP subnets) within the SPB domain can
communicate with one another through single hop routing. Routing-as-a-Service also
avoids asymmetrical routing.
Routing-as-a-Service offers a value proposition of multiple best path connections
while retaining the essential qualities inherent in routing: segmentation and access
control. This means hosts within the SPB domain can communicate in the most direct
way regardless of VLAN association.
4.3.4 SERVER LOAD BALANCING
Using the unique capabilities of Extreme Networks S-Series switches, a load balancing
solution can be implemented without requiring any additional hardware. Load Sharing
Network Address Translation (LSNAT, as defined in RFC 2391) allows a Virtual IP
address and port number (VIP) to be mapped to many real servers, each with its own
IP address and port number. The Extreme Networks S-Series provides LSNAT
support on a per VRF basis allowing multiple tenants to each utilize the virtualization
and load balancing capabilities separately on the same device. When traffic destined
to the VIP is seen by the LSNAT device, the device translates it into a real IP address
and port combination using a selected algorithm such as Round Robin, Weighted
Round Robin, Least Load or Fastest Response. This allows the device to choose from
a group of real server addresses and replace the VIP with the selected IP address and
port number.
The LSNAT device then makes the appropriate changes to the packet headers and
checksums before passing the packet along. On the return path, the device sees the
source and destination pair with the real IP address and port number and knows that
it needs to replace this source address and source port number with the VIP,
performing the appropriate checksum recalculations before sending the packet along.
Persistence is a critical aspect of LSNAT, ensuring that all service requests from a
particular client are directed to the same real server. Sticky persistence provides less
security but increased flexibility, allowing users to load balance all services through
a virtual IP address. In addition, this functionality provides better resource utilization
and thus increased performance.
An essential benefit of using LSNAT is that it can be combined with routing policies.
By configuring different costs for OSPF links, a second, redundant server farm can be
made reachable via alternate metrics. In this way, load balancing is achieved in a much
more cost-effective manner.
4.4 LOGICAL SEPARATION
In a virtualized environment there is often a requirement to support multiple
tenants. To protect tenants from each other, logical separation is established at
the Layer 2 and Layer 3 levels. There are different levels of logical separation:
• VLAN
• VRF
• VR
In the given design, all the Layer 3 interfaces will be configured on the Spine switch
with the Leaf switches acting as pure Layer 2 transport. Thus no VR/VRF instances
will be configured on the TOR switches, but all the VLANs need to be configured on
both leaf and spine.
Note: VRF in this context is not to be confused with Layer 3 VPN VRFs.
4.4.1 VLAN
At a basic level, the term VLAN refers to a collection of devices that communicate
as if they were on the same physical LAN. LAN segments are not restricted by the
hardware that physically connects them, hence "virtual." The default VLAN is
untagged on all ports, and there can be a maximum of 4094 VLANs on any Extreme
platform.
In order to provision customers with VLANs, it is recommended that ranges be
allocated for:
• Internal Infrastructure VLANs
• Management VLANs
• Tenant VLANs
• Public VLANs
4.4.2 VRS
The ExtremeXOS software supports virtual routers. This capability allows a single
physical switch to be split into multiple virtual routers. This feature isolates traffic
forwarded by one VR from traffic forwarded on a different virtual router.
Each virtual router maintains a separate logical forwarding table, which allows
the virtual routers to have overlapping IP addressing. Because each virtual router
maintains its own separate routing information, packets arriving on one virtual router
are never switched to another. Ports on the switch can either be used exclusively by
one virtual router, or can be shared among two or more virtual routers. Each VLAN
can belong to only one virtual router.
There are system VRs and user-defined VRs. The system VRs are used for
management, and one default VR, VR-Default, is pre-allocated. User VRs can be
created for tenants, and these VRs support any of the switch routing protocols,
including BGP, OSPF, IS-IS, RIP, etc.
If customers deploy a multi-tenant environment where only the Gold service tier uses
dedicated VRs, the VR scale limit can dictate the number of Gold tenants that can be
supported. A tenant may require a dedicated VR when, for example, it needs higher
bandwidth or its routing needs extend beyond static or BGP routing.
4.4.3 VRF
Virtual Routing and Forwarding instances (VRFs) are similar to VRs in that they
maintain isolation. The routing tables for each VRF are separate from the tables for
other VRs and VRFs, so VRFs can support overlapping address space.
VRFs are created as children of user VRs or VR-Default, and each VRF supports Layer
3 routing and forwarding. VRFs can run only static routing and BGP.
VRFs tend to scale better than VRs because they require fewer resources, so VRFs are
preferable for tenant isolation.
4.5 QUALITY OF SERVICE (QOS)
4.5.1 OVERVIEW
Quality of Service (QoS) is based on the concept that the requirements of some
applications and users are more critical than others, which means that some traffic
should receive preferential treatment. By using QoS mechanisms, network
administrators can use existing resources efficiently and ensure the required level of
service without reactively expanding or over-provisioning their networks.
Using QoS in a data center allows you to:
• Give some traffic groups higher priority access to network resources
• Reserve bandwidth for special traffic groups
• Restrict some traffic groups to bandwidth or data rates defined in a Service
Level Agreement (SLA)
• Count frames and packets that exceed specified limits and optionally discard
them (rate limiting)
• Queue or buffer frames and packets that exceed specified limits and forward
them later (rate shaping)
• Modify QoS related fields in forwarded frames and packets (remarking)
To ensure end-to-end QoS adherence, customers should predetermine the QoS
requirements for the different tiers of service (not forgetting the management
traffic!), and each CoS should be mapped to a certain bandwidth reservation. All
switches should then be configured consistently with CoS-to-qosprofile mappings
across the environment. All qosprofiles should be configured with the correct
bandwidth reservations as requirements dictate. Additionally, servers should match
this configuration as well.
4.5.2 CLASSIFICATION
In the given IaaS platform, traffic can be classified into:
• Infrastructure traffic
• User traffic, for example Gold, Silver, Bronze
ACL-based traffic classification provides the most control over QoS features and can
be used to apply ingress and egress rate limiting. An ACL can be used to add traffic
to a traffic group based on the following frame or packet components:
• MAC source or destination address
• Ethertype
• IP source or destination address
• IP protocol
• TCP flag
• TCP, UDP, or other Layer 4 protocol
• TCP or UDP port information
• IP fragmentation
Depending on the platform you are using, traffic classified into an ACL traffic group
can have one of these actions:
• Assigned to an ingress meter for rate limiting
• Marked for an egress QoS profile for rate shaping
• Marked for an egress traffic queue for rate shaping
• Marked for DSCP replacement on egress
• Marked for 802.1p priority replacement on egress
• Assigned to an egress meter for rate limiting
Non-ACL-based traffic classification (CoS 802.1p or DiffServ) specifies an ingress or
egress QoS profile for rate limiting and rate shaping. These groups cannot use
ingress or egress software traffic queues. However, non-ACL-based traffic groups
can use the packet-marking feature to change the dot1p or DiffServ values in egress
frames or packets.
In addition, port-based traffic classification groups forward traffic to egress QoS
profiles based on the incoming port number. VLAN-based traffic classification
forwards traffic to egress QoS profiles based on the VLAN membership of the
ingress port.
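To make the classification options above concrete, the following Python sketch matches a packet's fields against ordered ACL-style rules and returns a traffic group and an action (meter, QoS profile, or remark value). The rule fields, group names, and actions are hypothetical and only illustrate the matching logic; they are not the switch's actual ACL syntax.

```python
import ipaddress

# Hypothetical ACL-style rules: ordered list, first match wins.
# Each entry is (match criteria, traffic group, action).
RULES = [
    ({"ip_dst": "10.10.0.0/16", "l4_dport": 3260}, "storage", {"qosprofile": "QP7"}),
    ({"ip_proto": "udp", "l4_dport": 5060},        "voice",   {"dot1p_remark": 5}),
    ({"ethertype": 0x0800},                        "bronze",  {"ingress_meter": "10Mbps"}),
]

def matches(packet, criteria):
    """Return True if every field in the rule's criteria matches the packet."""
    for field, expected in criteria.items():
        if field == "ip_dst":
            if ipaddress.ip_address(packet["ip_dst"]) not in ipaddress.ip_network(expected):
                return False
        elif packet.get(field) != expected:
            return False
    return True

def classify(packet):
    """Return (traffic_group, action) for the first matching rule, else a default."""
    for criteria, group, action in RULES:
        if matches(packet, criteria):
            return group, action
    return "default", {"qosprofile": "QP1"}

pkt = {"ethertype": 0x0800, "ip_proto": "tcp", "ip_dst": "10.10.4.20", "l4_dport": 3260}
print(classify(pkt))    # ('storage', {'qosprofile': 'QP7'})
```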
Extreme also provides an easy-to-configure flood control mechanism which
automatically classifies different types of flooded traffic (broadcast, multicast,
unknown destination MAC) and then rate limits it. Traffic may be flooded when a VM
or applications on the VM are misbehaving, and by enforcing a maximum bandwidth
on the flooded traffic, the switch can minimize the data center impact of ingress
flooding traffic. Ports can be configured to accept a specified rate of flooded packets
per second, and if that rate is exceeded, the port blocks traffic and drops subsequent
packets until the traffic again drops below the configured rate. This can prevent
degraded throughput performance and even network outages.
4.5.3 EGRESS QOS PROFILES
QoS profiles are queues that provide ingress or egress rate limiting and rate shaping.
Egress QoS profiles are supported on all ExtremeXOS switches and allow you to
provide dual-rate egress rate shaping for all traffic groups on all egress ports.
When you are configuring ACL-based traffic groups, you can use the qosprofile
action modifier to select an egress QoS profile. For DiffServ-, port-, and VLAN-based
traffic groups, the traffic group configuration selects the egress QoS profile. For CoS
dot1p traffic groups on all platforms, the dot1p value selects the egress QoS profile.
BlackDiamond X8 series switches and Summit family switches have two default
egress QoS profiles, named QP1 and QP8. Up to six additional QoS profiles (QP2
through QP7) can be configured on the switch. The default settings for egress QoS
profiles are summarized in the following table.
Table 1 - Default ExtremeXOS QoS Table

| Ingress 802.1p Priority Value | Egress QoS Profile Name | Queue Service Priority Value | Buffer | Weight | Notes |
| 0-6 | QP1 | 1 (Low) | 100% | 1 | This QoS profile is part of the default configuration and cannot be deleted. |
| - | QP2 | 2 (LowHi) | 100% | 1 | You must create this QoS profile before using it. |
| - | QP3 | 3 (Normal) | 100% | 1 | You must create this QoS profile before using it. |
| - | QP4 | 4 (NormalHi) | 100% | 1 | You must create this QoS profile before using it. |
| - | QP5 | 5 (Medium) | 100% | 1 | You must create this QoS profile before using it. |
| - | QP6 | 6 (MediumHi) | 100% | 1 | You must create this QoS profile before using it. |
| - | QP7 | 7 (High) | 100% | 1 | You must create this QoS profile before using it. You cannot create this QoS profile on SummitStack. |
| 7 | QP8 | 8 (HighHi) | 100% | 1 | This QoS profile is part of the default configuration and cannot be deleted. |
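As a small illustration of the defaults in Table 1, the Python snippet below derives the egress QoS profile for an ingress dot1p value. The default two-profile case (dot1p 0-6 to QP1, dot1p 7 to QP8) follows the table; the one-to-one mapping assumed for profiles that have been created (dot1p N to QP N+1) is an illustrative assumption rather than a statement of platform behavior.

```python
def egress_qosprofile(dot1p, created_profiles=frozenset({"QP1", "QP8"})):
    """Map an ingress dot1p value to an egress QoS profile (illustrative logic only)."""
    candidate = f"QP{dot1p + 1}"           # assumed one-to-one mapping once created
    if candidate in created_profiles:
        return candidate
    return "QP8" if dot1p == 7 else "QP1"  # default two-profile behavior from Table 1

print(egress_qosprofile(3))                                    # QP1 (QP4 not created)
print(egress_qosprofile(3, frozenset({"QP1", "QP4", "QP8"})))  # QP4
print(egress_qosprofile(7))                                    # QP8
```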
4.5.4 BUFFER MANAGEMENT
Regardless of how the network oversubscription is designed, one has to be aware of
the fact that storage technologies will create a completely different traffic pattern
on the network than a typical user or VDI (Virtual Desktop Infrastructure) session.
Storage traffic typically bursts to very high bandwidth in the presence of parallelization
(especially within storage clusters which serve a distributed database).
New standards like parallel Network File System (pNFS) increase that level of
parallelization towards the database servers. This parallelization will often lead to
the condition in which packets must be transmitted at the exact same time (which
is obviously not possible on a single interface); this is the definition of an “incast”
problem. The switch needs to be able to buffer these micro bursts so that none of the
packets in the transaction get lost, otherwise the whole database transaction will fail.
As interface speeds increase, large network packet buffers are required.
Extreme offers different types of switches depending on the data center
requirements. On one hand, applications with longer-lived TCP sessions, such as
storage, iSCSI, FCoE, backup applications, data replication, NFS, streaming, etc.,
require larger network packet buffers. For such cases, the Extreme Networks S-Series
is perfectly positioned, with a packet buffer that exceeds 2 Gigabytes per I/O module
to solve this problem.
On the other hand, applications with shorter-lived TCP sessions or transactions, such
as high frequency trading, database transactions, character-oriented applications,
and many web applications, do not have such large buffer requirements. For such
cases, the Extreme BDX8 and X670 leverage Smart Buffer technology, which provides
a dynamic and adaptive on-chip buffer allocation scheme that is superior to static
per-port allocation schemes and avoids latency incurred by off-chip buffers. Ports
have dedicated buffers and in addition can get extra buffer allocation from a shared
pool as needed, thereby demonstrating an effective management of and tolerance for
microbursts. In contrast, arbitrarily large off-chip buffers can exacerbate congestion
or can increase latency and jitter, which leads to less deterministic Big Data job
performance, especially if chaining jobs. While the Extreme hardware maximizes
burst absorption capability and addresses temporary congestion, it also maintains
fairness. Since Extreme’s Smart Buffer technology is adaptive in the shared buffering
allocations, uncongested ports do not get starved of access to the shared buffer pool
and they are not throttled by congestion on other ports, while still allowing congested
ports to get more of the buffers to address the traffic burst.
4.5.5 OVERSUBSCRIPTION
The acceptable oversubscription in a data center network is highly dependent on the
applications in use and is radically different than in a typical access network. Today's
design of presentation/web server, application server, and database server "layers,"
combined with the new dynamics introduced through virtualization, makes it hard to
predict traffic patterns and load between given systems in the data center network.
Servers that use a hypervisor to virtualize applications achieve higher utilization, and
the resulting average demand on the interfaces belonging to these systems will be
higher than on a typical server.
Also, if virtual desktops are deployed, one has to carefully engineer the
oversubscription and the quality of service architecture at the LAN access as well.
Typically 0.5 to 1 Mbit/s per client must be reserved – without considering future
streaming requirements.
Challenges with oversubscription include:
• Potential for congestion collapse
• Slow application performance
• Potential loss of control plane traffic
In general, oversubscription is simply calculated as the ratio of network interfaces
facing the downstream (server) side to the number of interfaces facing the upstream
side (uplinks) to the data center core.
In the case of MLAG and VSB, all the links between switches are active and allow
traffic to flow through. In the case of the X670, the oversubscription ratio at the leaf
switch is 3:1 (480G down and 160G up). In the case of a single 40G link failure
between a leaf switch and a spine switch, the oversubscription ratio at the edge switch
will change to 4:1. If traffic utilization between the leaf and spine switch is high, the
4:1 ratio could cause serious congestion and packet drops even in a single link failure
scenario. So if it is necessary to maintain the desired oversubscription rate in the
event of a single link failure, additional interfaces may be required in the design.
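The arithmetic behind these ratios can be sketched as follows; the port counts mirror the X670 example above (48 x 10G server-facing ports, 4 x 40G uplinks) and are passed in as plain parameters.

```python
def oversubscription(downlink_gbps, uplink_gbps):
    """Ratio of server-facing capacity to uplink capacity (3.0 means 3:1)."""
    return downlink_gbps / uplink_gbps

down = 48 * 10    # 480G of server-facing capacity on the leaf
up = 4 * 40       # 160G of uplink capacity towards the spine

print(f"Normal operation: {oversubscription(down, up):.0f}:1")          # 3:1
print(f"One 40G uplink lost: {oversubscription(down, up - 40):.0f}:1")  # 4:1
```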
4.5.6 DATA CENTER BRIDGING
Data Center Bridging (DCB) is used to enhance LANs to support I/O convergence
in the data center, so that Ethernet LAN traffic and Fibre Channel (FC) storage area
network (SAN) traffic can be transported on the same Ethernet-based network
infrastructure. Standard Ethernet does not support lossless transport, but DCB
extends Ethernet to enable it to provide the level of CoS necessary to transport FC
frames encapsulated in Ethernet over an Ethernet network.
Essentially, DCB enables the different treatment of traffic based on a set of priorities.
The benefits of a converged (bridged) data center network include:
• Simpler management with only one fabric to deploy, maintain, and upgrade.
• Fewer failure points where networks connect.
• Lower costs, because fewer cables, switches, and other pieces of equipment are
required, which also reduces power consumption.
Extreme's data center solutions use the following specifications from the IEEE 802.1
DCB Task Group:
• Priority-based Flow Control (PFC) - Provides a link-level, flow-control mechanism
that can be independently controlled for each priority to ensure zero-loss due to
converged-network congestion.
• Enhanced Transmission Selection (ETS) - Provides a common management
framework for bandwidth assignment to traffic classes (see the sketch after this list).
• Data Center Bridging Exchange Protocol (DCBX) - A discovery and capability
exchange protocol used to convey capabilities and configurations of the other
DCB features between neighbors to ensure consistent configuration across the
network.
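As referenced in the ETS bullet above, the following Python sketch illustrates ETS-style bandwidth sharing: each traffic class is guaranteed its configured percentage of the link, and capacity unused by idle classes is redistributed to classes that still have demand. The class names and percentages are hypothetical, and the sketch simplifies the actual scheduler behavior.

```python
def ets_allocation(link_gbps, class_pct, offered_gbps):
    """Guarantee each class its percentage, then share leftover capacity among busy classes."""
    guaranteed = {c: link_gbps * pct / 100 for c, pct in class_pct.items()}
    granted = {c: min(offered_gbps[c], guaranteed[c]) for c in class_pct}
    leftover = link_gbps - sum(granted.values())
    # Classes still offering more than their guarantee share the spare capacity by weight.
    needy = {c: offered_gbps[c] - granted[c] for c in class_pct if offered_gbps[c] > granted[c]}
    total_weight = sum(class_pct[c] for c in needy) or 1
    for c in needy:
        granted[c] += min(needy[c], leftover * class_pct[c] / total_weight)
    return granted

classes = {"lan": 60, "san": 30, "ipc": 10}      # hypothetical ETS percentages
offered = {"lan": 2.0, "san": 9.0, "ipc": 0.5}   # Gbps currently offered per class
print(ets_allocation(10, classes, offered))      # the idle LAN share is reused by SAN
```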
iSCSI can accomplish the same result via TCP. However, DCB has aspects that make
an iSCSI environment more reliable and customizable. It can improve performance
and make that performance more deliverable. In a traditional IP network, any
lost frames need to be retransmitted. Removing the potential for loss means
that no retransmissions need to occur, and fewer retransmissions mean a gain in
performance. While retransmissions are rare in well designed, traditional Ethernet
deployments, DCB comes close to removing them completely. The second capability
of DCB that’s important to iSCSI implementations is the allocation of bandwidth on
specific links to specific functions.
iSCSI over DCB ideally needs an end-to-end DCB-aware connection to take full
advantage of DCB's lossless nature and bandwidth allocation capabilities, but it can
also be added only where it is needed. If the traffic from the storage to the switch is
the issue, then a DCB switch and DCB-aware storage are all that is needed.
Alternatively, if the servers need DCB, then a DCB-aware adapter and switch are sufficient.
4.6 ELASTICITY
A data center model that is truly elastic focuses on agility and modularity with
simplified operations. Elasticity means that all the layers of the data center
respond rapidly to new resource demands, adding and removing resources based
on customer needs, with a focus on the whole network and end-to-end
provisioning, not just a single switch. Elasticity is more than just an automation
challenge; it is about how synchronously the data center reacts to end customer
business applications.
4.6.1 VM TRACKING
Data traffic from VM to VM traverses the network as tagged traffic to maintain
VLAN and tenant isolation. The VLANs need to be configured in the network
fabric and associated to the appropriate edge ports on the leaf switches and be
matched to the hypervisor configuration. Extreme Networks switches support
multi-user, multi-method authentication on every port, absolutely essential when
you have virtual machines as well as devices such as IP phones, computers,
printers, copiers, security cameras and badge readers connected to the data
center network. These multiple devices (or virtual machines) can connect to
the same port and each device can have an independent policy configuration
associated to it.
With a manual workflow, every time an administrator creates a new port-group
for VMs and allocates a certain VLAN tag on the hypervisor, the administrator
would have to manually configure the corresponding VLAN on the leaf node, tag
the downlink ports to the server and uplink ports to the spine, and create the
VLAN on the spine and tag appropriate ports. And then when the port group
is deleted or if the VM vMotions to another host, the administrator will need to
repeat these tasks manually.
Fortunately, Extreme Networks provides capabilities that turn this traditionally manual
workflow into a dynamic workflow. Extreme Networks Data Center
Manager (DCM) integrates with the hypervisor (e.g. VMWare) to learn VM MAC
addresses and then inputs this into the switch's dynamic VLAN feature set, which is
called "VM Tracking" on the BDX8 and X670, and "MAC Authentication" on
the S-Series and 7100.
These feature combinations allow automation and orchestration through
the hypervisor elements and dynamic VLAN and policy assignment on the
network elements. When a virtual machine is detected on a port, the ExtremeXOS
VM Tracking feature uses NAC, or optionally a local policy, to determine the
VR configuration and the VLAN configuration for the VM, and dynamically
configures the VLAN on the access ports along with the related policies. If a virtual
machine shuts down or is moved, its VLAN is pruned to preserve bandwidth.
This feature creates an elastic infrastructure in which the network responds
dynamically to changes in the virtual machine network.
This works with VMs configured to send tagged or untagged traffic. For untagged
traffic, it authenticates against the MAC address. For a case where the VM sends
tagged traffic, the VLAN tag of the received frame is also used to determine VLAN
classification for the VM’s traffic. If VLAN configuration exists for the VM and it
conflicts with the actual tag present in received traffic, the VM tracking feature
reports an EMS message and does not trigger VLAN creation or port addition.
However, if no configuration is present for the VM, the VM tracking feature assumes
that there are no restrictions for classifying traffic for the VM to the received VLAN.
The uplink ports can have either static VLAN configuration or they can also have the
VLANs configured dynamically as needed.
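The tagged/untagged decision logic described above can be sketched roughly as follows. The policy table and function names are hypothetical stand-ins for the NAC or local-policy lookup and for the switch's dynamic VLAN provisioning, which the feature performs internally.

```python
# Hypothetical stand-in for the NAC / local-policy lookup: VM MAC -> configured VLAN.
VM_VLAN_POLICY = {
    "00:50:56:aa:bb:01": 120,   # gold tenant
    "00:50:56:aa:bb:02": 130,   # silver tenant
}

def classify_vm(vm_mac, received_tag, port):
    """Decide the VLAN classification for a VM detected on an access port."""
    configured = VM_VLAN_POLICY.get(vm_mac)

    if received_tag is None:              # untagged traffic: rely on the MAC lookup
        vlan = configured
    elif configured is None:              # no configuration: accept the received tag
        vlan = received_tag
    elif configured != received_tag:      # conflict: report and do not provision
        print(f"EMS: VLAN conflict for {vm_mac}: policy {configured}, tag {received_tag}")
        return None
    else:
        vlan = configured

    if vlan is not None:
        print(f"dynamically adding port {port} to VLAN {vlan} for {vm_mac}")
    return vlan

classify_vm("00:50:56:aa:bb:01", None, "1:12")   # untagged, policy VLAN 120
classify_vm("00:50:56:aa:bb:01", 999, "1:12")    # conflict -> EMS message, no change
classify_vm("00:50:56:aa:bb:99", 200, "1:13")    # no policy -> classify to VLAN 200
```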
4.6.2 AUTO-CONFIGURATION
Extreme Networks provides a flexible and simple switch configuration solution,
which allows organizations to quickly build networks or replace faulty switches for
business continuity. The Extreme Networks Auto Configuration feature is aimed at
plug-and-play deployment. The ability to drop ship Extreme switches to the customer
premises helps reduce or eliminate the operational expenditure (OPEX) and costs
involved in staging and initial switch configuration. It also reduces the costs incurred
for configuration customization, with the ability to classify switches according to
function, hardware type, or location. Standards-based classification (using DHCP)
helps administrators create flexible and easy-to-manage configurations.
Extreme Networks Auto Configuration provides:
• Simple configuration that is easily enabled or disabled, and the ability to drop
ship Extreme switches to customer premises with the feature enabled in advance
by channel partners, system integrators, or Value-Added Resellers (VARs).
• Standards-based solution making the most effective use of protocols such
as DHCP and TFTP. DHCP is used for dynamic configuration of network
parameters; TFTP is used for the configuration download.
• Ability to download a configuration file in the standard configuration format
(.cfg), as well as script files (.xsf).
• Works with existing DHCP and TFTP infrastructure in the network, with minimal
customization.
• Ability to create a classification of switches based on the hardware/platform
type, which gives greater deployment flexibility.
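A rough sketch of the classification-and-download flow is shown below. The DHCP fields, the class-to-file mapping, and the file names are placeholders that illustrate the general plug-and-play sequence, not the exact options or naming scheme the feature uses.

```python
import subprocess

# Hypothetical mapping of DHCP-advertised switch classes to configuration files.
CONFIG_BY_CLASS = {
    "x670-leaf":  "leaf-baseline.xsf",
    "bdx8-spine": "spine-baseline.cfg",
}

def auto_configure(dhcp_lease):
    """Pick a config file from the switch's class and fetch it from the TFTP server."""
    switch_class = dhcp_lease.get("vendor_class", "default")
    config_file = CONFIG_BY_CLASS.get(switch_class, "default.cfg")
    try:
        # Stand-in for the TFTP download the switch itself would perform.
        subprocess.run(["tftp", dhcp_lease["tftp_server"], "-c", "get", config_file],
                       check=False)
    except FileNotFoundError:
        print("tftp client not available in this environment; skipping download")
    return config_file

lease = {"ip": "10.20.0.15", "tftp_server": "10.20.0.2", "vendor_class": "x670-leaf"}
print(auto_configure(lease))    # leaf-baseline.xsf
```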
4.7 SECURITY
4.7.1 IDENTITY MANAGEMENT AND VDI
To maintain controlled access to the data center, data center IT administrators
need to learn more about users and devices as they connect to and disconnect
from switches, and take appropriate action based on their level of authorization.
Administrators need to collect captured information, query LDAP servers for
additional information, and then enable appropriate policies for traffic filtering
and metering, generate EMS messages, and whitelist or blacklist identities.
Extreme has a rich identity management (IDM) platform that works seamlessly with
NAC to manage an identity database and respond to all identity event triggers.
IDM works with a variety of software components such as LLDP, Kerberos, NetLogin,
FDB, and IP-Security.
Extreme's IDM platform also serves as the foundation for Virtual Desktop
Infrastructure (VDI). With VDI, a user's desktop is hosted in the data center as a
virtual machine. This means that traditional network access control, as well as Role
Based Access Control (RBAC) that was tied to the user's identity at the campus
network edge, now needs to move into the data center server-network edge where
the user's desktop is hosted. Extreme Networks' Identity Management solution, which
transparently detects a user's identity based on the user's Kerberos authentication
exchange, can be used in the data center at the server network edge for this purpose.
When the VDI connection broker assigns a VM to a user, the user’s Kerberos
authentication request passes through the network access switch in the data
center. The Extreme Networks Identity Management solution detects the user's
identity based on passive Kerberos snooping and provisions the network port
with the right privileges for the user's desktop, which is now a virtual machine.
Based on the user’s identity the appropriate role for the user can be configured
and enforced dynamically directly at the VM level on the network access port.
When a virtual desktop VM moves, the VM tracking capabilities can detect the VM
movement and inform the Identity Management solution so that the user’s role
can be enforced at the target server where the virtual desktop VM has moved.
4.7.2 TRAFFIC MANAGEMENT WITH THE OS
An operating system designed from the ground up for the data center, ExtremeXOS
builds into the network itself a traffic monitoring capability called CLEAR-Flow.
CLEAR-Flow represents a new paradigm for network traffic management. For the
first time, CLEAR-Flow brings together network monitoring, analysis, and response in
a single process inside the Ethernet switching fabric. This creates a powerful toolbox
for solving diverse network challenges that were previously difficult or impossible to
solve, such as threat detection in high-speed networks.
CLEAR-Flow is a broad framework for implementing security, monitoring, and anomaly
detection in ExtremeXOS software. CLEAR-Flow allows data center IT administrators to
specify certain types of traffic that deserve more attention. Once certain criteria for this
traffic are met, the switch can then either take an immediate, pre-determined action,
or send a copy of the traffic for off-switch analysis. This analysis can, in turn, result in
the appropriate response to the particular traffic, for example, blocking a DoS attack or
rate-limiting a user in violation of his service level agreement.
CLEAR-Flow Processing Architecture
4.7.3 PROTECT THE NETWORK ELEMENTS
Security in the Data Center has to happen at all levels. The network infrastructure layer
needs to ensure that only authorized users are accessing the data center, and it needs
to protect its network elements from deliberate attacks or unintentional vulnerabilities
that can cause data center outages or bring down critical business applications.
4.7.3.1 DOS PROTECTION
Intentional or unintentional traffic loads may overwhelm CPU processes on the
switches, causing the switch to be too busy to service other functions, so switch
performance will suffer. Even with very fast CPUs, there will always be ways to
overwhelm the CPU with packets that require costly processing. DoS Protection is
designed to help prevent degraded switch performance by attempting to
characterize the problem and filter out the offending traffic so that other functions
can continue. When a flood of CPU-bound packets reaches the switch, DoS Protection
will count these packets. When the packet count nears the alert threshold, packet
headers will be saved. If the threshold is reached, these headers are analyzed, and a
hardware access control list (ACL) is created to limit the flow of these packets to the
CPU. This ACL will remain in place to provide relief to the CPU until it expires and the
threat goes away.
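The threshold-driven behavior described above can be sketched as follows. The thresholds, the header capture, and the resulting ACL string are simplified, illustrative placeholders for what the switch actually does in hardware.

```python
from collections import Counter

NOTICE_THRESHOLD = 4000   # packets per interval at which headers start being saved
ALERT_THRESHOLD = 5000    # packets per interval at which a protective ACL is installed

def dos_protect(cpu_bound_packets):
    """Count one interval of CPU-bound packets and react as the thresholds are crossed."""
    saved_headers = []
    for count, pkt in enumerate(cpu_bound_packets, start=1):
        if count >= NOTICE_THRESHOLD:
            saved_headers.append((pkt["src"], pkt["dst"], pkt["proto"]))
        if count >= ALERT_THRESHOLD:
            # Analyze the saved headers and rate limit the dominant offender.
            offender, _ = Counter(src for src, _, _ in saved_headers).most_common(1)[0]
            return {"acl": f"limit-to-cpu src {offender}", "expires_in_s": 60}
    return None   # below threshold: nothing to do

flood = [{"src": "198.51.100.7", "dst": "10.0.0.1", "proto": "icmp"}] * 6000
print(dos_protect(flood))
```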
4.7.3.2 GRATUITOUS ARP PROTECTION
When a host sends an ARP request to resolve its own IP address, it is called a
gratuitous ARP. While there are some valid cases in which users would issue a
gratuitous ARP, data center IT administrators may not be able to prevent malicious
users from mounting a man-in-the-middle attack using gratuitous ARP. To protect
against this type of attack, Gratuitous ARP Protection can be enabled on the switch;
in response to receiving an unexpected gratuitous ARP, the switch will send out its
own gratuitous ARP request to override the attacker.
4.7.3.3 IP DUPLICATE ADDRESS DETECTION
IP address management in the data center can be challenging, depending on the
approach used to manage IP addresses. IP address conflicts, where the same IP
address is configured on more than one machine, will cause interruption to services,
so data center IT administrators need to detect and resolve any address conflicts as
quickly as possible. Extreme's Duplicate Address Detection (DAD) feature checks
networks attached to a switch to see if IP addresses configured on the switch are
already in use on an attached network.
4.7.3.4 DHCP SNOOPING AND ARP PROTECTION
Data center IT administrators may want to strictly manage and allocate client IP
addresses and prevent duplicate IP addresses from interrupting network operation.
By using DHCP snooping and DHCP secured ARP, the switch won’t build its ARP
table through the normal ARP learning process of tracking ARP requests and replies.
Instead, the switch will build its ARP table from manually configured ARP entries or
those secure ARP entries created by DHCP assignments or reassignments.
4.8 TOR AND EOR DESIGNS
Top of Rack (ToR) designs are often deployed in data centers today. Their modular
design makes staging and deployment of racks easy to incorporate with equipment
life-cycle management. Also, cabling is often perceived to be easier when compared
to an End of Row (EoR) design, especially when a large number of Gigabit Ethernet
attached servers are deployed.
But ToR also has some disadvantages, such as:
• ToR can introduce additional scalability concerns, specifically congestion over
uplinks and shallow packet buffers, which may prevent predictable Class of
Service (CoS) behavior. In an EoR scenario, additional capacity can typically be
added with new line cards in a modular chassis.
• Upgrades in technology (i.e. 1G to 10G, or 40G uplinks) often result in the
complete replacement of a typical 1 Rack Unit (RU) ToR switch.
• The number of servers in a rack varies over time, thus varying the number of
switch ports that must be provided. Unused CAPEX sitting in the server racks is
not efficient.
• The number of unused ports (aggregated) will be higher than in an End of Row
(EoR) scenario. This can also result in higher power consumption and greater
cooling requirements compared to an EoR scenario.
These caveats may result in an overall higher Total Cost of Ownership (TCO) for a
ToR deployment compared to an EoR deployment. Additionally, cabling, cooling, rack
space, power, and services costs must also be carefully evaluated when choosing an
architecture. Lastly, a ToR design results in a higher oversubscription ratio towards the
core and potentially a higher degree of congestion. A fabric-wide quality of service
(QoS) deployment (with the emerging adoption of DCB) cannot fully address this
concern today.
Top of Rack design
Another data center topology option is an End of Row chassis-based switch for server
connectivity. This design places chassis-based switches at the end or the middle of a
row to allow all the servers in a rack row to connect back to the switches.
Compared to a ToR design, the servers can be placed anywhere in the racks, so hot
areas due to high server concentration can be avoided. Also, the usage of the EoR
equipment is optimized compared to a ToR deployment, with rack space, power
consumption, cooling, and CAPEX decreased as well. The number of switches that
must be managed is reduced, with the added advantages of a highly available and
scalable design. Typically, chassis switches also provide more features and scale in
an EoR scenario compared to the smaller platforms typical of ToR designs. On the
other hand, cabling can be more complex as the density in the EoR rack increases.
4.9 DATA CENTER INTERCONNECT (DCI)
4.9.1 OVERVIEW
The evolving traffic patterns of clusters, servers and storage virtualization solutions
are demanding new redundancy schemes. These schemes provide the transport
technology used for inter-data center connectivity and cover the geographical
distances between data centers. They are critical as the network design evolves to
provide ever higher levels of stability, resiliency and performance.
DCI solutions must provide for:
• Cloud Bursting: create an elastic private cloud infrastructure that allows for
optimized application delivery based on current and varying business demands
• Disaster Recovery and Business Continuity: Effective and automated recovery
from a catastrophic failure with no manual intervention to ensure business
continuity
• Workload and Data Mobility: A private cloud infrastructure that is optimized
during runtime on resource utilization, application performance and product cost
requires a borderless, single compute, network and storage infrastructure pool
that can be dynamically allocated
The transport technology of choice between data centers is dependent upon
several requirements:
• Synchronous or asynchronous data replication
• Jitter and delay acceptance for virtualized applications and their storage
• Jitter and delay acceptance for cluster solutions
• Available bandwidth per traffic class
• Layer 2 or Layer 3 interconnect
4.9.2 LOAD BALANCING REQUIREMENT
An important issue when operating a load-balanced service across data centers
and within a data center is how to handle information that must be kept across the
multiple requests in a user’s session. If this information is stored locally on one back
end server, then subsequent requests going to different back end servers would
not be able to find it. This might be cached information that can be recomputed, in
which case load-balancing a request to a different back end server just introduces a
performance issue.
One solution to the session data issue is to send all requests in a user session
consistently to the same back end server. This is known as “persistence” or
“stickiness”. A downside to this technique is its lack of automatic failover: if a backend
server goes down, its per session information becomes inaccessible, and sessions
depending upon it are lost. So a seamless failover cannot be guaranteed. In most
cases dedicated hardware load balancers are required.
The discussion about load balancing and persistence has a great impact on
separation. The figure below shows a typical situation for cluster node separation
across two redundant data centers. In this example, node separation for different
cluster types, with shared-nothing and shared databases, is shown.
In many cases, the same subnet is used across both of the data centers, which is then
route summarized. The “cluster” subnet will be advertised as an external route using
“redistribute connected” and by filtering all subnets except the cluster subnet. While
redistributing, the primary data center will be preferred to the remote data center by
lower path cost until such time as the primary data center disappears completely.
Cluster Node Separation Across Two data centers
The clients placed within the public campus network access the data center services
across redundant switches. In active/standby redundant data center designs, the
same subnet is used across both of the data centers, where the primary data center
is specified with a lower path cost. The redundant switches are grouped together
in one VRRP group, and the same VRRP IP and MAC are used in both locations to
allow common gateway redundancy and allow seamless mobility. In this scenario, the
primary data center will be preferred to the remote data center by lower path cost;
the switch for the active data center will have higher VRRP priority.
However, this might cause problems in the event of failover, when the traffic must be
re-routed from the primary data center to the backup data center. This is especially
true when traffic traverses stateful firewalls, where one has to make sure that traffic
in both directions passes through the same firewall system. Techniques such as VRRP
interface or next-hop tracking can make sure that this is covered appropriately.
To provide database access across both data centers at any time, connectivity
between access switches and storage systems must be duplicated. Replication of
databases must be achieved through Layer 2 techniques, such as VPLS, GRE, SPB,
or 802.1Q and RSTP/MSTP along with 802.3ad Link Aggregation, or possibly
through switch clustering/bonding techniques. In all cases there will be a huge
demand for bandwidth and performance, which can be quite expensive for WAN links,
so the links must be properly sized. But the benefit will be improved data center
availability, and data center users will be able to load balance across both sites.
4.9.3 TYPES OF DCI
Typically there can be Layer 3 IP interconnects between data centers and many data
centers may need just that. There are some data center services that need a Layer 2
interconnect to stretch a subnet across multiple data centers, such as VM mobility,
some data storage replication, server clustering, or other high availability and disaster
recovery requirements.
The simplest method to connect multiple data centers together is to extend the
common VLAN(s) across the backbone, extending the Layer 2 domain. This solution is
a viable option for many smaller deployments, but consideration should be given to
how this extension will impact larger networks.
A more scalable method for Layer 2 data center interconnect is desirable. SPB can
extend the Layer 2 domain between data centers. An alternative is to extend a Layer
2 tunnel between the data center sites to allow Layer 2 traffic to be transported
transparently across the Layer 3 infrastructure, with the added benefit of not extending
the size of the spanning tree domain. Extreme switches leverage standard
IP/GRE tunneling or VPLS to interconnect the data centers. In these scenarios, the
multiple data center sites see each other as part of a common Layer 2 domain.
Devices or virtual machines can easily be moved between data centers in a hot or
cold manner. The networks can leverage Extreme Networks functionality including:
• Fabric Routing, which optimizes routing for east/west traffic by providing
distributed routing in SPB and VSB-based designs
• IP Host Mobility which optimizes routing for north/south traffic allowing IP
host mobility by distributing a specific host route into the respective routing
protocols to allow efficient symmetric traffic flow
After moving to the new location, the VM will be reachable via its new location as
a result of the VM host route advertisement by the local fabric router in the new
data center location. Fabric routing and host routing optimize the flow of traffic into
and between data centers by providing direct access to and from each data center
symmetrically. The traffic optimization limits the amount of traffic that traverses
the interconnect links to traffic that needs to go between data centers providing
the added benefit of conserving bandwidth on potentially expensive data center
interconnect links.
Considering redundant data center designs where the same subnet is used across
the primary and backup data centers, a standard routed Layer 3 interconnect may be
suitable. This environment is suitable in scenarios where traffic does not need direct
Layer 2 connectivity between the respective data centers, such as when physical
movement of server connectivity to a new data center is desired.
4.10 MANAGEMENT
4.10.1 DEVICE DISCOVERY
Data center IT administrators need to verify physical connectivity to ensure proper
physical wiring between devices. Through the Extreme Discovery Protocol (EDP),
which is enabled by default, they can validate that the ports on the local Extreme
switch are connected to the expected ports on the remote Extreme switch, looking
at device names, port numbers, neighbor IDs, and the number of VLANs those ports
have been added to.
The centralized data center management platform can also discover devices and
provide topology information, seamlessly integrated for the whole data center.
The NetSight Discovery feature can automatically discover new switches in the
data center.
The NetSight Topology Map then provides an easy way to visualize the data
center. It is an automatically generated visual representation of network connectivity.
Topology Maps provide network administrators with in-depth graphical views of
device groupings, device links, VLANs, and Spanning Tree status.
4.10.2 OUT OF BAND NETWORK
Out of Band (OOB) network management is an integral portion of the network
infrastructure. A best practice for the OOB management network is that it should rely
on a network infrastructure that is completely isolated and independent of the core
data network.
The advantages of separating the data and management plane are:
• Ability to reach the infrastructure in the event of loss of data infrastructure.
• Ability for administrators to troubleshoot and resuscitate the infrastructure in the
event of a complete network outage
• Dedicated path for management traffic, which, while not bandwidth intensive, is
critical in providing key services to the infrastructure.
As an alternative, it is possible to have a management network that is inline with the
existing infrastructure. In this case, precautions must be taken so that access to the
switches is still available even when the core network infrastructure is unavailable.
Using a separate console network for example can achieve this.
Apart from having a separate infrastructure to support management, the following
network services should be provisioned to support the IaaS platform:
• SNMP
• NTP
• Syslog
• Authentication
• Network Management
If switch management redundancy is of concern, then two management switches
should be deployed per rack and can be configured as MLAG peers with VRRP. If
there is only a single management switch, VRRP is not required in the management
network. The hypervisor service consoles can have one VLAN and the leaf/spine
switches can have another VLAN, to maintain separation. The IP address range used
for these management VLANs should be completely different from the data VLANs.
All default gateways should be directed towards the Layer 3 IP address on these
VLANs, and the switch will route between the VLANs on the management
infrastructure.
5 Data Center Infrastructure Elements
The integration of the infrastructure elements is key to building a synchronous
data center ecosystem. There are many infrastructure elements, hence many
companies provide purely system integration services because this can be so
challenging. Reference architectures that embody multiple elements to demonstrate
interoperability are useful, for example, the VSPEX integrated reference architecture
from EMC provides a modular reference architecture that is proven with best-in-class
technologies including Extreme Networks.
There are a wide variety of infrastructure elements that are needed in the ecosystem
and there are a plethora of vendors that are available. With so many players, it is
crucial to innovate and push technology boundaries synchronously. Extreme has
many technology solution partners that help enrich the ecosystem of third party
infrastructure elements. Please see http://www.extremenetworks.com/partners/tsp/
for more details on Extreme’s Technology Solution Partner program.
The infrastructure elements from third parties, what they are, and how they fit into the
whole data center ecosystem are also changing rapidly. Traditional centralized models
for compute, storage, and services are disaggregating, and physical resourcing is
evolving into virtualized resourcing. What embodies a data center "rack" is also being
re-architected, and with the Rack Scale Architecture (RSA) applications are no longer
constrained to resources in a single server. The application drives allocation of compute,
memory, and I/O from pools of resources that are aggregated in the rack itself.
Similarly at the application layer, current models of custom hardware appliances for
different services are being replaced in favor of Network Functions Virtualization
(NFV), which will change the way firewalls, intrusion detect devices, load balancers,
and other virtualized network functions (VNF) are deployed in the data center.
NFV can address the challenges of legacy data centers, and VNFs can be linked
through "service chaining," enabling new services to be applied more quickly via the
orchestration mechanisms.
5.1 SERVER VIRTUALIZATION
5.1.1 OVERVIEW
Virtualization has introduced the ability to create dynamic data centers, with the
added benefit of "green IT." Server virtualization can provide better reliability and
higher availability in the event of hardware failure. Server virtualization also allows
higher utilization of hardware resources while improving administration by having a
single management interface for all virtual servers.
The server virtualization layer transforms the physical resources of a server by
virtualizing the CPU, RAM, hard disk, and network controller. This transformation
creates fully functional virtual machines that run isolated and encapsulated operating
systems and applications just like physical computers.
Different vendors such as VMWare, Citrix, Microsoft, and others, provide centralized
management platforms, e.g. vSphere, Xen, Hyper-V, respectively, that provide
administrators with a single interface for all aspects of monitoring, managing, and
maintaining the virtual infrastructure, which can be accessed from multiple devices.
The high-availability features of these hypervisors, such as VMWare's vMotion,
Citrix's XenMotion, and Microsoft's Live Migration, enable seamless migration of
virtual machines and stored files from one server to another, or from one data storage area
to another, with minimal or no performance impact. Coupled with other hypervisor
features for intelligent resource scheduling, the virtual machines have access to the
appropriate resources at any point in time through load balancing of compute and
storage resources.
The servers should be located on an out-of-band management network that uses a
completely separate network path from the data path. This ensures that, if the data
path is lost, administrators can still access the infrastructure via the management
path, allowing them to bring the infrastructure back as quickly as possible.
5.1.2 SWITCHING FUNCTIONS IN HYPERVISORS
In a virtualized world where virtual machines are sharing common resources, there will
be traffic coming in one physical adapter in one physical server, but the traffic will be
destined to different VMs. The physical adapters need to be virtualized for the VMs.
The software-based approach is to have a switching component reside in the
hypervisor on the server. This virtual switch (i.e. vSwitch) is often proprietary to the
hypervisor, and there is also development underway on an open standards-based
vSwitch. For performance enhancement, some hardware manufacturers are
developing technologies that provide hardware acceleration for software-based
switches, for example Intel's DPDK technology for packet framework optimizations.
And then there is I/O virtualization, i.e. SR-IOV virtual functions, where the adapters
present multiple logical adapters to the hypervisor.
Deployed with the hypervisors, distributed switches function as a single switch
across all associated hosts. This enables data center administrators to set network
configurations that span all member hosts and allows virtual machines to maintain a
consistent network configuration as they migrate across multiple hosts, maintaining
their uplink properties.
5.2 STORAGE
Storage is a critical piece of a virtualized infrastructure. Data Centers are moving
towards converged infrastructures that will result in fewer adapters, cables, and nodes,
and ultimately in more efficient network operations. This is driving more requirements
around storage being supported by the Ethernet fabric. The BDX8 or S-series modular
switches deliver an easy and effective way to optimize communications through
automatic discovery, classification, and prioritization of SANs.
Storage requirements vary by server type. Application servers require much less
storage than database servers. There are several storage options – Direct Attached
Storage (DAS), Network Attached Storage (NAS), or Storage Area Network (SAN).
In the past, Fibre Channel (FC) offered better reliability and performance but
needed highly skilled SAN administrators. Dynamic data centers, leveraging server
virtualization with Fibre Channel attached storage, will require the introduction of
a new standard, Fibre Channel over Ethernet (FCoE). FCoE requires LAN switch
upgrades due to the nature of the underlying requirements, as well as the Data Center
Bridging Ethernet standards. FCoE is also non-routable, so it may cause issues
when it comes to the implementation of disaster recovery or large geographical
redundancy that Layer 2 connectivity cannot yet achieve.
On the other hand, iSCSI, which is a standard Ethernet block-based storage protocol
that allows SCSI commands to be encapsulated in TCP/IP, gives servers access to
storage devices over common Ethernet IP networks. It provides support for faster
speeds and improved reliability, making it more attractive. iSCSI offers increased
flexibility and a more cost-effective solution by leveraging existing network components
(NICs, switches, etc.). In addition, Fibre Channel switches typically cost 50% more than
Ethernet switches. Overall, iSCSI is easier to manage than Fibre Channel, considering
most IT personnel's familiarity with managing IP networks.
The storage solutions will also continue to evolve with Software Defined Storage
(SDS). Traditionally the storage management software resides on the storage
controller, and with SDS the software is decoupled from the hardware and moves off
to a server. This decoupling enables more efficient resource management and offers
more flexibility in hardware selection, perhaps even commodity storage hardware,
and easier data moves between multiple storage vendors. SDS also provides the
ability to integrate storage more smoothly into data center orchestration solutions
and analytics tools that need access to large pools of data.
5.3 FIREWALLS
Data Centers need protection against DoS attacks and other threats that target the
users in the data center, and firewalls mitigate threats by inspecting data traffic and
following user-created rules to take action on the traffic. This firewall is responsible
for protecting the entire data center against any malicious attacks from the Internet
or even within itself. It is also responsible for controlling the traffic flow that originates
from the tenants and heads towards the Internet. It is imperative to couple this type
of firewall with a larger security infrastructure to protect the data center against
intelligent attacks like DDoS.
The aggregation layer is the best-suited enforcement point for firewall security, or
additional services like VPN, Intrusion Prevention System (IPS). All servers can access
these services with short but predictable latency and bandwidth in an equal fashion.
High performance and intelligent Layer 4-7 application switches provide always-on,
highly scalable and secure business-critical applications, or can be part of that layer itself.
Network architects typically configure service modules and appliances to be in
transparent (pass-through) mode, since these modules need to be able to be
removed without requiring a reconfiguration of the entire system. When these
modules are put in-line (all traffic passes through them), module throughput must be
calculated so that the service modules will not introduce significant congestion into
the system. One must avoid adding additional points of oversubscription whenever
possible. For example, while traffic from clients to servers must pass through an IPS,
traffic between servers may not need to. In addition to raw bandwidth, the number
of concurrent sessions and the rate of connections per second that a security device
supports can introduce additional performance issues. The number of concurrent
sessions or connections per second can be calculated from the total number of
servers and end users. While there’s no general rule for this calculation, vendors will
typically supply a recommendation based upon the use model and configuration.
In a typical data center deployment, the firewall access layer consists of a cluster of
multiple firewalls. These firewalls have the ability to exchange traffic flow state between
them, thereby allowing for an active/active access topology. Such a topology allows for
a redundant access network and also provides the necessary resources, serving the
entire data center with the bandwidth required for Internet access.
The firewall physical infrastructure can be connected to the Data Center spine via
links on high-speed interfaces off the BDX8 or S-series. The VMs can then use the
spine switch as the default gateway and let the switch handle the routing. For traffic
destined within the Data Center, the spine switch routes the traffic back down to the
right leaf switches. For traffic destined outside the Data Center, the spine switch may
have a default route to transmit the traffic to the firewall for processing.
Virtual firewall services are also provided by some hypervisors that place firewall
filters on the virtual network adapters that provide stateful inspection of data
traffic and allow or prevent transmission based on user-defined rules. This happens
transparently to the network elements. As with the evolution towards NFV, a lot of the
firewalling is becoming more and more virtualized and distributed because it is less
expensive than doing it in hardware and enables service chaining.
5.4 SERVICE CHAINING
Service chaining is an integral part of an automated data center, as it manages how
the services provided in the data center (switching, routing, firewall, load balancing,
IDS/IPS, DLP, antivirus, antispam, QoS, etc.) are delivered to the tenants of the data
center. Traditional data centers relied heavily on manual configuration to set the
precedence of services and create the traffic path taking traffic from the server to the
different services. Services that traditionally were provided by dedicated hardware and
for the whole network are being virtualized and applied on a per-tenant basis in cloud
environments. Given OneController's visibility into the entire data center, the controller
can rapidly deploy these new functionalities, and provide service registration,
service insertion, and service chaining. These evolutions will challenge the way IT
administrators build their data centers but will ultimately improve their agility and
ability to offer new services.
The advent of automation and orchestration tools allowing dynamic workload
creation, like OpenStack, together with SDN technologies, which Extreme Networks
embraces with its OneController platform, makes it possible to orchestrate the
network path as another resource in the data center. The target of this concept is to
provide a generalized policy object containing all the services and relations assigned
to a server, customer, or traffic definition. This generalized policy can contain
traditional policy concepts like:
• VLAN Membership
• Traffic QoS
• Simple traffic filters
Or can relate to more elaborate concepts like:
• Firewall rules
• IDS/IDP rules
• Load balancing group membership in a LB configuration
• Traffic engineering/Resiliency configurations
• Traffic mirroring
Together with an ordering of these actions, the generalized policy allows the traffic
path through the network to be defined, along with how the traffic is handed from
one service to another to increase the value chain, in an automated manner and
without tedious individual configuration processes for each server.
Some of the concepts expressed here are present in the Group Policy concept in
OneController; others will be developed as enhancements to the Group Policy
framework in future releases of OneController. Some examples:
1. A firewall is added to the network and several firewalling rules are created. These
rules can be registered as extensions to the Policy framework so the policy can
be added to the end system like:
Endpoint_A MAPSTO Policy_A
With the following Policy definition:
Policy_A:
• Apply bandwidth shaping to traffic TCP port 29 to 10Mbps
• Apply QoS premium to TCP port 5006
• Set VLAN 3, egress tagged
• Apply Firewall policy Firewall_A
The firewall's registration in the framework must include how the traffic is handled by
the controller and how flows must be created to chain the service in such a way that
traffic from Endpoint_A is forwarded through the firewall and the firewall rule is
applied.
2. A VoIP recording device is added to the network, so a new service definition
can include "recorder processing" as part of its definition, indicating that all or
some traffic from that device must be mirrored to the recorder for recording.
3. A guest management system is added to the network so it can register with the
policy API and add the service "captive portal processing" to the list of services
that a user can receive.
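To make the generalized policy idea more tangible, the sketch below models the Policy_A example above as a simple data structure that an orchestration layer could hand to a controller. The field names are illustrative only and do not reflect OneController's actual Group Policy schema.

```python
# Illustrative representation of the Endpoint_A -> Policy_A mapping shown above.
policy_a = {
    "name": "Policy_A",
    "rules": [
        {"match": {"l4_proto": "tcp", "l4_dport": 29},   "action": {"shape_bw": "10Mbps"}},
        {"match": {"l4_proto": "tcp", "l4_dport": 5006}, "action": {"qos": "premium"}},
        {"match": {},                                    "action": {"vlan": 3, "egress": "tagged"}},
    ],
    "service_chain": ["Firewall_A"],   # ordered list of services the traffic must traverse
}

ENDPOINT_BINDINGS = {"Endpoint_A": "Policy_A"}
POLICIES = {"Policy_A": policy_a}

def resolve(endpoint):
    """Return the full policy object a controller would program for an endpoint."""
    return POLICIES[ENDPOINT_BINDINGS[endpoint]]

print(resolve("Endpoint_A")["service_chain"])   # ['Firewall_A']
```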
http://www.extremenetworks.com/contact
Phone +1-408-579-2800
©2014 Extreme Networks, Inc. All rights reserved. Extreme Networks and the Extreme Networks logo are trademarks or registered trademarks of Extreme Networks, Inc.
in the United States and/or other countries. All other names are the property of their respective owners. For additional information on Extreme Networks Trademarks
please see http://www.extremenetworks.com/company/legal/trademarks/. Specifications and product availability are subject to change without notice. 8916-1114
WWW.EXTREMENETWORKS.COM