Federation Enterprise Hybrid Cloud 3.5 - Concepts and Architecture Guide

ABSTRACT
This Solution Guide provides an introduction to the concepts and architectural
options available within the Federation Enterprise Hybrid Cloud solution. It
should be used as an aid to deciding on the most suitable configuration for the
initial deployment of a Federation Enterprise Hybrid Cloud solution.
February 2016
Copyright © 2016 EMC Corporation. All rights reserved. Published in the USA.
Published February 2016
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
The information in this publication is provided as is. EMC Corporation makes no
representations or warranties of any kind with respect to the information in this publication,
and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose. Use, copying, and distribution of any EMC software described in this publication
requires an applicable software license.
EMC2, EMC, Avamar, Data Domain, Data Protection Advisor, Enginuity, GeoSynchrony,
Hybrid Cloud, PowerPath/VE, RecoverPoint, SMI-S Provider, Solutions Enabler, VMAX,
Syncplicity, Unisphere, ViPR, EMC ViPR Storage Resource Management, Virtual Storage
Integrator, VNX, VPLEX, VPLEX Geo, VPLEX Metro, and the EMC logo are registered
trademarks or trademarks of EMC Corporation in the United States and other countries. All
other trademarks used herein are the property of their respective owners.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on
EMC.com.
Federation Enterprise Hybrid Cloud 3.5
Concepts and Architecture Guide
Solution Guide
Part Number H14719
Contents
Chapter 1
Executive Summary ............................................................. 5
Federation solutions ............................................................................................ 6
Document purpose .............................................................................................. 6
Audience ............................................................................................................ 6
Essential reading ................................................................................................ 6
Solution purpose ................................................................................................. 6
Business challenge .............................................................................................. 7
Technology solution............................................................................................. 7
Terminology ....................................................................................................... 8
We value your feedback! ...................................................................................... 9
Chapter 2
Cloud Management Platform Options .................................... 10
Overview ..........................................................................................................11
Cloud management platform components .............................................................11
Cloud management platform models ....................................................................14
Chapter 3
Network Topologies ............................................................ 21
Overview ..........................................................................................................22
Implications of virtual networking technology options .............................................22
Logical network topologies ..................................................................................24
Chapter 4
Single-Site/Single vCenter Topology..................................... 31
Overview ..........................................................................................................32
Single-site networking considerations ...................................................................32
Single-site storage considerations ........................................................................33
Recovery of cloud management platform ..............................................................36
Backup of single-site/single vCenter enterprise hybrid cloud ....................................36
Chapter 5
Dual-Site/Single vCenter Topology ....................................... 37
Overview ..........................................................................................................38
Standard dual-site/single vCenter topology ...........................................................38
Continuous availability dual-site/single vCenter topology ........................................39
Continuous availability network considerations ......................................................40
VPLEX Witness ...................................................................................................45
VPLEX topologies ...............................................................................................46
Continuous availability storage considerations .......................................................53
Recovery of cloud management platform ..............................................................56
Backup in dual-site/single vCenter enterprise hybrid cloud ......................................56
CA dual-site/single vCenter ecosystem .................................................................57
Chapter 6
Dual-Site/Dual vCenter Topology ......................................... 58
Overview ..........................................................................................................59
Standard dual-site/dual vCenter topology .............................................................59
Disaster recovery dual-site/dual vCenter topology ..................................................60
Disaster recovery network considerations .............................................................61
vCenter Site Recovery Manager considerations ......................................................69
vRealize Automation considerations ......................................................................72
Disaster recovery storage considerations ..............................................................73
Recovery of cloud management platform ..............................................................74
Best practices ....................................................................................................75
Backup in dual-site/dual vCenter topology ............................................................75
DR dual-site/dual vCenter ecosystem ...................................................................76
Chapter 7
Data Protection.................................................................. 77
Overview ..........................................................................................................78
Concepts...........................................................................................................79
Standard Avamar configuration............................................................................84
Redundant Avamar/single vCenter configuration ....................................................86
Redundant Avamar/dual vCenter configuration ......................................................90
Chapter 8
Solution Rules and Permitted Configurations ......................... 95
Overview ..........................................................................................................96
Architectural assumptions ...................................................................................96
VMware Platform Services Controller ....................................................................96
VMware vRealize tenants and business groups .......................................................98
EMC ViPR tenants and projects ............................................................................99
General storage considerations .......................................................................... 100
VMware vCenter endpoints ................................................................................ 100
Permitted topology configurations ...................................................................... 101
Permitted topology upgrade paths ...................................................................... 102
Bulk import of virtual machines ......................................................................... 103
DR dual-site/dual vCenter topology restrictions.................................................... 104
Resource sharing ............................................................................................. 106
Data protection considerations ........................................................................... 106
Software resources .......................................................................................... 106
Sizing guidance ............................................................................................... 106
Chapter 9
Conclusion ....................................................................... 107
Conclusion ...................................................................................................... 108
Chapter 10
References ....................................................................... 109
Federation documentation ................................................................................. 110
Chapter 1: Executive Summary
This chapter presents the following topics:
Federation solutions ............................................................................................ 6
Document purpose .............................................................................................. 6
Audience ............................................................................................................ 6
Essential reading ................................................................................................ 6
Solution purpose ................................................................................................. 6
Business challenge .............................................................................................. 7
Technology solution............................................................................................. 7
Terminology ....................................................................................................... 8
We value your feedback! ...................................................................................... 9
EMC II, Pivotal, RSA, VCE, Virtustream, and VMware form a unique Federation of
strategically aligned businesses that are free to execute individually or together. The
Federation businesses collaborate to research, develop, and validate superior, integrated
solutions and deliver a seamless experience to their collective customers. The Federation
provides customer solutions and choice for the software-defined enterprise and the
emerging third platform of mobile, cloud, big data, and social networking.
The Federation Enterprise Hybrid Cloud 3.5 solution is a completely virtualized data center,
fully automated by software. The solution starts with a foundation that delivers IT as a
service (ITaaS), with options for high availability, backup and recovery, and disaster
recovery (DR). It also provides a framework and foundation for add-on modules, such as
database as a service (DaaS), platform as a service (PaaS), and cloud brokering.
This Solution Guide provides an introduction to the concepts and architectural options
available within the Federation Enterprise Hybrid Cloud solution. It should be used as an aid
to deciding on the most suitable configuration for the initial deployment of a Federation
Enterprise Hybrid Cloud solution.
This Solution Guide is intended for executives, managers, architects, cloud administrators,
and technical administrators of IT environments who want to implement a hybrid cloud IaaS
platform. Readers should be familiar with the VMware® vRealize® Suite, storage
technologies, general IT functions and requirements, and how a hybrid cloud infrastructure
accommodates these technologies and requirements.
The Federation Enterprise Hybrid Cloud 3.5: Reference Architecture Guide describes the
reference architecture of a Federation Enterprise Hybrid Cloud solution. The guide introduces
the features and functionality of the solution, the solution architecture and key components,
and the validated hardware and software environments.
The following guides provide further information about various aspects of the Federation Enterprise Hybrid Cloud solution:
- Federation Enterprise Hybrid Cloud 3.5: Reference Architecture Guide
- Federation Enterprise Hybrid Cloud 3.5: Administration Guide
- Federation Enterprise Hybrid Cloud 3.5: Infrastructure and Operations Management Guide
- Federation Enterprise Hybrid Cloud 3.5: Security Management Guide
The Federation Enterprise Hybrid Cloud solution enables customers to build an enterprise-class, scalable, multitenant infrastructure that provides:
- Complete management of the infrastructure service lifecycle
- On-demand access to and control of network bandwidth, servers, storage, and security
- Provisioning, monitoring, protection, and management of the infrastructure services by line-of-business users, without IT administrator involvement
- Provisioning from application blueprints with associated infrastructure resources by line-of-business application owners, without IT administrator involvement
- Provisioning of backup, continuous availability (CA), and DR services as part of the cloud service provisioning process
- Maximum asset use
While many organizations have successfully introduced virtualization as a core technology
within their data center, the benefits of virtualization have largely been restricted to the IT
infrastructure owners. End users and business units within customer organizations have not
experienced many of the benefits of virtualization, such as increased agility, mobility, and
control.
Transforming from the traditional IT model to a cloud operating model involves overcoming the challenges of legacy infrastructure and processes, such as:
- Inefficiency and inflexibility
- Slow, reactive responses to customer requests
- Inadequate visibility into the cost of the requested infrastructure
- Limited choice of availability and protection services
The difficulty in overcoming these challenges has given rise to public cloud providers who
have built technology and business models catering to the requirements of end-user agility
and control. Many organizations are under pressure to provide similar service levels within
the secure and compliant confines of the on-premises data center. As a result, IT
departments need to create cost-effective alternatives to public cloud services, alternatives
that do not compromise enterprise features such as data protection, DR, and guaranteed
service levels.
This Federation Enterprise Hybrid Cloud solution integrates the best of EMC and VMware
products and services, and empowers IT organizations to accelerate implementation and
adoption of a hybrid cloud infrastructure, while still enabling customer choice for the
compute and networking infrastructure within the data center. The solution caters to
customers who want to preserve their investment and make better use of their existing
infrastructure and to those who want to build out new infrastructures dedicated to a hybrid
cloud.
This solution takes advantage of the strong integration between EMC technologies and the VMware vRealize Suite. The solution, developed by EMC and VMware product and services teams, includes scalable EMC storage arrays and integrated EMC and VMware monitoring and data protection suites to provide the foundation for enabling cloud services within the customer environment.
The Federation Enterprise Hybrid Cloud solution offers several key benefits to customers:
- Rapid implementation: The solution can be designed and implemented in as little as 28 days, in a validated, tested, and repeatable way. This accelerates time to value while simultaneously reducing risk.
- Supported solution: Implementing Federation Enterprise Hybrid Cloud through EMC also results in a solution that is supported by EMC, further reducing the risk associated with the ongoing operation of your hybrid cloud.
- Defined upgrade path: Customers implementing the Federation Enterprise Hybrid Cloud receive upgrade guidance based on the testing and validation completed by the Federation engineering teams. This upgrade guidance enables customers, partners, and EMC services teams to perform upgrades faster and with reduced risk.
- Validated and tested integration: Extensive testing and validation conducted by the solutions engineering teams results in simplified use, management, and operation.
The EMC Federation
EMC II, Pivotal, RSA, VCE, Virtustream, and VMware form a unique Federation of
strategically aligned businesses; each can operate individually or together. The Federation
provides customer solutions and choice for the software-defined enterprise and the
emerging “3rd platform” of mobile, cloud, big data and social, transformed by billions of
users and millions of apps.
Table 1 lists the terminology used in this guide.
Table 1. Terminology

ACL: Access control list
AIA: Authority Information Access
API: Application programming interface
Blueprint: A specification for a virtual, cloud, or physical machine that is published as a catalog item in the common service catalog
Business group: A managed object that associates users with a specific set of catalog services and infrastructure resources
CBT: Changed Block Tracking
CDP: CRL Distribution Point
CRL: Certificate Revocation List
CSR: Certificate Signing Request
DHCP: Dynamic Host Configuration Protocol
Fabric group: A collection of virtualization compute resources and cloud endpoints managed by one or more fabric administrators
FQDN: Fully qualified domain name
HSM: Hardware security module
IaaS: Infrastructure as a service
IIS: Internet Information Services
LAG: Link aggregation group; bundles multiple physical Ethernet links between two or more devices into a single logical link and, depending on the protocol used, can also aggregate available bandwidth
LDAP: Lightweight Directory Access Protocol
LDAPS: LDAP over SSL
MCCLI: Management Console Command Line Interface
PEM: Privacy Enhanced Mail
PKI: Public key infrastructure
PVLAN: Private virtual LAN
SSL: Secure Sockets Layer
TACACS: Terminal Access Controller Access Control System
vRealize Automation blueprint: A specification for a virtual, cloud, or physical machine that is published as a catalog item in the vRealize Automation service catalog
VDC: Virtual device context
vDS: Virtual distributed switch
VLAN: Virtual local area network
VMDK: Virtual machine disk
VRF: Virtual routing and forwarding
VSI: Virtual Storage Integrator
VXLAN: Virtual Extensible LAN
EMC and the authors of this document welcome your feedback on the solution and the
solution documentation. Please contact us at EMC.Solution.Feedback@emc.com with your
comments.
Authors: Ken Gould, Fiona O’Neill
Chapter 2: Cloud Management Platform Options
This chapter presents the following topics:
Overview ..........................................................................................................11
Cloud management platform components .............................................................11
Cloud management platform models ....................................................................14
The cloud management platform supports the entire management infrastructure for this solution. This management infrastructure is divided into three pods (functional areas), which consist of one or more VMware vSphere® ESXi™ clusters and/or vSphere resource pools, depending on the model deployed. Each pod performs a solution-specific function.
This chapter describes the components of the management platform and the models
available for use. After reading it, you should be able to decide on the model that suits your
environment.
Management terminology and hierarchy
To understand how the management platform is constructed, it is important to know how a
number of terms are used throughout this guide. Figure 1 shows the relationship between
platform, pod, and cluster and their relative scopes as used in the Federation Enterprise
Hybrid Cloud.
Figure 1. Cloud management terminology and hierarchy
The following distinctions exist in terms of the scope of each term:
- Platform (Cloud Management Platform): An umbrella term intended to represent the entire management environment.
- Pod (Management Pod): Each management pod is a subset of the overall management platform and represents a distinct area of functionality. What each area constitutes in terms of compute resources differs depending on the management models discussed in Cloud management platform models.
- Cluster (Technology Cluster): Used in the context of the individual technologies. While it may refer to vSphere clusters, it can also refer to VPLEX clusters, EMC RecoverPoint® clusters, and so on.
- Resource pools: Non-default resource pools are used only when two or more management pods are collapsed onto the same vSphere cluster and are used to control and guarantee resources to each affected pod.
Management platform components
Federation Enterprise Hybrid Cloud subdivides the management platform into distinct functional areas called management pods. These management pods are:
- Core Pod
- Network Edge Infrastructure (NEI) Pod
- Automation Pod
Figure 2 shows how the components of the management stack are distributed among the
management pods.
Figure 2. Cloud management platform component layout
Core Pod
The Core Pod provides the base set of resources to establish the Federation Enterprise Hybrid Cloud solution services. It consists of:
- External VMware vCenter Server™ (optional): This vCenter instance hosts only the Core Pod components and hardware. It is required when using the Distributed management model and may already exist, depending on customer resources.
- Cloud VMware vCenter Server: This vCenter instance is used to manage the NEI and Automation components and compute resources. If using a Collapsed management model or Hybrid management model, it also hosts the Core Pod components and hardware. vRealize Automation uses this vCenter Server as its endpoint, from which the appropriate vSphere clusters are reserved for use by vRealize Automation business groups.

Note: While Figure 2 depicts vCenter, Update Manager, and PSC as one cell of related components, they are deployed as separate virtual machines.

- Microsoft SQL Server: Hosts the SQL Server databases used by the Cloud vCenter Server and VMware Update Manager™. It also hosts the VMware vCenter Site Recovery Manager™ database in a DR dual-site/dual vCenter topology.

Note: Figure 2 includes separate SQL Server virtual machines for the External and Cloud vCenter SQL Server databases. This provides maximum resilience. Placing both vCenter databases on the same SQL Server virtual machine in the Core Pod is also supported. The vRealize IaaS SQL Server database must be on its own SQL Server instance in the Automation Pod.

- VMware NSX™: Used to deploy and manage the virtual networks for the management infrastructure and Workload Pods.
- EMC SMI-S Provider: Management infrastructure required by EMC ViPR.

Note: When using VCE platforms, the SMI-S Provider in the AMP cluster can be used, but it must be kept in line with the versions specified by the Federation Enterprise Hybrid Cloud Simple Support Matrix. This may require a VCE Impact Assessment (IA).
The hardware hosting the Core Pod is not enabled as a vRealize Automation compute
resource, but the virtual machines it hosts provide the critical services required to
instantiate the cloud.
All of the virtual machines in the Core Pod are deployed on non-ViPR-provisioned storage. The virtual machines can use existing SAN-connected storage or any highly available storage in the customer environment.
The Federation Enterprise Hybrid Cloud supports Fibre Channel (FC), iSCSI, and NFS storage
from EMC VNX® storage systems for the Core Pod storage. Though not mandatory, FC
connectivity is strongly recommended.
Note: For continuous availability topologies, this must be SAN-based block storage.
All storage should be RAID protected and all vSphere ESXi servers should be configured with
EMC PowerPath/VE for automatic path management and load balancing.
Network Edge Infrastructure (NEI) Pod
The NEI Pod is only required where VMware NSX is deployed, and is used to host NSX
controllers, north-south NSX Edge Services Gateway (ESG) devices, and NSX Distributed
Logical Router (DLR) control virtual machines. Use vSphere DRS rules to ensure that NSX
Controllers are separated from each other and also to ensure that primary ESGs are
separated from primary DLRs so that a host failure does not affect network availability. This
pod provides the convergence point for the physical and virtual networks.
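The DRS separation intent described above can be sanity-checked against an inventory export before or after rules are created. The sketch below is illustrative only: the host and virtual machine names are invented, and the naming conventions used to recognize controllers, ESGs, and DLR control VMs are assumptions rather than part of the solution.

# Illustrative check of NEI Pod placement against the anti-affinity intent above.
placement = {
    "nei-esxi-01": ["nsx-controller-1", "esg-tenant-a-primary"],
    "nei-esxi-02": ["nsx-controller-2", "dlr-tenant-a-primary"],
    "nei-esxi-03": ["nsx-controller-3", "esg-tenant-a-secondary"],
    "nei-esxi-04": ["dlr-tenant-a-secondary"],
}

def check_anti_affinity(placement):
    """Return a list of hosts that violate the intended separation rules."""
    violations = []
    for host, vms in placement.items():
        controllers = [vm for vm in vms if vm.startswith("nsx-controller")]
        if len(controllers) > 1:
            violations.append(f"{host}: multiple NSX Controllers {controllers}")
        has_primary_esg = any(vm.startswith("esg-") and vm.endswith("-primary") for vm in vms)
        has_primary_dlr = any(vm.startswith("dlr-") and vm.endswith("-primary") for vm in vms)
        if has_primary_esg and has_primary_dlr:
            violations.append(f"{host}: primary ESG and primary DLR share this host")
    return violations

print(check_anti_affinity(placement) or "NEI Pod placement satisfies the separation rules")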
Like the Core Pod, storage for this pod should be RAID protected and the Federation
recommends Fibre Channel connections. vSphere ESXi hosts should run EMC PowerPath®/VE
for automatic path management and load balancing.
Automation Pod
The Automation Pod hosts the remaining virtual machines used for automating and
managing the cloud infrastructure. The Automation Pod supports the services responsible for
functions such as the user portal, automated provisioning, monitoring, and metering.
The Automation Pod is managed by the Cloud vCenter Server instance; however, it is dedicated to automation and management services. Therefore, the resources from this pod are not exposed to vRealize Automation business groups.
The Automation Pod cannot share networks or storage resources with the workload clusters, and should be on a distinctly different Layer 3 network from both the Core and NEI management pods, even when using a collapsed management model. Storage provisioning for the Automation Pod follows the same guidelines as for the NEI Pod. Automation Pod networks may be VXLANs managed by NSX.
Note: While Figure 2 depicts vRealize IaaS as one cell of related components, the individual
vRealize Automation roles are deployed as separate virtual machines.
Workload Pods
Workload Pods are configured and assigned to fabric groups in VMware vRealize™
Automation. Available resources are used to host virtual machines deployed by business
groups in the Federation Enterprise Hybrid Cloud environment. All business groups can
share the available vSphere ESXi cluster resources.
EMC ViPR® service requests are initiated from the vRealize Automation catalog to provision
Workload Pod storage.
Note: Workload Pods were previously termed resource pods in Enterprise Hybrid Cloud 2.5.1
and earlier.
The Federation Enterprise Hybrid Cloud supports three management models, as shown in
Table 2.
Note: Minimum host count depends on the compute resources specification being sufficient to
support the relevant management virtual machine requirements. The Federation Enterprise
Hybrid Cloud sizing tool may recommend a higher number based on the server specification
chosen.
Table 2. Federation Enterprise Hybrid Cloud management models

| Management model | Pod | vCenter used | Cluster used | Minimum no. of hosts | Resource pool used |
|---|---|---|---|---|---|
| Distributed | Core | External | Core Cluster | 2 | N/A |
| Distributed | NEI (NSX only) | Cloud | NEI Cluster | 4 | N/A |
| Distributed | Automation | Cloud | Automation Cluster | 2 | N/A |
| Collapsed | Core | Cloud | Collapsed Cluster | 2 (w/o NSX), 4 (w/ NSX) | Core |
| Collapsed | NEI (NSX only) | Cloud | Collapsed Cluster | 2 (w/o NSX), 4 (w/ NSX) | NEI |
| Collapsed | Automation | Cloud | Collapsed Cluster | 2 (w/o NSX), 4 (w/ NSX) | Automation |
| Hybrid | Core | Cloud | Core Cluster (AMP*) | 3* | N/A |
| Hybrid | NEI (NSX only) | Cloud | NEI Cluster | 4 | N/A |
| Hybrid | Automation | Cloud | Automation Cluster | 2 | N/A |

* Based on the VCE AMP2-HAP configuration.
For ultimate resilience and ease of use during maintenance windows, sizing vSphere clusters based on N+2 may be appropriate, according to customer preference, where N is the number of hosts needed to satisfy the calculated CPU and RAM requirements of the hosted virtual machines plus host system overhead. The Federation Enterprise Hybrid Cloud sizing tool sizes vSphere clusters based on an N+1 algorithm.
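As a rough illustration of the N+1/N+2 arithmetic, the following sketch derives a host count from example capacity figures. All of the numbers are invented for illustration; this is not a substitute for the Federation Enterprise Hybrid Cloud sizing tool.

# Minimal cluster-sizing sketch: N hosts to cover demand, plus 1 or 2 for resilience.
import math

def hosts_required(vm_cpu_ghz, vm_ram_gb, host_cpu_ghz, host_ram_gb,
                   overhead_pct=0.10, redundancy=1):
    """N + redundancy hosts, where N covers the larger of the CPU or RAM demand."""
    usable_cpu = host_cpu_ghz * (1 - overhead_pct)   # per-host capacity after hypervisor overhead
    usable_ram = host_ram_gb * (1 - overhead_pct)
    n = max(math.ceil(vm_cpu_ghz / usable_cpu), math.ceil(vm_ram_gb / usable_ram))
    return n + redundancy

# Example: 120 GHz / 640 GB of management VM demand on dual 14-core, 2.6 GHz, 512 GB hosts
host_cpu_ghz = 2 * 14 * 2.6
print("N+1 cluster size:", hosts_required(120, 640, host_cpu_ghz, 512, redundancy=1))
print("N+2 cluster size:", hosts_required(120, 640, host_cpu_ghz, 512, redundancy=2))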
Table 3 indicates the Federation Enterprise Hybrid Cloud topologies supported by each
management model. The topologies themselves are described later in this document.
Table 3. Federation Enterprise Hybrid Cloud topologies supported by each management model

| Management model | Single-site | Standard dual-site/single vCenter | CA dual-site/single vCenter | Standard dual-site/dual vCenter | Disaster recovery dual-site/dual vCenter |
|---|---|---|---|---|---|
| Distributed | Supported | Supported | Supported | Supported | Supported |
| Collapsed | Supported | Supported | Supported | Supported | Supported |
| Hybrid | Supported | Supported | Supported | Supported | Supported |
The following sections describe each of these models and provide guidance on how to choose the appropriate model.
Distributed management model
The distributed management model uses two separate vCenter instances and each
management pod has its own distinct vSphere cluster. It requires a minimum of six hosts
when used without NSX, or eight hosts when used with NSX.
The first External vCenter Server instance manages all vSphere host and virtual machine
components for the Core Pod. While the virtual machine running this vCenter instance can
also be located within the Core Pod itself, it may be located on a separate system for further
levels of high availability.
The second Cloud vCenter Server instance located on the cloud management platform
manages the NEI, Automation, and Workload Pods supporting the various business groups
within the enterprise. This vCenter server acts as the vSphere end-point for vRealize
Automation.
Figure 3 shows the distributed management model configuration with two vCenters where
the first vCenter supports the Core Pod and the second vCenter supports the remaining
cloud management pods and tenant resources.
Figure 3. Distributed Federation Enterprise Hybrid Cloud management model – vSphere view
The distributed management model:
- Enables Core Pod functionality and resources to be provided by a pre-existing vSphere instance within your environment.
- Provides the highest level of resource separation (that is, host level) between the Core, Automation, and NEI Pods.
- Places the NEI Pod ESXi cluster as the single intersection point between the physical and virtual networks configured within the solution, which eliminates the need to have critical networking components compete for resources as the solution scales and the demands of other areas of the cloud management platform increase.
- Enhances the resilience of the solution because a separate vCenter server and SQL Server instance host the core cloud components.

Collapsed management model
The collapsed management model uses a single vCenter server to host all Core, Automation,
and NEI Pod components as well as the Workload Pods.
Each management pod is implemented as an individual vSphere resource pool on a single
(shared) vSphere cluster, which ensures that each pod receives the correct proportion of
compute and network resources. It requires a minimum of two physical hosts when used
without NSX, or four hosts when used with NSX.
Figure 4 shows an example of how the vSphere configuration might look with a collapsed
management model.
Figure 4. Collapsed Federation Enterprise Hybrid Cloud management model – vSphere view
The collapsed management model:
- Provides the smallest overall management footprint for any given cloud size to be deployed.
- Allows resource allocation between pods to be reconfigured with minimal effort.
- Allows the high-availability overhead to be reduced by using a single cluster, but does not alter the CPU, RAM, or storage required to manage the solution.
Resource pool considerations
Given that a single vSphere cluster is used in the collapsed management model, a vSphere resource pool is required for each management pod to ensure that sufficient resources are reserved for each function. Use the guidelines in Table 4 as the starting point for balancing these resources appropriately.
Table 4. Collapsed management model: resource pool configuration

| Resource | Core | NEI | Automation |
|---|---|---|---|
| CPU | 20% | 20% | 60% |
| RAM | 20% | 5% | 75% |
Note: These figures are initial guidelines and should be monitored in each environment and fine-tuned accordingly. The percentages can be implemented as shares at whatever scale is required, as long as the proportion of shares assigned to each resource pool corresponds to the ratio of percentages in Table 4.
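The following sketch illustrates how the Table 4 percentages can be expressed as resource pool share values; the 10,000-share total is an arbitrary choice for readability, not a required setting.

# Convert the Table 4 starting percentages into share values with the same ratio.
TABLE_4 = {
    "CPU": {"Core": 20, "NEI": 20, "Automation": 60},
    "RAM": {"Core": 20, "NEI": 5, "Automation": 75},
}

def shares_from_percentages(percentages, total_shares=10_000):
    """Scale a percentage split into share values that keep the same ratio."""
    assert sum(percentages.values()) == 100, "percentages should sum to 100"
    return {pod: total_shares * pct // 100 for pod, pct in percentages.items()}

for resource, split in TABLE_4.items():
    print(resource, shares_from_percentages(split))
# CPU {'Core': 2000, 'NEI': 2000, 'Automation': 6000}
# RAM {'Core': 2000, 'NEI': 500, 'Automation': 7500}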
Hybrid management model
The hybrid management model uses a single vCenter server to host all Core, Automation,
and NEI Pod components as well as the Workload Pods. Each management pod has its own
vSphere cluster. Therefore, it requires a minimum of six hosts when used without NSX, or
eight hosts when used with NSX.
Figure 5 shows an example of how the vSphere configuration might look with a hybrid
management model.
Figure 5. Hybrid Federation Enterprise Hybrid Cloud management model – vSphere view
The hybrid management model:
- Provides the highest level of resource separation (that is, host level) between the Core, Automation, and NEI Pods.
- Places the NEI Pod ESXi cluster as the single intersection point between the physical and virtual networks configured within the solution, which eliminates the need to have critical networking components compete for resources as the solution scales and the demands of other areas of the cloud management platform increase.
- Is compatible with VxBlock NSX factory deployments, and may use the VCE AMP vCenter as the Cloud vCenter.

Deciding on the management model
Use the following key criteria to decide which management model is most suited for your environment; a simple decision sketch follows the criteria below:
Reasons to select the distributed management model
Use these criteria to decide if this model is suitable for your environment. Reasons for selecting the distributed management model are:
- To use existing infrastructure to provide the resources that will host the Core Pod.
- To achieve the highest level of resource separation (that is, host level) between the Core, Automation, and NEI Pods.
- To minimize the intersection points for north/south traffic to just the hosts that provide compute resources to the NEI Pod.
- To maximize the resilience of the solution by using a separate vCenter server and SQL Server instance to host the Core Pod components.
Reasons to select the collapsed management model
Use these criteria to decide if this model is suitable for your environment. Reasons for selecting the collapsed management model are:
- To deploy the smallest management footprint for any given cloud size.
- To allow resource allocation between pods to be reconfigured with minimal effort.
Reasons to select the hybrid management model
Use these criteria to decide if this model is suitable for your environment. Reasons for selecting the hybrid management model are:
- To achieve the highest level of resource separation (that is, host level) between the Core, Automation, and NEI Pods.
- To minimize the intersection points for north/south traffic to just the hosts that provide compute resources to the NEI Pod.
- To have a management model that overlays easily with VCE VxBlock, and to use the VCE AMP vCenter as the Cloud vCenter.
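The sketch below is illustrative only: it simply encodes the headline reasons listed in this section as a lookup and is not part of the solution tooling; real deployments should weigh these criteria with the Federation services teams.

# Informal decision helper summarizing the criteria above (illustrative only).
def recommend_management_model(reuse_existing_core_vsphere: bool,
                               smallest_footprint: bool,
                               align_with_vxblock_amp: bool) -> str:
    """Map the headline selection criteria to a management model name."""
    if reuse_existing_core_vsphere:
        return "Distributed"  # Core Pod hosted on a pre-existing vSphere instance
    if align_with_vxblock_amp:
        return "Hybrid"       # overlays easily onto VCE VxBlock / AMP vCenter
    if smallest_footprint:
        return "Collapsed"    # smallest management footprint, resource pools per pod
    return "Distributed"      # default to the highest level of resource separation

print(recommend_management_model(False, True, False))  # -> Collapsed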
Network quality of service considerations

When using a management model that involves collapsed clusters, it may be necessary to configure network quality of service (QoS) to ensure that each function has a guaranteed minimum level of bandwidth available. Table 5 shows the suggested initial QoS settings. These may be fine-tuned as appropriate to the environment.

Note: These values are suggestions based on the logical network Layout 1 in Chapter 3. As this layout is only a sample, you should collapse or divide these allocations according to the network topology you want to implement.
Table 5. Suggested network QoS settings

| Name | VLAN | DVS shares | DVS % min | QoS COS |
|---|---|---|---|---|
| vmk_ESXi_MGMT | 100 | 500 | 5% | 2 |
| vmk_NFS | 200 | 750 | 7.5% | 4 |
| vmk_iSCSI | 300 | 750 | 7.5% | 4 |
| vmk_vMOTION | 400 | 1400 | 14% | 1 |
| DPG_Core | 500 | 500 | 5% | 2 |
| DPG_NEI | 600 | 500 | 5% | 2 |
| DPG_Automation | 700 | 500 | 5% | 2 |
| DPG_Tenant_Uplink | 800 | 2000 | 20% | 0 |
| VXLAN_Transport | 900 | * | * | * |
| Avamar_Target (optional) | 1000 | ** | ** | ** |
| DPG_AV_Proxies (optional) | 1100 | 600 | 6% | 0 |
| ESG_DLR_Transit | Virtual wire | 1250 | 12.5% | 0 |
| Workload | Virtual wire | 1250 | 12.5% | 0 |

* This is the VXLAN transport VLAN. The shares are associated with the virtual wire networks that use the transport VLAN.
** Physical network only. No shares required.
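The "DVS % min" column in Table 5 is simply each entry's proportion of the 10,000 shares allocated across this sample layout. The sketch below reproduces that calculation; the VXLAN_Transport and Avamar_Target rows are omitted because, per the table footnotes, they carry no shares of their own.

# Derive the minimum-bandwidth percentages in Table 5 from the share values.
DVS_SHARES = {
    "vmk_ESXi_MGMT": 500, "vmk_NFS": 750, "vmk_iSCSI": 750, "vmk_vMOTION": 1400,
    "DPG_Core": 500, "DPG_NEI": 500, "DPG_Automation": 500, "DPG_Tenant_Uplink": 2000,
    "DPG_AV_Proxies": 600, "ESG_DLR_Transit": 1250, "Workload": 1250,
}

total = sum(DVS_SHARES.values())  # 10,000 shares in this sample layout
for name, shares in DVS_SHARES.items():
    print(f"{name:<20} {shares:>5} shares -> {100 * shares / total:>5.1f}% minimum bandwidth")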
Component high availability
Using vSphere ESXi clusters with VMware vSphere High Availability (vSphere HA) provides
general virtual machine protection across the management platform. Additional levels of
availability can be provided by using nested clustering between the component virtual
machines themselves, such as Windows Failover Clustering, PostgreSQL clustering, load
balancer clustering, or farms of machines that work together natively in an N+1
architecture, to provide a resilient architecture.
Distributed vRealize Automation
The Federation Enterprise Hybrid Cloud requires the use of distributed vRealize Automation
installations. In this model, multiple instances of each vRealize Automation role are deployed behind a load balancer to ensure scalability and fault tolerance. All-in-one vRealize Automation installations are not supported for production use.
VMware NSX Load Balancing technology is fully supported, tested, and validated by
Federation Enterprise Hybrid Cloud. Other load balancer technologies supported by VMware
for use in vRealize Automation deployments are also permitted, but configuration assistance
for those technologies should be provided by VMware or the vendor. Use of a load balancer
not officially supported by Federation Enterprise Hybrid Cloud or VMware for use with
vRealize Automation requires a Federation Enterprise Hybrid Cloud request for product
qualification (RPQ).
Clustered vRealize Orchestrator
Both clustered and stand-alone vRealize Orchestrator installations are supported by
Federation Enterprise Hybrid Cloud.
Table 6 details the specific component high-availability options, as supported by each
management model.
Table 6. vRealize Automation and vRealize Orchestrator high availability options

| Management model | Distributed vRealize Automation | Minimal vRealize Automation (AIO) | Clustered vRealize Orchestrator (active/active) | Stand-alone vRealize Orchestrator |
|---|---|---|---|---|
| Distributed | Supported | Not supported | Supported | Supported |
| Collapsed | Supported | Not supported | Supported | Supported |
| Hybrid | Supported | Not supported | Supported | Supported |
Highly available VMware Platform Services Controller
Highly available configurations for VMware Platform Services Controller are not supported in
Federation Enterprise Hybrid Cloud 3.5.
Chapter 3: Network Topologies
This chapter presents the following topics:
Overview ..........................................................................................................22
Implications of virtual networking technology options .............................................22
Logical network topologies ..................................................................................24
This solution provides a network architecture that is resilient to failure and enables optimal throughput, multitenancy, and secure separation.
This section presents a number of generic logical network topologies. Further network considerations specific to each topology are presented in the relevant chapters.
Physical connectivity
In designing the physical architecture, the main considerations are high availability,
performance, and scalability. Each layer in the architecture should be fault tolerant with
physically redundant connectivity throughout. The loss of any one infrastructure component
or link should not result in loss of service to the tenant; if scaled appropriately, there is no
impact on service performance.
Physical network and FC connectivity to the compute layer may be provided over a
converged network to converged network adapters on each compute blade, or over any
network and FC adapters that are supported by the hardware platform and vSphere.
Supported virtual networking technologies
The Federation Enterprise Hybrid Cloud supports different virtual networking technologies, as follows:
- VMware NSX for vSphere
- VMware vSphere Distributed Switch
The dynamic network services with vRealize Automation showcased in this solution require NSX. vSphere Distributed Switch supports static networking configurations only, precluding the use of VXLANs.
The following section describes the implications, and the features available, when VMware NSX is used with the Federation Enterprise Hybrid Cloud solution, compared to non-NSX-based alternatives.
Solution attributes with and without VMware NSX
Table 7 compares the attributes, support, and responsibility for various aspects of the
Federation Enterprise Hybrid Cloud solution, under its various topologies when used with
and without VMware NSX.
Table 7. Comparing solution attributes with and without VMware NSX

Single-site

With NSX:
- Provides the fully tested and validated load balancer component for vRealize Automation and other Automation Pod components.
- vRealize Automation multi-machine blueprints may use networking components provisioned dynamically by NSX.
- Supports the full range of NSX functionality supported by VMware vRealize.

Without NSX:
- Requires a non-NSX load balancer for vRealize Automation and other Automation Pod components. Load balancers listed as supported by VMware are permitted, but the support burden falls to VMware or the relevant vendor.
- vRealize Automation blueprints must use pre-defined vSphere networks only (no dynamic provisioning of networking components is possible).
- Possesses fewer security features due to the absence of NSX.
- Reduces network routing efficiency due to the lack of the east-west kernel-level routing options provided by NSX.

Standard dual-site/single vCenter

With NSX: as for the single-site topology.

Without NSX: as for the single-site topology.

CA dual-site/single vCenter

With NSX: as for the single-site topology, plus:
- Enables automatic path failover when the 'preferred' site fails.
- Enables VXLAN over Layer 2 or Layer 3 DCI to support tenant workload network availability in both physical locations.

Without NSX: as for the single-site topology, plus:
- Requires Layer 2 VLANs present at both sites to back tenant virtual machine vSphere port groups.

Standard dual-site/dual vCenter

With NSX: as for the single-site topology.

Without NSX: as for the single-site topology.

DR dual-site/dual vCenter

With NSX:
- Provides the fully tested and validated load balancer component for vRealize Automation and other Automation Pod components.
- Does not support inter-site protection of dynamically provisioned VMware NSX networking artifacts.
- Supports consistent NSX security group membership by ensuring, via Federation workflows, that virtual machines are placed in corresponding predefined security groups across sites.
- Allows fully automated network re-convergence for tenant resource pod networks on the recovery site via Federation workflows, the redistribution capability of BGP/OSPF, and the use of NSX redistribution policies.
- Does not honor NSX security tags applied to a virtual machine on the protected site prior to failover.

Without NSX:
- Requires a non-NSX load balancer for vRealize Automation and other Automation Pod components. Load balancers listed as supported by VMware are permitted, but the support burden falls to VMware or the relevant vendor.
- vRealize Automation blueprints must use pre-defined vSphere networks only (no dynamic provisioning of networking components is possible).
- Possesses fewer security features due to the absence of NSX.
- Reduces network routing efficiency due to the lack of the east-west kernel-level routing options provided by NSX.
- Requires customer-supplied IP mobility technology.
- Requires a manual or customer-provided re-convergence process for tenant resource pods on the recovery site.
Each logical topology is designed to address the requirements of multitenancy and secure
separation of the tenant resources. It is also designed to align with security best practices
for segmenting networks according to the purpose or traffic type.
In the distributed management platform option, a minimum of one distributed vSwitch is
required for each of the External and Cloud vCenters, unless you run the Core Pod
components on standard vSwitches. In that case, a minimum of one distributed vSwitch is
required for the Cloud vCenter to support NSX networks. Multiple distributed vSwitches are
supported in both cases.
Note: While the minimum is one distributed vSwitch per vCenter, the Federation recommends
two distributed vSwitches in the Cloud vCenter. The first distributed switch should be used for
cloud management networks and the second distributed switch for tenant workload networks.
The sample layouts provided later in this chapter use this model and indicate which networks
are on each distributed switch by indicating vDS1 or vDS2. Additional distributed switches can
be created for additional tenants if required.
In the collapsed management platform option, there must be at least one distributed
vSwitch in the Cloud vCenter to support NSX. Multiple distributed vSwitches are supported.
Network layouts
The following network layouts are sample configurations intended to assist in understanding
the elements that need to be catered for in a Federation Enterprise Hybrid Cloud network
design. They do not represent a prescriptive list of the permitted configurations for logical
networks in Federation Enterprise Hybrid Cloud. The network layout should be designed
based on individual requirements.
Layout 1
Figure 6 shows one possible logical-to-physical network layout where standard vSphere
switches are used for the basic infrastructural networks.
This layout may be preferable where:
- Additional NIC cards are available in the hosts to be used.
- Increased protection against errors in configuration at a distributed vSwitch level is required. The layout achieves this by placing the NFS, iSCSI, and vSphere vMotion networks on standard vSwitches.
- Dynamic networking technology is required through the use of NSX or vCloud Networking and Security.

Note: All VLAN suggestions are samples only and should be determined by the network team in each particular environment.
Figure 6. Network layout 1
Descriptions of each network are provided in Table 8.
Table 8. Network layout 1 descriptions

vmk_ESXi_MGMT (VMkernel, standard vSwitch, vSphere ESXi hosts, VLAN 100): VMkernel on each vSphere ESXi host that hosts the management interface for the vSphere ESXi host itself. The DPG_Core network should be able to reach this network.

vmk_NFS (VMkernel, standard vSwitch, External vCenter and Cloud vCenter, VLAN 200): Optional VMkernel used to mount NFS datastores to the vSphere ESXi hosts. NFS file storage should be connected to the same VLAN/subnet or be routable from this subnet.

vmk_iSCSI (VMkernel, standard vSwitch, External vCenter and Cloud vCenter, VLAN 300): Optional VMkernel used to mount iSCSI datastores to the vSphere ESXi hosts. iSCSI network portals should be configured to use the same VLAN/subnet or be routable from this subnet.

vmk_vMOTION (VMkernel, standard vSwitch, External vCenter and Cloud vCenter, VLAN 400): VMkernel used for vSphere vMotion between vSphere ESXi hosts.

DPG_Core (vSphere distributed port group, Distributed vSwitch 1, External vCenter, VLAN 500): Port group to which the management interfaces of all the core management components connect.

DPG_NEI (vSphere distributed port group, Distributed vSwitch 1, Cloud vCenter, VLAN 600): Port group to which the NSX controllers on the NEI Pod connect. The DPG_Core network should be able to reach this network.

DPG_Automation (vSphere distributed port group, Distributed vSwitch 1, Cloud vCenter, VLAN 700): Port group to which the management interfaces of all the Automation Pod components connect.

DPG_Tenant_Uplink (vSphere distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 800): Port group used for all tenant traffic to egress from the cloud. Multiples may exist.

VXLAN_Transport (NSX distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 900): Port group used for VTEP endpoints between vSphere ESXi hosts to allow VXLAN traffic.

ESG_DLR_Transit (NSX logical switch, Distributed vSwitch 2, Cloud vCenter, virtual wire): VXLAN segments connecting Tenant Edges and Tenant DLRs. Multiples may exist.

Workload (NSX logical switch, Distributed vSwitch 2, Cloud vCenter, virtual wire): Workload VXLAN segments. Multiples may exist.

Avamar_Target (primary PVLAN, physical switches, VLAN 1000, optional): Promiscuous primary PVLAN to which the physical Avamar grids are connected. This PVLAN has an associated secondary isolated PVLAN (1100) in which the Avamar proxies are placed.

DPG_AV_Proxies (secondary PVLAN/vSphere distributed port group, physical switches and Distributed vSwitch 2 on the Cloud vCenter, VLAN 1100, optional): Isolated secondary PVLAN to which the Avamar proxy virtual machines are connected. This PVLAN enables the proxies to communicate with Avamar grids on the Avamar_Target network but prevents the proxies from communicating with each other.
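A sample VLAN plan such as layout 1 can be captured as data and checked for internal consistency before implementation. The following sketch is illustrative only; the VLAN IDs are the sample values from Table 8 and should be replaced with the values chosen by the network team.

# Capture the sample layout 1 VLAN plan and verify it is internally consistent.
LAYOUT_1_VLANS = {
    "vmk_ESXi_MGMT": 100, "vmk_NFS": 200, "vmk_iSCSI": 300, "vmk_vMOTION": 400,
    "DPG_Core": 500, "DPG_NEI": 600, "DPG_Automation": 700, "DPG_Tenant_Uplink": 800,
    "VXLAN_Transport": 900, "Avamar_Target": 1000, "DPG_AV_Proxies": 1100,
}

def validate(vlans):
    ids = list(vlans.values())
    assert len(ids) == len(set(ids)), "duplicate VLAN IDs in the plan"
    assert all(1 <= v <= 4094 for v in ids), "VLAN IDs must fall in the 1-4094 range"
    return "VLAN plan is internally consistent"

print(validate(LAYOUT_1_VLANS))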
Layout 2
Figure 7 shows a second possible logical to physical network layout where distributed
vSphere switches are used for all basic infrastructural networks other than the vSphere ESXi
management network.
This layout may be preferable where:
- Fewer NIC cards are available in the hosts to be used.
- Increased consolidation of networks is required. The layout achieves this by placing all but the ESXi management interfaces on distributed vSwitches.
- Dynamic networking technology is required through the use of NSX or vCloud Networking and Security.

Note: All VLAN suggestions are samples only and should be determined by the network team in each particular environment.
Figure 7. Network layout 2
Descriptions of each network are provided in Table 9.
Table 9. Network layout 2 descriptions

vmk_ESXi_MGMT (VMkernel, standard vSwitch, ESXi hosts, VLAN 100): VMkernel on each vSphere ESXi host that hosts the management interface for the vSphere ESXi host itself. The DPG_Core network should be able to reach this network.

vmk_NFS (VMkernel, Distributed vSwitch 1, External vCenter and Cloud vCenter, VLAN 200): Optional VMkernel used to mount NFS datastores to the vSphere ESXi hosts. NFS file storage should be connected to the same VLAN/subnet or be routable from this subnet.

vmk_iSCSI (VMkernel, Distributed vSwitch 1, External vCenter and Cloud vCenter, VLAN 300): Optional VMkernel used to mount iSCSI datastores to the vSphere ESXi hosts. iSCSI network portals should be configured to use the same VLAN/subnet or be routable from this subnet.

vmk_vMOTION (VMkernel, Distributed vSwitch 1, External vCenter and Cloud vCenter, VLAN 400): VMkernel used for vSphere vMotion between vSphere ESXi hosts.

DPG_Core (vSphere distributed port group, Distributed vSwitch 1, External vCenter, VLAN 500): Port group to which the management interfaces of all the core management components connect.

DPG_NEI (vSphere distributed port group, Distributed vSwitch 1, Cloud vCenter, VLAN 600): Port group to which the NSX controllers on the NEI Pod connect. The DPG_Core network should be able to reach this network.

DPG_Automation (vSphere distributed port group, Distributed vSwitch 1, Cloud vCenter, VLAN 700): Port group to which the management interfaces of all the Automation Pod components connect.

DPG_Tenant_Uplink (vSphere distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 800): Port group used for all tenant traffic to egress from the cloud. Multiples may exist.

VXLAN_Transport (NSX distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 900): Port group used for VTEP endpoints between vSphere ESXi hosts to allow VXLAN traffic.

ESG_DLR_Transit (NSX logical switch, Distributed vSwitch 2, Cloud vCenter, virtual wire): VXLAN segments connecting Tenant Edges and Tenant DLRs. Multiples may exist.

Workload (NSX logical switch, Distributed vSwitch 2, Cloud vCenter, virtual wire): Workload VXLAN segments. Multiples may exist.

Avamar_Target (primary PVLAN, physical switches, VLAN 1000): Promiscuous primary PVLAN to which the physical Avamar grids are connected. This PVLAN has an associated secondary isolated PVLAN (1100) in which the Avamar proxies are placed.

DPG_AV_Proxies (secondary PVLAN/vSphere distributed port group, physical switches and distributed vSwitch on the Cloud vCenter, VLAN 1100, optional): Isolated secondary PVLAN to which the Avamar proxy virtual machines are connected. This PVLAN enables the proxies to communicate with Avamar grids on the Avamar_Target network but prevents the proxies from communicating with each other.
Layout 3
Figure 8 shows a third possible logical-to-physical network layout where distributed vSphere
switches are used for all networks other than the management network.
This layout may be preferable where:
- There is no requirement for dynamic networking.
- Reduction of the management host count is paramount (as it removes the need for the NEI Pod).

Note: All VLAN suggestions are samples only and should be determined by the network team in each particular environment.
Figure 8. Network layout 3
Descriptions of each network are provided in Table 10.
Table 10. Network layout 3 descriptions

vmk_ESXi_MGMT (VMkernel, standard vSwitch, ESXi hosts, VLAN 100): VMkernel on each vSphere ESXi host that hosts the management interface for the ESXi host itself. The DPG_Core network should be able to reach this network.

vmk_NFS (VMkernel, standard vSwitch, External vCenter and Cloud vCenter, VLAN 200): Optional VMkernel used to mount NFS datastores to the vSphere ESXi hosts. NFS file storage should be connected to the same VLAN/subnet or be routable from this subnet.

vmk_iSCSI (VMkernel, standard vSwitch, External vCenter and Cloud vCenter, VLAN 300): Optional VMkernel used to mount iSCSI datastores to the vSphere ESXi hosts. iSCSI network portals should be configured to use the same VLAN/subnet or be routable from this subnet.

vmk_vMOTION (VMkernel, standard vSwitch, External vCenter and Cloud vCenter, VLAN 400): VMkernel used for vSphere vMotion between vSphere ESXi hosts.

DPG_Core (vSphere distributed port group, Distributed vSwitch 1, External vCenter, VLAN 500): Port group to which the management interfaces of all the core management components connect.

DPG_Automation (vSphere distributed port group, Distributed vSwitch 1, Cloud vCenter, VLAN 600): Port group to which the management interfaces of all the Automation Pod components connect.

DPG_Tenant_Uplink (vSphere distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 700): Port group used for all tenant traffic to egress from the cloud. Multiples may exist.

DPG_Workload_1 (vSphere distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 800): Port group used for workload traffic.

DPG_Workload_2 (vSphere distributed port group, Distributed vSwitch 2, Cloud vCenter, VLAN 900): Port group used for workload traffic.

Avamar_Target (primary PVLAN, physical switches, VLAN 1000): Promiscuous primary PVLAN to which the physical Avamar grids are connected. This PVLAN has an associated secondary isolated PVLAN (1100) in which the Avamar proxies are placed.

DPG_AV_Proxies (secondary PVLAN/vSphere distributed port group, physical switches and Distributed vSwitch 2 on the Cloud vCenter, VLAN 1100, optional): Isolated secondary PVLAN to which the Avamar proxy virtual machines are connected. This PVLAN enables the proxies to communicate with Avamar grids on the Avamar_Target network but prevents the proxies from communicating with each other.
Chapter 4: Single-Site/Single vCenter Topology
This chapter presents the following topics:
Overview ..........................................................................................................32
Single-site networking considerations ...................................................................32
Single-site storage considerations ........................................................................33
Recovery of cloud management platform ..............................................................36
Backup of single-site/single vCenter enterprise hybrid cloud ....................................36
This chapter describes networking and storage considerations for a single-site/single
vCenter topology in the Federation Enterprise Hybrid Cloud solution.
When to use the single-site topology
The single-site/single vCenter Federation Enterprise Hybrid Cloud topology should be used
when restart or recovery of the cloud to another data center is not required. It can also be
used as the base deployment on top of which you may layer the dual-site/single vCenter or
dual-site/dual vCenter topology at a later time.
Architecture
Figure 9 shows the single-site/single vCenter architecture for the Federation Enterprise
Hybrid Cloud solution including the required sets of resources separated by pod.
Figure 9. Federation Enterprise Hybrid Cloud single-site architecture

Supported virtual networking technologies

The Federation Enterprise Hybrid Cloud supports the following virtual networking technologies in the single-site topology:
• VMware NSX (recommended)
• VMware vSphere Distributed Switch

Supported VMware NSX features
When using VMware NSX in a single-site architecture, the Federation Enterprise Hybrid
Cloud solution supports the full range of NSX functionality supported by VMware vRealize
Automation. The integration between these components provides all the required
functionality, including but not limited to:
• Micro-segmentation
• Dynamic provisioning of VMware NSX constructs via vRealize blueprints
• Use of NSX security policies, groups, and tags
• Integration with the VMware NSX Partner ecosystem for enhanced security
NSX best practices
In a single-site topology, when NSX is used, all NSX Controller components reside in the
same site and within the NEI Pod. NSX best practice recommends that each NSX Controller
is placed on a separate physical host.
NSX creates Edge Services Gateways (ESGs) and Distributed Logical Routers (DLRs). Best
practice for ESGs and DLRs recommends that they are deployed in HA pairs, and that the
ESGs and DLRs are separated from each other onto different physical hosts.
Combining these best practices means that a minimum of four physical hosts are required to
support the NEI Pod function when NSX is used.
VMware anti-affinity rules should be used to ensure that the following conditions are true during optimum conditions:
• NSX Controllers reside on different hosts.
• NSX ESGs configured for high availability reside on different hosts.
• NSX DLR Control virtual machines reside on different hosts.
• NSX ESG and DLR Control virtual machines reside on different hosts.
When using the Federation Enterprise Hybrid Cloud Sizing tool, consider the choice of server
specification for the NEI Pod to ensure efficient use of hardware resources, as the tool will
enforce the four-server minimum when NSX is chosen.
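These placement rules can be treated as a set of separation checks. The sketch below is illustrative Python only; the host and virtual machine names are hypothetical and this is not the product's placement logic:

    # Hypothetical placement of NSX components across a four-host NEI Pod.
    placements = {
        "nsx-controller-1": "nei-esx01",
        "nsx-controller-2": "nei-esx02",
        "nsx-controller-3": "nei-esx03",
        "esg-01a": "nei-esx01", "esg-01b": "nei-esx02",            # ESG HA pair
        "dlr-ctrl-01a": "nei-esx03", "dlr-ctrl-01b": "nei-esx04",  # DLR control HA pair
    }

    def separated(vms):
        """True if every listed VM runs on a different host."""
        hosts = [placements[vm] for vm in vms]
        return len(hosts) == len(set(hosts))

    checks = {
        "Controllers on different hosts": separated(
            ["nsx-controller-1", "nsx-controller-2", "nsx-controller-3"]),
        "ESG HA pair separated": separated(["esg-01a", "esg-01b"]),
        "DLR control pair separated": separated(["dlr-ctrl-01a", "dlr-ctrl-01b"]),
        "ESG and DLR control separated": separated(["esg-01a", "dlr-ctrl-01a"]),
    }
    for rule, ok in checks.items():
        print(rule, "->", "OK" if ok else "violates best practice")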
Storage design
This Federation Enterprise Hybrid Cloud solution presents storage in the form of storage
service offerings that greatly simplify virtual storage provisioning.
The storage service offerings are based on ViPR virtual pools, which are tailored to meet the
performance requirements of general IT systems and applications. Multiple storage system
virtual pools, consisting of different disk types, are configured and brought under ViPR
management.
ViPR presents the storage to the enterprise hybrid cloud as virtual storage pools, abstracting
the underlying storage details and enabling provisioning tasks to be aligned with the
application’s class of service. In Federation Enterprise Hybrid Cloud, each ViPR virtual pool represents a storage service offering that can be supported or backed by multiple storage pools
of identical performance and capacity. This storage service offering concept is summarized
in Figure 10.
Figure 10. Storage service offerings for the hybrid cloud
Note: The storage service offerings in Figure 10 are suggestions only. Storage service offerings
can be configured and named as appropriate to reflect their functional use.
The storage service examples in Figure 10 suggest the following configurations:
• All Flash: Can be provided by EMC XtremIO™, VNX as all-flash storage, or VMAX FAST VP where only the flash tier is used.
• Tiered: Provides VNX or VMAX block- or file-based VMFS or NFS storage devices and is supported by multiple storage pools using EMC Fully Automated Storage Tiering for Virtual Pools (FAST® VP) and EMC Fully Automated Storage Tiering (FAST®) Cache.
• Single Tier: Provides EMC VNX block- or file-based VMFS or NFS storage and is supported by multiple storage pools using a single storage type (NL-SAS in this example).
We suggest these storage service offerings only to highlight what is possible in a Federation Enterprise Hybrid Cloud environment. The full list of supported platforms includes:
• EMC VMAX
• EMC VNX
• EMC XtremIO
• EMC ScaleIO®
• EMC VPLEX
• EMC RecoverPoint
• Isilon® (Workload use only)
As a result, many other storage service offerings can be configured to suit business and application needs, as appropriate.
Note: The Federation recommends that you follow the best practice guidelines when deploying
any of the supported platform technologies. The Federation Enterprise Hybrid Cloud does not
require any variation from these best practices.
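As a simple illustration, the example offerings in Figure 10 can be expressed as queryable records. This sketch is illustrative only; the record layout is an assumption and does not represent a ViPR data model:

    # The three sample storage service offerings from Figure 10.
    STORAGE_SERVICES = {
        "All Flash":   {"platforms": ["XtremIO", "VNX (all-flash)", "VMAX FAST VP (flash tier)"],
                        "drive_types": ["Flash"]},
        "Tiered":      {"platforms": ["VNX", "VMAX"],
                        "drive_types": ["Flash", "SAS", "NL-SAS"]},   # FAST VP / FAST Cache managed
        "Single Tier": {"platforms": ["VNX"],
                        "drive_types": ["NL-SAS"]},
    }

    def offerings_with_drive(drive_type):
        """Return the service offerings that include a given drive type."""
        return [name for name, svc in STORAGE_SERVICES.items()
                if drive_type in svc["drive_types"]]

    print(offerings_with_drive("NL-SAS"))   # ['Tiered', 'Single Tier']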
Storage consumption
vRealize Automation provides the framework to build relationships between vSphere storage
profiles and Business Groups so that they can be consumed through the service catalog.
Initially, physical storage pools are configured on the storage system and made available to
ViPR where they are configured into their respective virtual pools. At provisioning time, LUNs
or file systems are configured from these virtual pools and presented to vSphere as VMFS or
NFS datastores. The storage is then discovered by vRealize Automation and made available
for assignment to business groups within the enterprise.
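The flow from physical pool to business-group consumption can be summarized in a few lines. The sketch below is a hypothetical illustration only; the function and the datastore and SRP naming are assumptions, not the actual ViPR or vRealize Automation APIs:

    def provision_datastore(virtual_pool, size_gb, cluster):
        """Hypothetical helper mirroring the flow: virtual pool -> datastore -> SRP."""
        datastore = {
            "name": f"{cluster}_{virtual_pool}_01",   # LUN/file system surfaced as VMFS/NFS
            "size_gb": size_gb,
            "backing_virtual_pool": virtual_pool,
        }
        srp = f"SRP_{virtual_pool}"   # grouped by service level for business-group consumption
        return datastore, srp

    ds, srp = provision_datastore("Prod-2", size_gb=2048, cluster="Workload-CL1")
    print(ds["name"], "->", srp)   # Workload-CL1_Prod-2_01 -> SRP_Prod-2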
This storage service offering approach greatly simplifies the process of storage
administration. Instead of users having to configure the placement of individual virtual
machine disks (VMDKs) on different disk types such as serial-attached storage (SAS) and
FC, they simply select the appropriate storage service level required for their business need.
Virtual disks provisioned on FAST VP storage benefit from the intelligent data placement.
While frequently accessed data is placed on disks with the highest level of service, less
frequently used data is migrated to disks reflecting that service level.
When configuring virtual machine storage, a business group administrator can configure
blueprints to deploy virtual machines onto any of the available storage service levels. In the
example in Figure 11, a virtual machine can be deployed with a blueprint including a SQL
Server database, to a storage service offering named Prod-2, which was designed with the
performance requirements of such an application in mind.
Figure 11. Blueprint storage configuration in vRealize Automation
The devices for this SQL Server database machine have different performance requirements,
but rather than assigning different disk types to each individual drive, each virtual disk can
be configured on the Prod-2 storage service offering. This allows the underlying FAST
technology to handle the best location for each individual block of data across the tiers. The
vRealize Automation storage reservation policy ensures that the VMDKs are deployed to the
appropriate storage.
The storage presented to vRealize Automation can be shared and consumed across the
various business groups using the capacity and reservation policy framework in vRealize
Automation.
Storage provisioning
Storage is provisioned to the Workload vSphere clusters in the environment using the
Provision Cloud Storage catalog item that can provision VNX, VMAX, XtremIO, ScaleIO, and
VPLEX Local storage to single-site topology workload clusters.
The workflow interacts with both ViPR and vRealize Automation to create the storage,
present it to the chosen vSphere cluster and add the new volume to the relevant vRealize
Storage Reservation Policy.
vSphere clusters are made eligible for storage provisioning by tagging them with vRealize
Automation custom properties that define them as Unprotected clusters, that is, they are not involved in any form of inter-site replication relationship. This tagging is done during
the installation and preparation of vSphere clusters for use by the Federation Enterprise
Hybrid Cloud using the Unprotected Cluster Onboarding workflows provided as part of the
Federation Enterprise Hybrid Cloud self-service catalog.
Note: Virtual machines on the cluster may still be configured to use backup as a service, as
shown in Chapter 7.
As local-only (unprotected) vSphere clusters can also exist in continuous availability and DR
topologies, this process ensures that only the correct type of storage is presented to the
single-site vSphere clusters and no misplacement of virtual machines intended for inter-site
protection occurs.
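Conceptually, the Unprotected tag acts as a filter on the storage that may be offered to a cluster. The following sketch is illustrative only; the property values, pool names, and filtering helper are assumptions rather than the shipped custom properties or workflows:

    # Hypothetical cluster tags and candidate storage pools.
    clusters = {
        "Workload-CL1": {"protection": "Unprotected"},
        "Workload-CL2": {"protection": "CA Enabled"},   # vMSC cluster, handled elsewhere
    }
    storage_pools = [
        {"name": "VNX_Tiered_01",        "replicated": False},
        {"name": "VPLEX_Distributed_01", "replicated": True},
    ]

    def eligible_pools(cluster_name):
        """Only non-replicated storage is presented to an Unprotected cluster."""
        if clusters[cluster_name]["protection"] != "Unprotected":
            return []
        return [p["name"] for p in storage_pools if not p["replicated"]]

    print(eligible_pools("Workload-CL1"))   # ['VNX_Tiered_01']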
ViPR virtual pools
For block-based provisioning, ViPR virtual arrays should not contain more than one protocol.
For Federation Enterprise Hybrid Cloud this means that ScaleIO storage and FC block
storage must be provided via separate virtual arrays.
Note: Combining multiple physical arrays into fewer virtual arrays to provide storage to virtual
pools is supported.
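A minimal sketch of this rule, using placeholder virtual array names (illustrative only, not ViPR objects or API calls):

    # Each block virtual array should carry a single protocol.
    virtual_arrays = {
        "varray_fc":      {"protocols": {"FC"}},
        "varray_scaleio": {"protocols": {"ScaleIO"}},
        "varray_mixed":   {"protocols": {"FC", "ScaleIO"}},   # violates the guideline
    }

    invalid = [name for name, va in virtual_arrays.items() if len(va["protocols"]) > 1]
    print("Virtual arrays to split:", invalid)   # ['varray_mixed']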
Single-site topology
Recovery of the management platform does not apply to a single-site topology, because
there is no target site to recover to.
Single-site/single vCenter topology backup
The recommended option for backup in a single-site/single vCenter topology is the Standard
Avamar configuration, though the Redundant Avamar/single vCenter configuration may also
be used to provide additional resilience. Both options are described in Chapter 7.
Chapter 5: Dual-Site/Single vCenter Topology
This chapter presents the following topics:
Overview ..........................................................................................................38
Standard dual-site/single vCenter topology ...........................................................38
Continuous availability dual-site/single vCenter topology ........................................39
Continuous availability network considerations ......................................................40
VPLEX Witness ...................................................................................................45
VPLEX topologies ...............................................................................................46
Continuous availability storage considerations .......................................................53
Recovery of cloud management platform ..............................................................56
Backup in dual-site/single vCenter enterprise hybrid cloud ......................................56
CA dual-site/single vCenter ecosystem .................................................................57
This chapter describes the networking and storage considerations for a dual-site/single
vCenter topology in the Federation Enterprise Hybrid Cloud solution.
When to use the dual-site/single vCenter topology
The dual-site/single vCenter Federation Enterprise Hybrid Cloud topology may be used when
restart of the cloud to another data center is required. It should only be used in either of the
following two scenarios:
Standard dual-site/single vCenter topology
• Two sites are present that require management via a single vCenter instance and a single Federation Enterprise Hybrid Cloud management platform/portal.

Note: In this case, the scope of the term ‘site’ is at the user’s discretion. It could be taken to mean separate individual geographical locations, or independent islands of infrastructure in the same geographical location, such as independent VCE VxBlock platforms.
• This topology has no additional storage considerations beyond the single-site/single vCenter topology because each site has completely independent storage.
• When used with VMware NSX, this topology employs an additional NEI Pod on the second site to ensure north/south network traffic egresses the second site in the most efficient manner. The local NEI Pod will host the Edge gateway services for its respective site.

Note: There is a 1:1 relationship between NSX Manager and vCenter Server, so although there is a second NEI Pod, there is still only one NSX Manager instance.
Continuous availability dual-site/single vCenter topology
Continuous availability is required. This topology also requires that:
• EMC VPLEX storage is available.
• Stretched Layer 2 VLANs are permitted or the networking technology chosen supports VXLANs.
• The latency between the two physical data center locations is less than 10 ms.
The standard dual-site/single vCenter Federation Enterprise Hybrid Cloud topology controls
two sites, each with independent islands of infrastructure using a single vCenter instance
and Federation Enterprise Hybrid Cloud management stack/portal.
This architecture provides a mechanism to extend an existing Federation Enterprise Hybrid
Cloud by adding additional independent infrastructure resources to an existing cloud, when
resilience of the management platform itself is not required. Figure 12 shows the
architecture used for this topology option.
Figure 12. Federation Enterprise Hybrid Cloud standard dual-site/single vCenter architecture
The continuous availability (CA) dual-site/single vCenter Federation Enterprise Hybrid Cloud
topology is an extension of the standard dual-site/single vCenter model that stretches the
infrastructure across sites, using VMware vSphere Metro Storage Cluster (vMSC), vSphere
HA, and VPLEX in Metro configuration.
This topology enables multi-site resilience across two sites with automatic restart of both the
management platform and workload virtual machines on the surviving site. Figure 13 shows
the architecture used for this topology option.
Figure 13. Federation Enterprise Hybrid Cloud CA dual-site/single vCenter architecture

Supported virtual networking technologies
The Federation Enterprise Hybrid Cloud supports the following virtual networking
technologies in the dual-site/single vCenter topology:
• VMware NSX (recommended)
• VMware vSphere Distributed Switch
If vSphere Distributed Switch is used in the CA dual-site/single vCenter topology, then all
networks must be backed by a Layer 2 VLAN that is present in both locations. VMware NSX
enables you to use VXLANs backed by a Layer 3 DCI.
Supported VMware NSX features
When using VMware NSX in a dual-site/single vCenter architecture, the Federation Enterprise Hybrid Cloud solution supports the full range of NSX functionality supported by VMware vRealize Automation. The integration between these components provides all required functionality, including but not limited to:
• Micro-segmentation
• Dynamic provisioning of VMware NSX constructs via vRealize blueprints
• Use of NSX security policies, groups, and tags
• Integration with the VMware NSX Partner ecosystem for enhanced security

NSX best practices
In the CA dual-site, single vCenter topology, when NSX is used, all NSX Controller
components reside in the NEI Pod, but the NEI Pod is supported by a vSphere Metro Storage
(stretched) cluster. NSX best practice recommends that each controller is placed on a
separate physical host.
NSX creates Edge Services Gateways (ESGs) and Distributed Logical Routers (DLRs). Best
practice for ESGs and DLRs recommends that they are deployed in HA pairs, and that the
ESGs and DLRs are separated from each other onto different physical hosts.
Combining these best practices, and ensuring that each site is fully capable of running the
NSX infrastructure optimally means that a minimum of four physical hosts per site (eight in
total) are required to support the NEI Pod function when NSX is used.
VMware affinity and anti-affinity rules should be used to ensure that the following conditions are true during optimum conditions:
• NSX Controllers reside on different hosts.
• NSX Edge Services Gateways reside on different hosts.
• NSX Distributed Logical Router Control virtual machines reside on different hosts.
• NSX ESG and DLR Control virtual machines do not reside on the same physical hosts.
• All NSX Controllers reside on a given site, and move laterally within that site before moving to the alternate site.
When using the Federation Enterprise Hybrid Cloud Sizing tool, give appropriate consideration to the choice of server specification for the NEI Pod to ensure efficient use of hardware resources, as the tool will enforce the four-server minimum when NSX is chosen.
Data Center Interconnect
Data centers that are connected together over a metro link can use either Layer 2 bridged
VLAN connectivity or Layer 3 routed IP connectivity.
Both Data Center Interconnect (DCI) options have advantages and disadvantages. However,
new standards and technologies, such as Virtual Extensible LAN (VXLAN), address most of
the disadvantages.
Layer 2 DCI
A Layer 2 DCI should be used in continuous availability scenarios where VMware NSX is not
available.
Traditional disadvantages of Layer 2 DCI
The risks related to Layer 2 extensions between data centers mirror some of the limitations
faced in traditional Ethernet broadcast domains.
The limiting factor is the scalability of a single broadcast domain. A large number of hosts
and virtual machines within a broadcast domain, all of which contend for shared network
resources, can result in broadcast storms. The results of broadcast storms are always to the detriment of network availability, adversely affecting application delivery and ultimately leading to a poor user experience. This can affect productivity.
As the CA architecture is stretched across both data centers, a broadcast storm could cause
disruption in both the primary and secondary data centers.
Multiple Layer 2 interconnects create additional challenges for stretched networks. If
unknown broadcast frames are not controlled, loops in the Layer 2 extension can form. This
can also cause potential disruption across both data centers, resulting in network downtime
and loss of productivity.
If used, the Spanning Tree Protocol (STP) needs to be run and carefully managed to control
loops across the primary and secondary site interconnecting links.
Loop avoidance and broadcast suppression mechanisms are available to the IT professional,
but must be carefully configured and managed.
Traditional advantages of Layer 2 DCI
The greatest advantage of Layer 2 DCI is the IP address mobility of physical and virtual
machines across both data centers. This simplifies recovery in the event of a failure in the
primary data center.
Note: Layer 2 connectivity is often necessary for applications where heartbeats and clustering
techniques are used across multiple hosts. In some cases, technologies might not be able to
span Layer 3 boundaries.
Layer 3 DCI
A Layer 3 DCI may be used in continuous availability scenarios where VMware NSX is available.
Traditional disadvantages of Layer 3 DCI
If an infrastructure failure occurs at the primary site, a machine migrated to the secondary
data center must be reconfigured to use an alternate IP addressing scheme. This can be
more time consuming and error prone than having a high-availability deployment across a
single Layer 2 domain.
Inter-site machine clustering, which can be either multicast or broadcast based, may not be supported over a Layer 3 boundary.
Traditional advantages of Layer 3 DCI
Layer 3 DCI does not use extended broadcast domains or require the use of STP. Therefore,
there is greater stability of the production and services networks across both primary and
secondary data centers.
Note: The data center interconnect physical link is subject to the availability of the local
telecommunications service provider and the business requirement of the enterprise.
Optimal continuous availability DCI networking solution
The network topology used in the CA for Federation Enterprise Hybrid Cloud solution can use
the advantages of both Layer 2 and Layer 3 DCI topologies when used with VMware NSX.
Layer 2 requirements such as resource and management traffic are handled by the VXLAN
implementation enabled by NSX. This offers the advantage of IP mobility across both sites
by placing the resource and management traffic on spanned VXLAN segments. It also
eliminates the complexity of STP and performance degradation that large broadcast domains
can introduce.
VXLANs can expand the number of Layer 2 domains or segments beyond the 802.1q limit of
4,096 VLANs to a theoretical limit of 16 million. VXLANs can also extend the Layer 2
environment over Layer 3 boundaries.
An underlying Layer 3 data center interconnect runs a dynamic route distribution protocol
with rapid convergence characteristics such as Open Shortest Path First (OSPF). OSPF
routing metrics route the ingress traffic to the primary data center. If the primary data
center is unavailable, the OSPF algorithm automatically converges routes to the secondary
data center. This is an important advantage compared to using a traditional Layer 2 DCI and
Layer 3 DCI solution in isolation.
Note: NSX also supports Border Gateway Protocol (BGP) and Intermediate System to
Intermediate System (IS-IS) route distribution protocols. The Federation Enterprise Hybrid
Cloud supports OSPF and BGP, but not IS-IS.
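The ingress behavior described above can be illustrated with a toy route-selection example. This is not an OSPF implementation; the prefix, next hops, and costs are arbitrary sample values used only to show metric-based convergence to the secondary site:

    # Advertised paths for a tenant prefix, lowest cost preferred (sample values only).
    routes = {
        "10.10.0.0/16": [
            {"next_hop": "SiteA-ESG", "cost": 10},
            {"next_hop": "SiteB-ESG", "cost": 20},
        ]
    }

    def best_path(prefix, failed_next_hops=()):
        candidates = [r for r in routes[prefix] if r["next_hop"] not in failed_next_hops]
        return min(candidates, key=lambda r: r["cost"])["next_hop"] if candidates else None

    print(best_path("10.10.0.0/16"))                                   # SiteA-ESG
    print(best_path("10.10.0.0/16", failed_next_hops={"SiteA-ESG"}))   # SiteB-ESG (converged)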
In a collapsed management model, all clusters are part of the same vCenter instance and therefore can all be configured to use the security and protection features offered by the same NSX Manager instance. If these features are not required for the Core Pod, then a stretched Layer 2 network may also be used.
In a distributed management model, two vCenter instances are used. Given the 1:1 relationship between a vCenter instance and NSX Manager, a second NSX Manager instance would be required if the Core Pod is to use NSX-provisioned networks with these security and protection features. Given the small number of virtual machines present in the external vCenter, it may be appropriate to consider a stretched Layer 2 VLAN for this network if the second NSX Manager instance is deemed unnecessary.
Figure 14 shows one possible scenario where two data centers are connected using both a
Layer 2 and a routed Layer 3 IP link and how the Core, NEI, Automation, and Workload
segments could be provisioned.
Figure 14. Continuous availability data center interconnect example using VMware NSX
In this scenario, the following properties are true:
• vSphere ESXi stretched clusters are utilized to host the Core, Automation, NEI, and Workload virtual machines. This, with vSphere HA, enables virtual machines to be automatically restarted on the secondary site if the primary site fails.
• The Core Pod virtual machines are connected to a stretched VLAN. This prevents the need for a second NSX Manager machine.
• The NSX controllers (NEI Pod) are connected to the same stretched VLAN as the Core Pod virtual machines.
• The Automation Pod virtual machines are connected to an NSX logical network, backed by VXLAN and available across both sites.
• The Workload Pod virtual machines are connected to an NSX logical network, backed by VXLAN and available across both sites.
• VXLAN encapsulated traffic must be able to travel between vSphere ESXi hosts at both sites.
One or more NSX Edge Services Gateways (ESGs) are deployed at each site to control traffic
flow between the virtual and physical network environments.
Note: NSX supports three modes of replication for VXLAN traffic: unicast, multicast, and hybrid. Unicast mode enables VXLAN traffic to be carried across Layer 3 boundaries without assistance from the underlying physical network, but requires availability of the NSX Controllers.
vSphere HA, with VPLEX and VPLEX Witness, enables the cloud-management platform
virtual machines to restore the cloud-management service on the secondary site in the
event of a total loss of the primary data center. In this scenario, the virtual machines
automatically move to and operate from vSphere ESXi nodes residing in the secondary data
center.
Edge Services Gateway considerations
• All workload virtual machines should use NSX logical switches connected to a Distributed Logical Router (DLR). The DLR can provide the same default gateway to a virtual machine, whether it is running at the primary or secondary site.
• DLRs should be connected to at least one ESG at each site, and a dynamic route distribution protocol (such as OSPF and others supported by NSX) should be used to direct traffic flow. We recommend that you use both NSX High Availability and vSphere High Availability in conjunction with host DRS groups, virtual machine DRS groups, and virtual machine DRS affinity rules to ensure that DLR virtual machines run on the correct site in optimum conditions.
This solution has all the advantages of traditional Layer 2 and Layer 3 solutions. It provides
increased flexibility and scalability by implementing VXLANs, and benefits from increased
stability by not extending large broadcast domains across the VPLEX Metro.
VPLEX Witness is an optional component deployed in customer environments where the
regular preference rule sets are insufficient to provide seamless zero or near-zero recovery
time objective (RTO) storage availability in the event of site disasters or VPLEX cluster and
inter-cluster failures.
Without VPLEX Witness, all distributed volumes rely on configured rule sets to identify the
preferred cluster in the event of a cluster partition or cluster/site failure. However, if the
preferred cluster fails (for example, as a result of a disaster event), VPLEX is unable to
automatically enable the surviving cluster to continue I/O operations to the affected
distributed volumes. VPLEX Witness is designed to overcome this.
The VPLEX Witness server is deployed as a virtual appliance running on a customer’s
vSphere ESXi host that is deployed in a failure domain separate from both of the VPLEX
clusters. The third fault domain must have power and IP isolation from both the Site A and
Site B fault domains, which host the VPLEX Metro Clusters.
This eliminates the possibility of a single fault affecting both the cluster and VPLEX Witness.
VPLEX Witness connects to both VPLEX clusters over the management IP network. By
reconciling its own observations with the information reported periodically by the clusters,
VPLEX Witness enables the clusters to distinguish between inter-cluster network partition
failures and cluster failures, and to automatically resume I/O operations in these situations.
Figure 15 shows an example of a high-level deployment of VPLEX Witness and how it can
augment an existing static preference solution. The VPLEX Witness server resides in a fault
domain separate from the VPLEX clusters on Site A and Site B.
Figure 15. High-level deployment of EMC VPLEX Witness
Deciding on VPLEX topology

VMware classifies the stretched VPLEX Metro cluster configuration with VPLEX into the following categories:
• Uniform host access configuration with VPLEX host Cross-Connect—vSphere ESXi hosts in a distributed vSphere cluster have a connection to the local VPLEX system and paths to the remote VPLEX system. The remote paths presented to the vSphere ESXi hosts are stretched across distance.
• Non-uniform host access configuration without VPLEX host Cross-Connect—vSphere ESXi hosts in a distributed vSphere cluster have a connection only to the local VPLEX system.

Use the following guidelines to help you decide which topology suits your environment:
• Uniform (Cross-Connect) is typically used where:
  – Inter-site latency is less than 5 ms.
  – Stretched SAN configurations are possible.
• Non-Uniform (without Cross-Connect) is typically used where:
  – Inter-site latency is between 5 ms and 10 ms.
  – Stretched SAN configurations are not possible.

Uniform host access configuration with VPLEX host Cross-Connect

EMC GeoSynchrony® supports the concept of a VPLEX Metro cluster with Cross-Connect. This configuration provides a perfect platform for a uniform vSphere stretched-cluster deployment. VPLEX with host Cross-Connect is designed for deployment in a metropolitan-type topology with latency that does not exceed 5 ms round-trip time (RTT).
vSphere ESXi hosts can access a distributed volume on the local VPLEX cluster and on the
remote cluster in the event of a failure. When this configuration is used with VPLEX Witness,
vSphere ESXi hosts are able to survive through multiple types of failure scenarios. For
example, in the event of a VPLEX cluster or back-end storage array failure, the vSphere
ESXi hosts can still access the second VPLEX cluster with no disruption in service.
In the unlikely event that the preferred site fails, VPLEX Witness intervenes and ensures that
access to the surviving cluster is automatically maintained. In this case, vSphere HA
automatically restarts all affected virtual machines.
Figure 16 shows that all ESXi hosts are connected to the VPLEX clusters at both sites. This
can be achieved in a number of ways:
• Merge switch fabrics by using Inter-Switch Link (ISL) technology to connect local and remote SANs.
• Connect directly to the remote data center fabric without merging the SANs.

Figure 16. Deployment model with VPLEX host Cross-Connect
This type of deployment is designed to provide the highest possible availability for a
Federation Enterprise Hybrid Cloud environment. It can withstand multiple failure scenarios
including switch, VPLEX, and back-end storage at a single site with no disruption in service.
For reasons of performance and availability, the Federation recommends that separate host
bus adapters be used for connecting to local and remote switch fabrics.
Note: VPLEX host Cross-Connect is configured at the host layer only and does not imply any
cross connection of the back-end storage. The back-end storage arrays remain locally connected
to their respective VPLEX clusters.
From the host perspective, in the uniform deployment model with VPLEX host Cross-Connect, the vSphere ESXi hosts are zoned to both the local and the remote VPLEX clusters.
Figure 17 displays the VPLEX storage views for a host named DRM-ESXi088, physically
located in Site A of our environment.
Here the initiators for the host are registered and added to both storage views with the
distributed device being presented from both VPLEX clusters.
Figure 17. VPLEX storage views with host Cross-Connect
This configuration is transparent to the vSphere ESXi host. The remote distributed volume is
presented as an additional set of paths.
Figure 18 shows the eight available paths that are presented to host DRM-ESXi088, for
access to the VPLEX distributed volume hosting the datastore named CC-Shared-M3. The
serial numbers of the arrays are different because four of the paths are presented from the
first VPLEX cluster and the remaining four are presented from the second.
Figure 18. Datastore paths in a VPLEX with host Cross-Connect configuration
PowerPath/VE autostandby mode
Neither the host nor the native multipath software can by themselves distinguish between
local and remote paths. This poses a potential impact on performance if remote paths are
used for I/O in normal operations because of the cross-connect latency penalty.
PowerPath/VE provides the concept of autostandby mode, which automatically identifies all
remote paths and sets them to standby (asb:prox is the proximity-based autostandby
algorithm). This feature ensures that only the most efficient paths are used at any given
time.
PowerPath/VE groups paths internally by VPLEX cluster. The VPLEX cluster with the lowest
minimum path latency is designated as the local/preferred VPLEX cluster, while the other
VPLEX cluster within the VPLEX Metro system is designated as the remote/non-preferred
cluster.
A path associated with the local/preferred VPLEX cluster is put in active mode, while a path
associated with the remote/non-preferred VPLEX cluster is put in autostandby mode. This
forces all I/O during normal operations to be directed towards the local VPLEX cluster. If a
failure occurs where the paths to the local VPLEX cluster are lost, PowerPath/VE activates
the standby paths and the host remains up and running on the local site, while accessing
storage on the remote site.
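The proximity-based behavior can be sketched as a grouping-and-selection step. This is a conceptual illustration only, not PowerPath/VE code; the path names and latency values are hypothetical:

    # Paths to a VPLEX distributed volume, grouped by owning VPLEX cluster.
    paths = [
        {"name": "vmhba1:C0:T0:L1", "vplex_cluster": "cluster-1", "latency_ms": 0.4},
        {"name": "vmhba1:C0:T1:L1", "vplex_cluster": "cluster-1", "latency_ms": 0.5},
        {"name": "vmhba2:C0:T2:L1", "vplex_cluster": "cluster-2", "latency_ms": 3.9},
        {"name": "vmhba2:C0:T3:L1", "vplex_cluster": "cluster-2", "latency_ms": 4.1},
    ]

    def assign_modes(paths):
        # The cluster with the lowest minimum path latency is treated as local/preferred.
        clusters = {p["vplex_cluster"] for p in paths}
        local = min(clusters, key=lambda c: min(p["latency_ms"] for p in paths
                                                if p["vplex_cluster"] == c))
        return {p["name"]: "active" if p["vplex_cluster"] == local else "autostandby"
                for p in paths}

    for name, mode in assign_modes(paths).items():
        print(name, mode)   # cluster-1 paths active, cluster-2 paths autostandby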
Non-uniform host access configuration without VPLEX Cross-Connect
The non-uniform host configuration can be used for a Federation Enterprise Hybrid Cloud deployment if greater distances are required. This configuration requires that the round-trip time be within 10 ms to comply with VMware HA requirements. Without the cross-connect deployment, vSphere ESXi hosts at each site have connectivity only to that site’s VPLEX cluster.
Figure 19 shows that hosts located at each site have connections to only their respective
VPLEX cluster. The VPLEX clusters have a link between them to support the VPLEX Metro
configuration, and the VPLEX Witness is located in a third failure domain.
Figure 19. VPLEX architecture without VPLEX Cross-Connect
The major benefit of this deployment option is that greater distances can be achieved in order to protect the infrastructure. With the EMC VPLEX AccessAnywhere™ feature, the non-uniform deployment offers the business another highly resilient option that can withstand various types of failures including front-end and back-end single path failure, single switch failure, and single back-end array failure.
Figure 20 shows the storage views from VPLEX cluster 1 and cluster 2. In the example non-uniform deployment, hosts DRM-ESXi077 and DRM-ESXi099 represent hosts located in different data centers. They are visible in their site-specific VPLEX cluster’s storage view.
With AccessAnywhere, the hosts have simultaneous write access to the same distributed
device, but only via the VPLEX cluster on the same site.
Figure 20. VPLEX Storage Views without VPLEX Cross-Connect
Figure 21 shows the path details for one of the hosts in a stretched cluster that has access
to the datastores hosted on the VPLEX distributed device. The World Wide Name (WWN) on
the Target column shows that all paths to that distributed device belong to the same VPLEX
cluster. PowerPath/VE has also been installed on all of the hosts in the cluster, and it has
automatically set the VPLEX volume to the adaptive failover mode. The autostandby feature
is not used in this case because all the paths to the device are local.
Figure 21. vSphere Datastore Storage paths without VPLEX Cross-Connect
With vSphere HA, the virtual machines are also protected against major outages, such as
network partitioning of the VPLEX WAN link or an entire site failure. In order to prevent any
unnecessary down time, the Federation recommends that the virtual machines reside on the
site that would win ownership of the VPLEX distributed volume in the event of such a
partitioning occurring.
Site affinity for management platform machines
When using the CA dual-site/single vCenter topology, the Federation recommends that all
platform components are bound to a given site using VMware affinity ‘should’ rules. This
ensures minimum latencies between components while still allowing them to move to the
surviving site in the case of a site failure.
Site affinity for tenant virtual machines
The solution uses VMware Host Distributed Resource Scheduler (DRS) groups to subdivide
the vSphere ESXi hosts in each workload and management cluster into groupings of hosts
corresponding to their respective sites. It does this by defining two VMware host DRS groups
in the format SiteName_Hosts where the site names of both sites are defined during the
installation of the Federation Enterprise Hybrid Cloud foundation package.
VMware virtual machine DRS groups are also created in the format Sitename_VMs during
the preparation of the ESXi cluster for continuous availability.
Storage reservation policies (SRPs) created by the Federation Enterprise Hybrid Cloud storage-as-a-service workflows are automatically named to indicate the preferred site in which that storage type is run.
Note: In this case, the preferred site setting means that, in the event of a failure that results in the VPLEX units being unable to communicate, this site will be the one that continues to provide read/write access to the storage.
During deployment of a virtual machine through the vRealize portal, the user is asked to choose from a list of storage reservation policies. Federation Enterprise Hybrid Cloud custom workflows use this information to place the virtual machine on a vSphere ESXi cluster with access to the required storage type and to place the virtual machine into the appropriate virtual machine DRS group.
Virtual machines to host DRS rules are then used to bind virtual machines to the preferred
site by configuring the SiteName_VMs virtual machine DRS group with a setting of “should
run” on the respective SiteName_Hosts host DRS group. This ensures virtual machines run
on the required site, while allowing them the flexibility of failing over if the infrastructure on
that site becomes unavailable.
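The naming conventions above make the placement logic easy to express. The sketch below is illustrative only; the SRP name format and the rule record are assumptions based on the examples in this chapter, not the actual Federation Enterprise Hybrid Cloud workflow code:

    def site_from_srp(srp_name):
        """e.g. 'SiteA_Preferred_CA_Enabled' -> 'SiteA'."""
        return srp_name.split("_")[0]

    def build_affinity_rule(vm_name, srp_name):
        site = site_from_srp(srp_name)
        return {
            "vm": vm_name,
            "vm_drs_group": f"{site}_VMs",
            "host_drs_group": f"{site}_Hosts",
            "rule": "should run on hosts in group",   # soft rule, allows failover
        }

    print(build_affinity_rule("VM1", "SiteA_Preferred_CA_Enabled"))

Running the same logic with SiteB_Preferred_CA_Enabled yields the SiteB_VMs and SiteB_Hosts pairing, as illustrated in the scenarios that follow.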
Figure 22 shows how the virtual machine DRS groups and affinity rules might look in a
sample configuration.
Figure 22. Sample view of site affinity DRS group and rule configuration
Note: The values “SiteA” and “SiteB” shown in both Figure 22 and Figure 23 can and should be
replaced with meaningful site names in a production environment. They must correlate with the
site name values entered during the Federation Enterprise Hybrid Cloud Foundation package
initialization for site affinity to work correctly.
Figure 23 shows a simple example of two scenarios where virtual machines are deployed to
a vMSC and how the logic operates to place those virtual machines on their preferred sites.
Figure 23. Deploying virtual machines with site affinity
Scenario 1: Deploy VM1 with affinity to Site A
This scenario describes deploying a virtual machine (VM1) with affinity to Site A:
1. During virtual machine deployment, the user chooses a storage reservation policy named SiteA_Preferred_CA_Enabled.
2. This storage reservation policy choice filters the cluster choice to only those clusters with that reservation policy, in this case Cluster 1.
3. Based on the selected storage reservation policy, Federation Enterprise Hybrid Cloud workflows programmatically determine that Site A is the preferred location, and therefore locate the virtual machine DRS affinity group corresponding with Site A, namely SiteA_VMs.
4. The expected result is:
   a. VM1 is deployed into SiteA_VMs, residing on host CL1-H1 or CL1-H2.
   b. VM1 is deployed onto a datastore from the SiteA_Preferred_CA_Enabled storage reservation policy, for example: VPLEX_Distributed_LUN_SiteA_Preferred_01 or VPLEX_Distributed_LUN_SiteA_Preferred_02.
Scenario 2: Deploy VM2 with affinity to Site B
This scenario describes deploying a virtual machine (VM2) with affinity to Site B:
1. During virtual machine deployment, the user chooses a storage reservation policy named SiteB_Preferred_CA_Enabled.
2. This storage reservation policy choice filters the cluster choice to only those clusters with that reservation policy, in this case Cluster 1.
3. Based on the selected storage reservation policy, Federation Enterprise Hybrid Cloud workflows programmatically determine that Site B is the preferred location, and therefore locate the virtual machine DRS affinity group corresponding with Site B, namely SiteB_VMs.
4. The expected result is:
   a. VM2 is deployed into SiteB_VMs, meaning it resides on host CL1-H3 or CL1-H4.
   b. VM2 is deployed onto a datastore from the SiteB_Preferred_CA_Enabled storage reservation policy, for example: VPLEX_Distributed_LUN_SiteB_Preferred_01 or VPLEX_Distributed_LUN_SiteB_Preferred_02.
ViPR virtual arrays
There must be at least one virtual array for each site. By configuring the virtual arrays in
this way, ViPR can discover the VPLEX and storage topology. You should carefully plan and
perform this step because it is not possible to change the configuration after resources have
been provisioned, without first disruptively removing the provisioned volumes.
ViPR virtual pools
ViPR virtual pools for block storage offer two options under High Availability: VPLEX local
and VPLEX distributed. When you specify local high availability for a virtual pool, the ViPR
storage provisioning services create VPLEX local virtual volumes. If you specify VPLEX
distributed high availability for a virtual pool, the ViPR storage provisioning services create
VPLEX distributed virtual volumes.
To configure a VPLEX distributed virtual storage pool through ViPR:
1. Ensure a virtual array exists for both sites, with the relevant physical arrays associated with those virtual arrays. Each VPLEX cluster must be a member of the virtual array at its own site only.
2. Before creating a VPLEX high-availability virtual pool at the primary site, create a local pool at the secondary site. This is used as the target virtual pool when creating VPLEX distributed virtual volumes.
3. When creating the VPLEX high-availability virtual pool on the source site, select the source storage pool from the primary site, the remote virtual array, and the remote pool created in Step 2. This pool is used to create the remote mirror volume that makes up the remote leg of the VPLEX virtual volume.
Note: This pool is considered remote when creating the high availability pool because it
belongs to VPLEX cluster 2 and we are creating the high availability pool from VPLEX
cluster 1.
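The ordering constraint in these steps can be made explicit in a short sketch. The function and pool names below are placeholders for illustration only and do not correspond to ViPR API calls:

    config = {"virtual_pools": {}}

    def create_local_pool(name, varray):
        config["virtual_pools"][name] = {"varray": varray, "ha": "VPLEX local"}

    def create_distributed_pool(name, varray, remote_varray, remote_pool):
        # The secondary-site local pool must already exist (Step 2) before Step 3.
        if remote_pool not in config["virtual_pools"]:
            raise ValueError("Create the secondary-site local pool first (Step 2).")
        config["virtual_pools"][name] = {"varray": varray, "ha": "VPLEX distributed",
                                         "remote_varray": remote_varray,
                                         "remote_pool": remote_pool}

    create_local_pool("SiteB_Local_Target", "varray_siteB")                  # Step 2
    create_distributed_pool("SiteA_Preferred_CA_Enabled", "varray_siteA",    # Step 3
                            "varray_siteB", "SiteB_Local_Target")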
Figure 24 shows this configuration, where VPLEX High Availability Virtual Pool represents
the VPLEX high-availability pool being created.
Figure 24. Interactions between local and VPLEX distributed pools
As described in Site affinity for tenant virtual machines, Federation Enterprise Hybrid Cloud
workflows leverage the ‘winning’ site in a VPLEX configuration to determine which site to
map virtual machines to. To enable active/active clusters, it is therefore necessary to create
two sets of datastores – one set that will win on Site A and another set that will win on Site
B. To enable this, you need to configure an environment similar to Figure 24 for Site A, and
the inverse of it for Site B (where the local pool is on Site A, and the high availability pool is
configured from Site B).
ViPR and VPLEX consistency groups interaction
VPLEX uses consistency groups to maintain common settings on multiple LUNs. To create a
VPLEX consistency group using ViPR, a ViPR consistency group must be specified when
creating a new volume. ViPR consistency groups are used to control multi-LUN consistent
snapshots and have a number of important rules associated with them when creating VPLEX
distributed devices:
• All volumes in any given ViPR consistency group must contain only LUNs from the same physical array. As a result of these considerations, the Federation Enterprise Hybrid Cloud STaaS workflows create a new consistency group per physical array, per vSphere cluster, per site.
• All VPLEX distributed devices in a given ViPR consistency group must have source and target backing LUNs from the same pair of arrays.
As a result of these two rules, it is a requirement of the Federation Enterprise Hybrid Cloud
that an individual ViPR virtual pool is created for every physical array that provides physical
pools for use in a VPLEX distributed configuration.
Virtual Pool Collapser function
Federation Enterprise Hybrid Cloud STaaS workflows use the name of the ViPR virtual pool
chosen as part of the naming for the vRealize Storage Reservation Policy (SRP) that the new
datastore is added to. The Virtual Pool Collapser (VPC) function of Federation Enterprise
Hybrid Cloud collapses the LUNs from multiple virtual pools into a single SRP.
The VPC function can be used in the scenario where multiple physical arrays provide physical storage pools of the same configuration or service level to VPLEX, but through different virtual pools, and where it is required to ensure that all LUNs provisioned across those physical pools are collapsed into the same SRP.
VPC can be enabled or disabled at a global Federation Enterprise Hybrid Cloud level. When
enabled, the Federation Enterprise Hybrid Cloud STaaS workflows examine the naming
convention of the virtual pool selected to determine which SRP it should add the datastore
to. If the virtual pool has the string ‘_VPC-‘ in it, then Federation Enterprise Hybrid Cloud
knows that it should invoke VPC logic.
Virtual Pool Collapser example
Figure 25 shows an example of VPC in use. In this scenario, the administrator has enabled the VPC function and created two ViPR virtual pools:
• GOLD_VPC-000001, which has physical pools from Array 1
• GOLD_VPC-000002, which has physical pools from Array 2
When determining how to construct the SRP name to be used, the VPC function will only use
that part of the virtual pool name that exists before ‘_VPC-’. In this example that results in the term ‘GOLD’, which then contributes to the common SRP name of
SITEA_GOLD_CA_Enabled. This makes it possible to conform to the rules of ViPR
consistency groups as well as providing a single SRP for all datastores of the same type,
which maintains abstraction and balanced datastore usage at the vRealize layer.
Figure 25. Virtual Pool Collapser example
In the example shown in Figure 25, all storage is configured to win on a single site (Site A).
To enable true active/active vSphere Metro Storage clusters, additional pools should be
configured in the opposite direction, as mentioned in Continuous availability storage
considerations.
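The VPC naming rule can be captured in a few lines. This sketch is illustrative only; the exact SRP name format is an assumption based on the GOLD example above:

    def srp_name(virtual_pool, site="SITEA", suffix="CA_Enabled"):
        if "_VPC-" in virtual_pool:
            service_level = virtual_pool.split("_VPC-")[0]   # 'GOLD_VPC-000001' -> 'GOLD'
        else:
            service_level = virtual_pool                     # non-VPC pools keep their own name
        return f"{site}_{service_level}_{suffix}"

    print(srp_name("GOLD_VPC-000001"))   # SITEA_GOLD_CA_Enabled
    print(srp_name("GOLD_VPC-000002"))   # SITEA_GOLD_CA_Enabled (same SRP, different array)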
Storage provisioning
VPLEX distributed storage is provisioned to the Workload vSphere clusters in the
environment using the Federation Enterprise Hybrid Cloud catalog item named Provision
Cloud Storage.
As shown in Figure 24, these VPLEX volumes can be backed by VMAX, VNX, or XtremIO
arrays.
Note: The Federation recommends that you follow the best practice guidelines when deploying
any of the supported platform technologies. The Federation Enterprise Hybrid Cloud does not
require any variation from these best practices.
The workflow interacts with both ViPR and vRealize Automation to create the storage,
presents it to the chosen vSphere cluster, and adds the new volume to the relevant vRealize
storage reservation policy.
As with the single-site topology, vSphere clusters are made eligible for storage provisioning
by ‘tagging’ them with vRealize Automation custom properties. However, in this case they
are defined as CA Enabled clusters, that is, they are part of a vMSC that spans both sites in
the environment. This tagging is done during the installation and preparation of vSphere
clusters for use by the Federation Enterprise Hybrid Cloud using the CA Cluster Onboarding
workflow provided as part of the Federation Enterprise Hybrid Cloud self-service catalog.
As local-only vSphere clusters can also be present in CA topology, the Provision Cloud
Storage catalog item will automatically present only ViPR VPLEX distributed virtual storage
pools to provision from when you attempt to provision to a CA-enabled vSphere cluster.
Standard dual-site/single vCenter topology
This model provides no resilience/recovery for the cloud management platform. To enable
this you should use the CA dual-site/single vCenter variant.
CA dual-site/single vCenter topology
As all of the management pods reside on vMSC, management components are recovered
through vSphere HA mechanisms. Assuming the VPLEX Witness has been deployed in a third
fault domain, this should happen automatically.
Dual-site/single vCenter topology backup
The primary option for backup in a dual-site/single vCenter topology is the Redundant Avamar/single vCenter configuration, though the Standard Avamar configuration may also be
used if backup is only required on one of the two sites. Both options are described in
Chapter 7.
Ecosystem interactions
Figure 26 shows how the concepts in this chapter interact in a CA dual-site/single vCenter
configuration. Data protection concepts from Chapter 7 are also included.
Figure 26. CA dual-site/single vCenter ecosystem
Chapter 6: Dual-Site/Dual vCenter Topology
This chapter presents the following topics:
Overview ..........................................................................................................59
Standard dual-site/dual vCenter topology .............................................................59
Disaster recovery dual-site/dual vCenter topology ..................................................60
Disaster recovery network considerations .............................................................61
vCenter Site Recovery Manager considerations ......................................................69
vRealize Automation considerations ......................................................................72
Disaster recovery storage considerations ..............................................................73
Recovery of cloud management platform ..............................................................74
Best practices ....................................................................................................75
Backup in dual-site/dual vCenter topology ............................................................75
DR dual-site/dual vCenter ecosystem ...................................................................76
This chapter describes networking and storage considerations for a dual-site/dual vCenter
topology in the Federation Enterprise Hybrid Cloud solution.
When to use the dual-site/dual vCenter topology

The dual-site/dual vCenter Federation Enterprise Hybrid Cloud topology may be used in either of the following scenarios.
Standard dual-site/dual vCenter topology
Two sites are present that require management via independent vCenter instances and a
single Federation Enterprise Hybrid Cloud management stack/portal.
Each site must have its own storage and networking resources; otherwise, this model has no additional considerations per site beyond those listed in the single-site/single vCenter model. This is because each site has totally independent infrastructure resources with independent vCenters, but is managed by the same Federation Enterprise Hybrid Cloud management platform/portal.
Note: In this case, the scope of the term ‘site’ is at the users’ discretion. This can be separate
individual geographical locations, or independent islands of infrastructure in the same
geographical location, such as independent VxBlock platforms.
Disaster recovery dual-site/dual vCenter topology
Disaster recovery (restart of virtual machines on another site through the use of VMware
Site Recovery Manager) is required. This topology also requires that EMC RecoverPoint is
available.
Note: Typically this model is used when the latency between the two physical data center
locations exceeds the required latency for the use of vSphere metro storage clusters using
VPLEX distributed storage (10 ms).
The standard dual-site/dual vCenter Federation Enterprise Hybrid Cloud architecture controls
two sites, each with independent islands of infrastructure, each using its own vCenter
instance but controlled by a single Federation Enterprise Hybrid Cloud management
platform/portal.
This architecture provides a mechanism to extend an existing Federation Enterprise Hybrid
Cloud by adding additional independent infrastructure resources to an existing cloud, when
resilience of the management platform itself is not required, but where the resources being
added either already belong to an existing vCenter or it is desirable for them to do so.
Figure 27 shows the architecture used for this topology option.
Figure 27. Federation Enterprise Hybrid Cloud standard dual-site/dual vCenter architecture
The DR dual-site/dual vCenter topology for the Federation Enterprise Hybrid Cloud solution
provides protection and restart capability for workloads deployed to the cloud. Management
and workload virtual machines are placed on storage protected by RecoverPoint and are
managed from VMware vCenter Site Recovery Manager™.
This topology allows for multi-site resilience across two sites with DR protection for both the
management platform and workload virtual machines on the surviving site. Figure 28 shows
the overall architecture of the solution.
Figure 28. Federation Enterprise Hybrid Cloud DR dual-site/dual vCenter architecture

Physical network design
The Federation Enterprise Hybrid Cloud solution deploys a highly resilient and fault-tolerant
network architecture for intra-site network, compute, and storage networking. To achieve
this, it uses features such as redundant hardware components, multiple link aggregation
technologies, dynamic routing protocols, and high availability deployment of logical
networking components. The DR dual-site/dual vCenter topology of the Federation
Enterprise Hybrid Cloud solution requires network connectivity across two sites using WAN
technologies. It maintains the resiliency of the Federation Enterprise Hybrid Cloud by
implementing a similarly high-availability and fault tolerant network design with redundant
links and dynamic routing protocols. The high-availability features of the solution, which can
minimize downtime and service interruption, address any component-level failure within the
site.
Throughput and latency requirements are other important aspects of physical network
design. To determine these requirements, consider carefully both the size of the workload
and data that must be replicated between sites and the requisite RPOs and RTOs for your
applications. Traffic engineering and QoS capabilities can be used to guarantee the
throughput and latency requirements of data replication.
Requirements based on the management model
The DR dual-site/dual vCenter topology is supported on all Federation Enterprise Hybrid
Cloud management models. The Automation Pod components must be on a different Layer 3
network to the Core and NEI Pod components so that they can be failed over using VMware
Site Recovery Manager, and the Automation network re-converged without affecting the
Core and NEI Pod components on the source site.
Supported virtual networking technologies
The Federation Enterprise Hybrid Cloud supports the following virtual networking
technologies in the dual-site/dual vCenter topology:
 VMware NSX (recommended)
 VMware vSphere Distributed Switch backed by non-NSX technologies for network re-convergence
Supported VMware NSX features
When using VMware NSX in a dual-site/dual vCenter architecture, the Federation Enterprise
Hybrid Cloud supports many NSX features, including but not limited to the following:
 Micro-segmentation
 Use of NSX security policies and groups
Unsupported VMware NSX features
The following VMware NSX features are not supported in the dual-site/dual vCenter topology:
 Inter-site protection of dynamically provisioned VMware NSX networking components
 NSX security tags
Note: NSX security tags are not honored on failover as part of the out-of-the-box solution, but
may be implemented as a professional services engagement.
NSX best practices
In a DR dual-site/dual vCenter topology, when NSX is used, NSX Controllers reside on each
site’s corresponding NEI Pod. NSX best practice recommends that each controller is placed
on a separate physical host.
NSX creates Edge Services Gateways (ESGs) and Distributed Logical Routers (DLRs). Best
practice for ESGs and DLRs recommends that they are deployed in HA pairs, and that the
ESGs and DLRs are separated onto different physical hosts.
Combining the above best practices means that a minimum of four physical hosts per site
(eight hosts in total) are required to support the NEI pod function when NSX is used.
VMware anti-affinity rules should be used to ensure that the following conditions are true
during normal operating conditions (a scripted sketch follows this list):
 NSX controllers reside on different hosts.
 NSX ESGs reside on different hosts.
 NSX DLR Control virtual machines reside on different hosts.
 NSX ESG and DLR Control virtual machines do not reside on the same physical hosts.
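The following is a minimal sketch of how one such VM-VM anti-affinity rule could be created programmatically with pyVmomi. It is illustrative only: the cluster and controller names are hypothetical, the find_obj() helper is assumed, and in practice these rules are typically configured through the vSphere Web Client.

# Minimal sketch (assumptions: pyVmomi installed, an authenticated service
# instance 'si', and a helper find_obj() that resolves inventory objects by name).
from pyVmomi import vim

def create_anti_affinity_rule(cluster, vms, rule_name):
    """Add a VM-VM anti-affinity rule that keeps the given VMs on separate hosts."""
    rule = vim.cluster.AntiAffinityRuleSpec(name=rule_name, enabled=True, vm=vms)
    rule_spec = vim.cluster.RuleSpec(info=rule, operation="add")
    cluster_spec = vim.cluster.ConfigSpecEx(rulesSpec=[rule_spec])
    # modify=True merges this rule into the existing cluster configuration.
    return cluster.ReconfigureComputeResource_Task(spec=cluster_spec, modify=True)

# Hypothetical usage: keep the three NSX Controllers on the Site A NEI Pod apart.
# nei_cluster = find_obj(si, vim.ClusterComputeResource, "NEI-Pod-SiteA")
# controllers = [find_obj(si, vim.VirtualMachine, name) for name in
#                ("NSX_Controller_1", "NSX_Controller_2", "NSX_Controller_3")]
# create_anti_affinity_rule(nei_cluster, controllers, "nsx-controller-anti-affinity")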
When using the Federation Enterprise Hybrid Cloud Sizing tool, appropriate consideration
should be given to the choice of server specification for the NEI Pod to ensure efficient use
of hardware resources, as the tool will enforce the four-server-per-site minimum when NSX
is chosen. Figure 29 shows how the various NSX components are deployed independently on
both sites within the topology.
Figure 29. NEI Pods from the cloud vCenter Server instances on Site A and Site B
Perimeter NSX Edge
When used with VMware NSX, the Federation Enterprise Hybrid Cloud solution provides
multitier security support and security policy enforcement by deploying NSX Edges as
perimeter firewalls. An NSX Edge can be deployed at different tiers to support tiered security
policy control. Each site's NSX Manager deploys corresponding NSX Edge Services Gateways
(ESGs) configured for services such as firewall, DHCP, NAT, VPN, and SSL-VPN.
Logical switches
When used with VMware NSX, the Federation Enterprise Hybrid Cloud solution provides
logical networking support through NSX logical switches that correspond to VXLAN
segments. These logical switches support the extension of Layer 2 connections between
various virtual machines and other networking components such as NSX Edges and logical
routers. The use of VXLAN also increases the scalability of the solution.
For the DR dual-site/dual vCenter topology of the Federation Enterprise Hybrid Cloud
solution, transit logical switches are required on both sites to provide connections between
the DLRs and NSX Edges, as shown in Figure 30 and Figure 31. Duplicate logical switches are
also needed on both sites for use by the workload virtual machines.
Figure 30. Logical switches on Site A
Figure 31. Logical switches on Site B
Distributed logical router
When VMware NSX is used with the DR dual-site/dual vCenter topology, the NSX network
elements of logical switches, DLRs, and ESGs must be in place before configuring DR-protected blueprints to enable DR-protected workload provisioning.
The DLRs perform east-west routing between the NSX logical switches. Additionally, the DLR
can provide gateway services such as NAT for the virtual machines connected to the
pre-provisioned application, with the ESG performing north-south routing between the DLR
and the physical core network.
Note: These network elements must be created either using the NSX UI or by direct API calls.
When these elements are in place, vRealize Automation blueprints can be configured to
connect a machine's network adapter to their respective logical switch.
The DLR control virtual machine is deployed on the NEI Pod in high-availability mode. In this
mode, two virtual machines are deployed on separate hosts as an active/passive pair. The
active/passive pair maintains state tables and verifies each other's availability through
heartbeats. When a failure of the active DLR is detected, the passive DLR immediately takes
over and maintains the connection state and workload availability.
A DLR kernel module is deployed to each NSX-enabled Workload Pod host to provide
east/west traffic capability and broadcast reduction.
To provide default gateway services on both sites, a corresponding DLR must be deployed
on both sites, as shown in Figure 32.
Figure 32. DLR interfaces on Site A and Site B
IP mobility between the primary and recovery sites
The Federation Enterprise Hybrid Cloud solution supports migration of virtual machines to a
recovery site without the need to change the IP addresses of the virtual machines. It does
this by fully automating network re-convergence of tenant resource pods during disaster
recovery failover when using VMware NSX only.
Non-NSX network technology requirements
The use of vSphere Distributed Switch backed by other non-NSX networking technologies is
permitted, but requires that the chosen technology supports IP mobility.
Additionally, it requires that network re-convergence for both the Automation Pod and
tenant resource pods is carried out manually in accordance with the chosen network
technology, or that automation of that network re-convergence is developed as a
professional services engagement.
Maintenance of the alternative network convergence strategy is outside the scope of
Federation Enterprise Hybrid Cloud support.
VMware NSX-based IP mobility
Default gateways on each site are created using DLRs. By configuring the DLRs on both sites
identically, the same IP addresses and IP subnets are assigned to their corresponding
network interfaces, as shown in Figure 32. In this way, there is no need to reconfigure
workload default gateway settings in a recovery scenario.
A dynamic routing protocol is configured for the logical networking and is integrated with the
physical networking to support dynamic network convergence and IP mobility for the
networks (subnets) supported for DR. This approach simplifies the solution and eliminates
the need to deploy additional services to support IP address changes.
As shown in Figure 33, route distribution requires that an IP prefix be created for each
logical switch that is connected to the DLR.
Note: For DR protection the IP Prefix must be configured with the same name and IP network
values on both primary and recovery DLRs.
Figure 33. Route redistribution policy on Site A and Site B
A route redistribution policy is configured by adding one or more prefixes to the Route
Distribution table, so that logical switch networks defined in the prefix list can be
redistributed to the dynamic routing protocol on the primary site DLR where the virtual
machines are deployed and running. The route redistribution policy on the recovery site DLR
is configured to deny redistribution of networks connected to the recovery site, as shown in
Figure 33.
In the event of a disaster or a planned migration, you should execute a recovery plan in
VMware Site Recovery Manager. After the virtual machines are powered off, the Federation
Enterprise Hybrid Cloud network convergence scripts automatically (when using VMware
NSX) determine the networks relevant to the cluster being failed over, and modify the action
settings of those networks on the primary site DLR to deny redistribution of the networks
associated with the cluster being failed over.
Note: Only the protected networks contained in the specific Site Recovery Manager protection
plan being executed will be set to ‘Deny’.
A subsequent recovery step uses the same network convergence scripts to modify the route
redistribution policy on the recovery site DLR to permit redistribution of the corresponding
recovery site networks before powering on the virtual machines. This dynamic network
convergence ensures that the virtual machines can reach infrastructure services, such as
Domain Name System (DNS) and Microsoft Active Directory on the recovery site, and
reduces the recovery time.
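To illustrate the type of change those network convergence scripts make, the following is a minimal sketch that flips the action of a redistribution rule on a DLR through the NSX for vSphere REST API. It is a sketch only: the NSX Manager address, credentials, DLR edge ID, prefix name, and the exact endpoint and XML layout are assumptions based on typical NSX-v routing configuration calls, not the solution's actual scripts.

# Minimal sketch, assuming NSX-v Manager REST access with basic authentication.
# Endpoint path, XML element names, edge ID, and prefix name are illustrative assumptions.
import requests
import xml.etree.ElementTree as ET

NSX_MGR = "https://nsx-manager.example.local"   # hypothetical NSX Manager
AUTH = ("admin", "password")                    # placeholder credentials
EDGE_ID = "edge-10"                             # hypothetical DLR edge ID

def set_redistribution_action(prefix_name, action):
    """Set the redistribution rule tied to prefix_name to 'permit' or 'deny'."""
    url = f"{NSX_MGR}/api/4.0/edges/{EDGE_ID}/routing/config"
    resp = requests.get(url, auth=AUTH, verify=False)
    resp.raise_for_status()
    config = ET.fromstring(resp.content)
    # Walk every redistribution rule and flip the one that references our IP prefix.
    for rule in config.iter("rule"):
        prefix = rule.find("prefixName")
        action_el = rule.find("action")
        if prefix is not None and action_el is not None and prefix.text == prefix_name:
            action_el.text = action
    # Push the modified routing configuration back to the DLR.
    put = requests.put(url, data=ET.tostring(config), auth=AUTH,
                       headers={"Content-Type": "application/xml"}, verify=False)
    put.raise_for_status()

# Example: deny the failed-over network on the primary site DLR, then permit the
# same prefix on the recovery site DLR before the virtual machines are powered on.
# set_redistribution_action("LS-App01-Prefix", "deny")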
You can implement an additional level of routing control from a site to the WAN peering
point to ensure that only appropriate networks are advertised. To enable network failover
with the same IP subnet on both sites, a network can be active only on the primary site or
the recovery site. To support this, the unit of failover for a network is restricted to a single
compute cluster. All virtual machines on a compute cluster can fail over to the recovery site
without affecting virtual machines running on other compute clusters.
If the network spans multiple clusters, the administrator must configure the recovery plan to
ensure that all virtual machines on the same network are active only on one site.
Supported routing protocols
The Federation Enterprise Hybrid Cloud has validated network designs using both OSPF and
BGP in disaster recovery environments; BGP is recommended over OSPF.
VMware NSX-based security design
This section describes the additional multitier security services available to virtual machines
deployed in the Federation Enterprise Hybrid Cloud solution when used with VMware NSX.
NSX security policies
NSX security policies use security groups to simplify security policy management. A security
group is a collection of objects, such as virtual machines, to which a security policy can be
applied. To enable this capability, the machines contained in the multi-machine blueprint
must be configured with one or more security groups. A network security administrator or
application security administrator configures the security policies to secure application traffic
according to business requirements.
To ensure consistent security policy enforcement for virtual machines on the recovery site,
you must configure the security policies on both the primary and recovery sites.
NSX perimeter Edge security
Perimeter edges are deployed using NSX Edges on both the primary and recovery sites. The
perimeter NSX Edge provides security features, such as stateful firewalls, and other services
such as DHCP, NAT, VPN, and load balancer.
The configuration of various services must be manually maintained on both the primary and
recovery site perimeter edges. This ensures consistent security policy enforcement in case
of DR or planned migration of virtual machines to the recovery site.
NSX distributed firewall
The Federation Enterprise Hybrid Cloud solution supports the distributed firewall capability
of NSX to protect virtual machine communication and optimize traffic flow.
The distributed firewall is configured through the Networking and Security -> Service
Composer -> Security Groups section of the vSphere Web Client. Figure 34 shows various
security groups that may be pre-created in the NSX security configuration.
Figure 34. Security groups on the primary and recovery sites
The Federation Enterprise Hybrid Cloud solution provides an option to associate security
group information with a machine blueprint. When a business user deploys the blueprint, the
virtual machine is included in the security group configuration. This ensures enforcement of
the applicable security policy as soon as the virtual machine is deployed.
As shown in Figure 35, a corresponding security group of the same name must be created
on the recovery site. To ensure that workloads are consistently protected after failover, both
primary and recovery site security policies must be identically configured.
Figure 35. Security group on the recovery site
Overview
The DR dual-site/dual vCenter topology of the Federation Enterprise Hybrid Cloud solution incorporates storage replication
using RecoverPoint, storage provisioning using ViPR, and integration with Site Recovery
Manager to support DR services for applications and virtual machines deployed in the hybrid
cloud. Site Recovery Manager natively integrates with vCenter and NSX to support DR,
planned migration, and recovery plan testing.
RecoverPoint and ViPR Storage Replication Adapters
Site Recovery Manager integrates with EMC RecoverPoint storage replication and ViPR
automated storage services via EMC Storage Replication Adapters (SRAs). The SRAs control
the EMC RecoverPoint replication process. The EMC RecoverPoint SRA controls the
Automation Pod datastores. The ViPR SRA controls protected Workload Pod datastores.
Site mappings
To support DR services, the Site Recovery Manager configuration must include resource
mappings between the vCenter Server instance on the protected site and the vCenter Server
instance on the recovery site. The mappings enable the administrator to define automated
recovery plans for failing over application workloads between the sites according to defined
RTOs and RPOs. The resources you need to map include resource pools, virtual machine
folders, networks, and the placeholder datastore. The settings must be configured on both
the protected and recovery sites to support application workload recovery between the two
sites.
Resource pool mappings
A Site Recovery Manager resource pool specifies the compute cluster, host, or resource pool
that is running a protected application. Resource pools must be mapped between the
protected site and the recovery site in both directions so that, when an application fails
over, the application can then run on the mapped compute resources on the recovery site.
Folder mappings
When virtual machines are deployed using the Federation Enterprise Hybrid Cloud solution,
the virtual machines are placed in particular folders in the vCenter Server inventory to
simplify administration. By default, virtual machines are deployed in a folder named VRM.
This folder must be mapped between the protected and recovery sites in both directions.
When used with Federation Enterprise Hybrid Cloud backup services, the folders used by
backup as a service are automatically created in both vCenters and mapped in Site Recovery
Manager.
Network mappings
Virtual machines may be configured to connect to different networks when deployed.
Applications deployed with DR support must be deployed on networks that have been
configured as defined in the Disaster recovery network considerations section. The networks
must be mapped in Site Recovery Manager between the protected and recovery sites in both
directions. For testing recovery plans, you should deploy a test network and use test
network mappings when you create the recovery plan.
Note: A Layer 3 network must be failed over entirely. Active machines in a given Layer 3
network must reside only in the site with the "permit" route redistribution policy.
Placeholder datastore
For every protected virtual machine, Site Recovery Manager creates a placeholder virtual
machine on the recovery site. The placeholder virtual machine retains the virtual machine
properties specified by the global inventory mappings or specified during protection of the
individual virtual machine.
A placeholder datastore must be accessible to the compute clusters that support the DR
services. The placeholder datastore must be configured in Site Recovery Manager and must
be associated with the compute clusters.
Disaster recovery support for Automation Pod vApps
The Federation Enterprise Hybrid Cloud uses several components that are deployed as
vSphere vApps. Currently, this list includes:
 EMC ViPR Controller
 EMC ViPR SRM
Site Recovery Manager protects virtual machines, but does not preserve the vApp structure
required for EMC ViPR Controller and EMC ViPR SRM virtual machines to function.
The high-level steps to achieve recovery of vApps are:
1. Deploy the vApp identically in both sites.
2. Vacate the vApp on the recovery site (delete the virtual machines, but retain the virtual machine container).
3. Protect the vApp on the protected site through Site Recovery Manager, mapping the vApp containers from both sites.
4. Reapply virtual machine vApp settings on placeholder virtual machines.
For additional details on the process, or if other vApps in the environment require
protection, see the VMware Knowledge Base topic vCenter Operations Manager 5.0.x: Using
Site Recovery Manager to Protect a vApp Deployment.
Protection groups
A protection group is the unit of failover in Site Recovery Manager. The Federation
Enterprise Hybrid Cloud solution supports failover at the granularity of the Workload Pod.
In the context of the DR dual-site/dual vCenter topology, two Workload Pods are assigned to
a DR pair, where one pod is the primary and is considered the protected cluster, and the
second pod is the alternate site and is considered the recovery cluster. All protection groups
associated with a DR pair and all the virtual machines running on a particular pod must fail
over together.
For the DR dual-site/dual vCenter topology there is a 1:1 mapping between a DR pair and a
recovery plan, and each recovery plan will contain one or more protection groups.
Each protection group contains a single replicated vSphere datastore, and all the virtual
machines that are running on that datastore. When you deploy new virtual machines on a
Workload Pod using vRealize Automation, those virtual machines are automatically added to
the corresponding protection group and fail over with that protection group.
Recovery plans
Recovery plans enable administrators to automate the steps required for recovery between
the primary and recovery sites. A recovery plan may include one or more protection groups.
You can test recovery plans to ensure that protected virtual machines recover correctly to
the recovery site.
Tenant Pod recovery plans
The automated network re-convergence capabilities of this DR topology for the Federation
Enterprise Hybrid Cloud solution eliminate the need to change the IP addresses of workload
virtual machines when they fail over from one site to the other. Instead, the tenant networks
move with the virtual machines and support virtual machine communication outside the
network when on the recovery site.
When using VMware NSX, the Federation Enterprise Hybrid Cloud can automate network
re-convergence of the tenant Workload Pods via a custom step in the Site Recovery Manager
recovery plan, ensuring security policy compliance on the recovery site during a real failover.
However, running a test Site Recovery Manager recovery plan with VMware NSX does not
affect the production virtual machines, because the network convergence automation step
has the built-in intelligence to know that the networks should not be re-converged in that
scenario.
If non-NSX alternatives are used, then this network re-convergence is not automated, and
therefore needs to be done manually during a pause in the Site Recovery Manager recovery
plan, or via an automated Site Recovery Manager task created as part of a professional
services engagement.
Note: A recovery plan must be manually created for each DR-enabled cluster before any
STaaS operations are executed: two per DR pair, to enable both failover and failback.
Automation Pod recovery plans
Network re-convergence of the network supporting the Federation Enterprise Hybrid Cloud
Automation Pod is a manual task irrespective of the presence of VMware NSX.
Note: This reflects the out-of-the-box solution experience. Automated network re-convergence
for the Automation Pod can be achieved via a professional services engagement.
Collapsed management model
When configuring the protection group and recovery plans for the Automation Pod
components under a collapsed management model, you must exclude all Core and NEI Pod
components from the configurations. This is to ensure that the system does not attempt to fail
over the Core and NEI components from one site to the other.
Configuring primary and recovery site endpoints
The Federation Enterprise Hybrid Cloud solution uses vRealize Automation to provide
automated provisioning and management of cloud resources such as storage and virtual
machines. To support DR services for cloud resources, you must configure vRealize
Automation with two vCenter endpoints.
The first endpoint is configured to support IaaS services for the first site; this endpoint uses
the vCenter Server instance where the storage and virtual machines for the first site are
deployed. The second endpoint is configured to serve as the recovery site for the resources
of the first site.
If required, workloads can also be configured to run in the secondary site, with recovery in
the first site by configuring multiple DR cluster pairs with a protected cluster in each site,
and a corresponding recovery cluster on the other site.
To configure each endpoint, a separate vCenter agent must be installed on the IaaS server
that is running vRealize Automation.
Configuring the infrastructure for disaster recovery services
The vRealize Automation IaaS administrator must assign the compute resources for the
Workload Pods, on both the protected and recovery sites, to the fabric administrator for
allocation to business groups.
In the dual-site/dual vCenter DR configuration, you must designate Workload Pods (clusters)
as DR-enabled; all workloads deployed to those clusters will be DR-protected. If you have
additional workloads that do not require DR support, then additional local (unprotected)
Workload Pods should be provisioned to accommodate them.
When replicated storage is provisioned to a protected Workload Pod, the fabric administrator
must update the reservation policies for the relevant business groups to allocate the newly
provisioned storage.
Federation Enterprise Hybrid Cloud STaaS workflows automatically add newly provisioned
storage to the appropriate protection group. This ensures that the virtual machines deployed
on the storage are automatically protected and are included in the recovery plans defined
for the Workload Pod.
Configuring application blueprints for disaster recovery
Storage reservation policies are used to deploy virtual machine disks to a datastore that
provides the required RPO. The vRealize Automation IaaS administrator must create storage
reservation policies to reflect the RPOs of different datastores. The fabric administrator must
then assign the policies to the appropriate datastores of the compute clusters.
Business Group administrators can configure the blueprints for virtual machines so that
business users can select an appropriate storage reservation policy when deploying an
application. The business user requests a catalog item in the Federation Enterprise Hybrid
Cloud tenant portal, selects storage for the virtual machines, and assigns an appropriate
storage reservation policy for the virtual machine disks based on the required RPO. The
choice made at this point also dictates whether the virtual machine will be DR protected or
not.
The virtual machine disks are then placed on datastores that support the required RPO. The
virtual machines are automatically deployed with the selected DR protection service and
associated security policy for both the primary and recovery sites.
ViPR-managed Workload Pod storage
For the Workload Pods, ViPR SRA manages the protection of ViPR-provisioned storage.
ViPR SRA provides an interface between Site Recovery Manager and ViPR Controller. ViPR
Controller, which is part of the Automation Pod, must be running and accessible before the
ViPR SRA can instruct ViPR to control the EMC RecoverPoint replication functions.
This means that the Automation Pod and ViPR vApp must be functioning before Site
Recovery Manager can execute a recovery of the Workload Pods.
Storage at each site
The Core and NEI clusters on each site require site-specific storage that does not need to be
protected by EMC RecoverPoint. Site Recovery Manager also requires site-specific datastores
on each site to contain the placeholder virtual machines for the tenant and automation pods.
The Automation Pod storage must be distinct from the Core and NEI storage and protected
by EMC RecoverPoint.
ViPR virtual arrays
There must be at least one virtual array for each site. By configuring the virtual arrays in
this way, ViPR can discover the EMC RecoverPoint and storage topology. You should
carefully plan and perform this step because it is not possible to change the configuration
after resources have been provisioned, without first disruptively removing the provisioned
volumes.
ViPR virtual pools
When you specify EMC RecoverPoint as the protection option for a virtual pool, the ViPR
storage provisioning services create the source and target volumes and the source and
target journal volumes, as shown in Figure 36.
Figure 36. ViPR/EMC RecoverPoint protected virtual pool
Each DR-protected/recovery cluster pair has storage that replicates (under normal
conditions) in a given direction, for example, from Site A to Site B. To allow active/active
site configuration, additional DR cluster pairs should be configured whose storage replicates
in the opposite direction. You must create two sets of datastores – one set that will replicate
from Site A and another set that will replicate from Site B. To enable this, you need to
configure an environment similar to Figure 36 for Site A, and the inverse of it for Site B
(where the protected source pool is Site B, and local target pool is on Site A).
RecoverPoint journal considerations
Every RecoverPoint-protected LUN requires access to a journal LUN to maintain the history
of disk writes to the LUN. The performance of the journal LUN is critical to the overall
performance of the system attached to the RecoverPoint-protected LUN, and therefore its
performance capability should be in line with the expected performance needs of that
system.
By default, ViPR uses the same virtual pool for both the target and the journal LUN for a
RecoverPoint copy, but it does allow you to specify a separate or dedicated pool. In both
cases, the virtual pool and its supporting physical pools should be sized to provide adequate
performance.
Storage provisioning
EMC RecoverPoint protected storage is provisioned to the Workload vSphere clusters in the
environment using the catalog item named Provision Cloud Storage.
Note: The Federation recommends that you follow the best practice guidelines when deploying
any of the supported platform technologies. The Federation Enterprise Hybrid Cloud does not
require any variation from these best practices.
The workflow interacts with both ViPR and vRealize Automation to create the storage,
present it to the chosen vSphere cluster and add the new volume to the relevant vRealize
storage reservation policy.
As with the single-site topology, vSphere clusters are made eligible for storage provisioning
by tagging them with vRealize Automation custom properties. However, in this case they are
defined as DR-enabled clusters, that is, they are part of a Site Recovery Manager
configuration that maps protected clusters to recovery clusters. This tagging is done during
the installation and preparation of vSphere clusters for use by the Federation Enterprise
Hybrid Cloud using the DR Cluster Onboarding workflow provided as part of the Federation
Enterprise Hybrid Cloud self-service catalog.
As local-only vSphere clusters can also be present in a DR dual-site/dual vCenter topology,
when you attempt to provision to a DR-enabled vSphere cluster, the Provision Cloud
Storage catalog item will automatically present only EMC RecoverPoint-protected virtual
storage pools to provision from.
Standard dual-site/dual vCenter topology
This model provides no resilience or recovery for the cloud management platform. To enable
this capability, use the DR dual-site/dual vCenter variant.
DR dual-site/dual vCenter topology
In the DR dual-site/dual vCenter topology, EMC RecoverPoint and Site Recovery Manager
protect the Automation Pod. This allows for recovery between Site A and Site B in planned
and unplanned recovery scenarios. EMC RecoverPoint SRA for Site Recovery Manager is
used to interact with EMC RecoverPoint during a failover of the Automation Pod’s resources.
The Core and NEI Pods (when NSX is used) are created manually on both sites to mirror
functionality such as NSX dynamic routing, NSX security groups, NSX security policies
(firewall rules) and to host the Site Recovery Manager servers. As a result, there is no need
to protect them using EMC RecoverPoint or Site Recovery Manager.
In a distributed management model, this is accomplished by excluding the Core and NEI
Pods from the process of creating associated datastore replications, protection groups, and
recovery plans for the vSphere ESXi clusters hosting those functions.
In a collapsed management model, all components are on the same vSphere ESXi cluster,
so the Core and NEI components must be excluded from Site Recovery Manager recovery
plans and protection groups for that cluster. Despite residing on the same vSphere cluster,
the Automation Pod components should be on a distinct network and a distinct set of
datastores, so that they can be failed over between sites without affecting Core or NEI
components.
Tenant workload networks are automatically re-converged to the recovery site by the
Federation Enterprise Hybrid Cloud solution when used with VMware NSX. When non-NSX
alternatives are used, tenant network re-convergence is not automated by the Federation
Enterprise Hybrid Cloud. Automation Pod network re-convergence is a manual step with or
without the presence of VMware NSX.
The vCenter Server instances on each site manage the NEI, Automation, and Workload Pods
on their respective sites and act as the vSphere end-points for vRealize Automation. The
vCenter Server instances are integrated using Site Recovery Manager, which maintains
failover mappings for the networks, clusters, and folders between the two sites.
Naming conventions
VMware vCenter Site Recovery Manager protection groups
Protection group names must match the Workload Pod names—for example, if
SAComputePod2 is the name of Workload Pod 2 on Site A, then the Site Recovery Manager
protection group must also be named SAComputePod2. The solution relies on this
correspondence when performing several of the automation tasks necessary for successful
failover and subsequent virtual machine management through vRealize Automation.
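As a lightweight illustration of this naming convention, the following sketch simply checks a list of Site Recovery Manager protection group names against the Workload Pod names; the pod and group names shown are hypothetical.

# Minimal sketch: confirm every Workload Pod has an identically named
# Site Recovery Manager protection group (names below are hypothetical).
workload_pods = ["SAComputePod1", "SAComputePod2"]
protection_groups = ["SAComputePod1", "SAComputePod2"]

missing = [pod for pod in workload_pods if pod not in protection_groups]
if missing:
    print("Protection groups missing or misnamed for: " + ", ".join(missing))
else:
    print("All Workload Pods have matching protection group names.")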
VMware NSX security groups
Security group names must be the same on both sites.
VMware NSX security policies
Security policy names must be the same on both sites.
EMC ViPR virtual pools
ViPR virtual pool names must be meaningful because they are the default names for storage
reservation policies. For example, when creating Tier 1 DR protected storage with an RPO of
10 minutes, Tier 1 – DR Enabled – 10 Minute RPO is an appropriate name.
NSX logical networks
Each Workload Pod (compute cluster) must have its own transport zone. The NEI Pod must
be a member of each transport zone. If a transport zone spans multiple compute clusters,
the corresponding Site Recovery Manager protection groups must be associated with the
same Site Recovery Manager recovery plan.
The reason for this is that, when a transport zone spans multiple compute clusters, network
mobility from Site A to Site B affects the virtual machines deployed across these clusters;
therefore, the clusters must be failed over as a set.
DR dual-site/dual vCenter topology backup
The recommended option for backup in a DR dual-site/dual vCenter topology is the
Redundant Avamar/dual vCenter configuration. This option is described in detail in Chapter
7.
Ecosystem interactions
Figure 37 shows how the concepts in this chapter interact in a DR dual-site/dual vCenter
configuration. Data protection concepts from Chapter 7 are also included.
Figure 37. DR ecosystem
Chapter 7: Data Protection
This chapter presents the following topics:
Overview ..........................................................................................................78
Concepts...........................................................................................................79
Standard Avamar configuration............................................................................84
Redundant Avamar/single vCenter configuration ....................................................86
Redundant Avamar/dual vCenter configuration ......................................................90
This chapter discusses the considerations for implementing data protection, also known as
backup as a service (BaaS), in the context of the Federation Enterprise Hybrid Cloud.
Backup and recovery of a hybrid cloud is a complicated undertaking in which many factors
must be considered, including:
 Backup type and frequency
 Impact and interaction with replication
 Recoverability methods and requirements
 Retention periods
 Automation workflows
 Interface methods (workflows, APIs, GUI, CLI, scripts, and so on)
 Implementation in a CA or DR-enabled environment
VMware vRealize Orchestrator™, which is central to all of the customizations and operations
used in this solution, manages operations across several EMC and VMware products,
including:
 VMware vRealize Automation
 VMware vCenter
 EMC Avamar and EMC Data Protection Advisor™
This solution uses Avamar as the technology to protect your datasets. Using Avamar, this
backup solution includes the following characteristics:
 Abstracts and simplifies backup and restore operations for cloud users
 Uses VMware Storage APIs for Data Protection, which provides Changed Block Tracking for faster backup and restore operations
 Provides full image backups for running virtual machines
 Eliminates the need to manage backup agents for each virtual machine in most cases
 Minimizes network traffic by deduplicating and compressing data
Note: The Federation recommends that you engage an Avamar product specialist to design,
size, and implement a solution specific to your environment and business needs.
Scalable backup architecture
The Federation Enterprise Hybrid Cloud backup configurations provide scalable backup
through the ability to configure an array of Avamar instances. Federation Enterprise Hybrid
Cloud BaaS workflows automatically distribute the workload in a round-robin way across the
available Avamar instances, and provide a catalog item to enable additional Avamar
instances (up to a maximum of 15 Avamar replication pairs) to be added to the
configuration.
When new Avamar instances are added, new virtual machine workloads are automatically
assigned to those new instances until an equal number of virtual machines are assigned to
all Avamar instances in the environment. Once that target has been reached, virtual
machines are assigned in a round-robin way again.
The configuration of the Avamar instances is stored by the Federation Enterprise Hybrid
Cloud workflows for later reference when reconfiguring or adding instances.
Avamar replication pairs
An Avamar replication pair is defined as a relationship configured between two Avamar
instances, and is used by the Federation Enterprise Hybrid Cloud workflows to ensure
backup data is protected against the loss of a physical Avamar instance. Normally this is
used to ensure that data backed up on one site is available to restore on a secondary site,
but it could also be used to provide extra resilience on a single site if required.
The Federation Enterprise Hybrid Cloud provides two different redundant Avamar
configurations that use an array of Avamar replication pairs to achieve the same scalability
as the standard Avamar configuration but with the added resilience that every instance of
Avamar has a replication partner, to which it can replicate any backup sets that it receives.
Note: In the standard Avamar configuration, each instance is technically configured as the first
member of an Avamar replication pair. In this case, no redundancy exists, but it can be added
later by adding a second member to each replication pair.
To achieve this, the Federation Enterprise Hybrid Cloud uses the concepts of primary and
secondary Avamar instances within each replication pair, and the ability to reverse these
personalities so that, in the event of a failure, backup and restore operations can continue.
The primary Avamar instance is where all scheduled backups are executed. It is also the
instance that Federation Enterprise Hybrid Cloud on-demand backup and restore features
communicate with in response to dynamic user requests. The primary Avamar instance also
has all the currently active replication groups, making it responsible for replication of new
backup sets to the secondary Avamar instance.
The secondary Avamar instance has the same configurations for backup and replication
policies, except that BaaS workflows initially configure these policies in a disabled state. If
the primary Avamar instance becomes unavailable, the policies on the secondary Avamar
instance can be enabled via the Toggle Single Avamar Pair Designations catalog item to
enable backup and replication operations to continue.
Note: Replication operations do not catch up until the original primary Avamar instance (now
designated as secondary) becomes available again, at which time replication automatically
transmits newer backup sets to the secondary system.
In this solution, after a redundant Avamar configuration is enabled, the Federation
Enterprise Hybrid Cloud workflows configure all subsequent backups with replication
enabled. If one member of the Avamar replication pair is offline, backups taken to the
surviving member of the pair will automatically be replicated once the offline member is
brought back online.
How each Avamar instance in a replication pair operates varies based on which backup
topology is configured, and is described in the context of each individual topology later in
this chapter.
VMware vCenter folder structure and backup service level relationship
When a backup service level is created via the Create Backup Service Level vRealize
Automation catalog item, it creates an associated set of folders in the cloud vCenter (or both
cloud vCenters if done in a dual-site/dual vCenter environment). The number of folders
created depends on how many Avamar pairs are present, and these folders become part of
the mechanism for distributing the backup load.
Note: In a DR dual-site/dual vCenter environment, the Create a Backup Service Level
catalog item automatically creates Site Recovery Manager folder mappings between the new
folders created in the first cloud vCenter and their corresponding folders in the second vCenter.
Example
If you create a backup service level named Daily-7yr in your environment, and four Avamar
replication pairs (numbered 0 through 3) are present, then the following folders are created
in the relevant cloud vCenter servers:
 Daily-7yr-Pair0
 Daily-7yr-Pair1
 Daily-7yr-Pair2
 Daily-7yr-Pair3
When you assign a virtual machine to the Daily-7yr backup policy, the workflows use a
selection algorithm to determine the Avamar pair with least load, and then assign the virtual
machine to the associated folder. So if Avamar-Pair2 is determined to be the best target,
then the virtual machine is moved to the Daily-7yr-Pair2 vCenter folder and automatically
backed up by Avamar-Pair2 as a result.
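The following is a minimal sketch of that kind of least-loaded selection logic, including the Administrative Full exclusion described later in this chapter. It is illustrative only; the pair descriptions and field names are assumptions and this is not the solution's actual BaaS workflow code.

# Minimal sketch of least-loaded Avamar pair selection (illustrative only).
def select_backup_folder(service_level, pairs, cluster):
    """Return the vCenter folder name for the least-loaded eligible Avamar pair.

    Each entry in 'pairs' is assumed to look like:
    {"number": 2, "clusters": ["Cluster1"], "admin_full": False, "vm_count": 12}
    """
    eligible = [p for p in pairs
                if cluster in p["clusters"] and not p["admin_full"]]
    if not eligible:
        raise ValueError("No eligible Avamar pair for cluster " + cluster)
    # Fewest assigned VMs wins; ties break on the lowest pair number.
    chosen = min(eligible, key=lambda p: (p["vm_count"], p["number"]))
    return "{0}-Pair{1}".format(service_level, chosen["number"])

# Example: a VM on Cluster1 assigned to Daily-7yr might land in Daily-7yr-Pair2,
# and is then backed up by Avamar-Pair2 because that pair monitors the folder.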
How the Avamar instances are assigned to monitor and back up these folders differs
depending on which backup topology is deployed, and is described in the context of each
individual topology later in this chapter.
Avamar pair to vSphere cluster association
Avamar image-level backups work by mounting snapshots of VMDKs to Avamar proxy virtual
machines and then backing up the data to the Avamar instance that the Avamar proxy is
registered with.
In a fully deployed Federation Enterprise Hybrid Cloud with up to 10,000 user virtual
machines and hundreds of vSphere clusters, this could lead to Avamar proxy sprawl if not
properly configured and controlled.
To prevent this, the Federation Enterprise Hybrid Cloud associates vSphere clusters with a
subset of Avamar replication pairs. This means that a reduced number of Avamar proxy
virtual machines are required to service the cloud. Associations between a vSphere cluster
and an Avamar pair are created via the Federation Enterprise Hybrid Cloud BaaS Associate
Avamar Pairs with vSphere Cluster catalog item.
Note: In a DR dual-site/dual vCenter topology, when a protected cluster is associated with an
Avamar pair, the associated recovery cluster is automatically associated with the same Avamar
pair to ensure continuity of service on failover.
Avamar designations
In the redundant Avamar/single vCenter configuration, there are two Avamar instances in
each pair, and both are assigned to monitor the same vCenter folder and to back up any
virtual machines that folder contains.
To ensure that this does not result in both instances backing up the same virtual machine
and then replicating each backup (four copies in total), the Federation Enterprise Hybrid
Cloud uses primary and secondary Avamar instances within each replication pair, and the
ability to reverse these personalities so that, in the event of a failure, backup and restore
operations can continue.
The primary Avamar instance is where all scheduled backups are executed. It is also the
instance that Federation Enterprise Hybrid Cloud on-demand backup and restore features
communicate with in response to dynamic user requests. The primary Avamar instance also
has all the currently active replication groups, making it responsible for replication of new
backup sets to the secondary Avamar instance.
The secondary Avamar instance has the same configurations for backup and replication
policies, except that BaaS workflows initially configure these policies in a disabled state. If
the primary Avamar instance becomes unavailable, the policies on the secondary Avamar
instance can be enabled via the Toggle Single Avamar Pair Designations catalog item to
enable backup and replication operations to continue.
Note: Avamar designations are only relevant in the redundant Avamar/single vCenter topology,
because the standard Avamar configuration does not have replication, and in the redundant
Avamar/dual vCenter configuration each member of a pair is configured to monitor a folder from
only one of the two vCenters.
Avamar proxy server configuration
To associate an Avamar pair with a vSphere cluster, an Avamar proxy virtual machine needs
to be deployed to that cluster.
Standard Avamar configuration
In single-site topologies, all proxies are on the same site. Therefore, the minimum number
of proxy virtual machines required per Avamar pair for each cluster is one. Two is
recommended for high availability, if there is scope within the overall number of proxies that
can be deployed to the environment. Ideally, this number should be in the region of 60 to
80 proxies.
Redundant Avamar/single vCenter configuration
As the virtual machines on every vSphere cluster could be backed up by either of the
members of an Avamar replication pair at different points in time, proxies for both the
primary and secondary Avamar instances of every associated Avamar replicated pair should
be deployed to every vSphere cluster. This means a minimum of two proxies is required.
Four proxies would provide additional resilience, if the scope exists within the overall
number of proxies that can be deployed to the environment.
If the environment also includes CA, then the proxies for the Site A Avamar instances should
be bound to Site A by using DRS virtual machine and host groups with a
virtual-machines-to-hosts rule that requires those virtual machines to run on a host DRS
group containing the Site A hosts. Similarly, proxies for the Site B Avamar instances should
be bound to Site B hosts.
This ensures that no unnecessary cross-WAN backups occur, as Avamar can use vStorage
APIs for Data Protection to add VMDKs (from the local leg of the VPLEX volume) to proxy
virtual machines bound to physical hosts on the same site as the primary Avamar instance.
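As an illustration of that site binding, the following is a minimal pyVmomi sketch that creates a VM group, a host group, and a mandatory virtual-machines-to-hosts rule on a cluster. The group and rule names are hypothetical, and in practice these rules are typically created through the vSphere Web Client.

# Minimal sketch (assumptions: an existing pyVmomi connection, and 'cluster',
# 'site_a_proxies', and 'site_a_hosts' already resolved as inventory objects).
from pyVmomi import vim

def bind_proxies_to_site_a(cluster, site_a_proxies, site_a_hosts):
    """Pin the Site A Avamar proxy VMs to Site A hosts with a mandatory DRS rule."""
    vm_group = vim.cluster.VmGroup(name="SiteA-Avamar-Proxies", vm=site_a_proxies)
    host_group = vim.cluster.HostGroup(name="SiteA-Hosts", host=site_a_hosts)
    rule = vim.cluster.VmHostRuleInfo(
        name="avamar-proxies-must-run-on-site-a",
        enabled=True,
        mandatory=True,                      # "must run on hosts in group" semantics
        vmGroupName="SiteA-Avamar-Proxies",
        affineHostGroupName="SiteA-Hosts",
    )
    spec = vim.cluster.ConfigSpecEx(
        groupSpec=[vim.cluster.GroupSpec(info=vm_group, operation="add"),
                   vim.cluster.GroupSpec(info=host_group, operation="add")],
        rulesSpec=[vim.cluster.RuleSpec(info=rule, operation="add")],
    )
    return cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)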
Redundant Avamar/dual vCenter configuration
In a dual-site/dual vCenter configuration, each vSphere cluster must have an Avamar proxy
virtual machine for the local Avamar instance of every Avamar replicated pair associated
with it. This ensures backups are taken locally and replicated to the other member of the
Avamar pair.
In a dual-site/dual vCenter configuration with DR, when a failover occurs, virtual machines
will be moved from the vCenter folders on Site A to their corresponding vCenter folders on
Site B, at which point the other member of the Avamar replication pair will assume
responsibility for backing up and restoring those virtual machines.
Therefore, each vSphere cluster still only requires a minimum of one Avamar proxy for
every Avamar instance that is associated with it. Two will provide extra resilience.
Note: In this configuration, if a failure of a single Avamar instance occurs without the failure of
the vCenter infrastructure on the same site, then the second member of the Avamar replication
pair will not automatically assume responsibility to back up virtual machines. To further protect
against this scenario, additional resilience can be added on each site by using an Avamar RAIN
grid.
Avamar administratively full
Determining that a backup target, in this case an Avamar instance, has reached capacity
can be based on a number of metrics of the virtual machines it is responsible for protecting,
including:
 The number of virtual machines assigned to the instance
 The total capacity of those virtual machines
 The rate of change of the data of those virtual machines
 The effective deduplication ratio that can be achieved while backing up those virtual machines
 The available network bandwidth and backup window size
Because using these metrics can be somewhat subjective, the Federation Enterprise Hybrid
Cloud provides the ability for an administrator to preclude an Avamar instance or Avamar
replication pair from being assigned further workload by setting a binary Administrative
Full flag via the Set Avamar to Administrative Full vRealize Automation catalog item.
When a virtual machine is enabled for data protection via Federation Enterprise Hybrid Cloud
BaaS workflows, the available Avamar instances are assessed to determine the most
suitable target. If an Avamar instance or Avamar replication pair has had the
Administrative Full flag set, then that instance/pair is excluded from the selection algorithm
but continues to back up its existing workloads through on-demand or scheduled backups.
If workloads are retired and an Avamar instance or pair is determined to have free capacity
again, the Administrative Full flag can be toggled back, returning that instance or pair to
the selection algorithm.
Policy-based replication
Policy-based replication provides granular control of the replication process. With
policy-based replication, you create replication groups in Avamar Administrator to define the
following replication settings:
 Members of the replication group, which are either entire domains or individual clients
 Priority for the order in which backup data replicates
 Types of backups to replicate based on the retention setting for the backup or the date on which the backup occurred
 Maximum number of backups to replicate for each client
 Destination server for the replicated backups
 Schedule for replication
 Retention period of replicated backups on the destination server
The redundant Avamar configurations automatically create a replication group associated
with each backup policy and configure it with a 60-minute stagger relative to the backup
policy's schedule. This enables the backups to complete before the replication starts.
Note: This schedule can be manually altered within the Avamar GUI, but it is important that you
make changes to both the primary and secondary versions of the replication group schedule so
that replication operates as required if the Avamar personalities are reversed.
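As a simple worked example of that stagger, assuming a hypothetical backup policy scheduled to start at 20:00, the associated replication group would be scheduled to start at 21:00:

# Trivial worked example of the 60-minute replication stagger (times are hypothetical).
from datetime import datetime, timedelta

backup_start = datetime(2016, 2, 1, 20, 0)            # backup policy schedule
replication_start = backup_start + timedelta(minutes=60)
print(replication_start.strftime("%H:%M"))            # prints 21:00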
Replication control
If Data Domain is used as a backup target, Avamar is responsible for replication of Avamar
data from the source Data Domain system to the destination Data Domain system. As a
result, all configuration and monitoring of replication is done via the Avamar server. This
includes the schedule on which Avamar data is replicated between Data Domain units.
You cannot schedule replication of data on the Data Domain system separately from the
replication of data on the Avamar server. There is no way to track replication by using Data
Domain administration tools.
Note: Do not configure Data Domain replication to replicate data to another Data Domain
system that is configured for use with Avamar. When you use Data Domain replication, the
replicated data does not refer to the associated remote Avamar server.
Architecture
This section describes the features of the standard Avamar configuration shown in Figure 38
and the environments where it may be used.
Figure 38. Standard Avamar configuration architecture
Scenarios for use
Best use
The most logical fit for a standard Avamar configuration is a single-site Federation
Enterprise Hybrid Cloud deployment.
Alternate uses
The standard Avamar configuration can be used in topologies such as CA dual-site and DR
dual-site topologies with the following caveats:
 The architecture provides no resilience on the secondary site in either of the dual-site topologies. If the site that hosts the Avamar instances is lost, then there is no ability to restore from backup.
 In the CA dual-site/single vCenter topology, any virtual machines that reside on the site with no Avamar instances present will back up across the WAN connection.
 In the DR dual-site/dual vCenter topology, any virtual machines that reside on the recovery site (and therefore are registered with a different vCenter) have no ability to back up.
In the standard Avamar configuration, if the Create Backup Service Level workflow creates
a folder named Daily-7yr, and there are four Avamar replication pairs present, then it will
configure the following backup policies with the Avamar replication pairs:
 Avamar-Pair0: Assigned to monitor vCenter folder Daily-7yr-Pair0
 Avamar-Pair1: Assigned to monitor vCenter folder Daily-7yr-Pair1
 Avamar-Pair2: Assigned to monitor vCenter folder Daily-7yr-Pair2
 Avamar-Pair3: Assigned to monitor vCenter folder Daily-7yr-Pair3
In this case, each pair has only one member, and therefore only one Avamar instance is
monitoring each folder.
Characteristics
The characteristics of the standard Avamar configuration are:
 All Avamar instances are standalone, that is, backup sets are not replicated to a secondary Avamar system.
 It works in the context of a single cloud vCenter only.
 All Avamar instances contain active backup policies.
Note: An Avamar instance can be set to administratively full and still have active backup policies.
 All Avamar instances are considered to be on the same site, and therefore the round-robin distribution of virtual machines to vCenter folders includes all Avamar instances that:
   Are assigned to the vSphere cluster that the virtual machine is on.
   Are not set to Administratively Full.
Distribution examples
The following scenarios convey how virtual machines are assigned to vCenter folders to
distribute load evenly across Avamar instances, assuming the following configuration, as
shown in Figure 38:
 Four Avamar instances and two vSphere clusters exist
 AV_REP_PAIR0 and AV_REP_PAIR1 are assigned to Cluster 1
 AV_REP_PAIR2 and AV_REP_PAIR3 are assigned to Cluster 2
Note: In this example all virtual machines are deployed to the backup policy named Daily-7yr.
Scenario 1: VM1 is deployed to Cluster 1 - No other workload virtual machines
exist
 AV_REP_PAIR2 and AV_REP_PAIR3 are ruled out because they are not assigned to Cluster 1.
 AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.
 The expected results are:
   The virtual machine is deployed to Cluster 1.
   It is placed in a folder named Daily-7yr-Pair0, indicating assignment to AV_REP_PAIR0. AV_REP_PAIR1 is an equally viable candidate as both grids are empty, but AV_REP_PAIR0 is selected based on numerical order.
Scenario 2: VM2 is deployed to Cluster 1 - VM1 exists
 AV_REP_PAIR2 and AV_REP_PAIR3 are ruled out because they are not assigned to Cluster 1.
 AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.
 The expected results are:
   The virtual machine is deployed to Cluster 1.
   It is placed in a folder named Daily-7yr-Pair1 indicating assignment to AV_REP_PAIR1 because the round-robin algorithm determined that AV_REP_PAIR1 had fewer virtual machines than the other candidate AV_REP_PAIR0.
VM3 and VM4, deployed to Cluster 2, follow similar logic: VM3 ends up being managed by
AV_REP_PAIR2, while VM4 is managed by AV_REP_PAIR3.
Architecture
This section describes the features of the redundant Avamar/single vCenter configuration
shown in Figure 39 and the environments where it can be used.
Figure 39. Redundant Avamar/single vCenter configuration
Scenarios for use
Best use
The most logical fit for a redundant Avamar/single vCenter configuration is a dual-site/single
vCenter Federation Enterprise Hybrid Cloud deployment.
Alternate uses
The redundant Avamar/single vCenter configuration can be used in the single-site topology
with no caveats to provide a backup infrastructure that can tolerate the loss of a physical
Avamar instance.
Note: The redundant Avamar/single vCenter should not be used in a DR dual-site topology
because doing so imposes caveats that can be overcome using the redundant Avamar/dual
vCenter configuration without the need for any extra components.
vCenter folder assignments
In the redundant Avamar/single vCenter configuration, if the Create Backup Service level workflow creates a folder called Daily-7yr and there are four Avamar replication pairs present, then it configures the following backup policies with the Avamar replication pairs:

Avamar-Pair0: Assigned to monitor vCenter folder Daily-7yr-Pair0

Avamar-Pair1: Assigned to monitor vCenter folder Daily-7yr-Pair1

Avamar-Pair2: Assigned to monitor vCenter folder Daily-7yr-Pair2

Avamar-Pair3: Assigned to monitor vCenter folder Daily-7yr-Pair3
As there is only one vCenter, and therefore only one vCenter folder per Avamar replication pair, each Avamar instance in the pair is configured to monitor the same vCenter folder. At this point, the concept of primary and secondary Avamar members is employed to ensure that only one member of the pair is actively backing up and replicating the virtual machines at any given point in time.
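The mapping from a backup service level to per-pair vCenter folders and backup policies can be illustrated with a minimal sketch. This is not the actual Create Backup Service level workflow code; the function name and structure are hypothetical and follow only the naming pattern described above.

# Illustrative sketch: given a service-level name and the number of Avamar
# replication pairs, derive the vCenter folder that each pair's policy monitors.

def folder_assignments(service_level, pair_count):
    """Return {policy_name: folder_name} for each Avamar replication pair."""
    assignments = {}
    for pair in range(pair_count):
        policy = f"Avamar-Pair{pair}"            # one backup policy per pair
        folder = f"{service_level}-Pair{pair}"   # vCenter folder that the policy monitors
        assignments[policy] = folder
    return assignments

# Example: four replication pairs and a service level named Daily-7yr
print(folder_assignments("Daily-7yr", 4))
# {'Avamar-Pair0': 'Daily-7yr-Pair0', ..., 'Avamar-Pair3': 'Daily-7yr-Pair3'}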
Characteristics
The characteristics of the redundant Avamar/single vCenter configuration are:

All Avamar instances are configured in pairs and all backups are replicated.

It works in the context of a single cloud vCenter only.

Fifty percent of the Avamar instances have active backup and replication policies at any given point in time (50 percent of the Avamar instances are primary, 50 percent are secondary).
Note: Primary means that the backup policies on that instance are enabled. An Avamar
instance can be set to administratively full and still be considered primary.

Avamar replication pairs are defined as split across sites, and therefore the round-robin distribution of virtual machines to vCenter folders includes all Avamar pairs that:

Are assigned to the vSphere cluster that the virtual machine is on.

Have their primary member on the same site as the virtual machine DRS Affinity
group that the virtual machine is a member of.

Are not set to Administratively Full.
Distribution
examples
The following scenarios convey how virtual machines are assigned to vCenter folders in
order to distribute load evenly across Avamar instances, assuming the following
configuration (as shown in Figure 39):

Six primary Avamar instances, six secondary instances, and two vSphere clusters exist

AV_REP_PAIR0 through AV_REP_PAIR3 are assigned to Cluster 1

AV_REP_PAIR4 and AV_REP_PAIR5 are assigned to Cluster 2
Note: In this example, all virtual machines are deployed to a backup policy named Daily-7yr.
Scenario 1: VM1 is deployed to Cluster 1, Site A - No other workload virtual
machines exist

AV_REP_PAIR4 and AV_REP_PAIR5 are ruled out because they are not assigned to
Cluster 1.

AV_REP_PAIR1 and AV_REP_PAIR3 are ruled out for being primary on Site B.

AV_REP_PAIR0 and AV_REP_PAIR2 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 1 – Host CL1-H1.

It is placed in a folder named Daily-7yr-Pair0 indicating assignment to
AV_REP_PAIR0. AV_REP_PAIR2 is an equally viable candidate as both grids are
empty, but AV_REP_PAIR0 is chosen based on numerical order.
Scenario 2: VM2 is deployed to Cluster 1, Site A - VM1 exists

AV_REP_PAIR4 and AV_REP_PAIR5 are ruled out because they are not assigned to
Cluster 1.

AV_REP_PAIR1 and AV_REP_PAIR3 are ruled out because their primary instances
are on Site B.

AV_REP_PAIR0 and AV_REP_PAIR2 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 1 – Host CL1-H1.

It is placed in a folder named Daily-7yr-Pair2 indicating assignment to
AV_REP_PAIR2 because the round-robin algorithm determined that
AV_REP_PAIR2 had fewer virtual machines than the other candidate
AV_REP_PAIR0.
Scenario 3: VM3 is deployed to Cluster 1, Site B - VM1 and VM2 exist

AV_REP_PAIR4 and AV_REP_PAIR5 are ruled out because they are not assigned to
Cluster 1.

AV_REP_PAIR0 and AV_REP_PAIR2 are ruled out because their primary instances
are on Site A.

AV_REP_PAIR1 and AV_REP_PAIR3 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 1 – Host CL1-H2.

It is placed in a folder named Daily-7yr-Pair1 indicating assignment to
AV_REP_PAIR1. AV_REP_PAIR3 is an equally viable candidate as both grids are
empty, but AV_REP_PAIR1 is selected based on numerical order.
Scenario 4: VM4 is deployed to Cluster 1, Site B - VM1, VM2, and VM3 exist

AV_REP_PAIR4 and AV_REP_PAIR5 are ruled out because they are not assigned to
Cluster 1.

AV_REP_PAIR0 and AV_REP_PAIR2 are ruled out because their primary instances
are on Site A.

AV_REP_PAIR1 and AV_REP_PAIR3 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 1 – Host CL1-H2.

It is placed in a folder named Daily-7yr-Pair3 indicating assignment to
AV_REP_PAIR3 because the round-robin algorithm determined that
AV_REP_PAIR3 had fewer virtual machines than the other candidate
AV_REP_PAIR1.
When VM5 and VM6 are deployed to Cluster 2, the same logic dictates that VM5 is managed
by AV_REP_PAIR4 while VM6 is managed by AV_REP_PAIR5 based on the Cluster to
Avamar Pair mappings.
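In the redundant Avamar/single vCenter case, the only difference from the earlier selection sketch is an additional filter on the site of each pair's primary member. The following is a minimal, hypothetical illustration of that extra rule, not the product's implementation.

# Hypothetical sketch of the extra site-affinity rule for redundant Avamar pairs:
# only pairs whose primary member is on the same site as the virtual machine's
# DRS affinity group remain candidates for round-robin placement.

def eligible_pairs(pairs, cluster, vm_site):
    return [p for p in pairs
            if p["cluster"] == cluster           # assigned to the VM's vSphere cluster
            and p["primary_site"] == vm_site     # primary member on the VM's site
            and not p["admin_full"]]             # not set to Administratively Full

pairs = [
    {"name": "AV_REP_PAIR0", "cluster": "Cluster 1", "primary_site": "A", "admin_full": False},
    {"name": "AV_REP_PAIR1", "cluster": "Cluster 1", "primary_site": "B", "admin_full": False},
    {"name": "AV_REP_PAIR2", "cluster": "Cluster 1", "primary_site": "A", "admin_full": False},
    {"name": "AV_REP_PAIR3", "cluster": "Cluster 1", "primary_site": "B", "admin_full": False},
]
print([p["name"] for p in eligible_pairs(pairs, "Cluster 1", "A")])
# ['AV_REP_PAIR0', 'AV_REP_PAIR2'] -- matching Scenario 1 above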
Architecture
This section describes the features of the redundant Avamar/dual vCenter configuration
shown in Figure 40 and the environments where it may be used.
Figure 40. Redundant Avamar/dual vCenter configuration
Scenarios for use
Best use
The most logical fit for a redundant Avamar/dual vCenter configuration is a dual-site/dual
vCenter Federation Enterprise Hybrid Cloud deployment.
Alternate uses
There are no valid alternate uses for this configuration as no other topology uses dual-cloud
vCenters.
vCenter folder
assignments
In the redundant Avamar/dual vCenter configuration, if the Create Backup Service level workflow creates a folder named Daily-7yr, and there are four Avamar replication pairs present, then it configures the following backup policies with the Avamar replication pairs:

Avamar-Pair0: Assigned to monitor vCenter folder Daily-7yr-Pair0

Avamar-Pair1: Assigned to monitor vCenter folder Daily-7yr-Pair1

Avamar-Pair2: Assigned to monitor vCenter folder Daily-7yr-Pair2

Avamar-Pair3: Assigned to monitor vCenter folder Daily-7yr-Pair3
Because there are two vCenters, each Avamar instance in a pair is configured to monitor one of the two corresponding vCenter folders, that is, the instance on Site A monitors the folder from the Site A vCenter, and the instance from Site B monitors the folder from the Site B vCenter. As a virtual machine can only be in one of the two folders (even in a DR dual-site/dual vCenter topology), there is no duplication of backups.
Note: When VMware Site Recovery Manager is used, placeholder virtual machines are created as part of the Site Recovery Manager protection process. To ensure that Avamar does not detect these placeholder virtual machines, additional folders are created in each vCenter with a ‘_PH’ suffix, and placeholder virtual machines are located in these folders via Site Recovery Manager folder mappings. Before failing over a DR cluster, run the Prepare for DP Failover catalog item. This moves the production virtual machines out of their service level folders on the protected site, so that their placeholders are not created in Avamar-monitored folders when Site Recovery Manager re-protects the virtual machines after failover.
Characteristics
The characteristics of the redundant Avamar/dual vCenter configuration are:

All Avamar instances are configured in pairs and all backups are replicated.

It works in the context of a dual-cloud vCenter only.
Note: An Avamar instance can be set to Administratively Full and still have active backup and replication policies.

Avamar replication pairs are defined as being split across sites, and therefore the round-robin distribution of virtual machines to vCenter folders includes all Avamar pairs that are:

Assigned to the vSphere cluster that the virtual machine is on.

Not set to Administratively Full.
Distribution examples
The following scenarios convey how virtual machines are assigned to vCenter folders in
order to distribute load evenly across Avamar instances, assuming the following
configuration (as shown in Figure 40):

Six active Avamar instances (in three Avamar replication pairs), three protected
vSphere clusters, three recovery vSphere clusters, and two local clusters exist

AV_REP_PAIR0 and AV_REP_PAIR1 are assigned to Clusters 1 through 6

AV_REP_PAIR2 is assigned to Clusters 7 and 8
Note: All virtual machines are deployed to the backup policy named Daily-7yr for the example.
Scenario 1: VM1 is deployed to Cluster 1 - No other workload virtual machines
exist

AV_REP_PAIR2 is ruled out because it is not assigned to Cluster 1.

AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 1 – Host CL1-H1.

It is placed in a folder named Daily-7yr-Pair0 indicating assignment to
AV_REP_PAIR0. AV_REP_PAIR1 is an equally viable candidate as both grids are
empty, but AV_REP_PAIR0 is selected based on numerical order.

Because Cluster 1 is on Site A, AV_INSTANCE_00 will back up the virtual machine
and replicate the backups to AV_INSTANCE_01.
Scenario 2: VM2 is deployed to Cluster 1 - VM1 exists

AV_REP_PAIR2 is ruled out as it is not assigned to Cluster 1.

AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets

The expected results are:

The virtual machine is deployed to Cluster 1 – Host CL1-H1.

It is placed in a folder named Daily-7yr-Pair1 indicating assignment to
AV_REP_PAIR1 because the round-robin algorithm determined that
AV_REP_PAIR1 had fewer virtual machines than the other candidate,
AV_REP_PAIR0.

Because Cluster 1 is on Site A, AV_INSTANCE_02 will back up the virtual machine
and replicate the backups to AV_INSTANCE_03.
Scenario 3: VM3 is deployed to Cluster 3 - VM1 and VM2 exist

AV_REP_PAIR2 is ruled out as it is not assigned to Cluster 3.

AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 3 – Host CL3-H1.

It is placed in a folder named Daily-7yr-Pair0 indicating assignment to
AV_REP_PAIR0. AV_REP_PAIR1 is an equally viable candidate as both have equal
virtual machines (one assigned to each pair globally, but none to the instances on
Site B), but AV_REP_PAIR0 is selected based on numerical order.
Scenario 4: VM4 is deployed to Cluster 3 - VM1, VM2, and VM3 exist

AV_REP_PAIR2 is ruled out, because it is not assigned to Cluster 3.

AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 3 – Host CL3-H1.

It is placed in a folder named Daily-7yr-Pair1 indicating assignment to
AV_REP_PAIR1 because the round-robin algorithm determined that
AV_REP_PAIR1 had fewer virtual machines than the other candidate
AV_REP_PAIR0.

Because Cluster 3 is on Site B, AV_INSTANCE_03 will back up the virtual machine
and replicate the backups to AV_INSTANCE_02.
Scenario 5: VM5 is deployed to Cluster 5 - VM1, VM2, VM3 and VM4 exist

AV_REP_PAIR2 is ruled out, because it is not assigned to Cluster 5.

AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 5 – Host CL5-H1.

It is placed in a folder named Daily-7yr-Pair0 indicating assignment to
AV_REP_PAIR0. AV_REP_PAIR1 is an equally viable candidate because both have
equal virtual machines (two assigned to each pair globally), but AV_REP_PAIR0 is
selected based on numerical order.

Because Cluster 5 is on Site A, AV_INSTANCE_00 will back up the virtual machine
and replicate the backups to AV_INSTANCE_01.
Scenario 6: VM6 is deployed to Cluster 6 - VM1, VM2, VM3, VM4 and VM5 exist

AV_REP_PAIR2 is ruled out, because it is not assigned to Cluster 6.

AV_REP_PAIR0 and AV_REP_PAIR1 are identified as potential targets.

The expected results are:

The virtual machine is deployed to Cluster 6 – Host CL6-H1.

It is placed in a folder named Daily-7yr-Pair1 indicating assignment to
AV_REP_PAIR1 because the round-robin algorithm determined that
AV_REP_PAIR1 had fewer virtual machines than the other candidate
AV_REP_PAIR0.

Because Cluster 6 is on Site B, AV_INSTANCE_03 will back up the virtual machine
and replicate the backups to AV_INSTANCE_02.
Scenario 7: VM7 is deployed to Cluster 7 - VM1, VM2, VM3, VM4, VM5 and VM6
exist

AV_REP_PAIR0 and AV_REP_PAIR1 are ruled out, because they are not assigned to
Cluster 7.

AV_REP_PAIR2 is identified as the only potential target.

The expected results are:

The virtual machine is deployed to Cluster 7 – Host CL7-H1.

It is placed in a folder named Daily-7yr-Pair2 indicating assignment to
AV_REP_PAIR2, the only viable candidate.

Because Cluster 7 is on Site A, AV_INSTANCE_04 will back up the virtual machine
and replicate the backups to AV_INSTANCE_05.
Redundant
Avamar/dual
vCenter proxy
example
Figure 41 shows an example of how proxies might be configured in a redundant Avamar/dual vCenter environment.
Figure 41. Redundant Avamar/dual vCenter proxy example
Chapter 8: Solution Rules and Permitted Configurations
This chapter presents the following topics:
Overview ..........................................................................................................96
Architectural assumptions ...................................................................................96
VMware Platform Services Controller ....................................................................96
VMware vRealize tenants and business groups .......................................................98
EMC ViPR tenants and projects ............................................................................99
General storage considerations .......................................................................... 100
VMware vCenter endpoints ................................................................................ 100
Permitted topology configurations ...................................................................... 101
Permitted topology upgrade paths ...................................................................... 102
Bulk import of virtual machines ......................................................................... 103
DR dual-site/dual vCenter topology restrictions.................................................... 104
Resource sharing ............................................................................................. 106
Data protection considerations ........................................................................... 106
Software resources .......................................................................................... 106
Sizing guidance ............................................................................................... 106
This chapter looks at the rules, configurations, and dependencies between the Federation
Enterprise Hybrid Cloud components and their constructs, outlining how this influences the
supported configurations within the cloud.
Assumptions and justifications
The following assumptions and justifications apply to the Federation Enterprise Hybrid Cloud
architecture:

The appliance-based version of vCenter is not supported. The vCenter Server full installation is used because it:

Provides support for an external Microsoft SQL Server database

Resides on a Windows System that also supports the VMware Update Manager™
service, enabling minimal resource requirements in smaller configurations

VMware Platform Services Controller is used instead of the vRealize Automation
Identity Appliance because it supports the multisite, single sign-on requirements of the
solution

Windows-based Platform Services Controllers are used as they are the natural upgrade
path from previous versions of Federation Enterprise Hybrid Cloud

The appliance-based versions of Platform Services Controllers are not supported
This solution uses VMware Platform Services Controller in place of the vRealize Automation
Identity Appliance. VMware Platform Services Controller is deployed on the dedicated virtual
machine server in each Core Pod (multiple Core Pods exist in the DR dual-site/dual vCenter
topology) and an additional Platform Services Controller (Auto-Platform Services Controller)
is deployed on a server in the Automation Pod.
The Auto-Platform Services Controller server provides authentication services to all the
Automation Pod management components requiring Platform Services Controller integration.
This configuration enables authentication services to fail over with the other automation
components and enables a seamless transition between Site A and Site B. There is no need
to change IP addresses, DNS, or management component settings.
Platform Services
Controller
domains
The Federation Enterprise Hybrid Cloud uses one or more Platform Services Controller
domains depending on the management platform deployed. Platform Services Controller
instances are configured within those domains according to the following model:


External Platform Services Controller domain (distributed management model only)

First external Platform Services Controller

Second external Platform Services Controller (DR dual-site/dual vCenter topology
only)
Cloud SSO domain (All topologies)

First Cloud Platform Services Controller

Automation Pod Platform Services Controller

Second Cloud Platform Services Controller (DR dual-site/dual vCenter topology
only)
Figure 42 shows the Platform Services Controller domains and how each required Platform Services Controller instance and domain is configured.
Figure 42. SSO domain and vCenter SSO instance relationships
First Platform Services Controller instance in each single sign-on domain
The first VMware Platform Services Controller in each single sign-on domain is deployed by creating a new vCenter Single Sign-On domain, enabling it to participate in the default vCenter Single Sign-On namespace (vsphere.local). This primary Platform Services Controller server supports identity sources such as Active Directory, OpenLDAP, local operating system users, and SSO embedded users and groups.
This is the default deployment mode when you install VMware Platform Services Controller.
Subsequent
vCenter Single
Sign-On
instances in each
single sign-on
domain
Additional VMware Platform Services Controller instances are installed by joining the new Platform Services Controller to an existing single sign-on domain, making them part of the existing domain but in a new SSO site. When you create Platform Services Controller servers in this fashion, the deployed Platform Services Controller instances become members of the same authentication namespace as the first Platform Services Controller instance. This deployment mode should only be used after you have deployed the first Platform Services Controller instance in each single sign-on domain.
In vSphere 6.0, VMware Platform Services Controller single sign-on data (such as policies,
solution users, application users, and identity sources) is automatically replicated between
each Platform Services Controller instance in the same authentication namespace every 30
seconds.
vRealize tenant
design
The Federation Enterprise Hybrid Cloud can operate using single or multiple vRealize
Automation tenants.
STaaS operations rely on the tenant URL value configured as part of the vRealize
Automation tenant, and therefore require an individual vRealize Orchestrator server per
additional tenant, if the ability for multiple tenants to execute STaaS operations is required.
The Federation Enterprise Hybrid Cloud foundation package needs to be installed on each of
the vRealize Orchestrator servers, entering the relevant tenant URL during installation. This
is also required in order to populate the vRealize Automation catalog in each tenant with the
relevant STaaS catalog items.
vRealize tenant
best practice
As the vRealize Automation IaaS administrator is a system-wide role, having multiple
tenants configure endpoints and execute STaaS operations may not provide any additional
value over and above the use of a single tenant with multiple business groups. Therefore,
while multiple tenants are supported, Federation Enterprise Hybrid Cloud is normally
deployed with a single tenant with respect to STaaS and Data Protection operations.
vRealize business
group design
Federation Enterprise Hybrid Cloud uses two system business groups in each non-default
tenant. The first, EHCSystem, is used as the target for installation of the vRealize
Automation advanced services STaaS catalog items. It does not require any compute
resources. The second system business group, EHCOperations, is used as the group where
Federation Enterprise Hybrid Cloud storage administrators are configured. It is given
entitlements to the STaaS and Cluster Onboarding catalog items. It has no compute
resource requirements.
vRealize business
group best
practice
The Federation recommends that applications provisioned using vRealize Automation
Application Services each have a separate business group per application type to enable
administrative separation of blueprint creation and manipulation.
Figure 43 shows an example where the EHCSystem and EHCOperations system business
groups are configured alongside three tenant business groups (IT, HR, and Manufacturing)
and three application business groups used by vRealize Automation Application Services for
Microsoft SharePoint, Oracle, and Microsoft Exchange.
Figure 43.
Software-defined data center tenant design and endpoints
ViPR tenants
The Federation Enterprise Hybrid Cloud uses a single ViPR tenant. The default provider
tenant or an additional non-default tenant may be used.
ViPR projects
Federation Enterprise Hybrid Cloud STaaS operations rely on a correlation between the
tenant URL value of the user executing the request and a ViPR project name. Therefore, to
enable STaaS for an additional vRealize tenant, you must create a corresponding ViPR
project whose name and case match that of the vRealize tenant URL.
As each project can have a maximum total storage capacity (quota) associated with it that
cannot be exceeded, the use of multiple ViPR projects enables multiple vRealize Automation
tenants within the Federation Enterprise Hybrid Cloud to provision storage from the same
storage endpoints in a controlled or limited fashion.
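Because the correlation is on the project name and the match is case sensitive, a simple pre-check can catch mismatches before a STaaS request fails. The following is a hypothetical validation sketch for illustration only; it is not part of the product and the names are assumptions.

# Hypothetical pre-check: verify that every vRealize tenant URL that should be
# STaaS-enabled has a ViPR project whose name matches exactly, including case.

def missing_projects(tenant_urls, vipr_project_names):
    """Return tenant URLs that have no case-exact matching ViPR project."""
    projects = set(vipr_project_names)           # exact, case-sensitive comparison
    return [url for url in tenant_urls if url not in projects]

tenants = ["tenant-it", "Tenant-HR"]
projects = ["tenant-it", "tenant-hr"]            # note the case difference for HR
print(missing_projects(tenants, projects))       # ['Tenant-HR'] -> project name must match case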
ViPR consistency
groups
ViPR consistency groups are an important component of the CA and DR topologies for the
Federation Enterprise Hybrid Cloud solution. Consistency groups logically group volumes
within a project to ensure that a set of common properties is applied to an entire group of
volumes during a fault event. This ensures host-to-cluster or application-level consistency
when a failover occurs.
Consistency groups are created by Federation Enterprise Hybrid Cloud STaaS operations and
are specified when CA or DR-protected volumes are provisioned. Consistency group names
must be unique within the ViPR environment.
When used with VPLEX in the CA dual-site/single vCenter topology, these consistency groups are created per physical array, per vSphere cluster, and per site.
When used with RecoverPoint in a DR dual-site/dual vCenter configuration, these consistency groups are created in a 1:1 relationship with the vSphere datastore/LUN.
vSphere datastore clusters
The Federation Enterprise Hybrid Cloud does not support datastore clusters for the following reasons:
1. Linked clones do not work with datastore clusters, causing multi-machine blueprints to fail unless configured with explicitly different reservations for edge devices.
2. vRealize Automation already performs capacity analysis during initial placement.
3. Datastore clusters can result in inconsistent behavior when virtual machines report their location to vRealize Automation. Misalignment with vRealize reservations can make the virtual machine un-editable.
4. Federation Enterprise Hybrid Cloud STaaS operations do not place new LUNs into datastore clusters, so all datastore clusters would have to be maintained manually.
5. Specific to DR:
a. Day 2 Storage DRS migrations between datastores would break the Site Recovery Manager protection for the virtual machines moved.
b. Day 2 Storage DRS migrations between datastores would result in re-replicating the entire virtual machine to the secondary site.
VMware Raw Device Mappings (RDMs)
VMware Raw Device Mappings are not created by or supported by Federation Enterprise Hybrid Cloud STaaS services. If RDMs are created outside of STaaS services, any issues arising from their use are not supported by Federation Enterprise Hybrid Cloud customer support. If changes are required in the environment to make them operate correctly, a Federation Enterprise Hybrid Cloud RPQ should be submitted first for approval. Federation Enterprise Hybrid Cloud support teams may request that you back out any change made to the environment to facilitate RDMs.
VMware vCenter endpoints
Multiple vCenter endpoints are supported within Federation Enterprise Hybrid Cloud. However, depending on the topology chosen, there are certain considerations, as outlined in this section.
Single-site/single vCenter and dual-site/single vCenter topologies
These topologies can support:

Only one vCenter per tenant with the ability to execute STaaS services.

More than one vCenter per tenant as long as the second and subsequent vCenter
endpoints do not require STaaS or BaaS services.
Enabling additional STaaS-enabled vCenter endpoints
To enable additional STaaS-enabled vCenter endpoints, these topologies require a separate
vRealize tenant per vCenter endpoint for the following reasons:

Federation Enterprise Hybrid Cloud STaaS catalog items use vRealize Orchestrator
through the vRealize Automation advanced server configuration, which only allows one
vRealize Orchestrator to be configured per tenant.

This vRealize Orchestrator stores important vCenter configuration details gathered
during the Federation Enterprise Hybrid Cloud Foundation installation process.

To store additional vCenter configuration details, an additional vRealize Orchestrator is
required.

Specifying the additional vRealize Orchestrator as an advanced server configuration
requires an additional tenant.
Each vCenter endpoint requires its own independent vRealize Orchestrator server and NSX
Manager instance.

The vRealize Orchestrator consideration is based on the additional tenant
consideration above.

The NSX Manager requirement is based on the VMware requirement for a 1:1
relationship between vCenter and NSX.
Enabling additional BaaS-enabled vCenter endpoints
These topologies require independent Avamar instances for each vCenter endpoint to enable
BaaS services.
Note: As backup service level names use a common vRealize dictionary, backup service levels
created by each tenant will be visible to all tenants. Therefore, it is advisable to name backup
service levels to indicate which tenant they were created for. This enables an operator to
identify the backup service levels relevant to them.
Dual-site/dual
vCenter
topologies
This topology can support:

Two vCenters per tenant with the ability to execute STaaS and BaaS services.

More than two vCenters per tenant as long as the third and subsequent vCenter
endpoints do not require STaaS, BaaS or DRaaS services.
Enabling additional STaaS and BaaS-enabled vCenter endpoints
Additional STaaS and BaaS enabled vCenter endpoints require additional tenants and
Avamar instances similar to the single vCenter topologies.
Note: As backup service level names use a common vRealize dictionary, backup service levels
created by each tenant will be visible to all tenants. Therefore it is advisable to name backup
service levels to indicate which tenant they were created for. This enables an operator to
identify the backup service levels relevant to them.
Combining
topologies
The following configurations are permitted for each Federation Enterprise Hybrid Cloud
instance:

Local only (single-site/single vCenter)

Local plus CA combined


Uses the CA dual-site/single vCenter topology and provides local-only and CA
functionality via distinct Workload Pods with separate networks and storage
Local plus DR combined

Uses the DR dual-site/dual vCenter topology and provides local-only and DR
functionality via distinct Workload Pods with separate networks and storage
Note: Federation Enterprise Hybrid Cloud 3.5 does not support both DR and CA functionality on
the same Federation Enterprise Hybrid Cloud instance.
Single site to continuous availability upgrade
Single-site Federation Enterprise Hybrid Cloud deployments can be upgraded to the CA dual-site/single vCenter topology by adopting either an online or offline upgrade approach, with the following considerations.
Considerations

The topology upgrade is an EMC professional services engagement and provides three basic methods of conversion based on the original storage design:

NFS to VPLEX Distributed VMFS (Online via Storage vMotion)

Standard VMFS to VPLEX Distributed VMFS (Offline via VPLEX encapsulation)

VPLEX Local VMFS to VPLEX Distributed VMFS (Online via VPLEX Local to VPLEX
Metro conversion)

If NFS is in use for management platform storage, then new VPLEX storage is
required.

In non-BaaS environments, local workloads can be migrated to new CA clusters using
storage vMotion if required.
Note: Federation Enterprise Hybrid Cloud 3.5 does not currently provide an automated
mechanism to achieve this. Contact EMC Professional Services to assist in this process.

Existing local-only workload clusters may remain as local-only clusters or be converted
to CA-enabled clusters after the topology upgrade.
Note: EMC Professional Services should execute this process

In BaaS environments, virtual machines requiring CA protection should remain on the
original cluster and the cluster should be converted to a CA-enabled cluster.
This is due to the need to carefully manage the relationships of vSphere clusters,
Avamar grids, Avamar proxies, and vCenter folder structure to preserve the ability to
restore backups taken prior to the topology upgrade.

After the topology upgrade, new clusters can be provisioned to provide CA or local-only functionality for new tenant virtual machines.
Single-site to disaster recovery upgrade
Single-site Federation Enterprise Hybrid Cloud deployments can be upgraded to the DR dual-site/dual vCenter topology, with the following considerations:
Considerations

Additional Core and NEI Pod infrastructure and components need to be deployed on
the second site.

Additional Automation Pod infrastructure needs to be deployed on the second site to
become the target for the Automation Pod failover.

EMC RecoverPoint needs to be installed and configured and all Automation Pod LUNs
replicated to the second site.
Note: If NFS volumes were used for Automation Pod storage then new FC-based block
datastores should be provided, and the Automation Pod components migrated to the new
storage using Storage vMotion.

Prior to the upgrade, the Automation Pod components must be deployed on a distinct
network segment from the Core and NEI Pods.

A Microsoft SQL Server instance and a vCenter Single Sign-On role must be deployed
to a server in the Automation Pod during the initial deployment.

Migration of previously existing virtual machines from local to DR clusters is not
currently supported with default functionality.
Note: If there is a requirement to DR-enable pre-existing tenant workloads, contact EMC Services teams to provide this as custom functionality.
Importing virtual machines and adding Federation Enterprise Hybrid Cloud services
For environments that require existing virtual machines to be imported into the Federation Enterprise Hybrid Cloud, the bulk import feature of vRealize Automation enables the import of one or more virtual machines.
This functionality is available only to vRealize Automation users who have Fabric
Administrator and Business Group Manager privileges. The Bulk Import feature imports
virtual machines intact with defining data such as reservation, storage path, blueprint,
owner, and any custom properties.
Federation Enterprise Hybrid Cloud offers the ability to layer Federation Enterprise Hybrid
Cloud services onto pre-existing virtual machines by using and extending the bulk import
process. Before beginning the bulk import process, the following conditions must be true:

Target virtual machines are located in a Federation Enterprise Hybrid Cloud vCenter endpoint.
Note: This does not include additional IaaS-only vCenter endpoints, if any are present.


Target virtual machines must be located on the correct vRealize Automation managed
compute resource cluster and that cluster must already be on-boarded as a Federation
Enterprise Hybrid Cloud cluster.

In cases where DR services are required for the target virtual machines, then they
must be on a DR-enabled cluster.

In cases where data protection services are required for the target virtual
machines, then they must be on a cluster that is associated with an Avamar pair.
Target virtual machines must be located on the correct vRealize Automation managed
datastore.

In cases where DR services are required for the target virtual machines, then they
must be on a datastore protected by EMC RecoverPoint.

In cases where data protection services are required for the target virtual
machines, then they must be on a datastore that is registered with an EMC Avamar
grid.
Note: The process for importing these virtual machines and adding Federation Enterprise Hybrid
Cloud services is documented in the Federation Enterprise Hybrid Cloud 3.5: Administration
Guide.
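The preconditions above lend themselves to a simple pre-flight check before running the bulk import. The sketch below is purely illustrative: the field names and the inventory source are hypothetical, and the authoritative procedure is the one documented in the Administration Guide.

# Hypothetical pre-flight check for layering services onto imported virtual machines.
# 'vm' is a dict describing where the machine currently resides; the real checks are
# performed against vCenter, vRealize Automation, and Avamar/RecoverPoint inventory.

def import_ready(vm, wants_dr=False, wants_backup=False):
    problems = []
    if not vm["on_ehc_vcenter"]:
        problems.append("not in a Federation Enterprise Hybrid Cloud vCenter endpoint")
    if not vm["cluster_onboarded"]:
        problems.append("cluster not on-boarded as a Federation Enterprise Hybrid Cloud cluster")
    if wants_dr and not (vm["cluster_dr_enabled"] and vm["datastore_rp_protected"]):
        problems.append("DR requires a DR-enabled cluster and a RecoverPoint-protected datastore")
    if wants_backup and not (vm["cluster_has_avamar_pair"] and vm["datastore_registered_with_avamar"]):
        problems.append("backup requires an Avamar-associated cluster and a registered datastore")
    return problems

vm = {"on_ehc_vcenter": True, "cluster_onboarded": True, "cluster_dr_enabled": False,
      "datastore_rp_protected": False, "cluster_has_avamar_pair": True,
      "datastore_registered_with_avamar": True}
print(import_ready(vm, wants_dr=True, wants_backup=True))
# ['DR requires a DR-enabled cluster and a RecoverPoint-protected datastore']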
Multimachine
blueprints
Load balancers cannot be deployed as part of a protected multimachine blueprint. However,
you can manually edit the upstream Edge to include load-balancing features for a newly
deployed multimachine blueprint.
vRealize
Automation
Failover state operations
Provisioning of virtual machines to a protected DR cluster is permitted at any time, as long
as that site is operational. If you provision a virtual machine while the recovery site is
unavailable due to vCenter Site Recovery Manager disaster recovery failover, you need to
run the DR Remediation catalog item to bring it into protected status when the recovery
site is back online.
During STaaS provisioning of a protected datastore, Federation Enterprise Hybrid Cloud
workflows issue a DR auto-protect attempt for the new datastore with vCenter Site Recovery
Manager. If both sites are operational when the request is issued, this should be successful.
If, however, one site is offline (vCenter Site Recovery Manager Disaster Recovery Failover)
when the request is made, the datastore will be provisioned, but you must run the DR
Remediation catalog item to bring it into a protected status.
Note: The DR Remediation catalog item can be run at any time to ensure that all DR items
are protected correctly.
Failover
granularity
While replication is at the datastore level, the unit of failover in a DR configuration is a DR-enabled cluster. It is not possible to fail over a subset of virtual machines on a single DR-protected cluster. This is because all networks supporting these virtual machines are converged to the recovery site during a failover.
RecoverPoint
cluster
limitations
There is a limit of 64 consistency groups per RecoverPoint appliance and 128 consistency groups per RecoverPoint cluster. Therefore, the number of nodes deployed in the RecoverPoint cluster should be sized to allow appropriate headroom for the surviving appliances to take over the workload of a failed appliance.
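One way to reason about this headroom, assuming the per-appliance and per-cluster limits quoted above, is sketched below. The sizing rule itself is an illustration of the trade-off, not an official formula.

# Illustrative headroom check (not an official sizing formula):
# with 64 consistency groups per appliance and 128 per cluster, plan so that the
# surviving appliances can absorb the consistency groups of a failed appliance.

CG_PER_APPLIANCE = 64
CG_PER_CLUSTER = 128

def max_cgs_with_headroom(appliances, failures_tolerated=1):
    """Consistency groups the cluster can carry while tolerating appliance failures."""
    survivors = appliances - failures_tolerated
    if survivors < 1:
        return 0
    return min(CG_PER_CLUSTER, survivors * CG_PER_APPLIANCE)

for n in (2, 3, 4):
    print(n, "appliances ->", max_cgs_with_headroom(n), "consistency groups with one-failure headroom")
# 2 appliances -> 64, 3 -> 128, 4 -> 128 (capped by the per-cluster limit)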
RecoverPoint
licensing
The Federation Enterprise Hybrid Cloud supports RecoverPoint CL-based licensing only. It
does not support RecoverPoint SE or RecoverPoint EX, as these versions are not currently
supported by EMC ViPR.
VMware Site
Recovery
Manager
limitations
Protection maximums
Table 11 shows the maximums that apply for Site Recovery Manager-protected resources.
Table 11. Site Recovery Manager protection maximums
Virtual machines configured for protection using array-based replication: 5,000
Virtual machines per protection group: 500
Protection groups: 250
Recovery plans: 250
Protection groups per recovery plan: 250
Virtual machines per recovery plan: 2,000
Replicated datastores (using array-based replication) and >1 RecoverPoint cluster: 255
Recovery maximums
Table 12 shows the maximums that apply for Site Recovery Manager recovery plans.
Table 12. Site Recovery Manager recovery maximums
Concurrently executing recovery plans: 10
Concurrently recovering virtual machines using array-based replication: 2,000
Implied Federation Enterprise Hybrid Cloud storage maximums
Table 13 indicates the storage maximums in a Federation Enterprise Hybrid Cloud DR environment, when all other maximums are taken into account.
Table 13. Implied Federation Enterprise Hybrid Cloud storage maximums
DR-enabled datastores per RecoverPoint consistency group: 1
DR-enabled datastores per RecoverPoint cluster: 128
DR-enabled datastores per Federation Enterprise Hybrid Cloud environment: 250
To ensure maximum protection for DR-enabled vSphere clusters, the Federation Enterprise
Hybrid Cloud STaaS workflows create each LUN in its own RecoverPoint consistency group.
This ensures that ongoing STaaS provisioning operations have no effect on either the
synchronized state of existing LUNs or the history of restore points for those LUNs
maintained by EMC RecoverPoint.
Because there is a limit of 128 consistency groups per EMC RecoverPoint cluster, there is
therefore a limit of 128 Federation Enterprise Hybrid Cloud STaaS provisioned LUNs per
RecoverPoint cluster. To extend the scalability further, additional EMC RecoverPoint clusters
are required.
Each new datastore is added to its own Site Recovery Manager protection group. As there is
a limit of 250 protection groups per Site Recovery Manager installation, this limits the total
number of datastores in a DR environment to 250, irrespective of the number of
RecoverPoint clusters deployed.
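The implied maximums in Table 13 follow from simple arithmetic on the limits above. A short worked sketch is shown below; the helper function is hypothetical and only restates that arithmetic.

# Worked sketch of the implied datastore maximums: one DR-enabled datastore per
# RecoverPoint consistency group, 128 consistency groups per RecoverPoint cluster,
# and 250 Site Recovery Manager protection groups per installation.

CGS_PER_RP_CLUSTER = 128
SRM_PROTECTION_GROUP_LIMIT = 250

def max_dr_datastores(rp_clusters):
    """DR-enabled datastores supported for a given number of RecoverPoint clusters."""
    per_rp_limit = rp_clusters * CGS_PER_RP_CLUSTER   # 1 datastore : 1 consistency group
    return min(per_rp_limit, SRM_PROTECTION_GROUP_LIMIT)

print(max_dr_datastores(1))  # 128 -- limited by the RecoverPoint cluster
print(max_dr_datastores(2))  # 250 -- limited by Site Recovery Manager protection groups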
Storage support
Supports VMAX, VNX, XtremIO, and VMAX3 (behind VPLEX) only.
Network support
The Federation Enterprise Hybrid Cloud provides fully automated network re-convergence
during disaster recovery failover when using VMware NSX only. The use of vSphere
Distributed Switch backed by other networking technologies is also permitted, but requires
that network re-convergence is carried out manually in accordance with the chosen network
technology, or that automation of network re-convergence is developed as a professional
services engagement.
NSX security
support
Only supports the assignment of blueprint virtual machines to a security group. Does not
support the assignment of blueprints to security policies or security tags.
Resource
isolation
As vRealize Automation endpoints are visible to all vRealize Automation IaaS administrators, resource isolation in the truest sense is not possible. However, locked blueprints and storage reservation policies can be used to ensure that certain types of workload (such as those whose licensing is based on CPU count) are restricted to only a subset of the Workload Pods available in the environment. This includes the ability to control those licensing requirements across tenants by ensuring that all relevant deployments are on the same set of compute resources.
Resource sharing
All endpoints configured across the vRealize Automation instance by an IaaS administrator
are available to be added to fabric groups, and therefore consumed by any business group
across any of the vRealize Automation tenants.
Provisioning to vCenter endpoints, however, can still only be done through the tenant
configured as part of the Federation Enterprise Hybrid Cloud foundation installation in that
tenant and its vRealize Orchestrator server.
Application
tenant
integration
The Federation recommends that applications provisioned using vRealize Automation
Application Services each have their own business group by application type to enable
administrative separation of blueprint creation and manipulation.
Supported
Avamar
platforms
The Federation Enterprise Hybrid Cloud supports physical Avamar infrastructure only. It does not support Avamar Virtual Edition.
Scale out limits
Federation Enterprise Hybrid Cloud 3.5 supports a maximum of 15 Avamar replication pairs
(30 individual physical instances).
Federation
Enterprise Hybrid
Cloud software
resources
For information about qualified components and versions required for the initial release of
the Federation Enterprise Hybrid Cloud 3.5 solution, refer to the Federation Enterprise
Hybrid Cloud 3.5: Reference Architecture Guide. For up-to-date supported version
information, refer to the EMC Simple Support Matrix: EMC Hybrid Cloud 3.5:
elabnavigator.emc.com.
Federation
Enterprise Hybrid
Cloud sizing
For all Federation Enterprise Hybrid Cloud sizing operations, refer to the EMC Mainstay
Sizing tool: mainstayadvisor.com/go/emc.
Chapter 9: Conclusion
This chapter presents the following topic:
Conclusion ...................................................................................................... 108
The Federation Enterprise Hybrid Cloud solution provides on-demand access and control of
infrastructure resources and security while enabling customers to maximize asset use.
Specifically, the solution integrates all the key functionality that customers demand of a
hybrid cloud and provides a framework and foundation for adding other services.
This solution provides the following features and functionality:

Continuous availability

Disaster recovery

Data protection

Automation and self-service provisioning

Multitenancy and secure separation

Workload-optimized storage

Elasticity and service assurance

Monitoring

Metering and chargeback
The solution uses the best of EMC and VMware products and services to empower customers
to accelerate the implementation and adoption of hybrid cloud while still enabling customer
choice for the compute and networking infrastructure within the data center.
Chapter 10: References
This chapter presents the following topic:
Federation documentation ................................................................................. 110
These documents are available on EMC.com. Access to Online Support depends on your
login credentials. If you do not have access to a document, contact your Federation
representative.

Federation Enterprise Hybrid Cloud 3.5: Reference Architecture Guide

Federation Enterprise Hybrid Cloud 3.5: Infrastructure and Operations Management
Guide

Federation Enterprise Hybrid Cloud 3.5: Security Management Guide

Federation Enterprise Hybrid Cloud 3.5: Administration Guide

VCE Foundation for Federation Enterprise Hybrid Cloud Addendum 3.5

VCE Foundation Upgrade from 3.1 to 3.5 Process