Collated vSphere Best Practices,
Pre-requisites and requirements
Paul Meehan
http://paulpmeehan.com
1 Introduction
   1.1 Version Control
   1.2 Current Scope
2 vSphere Installation and Setup
   2.2 ESXi Booting Requirements
   2.3 ESXi Support for 64-Bit Guest Operating Systems
   2.4 Hardware Requirements for vCenter Server, vCenter Single Sign On, vSphere Client, and vSphere Web Client
   2.5 vCenter Server and vSphere Client System Recommendations for Performance Based on Deployment Size
   2.6 Required Ports for vCenter Server
   2.7 Required Ports for the vCenter Server Appliance
   2.8 Prepare Your System and Install the Auto Deploy Server
   2.9 Auto Deploy Best Practices and Security Consideration
   2.10 Before You Install vCenter Server
   2.11 (vCenter) Database Prerequisites
   2.12 Using a User Account for Running vCenter Server
3 HA Clusters
   3.1 Requirements for a vSphere HA Cluster
   3.2 EVC Requirements for Hosts
   3.3 Best Practices for vSphere HA Clusters
   3.4 Best Practices for Networking
   3.5 Fault Tolerance
4 Networking
   4.1 vSphere Distributed Switch Health Check
   4.2 vDS Port Group settings and parameters
   4.3 vSphere Network I/O Control
   4.4 TCP Segmentation Offload and Jumbo Frames
   4.5 Single Root I/O Virtualization (SR-IOV)
   4.6 Configure NetFlow Settings
   4.7 Mounting NFS Volumes
   4.8 Networking Best Practices
5 Storage
   5.1 Making LUN Decisions
   5.2 Best Practices for Fibre Channel Storage
   5.3 Preventing Fibre Channel SAN Problems
   5.4 Disable Automatic Host Registration
   5.5 Optimizing Fibre Channel SAN Storage Performance
   5.6 iSCSI
   5.7 iSCSI SAN Restrictions
   5.8 iBFT iSCSI Boot Overview
   5.9 Best Practices for iSCSI Storage
   5.10 Preventing iSCSI SAN Problems
   5.11 Optimizing iSCSI SAN Storage Performance
   5.12 Checking Ethernet Switch Statistics
   5.13 iSCSI SAN Configuration Checklist
   5.14 Identifying Device Connectivity Problems
   5.15 Best Practices for SSD Devices
   5.16 Upgrading VMFS Datastores
   5.17 Set Up Dynamic Disk Mirroring
   5.18 Creating a Diagnostic Partition
   5.19 About Raw Device Mapping
   5.20 Raw Device Mapping Characteristics
   5.21 VMkernel and Storage
   5.22 Understanding Multipathing and Failover
   5.23 Array-Based Failover with iSCSI
   5.24 Path Failover and Virtual Machines
   5.25 Managing Multiple Paths
   5.26 VMware Multipathing Module
   5.27 Path Scanning and Claiming
   5.28 Managing Storage Paths and Multipathing Plug-Ins
   5.29 Multipathing Considerations
   5.30 Hardware Acceleration on NAS Devices
   5.31 Hardware Acceleration Considerations
   5.32 Booting ESXi with Software FCoE
   5.33 Requirements and Considerations for Software FCoE Boot
   5.34 Best Practices for Software FCoE Boot
6 vSphere Resource Management
   6.1 Configuring Resource Allocation Settings
   6.2 Memory Virtualization Basics
   6.3 Memory Reliability
   6.4 Managing Storage I/O Resources
   6.5 Set Storage I/O Control Threshold Value
   6.6 Managing Resource Pools
   6.7 Managing Resource Pools
   6.8 Resource Pool Admission Control
   6.9 Creating a DRS Cluster
   6.10 DRS Cluster Requirements
   6.11 Removing a Host from a Cluster
   6.12 DRS Cluster Validity
   6.13 DPM
   6.14 Datastore Clusters
   6.15 Setting the Aggressiveness Level for Storage DRS
   6.16 Datastore Cluster Requirements
   6.17 Adding and Removing Datastores from a Datastore Cluster
   6.18 Storage DRS Anti-Affinity Rules
   6.19 Storage vMotion Compatibility with Datastore Clusters
   6.20 Using NUMA Systems with ESXi
7 Security
   7.2 Securing Standard Switch Ports
   7.3 Cipher Strength
   7.4 Control CIM-Based Hardware Monitoring Tool Access
   7.5 General Security Recommendations
   7.6 ESXi Firewall Configuration
   7.7 Lockdown Mode Behavior
   7.8 Lockdown Mode Configurations
   7.9 ESXi Authentication and User Management
   7.10 Best Practices for Roles and Permissions
   7.11 Replace a Default ESXi Certificate with a CA-Signed Certificate
   7.12 Modifying ESXi Web Proxy Settings
   7.13 General Virtual Machine Protection
   7.14 Removing Unnecessary Hardware Devices
   7.15 Securing vCenter Server Systems
   7.16 Best Practices for Virtual Machine and Host Security
   7.17 Installing Antivirus Software
   7.18 Managing ESXi Log Files
   7.19 Securing Fault Tolerance Logging Traffic
   7.20 Auto Deploy Security Considerations
   7.21 Image Builder Security Considerations
   7.22 Host Password Strength and Complexity
   7.23 Synchronizing Clocks on the vSphere Network
   7.24 Monitoring and Restricting Access to SSL Certificates
8 MSCS
   8.2 Cluster Virtual Machines Across Physical Hosts
   8.3 Cluster Physical and Virtual Machines
   8.4 vSphere MSCS Setup Checklist
9 Virtual Machine Administration
   9.1 What Is a Virtual Machine?
   9.2 Installing the Microsoft Sysprep Tool
   9.3 Virtual Machine Compatibility Options
   9.4 Change CPU Hot Plug Settings in the vSphere Web Client
   9.5 VM Disk Persistence Modes
   9.6 SCSI Controller Configuration
   9.7 Configure Fibre Channel NPIV Settings in the vSphere Web Client
   9.8 Managing Multi-Tiered Applications with vSphere vApp in the vSphere Web Client
   9.9 vCenter Solutions Manager
   9.10 Monitoring vServices
   9.11 Using Snapshots To Manage Virtual Machines
   9.12 Change Disk Mode to Exclude Virtual Disks from Snapshots in the vSphere Web Client
1 Introduction
Hello and welcome. It's 02/01/2014.
While studying for VCAP-DCD and preparing a submission for VCDX, I’ve been trying to
find a really good source for all pre-requisites, best practices, requirements and other
important information used by vSphere Architects, admins and end users. As a designer,
understanding the impact of your decisions against these items is key.
I like to have things in one place if possible. This document has been created from the
existing vSphere official documentation set (pubs.vmware.com), using copy and paste, to
capture the requirements above into a single document that can be used as a reference.
Note:
• This is the result of my personal editing of what I believe are the useful nuggets we all need to know. However, once I started I realised it could never be a 3-4 page document, which was my original intention, so it was better to make it complete and use an index to allow people to search around.
• The Intellectual Property for this material is not mine. It has not been added to in ANY way by me. This material is VMware copyrighted material that is freely available on pubs.vmware.com.
• Once I started doing this I noticed this link, where there is a massive amount of documentation, best practices etc.: http://bit.ly/1caCcQs.
  o That should be Bookmark #1 for any vSphere/vCloud Architect.
• I have copied what I believe are other nuggets of info that will help you implement the best possible vSphere design.
The current version (1.0) relates to v5.1 which is the subject of my studies.
As with all things in the VMware community, it's important to pass this around so that folks following the same certification route, or anyone looking for a general reference, might find it useful. So please share, and I truly hope you find it useful.
This document assumes a medium level of understanding and is not a “build” guide.
Please email me at info@tieroneconsulting.ie with any feedback or other ideas.
1.1 Version Control
Version: 1.0
Date: 02/01/2014
Author: Paul P Meehan
Description: Issued version for vSphere 5.1. Still requires some additional input from other vSphere official documentation.
1.2 Current Scope
At the time of writing the following vSphere documents have been reviewed, explored, edited
and copied.
I will be adding to this list on a regular basis and will upgrade the set to v5.5 in the near
future.
• Availability
• Networking
• Storage
• Security
• Resource Management
• Installation and Setup
• Host Profiles
• Virtual Machine Administration
2 vSphere Installation and Setup
2.1.1 Hardware and System Resources
To install and use ESXi 5.1, your hardware and system resources must meet the following
requirements:
• Supported server platform. For a list of supported platforms, see the VMware Compatibility Guide at http://www.vmware.com/resources/compatibility.
• ESXi 5.1 will install and run only on servers with 64-bit x86 CPUs.
• ESXi 5.1 requires a host machine with at least two cores.
• ESXi 5.1 supports only LAHF and SAHF CPU instructions.
• ESXi 5.1 requires the NX/XD bit to be enabled for the CPU in the BIOS.
• ESXi supports a broad range of x64 multicore processors. For a complete list of supported processors, see the VMware Compatibility Guide at http://www.vmware.com/resources/compatibility.
• ESXi requires a minimum of 2GB of physical RAM. Provide at least 8GB of RAM to take full advantage of ESXi features and run virtual machines in typical production environments.
• To support 64-bit virtual machines, support for hardware virtualization (Intel VT-x or AMD RVI) must be enabled on x64 CPUs.
• One or more Gigabit or 10Gb Ethernet controllers. For a list of supported network adapter models, see the VMware Compatibility Guide at http://www.vmware.com/resources/compatibility.
• Any combination of one or more of the following controllers:
  o Basic SCSI controllers: Adaptec Ultra-160 or Ultra-320, LSI Logic Fusion-MPT, or most NCR/Symbios SCSI.
  o RAID controllers: Dell PERC (Adaptec RAID or LSI MegaRAID), HP Smart Array RAID, or IBM (Adaptec) ServeRAID controllers.
• SCSI disk or a local, non-network RAID LUN with unpartitioned space for the virtual machines.
• For Serial ATA (SATA), a disk connected through supported SAS controllers or supported on-board SATA controllers. SATA disks will be considered remote, not local. These disks will not be used as a scratch partition by default because they are seen as remote.
Note
You cannot connect a SATA CD-ROM device to a virtual machine on an ESXi 5.1 host. To use
the SATA CD-ROM device, you must use IDE emulation mode.
2.2 ESXi Booting Requirements
vSphere 5.1 supports booting ESXi hosts from the Unified Extensible Firmware Interface
(UEFI). With UEFI you can boot systems from hard drives, CD-ROM drives, or USB media.
Network booting or provisioning with VMware Auto Deploy requires the legacy BIOS
firmware and is not available with UEFI.
ESXi can boot from a disk larger than 2TB provided that the system firmware and the
firmware on any add-in card that you are using support it. See the vendor documentation.
Note
Changing the boot type from legacy BIOS to UEFI after you install ESXi 5.1 might cause the
host to fail to boot. In this case, the host displays an error message similar to: Not a
VMware boot bank. Changing the host boot type between legacy BIOS and UEFI is not
supported after you install ESXi 5.1.
2.2.1 Storage Requirements for ESXi 5.1 Installation
Installing ESXi 5.1 requires a boot device that is a minimum of 1GB in size. When booting
from a local disk or SAN/iSCSI LUN, a 5.2GB disk is required to allow for the creation of the
VMFS volume and a 4GB scratch partition on the boot device. If a smaller disk or LUN is
used, the installer will attempt to allocate a scratch region on a separate local disk. If a local
disk cannot be found the scratch partition, /scratch, will be located on the ESXi host
ramdisk, linked to /tmp/scratch. You can reconfigure /scratch to use a separate disk or LUN.
For best performance and memory optimization, VMware recommends that you do not leave
/scratch on the ESXi host ramdisk.
To reconfigure /scratch, see Set the Scratch Partition from the vSphere Client.
Due to the I/O sensitivity of USB and SD devices, the installer does not create a scratch
partition on these devices. As such, there is no tangible benefit to using large USB/SD
devices, because ESXi uses only the first 1GB. When installing on USB or SD devices, the installer
attempts to allocate a scratch region on an available local disk or datastore. If no local disk or
datastore is found, /scratch is placed on the ramdisk. You should reconfigure /scratch to use
a persistent datastore following the installation.
In Auto Deploy installations, the installer attempts to allocate a scratch region on an
available local disk or datastore. If no local disk or datastore is found /scratch is placed on
ramdisk. You should reconfigure /scratch to use a persistent datastore following the
installation.
For environments that boot from a SAN or use Auto Deploy, it is not necessary to allocate a
separate LUN for each ESXi host. You can co-locate the scratch regions for many ESXi hosts
onto a single LUN. The number of hosts assigned to any single LUN should be weighed
against the LUN size and the I/O behavior of the virtual machines.
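The placement rules above lend themselves to a quick pre-install sanity check. The sketch below is only an illustration of the /scratch logic described in this section, not a VMware utility; the thresholds (1GB minimum boot device, 5.2GB for the VMFS volume plus 4GB scratch) come straight from the text, and the function name is my own.

```python
# Minimal sketch of the /scratch placement rules described in section 2.2.1.
# Not a VMware tool; device types and size thresholds are taken from the text.

def plan_scratch(boot_device, boot_size_gb, local_datastore_available):
    """Return where the installer is expected to place /scratch."""
    if boot_size_gb < 1.0:
        return "unsupported: the boot device must be at least 1GB"
    if boot_device in ("usb", "sd"):
        # The installer never creates a scratch partition on USB/SD media.
        return ("scratch region on a local disk/datastore" if local_datastore_available
                else "ramdisk (reconfigure /scratch to a persistent datastore)")
    # Local disk or SAN/iSCSI LUN.
    if boot_size_gb >= 5.2:
        return "4GB scratch partition on the boot device (plus VMFS volume)"
    return ("scratch region on a local disk/datastore" if local_datastore_available
            else "ramdisk (reconfigure /scratch to a persistent datastore)")

if __name__ == "__main__":
    print(plan_scratch("usb", 8, local_datastore_available=False))
    print(plan_scratch("local", 10, local_datastore_available=True))
```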
2.2.2 Recommendation for Enhanced ESXi Performance
To enhance performance, install ESXi on a robust system with more RAM than the
minimum required and with multiple physical disks.
Recommendations for Enhanced Performance

System Element: RAM
Recommendation: ESXi hosts require more RAM than typical servers. Provide at least 8GB of RAM to take full advantage of ESXi features and run virtual machines in typical production environments. An ESXi host must have sufficient RAM to run concurrent virtual machines. The following examples are provided to help you calculate the RAM required by the virtual machines running on the ESXi host.
Operating four virtual machines with Red Hat Enterprise Linux or Windows XP requires at least 3GB of RAM for baseline performance. This figure includes approximately 1024MB for the virtual machines (256MB minimum for each operating system, as recommended by vendors).
Running these four virtual machines with 512MB RAM requires that the ESXi host have approximately 4GB RAM, which includes 2048MB for the virtual machines.
These calculations do not take into account possible memory savings from using variable overhead memory for each virtual machine. See vSphere Resource Management.

System Element: Dedicated Fast Ethernet adapters for virtual machines
Recommendation: Place the management network and virtual machine networks on different physical network cards. Dedicated Gigabit Ethernet cards for virtual machines, such as Intel PRO 1000 adapters, improve throughput to virtual machines with high network traffic.

System Element: Disk location
Recommendation: Place all data that your virtual machines use on physical disks allocated specifically to virtual machines. Performance is better when you do not place your virtual machines on the disk containing the ESXi boot image. Use physical disks that are large enough to hold disk images that all the virtual machines use.

System Element: VMFS5 partitioning
Recommendation: The ESXi installer creates the initial VMFS volumes on the first blank local disk found. To add disks or modify the original configuration, use the vSphere Client. This practice ensures that the starting sectors of partitions are 64K-aligned, which improves storage performance.
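As a worked example of the RAM guidance above, the short calculation below reproduces the numbers quoted in the table (four VMs at 256MB needing roughly 3GB, four VMs at 512MB needing roughly 4GB). The 2GB hypervisor baseline is inferred from those figures and from the 2GB physical RAM minimum; treat it as an approximation, not an official formula.

```python
# Rough host RAM estimate implied by the examples above: about 2GB for ESXi
# itself plus the memory assigned to the virtual machines.
ESXI_BASELINE_MB = 2048  # approximation inferred from the worked examples

def host_ram_gb(num_vms, vm_ram_mb):
    total_mb = ESXI_BASELINE_MB + num_vms * vm_ram_mb
    return total_mb / 1024

print(host_ram_gb(4, 256))   # ~3.0GB, matches the first example
print(host_ram_gb(4, 512))   # ~4.0GB, matches the second example
```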
2.3 ESXi Support for 64-Bit Guest Operating Systems
ESXi offers support for several 64-bit guest operating systems.
For a complete list of operating systems supported for ESXi, see the VMware Compatibility
Guide at http://www.vmware.com/resources/compatibility/search.php.
Hosts running virtual machines with 64-bit guest operating systems have the following
hardware requirements:
• For AMD Opteron-based systems, the processors must be Opteron Rev E or later.
• For Intel Xeon-based systems, the processors must include support for Intel Virtualization Technology (VT). Many servers that include CPUs with VT support might have VT disabled by default, so you must enable VT manually. If your CPUs support VT but you do not see this option in the BIOS, contact your vendor to request a BIOS version that lets you enable VT support.
To determine whether your server has 64-bit VMware support, you can download the CPU
Identification Utility from the VMware Web site.
2.4 Hardware Requirements for vCenter Server, vCenter Single
Sign On, vSphere Client, and vSphere Web Client
The vCenter Server system is a physical machine or virtual machine with access to a
supported database, and it must meet the hardware requirements described in the
following sections.
2.4.1 vCenter Single Sign On, Inventory Service and vCenter Server Hardware
Requirements
You can install vCenter Single Sign On, Inventory Service, and vCenter Server on the same
host machine (as with vCenter Simple Install) or on different machines. Minimum Hardware
Requirements for vCenter Single Sign On, Running on a Separate Host Machine from
vCenter Server and Minimum Hardware Requirements for vCenter Inventory Service,
Running on a Separate Host Machine from vCenter Server list the hardware requirements
for Single Sign On and Inventory Service, running on separate host machines. If you install
vCenter Single Sign On, vCenter Inventory Service, and vCenter Server on the same host
machine, the Single Sign On and Inventory Service memory and disk storage requirements
are in addition to the requirements for vCenter Server. See Minimum Hardware
Requirements for vCenter Server.
Minimum Hardware Requirements for vCenter Single Sign On, Running on a Separate Host Machine from vCenter Server

Processor: Intel or AMD x64 processor with two or more logical cores, each with a speed of 2GHz.
Memory: 3GB. Memory requirements might be higher if the vCenter Single Sign On database runs on the same host machine. If vCenter Single Sign On runs on the same host machine as vCenter Server, see Minimum Hardware Requirements for vCenter Server.
Disk storage: 2GB. Disk requirements might be higher if the vCenter Single Sign On database runs on the same host machine.
Network speed: 1Gbps
2.4.2 Minimum Hardware Requirements for vCenter Inventory Service,
Running on a Separate Host Machine from vCenter Server
Processor: Intel or AMD x64 processor with two or more logical cores, each with a speed of 2GHz.
Memory: 3GB. If vCenter Inventory Service runs on the same host machine as vCenter Server, see Minimum Hardware Requirements for vCenter Server.
Disk storage: At least 60GB for medium- to large-sized inventories (more than 100 hosts or 1000 virtual machines). If vCenter Inventory Service runs on the same host machine as vCenter Server, see Minimum Hardware Requirements for vCenter Server.
Network speed: 1Gbps
2.4.3 Minimum Hardware Requirements for vCenter Server
CPU: Two 64-bit CPUs or one 64-bit dual-core processor.
Processor: 2.0GHz or faster Intel 64 or AMD 64 processor. The Itanium (IA64) processor is not supported. Processor requirements might be higher if the database runs on the same machine.
Memory: The amount of memory needed depends on your vCenter Server configuration. If vCenter Server is installed on a different host machine than vCenter Single Sign On and vCenter Inventory Service, 4GB of RAM are required. If vCenter Server, vCenter Single Sign On and vCenter Inventory Service are installed on the same host machine (as with vCenter Simple Install), 10GB of RAM are required. Memory requirements are higher if the vCenter Server database or vCenter Single Sign On database runs on the same machine as vCenter Server. vCenter Server includes several Java services: VMware VirtualCenter Management Webservices (tc Server), Inventory Service, and Profile-Driven Storage Service. When you install vCenter Server, you select the size of your vCenter Server inventory to allocate memory for these services. The inventory size determines the maximum JVM heap settings for the services. You can adjust this setting after installation if the number of hosts in your environment changes. See the recommendations in JVM Heap Settings for vCenter Server.
Disk storage: The amount of disk storage needed for the vCenter Server installation depends on your vCenter Server configuration. If vCenter Server is installed on a different host machine than vCenter Single Sign On and vCenter Inventory Service, 4GB are required. If vCenter Server, vCenter Single Sign On and vCenter Inventory Service are installed on the same host machine, the Single Sign On and Inventory Service disk requirements are in addition to the vCenter Server requirements.

The JVM heap settings for vCenter Server depend on your inventory size. See Configuring VMware Tomcat Server Settings in vCenter Server 5.1.
2.4.4 JVM Heap Settings for vCenter Server
vCenter Server Inventory: Small inventory (1-100 hosts or 1-1000 virtual machines)
VMware VirtualCenter Management Webservices (tc Server): 1GB
Inventory Service: 3GB
Profile-Driven Storage Service: 512MB

vCenter Server Inventory: Medium inventory (100-400 hosts or 1000-4000 virtual machines)
VMware VirtualCenter Management Webservices (tc Server): 2GB
Inventory Service: 6GB
Profile-Driven Storage Service: 1GB

vCenter Server Inventory: Large inventory (more than 400 hosts or 4000 virtual machines)
VMware VirtualCenter Management Webservices (tc Server): 3GB
Inventory Service: 12GB
Profile-Driven Storage Service: 2GB
Note
Installing vCenter Server on a network drive or USB flash drive is not supported.
For the hardware requirements of your database, see your database documentation. The
database requirements are in addition to the vCenter Server requirements if the database
and vCenter Server run on the same machine.
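For reference, the JVM heap table above can be expressed as a simple lookup. The sketch below is illustrative only; the tier boundaries and heap sizes are copied from the table, and the helper function is my own, not part of any VMware tooling.

```python
# JVM heap sizes per service, keyed by inventory tier (values from the table above).
JVM_HEAP = {
    "small":  {"tc Server": "1GB", "Inventory Service": "3GB",  "Profile-Driven Storage": "512MB"},
    "medium": {"tc Server": "2GB", "Inventory Service": "6GB",  "Profile-Driven Storage": "1GB"},
    "large":  {"tc Server": "3GB", "Inventory Service": "12GB", "Profile-Driven Storage": "2GB"},
}

def inventory_tier(hosts, vms):
    """Classify an inventory using the host/VM boundaries from the table."""
    if hosts > 400 or vms > 4000:
        return "large"
    if hosts > 100 or vms > 1000:
        return "medium"
    return "small"

print(JVM_HEAP[inventory_tier(hosts=150, vms=1200)])  # medium-tier heap settings
```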
2.4.5 VMware vCenter Server Appliance Hardware Requirements and
Recommendations
Important
The embedded database is not configured to manage an inventory that contains
more than 5 hosts and 50 virtual machines. If you use the embedded database
with the vCenter Server Appliance, exceeding these limits can cause numerous
problems, including causing vCenter Server to stop responding.
2.4.5.1 Hardware Requirements for VMware vCenter Server Appliance
Disk storage on the host machine: The vCenter Server Appliance requires at least 7GB of disk space, and is limited to a maximum size of 80GB. The vCenter Server Appliance can be deployed with thin-provisioned virtual disks that can grow to the maximum size of 80GB. If the host machine does not have enough free disk space to accommodate the growth of the vCenter Server Appliance virtual disks, vCenter Server might cease operation, and you will not be able to manage your vSphere environment.

Memory in the VMware vCenter Server Appliance:
Very small inventory (10 or fewer hosts, 100 or fewer virtual machines): at least 4GB.
Small inventory (10-100 hosts or 100-1000 virtual machines): at least 8GB.
Medium inventory (100-400 hosts or 1000-4000 virtual machines): at least 16GB.
Large inventory (more than 400 hosts or 4000 virtual machines): at least 24GB.
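A quick way to read the appliance guidance above is to check the embedded-database limits first and then pick the memory tier. This is an illustrative sketch using only the figures from the Important note and the memory list; the function is my own and not part of any VMware tooling.

```python
# vCenter Server Appliance sizing sketch, based on the figures above.
def vcsa_guidance(hosts, vms):
    notes = []
    if hosts > 5 or vms > 50:
        notes.append("embedded database not suitable; use an external database")
    if hosts <= 10 and vms <= 100:
        memory = "at least 4GB"
    elif hosts <= 100 and vms <= 1000:
        memory = "at least 8GB"
    elif hosts <= 400 and vms <= 4000:
        memory = "at least 16GB"
    else:
        memory = "at least 24GB"
    notes.append("memory: " + memory)
    return notes

print(vcsa_guidance(hosts=20, vms=300))
```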
2.5 vCenter Server and vSphere Client System Recommendations
for Performance Based on Deployment Size
The number of hosts and powered-on virtual machines in your environment affects
performance. Use the following system requirements as minimum guidelines for reasonable
performance. For increased performance, you can configure systems in your environment with values
greater than those listed here.
Processing requirements are listed in terms of hardware CPU cores. Only physical cores are
counted. In hyperthreaded systems, logical CPUs do not count as separate cores.
Medium Deployment of Up to 50 Hosts and 500 Powered-On Virtual Machines
vCenter Server: 2 cores, 4GB memory, 5GB disk
vSphere Client: 1 core, 1GB memory, 1.5GB disk

Large Deployment of Up to 300 Hosts and 3,000 Powered-On Virtual Machines
vCenter Server: 4 cores, 8GB memory, 10GB disk
vSphere Client: 1 core, 1GB memory, 1.5GB disk

Extra-Large Deployment of Up to 1,000 Hosts and 10,000 Powered-On Virtual Machines
vCenter Server: 8 cores, 16GB memory, 10GB disk
vSphere Client: 2 cores, 1GB memory, 1.5GB disk
2.5.1 vSphere Web Client Hardware Requirements
The vSphere Web Client has two components: A Java server and an Adobe Flex client
application running in a browser.
Hardware Requirements for the vSphere Web Client Server Component

Memory: At least 2GB: 1GB for the Java heap, and 1GB for the resident code, the stack for Java threads, and the global/bss segments for the Java process.
CPU: 2.00 GHz processor with 4 cores.
Disk Storage: At least 2GB of free disk space.
Networking: Gigabit connection recommended.
2.5.2 Recommended Minimum Size and Rotation Configuration for hostd, vpxa, and fdm Logs

Management Agent (hostd): maximum log file size 10240KB, 10 rotations to preserve, minimum disk space required 100MB.
VirtualCenter Agent (vpxa): maximum log file size 5120KB, 10 rotations to preserve, minimum disk space required 50MB.
vSphere HA agent (Fault Domain Manager, fdm): maximum log file size 5120KB, 10 rotations to preserve, minimum disk space required 50MB.

Important
The recommended disk sizes assume default log levels. If you configure more detailed log levels, more disk space is required.
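The minimum disk space column above is simply the maximum log file size multiplied by the number of rotations to preserve. The short calculation below reproduces those figures; the KB values use the 1024-based convention of the table.

```python
# Minimum disk space = maximum log file size x number of rotations preserved.
def min_disk_space_mb(max_file_kb, rotations):
    return max_file_kb * rotations / 1024

print(min_disk_space_mb(10240, 10))  # hostd: 100.0 MB
print(min_disk_space_mb(5120, 10))   # vpxa / fdm: 50.0 MB
```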
2.6 Required Ports for vCenter Server
Port 80: vCenter Server requires port 80 for direct HTTP connections. Port 80 redirects requests to HTTPS port 443. This redirection is useful if you accidentally use http://server instead of https://server. If you use a custom Microsoft SQL database (not the bundled SQL Server 2008 database) that is stored on the same host machine as the vCenter Server, port 80 is used by the SQL Reporting Service. When you install vCenter Server, the installer will prompt you to change the HTTP port for vCenter Server. Change the vCenter Server HTTP port to a custom value to ensure a successful installation. Microsoft Internet Information Services (IIS) also use port 80. See Conflict Between vCenter Server and IIS for Port 80.

Port 389: This port must be open on the local and all remote instances of vCenter Server. This is the LDAP port number for the Directory Services for the vCenter Server group. The vCenter Server system needs to bind to port 389, even if you are not joining this vCenter Server instance to a Linked Mode group. If another service is running on this port, it might be preferable to remove it or change its port to a different port. You can run the LDAP service on any port from 1025 through 65535. If this instance is serving as the Microsoft Windows Active Directory, change the port number from 389 to an available port from 1025 through 65535.

Port 443: The default port that the vCenter Server system uses to listen for connections from the vSphere Client. To enable the vCenter Server system to receive data from the vSphere Client, open port 443 in the firewall. The vCenter Server system also uses port 443 to monitor data transfer from SDK clients. If you use another port number for HTTPS, you must use ip-address:port when you log in to the vCenter Server system.

Port 636: For vCenter Server Linked Mode, this is the SSL port of the local instance. If another service is running on this port, it might be preferable to remove it or change its port to a different port. You can run the SSL service on any port from 1025 through 65535.
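When troubleshooting connectivity, it can help to confirm that the TCP ports above are reachable from a client machine. The snippet below is a generic reachability probe using only the Python standard library; the host name is a placeholder and the port list simply mirrors the table above, so it is not an exhaustive list of vCenter ports.

```python
import socket

# TCP ports from the table above; "vcenter.example.com" is a placeholder host name.
VCENTER_PORTS = {80: "HTTP redirect", 389: "LDAP", 443: "HTTPS", 636: "LDAPS (Linked Mode)"}

def check_ports(host, ports, timeout=3):
    for port, label in ports.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                print(f"{host}:{port} ({label}) reachable")
        except OSError as err:
            print(f"{host}:{port} ({label}) NOT reachable: {err}")

check_ports("vcenter.example.com", VCENTER_PORTS)
```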
2.7 Required Ports for the vCenter Server Appliance
Port 80: vCenter Server requires port 80 for direct HTTP connections. Port 80 redirects requests to HTTPS port 443. This redirection is useful if you accidentally use http://server instead of https://server.

Port 443: The default port that the vCenter Server system uses to listen for connections from the vSphere Client. To enable the vCenter Server system to receive data from the vSphere Client, open port 443 in the firewall. The vCenter Server system also uses port 443 to monitor data transfer from SDK clients. If you use another port number for HTTPS, you must use ip-address:port when you log in to the vCenter Server system.

Port 902: The default port that the vCenter Server system uses to send data to managed hosts. Managed hosts also send a regular heartbeat over UDP port 902 to the vCenter Server system. This port must not be blocked by firewalls between the server and the hosts or between hosts. Port 902 must not be blocked between the vSphere Client and the hosts. The vSphere Client uses this port to display virtual machine consoles.

Port 8080: Web Services HTTP. Used for the VMware VirtualCenter Management Web Services.

Port 8443: Web Services HTTPS. Used for the VMware VirtualCenter Management Web Services.

Port 10080: vCenter Inventory Service HTTP.

Port 10443: vCenter Inventory Service HTTPS.

Port 10109: vCenter Inventory Service database.

Port 514: vSphere Syslog Collector server.
2.8 Prepare Your System and Install the Auto Deploy Server
Before you turn on a host for PXE boot with vSphere Auto Deploy, you must install
prerequisite software and set up the DHCP and TFTP servers that Auto Deploy interacts
with.
• Ensure that the hosts that you will provision with Auto Deploy meet the hardware requirements for ESXi 5.1. See ESXi Hardware Requirements.
  Note: You cannot provision EFI hosts with Auto Deploy unless you switch the EFI system to BIOS compatibility mode.
• Ensure that the ESXi hosts have network connectivity to vCenter Server and that all port requirements are met. See Required Ports for vCenter Server.
• If you want to use VLANs in your Auto Deploy environment, you must set up the end-to-end networking properly. When the host is PXE booting, the UNDI driver must be set up to tag the frames with the proper VLAN IDs. You must do this setup manually by making the correct changes in the BIOS. You must also configure the ESXi port groups with the correct VLAN IDs. Ask your network administrator how VLAN IDs are used in your environment.
• Ensure that you have enough storage for the Auto Deploy repository. The Auto Deploy server uses the repository to store data it needs, including the rules and rule sets you create and the VIBs and image profiles that you specify in your rules. Best practice is to allocate 2GB to have enough room for four image profiles and some extra space. Each image profile requires approximately 350MB. Determine how much space to reserve for the Auto Deploy repository by considering how many image profiles you expect to use. (A rough sizing sketch follows this list.)
• Obtain the vCenter Server installation media, which include the Auto Deploy installer, or deploy the vCenter Server Appliance. See Installing vCenter Server and Using Auto Deploy with the VMware vCenter Server Appliance.
• Ensure that a TFTP server is available in your environment. If you require a supported solution, purchase a supported TFTP server from your vendor of choice.
• Obtain administrative privileges to the DHCP server that manages the network segment you want to boot from. You can use a DHCP server already in your environment, or install a DHCP server. For your Auto Deploy setup, replace the gpxelinux.0 file name with undionly.kpxe.vmw-hardwired.
• Secure your network as you would for any other PXE-based deployment method. Auto Deploy transfers data over SSL to prevent casual interference and snooping. However, the authenticity of the client or the Auto Deploy server is not checked during a PXE boot.
  Note: Auto Deploy is not supported with NPIV (N_Port ID Virtualization).
• Set up a remote Syslog server. See the vCenter Server and Host Management documentation for Syslog server configuration information. Configure the first host you boot to use the remote Syslog server and apply that host's host profile to all other target hosts. Optionally, install and use the vSphere Syslog Collector, a vCenter Server support tool that provides a unified architecture for system logging and enables network logging and combining of logs from multiple hosts.
• Install ESXi Dump Collector and set up your first host so that all core dumps are directed to ESXi Dump Collector, and apply the host profile from that host to all other hosts. See Configure ESXi Dump Collector with ESXCLI and Set Up ESXi Dump Collector from the Host Profiles Interface in the vSphere Client, and Install or Upgrade vSphere ESXi Dump Collector.
• Auto Deploy does not support a pure IPv6 environment because the PXE boot specifications do not support IPv6. However, after the initial PXE boot, the rest of the communication can happen over IPv6. You can register Auto Deploy to the vCenter Server system with IPv6, and you can set up the host profiles to bring up hosts with IPv6 addresses. Only the initial boot process requires an IPv4 address.
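As a worked example of the repository sizing guidance in the list above (roughly 350MB per image profile, with 2GB suggested as enough for four profiles plus some extra space), here is a small estimate. The 25% headroom figure is my own assumption, not a VMware number.

```python
# Rough Auto Deploy repository estimate: ~350MB per image profile plus headroom.
# The 25% margin is an assumption for illustration, not official guidance.
def repo_size_gb(image_profiles, headroom=0.25):
    base_mb = image_profiles * 350
    return round(base_mb * (1 + headroom) / 1024, 2)

print(repo_size_gb(4))   # ~1.71GB, in line with the suggested 2GB allocation
print(repo_size_gb(8))   # plan more space if you expect more image profiles
```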
2.9 Auto Deploy Best Practices and Security Consideration
Follow best practices when installing vSphere Auto Deploy and when using Auto Deploy
with other vSphere components. Set up a highly available Auto Deploy infrastructure in large
production environments or when using stateless caching. Follow all security guidelines that
you would follow in a PXE boot environment, and consider the recommendations in this
chapter.
2.9.1 Auto Deploy and vSphere HA Best Practices
You can improve the availability of the virtual machines running on hosts provisioned with
Auto Deploy by following best practices.
• Some environments configure the hosts provisioned with Auto Deploy with a distributed switch or configure virtual machines running on the hosts with Auto Start Manager. In those environments, deploy the vCenter Server system so that its availability matches the availability of the Auto Deploy server. Several approaches are possible.
  o In a proof of concept environment, deploy the vCenter Server system and the Auto Deploy server on the same system. In all other situations, install the two servers on separate systems.
  o Deploy vCenter Server Heartbeat. VMware vCenter Server Heartbeat delivers high availability for VMware vCenter Server, protecting the virtual and cloud infrastructure from application, configuration, operating system, or hardware related outages.
  o Deploy the vCenter Server system in a virtual machine. Run the vCenter Server virtual machine in a vSphere HA enabled cluster and configure the virtual machine with a vSphere HA restart priority of high. Include two or more hosts in the cluster that are not managed by Auto Deploy and pin the vCenter Server virtual machine to these hosts by using a rule (vSphere HA DRS required VM-to-host rule). You can set up the rule and then disable DRS if you do not wish to use DRS in the cluster. The greater the number of hosts that are not managed by Auto Deploy, the greater your resilience to host failures.
Note
This approach is not suitable if you use Auto Start Manager because Auto Start Manager is
not supported in a cluster enabled for vSphere HA.
2.9.2 Auto Deploy Networking Best Practices
Prevent networking problems by following Auto Deploy networking best practices.
IP Address Allocation: Using DHCP reservations is highly recommended for address allocation. Fixed IP addresses are supported by the host customization mechanism, but providing input for each host is cumbersome and not recommended.

VLAN Considerations: Using Auto Deploy in environments that do not use VLANs is highly recommended. If you intend to use Auto Deploy in an environment that uses VLANs, you must make sure that the hosts you want to provision can reach the DHCP server. How hosts are assigned to a VLAN depends on the setup at your site. The VLAN ID might be assigned by the switch or by the router, or you might be able to set the VLAN ID in the host's BIOS or through the host profile. Contact your network administrator to determine the steps for allowing hosts to reach the DHCP server.
2.9.2.1 Auto Deploy and VMware Tools Best Practices
See the VMware Knowledge Base article 2004018 for Auto Deploy and VMware Tools best
practices.
2.9.2.2 Auto Deploy Load Management Best Practice
Simultaneously booting large numbers of hosts places a significant load on the Auto Deploy
server. Because Auto Deploy is a web server at its core, you can use existing web server
scaling technologies to help distribute the load. For example, one or more caching reverse
proxy servers can be used with Auto Deploy. The reverse proxies serve up the static files that
make up the majority of an ESXi boot image. Configure the reverse proxy to cache static
content and pass all requests through to the Auto Deploy server. See the VMware Techpubs
Video Using Reverse Web Proxy Servers for Auto Deploy.
Configure the hosts to boot off the reverse proxy by using multiple TFTP servers, one for each
reverse proxy server. Finally, set up the DHCP server to send different hosts to different
TFTP servers.
When you boot the hosts, the DHCP server sends them to different TFTP servers. Each TFTP
server sends hosts to a different server, either the Auto Deploy server or a reverse proxy
server, significantly reducing the load on the Auto Deploy server.
After a massive power outage, VMware recommends that you bring up the hosts on a per-cluster
basis. If you bring up multiple clusters simultaneously, the Auto Deploy server might
experience CPU bottlenecks. All hosts come up after a potential delay. The bottleneck is less
severe if you set up the reverse proxy.
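The load-spreading idea described above (DHCP pointing different hosts at different TFTP servers, each fronted by a reverse proxy or the Auto Deploy server itself) can be pictured as a simple round-robin assignment. The sketch below is only a toy illustration of that distribution; it does not generate DHCP or TFTP configuration, and the server names are placeholders.

```python
# Toy illustration of spreading PXE-booting hosts across several TFTP servers,
# each backed by a caching reverse proxy or by the Auto Deploy server itself.
from itertools import cycle

TFTP_SERVERS = ["tftp-proxy-a.example.com", "tftp-proxy-b.example.com",
                "tftp-autodeploy.example.com"]  # placeholder names

def assign_hosts(hosts, servers=TFTP_SERVERS):
    rotation = cycle(servers)
    return {host: next(rotation) for host in hosts}

hosts = [f"esxi{i:02d}" for i in range(1, 7)]
for host, server in assign_hosts(hosts).items():
    print(host, "->", server)
```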
2.9.3 vSphere Auto Deploy Logging and Troubleshooting Best Practices
To resolve problems you encounter with vSphere Auto Deploy, use the Auto Deploy logging
information from the vSphere Client and set up your environment to send logging
information and core dumps to remote hosts.
2.10 Before You Install vCenter Server
2.10.1 System Prerequisites
• Verify that your system meets the requirements listed in Hardware Requirements for vCenter Server, vCenter Single Sign On, vSphere Client, and vSphere Web Client and vCenter Server Software Requirements, and that the required ports are open, as discussed in Required Ports for vCenter Server.
• Before you install or upgrade any vSphere product, synchronize the clocks of all machines on the vSphere network. See Synchronizing Clocks on the vSphere Network.
• Review the Windows Group Policy Object (GPO) password policy for your system machines. The Single Sign On installation requires you to enter passwords that comply with the GPO password policy.
• Verify that the DNS name of the vCenter Server host machine matches the actual computer name.
• Verify that the host name of the machine that you are installing vCenter Server on complies with RFC 952 guidelines.
• The installation path of vCenter Server must be compatible with the installation requirements for Microsoft Active Directory Application Mode (ADAM/AD LDS). The installation path cannot contain any of the following characters: non-ASCII characters, commas (,), periods (.), exclamation points (!), pound signs (#), at signs (@), or percentage signs (%). (A loose validation sketch for the path and name rules follows this list.)
• Verify that the host machine computer name is no more than 15 characters.
• Verify that the system on which you are installing vCenter Server is not an Active Directory domain controller.
• On each system that is running vCenter Server, verify that the domain user account has the following permissions:
  o Member of the Administrators group
  o Act as part of the operating system
  o Log on as a service
• vCenter Server requires the Microsoft .NET 3.5 SP1 Framework. If your system does not have it installed, the vCenter Server installer installs it. The .NET 3.5 SP1 installation might require Internet connectivity to download more files.
• If the system that you use for your vCenter Server installation belongs to a workgroup rather than a domain, not all functionality is available to vCenter Server. If assigned to a workgroup, the vCenter Server system is not able to discover all domains and systems available on the network when using some features. To determine whether the system belongs to a workgroup or a domain, right-click My Computer, click Properties, and click the Computer Name tab. The Computer Name tab displays either a Workgroup label or a Domain label.
• Verify that the NETWORK SERVICE account has read permission on the folder in which vCenter Server is installed and on the HKLM registry.
• During the installation, verify that the connection between the machine and the domain controller is working.
• Before the vCenter Server installation, in the Administrative Tools control panel of the vCenter Single Sign-On instance that you will register vCenter Server to, verify that the vCenter Single Sign-On and RSA SSPI services are started.
• You must log in as a member of the Administrators group on the host machine, with a user name that does not contain any non-ASCII characters.
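Several of the checks above (the 15-character computer name limit, RFC 952-style host names, and the forbidden characters in the installation path) are easy to pre-validate. The sketch below is a loose illustration of those rules only, not an exhaustive or official validator; the function name and the sample path are mine.

```python
import re

FORBIDDEN_PATH_CHARS = set(",.!#@%")  # characters listed in the prerequisites above

def validate_prereqs(computer_name, install_path):
    problems = []
    if len(computer_name) > 15:
        problems.append("computer name longer than 15 characters")
    # Loose RFC 952-style check: letters, digits, hyphens; starts with a letter.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9-]*", computer_name):
        problems.append("host name does not follow RFC 952 guidelines")
    if any(ord(c) > 127 for c in install_path):
        problems.append("installation path contains non-ASCII characters")
    bad = FORBIDDEN_PATH_CHARS.intersection(install_path)
    if bad:
        problems.append("installation path contains forbidden characters: " + "".join(sorted(bad)))
    return problems or ["looks OK"]

print(validate_prereqs("VC01", r"C:\Program Files\VMware\vCenter Server"))
```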
2.10.2 Network Prerequisites
• Verify that the fully qualified domain name (FQDN) of the system where you will install vCenter Server is resolvable. To check that the FQDN is resolvable, type nslookup your_vCenter_Server_fqdn at a command line prompt. If the FQDN is resolvable, the nslookup command returns the IP and name of the domain controller machine.
• Verify that DNS reverse lookup returns a fully qualified domain name when queried with the IP address of the vCenter Server. When you install vCenter Server, the installation of the web server component that supports the vSphere Client fails if the installer cannot look up the fully qualified domain name of the vCenter Server from its IP address. Reverse lookup is implemented using PTR records. To create a PTR record, see the documentation for your vCenter Server host operating system. (A quick forward/reverse lookup check is sketched after this list.)
• Verify that no Network Address Translation (NAT) exists between the vCenter Server system and the hosts it will manage.
• Install vCenter Server, like any other network server, on a machine with a fixed IP address and well-known DNS name, so that clients can reliably access the service. Assign a static IP address and host name to the Windows server that will host the vCenter Server system. This IP address must have a valid (internal) domain name system (DNS) registration. Ensure that the ESXi host management interface has a valid DNS resolution from the vCenter Server and all vSphere Clients. Ensure that the vCenter Server has a valid DNS resolution from all ESXi hosts and all vSphere Clients. If you use DHCP instead of a static IP address for vCenter Server, make sure that the vCenter Server computer name is updated in the domain name service (DNS). Ping the computer name to test this connection. For example, if the computer name is host-1.company.com, run the following command in the Windows command prompt:
  ping host-1.company.com
  If you can ping the computer name, the name is updated in DNS.
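The forward and reverse lookups called out in the first two items above can also be checked with the Python standard library rather than nslookup. The snippet below is a generic illustration; the FQDN is a placeholder.

```python
import socket

FQDN = "vcenter.example.com"  # placeholder; use your vCenter Server FQDN

try:
    ip = socket.gethostbyname(FQDN)            # forward lookup
    print("forward lookup:", FQDN, "->", ip)
    name, _, _ = socket.gethostbyaddr(ip)      # reverse (PTR) lookup
    print("reverse lookup:", ip, "->", name)
except OSError as err:
    print("lookup failed:", err)
```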
For the vCenter Single Sign-On installer to automatically discover Active Directory identity sources, verify that the following conditions are met.
• The Active Directory identity source must be able to authenticate the user who is logged in to perform the Single Sign-On installation.
• The DNS of the Single Sign-On Server host machine must contain both lookup and reverse lookup entries for the domain controller of the Active Directory. For example, pinging mycompany.com should return the domain controller IP address for mycompany. Similarly, the ping -a command for that IP address should return the domain controller hostname. Avoid trying to correct name resolution issues by editing the hosts file. Instead, make sure that the DNS server is correctly set up.
• The system clock of the Single Sign-On Server host machine must be synchronized with the clock of the domain controller.
2.11 (vCenter) Database Prerequisites
• Verify that your vCenter Server database meets the database requirements. See vCenter Server Database Configuration Notes and Preparing vCenter Server Databases.
• Create a vCenter Server database, unless you plan to install the bundled database.
• Create a vCenter Single Sign-On database, unless you plan to install the bundled database.
• If you are using an existing database for Single Sign On, you must create a database user (RSA_USER) and database administrator (RSA_DBA) to use for the Single Sign On database installation and setup. To create these users, run the script rsaIMSLiteDBNameSetupUsers.sql. The script is included in the vCenter Server installer download package, at vCenter Server Installation directory\SSOServer.
• If you are using an existing database with your vCenter Single Sign-On installation or upgrade, make sure that the table spaces are named RSA_DATA and RSA_INDEX. Any other table space names will cause the vCenter Single Sign-On installation to fail.
• If you are using an existing database for Single Sign-On, to ensure that table space is created for the database, run the script rsaIMSLite<DBName>SetupTablespaces.sql. The script is included in the vCenter Server installer download package, at vCenter Server Installation directory\Single Sign On\DBScripts\SSOServer\Schema\your_existing_database. You can run this script prior to the installation, or during the installation, when you are prompted by the installer. You can leave the installer to run the script, and resume the installer after you run the script.
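Because any tablespace names other than RSA_DATA and RSA_INDEX cause the Single Sign-On installation to fail, a trivial pre-check of the names you created can save a failed install. The sketch below only compares name strings; it does not connect to a database, and the function name is my own.

```python
REQUIRED_TABLESPACES = {"RSA_DATA", "RSA_INDEX"}

def check_sso_tablespaces(existing_names):
    """Compare the tablespace names you created against the required ones."""
    existing = {name.upper() for name in existing_names}
    missing = REQUIRED_TABLESPACES - existing
    return "OK" if not missing else "missing or misnamed: " + ", ".join(sorted(missing))

print(check_sso_tablespaces(["RSA_DATA", "RSA_INDEX"]))   # OK
print(check_sso_tablespaces(["SSO_DATA", "RSA_INDEX"]))   # flags the bad name
```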
2.11.1 vCenter Single Sign On Components
vCenter Single Sign On includes these components: STS (Security Token Service), an
administration server, vCenter Lookup Service, and the RSA SSPI service.
When you install vCenter Single Sign-On, the following components are deployed.
STS (Security Token Service): The STS service issues Security Assertion Markup Language (SAML) tokens. These security tokens pass information about a system user between an identity provider and a web service. This service enables a user who has logged on through vCenter Single Sign-On to use multiple web-service delivered applications without authenticating to each one.

Administration server: The Administration Server configures the vCenter Single Sign-On server and manages users and groups.

vCenter Lookup Service: The Lookup Service contains topology information about the vSphere infrastructure, enabling vSphere components to connect to each other securely.

RSA SSPI service: The Security Support Provider Interface is a Microsoft Windows-based API used to perform authentication against Security Support Providers such as NTLM and Kerberos.
2.11.2 vCenter Lookup Service
vCenter Lookup Service is a component of vCenter Single Sign On. Lookup Service registers
the location of vSphere components so they can securely find and communicate with each
other.
The vCenter Single Sign-On installer also deploys the VMware Lookup Service on the same
address and port. The Lookup Service enables different components of vSphere to find one
another in a secure way. When you install vCenter Server components after vCenter Single
Sign-On, you must provide the Lookup Service URL. The Inventory Service and the vCenter
Server installers ask for the Lookup Service URL and then contact the Lookup Service to find
vCenter Single Sign-On. After installation, the Inventory Service and vCenter Server are
registered in Lookup Service so other vSphere components, like the vSphere Web Client, can
find them.
2.11.3 Setting the vCenter Server Administrator User
In vCenter Server 5.1 with vCenter Single Sign On, the way you set the vCenter Server
administrator user depends on your vCenter Single Sign On deployment.
In vSphere versions before vSphere 5.1, vCenter Server administrators are the users that
belong to the local operating system administrators group.
In vSphere 5.1, when you install vCenter Server, you must provide the default (initial)
vCenter Server administrator user or group. For small deployments where vCenter Server
and vCenter Single Sign-On are deployed on the same host machine, you can designate the
local operating system group Administrators as vCenter Server administrative users. This
option is the default. This behavior is unchanged from vCenter Server 5.0.
For larger installations, where vCenter Single Sign-On and vCenter Server are deployed on
different hosts, you cannot preserve the same behavior as in vCenter Server 5.0. Instead,
assign the vCenter Server administrator role to a user or group from an identity source that
is registered in the vCenter Single Sign-On server: Active Directory, OpenLDAP, or the
system identity source.
2.11.4 Authenticating to the vCenter Server 5.1 Environment
In vCenter Server 5.1, users authenticate through vCenter Single Sign On.
In vCenter Server versions earlier than vCenter Server 5.1, when a user connects to vCenter
Server, vCenter Server authenticates the user by validating the user against an Active
Directory domain or the list of local operating system users.
Because vCenter Server now has its own vCenter Single Sign-On server, you must create
Single Sign-On users to manage the Single Sign-On server. These users might be different
from the users that administer vCenter Server.
The default vCenter Single Sign-On administrator user ID is admin@System-Domain. You
can create Single Sign-On administrator users with the Single Sign-On administration tool in
the vSphere Web Client. You can associate the following permissions with these users: Basic,
Regular, and Administrator.
Users can log in to vCenter Server with the vSphere Client or the vSphere Web Client.
 Using the vSphere Client, the user logs in to each vCenter Server separately. All linked
vCenter Server instances are visible on the left pane of the vSphere Client. The vSphere Client
does not show vCenter Server systems that are not linked to the vCenter Server that the user
logged in to unless the user connects to those vCenter Server systems explicitly. This behavior
is unchanged from vCenter Server versions earlier than version 5.1.
 Using the vSphere Web Client, users authenticate to vCenter Single Sign-On, and are connected
to the vSphere Web Client. Users can view all the vCenter Server instances that the user has
permissions on. After users connect to vCenter Server, no further authentication is required.
The actions users can perform on objects depend on the user's vCenter Server permissions on
those objects.
For vCenter Server versions earlier than vCenter Server 5.1, you must explicitly register each
vCenter Server system with the vSphere Web Client, using the vSphere Web Client
Administration Application.
For more information about vCenter Single Sign On, see vSphere Security.
2.11.5 How vCenter Single Sign-On Deployment Scenarios Affect Log In
Behavior
The way that you deploy vCenter Single Sign-On and the type of user who installs vCenter Single
Sign-On affect which administrator user accounts have privileges on the Single Sign-On server and
on vCenter Server.
During the vCenter Server installation process, certain users are granted privileges to log in
to vCenter Server and certain users are granted privileges to manage vCenter Single Sign-On.
The vCenter Server administrator might not be the same user as the vCenter Single Sign-On
administrator. This means that when you log in to the vSphere Web Client as the default
Single Sign-On administrator (admin@System-Domain), you might not see any vCenter
Server systems in the inventory. The inventory appears to be empty because you see only the
systems upon which you have privileges in the vSphere Web Client.
This also means that when you log in to the vSphere Web Client as the default vCenter Server
administrator, you might not see the vCenter Single Sign-On configuration tool. The
configuration tool is not present because only the default vCenter Single Sign-On
Administrator (admin@System-Domain) is allowed to view and manage vCenter Single Sign-On after
installation. The Single Sign-On administrator can create additional administrator users if
necessary.
2.11.6 Login Behavior When You Use vCenter Simple Install
The vCenter Simple Install process installs vCenter Single Sign-On, the Inventory Service,
and vCenter Server on one system. The account you use when you run the Simple Install
process affects which users have privileges on which components.
When you log in as a domain account user or local account user to install vCenter Server
using vCenter Simple Install, the following behavior occurs upon installation.
 By default, users in the local operating system Administrators group can log in to the vSphere
Web Client and vCenter Server. These users cannot configure Single Sign-On or view the Single
Sign-On management interface in the vSphere Web Client.
 By default, the vCenter Single Sign-On administrator user is admin@System-Domain. This user can
log in to the vSphere Web Client to configure Single Sign-On and add accounts to manage Single
Sign-On if necessary. This user cannot view or configure vCenter Server.
 If you are logged in as a domain account user, the default Active Directory identity sources
are discovered automatically during vCenter Single Sign On installation. If you are logged in as
a local account user, Active Directory identity sources are not discovered automatically during
vCenter Single Sign On installation.
 The local operating system (localos or hostname) users are added as an identity source.
2.11.7 Login Behavior When You Deploy vCenter Single Sign-On as a
Standalone Server
Deploying vCenter Single Sign-On in Basic mode means that a standalone version of vCenter
Single Sign-On is installed on a system. Multiple vCenter Server, Inventory Service, and
vSphere Web Client instances can point to this standalone version of vCenter Single Sign-On.
In this deployment scenario, the installation process grants admin@System-Domain vCenter
Server privileges by default. In addition, the installation process creates the user
admin@System-Domain to manage vCenter Single Sign-On.
Note
When you install vCenter Server components with separate installers, you can choose which
account or group can log in to vCenter Server upon installation. Specify this account or group
on the Single Sign-On Information page of the installer, in the following text box: vCenter
Server administrator recognized by vCenter Single Sign-On. For example, to grant a group of
domain administrators permission to log in to vCenter Server, type the name of the domain
administrators group, such as Domain Admins@VCADSSO.LOCAL.
In high availability and multisite Single Sign-On modes, there is no local operating system
identity source. Therefore, it will not work if you enter Administrators or Administrator
in the text box vCenter Server administrator recognized by vCenter Single Sign-On.
Administrators is treated as the local operating system group Administrators, and
Administrator is treated as the local operating system user Administrator.
2.11.8 Identity Sources for vCenter Server with vCenter Single Sign On
vCenter Server 5.1 with vCenter Single Sign On adds support for several new types of user
repository.
vCenter Server versions earlier than version 5.1 supported Active Directory and local
operating system users as user repositories. vCenter Server 5.1 supports the following types
of user repositories as identity sources.
 Active Directory.
 OpenLDAP.
 Local operating system.
 System.
vCenter Single Sign-On identity sources are managed by Single Sign-On administrator users.
You can attach multiple identity sources from each type to a single Single Sign-On server.
Each identity source has a name that is unique within the scope of the corresponding Single
Sign-On server instance. There is always exactly one System identity source, named System-Domain.
There can be at most one local operating system identity source. On Linux systems, the identity
source label is localOS. On Windows systems, the identity source label is the system's host name.
The local operating system identity source can exist only in non-clustered Single Sign-On server
deployments.
You can attach remote identity sources to a Single Sign-On server instance. Remote identity
sources are limited to Active Directory and OpenLDAP server implementations.
During Single Sign On installation, the installer can automatically discover Active Directory
identity sources, if your system meets the appropriate prerequisites. See the section
"Network Prerequisites" in Prerequisites for Installing vCenter Single Sign-On, Inventory
Service, and vCenter Server.
For more information about vCenter Single Sign On, see vSphere Security.
2.12 Using a User Account for Running vCenter Server
You can use the Microsoft Windows built-in system account or a user account to run vCenter
Server. With a user account, you can enable Windows authentication for SQL Server, and it
provides more security.
The user account must be an administrator on the local machine. In the installation wizard,
you specify the account name as DomainName\Username. You must configure the SQL
Server database to allow the domain account access to SQL Server.
The Microsoft Windows built-in system account has more permissions and rights on the
server than the vCenter Server system needs, which can contribute to security problems.
For SQL Server DSNs configured with Windows authentication, use the same user account
for the VMware VirtualCenter Management Webservices service and the DSN user.
If you do not plan to use Microsoft Windows authentication for SQL Server or you are using
an Oracle or DB2 database, you might still want to set up a local user account for the vCenter
Server system. The only requirement is that the user account is an administrator on the local
machine.
Note
If you install an instance of vCenter Server as a local system account on a local SQL Server
database with Integrated Windows NT Authentication, and you add an Integrated Windows
NT Authentication user to the local database server with the same default database as
vCenter Server, vCenter Server might not start. See vCenter Server Fails to Start When
Installed as a Local System Account on a Local SQL Server Database with Integrated
Windows NT Authentication.
3 HA Clusters
3.1 Requirements for a vSphere HA Cluster
Review this list before setting up a vSphere HA cluster. For more information, follow the
appropriate cross reference or see Creating a vSphere HA Cluster.
 All hosts must be licensed for vSphere HA.
 You need at least two hosts in the cluster.
 All hosts need to be configured with static IP addresses. If you are using DHCP, you must
ensure that the address for each host persists across reboots.
 There should be at least one management network in common among all hosts and best practice is
to have at least two. Management networks differ depending on the version of host you are using.
o ESX hosts - service console network.
o ESXi hosts earlier than version 4.0 - VMkernel network.
o ESXi hosts version 4.0 and later - VMkernel network with the Management traffic checkbox
enabled.
See Best Practices for Networking.
To ensure that any virtual machine can run on any host in the cluster, all hosts should have
access to the same virtual machine networks and datastores. Similarly, virtual machines
must be located on shared, not local, storage otherwise they cannot be failed over in the case
of a host failure.
Note
vSphere HA uses datastore heartbeating to distinguish between partitioned, isolated, and
failed hosts. Accordingly, if there are some datastores that are more reliable in your
environment, configure vSphere HA to give preference to them.
 For VM Monitoring to work, VMware Tools must be installed. See VM and Application Monitoring.
 vSphere HA supports both IPv4 and IPv6. A cluster that mixes the use of both of these protocol
versions, however, is more likely to result in a network partition.
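The basic cluster requirements above can also be spot-checked programmatically. The sketch below is
illustrative only; it assumes a Python environment with pyVmomi installed and placeholder vCenter
credentials, and simply reports the host count and whether vSphere HA is enabled per cluster.

# Illustrative sketch (pyVmomi assumed, placeholder credentials): report host
# count and whether vSphere HA is enabled for each cluster in the inventory.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()  # lab use only; verify certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="changeme", sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    for cluster in view.view:
        das = cluster.configurationEx.dasConfig
        print("%s: %d hosts, HA enabled: %s"
              % (cluster.name, len(cluster.host), das.enabled))
finally:
    Disconnect(si)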
3.2 EVC Requirements for Hosts
To improve CPU compatibility between hosts that have varying CPU feature sets, you can
hide some host CPU features from the virtual machine by placing the host in an Enhanced
vMotion Compatibility (EVC) cluster. Hosts in an EVC cluster and hosts that you add to an
existing EVC cluster must meet EVC requirements.
 Power off all virtual machines in the cluster that are running on hosts with a feature set
greater than the EVC mode that you intend to enable, or migrate them out of the cluster.
 All hosts in the cluster must meet the following requirements.
 Supported ESX/ESXi version: ESX/ESXi 3.5 Update 2 or later.
 vCenter Server: The host must be connected to a vCenter Server system.
 CPUs: A single vendor, either AMD or Intel.
 Advanced CPU features enabled: Enable these CPU features in the BIOS if they are available:
o Hardware virtualization support (AMD-V or Intel VT)
o AMD No eXecute (NX)
o Intel eXecute Disable (XD)
Note
Hardware vendors sometimes disable particular CPU features in the BIOS by default. This can cause
problems in enabling EVC, because the EVC compatibility checks detect the absence of features that
are expected to be present for a particular CPU. If you cannot enable EVC on a system with a
compatible processor, ensure that all features are enabled in the BIOS.
 Supported CPUs for the EVC mode that you want to enable: To check EVC support for a specific
processor or server model, see the VMware Compatibility Guide at
http://www.vmware.com/resources/compatibility/search.php.
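Once processor support is confirmed in the Compatibility Guide, the EVC baseline actually in effect
can also be read from the API. The following is a hedged pyVmomi sketch with placeholder
credentials; the maxEVCModeKey property may be unset on hosts that do not report EVC capability.

# Illustrative sketch (pyVmomi assumed, placeholder credentials): show the
# cluster's current EVC mode and each host's highest supported EVC mode.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="changeme",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    for cluster in view.view:
        print("%s current EVC mode: %s" % (cluster.name, cluster.summary.currentEVCModeKey))
        for host in cluster.host:
            print("  %s max EVC mode: %s" % (host.name, host.summary.maxEVCModeKey))
finally:
    Disconnect(si)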
3.3 Best Practices for vSphere HA Clusters
To ensure optimal vSphere HA cluster performance, you should follow certain best practices.
This topic highlights some of the key best practices for a vSphere HA cluster. You can also
refer to the vSphere High Availability Deployment Best Practices publication for further
discussion.
3.3.1 Setting Alarms to Monitor Cluster Changes
When vSphere HA or Fault Tolerance take action to maintain availability, for example, a
virtual machine failover, you can be notified about such changes. Configure alarms in
vCenter Server to be triggered when these actions occur, and have alerts, such as emails, sent
to a specified set of administrators.
Several default vSphere HA alarms are available.
 Insufficient failover resources (a cluster alarm)
 Cannot find master (a cluster alarm)
 Failover in progress (a cluster alarm)
 Host HA status (a host alarm)
 VM monitoring error (a virtual machine alarm)
 VM monitoring action (a virtual machine alarm)
 Failover failed (a virtual machine alarm)
Note
The default alarms include the feature name, vSphere HA.
3.3.2 Monitoring Cluster Validity
A valid cluster is one in which the admission control policy has not been violated.
A cluster enabled for vSphere HA becomes invalid when the number of virtual machines
powered on exceeds the failover requirements, that is, the current failover capacity is smaller
than configured failover capacity. If admission control is disabled, clusters do not become
invalid.
In the vSphere Web Client, select vSphere HA from the cluster's Monitor tab and then select
Configuration Issues. A list of current vSphere HA issues appears.
In the vSphere Client, the cluster's Summary tab displays a list of configuration issues for
clusters. The list explains what has caused the cluster to become invalid or overcommitted.
DRS behavior is not affected if a cluster is red because of a vSphere HA issue.
3.3.3 vSphere HA and Storage vMotion Interoperability in a Mixed Cluster
In clusters where ESXi 5.x hosts and ESX/ESXi 4.1 or prior hosts are present and where
Storage vMotion is used extensively or Storage DRS is enabled, do not deploy vSphere HA.
vSphere HA might respond to a host failure by restarting a virtual machine on a host with an
ESXi version different from the one on which the virtual machine was running before the
failure. A problem can occur if, at the time of failure, the virtual machine was involved in a
Storage vMotion action on an ESXi 5.x host, and vSphere HA restarts the virtual machine on
a host with a version prior to ESXi 5.0. While the virtual machine might power on, any
subsequent attempts at snapshot operations could corrupt the vdisk state and leave the
virtual machine unusable.
3.3.4 Admission Control Best Practices
The following recommendations are best practices for vSphere HA admission control.
 Select the Percentage of Cluster Resources Reserved admission control policy. This policy
offers the most flexibility in terms of host and virtual machine sizing. When configuring this
policy, choose a percentage for CPU and memory that reflects the number of host failures you want
to support. For example, if you want vSphere HA to set aside resources for two host failures and
have ten hosts of equal capacity in the cluster, then specify 20% (2/10). A short worked example
follows this list.
 Ensure that you size all cluster hosts equally. For the Host Failures Cluster Tolerates policy,
an unbalanced cluster results in excess capacity being reserved to handle failures because
vSphere HA reserves capacity for the largest hosts. For the Percentage of Cluster Resources
policy, an unbalanced cluster requires that you specify larger percentages than would otherwise
be necessary to reserve enough capacity for the anticipated number of host failures.
 If you plan to use the Host Failures Cluster Tolerates policy, try to keep virtual machine
sizing requirements similar across all configured virtual machines. This policy uses slot sizes
to calculate the amount of capacity needed to reserve for each virtual machine. The slot size is
based on the largest reserved memory and CPU needed for any virtual machine. When you mix virtual
machines of different CPU and memory requirements, the slot size calculation defaults to the
largest possible, which limits consolidation.
 If you plan to use the Specify Failover Hosts policy, decide how many host failures to support
and then specify this number of hosts as failover hosts. If the cluster is unbalanced, the
designated failover hosts should be at least the same size as the non-failover hosts in your
cluster. This ensures that there is adequate capacity in case of failure.
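As promised above, a short worked example of the percentage calculation. This is plain arithmetic,
not a VMware tool; the host count and failure tolerance are illustrative values.

# Worked example: percentage of cluster resources to reserve for the
# Percentage of Cluster Resources Reserved admission control policy.
import math

hosts_in_cluster = 10          # equally sized hosts (illustrative)
host_failures_to_tolerate = 2  # how many host failures HA should absorb

reserve_pct = math.ceil(100.0 * host_failures_to_tolerate / hosts_in_cluster)
print("Reserve %d%% of CPU and %d%% of memory" % (reserve_pct, reserve_pct))
# Output: Reserve 20% of CPU and 20% of memory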
3.3.5 Using Auto Deploy with vSphere HA
You can use vSphere HA and Auto Deploy together to improve the availability of your virtual
machines. Auto Deploy provisions hosts when they power up and you can also configure it to
install the vSphere HA agent on such hosts during the boot process. See the Auto Deploy
documentation included in vSphere Installation and Setup for details.
3.3.6 vSphere HA Advanced Attributes
 das.isolationaddress[...]: Sets the address to ping to determine if a host is isolated from the
network. This address is pinged only when heartbeats are not received from any other host in the
cluster. If not specified, the default gateway of the management network is used. This default
gateway has to be a reliable address that is available, so that the host can determine if it is
isolated from the network. You can specify multiple isolation addresses (up to 10) for the
cluster: das.isolationaddressX, where X = 0-9. Typically you should specify one per management
network. Specifying too many addresses makes isolation detection take too long.
 das.usedefaultisolationaddress: By default, vSphere HA uses the default gateway of the console
network as an isolation address. This attribute specifies whether or not this default is used
(true|false).
 das.isolationshutdowntimeout: The period of time the system waits for a virtual machine to shut
down before powering it off. This only applies if the host's isolation response is Shut down VM.
Default value is 300 seconds.
 das.slotmeminmb: Defines the maximum bound on the memory slot size. If this option is used, the
slot size is the smaller of this value or the maximum memory reservation plus memory overhead of
any powered-on virtual machine in the cluster.
 das.slotcpuinmhz: Defines the maximum bound on the CPU slot size. If this option is used, the
slot size is the smaller of this value or the maximum CPU reservation of any powered-on virtual
machine in the cluster.
 das.vmmemoryminmb: Defines the default memory resource value assigned to a virtual machine if
its memory reservation is not specified or zero. This is used for the Host Failures Cluster
Tolerates admission control policy. If no value is specified, the default is 0 MB.
 das.vmcpuminmhz: Defines the default CPU resource value assigned to a virtual machine if its
CPU reservation is not specified or zero. This is used for the Host Failures Cluster Tolerates
admission control policy. If no value is specified, the default is 32 MHz.
 das.iostatsinterval: Changes the default I/O stats interval for VM Monitoring sensitivity. The
default is 120 (seconds). Can be set to any value greater than, or equal to 0. Setting to 0
disables the check.
 das.ignoreinsufficienthbdatastore: Disables configuration issues created if the host does not
have sufficient heartbeat datastores for vSphere HA. Default value is false.
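These attributes are set per cluster as HA advanced options. The snippet below is a hedged pyVmomi
sketch of the same operation through the API, with placeholder object names and credentials; the
option key and value shown (das.isolationaddress0) are examples only.

# Hedged sketch (pyVmomi assumed, placeholder names): add an extra isolation
# address to a cluster by pushing a das.* advanced option through
# ReconfigureComputeResource_Task.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="changeme",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Cluster01")   # placeholder cluster name

    option = vim.option.OptionValue(key="das.isolationaddress0", value="192.0.2.254")
    spec = vim.cluster.ConfigSpecEx(dasConfig=vim.cluster.DasConfigInfo(option=[option]))
    task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
    print("Reconfigure task submitted:", task)
finally:
    Disconnect(si)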
3.3.7 vSphere HA Security
vSphere HA is enhanced by several security features.
 Select firewall ports opened: vSphere HA uses TCP and UDP port 8182 for agent-to-agent
communication. The firewall ports open and close automatically to ensure they are open only when
needed.
 Configuration files protected using file system permissions: vSphere HA stores configuration
information on the local storage or on ramdisk if there is no local datastore. These files are
protected using file system permissions and they are accessible only to the root user. Hosts
without local storage are only supported if they are managed by Auto Deploy.
 Detailed logging: The location where vSphere HA places log files depends on the version of host.
For ESXi 5.x hosts, vSphere HA writes to syslog only by default, so logs are placed where syslog
is configured to put them. The log file names for vSphere HA are prepended with fdm, fault domain
manager, which is a service of vSphere HA.
For legacy ESXi 4.x hosts, vSphere HA writes to /var/log/vmware/fdm on local disk, as well as
syslog if it is configured.
For legacy ESX 4.x hosts, vSphere HA writes to /var/log/vmware/fdm.
 Secure vSphere HA logins: vSphere HA logs onto the vSphere HA agents using a user account,
vpxuser, created by vCenter Server. This account is the same account used by vCenter Server to
manage the host. vCenter Server creates a random password for this account and changes the
password periodically. The time period is set by the vCenter Server
VirtualCenter.VimPasswordExpirationInDays setting. Users with administrative privileges on the
root folder of the host can log in to the agent.
 Secure communication: All communication between vCenter Server and the vSphere HA agent is done
over SSL. Agent-to-agent communication also uses SSL except for election messages, which occur
over UDP. Election messages are verified over SSL so that a rogue agent can prevent only the host
on which the agent is running from being elected as a master host. In this case, a configuration
issue for the cluster is issued so the user is aware of the problem.
 Host SSL certificate verification required: vSphere HA requires that each host have a verified
SSL certificate. Each host generates a self-signed certificate when it is booted for the first
time. This certificate can then be regenerated or replaced with one issued by an authority. If
the certificate is replaced, vSphere HA needs to be reconfigured on the host. If a host becomes
disconnected from vCenter Server after its certificate is updated and the ESXi or ESX Host agent
is restarted, then vSphere HA is automatically reconfigured when the host is reconnected to
vCenter Server. If the disconnection does not occur because vCenter Server host SSL certificate
verification is disabled at the time, verify the new certificate and reconfigure vSphere HA on
the host.
3.4 Best Practices for Networking
Observe the following best practices for the configuration of host NICs and network topology
for vSphere HA. Best Practices include recommendations for your ESXi hosts, and for
cabling, switches, routers, and firewalls.
3.4.1 Network Configuration and Maintenance
The following network maintenance suggestions can help you avoid the accidental detection
of failed hosts and network isolation because of dropped vSphere HA heartbeats.
 When making changes to the networks that your clustered ESXi hosts are on, suspend the Host
Monitoring feature. Changing your network hardware or networking settings can interrupt the
heartbeats that vSphere HA uses to detect host failures, and this might result in unwanted
attempts to fail over virtual machines.
 When you change the networking configuration on the ESXi hosts themselves, for example, adding
port groups, or removing vSwitches, suspend Host Monitoring. After you have made the networking
configuration changes, you must reconfigure vSphere HA on all hosts in the cluster, which causes
the network information to be reinspected. Then re-enable Host Monitoring.
Note
Because networking is a vital component of vSphere HA, if network maintenance needs to be
performed inform the vSphere HA administrator.
3.4.2 Networks Used for vSphere HA Communications
To identify which network operations might disrupt the functioning of vSphere HA, you
should know which management networks are being used for heart beating and other
vSphere HA communications.
 On legacy ESX hosts in the cluster, vSphere HA communications travel over all networks that are
designated as service console networks. VMkernel networks are not used by these hosts for vSphere
HA communications.
 On ESXi hosts in the cluster, vSphere HA communications, by default, travel over VMkernel
networks, except those marked for use with vMotion. If there is only one VMkernel network,
vSphere HA shares it with vMotion, if necessary. With ESXi 4.x and ESXi, you must also explicitly
enable the Management traffic checkbox for vSphere HA to use this network.
Note
To keep vSphere HA agent traffic on the networks you have specified, configure hosts so
vmkNICs used by vSphere HA do not share subnets with vmkNICs used for other purposes.
vSphere HA agents send packets using any pNIC that is associated with a given subnet if
there is also at least one vmkNIC configured for vSphere HA management traffic.
Consequently, to ensure network flow separation, the vmkNICs used by vSphere HA and by
other features must be on different subnets.
3.4.3 Network Isolation Addresses
A network isolation address is an IP address that is pinged to determine whether a host is
isolated from the network. This address is pinged only when a host has stopped receiving
heartbeats from all other hosts in the cluster. If a host can ping its network isolation address,
the host is not network isolated, and the other hosts in the cluster have either failed or are
network partitioned. However, if the host cannot ping its isolation address, it is likely that
the host has become isolated from the network and no failover action is taken.
By default, the network isolation address is the default gateway for the host. Only one default
gateway is specified, regardless of how many management networks have been defined. You
should use the das.isolationaddress[...] advanced attribute to add isolation addresses for
additional networks. See vSphere HA Advanced Attributes.
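Before adding isolation addresses with das.isolationaddress[...], it is worth confirming that each
candidate address answers pings reliably from the management network. A simple illustrative check,
assuming a Unix-style ping command and placeholder addresses:

# Illustrative sketch: verify that candidate isolation addresses respond to
# ping before configuring them as das.isolationaddressX values.
# Assumes a Unix-style "ping -c" command and placeholder addresses.
import subprocess

candidates = ["192.0.2.1", "192.0.2.254"]   # placeholder gateway/switch addresses

for address in candidates:
    result = subprocess.call(
        ["ping", "-c", "3", "-W", "2", address],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    status = "reachable" if result == 0 else "NOT reachable"
    print("%s is %s" % (address, status))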
3.4.4 Network Path Redundancy
Network path redundancy between cluster nodes is important for vSphere HA reliability. A
single management network ends up being a single point of failure and can result in failovers
even though only the network has failed.
If you have only one management network, any failure between the host and the cluster can
cause an unnecessary (or false) failover activity if heartbeat datastore connectivity is not
retained during the networking failure. Possible failures include NIC failures, network cable
failures, network cable removal, and switch resets. Consider these possible sources of failure
between hosts and try to minimize them, typically by providing network redundancy.
You can implement network redundancy at the NIC level with NIC teaming, or at the
management network level. In most implementations, NIC teaming provides sufficient
redundancy, but you can use or add management network redundancy if required.
Redundant management networking allows the reliable detection of failures and prevents
isolation or partition conditions from occurring, because heartbeats can be sent over
multiple networks.
Configure the fewest possible number of hardware segments between the servers in a cluster, the
goal being to limit single points of failure. Additionally, routes with too many hops can cause
networking packet delays for heartbeats and increase the possible points of failure.
3.4.5 Network Redundancy Using NIC Teaming
Using a team of two NICs connected to separate physical switches improves the reliability of
a management network. Because servers connected through two NICs (and through separate
switches) have two independent paths for sending and receiving heartbeats, the cluster is
more resilient. To configure a NIC team for the management network, configure the vNICs in the
vSwitch configuration as Active or Standby. The recommended parameter settings for the vNICs are:
 Default load balancing = route based on originating port ID
 Failback = No
After you have added a NIC to a host in your vSphere HA cluster, you must reconfigure
vSphere HA on that host.
3.4.6 Network Redundancy Using a Second Network
As an alternative to NIC teaming for providing redundancy for heartbeats, you can create a
second management network connection, which is attached to a separate virtual switch. The
original management network connection is used for network and management purposes.
When the second management network connection is created, vSphere HA sends heartbeats
over both management network connections. If one path fails, vSphere HA still sends and
receives heartbeats over the other path.
3.5 Fault Tolerance
3.5.1 Fault Tolerance Checklist
The following checklist contains cluster, host, and virtual machine requirements that you
need to be aware of before using vSphere Fault Tolerance.
Review this list before setting up Fault Tolerance. You can also use the VMware SiteSurvey
utility (download at http://www.vmware.com/download/shared_utilities.html) to better
understand the configuration issues associated with the cluster, host, and virtual machines
being used for vSphere FT.
Note
The failover of fault tolerant virtual machines is independent of vCenter Server, but you must
use vCenter Server to set up your Fault Tolerance clusters.
3.5.2 Cluster Requirements for Fault Tolerance
You must meet the following cluster requirements before you use Fault Tolerance.
At least two FT-certified hosts running the same Fault Tolerance version or host build
number. The Fault Tolerance version number appears on a host's Summary tab in the
vSphere Web Client or vSphere Client.
Note
For legacy hosts prior to ESX/ESXi 4.1, this tab lists the host build number instead. Patches
can cause host build numbers to vary between ESX and ESXi installations. To ensure that
your legacy hosts are FT compatible, do not mix legacy ESX and ESXi hosts in an FT pair.
 ESXi hosts have access to the same virtual machine datastores and networks. See Best Practices
for Fault Tolerance.
 Fault Tolerance logging and VMotion networking configured. See Configure Networking for Host
Machines in the vSphere Client or Configure Networking for Host Machines in the vSphere Web
Client.
 vSphere HA cluster created and enabled. See Creating a vSphere HA Cluster. vSphere HA must be
enabled before you can power on fault tolerant virtual machines or add a host to a cluster that
already supports fault tolerant virtual machines.
3.5.3 Host Requirements for Fault Tolerance
You must meet the following host requirements before you use Fault Tolerance.
 Hosts must have processors from the FT-compatible processor group. It is also highly
recommended that the hosts' processors are compatible with one another. See the VMware knowledge
base article at http://kb.vmware.com/kb/1008027 for information on supported processors.
 Hosts must be licensed for Fault Tolerance.
 Hosts must be certified for Fault Tolerance. See
http://www.vmware.com/resources/compatibility/search.php and select Search by Fault Tolerant
Compatible Sets to determine if your hosts are certified.
 The configuration for each host must have Hardware Virtualization (HV) enabled in the BIOS.
To confirm the compatibility of the hosts in the cluster to support Fault Tolerance, you can
also run profile compliance checks as described in Create Cluster and Check Compliance in
the vSphere Client or Create Cluster and Check Compliance in the vSphere Web Client.
3.5.4 Virtual Machine Requirements for Fault Tolerance
You must meet the following virtual machine requirements before you use Fault Tolerance.
 No unsupported devices attached to the virtual machine. See Fault Tolerance Interoperability.
 Virtual machines must be stored in virtual RDM or virtual machine disk (VMDK) files that are
thick provisioned. If a virtual machine is stored in a VMDK file that is thin provisioned and an
attempt is made to enable Fault Tolerance, a message appears indicating that the VMDK file must
be converted. To perform the conversion, you must power off the virtual machine.
 Incompatible features must not be running with the fault tolerant virtual machines. See Fault
Tolerance Interoperability.
 Virtual machine files must be stored on shared storage. Acceptable shared storage solutions
include Fibre Channel, (hardware and software) iSCSI, NFS, and NAS.
 Only virtual machines with a single vCPU are compatible with Fault Tolerance.
 Virtual machines must be running on one of the supported guest operating systems. See the
VMware knowledge base for the list of supported guest operating systems.
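Some of the host and virtual machine requirements above can be spot-checked through the API. The
following is a hedged pyVmomi sketch with placeholder credentials; it looks only at reported FT
support on hosts and the vCPU count on virtual machines, not the full compatibility matrix.

# Hedged sketch (pyVmomi assumed, placeholder credentials): flag hosts that do
# not report Fault Tolerance support and VMs with more than one vCPU.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="changeme",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    hosts = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in hosts.view:
        if not host.capability.ftSupported:
            print("Host %s does not report FT support" % host.name)

    vms = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in vms.view:
        if vm.config and vm.config.hardware.numCPU > 1:
            print("VM %s has %d vCPUs (FT requires a single vCPU)"
                  % (vm.name, vm.config.hardware.numCPU))
finally:
    Disconnect(si)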
3.5.5 Fault Tolerance Interoperability
Before configuring vSphere Fault Tolerance, you should be aware of the features and
products Fault Tolerance cannot interoperate with.
3.5.5.1 vSphere Features Not Supported with Fault Tolerance
The following vSphere features are not supported for fault tolerant virtual machines.
 Snapshots. Snapshots must be removed or committed before Fault Tolerance can be enabled on a
virtual machine. In addition, it is not possible to take snapshots of virtual machines on which
Fault Tolerance is enabled.
 Storage vMotion. You cannot invoke Storage vMotion for virtual machines with Fault Tolerance
turned on. To migrate the storage, you should temporarily turn off Fault Tolerance, and perform
the Storage vMotion action. When this is complete, you can turn Fault Tolerance back on.
 Linked clones. You cannot enable Fault Tolerance on a virtual machine that is a linked clone,
nor can you create a linked clone from an FT-enabled virtual machine.
 Virtual Machine Backups. You cannot back up an FT-enabled virtual machine using Storage API for
Data Protection, vSphere Data Protection, or similar backup products that require the use of a
virtual machine snapshot, as performed by ESXi. To back up a fault tolerant virtual machine in
this manner, you must first disable FT, then re-enable FT after performing the backup. Storage
array-based snapshots do not affect FT.
3.5.5.2 Features and Devices Incompatible with Fault Tolerance
For a virtual machine to be compatible with Fault Tolerance, the virtual machine must not use the
following features or devices.
Features and Devices Incompatible with Fault Tolerance and Corrective Actions
 Symmetric multiprocessor (SMP) virtual machines. Only virtual machines with a single vCPU are
compatible with Fault Tolerance. Corrective action: Reconfigure the virtual machine as a single
vCPU. Many workloads have good performance configured as a single vCPU.
 Physical Raw Disk mapping (RDM). Corrective action: Reconfigure virtual machines with physical
RDM-backed virtual devices to use virtual RDMs instead.
 CD-ROM or floppy virtual devices backed by a physical or remote device. Corrective action:
Remove the CD-ROM or floppy virtual device or reconfigure the backing with an ISO installed on
shared storage.
 Paravirtualized guests. Corrective action: If paravirtualization is not required, reconfigure
the virtual machine without a VMI ROM.
 USB and sound devices. Corrective action: Remove these devices from the virtual machine.
 N_Port ID Virtualization (NPIV). Corrective action: Disable the NPIV configuration of the
virtual machine.
 NIC passthrough. Corrective action: This feature is not supported by Fault Tolerance so it must
be turned off.
 vlance networking drivers. Corrective action: Fault Tolerance does not support virtual machines
that are configured with vlance virtual NIC cards. However, vmxnet2, vmxnet3, and e1000 are fully
supported.
 Virtual disks backed with thin-provisioned storage or thick-provisioned disks that do not have
clustering features enabled. Corrective action: When you turn on Fault Tolerance, the conversion
to the appropriate disk format is performed by default. You must power off the virtual machine to
trigger this conversion.
 Hot-plugging devices. Corrective action: The hot plug feature is automatically disabled for
fault tolerant virtual machines. To hot plug devices (either adding or removing), you must
momentarily turn off Fault Tolerance, perform the hot plug, and then turn on Fault Tolerance.
Note
When using Fault Tolerance, changing the settings of a virtual network card while a virtual
machine is running is a hot-plug operation, since it requires "unplugging" the network card and
then "plugging" it in again. For example, with a virtual network card for a running virtual
machine, if you change the network that the virtual NIC is connected to, FT must be turned off
first.
 Extended Page Tables/Rapid Virtualization Indexing (EPT/RVI). Corrective action: EPT/RVI is
automatically disabled for virtual machines with Fault Tolerance turned on.
 Serial or parallel ports. Corrective action: Remove these devices from the virtual machine.
3.5.6 Best Practices for Fault Tolerance
To ensure optimal Fault Tolerance results, you should follow certain best practices.
In addition to the following information, see the white paper VMware Fault Tolerance
Recommendations and Considerations at
http://www.vmware.com/resources/techresources/10040.
3.5.6.1 Host Configuration
Consider the following best practices when configuring your hosts.
 Hosts running the Primary and Secondary VMs should operate at approximately the same processor
frequencies, otherwise the Secondary VM might be restarted more frequently. Platform power
management features that do not adjust based on workload (for example, power capping and enforced
low frequency modes to save power) can cause processor frequencies to vary greatly. If Secondary
VMs are being restarted on a regular basis, disable all power management modes on the hosts
running fault tolerant virtual machines or ensure that all hosts are running in the same power
management modes.
 Apply the same instruction set extension configuration (enabled or disabled) to all hosts. The
process for enabling or disabling instruction sets varies among BIOSes. See the documentation for
your hosts' BIOSes about how to configure instruction sets.
3.5.6.2 Homogeneous Clusters
vSphere Fault Tolerance can function in clusters with nonuniform hosts, but it works best in
clusters with compatible nodes. When constructing your cluster, all hosts should have the
following configuration:
 Processors from the same compatible processor group.
 Common access to datastores used by the virtual machines.
 The same virtual machine network configuration.
 The same ESXi version.
 The same Fault Tolerance version number (or host build number for hosts prior to ESX/ESXi 4.1).
 The same BIOS settings (power management and hyperthreading) for all hosts.
Run Check Compliance to identify incompatibilities and to correct them.
3.5.6.3 Performance
To increase the bandwidth available for the logging traffic between Primary and Secondary
VMs, use a 10Gbit NIC and enable the use of jumbo frames.
3.5.6.4 Store ISOs on Shared Storage for Continuous Access
Store ISOs that are accessed by virtual machines with Fault Tolerance enabled on shared
storage that is accessible to both instances of the fault tolerant virtual machine. If you use
this configuration, the CD-ROM in the virtual machine continues operating normally, even
when a failover occurs.
For virtual machines with Fault Tolerance enabled, you might use ISO images that are
accessible only to the Primary VM. In such a case, the Primary VM can access the ISO, but if
a failover occurs, the CD-ROM reports errors as if there is no media. This situation might be
acceptable if the CD-ROM is being used for a temporary, noncritical operation such as an
installation.
3.5.6.5 Avoid Network Partitions
A network partition occurs when a vSphere HA cluster has a management network failure
that isolates some of the hosts from vCenter Server and from one another. See Network
Partitions. When a partition occurs, Fault Tolerance protection might be degraded.
In a partitioned vSphere HA cluster using Fault Tolerance, the Primary VM (or its Secondary
VM) could end up in a partition managed by a master host that is not responsible for the
virtual machine. When a failover is needed, a Secondary VM is restarted only if the Primary
VM was in a partition managed by the master host responsible for it.
To ensure that your management network is less likely to have a failure that leads to a
network partition, follow the recommendations in Best Practices for Networking.
3.5.6.6 Viewing Fault Tolerance Errors in the vSphere Client
When errors related to your implementation of Fault Tolerance are generated by vCenter
Server, the Fault Details screen appears.
This screen lists faults related to Fault Tolerance and for each fault it provides the type of
fault (red is an error, yellow is a warning), the name of the virtual machine or host involved,
and a brief description.
You can also invoke this screen for a specific failed Fault Tolerance task. To do this, select the
task in either the Recent Tasks pane or the Tasks & Events tab for the entity that experienced
the fault and click the View details link that appears in the Details column.
3.5.6.7 Viewing Fault Tolerance Errors in the vSphere Web Client
When tasks related to your implementation of Fault Tolerance cause errors, you can view
information about them in the Recent Tasks pane.
The Recent Tasks pane lists a summary of each error under the Failed tab. For information
about failed tasks, click More Tasks to open the Task Console.
In the Task Console, each task is listed with information that includes its Name, Target, and
Status. In the Status column, if the task failed, the type of fault it generated is described. For
information about a task, select it and details appear in the pane below the task list.
3.5.6.8 Upgrade Hosts Used for Fault Tolerance
When you upgrade hosts that contain fault tolerant virtual machines, ensure that the
Primary and Secondary VMs continue to run on hosts with the same FT version number or
host build number (for hosts prior to ESX/ESXi 4.1).
3.5.6.9 Prerequisites
Verify that you have cluster administrator privileges.
Verify that you have sets of four or more ESXi hosts that are hosting fault tolerant virtual
machines that are powered on. If the virtual machines are powered off, the Primary and
Secondary VMs can be relocated to hosts with different builds.
This upgrade procedure is for a minimum four-node cluster. The same instructions can be
followed for a smaller cluster, though the unprotected interval will be slightly longer.
Procedure
1. Using vMotion, migrate the fault tolerant virtual machines off of two hosts.
2. Upgrade the two evacuated hosts to the same ESXi build.
3. Turn off Fault Tolerance on the Primary VM.
4. Using vMotion, move the disabled Primary VM to one of the upgraded hosts.
5. Turn on Fault Tolerance on the Primary VM that was moved.
6. Repeat Step 1 to Step 5 for as many fault tolerant virtual machine pairs as can be
accommodated on the upgraded hosts.
7. Using vMotion, redistribute the fault tolerant virtual machines.
All ESXi hosts in a cluster are upgraded.
3.5.7 vSphere Fault Tolerance Configuration Recommendations
You should observe certain guidelines when configuring Fault Tolerance.
 In addition to non-fault tolerant virtual machines, you should have no more than four fault
tolerant virtual machines (primaries or secondaries) on any single host. The number of fault
tolerant virtual machines that you can safely run on each host is based on the sizes and
workloads of the ESXi host and virtual machines, all of which can vary.
 If you are using NFS to access shared storage, use dedicated NAS hardware with at least a 1Gbit
NIC to obtain the network performance required for Fault Tolerance to work properly.
 Ensure that a resource pool containing fault tolerant virtual machines has excess memory above
the memory size of the virtual machines. The memory reservation of a fault tolerant virtual
machine is set to the virtual machine's memory size when Fault Tolerance is turned on. Without
this excess in the resource pool, there might not be any memory available to use as overhead
memory.
 Use a maximum of 16 virtual disks per fault tolerant virtual machine.
 To ensure redundancy and maximum Fault Tolerance protection, you should have a minimum of three
hosts in the cluster. In a failover situation, this provides a host that can accommodate the new
Secondary VM that is created.
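The four-per-host guideline above can be monitored with a simple inventory query. The following is
a hedged pyVmomi sketch with placeholder credentials; any faultToleranceState other than
notConfigured or disabled is treated here as an FT Primary or Secondary.

# Hedged sketch (pyVmomi assumed, placeholder credentials): count fault
# tolerant VMs (Primaries and Secondaries) per host and warn above four.
import ssl
from collections import Counter
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="changeme",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    per_host = Counter()
    for vm in view.view:
        state = vm.runtime.faultToleranceState
        if state and state not in ("notConfigured", "disabled") and vm.runtime.host:
            per_host[vm.runtime.host.name] += 1
    for host_name, count in per_host.items():
        note = "  <-- above the recommended maximum of 4" if count > 4 else ""
        print("%s: %d FT virtual machines%s" % (host_name, count, note))
finally:
    Disconnect(si)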
4 Networking
4.1 vSphere Distributed Switch Health Check
vSphere 5.1 distributed switch health check helps identify and troubleshoot configuration
errors in vSphere distributed switches.
The following errors are common configuration errors that health check helps identify.
 Mismatched VLAN trunks between a vSphere distributed switch and physical switch.
 Mismatched MTU settings between physical network adapters, distributed switches, and physical
switch ports.
 Mismatched virtual switch teaming policies for the physical switch port-channel settings.
Health check monitors the following:
 VLAN. Checks whether vSphere distributed switch VLAN settings match trunk port configuration on
the adjacent physical switch ports.
 MTU. Checks whether the physical access switch port MTU jumbo frame setting based on per VLAN
matches the vSphere distributed switch MTU setting.
 Teaming policies. Checks whether the physical access switch ports EtherChannel setting matches
the distributed switch distributed port group IPHash teaming policy settings.
Health check is limited to only the access switch port to which the distributed switch uplink
connects.
Note
For VLAN and MTU checks, you must have at least two link-up physical uplink NICs for the
distributed switch.
For a teaming policy check, you must have at least two link-up physical uplink NICs and two
hosts when applying the policy.
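Health check can be switched on per distributed switch from the vSphere Web Client or through the
API. The following is a hedged pyVmomi sketch with placeholder names; the health-check config class
names shown are as exposed by recent pyVmomi builds and should be verified against your version.

# Hedged sketch (pyVmomi assumed, placeholder names): enable VLAN/MTU and
# teaming health checks on a distributed switch at a 1-minute interval.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="changeme",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.VmwareDistributedVirtualSwitch], True)
    dvs = next(d for d in view.view if d.name == "dvSwitch01")      # placeholder switch name

    health_cfg = [
        vim.dvs.VmwareDistributedVirtualSwitch.VlanMtuHealthCheckConfig(enable=True, interval=1),
        vim.dvs.VmwareDistributedVirtualSwitch.TeamingHealthCheckConfig(enable=True, interval=1),
    ]
    task = dvs.UpdateDVSHealthCheckConfig_Task(healthCheckConfig=health_cfg)
    print("Health check update task submitted:", task)
finally:
    Disconnect(si)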
4.2 vDS Port Group settings and parameters
 Port binding: Choose when ports are assigned to virtual machines connected to this distributed
port group.
o Static binding: Assign a port to a virtual machine when the virtual machine connects to the
distributed port group. This option is not available when the vSphere Web Client is connected
directly to ESXi.
o Dynamic binding: Assign a port to a virtual machine the first time the virtual machine powers
on after it is connected to the distributed port group. Dynamic binding is deprecated in ESXi 5.0.
o Ephemeral: No port binding. This option is not available when the vSphere Web Client is
connected directly to ESXi.
 Port allocation:
o Elastic: The default number of ports is eight. When all ports are assigned, a new set of eight
ports is created. This is the default.
o Fixed: The default number of ports is set to eight. No additional ports are created when all
ports are assigned.
 Number of ports: Enter the number of ports on the distributed port group.
 Network resource pool: Use the drop-down menu to assign the new distributed port group to a
user-defined network resource pool. If you have not created a network resource pool, this menu is
empty.
 VLAN: Use the Type drop-down menu to select VLAN options:
o None: Do not use VLAN.
o VLAN: In the VLAN ID field, enter a number between 1 and 4094.
o VLAN Trunking: Enter a VLAN trunk range.
o Private VLAN: Select a private VLAN entry. If you did not create any private VLANs, this menu
is empty.
 Advanced: Select this check box to customize the policy configurations for the new distributed
port group.
 (Optional) In the Security section, edit the security exceptions and click Next.
Setting
Description
Promiscuous Reject. Placing a guest adapter in promiscuous mode has no effect on which
mode
frames are received by the adapter.
Accept. Placing a guest adapter in promiscuous mode causes it to detect all
frames passed on the vSphere distributed switch. These frames are allowed
under the VLAN policy for the port group to which the adapter is connected.
MAC address Reject. If you set to Reject and the guest operating system changes the MAC
address of the adapter to anything other than what is in the .vmx configuration
changes
file, all inbound frames are dropped.
If the Guest OS changes the MAC address back to match the MAC address in
the .vmx configuration file, inbound frames are passed again.
Accept. Changing the MAC address from the Guest OS has the intended effect:
frames to the new MAC address are received.
Forged
transmits
Reject. Any outbound frame with a source MAC address that is different from
the one currently set on the adapter is dropped.
Accept. No filtering is performed and all outbound frames are passed.
 Setting
Description
Status
If you enable either Ingress Traffic Shaping or Egress Traffic Shaping, you are
setting limits on the amount of networking bandwidth allocated for each virtual
adapter associated with this particular port group. If you disable the policy,
services have a free, clear connection to the physical network by default.
Average
Bandwidth
Establishes the number of bits per second to allow across a port, averaged over
time. This is the allowed average load.
Peak
Bandwidth
The maximum number of bits per second to allow across a port when it is
sending and receiving a burst of traffic. This tops the bandwidth used by a port
whenever it is using its burst bonus.
Burst Size
The maximum number of bytes to allow in a burst. If this parameter is set, a
port might gain a burst bonus when it does not use all its allocated bandwidth.
Whenever the port needs more bandwidth than specified by Average
Bandwidth, it might temporarily transmit data at a higher speed if a burst bonus
is available. This parameter tops the number of bytes that might be
accumulated in the burst bonus and thus transferred at a higher speed.
(Optional) In the Teaming and failover section, edit the settings and click Next.
Load balancing: Specify how to choose an uplink.
  o Route based on the originating virtual port. Choose an uplink based on the virtual port where the traffic entered the distributed switch.
  o Route based on IP hash. Choose an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash.
  o Route based on source MAC hash. Choose an uplink based on a hash of the source Ethernet MAC address.
  o Route based on physical NIC load. Choose an uplink based on the current loads of physical NICs.
  o Use explicit failover order. Always use the highest order uplink from the list of Active adapters that passes failover detection criteria.
  Note: IP-hash-based teaming requires that the physical switch be configured with EtherChannel. For all other options, disable EtherChannel.
Network failover detection: Specify the method to use for failover detection.
  o Link Status only. Relies solely on the link status that the network adapter provides. This option detects failures such as cable pulls and physical switch power failures, but not configuration errors, such as a physical switch port that is blocked by spanning tree or misconfigured to the wrong VLAN, or cable pulls on the other side of a physical switch.
  o Beacon Probing. Sends out and listens for beacon probes on all NICs in the team and uses this information, in addition to link status, to determine link failure. This detects many of the failures previously mentioned that are not detected by link status alone.
  Note: Do not use beacon probing with IP-hash load balancing.
Notify switches: Select Yes or No to notify switches in the case of failover. If you select Yes, whenever a virtual NIC is connected to the distributed switch or whenever that virtual NIC's traffic would be routed over a different physical NIC in the team because of a failover event, a notification is sent out over the network to update the lookup tables on physical switches. In almost all cases, this process is desirable for the lowest latency of failover occurrences and migrations with vMotion.
4.3 vSphere Network I/O Control
Network resource pools determine the bandwidth that different network traffic types are
given on a vSphere distributed switch.
When network I/O control is enabled, distributed switch traffic is divided into the following
predefined network resource pools: Fault Tolerance traffic, iSCSI traffic, vMotion traffic,
management traffic, vSphere Replication (VR) traffic, NFS traffic, and virtual machine
traffic.
You can also create custom network resource pools for virtual machine traffic. You can
control the bandwidth each network resource pool is given by setting the physical adapter
shares and host limit for each network resource pool.
The physical adapter shares assigned to a network resource pool determine the share of the
total available bandwidth guaranteed to the traffic associated with that network resource
pool. The share of transmit bandwidth available to a network resource pool is determined by
the network resource pool's shares and what other network resource pools are actively
transmitting. For example, if you set your FT traffic and iSCSI traffic resource pools to 100
shares, while each of the other resource pools is set to 50 shares, the FT traffic and iSCSI
traffic resource pools each receive 25% of the available bandwidth. The remaining resource
pools each receive 12.5% of the available bandwidth. These reservations apply only when the
physical adapter is saturated.
Note
The iSCSI traffic resource pool shares do not apply to iSCSI traffic on a dependent hardware
iSCSI adapter.
The host limit of a network resource pool is the upper limit of bandwidth that the network
resource pool can use.
Assigning a QoS priority tag to a network resource pool applies an 802.1p tag to all outgoing
packets associated with that network resource pool.
Select the physical adapter shares for the network resource pool:
• Custom: Type a specific number of shares, from 1 to 100, for this network resource pool.
• High: Sets the shares for this resource pool to 100.
• Normal: Sets the shares for this resource pool to 50.
• Low: Sets the shares for this resource pool to 25.
4.4 TCP Segmentation Offload and Jumbo Frames
You enable jumbo frames on a vSphere distributed switch or vSphere standard
switch by changing the maximum transmission units (MTU). TCP Segmentation Offload
(TSO) is enabled on the VMkernel interface by default, but must be enabled at the virtual
machine level.
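As an illustrative sketch only (not part of the original guidance; the switch name vSwitch0 and VMkernel interface vmk1 are assumptions), jumbo frames can be enabled from the ESXi shell or vCLI on a standard switch and on a VMkernel adapter, while the MTU of a distributed switch is changed in its properties in the vSphere Web Client:

  # Raise the MTU on a standard switch and on a VMkernel adapter to 9000 bytes
  esxcli network vswitch standard set --vswitch-name=vSwitch0 --mtu=9000
  esxcli network ip interface set --interface-name=vmk1 --mtu=9000

The physical switch ports and the remote endpoint must also be configured for jumbo frames so that the larger MTU applies end to end.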
4.4.1 Enabling TSO
To enable TSO at the virtual machine level, you must replace the existing vmxnet or flexible
virtual network adapters with enhanced vmxnet virtual network adapters. This replacement
might result in a change in the MAC address of the virtual network adapter.
TSO support through the enhanced vmxnet network adapter is available for virtual machines that run the following guest operating systems:
• Microsoft Windows 2003 Enterprise Edition with Service Pack 2 (32 bit and 64 bit)
• Red Hat Enterprise Linux 4 (64 bit)
• Red Hat Enterprise Linux 5 (32 bit and 64 bit)
• SUSE Linux Enterprise Server 10 (32 bit and 64 bit)
4.4.1.1 Enable TSO Support for a Virtual Machine
You can enable TSO support on a virtual machine by using an enhanced vmxnet adapter for
that virtual machine.
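The adapter type is normally changed by editing the virtual machine's hardware in the vSphere Client or vSphere Web Client. As a hedged sketch of what ends up in the virtual machine configuration (ethernet0 is an assumption for the first adapter), the .vmx entry below selects a vmxnet3 adapter, which also supports TSO and is the adapter type this document recommends for best performance; remember that replacing the adapter can change its MAC address:

  ethernet0.present = "TRUE"
  ethernet0.virtualDev = "vmxnet3"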
4.5 Single Root I/O Virtualization (SR-IOV)
vSphere 5.1 and later supports Single Root I/O Virtualization (SR-IOV). SR-IOV is a
specification that allows a single Peripheral Component Interconnect Express (PCIe)
physical device under a single root port to appear to be multiple separate physical devices to
the hypervisor or the guest operating system.
SR-IOV uses physical functions (PFs) and virtual functions (VFs) to manage global functions
for the SR-IOV devices. PFs are full PCIe functions that include the SR-IOV Extended
Capability which is used to configure and manage the SR-IOV functionality. It is possible to
configure or control PCIe devices using PFs, and the PF has full ability to move data in and
out of the device. VFs are lightweight PCIe functions that contain all the resources
necessary for data movement but have a carefully minimized set of configuration resources.
SR-IOV-enabled PCIe devices present multiple instances of themselves to the guest OS
instance and hypervisor. The number of virtual functions presented depends on the device.
For SR-IOV-enabled PCIe devices to function, you must have the appropriate BIOS and
hardware support, as well as SR-IOV support in the guest driver or hypervisor instance.
4.5.1 SR-IOV Support
vSphere 5.1 supports SR-IOV. However, some features of vSphere are not functional when
SR-IOV is enabled.
4.5.1.1 Supported Configurations
To use SR-IOV, your environment must meet the following configuration requirements:
Supported Configurations for Using SR-IOV
vSphere:
  o Hosts with Intel processors require ESXi 5.1 or later.
  o Hosts with AMD processors are not supported with SR-IOV.
Physical host:
  o Must be compatible with the ESXi release.
  o Must have an Intel processor; AMD processors are not supported.
  o Must support the input/output memory management unit (IOMMU), and must have IOMMU enabled in the BIOS.
  o Must support SR-IOV, and must have SR-IOV enabled in the BIOS. Contact the server vendor to determine whether the host supports SR-IOV.
Physical NIC:
  o Must be compatible with the ESXi release.
  o Must be supported for use with the host and SR-IOV according to the technical documentation from the server vendor.
  o Must have SR-IOV enabled in the firmware.
  o Must be certified by VMware.
PF driver in ESXi for the physical NIC:
  o Must be installed on the ESXi host. The ESXi release provides a default driver for certain NICs; for others you must download and manually install it.
Guest OS:
  o Red Hat Enterprise Linux 6.x
  o Windows Server 2008 R2 with SP2
To verify compatibility of physical hosts and NICs with ESXi releases, see the VMware Compatibility Guide.
4.5.1.2 Availability of Features
The following features are not available for virtual machines configured with SR-IOV:
• vMotion
• Storage vMotion
• vShield
• NetFlow
• Virtual Wire
• High Availability
• Fault Tolerance
• DRS
• DPM
• Suspend and resume
• Snapshots
• MAC-based VLAN for passthrough virtual functions
• Hot addition and removal of virtual devices, memory, and vCPU
• Participation in a cluster environment
Note
Attempts to enable or configure unsupported features with SR-IOV in the vSphere Web
Client result in unexpected behavior in your environment.
4.5.1.3 Supported NICs
The following NICs are supported for virtual machines configured with SR-IOV. All NICs must have drivers and firmware that support SR-IOV. Some NICs might require SR-IOV to be enabled on the firmware.
• Products based on the Intel 82599ES 10 Gigabit Ethernet Controller Family (Niantic)
• Products based on the Intel Ethernet Controller X540 Family (Twinville)
• Emulex OneConnect (BE3)
4.5.1.4 Upgrading from earlier versions of vSphere
If you upgrade from vSphere 5.0 or earlier to vSphere 5.1 or later, SR-IOV support is not
available until you update the NIC drivers for the vSphere release. NICs must have firmware
and drivers that support SR-IOV enabled for SR-IOV functionality to operate.
4.5.2 vSphere 5.1 and Virtual Function Interaction
Virtual functions (VFs) are lightweight PCIe functions that contain all the resources
necessary for data movement but have a carefully minimized set of configuration resources.
There are some restrictions in the interactions between vSphere 5.1 and VFs.
• When a physical NIC creates VFs for SR-IOV to use, the physical NIC becomes a hidden uplink and cannot be used as a normal uplink. This means it cannot be added to a standard or distributed switch.
• There is no rate control for VFs in vSphere 5.1. Every VF could potentially use the entire bandwidth of a physical link.
• When a VF device is configured as a passthrough device on a virtual machine, the standby and hibernate functions for the virtual machine are not supported.
• Because of the limited number of interrupt vectors available for passthrough devices, only a limited number of VFs is supported on a vSphere ESXi host. vSphere 5.1 SR-IOV supports up to 41 VFs on supported Intel NICs and up to 64 VFs on supported Emulex NICs.
4.5.3 DirectPath I/O vs SR-IOV
SR-IOV offers performance benefits and tradeoffs similar to those of DirectPath I/O. DirectPath I/O and SR-IOV have similar functionality, but you use them to accomplish different things.
SR-IOV is beneficial in workloads with very high packet rates or very low latency requirements. Like DirectPath I/O, SR-IOV is not compatible with certain core virtualization features, such as vMotion. SR-IOV does, however, allow a single physical device to be shared among multiple guests.
With DirectPath I/O you can map only one physical function to one virtual machine. SR-IOV lets you share a single physical device, allowing multiple virtual machines to connect directly to the physical function.
This functionality allows you to virtualize low-latency (less than 50 microseconds) and high-packet-rate (greater than 50,000 packets per second) workloads, such as network appliances or purpose-built solutions, in a virtual machine.
4.5.3.1 Configure SR-IOV in a Host Profile with the vSphere Web Client
Before you can connect a virtual machine to a virtual function, you have to configure the
virtual functions of the physical NIC on your host by using a host profile.
You can enable SR-IOV virtual functions on the host by using the esxcli system module
parameters set vCLI command on the NIC driver parameter for virtual functions in
accordance with the driver documentation. For more information about using vCLI
commands, see vSphere Command-Line Interface Documentation.
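As a hedged example only (the module name ixgbe and its max_vfs parameter apply to certain Intel NICs; take the correct parameter name and values from your NIC driver documentation, and reboot the host for the change to take effect):

  # Request eight virtual functions on each of two ports served by the ixgbe driver
  esxcli system module parameters set -m ixgbe -p "max_vfs=8,8"
  # Verify the parameter value that will be applied
  esxcli system module parameters list -m ixgbe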
4.5.3.2 LACP Limitations on a vSphere Distributed Switch
Link Aggregation Control Protocol (LACP) on a vSphere distributed switch allows network
devices to negotiate automatic bundling of links by sending LACP packets to a peer.
However, there are some limitations when using LACP with a vSphere distributed switch.
• LACP only works with IP hash load balancing and Link Status Network failover detection.
• LACP is not compatible with iSCSI software multipathing.
• vSphere only supports one LACP group per distributed switch, and only one LACP group per host.
• LACP settings do not exist in host profiles.
• LACP between two nested ESXi hosts is not possible.
• LACP does not work with port mirroring.
Promiscuous Mode:
  o Reject — Placing a guest adapter in promiscuous mode has no effect on which frames are received by the adapter.
  o Accept — Placing a guest adapter in promiscuous mode causes it to detect all frames passed on the vSphere standard switch that are allowed under the VLAN policy for the port group that the adapter is connected to.
MAC Address Changes:
  o Reject — If you set MAC Address Changes to Reject and the guest operating system changes the MAC address of the adapter to anything other than what is in the .vmx configuration file, all inbound frames are dropped. If the Guest OS changes the MAC address back to match the MAC address in the .vmx configuration file, inbound frames are passed again.
  o Accept — Changing the MAC address from the Guest OS has the intended effect: frames to the new MAC address are received.
Forged Transmits:
  o Reject — Any outbound frame with a source MAC address that is different from the one currently set on the adapter is dropped.
  o Accept — No filtering is performed and all outbound frames are passed.
4.6 Configure NetFlow Settings
NetFlow is a network analysis tool that you can use to monitor network and virtual machine traffic.
NetFlow is available on vSphere distributed switch version 5.0.0 and later.
Procedure
1. Log in to the vSphere Client and select the Networking inventory view.
2. Right-click the vSphere distributed switch in the inventory pane, and select Edit Settings.
3. Navigate to the NetFlow tab.
4. Type the IP address and port of the NetFlow collector.
5. Type the VDS IP address. With an IP address assigned to the vSphere distributed switch, the NetFlow collector can interact with the vSphere distributed switch as a single switch, rather than interacting with a separate, unrelated switch for each associated host.
6. (Optional) Use the up and down menu arrows to set the Active flow export timeout and Idle flow export timeout.
7. (Optional) Use the up and down menu arrows to set the Sampling rate. The sampling rate determines what portion of data NetFlow collects, with the sampling rate number determining how often NetFlow collects the packets. A collector with a sampling rate of 2 collects data from every other packet. A collector with a sampling rate of 5 collects data from every fifth packet.
8. (Optional) Select Process internal flows only to collect data only on network activity between virtual machines on the same host.
9. Click OK.
4.6.1 CDP
Listen: ESXi detects and displays information about the associated Cisco switch port, but information about the vSphere distributed switch is not available to the Cisco switch administrator.
Advertise: ESXi makes information about the vSphere distributed switch available to the Cisco switch administrator, but does not detect and display information about the Cisco switch.
Both: ESXi detects and displays information about the associated Cisco switch and makes information about the vSphere distributed switch available to the Cisco switch administrator.
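For a vSphere standard switch, the CDP mode can also be inspected and set from the command line; this is an illustrative sketch only (vSwitch0 is an assumption), while on a distributed switch the discovery protocol is configured in the switch properties in the vSphere Web Client:

  # Show the current configuration, then set CDP to both listen and advertise
  esxcli network vswitch standard list --vswitch-name=vSwitch0
  esxcli network vswitch standard set --vswitch-name=vSwitch0 --cdp-status=both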
4.7 Mounting NFS Volumes
ESXi supports VMkernel-based NFS mounts for storing virtual disks on NFS datastores.
In addition to storing virtual disks on NFS datastores, you can also use NFS Datastores as a
central repository for ISO images and virtual machine templates. For more information
about creating NFS datastores, see vSphere Storage.
ESXi supports NFS version 3 over Layer 2 and Layer 3 Network switches. Host servers and
NFS storage arrays must be on different subnets and the network switch must handle the
routing information.
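As an illustrative sketch (the NFS server name and export path are placeholders), an NFS datastore can also be mounted from the command line:

  # Mount an NFS export as a datastore and confirm that it is listed
  esxcli storage nfs add --host=nfs01.example.com --share=/export/templates --volume-name=nfs_templates
  esxcli storage nfs list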
4.8 Networking Best Practices
Consider these best practices when you configure your network.
• Separate network services from one another to achieve greater security and better performance.
• Put a set of virtual machines on a separate physical NIC. This separation allows for a portion of the total networking workload to be shared evenly across multiple CPUs. The isolated virtual machines can then better serve traffic from a Web client, for example.
• Keep the vMotion connection on a separate network devoted to vMotion. When migration with vMotion occurs, the contents of the guest operating system's memory are transmitted over the network. You can do this either by using VLANs to segment a single physical network or by using separate physical networks (the latter is preferable).
• When using passthrough devices with a Linux kernel version 2.6.20 or earlier, avoid MSI and MSI-X modes because these modes have a significant performance impact.
• To physically separate network services and to dedicate a particular set of NICs to a specific network service, create a vSphere standard switch or vSphere distributed switch for each service. If this is not possible, separate network services on a single switch by attaching them to port groups with different VLAN IDs. In either case, confirm with your network administrator that the networks or VLANs you choose are isolated in the rest of your environment and that no routers connect them.
• You can add and remove network adapters from a standard or distributed switch without affecting the virtual machines or the network service that is running behind that switch. If you remove all the running hardware, the virtual machines can still communicate among themselves. If you leave one network adapter intact, all the virtual machines can still connect with the physical network.
• To protect your most sensitive virtual machines, deploy firewalls in virtual machines that route between virtual networks with uplinks to physical networks and pure virtual networks with no uplinks.
• For best performance, use vmxnet3 virtual NICs.
• Every physical network adapter connected to the same vSphere standard switch or vSphere distributed switch should also be connected to the same physical network.
• Configure all VMkernel network adapters to the same MTU. When several VMkernel network adapters are connected to vSphere distributed switches but have different MTUs configured, you might experience network connectivity problems (see the verification sketch after this list).
• When creating a distributed port group, do not use dynamic binding. Dynamic binding is deprecated in ESXi 5.0.
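As a verification sketch (the destination address is a placeholder), consistent end-to-end MTU between VMkernel adapters can be checked with vmkping by sending a payload that, together with the IP and ICMP headers, fills a 9000-byte frame while disallowing fragmentation:

  # 8972-byte payload plus 28 bytes of headers = 9000 bytes; -d sets the don't-fragment bit
  vmkping -d -s 8972 192.168.10.20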
5 Storage
5.1 Making LUN Decisions
You must plan how to set up storage for your ESXi systems before you format LUNs with
VMFS datastores.
When you make your LUN decision, keep in mind the following considerations:
• Each LUN should have the correct RAID level and storage characteristic for the applications running in virtual machines that use the LUN.
• Each LUN must contain only one VMFS datastore.
• If multiple virtual machines access the same VMFS, use disk shares to prioritize virtual machines.
You might want fewer, larger LUNs for the following reasons:
• More flexibility to create virtual machines without asking the storage administrator for more space.
• More flexibility for resizing virtual disks, doing snapshots, and so on.
• Fewer VMFS datastores to manage.
You might want more, smaller LUNs for the following reasons:
• Less wasted storage space.
• Different applications might need different RAID characteristics.
• More flexibility, as the multipathing policy and disk shares are set per LUN.
• Use of Microsoft Cluster Service requires that each cluster disk resource is in its own LUN.
• Better performance because there is less contention for a single volume.
When the storage characterization for a virtual machine is not available, there is often no
simple method to determine the number and size of LUNs to provision. You can experiment
using either a predictive or adaptive scheme.
5.1.1 Use the Predictive Scheme to Make LUN Decisions
When setting up storage for ESXi systems, before creating VMFS datastores, you must
decide on the size and number of LUNs to provision. You can experiment using the
predictive scheme.
Procedure
1. Provision several LUNs with different storage characteristics.
2. Create a VMFS datastore on each LUN, labeling each datastore according to its characteristics.
3. Create virtual disks to contain the data for virtual machine applications in the VMFS datastores created on LUNs with the appropriate RAID level for the applications' requirements.
4. Use disk shares to distinguish high-priority from low-priority virtual machines (a configuration sketch follows this procedure).
   Note: Disk shares are relevant only within a given host. The shares assigned to virtual machines on one host have no effect on virtual machines on other hosts.
5. Run the applications to determine whether virtual machine performance is acceptable.
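Disk shares are normally assigned in the virtual machine's resource settings in the vSphere Client or vSphere Web Client. As a hedged sketch only of how the setting is recorded per virtual disk (scsi0:0 is an assumption for the first disk, valid values include low, normal, high, or a custom number, and the entry should be verified against your release before relying on it):

  sched.scsi0:0.shares = "high"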
5.1.2 Use the Adaptive Scheme to Make LUN Decisions
When setting up storage for ESXi hosts, before creating VMFS datastores, you must decide on the number and size of LUNs to provision. You can experiment using the adaptive scheme.
Procedure
1. Provision a large LUN (RAID 1+0 or RAID 5), with write caching enabled.
2. Create a VMFS on that LUN.
3. Create four or five virtual disks on the VMFS.
4. Run the applications to determine whether disk performance is acceptable.
If performance is acceptable, you can place additional virtual disks on the VMFS. If performance is not acceptable, create a new, large LUN, possibly with a different RAID level, and repeat the process. Use migration so that you do not lose virtual machine data when you recreate the LUN.
5.1.3 NPIV Capabilities and Limitations
Learn about specific capabilities and limitations of the use of NPIV with ESXi.
ESXi with NPIV supports the following items:
• NPIV supports vMotion. When you use vMotion to migrate a virtual machine, it retains the assigned WWN.
• If you migrate an NPIV-enabled virtual machine to a host that does not support NPIV, VMkernel reverts to using a physical HBA to route the I/O.
• If your FC SAN environment supports concurrent I/O on the disks from an active-active array, the concurrent I/O to two different NPIV ports is also supported.
When you use ESXi with NPIV, the following limitations apply:
• Because the NPIV technology is an extension to the FC protocol, it requires an FC switch and does not work on direct attached FC disks.
• When you clone a virtual machine or template with a WWN assigned to it, the clones do not retain the WWN.
• NPIV does not support Storage vMotion.
• Disabling and then re-enabling the NPIV capability on an FC switch while virtual machines are running can cause an FC link to fail and I/O to stop.
Boot from SAN requirements:
ESXi system requirements: Follow vendor recommendations for the server booting from a SAN.
Adapter requirements: Enable and correctly configure the adapter so it can access the boot LUN. See your vendor documentation.
Access control: Each host must have access to its own boot LUN only, not the boot LUNs of other hosts. Use storage system software to make sure that the host accesses only the designated LUNs. Multiple servers can share a diagnostic partition; you can use array-specific LUN masking to achieve this.
Multipathing support: Multipathing to a boot LUN on active-passive arrays is not supported because the BIOS does not support multipathing and is unable to activate a standby path.
SAN considerations: SAN connections must be through a switched topology if the array is not certified for direct connect topology. If the array is certified for direct connect topology, the SAN connections can be made directly to the array. Boot from SAN is supported for both switched topology and direct connect topology if these topologies for the specific array are certified.
Hardware-specific considerations: If you are running an IBM eServer BladeCenter and use boot from SAN, you must disable IDE drives on the blades.
5.2 Best Practices for Fibre Channel Storage
When using ESXi with Fibre Channel SAN, follow best practices that VMware offers to
avoid performance problems.
The vSphere Client and the vSphere Web Client offer extensive facilities for collecting
performance information. The information is graphically displayed and frequently updated.
You can also use the resxtop or esxtop command-line utilities. The utilities provide a detailed
look at how ESXi uses resources in real time. For more information, see the vSphere
Resource Management documentation.
Check with your storage representative if your storage system supports Storage API - Array
Integration hardware acceleration features. If it does, refer to your vendor documentation for
information on how to enable hardware acceleration support on the storage system side. For
more information, see Storage Hardware Acceleration.
This chapter includes the following topics:
• Preventing Fibre Channel SAN Problems
• Disable Automatic Host Registration
• Disable Automatic Host Registration in the vSphere Web Client
• Optimizing Fibre Channel SAN Storage Performance
• Fibre Channel SAN Configuration Checklist
5.3 Preventing Fibre Channel SAN Problems
When using ESXi in conjunction with a Fibre Channel SAN, you must follow specific
guidelines to avoid SAN problems.
You should observe these tips for preventing problems with your SAN configuration:
• Place only one VMFS datastore on each LUN.
• Do not change the path policy the system sets for you unless you understand the implications of making such a change.
• Document everything. Include information about zoning, access control, storage, switch, server and FC HBA configuration, software and firmware versions, and storage cable plan.
• Plan for failure:
  o Make several copies of your topology maps. For each element, consider what happens to your SAN if the element fails.
  o Cross off different links, switches, HBAs and other elements to ensure you did not miss a critical failure point in your design.
• Ensure that the Fibre Channel HBAs are installed in the correct slots in the host, based on slot and bus speed. Balance PCI bus load among the available busses in the server.
• Become familiar with the various monitor points in your storage network, at all visibility points, including the host's performance charts, FC switch statistics, and storage performance statistics.
• Be cautious when changing IDs of the LUNs that have VMFS datastores being used by your ESXi host. If you change the ID, the datastore becomes inactive and its virtual machines fail. You can resignature the datastore to make it active again. See Managing Duplicate VMFS Datastores.
If there are no running virtual machines on the VMFS datastore, after you change the ID of the LUN, you must use rescan to reset the ID on your host. For information on using rescan, see Storage Refresh and Rescan Operations.
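As an illustrative sketch (the volume label is a placeholder), the rescan and resignaturing operations referenced above can also be performed with esxcli:

  # Rescan all storage adapters after a LUN change
  esxcli storage core adapter rescan --all
  # List VMFS volumes detected as snapshots or copies, then resignature one by label
  esxcli storage vmfs snapshot list
  esxcli storage vmfs snapshot resignature -l my_datastore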
5.4 Disable Automatic Host Registration
When you use EMC CLARiiON or Invista arrays for storage, it is required that the hosts
register with the arrays. ESXi performs automatic host registration by sending the host's
name and IP address to the array. If you prefer to perform manual registration using storage
management software, disable the ESXi auto-registration feature.
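As a hedged sketch (verify the option name against your ESXi release), auto-registration is controlled by the Disk.EnableNaviReg advanced setting and can be turned off from the command line:

  # Disable automatic host registration, then confirm the value
  esxcli system settings advanced set -o /Disk/EnableNaviReg -i 0
  esxcli system settings advanced list -o /Disk/EnableNaviReg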
5.5 Optimizing Fibre Channel SAN Storage Performance
Several factors contribute to optimizing a typical SAN environment.
If the environment is properly configured, the SAN fabric components (particularly the SAN
switches) are only minor contributors because of their low latencies relative to servers and
storage arrays. Make sure that the paths through the switch fabric are not saturated, that is,
that the switch fabric is running at the highest throughput.
5.5.1 Storage Array Performance
Storage array performance is one of the major factors contributing to the performance of the
entire SAN environment.
If there are issues with storage array performance, be sure to consult your storage array
vendor’s documentation for any relevant information.
Follow these general guidelines to improve the array performance in the vSphere environment:
• When assigning LUNs, remember that each LUN is accessed by a number of hosts, and that a number of virtual machines can run on each host. One LUN used by a host can service I/O from many different applications running on different operating systems. Because of this diverse workload, the RAID group containing the ESXi LUNs should not include LUNs used by other servers that are not running ESXi.
• Make sure read/write caching is enabled.
• SAN storage arrays require continual redesign and tuning to ensure that I/O is load balanced across all storage array paths. To meet this requirement, distribute the paths to the LUNs among all the SPs to provide optimal load balancing. Close monitoring indicates when it is necessary to rebalance the LUN distribution.
Tuning statically balanced storage arrays is a matter of monitoring the specific performance
statistics (such as I/O operations per second, blocks per second, and response time) and
distributing the LUN workload to spread the workload across all the SPs.
Note
Dynamic load balancing is not currently supported with ESXi.
5.5.2 Server Performance with Fibre Channel
You must consider several factors to ensure optimal server performance.
Each server application must have access to its designated storage with the following conditions:
• High I/O rate (number of I/O operations per second)
• High throughput (megabytes per second)
• Minimal latency (response times)
Because each application has different requirements, you can meet these goals by choosing an appropriate RAID group on the storage array. To achieve performance goals:
• Place each LUN on a RAID group that provides the necessary performance levels. Pay attention to the activities and resource utilization of other LUNs in the assigned RAID group. A high-performance RAID group that has too many applications doing I/O to it might not meet the performance goals required by an application running on the ESXi host.
• Make sure that each server has a sufficient number of HBAs to allow maximum throughput for all the applications hosted on the server for the peak period. I/O spread across multiple HBAs provides higher throughput and less latency for each application.
• To provide redundancy in the event of HBA failure, make sure the server is connected to a dual redundant fabric.
• When allocating LUNs or RAID groups for ESXi systems, multiple operating systems use and share that resource. As a result, the performance required from each LUN in the storage subsystem can be much higher if you are working with ESXi systems than if you are using physical machines. For example, if you expect to run four I/O intensive applications, allocate four times the performance capacity for the ESXi LUNs.
• When using multiple ESXi systems in conjunction with vCenter Server, the performance needed from the storage subsystem increases correspondingly.
• The number of outstanding I/Os needed by applications running on an ESXi system should match the number of I/Os the HBA and storage array can handle.
5.5.3 Fibre Channel SAN Configuration Checklist
This topic provides a checklist of special setup requirements for different storage arrays and
ESXi hosts.
Multipathing Setup Requirements
All storage arrays: Write cache must be disabled if not battery backed.
Topology: No single failure should cause both HBA and SP failover, especially with active-passive storage arrays.
IBM TotalStorage DS 4000 (formerly FastT): Host type must be LNXCL or VMware in later versions. AVT (Auto Volume Transfer) is disabled in this host mode.
HDS 99xx and 95xxV family: The HDS 9500V family (Thunder) requires two host modes: Host Mode 1: Standard and Host Mode 2: Sun Cluster. The HDS 99xx family (Lightning) and HDS Tagma (USP) require the host mode set to Netware.
EMC Symmetrix: Enable the SPC2 and SC3 settings. Contact EMC for the latest settings.
EMC CLARiiON: Set the EMC CLARiiON failover mode to 1 or 4. Contact EMC for details.
HP MSA: Host type must be Linux. Set the connection type for each HBA port to Linux.
5.6 iSCSI
5.6.1 iSCSI Naming Conventions
iSCSI uses a special unique name to identify an iSCSI node, either target or initiator. This
name is similar to the WorldWide Name (WWN) associated with Fibre Channel devices and
is used as a way to universally identify the node.
iSCSI names are formatted in two different ways. The most common is the IQN format.
For more details on iSCSI naming requirements and string profiles, see RFC 3721 and RFC
3722 on the IETF Web site.
5.6.1.1 iSCSI Qualified Name (IQN) Format
The IQN format takes the form iqn.yyyy-mm.naming-authority:unique-name, where:
• yyyy-mm is the year and month when the naming authority was established.
• naming-authority is usually the reverse syntax of the Internet domain name of the naming authority. For example, the iscsi.vmware.com naming authority could have the iSCSI qualified name form of iqn.1998-01.com.vmware.iscsi. The name indicates that the vmware.com domain name was registered in January of 1998, and iscsi is a subdomain, maintained by vmware.com.
• unique-name is any name you want to use, for example, the name of your host. The naming authority must make sure that any names assigned following the colon are unique, such as:
  o iqn.1998-01.com.vmware.iscsi:name1
  o iqn.1998-01.com.vmware.iscsi:name2
  o iqn.1998-01.com.vmware.iscsi:name999
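As an illustrative sketch (the adapter name vmhba33 and the IQN value are placeholders), the initiator name of an iSCSI adapter can be viewed and changed with esxcli; any name you set must remain worldwide unique and properly formatted:

  # Show the adapter's current IQN and alias, then assign a new one
  esxcli iscsi adapter get -A vmhba33
  esxcli iscsi adapter set -A vmhba33 -n iqn.1998-01.com.vmware.iscsi:esx01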
5.6.1.2 Enterprise Unique Identifier (EUI) Format
The EUI format takes the form eui.<16 hex digits>, for example, eui.0123456789ABCDEF.
The 16-hexadecimal digits are text representations of a 64-bit number of an IEEE EUI
(extended unique identifier) format. The top 24 bits are a company ID that IEEE registers
with a particular company. The lower 40 bits are assigned by the entity holding that
company ID and must be unique.
5.6.2 SCSI Storage System Types
ESXi supports different storage systems and arrays. The types of storage that your host supports include active-active, active-passive, and ALUA-compliant.
Active-active storage system: Allows access to the LUNs simultaneously through all the storage ports that are available without significant performance degradation. All the paths are active at all times, unless a path fails.
Active-passive storage system: A system in which one storage processor is actively providing access to a given LUN. The other processors act as backup for the LUN and can be actively providing access to other LUN I/O. I/O can be successfully sent only to an active port for a given LUN. If access through the active storage port fails, one of the passive storage processors can be activated by the servers accessing it.
Asymmetrical storage system: Supports Asymmetric Logical Unit Access (ALUA). ALUA-compliant storage systems provide different levels of access per port. ALUA allows hosts to determine the states of target ports and prioritize paths. The host uses some of the active paths as primary and others as secondary.
Virtual port storage system: Allows access to all available LUNs through a single virtual port. These are active-active storage devices, but hide their multiple connections through a single port. ESXi multipathing does not make multiple connections from a specific port to the storage by default. Some storage vendors supply session managers to establish and manage multiple connections to their storage. These storage systems handle port failover and connection balancing transparently. This is often referred to as transparent failover.
5.6.3 Error Correction
To protect the integrity of iSCSI headers and data, the iSCSI protocol defines error correction
methods known as header digests and data digests.
Both parameters are disabled by default, but you can enable them. These digests pertain to,
respectively, the header and SCSI data being transferred between iSCSI initiators and
targets, in both directions.
Header and data digests check the end-to-end, noncryptographic data integrity beyond the
integrity checks that other networking layers provide, such as TCP and Ethernet. They check
the entire communication path, including all elements that can change the network-level
traffic, such as routers, switches, and proxies.
The existence and type of the digests are negotiated when an iSCSI connection is established.
When the initiator and target agree on a digest configuration, this digest must be used for all
traffic between them.
Enabling header and data digests does require additional processing for both the initiator and the target, and can affect throughput and CPU usage.
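As a hedged sketch only (the adapter name is a placeholder, and the exact parameter keys and accepted values for changing digests should be confirmed against the esxcli reference for your release), the currently negotiated digest settings can be inspected per adapter:

  # List the adapter's parameters, including HeaderDigest and DataDigest
  esxcli iscsi adapter param get -A vmhba33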
5.7 iSCSI SAN Restrictions
A number of restrictions exist when you use ESXi with an iSCSI SAN.
• ESXi does not support iSCSI-connected tape devices.
• You cannot use virtual-machine multipathing software to perform I/O load balancing to a single physical LUN.
• ESXi does not support multipathing when you combine independent hardware adapters with either software or dependent hardware adapters.
• ESXi does not support IPv6 with software iSCSI and dependent hardware iSCSI.
5.7.1 Dependent Hardware iSCSI Considerations
When you use dependent hardware iSCSI adapters with ESXi, certain considerations apply.
• When you use any dependent hardware iSCSI adapter, performance reporting for a NIC associated with the adapter might show little or no activity, even when iSCSI traffic is heavy. This behavior occurs because the iSCSI traffic bypasses the regular networking stack.
• If you use a third-party virtual switch, for example Cisco Nexus 1000V DVS, disable automatic pinning. Use manual pinning instead, making sure to connect a VMkernel adapter (vmk) to an appropriate physical NIC (vmnic). For information, refer to your virtual switch vendor documentation.
• The Broadcom iSCSI adapter performs data reassembly in hardware, which has a limited buffer space. When you use the Broadcom iSCSI adapter in a congested network or under heavy load, enable flow control to avoid performance degradation. Flow control manages the rate of data transmission between two nodes to prevent a fast sender from overrunning a slow receiver. For best results, enable flow control at the end points of the I/O path, at the hosts and iSCSI storage systems. To enable flow control for the host, use the esxcli system module parameters command. For details, see the VMware knowledge base article at http://kb.vmware.com/kb/1013413.
• Broadcom iSCSI adapters do not support IPv6.
5.7.2 Managing iSCSI Network
Special considerations apply to network adapters, both physical and VMkernel, that are associated with an iSCSI adapter.
After you create network connections for iSCSI, an iSCSI indicator on a number of Networking dialog boxes becomes enabled. This indicator shows that a particular virtual or physical network adapter is iSCSI-bound. To avoid disruptions in iSCSI traffic, follow these guidelines and considerations when managing iSCSI-bound virtual and physical network adapters (a command-line sketch follows this list):
• Make sure that the VMkernel network adapters are assigned addresses on the same subnet as the iSCSI storage portal they connect to.
• iSCSI adapters using VMkernel adapters are not able to connect to iSCSI ports on different subnets, even if those ports are discovered by the iSCSI adapters.
• When using separate vSphere switches to connect physical network adapters and VMkernel adapters, make sure that the vSphere switches connect to different IP subnets.
• If VMkernel adapters are on the same subnet, they must connect to a single vSwitch.
• If you migrate VMkernel adapters to a different vSphere switch, move the associated physical adapters.
• Do not make configuration changes to iSCSI-bound VMkernel adapters or physical network adapters.
• Do not make changes that might break the association of VMkernel adapters and physical network adapters. You can break the association if you remove one of the adapters or the vSphere switch that connects them, or change the 1:1 network policy for their connection.
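As an illustrative sketch (the adapter and VMkernel interface names are placeholders), the binding between VMkernel adapters and an iSCSI adapter can be reviewed and extended with esxcli:

  # List the VMkernel adapters bound to the software iSCSI adapter, then bind another
  esxcli iscsi networkportal list -A vmhba33
  esxcli iscsi networkportal add -A vmhba33 -n vmk2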
5.7.3 iSCSI Network Troubleshooting
A warning sign indicates non-compliant port group policy for an iSCSI-bound VMkernel
adapter.
Problem
The VMkernel adapter's port group policy is considered non-compliant in the following cases:
• The VMkernel adapter is not connected to an active physical network adapter.
• The VMkernel adapter is connected to more than one physical network adapter.
• The VMkernel adapter is connected to one or more standby physical adapters.
• The active physical adapter is changed.
CHAP security levels and the adapters that support them:
• None: The host does not use CHAP authentication. Select this option to disable authentication if it is currently enabled. (Supported: software iSCSI, dependent hardware iSCSI, independent hardware iSCSI)
• Use unidirectional CHAP if required by target: The host prefers a non-CHAP connection, but can use a CHAP connection if required by the target. (Supported: software iSCSI, dependent hardware iSCSI)
• Use unidirectional CHAP unless prohibited by target: The host prefers CHAP, but can use non-CHAP connections if the target does not support CHAP. (Supported: software iSCSI, dependent hardware iSCSI, independent hardware iSCSI)
• Use unidirectional CHAP: The host requires successful CHAP authentication. The connection fails if CHAP negotiation fails. (Supported: software iSCSI, dependent hardware iSCSI, independent hardware iSCSI)
• Use bidirectional CHAP: The host and the target support bidirectional CHAP. (Supported: software iSCSI, dependent hardware iSCSI)
5.8 iBFT iSCSI Boot Overview
ESXi hosts can boot from an iSCSI SAN using the software or dependent hardware iSCSI
adapters and network adapters.
To deploy ESXi and boot from the iSCSI SAN, the host must have an iSCSI boot capable
network adapter that supports the iSCSI Boot Firmware Table (iBFT) format. The iBFT is a
method of communicating parameters about the iSCSI boot device to an operating system.
Before installing ESXi and booting from the iSCSI SAN, configure the networking and iSCSI
boot parameters on the network adapter and enable the adapter for the iSCSI boot. Because
configuring the network adapter is vendor specific, review your vendor documentation for
instructions.
When you first boot from iSCSI, the iSCSI boot firmware on your system connects to an
iSCSI target. If login is successful, the firmware saves the networking and iSCSI boot
parameters in the iBFT and stores the table in the system's memory. The system uses this
table to configure its own iSCSI connection and networking and to start up.
The following list describes the iBFT iSCSI boot sequence.
1. When restarted, the system BIOS detects the iSCSI boot firmware on the network adapter.
2. The iSCSI boot firmware uses the preconfigured boot parameters to connect with the specified iSCSI target.
3. If the connection to the iSCSI target is successful, the iSCSI boot firmware writes the networking and iSCSI boot parameters into the iBFT and stores the table in the system memory.
   Note: The system uses this table to configure its own iSCSI connection and networking and to start up.
4. The BIOS boots the boot device.
5. The VMkernel starts loading and takes over the boot operation.
6. Using the boot parameters from the iBFT, the VMkernel connects to the iSCSI target.
7. After the iSCSI connection is established, the system boots.
5.8.1 iBFT iSCSI Boot Considerations
When you boot the ESXi host from iSCSI using iBFT-enabled network adapters, certain
considerations apply.
The iBFT iSCSI boot does not support the following items:
• IPv6
• Failover for the iBFT-enabled network adapters
Note
Update your NIC's boot code and iBFT firmware using vendor-supplied tools before trying to install and boot VMware ESXi. Consult the vendor documentation and the VMware HCL for supported boot code and iBFT firmware versions for VMware ESXi iBFT boot. The boot code and iBFT firmware released by vendors prior to the ESXi 4.1 release might not work.
After you set up your host to boot from iBFT iSCSI, the following restrictions apply:
• You cannot disable the software iSCSI adapter. If the iBFT configuration is present in the BIOS, the host re-enables the software iSCSI adapter during each reboot.
  Note: If you do not use the iBFT-enabled network adapter for the iSCSI boot and do not want the software iSCSI adapter to be always enabled, remove the iBFT configuration from the network adapter.
• You cannot remove the iBFT iSCSI boot target using the vSphere Client or the vSphere Web Client. The target appears on the list of adapter static targets.
5.9 Best Practices for iSCSI Storage
When using ESXi with the iSCSI SAN, follow best practices that VMware offers to avoid
problems.
Check with your storage representative if your storage system supports Storage API - Array
Integration hardware acceleration features. If it does, refer to your vendor documentation for
information on how to enable hardware acceleration support on the storage system side. For
more information, see Storage Hardware Acceleration.
This chapter includes the following topics:
• Preventing iSCSI SAN Problems
• Optimizing iSCSI SAN Storage Performance
• Checking Ethernet Switch Statistics
• iSCSI SAN Configuration Checklist
5.10 Preventing iSCSI SAN Problems
When using ESXi in conjunction with a SAN, you must follow specific guidelines to avoid
SAN problems.
You should observe these tips for avoiding problems with your SAN configuration:
• Place only one VMFS datastore on each LUN. Multiple VMFS datastores on one LUN are not recommended.
• Do not change the path policy the system sets for you unless you understand the implications of making such a change.
• Document everything. Include information about configuration, access control, storage, switch, server and iSCSI HBA configuration, software and firmware versions, and storage cable plan.
• Plan for failure:
  o Make several copies of your topology maps. For each element, consider what happens to your SAN if the element fails.
  o Cross off different links, switches, HBAs and other elements to ensure you did not miss a critical failure point in your design.
• Ensure that the iSCSI HBAs are installed in the correct slots in the ESXi host, based on slot and bus speed. Balance PCI bus load among the available busses in the server.
• Become familiar with the various monitor points in your storage network, at all visibility points, including ESXi performance charts, Ethernet switch statistics, and storage performance statistics.
• Be cautious when changing IDs of the LUNs that have VMFS datastores being used by your host. If you change the ID, virtual machines running on the VMFS datastore will fail. If there are no running virtual machines on the VMFS datastore, after you change the ID of the LUN, you must use rescan to reset the ID on your host. For information on using rescan, see Storage Refresh and Rescan Operations.
• If you need to change the default iSCSI name of your iSCSI adapter, make sure the name you enter is worldwide unique and properly formatted. To avoid storage access problems, never assign the same iSCSI name to different adapters, even on different hosts.
5.11 Optimizing iSCSI SAN Storage Performance
Several factors contribute to optimizing a typical SAN environment.
If the network environment is properly configured, the iSCSI components provide adequate
throughput and low enough latency for iSCSI initiators and targets. If the network is
congested and links, switches or routers are saturated, iSCSI performance suffers and might
not be adequate for ESXi environments.
5.11.1 Storage System Performance
Storage system performance is one of the major factors contributing to the performance of
the entire iSCSI environment.
If issues occur with storage system performance, consult your storage system vendor’s
documentation for any relevant information.
When you assign LUNs, remember that you can access each shared LUN through a number
of hosts, and that a number of virtual machines can run on each host. One LUN used by the
ESXi host can service I/O from many different applications running on different operating
systems. Because of this diverse workload, the RAID group that contains the ESXi LUNs
should not include LUNs that other hosts use that are not running ESXi for I/O intensive
applications.
Enable read caching and write caching.
Load balancing is the process of spreading server I/O requests across all available SPs and
their associated host server paths. The goal is to optimize performance in terms of
throughput (I/O per second, megabytes per second, or response times).
SAN storage systems require continual redesign and tuning to ensure that I/O is load
balanced across all storage system paths. To meet this requirement, distribute the paths to
the LUNs among all the SPs to provide optimal load balancing. Close monitoring indicates
when it is necessary to manually rebalance the LUN distribution.
Tuning statically balanced storage systems is a matter of monitoring the specific
performance statistics (such as I/O operations per second, blocks per second, and response
time) and distributing the LUN workload to spread the workload across all the SPs.
5.11.2 Server Performance with iSCSI
You must consider several factors to ensure optimal server performance.
Each server application must have access to its designated storage with the following conditions:
• High I/O rate (number of I/O operations per second)
• High throughput (megabytes per second)
• Minimal latency (response times)
Because each application has different requirements, you can meet these goals by choosing an appropriate RAID group on the storage system. To achieve performance goals, perform the following tasks:
• Place each LUN on a RAID group that provides the necessary performance levels. Pay attention to the activities and resource utilization of other LUNs in the assigned RAID group. A high-performance RAID group that has too many applications doing I/O to it might not meet the performance goals required by an application running on the ESXi host.
• Provide each server with a sufficient number of network adapters or iSCSI hardware adapters to allow maximum throughput for all the applications hosted on the server for the peak period. I/O spread across multiple ports provides higher throughput and less latency for each application.
• To provide redundancy for software iSCSI, make sure the initiator is connected to all network adapters used for iSCSI connectivity.
• When allocating LUNs or RAID groups for ESXi systems, multiple operating systems use and share that resource. As a result, the performance required from each LUN in the storage subsystem can be much higher if you are working with ESXi systems than if you are using physical machines. For example, if you expect to run four I/O intensive applications, allocate four times the performance capacity for the ESXi LUNs.
• When using multiple ESXi systems in conjunction with vCenter Server, the performance needed from the storage subsystem increases correspondingly.
• The number of outstanding I/Os needed by applications running on an ESXi system should match the number of I/Os the SAN can handle.
5.11.3 Network Performance
A typical SAN consists of a collection of computers connected to a collection of storage
systems through a network of switches. Several computers often access the same storage.
Single Ethernet Link Connection to Storage shows several computer systems connected to a
storage system through an Ethernet switch. In this configuration, each system is connected
through a single Ethernet link to the switch, which is also connected to the storage system
through a single Ethernet link. In most configurations, with modern switches and typical
traffic, this is not a problem.
Single Ethernet Link Connection to Storage
When systems read data from storage, the maximum response from the storage is to send
enough data to fill the link between the storage systems and the Ethernet switch. It is
unlikely that any single system or virtual machine gets full use of the network speed, but this
situation can be expected when many systems share one storage device.
When writing data to storage, multiple systems or virtual machines might attempt to fill their links. As Dropped Packets shows, when this happens, the switch between the systems and the storage system has to drop data. This happens because, while it has a single connection to the storage device, it has more traffic to send to the storage system than a single link can carry. In this case, the switch drops network packets because the amount of data it can transmit is limited by the speed of the link between it and the storage system.
Dropped Packets
Recovering from dropped network packets results in large performance degradation. In
addition to time spent determining that data was dropped, the retransmission uses network
bandwidth that could otherwise be used for current transactions.
iSCSI traffic is carried on the network by the Transmission Control Protocol (TCP). TCP is a
reliable transmission protocol that ensures that dropped packets are retried and eventually
reach their destination. TCP is designed to recover from dropped packets and retransmits
them quickly and seamlessly. However, when the switch discards packets with any
regularity, network throughput suffers significantly. The network becomes congested with
requests to resend data and with the resent packets, and less data is actually transferred than
in a network without congestion.
Most Ethernet switches can buffer, or store, data and give every device attempting to send
data an equal chance to get to the destination. This ability to buffer some transmissions,
combined with many systems limiting the number of outstanding commands, allows small
bursts from several systems to be sent to a storage system in turn.
If the transactions are large and multiple servers are trying to send data through a single
switch port, a switch's ability to buffer one request while another is transmitted can be
exceeded. In this case, the switch drops the data it cannot send, and the storage system must
request retransmission of the dropped packet. For example, if an Ethernet switch can buffer
32KB on an input port, but the server connected to it thinks it can send 256KB to the storage
device, some of the data is dropped.
Most managed switches provide information on dropped packets, similar to the following:
*: interface is up
IHQ: pkts in input hold queue
OHQ: pkts in output hold queue
RXBS: rx rate (bits/sec)
TXBS: tx rate (bits/sec)
TRTL: throttle count
IQD: pkts dropped from input queue
OQD: pkts dropped from output queue
RXPS: rx rate (pkts/sec)
TXPS: tx rate (pkts/sec)
Sample Switch Information
Interface              IHQ  IQD   OHQ  OQD  RXBS       RXPS   TXBS       TXPS   TRTL
* GigabitEthernet0/1   3    9922  0    0    476303000  62273  477840000  63677  0
In this example from a Cisco switch, the bandwidth used is 476303000 bits/second, which is
less than half of wire speed. In spite of this, the port is buffering incoming packets and has
dropped quite a few packets. The final line of this interface summary indicates that this port
has already dropped almost 10,000 inbound packets in the IQD column.
Configuration changes to avoid this problem involve making sure several input Ethernet
links are not funneled into one output link, resulting in an oversubscribed link. When a
number of links transmitting near capacity are switched to a smaller number of links,
oversubscription is a possibility.
Generally, applications or systems that write a lot of data to storage, such as data acquisition
or transaction logging systems, should not share Ethernet links to a storage device. These
types of applications perform best with multiple connections to storage devices.
Multiple Connections from Switch to Storage shows multiple connections from the switch to
the storage.
Multiple Connections from Switch to Storage
Using VLANs or VPNs does not provide a suitable solution to the problem of link
oversubscription in shared configurations. VLANs and other virtual partitioning of a network
provide a way of logically designing a network, but do not change the physical capabilities of
links and trunks between switches. When storage traffic and other network traffic end up
sharing physical connections, as they would with a VPN, the possibility for oversubscription
and lost packets exists. The same is true of VLANs that share interswitch trunks.
Performance design for a SAN must take into account the physical limitations of the network, not logical allocations.
5.12 Checking Ethernet Switch Statistics
Many Ethernet switches provide different methods for monitoring switch health.
Switches that have ports operating near maximum throughput much of the time do not
provide optimum performance. If you have ports in your iSCSI SAN running near the
maximum, reduce the load. If the port is connected to an ESXi system or iSCSI storage, you
can reduce the load by using manual load balancing.
If the port is connected between multiple switches or routers, consider installing additional
links between these components to handle more load. Ethernet switches also commonly
provide information about transmission errors, queued packets, and dropped Ethernet
packets. If the switch regularly reports any of these conditions on ports being used for iSCSI
traffic, performance of the iSCSI SAN will be poor.
5.13 iSCSI SAN Configuration Checklist
This topic provides a checklist of special setup requirements for different storage systems
and ESXi hosts.
5.14 Identifying Device Connectivity Problems
When your ESXi host experiences a problem while connecting to a storage device, the host
treats the problem as permanent or temporary depending on certain factors.
Storage connectivity problems are caused by a variety of reasons. Although ESXi cannot
always determine the reason for a storage device or its paths being unavailable, the host
differentiates between a permanent device loss (PDL) state of the device and a transient all
paths down (APD) state of storage.
Permanent Device Loss (PDL)
A condition that occurs when a storage device permanently fails or is administratively removed or excluded. It is not expected to become available. When the device becomes permanently unavailable, ESXi receives appropriate sense codes or a login rejection from storage arrays, and is able to recognize that the device is permanently lost.

All Paths Down (APD)
A condition that occurs when a storage device becomes inaccessible to the host and no paths to the device are available. ESXi treats this as a transient condition because typically the problems with the device are temporary and the device is expected to become available again.
5.14.1 Detecting PDL Conditions
A storage device is considered to be in the permanent device loss (PDL) state when it
becomes permanently unavailable to your ESXi host.
Typically, the PDL condition occurs when a device is unintentionally removed, or its unique
ID changes, or when the device experiences an unrecoverable hardware error.
When the storage array determines that the device is permanently unavailable, it sends SCSI
sense codes to the ESXi host. The sense codes allow your host to recognize that the device
has failed and register the state of the device as PDL.
Note
The sense codes must be received on all paths to the device for the device to be considered
permanently lost.
The following VMkernel log example of a SCSI sense code indicates that the device is in the
PDL state.
H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x25 0x0 or Logical Unit Not
Supported
For information about SCSI sense codes, see Troubleshooting Storage in vSphere
Troubleshooting.
In the case of iSCSI arrays with a single LUN per target, PDL is detected through iSCSI login
failure. An iSCSI storage array rejects your host's attempts to start an iSCSI session with a
reason Target Unavailable. As with the sense codes, this response must be received on all
paths for the device to be considered permanently lost.
After registering the PDL state of the device, the host stops attempts to reestablish
connectivity or to issue commands to the device to avoid becoming blocked or unresponsive.
The I/O from virtual machines is terminated.
Note
vSphere HA can detect PDL and restart failed virtual machines.
The vSphere Web Client displays the following information for the device:
• The operational state of the device changes to Lost Communication.
• All paths are shown as Dead.
• Datastores on the device are grayed out.
It is possible for a device to return from the PDL state; however, data consistency is not guaranteed.
Note
If a storage device permanently fails in a way that does not return appropriate SCSI sense codes or a login rejection, the host cannot detect PDL conditions and continues to treat the device connectivity problems as APD.
For additional details, see the VMware knowledge base article at
http://kb.vmware.com/kb/2004684.
5.14.2 Performing Planned Storage Device Removal
When a storage device is malfunctioning, you can avoid permanent device loss (PDL)
conditions or all paths down (APD) conditions and perform a planned removal and
reconnection of a storage device.
Planned device removal is an intentional disconnection of a storage device. You might also
plan to remove a device for such reasons as upgrading your hardware or reconfiguring your
storage devices. When you perform an orderly removal and reconnection of a storage device,
you complete a number of tasks.
1. Migrate virtual machines from the device you plan to detach. See the vCenter Server and Host Management documentation.
2. Unmount the datastore deployed on the device. See Unmount VMFS or NFS Datastores.
3. Detach the storage device. See Detach Storage Devices.
4. For an iSCSI device with a single LUN per target, delete the static target entry from each iSCSI HBA that has a path to the storage device. See Remove Static Targets in the vSphere Web Client.
5. Perform any necessary reconfiguration of the storage device by using the array console.
6. Reattach the storage device. See Attach Storage Devices.
7. Mount the datastore and restart the virtual machines. See Mount VMFS
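Several of these steps can also be performed from the command line. The following is a minimal sketch using esxcli; the datastore label and device identifier shown are hypothetical, so substitute your own values and verify the syntax against your vSphere CLI version:

# Unmount the datastore deployed on the device (hypothetical label "Datastore01")
esxcli storage filesystem unmount --volume-label=Datastore01
# Detach the storage device (hypothetical identifier)
esxcli storage core device set --state=off -d naa.60a98000572d54724a34655733506751
# After reconfiguring the array, reattach the device and mount the datastore again
esxcli storage core device set --state=on -d naa.60a98000572d54724a34655733506751
esxcli storage filesystem mount --volume-label=Datastore01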
5.14.3 Handling Transient APD Conditions
A storage device is considered to be in the all paths down (APD) state when it becomes
unavailable to your ESXi host for an unspecified period of time.
The reasons for an APD state can be, for example, a failed switch or a disconnected storage
cable.
In contrast with the permanent device loss (PDL) state, the host treats the APD state as
transient and expects the device to be available again.
The host indefinitely continues to retry issued commands in an attempt to reestablish
connectivity with the device. If the host's commands fail the retries for a prolonged period of
time, the host and its virtual machines might be at risk of having performance problems and
potentially becoming unresponsive.
To avoid these problems, your host uses a default APD handling feature. When a device
enters the APD state, the system immediately turns on a timer and allows your host to
continue retrying nonvirtual machine commands for a limited time period.
By default, the APD timeout is set to 140 seconds, which is typically longer than most devices
need to recover from a connection loss. If the device becomes available within this time, the
host and its virtual machine continue to run without experiencing any problems.
If the device does not recover and the timeout ends, the host stops its attempts at retries and
terminates any nonvirtual machine I/O. Virtual machine I/O will continue retrying. The
vSphere Web Client displays the following information for the device with the expired APD
timeout:
• The operational state of the device changes to Dead or Error.
• All paths are shown as Dead.
• Datastores on the device are dimmed.
Even though the device and datastores are unavailable, virtual machines remain responsive.
You can power off the virtual machines or migrate them to a different datastore or host.
If one or more device paths later becomes operational, subsequent I/O to the device is issued normally and all special APD treatment ends.
5.14.3.1 Disable Storage APD Handling
The storage all paths down (APD) handling on your ESXi host is enabled by default. When it
is enabled, the host continues to retry nonvirtual machine I/O commands to a storage device
in the APD state for a limited time period. When the time period expires, the host stops its
retry attempts and terminates any nonvirtual machine I/O. You can disable the APD
handling feature on your host.
If you disable the APD handling, the host will indefinitely continue to retry issued commands
in an attempt to reconnect to the APD device. Continuing to retry is the same behavior as in
ESXi version 5.0. This behavior might cause virtual machines on the host to exceed their
internal I/O timeout and become unresponsive or fail. The host might become disconnected
from vCenter Server.
Procedure
1. Browse to the host in the vSphere Web Client navigator.
2. Click the Manage tab, and click Settings.
3. Under System, click Advanced System Settings.
4. Under Advanced System Settings, select the Misc.APDHandlingEnable parameter and click the Edit icon.
5. Change the value to 0.
If you disabled the APD handling, you can reenable it when a device enters the APD state.
The internal APD handling feature turns on immediately and the timer starts with the
current timeout value for each device in APD.
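As an alternative to the Web Client procedure above, the same advanced option can be changed with esxcli. This is a sketch based on the parameter named in the procedure; verify it against your ESXi version before use:

# Disable storage APD handling (0 = disabled, 1 = enabled, which is the default)
esxcli system settings advanced set -o /Misc/APDHandlingEnable -i 0
# Confirm the current value
esxcli system settings advanced list -o /Misc/APDHandlingEnable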
5.14.3.2 Change Timeout Limits for Storage APD
The timeout parameter controls how many seconds the ESXi host will retry nonvirtual
machine I/O commands to a storage device in an all paths down (APD) state. If needed, you
can change the default timeout value.
The timer starts immediately after the device enters the APD state. When the timeout
expires, the host marks the APD device as unreachable and fails any pending or new
nonvirtual machine I/O. Virtual machine I/O will continue to be retried.
The default timeout parameter on your host is 140 seconds. You can increase the value of the
timeout if, for example, storage devices connected to your ESXi host take longer than 140
seconds to recover from a connection loss.
Note
If you change the timeout value while an APD condition is in progress, the change does not affect the timeout for that APD condition.
Procedure
1. Browse to the host in the vSphere Web Client navigator.
2. Click the Manage tab, and click Settings.
3. Under System, click Advanced System Settings.
4. Under Advanced System Settings, select the Misc.APDTimeout parameter and click the Edit icon.
5. Change the default value. You can enter a value between 20 and 99999 seconds.
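The same timeout can be changed from the command line. A minimal sketch, assuming the esxcli advanced-settings namespace available on ESXi 5.x; the value 300 is only an example:

# Raise the APD timeout from the default 140 seconds to, for example, 300 seconds
esxcli system settings advanced set -o /Misc/APDTimeout -i 300
# Confirm the current value
esxcli system settings advanced list -o /Misc/APDTimeout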
5.14.4 Check the Connection Status of a Storage Device
Use the esxcli command to verify the connection status of a particular storage device.
In the procedure, --server=server_name specifies the target server. The specified target
server prompts you for a user name and password. Other connection options, such as a
configuration file or session file, are supported. For a list of connection options, see Getting
Started with vSphere Command-Line Interfaces.
Prerequisites
Install vCLI or deploy the vSphere Management Assistant (vMA) virtual machine. See
Getting Started with vSphere Command-Line Interfaces. For troubleshooting, run esxcli
commands in the ESXi Shell.
Procedure
1. Run the esxcli --server=server_name storage core device list -d=device_ID command.
2. Check the connection status in the Status: field.
o on - Device is connected.
o dead - Device has entered the APD state. The APD timer starts.
o dead timeout - The APD timeout has expired.
o not connected - Device is in the PDL state.
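For example, to check a single device, pass its identifier to the command from step 1. The host name and device ID below are hypothetical:

esxcli --server=esxi01.example.com storage core device list -d naa.60a98000572d54724a34655733506751
# Inspect the "Status:" line of the output and interpret it using the list above.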
5.14.5 PDL Conditions and High Availability
When a datastore enters a Permanent Device Loss (PDL) state, High Availability (HA) can
power off virtual machines located on the datastore and then restart them on an available
datastore. VMware offers advanced options to regulate the power off and restart operations
for virtual machines.
Advanced Parameters to Regulate PDL

disk.terminateVMOnPDLDefault
When set to true, this option enables default power off for all virtual machines on the ESXi host.

scsi0:0.terminateVMOnPDL
Power off parameter that you can set for each individual virtual machine. This option overrides disk.terminateVMOnPDLDefault.

das.maskCleanShutdownEnabled
This option is set to true by default. It allows HA to restart virtual machines that were powered off while the PDL condition was in progress. When this option is set to true, HA restarts all virtual machines, including those that were intentionally powered off by a user.
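The first two parameters above are plain key/value entries, while das.maskCleanShutdownEnabled is a vSphere HA cluster advanced option. The following sketch shows where each entry typically lives in a vSphere 5.x environment; the file locations are assumptions, so verify the placement against the documentation for your version:

# /etc/vmware/settings on each ESXi host - host-wide default:
disk.terminateVMOnPDLDefault = "TRUE"
# Virtual machine .vmx file (or Configuration Parameters in the Web Client) - per-VM override:
scsi0:0.terminateVMOnPDL = "TRUE"
# Cluster Settings > vSphere HA > Advanced Options - cluster-wide HA behavior:
das.maskCleanShutdownEnabled = "true"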
5.15 Best Practices for SSD Devices
Follow these best practices when you use SSD devices in a vSphere environment.
• Use datastores that are created on SSD storage devices to allocate space for ESXi host cache. For more information, see the vSphere Resource Management documentation.
• Make sure to use the latest firmware with SSD devices. Frequently check with your storage vendors for any updates.
• Carefully monitor how intensively you use the SSD device and calculate its estimated lifetime. The lifetime expectancy depends on how actively you continue to use the SSD device.
5.15.1 Estimate SSD Lifetime
When working with SSDs, monitor how actively you use them and calculate their estimated
lifetime.
Typically, storage vendors provide reliable lifetime estimates for an SSD under ideal
conditions. For example, a vendor might guarantee a lifetime of 5 years under the condition
of 20GB writes per day. However, the more realistic life expectancy of the SSD will depend
on how many writes per day your host actually generates. Follow these steps to calculate the
lifetime of the SSD.
Procedure
1. Obtain the number of writes on the SSD by running the esxcli storage core device stats get -d=device_ID command. The Write Operations item in the output shows the number. You can average this number over a period of time.
2. Estimate the lifetime of your SSD by using the following formula:
   vendor-provided number of writes per day x vendor-provided life span / actual average number of writes per day
   For example, if your vendor guarantees a lifetime of 5 years under the condition of 20GB writes per day, and the actual number of writes per day is 30GB, the life span of your SSD will be approximately 3.3 years.
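The same estimate can be scripted on any system with a POSIX shell. A minimal arithmetic sketch using the hypothetical figures from the example above:

# vendor-provided GB/day * vendor-provided lifetime (years) / actual GB/day
awk 'BEGIN { printf "Estimated SSD lifetime: %.1f years\n", (20 * 5) / 30 }'
# Prints approximately 3.3 years for the example figures.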
5.15.2 How VMFS5 Differs from VMFS3
VMFS5 provides many improvements in scalability and performance over the previous
version.
VMFS5 has the following improvements:
• Greater than 2TB storage devices for each VMFS extent.
• Increased resource limits such as file descriptors.
• Standard 1MB file system block size with support of 2TB virtual disks.
• Greater than 2TB disk size for RDMs in physical compatibility mode.
• Support of small files of 1KB.
• With ESXi 5.1, any file located on a VMFS5 datastore, new or upgraded from VMFS3, can be opened in a shared mode by a maximum of 32 hosts. VMFS3 continues to support 8 hosts or fewer for file sharing. This affects VMware products that use linked clones, such as View Manager.
• Scalability improvements on storage devices that support hardware acceleration. For information, see Storage Hardware Acceleration.
• Default use of hardware assisted locking, also called atomic test and set (ATS) locking, on storage devices that support hardware acceleration. For information about how to turn off ATS locking, see Turn off ATS Locking.
• Ability to reclaim physical storage space on thin provisioned storage devices. For information, see Array Thin Provisioning and VMFS Datastores.
• Online upgrade process that upgrades existing datastores without disrupting hosts or virtual machines that are currently running. For information, see Upgrading VMFS Datastores.
For information about block size limitations of a VMFS datastore, see the VMware
knowledge base article at http://kb.vmware.com/kb/1003565.
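To confirm which VMFS version an existing datastore uses, you can query the file system from the ESXi Shell or vCLI. A sketch with a hypothetical datastore name:

# Print file system metadata, including the VMFS version and block size
vmkfstools -Ph /vmfs/volumes/Datastore01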
5.15.3 VMFS Datastores and Storage Disk Formats
Storage devices that your host supports can use either the master boot record (MBR) format
or the GUID partition table (GPT) format.
With ESXi 5.0 and later, if you create a new VMFS5 datastore, the device is formatted with
GPT. The GPT format enables you to create datastores larger than 2TB and up to 64TB for a
single extent.
VMFS3 datastores continue to use the MBR format for their storage devices. Consider the
following items when you work with VMFS3 datastores:
• For VMFS3 datastores, the 2TB limit still applies, even when the storage device has a capacity of more than 2TB. To be able to use the entire storage space, upgrade a VMFS3 datastore to VMFS5.
• When you upgrade a VMFS3 datastore to VMFS5, the datastore continues to use the MBR format. Conversion of the MBR format to GPT happens only after you expand the datastore to a size larger than 2TB.
• When you upgrade a VMFS3 datastore, remove from the storage device any partitions that ESXi does not recognize, for example, partitions that use the EXT2 or EXT3 formats. Otherwise, the host cannot format the device with GPT and the upgrade fails.
• You cannot expand a VMFS3 datastore on devices that have the GPT partition format.
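You can check whether a device currently carries an MBR (msdos) or GPT partition table from the ESXi Shell. A sketch with a hypothetical device identifier:

# The first line of the output reports the partition table type (gpt or msdos)
partedUtil getptbl /vmfs/devices/disks/naa.60a98000572d54724a34655733506751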
5.15.4 VMFS Datastores as Repositories
ESXi can format SCSI-based storage devices as VMFS datastores. VMFS datastores primarily
serve as repositories for virtual machines.
With VMFS5, you can have up to 256 VMFS datastores per host, with the maximum size of
64TB. The required minimum size for a VMFS datastore is 1.3GB, however, the
recommended minimum size is 2GB.
Note
Always have only one VMFS datastore for each LUN.
You can store multiple virtual machines on the same VMFS datastore. Each virtual machine,
encapsulated in a set of files, occupies a separate single directory. For the operating system
inside the virtual machine, VMFS preserves the internal file system semantics, which ensures
correct application behavior and data integrity for applications running in virtual machines.
When you run multiple virtual machines, VMFS provides specific locking mechanisms for
virtual machine files, so that virtual machines can operate safely in a SAN environment
where multiple ESXi hosts share the same VMFS datastore.
In addition to virtual machines, the VMFS datastores can store other files, such as virtual
machine templates and ISO images.
5.15.5 VMFS Metadata Updates
A VMFS datastore holds virtual machine files, directories, symbolic links, RDM descriptor
files, and so on. The datastore also maintains a consistent view of all the mapping
information for these objects. This mapping information is called metadata.
Metadata is updated each time you perform datastore or virtual machine management
operations. Examples of operations requiring metadata updates include the following:
• Creating, growing, or locking a virtual machine file
• Changing a file's attributes
• Powering a virtual machine on or off
• Creating or deleting a VMFS datastore
• Expanding a VMFS datastore
• Creating a template
• Deploying a virtual machine from a template
• Migrating a virtual machine with vMotion
When metadata changes are made in a shared storage environment, VMFS uses special locking mechanisms to protect its data and prevent multiple hosts from concurrently writing to the metadata.
5.15.6 VMFS Locking Mechanisms
In a shared storage environment, when multiple hosts access the same VMFS datastore,
specific locking mechanisms are used. These locking mechanisms prevent multiple hosts from
concurrently writing to the metadata and ensure that no data corruption occurs.
VMFS supports SCSI reservations and atomic test and set (ATS) locking.
5.15.6.1 SCSI Reservations
VMFS uses SCSI reservations on storage devices that do not support hardware acceleration.
SCSI reservations lock an entire storage device while an operation that requires metadata
protection is performed. After the operation completes, VMFS releases the reservation and
other operations can continue. Because this lock is exclusive, excessive SCSI reservations by
a host can cause performance degradation on other hosts that are accessing the same VMFS.
For information about how to reduce SCSI reservations, see the vSphere Troubleshooting
documentation.
5.15.6.2 Atomic Test and Set (ATS)
For storage devices that support hardware acceleration, VMFS uses the ATS algorithm, also
called hardware assisted locking. In contrast with SCSI reservations, ATS supports discrete
locking per disk sector. For information about hardware acceleration, see Storage Hardware
Acceleration.
Mechanisms that VMFS uses to apply different types of locking depend on the VMFS
version.
Use of ATS Locking on Devices with Hardware Acceleration Support

Single extent:
  New VMFS5 - ATS only
  Upgraded VMFS5 - ATS, but can revert to SCSI reservations
  VMFS3 - ATS, but can revert to SCSI reservations

Multiple extents:
  New VMFS5 - Spans only over ATS-capable devices
  Upgraded VMFS5 - ATS, except when locks are on non-head extents
  VMFS3 - ATS, except when locks are on non-head extents
In certain cases, you might need to turn off the ATS-only setting for a new VMFS5 datastore.
For information, see Turn off ATS Locking.
5.15.6.3 Turn off ATS Locking
When you create a VMFS5 datastore on a device that supports atomic test and set (ATS)
locking, the datastore is set to the ATS-only mode. In certain circumstances, you might need
to turn off the ATS mode setting.
Turn off the ATS setting when, for example, your storage device is downgraded or firmware
updates fail and the device no longer supports hardware acceleration. The option that you
use to turn off the ATS setting is available only through the ESXi Shell. For more
information, see the Getting Started with vSphere Command-Line Interfaces.
Procedure
1. To turn off the ATS setting, run the following command:
vmkfstools --configATSOnly 0 device
The device parameter is the path to the head extent device on which VMFS5 was
deployed. Use the following format:
/vmfs/devices/disks/disk_ID:P
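For example, assuming a hypothetical head extent device and partition number, the invocation would look like the following; the identifier shown is illustrative only:

vmkfstools --configATSOnly 0 /vmfs/devices/disks/naa.60a98000572d54724a34655733506751:1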
5.15.7 Using Layer 3 Routed Connections to Access NFS Storage
When you use Layer 3 (L3) routed connections to access NFS storage, consider certain requirements and restrictions.
Ensure that your environment meets the following requirements:
• Use Cisco's Hot Standby Router Protocol (HSRP) on the IP router. If you are using a non-Cisco router, use Virtual Router Redundancy Protocol (VRRP) instead.
• Use Quality of Service (QoS) to prioritize NFS L3 traffic on networks with limited bandwidths, or on networks that experience congestion. See your router documentation for details.
• Follow Routed NFS L3 best practices recommended by your storage vendor. Contact your storage vendor for details.
• Disable Network I/O Resource Management (NetIORM).
• If you are planning to use systems with top-of-rack switches or switch-dependent I/O device partitioning, contact your system vendor for compatibility and support.
In an L3 environment, the following restrictions apply:
• The environment does not support VMware Site Recovery Manager.
• The environment supports only the NFS protocol. Do not use other storage protocols such as FCoE over the same physical network.
• The NFS traffic in this environment does not support IPv6.
• The NFS traffic in this environment can be routed only over a LAN. Other environments such as WAN are not supported.
• The environment does not support Distributed Virtual Switch (DVS).
5.16 Upgrading VMFS Datastores
If your datastores were formatted with VMFS2 or VMFS3, you can upgrade the datastores to
VMFS5.
When you perform datastore upgrades, consider the following items:
• To upgrade a VMFS2 datastore, you use a two-step process that involves upgrading VMFS2 to VMFS3 first. Because ESXi 5.0 and later hosts cannot access VMFS2 datastores, use a legacy host, ESX/ESXi 4.x or earlier, to access the VMFS2 datastore and perform the VMFS2 to VMFS3 upgrade.
• After you upgrade your VMFS2 datastore to VMFS3, the datastore becomes available on the ESXi 5.x host, where you complete the process of upgrading to VMFS5.
• You can perform a VMFS3 to VMFS5 upgrade while the datastore is in use with virtual machines powered on.
• While performing an upgrade, your host preserves all files on the datastore.
• The datastore upgrade is a one-way process. After upgrading your datastore, you cannot revert it back to its previous VMFS format.
An upgraded VMFS5 datastore differs from a newly formatted VMFS5.
Comparing Upgraded and Newly Formatted VMFS5 Datastores

Characteristics      Upgraded VMFS5                                        Formatted VMFS5
File block size      1, 2, 4, and 8MB                                      1MB
Subblock size        64KB                                                  8KB
Partition format     MBR. Conversion to GPT happens only after you         GPT
                     expand the datastore to a size larger than 2TB.
Datastore limits     Retains limits of VMFS3 datastore.
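The VMFS3 to VMFS5 step itself can also be started from the command line. A hedged sketch, assuming the esxcli vmfs upgrade namespace available in ESXi 5.x and a hypothetical datastore label:

# Upgrade a mounted VMFS3 datastore in place to VMFS5
esxcli storage vmfs upgrade -l Datastore01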
5.16.1 Increase VMFS Datastore Capacity in the vSphere Client
When you need to create virtual machines on a datastore, or when the virtual machines
running on a datastore require more space, you can dynamically increase the capacity of a
VMFS datastore.
Use one of the following methods to increase a VMFS datastore:
• Add a new extent. An extent is a partition on a storage device. You can add up to 32 extents of the same storage type to an existing VMFS datastore. The spanned VMFS datastore can use any or all of its extents at any time. It does not need to fill up a particular extent before using the next one.
• Grow an extent in an existing VMFS datastore, so that it fills the available adjacent capacity. Only extents with free space immediately after them are expandable.
Note
If a shared datastore has powered on virtual machines and becomes 100% full, you can
increase the datastore's capacity only from the host with which the powered on virtual
machines are registered.
5.17 Set Up Dynamic Disk Mirroring
Typically, you cannot use logical-volume manager software on virtual machines to mirror
virtual disks. However, if your Microsoft Windows virtual machines support dynamic disks,
you can protect the virtual machines from an unplanned storage device loss by mirroring
virtual disks across two SAN LUNs.
Prerequisites
• Use a Windows virtual machine that supports dynamic disks.
• Required privilege: Advanced
Procedure
1. Create a virtual machine with two virtual disks.
Make sure to place the disks on different datastores.
2. Log in to your virtual machine and configure the disks as dynamic mirrored disks.
See Microsoft documentation.
3. After the disks synchronise, power off the virtual machine.
4. Change virtual machine settings to allow the use of dynamic disk mirroring.
a. Right-click the virtual machine and select Edit Settings.
b. Click the VM Options tab and expand the Advanced menu.
c. Click Edit Configuration next to Configuration Parameters.
d. Click Add Row and add the following parameters:
   Name                                  Value
   scsi#.returnNoConnectDuringAPD        True
   scsi#.returnBusyOnNoConnectStatus     False
e. Click OK.
5.18 Creating a Diagnostic Partition
To run successfully, your host must have a diagnostic partition or a dump partition to store
core dumps for debugging and technical support.
Typically, a local diagnostic partition is created during ESXi installation. You can override
this default behavior if, for example, you use shared storage devices instead of local storage.
To prevent automatic disk formatting, detach the local storage devices from the host before
you install ESXi and power on the host for the first time. You can later create a diagnostic
partition on a local disk or on a private or shared SAN LUN using the client.
The following considerations apply:
• A diagnostic partition cannot be located on an iSCSI LUN accessed through the software iSCSI or dependent hardware iSCSI adapter. For more information about diagnostic partitions with iSCSI, see General Boot from iSCSI SAN Recommendations.
• Unless you are using diskless servers, set up a diagnostic partition on local storage.
• Each host must have a diagnostic partition of 110MB. If multiple hosts share a diagnostic partition on a SAN LUN, the partition should be large enough to accommodate core dumps of all hosts.
• If a host that uses a shared diagnostic partition fails, reboot the host and extract log files immediately after the failure. Otherwise, the second host that fails before you collect the diagnostic data of the first host might not be able to save the core dump.
To manage the host's diagnostic partition, use the vCLI commands. See vSphere Command-Line Interface Concepts and Examples.
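For example, the core dump partition can be inspected and activated with esxcli. This is a sketch; verify the exact options against your version of the vSphere CLI:

# List available and active diagnostic (core dump) partitions
esxcli system coredump partition list
# Automatically select and activate an accessible diagnostic partition
esxcli system coredump partition set --enable=true --smart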
5.18.1 Create a Diagnostic Partition in the vSphere Client
You can create a diagnostic partition for your host.
Procedure
1. Log in to the vSphere Client and select the host from the Inventory panel.
2. Click the Configuration tab and click Storage in the Hardware panel.
3. Click Datastores and click Add Storage.
4. Select Diagnostic and click Next. If you do not see Diagnostic as an option, the host already has a diagnostic partition.
5. Specify the type of diagnostic partition.
5.19 About Raw Device Mapping
An RDM is a mapping file in a separate VMFS volume that acts as a proxy for a raw physical storage device. The RDM allows a virtual machine to directly access and use the storage device. The RDM contains metadata for managing and redirecting disk access to the physical device.
The file gives you some of the advantages of direct access to a physical device while keeping
some advantages of a virtual disk in VMFS. As a result, it merges VMFS manageability with
raw device access.
RDMs can be described in terms such as mapping a raw device into a datastore, mapping a
system LUN, or mapping a disk file to a physical disk volume. All these terms refer to RDMs.
Raw Device Mapping
Although VMware recommends that you use VMFS datastores for most virtual disk storage,
on certain occasions, you might need to use raw LUNs or logical disks located in a SAN.
For example, you need to use raw LUNs with RDMs in the following situations:
• When SAN snapshot or other layered applications run in the virtual machine. The RDM better enables scalable backup offloading systems by using features inherent to the SAN.
• In any MSCS clustering scenario that spans physical hosts, including virtual-to-virtual clusters and physical-to-virtual clusters. In this case, cluster data and quorum disks should be configured as RDMs rather than as virtual disks on a shared VMFS.
Think of an RDM as a symbolic link from a VMFS volume to a raw LUN. The mapping makes
LUNs appear as files in a VMFS volume. The RDM, not the raw LUN, is referenced in the
virtual machine configuration. The RDM contains a reference to the raw LUN.
Using RDMs, you can:
• Use vMotion to migrate virtual machines using raw LUNs.
• Add raw LUNs to virtual machines using the vSphere Client or the vSphere Web Client.
• Use file system features such as distributed file locking, permissions, and naming.
Two compatibility modes are available for RDMs:
• Virtual compatibility mode allows an RDM to act exactly like a virtual disk file, including the use of snapshots.
• Physical compatibility mode allows direct access to the SCSI device for applications that require lower-level control. See RDM Virtual and Physical Compatibility Modes.
5.19.1 RDM Considerations and Limitations
Certain considerations and limitations exist when you use RDMs.
• The RDM is not available for direct-attached block devices or certain RAID devices. The RDM uses a SCSI serial number to identify the mapped device. Because block devices and some direct-attach RAID devices do not export serial numbers, they cannot be used with RDMs.
• If you are using the RDM in physical compatibility mode, you cannot use a snapshot with the disk. Physical compatibility mode allows the virtual machine to manage its own, storage-based, snapshot or mirroring operations.
• Virtual machine snapshots are available for RDMs with virtual compatibility mode.
• You cannot map to a disk partition. RDMs require the mapped device to be a whole LUN.
• If you use vMotion to migrate virtual machines with RDMs, make sure to maintain consistent LUN IDs for RDMs across all participating ESXi hosts.
5.20 Raw Device Mapping Characteristics
An RDM is a special mapping file in a VMFS volume that manages metadata for its mapped
device. The mapping file is presented to the management software as an ordinary disk file,
available for the usual file-system operations. To the virtual machine, the storage
virtualization layer presents the mapped device as a virtual SCSI device.
Key contents of the metadata in the mapping file include the location of the mapped device
(name resolution), the locking state of the mapped device, permissions, and so on.
5.20.1 RDM Virtual and Physical Compatibility Modes
You can use RDMs in virtual compatibility or physical compatibility modes. Virtual mode
specifies full virtualization of the mapped device. Physical mode specifies minimal SCSI
virtualization of the mapped device, allowing the greatest flexibility for SAN management
software.
In virtual mode, the VMkernel sends only READ and WRITE to the mapped device. The
mapped device appears to the guest operating system exactly the same as a virtual disk file in
a VMFS volume. The real hardware characteristics are hidden. If you are using a raw disk in
virtual mode, you can realize the benefits of VMFS such as advanced file locking for data
protection and snapshots for streamlining development processes. Virtual mode is also more
portable across storage hardware than physical mode, presenting the same behavior as a
virtual disk file.
In physical mode, the VMkernel passes all SCSI commands to the device, with one exception:
the REPORT LUNs command is virtualized so that the VMkernel can isolate the LUN to the
owning virtual machine. Otherwise, all physical characteristics of the underlying hardware
are exposed. Physical mode is useful to run SAN management agents or other SCSI target-based software in the virtual machine. Physical mode also allows virtual-to-physical
clustering for cost-effective high availability.
VMFS5 supports greater than 2TB disk size for RDMs in physical compatibility mode only.
The following restrictions apply:
• You cannot relocate larger than 2TB RDMs to datastores other than VMFS5.
• You cannot convert larger than 2TB RDMs to virtual disks, or perform other operations that involve RDM to virtual disk conversion. Such operations include cloning.
Features Available with Virtual Disks and Raw Device Mappings

ESXi Features                  Virtual Disk File       Virtual Mode RDM          Physical Mode RDM
SCSI Commands Passed Through   No                      No                        Yes (REPORT LUNs is not passed through)
vCenter Server Support         Yes                     Yes                       Yes
Snapshots                      Yes                     Yes                       No
Distributed Locking            Yes                     Yes                       Yes
Clustering                     Cluster-in-a-box only   Cluster-in-a-box,         Physical-to-virtual clustering,
                                                       cluster-across-boxes      cluster-across-boxes
SCSI Target-Based Software     No                      No                        Yes
VMware recommends that you use virtual disk files for the cluster-in-a-box type of
clustering. If you plan to reconfigure your cluster-in-a-box clusters as cluster-across-boxes
clusters, use virtual mode RDMs for the cluster-in-a-box clusters.
Option          Description
Persistent      Changes are immediately and permanently written to the disk.
Nonpersistent   Changes to the disk are discarded when you power off or revert to the snapshot.
5.21 VMkernel and Storage
The VMkernel is a high-performance operating system that runs directly on the ESXi host.
The VMkernel manages most of the physical resources on the hardware, including memory,
physical processors, storage, and networking controllers.
To manage storage, VMkernel has a storage subsystem that supports several Host Bus
Adapters (HBAs) including parallel SCSI, SAS, Fibre Channel, FCoE, and iSCSI. These HBAs
connect a wide variety of active-active, active-passive, and ALUA storage arrays that are
certified for use with the VMkernel. See the vSphere Compatibility Guide for a list of the
supported HBAs and storage arrays.
The primary file system that the VMkernel uses is the VMware Virtual Machine File System
(VMFS). VMFS is a cluster file system designed and optimized to support large files such as
virtual disks and swap files. The VMkernel also supports the storage of virtual disks on NFS
file systems.
The storage I/O path provides virtual machines with access to storage devices through device
emulation. This device emulation allows a virtual machine to access files on a VMFS or NFS
file system as if they were SCSI devices. The VMkernel provides storage virtualization
functions such as the scheduling of I/O requests from multiple virtual machines and
multipathing.
In addition, VMkernel offers several Storage APIs that enable storage partners to integrate
and optimize their products for vSphere.
The following graphic illustrates the basics of the VMkernel core, with special attention to
the storage stack. Storage‐related modules reside between the logical device I/O scheduler
and the adapter I/O scheduler layers.
VMkernel and Storage
This chapter includes the following topics:
• Storage APIs
5.21.1 Storage APIs
Storage APIs is a family of APIs used by third-party hardware, software, and storage
providers to develop components that enhance several vSphere features and solutions.
This publication describes the following sets of Storage APIs and explains how they
contribute to your storage environment. For information about other APIs from this family,
including Storage API - Data Protection and Storage API - Site Recovery Manager, see the
VMware Web site.
• Storage APIs - Multipathing, also known as the Pluggable Storage Architecture (PSA). PSA is a collection of VMkernel APIs that allows storage partners to enable and certify their arrays asynchronously to ESXi release schedules, as well as deliver performance-enhancing, multipathing and load-balancing behaviors that are optimized for each array. For more information, see Managing Multiple Paths.
• Storage APIs - Array Integration, formerly known as VAAI, include the following APIs:
  o Hardware Acceleration APIs. Allow arrays to integrate with vSphere to transparently offload certain storage operations to the array. This integration significantly reduces CPU overhead on the host. See Storage Hardware Acceleration.
  o Array Thin Provisioning APIs. Help to monitor space use on thin-provisioned storage arrays to prevent out-of-space conditions, and to perform space reclamation. See Array Thin Provisioning and VMFS Datastores.
• Storage APIs - Storage Awareness. These vCenter Server-based APIs enable storage arrays to inform vCenter Server about their configurations, capabilities, and storage health and events. See Using Storage Vendor Providers.
5.22 Understanding Multipathing and Failover
5.22.1 Host-Based Failover with iSCSI
When setting up your ESXi host for multipathing and failover, you can use multiple iSCSI
HBAs or multiple NICs depending on the type of iSCSI adapters on your host.
For information on different types of iSCSI adapters, see iSCSI Initiators.
When you use multipathing, specific considerations apply.
• ESXi does not support multipathing when you combine an independent hardware adapter with software iSCSI or dependent iSCSI adapters in the same host.
• Multipathing between software and dependent adapters within the same host is supported.
• On different hosts, you can mix both dependent and independent adapters.
The following illustration shows multipathing setups possible with different types of iSCSI
initiators.
Host-Based Path Failover
5.22.2 Failover with Hardware iSCSI
With hardware iSCSI, the host typically has two or more hardware iSCSI adapters available,
from which the storage system can be reached using one or more switches. Alternatively, the
setup might include one adapter and two storage processors so that the adapter can use a
different path to reach the storage system.
On the Host-Based Path Failover illustration, Host1 has two hardware iSCSI adapters, HBA1
and HBA2, that provide two physical paths to the storage system. Multipathing plug-ins on
your host, whether the VMkernel NMP or any third-party MPPs, have access to the paths by
default and can monitor health of each physical path. If, for example, HBA1 or the link
between HBA1 and the network fails, the multipathing plug-ins can switch the path over to
HBA2.
5.22.3 Failover with Software iSCSI
With software iSCSI, as shown on Host 2 of the Host-Based Path Failover illustration, you
can use multiple NICs that provide failover and load balancing capabilities for iSCSI
connections between your host and storage systems.
For this setup, because multipathing plug-ins do not have direct access to physical NICs on
your host, you first need to connect each physical NIC to a separate VMkernel port. You then
associate all VMkernel ports with the software iSCSI initiator using a port binding technique.
As a result, each VMkernel port connected to a separate NIC becomes a different path that
the iSCSI storage stack and its storage-aware multipathing plug-ins can use.
For information on how to configure multipathing for software iSCSI, see Setting Up iSCSI
Network.
5.23 Array-Based Failover with iSCSI
Some iSCSI storage systems manage path use of their ports automatically and transparently
to ESXi.
When using one of these storage systems, your host does not see multiple ports on the
storage and cannot choose the storage port it connects to. These systems have a single virtual
port address that your host uses to initially communicate. During this initial communication,
the storage system can redirect the host to communicate with another port on the storage
system. The iSCSI initiators in the host obey this reconnection request and connect with a
different port on the system. The storage system uses this technique to spread the load
across available ports.
If the ESXi host loses connection to one of these ports, it automatically attempts to reconnect
with the virtual port of the storage system, and should be redirected to an active, usable port.
This reconnection and redirection happens quickly and generally does not disrupt running
virtual machines. These storage systems can also request that iSCSI initiators reconnect to
the system, to change which storage port they are connected to. This allows the most
effective use of the multiple ports.
The Port Redirection illustration shows an example of port redirection. The host attempts to
connect to the 10.0.0.1 virtual port. The storage system redirects this request to 10.0.0.2. The
host connects with 10.0.0.2 and uses this port for I/O communication.
Note
The storage system does not always redirect connections. The port at 10.0.0.1 could be used
for traffic, also.
Port Redirection
If the port on the storage system that is acting as the virtual port becomes unavailable, the
storage system reassigns the address of the virtual port to another port on the system. Port
Reassignment shows an example of this type of port reassignment. In this case, the virtual
port 10.0.0.1 becomes unavailable and the storage system reassigns the virtual port IP
address to a different port. The second port responds to both addresses.
Port Reassignment
With this form of array-based failover, you can have multiple paths to the storage only if you
use multiple ports on the ESXi host. These paths are active-active. For additional
information, see iSCSI Session Management.
5.24 Path Failover and Virtual Machines
Path failover occurs when the active path to a LUN is changed from one path to another,
usually because of a SAN component failure along the current path.
When a path fails, storage I/O might pause for 30 to 60 seconds until your host determines
that the link is unavailable and completes failover. If you attempt to display the host, its
storage devices, or its adapters, the operation might appear to stall. Virtual machines with
their disks installed on the SAN can appear unresponsive. After failover is complete, I/O
resumes normally and the virtual machines continue to run.
However, when failovers take a long time to complete, a Windows virtual machine might
interrupt the I/O and eventually fail. To avoid the failure, set the disk timeout value for the
Windows virtual machine to at least 60 seconds.
5.24.1 Set Timeout on Windows Guest OS
Increase the standard disk timeout value on a Windows guest operating system to avoid
disruptions during a path failover.
This procedure explains how to change the timeout value by using the Windows registry.
Prerequisites
Back up the Windows registry.
Procedure
1. Select Start > Run.
2. Type regedit.exe, and click OK.
3. In the left-panel hierarchy view, double-click HKEY_LOCAL_MACHINE > System
> CurrentControlSet > Services > Disk.
4. Double-click TimeOutValue.
5. Set the value data to 0x3c (hexadecimal) or 60 (decimal) and click OK.
   After you make this change, Windows waits at least 60 seconds for delayed disk operations to complete before it generates errors.
6. Reboot the guest OS for the change to take effect.
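The same registry change can be applied from an elevated command prompt inside the guest, which is convenient when updating many virtual machines. A sketch of the equivalent reg.exe command:

reg add "HKLM\SYSTEM\CurrentControlSet\Services\Disk" /v TimeOutValue /t REG_DWORD /d 60 /f
rem Reboot the guest OS afterwards for the change to take effect.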
5.25 Managing Multiple Paths
To manage storage multipathing, ESXi uses a collection of Storage APIs, also called the
Pluggable Storage Architecture (PSA). The PSA is an open, modular framework that
coordinates the simultaneous operation of multiple multipathing plug-ins (MPPs). The PSA allows third-party software developers to design their own load-balancing techniques and failover mechanisms for particular storage arrays, and to insert their code directly into the ESXi storage I/O path.
Topics discussing path management use the following acronyms.
Multipathing Acronyms

PSA    Pluggable Storage Architecture
NMP    Native Multipathing Plug-In. Generic VMware multipathing module.
PSP    Path Selection Plug-In, also called Path Selection Policy. Handles path selection for a given device.
SATP   Storage Array Type Plug-In, also called Storage Array Type Policy. Handles path failover for a given storage array.
The VMkernel multipathing plug-in that ESXi provides by default is the VMware Native
Multipathing Plug-In (NMP). The NMP is an extensible module that manages sub plug-ins.
There are two types of NMP sub plug-ins, Storage Array Type Plug-Ins (SATPs), and Path
Selection Plug-Ins (PSPs). SATPs and PSPs can be built-in and provided by VMware, or can
be provided by a third party.
If more multipathing functionality is required, a third party can also provide an MPP to run
in addition to, or as a replacement for, the default NMP.
When coordinating the VMware NMP and any installed third-party MPPs, the PSA performs
the following tasks:
• Loads and unloads multipathing plug-ins.
• Hides virtual machine specifics from a particular plug-in.
• Routes I/O requests for a specific logical device to the MPP managing that device.
• Handles I/O queueing to the logical devices.
• Implements logical device bandwidth sharing between virtual machines.
• Handles I/O queueing to the physical storage HBAs.
• Handles physical path discovery and removal.
• Provides logical device and physical path I/O statistics.
As the Pluggable Storage Architecture illustration shows, multiple third-party MPPs can run
in parallel with the VMware NMP. When installed, the third-party MPPs replace the
behavior of the NMP and take complete control of the path failover and the load-balancing
operations for specified storage devices.
Pluggable Storage Architecture
The multipathing modules perform the following operations:
• Manage physical path claiming and unclaiming.
• Manage creation, registration, and deregistration of logical devices.
• Associate physical paths with logical devices.
• Support path failure detection and remediation.
• Process I/O requests to logical devices:
  o Select an optimal physical path for the request.
  o Depending on a storage device, perform specific actions necessary to handle path failures and I/O command retries.
• Support management tasks, such as reset of logical devices.
5.26 VMware Multipathing Module
By default, ESXi provides an extensible multipathing module called the Native Multipathing
Plug-In (NMP).
Generally, the VMware NMP supports all storage arrays listed on the VMware storage HCL
and provides a default path selection algorithm based on the array type. The NMP associates
a set of physical paths with a specific storage device, or LUN. The specific details of handling
path failover for a given storage array are delegated to a Storage Array Type Plug-In (SATP).
The specific details for determining which physical path is used to issue an I/O request to a
storage device are handled by a Path Selection Plug-In (PSP). SATPs and PSPs are sub plug-ins within the NMP module.
With ESXi, the appropriate SATP for an array you use will be installed automatically. You do
not need to obtain or download any SATPs.
5.26.1 VMware SATPs
Storage Array Type Plug-Ins (SATPs) run in conjunction with the VMware NMP and are
responsible for array-specific operations.
ESXi offers a SATP for every type of array that VMware supports. It also provides default
SATPs that support non-specific active-active and ALUA storage arrays, and the local SATP
for direct-attached devices. Each SATP accommodates special characteristics of a certain
class of storage arrays and can perform the array-specific operations required to detect path
state and to activate an inactive path. As a result, the NMP module itself can work with
multiple storage arrays without having to be aware of the storage device specifics.
After the NMP determines which SATP to use for a specific storage device and associates the
SATP with the physical paths for that storage device, the SATP implements the tasks that
include the following:
• Monitors the health of each physical path.
• Reports changes in the state of each physical path.
• Performs array-specific actions necessary for storage fail-over. For example, for active-passive devices, it can activate passive paths.
5.26.2 VMware PSPs
Path Selection Plug-Ins (PSPs) are sub plug-ins of the VMware NMP and are responsible for
choosing a physical path for I/O requests.
The VMware NMP assigns a default PSP for each logical device based on the SATP associated
with the physical paths for that device. You can override the default PSP. For information,
see Path Scanning and Claiming.
By default, the VMware NMP supports the following PSPs:
VMW_PSP_MRU
The host selects the path that it used most recently. When the path becomes unavailable, the host selects an alternative path. The host does not revert back to the original path when that path becomes available again. There is no preferred path setting with the MRU policy. MRU is the default policy for most active-passive storage devices.
The VMW_PSP_MRU ranking capability allows you to assign ranks to individual paths. To set ranks to individual paths, use the esxcli storage nmp psp generic pathconfig set command. For details, see the VMware knowledge base article at http://kb.vmware.com/kb/2003468.
The policy is displayed in the client as the Most Recently Used (VMware) path selection policy.

VMW_PSP_FIXED
The host uses the designated preferred path, if it has been configured. Otherwise, it selects the first working path discovered at system boot time. If you want the host to use a particular preferred path, specify it manually. Fixed is the default policy for most active-active storage devices.
Note
If the host uses a default preferred path and the path's status turns to Dead, a new path is selected as preferred. However, if you explicitly designate the preferred path, it will remain preferred even when it becomes inaccessible.
Displayed in the client as the Fixed (VMware) path selection policy.

VMW_PSP_RR
The host uses an automatic path selection algorithm rotating through all active paths when connecting to active-passive arrays, or through all available paths when connecting to active-active arrays. RR is the default for a number of arrays and can be used with both active-active and active-passive arrays to implement load balancing across paths for different LUNs.
Displayed in the client as the Round Robin (VMware) path selection policy.
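If you need to override the default PSP for a particular device, esxcli can do so per device. A sketch with a hypothetical device identifier; substitute your own device ID and the policy you require:

# Show the SATP and PSP currently assigned to the device
esxcli storage nmp device list --device=naa.60a98000572d54724a34655733506751
# Change the path selection policy for that device to Round Robin
esxcli storage nmp device set --device=naa.60a98000572d54724a34655733506751 --psp=VMW_PSP_RR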
5.26.3 VMware NMP Flow of I/O
When a virtual machine issues an I/O request to a storage device managed by the NMP, the
following process takes place.
1. The NMP calls the PSP assigned to this storage device.
2. The PSP selects an appropriate physical path on which to issue the I/O.
3. The NMP issues the I/O request on the path selected by the PSP.
4. If the I/O operation is successful, the NMP reports its completion.
5. If the I/O operation reports an error, the NMP calls the appropriate SATP.
6. The SATP interprets the I/O command errors and, when appropriate, activates the inactive paths.
7. The PSP is called to select a new path on which to issue the I/O.
5.27 Path Scanning and Claiming
When you start your ESXi host or rescan your storage adapter, the host discovers all physical
paths to storage devices available to the host. Based on a set of claim rules, the host
determines which multipathing plug-in (MPP) should claim the paths to a particular device
and become responsible for managing the multipathing support for the device.
By default, the host performs a periodic path evaluation every 5 minutes causing any
unclaimed paths to be claimed by the appropriate MPP.
The claim rules are numbered. For each physical path, the host runs through the claim rules
starting with the lowest number first. The attributes of the physical path are compared to the
path specification in the claim rule. If there is a match, the host assigns the MPP specified in
the claim rule to manage the physical path. This continues until all physical paths are
claimed by corresponding MPPs, either third-party multipathing plug-ins or the native
multipathing plug-in (NMP).
For the paths managed by the NMP module, a second set of claim rules is applied. These
rules determine which Storage Array Type Plug-In (SATP) should be used to manage the
paths for a specific array type, and which Path Selection Plug-In (PSP) is to be used for each
storage device.
Use the vSphere Client or the vSphere Web Client to view which SATP and PSP the host is
using for a specific storage device and the status of all available paths for this storage device.
If needed, you can change the default VMware PSP using the client. To change the default
SATP, you need to modify claim rules using the vSphere CLI.
You can find some information about modifying claim rules in Managing Storage Paths and
Multipathing Plug-Ins.
For more information about the commands available to manage PSA, see Getting Started
with vSphere Command-Line Interfaces.
For a complete list of storage arrays and corresponding SATPs and PSPs, see the SAN Array
Model Reference section of the vSphere Compatibility Guide.
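If you need to change the default PSP that the NMP pairs with a particular SATP, the change is made with the vSphere CLI rather than the client. A minimal sketch, assuming the VMW_SATP_ALUA / VMW_PSP_RR combination is appropriate for your array; confirm with your storage vendor before applying it:
# List the SATPs loaded on the host and their current default PSPs
esxcli storage nmp satp list
# Set Round Robin as the default PSP for devices claimed by VMW_SATP_ALUA
esxcli storage nmp satp set --satp VMW_SATP_ALUA --default-psp VMW_PSP_RR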
5.27.1 Viewing the Paths Information
You can review the storage array type policy (SATP) and path selection policy (PSP) that the
ESXi host uses for a specific storage device and the status of all available paths for this
storage device. You can access the path information from both the Datastores and Devices
views. For datastores, you review the paths that connect to the device the datastore is
deployed on.
The path information includes the SATP assigned to manage the device, the PSP, a list of
paths, and the status of each path. The following path status information can appear:
Active   Paths available for issuing I/O to a LUN. A single or multiple working paths currently used for transferring data are marked as Active (I/O).
Standby  If active paths fail, the path can quickly become operational and can be used for I/O.
Disabled The path is disabled and no data can be transferred.
Dead     The software cannot connect to the disk through this path.
If you are using the Fixed path policy, you can see which path is the preferred path. The
preferred path is marked with an asterisk (*) in the Preferred column.
For each path you can also display the path's name. The name includes parameters that
describe the path: adapter ID, target ID, and device ID. Usually, the path's name has the
format similar to the following:
fc.adapterID-fc.targetID-naa.deviceID
Note
When you use the host profiles editor to edit paths, you must specify all three parameters
that describe a path: adapter ID, target ID, and device ID.
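The same path details are also available from the command line, which can be convenient when you check many hosts. A minimal sketch; the device identifier is a placeholder:
# List every path to one device, including its state (active, standby, dead) and adapter/target/LUN identifiers
esxcli storage core path list -d naa.xxx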
5.28 Managing Storage Paths and Multipathing Plug-Ins
Use the esxcli commands to manage the PSA multipathing plug-ins and storage paths
assigned to them.
You can display all multipathing plug-ins available on your host. You can list any third-party
MPPs, as well as your host's NMP and SATPs and review the paths they claim. You can also
define new paths and specify which multipathing plug-in should claim the paths.
For more information about commands available to manage PSA, see the Getting Started
with vSphere Command-Line Interfaces.
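For example, to see which multipathing plug-ins are currently registered with the PSA, you can run the following from the ESXi Shell or vCLI; the output normally includes at least the NMP, plus any third-party MPPs that are installed:
# List the multipathing (MP class) plug-ins loaded on this host
esxcli storage core plugin list --plugin-class=MP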
5.29 Multipathing Considerations
Specific considerations apply when you manage storage multipathing plug-ins and claim
rules.
The following considerations help you with multipathing:
• If no SATP is assigned to the device by the claim rules, the default SATP for iSCSI or FC devices is VMW_SATP_DEFAULT_AA. The default PSP is VMW_PSP_FIXED.
• When the system searches the SATP rules to locate a SATP for a given device, it searches the driver rules first. If there is no match, the vendor/model rules are searched, and finally the transport rules are searched. If no match occurs, NMP selects a default SATP for the device. (You can review these rules with the command shown after this list.)
• If VMW_SATP_ALUA is assigned to a specific storage device, but the device is not ALUA-aware, no claim rule match occurs for this device. The device is claimed by the default SATP based on the device's transport type.
• The default PSP for all devices claimed by VMW_SATP_ALUA is VMW_PSP_MRU. The VMW_PSP_MRU selects an active/optimized path as reported by the VMW_SATP_ALUA, or an active/unoptimized path if there is no active/optimized path. This path is used until a better path is available (MRU). For example, if the VMW_PSP_MRU is currently using an active/unoptimized path and an active/optimized path becomes available, the VMW_PSP_MRU will switch the current path to the active/optimized one.
• While VMW_PSP_MRU is typically selected for ALUA arrays by default, certain ALUA storage arrays need to use VMW_PSP_FIXED. To check whether your storage array requires VMW_PSP_FIXED, see the VMware Compatibility Guide or contact your storage vendor. When using VMW_PSP_FIXED with ALUA arrays, unless you explicitly specify a preferred path, the ESXi host selects the most optimal working path and designates it as the default preferred path. If the host-selected path becomes unavailable, the host selects an alternative available path. However, if you explicitly designate the preferred path, it will remain preferred no matter what its status is.
• By default, the PSA claim rule 101 masks Dell array pseudo devices. Do not delete this rule, unless you want to unmask these devices.
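The driver, vendor/model, and transport rules mentioned above can be inspected directly. A minimal sketch of the command involved:
# Show the SATP claim rules (driver, vendor/model, and transport rules) evaluated for each device
esxcli storage nmp satp rule list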
5.29.1 List Multipathing Claim Rules for the Host
Use the esxcli command to list available multipathing claim rules.
Claim rules indicate which multipathing plug-in, the NMP or any third-party MPP, manages
a given physical path. Each claim rule identifies a set of paths based on the following
parameters:
• Vendor/model strings
• Transportation, such as SATA, IDE, Fibre Channel, and so on
• Adapter, target, or LUN location
• Device driver, for example, Mega-RAID
In the procedure, --server=server_name specifies the target server. The specified target
server prompts you for a user name and password. Other connection options, such as a
configuration file or session file, are supported. For a list of connection options, see Getting
Started with vSphere Command-Line Interfaces.
Prerequisites
Install vCLI or deploy the vSphere Management Assistant (vMA) virtual machine. See
Getting Started with vSphere Command-Line Interfaces. For troubleshooting, run esxcli
commands in the ESXi Shell.
Procedure
1. Run the esxcli --server=server_name storage core claimrule list --claimrule-class=MP command to list the multipathing claim rules.
5.29.1.1 Example: Sample Output of the esxcli storage core claimrule list Command
Rule Class  Rule   Class    Type       Plugin     Matches
MP          0      runtime  transport  NMP        transport=usb
MP          1      runtime  transport  NMP        transport=sata
MP          2      runtime  transport  NMP        transport=ide
MP          3      runtime  transport  NMP        transport=block
MP          4      runtime  transport  NMP        transport=unknown
MP          101    runtime  vendor     MASK_PATH  vendor=DELL model=Universal Xport
MP          101    file     vendor     MASK_PATH  vendor=DELL model=Universal Xport
MP          200    runtime  vendor     MPP_1      vendor=NewVend model=*
MP          200    file     vendor     MPP_1      vendor=NewVend model=*
MP          201    runtime  location   MPP_2      adapter=vmhba41 channel=* target=* lun=*
MP          201    file     location   MPP_2      adapter=vmhba41 channel=* target=* lun=*
MP          202    runtime  driver     MPP_3      driver=megaraid
MP          202    file     driver     MPP_3      driver=megaraid
MP          65535  runtime  vendor     NMP        vendor=* model=*
This example indicates the following:
• The NMP claims all paths connected to storage devices that use the USB, SATA, IDE, and Block SCSI transportation.
• You can use the MASK_PATH module to hide unused devices from your host. By default, the PSA claim rule 101 masks Dell array pseudo devices with a vendor string of DELL and a model string of Universal Xport.
• The MPP_1 module claims all paths connected to any model of the NewVend storage array.
• The MPP_3 module claims the paths to storage devices controlled by the Mega-RAID device driver.
• Any paths not described in the previous rules are claimed by NMP.
• The Rule Class column in the output describes the category of a claim rule. It can be MP (multipathing plug-in), Filter, or VAAI.
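If you need to add your own MP claim rule, for example to hand a particular vendor/model over to the NMP or to a third-party MPP, the pattern looks like the following. This is a minimal sketch only; the rule number 500 and the NewVend/NewMod strings are placeholders taken from the example above, not values to use verbatim:
# Define a new claim rule in the configuration file (the rule appears with Class=file)
esxcli storage core claimrule add --rule 500 --type vendor --vendor NewVend --model NewMod --plugin NMP
# Load the rule into the running system so that it also appears with Class=runtime
esxcli storage core claimrule load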
5.29.2 Hardware Acceleration for Block Storage Devices
With hardware acceleration, your host can integrate with block storage devices, Fibre
Channel or iSCSI, and use certain storage array operations.
ESXi hardware acceleration supports the following array operations:
• Full copy, also called clone blocks or copy offload. Enables the storage arrays to make full copies of data within the array without having the host read and write the data. This operation reduces the time and network load when cloning virtual machines, provisioning from a template, or migrating with vMotion.
• Block zeroing, also called write same. Enables storage arrays to zero out a large number of blocks to provide newly allocated storage, free of previously written data. This operation reduces the time and network load when creating virtual machines and formatting virtual disks.
• Hardware assisted locking, also called atomic test and set (ATS). Supports discrete virtual machine locking without use of SCSI reservations. This operation allows disk locking per sector, instead of the entire LUN as with SCSI reservations.
Check with your vendor for the hardware acceleration support. Certain storage arrays
require that you activate the support on the storage side.
On your host, the hardware acceleration is enabled by default. If your storage does not
support the hardware acceleration, you can disable it.
In addition to hardware acceleration support, ESXi includes support for array thin
provisioning. For information, see Array Thin Provisioning and VMFS Datastores.
If the storage device supports the T10 SCSI standard, the ESXi host can communicate with it directly and does not require the VAAI plug-ins.
If the device does not support T10 SCSI or provides partial support, ESXi reverts to using the
VAAI plug-ins, installed on your host, or uses a combination of the T10 SCSI commands and
plug-ins. The VAAI plug-ins are vendor-specific and can be either VMware or partner
developed. To manage the VAAI capable device, your host attaches the VAAI filter and
vendor-specific VAAI plug-in to the device.
For information about whether your storage requires VAAI plug-ins or supports hardware
acceleration through T10 SCSI commands, see the vSphere Compatibility Guide or check
with your storage vendor.
You can use several esxcli commands to query storage devices for the hardware acceleration
support information. For the devices that require the VAAI plug-ins, the claim rule
commands are also available. For information about esxcli commands, see Getting Started
with vSphere Command-Line Interfaces.
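For example, to check whether a given device reports hardware acceleration support, and to turn off one of the offloads host-wide if your array has problems with it, you can use commands like the following. A minimal sketch; the device ID is a placeholder, and you should only change the advanced options if your storage vendor or VMware support advises it:
# Report VAAI (ATS, clone, zero, delete) support status for one device
esxcli storage core device vaai status get -d naa.xxx
# Disable the block-zeroing (write same) offload for the whole host; set the value back to 1 to re-enable it
esxcli system settings advanced set --option /DataMover/HardwareAcceleratedInit --int-value 0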
5.29.2.1 Display Hardware Acceleration Plug-Ins and Filter
To communicate with the devices that do not support the T10 SCSI standard, your host uses
a combination of a single VAAI filter and a vendor-specific VAAI plug-in. Use the esxcli
command to view the hardware acceleration filter and plug-ins currently loaded into your
system.
In the procedure, --server=server_name specifies the target server. The specified target
server prompts you for a user name and password. Other connection options, such as a
configuration file or session file, are supported. For a list of connection options, see Getting
Started with vSphere Command-Line Interfaces.
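A minimal sketch of the commands involved; run them from the ESXi Shell or through vCLI with the --server option described above:
# List the VAAI filter loaded on the host
esxcli storage core plugin list --plugin-class=Filter
# List the vendor-specific VAAI plug-ins loaded on the host
esxcli storage core plugin list --plugin-class=VAAI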
5.30 Hardware Acceleration on NAS Devices
Hardware acceleration allows your host to integrate with NAS devices and use several
hardware operations that NAS storage provides.
The following list shows the supported NAS operations:
• Full file clone. This operation is similar to the VMFS block cloning except that NAS devices clone entire files instead of file segments.
• Reserve space. Enables storage arrays to allocate space for a virtual disk file in thick format.
Typically, when you create a virtual disk on an NFS datastore, the NAS server determines the
allocation policy. The default allocation policy on most NAS servers is thin and does not
guarantee backing storage to the file. However, the reserve space operation can instruct the
NAS device to use vendor-specific mechanisms to reserve space for a virtual disk. As a result,
you can create thick virtual disks on the NFS datastore.
• Lazy file clone. Allows VMware View to offload creation of linked clones to a NAS array.
• Extended file statistics. Enables storage arrays to accurately report space utilization.
With NAS storage devices, the hardware acceleration integration is implemented through
vendor-specific NAS plug-ins. These plug-ins are typically created by vendors and are
distributed as VIB packages through a web page. No claim rules are required for the NAS
plug-ins to function.
There are several tools available for installing and upgrading VIB packages. They include the
esxcli commands and vSphere Update Manager. For more information, see the vSphere
Upgrade and Installing and Administering VMware vSphere Update Manager
documentation.
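For example, installing a vendor NAS plug-in with esxcli typically looks like the following. This is a sketch only; the datastore path and bundle name are placeholders, the actual VIB comes from your array vendor, and a reboot or maintenance-mode window may be required depending on the package:
# Install the vendor-supplied NAS VAAI plug-in from an offline bundle (path and file name are placeholders)
esxcli software vib install -d /vmfs/volumes/datastore1/VendorNasPlugin-offline-bundle.zip
# Confirm that the plug-in is installed
esxcli software vib list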
5.31 Hardware Acceleration Considerations
When you use the hardware acceleration functionality, certain considerations apply.
Several reasons might cause a hardware-accelerated operation to fail.
For any primitive that the array does not implement, the array returns an error. The error
triggers the ESXi host to attempt the operation using its native methods.
The VMFS data mover does not leverage hardware offloads and instead uses software data
movement when one of the following occurs:
• The source and destination VMFS datastores have different block sizes.
• The source file type is RDM and the destination file type is non-RDM (regular file).
• The source VMDK type is eagerzeroedthick and the destination VMDK type is thin.
• The source or destination VMDK is in sparse or hosted format.
• The source virtual machine has a snapshot.
• The logical address and transfer length in the requested operation are not aligned to the minimum alignment required by the storage device. All datastores created with the vSphere Client or the vSphere Web Client are aligned automatically.
• The VMFS has multiple LUNs or extents, and they are on different arrays.
Hardware cloning between arrays, even within the same VMFS datastore, does not work.
5.32 Booting ESXi with Software FCoE
ESXi supports boot from FCoE capable network adapters.
When you install and boot ESXi from an FCoE LUN, the host can use a VMware software
FCoE adapter and a network adapter with FCoE capabilities. The host does not require a
dedicated FCoE HBA.
You perform most configurations through the option ROM of your network adapter. The
network adapters must support one of the following formats, which communicate parameters
about an FCoE boot device to VMkernel.
• FCoE Boot Firmware Table (FBFT). FBFT is Intel proprietary.
• FCoE Boot Parameter Table (FBPT). FBPT is defined by VMware for third-party vendors to implement software FCoE boot.
The configuration parameters are set in the option ROM of your adapter. During an ESXi
installation or a subsequent boot, these parameters are exported into system memory in either
FBFT format or FBPT format. The VMkernel can read the configuration settings and use
them to access the boot LUN.
This chapter includes the following topics:
• Requirements and Considerations for Software FCoE Boot
• Best Practices for Software FCoE Boot
• Set Up Software FCoE Boot
• Troubleshooting Installation and Boot from Software FCoE
5.33 Requirements and Considerations for Software FCoE Boot
When you boot the ESXi host from SAN using software FCoE, certain requirements and
considerations apply.
5.33.1 Requirements
• ESXi 5.1.
• The network adapter must have the following capabilities:
  o Be FCoE capable.
  o Support ESXi 5.x open FCoE stack.
  o Contain FCoE boot firmware which can export boot information in FBFT format or FBPT format.
5.33.2 Considerations
• You cannot change software FCoE boot configuration from within ESXi.
• Coredump is not supported on any software FCoE LUNs, including the boot LUN.
• Multipathing is not supported at pre-boot.
• Boot LUN cannot be shared with other hosts even on shared storage.
5.34 Best Practices for Software FCoE Boot
VMware recommends several best practices when you boot your system from a software
FCoE LUN.
• Make sure that the host has access to the entire boot LUN. The boot LUN cannot be shared with other hosts even on shared storage.
• If you use Intel 10 Gigabit Ethernet Controller (Niantec) with a Cisco switch, configure the switch port in the following way:
  o Enable the Spanning Tree Protocol (STP).
  o Turn off switchport trunk native vlan for the VLAN used for FCoE.
6 vSphere Resource Management
6.1 Configuring Resource Allocation Settings
6.1.1 Resource Allocation Shares
Shares specify the relative importance of a virtual machine (or resource pool). If a virtual
machine has twice as many shares of a resource as another virtual machine, it is entitled to
consume twice as much of that resource when these two virtual machines are competing for
resources.
Shares are typically specified as High, Normal, or Low and these values specify share values
with a 4:2:1 ratio, respectively. You can also select Custom to assign a specific number of
shares (which expresses a proportional weight) to each virtual machine.
Specifying shares makes sense only with regard to sibling virtual machines or resource pools,
that is, virtual machines or resource pools with the same parent in the resource pool
hierarchy. Siblings share resources according to their relative share values, bounded by the
reservation and limit. When you assign shares to a virtual machine, you always specify the
priority for that virtual machine relative to other powered-on virtual machines.
The following table shows the default CPU and memory share values for a virtual machine.
For resource pools, the default CPU and memory share values are the same, but must be
multiplied as if the resource pool were a virtual machine with four virtual CPUs and 16 GB of
memory.
Share Values
Setting  CPU share values              Memory share values
High     2000 shares per virtual CPU   20 shares per megabyte of configured virtual machine memory
Normal   1000 shares per virtual CPU   10 shares per megabyte of configured virtual machine memory
Low      500 shares per virtual CPU    5 shares per megabyte of configured virtual machine memory
For example, an SMP virtual machine with two virtual CPUs and 1GB RAM with CPU and
memory shares set to Normal has 2x1000=2000 shares of CPU and 10x1024=10240 shares
of memory.
Note
Virtual machines with more than one virtual CPU are called SMP (symmetric
multiprocessing) virtual machines. ESXi supports up to 64 virtual CPUs per virtual machine.
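The same share settings can also be expressed directly in a virtual machine's .vmx configuration file. A minimal sketch, assuming the sched.cpu.shares and sched.mem.shares options (which accept low, normal, high, or a number) are appropriate to edit in your environment; normally you would set these through the client instead:
sched.cpu.shares = "high"
sched.mem.shares = "high"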
The relative priority represented by each share changes when a new virtual machine is
powered on. This affects all virtual machines in the same resource pool. All of the virtual
machines have the same number of virtual CPUs. Consider the following examples.
• Two CPU-bound virtual machines run on a host with 8GHz of aggregate CPU capacity. Their CPU shares are set to Normal and get 4GHz each.
• A third CPU-bound virtual machine is powered on. Its CPU shares value is set to High, which means it should have twice as many shares as the machines set to Normal. The new virtual machine receives 4GHz and the two other machines get only 2GHz each. The same result occurs if the user specifies a custom share value of 2000 for the third virtual machine.
6.1.2 Resource Allocation Reservation
A reservation specifies the guaranteed minimum allocation for a virtual machine.
vCenter Server or ESXi allows you to power on a virtual machine only if there are enough
unreserved resources to satisfy the reservation of the virtual machine. The server guarantees
that amount even when the physical server is heavily loaded. The reservation is expressed in
concrete units (megahertz or megabytes).
For example, assume you have 2GHz available and specify a reservation of 1GHz for VM1
and 1GHz for VM2. Now each virtual machine is guaranteed to get 1GHz if it needs it.
However, if VM1 is using only 500MHz, VM2 can use 1.5GHz.
Reservation defaults to 0. You can specify a reservation if you need to guarantee that the
minimum required amounts of CPU or memory are always available for the virtual machine.
6.1.3 Resource Allocation Limit
Limit specifies an upper bound for CPU, memory, or storage I/O resources that can be
allocated to a virtual machine.
A server can allocate more than the reservation to a virtual machine, but never allocates
more than the limit, even if there are unused resources on the system. The limit is expressed
in concrete units (megahertz, megabytes, or I/O operations per second).
CPU, memory, and storage I/O resource limits default to unlimited. When the memory limit
is unlimited, the amount of memory configured for the virtual machine when it was created
becomes its effective limit.
In most cases, it is not necessary to specify a limit. There are benefits and drawbacks:
• Benefits — Assigning a limit is useful if you start with a small number of virtual machines and want to manage user expectations. Performance deteriorates as you add more virtual machines. You can simulate having fewer resources available by specifying a limit.
• Drawbacks — You might waste idle resources if you specify a limit. The system does not allow virtual machines to use more resources than the limit, even when the system is underutilized and idle resources are available. Specify the limit only if you have good reasons for doing so.
6.1.4 Resource Allocation Settings Suggestions
Select resource allocation settings (shares, reservation, and limit) that are appropriate for
your ESXi environment.
The following guidelines can help you achieve better performance for your virtual machines.
• If you expect frequent changes to the total available resources, use Shares to allocate resources fairly across virtual machines. If you use Shares, and you upgrade the host, for example, each virtual machine stays at the same priority (keeps the same number of shares) even though each share represents a larger amount of memory, CPU, or storage I/O resources.
• Use Reservation to specify the minimum acceptable amount of CPU or memory, not the amount you want to have available. The host assigns additional resources as available based on the number of shares, estimated demand, and the limit for your virtual machine. The amount of concrete resources represented by a reservation does not change when you change the environment, such as by adding or removing virtual machines.
• When specifying the reservations for virtual machines, do not commit all resources (plan to leave at least 10% unreserved). As you move closer to fully reserving all capacity in the system, it becomes increasingly difficult to make changes to reservations and to the resource pool hierarchy without violating admission control. In a DRS-enabled cluster, reservations that fully commit the capacity of the cluster or of individual hosts in the cluster can prevent DRS from migrating virtual machines between hosts.
• Edit the CPU Resources.
Option      Description
Shares      CPU shares for this resource pool with respect to the parent’s total. Sibling resource pools share resources according to their relative share values bounded by the reservation and limit. Select Low, Normal, or High, which specify share values respectively in a 1:2:4 ratio. Select Custom to give each virtual machine a specific number of shares, which expresses a proportional weight.
Reservation Guaranteed CPU allocation for this resource pool. Select Expandable Reservation to specify that more than the specified reservation is allocated if resources are available in a parent.
Limit       Upper limit for this resource pool’s CPU allocation. Select Unlimited to specify no upper limit.
• Edit the Memory Resources.
Option      Description
Shares      Memory shares for this resource pool with respect to the parent’s total. Sibling resource pools share resources according to their relative share values bounded by the reservation and limit. Select Low, Normal, or High, which specify share values respectively in a 1:2:4 ratio. Select Custom to give each virtual machine a specific number of shares, which expresses a proportional weight.
Reservation Guaranteed memory allocation for this resource pool. Select Expandable Reservation to specify that more than the specified reservation is allocated if resources are available in a parent.
Limit       Upper limit for this resource pool’s memory allocation. Select Unlimited to specify no upper limit.
6.1.5 Changing Resource Allocation Settings—Example
The following example illustrates how you can change resource allocation settings to
improve virtual machine performance.
Assume that on an ESXi host, you have created two new virtual machines—one each for your
QA (VM-QA) and Marketing (VM-Marketing) departments.
Single Host with Two Virtual Machines
In the following example, assume that VM-QA is memory intensive and accordingly you
want to change the resource allocation settings for the two virtual machines to:


Specify that, when system memory is overcommitted, VM-QA can use twice as much
memory and CPU as the Marketing virtual machine. Set the memory shares and CPU
shares for VM-QA to High and for VM-Marketing set them to Normal.
Ensure that the Marketing virtual machine has a certain amount of guaranteed CPU
resources. You can do so using a reservation setting.
Procedure
1. Start the vSphere Client and connect to a vCenter Server system.
2. Right-click VM-QA, the virtual machine for which you want to change shares, and select Edit Settings.
3. Select the Resources tab, and in the CPU panel, select High from the Shares drop-down menu.
4. In the Memory panel, select High from the Shares drop-down menu.
5. Click OK.
6. Right-click the marketing virtual machine (VM-Marketing) and select Edit Settings.
7. In the CPU panel, change the Reservation value to the desired number.
8. Click OK.
If you select the cluster’s Resource Allocation tab and click CPU, you should see that shares
for VM-QA are twice that of the other virtual machine. Also, because the virtual machines
have not been powered on, the Reservation Used fields have not changed.
6.1.6 Admission Control
When you power on a virtual machine, the system checks the amount of CPU and memory
resources that have not yet been reserved. Based on the available unreserved resources, the
system determines whether it can guarantee the reservation for which the virtual machine is
configured (if any). This process is called admission control.
If enough unreserved CPU and memory are available, or if there is no reservation, the virtual
machine is powered on. Otherwise, an Insufficient Resources warning appears.
Note
In addition to the user-specified memory reservation, for each virtual machine there is also
an amount of overhead memory. This extra memory commitment is included in the
admission control calculation.
When the vSphere DPM feature is enabled, hosts might be placed in standby mode (that is,
powered off) to reduce power consumption. The unreserved resources provided by these
hosts are considered available for admission control. If a virtual machine cannot be powered
on without these resources, a recommendation to power on sufficient standby hosts is made.
6.1.7 CPU Virtualization Basics
CPU virtualization emphasizes performance and runs directly on the processor whenever
possible. The underlying physical resources are used whenever possible and the
virtualization layer runs instructions only as needed to make virtual machines operate as if
they were running directly on a physical machine.
CPU virtualization is not the same thing as emulation. ESXi does not use emulation to run
virtual CPUs. With emulation, all operations are run in software by an emulator. A software
emulator allows programs to run on a computer system other than the one for which they
were originally written. The emulator does this by emulating, or reproducing, the original
computer’s behavior by accepting the same data or inputs and achieving the same results.
Emulation provides portability and runs software designed for one platform across several
platforms.
When CPU resources are overcommitted, the ESXi host time-slices the physical processors
across all virtual machines so each virtual machine runs as if it has its specified number of
virtual processors. When an ESXi host runs multiple virtual machines, it allocates to each
virtual machine a share of the physical resources. With the default resource allocation
settings, all virtual machines associated with the same host receive an equal share of CPU per
virtual CPU. This means that a single-processor virtual machine is assigned only half of the
resources of a dual-processor virtual machine.
This chapter includes the following topics:
• Software-Based CPU Virtualization
• Hardware-Assisted CPU Virtualization
• Virtualization and Processor-Specific Behavior
• Performance Implications of CPU Virtualization
6.1.8 Software-Based CPU Virtualization
With software-based CPU virtualization, the guest application code runs directly on the
processor, while the guest privileged code is translated and the translated code executes on
the processor.
The translated code is slightly larger and usually executes more slowly than the native
version. As a result, guest programs, which have a small privileged code component, run with
speeds very close to native. Programs with a significant privileged code component, such as
system calls, traps, or page table updates can run slower in the virtualized environment.
Hardware-Assisted CPU Virtualization
Certain processors provide hardware assistance for CPU virtualization.
When using this assistance, the guest can use a separate mode of execution called guest
mode. The guest code, whether application code or privileged code, runs in the guest mode.
On certain events, the processor exits out of guest mode and enters root mode. The
hypervisor executes in the root mode, determines the reason for the exit, takes any required
actions, and restarts the guest in guest mode.
When you use hardware assistance for virtualization, there is no need to translate the code.
As a result, system calls or trap-intensive workloads run very close to native speed. Some
workloads, such as those involving updates to page tables, lead to a large number of exits
from guest mode to root mode. Depending on the number of such exits and total time spent
in exits, hardware-assisted CPU virtualization can speed up execution significantly.
6.1.9 Virtualization and Processor-Specific Behavior
Although VMware software virtualizes the CPU, the virtual machine detects the specific
model of the processor on which it is running.
Processor models might differ in the CPU features they offer, and applications running in the
virtual machine can make use of these features. Therefore, it is not possible to use vMotion®
to migrate virtual machines between systems running on processors with different feature
sets. You can avoid this restriction, in some cases, by using Enhanced vMotion Compatibility
(EVC) with processors that support this feature. See the vCenter Server and Host
Management documentation for more information.
6.1.10 Performance Implications of CPU Virtualization
CPU virtualization adds varying amounts of overhead depending on the workload and the
type of virtualization used.
An application is CPU-bound if it spends most of its time executing instructions rather than
waiting for external events such as user interaction, device input, or data retrieval. For such
applications, the CPU virtualization overhead includes the additional instructions that must
be executed. This overhead takes CPU processing time that the application itself can use.
CPU virtualization overhead usually translates into a reduction in overall performance.
For applications that are not CPU-bound, CPU virtualization likely translates into an
increase in CPU use. If spare CPU capacity is available to absorb the overhead, it can still
deliver comparable performance in terms of overall throughput.
ESXi supports up to 64 virtual processors (CPUs) for each virtual machine.
Note
Deploy single-threaded applications on uniprocessor virtual machines, instead of on SMP
virtual machines that have multiple CPUs, for the best performance and resource use.
Single-threaded applications can take advantage only of a single CPU. Deploying such
applications in dual-processor virtual machines does not speed up the application. Instead, it
causes the second virtual CPU to use physical resources that other virtual machines could
otherwise use.
6.1.11 Specifying CPU Configuration
You can specify CPU configuration to improve resource management. However, if you do not
customize CPU configuration, the ESXi host uses defaults that work well in most situations.
You can specify CPU configuration in the following ways:
• Use the attributes and special features available through the vSphere Client. The vSphere Client graphical user interface (GUI) allows you to connect to the ESXi host or a vCenter Server system.
• Use advanced settings under certain circumstances.
• Use the vSphere SDK for scripted CPU allocation.
• Use hyperthreading.
6.1.12 Multicore Processors
Multicore processors provide many advantages for a host performing multitasking of virtual
machines.
Intel and AMD have each developed processors which combine two or more processor cores
into a single integrated circuit (often called a package or socket). VMware uses the term
socket to describe a single package which can have one or more processor cores with one or
more logical processors in each core.
A dual-core processor, for example, can provide almost double the performance of a single-core processor, by allowing two virtual CPUs to execute at the same time. Cores within the
same processor are typically configured with a shared last-level cache used by all cores,
potentially reducing the need to access slower main memory. A shared memory bus that
connects a physical processor to main memory can limit performance of its logical
processors if the virtual machines running on them are running memory-intensive
workloads which compete for the same memory bus resources.
Each logical processor of each processor core can be used independently by the ESXi CPU
scheduler to execute virtual machines, providing capabilities similar to SMP systems. For
example, a two-way virtual machine can have its virtual processors running on logical
processors that belong to the same core, or on logical processors on different physical cores.
The ESXi CPU scheduler can detect the processor topology and the relationships between
processor cores and the logical processors on them. It uses this information to schedule
virtual machines and optimize performance.
The ESXi CPU scheduler can interpret processor topology, including the relationship
between sockets, cores, and logical processors. The scheduler uses topology information to
optimize the placement of virtual CPUs onto different sockets to maximize overall cache
utilization, and to improve cache affinity by minimizing virtual CPU migrations.
In undercommitted systems, the ESXi CPU scheduler spreads load across all sockets by
default. This improves performance by maximizing the aggregate amount of cache available
to the running virtual CPUs. As a result, the virtual CPUs of a single SMP virtual machine are
spread across multiple sockets (unless each socket is also a NUMA node, in which case the
NUMA scheduler restricts all the virtual CPUs of the virtual machine to reside on the same
socket.)
In some cases, such as when an SMP virtual machine exhibits significant data sharing
between its virtual CPUs, this default behavior might be sub-optimal. For such workloads, it
can be beneficial to schedule all of the virtual CPUs on the same socket, with a shared last-level cache, even when the ESXi host is undercommitted. In such scenarios, you can override
the default behavior of spreading virtual CPUs across packages by including the following
configuration option in the virtual machine's .vmx configuration file:
sched.cpu.vsmpConsolidate="TRUE".
6.1.13 Hyperthreading
Hyperthreading technology allows a single physical processor core to behave like two logical
processors. The processor can run two independent applications at the same time. To avoid
confusion between logical and physical processors, Intel refers to a physical processor as a
socket, and the discussion in this chapter uses that terminology as well.
Intel Corporation developed hyperthreading technology to enhance the performance of its
Pentium IV and Xeon processor lines. Hyperthreading technology allows a single processor
core to execute two independent threads simultaneously.
While hyperthreading does not double the performance of a system, it can increase
performance by better utilizing idle resources leading to greater throughput for certain
important workload types. An application running on one logical processor of a busy core
can expect slightly more than half of the throughput that it obtains while running alone on a
non-hyperthreaded processor. Hyperthreading performance improvements are highly
application-dependent, and some applications might see performance degradation with
hyperthreading because many processor resources (such as the cache) are shared between
logical processors.
Note
On processors with Intel Hyper-Threading technology, each core can have two logical
processors which share most of the core's resources, such as memory caches and functional
units. Such logical processors are usually called threads.
Many processors do not support hyperthreading and as a result have only one thread per
core. For such processors, the number of cores also matches the number of logical
processors. The following processors support hyperthreading and have two threads per core.
• Processors based on the Intel Xeon 5500 processor microarchitecture.
• Intel Pentium 4 (HT-enabled)
• Intel Pentium EE 840 (HT-enabled)
6.1.14 Hyperthreading and ESXi Hosts
A host that is enabled for hyperthreading should behave similarly to a host without
hyperthreading. You might need to consider certain factors if you enable hyperthreading,
however.
ESXi hosts manage processor time intelligently to guarantee that load is spread smoothly
across processor cores in the system. Logical processors on the same core have consecutive
CPU numbers, so that CPUs 0 and 1 are on the first core together, CPUs 2 and 3 are on the
second core, and so on. Virtual machines are preferentially scheduled on two different cores
rather than on two logical processors on the same core.
If there is no work for a logical processor, it is put into a halted state, which frees its
execution resources and allows the virtual machine running on the other logical processor on
the same core to use the full execution resources of the core. The VMware scheduler properly
accounts for this halt time, and charges a virtual machine running with the full resources of a
core more than a virtual machine running on a half core. This approach to processor
management ensures that the server does not violate any of the standard ESXi resource
allocation rules.
Consider your resource management needs before you enable CPU affinity on hosts using
hyperthreading. For example, if you bind a high priority virtual machine to CPU 0 and
another high priority virtual machine to CPU 1, the two virtual machines have to share the
same physical core. In this case, it can be impossible to meet the resource demands of these
virtual machines. Ensure that any custom affinity settings make sense for a hyperthreaded
system.
6.1.15 Hyperthreaded Core Sharing Options
You can set the hyperthreaded core sharing mode for a virtual machine using the vSphere
Client.
Hyperthreaded Core Sharing Modes
Option   Description
Any      The default for all virtual machines on a hyperthreaded system. The virtual CPUs of a virtual machine with this setting can freely share cores with other virtual CPUs from this or any other virtual machine at any time.
None     Virtual CPUs of a virtual machine should not share cores with each other or with virtual CPUs from other virtual machines. That is, each virtual CPU from this virtual machine should always get a whole core to itself, with the other logical CPU on that core being placed into the halted state.
Internal This option is similar to none. Virtual CPUs from this virtual machine cannot share cores with virtual CPUs from other virtual machines. They can share cores with the other virtual CPUs from the same virtual machine. You can select this option only for SMP virtual machines. If applied to a uniprocessor virtual machine, the system changes this option to none.
These options have no effect on fairness or CPU time allocation. Regardless of a virtual
machine’s hyperthreading settings, it still receives CPU time proportional to its CPU shares,
and constrained by its CPU reservation and CPU limit values.
For typical workloads, custom hyperthreading settings should not be necessary. The options
can help in case of unusual workloads that interact badly with hyperthreading. For example,
an application with cache thrashing problems might slow down an application sharing its
physical core. You can place the virtual machine running the application in the none or
internal hyperthreading status to isolate it from other virtual machines.
If a virtual CPU has hyperthreading constraints that do not allow it to share a core with
another virtual CPU, the system might deschedule it when other virtual CPUs are entitled to
consume processor time. Without the hyperthreading constraints, you can schedule both
virtual CPUs on the same core.
The problem becomes worse on systems with a limited number of cores (per virtual
machine). In such cases, there might be no core to which the virtual machine that is
descheduled can be migrated. As a result, virtual machines with hyperthreading set to none or
internal can experience performance degradation, especially on systems with a limited
number of cores.
6.1.16 Quarantining
In certain rare circumstances, ESXi might detect that an application is interacting badly with
the Pentium IV hyperthreading technology. (This does not apply to systems based on the
Intel Xeon 5500 processor microarchitecture.) In such cases, quarantining, which is
transparent to the user, might be necessary.
For example, certain types of self-modifying code can disrupt the normal behavior of the
Pentium IV trace cache and can lead to substantial slowdowns (up to 90 percent) for an
application sharing a core with the problematic code. In those cases, the ESXi host
quarantines the virtual CPU running this code and places its virtual machine in the none or
internal mode, as appropriate.
6.1.17 Using CPU Affinity
By specifying a CPU affinity setting for each virtual machine, you can restrict the assignment
of virtual machines to a subset of the available processors in multiprocessor systems. By
using this feature, you can assign each virtual machine to processors in the specified affinity
set.
CPU affinity specifies virtual machine-to-processor placement constraints and is different
from the relationship created by a VM-VM or VM-Host affinity rule, which specifies virtual
machine-to-virtual machine host placement constraints.
In this context, the term CPU refers to a logical processor on a hyperthreaded system and
refers to a core on a non-hyperthreaded system.
The CPU affinity setting for a virtual machine applies to all of the virtual CPUs associated
with the virtual machine and to all other threads (also known as worlds) associated with the
virtual machine. Such virtual machine threads perform processing required for emulating
mouse, keyboard, screen, CD-ROM, and miscellaneous legacy devices.
In some cases, such as display-intensive workloads, significant communication might occur
between the virtual CPUs and these other virtual machine threads. Performance might
degrade if the virtual machine's affinity setting prevents these additional threads from being
scheduled concurrently with the virtual machine's virtual CPUs. Examples of this include a
uniprocessor virtual machine with affinity to a single CPU or a two-way SMP virtual machine
with affinity to only two CPUs.
For the best performance, when you use manual affinity settings, VMware recommends that
you include at least one additional physical CPU in the affinity setting to allow at least one of
the virtual machine's threads to be scheduled at the same time as its virtual CPUs. Examples
of this include a uniprocessor virtual machine with affinity to at least two CPUs or a two-way
SMP virtual machine with affinity to at least three CPUs.
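As an illustration of the guidance above, a two-way SMP virtual machine pinned with CPU affinity would be given at least three logical CPUs. A minimal sketch using the sched.cpu.affinity virtual machine option; the CPU numbers 2,3,4 are placeholders, and the same setting is normally made through the client:
sched.cpu.affinity = "2,3,4"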
6.1.18 Potential Issues with CPU Affinity
Before you use CPU affinity, you might need to consider certain issues.
Potential issues with CPU affinity include:
• For multiprocessor systems, ESXi systems perform automatic load balancing. Avoid manual specification of virtual machine affinity to improve the scheduler’s ability to balance load across processors.
• Affinity can interfere with the ESXi host’s ability to meet the reservation and shares specified for a virtual machine.
• Because CPU admission control does not consider affinity, a virtual machine with manual affinity settings might not always receive its full reservation. Virtual machines that do not have manual affinity settings are not adversely affected by virtual machines with manual affinity settings.
• When you move a virtual machine from one host to another, affinity might no longer apply because the new host might have a different number of processors.
• The NUMA scheduler might not be able to manage a virtual machine that is already assigned to certain processors using affinity.
• Affinity can affect the host's ability to schedule virtual machines on multicore or hyperthreaded processors to take full advantage of resources shared on such processors.
6.1.19 Host Power Management Policies
ESXi can take advantage of several power management features that the host hardware
provides to adjust the trade-off between performance and power use. You can control how
ESXi uses these features by selecting a power management policy.
In general, selecting a high-performance policy provides more absolute performance, but at
lower efficiency (performance per watt). Lower-power policies provide less absolute
performance, but at higher efficiency.
ESXi provides five power management policies. If the host does not support power
management, or if the BIOS settings specify that the host operating system is not allowed to
manage power, only the Not Supported policy is available.
You select a policy for a host using the vSphere Client. If you do not select a policy, ESXi
uses Balanced by default.
CPU Power Management Policies
Power Management Policy  Description
Not supported            The host does not support any power management features or power management is not enabled in the BIOS.
High Performance         The VMkernel detects certain power management features, but will not use them unless the BIOS requests them for power capping or thermal events.
Balanced (Default)       The VMkernel uses the available power management features conservatively to reduce host energy consumption with minimal compromise to performance.
Low Power                The VMkernel aggressively uses available power management features to reduce host energy consumption at the risk of lower performance.
Custom                   The VMkernel bases its power management policy on the values of several advanced configuration parameters. You can set these parameters in the vSphere Client Advanced Settings dialog box.
When a CPU runs at lower frequency, it can also run at lower voltage, which saves power.
This type of power management is typically called Dynamic Voltage and Frequency Scaling
(DVFS). ESXi attempts to adjust CPU frequencies so that virtual machine performance is not
affected.
When a CPU is idle, ESXi can take advantage of deep halt states (known as C-states). The
deeper the C-state, the less power the CPU uses, but the longer it takes for the CPU to
resume running. When a CPU becomes idle, ESXi applies an algorithm to predict how long it
will be in an idle state and chooses an appropriate C-state to enter. In power management
policies that do not use deep C-states, ESXi uses only the shallowest halt state (C1) for idle
CPUs.
6.1.20 Select a CPU Power Management Policy
You set the CPU power management policy for a host using the vSphere Client.
Prerequisites
Verify that the BIOS settings on the host system allow the operating system to control power
management (for example, OS Controlled).
Note
Some systems have Processor Clocking Control (PCC) technology, which allows ESXi to
manage power on the host system even if the host BIOS settings do not specify OS Controlled
mode. With this technology, ESXi does not manage P-states directly. Instead, the host
cooperates with the BIOS to determine the processor clock rate. HP systems that support
this technology have a BIOS setting called Cooperative Power Management that is enabled
by default.
6.2 Memory Virtualization Basics
Before you manage memory resources, you should understand how they are being virtualized
and used by ESXi.
The VMkernel manages all machine memory. The VMkernel dedicates part of this managed
machine memory for its own use. The rest is available for use by virtual machines. Virtual
machines use machine memory for two purposes: each virtual machine requires its own
memory and the virtual machine monitor (VMM) requires some memory and a dynamic
overhead memory for its code and data.
The virtual and physical memory space is divided into blocks called pages. When physical
memory is full, the data for virtual pages that are not present in physical memory are stored
on disk. Depending on processor architecture, pages are typically 4 KB or 2 MB. See
Advanced Memory Attributes.
6.2.1 Virtual Machine Memory
Each virtual machine consumes memory based on its configured size, plus additional
overhead memory for virtualization.
The configured size is a construct maintained by the virtualization layer for the virtual
machine. It is the amount of memory that is presented to the guest operating system, but it is
independent of the amount of physical RAM that is allocated to the virtual machine, which
depends on the resource settings (shares, reservation, limit) explained below.
For example, consider a virtual machine with a configured size of 1GB. When the guest
operating system boots, it detects that it is running on a dedicated machine with 1GB of
physical memory. The actual amount of physical host memory allocated to the virtual
machine depends on its memory resource settings and memory contention on the ESXi host.
In some cases, the virtual machine might be allocated the full 1GB. In other cases, it might
receive a smaller allocation. Regardless of the actual allocation, the guest operating system
continues to behave as though it is running on a dedicated machine with 1GB of physical
memory.
Shares      Specify the relative priority for a virtual machine if more than the reservation is available.
Reservation Is a guaranteed lower bound on the amount of physical memory that the host reserves for the virtual machine, even when memory is overcommitted. Set the reservation to a level that ensures the virtual machine has sufficient memory to run efficiently, without excessive paging.
            After a virtual machine has accessed its full reservation, it is allowed to retain that amount of memory and this memory is not reclaimed, even if the virtual machine becomes idle. For example, some guest operating systems (for example, Linux) might not access all of the configured memory immediately after booting. Until the virtual machine accesses its full reservation, VMkernel can allocate any unused portion of its reservation to other virtual machines. However, after the guest’s workload increases and it consumes its full reservation, it is allowed to keep this memory.
Limit       Is an upper bound on the amount of physical memory that the host can allocate to the virtual machine. The virtual machine’s memory allocation is also implicitly limited by its configured size.
Overhead memory includes space reserved for the virtual machine frame buffer and various virtualization data structures.
6.2.2 Memory Overcommitment
For each running virtual machine, the system reserves physical memory for the virtual
machine’s reservation (if any) and for its virtualization overhead.
Because of the memory management techniques the ESXi host uses, your virtual machines
can use more memory than the physical machine (the host) has available. For example, you
can have a host with 2GB memory and run four virtual machines with 1GB memory each. In
that case, the memory is overcommitted.
Overcommitment makes sense because, typically, some virtual machines are lightly loaded
while others are more heavily loaded, and relative activity levels vary over time.
To improve memory utilization, the ESXi host transfers memory from idle virtual machines
to virtual machines that need more memory. Use the Reservation or Shares parameter to
preferentially allocate memory to important virtual machines. This memory remains
available to other virtual machines if it is not in use.
In addition, memory compression is enabled by default on ESXi hosts to improve virtual
machine performance when memory is overcommitted as described in Memory
Compression.
6.2.3 Memory Sharing
Many workloads present opportunities for sharing memory across virtual machines.
For example, several virtual machines might be running instances of the same guest
operating system, have the same applications or components loaded, or contain common
data. ESXi systems use a proprietary page-sharing technique to securely eliminate redundant
copies of memory pages.
With memory sharing, a workload consisting of multiple virtual machines often consumes
less memory than it would when running on physical machines. As a result, the system can
efficiently support higher levels of overcommitment.
The amount of memory saved by memory sharing depends on workload characteristics. A
workload of many nearly identical virtual machines might free up more than thirty percent of
memory, while a more diverse workload might result in savings of less than five percent of
memory.
6.2.4 Software-Based Memory Virtualization
ESXi virtualizes guest physical memory by adding an extra level of address translation.
• The VMM for each virtual machine maintains a mapping from the guest operating system's physical memory pages to the physical memory pages on the underlying machine. (VMware refers to the underlying host physical pages as “machine” pages and the guest operating system’s physical pages as “physical” pages.)
Each virtual machine sees a contiguous, zero-based, addressable physical memory space. The underlying machine memory on the server used by each virtual machine is not necessarily contiguous.
• The VMM intercepts virtual machine instructions that manipulate guest operating system memory management structures so that the actual memory management unit (MMU) on the processor is not updated directly by the virtual machine.
• The ESXi host maintains the virtual-to-machine page mappings in a shadow page table that is kept up to date with the physical-to-machine mappings (maintained by the VMM).
• The shadow page tables are used directly by the processor's paging hardware.
This approach to address translation allows normal memory accesses in the virtual machine
to execute without adding address translation overhead, after the shadow page tables are set
up. Because the translation look-aside buffer (TLB) on the processor caches direct virtual-to-machine mappings read from the shadow page tables, no additional overhead is added by the
VMM to access the memory.
6.2.5 Performance Considerations
The use of two sets of page tables has these performance implications.
• No overhead is incurred for regular guest memory accesses.
• Additional time is required to map memory within a virtual machine, which might mean:
  o The virtual machine operating system is setting up or updating virtual address to physical address mappings.
  o The virtual machine operating system is switching from one address space to another (context switch).
• Like CPU virtualization, memory virtualization overhead depends on workload.
6.2.6 Hardware-Assisted Memory Virtualization
Some CPUs, such as AMD SVM-V and the Intel Xeon 5500 series, provide hardware support
for memory virtualization by using two layers of page tables.
The first layer of page tables stores guest virtual-to-physical translations, while the second
layer of page tables stores guest physical-to-machine translation. The TLB (translation lookaside buffer) is a cache of translations maintained by the processor's memory management
unit (MMU) hardware. A TLB miss is a miss in this cache and the hardware needs to go to
memory (possibly many times) to find the required translation. For a TLB miss to a certain
guest virtual address, the hardware looks at both page tables to translate guest virtual
address to host physical address.
The diagram illustrates the ESXi implementation of memory virtualization.
ESXi Memory Mapping
• The boxes represent pages, and the arrows show the different memory mappings.
• The arrows from guest virtual memory to guest physical memory show the mapping maintained by the page tables in the guest operating system. (The mapping from virtual memory to linear memory for x86-architecture processors is not shown.)
• The arrows from guest physical memory to machine memory show the mapping maintained by the VMM.
• The dashed arrows show the mapping from guest virtual memory to machine memory in the shadow page tables also maintained by the VMM. The underlying processor running the virtual machine uses the shadow page table mappings.
Because of the extra level of memory mapping introduced by virtualization, ESXi can
effectively manage memory across all virtual machines. Some of the physical memory of a
virtual machine might be mapped to shared pages or to pages that are unmapped, or
swapped out.
A host performs virtual memory management without the knowledge of the guest operating
system and without interfering with the guest operating system’s own memory management
subsystem.
6.2.7 Performance Considerations
When you use hardware assistance, you eliminate the overhead for software memory
virtualization. In particular, hardware assistance eliminates the overhead required to keep
shadow page tables in synchronization with guest page tables. However, the TLB miss
latency when using hardware assistance is significantly higher. As a result, whether or not a
workload benefits by using hardware assistance primarily depends on the overhead the
memory virtualization causes when using software memory virtualization. If a workload
involves a small amount of page table activity (such as process creation, mapping the
memory, or context switches), software virtualization does not cause significant overhead.
Conversely, workloads with a large amount of page table activity are likely to benefit from
hardware assistance.
6.2.8 Overhead Memory on Virtual Machines
Virtual machines require a certain amount of available overhead memory to power on. You
should be aware of the amount of this overhead.
The following table lists the amount of overhead memory a virtual machine requires to
power on. After a virtual machine is running, the amount of overhead memory it uses might
differ from the amount listed in the table. The sample values were collected with VMX swap
enabled and hardware MMU enabled for the virtual machine. (VMX swap is enabled by
default.)
Note
The table provides a sample of overhead memory values and does not attempt to provide
information about all possible configurations. You can configure a virtual machine to have
up to 64 virtual CPUs, depending on the number of licensed CPUs on the host and the
number of CPUs that the guest operating system supports.
Sample Overhead Memory on Virtual Machines
Memory (MB)   1 VCPU    2 VCPUs   4 VCPUs   8 VCPUs
256           20.29     24.28     32.23     48.16
1024          25.90     29.91     37.86     53.82
4096          48.64     52.72     60.67     76.78
16384         139.62    143.98    151.93    168.60
6.2.9 How ESXi Hosts Allocate Memory
A host allocates the memory specified by the Limit parameter to each virtual machine, unless
memory is overcommitted. ESXi never allocates more memory to a virtual machine than its
specified physical memory size.
For example, a 1GB virtual machine might have the default limit (unlimited) or a user-specified limit (for example, 2GB). In both cases, the ESXi host never allocates more than 1GB, the physical memory size that was specified for it.
When memory is overcommitted, each virtual machine is allocated an amount of memory
somewhere between what is specified by Reservation and what is specified by Limit. The
amount of memory granted to a virtual machine above its reservation usually varies with the
current memory load.
A host determines allocations for each virtual machine based on the number of shares allocated to it and an estimate of its recent working set size.
- Shares: ESXi hosts use a modified proportional-share memory allocation policy. Memory shares entitle a virtual machine to a fraction of available physical memory.
- Working set size: ESXi hosts estimate the working set for a virtual machine by monitoring memory activity over successive periods of virtual machine execution time. Estimates are smoothed over several time periods using techniques that respond rapidly to increases in working set size and more slowly to decreases in working set size.
This approach ensures that a virtual machine from which idle memory is reclaimed can ramp
up quickly to its full share-based allocation when it starts using its memory more actively.
Memory activity is monitored to estimate the working set sizes for a default period of 60 seconds. To modify this default, adjust the Mem.SamplePeriod advanced setting. See Set Advanced Host Attributes.
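The same advanced setting can also be changed through the vSphere API. Below is a minimal pyVmomi sketch, not an official procedure; the vCenter address, credentials, and host name are placeholders, and certificate validation is disabled only to keep the example short.

```python
# Minimal pyVmomi sketch: adjust the Mem.SamplePeriod advanced option on a host.
# vCenter address, credentials, and host name below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab use only; validate certificates in production
si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                  pwd='********', sslContext=ctx)
content = si.RetrieveContent()

# Find the target host by name (assumed name).
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == 'esx01.example.com')
view.Destroy()

# Read the current value, then set a new sampling period (in seconds).
opt_mgr = host.configManager.advancedOption
print(opt_mgr.QueryOptions(name='Mem.SamplePeriod')[0].value)
opt_mgr.UpdateOptions(changedValue=[vim.option.OptionValue(key='Mem.SamplePeriod', value=60)])

Disconnect(si)
```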
6.2.10 VMX Swap Files
Virtual machine executable (VMX) swap files allow the host to greatly reduce the amount of
overhead memory reserved for the VMX process.
Note
VMX swap files are not related to the swap to host cache feature or to regular host-level swap
files.
ESXi reserves memory per virtual machine for a variety of purposes. Memory for the needs
of certain components, such as the virtual machine monitor (VMM) and virtual devices, is
fully reserved when a virtual machine is powered on. However, some of the overhead
memory that is reserved for the VMX process can be swapped. The VMX swap feature
reduces the VMX memory reservation significantly (for example, from about 50MB or more
per virtual machine to about 10MB per virtual machine). This allows the remaining memory
to be swapped out when host memory is overcommitted, reducing overhead memory
reservation for each virtual machine.
The host creates VMX swap files automatically, provided there is sufficient free disk space at
the time a virtual machine is powered on.
6.2.11 Memory Tax for Idle Virtual Machines
If a virtual machine is not actively using all of its currently allocated memory, ESXi charges
more for idle memory than for memory that is in use. This is done to help prevent virtual
machines from hoarding idle memory.
The idle memory tax is applied in a progressive fashion. The effective tax rate increases as
the ratio of idle memory to active memory for the virtual machine rises. (In earlier versions
of ESXi that did not support hierarchical resource pools, all idle memory for a virtual
machine was taxed equally.)
You can modify the idle memory tax rate with the Mem.IdleTax option. Use this option,
together with the Mem.SamplePeriod advanced attribute, to control how the system
determines target memory allocations for virtual machines. See Set Advanced Host
Attributes.
6.2.12 Using Swap Files
You can specify the location of your swap file, reserve swap space when memory is
overcommitted, and delete a swap file.
ESXi hosts use swapping to forcibly reclaim memory from a virtual machine when the vmmemctl driver is not available or is not responsive, for any of the following reasons:
- It was never installed.
- It is explicitly disabled.
- It is not running (for example, while the guest operating system is booting).
- It is temporarily unable to reclaim memory quickly enough to satisfy current system demands.
- It is functioning properly, but maximum balloon size is reached.
Standard demand-paging techniques swap pages back in when the virtual machine needs them.
6.2.13 Swap File Location
By default, the swap file is created in the same location as the virtual machine's configuration
file.
A swap file is created by the ESXi host when a virtual machine is powered on. If this file cannot be created, the virtual machine cannot power on. Instead of accepting the default, you can also:
- Use per-virtual machine configuration options to change the datastore to another shared storage location.
- Use host-local swap, which allows you to specify a datastore stored locally on the host. This allows you to swap at a per-host level, saving space on the SAN. However, it can lead to a slight degradation in performance for vSphere vMotion because pages swapped to a local swap file on the source host must be transferred across the network to the destination host.
6.2.14 Configure Virtual Machine Swapfile Properties for the Host
Configure a swapfile location for the host to determine the default location for virtual
machine swapfiles.
By default, swapfiles for a virtual machine are located on a VMFS3 datastore in the folder
that contains the other virtual machine files. However, you can configure your host to place
virtual machine swapfiles on an alternative datastore.
You can use this option to place virtual machine swapfiles on lower-cost or higher-performance storage. You can also override this host-level setting for individual virtual machines.
Setting an alternative swapfile location might cause migrations with vMotion to complete
more slowly. For best vMotion performance, store virtual machine swapfiles in the same
directory as the virtual machine.
If vCenter Server manages your host, you cannot change the swapfile location if you connect
directly to the host by using the vSphere Client. You must connect the vSphere Client to the
vCenter Server system.
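For the per-VM override mentioned above, the vSphere API exposes a swapPlacement setting. The following is a minimal pyVmomi sketch; it assumes a ServiceInstance `si` already connected as in the earlier example, and the VM name is a placeholder.

```python
# Minimal pyVmomi sketch: override swapfile placement for one virtual machine.
# Assumes an existing connection `si`; the VM name is a placeholder.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == 'app-vm-01')
view.Destroy()

# 'hostLocal' stores the swapfile in the host's swapfile datastore;
# 'vmDirectory' keeps it next to the .vmx file; 'inherit' follows the cluster/host setting.
spec = vim.vm.ConfigSpec(swapPlacement='hostLocal')
task = vm.ReconfigVM_Task(spec=spec)
```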
6.2.15 Swapping to Host Cache
Datastores that are created on solid state drives (SSD) can be used to allocate space for host
cache. The host reserves a certain amount of space for swapping to host cache.
The host cache is made up of files on a low-latency disk that ESXi uses as a write back cache
for virtual machine swap files. The cache is shared by all virtual machines running on the
host. Host-level swapping of virtual machine pages makes the best use of potentially limited
SSD space.
Using swap to host cache is not the same as placing regular swap files on SSD-backed
datastores. Even if you enable swap to host cache, the host still needs to create regular swap
files. However, when you use swap to host cache, the speed of the storage where the host
places regular swap files is less important.
The Host Cache Configuration page allows you to view the amount of space on a datastore
that a host can use to swap to host cache. Only SSD-backed datastores appear in the list of
datastores on the Host Cache Configuration page.
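Host cache can also be configured programmatically. The sketch below is an illustration only, assuming a connected `si`; the host and datastore names are placeholders, and the nested spec class name reflects the pyVmomi binding for the host cache configuration specification.

```python
# Minimal pyVmomi sketch: dedicate part of an SSD-backed datastore to host cache.
# Assumes a connected `si`; host and datastore names are placeholders.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == 'esx01.example.com')
view.Destroy()

ssd_ds = next(ds for ds in host.datastore if ds.name == 'local-ssd-01')

cache_mgr = host.configManager.cacheConfigurationManager
spec = vim.host.CacheConfigurationManager.CacheConfigurationSpec(
    datastore=ssd_ds,
    swapSize=40960)                      # space reserved for host cache, in MB
task = cache_mgr.ConfigureHostCache_Task(spec=spec)
```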
6.2.16 Sharing Memory Across Virtual Machines
Many ESXi workloads present opportunities for sharing memory across virtual machines (as
well as within a single virtual machine).
For example, several virtual machines might be running instances of the same guest
operating system, have the same applications or components loaded, or contain common
data. In such cases, a host uses a proprietary transparent page sharing technique to securely
eliminate redundant copies of memory pages. With memory sharing, a workload running in
virtual machines often consumes less memory than it would when running on physical
machines. As a result, higher levels of overcommitment can be supported efficiently.
Use the Mem.ShareScanTime and Mem.ShareScanGHz advanced settings to control the rate
at which the system scans memory to identify opportunities for sharing memory.
You can also disable sharing for individual virtual machines by setting the
sched.mem.pshare.enable option to FALSE (this option defaults to TRUE). See Set Advanced
Virtual Machine Attributes.
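As a hedged illustration of the per-VM option described above, the sketch below sets sched.mem.pshare.enable to FALSE through the virtual machine's extra configuration. It assumes a connected `si`; the VM name is a placeholder.

```python
# Minimal pyVmomi sketch: disable transparent page sharing for one VM by setting
# sched.mem.pshare.enable to FALSE. Assumes a connected `si`; the VM name is a placeholder.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == 'db-vm-01')
view.Destroy()

opt = vim.option.OptionValue(key='sched.mem.pshare.enable', value='FALSE')
task = vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(extraConfig=[opt]))
```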
ESXi memory sharing runs as a background activity that scans for sharing opportunities over
time. The amount of memory saved varies over time. For a fairly constant workload, the
amount generally increases slowly until all sharing opportunities are exploited.
To determine the effectiveness of memory sharing for a given workload, try running the
workload, and use resxtop or esxtop to observe the actual savings. Find the information
in the PSHARE field of the interactive mode in the Memory page.
6.2.17 Memory Compression
ESXi provides a memory compression cache to improve virtual machine performance when
you use memory overcommitment. Memory compression is enabled by default. When a
host's memory becomes overcommitted, ESXi compresses virtual pages and stores them in
memory.
Because accessing compressed memory is faster than accessing memory that is swapped to
disk, memory compression in ESXi allows you to overcommit memory without significantly
hindering performance. When a virtual page needs to be swapped, ESXi first attempts to
compress the page. Pages that can be compressed to 2 KB or smaller are stored in the virtual
machine's compression cache, increasing the capacity of the host.
You can set the maximum size for the compression cache and disable memory compression
using the Advanced Settings dialog box in the vSphere Client.
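The same controls are exposed as host advanced options. The sketch below is an assumption-laden illustration: it assumes a connected `si`, a placeholder host name, and uses the commonly documented option keys Mem.MemZipEnable and Mem.MemZipMaxPct, which may vary by ESXi release.

```python
# Minimal pyVmomi sketch: inspect and change memory compression advanced settings.
# Assumes a connected `si`; host name is a placeholder and the option keys are
# the commonly documented names, which may differ by release.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == 'esx01.example.com')
view.Destroy()

opt_mgr = host.configManager.advancedOption
for key in ('Mem.MemZipEnable', 'Mem.MemZipMaxPct'):
    print(key, opt_mgr.QueryOptions(name=key)[0].value)

# Example: cap the compression cache at 5 percent of virtual machine memory.
opt_mgr.UpdateOptions(changedValue=[vim.option.OptionValue(key='Mem.MemZipMaxPct', value=5)])
```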
6.2.18 Measuring and Differentiating Types of Memory Usage
The Performance tab of the vSphere Client displays a number of metrics that can be used to
analyze memory usage.
Some of these memory metrics measure guest physical memory while other metrics measure
machine memory. For instance, two types of memory usage that you can examine using
performance metrics are guest physical memory and machine memory. You measure guest
physical memory using the Memory Granted metric (for a virtual machine) or Memory
Shared (for a host). To measure machine memory, however, use Memory Consumed (for a
virtual machine) or Memory Shared Common (for a host). Understanding the conceptual
difference between these types of memory usage is important for knowing what these metrics
are measuring and how to interpret them.
The VMkernel maps guest physical memory to machine memory, but they are not always
mapped one-to-one. Multiple regions of guest physical memory might be mapped to the
same region of machine memory (in the case of memory sharing) or specific regions of guest
physical memory might not be mapped to machine memory (when the VMkernel swaps out
or balloons guest physical memory). In these situations, calculations of guest physical
memory usage and machine memory usage for an individual virtual machine or a host differ.
Consider the example in the following figure, which shows two virtual machines running on
a host. Each block represents 4 KB of memory and each color/letter represents a different set
of data on a block.
Memory Usage Example
The performance metrics for the virtual machines can be determined as follows:
- To determine Memory Granted (the amount of guest physical memory that is mapped to machine memory) for virtual machine 1, count the number of blocks in virtual machine 1's guest physical memory that have arrows to machine memory and multiply by 4 KB. Since there are five blocks with arrows, Memory Granted would be 20 KB.
- Memory Consumed is the amount of machine memory allocated to the virtual machine, accounting for savings from shared memory. First, count the number of blocks in machine memory that have arrows from virtual machine 1's guest physical memory. There are three such blocks, but one block is shared with virtual machine 2. So count two full blocks plus half of the third and multiply by 4 KB for a total of 10 KB Memory Consumed.
The important difference between these two metrics is that Memory Granted counts the
number of blocks with arrows at the guest physical memory level and Memory Consumed
counts the number of blocks with arrows at the machine memory level. The number of blocks
differs between the two levels due to memory sharing and so Memory Granted and Memory
Consumed differ. This is not problematic and shows that memory is being saved through
sharing or other reclamation techniques.
A similar result is obtained when determining Memory Shared and Memory Shared Common for the host.
- Memory Shared for the host is the sum of each virtual machine's Memory Shared. Calculate this by looking at each virtual machine's guest physical memory and counting the number of blocks that have arrows to machine memory blocks that themselves have more than one arrow pointing at them. There are six such blocks in the example, so Memory Shared for the host is 24 KB.
- Memory Shared Common is the amount of machine memory that is shared by virtual machines. To determine this, look at the machine memory and count the number of blocks that have more than one arrow pointing at them. There are three such blocks, so Memory Shared Common is 12 KB.
Memory Shared is concerned with guest physical memory and looks at the origin of the
arrows. Memory Shared Common, however, deals with machine memory and looks at the
destination of the arrows.
The memory metrics that measure guest physical memory and machine memory might
appear contradictory. In fact, they are measuring different aspects of a virtual machine's
memory usage. By understanding the differences between these metrics, you can better
utilize them to diagnose performance issues.
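To see these kinds of figures outside the Performance tab, the per-VM quick statistics can be read through the API. The sketch below assumes a connected `si`; quickStats values are reported in MB, hostMemoryUsage roughly corresponds to consumed machine memory and guestMemoryUsage to active guest memory, while Memory Granted itself is exposed through the performance counters rather than quickStats.

```python
# Minimal pyVmomi sketch: print per-VM memory quick statistics (all values in MB).
# Assumes a connected `si`.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    qs = vm.summary.quickStats
    print(f"{vm.name}: configured={vm.summary.config.memorySizeMB} MB, "
          f"consumed={qs.hostMemoryUsage} MB, active={qs.guestMemoryUsage} MB, "
          f"shared={qs.sharedMemory} MB, ballooned={qs.balloonedMemory} MB, "
          f"swapped={qs.swappedMemory} MB")
view.Destroy()
```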
6.3 Memory Reliability
Memory reliability, also known as error isolation, allows ESXi to stop using parts of memory when it determines that a failure might occur, or after a failure has occurred.
When enough corrected errors are reported at a particular address, ESXi stops using this address to prevent the corrected error from becoming an uncorrected error.
Memory reliability provides better VMkernel reliability despite corrected and uncorrected errors in RAM. It also enables the system to avoid using memory pages that might contain errors.
6.3.1 Correct an Error Isolation Notification
With memory reliability, VMkernel stops using pages that receive an error isolation
notification.
The user receives an event in the vSphere Client when VMkernel recovers from an uncorrectable memory error, when VMkernel retires a significant percentage of system memory due to a large number of correctable errors, or if there is a large number of pages that cannot be retired.
Procedure
1. Vacate the host.
2. Migrate the virtual machines.
3. Run tests.
6.4 Managing Storage I/O Resources
vSphere Storage I/O Control provides cluster-wide storage I/O prioritization, which allows better workload consolidation and helps reduce the extra costs associated with overprovisioning.
Storage I/O Control extends the constructs of shares and limits to handle storage I/O
resources. You can control the amount of storage I/O that is allocated to virtual machines
during periods of I/O congestion, which ensures that more important virtual machines get
preference over less important virtual machines for I/O resource allocation.
When you enable Storage I/O Control on a datastore, ESXi begins to monitor the device
latency that hosts observe when communicating with that datastore. When device latency
exceeds a threshold, the datastore is considered to be congested and each virtual machine
that accesses that datastore is allocated I/O resources in proportion to their shares. You set
shares per virtual machine. You can adjust the number for each based on need.
Configuring Storage I/O Control is a two-step process:
1. Enable Storage I/O Control for the datastore.
2. Set the number of storage I/O shares and upper limit of I/O operations per second
(IOPS) allowed for each virtual machine.
By default, all virtual machine shares are set to Normal (1000) with unlimited IOPS.
Note
Storage I/O Control is enabled by default on Storage DRS-enabled datastore clusters.
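Step 1 of this process can be scripted. The sketch below is a hedged pyVmomi illustration, assuming a connected `si` and a placeholder datastore name; it also shows where the congestion threshold from section 6.5 would be set, though the default is usually adequate.

```python
# Minimal pyVmomi sketch: enable Storage I/O Control on a datastore and
# (optionally) set the congestion threshold. Assumes a connected `si`;
# the datastore name is a placeholder.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datastore], True)
ds = next(d for d in view.view if d.name == 'shared-ds-01')
view.Destroy()

spec = vim.StorageResourceManager.IORMConfigSpec(
    enabled=True,
    congestionThreshold=30)     # milliseconds; adjust only if you understand the trade-offs
task = content.storageResourceManager.ConfigureDatastoreIORM_Task(datastore=ds, spec=spec)
```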
6.4.1 Storage I/O Control Resource Shares and Limits
You allocate the number of storage I/O shares and upper limit of I/O operations per second
(IOPS) allowed for each virtual machine. When storage I/O congestion is detected for a
datastore, the I/O workloads of the virtual machines accessing that datastore are adjusted
according to the proportion of virtual machine shares each virtual machine has.
Storage I/O shares are similar to those used for memory and CPU resource allocation, which
are described in Resource Allocation Shares. These shares represent the relative importance
of a virtual machine with regard to the distribution of storage I/O resources. Under resource
contention, virtual machines with higher share values have greater access to the storage
array, which typically results in higher throughput and lower latency.
When you allocate storage I/O resources, you can limit the IOPS that are allowed for a
virtual machine. By default, these are unlimited. If a virtual machine has more than one
virtual disk, you must set the limit on all of its virtual disks. Otherwise, the limit will not be
enforced for the virtual machine. In this case, the limit on the virtual machine is the
aggregation of the limits for all virtual disks.
The benefits and drawbacks of setting resource limits are described in Resource Allocation
Limit. If the limit you want to set for a virtual machine is in terms of MB per second instead
of IOPS, you can convert MB per second into IOPS based on the typical I/O size for that
virtual machine. For example, to restrict a backup application with 64KB IOs to 10MB per
second, set a limit of 160 IOPS.
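Step 2, setting shares and an IOPS limit per virtual disk, can be sketched as follows. This is an illustration only, assuming a connected `si` and a placeholder VM name; the arithmetic in the comment mirrors the 64KB/10MB-per-second example above (10 * 1024 / 64 = 160 IOPS).

```python
# Minimal pyVmomi sketch: set shares and an IOPS limit on every virtual disk of a VM.
# Assumes a connected `si`; the VM name is a placeholder.
# 10 MB/s at a typical 64 KB I/O size is 10 * 1024 / 64 = 160 IOPS.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == 'backup-vm-01')
view.Destroy()

changes = []
for dev in vm.config.hardware.device:
    if isinstance(dev, vim.vm.device.VirtualDisk):
        dev.storageIOAllocation = vim.StorageResourceManager.IOAllocationInfo(
            limit=160,                      # IOPS cap; set it on all disks of the VM
            shares=vim.SharesInfo(level=vim.SharesInfo.Level.normal,
                                  shares=1000))   # numeric value applies only with level 'custom'
        changes.append(vim.vm.device.VirtualDeviceSpec(
            operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=dev))

task = vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=changes))
```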
6.4.2 Storage I/O Control Requirements
Storage I/O Control has several requirements and limitations.
- Datastores that are Storage I/O Control-enabled must be managed by a single vCenter Server system.
- Storage I/O Control is supported on Fibre Channel-connected, iSCSI-connected, and NFS-connected storage. Raw Device Mapping (RDM) is not supported.
- Storage I/O Control does not support datastores with multiple extents.
- Before using Storage I/O Control on datastores that are backed by arrays with automated storage tiering capabilities, check the VMware Storage/SAN Compatibility Guide to verify whether your automated tiered storage array has been certified to be compatible with Storage I/O Control.
Automated storage tiering is the ability of an array (or group of arrays) to migrate
LUNs/volumes or parts of LUNs/volumes to different types of storage media (SSD, FC, SAS,
SATA) based on user-set policies and current I/O patterns. No special certification is
required for arrays that do not have these automatic migration/tiering features, including
those that provide the ability to manually migrate data between different types of storage
media.
6.5 Set Storage I/O Control Threshold Value
The congestion threshold value for a datastore is the upper limit of latency that is allowed for
a datastore before Storage I/O Control begins to assign importance to the virtual machine
workloads according to their shares.
You do not need to adjust the threshold setting in most environments.
Caution
Storage I/O Control will not function correctly unless all datastores that share the same spindles on the array have the same congestion threshold.
If you change the congestion threshold setting, set the value based on the following
considerations.
A higher value typically results in higher aggregate throughput and weaker isolation.
Throttling will not occur unless the overall average latency is higher than the threshold.
If throughput is more critical than latency, do not set the value too low. For example, for
Fibre Channel disks, a value below 20 ms could lower peak disk throughput. A very high
value (above 50 ms) might allow very high latency without any significant gain in overall
throughput.
A lower value will result in lower device latency and stronger virtual machine I/O
performance isolation. Stronger isolation means that the shares controls are enforced more
often. Lower device latency translates into lower I/O latency for the virtual machines with
the highest shares, at the cost of higher I/O latency experienced by the virtual machines with
fewer shares.
If latency is more important, a very low value (lower than 20 ms) will result in lower device
latency and better isolation among I/Os at the potential cost of a decrease in aggregate
datastore throughput.
6.5.1 Monitor Storage I/O Control Shares
Use the datastore Performance tab to monitor how Storage I/O Control handles the I/O
workloads of the virtual machines accessing a datastore based on their shares.
Datastore performance charts allow you to monitor the following information:
- Average latency and aggregated IOPS on the datastore
- Latency among hosts
- Queue depth among hosts
- Read/write IOPS among hosts
- Read/write latency among virtual machine disks
- Read/write IOPS among virtual machine disks
Procedure
1. Select the datastore in the vSphere Client inventory and click the Performance tab.
2. From the View drop-down menu, select Performance.
For more information, see the vSphere Monitoring and Performance documentation.
6.6 Managing Resource Pools
A resource pool is a logical abstraction for flexible management of resources. Resource pools
can be grouped into hierarchies and used to hierarchically partition available CPU and
memory resources.
Each standalone host and each DRS cluster has an (invisible) root resource pool that groups
the resources of that host or cluster. The root resource pool does not appear because the
resources of the host (or cluster) and the root resource pool are always the same.
Users can create child resource pools of the root resource pool or of any user-created child
resource pool. Each child resource pool owns some of the parent’s resources and can, in turn,
have a hierarchy of child resource pools to represent successively smaller units of
computational capability.
A resource pool can contain child resource pools, virtual machines, or both. You can create a
hierarchy of shared resources. The resource pools at a higher level are called parent resource
pools. Resource pools and virtual machines that are at the same level are called siblings. The
cluster itself represents the root resource pool. If you do not create child resource pools, only
the root resource pools exist.
In the following example, RP-QA is the parent resource pool for RP-QA-UI. RP-Marketing
and RP-QA are siblings. The three virtual machines immediately below RP-Marketing are
also siblings.
Parents, Children, and Siblings in Resource Pool Hierarchy
For each resource pool, you specify reservation, limit, shares, and whether the reservation
should be expandable. The resource pool resources are then available to child resource pools
and virtual machines.
6.7 Create a Resource Pool
You can create a child resource pool of any ESXi host, resource pool, or DRS cluster.
Note
If a host has been added to a cluster, you cannot create child resource pools of that host. If
the cluster is enabled for DRS, you can create child resource pools of the cluster.
When you create a child resource pool, you are prompted for resource pool attribute
information. The system uses admission control to make sure you cannot allocate resources
that are not available.
Prerequisites
The vSphere Client is connected to the vCenter Server system. If you connect the vSphere
Client directly to a host, you cannot create a resource pool.
Procedure
1. In the vSphere Client inventory, select a parent object for the resource pool (a host,
another resource pool, or a DRS cluster).
2. Select File > New > Resource Pool.
3. Type a name to identify the resource pool.
4. Specify how to allocate CPU and memory resources.
The CPU resources for your resource pool are the guaranteed physical resources the host
reserves for a resource pool. Normally, you accept the default and let the host handle
resource allocation.
- Shares: Specify shares for this resource pool with respect to the parent's total resources. Sibling resource pools share resources according to their relative share values bounded by the reservation and limit. Select Low, Normal, or High to specify share values respectively in a 1:2:4 ratio. Select Custom to give each virtual machine a specific number of shares, which expresses a proportional weight.
- Reservation: Specify a guaranteed CPU or memory allocation for this resource pool. Defaults to 0. A nonzero reservation is subtracted from the unreserved resources of the parent (host or resource pool). The resources are considered reserved, regardless of whether virtual machines are associated with the resource pool.
- Expandable Reservation: When the check box is selected (default), expandable reservations are considered during admission control. If you power on a virtual machine in this resource pool, and the combined reservations of the virtual machines are larger than the reservation of the resource pool, the resource pool can use resources from its parent or ancestors.
- Limit: Specify the upper limit for this resource pool's CPU or memory allocation. You can usually accept the default (Unlimited). To specify a limit, deselect the Unlimited check box.
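The same attributes map onto the API resource allocation structures. The sketch below is illustrative only: it assumes a connected `si`, a cluster named 'Cluster-01' (placeholder), CPU values in MHz, memory values in MB, and limit=-1 meaning Unlimited.

```python
# Minimal pyVmomi sketch: create a child resource pool with shares, reservation,
# expandable reservation, and limit. Assumes a connected `si`; names are placeholders.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == 'Cluster-01')
view.Destroy()

def alloc(reservation, expandable=True, limit=-1, level='normal'):
    # reservation: MHz for CPU, MB for memory; limit=-1 means Unlimited.
    return vim.ResourceAllocationInfo(
        reservation=reservation, expandableReservation=expandable, limit=limit,
        shares=vim.SharesInfo(level=level, shares=4000))   # shares value used only with 'custom'

spec = vim.ResourceConfigSpec(cpuAllocation=alloc(2000), memoryAllocation=alloc(1024))
rp = cluster.resourcePool.CreateResourcePool(name='RP-QA', spec=spec)
```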
6.8 Resource Pool Admission Control
When you power on a virtual machine in a resource pool, or try to create a child resource
pool, the system performs additional admission control to ensure the resource pool’s
restrictions are not violated.
Before you power on a virtual machine or create a resource pool, ensure that sufficient
resources are available using the Resource Allocation tab in the vSphere Client. The
Available Reservation value for CPU and memory displays resources that are unreserved.
How available CPU and memory resources are computed and whether actions are performed
depends on the Reservation Type.
Reservation Types
- Fixed: The system checks whether the selected resource pool has sufficient unreserved resources. If it does, the action can be performed. If it does not, a message appears and the action cannot be performed.
- Expandable (default): The system considers the resources available in the selected resource pool and its direct parent resource pool. If the parent resource pool also has the Expandable Reservation option selected, it can borrow resources from its parent resource pool. Borrowing resources occurs recursively from the ancestors of the current resource pool as long as the Expandable Reservation option is selected. Leaving this option selected offers more flexibility, but, at the same time, provides less protection. A child resource pool owner might reserve more resources than you anticipate.
The system does not allow you to violate preconfigured Reservation or Limit settings. Each
time you reconfigure a resource pool or power on a virtual machine, the system validates all
parameters so all service-level guarantees can still be met.
6.8.1 Expandable Reservations Example 1
This example shows you how a resource pool with expandable reservations works.
Assume an administrator manages pool P, and defines two child resource pools, S1 and S2,
for two different users (or groups).
The administrator knows that users want to power on virtual machines with reservations,
but does not know how much each user will need to reserve. Making the reservations for S1
and S2 expandable allows the administrator to more flexibly share and inherit the common
reservation for pool P.
Without expandable reservations, the administrator needs to explicitly allocate S1 and S2 a
specific amount. Such specific allocations can be inflexible, especially in deep resource pool
hierarchies and can complicate setting reservations in the resource pool hierarchy.
Expandable reservations cause a loss of strict isolation. S1 can start using all of P's
reservation, so that no memory or CPU is directly available to S2.
6.8.2 Expandable Reservations Example 2
This example shows how a resource pool with expandable reservations works.
Assume the following scenario, as shown in the figure.
- Parent pool RP-MOM has a reservation of 6GHz and one running virtual machine VM-M1 that reserves 1GHz.
- You create a child resource pool RP-KID with a reservation of 2GHz and with Expandable Reservation selected.
- You add two virtual machines, VM-K1 and VM-K2, with reservations of 2GHz each to the child resource pool and try to power them on.
- VM-K1 can reserve the resources directly from RP-KID (which has 2GHz).
- No local resources are available for VM-K2, so it borrows resources from the parent resource pool, RP-MOM. RP-MOM has 6GHz minus 1GHz (reserved by the virtual machine) minus 2GHz (reserved by RP-KID), which leaves 3GHz unreserved. With 3GHz available, you can power on the 2GHz virtual machine.
Admission Control with Expandable Resource Pools: Successful Power-On
Now, consider another scenario with VM-M1 and VM-M2.
- Power on two virtual machines in RP-MOM with a total reservation of 3GHz.
- You can still power on VM-K1 in RP-KID because 2GHz are available locally.
- When you try to power on VM-K2, RP-KID has no unreserved CPU capacity so it checks its parent. RP-MOM has only 1GHz of unreserved capacity available (5GHz of RP-MOM are already in use: 3GHz reserved by the local virtual machines and 2GHz reserved by RP-KID). As a result, you cannot power on VM-K2, which requires a 2GHz reservation.
Admission Control with Expandable Resource Pools: Power-On Prevented
6.9 Creating a DRS Cluster
A cluster is a collection of ESXi hosts and associated virtual machines with shared resources
and a shared management interface. Before you can obtain the benefits of cluster-level
resource management you must create a cluster and enable DRS.
Depending on whether or not Enhanced vMotion Compatibility (EVC) is enabled, DRS
behaves differently when you use vSphere Fault Tolerance (vSphere FT) virtual machines in
your cluster.
DRS Behavior with vSphere FT Virtual Machines and EVC
EVC        DRS (Load Balancing)                    DRS (Initial Placement)
Enabled    Enabled (Primary and Secondary VMs)     Enabled (Primary and Secondary VMs)
Disabled   Disabled (Primary and Secondary VMs)    Disabled (Primary VMs); Fully Automated (Secondary VMs)
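Creating a cluster with DRS enabled can be done through the API as well as the client. The sketch below assumes a connected `si` and a datacenter named 'DC-01'; the cluster name and automation level are placeholders for illustration.

```python
# Minimal pyVmomi sketch: create a cluster with DRS enabled.
# Assumes a connected `si`; datacenter and cluster names are placeholders.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datacenter], True)
dc = next(d for d in view.view if d.name == 'DC-01')
view.Destroy()

spec = vim.cluster.ConfigSpecEx(
    drsConfig=vim.cluster.DrsConfigInfo(
        enabled=True,
        defaultVmBehavior='fullyAutomated'))   # or 'manual' / 'partiallyAutomated'
cluster = dc.hostFolder.CreateClusterEx(name='DRS-Cluster-01', spec=spec)
```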
6.9.1 Migration Recommendations
If you create a cluster with a default manual or partially automated mode, vCenter Server
displays migration recommendations on the DRS Recommendations page.
The system supplies as many recommendations as necessary to enforce rules and balance the
resources of the cluster. Each recommendation includes the virtual machine to be moved,
current (source) host and destination host, and a reason for the recommendation. The
reason can be one of the following:
- Balance average CPU loads or reservations.
- Balance average memory loads or reservations.
- Satisfy resource pool reservations.
- Satisfy an affinity rule.
- Host is entering maintenance mode or standby mode.
6.10 DRS Cluster Requirements
Hosts that are added to a DRS cluster must meet certain requirements to use cluster features
successfully.
6.10.1 Shared Storage Requirements
A DRS cluster has certain shared storage requirements.
Ensure that the managed hosts use shared storage. Shared storage is typically on a SAN, but
can also be implemented using NAS shared storage.
See the vSphere Storage documentation for information about other shared storage.
6.10.2 Shared VMFS Volume Requirements
A DRS cluster has certain shared VMFS volume requirements.
Configure all managed hosts to use shared VMFS volumes.
- Place the disks of all virtual machines on VMFS volumes that are accessible by source and destination hosts.
- Ensure the VMFS volume is sufficiently large to store all virtual disks for your virtual machines.
- Ensure all VMFS volumes on source and destination hosts use volume names, and all virtual machines use those volume names for specifying the virtual disks.
Note
Virtual machine swap files also need to be on a VMFS accessible to source and destination
hosts (just like .vmdk virtual disk files). This requirement does not apply if all source and
destination hosts are ESX Server 3.5 or higher and using host-local swap. In that case,
vMotion with swap files on unshared storage is supported. Swap files are placed on a VMFS
by default, but administrators might override the file location using advanced virtual
machine configuration options.
Processor Compatibility Requirements
A DRS cluster has certain processor compatibility requirements.
To avoid limiting the capabilities of DRS, you should maximize the processor compatibility
of source and destination hosts in the cluster.
vMotion transfers the running architectural state of a virtual machine between underlying
ESXi hosts. vMotion compatibility means that the processors of the destination host must be
able to resume execution using the equivalent instructions where the processors of the
source host were suspended. Processor clock speeds and cache sizes might vary, but
processors must come from the same vendor class (Intel versus AMD) and the same
processor family to be compatible for migration with vMotion.
Processor families are defined by the processor vendors. You can distinguish different
processor versions within the same family by comparing the processors’ model, stepping
level, and extended features.
Sometimes, processor vendors have introduced significant architectural changes within the
same processor family (such as 64-bit extensions and SSE3). VMware identifies these
exceptions if it cannot guarantee successful migration with vMotion.
vCenter Server provides features that help ensure that virtual machines migrated with
vMotion meet processor compatibility requirements. These features include:
- Enhanced vMotion Compatibility (EVC) – You can use EVC to help ensure vMotion compatibility for the hosts in a cluster. EVC ensures that all hosts in a cluster present the same CPU feature set to virtual machines, even if the actual CPUs on the hosts differ. This prevents migrations with vMotion from failing due to incompatible CPUs. Configure EVC from the Cluster Settings dialog box. The hosts in a cluster must meet certain requirements for the cluster to use EVC. For information about EVC and EVC requirements, see the vCenter Server and Host Management documentation.
- CPU compatibility masks – vCenter Server compares the CPU features available to a virtual machine with the CPU features of the destination host to determine whether to allow or disallow migrations with vMotion. By applying CPU compatibility masks to individual virtual machines, you can hide certain CPU features from the virtual machine and potentially prevent migrations with vMotion from failing due to incompatible CPUs.
6.10.3 vMotion Requirements for DRS Clusters
A DRS cluster has certain vMotion requirements.
To enable the use of DRS migration recommendations, the hosts in your cluster must be part
of a vMotion network. If the hosts are not in the vMotion network, DRS can still make initial
placement recommendations.
To be configured for vMotion, each host in the cluster must meet the following requirements:
- vMotion does not support raw disks or migration of applications clustered using Microsoft Cluster Service (MSCS).
- vMotion requires a private Gigabit Ethernet migration network between all of the vMotion enabled managed hosts. When vMotion is enabled on a managed host, configure a unique network identity object for the managed host and connect it to the private migration network.
Automation Level      Action
Manual                Initial placement: Recommended host(s) is displayed. Migration: Recommendation is displayed.
Partially Automated   Initial placement: Automatic. Migration: Recommendation is displayed.
Fully Automated       Initial placement: Automatic. Migration: Recommendation is executed automatically.
6.10.4 Set a Custom Automation Level for a Virtual Machine
After you create a DRS cluster, you can customize the automation level for individual virtual
machines to override the cluster’s default automation level.
For example, you can select Manual for specific virtual machines in a cluster with full
automation, or Partially Automated for specific virtual machines in a manual cluster.
If a virtual machine is set to Disabled, vCenter Server does not migrate that virtual machine
or provide migration recommendations for it. This is known as pinning the virtual machine
to its registered host.
Note
If you have not enabled Enhanced vMotion Compatibility (EVC) for the cluster, fault tolerant
virtual machines are set to DRS disabled. They appear on this screen, but you cannot assign
an automation mode to them.
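A per-VM automation override can also be applied through the cluster configuration API. The sketch below assumes a connected `si`; the cluster and VM names are placeholders.

```python
# Minimal pyVmomi sketch: pin one VM to Manual automation while the cluster
# default stays fully automated. Assumes a connected `si`; names are placeholders.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == 'DRS-Cluster-01')
view.Destroy()

vm_view = content.viewManager.CreateContainerView(cluster, [vim.VirtualMachine], True)
vm = next(v for v in vm_view.view if v.name == 'app-vm-01')
vm_view.Destroy()

override = vim.cluster.DrsVmConfigSpec(
    operation='add',
    info=vim.cluster.DrsVmConfigInfo(key=vm, enabled=True, behavior='manual'))
task = cluster.ReconfigureComputeResource_Task(
    spec=vim.cluster.ConfigSpecEx(drsVmConfigSpec=[override]), modify=True)
```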
6.10.5 Add an Unmanaged Host to a Cluster
You can add an unmanaged host to a cluster. Such a host is not currently managed by the
same vCenter Server system as the cluster and it is not visible in the vSphere Client.
Procedure
1. Select the cluster to which to add the host and select Add Host from the right-click
menu.
2. Enter the host name, user name, and password, and click Next.
3. View the summary information and click Next.
4. Select what to do with the host's virtual machines and resource pools.
- Put this host's virtual machines in the cluster's root resource pool
  vCenter Server removes all existing resource pools of the host and the virtual machines in the host's hierarchy are all attached to the root. Because share allocations are relative to a resource pool, you might have to manually change a virtual machine's shares after selecting this option, which destroys the resource pool hierarchy.
- Create a resource pool for this host's virtual machines and resource pools
  vCenter Server creates a top-level resource pool that becomes a direct child of the cluster and adds all children of the host to that new resource pool. You can supply a name for that new top-level resource pool. The default is Grafted from <host_name>.
The host is added to the cluster.
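Adding an unmanaged host can also be scripted. The sketch below assumes a connected `si`; host name, credentials, and cluster name are placeholders, and the connection may additionally require the host's SSL thumbprint (ConnectSpec.sslThumbprint) when certificate verification is enforced.

```python
# Minimal pyVmomi sketch: add an unmanaged host to an existing cluster.
# Assumes a connected `si`; names and credentials are placeholders.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == 'DRS-Cluster-01')
view.Destroy()

connect_spec = vim.host.ConnectSpec(
    hostName='esx07.example.com',
    userName='root',
    password='********',
    force=False)                 # set True only to take over a host managed elsewhere
task = cluster.AddHost_Task(spec=connect_spec, asConnected=True)
```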
6.11 Removing a Host from a Cluster
When you remove a host from a DRS cluster, you affect resource pool hierarchies, virtual
machines, and you might create invalid clusters. Consider the affected objects before you
remove the host.
Resource Pool Hierarchies – When you remove a host from a cluster, the host retains only
the root resource pool, even if you used a DRS cluster and decided to graft the host resource
pool when you added the host to the cluster. In that case, the hierarchy remains with the
cluster. You can create a host-specific resource pool hierarchy.
Note
Ensure that you remove the host from the cluster by first placing it in maintenance mode. If
you instead disconnect the host before removing it from the cluster, the host retains the
resource pool that reflects the cluster hierarchy.
- Virtual Machines – A host must be in maintenance mode before you can remove it from the cluster and for a host to enter maintenance mode all powered-on virtual machines must be migrated off that host. When you request that a host enter maintenance mode, you are also asked whether you want to migrate all the powered-off virtual machines on that host to other hosts in the cluster.
- Invalid Clusters – When you remove a host from a cluster, the resources available for the cluster decrease. If the cluster has enough resources to satisfy the reservations of all virtual machines and resource pools in the cluster, the cluster adjusts resource allocation to reflect the reduced amount of resources. If the cluster does not have enough resources to satisfy the reservations of all resource pools, but there are enough resources to satisfy the reservations for all virtual machines, an alarm is issued and the cluster is marked yellow. DRS continues to run.
6.11.1 Place a Host in Maintenance Mode
You place a host in maintenance mode when you need to service it, for example, to install
more memory. A host enters or leaves maintenance mode only as the result of a user request.
Virtual machines that are running on a host entering maintenance mode need to be migrated to another host.
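Entering maintenance mode is a single API call. The sketch below assumes a connected `si` and a placeholder host name; in a DRS cluster, powered-on virtual machines must still be migrated off the host (fully automated DRS can do this for you).

```python
# Minimal pyVmomi sketch: place a host in maintenance mode before servicing it or
# removing it from the cluster. Assumes a connected `si`; the host name is a placeholder.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == 'esx07.example.com')
view.Destroy()

task = host.EnterMaintenanceMode_Task(timeout=0, evacuatePoweredOffVms=True)
```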
6.12 DRS Cluster Validity
The vSphere Client indicates whether a DRS cluster is valid, overcommitted (yellow), or
invalid (red).
DRS clusters become overcommitted or invalid for several reasons.
- A cluster might become overcommitted if a host fails.
- A cluster becomes invalid if vCenter Server is unavailable and you power on virtual machines using a vSphere Client connected directly to a host.
- A cluster becomes invalid if the user reduces the reservation on a parent resource pool while a virtual machine is in the process of failing over.
- If changes are made to hosts or virtual machines using a vSphere Client connected to a host while vCenter Server is unavailable, those changes take effect. When vCenter Server becomes available again, you might find that clusters have turned red or yellow because cluster requirements are no longer met.
When considering cluster validity scenarios, you should understand these terms.
- Reservation: A fixed, guaranteed allocation for the resource pool input by the user.
- Reservation Used: The sum of the reservation or reservation used (whichever is larger) for each child resource pool, added recursively.
- Unreserved: This nonnegative number differs according to resource pool type.
  o Nonexpandable resource pools: Reservation minus reservation used.
  o Expandable resource pools: (Reservation minus reservation used) plus any unreserved resources that can be borrowed from its ancestor resource pools.
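The Unreserved definition above can be expressed as a small helper function. The sketch below is illustrative only; it re-uses the RP-MOM and RP-KID numbers from the expandable reservation example in 6.8.2.

```python
# Small Python sketch of the Unreserved definition above.
def unreserved(reservation, reservation_used, expandable, parent_unreserved=0):
    """Unreserved capacity of a resource pool, per the definition above."""
    local = max(reservation - reservation_used, 0)
    return local + (parent_unreserved if expandable else 0)

# RP-MOM: 6GHz reserved, 1GHz (VM-M1) + 2GHz (RP-KID) already used -> 3GHz unreserved locally.
rp_mom_unreserved = unreserved(6, 1 + 2, expandable=False)
# RP-KID: 2GHz reserved and fully used by VM-K1; expandable, so it can borrow RP-MOM's 3GHz.
rp_kid_unreserved = unreserved(2, 2, expandable=True, parent_unreserved=rp_mom_unreserved)
print(rp_mom_unreserved, rp_kid_unreserved)   # 3 3 -> enough to admit the 2GHz VM-K2
```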
6.12.1 Valid DRS Clusters
A valid cluster has enough resources to meet all reservations and to support all running
virtual machines.
The following figure shows an example of a valid cluster with fixed resource pools and how
its CPU and memory resources are computed.
Valid Cluster with Fixed Resource Pools
The cluster has the following characteristics:
- A cluster with total resources of 12GHz.
- Three resource pools, each of type Fixed (Expandable Reservation is not selected).
- The total reservation of the three resource pools combined is 11GHz (4+4+3 GHz). The total is shown in the Reserved Capacity field for the cluster.
- RP1 was created with a reservation of 4GHz. Two virtual machines (VM1 and VM7) of 2GHz each are powered on (Reservation Used: 4GHz). No resources are left for powering on additional virtual machines. VM6 is shown as not powered on. It consumes none of the reservation.
- RP2 was created with a reservation of 4GHz. Two virtual machines of 1GHz and 2GHz are powered on (Reservation Used: 3GHz). 1GHz remains unreserved.
- RP3 was created with a reservation of 3GHz. One virtual machine with 3GHz is powered on. No resources for powering on additional virtual machines are available.
The following figure shows an example of a valid cluster with some resource pools (RP1 and
RP3) using reservation type Expandable.
Valid Cluster with Expandable Resource Pools
A valid cluster can be configured as follows:
- A cluster with total resources of 16GHz.
- RP1 and RP3 are of type Expandable, RP2 is of type Fixed.
- The total reservation used of the three resource pools combined is 16GHz (6GHz for RP1, 5GHz for RP2, and 5GHz for RP3). 16GHz shows up as the Reserved Capacity for the cluster at top level.
- RP1 was created with a reservation of 4GHz. Three virtual machines of 2GHz each are powered on. Two of those virtual machines (for example, VM1 and VM7) can use RP1's reservations, the third virtual machine (VM6) can use reservations from the cluster's resource pool. (If the type of this resource pool were Fixed, you could not power on the additional virtual machine.)
- RP2 was created with a reservation of 5GHz. Two virtual machines of 1GHz and 2GHz are powered on (Reservation Used: 3GHz). 2GHz remains unreserved.
- RP3 was created with a reservation of 5GHz. Two virtual machines of 3GHz and 2GHz are powered on. Even though this resource pool is of type Expandable, no additional 2GHz virtual machine can be powered on because the parent's extra resources are already used by RP1.
6.12.2 Overcommitted DRS Clusters
A cluster becomes overcommitted (yellow) when the tree of resource pools and virtual
machines is internally consistent but the cluster does not have the capacity to support all
resources reserved by the child resource pools.
There will always be enough resources to support all running virtual machines because,
when a host becomes unavailable, all its virtual machines become unavailable. A cluster
typically turns yellow when cluster capacity is suddenly reduced, for example, when a host in
the cluster becomes unavailable. VMware recommends that you leave adequate additional
cluster resources to avoid your cluster turning yellow.
Yellow Cluster
In this example:
- A cluster with total resources of 12GHz coming from three hosts of 4GHz each.
- Three resource pools reserving a total of 12GHz.
- The total reservation used by the three resource pools combined is 12GHz (4+5+3 GHz). That shows up as the Reserved Capacity in the cluster.
- One of the 4GHz hosts becomes unavailable, so total resources reduce to 8GHz.
- At the same time, VM4 (1GHz) and VM3 (3GHz), which were running on the host that failed, are no longer running.
- The cluster is now running virtual machines that require a total of 6GHz. The cluster still has 8GHz available, which is sufficient to meet virtual machine requirements.
The resource pool reservations of 12GHz can no longer be met, so the cluster is marked as
yellow.
6.12.3 Invalid DRS Clusters
A cluster enabled for DRS becomes invalid (red) when the tree is no longer internally
consistent, that is, resource constraints are not observed.
The total amount of resources in the cluster does not affect whether the cluster is red. A
cluster can be red, even if enough resources exist at the root level, if there is an inconsistency
at a child level.
You can resolve a red DRS cluster problem either by powering off one or more virtual
machines, moving virtual machines to parts of the tree that have sufficient resources, or
editing the resource pool settings in the red part. Adding resources typically helps only when
you are in the yellow state.
A cluster can also turn red if you reconfigure a resource pool while a virtual machine is
failing over. A virtual machine that is failing over is disconnected and does not count toward
the reservation used by the parent resource pool. You might reduce the reservation of the
parent resource pool before the failover completes. After the failover is complete, the virtual
machine resources are again charged to the parent resource pool. If the pool’s usage becomes
larger than the new reservation, the cluster turns red.
If a user is able to start a virtual machine (in an unsupported way) with a reservation of
3GHz under resource pool 2, the cluster would become red, as shown in the following figure.
Red Cluster
6.13 DPM
Note
ESXi hosts cannot automatically be brought out of standby mode unless they are running in
a cluster managed by vCenter Server.
vSphere DPM can use one of three power management protocols to bring a host out of
standby mode: Intelligent Platform Management Interface (IPMI), Hewlett-Packard
Integrated Lights-Out (iLO), or Wake-On-LAN (WOL). Each protocol requires its own
hardware support and configuration. If a host does not support any of these protocols it
cannot be put into standby mode by vSphere DPM. If a host supports multiple protocols,
they are used in the following order: IPMI, iLO, WOL.
6.13.1 Test Wake-on-LAN for vSphere DPM
The use of Wake-on-LAN (WOL) for the vSphere DPM feature is fully supported, if you
configure and successfully test it according to the VMware guidelines. You must perform
these steps before enabling vSphere DPM for a cluster for the first time or on any host that is
being added to a cluster that is using vSphere DPM.
Prerequisites
Before testing WOL, ensure that your cluster meets the prerequisites.
- Your cluster must contain at least two ESX 3.5 (or ESX 3i version 3.5) or later hosts.
- Each host's vMotion networking link must be working correctly. The vMotion network should also be a single IP subnet, not multiple subnets separated by routers.
- The vMotion NIC on each host must support WOL. To check for WOL support, first determine the name of the physical network adapter corresponding to the VMkernel port by selecting the host in the inventory panel of the vSphere Client, selecting the Configuration tab, and clicking Networking. After you have this information, click on Network Adapters and find the entry corresponding to the network adapter. The Wake On LAN Supported column for the relevant adapter should show Yes.
- To display the WOL-compatibility status for each NIC on a host, select the host in the inventory panel of the vSphere Client, select the Configuration tab, and click Network Adapters. The NIC must show Yes in the Wake On LAN Supported column.
- The switch port that each WOL-supporting vMotion NIC is plugged into should be set to auto negotiate the link speed, and not set to a fixed speed (for example, 1000 Mb/s). Many NICs support WOL only if they can switch to 100 Mb/s or less when the host is powered off.
After you verify these prerequisites, test each ESXi host that is going to use WOL to support
vSphere DPM. When you test these hosts, ensure that the vSphere DPM feature is disabled
for the cluster.
Caution
Ensure that any host being added to a vSphere DPM cluster that uses WOL as a wake
protocol is tested and disabled from using power management if it fails the testing. If this is
not done, vSphere DPM might power off hosts that it subsequently cannot power back up.
6.13.2 Using VM-Host Affinity Rules
You use a VM-Host affinity rule to specify an affinity relationship between a group of virtual
machines and a group of hosts. When using VM-Host affinity rules, you should be aware of
when they could be most useful, how conflicts between rules are resolved, and the
importance of caution when setting required affinity rules.
One use case where VM-Host affinity rules are helpful is when the software you are running
in your virtual machines has licensing restrictions. You can place such virtual machines into
a DRS group and then create a rule that requires them to run on a host DRS group that
contains only host machines that have the required licenses.
Note
When you create a VM-Host affinity rule that is based on the licensing or hardware
requirements of the software running in your virtual machines, you are responsible for
ensuring that the groups are properly set up. The rule does not monitor the software running
in the virtual machines nor does it know what non-VMware licenses are in place on which
ESXi hosts.
If you create more than one VM-Host affinity rule, the rules are not ranked, but are applied
equally. Be aware that this has implications for how the rules interact. For example, a virtual
machine that belongs to two DRS groups, each of which belongs to a different required rule,
can run only on hosts that belong to both of the host DRS groups represented in the rules.
When you create a VM-Host affinity rule, its ability to function in relation to other rules is
not checked. So it is possible for you to create a rule that conflicts with the other rules you
are using. When two VM-Host affinity rules conflict, the older one takes precedence and the
newer rule is disabled. DRS only tries to satisfy enabled rules and disabled rules are ignored.
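The precedence behavior described above (the older rule stays enabled, the newer conflicting rule is disabled, and DRS considers only enabled rules) can be illustrated with a small toy model. This is not VMware code; it only mirrors the described outcome.

```python
# Toy illustration of conflict handling between two VM-Host affinity rules.
from dataclasses import dataclass

@dataclass
class AffinityRule:
    name: str
    created_order: int          # lower value = older rule
    enabled: bool = True

def resolve_conflict(rule_a: AffinityRule, rule_b: AffinityRule) -> None:
    """Disable the newer of two conflicting rules, mirroring the behavior described above."""
    newer = rule_a if rule_a.created_order > rule_b.created_order else rule_b
    newer.enabled = False

r1 = AffinityRule("must-run-on-licensed-hosts", created_order=1)
r2 = AffinityRule("must-run-on-rack-b", created_order=2)
resolve_conflict(r1, r2)
print([(r.name, r.enabled) for r in (r1, r2)])
# [('must-run-on-licensed-hosts', True), ('must-run-on-rack-b', False)]
```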
DRS, vSphere HA, and vSphere DPM never take any action that results in the violation of
required affinity rules (those where the virtual machine DRS group 'must run on' or 'must
not run on' the host DRS group). Accordingly, you should exercise caution when using this
type of rule because of its potential to adversely affect the functioning of the cluster. If
improperly used, required VM-Host affinity rules can fragment the cluster and inhibit the
proper functioning of DRS, vSphere HA, and vSphere DPM.
A number of cluster functions are not performed if doing so would violate a required affinity
rule.
 DRS does not evacuate virtual machines to place a host in maintenance mode.
 DRS does not place virtual machines for power-on or load balance virtual machines.
 vSphere HA does not perform failovers.
 vSphere DPM does not optimize power management by placing hosts into standby mode.
To avoid these situations, exercise caution when creating more than one required affinity
rule or consider using VM-Host affinity rules that are preferential only (those where the
virtual machine DRS group 'should run on' or 'should not run on' the host DRS group).
Ensure that the number of hosts in the cluster with which each virtual machine is affined is
large enough that losing a host does not result in a lack of hosts on which the virtual machine
can run. Preferential rules can be violated to allow the proper functioning of DRS, vSphere
HA, and vSphere DPM.
Note
You can create an event-based alarm that is triggered when a virtual machine violates a VM-Host affinity rule. In the vSphere Client, add a new alarm for the virtual machine and select
VM is violating VM-Host Affinity Rule as the event trigger. For more information about
creating and editing alarms, see the vSphere Monitoring and Performance documentation.
6.14 Datastore clusters
Initial placement occurs when Storage DRS selects a datastore within a datastore cluster on
which to place a virtual machine disk. This happens when the virtual machine is being
created or cloned, when a virtual machine disk is being migrated to another datastore
cluster, or when you add a disk to an existing virtual machine.
Initial placement recommendations are made in accordance with space constraints and with
respect to the goals of space and I/O load balancing. These goals aim to minimize the risk of
over-provisioning one datastore, storage I/O bottlenecks, and performance impact on virtual
machines.
Storage DRS is invoked at the configured frequency (by default, every eight hours) or when
one or more datastores in a datastore cluster exceeds the user-configurable space utilization
thresholds. When Storage DRS is invoked, it checks each datastore's space utilization and
I/O latency values against the threshold. For I/O latency, Storage DRS uses the 90th
percentile I/O latency measured over the course of a day to compare against the threshold.
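As a rough illustration of that check, the following sketch computes a nearest-rank 90th percentile over a set of hypothetical daily latency samples and compares it to a threshold; the 15 ms value is used here purely as an example, and the samples are invented.

```python
# Minimal sketch (assumed sample data, not a VMware API) of the latency check described
# above: compare a datastore's 90th percentile I/O latency over a day of samples against
# the configured Storage DRS threshold.
def percentile(samples, pct):
    """Nearest-rank percentile of a list of numeric samples."""
    ordered = sorted(samples)
    rank = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[rank]

def exceeds_latency_threshold(daily_latency_ms, threshold_ms=15.0):
    return percentile(daily_latency_ms, 90) > threshold_ms

# Example: one day's worth of hypothetical latency samples in milliseconds
samples = [4, 6, 5, 7, 30, 8, 5, 22, 6, 41]
print(exceeds_latency_threshold(samples))   # True: the 90th percentile (30 ms) exceeds 15 ms
```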
6.15 Setting the Aggressiveness Level for Storage DRS
The aggressiveness of Storage DRS is determined by specifying thresholds for space used and
I/O latency.
Storage DRS collects resource usage information for the datastores in a datastore cluster.
vCenter Server uses this information to generate recommendations for placement of virtual
disks on datastores.
When you set a low aggressiveness level for a datastore cluster, Storage DRS recommends
Storage vMotion migrations only when absolutely necessary, for example, when I/O load,
space utilization, or their imbalance is high. When you set a high aggressiveness level for a
datastore cluster, Storage DRS recommends migrations whenever the datastore cluster can
benefit from space or I/O load balancing.
In the vSphere Client, you can use the following thresholds to set the aggressiveness level for
Storage DRS:
Space Utilization: Storage DRS generates recommendations or performs migrations when the percentage of space utilization on the datastore is greater than the threshold you set in the vSphere Client.
I/O Latency: Storage DRS generates recommendations or performs migrations when the 90th percentile I/O latency measured over a day for the datastore is greater than the threshold.
You can also set advanced options to further configure the aggressiveness level of Storage
DRS.
Space utilization difference: This threshold ensures that there is some minimum difference between the space utilization of the source and the destination. For example, if the space used on datastore A is 82% and datastore B is 79%, the difference is 3. If the threshold is 5, Storage DRS will not make migration recommendations from datastore A to datastore B.
I/O load balancing invocation interval: After this interval, Storage DRS runs to balance I/O load.
I/O imbalance threshold: Lowering this value makes I/O load balancing less aggressive. Storage DRS computes an I/O fairness metric between 0 and 1, with 1 being the fairest distribution. I/O load balancing runs only if the computed metric is less than 1 - (I/O imbalance threshold / 100).
The space utilization difference and I/O imbalance checks are illustrated in the sketch below.
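A minimal sketch of these two checks follows, assuming example threshold values (a 5-point space utilization difference and an I/O imbalance threshold of 5). This is not VMware's implementation, only the arithmetic described above.

```python
# Rough sketch of the two advanced-option checks described above.

def space_difference_ok(source_used_pct, dest_used_pct, min_difference=5):
    """Only consider a migration if the source is fuller than the destination by at
    least the configured space utilization difference (5 points used as an example)."""
    return (source_used_pct - dest_used_pct) >= min_difference

def io_balancing_runs(fairness_metric, imbalance_threshold=5):
    """I/O load balancing runs only if the computed fairness metric (0..1, 1 = fairest)
    is less than 1 - (I/O imbalance threshold / 100)."""
    return fairness_metric < 1 - (imbalance_threshold / 100.0)

print(space_difference_ok(82, 79))   # False: a 3-point difference is below the 5-point threshold
print(io_balancing_runs(0.93))       # True: 0.93 < 0.95, so balancing would run
```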
6.16 Datastore Cluster Requirements
Datastores and hosts that are associated with a datastore cluster must meet certain
requirements to use datastore cluster features successfully.
Follow these guidelines when you create a datastore cluster.
 Datastore clusters must contain similar or interchangeable datastores.
A datastore cluster can contain a mix of datastores with different sizes and I/O capacities, and can be from different arrays and vendors. However, the following types of datastores cannot coexist in a datastore cluster.
o NFS and VMFS datastores cannot be combined in the same datastore cluster.
o Replicated datastores cannot be combined with non-replicated datastores in the same Storage-DRS-enabled datastore cluster.
 All hosts attached to the datastores in a datastore cluster must be ESXi 5.0 and later. If datastores in the datastore cluster are connected to ESX/ESXi 4.x and earlier hosts, Storage DRS does not run.
 Datastores shared across multiple datacenters cannot be included in a datastore cluster.
 As a best practice, do not include datastores that have hardware acceleration enabled in the same datastore cluster as datastores that do not have hardware acceleration enabled. Datastores in a datastore cluster must be homogeneous to guarantee hardware acceleration-supported behavior. (A simple scripted check of these mixing rules is sketched after this list.)
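The mixing rules above can be pre-checked before a cluster is built. The sketch below uses a hypothetical data model (the class and field names are assumptions, not a vSphere API) and simply flags the combinations called out in the guidelines.

```python
# Illustrative pre-check for a proposed datastore cluster: flag NFS/VMFS mixing,
# replicated/non-replicated mixing, and mixed hardware acceleration.
from dataclasses import dataclass

@dataclass
class DatastoreInfo:
    name: str
    fs_type: str            # "VMFS" or "NFS"
    replicated: bool
    hw_accel: bool

def cluster_warnings(datastores):
    warnings = []
    if len({d.fs_type for d in datastores}) > 1:
        warnings.append("NFS and VMFS datastores cannot be combined in one datastore cluster")
    if len({d.replicated for d in datastores}) > 1:
        warnings.append("Replicated and non-replicated datastores cannot be combined")
    if len({d.hw_accel for d in datastores}) > 1:
        warnings.append("Mixing hardware-acceleration-enabled and disabled datastores is not recommended")
    return warnings

proposed = [DatastoreInfo("ds01", "VMFS", False, True),
            DatastoreInfo("ds02", "NFS", False, True)]
print(cluster_warnings(proposed))
```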
6.17 Adding and Removing Datastores from a Datastore Cluster
You add and remove datastores to and from an existing datastore cluster by dragging them
in the vSphere Client inventory.
You can add to a datastore cluster any datastore that is mounted on a host in the vSphere
Client inventory, with the following exceptions:
 All hosts attached to the datastore must be ESXi 5.0 and later.
 The datastore cannot be in more than one datacenter in the same instance of the vSphere Client.
When you remove a datastore from a datastore cluster, the datastore remains in the vSphere
Client inventory and is not unmounted from the host.
6.17.1 Place a Datastore in Maintenance Mode
If you need to take a datastore out of service, you can place the datastore in Storage DRS
maintenance mode.
Prerequisites
Storage DRS is enabled on the datastore cluster that contains the datastore that is entering
maintenance mode.
No CD-ROM image files are stored on the datastore.
There are at least two datastores in the datastore cluster.
Procedure
1. In the vSphere Client inventory, right-click a datastore in a datastore cluster and select Enter SDRS Maintenance Mode.
A list of recommendations appears for datastore maintenance mode migration.
2. (Optional) On the Placement Recommendations tab, deselect any recommendations you do not want to apply.
Note
The datastore cannot enter maintenance mode without evacuating all disks. If you deselect recommendations, you must manually move the affected virtual machines.
3. If necessary, click Apply Recommendations.
vCenter Server uses Storage vMotion to migrate the virtual disks from the source datastore
to the destination datastore and the datastore enters maintenance mode.
The datastore icon might not be immediately updated to reflect the datastore's current state.
To update the icon immediately, click Refresh.
6.17.2 Ignore Storage DRS Affinity Rules for Maintenance Mode
Storage DRS affinity or anti-affinity rules might prevent a datastore from entering
maintenance mode. You can ignore these rules when you put a datastore in maintenance
mode.
When you enable the Ignore Affinity Rules for Maintenance option for a datastore cluster,
vCenter Server ignores Storage DRS affinity and anti-affinity rules that prevent a datastore
from entering maintenance mode.
Storage DRS rules are ignored only for evacuation recommendations. vCenter Server does
not violate the rules when making space and load balancing recommendations or initial
placement recommendations.
Procedure
1. In the vSphere Client inventory, right-click a datastore cluster and select Edit Settings.
2. In the right pane of the Edit Datastore Cluster dialog box, select SDRS Automation.
3. Click Advanced Options.
4. Select IgnoreAffinityRulesForMaintenance.
5. In the Value column, type 1 to enable the option. Type 0 to disable the option.
6. Click OK.
6.18 Storage DRS Anti-Affinity Rules
You can create Storage DRS anti-affinity rules to control which virtual disks should not be
placed on the same datastore within a datastore cluster. By default, a virtual machine's
virtual disks are kept together on the same datastore.
When you create an anti-affinity rule, it applies to the relevant virtual disks in the datastore
cluster. Anti-affinity rules are enforced during initial placement and Storage DRS-recommended migrations, but are not enforced when a migration is initiated by a user.
Note
Anti-affinity rules do not apply to CD-ROM ISO image files that are stored on a datastore in
a datastore cluster, nor do they apply to swapfiles that are stored in user-defined locations.
Inter-VM Anti-Affinity Rules: Specify which virtual machines should never be kept on the same datastore. See Create Inter-VM Anti-Affinity Rules.
Intra-VM Anti-Affinity Rules: Specify which virtual disks associated with a particular virtual machine must be kept on different datastores. See Create Intra-VM Anti-Affinity Rules.
If you move a virtual disk out of the datastore cluster, the affinity or anti-affinity rule no
longer applies to that disk.
When you move virtual disk files into a datastore cluster that has existing affinity and anti-affinity rules, the following behavior applies:
 Datastore Cluster B has an intra-VM affinity rule. When you move a virtual disk out of Datastore Cluster A and into Datastore Cluster B, any rule that applied to the virtual disk for a given virtual machine in Datastore Cluster A no longer applies. The virtual disk is now subject to the intra-VM affinity rule in Datastore Cluster B.
 Datastore Cluster B has an inter-VM anti-affinity rule. When you move a virtual disk out of Datastore Cluster A and into Datastore Cluster B, any rule that applied to the virtual disk for a given virtual machine in Datastore Cluster A no longer applies. The virtual disk is now subject to the inter-VM anti-affinity rule in Datastore Cluster B.
 Datastore Cluster B has an intra-VM anti-affinity rule. When you move a virtual disk out of Datastore Cluster A and into Datastore Cluster B, the intra-VM anti-affinity rule does not apply to the virtual disk for a given virtual machine because the rule is limited to only specified virtual disks in Datastore Cluster B.
Note
Storage DRS rules might prevent a datastore from entering maintenance mode. You can
choose to ignore Storage DRS rules for maintenance mode by enabling the Ignore Affinity
Rules for Maintenance option.
6.18.1 Create Inter-VM Anti-Affinity Rules
You can create an anti-affinity rule to indicate that all virtual disks of certain virtual
machines must be kept on different datastores. The rule applies to individual datastore
clusters.
Virtual machines that participate in an inter-VM anti-affinity rule in a datastore cluster must
be associated with an intra-VM affinity rule in the datastore cluster. The virtual machines
must also comply with the intra-VM affinity rule.
If a virtual machine is subject to an inter-VM anti-affinity rule, the following behavior
applies:
 Storage DRS places the virtual machine's virtual disks according to the rule.
 Storage DRS migrates the virtual disks using Storage vMotion according to the rule, even if the migration is for a mandatory reason such as putting a datastore in maintenance mode.
 If the virtual machine's virtual disk violates the rule, Storage DRS makes migration recommendations to correct the error or reports the violation as a fault if it cannot make a recommendation that will correct the error.
No inter-VM anti-affinity rules are defined by default.
Procedure
1. In the vSphere Client inventory, right-click a datastore cluster and select Edit
Settings.
2. In the left pane of the Edit Datastore Cluster dialog box, select Rules.
3. Click Add.
4. Type a name for the rule.
5. From the Type menu, select VM anti-affinity.
6. Click Add.
7. Click Select Virtual Machine.
8. Select at least two virtual machines and click OK.
9. Click OK to save the rule.
6.19 Storage vMotion Compatibility with Datastore Clusters
A datastore cluster has certain vSphere Storage vMotion® requirements.
 The host must be running a version of ESXi that supports Storage vMotion.
 The host must have write access to both the source datastore and the destination datastore.
 The host must have enough free memory resources to accommodate Storage vMotion.
 The destination datastore must have sufficient disk space.
 The destination datastore must not be in maintenance mode or entering maintenance mode.
A basic scripted pre-check of the last two requirements is sketched below.
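A basic pre-flight check for the last two requirements might look like the following sketch; the values are hypothetical and the function is not part of any VMware API.

```python
# Simple pre-flight sketch for two of the checks above: destination free space and
# destination maintenance-mode state.
def storage_vmotion_precheck(required_gb, dest_free_gb, dest_in_maintenance):
    if dest_in_maintenance:
        return False, "destination datastore is in (or entering) maintenance mode"
    if dest_free_gb < required_gb:
        return False, "destination datastore lacks sufficient free space"
    return True, "basic checks passed"

print(storage_vmotion_precheck(required_gb=120, dest_free_gb=90, dest_in_maintenance=False))
# (False, 'destination datastore lacks sufficient free space')
```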
6.20 Using NUMA Systems with ESXi
ESXi supports memory access optimization for Intel and AMD Opteron processors in server
architectures that support NUMA (non-uniform memory access).
After you understand how ESXi NUMA scheduling is performed and how the VMware
NUMA algorithms work, you can specify NUMA controls to optimize the performance of
your virtual machines.
This chapter includes the following topics:
 What is NUMA?
 How ESXi NUMA Scheduling Works
 VMware NUMA Optimization Algorithms and Settings
 Resource Management in NUMA Architectures
 Using Virtual NUMA
 Specifying NUMA Controls
6.20.1 What is NUMA?
NUMA systems are advanced server platforms with more than one system bus. They can
harness large numbers of processors in a single system image with superior price to
performance ratios.
For the past decade, processor clock speed has increased dramatically. A multi-gigahertz
CPU, however, needs to be supplied with a large amount of memory bandwidth to use its
processing power effectively. Even a single CPU running a memory-intensive workload, such
as a scientific computing application, can be constrained by memory bandwidth.
This problem is amplified on symmetric multiprocessing (SMP) systems, where many
processors must compete for bandwidth on the same system bus. Some high-end systems
often try to solve this problem by building a high-speed data bus. However, such a solution is
expensive and limited in scalability.
NUMA is an alternative approach that links several small, cost-effective nodes using a high-performance connection. Each node contains processors and memory, much like a small
SMP system. However, an advanced memory controller allows a node to use memory on all
other nodes, creating a single system image. When a processor accesses memory that does
not lie within its own node (remote memory), the data must be transferred over the NUMA
connection, which is slower than accessing local memory. Memory access times are not
uniform and depend on the location of the memory and the node from which it is accessed,
as the technology’s name implies.
6.20.2 Challenges for Operating Systems
Because a NUMA architecture provides a single system image, it can often run an operating
system with no special optimizations.
The high latency of remote memory accesses can leave the processors under-utilized,
constantly waiting for data to be transferred to the local node, and the NUMA connection can
become a bottleneck for applications with high-memory bandwidth demands.
Furthermore, performance on such a system can be highly variable. It varies, for example, if
an application has memory located locally on one benchmarking run, but a subsequent run
happens to place all of that memory on a remote node. This phenomenon can make capacity
planning difficult.
Some high-end UNIX systems provide support for NUMA optimizations in their compilers
and programming libraries. This support requires software developers to tune and recompile
their programs for optimal performance. Optimizations for one system are not guaranteed to
work well on the next generation of the same system. Other systems have allowed an
administrator to explicitly decide on the node on which an application should run. While this
might be acceptable for certain applications that demand 100 percent of their memory to be
local, it creates an administrative burden and can lead to imbalance between nodes when
workloads change.
Ideally, the system software provides transparent NUMA support, so that applications can
benefit immediately without modifications. The system should maximize the use of local
memory and schedule programs intelligently without requiring constant administrator
intervention. Finally, it must respond well to changing conditions without compromising
fairness or performance.
6.20.3 How ESXi NUMA Scheduling Works
ESXi uses a sophisticated NUMA scheduler to dynamically balance processor load and memory locality.
1. Each virtual machine managed by the NUMA scheduler is assigned a home node. A
home node is one of the system’s NUMA nodes containing processors and local
memory, as indicated by the System Resource Allocation Table (SRAT).
2. When memory is allocated to a virtual machine, the ESXi host preferentially allocates
it from the home node. The virtual CPUs of the virtual machine are constrained to
run on the home node to maximize memory locality.
3. The NUMA scheduler can dynamically change a virtual machine's home node to
respond to changes in system load. The scheduler might migrate a virtual machine to
a new home node to reduce processor load imbalance. Because this might cause more
of its memory to be remote, the scheduler might migrate the virtual machine’s
memory dynamically to its new home node to improve memory locality. The NUMA
scheduler might also swap virtual machines between nodes when this improves
overall memory locality.
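As a rough mental model of steps 1 to 3, the following toy scheduler assigns each virtual machine a home node and periodically moves one virtual machine from the most loaded node to the least loaded node. It is greatly simplified and is not the ESXi scheduler; memory migration and locality scoring are omitted.

```python
# Toy model of home-node placement and periodic rebalancing.
from collections import defaultdict

class ToyNumaScheduler:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.home = {}                       # vm name -> home node
        self.load = defaultdict(int)         # node -> number of resident VMs

    def place(self, vm):
        node = min(self.nodes, key=lambda n: self.load[n])   # initial placement
        self.home[vm] = node
        self.load[node] += 1
        return node

    def rebalance(self, max_gap=1):
        busiest = max(self.nodes, key=lambda n: self.load[n])
        idlest = min(self.nodes, key=lambda n: self.load[n])
        if self.load[busiest] - self.load[idlest] > max_gap:
            vm = next(v for v, n in self.home.items() if n == busiest)
            self.home[vm] = idlest           # change home node; memory would follow later
            self.load[busiest] -= 1
            self.load[idlest] += 1

sched = ToyNumaScheduler(nodes=["node0", "node1"])
for name in ["vm1", "vm2", "vm3", "vm4"]:
    sched.place(name)
# Simulate two VMs on node1 powering off, leaving the nodes imbalanced
for name in ["vm2", "vm4"]:
    sched.load[sched.home.pop(name)] -= 1
sched.rebalance()
print(sched.home)    # one remaining VM has been moved to node1
```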
Some virtual machines are not managed by the ESXi NUMA scheduler. For example, if you
manually set the processor or memory affinity for a virtual machine, the NUMA scheduler
might not be able to manage this virtual machine. Virtual machines that are not managed by
the NUMA scheduler still run correctly. However, they don't benefit from ESXi NUMA
optimizations.
The NUMA scheduling and memory placement policies in ESXi can manage all virtual
machines transparently, so that administrators do not need to address the complexity of
balancing virtual machines between nodes explicitly.
The optimizations work seamlessly regardless of the type of guest operating system. ESXi
provides NUMA support even to virtual machines that do not support NUMA hardware, such
as Windows NT 4.0. As a result, you can take advantage of new hardware even with legacy
operating systems.
A virtual machine that has more virtual processors than the number of physical processor
cores available on a single hardware node can be managed automatically. The NUMA
scheduler accommodates such a virtual machine by having it span NUMA nodes. That is, it is
split up as multiple NUMA clients, each of which is assigned to a node and then managed by
the scheduler as a normal, non-spanning client. This can improve the performance of certain
memory-intensive workloads with high locality. For information on configuring the behavior
of this feature, see Advanced Virtual Machine Attributes.
ESXi 5.0 and later includes support for exposing virtual NUMA topology to guest operating
systems. For more information about virtual NUMA control, see Using Virtual NUMA.
6.20.4 VMware NUMA Optimization Algorithms and Settings
This section describes the algorithms and settings used by ESXi to maximize application
performance while still maintaining resource guarantees.
6.20.5 Home Nodes and Initial Placement
When a virtual machine is powered on, ESXi assigns it a home node. A virtual machine runs
only on processors within its home node, and its newly allocated memory comes from the
home node as well.
Unless a virtual machine’s home node changes, it uses only local memory, avoiding the
performance penalties associated with remote memory accesses to other NUMA nodes.
When a virtual machine is powered on, it is assigned an initial home node so that the overall
CPU and memory load among NUMA nodes remains balanced. Because internode latencies
in a large NUMA system can vary greatly, ESXi determines these internode latencies at boot
time and uses this information when initially placing virtual machines that are wider than a
single NUMA node. These wide virtual machines are placed on NUMA nodes that are close to
each other for lowest memory access latencies.
Initial placement-only approaches are usually sufficient for systems that run only a single
workload, such as a benchmarking configuration that remains unchanged as long as the
system is running. However, this approach is unable to guarantee good performance and
fairness for a datacenter-class system that supports changing workloads. Therefore, in
addition to initial placement, ESXi 5.0 does dynamic migration of virtual CPUs and memory
between NUMA nodes for improving CPU balance and increasing memory locality.
6.20.6 Dynamic Load Balancing and Page Migration
ESXi combines the traditional initial placement approach with a dynamic rebalancing
algorithm. Periodically (every two seconds by default), the system examines the loads of the
various nodes and determines if it should rebalance the load by moving a virtual machine
from one node to another.
This calculation takes into account the resource settings for virtual machines and resource
pools to improve performance without violating fairness or resource entitlements.
The rebalancer selects an appropriate virtual machine and changes its home node to the least
loaded node. When it can, the rebalancer moves a virtual machine that already has some
memory located on the destination node. From that point on (unless it is moved again), the
virtual machine allocates memory on its new home node and it runs only on processors
within the new home node.
Rebalancing is an effective solution to maintain fairness and ensure that all nodes are fully
used. The rebalancer might need to move a virtual machine to a node on which it has
allocated little or no memory. In this case, the virtual machine incurs a performance penalty
associated with a large number of remote memory accesses. ESXi can eliminate this penalty
by transparently migrating memory from the virtual machine’s original node to its new home
node:
1. The system selects a page (4KB of contiguous memory) on the original node and
copies its data to a page in the destination node.
2. The system uses the virtual machine monitor layer and the processor’s memory
management hardware to seamlessly remap the virtual machine’s view of memory, so
that it uses the page on the destination node for all further references, eliminating the
penalty of remote memory access.
When a virtual machine moves to a new node, the ESXi host immediately begins to migrate
its memory in this fashion. It manages the rate to avoid overtaxing the system, particularly
when the virtual machine has little remote memory remaining or when the destination node
has little free memory available. The memory migration algorithm also ensures that the ESXi
host does not move memory needlessly if a virtual machine is moved to a new node for only a
short period.
When initial placement, dynamic rebalancing, and intelligent memory migration work in
conjunction, they ensure good memory performance on NUMA systems, even in the
presence of changing workloads. When a major workload change occurs, for instance when
new virtual machines are started, the system takes time to readjust, migrating virtual
machines and memory to new locations. After a short period, typically seconds or minutes,
the system completes its readjustments and reaches a steady state.
6.20.7 Transparent Page Sharing Optimized for NUMA
Many ESXi workloads present opportunities for sharing memory across virtual machines.
For example, several virtual machines might be running instances of the same guest
operating system, have the same applications or components loaded, or contain common
data. In such cases, ESXi systems use a proprietary transparent page-sharing technique to
securely eliminate redundant copies of memory pages. With memory sharing, a workload
running in virtual machines often consumes less memory than it would when running on
physical machines. As a result, higher levels of overcommitment can be supported efficiently.
Transparent page sharing for ESXi systems has also been optimized for use on NUMA
systems. On NUMA systems, pages are shared per-node, so each NUMA node has its own
local copy of heavily shared pages. When virtual machines use shared pages, they don't need
to access remote memory.
Note
This default behavior is the same in all previous versions of ESX and ESXi.
6.20.8 Resource Management in NUMA Architectures
You can perform resource management with different types of NUMA architecture.
With the proliferation of highly multicore systems, NUMA architectures are becoming more
popular as these architectures allow better performance scaling of memory intensive
workloads. All modern Intel and AMD systems have NUMA support built into the
processors. Additionally, there are traditional NUMA systems like the IBM Enterprise X-Architecture that extend Intel and AMD processors with NUMA behavior with specialized
chipset support.
Typically, you can use BIOS settings to enable and disable NUMA behavior. For example, in
AMD Opteron-based HP Proliant servers, NUMA can be disabled by enabling node
interleaving in the BIOS. If NUMA is enabled, the BIOS builds a system resource allocation
table (SRAT) which ESXi uses to generate the NUMA information used in optimizations. For
scheduling fairness, NUMA optimizations are not enabled for systems with too few cores per
NUMA node or too few cores overall. You can modify the numa.rebalancecorestotal and
numa.rebalancecoresnode options to change this behavior.
6.20.9 Using Virtual NUMA
vSphere 5.0 and later includes support for exposing virtual NUMA topology to guest
operating systems, which can improve performance by facilitating guest operating system
and application NUMA optimizations.
Virtual NUMA topology is available to hardware version 8 virtual machines and is enabled
by default when the number of virtual CPUs is greater than eight. You can also manually
influence virtual NUMA topology using advanced configuration options.
You can affect the virtual NUMA topology with two settings in the vSphere Client: number of
virtual sockets and number of cores per socket for a virtual machine. If the number of cores
per socket (cpuid.coresPerSocket) is greater than one, and the number of virtual cores in the
virtual machine is greater than 8, the virtual NUMA node size matches the virtual socket
size. If the number of cores per socket is less than or equal to one, virtual NUMA nodes are
created to match the topology of the first physical host where the virtual machine is powered
on.
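The sizing rule in the previous paragraph can be summarized in a few lines. This is a simplified sketch; the real logic has additional cases and can be overridden with advanced options such as numa.vcpu.maxPerVirtualNode.

```python
# Hedged sketch of the virtual NUMA node sizing rule described above.
def virtual_numa_node_size(num_vcpus, cores_per_socket, host_cores_per_node):
    """Return the virtual NUMA node size implied by the two vSphere Client settings."""
    if num_vcpus <= 8:
        return None   # virtual NUMA is not exposed by default for 8 or fewer vCPUs
    if cores_per_socket > 1:
        return cores_per_socket       # node size follows the virtual socket size
    return host_cores_per_node        # otherwise mirror the first physical host's topology

print(virtual_numa_node_size(num_vcpus=16, cores_per_socket=8, host_cores_per_node=10))  # 8
print(virtual_numa_node_size(num_vcpus=16, cores_per_socket=1, host_cores_per_node=10))  # 10
```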
When the number of virtual CPUs and the amount of memory used grow proportionately,
you can use the default values. For virtual machines that consume a disproportionally large
amount of memory, you can override the default values in one of the following ways:
 Increase the number of virtual CPUs, even if this number of virtual CPUs is not used. See Change the Number of Virtual CPUs.
 Use advanced options to control virtual NUMA topology and its mapping over physical NUMA topology. See Virtual NUMA Controls.
7 Security
VMware designed the virtualization layer, or VMkernel, to run virtual machines. It controls
the hardware that hosts use and schedules the allocation of hardware resources among the
virtual machines. Because the VMkernel is fully dedicated to supporting virtual machines
and is not used for other purposes, the interface to the VMkernel is strictly limited to the API
required to manage virtual machines.
ESXi provides additional VMkernel protection with the following features:
Memory Hardening: The ESXi kernel, user-mode applications, and executable components such as drivers and libraries are located at random, non-predictable memory addresses. Combined with the non-executable memory protections made available by microprocessors, this provides protection that makes it difficult for malicious code to use memory exploits to take advantage of vulnerabilities.
Kernel Module Integrity: Digital signing ensures the integrity and authenticity of modules, drivers, and applications as they are loaded by the VMkernel. Module signing allows ESXi to identify the providers of modules, drivers, or applications and whether they are VMware-certified. VMware software and certain third-party drivers are signed by VMware.
Trusted Platform Module (TPM): vSphere uses Intel Trusted Platform Module/Trusted Execution Technology (TPM/TXT) to provide remote attestation of the hypervisor image based on hardware root of trust. The hypervisor image comprises the following elements:
o ESXi software (hypervisor) in VIB (package) format
o Third-party VIBs
o Third-party drivers
To leverage this capability, your ESXi system must have TPM and TXT enabled.
When TPM and TXT are enabled, ESXi measures the entire hypervisor stack when the system boots and stores these measurements in the Platform Configuration Registers (PCR) of the TPM. The measurements include the VMkernel, kernel modules, drivers, native management applications that run on ESXi, and any boot-time configuration options. All VIBs that are installed on the system are measured.
Third-party solutions can use this feature to build a verifier that detects tampering of the hypervisor image, by comparing the image with an image of the expected known good values. vSphere does not provide a user interface to view these measurements. The measurements are exposed in a vSphere API. An event log is provided as part of the API, as specified by the Trusted Computing Group (TCG) standard for TXT.
7.1.1 Security and Virtual Machines
Virtual machines are the containers in which applications and guest operating systems run.
By design, all VMware virtual machines are isolated from one another. This isolation enables
multiple virtual machines to run securely while sharing hardware and ensures both their
ability to access hardware and their uninterrupted performance.
Even a user with system administrator privileges on a virtual machine’s guest operating
system cannot breach this layer of isolation to access another virtual machine without
privileges explicitly granted by the ESXi system administrator. As a result of virtual machine
isolation, if a guest operating system running in a virtual machine fails, other virtual
machines on the same host continue to run. The guest operating system failure has no effect
on:
 The ability of users to access the other virtual machines
 The ability of the operational virtual machines to access the resources they need
 The performance of the other virtual machines
Each virtual machine is isolated from other virtual machines running on the same hardware.
Although virtual machines share physical resources such as CPU, memory, and I/O devices,
a guest operating system on an individual virtual machine cannot detect any device other
than the virtual devices made available to it.
Virtual Machine Isolation
Because the VMkernel mediates the physical resources and all physical hardware access
takes place through the VMkernel, virtual machines cannot circumvent this level of isolation.
Just as a physical machine communicates with other machines in a network through a
network card, a virtual machine communicates with other virtual machines running in the
same host through a virtual switch. Further, a virtual machine communicates with the
physical network, including virtual machines on other ESXi hosts, through a physical
network adapter.
Virtual Networking Through Virtual Switches
These characteristics apply to virtual machine isolation in a network context:
 If a virtual machine does not share a virtual switch with any other virtual machine, it is completely isolated from virtual networks within the host.
 If no physical network adapter is configured for a virtual machine, the virtual machine is completely isolated from any physical networks.
 If you use the same safeguards (firewalls, antivirus software, and so forth) to protect a virtual machine from the network as you would for a physical machine, the virtual machine is as secure as the physical machine.
You can further protect virtual machines by setting up resource reservations and limits on
the host. For example, through the detailed resource controls available in ESXi, you can
configure a virtual machine so that it always receives at least 10 percent of the host’s CPU
resources, but never more than 20 percent.
Resource reservations and limits protect virtual machines from performance degradation
that would result if another virtual machine consumed excessive shared hardware resources.
For example, if one of the virtual machines on a host is incapacitated by a denial-of-service
(DoS) attack, a resource limit on that machine prevents the attack from taking up so much of
the hardware resources that the other virtual machines are also affected. Similarly, a
resource reservation on each of the virtual machines ensures that, in the event of high
resource demands by the virtual machine targeted by the DoS attack, all the other virtual
machines still have enough resources to operate.
By default, ESXi imposes a form of resource reservation by applying a distribution algorithm
that divides the available host resources equally among the virtual machines while keeping a
certain percentage of resources for use by other system components. This default behavior
provides a degree of natural protection from DoS and distributed denial-of-service (DDoS)
attacks. You set specific resource reservations and limits on an individual basis to customize
the default behavior so that the distribution is not equal across the virtual machine
configuration.
VMware Security Resources on the Web
 VMware security policy, up-to-date security alerts, security downloads, and focus discussions of security topics: http://www.vmware.com/security/
 Corporate security response policy: http://www.vmware.com/support/policies/security_response.html
VMware is committed to helping you maintain a secure environment. Security issues are corrected in a timely manner. The VMware Security Response Policy states our commitment to resolve possible vulnerabilities in our products.
 Third-party software support policy: http://www.vmware.com/support/policies/
VMware supports a variety of storage systems, software agents such as backup agents, system management agents, and so forth. You can find lists of agents, tools, and other software that supports ESXi by searching http://www.vmware.com/vmtn/resources/ for ESXi compatibility guides.
The industry offers more products and configurations than VMware can test. If VMware does not list a product or configuration in a compatibility guide, Technical Support will attempt to help you with any problems, but cannot guarantee that the product or configuration can be used. Always evaluate security risks for unsupported products or configurations carefully.
 General information about virtualization and security: VMware Virtual Security Technical Resource Center, http://www.vmware.com/go/security/
 Compliance and security standards, as well as partner solutions and in-depth content about virtualization and compliance: http://www.vmware.com/go/compliance/
 Information about VMsafe technology for protection of virtual machines, including a list of partner solutions: http://www.vmware.com/go/vmsafe/
7.1.2 Connecting to the Virtual Machine Console Through a Firewall
When you connect your client to ESXi hosts through vCenter Server, certain ports are
required for user and administrator communication with virtual machine consoles. These
ports support different client functions, interface with different layers on ESXi, and use
different authentication protocols.
Port 902: This is the port that vCenter Server assumes is available for receiving data from ESXi. The vSphere Client uses this port to provide a connection for guest operating system mouse, keyboard, screen (MKS) activities on virtual machines. It is through this port that users interact with the virtual machine guest operating systems and applications. Port 902 is the port that the vSphere Client assumes is available when interacting with virtual machines.
Port 902 connects vCenter Server to the host through the VMware Authorization Daemon (vmware-authd). This daemon multiplexes port 902 data to the appropriate recipient for processing. VMware does not support configuring a different port for this connection.
Port 443: The vSphere Client and SDK use this port to send data to vCenter Server managed hosts. Also, the vSphere SDK, when connected directly to ESXi, uses this port to support any management functions related to the server and its virtual machines. Port 443 is the port that clients assume is available when sending data to ESXi. VMware does not support configuring a different port for these connections.
Port 443 connects clients to ESXi through the Tomcat Web service or the SDK. The host process multiplexes port 443 data to the appropriate recipient for processing.
Port 903: The vSphere Client uses this port to provide a connection for guest operating system MKS activities on virtual machines. It is through this port that users interact with the guest operating systems and applications of the virtual machine. Port 903 is the port that the vSphere Client assumes is available when interacting with virtual machines. VMware does not support configuring a different port for this function.
Port 903 connects the vSphere Client to a specified virtual machine configured on ESXi.
The following figure shows the relationships between vSphere Client functions, ports, and
processes.
If you have a firewall between your vCenter Server system and vCenter Server managed host,
open ports 443 and 903 in the firewall to allow data transfer to ESXi hosts from vCenter
Server.
For additional information on configuring the ports, see the firewall system administrator.
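A quick way to verify that the firewall actually passes these ports is a simple TCP probe, for example with standard-library Python; the host name below is a placeholder.

```python
# Quick connectivity probe (generic Python, not a VMware tool) for the console ports
# between the vSphere Client / vCenter Server side of a firewall and an ESXi host.
import socket

def port_open(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

esxi_host = "esxi01.example.local"
for port in (443, 902, 903):
    print(f"{esxi_host}:{port} reachable: {port_open(esxi_host, port)}")
```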
7.1.3 Connecting ESXi Hosts Through Firewalls
If you have a firewall between two ESXi hosts and you want to allow transactions between
the hosts or use vCenter Server to perform any source or target activities, such as vSphere
High Availability (vSphere HA) traffic, migration, cloning, or vMotion, you must configure a
connection through which the managed hosts can receive data.
To configure a connection for receiving data, open ports for traffic from services such as
vSphere High Availability, vMotion, and vSphere Fault Tolerance. See TCP and UDP Ports
for Management Access for a list of ports. Refer to the firewall system administrator for
additional information on configuring the ports.
7.1.4 TCP and UDP Ports for Management Access
vCenter Server, ESXi hosts, and other network components are accessed using
predetermined TCP and UDP ports. If you manage network components from outside a
firewall, you might be required to reconfigure the firewall to allow access on the appropriate
ports.
The table lists TCP and UDP ports, and the purpose and the type of each. Ports that are open
by default at installation time are indicated by (Default).
TCP and UDP Ports
Port 22: SSH Server. Traffic type: Incoming TCP.
Port 53 (Default): DNS Client. Traffic type: Incoming and outgoing UDP.
Port 68 (Default): DHCP Client. Traffic type: Incoming and outgoing UDP.
Port 80 (Default): HTTP access. The default non-secure TCP Web port typically used in conjunction with port 443 as a front end for access to ESXi networks from the Web. Port 80 redirects traffic to an HTTPS landing page (port 443). Also used for WS-Management and vSphere Fault Tolerance (FT) (outgoing TCP, UDP). Traffic type: Incoming TCP; outgoing TCP, UDP.
Port 111 (Default): RPC service used for the NIS register by vCenter Virtual Appliance. Traffic type: Incoming and outgoing TCP.
Port 123: NTP Client. Traffic type: Outgoing UDP.
Port 135 (Default): Used to join vCenter Virtual Appliance to an Active Directory domain. Traffic type: Incoming and outgoing TCP.
Port 161 (Default): SNMP Server. Traffic type: Incoming UDP.
Port 427 (Default): The CIM client uses the Service Location Protocol, version 2 (SLPv2) to find CIM servers. Traffic type: Incoming and outgoing UDP.
Port 443 (Default): HTTPS access; vCenter Server access to ESXi hosts; default SSL Web port; vSphere Client access to vCenter Server; vSphere Client access to ESXi hosts; WS-Management; vSphere Client access to vSphere Update Manager; third-party network management client connections to vCenter Server; third-party network management clients access to hosts. Traffic type: Incoming TCP.
7.1.5 Security Considerations for VLANs
The way you set up VLANs to secure parts of a network depends on factors such as the guest
operating system and the way your network equipment is configured.
ESXi features a complete IEEE 802.1q-compliant VLAN implementation. VMware cannot
make specific recommendations on how to set up VLANs, but there are factors to consider
when using a VLAN deployment as part of your security enforcement policy.
7.1.5.1 VLANs as Part of a Broader Security Implementation
VLANs are an effective means of controlling where and how widely data is transmitted
within the network. If an attacker gains access to the network, the attack is likely to be
limited to the VLAN that served as the entry point, lessening the risk to the network as a
whole.
VLANs provide protection only in that they control how data is routed and contained after it
passes through the switches and enters the network. You can use VLANs to help secure
Layer 2 of your network architecture—the data link layer. However, configuring VLANs
does not protect the physical layer of your network model or any of the other layers. Even if
you create VLANs, provide additional protection by securing your hardware (routers, hubs,
and so forth) and encrypting data transmissions.
VLANs are not a substitute for firewalls in your virtual machine configurations. Most
network configurations that include VLANs also include firewalls. If you include VLANs in
your virtual network, be sure that the firewalls that you install are VLAN-aware.
7.1.5.2 Properly Configure VLANs
Equipment misconfiguration and network hardware, firmware, or software defects can make
a VLAN susceptible to VLAN-hopping attacks.
VLAN hopping occurs when an attacker with authorized access to one VLAN creates packets
that trick physical switches into transmitting the packets to another VLAN that the attacker is
not authorized to access. Vulnerability to this type of attack usually results from a switch
being misconfigured for native VLAN operation, in which the switch can receive and
transmit untagged packets.
To help prevent VLAN hopping, keep your equipment up to date by installing hardware and
firmware updates as they become available. Also, follow your vendor’s best practice
guidelines when you configure your equipment.
VMware standard switches do not support the concept of a native VLAN. All data passed on
these switches is appropriately tagged. However, because other switches in the network might
be configured for native VLAN operation, VLANs configured with standard switches can
still be vulnerable to VLAN hopping.
If you plan to use VLANs to enforce network security, disable the native VLAN feature for
all switches unless you have a compelling reason to operate some of your VLANs in native
mode. If you must use native VLAN, see your switch vendor’s configuration guidelines for
this feature.
7.1.6 Standard Switch Protection and VLANs
VMware standard switches provide safeguards against certain threats to VLAN security.
Because of the way that standard switches are designed, they protect VLANs against a
variety of attacks, many of which involve VLAN hopping.
Having this protection does not guarantee that your virtual machine configuration is
invulnerable to other types of attacks. For example, standard switches do not protect the
physical network against these attacks; they protect only the virtual network.
Standard switches and VLANs can protect against the following types of attacks.
MAC flooding: Floods a switch with packets that contain MAC addresses tagged as having come from different sources. Many switches use a content-addressable memory table to learn and store the source address for each packet. When the table is full, the switch can enter a fully open state in which every incoming packet is broadcast on all ports, letting the attacker see all of the switch's traffic. This state might result in packet leakage across VLANs.
Although VMware standard switches store a MAC address table, they do not get the MAC addresses from observable traffic and are not vulnerable to this type of attack.
802.1q and ISL tagging attacks: Force a switch to redirect frames from one VLAN to another by tricking the switch into acting as a trunk and broadcasting the traffic to other VLANs.
VMware standard switches do not perform the dynamic trunking required for this type of attack and, therefore, are not vulnerable.
Double-encapsulation attacks: Occur when an attacker creates a double-encapsulated packet in which the VLAN identifier in the inner tag is different from the VLAN identifier in the outer tag. For backward compatibility, native VLANs strip the outer tag from transmitted packets unless configured to do otherwise. When a native VLAN switch strips the outer tag, only the inner tag is left, and that inner tag routes the packet to a different VLAN than the one identified in the now-missing outer tag.
VMware standard switches drop any double-encapsulated frames that a virtual machine attempts to send on a port configured for a specific VLAN. Therefore, they are not vulnerable to this type of attack.
Multicast brute-force attacks: Involve sending large numbers of multicast frames to a known VLAN almost simultaneously to overload the switch so that it mistakenly allows some of the frames to broadcast to other VLANs.
VMware standard switches do not allow frames to leave their correct broadcast domain (VLAN) and are not vulnerable to this type of attack.
Spanning-tree attacks: Target Spanning-Tree Protocol (STP), which is used to control bridging between parts of the LAN. The attacker sends Bridge Protocol Data Unit (BPDU) packets that attempt to change the network topology, establishing themselves as the root bridge. As the root bridge, the attacker can sniff the contents of transmitted frames.
VMware standard switches do not support STP and are not vulnerable to this type of attack.
Random frame attacks: Involve sending large numbers of packets in which the source and destination addresses stay the same, but in which fields are randomly changed in length, type, or content. The goal of this attack is to force packets to be mistakenly rerouted to a different VLAN.
VMware standard switches are not vulnerable to this type of attack.
Because new security threats develop over time, do not consider this an exhaustive list of
attacks. Regularly check VMware security resources on the Web to learn about security,
recent security alerts, and VMware security tactics.
7.2 Securing Standard Switch Ports
As with physical network adapters, a virtual network adapter can send frames that appear to
be from a different machine or impersonate another machine so that it can receive network
frames intended for that machine. Also, like physical network adapters, a virtual network
adapter can be configured so that it receives frames targeted for other machines.
When you create a standard switch for your network, you add port groups to impose a policy
configuration for the virtual machines and storage systems attached to the switch. You create
virtual ports through the vSphere Client.
As part of adding a port or standard port group to a standard switch, the vSphere Client
configures a security profile for the port. You can use this security profile to ensure that the
host prevents the guest operating systems for its virtual machines from impersonating other
machines on the network. This security feature is implemented so that the guest operating
system responsible for the impersonation does not detect that the impersonation was
prevented.
The security profile determines how strongly you enforce protection against impersonation
and interception attacks on virtual machines. To correctly use the settings in the security
profile, you must understand the basics of how virtual network adapters control transmissions
and how attacks are staged at this level.
Each virtual network adapter has its own MAC address assigned when the adapter is created.
This address is called the initial MAC address. Although the initial MAC address can be
reconfigured from outside the guest operating system, it cannot be changed by the guest
operating system. In addition, each adapter has an effective MAC address that filters out
incoming network traffic with a destination MAC address different from the effective MAC
address. The guest operating system is responsible for setting the effective MAC address and
typically matches the effective MAC address to the initial MAC address.
When sending packets, an operating system typically places its own network adapter’s
effective MAC address in the source MAC address field of the Ethernet frame. It also places
the MAC address for the receiving network adapter in the destination MAC address field.
The receiving adapter accepts packets only when the destination MAC address in the packet
matches its own effective MAC address.
Upon creation, a network adapter’s effective MAC address and initial MAC address are the
same. The virtual machine’s operating system can alter the effective MAC address to another
value at any time. If an operating system changes the effective MAC address, its network
adapter receives network traffic destined for the new MAC address. The operating system can
send frames with an impersonated source MAC address at any time. This means an operating
system can stage malicious attacks on the devices in a network by impersonating a network
adapter that the receiving network authorizes.
You can use standard switch security profiles on hosts to protect against this type of attack by
setting three options. If you change any default settings for a port, you must modify the
security profile by editing standard switch settings in the vSphere Client.
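Before the individual options are described, the following toy model summarizes how the three settings constrain a virtual network adapter. It is not ESXi code, broadcast and multicast delivery are deliberately ignored, and Reject corresponds to False in this sketch.

```python
# Toy model of the three port security options: MAC address changes, forged transmits,
# and promiscuous mode.
from dataclasses import dataclass

@dataclass
class PortSecurityPolicy:
    mac_changes: bool = False        # Reject by default in this toy model
    forged_transmits: bool = False
    promiscuous: bool = False

@dataclass
class VirtualNic:
    initial_mac: str
    effective_mac: str

def may_set_effective_mac(policy, nic, new_mac):
    # With MAC Address Changes set to Reject, only the initial MAC is honored
    return policy.mac_changes or new_mac.lower() == nic.initial_mac.lower()

def may_transmit(policy, nic, source_mac):
    # With Forged Transmits set to Reject, the source MAC must match the effective MAC
    return policy.forged_transmits or source_mac.lower() == nic.effective_mac.lower()

def receives_frame(policy, nic, dest_mac):
    # Promiscuous mode delivers everything; otherwise only frames for the effective MAC
    return policy.promiscuous or dest_mac.lower() == nic.effective_mac.lower()

policy = PortSecurityPolicy()
nic = VirtualNic(initial_mac="00:50:56:aa:bb:01", effective_mac="00:50:56:aa:bb:01")
print(may_set_effective_mac(policy, nic, "00:50:56:aa:bb:99"))   # False: change rejected
print(may_transmit(policy, nic, "00:50:56:aa:bb:99"))            # False: forged source dropped
print(receives_frame(policy, nic, "00:50:56:aa:bb:01"))          # True: destined for the NIC
```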
7.2.1 MAC Address Changes
The setting for the MAC Address Changes option affects traffic that a virtual machine
receives.
When the option is set to Accept, ESXi accepts requests to change the effective MAC address
to other than the initial MAC address.
When the option is set to Reject, ESXi does not honor requests to change the effective MAC
address to anything other than the initial MAC address, which protects the host against MAC
impersonation. The port that the virtual adapter used to send the request is disabled and the
virtual adapter does not receive any more frames until it changes the effective MAC address
to match the initial MAC address. The guest operating system does not detect that the MAC
address change was not honored.
Note
The iSCSI initiator relies on being able to get MAC address changes from certain types of
storage. If you are using ESXi iSCSI and have iSCSI storage, set the MAC Address Changes
option to Accept.
In some situations, you might have a legitimate need for more than one adapter to have the
same MAC address on a network—for example, if you are using Microsoft Network Load
Balancing in unicast mode. When Microsoft Network Load Balancing is used in the standard
multicast mode, adapters do not share MAC addresses.
MAC address changes settings affect traffic leaving a virtual machine. MAC address changes will occur if the sender is permitted to make them, even if standard switches or a receiving virtual machine does not permit MAC address changes.
7.2.2 Forged Transmissions
The setting for the Forged Transmits option affects traffic that is transmitted from a virtual
machine.
When the option is set to Accept, ESXi does not compare source and effective MAC
addresses.
To protect against MAC impersonation, you can set this option to Reject. If you do, the host
compares the source MAC address being transmitted by the operating system with the
effective MAC address for its adapter to see if they match. If the addresses do not match,
ESXi drops the packet.
The guest operating system does not detect that its virtual network adapter cannot send
packets by using the impersonated MAC address. The ESXi host intercepts any packets with
impersonated addresses before they are delivered, and the guest operating system might
assume that the packets are dropped.
7.2.3 Promiscuous Mode Operation
Promiscuous mode eliminates any reception filtering that the virtual network adapter would
perform so that the guest operating system receives all traffic observed on the wire. By
default, the virtual network adapter cannot operate in promiscuous mode.
Although promiscuous mode can be useful for tracking network activity, it is an insecure
mode of operation, because any adapter in promiscuous mode has access to the packets
regardless of whether some of the packets are received only by a particular network adapter.
This means that an administrator or root user within a virtual machine can potentially view
traffic destined for other guest or host operating systems.
Note
In some situations, you might have a legitimate reason to configure a standard switch to
operate in promiscuous mode (for example, if you are running network intrusion detection
software or a packet sniffer).
7.3 Cipher Strength
Transmitting data over insecure connections presents a security risk because malicious users
might be able to scan data as it travels through the network. As a safeguard, network
components commonly encrypt the data so that it cannot be easily read.
To encrypt data, the sending component, such as a gateway or redirector, applies
cryptographic algorithms, or ciphers, to alter the data before transmitting it. The receiving
component uses a key to decrypt the data, returning it to its original form. Several ciphers
are in use, and the level of security that each provides is different. One measure of a cipher’s
ability to protect data is its cipher strength—the number of bits in the encryption key. The
larger the number, the more secure the cipher.
To ensure the protection of the data transmitted to and from external network connections,
ESXi uses one of the strongest block ciphers available—256-bit AES block encryption. ESXi
also uses 1024-bit RSA for key exchange. These encryption algorithms are the default for the
following connections.
 vSphere Client connections to vCenter Server and to ESXi through the management interface.
 SDK connections to vCenter Server and to ESXi.
 Management interface connections to virtual machines through the VMkernel.
 SSH connections to ESXi through the management interface.
7.3.1 SSH Security
You can use SSH to remotely log in to the ESXi Shell and perform troubleshooting tasks for
the host.
SSH configuration in ESXi is enhanced to provide a high security level.
 Version 1 SSH protocol disabled: VMware does not support Version 1 SSH protocol and uses Version 2 protocol exclusively. Version 2 eliminates certain security problems present in Version 1 and provides you with a safe way to communicate with the management interface.
 Improved cipher strength: SSH supports only 256-bit and 128-bit AES ciphers for your connections.
These settings are designed to provide solid protection for the data you transmit to the
management interface through SSH. If this configuration is too restrictive for your needs, you can lower the security parameters.
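As a quick sanity check, you can inspect the SSH daemon configuration on the host to confirm which ciphers it offers. This is a hedged example: the file path follows the usual ESXi layout, and the exact cipher list in the output may differ between builds.
    # List the ciphers the ESXi SSH daemon is configured to offer
    grep -i "^Ciphers" /etc/ssh/sshd_config
    # Typical output is limited to AES, for example:
    # Ciphers aes256-ctr,aes128-ctr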
7.4 Control CIM-Based Hardware Monitoring Tool Access
The Common Information Model (CIM) system provides an interface that enables hardware-level management from remote applications using a set of standard APIs. To ensure that the
CIM interface is secure, provide only the minimum access necessary to these applications. If
an application has been provisioned with a root or full administrator account and the
application is compromised, the full virtual environment might be compromised.
CIM is an open standard that defines a framework for agent-less, standards-based
monitoring of hardware resources for ESXi. This framework consists of a CIM object
manager, often called a CIM broker, and a set of CIM providers.
CIM providers are used as the mechanism to provide management access to device drivers
and underlying hardware. Hardware vendors, including server manufacturers and specific
hardware device vendors, can write providers to provide monitoring and management of
their particular devices. VMware also writes providers that implement monitoring of server
hardware, ESXi storage infrastructure, and virtualization-specific resources. These providers
run inside the ESXi system and therefore are designed to be extremely lightweight and
focused on specific management tasks. The CIM broker takes information from all CIM
providers, and presents it to the outside world via standard APIs, the most common one
being WS-MAN.
Do not provide root credentials to remote applications to access the CIM interface. Instead,
create a service account specific to these applications and grant read-only access to CIM
information to any local account defined on the ESXi system, as well as any role defined in
vCenter Server.
Procedure
 Create a service account specific to CIM applications.
 Grant read-only access to CIM information to any local account defined on the ESXi system, as well as any role defined in vCenter Server.
 (Optional) If the application requires write access to the CIM interface, create a role to apply to the service account with only two privileges:
o Host.Config.SystemManagement
o Host.CIM.CIMInteraction
This role can be local to the host or centrally defined on vCenter Server, depending on how the monitoring application works.
When a user logs into the host with the service account (for example, using the vSphere Client), the user has only the privileges SystemManagement and CIMInteraction, or read-only access.
7.5 General Security Recommendations
To protect the host against unauthorized intrusion and misuse, VMware imposes constraints
on several parameters, settings, and activities. You can loosen the constraints to meet your
configuration needs, but if you do so, make sure that you are working in a trusted
environment and have taken enough other security measures to protect the network as a
whole and the devices connected to the host.
Consider the following recommendations when evaluating host security and administration.
 Limit user access.
To improve security, restrict user access to the management interface and enforce access
security policies like setting up password restrictions.
The ESXi Shell has privileged access to certain parts of the host. Therefore, provide only
trusted users with ESXi Shell login access.
Also, strive to run only essential processes, services, and agents, such as virus checkers and virtual machine backup agents.
 Use the vSphere Client to administer your ESXi hosts.
Whenever possible, use the vSphere Client or a third-party network management tool to
administer your ESXi hosts instead of working through the command-line interface as the
root user. Using the vSphere Client lets you limit the accounts with access to the ESXi Shell,
safely delegate responsibilities, and set up roles that prevent administrators and users from
using capabilities they do not need.
 Use only VMware sources to upgrade ESXi components.
The host runs a variety of third-party packages to support management interfaces or tasks
that you must perform. VMware does not support upgrading these packages from anything
other than a VMware source. If you use a download or patch from another source, you might
compromise management interface security or functions. Regularly check third-party vendor
sites and the VMware knowledge base for security alerts.
In addition to implementing the firewall, risks to the hosts are mitigated using other
methods.
 ESXi runs only services essential to managing its functions, and the distribution is limited to the features required to run ESXi.
 By default, all ports not specifically required for management access to the host are closed. You must specifically open ports if you need additional services.
 By default, weak ciphers are disabled and all communications from clients are secured by SSL. The exact algorithms used for securing the channel depend on the SSL handshake. Default certificates created on ESXi use SHA-1 with RSA encryption as the signature algorithm.
 The Tomcat Web service, used internally by ESXi to support access by Web clients, has been modified to run only those functions required for administration and monitoring by a Web client. As a result, ESXi is not vulnerable to the Tomcat security issues reported in broader use.
 VMware monitors all security alerts that could affect ESXi security and, if needed, issues a security patch.
 Insecure services such as FTP and Telnet are not installed, and the ports for these services are closed by default. Because more secure services such as SSH and SFTP are easily available, always avoid using these insecure services in favor of their safer alternatives. If you must use insecure services and have implemented sufficient protection for the host, you must explicitly open ports to support them.
7.6 ESXi Firewall Configuration
ESXi includes a firewall between the management interface and the network. The firewall is
enabled by default.
At installation time, the ESXi firewall is configured to block incoming and outgoing traffic,
except traffic for the default services listed in TCP and UDP Ports for Management Access.
Note
The firewall also allows Internet Control Message Protocol (ICMP) pings and
communication with DHCP and DNS (UDP only) clients.
Supported services and management agents that are required to operate the host are described
in a rule set configuration file in the ESXi firewall directory /etc/vmware/firewall/. The file
contains firewall rules and lists each rule's relationship with ports and protocols.
You cannot add a rule to the ESXi firewall unless you create and install a VIB that contains
the rule set configuration file. The VIB authoring tool is available to VMware partners.
Note
The behavior of the NFS Client rule set (nfsClient) is different from other rule sets. When the
NFS Client rule set is enabled, all outbound TCP ports are open for the destination hosts in
the list of allowed IP addresses. See NFS Client Rule Set Behavior for more information.
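Day-to-day firewall operations (as opposed to adding new rule sets, which requires a VIB) can be performed with esxcli. The following sketch assumes ESXi 5.x, uses the built-in sshServer rule set, and restricts it to an example management subnet; adjust rule set names and addresses for your environment.
    # Confirm the firewall is enabled and list the known rule sets
    esxcli network firewall get
    esxcli network firewall ruleset list
    # Enable an existing rule set (sshServer is built in)
    esxcli network firewall ruleset set --ruleset-id=sshServer --enabled=true
    # Restrict the rule set to a specific subnet (subnet is an example)
    esxcli network firewall ruleset set --ruleset-id=sshServer --allowed-all=false
    esxcli network firewall ruleset allowedip add --ruleset-id=sshServer --ip-address=192.168.10.0/24
    esxcli network firewall ruleset allowedip list --ruleset-id=sshServer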
7.6.1 Rule Set Configuration Files
A rule set configuration file contains firewall rules and describes each rule's relationship with
ports and protocols. The rule set configuration file can contain rule sets for multiple services.
Rule set configuration files are located in the /etc/vmware/firewall/ directory. To add a
service to the host security profile, VMware partners can create a VIB that contains the port
rules for the service in a configuration file. VIB authoring tools are available to VMware
partners.
The ESXi 5.x ruleset.xml format is the same as in ESX and ESXi 4.x, but has two additional
tags: enabled and required. The ESXi 5.x firewall continues to support the 4.x ruleset.xml
format.
Each set of rules for a service in the rule set configuration file contains the following
information.
 A numeric identifier for the service, if the configuration file contains more than one service.
 A unique identifier for the rule set, usually the name of the service.
 For each rule, the file contains one or more port rules, each with a definition for direction, protocol, port type, and port number or range of port numbers.
 A flag indicating whether the service is enabled or disabled when the rule set is applied.
 An indication of whether the rule set is required and cannot be disabled.
(An illustrative example of this file format is shown below.)
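For illustration, a rule set configuration file of the kind described above might look like the following. This is a hedged sketch only: the service name (backupAgent), identifiers, and port number are hypothetical, and in practice such a file is delivered to /etc/vmware/firewall/ inside a VIB by a VMware partner rather than edited in place on the host.
    # Hypothetical rule set file (name, IDs, and port are examples only)
    cat /etc/vmware/firewall/backupAgent.xml
    <ConfigRoot>
      <service id="0042">
        <id>backupAgent</id>
        <rule id="0000">
          <direction>inbound</direction>
          <protocol>tcp</protocol>
          <porttype>dst</porttype>
          <port>9001</port>
        </rule>
        <enabled>true</enabled>
        <required>false</required>
      </service>
    </ConfigRoot>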
Only users with the Administrator role can access the ESXi Shell. Users who are
in the Active Directory group ESX Admins are automatically assigned the
Administrator role. Any user with the Administrator role can execute system
commands (such as vmware -v) using the ESXi Shell.
7.6.2 Lockdown Mode
To increase the security of your ESXi hosts, you can put them in lockdown mode.
When you enable lockdown mode, no users other than vpxuser have authentication
permissions, nor can they perform operations against the host directly. Lockdown mode
forces all operations to be performed through vCenter Server.
When a host is in lockdown mode, you cannot run vSphere CLI commands from an
administration server, from a script, or from vMA against the host. External software or
management tools might not be able to retrieve or modify information from the ESXi host.
Note
Users with the DCUI Access privilege are authorized to log in to the Direct Console User
Interface (DCUI) when lockdown mode is enabled. When you disable lockdown mode using
the DCUI, all users with the DCUI Access privilege are granted the Administrator role on the
host. You grant the DCUI Access privilege in Advanced Settings.
Enabling or disabling lockdown mode affects which types of users are authorized to access
host services, but it does not affect the availability of those services. In other words, if the
ESXi Shell, SSH, or Direct Console User Interface (DCUI) services are enabled, they will
continue to run whether or not the host is in lockdown mode.
You can enable lockdown mode using the Add Host wizard to add a host to vCenter Server,
using the vSphere Client to manage a host, or using the Direct Console User Interface
(DCUI).
Note
If you enable or disable lockdown mode using the Direct Console User Interface (DCUI),
permissions for users and groups on the host are discarded. To preserve these permissions,
you must enable and disable lockdown mode using the vSphere Client connected to vCenter
Server.
Lockdown mode is only available on ESXi hosts that have been added to vCenter Server.
This chapter includes the following topics:
 Lockdown Mode Behavior
 Lockdown Mode Configurations
 Enable Lockdown Mode Using the vSphere Client
 Enable Lockdown Mode Using the vSphere Web Client
 Enable Lockdown Mode from the Direct Console User Interface
7.7 Lockdown Mode Behavior
Enabling lockdown mode affects which users are authorized to access host services.
Users who were logged in to the ESXi Shell before lockdown mode was enabled remain
logged in and can run commands. However, these users cannot disable lockdown mode. No
other users, including the root user and users with the Administrator role on the host, can
use the ESXi Shell to log in to a host that is in lockdown mode.
Users with administrator privileges on the vCenter Server system can use the vSphere Client
to disable lockdown mode for hosts that are managed by the vCenter Server system. Users
granted the DCUI Access privilege can always log directly in to the host using the Direct
Console User Interface (DCUI) to disable lockdown mode, even if the user does not have the
Administrator role on the host. You must use Advanced Settings to grant the DCUI Access
privilege.
Note
When you disable lockdown mode using the DCUI, all users with the DCUI Access privilege
are granted the Administrator role on the host.
Root users or users with the Administrator role on the host cannot log directly in to the host
with the DCUI if they have not been granted the DCUI Access privilege. If the
host is not managed by vCenter Server or if the host is unreachable, only DCUI Access users
can log into the DCUI and disable lockdown mode. If the DCUI service is stopped, you must
reinstall ESXi.
Different services are available to different types of users when the host is running in
lockdown mode, compared to when the host is running in normal mode. Nonroot users
cannot run system commands in the ESXi Shell.
Lockdown Mode Behavior
 vSphere WebServices API
o Normal mode: All users, based on ESXi permissions
o Lockdown mode: vCenter only (vpxuser)
 CIM Providers
o Normal mode: Root users and users with Admin role on the host
o Lockdown mode: vCenter only (ticket)
 Direct Console UI (DCUI)
o Normal mode: Users with Admin role on the host and users with the DCUI Access privilege
o Lockdown mode: Users with the DCUI Access privilege
 ESXi Shell
o Normal mode: Users with Admin role on the host
o Lockdown mode: No users
 SSH
o Normal mode: Users with Admin role on the host
o Lockdown mode: No users
7.8 Lockdown Mode Configurations
You can enable or disable remote and local access to the ESXi Shell to create different
lockdown mode configurations.
The following table lists which services are enabled for three typical configurations.
Caution
If you lose access to vCenter Server while running in Total Lockdown Mode, you must
reinstall ESXi to gain access to the host.
Lockdown Mode Configurations
 Lockdown
o Default configuration: Off
o Recommended configuration: On
o Total lockdown configuration: On
 ESXi Shell
o Default configuration: Off
o Recommended configuration: Off
o Total lockdown configuration: Off
 SSH
o Default configuration: Off
o Recommended configuration: Off
o Total lockdown configuration: Off
 Direct Console UI (DCUI)
o Default configuration: On
o Recommended configuration: On
o Total lockdown configuration: Off
7.9 ESXi Authentication and User Management
A user is an individual authorized to log in to ESXi or vCenter Server.
In vSphere 5.1, ESXi user management has the following caveats.
 You cannot create ESXi users with the vSphere Web Client. You must log directly into the host with the vSphere Client to create ESXi users.
 ESXi 5.1 does not support local groups. However, Active Directory groups are supported.
To prevent anonymous users such as root from accessing the host with the Direct Console
User Interface (DCUI) or ESXi Shell, remove the user's administrator privileges on the root
folder of the host. This applies to both local users and Active Directory users and groups.
Most inventory objects inherit permissions from a single parent object in the hierarchy. For
example, a datastore inherits permissions from either its parent datastore folder or parent
datacenter. Virtual machines inherit permissions from both the parent virtual machine
folder and the parent host, cluster, or resource pool simultaneously. To restrict a user’s
privileges on a virtual machine, you must set permissions on both the parent folder and the parent host, cluster, or resource pool for that virtual machine.
7.9.1 Multiple Permission Settings
Objects might have multiple permissions, but only one permission for each user or group.
Permissions applied on a child object always override permissions that are applied on a
parent object. Virtual machine folders and resource pools are equivalent levels in the
hierarchy. If you assign propagating permissions to a user or group on a virtual machine's
folder and its resource pool, the user has the privileges propagated from the resource pool
and from the folder.
If multiple group permissions are defined on the same object and the user belongs to two or
more of those groups, two situations are possible:
 If no permission is defined for the user on that object, the user is assigned the set of privileges assigned to the groups for that object.
 If a permission is defined for the user on that object, the user's permission takes precedence over all group permissions.
7.9.1.1 Example 1: Inheritance of Multiple Permissions
This example illustrates how an object can inherit multiple permissions from groups that are
granted permission on a parent object.
In this example, two permissions are assigned on the same object for two different groups.
 Role 1 can power on virtual machines.
 Role 2 can take snapshots of virtual machines.
 Group A is granted Role 1 on VM Folder, with the permission set to propagate to child objects.
 Group B is granted Role 2 on VM Folder, with the permission set to propagate to child objects.
 User 1 is not assigned specific permission.
User 1, who belongs to groups A and B, logs on. User 1 can both power on and take snapshots
of VM A and VM B.
Example 1: Inheritance of Multiple Permissions
7.9.1.2 Example 2: Child Permissions Overriding Parent Permissions
This example illustrates how permissions that are assigned on a child object can override
permissions that are assigned on a parent object. You can use this overriding behavior to
restrict user access to particular areas of the inventory.
In this example, permissions are assigned to two different groups on two different objects.
 Role 1 can power on virtual machines.
 Role 2 can take snapshots of virtual machines.
 Group A is granted Role 1 on VM Folder, with the permission set to propagate to child objects.
 Group B is granted Role 2 on VM B.
User 1, who belongs to groups A and B, logs on. Because Role 2 is assigned at a lower point in
the hierarchy than Role 1, it overrides Role 1 on VM B. User 1 can power on VM A, but not
take snapshots. User 1 can take snapshots of VM B, but not power it on.
Example 2: Child Permissions Overriding Parent Permissions
7.9.1.3 Example 3: User Permissions Overriding Group Permissions
This example illustrates how permissions assigned directly to an individual user override
permissions assigned to a group that the user is a member of.
In this example, permissions are assigned to a user and to a group on the same object.
 Role 1 can power on virtual machines.
 Group A is granted Role 1 on VM Folder.
 User 1 is granted No Access role on VM Folder.
User 1, who belongs to group A, logs on. The No Access role granted to User 1 on VM Folder
overrides the group permission. User 1 has no access to VM Folder or VMs A and B.
Example 3: User Permissions Overriding Group Permissions
7.9.2 root User Permissions
Root users can only perform activities on the specific host that they are logged in to.
For security reasons, you might not want to use the root user in the Administrator role. In
this case, you can change permissions after installation so that the root user no longer has
administrative privileges. Alternatively, you can remove the access permissions for the root
user. (Do not remove the root user itself.)
Important
If you remove the access permissions for the root user, you must first create another
permission at the root level that has a different user assigned to the Administrator role.
Note
In vSphere 5.1, only the root user and no other user with administrator privileges is
permitted to add a host to vCenter Server.
Assigning the Administrator role to a different user helps you maintain security through
traceability. The vSphere Client logs all actions that the Administrator role user initiates as
events, providing you with an audit trail. If all administrators log in as the root user, you
cannot tell which administrator performed an action. If you create multiple permissions at
the root level—each associated with a different user—you can track the actions of each
administrator.
7.10 Best Practices for Roles and Permissions
Use best practices for roles and permissions to maximize the security and manageability of
your vCenter Server environment.
VMware recommends the following best practices when configuring roles and permissions in
your vCenter Server environment:
 Where possible, grant permissions to groups rather than individual users.
 Grant permissions only where needed. Using the minimum number of permissions makes it easier to understand and manage your permissions structure.
 If you assign a restrictive role to a group, check that the group does not contain the Administrator user or other users with administrative privileges. Otherwise, you could unintentionally restrict administrators' privileges in parts of the inventory hierarchy where you have assigned that group the restrictive role.
 Use folders to group objects to correspond to the differing permissions you want to grant for them.
 Use caution when granting a permission at the root vCenter Server level. Users with permissions at the root level have access to global data on vCenter Server, such as roles, custom attributes, vCenter Server settings, and licenses. Changes to licenses and roles propagate to all vCenter Server systems in a Linked Mode group, even if the user does not have permissions on all of the vCenter Server systems in the group.
 In most cases, enable propagation on permissions. This ensures that when new objects are inserted into the inventory hierarchy, they inherit permissions and are accessible to users.
 Use the No Access role to mask specific areas of the hierarchy that you don't want particular users to have access to.
7.11 Replace a Default ESXi Certificate with a CA-Signed Certificate
ESXi uses automatically generated certificates that are created as part of the installation
process. These certificates are unique and make it possible to begin using the server, but they
are not verifiable and they are not signed by a trusted, well-known certificate authority (CA).
Using default certificates might not comply with the security policy of your organization. If
you require a certificate from a trusted certificate authority, you can replace the default
certificate.
Note
If the host has Verify Certificates enabled, replacing the default certificate might cause
vCenter Server to stop managing the host. If the new certificate is not verifiable by vCenter
Server, you must reconnect the host using the vSphere Client.
ESXi supports only X.509 certificates to encrypt session information sent over SSL
connections between server and client components.
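On ESXi 5.x the default certificate and key live in /etc/vmware/ssl/ as rui.crt and rui.key. The following is a minimal sketch of a manual replacement; the source file names under /tmp are assumptions, and you should confirm the procedure against the vSphere Security documentation for your exact version before applying it.
    # Back up the existing self-signed certificate and key
    mv /etc/vmware/ssl/rui.crt /etc/vmware/ssl/orig.rui.crt
    mv /etc/vmware/ssl/rui.key /etc/vmware/ssl/orig.rui.key
    # Copy in the CA-signed certificate and private key (source paths are examples)
    cp /tmp/rui.crt /etc/vmware/ssl/rui.crt
    cp /tmp/rui.key /etc/vmware/ssl/rui.key
    # Restart the management agents so the new certificate takes effect
    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart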
7.12 Modifying ESXi Web Proxy Settings
When you modify Web proxy settings, you have several encryption and user security
guidelines to consider.
Note
Restart the host process after making any changes to host directories or authentication
mechanisms.
Do not set up certificates using pass phrases. ESXi does not support pass phrases, also
known as encrypted keys. If you set up a pass phrase, ESXi processes cannot start correctly.
 You can configure the Web proxy so that it searches for certificates in a location other than the default location. This capability proves useful for companies that prefer to centralize their certificates on a single machine so that multiple hosts can use the certificates.
Caution
If certificates are not stored locally on the host—for example, if they are stored on an NFS
share—the host cannot access those certificates if ESXi loses network connectivity. As a
result, a client connecting to the host cannot successfully participate in a secure SSL
handshake with the host.
 To support encryption for user names, passwords, and packets, SSL is enabled by default for vSphere Web Services SDK connections. If you want to configure these connections so that they do not encrypt transmissions, disable SSL for your vSphere Web Services SDK connection by switching the connection from HTTPS to HTTP.
Consider disabling SSL only if you created a fully trusted environment for these clients,
where firewalls are in place and transmissions to and from the host are fully isolated.
Disabling SSL can improve performance, because you avoid the overhead required to
perform encryption.
 To protect against misuse of ESXi services, most internal ESXi services are accessible only through port 443, the port used for HTTPS transmission. Port 443 acts as a reverse proxy for ESXi. You can see a list of services on ESXi through an HTTP welcome page, but you cannot directly access the Storage Adapters services without proper authorization. (A hedged pointer to the underlying configuration file follows this list.)
You can change this configuration so that individual services are directly accessible through
HTTP connections. Do not make this change unless you are using ESXi in a fully trusted
environment.
 When you upgrade vCenter Server, the certificate remains in place.
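The per-service HTTP/HTTPS behaviour behind the port 443 reverse proxy is controlled by the host's proxy configuration. As a hedged pointer only: on ESXi 5.x this is commonly the file /etc/vmware/hostd/proxy.xml, where each service endpoint carries an accessMode value such as httpsWithRedirect, httpsOnly, httpAndHttps, or httpOnly. Verify the file location and accepted values against the vSphere Security documentation for your version before changing anything.
    # Inspect the reverse-proxy endpoint configuration (file path assumed for ESXi 5.x)
    cat /etc/vmware/hostd/proxy.xml
    # After editing, restart the management agent for changes to take effect
    /etc/init.d/hostd restart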
7.13 General Virtual Machine Protection
A virtual machine is, in most respects, the equivalent of a physical server. Employ the same
security measures in virtual machines that you do for physical systems.
For example, ensure that antivirus, anti-spyware, intrusion detection, and other protection are
enabled for every virtual machine in your virtual infrastructure. Keep all security measures
up-to-date, including applying appropriate patches. It is especially important to keep track of
updates for dormant virtual machines that are powered off, because it can be easy to overlook
them.
7.13.1 Disable Unnecessary Functions Inside Virtual Machines
Any service running in a virtual machine provides a potential avenue of attack. By disabling system components that are not necessary to support the application or service running on the system, you reduce the number of components that can be attacked.
Virtual machines do not usually require as many services or functions as physical servers.
When you virtualize a system, evaluate whether a particular service or function is necessary.
Procedure
 Disable unused services in the operating system. For example, if the system runs a file server, turn off any Web services.
 Disconnect unused physical devices, such as CD/DVD drives, floppy drives, and USB adaptors. See Removing Unnecessary Hardware Devices.
 Turn off screen savers.
 Do not run the X Window system on Linux, BSD, or Solaris guest operating systems unless it is necessary.
7.13.2 Use Templates to Deploy Virtual Machines
When you manually install guest operating systems and applications on a virtual machine,
you introduce a risk of misconfiguration. By using a template to capture a hardened base
operating system image with no applications installed, you can ensure that all virtual
machines are created with a known baseline level of security.
You can use these templates to create other, application-specific templates, or you can use the
application template to deploy virtual machines.
Procedure
1. Provide templates for virtual machine creation that contain hardened, patched, and
properly configured operating system deployments.
If possible, deploy applications in templates as well. Ensure that the applications do
not depend on information specific to the virtual machine to be deployed.
What to do next
You can convert a template to a virtual machine and back to a template in the vSphere Client,
which makes updating templates easy. For more information about templates, see the vSphere
Virtual Machine Administration documentation.
7.13.3 Prevent Virtual Machines from Taking Over Resources
When one virtual machine consumes so much of the host resources that other virtual
machines on the host cannot perform their intended functions, a Denial of Service (DoS)
might occur. To prevent a virtual machine from causing a DoS, use host resource
management features such as setting shares and limits to control the server resources
that a virtual machine consumes.
By default, all virtual machines on a host share resources equally.
Procedure
1. Use shares or reservations to guarantee resources to critical virtual machines.
Limits constrain resource consumption by virtual machines that have a greater risk of
being exploited or attacked, or that run applications that are known to have the
potential to greatly consume resources.
7.14 Removing Unnecessary Hardware Devices
Any enabled or connected device represents a potential attack channel. Users and processes
without privileges on a virtual machine can connect or disconnect hardware devices, such as
network adapters and CD-ROM drives. Attackers can use this capability to breach virtual
machine security. Removing unnecessary hardware devices can help prevent attacks.
Use the following guidelines to increase virtual machine security.
 Ensure that unauthorized devices are not connected and remove any unneeded or unused hardware devices.
 Disable unnecessary virtual devices from within a virtual machine. An attacker with access to a virtual machine can connect a disconnected CD-ROM drive and access sensitive information on the media left in the drive, or disconnect a network adapter to isolate the virtual machine from its network, resulting in a denial of service.
 Ensure that no device is connected to a virtual machine if it is not required. Serial and parallel ports are rarely used for virtual machines in a datacenter environment, and CD/DVD drives are usually connected only temporarily during software installation.
 For less commonly used devices that are not required, either the parameter should not be
7.15 Securing vCenter Server Systems
7.15.1 Hardening the vCenter Server Host Operating System
Protect the host where vCenter Server is running against vulnerabilities and attacks by
ensuring that the operating system of the host (Windows or Linux) is as secure as possible.
 Maintain a supported operating system, database, and hardware for the vCenter Server system. If vCenter Server is not running on a supported operating system, it might not run properly, making vCenter Server vulnerable to attacks.
 Keep the vCenter Server system properly patched. By staying up-to-date with operating system patches, the server is less vulnerable to attack.
 Provide operating system protection on the vCenter Server host. Protection includes antivirus and antimalware software.
For operating system and database compatibility information, see the vSphere Compatibility
Matrixes.
7.15.2 Best Practices for vCenter Server Privileges
Strictly control vCenter Server administrator privileges to increase security for the system.
 Full administrative rights to vCenter Server should be removed from the local Windows administrator account and granted to a special-purpose local vCenter Server administrator account. Grant full vSphere administrative rights only to those administrators who are required to have it. Do not grant this privilege to any group whose membership is not strictly controlled.
 Avoid allowing users to log in directly to the vCenter Server system. Allow only those users who have legitimate tasks to perform to log into the system and ensure that these events are audited.
 Install vCenter Server using a service account instead of a Windows account. You can use a service account or a Windows account to run vCenter Server. Using a service account allows you to enable Windows authentication for SQL Server, which provides more security. The service account must be an administrator on the local machine.
 Check for privilege reassignment when you restart vCenter Server. If the user or user group that is assigned the Administrator role on the root folder of the server cannot be verified as a valid user or group, the Administrator privileges are removed and assigned to the local Windows Administrators group.
 Grant minimal privileges to the vCenter Server database user. The database user requires only certain privileges specific to database access. In addition, some privileges are required only for installation and upgrade. These can be removed after the product is installed or upgraded.
7.15.3 Restrict Use of the Administrator Privilege
By default, vCenter Server grants full administrator privileges to the administrator of the
local system, which can be accessed by domain administrators. To minimize risk of this
privilege being abused, remove administrative rights from the local operating system's
administrator account and assign these rights to a special-purpose local vSphere
administrator account. Use the local vSphere account to create individual user accounts.
Grant the Administrator privilege only to administrators who are required to have it. Do not
grant the privilege to any group whose membership is not strictly controlled.
Procedure
1. Create a user account that you will use to manage vCenter Server (for example, vi-admin).
2. Ensure that the user does not belong to any local groups, such as the Administrators
group.
3. Log into the vCenter Server system as the local operating system administrator and
grant the role of global vCenter Server administrator to the user account you created
(for example, vi-admin).
4. Log out of vCenter Server and log in with the user account you created (vi-admin).
5. Verify that the user can perform all tasks available to a vCenter Server administrator.
6. Remove the administrator privileges that are assigned to the local operating system
administrator user or group.
7.15.4 Restrict Use of the Administrator Role
Secure the vCenter Server Administrator role and assign it only to certain users.
7.16 Best Practices for Virtual Machine and Host Security
7.17 Installing Antivirus Software
Because each virtual machine hosts a standard operating system, consider protecting it from
viruses by installing antivirus software. Depending on how you are using the virtual
machine, you might also want to install a software firewall.
Stagger the schedule for virus scans, particularly in deployments with a large number of
virtual machines. Performance of systems in your environment will degrade significantly if
you scan all virtual machines simultaneously.
Because software firewalls and antivirus software can be virtualization-intensive, you can
balance the need for these two security measures against virtual machine performance,
especially if you are confident that your virtual machines are in a fully trusted environment.
7.18 Managing ESXi Log Files
Log files are an important component of troubleshooting attacks and obtaining information about breaches of host security. Logging to a secure, centralized log server can help prevent
log tampering. Remote logging also provides a long-term audit record.
Take the following measures to increase the security of the host.
 Configure persistent logging to a datastore. By default, the logs on ESXi hosts are stored in the in-memory file system. Therefore, they are lost when you reboot the host, and only 24 hours of log data is stored. When you enable persistent logging, you have a dedicated record of server activity available for the host.
 Remote logging to a central host allows you to gather log files onto a central host, where you can monitor all hosts with a single tool. You can also do aggregate analysis and searching of log data, which might reveal information about things like coordinated attacks on multiple hosts.
 Configure remote secure syslog on ESXi hosts using a remote command line such as vCLI or PowerCLI, or using an API client.
 Query the syslog configuration to make sure that a valid syslog server has been configured, including the correct port.
7.18.1 Configure Syslog on ESXi Hosts
All ESXi hosts run a syslog service (vmsyslogd), which logs messages from the VMkernel and
other system components to log files.
You can use the vSphere Client or the esxcli system syslog vCLI command to configure the
syslog service.
For more information about using vCLI commands, see Getting Started with vSphere
Command-Line Interfaces.
Procedure
1. In the vSphere Client inventory, select the host.
2. Click the Configuration tab.
3. In the Software panel, click Advanced Settings.
4. Select Syslog in the tree control.
5. To set up logging globally, click global and make changes to the fields on the right.
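The same settings can be applied with the esxcli system syslog command mentioned above. The following sketch assumes ESXi 5.x; the datastore path and log host name are placeholders for your environment.
    # Send logs to a remote collector over SSL and keep a persistent local copy
    esxcli system syslog config set --loghost='ssl://loghost.example.com:1514' --logdir=/vmfs/volumes/datastore1/logs --logdir-unique=true
    # Open the outbound syslog ports in the ESXi firewall
    esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true
    # Reload the syslog daemon and verify the resulting configuration
    esxcli system syslog reload
    esxcli system syslog config get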
7.19 Securing Fault Tolerance Logging Traffic
When you enable Fault Tolerance (FT), VMware vLockstep captures inputs and events that
occur on a Primary VM and sends them to the Secondary VM, which is running on another
host.
This logging traffic between the Primary and Secondary VMs is unencrypted and contains
guest network and storage I/O data, as well as the memory contents of the guest operating
system. This traffic can include sensitive data such as passwords in plaintext. To avoid such
data being divulged, ensure that this network is secured, especially to avoid "man-in-the-middle" attacks. For example, use a private network for FT logging traffic.
7.20 Auto Deploy Security Considerations
To best protect your environment, be aware of security risks that might exist when you use
Auto Deploy with host profiles.
In most cases, administrators set up Auto Deploy to provision target hosts not only with an
image, but also with a host profile. The host profile includes configuration information such
as authentication or network settings. Host profiles can be set up to prompt the user for
input on first boot. The user input is stored in an answer file. The host profile and answer file
(if applicable) are included in the boot image that Auto Deploy downloads to a machine.
 The administrator password and user passwords that are included with the host profile and answer file are MD5-encrypted. Any other passwords associated with host profiles are in the clear.
 Use the vSphere Authentication Service to set up Active Directory to avoid exposing the Active Directory password. If you set up Active Directory using host profiles, the passwords are not protected.
For more information about Auto Deploy, see the Auto Deploy information that is part of the
vSphere Installation and Setup documentation. For more information about host profiles
and answer files, see the vSphere Host Profiles documentation.
7.21 Image Builder Security Considerations
To protect the integrity of the ESXi host, do not allow users to install unsigned (community-supported) VIBs. An unsigned VIB contains untested code that is not certified by, accepted by, or supported by VMware or its partners. Community-supported VIBs do not have a digital signature.
The ESXi Image Profile lets you set an acceptance level for the type of VIBs that are allowed
on the host. The acceptance levels include the following.
 VMware Certified. VIBs that are VMware Certified are created, tested, and signed by VMware.
 VMware Accepted. VIBs that are created by a VMware partner, but tested and signed by VMware.
 Partner Supported. VIBs that are created, tested, and signed by a certified VMware partner.
 Community Supported. VIBs that have not been tested by VMware or a VMware partner.
For more information about Image Builder, see the vSphere Installation and Setup
documentation.
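You can check and change the host acceptance level from the command line. A brief sketch, assuming ESXi 5.x:
    # Show the current acceptance level of the host
    esxcli software acceptance get
    # Restrict installation to partner-supported (or better) VIBs
    esxcli software acceptance set --level=PartnerSupported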
7.22 Host Password Strength and Complexity
By default, ESXi uses the pam_passwdqc.so plug-in to set the rules that users must observe
when creating passwords and to check password strength.
The pam_passwdqc.so plug-in lets you determine the basic standards that all passwords
must meet. By default, ESXi imposes no restrictions on the root password. However, when
nonroot users attempt to change their passwords, the passwords they choose must meet the
basic standards that pam_passwdqc.so sets.
A valid password should contain a combination of as many character classes as possible.
Character classes include lowercase letters, uppercase letters, numbers, and special
characters such as an underscore or dash.
Note
When the number of character classes is counted, the plug-in does not count uppercase
letters used as the first character in the password and numbers used as the last character of a
password.
To configure password complexity, you can change the default value of the following
parameters.
 retry is the number of times a user is prompted for a new password if the password candidate is not sufficiently strong.
 N0 is the number of characters required for a password that uses characters from only one character class. For example, the password contains only lowercase letters.
 N1 is the number of characters required for a password that uses characters from two character classes.
 N2 is used for passphrases. ESXi requires three words for a passphrase. Each word in the passphrase must be 8-40 characters long.
 N3 is the number of characters required for a password that uses characters from three character classes.
 N4 is the number of characters required for a password that uses characters from all four character classes.
 match is the number of characters allowed in a string that is reused from the old password. If the pam_passwdqc.so plug-in finds a reused string of this length or longer, it disqualifies the string from the strength test and uses only the remaining characters.
Setting any of these options to -1 directs the pam_passwdqc.so plug-in to ignore the
requirement.
Setting any of these options to disabled directs the pam_passwdqc.so plug-in to disqualify
passwords with the associated characteristic. The values used must be in descending order
except for -1 and disabled.
Note
The pam_passwdqc.so plug-in used in Linux provides more parameters than the parameters
supported for ESXi.
For more information on the pam_passwdqc.so plug-in, see your Linux documentation.
7.22.1 Change Default Password Complexity for the pam_passwdqc.so Plug-In
Configure the pam_passwdqc.so plug-in to determine the basic standards all passwords
must meet.
Procedure
1. Log in to the ESXi Shell as a user with administrator privileges.
2. Open the passwd file with a text editor. For example, vi /etc/pam.d/passwd
3. Edit the following line:
password requisite /lib/security/$ISA/pam_passwdqc.so retry=N min=N0,N1,N2,N3,N4
4. Save the file.
7.22.1.1 Example: Editing /etc/pam.d/passwd
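No worked example was carried over under this heading, so the following is an illustrative sketch only. It applies the line format shown in the procedure above with example values: three retries, one- and two-class passwords disabled, a passphrase of at least 16 characters allowed, and three- or four-class passwords of at least 8 characters accepted. Choose values that match your own password policy.
    # Illustrative /etc/pam.d/passwd entry (values are examples, not a recommendation)
    password requisite /lib/security/$ISA/pam_passwdqc.so retry=3 min=disabled,disabled,16,8,8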
7.22.2 Ensure that vpxuser Password Meets Policy
When you add a host to the vCenter Server inventory, vCenter Server creates a special user
account called vpxuser on the host. vpxuser is a privileged account that acts as a proxy for all
actions initiated through vCenter Server. Ensure that the default settings for the vpxuser
password meet the requirements of your organization's password policy.
By default, vCenter Server generates a new vpxuser password every 30 days using OpenSSL
crypto libraries as a source of randomness. The password is 32 characters long and is
guaranteed to contain at least one symbol from four character classes: symbols (./:=@[\\]^_{}~), digits (1-9), uppercase letters, and lowercase letters. Ensuring that the
password expires periodically limits the amount of time an attacker can use the vpxuser
password if it is compromised.
You can change the default value for password expiration and for password length to meet
your password policy.
Important
To preclude the possibility that vCenter Server is locked out of the ESXi host, the password
aging policy must not be shorter than the interval that is set to automatically change the
vpxuser password.
Procedure
1. To change the password length policy, edit the vpxd.hostPasswordLength parameter in the vCenter Server configuration file on the system where vCenter Server is running. (A hedged configuration fragment appears after this procedure.)
o Windows default location: C:\Documents and Settings\All Users\Application Data\VMware VirtualCenter\vpxd.cfg
o Linux default location: /etc/vmware-vpx/vpxd.cfg
2. To change the password aging requirement, use the Advanced Settings dialog box in the vSphere Web Client.
3. Browse to the vCenter Server system in the vSphere Web Client inventory.
4. Click the Manage tab and click Settings.
5. Select Advanced Settings and locate the VirtualCenter.VimPasswordExpirationInDays parameter.
6. Restart vCenter Server.
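As a hedged illustration of step 1, the vpxd.hostPasswordLength parameter is added as an XML element inside vpxd.cfg. The exact element nesting shown here is an assumption; confirm it against the vCenter Server documentation for your version and back up the file before editing.
    <!-- Fragment of vpxd.cfg (nesting assumed); sets the generated vpxuser password length -->
    <config>
      <vpxd>
        <hostPasswordLength>32</hostPasswordLength>
      </vpxd>
    </config>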
7.23 Synchronizing Clocks on the vSphere Network
Before you install vCenter Single Sign On, install the vSphere Web Client, or deploy the
vCenter Server appliance, make sure all machines on the vSphere network have their clocks
synchronized.
If the clocks on vCenter Server network machines are not synchronized, SSL certificates,
which are time-sensitive, might not be recognized as valid in communications between
network machines. Unsynchronized clocks can result in authentication problems, which can
cause the vSphere Web Client installation to fail or prevent the vCenter Server Appliance
vpxd service from starting.
7.23.1 Synchronize ESX and ESXi Clocks with a Network Time Server
Before you install vCenter Single Sign On, the vSphere Web Client, or the vCenter Server
appliance, make sure all machines on the vSphere network have their clocks synchronized.
Procedure
1. From the vSphere Web Client, connect to the vCenter Server.
2. Select the host in the inventory.
7.24 Monitoring and Restricting Access to SSL Certificates
Attackers can use SSL certificates to impersonate vCenter Server and decrypt the vCenter
Server database password. You must monitor and strictly control access to the certificate.
Only the service account user requires regular access to the directory that contains vCenter
Server SSL certificates. Infrequently, the vCenter Server system administrator might need to
access the directory as well. Because the SSL certificate can be used to impersonate vCenter
Server and decrypt the database password, monitor the event log and set an alert to trigger
when an account other than the service account accesses the directory.
To prevent a user other than the service account user from accessing the directory, change
the permissions on the directory so that only the vCenter Server service account is allowed to
access it. This restriction prevents you from collecting a complete support log when you issue
a vc-support script. The restriction also prevents the administrator from changing the
vCenter Server database password.
8 MSCS
Clustering Requirements
 Virtual SCSI adapter: LSI Logic Parallel for Windows Server 2003; LSI Logic SAS for Windows Server 2008.
 Operating system: Windows Server 2003 SP1 and SP2, or Windows Server 2008 SP2 and above. For supported guest operating systems see Other Clustering Requirements and Recommendations.
 Virtual NIC: Use the default type for all guest operating systems.
 I/O timeout: Set to 60 seconds or more. Modify HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue. The system might reset this I/O timeout value if you re-create a cluster. You must reset the value in that case.
 Disk format: Select Thick Provision to create disks in eagerzeroedthick format.
 Disk and networking setup: Add networking before disks. Refer to the VMware knowledge base article at http://kb.vmware.com/kb/1513 if you encounter any errors.
 Number of nodes: Windows Server 2003 SP1 and SP2 supports two-node clustering; Windows Server 2008 SP2 and above supports up to five-node clustering. For supported guest operating systems see Other Clustering Requirements and Recommendations.
 NTP server: Synchronize domain controllers and cluster nodes with a common NTP server, and disable host-based time synchronization when using clustering in the guest.
(Hedged example commands for the disk format and I/O timeout rows appear after this table.)
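Two of the rows above lend themselves to short command-line examples. Both are hedged sketches: the datastore path, disk size, and VM directory name are placeholders, and the registry command uses the standard Windows reg.exe syntax for the value named in the table.
    # On the ESXi host: create a shared cluster disk in eagerzeroedthick format (path and size are examples)
    vmkfstools -c 10G -d eagerzeroedthick /vmfs/volumes/datastore1/node1/quorum.vmdk
    # Inside the Windows guest: set the disk I/O timeout to 60 seconds
    reg add "HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk" /v TimeOutValue /t REG_DWORD /d 60 /f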
8.1.1 Supported Shared Storage Configurations
Different MSCS cluster setups support different types of shared storage configurations. Some
setups support more than one type. Select the recommended type of shared storage for best
results.
Shared Storage Requirements
 Virtual disks
o Clusters on One Physical Machine (Cluster in a Box): Yes (recommended)
o Clusters Across Physical Machines (Cluster Across Boxes): No
o Clusters of Physical and Virtual Machines (Standby Host Clustering): No
 Pass-through RDM (physical compatibility mode)
o Clusters on One Physical Machine (Cluster in a Box): No
o Clusters Across Physical Machines (Cluster Across Boxes): Yes (recommended)
o Clusters of Physical and Virtual Machines (Standby Host Clustering): Yes
 Non-pass-through RDM (virtual compatibility mode)
o Clusters on One Physical Machine (Cluster in a Box): Yes
o Clusters Across Physical Machines (Cluster Across Boxes): Yes
o Clusters of Physical and Virtual Machines (Standby Host Clustering): No
Use of software iSCSI initiators within guest operating systems configured with MSCS, in any
configuration supported by Microsoft, is transparent to ESXi hosts and there is no need for
explicit support statements from VMware.
Note
Clusters across physical machines with non-pass-through RDM is supported only for
clustering with Windows Server 2003. It is not supported for clustering with Windows
Server 2008.
8.1.2 vSphere MSCS Setup Limitations
Before you set up MSCS, review the list of functions that are not supported for this release,
and requirements and recommendations that apply to your configuration.
The following environments and functions are not supported for MSCS setups with this
release of vSphere:
 Clustering on iSCSI and NFS disks.
 Mixed environments, such as configurations where one cluster node is running a different version of ESXi than another cluster node.
 Use of MSCS in conjunction with vSphere Fault Tolerance (FT).
 Migration with vSphere vMotion® of clustered virtual machines.
 N-Port ID Virtualization (NPIV).
 With native multipathing (NMP), clustering is not supported when the path policy is set to round robin. Third-party multipathing plug-ins might support round robin or other load balancing behavior with Microsoft clusters. Support of third-party multipathing plug-ins is provided by the plug-in vendor. Round robin is the default policy for multiple storage arrays in new vSphere releases. See KB 1010041 for a list of storage arrays and the PSP to configure for MSCS.
 ESXi hosts that use memory overcommitment are not suitable for deploying MSCS virtual machines. Memory overcommitment can cause virtual machines to stall for short durations. This can be significantly disruptive as the MSCS clustering mechanism is time-sensitive and timing delays can cause the virtual machines to behave incorrectly.
 Suspend or resume of more than one MSCS node in an ESX host with a five-node cluster in a box configuration is not supported. This I/O intensive operation is disruptive of the timing sensitive MSCS clustering software.
 FCoE is supported in ESXi 5.1 Update 1. See KB 1037959 for more information.
8.1.3 MSCS and Booting from a SAN
You can put the boot disk of a virtual machine on a SAN-based VMFS volume.
Booting from a SAN is complex. Problems that you encounter in physical environments
extend to virtual environments. For general information about booting from a SAN, see the
vSphere Storage documentation.
Follow these guidelines when you place the boot disk of a virtual machine on a SAN-based
VMFS volume:
 Consider the best practices for boot-from-SAN that Microsoft publishes in the following knowledge base article: http://support.microsoft.com/kb/305547/en-us.
 Use StorPort LSI Logic drivers instead of SCSIport drivers when running Microsoft Cluster Service for Windows Server 2003 or 2008 guest operating systems.
 Test clustered configurations in different failover scenarios before you put them into production environments.
8.1.4 Setting up Clustered Continuous Replication or Database Availability
Groups with Exchange 2010
You can set up Clustered Continuous Replication (CCR) with Exchange 2007 or Database
Availability Groups (DAG) with Exchange 2010 in your vSphere environment.
When working in a vSphere environment:
 Use virtual machines instead of physical machines as the cluster components.
 If the boot disks of the CCR or DAG virtual machines are on a SAN, see MSCS and Booting from a SAN.
For more information, see Microsoft’s documentation for CCR or DAG on the Microsoft Web
site.
8.2 Cluster Virtual Machines Across Physical Hosts
You can create an MSCS cluster that consists of two or more virtual machines on two or more ESXi hosts.
A cluster across physical hosts requires specific hardware and software.
 ESXi hosts that have the following:
o Two physical network adapters dedicated to the MSCS cluster and to the public and private networks.
o One physical network adapter dedicated to the VMkernel.
 Fibre Channel (FC) SAN. Shared storage must be on an FC SAN.
 RDM in physical compatibility (pass-through) or virtual compatibility (non-pass-through) mode. VMware recommends physical compatibility mode. The cluster cannot use virtual disks for shared storage.
Failover clustering with Windows Server 2008 is not supported with virtual compatibility mode (non-pass-through) RDMs. (A hedged vmkfstools example for creating RDM mapping files follows.)
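Shared RDMs of the kind required here are typically created with vmkfstools once the LUN's NAA identifier is known. A minimal sketch, assuming ESXi 5.x; the device identifier and mapping-file path are placeholders.
    # Create a physical compatibility (pass-through) RDM mapping file for a shared LUN
    vmkfstools -z /vmfs/devices/disks/naa.600xxxxxxxxxxxxx /vmfs/volumes/datastore1/node1/shared-rdm.vmdk
    # Use -r instead of -z to create a virtual compatibility (non-pass-through) RDM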
8.3 Cluster Physical and Virtual Machines
You can create an MSCS cluster in which each physical machine has a corresponding virtual
machine. This type of configuration is known as a standby host cluster.
A standby host cluster has specific hardware and software requirements.
 Use ESXi hosts that have the following:
o Two physical network adapters dedicated to the MSCS cluster and to the public and private networks.
o One physical network adapter dedicated to the VMkernel.
 Use RDMs in physical compatibility mode (pass-through RDM). You cannot use virtual disks or RDMs in virtual compatibility mode (non-pass-through RDM) for shared storage.
 Use the STORport Miniport driver for the Fibre Channel (FC) HBA (QLogic or Emulex) in the physical Windows machine.
 Do not run multipathing software in the physical or virtual machines.
 Use only a single physical path from the host to the storage arrays in standby host configurations.
8.3.1 Using vSphere DRS Groups and VM-Host Affinity Rules with MSCS
Virtual Machines
You can use the vSphere Client to set up two types of DRS groups: virtual machine DRS
groups, which contain at least one virtual machine, and host DRS groups, which contain at
least one host. A VM-Host affinity rule establishes an affinity (or anti-affinity) relationship
between a virtual machine DRS group and a host DRS group.
You must use VM-Host affinity rules because vSphere HA does not obey VM-VM affinity
rules. This means that if a host fails, vSphere HA might separate clustered virtual machines
that are meant to stay together, or vSphere HA might put clustered virtual machines that are
meant to stay apart on the same host. You can avoid this problem by setting up DRS groups
and using VM-Host affinity rules, which are obeyed by vSphere HA.
For a cluster of virtual machines on one physical host, all MSCS virtual machines must be in
the same virtual machine DRS group, linked to the same host DRS group with the affinity
rule "Must run on hosts in group."
For a cluster of virtual machines across physical hosts, each MSCS virtual machine must be
in a different virtual machine DRS group, linked to a different host DRS group with the
affinity rule "Must run on hosts in group."
Limit the number of hosts to two when you define host DRS group rules for a cluster of virtual machines on one physical host. (This does not apply to clusters of virtual machines across physical hosts.) Since vSphere HA does not obey VM-VM affinity rules, virtual machines in the configuration could be spread across hosts during a vSphere HA recovery from host failure if more than two hosts are included in a host DRS group rule.
8.4 vSphere MSCS Setup Checklist
When you set up MSCS on ESXi, see the checklists to configure your environment according
to the requirements. You can also use the checklists to verify that your setup meets the
requirements if you need technical support.
8.4.1 Requirements for Clustered Disks
Each type of clustered disk has its own requirements, depending on whether it is in a single-host cluster or multihost cluster.
Requirements for Clustered Disks
 Clustered virtual disk (.vmdk)
o Single-host clustering: SCSI bus sharing mode must be set to virtual.
o Multihost clustering: Not supported.
 Clustered disks, virtual compatibility mode (non-pass-through RDM)
o Single-host clustering: Device type must be set to virtual compatibility mode. SCSI bus sharing mode must be set to virtual mode. A single, shared RDM mapping file for each clustered disk is required.
o Multihost clustering: Device type must be set to virtual compatibility mode for cluster across boxes, but not for standby host clustering or cluster across boxes on Windows Server 2008. SCSI bus sharing mode must be set to physical. Requires a single, shared RDM mapping file for each clustered disk.
 Clustered disks, physical compatibility mode (pass-through RDM)
o Single-host clustering: Not supported.
o Multihost clustering: Device type must be set to Physical compatibility mode during hard disk creation. SCSI bus sharing mode must be set to physical (the default). A single, shared RDM mapping file for each clustered disk is required.
 All types (both single-host and multihost clustering): All clustered nodes must use the same target ID (on the virtual SCSI adapter) for the same clustered disk. A separate virtual adapter must be used for clustered disks.
8.4.2 Other Requirements and Recommendations
The following table lists the components in your environment that have requirements for
options or settings.
Other Clustering Requirements and Recommendations
 Disk: If you place the boot disk on a virtual disk, select Thick Provision during disk provisioning. The only disks that you should not create with the Thick Provision option are RDM files (both physical and virtual compatibility mode).
 Windows: Use Windows Server 2003 SP1 and SP2 (32 bit), Windows Server 2003 SP1 and SP2 (64 bit), Windows Server 2008 SP2 (32 bit), Windows Server 2008 SP2 (64 bit), Windows Server 2008 SP1 R2 (32 bit), or Windows Server 2008 SP1 R2 (64 bit). For Windows Server 2003 SP1 and SP2, use only two cluster nodes. For Windows Server 2008 SP2 and above, you can use up to five cluster nodes. Disk I/O timeout is 60 seconds or more (HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue). Note: if you recreate the cluster, this value might be reset to its default, so you must change it again. The cluster service must restart automatically on failure (first, second, and subsequent times).
 ESXi configuration: Do not overcommit memory. Set the Memory Reservation (minimum memory) option to the same as the amount of memory assigned to the virtual machine. If you must overcommit memory, the swap file must be local, not on the SAN. ESXi 5.0 uses a different technique to determine if Raw Device Mapped (RDM) LUNs are used for MSCS cluster devices, by introducing a configuration flag to mark each device participating in an MSCS cluster as "perennially reserved". For ESXi hosts hosting passive MSCS nodes with RDM LUNs, use the esxcli command to mark the device as perennially reserved: esxcli storage core device setconfig -d <naa.id> --perennially-reserved=true. See KB 1016106 for more information. (A hedged example of this command appears after this table.)
 Multipathing: Contact your multipathing software vendor for information and support of non-VMware multipathing software in vSphere.
9 Virtual Machine Administration
9.1 What Is a Virtual Machine?
A virtual machine consists of several types of files that you store on a supported storage
device. The key files that make up a virtual machine are the configuration file, virtual disk
file, NVRAM setting file, and the log file. You configure virtual machine settings through the
vSphere Web Client or the vSphere Client. You do not need to touch the key files.
A virtual machine can have more files if one or more snapshots exist or if you add Raw
Device Mappings (RDMs).
Caution
Do not change, move, or delete these files without instructions from a VMware Technical
Support Representative.
Virtual Machine Files

File         Usage                                               Description
.vmx         vmname.vmx                                          Virtual machine configuration file
.vmxf        vmname.vmxf                                         Additional virtual machine configuration files
.vmdk        vmname.vmdk                                         Virtual disk characteristics
-flat.vmdk   vmname-flat.vmdk                                    Virtual machine data disk
.nvram       vmname.nvram or nvram                               Virtual machine BIOS or EFI configuration
.vmsd        vmname.vmsd                                         Virtual machine snapshots
.vmsn        vmname.vmsn                                         Virtual machine snapshot data file
.vswp        vmname.vswp                                         Virtual machine swap file
.vmss        vmname.vmss                                         Virtual machine suspend file
.log         vmware.log                                          Current virtual machine log file
-#.log       vmware-#.log (where # is a number starting with 1)  Old virtual machine log entries
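If you want to see which of these files a particular virtual machine is actually using, the vSphere API exposes them through the layoutEx property. The following is a small, illustrative pyVmomi sketch; it assumes `vm` is a vim.VirtualMachine object you have already looked up (for example, with SmartConnect and a container view), which is not shown here.

def print_vm_files(vm):
    # Each entry is a FileInfo with a type such as config, diskDescriptor,
    # diskExtent, nvram, snapshotData, suspend, swap, or log.
    for f in vm.layoutEx.file:
        size_mb = (f.size or 0) / (1024.0 * 1024.0)
        print("%-16s %10.1f MB  %s" % (f.type, size_mb, f.name))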
9.1.1 Virtual Machine Options and Resources
Each virtual device performs the same function for the virtual machine as hardware on a
physical computer does.
A virtual machine might be running in any of several locations, such as ESXi hosts,
datacenters, clusters, or resource pools. Many of the options and resources that you configure
have dependencies on and relationships with these objects.
Every virtual machine has CPU, memory, and disk resources. CPU virtualization emphasizes
performance and runs directly on the processor whenever possible. The underlying physical
resources are used whenever possible. The virtualization layer runs instructions only as
needed to make virtual machines operate as if they were running directly on a physical
machine.
All recent operating systems provide support for virtual memory, allowing software to use
more memory than the machine physically has. Similarly, the ESXi hypervisor provides
support for overcommitting virtual machine memory, where the amount of guest memory
configured for all virtual machines might be larger than the amount of the host's physical
memory.
You can add virtual disks and add more space to existing disks, even when the virtual
machine is running. You can also change the device node and allocate shares of disk
bandwidth to the virtual machine.
VMware virtual machines have the following options:
Supported Features for Virtual Machine Compatibility

Feature                                            ESXi 5.1    ESXi 5.0    ESX/ESXi 4.x   ESX/ESXi 3.5
                                                   and later   and later   and later      and later
Hardware version                                   9           8           7              4
Maximum memory (MB)                                1035264     1035264     261120         65532
Maximum number of logical processors               64          32          8              4
Maximum number of cores (virtual CPUs) per socket  64          32          8              1
Maximum SCSI adapters                              4           4           4              4
BusLogic adapters                                  Y           Y           Y              Y
LSI Logic adapters                                 Y           Y           Y              Y
LSI Logic SAS adapters                             Y           Y           Y              N
VMware Paravirtual controllers                     Y           Y           Y              N
9.1.2 VM Disk formats
Option: Same format as source
Action: Use the same format as the source virtual machine.

Option: Thick Provision Lazy Zeroed
Action: Create a virtual disk in a default thick format. Space required for the virtual disk
is allocated during creation. Any data remaining on the physical device is not erased during
creation, but is zeroed out on demand at a later time on first write from the virtual
machine.

Option: Thick Provision Eager Zeroed
Action: Create a thick disk that supports clustering features such as Fault Tolerance. Space
required for the virtual disk is allocated at creation time. In contrast to the thick
provision lazy zeroed format, the data remaining on the physical device is zeroed out during
creation. It might take longer to create disks in this format than to create other types of
disks.

Option: Thin Provision
Action: Use the thin provisioned format. At first, a thin provisioned disk uses only as much
datastore space as the disk initially needs. If the thin disk needs more space later, it can
grow to the maximum capacity allocated to it.
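In the vSphere API, these three formats correspond to two flags on the virtual disk backing. The sketch below (pyVmomi, illustrative only; the helper name and the 'thin'/'lazy'/'eager' strings are invented for this example) shows the mapping.

from pyVmomi import vim

def disk_backing(disk_format):
    # 'thin'  -> Thin Provision
    # 'eager' -> Thick Provision Eager Zeroed
    # anything else -> Thick Provision Lazy Zeroed (both flags False)
    return vim.vm.device.VirtualDisk.FlatVer2BackingInfo(
        diskMode="persistent",
        thinProvisioned=(disk_format == "thin"),
        eagerlyScrub=(disk_format == "eager"))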
9.2 Installing the Microsoft Sysprep Tool
Install the Microsoft Sysprep tool so that you can customize Windows guest operating
systems when you clone virtual machines.
The guest operating system customization feature in vCenter Server and VMware vCenter
Server Appliance uses the functions of the Sysprep tool. Verify that your vCenter Server or
VMware vCenter Server Appliance system meets the following requirements before you
customize your virtual machine’s Windows guest operating systems:
• Install the Microsoft Sysprep tool. Microsoft includes the system tool set on the
  installation CD-ROM discs for Windows 2000, Windows XP, and Windows 2003. The Sysprep tool
  is built into the Windows Vista and Windows 2008 operating systems.
• The correct version of the Sysprep tool is installed for each guest operating system that
  you want to customize.
• The password for the local administrator account on the virtual machines is set to
  blank ("").
• If you are using the VMware vCenter Server Appliance, you must have access to the VMware
  vCenter Server Appliance Web console.
Note
Customization operations will fail if the correct version of the Sysprep tool is not found.
This chapter includes the following topics:
• Install the Microsoft Sysprep Tool from a Microsoft Web Site
• Install the Microsoft Sysprep Tool from the Windows Operating System CD
• Install the Microsoft Sysprep Tool for VMware vCenter Server Appliance
9.2.1 Install the Microsoft Sysprep Tool from a Microsoft Web Site
You can download and install the Microsoft Sysprep tool from the Microsoft Web site.
Prerequisites
Verify that you download the correct version for the guest operating system to customize.
Microsoft has a different version of Sysprep for each release and service pack of Windows.
You must use the version of Sysprep specific to the operating system that you are deploying.
The vCenter Server installer creates a Sysprep directory in ALLUSERSPROFILE. The
ALLUSERSPROFILE location is usually \Documents And Settings\All Users\. The vpxd.cfg
file is also in this location. On Windows 2008, the file location is
C:\ProgramData\VMware\VMware VirtualCenter\sysprep\.
Procedure
1. Download the Sysprep files from the Microsoft Download Center and save them to
your local system.
2. Open and expand the .cab file.
The contents of the .cab file vary, depending on the operating system.
3. Extract the files to the appropriate directory for your guest operating system.
The following Sysprep support directories are created during the vCenter Server
installation:
C:\ALLUSERSPROFILE\Application Data\Vmware\VMware VirtualCenter\sysprep
...\1.1\
...\2k\
...\xp\
...\svr2003\
...\xp-64\
...\svr2003-64\
4. Select the subdirectory that corresponds to your operating system.
5. Click OK to expand the files.
9.2.2 Install the Microsoft Sysprep Tool from the Windows Operating System
CD
You can install the Microsoft Sysprep tool from a CD.
The vCenter Server installer creates a Sysprep directory in ALLUSERSPROFILE. The
ALLUSERSPROFILE location is usually \Documents and Settings\All Users\. The vpxd.cfg
file is also in this location. On Windows 2008, the file location is
C:\ProgramData\VMware\VMware VirtualCenter\sysprep\.
Procedure
1. Insert the Windows operating system CD into the CD-ROM drive, often the D: drive.
2. Locate the DEPLOY.CAB file in the \Support\Tools directory on the CD.
3. Open and expand the DEPLOY.CAB file.
The contents of the .cab file vary, depending on the operating system.
4. Extract the files to the directory appropriate for your guest operating system.
The following Sysprep support directories are created during the vCenter Server
installation:
C:\ALLUSERSPROFILE\Application Data\Vmware\VMware VirtualCenter\sysprep
...\1.1\
...\2k\
...\xp\
...\svr2003\
...\xp-64\
...\svr2003-64\
5. Select the subdirectory that corresponds to your operating system.
6. Click OK to expand the files.
7. Repeat this procedure to extract Sysprep files for each of the Windows guest
operating systems that you plan to customize using vCenter Server.
9.2.3 Install the Microsoft Sysprep Tool for VMware vCenter Server Appliance
After you download and install the Microsoft Sysprep tool from the Microsoft Web site, you
can use the VMware vCenter Server Appliance Web console to upload the files to the
appliance.
Prerequisites
Verify that you download the correct version for the guest operating system to customize.
Microsoft has a different version of Sysprep for each release and service pack of Windows.
You must use the version of Sysprep specific to the operating system that you are deploying.
When you upload the files to vCenter Server Appliance, the contents of the CAB file for the
Sysprep Tool version that you downloaded are saved in /etc/vmware-vpx/sysprep/OS. For
example, /etc/vmware-vpx/sysprep/2k or /etc/vmware-vpx/sysprep/xp.
Procedure
1. Download the Sysprep files from the Microsoft Download Center and save them to
your local system.
2. Log in to the VMware vCenter Server Appliance Web console and click the vCenter
Server Summary tab.
3. In the Utilities panel, click the Sysprep Files Upload button.
4. Select a Windows platform directory, and browse to the file.
5. Click Open.
The file is uploaded to the vCenter Server Appliance.
6. Click Close.
You can customize a new virtual machine with a supported Windows guest operating system
when you clone an existing virtual machine.
9.3 Virtual Machine Compatibility Options
Compatibility: ESXi 5.1 and later
Description: This virtual machine (hardware version 9) is compatible with ESXi 5.1 and later.

Compatibility: ESXi 5.0 and later
Description: This virtual machine (hardware version 8) is compatible with ESXi 5.0 and 5.1.

Compatibility: ESX/ESXi 4.x and later
Description: This virtual machine (hardware version 7) is compatible with ESX/ESXi 4.x,
ESXi 5.0, and ESXi 5.1.

Compatibility: ESX/ESXi 3.5 and later
Description: This virtual machine (hardware version 4) is compatible with ESX/ESXi 3.5,
ESX/ESXi 4.x, and ESXi 5.1. It is also compatible with VMware Server 1.0 and later. ESXi 5.0
does not allow creation of virtual machines with this compatibility, but you can run such
virtual machines if they were created on a host with different compatibility.

Compatibility: ESX Server 2.x and later
Description: This virtual machine (hardware version 3) is compatible with ESX Server 2.x,
ESX/ESXi 3.5, ESX/ESXi 4.x, and ESXi 5.0. You cannot create or edit virtual machines with
ESX Server 2.x compatibility. You can only start or upgrade them.
The compatibility setting that appears in the Compatible with drop-down menu is the default
for the virtual machine that you are creating. The following factors determine the default
virtual machine compatibility:
• The ESXi host version on which the virtual machine is created.
• The inventory object that the default virtual machine compatibility is set on, including
  a host, cluster, or datacenter.
You can accept the default compatibility or select a different setting. It is not always
necessary to select the latest ESXi host version. Selecting an earlier version can provide
greater flexibility and is useful in the following situations:
• To standardize testing and deployment in your virtual environment.
• If you do not need the capabilities of the latest host version.
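Upgrading a virtual machine to a newer compatibility level can also be scripted. The following is a hedged pyVmomi sketch, assuming `vm` is an already-retrieved, powered-off vim.VirtualMachine and that "vmx-09" (hardware version 9, ESXi 5.1 and later) is the level you want.

def upgrade_compatibility(vm, version="vmx-09"):
    # Passing version=None upgrades to the newest hardware version the host supports.
    return vm.UpgradeVM_Task(version=version)   # the virtual machine must be powered off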
9.3.1 Determine the Default Virtual Machine Compatibility Setting in the
vSphere Web Client
The compatibility setting for a virtual machine provides information about the hosts,
clusters, or datacenter the virtual machine is compatible with.
The virtual machine Summary tab displays the compatibility for the virtual machine. You can
set and view the default compatibility used for virtual machine creation at the host, cluster,
or datacenter level.
Procedure
Select an inventory object and display the virtual machine compatibility.

Virtual machine: Select a virtual machine in the inventory and click the Summary tab. The top
panel displays the Compatibility setting.
Host: Select a host in the inventory and click the Manage tab. The Default Virtual Machine
Compatibility is listed in the Virtual Machines section.
Cluster: Select a cluster in the inventory, click the Manage tab, and in the Configuration
section, click General.
Datacenter: Right-click a datacenter in the inventory and select All Virtual Infrastructure
Actions > Edit Default VM Compatibility.
You can change the default compatibility or upgrade the virtual machine compatibility.
9.3.2 Supported Features for Virtual Machine Compatibility
Feature                                            ESXi 5.1    ESXi 5.0    ESX/ESXi 4.x   ESX/ESXi 3.5
                                                   and later   and later   and later      and later
Hardware version                                   9           8           7              4
Maximum memory (MB)                                1035264     1035264     261120         65532
Maximum number of logical processors               64          32          8              4
Maximum number of cores (virtual CPUs) per socket  64          32          8              1
Maximum SCSI adapters                              4           4           4              4
BusLogic adapters                                  Y           Y           Y              Y
LSI Logic adapters                                 Y           Y           Y              Y
LSI Logic SAS adapters                             Y           Y           Y              N
VMware Paravirtual controllers                     Y           Y           Y              N
9.4 Change CPU Hot Plug Settings in the vSphere Web Client
The CPU hot plug option lets you add CPU resources for a virtual machine while the machine
is turned on.
The following conditions apply:
• For best results, use virtual machines that are compatible with ESXi 5.0 and later.
• Hot-adding multicore virtual CPUs is supported only with virtual machines that are
  compatible with ESXi 5.0 and later.
• Not all guest operating systems support CPU hot add. You can disable these settings if the
  guest is not supported.
• To use the CPU hot-add feature with virtual machines that are compatible with ESXi 4.x and
  later, set the Number of cores per socket to 1.
• Adding CPU resources to a running virtual machine with CPU hot plug enabled disconnects and
  reconnects all USB passthrough devices connected to that virtual machine.
Prerequisites
Verify that the virtual machine is running under the following conditions:
• VMware Tools is installed. This condition is required for hot plug functionality with Linux
  guest operating systems.
• The virtual machine has a guest operating system that supports CPU hot plug.
• The virtual machine compatibility is ESX/ESXi 4.x or later.
• The virtual machine is turned off.
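The same setting can be changed through the API. A minimal pyVmomi sketch, assuming `vm` is an already-retrieved, powered-off vim.VirtualMachine (the checkbox in the client corresponds to the cpuHotAddEnabled flag):

from pyVmomi import vim

def enable_cpu_hot_add(vm):
    spec = vim.vm.ConfigSpec(cpuHotAddEnabled=True)   # memoryHotAddEnabled is the memory equivalent
    return vm.ReconfigVM_Task(spec=spec)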
9.4.1 Change CPU Identification Mask Settings in the vSphere Web Client
CPU identification (CPU ID) masks control the CPU features visible to the virtual machine's
guest operating system. Masking or hiding CPU features can make a virtual machine widely
available to ESXi hosts for migration. vCenter Server compares the CPU features available to
a virtual machine with the CPU features of the destination host to determine whether to
allow or disallow migration with vMotion.
For example, masking the AMD No eXecute (NX) and the Intel eXecute Disable (XD) bits
prevents the virtual machine from using these features, but provides compatibility that
allows you to migrate virtual machines to ESXi hosts that do not include this capability.
When the NX/XD bit is visible to the guest operating system, the virtual machine can use this
feature, but you can migrate the virtual machine only to hosts on which the feature is
enabled.
Caution
Changing the CPU compatibility masks can result in an unsupported configuration. Do not
manually change the CPU compatibility masks unless instructed to do so by VMware
Support or a VMware Knowledge base article.
Prerequisites
Turn off the virtual machine.
Procedure
1. Right-click the virtual machine and select Edit Settings.
a. To locate a virtual machine, select a datacenter, folder, cluster, resource pool,
host, or vApp.
b. Click the Related Objects tab and click Virtual Machines.
2. On the Virtual Hardware tab, expand CPU, and in the CPUID Mask drop-down menu,
select an NX/XD option.
Hide the NX/XD flag from guest: Increases vMotion compatibility. Hiding the NX/XD flag
increases vMotion compatibility between hosts, but might disable certain CPU security
features.

Expose the NX/XD flag to guest: Keeps all CPU security features enabled.

Keep current Advanced setting values for the NX/XD flag: Uses the NX/XD flag settings
specified in the CPU Identification Mask dialog box. Enabled only when current settings
specify something other than what is specified in the other NX/XD flag options, for example,
if the NX/XD flag bit setting varies with processor brand.
3. Click OK.
9.4.2 Expose VMware Hardware Assisted Virtualization in the vSphere Web
Client
You can expose full CPU virtualization to the guest operating system so that applications that
require hardware virtualization can run on virtual machines without binary translation or
paravirtualization.
Prerequisites
• Verify that the virtual machine compatibility is ESXi 5.1 and later.
• Intel Nehalem Generation (Xeon Core i7) or later processors or AMD Opteron Generation 3
  (Greyhound) or later processors.
• Verify that Intel VT-x or AMD-V is enabled in the BIOS so that hardware assisted
  virtualization is possible.
• Required Privileges: Virtual machine.Configuration.Settings set on the vCenter Server
  system.
Procedure
• Right-click the virtual machine and select Edit Settings.
  o To locate a virtual machine, select a datacenter, folder, cluster, resource pool, host,
    or vApp.
  o Click the Related Objects tab and click Virtual Machines.
• On the Virtual Hardware tab, expand CPU, and select Expose hardware-assisted
  virtualization to guest OS.
• Click OK.
The Manage tab refreshes, and the Nested Hypervisor CPU option shows Enabled.
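For reference, the checkbox described above maps to the nestedHVEnabled flag in the vSphere API. A hedged pyVmomi sketch, assuming `vm` is an already-retrieved, powered-off vim.VirtualMachine with ESXi 5.1 compatibility:

from pyVmomi import vim

def expose_hw_assisted_virtualization(vm):
    spec = vim.vm.ConfigSpec(nestedHVEnabled=True)
    return vm.ReconfigVM_Task(spec=spec)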
9.4.3 Allocate Memory Resources in the vSphere Web Client
You can change the amount of memory resources allocated to a virtual machine by using the
shares, reservations, and limits settings. The host determines the appropriate amount of
physical RAM to allocate to virtual machines based on these settings. You can assign a high
or low shares value to a virtual machine, depending on its load and status.
The following user-defined settings affect the memory resource allocation of a virtual
machine.
Limit: Places a limit on the consumption of memory for a virtual machine. This value is
expressed in megabytes.

Reservation: Specifies the guaranteed minimum allocation for a virtual machine. The
reservation is expressed in megabytes. If the reservation cannot be met, the virtual machine
will not turn on.

Shares: Each virtual machine is granted a number of memory shares. The more shares a virtual
machine has, the greater share of host memory it receives. Shares represent a relative metric
for allocating memory capacity. For more information about share values, see the vSphere
Resource Management documentation.
You cannot assign a reservation to a virtual machine that is larger than its configured
memory. If you give a virtual machine a large reservation and reduce its configured memory
size, the reservation is reduced to match the new configured memory size.
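The same three settings are exposed in the API as a ResourceAllocationInfo on the virtual machine. The following pyVmomi sketch is illustrative only; the reservation, limit, and shares values are placeholders, and `vm` is assumed to be an already-retrieved vim.VirtualMachine.

from pyVmomi import vim

def set_memory_allocation(vm, reservation_mb=1024, limit_mb=4096, level="high"):
    alloc = vim.ResourceAllocationInfo(
        reservation=reservation_mb,            # guaranteed minimum, in MB
        limit=limit_mb,                        # -1 means unlimited
        shares=vim.SharesInfo(level=level))    # low, normal, high, or custom
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(memoryAllocation=alloc))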
9.4.4 Network Adapter Types
When you configure a virtual machine, you can add network adapters (NICs) and specify the
adapter type.
The type of network adapter that is available depends on the following factors:
• The virtual machine version, which depends on what host created it or most recently
  updated it.
• Whether the virtual machine has been updated to the latest version for the current host.
• The guest operating system.
The following NIC types are supported:
E1000: Emulated version of the Intel 82545EM Gigabit Ethernet NIC, with drivers available in
most newer guest operating systems, including Windows XP and later and Linux versions 2.4.19
and later.

Flexible: Identifies itself as a Vlance adapter when a virtual machine boots, but initializes
itself and functions as either a Vlance or a VMXNET adapter, depending on which driver
initializes it. With VMware Tools installed, the VMXNET driver changes the Vlance adapter to
the higher performance VMXNET adapter.

Vlance: Emulated version of the AMD 79C970 PCnet32 LANCE NIC, an older 10 Mbps NIC with
drivers available in most 32-bit guest operating systems except Windows Vista and later. A
virtual machine configured with this network adapter can use its network immediately.

VMXNET: Optimized for performance in a virtual machine and has no physical counterpart.
Because operating system vendors do not provide built-in drivers for this card, you must
install VMware Tools to have a driver for the VMXNET network adapter available.

VMXNET 2 (Enhanced): Based on the VMXNET adapter but provides high-performance features
commonly used on modern networks, such as jumbo frames and hardware offloads. VMXNET 2
(Enhanced) is available only for some guest operating systems on ESX/ESXi 3.5 and later.

VMXNET 3: Next generation of a paravirtualized NIC designed for performance. VMXNET 3 offers
all the features available in VMXNET 2 and adds several new features, such as multiqueue
support (also known as Receive Side Scaling in Windows), IPv6 offloads, and MSI/MSI-X
interrupt delivery. VMXNET 3 is not related to VMXNET or VMXNET 2.
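Adding one of these adapters with the API looks roughly like the following pyVmomi sketch. It is illustrative only: the port group name "VM Network" is a placeholder, and `vm` is assumed to be an already-retrieved vim.VirtualMachine whose compatibility supports VMXNET 3.

from pyVmomi import vim

def add_vmxnet3_nic(vm, portgroup_name="VM Network"):
    nic = vim.vm.device.VirtualVmxnet3()
    nic.backing = vim.vm.device.VirtualEthernetCard.NetworkBackingInfo(
        deviceName=portgroup_name)             # standard switch port group
    nic.connectable = vim.vm.device.VirtualDevice.ConnectInfo(
        startConnected=True, connected=True)
    change = vim.vm.device.VirtualDeviceSpec(operation="add", device=nic)
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))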
9.5 VM Disk Persistence Modes
Dependent: Dependent disks are included in snapshots.

Independent - Persistent: Disks in persistent mode behave like conventional disks on your
physical computer. All data written to a disk in persistent mode is written permanently to
the disk.

Independent - Nonpersistent: Changes to disks in nonpersistent mode are discarded when you
turn off or reset the virtual machine. With nonpersistent mode, you can restart the virtual
machine with a virtual disk in the same state every time. Changes to the disk are written to
and read from a redo log file that is deleted when you turn off or reset the virtual machine.
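In the API, the persistence mode is the diskMode string on an existing disk's backing ('persistent', 'independent_persistent', or 'independent_nonpersistent'). A hedged pyVmomi sketch, assuming `vm` is already retrieved and a disk labeled "Hard disk 1" exists:

from pyVmomi import vim

def set_disk_mode(vm, disk_label="Hard disk 1", mode="independent_nonpersistent"):
    disk = next(d for d in vm.config.hardware.device
                if isinstance(d, vim.vm.device.VirtualDisk)
                and d.deviceInfo.label == disk_label)
    disk.backing.diskMode = mode
    change = vim.vm.device.VirtualDeviceSpec(operation="edit", device=disk)
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))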
9.5.1 RDM Compatibility Modes
Select a compatibility mode.
Physical: Allows the guest operating system to access the hardware directly. Physical
compatibility is useful if you are using SAN-aware applications on the virtual machine.
However, a virtual machine with a physical compatibility RDM cannot be cloned, made into a
template, or migrated if the migration involves copying the disk.

Virtual: Allows the RDM to behave as if it were a virtual disk, so that you can use such
features as taking snapshots, cloning, and so on. When you clone the disk or make a template
out of it, the contents of the LUN are copied into a .vmdk virtual disk file. When you
migrate a virtual compatibility mode RDM, you can migrate the mapping file or copy the
contents of the LUN into a virtual disk.
In most cases, you can accept the default device node. For a hard disk, a nondefault device
node is useful to control the boot order or to have different SCSI controller types. For
example, you might want to boot from an LSI Logic controller and share a data disk with
another virtual machine using a BusLogic controller with bus sharing turned on.
Disk modes are not available for RDM disks using physical compatibility mode.
9.5.2 Use Disk Shares to Prioritize Virtual Machines in the vSphere Web Client
You can change the disk resources for a virtual machine. If multiple virtual machines access
the same VMFS datastore and the same logical unit number (LUN), use disk shares to
prioritize the disk accesses from the virtual machines. Disk shares distinguish high-priority
from low-priority virtual machines.
You can allocate the host disk's I/O bandwidth to the virtual hard disks of a virtual machine.
Disk I/O is a host-centric resource so you cannot pool it across a cluster.
Shares is a value that represents the relative metric for controlling disk bandwidth to all
virtual machines. The values are compared to the sum of all shares of all virtual machines on
the server.
Disk shares are relevant only within a given host. The shares assigned to virtual machines on
one host have no effect on virtual machines on other hosts.
You can select an IOP limit, which sets an upper bound for storage resources that are
allocated to a virtual machine. IOPs are the number of I/O operations per second.
Procedure
• Right-click the virtual machine and select Edit Settings.
  o To locate a virtual machine, select a datacenter, folder, cluster, resource pool, host,
    or vApp.
  o Click the Related Objects tab and click Virtual Machines.
• On the Virtual Hardware tab, expand Hard disk to view the disk options.
• In the Shares drop-down menu, select a value for the shares to allocate to the virtual
  machine.
• If you selected Custom, enter a number of shares in the text box.
• In the Limit - IOPs box, enter the upper limit of storage resources to allocate to the
  virtual machine, or select Unlimited.
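Programmatically, disk shares and the IOPs limit live in the disk's storageIOAllocation property. The sketch below (pyVmomi, illustrative; the disk label, shares level, and limit are placeholder values, and `vm` is assumed already retrieved) makes the same change as the procedure above.

from pyVmomi import vim

def set_disk_shares(vm, disk_label="Hard disk 1", level="high", iops_limit=1000):
    disk = next(d for d in vm.config.hardware.device
                if isinstance(d, vim.vm.device.VirtualDisk)
                and d.deviceInfo.label == disk_label)
    disk.storageIOAllocation = vim.StorageResourceManager.IOAllocationInfo(
        shares=vim.SharesInfo(level=level),
        limit=iops_limit)                      # -1 corresponds to Unlimited
    change = vim.vm.device.VirtualDeviceSpec(operation="edit", device=disk)
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))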
9.6 SCSI Controller Configuration
To access virtual disks and SCSI devices, a virtual machine uses virtual SCSI controllers.
These virtual controllers appear to a virtual machine as different types of controllers,
including BusLogic Parallel, LSI Logic Parallel, LSI Logic SAS, and VMware Paravirtual
SCSI. You can add a SCSI controller, change the SCSI controller type, and select bus sharing
for a virtual machine.
Each virtual machine can have a maximum of four SCSI controllers. The default SCSI
controller is numbered as 0. When you create a virtual machine, the default hard disk is
assigned to the default SCSI controller 0 at bus node (0:0).
When you add SCSI controllers, they are numbered sequentially 1, 2, and 3. If you add a hard
disk or SCSI device to a virtual machine after virtual machine creation, it is assigned to the
first available virtual device node on the default SCSI Controller, for example (0:1).
If you add a SCSI controller, you can reassign an existing or new hard disk, or a SCSI device,
to that controller. For example, you can assign the device to (1:z), where 1 is SCSI
Controller 1 and z is a virtual device node from 0 to 15.
By default, the SCSI controller is assigned to virtual device node (z:7), so that device node is
unavailable for hard disks or SCSI devices.
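Adding a controller through the API follows the same device-change pattern as other virtual hardware. A hedged pyVmomi sketch that adds a VMware Paravirtual SCSI controller (discussed in the next section) on bus 1; the bus number and sharing mode are example values, and `vm` is assumed already retrieved.

from pyVmomi import vim

def add_pvscsi_controller(vm, bus_number=1):
    ctrl = vim.vm.device.ParaVirtualSCSIController(
        busNumber=bus_number,
        sharedBus=vim.vm.device.VirtualSCSIController.Sharing.noSharing)
    change = vim.vm.device.VirtualDeviceSpec(operation="add", device=ctrl)
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))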
9.6.1 About VMware Paravirtual SCSI Controllers
VMware Paravirtual SCSI controllers are high performance storage controllers that can
result in greater throughput and lower CPU use. These controllers are best suited for high
performance storage environments.
VMware Paravirtual SCSI controllers are available for virtual machines with ESXi 4.x and
later compatibility. Disks on such controllers might not experience optimal performance
gains if they have snapshots or if memory on the ESXi host is over committed. This behavior
does not mitigate the overall performance gain of using VMware Paravirtual SCSI controllers
as compared to other SCSI controller options.
If you have virtual machines with VMware Paravirtual SCSI controllers, those virtual
machines cannot be part of an MSCS cluster.
For platform support for VMware Paravirtual SCSI controllers, see the VMware
Compatibility Guide.
9.6.2 Add a PCI Device in the vSphere Web Client
vSphere DirectPath I/O allows a guest operating system on a virtual machine to directly
access physical PCI and PCIe devices connected to a host. This action gives you direct access
to devices such as high-performance graphics or sound cards. You can connect each virtual
machine to up to six PCI devices.
You configure PCI devices on the host to make them available for passthrough to a virtual
machine. See the vCenter Server and Host Management documentation.
When PCI vSphere DirectPath I/O devices are available to a virtual machine, you cannot
suspend, migrate with vMotion, or take or restore Snapshots of such virtual machines.
Prerequisites
• To use DirectPath, verify that the host has Intel Virtualization Technology for Directed
  I/O (VT-d) or AMD I/O Virtualization Technology (IOMMU) enabled in the BIOS.
• Verify that the PCI devices are connected to the host and marked as available for
  passthrough.
• Verify that the virtual machine is compatible with ESXi 4.x and later.
9.6.3 Configure Video Cards in the vSphere Web Client
You can change the number of monitor displays for a virtual machine, allocate memory for
the displays, and enable 3D support.
The default setting for total video RAM is adequate for minimal desktop resolution. For more
complex situations, you can change the default memory.
Some 3D applications require a minimum video memory of 64MB.
Prerequisites
• Verify that the virtual machine is powered off.
• To enable 3D support, the virtual machine compatibility must be ESXi 5.0 and later.
• To use a hardware 3D renderer, ensure that graphics hardware is available. Otherwise, the
  virtual machine will not power on.
• Required privilege: Virtual machine.Configuration.Modify device settings
9.6.4 How USB Device Passthrough Technology Works
When you attach a USB device to a physical host, the device is available only to virtual
machines that reside on that host. The device cannot connect to virtual machines that reside
on another host in the datacenter.
A USB device is available to only one virtual machine at a time. When a device is connected
to a powered-on virtual machine, it is not available to connect to other virtual machines that
run on the host. When you remove the active connection of a USB device from a virtual
machine, it becomes available to connect to other virtual machines that run on the host.
Connecting a USB passthrough device to a virtual machine that runs on the ESXi host to
which the device is physically attached requires an arbitrator, a controller, and a physical
USB device or device hub.
USB Arbitrator: Manages connection requests and routes USB device traffic. The arbitrator is
installed and enabled by default on ESXi hosts. It scans the host for USB devices and manages
device connection among virtual machines that reside on the host. It routes device traffic to
the correct virtual machine instance for delivery to the guest operating system. The
arbitrator monitors the USB device and prevents other virtual machines from using it until
you release it from the virtual machine it is connected to.

USB Controller: The USB hardware chip that provides USB function to the USB ports that it
manages. The virtual USB controller is the software virtualization of the USB host controller
function in the virtual machine. USB controller hardware and modules that support USB 2.0 and
USB 1.1 devices must exist on the host. Two virtual USB controllers are available to each
virtual machine. A controller must be present before you can add USB devices to the virtual
machine. The USB arbitrator can monitor a maximum of 15 USB controllers. Devices connected to
controllers numbered 16 or greater are not available to the virtual machine.

USB Devices: You can add up to 20 USB devices to a virtual machine. This is the maximum
number of devices supported for simultaneous connection to one virtual machine. The maximum
number of USB devices supported on a single ESXi host for simultaneous connection to one or
more virtual machines is also 20. For a list of supported USB devices, see the VMware
knowledge base article at http://kb.vmware.com/kb/1021345.
9.6.5 Configuring USB Devices for vMotion
With USB passthrough from a host to a virtual machine, you can migrate a virtual machine
to another ESXi host in the same datacenter and maintain the USB passthrough device
connections to the original host.
If a virtual machine has USB devices attached that pass through to an ESXi host, you can
migrate that virtual machine with the devices attached.
For a successful migration, review the following conditions:
• You must configure all USB passthrough devices connected to a virtual machine for vMotion.
  If one or more devices is not configured for vMotion, the migration cannot proceed. For
  troubleshooting details, see the vSphere Troubleshooting documentation.
• When you migrate a virtual machine with attached USB devices away from the host to which
  the devices are connected, the devices remain connected to the virtual machine. However, if
  you suspend or power off the virtual machine, the USB devices are disconnected and cannot
  reconnect when the virtual machine is resumed. The device connections can be restored only
  if you move the virtual machine back to the host to which the devices are attached.
• If you resume a suspended virtual machine that has a Linux guest operating system, the
  resume process might mount the USB devices at a different location on the file system.
• If a host with attached USB devices resides in a DRS cluster with distributed power
  management (DPM) enabled, disable DPM for that host. Otherwise DPM might turn off the host
  with the attached device. This action disconnects the device from the virtual machine
  because the virtual machine migrated to another host.
• Remote USB devices require that the hosts be able to communicate over the management
  network following migration with vMotion, so the source and destination management network
  IP address families must match. You cannot migrate a virtual machine from a host that is
  registered to vCenter Server with an IPv4 address to a host that is registered with an IPv6
  address.
9.6.6 Add a USB Controller to a Virtual Machine in the vSphere Web Client
USB controllers are available to add to virtual machines to support USB passthrough from an
ESXi host or from a client computer to a virtual machine.
You can add two USB controllers to a virtual machine. The xHCI controller, available for
Linux guest operating systems only, supports USB 3.0 superspeed, 2.0, and 1.1 devices. The
EHCI+UHCI controller supports USB 2.0 and 1.1 devices.
The conditions for adding a controller vary, depending on the device version, the type of
passthrough (host or client computer), and the guest operating system.
USB Controller Support

Controller type: EHCI+UHCI
  Supported USB Device Version: 2.0 and 1.1
  Supported for Passthrough from ESXi Host to VM: Yes
  Supported for Passthrough from Client Computer to VM: Yes

Controller type: xHCI
  Supported USB Device Version: 3.0, 2.0, and 1.1
  Supported for Passthrough from ESXi Host to VM: Yes (USB 2.0 and 1.1 devices only)
  Supported for Passthrough from Client Computer to VM: Yes (Linux guests only)
For Mac OS X systems, the EHCI+UHCI controller is enabled by default and is required for
USB mouse and keyboard access.
For virtual machines with Linux guests, you can add one or both controllers, but 3.0
superspeed devices are not supported for passthrough from an ESXi host to a virtual
machine. You cannot add two controllers of the same type.
For USB passthrough from an ESXi host to a virtual machine, the USB arbitrator can
monitor a maximum of 15 USB controllers. If your system includes controllers that exceed
the 15 controller limit and you connect USB devices to them, the devices are not available to
the virtual machine.
Prerequisites
• ESXi hosts must have USB controller hardware and modules that support USB 2.0 and 1.1
  devices present.
• Client computers must have USB controller hardware and modules that support USB 3.0, 2.0,
  and 1.1 devices present.
• To use the xHCI controller on a Linux guest, ensure that the Linux kernel version is 2.6.35
  or later.
• Verify that the virtual machine is powered on.
• Required Privilege (ESXi host passthrough): Virtual Machine.Configuration.Add or Remove
  Device
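As a rough illustration, adding either controller type through the API is again a device-change reconfiguration. In this pyVmomi sketch `vm` is assumed to be an already-retrieved vim.VirtualMachine; choose the xHCI variant only for Linux guests, as noted above.

from pyVmomi import vim

def add_usb_controller(vm, usb3=False):
    ctrl = (vim.vm.device.VirtualUSBXHCIController() if usb3    # xHCI (USB 3.0)
            else vim.vm.device.VirtualUSBController())          # EHCI+UHCI (USB 2.0/1.1)
    change = vim.vm.device.VirtualDeviceSpec(operation="add", device=ctrl)
    return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))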
9.6.7 How USB Device Passthrough Technology Works
The USB controller is the USB hardware chip that provides USB function to the USB ports
that it manages. USB controller hardware and modules that support USB 3.0, 2.0, and USB
1.1 devices must exist in the virtual machine. Two USB controllers are available for each
virtual machine. The controllers support multiple USB 3.0, 2.0, and 1.1 devices. The
controller must be present before you can add USB devices to the virtual machine.
You can add up to 20 USB devices to a virtual machine. This is the maximum number of
devices supported for simultaneous connection to one virtual machine.
You can add multiple devices to a virtual machine, but only one at a time. The virtual
machine retains its connection to the device while in S1 standby. USB device connections are
preserved when you migrate virtual machines to another host in the datacenter.
A USB device is available to only one powered-on virtual machine at a time. When a virtual
machine connects to a device, that device is no longer available to other virtual machines or
to the client computer. When you disconnect the device from the virtual machine or shut the
virtual machine down, the device returns to the client computer and becomes available to
other virtual machines that the client computer manages.
For example, when you connect a USB mass storage device to a virtual machine, it is
removed from the client computer and does not appear as a drive with a removable device.
When you disconnect the device from the virtual machine, it reconnects to the client
computer's operating system and is listed as a removable device.
9.6.8 USB 3.0 Device Limitations
USB 3.0 devices have the following requirements and limitations:
• The virtual machine that you connect the USB 3.0 device to must be configured with an xHCI
  controller and have a Linux guest operating system with a 2.6.35 or later kernel.
• You can connect only one USB 3.0 device operating at superspeed to a virtual machine at a
  time.
• USB 3.0 devices are available only for passthrough from a client computer to a virtual
  machine. They are not available for passthrough from an ESXi host to a virtual machine.
9.6.9 Avoiding Data Loss
Before you connect a device to a virtual machine, make sure the device is not in use on the
client computer.
If the vSphere Client disconnects from the vCenter Server or host, or if you restart or shut
down the client computer, the device connection breaks. It is best to have a dedicated client
computer for USB device use or to reserve USB devices connected to a client computer for
short-term use, such as updating software or adding patches to virtual machines. To
maintain USB device connections to a virtual machine for an extended time, use USB
passthrough from an ESXi host to the virtual machine.
9.7 Configure Fibre Channel NPIV Settings in the vSphere Web
Client
N-port ID virtualization (NPIV) provides the ability to share a single physical Fibre Channel
HBA port among multiple virtual ports, each with unique identifiers. This capability lets you
control virtual machine access to LUNs on a per-virtual machine basis.
Each virtual port is identified by a pair of world wide names (WWNs): a world wide port
name (WWPN) and a world wide node name (WWNN). These WWNs are assigned by
vCenter Server.
For detailed information on how to configure NPIV for a virtual machine, see vSphere
Storage.
NPIV support is subject to the following limitations:
• NPIV must be enabled on the SAN switch. Contact the switch vendor for information about
  enabling NPIV on their devices.
• NPIV is supported only for virtual machines with RDM disks. Virtual machines with regular
  virtual disks continue to use the WWNs of the host’s physical HBAs.
• The physical HBAs on the ESXi host must have access to a LUN using its WWNs in order for
  any virtual machines on that host to have access to that LUN using their NPIV WWNs. Ensure
  that access is provided to both the host and the virtual machines.
• The physical HBAs on the ESXi host must support NPIV. If the physical HBAs do not support
  NPIV, the virtual machines running on that host will fall back to using the WWNs of the
  host’s physical HBAs for LUN access.
• Each virtual machine can have up to 4 virtual ports. NPIV-enabled virtual machines are
  assigned exactly 4 NPIV-related WWNs, which are used to communicate with physical HBAs
  through virtual ports. Therefore, virtual machines can utilize up to 4 physical HBAs for
  NPIV purposes.
Prerequisites
• To edit the virtual machine’s WWNs, power off the virtual machine.
• Verify that the virtual machine has a datastore containing a LUN that is available to the
  host.
ESXi Hosts and Compatible Virtual Machine Hardware Versions

Host            Version 8          Version 7          Version 4          Compatible with vCenter Server Version
ESXi 5.0        Create, edit, run  Create, edit, run  Edit, run          vCenter Server 5.0
ESX/ESXi 4.x    Not supported      Create, edit, run  Create, edit, run  vCenter Server 4.x
ESX Server 3.x  Not supported      Not supported      Create, edit, run  VirtualCenter Server 2.x and later
Version 3 virtual machines are not supported on ESXi 5.0 hosts. To make full use of these
virtual machines, upgrade the virtual hardware.
Note
Virtual machine hardware version 4 might be listed as VM3 in documentation for earlier
versions of ESX/ESXi.
9.8 Managing Multi-Tiered Applications with vSphere vApp in the
vSphere Web Client
You can use VMware vSphere as a platform for running applications, in addition to using it
as a platform for running virtual machines. The applications can be packaged to run directly
on top of VMware vSphere. The format of how the applications are packaged and managed is
called vSphere vApp.
A vApp is a container, like a resource pool, and can contain one or more virtual machines. A
vApp also shares some functionality with virtual machines. A vApp can power on and power
off, and can also be cloned.
Each vApp has a specific summary page with the current status of the service and relevant
summary information, as well as operations on the service.
The distribution format for vApp is OVF.
Note
The vApp metadata resides in the vCenter Server's database, so a vApp can be distributed
across multiple ESXi hosts. This information can be lost if the vCenter Server database is
cleared or if a standalone ESXi host that contains a vApp is removed from vCenter Server.
You should back up vApps to an OVF package to avoid losing any metadata.
vApp metadata for virtual machines within vApps does not follow the snapshot semantics for
virtual machine configuration. So, vApp properties that are deleted, modified, or defined
after a snapshot is taken remain intact (deleted, modified, or defined) after the virtual
machine reverts to that snapshot or any prior snapshots.
You can use VMware Studio to automate the creation of ready-to-deploy vApps with prepopulated application software and operating systems. VMware Studio adds a network agent
to the guest so that vApps bootstrap with minimal effort. Configuration parameters specified
for vApps appear as OVF properties in the vCenter Server deployment wizard. For
information about VMware Studio and for download, see the VMware Studio developer page
on the VMware web site.
You can allocate CPU and memory resources for the new vApp using shares, reservations,
and limits.
Procedure
• Allocate CPU resources for this vApp.
  Shares: CPU shares for this vApp with respect to the parent’s total. Sibling vApps share
  resources according to their relative share values bounded by the reservation and limit.
  Select Low, Normal, or High, which specify share values respectively in a 1:2:4 ratio.
  Select Custom to give each vApp a specific number of shares, which express a proportional
  weight.
  Reservation: Guaranteed CPU allocation for this vApp.
  Reservation Type: Select the Expandable check box to make the reservation expandable. When
  the vApp is powered on, if the combined reservations of its virtual machines are larger
  than the reservation of the vApp, the vApp can use resources from its parent or ancestors.
  Limit: Upper limit for this vApp's CPU allocation. Select Unlimited to specify no upper
  limit.
• Allocate memory resources for this vApp.
  Shares: Memory shares for this vApp with respect to the parent’s total. Sibling vApps share
  resources according to their relative share values bounded by the reservation and limit.
  Select Low, Normal, or High, which specify share values respectively in a 1:2:4 ratio.
  Select Custom to give each vApp a specific number of shares, which express a proportional
  weight.
  Reservation: Guaranteed memory allocation for this vApp.
  Reservation Type: Select the Expandable check box to make the reservation expandable. When
  the vApp is powered on, if the combined reservations of its virtual machines are larger
  than the reservation of the vApp, the vApp can use resources from its parent or ancestors.
  Limit: Upper limit for this vApp's memory allocation. Select Unlimited to specify no upper
  limit.
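These vApp settings map to a ResourceConfigSpec in the vSphere API. A hedged pyVmomi sketch follows; the reservation, limit, and shares values are placeholders, and `vapp` is assumed to be an already-retrieved vim.VirtualApp object.

from pyVmomi import vim

def set_vapp_allocation(vapp):
    cpu = vim.ResourceAllocationInfo(
        reservation=1000,                       # MHz
        expandableReservation=True,             # Reservation Type: Expandable
        limit=-1,                               # Unlimited
        shares=vim.SharesInfo(level="normal"))
    mem = vim.ResourceAllocationInfo(
        reservation=1024,                       # MB
        expandableReservation=True,
        limit=-1,
        shares=vim.SharesInfo(level="normal"))
    spec = vim.ResourceConfigSpec(cpuAllocation=cpu, memoryAllocation=mem)
    vapp.UpdateConfig(name=None, config=spec)   # UpdateConfig is inherited from ResourcePool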
9.9 vCenter Solutions Manager
9.9.1 Monitoring Agents
The vCenter Solutions Manager displays the vSphere ESX Agent Manager agents that you
use to deploy and manage related agents on ESX hosts.
An administrator uses the solutions manager to keep track of whether a solution's agents are
working as expected. Outstanding issues are reflected by the solution's ESX Agent Manager
status and a list of issues.
When a solution's state changes, the solutions manager updates the ESX Agent Manager's
summary status and state. Administrators use this status to track whether the goal state is
reached.
The agency's health status is indicated by a specific color:
• Red. The solution must intervene for the ESX Agent Manager to proceed. For example, if a
  virtual machine agent is powered off manually on a compute resource and the ESX Agent
  Manager does not attempt to power on the agent. The ESX Agent Manager reports this action
  to the solution. The solution alerts the administrator to power on the agent.
• Yellow. The ESX Agent Manager is actively working to reach a goal state. The goal state can
  be enabled, disabled, or uninstalled. For example, when a solution is registered, its
  status is yellow until the ESX Agent Manager deploys the solutions agents to all the
  specified compute resources. A solution does not need to intervene when the ESX Agent
  Manager reports its ESX Agent Manager health status as yellow.
• Green. A solution and all its agents reached the goal state.
9.10 Monitoring vServices
A vService is a service or function that a solution provides to virtual machines and vApps. A
solution can provide one or more vServices. These vServices integrate with the platform and
are able to change the environment in which the vApp or virtual machine runs.
A vService is a type of service for a virtual machine and a vApp provided by a vCenter
extension. Virtual machines and vApps can have dependencies on vServices. Each
dependency is associated with a vService type. The vService type must be bound to a
particular vCenter extension that implements that vService type. This vService type is similar
to a virtual hardware device. For example, a virtual machine can have a networking device
that at deployment must be connected to a particular network.
The vService Manager allows a solution to connect to operations related to OVF templates:
• Importing OVF templates. Receive a callback when OVF templates with a vService dependency
  of a certain type are imported.
• Exporting OVF templates. Inserts OVF sections when a virtual machine is exported.
• OVF environment generation. Inserts OVF sections into the OVF environment at the power-on
  instance.
The vService Provider tab in the solution manager provides details for each vCenter
extension. This information allows you to monitor vService providers and list the virtual
machines or vApps to which they are bound.
9.10.1 Install the Client Integration Plug-In in the vSphere Web Client
The Client Integration Plug-in provides access to a virtual machine's console in the vSphere
Web Client, and provides access to other vSphere infrastructure tasks.
You use the Client Integration Plug-in to deploy OVF or OVA templates and transfer files
with the datastore browser. You can also use the Client Integration Plug-in to connect virtual
devices that reside on a client computer to a virtual machine.
You install the Client Integration Plug-in only once to connect virtual devices to virtual
machines that you access through an instance of the vSphere Web Client. You must restart
the browser after you install the plug-in.
If you install the Client Integration Plug-in from an Internet Explorer browser, you must first
disable Protected Mode. Internet Explorer identifies the Client Integration Plug-in as being on
the Internet instead of on the local intranet. In such cases, the plug-in does not install
correctly because Protected Mode is enabled for the Internet.
The Client Integration Plug-in also enables you to log in to the vSphere Web Client using
Windows session credentials.
For information about supported browsers and operating systems, see the vSphere Installation
and Setup documentation.
9.11 Using Snapshots To Manage Virtual Machines
With snapshots, you can preserve a baseline before diverging a virtual machine in the
snapshot tree.
The Snapshot Manager in the vSphere Web Client and the vSphere Client provide several
operations for creating and managing virtual machine snapshots and snapshot trees. These
operations let you create snapshots, restore any snapshot in the snapshot hierarchy, delete
snapshots, and more. You can create extensive snapshot trees that you can use to save the
virtual machine state at any specific time and restore the virtual machine state later. Each
branch in a snapshot tree can have up to 32 snapshots.
A snapshot preserves the following information:
• Virtual machine settings. The virtual machine directory, which includes disks that were
  added or changed after you took the snapshot.
• Power state. The virtual machine can be powered on, powered off, or suspended.
• Disk state. State of all the virtual machine's virtual disks.
• (Optional) Memory state. The contents of the virtual machine's memory.
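Taking a snapshot can also be done through the API. A minimal pyVmomi sketch, assuming `vm` is an already-retrieved vim.VirtualMachine; the snapshot name and description are placeholders.

def take_snapshot(vm, name="baseline", include_memory=True):
    # memory=True also captures the memory state; quiesce asks VMware Tools to
    # quiesce the guest file system (typically used together with memory=False).
    return vm.CreateSnapshot_Task(name=name,
                                  description="Point-in-time baseline",
                                  memory=include_memory,
                                  quiesce=False)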
9.11.1 The Snapshot Hierarchy
The Snapshot Manager presents the snapshot hierarchy as a tree with one or more branches.
The relationship between snapshots is like that of a parent to a child. In the linear process,
each snapshot has one parent snapshot and one child snapshot, except for the last snapshot,
which has no child snapshots. Each parent snapshot can have more than one child. You can
revert to the current parent snapshot or restore any parent or child snapshot in the snapshot
tree and create more snapshots from that snapshot. Each time you restore a snapshot and
take another snapshot, a branch, or child snapshot, is created.
Parent Snapshots: The first virtual machine snapshot that you create is the base parent
snapshot. The parent snapshot is the most recently saved version of the current state of the
virtual machine. Taking a snapshot creates a delta disk file for each disk attached to the
virtual machine and optionally, a memory file. The delta disk files and memory file are
stored with the base .vmdk file. The parent snapshot is always the snapshot that appears
immediately above the You are here icon in the Snapshot Manager. If you revert or restore a
snapshot, that snapshot becomes the parent of the You are here current state.
Note: The parent snapshot is not always the snapshot that you took most recently.

Child Snapshots: A snapshot that is taken of the same virtual machine after the parent
snapshot. Each child constitutes delta files for each attached virtual disk, and optionally a
memory file that points from the present state of the virtual disk (You are here). Each child
snapshot's delta files merge with each previous child snapshot until reaching the parent
disks. A child disk can later be a parent disk for future child disks.
The relationship of parent and child snapshots can change if you have multiple branches in
the snapshot tree. A parent snapshot can have more than one child. Many snapshots have no
children.
Important
Do not manually manipulate individual child disks or any of the snapshot configuration files
because doing so can compromise the snapshot tree and result in data loss. This restriction
includes disk resizing and making modifications to the base parent disk using vmkfstools.
9.11.2 Snapshot Behavior
Taking a snapshot preserves the disk state at a specific time by creating a series of delta disks
for each attached virtual disk or virtual RDM and optionally preserves the memory and
power state by creating a memory file. Taking a snapshot creates a snapshot object in the
Snapshot Manager that represents the virtual machine state and settings.
Each snapshot creates an additional delta .vmdk disk file. When you take a snapshot, the
snapshot mechanism prevents the guest operating system from writing to the base .vmdk file
and instead directs all writes to the delta disk file. The delta disk represents the difference
between the current state of the virtual disk and the state that existed at the time that you
took the previous snapshot. If more than one snapshot exists, delta disks can represent the
difference between each snapshot. Delta disk files can expand quickly and become as large as
the entire virtual disk if the guest operating system writes to every block of the virtual disk.
9.11.3 Snapshot Files
When you take a snapshot, you capture the state of the virtual machine settings and the
virtual disk. If you are taking a memory snapshot, you also capture the memory state of the
virtual machine. These states are saved to files that reside with the virtual machine's base
files.
9.11.3.1 Snapshot Files
A snapshot consists of files that are stored on a supported storage device. A Take Snapshot
operation creates .vmdk, -delta.vmdk, .vmsd, and .vmsn files. By default, the first and all
delta disks are stored with the base .vmdk file. The .vmsd and .vmsn files are stored in the
virtual machine directory.
Delta disk files: A .vmdk file to which the guest operating system can write. The delta disk
represents the difference between the current state of the virtual disk and the state that
existed at the time that the previous snapshot was taken. When you take a snapshot, the state
of the virtual disk is preserved, which prevents the guest operating system from writing to
it, and a delta or child disk is created.
A delta disk has two files, including a descriptor file that is small and contains
information about the virtual disk, such as geometry and child-parent relationship
information, and a corresponding file that contains the raw data.
Note: If you are looking at a datastore with the Datastore Browser in the vSphere Client, you
see only one entry to represent both files.
The files that make up the delta disk are referred to as child disks or redo logs. A child
disk is a sparse disk. Sparse disks use the copy-on-write mechanism, in which the virtual
disk contains no data in places, until copied there by a write operation. This optimization
saves storage space. A grain is the unit of measure in which the sparse disk uses the
copy-on-write mechanism. Each grain is a block of sectors that contain virtual disk data. The
default size is 128 sectors or 64KB.

Flat file: A -flat.vmdk file that is one of two files that comprises the base disk. The flat
disk contains the raw data for the base disk. This file does not appear as a separate file in
the Datastore Browser.

Database file: A .vmsd file that contains the virtual machine's snapshot information and is
the primary source of information for the Snapshot Manager. This file contains line entries,
which define the relationships between snapshots and between child disks for each snapshot.

Memory file: A .vmsn file that includes the active state of the virtual machine. Capturing
the memory state of the virtual machine lets you revert to a turned on virtual machine state.
With nonmemory snapshots, you can only revert to a turned off virtual machine state. Memory
snapshots take longer to create than nonmemory snapshots. The time the ESX host takes to
write the memory onto the disk is relative to the amount of memory the virtual machine is
configured to use.
A Take Snapshot operation creates .vmdk, -delta.vmdk, .vmsd, and .vmsn files.
vmname-number.vmdk and vmname-number-delta.vmdk
Snapshot files that represent the difference between the current state of the virtual disk and the state that existed at the time the previous snapshot was taken. The file name uses the following syntax, S1vm-000001.vmdk, where S1vm is the name of the virtual machine and the six-digit number, 000001, is based on the files that already exist in the directory. The number does not consider the number of disks that are attached to the virtual machine.
vmname.vmsd
Database of the virtual machine's snapshot information and the primary source of information for the Snapshot Manager.
vmname.Snapshotnumber.vmsn
Memory state of the virtual machine at the time you take the snapshot. The file name uses the following syntax, S1vm.snapshot1.vmsn, where S1vm is the virtual machine name and snapshot1 is the first snapshot.
Note: A .vmsn file is created each time you take a snapshot, regardless of the memory selection. A .vmsn file without memory is much smaller than one with memory.
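For reference, the entries in the table above can also be inspected programmatically. The following is a minimal sketch, assuming pyVmomi and a variable vm that holds a vim.VirtualMachine obtained from a connected ServiceInstance (as in the earlier sketch); the exact type strings reported for each file come from the vSphere API's layoutEx property and may vary by version.

def print_snapshot_files(vm):
    """Print the files vSphere associates with a VM, flagging snapshot artifacts.

    vm: a pyVmomi vim.VirtualMachine from a connected ServiceInstance.
    """
    # layoutEx.file lists every file backing the VM together with a type string;
    # snapshot-related entries typically report types such as "snapshotList",
    # "snapshotData", or "snapshotMemory", alongside the delta disk's
    # "diskDescriptor"/"diskExtent" pair.
    for f in vm.layoutEx.file:
        flag = "*" if "snapshot" in f.type or "-delta" in f.name or "-00000" in f.name else " "
        print("{} {:<16} {:>14} bytes  {}".format(flag, f.type, f.size, f.name))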
9.11.4 Snapshot Limitations
Snapshots can affect virtual machine performance and do not support some disk types or
virtual machines configured with bus sharing. Snapshots are useful as short-term solutions
for capturing point-in-time virtual machine states and are not appropriate for long-term
virtual machine backups.






VMware does not support snapshots of raw disks, RDM physical mode disks, or guest
operating systems that use an iSCSI initiator in the guest.
Virtual machines with independent disks must be powered off before you take a
snapshot. Snapshots of powered-on or suspended virtual machines with independent
disks are not supported.
Snapshots are not supported with PCI vSphere Direct Path I/O devices.
VMware does not support snapshots of virtual machines configured for bus sharing.
If you require bus sharing, consider running backup software in your guest operating
system as an alternative solution. If your virtual machine currently has snapshots
that prevent you from configuring bus sharing, delete (consolidate) the snapshots.
Snapshots provide a point-in-time image of the disk that backup solutions can use,
but Snapshots are not meant to be a robust method of backup and recovery. If the
files containing a virtual machine are lost, its snapshot files are also lost. Also, large
numbers of snapshots are difficult to manage, consume large amounts of disk space,
and are not protected in the case of hardware failure.
Snapshots can negatively affect the performance of a virtual machine. Performance
degradation is based on how long the snapshot or snapshot tree is in place, the depth
of the tree, and how much the virtual machine and its guest operating system have
changed from the time you took the snapshot. Also, you might see a delay in the
amount of time it takes the virtual machine to power-on. Do not run production
virtual machines from snapshots on a permanent basis.
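Several of the restrictions above can be detected before a snapshot is attempted. The following is a hypothetical pre-flight check written with pyVmomi; it assumes a variable vm holding a vim.VirtualMachine and covers only the device-level cases (physical-mode RDMs, independent disks, and SCSI bus sharing), not in-guest iSCSI initiators.

from pyVmomi import vim

def snapshot_preflight(vm):
    """Return a list of reasons why a snapshot of this VM may be unsupported."""
    problems = []
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk):
            backing = dev.backing
            # Physical-mode RDMs cannot be snapshotted.
            if isinstance(backing, vim.vm.device.VirtualDisk.RawDiskMappingVer1BackingInfo) \
                    and backing.compatibilityMode == "physicalMode":
                problems.append("%s is a physical-mode RDM" % dev.deviceInfo.label)
            # Independent disks require the VM to be powered off first.
            if getattr(backing, "diskMode", "").startswith("independent") \
                    and vm.runtime.powerState != "poweredOff":
                problems.append("%s is independent and the VM is not powered off"
                                % dev.deviceInfo.label)
        elif isinstance(dev, vim.vm.device.VirtualSCSIController):
            # Bus sharing rules out snapshots entirely.
            if dev.sharedBus != "noSharing":
                problems.append("%s uses SCSI bus sharing (%s)"
                                % (dev.deviceInfo.label, dev.sharedBus))
    return problems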
9.11.5 Managing Snapshots
You can review all snapshots for the active virtual machine and act on them by using the
Snapshot Manager.
After you take a snapshot, you can use the Revert to current snapshot command from the
virtual machine’s right-click menu to restore that snapshot at any time. If you have a series
of snapshots, you can use the Go to command in the Snapshot Manager to restore any parent
or child snapshot. Subsequent child snapshots that you take from the restored snapshot
create a branch in the snapshot tree. You can delete a snapshot from the tree in the Snapshot
Manager.
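The same operations are available through the API. As a rough equivalent of the Go to command, the sketch below walks a virtual machine's snapshot tree with pyVmomi and reverts to a snapshot by name; vm is again assumed to be a vim.VirtualMachine from a connected ServiceInstance, and find_snapshot is a helper defined here, not a vSphere API call.

from pyVim.task import WaitForTask

def find_snapshot(tree, name):
    """Depth-first search of a snapshot tree for a snapshot with the given name."""
    for node in tree:
        if node.name == name:
            return node.snapshot              # a vim.vm.Snapshot managed object
        found = find_snapshot(node.childSnapshotList, name)
        if found:
            return found
    return None

def go_to_snapshot(vm, name):
    """Revert the VM to the named snapshot (the Snapshot Manager's Go to command)."""
    if vm.snapshot is None:
        raise RuntimeError("%s has no snapshots" % vm.name)
    snap = find_snapshot(vm.snapshot.rootSnapshotList, name)
    if snap is None:
        raise RuntimeError("No snapshot named %r on %s" % (name, vm.name))
    WaitForTask(snap.RevertToSnapshot_Task())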
The Snapshot Manager window contains the following areas: Snapshot tree, Details region,
command buttons, Navigation region, and a You are here icon.
Snapshot tree
Displays all snapshots for the virtual machine.
You are here icon
Represents the current and active state of the virtual machine. The You are here icon is always selected and visible when you open the Snapshot Manager. You can select the You are here state to see how much space the node is using. Go to, Delete, and Delete all are disabled for the You are here state.
Go to, Delete, and Delete All
Snapshot options. (An API sketch for the delete operations follows this table.)
Details
Shows the snapshot name and description, the date you created the snapshot, and the disk space. The Console shows the power state of the virtual machine when a snapshot was taken. The Name, Description, and Created text boxes are blank if you do not select a snapshot.
Navigation
Contains buttons for navigating out of the dialog box. Close closes the Snapshot Manager, and the question mark icon opens the help system.
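The Delete and Delete all buttons map to snapshot removal calls in the API. A minimal sketch, assuming pyVmomi; the snapshot argument is a vim.vm.Snapshot object such as vm.snapshot.currentSnapshot or one located with the find_snapshot helper from the previous sketch.

from pyVim.task import WaitForTask

def delete_snapshot(snapshot, remove_children=False):
    """Delete (consolidate) one snapshot, committing its delta data to the parent disk.

    snapshot: a vim.vm.Snapshot object, for example vm.snapshot.currentSnapshot.
    remove_children: also remove every snapshot below this one in the tree.
    """
    WaitForTask(snapshot.RemoveSnapshot_Task(removeChildren=remove_children))

def delete_all_snapshots(vm):
    """Equivalent of the Snapshot Manager's Delete all button; consolidates every delta."""
    WaitForTask(vm.RemoveAllSnapshots_Task())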
9.12 Change Disk Mode to Exclude Virtual Disks from Snapshots in
the vSphere Web Client
You can set a virtual disk to independent mode to exclude the disk from any snapshots taken
of its virtual machine.
Prerequisites
Power off the virtual machine and delete any existing snapshots before you change the disk
mode. Deleting a snapshot involves committing the existing data on the snapshot disk to the
parent disk.
Required privileges:
• Virtual machine.State.Remove Snapshot
• Virtual machine.Configuration.Modify device settings
Procedure
1. Right-click the virtual machine and select Edit Settings.
   a. To locate a virtual machine, select a datacenter, folder, cluster, resource pool, host, or vApp.
   b. Click the Related Objects tab and click Virtual Machines.
2. On the Virtual Hardware tab, expand Hard disk, and select an independent disk mode option.
   Independent Persistent: Disks in persistent mode behave like conventional disks on your physical computer. All data written to a disk in persistent mode are written permanently to the disk.
   Independent Nonpersistent: Changes to disks in nonpersistent mode are discarded when you power off or reset the virtual machine. With nonpersistent mode, you can restart the virtual machine with a virtual disk in the same state every time. Changes to the disk are written to and read from a redo log file that is deleted when you power off or reset.
3. Click OK.
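The same change can be made through the API by editing the disk's backing. The sketch below is a hypothetical pyVmomi equivalent of the procedure above; it assumes vm is a powered-off vim.VirtualMachine with no existing snapshots, and the disk label "Hard disk 1" is a placeholder.

from pyVim.task import WaitForTask
from pyVmomi import vim

def set_disk_independent(vm, disk_label="Hard disk 1", mode="independent_persistent"):
    """Set one virtual disk to an independent mode so that snapshots exclude it.

    vm: a powered-off vim.VirtualMachine with no existing snapshots.
    disk_label: the device label shown in Edit Settings.
    mode: "independent_persistent" or "independent_nonpersistent".
    """
    disk = next(d for d in vm.config.hardware.device
                if isinstance(d, vim.vm.device.VirtualDisk)
                and d.deviceInfo.label == disk_label)
    disk.backing.diskMode = mode

    change = vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.edit,
        device=disk)
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change])))

For example, set_disk_independent(vm, "Hard disk 2", "independent_nonpersistent") puts the second disk into nonpersistent mode, so its changes are discarded at power off or reset.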
Restoring snapshots has the following effects:
• The current disk and memory states are discarded, and the virtual machine reverts to the disk and memory states of the parent snapshot.
• Existing snapshots are not removed. You can restore those snapshots at any time.
• If the snapshot includes the memory state, the virtual machine will be in the same power state as when you created the snapshot.
Virtual Machine Power State After Restoring a Snapshot (state when the parent snapshot is taken, and state after restoration):
• Powered on (includes memory): Reverts to the parent snapshot, and the virtual machine is powered on and running.
• Powered on (does not include memory): Reverts to the parent snapshot, and the virtual machine is powered off.
• Powered off (does not include memory): Reverts to the parent snapshot, and the virtual machine is powered off.
Virtual machines running certain kinds of workloads can take several minutes to resume responsiveness after reverting from a snapshot.
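The power-state behaviour in the table above can be observed after a revert. A minimal sketch, assuming pyVmomi and a vm that has a current snapshot; RevertToCurrentSnapshot_Task corresponds to the Revert to current snapshot command.

from pyVim.task import WaitForTask

def revert_to_current_snapshot(vm):
    """Revert to the VM's current parent snapshot and report the resulting power state."""
    if vm.snapshot is None or vm.snapshot.currentSnapshot is None:
        raise RuntimeError("%s has no current snapshot to revert to" % vm.name)
    WaitForTask(vm.RevertToCurrentSnapshot_Task())
    # A memory snapshot leaves the VM powered on after the revert; a nonmemory
    # snapshot leaves it powered off, as described in the table above.
    print("Power state after revert: %s" % vm.runtime.powerState)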
Revert to Snapshot
Note
vApp metadata for virtual machines in vApps does not follow the snapshot semantics for
virtual machine configuration. vApp properties that are deleted, modified, or defined after a
snapshot is taken remain intact (deleted, modified, or defined) after the virtual machine
reverts to that snapshot or any previous snapshots.