vSphere Installation and Setup Guide, vSphere Upgrade Guide, vCenter Server and Host Management Guide, vSphere VM Administration Guide, vSphere Host Profiles Guide, vSphere Networking Guide

2 Introduction to Networking 9
Networking Concepts Overview 9
Network Services 10
View Networking Information in the vSphere Client 10
View Network Adapter Information in the vSphere Client 11

TCP Segmentation Offload (TSO) allows a TCP/IP stack to emit large frames (up to 64 KB) even though the maximum transmission unit (MTU) of the interface is smaller. The network adapter then separates the large frame into MTU-sized frames and prepends an adjusted copy of the initial TCP/IP headers.

3 Setting Up Networking with vSphere Standard Switches 13
vSphere Standard Switches 13
Standard Port Groups 14
Port Group Configuration for VMs 14
VMkernel Networking Configuration 15
vSphere Standard Switch Properties 18

The default number of logical ports for a standard switch is 120. All port groups in a datacenter that are physically connected to the same network (in the sense that each can receive broadcasts from the others) are given the same label. Conversely, if two port groups cannot receive broadcasts from each other, they have distinct labels. For a port group to reach port groups located on other VLANs, the VLAN ID must be set to 4095 [e1: or same VLAN ID].

VLAN ID: If you enter 0 or leave the option blank, the port group can see only untagged (non-VLAN) traffic. If you enter 4095, the port group can see traffic on any VLAN while leaving the VLAN tags intact.

vMotion: both hosts must be in the same broadcast domain—that is, the same Layer 2 subnet. ESXi does not support VM migration between hosts in different broadcast domains because the migrated VM might require systems and resources that it would no longer have access to in the new network.
Even if your network configuration is set up as a high-availability environment or includes intelligent switches that can resolve the VM's needs across different networks, you might experience lag times as the Address Resolution Protocol (ARP) table updates and resumes network traffic for the VMs.

Choose the connection speed manually if the NIC and a physical switch might fail to negotiate the proper connection speed. Symptoms of mismatched speed and duplex include low bandwidth or no link connectivity.

4 Setting Up Networking with vSphere Distributed Switches 21
vSphere Distributed Switch Architecture 22
Configuring a vSphere Distributed Switch 22
Distributed Port Groups 27
Working with Distributed Ports 28
Private VLANs 29
Configuring vSphere Distributed Switch Network Adapters 31
Configuring VM Networking on a vSphere Distributed Switch 35

Properties configured at switch level: MTU size; Discovery Protocol (enable or disable; choose Cisco Discovery Protocol or Link Layer Discovery Protocol from the Type drop-down menu, and set Operation to Listen, Advertise, or Both).

Properties configured at port group level: number of ports; port binding. Port binding chooses when ports are assigned to VMs connected to this distributed port group:
- Select Static binding to assign a port to a VM when the VM connects to the distributed port group. This option is not available when the vSphere Client is connected directly to ESXi.
- Select Dynamic binding to assign a port to a VM the first time the VM powers on after it is connected to the distributed port group. Dynamic binding is deprecated in ESXi 5.0.
- Select Ephemeral for no port binding. This option is not available when the vSphere Client is connected directly to ESXi.

Upgrade distributed vSwitch: - Need reboot? - It's a wizard. The upgrade wizard lists the hosts associated with this vSphere distributed switch and whether or not they are compatible with the upgraded vSphere distributed switch version.
You can proceed with the upgrade only if all hosts are compatible with the new vSphere distributed switch version. Next to each incompatible host is the reason for the incompatibility.

Port group advanced settings: do a comparison between the two vSwitches, as a lot of additional info is shown on the distributed vSwitch.

Private VLANs are used to solve VLAN ID limitations and the waste of IP addresses in certain network setups. A private VLAN is identified by its primary VLAN ID. A primary VLAN ID can have multiple secondary VLAN IDs associated with it. Primary VLANs are Promiscuous, so that ports on a private VLAN can communicate with ports configured as the primary VLAN. Ports on a secondary VLAN can be either Isolated, communicating only with promiscuous ports, or Community, communicating with both promiscuous ports and other ports on the same secondary VLAN.

To use private VLANs between a host and the rest of the physical network, the physical switch connected to the host must be private VLAN-capable and configured with the VLAN IDs being used by ESXi for the private VLAN functionality. For physical switches using dynamic MAC+VLAN ID based learning, all corresponding private VLAN IDs must first be entered into the switch's VLAN database. Removing a primary private VLAN also removes all associated secondary private VLANs.

Adding a physical NIC: if you select an adapter that is attached to another switch, it is removed from that switch and reassigned to this vSphere distributed switch.

Migrating VM network (migrating a group of VMs). [e1: add comparison between dVS and vS. UI and feature comparison. Limitations of DVS.]

5 Managing Network Resources 37
vSphere Network I/O Control 37
TCP Segmentation Offload and Jumbo Frames 40
NetQueue and Networking Performance 42
DirectPath I/O 43

The iSCSI traffic resource pool shares do not apply to iSCSI traffic on a dependent hardware iSCSI adapter. Traffic shaping does not apply to dependent hardware iSCSI adapters either.
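The private VLAN port-type reachability rules described above (Promiscuous, Isolated, Community) can be modeled as a toy function — a sketch for intuition, not any VMware API:

```python
# Hypothetical helper modeling private-VLAN reachability for two ports in
# the same primary PVLAN. Port types follow the rules in the notes above.
PROMISCUOUS, ISOLATED, COMMUNITY = "promiscuous", "isolated", "community"

def pvlan_can_talk(type_a, sec_vlan_a, type_b, sec_vlan_b):
    """Return True if the two ports can communicate."""
    if PROMISCUOUS in (type_a, type_b):
        return True                 # promiscuous ports talk to everyone
    if ISOLATED in (type_a, type_b):
        return False                # isolated ports reach only promiscuous
    return sec_vlan_a == sec_vlan_b # community: same secondary VLAN only

assert pvlan_can_talk(ISOLATED, 201, PROMISCUOUS, 200)
assert not pvlan_can_talk(ISOLATED, 201, ISOLATED, 201)
assert pvlan_can_talk(COMMUNITY, 202, COMMUNITY, 202)
assert not pvlan_can_talk(COMMUNITY, 202, COMMUNITY, 203)
```

Note that two isolated ports cannot reach each other even on the same secondary VLAN — that is the point of isolation.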
Click Manage Port Groups to bring up the dialog box. From there, we can map the various port groups (only VM port groups, not VMkernel) to user-defined port groups.

To enable TSO at the VM level, you must replace the existing vmxnet or flexible virtual network adapters with enhanced vmxnet virtual network adapters. This replacement might result in a change in the MAC address of the virtual network adapter. TSO support through the enhanced vmxnet network adapter is available for VMs that run the following guest OSs: Microsoft Windows 2003 Enterprise Edition with Service Pack 2 (32 bit and 64 bit) [e1: no Win 08??], Red Hat Enterprise Linux 4 (64 bit), Red Hat Enterprise Linux 5 (32 bit and 64 bit), SUSE Linux Enterprise Server 10 (32 bit and 64 bit). [e1: the list of OSes is old. Needs update??]

TSO is enabled on a VMkernel interface by default. If TSO becomes disabled for a particular VMkernel interface, the only way to enable it is to delete that VMkernel interface and recreate it with TSO enabled.

The network must support jumbo frames end-to-end: VM NIC, vSwitch, pNIC, pSwitch. Jumbo frames of up to 9 KB (9000 bytes) are supported. Before enabling jumbo frames, check with your hardware vendor to ensure that your physical network adapter supports them. Inside the guest OS, configure the network adapter to allow jumbo frames.

NetQueue takes advantage of the ability of some network adapters to deliver network traffic to the system in multiple receive queues that can be processed separately, allowing processing to be scaled to multiple CPUs and improving receive-side networking performance. This is done per ESXi host. In the VMware vSphere CLI, use the command vicfg-advcfg --set true VMkernel.Boot.netNetQueueEnable

DirectPath I/O allows VM access to physical PCI functions on platforms with an I/O Memory Management Unit.
The following features are unavailable for VMs configured with DirectPath I/O:
- Hot adding and removing of virtual devices
- Suspend and resume
- Record and replay
- Fault tolerance
- High availability
- DRS (limited availability: the VM can be part of a cluster, but cannot migrate across hosts)
- Snapshots

The following features are available for VMs configured with DirectPath I/O only on Cisco Unified Computing Systems (UCS) through Cisco VM Fabric Extender (VM-FEX) distributed switches:
- vMotion
- Hot adding and removing of virtual devices
- Suspend and resume
- High availability
- DRS
- Snapshots

Adding a DirectPath device to a VM sets the memory reservation to the memory size of the VM.

How to add: the host must be restarted. Only when that is done can we add it to the VM, from inside the VM's settings. Taken from: http://www.petri.co.il/vmware-esxi4-vmdirectpath.htm

You can enable DirectPath I/O with vMotion for VMs in a datacenter on a Cisco UCS system that has at least one supported Cisco UCS VM Fabric Extender (VM-FEX) distributed switch. Source: http://infrastructureadventures.com/tag/directpath-io/ Notice that DirectPath I/O is shown as Active in the screenshots below.

6 Networking Policies 45
Load Balancing and Failover Policy 45
VLAN Policy 52
Security Policy 52
Traffic Shaping Policy 56
Resource Allocation Policy 59
Monitoring Policy 60
Port Blocking Policies 61
Manage Policies for Multiple Port Groups on a vSphere Distributed Switch 62

Incoming traffic is controlled by the load balancing policy on the physical switch. [e1: I thought vDS has egress & ingress?]

Beaconing is not supported with guest VLAN tagging. In some cases, you might lose standard switch connectivity when a failover or failback event occurs. This causes the MAC addresses used by VMs associated with that standard switch to appear on a different switch port than they previously did.
To avoid this problem, put your physical switch in portfast or portfast trunk mode.

Route based on IP hash: select an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash. [e1: examples of non-IP packets?? ICMP, RARP, OSPF, EGP, RIP all still IP right?] IP-based teaming requires that the physical switch be configured with EtherChannel. For all other options, EtherChannel should be disabled. [e1: elaborate why…??]

Link Status only: relies solely on the link status that the network adapter provides. This option detects failures such as cable pulls and physical switch power failures, but not configuration errors such as a physical switch port being blocked by spanning tree or misconfigured to the wrong VLAN, or cable pulls on the other side of a physical switch.

Beacon Probing: sends out and listens for beacon probes on all NICs in the team and uses this information, in addition to link status, to determine link failure. This option detects many of the failures mentioned above that are not detected by link status alone. Do not use beacon probing with IP-hash load balancing. [e1: why??]

Select Yes or No to notify switches in the case of failover. If you select Yes, whenever a virtual NIC is connected to the standard switch, or whenever that virtual NIC's traffic is routed over a different physical NIC in the team because of a failover event, a notification is sent over the network to update the lookup tables on the physical switches. In almost all cases, this is desirable for the lowest latency of failover occurrences and migrations with vMotion. Do not use this option when the VMs using the port group are using Microsoft Network Load Balancing (NLB) in unicast mode. No such issue exists with NLB running in multicast mode.

Select Yes or No to disable or enable failback.
This option determines how a physical adapter is returned to active duty after recovering from a failure. If failback is set to Yes, the adapter is returned to active duty immediately upon recovery, displacing the standby adapter that took over its slot, if any. If failback is set to No, a failed adapter is left inactive even after recovery, until another active adapter fails and requires its replacement. [e1: if we disable failback and it's a temporary failure (so vmnic #1 recovers), what happens if vmnic #2 then fails? Will the inactive vmnic #1 be set to active?? Basically, the answer depends on whether vmnic #1 became standby or unused.]

If you are using iSCSI multipathing, your VMkernel interface must be configured to have one active adapter and no standby adapters. [e1: that means must have 2 initiators and 2 IPs??]

Layer 2 is the data link layer. The three elements of the Layer 2 security policy are promiscuous mode, MAC address changes, and forged transmits. MAC address change: incoming packets dropped. Forged transmit: outgoing packets dropped.

In some situations, you might have a legitimate need for more than one adapter to have the same MAC address on a network—for example, if you are using Microsoft Network Load Balancing in unicast mode. When Microsoft Network Load Balancing is used in the standard multicast mode, adapters do not share MAC addresses.

The Forged Transmits setting affects traffic leaving a VM. Forged transmits will occur if the sender is permitted to make them, even if standard switches or a receiving VM does not permit MAC address changes. Similar to the MAC Address Changes option, with Forged Transmits set to Reject the guest OS does not detect that its virtual network adapter cannot send packets by using the impersonated MAC address. The ESXi host intercepts any packets with impersonated addresses before they are delivered, and the guest OS might assume that the packets are dropped.
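The two MAC-related checks of the Layer 2 security policy described above (MAC address changes filter inbound traffic, forged transmits filter outbound traffic) can be sketched as a toy model — illustrative only, not a VMware API:

```python
# Toy model of the Layer 2 security policy checks. Both compare a frame's
# MAC against the adapter's initial MAC; the policy decides whether a
# mismatch is tolerated.

def allow_inbound(mac_changes_accept, effective_mac, initial_mac):
    """MAC Address Changes policy: with Reject, inbound frames are dropped
    once the guest has set an effective MAC different from the initial one."""
    return mac_changes_accept or effective_mac == initial_mac

def allow_outbound(forged_transmits_accept, source_mac, initial_mac):
    """Forged Transmits policy: with Reject, outbound frames whose source
    MAC is not the initial MAC are dropped (impersonation blocked)."""
    return forged_transmits_accept or source_mac == initial_mac
```

For example, with both policies set to Reject, a guest that changes its MAC stops receiving frames and cannot send impersonated ones, and (as the notes say) it has no way to detect either drop.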
Traffic Shaping policy:

Average Bandwidth (Kbit/s): establishes the number of bits per second to allow across a port, averaged over time. This number is the allowed average load. [e1: averaged over how long??]

Peak Bandwidth (Kbit/s): the maximum number of bits per second to allow across a port when it is sending or receiving a burst of traffic. This caps the bandwidth a port uses whenever it is drawing on its burst bonus. [e1: no way to see the burst bonus??]

Burst Size (KB): the maximum number of bytes to allow in a single burst. If this parameter is set, a port might gain a burst bonus if it does not use all its allocated bandwidth. When the port needs more bandwidth than specified by the average bandwidth, it might be allowed to temporarily transmit data at a higher speed if a burst bonus is available. This parameter limits the number of bytes that can accumulate in the burst bonus and thus be transferred at a higher speed.

The policy here is applied to each virtual adapter attached to the port group, not to the standard switch or the port group as a whole. You can set it at the vSwitch level or the port group level.

7 Advanced Networking 67
Enable Internet Protocol Version 6 Support 67
VLAN Configuration 68
Working With Port Mirroring 68
Configure NetFlow Settings 72
Switch Discovery Protocol 72
Change the DNS and Routing Configuration 74
MAC Addresses 74
Mounting NFS Volumes 76

[e1: IPv6 design differences vs IPv4??] IPv6 allows up to 2^128 addresses, a massive increase from the 2^32 (about 4.3 billion) addresses possible with IPv4, and includes several other improvements.

IPv6 benefits:
- Eliminates the primary need for network address translation (NAT), which gained widespread deployment as an effort to alleviate IPv4 address exhaustion.
- Simplifies aspects of address assignment (stateless address autoconfiguration), network renumbering, and router announcements when changing Internet connectivity providers.
  o IPv6 subnet size has been standardized by fixing the size of the host identifier portion of an address to 64 bits, to facilitate an automatic mechanism for forming the host identifier from link-layer media addressing information (the MAC address).
- Network security is integrated into the design of the IPv6 architecture, and the IPv6 specification mandates IPsec as a fundamental interoperability requirement.
- The use of jumbograms may improve performance over high-MTU links: payloads of up to 4 GB, versus 64 KB in IPv4.
- Simplified processing for routers: the packet header and the process of packet forwarding have been simplified. Although IPv6 packet headers are at least twice the size of IPv4 packet headers, packet processing by routers is generally more efficient, thereby extending the end-to-end principle of Internet design. Specifically:
  o The packet header in IPv6 is simpler than that used in IPv4, with many rarely used fields moved to separate optional header extensions.
  o IPv6 routers do not perform fragmentation. IPv6 hosts are required to either perform path MTU discovery, perform end-to-end fragmentation, or send packets no larger than the IPv6 default minimum MTU size of 1280 octets.
  o The IPv6 header is not protected by a checksum; integrity protection is assumed to be assured by both link-layer and higher-layer (TCP, UDP, etc.) error detection. Therefore, IPv6 routers do not need to recompute a checksum when header fields (such as the time to live (TTL) or hop count) change.
  o The TTL field of IPv4 has been renamed to Hop Limit, reflecting the fact that routers are no longer expected to compute the time a packet has spent in a queue.
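To make the checksum point concrete, this is the per-hop work IPv6 removes: every IPv4 router that decrements TTL must recompute the ones'-complement header checksum. A short sketch of that standard computation:

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Ones'-complement sum of 16-bit words (RFC 1071 style), as IPv4
    routers must recompute whenever a header field such as TTL changes.
    The checksum field itself must be zeroed before calling."""
    if len(header) % 2:
        header += b"\x00"                      # pad odd-length input
    total = sum(int.from_bytes(header[i:i + 2], "big")
                for i in range(0, len(header), 2))
    while total > 0xFFFF:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Well-known example header (checksum field zeroed) → 0xB861
hdr = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
assert ipv4_header_checksum(hdr) == 0xB861
```

IPv6 drops this field entirely and leaves error detection to the link layer and to TCP/UDP.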
http://en.wikipedia.org/wiki/File:Ipv6_address_leading_zeros.svg
http://www.techsutram.com/2009/03/differences-ipv4-vs-ipv6.html

Stateless address autoconfiguration (SLAAC):
- IPv6 hosts can configure themselves automatically when connected to a routed IPv6 network using Internet Control Message Protocol version 6 (ICMPv6) router discovery messages. When first connected to a network, a host sends a link-local router solicitation multicast request for its configuration parameters; if configured suitably, routers respond to such a request with a router advertisement packet that contains network-layer configuration parameters.
- If IPv6 stateless address autoconfiguration is unsuitable for an application, a network may use stateful configuration with the Dynamic Host Configuration Protocol version 6 (DHCPv6), or hosts may be configured statically.
- Routers present a special case of requirements for address configuration, as they often are sources of autoconfiguration information, such as router and prefix advertisements. Stateless configuration for routers can be achieved with a special router renumbering protocol.

IPv6 is disabled by default. [e1: not supported in DVS?? Can't find where to set. Only at the host level and Standard Switch.]

VLAN benefits: it reduces network traffic congestion. iSCSI traffic requires an isolated network. [e1: but do we have to use VLAN??]

You can configure VLANs in ESXi using three methods: External Switch Tagging (EST), Virtual Switch Tagging (VST), and Virtual Guest Tagging (VGT).

With EST, all VLAN tagging of packets is performed on the physical switch. Host network adapters are connected to access ports on the physical switch. Port groups that are connected to the virtual switch must have their VLAN ID set to 0.

With VST, all VLAN tagging of packets is performed by the virtual switch before leaving the host. Host network adapters must be connected to trunk ports on the physical switch.
Port groups that are connected to the virtual switch must have an appropriate VLAN ID specified.

With VGT, all VLAN tagging is performed by the VM. VLAN tags are preserved between the VM networking stack and the external switch when frames are passed to and from virtual switches. Physical switch ports are set to trunk mode. NOTE: when using VGT, you must have an 802.1Q VLAN trunking driver installed in the VM.

See my blog for an entry on each of these: Port Mirroring, NetFlow, Switch Discovery Protocol.

Switch Discovery Protocol operation:
- Listen: ESXi detects and displays information about the associated Cisco switch port, but information about the vSphere distributed switch is not available to the Cisco switch administrator.
- Advertise: ESXi makes information about the vSphere distributed switch available to the Cisco switch administrator, but does not detect and display information about the Cisco switch.
- Both: ESXi detects and displays information about the associated Cisco switch and makes information about the vSphere distributed switch available to the Cisco switch administrator.

MAC addresses: you might need to set a MAC address for a virtual network adapter, as in the following cases:
- Virtual network adapters on different physical hosts share the same subnet and are assigned the same MAC address, causing a conflict.
- To ensure that a virtual network adapter always has the same MAC address.

To circumvent the limit of 256 virtual network adapters per physical machine and possible MAC address conflicts between VMs, system administrators can manually assign MAC addresses. By default, VMware uses the Organizationally Unique Identifier (OUI) 00:50:56 for manually generated addresses, but all unique manually generated addresses are supported.
You can set the address by adding the following line to a VM's configuration file: ethernet<number>.address = 00:50:56:XX:YY:ZZ where <number> refers to the number of the Ethernet adapter, XX is a valid hexadecimal number between 00 and 3F, and YY and ZZ are valid hexadecimal numbers between 00 and FF. The value for XX must not be greater than 3F, to avoid conflict with MAC addresses that are generated by the VMware Workstation and VMware Server products. The maximum value for a manually generated MAC address is therefore: ethernet<number>.address = 00:50:56:3F:FF:FF

You must also set the following option in the VM's configuration file: ethernet<number>.addressType="static"

Because ESXi VMs do not support arbitrary MAC addresses, you must use the above format. As long as you choose a unique value for XX:YY:ZZ among your hard-coded addresses, conflicts between the automatically assigned MAC addresses and the manually assigned ones should never occur.

Each network adapter manufacturer is assigned a unique three-byte prefix called an Organizationally Unique Identifier (OUI), which it can use to generate unique MAC addresses. VMware has the following OUIs: one for generated MAC addresses, one for manually set MAC addresses, and one for legacy VMs that is no longer used with ESXi.

The first three bytes of the MAC address that is generated for each virtual network adapter consist of the OUI. The MAC address-generation algorithm produces the other three bytes. The algorithm guarantees unique MAC addresses within a machine and attempts to provide unique MAC addresses across machines. The network adapters for each VM on the same subnet should have unique MAC addresses; otherwise, they can behave unpredictably. The algorithm puts a limit on the number of running and suspended VMs at any one time on any given host. It also does not handle all cases when VMs on distinct physical machines share a subnet. The VMware Universally Unique Identifier (UUID) generates MAC addresses that are checked for conflicts.
The generated MAC addresses are created by using three parts: the VMware OUI, the SMBIOS UUID for the physical ESXi machine, and a hash based on the name of the entity that the MAC address is being generated for. After the MAC address has been generated, it does not change unless the VM is moved to a different location, for example, to a different path on the same server. The MAC address in the configuration file of the VM is saved. All MAC addresses that have been assigned to network adapters of running and suspended VMs on a given physical machine are tracked. The MAC address of a powered off VM is not checked against those of running or suspended VMs. It is possible that when a VM is powered on again, it can acquire a different MAC address. This acquisition is caused by a conflict with a VM that was powered on when this virtual machine was powered off. 8 Networking Best Practices 77 When using passthrough devices with a Linux kernel version 2.6.20 or earlier, avoid MSI and MSI-X modes because these modes have significant performance impact. To physically separate network services and to dedicate a particular set of NICs to a specific network service, create a vSphere standard switch or vSphere distributed switch for each service. If this is not possible, separate network services on a single switch by attaching them to port groups with different VLAN IDs. [e1: check against PSO best practice] Configure all VMkernel network adapters to the same MTU. When several VMkernel network adapters are connected to vSphere distributed switches but have different MTUs configured, you might experience network connectivity problems. 
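The manually assigned MAC format described earlier (VMware OUI 00:50:56, first host byte XX no greater than 3F) can be checked with a short helper — the function name is illustrative, not part of any VMware tooling:

```python
import re

def valid_manual_mac(mac: str) -> bool:
    """Validate a manually assigned VM MAC per the notes above:
    OUI must be 00:50:56 and the XX byte must not exceed 0x3F
    (values above 3F are reserved for Workstation/Server generation)."""
    m = re.fullmatch(r"00:50:56:([0-9A-Fa-f]{2}):"
                     r"[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}", mac)
    return bool(m) and int(m.group(1), 16) <= 0x3F

assert valid_manual_mac("00:50:56:3F:FF:FF")      # documented maximum
assert not valid_manual_mac("00:50:56:40:00:00")  # XX above 3F
assert not valid_manual_mac("00:0C:29:12:34:56")  # not the manual OUI
```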
Difference between VSS, DVS and Nexus 1000V: http://virtualb.eu/wordpress/wp-content/uploads/2012/07/cisco_vmware_virtualizing_the_datacenter.pdf [Added by Benjamin T]

vSphere Security Summary

Areas of security: VM security, ESXi security, network security, storage security, vCenter security. Security ≠ compliance; tracking configuration changes in vSphere is part of compliance.

VM security: vmdk encryption.
ESXi security: lockdown mode, firewall, SSH, log review.
Network security: trust zones (no air gap, using technology like vShield), VLAN, Network I/O Control, PVLAN. But this is normally covered under networking.
Storage security: iSCSI CHAP, LUN masking and zoning.
vCenter security: roles.

1 Security for ESXi Systems 7
ESXi Architecture and Security Features 7
Security Resources and Information 13

ESXi provides additional VMkernel protection with the following features:

Memory Hardening: the ESXi kernel, user-mode applications, and executable components such as drivers and libraries are located at random, non-predictable memory addresses. Combined with the non-executable memory protections made available by microprocessors, this makes it difficult for malicious code to use memory exploits to take advantage of vulnerabilities.

Kernel Module Integrity: digital signing ensures the integrity and authenticity of modules, drivers, and applications as they are loaded by the VMkernel. Module signing allows ESXi to identify the providers of modules, drivers, or applications and whether they are VMware-certified.

Trusted Platform Module (TPM): each time ESXi boots, it measures the VMkernel and a subset of the loaded modules (VIBs) and stores the measurements in Platform Configuration Register (PCR) 20 of the TPM. This behavior is enabled by default and cannot be disabled. Hardware support for this feature is fully tested and supported by VMware and its OEM partners. Not all VIBs are measured as part of this process.
The VMware TPM/TXT feature that leverages the fully tested hardware support is suitable for a proof-of-concept that demonstrates monitoring of certain TPM PCR values, by alerting when any values change from one boot to the next. Third-party solutions could use this feature to detect changes to VIB measurements stored in these PCRs for the following cases: Corruption of the measured images Unexpected or unauthorized updates, or other types of changes to the measured images The ESXi firewall in ESXi 5.0 does not allow per-network filtering of vMotion traffic. Therefore, you must install rules on your external firewall to ensure that no incoming connections can be made to the vMotion socket. 2 Securing ESXi Configurations 15 Securing the Network with Firewalls 15 Securing VMs with VLANs 20 Securing Standard Switch Ports 25 Internet Protocol Security 26 Securing iSCSI Storage 30 Cipher Strength 32 VLAN hopping occurs when an attacker with authorized access to one VLAN creates packets that trick physical switches into transmitting the packets to another VLAN that the attacker is not authorized to access. Vulnerability to this type of attack usually results from a switch being misconfigured for native VLAN operation, in which the switch can receive and transmit untagged packets. VMware standard switches do not support the concept of a native VLAN. All data passed on these switches is appropriately tagged. However, because other switches in the network might be configured for native VLAN operation, VLANs configured with standard switches can still be vulnerable to VLAN hopping. If you plan to use VLANs to enforce network security, disable the native VLAN feature for all switches unless you have a compelling reason to operate some of your VLANs in native mode. Standard switches and VLANs can protect against the following types of attacks. MAC flooding Floods a switch with packets that contain MAC addresses tagged as having come from different sources. 
Many switches use a content-addressable memory table to learn and store the source address for each packet. When the table is full, the switch can enter a fully open state in which every incoming packet is broadcast on all ports, letting the attacker see all of the switch’s traffic. This state might result in packet leakage across VLANs. Although VMware standard switches store a MAC address table, they do not get the MAC addresses from observable traffic and are not vulnerable to this type of attack. 802.1q and ISL tagging attacks Force a switch to redirect frames from one VLAN to another by tricking the switch into acting as a trunk and broadcasting the traffic to other VLANs. VMware standard switches do not perform the dynamic trunking required for this type of attack and, therefore, are not vulnerable. Double-encapsulation attacks Occur when an attacker creates a double-encapsulated packet in which the VLAN identifier in the inner tag is different from the VLAN identifier in the outer tag. For backward compatibility, native VLANs strip the outer tag from transmitted packets unless configured to do otherwise. When a native VLAN switch strips the outer tag, only the inner tag is left, and that inner tag routes the packet to a different VLAN than the one identified in the now-missing outer tag. VMware standard switches drop any double-encapsulated frames. Multicast brute-force attacks Involve sending large numbers of multicast frames to a known VLAN almost simultaneously to overload the switch so that it mistakenly allows some of the frames to broadcast to other VLANs. VMware standard switches do not allow frames to leave their correct broadcast domain (VLAN) and are not vulnerable to this type of attack. Spanning-tree attacks Target Spanning-Tree Protocol (STP), which is used to control bridging between parts of the LAN. 
The attacker sends Bridge Protocol Data Unit (BPDU) packets that attempt to change the network topology, establishing the attacker as the root bridge. As the root bridge, the attacker can sniff the contents of transmitted frames. VMware standard switches do not support STP and are not vulnerable to this type of attack.

Random frame attacks involve sending large numbers of packets in which the source and destination addresses stay the same, but in which fields are randomly changed in length, type, or content. The goal of this attack is to force packets to be mistakenly rerouted to a different VLAN. VMware standard switches are not vulnerable to this type of attack.

Each virtual network adapter has its own MAC address assigned when the adapter is created. This address is called the initial MAC address. Although the initial MAC address can be reconfigured from outside the guest OS, it cannot be changed by the guest OS. In addition, each adapter has an effective MAC address that filters out incoming network traffic with a destination MAC address different from the effective MAC address. The guest OS is responsible for setting the effective MAC address, and typically it matches the effective MAC address to the initial MAC address.

The setting for the MAC Address Changes option affects traffic that a VM receives. This is different from Forged Transmits, which concerns sending out a packet (outbound traffic) with a different MAC address. When the option is set to Accept, ESXi accepts requests to change the effective MAC address to something other than the initial MAC address. When the option is set to Reject and the guest OS changes the effective MAC address, ESXi drops all incoming packets. The guest OS does not detect that the MAC address change was not honored. This protects the host against MAC impersonation.
The port that the virtual adapter used to send the request is disabled, and the virtual adapter does not receive any more frames until it changes the effective MAC address to match the initial MAC address. NOTE: the iSCSI initiator relies on being able to get MAC address changes from certain types of storage [e1: why?]. If you are using ESXi iSCSI and have iSCSI storage, set the MAC Address Changes option to Accept.

You might have a legitimate need for more than one adapter to have the same MAC address on a network—for example, if you are using Microsoft Network Load Balancing in unicast mode. When Microsoft Network Load Balancing is used in the standard multicast mode, adapters do not share MAC addresses. By default, promiscuous mode is turned off.

ESXi hosts support IPsec using IPv6. [e1: no IPv4? This could be because IPsec is built into IPv6.] When you set up IPsec on a host, you enable authentication and encryption of incoming and outgoing packets. When and how IP traffic is encrypted depends on how you set up the system's security associations and security policies.

A security association determines how the system encrypts traffic. When you create a security association, you specify the source and destination, the encryption parameters, and a name for the security association. [e1: the "how" is specified in the encryption parameters. Examples??] A security policy determines when the system should encrypt traffic. The security policy includes source and destination information, the protocol and direction of traffic to be encrypted, the mode (transport or tunnel) [e1: difference between transport and tunnel??], and the security association to use. You can add a security association using the vicfg-ipsec command. [e1: do we do this per pair of hosts?]

For iSCSI, ESXi supports only CHAP. ESXi does not support Kerberos, Secure Remote Protocol (SRP), or public-key authentication methods for iSCSI. Additionally, it does not support IPsec authentication and encryption.
In CHAP authentication, when the initiator contacts an iSCSI target, the target sends a predefined ID value and a random value, or key, to the initiator. The initiator creates a one-way hash value that it sends to the target. The hash contains three elements: a predefined ID value, the random value that the target sends, and a private value, or CHAP secret, that the initiator and target share. When the target receives the hash from the initiator, it creates its own hash value by using the same elements and compares it to the initiator’s hash. If the results match, the target authenticates the initiator. ESXi supports unidirectional and bidirectional CHAP authentication for iSCSI. In unidirectional CHAP authentication, the target authenticates the initiator, but the initiator does not authenticate the target. In bidirectional CHAP authentication, an additional level of security enables the initiator to authenticate the target. ESXi supports CHAP authentication at the adapter level, in which case only one set of authentication credentials can be sent from the host to all targets. It also supports per-target CHAP authentication, which enables you to configure different credentials for each target to achieve greater target refinement. [e1: so a secured LUN can be configured with a different password] Choosing not to enforce more stringent authentication can make sense if your iSCSI storage is housed in one location and you create a dedicated network or VLAN to service all your iSCSI devices. The iSCSI configuration is secure because it is isolated from any unwanted access, much as a Fibre Channel SAN is. Secure iSCSI Ports When you run iSCSI devices, ESXi does not open any ports that listen for network connections. [e1: does this apply only when we run iSCSI devices? What does it mean to run an iSCSI device?] This measure reduces the chances that an intruder can break into ESXi through spare ports and gain control over the host. 
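The CHAP exchange described above can be sketched in a few lines. CHAP (RFC 1994) computes the response as a one-way MD5 hash over the ID, the shared secret, and the challenge; the target recomputes the same hash from its own copy of the secret and compares:

```python
import hashlib

def chap_response(chap_id: int, secret: bytes, challenge: bytes) -> bytes:
    """One-way hash over the ID, shared CHAP secret, and the target's
    random challenge (RFC 1994 ordering). The secret never crosses the wire."""
    return hashlib.md5(bytes([chap_id]) + secret + challenge).digest()

# Target verifies by recomputing the hash with its copy of the secret:
secret = b"shared-chap-secret"
challenge = b"\x01\x02\x03\x04"
resp = chap_response(27, secret, challenge)
assert resp == chap_response(27, secret, challenge)            # match: authenticated
assert resp != chap_response(27, b"wrong-secret", challenge)   # mismatch: rejected
```

In bidirectional CHAP the same exchange simply runs a second time in the opposite direction, with the initiator issuing the challenge.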
Therefore, running iSCSI does not present any additional security risks at the ESXi end of the connection. If any security vulnerabilities exist in the iSCSI device software, your data can be at risk through no fault of ESXi. To lower this risk, install all security patches that your storage vendor provides. To ensure the protection of the data transmitted to and from external network connections, ESXi uses one of the strongest block ciphers available—256-bit AES block encryption. ESXi also uses 1024-bit RSA for key exchange. These encryption algorithms are the default for the following connections: vSphere Client connections to vCenter Server and to ESXi through the management interface; SDK connections to vCenter Server and to ESXi; management interface connections to VMs through the VMkernel; SSH connections to ESXi through the management interface. For SSH, vSphere supports only 256-bit and 128-bit AES ciphers. In addition to implementing the firewall, risks to the hosts are mitigated using other methods. ESXi runs only services essential to managing its functions, and the distribution is limited to the features required to run ESXi. By default, all ports not specifically required for management access to the host are closed. You must specifically open ports if you need additional services. By default, weak ciphers are disabled and all communications from clients are secured by SSL. The exact algorithms used for securing the channel depend on the SSL handshake. Default certificates created on ESXi use SHA-1 with RSA encryption as the signature algorithm. The Tomcat Web service, used internally by ESXi to support access by Web clients, has been modified to run only those functions required for administration and monitoring by a Web client. As a result, ESXi is not vulnerable to the Tomcat security issues reported in broader use. Insecure services such as FTP and Telnet are not installed, and the ports for these services are closed by default. 
3 Securing the Management Interface 33 General Security Recommendations 33 ESXi Firewall Configuration 34 ESXi Firewall Commands 39 The firewall also allows Internet Control Message Protocol (ICMP) pings and communication with DHCP and DNS (UDP only) clients. You can add supported services and management agents that are required to operate the host by adding rule set configuration files to the ESXi firewall directory /etc/vmware/firewall/. The default rule set configuration file is service.xml. The file contains firewall rules and describes each rule's relationship with ports and protocols. NOTE The behavior of the NFS Client rule set (nfsClient) is different from other rule sets. When the NFS Client rule set is enabled, all outbound TCP ports are open for the destination hosts in the list of allowed IP addresses. A rule set configuration file contains firewall rules and describes each rule's relationship with ports and protocols. The rule set configuration file can contain rule sets for multiple services. Rule set configuration files are located in the /etc/vmware/firewall/ directory. To add a service to the host security profile, you define the port rules for the service in a configuration file. Name the configuration file service_name.xml. Each set of rules for a service in the rule set configuration file contains the following information. A numeric identifier for the service, if the configuration file contains more than one service. A unique identifier for the rule set, usually the name of the service. For each rule, the file contains one or more port rules, each with a definition for direction, protocol, port type, and port number or range of port numbers. An indication of whether the service is enabled or disabled when the rule set is applied. [e1: so the service will be started automatically if the firewall port is opened] An indication of whether the rule set is required and cannot be disabled. 
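Based on the fields listed above, a minimal rule set configuration file might look like the following. The service name, port, and values here are hypothetical; compare with the service.xml shipped in /etc/vmware/firewall/ on your host for the exact schema:

```
<!-- /etc/vmware/firewall/myservice.xml (hypothetical example) -->
<ConfigRoot>
  <service id="0000">
    <id>myservice</id>
    <rule id="0000">
      <direction>inbound</direction>
      <protocol>tcp</protocol>
      <porttype>dst</porttype>
      <port>9000</port>
    </rule>
    <enabled>true</enabled>
    <required>false</required>
  </service>
</ConfigRoot>
```

After adding a file like this, the firewall settings must be refreshed (for example with esxcli network firewall refresh) before the new rule set takes effect.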
After you add a service or rule to the configuration file, you must refresh the firewall settings. See the section Example: Rule Set Configuration File. If different services have overlapping port rules, enabling one service might implicitly enable overlapping services. To minimize the effects of this behavior, you can specify which IP addresses are allowed to access each service on the host. The NFS Client rule set behaves differently than other ESXi firewall rule sets. ESXi configures NFS Client settings when you mount or unmount an NFS datastore. When you add or mount an NFS datastore, ESXi checks the state of the NFS Client (nfsClient) firewall rule set. If the NFS Client rule set is disabled, ESXi enables the rule set and disables the Allow All IP Addresses policy by setting the allowedAll flag to FALSE. The IP address of the NFS server is added to the allowed list of outgoing IP addresses. If the NFS Client rule set is enabled, the state of the rule set and the allowed IP address policy are not changed. The IP address of the NFS server is added to the allowed list of outgoing IP addresses. When you remove or unmount an NFS datastore, ESXi performs one of the following actions. If other NFS datastores remain mounted, the IP address of the unmounted NFS server is removed from the list of allowed outgoing IP addresses and the NFS Client rule set remains enabled. If no NFS datastores remain mounted, the IP address of the unmounted NFS server is removed from the list of allowed outgoing IP addresses and the NFS Client rule set is disabled. NOTE If you manually enable the NFS Client rule set or manually set the Allow All IP Addresses policy, either before or after you add an NFS datastore to the system, your settings are overridden when the last NFS datastore is unmounted. The NFS Client rule set is disabled when all NFS datastores are unmounted. 
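The nfsClient mount/unmount behavior described above can be modeled in a few lines (a simplification, not ESXi code; it assumes one datastore per NFS server IP):

```python
# Model of the nfsClient rule set lifecycle: first mount enables the rule set
# and turns off Allow All IP Addresses; last unmount disables the rule set.

class NfsClientRuleSet:
    def __init__(self):
        self.enabled = False
        self.allowed_all = True
        self.allowed_ips = set()
        self.mounted = set()

    def mount(self, server_ip):
        if not self.enabled:
            self.enabled = True
            self.allowed_all = False       # allowedAll flag set to FALSE
        self.allowed_ips.add(server_ip)    # server added to outgoing allow list
        self.mounted.add(server_ip)

    def unmount(self, server_ip):
        self.mounted.discard(server_ip)
        self.allowed_ips.discard(server_ip)
        if not self.mounted:               # last NFS datastore unmounted
            self.enabled = False

rs = NfsClientRuleSet()
rs.mount("192.0.2.10")
rs.mount("192.0.2.11")
rs.unmount("192.0.2.10")
assert rs.enabled and "192.0.2.11" in rs.allowed_ips
rs.unmount("192.0.2.11")
assert not rs.enabled      # rule set disabled when all datastores are unmounted
```

The model makes the NOTE above concrete: any manual state you set on the rule set lives only until the last unmount resets it.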
The settings described in this section apply only to service settings configured through the vSphere Client or applications created with the vSphere Web services SDK. Configurations made through other means, such as the ESXi Shell or configuration files in /etc/init.d/, are not affected by these settings. Start automatically if any ports are open, and stop when all ports are closed: the default setting for these services, and the one VMware recommends. If any port is open, the client attempts to contact the network resources pertinent to the service in question. If some ports are open, but the port for a particular service is closed, the attempt fails, but there is little drawback in such a case. If and when the applicable outgoing port is opened, the service begins completing its tasks. Start and stop with host: the service starts shortly after the host starts and stops shortly before the host shuts down. Much like Start automatically if any ports are open, and stop when all ports are closed, this option means that the service regularly attempts to complete its tasks, such as contacting the specified NTP server. If the port was closed but is subsequently opened, the client begins completing its tasks shortly thereafter. Start and stop manually: the host preserves the user-determined service settings, regardless of whether ports are open or not. When a user starts the NTP service, that service is kept running as long as the host is powered on. If the service is started and the host is powered off, the service is stopped as part of the shutdown process, but as soon as the host is powered on, the service is started again, preserving the user-determined state. NOTE The ESXi firewall automates when rule sets are enabled or disabled based on the service startup policy. When a service starts, its corresponding rule set is enabled. 
When a service stops, the rule set is disabled. 4 Authentication and User Management 41 Securing ESXi Through Authentication and Permissions 41 Managing vSphere Users 42 Managing vSphere Groups 45 Password Requirements 47 Assigning Permissions 48 Assigning Roles 58 Using Active Directory to Manage Users and Groups 63 Using vSphere Authentication Proxy 65 ESXi uses the Pluggable Authentication Modules (PAM) structure for authentication when users access the ESXi host using the vSphere Client. The PAM configuration for VMware services is located in /etc/pam.d/system-auth-generic, which stores paths to authentication modules. Changes to this configuration affect all host services. The reverse proxy in the VMware Host Agent process listens on ports 80 and 443. vSphere Client or vCenter Server users connect to the host agent through these ports. The host process receives the user name and password from the client and forwards them to the PAM module to perform the authentication. CIM transactions also use ticket-based authentication in connecting with the host process. If Active Directory authentication has been configured on the host, the same Windows domain users known to vCenter Server are available on the ESXi host. Do not create a user named ALL. Privileges associated with the name ALL might not be available to all users in some situations. For example, if a user named ALL has Administrator privileges, a user with ReadOnly privileges might be able to log in to the host remotely. This is not the intended behavior. [e1: this paragraph is confusing] By default, some versions of the Windows OS include the NT AUTHORITY\INTERACTIVE user in the Administrators group. When the NT AUTHORITY\INTERACTIVE user is in the Administrators group, all users you create on the vCenter Server system have the Administrator privilege. 
To avoid this, remove the NT AUTHORITY\INTERACTIVE user from the Administrators group on the Windows system where you run vCenter Server. Create a password that meets the length and complexity requirements. The host checks for password compliance using the default authentication plug-in, pam_passwdqc.so. If the password is not compliant, the following error appears: A general system error occurred: passwd: Authentication token manipulation error. To change the user’s ability to access ESXi through a command shell, select or deselect Grant shell access to this user. NOTE To be granted shell access, users must also have an Administrator role for an inventory object on the host. [e1: weird, because an inventory object can be just an empty folder?] You can remove the root user from the ESXi host. Don’t do it. Ramifications…? Users who are logged in and are removed from the domain keep their vSphere permissions until the next validation period. The default is every 24 hours. If you change a user’s name in the domain, the original user name becomes invalid in the vCenter Server system. If you change the name of a group, the original group becomes invalid after you restart the vCenter Server system. You can view, sort, and export lists of a host's local users and groups to a file in HTML, XML, Microsoft Excel, or CSV format. [e1: can’t export VC users/groups?] If you use Active Directory groups, make sure that they are security groups and not distribution groups. Permissions assigned to distribution groups are not enforced by vCenter Server. By default, ESXi enforces requirements for user passwords. When you create a password, include a mix of characters from four character classes: lowercase letters, uppercase letters, numbers, and special characters such as an underscore or dash. Your user password must meet the following length requirements. Passwords containing characters from one or two character classes must be at least eight characters long. 
Passwords containing characters from three character classes must be at least seven characters long. Passwords containing characters from all four character classes must be at least six characters long. NOTE An uppercase character that begins a password does not count toward the number of character classes used. A number that ends a password does not count toward the number of character classes used. [e1: the passwords bugoff and bugoff01 have the same character classes. So don’t put a number at the end, as they assume you’re just incrementing the password!] By default, ESXi uses the pam_passwdqc.so plug-in to set the rules that users must observe when creating passwords and to check password strength. The pam_passwdqc.so plug-in lets you determine the basic standards that all passwords must meet. By default, ESXi imposes no restrictions on the root password. However, when nonroot users attempt to change their passwords, the passwords they choose must meet the basic standards that pam_passwdqc.so sets. A valid password should contain a combination of as many character classes as possible. You can also use a passphrase, which is a phrase consisting of at least three words, each of which is 8 to 40 characters long. The root and vpxuser user accounts have the same access rights as any user assigned the Administrator role on all objects. Any operation that consumes storage space, such as creating a virtual disk or taking a snapshot, requires the Datastore.Allocate Space privilege on the target datastore, as well as the privilege to perform the operation itself. Each host and cluster has its own implicit resource pool that contains all the resources of that host or cluster. Deploying a VM directly to a host or cluster requires the Resource.Assign Virtual Machine to Resource Pool privilege. The list of privileges is the same for both ESXi and vCenter Server. [e1: this is wrong, as the list in vCenter has more objects?] 
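The class-count length rules above, including the leading-uppercase and trailing-digit exceptions, can be sketched roughly in Python. This is a simplification of the stated rules only; pam_passwdqc's actual checks are more involved:

```python
def min_length(password: str) -> int:
    """Minimum length per the class-count rules: 8 chars for 1-2 classes,
    7 for 3 classes, 6 for 4. A leading uppercase letter and a trailing
    digit do not count toward the class tally."""
    core = password
    if core and core[0].isupper():
        core = core[1:]
    if core and core[-1].isdigit():
        core = core[:-1]
    classes = sum([
        any(c.islower() for c in core),
        any(c.isupper() for c in core),
        any(c.isdigit() for c in core),
        any(not c.isalnum() for c in core),   # specials like _ or -
    ])
    return {0: 8, 1: 8, 2: 8, 3: 7, 4: 6}[classes]

def is_compliant(password: str) -> bool:
    return len(password) >= min_length(password)

assert not is_compliant("xiaxia")     # one class, 6 chars: needs 8
assert is_compliant("xia_pow2")       # trailing digit ignored; 2 classes, 8 chars
assert is_compliant("N3wP@ss")        # leading uppercase ignored; 4 classes, needs 6
```

The bugoff/bugoff01 note above falls out of the trailing-digit rule: both score the same class count, so appending digits buys no credit.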
[e1: is there a document that explains the vCenter privileges??] VMs inherit permissions from both the parent VM folder and the parent host, cluster, or resource pool simultaneously. To restrict a user’s privileges on a VM, you must set permissions on both the parent folder and the parent host, cluster, or resource pool for that VM. In this example, permissions are assigned to two different groups on two different objects. Role 1 can power on VMs. Role 2 can take snapshots of VMs. [e1: but cannot power on VMs?] Group A is granted Role 1 on VM Folder, with the permission set to propagate to child objects. Group B is granted Role 2 on VM B. User 1, who belongs to groups A and B, logs on. Because Role 2 is assigned at a lower point in the hierarchy than Role 1, it overrides Role 1 on VM B. User 1 can power on VM A, but not take snapshots. User 1 can take snapshots of VM B, but not power it on. For security reasons, you might not want to use the root user in the Administrator role. In this case, you can change permissions after installation so that the root user no longer has administrative privileges, or you can delete the root user’s access permissions altogether through the vSphere Client. [e1: this is NOT the same as deleting the root user itself. This is just disabling it. But what programs/services check for the existence of the root user?] If you do so, you must first create another permission at the root level that has a different user assigned to the Administrator role. Assigning the Administrator role to a different user helps you maintain security through traceability. The vSphere Client logs all actions that the Administrator role user initiates as events, providing you with an audit trail. If all administrators log in as the root user, you cannot tell which administrator performed an action. If you create multiple permissions at the root level—each associated with a different user or user group—you can track the actions of each administrator or administrative group. 
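The Role 1 / Role 2 example above boils down to one rule: the permission defined at the lowest (most specific) point in the hierarchy wins. A rough model (hypothetical paths and privilege names, not vSphere code):

```python
# Model of vSphere permission resolution: a grant on an object overrides
# a propagating grant inherited from an ancestor.

def effective_privileges(obj_path: str, grants: dict) -> set:
    """grants maps an inventory path to a set of privileges that
    propagates to children. The deepest matching grant wins."""
    best, best_depth = None, -1
    for path, privs in grants.items():
        if obj_path == path or obj_path.startswith(path + "/"):
            depth = path.count("/")
            if depth > best_depth:
                best, best_depth = privs, depth
    return best or set()

grants = {
    "VMFolder": {"power_on"},        # Role 1 on the folder, propagating
    "VMFolder/VM-B": {"snapshot"},   # Role 2 directly on VM B
}
assert effective_privileges("VMFolder/VM-A", grants) == {"power_on"}
assert effective_privileges("VMFolder/VM-B", grants) == {"snapshot"}  # overrides Role 1
```

Note that the override replaces rather than merges: VM B ends up with snapshot only, not snapshot plus power on, matching the example in the text.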
After you create an alternative Administrator user, you can assign a different role to the root user. To manage the host using vCenter Server, the new user you created must have full Administrator privileges on the host. vicfg commands do not perform an access check. Therefore, even if you limit the root user’s privileges, it does not affect what that user can do using the command-line interface commands. vpxuser is created when a host is attached to vCenter Server. vCenter Server has Administrator privileges on the host that it manages. However, the vCenter Server administrator cannot directly create, delete, or edit users and groups on the hosts. These tasks can only be performed by a user with Administrator permissions directly on each host. NOTE You cannot manage the vpxuser using Active Directory. CAUTION Do not change vpxuser in any way. Do not change its password. Do not change its permissions. The dcui user runs on hosts and acts with Administrator rights. This user’s primary purpose is to configure hosts for lockdown mode from the Direct Console User Interface (DCUI). This user acts as an agent for the direct console and must not be modified or used by interactive users. [e1: I guess this user is created because root can be deleted] Do not change the dcui user in any way and do not change its permissions. If the Active Directory domain is large, validation can take a long time, so adjust the validation settings accordingly. vSphere provides three default roles, and you cannot change the privileges associated with these roles. They are No Access, Read Only, and Administrator. Each subsequent default role includes the privileges of the previous role. For example, the Administrator role inherits the privileges of the Read Only role. Roles you create yourself do not inherit privileges from any of the default roles. You can create custom roles by using the role-editing facilities in the vSphere Client to create privilege sets that match your user needs. 
If you use the vSphere Client connected to vCenter Server to manage ESXi hosts, you have additional roles to choose from in vCenter Server. Also, the roles you create directly on a host are not accessible within vCenter Server. You can work with these roles only if you log in to the host directly from the vSphere Client. NOTE When you add a custom role and do not assign any privileges to it, the role is created as a Read Only role with three system-defined privileges: System.Anonymous, System.View, and System.Read. All roles permit the user to schedule tasks by default. Users can schedule only tasks they have permission to perform at the time the tasks are created. NOTE Changes to permissions and roles take effect immediately, even if the users involved are logged in. The exception is searches, where permission changes take effect after the user has logged out and logged back in. Users who are in the Active Directory group ESX Admins are automatically assigned the Administrator role. VM User sample: a set of privileges to allow the user to interact with a VM’s console, insert media, and perform power operations. Does not grant privileges to make virtual hardware changes to the VM. Privileges granted include all privileges for the scheduled tasks privileges group; selected privileges for the global items and VM privileges groups; no privileges for the folder, datacenter, datastore, network, host, resource, alarms, sessions, performance, and permissions privileges groups. VM Power User: VM User privileges plus the ability to make hardware changes to VMs and perform snapshot operations. Additional privileges granted: selected privileges for the datastore group. Datastore Consumer: allows the user to consume space on the datastores on which this role is granted. To perform a space-consuming operation, such as creating a virtual disk or taking a snapshot, the user must also have the appropriate VM privileges granted for these operations. Usually granted on a datastore or a folder of datastores. 
[e1: so what is missing is a Storage Admin role. Datastore Consumer is valid for a Server Admin, as the wizard to create a VM needs to consume a datastore] Network Consumer: allows the user to assign VMs or hosts to networks, if the appropriate permissions for the assignment are also granted on the VMs or hosts. Usually granted on a network or folder of networks. Removing a role from one vCenter Server system removes the role from all other vCenter Server systems in the group, even if you reassign permissions to another role on the current vCenter Server system. ESXi supports synchronizing time with an external NTPv3 or NTPv4 server that is compliant with RFC 5905 and RFC 1305. The Microsoft Windows W32Time service does not meet these requirements. To join AD, there are 2 techniques. Follow the second one so you can put the ESXi host into the right OU. vSphere Authentication Proxy enables ESXi hosts to join a domain without using Active Directory credentials. vSphere Authentication Proxy enhances security for PXE-booted hosts and hosts that are provisioned using Auto Deploy, by removing the need to store Active Directory credentials in the host configuration. I will install vSphere Authentication Proxy on the same machine as the associated vCenter Server, as it is a 1:1 binding and it is pretty light. You can install it on a different machine. [e1: add screenshot] It does not support IPv6. The vCenter Server can be on a host machine in an IPv4-only, IPv4/IPv6 mixed-mode, or IPv6-only network environment, but the machine that connects to the vCenter Server through the vSphere Client must have an IPv4 address. You do not have to install Auto Deploy on the same host machine as the vSphere Authentication Proxy service. When you install the vSphere Authentication Proxy service, the installer creates a domain account with appropriate privileges to run the authentication proxy service. 
The account name begins with the prefix CAM and has a 32-character, randomly generated password associated with it. The password is set to never expire. Do not change the account settings. Before you use the vSphere Authentication Proxy to connect ESXi to a domain, you must authenticate the vSphere Authentication Proxy server to ESXi. If you use Host Profiles to connect a domain with the vSphere Authentication Proxy server, you do not need to authenticate the server. The host profile authenticates the proxy server to ESXi. To authenticate ESXi to use the vSphere Authentication Proxy, export the server certificate from the vSphere Authentication Proxy system and import it to ESXi. You need only authenticate the server once. By default, ESXi must authenticate the vSphere Authentication Proxy server when using it to join a domain. Make sure that this authentication functionality is enabled at all times. To authenticate the vSphere Authentication Proxy server to ESXi, upload the proxy server certificate to ESXi. 5 Encryption and Security Certificates for ESXi and vCenter Server 71 Enable Certificate Checking and Verify Host Thumbprints 72 Generate New Certificates for ESXi 72 Replace a Default Host Certificate with a CA-Signed Certificate 73 Upload an SSL Certificate and Key Using HTTPS PUT 73 Upload an SSH Key Using HTTPS PUT 74 Upload an SSH Key Using a vifs Command 74 Configure SSL Timeouts 75 Modifying ESXi Web Proxy Settings 76 ESXi and vCenter Server support standard X.509 version 3 (X.509v3) certificates to encrypt session information sent over Secure Socket Layer (SSL) protocol connections between components. The default certificates are vulnerable to possible man-in-the-middle attacks. To prevent man-in-the-middle attacks and to fully use the security that certificates provide, certificate checking is enabled by default. The screenshot below shows where to enable it. 
The certificate consists of two files: the certificate itself (rui.crt) and the private-key file (rui.key). vCenter Server certificates are preserved across upgrades. If the host has Verify Certificates enabled, replacing the default certificate causes vCenter Server to stop managing the host. Manually match the thumbprint with the one shown in each ESXi console. To generate a new certificate, run the command /sbin/generate-certificates. You can use third-party applications to upload certificates. Applications that support HTTPS PUT operations work with the HTTPS interface that is included with ESXi. Put the files here. You can use authorized keys to log in to a host with SSH. [e1: how?? Just put the key in the SSH client??] You can upload authorized keys with HTTPS PUT. Lockdown mode does not apply to root users who log in using authorized keys. When you use an authorized key file for root user authentication, root users are not prevented from accessing a host with SSH when the host is in lockdown mode. Authorized keys allow you to authenticate remote access to a host. When users or scripts try to access a host with SSH, the key provides authentication without a password. With authorized keys you can automate authentication, which is useful when you write scripts to perform routine tasks. You can upload the following types of SSH keys to a host using HTTPS PUT: authorized keys file for the root user, DSA key, DSA public key, RSA key, RSA public key. [e1: we just need 1 key, not 5??] Do not modify the /etc/ssh/sshd_config file. Put the above files here: vifs --server hostname --username username --put filename /host/ssh_host_dsa_key_pub You can configure SSL timeouts for ESXi. Timeout periods can be set for two types of idle connections: the Read Timeout setting applies to connections that have completed the SSL handshake process with port 443 of ESXi; the Handshake Timeout setting applies to connections that have not completed the SSL handshake process with port 443 of ESXi. 
Both connection timeouts are set in milliseconds. Idle connections are disconnected after the timeout period. By default, fully established SSL connections have a timeout of infinity. ESXi Web Proxy. [e1: What is this??] Do not set up certificates using pass phrases. ESXi does not support pass phrases, also known as encrypted keys. 6 Lockdown Mode 81 Lockdown Mode Behavior 81 Lockdown Mode Configurations 82 Enable Lockdown Mode Using the vSphere Client 82 Enable Lockdown Mode from the Direct Console User Interface 83 Using the ESXi Shell 83 E1: My expectation of lockdown mode is that the ESXi host is completely locked down, so only the ESX admin can access it. To unlock it, you need to go via vCenter, as you can’t even log in to the ESXi host. This could create problems, as vCenter may lose access to the ESX host, and a reboot might not be acceptable because there are VMs running on it. That’s why you can still disable it if you have KVM or ILO or physical access to the box. Certainly you need the root password. Basically, anyone with the root password plus physical access is as good as an administrator. There are 4 components: Lockdown Mode: you can’t log in via the vSphere Client or vMA to ESXi. Can you still SSH? ESXi Shell: called Tech Support Mode in 4.x. This is the shell (prompt) where you type CLI commands. One of the commands is dcui, which brings up the Direct Console UI. SSH: you need this to access the ESXi Shell remotely. Direct Console UI: the yellow/grey menu on the ESXi console. From the ESXi Shell, you can bring it up by typing dcui. You can disable the service, which will prevent it. When you enable lockdown mode, no users other than vpxuser have authentication permissions remotely, nor can they perform operations against the host directly. Lockdown mode forces all operations to be performed through vCenter Server. When a host is in lockdown mode, you cannot run vSphere CLI commands from an administration server, from a script, or from vMA against the host. 
External software or management tools might not be able to retrieve or modify information from the ESXi host. Locally, the root user is still authorized to log in to the DCUI when lockdown mode is enabled. [e1: so the user has to have console access first, either via KVM or ILO. Since the user already has console access, this represents a way in, just in case vCenter can’t reach the ESX host. You can perform troubleshooting like restarting the management agents. If you disable the dcui service, then you can’t log in. You will get a reply that looks like this] If you enable or disable lockdown mode using the Direct Console User Interface (DCUI), permissions for users and groups on the host are discarded. To preserve these permissions, you must enable and disable lockdown mode using the vSphere Client connected to vCenter Server. [e1: I tested disabling lockdown mode from the Direct Console UI. vCenter did not get refreshed] There are 2 ways to enable Lockdown Mode: the vSphere Client and the DCUI. For the vSphere Client, the ESX host must be part of vCenter Server first before this option appears. The screenshot below shows that Lockdown Mode does not appear, as this ESXi host is a standalone one. For the DCUI, the ESX host can be standalone. Lockdown mode does not apply to root users who log in using authorized keys. When you use an authorized key file for root user authentication, root users are not prevented from accessing a host with SSH when the host is in lockdown mode. 7 Best Practices for VM and Host Security 87 VM Recommendations 87 Auto Deploy Security Considerations 92 Host Password Strength and Complexity 92 Copy and paste operations between the VM and the machine running the vSphere Client are disabled by default for hosts to prevent exposing sensitive data that has been copied to the clipboard. [e1: the VM must be running VMware Tools??] 
The informational messages sent by guest operating system processes are known as setinfo messages and typically contain name-value pairs that define VM characteristics or identifiers that the host stores (for example, ipaddress=10.17.87.224). The configuration file containing these name-value pairs is limited to a size of 1MB, which prevents attackers from staging a DoS attack by writing software that mimics VMware Tools and filling the host's memory with arbitrary configuration data, which consumes space needed by the VMs. If you require more than 1MB of storage for name-value pairs, you can change the value as required. You can also prevent the guest OS processes from writing any name-value pairs to the configuration file. To prevent the guest OS from writing into the configuration file, add the following line. In the Name column: isolation.tools.setinfo.disable In the Value column: true VM log file: VMware recommends saving 10 log files, each one limited to 100KB. These values are large enough to capture sufficient information to debug most problems that might occur. Each time an entry is written to the log, the size of the log is checked. If it is over the limit, the next entry is written to a new log. If the maximum number of log files exists, the oldest log file is deleted. To limit the log size, use a text editor to add or edit the following line in the .vmx file, where maximum_size is the maximum file size in bytes. log.rotateSize=maximum_size For example, to limit the size to around 100KB, enter 100000. To keep a limited number of log files, use a text editor to add or edit the following line in the .vmx file, where number_of_files_to_keep is the number of files the server keeps. log.keepOld=number_of_files_to_keep For example, to keep 10 log files and begin deleting the oldest ones as new ones are created, enter 10. VMware does not offer technical support for VM problems if VM logging has been disabled. So always have a log file for the VM. 
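Pulled together, the .vmx settings discussed above look like this, using the example values given in the text:

```
isolation.tools.setinfo.disable = "TRUE"
log.rotateSize = "100000"
log.keepOld = "10"
```

The first line blocks guest setinfo writes entirely; the other two cap each log at roughly 100KB and keep 10 rotated files.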
This logging traffic between the Primary and Secondary VMs is unencrypted and contains guest network and storage I/O data, as well as the memory contents of the guest OS. This traffic can include sensitive data such as passwords in plaintext. To avoid such data being divulged, ensure that this network is secured, especially to avoid "man-in-the-middle" attacks. Use a private network for FT logging traffic. The administrator (root) password and user passwords that are included with the host profile and answer file are MD5-hashed. Any other passwords associated with host profiles are in the clear. [e1: how to include a user password in the host profile??] Use the vSphere Authentication Service to set up Active Directory to avoid exposing the Active Directory password. If you set up Active Directory using host profiles, the passwords are not protected. [e1: try this] The files that contain the host profile and answer file information are stored on disk in an obfuscated form. The files can be retrieved only as part of the waiter.tgz file that is generated for each host. The raw files are not accessible through the web server. However, it is possible for malicious code to pretend to be a host and download the host's waiter.tgz file, which contains information that can be used to compromise the host. To greatly reduce Auto Deploy security risks, completely isolate the network where Auto Deploy is used. [e1: how to do this since it is on the management network?] In a high-security virtual infrastructure design, one requirement may be to reduce the number of entry points to the virtual infrastructure, for example by allowing only a single session to the virtual machine console. With the increasing awareness of and demand for security in virtual infrastructure, more organizations might want to apply this security setting. 1. Turn off the virtual machine. 2. Open the Configuration Parameters dialog of the VM to edit the advanced configuration settings. 3.
Add Remote.Display.maxConnections with a value of 1. 4. Power on the virtual machine. Presumably, setting this value to 0 would disallow console access entirely, preventing a VM admin from taking over the VM and compromising its security; the VM would then be accessible only by RDP. [Added by Benjamin T – add screenshot] vSphere Storage Guide 1 Introduction to Storage 11 Storage Virtualization 11 Supported Storage Adapters 12 Types of Physical Storage 13 Target and Device Representations 17 Viewing Storage Devices 18 Displaying Datastores 21 How VMs Access Storage 22 Comparing Types of Storage 23 These virtual controllers include BusLogic Parallel, LSI Logic Parallel, LSI Logic SAS, and VMware Paravirtual. These controllers are the only types of SCSI controllers that a VM can see and access. Each virtual disk that a VM can access through one of the virtual SCSI controllers resides on a vSphere VM File System (VMFS) datastore, an NFS-based datastore, or on a raw disk. [e1: so even a physical RDM appears like a vDisk] From the standpoint of the VM, each virtual disk appears as if it were a SCSI drive connected to a SCSI controller. Whether the actual physical disk device is being accessed through parallel SCSI, iSCSI, network, or Fibre Channel adapters on the host is transparent to the guest OS. Storage Adapter: for the Paths, the columns are runtime name [e1: does not survive reboot and may not be consistent], target, LUN number, Device, and LUN ID. [e1: difference between Device and LUN ID, as the columns look the same] Local storage: ESXi supports a variety of internal or external local storage devices, including SCSI, IDE, SATA, USB, and SAS storage systems. Regardless of the type of storage you use, your host hides the physical storage layer from virtual machines. NOTE You cannot use IDE/ATA or USB drives to store VMs. ESXi does not support the delegate user functionality that enables access to NFS volumes using nonroot credentials. Different storage vendors present the storage systems to ESXi hosts in different ways.
Some vendors present a single target with multiple storage devices or LUNs on it, while others present multiple targets with one LUN each. Targets that are accessed through the network have unique names that are provided by the storage systems. The iSCSI targets use iSCSI names, while Fibre Channel targets use World Wide Names (WWNs). A device, or LUN, is identified by its UUID name. If a LUN is shared by multiple hosts, it must be presented to all hosts with the same UUID. [e1: is it even possible to present different UUIDs??] You can use the esxcli --server=server_name storage core device list command to display all device names in the vSphere CLI. The output is similar to the following example: # esxcli --server=server_name storage core device list naa.number Display Name: DGC Fibre Channel Disk(naa.number) ... Other UIDs:vml.number vmhbaAdapter:CChannel:TTarget:LLUN vmhbaAdapter is the name of the storage adapter. The name refers to the physical adapter on the host, not to the SCSI controller used by the VMs. CChannel is the storage channel number. Software iSCSI adapters and dependent hardware adapters use the channel number to show multiple paths to the same target. TTarget is the target number. Target numbering is determined by the host and might change if the mappings of targets visible to the host change. Targets that are shared by different hosts might not have the same target number. LLUN is the LUN number that shows the position of the LUN within the target. The LUN number is provided by the storage system. If a target has only one LUN, the LUN number is always zero (0). For example, vmhba1:C0:T3:L1 represents LUN 1 on target 3 accessed through the storage adapter vmhba1 and channel 0. Local storage supports a cluster of VMs on a single host (also known as a cluster in a box). A shared virtual disk is required.
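The adapter:channel:target:LUN convention above is easy to take apart with standard tools; a minimal shell sketch using the example name from the text:

```shell
# Split an ESXi runtime path name (vmhbaAdapter:CChannel:TTarget:LLUN)
# into its parts, using the example name from the text above.
runtime="vmhba1:C0:T3:L1"

adapter=$(echo "$runtime" | cut -d: -f1)           # vmhba1
channel=$(echo "$runtime" | cut -d: -f2 | tr -d C) # 0
target=$(echo "$runtime" | cut -d: -f3 | tr -d T)  # 3
lun=$(echo "$runtime" | cut -d: -f4 | tr -d L)     # 1

echo "LUN $lun on target $target via $adapter channel $channel"
```

Remember from the e1 note above that runtime names do not survive a reboot, so never persist them as identifiers; use the naa/vml UIDs instead.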
For more information about this configuration, see the vSphere Resource Management documentation 2 Overview of Using ESXi with a SAN 25 ESXi and SAN Use Cases 26 Specifics of Using SAN Storage with ESXi 26 Making LUN Decisions 26 Choosing VM Locations 28 Layered Applications 29 Third-Party Management Applications 30 SAN Storage Backup Considerations 30 Two techniques: predictive and adaptive. [e1: this does not make sense as the array is spindle driven. But exams may ask??] Predictive: 1 Provision several LUNs with different storage characteristics. Example: LUN01 is RAID 10, LUN06 is RAID 5. 2 Create a VMFS datastore on each LUN, labeling each datastore according to its characteristics. 3 Create virtual disks for the VMs and place them on the appropriate datastores. 4 Use disk shares to distinguish high-priority from low-priority VMs. 5 Run the applications to determine whether VM performance is acceptable. NOTE Disk shares are relevant only within a given host. The shares assigned to VMs on one host have no effect on VMs on other hosts. [e1: it should be at datastore level, unless we don’t turn on Storage IO Control??] Adaptive: 1 Provision a large LUN, choosing one RAID type, with write caching enabled. 2 Create a VMFS on that LUN. 3 Create four or five virtual disks on the VMFS. 4 Run the applications to determine whether disk performance is acceptable. If not, create another LUN, preferably with a different RAID type. Reasons for an array-based solution: Array-based solutions usually result in more comprehensive statistics. With RDMs, data always takes the same path, which results in easier performance management. Security is more transparent to the storage administrator when you use an RDM and an array-based solution because with RDMs, VMs more closely resemble physical machines. If you do not intend to use RDMs, check the storage vendor documentation to see if operations on LUNs with VMFS volumes are supported.
If you use array operations on VMFS LUNs, carefully read the section on resignaturing [e1: see my notes] VADP features: Perform full, differential, and incremental image backup and restore of VMs. Perform file-level backup of VMs that use supported Windows and Linux OSs. [e1: restore process??] Ensure data consistency by using Microsoft Volume Shadow Copy Services (VSS) for VMs that run supported Microsoft Windows OSs. 3 Using ESXi with Fibre Channel SAN 33 Fibre Channel SAN Concepts 33 Using Zoning with Fibre Channel SANs 34 How VMs Access Data on a Fibre Channel SAN 35 Ports are identified in a number of ways. WWPN (World Wide Port Name) A globally unique identifier for a port that allows certain applications to access the port. The FC switches discover the WWPN of a device or host and assign a port address to the device. Port_ID (or port address) Within a SAN, each port has a unique port ID that serves as the FC address for the port. This unique ID enables routing of data through the SAN to that port. The FC switches assign the port ID when the device logs in to the fabric. The port ID is valid only while the device is logged on. Asymmetrical storage system Supports Asymmetric Logical Unit Access (ALUA). ALUA-compliant storage systems provide different levels of access per port. ALUA allows hosts to determine the states of target ports and prioritize paths. The host uses some of the active paths as primary while others as secondary. [e1: what if not all hosts agree on the path?] With ESXi hosts, use single-initiator zoning or single-initiator-single-target zoning. The latter is the preferred zoning practice. Using the more restrictive zoning prevents problems and misconfigurations that can occur on the SAN. [e1: so we end up with hundreds of zones.] Exam may ask this.
When a VM interacts with its virtual disk stored on a SAN, the following process takes place: 1 When the guest OS in a VM reads or writes to a SCSI disk, it issues SCSI commands to the virtual disk. 2 Device drivers in the VM’s OS communicate with the virtual SCSI controllers. 3 The virtual SCSI controller forwards the command to the VMkernel. 4 The VMkernel performs the following tasks. a Locates the file in the VMFS volume that corresponds to the guest VM disk. b Maps the requests for the blocks on the virtual disk to blocks on the appropriate physical device. c Sends the modified I/O request from the device driver in the VMkernel to the physical HBA. 5 The physical HBA performs the following tasks. a Packages the I/O request according to the rules of the FC protocol. b Transmits the request to the SAN. 6 Depending on the port the HBA uses to connect to the fabric, one of the SAN switches receives the request and routes it to the storage device that the host wants to access 4 Configuring Fibre Channel Storage 37 ESXi Fibre Channel SAN Requirements 37 Installation and Setup Steps 38 Configuring FCoE Adapters 39 N-Port ID Virtualization 41 Unless you are using diskless servers, do not set up the diagnostic partition on a SAN LUN. For multipathing to work properly, each LUN must present the same LUN ID number to all ESXi hosts. [e1: is it possible to present different LUNs?? I thought the LUN number is set at array level. Only possible when using 2 arrays. Test this at lab. Benjamin T: When using two Directors on the array it is possible to assign a different LUN number from each different Director, so LUNs visible to hosts connected to Director1 will not be visible to the same host when you connect it to Director2. EMC storage] Make sure the storage device driver specifies a large enough queue. You can set the queue depth for the physical HBA during system setup. [e1: need reboot?] On VMs running Microsoft Windows, increase the value of the SCSI TimeoutValue parameter to 60.
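In a Windows guest, this timeout lives in the disk class driver's registry key; a sketch of the change as a .reg fragment (dword 0x3c is 60 decimal; the guest needs a reboot for the driver to pick it up):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk]
"TimeoutValue"=dword:0000003c
```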
[e1: I thought only NFS needs 60 seconds] ESXi does not support FC connected tape devices. In FC, you cannot use multipathing software inside a VM to perform I/O load balancing to a single physical LUN. However, when your Microsoft Windows VM uses dynamic disks, this restriction does not apply. [e1: the VM does not even have an HBA??] When you use vMotion or DRS with an active-passive SAN storage device, make sure that all ESXi systems have consistent paths to all storage processors. [e1: how can they be inconsistent since only 1 SP owns the LUN??] Not doing so can cause path thrashing when a vMotion migration occurs. For active-passive storage arrays not listed in Storage/SAN Compatibility, VMware does not support storage port failover. In those cases, you must connect the server to the active port on the storage array. This configuration ensures that the LUNs are presented to the ESXi host. Having different models of the same HBA is supported, but a single LUN cannot be accessed through two different HBA types, only through the same type. Ensure that the firmware level on each HBA is the same. Set the timeout value for detecting a failover. To ensure optimal performance, do not change the default value. [e1: contradictory] Converged Network Adapters (CNAs) contain network and Fibre Channel functionalities on the same card. When such an adapter is installed, your host detects and can use both CNA components. You do not need to configure the hardware FCoE adapter to be able to use it. A software FCoE adapter uses the native FCoE protocol stack in ESXi for the protocol processing. The software FCoE adapter is used with a NIC that offers Data Center Bridging (DCB) and I/O offload capabilities. ESXi 5.0 supports a maximum of four software FCoE adapters. Follow these guidelines when you configure a network switch for a software FCoE environment: On the ports that communicate with your ESXi host, disable the Spanning Tree Protocol (STP).
Having STP enabled might delay the FCoE Initialization Protocol (FIP) response at the switch and cause an all paths down (APD) condition. FIP is the protocol FCoE uses to discover and initialize FCoE entities on the Ethernet. Turn on Priority-based Flow Control (PFC) and set it to AUTO. Before you activate the software FCoE adapters, you need to connect the VMkernel to physical FCoE NICs installed on your host. This essentially gives it the transport layer. Give it an IP address, but it won’t be used. To use the Software FCoE Adapter, the NIC used for FCoE must be bound as an uplink to a vSwitch which contains a VMkernel portgroup (vmk). It is Ethernet, after all. Since FCoE packets are exchanged in a VLAN, the VLAN ID must be set on the physical switch to which the adapter is connected, not on the adapter itself. The VLAN ID is automatically discovered during the FCoE Initialization Protocol (FIP) VLAN discovery process, so there is no need to set the VLAN ID manually on the host. Network Adapter Best Practices: If the network adapter has multiple ports, add each port to a separate vSwitch. This practice helps you to avoid an APD condition when a disruptive event, such as an MTU change, occurs. [e1: good idea. But FCoE is normally used on 10 GE, so we only have 2 cables. Are we saying 2 vSwitches?] Do not move a network adapter port from one vSwitch to another when FCoE traffic is active; if you must make this change, reboot your host afterwards. N-Port ID Virtualization (NPIV) is an ANSI T11 standard that describes how a single Fibre Channel HBA port can register with the fabric using several worldwide port names (WWPNs). This allows a fabric-attached N-port to claim multiple fabric addresses. Each address appears as a unique entity on the Fibre Channel fabric. WWNs uniquely identify such objects in the Fibre Channel fabric. When VMs have WWN assignments, they use them for all RDM traffic, so the LUNs pointed to by any of the RDMs on the VM must not be masked against its WWNs.
When VMs do not have WWN assignments, they access storage LUNs with the WWNs of their host’s physical HBAs. By using NPIV, however, a SAN administrator can monitor and route storage access on a per-VM basis. [e1: use case? Security?] The following section describes how this works. When a VM has a WWN assigned to it, the VM’s configuration file (.vmx) is updated to include a WWN pair (consisting of a World Wide Port Name, WWPN, and a World Wide Node Name, WWNN). [e1: which one is which??] As that VM is powered on, the VMkernel instantiates a virtual port (VPORT) on the physical HBA which is used to access the LUN. The VPORT is a virtual HBA that appears to the FC fabric as a physical HBA; that is, it has its own unique identifier, the WWN pair that was assigned to the VM. Each VPORT is specific to the VM; when the VM is powered off, the VPORT is destroyed on the host and no longer appears to the FC fabric. When a VM is migrated from one host to another, the VPORT is closed on the first host and opened on the destination host. If NPIV is enabled, WWN pairs (WWPN & WWNN) are specified for each VM at creation time. A minimum of 2 WWPNs are needed to support failover with NPIV. Typically only 1 WWNN is created for each VM. When a VM using NPIV is powered on, it uses each of these WWN pairs in sequence to try to discover an access path to the storage. The number of VPORTs that are instantiated equals the number of physical HBAs present on the host. A VPORT is created on each physical HBA that a physical path is found on. Each physical path is used to determine the virtual path that will be used to access the LUN. Note that HBAs that are not NPIV-aware are skipped in this discovery process because VPORTs cannot be instantiated on them. NPIV requirements: RDM only. Use HBAs of the same type, either all QLogic or all Emulex. If a host uses multiple physical HBAs as paths to the storage, zone all physical paths to the virtual machine.
This is required to support multipathing even though only one path at a time will be active. The switches in the fabric must be NPIV-aware. When configuring a LUN for NPIV access at the storage level, make sure that the NPIV LUN number and NPIV target ID match the physical LUN and target ID. Capabilities and limitations of the use of NPIV with ESXi: NPIV supports vMotion. When you use vMotion to migrate a VM, it retains the assigned WWN. [e1: because it is in the vmx file] If you migrate an NPIV-enabled VM to a host that does not support NPIV, the VMkernel reverts to using a physical HBA to route the I/O. [e1: make sure the RDM LUN is visible to the new host] If your FC SAN environment supports concurrent I/O on the disks from an active-active array, the concurrent I/O to two different NPIV ports is also supported. Because the NPIV technology is an extension to the FC protocol, it requires an FC switch and does not work on direct attached FC disks. When you clone a VM or template with a WWN assigned to it, the clones do not retain the WWN. [e1: interesting. What else is different when we clone?] No Storage vMotion. Disabling and then re-enabling the NPIV capability on an FC switch while VMs are running can cause an FC link to fail and I/O to stop. Not all storage devices are certified for all features and capabilities of ESXi. The tests/qualifications performed: basic connectivity (this configuration does not allow for multipathing or any type of failover), HBA failover, storage port failover, boot from SAN, direct connect. FC Arbitrated Loop (AL) is not supported.
MSCS Clustering 5 Modifying Fibre Channel Storage for ESXi 45 Testing ESXi SAN Configurations 45 General Setup Considerations for Fibre Channel SAN Arrays 46 EMC CLARiiON Storage Systems 46 EMC Symmetrix Storage Systems 47 IBM System Storage DS4800 Storage Systems 48 IBM Systems Storage 8000 and IBM ESS800 49 HP StorageWorks Storage Systems 49 Hitachi Data Systems Storage 50 Network Appliance Storage 50 LSI-Based Storage Systems 51 For all storage arrays, make sure that the following requirements are met: - LUNs must be presented to each HBA of each host with the same LUN ID number. - Unless specified for individual storage arrays, set the host type for LUNs presented to ESXi to Linux, Linux Cluster, or, if available, to vmware or esx. - If you are using vMotion, DRS, or HA, make sure that both source and target hosts for virtual machines can see the same LUNs with identical LUN IDs. The default multipathing policy for CLARiiON arrays that do not support ALUA is Most Recently Used. For CLARiiON arrays that support ALUA, the default multipathing policy is VMW_PSP_FIXED. The ESXi system sets the default policy when it identifies the array, because this array is an active-passive disk array. Source: http://virtualausterity.blogspot.be/2010/01/why-vmware-psa-is-helping-me-to-save.html This configuration provides two paths from each HBA, so that each element of the connection can fail over to a redundant path. The order of the paths in this configuration provides HBA and switch failover without the need to trigger SP failover. The storage processor that the preferred paths are connected to must own the LUNs. VMware PSPs Path Selection Plug-Ins (PSPs) run in conjunction with the VMware NMP and are responsible for choosing a physical path for I/O requests. The VMware NMP assigns a default PSP for every logical device based on the SATP associated with the physical paths for that device. You can override the default PSP. 
By default, the VMware NMP supports the following PSPs: Most Recently Used (MRU) Selects the path the ESX/ESXi host used most recently to access the given device. If this path becomes unavailable, the host switches to an alternative path and continues to use the new path while it is available. Fixed Uses the designated preferred path, if it has been configured. Otherwise, it uses the first working path discovered at system boot time. If the host cannot use the preferred path, it selects a random alternative available path. The host automatically reverts to the preferred path as soon as that path becomes available. Round Robin (RR) Uses a path selection algorithm that rotates through all available paths, enabling load balancing across the paths. 6 Booting ESXi from Fibre Channel SAN 53 Boot from SAN Benefits 53 Boot from Fibre Channel SAN Requirements and Considerations 54 Getting Ready for Boot from SAN 54 Configure Emulex HBA to Boot from SAN 55 Configure QLogic HBA to Boot from SAN 57 If you use boot from SAN, the benefits for your environment will include the following: - Cheaper servers. Servers can be more dense and run cooler without internal storage. - Easier server replacement. You can replace servers and have the new server point to the old boot location. - Less wasted space. Servers without local disks often take up less space. - Easier backup processes. You can back up the system boot images in the SAN as part of the overall SAN backup procedures. Also, you can use advanced array features such as snapshots on the boot image. - Improved management. Creating and managing the operating system image is easier and more efficient. - Better reliability. You can access the boot disk through multiple paths, which protects the disk from being a single point of failure.
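The three built-in PSPs correspond to the internal plug-in names that esxcli reports (VMW_PSP_MRU, VMW_PSP_FIXED, VMW_PSP_RR); a minimal shell sketch mapping one to the other, handy when reading raw NMP device output:

```shell
# Map NMP path selection plug-in IDs to their vSphere Client display names
psp_name() {
  case "$1" in
    VMW_PSP_MRU)   echo "Most Recently Used" ;;
    VMW_PSP_FIXED) echo "Fixed" ;;
    VMW_PSP_RR)    echo "Round Robin" ;;
    *)             echo "unknown PSP: $1" ;;
  esac
}

psp_name VMW_PSP_FIXED   # prints "Fixed"
```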
7 Best Practices for Fibre Channel Storage 59 Preventing Fibre Channel SAN Problems 59 Disable Automatic Host Registration 60 Optimizing Fibre Channel SAN Storage Performance 60 Fibre Channel SAN Configuration Checklist 61 The RAID group containing the ESXi LUNs should not include LUNs used by other servers that are not running ESXi. Make sure read/write caching is enabled. SAN storage arrays require continual redesign and tuning to ensure that I/O is load balanced across all storage array paths. To meet this requirement, distribute the paths to the LUNs among all the SPs to provide optimal load balancing. Monitoring indicates when it is necessary to rebalance the LUN distribution. NOTE Dynamic load balancing is not currently supported with ESXi. ESXi Configuration For all LUNs hosting clustered disks on active-passive arrays, use the Most Recently Used PSP. For LUNs on active-active arrays, you can use the Most Recently Used or Fixed PSP. With either active-passive or active-active arrays, you can use the Round Robin PSP. All FC HBAs must be of the same model. Set the following Software Advanced Settings for the host: Set Disk.UseLunReset to 1 Set Disk.UseDeviceReset to 0 More: http://www.yellow-bricks.com/2011/04/13/disk-usedevicereset-do-i-really-need-to-set-it/ 8 Using ESXi with iSCSI SAN 63 iSCSI SAN Concepts 63 How VMs Access Data on an iSCSI SAN 68 Software iSCSI Adapter A software iSCSI adapter is VMware code built into the VMkernel. It allows your host to connect to the iSCSI storage device through standard network adapters. The software iSCSI adapter handles iSCSI processing while communicating with the network adapter. With the software iSCSI adapter, you can use iSCSI technology without purchasing specialized hardware. Hardware iSCSI Adapter A hardware iSCSI adapter is a third-party adapter that offloads iSCSI and network processing from your host. Hardware iSCSI adapters are divided into categories.
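A sketch of applying those two advanced settings with the ESXi 5.x CLI (the option paths mirror the setting names above; run on the host, or via --server as in the earlier esxcli examples):

```
esxcli system settings advanced set --option /Disk/UseLunReset --int-value 1
esxcli system settings advanced set --option /Disk/UseDeviceReset --int-value 0
```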
Dependent Hardware iSCSI Adapter Depends on VMware networking, and iSCSI configuration and management interfaces provided by VMware. This type of adapter can be a card that presents a standard network adapter and iSCSI offload functionality for the same port. The iSCSI offload functionality depends on the host's network configuration to obtain the IP, MAC, and other parameters used for iSCSI sessions. An example of a dependent adapter is the iSCSI licensed Broadcom 5709 NIC. Independent Hardware iSCSI Adapter Implements its own networking and iSCSI configuration and management interfaces. An example of an independent hardware iSCSI adapter is a card that either presents only iSCSI offload functionality or iSCSI offload functionality and standard NIC functionality. The iSCSI offload functionality has independent configuration management that assigns the IP, MAC, and other parameters used for the iSCSI sessions. iSCSI Storage System Types ESXi supports different storage systems and arrays. The types of storage that your host supports include active-active, active-passive, and ALUA-compliant. Active-active storage system Allows access to the LUNs simultaneously through all the storage ports that are available without significant performance degradation. All the paths are active at all times, unless a path fails. Active-passive storage system A system in which one storage processor is actively providing access to a given LUN. The other processors act as backup for the LUN and can be actively providing access to other LUN I/O. I/O can be successfully sent only to an active port for a given LUN. If access through the active storage port fails, one of the passive storage processors can be activated by the servers accessing it. Asymmetrical storage system Supports Asymmetric Logical Unit Access (ALUA). ALUA-compliant storage systems provide different levels of access per port. ALUA allows hosts to determine the states of target ports and prioritize paths.
The host uses some of the active paths as primary while others as secondary. Virtual port storage system Allows access to all available LUNs through a single virtual port. These are active-active storage devices, but hide their multiple connections through a single port. ESXi multipathing does not make multiple connections from a specific port to the storage by default. Some storage vendors supply session managers to establish and manage multiple connections to their storage. These storage systems handle port failover and connection balancing transparently. This is often referred to as transparent failover. 9 Configuring iSCSI Adapters and Storage 69 ESXi iSCSI SAN Requirements 70 ESXi iSCSI SAN Restrictions 70 Setting LUN Allocations for iSCSI 70 Network Configuration and Authentication 71 Setting Up Independent Hardware iSCSI Adapters 71 Configuring Dependent Hardware iSCSI Adapters 72 Configuring Software iSCSI Adapter 74 Setting Up iSCSI Network 76 Using Jumbo Frames with iSCSI 82 Configuring Discovery Addresses for iSCSI Adapters 83 Configuring CHAP Parameters for iSCSI Adapters 84 Configuring Advanced Parameters for iSCSI 88 iSCSI Session Management 89 ESXi iSCSI SAN Restrictions - ESXi does not support iSCSI-connected tape devices. - You cannot use virtual-machine multipathing software to perform I/O load balancing to a single physical LUN. - ESXi does not support multipathing when you combine independent hardware adapters with either software or dependent hardware adapters. For CHAP authentication, enable it on the initiator and the storage system side. After authentication is enabled, it applies to all of the targets that are not yet discovered, but does not apply to targets that are already discovered. After the discovery address is set, the new targets discovered are exposed and can be used at that point.
Dependent Hardware iSCSI Considerations - When you use any dependent hardware iSCSI adapter, performance reporting for a NIC associated with the adapter might show little or no activity, even when iSCSI traffic is heavy. This behavior occurs because the iSCSI traffic bypasses the regular networking stack. - If you use a third-party virtual switch, for example Cisco Nexus 1000V DVS, disable automatic pinning. Use manual pinning instead, making sure to connect a VMkernel adapter (vmk) to an appropriate physical NIC (vmnic). - The Broadcom iSCSI adapter performs data reassembly in hardware, which has a limited buffer space. When you use the Broadcom iSCSI adapter in a congested network or under heavy load, enable flow control to avoid performance degradation. Flow control manages the rate of data transmission between two nodes to prevent a fast sender from overrunning a slow receiver. For best results, enable flow control at the end points of the I/O path, at the hosts and iSCSI storage systems. Broadcom iSCSI adapters do not support IPv6 and Jumbo Frames. The iSCSI adapter and physical NIC connect through a virtual VMkernel adapter, also called virtual network adapter or VMkernel port. You create a VMkernel adapter (vmk) on a vSphere switch (vSwitch) using 1:1 mapping between each virtual and physical network adapter. NOTE If you use separate vSphere switches, you must connect them to different IP subnets. Otherwise, VMkernel adapters might experience connectivity problems and the host will fail to discover iSCSI LUNs. An alternative is to add all NICs and VMkernel adapters to a single vSphere standard switch. In this case, you must override the default network setup and make sure that each VMkernel adapter maps to only one corresponding active physical adapter. 
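Once each vmk maps to a single active vmnic, the VMkernel adapters are bound to the iSCSI adapter. A sketch with the ESXi 5.x CLI (the adapter name vmhba33 and the vmk numbers are placeholder assumptions; list yours with `esxcli iscsi adapter list`):

```
# Bind each iSCSI-dedicated VMkernel adapter to the software iSCSI adapter
esxcli iscsi networkportal add --adapter vmhba33 --nic vmk1
esxcli iscsi networkportal add --adapter vmhba33 --nic vmk2
# Verify the port bindings
esxcli iscsi networkportal list --adapter vmhba33
```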
To configure iSCSI adapter: Required privilege: Host.Configuration.Storage Partition Configuration To list iSCSI sessions, run the following command: esxcli --server=server_name iscsi session list 10 Modifying iSCSI Storage Systems for ESXi 93 Testing ESXi iSCSI SAN Configurations 93 General Considerations for iSCSI SAN Storage Systems 94 EMC CLARiiON Storage Systems 94 EMC Symmetrix Storage Systems 95 Enable HP StorageWorks MSA1510i to Communicate with ESXi 95 HP StorageWorks EVA Storage Systems 96 NetApp Storage Systems 97 Dell EqualLogic Storage Systems 97 HP StorageWorks SAN/iQ Storage Systems 97 Dell PowerVault MD3000i Storage Systems 98 iSCSI Targets in vApps 98 EMC CLARiiON Storage Systems This is an active-passive disk array, so any related issues that apply to all active-passive disk arrays are relevant. In addition, keep in mind the following: To avoid the possibility of path thrashing, the default multipathing policy is Most Recently Used, not Fixed. The ESXi system sets the default policy when it identifies the storage system To boot from a SAN, choose the active storage processor for the boot LUN’s target in the HBA BIOS. If you use an iSCSI target in a virtual appliance, for example HP LeftHand P4000 VSA, the host should connect to the target through the software iSCSI adapter rather than a hardware iSCSI adapter. 11 Booting from iSCSI SAN 99 General Boot from iSCSI SAN Recommendations 99 Prepare the iSCSI SAN 100 Configure Independent Hardware iSCSI Adapter for SAN Boot 100 iBFT iSCSI Boot Overview 101 12 Best Practices for iSCSI Storage 107 Preventing iSCSI SAN Problems 107 Optimizing iSCSI SAN Storage Performance 108 Checking Ethernet Switch Statistics 111 iSCSI SAN Configuration Checklist 111 Using VLANs or VPNs does not provide a suitable solution to the problem of link oversubscription in shared configurations. 
VLANs and other virtual partitioning of a network provide a way of logically designing a network, but do not change the physical capabilities of links and trunks between switches. When storage traffic and other network traffic end up sharing physical connections, as they would with a VPN, the possibility for oversubscription and lost packets exists. The same is true of VLANs that share interswitch trunks. Performance design for a SAN must take into account the physical limitations of the network, not logical allocations.

Source: http://virtcloud.wordpress.com/2012/02/18/storage-network-design-and-setup-for-vsphere/

ESXi Configuration
Set the following Advanced Settings for the ESXi host:
- Set Disk.UseLunReset to 1
- Set Disk.UseDeviceReset to 0
- A multipathing policy of Most Recently Used must be set for all LUNs hosting clustered disks for active-passive arrays.
- A multipathing policy of Most Recently Used or Fixed may be set for LUNs on active-active arrays.
- Allow ARP redirection if the storage system supports transparent failover. [Ben T why?]

13 Working with Datastores 113 Understanding VMFS Datastores 114 NFS Datastores 127 Unmount VMFS or NFS Datastores 128 Rename VMFS or NFS Datastores 129 Group VMFS or NFS Datastores 129 Handling Storage Device Disconnections 130 Creating a Diagnostic Partition 133 Set Up Dynamic Disk Mirroring 134

A VMFS datastore holds virtual machine files, directories, symbolic links, RDM descriptor files, and so on. The datastore also maintains a consistent view of all the mapping information for these objects. Mapping information is called metadata. Metadata is updated each time you perform datastore or virtual machine management operations.
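The two Advanced Settings above can also be applied with esxcli. A minimal sketch, to be run on the ESXi host itself or remotely with --server:

```shell
# Set the disk reset behavior recommended above
esxcli system settings advanced set --option=/Disk/UseLunReset --int-value=1
esxcli system settings advanced set --option=/Disk/UseDeviceReset --int-value=0

# Verify the current values
esxcli system settings advanced list --option=/Disk/UseLunReset
esxcli system settings advanced list --option=/Disk/UseDeviceReset
```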
Examples of operations requiring metadata updates include the following:
- Creating, growing, or locking a virtual machine file
- Changing a file's attributes
- Powering a virtual machine on or off
- Creating or deleting a VMFS datastore
- Expanding a VMFS datastore
- Creating a template
- Deploying a virtual machine from a template
- Migrating a virtual machine with vMotion

When metadata changes are made in a shared storage environment, VMFS uses special locking mechanisms to protect its data and prevent multiple hosts from concurrently writing to the metadata.

SCSI Reservations
VMFS uses SCSI reservations on storage devices that do not support hardware acceleration. SCSI reservations lock an entire storage device while an operation that requires metadata protection is performed. After the operation completes, VMFS releases the reservation and other operations can continue. Because this lock is exclusive, excessive SCSI reservations by a host can cause performance degradation on other hosts that are accessing the same VMFS.

Atomic Test and Set (ATS)
For storage devices that support hardware acceleration, VMFS uses the ATS algorithm, also called hardware assisted locking. In contrast with SCSI reservations, ATS supports discrete locking per disk sector.

Perform a manual storage rescan each time you make one of the following changes:
■ Zone a new disk array on a SAN.
■ Create new LUNs on a SAN.
■ Change the path masking on a host.
■ Reconnect a cable.
■ Change CHAP settings (iSCSI only).
■ Add or remove discovery or static addresses (iSCSI only).
■ Add a single host to the vCenter Server after you have edited or removed from the vCenter Server a datastore shared by the vCenter Server hosts and the single host.
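A manual rescan can also be triggered from the command line. A hedged example; the adapter name vmhba33 is illustrative:

```shell
# Rescan all adapters for new devices
esxcli storage core adapter rescan --all

# Or rescan only one adapter, e.g. a software iSCSI vmhba
esxcli storage core adapter rescan --adapter=vmhba33

# Rescan for new VMFS volumes
vmkfstools -V
```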
Important If you rescan when a path is unavailable, the host removes the path from the list of paths to the device. The path reappears on the list as soon as it becomes available and starts working again.

Storage filters
Key                                              Filter Name
config.vpxd.filter.vmfsFilter                    VMFS Filter
config.vpxd.filter.rdmFilter                     RDM Filter
config.vpxd.filter.SameHostAndTransportsFilter   Same Host and Transports Filter
config.vpxd.filter.hostRescanFilter              Host Rescan Filter

Note If you turn off the Host Rescan Filter, your hosts continue to perform a rescan each time you present a new LUN to a host or a cluster.

NFS Datastores
ESXi can access a designated NFS volume located on a NAS server, mount the volume, and use it for its storage needs in the same way that you use VMFS datastores. ESXi supports the following shared storage capabilities on NFS volumes:
■ vMotion
■ VMware DRS and VMware HA
■ ISO images, which are presented as CD-ROMs to virtual machines
■ Virtual machine snapshots

When you work with NFS storage, the following considerations apply:
■ The maximum size of NFS datastores depends on the support that an NFS server provides. ESXi does not impose any limits on the NFS datastore size.
■ If you use non-ASCII characters to name datastores and virtual machines, make sure that the underlying NFS server offers internationalization support.

Note When you mount the same NFS volume on different hosts, make sure that the server and folder names are identical across the hosts. If the names do not match exactly, the hosts see the same NFS volume as two different datastores. This might result in a failure of such features as vMotion.

Unmount VMFS or NFS Datastores
When you unmount a datastore, it remains intact, but can no longer be seen from the hosts that you specify. The datastore continues to appear on other hosts, where it remains mounted. Before unmounting VMFS datastores, make sure that the following prerequisites are met:
■ No virtual machines reside on the datastore.
■ The datastore is not part of a datastore cluster.
■ The datastore is not managed by Storage DRS.
■ Storage I/O Control is disabled for this datastore.
■ The datastore is not used for vSphere HA heartbeating.

Note A datastore that is unmounted from some hosts while being mounted on others is shown as active in the Datastores and Datastore Clusters view.

Planned Device Removal
Planned device removal is the intentional disconnection of a storage device. You might plan to remove a device for a variety of reasons, such as upgrading your hardware or reconfiguring your storage devices. To perform an orderly removal and reconnection of a storage device, use the following procedure:
1 Migrate virtual machines from the device you plan to detach.
2 Unmount the datastore deployed on the device.
3 Detach the storage device. You can now perform a reconfiguration of the storage device by using the array console.
4 Reattach the storage device.
5 Mount the datastore and restart the virtual machines.

Set Up Dynamic Disk Mirroring
Change virtual machine settings to allow the use of dynamic disk mirroring:
a Right-click the virtual machine and select Edit Settings.
b Click the Options tab and under Advanced, select General.
c Click Configuration Parameters.
d Click Add Row and add the following parameters:
Name                                 Value
scsi#.returnNoConnectDuringAPD       True
scsi#.returnBusyOnNoConnectStatus    False

14 Raw Device Mapping 135 About Raw Device Mapping 135 Raw Device Mapping Characteristics 138 Create VMs with RDMs 140 Manage Paths for a Mapped Raw LUN 141

An RDM is a special mapping file in a VMFS volume that manages metadata for its mapped device. The mapping file is presented to the management software as an ordinary disk file, available for the usual file-system operations. To the virtual machine, the storage virtualization layer presents the mapped device as a virtual SCSI device. VMFS5 supports greater than 2TB disk size for RDMs in physical compatibility mode only.
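Creating a physical compatibility mode RDM like the one described above can be sketched with vmkfstools; the device identifier, datastore, and file names below are placeholders, not real values:

```shell
# -z creates a physical (pass-through) compatibility RDM;
# use -r instead for virtual compatibility mode
vmkfstools -z /vmfs/devices/disks/naa.xxx \
  /vmfs/volumes/datastore1/myvm/myvm_rdm.vmdk
```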
The following restrictions apply:
■ You cannot relocate RDMs larger than 2TB to datastores other than VMFS5.
■ You cannot convert RDMs larger than 2TB to virtual disks, or perform other operations that involve RDM to virtual disk conversion. Such operations include cloning.

Clustered VM access:

15 Solid State Disks Enablement 143 Benefits of SSD Enablement 143 Auto-Detection of SSD Devices 143 Tag Devices as SSD 144 Untag an SSD Device 145 Untag an Automatically Detected SSD Device 146 Tag Devices as Local 146 Identify SSD Devices 147 Identifying a Virtual SSD Device 148 Best Practices for SSD Devices 148

16 VMkernel and Storage 149 Storage APIs 150

17 Understanding Multipathing and Failover 153 Failover with Fibre Channel 153 Host-Based Failover with iSCSI 154 Array-Based Failover with iSCSI 156 Path Failover and VMs 157 Managing Multiple Paths 158 VMware Multipathing Module 159 Path Scanning and Claiming 161 Managing Storage Paths and Multipathing Plug-Ins 164

Virtual machine I/O might be delayed for up to sixty seconds while path failover takes place. These delays allow the SAN to stabilize its configuration after topology changes. In general, the I/O delays might be longer on active-passive arrays and shorter on active-active arrays.

iSCSI Failover
■ ESXi does not support multipathing when you combine an independent hardware adapter with software iSCSI or dependent iSCSI adapters in the same host.
■ Multipathing between software and dependent adapters within the same host is supported.
■ On different hosts, you can mix both dependent and independent adapters.

Failover with Software iSCSI
With software iSCSI, as shown on Host 2 of the Host-Based Path Failover illustration, you can use multiple NICs that provide failover and load balancing capabilities for iSCSI connections between your host and storage systems.
For this setup, because multipathing plug-ins do not have direct access to physical NICs on your host, you first need to connect each physical NIC to a separate VMkernel port. You then associate all VMkernel ports with the software iSCSI initiator using a port binding technique. As a result, each VMkernel port connected to a separate NIC becomes a different path that the iSCSI storage stack and its storage-aware multipathing plug-ins can use.

With this form of array-based failover, you can have multiple paths to the storage only if you use multiple ports on the ESXi host. These paths are active-active.

VMW_PSP_MRU
The host selects the path that it used most recently. When the path becomes unavailable, the host selects an alternative path. The host does not revert back to the original path when that path becomes available again. There is no preferred path setting with the MRU policy. MRU is the default policy for most active-passive storage devices. Displayed in the vSphere Client as the Most Recently Used (VMware) path selection policy.

VMW_PSP_FIXED
The host uses the designated preferred path, if it has been configured. Otherwise, it selects the first working path discovered at system boot time. If you want the host to use a particular preferred path, specify it through the vSphere Client. Fixed is the default policy for most active-active storage devices. Displayed in the vSphere Client as the Fixed (VMware) path selection policy.

VMW_PSP_RR
The host uses an automatic path selection algorithm rotating through all active paths when connecting to active-passive arrays, or through all available paths when connecting to active-active arrays. RR is the default for a number of arrays and can be used with both active-active and active-passive arrays to implement load balancing across paths for different LUNs. Displayed in the vSphere Client as the Round Robin (VMware) path selection policy.
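The PSP in effect for a device can be inspected and changed with esxcli. A hedged sketch; naa.xxx is a placeholder device identifier, substitute your own:

```shell
# Show the SATP and PSP currently claiming a device
esxcli storage nmp device list --device=naa.xxx

# Switch one device to Round Robin
esxcli storage nmp device set --device=naa.xxx --psp=VMW_PSP_RR

# Or change the default PSP for every device claimed by a given SATP
esxcli storage nmp satp set --satp=VMW_SATP_DEFAULT_AA --default-psp=VMW_PSP_RR
```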
The following considerations help you with multipathing:
- If no SATP is assigned to the device by the claim rules, the default SATP for iSCSI or FC devices is VMW_SATP_DEFAULT_AA. The default PSP is VMW_PSP_FIXED.
- When the system searches the SATP rules to locate a SATP for a given device, it searches the driver rules first. If there is no match, the vendor/model rules are searched, and finally the transport rules are searched. If no match occurs, NMP selects a default SATP for the device.
- If VMW_SATP_ALUA is assigned to a specific storage device, but the device is not ALUA-aware, no claim rule match occurs for this device. The device is claimed by the default SATP based on the device's transport type.
- The default PSP for all devices claimed by VMW_SATP_ALUA is VMW_PSP_MRU. The VMW_PSP_MRU selects an active/optimized path as reported by the VMW_SATP_ALUA, or an active/unoptimized path if there is no active/optimized path. This path is used until a better path is available (MRU). For example, if the VMW_PSP_MRU is currently using an active/unoptimized path and an active/optimized path becomes available, the VMW_PSP_MRU will switch the current path to the active/optimized one.
- If you enable VMW_PSP_FIXED with VMW_SATP_ALUA, the host initially makes an arbitrary selection of the preferred path, regardless of whether the ALUA state is reported as optimized or unoptimized. As a result, VMware does not recommend enabling VMW_PSP_FIXED when VMW_SATP_ALUA is used for an ALUA-compliant storage array. The exception is when you assign the preferred path to one of the redundant storage processor (SP) nodes within an active-active storage array. The ALUA state is irrelevant.
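The rule evaluation described above can be inspected directly on the host; a hedged example (ESXi 5.x required):

```shell
# List SATP claim rules: driver, vendor/model, and transport rules
esxcli storage nmp satp rule list

# List the core PSA claim rules
esxcli storage core claimrule list
```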
- By default, the PSA claim rule 101 masks Dell array pseudo devices with the MASK_PATH plug-in. Do not delete this rule unless you want to unmask these devices.

Common mistake: inconsistent path selection policy

18 Storage Hardware Acceleration 173 Hardware Acceleration Benefits 173 Hardware Acceleration Requirements 174 Hardware Acceleration Support Status 174 Hardware Acceleration for Block Storage Devices 174 Hardware Acceleration on NAS Devices 179 Hardware Acceleration Considerations 181

ESXi hardware acceleration supports the following array operations:
- Full copy, also called clone blocks or copy offload. Enables the storage arrays to make full copies of data within the array without having the host read and write the data. This operation reduces the time and network load when cloning virtual machines, provisioning from a template, or migrating with vMotion.
- Block zeroing, also called write same. Enables storage arrays to zero out a large number of blocks to provide newly allocated storage, free of previously written data. This operation reduces the time and network load when creating virtual machines and formatting virtual disks.
- Hardware assisted locking, also called atomic test and set (ATS). Supports discrete virtual machine locking without use of SCSI reservations. This operation allows disk locking per sector, instead of the entire LUN as with SCSI reservations.

The following list shows the supported NAS operations:
- File clone. This operation is similar to VMFS block cloning, except that NAS devices clone entire files instead of file segments.
- Reserve space. Enables storage arrays to allocate space for a virtual disk file in thick format.
- Extended file statistics. Enables storage arrays to accurately report space utilization for virtual machines.
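Whether a given block device actually supports these offloads can be checked per device; a hedged sketch, with naa.xxx as a placeholder identifier:

```shell
# VAAI status for a block device (supported/unsupported per primitive)
esxcli storage core device vaai status get --device=naa.xxx

# List the VAAI plug-ins loaded on the host
esxcli storage core plugin list --plugin-class=VAAI
```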
19 Storage Thin Provisioning 183 Storage Over-Subscription 183 Virtual Disk Thin Provisioning 183 Array Thin Provisioning and VMFS Datastores 186

20 Using Storage Vendor Providers 191 Vendor Providers and Storage Data Representation 191 Vendor Provider Requirements and Considerations 192 Storage Status Reporting 192 Register Vendor Providers 193 View Vendor Provider Information 193 Unregister Vendor Providers 194 Update Vendor Providers 194

21 VM Storage Profiles 195 Understanding Storage Capabilities 196 Understanding VM Storage Profiles 199

22 Using vmkfstools 205 vmkfstools Command Syntax 205 vmkfstools Options 206

vSphere Resource Management Guide vSphere Availability Guide vSphere Monitoring and Performance Guide vSphere Troubleshooting VMware vSphere Examples and Scenarios Guide

Iwan's Add-on
ESXi list of services that can be configured via the vSphere Client.
.vmx file description
Users and processes without privileges on a VM can connect or disconnect hardware devices, such as network adapters and CD-ROM drives. [e1: user = vCenter user??] To prevent a VM user or process from connecting or disconnecting the device from within the guest OS, add this line:
device_name.allowGuestConnectionControl = "false"
By default, Ethernet 0 is configured to disallow device disconnection. The only reason you might change this is if a prior administrator set ethernet0.allowGuestConnectionControl = "true".

Ben's Add-on
RAID 1 performance
Since all the data exist in two or more copies, each with its own hardware, the read performance can go up roughly as a linear multiple of the number of copies. That is, a RAID 1 array of two drives can be reading in two different places at the same time, though not all implementations of RAID 1 do this.[4] To maximize the performance benefits of RAID 1, independent disk controllers are recommended, one for each disk. Some refer to this practice as splitting or duplexing (for two-disk arrays) or multiplexing (for arrays with more than two disks).
When reading, both disks can be accessed independently and requested sectors can be split evenly between the disks.

RAID 5 performance
RAID 5 implementations suffer from poor performance when faced with a workload that includes many writes that are not aligned to stripe boundaries, or are smaller than the capacity of a single stripe. This is because parity must be updated on each write, requiring read-modify-write sequences for both the data block and the parity block. Random write performance is poor, especially at high concurrency levels common in large multi-user databases. The read-modify-write cycle requirement of RAID 5's parity implementation penalizes random writes by as much as an order of magnitude compared to RAID 0.[9] Performance problems can be so severe that some database experts have formed a group called BAARF — the Battle Against Any Raid Five.[10]

The read performance of RAID 5 is almost as good as RAID 0 for the same number of disks. Except for the parity blocks, the distribution of data over the drives follows the same pattern as RAID 0. The reason RAID 5 is slightly slower is that the disks must skip over the parity blocks.

RAID 6
Redundancy and data loss recovery capability: RAID 6 extends RAID 5 by adding an additional parity block; thus it uses block-level striping with two parity blocks distributed across all member disks.
Performance (speed): RAID 6 does not have a performance penalty for read operations, but it does have a performance penalty on write operations because of the overhead associated with parity calculations.

Source: http://blogs.vmware.com/vsphere/2012/06/troubleshooting-storage-performance-in-vsphere-part-2.html

Create an EVC Cluster
Create an EVC cluster to help ensure vMotion compatibility between the hosts in the cluster. When you create an EVC cluster, use one of the following methods:
■ Create an empty cluster, enable EVC, and move hosts into the cluster.
■ Enable EVC on an existing cluster.
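The RAID write penalties discussed above can be put into numbers with a back-of-envelope calculation. The spindle count, per-disk IOPS, and read percentage below are illustrative assumptions, not measurements; the penalty factors (1/2/4/6) are the commonly cited values for RAID 0/10/5/6.

```shell
#!/bin/sh
# Back-of-envelope front-end random IOPS for an array, given the RAID
# write penalty: every random write costs 'penalty' back-end I/Os.
disks=8          # spindles in the array (assumption)
disk_iops=150    # random IOPS per spindle, 10k SAS ballpark (assumption)
read_pct=70     # percentage of the workload that is reads (assumption)

raw=$((disks * disk_iops))        # aggregate back-end IOPS
write_pct=$((100 - read_pct))

# Front-end IOPS = raw * 100 / (read_pct + write_pct * penalty)
raid0=$((raw * 100 / (read_pct + write_pct * 1)))
raid10=$((raw * 100 / (read_pct + write_pct * 2)))
raid5=$((raw * 100 / (read_pct + write_pct * 4)))   # read-modify-write
raid6=$((raw * 100 / (read_pct + write_pct * 6)))   # two parity blocks

echo "RAID 0:  $raid0 IOPS"
echo "RAID 10: $raid10 IOPS"
echo "RAID 5:  $raid5 IOPS"
echo "RAID 6:  $raid6 IOPS"
```

With these assumptions the same 8 spindles sustain about 1200 front-end IOPS at RAID 0 but only about 631 at RAID 5 and 480 at RAID 6, which is the write-penalty pattern the quoted text describes.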
VMware recommends creating an empty EVC cluster as the simplest way of creating an EVC cluster with minimal disruption to your existing infrastructure.

Prerequisites
Before you create an EVC cluster, ensure that the hosts you intend to add to the cluster meet the requirements listed in EVC Requirements.

Procedure
1 Create an empty cluster, and enable EVC. Select the CPU vendor and EVC mode appropriate for the hosts you intend to add to the cluster. Other cluster features such as vSphere DRS and vSphere HA are fully compatible with EVC. You can enable these features when you create the cluster.
2 Select a host to move into the cluster.
3 If the host feature set is greater than the EVC mode that you have enabled for the EVC cluster, ensure that the cluster has no powered-on virtual machines.
■ Power off all the virtual machines on the host.
■ Migrate the host's virtual machines to another host using vMotion.
4 Move the host into the cluster. You can power on the virtual machines on the host, or migrate virtual machines into the cluster with vMotion, if the virtual machines meet CPU compatibility requirements for the cluster's EVC mode. Virtual machines running on hosts with more features than the EVC mode must be powered off before migration into the cluster.
5 Repeat Step 3 and Step 4 for each additional host that you want to move into the cluster.

Update Manager Capabilities
- Scan and remediate hosts (3.5, 4.0, 4.1, 5.0).
- Host extensions (such as EMC's PowerPath/VE). Install too or just patch?
- vCenter virtual appliances?
- Upgrade VMware Tools and upgrade VM hardware.
- Install and update the Cisco Nexus 1000V.
- Migrate ESX 4.x hosts across to ESXi. Unfortunately, because of the size at which the /boot partition was allocated in ESX 3.x, these hosts cannot be migrated to ESXi 5. Any ESX 4.x hosts that had previously been upgraded from ESX 3.x will not have the required minimum 350 MB of space in the /boot partition. In these cases a fresh install is required.
Features: Update Manager works with the Distributed Resource Scheduler (DRS), DPM, HA, and FT (?? It will disable FT??). With vSphere 5, the cluster can even calculate whether it can remediate multiple hosts at once while still satisfying its cluster constraints, speeding up the overall patching process.

Caveat
Patching the VUM VM itself will not be possible if the patch requires Windows to be rebooted.

VUM requires dbo permissions on the MSDB database during the install to create the required tables. You can and should remove the dbo permissions on the MSDB database after the installation of VUM is complete. They are not needed after installation, as with the vCenter Server setup.

vCenter Orchestrator vSphere Storage Appliance Command-Line Management in vSphere 5 for Service Console Users Installing and Configuring VMware Tools Guide Legacy Host Licensing with vCenter Server 5 Command-Line Installation and Upgrade of VMware vCenter Server 5 Configuring and Troubleshooting N-Port ID Virtualization

Cisco Nexus. What design materials? Is it included in the blueprint?