VMware® Virtual Infrastructure 3 Security Risk Assessment

This template is designed to describe to an IT Security Team what VMware Virtual Infrastructure 3 is and how implementation of an infrastructure that meets or exceeds the corporate security policies will be achieved.

By Gavin Jolliffe
xtravirt.com 2007

Table of Contents

1 Document Control
    1.1 Authorisation
    1.2 Document Control/Change History
    1.3 Document References
    1.4 Distribution List
    1.5 Terms and Abbreviations
2 Introduction
    2.1 Purpose of document
    2.2 Background
    2.3 Assumptions / Exclusions
    2.4 Issues & unknowns
    2.5 Constraints (Standards, Policies, Guidelines)
3 Virtual Infrastructure Risk Assessment Overview
    3.1 Introduction
    3.2 General Security Features
4 ESX Server Service Console
    4.1 Overview
    4.2 Risk Assessment
    4.3 Additional Best Practice Configuration
5 ESX Server Kernel (Virtualisation layer)
    5.1 Overview
    5.2 Risk Assessment
6 ESX Server Virtual Networking Layer
    6.1 Overview
    6.2 Risk Assessment
    6.3 Additional Best Practice configuration
7 Virtual Machines
    7.1 Overview
    7.2 Risk Assessment
    7.3 Additional Best Practice configuration
8 Virtual Storage
    8.1 Overview
    8.2 Risk Assessment
9 VirtualCenter
    9.1 Overview
    9.2 Risk Assessment
    9.3 Additional Best Practice Configuration

1 Document Control

Copyright in this document remains vested in <COMPANY NAME> and no copies may be made of it or any part of it except for the purpose of evaluation in confidence. The information contained in this document is confidential and is submitted on the understanding that it will be used only by the staff or consultants of <COMPANY NAME> and that, where external consultants are employed, the use of this information is restricted to use in relation to the business of the project. In particular, the contents of this document may not be disclosed in whole or in part to any other party without the prior written consent of <COMPANY NAME>.

1.1 Authorisation

Authorised by: <AUTHORISOR NAME>
Date: <DATE>

1.2 Document Control/Change History

Version    Date    Comment        Editor
Draft              In Progress

1.3 Document References

Title                                                          Author    Date        Version
Security Design of the VMware Infrastructure 3 Architecture    VMware    22/02/07    Not Specified
VMware Infrastructure 3 Security Hardening                     VMware    21/02/07    Not Specified
Server Configuration Guide                                     VMware    25/09/06    20060925
Providing LUN Security                                         VMware    10/03/06    Not Specified

Note: Content from referenced documents has been quoted or paraphrased throughout this document.

1.4 Distribution List

Title                Date      Version
<NAME / POSITION>    <DATE>    <VERSION NO.>

1.5 Terms and Abbreviations

Term/Abbreviation    Definition
DoS                  Denial of Service attack
MAC                  Media Access Control
SSH                  Secure Shell
VC                   VMware VirtualCenter
VI                   VMware Virtual Infrastructure
VMM                  Virtual Machine Monitor

2 Introduction

2.1 Purpose of document

This document evaluates the possible risks, and the proposed countermeasures, of implementing the project using VMware Virtual Infrastructure virtualisation software. The document is focused on virtualisation and is intended to give security personnel enough detail to confirm that current security standards are met or exceeded, so that the design of this project can proceed in an assured manner.

2.2 Background

<CLIENT AND/OR PROJECT BACKGROUND>

2.3 Assumptions / Exclusions

This assessment excludes the security risks of a physical site attack.

2.4 Issues & unknowns

<COMPLETE AS APPLICABLE>

2.5 Constraints (Standards, Policies, Guidelines)

<COMPLETE AS APPLICABLE>

3 Virtual Infrastructure Risk Assessment Overview

3.1 Introduction

VMware® ESX Server installs directly on server hardware, or “bare metal”, and inserts a virtualisation layer between the hardware and the operating system. ESX Server partitions a physical server into multiple secure and portable virtual machines that can run side by side on the same physical server. Each virtual machine represents a complete system (with processors, memory, networking, storage and BIOS), so that Windows, Linux, Solaris and NetWare operating systems and software applications run in a virtualised environment without any modification. The bare metal architecture gives ESX Server complete control over the server resources allocated to each virtual machine and provides near-native virtual machine performance and enterprise-class scalability.
Virtual machines have built-in high availability, resource management and security features that can provide better service levels to software applications than static physical environments.

Figure: VMware ESX Server architecture. Source: (VMware) Server Configuration Guide

From a security perspective, VMware Infrastructure consists of several major components:

- Virtualisation layer, consisting of the VMkernel and the virtual machine monitor
- Virtual machines
- ESX Server Service Console
- ESX Server virtual networking layer
- Virtual storage
- VirtualCenter

For each component, risks have been assessed independently to allow a methodical approach to identifying risks and countermeasures.

Note: To assess overall relative risk, the countermeasures should be read in conjunction with each other.

Risk assessments are addressed in sub-sections that follow the component list above. Each sub-section provides an overview and a risk assessment, and includes any additional best practice configurations.

3.1.1 Definitions

The following definitions are used as reference points within the document.

Potential Impact Definition    Description
Loss of Confidentiality        Impact of unauthorised disclosure of sensitive information (e.g. contravening privacy legislation).
Loss of Integrity              Impact if system or data integrity is lost through unauthorised changes to the data or system.
Loss of Availability           Impact to system functionality and operational effectiveness.

3.2 General Security Features

3.2.1 Product Overview – VMware ESX

VMware ESX Server 3 presents a generic x86 platform by virtualising four key hardware components: processor, memory, disk, and network. An operating system is then installed into this virtualised platform. The virtualisation layer, or VMkernel, is a kernel designed by VMware specifically to run virtual machines. It controls the hardware utilised by ESX Server hosts and schedules the allocation of hardware resources among the virtual machines.
A Service Console, running a modified version of Red Hat Enterprise Linux 3, provides a local management interface and API to the ESX kernel.

Because the VMkernel is fully dedicated to supporting virtual machines and is not used for other purposes, the interface to the VMkernel is strictly limited to the API required to manage virtual machines. There are no public interfaces to the VMkernel, and it cannot execute arbitrary code.

The VMkernel alternates among all the virtual machines on the host in running the virtual machine instructions on the processor. Every time a virtual machine’s execution is stopped, a context switch occurs. During the context switch the processor register values are saved and the new context is loaded. When a given virtual machine’s turn comes around again, the corresponding register state is restored.

Each virtual machine has an associated virtual machine monitor (VMM). The VMM uses binary translation to modify the guest operating system kernel code so that it can run in a less-privileged processor ring. This is analogous to what a Java virtual machine does using just-in-time translation. Additionally, the VMM virtualises a chip set for the guest operating system to run on. The device drivers in the guest cooperate with the VMM to access the devices in the virtual chip set. The VMM passes requests to the VMkernel to complete the device virtualisation and support the requested operation.

The following outlines the key security features of the product:

- Compatibility with SAN security practices. VMware Infrastructure enforces security policies with LUN zoning and LUN masking.
- Implementation of secure networking features. VLAN tagging enhances network security by tagging and filtering network traffic on VLANs, and Layer 2 network security policies enforce security for virtual machines at the Ethernet layer.
- Integration with Microsoft® Active Directory. VMware Infrastructure bases access controls on existing Microsoft Active Directory authentication mechanisms.
- Custom roles and permissions. VMware Infrastructure enhances security and flexibility with user-defined roles, which can be managed in a granular way.
- Resource pool access control and delegation. VMware Infrastructure secures resource allocation at different levels, and VM management can be delegated accordingly.
- Audit trails. VMware Infrastructure maintains a record of significant configuration changes and the administrator who initiated each one. Reports for event tracking can be exported.
- Session management. VMware Infrastructure enables discovery and, if necessary, termination of VirtualCenter user sessions.
- Vulnerability response. VMware has implemented internal processes to ensure that VMware products meet the highest standards for security. The VMware Security Response Policy (www.vmware.com/vmtn/technology/security/security_response.html) documents VMware’s commitments for resolving possible vulnerabilities in VMware products.
- Security certification. VMware ESX Server 2.5.0 and VirtualCenter 1.2.0 have been validated under the U.S. Common Criteria Evaluation and Validation Scheme (CCEVS) process, achieving EAL2 certification. VMware ESX Server 3.0 and VirtualCenter 2.0 are currently being tested for certification at EAL4+.

The following table details the predetermined TCP and UDP ports used for management access to a VirtualCenter management server, ESX Server host(s) and other network components of a virtual infrastructure built on VMware ESX.

Port #         Purpose                                                               Traffic Type
80             HTTP access. Redirected to port 443                                   TCP in
443            HTTPS access                                                          TCP in
902            Authentication traffic from VI Client to VirtualCenter or ESX host    TCP in, UDP out
903            Remote console traffic generated by user access to VMs                TCP in
2050 - 5000    Traffic between ESX hosts for VMware HA                               TCP out, UDP in/out
8000           Incoming requests for VMotion                                         TCP in/out
8042 - 8045    Traffic between ESX hosts for VMware HA                               TCP out, UDP in/out
27000          License transactions from ESX host to license server                  TCP out
27010          License transactions from the license server                          TCP in

In addition to the above, to protect data transmitted to and from external network connections, ESX Server uses one of the strongest block ciphers available, 256-bit AES block encryption. ESX Server also uses 1024-bit RSA for key exchange. These encryption algorithms protect the following connections:

- VI Client connections to the VC Server and to ESX hosts via the Service Console.
- VI Web Access connections to ESX hosts via the Service Console.
- Service Console connections to virtual machines through the VMkernel.
- SSH connections to the ESX hosts via the Service Console.

4 ESX Server Service Console

4.1 Overview

Whether using the VI management client or the command line, all configuration tasks for ESX Server are performed through the Service Console, including configuring storage, controlling aspects of virtual machine behaviour, and setting up virtual switches or virtual networks. A person logged in to the Service Console with privileged permissions has the ability to modify, shut down, or even destroy virtual machines on that host.

While VMware ESX Server management clients use authentication and encryption to prevent unauthorised access to the Service Console, other services might not offer the same protection. If attackers gain access to the Service Console, they are free to reconfigure many attributes of the ESX Server host.
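
The management ports listed in section 3.2 describe the network surface that should be reachable on a host. As a rough illustration only, the sketch below captures that table as data and flags any open port that the table does not list. The function names and the sample scan result are hypothetical, the table's traffic direction and protocol columns are not modelled, and SSH (port 22) is not part of that table, so a real check would need to extend the list.

```python
# Port ranges and purposes transcribed from the management port table in section 3.2.
# Each range is half-open, so range(2050, 5001) covers ports 2050 to 5000 inclusive.
MANAGEMENT_PORTS = [
    (range(80, 81), "HTTP access, redirected to port 443"),
    (range(443, 444), "HTTPS access"),
    (range(902, 903), "Authentication traffic from VI Client to VirtualCenter or ESX host"),
    (range(903, 904), "Remote console traffic generated by user access to VMs"),
    (range(2050, 5001), "Traffic between ESX hosts for VMware HA"),
    (range(8000, 8001), "Incoming requests for VMotion"),
    (range(8042, 8046), "Traffic between ESX hosts for VMware HA"),
    (range(27000, 27001), "License transactions from ESX host to license server"),
    (range(27010, 27011), "License transactions from the license server"),
]


def purpose_of(port):
    """Return the documented purpose of a port, or None if the table does not list it."""
    for ports, purpose in MANAGEMENT_PORTS:
        if port in ports:
            return purpose
    return None


def undocumented(open_ports):
    """Return, in ascending order, the open ports that the table does not list."""
    return sorted(p for p in set(open_ports) if purpose_of(p) is None)


if __name__ == "__main__":
    # Hypothetical scan result for one host (an assumption for illustration).
    observed = [21, 23, 443, 902, 8000]
    for port in undocumented(observed):
        print(f"port {port} is open but not in the documented list")
```

Keeping the list as data rather than prose makes it straightforward to compare against the output of whatever scanning tool the security team already uses.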

The Service Console is the point of control for ESX Server, and safeguarding it from misuse is crucial.

The ESX Server 3.0 Service Console provides an execution environment to monitor and administer the entire ESX Server host. The Service Console operating system is a reduced version of Red Hat Enterprise Linux 3, Update 6. Much of the functionality not necessary for interacting with the ESX Server virtualisation layer has been removed, so not all vulnerabilities of this distribution apply to the Service Console. VMware monitors and tracks all known security exploits that apply to this particular reduced version and issues custom updates as and when needed.

4.2 Risk Assessment

Threat 1: Infiltrate Service Console via untrusted network
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: Connection only to internal trusted network. No connection allowed to the Internet.
    Comments: By default, ESX Server is installed with a high security internal firewall setting: all outbound ports are closed, and the only inbound ports that are open are those required for interactions with clients such as the VMware Virtual Infrastructure Client. Note: This is the VMware recommended security setting unless the Service Console is connected to a trusted network.

Threat 2: Service Console via client communication streams
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: All communications from clients are encrypted through SSL by default. The connection uses 256-bit AES block encryption and 1024-bit RSA key encryption.
    Comments: Service Console access across insecure networks such as the Internet or non-private WANs could open up the risk of a man-in-the-middle attack and is considered unsafe practice.

Threat 3: Service Console via SC web service
    Likelihood: Possible
    Potential Impact: Availability
    Countermeasure: The Tomcat Web service, used internally by ESX Server to support access to the Service Console by Web clients such as VMware Virtual Infrastructure Web Access, has been modified to run only those functions required for administration and monitoring by a Web client.

Threat 4: Service Console via Red Hat vulnerability
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: VMware monitors all security alerts that could affect Service Console security and, if needed, issues a security patch, as it would for any other security vulnerability that could affect ESX Server hosts.
    Comments: VMware provides security patches for the Service Console, which uses Red Hat Enterprise Linux 3, Update 6 and later, as they become available.

Threat 5: Service Console via insecure services
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: Insecure services such as FTP and Telnet are not installed, and the ports for these services are closed by default.
    Comments: FTP, SMB and similar services can occasionally be required for periodic maintenance of the ESX host. When required, these ports will be opened for the minimum time needed to complete the task and then closed again.

Threat 6: Service Console via file/folder permission change
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: The number of applications that use a setuid or setgid flag has been minimised.

Threat 7: Service Console via SNMP
    Likelihood: Unlikely
    Potential Impact: Availability
    Countermeasure: ESX Server supports SNMPv1, and the management information base is read-only. Nothing can be set through SNMP management calls.

Threat 8: Service Console via common network
    Likelihood: Unlikely
    Potential Impact: Availability, Integrity
    Countermeasure: Isolate the Service Console by creating a separate VLAN. Configure network access for management tool connections with the Service Console through a single virtual switch and one or more uplink ports. For this project, ESX servers are located on an internal trusted network, and network isolation via a separate firewalled VLAN will be used.
    Comments: This prevents anyone without access to the Service Console VLAN or virtual switch from viewing traffic to and from the Service Console. It also prevents attackers from sending any packets to the Service Console.

Threat 9: Service Console via tcp/udp port
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: ESX Server 3 includes a firewall between the Service Console and the network. By default, the Service Console firewall is configured at the high security setting, which blocks all incoming and outgoing traffic except for that on ports 902, 80, 443, and 22, which are used for basic communication with ESX Server. Any ports opened on a full-time basis will be documented in the design, including the purpose for opening each port.
    Comments: The server could be vulnerable to a DoS attack using the default ports. Placing ESX servers behind a hardware firewall is considered a good practice countermeasure. Furthermore, the ‘iptables’ program within the Service Console can be used to restrict network access at a more granular level, e.g. by subnet or by a nominated list of IP addresses.

Threat 10: Service Console via applications and services running in the console
    Likelihood: Unlikely
    Potential Impact: Availability
    Countermeasure: Software will be limited to core support requirements for enterprise management systems, such as hardware monitoring and backup. The more components there are running in the Service Console, the more objects are potentially susceptible to security vulnerabilities, so they will be kept to a minimum.
    Comments: Additional software that could run in the Service Console includes management agents and backup agents. Services that could run include NIS, SNMP, or CIM HTTPS.

Threat 11: Managing the Service Console as a Linux host
    Likelihood: Likely
    Potential Impact: Availability, Integrity, Confidentiality
    Countermeasure: Strict operating processes, logging, file integrity checks. Note: the Service Console is not to be treated like a Linux host when it comes to patching. Patches issued by Red Hat or any other third-party vendor are never to be installed.
    Comments: The Service Console is generated from a Red Hat Linux distribution that has been carefully stripped down and modified to provide exactly the functionality necessary to communicate with and allow management of the VMkernel. Any additional software installed should not make assumptions about which RPM packages are present, nor assume that it can modify them. In many cases, the packages that do exist have been modified especially for ESX Server.

Threat 12: Malicious code
    Likelihood: Possible
    Potential Impact: Availability
    Countermeasure: Follow VMware best practice to configure Service Console network isolation to an internal trusted network.
    Comments: Because ESX Server runs a customised, locked-down version of Linux, there is much less likelihood of security exploits than in a standard Linux distribution. VMware states that if you follow the best practice of isolating the network for the Service Console, there is no reason to run any antivirus or other such security agents, and their use is not recommended.

Threat 13: Unauthorised modification of key Service Console configuration files (file system integrity)
    Likelihood: Possible
    Potential Impact: Integrity
    Countermeasure: Key configuration files should be monitored for integrity and unauthorised tampering:
        /etc/profile
        /etc/ssh/sshd_config
        /etc/pam.d/system_auth
        /etc/ntp
        /etc/ntp.conf
        /etc/passwd
        /etc/group
        /etc/sudoers
        /etc/shadow
        /etc/vmware/
    These files are also to be backed up regularly.
    Comments: The design will specify the use of a tool such as Tripwire, or a checksum tool such as sha1sum, which is built into the Service Console.

Threat 14: Internal attack or user error on ESX Service Console
    Likelihood: Possible
    Potential Impact: Availability, Integrity
    Countermeasure: Use VirtualCenter via VI Client (*1). The best measure to prevent security incidents in the ESX Service Console is to avoid accessing it if at all possible. Many of the tasks necessary to configure and maintain the ESX Server host can be achieved using the VI Client, either connected directly to the host or, preferably, through VirtualCenter. The VI Client communicates using a well-defined API, which limits what can be done. This is safer than direct execution of arbitrary commands. Connectivity of SSH-based client tools such as PuTTY and WinSCP will be limited to a discrete group of IP addresses belonging to the physical or virtual desktops of the Windows Infrastructure Management Team staff. Connectivity will be limited using the /etc/hosts.allow and /etc/hosts.deny files within VMware ESX. The best practice approach is to deny access by subnet range and allow access only by IP address exception.
    Comments: There are some tasks that cannot be performed via the VI Client. For these tasks, you must log in to the Service Console. Also, if the connection to the host is lost, executing certain commands through the command-line interface may be the only recourse, e.g. if the network connection fails and you are therefore unable to connect using VI Client. There may be some cases in which you want to automate certain configuration tasks using scripts that run in the Service Console, but for interactive administration, VI Client is the most secure access method. A login 'grace-time' setting will be configured to ensure idle ssh sessions are not left connected to an ESX Server host indefinitely.

Threat 15: Denial of Service attack by filling up root partition
    Likelihood: Possible
    Potential Impact: Availability
    Countermeasure: Create separate partitions for /home, /tmp, and /var/log. These are all directories that have the potential to fill up.
    Comments: If these directories are not isolated from the root partition, a denial of service could be experienced if the root partition is full and unable to accept any more writes.

*1. VirtualCenter has the added benefit that authorisation and authentication are performed via the standard central Active Directory service, instead of using special local accounts in the Service Console. In addition, roles and users are stored in a database, providing an easy way to view the current permissions as well as take a snapshot of them. VirtualCenter also keeps track of every task invoked through it, providing an automatic audit trail.

4.3 Additional Best Practice Configuration

The following best practice configurations will be adopted within the design for this project.

Maintain Thorough Logging

In addition to identifying system issues, logging allows tracking of any unusual activity that might be a precursor to an attack, and also allows a post-mortem to be carried out on any compromised systems.

Log files can be accessed by navigating to the /var/log/ directory and provide an important tool for diagnosing security breaches as well as other system issues. They also provide key sources of audit information.

In addition to storing log information in files on the local file system, you can send this log information to a remote system. The syslog program is typically used for computer system management and security auditing, and it can serve these purposes well for ESX Server hosts. You can select individual Service Console components for which you want the logs sent to a remote system.

The following tips provide best practices for logging:

- Ensure accurate time-keeping.
- Control growth of log files.
- Use remote syslog logging. Remote logging to a central host provides a way to greatly increase administration capabilities.
  By gathering log files onto a central host, you can easily monitor all hosts with a single tool, and carry out aggregate analysis and searches for things like coordinated attacks on multiple hosts.
- Use local and remote sudo logging. If you have configured sudo to enable controlled execution of privileged commands, you can benefit from using syslog to audit use of these commands.

The following steps log all privileged command executions using syslog. You can then benefit from other syslog features such as remote logging and log file rotation.

- Configure sudo to use syslog to record all occurrences of its use.
- Add an entry to /etc/syslog.conf to send the logging information to a file and, optionally, to a remote host.

In ESX Server 3, only SNMPv1 is supported, and only for queries.

Use a Directory Service for Authentication for access to the ESX Server Service Console

Advanced configuration and troubleshooting of an ESX Server host may require local privileged access to the Service Console. For this circumstance, it is recommended to set up individual host-local user accounts and groups for selected administrators with overall responsibility for the virtual infrastructure. These accounts should correspond to real individuals and must not be shared by multiple people. The preferred approach, and the one proposed for this project, is to configure the server to authenticate users via Active Directory, thereby centralising accounts and allowing continuity of existing policies for password complexity, ageing and reuse.

Note: Although you can create host-local accounts on the Service Console that correspond to each global account, this presents the problem of having to manage user names and passwords in multiple places. It is recommended to use a directory service, such as NIS or LDAP, to define and authenticate users on the Service Console, so local user accounts do not have to be created.
Because Service Console authentication is Unix-based, it cannot use Active Directory to define user accounts. However, it can use Active Directory to authenticate users. Individual user accounts can be defined on the host, with the local Active Directory domain then used to manage the passwords and account status.

Root User Logon

The root user of the Service Console has almost unlimited capabilities, and securing this account is important to securing the ESX Server host. By default, remote access via ssh is enabled, but not for the root account. Files can be copied remotely to and from the Service Console using an scp (secure cp) client, such as WinSCP.

Enabling remote root access is not recommended, because it opens the system to network-based attack should someone obtain the root password. The recommended approach is to log in remotely using a regular user account, then use sudo to perform privileged commands.

Note: The sudo command enhances security because it grants root privileges only for selected activities, in contrast with the su command, which grants root privileges for all activities. Using sudo also provides superior accountability because all sudo activities are logged, whereas if you use su, ESX Server only logs the fact that the user switched to root by way of su. The sudo command also provides a way to grant or revoke execution rights to commands on an as-needed basis.

Root access can be disallowed on the console of the ESX Server host. This approach forces anyone who wants to access the system to first log in using a regular user account, then use sudo or su to perform tasks. If root login is disallowed on the console, a non-privileged account should be created on the host to enable logins. This should be a local account so that access to the host is still possible if the network connection to the directory service is lost.
Access can be assured by defining a local password for this account, which will then override authentication via directory services.

The net effect is that administrators can still access the system, but they never have to log in as root. Instead, they use sudo to perform particular tasks or su to perform arbitrary commands.

Because su is a powerful command, access to it will be limited. By default, only users that are members of the wheel group in the Service Console have permission to run su. If a user who is not a member of the wheel group attempts to run su - to gain root privileges, the attempt fails and the event is logged.

The following recommendations for using sudo will be adopted for the project:

- Configure local and remote sudo logging.
- Create a special admins group and allow only members of that group to use sudo.
- Use sudo aliases to determine the authorisation scheme, then add and remove users in the alias definitions instead of in the commands specification.
- Be careful to permit only the minimum necessary operations to each user and alias.
- Permit very few users to run the su command, because su opens a shell that has full root privileges but is not auditable.
- Require users to enter their own passwords when performing operations. This is the default setting. Do not require the root password, because this presents a security risk, and do not disable password checking. In sudo, the authentication persists only for a brief period before sudo asks for a password again.

5 ESX Server Kernel (Virtualisation layer)

5.1 Overview

The virtualisation layer, or VMkernel, is a kernel designed by VMware from the ground up to run virtual machines. It controls the hardware utilised by ESX Server hosts and schedules the allocation of hardware resources among the virtual machines.
Because the VMkernel is fully dedicated to supporting virtual machines and is not used for other purposes, the interface to the VMkernel is strictly limited to the API required to manage virtual machines.

5.2 Risk Assessment

Threat 1: Buffer overflow attack
Likelihood: Possible
Potential Impact: Availability
Countermeasure: To provide an extra layer of security, the VMM supports the buffer overflow prevention capabilities built in to most Intel and AMD CPUs, known as the NX or XD bit. Since the binary translator does not operate on translation units of more than 12 instructions, it is not possible for the translator to experience a buffer overflow for this operation.
Comments: Buffer overflow attacks usually exploit code that operates on unconstrained input without doing a length check. If an attacker can supply a very long string to code that copies it into a fixed-size buffer without checking its length, a buffer overflow occurs and may be used in an attack.

Threat 2: Hyperthreading exploits
Likelihood: Unlikely
Potential Impact: Availability
Countermeasure: ESX Server virtual machines do not provide hyperthreading technology to the guest operating system. ESX Server can, however, utilise hyperthreading to run two different virtual machines simultaneously on the same physical processor. Because virtual machines do not necessarily run on the same processor continuously, it is more challenging to exploit the vulnerability described under Comments.
Comments: Intel's hyperthreading technology allows two process threads to execute on the same CPU package. These threads can share the memory cache on the processor. Malicious software can exploit this feature by having one thread monitor the execution of another thread, possibly allowing theft of cryptographic keys.

Threat 3: Memory virtualisation exploits (*1)
Likelihood: Unlikely
Potential Impact: Availability
Countermeasure: When a virtual machine needs memory, each memory page is zeroed out by the VMkernel before being handed to the virtual machine. Normally, the virtual machine then has exclusive use of the memory page, and no other virtual machine can touch it or even see it. Any attempt by the operating system or any application running inside a virtual machine to address memory outside of what has been allocated by the VMM causes a fault to be delivered to the guest operating system, typically resulting in an immediate system crash, panic, or halt in the virtual machine, depending on the operating system.
Comments: An attempt by a malicious guest operating system to perform I/O to an address space outside its normal boundaries is often termed 'hyperspacing'.

Threat 4: Memory leak through transparent page sharing
Likelihood: Not possible
Potential Impact: N/A
Countermeasure: As soon as any one virtual machine tries to modify a shared page, it gets its own private copy. Because shared memory pages are marked copy-on-write, it is impossible for one virtual machine to leak private information to another through this mechanism. Transparent page sharing is controlled by the VMkernel and VMM and cannot be compromised by virtual machines.
Comments: Transparent page sharing is a technique for using memory resources more efficiently. Memory pages that are identical in two or more virtual machines are stored once in the host system's RAM, and each of the virtual machines has read-only access. Such shared pages are common, for example, if many virtual machines on the same host run the same operating system.

*1. The RAM allocated to a virtual machine by the VMM is defined by the virtual machine's BIOS settings. The memory is allocated by the VMkernel when it defines the resources to be used by the virtual machine. A guest operating system uses physical memory allocated to it by the VMkernel and defined in the virtual machine's configuration file. The operating system that executes within a virtual machine expects a zero-based physical address space, as provided by real hardware. The VMM gives each virtual machine the illusion that it is using such an address space, virtualising physical memory by adding an extra level of address translation. A machine address refers to actual hardware memory, while a physical address is a software abstraction used to provide the illusion of hardware memory to a virtual machine.

6 ESX Server Virtual Networking Layer

6.1 Overview

The virtual networking layer consists of the virtual network devices through which virtual machines and the Service Console interface with the rest of the network. ESX Server relies on the virtual networking layer to support communications between virtual machines and their users. In addition, ESX Server hosts use the virtual networking layer to communicate with iSCSI SANs, NAS storage, and so forth. The virtual networking layer includes virtual network adapters and virtual switches.

[Figure: VMware ESX Server virtual networking layer. Source: (VMware) Server Configuration Guide]

VMware Infrastructure 3 provides virtual network adapters to guest operating systems that have these characteristics:
- They have their own MAC addresses and unicast/multicast/broadcast filters.
- They are strictly Layer 2 Ethernet adapter devices.
- They interact with the low-level VMkernel layer stack via a common API.

The ESX Server 3 networking stack uses a modular design for flexibility. A virtual switch is "built to order" at run time from a collection of small functional units, such as:
- The core Layer 2 forwarding engine
- VLAN tagging, stripping, and filtering units
- Virtual port capabilities specific to a particular adapter or a specific port on a virtual switch
- Layer 2 security, checksum, and segmentation offload units

When the virtual switch is built at run time, ESX Server loads only those components it needs. It installs and runs only what is actually needed to support the specific physical and virtual Ethernet adapter types used in the configuration.
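The first adapter characteristic listed in this overview (each virtual adapter has its own MAC address and unicast/multicast/broadcast filters) can be illustrated with a minimal sketch. This is not VMware's implementation; the function and parameter names (`adapter_accepts`, `multicast_groups`, `accept_broadcast`) are hypothetical and only show the generic Ethernet receive-filter logic that such an adapter applies.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def normalise(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def is_multicast(mac: str) -> bool:
    """An Ethernet address is a group address when the least
    significant bit of its first octet is set."""
    first_octet = int(normalise(mac).split(":")[0], 16)
    return first_octet & 1 == 1


def adapter_accepts(adapter_mac: str, dest_mac: str,
                    multicast_groups=frozenset(),
                    accept_broadcast: bool = True) -> bool:
    """Hypothetical receive filter for one adapter.

    Broadcast frames pass only if the broadcast filter allows them,
    multicast frames only for subscribed groups, and unicast frames
    only when addressed to the adapter's own MAC address.
    """
    dest = normalise(dest_mac)
    if dest == BROADCAST:
        return accept_broadcast
    if is_multicast(dest):
        return dest in {normalise(g) for g in multicast_groups}
    return dest == normalise(adapter_mac)
```

In this sketch a frame for another adapter's unicast address is simply rejected, which is the behaviour the risk assessment relies on when it treats each virtual adapter as an independent Layer 2 endpoint.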
The following diagram shows how various networks can be segregated within an ESX Server host.

[Figure: Network Segregation. Source: (VMware) Server Configuration Guide]

6.2 Risk Assessment

Threat 1: Attack via virtual switch integrity
Likelihood: Unlikely
Potential Impact: Confidentiality, Integrity
Countermeasure: ESX Server provides no path for network data to cross between virtual switches. A common cause of traffic leaks in the world of physical switches is cascading, which is often needed because physical switches have a limited number of ports. Because each virtual switch provides 1016 ports, cascading is not required and there is no code to connect virtual switches. Virtual switches cannot share physical Ethernet adapters, so there is no way to fool the Ethernet adapter into doing loopback or something similar that would cause a leak between virtual switches.
Comments: Each virtual switch has its own forwarding table, and there is no mechanism in the code to allow an entry in one table to point to a port on another virtual switch. In other words, every destination the switch looks up must match ports on the same virtual switch as the port where the frame originated, even if other virtual switches' lookup tables contain entries for that address.

Threat 2: VMs or other network nodes influencing virtual switch behaviour
Likelihood: Unlikely
Potential Impact: Availability
Countermeasure: Virtual switches do not learn from the network in order to populate their forwarding tables. This eliminates an entry point for denial-of-service (DoS) or leakage attacks, either as a direct DoS attempt or, more likely, as a side effect of some other attack, such as a worm or virus scanning for vulnerable hosts to infect.
Comments: There are natural limits to virtual switch isolation. If you connect the uplinks of two virtual switches together, or if you bridge two virtual switches with software running in a virtual machine, you open the door to the same kinds of problems you might see in physical switches. Virtual switches make private copies of any frame data used to make forwarding or filtering decisions.

Threat 3: Unintended inter-VLAN traffic flow
Likelihood: Unlikely / Not possible
Potential Impact: Confidentiality
Countermeasure: It is important to ensure that frames are contained within the appropriate VLAN on a virtual switch. ESX Server does so in the following ways:
- VLAN data is carried outside the frame as it passes through the virtual switch. Filtering is a simple integer comparison. This is really just a special case of the general principle that the system should not trust user-accessible data.
- Virtual switches have no dynamic trunking support.
- Virtual switches have no support for what is referred to as native VLAN.

Threat 4: Network attack via virtual switch VLANs
Likelihood: Unlikely
Potential Impact: Confidentiality, Integrity, Availability
Countermeasure: The creation of VLANs is recommended, as they provide almost all of the security benefits inherent in implementing physically separate networks without the hardware overhead. The use of separate physical network adapters for virtual machine zones is also recommended, to ensure that the zones are isolated.

Threat 5: Network breach by user error or omission
Likelihood: Possible
Potential Impact: Confidentiality, Availability
Countermeasure: ESX Server supports IEEE 802.1q VLANs, which can be used to further protect the virtual machine network, Service Console, or storage configuration. This driver is written by VMware according to the IEEE specification. VLANs allow you to segment a physical network so that two machines on the same physical network cannot send packets to or receive packets from each other unless they are on the same VLAN.
Comments: Label all virtual networks appropriately to prevent confusion or security compromises. This labelling prevents operator errors in which a virtual machine is attached to a network it is not authorised for, or to a network that could allow the leakage of sensitive information. In the project design, sensitive networks are physically segregated from each other by using clusters of physical ESX hosts.

Threat 6: MAC address spoofing
Likelihood: Possible
Potential Impact: Confidentiality
Countermeasure: Virtual switch security profiles on ESX Server hosts can protect against this type of attack with two options, which are set per virtual switch:
- MAC address changes: By default, this option is set to Accept. To protect against MAC impersonation, you can set this option to Reject. ESX Server then will not honour requests to change the effective MAC address to anything other than the initial MAC address. The port that the virtual adapter used to send the request is disabled and, as a result, the virtual adapter does not receive any more frames until it changes the effective MAC address to match the initial MAC address. The guest operating system does not detect that the MAC address change has not been honoured.
- Forged transmits: By default, this option is set to Accept, meaning ESX Server does not compare source and effective MAC addresses. The Forged Transmits setting affects traffic transmitted from a virtual machine. If you set this option to Reject, ESX Server compares the source MAC address being transmitted by the operating system with the effective MAC address for its adapter to see if they match. If the addresses do not match, ESX Server drops the packet. The guest operating system does not detect that its virtual network adapter cannot send packets using the impersonated MAC address. ESX Server intercepts any packets with impersonated addresses before they are delivered, and the guest operating system might assume that the packets have been dropped.
Comments: Each virtual network adapter in a virtual machine has its own initial MAC address, assigned when the adapter is created. In addition, each adapter has an effective MAC address that filters out incoming network traffic with a destination MAC address different from the effective MAC address. Upon creation, a network adapter's effective MAC address and initial MAC address are the same. However, the virtual machine's operating system can alter the effective MAC address to another value at any time. If an operating system changes the effective MAC address, its network adapter then receives network traffic destined for the new MAC address. The operating system can also send frames with an impersonated source MAC address at any time. Thus, an operating system can stage malicious attacks on the devices in a network by impersonating a network adapter authorised by the receiving network. Within the design, both of these options will be set to Reject.

6.3 Additional Best Practice Configuration

The following best practice configurations will be adopted within the design for this project.

Do Not Create a Default Port Group

During ESX Server installation, there is an option to create a default virtual machine port group. However, this option creates the virtual machine port group on the same network interface as the Service Console. If this setting is left unchanged, it could allow virtual machines to detect sensitive and often unencrypted information. Since the Service Console should always be on a separate, private network, this option should never be used except in a test environment. This option will be disabled in the standard VMware ESX Server image for this project.

Use a Dedicated, Isolated Network for VMotion and iSCSI

Because VMotion information is not encrypted, the entire state of a virtual machine could potentially be snooped on the network used for VMotion. Therefore, it is critical that this network be isolated from any other use. To encrypt VMotion traffic, there is the option of using hardware-based SSL encryption.
Encryption is not available for iSCSI disk I/O, so this network should be strictly controlled, too. In the project design, the network for VMotion will be dedicated, isolated and non-routable.

Do Not Use Promiscuous Mode on Network Interfaces

ESX Server has the ability to run virtual network adapters in promiscuous mode. Promiscuous mode may be enabled on virtual switches that are bound to a physical network adapter (vmnic) and on virtual switches that are not bound to a physical network adapter (vmnet). When promiscuous mode is enabled for a vmnic switch, all virtual machines connected to the virtual switch have the potential of reading all packets sent across that network, from other virtual machines as well as from any physical machines or other network devices. When promiscuous mode is enabled for a vmnet switch, all virtual machines connected to the vmnet switch have the potential of reading all packets across that network, that is, traffic among the virtual machines connected to that vmnet switch.

While promiscuous mode can be useful for tracking network activity, it is an insecure mode of operation, because any adapter in promiscuous mode has access to the packets regardless of whether some of the packets should be received only by a particular network adapter. This means that an administrator or root user within a virtual machine can potentially view traffic destined for other guest operating systems. Promiscuous mode should only be used for security monitoring, debugging, or troubleshooting. By default, promiscuous mode is set to Reject and will remain so in the standard VMware ESX Server image for this project.

7 Virtual Machines

7.1 Overview

Virtual machines are the containers in which guest operating systems and their applications run. By design, all VMware virtual machines are isolated from one another. Virtual machine isolation is imperceptible to the guest operating system.
Even a user with system administrator privileges or kernel system level access on a virtual machine's guest operating system cannot breach this layer of isolation to access another virtual machine without privileges explicitly granted by the ESX Server system administrator.

This isolation enables multiple virtual machines to run securely while sharing hardware and ensures both their ability to access hardware and their uninterrupted performance. For example, if a guest operating system running in a virtual machine crashes, other virtual machines on the same ESX Server host continue to run. The guest operating system crash has no effect on:
- The ability of users to access the other virtual machines
- The ability of the running virtual machines to access the resources they need
- The performance of the other virtual machines

Each virtual machine is isolated from other virtual machines running on the same hardware. While virtual machines share physical resources such as CPU, memory, and I/O devices, a guest operating system in an individual virtual machine cannot detect any device other than the virtual devices made available to it.

[Figure: Virtual Machine Resources. Source: (VMware) Server Configuration Guide]

7.2 Risk Assessment

Threat 1: Attack on a VM via a communication path from another VM on the same ESX Server host
Likelihood: Not possible
Potential Impact: Availability
Countermeasure: Because the VMkernel and VMM mediate access to the physical resources and all physical hardware access takes place through the VMkernel, virtual machines cannot circumvent this level of isolation.
Comments: Just as a physical machine can communicate with other machines in a network only through a network adapter, a virtual machine can communicate with other virtual machines running on the same ESX Server host only through a virtual switch. Further, a virtual machine communicates with the physical network, including virtual machines on other ESX Server hosts, only through a physical network adapter.

Threat 2: Denial of service attack via resource starvation
Likelihood: Possible
Potential Impact: Availability
Countermeasure: By default, ESX Server imposes a form of resource reservation by applying a distribution algorithm that divides the available host resources equally among the virtual machines while keeping a certain percentage of resources for use by system components, such as the Service Console. This default behaviour provides a degree of natural protection from denial-of-service and distributed denial-of-service attacks.
Comments: Resource reservations and limits protect virtual machines from performance degradation if another virtual machine tries to consume too many resources on shared hardware. For example, if one of the virtual machines on an ESX Server host is incapacitated by a denial-of-service or distributed denial-of-service attack, a resource limit on that machine prevents the attack from taking up so many hardware resources that the other virtual machines are also affected. Similarly, a resource reservation on each of the virtual machines ensures that, in the event of high resource demands by the virtual machine targeted by the denial-of-service attack, all the other virtual machines still have enough resources to operate. Resource reservations and limits can be set on an individual basis to customise the default behaviour so that the distribution is not equal across all virtual machines on the host.

Threat 3: Virtual machine security risks (general)
Likelihood: Possible
Potential Impact: Confidentiality, Integrity, Availability
Countermeasure: In every virtual machine in the virtual infrastructure, antivirus agents, spyware filters, intrusion detection systems, and any other standard security measures present on physical servers should be installed and kept up to date, including patching.
Comments: The project design will require all virtual machine servers to maintain current standard security measures.

Threat 4: Attack via VI Console
Likelihood: Possible
Potential Impact: Availability
Countermeasure: The project design will be based upon role-based administrative tiers that restrict use of the VI Console to as few operatives as required. The standard console access route for virtual machines will be via RDP.
Comments: The VI Console allows a user to connect to the console of a virtual machine, in effect seeing what a monitor on a physical server would show. However, the VI Console also provides power management and removable device connectivity controls, which could potentially allow a malicious user to bring down a virtual machine. In addition, it has a performance impact on the Service Console, especially if many VI Console sessions are open simultaneously. Instead of the VI Console, use native remote management services, such as terminal services and ssh, to interact with virtual machines.

7.3 Additional Best Practice Configuration

The following best practice configurations will be adopted within the Low Level design for this component.

Create Template Images

Only approved, pre-hardened and patched server images will be available for deployment.

VM Configuration Options

Unneeded devices such as CD, floppy and USB drives will be disconnected by default on virtual machines, and only approved staff will have rights to modify this, based upon role-based permissions. Copy and paste operations between the guest operating system and the remote console will be disabled so that sensitive information cannot be inadvertently copied over.

8 Virtual Storage

8.1 Overview

Virtual disk files are stored on high performance shared storage such as Fibre Channel or iSCSI SAN. VMFS is a cluster file system which enables multiple installations of ESX Server to have concurrent fast access to the same virtual machine storage.
Since virtual machines are hardware independent and portable across servers, VMFS ensures that individual servers are not single points of failure and enables resource balancing across multiple servers. Fibre Channel HBA consolidation allows the sharing of storage network components across many virtual machines while maintaining hardware fault tolerance.

[Figure: Virtual Storage. Source: (VMware) vmware.com]

8.2 Risk Assessment

Threat 1: Unauthorised presentation of SAN-based data from other sources
Likelihood: Unlikely
Potential Impact: Integrity, Confidentiality
Countermeasure: Zoning and LUN masking are implemented to segregate SAN activity. Where applicable, this methodology will be maintained in the context of this project.
Comments: Zoning provides access control in a SAN topology; it defines which host bus adapters (HBAs) can connect to which SAN device storage processors. When a SAN is configured using zoning, the devices outside a zone are not visible to the devices inside the zone. In addition, SAN traffic within each zone is isolated from the other zones. LUN masking is commonly used for permission management. LUN masking is performed at the storage processor or server level; it makes a LUN invisible when a target is scanned. The administrator configures the disk array so each server or group of servers can see only certain LUNs.

Threat 2: Data capture or denial of service attack via virtualised storage
Likelihood: Unlikely
Potential Impact: Integrity, Confidentiality
Countermeasure: Virtual machines have no knowledge or understanding of Fibre Channel. The only storage available to virtual machines is on SCSI devices. Each virtual machine is able to see only the virtual disks that are presented to it on its virtual SCSI adapters. This isolation is complete, with regard to both security and performance. A VMware virtual machine has no visibility into the WWN (world wide name), the physical Fibre Channel HBAs, or even the target ID or other information about the LUNs upon which its virtual disks reside. The virtual machine is isolated to such a degree that software executing in the virtual machine cannot even detect that it is running on a SAN fabric. Even multipathing is handled in a way that is transparent to a virtual machine. Additionally, virtual machines can be configured to limit the bandwidth they use to communicate with storage devices. This prevents the possibility of a denial-of-service attack against other virtual machines on the same host by one virtual machine taking over the Fibre Channel HBA.
Comments: A virtual machine does not have virtual Fibre Channel HBAs but only has virtual SCSI adapters. A host running ESX Server is attached to a Fibre Channel SAN in the same way that any other host is. It uses Fibre Channel HBAs, with the drivers for those HBAs installed in the software layer that interacts directly with the hardware. In environments that do not include virtualisation software, the drivers are installed on the operating system, but for ESX Server, the drivers are installed in the ESX Server VMkernel. ESX Server also includes VMware Virtual Machine File System (VMware VMFS), a distributed file system and volume manager that creates and manages virtual volumes on top of the LUNs that are presented to the ESX Server host. Those virtual volumes, usually referred to as virtual disks, are allocated to specific virtual machines.

9 VirtualCenter

9.1 Overview

VirtualCenter is composed of five main components:
- VirtualCenter Management Server is the central control node for configuring, provisioning and managing virtualised IT environments.
- VirtualCenter Database is used to store persistent information about the physical servers, resource pools and virtual machines managed by the VirtualCenter Management Server.
  The database resides on standard versions of Oracle, Microsoft® SQL Server, or Microsoft® MSDE.
- Virtual Infrastructure Client allows administrators and users to connect remotely to the VirtualCenter Management Server or individual ESX Servers from any Windows PC.
- VirtualCenter Agent connects VMware ESX Servers with the VirtualCenter Management Server.
- Virtual Infrastructure Web Access allows virtual machine management and access to virtual machine graphical consoles without installing a client.

[Figure: VMware VirtualCenter Overview. Source: (VMware) vmware.com]

9.2 Risk Assessment

Threat 1: Non-specific attack on the Windows host running VirtualCenter
Likelihood: Possible
Potential Impact: Availability, Integrity
Countermeasure: The standard set of recommendations applies, as it would for any host: antivirus agents, spyware filters, intrusion detection systems, and any other standard security measures present on physical servers should be installed and kept up to date, including patching.
Comments: The design will require all virtual machine servers to meet or exceed standard security measures.

Threat 2: Error or omission by use of administrative access
Likelihood: Possible
Potential Impact: Availability, Integrity
Countermeasure: VirtualCenter runs as a user that requires local administrator privilege and must be installed by a local administrative user. To limit the scope of administrative access, it is recommended to avoid using the Windows Administrator user to run the VirtualCenter installation. Instead, a dedicated VirtualCenter administrator account is to be used.
Comments: This avoids automatically providing administrative access to domain administrators, who could belong to the local Administrators group. It also provides a way of accessing VirtualCenter when the domain controller is down, because the local VirtualCenter administrator account does not require remote authentication.

Threat 3: Unauthorised modification of key VirtualCenter configuration (system integrity)
Likelihood: Possible
Potential Impact: Availability, Integrity
Countermeasure: For compliance and auditing, it is recommended that a record of the various configurations be kept over time. To capture these, it is recommended to use the 'Generate VirtualCenter Server log bundle' command in the VMware program file menu on the VirtualCenter host. This tool was designed to capture information to be used for troubleshooting and debugging, but the resulting archive file serves as a convenient way to maintain a historical record. For the design, this task will be planned on a regular basis, so as to track changes made to the VirtualCenter configuration over time.
Comments: Although most of a VMware Infrastructure environment is defined by information contained in the VirtualCenter database, certain important configuration information resides only on the VirtualCenter Server host's local file system. This includes the main configuration file vpxd.cfg, various log files, and, implicitly, the Windows registry settings that pertain to VirtualCenter. The resulting ZIP archive includes:
- licmgr_reg.txt, odbc_reg.txt, vmware_reg.txt: all the relevant Windows registry entries
- vpxd.cfg: the main VirtualCenter Server configuration file (in XML format)
- vpxd-*.log: log files for VirtualCenter Server
- lmgrd.log: log file for the license server (if present)

9.3 Additional Best Practice Configuration

The following best practice configurations will be adopted within the Low Level design for this component.

Role Based Administration

VirtualCenter has an advanced system of roles and permissions that allows granular determination of authorisation for administrative and user tasks, based on user or group and on inventory item, such as clusters, resource pools, and hosts. This system ensures that only the minimum necessary privileges are assigned to people, in order to prevent unauthorised access or modification. Custom task-based roles can also be defined to specifically tailor user access and functionality. The design will detail role-based administration access to VirtualCenter deployments.

Limit Network Connectivity to VirtualCenter

The only network connections VirtualCenter requires are to the ESX Server Service Console and to a network on which instances of VI Client are running. Avoid putting the VirtualCenter server on any other network, such as your production or storage networks. Limiting network connectivity reduces the possible avenues of attack.

It is recommended to further protect the VirtualCenter server using a firewall. This firewall may sit between the clients and the VirtualCenter server, or both the VirtualCenter Server and the clients may sit behind the firewall, depending on the deployment. The main consideration is ensuring that a firewall is present at what is considered to be an entry point for the system as a whole.

Note: Networks configured with a VirtualCenter server can receive communications from several types of clients: the VI Client, VI Web Access, or third-party network management clients that use the SDK to interact with the host. During normal operation, VirtualCenter listens on designated ports for data from the hosts it is managing and from clients. VirtualCenter also assumes that the hosts it is managing listen for data from VirtualCenter on designated ports. If a firewall is present between any of these components, the appropriate ports must be open to support data transfer through the firewall.

Ensure Proper Security Measures Are Used when Configuring the Database for VirtualCenter

It is recommended to install the VirtualCenter database on a separate server and subject it to the same security measures as any production database. Permissions for access to the database should be limited to the minimum necessary.
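The firewall note earlier in this section requires that the designated ports between VirtualCenter, its clients and the managed hosts are open. A minimal sketch of how an administrator might verify this from one side of the firewall is shown below. The helper name `is_port_open` is hypothetical, and the host name and port in the usage note are placeholders, not values taken from VMware documentation; the actual ports must be taken from the documentation for the deployed versions.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    A refused connection, an unreachable host or a timeout (for example
    because a firewall silently drops the traffic) all yield False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open("vc-server.example.local", 443)` (both values are placeholders) would be run from a client network towards the VirtualCenter server. The check only confirms TCP reachability; it says nothing about whether the service behind the port is healthy.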
Enable Full and Secure Use of Certificate-based Encryption

All versions of VMware products, including all releases of VirtualCenter Server, use X.509 certificates to encrypt session information sent over SSL (secure sockets layer protocol) connections between server and client components. During the installation of VMware products, default, self-signed certificates are automatically generated. However, the default certificates generated by VirtualCenter up to and including version 2.0.1 Patch 1 are defective and should not be used.

Note: By contrast, the default certificates generated by ESX Server hosts are valid and can be used as-is. This requires that any VI Client that connects to ESX Server directly (that is, without going through VirtualCenter) must pre-trust the default certificates.

For environments that require strong security, VMware recommends that administrators replace all default self-signed certificates generated at installation time with legitimate certificates signed by their local root certificate authority, or with public, third-party certificates available from multiple public certificate authorities. Server-certificate verification should be enabled on all VI Client installations and on the VirtualCenter host. This involves a modification to the Windows registry on all client hosts.

Note: VirtualCenter asks for root credentials when it first connects to an ESX Server host. The root password for that host is cached only long enough to enable VirtualCenter management functionality, and the communication channel to the host is encrypted. VirtualCenter then creates a user called vpxuser with a pseudo-randomly generated password and uses the vpxuser account for subsequent connections and management operations. The vpxuser account for each ESX Server host has a unique, 32-character (256-bit) password that is generated from a cryptographically random string of data that is mapped to a set of legal password characters. Once generated, the password is encrypted using 1024-bit RSA key encryption. The password is also stored encrypted on the host, as any local account password would be.

The vpxuser account is created for VirtualCenter management when a host is added to VirtualCenter and is used only to authenticate the connection between VirtualCenter and the ESX Server host. Entries corresponding to the account are added to /etc/passwd and /etc/shadow, but no process actually runs as vpxuser on ESX Server. The vpxuser password is reset every time a host is added to VirtualCenter. If VirtualCenter is disconnected from a host, it tries to reconnect with the vpxuser account and the password that is stored encrypted in the VirtualCenter database. If that fails, the user is prompted to re-enter the root password so the system can reset (that is, automatically generate a new password for) the vpxuser account.

In the VirtualCenter code, database-specific variable protection mechanisms, such as parameterised queries in SQL Server, are used extensively, thereby greatly reducing the risk of any SQL injection attack.

The VIM API, which is the main SDK library, provides a mechanism to specify the privileges necessary to invoke an API as part of the API definition. This ensures that security implications are taken into consideration from the beginning of writing a new API.

This concludes the VMware® Virtual Infrastructure3 Risk Assessment.
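As a supplementary illustration of the parameterised-query technique mentioned above, the following minimal sketch uses Python's built-in sqlite3 module. It is not VirtualCenter code and does not use SQL Server; the `hosts` table, the `find_host` helper and the sample host names are hypothetical and exist only to show why a bound parameter cannot alter the structure of a query.

```python
import sqlite3


def find_host(conn: sqlite3.Connection, name: str) -> list:
    """Look up a host by name using a parameterised query."""
    # The "?" placeholder binds `name` as data; it is never parsed as SQL.
    cursor = conn.execute("SELECT id, name FROM hosts WHERE name = ?", (name,))
    return cursor.fetchall()


def demo() -> tuple:
    """Build a throwaway in-memory database and run two lookups."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO hosts (name) VALUES (?)", [("esx01",), ("esx02",)]
    )
    legitimate = find_host(conn, "esx01")
    # A classic injection string is treated as a literal host name,
    # so it matches no rows instead of matching every row.
    injected = find_host(conn, "esx01' OR '1'='1")
    conn.close()
    return legitimate, injected
```

Had the query been assembled by string concatenation, the second lookup would have returned every row in the table; with a bound parameter it returns none.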