Private Cloud Security Considerations for Enterprise IT

Published: November 2014
Version: 1.0

Authors:
Yuri Diogenes – Microsoft
Anthony Stevens – Content Master
Tom Shinder – Microsoft

Reviewers:
Clint Rousseau – Microsoft
Avery Spates – Microsoft
Jeremy Girven – Microsoft
Jim Dial – Microsoft
Fernando Cima – Microsoft
Frank Koch – Microsoft Corporation
Scott Culp – Microsoft Corporation
Allen Brokken – Microsoft Corporation
The Private Cloud Security v-team, Microsoft Corporation

Contents

1.0 Introduction
1.1 Document audience
1.2 Document purpose
2.0 Introduction to Private Cloud Security
3.0 Private Cloud Security Problem Domain
3.1 Conceptual Design
3.2 Design Principles
3.3 Security Responsibilities
4.0 Private Cloud Security Considerations
4.1 Security Foundation Considerations
4.2 Infrastructure Security Considerations
4.3 Platform Security Considerations
4.4 Software Security Considerations
4.5 Service Delivery Security Considerations
4.6 Management Security Considerations
4.7 Client Security Considerations
4.8 Legal Considerations
5.0 Private Cloud Security Challenges
5.1 Resource Pooling Security Considerations
5.2 Broad Network Access Security Considerations
5.3 On-demand Self-Service Security Considerations
5.4 Rapid Elasticity Security Considerations
5.5 Measured Service Security Considerations
5.6 Mitigating Security Risks
4.0 Summary
5.0 Additional Resources
6.0 Authors and Reviewers

Tables

Table 1 Infrastructure security considerations for resource pooling
Table 2 Platform security considerations for resource pooling
Table 3 Software security considerations for resource pooling
Table 4 Infrastructure security considerations for broad network access
Table 5 Platform security considerations for broad network access
Table 6 Software security considerations for broad network access
Table 7 Service delivery security considerations for broad network access
Table 8 Management security considerations for broad network access
Table 9 Client security considerations for broad network access
Table 10 Platform security considerations for On-demand Self-service
Table 11 Software security considerations for On-demand Self-service
Table 12 Management security considerations for On-demand Self-service
Table 13 Infrastructure security considerations for Rapid Elasticity
Table 14 Infrastructure security considerations for Rapid Elasticity

Figures

Figure 1 – Private Cloud Security Model
Figure 2 – Private cloud infrastructure using data classification zones
Figure 3 – Private cloud infrastructure using data classification zones
Figure 4 – Cloud Security Responsibilities
Figure 5 – Private Cloud Security Model Collapsed
Figure 6 – Infrastructure security issues
Figure 7 – Example of a network topology
Figure 8 – Example of a network topology
Figure 9 – Private cloud platform security issues
Figure 10 – Security in the software layer at the private cloud
Figure 11 – Service Delivery Security
Figure 12 – Security Threats to Private Cloud Architectures
1.0 Introduction

Cloud computing is no longer just a promise to change the way companies operate and use IT resources; it has now been adopted by a great number of businesses, both large and small. According to Hosting and Cloud Go Mainstream: 2014, a Microsoft Corp.-commissioned study conducted by 451 Research LLC, more than 45 percent of organizations surveyed are beyond the pilot phase, and 32 percent now possess a formal cloud computing plan as part of their overall IT and business strategy. These businesses have already embraced cloud computing and are ready to make the most of their IT budgets in order to maximize return on investment.

Cloud computing can help companies resolve some of the major challenges that they face, such as cost, flexibility, and accessibility. As mentioned in this business article from Microsoft, the top five benefits of cloud computing are:

• Lower capital expenditure
• Easier maintenance and upgrades
• Greater flexibility and mobility
• Continuity of business
• Improved IT security

Although these are decision factors around cloud computing adoption, and public cloud adoption is growing in North America and Western Europe, private cloud is still the preferred choice in emerging markets, according to Gartner (2013).

With increasing numbers of organizations looking to create cloud-based environments or to implement cloud technologies within their existing data centers, business and technology decision-makers are looking closely at the possibilities and practicalities that these changes involve. Although the increase in business agility, coupled with greater flexibility of service provisioning, is a convincing argument in favor of moving to the private and hybrid cloud models, significant deployment blockers remain.
By using the private cloud model, companies can retain the governance and hosting of corporate data in trusted environments while transitioning those environments to use the five essential characteristics of cloud computing defined by the United States National Institute of Standards and Technology (NIST). Private cloud computing can provide both reduced costs and increased robustness. To be successful, enterprises must consider security from the start of the private cloud design process. At the same time, it is essential to address data sovereignty or regulatory compliance demands. Security becomes an essential aspect of the planning and architecture phases.

1.1 Document audience

The primary audience for this document is the system architect or system designer who is interested in understanding the security implications that need to be considered before implementing a private cloud infrastructure. Others who might be interested in this document include IT implementers and enterprise security specialists.

1.2 Document purpose

The purpose of this document is to provide you with design considerations and an architectural view for designing effective security within a private cloud environment. This document does not cover security considerations for public cloud services, such as Microsoft Azure, Office 365, Dynamics CRM Online, or offerings from other vendors.

Although public cloud services are not covered, the services mentioned above participate in the Cloud Security Alliance (CSA) STAR Registry program, which allows customers to compare the compliance posture of participating cloud services. To participate, Microsoft must comply with the CSA Cloud Controls Matrix (CCM), which outlines fundamental security principles to guide cloud vendors and to assist prospective customers in assessing the overall security risk of a cloud provider.
Detailed papers that discuss how our services fulfill the security, privacy, compliance, and risk management requirements defined in the CCM are published in the CSA’s Security Trust and Assurance Registry (STAR). For more information about Microsoft's approach to cloud transparency, read this paper.

Note: You can download the latest version of the Cloud Controls Matrix from CSA here.

2.0 Introduction to Private Cloud Security

Implementing a private cloud environment requires IT departments to re-evaluate many aspects of how they interact with their organization. A noticeable trend has been for business units to circumvent IT departments and source services directly from an external hosting provider. Hence, IT departments increasingly consider themselves a separate business unit whose job is to provide reliable IT services to the organization. As a consequence, IT departments find themselves in competition with public cloud providers and are therefore considering hosting private clouds to compete against these public offerings.

This change in the relationship between the IT department and the host organization has often been hindered by the inability to account effectively for the cost of the services that the IT department provides. In fact, many organizations struggle to implement proper cost accounting for their IT services. Private cloud computing provides the ability to allocate costs in a fair and metered manner to the service user in proportion to the user’s demand for those services. Knowing exactly which services are being used, what IT is costing, and where to allocate the costs brings considerable business benefits.

Private cloud implementations also affect the way in which IT departments should view security. Security should not be viewed as a discrete silo that contains traditional capabilities such as authentication, authorization, auditing, and so on. Instead, security should be considered as a wrapper around every element of a computing environment.
Migrating to private cloud gives organizations the opportunity to redesign their security around this integrated arrangement.

3.0 Private Cloud Security Problem Domain

Companies trying to address security concerns regarding private cloud typically encounter the following problems or challenges:

• Ensure that data partitioning for tenants is in place for data hosted in the private cloud.
• Authenticate and authorize access to resources according to users’ rights to access those resources.
• Provision resources for tenants securely.
• Provide the perception of unlimited resources while preserving the availability and integrity of the private cloud infrastructure.
• Enable users to securely access private cloud resources from anywhere.

3.1 Conceptual Design

To solve the problems previously identified and to help organizations design a secure model for their private cloud, we encourage you to follow the private cloud security model shown in Figure 1:

Figure 1 – Private Cloud Security Model

This security model shows that any approach to any part of the cloud environment must have security as the foundation for the entire strategy. Additionally, all communication between layers in the cloud model (for example, between infrastructure, storage, networking, and virtualization layers) must also have security as the foundation. Security also applies to intra-layer communications, for example between in-memory processes and associated storage. Multi-tenancy security must also provide secure isolation between tenants within a database, virtual machine, or application instance.

In summary, security is a universal factor that applies to every element of cloud operations. Designers, implementers, and operators must consider security factors in every interaction for each physical or logical component of the cloud environment.

Note that not all capabilities in this security foundation will apply to each component of the security model.
For example, intrusion detection is unlikely to be applied between the management stack and the infrastructure as a service layer. The model merely serves to highlight the all-pervasive nature of security and to emphasize the level at which you should include security.

This requirement for pervasive security results from changing organizational perspectives around the delivery of IT services. Increasingly, IT departments are breaking out from the traditional firewalled datacenter approach and having to act as one of many possible service providers that host the organization’s IT services. Business units no longer have to contract with the internal provider and often source these services from external providers, such as public cloud vendors, and internal IT departments must recognize this change.

3.2 Design Principles

In a traditional data center environment, the demarcation of security responsibilities between the data center operator and the service user was relatively well defined. Generally, the responsibility was aligned with ownership of the physical component, whether that was a server, a networking device, or the overall network infrastructure; if the IT department owned and administered the server, then that department also managed and updated security on that asset.

For the private cloud, the key security principle that drives an effective design is that your design should seek to build a system of controls, rather than a collection of controls. This unified system of controls is more than just the individual security technologies and methodologies: each part integrates with the others to provide the overall defenses.
This unified security approach would include the following design principles:

• Apply generic security best practices
• Understand that isolation is key
• Consider security as the foundation for the entire solution
• Assume attackers are authenticated and authorized
• Assume all data locations are accessible
• Use established strong cryptographic technologies
• Automate security operations
• Reduce attack surface
• Restrict intra-application communication
• Audit extensively
• Implement effective governance, risk management and compliance
• Create data classification zones

3.3.1 Apply Generic Security Best Practices

Private clouds use existing technologies such as virtualization and extend the infrastructure designs current in many organizations. As such, you should maintain existing security practices as part of the security design for your private cloud. For example, you should continue to:

• Keep IT infrastructure systems in a secure and controlled location.
• Implement the principles of least privilege and defense in depth.
• Use firewalls and use separate NICs for management functions.
• Carry out penetration testing and audit your security processes.
• Segment and provide access controls on the network using network firewalls.

However, private cloud architectures introduce new potential vulnerabilities, and you must modify and add to your existing design to mitigate these new threats.

3.3.2 Isolation is Key

Typically, private cloud implementations use virtualization technologies to make infrastructure, platform, and software resources available to clients within the enterprise. Tenants may be other business units within the enterprise, or other sections of the IT department using private cloud resources to deliver services to client business units. Even though private cloud tenants are part of the same organization, you must ensure isolation of their resources.
For example, confidential human resources data must not be generally accessible even though the human resources systems could be running on the same physical server as the company intranet.

It is not just an issue of simple confidentiality. In the IaaS, PaaS, and SaaS service delivery models, you may not know which tenant services are co-hosted on the same physical devices at any particular time. As a consequence, a problem in one tenant service could affect the performance, network connectivity, or network availability of other tenant services on the same physical hardware.

Your design must ensure isolation between tenants in both the physical and virtual environments that make up the private cloud. If your private cloud is partially or wholly hosted by a third party, then you must be assured that the cloud infrastructure used by the third party also guarantees isolation, both between your services and between your services and any other organization's services that the third party also hosts.

Private cloud isolation should separate access to all cloud services, including:

• Service Catalog – contains VMs, application templates, service offerings, and automation scripts
• Service Catalog Library – the physical location of source files, software library, and virtual disks
• Tenant compute – access to compute resources by capacity, type, or location, controlled using templates
• Tenant storage – access to storage resources by capacity, data type, or location, controlled using templates
• Tenant networking – access to networks, controlled by network purpose and classification
• Backup and recovery – access to backed up resources and data, controlled through automation

3.3.3 Consider Security as the Foundation for your Design Decisions

You should consider security as the foundation for all elements of your private cloud architecture.
The private cloud security model in Figure 1 shows how security concerns are relevant to all elements in all layers and stacks within the architecture:

• Infrastructure
• Platform
• Software
• Service delivery
• Management

A private cloud typically hosts services in virtualized environments, with multiple services colocated on the same physical device. The security foundation functions must be applied to both the physical and virtual environments because in a private cloud architecture you cannot assume that by protecting the physical environment you automatically protect the virtual environment, and vice versa. If attackers gain access to the physical infrastructure, they can not only disrupt the infrastructure but also potentially gain access to the virtualized resources hosted in the cloud. If attackers manage to compromise a virtualized environment, they can potentially use the compromised environment as a platform to attack other virtualized environments within the cloud or to attack the infrastructure.

Although your design should consider security as the foundation around all elements in the architecture, it should also take into account the possibility that responsibility for security may be split between the cloud service provider (CSP) and the tenants.

Considering security as the foundation for your design decisions should be part of your defense in depth strategy for securing your private cloud.

3.3.4 Assume Attackers are Authenticated and Authorized

In the private cloud, you may delegate some of the responsibility for managing the security of the environment to the tenant. A tenant may provision resources through a self-service portal in order to run its tenant application or service in the private cloud. The cloud service provider may have little or no control over how the tenant configures and uses its virtual resources, and this includes control over how the tenant grants access to its services to its end users.
Because of this, you must assume that attackers can be authenticated users with authorized access to a virtual machine running in your data center. The attacker could be an untrustworthy employee, someone using stolen credentials, or an attacker using elevated credentials. You should consider this route of attack from within a virtual machine in your data center in addition to more traditional attacks that may be mounted from outside your organization in an attempt to exploit weaknesses in your external defenses. Attackers will now attempt to find weaknesses that they can exploit in the virtual environment. For example, an attacker might try to gain access to the hypervisor from within the hosted virtual machine, a type of exploit known as hyperjacking.

3.3.5 Assume All Data Locations are Accessible

This point closely relates to the previous point about authenticated and authorized attackers. In private cloud architectures, many data locations are exposed as services. For example, virtual machines may mount virtual hard disks from a storage resource, or they may use virtual queues, virtual tables, or virtual binary large object (BLOB) storage. A tenant may provision these resources through an automated self-service portal as part of the infrastructure or platform services provisioning process.

If an attacker can gain access to a tenant's virtual environment, you must assume that they may also gain access to the tenant's data locations. Because of this, you should consider when and how to encrypt data and how to store and manage the encryption keys that enable access to the data stored in the cloud. The exposure of multiple data locations makes creating and protecting data classification zones an important design consideration.

3.3.6 Do Not Trust Client Information

You cannot make any assumptions about the security of any of the client applications that access the tenant services hosted in the private cloud.
This proviso is especially important when the tenant wants to enable broad network access to the tenant service from multiple device types and from multiple locations. Poorly designed client applications could accidentally reveal credentials or keys, and may perform only limited validation on the data that they send to the services hosted in the cloud. Therefore, cloud management services and tenant services must perform their own validation of data sent from all client applications.

In contrast, you have more control over the client applications and tools that you use to manage the cloud infrastructure. For example, you may limit access to cloud management functions to client applications running on the corporate intranet, or use certificates to identify client applications. However, some cloud management operations may require calls to published APIs, which themselves may not require full validation of the data sent to the cloud and could be invoked from a custom application created by a developer.

3.3.7 Use Established Strong Cryptographic Techniques

Encryption of data at rest, in transit, and during processing can help to ensure that the data is only visible to those who should be able to see it. Therefore, using encryption can help to preserve the isolation of tenants' resources and help to mitigate the threat that attackers may be authenticated users with access to the locations where the application or service stores its data.

You should ensure that your infrastructure uses established, strong encryption techniques wherever it uses encryption. Tenants should also be encouraged or mandated to use established, strong encryption techniques in their own cloud-hosted applications and services.

Implementing cryptographic algorithms securely is complex and difficult. Using strong, established cryptographic algorithms and cryptographic systems rather than "rolling your own" helps to validate your approach to encryption, and makes your cryptographic processes auditable.
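As a minimal sketch of relying on established primitives rather than custom ones, the following Python example uses only the standard library (hmac, hashlib, and secrets) to protect the integrity of a tenant record with HMAC-SHA256. The per-tenant keys, the function names, and the sample record are assumptions made for illustration; they are not part of any product described in this document, and a real deployment would also need a managed key store.

```python
import hashlib
import hmac
import secrets


def new_tenant_key() -> bytes:
    """Generate a 256-bit random key from the operating system's CSPRNG."""
    return secrets.token_bytes(32)


def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical tenants, each holding its own key.
hr_key = new_tenant_key()
intranet_key = new_tenant_key()

record = b"employee=1001;grade=7"
tag = sign(hr_key, record)

print(verify(hr_key, record, tag))                # True: untouched record
print(verify(hr_key, record + b";grade=9", tag))  # False: record was altered
print(verify(intranet_key, record, tag))          # False: wrong tenant key
```

Because each hypothetical tenant holds its own key, a tag produced for one tenant does not verify under another tenant's key, which illustrates how standard cryptographic building blocks can reinforce tenant isolation.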
Remember that attackers will only bother attacking the algorithm itself if they recognize it as weak; typically, they go after the keys.

3.3.8 Automate Security Operations

The size of private clouds, self-service provisioning, and virtualization all combine to make it essential to automate operational activities such as collecting and correlating monitoring data, responding to security-related incidents, and allocating resources to tenants to prevent denial of service situations.

A typical private cloud hosts a large number of tenant services and supports a large number of end users. To ensure effective and timely responses to security issues, you must automate those responses as far as possible. Automated security responses rely on monitoring, so your design must include monitoring services that enable you to automatically identify and act on possible security issues. The automated response procedures must send notifications to the staff who are responsible for security and create a full audit trail of their actions. You should evaluate and review these procedures regularly.

Note that when configuring security monitoring, you must take care not to become so swamped by security alerts that you turn off monitoring altogether. It is better to build up a feel for what is important by enabling rules gradually. When you have a baseline and an understanding of your environment, you can then add the automation.

If you are building your virtualization host and guest environments from standard templates or images, you should ensure that those templates and images include configuration of the monitoring on which your automated responses will rely. This also applies to the service catalog, which also includes catalogs for:

• Templates
• Applications
• Databases
• Automation

Your private cloud design should include a comprehensive automation platform that will enable operational activities such as those outlined above.
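The following Python sketch illustrates, under stated assumptions, how an automated response could combine the elements described in this section: rules that are enabled gradually, a notification to security staff, and an audit trail of every decision. The event kinds, thresholds, action names, and the Responder class are hypothetical; the sketch only records decisions and does not call any real monitoring or orchestration product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    source: str  # hypothetical VM or host name
    kind: str    # hypothetical event kind, for example "failed_login"
    count: int   # occurrences seen in the monitoring window


@dataclass
class Responder:
    # Enabled rules only: event kind -> threshold that triggers a response.
    thresholds: dict[str, int]
    audit_log: list[dict] = field(default_factory=list)
    notifications: list[str] = field(default_factory=list)

    def handle(self, event: Event) -> str:
        limit = self.thresholds.get(event.kind)
        if limit is None:
            action = "ignored"       # no rule enabled yet for this kind
        elif event.count >= limit:
            action = "quarantined"   # decision only; nothing is isolated here
            self.notifications.append(
                f"{event.kind} on {event.source}: {action}"
            )
        else:
            action = "logged"
        # Every decision is recorded, whether or not a response was triggered.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "source": event.source,
            "kind": event.kind,
            "count": event.count,
            "action": action,
        })
        return action


# Start with a single rule and add more once a baseline is understood.
responder = Responder(thresholds={"failed_login": 5})
print(responder.handle(Event("vm-hr-01", "failed_login", 12)))  # quarantined
print(responder.handle(Event("vm-web-02", "failed_login", 2)))  # logged
print(responder.handle(Event("vm-web-02", "port_scan", 40)))    # ignored
print(len(responder.audit_log), len(responder.notifications))   # 3 1
```

Keeping the enabled rules in one small mapping makes it straightforward to begin with a few rules and extend them as the baseline for the environment becomes clearer.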
You should also ensure that the security monitoring and response automation can cope with cloud-based environments and with virtual machines that can be rapidly provisioned and deprovisioned.

3.3.9 Reduce Attack Surface

As with all computer systems, reducing the attack surface is a key element in preventing attacks from succeeding. If the attacker only has a very small area to attempt to access, then he or she has far fewer options to find a successful exploit.

Within a private cloud environment where you are likely to be using virtualization, you must ensure that you reduce the attack surface wherever possible on both host and guest computers. You should only enable the ports, services, and features that are essential to your operations. Your risk assessment should identify all unnecessary components, and you should then remove or disable these components.

For hypervisors, it is recommended to run only the minimum features necessary for virtualization. The hypervisor should be the only feature enabled on the OS, and only the required storage or networking (MPIO, SAN) software should be installed. The default installation for the hypervisor role should be without a GUI, which also helps to reduce the operating system footprint.

Note: For companies using Microsoft Windows Server Hyper-V, the installation or use of antimalware in the management operating system is not recommended. For more information, see Hardening the Hyper-V host.

3.3.10 Restrict Intra-Application Communications

Any time data transits through the private cloud, it adds another possible place where an attacker might be able to access or tamper with that data. Your private cloud design should limit the number of nodes that data must pass through to reach its destination as a part of the overall design goal to reduce the attack surface.
For example, if a tenant is deploying a three-tier application to the private cloud, they should be able to configure the application so that the individual tiers can communicate directly with each other without the data passing through a shared broker component unless there is a specific requirement for this to happen. Additionally, more complex application communication requires more complex monitoring and makes it more difficult to understand the flow of data within the already complex cloud environment.

Virtualization adds application mobility that may need enhanced security controls. For example, by default, Hyper-V Live Migration or VMware vMotion traffic is not encrypted and could possibly be intercepted. Ensure that this traffic is either encrypted or carried on a secure private VLAN.

3.3.11 Audit Extensively

The private cloud will be accessed by a range of users with differing levels of permissions. IT administrators will have direct access to the physical and logical data and systems. All activity, including request, provisioning, usage, and decommissioning, should be audited and logged to a secure location. Weekly and monthly activity reports should be created and reviewed. Further, regular security audits should be performed to validate that the current security design includes mitigations for known threats.

If not already applicable, you should consider gaining certifications such as ISO/IEC 27001 and SAS 70 Type II.

ISO/IEC 27001 is an international standard that has been prepared to provide a model for establishing, implementing, operating, monitoring, reviewing, maintaining, and improving an Information Security Management System (ISMS). This international standard can be used by interested internal and external parties to assess conformance. For more information, visit http://www.iso.org.

Statement on Auditing Standards No.
70 (SAS 70) is a widely recognized auditing standard, developed by the American Institute of Certified Public Accountants (AICPA), that enables businesses that provide services to other organizations to provide an independent, trustworthy account of their internal control practices. An independent auditor performs the SAS 70 audit and generates the resulting SAS 70 report, which the service provider supplies to its customers and clients for use when they themselves undergo auditing. For more information, visit http://www.aicpa.org.

3.3.12 Implement Effective Governance, Risk Management and Compliance

In a private cloud, because ownership of and responsibility for services hosted in the cloud is split between various parties, the SLAs between the parties must make clear where those responsibilities lie and what the different parties should expect from each other.

The set of SLAs that you must consider will depend, in part, on the particular cloud model adopted by your organization:

- If you choose to adopt a hybrid cloud model, or choose to host your private cloud with a third-party organization, then the enterprise CSP will have SLAs with the third-party cloud service provider and with the client business units within the organization. In these scenarios, the IT department is acting as a broker to deliver and manage cloud services provided by an external entity to internal clients.
- If you are hosting the private cloud on premises, you must then negotiate SLAs between the IT department and the client business units.

If you are using a third party to provide some or all of the private cloud services in the enterprise, then you must ensure that the SLAs with the third-party provider enable you to meet your SLAs with your internal clients. This requires a detailed understanding of what the third-party SLAs offer in terms of security. For example:

- How do they guarantee isolation between tenants?
- What do they guarantee in terms of availability and disaster recovery?
- How do they ensure the integrity of your applications and data?
- What steps do they take to ensure the physical security of your data, such as vetting employees who have access to the physical environment and following hardware disposal procedures?

Regardless of whether you choose to host the private cloud externally with a third party, on premises, or adopt a hybrid approach, you will still need SLAs between your IT department and the tenants within the organization. The on-demand, self-service attribute of the private cloud means that there may be some delegation of responsibility for managing the security of the virtualized environment to the tenant. In addition to the guarantees made to the tenant by the cloud service provider, SLAs may also need to specify requirements related to security that the tenant must fulfill. For example, an SLA might specify the encryption technologies that the tenant service should use to protect data, or that the tenant should use the enterprise directory for identity management through federation services provided by the CSP.

How much delegation of responsibility occurs will vary between organizations and between tenants. For some organizations, the primary motive behind adopting a private cloud model will be to make more efficient use of resources by pooling them: the IT department will still deploy and manage applications and services on behalf of the client business units. In this scenario, the on-demand, self-service characteristic is not significant to the relationship between the cloud service provider and the client business unit, and SLAs will cover traditional areas such as availability, disaster recovery, and performance.

However, in some scenarios, tenants will make use of the on-demand, self-service functionality of the private cloud to obtain and manage infrastructure or platform resources to run their own applications and services.
In this case, the split in responsibility for security features such as identity and access management or data protection may be more complex, and the SLAs must include clear definitions that are understood and agreed upon by all the concerned parties.

More complex scenarios are also possible. For example, in addition to managing the private cloud infrastructure, the IT department may also commission, procure, or develop applications for other business units within the organization. In this scenario, one part of the IT department may use the on-demand, self-service capability of the private cloud to acquire infrastructure or platform services on behalf of the client business unit and also handle some or all of the ongoing management of the service on behalf of the client business unit. This arrangement creates a scenario where there are two SLAs: one between the cloud service provider and service provider sections of the IT department, and one between the service provider part of the IT department and the client business unit.

In a private cloud, all parties must have an explicit understanding of their obligations and responsibilities and agree to them. Depending on the model adopted by your organization, the list of parties involved might include:

- third-party cloud services providers
- an internal cloud-services provider
- application and software providers who are a part of the IT department and who consume private cloud services
- client business units within the organization whose applications and services are hosted in the private cloud

When you are negotiating and drafting SLAs you must ensure that:

- You cover all aspects of security for all hosted services and applications and there are no gaps in responsibility.
- Your SLAs with third-party providers enable you to meet the SLAs with your tenants.
3.3.13 Create Data Classification Zones

Modern private clouds allow you to create logical groupings of resources (compute, storage, and networking) to which role-based access rights and data classification attributes can be assigned. Data classification zones can be created to enforce business and compliance rules that govern which types of data may be used, stored, and transmitted in each zone.

For example, consider a financial company that needs to categorize data on a shared private cloud infrastructure by isolating federally regulated applications and systems from non-regulated applications (Protected). Further, consider that the financial company also needs to ensure that all data is encrypted at all times for applications that process credit card information (Secure). The financial company would need to create at least three data classification zones in the private cloud. Each zone has its own data privacy and protection rules, enforced by well-known security measures.

General Data Zone

Contains general purpose VMs, applications, or database configurations. All users may be permitted to view, deploy, and access systems in this zone.

Protected Data Zone

Sample Requirements: Only applications and data used for federally regulated purposes can be deployed to this zone. All OS images and database servers must be configured to federally regulated standards prior to deployment. All access and activity in this zone must be audited. The zone should support categorization of data by security impact level of Low, Medium, and High. For more information on requirements for federally regulated information systems, and on design considerations for information categorization in a private cloud, see the FISMA website: http://csrc.nist.gov/groups/SMA/fisma/index.html

Implementation: All VM templates, application templates, OS images, and library files will be restricted for use in this zone. OS images will be configured using federal configuration standards.
Only users with access to these catalog items will be permitted to deploy to this zone. All request, provisioning, usage, and retirement activity in the zone will be audited at the host and management level.

Secure Data Zone

Sample Requirements: Any financial or customer data must be protected via strong encryption measures in transit, during processing, and at rest. OS images and database servers that run validated financial applications are hardened according to regulatory standards. All access and activity in this zone must be audited. For more information on payment industry information systems requirements see

Implementation: All VM templates, application templates, OS images, and library files will be restricted for use in this zone. OS images will be configured using financial industry configuration standards. Only users with access to these catalog items will be permitted to deploy to this zone. All request, provisioning, usage, and retirement activity in the zone will be audited at the host and management level.
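The zone rules described above can be expressed as a small deployment check. The following Python sketch is illustrative only: the class names, the `can_deploy` function, the template names, and the per-zone flags are assumptions made for this example (the General zone flags in particular are inferred from the sample requirements) and are not part of any specific private cloud product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Zone:
    name: str
    audit_required: bool        # must deployment activity be audited?
    encryption_required: bool   # must templates use encrypted disks?

@dataclass(frozen=True)
class Template:
    name: str
    zone: str                   # the only zone this template may be used in
    encrypted_disk: bool

# Hypothetical zone definitions based on the sample requirements above.
ZONES = {
    "General": Zone("General", audit_required=False, encryption_required=False),
    "Protected": Zone("Protected", audit_required=True, encryption_required=False),
    "Secure": Zone("Secure", audit_required=True, encryption_required=True),
}

audit_log = []  # stands in for a secure, central audit store

def can_deploy(user_zones, template, target_zone):
    """Return True if a user with access to user_zones may deploy template."""
    zone = ZONES[target_zone]
    allowed = (
        target_zone in user_zones                 # user holds access to the zone catalog
        and template.zone == target_zone          # template is restricted to its zone
        and (template.encrypted_disk or not zone.encryption_required)
    )
    if zone.audit_required:
        # Record both permitted and refused requests.
        audit_log.append((target_zone, template.name, allowed))
    return allowed

secure_tpl = Template("secure-sql", zone="Secure", encrypted_disk=True)
general_tpl = Template("general-web", zone="General", encrypted_disk=False)

print(can_deploy({"General", "Secure"}, secure_tpl, "Secure"))  # user has Secure access
print(can_deploy({"General"}, secure_tpl, "Secure"))            # user lacks Secure access
print(can_deploy({"General"}, general_tpl, "General"))
```

Keeping the zone rules in one table means the same check can be applied at request time and again at provisioning time, so that a template can never be deployed outside the zone it was validated for.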
[Figure 2 diagram: "Private Cloud - Data Classification Zones", a sample logical data isolation model. Self-service users reach a self-service catalog (VMs, apps, services) through role-based access to General, Protected, and Secure catalogs and VM templates. Centralized management covers the VM library, app library, provisioning, automation, monitoring, patching, chargeback, capacity, and reporting. A shared resource pool in a regional data center (16-node compute cluster, 3 TB storage, 10 GB networking) is divided into Secure, Protected, and General zones, with preferred nodes, encrypted LUNs, and dedicated VLANs for the Secure and Protected zones. Templates enforce deployment policies and zone access.]

Figure 2 – Private cloud infrastructure using data classification zones

When designing your security zones, be sure to address the following considerations:

- Self-Service Catalog: contains protected, secure, and general use VM and application templates. Access to the catalog should use role-based access controls to restrict what users are allowed to view and use. For example, users with only General Zone access should not see templates and files that are restricted to Secure or Protected Zone use.
- Automation: the scripts and workflows used during provisioning and de-provisioning should be restricted and treated as data that needs to be classified. For example, an automation script that installs and configures a SQL database should only be made available to users that have access to the templates for the appropriate data classification zone.
- Templates: in the private cloud, templates become the primary method of enforcing deployment standards, with validated configurations for VMs and applications. Templates should be planned and created as part of the audit compliance validation process for regulated systems. Unique templates should be created for each VM, application, or database server for each zone type.
Templates with different data classifications should be stored in separate directories in the shared library, and access controls should limit which users and IT administrators can create or use those templates. VM templates should be used to create a deployment configuration policy for a zone. For example, a Secure Zone VM template should contain:

o VM OS image configured for Secure use
o Hardware configuration that uses encrypted disk drives (in-guest encryption)
o Network configuration that only allows deployment to VLANs that support encrypted traffic
o Storage configurations that only allow deployment to SANs that have encrypted volumes by default, and are classified for zone use

Figure 3 shows an example of how to perform logical and physical segmentation while preserving the data classification zone:

Figure 3 – Private cloud infrastructure using data classification zones

3.3 Security Responsibilities

In a traditional data center environment, the demarcation of security responsibilities between the data center operator and the service user was relatively well defined. Generally, the responsibility was aligned with ownership of the physical component, whether that was a server, a networking device, or the overall network infrastructure; if the IT department owned and administered the server, then that department also managed and updated security on that asset.

With cloud models, security responsibility has shifted: departments may be responsible for a portion of the security on the service that they pay for, depending on the service provisioning model in use. Figure 4 shows the split of security responsibility for the three main cloud provision models.

Figure 4 – Cloud Security Responsibilities

4.0 Private Cloud Security Considerations

The private cloud security model presented in Figure 1 uses the same design as the private cloud reference model but replaces the capabilities with mechanisms for implementing security.
You can use this model to understand the private cloud areas in which you need to include security considerations throughout your design process. In Figure 5 you can see how these components tie in to the different layers of the private cloud reference model:

Figure 5 – Private Cloud Security Model Collapsed

By leveraging the Private Cloud Security Model, you have the opportunity to re-examine the provision of security within your datacenter. You can take a holistic view of the central importance of security and ensure that you achieve this goal within your private cloud design. The following sections cover in more detail the security considerations of the components presented in Figure 4.

Note: In 2013 the Cloud Security Alliance (CSA) launched an updated version of their Reference Architecture that can also be used as a reference for private cloud security considerations. You can also use their Scenario Application to compare business scenarios against the reference architecture.

4.1 Security Foundation Considerations

The private cloud must implement security in all steps of the design process; it must be the foundation for the entire design. Every transaction must then pass through this security wrapper on any data transition within the cloud, for example:

- Client to the service delivery layer
- Service delivery layer to the software layer
- Software to platform layers
- Platform to infrastructure layers
- Provider to the management stack
- Management stack to the software, platform, or infrastructure layers

In addition, security applies to all intra-layer communications, to data being processed, and to data at rest. The exact security mechanism applied will depend on the data type, the source and destination layers, and the environment in which that data is being transmitted, processed, or stored. For example, software developed for cloud implementations should follow the security development lifecycle (SDL) guidelines.
Note: For more information on these guidelines, see Microsoft Security Development Lifecycle, at http://www.microsoft.com/security/sdl/default.aspx.

4.1.1 Identity and Access Management

Identity and access management (IdAM) covers the overarching issue of establishing identity and then using that identity to control access to resources. Identity and access management is fundamental to private cloud design, as you must be able to establish the identity of a cloud consumer and then manage their access to resources within the cloud environment. However, it is important to remember that IdAM also applies to administrators and to services that may access your private cloud.

IdAM can include the following security topics:

- Authentication
- Authorization
- Auditing
- Directory service
- Federation
- Security policies
- RBAC
- Credential management

4.1.2 Authentication

The first functionality that your IdAM framework must provide is that of establishing identity through the process of authentication, for example by requiring a user to enter a user name and a password. These credentials are then checked with a directory service, metadirectory, or other authentication mechanism, typically by using some form of hashing algorithm so that the user’s password is not transmitted across the network.

If the authenticating mechanism validates the user credentials, then the operating system or federation environment generates a token. This token may contain information about the user and the groups of which that user is a member. Alternatively, in a federated environment, the token may contain one or more claims about the user. Note that this token does not contain any permissions.

4.1.3 Authorization

The second part of IdAM is the access management part, which includes the process of authorization. Authorization and authentication have to work together both to identify users and to control the resources that they access.
In server-based computing, authorization typically involves setting permissions on objects, such as files, folders, shares, and processes. In virtualized environments, you also need to set permissions on virtual machines and virtual networks. In private cloud implementations, you should also control compute resources, storage groups, and service endpoints.

Access to a resource comes from comparing a user’s access token with the permissions set on the resource. Typically, these resource permissions are cumulative, so a user with read permission from their own account and read and write permission resulting from group membership has read and write access to the resource. Deny permissions trump allow permissions, so a user with read and write permission from their personal account but who is a member of a group that is denied access to the resource will not have access to the resource.

In private and hybrid cloud environments, identity must be able to flow dynamically between resources that may not have any common mechanism for exchanging identity information. You can accomplish this task by using federation technologies, which implement claims-based authentication against a centralized identity store, such as a directory service. Services, applications, and other resources can then use these claims to check the user’s identity, based on a federation trust model between two or more federation providers.

4.1.4 Role-Based Access Control

Role-Based Access Control (RBAC) is at the heart of user access to, and control of, private cloud resources. Private clouds should abstract away hardware, networks, storage devices, and capacity into logical groups of resources that may run on disparate systems. RBAC should be used to grant access to and control capacity for logical resources. For example, a resource pool with 100 CPUs, 100 GB of RAM, and 10 TB of storage on the networks GuestNet1 and GuestNet2 should use domain-based access to assign users to this grouping of resources.
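The cumulative-permission and deny-override rules described above can be illustrated with a short Python sketch. The function name `effective_permissions`, the access control list layout, and the user and group names are hypothetical and chosen only for this example; real directory services evaluate access in their own, more detailed ways.

```python
def effective_permissions(user, groups, acl):
    """Compute the permissions a user holds on a single resource.

    acl is a list of (principal, effect, permission) entries, where effect
    is "allow" or "deny". Allow entries accumulate across the user account
    and all group memberships; any matching deny entry removes the permission.
    """
    principals = {user, *groups}
    allowed, denied = set(), set()
    for principal, effect, permission in acl:
        if principal not in principals:
            continue  # entry applies to someone else
        if effect == "deny":
            denied.add(permission)
        else:
            allowed.add(permission)
    return allowed - denied  # deny trumps allow

# Read from the personal account plus read/write from a group: cumulative.
acl_a = [
    ("alice", "allow", "read"),
    ("finance", "allow", "read"),
    ("finance", "allow", "write"),
]
print(sorted(effective_permissions("alice", ["finance"], acl_a)))

# Read/write from the personal account, but a group is denied access.
acl_b = [
    ("bob", "allow", "read"),
    ("bob", "allow", "write"),
    ("contractors", "deny", "read"),
    ("contractors", "deny", "write"),
]
print(sorted(effective_permissions("bob", ["contractors"], acl_b)))
```

The first call yields both read and write, and the second yields no permissions at all, which matches the two cases described in the text.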
4.1.5 Anonymous Permissions

A key element of authorization in private cloud environments is the control of anonymous permissions. Anonymous permissions enable unauthenticated users or services to access resources. Many operating systems allow anonymous logons, which are typically used for public access to web sites. With most private cloud implementations, users would typically be known to the organization and therefore authenticated. However, there are scenarios where anonymous access might be required, for example to enable members of the public to interact with an online communication session as a guest. In all cases, there should be strict partitioning between any resource that allows anonymous access and resources that require authenticated access.

4.1.6 Federation Claims

Federation is a mechanism for authenticating users in one security domain and having that authentication accepted in another domain without the requirement for an intrinsic trust relationship between the two organizations. The organizations themselves may be running different operating systems, directory services, certification authorities, and security protocols. Hence, this approach is particularly useful in hybrid cloud implementations and is carried out using claims-based authentication.

A claim is a collection of assertions about a user, such as their user name, email address, or groups of which they are a member. Claims are generated by a security token service (STS) in one organization, where that user is able to authenticate against that organization’s directory service. Claims are electronically signed to prevent tampering in transit, and the communication channel over which claims are exchanged may also be encrypted.

To exchange authentication information, the two organizations establish a public key infrastructure (PKI) trust between the STS in one organization and the STS in the other organization.
When a user wants to authenticate to a resource controlled by the other organization, their logon request is redirected to their home realm and authenticated by the IdAM system in that realm. After authentication, the home STS generates the cryptographically signed security token containing the claims about the authenticated user. This token is then submitted to the requested service at the other organization. Because the home realm has authenticated that user and the cryptographic signing guarantees that the token has not been altered in transit, the target service accepts the token and, depending on the claims in that token, authorizes that user to access the service.

Federation is particularly important in both private and hybrid cloud environments, as services may run in completely different security contexts. In the case of private cloud implementations, the home realm for every user may be the organization’s directory service, and applications or services can be configured to establish federated trust relationships with that STS. The services at the service delivery layer, the applications in the software layer, the virtual machines within the platform layer, the operating systems integral to the infrastructure layer, and the management consoles and services forming the management stack can then all use this federated environment for authentication, authorization, and RBAC.

4.1.7 Auditing

Together with authentication and authorization, auditing is an essential component of your private cloud IdAM environment, particularly with regard to establishing compliance and implementing effective governance. With public cloud implementations, you are also likely to want to achieve Statement on Auditing Standards (SAS) 70 Type I or II accreditation or possibly ISO 27001 compliance to demonstrate to your customers that you take security seriously. With private cloud implementations, showing compliance with external auditing standards may not be so important.
What you will want are answers to the following types of questions:

- Which user accounts have been locked out over the last day, week, or month?
- Who has attempted to access resources to which they do not have permission?
- Have any administrators changed access permissions that would enable them to view consumer data?

These are relatively straightforward questions to answer, and a central auditing system should help you identify when these events occur. But there are questions that might require more sophisticated analysis:

- Are any users or administrators behaving in a suspicious manner?
- Are access requests for one resource being redirected to another resource?
- Are attempts being made to communicate between virtual machines or between virtual machines and host computers?

As mentioned earlier in this paper, a significant change in private cloud implementations is that you can no longer assume that attackers are unauthenticated. Hence your auditing implementation must be able to identify unexpected or suspicious activity and pick that activity out from the thousands of regular operations without imposing an unacceptable performance burden or generating excessive numbers of false positives.

Typically, the starting point for auditing is the directory service, and most commercial directory services implement effective monitoring capabilities. However, you are also going to need to monitor at other levels within your private cloud infrastructure. These levels include monitoring the following resources:

- Firewall
- Service endpoint
- Application
- Database
- Virtual machine monitoring through the operating system
- Host computer operating system
- Network
- Storage
- Management

You can use an audit collection and collation service to forward these events to a centralized database. To ensure that events do not swamp the database, you would implement event filtering so that only events of particular interest are collated.
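As a minimal sketch of the event filtering described above, the following Python example keeps only events of interest before they are forwarded to the central store and then answers the first audit question from the list. The event type names, the dictionary layout, and the function names are assumptions made for illustration; they do not correspond to any particular audit collection product.

```python
from collections import Counter

# Hypothetical set of event types considered worth collating centrally.
EVENTS_OF_INTEREST = {"account_lockout", "access_denied", "permission_change"}

def filter_events(events):
    """Keep only the events that should be forwarded to the audit database."""
    return [e for e in events if e["type"] in EVENTS_OF_INTEREST]

def lockouts_by_user(events):
    """Count account lockouts per user in an already collected set of events."""
    return Counter(e["user"] for e in events if e["type"] == "account_lockout")

# Sample events as they might arrive from different monitored levels.
raw_events = [
    {"source": "directory", "type": "logon_success", "user": "alice"},
    {"source": "directory", "type": "account_lockout", "user": "bob"},
    {"source": "storage", "type": "access_denied", "user": "carol"},
    {"source": "directory", "type": "account_lockout", "user": "bob"},
    {"source": "management", "type": "permission_change", "user": "admin1"},
]

collated = filter_events(raw_events)
print(len(collated), "of", len(raw_events), "events forwarded")
print(lockouts_by_user(collated))
```

Filtering at the collection point keeps routine events such as successful logons out of the central database, at the cost of not having them available later, so the set of event types of interest needs to be chosen with your compliance requirements in mind.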
Analysis tools can then interrogate this database to identify suspicious activities.

Auditing must be of sufficient precision and granularity to track the actions of a single individual through the entire private cloud environment. This end-to-end auditing of individuals is vital for checking the actions of your administrators. The auditing database itself also requires auditing and management. Unless properly managed, the number of events can cause the database to grow excessively.

Finally, the auditing output must directly support any compliance requirements that apply to your organization. Ideally, this information would be displayed as a dashboard, with easily assimilated indicators showing current and historic levels of compliance.

4.1.8 Data Protection at Rest

The main aim of security in cloud environments is to protect data both at rest and in transit. Hence, data protection is a major factor that needs to be incorporated into the very fabric of your private cloud service blueprint. Data protection at rest includes consideration of the following factors:

- Software encryption or hardware encryption
- File and folder encryption or full disk encryption
- ACLs and access control entries (ACEs)
- Storage policies

Protection of data at rest requires that information to be encrypted. This encryption can be applied in a number of ways, depending on factors such as cost, performance, and ease of configuration.

4.1.9 Hardware Disk Encryption

There are two forms of hardware-based disk encryption. One uses a specialized microprocessor that is part of the disk hardware; the other uses either the main processor or a host bus adaptor (HBA). In both cases, hardware encryption enables the entire disk to be encrypted, which gives rise to the term full or whole disk encryption. Full disk encryption protects the master boot record (MBR), the files and folders, the folder structure, and the partition table.
Performance for a disk with dedicated hardware-based encryption is similar to that for a non-encrypted device. Hardware-based encryption with external processing may perform less well if the main processor is busy.

As the whole disk is encrypted, this arrangement protects the disks if they are physically removed from the private cloud environment. As private cloud architectures stress resiliency over redundancy and tend to use arrays of hard disks, this approach prevents data from failed drives being read and also ensures that an attacker cannot read a hard disk that they have physically removed from the environment.

Administrators can instantly and irretrievably wipe a hard disk by using the cryptographic disk erasure process. This process generates a new key for the hard disk, thus making all the old data inaccessible in milliseconds, compared to several minutes for a repeated disk wipe.

The key for hardware disk encryption is typically 32 bytes (256 bits) in length, which gives 2^256 (approximately 1.16 x 10^77) combinations and makes a successful brute force attack highly unlikely. However, after the disk is mounted, the operating system has full access to all parts of the disk, as the encryption is now provided transparently by the hardware. Hence, the limitation of hardware-based encryption in a private cloud environment is that it uses only one key to encrypt the whole disk, and you cannot use the hardware key to partition data on the disk between different tenants.

4.1.10 Software Disk Encryption

Software-based encryption can work in two ways: it can encrypt either the full disk or just a set of specified files and folders. Unlike hardware-based full disk encryption, software-based full disk encryption of the boot disk does not encrypt the MBR. Performance is also reduced compared to dedicated hardware-based encryption, as the operating system needs to decrypt data on the partition.
File and folder-based software encryption does not encrypt an entire volume but enables you to encrypt individual files and folders. The advantage of file and folder-based software encryption is that you can encrypt different folders with different encryption keys, thus enabling data partitioning between users or business units that is not possible with full disk encryption.

When you create your design, you will need to consider whether you need disk encryption and then how to apply that disk encryption. Note that you can combine dedicated hardware-based full disk encryption with software-based file and folder encryption to reap the benefits of both systems.

4.1.11 Data Protection in Transit

Data protection in transit is a different proposition to data protection at rest and requires you to consider a range of approaches and technologies to provide effective security. While almost all organizations would consider encrypting data that transits the Internet, many are still not implementing equivalent levels of encryption and data protection within their organizations. Private cloud environments should also seek to improve security by implementing encryption for every transaction, not just those from the client to the service endpoint.
Hence, your design must consider encryption of the following data transit paths:

- Service endpoint to software layer
- Software layer internal communications
- Software layer to platform layer
- Platform layer internal communications
- Platform layer to infrastructure layer
- Infrastructure layer internal communications
- Management layer to service delivery, software, platform, and infrastructure layers
- Private cloud to public cloud environment (for hybrid implementations)
- Physical transportation of the data from one datacenter to another
- Public cloud storage to on-premises storage

To use encryption to protect your data in transit, you may consider the following technologies:

- Secure Sockets Layer (SSL) or Transport Layer Security (TLS)
- IP security (IPsec)
- Virtual private networking (VPN)

All of these encryption approaches use symmetric key bulk encryption combined with asymmetric public/private key pair encryption to exchange the bulk symmetric key between the sending and receiving parties. This approach ensures that encryption does not place too great a processing load on the hosts at either end of the encrypted link. If a private cloud implementation requires the service delivery layer to accept large numbers of simultaneous encrypted connections, specialized SSL offload processors are available to offload the initial handshake process that then sets up the symmetric bulk encryption session.

Although SSL 3.0 has been widely accepted as the basis for securing web browsing sessions, this version of the protocol is now seen as less secure than later implementations. The introduction of TLS 1.0 (SSL 3.1) in 1999 further improved security, and the latest version of this protocol is now TLS 1.2 (SSL 3.3), which is specified in RFC 5246 and was published in 2008. TLS operates at the transport layer of the Internet protocol suite, which includes Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Stream Control Transmission Protocol (SCTP).
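Because older protocol versions such as SSL 3.0 are considered less secure, a client or service can be configured to refuse them. The following is a minimal sketch using Python's standard `ssl` module; the function name `make_client_context` is an assumption for this example, and the choice of TLS 1.2 as the floor simply follows the version discussion above.

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that rejects SSL 3.0, TLS 1.0, and TLS 1.1."""
    # create_default_context() enables certificate verification and
    # host name checking for client connections.
    context = ssl.create_default_context()
    # Refuse to negotiate anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

context = make_client_context()
print(context.minimum_version)
print(context.verify_mode == ssl.CERT_REQUIRED)
```

A context built this way can be passed to the standard library's socket-wrapping and HTTP client functions, so the protocol floor is set in one place rather than in each connection.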
In your private or hybrid cloud design, you will most probably use TLS to secure the client or provider to service delivery layer traffic and the service delivery layer to software layer connection. IPsec is a protocol suite that provides encryption, mutual authentication, and cryptographic key exchange during a communication session between two hosts. IPsec operates at the Internet layer of the Internet protocol suite. VPNs are most commonly used to create secure tunnels through public networks. The advantage with VPN technologies is that after the VPN is created, it acts like a physical network, directing outgoing traffic to the VPN default gateway regardless of the intervening network appliances. VPNs can use a range of technologies, including Point to Point Tunneling Protocol (PPTP) and a combination of IPsec and Layer 2 Tunneling Protocol (L2TP). Applications that can’t connect over Internet protocols can typically connect quite happily by using VPNs. Hence you can use VPN connections as part of your private cloud implementation to connect consoles in the management stack to managed services in the software, platform, or infrastructure layers. Key length has a direct effect both on the speed of encryption and the security of the resulting data exchange. For bulk encryption of SSL/TLS traffic, a 40-bit symmetric key length is woefully inadequate and even the 56-bit Data Encryption Standard (DES) key length is regarded as obsolete. 128-bit encryption is still judged strong enough for private cloud use but many implementations now use 192 or 256-bit symmetric key lengths specified in the more secure implementations of Advanced Encryption Standard (AES). Note: There are still restrictions in relation to the export of cryptographic technology. You should check if these restrictions apply to your location, as you may not be able to use longer key lengths. To exchange the symmetric key, you need to use an asymmetric key pair with equivalent computational security. 
Because public-private key pairs can be broken by integer factorization as well as brute force, asymmetric keys must be considerably longer than the symmetric key that they are protecting. The following table shows the key lengths that provide equivalent protection for symmetric and asymmetric key types using the Rivest-Shamir-Adleman (RSA) algorithm:

Symmetric key length    RSA asymmetric key length
112-bit                 2048-bit
128-bit                 3072-bit
256-bit                 15360-bit

Note that 1024-bit asymmetric keys are now regarded as insecure. For private cloud implementations, the best balance between security and key length is a combination of 128-bit symmetric keys protected by 2048-bit or 3072-bit asymmetric keys.

For the highest levels of transport security, you may need to consider data tokenization. This approach is typically used in Payment Card Industry (PCI) environments that must conform to the PCI Data Security Standard. Tokenization replaces the confidential data with values that are not confidential. The confidential data is not transmitted but the token can be used to reference that information from the tokenization data store. This approach is also widely used for medical records, bank transactions, vehicle registration details, and credit card payment information.

In a scenario where you have multiple datacenters and there is a need to physically transport the data from one location to another, ensure that the data is also encrypted while in transit. Another scenario where data will be moving from one datacenter to another is when you are moving data located in a service provider datacenter (for example, a public cloud storage service) to the private cloud storage located on-premises.
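The tokenization approach described earlier in this section can be illustrated with a small in-memory sketch. The TokenVault class and the "tok_" prefix are hypothetical names chosen for illustration; a real tokenization data store would be a hardened, access-controlled, and audited service rather than a Python dictionary.

```python
import secrets


class TokenVault:
    """Illustrative tokenization store: maps random tokens to confidential values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a non-confidential token that stands in for the value."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the confidential value; only the vault can do this."""
        return self._token_to_value[token]
```

Only the token travels between layers; a component that needs the original value must ask the vault, which is where access control and auditing would be applied.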
Note: If you are using Microsoft Azure, you can use the Import/Export capability to move data from Azure storage to on-premises storage. For more information about this capability, see http://blogs.msdn.com/b/windowsazurestorage/archive/2013/11/01/announcing-windowsazure-import-export-service-preview.aspx

4.1.12 Certificates

In a private cloud implementation, you may need to use digital certificates for TLS/SSL encryption, for client authentication, server authentication, and for a range of other security-related purposes. In common with non-cloud implementations, you will most probably use X.509 v3 certificates for these activities. However, a significant difference with a private cloud environment will be the requirement to create large numbers of certificates as part of the provisioning process. For example, if you want to provision a secure web server within a virtual machine, then that web server will require a host name and corresponding IP address. To implement TLS/SSL encryption, the provisioning process must also create an X.509 certificate with a common name that matches the host name of the default web site on the virtual machine. In consequence, you will need to create certificates and bind them to web sites or applications as part of your provisioning process. This requirement is likely to mean that you need to implement an internal certification authority (CA) to generate these certificates or connect to a public CA and request a new certificate for each provisioning action that requires encryption support. In public cloud implementations, the certificates that you provision would have to show a certificate chain relationship to a root certificate issued by a trusted CA. The client computer would have this root certificate installed in its trusted root certificate store. With private cloud implementations, you have the option of using a private CA within your cloud infrastructure to respond to these provisioning requests.
However, you would then have to ensure that your client computers trusted the issued certificates. This trust is automatically established with domain membership in Microsoft Windows if the root CA certificate is published in Active Directory, but with non-domain-joined computers and mobile devices you would have to arrange for the root certificate to be installed in each client’s trusted root store.

4.1.13 Security Monitoring and Response

Just as security needs to be pervasive in private cloud environments, so does your security monitoring. This security monitoring also needs to be tightly integrated with your overall monitoring environment. Effective security monitoring must not only be integral to your private cloud operation but it must be able to cope with the rapid expansion and self-service elements of private cloud operation. For example, because consumers can provision and deprovision resources rapidly, you may not know exactly how many virtual machines are currently in operation or how many applications you are hosting. And if you don’t know what resources are online at any one time, you can’t secure those resources. When a virtual machine is started up, you need to ensure that either agent-based or agentless monitoring starts as soon as the virtual machine is operational and that the results of that security monitoring are run through the intrusion detection system and collated in the auditing database. You also need to be able to confirm that the virtual machine conforms to your security policies and is not acting suspiciously. Auditing and security monitoring by themselves are not enough; you also need to respond to incidents in a timely manner and take decisive action to contain a security breach. In cloud-based environments, this requirement may mean implementing tools that take automatic action rather than wait for human intervention.
For example, if a user appears to be acting suspiciously, your security management software should be able to terminate that user’s session and disable his or her account before alerting an administrator.

4.1.14 Security Management

Security management is the overall capability that provides the ability to manage and control all security aspects of your private cloud implementation. You will need to consider the following factors when creating your private cloud service blueprint:

Proactive and reactive management. In private cloud environments, security management needs to be both proactive and reactive. You must implement proactive security through risk assessments, threat modeling, data classification, security policies, preconfigured virtual machine templates, access rights, update procedures, reduced attack surfaces and so on. You will also need to implement reactive security management by using security monitoring and automated security responses.

Risk assessments. By identifying the types of risk inherent to your operations, you can identify the major security threats to your private cloud infrastructure.

Threat modeling. From your risk assessment, you can model the threats and classify them according to severity. As has been pointed out already, private cloud environments change the nature of the threat as well as provide new and different potential attack vectors.

Data classification. Not all data requires classification at the same security level. For example, your organization’s published policy statement on sustainable IT does not require the same protection as a spreadsheet with financial projections for a new product launch. Your security management must allow for these differences and afford the correct level of protection for different data types.

Security policies. Security policies need to be defined, created, applied, monitored, adjusted, and maintained.
These policies should apply enough protection in a loosely coupled environment to secure the data without hindering employees from carrying out their work.

Attack surface reduction. Attack surface reduction is important at every level within the private cloud architecture. This principle also applies to virtual machines.

Update procedures. Your security management environment must implement processes for updating host operating systems, virtual machines, development environments and applications rapidly and reliably. With private cloud environments, you have additional options which may involve provisioning or cloning a virtual machine, applying security updates to the clone, and then reverting to the original if the updates cause malfunctioning. Note that in certain situations, compliance or application requirements may dictate that the provider does not have access to the virtual machines before provisioning. Hence, the provisioning, updating, and mounting process must be fully automated.

Your security management environment must control these processes where possible and verify that any changes that take place do not affect the operational integrity of your environment. In addition, your security management environment must not introduce additional security vulnerabilities, so principles such as encrypted communications with IPsec, RBAC, two-factor authentication and attack profiling must apply to the management environment as well.

4.2 Infrastructure Security Considerations

Now that you have examined the factors within the security wrapper, this paper presents the security issues that apply at the Infrastructure layer. Infrastructure security is the most comprehensive aspect of private cloud security, as it starts with physical security and goes up as far as security of the hypervisor element of the virtualization environment.
The layers of infrastructure security consist of: physical security; supply security; facility security; hardware security; network security; compute security; storage security; host operating system security; and hypervisor security. Figure 6 shows how these security elements apply at the corresponding levels of the infrastructure layer.

Figure 6 – Infrastructure security issues

4.2.1 Physical Security

The physical security requirements of private cloud implementations tend not to differ substantially from those of an internal datacenter. However, as all security starts with good physical security, it is worthwhile highlighting two factors that are of importance.

Physical access control. Effective security starts with restricting the people who have access to the physical hardware. Newer technologies such as smartcards, fingerprint scanners, and retina scanners are now in common use but all are equally ineffective if you haven’t put bars over the windows to the room that houses your private cloud hardware. Many organizations pay for extensive electronic penetration testing yet fail to implement any form of physical security assessment. A plate glass window rapidly transforms into a door after the application of the heavy end of a fire extinguisher. In consequence, doors to the data center need to be able to withstand physical attack using any item likely to be readily available to an intruder. Employees must participate in a security awareness program to avoid common mistakes such as allowing tailgating (following people through doors) and shoulder surfing.

Physical data leakage. Because private cloud implementations tend to make greater use of simple arrays of hard disks and swap out failed disks on pre-determined maintenance schedules rather than every time a disk fails, you have the situation where multiple hard disks may be replaced at the same time.
Unless there are effective security controls to ensure that these hard disks are wiped before they are removed from the premises, there is the potential for data leakage.

4.2.2 Energy Supply Security

A private cloud implementation requires electrical power to operate. It may also require a reliable Internet connection. Unless the provision and security of these services is assessed and the risks to each supply evaluated, your overall security cannot be accurately estimated. Backup electricity generators and dual power sources (dual utility feeds) are approaches to improving security of the electricity supply. Although significantly more expensive, mirrored or standby data centers in another location can provide equivalent security combined with resilience in the case of significant disruption at one site.

4.2.3 Facility Security

Facility security addresses issues with providing essential services to the infrastructure, such as cooling facilities, power supplies, cabling, and physical networking. In these areas, private cloud implementations are more closely aligned to dynamic data centers, where rapid provisioning and deprovisioning may result in large fluctuations in cooling and power requirements from powering up and down the physical servers. Additionally, in private cloud environments, the physical networking topology may differ to account for the change in trust level of the internal network.

4.2.4 Network Security

Many network architectures include a tiered design with three or more tiers such as core, distribution, and access. Designs are driven by the port bandwidth and quantity required at the edge, in addition to the ability of the distribution and core tiers to provide higher speed uplinks to aggregate traffic. A dedicated management network is a frequent feature of advanced data center virtualization solutions.
Most virtualization vendors recommend that hosts be managed via a dedicated network so that there is no competition with tenant traffic and to provide a degree of separation for security and ease of management purposes. This historically implied dedicating a network adapter per host and a port per network device to the management network. Private cloud networking has to deal with multi-tenancy issues, but access controls to perimeter networks should follow established best practices, such as deploying edge servers and devices with strict traffic flow shaping. With multi-tenancy, the private cloud needs to be able to support reuse of IP addresses with port-based or private VLANs and remote tunnels through virtual security gateways. Managing the network environment in a private cloud can present challenges that must be addressed. Ideally, network settings and policies are defined centrally and applied universally by the management solution. For VLAN-based network segmentation, several components including the host servers, host clusters, Virtual Machine Manager, and the network switches must be configured correctly to enable both rapid provisioning and network segmentation. On the hypervisor hosts and host clusters, virtual switches should be defined on all nodes so that a virtual machine can fail over to any node and maintain its connection to the network. At large scale, this can be accomplished via automation. Private clouds inherently have the concept of internal trusted networks and external untrusted networks.
The following is an example of how these networks can be distributed. Trusted networks should be: cluster communication (non-routable); management network (non-routable); virtual machine migration network (non-routable); and storage network (non-routable). External untrusted networks are: guest networks (VLAN tagged or not); perimeter networks (controlled by firewall rules); and secure networks (highly controlled networks with IPsec or other authorization before communication is initiated).

Figure 7 – Example of a network topology

All of these networks are presented to the private cloud hosts and the virtual switch is used to manage access and traffic flows. However, this change does not mean that different firewall rules cannot apply to cloud to internal network connections compared to cloud to Internet connections. Instead of an all-encompassing network labeled “External”, you would create a new network called “Internal Network” and assign an IP address range to that network. For communications to take place, you then specify what ports are open and which protocols are allowed through the perimeter network into the cloud network. You can then configure additional firewall rules that govern communications between the internal network and the cloud network. You will be monitoring your firewalls as part of your overall security monitoring and using this information to ensure that client access attempts follow prescribed paths to specific services. Any attempts to access a service or virtual machine from a session that should not be connecting to that resource should result in automatic termination of the session, locking out the account, and alerting security to carry out a forensic follow-up examination. As with all IT infrastructures, the routers, firewalls, and switches must be kept updated, otherwise these components can provide an open door for intruders. A private cloud implementation will also make heavy use of network partitioning through virtual local area networks (VLANs).
These VLANs can be implemented both through the physical network switches and as part of the virtualization environment. VLANs help ensure that packets can only travel along the network segments that they should be traversing. The dynamic nature of virtualized environments may cause the location of a server and its corresponding storage to change, based on shared resource usage and dynamic virtual machine placement. These dynamic movements may increase network latency and reduce routing effectiveness. In addition, network topologies such as spanning tree layouts may not work in private cloud environments. Shared network services such as DNS also need to be included in the security assessment of your private cloud implementation. As with most network designs, any service should provide redundancy and should not implement a single point of failure. Any known vulnerabilities in these services must be addressed and best security practice applied. Private clouds can also take advantage of converged networks, where different types of network traffic share the same Ethernet network infrastructure. Figure 8 shows an example of a converged network.

Figure 8 – Example of a network topology

Note: For an example of how to configure the converged network shown in Figure 8 using Windows Server 2012, see the article Network Recommendations for a Hyper-V Cluster in Windows Server 2012.

Network virtualization is another capability that should be leveraged in a private cloud. Network virtualization decouples the customer’s virtual networks (tenant) from the physical network infrastructure of the hoster (private cloud owner), providing freedom for workload placements inside the datacenters. Virtual machine workload placement is no longer limited by the IP address assignment or VLAN isolation requirements of the physical network because it is enforced within the hypervisor hosts based on software-defined, multitenant virtualization policies.
Some other advantages of using network virtualization are:

Enables easier management of decoupled server and network administration. Server workload placement is simplified because migration and placement of workloads are independent of the underlying physical network configurations. Server administrators can focus on managing services and servers, and network administrators can focus on overall network infrastructure and traffic management. This enables datacenter server administrators to deploy and migrate virtual machines without changing the IP addresses of the virtual machines.

Simplifies the network and improves server/network resource utilization. The rigidity of VLANs and the dependency of virtual machine placement on a physical network infrastructure result in overprovisioning and underutilization. By breaking the dependency, the increased flexibility of virtual machine workload placement can simplify the network management and improve server and network resource utilization.

Note: For more information about Network Virtualization in Hyper-V, see the article Hyper-V Network Virtualization Gateway Architectural Guide.

4.2.5 Hardware Security

Hardware security in private cloud environments must take account of the differing features of private cloud implementations. Private cloud implementations typically have very high levels of commoditization, so any hardware security devices will need to be implemented on large numbers of host computers, hard disks, or network cards. These hardware security devices can include hardware security modules (HSM) to protect cryptographic keys or offload cryptographic processing, most commonly for asymmetric key calculations. Note that symmetric key cryptographic calculations tend to be executed in software on the main processor, as the time to transfer the data to an external device and back again reduces the computational advantage of the dedicated processor in the HSM.
Hardware security devices can also include Trusted Platform Modules (TPM) for disk encryption. However, private cloud implementations are more likely to use a combination of hardware full disk encryption alongside software file and folder encryption. Host security can include setting the basic input/output system (BIOS) to require a password at boot time and in order to access the BIOS settings. Although there are known mechanisms for defeating BIOS security, this form of hardware security on the host computers still has a place in private cloud environments. You should include firmware such as system BIOS updates as part of your maintenance cycles. Dynamic migration of virtual machines simplifies this process, as any virtual machines can be moved from a host computer, which is then updated and brought back online, after which the virtual machines are moved back to that host. In the BIOS, ensure that booting from unauthorized sources is disabled. Turn off all unused USB ports and disable CD/DVD-ROM drives.

4.2.6 Compute Security

Private cloud environments consist of significant numbers of compute resources, implemented either as small form factor hardware compute units (for example, blade servers) or as virtual machines running on host computers (which may also use a blade format). It is essential that there is effective security on the processes that run within the associated memory of these compute resources. Process isolation enables tight control of processes running within the operating system and constrains operations only to designated objects or targets. Authorization rules govern which initiators can access which targets. In private cloud implementations, it is important that processes that run on compute resources are tightly locked into ownership of those processes. Memory segments should also be routinely wiped or set to zero when allocated to another process. In addition, areas containing data such as the page file should be wiped when the computer powers down.
It is also important to use industry-standard capabilities such as Secure Boot to help ensure that the servers in each compute node boot using only software that is trusted by the PC manufacturer. When the server starts up, the firmware checks the signature of each piece of boot software, including firmware drivers (Option ROMs) and the operating system. If the signatures are good, the server boots, and the firmware gives control to the operating system. Secure Boot requires a PC that meets the UEFI Specifications Version 2.3.1, Errata C or higher. Secure Boot is supported for UEFI Class 2 and Class 3 PCs. For UEFI Class 2 PCs, when Secure Boot is enabled, the compatibility support module (CSM) must be disabled so that the PC can only boot authorized, UEFI-based operating systems. Secure Boot does not require a Trusted Platform Module (TPM).

Note: For more information about Secure Boot in Windows operating systems, read Secured Boot and Measured Boot: Hardening Early Boot Components Against Malware.

Finally, memory dump files can contain sensitive application information, including user names and passwords. You should ensure that any memory dump files are secured against unauthorized access.

4.2.7 Storage Security

The encryption factors in storage security have already been discussed. However, there are other considerations that arise from the use of pooled storage in a multi-tenant environment. As with compute resources, storage allocation should ensure effective partitioning between tenants and strictly enforce ownership of the storage space. When an IaaS compute resource is provisioned, it is allocated the dedicated compute and storage resources and access to those resources is restricted to the commissioning tenant. In operation, that storage space is kept isolated from other tenants. Encryption and ACLs prevent other users from accessing the data stored in those locations.
Ensure that you use storage data classifications as part of security zones (PCI, FISMA, general). Private cloud security zones are created through storage partitioning, which creates a separate volume for each data classification. For example, suppose that a private cloud is designed to support both government and financial industry data. The cloud should have separate security zones to manage that data. In storage, create at least three data classification zones: one for government (FISMA), one for financial (such as PCI), and one for general purpose. The provisioning process controls what data is allowed to be provisioned in each zone. For example, a virtual machine for government use should only be provisioned in the government storage zone.

Finally, when a tenant or administrator deprovisions a storage resource, it is important that all the data on that volume is wiped. If the associated compute resource has local storage, any local and transient data on that compute resource should also be destroyed. Any resources that a tenant has used must return to the respective resource pools in a completely sanitized state and bear no recoverable imprint of the data that the tenant was using. When reviewing storage security, ensure that you also consider data held in caching controllers. This information must be wiped as part of the deprovisioning process.

4.2.8 Operating System Security

Operating system security in the infrastructure layer generally involves configuring the host operating systems that support the virtualization environment. As with all operating system configurations, a key approach is reducing the attack surface to an acceptable level. The level to which you need to reduce the attack surface will depend on your overall risk management strategy and threat model. Virtualization environments generally do not require graphical user interface (GUI) support from the operating systems on which they run, hence you should consider using a version of the operating system that does not include this component.
Any services that are not absolutely essential to the virtualization environment should be disabled. The provisioning process for host operating systems should include application of operating system security policies. These policies should set appropriate levels of operational security and include IPsec policies to control the servers to which each computer can connect. Provisioning at the infrastructure level also needs to include creation of certificates to match the host name of the provisioned host. The use of bare-metal operating system provisioning can provide assurance that the operating system is deployed consistently. Before the host computer is connected to the network, it must have all relevant security updates applied. Only then is the new compute resource brought online and made available to the requesting tenant. The deployment of the appropriate management tools for operations management, configuration management and capacity management should be part of the deployment.

4.2.9 Virtualization Security

Although virtualization is not a pre-requisite for private cloud implementations, the operational flexibility that this technology brings means that it is almost a requirement for meeting the need for rapid elasticity. Typically, virtualization is implemented as a part of a dedicated version of a server operating system without the GUI. Virtualization requires hardware support from the chipset. However, most modern server designs provide the full range of virtualization support. From the security perspective, however, it is important to understand that if your private cloud environment is not using virtualization, then you should reduce the potential attack surface by disabling the hardware virtualization support within the system BIOS. The virtualization environment should be updated as part of the operating system and this updating needs to be applied before any guest virtual machines are run on it.
In a private cloud environment, additional services or applications should not run on the host computer, with the possible exception of anti-virus applications. This anti-virus scanning should then be included in the comprehensive security monitoring of the whole environment.

Note: For more information about Hyper-V Security read Hyper-V Security Guide.

4.2.10 Update Security

The final component in infrastructure security covers applying security updates to the entire infrastructure layer. This process should also include updates to the switches, firewalls, and firmware. As with most aspects of private cloud provision, the key attribute is high levels of automation. Updates need to be delivered in a timely manner and targeted correctly at each running host computer. Newly provisioned computers require updates to be applied before being brought online and the management interface must keep track of the update status of all running host computers and virtual machines.

Private cloud environments do significantly facilitate the process of applying security updates, as you can use the pooled resources feature to your advantage. Because no virtual machine is tied to any one host computer and no compute resource is tied to any physical host computer, you have the flexibility to move resources around while you update operating systems or carry out other maintenance tasks. For example, if you need to update the hypervisor on your host computers, you can simply start with one host computer, live migrate the running virtual machines onto other host computers, apply updates to that host, reboot, test functionality, and then live migrate the running virtual machines from another host computer onto that updated computer. You continue this action until you have updated all host computers. Update testing is also simplified, as you have the ability to provision hardware and software to carry out that testing.
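The rolling hypervisor update described above can be sketched as a simple loop over hosts. This is an illustrative model only: the placement dictionary, the rolling_host_update function, and the update_host callback are assumptions made for this sketch, and a real management tool would also check capacity and perform the actual live migration.

```python
from typing import Callable, Dict, List


def rolling_host_update(
    placement: Dict[str, List[str]],
    update_host: Callable[[str], bool],
) -> List[str]:
    """Update hosts one at a time, draining each host before it is touched.

    placement maps a host name to the virtual machines running on it and is
    modified in place. update_host stands for "apply updates, reboot, test"
    and returns True when the host passes its functional test.
    """
    if len(placement) < 2:
        raise ValueError("at least two hosts are needed to drain one of them")
    updated = []
    for host in list(placement):
        # Live migrate every VM to whichever other host is least loaded.
        for vm in list(placement[host]):
            target = min(
                (h for h in placement if h != host),
                key=lambda h: len(placement[h]),
            )
            placement[host].remove(vm)
            placement[target].append(vm)
        if not update_host(host):
            # Stop here; the failed host stays drained for investigation.
            raise RuntimeError(f"update failed on {host}")
        updated.append(host)
    return updated
```

Because a freshly updated host is empty, it is the least loaded target when the next host is drained, which matches the pattern of migrating virtual machines from another host onto the updated computer.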
By taking snapshots of running virtual machines before updating, you have an immediate fallback position should the update fail. As long as your datasets for each application are independent of the application, failure of the virtual machine should not corrupt the data and when the previous image of the virtual machine is restored, the application should function as normal. Hence, private cloud environments give significant benefits when testing, deploying, and rolling back security updates. In a hosted or hybrid environment, the cloud service provider might not have permission to apply updates to virtual machines, particularly with IaaS provision. In this case, the provider must make the update tools available to the consumer and ensure that they are used properly to keep the consumer’s environment up-to-date.

4.3 Platform Security Considerations

Having applied effective security to your infrastructure, you can start to examine security at the platform level. Good platform security is essential for high levels of application security in the software layer. Unless you have addressed potential attack points at the platform level, you are potentially compromising security of all your applications. You also need to consider security between the platform and the infrastructure layers. Figure 9 summarizes these security considerations.

Figure 9 – Private cloud platform security issues

The first part of platform security is at the virtualization level; here you must protect the virtual machines from each other and from the host computers. The host computers must also be protected from the virtual machines. To achieve high levels of security, you must consider each virtual machine as having its own defensive perimeter. These defenses will consist of a guest firewall, anti-virus, system policies, and IPsec-secured communications.
In addition, you may also want to apply intrusion detection and security monitoring on the guest virtual machine, using either an agent-based or agentless mechanism. In effect, you are applying server security best practice to the virtual machines. To support the rapid elasticity attribute of private cloud environments, virtual machines are typically provisioned from templates. As a virtual machine is provisioned, it must have the latest security updates applied, the anti-virus definitions updated, any policy changes implemented, and monitoring agents brought up to the latest release. A machine certificate needs to be installed and IPsec policies applied before bringing the virtual machine online in the production environment. Virtual networking simplifies this process, as the provisioning system can switch the virtual machine into a limited access security update virtual network to carry out this updating before switching it across to the production environment network. Note: If you are using System Center 2012 to manage your Private Cloud, cloud services are upgraded by selecting a new version of the service template. For more information on how to perform that read How to Upgrade a Service Deployed to a Private Cloud. As mentioned in the infrastructure section, if the provider does not have access to the virtual machines in a PaaS environment, then there needs to be a mechanism for applying security updates to these virtual machines. Security updates to the virtual machines should also address other platform components, such as application frameworks, user experience (UX) services, integration services, queuing services, and so on. If you are providing PaaS for your consumers, then at the end of this provisioning process, they should be able to connect to the virtual machines and start developing applications. If you are providing SaaS, you can start installing your applications and running services. 
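As an illustration of the provisioning sequence described in section 4.3, the sketch below keeps a new virtual machine on a limited-access update network until every hardening step has succeeded. It is a sketch under assumptions: the step names, the `vm` dictionary, and the `steps` callables are hypothetical placeholders rather than any real provisioning system's interface.

```python
REQUIRED_STEPS = (
    "security_updates",
    "antivirus_definitions",
    "policy_changes",
    "monitoring_agents",
    "machine_certificate",
    "ipsec_policy",
)


def provision(vm, steps):
    """Run every hardening step on the update network, then go to production.

    vm is a plain dict and steps maps step name -> callable returning True on
    success; both are hypothetical placeholders.
    """
    vm["network"] = "update"            # limited-access update network
    for name in REQUIRED_STEPS:
        if not steps[name](vm):
            vm["status"] = f"failed:{name}"
            return vm                   # never reaches production
    vm["network"] = "production"
    vm["status"] = "online"
    return vm
```

The design choice here is fail-closed: a virtual machine that misses any step stays on the update network rather than joining production in a partially hardened state.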
4.3.1 Data Security The platform layer also includes access to data services, so you should consider security aspects of this storage as well. Because of the generalized increased threat levels (not just to private cloud implementations) it is important that you take the view that all data is accessible, wherever it is stored. The principle of security through obscurity is well and truly discredited, as attackers with administrator rights can gain access to all levels of a private cloud environment. If a data bit is stored, you must assume an attacker can access it. Only the combination of encryption, ACLs, monitoring and auditing can provide effective levels of security. Other considerations with data security require you to consider the lifecycle of a data bit. Within private cloud environments, data bits are not written just to one location on a single hard drive. The requirement for resilience results in that information being replicated to multiple locations. In addition, this data may appear on caching disk controllers, in temporary files, or in other stores through application-level or operating system replication. Finally, data at rest is always more vulnerable than data in transit. There are technologies that enable attackers to intercept data in transit between two hosts, but it may not be possible or practicable to reconstruct that data. In any event, data intercepted in transit can only compromise that individual transmission, whereas accessing data at rest can provide the entire data set. Hence data security is a key factor that requires extensive investigation. Although the user perception of cloud services is that their data is “somewhere out there”, as an operator you cannot afford to take such a lax view. You must implement strict data security and review where your data resides from the moment of writing it to disk to the point at which it is scrubbed or encrypted beyond recovery. 
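One way to reason about the data lifecycle described above is to record every location a data item is written to and to treat the item as retired only when all of those locations have been scrubbed. The sketch below is a minimal illustration; `DataLocationRegistry` and the location names are hypothetical, and a real system would need to discover replicas, caches, and temporary files automatically.

```python
from collections import defaultdict


class DataLocationRegistry:
    """Tracks each place a data item has been written until it is scrubbed."""

    def __init__(self):
        self._live = defaultdict(set)

    def record_write(self, item_id, location):
        self._live[item_id].add(location)

    def record_scrub(self, item_id, location):
        self._live[item_id].discard(location)

    def remaining(self, item_id):
        """Locations that still hold a recoverable copy of the item."""
        return sorted(self._live.get(item_id, set()))

    def fully_retired(self, item_id):
        return not self._live.get(item_id)
```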
4.3.2 Application Framework Security Your choice of application framework will depend on the type of applications and the development environment that your cloud environment will support. Hence, you will need to ensure that you apply strict standards in terms of what application framework types and versions are available, how those frameworks can be used, and how you update them. 4.3.3 Development Environment Security Your provision of development environment may result from your customer requirements or may be something that you impose as an organizational standard. However, the larger the number of development environments that you support, the greater the challenge of providing adequate security. Whatever development environment you provide, it is important that your consumers implement best security practices into the applications that they create following the principles of SDL. Factors such as using appropriate class design to reduce attack surface area, developing robust exception management, avoiding threading vulnerabilities and so on apply even more in a private cloud environment. Providing consumers with a sandboxed environment can significantly reduce the threat from poorly secured code that your customers create. When tenants deploy their applications, strict application partitioning is essential. Each tenant’s application must be completely bounded within its environment and not able to access other tenant applications or data. Any attempts to do so must be detected and that application instance suspended until you can complete your forensic analysis. 4.3.4 Update Security Update security in the platform layer shares similar factors as the infrastructure layer. Updates need to be tested and deployed rapidly while minimizing downtime. Virtualization and virtual machine snapshots can assist in this process by creating fallback positions so that platform components can be updated. 
Private cloud environments simplify this process in that updates to development environments can be carried out when the development environment is not in use by the consumer. Again, you must consider the circumstances in which you might not have access to the virtual machines to make these updates.

4.4 Software Security Considerations

As the highest level of the private cloud service provision layers, software security brings its own specific security challenges that are unique to an environment that hosts live applications. Figure 10 shows these areas.

Figure 10 – Security in the software layer at the private cloud

4.4.1 Application Security

Application security in private cloud implementations has many commonalities with data center application hosting. All the usual best practices about making applications secure by design and secure by default apply equally in the private cloud. However, the following issues are specific to the cloud.

Application partitioning. The requirement for multi-tenant support in private cloud environments requires strict application partitioning, where provisioned applications only service requests from users within the provisioning consumer’s organizational unit or virtual team. Supporting this multi-tenant model requires full integration between each running application and the authentication and authorization mechanisms. Typically, authentication would be carried out through federated identities, using an industry-standard federation model such as Security Assertion Markup Language (SAML) token exchange.

Client trust levels. With private cloud implementations, you may not have the same level of control over client types, operating systems, browser types, update levels and anti-virus security as with a more tightly-controlled network, particularly if you are making use of the universal connectivity aspect of cloud provision. In consequence, applications that you create should validate and constrain all client input by checking it for type, range, length, and format.

4.4.2 Update Security

Update security in the software layer involves similar considerations to updates in the platform layer. Again, the deployment flexibility and virtualization features assist with installing application security updates and rolling back a complete application if an update fails.

4.5 Service Delivery Security Considerations

The purpose of the service delivery layer is to make the services in the private cloud environment available to the consumer. The service delivery layer also provides the interface through which consumers can connect. Capabilities that the service delivery layer provides are:

Service end-points – provides the connection points to the SaaS, PaaS or IaaS hosted services.
Self-service portal – enables consumers to request cloud resources and to return those resources to the general pool when no longer required.
Service catalog – lists the services to which consumers can connect.
Service provisioning – enables consumers to provision virtual machines, development environments, or applications.
Billing – converts service usage into cost values and dispatches bills automatically.
Service contracts – publishes and maintains a register of SLAs and operating level agreements (OLAs) for each tenant.
Metering – accounts for consumers’ usage of cloud resources and sends this information to the billing capability.
Service reporting – reports on the service levels actually provided and compares these levels to SLAs.

As this layer provides the service connection to the consumer, security is a critical issue with all these capabilities. Figure 11 shows the elements of the service delivery layer that require this security.
Figure 11 – Service Delivery Security As with the software, platform, and infrastructure layers, the security capabilities of IdAM, data protection, security monitoring, security management, authentication, authorization, RBAC and auditing all apply to the service delivery layer. 4.5.1 Connection Security Connection security is a key element in securing the delivery of services, as consumers will always be accessing these services using a remote network connection. With private cloud environments, this paper has highlighted why you should consider the internal network as an untrusted network alongside the Internet. Hence, all client connections should be treated with the same level of minimal trust. Establishing a secure connection to a client helps to ensure integrity of the data and makes it more difficult for an attacker to compromise the data stream. Hence, techniques such as TLS/SSL encryption using a minimum of 2048-bit public/private key pairs and 128-bit bulk encryption keys are essential. Authentication is also a key requirement, as your private cloud environment should typically not be accepting unauthenticated requests. If you have a public web site that accepts anonymous requests, then this site should be hosted separately by a commercial hosting provider. Certificates used for TLS/SSL traffic can be third-party or generated automatically by your internal CA. Regardless of the process that you use, clients should have the root certificate stored in their trusted root certification store to prevent error messages on connection. Users should be trained to be immediately suspicious if they receive a certificate error when connecting to a private cloud resource. Note that if you have implemented an SSL inspection mechanism, then this mechanism can assist by also providing other validations, such as CA authenticity, certificate revocation list (CRL) checking, chaining, and other security tests on the certificate. 
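As a hedged illustration of the client side of these connection requirements, the sketch below builds a TLS context that verifies the server certificate and host name and can trust an internal CA. The function name, the `internal_ca_file` parameter, and the choice of TLS 1.2 as the minimum protocol version are assumptions made for this example; the source text does not specify a protocol version.

```python
import ssl


def make_client_context(internal_ca_file=None):
    """Build a client-side TLS context that refuses unverified servers.

    internal_ca_file is a hypothetical path to your internal CA's root
    certificate in PEM format; omit it to use the system trust store.
    """
    # create_default_context enables certificate and host name verification.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Assumption for this sketch: reject protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if internal_ca_file is not None:
        context.load_verify_locations(cafile=internal_ca_file)
    return context
```

With a context like this, a certificate error surfaces as a failed handshake rather than a prompt that a user might click through.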
Connections from the service delivery layer to the software, platform, or infrastructure layer also need encryption and mutual authentication, typically by use of TLS/SSL or IPsec encryption. 4.5.2 Service End-Point Security Even though the role of the perimeter network has diminished in private cloud implementations, the point at which the consumer connects to the service delivery layer is still a significant security boundary. Hence, your security defenses should aim to prevent the most common forms of attack from succeeding. The security techniques of port and protocol restrictions combined with packet inspections, traffic analysis, intrusion detection systems, and honey traps are not unique to private cloud environments; what changes is the degree of automation that is necessary to respond to attacks. Defenses of any kind are useless if they are not actively protected and the increasing threat profile from more sophisticated attacks makes passive defense no longer effective in protecting your environment. Hence your perimeter defenses need to be closely monitored, with immediate follow-up action on any intrusion. Authentication, authorization, and audit controls must apply at the point of contact. Links to federated identity providers must be secure from tampering or interception. 4.6 Management Security Considerations The management stack contains a range of linked capabilities that provide the ability to manage the service delivery layers. Typically, these are capabilities to which the provider connects rather than the consumer. However, some of the reporting output from the management stack can appear in the service delivery layer and form the basis for information that the consumer can access. The provider’s contact with the management layer goes through the same levels of authentication, authorization, and auditing as the consumer’s approach to the service delivery layer. 
Although you might expect that you should be able to trust your administrators more, their greater levels of control mean that you have to be more aware of what your administrators are doing and, in consequence, can afford to trust them less.

4.6.1 Management Tools

The exact management tools that you use in a private cloud environment will depend on your organizational policy, operating system and virtualization platforms, training, and personal preference. Tools with specific security functionality cover the following capabilities:

Deployment and Provisioning Management
Capacity Management
Change and Configuration Management
Release and Deployment Management
Network Management
Fabric Management
Incident and Problem Management

4.6.2 Authentication, Authorization, Auditing and Role-Based Access Control

The management stack must fully integrate with the highest levels of authentication available within your private cloud environment. Typically, you would implement two-factor authentication alongside federation to identity-enable individual management applications within the cloud.

4.6.3 Management Isolation from User Data

In a fully service-oriented private or hybrid cloud implementation, you treat your organization’s business units as separate tenants. In consequence, your administrators are a separate tenant and access rights to other tenants’ data should be restricted. Auditing for administrators must therefore look for unexpected behaviors, such as changing permissions to give access to tenant resources. The response to such incidents (whether concerning administrator accounts or not) should be graduated, in that an attempt to view a general document in a particular business unit does not necessarily need to be treated in the same way as an attempt to access a spreadsheet of company salaries and bonuses owned by the Finance department.
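The idea of scaling the response to the sensitivity of the target can be sketched as a simple lookup. The classification labels, action names, and event keys below are all hypothetical; they stand in for whatever data classification scheme and incident process your organization actually uses.

```python
SENSITIVITY = {"general": 1, "confidential": 2, "restricted": 3}
ACTIONS = {1: "log", 2: "alert_security_team", 3: "suspend_and_investigate"}


def respond(event):
    """Pick a response proportionate to the sensitivity of the target.

    event is a dict with hypothetical keys: 'actor_tenant', 'owner_tenant',
    and 'classification'.
    """
    if event["actor_tenant"] == event["owner_tenant"]:
        return "none"   # access within the actor's own tenant
    # Unknown classifications are treated as the strictest level.
    level = SENSITIVITY.get(event["classification"], 3)
    return ACTIONS[level]
```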
As with any business asset, there should be a sanity check to establish whether the administrator has valid reasons to change permissions on a particular file. Although automation and data processing provide advanced capacity to analyze the large data sets that auditing generates, a common-sense, human-centric approach needs to apply to investigative follow-up. Any investigation needs to follow the contractual terms of the employee’s engagement and comply with local employment laws.

4.7 Client Security Considerations

With private cloud environments, you have three options for client security:

Secure trusted client. A secure trusted client is one that exists on the internal network and has a security trust relationship with the cloud domain. You would provide appropriate levels of protection to these client computers by using anti-virus protection, two-factor authentication, hardware computer security, and integral data protection. Connections to the private cloud network must be made over a protected communications channel with a quarantine process to ensure that the client computer has the latest security updates, anti-virus definitions, personal firewall enabled and so on before being allowed to access the service endpoints.

Insecure untrusted client. Here, you do not trust the client computers at all and assume that every input that you receive from the client is suspect. You then check the input for type, range, length, and format. Your application design ensures that no sensitive data is stored locally on the client, which can be a desktop, tablet, mobile device, or even a browser on a public kiosk computer.

Secure untrusted client. With this option, you provide as much local security as possible to the client, as with the secure trusted client example. However, you do not set up any form of inherent trust relationship between the client and the cloud environment, as would be explicit with domain membership. Authentication would be through federation and your cloud-based applications would treat all client input as suspect and thoroughly check this information before accepting it.

Note: The option for insecure trusted client is not considered further for obvious reasons.

An example of a secure untrusted client would be a laptop with integrated disk encryption, either hardware or software-based. It would require two-factor authentication to log on, using either a smart card or fingerprint recognition, and would not be domain-joined. Authentication to the cloud service would also be two-factor and the device might include a geographic locating device to assist with recovery in case of theft. Endpoint protection scanning lets you grant access to the private cloud but limit that access if the client does not meet certain constraints, such as up-to-date anti-virus definitions or a specific operating system version.

One area that may change with private and hybrid cloud implementations is domain membership, which is no longer a prerequisite if clients use federated authentication to identify themselves to cloud-based applications using the cloud directory service as their home realm. It should be noted that implementing federated authentication on a standalone computer rather than adding that computer to the domain changes the profile of network services available to the clients. In reality, most organizations running private cloud environments will probably attempt to secure their client computers as much as possible. However, as previously discussed, if an attacker can gain physical possession of a hardware device, your attempts to protect the data on it must be extremely effective and render the stored data functionally inaccessible. There must certainly be no inherent degradation of the security of your private cloud environment if a client computer is stolen and compromised.
And if a client computer is stolen and compromised, the effects of this compromise on your environment must be carefully assessed. Unfortunately, the only person who is likely to know the difference between a stolen laptop and a stolen and compromised one will be the attacker. In consequence, you must either be absolutely sure that a stolen client laptop is as close as possible to an inert lump of plastic and metal from the attacker’s perspective, or be able to rapidly make any changes to your own environment that the possible compromise may require (for example, reissuing trusted root certificates and revoking the ones on the stolen equipment).

4.8 Legal Considerations

IT decision-makers have considerable concerns about legality, data protection, personally identifiable information (PII), and compliance in private and hybrid cloud implementations. These requirements are particularly important in hybrid implementations, where you or business units within your organization may be in the position of the customer to a public cloud supplier.

4.8.1 Governance

Organizations looking at implementing a private cloud infrastructure are likely to need to ensure effective governance of the new environment. The management stack of the private cloud architecture should enable management to view security aspects of the environment and show the current threat levels to the organization. Typically, governance oversight is provided through a web-based dashboard that translates the technical aspect of security issues into understandable business language.

4.8.2 Compliance

Organizations in certain industry verticals such as health, financial operations, and the provision of public services fall under the auspices of a range of compliance requirements and regulations, such as the Health Insurance Portability and Accountability Act (HIPAA). With international organizations or hybrid implementations, it is possible that moving to a private cloud environment may result in users in one country with one set of regulations accessing data in another country with a different or even conflicting set of requirements. The requirement for access to company data by law enforcement agencies is another area that must be examined carefully. For example, an organization may be presented with a subpoena to make its e-mail records available. If this occurs, what is the effect on client confidentiality for data owned by a business unit from a different continent? Business units must be aware that these risks exist and that they may be exposed to the legal requirements of a different jurisdiction. Ultimately, your organization needs to be aware of the compliance requirements of all the countries in which it operates. One conclusion may be that data from one country cannot be hosted in another, as can be the case with public cloud implementations.

4.8.3 Integrated Governance, Risk Management, and Compliance

The most effective approach to mitigating legal issues is to implement a fully integrated governance, risk management, and compliance framework. This framework would need to be defined at the highest level and then designed into the private cloud implementation.

4.8.4 Personally Identifiable Information

Personally identifiable information (PII) is data that enables a living person to be identified. The US Office of Management and Budget identifies the following information as PII:

Full name (unless a very common name)
National identification number
Vehicle registration plate number
Driver's license number
Date of birth
Birthplace

Protection of PII can be a significant issue with organizations that operate in multiple jurisdictions. For example, legislation such as the Data Protection Directive of the European Union (Directive 95/46/EC) governs the protection of PII in Europe.
Among other requirements, this legislation requires data holders to give notice to users that their data is being stored and grants them access to correct inaccurate data. This data must also be protected from potential abuses. Hence, storing personal data can be a significant complication. This complication arises not from the fact that the data might be insecure, as cloud environments can be made as secure as more traditional data centers. In this case, the issue is about granting access to the owner to amend the data. If your organization needs to store PII and you have a legal requirement to enable the owner of that data to change it, then you should consider how that information can be presented to the owner and amended if required. Your organization must create a statement that covers its collection, collation, storage, management, transfer, and deletion of PII. This statement must address the process for releasing the information to the original owner and to any third parties, such as a hosted cloud provider. The USA PATRIOT Act also introduces complications for multi-national organizations that are wholly owned by US companies but operate in other parts of the world. If this situation applies to your organization, you should review the requirements of this act when planning data storage and the handling of PII.

4.8.5 Legal Agreements

The basis of the private cloud legal relationships between the IT department and the business units of the organization that subscribe to those services will be contained within a number of documents. These documents should align with the IT Infrastructure Library (ITIL) Security Management process and include:

Service Level Agreement (SLA). The SLA is the key definition of the arrangement between the service provider and the consumer of the private cloud services. This document should clearly identify the security levels that the service provider applies and identify the risks so that the consumer can make an informed decision on the service offerings.

Operating Level Agreement (OLA). This document defines the relationships between the groups within the organization that support the SLA. The OLA makes these support relationships clearly visible and helps the consumer identify responsibility for support functions. The OLA must clearly spell out who is responsible for security support, the boundaries of that support, and the contact details and follow-up information if there is a security issue.

Terms of Usage (ToUs). ToUs agreements make the consumer aware of what is or is not deemed acceptable usage of the cloud-based service, particularly in relation to security. For example, running port scans or using other people’s identities to log on are areas which might be specifically prohibited by the ToUs.

User License Agreements (ULAs). ULAs specify the terms that the consumer must accept before accessing private cloud applications, platforms, or operating systems. Some of the ULAs may come from commercial off-the-shelf software hosted in the cloud environment or may be specifically created by the organization’s legal department for its in-house applications.

All of these documents must set out clearly the security considerations of using the private cloud service, what activities are prohibited, and any penalties for contravention of these prohibitions. They should also highlight that security responses may be automated and that manual intervention may be required to undo those responses. The legal documentation must also set out the process for establishing the identity of the consumer in the case of activities such as password resets or account provisioning and deprovisioning.
5.0 Private Cloud Security Challenges

The previous section outlined the private cloud security problem domain, the design principles that should apply at all levels of the private cloud design, and design considerations. This section examines the key attributes that characterize cloud architectures: resource pooling, broad network access, on-demand self-service, rapid elasticity, and measured services. For each of these attributes, this section will analyze the security considerations in order to:

Identify the potential impact of the attribute on the design of the security functionality in your private cloud.
Describe how the cloud security design principles apply to the detailed design of the features that support the cloud attribute.

As described in section 3.3.3 of this document, security should be the foundation for the entire design process. The private cloud security model presented in figure 1 shows how security concerns are relevant to all elements in all layers and stacks within the architecture:

Infrastructure
Platform
Software
Service delivery
Management

The sections that follow will use those security capabilities as the foundation for the security considerations of each private cloud attribute.

5.1 Resource Pooling Security Considerations

Resource pooling in a private cloud enables virtualized resources to be reassigned dynamically to other tenants in order to optimize resource usage. Your virtualization solution must clean any resources, especially storage and memory, before reassigning them to another tenant so that data belonging to the original tenant is not exposed to the new tenant. In the private cloud, automated processes typically handle the cleaning and allocation of resources to tenants. In a typical cloud, the resources that a tenant uses could be hosted on any of the physical devices in the cloud that offer that resource. For example, when a client provisions and starts a virtual machine in the private cloud, that virtual machine could be hosted on any of the physical servers in the cloud. One consequence that follows from this arrangement is that the same physical machine could host applications and services with different levels of business criticality, and that those applications and services may themselves include very different security features, such as those that govern authentication. Logical separation of shared resource pools should be performed with cloud management toolsets such as Microsoft Virtual Machine Manager (VMM). VMM allows IT administrators to create logical groupings of physical resources (compute, storage, and networking) into resource pools. Tenant and data classification boundaries can then be set on those pools. When designing your private cloud security and defining the resource pooling capability, it is important to have a solution that uses a pool of resources that can be allocated to many different tenants, while ensuring the proper isolation of resources (network, compute, memory, storage) between tenants. The significant aspects of resource pooling in a private cloud that will affect your security design are:

Reuse of resources by different tenant applications
Co-location of services belonging to different tenants on the same physical server
The automated processes that handle the allocation and de-allocation of resources

Identity and access management systems will help you to manage the authentication and authorization that will control access to virtual resources by their owners. Within the enterprise, a single identity and access management system will simplify the task of configuring and managing authentication and authorization for tenant applications and services, especially when there is a requirement to integrate several tenant applications with each other. If multiple identity and access management systems exist, then you can use federation services, such as those provided by ADFS, to integrate them where necessary. Although you can use authentication, authorization, and role-based controls to manage access to resources, in a private cloud you must also assume that credentials can be stolen or abused and that someone can gain access to resources that they should not be able to reach. Data protection services will help to preserve the confidentiality of data stored in virtual environments. Monitoring combined with automated responses will enable you to handle possible attacks in this complex environment, and logging will enable you to investigate and analyze problems and provide evidence to auditors.

5.1.1 Infrastructure Security

Your design must address the risk that a low business impact service might be more easily compromised by an attacker, who can then exploit that weakness to attack a high business impact service running on the same physical server. An attack may involve trying to gain access to the high business impact service's data, or simply making the high value service unavailable by overloading the low value service. To address this issue, you could:

Consider dividing the infrastructure into pools so that you can segregate the hosted applications, for example, running high business impact applications and services in their own pool. That pool may have more stringent security controls applied at the infrastructure layer or might only run applications and services with integrated security controls. This arrangement might affect service billing, with high security pools demanding premium billing. With high-capacity servers, resource pools can support dozens of VMs while using data classification zones to provide tenant and data type isolation.
Specify limits on the type of application that you will allow to run in the private cloud.
For example, no services classified as being high business impact can run on the private cloud infrastructure.

The infrastructure layer typically includes network traffic monitoring in network devices such as switches. This type of monitoring can identify unusual traffic that may indicate that an attack on the infrastructure is in progress or that some element in the cloud is compromised. Table 1 lists core infrastructure components and their security considerations for a private cloud.

Table 1 Infrastructure security considerations for resource pooling

Component:
Network Virtualization
Storage Virtualization

Security Considerations:
Route all network traffic primarily through your physical network devices unless your virtualization solution provides virtual network inspection, either built in or through a third-party component.
Add additional monitoring functionality to each server to monitor each virtual network.
Use a virtualization solution that enables virtualized network traffic monitoring devices.
Ensure that network traffic between virtual machines is encrypted to protect it as it moves through the cloud infrastructure.
Assign encrypted networks to specific VLANs so that networks used for secure data can be assigned to secure resource pools or data classification zones.
Perform whole volume encryption to protect physical storage media in case an attacker gains access to the underlying physical storage infrastructure from within a virtual environment.
Virtual machines should only have access to the virtual storage devices allocated to them.
Isolate your host environment from the guest workloads and ensure isolation between the virtualized guest environments.

Important Notes:
Encrypting traffic on the wire means that intrusion detection systems and intrusion prevention systems will not be able to inspect the traffic.
However, you can still use IPsec to provide authentication (for example by using AuthIP and ESP-NULL), which enables the parties to be sure of each other's identities, detect any tampering with the payload, and optionally guard against replay attacks. As always, evaluate the trade-off between performance and security that arises with any encryption technique. Different encryption algorithms offer different performance characteristics and different levels of protection. Keep in mind that not all traffic needs to be authenticated and encrypted; design over-the-wire encryption into your plan where it makes the most sense.
Allowing access from the guest to the host could allow an attacker access to the private cloud infrastructure. Potentially, this would enable the attacker to damage or disrupt the entire cloud infrastructure, or to launch an attack on other virtual environments hosted in the cloud.

5.1.2 Platform Security

Although controls should be in place in the infrastructure layer to protect the hosted virtual environments, you should adopt the defense in depth principle and assume that an attacker could discover a weakness in the infrastructure security and try to gain access to the platform (or virtualized operating system) that hosts the tenant application or service. Table 2 lists other considerations regarding the virtualization platform.

Table 2 Platform security considerations for resource pooling

Component:
Virtualization
Application/Service

Security Considerations:
All virtual machines should have a host-based firewall configured to protect them from network attacks from the external world, other virtual machines, or the underlying infrastructure.
All host-based firewalls should only allow inbound and outbound traffic from and to the specific machines with which they must communicate.
A tenant application or service in a private cloud consists of virtual compute, memory, storage, and network resources.
The virtualization solution that you adopt must ensure the isolation of all of these resources (and any others that may be used) for each tenant.

Important Notes:
You can also use IPsec to logically isolate groups of hosted virtual machines so that they cannot communicate outside of the group with other hosts on the network. For example, if you have a multi-tier application hosted in your private cloud, then you could use IPsec to ensure that the database server can only be reached from the middle-tier server, and that the middle-tier server can only be reached from the front-end web server.
If two or more tenant applications hosted in different virtual machines do require access to a shared resource, such as when two hosted applications require network connectivity or access to the same database server, this sharing must be managed such that only the participating applications have access, and the shared use is actively monitored.
All administrative access from operations staff and the owner of the virtual resource should be fully authenticated, authorized, logged, and audited.

5.1.3 Software Security

Applications and services running in the private cloud can protect their data in a number of ways. The design of these security features will be the responsibility of the application designer, not the designer of the cloud infrastructure. However, the cloud service provider (CSP) should take steps to ensure that the application designer is aware of the data protection services and other security features that are provided by the cloud infrastructure, and of any specific features of the cloud infrastructure that might influence the design of the application or service.

Some specific issues that an application or service designer should address include the encryption techniques they use to protect their data and how the disaster recovery planning services provided by the cloud work with their application.
For example, the infrastructure layer may use whole volume encryption to provide protection for data stored on that volume in the event that an attacker gains direct access to the underlying storage. From within a virtual machine, applications can access their stored data without decrypting it because the infrastructure performs the decryption on behalf of the virtualized environment. Because of this feature, an application hosted in a virtual machine in the cloud may require its own encryption services for sensitive data so that if an attacker gains access to the virtual machine, he or she will be unable to access the sensitive data.

A tenant application may encrypt data in storage, data in memory, and data during processing to make it more difficult for someone to read, intercept, or tamper with it in a tenant application or service even if they have gained authenticated and authorized access to the tenant's environment. However, all encryption techniques rely on the existence of a key that must be kept confidential: a secret key that performs both encryption and decryption in the case of a symmetric algorithm, or a private key that performs decryption in the case of an asymmetric algorithm. However strong the encryption algorithm, it is useless if someone gains unauthorized access to that secret or private key. Table 3 lists other considerations regarding software security.

Table 3 Software security considerations for resource pooling

Component:
Encryption
Development Methodology
Auditing and Logging

Security Considerations:
Any encryption technique must use private or secret keys to decrypt encrypted data. In a virtual machine, this private or secret key must be stored somewhere inside of the virtual machine so that the application can decrypt its data. The owner of the application that uses data encryption within the virtual machine must take steps to protect any private or secret encryption keys.
This protection must be effective when the virtual machine is running or dormant, and must be effective for any backup or archive copies of the virtual environment.
Tenants should use any secure key storage facilities offered by the platform or virtualized operating systems to store the private keys used by their application.
Tenants should follow strict procedures to protect keys from being discovered outside of the virtual environment.
Clients should also consider encrypting data during processing in their tenant application.
Private cloud tenants should be encouraged to use a security development lifecycle for the applications that will be hosted in this environment.
Developers must decide whether to transfer log data over HTTP or HTTPS depending on the contents of the logs. If an application is logging a large amount of data that won't be of interest to outside parties or eavesdroppers, then HTTP can be used for a faster transfer.

Important Notes:
If attackers gain access to the virtual machine, they may be able to gain access to any private keys stored inside the virtual machine, rendering any encryption worthless.
Tenants should use separate keys during development and test, and securely delete local copies of keys after they have been uploaded and installed in the virtual environment where they will be used.
Data stored temporarily in a queue or data stored in memory could become visible in a memory dump if a server or virtual machine experiences a stop error.
Using the Microsoft SDL as a development methodology can help developers to create applications that mitigate the potential risks of a shared environment such as a private cloud.
However, Microsoft recommends protecting all log data in transit to public cloud providers by using HTTPS.

Component:
Request Throttling / Input Sanitization

Security Considerations:
Developers must do application-level throttling of incoming requests for any kind of complex, time-intensive operation.
Important Notes:
The Microsoft Security Development Lifecycle (SDL) portal at http://www.microsoft.com/sdl provides resources on fuzzing parsers. If a service is parsing a proprietary file or request format (perhaps encapsulated inside HTTP), then fuzz test it to ensure the code can correctly accommodate malformed input.

In a private cloud, the cloud infrastructure may move tenant applications to different physical servers or even different data centers to maintain availability in the face of hardware failures or to optimize performance or resource utilization. Any encryption techniques used by tenant applications to protect data must continue to be effective in these scenarios. Any automated processes that move applications and services to different physical devices must ensure that any keys used to protect application data continue to be available to the applications and services that need them; if this approach requires keys to be copied between locations, the automated process must ensure that this transfer process is secure.

5.1.4 Management Security

SLAs between the cloud service provider and the client business units should specify what level of access to client data operations staff are permitted, and in what circumstances. All management operations, whether performed by the cloud service provider or the consumer, must be logged and be auditable.

5.1.5 Legal

In many regions, there is legislation that relates to data protection and privacy. In a private cloud, you must ensure that it is clear who within the organization is responsible for compliance. For example, the IT department might be responsible for ensuring proper isolation between the virtual environments used by the different business units, but a business unit might be responsible for ensuring that their application or service is compliant.
Hybrid clouds, or the use of distributed clouds across different geographic regions to provide better performance or more resilience, might also complicate the issue of monitoring for compliance: it may not be permissible to move some data across geographic regions, and legal data protection and privacy requirements might differ between regions. In these scenarios, you must enable a client business unit to specify where their cloud-hosted application or service can run.

5.2 Broad Network Access Security Considerations

As a designer of a private cloud solution, it is important to provide appropriate authentication and authorization services for the broad range of users accessing the cloud. Different services have different security requirements, such as different levels of security, access from multiple locations, or self-provisioning of users. This section describes how these capabilities relate to the broad network access attribute of private clouds.

There is an increasing demand from business users to support a wider range of client devices such as mobile phones and tablets. These devices, along with more traditional clients, may be used both internally and externally to access corporate systems. These requirements, combined with the fact that private clouds may also enable on-demand self-service access to resources and have an infrastructure that is built to support virtualization and resource pooling, give rise to the following concerns that you should address in your private cloud design:
- There is a much broader attack surface available to potential attackers, not only from outside the organization, but also from within. For example, if an attacker manages to compromise a guest operating system hosted within the private cloud, in effect you have an attacker operating within your data center.
- Tenants may manage some aspects of the security of their virtual environments, including authentication and authorization.
- There is no longer a clearly identifiable perimeter around your resources: an attack on a hosted service could potentially come from an external source, another hosted service, or from the infrastructure.

To mitigate these threats, you need to ensure that access to resources, applications, and services is authenticated and authorized, and that all access is monitored, logged, and auditable. This must happen consistently regardless of the client device type. You may require different strengths of authentication (password, certificate, multifactor) based on the location of the client (private cloud, intranet, extranet, internet) and the sensitivity of the application's data (personally identifiable information, company confidential information, publicly available information).

5.2.1 Infrastructure Security

The use of virtualization technologies to build private cloud infrastructures means that there are new potential targets for attackers. By targeting the hypervisor technology that underpins the virtualization in a private cloud, an attacker may be able to disrupt the entire cloud or use the hypervisor to gain access to other virtual machines hosted in the cloud. Table 4 lists other considerations regarding infrastructure security.

Table 4 Infrastructure security considerations for broad network access

Component:
Hypervisor

Security Considerations:
Ensure that hypervisors and host operating systems are not accessible from the guest operating systems controlled by the tenants.
Consider dividing your infrastructure resources into pools.

Important Notes:
Consider potential attacks originating from guest operating systems running within your data center as well as more traditional threats from outside your security perimeters.
This approach may affect the efficiency of your resource utilization in the cloud because certain tenant applications are constrained to run in a particular pool of hypervisors, but it does enable you to build security boundaries between the pools to limit the scope of any attack on the hypervisor infrastructure.

Component: Hardware
Security Considerations: Consider using hardware security features to detect unauthorized changes to the hypervisor.
Important Notes: For example, Intel Trusted Execution Technology (TXT) can verify that a launch environment has not changed from a known good configuration.

Component: Network
Security Considerations: Minimize routing network traffic within the cloud infrastructure.
Important Notes: By using this approach you will minimize the number of locations where the infrastructure might be exposed to attack.

5.2.2 Platform Security

In the IaaS and PaaS private cloud models, the platform is typically a virtual machine hosted on the cloud infrastructure. The platform hosts the tenant's service or application, and client applications will access these services or applications from within the corporate network, from outside the corporate network, or potentially from other virtual machines running in the cloud. Supporting broad network access to the virtual machines in the platform means a greater potential attack surface. Table 5 lists other considerations regarding platform security.

Table 5 Platform security considerations for broad network access

Component:
Host
Network

Security Considerations:
Assume that the host operating system could gain access to the virtual machine.
Every host must run its own anti-malware software.
Every virtual environment must be protected by its own firewall.

Important Notes:
You may have a different assumption depending on the hardening level for the host and how your virtualization platform handles this access.
Protecting inbound and outbound traffic from the virtual environment is a very important consideration.

5.2.3 Software Security
In the IaaS and PaaS service delivery models, tenants may be wholly or partly responsible for the management of the applications and services that they choose to host in the cloud. Although there will be security features protecting the infrastructure and platform directly, a defense in depth approach means that the cloud service provider should not rely solely on these security features. By making it harder for an attacker to compromise a virtual environment managed by a tenant, the cloud service provider makes it harder for an attacker to use this route to attack the private cloud infrastructure. Table 6 lists other considerations regarding software security.

Table 6 Software security considerations for broad network access

Component: Access Control
Security Considerations: Apply strict controls to the software that tenants may run in their virtual environments.
Important Notes: For example, limit the available APIs or the permitted programming languages that tenants can use to develop hosted applications and services.

Component: Overall Configuration
Security Considerations: Lock down the security configuration.
Important Notes: For example, lock down firewall settings in the platform or virtualized operating system such that the tenant cannot modify them.

Component: Deployment
Security Considerations: Validate the tenant's software before they can deploy it to the private cloud.
Important Notes: The validation process should be done in an isolated environment so that it does not interfere with the production network.

Component: Development Framework
Security Considerations: Use corporate policies to specify guidelines or methodologies that tenants must follow when they develop, commission, or purchase applications to run in the cloud.
Important Notes: For example, corporate policy may mandate that development teams within the organization use the SDL methodology to improve the overall security of their products. This approach helps to shift the responsibility to the tenant, but you will still need to audit compliance.

Component: SLA
SLAs between the cloud service provider and the tenants should make clear who is responsible for which aspects of software security in the virtual environment. Ensure that there is a common agreement and that this agreement is documented in the SLA before putting any software into production.

All of these considerations may limit the utility of the private cloud solution because they conflict with the provision of other cloud attributes, especially the on-demand self-service attribute. These controls may also require completion of additional complex manual steps before the tenant's software can be deployed and used.

It is also important to encourage tenants to use existing services as part of their solution wherever possible. The cloud service provider can manage the security of some enterprise-wide services in the cloud, so that tenants do not have to re-implement them. For example, if the CSP provides identity federation services (such as Active Directory Federation Services, or ADFS) in the cloud, it becomes easier for tenants to participate in single sign-on and to use your existing security infrastructure for identity management and access control, rather than creating their own authentication and authorization mechanisms and expanding the available attack surface.

As you move through the cloud service delivery models from IaaS, through PaaS, to SaaS, the cloud service provider takes more responsibility for the security of the hosted application or service, and the tenant gives up more control of the environment and loses some degree of flexibility. The PaaS and SaaS models therefore make it easier for the cloud service provider to ensure consistent standards of security and to reduce the threat to the virtual environments.
5.2.4 Service Delivery Security

Supporting broad network access to the cloud will tend to reduce the amount of filtering and monitoring that the cloud service provider can apply at this layer, because it must allow traffic from a wider range of devices and locations through this layer. Table 7 lists other considerations regarding service delivery security.

Table 7 Service delivery security considerations for broad network access

Component:
Service endpoint
Service controls

Security Considerations:
Tenants may need the ability to configure their service endpoints to make their applications and services available to a broad audience who use a broad array of devices.
The CSP can specify the controls that must be in place on the services that you and the tenants use.

Important Notes:
An example is opening the necessary ports and configuring SSL. The CSP will need to consider the inbound and outbound traffic requirements of tenant services and have an automated method to enable and disable this traffic, as appropriate.
One example of such a service is the self-service portal for requesting cloud resources or billing services.

5.2.5 Management Security

Tenants may require the ability to perform some management operations from client applications and devices of their choice. Table 8 highlights some key considerations regarding management security.

Table 8 Management security considerations for broad network access

Component: Authentication
Security Considerations: All access to cloud management services should be authenticated.
Important Notes: You should use two-factor authentication and ensure that monitoring, logging, and auditing capabilities are enabled.

Component:
Attack Surface
Access Control

Security Considerations:
Reduce the available attack surface by restricting access to management services to specific client devices in specific locations.
If tenants do have access to some cloud management tools, then there should be role-based access controls in place to limit who can perform the management operations.

Important Notes:
One example of such a service is the self-service portal for requesting cloud resources or billing services.
Assigning individual users to the relevant roles should be a part of the provisioning process for cloud resources.
These considerations will largely be driven by determining what access, if any, tenants should have to any of the cloud management tools.

5.2.6 Client Security

In a private cloud, business units within the enterprise may have responsibility for and ownership of the tenant services and applications hosted in the cloud. If this is the case, the business unit will also determine whether there are any controls over the client applications used to access the cloud-hosted service. Table 9 highlights some key considerations regarding client security.

Table 9 Client security considerations for broad network access

Component:
SLA
Application Scope

Security Considerations:
SLAs can mandate that certain security features are included in client applications.
One way to limit the client applications that can act as a client to the service is to use a certificate-based mechanism. A consumer, service, or computer must present a valid certificate before it can invoke any service operations.

Important Notes:
The service model often makes it possible to discover the service API, so it is always possible that someone could build their own client application to exploit vulnerabilities in the service API using known or discovered credentials for the service.
This approach is dependent on a robust Public Key Infrastructure (PKI) and well-managed policies to ensure that someone who wants to impersonate an existing, approved user cannot discover and purloin the certificate.
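The certificate-based mechanism described in Table 9 can be illustrated with a minimal Python sketch that uses the standard `ssl` module. It is an illustration under stated assumptions, not a prescribed implementation: the function names, the file path parameters, and the idea of matching the certificate common name against an approved list are all hypothetical, and a real deployment would depend on your PKI and certificate policies.

```python
import ssl
from typing import Optional


def build_server_context(ca_file: Optional[str] = None,
                         cert_file: Optional[str] = None,
                         key_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a server-side TLS context that requires a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Refuse any client that does not present a certificate that validates.
    context.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Hypothetical path: trust only the enterprise CA that issues client certificates.
        context.load_verify_locations(cafile=ca_file)
    if cert_file:
        # Hypothetical paths: the service's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def client_common_name(peer_cert: dict) -> Optional[str]:
    """Extract commonName from a certificate dict as returned by SSLSocket.getpeercert()."""
    for rdn in peer_cert.get("subject", ()):
        for name, value in rdn:
            if name == "commonName":
                return value
    return None


def is_approved_client(peer_cert: dict, approved_names: frozenset) -> bool:
    """Assumed policy: only clients whose common name is on the approved list may call the service."""
    return client_common_name(peer_cert) in approved_names
```

A service would wrap its listening socket with the context from `build_server_context` and call `is_approved_client` on the result of `getpeercert()` before invoking any service operation.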
It is also important to mention that client devices can cache data for performance, enable the user to export data to save locally, or simply enable a user to screenshot data. All of these features, and many others, make it difficult to protect the data that client applications make available to the end user. Unless you can apply strict controls to the client environment, it is difficult to mitigate these risks.

5.2.7 Legal

You must determine if any local laws or regulations place restrictions on where the data may be located, or whether the clients must include any particular security features. If the client business unit is responsible for the service, the SLA between the IT department and the business unit must specify where responsibility for compliance lies.

5.3 On-demand Self-Service Security Considerations

As a designer of a private cloud solution, you should design access control for the services hosted in the cloud. You should also determine who can request services and how much they can request. This section describes how these capabilities relate to the on-demand self-service attribute of private clouds.

The on-demand, self-service characteristic of a public cloud implies that anyone with a credit card can purchase the resources they need as and when they require them. For the private cloud, you must determine who within the enterprise should have the authority to request resources from your private cloud (and who has the authority to release those resources when they are no longer required).

The key issues associated with the on-demand self-service attribute of the private cloud are therefore:
- Authentication, authorization, and role-based access controls that control who, within the organization, may provision and manage cloud-based resources.
- Monitoring and auditing the use of a provisioning portal to ensure that controls are applied effectively.
In a private cloud, the client who requests the resources may not be a separate business unit within the organization but can be the IT department itself, acquiring resources from the cloud on behalf of a client business unit. In this scenario, one of the benefits derived by the organization is the ability of the IT department to make new infrastructure resources available much faster than in a more traditional architecture. Provisioning new virtual servers and storage services may be done in minutes rather than the weeks or months it can take to procure and commission a physical server and storage, and this means that the IT department is able to provide a significantly better service to its client business units within the organization.

A key benefit of the cloud is that IT resources are treated as a utility that can be metered and billed for, so the decision about who can make and approve those requests will be similar to the decision about who can make and approve other purchasing decisions. You should consider integrating the cloud provisioning portal with any existing purchasing system that includes an approval process and auditable workflow.

Another benefit of the cloud is the elasticity of the supply of resources. Any authorization to use cloud resources should define the value of the resources that may be consumed, record total usage, and report on the total cost of consuming the services for a given time period. The self-service provisioning system should use policies to specify quotas for these limits.

5.3.1 Infrastructure Security

Apart from being able to initiate requests to provision and de-provision cloud resources, tenants should have no access to the private cloud physical infrastructure. If your private cloud supports the IaaS service delivery model, then your self-service provisioning portal should enable clients to request virtual infrastructure resources.
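The quota policy idea described for the self-service provisioning system can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular portal: the class names, role names, and cost figures are hypothetical, and a real system would integrate with your purchasing workflow and identity provider.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class QuotaPolicy:
    """Hypothetical policy: a spending ceiling per period and the roles allowed to request."""
    max_cost_per_period: float
    approver_roles: frozenset


@dataclass
class ProvisioningLedger:
    """Records approved usage and every decision so that requests remain auditable."""
    policy: QuotaPolicy
    usage: List[Tuple[str, float]] = field(default_factory=list)
    audit_log: List[Tuple[str, str, float, str]] = field(default_factory=list)

    def total_cost(self) -> float:
        # Total cost of the resources consumed so far in this period.
        return sum(cost for _, cost in self.usage)

    def request(self, requester: str, role: str, cost: float) -> bool:
        # Check authorization first, then the quota, and log the outcome either way.
        if role not in self.policy.approver_roles:
            decision = "denied: role not authorized"
        elif self.total_cost() + cost > self.policy.max_cost_per_period:
            decision = "denied: quota exceeded"
        else:
            decision = "approved"
            self.usage.append((requester, cost))
        self.audit_log.append((requester, role, cost, decision))
        return decision == "approved"
```

For example, with a ceiling of 100 cost units, a first request for 60 by an authorized role is approved, while a later request for 50 is denied because it would exceed the quota. Both decisions appear in `audit_log`.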
5.3.2 Platform Security

Although from the perspective of the tenant the request for a resource such as a virtual machine to host a departmental application may appear to be a single operation, the provisioning process is more complex. The tenant must be granted access to the resource they have requested, and typically this will include some administrative rights for the resource. Table 10 highlights some key considerations regarding platform security.

Table 10 Platform security considerations for On-demand Self-service

Component:
Provisioning
Data isolation and protection
SLA

Security Considerations:
The provisioning process must correctly configure the security environment for the resource.
Ensure that the security configuration of the virtual machine allows access for the tenant and no one else.
Ensure that the operating system has a baseline security configuration applied for the tenant.

Important Notes:
This includes creating any necessary VLANs, granting access to storage, and setting up the required security monitoring and logging.
This consideration is important because the provisioning process automatically creates the virtual machine and may configure it with an operating system for the tenant to use.
This consideration is appropriate if the provisioning process also installs an operating system on the virtual infrastructure.
The design should specify any security settings that must be applied to protect the tenant's data, and to protect the host environment in case the guest environment is compromised.
This consideration is particularly important for the IaaS model where tenants will access cloud-based resources through a virtualized environment.
The SLA associated with the virtual resources must specify precisely what aspects of security in the virtual machine the cloud service provider manages, and what aspects the tenant manages.
The tenant owns the virtualized environment, and may now be responsible for designing, implementing, and managing the security in the virtualized environment.

Component: Security Controls
Security Considerations: Identify what a compromised host might be able to do in terms of attacking the memory, compute, networking, and storage of the cloud infrastructure, and put controls in place to limit the impact of a compromised guest virtual machine on the entire infrastructure.
Important Notes: Consider the fact that even though these virtual machine templates will have a base desired security configuration, the cloud consumer is going to have complete control after the initial deployment.

5.3.3 Software Security

Although from the perspective of the tenant the process of software development might be abstracted from the private cloud infrastructure, there are still security considerations that should be taken into account when developing, and when guiding developers to create, new applications for this environment. Table 11 highlights some key considerations regarding software security.

Table 11 Software security considerations for On-demand Self-service

Component:
Authentication and Authorization
SLA
Methodology
Identity and Access Management
Software acquisition

Security Considerations:
Clients should be encouraged to use the existing enterprise security features such as authentication and authorization instead of building their own custom implementations into their tenant services.

Important Notes:
By using identity federation services you can enable tenant services to participate in a single sign-on environment and use existing role memberships from the enterprise directory to determine authorization in their applications.
The private cloud provider is responsible for managing the security of the service (in a SaaS model). However, the agreement between the provider and the consumer may specify that the consumer is responsible for some aspects of the security.
Follow best practice for development by adopting a methodology such as SDL.
The SLA may state that the consumer should not share their credentials with another user and must take reasonable steps to protect their password.
This consideration is particularly important in the IaaS and PaaS models, where the tenant is responsible for the design of the security features of their infrastructure or application.
Use existing enterprise security features included in directory services solutions such as Active Directory.
Leveraging current resources is important. Instead of building its own identity and access management features, a tenant application should integrate with the existing identity providers using federation with ADFS.
Specify security standards with which the purchased software must comply.
If tenants purchase third-party software, it should comply with the enterprise security standards.

Component:
Validation
Key Management

Security Considerations:
Use special test and staging environments in the cloud to verify the security of tenant applications and services.
Use secure platform key management services for keys used by the tenant's hosted service or application.

Important Notes:
The validation/test environment should be isolated from the private cloud infrastructure so that a rogue program cannot start demanding more resources than it needs, which could affect other tenants in production.
This consideration is particularly important in the IaaS and PaaS models, where the tenant is responsible for the design of the security features of their infrastructure or application.

5.3.4 Management Security

Designing administrative security for managing the cloud infrastructure should follow best practice in terms of using role-based access control and the principle of least privilege. However, you must take into account the fact that with the cloud infrastructure it is difficult to identify where services and data physically reside.
For example, a role that enables members of the role to make a configuration change on a host server may have the ability to make the change on any host server in the cloud. Table 12 highlights some key considerations regarding management security.

Table 12 Management security considerations for On-demand Self-service

Component: Authentication and Authorization
Security Considerations:
Role-based access control for the self-service provisioning system to control who, in your organization, can make a provisioning request for cloud services.
Role-based access control for the various cloud services that you make available to tenants for use in their own hosted services.
Role-based access control for managing the cloud infrastructure itself.
Important Notes:
Take into consideration whether client business units have this ability; it may be that only the IT department can use this functionality.
Consider using identity federation services that a tenant can use to integrate with the enterprise directory to handle authentication in the tenant's service.
Consider who can view the infrastructure monitoring data from a virtual machine host.

Component: Administrative Privileges
Security Considerations:
Use role-based access control for all cloud management functions.
Use roles to control what level of access tenants have to any management functionality.
Important Notes:
In a private cloud infrastructure you might have many cloud management functions, for example financial management, capacity management, and fabric management.
Different tenants may require different levels of access, and different users associated with a tenant may require different levels of access. Tenants should only be able to manage their own hosted environments and services.
Ensure that you always use the principle of least privilege. Start with no privilege and add only what is necessary to perform the job. However, you must take into consideration the fact that with the cloud infrastructure it is difficult to identify where services and data physically reside. For example, a role that enables members of the role to make a configuration change on a host server may have the ability to make the change on any host server in the cloud.

Component: SLA
Security Considerations:
SLAs should specify what management functions the tenant can perform on their resources, and what management functions the cloud service provider can perform on the tenant's resources.
SLAs should specify in what circumstances the cloud service provider can shut down a virtualized environment owned and managed by a client business unit, and what notification should be given that this is happening.
Important Notes:
Explicitly stating the functions that the tenant can perform is a key security component and also sets the correct expectations for the tenant.
For example, if monitoring within the cloud determines that a virtual operating system has been compromised by an attacker and is trying to gain access to other virtual environments in the cloud, automatic management procedures should close the compromised environment immediately and notify cloud operators and the client.
Note that you might consider implementing mitigation procedures that would allow a temporary increase in resources to the workload while the tenant attempts to rectify the problem with the service still online.

Component: Operations
Security Considerations: When the cloud service provider performs a management operation, that operation may potentially affect all tenant services or some subset of tenant services. Because of the way that a cloud pools resources, it may not be easy to predict which tenants will experience disruption to services.
Important Notes: Consider that for some management operations, you must assume that the operation will affect mission-critical services in the cloud. You must design membership of your administrative roles accordingly.
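The role-based access control and least-privilege guidance in Table 12 can be illustrated with a minimal Python sketch. Everything named here (Role, Principal, is_allowed, the permission strings, and the host names) is hypothetical and is not taken from any product. The sketch assumes that each role grants a set of permissions limited to an explicit set of hosts, so that a host-configuration role does not silently apply to every host server in the cloud.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A hypothetical role: a set of permissions limited to named hosts."""
    name: str
    permissions: frozenset
    host_scope: frozenset  # hosts on which the permissions apply


@dataclass
class Principal:
    """A user or service account; it starts with no roles (least privilege)."""
    name: str
    roles: list = field(default_factory=list)


def is_allowed(principal: Principal, permission: str, host: str) -> bool:
    """Deny by default; allow only if a role grants the permission on this host."""
    return any(
        permission in role.permissions and host in role.host_scope
        for role in principal.roles
    )


# Illustrative data only: a role scoped to two hosts, granted to one operator.
host_admin = Role(
    name="host-config-admin",
    permissions=frozenset({"host.configure"}),
    host_scope=frozenset({"host-01", "host-02"}),
)
operator = Principal(name="operator-a")
operator.roles.append(host_admin)
```

With this data, is_allowed(operator, "host.configure", "host-01") returns True, while the same request against "host-03" is denied because that host is outside the role's scope. Scoping roles this way is one possible response to the concern that a host-level role may otherwise reach any host in the cloud.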
5.3.5 Legal

Responsibility for compliance with any local legislation (for example in relation to traceability of actions) may lie with either the cloud service provider or the tenant. SLAs should specify where that responsibility lies.

5.4 Rapid Elasticity Security Considerations

For the designer of a private cloud solution, one of the most relevant concerns related to the essential characteristic called rapid elasticity is that a rogue application, client, or DoS attack might destabilize the data center by requesting a large amount of resources. It is important to balance the requirement that individual consumers/tenants have the perception of infinite capacity with the reality of limited shared resources.

Cloud architectures offer elasticity of resources to clients and hosted applications and services. From the tenant’s perspective, the cloud offers an unlimited pool of resources. If the consumer of the cloud service anticipates a burst in demand for their service, the client can request more resources from the cloud to ensure that the service is capable of meeting that demand. A more sophisticated hosted application or service can monitor demand and automatically request additional resources from the cloud using an API. Clients and client applications can also release resources back into the pool when they are no longer required.

The key issues associated with the rapid elasticity attribute of the private cloud are therefore:
Authentication, authorization, and role-based access controls that control who or what, within the organization, may request additional resources from the pool or return resources that are no longer required to the pool.
Monitoring and auditing requests to allocate and de-allocate resources to ensure that quotas are respected and that the availability of individual services, hardware devices, and the private cloud is maintained.
Ensuring data destruction with pooled resources so that session information from one tenant is not available to another tenant.

5.4.1 Infrastructure Security

Monitoring is equally important for both provisioning and de-provisioning requests: an attacker may attempt to destabilize the private cloud by shutting down resources. As has also been discussed, the provisioning and de-provisioning processes must ensure that the resources available in the pool for reuse do not contain any sensitive data that could be exploited by the application or service that next acquires the resource.

From the perspective of the tenants, the private cloud is an unlimited pool of resources, available on demand. From the perspective of the cloud service provider, the private cloud is a fixed-size pool of shared resources used by client business units who have expectations of the quality of service they will receive from the cloud. You may also offer different sizes of resource to clients (for example small, medium, and large virtual machines), and in order to maintain availability for all clients you might need to limit the number of certain sizes of virtual machine in your cloud so that, for example, 10% of virtual machines are large, 60% are medium, and 30% are small. Table 13 highlights some key considerations regarding infrastructure security.

Table 13 Infrastructure security considerations for Rapid Elasticity

Component: Quota
Security Considerations:
Define quotas to mediate access to the cloud resources. Ensure that these quotas are specified in the SLA.
Determine the appropriate granularity of the resource quotas and determine whether the quotas may be adjusted.
Important Notes:
The intent is to prevent a client or attacker from accidentally or deliberately overwhelming the cloud infrastructure with provisioning requests, or from grabbing a large share of the available resources to the detriment of service availability for other clients.
Transparency is an essential element for a service provider to gain trust. Therefore, to implement such quotas you must be able to tell which client made the provisioning request (remembering that the request itself may be made automatically by a running service), dynamically monitor the resource utilization by client, and enforce the quota.
This recommendation is important for scenarios where a client requests a higher quota for a service that is particularly resource intensive, or where the client requests a lower quota for a lower-priority service or asks for limits on the costs associated with running the service.

Component: Availability
Security Considerations:
Maintain availability for all clients.
Use dynamic load balancing for the applications and services hosted in the cloud as demand for those services changes and as services scale in or out.
Important Notes:
To ensure availability, the cloud infrastructure must be able to handle provisioning requests in a timely manner.
Dynamic load balancing may require running virtualized environments to move between physical servers or even between data centers. In addition to maintaining availability, the automated procedures that handle this process must maintain the confidentiality and integrity of these virtualized environments.

Component: Logging
Security Considerations: All requests to provision or de-provision resources for a client must be logged and auditable.
Important Notes: Logging alone is not enough; being able to use those logs to audit resources and usability is an important practice.

Although the considerations mentioned in Table 13 are important, you must remember that when overall demand is high for cloud resources, any load balancing and quota-based rationing of resources must guarantee availability of systems as specified by any SLAs with the client business units.
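The quota and logging considerations in Table 13 can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real cloud management product: the QuotaManager class, its method names, and the tenant names are all hypothetical, quotas are counted in whole virtual machines, and the audit trail is an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaManager:
    """Hypothetical per-tenant quota check that records every request."""
    quotas: dict                                  # tenant -> maximum virtual machines
    in_use: dict = field(default_factory=dict)    # tenant -> machines currently held
    audit_log: list = field(default_factory=list)

    def provision(self, tenant: str, count: int = 1) -> bool:
        used = self.in_use.get(tenant, 0)
        limit = self.quotas.get(tenant, 0)        # unknown tenants get nothing
        granted = count > 0 and used + count <= limit
        if granted:
            self.in_use[tenant] = used + count
        # Log the request whether or not it was granted, so it can be audited.
        self.audit_log.append(("provision", tenant, count, granted))
        return granted

    def deprovision(self, tenant: str, count: int = 1) -> bool:
        used = self.in_use.get(tenant, 0)
        granted = 0 < count <= used               # cannot release more than held
        if granted:
            self.in_use[tenant] = used - count
        self.audit_log.append(("deprovision", tenant, count, granted))
        return granted
```

For example, with QuotaManager(quotas={"finance": 2}), two machines can be provisioned for "finance", a third request is refused, and every request (granted or refused) appears in audit_log with the tenant that made it. Recording refused requests as well as granted ones is a design choice made here so that repeated attempts to exceed a quota remain visible to an auditor.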
If demand for private cloud resources is highly elastic and you cannot maintain the availability of the hosted services with your existing capacity, you can adopt a hybrid model and extend your private cloud to infrastructure provided by a third party (sometimes referred to as “cloud bursting”). In this scenario, you must consider what impact, if any, hosting a service in the third party's infrastructure instead of your own will have on:
The SLAs with your client business units.
The integration of a tenant's application with other services hosted in your private cloud.
The legal requirements that relate to the hosted application.

5.4.2 Software Security

Software that can scale in a private cloud faces two security-related issues:
Although the private cloud infrastructure can enable rapid elasticity in the supply of virtual resources, hosted applications and services must be designed correctly if they are to function securely when they are scaled out.
Hosted applications and services that initiate scaling requests automatically, based on monitored demand or a timetable, must perform these operations without impacting their own availability or the availability of other services within the cloud.

Table 14 highlights some additional considerations regarding software security.

Table 14 Software security considerations for Rapid Elasticity

Component: Scalability
Security Considerations: Applications that are designed to scale may require some mechanism to share user state across instances.
Important Notes: SLAs or corporate policies may define how to accomplish shared state securely; for example, specifying requirements for cookie encryption.

Component: Availability
Security Considerations:
Poorly designed autoscaling algorithms used in a hosted service could affect the availability of other services.
The cloud infrastructure should include checks within the autoscaling service to prevent repeated resource requests and enable tenants to specify upper and lower limits on their resource requirements.
Important Notes:
This behavior can occur through a continuous cycle of requests to provision and then deprovision a resource, or through continuing to request resources indefinitely.
Adding this capability will help to prevent a poorly designed autoscaling algorithm from inadvertently shutting a service down completely, making it unavailable.

5.4.3 Management Security

Provisioning and deprovisioning requests and scale in and out requests are made through a cloud management interface, implemented either as a GUI or through an API. Access to these functions should be protected through role-based access control policies and their use fully logged. Additionally, these interfaces should implement any quota checks on resource allocation that you want to enforce.

5.4.4 Legal

Certain applications may need to guarantee availability or meet targets for response time or throughput to meet legal or corporate policy requirements. The private cloud's enablement of rapid elasticity in meeting demand for services must ensure that any such legal or corporate requirements are met without compromising the confidentiality, integrity, or availability of those or any other services hosted in the cloud.

5.5 Measured Service Security Considerations

As a designer for a private cloud you want to ensure that all applications and services running in the cloud are measured and accounted for. Protecting the measurement and billing services in the private cloud is an important task. Securing measured services will enable tenants to understand what they are paying for and to identify any resources that they are paying for that they did not explicitly approve. You also need to understand how these capabilities relate to the measured service attribute of private clouds.

In a private cloud environment, it is important for the CSP to track all chargeable use of the cloud services by its tenants so that it can bill them accordingly.
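One way to make a record of chargeable use tamper-evident is to chain each usage record to the previous one with a cryptographic hash. The following Python sketch is illustrative only: the UsageLog class, its field names, and the sample tenants are hypothetical, and the approach detects after-the-fact modification rather than preventing it. It also assumes that the most recent hash is kept somewhere tenants and ordinary administrators cannot alter.

```python
import hashlib
import json


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record's content."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


class UsageLog:
    """Hypothetical append-only usage log in which each entry covers the last."""
    GENESIS = "0" * 64  # placeholder hash used before the first entry

    def __init__(self):
        self.entries = []  # list of (record, hash) pairs

    def append(self, tenant: str, resource: str, units: int) -> None:
        prev = self.entries[-1][1] if self.entries else self.GENESIS
        record = {"tenant": tenant, "resource": resource, "units": units}
        self.entries.append((record, _digest(prev, record)))

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it from that point on."""
        prev = self.GENESIS
        for record, stored in self.entries:
            if _digest(prev, record) != stored:
                return False
            prev = stored
        return True
```

After two calls to append, verify() returns True; if the units value in the first record is then changed, verify() returns False because the stored hash no longer matches the record.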
A concern of the private cloud service provider here is to ensure that tenants cannot bypass the monitoring systems in any way to reduce the amount they have to pay. Although it is unlikely that a business unit within an enterprise would try to steal cloud services from the enterprise private cloud in this way, there is the risk that someone could try to use the private cloud resources for their own purposes. For example, an employee could run a private web server in the corporate cloud (often hosting explicit adult material) or someone from outside who gained access to the private cloud could run a private mail server. To achieve this without being detected, the person or entity using the private cloud resources would either have to bypass the measuring and billing in the private cloud, or arrange for their use to be paid for by a legitimate client such as a business unit.

The measured service attribute of private clouds also affects the overall availability of resources in the private cloud. By measuring and charging for the use of resources in the private cloud, the cloud service provider encourages tenants to return resources to the pool when they have finished with them. Without this cost incentive, tenants may hang on to resources indefinitely even though they are not using them, reducing the overall availability of the private cloud's resource pool.

5.5.1 Infrastructure, Platform, and Software Security

You must ensure that all monitoring and logging features that measure resource usage and charge tenants accordingly are protected from tampering. Such logging must always be accurate and must always correctly identify who is using the resource.

5.5.2 Management Security

Tenants should be able to access their own billing information through the financial management services in the private cloud with enough detail to enable them to identify any possible unauthorized usage of resources on their behalf.
The cost of resources should provide a sufficient incentive for client business units to monitor their resource usage.

5.6 Mitigating Security Risks

Taking the private cloud reference model as the basis for analysis, you can identify threats to the private cloud infrastructure and place these threats into appropriate places within the model. This approach provides a basis for threat modeling and risk analysis. Hence, you can classify attacks according to the layer or stack that the attack targets.

Figure 3 highlights the primary areas in the private cloud that these individual attack types can target. Note that this diagram does not show responsibilities; it lists the different types of attacks that might take place against the management stack, the infrastructure layer, and so on. You should also note that some of these areas, such as client security, may be outside the control of the private cloud provider. A later section discusses the implications of changes to the client security relationship.

Figure 12 – Security Threats to Private Cloud Architectures

Note: This figure only summarizes the threats that may exist at the different levels of the cloud architecture. For an explanation of each of these threats and applicable countermeasures, see Cloud Security Threats and Countermeasures at a Glance, at http://blogs.msdn.com/b/jmeier/archive/2010/07/08/cloud-security-threats-and-countermeasures-at-a-glance.aspx.

5.6.1 Shared Tenant Model

A key differentiator with public cloud environments is that the service is provided on a shared tenant basis and multiple tenants use the same services. The public cloud implementation then applies authentication, authorization, and access controls to create logical partitions between the tenants so that individual tenants are isolated from each other and cannot see other tenants’ data.
Note: In private cloud terminology, a tenant is a client, typically a business unit within the organization, who is using the private cloud to run their applications and services.

The perception of a private cloud is that it is only hosting one organization, and in consequence, security partitioning is not required. In reality, organizations may have good reasons to want to implement such partitioning, such as between different business groups or between the finance department and the rest of the organization. In consequence, a private cloud model may also be a shared tenant model with similar requirements for effective security partitioning between different business units as with public cloud implementations.

5.6.2 Virtualization

Virtualization is not an absolutely essential component of private cloud architectures, as organizations can use blade server arrays or other compute-dense configurations to provide cloud-based services. However, the advantages of improved server utilization and greater operational flexibility that virtualization platforms provide have led to very high uptake of this technology in both cloud environments and in the predecessor architecture to the cloud, the dynamic data center.

Virtualization radically changes the way an organization secures and manages their data center. Because workloads are mobile and can move from host to host based on optimization algorithms that require no human involvement, security policies linked to physical location are no longer effective, so security policies must be independent of network or hardware topologies.

Although estimates of data center server virtualization indicate that this technology has reached adoption levels of over 50%, management and security tools for virtualized environments are still catching up with the physical systems that they replace. The presence of the new hypervisor layer provides additional attack vectors and new opportunities for security breaches.
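The requirement that security policies be independent of network or hardware topologies can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not a description of any virtualization platform: the Workload class, its tenant and zone labels, and the may_communicate function are hypothetical. The point it shows is that the policy decision is based only on logical attributes, so it gives the same answer after a workload moves to another host.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """A hypothetical virtualized workload with logical labels."""
    name: str
    tenant: str
    zone: str   # logical trust zone
    host: str   # current physical host; may change at any time


def may_communicate(a: Workload, b: Workload) -> bool:
    """Allow traffic only within the same tenant and trust zone.

    The decision deliberately ignores the host field, so it is unaffected
    by where the workloads happen to be running.
    """
    return a.tenant == b.tenant and a.zone == b.zone


# Illustrative data only.
web = Workload("web-1", tenant="finance", zone="finance-apps", host="host-01")
db = Workload("db-1", tenant="finance", zone="finance-apps", host="host-02")
other = Workload("crm-1", tenant="sales", zone="sales-apps", host="host-01")
```

Here web and db may communicate even though they run on different hosts, while web and other may not, even though they share a host. Changing db.host to a different value does not change either result.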
One example of the new attack vectors arises because virtual machines running on the same physical host typically use virtual networking components to communicate with each other. In consequence, virtual machines can communicate with each other without those communications being picked up by monitoring tools on the physical network. IT staff must be able to identify when inter-virtual machine traffic is occurring and apply policies and monitoring to that traffic.

A key factor for implementing effective security in virtualized environments will be virtualization of the security controls themselves. As these virtualized controls become available, they should as a minimum meet the following criteria:
Fully integrate with the private cloud fabric
Provide separate configuration interfaces
Provide programmable, on-demand services in an elastic manner
Consist of policies that govern logical attributes, rather than policies that are tied to physical instances
Enable the creation of trust zones that can separate multiple tenants in a dynamic environment

In summary, security in private cloud environments must be adaptive and natively implemented into a fabric where resources are allocated dynamically. Any security functionality that is tied to a server, an internet protocol (IP) address, a media access control (MAC) address, port, or other physical instance will no longer be as effective as in purely physical environments.

4.0 Summary

Security in private cloud environments is not intrinsically more difficult than in older-style data centers; it just requires a good understanding of the architectural, procedural, and operational differences between private cloud implementations and more traditional designs.
The all-encompassing security as the foundation for your private cloud design, the layered architecture, the pooling of resources, the provision of rapid elasticity, the change in connection to the intranet, and the different treatment of client computers are examples of the changes in approach and mindset that private cloud environments require. The security principles of defense in depth, authentication, authorization, auditing, least privilege, encryption, and data protection remain unchanged.

As private and hybrid cloud implementations grow in popularity, new management and diagnostic tools will appear that will simplify the process of detecting intruders rapidly and applying automated actions to prevent data loss. These tools will also operate in virtualized environments and help to prevent side-channel attacks between virtual machines or between virtual machines and hosts.

Ultimately, it is the operational experience of IT departments that will do the most to plug any gaps in security that private or hybrid cloud implementations create. Whatever the operational challenges, the business benefits of cloud environments mean that this new architecture is likely to be widely adopted and should deliver quantifiable business benefits to organizations that embrace this technology.

5.0 Additional Resources

TBD

6.0 Authors and Reviewers