Behind the scenes of IaaS implementations

Sumayah Alrwais
Indiana University
salrwais@indiana.edu

Abstract

Open source IaaS is a hot topic in cloud computing, and every year a new IaaS with a different design is announced. This paper provides a closer look into three common IaaS implementations: Eucalyptus, OpenStack and Nimbus. OpenStack is currently the best choice for public cloud providers due to its aim at standardization and scalability, while Eucalyptus is more appropriate for private cloud deployments.

1. Introduction

“Cloud computing” is the term used to describe what are basically outsourced services. Armbrust and others [29] have called it the dream of computing as a utility: “cloud computing, the long-held dream of computing as a utility, has the potential to transform a large part of the IT industry, making software even more attractive as a service and shaping the way IT hardware is designed and purchased.”

The cloud system can be divided mainly into three layers, as shown in figure 1. Infrastructure as a service (IaaS) is the lowest layer, which delivers computing infrastructure as a service to end users. IaaS provides users with a way to monitor, manage and lease resources by deploying virtual machine (VM) instances on those resources. Amazon EC2, Eucalyptus, Nimbus, OpenStack and OpenNebula [1, 2, 16, 24, 30] are examples of cloud infrastructure implementations.

Figure 1: Cloud Architecture layers (Software as a service (SaaS), Platform as a service (PaaS), Infrastructure as a service (IaaS))

An interesting cloud infrastructure is Nimbus. Nimbus first came out in 2005 and is gaining momentum and popularity, especially in scientific communities. It is an open source implementation of cloud infrastructure which aims to allow providers to build private clouds and workspace services. Nimbus facilitates the management of images and allows users to utilize cloud computing resources [24].
Although Nimbus has been thoroughly researched and analyzed in different dimensions, its security has not received a fair share of that attention. Another commonly used cloud infrastructure is Eucalyptus [1], which is a reverse engineering of EC2 and has a commercial enterprise version. OpenStack [16] is a cloud infrastructure announced in June 2010 that is gaining momentum and competing with Eucalyptus for customers and sponsors such as NASA and IBM.

In this paper, we provide a closer look into the design, architecture, network configurations and security of Eucalyptus, OpenStack and Nimbus. The remainder of this paper is organized as follows: section 2 provides an overview of related works. Sections 3, 4 and 5 describe Eucalyptus, OpenStack and Nimbus. Section 6 provides a discussion and comparison of the main features, and section 7 concludes the paper.

2. Related works

Many works in the literature give an overview of the Eucalyptus system, such as [6, 7, 8], which give a detailed description of the system design and performance. The authors in [7] also provide a general overview of the networking within Eucalyptus. To the best of our knowledge, there are no published works on the design of OpenStack or Nimbus. There are a number of published works on experiments performed on Nimbus, but not on its design. A comparison between Eucalyptus, OpenNebula, Nimbus and others is provided in [8, 9], but these works only give a brief description of Nimbus and do not include OpenStack. This paper is unique in its inclusion of OpenStack and in considering the network and security features of all three of Eucalyptus, OpenStack and Nimbus.

3. Eucalyptus

Eucalyptus is an open source implementation of IaaS which enables the creation and administration of private clouds that are scalable, elastic and extensible. Eucalyptus is the result of a research project led by Rich Wolski at the MAYHEM labs, University of California, Santa Barbara [1].
The name Eucalyptus stands for Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems. The Eucalyptus cloud allows users to create, manage and terminate virtual machine images using tools (such as Euca2ools) that expose the same interface as the AWS EC2 and S3 APIs [3, 2], making it easier for users to migrate between EC2 and Eucalyptus without needing to learn new tools. Using the same interface as EC2 also allows hybrid clouds containing both Eucalyptus and EC2 to exist. Eucalyptus has five partners: Ubuntu, Dell, HP, Novell and Red Hat [1]. Ubuntu used to be the largest supporter of Eucalyptus and has adopted Eucalyptus in its Ubuntu server releases [4]. A commercial enterprise version of Eucalyptus was released in early 2009 [5].

System architecture and design

Eucalyptus was designed with the goal of being used in a research setting. According to its designers [1, 6, 7, 8, 9], it is designed to be extensible, easy to modify and use, and not to require dedicated resources. Eucalyptus components were designed and implemented as web services, and thus each service has a WSDL interface. The use of web services made it possible to utilize WS-Security to secure communications between components and to plug in components as needed.

The Eucalyptus system is composed of five main components interacting together: client, cloud controller, storage controller (Walrus), cluster controller and node controller. These components are organized in a hierarchical structure, as illustrated in figure 2, and communicate through private and public networks as described in the networking section. Following is a description of each component according to [6, 7, 8, 9]:

1. Node Controller: The node controller is installed on each physical resource (node). It is responsible for hosting and administering the virtual machine (VM) instances running on it.
The node controller interface, described through WSDL, provides functions to manage its VMs such as runInstance, terminateInstance, describeInstance, describeResource and startNetwork. The first three functions interact with the node’s underlying hypervisor to execute and govern such services. Eucalyptus supports both the XEN and KVM hypervisors. describeResource is a query that reports the node’s current capabilities, such as number of cores, memory and disk capacity, to the cluster controller. startNetwork sets up the node’s virtual network.

Figure 2: Eucalyptus Architecture (labels: client; cloud controller and Walrus; cluster controllers; public and private networks; Cluster 1 to Cluster n, each containing node controllers hosting VMs)

2. Cluster Controller: The cluster controller runs on one machine (usually the head node) per cluster and works as an intermediary between the cloud controller and the node controllers. A cluster controller manages a number of node controllers in its cluster and is connected to both the public and private networks. It collects state information about its node controllers, schedules VM creation requests and configures both the public and private networks. Its WSDL is similar to that of a node controller but operates on a number of instances rather than one. Its functions are runInstances, terminateInstances, describeInstances and describeResource. These functions interact with the node controllers (NC) to execute. The cluster controller performs scheduling by running describeResource on each node controller and selecting the first free node.

3. Storage Controller: The storage controller (Walrus) is a component that provides storage services for storing virtual machine images and users’ data. Walrus also runs as a web service and has a WSDL interface compatible with that of S3 [2].
4. Cloud Controller: The cloud controller is the user’s entry point into the Eucalyptus system, and only one instance of it runs in the system. It provides users with a way of managing the system. The cloud controller processes and handles requests from users (regular users and administrators), schedules VM instance creation, maintains system and user data and processes service level agreements [7]. According to [6], the cloud controller is built using an Enterprise Service Bus (ESB) [10], which decouples the services’ implementation from the users’ requests and thereby enables the use of multiple interfaces and different implementations of such services. Figure 3 illustrates the use of the ESB.

Figure 3: Cloud Controller ESB (labels: VM scheduler, SLA engine, user interfaces and other services attached to the Enterprise Service Bus (ESB))

5. Client: The client component provides the user with a way to access the Eucalyptus system (cloud controller). Eucalyptus provides two interfaces: one is a WSDL-described SOAP client interface similar to the AWS EC2 interface [3], called euca2ools; the other is an HTTP query-based interface. The administrator is provided with an interface to manage users and VM images.

Networking

Networking in the cloud comprises both inter-VM communication and external VM communication. Eucalyptus delivers networking by providing private and public networks. Each VM instance is assigned two virtual NICs: one is called “private” and is connected to the private network and used for inter-VM communication, while the other, “public”, is connected to the public network (either through NAT or a direct IP address). Figure 4 illustrates the use of private and public interfaces per VM.
Figure 4: Network Configurations per node (labels: node; VM instances 1 to N, each with a public interface and a private interface; VLAN 1 to VLAN N; public bridge; private bridge; physical interface, to Ethernet; VDE switch, to another VDE switch)

The public interface of each VM is directly connected to the public bridge (software based), which is connected to the physical NIC. The cluster controller sets up and configures the public interface network, and IPs are assigned in one of three ways [11, 7]:

1. IPs are assigned using the DHCP server running on the whole physical network.
2. IPs are dynamically selected from a pre-specified IP pool (Elastic IP).
3. Static MAC (Media Access Control) and IP tuples are pre-defined by the administrator.

The private network is set up through the use of a software-based virtualized Ethernet switch, VDE [12]. Each VM’s private interface is connected through the private bridge to the VDE switch network. The VDE switch provides virtual networks and handles connections to the real Ethernet network. One VDE switch is created per cluster controller and node controller. To prevent instances owned by different users from inspecting and altering each other’s packets (i.e. to provide network traffic isolation), VLANs can be used. Each packet is tagged with its VLAN name, which makes the VDE switch send the packet to that VLAN only, as shown in figure 4.

Eucalyptus offers four networking modes: MANAGED, MANAGED-NOVLAN, SYSTEM and STATIC. Table 1 [11] provides a comparison between the different networking modes and their supported features. IP control refers to whether the IPs are assigned by Eucalyptus’s DHCP server or by a DHCP server outside Eucalyptus. The metadata service provides instance-specific information from inside a VM instance.
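The VLAN-based traffic isolation described above can be illustrated with a small sketch. This is not Eucalyptus or VDE code: the `ToySwitch` class, its methods and the VLAN and instance names are all hypothetical, and the model captures only the rule that a tagged packet is delivered to interfaces on the same VLAN.

```python
from collections import defaultdict


class ToySwitch:
    """Toy model of a VLAN-aware switch (hypothetical, not the VDE API)."""

    def __init__(self):
        # Maps a VLAN name to the set of instance ids attached to it.
        self._ports = defaultdict(set)

    def attach(self, instance_id, vlan):
        """Connect an instance's private interface to a VLAN."""
        self._ports[vlan].add(instance_id)

    def deliver(self, sender, vlan):
        """Return the instances that receive a packet tagged with `vlan`."""
        if sender not in self._ports[vlan]:
            # Assumption: a sender cannot inject traffic into a VLAN
            # it is not attached to.
            return set()
        return self._ports[vlan] - {sender}


switch = ToySwitch()
switch.attach("alice-vm1", "vlan-alice")
switch.attach("alice-vm2", "vlan-alice")
switch.attach("bob-vm1", "vlan-bob")

# A packet from alice-vm1 reaches only the other member of its VLAN.
print(switch.deliver("alice-vm1", "vlan-alice"))
# bob-vm1 is not on vlan-alice, so nothing is delivered.
print(switch.deliver("bob-vm1", "vlan-alice"))
```

In this simplified model, isolation comes entirely from the tag lookup: instances of different users never share a VLAN entry, so their traffic never crosses.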
Mode \ Feature: Managed, Managed-NoVLAN, Static, System
IP control: X X X
Security groups: X X
Elastic IPs: X X
Metadata service: X X X X
VM isolation: X X
Table 1: Eucalyptus Network modes

Security

Eucalyptus ensures security in a number of ways. VDE switches are connected to each other, providing one whole private network inside Eucalyptus. Their connections to each other are governed by firewall rules, and a user sets up rules to specify which packets to allow and deny. This is provided through security groups, each of which is a set of rules applied to all VM instances associated with that group. Another security feature is the use of VLANs, which allows the isolation of users’ network traffic. Eucalyptus’s web services communication is secured through the use of WS-Security, which enforces message confidentiality and integrity as specified in the security policies. End users’ communication with the Eucalyptus system is secured on the command line via Secure Shell (SSH), which uses public key cryptography to encrypt traffic between the two end points.

4. OpenStack

OpenStack is another open source cloud infrastructure implementation. OpenStack is developed jointly by RackSpace cloud hosting [13] and NASA. In the latest report by Guy Rosen, RackSpace is the second most used cloud hosting company. OpenStack was first introduced in June 2010 and is gaining momentum and popularity every day. OpenStack has acquired the partnership of Ubuntu and Cisco, in addition to many other contributors [15]. The OpenStack mission according to [15] is “To produce the ubiquitous Open Source cloud computing platform that will meet the needs of public and private cloud providers regardless of size, by being simple to implement and massively scalable”. OpenStack has taken the spotlight from Eucalyptus and has acquired NASA’s partnership.
The main advantage of OpenStack over Eucalyptus is that it is geared toward public cloud provision and the support of large deployments, in addition to private clouds.

Figure 5: Top Cloud hosting services [14]

OpenStack has three main parts: Swift, Nova and the imaging service. Swift is a redundant, scalable object storage service provided by RackSpace and is currently called “OS Object Storage”. Nova, the underlying cloud controller developed by NASA, is currently called “OS Compute”. The imaging service provides VM image lookup and retrieval. In this paper we focus on OS Compute, which provides cloud management of all resources, networking, authorization and scalability [16]. The current release is Bexar, released on February 3, 2011, which supports the XEN, KVM, UML, QEMU and Microsoft Hyper-V hypervisors.

OpenStack Compute controls access to the cloud through users and projects. Projects constitute the main organizational structure within the cloud. Projects are resource containers encompassing users, VM images, instances, keys, volumes and VLANs [17]. Each project has a pre-determined quota that controls the number and size of volumes, the number of instances, the number of cores and the public IPs. The OpenStack system is designed to be used by many types of users. Access by different kinds of users is governed through role based access control (RBAC), in which five roles are defined, including the following [17]:

1. Cloud Administrator (admin): provides root access to the whole system.
2. IT Security (itsec): allows users to quarantine instances.
3. Project Manager (projectmanager): used by project owners to manage projects by adding users and images and administering VM instances.

System architecture and design

OpenStack Compute was designed to be component based, highly available, fault tolerant and recoverable, and to provide API compatibility [18]. OpenStack Compute consists of 8 main components, as illustrated in figure 6.
The components are designed to interact through a message-based architecture. Communication between the cloud controller, scheduler, network controller and volume controller is performed asynchronously through AMQP (Advanced Message Queuing Protocol).

Figure 6: OpenStack Architecture (labels: EC2 API, OpenStack API and nova-manage in front of the API server; cloud controller; object store, reached over HTTP; scheduler, network controller and volume controller, reached over the AMQP queue; auth manager; compute controller)

Following is a description of each component [17, 18]:

1. API Servers: The users’ front end, providing interaction with the cloud controller and allowing users holding all kinds of roles to access and manage storage, network, images, projects, users and instances. Two interfaces are provided: one is EC2 compatible and the other is OpenStack specific. A nova-manage interface is also provided for users holding admin roles, for administration and maintenance of the cloud.
2. Cloud controller: Maintains cloud state and handles interaction between services such as the scheduler, volume controller and network controller through the use of a message queue called RabbitMQ [19].
3. Scheduler: Selects the available compute controller to host an instance.
4. Network controller: Manages networking resources on each node and builds virtual networks to be used by compute controllers. It provides commands such as Allocate Fixed IP Addresses, Configuring VLANs for projects and Configuring networks for compute nodes.
5. Volume controller: Provides permanent storage for the compute controllers. It provides commands such as Create Volumes, Delete Volumes and Establish Compute volumes.
6. Auth Manager: Provides authorization and authentication services.
7. Compute controller: Manages hosted VM instances and provides commands such as Run instances, Terminate instances, Reboot instances, Attach volumes, Detach volumes and Get console output through an API interface.
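The asynchronous, message-based interaction between the cloud controller and the scheduler can be sketched as follows. This is a minimal illustration under stated assumptions, not Nova code: an in-process `queue.Queue` stands in for the AMQP queue, the host names and message fields are hypothetical, and the placement rule (first host with enough free cores) is an assumption, since the text only says the scheduler selects an available compute controller.

```python
import queue
import threading

# Hypothetical in-process stand-in for the AMQP queue between components.
scheduler_queue = queue.Queue()
results = queue.Queue()

# Hypothetical compute hosts and their free cores.
free_cores = {"compute-1": 0, "compute-2": 4}


def scheduler_worker():
    """Consume requests and pick the first host with enough free cores."""
    while True:
        message = scheduler_queue.get()
        if message is None:  # sentinel: stop the worker
            break
        host = next(
            (h for h, cores in free_cores.items() if cores >= message["cores"]),
            None,
        )
        if host is not None:
            free_cores[host] -= message["cores"]
        results.put({"instance": message["instance"], "host": host})


def cloud_controller_run_instance(instance, cores):
    """Publish a request and return immediately, without waiting for a reply."""
    scheduler_queue.put({"instance": instance, "cores": cores})


worker = threading.Thread(target=scheduler_worker)
worker.start()
cloud_controller_run_instance("i-0001", cores=2)
cloud_controller_run_instance("i-0002", cores=8)
scheduler_queue.put(None)
worker.join()

# Collect the two placement decisions in the order they were made.
placements = [results.get() for _ in range(2)]
print(placements)
```

The sender never blocks on the scheduler; it only enqueues a message. That decoupling is what allows components to be restarted or scaled independently in a queue-based design.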
Networking

OpenStack provides two types of IPs: fixed IPs, which are assigned to a VM at creation time and remain with it until the VM is terminated, and floating IPs, which are associated with and disassociated from VMs dynamically. The network controller for each node creates virtual networks and uses bridge networking to link virtual networks to each other and to the public network. Three networking modes are supported to implement fixed IPs: flat mode, flat DHCP mode and VLAN DHCP mode.

In flat mode, the network administrator specifies an IP prefix (IP pool) which is used to assign fixed IPs to instances at boot time. Two network bridges (named br100) are configured: one on the network controller and the other on the compute controller, with all VMs connected to it. In flat DHCP mode, an IP prefix is specified as in flat mode, but the assignment of the IPs is performed by a DHCP server: dnsmasq is run on the network controller to listen for dhcpdiscover requests. In flat mode and flat DHCP mode, the IPs assigned are public IPs, and thus VMs are not isolated.

In VLAN DHCP mode, VLANs are created using switches that support host-managed VLAN tagging, and one VLAN is created for each project. In this mode, instances are assigned private IPs and are only accessible from within the VLAN. To gain access to the VLAN, the user first needs to access a head node called the cloud pipe, which provides VPN access. The user is then able to VPN into the VLAN and access the instances through their private IPs. VLAN DHCP is the default mode in OpenStack. Figure 7, obtained from [18], gives an overview of the VLAN DHCP mode. In the current release, IPv6 is supported in all network modes but FlatManager.

Figure 7: OpenStack VLAN Network mode [18]

Security

OpenStack uses a notion of security groups similar to that of Eucalyptus, defining packet filtering (firewall) rules per security group. Each project can be associated with one or more security groups.
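The per-group packet filtering described above can be sketched as a rule match. The sketch below is illustrative only and does not reproduce the OpenStack or Eucalyptus implementation: the `Rule` fields, the `is_allowed` helper and the example group are hypothetical, and the default-deny behaviour (a packet is dropped unless some rule allows it) is an assumption.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """One allow rule: protocol, destination port range and source network."""
    protocol: str
    from_port: int
    to_port: int
    source: str  # CIDR block, e.g. "10.0.0.0/24"


def is_allowed(rules, protocol, port, source_ip):
    """Allow a packet only if some rule in the group matches it (default deny)."""
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and ip_address(source_ip) in ip_network(rule.source)
        for rule in rules
    )


# Hypothetical security group: SSH from anywhere, HTTP from one subnet only.
web_group = [
    Rule("tcp", 22, 22, "0.0.0.0/0"),
    Rule("tcp", 80, 80, "10.0.0.0/24"),
]

# The SSH rule matches any source address.
print(is_allowed(web_group, "tcp", 22, "198.51.100.7"))
# The HTTP rule does not match a source outside 10.0.0.0/24.
print(is_allowed(web_group, "tcp", 80, "198.51.100.7"))
```

Because the rules are attached to the group rather than to a single VM, every instance associated with the group is filtered by the same list.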
Also, OpenStack enforces access control through user-level privileges using an RBAC model, as explained above. Network communication is secured through the use of VPNs and VLANs, which provide VM isolation. OpenStack has also reported its plan to adopt Open vSwitch [20], which would be a great enhancement to network security. Communication between VMs is secured through the use of SSH for command line access. For cloud auditing, OpenStack is considering the use of CloudAudit [21], which provides a common interface for the automation of audit, assertion, assessment and assurance of the IaaS.

5. Nimbus

Nimbus is another open source infrastructure as a service implementation framework, which came online in September 2005 [25] and was initiated from the Workspace Service Project [22]. Nimbus was designed with the goal of turning clusters into clouds, mainly to be used in scientific applications [25]. According to [23], Nimbus aims to allow the building of private clouds, to allow users to use clouds by leasing resources, and to allow developers to experiment with Nimbus.

System architecture and design

Nimbus offers a “cloudKit” that allows users to lease remote resources by allocating and configuring a Virtual Workspace Service (VWS). The design of Nimbus consists of a number of components based on web service technology, as illustrated in figure 8, adopted from [25].

Figure 8: Nimbus Architecture (adopted from [25]) (labels: WSRF, Cumulus storage service, workspace resource manager, workspace control, workspace pilot, workspace service, EC2, context broker, cloud provider such as OpenNebula, context client, cloud client, workspace client)

Following is a description of each component [8, 9, 23, 24]:

1. Workspace service: Allows clients to manage and administer VMs by providing two interfaces: one is based on the web service resource framework (WSRF) and the other is based on the EC2 WSDL.
This service communicates with a workspace resource manager or a workspace pilot to manage instances.
2. Workspace resource manager: Implements VM instance creation and management on a site.
3. Workspace pilot: Provides virtualization without significant changes to the site configuration.
4. Workspace control: Implements VM instance management operations such as starting, stopping and pausing a VM. It also provides image management, sets up networks and provides IP assignment.
5. Context broker: Allows clients to coordinate large virtual cluster launches automatically and repeatedly.
6. Workspace client: A complex client that provides full access to the workspace service functionality.
7. Cloud client: A simpler client providing access to selected functionality of the workspace service.
8. Storage service: Cumulus is a web service providing users with storage capabilities to store images, and works in conjunction with GridFTP [26].

Networking

Similar to Eucalyptus, Nimbus provides two network configurations: external and internal. In the external configuration, a VM instance is associated with either a private IP (accessible through VPN) or a public IP. Each VM instance can specify one or more network configurations, thus mixing public and private networks, similar to Eucalyptus.

Security

Nimbus implements and follows the grid security infrastructure (GSI) [28] in order to provide public key based authentication and authorization. GSI is also used to secure communications by forcing running processes in Nimbus to communicate through SSH. Key exchange is performed through the integration of Globus certificate credentials.

6. Discussion

OpenStack, Eucalyptus and Nimbus are some of the most commonly used open source infrastructure as a service implementations. OpenStack is one of the leading IaaS implementations since it is completely open source and is making headway in both design and implementation. OpenStack has the great advantage of being supported by Rackspace, and it aims to standardize infrastructure as a service frameworks.
Eucalyptus falls short because it provides certain features only in its commercial version and because it aims to implement every feature provided by AWS EC2. OpenStack is a good choice for cloud providers because it is scalable. Nimbus, on the other hand, is geared towards scientific applications, and its architecture does not seem to be scalable, although it supports interesting security features other than VM isolation. Table 2 provides a comparison of some of the main features found in Eucalyptus, OpenStack and Nimbus.

Feature \ IaaS | Eucalyptus | OpenStack | Nimbus
Virtual network | VDE | VDE & Open vSwitch | VDE
VM isolation | VLAN | VLAN | Not available
System security | Firewall filters (security groups), SSH & WS-Security | VPN access, SSH, security groups and CloudAudit | VPN access, SSH & use of Globus certificate credentials
User security | User credentials provided through web interface | User credential using certificate authority for VPN access to the cloud | Users' x509 public key certificates, which are provided
DHCP | On the cluster controller | On the network controller | On the individual node
Hypervisor support | XEN, KVM & VMware | XEN, KVM, UML, QEMU & Microsoft Hyper-V | XEN & KVM
Storage system | Walrus | Object store | Cumulus
Table 2: Feature comparison of Eucalyptus, OpenStack and Nimbus

7. Conclusion

In this paper, we took a closer look into three common infrastructure as a service (IaaS) framework implementations: Eucalyptus, OpenStack and Nimbus. For each framework, we illustrated its architecture and design, network configurations and system security. OpenStack is the best choice for public cloud providers due to its scalability. Eucalyptus is a possible choice for private cloud providers and researchers due to its extensibility and quick installation. Nimbus is also a good choice for researchers and scientific applications since it has an efficient job scheduler. Although Nimbus started in 2005, it lacks detailed documentation and publications.
As future work, we plan to expand our review of the security of infrastructure as a service implementations.

8. References

[1] Eucalyptus home page, http://open.eucalyptus.com/, March 2010
[2] Amazon S3 API, http://docs.amazonwebservices.com/AmazonS3/latest/dev/
[3] Amazon EC2 API, http://docs.amazonwebservices.com/AWSEC2/2009-04-04/DeveloperGuide/
[4] Ubuntu cloud computing home page, http://www.ubuntu.com/business/cloud/overview
[5] Eucalyptus Enterprise home page, http://www.eucalyptus.com/
[6] D. Nurmi et al., “The Eucalyptus Open-Source Cloud-Computing System,” Cloud Computing and Applications 2008 (CCA 08), 2008.
[7] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov. Eucalyptus: A Technical Report on an Elastic Utility Computing Architecture Linking Your Programs to Useful Systems. Tech. Rep. 2008-10, University of California, Santa Barbara, October 2008.
[8] P. T. Endo, G. E. Gonçalves, J. Kelner, and D. Sadok. A Survey on Open-source Cloud Computing Solutions. Brazilian Symposium on Computer Networks and Distributed Systems, May 2010.
[9] Peter Sempolinski, Douglas Thain, “A Comparison and Critique of Eucalyptus, OpenNebula and Nimbus,” CloudCom, pp. 417-426, 2010 IEEE Second International Conference on Cloud Computing Technology and Science, 2010.
[10] M. Chang, J. He, and E. Castro-Leon. Service orientation in the computing infrastructure. In SOSE ’06: Proceedings of the Second IEEE International Symposium on Service-Oriented System Engineering, pages 27–33, Washington, DC, USA, 2006. IEEE Computer Society.
[11] Eucalyptus network configuration, http://open.eucalyptus.com/wiki/EucalyptusNetworkConfiguration_v2.0
[12] Virtual Distributed Ethernet, http://vde.sourceforge.net
[13] RackSpace cloud hosting home page, http://www.rackspace.com/cloud/
[14] State of the Cloud – January 2011, http://www.jackofallclouds.com/2011/01/state-of-the-cloud-january-201/
[15] OpenStack Tutorial, IEEE CloudCom, 2010, http://salsahpc.indiana.edu/CloudCom2010/slides/PDF/tutorials/OpenStackTutorialIEEECloudCom.pdf
[16] OpenStack compute home page, http://openstack.org/projects/compute/
[17] OpenStack Administration Manual home page, http://docs.openstack.org/openstack-compute/admin/content/
[18] Nova’s Documentation, http://nova.openstack.org/
[19] RabbitMQ home page, http://www.rabbitmq.com/
[20] Open vSwitch home page, http://openvswitch.org/
[21] CloudAudit home page, http://www.cloudaudit.org/
[22] The Globus alliance home page, http://globus.org/
[23] K. Keahey. “Nimbus: Open Source Infrastructure-as-a-Service Cloud Computing Software”, Workshop on adapting applications and computing services to multi-core and virtualization, CERN, Switzerland, 2009
[24] Nimbus home page, http://www.nimbusproject.org/docs/2.2/faq.html
[25] K. Keahey, T. Freeman. “Science Clouds: Early Experiences in Cloud computing for Scientific Applications,” Cloud Computing and Its Applications 2008 (CCA-08), Chicago, IL. October 2008
[26] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, L. Liming, and S. Tuecke, “GridFTP: Protocol Extension to FTP for the Grid,” Grid Forum Internet-Draft, March 2001
[27] Wei Lu. Making your workspace secure: establishing trust with VMs in the Grid. SC05 Poster, Seattle, WA. November 2005
[28] Overview of the grid security infrastructure (GSI), http://www.globus.org/security/overview.html
[29] M. Armbrust et al. A view of cloud computing. Commun. ACM 53, 4 (Apr. 2010), 50–58.
[30] OpenNebula, http://opennebula.org/