Behind the scenes of IaaS implementations Sumayah Alrwais Indiana University

salrwais@indiana.edu
Abstract
Open source IaaS is a hot topic in cloud computing, and every year a new IaaS with a different design is announced. This paper provides a closer look into three common IaaS implementations: Eucalyptus, OpenStack and Nimbus. OpenStack is currently the best choice for public cloud providers due to its aim at standardization and scalability, while Eucalyptus is more appropriate for private cloud deployments.
1. Introduction
“Cloud computing” is the term used to describe what are basically outsourced services. Armbrust and others [29] have called it the dream of computing as a utility: “cloud computing, the long-held dream of computing as a utility, has the potential to transform a large part of the IT industry, making software even more attractive as a service and shaping the way IT hardware is designed and purchased”.
The cloud system can be divided into three main layers, as shown in Figure 1. Infrastructure as a service (IaaS) is the lowest layer, which delivers computing infrastructure as a service to end users. IaaS provides users with a way to monitor, manage and lease resources by deploying virtual machine (VM) instances on those resources. Amazon EC2, Eucalyptus, Nimbus, OpenStack and OpenNebula [1, 2, 16, 24, 30] are examples of cloud infrastructure implementations.
Figure 1: Cloud Architecture layers (from top to bottom: Software as a service (SaaS), Platform as a service (PaaS), Infrastructure as a service (IaaS))
An interesting cloud infrastructure is Nimbus. Nimbus first came out in 2005 and is gaining momentum and popularity, especially in scientific communities. It is an open source implementation of cloud infrastructure which aims to allow providers to build private clouds and workspace services. Nimbus facilitates the management of images and allows users to utilize cloud computing resources [24]. Although Nimbus has been thoroughly researched and analyzed in different dimensions, its security has not received a fair share of that attention. Another commonly used cloud infrastructure is Eucalyptus [1], which is a reverse engineering of EC2 and has a commercial enterprise version. OpenStack [16] is a cloud infrastructure announced in June 2010 that is gaining momentum and competing with Eucalyptus for customers and sponsors such as NASA and IBM.
In this paper, we provide a closer look into the design, architecture, network configurations and security of Eucalyptus, OpenStack and Nimbus. The remainder of this paper is organized as follows: section 2 provides an overview of related works. Sections 3, 4 and 5 describe Eucalyptus, OpenStack and Nimbus. Section 6 provides a discussion and comparison of the main features, and section 7 concludes the paper.
2. Related works
Many works in the literature give an overview of the Eucalyptus system, such as [6, 7, 8], which give a detailed description of system design and performance. The authors in [7] also provide a general overview of the networking within Eucalyptus. To the best of our knowledge, there are no published works on the design of OpenStack or Nimbus. There are a number of published works on experiments performed on Nimbus, but not on its design. A comparison between Eucalyptus, OpenNebula, Nimbus and others is provided in [8, 9], but these works only give a brief description of Nimbus and do not include OpenStack. This paper is unique in its inclusion of OpenStack and in considering the network and security features of all three: Eucalyptus, OpenStack and Nimbus.
3. Eucalyptus
Eucalyptus is an open source implementation of IaaS which enables the creation and administration of private clouds that are scalable, elastic and extensible. Eucalyptus is the result of a research project led by Rich Wolski at the MAYHEM labs, University of California, Santa Barbara [1]. The name Eucalyptus stands for Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems. The Eucalyptus cloud allows users to create, manage and terminate virtual machine images using tools (such as Euca2ools) that follow the same AWS EC2 and S3 APIs [3, 2], making it easier for users to migrate between EC2 and Eucalyptus without needing to learn new tools. Using the same interface as EC2 also allows hybrid clouds containing both Eucalyptus and EC2 to exist. Eucalyptus's partners include Ubuntu, Dell, HP, Novell and Red Hat [1]. Ubuntu used to be the largest supporter of Eucalyptus and has adopted Eucalyptus in its Ubuntu server releases [4]. A commercial enterprise version of Eucalyptus was released in early 2009 [5].
System architecture and design
Eucalyptus was designed for use in research settings. According to its designers [1, 6, 7, 8, 9], it is designed to be extensible, easy to modify and use, and not to require dedicated resources. Eucalyptus components were designed and implemented as web services, and thus each service has a WSDL interface. The use of web services made it possible to utilize WS-Security to secure communications between components and to plug in components as needed. The Eucalyptus system is composed of five main components interacting together: client, cloud controller, storage controller (Walrus), cluster controller and node controller. These components are organized in a hierarchical structure, as illustrated in Figure 2, and communicate through private and public networks as described in the networking section. Following is a description of each component according to [6, 7, 8, 9]:
1. Node Controller: The node controller is installed on each physical resource (node). It is responsible for hosting and administering the virtual machine (VM) instances running on it. The node controller interface, described through WSDL, provides functions to manage its VMs such as runInstance, terminateInstance, describeInstance, describeResource and startNetwork. The first three functions interact with the node's underlying hypervisor to execute and govern such services. Eucalyptus supports both XEN and KVM hypervisors. The describeResource function is a query that reports the node's current capabilities, such as number of cores, memory and disk capacity, to the cluster controller. The startNetwork function sets up the node's virtual network.
Figure 2: Eucalyptus Architecture (labels: Client; Cloud Controller and Walrus; public and private networks; a Cluster Controller for each of Cluster 1 to Cluster n; Node Controllers, each hosting VMs)
2. Cluster Controller: The cluster controller runs on one machine per cluster (usually the head node) and works as an intermediary between the cloud controller and the node controllers. A cluster controller manages a number of node controllers in its cluster and is connected to both public and private networks. It collects state information about its node controllers, schedules VM creation requests and configures both public and private networks. Its WSDL is similar to that of a node controller but operates on a number of instances rather than one. Its functions are runInstances, terminateInstances, describeInstances and describeResource, which interact with the node controllers to execute. The cluster controller performs scheduling by running describeResource on each node controller and selecting the first free node.
3. Storage Controller: The storage controller (Walrus) is a component that provides storage services for storing virtual machine images and users' data. Walrus also runs as a web service and has a WSDL interface compatible with that of S3 [2].
4. Cloud Controller: The cloud controller is the user's entry point into the Eucalyptus system, and only one instance runs on the system. It provides users with a way of managing the system. The cloud controller processes and handles requests from regular users and administrators, schedules VM instance creation, maintains system and user data, and processes service level agreements [7]. According to [6], the cloud controller is built using the Enterprise Service Bus [10], which decouples the services' implementation from the users' requests and enables the use of multiple interfaces and different implementations of such services. Figure 3 illustrates the use of the ESB.
Figure 3: Cloud Controller ESB (VM Scheduler, SLA Engine, User interfaces and Others shown on both sides of the Enterprise Service Bus (ESB))
5. Client: The client component provides the user with a way to access the Eucalyptus system (cloud controller). Eucalyptus provides two interfaces: one is a SOAP client interface described in WSDL, similar to the AWS EC2 [3] interface, called euca2ools. The other is an HTTP query based interface. The administrator is provided with an interface to manage users and VM images.
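The first-free-node scheduling described above for the cluster controller can be illustrated with a minimal Python sketch. The `NodeResources` record and the `first_fit` function are hypothetical names introduced here for illustration; they assume a describeResource reply carries free cores, memory and disk, and they are not taken from the Eucalyptus code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeResources:
    """Hypothetical shape of a describeResource reply from one node controller."""
    node_id: str
    free_cores: int
    free_memory_mb: int
    free_disk_gb: int


def first_fit(nodes: list[NodeResources], cores: int,
              memory_mb: int, disk_gb: int) -> Optional[str]:
    """Return the id of the first node that can host the request, or None."""
    for node in nodes:  # nodes are examined in the order they were polled
        if (node.free_cores >= cores
                and node.free_memory_mb >= memory_mb
                and node.free_disk_gb >= disk_gb):
            return node.node_id
    return None


nodes = [
    NodeResources("nc-1", free_cores=0, free_memory_mb=512, free_disk_gb=10),
    NodeResources("nc-2", free_cores=4, free_memory_mb=4096, free_disk_gb=50),
    NodeResources("nc-3", free_cores=8, free_memory_mb=8192, free_disk_gb=100),
]
chosen = first_fit(nodes, cores=2, memory_mb=1024, disk_gb=20)
print(chosen)  # nc-2
```

First fit is cheap because it stops at the first match, at the cost of tending to fill the nodes that are polled first.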
Networking
Networking in the cloud comprises both inter-VM communication and external VM communication. Eucalyptus delivers networking by providing private and public networks. Each VM instance is assigned two virtual NICs: one is called “private” and is connected to the private network for inter-VM communication, while the other, “public”, is connected to the public network (either through NAT or a direct IP address). Figure 4 illustrates the use of private and public interfaces per VM.
Figure 4: Network Configurations per node (labels: Node; VM Instance 1 to VM Instance N, each with a public interface and a private interface; VLAN 1 to VLAN N; Public Bridge; Private Bridge; Physical Interface; VDE Switch, linked to Ethernet and to another VDE Switch)
The public interface of each VM is directly connected to the public bridge (software based), which is connected to the physical NIC. The cluster controller sets up and configures the public interface network, and IPs are assigned in one of three ways [11, 7]:
1. IPs are assigned using the DHCP server running on the whole physical network.
2. IPs are dynamically selected from a pre-specified IP pool (Elastic IP).
3. Static MAC (Media Access Control) and IP tuples are pre-defined by the administrator.
The private network is set up through the use of a software-based virtualized Ethernet switch (VDE) [12]. Each VM's private interface is connected through the private bridge to the VDE switch network. The VDE switch provides virtual networks and handles connections to the real Ethernet network. One VDE switch is created per cluster controller and node controller. To prevent instances owned by different users from inspecting and altering each other's packets (i.e. network traffic isolation), VLANs can be used. VLANs work by tagging each packet with its VLAN name, which makes the VDE switch send the packet to that VLAN only, as shown in Figure 4.
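The effect of VLAN tagging on traffic isolation can be sketched with a toy switch model. This is only an illustration of the idea under simple assumptions (one VLAN per port, broadcast delivery); the `VirtualSwitch` class and the port names are hypothetical and do not reflect the VDE implementation.

```python
from collections import defaultdict


class VirtualSwitch:
    """Toy switch that forwards a frame only to ports sharing the sender's VLAN."""

    def __init__(self) -> None:
        self._ports_by_vlan = defaultdict(set)  # VLAN id -> attached port names
        self._vlan_of_port = {}                 # port name -> VLAN id

    def attach(self, port: str, vlan: int) -> None:
        """Attach a VM's private interface to the switch on the given VLAN."""
        self._ports_by_vlan[vlan].add(port)
        self._vlan_of_port[port] = vlan

    def deliver(self, src_port: str) -> set:
        """Return the ports that would receive a broadcast frame from src_port."""
        vlan = self._vlan_of_port[src_port]
        return self._ports_by_vlan[vlan] - {src_port}


switch = VirtualSwitch()
switch.attach("alice-vm1", vlan=10)
switch.attach("alice-vm2", vlan=10)
switch.attach("bob-vm1", vlan=20)
receivers = switch.deliver("alice-vm1")
print(receivers)  # {'alice-vm2'}
```

Because the lookup is keyed by the sender's VLAN, a frame from one user's instance never reaches a port on another user's VLAN.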
Eucalyptus offers four networking modes: MANAGED, MANAGED-NOVLAN, SYSTEM and STATIC. Table 1 [11] provides a comparison between the different networking modes and their supported features. IP control refers to whether the IPs are assigned by Eucalyptus's DHCP server or by a DHCP server outside Eucalyptus. The metadata service provides instance-specific information from inside a VM instance.
Mode \ Feature (modes: Managed, Managed-NoVLAN, Static, System)
IP control: X X X
Security groups: X X
Elastic IPs: X X
Metadata service: X X X X
VM isolation: X X
Table 1: Eucalyptus Network modes
Security
Eucalyptus ensures security in a number of ways. VDE switches are connected to each other, providing one whole private network inside Eucalyptus. Their connections to each other are governed by firewall rules, and a user sets up rules to specify which packets to allow and deny. This is provided through security groups, which are sets of rules applied to all VM instances associated with a certain security group. Another security feature is the use of VLANs, which allows the isolation of users' network traffic.
Eucalyptus's web services communication is secured through the use of WS-Security, which enforces message confidentiality and integrity as specified in the security policies. End users' communication with the Eucalyptus system is secured on the command line via Secure Shell (SSH), which uses public key cryptography to encrypt traffic between the two end points.
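How a security group filters traffic can be sketched as a rule-matching function. The sketch assumes allow-only rules with a default-deny policy, as in EC2 security groups; the `Rule` fields and the `is_allowed` helper are hypothetical names, not Eucalyptus APIs.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """One allow rule of a security group (illustrative fields)."""
    protocol: str
    port: int
    source_cidr: str


def is_allowed(rules, protocol: str, port: int, source_ip: str) -> bool:
    """Default deny: a packet passes only if at least one rule matches it."""
    source = ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and source in ip_network(rule.source_cidr)
        for rule in rules
    )


# Every VM instance associated with this group shares the same rules.
web_group = [
    Rule("tcp", 22, "10.0.0.0/8"),  # SSH only from the private range
    Rule("tcp", 80, "0.0.0.0/0"),   # HTTP from anywhere
]
print(is_allowed(web_group, "tcp", 80, "203.0.113.5"))  # True
print(is_allowed(web_group, "tcp", 22, "203.0.113.5"))  # False
```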
4. OpenStack
OpenStack is another open source cloud infrastructure implementation. OpenStack is developed jointly by Rackspace cloud hosting [13] and NASA. In the latest report by Guy Rosen [14], Rackspace is the second most used cloud hosting company. OpenStack was first introduced in June 2010 and is gaining momentum and popularity every day. OpenStack has gained Ubuntu and Cisco as partners in addition to many other contributors [15]. OpenStack's mission according to [15] is “To produce the ubiquitous Open Source cloud computing platform that will meet the needs of public and private cloud providers regardless of size, by being simple to implement and massively scalable”. OpenStack has taken the spotlight from Eucalyptus and has acquired NASA's partnership. The main advantage of OpenStack over Eucalyptus is that it is geared toward public cloud provision and support of large deployments in addition to private clouds.
Figure 5: Top Cloud hosting services [14]
OpenStack has three main parts: Swift, Nova and the Imaging service. Swift is a redundant, scalable object storage service provided by Rackspace and is currently called “OS Object Storage”. Nova is the underlying cloud controller developed by NASA and is currently called “OS Compute”. The imaging service provides VM image lookup and retrieval. In this paper we focus on OS Compute, which provides cloud management of all resources, networking, authorization and scalability [16]. The current release is Bexar, released on February 3, 2011, which supports the XEN, KVM, UML, QEMU and Microsoft Hyper-V hypervisors.
OpenStack Compute controls access to the cloud through users and projects. Projects constitute the main organizational structure within the cloud. Projects are resource containers encompassing users, VM images, instances, keys, volumes and VLANs [17]. Each project has a pre-determined quota that controls the number and size of volumes, the number of instances, the number of cores and the number of public IPs. The OpenStack system is designed to be used by many types of users. Access by different kinds of users is governed through role based access control (RBAC), in which five roles are defined [17], including the following:
1. Cloud Administrator (admin): provides root access to the whole system.
2. IT Security (itsec): allows users to quarantine instances.
3. Project Manager (projectmanager): used by project owners to manage projects by adding users and images and administering VM instances.
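The combination of per-project quotas and role based access control can be sketched as follows. The role names come from the text above; the action names, the `Project` class and the `QuotaExceeded` exception are hypothetical and only illustrate the order of checks (role first, then quota), not the Nova implementation.

```python
from dataclasses import dataclass, field

# Role names come from the text above; the action names are hypothetical.
ROLE_ACTIONS = {
    "admin": {"quarantine_instance", "add_user", "add_image", "run_instance"},
    "itsec": {"quarantine_instance"},
    "projectmanager": {"add_user", "add_image", "run_instance"},
}


class QuotaExceeded(Exception):
    """Raised when a request would push a project past its quota."""


@dataclass
class Project:
    name: str
    max_instances: int
    max_cores: int
    used_instances: int = 0
    used_cores: int = 0
    user_roles: dict = field(default_factory=dict)  # user name -> set of role names

    def can(self, user: str, action: str) -> bool:
        """True if any role held by the user permits the action."""
        roles = self.user_roles.get(user, set())
        return any(action in ROLE_ACTIONS.get(role, set()) for role in roles)

    def run_instance(self, user: str, cores: int) -> None:
        """Check the role first, then the quota, then record the usage."""
        if not self.can(user, "run_instance"):
            raise PermissionError(f"{user} may not run instances in {self.name}")
        if (self.used_instances + 1 > self.max_instances
                or self.used_cores + cores > self.max_cores):
            raise QuotaExceeded(f"project {self.name} is over quota")
        self.used_instances += 1
        self.used_cores += cores


project = Project("physics", max_instances=2, max_cores=4,
                  user_roles={"carol": {"projectmanager"}, "dave": {"itsec"}})
project.run_instance("carol", cores=2)
```

A second request from carol for 4 cores would raise `QuotaExceeded`, and any request from dave would raise `PermissionError`, since the itsec role in this sketch only permits quarantining.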
System architecture and design
OpenStack Compute was designed to be component based, highly available, fault tolerant and recoverable, and to provide API compatibility [18]. OpenStack Compute consists of 8 main components, as illustrated in Figure 6. The components interact through a message based architecture. Communication between the cloud controller, scheduler, network controller and volume controller is performed asynchronously through AMQP (Advanced Message Queuing Protocol).
Figure 6: OpenStack Architecture (components: EC2 API, OpenStack API, nova-manage, API Server, Cloud Controller, Auth Manager, Object Store, Scheduler, Network Controller, Volume Controller and Compute Controller; links are labeled AMQP queue and http)
Following is a description of each component [17, 18]:
1. API Servers: The user's front end, providing interaction with the cloud controller and allowing users holding all kinds of roles to access and manage storage, network, images, projects, users and instances. Two interfaces are provided: one is EC2 compatible and the other is OpenStack specific. Also, a nova-manage interface is provided for users holding admin roles for administration and maintenance of the cloud.
2. Cloud controller: Maintains cloud state and handles interaction between services such as the scheduler, volume controller and network controller through the use of a message queue called RabbitMQ [19].
3. Scheduler: Selects the available compute controller to host an instance.
4. Network controller: Manages networking resources on each node and builds virtual networks to be used by compute controllers. It provides commands such as Allocate Fixed IP Addresses, Configuring VLANs for projects and Configuring networks for compute nodes.
5. Volume controller: Provides permanent storage for the compute controllers. It provides commands such as Create Volumes, Delete Volumes and Establish Compute volumes.
6. Auth Manager: Provides authorization and authentication services.
7. Compute controller: Manages hosted VM instances and provides commands such as Run instances, Terminate instances, Reboot instances, Attach volumes, Detach volumes and Get console output through an API interface.
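The asynchronous, message based interaction between components can be illustrated with an in-process stand-in. The sketch below uses Python's standard `queue` and `threading` modules instead of AMQP or RabbitMQ, so it only shows the pattern (the sender enqueues a request and continues, a worker consumes it later); the message layout and the names `scheduler_queue` and `scheduler_worker` are assumptions made for illustration.

```python
import queue
import threading

scheduler_queue = queue.Queue()  # stands in for the scheduler's message queue
handled = []                     # records what the worker processed, in order


def scheduler_worker() -> None:
    """Consume messages until the None sentinel arrives."""
    while True:
        message = scheduler_queue.get()
        if message is None:
            break
        handled.append(f"{message['method']}:{message['args']['instance_id']}")


worker = threading.Thread(target=scheduler_worker)
worker.start()

# The sender enqueues requests and moves on without waiting for a reply.
scheduler_queue.put({"method": "run_instance", "args": {"instance_id": "i-0001"}})
scheduler_queue.put({"method": "run_instance", "args": {"instance_id": "i-0002"}})
scheduler_queue.put(None)  # sentinel: tell the worker to stop
worker.join()
print(handled)  # ['run_instance:i-0001', 'run_instance:i-0002']
```

Decoupling sender and receiver through a queue is what lets components fail and recover independently, which matches the stated design goals of fault tolerance and recoverability.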
Networking
OpenStack provides two types of IPs: fixed IPs, which are assigned to a VM at creation time and remain with it until the VM is terminated, and floating IPs, which are associated and disassociated with VMs dynamically. The network controller for each node creates virtual networks and uses bridge networking to link virtual networks to each other and to the public network.
Three networking modes are supported to implement fixed IPs: flat mode, flat DHCP mode and VLAN DHCP mode. In flat mode, the network administrator specifies an IP prefix (IP pool) which is used to assign fixed IPs to instances at boot time. Two network bridges (named br100) are configured: one on the network controller and the other on the compute controller, with all VMs connected to it.
In flat DHCP mode, an IP prefix is specified as in flat mode, but the assignment of IPs is performed by a DHCP server: dnsmasq is run on the network controller to listen for dhcpdiscover requests. In flat and flat DHCP modes, the IPs assigned are public IPs, and thus VMs are not isolated.
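Assigning fixed IPs from an administrator-specified prefix can be sketched with Python's standard `ipaddress` module. The `FixedIpPool` class is a hypothetical illustration: it hands out usable host addresses in order and keeps an address bound to an instance until the instance is released. A real deployment would also reserve addresses such as the gateway, which this sketch ignores.

```python
import ipaddress


class FixedIpPool:
    """Hands out host addresses from a prefix; an IP stays with its instance."""

    def __init__(self, cidr: str) -> None:
        self._free = list(ipaddress.ip_network(cidr).hosts())
        self._assigned = {}  # instance id -> IP address

    def allocate(self, instance_id: str):
        """Return the instance's fixed IP, assigning the next free one if needed."""
        if instance_id in self._assigned:
            return self._assigned[instance_id]
        if not self._free:
            raise RuntimeError("IP pool exhausted")
        address = self._free.pop(0)
        self._assigned[instance_id] = address
        return address

    def release(self, instance_id: str) -> None:
        """Return the IP to the pool when the instance is terminated."""
        self._free.append(self._assigned.pop(instance_id))


pool = FixedIpPool("192.168.0.0/29")
first = pool.allocate("i-0001")
second = pool.allocate("i-0002")
print(first, second)  # 192.168.0.1 192.168.0.2
```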
In VLAN DHCP mode, VLANs are created using switches that support host-managed VLAN tagging, and one VLAN is created for each project. Instances are assigned private IPs and are only accessible from within the VLAN. To gain access to the VLAN, the user first needs to access a head node called the cloud pipe, which provides VPN access. The user is then able to VPN into the VLAN and access the instances through their private IPs. VLAN DHCP is the default mode in OpenStack. Figure 7, obtained from [18], gives an overview of the VLAN DHCP mode.
In the current release, IPv6 is supported in all network modes but FlatManager.
Figure 7: OpenStack VLAN Network mode [18]
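The one-VLAN-per-project rule of VLAN DHCP mode can be sketched as a small allocator. The `VlanAllocator` class, the VLAN id range and the project names are assumptions made for illustration and are not taken from the OpenStack code.

```python
class VlanAllocator:
    """Gives each project its own VLAN id from a fixed range."""

    def __init__(self, first_vlan: int, count: int) -> None:
        self._available = list(range(first_vlan, first_vlan + count))
        self._by_project = {}  # project name -> VLAN id

    def vlan_for(self, project: str) -> int:
        """Return the project's VLAN, allocating one on first use."""
        if project not in self._by_project:
            if not self._available:
                raise RuntimeError("no VLAN ids left")
            self._by_project[project] = self._available.pop(0)
        return self._by_project[project]

    def isolated(self, project_a: str, project_b: str) -> bool:
        """Two different projects never share a VLAN, so their traffic is separated."""
        return self.vlan_for(project_a) != self.vlan_for(project_b)


allocator = VlanAllocator(first_vlan=100, count=2)
physics_vlan = allocator.vlan_for("physics")
biology_vlan = allocator.vlan_for("biology")
print(physics_vlan, biology_vlan)  # 100 101
```

Because the VLAN id range is finite, the number of projects that can be isolated this way is bounded by the size of the range.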
Security
OpenStack uses a notion of security groups similar to that of Eucalyptus, defining packet filtering (firewall) rules per security group. Each project can be associated with one or more security groups. OpenStack also enforces access control through user level privileges using an RBAC model, as explained above. Network communication is secured through the use of VPNs and VLANs, providing VM isolation. In addition, OpenStack has reported its plan to adopt Open vSwitch [20], which is expected to enhance network security. Command line access to VMs is secured through the use of SSH. For cloud auditing, OpenStack is considering the use of CloudAudit [21], which provides a common interface for the automation of Audit, Assertion, Assessment and Assurance of the IaaS.
5. Nimbus
Nimbus is another open source infrastructure as a service implementation framework which came online September 2005 [25]
and was initiated from the Workspace Service Project [22]. Nimbus was designed with the goal of turning clusters into clouds
mainly to be used in scientific applications [25]. According to [23], Nimbus aims to allow building of private clouds, allow users
to use clouds by leasing resources and allow developers to experiment with Nimbus.
System architecture and design
Nimbus offers a “cloudKit” that allows users to lease remote resources by allocating and configuring Virtual Workspace Service (VWS). Nimbus consists of a number of components based on web service technology, as illustrated in Figure 8, adapted from [25].
Figure 8: Nimbus Architecture (adapted from [25]; labels: WSRF, EC2, Workspace service, Workspace resource manager, Workspace pilot, Workspace control, Cumulus Storage Service, Context Broker, Context Client, Cloud client, Workspace client, Cloud provider such as OpenNebula)
Following is a description of each component [8, 9, 23, 24]:
1. Workspace service: Allows clients to manage and administer VMs by providing two interfaces: one is based on the Web Services Resource Framework (WSRF) and the other is based on the EC2 WSDL. This service communicates with a workspace resource manager or a workspace pilot to manage instances.
2. Workspace resource manager: Implements VM instance creation and management on a site.
3. Workspace pilot: Provides virtualization without significant changes to the site configurations.
4. Workspace control: Implements VM instance management such as starting, stopping and pausing a VM. It also provides image management, sets up networks and provides IP assignment.
5. Context broker: Allows clients to coordinate large virtual cluster launches automatically and repeatedly.
6. Workspace client: A complex client that provides full access to the workspace service functionality.
7. Cloud client: A simpler client providing access to selected functionalities in the workspace service.
8. Storage service: Cumulus is a web service providing users with storage capabilities to store images, and works in conjunction with GridFTP [26].
Networking
Similar to Eucalyptus, Nimbus provides two network configurations: external and internal. In the external configuration, a VM instance is associated with either a private IP (accessible through VPN) or a public IP. Each VM instance can specify one or more network configurations, thus mixing public and private networks as in Eucalyptus.
Security
Nimbus implements and follows the Grid Security Infrastructure (GSI) [28] in order to provide public key based authentication and authorization. GSI is also used to secure communications by forcing running processes in Nimbus to communicate through SSH. Key exchange is performed through the integration of Globus certificate credentials.
6. Discussion
OpenStack, Eucalyptus and Nimbus are some of the most commonly used open source infrastructure as a service implementations. OpenStack is one of the leading IaaS implementations since it is completely open source and is making headway in both design and implementation. OpenStack has the great advantage of being supported by Rackspace, and it aims to standardize infrastructure as a service frameworks. Eucalyptus is lacking because it provides certain features in the commercial version only and aims to implement every feature provided by AWS EC2. OpenStack is a good choice for cloud providers because it is scalable. Nimbus, on the other hand, is geared towards scientific applications, and its architecture does not seem to be scalable, although it supports interesting security features except VM isolation. Table 2 provides a comparison of some of the main features found in Eucalyptus, OpenStack and Nimbus.
Feature \ IaaS | Eucalyptus | OpenStack | Nimbus
Virtual network | VDE | VDE & Open vSwitch | VDE
VM isolation | VLAN | VLAN | Not available
System Security | Firewall filters (Security groups), SSH & WS-Security | VPN access, SSH, security groups and CloudAudit | VPN access, SSH & use of Globus certificate credentials
User Security | User credentials provided through web interface | User credentials using certificate authority for VPN access | Users' x509 public key certificates, which are provided to the cloud
DHCP | On the cluster controller | On the network controller | On the individual node
Hypervisor support | XEN, KVM & VMware | XEN, KVM, UML, QEMU & Microsoft Hyper-V | XEN & KVM
Storage System | Walrus | Object store | Cumulus
Table 2: Feature comparison of Eucalyptus, OpenStack and Nimbus
7. Conclusion
In this paper, we took a closer look into three common infrastructure as a service (IaaS) framework implementations: Eucalyptus, OpenStack and Nimbus. For each framework, we illustrated its architecture and design, network configurations and system security. OpenStack is the best choice for public cloud providers due to its scalability. Eucalyptus is a possible choice for private cloud providers and researchers due to its extensibility and quick installation. Nimbus is also a good choice for researchers and scientific applications since it has an efficient job scheduler. Although Nimbus started in 2005, it lacks detailed documentation and publications.
As future work, we plan to expand our review of the security of infrastructure as a service implementations.
8. References
[1] Eucalyptus home page, http://open.eucalyptus.com/, March 2010
[2] Amazon S3 API, http://docs.amazonwebservices.com/AmazonS3/latest/dev/
[3] Amazon EC2 API, http://docs.amazonwebservices.com/AWSEC2/2009-04-04/DeveloperGuide/
[4] Ubuntu cloud computing home page, http://www.ubuntu.com/business/cloud/overview
[5] Eucalyptus Enterprise home page, http://www.eucalyptus.com/
[6] D. Nurmi et al., “The Eucalyptus Open-Source Cloud-Computing System,” Cloud Computing and Applications 2008 (CCA 08), 2008.
[7] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov. Eucalyptus: a technical report on an elastic utility computing architecture linking your programs to useful systems. Tech. Rep. 2008-10, University of California, Santa Barbara, October 2008.
[8] P. T. Endo, G. E. Gonçalves, J. Kelner, and D. Sadok. A Survey on Open-source Cloud Computing Solutions. Brazilian Symposium on Computer Networks and Distributed Systems, May 2010.
[9] Peter Sempolinski, Douglas Thain, “A Comparison and Critique of Eucalyptus, OpenNebula and Nimbus,” pp. 417-426, 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), 2010.
[10] M. Chang, J. He, and E. Castro-Leon. Service orientation in the computing infrastructure. In SOSE ’06: Proceedings of the Second IEEE International Symposium on Service-Oriented System Engineering, pages 27–33, Washington, DC, USA, 2006. IEEE Computer Society.
[11] Eucalyptus network configuration, http://open.eucalyptus.com/wiki/EucalyptusNetworkConfiguration_v2.0
[12] Virtual Distributed Ethernet, http://vde.sourceforge.net
[13] RackSpace cloud hosting home page, http://www.rackspace.com/cloud/
[14] State of the Cloud – January 2011, http://www.jackofallclouds.com/2011/01/state-of-the-cloud-january-201/
[15] OpenStack Tutorial, IEEE CloudCom, 2010, http://salsahpc.indiana.edu/CloudCom2010/slides/PDF/tutorials/OpenStackTutorialIEEECloudCom.pdf
[16] OpenStack compute home page, http://openstack.org/projects/compute/
[17] Open Stack Administration Manual home page, http://docs.openstack.org/openstack-compute/admin/content/
[18] Nova’s Documentation, http://nova.openstack.org/
[19] Rabbit MQ home page, http://www.rabbitmq.com/
[20] Open Virtual Switch home page, http://openvswitch.org/
[21] CloudAudit home page, http://www.cloudaudit.org/
[22] The Globus alliance home page, http://globus.org/
[23] K. Keahey. “Nimbus: Open Source Infrastructure-as-a-Service Cloud Computing Software”, Workshop on adapting applications and computing services to multi-core and virtualization, CERN, Switzerland, 2009.
[24] Nimbus home page, http://www.nimbusproject.org/docs/2.2/faq.html
[25] K. Keahey, T. Freeman. “Science Clouds: Early Experiences in Cloud Computing for Scientific Applications,” Cloud Computing and Its Applications 2008 (CCA-08), Chicago, IL, October 2008.
[26] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, L. Liming, and S. Tuecke, “GridFTP: Protocol Extension to FTP for the Grid,” Grid Forum Internet-Draft, March 2001.
[27] Wei Lu. Making your workspace secure: establishing trust with VMs in the Grid. SC05 Poster, Seattle, WA, November 2005.
[28] Overview of the grid security infrastructure (GSI), http://www.globus.org/security/overview.html
[29] M. Armbrust et al. A view of cloud computing. Commun. ACM 53, 4 (Apr. 2010), 50–58.
[30] OpenNebula, http://opennebula.org/