CLOUD INFRASTRUCTURE

In this chapter we give an overview of the cloud computing infrastructure offered at this time by Amazon, Google, and Microsoft; these providers support one or more of the three cloud computing paradigms: IaaS (Infrastructure as a Service), SaaS (Software as a Service), and PaaS (Platform as a Service). Amazon is a pioneer in IaaS, while Google's efforts are focused on the SaaS and PaaS paradigms. Sun and IBM offer their own cloud computing platforms, Sun Cloud and Blue Cloud, respectively. In 2011, HP announced plans to enter the cloud computing club. Private clouds are an alternative to commercial cloud platforms; open-source cloud computing platforms such as Eucalyptus, OpenNebula, and Nimbus can be used as the control infrastructure for a private cloud. We continue our discussion of the cloud infrastructure with an overview of SLAs (Service Level Agreements), followed by a brief discussion of software licensing and of the energy consumption and ecological impact of cloud computing.

Cloud computing at Amazon

Amazon was one of the first providers of cloud computing (http://aws.amazon.com); it announced a limited public beta release of its elastic computing platform, EC2, in August 2006. EC2 is based on the Xen virtualization strategy. In EC2 each virtual machine functions as a virtual private server and is called an instance; an instance specifies the maximum amount of resources available to an application, the interface for that instance, as well as the cost per hour. Amazon uses two categories, Region and Availability Zone, to describe the physical and virtual placement of the systems providing each type of service. For example, the S3 storage service is available in the US Standard, US West, Europe, and Asia Pacific Regions; the corresponding storage facilities are located in Northern Virginia, Northern California, Ireland, and Singapore and Tokyo, respectively. An application developer has the option to use these categories to reduce communication latency, minimize costs, address regulatory requirements, and increase reliability and security.

The AWS (Amazon Web Services) infrastructure offers a palette of services, discussed next, that are available through the AWS Management Console; these services include: Elastic Compute Cloud (EC2); Simple Storage System (S3); Elastic Block Store (EBS); SimpleDB; Simple Queue Service (SQS); CloudWatch; and Virtual Private Cloud (VPC).

Elastic Compute Cloud is a Web service with a simple interface for launching instances of an application under several operating systems, such as various Linux distributions, Microsoft Windows Server 2003 and 2008, OpenSolaris, FreeBSD, and NetBSD. EC2 allows a user to load instances of an application with a custom application environment, manage network access permissions, and run the images using as many or as few systems as desired. EC2 instances boot from an AMI (Amazon Machine Image) digitally signed and stored in S3; one can use the few images provided by Amazon or customize an image and store it in S3. A user can interact with EC2 using a set of SOAP (Simple Object Access Protocol) messages and can list available AMI images, boot an instance from an image, terminate an instance, display the user's running instances, display console output, and so on. The user has root access to each instance in the elastic and secure computing environment of EC2. The instances can be placed in multiple locations in different Regions and Availability Zones.
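As an illustration of this interface, the following minimal sketch launches and later terminates a small instance using the boto Python library rather than raw SOAP messages; the AMI identifier, key pair name, and region are placeholders, not values taken from the text.

    import time
    import boto.ec2

    # Connect to a Region; credentials are read from the environment or ~/.boto.
    conn = boto.ec2.connect_to_region("us-east-1")

    # Boot one small instance from a (hypothetical) AMI stored in S3.
    reservation = conn.run_instances("ami-12345678",
                                     instance_type="m1.small",
                                     key_name="my-keypair")
    instance = reservation.instances[0]

    # Poll until the instance is running, then print its public DNS name.
    while instance.state != "running":
        time.sleep(5)
        instance.update()
    print(instance.public_dns_name)

    # The user now has root access via ssh; terminate the instance when done.
    conn.terminate_instances(instance_ids=[instance.id])

The same operations can, of course, be carried out through the AWS Management Console or the SOAP interface mentioned above.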
EC2 allows the import of virtual machine images from the user environment to an instance through a facility called VM Import. It also automatically distributes the incoming application traffic among multiple instances using the Elastic Load Balancing facility. EC2 associates an elastic IP address with an account; this mechanism allows a user to mask the failure of an instance and remap a public IP address to any instance of the account, without the need to interact with the software support team. Another facility, called Auto Scaling, allows the user to seamlessly scale up and down the number of instances used by an application.

The EC2 system offers several instance types. Standard instances: micro (StdM), small (StdS), large (StdL), extra large (StdXL); small is the default. High-memory instances: high-memory extra large (HmXL), high-memory double extra large (Hm2XL), and high-memory quadruple extra large (Hm4XL). High-CPU instances: high-CPU extra large (HcpuXL). Cluster computing: cluster computing quadruple extra large (Cl4XL). Table 2 summarizes the features and the amount of resources supported by each instance. The resources supported by each configuration are: main memory, virtual computers (VCs) with a 32- or 64-bit architecture, instance memory (I-memory) on persistent storage, and I/O performance at two levels, moderate (M) or high (H). The computing power of a virtual core is measured in EC2 compute units (CUs). A main attraction of Amazon cloud computing is its low cost; the dollar amounts charged for one hour of running under Linux or Unix and under Windows at the time of this writing are summarized in Table 3.

Simple Storage System is a storage service with a minimal set of functions: write, read, and delete. It allows an application to handle an unlimited number of objects ranging in size from 1 byte to 5 TB. An object is stored in a bucket and retrieved via a unique, developer-assigned key; a bucket can be stored in a Region selected by the user. S3 maintains for each object: the name, the modification time, an access control list, and up to 4 KB of user-defined metadata; the object names are global. Authentication mechanisms ensure that data is kept secure; objects can be made public, and rights can be granted to other users. The Amazon S3 SLA guarantees reliability. S3 uses standards-based REST and SOAP interfaces; the default download protocol is HTTP, and a BitTorrent protocol interface is provided to lower costs for high-scale distribution. S3 supports PUT, GET, and DELETE primitives to manipulate objects, but does not support primitives to copy, rename, or move an object from one bucket to another. S3 computes the MD5 hash of every object written and returns it in a field called ETag. A user is expected to compute the MD5 of an object read or written and to compare it with the ETag; if the two values do not match, then the object was corrupted during transmission or storage. S3 is designed to store large objects.

Elastic Block Store provides persistent block-level storage volumes for use with Amazon EC2 instances. A volume appears to an application as a raw, unformatted, and reliable physical disk. The size of a storage volume ranges from 1 GB to 1 TB; the volumes are grouped together in Availability Zones and are automatically replicated in each zone. An EC2 instance may mount multiple volumes, but a volume cannot be shared among multiple instances.
EBS supports the creation of snapshots of the volumes attached to an instance; these snapshots can then be used to restart an instance. The storage strategy provided by EBS is suitable for database applications, file systems, and applications using raw data devices.

SimpleDB is a non-relational data store that allows developers to store and query data items via Web services requests; it supports store and query functions traditionally provided only by relational databases. SimpleDB creates multiple geographically distributed copies of each data item and supports high-performance Web applications; at the same time, it automatically manages infrastructure provisioning, hardware and software maintenance, replication and indexing of data items, and performance tuning.

Simple Queue Service is a hosted message queue. SQS is a system for supporting automated workflows; it allows multiple Amazon EC2 instances to coordinate their activities by sending and receiving SQS messages. Any computer connected to the Internet can add or read messages without any installed software or special firewall configurations. Applications using SQS can run independently and asynchronously, and do not need to be developed with the same technologies. A received message is "locked" during processing; if processing fails, the lock expires and the message becomes available again. The timeout for locking can be changed dynamically via the ChangeMessageVisibility operation. Developers can access SQS through standards-based SOAP and Query interfaces. Queues can be shared with other AWS accounts and anonymously; queue sharing can also be restricted by IP address and time of day.

CloudWatch is a monitoring infrastructure used by application developers, users, and system administrators to collect and track metrics important for optimizing the performance of applications and for increasing the efficiency of resource utilization. Without installing any software, a user can monitor free of charge seven or eight pre-selected metrics, collected at one-minute or five-minute intervals, and then view graphs and statistics for these metrics.

Virtual Private Cloud provides a bridge between the existing IT infrastructure of an organization and the AWS cloud; the existing infrastructure is connected via a Virtual Private Network (VPN) to a set of isolated AWS compute resources. VPC allows existing management capabilities such as security services, firewalls, and intrusion detection systems to operate seamlessly within the cloud.

In 2007 Garfinkel reported the results of an early evaluation of the Amazon Web Services. The paper reports that EC2 instances are fast, responsive, and very reliable; a new instance could be started in less than two minutes. During the year of testing, one unscheduled reboot and one instance freeze were experienced; no data was lost during the reboot, but no data could be recovered from the virtual disks of the frozen instance. To test S3, a bucket was created and loaded with objects of 1 byte, 1 KB, 1 MB, 16 MB, and 100 MB in size. The measured throughput for the 1-byte objects reflected the transaction speed of S3, because the testing program required that each transaction be successfully resolved before the next was initiated. The measurements showed that a user could execute at most 50 non-overlapping S3 transactions. The 100 MB probes measured the maximum data throughput that the S3 system could deliver to a single client thread.
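A probe of this kind is easy to reproduce. The sketch below is only an approximation of the published methodology; it assumes the boto Python library and a hypothetical, pre-created bucket, times a single PUT/GET of a 1 MB object, and performs the MD5/ETag integrity check described earlier.

    import hashlib
    import time
    import boto

    conn = boto.connect_s3()                      # credentials from the environment
    bucket = conn.get_bucket("my-test-bucket")    # hypothetical bucket name
    data = b"x" * (1024 * 1024)                   # a 1 MB probe object

    key = bucket.new_key("probe-1mb")
    t0 = time.time()
    key.set_contents_from_string(data)            # PUT
    t1 = time.time()
    body = key.get_contents_as_string()           # GET
    t2 = time.time()

    # S3 returns the MD5 of a (non-multipart) object as the ETag.
    local_md5 = hashlib.md5(body).hexdigest()
    etag = key.etag.strip('"')

    # The object is 1 MB, so MB/s is simply 1 divided by the elapsed seconds.
    print("write MB/s: %.2f, read MB/s: %.2f" % (1.0 / (t1 - t0), 1.0 / (t2 - t1)))
    print("integrity OK" if local_md5 == etag else "object corrupted")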
From the measurements the author concluded that the data throughput for large objects was considerably larger than for small objects, because of the high per-transaction overhead. The write bandwidth for 1 MB data was roughly 5 MB/s, while the read bandwidth was five times lower, 1 MB/s. Another test was designed to see if concurrent requests could improve the throughput of S3; the experiment involved two virtual machines running on two different clusters and accessing the same bucket with repeated 100 MB GET and PUT operations. The virtual machines were coordinated, with each one executing 1 to 6 threads for 10 minutes and then repeating the pattern for 11 hours. As the number of threads increased from 1 to 6, the bandwidth received by each thread was roughly cut in half, and the aggregate bandwidth of the six threads was 30 MB/s, roughly three times the aggregate bandwidth of one thread. In 107,556 tests of EC2, each consisting of multiple read and write probes, only 6 write retries, 3 write errors, and 4 read retries were encountered.

The AWSLA (Amazon Web Services Licensing Agreement) allows the company to terminate service to any customer at any time for any reason and contains a covenant not to sue Amazon or its affiliates as a result of any damages that might arise out of the use of AWS. As noted in [92], the AWSLA prohibits the use of "other information obtained through AWS for the purpose of direct marketing, spamming, contacting sellers or customers." It prohibits AWS from being used to store any content that is "obscene, libellous, defamatory or otherwise malicious or harmful to any person or entity;" it also prohibits S3 from being used "in any way that is otherwise illegal or promotes illegal activities, including without limitation in any manner that might be discriminatory based on race, sex, religion, nationality, disability, sexual orientation or age."

Cloud computing, the Google perspective

Google's efforts are concentrated in the area of Software as a Service (SaaS); Gmail, Google Docs, Google Calendar, Picasa, and Google Groups are Google services free of charge for individual users and available for a fee to organizations. These services run on a cloud and can be invoked from a broad spectrum of devices, including mobile ones such as iPhones, iPads, and BlackBerries, as well as laptops and tablets; the data for these services is stored in data centers on the cloud.

The Gmail service hosts email on Google servers and provides a Web interface to access it, as well as tools for migrating from Lotus Notes and Microsoft Exchange. Google Docs is Web-based software for building text documents, spreadsheets, and presentations. It supports features such as tables, bullet points, basic fonts, and text sizes; it allows multiple users to edit and update the same document and to view the history of document changes, and it has a spell checker. The service allows users to import and export files in several formats, including Office, PDF, text, and OpenOffice extensions.

Google Calendar is a browser-based scheduler; it supports multiple calendars for a user, the ability to share a calendar with other users, daily/weekly/monthly views, event search, and synchronization with the Outlook calendar. The calendar is accessible from mobile devices; event reminders can be received via SMS, desktop pop-ups, or email. It is also possible to share a calendar with other Google Calendar users. Picasa is a tool to upload, share, and edit images; it provides 1 GB of disk space per user.
Users can add tags to images and attach locations to photos using Google Maps. Google Groups allows users to host discussion forums and to create messages online or via email.

Google is also a leader in the Platform-as-a-Service (PaaS) space. App Engine is a developer platform hosted on the cloud; initially it supported only Python, and support for Java was added later; detailed documentation for Java is available. The datastore used for application development can be accessed with GQL (Google Query Language), which has an SQL-like syntax.

The concept of structured data is important for Google's service strategy. The change in search philosophy reflects the transition from unstructured Web content to structured data, i.e., data that carries additional information, e.g., the place where a photograph was taken, information about the singer of a digital recording of a song, the local services at a geographic location, and so on. Search engine crawlers rely on hyperlinks to discover new content. The deep Web is content stored in databases and served as pages created dynamically by querying HTML forms; such content is unavailable to crawlers, which are unable to fill out such forms. Examples of deep Web sources are: sites with geographic-specific information such as local stores, services, and businesses; sites which report statistics and analyses produced by governmental and non-governmental organizations; art collections; photo galleries; bus, train, and airline schedules; and so on. Structured content is created by labelling; Flickr and Google Co-op are examples of structures where labels and annotations are added to objects, images, and pages stored on the Web. Google Co-op allows users to create customized search engines based on a set of facets or categories; for example, the facets for a search engine for the database research community, available at http://data.cs.washington.edu/coop/dbresearch/index.html, are: professor, project, publication, jobs.

Google Base is a service allowing users to load structured data from different sources into a central repository, a very large, self-describing, semi-structured, heterogeneous database; it is self-describing because each item follows a simple schema: (item type, attribute names). Few users are aware of this service; instead, Google Base is accessed in response to keyword queries posed on Google.com, provided that there is relevant data in the database. To fully integrate Google Base, the results should be ranked across properties; the service also needs to propose appropriate refinements with candidate values in select menus; this is done by computing histograms on attributes and their values at query time.

Specialized structure-aware search engines for several areas, including travel, weather, and local services, have already been implemented. But the data available on the Web covers a wealth of human knowledge; it is not feasible to define all the possible domains, and it is nearly impossible to decide where one domain ends and another begins.

Google has also redefined the laptop with the introduction of the Chromebook, a purely Web-centric device running Chrome OS. Cloud-based applications, extreme portability, built-in 3G connectivity, almost instant-on operation, and all-day battery life are the main attractions of this tablet with a keyboard.
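Returning to App Engine and GQL mentioned above, the short sketch below shows what a datastore query looks like under the older Python runtime; the Photo entity and its properties are invented for the example.

    from google.appengine.ext import db

    class Photo(db.Model):                        # hypothetical entity kind
        owner = db.StringProperty()
        location = db.GeoPtProperty()             # e.g., where the photo was taken
        uploaded = db.DateTimeProperty(auto_now_add=True)

    # GQL is deliberately SQL-like, although the datastore is not relational.
    query = db.GqlQuery("SELECT * FROM Photo WHERE owner = :1 "
                        "ORDER BY uploaded DESC LIMIT 10", "alice")
    for photo in query:
        print("%s %s" % (photo.location, photo.uploaded))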
Google adheres to a bottom-up, engineer-driven, and liberal licensing and user application development philosophy, while Apple, a recent entry in cloud computing, tightly controls the technology stack, builds its own hardware, and requires the applications developed to follow strict rules. Apple products, including the iPhone, iOS, the iTunes Store, Mac OS X, and iCloud, offer unparalleled polish and effortless interoperability, while the flexibility of Google results in more cumbersome user interfaces for the broad spectrum of devices running the Android OS.

Windows Azure and Online Services

Azure and Online Services are, respectively, PaaS (Platform as a Service) and SaaS (Software as a Service) cloud platforms from Microsoft. Windows Azure is an operating system, SQL Azure is a cloud-based version of SQL Server, and Azure AppFabric (formerly .NET Services) is a collection of services for cloud applications.

Figure: the components of Windows Azure. Compute runs cloud applications; Storage uses blobs, tables, and queues to store data; the Fabric Controller deploys, manages, and monitors applications; the CDN maintains cache copies of data; Connect allows IP connections between the user systems and applications running on Windows Azure.

Windows Azure has three core components (see the figure): Compute, which provides a computation environment; Storage, for scalable storage; and the Fabric Controller, which deploys, manages, and monitors applications and interconnects nodes consisting of servers, high-speed connections, and switches. The Content Delivery Network (CDN) maintains cache copies of data to speed up computations. The Connect subsystem supports IP connections between the users' systems and their applications running on Windows Azure. The API interface to Windows Azure is built on REST, HTTP, and XML. The platform includes five services: Live Services, SQL Azure, AppFabric, SharePoint, and Dynamics CRM. A client library and tools are also provided for developing cloud applications in Visual Studio.

The computations carried out by an application are implemented as one or more roles; an application typically runs multiple instances of a role. One distinguishes: (i) Web role instances, used to create Web applications; (ii) Worker role instances, used to run Windows-based code; and (iii) VM role instances, which run a user-provided Windows Server 2008 R2 image.

Scaling, load balancing, memory management, and reliability are ensured by a fabric controller, a distributed application replicated across a group of machines; it owns all of the resources in its environment (computers, switches, load balancers) and is aware of every Windows Azure application. The fabric controller decides where new applications should run; it chooses the physical servers to optimize utilization, using configuration information uploaded with each Windows Azure application. The configuration information is an XML-based description of how many Web role instances and how many Worker role instances the application needs, along with its other requirements; the fabric controller uses this configuration file to determine how many VMs to create.

Blobs, tables, queues, and drives are used as scalable storage. A blob contains binary data; a container consists of one or more blobs. Blobs can be up to a terabyte and may have associated metadata, e.g., information about where a JPEG photograph was taken. Drives allow a Windows Azure role instance to interact with persistent storage as if it were a local NTFS file system.
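Because the platform's interface is built on REST, HTTP, and XML, a blob can be manipulated with plain HTTP requests. The sketch below, using the Python requests library, uploads a block blob; the storage account, container, and shared-access signature are placeholders, not values taken from the text.

    import requests

    # Hypothetical storage account, container, and shared-access signature (SAS).
    account = "myaccount"
    container = "photos"
    sas_token = "?sv=...&sig=..."          # pre-generated SAS token (placeholder)
    url = "https://%s.blob.core.windows.net/%s/vacation.jpg%s" % (
        account, container, sas_token)

    # PUT the object as a block blob; the x-ms-blob-type header is required.
    with open("vacation.jpg", "rb") as f:
        response = requests.put(url, data=f.read(),
                                headers={"x-ms-blob-type": "BlockBlob"})
    print(response.status_code)            # 201 (Created) on success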
Queues enable Web role instances to communicate asynchronously with Worker role instances. The Microsoft Azure platform currently does not provide or support any distributed parallel computing frameworks, such as MapReduce, Dryad, or MPI, other than the support for implementing basic queue-based job scheduling.

After reviewing the cloud services provided by Amazon, Google, and Microsoft, we are in a better position to understand the differences between SaaS, IaaS, and PaaS. There is no confusion about SaaS: the service provider supplies both the hardware and the application software, and the user has direct access to these services through a Web interface but no control over cloud resources. Typical examples are Google, with Gmail, Google Docs, Google Calendar, Google Groups, and Picasa, and Microsoft, with the Online Services.

In the case of IaaS, the service provider supplies the hardware (servers, storage, networks) and system software (operating systems, databases); in addition, the provider ensures system attributes such as security, fault tolerance, and load balancing. The representative IaaS offering is Amazon AWS. PaaS provides only a platform, including the hardware and system software such as operating systems and databases; the service provider is responsible for system updates, patches, and software maintenance. PaaS does not give the user any control over the operating system or security features, nor the ability to install applications. Typical examples are Google App Engine, Microsoft Azure, and Force.com, provided by Salesforce.com. The level of user control over the system is different in IaaS versus PaaS; IaaS provides total control, while PaaS typically provides none. Consequently, IaaS incurs administration costs similar to those of a traditional computing infrastructure, while the administrative costs are virtually zero for PaaS.

Open-source software platforms for private clouds

Private clouds provide a cost-effective alternative for very large organizations. A private cloud has essentially the same structural components as a commercial one: the servers, the network, Virtual Machine Monitors (VMMs) running on individual systems, an archive containing disk images of Virtual Machines (VMs), a front end for communication with the user, and a cloud control infrastructure. Open-source cloud computing platforms such as Eucalyptus, OpenNebula, and Nimbus can be used as the control infrastructure for a private cloud.

Schematically, a cloud infrastructure carries out the following steps to run an application: it retrieves the user input from the front end; retrieves the disk image of a VM (Virtual Machine) from a repository; locates a system and requests the VMM (Virtual Machine Monitor) running on that system to set up a VM; and invokes the DHCP and IP bridging software to set up a MAC and IP address for the VM. We now discuss briefly the three open-source software systems: Eucalyptus, OpenNebula, and Nimbus.

Eucalyptus (http://www.eucalyptus.com/) can be viewed as an open-source counterpart of Amazon's EC2. The system supports a strong separation between the user space and the administrator space; users access the system via a Web interface, while administrators need root access. The system supports decentralized resource management of multiple clusters with multiple cluster controllers, but a single head node for handling user interfaces. It implements a distributed storage system, the analog of Amazon's S3, called Walrus.
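All three systems implement some variant of the generic provisioning sequence described above. Purely as a schematic sketch, with invented function and object names that do not belong to any of these systems, the control flow looks roughly as follows.

    def run_application(front_end, image_repo, scheduler):
        request = front_end.get_user_request()             # retrieve the user input
        image = image_repo.fetch(request.vm_image_id)      # retrieve the VM disk image
        node = scheduler.pick_node(request.resources)      # locate a suitable system
        vm = node.vmm.create_vm(image, request.resources)  # ask the VMM to set up a VM
        mac = node.bridge_virtual_nic(vm)                  # IP bridging: virtual NIC and MAC
        ip = node.dhcp.register(mac)                       # DHCP assigns an IP to the MAC
        vm.start()
        return vm, ip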
The procedure to construct a virtual machine is based on the generic one described above: the euca2ools front end is used to request a VM; the VM disk image is transferred to a compute node; this disk image is modified for use by the VMM on the compute node; the compute node sets up network bridging to provide a virtual NIC with a virtual MAC address; on the head node the DHCP is set up with the MAC/IP pair; the VMM activates the VM; and the user can then ssh directly into the VM. The system can support a large number of users in a corporate enterprise environment. Users are shielded from the complexity of disk configurations and can choose a VM from a set of five configurations of processors, memory, and hard drive space set up by the system administrators.

OpenNebula (http://www.opennebula.org/) is a private-cloud platform in which users actually log into the head node to access cloud functions. The system is centralized, and its default configuration uses the NFS file system. The procedure to construct a virtual machine consists of several steps: (i) a user signs in to the head node using ssh; (ii) the user then issues the onevm command to request a VM; (iii) the VM template disk image is transformed to fit the correct size and configuration within the NFS directory on the head node; (iv) the oned daemon on the head node uses ssh to log into a compute node; (v) the compute node sets up network bridging to provide a virtual NIC with a virtual MAC; (vi) the files needed by the VMM are transferred to the compute node via NFS; (vii) the VMM on the compute node starts the VM; (viii) the user is able to ssh directly to the VM on the compute node. The system is best suited for an operation involving a small- to medium-sized group of trusted and knowledgeable users who are able to configure this versatile system based on their needs.

Nimbus (http://www.nimbusproject.org/) is a cloud solution for scientific applications based on the Globus software. The system inherits from Globus the image storage, the credentials for user authentication, and the requirement that the running Nimbus process can ssh into all compute nodes. Customization in this system can only be done by the system administrators.

Table 4 summarizes the features of the three systems. The conclusions of the comparative analysis are as follows: Eucalyptus is best suited for a large corporation with its own private cloud, as it ensures a degree of protection from user malice and mistakes; OpenNebula is best suited for a testing environment with a few servers; Nimbus is more adequate for a scientific community less interested in the technical internals of the system, but with broad customization requirements.

Service level agreements and compliance level agreements

A Service Level Agreement (SLA) is a negotiated contract between two parties, the customer and the service provider; the agreement can be legally binding or informal and specifies the services that the customer receives, rather than how the service provider delivers the services. The objectives of the agreement are to: identify and define the customer's needs and constraints, including the level of resources, security, timing, and quality of service; provide a framework for understanding, a critical aspect of which is a clear definition of the classes of service and their costs; simplify complex issues, for example by clarifying the boundaries between the responsibilities of the client and those of the service provider in case of failures; reduce areas of conflict;
encourage dialog in the event of disputes; and eliminate unrealistic expectations.

An SLA records a common understanding in several areas: (i) services, (ii) priorities, (iii) responsibilities, (iv) guarantees, and (v) warranties. An agreement usually covers: services to be delivered, performance, tracking and reporting, problem management, legal compliance and resolution of disputes, customer duties and responsibilities, security, handling of confidential information, and termination. Each area of service in cloud computing should define a "target level of service" or a "minimum level of service" and specify the levels of availability, serviceability, performance, operation, or other attributes of the service, such as billing; penalties may also be specified in the case of non-compliance with the SLA. It is expected that any Service-Oriented Architecture (SOA) will eventually include middleware supporting SLA management; the Framework 7 project supported by the European Union is researching this area, see http://sla-at-soi.eu/.

The common metrics specified by an SLA are service-specific. For example, the metrics used by a call center usually are: (i) abandonment rate, the percentage of calls abandoned while waiting to be answered; (ii) average speed to answer, the average time before the service desk answers a call; (iii) time service factor, the percentage of calls answered within a definite time frame; (iv) first-call resolution, the percentage of incoming calls that can be resolved without a callback; and (v) turnaround time, the time to complete a certain task.

There are two well-differentiated phases in SLA management: the negotiation of the contract and the monitoring of its fulfilment in real time. In turn, automated negotiation has three main components: (i) the object of negotiation, which defines the attributes and constraints under negotiation; (ii) the negotiation protocols, which describe the interaction between the negotiating parties; and (iii) the decision models, responsible for processing proposals and generating counterproposals.

The concept of compliance in cloud computing arises in the context of the user's ability to select a service provider; the selection process is subject to customizable compliance with user requirements such as security, deadlines, and costs. The authors propose an infrastructure called Compliant Cloud Computing (C3) consisting of: (i) a language to express user requirements and Compliance Level Agreements (CLAs), and (ii) the middleware for managing CLAs. The Web Service Agreement Specification (WS-Agreement) uses an XML-based language to define a protocol for creating an agreement using a pre-defined template with some customizable aspects; it supports only one-round negotiation, without counterproposals.
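As a small illustration of the call-center metrics listed above, the sketch below computes the abandonment rate and the time service factor from a list of call records; the record layout and the 20-second answer target are invented for the example.

    # Each record: (answered, wait_seconds) -- an invented layout for illustration.
    calls = [(True, 12.0), (True, 45.0), (False, 90.0), (True, 8.0), (False, 30.0)]

    abandonment_rate = sum(1 for answered, _ in calls if not answered) / float(len(calls))

    threshold = 20.0  # "answered within 20 seconds" -- an assumed target
    time_service_factor = sum(1 for answered, wait in calls
                              if answered and wait <= threshold) / float(len(calls))

    print("abandonment rate:    %.0f%%" % (100 * abandonment_rate))
    print("time service factor: %.0f%%" % (100 * time_service_factor))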