Introduction to Cloud Computing
Rough on-going draft
© 2011 B. Wilkinson/Clayton Ferner. Fall 2011 Grid computing course. Modification date: August 10, 2011

Server Hosting
Renting Remote Servers
• Around for many years and predates cloud computing – 1990s (?) to present
• Companies provide servers through the Internet that users can rent time on.
• Typically done to host web sites.
• Get a whole server for your use (dedicated server).
• Pay a monthly fee.
• Generally you load whatever software you want.
• Company only responsible for hardware and OS.
• Still exists, although many of these companies have also moved into cloud computing

Cloud Computing
• Cloud computing – “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”*
Came about largely as a business model that lets businesses outsource their IT software to a third-party Cloud provider.
Driven by economics, the Internet and the existence of large server farms.
The word “cloud” comes from drawing cloud shapes to represent a network.
* Wikipedia: http://en.wikipedia.org/wiki/Cloud_computing

Some key aspects of cloud computing
1. Computing resources available on demand, thereby eliminating the need to plan far ahead for provisioning.
2. Elimination of up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs.
3. Ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.*
* “Above the Clouds: A Berkeley View of Cloud Computing,” University of California at Berkeley Technical Report No.
UCB/EECS-2009-28.

Difference between renting physical servers remotely and cloud computing
• In cloud computing, you get a virtual machine running on the provider’s servers, with your selected OS running on top of virtualization software.
• There could be other users on the same servers.
• You access servers through a web service/web site.
• You pay for the specific time used on processors, storage devices and bandwidth/network.
• Cloud computing focuses on virtualization and a service-oriented approach, making it economical for companies to use a third-party cloud provider to maintain hardware and software on an on-demand basis.

Relationship to Grid computing
• Grid computing – using geographically distributed computing resources collaboratively – began as a concept in the mid-1990s with the growth of high-speed networks and the Internet.
• Began as a research concept to provide collaborative computing.
• The word “grid” came from the idea that grid computing would provide computing power on demand through the Internet in the same way as electrical power comes from a distributed electrical grid utility.
• Cost of usage was not a driving force and usually no costs were charged.

Grid and Cloud Computing
• Both Grid computing and Cloud computing take advantage of the Internet.
• One angle of Grid computing was “utility computing”, from the original “grid” term.
• Some companies, notably IBM, saw commercial possibilities in the early 2000s – “on-demand computing” – but it did not take off commercially then.

Utility computing resources
Utility computing was suggested by John McCarthy in the 1960s: “computation may someday be organized as a public utility.” (Wikipedia)
Grid computing took up the idea as on-demand computing.
Cloud computing followed through with:
1. Maturing of virtualization and service-oriented technologies
2. The growth of large underutilized data centers.
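The pay-for-use idea behind utility and cloud computing can be made concrete with a little arithmetic. The Python sketch below compares a flat monthly rental of a dedicated server with hourly billing and finds the number of hours at which the two cost the same. The function names and both rates are hypothetical placeholders chosen for illustration; they are not taken from any provider's price list.

```python
def break_even_hours(monthly_fee: float, hourly_rate: float) -> float:
    """Hours of use per month at which hourly billing equals the flat fee."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return monthly_fee / hourly_rate


def cheaper_option(monthly_fee: float, hourly_rate: float, hours_used: float) -> str:
    """Return 'hourly' or 'flat' for the cheaper way to cover one month's use."""
    return "hourly" if hours_used * hourly_rate < monthly_fee else "flat"


# Hypothetical rates, for illustration only (not real provider prices).
MONTHLY_FEE = 100.0  # flat fee for a rented dedicated server, per month
HOURLY_RATE = 0.25   # pay-per-use charge, per hour

if __name__ == "__main__":
    print(break_even_hours(MONTHLY_FEE, HOURLY_RATE))     # break-even hours per month
    print(cheaper_option(MONTHLY_FEE, HOURLY_RATE, 120))  # light use in one month
```

With these placeholder rates, a user who needs the machine for only part of the month does better under hourly billing, which is the "start small" argument made in the key aspects above.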
Technologies underpinning Cloud computing

(Hardware) Virtualization
• Method of hiding the physical characteristics of a computer platform. User sees an abstract platform.
• Hypervisor – software that controls this virtualization. (Word originally derived from 1960s “supervisor”.)
• Virtual machine (VM) – a “completely isolated guest operating system installation within your normal host operating system”
• User’s programs execute on this virtual machine but have some access to the underlying hardware, as controlled by the hypervisor.
• Different OSs can be provided to individual users.
• Performance reduced (how much?) but provides users with the illusion of their own platform. Users isolated from each other.
http://en.wikipedia.org/wiki/Hardware_virtualization
http://en.wikipedia.org/wiki/Virtual_machine

Full and Platform Virtualization
• Full virtualization – complete simulation of underlying hardware – all instructions, etc.
– Hardware-assisted virtualization – hardware architectural support provided to allow virtualization
• Platform virtualization – limited simulation of underlying hardware. Limits what applications may run.
Server virtualization?

VMware
• A company started in 1998 providing virtualization software, notably hypervisors
• Offers a number of products
– Cloud Foundry – free, open source cloud computing platform as a service (PaaS) software
http://en.wikipedia.org/wiki/VMware
http://www.vmware.com/

Hypervisor Example: Xen hypervisor and Xen cloud platform
Open source hypervisor for x86, x86-64, Itanium, PowerPC, and ARM processors. Supports various OSs including Linux and Windows.
• “Runs directly on the hardware and becomes the interface for all hardware requests such as CPU, I/O, and disk for the guest operating systems.”
• Used by Amazon Web Services (AWS).
http://www.xen.org/

Service-oriented Technologies for the Cloud
• Web services key for user access through the Internet.
Principal Cloud service categories
• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)
Others in literature: Communication as a Service (CaaS), Monitoring as a Service (MaaS), Network Cloud services, Datacenter Cloud services, Compute and Storage Cloud services, Business Application Cloud services, Database as a Service, …

Infrastructure as a Service (IaaS)
“Deliver computer infrastructure – typically a platform virtualization environment – as a service, along with raw (block) storage and networking.”*
In IaaS, customers rent computing resources rather than purchase them, and access the resources through a (Web) service infrastructure.
Service typically billed monthly on a usage basis.
Example: Amazon EC2 (see later)
* http://en.wikipedia.org/wiki/Cloud_computing

Infrastructure as a Service (IaaS)
Advantages
• Access to preconfigured environment
• Use of latest technology
• Reduced cost and risk by having a third party maintain resources
• No capital investment
• No IT personnel needed to maintain remote hardware/software
• Able to manage peak demand as needed without having to purchase a larger system that would be underutilized at other times
• Secure – security handled by provider
Disadvantages: Delays in network (Internet), confidential data concerns, …
Discuss

Software as a Service (SaaS)
“Software delivery model in which software and its associated data are hosted centrally (typically in the (Internet) cloud) and are typically accessed by users using a thin client, normally using a web browser over the Internet.”*
• Customer pays for access to software that is installed on the provider’s remote computing resources, typically paid for on a subscription licensing model.
• Many business applications (accounting, email, management software, …) are suitable for SaaS
SaaS example: Google Docs
* http://en.wikipedia.org/wiki/Software_as_a_Service

Software as a Service (SaaS)
Advantages
• Relieves businesses of maintaining software – updates, licenses, keeping multiple copies consistent, etc.
• Since access is through a web browser, software can be accessed from anywhere (globally) – mobile devices, etc.
• Facilitates internal collaboration
• Compatible data – all users use the same software version
• No or less dedicated application programming
Disadvantages?
Discuss

Salesforce.com
Formed in 1999, focusing on Software as a Service (SaaS) and customer relationship management (CRM)
CRM – “strategy for managing a company’s interactions with customers, clients and sales prospects… technology to organize, automate, and synchronize business processes—principally sales activities, but also … marketing, customer service, and technical support.”
http://en.wikipedia.org/wiki/Customer_Relationship_Management

Platform as a Service (PaaS)
“The delivery of a computing platform and solution stack as a service.”*
Unlike IaaS, generally not concerned with creating your own application software or selecting an OS.
Users may develop their own web client interfaces.
Advantages derive from SaaS?
Integrated software solution
PaaS example: Google App Engine
* http://en.wikipedia.org/wiki/Platform_as_a_service

Major Cloud Providers

Amazon Web Services (AWS)
Amazon started as an on-line bookstore in 1994/5.
Large server farms for their online business led to offering servers to users through Amazon Web Services (AWS) in 2006.
Google moved into cloud computing in the same way, having large available server farms.

• Amazon led cloud deployment with their AWS
• They realized their large underutilized data centers could be put to good use by providing cloud computing to customers.
• AWS – a collection of remote computing (web) services offered over the Internet (HTTP with REST/SOAP protocols)
• Notable:
• Amazon EC2 – Amazon Elastic Compute Cloud – rent virtual computers to run your own applications. Launched 2006. Full production in 2008.
• Amazon S3 – Amazon Simple Storage Service – provides storage through web service interfaces. Launched 2006.
http://en.wikipedia.org/wiki/Amazon_Web_Services
http://en.wikipedia.org/wiki/Amazon_EC2
http://en.wikipedia.org/wiki/Amazon_S3

Amazon Elastic Compute Cloud (EC2)
• Uses Xen virtualization to create an instance
• Various packaged instances, see next
• Computing power defined by EC2 Compute Unit (ECU)
– One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
– 33.5 EC2 Compute Units = 2 x Intel Xeon X5570, quad-core “Nehalem” architecture
• “Elastic” implies can quickly grow and shrink available computing power (within minutes) – user has to use AWS APIs and commands to do this?

AWS instances (2011)
http://aws.amazon.com/ec2/
• Standard Instances
– Small Instance (Default): 1.7 GB memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB local instance storage, 32-bit platform
– Large Instance: 7.5 GB memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB local instance storage, 64-bit platform
– Extra Large Instance: 15 GB memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB local instance storage, 64-bit platform
• Micro Instances – to add burst capacity
• High-Memory Instances – increased memory
• High-CPU Instances – increased CPU performance
• Cluster Compute Instances – cluster configurations
• Cluster GPU Instances – GPU cluster configurations

Amazon Simple Storage Service (S3)
Provides storage through web service interfaces.
Launched 2006
Data organization
– Write/read/delete objects (1 byte to 5 TB each)
– Each object is stored in a bucket and retrieved by a unique developer-assigned key
– Buckets stored in one of several regions: US Standard, EU (Ireland), US West (Northern California), Asia Pacific (Singapore), Asia Pacific (Tokyo)
– Objects kept in one region (unless you transfer them out)
– Authentication mechanisms – private, public, or rights granted to specific users
http://aws.amazon.com/s3/

Amazon Simple Storage Service (S3) continued
– REST or SOAP interfaces.
– Access using HTTP (BitTorrent available)
– Reliability: defined in Service Level Agreement: Monthly Uptime Percentage of at least 99.9% during any monthly billing cycle.
– Data stored on multiple devices. 99.999999999% durability (will survive permanently) and 99.99% availability of objects over a given year.
http://aws.amazon.com/s3/

S3 costs (2011)
Sliding scale. Briefly:
Storage
First 1 TB / month        $0.140 per GB    $0.093 per GB
….
Over 5000 TB / month      $0.055 per GB    $0.037 per GB
Data transfer costs
None within region or into region via HTTP COPY request
Out of region charged:
First 1 GB / month        $0.000 per GB
Up to 10 TB / month       $0.120 per GB
…
New AWS customers receive 5 GB S3 storage, 20,000 Get Requests, 2,000 Put Requests, and 15 GB data transfer out each month for one year.

Microsoft Azure
Microsoft jumped into cloud computing with Azure cloud software in 2008.
Apart from software for Windows platforms, provides data centers in US, Europe and Asia.
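Returning to the S3 costs (2011) slide: the sliding scale can be turned into a small monthly estimate. The Python sketch below uses only the prices shown on that slide (the first of the two storage prices for the first 1 TB, and the two data-transfer-out tiers), and refuses inputs beyond those tiers because the intermediate tiers are elided there. The function name, the 1 TB = 1024 GB conversion, and the choice to ignore request charges and the new-customer free allowance are assumptions made for illustration.

```python
# Prices copied from the 2011 S3 cost slide; only the tiers listed there are modelled.
STORAGE_PRICE_FIRST_TB = 0.140  # $ per GB per month, first 1 TB (first price listed)
TRANSFER_FREE_GB = 1            # first 1 GB transferred out per month is free
TRANSFER_PRICE = 0.120          # $ per GB transferred out, up to 10 TB per month
GB_PER_TB = 1024                # assumption: 1 TB counted as 1024 GB


def monthly_s3_cost(stored_gb: float, transferred_out_gb: float) -> float:
    """Estimate one month's S3 bill in dollars for storage plus transfer out."""
    if stored_gb > 1 * GB_PER_TB or transferred_out_gb > 10 * GB_PER_TB:
        raise ValueError("outside the price tiers listed on the slide")
    storage = stored_gb * STORAGE_PRICE_FIRST_TB
    billable_out = max(0.0, transferred_out_gb - TRANSFER_FREE_GB)
    transfer = billable_out * TRANSFER_PRICE
    return round(storage + transfer, 2)


if __name__ == "__main__":
    # 100 GB stored and 50 GB transferred out in one month
    print(monthly_s3_cost(100, 50))
```

For 100 GB stored and 50 GB transferred out, this gives 100 x $0.140 = $14.00 for storage plus 49 x $0.120 = $5.88 for transfer, $19.88 in total.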
http://www.microsoft.com/windowsazure/features/

Eucalyptus

Some key Cloud Computing Issues
• Privacy and security
• Compliance and legal
• Performance, availability, durability, …
• Standards

Using Cloud computing in Distributed High Performance Computing (HPC)
AWS EC2 provides instances for HPC:
• High-CPU Instances
– High-CPU Medium Instance: 1.7 GB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of local instance storage, 32-bit platform
– High-CPU Extra Large Instance: 7 GB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform
• Cluster Compute Instances
– Cluster Compute Quadruple Extra Large: 23 GB memory, 33.5 EC2 Compute Units, 1690 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• Cluster GPU Instances
– Cluster GPU Quadruple Extra Large: 22 GB memory, 33.5 EC2 Compute Units, 2 x NVIDIA Tesla “Fermi” M2050 GPUs, 1690 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.

MapReduce
Introduced by Google in 2004 “to support distributed computing on large data sets on clusters of computers.”*
Computational processing occurs using two basic steps:
“Map step: The master node takes the input, partitions it up into smaller sub-problems, and distributes those to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes that smaller problem, and passes the answer back to its master node.
“Reduce step: The master node then takes the answers to all the subproblems and combines them in some way to get the output – the answer to the problem it was originally trying to solve.”*
Used at Google. Apache Hadoop is an implementation of MapReduce.
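The two steps quoted above can be illustrated with the usual word-count example. The following is a minimal single-process Python sketch, not Hadoop code: the input chunks stand in for the sub-problems handed to worker nodes, and the function names are hypothetical. The grouping ("shuffle") stage between map and reduce is not mentioned in the quote; it is included here as an assumption, since the combining step needs all values for one key gathered together.

```python
from collections import defaultdict
from functools import reduce


def map_step(chunk: str) -> list:
    """Worker: emit a (word, 1) pair for every word in one chunk of the input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs) -> dict:
    """Group emitted values by key so each word can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(values: list) -> int:
    """Combine the partial counts for one word into a single total."""
    return reduce(lambda a, b: a + b, values, 0)


def word_count(chunks: list) -> dict:
    """Master: map every chunk, group the results, then reduce each group."""
    mapped = [pair for chunk in chunks for pair in map_step(chunk)]
    return {word: reduce_step(vals) for word, vals in shuffle(mapped).items()}


if __name__ == "__main__":
    print(word_count(["the cloud", "the grid and the cloud"]))
```

In a real deployment each call to `map_step` would run on a different worker node and the framework would handle partitioning, grouping and failures; here everything runs in one loop so that the data flow is easy to follow.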
MapReduce using Hadoop is available in AWS as “Amazon Elastic MapReduce”
* http://en.wikipedia.org/wiki/Mapreduce

Questions?

Quiz Questions
What type of Cloud service is Amazon Web Services (AWS)?
a) Infrastructure as a Service (IaaS)
b) Platform as a Service (PaaS)
c) Software as a Service (SaaS)
d) Other

What type of Cloud service is Microsoft Azure?
a) Infrastructure as a Service (IaaS)
b) Platform as a Service (PaaS)
c) Software as a Service (SaaS)
d) Other

Reading materials
Cloud fundamentals:
Michael Armbrust et al., “Above the Clouds: A Berkeley View of Cloud Computing,” Technical Report No. UCB/EECS-2009-28 http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html
READING ASSIGNMENT: READ ABOVE TECHNICAL REPORT – to discuss
Video: CS309A Cloud Computing, Stanford University, http://myvideos.stanford.edu/player/slplayer.aspx?course=CS309A&p=true
ASSIGNMENT: WATCH ABOVE VIDEO – to discuss

Reading materials
Books:
“Cloud Computing Implementation, Management and Security” by J. W. Rittinghouse and J. F. Ransome, CRC Press, 2010. ISBN 978-1-4398-0680-7.
“Cloud Computing Explained 2nd ed.” by J. Rhoton, Recursive Press, 2010, ISBN 978-0-9563556

HPC:
C. Evangelinos and C. N. Hill, “Cloud Computing for parallel Scientific HPC Applications: Feasibility of running Coupled Atmosphere-Ocean Climate Models on Amazon’s EC2,” CCA-08.
MapReduce/Hadoop …