CS 525 Advanced Distributed Systems
Spring 2011
Indranil Gupta (Indy)
Lecture 2
What(’s in) the Cloud?
January 20, 2011
All Slides © IG

Yeah! That’s what I’d like to know.

Clouds are Water Vapor
• Oracle has a Cloud Computing Center.
• And yet…
• Larry Ellison’s Rant on Cloud Computing

The Hype!
• Gartner - Cloud computing revenue will soar faster than expected and will exceed $150 billion within five years.
• Forrester - Cloud-Based Email Is Often Cheaper Than On-Premise Email
• Vivek Kundra, Federal CIO in the Obama administration: “Growing adoption of cloud computing could improve data sharing and promote collaboration among federal, state and local governments.” E.g.: fedbizopps.gov
• Merrill Lynch: “By 2011 the volume of cloud computing market opportunity would amount to $160bn, including $95bn in business and productivity apps (email, office, CRM, etc.) and $65bn in online advertising.”
• IDC: “Spending on IT cloud services will triple in the next 5 years, reaching $42 billion and capturing 25% of IT spending growth in 2012.”
Sources: http://www.infosysblogs.com/cloudcomputing/2009/08/the_cloud_computing_quotes.htm and http://www.mytestbox.com

Ha ha hype! It’s a bunch of tripe, since probably no one is making or saving money.

$$$
• Ingo Elfering, Vice President of Information Technology Strategy, GlaxoSmithKline: “With Online Services, we are able to reduce our IT operational costs by roughly 30% of what we’re spending now and introduce a variable cost subscription model for these technologies that allows us to more rapidly scale or divest our investment as necessary as we undergo a transformational change in the pharmaceutical industry.”
• Jim Swartz, CIO, Sybase: “At Sybase, a private cloud of virtual servers inside its data centre has saved nearly $US2 million annually since 2006, Swartz says, because the company can share computing power and storage resources across servers.”
• Dave Powers, Associate Information Consultant at Eli Lilly and Company: “With AWS, Powers said, a new server can be up and running in three minutes (it used to take Eli Lilly seven and a half weeks to deploy a server internally) and a 64-node Linux cluster can be online in five minutes (compared with three months internally). The deployment time is really what impressed us. It’s just shy of instantaneous.”
Sources: http://www.infosysblogs.com/cloudcomputing/2009/08/the_cloud_computing_quotes.htm and http://www.mytestbox.com

Alright, alright. But for heaven’s sake, can someone tell me what is a cloud?

What is a Cloud?
• It’s a cluster! It’s a supercomputer! It’s a datastore!
• It’s Superman!
• None of the above
• All of the above
• Cloud = Lots of storage + compute cycles nearby

What is a Cloud?
• A single-site cloud (aka “Datacenter”) consists of
  – Compute nodes (split into racks)
  – Switches, connecting the racks
  – A network topology, e.g., hierarchical
  – Storage (backend) nodes connected to the network
  – Front-end for submitting jobs
  – Services: physical resource set, software services
• A geographically distributed cloud consists of
  – Multiple such sites
  – Each site perhaps with a different structure and services

A Sample Cloud Topology
[Figure: a Core Switch at the top, Top of the Rack Switches below it, and the Servers of each Rack at the bottom. Annotation on the links: if higher bandwidth link, then a “fat tree” topology.]

Scale of Industry Datacenters
• Microsoft [NYTimes, 2008]
  – 150,000 machines
  – Growth rate of 10,000 per month
  – Largest datacenter: 48,000 machines
  – 80,000 total running Bing
• Yahoo! [Hadoop Summit, 2009]
  – 25,000 machines
  – Split into datacenters of 4000 machines each
• AWS EC2 (Oct 2009)
  – 40,000 machines
  – 8 cores/machine
• Google
  – (Rumored) several hundreds of thousands of machines

OK, they are massive. But it is still called a “cluster”! And that’s not a new concept!

“A Cloudy History of Time” © IG 2010
[Figure: timeline from 1940 to 2010. Labels: The first datacenters!; Timesharing Companies & Data Processing Industry; Clusters; Grids; PCs (not distributed!); Peer to peer systems; Clouds and datacenters.]
Annotations on the timeline:
• First large datacenters: ENIAC, ORDVAC, ILLIAC. Many used vacuum tubes and mechanical relays
• Data Processing Industry - 1968: $70 M. 1978: $3.15 Billion.
• Timesharing Industry (1975):
  – Market Share: Honeywell 34%, IBM 15%, Xerox 10%, CDC 10%, DEC 10%, UNIVAC 10%
  – Honeywell 6000 & 635, IBM 370/168, Xerox 940 & Sigma 9, DEC PDP-10, UNIVAC 1108
• Berkeley NOW Project; Supercomputers; Server Farms (e.g., Oceano)
• P2P Systems (90s-00s):
  – Many Millions of users
  – Many GB per day
• Grids (1980s-2000s):
  – GriPhyN (1970s-80s)
  – Open Science Grid and Lambda Rail (2000s)
  – Globus & other standards (1990s-2000s)
• Clouds

Why did all of this happen?
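
Looking back at the single-site cloud and sample topology slides, here is a rough sketch of that structure, assuming exactly one core switch and one top-of-the-rack switch per rack. The names `Server`, `build_site`, and `switches_on_path`, and the choice of 4 racks of 40 servers, are made up for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    rack: int  # index of the rack, and of its top-of-the-rack switch
    slot: int  # position of the server inside that rack


def build_site(num_racks: int, servers_per_rack: int) -> list[Server]:
    """Lay out one single-site cloud as racks of identical servers."""
    return [Server(r, s) for r in range(num_racks) for s in range(servers_per_rack)]


def switches_on_path(a: Server, b: Server) -> int:
    """Count the switches a message crosses between two servers."""
    if a == b:
        return 0  # same machine, nothing to cross
    if a.rack == b.rack:
        return 1  # only the shared top-of-the-rack switch
    return 3  # source rack switch, core switch, destination rack switch


# Hypothetical site: 4 racks of 40 servers each.
site = build_site(num_racks=4, servers_per_rack=40)
print(len(site))                            # total servers in the site
print(switches_on_path(site[0], site[1]))   # two servers in the same rack
print(switches_on_path(site[0], site[40]))  # two servers in different racks
```

In this simplified layout every cross-rack message passes through the core switch, which is one way to see why a higher-bandwidth link toward the core can be attractive.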

Trends: Technology
• Doubling Periods – storage: 12 mos, bandwidth: 9 mos, and (what law is this?) cpu capacity: 18 mos
• Then and Now
  Bandwidth
  – 1985: mostly 56Kbps links nationwide
  – 2004: 155 Mbps links widespread
  Disk capacity
  – Today’s PCs have 100 GB, same as a 1990 supercomputer

Trends: Users
• Then and Now
  Biologists:
  – 1990: were running small single-molecule simulations
  – 2004: want to calculate structures of complex macromolecules, want to screen thousands of drug candidates, sequence very complex genomes
  Physicists:
  – 2008 onwards: CERN’s Large Hadron Collider will produce 700 MB/s or 15 PB/year
• Trends in Technology and User Requirements: Independent or Symbiotic?

Prophecies
In 1965, MIT's Fernando Corbató and the other designers of the Multics operating system envisioned a computer facility operating “like a power company or water company”.
Plug your thin client into the computing Utility and Play your favorite Intensive Compute & Communicate Application
– [Have today’s clouds brought us closer to this reality?]

So, clouds have been around for decades! But aside from massive scale what’s new about today’s cloud computing?!

What(’s new) in Today’s Clouds?
Three major features:
I. On-demand access: Pay-as-you-go, no upfront commitment.
  – Anyone can access it (e.g., Washington Post – Hillary Clinton example)
II. Data-intensive Nature: What was MBs has now become TBs.
  – Daily logs, forensics, Web data, etc.
  – Do you know the size of Wikipedia dump?
III. New Cloud Programming Paradigms: MapReduce/Hadoop, Pig Latin, DryadLINQ, Swift, and many others.
  – High in accessibility and ease of programmability
Combination of one or more of these gives rise to novel and unsolved distributed computing problems in cloud computing.

I. On-demand access: *aaS Classification
On-demand: renting a cab vs (previously) renting a car, or buying one.
E.g.:
  – AWS Elastic Compute Cloud (EC2): $0.086-$1.16 per CPU hour
  – AWS Simple Storage Service (S3): $0.055-$0.15 per GB-month
• HaaS: Hardware as a Service
  – You get access to barebones hardware machines, do whatever you want with them
  – Ex: Your own cluster, Emulab
• IaaS: Infrastructure as a Service
  – You get access to flexible computing and storage infrastructure. Virtualization is one way of achieving this. Often said to subsume HaaS.
  – Ex: Amazon Web Services (AWS: EC2 and S3), Eucalyptus, RightScale.
• PaaS: Platform as a Service
  – You get access to flexible computing and storage infrastructure, coupled with a software platform (often tightly)
  – Ex: Google’s AppEngine
• SaaS: Software as a Service
  – You get access to software services, when you need them. Often said to subsume SOA (Service Oriented Architectures).
  – Ex: Microsoft’s LiveMesh, MS Office on demand

II. Data-intensive Computing
• Computation-Intensive Computing
  – Example areas: MPI-based, High-performance computing, Grids
  – Typically run on supercomputers (e.g., NCSA Blue Waters)
• Data-Intensive
  – Typically store data at datacenters
  – Use compute nodes nearby
  – Compute nodes run computation services
• In data-intensive computing, the focus shifts from computation to the data: CPU utilization is no longer the most important resource metric
• Problem areas include
  – Distributed systems
  – Middleware
  – OS
  – Storage
  – Networking
  – Security
  – Others

III. New Cloud Programming Paradigms
Dataflow programming frameworks
• Google: MapReduce and Sawzall
• Yahoo: Hadoop and Pig Latin
• Microsoft: DryadLINQ
• Facebook: Hive
• Amazon: Elastic MapReduce service (pay-as-you-go)

• Google (MapReduce)
  – Indexing: a chain of 24 MapReduce jobs
  – ~200K jobs processing 50PB/month (in 2006)
• Yahoo! (Hadoop + Pig)
  – WebMap: a chain of 100 MapReduce jobs
  – 280 TB of data, 2500 nodes, 73 hours
• Facebook (Hadoop + Hive)
  – ~300TB total, adding 2TB/day (in 2008)
  – 3K jobs processing 55TB/day
• Similar numbers from other companies, e.g., Yieldex, eharmony.com, etc.

This is all confusing. Can you give me some examples of clouds?

Two Categories of Clouds
• Industrial Clouds
  – Can be either a (i) public cloud, or (ii) private cloud
  – Private clouds are accessible only to company employees
  – Public clouds provide service to any paying customer:
    • Amazon S3 (Simple Storage Service): store arbitrary datasets, pay per GB-month stored
    • Amazon EC2 (Elastic Compute Cloud): upload and run arbitrary images, pay per CPU hour used
    • Google AppEngine: develop applications within their appengine framework, upload data which is then imported into their format, and run
• Academic Clouds
  – Allow researchers to innovate, deploy, and experiment
  – Google-IBM Cloud (U. Washington): run apps programmed atop Hadoop
  – Cloud Computing Testbed (CCT @ UIUC): first cloud testbed to support systems research. Runs: (i) apps programmed atop Hadoop and Pig, (ii) systems-level research on this first generation of cloud computing models (~HaaS), and (iii) Eucalyptus services (~AWS EC2). http://cloud.cs.illinois.edu
  – OpenCirrus: first federated cloud testbed. http://opencirrus.org

Academic Clouds
• CCT = Cloud Computing Testbed
  – NSF infrastructure
  – Used by 10+ NSF projects, including several non-UIUC projects
  – Housed within Siebel Center (4th floor!)
  – Accessible to students of CS525!
    • Almost half of the SP09/SP10 course used CCT for their projects
• OpenCirrus = Federated Cloud Testbed
  – Contains CCT and other sites
• If you need a CCT account for your CS525 experiment, let me know asap!
There are a limited number of these available for CS525.

Cloud Computing Testbed (CCT)

CCT Hardware in more Detail
• 128 compute nodes = 64+64
• 500 TB & 1000+ shared cores

Goal of CCT: Support both Systems Research and Applications Research in Data-intensive Distributed Computing

CCT Software Services
Accessing and Using CCT:
I. Systems Partition (64-8 nodes):
  – CentOS machines
  – Dedicated access to a subset of machines (~ Emulab), with sudo access
  – User accounts
    • User requests # machines (<= 64) + storage quota (<= 30 TB)
    • Machine allocation survives for 4 weeks, storage survives for 6 months (both extendible)
II. Hadoop/Pig Partition and Service (64 nodes):
  – Looks like a regular shared Hadoop cluster service
    • Users share 64 nodes. Individual nodes not directly reachable.
    • 4 slots per machine
    • Several users are reporting stable operation at 256 instances
    • During Spring 09/10, 10+ projects running simultaneously
  – User accounts
    • User requests account + storage quota (<= 30 TB)
    • Storage survives for 6 months (extendible)
III. Eucalyptus Partition (8 nodes):
    • Based on open-source version of Eucalyptus from UCSB (Rich Wolski)
    • Exports same interface as AWS EC2 and S3.

• Some Services running inside CCT
  – ZFS: backend file system.
  – Zenoss: Systems Monitoring. Shared with department’s other computing clusters
  – Hadoop + HDFS
  – Ability to make datasets publicly available
• How do users request an account: two-stage process (go to http://cloud.cs.illinois.edu )
  1. User account request – requires background check
  2. Allocation request

Open Cirrus Federation
[Figure: founding 6 sites]
• First open federated cloud testbed
• Shared: research, applications, infrastructure (9*1,000 cores), data sets
• Global services: sign on, monitoring, store, etc.
• Federated clouds, meaning each is different
[Figure: map of sites. Labels: RAS, Intel, HP, KIT (de), ETRI, Yahoo, UIUC, CMU, IDA (sg), MIMOS]
Grown to 9 sites, with more to come

OK, so that’s what a cloud looks like today. Now, suppose I want to start my own company, Devils Inc. Should I buy a cloud and own it, or should I outsource to a public cloud?

We’ll do that next week…
• For now, it’s important to start thinking of who’s on your project team…

Projects
• Groups of 2 (need not be same as presentation groups). Could be 3.
• We’ll start detailed discussions “soon” (a few classes into the student-led presentations)

Entr. Tidbits: Selecting your Team
• Selecting your partner is important: select someone with a complementary personality to yours!
  – Apple: Wozniak loved being an engineer and hated interacting with people; Jobs loved making calls and doing sales, and preferred engineering much less
  – Flickr: Stewart was improvisational, Fake was goal-driven
  – PayPal: Levchin loved to program and break things, Thiel talked to VCs and did sales.
  – Hansson says that development of Ruby on Rails benefited from having a small team and a small budget that kept them focused – this is why the big giants could not beat them.

Next Week
• We will continue discussion of cloud computing
  – How MapReduce works
  – What are PlanetLab and Emulab
  – What is Grid computing
• Then we will start to discuss Basics of P2P systems
• Please read at least one paper from each session

Administrative Announcements
Student-led paper presentations (see instructions on website)
• Start from February 10th
• Groups of up to 2 students present each class, responsible for a set of 3 “Main Papers” on a topic
  – 45 minute presentations (total) followed by discussion
  – Set up appointment with me to show slides by 5 pm day prior to presentation
  – Select your topic by Jan 31st
• List of papers is up on the website
• Each of the other students (non-presenters) is expected to read the papers before class and turn in a one to two page review of any two of the main set of papers (summary, comments, criticisms and possible future directions)
  – Email review and bring in hardcopy before class
  – Reviews are not due until student presentations start
  – Submit reviews for any 15 sessions (from 2/10 to 4/28)