An Introduction to Cloud-based Services
Paul Watson
Newcastle University, UK
paul.watson@ncl.ac.uk

Plan
• What is Cloud Computing?
• Potential Advantages
• Lessons from our own experiences
• Cloud Issues

What is Cloud Computing?
“.. a broad array of web-based services aimed at allowing users to obtain a wide range of functional capabilities on a ‘pay-as-you-go’ basis that previously required tremendous hardware/software investments and professional skills to acquire.”
Irving Wladawsky-Berger

What’s New?
• Illusion of infinite computing resources on demand
• No up-front commitment by users
• Pay for use of resources on a short-term basis as needed
(from “Above the Clouds: A Berkeley View of Cloud Computing”)

Example – Amazon Web Services
• Based on Xen VMs – run any OS & software stack
• CPU: 1.0 GHz x86 instance @ $0.10 /hour
• Blob Storage @ $0.12 /GB month
• External Data Transfer @ $0.10 /GB
• Also queue, key store, block store, range of instances

Why is this Important (I): Internal IT Problems
(slide by permission of Arjuna Technologies)
[Figure: Dynamic Business Demand against Static IT Supply. Two plots of Resources against Time compare Demand with Capacity, marking Over-provision and Under-provision. Labels: “Silos = Inflexibility”, “New demand”, “Extinct demand”.]

Why is this Important (II)? Time to put Ideas into action
Research
1. Have good idea
2. Write proposal
3. Wait 6 months
4. If successful..
5. Buy Computers
6. Install Computers
7. Start Work
Science Start-ups
1. Have good idea
2. Write Business Plan
3. Ask VCs to fund
   If successful..
4. Buy computers
5. Install Computers
6. Start Work

Why is this a Good idea: using commercial clouds
1. Have good idea
2. Grab nodes as needed from Cloud provider
3. Start Work
4. Pay for what you used

Cloud Services Continuum (based on Robert Anderson)
http://et.cairene.net/2008/07/03/cloud-services-continuum/
[Figure: continuum of Software (SaaS), Platform (PaaS) and Infrastructure (IaaS), with axes Complexity and Flexibility. Examples shown: Salesforce.com, Google Docs, Google AppEngine, Microsoft Azure, Amazon EC2 & S3.]

Example: Lessons from CARMEN Project
• Design began in 2006
  – Commercial clouds not an option
• Designed own “private” cloud
• Experimenting with Commercial Cloud

CARMEN Project
UK EPSRC e-Science Pilot
£4M (2006-10)
20 Investigators
Stirling, St. Andrews, Newcastle, Manchester, York, Sheffield, Leicester, Warwick, Cambridge, Plymouth, Imperial
Industry & Associates

Research Challenge
Understanding the brain is the greatest informatics challenge
• Enormous implications for science:
  • Medicine
  • Biology
  • Computer Science

Collecting the Evidence
100,000 neuroscientists generate huge quantities of data
  – molecular (genomic/proteomic)
  – neurophysiological (time-series activity)
  – anatomical (spatial)
  – behavioural

Epilepsy Exemplar
Data analysis guides surgeon during operation
Further analysis provides evidence

WARNING! The next 2 Slides show an exposed human brain

CARMEN enables sharing and collaborative exploitation of data, analysis code and expertise that are not physically collocated

CARMEN e-Science Requirements
• Store
  – very large quantities of data (100TB+)
• Analyse
  – suite of neuroinformatics services
  – support data intensive analysis
• Automate
  – workflow
• Share
  – under user-control

Background: North East Regional e-Science Centre
• 25 Research Projects across many domains:
  • Bioinformatics, Ageing & Health, Neuroscience, Chemical Engineering, Transport, Geomatics, Video Archives, Artistic Performance Analysis, Computer Performance Analysis,....
• Same key needs: Store, Analyse, Automate, Share

Result: e-Science Central
• Integrated Store-Analyse-Automate-Share infrastructure
• Generic – CARMEN neuroinformatics & chemistry as pilots

e-Science Central
Software as a Service
• Web based
• Works anywhere
Cloud Computing
• Dynamic Resource Allocation
• Pay-as-you-Go*
Social Networking
• Controlled Sharing
• Collaboration
• Communities

Science Cloud Architecture
• Access over Internet (typically via browser)
• Upload data & services
• Run analyses
• Data storage and analysis

Science Cloud Options
[Figure: two stacks serving Users and Service Developers. Labels: “Science App 1 .... Science App n” (in both stacks), “Science Platform”, “Cloud Infrastructure: Storage & Compute” (in both stacks).]

e-Science Central
[Figure: layered architecture. Labels: “App .... App”, “API”, “e-Science Central”, “Security”, “Analysis Services”, “Social Networking”, “Workflow Enactment”, “Processing”, “Storage”, “Science Cloud Platform”, “Cloud Infrastructure”.]

Editing and Running a Workflow on the Web
[Screenshot. Labels: “Workflow”, “Result File”.]

Viewing the output of Workflow Runs
[Screenshot. Label: “Viewing results”.]

Blogs and links
[Screenshot. Labels: “Communicating Results”, “Linking to results & workflows”.]

What we learnt: Moving into a Cloud
• Moving existing technologies into a cloud can be difficult
  – some can’t run in a Cloud at all
[Screenshot: Raw Data Exploration with Signal Data Explorer]

What we learnt: Scalability
• Clouds offer the potential for scalability
  – grab compute power only when needed
• Developers have to manage scalability
  – for Infrastructure as a Service Clouds
  – scale up as well as down

Adaptive Dynamic Deployment with Dynasoar
[Figure: response time (seconds) and processors in pool, plotted against arrival rate (messages per second).]
• Adding processors as you need them optimises resources and saves money in pay-as-you-go clouds
• Ensure system can also release unwanted nodes
• Commercial “pay-as-you-go” clouds would allow us to avoid this limit

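
The scalability points above (grab processors only when needed, and release them again) can be sketched as a small pool-sizing rule. This is a minimal illustration under stated assumptions, not Dynasoar's algorithm: the function names, the per-node throughput, the headroom factor and the pool limits are all hypothetical, and the arrival rates in the demo loop are made-up values rather than measurements from the chart.

```python
import math


def target_pool_size(arrival_rate, per_node_rate,
                     min_nodes=1, max_nodes=16, headroom=1.25):
    """Return how many processors the pool should hold for a given load.

    arrival_rate  -- incoming messages per second
    per_node_rate -- messages per second one processor can serve (assumed)
    headroom      -- safety factor applied to the load (assumed)
    """
    if per_node_rate <= 0:
        raise ValueError("per_node_rate must be positive")
    needed = math.ceil(arrival_rate * headroom / per_node_rate)
    # Clamp to the pool limits: never below min_nodes, never above max_nodes.
    return max(min_nodes, min(max_nodes, needed))


def rebalance(current_nodes, arrival_rate, per_node_rate, **limits):
    """Return (new_pool_size, action); action is 'acquire', 'release' or 'hold'."""
    target = target_pool_size(arrival_rate, per_node_rate, **limits)
    if target > current_nodes:
        return target, "acquire"   # scale up: grab nodes from the provider
    if target < current_nodes:
        return target, "release"   # scale down: stop paying for idle nodes
    return target, "hold"


if __name__ == "__main__":
    pool = 1
    # Hypothetical load that ramps up and then falls away again.
    for rate in [0.03, 0.13, 0.5, 1.0, 0.25, 0.03]:
        pool, action = rebalance(pool, rate, per_node_rate=0.1)
        print(f"arrival rate {rate:>4} msg/s -> {action:7} -> pool of {pool}")
```

The `max_nodes` clamp stands in for the fixed limit of a private pool; on a pay-as-you-go cloud it could be raised, at which point the `release` branch is what keeps the bill down once the load drops.
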
Microsoft Azure Cloud for e-Science Demo
• Recent experiments with Microsoft Azure Cloud
  – running Chemical analyses
  – Silverlight App
Thanks to:
- Paul Appleby & Team at the Microsoft Technology Centre, Reading
- & MS External Research e-Science Group

Microsoft Azure Cloud Demo

When not to use Clouds?
• Large data transfers
  – Time & Cost
• High Performance
  – cpu/io/network bandwidth/low latency
• Predictable Performance
• Confidentiality
• High Availability?
• High Server Utilisation?
  – private clouds better?

Create Private Cloud
(slides by permission of Arjuna Technologies)
[Figure: Dynamic Business Demand against Agile IT Supply with Arjuna AGILITY. Two plots of Resources against Time show Capacity and Demand. Label: “New demand”.]

Private Cloud Examples
• Eucalyptus – Amazon API
• Private Cloud deployments of Microsoft Azure
• Arjuna Agility

Federating Private & Public Clouds
[Figure, shown in two steps: an Internal Cloud (Dept A, Dept B) and Public Clouds (e.g. Amazon, e.g. FlexiScale) linked through Arjuna Agility under a Service Agreement. Labels: “App1”, “App1 & 2”.]

Summary
• Cloud computing can revolutionise e-science
  – provide sustainable infrastructure
  – reduce time from idea to realisation
• Don’t underestimate complexity
  – building scalable distributed systems is still hard
  – can Science Clouds help by lowering the hurdles?
• e-Science Central
  – Store-Analyse-Automate-Share e-science platform
  – adding content from a range of domains
  – CARMEN is evaluating it for neuroinformatics
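
As a rough worked example of the “Large data transfers – Time & Cost” point, the sketch below applies the prices quoted on the Amazon Web Services slide to a CARMEN-sized data set of 100 TB. The prices are only those shown on that slide; the decimal unit conversion (1 TB = 1000 GB), the 100 Mbit/s link speed and all function names are assumptions made for illustration.

```python
# Prices as quoted on the "Example – Amazon Web Services" slide (US dollars).
INSTANCE_PER_HOUR = 0.10      # 1.0 GHz x86 instance
STORAGE_PER_GB_MONTH = 0.12   # blob storage
TRANSFER_PER_GB = 0.10        # external data transfer

GB_PER_TB = 1000              # assumption: decimal units


def transfer_cost(terabytes):
    """Cost in dollars of moving the data across the external link once."""
    return terabytes * GB_PER_TB * TRANSFER_PER_GB


def storage_cost_per_month(terabytes):
    """Cost in dollars of keeping the data in blob storage for one month."""
    return terabytes * GB_PER_TB * STORAGE_PER_GB_MONTH


def compute_cost(instances, hours):
    """Cost in dollars of running a number of instances for some hours."""
    return instances * hours * INSTANCE_PER_HOUR


def transfer_days(terabytes, link_mbit_per_s):
    """Days needed to move the data over a fully used link (assumed speed)."""
    bits = terabytes * GB_PER_TB * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 86_400


if __name__ == "__main__":
    data_tb = 100        # from the CARMEN requirement of 100TB+
    link = 100           # hypothetical 100 Mbit/s connection
    print(f"one-off transfer:  ${transfer_cost(data_tb):,.0f}")
    print(f"storage per month: ${storage_cost_per_month(data_tb):,.0f}")
    print(f"10 nodes for 24 h: ${compute_cost(10, 24):,.2f}")
    print(f"transfer time:     {transfer_days(data_tb, link):.1f} days")
```

Under these assumptions, moving and storing the data dominates the bill, while a day of compute on ten instances costs comparatively little. The transfer time of roughly three months on the assumed link is the kind of constraint the “When not to use Clouds?” slide warns about.
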