Summary of Existing Cloud Benchmark Efforts and Proposed Next Steps for RG Cloud
Michael Faber (michael.faber@kit.edu), Samuel Kounev (kounev@kit.edu)
http://descartes.ipd.kit.edu | http://research.spec.org
Software Design and Quality Group, Institute for Program Structures and Data Organization, Faculty of Informatics
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association, www.kit.edu
31.08.2011

Agenda
• Definitions and mission statement
• Summary of presentations so far
• Cloud application types
• Existing cloud benchmarks
• Challenges for cloud benchmarking
• Proposed next steps
• Possible topic of next meeting: Taxonomy – patterns in cloud computing

Cloud Computing
Essential characteristics defined by NIST:
• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service
Different abstraction levels:
• SaaS (e.g., Salesforce.com, Google Docs)
• PaaS (e.g., MS Azure, Google AppEngine)
• IaaS (e.g., Amazon EC2, Rackspace)

OSG Cloud and RG Cloud – Mission and Charter
Current efforts in OSG Cloud and RG Cloud:
• Taxonomy of the cloud space
• Representative workloads
• Relevant metrics
OSG Cloud:
• Develop ready-to-use benchmarks
• Evaluation and comparison of cloud product offerings
RG Cloud:
• Define benchmarking scenarios at a higher abstraction level
• Evaluation of early prototypes and research results
• In-depth quantitative analysis
• Provide a basis for building future conventional benchmarks

Benchmarks
"A benchmark is a test, or set of tests, designed to compare the performance of one computer system against the performance of others" [SPEC]
Types of benchmarks:
• Synthetic benchmarks
• Micro-benchmarks
• Program kernels
• Application benchmarks

Research Benchmarks
Targeted for use in research environments; can serve as a basis for building conventional benchmarks. A small configuration sketch illustrating the contrast follows the table.

Research Benchmark | Conventional SPEC Benchmark
Specification (higher abstraction level) | Implementation
In-depth evaluation of early prototypes and research results as well as full-blown implementations | Evaluation, comparison, and marketing of commercial products and services
Flexibility and customizability to different usage scenarios | Pre-defined workload mixes and configuration to ensure comparability
Range of possible metrics | Fixed set of run rules and metrics
Intended to have a longer lifespan | Implementation has to be updated for new product releases
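To make the flexibility contrast concrete, the sketch below shows how a research benchmark might expose workload mix, metric set, and run length as open experiment parameters, where a conventional benchmark would fix them in its run rules. All names here are hypothetical illustrations, not an actual SPEC interface.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ResearchBenchmarkConfig:
    """A research benchmark leaves workload mix, metrics, and run length open
    to the experimenter; a conventional benchmark fixes them in its run rules."""
    workload_mix: Dict[str, float]   # operation name -> proportion of requests
    metrics: List[str]               # which metrics this study will report
    duration_seconds: int = 600      # run length, also an open parameter

# The same benchmark specification instantiated for two different studies:
elasticity_study = ResearchBenchmarkConfig(
    workload_mix={"read": 0.9, "write": 0.1},
    metrics=["provisioning interval", "throughput"],
)
cost_study = ResearchBenchmarkConfig(
    workload_mix={"read": 0.5, "write": 0.5},
    metrics=["performance/price", "response time"],
    duration_seconds=3600,
)
```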
RG Cloud Group – Last Meetings
• 18.05. "Live Migration Benchmark Research" (Zhejiang University)
• 01.06. "How A Consumer Can Measure Elasticity for Cloud Platforms" (NICTA)
• 15.06. "Benchmarking Cloud Services" (SAP)
• 29.06./27.07. "Towards a Benchmark for the Cloud" (UC Berkeley)
• 10.08. "CloudCmp: Comparing Public Cloud Providers" (IBM)
• 07.09. "Virtualization on Network Performance of EC2" (Rice University)
• 21.09. OSG Cloud status (AMD)

Overview of Work Areas
(status per area on the original slide: not yet discussed / initial discussion / detailed discussion)
• Overview of existing cloud benchmarking efforts
• Cloud taxonomy – different patterns of cloud computing
• Benchmark scenarios / application types
• Workload driver
• Metrics
• Controller
• System Under Test (SUT)

Cloud Benchmark Metrics
Metrics considered so far:
• Elasticity (provisioning interval; see the measurement sketch below)
• Durability
• Response time
• Throughput
• Reliability
• Power
• Price
• Performance/Price
The degree of interest depends strongly on the target group:
• IaaS provider (power / resource utilization, …)
• End user (response time, price, …)
• …
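The provisioning interval listed above can be measured from the consumer side. The following minimal sketch assumes two hypothetical helper callables, `request_instance()` and `serves_requests()`, wrapping a provider API (e.g., EC2) and an application-level health check; it illustrates one way to operationalize the metric, not the procedure of any particular benchmark.

```python
import time

def provisioning_interval(request_instance, serves_requests,
                          poll_seconds=1.0, timeout_seconds=600.0):
    """Elasticity metric sketch: the time from issuing a scale-out request
    until the new instance actually answers application requests.

    request_instance and serves_requests are hypothetical callables wrapping
    a provider API (e.g., EC2 RunInstances) and an HTTP health check."""
    start = time.monotonic()
    handle = request_instance()          # ask the cloud for one more instance
    while time.monotonic() - start < timeout_seconds:
        if serves_requests(handle):      # booted AND serving, not just "running"
            return time.monotonic() - start
        time.sleep(poll_seconds)         # poll until ready or until timeout
    raise TimeoutError("instance did not become ready within the timeout")
```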
Cloud Application Types – Results of Literature Research
A. Data-intensive / planned batch jobs: business intelligence, data warehousing, data analytics, HPC, etc.
B. Processing pipelines: search engines (e.g., crawlers), etc.
C. Dynamic websites: e-commerce, social networks, marketing sites, education / e-learning, etc.
D. Business processing / OLTP / mission-critical applications: CRM, ERP, enterprise portals, BPM, etc.
E. Latency-sensitive applications: conferencing tools, streaming, online gaming, media broadcasting, etc.
F. Application extensions / backends for mobile communication: mobile interactive applications, extensions of compute-intensive desktop applications, etc.
G. Bandwidth- and storage-intensive applications: data storage (Dropbox, iCloud), video and photo sharing (YouTube, Flickr), etc.
H. Mail applications
I. Others: application platforms & development, testing, etc.

Cloud Application Types – Results of Discussions in OSG Cloud
Importance ratings on a scale from 1 (doesn't matter) to 7 (most important); source: OSG Cloud.
• Data Analytics – 4.43 (expert search, clustering, customer segmentation)
• Data Warehousing – 4.00 (pipelines, iterative processing)
• Business OLTP – 3.56
• Mail – 3.43
• Memory Cloud – 3.00
• Social Networking – 3.00
Further example workloads discussed: key-value pair databases, Web 2.0 based applications, write/read workloads, memory cloud, search engines.

Cloud Application Types – Categorization
Each type is listed with its rank in the OSG Cloud ratings, the IBM study*, and the Cloud & SaaS Report, followed by existing benchmarks that cover it:
• D – Business Processing (OLTP): OSG 3; IBM 2; Report 2, 3, 5, 7 – SPECjEnterprise2010, SPECjms2007, SPEC SOA, TPC-C
• A – Planned Batch Jobs (+ Data Analytics, + Data Warehousing): OSG 1, 2; Report 6 – SPECjbb2005, TPC-H, TPC-DS
• C – Websites (+ Social Networks): OSG 6; Report 4 – SPECweb2009, TPC-W, RUBiS, PetStore
• H – Mail: OSG 4 – SPECmail2009
• G – Bandwidth- and storage-intensive (+ Memory Cloud): OSG 5; IBM 4; Report 1 – SPECsfs2008, Flexible File System Benchmark, SPC-1, SPC-2
• E – Latency-sensitive (streaming, VoIP): IBM 1 – SPECsip_Infrastructure2011
*IBM rank 3: "business continuity / disaster recovery"

Benchmarks Targeted at Cloud Computing
• Yahoo YCSB
• CloudBench
• Cloudstone
• CloudCmp
• Cloudsleuth: Global Provider View, Cloud Performance Analyzer

Yahoo YCSB (dynamic websites; IaaS)
Focus on data storage and management.
• Scenario / workload: synthetic workload mixes (insert, update, read, scan); different mix distributions approximate photo tagging and social networking (user profiles, status updates, threaded conversations)
• Workload generator: YCSB Client, an extensible workload generator
• SUT: Cassandra, HBase, Yahoo!'s PNUTS, sharded MySQL
• Metrics, organized in four tiers:
  • Tier 1 – Performance (latency)
  • Tier 2 – Scaling (scaleup, elastic speedup)
  • Tier 3 – Availability (kill a server during the run)
  • Tier 4 – Replication (performance impact of replication)
A sketch of the kind of synthetic mix the YCSB Client produces follows below.
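YCSB's actual workload generator is the extensible Java YCSB Client; the Python fragment below is only a sketch of the kind of mix it produces for the social-networking scenario: mostly reads with a few updates, over keys drawn from a skewed (Zipfian-like) popularity distribution so that a few hot records attract most operations. The 95/5 split, record count, and skew constant are illustrative values, not YCSB defaults taken from the slides.

```python
import random

RECORD_COUNT = 1000   # size of the preloaded data set (illustrative)
THETA = 0.99          # skew: higher values concentrate load on fewer keys

# Precompute Zipfian-like weights once: the key of rank i gets weight 1/i^theta.
WEIGHTS = [1.0 / (i ** THETA) for i in range(1, RECORD_COUNT + 1)]
KEYS = ["user%d" % i for i in range(1, RECORD_COUNT + 1)]

def next_operation(read_proportion=0.95):
    """Return the next synthetic operation of a read-heavy YCSB-like mix:
    95% reads / 5% updates, both against a popularity-skewed key."""
    key = random.choices(KEYS, weights=WEIGHTS, k=1)[0]
    op = "read" if random.random() < read_proportion else "update"
    return op, key

if __name__ == "__main__":
    for _ in range(5):
        print(next_operation())   # e.g., ('read', 'user3')
```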
CloudBench (dynamic websites & OLTP; IaaS, PaaS)
Transaction processing (OLTP) with a simple HTTP request/response pattern.
• Scenario / workload: based on TPC-W (online bookstore)
• Workload generator: Remote Browser Emulator (running in EC2)
• SUT: Google AppEngine, MS Azure, Amazon EC2
• Metrics: throughput (WIPS); cost (Cost/WI, cost per day)

Cloudstone (dynamic websites; IaaS)
A typical Web 2.0 application in a cloud computing environment.
• Scenario / workload: Olio social network – a "social-event calendar web application"
• Workload generator: Faban generates the workload (simulated users); parallel agents deployed on different machines (in EC2)
• SUT: different Amazon EC2 configurations
• Metric: dollars per user per month, year, or three years

CloudCmp (IaaS, PaaS)
Case studies (e.g., TPC-W) are used to evaluate the benchmark results.
• Scenario / workload: micro-benchmarks for computing, storage, and network
• Workload generator: different workloads for different metrics (e.g., SPECjvm2008)
• SUT: Amazon AWS, MS Azure, Google AppEngine, Rackspace (anonymized)
• Metrics: instance (e.g., finishing time, cost per benchmark); storage (e.g., throughput, operation response time); network (e.g., optimal network latency)

Global Provider View (dynamic websites; IaaS, PaaS)
Metrics for IaaS and PaaS providers, with different regional views (averages from nearby backbone nodes).
• Scenario / workload: simple static web page (one product overview and one detailed product page)
• Workload generator: the Gomez Performance Network runs the test transaction and monitors it from 30 backbone nodes (18 in the US, 12 outside the US)
• SUT: 25 different IaaS and PaaS providers (e.g., MS Azure, EC2, Rackspace, GoGrid), including different sites
• Metrics: response time; availability

Cloud Performance Analyzer (dynamic websites; IaaS, PaaS)
Measures the influence of content delivery networks (CDNs) on one target application deployed on Amazon EC2 (US East).
• Scenario / workload: mixed web page consisting of (1) static content, (2) mash-up content from common map providers, (3) advertisements from commercial ad providers, and (4) analytics
• Workload generator: the Gomez Performance Network runs the test transaction and monitors it from 35 backbone nodes (18 in the US, 17 outside the US)
• SUT: origin only (Amazon EC2 East, no CDN); CDNetworks; Amazon's CloudFront
• Metrics: response time; availability

Lessons Learned
Existing cloud benchmarks differ in all essential points:
• Goals of the benchmarks
• Target cloud systems
• Metrics, scenarios, workloads
Requirements for research benchmarks:
• Flexible enough to adapt to different cloud solutions and patterns
• Covering a wide range of relevant application types
• Support for different workload types, e.g., linear or exponential increase and peaks (sketched below)
• Flexible in customizing the workload based on the goals of the evaluation
• …
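The workload-type requirement can be made precise by describing load intensity as a function of elapsed time. The sketch below gives one illustrative function per shape named above; all parameter names and values are assumptions, and a workload driver would sample such functions to decide how many simulated users to keep active.

```python
import math

def linear(t, start_users=10.0, users_per_second=0.5):
    """Linearly increasing load, e.g., steady user growth."""
    return start_users + users_per_second * t

def exponential(t, start_users=10.0, doubling_seconds=1800.0):
    """Exponentially increasing load, e.g., a viral-adoption scenario."""
    return start_users * 2.0 ** (t / doubling_seconds)

def peak(t, base_users=100.0, peak_users=1000.0,
         peak_time=3600.0, peak_width=300.0):
    """A flash-crowd peak: a Gaussian burst on top of a constant base load."""
    burst = math.exp(-((t - peak_time) ** 2) / (2.0 * peak_width ** 2))
    return base_users + (peak_users - base_users) * burst

# t is the elapsed benchmark time in seconds; the driver evaluates the chosen
# function periodically and adjusts the number of active simulated users.
```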
Identified Challenges
• Comparability of offerings: different abstraction levels (e.g., IaaS vs. PaaS); offerings differ widely (e.g., auto-scaling)
• Reproducibility of results: dynamic assignment of resources; contention caused by other customers
• SUT: the SUT varies (e.g., through scaling during the benchmark run); carved-out portion vs. as-is portion
• Metrics: how to define, measure, and quantify elasticity? How to factor costs into the metrics?
• Workloads: cloud computing allows hosting a wide variety of applications; representative workloads are needed for the most promising application types
• Network impact: where to deploy the workload driver? How to distinguish between network latency and cloud performance?

Proposed Next Steps
1. Taxonomy for the cloud space
   • Better understanding of different offerings and cloud patterns
   • Consistent terminology as a basis for further discussions
2. Systematic classification of different cloud benchmarks (a machine-readable starting point is sketched below)
   • What is needed at which layer, and by whom?
   • How can this be measured?
3. Appropriate scenarios for research benchmarks
   • Promising application types
   • Simple and easy to understand
   • Accommodate more than one cloud pattern
Continue the close collaboration with OSG Cloud.
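As one concrete starting point for step 2, the benchmarks surveyed in this summary could be recorded in a machine-readable classification. The sketch below encodes them with the layers, scenarios, and metrics stated on the slides above; the schema itself (layers / scenario / metrics) is only a proposal.

```python
# Proposed (not yet agreed) classification schema: layer(s), scenario, metrics.
SURVEYED_BENCHMARKS = {
    "Yahoo YCSB": {
        "layers": ["IaaS"],
        "scenario": "data storage and management",
        "metrics": ["latency", "scaling", "availability", "replication impact"],
    },
    "CloudBench": {
        "layers": ["IaaS", "PaaS"],
        "scenario": "dynamic websites / OLTP (TPC-W)",
        "metrics": ["WIPS", "Cost/WI", "cost per day"],
    },
    "Cloudstone": {
        "layers": ["IaaS"],
        "scenario": "Web 2.0 (Olio social-event calendar)",
        "metrics": ["dollars per user per month/year/3 years"],
    },
    "CloudCmp": {
        "layers": ["IaaS", "PaaS"],
        "scenario": "micro-benchmarks for compute, storage, network",
        "metrics": ["finishing time", "cost", "throughput", "latency"],
    },
    "Global Provider View": {
        "layers": ["IaaS", "PaaS"],
        "scenario": "static web page, monitored from backbone nodes",
        "metrics": ["response time", "availability"],
    },
    "Cloud Performance Analyzer": {
        "layers": ["IaaS", "PaaS"],
        "scenario": "mixed web page with and without a CDN",
        "metrics": ["response time", "availability"],
    },
}

def benchmarks_for_layer(layer):
    """List the surveyed benchmarks applicable at a given cloud layer."""
    return [name for name, info in SURVEYED_BENCHMARKS.items()
            if layer in info["layers"]]

print(benchmarks_for_layer("PaaS"))   # all except YCSB and Cloudstone
```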