The CloudStack development story and future vision
Sheng Liang, CTO Cloud Platforms, Citrix Systems
August 29, 2012

AWS is setting the standard… as measured by capacity…
Every day through 2011, AWS added the same amount of server processing capacity, on average, that it took to run the Amazon online retailing operation in 2000, when it was a $2.76bn company.
Total Number of Objects Stored in Amazon S3:
• Q4 2006: 2.9B
• Q4 2007: 14B
• Q4 2008: 40B
• Q4 2009: 102B
• Q4 2010: 262B
• Q4 2011: 762B
Peak Requests: 500,000+ per second
Prickett-Morgan. “AWS Cloud Double Fluffs in 2011.” The Register, 6 Jan 2012.
Source: UBS

…data center footprint and geographic distribution…
…the company said that with the opening of its AWS data center in São Paulo, Brazil in mid-December, the company has doubled its AWS data-center footprint.
Prickett-Morgan. “AWS Cloud Double Fluffs in 2011.” The Register, 6 Jan 2012.
[Map: AWS Regions and Amazon Edge Locations (CloudFront & Route 53)]

…and, most importantly, revenue…
It has been estimated that AWS could be a $1 billion business for the online retailer come next year…could hit $2.5B in 2014.
Hickey, Andrew. “Amazon Q3 Cloud Revenue Skyrockets” CRN. 26 Oct 2011.
[Chart: Amazon Web Services Revenue Model, in $M (axis $0 to $1,400), 2006e to 2011e, with an “All Other” series. Source: UBS]

How did Amazon build its Cloud?
[Diagram: layered stack]
• Amazon eCommerce Platform
• AWS API (EC2, S3, …)
• Amazon Proprietary Orchestration Software
• Open Source Xen Hypervisor
• Networking, Servers, Storage

How can we build a cloud using CloudStack?
[Diagram: the Amazon stack with each layer replaced by its CloudStack counterpart]
• Amazon eCommerce Platform → User Portal
• AWS API (EC2, S3, …) → Cloud API
• Amazon Proprietary Orchestration Software → Apache CloudStack
• Open Source Xen Hypervisor → VMware, KVM, OVM, Hyper-V, Bare-Metal, XenServer
• Networking, Servers, Storage

“Cloud OS or Data Center OS”
The Virtual Datacenter OS allows businesses to efficiently pool all types of hardware resources - servers, storage and network – into an aggregated on-premise cloud
VMware press release, Sept 15, 2008
Eucalyptus is the only cloud architecture to support the same application programming interfaces (APIs) as public clouds, and today Eucalyptus is fully compatible with the Amazon AWS public cloud infrastructure.
Eucalyptus Systems press release, April 2009

Nimbula, 3Tera, Joyent, Zimory, Eucalyptus, VMware, OnApp, Enomaly, Cassatt, Abiquo, OpenNebula, Yunteq, Cloud.com
June 2009

Timeline, 2008 to 2012
Phases: Prototype, 1.0 GA, 2.0 Refactor, AWS Compatibility, 2.2 Refactor, 3.0 Quality Improvements, 4.X Refactor
• Sept 2008: VMOps Founded
• Nov 2009: CloudStack 1.0 GA
• May 2010: Cloud.com Launch & CloudStack 2.0 GA
• July 2011: Citrix Acquires Cloud.com
• April 2012: Apache CloudStack

Prototype
• Initial target: hosting companies like Rackspace and Savvis
• 3 engineers built a fully functional prototype in 5 months
• Use the demo to sell to early customers (ReliaCloud, CloudCentral, 1800hosting.com, Go Daddy, etc.)
1.0 GA
• Took 6 more months to make 1.0 software production ready
• Deployed on 5 production customers

2.0 Refactor
• Product first, architecture second
• From web hosting to enterprise workload
• Multi-hypervisor, SAN, and VLAN support
• Learn needs of enterprise workload from: Tata Communications, Korea Telecom, Macquarie Telecom
• Competition: vCloud Express

AWS Compatibility
• Private cloud demand picked up
• Zynga wanted private cloud
• Support Amazon-style flat networking and security groups
• Competition: Eucalyptus

2.2 Refactor
• Second major refactoring of CloudStack code
• Network-as-a-service combining both Amazon and traditional style networking
• More flexible orchestration engine

3.0 Quality Improvements
• Citrix acquisition
• Rapid growth of CloudStack user base
• Quality is more important than new features

4.X Refactor
• Third major refactoring of CloudStack code
• Apache contribution drives rapid growth of CloudStack developer base
• Apache license compliance
• Services framework
• Hadoop integration
  ᵒ Optimize Hadoop on cloud infrastructure
  ᵒ Use HDFS as object store

How is cloud different from legacy infrastructure?
How to handle failures

8% Annual Failure Rate of servers
Kashi Venkatesh Vishwanath and Nachiappan Nagappan, Characterizing Cloud Computing Hardware Reliability, SoCC’10
• Server failure comes from:
  ᵒ 70% - hard disk
  ᵒ 6% - RAID controller
  ᵒ 5% - memory
  ᵒ 18% - other factors
• Application can still fail for other reasons:
  ᵒ Network failure
  ᵒ Software bugs
  ᵒ Human admin error

[Diagram: data center network, from the Internet through Core Routers, Access Routers, Aggregation Switches and Load Balancers, down to Top of Rack Switches and Servers]
40% Effectiveness of network redundancy in reducing failures
Phillipa Gill, Navendu Jain & Nachiappan Nagappan, Understanding Network Failures in Data Centers: Measurement, Analysis and Implications, SIGCOMM 2011
• Bugs in failover mechanism
• Incorrect configuration
• Protocol issues such as TCP back-off, timeouts, and spanning tree reconfiguration

Cloud workloads
Traditional-Style: Reliable hardware, backup entire cloud, and restore for users when failure happens
• Link Aggregation
• Storage Multi-pathing
• VM HA, Fault Tolerance
• VM Live Migration
Amazon-Style: Tell users to expect failure. Users to build apps that can withstand infrastructure failure
• VM Backup/Snapshots
• Ephemeral Resources
• Chaos Monkey
• Multi-site Redundancy

Designing a zone for a traditional workload
[Diagram: Traditional-Style Availability Zone, with vCenter/XenCenter, Enterprise Networking (e.g., VLAN), three Hypervisor Clusters, and Enterprise Storage (e.g., SAN)]
• Hypervisor: vSphere or XenServer Enterprise
• Storage: SAN
• Networking: L2 VLANs
• Network Services: Load Balancing, VPN
• Multi-tier Apps: Multi-tier VLANs, OVF

Designing a zone for an Amazon-style workload
[Diagram: Amazon-Style Availability Zone, with Software Defined Networks (e.g., Security Groups, EIP, ELB,...)]
[Diagram, continued: Server Racks, Elastic Block Storage, Object store]
• Hypervisor: XenServer Advanced
• Storage: Local, EBS
• Networking: L3, SDN based L2
• Network Services: Elastic IP, Security Groups, ELB, GSLB
• Multi-tier Apps: 3rd Party Tools (e.g., RightScale, enStratus), CloudFormation

CloudStack can Support Both Styles
[Diagram: Apache CloudStack managing three AWS-style Availability Zones and two Traditional Style Availability Zones]

CloudStack Future

• 146 Companies
• 238 Developers
• 32,000 Community Members
• 100’s of Production Clouds
• Global User Groups
• Service Providers, Enterprises, Universities

Apache CloudStack community projects
• SDN
  ᵒ Nicira
  ᵒ Midokura
  ᵒ Big Switch Networks
  ᵒ Stratosphere
• Backup/DR
  ᵒ Sungard
• Networking
  ᵒ Cisco (VXLAN, Nexus)
  ᵒ Brocade (ADX)
• Smart Storage
  ᵒ Hadoop + S3 API for object store
  ᵒ NetApp (FlexPod, object store)
  ᵒ Basho RIAK CS
  ᵒ Caringo object store
  ᵒ Cloudian S3
• PaaS
  ᵒ CloudFoundry implementation through IronFoundry and Stackato teams
  ᵒ Engine Yard
  ᵒ Cumulogic
  ᵒ GigaSpaces

“The Apache Way”
• Collaborative software development
• Commercial-friendly standard license
• Consistently high quality software
• Respectful, honest, technical-based interaction
• Faithful implementation of standards
• Security as a mandatory feature

[Diagram: Innovative Cloud Applications and Services … on top of Innovative Cloud Infrastructure (Networking, Servers, Storage)]

More information:
http://cloudstack.org
http://cloudstack.jp
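
Worked example: what an 8% annual failure rate means for a fleet
The failure figures quoted in the "How to handle failures" slide (8% annual server failure rate, split by cause) can be turned into rough fleet-level numbers. The sketch below is illustrative only. The fleet size of 1,000 servers and the helper name expected_annual_failures are assumptions, not from the talk, and the code treats the 8% figure as a uniform per-server average.

```python
# Figures quoted in the talk (Vishwanath & Nagappan, SoCC'10).
ANNUAL_FAILURE_RATE = 0.08

# Share of server failures by cause, as listed on the slide.
# Note the slide's shares add up to 99%, presumably due to rounding.
FAILURE_SHARE = {
    "hard disk": 0.70,
    "RAID controller": 0.06,
    "memory": 0.05,
    "other factors": 0.18,
}


def expected_annual_failures(servers: int) -> dict[str, float]:
    """Expected server failures per year, in total and by cause.

    Hypothetical helper: assumes the quoted average rate applies
    uniformly to every server, which is a simplification.
    """
    total = servers * ANNUAL_FAILURE_RATE
    by_cause = {cause: total * share for cause, share in FAILURE_SHARE.items()}
    return {"total": total, **by_cause}


if __name__ == "__main__":
    fleet = 1000  # assumed fleet size, for illustration only
    result = expected_annual_failures(fleet)
    for cause, count in result.items():
        print(f"{cause}: {count:.1f} failures/year")
    print(f"about one failure every {365 / result['total']:.1f} days")
```

Under these assumptions, 1,000 servers would see roughly 80 server failures a year, about 56 of them disk-related, or about one failure every four to five days. At that frequency failure is routine rather than exceptional, which is the motivation for the Amazon-style approach of designing applications to withstand it.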