Cloud Computing, IaaS and Availability Management: Any challenges?
Antonio F Gaspar Santos, M.Sc.
IBM & OpenGroup Certified IT Architect
IBM Certified Solution Advisor - Cloud Computing
IBM Client Center - Brazil
agaspar@br.ibm.com
+55-11-2132-2043
2013 IBM Corporation © 2012 IBM Corporation

Cloud computing, “Deployment Models” and “Service Models”
Service models:
- Application as a Service
- Platform as a Service
- Infrastructure as a Service
Deployment models:
- Private Cloud: enterprise data center
- Managed Private Cloud: enterprise data center, third-party operated
- Hosted Private Cloud (enterprise), Shared Private Cloud (enterprises), Public Cloud Services (users): third-party hosted and operated
- Private: IT capabilities are provided “as a service” over an intranet, within the enterprise and behind the firewall
- Public: IT activities / functions are provided “as a service” over the Internet

Now, everything goes to Cloud! Well, the real world is different…
[Diagram] Stages: Traditional, Virtualization, Cloud, Dynamic
[Diagram] Steps: Consolidate, “Re” Consolidate, Virtualize, Standardize and automate
- Relative maturity on virtualization; high fragmentation; IT “islands”; lack of scalability & elasticity
- High virtualization maturity; IT optimization; promote scalability; enable IaaS
- Standardize services; reduce deployment cycles; enable elasticity; flexible delivery

Some challenges...
- Software licensing
- Portability
- Sizing
- Standardization requirements
- Culture, mind share... etc.

The real world of Cloud…
- Large-scale failures in the cloud are rare, but they do happen
- Yes! A cloud is a physical data center entity
- Most clouds are shared/multitenant environments: availability issues => high impact
- Application owners are also responsible for availability and recoverability
- There are many architecture options for Cloud HA: from multi-server to multi-availability domains

Interoperability with Clouds - Far beyond just a network connection
[Diagram] Enterprise: Enterprise Organization, Enterprise Processes, Enterprise Technology (App / DB / MW, Operating System, Hypervisor*, Infrastructure)
[Diagram] Cloud: Cloud Organization, Cloud Processes, Cloud Technology (App / DB / MW, Operating System, Hypervisor, Infrastructure)

Service Availability - Challenge 1: Transparent service provider to users
Transparent availability for end users. There is a need to provide a single point of contact regardless of the “X”aaS provider.
[Diagram] ERP, Collaborations, HR - Payment, Intranet Applications, “Private Cloud”, Legacy Applications, Legacy, Traditional IT, Private Cloud, Enterprise Premises

Service Availability - Challenge 2: Multi Service Providers and composed SLAs
[Diagram] Function: Application, Operating System & Middleware, Hypervisor, Infrastructure
[Diagram] Scope: “home” support, IaaS

Severity     Resolution Targets   Resolution Targets   Resolution/Response Times (“Consolidated SLAs”)
Severity 1   2 Hours              2 Hours              4 Hours*
Severity 2   4 Hours              4 Hours              8 Hours*
Severity 3   X Hours              Y Hours              X+Y Hours*
(*) Worst case

Why not apply ITIL for cloud service management?
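Returning to the composed SLA table of Challenge 2: the worst-case column can be reproduced with a few lines of Python. This is only a sketch, assuming an incident is handled sequentially by each provider scope, so the worst-case consolidated target is the sum of the individual resolution targets. The function name `consolidated_target` is hypothetical and not part of the original slides; the hour values are the Severity 1 and Severity 2 rows of the table.

```python
def consolidated_target(*targets_hours: float) -> float:
    """Worst-case consolidated resolution target, in hours.

    Assumption: each provider scope works on the incident one after
    the other, so the individual targets simply add up.
    """
    if not targets_hours:
        raise ValueError("at least one provider target is required")
    return sum(targets_hours)


# Per-severity resolution targets (hours) for the two provider scopes,
# taken from the Severity 1 and Severity 2 rows of the slide table.
targets_by_severity = {
    "Severity 1": (2, 2),
    "Severity 2": (4, 4),
}

for severity, targets in targets_by_severity.items():
    worst_case = consolidated_target(*targets)
    print(f"{severity}: worst case {worst_case} hours")
```

The same function covers the Severity 3 row: with targets X and Y it returns X+Y, and it extends to more than two providers by passing more arguments.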
Service Management / Availability Management - Conceptualization…
[Diagram] Availability Management, Incident, Change & Configuration

Enhancing Cloud HA Management #1 - Enabling Auto Restart
MTTR can be reduced by (re)provisioning new instances of virtual resources in the cloud rather than performing traditional ticket resolution on a specific virtual resource…
Optimizing Incident Management with cloud automation capabilities reduces MTTR!

Enhancing Cloud HA Management #2 - Enabling Auto Scaling
Auto scaling leverages the automation capabilities of cloud computing. It enhances availability management by provisioning IT resources to a given application based on KPI measurements (e.g. application response time)…

Enhancing Cloud HA Management #3 - Communications
Planned outages vs. availability service levels: communication must flow between cloud providers and cloud consumers.

Understanding Cloud “Availability Domains” (ADs)
An AD spans from a rack to datacenters...
- ADs define the borders of SLA measurements.
- An AD is an island: it runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable.
- ADs are usually physically separate.
- You need to assign a VM instance to one AD during provisioning.
- By launching instances in separate Availability Domains, you can protect your application from the failure of a single location.
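The benefit of launching instances in separate Availability Domains can be illustrated with a small calculation. The sketch below assumes that AD failures are statistically independent (real deployments only approximate this) and that the application stays up as long as at least one AD is up. The function names, the instance names and the 99.5% per-AD figure are illustrative assumptions, not provider data.

```python
from itertools import cycle


def spread_instances(instance_names, availability_domains):
    """Assign each instance to an AD in round-robin order."""
    if not availability_domains:
        raise ValueError("at least one Availability Domain is required")
    return dict(zip(instance_names, cycle(availability_domains)))


def combined_availability(per_ad_availability: float, ad_count: int) -> float:
    """Availability of an application replicated across `ad_count` ADs.

    Assumption: ADs fail independently, and one surviving AD is enough
    to keep the application running.
    """
    if not 0.0 <= per_ad_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if ad_count < 1:
        raise ValueError("ad_count must be at least 1")
    return 1.0 - (1.0 - per_ad_availability) ** ad_count


HOURS_PER_YEAR = 24 * 365


def expected_downtime_hours(availability: float) -> float:
    """Expected yearly downtime for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    placement = spread_instances(["app-1", "app-2", "app-3"], ["AD1", "AD2"])
    print(placement)
    for ad_count in (1, 2):
        availability = combined_availability(0.995, ad_count)
        downtime = expected_downtime_hours(availability)
        print(f"{ad_count} AD(s): {availability:.4%} available, "
              f"about {downtime:.1f} hours of downtime per year")
```

Under these assumptions, the second AD turns a 0.5% unavailability into 0.0025%. Correlated failures, replication lag and failover time would all reduce that gain in practice.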
Architecture Options for Cloud HA - Single AD
[Diagram] Users, Access Net*, then inside AD1:
- Load Balancers: LB clusters
- App Layer: multi instances
- DB Layer: clustering
- Storage: disk arrays
(*) Internet, MPLS…

Architecture Options for Cloud HA - Multi AD
[Diagram] Users, Access Net*, then AD1 and AD2, each with:
- Load Balancers
- App Layer
- DB Layer
- Storage
IP replication between AD1 and AD2
(*) Internet, MPLS…

Cloud HA Management - Checklist for Risk Mitigation
- Perform a risk assessment for each application
- Specify target service levels (availability, RTO, RPO...)
- Determine who owns the architecture
- Design for failure (start with the application architecture)
- Leverage the robustness of software products for HA
- Implement HA (application and infrastructure)
- Document, test and test...
- Explore cloud providers: HA capabilities, Availability Domain options, Service Levels, Service Management interfaces
- Identify “components in the middle” (between you and the cloud)
- Understand how to claim SLA penalties

Cloud HA Management - Best Practices
- Leverage automation as much as possible!
- Optimize/automate SM processes
- Avoid single points of failure
- Is it possible to replicate across at least two Availability Domains?
- Provide extra capacity to absorb cloud failures
- Design stateless applications for quick relaunch
- Enable auto-scaling when possible
- Set up proactive monitoring, alerts and operations

Yes, you can have a Cloud HA!

Antonio F Gaspar Santos
IBM Certified Solution Advisor - Cloud Computing
IBM Client Center - Brazil
IBM Brazil
Rua Tutóia, 1157 – CEP 04007-900
São Paulo – SP - Brazil
Phone: +55 11 2132 2043
Mobile: +55 11 9 9649 6177
E-mail: agaspar@br.ibm.com
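Backup: the “Enable auto-scaling when possible” practice, driven by a KPI such as application response time, can be sketched as a simple threshold rule. This is a minimal illustration and not a cloud provider API. The function name `desired_instances`, the 500 ms target, the scale-in threshold at half the target, and the instance limits are all hypothetical values chosen for the example.

```python
def desired_instances(
    current: int,
    response_time_ms: float,
    target_ms: float = 500.0,   # hypothetical KPI target
    min_instances: int = 2,     # keep at least two to avoid a single point of failure
    max_instances: int = 10,    # hypothetical capacity ceiling
) -> int:
    """Return the instance count a threshold-based auto-scaler would request.

    Scale out by one instance when the measured response time exceeds the
    target, scale in by one when it falls below half the target, and
    otherwise keep the current count. The result is clamped to the
    [min_instances, max_instances] range.
    """
    if response_time_ms > target_ms:
        wanted = current + 1
    elif response_time_ms < target_ms / 2:
        wanted = current - 1
    else:
        wanted = current
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    # Simulated KPI samples (milliseconds); values are made up for the example.
    instances = 2
    for sample in (800.0, 900.0, 400.0, 100.0, 100.0):
        instances = desired_instances(instances, sample)
        print(f"response time {sample:.0f} ms -> {instances} instance(s)")
```

Keeping `min_instances` at two or more also reflects the “avoid single points of failure” practice: the scaler never shrinks the application to a single instance.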