RESILIENCY AND AVAILABILITY OF A CLOUD NATIVE INFRASTRUCTURE
JONAS ROSLAND, EMC {code}
© Copyright 2016 EMC Corporation. All rights reserved.

FIRST, SOME HISTORY
• High Availability solutions
• Mostly Primary-Secondary, Master-Slave
• Provides redundancy
(Image: Server, by Souvik Bhattacharjee)

SO WHAT IS RESILIENCY?
• Resiliency is a quality in objects to hold or recover their shape
• Example: If you bend a spoon and it bends right back, that's resiliency

HOW DOES SPOON BENDING APPLY TO IT?
• Create services that withstand pressure
• Foster a modern mindset in operations
• Treat failures as normal occurrences, not fires

FAILURE HANDLING
• Service failures are treated as non-critical issues
• Applications and services self-heal
• The infrastructure supports this with modern tools

MODERN INFRASTRUCTURE
• Focused but not necessarily fully dedicated
• Not limited to a single master-slave relationship
• Example 1: 20 nodes with resources to run services such as Hadoop, web servers and databases
• Example 2: VMware clusters

MANAGING THIS INFRASTRUCTURE
• Not just server-side clustering tools
• Many layers
• Schedulers, service discovery, logging, monitoring and much more

RESOURCE MANAGERS AND SCHEDULERS
• Collection of compute resources
• Spread out horizontally
• Brings scale to your deployments
  – Docker Swarm
  – Mesos
  – Kubernetes
  – Cloud Foundry

EXAMPLE
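
A minimal sketch of what a resource manager and scheduler does, assuming a toy model that is not taken from the slides: nodes expose abstract capacity units, tasks go to the healthy node with the most free capacity, and a node failure causes its tasks to be placed elsewhere. The names `Node`, `ToyScheduler`, `node_failed` and `node_recovered` are hypothetical and do not correspond to the API of Docker Swarm, Mesos, Kubernetes or Cloud Foundry.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int                               # abstract resource units (assumption)
    healthy: bool = True
    tasks: dict = field(default_factory=dict)   # task name -> units in use

    def free(self) -> int:
        return self.capacity - sum(self.tasks.values())


class ToyScheduler:
    """Places each task on the healthy node with the most free capacity."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}

    def place(self, task: str, units: int) -> str:
        candidates = [n for n in self.nodes.values()
                      if n.healthy and n.free() >= units]
        if not candidates:
            raise RuntimeError(f"no healthy node can fit {task!r}")
        target = max(candidates, key=lambda n: n.free())
        target.tasks[task] = units
        return target.name

    def node_failed(self, name: str) -> dict:
        """Mark a node as failed and reschedule its tasks on other nodes."""
        node = self.nodes[name]
        node.healthy = False
        orphaned, node.tasks = node.tasks, {}
        return {task: self.place(task, units)
                for task, units in orphaned.items()}

    def node_recovered(self, name: str) -> None:
        """A repaired node rejoins the pool and can receive new tasks."""
        self.nodes[name].healthy = True


def demo():
    sched = ToyScheduler([Node("node-1", 4), Node("node-2", 4), Node("node-3", 4)])
    placements = {t: sched.place(t, 2) for t in ("web", "db", "hadoop")}
    # Fail whichever node runs "web"; the scheduler moves the task elsewhere.
    moved = sched.node_failed(placements["web"])
    return placements, moved
```

Calling `demo()` spreads three tasks across three nodes, then fails the node running `web` and returns the new placement. A production scheduler also weighs constraints, priorities and health checks, which this sketch leaves out.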
SERVICE DISCOVERY
• Enables apps and infra to automatically find parts of each other
  – Consul
  – etcd
  – SkyDNS
  – ZooKeeper

EXAMPLE
Service Discovery: Java App at 3.4.5.6:9090

LOGGING
• All logs in one place
  – Fluentd
  – Logstash
  – Loggregator
• Everything searchable
  – Elasticsearch
  – Splunk

EXAMPLE

MONITORING AND ANALYTICS
• Sensu
• Datadog
• Prometheus
• New Relic
• Ruxit

EXAMPLE

LOOK OF PHYSICAL INFRASTRUCTURE
• Everything is managed through smart tools
• The tools have integrations to understand hardware
• By changing the processes you're more prepared
• Being prepared is crucial for dealing with failures when they happen
And they will happen

But what about my super special app?

Applications have to store data, right?

APPLICATION STATE DIFFERENCES
• Storing data (state) is a critical component of any application
• Where an application stores this state defines its class
  – Connected database
  – In memory
  – Local disk

STATELESS APPLICATIONS
• Store no critical data locally
• Can be scaled as needed
• Recover quickly

STATEFUL SERVICES
• Services that store state
• Have usually been treated as HA apps
• Can be scale-out or scale-up
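
The difference between keeping state in memory and keeping it in a connected database can be sketched as follows. This is an illustration under stated assumptions, not code from the talk: `ExternalStore` is a hypothetical stand-in for a connected database, and a "restart" is simulated by creating a new app instance.

```python
class ExternalStore:
    """Stands in for a connected database that outlives app restarts."""

    def __init__(self):
        self._data = {}

    def incr(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


class InMemoryCounterApp:
    """Keeps its state in process memory, so a restart loses it."""

    def __init__(self):
        self.visits = 0

    def handle(self) -> int:
        self.visits += 1
        return self.visits


class StatelessCounterApp:
    """Stores no critical data locally; any instance can serve any request."""

    def __init__(self, store: ExternalStore):
        self.store = store

    def handle(self) -> int:
        return self.store.incr("visits")


def demo():
    store = ExternalStore()
    in_memory = InMemoryCounterApp()
    stateless = StatelessCounterApp(store)
    for _ in range(2):
        in_memory.handle()
        stateless.handle()

    # Simulate a crash and restart: both apps are replaced by new instances.
    in_memory = InMemoryCounterApp()
    stateless = StatelessCounterApp(store)
    return in_memory.handle(), stateless.handle()
```

After the simulated restart the in-memory app starts counting from scratch, while the stateless app continues from the value held in the store. This is why the stateless app can be restarted or scaled freely, and why the store itself becomes the stateful service that needs care.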
APPLICATIONS BUILT FROM SERVICES
• Front End or Non-Persistent: Language Specific, HTTP
• Data Services (Persistent): Scale-Out, Scale-Up
• Services pictured: Redis, Apache, MySQL, Postgres, MongoDB, MariaDB, Nginx, Memcached, CouchDB, Elasticsearch, Hadoop, Cassandra, ScaleIO, RabbitMQ, ECS, Rails, Tomcat, HAProxy

Demo of resiliency

DEMO OF FAILURE HAPPENING
• Several applications, stateful and stateless, are running and working together
• Cascading server failures occur
• Applications are restarted and automatically heal themselves
• The servers come back online and automatically join the cluster of resources again

SUMMARY
• Being prepared is crucial; surprises shouldn't happen
• Use smart tools to deal with issues quickly
• Leverage smart infrastructure that allows your apps to self-heal
• Manage your datacenter as an entity, not silos

Before opening up for questions

CONTINUE THE DISCUSSION
• Hands-on lab with Docker, Mesos and REX-Ray
• Free stickers at our booth
• Join our community at community.emccode.com
• See all our projects at emccode.com
• And follow us on Twitter @EMCcode

Questions?
@EMCcode
@jonasrosland
emccode.com
community.emccode.com
Come visit us at Booth #1044 or in the vLab