More Cloud Apps

Lecture 9: More Cloud Applications

Xiaowei Yang (Duke University)

News: Buffalo as Data Center Mecca

• $1.9 billion, at least 200 employees

• Low-cost electric power, tax incentives, plenty of shovel-ready sites, cool climate

Review

• Cloud Computing

– Elasticity

– Pay-as-you-go

• Challenges

– Security: co-residence, inference

– Performance

• Coarse-grained sharing

• Lack of virtualized interface for specialized hardware

Today

• Cloud Applications

– Execution augmentation for mobile devices

– Energy saving for mobile

– Energy saving for desktops

– Disaster recovery

The Case for Energy-Oriented Partial Desktop Migration

Nilton Bila†, Eyal de Lara†, Matti Hiltunen, Kaustubh Joshi, H. Andrés Lagar-Cavilla, and M. Satyanarayanan

Motivations

• Offices and homes have many PCs

• But they are often left running idle

– PCs idle on average 12 hours a day

• “Skilled in the art of being idle” by Nedevschi et al. in NSDI 2009

– 60% of desktops remain powered overnight

• “After-hours power status of office equipment in the USA” by Webber, in Energy 2006

Why is it important?

• Dell Optiplex 745 Desktop

• Peak power: 280W

• Idle power: 102.1W

• Sleep power: 1.2W

• Putting one to sleep whenever it is idle saves 102.1 W - 1.2 W = 100.9 W.
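
To put the per-machine saving in perspective, the sketch below scales it to a year. It uses the idle and sleep power from this slide and the 12 idle hours per day cited under Motivations. It assumes the desktop sleeps for the whole idle period and ignores suspend and resume costs, so it is an upper bound rather than a measured result. The function name is illustrative.

```python
# Power figures from the slide (Dell Optiplex 745).
IDLE_WATTS = 102.1
SLEEP_WATTS = 1.2


def sleep_savings_kwh(idle_hours_per_day: float, days: int = 365) -> float:
    """Energy saved (kWh) by sleeping instead of idling.

    Assumption: the machine sleeps for the entire idle period and
    suspend/resume transitions cost nothing.
    """
    saved_watts = IDLE_WATTS - SLEEP_WATTS
    return saved_watts * idle_hours_per_day * days / 1000.0


if __name__ == "__main__":
    print(f"{IDLE_WATTS - SLEEP_WATTS:.1f} W saved while asleep")
    print(f"{sleep_savings_kwh(12):.1f} kWh saved per year at 12 idle hours/day")
```

Under these assumptions a single desktop saves roughly 442 kWh per year.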

Why do we leave desktops on?

• Applications with always-on semantics

– Skype, IM, email, personal media sharing

• Interspersed activities with idle periods

– Lunch break

– Chatting with colleagues

Related work

[Figure: Xen hosting Dom0 and the user VMs User0 and User1]

• Full VM migration

– LiteGreen, USENIX 2010 best paper

– Encapsulate user session in VM

– When idle, migrate VM to consolidation server and power down PC

– When busy, migrate back to user’s PC

Partial VM migration

• An idle VM accesses only part of its memory and disk state (its working set)

• Migrate only the working set to a server

– Potentially a cloud server

– Cloud provider can further aggregate
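
The idea of migrating only the working set can be sketched as demand paging between the desktop and the server. This is a toy model, not the authors' implementation: the class and method names are hypothetical, pages are plain dictionary entries, and network transfer is reduced to a counter.

```python
class PartialVM:
    """Toy model of a VM running on a server that pulls pages from the
    desktop only when they are first touched."""

    def __init__(self, desktop_pages: dict[int, bytes]):
        self.desktop_pages = desktop_pages        # full state stays on the desktop
        self.local_pages: dict[int, bytes] = {}   # working set fetched so far
        self.remote_fetches = 0                   # round trips to the desktop

    def read(self, page_no: int) -> bytes:
        if page_no not in self.local_pages:
            # First access: fetch the page from the desktop on demand.
            self.local_pages[page_no] = self.desktop_pages[page_no]
            self.remote_fetches += 1
        return self.local_pages[page_no]

    def footprint_fraction(self) -> float:
        """Share of the VM's pages that had to be migrated."""
        return len(self.local_pages) / len(self.desktop_pages)


if __name__ == "__main__":
    vm = PartialVM({n: bytes(4096) for n in range(1000)})
    for page in [1, 2, 2, 3, 1]:
        vm.read(page)
    print(vm.remote_fetches, vm.footprint_fraction())
```

Repeated accesses to the same page are served locally, so only the pages the idle VM actually touches ever cross the network.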

Advantages

• Small migration footprint

• Client

– Fast migration

– Low energy cost

• Network

– Reduce bandwidth demand

• Server

– More VMs per server

Feasibility Study

• Can a desktop save energy by sleeping while its VM runs in the cloud?

• Does the entire domain save energy by migrating idle sessions and putting the desktops to sleep?

Methodology

• Prototyped a simple on-demand migration approach with SnowFlock

– Prepared a VM image and ran the VM

– After five minutes, used SnowFlock to clone the VM

– Monitored memory and disk page migration to the clone VM

Setup

• Dell Optiplex 745 Desktop

– 4GB RAM, 2.66GHz Intel C2D

– Peak power: 280W

– Idle power: 102.1W

– Sleep power: 1.2W

• VM Image:

– Debian Linux 5

– 1GB RAM

– 12 GB disk

Workloads

Memory Request Pattern

• Spatial locality

– Pre-fetching
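
Spatial locality means a miss on one page is likely to be followed by requests for its neighbors, so fetching a few following pages with each miss cuts the number of round trips to the desktop. The sketch below counts those round trips for a page trace. The trace and the prefetch window are made-up values, and the function is illustrative rather than taken from the prototype.

```python
def remote_requests(trace: list[int], prefetch: int = 0) -> int:
    """Count round trips to the desktop when every miss also pulls the
    next `prefetch` pages (a simple sequential prefetcher)."""
    resident: set[int] = set()
    requests = 0
    for page in trace:
        if page not in resident:
            requests += 1
            # Fetch the missing page plus the pages that follow it.
            resident.update(range(page, page + prefetch + 1))
    return requests


if __name__ == "__main__":
    trace = [10, 11, 12, 13, 50, 51, 52, 53]  # hypothetical page numbers
    print(remote_requests(trace))              # no prefetching
    print(remote_requests(trace, prefetch=3))  # 3 pages of read-ahead
```

On this hypothetical trace, prefetching three pages reduces eight requests to two.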

Page Request Interval

• 98% of requests arrive in close succession

Potential Sleep Intervals
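
Because requests cluster, the gaps between clusters are where the desktop could sleep. A minimal sketch of how such intervals might be derived from a trace of page-request timestamps follows. It assumes a fixed suspend-plus-resume overhead per sleep cycle and skips gaps too short to cover it. The timestamps and the overhead are hypothetical values.

```python
def sleep_seconds(request_times: list[float], transition_s: float) -> float:
    """Total time the desktop could spend asleep between page requests.

    request_times: sorted request timestamps in seconds.
    transition_s:  assumed time lost to suspend plus resume per cycle.
    """
    total = 0.0
    for prev, nxt in zip(request_times, request_times[1:]):
        gap = nxt - prev
        if gap > transition_s:
            # Only the part of the gap beyond the transition is real sleep.
            total += gap - transition_s
    return total


if __name__ == "__main__":
    trace = [0, 1, 2, 300, 301, 900]  # hypothetical timestamps (seconds)
    print(sleep_seconds(trace, transition_s=10))
```

Short gaps contribute nothing, which is why batching requests into fewer, longer gaps increases the usable sleep time.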

Energy Savings: an hour-long trace

Hourly Energy Savings: an overnight session

• Saves 69% of energy

Memory footprint

• A cloud node with 4GB of RAM can run ~30 VMs

Domain-wide Energy Savings

Annual Energy Savings

• No partial migration

Annual Energy Savings

• V = 23

Annual Savings

• Can it save cost?

– Network

– Cloud Rental

Open issues

• Frequent power cycling reduces hardware life expectancy and limits power savings

– Reduce number of sleep cycles and increase sleep duration

– Predict page access patterns and prefetch

– Leverage content addressable memory

• Fast reintegration

– Big Q: Can it be fast enough so that a user does not suffer a long delay?

• Policies

– When to migrate/re-integrate?

– When does the desktop go to sleep?

– On re-integration, should state be maintained in the cloud? For how long?

Disaster Recovery as a Cloud Service: Economic Benefits & Deployment Challenges

Timothy Wood and Emmanuel Cecchet, University of Massachusetts Amherst; K.K. Ramakrishnan, AT&T Labs - Research; Prashant Shenoy, University of Massachusetts Amherst; Jacobus van der Merwe, AT&T Labs - Research; Arun Venkataramani, University of Massachusetts Amherst

Datacenter Disasters

• Disasters cause expensive application downtime

• Truck crash shuts down Amazon EC2 site (May 2010)

• Lightning strikes EC2 data center (May 2009)

• Comcast Down: Hunter shoots cable (2008)

• Squirrels bring down NASDAQ exchange (1987 and 1994)

DR Fits in the Cloud

• Customer: pay-as-you-go and elasticity

– Normal mode is cheap (backup needs fewer resources than running the application)

– Rapidly scale up resources after disaster is detected

• Provider: high degree of multiplexing

– Customers are unlikely to all fail at once

– Can offer extra services like disaster detection

What is disaster recovery?

• Use DR services to prevent lengthy service disruptions

• Data backups + failover mechanism

– Periodically replicate state

– Switch to backup site after disaster

DR Metrics

• Recovery Point Objective (RPO): the point in time of the most recent backup prior to a failure, which bounds how much data can be lost

• Recovery Time Objective (RTO): how long it can take for an application to come back online after a failure occurs

– Time to detect failure

– Provision servers

– Initialize applications

– Configure networks to connect

• Performance

– DR should have minimal impact on the performance of each protected application during failure-free operation

– How can DR impact performance?

• Consistency

– The application can be restored to a consistent state

• Geographic separation

– Challenge: increasing network latency
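
The RTO components listed above can be combined into a simple check against a target. This sketch assumes the four steps run one after another, so recovery time is their sum, and that state is replicated periodically, so a failure just before the next sync loses up to one full interval of updates. All names and numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class RecoverySteps:
    """Durations (seconds) of the steps that make up recovery time."""
    detect_s: float      # time to detect the failure
    provision_s: float   # provision servers
    init_apps_s: float   # initialize applications
    network_s: float     # configure networks to connect

    def recovery_time(self) -> float:
        # Assumption: the steps are sequential, so the times add up.
        return self.detect_s + self.provision_s + self.init_apps_s + self.network_s


def meets_objectives(steps: RecoverySteps, rto_s: float,
                     replication_interval_s: float, rpo_s: float) -> bool:
    """True if recovery fits the RTO and worst-case data loss fits the RPO."""
    worst_case_loss_s = replication_interval_s  # failure just before next sync
    return steps.recovery_time() <= rto_s and worst_case_loss_s <= rpo_s


if __name__ == "__main__":
    steps = RecoverySteps(detect_s=30, provision_s=120, init_apps_s=60, network_s=30)
    print(steps.recovery_time())
    print(meets_objectives(steps, rto_s=300, replication_interval_s=60, rpo_s=120))
```

Tightening either objective shows which term dominates: RTO is driven by the slowest recovery step, RPO by how often state is replicated.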

DR Mechanisms

• Hot Backup Site

– Provides a set of mirrored stand-by servers that are always available

– Minimal RTO and RPO

– Use synchronous replication to prevent any data loss
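
The difference between synchronous and asynchronous replication can be illustrated with an in-memory toy. With synchronous replication a write is acknowledged only after the backup has it, so nothing acknowledged is lost. With asynchronous replication, writes queue up until the next flush and are lost if the primary fails first. The class below is a hypothetical sketch and omits the network round trip that makes synchronous writes slower over long distances.

```python
class Primary:
    """Toy primary site that replicates key/value writes to a backup."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.data: dict[str, str] = {}
        self.backup: dict[str, str] = {}
        self.pending: list[tuple[str, str]] = []  # not yet at the backup

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        if self.synchronous:
            # Acknowledge only after the backup holds the write.
            self.backup[key] = value
        else:
            # Acknowledge immediately; ship to the backup later.
            self.pending.append((key, value))

    def flush(self) -> None:
        """Periodic replication step for the asynchronous mode."""
        for key, value in self.pending:
            self.backup[key] = value
        self.pending.clear()

    def lost_on_failure(self) -> int:
        """Acknowledged writes that the backup would be missing."""
        return len(self.pending)


if __name__ == "__main__":
    hot = Primary(synchronous=True)
    warm = Primary(synchronous=False)
    for site in (hot, warm):
        site.write("a", "1")
        site.write("b", "2")
    print(hot.lost_on_failure(), warm.lost_on_failure())
```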

Warm Backup Site

• Cheaply synchronize state during normal operations

• Obtain resources on demand after failure

• Short delay to provision resources and start applications

Cost analysis study

• Compare DR in a colocation center to DR in the cloud

• Colocation

– Pays for servers and space at all times

• Cloud DR

– Pays for resources as they are used

Case Study 1

• RUBiS: an eBay-like multi-tier web application

– Three front ends

– One database server

– Only database state is replicated

Cost analysis

• Cost at 99% uptime (3 days of disaster per year)

Case 2: Data Warehouse

• Post-disaster operation is expensive due to the high-powered VM instance

• Overall cheaper because of 99% uptime

RPO vs Cost Tradeoff

• Flexible

• Colo has a fixed cost regardless of RPO requirements

Cost Analysis Summary

• Cloud DR’s benefits depend on

– Type of resources to run application

– Variation between normal and post-disaster costs

– RPO and RTO requirements

– Uptime

• Cloud is better if post-disaster cost is much higher than normal-mode cost
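
The comparison behind this summary can be sketched as a two-rate cost model. Cloud DR pays a low rate in normal mode and a high rate only during the disaster days, while colocation pays one fixed rate all year. The 3 disaster days per year come from the 99% uptime case above. The daily prices are made-up placeholders, not figures from the case studies.

```python
def annual_cloud_cost(normal_per_day: int, disaster_per_day: int,
                      disaster_days: int = 3, days: int = 365) -> int:
    """Cloud DR: cheap replication in normal mode, full servers only
    while running in the cloud after a disaster."""
    normal_days = days - disaster_days
    return normal_per_day * normal_days + disaster_per_day * disaster_days


def annual_colo_cost(per_day: int, days: int = 365) -> int:
    """Colocation: servers and space are paid for at all times."""
    return per_day * days


if __name__ == "__main__":
    # Hypothetical daily prices in dollars.
    cloud = annual_cloud_cost(normal_per_day=2, disaster_per_day=40)
    colo = annual_colo_cost(per_day=30)
    print(cloud, colo)
```

The larger the gap between the normal and post-disaster rates, and the fewer the disaster days, the more the cloud's pay-as-you-go pricing helps.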

Provider Challenges

• How to maximize revenue?

– Makes money from storage in normal case

– But must pay for servers and keep them available for DR

– Possible solutions

• Spot instances (EC2 uses them)

• Higher prices for higher priority resources

• Correlated failures

– Large disasters may affect many customers

– Possible solutions

• Decide how much to provision using a risk model

• Spread out customers

Mechanisms Needed for Cloud DR

• Network reconfiguration

– Application must be brought back online after it is moved to a backup site

– May require setting up a private business network

• Security and Isolation

• VM migration and cloning

– Restore an application after a disaster is handled

– Cloud providers do not yet support VM migration into and out of the cloud

Summary

• Cloud-based disaster recovery

– Can reduce cost

• Up to 85% in a case study

– Flexible tradeoff between cost and RPO

Forecast

• Next lecture

– Another cloud application for group collaboration

• Monday is fall break

• Next Wednesday

– Midterm

– http://www.cs.duke.edu/courses/fall10/cps296.2/syllabus.html
