Cloud Computing Development

Shallow Introduction
Introduction
What is cloud computing?
Is it computing while in flight?
NO
Image Courtesy SevensHeaven.nl
What is cloud computing?
What is it about then?
Cloud computing is the consumption of computing
resources without worrying about the specifics,
as well as the ability to add or remove resources
according to demand.
Similar to the power grid and the telephone network.
How does it work?
‣ Consumer signs up for the service
(same as getting a mobile phone plan)
‣ Consumer uses services according to their needs
‣ Provider sends the bill at the end of the billing cycle
‣ Consumer pays
Provider Models
‣ Software as a Service (SaaS): Email, CRM, Office Apps
‣ Platform as a Service (PaaS): Application Servers, Databases, Middleware
‣ Infrastructure as a Service (IaaS): Bare Hardware (sort of)
Providers
‣ Software as a Service (SaaS): Google (GMail), Salesforce, Microsoft (Office Live)
‣ Platform as a Service (PaaS): Google App Engine, Heroku / Engine Yard (Rails), Windows Azure (.NET)
‣ Infrastructure as a Service (IaaS): Amazon AWS, Rackspace, GoGrid
Provider: Windows Azure
‣ Platform as a service
‣ Windows based
‣ Storage provided through blob storage,
drives, SQL Azure
‣ State is stored and propagated with Queues
and Tables
‣ Integrated with Visual Studio
‣ Eclipse plug-in for PHP
Slide courtesy Vlad Vinogradsky from Microsoft
Provider: Google App Engine
• Platform as a service
• Python or Java based
• Storage provided through the Datastore (built on BigTable)
• Automatically scales web nodes
Provider: Rackspace
• Infrastructure as a service
• Very basic: just a few Linux or Windows
images
• Provides storage with CloudFiles
• Very Cheap
• Open source API
• Relatively New
Provider: Amazon AWS
• Oldest on the market
• Many services / Images / Third party
providers
• Provides computation through EC2 / EMR
• Provides state / storage through S3, SQS,
RDS, SimpleDB
• Multiple APIs
Sample Prices
Provider    Compute (per VM-hour)   Storage (per GB-month)   Bandwidth (per GB transferred)
Amazon      $0.10+                  $0.15+                   $0.15+
Rackspace   $0.02+                  $0.15+                   $0.22+
Microsoft   $0.12                   $0.15                    $0.15
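
To make the pay-per-use model concrete, here is a rough sketch that estimates a monthly bill from the sample prices above. The function name `monthly_cost` and the example workload are illustrative assumptions, and the "+" entries are treated as their starting price, so real bills could be higher.

```python
# Sample prices from the slide (USD). "+" entries are treated as starting prices.
PRICES = {
    "Amazon":    {"compute": 0.10, "storage": 0.15, "transfer": 0.15},
    "Rackspace": {"compute": 0.02, "storage": 0.15, "transfer": 0.22},
    "Microsoft": {"compute": 0.12, "storage": 0.15, "transfer": 0.15},
}


def monthly_cost(provider, vm_hours, storage_gb, transfer_gb):
    """Estimate one month's bill: VM-hours + GB stored + GB transferred."""
    p = PRICES[provider]
    total = (vm_hours * p["compute"]
             + storage_gb * p["storage"]
             + transfer_gb * p["transfer"])
    return round(total, 2)


# Hypothetical workload: 2 VMs running for 30 days, 50 GB stored, 100 GB transferred.
for name in PRICES:
    print(name, monthly_cost(name, vm_hours=2 * 24 * 30, storage_gb=50, transfer_gb=100))
```

Compute dominates the bill for this workload, which is why shutting down idle nodes matters more than trimming storage.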
Development
Practical Considerations
• Cloud development is slightly different
from the traditional in-house model.
• Everything is virtualized (most of the
time)
• Everything is distributed
• Per instance reliability is much lower
• Overall reliability is much higher
Cloud Programming Model
‣ Compute and Interface
nodes are not reliable,
they can crash and
disappear at any time.
‣ Storage and State are
reliable and heavily
distributed.
‣ At any time we can start
more compute or
interface nodes and shut
them down when
demand subsides.
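
The last point, starting and stopping nodes as demand changes, can be sketched as a simple scaling rule. This is only an illustration: the names `desired_nodes` and `scaling_action`, the backlog-based rule, and the node limits are all assumptions, not something any provider prescribes.

```python
import math


def desired_nodes(pending_jobs, jobs_per_node, min_nodes=1, max_nodes=20):
    """Return how many compute nodes the current backlog calls for."""
    needed = math.ceil(pending_jobs / jobs_per_node)
    # Clamp so we never run zero nodes and never exceed the budget cap.
    return max(min_nodes, min(max_nodes, needed))


def scaling_action(current_nodes, pending_jobs, jobs_per_node):
    """Decide whether to start nodes, stop nodes, or leave things alone."""
    delta = desired_nodes(pending_jobs, jobs_per_node) - current_nodes
    if delta > 0:
        return ("start", delta)
    if delta < 0:
        return ("stop", -delta)
    return ("hold", 0)


print(scaling_action(current_nodes=3, pending_jobs=95, jobs_per_node=10))
```

Because compute nodes hold no state of their own, stopping one loses nothing as long as the work it was doing lives in the reliable storage layer.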
Cloud Programming Model on Azure
‣ Compute: Worker Nodes
‣ State: Tables / Queues / SQL
‣ Storage: SQL / Tables / Blobs / Drives
‣ Client Interface: Web Nodes
Cloud Programming Model on AWS
‣ Compute: EC2 Instances
‣ State: S3 / Queues / SimpleDB / RDS
‣ Storage: S3 / SimpleDB / RDS
‣ Client Interface: S3 / EC2 / CloudFront
AWS Details: S3
‣ S3 = Simple Storage Service
‣ Guaranteed to be reliable
‣ Simple {Key, Value} storage
‣ Keys are stored within buckets
‣ Values can be as large as 5 GB
‣ Default Storage Mechanism for AWS
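
The bucket / key / value model maps directly onto two calls. The sketch below assumes the boto3 library (the current AWS SDK for Python), whose S3 client provides `put_object` and `get_object`; the helper names `save_value` and `load_value` and the bucket name in the usage comment are hypothetical.

```python
def save_value(client, bucket, key, data):
    """Store bytes under bucket/key. `client` is assumed to be a boto3 S3 client."""
    client.put_object(Bucket=bucket, Key=key, Body=data)


def load_value(client, bucket, key):
    """Fetch the bytes stored under bucket/key."""
    response = client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


# Usage sketch (needs boto3 installed and AWS credentials configured;
# the bucket name is made up and must already exist):
#   import boto3
#   s3 = boto3.client("s3")
#   save_value(s3, "my-example-bucket", "reports/2010.txt", b"hello")
#   print(load_value(s3, "my-example-bucket", "reports/2010.txt"))
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stand-in object when no AWS account is at hand.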
AWS Details: Simple DB
‣ Schemaless database
‣ Main storage unit is a domain (similar to a table)
‣ Each record can have many attributes; new attributes
can be added at any time
‣ Similar to LISP / Scheme attributes
‣ Can query domain for records containing particular
attribute
‣ No Joins / Unions with other domains
‣ Eventual Consistency
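
To illustrate the data model only, here is a toy in-memory stand-in for a domain. It is not the SimpleDB API: the `Domain` class and its methods are invented for this sketch, and it does not model eventual consistency.

```python
class Domain:
    """Toy model of a schemaless domain: records are bags of attributes."""

    def __init__(self, name):
        self.name = name
        self.items = {}  # record name -> {attribute name: set of values}

    def put(self, item_name, **attributes):
        """Add attributes to a record; no schema needs to be declared first."""
        record = self.items.setdefault(item_name, {})
        for attr, value in attributes.items():
            record.setdefault(attr, set()).add(value)

    def with_attribute(self, attr, value=None):
        """Names of records having `attr` (optionally with a given value)."""
        return sorted(
            name for name, record in self.items.items()
            if attr in record and (value is None or value in record[attr])
        )


books = Domain("books")
books.put("b1", title="SICP", language="Scheme")
books.put("b2", title="Dune")
books.put("b2", genre="scifi")          # new attribute added later
print(books.with_attribute("genre"))    # only b2 has a genre
```

Queries stay within one domain, matching the "no joins" restriction: anything that needs data from two domains has to be combined in application code.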
AWS Details: RDS
‣ RDS = Relational Database Service
‣ MySQL in cluster mode
‣ Preferred to simply running DB server within instance
(ask me why for details)
AWS Details: SQS
‣ SQS = Simple Queue Service
‣ Massively scalable
‣ Lets you put a message in the queue and retrieve it
later on
‣ Retrieving a message hides it from the other users
‣ When a message has been processed, it is deleted from
the queue
‣ If a message is not deleted before the timeout, it is
returned to the queue
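
The hide / delete / reappear cycle is what lets a crashed worker's job be picked up by another worker. Below is a small in-memory simulation of that behaviour. It is a toy, not the SQS API: the `ToyQueue` class is invented here, and the explicit `now` argument replaces a real clock so the example is deterministic.

```python
import itertools


class ToyQueue:
    """In-memory imitation of a queue with a visibility timeout."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}            # message id -> [body, hidden_until]
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0.0]
        return msg_id

    def receive(self, now):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Called by a worker once the message has been fully processed."""
        self._messages.pop(msg_id, None)


q = ToyQueue(visibility_timeout=30)
q.send("resize image 42")
first = q.receive(now=0)     # worker A takes the job
hidden = q.receive(now=10)   # worker B sees nothing while A is working
retry = q.receive(now=31)    # A never deleted it, so it is visible again
print(first, hidden, retry)
```

Since a message can be delivered more than once, the work done for each message should be safe to repeat.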
AWS Details: EC2
‣ EC2 = Elastic Compute Cloud
‣ Lets you run arbitrary virtual machines,
provided they are compatible with Amazon’s modified Xen
‣ Kernels and Startup Disks are stored in S3
‣ Also have large local storage
‣ Machines are not exactly like physical machines
‣ Local storage is not persistent
When machine is shut down all local data disappears.
‣ Hardware TCP
[No packet layer / No Broadcast ]
‣ Can launch many copies of the machine at the same
time
‣ Lots of preconfigured machines
AWS Details: Other Services
‣ EMR = Elastic Map Reduce
Lets you run Hadoop jobs on EC2
‣ CloudFront
Content Delivery Network
‣ ELB = Elastic Load Balancing
‣ EBS = Elastic Block Store
S3 backed persistent storage
‣ Public Data Sets - Lots of publicly available data
Census ( 1980 , 1990, 2000 ), Wikipedia logs,
Freebase dumps, Genetic and Chemistry data
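
For readers new to the MapReduce model that EMR and Hadoop run, here is a single-process sketch of the classic word count. It uses no Hadoop API at all; the function names are illustrative, and a real job would spread the map and reduce phases across many EC2 instances.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce: collapse the values for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


print(word_count(["the cloud", "The grid the cloud"]))
```

Each map call looks at one line and each reduce call at one key, which is what allows the framework to run them on separate machines.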
Starting Up
• Amazon Account
• Credentials: KeyID / SecretKey
• X509 Certificate
Helpful Tools
• S3Fox - Firefox extension for browsing
S3
• Elasticfox - Firefox extension for
operating EC2
• Transmit - Mac utility for S3 ($)
• RightScale - Web-based platform for
managing everything ( Free / $ )
Libraries
• Official Amazon Libraries (Java)
• Unofficial Libraries - .Net / Ruby / Perl
• AWS4C - C/C++/Objective C
• Boto - Very popular Python library
(official Hadoop/EC2 library)
Demo
Running Hadoop on EC2
Questions
????