Amazon Web Services (AWS)

Presenter: Daniel J. Schaak
History
AWS Services
Amazon EC2
Elastic Load Balancing
Amazon EBS
Amazon S3
Amazon EMR
Amazon RDS
Amazon SimpleDB
Amazon DynamoDB
AWS SDK
AWS Competitors
AWS Future
White paper definition:
“Amazon Web Services is a collection of remote computing
services that together make up a cloud computing platform,
offered over the Internet by Amazon.com.”
Can be used for almost anything imaginable.
Founded by Jeff Bezos
Incorporated (as Cadabra) in 1994
Amazon.com debuts on the web in 1995
Sold only books
Growth into new genres in coming years
2001 Q4 is first profitable quarter
Headquarters: Seattle, WA
117,300 employees
Current Forbes rankings
#6 Innovative Companies
#33 World's Most Valuable Brand
Originally developed for internal use
Chris Pinkham lead designer
Began building it in 2003
In 2005, offered the technology to limited customers under NDA
EC2 & S3 launched in 2006
Many new services and regions since then
Direct correlation between launch of AWS and
Amazon's growth
Image credit: Charles McLellan/ZDNet; Data: Amazon
Advantages to AWS
Flexible
Cost-Effective
Secure
Distinguished
Cost of operating
No minimum fee
Pay for what you use
As of the white paper issued in January 2014: 34 services in 6 primary service areas
Compute & Networking
Storage & Content Delivery Network
Database
Analytics
Application Services
Deployment & Management
Compute & Networking
Amazon Elastic Compute Cloud (EC2)
Auto Scaling
Elastic Load Balancing
Amazon WorkSpaces
Amazon Virtual Private Cloud (Amazon VPC)
Amazon Route 53
AWS Direct Connect
Storage & Content Delivery Network
Amazon Simple Storage Service (Amazon S3)
Amazon Glacier
Amazon Elastic Block Store (Amazon EBS)
AWS Storage Gateway
AWS Import/Export
Amazon CloudFront
Database
Amazon Relational Database Service (Amazon RDS)
Amazon DynamoDB
Amazon ElastiCache
Amazon Redshift
Amazon SimpleDB
Analytics
Amazon Elastic MapReduce (Amazon EMR)
Amazon Kinesis
AWS Data Pipeline
Application Services
Amazon AppStream
Amazon Simple Queue Service (Amazon SQS)
Amazon Simple Notification Service (Amazon SNS)
Amazon Simple Workflow Service (Amazon SWF)
Amazon Simple Email Service (Amazon SES)
Amazon CloudSearch
Amazon Elastic Transcoder
Deployment and Management
AWS Identity and Access Management (IAM)
AWS CloudTrail
Amazon CloudWatch
AWS Elastic Beanstalk
AWS CloudFormation
AWS OpsWorks
AWS CloudHSM
Elastic Compute Cloud
Simple to bring up new virtual machines
Many base images to choose from
Create custom images
Technology behind it
Amazon guards these details very closely
Article by Steven J. Vaughan-Nichols in March 2012
Huan Liu, Ph.D. in EE (Research Manager with Accenture)
454,400 servers
Believed that each runs a custom version of Red Hat Enterprise Linux
Xen hypervisor for VM hosting
Who Is Using It?
Elastic Load Balancing
Centralized management of encryption/decryption
Sticky sessions
Single point of contact for domain names
Amazon Elastic Block Store
Block-level storage volumes
Linked to a single EC2 instance at a time
Shows as a storage device on the VM
Persists independent of EC2 instance
Provides reliable storage
Built in redundancy (within a single availability zone)
Snapshots stored in S3 provide long-term data backups
Snapshots are incremental
Large data sets provided free for public use
1000 Genomes Project
Enron Email Data
Marvel Universe Social Graph
NASA NEX
Daily Global Weather Measurements from 1929 - 2009
Amazon Simple Storage Service
Stores data as "objects" in "buckets"
Object sizes can range from 1 byte to 5 terabytes
Buckets are containers for data objects
Single bucket can store unlimited number of objects
Access permissions can be granted on a per bucket basis
Redundant backups
Multiple devices in multiple facilities
Regular data checks
99.999999999% durability and 99.99% availability
Highly integrated
Data can be made publicly viewable
Versioning
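The durability and availability percentages above are easier to judge as concrete numbers. The following is a small worked calculation, not an AWS formula: it assumes the durability figure is an annual, per-object probability, that losses are independent, and that a year has 365 days.

```python
from fractions import Fraction

# Figures quoted on the slide, held as exact fractions to avoid float rounding.
DURABILITY = Fraction("99.999999999") / 100   # assumed: per object, per year
AVAILABILITY = Fraction("99.99") / 100        # fraction of the year the service is reachable


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year, assuming independent losses."""
    return object_count * (1 - DURABILITY)


def expected_downtime_minutes_per_year(days_per_year: int = 365) -> Fraction:
    """Minutes per year the service could be unavailable at the quoted availability."""
    return (1 - AVAILABILITY) * days_per_year * 24 * 60


lost = expected_objects_lost_per_year(10_000)
print("Years per expected lost object (10,000 stored):", float(1 / lost))
print("Downtime minutes per year:", float(expected_downtime_minutes_per_year()))
```

Under these assumptions, storing 10,000 objects corresponds to one expected loss per 10,000,000 years, and 99.99% availability allows roughly 52.56 minutes of downtime per year.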
Common use cases:
Backup and storage
Application or media hosting
Software delivery
Static website hosting
Who Is Using It?
Elastic MapReduce
Relies on EC2 and S3
Data & Processing code loaded into S3
Spins up cluster of EC2 instances
Results available in S3
Hadoop Ecosystem
Hive & Pig available
Who Is Using It?
Amazon Relational Database Service (Amazon RDS)
Provides access to traditional RDBMS engines
Oracle
Microsoft SQL Server
MySQL
PostgreSQL
Existing applications & tools work as is
Features
Automatic patching of database software
Automatic backups
Support for multi-zone deployments with fail-over
Storage Type Options
General Purpose (SSD)
Consistent 3 IOPS per GB
Supports bursts up to 3000 IOPS
Provisioned IOPS (SSD)
High performance storage for I/O intensive workloads
1000 - 30,000 IOPS per instance
Magnetic Storage
No guaranteed IOPS listed
Best suited for small workloads with minimal reads
Who Is Using It?
Amazon SimpleDB
Written in Erlang
Introduced in 2007
Data model and architecture
Document based NoSQL database
Consists of one or more named fields
Supports multiple domains of documents
Can be independently queried
Each domain may be stored on a different Amazon node
Configurable Consistency
Switches CAP properties between
AP (eventually consistent)
Recommended mode
DB is typically consistent within less than 1 sec
CP (consistent)
Geographically distributed replicas of data
Best suited for smaller applications requiring flexibility in queries
Logging is a good example of a use case
Limitations
Domains limited to 1 billion attributes (10GB)
Each document in a domain is limited to 256 attributes
Each attribute for a document is limited to 1024 bytes
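The per-document limits above can be checked before a write is attempted. This is a minimal sketch that uses only the figures from the slide; the function name is hypothetical, it treats each attribute as a single string value, and it reads the 1024-byte limit as applying to the UTF-8 encoded value.

```python
MAX_ATTRIBUTES_PER_ITEM = 256   # per-document attribute limit from the slide
MAX_ATTRIBUTE_BYTES = 1024      # per-attribute size limit from the slide


def find_limit_violations(item: dict[str, str]) -> list[str]:
    """Return readable problems; an empty list means the item fits the limits."""
    problems = []
    if len(item) > MAX_ATTRIBUTES_PER_ITEM:
        problems.append(
            f"{len(item)} attributes exceeds the limit of {MAX_ATTRIBUTES_PER_ITEM}"
        )
    for name, value in item.items():
        size = len(value.encode("utf-8"))  # measure bytes, not characters
        if size > MAX_ATTRIBUTE_BYTES:
            problems.append(
                f"attribute {name!r} is {size} bytes, limit is {MAX_ATTRIBUTE_BYTES}"
            )
    return problems
```

A log record such as `{"level": "INFO", "msg": "started"}` passes, while a 2000-character message would be reported and would need to be split or truncated by the application.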
Drawbacks
Manual partitioning of data
Typical max request capacity is under 25 writes/sec
Automatically indexes all item attributes
Provides query flexibility
Costs performance and scalability
Amazon DynamoDB
Fully managed NoSQL database service
Debuted in 2012
Combines best parts of Dynamo and SimpleDB
Dynamo was first NoSQL solution built by Amazon
Provided reliability, performance and scalability
Required management
Drove people toward simpler options
Advantages
Managed
Scalable
Fast
Durable & Highly Available
Flexible
Low Cost
Document based NoSQL database
Provides two types of keys for PK indexing
Simple Hash Key
Single attribute (PK)
Composite Hash Key w/ Range Key
Key contains two attributes
Hash attribute
Range attribute: used to return multiple data records within specified criteria
CAP theorem properties
AP by default
Consistency usually reached within a second
Can be CP on a per read basis
Focus on scalability and performance
Runs on solid state disks
No limits on request capacity or table size
400 KB limit on item size
Automatic partitioning
Does not index all attributes
Keeps read/write cost low
Updates only require updated PK index
Secondary indexes can be defined
Impacts performance
Latencies remain stable even as datasets grow
Provisioned throughput
Provides predictable (specifiable) performance
Allows a per table specification of throughput capacity
DynamoDB allocates resources sufficient to guarantee it
Reservations are elastic
Scaled up or down at any time
Management console
API
Measured in capacity units
Capacity Units
Measure of strongly consistent operations per second
Eventually consistent operations are twice as efficient
Read = 4KB per unit
Write = 1KB per unit
Rounds up to nearest unit
Local secondary indexes impact throughput
Each item has 100 bytes of additional overhead for indexing
Capacity Units - Read
Expected Item Size | Consistency           | Desired Reads Per Second | Provisioned Throughput Required
4 KB               | Strongly consistent   | 50                       | 50
8 KB               | Strongly consistent   | 50                       | 100
4 KB               | Eventually consistent | 50                       | 25
8 KB               | Eventually consistent | 50                       | 50
Image courtesy of: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithTables.html
Capacity Units - Write
Expected Item Size | Desired Writes Per Second | Provisioned Throughput Required
1 KB               | 50                        | 50
2 KB               | 50                        | 100
Image courtesy of: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithTables.html
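The read and write figures in the two tables above follow from the capacity unit rules (4 KB per read unit, 1 KB per write unit, round up, eventually consistent reads at half cost). The sketch below reproduces that arithmetic; the function names are illustrative and the 100-byte index overhead mentioned earlier is not modeled.

```python
import math

READ_UNIT_KB = 4    # one read capacity unit covers a read of up to 4 KB
WRITE_UNIT_KB = 1   # one write capacity unit covers a write of up to 1 KB


def read_capacity_units(item_kb: float, reads_per_sec: int,
                        strongly_consistent: bool = True) -> int:
    """Units to provision for reads; item size rounds up to the next 4 KB."""
    units_per_read = math.ceil(item_kb / READ_UNIT_KB)
    total = units_per_read * reads_per_sec
    # Eventually consistent reads are twice as efficient, so they need half the units.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_kb: float, writes_per_sec: int) -> int:
    """Units to provision for writes; item size rounds up to the next 1 KB."""
    units_per_write = math.ceil(item_kb / WRITE_UNIT_KB)
    return units_per_write * writes_per_sec
```

For example, `read_capacity_units(8, 50)` gives 100 and `read_capacity_units(8, 50, strongly_consistent=False)` gives 50, matching the table rows. An item of 3 KB is charged as a full 4 KB read because of the rounding rule.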
Automatically replicates data
Replicates to at least 3 different data centers
Replicates to multiple AWS availability zones
Replicates within a single AWS region
Ensures availability and durability
Who Is Using It?
Management Console
Command line utility
HTTP API
AWS SDK
Available for a wide variety of languages including:
Java, .NET, Ruby, Python, PHP, Node.js, Browser (JavaScript)
Android, iOS
Extensive library of developer documentation available
API documents
Developer guides
Reference videos
Case studies
Sample code
DynamoDB API
HTTP requests
Data passed in JSON
AWS SDK
Low-level API methods
Correspond closely to DynamoDB operations
Common across languages
Java and .NET provide object persistence
Map client side classes to DynamoDB tables
Ability to call object methods rather than low-level API
.NET provides document model
High level object model
Abstracts low-level operations into table and document objects
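To make "HTTP requests" with "data passed in JSON" concrete, here is a sketch of how the body and headers of a low-level GetItem call could be assembled. The table name, key name, and helper function are hypothetical; the request signing (the Authorization header) that a real call needs is deliberately left out, which is one reason the SDKs are normally used instead.

```python
import json


def build_get_item_request(table: str, key_name: str, key_value: str,
                           consistent_read: bool = False) -> tuple[dict, str]:
    """Return (headers, JSON body) for a GetItem call on a string hash key."""
    headers = {
        "Content-Type": "application/x-amz-json-1.0",
        # The target header names the API version and the operation.
        "X-Amz-Target": "DynamoDB_20120810.GetItem",
    }
    body = {
        "TableName": table,
        # Attribute values carry a type tag; "S" marks a string.
        "Key": {key_name: {"S": key_value}},
        # False keeps the default, eventually consistent read.
        "ConsistentRead": consistent_read,
    }
    return headers, json.dumps(body)


headers, body = build_get_item_request("Music", "Artist", "No One You Know")
print(body)
```

The SDK low-level methods correspond closely to this shape, while the object persistence and document models hide it behind classes.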
GetItem
Eventually consistent by default
Returns ALL attributes of an item
PutItem
Creates new item
Overwrites existing items
UpdateItem
Modifies existing items
Creates new item if necessary
Only need to specify attributes to be updated
DeleteItem
Batch Operations
BatchGetItem
Retrieve up to 1MB OR 100 items
Can retrieve from multiple tables
BatchWriteItem
Put or Delete multiple items
Up to 16 MB or 25 items
Can Put/Delete in multiple tables
Can NOT Update items
Invokes corresponding request for each item
Individual failed requests do not fail entire batch
Key and data returned for failed requests
ALL requests must fail for batch request to fail
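Because individual requests in a batch can fail without failing the whole batch, the caller is responsible for resending whatever comes back as failed. The sketch below shows one way that loop could look. `send_batch` is a hypothetical callable standing in for the real batch call, and the 25-item limit is the figure quoted above; a production version would usually also wait between rounds.

```python
from typing import Callable, Iterator

MAX_BATCH_ITEMS = 25  # per-request item limit quoted on the slide


def chunk(requests: list, size: int = MAX_BATCH_ITEMS) -> Iterator[list]:
    """Split a list of requests into batches of at most `size` items."""
    for start in range(0, len(requests), size):
        yield requests[start:start + size]


def write_all(requests: list, send_batch: Callable[[list], list],
              max_rounds: int = 5) -> list:
    """Send every request in batches, resending the ones reported as failed.

    `send_batch` takes one batch and returns the requests that failed.
    Returns whatever is still failing after `max_rounds` rounds.
    """
    pending = list(requests)
    for _ in range(max_rounds):
        if not pending:
            break
        failed = []
        for batch in chunk(pending):
            failed.extend(send_batch(batch))
        pending = failed
    return pending
```

Sixty requests would go out as batches of 25, 25, and 10, and only the failed subset is carried into the next round.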
Conditional Writes
Specify expected conditions
Must be met PRIOR to operation taking effect
Applicable to PutItem, UpdateItem, DeleteItem
Idempotent
Image courtesy of: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithItems.html
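The behavior of a conditional write, including why it is idempotent, can be illustrated with a toy in-memory table. This is a model of the semantics only, not the DynamoDB API: the class, method, and exception names are made up for the example.

```python
class ConditionFailed(Exception):
    """Raised when the expected condition is not met; nothing is written."""


class ToyTable:
    """In-memory stand-in for a table with a single hash key."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return dict(self._items.get(key, {}))

    def put(self, key, attributes):
        # Like PutItem: replaces the whole item.
        self._items[key] = dict(attributes)

    def update_if(self, key, expected, changes):
        """Apply `changes` only if every attribute in `expected` matches now."""
        current = self._items.get(key, {})
        for name, value in expected.items():
            if current.get(name) != value:
                raise ConditionFailed(
                    f"{name} is {current.get(name)!r}, expected {value!r}"
                )
        # The condition held before the write, so the update takes effect.
        self._items[key] = {**current, **changes}


table = ToyTable()
table.put("book-1", {"Price": 10})
table.update_if("book-1", {"Price": 10}, {"Price": 8})      # succeeds
try:
    table.update_if("book-1", {"Price": 10}, {"Price": 8})  # same request again
except ConditionFailed:
    pass  # the item is left unchanged
```

Sending the same conditional request a second time fails the check and leaves the price at 8, which is the sense in which the operation is idempotent.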
Advanced Topics
Atomic Counters
Projection expressions
Specify attributes
Substitution in expressions
Specify return data
UPDATE or DELETE
None by default
Query
Hash key required
Range key optional
Secondary Indexes
Eventually consistent by default
Returns all attributes by default
Always returns a result set
Results always sorted by range key
Ascending by default
Max return of 1MB
Scan
Examines EVERY item in table
Always eventually consistent
Returns all data attributes by default
Always returns result set
Max return of 1MB
Advanced topics
Filters
Apply to Query or Scan
Conditional expression to limit result set
Applied AFTER query or Scan completes
Paging
Limit
Parallel scans
Scans multiple segments simultaneously
Application managed
Dependent on throughput settings
Must be finely tuned
Advanced topics
Performance
Query more efficient than Scan
Scan always scans entire table
Query uses indexes to find range of keys
Filters can degrade performance
Impact on both Query and Scan performance
Applied after initial search operation completes
Use with caution
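The performance points above can be seen in a toy model that counts how many items are read before a filter is applied. Everything here is illustrative: the class is an in-memory stand-in, not the DynamoDB API, and `items_read` is a rough proxy for the work (and throughput) an operation consumes.

```python
class ToyKeyedTable:
    """In-memory model: items grouped by hash key, then by range key."""

    def __init__(self):
        self._partitions = {}   # hash key -> {range key -> item}
        self.items_read = 0     # items read before any filter is applied

    def put(self, hash_key, range_key, item):
        self._partitions.setdefault(hash_key, {})[range_key] = item

    def query(self, hash_key, low, high, item_filter=None):
        """Read one hash key only, range keys in [low, high], ascending."""
        partition = self._partitions.get(hash_key, {})
        matched = [partition[r] for r in sorted(partition) if low <= r <= high]
        return self._finish(matched, item_filter)

    def scan(self, item_filter=None):
        """Read every item in the table."""
        everything = [item for part in self._partitions.values()
                      for item in part.values()]
        return self._finish(everything, item_filter)

    def _finish(self, items, item_filter):
        self.items_read += len(items)   # the cost is counted here, before the filter
        if item_filter is None:
            return items
        return [i for i in items if item_filter(i)]  # the filter only trims the result
```

With two hash keys of ten items each, a Query over five range keys reads five items, while a Scan with the same filter reads all twenty and discards most of them afterwards.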
Image courtesy of: http://www.fool.com/investing/general/2014/06/05/heres-why-microsoft-corporation-is-the-biggest-aws.aspx
More Infrastructure
Increased server capacity
Building a ‘private’ cloud for CIA usage
Additional data centers
Launched Frankfurt, Germany on 10-24-14
11th region in the world
More Data
Consumers
Photos, music, other files
Companies
Adopting “cloud first” strategies
More Security
Providing new tools
"I see most of this as an opportunity, not as something that is really
bad. It's an opportunity to give customers tools to protect
themselves."
Werner Vogels (AWS CTO)
More Competition
"We've always said this is too good a business. It's not a
winner-take-all environment."
Werner Vogels (AWS CTO)
“There’s also plenty of room for growth in Amazon Web
Services. The server market is a $50 billion industry, and
that represents just one piece of the current
hardware/software ecosystem that Amazon Web Services
aims to replace. By contrast, Amazon Web Services
generates about $4 billion of annual revenue today.”
Courtesy of: http://www.fool.com/investing/general/2014/09/30/3-reasons-amazoncom-incs-stock-could-rise.aspx
http://www.fundinguniverse.com/company-histories/amazon-com-inc-history/
http://www.zdnet.com/in-pictures-the-rise-of-aws_p3-3040155324/#photo
http://www.zdnet.com/in-pictures-the-rise-of-aws-3040155324/#photo
http://www.forbes.com/companies/amazon/
http://media.amazonwebservices.com/AWS_Overview.pdf
http://www.zdnet.com/blog/open-source/amazon-ec2-cloud-is-made-up-of-almost-half-a-million-linux-servers/10620
http://www.rightscale.com/blog/cloud-industry-insights/amazons-elastic-block-store-explained
https://www.youtube.com/playlist?list=PLhr1KZpdzukcMmx04RbtWuQ0yYOp1vQi4
http://www.academia.edu/1254017/Data_consistency_properties_in_Amazon_SimpleDB_and_Amazon_S3
http://www.browniethoughts.com/2013/02/nosql-databases-key-value-and-document.html
http://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html
http://aws.amazon.com/tools/
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/
http://smashingboxes.com/ideas/heroku-vs-amazon-web-services
http://www.stackdriver.com/cassandra-aws-gce-rackspace/
http://www.rightscale.com/blog/cloud-cost-analysis/google-slashes-cloud-prices-google-vs-aws-price-comparison
http://www.theregister.co.uk/2014/07/26/amazon_aws_margin_decline/
http://www.geekwire.com/2014/amazon-web-services-expands-Europe-new-german-data-centers/
http://www.zdnet.com/aws-guru-werner-vogels-predicts-future-for-next-decade-in-the-cloud-7000030683/
http://www.fool.com/investing/general/2014/06/05/heres-why-microsoft-corporation-is-the-biggest-aws.aspx
http://www.fool.com/investing/general/2014/09/30/3-reasons-amazoncom-incs-stock-could-rise.aspx