Amazon EC2 - UW Courses Web Server

Amazon EC2
Andrew Chekerylla & Edward Kim
What is EC2?
Amazon Elastic Compute Cloud
 Infrastructure as a Service (IaaS)
 Allows customers to rent virtual computers by the hour. Customers
pay only for the time they use and receive a virtual server
instance in return.
Development
Team
 Amazon.com in Cape Town, South Africa
 Chris Pinkham, VP IT Infrastructure
 Christopher Brown, Design Architect
 Willem Van Biljon, Product Manager
Product
 Amazon.com Elastic Compute Cloud (EC2)
 Web service that provides scalable computing resources in the
cloud.
Development
Motivation
 Previous data center solutions required large financial
investment and presented cost inefficiencies when data
needs changed.
 Amazon saw an opportunity to provide scalable cloud
computing that avoided these costs.
 They could charge clients only for what they needed, using a
variable pricing model.
Development
Timeline
 March 2006: Filed initial patents
 August 2006: Public beta test with UNIX platforms
 October 2008: Production release with Windows Server platforms
 Since then: Added SQL Server, NetBSD and FreeBSD.
Development
 Product Features
 Elastic Compute Units (ECUs) for variable computing power
 Elastic Block Store (EBS) for network-based storage
 Xen Virtual Machines (VMs) for computing resources
 Elastic IP Addresses for user-controlled IP addresses
 CloudWatch for real-time dashboard of computing resource
utilization.
 Automated Scaling to automatically add or remove EC2
instances as needed.
 Availability Zones to ensure failure isolation between clusters.
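The Automated Scaling feature listed above can be illustrated with a small Python sketch. This is not Amazon's algorithm: the function name `desired_instances`, the CPU thresholds, and the instance limits are all hypothetical values chosen for illustration, standing in for a utilization metric such as one reported by CloudWatch.

```python
def desired_instances(current, cpu_percent, low=30.0, high=70.0,
                      minimum=1, maximum=10):
    """Return the instance count after one scaling decision.

    All thresholds and limits are assumed values, not AWS defaults.
    """
    if cpu_percent > high:
        target = current + 1      # scale out under heavy load
    elif cpu_percent < low:
        target = current - 1      # scale in when mostly idle
    else:
        target = current          # utilization is in the comfortable band
    # Never go below the minimum or above the maximum fleet size.
    return max(minimum, min(maximum, target))


print(desired_instances(2, 85.0))
print(desired_instances(2, 10.0))
```

A real policy would usually also add a cooldown period so that the fleet does not oscillate between sizes.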
Development
 Product Innovations
 Design details are proprietary information.
 However, initial patents are available and can be
downloaded.
 They are the closest glimpse into the core technology of
Amazon EC2
 Two patents filed in March 2006
 Managing execution of programs by multiple computing systems [1]
 Managing communications between computing nodes [2]
Patents [1]
March 2006: Managing execution of
programs by multiple computing systems
 Central program execution service for distributing jobs to
available computing resources.
 The service can discriminate resources by physical
proximity or by similar software state.
 Physical proximity allows for reduced latency since data
travels over a shorter distance.
 Similar software state allows for faster response since
copies of the program are already available and possibly
running.
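The selection criteria above can be sketched in Python. This is a minimal illustration and not the method claimed in the patent: the `Node` class, the `pick_node` function, the hop-count `distance` field, and the choice to rank software state ahead of proximity are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical computing system known to the execution service."""
    name: str
    distance: int                               # assumed network hops from the requester
    cached: set = field(default_factory=set)    # programs already stored locally
    busy: bool = False


def pick_node(nodes, program):
    """Pick a free node, preferring one that holds a copy, then the closest."""
    free = [n for n in nodes if not n.busy]
    if not free:
        return None
    # False sorts before True, so nodes with a local copy come first;
    # ties are broken by the shorter distance.
    return min(free, key=lambda n: (program not in n.cached, n.distance))


nodes = [
    Node("a", distance=1),
    Node("b", distance=3, cached={"render"}),
    Node("c", distance=2, cached={"render"}),
    Node("d", distance=1, cached={"render"}, busy=True),
]
print(pick_node(nodes, "render").name)
```

Node "d" is the closest node with a copy, but it is busy, so the sketch falls back to the next closest node that already stores the program.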
Patents [1] Network Diagram
 The next slide contains a network diagram from the original
patent.
 The diagram shows multiple computing systems exchanging
and running program copies.
Patents [1] Network Diagram
 Note that System Manager nodes 140 and 150 take
responsibility for managing computing resources by initiating
program exchange or execute requests.
Patents [1] Groups of Systems
 The next slide contains a picture of groups of computing
systems that can store and exchange program copies.
 The diagram shows several computing systems that have
different programs locally stored.
Patents [1] Groups of Systems
 Note that not all programs are distributed to all nodes, since
that would add needless transmission time overhead to
system performance.
Patents [1] Block Diagram
 The next slide contains a block diagram from the original
patent.
 The diagram shows how computing systems could manage
the execution of programs on other computing systems.
Patents [1] Block Diagram
 Note that the System Manager Computing System
and the Machine Manager Computing System are
indicated on previous slides as parts of the same
local network or cloud system.
 They each run a core routine that implements the
program exchange and execution events in a master-slave architecture.
Patents [1] Flow Diagram
 The next slide contains a partial flow diagram of the System
Manager Module Routine.
 This runs on the system manager.
 Note this is complemented by a Machine Manager Module
Routine running on each computing resource.
Patents [1] Flow Diagram
 Note the System Manager Module Routine is a large function
and has additional steps.
 It negotiates with the machine managers to provide program
copies as needed.
Patents [2]
March 2006: Managing
communications between computing
nodes
 Groups of computing nodes use access policies to manage
communication between virtual machines.
 Authorization can be dynamically negotiated and stored so that
future transmissions are authorized automatically.
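A rough Python sketch of negotiated and remembered authorization follows. The `CommunicationManager` class, its group-pair policy format, and the `negotiations` counter are hypothetical names invented for this example; the patent does not prescribe this structure.

```python
class CommunicationManager:
    """Hypothetical manager that authorizes traffic between node groups."""

    def __init__(self, policies):
        # policies: (source_group, destination_group) pairs that are allowed
        self.policies = set(policies)
        self.cache = {}          # (src_node, dst_node) -> remembered decision
        self.negotiations = 0    # how often the policy was actually consulted

    def is_allowed(self, src, dst, group_of):
        key = (src, dst)
        if key not in self.cache:
            # First transmission between this pair: negotiate and remember.
            self.negotiations += 1
            self.cache[key] = (group_of[src], group_of[dst]) in self.policies
        return self.cache[key]


group_of = {"vm1": "web", "vm2": "db", "vm3": "batch"}
manager = CommunicationManager({("web", "db")})
print(manager.is_allowed("vm1", "vm2", group_of))   # consults the policy
print(manager.is_allowed("vm1", "vm2", group_of))   # served from the cache
print(manager.is_allowed("vm3", "vm2", group_of))   # no policy allows batch -> db
```

Only the first transmission between a pair of nodes pays the cost of the policy lookup; later ones reuse the stored decision.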
Job Management
Patent [1] describes a master-slave
architecture between System Manager
computing systems and Machine Manager
computing systems.
Fault Tolerance
Patent [1] describes how multiple
program instances can be replicated
on machines in different Availability
Zones, to protect against network
outages.
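The replication idea can be sketched as a simple placement routine in Python. The round-robin strategy, the function names, and the zone labels are assumptions made for illustration; the patent and EC2 may place replicas differently.

```python
from itertools import cycle


def place_replicas(zones, replicas):
    """Spread program replicas across zones round-robin (assumed strategy)."""
    if not zones:
        raise ValueError("at least one zone is required")
    placement = {zone: 0 for zone in zones}
    for _, zone in zip(range(replicas), cycle(zones)):
        placement[zone] += 1
    return placement


def surviving(placement, failed_zone):
    """Count replicas still running after one zone becomes unreachable."""
    return sum(count for zone, count in placement.items() if zone != failed_zone)


placement = place_replicas(["zone-a", "zone-b", "zone-c"], 4)
print(placement)
print(surviving(placement, "zone-a"))
```

Because no single zone holds every replica, losing any one zone still leaves instances running elsewhere.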
EC2 Layers
EC2 Diagram
XEN Hypervisor
 Basic abstraction layer of software that sits directly on the
hardware below any operating systems.
 Responsible for CPU scheduling and memory partitioning of
the various virtual machines running on the hardware device.
 Controls the execution of virtual machines as they share the
common processing environment.
 No knowledge of networking, external storage devices, video,
or any other common I/O functions found on a computing
system.
Virtualization Specifications
 Xen Hypervisor for virtualization
 Provides services that allow multiple computer operating systems
to execute on the same computer hardware
 Hardware specifications are tailored to the needs of the user
 Storage, Computing, Memory, Graphics
 Why did Amazon choose Xen?
Virtualization
 Paravirtual
 Paravirtual AMIs boot with a special boot loader called PV-GRUB, which starts the boot cycle and then chain-loads the
kernel specified in the menu.lst file on your image
 Hardware Virtual Machine
 Unlike PV guests, HVM guests can take advantage of hardware
extensions that provide fast access to the underlying hardware
on the host system
 Allows the user to run an operating system directly on top of a virtual
machine without any modification, as if it were running on the bare-metal hardware.
EC2 Instances
Security
 Key pairs are used to authenticate when you log in to the
instance.
 Can use security groups for more protection
 Contained in your own Virtual Private Cloud (VPC)
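How a security group filters inbound traffic can be sketched in Python. The rule dictionaries and the `is_permitted` function are a hypothetical representation for this example, not the AWS API; the one behavior taken as given is that inbound traffic is denied unless a rule allows it.

```python
import ipaddress


def is_permitted(rules, protocol, port, source_ip):
    """Return True if any inbound rule matches; otherwise deny by default."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule["protocol"] == protocol
                and rule["from_port"] <= port <= rule["to_port"]
                and address in ipaddress.ip_network(rule["cidr"])):
            return True
    return False


# Example rules (assumed): SSH from one office range, HTTPS from anywhere.
rules = [
    {"protocol": "tcp", "from_port": 22, "to_port": 22, "cidr": "203.0.113.0/24"},
    {"protocol": "tcp", "from_port": 443, "to_port": 443, "cidr": "0.0.0.0/0"},
]
print(is_permitted(rules, "tcp", 22, "203.0.113.7"))
print(is_permitted(rules, "tcp", 22, "198.51.100.1"))
```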
Competitors
 Microsoft Azure
 Google Compute Engine
 GoGrid
 Rackspace
 Storm
 Voxel
 Linode VPS
 Joyent
…
Benefits
 Less downtime setting up new servers
 Highly Scalable
 High Availability (over 99%)
 Saves a lot of money
 Costs of upfront hardware
 Costs of leasing the space for the data center
 Operational overhead
 Easy to perform software updates or major upgrades
 Who would benefit most from this service?
Benefits
How/Why is it used?
Availability
 US East (N. Virginia)
 US West (Oregon, Northern California)
 Asia Pacific (Tokyo, Sydney)
 Europe (Ireland, Frankfurt)
 South America (Sao Paulo)
 AWS GovCloud (US)
 Benefits of breaking down into regions?
 Network transfer distance
 Options for backup servers in different regions
Cloud Computing for Job
Management
 What does this mean for parallel computing?
 In what ways can we utilize this capability to handle large
amounts of data?
 Amazon Elastic Map Reduce (EMR)
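The MapReduce model behind EMR can be shown with a single-process word count in Python. This sketch only illustrates the map, shuffle, and reduce phases; it does not use EMR or Hadoop, and all function names are chosen for the example. On a cluster, each phase would run in parallel across many instances.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["EC2 scales", "ec2 stores"]))
```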
Storage
 Amazon EC2 uses two different kinds of storage. One is local
storage, known as Instance Storage, which is non-persistent:
its data is lost when the instance terminates. The other
kind is persistent, network-based storage called Elastic Block
Store (EBS), which can be attached to running instances or
used as a persistent boot medium.
 Instance Storage
 EBS
Elastic Block Store
 Provides raw data blocks that can be attached to EC2
instances. (Essentially works as network drives)
 Can be backed up and restored to another instance when
failures occur on the current instance
EBS Pros / Cons
 Good for elasticity
 Built in redundancy
 Poor I/O rates on EBS volumes
 More costs involved
 S3 storage space
 IOPS
Instance Storage
Network
 Elastic IP Address
 The address belongs to the account it was created on and not to an
instance. It will exist even if the instance is deleted.
 IP addresses cannot be used outside the Amazon
environment. Customers must use the FQDN provided by
Amazon to access their systems.
 Instances within the environment can communicate with the
IP addresses.
 Control what goes in/out of your VPC using Network
Address Translation (NAT)
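The ownership rule for Elastic IP addresses can be modeled in a few lines of Python. The `Account` class and its methods are hypothetical and do not correspond to AWS API calls; the addresses and instance ids are made-up example values.

```python
class Account:
    """Hypothetical model of an account that owns Elastic IP addresses."""

    def __init__(self):
        self.elastic_ips = {}    # address -> attached instance id, or None

    def allocate(self, address):
        self.elastic_ips[address] = None

    def associate(self, address, instance_id):
        if address not in self.elastic_ips:
            raise KeyError(f"{address} is not allocated to this account")
        self.elastic_ips[address] = instance_id

    def terminate_instance(self, instance_id):
        # The address is detached but stays allocated to the account.
        for address, attached in self.elastic_ips.items():
            if attached == instance_id:
                self.elastic_ips[address] = None


account = Account()
account.allocate("198.51.100.10")
account.associate("198.51.100.10", "i-old")
account.terminate_instance("i-old")
account.associate("198.51.100.10", "i-new")   # same address, replacement instance
print(account.elastic_ips)
```

Because the address outlives the instance, a failed server can be replaced without changing the address that clients use.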
Elasticity
 Things to think about when choosing your type of instance
 VPC vs Classic
 IP Address
 Data Persistence
Types of Instances
 Free Tier
 Use AWS instances for up to 12 months (minimal performance)
 On-Demand
 Set up and tear down whenever you need to
 Reserved
 Pay up front for servers with contracts
 Spot
 Bid for unused capacity, but no control over when it’s terminated
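The trade-off between On-Demand and Reserved pricing can be worked through with a short Python sketch. Every price below is a made-up number for illustration, not an actual AWS rate, and the function names are chosen for the example.

```python
def on_demand_cost(hours, hourly_rate):
    """Total cost when paying only for the hours used."""
    return hours * hourly_rate


def reserved_cost(hours, upfront_fee, hourly_rate):
    """Total cost with an upfront payment and a discounted hourly rate."""
    return upfront_fee + hours * hourly_rate


def break_even_hours(on_demand_rate, upfront_fee, reserved_rate):
    """Hours of use at which a reserved instance starts to be cheaper."""
    if on_demand_rate <= reserved_rate:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront_fee / (on_demand_rate - reserved_rate)


# Hypothetical prices: $0.50/hour on demand, or $250 upfront plus $0.25/hour.
print(break_even_hours(0.50, 250.0, 0.25))
print(on_demand_cost(1000, 0.50), reserved_cost(1000, 250.0, 0.25))
```

With these assumed prices the two options cost the same at 1000 hours, so a server expected to run longer than that favors the reserved contract.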
Costs (On Demand)
Why did Amazon choose this
method of charging customers?
• Compute
• Storage
• Network IOPS
What others are saying about EC2
 Seldo from awe.sm had some issues with the service
 Whole-zone failure patterns
 Lifecycle of virtual systems
 Costs to have multi-zone redundancy
 EBS
Leaked Information
 How about some detailed info on the xen setup? Do they silo the instances?
(E.g. Have like sized instances run the same machine).
 hardware nodes (HN) runs a copy of Amazon Linux, which has several internal
flavors.
 Each HN is silo'd like you say. So, if you're running m1.xl, you'll be sharing with only
other m1.xl's
 Once your server is in a slot, it get's that internal IP address and an EIP is NAT'd to
that internal IP
 Is it really possible to push more than 1Gbit on the larger Amazon EC2
instances? I've heard that the larger (4GB+?) instances are on different
nodes which are connected by 10G.
 You're drifting more into the EC2 Development Team realm, butttt, from what I know
it works like this. In any typical Linux application you have a runq and an io elevator.
Prioritization of various pieces are included in the Kernel. So, in the case of
networking, the networking get's higher io elevator priority because it also carries
EBS. This higher priority directly affects the runq, ensuring that you get a two for one
increase. Both in storage performance and network performance, since it all runs
over the same nic.
Summary
 One of the first major IaaS offerings
 Everything within EC2 has a cost to it
 Still there are a lot of reasons why companies use EC2
Sources
1. Awe.sm: http://blog.awe.sm/2012/12/18/aws-the-good-the-bad-and-the-ugly/#~p5i4KuJAFmwJnv
2. Wikipedia: http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud
3. Xenproject: http://www-archive.xenproject.org/files/Marketing/HowDoesXenWork.pdf
4. Amazon AWS: http://aws.amazon.com/ec2/
5. Masterclass Webinar: https://www.youtube.com/watch?v=TORzO9Oc9oU
6. Rightscale: http://www.rightscale.com/blog/cloud-industry-insights/amazons-elastic-block-store-explained
7. Chris Pinkham Patent #1 in 2006: https://www.google.com/patents/US8190682
8. Chris Pinkham Patent #2 in 2006: https://www.google.com/patents/US7801128
9. Amazon EMR: https://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is-emr.html
10. Reddit, Ask Me Anything (ex Amazon AWS engineer): http://www.reddit.com/r/IAmA/comments/1e5o4p/iaman_exaws_engineer_ask_me_anything_about_the/
11. PCMag: http://www.pcmag.com/article2/0,2817,2458757,00.asp