Cloud computing Basic concepts and cloud infrastructure

Cloud computing
Basic concepts and cloud infrastructure
Dan C. Marinescu
Computer Science Division, EECS Department
University of Central Florida
Email: dcm@cs.ucf.edu
Contents

Basic concepts
 Ethical issues in cloud computing
 Cloud diversity and vendor lock-in
 Cloud computing paradigms and services

Cloud infrastructure and applications
 AWS - the Amazon Web Services
 Open-source platforms for cloud computing
 How to use the AWS
 Case study 1 - a cloud service for trust management in cognitive radio networks
 Case study 2 - adaptive data streaming from a cloud
This lecture is available online at
http://www.cs.ucf.edu/~dcm/Chile2012/ChileIndex.html
7/26/2016
UTFSM - May-June 2012
The main features of cloud computing






Uses Internet technologies to offer scalable and elastic services; the term
"elastic computing" refers to the ability to dynamically acquire
computing resources and to support a variable workload.
The resources used for these services can be metered, and the users are
charged only for the resources they use.
Maintenance and security are ensured by the service providers.
The service providers can operate more efficiently due to specialization
and centralization.
Cloud computing is cost-effective because of the multiplexing of
resources; lower costs for the service provider are passed on to the cloud users.
The application data is stored closer to the site where it is used, in a
device- and location-independent manner; potentially, this data storage
strategy increases reliability and security and lowers
communication costs.
Types of clouds




Private Cloud - the infrastructure is operated solely for an organization; it
may be managed by the organization or a third party.
Community Cloud - the infrastructure is shared by several organizations
and supports a specific community that has shared concerns (e.g.,
mission, security requirements, policy, and compliance considerations).
Public Cloud - the infrastructure is made available to the general public
or a large industry group and is owned by an organization selling cloud
services, e.g., AWS - Amazon Web Services.
Hybrid Cloud - the infrastructure is a composition of two or more clouds
(private, community, or public) that remain unique entities but are bound
together by standardized or proprietary technology that enables data and
application portability.
Ethical issues in cloud computing

Cloud computing is based on a paradigm shift with profound implications for
computing ethics. The main elements of this shift are:
1. the control is relinquished to third-party services;
2. the data is stored on multiple sites administered by several organizations; and
3. multiple services interoperate across the network.

Systems can span the boundaries of multiple organizations and cross the
security borders; not only does the border of the organization's IT infrastructure
blur, but the border of accountability also becomes less clear.
The complex structure of cloud services can make it difficult to determine
who is responsible in case something undesirable happens. In a complex
chain of events or systems, many entities contribute to an action with
undesirable consequences, and some of them have the opportunity to prevent
these consequences; therefore, no single entity can be held responsible.
Vendor lock-in

When a large organization relies solely on a single cloud provider, several
risks are involved:

 Cloud services may be unavailable for a short or even an extended period of
time; such an interruption of service is likely to negatively impact the
organization and possibly diminish, or cancel completely, the benefits of utility
computing for that organization.
 The potential for permanent data loss in case of a catastrophic system failure
poses an equally great danger.
 The single vendor may decide to increase the prices for service and charge
more for computing cycles, memory, storage space, and network bandwidth
than other cloud service providers. The alternative in this case is switching to
another provider; unfortunately, this solution could be very costly due to the
large volume of data to be transferred from the old to the new provider.
Transferring terabytes or possibly petabytes of data over the network takes a
fairly long time, incurs substantial charges for the bandwidth used, and requires
substantial manpower.

Solution: replicate the data to multiple cloud service providers.
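
To get a feel for why moving terabytes or petabytes between providers is slow, the following back-of-the-envelope sketch computes an idealized transfer time. The 1 Gbit/s link speed and the decimal unit sizes are assumptions chosen for illustration, not figures from the lecture; real transfers are slower because of protocol overhead and shared links.

```python
def transfer_time_days(size_bytes: float, bandwidth_bits_per_s: float,
                       efficiency: float = 1.0) -> float:
    """Idealized time, in days, to move size_bytes over a network link.

    efficiency is the assumed fraction of the nominal bandwidth that is
    actually usable (1.0 means a perfect, dedicated link).
    """
    if bandwidth_bits_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    seconds = size_bytes * 8 / (bandwidth_bits_per_s * efficiency)
    return seconds / 86_400  # seconds per day


TB = 10**12    # decimal terabyte (assumption)
PB = 10**15    # decimal petabyte (assumption)
GBPS = 10**9   # 1 Gbit/s link (assumption)

one_tb_days = transfer_time_days(1 * TB, GBPS)
one_pb_days = transfer_time_days(1 * PB, GBPS)
print(f"1 TB: {one_tb_days * 24:.1f} hours, 1 PB: {one_pb_days:.1f} days")
```

Under these assumptions a terabyte moves in a few hours, while a petabyte occupies the link for about three months, which is why replicating data across providers ahead of time is attractive.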
A (3,4) RAID-5 configuration, where
individual blocks are striped over three
disks and a parity block is added; the
parity block is constructed by XOR-ing
the data blocks, e.g.,
aP = a1 XOR a2 XOR a3.
The parity blocks are distributed among
the 4 disks: aP is on disk 4, bP on disk
3, cP on disk 2, and dP on disk 1.
A system which stripes data across four
clouds; the proxy provides transparent
access to data.
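
The parity construction aP = a1 XOR a2 XOR a3 can be tried out with a minimal sketch. The block contents and the helper name `xor_blocks` are made up for illustration; the point is only that any single lost block, whether on a disk or on one of the clouds, can be rebuilt by XOR-ing the surviving blocks.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks of one stripe (hypothetical contents).
a1, a2, a3 = b"clou", b"d-da", b"ta!!"

# Parity block, as in aP = a1 XOR a2 XOR a3.
aP = xor_blocks(a1, a2, a3)

# Suppose the store holding a2 fails: rebuild it from the other three blocks.
recovered_a2 = xor_blocks(a1, a3, aP)
print(recovered_a2 == a2)
```

The same property holds for every block of the stripe, so the scheme tolerates the loss of any one of the four stores.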
Software as a Service - SaaS




Applications supplied by the service provider in a cloud infrastructure.
Accessible through a thin client interface, e.g., a browser.
The user does not manage or control the underlying cloud infrastructure
including network, servers, operating systems, storage, or even individual
application capabilities.
Examples of SaaS services offered by Google:






Gmail - email stored on a Google cloud.
Google Calendar - a browser-based scheduler.
Google Groups - host discussion forums; create messages online or via email.
Google Co-op - create customized search engines based on a set of facets or
categories.
Google Base - load structured data from different sources to a central
repository; each item follows a simple schema: (item type, attribute).
Google Drive - an online service for data storage available since April 2012.
Platform as a Service (PaaS)




Provides facilities to deploy consumer-created or acquired applications
using programming languages and tools supported by the provider.
The consumer does not manage or control the underlying cloud
infrastructure including network, servers, operating systems, or storage,
but has control over the deployed applications and, possibly, application
hosting environment configurations. Such services include: session
management, device integration, sandboxes, instrumentation and testing,
content management, and knowledge management.
Major application areas are in software development, when multiple
developers and users collaborate and the deployment and testing
services should be automated.
PaaS is not particularly useful when:

 the application must be portable,
 proprietary programming languages are used, or
 the underlying hardware and software must be customized to improve the
performance of the application.
Infrastructure as a Service (IaaS)




Provides processing, storage, networks, and other fundamental
computing resources; the consumer is able to deploy and run arbitrary
software, which can include operating systems and applications.
The consumer does not manage or control the underlying cloud
infrastructure but has control over operating systems, storage, deployed
applications, and possibly limited control of select networking components
(e.g., host firewalls).
Services offered by this paradigm include: server hosting, web servers,
storage, computing hardware, operating systems, virtual instances, load
balancing, Internet access, and bandwidth provisioning.
Examples: AWS – Amazon Web Services, including:
 EC2 – Elastic Compute Cloud
 S3 – Simple Storage Service

AWS - Amazon Web Services


Amazon is a pioneer in IaaS.
The infrastructure offers a palette of services available through the AWS
Management Console:

 EC2 - Elastic Compute Cloud
 S3 - Simple Storage Service
 EBS - Elastic Block Store
 SDB - SimpleDB
 SQS - Simple Queue Service
 CW - CloudWatch
 VPC - Virtual Private Cloud
 Auto Scaling
EC2





Is a Web service with a simple interface for launching instances of an
application under several operating systems: several Linux distributions,
Windows Server 2003 and 2008, OpenSolaris, FreeBSD, and NetBSD.
Allows a user to load instances of an application with a custom application
environment, manage network access permissions, and run the images
using as many or as few systems as desired.
Allows the import of virtual machine images from the user environment to an
instance through a facility called VM import.
EC2 instances boot from an AMI (Amazon Machine Image) digitally signed
and stored in S3; one could use the few images provided by Amazon or
customize an image and store it in S3.
An EC2 instance is characterized by the amount of resources it provides:

 VC (Virtual Computers) – virtual systems running the instance
 CU (Compute Units) – measure the computing power of each virtual system
 Memory
 I/O capabilities
EC2 instances

EC2 offers several instance types:

Standard instances: micro (StdM), small (StdS), large (StdL), extra large
(StdXL); small is the default.
 High memory instances: high-memory extra large (HmXL), high-memory
double extra large (Hm2XL), and high-memory quadruple extra large
(Hm4XL).
 High CPU instances: high-CPU extra large (HcpuXL).
 Cluster computing: cluster computing quadruple extra large (Cl4XL).

A main attraction of Amazon cloud computing is its low cost; e.g.,
0.007 cents/hour for StdM.
S3 – Simple Storage Service





S3 is a storage service with a minimal set of functions: write, read, and
delete; it does not support primitives to copy, to rename, or to move an
object from one bucket to another. The object names are global.
It is designed to store large objects; an application can handle an
unlimited number of objects ranging in size from 1 byte to 5 TB.
An object is stored in a bucket and retrieved via a unique, developer-assigned
key; a bucket can be stored in a Region selected by the user.
S3 maintains for each object: the name, modification time, an access
control list, and up to 4 KB of user-defined metadata.
Authentication mechanisms ensure that data is kept secure; objects can
be made public, and rights can be granted to other users.
S3 computes the MD5 of every object written and returns it in a field called
ETag. A user is expected to compute the MD5 of an object stored or
written and compare this with the ETag; if the two values do not match,
then the object was corrupted during transmission or storage.
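
The integrity check described above can be sketched with Python's standard `hashlib` module. No call to S3 is made here: the ETag value is simulated locally, and the helper names `md5_hex` and `matches_etag` are hypothetical. The sketch also assumes the ETag arrives as a hexadecimal MD5 digest, possibly wrapped in double quotes.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hexadecimal MD5 digest of an object's bytes."""
    return hashlib.md5(data).hexdigest()


def matches_etag(data: bytes, etag: str) -> bool:
    """Compare the locally computed MD5 with an ETag string.

    Assumption: the ETag is the hex MD5 of the object, optionally
    surrounded by double quotes and whitespace.
    """
    return md5_hex(data) == etag.strip().strip('"').lower()


payload = b"hello, cloud"

# Stand-in for the ETag the storage service would return after a write.
etag_from_service = '"' + md5_hex(payload) + '"'

intact = matches_etag(payload, etag_from_service)
corrupted = matches_etag(payload + b"!", etag_from_service)
print(intact, corrupted)
```

A mismatch signals that the bytes read back differ from the bytes that were hashed when the object was written.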
Elastic Block Store (EBS)





EBS provides persistent block-level storage volumes for use with EC2
instances. A volume appears to an application as a raw, unformatted, and
reliable physical disk; the size of a storage volume ranges from 1 GB to 1 TB.
The volumes are grouped together in Availability Zones and are
automatically replicated in each zone.
The storage strategy provided by EBS is suitable for database applications,
file systems, and applications using raw data devices.
An EC2 instance may mount multiple volumes, but a volume cannot be
shared among multiple instances.
EBS supports the creation of snapshots of the volumes attached to an
instance; the snapshots can then be used to restart an instance.
SimpleDB





SimpleDB is a non-relational data store that allows developers to store
and query data items via Web services requests.
It supports store and query functions traditionally provided only by
relational databases.
It creates multiple geographically distributed copies of each data item.
It supports high performance Web applications.
It automatically manages:

 the infrastructure provisioning,
 hardware and software maintenance,
 replication and indexing of data items, and
 performance tuning.
Simple Queue Service (SQS)






SQS is a hosted message queue; it supports automated workflows.
Multiple Amazon EC2 instances can coordinate their activities by
sending and receiving SQS messages. Any computer connected to the
Internet can add or read messages without any installed software or
special firewall configurations.
Applications using SQS can run independently and asynchronously, and
do not need to be developed with the same technologies.
A received message is "locked" during processing; if processing fails,
the lock expires and the message is available again.
Developers can access SQS through standards-based SOAP and Query
interfaces.
Queues can be shared with other AWS accounts and anonymously;
queue sharing can also be restricted by IP address and time of day.
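
The "lock during processing" behavior can be illustrated with a small in-memory model. This is not the SQS API: the class `LockingQueue`, its methods, and the 30-second lock duration are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class LockingQueue:
    """Toy queue in which a received message is hidden until its lock expires."""

    def __init__(self, lock_seconds: float):
        self.lock_seconds = lock_seconds
        self._bodies = {}        # message id -> body
        self._locked_until = {}  # message id -> time at which the lock expires
        self._next_id = 0

    def send(self, body: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._bodies[msg_id] = body
        self._locked_until[msg_id] = float("-inf")  # not locked yet
        return msg_id

    def receive(self, now: float):
        """Return (id, body) of the first unlocked message and lock it."""
        for msg_id, body in self._bodies.items():
            if self._locked_until[msg_id] <= now:
                self._locked_until[msg_id] = now + self.lock_seconds
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        """Called by a consumer after successful processing."""
        self._bodies.pop(msg_id, None)
        self._locked_until.pop(msg_id, None)


queue = LockingQueue(lock_seconds=30)
queue.send("resize image 17")

first = queue.receive(now=0)    # consumer A gets the message and locks it
hidden = queue.receive(now=10)  # consumer B sees nothing while it is locked
retry = queue.receive(now=31)   # A failed; the lock expired, so B gets it
queue.delete(retry[0])          # B succeeds and removes the message
empty = queue.receive(now=100)
```

Because a message is only removed by an explicit delete, a consumer that crashes mid-task does not lose work; the message simply reappears after the lock expires.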
CloudWatch



CloudWatch is a monitoring infrastructure used by application developers,
users, and system administrators to collect and track metrics important for
optimizing the performance of applications and for increasing the
efficiency of resource utilization.
Without installing any software a user can monitor either seven or eight
pre-selected metrics and then view graphs and statistics for these metrics.
When launching an Amazon Machine Image (AMI) the user can start the
CloudWatch and specify the type of monitoring:

 Basic Monitoring is free of charge and collects data at five-minute intervals for
up to seven metrics.
 Detailed Monitoring is subject to a charge and collects data at one-minute
intervals. This service can also be used to monitor the latency of access to EBS
volumes, the available storage space for RDS DB instances, the number of
messages in SQS, and other parameters of interest for applications.
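
The practical difference between five-minute and one-minute collection can be illustrated by grouping the same samples into periods of different length. The sketch below is plain Python, not the CloudWatch API; the function `aggregate` and the CPU utilization samples are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate(samples, period_seconds):
    """Group (timestamp_seconds, value) samples into fixed-length periods.

    Returns {period_start: {"avg": ..., "max": ..., "count": ...}}.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        start = (timestamp // period_seconds) * period_seconds
        buckets[start].append(value)
    return {
        start: {"avg": mean(values), "max": max(values), "count": len(values)}
        for start, values in sorted(buckets.items())
    }


# Hypothetical CPU utilization samples: (seconds since start, percent).
cpu = [(0, 10.0), (60, 20.0), (120, 30.0), (300, 50.0), (360, 70.0)]

five_minute = aggregate(cpu, 300)  # granularity of basic monitoring
one_minute = aggregate(cpu, 60)    # granularity of detailed monitoring
print(five_minute)
print(len(one_minute), "one-minute data points")
```

With the coarser period, the jump from 10% to 30% within the first five minutes is reduced to a single average; the finer period keeps each sample as its own data point.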
Application development using AWS





To access AWS one must first create an account at
http://aws.amazon.com/.
Once the account is created, the AWS Management Console
allows the user to select one of the services, e.g., EC2, S3, and so on.
Several operating systems are supported, including: Amazon Linux, CentOS,
Debian, Fedora, OpenSolaris, openSUSE, Red Hat, Ubuntu,
Windows, and SUSE Linux.
Then create an AMI (Amazon Machine Image) on one of the platforms
supported by AWS and start an instance using the RunInstances API. An
AMI is a unit of deployment; it is an environment including all information
necessary to set up and boot an instance.
The local instance store persists only for the duration of an instance; the
data will persist if an instance is started using Amazon EBS (Elastic
Block Store), and the instance can then be restarted at a later time.
Instance management

Once an instance has been created, the user can perform several actions:

 connect to the instance,
 launch more instances identical to the current one,
 create an EBS AMI, or
 terminate, reboot, or stop the instance.

The Network & Security panel allows the creation of Security Groups,
Elastic IP addresses, Placement Groups, Load Balancers and Key Pairs
while the EBS panel allows the specification of volumes and the creation
of snapshots.
Connecting clients and server through firewalls




Typically, the local area network (LAN) of an organization is connected
to the Internet via a router.
A router firewall often hides the true address of hosts in the local
network using the network address translation (NAT) mechanism. The
hosts behind a firewall are assigned addresses in a "private address
range", and the router uses the NAT tables to filter the incoming traffic
and translate external IP addresses to private ones.
The mapping between the pair (external address, external port) and the
pair (internal address, internal port) carried out by the network address
translation function of the router firewall is also called a pinhole.
If one tests a client-server application with the client and the server in
the same local area network the packets do not cross a router; once a
client from a different LAN attempts to use the service, the packets may
be discarded by the firewall of the router. The application may no
longer work if the router is not properly configured.
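
The pinhole idea, a mapping from (external address, external port) to (internal address, internal port), can be modeled as a lookup table. This is a simplified sketch, not router firmware: the class `NatTable` and all addresses and ports are made up, with the addresses taken from ranges reserved for documentation and private use.

```python
class NatTable:
    """Toy NAT table: incoming packets pass only through an open pinhole."""

    def __init__(self):
        # (external address, external port) -> (internal address, internal port)
        self._pinholes = {}

    def open_pinhole(self, external, internal):
        self._pinholes[external] = internal

    def route_incoming(self, external):
        """Return the internal destination, or None if the packet is discarded."""
        return self._pinholes.get(external)


nat = NatTable()

# Forward external port 8080 to a web server on the LAN (hypothetical values).
nat.open_pinhole(("203.0.113.7", 8080), ("192.168.1.20", 80))

forwarded = nat.route_incoming(("203.0.113.7", 8080))
discarded = nat.route_incoming(("203.0.113.7", 2222))
print(forwarded, discarded)
```

A server tested only from inside the LAN never exercises this table, which is why a service can work locally and still be unreachable from another network until the pinhole is configured.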
Firewalls




Firewalls screen incoming traffic and sometimes filter outgoing traffic as
well.
The first obstacle encountered by the incoming traffic in a network is the
firewall supported by the operating system of the router.
The next obstacle is the firewall provided by the operating system running
on the local computer.
Sometimes there is an additional filtering due to antivirus software.
A virtual machine running under EC2 has several IP
addresses

 EC2 Private IP Address - the internal address of an instance, used only for
routing within the EC2 cloud.
 Public IP Address - used by network traffic originating outside the EC2 network.
The public IP address is assigned when an instance is launched and is valid
until the instance is terminated; Network Address Translation (NAT) maps it
to the private IP address, so traffic to the public address is forwarded to the
private IP address of the instance.
 Elastic IP Address - the IP address allocated to an EC2 account and used by
traffic originating outside the cloud. NAT maps an elastic IP address to the
private IP address. Elastic IP addresses allow the cloud user to mask instance
or availability zone failures by programmatically re-mapping public IP
addresses to any instance associated with the user's account. This allows fast
recovery after a system failure; for example, rather than waiting for a cloud
maintenance team to reconfigure or replace the failing host, or waiting for DNS
to propagate the new public IP to all of the customers of a Web service hosted
by EC2, the Web service provider can re-map the elastic IP address to a
replacement instance.
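
The re-mapping of an elastic IP address after a failure can be pictured with a toy model. Nothing here calls AWS: the class `ElasticIpMap`, the function `fail_over`, the instance identifiers, and the health check are all hypothetical.

```python
class ElasticIpMap:
    """Toy model of elastic IP addresses associated with instances."""

    def __init__(self):
        self._association = {}  # elastic IP -> instance id

    def associate(self, elastic_ip: str, instance_id: str) -> None:
        self._association[elastic_ip] = instance_id

    def resolve(self, elastic_ip: str):
        return self._association.get(elastic_ip)


def fail_over(mapping, elastic_ip, is_healthy, replacement_id):
    """Re-map the elastic IP to the replacement if the current target is down."""
    current = mapping.resolve(elastic_ip)
    if current is None or not is_healthy(current):
        mapping.associate(elastic_ip, replacement_id)
    return mapping.resolve(elastic_ip)


addresses = ElasticIpMap()
addresses.associate("198.51.100.10", "i-primary")

# While the primary is healthy, nothing changes.
still_primary = fail_over(addresses, "198.51.100.10", lambda i: True, "i-standby")

# After the primary fails, the same public address points at the standby.
after_failure = fail_over(addresses, "198.51.100.10",
                          lambda i: i != "i-primary", "i-standby")
print(still_primary, after_failure)
```

Clients keep using the same address throughout, so no DNS change has to propagate.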
Security rules for application- and transport-layer
protocols in EC2



Security groups control access to a user's virtual machines. A virtual
machine instance belongs to one, and only one, security group, which can
only be defined before the instance is launched. More than one instance
can belong to a single security group.
Security group rules control inbound traffic to the instance and have no
effect on outbound traffic from the instance. All inbound traffic to an
instance, whether it originates outside the cloud or from other instances
running on the cloud, is blocked unless a rule stating otherwise is added to
the security group of the instance.
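
The "blocked unless a rule allows it" policy for inbound traffic can be sketched as follows. This is an illustrative model built on Python's standard `ipaddress` module, not the EC2 API; the `InboundRule` fields and the example rules are assumptions.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class InboundRule:
    protocol: str      # e.g. "tcp"
    port_from: int     # first port of the allowed range
    port_to: int       # last port of the allowed range
    source_cidr: str   # allowed source addresses in CIDR notation


def inbound_allowed(rules, protocol: str, port: int, source_ip: str) -> bool:
    """Default deny: traffic passes only if at least one rule matches."""
    source = ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port_from <= port <= rule.port_to
        and source in ip_network(rule.source_cidr)
        for rule in rules
    )


# Hypothetical group: HTTP from anywhere, SSH only from one office network.
web_group = [
    InboundRule("tcp", 80, 80, "0.0.0.0/0"),
    InboundRule("tcp", 22, 22, "198.51.100.0/24"),
]

print(inbound_allowed(web_group, "tcp", 80, "203.0.113.9"))    # HTTP, any source
print(inbound_allowed(web_group, "tcp", 22, "203.0.113.9"))    # SSH, outside office
print(inbound_allowed(web_group, "tcp", 22, "198.51.100.14"))  # SSH, from office
```

An empty rule list rejects everything, which mirrors the behavior of a freshly created group with no rules added.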
How to add a security rule:

 Sign in to AWS using your email address and password.
 Use the EC2 Request Instance Wizard to specify the instance type, whether it
should be monitored, and a key/value pair for the instance to help
organize and search.
 Provide a name for the key pair, then on the left-hand side panel choose
Security Groups under Network & Security.
Next lecture – Thursday May 31, 12 AM

Open-source platforms for cloud computing
1. Eucalyptus
2. Nebula
3. Nimbus

Cloud applications

 Challenges
 Existing and new applications
 Coordination and the ZooKeeper
 The MapReduce programming model
 The GrepTheWeb application
 Clouds in science and engineering
 Benchmarks