cloud - MTA SZTAKI LPDS

“Cloud bursting”
on SZTAKI Cloud
Attila Csaba Marosi
Cloud Computing Research Group
MTA SZTAKI LPDS
marosi.attila@sztaki.mta.hu
Summer School on Grid and Cloud
Workflows and Gateways 2013
Outline
• Terminology
• Recap: SZTAKI Cloud and LPDS Cloud
• Cloud-Manager
• Cloud bursting definition, scalability in general
• Scaling scenarios @ SZTAKI Cloud
• Summary
• Additional Reading and References
Terminology I.
• Based on deployment model:
o Public Cloud – “The cloud infrastructure is made available to the
general public or a large industry group and is owned by an
organization selling cloud services.” [3]
o Private Cloud – “The cloud infrastructure is operated solely for an
organization. It may be managed by the organization or a third party
and may exist on premise or off premise.” [3]
o Hybrid Cloud – Environment created by the combination of public and
private cloud offerings
o (Community Cloud) [3]
Terminology II.
• Based on location:
o Internal Cloud – Subset of the Private Cloud model where it is offered
by an IT organization to its own business [1] (“on premise” [3]).
o External Cloud – Not hosted by own organization and offered by a 3rd
party. It can be either public or private [1] (“off premise” [3]).
• Point of view of architectural service layers
o Software as a Service (SaaS)
o Platform as a Service (PaaS)
o Infrastructure as a Service (IaaS) – Cloud bursting (scaling) at this level
Recap
• SZTAKI Cloud*
o Institutional IaaS Cloud service by SZTAKI (private, internal)
o 7 nodes (7*64 Core, 7*256GB RAM), 2*32TB Storage
o OpenNebula 3.8.3 based
o Quotas for users
• LPDS Cloud*
o Similar, but smaller scale
o Internal private cloud for LPDS
• Typically we use the LPDS Cloud for internal needs and scale
out to SZTAKI Cloud when needed.
* Sándor Ács: “SZTAKI Cloud”. Monday, 1st July @ 12:00.
Definition, scalability
• Cloud Bursting:
o “Cloud bursting is an application deployment model in which an
application runs in a private cloud or data center and bursts into a
public cloud when the demand for computing capacity spikes.” [4]
• More generally, however, cloud bursting is a subset of the
general scaling out problem
• Can be split into 2 parts:
1. The capability to scale out to a cloud to maintain QoS requirements (e.g.,
for handling short term spikes in computing capacity demand).
2. Making the decision of (a) when, (b) how much, (c) how long and (d)
where to scale out.
The ability to scale out (to a cloud) + Making the decision
• The ability to scale out: scaling out scenarios (with SZTAKI Cloud), in this talk
• Making the decision: auto-scaling techniques, in “Cloud bursting from WS-PGRADE/gUSE”,
Thursday, 11:00-11:30
Cloud-Manager
Generic Meta-Broker Service
• Part of the FCM5 (“Federated Cloud
Management”) Architecture
• We’ll now focus on the Cloud-Manager
o For FCM, cf. Attila Kertesz: “Cloud Federation
Approaches”, today @ 11:00
• Schedules service calls to VMs and
manages VMs
• REST/SOAP Web service interface for
service call and VM queues
• The Cloud Resource Manager (CRM)
component is responsible for the scaling
decision (when/ where/ … )
• Initially it was intended for scaling
services in a single cloud
• We use this component internally for different
scaling (bursting) multi-cloud scenarios.
[Figure: Cloud-Manager architecture. Labels: FCM Repository (VAx..VAy); Cloud-Manager with queue Q1, Service Handler, VM queues Cloud a VMQx and Cloud a VMQy, and Cloud a VM Handler; VMx1..VMxn and VMy1..VMym in Cloud a.]
Cloud-Manager
1. Single queue for incoming service calls (or tasks)
2. Multiple VM queues
o A different one for each VA and resource combination
o VM queues can be managed automatically (CRM) or manually
3. Manages VM lifecycle (EC2 REST API)
4. Performs the scheduling of service calls to resources (Q1→VM)
[Figure: Cloud-Manager architecture diagram (VAx, VAy, Q1, Service Handler, Cloud a VMQx, Cloud a VMQy, Cloud a VM Handler, VMs in Cloud a), with callouts 1-4 marking the items listed above.]
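The queue structure described for the Cloud-Manager can be illustrated with a toy model. This is only a sketch under assumptions: the class name `CloudManagerSketch`, its methods, and the first-fit scheduling rule are hypothetical and are not taken from the actual Cloud-Manager code. It shows one call queue (Q1), one VM queue per (VA, cloud) pair, and a scheduling pass that hands each call to an idle VM of the matching VA.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CloudManagerSketch:
    """Toy model: one call queue (Q1) and one VM queue per (VA, cloud) pair."""
    call_queue: deque = field(default_factory=deque)  # Q1: (va, payload) tuples
    vm_queues: dict = field(default_factory=dict)     # (va, cloud) -> deque of idle VM ids

    def submit(self, va, payload):
        """Enqueue a service call bound to virtual appliance `va`."""
        self.call_queue.append((va, payload))

    def add_vm(self, va, cloud, vm_id):
        """Register an idle VM running `va` in `cloud`."""
        self.vm_queues.setdefault((va, cloud), deque()).append(vm_id)

    def _take_vm(self, va):
        # First-fit (an assumption): use the first queue that has an idle VM for this VA.
        for (queue_va, _cloud), vms in self.vm_queues.items():
            if queue_va == va and vms:
                return vms.popleft()
        return None

    def schedule(self):
        """Assign queued calls to idle VMs; calls without a VM stay in Q1."""
        assignments, waiting = [], deque()
        while self.call_queue:
            va, payload = self.call_queue.popleft()
            vm = self._take_vm(va)
            if vm is None:
                waiting.append((va, payload))
            else:
                assignments.append((payload, vm))
        self.call_queue = waiting
        return assignments


cm = CloudManagerSketch()
cm.add_vm("VAx", "cloud_a", "VMx1")
cm.submit("VAx", "call-1")
cm.submit("VAy", "call-2")          # no VM for VAy yet, so this call keeps waiting
print(cm.schedule())                # [('call-1', 'VMx1')]
```

Calls left in Q1 after a scheduling pass are exactly the signal a resource manager could use to decide that more VMs are needed; that decision is outside this sketch.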
Scenarios @ SZTAKI
• Source: Current infrastructure type (not necessarily cloud based!)
• Destination: target cloud infrastructure type
Source \ Destination | Public | Private
Private | Private→Public (Scenario A, “Cloud bursting”) | Private→Private (Scenario B)
Volunteer | Volunteer→Public (Scenario C/1) | Volunteer→Private (Scenario C/2)
Scenario A: Private → Public
• Form a hybrid cloud: when local resources are insufficient
allocate resources from a public cloud provider
• Real world example: Prezi.com
o Uses private resources w/ Amazon EC2 to handle peak traffic
o Batch processing of tasks
• Zip files for download, fetch images for presentations, conversion jobs
o Prezi.com Scale Contest – http://prezi.com/scale/
• Jobs may wait at most 5 seconds in the queue, VMs take 2 minutes to boot, and
instances are paid for by the hour: minimize cost while honoring these requirements
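The contest constraints can be checked with simple arithmetic. The sketch below uses the two figures from the slide (5 seconds, 2 minutes); the function names and the rule that every started hour is charged in full are assumptions made for illustration.

```python
import math

MAX_QUEUE_WAIT_S = 5    # from the slide: jobs wait at most 5 seconds
VM_BOOT_TIME_S = 120    # from the slide: VMs need 2 minutes to boot


def billed_hours(runtime_s):
    """Hours charged for one instance, assuming each started hour is billed in full."""
    return max(1, math.ceil(runtime_s / 3600))


def can_boot_on_demand(max_wait_s=MAX_QUEUE_WAIT_S, boot_s=VM_BOOT_TIME_S):
    """True only if a VM started when a job arrives is ready within the wait limit."""
    return boot_s <= max_wait_s


print(can_boot_on_demand())     # False: 120 s boot time exceeds the 5 s limit
print(billed_hours(61 * 60))    # 2: one minute into the second hour costs a full hour
```

Under these assumptions, capacity has to be started ahead of demand, and stopping an instance shortly after an hour boundary wastes most of a paid hour.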
Scenario A: Private → Public
• At SZTAKI we have the following possibilities for bursting:
1. OpenNebula based bursting
2. Cloud-Manager based bursting
• However, we prefer to use private clouds over public ones:
bursting to public clouds is set up as the absolute last resort
OpenNebula: Building a
Hybrid Cloud (Scenario A)*
• OpenNebula supports accessing multiple remote providers
through the EC2 API – not necessarily just Amazon EC2
• Remote provider appears as new host in OpenNebula
• Resource limits by administrator
for number and type of instances
• VMs can be started in EC2 or
locally
• VM counterpart at remote provider
– EC2 section in VM template
• Network connectivity via VPN
* Sándor Ács: “OpenNebula”. Monday, 1st July @ 11:00.
OpenNebula: Hybrid Cloud Use Cases*
On-demand Scaling of Computing Clusters
• E.g., elastic execution of a Condor
computing cluster
• Dynamic growth of the number of
worker nodes to meet demands using
EC2
• Private network with NIS and NFS
• EC2 worker nodes connect via VPN
On-demand Scaling of Web Servers
• E.g., elastic execution of the nginx
web server
• The capacity of the elastic web
application can be dynamically
increased or decreased by adding or
removing nginx instances
* Sándor Ács: “OpenNebula”. Monday, 1st July @ 11:00.
Cloud-Manager: multi-cloud
(Scenario A)
Cloud-Manager
• Cloud-Manager supports multiple
providers through the EC2 REST/ SOAP
API
o OpenNebula, OpenStack, Eucalyptus and
Amazon EC2
• Primarily for scaling Distributed
Computing Infrastructures (DCIs)
• Service calls are bound to VAs
o Each configured provider must have the
counterpart image (AMI ID)
• Network connectivity via VPN when
needed
[Figure: Cloud-Manager with two clouds. Labels: VAx, VAy, Q1, Service Handler, Cloud a VMQx, Cloud b VMQx, Cloud a Handler, Cloud b Handler; VMx1..VMxn in Cloud a, VMx1..VMxm in Cloud b.]
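The requirement that each configured provider holds a counterpart image for a VA amounts to a lookup table. A minimal sketch, assuming a plain dictionary as the registry; the cloud names, image IDs, and function names below are placeholders, not real identifiers.

```python
# Hypothetical registry: VA name -> {cloud name -> image ID at that provider}.
VA_IMAGES = {
    "VAx": {"cloud_a": "ami-00000001", "cloud_b": "ami-00000002"},
    "VAy": {"cloud_a": "ami-00000003"},
}


def image_for(va, cloud, images=VA_IMAGES):
    """Return the image ID of `va` at `cloud`, or None if no counterpart exists there."""
    return images.get(va, {}).get(cloud)


def clouds_for(va, images=VA_IMAGES):
    """Clouds where `va` can be started, i.e. where a counterpart image is registered."""
    return sorted(images.get(va, {}))


print(image_for("VAx", "cloud_b"))   # ami-00000002
print(image_for("VAy", "cloud_b"))   # None: VAy cannot be started in cloud_b
print(clouds_for("VAx"))             # ['cloud_a', 'cloud_b']
```

A service call bound to a VA can then only be scaled out to the clouds returned by `clouds_for`.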
Scenario B: Private → Private
• Scale from a private infrastructure to another private
infrastructure
o E.g., scale from your local infrastructure (e.g., private internal) to
another academic cloud (e.g., private external)
• Typical use case for us: scaling out from LPDS Cloud to SZTAKI
Cloud (although both can be considered internal clouds)
SZTAKI: Scenario B+A (1/2)
• We primarily scale computing clusters (Condor, BOINC) with Cloud-Manager
1. We use the LPDS Cloud (private)
2. Scale out to SZTAKI Cloud (private)
3. As a last resort, scale out to Amazon EC2 (public)
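The preference order above (LPDS Cloud, then SZTAKI Cloud, then Amazon EC2) can be sketched as a fill-in-order placement. The function name `place_vms` and the idea of a per-cloud free-slot count are assumptions for illustration; the actual decision logic of the Cloud Resource Manager is not described in this talk.

```python
PREFERENCE = ["LPDS Cloud", "SZTAKI Cloud", "Amazon EC2"]  # order taken from the slide


def place_vms(needed, free_slots, preference=PREFERENCE):
    """Fill clouds in preference order.

    Returns (plan, unmet): plan maps cloud name to the number of VMs to start
    there, unmet is the number of VMs that could not be placed anywhere.
    """
    plan = {}
    for cloud in preference:
        if needed <= 0:
            break
        take = min(needed, free_slots.get(cloud, 0))
        if take > 0:
            plan[cloud] = take
            needed -= take
    return plan, needed


# Hypothetical capacities: the public cloud is only touched for the overflow.
print(place_vms(5, {"LPDS Cloud": 2, "SZTAKI Cloud": 2, "Amazon EC2": 10}))
# ({'LPDS Cloud': 2, 'SZTAKI Cloud': 2, 'Amazon EC2': 1}, 0)
```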
SZTAKI: Scenario B+A (2/2)
• The master node (1) and the Cloud-Manager (2) are usually hosted
on a dedicated resource
• The VPN head (3) typically must be on a node with a public IP
o We use a patched version of TINC with public key authentication
• The Cloud Resource Manager (4) is responsible for auto-scaling
• New VM instances are created and destroyed through the EC2
REST/SOAP API (5)
[Figure: deployment diagram with callouts 1-5 marking the components listed above.]
Example: Scaling a Condor cluster with Cloud-Manager
1. CM service calls → jobs for Condor
• Through REST/SOAP interface (e.g., WS-PGRADE/gUSE)
2. VPN head on public IP
3. Manager node: Cloud-Manager and Condor Master
4. VAs are deployed at LPDS, SZTAKI, Amazon EC2
• Contextualization by Cloud-Manager:
o Key for VPN
o VPN head public IP
o Condor Master IP on VPN
[Figure: deployment diagram with callouts 1-4 marking the items listed above; callout 4 appears three times.]
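The three contextualization values listed above can be pictured as a small record passed to each new worker VM. This is a sketch only: the class `WorkerContext`, the `KEY=value` rendering, and the example addresses are assumptions, and the format Cloud-Manager actually uses is not given in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerContext:
    """Values a new worker VM needs in order to join the cluster over the VPN."""
    vpn_key: str            # key for the VPN
    vpn_head_ip: str        # public IP of the VPN head
    condor_master_ip: str   # address of the Condor Master inside the VPN

    def to_user_data(self):
        """Render as KEY=value lines, a hypothetical format a boot script could read."""
        return "\n".join([
            f"VPN_KEY={self.vpn_key}",
            f"VPN_HEAD_IP={self.vpn_head_ip}",
            f"CONDOR_MASTER_IP={self.condor_master_ip}",
        ])


# Placeholder values: a documentation-range public IP and a private VPN address.
ctx = WorkerContext("EXAMPLE-KEY", "203.0.113.10", "10.0.0.1")
print(ctx.to_user_data())
```

The same record can be reused for all three clouds, since the workers always reach the Condor Master through the VPN head regardless of where they run.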
Scenario C: Volunteer →
{Public, Private}
• LPDS runs multiple BOINC based volunteer computing projects –
SZTAKI Desktop Grid, EDGeS@home
o People donate their computers’ idle computing cycles to science
o We do not own the resources
o We do not have any control over the resources
• These resources are “free” but not very reliable
o Jobs might be returned late or go missing
• We burst to clouds to provide reliable computing resources for
problematic jobs when needed
o LPDS → SZTAKI → Academic Clouds → Amazon EC2
• Cf. Jozsef Kovacs: “Integrating clouds with grid systems – the SZTAKI-BOINC
experience” @ 11:30
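Picking the problematic jobs could look like the following sketch. It assumes each job is a dictionary with `id`, `deadline`, and `done` fields and treats a job as problematic once its deadline has passed without a result; these field names and the rule itself are illustrative assumptions, not the project's actual policy.

```python
def jobs_to_burst(jobs, now, grace=0.0):
    """Return IDs of unfinished jobs whose deadline (plus a grace period) has passed."""
    return [j["id"] for j in jobs if not j["done"] and now > j["deadline"] + grace]


# Hypothetical job records; times are plain numbers (e.g., seconds).
jobs = [
    {"id": "a", "deadline": 100, "done": False},   # overdue, no result
    {"id": "b", "deadline": 100, "done": True},    # returned in time
    {"id": "c", "deadline": 200, "done": False},   # still within its deadline
]
print(jobs_to_burst(jobs, now=150))   # ['a']
```

Jobs selected this way would be rerun on cloud resources, which are tried in the order listed above.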
Summary
• Bursting (scaling) consists of the capability + decision making
• In this presentation I showed some scenarios from SZTAKI:
o Private → {Public, Private}; Volunteer → {Private, Public}
o OpenNebula and Cloud-Manager based
• The decision making process (i.e., auto-scaling) will be the
topic of my presentation on Thursday
o “Cloud bursting from WS-PGRADE/ gUSE” – Thursday, 11:00-11:30
References and
Additional reading
[1] Nair, S. K., Porwal, S., Dimitrakos, T., Ferrer, A. J., Tordsson, J., Sharif, T., Sheridan, C.,
Rajarajan, M. & Khan, A. U. (2010). Towards secure cloud bursting, brokerage and
aggregation. Paper presented at the IEEE European conference on Web Services, 1 Dec
2010 – 3 Dec 2010, Cyprus.
[2] D. McDysan: Cloud Bursting Use Case. IETF. http://tools.ietf.org/html/draft-mcdysansdnp-cloudbursting-usecase-00
[3] National Institute of Standards and Technology (NIST): The NIST Definition of Cloud
Computing. September, 2011. http://csrc.nist.gov/publications/nistpubs/800145/SP800-145.pdf
[4] SearchCloudComputing http://searchcloudcomputing.techtarget.com/definition/cloudbursting
[5] A. Cs. Marosi, G. Kecskemeti, A. Kertesz and P. Kacsuk, FCM: an Architecture for
Integrating IaaS Cloud Systems. In Proceedings of The Second International Conference
on Cloud Computing, GRIDs, and Virtualization. Rome, Italy. September, 2011.
Thank you!
Questions?