Globus Virtual Workspaces

advertisement
Cloud Computing and Virtualization with Globus
Oakland, May 2008
Kate Keahey (keahey@mcs.anl.gov)
Tim Freeman (tfreeman@mcs.anl.gov)
University of Chicago
Argonne National Laboratory
Cloud Computing Tutorial
Hands-on



05/14/08
To participate in the hands-on part of the
tutorial, send your PKI X509 subject line to
nimbus@mcs.anl.gov
The first 10 requests will be given access
to the nimbus cloud
Hurry!
Virtual Workspaces: http://workspace.globus.org
Overview









Motivation
The Workspace Ecosystem: Abstractions and
Background
The Workspace Deployment Tools
Managing Resources with Virtual Workspaces
Appliance management and contextualization
Virtual Cluster Management with Workspace Tools
Application Example: the STAR experiment
Cloud Computing
Run on the cloud: hands-on tutorial
Motivation
A Good Workspace is Hard to Find
1) Configuration: finding environment
tailored to my application
2) Leasing: negotiating a resource
allocation tailored to my needs
Consumer’s Perspective:
Quality of Life

- Real life applications are complex
  - STAR example: developed over more than 10 years by more than 100 scientists, comprises ~2 M lines of C++ and Fortran code
- … and require complex, customized environments
  - Rely heavily on the right combination of compiler versions and available libraries
- Environment validation
  - To ensure reproducibility and result uniformity across environments
Consumer’s Perspective:
Quality of Service

- There is life beyond submitting batch jobs
  - Resource leases rather than job submission
- Control of resources
  - Explicit SLA: different sites offer different quality of service
- Satisfying peak demand
  - Experiment season, paper deadlines, etc.
Provider’s Perspective

- Providing resources is easy, providing environments is hard
  - User comment: “I have 512 nodes I cannot use” ;-)
- Fine-tuning environments for different communities is expensive
  - Evaluating, installing and maintaining software packages etc.
  - Reconciling conflicts
  - Coordinating update schedules for different communities is a nightmare
  - It may be hard to justify configuring/dedicating resources if they are only needed 1% of the time, even if the 1% is very important for one of your users
The Workspace Ecosystem:
Abstractions and Background
Virtual Workspaces

- A dynamically provisioned environment
  - A complete environment: a complete (software) environment as required by our community or applications, provisioned on demand.
  - Resource allocation: provision the resources the workspace needs (CPUs, memory, disk, bandwidth, availability), allowing for dynamic renegotiation to reflect changing requirements and conditions.
  - Deployment point of view
- Appliances/virtual appliances
  - A complete environment that can be packaged in various formats
  - Packaging point of view
Workspace Implementations
Traditional tools
- Base environment (discovery)
- Automated configuration
- Typically long deployment time
- Isolation
- Performance isolation
Virtual machines
- Complete environment
- Contextualization
- Short deployment time
- Very good isolation
- Runtime performance impact
Runtime environment
Paper: “Virtual Workspaces: Achieving Quality of Service and Quality of Life in the Grid”
The Virtues of Virtualization
[Figure: applications run on guest OSes (Linux, NetBSD, Windows) inside VMs, on top of a Virtual Machine Monitor (VMM) / Hypervisor (Parallels, Xen, VMWare, UML, KVM, etc.), which runs on the hardware]
- Bring your environment with you
- Excellent enforcement and isolation
- Fast to deploy, enables short-term leasing
- Have a performance impact but it is acceptable for most modern hypervisors
- Suspend/resume, migration
Creating a Virtual Cluster that Works
[Figure: the layers involved in building a virtual cluster]
- Create a functioning virtual ensemble
- Put the VMs in context (contextualization layer)
- Deploy VMs onto the resource (deploy virtual machines)
- Obtain a lease on a raw resource
The Workspace Ecosystem
Appliance Deployment:
Mapping environments onto leased computing resources
Coordinating creation of virtual resources
A mix of open source software and proprietary tools
communicating via common protocols
Resource Providers:
Grid providers: TeraGrid, OSG, etc.
Commercial providers: EC2, Sun, etc.
Appliance Providers:
off-the-shelf environment bundles
certified/endorsed for safety
leverage appliance software
commercial and open “marketplaces”
Roles and Responsibilities



- Division of labor
  - Resource providers provide resources
  - Virtual organizations provide appliances
  - Middleware that maps appliances onto resources
- Appliance management software
  - Appliance creation, maintenance, validation, etc.
  - Not an appliance provider
- Shifting the work around
  - Into the hands of the parties most motivated and qualified to do it
Workspace Deployment Tools
Virtual Workspaces:
Vital Stats

- Virtual Workspace software allows an authorized client to dynamically deploy and manage workspaces
  - Virtual Workspace Service (VWS), workspace control, Context Broker
- Currently implements workspaces as Xen VMs
  - KVM coming this summer
  - Also, contextualization layer
- Globus incubator project
- Started ~2003, first release in September 2005
- Current release 1.3.1 (March ‘08)
- Download it from:
  - http://workspace.globus.org
Using Workspaces
(Deployment)
[Figure: the VWS Service and its pool of nodes; a workspace consists of the parts below]
Workspace
- Workspace metadata
  - Pointer to the image
  - Logistics information
- Deployment request
  - CPU, memory, node count, etc.
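To make the two parts of a workspace listed above more concrete, here is a minimal sketch that models them as plain Python data. The class and field names (WorkspaceMetadata, DeploymentRequest, image_uri, network) and the example URL are hypothetical, chosen only for this illustration; they are not the actual VWS schema.

```python
from dataclasses import dataclass, asdict


@dataclass
class WorkspaceMetadata:
    """What the workspace is (hypothetical fields, not the real VWS schema)."""
    name: str
    image_uri: str            # pointer to the VM image
    network: str = "public"   # stand-in for "logistics information"


@dataclass
class DeploymentRequest:
    """What resources the workspace needs (hypothetical field names)."""
    cpus: int
    memory_mb: int
    node_count: int = 1

    def validate(self) -> None:
        # Reject requests that ask for zero or negative resources.
        if min(self.cpus, self.memory_mb, self.node_count) < 1:
            raise ValueError("all resource quantities must be positive")


@dataclass
class Workspace:
    metadata: WorkspaceMetadata
    request: DeploymentRequest


ws = Workspace(
    WorkspaceMetadata(name="demo", image_uri="gsiftp://example.org/images/demo.img"),
    DeploymentRequest(cpus=1, memory_mb=512, node_count=2),
)
ws.request.validate()
print(asdict(ws))
```

Keeping the description of the environment separate from the resource request mirrors the split on the slide: the same metadata could be paired with a different deployment request.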
Using Workspaces
(Interaction)
The workspace service publishes information on each workspace as standard WSRF Resource Properties.
Users can query those properties to find out information about their workspace (e.g. what IP the workspace was bound to).
Users can interact directly with their workspaces the same way they would with a physical machine.
[Figure: the VWS Service in front of a pool of nodes, labeled Trusted Computing Base (TCB)]
Workspace Service
(what sits inside)
- Workspace WSRF front-end that allows clients to deploy and manage virtual workspaces
- Workspace back-end: resource manager for a pool of physical nodes; deploys and manages workspaces on the nodes
- Each node must have a VMM (Xen) installed, as well as the workspace control program that manages individual nodes
- Contextualization creates a common context for a virtual cluster
[Figure: the VWS Service in front of a pool of nodes, labeled Trusted Computing Base (TCB)]
Workspace Service
Components

- GT4 WSRF front-end
  - Leverages GT core and services, notifications, security, etc.
  - Roughly follows the OGF WS-Agreement provisioning model
    - Lease descriptions
    - Publishes available lease terms
- Workspace Service back-end
  - Works with multiple Resource Managers
  - Workspace Control for on-the-node functions
- Contextualization
  - Put the virtual appliance in its deployment context
Managing Resources with Virtual
Workspaces
Workspace Back-Ends

- Default resource manager (basic slot fitting)
  - “datacenter technology” equivalent
  - Used for OSG Edge Services project
- Challenge: finding Xen-enabled resources
  - Amazon Elastic Compute Cloud (EC2)
    - Software similar to Workspace Service (no virtual clusters, contextualization, fine-grain allocations, etc.)
  - Solution: develop a back-end to EC2
    - Grid credential admission -> EC2 charging model
    - Address contextualization needs
- Challenge: integrating VMs into current provisioning models
  - Solution: gliding in VMs with the Workspace Pilot
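The slide only says that the default resource manager does "basic slot fitting". As a rough illustration of what that could mean, the sketch below uses a first-fit strategy over free memory per node. The first-fit choice, the function name and the memory-only model are all assumptions made for this example, not a description of the actual implementation.

```python
from typing import Dict, List, Optional


def first_fit(free_mb: Dict[str, int], request_mb: int, node_count: int) -> Optional[List[str]]:
    """Pick node_count distinct nodes that each have request_mb free.

    On success, reserves the memory and returns the chosen node names.
    If the request does not fit, returns None and leaves free_mb untouched.
    """
    chosen = [node for node, free in free_mb.items() if free >= request_mb][:node_count]
    if len(chosen) < node_count:
        return None
    for node in chosen:
        free_mb[node] -= request_mb
    return chosen


pool = {"node1": 2048, "node2": 1024, "node3": 2048}
print(first_fit(pool, 1536, 2))  # ['node1', 'node3']
print(first_fit(pool, 1024, 2))  # None: only node2 still has 1024 MB free
```

A real resource manager would track more dimensions (CPUs, disk, lease duration), but the all-or-nothing reservation shown here is the core of fitting a multi-node request into slots.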
The Workspace Pilot


- Challenge: how can I provide a “cloud” using virtualization without disrupting the current operation of my cluster?
- Flying Low: the Workspace Pilot
  - Integrates with popular LRMs (such as PBS)
  - Implements “best effort” leases
  - Glidein approach: submits a “pilot” program that claims a resource slot
  - Includes administrator tools
- Deployment
  - Testing @ U of Victoria (Atlas), Ian Gable and collaborators
  - Adapting for the use of the Atlas experiment @ CERN, Omer Khalid
  - TeraPort (small partition)
Workspace Pilot in Action
[Figure: two levels of provisioning. Level 1: the LRM/PBS provisions raw resources (nodes running Xen dom0). Level 2: the VWS provisions VMs on those nodes.]
The Pilot Program


- Uses Xen balloon driver to reduce/restore domain0 memory so that guest domains (VMs) can be deployed
- Secure VM deployment
  - The pilot requires sudo privilege and thus can be used only with site administrator’s approval
  - The workspace service provides fine-grained authorization for all requests
- Signal handling
  - SIGTERM: pilot exceeded its allotted time
    - Notifies VWS, allows it to clean up
    - After a configurable time period takes things into its hands.
- Default policy: one VM per physical node
- Available for download
  - Workspace Release 1.3.1: http://workspace.globus.org/downloads/index.html
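The SIGTERM behaviour described above (notify the service first, then clean up forcibly after a grace period) can be sketched as follows. This is a toy model under stated assumptions: the class, its method names and the callbacks are hypothetical and are not taken from the pilot's source code.

```python
import signal
import threading


class PilotSketch:
    """Toy model of the pilot's SIGTERM behaviour (all names hypothetical)."""

    def __init__(self, notify_service, force_cleanup, grace_seconds=30.0):
        self.notify_service = notify_service
        self.force_cleanup = force_cleanup
        self.grace_seconds = grace_seconds
        self.timer = None

    def install(self):
        # signal.signal must be called from the main thread.
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    def handle_sigterm(self, signum, frame):
        # Step 1: tell the service so it can shut the VMs down cleanly.
        self.notify_service()
        # Step 2: if nobody cancels the timer within the grace period,
        # the pilot cleans up on its own.
        self.timer = threading.Timer(self.grace_seconds, self.force_cleanup)
        self.timer.daemon = True
        self.timer.start()

    def service_finished(self):
        # Called when the service reports that its cleanup is done.
        if self.timer is not None:
            self.timer.cancel()


events = []
pilot = PilotSketch(lambda: events.append("notified"),
                    lambda: events.append("forced"),
                    grace_seconds=0.05)
pilot.handle_sigterm(signal.SIGTERM, None)  # simulate delivery of SIGTERM
pilot.timer.join()                          # wait for the grace period to elapse
print(events)  # ['notified', 'forced']
```

Calling `service_finished()` before the grace period expires cancels the timer, so the forced cleanup only runs when the service fails to act in time.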
Workspace Control

- VM control
  - Starting, stopping, pausing, etc.
- Integrating a VM into the network
  - Assigning MAC addresses and IP addresses
  - DHCP delivery tool
  - Building up a trusted (non-spoofable) networking layer
- VM image propagation
- Image management and reconstruction
  - Creating blank partitions, sharing partitions
- Contextualization information management
- Talks to the workspace service via ssh
- Can be used as a standalone component
Workspace Back-Ends

- Default resource manager (basic slot fitting)
  - “datacenter technology” equivalent
  - Used for OSG Edge Services project
- Challenge: finding Xen-enabled resources
  - Amazon Elastic Compute Cloud (EC2)
    - Software similar to Workspace Service (no virtual clusters, contextualization, fine-grain allocations, etc.)
  - Solution: develop a back-end to EC2
    - Grid credential admission -> EC2 charging model
    - Address contextualization needs
- Challenge: integrating VMs into current provisioning models
  - Solution: gliding in VMs with the Workspace Pilot
- Long-term solutions
  - Leasing model with explicit terms
    - Semantically rich leases: advance reservations, urgent leases, renegotiable leases, etc.
    - Cost-effective lease semantics
Appliance Management and
Contextualization
Where Do Appliances
Come From?
Marketplaces (VMWare, EC2, Workspace …)
Appliance Provider (a user, a VO, a Grid…)
appliance description
Good… but: maintenance? ease of use? formats?
Where Do Appliances
Come From?
Marketplaces (VMWare, EC2, Workspace …)
Appliance Management Software (OSFarm, rPath,…)
appliance description
Xen, VMware, CDROM
Appliance Provider (a user, a VO, a Grid…)
Better
Deploying Appliances

- Appliances need to be “portable”
  - So that they can be reused in many contexts
- Making the appliance context-aware:
  - Other appliances
  - Site-specific information (e.g. a DNS server)
  - User/group/VO/Grid-specific information (e.g. public keys, host certs, gridmapfiles, etc.)
- Security issues
  - Who do I trust to provide legitimate context information?
  - How do I make sure that appliances adhere to my site policies?
[Figure: VMs deployed at a site, associated with a Virtual Organization]
Where Do Appliances
Come From?
Marketplaces (VMWare, EC2, Workspace …)
Appliance Management Software (OSFarm, rPath, CohesiveFT…)
appliance description
Xen, VMware, CDROM
appliance assertions
appliance contextualization
Appliance Provider (a user, a VO, a Grid…)
Creating Virtual Clusters with
Workspace Tools
Make Me a Working Cluster

- You got some VMs and you’ve deployed them… Now What?
  - What network are they connected to? Do they actually represent something useful (like a ready-to-use OSG cluster)? Do the VMs know about each other? Can they share some disk? How do they integrate into the site storage/account system? Do they have host certificates? And a gridmapfile? And all the other things that will integrate them into my VO?
- Challenge: what is a virtual cluster?
  - A more complex virtual machine
    - Networking, shared storage, etc.
  - Available at the same time and sharing a common context
  - Example: an OSG cluster
- Solutions
  - Ensemble management
  - Exporting and sharing common context
  - Sophisticated networking configurations.
Paper: “Virtual Clusters for Grid Communities”, CCGrid 2006
Contextualization
- Challenge: Putting a VM in the deployment context of the Grid, site, and other VMs
  - Assigning and sharing IP addresses, name resolution, application-level configuration, etc.
- Solution: Management of Common Context
  - Configuration-dependent
    - provides & requires
  - Common understanding between the image “vendor” and deployer
  - Mechanisms for securely delivering the required information to images across different implementations
[Figure: a contextualization agent exchanging IP, hostname and pk with the Common Context]
Paper: “A Scalable Approach To Deploying And Managing Appliances”, TeraGrid conference 2007
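One way to picture the "provides & requires" idea is as a matching step: each appliance declares the roles it offers and the roles it needs to know about, and the common context tells it where to find them. The sketch below is only an illustration under that assumption; the Appliance class, the role names and resolve_context are hypothetical and do not reproduce the Context Broker's actual data model or protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Appliance:
    hostname: str
    ip: str
    provides: List[str] = field(default_factory=list)  # roles this VM offers
    requires: List[str] = field(default_factory=list)  # roles this VM needs to locate


def resolve_context(appliances: List[Appliance]) -> Dict[str, Dict[str, List[str]]]:
    """For each appliance, map every required role to the IPs that provide it."""
    providers: Dict[str, List[str]] = {}
    for app in appliances:
        for role in app.provides:
            providers.setdefault(role, []).append(app.ip)

    context: Dict[str, Dict[str, List[str]]] = {}
    for app in appliances:
        missing = [role for role in app.requires if role not in providers]
        if missing:
            raise ValueError(f"{app.hostname} requires unprovided roles: {missing}")
        context[app.hostname] = {role: providers[role] for role in app.requires}
    return context


cluster = [
    Appliance("head", "10.0.0.1", provides=["nfs_server", "pbs_server"], requires=["pbs_worker"]),
    Appliance("w1", "10.0.0.2", provides=["pbs_worker"], requires=["nfs_server", "pbs_server"]),
    Appliance("w2", "10.0.0.3", provides=["pbs_worker"], requires=["nfs_server", "pbs_server"]),
]
for host, ctx in resolve_context(cluster).items():
    print(host, ctx)
```

Failing loudly when a required role has no provider reflects the point that a virtual cluster is only useful once every member has the context it needs.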
Contextualizing Appliances
[Figure: interactions among the Appliance Provider, the Appliance Deployer, the Resource Provider and the Context Broker. Labels: appliance context template, appliance context, appliance context agent, generic context, application-specific context agents, disk image, appliance content.]
Application Example: Virtualization
with the STAR experiment
Virtual Workspaces for STAR

- STAR image configuration
  - A virtual cluster composed of one OSG headnode and multiple STAR worker nodes
- Using the workspace service over EC2 to provision resources
  - Allocations of up to 100 nodes
  - Dynamically contextualized for out-of-the-box cluster
Virtual Workspaces for STAR



- Deployment stages:
  - Create an “ensemble” defining the virtual cluster
  - Deploy the virtual machines
  - Contextualize to provide an out-of-the-box cluster
- Contextualization:
  - Cluster applications: NFS & PBS
  - Grid information: gridmapfile and host certificates
- Runs
  - Using VWS on the nimbus cloud for small node allocations (VWS + default + Context Broker)
  - Using VWS with EC2 backend for allocations of ~100 nodes (VWS + EC2 backend + Context Broker)
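As an illustration of what contextualizing the cluster applications (NFS & PBS) might involve, the sketch below renders two small configuration files from the list of worker hostnames. It assumes a Linux NFS server configured through /etc/exports and a Torque-style PBS nodes file; the slides do not say which files or formats the STAR appliances actually used, and the function names are hypothetical.

```python
from typing import List


def nfs_exports(shared_dir: str, worker_hosts: List[str]) -> str:
    """Render one /etc/exports line granting each worker read-write access."""
    clients = " ".join(f"{host}(rw,sync)" for host in worker_hosts)
    return f"{shared_dir} {clients}\n"


def pbs_nodes(worker_hosts: List[str], slots_per_node: int = 1) -> str:
    """Render a Torque-style nodes file: one 'hostname np=N' line per worker."""
    return "".join(f"{host} np={slots_per_node}\n" for host in worker_hosts)


workers = ["worker1", "worker2"]
print(nfs_exports("/home", workers), end="")
print(pbs_nodes(workers, slots_per_node=2), end="")
```

The point of the example is that the worker list is only known at deployment time, which is why this information has to be delivered through contextualization rather than baked into the image.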
with thanks to Jerome Lauret and Doug Olson of the STAR project
[Figure: accelerated display of a workflow job state. Y = job number, X = job state. Panels show the number of running jobs at BNL, WSU, Fermi, NERSC PDSF, and EC2 (via Workspace Service), together with job completion and file recovery.]
Cloud Computing
The Workspace Cloud Client

- We took the workspace client and made it easy to use
  - Narrowing down the functionality
  - Wrapper on top of the workspace client
- Allows scientists to lease VMs roughly following Amazon’s EC2 model (simplified)
  - PKI X509 credentials and quotas instead of payment
- The goal is to restore/evolve this functionality as user requests come in
  - Saving VMs, network configurations
  - In the future: richer leases, etc.
- “Cloudkit” coming out in next release, due soon
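The idea of admitting leases by "PKI X509 credentials and quotas instead of payment" can be sketched as a simple check keyed on the certificate subject. Everything below is an assumption made for illustration: the Quota class, the node-minutes unit, the admit function and the example subject lines are hypothetical and do not describe how the cloud client or service actually enforces quotas.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Quota:
    max_node_minutes: int
    used_node_minutes: int = 0


def admit(quotas: Dict[str, Quota], subject_dn: str, nodes: int, minutes: int) -> bool:
    """Admit a lease if the subject is known and the request fits the remaining quota."""
    quota = quotas.get(subject_dn)
    if quota is None:
        return False  # unknown credential: not authorized
    cost = nodes * minutes
    if quota.used_node_minutes + cost > quota.max_node_minutes:
        return False  # request would exceed the quota
    quota.used_node_minutes += cost
    return True


dn = "/O=Example/OU=People/CN=Jane Scientist"
quotas = {dn: Quota(max_node_minutes=600)}
print(admit(quotas, dn, nodes=2, minutes=120))  # True: uses 240 of 600
print(admit(quotas, dn, nodes=4, minutes=120))  # False: 240 + 480 > 600
print(admit(quotas, "/CN=Stranger", 1, 10))     # False: unknown subject
```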
Nimbus @ University of Chicago

- Objectives
  - Make it easy for scientific community to experiment with this mode of resource provisioning
  - Learn about the requirements of scientific projects and evolve the infrastructure
    - Features, SLAs, security and sharing concerns, etc.
- Vital Stats
  - Deployed on 16 nodes of TeraPort cluster @ UC
  - Powered by the workspace set of tools
  - Image management handled via gridFTP
  - Made available mid-March ‘08
  - http://workspace.globus.org/clouds/
  - To obtain access mail nimbus@mcs.anl.gov
    - Available to scientific, educational projects, open source testing, etc.
Science Clouds

- A group of clouds making resources available “on the nimbus model”
  - Nimbus, Stratus@UFL (Mauricio Tsugawa), FZK in Germany (almost done, Lizhe Wang), others expressed interest
  - EC2
- Some differences in setup, policies
  - UFL requires private networks (using OpenVPN)
    - Currently you’d use the same credential for the cloud and for the virtual private network
  - EC2 requires payment
- Cloud federation
  - Moving an app from a hardware platform to a cloud is relatively hard
    - Need image, learn new paradigm, etc.
  - Moving between clouds is relatively easy
    - … if you have “rough consensus” on interfaces, image formats, etc.
Who runs on the clouds
and what do they do?
Related Projects



- Portal development (Josh Boverhof, LBNL)
- Workspace KVM backend (Michael Fenn, Clemson University)
- Integration with the Nebula project (University of Madrid)
Let’s get on the cloud!
Parting Thoughts
Parting Thoughts

- Come and run on science clouds
- Not just cloud computing
  - A bunch of technologies have to come together to make cloud computing widespread
- The way we do computing is changing
  - Today we build horseless carriages
  - Tomorrow we might do things differently
Credits

- Workspace team:
  - Kate&Tim
- Guest appearances
  - Ian Foster, Frank Siebenlist
- With thanks to many collaborators:
  - Jerome Lauret (STAR, BNL), Doug Olson (STAR, LBNL), Marty Wesley (rPath), Stu Gott (rPath), Ken Van Dine (rPath), Predrag Buncic (Alice, CERN), Haavard Bjerke (CERN), Rick Bradshaw (Bcfg2, ANL), Narayan Desai (Bcfg2, ANL), Duncan Penfold-Brown (Atlas, uvic), Ian Gable (Atlas, uvic), David Grundy (Atlas, uvic), Ti Legget (University of Chicago), Greg Cross (University of Chicago), Lizhe Wang (FZK), Marcel Kunze (FZK), Mauricio Tsugawa (UFL), Jose Fortes (UFL), Renato Figueiredo (UFL), Omer Khalid (CERN), Artem Harutyunyan (CERN), Mike Fenn (U of Clemson), Sebastien Goasguen (U of Clemson), Josh Boverhof (LBNL), Leve Hajdu (STAR, BNL), Lidia Didenko (STAR, BNL), David Bartle (Atlas, uvic), Lee Liming (ANL), Frank Wuerthwein (OSG, SDSC), Abhishek Rana (OSG, SDSC), Jeff Chase (Duke), and many others.
Sponsors

NSF SDCI “Missing Links”

NSF CSR “Virtual Playgrounds”

TeraGrid

DOE SciDAC CEDPS
Download