RAIN PPT

FutureGrid Image Management and Rain
Presenters:
Javier Diaz
Gregor von Laszewski
https://portal.futuregrid.org
Science Cloud Summer School 2012
Motivation
• FutureGrid (FG) is a testbed providing users with
grid, cloud, and high performance computing
resources
• One of the goals of FutureGrid is to provide a testbed
to perform experiments in a reproducible way among
different infrastructures
• We need mechanisms to ease the use of these
infrastructures
• FG Rain and Image Management frameworks allow
users to easily create customized environments by
placing suitable images onto the FG resources
Rain
• In FG, dynamic provisioning goes beyond the
services offered by common scheduling tools that
provide such features
• We want to easily provide custom HPC environments,
cloud environments, or virtual networks on demand
• Example: “rain” a Hadoop environment into a set of
machines
– fg-rain -n 8 --hadoop -j myHadoopApp.jar …
– Users and administrators do not have to set up the Hadoop
environment as it is being done for them
• Makes use of the Image Management Framework
Architectural Overview
[Figure: architecture diagram. Rain sits on top of the Image Management framework. Image Management Client: Portal, FG Shell, API. Image Management Server: Image Generation, Image Repository, Image Registration, Image Instantiation. External Services: Chef, security tools. Target IaaS and bare-metal HPC infrastructures: cloud IaaS frameworks (Nimbus, Eucalyptus, AWS, OpenNebula, OpenStack) and HPC clusters (bare metal).]
Image Management
• Key component in any modern compute infrastructure
(virtualized or non-virtualized)
• Processes that are part of the image management life-cycle:
– (a) Creating and customizing images: the user selects properties and software stack features meeting his/her requirements
– (b) Storing images: abstract image repository
– (c) Registering images: adapting the images
– (d) Instantiating images: Nimbus, Eucalyptus, OpenStack, OpenNebula, bare metal
FutureGrid Image Management Framework
• Framework provides users with the tools needed to
ease image management across infrastructures
• Users choose the software stacks of their images and
the target infrastructure(s)
• Targets end-to-end workflow of the image life-cycle
• Create, store, register and deploy images for both
virtualized and non-virtualized resources in a
transparent way
• Allows users to have access to bare-metal
provisioning (departure from typical HPC centers)
– Users are not locked into a specific computational
environment offered typically by HPC centers
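The end-to-end workflow named above (create, store, register, deploy) can be pictured as a simple ordered life-cycle. The following is a minimal sketch under that assumption only; `Stage`, `NEXT_STAGE`, `advance`, and `lifecycle` are hypothetical names for illustration and are not part of the FutureGrid software.

```python
from enum import Enum


class Stage(Enum):
    """Stages of the image life-cycle named on the slide."""
    CREATED = "create"
    STORED = "store"
    REGISTERED = "register"
    DEPLOYED = "deploy"


# Allowed next stage for each stage (hypothetical model, not FutureGrid code).
NEXT_STAGE = {
    Stage.CREATED: Stage.STORED,
    Stage.STORED: Stage.REGISTERED,
    Stage.REGISTERED: Stage.DEPLOYED,
}


def advance(stage):
    """Return the stage that follows `stage`; raise if there is none."""
    if stage not in NEXT_STAGE:
        raise ValueError(f"no stage after {stage.value}")
    return NEXT_STAGE[stage]


def lifecycle(start=Stage.CREATED):
    """Walk the life-cycle from `start` to deployment and return the visited stages."""
    visited = [start]
    while visited[-1] in NEXT_STAGE:
        visited.append(advance(visited[-1]))
    return visited


print([s.value for s in lifecycle()])
```

In the real framework some steps are optional (for example, a generated image may be returned to the user instead of being stored), so a linear chain is only the simplest reading of the slide.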
Image Generation
• Creates images according to the user's specifications:
– OS type and version
– Architecture
– Software packages
• Software installation may be aided by Chef
• Images are not aimed at any specific infrastructure
• Image is stored in the Repository or returned to the user
[Figure: image generation flow. The command-line tools send the requirements (OS, version, hardware, ...). Matching base image in the Repository? Yes: retrieve the image from the Repository. No: generate the image on the Image Generation Server, which starts a base OS VM in OpenNebula (CentOS 5, CentOS 6, Ubuntu 12; x86_64, x86) and builds the base image (base software, FG software). Then install software and update the image (cloud software, user software) to obtain the user's image, and store it in the Image Repository.]
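The decision in the image generation flow (reuse a matching base image from the repository, otherwise generate one, then install the requested software) can be sketched as follows. This is an illustration only: `base_images`, `generate_base_image`, and `build_user_image` are hypothetical names, and an in-memory dictionary stands in for the Image Repository.

```python
# Hypothetical in-memory stand-in for the Image Repository: it maps
# (os, version, arch) to the name of a stored base image.
base_images = {("centos", "5", "x86_64"): "centos5-x86_64-base"}


def generate_base_image(os_name, version, arch):
    """Placeholder for the 'Generate Image' branch (boot a base OS VM, build the image)."""
    return f"{os_name}{version}-{arch}-base"


def build_user_image(os_name, version, arch, packages):
    """Reuse a matching base image if one exists, else generate it,
    then record the software requested on top of it."""
    key = (os_name, version, arch)
    reused = key in base_images
    if not reused:
        base_images[key] = generate_base_image(os_name, version, arch)
    return {
        "base": base_images[key],
        "reused_base": reused,
        "packages": sorted(packages),
    }


first = build_user_image("centos", "5", "x86_64", ["wget", "python26"])
second = build_user_image("ubuntu", "12", "x86_64", ["wget"])
print(first)
print(second)
```

The first request finds a base image and skips generation; the second one has to generate its base image, which is then available for later requests.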
Image Repository
• Service to query, store, and update images
• Single interface to store various kinds of images for
different systems
• Images are augmented with metadata, which is
maintained in a searchable catalog
• Keeps usage data to assist performance
monitoring and accounting
• Independent from the storage back-end: it supports a
variety of them, and new plugins can be easily created
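One way to picture a repository that is independent of its storage back-end is a small plugin interface. The sketch below is an assumption made for illustration: `StorageBackend`, `InMemoryBackend`, and `ImageRepository` are hypothetical names, and the real plugin API is not shown in these slides.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical plugin interface for a storage back-end."""

    @abstractmethod
    def put(self, img_id, data):
        """Store the image bytes under img_id."""

    @abstractmethod
    def get(self, img_id):
        """Return the image bytes stored under img_id."""


class InMemoryBackend(StorageBackend):
    """Toy back-end that keeps images in a dictionary."""

    def __init__(self):
        self._blobs = {}

    def put(self, img_id, data):
        self._blobs[img_id] = data

    def get(self, img_id):
        return self._blobs[img_id]


class ImageRepository:
    """Keeps searchable metadata in a catalog and delegates image bytes to a back-end."""

    def __init__(self, backend):
        self._backend = backend
        self._catalog = {}

    def store(self, img_id, data, **metadata):
        self._backend.put(img_id, data)
        self._catalog[img_id] = {"imgId": img_id, "size": len(data),
                                 "accessCount": 0, **metadata}

    def query(self, **criteria):
        """Return the metadata of every image whose fields equal the criteria."""
        return [m for m in self._catalog.values()
                if all(m.get(k) == v for k, v in criteria.items())]

    def retrieve(self, img_id):
        self._catalog[img_id]["accessCount"] += 1  # usage data for accounting
        return self._backend.get(img_id)


repo = ImageRepository(InMemoryBackend())
repo.store("1", b"abc", os="centos_5", owner="jdiaz")
repo.store("2", b"de", os="ubuntu_12", owner="jdiaz")
print(repo.query(os="centos_5"))
```

Swapping the back-end then only requires another `StorageBackend` subclass; the catalog and query logic stay unchanged.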
Image Metadata
Field Name     Description
imgId          Image's unique identifier
owner          Owner
os             Operating system
description    Description of the image
tag            Image's keywords
vmType         Virtual machine type
imgType        Aim of the image
permission     Access permission
imgStatus      Status of the image
createdDate    Upload date
lastAccess     Last time the image was accessed
accessCount    # times the image has been accessed
size           Size of the image

User Metadata
Field Name     Description
userId         User's unique identifier
fsCap          Disk max usage (quota)
fsUsed         Disk space used
lastLogin      Last time user used the framework
status         Active, pending, disable
role           Admin, User
ownedimg       # of owned images
Image Registration I
• Adapts and registers images into specific
infrastructures
• Two main infrastructure types are considered
when adapting the image:
– HPC: Create network-bootable images that can
run on bare-metal machines (xCAT/Moab)
– Cloud: Convert the images into VM disks and
enable VM contextualization for the selected
cloud
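The two adaptation paths above can be sketched as a simple dispatch on the target infrastructure. This is only an illustration: `adaptation_steps` and the step descriptions are hypothetical and merely paraphrase the slide; they are not calls into xCAT, Moab, or any cloud framework.

```python
HPC = "hpc"
CLOUDS = {"eucalyptus", "openstack", "nimbus", "opennebula"}


def adaptation_steps(infrastructure):
    """Return the adaptation steps (as plain descriptions) for a target infrastructure."""
    name = infrastructure.lower()
    if name == HPC:
        return ["create network-bootable image", "register in xCAT/Moab"]
    if name in CLOUDS:
        return ["convert image into VM disk",
                f"enable VM contextualization for {name}"]
    raise ValueError(f"unknown infrastructure: {infrastructure}")


print(adaptation_steps("HPC"))
print(adaptation_steps("OpenStack"))
```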
Image Registration II
• User specifies where to register the image
• Optionally, user can select a kernel from a catalog
• Decides if an image is secure enough to be registered
• The process of registering an image only needs to be done once per infrastructure
[Figure: image registration flow. The command-line tools send the requirements (image, kernel, infrastructure). The user's image, retrieved from the Image Repository if needed, is customized for the selected infrastructure (HPC, Eucalyptus, OpenNebula, OpenStack, Nimbus, Amazon), passes a security check, is uploaded to the infrastructure, and is registered in the infrastructure. The image is then ready for instantiation in the infrastructure.]
FutureGrid Image Management
and Rain Examples
Starting to use the software
• Requirements
– FutureGrid portal account
– Accounts in the infrastructures you want to use
(Eucalyptus, OpenStack, Nimbus, HPC)
– Request account to use Image Management and Rain
software
• Software is installed on the India login node
– ssh jdiaz@india.futuregrid.org
• Load FutureGrid software
– module load futuregrid
Generate an Image
• fg-generate -u jdiaz -o centos -v 5 -a x86_64 -s python26,wget
[Figure: (1) Generate img; (2) Deploy VM and generate the image; (3) Store in the Repo or return it to the user.]
Client output of the fg-generate command above:
Image generator client...
Please insert the password for the user jdiaz
Password:
Selected Architecture: x86_64
Connecting server: i120:56791
Your image request is in the queue to be processed
------wait here if too many request are being processed------
Your image request is being processed
Generating the image
------wait here until finished------
Your image has be uploaded in the repository with ID=915678426632408832461797
The image and the manifest generated are packaged in a tgz file.
Please be aware that this FutureGrid image does not have kernel and fstab. Thus, it is not built for any deployment type. To deploy the new image, use the IMDeploy command.
Image Repository Examples
• Query the image repository
– fg-repo -u jdiaz -q "* where os=centos_5"
Authentication OK
2 items found
imgId=215369546596144595085417, os=centos_5, arch=x86_64, owner=jdiaz, description=None,
tag=jdiaz2699012769, vmType=none, imgType=machine, permission=private, status=available
imgId=68725515834828774883357, os=centos_5, arch=x86_64, owner=jdiaz, description=None,
tag=jdiaz1786816389, vmType=none, imgType=machine, permission=private, status=available
• Upload an Image
– fg-repo -u jdiaz -p imagefile.tgz "os=centos & vmtype=kvm & description=my image"
Checking quota and Generating an ImgId
Authentication OK
Uploading image. You may be asked for ssh/passphrase password
Imagefile.tgz 100% 53 0.1KB/s 00:00
Registering the image
The image has been uploaded and registered with id 211913675261934066702430
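The query output above lists each image as comma-separated `key=value` pairs, so it is easy to post-process. The following is a minimal sketch, assuming that format holds for every record; `parse_record` is a hypothetical helper, not part of fg-repo, and it would mis-split a description that itself contains commas.

```python
def parse_record(text):
    """Split 'key=value, key=value, ...' into a dict; all values stay strings."""
    record = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing commas and wrapped lines
        key, _, value = pair.partition("=")
        record[key.strip()] = value.strip()
    return record


# First record from the query output above (it wraps over two lines).
sample = (
    "imgId=215369546596144595085417, os=centos_5, arch=x86_64, owner=jdiaz, description=None,\n"
    "tag=jdiaz2699012769, vmType=none, imgType=machine, permission=private, status=available"
)
record = parse_record(sample)
print(record["imgId"], record["os"], record["status"])
```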
Image Repository Examples
• Add User
– fg-repo -u jdiaz --useradd userId
Authentication OK
User created successfully.
Remember that you still need to activate this user (see setuserstatus
command)
• Image Usage
– fg-repo -u jdiaz --histimg
Authentication OK
imgId=191563243441508818679593, createdDate(UTC)=2011-10-13 21:43:30,
lastAccess(UTC)=2011-10-24 17:37:45, accessCount=16,
imgId=111462205747829171557134, createdDate(UTC)=2011-10-14 20:36:40,
lastAccess(UTC)=2011-10-21 13:48:04, accessCount=4,
imgId=21870735808909675281040, createdDate(UTC)=2011-10-07 20:36:33,
lastAccess(UTC)=2011-10-07 20:36:33, accessCount=0,
Register an Image for HPC
• fg-register -u jdiaz -r 2131235123 -x india
[Figure: (1) Register img from Repo; (2) Get img from Repo; (3) Customize img; (4) Register img in xCAT (cp files/modify tables); (5) Return info about the img; (6) Register img in Moab and recycle sched.]
Client output of the fg-register command above:
Starting image deployer...
Please insert the password for the user jdiaz
Password:
Connecting to xCAT server
------wait here if an image is being registered------
Authentication OK
Customizing and registering image on xCAT
------wait here until finished------
Connecting to Moab server
Your image has been registered in xCAT as centosjavi960524558.
Please allow a few minutes for xCAT to register the image before attempting to use it.
To boot an machine using your image: qsub -l os=<imagename>
To check the status of the job you can use checkjob and showq commands
Register an Image stored in the Repository into OpenStack
• fg-register -u jdiaz -r 2131235123 -s india -v ~/novarc
[Figure: (1) Deploy img from Repo; (2) Get img from Repo; (3) Customize img; (4) Return img to client; (5) Upload the img to the Cloud.]
Client output of the fg-register command above:
Starting image registration...
Please insert the password for the user jdiaz
Password:
Authentication OK
------wait here until finished------
Retrieving image. You may be asked for ssh/passphrase password
centos5jdiaz2250444196.img 100% 1496MB 65.0MB/s 00:23
euca-bundle-image ….
euca-upload-image …
euca-register …
IMAGE emi-437C1239
Your image has been registered on OpenStack with the id emi-437C1239
To launch a VM you can use euca-run-instances -k keyfile -n <#instances> id
Remember to load you Eucalyptus environment before you run the instance (source eucarc)
More information is provided in https://portal.futuregrid.org/tutorials/oss and in https://portal.futuregrid.org/tutorials/eucalyptus
Rain an Image and execute a task (baremetal)
• fg-rain -u jdiaz -r 123123123 -x india -j testjob.sh -m 2
[Figure: rain workflow on bare metal. The user asks to run a job in an image stored in the repo. Rain registers the image (register img from Repo, get img from Repo, customize img, register img in xCAT (cp files/modify tables), return info about the img, register img in Moab and recycle sched) and then runs qsub, monitors the job status, and reports the completion status and the output files.]
Client output of the fg-rain command above:
Starting rain...
Please insert the password for the user jdiaz
Password:
----- Deploy the image. Same logs as before -----
Job id is: 200941
Wait until the job finishes
State: Idle
State: Idle
State: Running
State: Running
State: Completed
Completion Code: 0 Time: Fri Oct 28 15:05:02
The Standard output is in the file: salida.txt
The Error output is in the file: jobscript.e200941
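The sequence of "State:" lines in the client output suggests that the client polls the scheduler until the job completes. The sketch below illustrates such a polling loop under that assumption; `wait_for_job` and its parameters are hypothetical, and the states are replayed from the output above instead of being read from a real scheduler.

```python
import time


def wait_for_job(get_state, poll_seconds=5.0, sleep=time.sleep):
    """Poll get_state() until it returns 'Completed'; return every state seen."""
    seen = []
    while True:
        state = get_state()
        seen.append(state)
        print(f"State: {state}")
        if state == "Completed":
            return seen
        sleep(poll_seconds)


# Replay the states from the client output; no real scheduler is contacted
# and the sleep is replaced by a no-op so the example finishes immediately.
states = iter(["Idle", "Idle", "Running", "Running", "Completed"])
history = wait_for_job(lambda: next(states), sleep=lambda seconds: None)
```

Passing the state source and the sleep function as arguments keeps the loop easy to test without a scheduler.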
Rain a Hadoop environment in Interactive mode
• fg-rain -u jdiaz -i ami-00000017 -s india -v ~/OSessexindia/novarc --hadoop --inputdir ~/inputdir1/ --outputdir ~/outputdir/ -m 3 -I
[Figure: (1) Deploy Hadoop environment; (2) Start VM; (3) VMs running; (4) Install/configure Hadoop; (5) Login user in Hadoop master.]
Client output of the fg-rain command above (main messages; Hadoop daemon log lines omitted):
Starting Rain...
Please insert the password for the user jdiaz
Password:
Verify that the requested image is in available status or wait until it is available
Creating temporal sshkey pair for EC2
Save private sshkey into a file
Launching image
Waiting for running state in all the VMs
i-00000772:pending
i-00000773:pending
i-00000774:pending
------------------------
i-00000772:running
i-00000773:running
i-00000774:running
------------------------
Number of instances booted 3
Instance i-00000772 associated with address server-1906
Instance i-00000773 associated with address server-1907
Instance i-00000774 associated with address server-1908
Waiting to have access to VMs
All VMs are accessible: True
Creating temporal sshkey files
Copying temporal private and public ssh-key files to VMs
Configuring ssh in VM and mounting home directory (assumes that sshfs and ldap is installed)
Setting up Hadoop environment in your home directory
Configure Hadoop cluster in the jdiaz home directory
Starting Hadoop cluster in the jdiaz home directory
Formatting HDFS
Starting daemons
Waiting in the safemode
Safe mode is OFF
You are going to be logged as root, but you can change to your user by executing su - <username>
List of machines are in /root/machines and /N/u/<username>/machines.
Your real home is in /tmp/N/u/<username>
Hadoop is in the home directory of the jdiaz user.
[root@10 ~]#
If we exit from VM:
Stopping Hadoop Cluster
stopping jobtracker
stopping tasktracker
stopping namenode
stopping datanode
stopping secondarynamenode
Job Done
Rain a Hadoop environment and
execute Word count 1/2
• As an example, we use the word count application to count the
words of several books
• Create a script with the hadoop command (hadoopword.sh)
hadoop jar $HADOOP_CONF_DIR/../hadoop-examples*.jar
wordcount inputdir1 outputdir
• Download books in txt
$ wget i120/test-image/books-example.tgz
• Uncompress books
$ mkdir ~/inputdir1
$ tar xvfz books-example.tgz -C ~/inputdir1
Rain a Hadoop environment and
execute Word count 2/2
• Execute rain
$ fg-rain -u jdiaz -i ami-00000017 -s india -v ~/OSessexindia/novarc -j ~/hadoopword.sh --hadoop --inputdir ~/inputdir1/ --outputdir ~/outputdir/ -m 3
• Once the job is done
$ ls ~/outputdir/outputdir/
_logs part-r-00000 _SUCCESS
• The output is in the file part-r-00000
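To make clear what the job computes, here is a minimal single-process sketch of the same word count in plain Python. It is not the Hadoop implementation: it only mirrors the idea of splitting each line on whitespace and summing the occurrences of each token, and it prints one tab-separated "word, count" line per word in sorted order. `word_count` and `format_counts` are hypothetical helper names.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated tokens over all lines (case-sensitive)."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)


def format_counts(counts):
    """Render the counts as sorted 'word<TAB>count' lines."""
    return "\n".join(f"{word}\t{count}" for word, count in sorted(counts.items()))


books = ["the quick brown fox", "the lazy dog", "the end"]
print(format_counts(word_count(books)))
```

Hadoop distributes the same two steps across the VMs: the map step emits a count for each token and the reduce step sums them per word.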
Rain a Virtual Cluster
• fg-cluster run -i ami-00000017 -n 3 -t m1.medium -a mycluster
[Figure: (1) Deploy virtual cluster; (2) Start VM; (3) VMs running; (4) Install/configure SLURM; (5) Login user in frontend. The cluster consists of a SLURM frontend VM and SLURM compute VMs.]
Additional Information
• FG Rain
– Download https://github.com/futuregrid/rain
– Doc http://futuregrid.github.com/rain/
• FG Cluster
– Download https://github.com/futuregrid/virtualcluster
– Doc http://futuregrid.github.com/virtual-cluster/