Legion: The Grid OS Architecture and User View

Anand Natrajan (anand@virginia.edu)
Marty Humphrey (humphrey@cs.virginia.edu)
The Legion Project, University of Virginia (http://legion.virginia.edu)
Grid Environment
• Computers, networks, people, data, devices
• Disjoint file systems
• Disjoint namespaces
• Multiple administration domains
• Unpredictable load, availability, failures
• Security problems
Grid OS Requirements
• Wide-area
• High Performance
• Complexity Management
• Extensibility
• Security
• Site Autonomy
• Input / Output
• Heterogeneity
• Fault-tolerance
• Scalability
• Simplicity
• Single Namespace
• Resource Management
• Platform Independence
• Multi-language
• Legacy Support
Legion - A Grid OS
Tools
• MPI / PVM
• P-space studies multi-run
• Parallel C++
• Parallel object-based Fortran
• CORBA binding
• Object migration
• Accounting
• Remote builds and compilations
• Fault-tolerant MPI libraries
• Post-mortem debugger
• Console objects
• Parallel 2D file objects
• Collections
• Licence support
Commercial Support - Avaki Corp.
[Diagram labels: Mentat, Legion, Avaki, Web]
• Venture funded
• Headquartered in Boston
• Growing number of employees
• Multi-tiered support offering
Protein Folding with CHARMM
• Molecular Dynamics Simulations
• 100-200 structures to sample (r, Rgyr) space
[Figure: plot with axes r and Rgyr]
Resources Available
HP V-class        CalTech  440MHz PA-8700  128/128
IBM SP3           UMich    375MHz Power3   24/24
DEC Alpha         UVa      533MHz EV56     32/128
IBM Blue Horizon  SDSC     375MHz Power3   512/1184
Sun HPC 10000     SDSC     400MHz SMP      32/64
IBM Azure         UTexas   160MHz Power2   32/64
Transparent Remote Execution
• User initiates “run”
• User/Legion selects site
• Legion copies binaries
• Legion copies input files
• Legion starts job(s)
• Legion monitors progress
• Legion copies output files
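The run sequence above can be read as a fixed pipeline. The Python sketch below is illustrative only: `Site`, `select_site`, `transparent_run`, and the least-loaded selection rule are hypothetical names and policies, not Legion APIs. Nothing is executed; the function only records the order of steps that the user does not have to perform by hand.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """Hypothetical description of an execution site (not a Legion type)."""
    name: str
    load: float  # assumed load metric; lower means less busy


def select_site(sites: List[Site], requested: Optional[str] = None) -> Site:
    """Honor an explicit user choice, otherwise pick the least-loaded site."""
    if requested is not None:
        for site in sites:
            if site.name == requested:
                return site
        raise ValueError(f"unknown site: {requested}")
    return min(sites, key=lambda s: s.load)


def transparent_run(binary: str, inputs: List[str], outputs: List[str],
                    sites: List[Site], requested: Optional[str] = None) -> List[str]:
    """Return the ordered log of steps for one run (a model only)."""
    log = ["user initiates run"]
    site = select_site(sites, requested)
    log.append(f"select site {site.name}")
    log.append(f"copy binary {binary} to {site.name}")
    log += [f"copy input {f} to {site.name}" for f in inputs]
    log.append(f"start job on {site.name}")
    log.append("monitor progress")
    log += [f"copy output {f} from {site.name}" for f in outputs]
    return log
```

Passing `requested="siteA"` models the "User selects site" case; omitting it models "Legion selects site" under the assumed least-loaded rule.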
Mechanics of CHARMM Runs
[Diagram labels: Register binaries; Create task directories & specification; Dispatch runs; Dispatch more runs; Legion]
[Chart: values 2%, 1%, 20%, 0%, 0%, 77%; legend: Blue Horizon, CalTech, UTexas, DEC Alpha, UMich, Sun HPC]
Types Of Applications
• Legacy applications
• Legion-aware applications
– I/O library
– 2D file object
• Applications Using Stdgrid
• Parameter Space Studies
• Parallel Programs
– MPI, PVM, MPL, Basic Fortran Support (BFS)
Grid Application Requirements
• Security
• Fault-tolerance
• Heterogeneity
• Collaboration
• …
• Legion supports these and other needs
Heterogeneous Runs
BT-Med Ocean Model
Cross-Organisation Collaboration
• Different companies
• Proprietary simulations and data
• Each needs the other
• Form virtual partnership
Platforms
• Windows NT, 2K, 98, 95
• Sun (Solaris)
• SGI (Irix, Origin)
• Intel (Linux, FreeBSD)
• DEC (Unix, Linux)
• Cray (T90, T3E)
• IBM (AIX, SP-2)
• HP (HP-UX)

• Codine
• LoadLeveler
• Maui
• PBS
• NQS
• LSF
Applications
• Biochemistry and Molecular Science
• Information Retrieval
• Materials Science
• Climate Modelling
• Neuroscience
• Aerospace
• Astronomy
• Graphics

NPACI - SDSC, UCSD, Caltech, UTexas, UMich, UCB, UVa
DoD MSRCs - NAVO & ARL, NASA Ames
User View
Command-Line Interface
Setup
• Set up shell environment variables
. ~legion/setup.sh
OR
export LEGION=/home/legion/Legion
export LEGION_OPR=/home/maya/OPR
. $LEGION/bin/legion_env.sh
• Specifies where binaries and configuration files can be found
• Sets root context
Login
• Authentication to system
legion_login /users/stephen
• Currently uses a password; other mechanisms (e.g., a Kerberos ticket) are possible
• Login object (a.k.a. Authentication object) - /users/stephen - is the user’s proxy to the world
• Login object generates a certificate identifying the user
Context Space
• Unix-like
legion_ls
legion_pwd
legion_cd
legion_cat
...
[Diagram: context tree rooted at /; node labels: hosts, mach1, mach2, subdir, prog, home, mydir, file1, users, you, me, tty]
Context Space
• Network-wide, transparent file system
• Location-independent read/write of files
• Convenient transfer of files between context space and local file system
• I/O libraries for access
• Unix-like utilities
Context Example
legion_ls /
Another Context
legion_ls /hosts
Yet Another Context
legion_ls /users
More Context Fun
Other Context Commands
• Locate a LOID in context space
legion_list_names
• Locate an object on a machine
legion_whereis
• Find status of an object
legion_object_info
• List metadata of an object
legion_list_attributes
Status Of An Object
legion_object_info -c work
Physical Location Of Object
legion_whereis -c work
Context Space vs. Local Space
• Local space = your machine’s directory structure
– OS-specific, Machine-specific
– Use cp, copy, etc.
– e.g., C:\Program Files\, /usr/bin, /mnt/disk1
• Context space = Legion’s directory structure
– OS-independent, Machine-independent
– Use legion_cp, etc.
Context Space and Local Space
• Transfer one file from local space to context space
legion_cp -localsrc <localfile> <contextfile>
• Transfer one file from context space to local space
legion_cp -localdest <contextfile> <localfile>
Context Space and Local Space
• Copying local directory to context space
legion_cp -r -localsrc <localdir> <contextdir>
OR
legion_import_tree <localdir> <contextdir>
• Copying context directory to local space
legion_cp -r -localdest <contextdir> <localdir>
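The copy commands above differ only in which side is local and whether the copy is recursive. As a minimal sketch, the hypothetical helper `legion_cp_args` below assembles the argument list for each case. It uses only the flags shown on these slides (`-r`, `-localsrc`, `-localdest`) and assumes nothing about other options; it builds the list and does not run anything.

```python
from typing import List


def legion_cp_args(src: str, dest: str, *, to_context: bool,
                   recursive: bool = False) -> List[str]:
    """Build a legion_cp argument list for a local <-> context copy.

    to_context=True  : local source, context destination (-localsrc)
    to_context=False : context source, local destination (-localdest)
    """
    args = ["legion_cp"]
    if recursive:
        args.append("-r")  # directory copy, as on the slide
    args.append("-localsrc" if to_context else "-localdest")
    args += [src, dest]
    return args
```

On a machine with the Legion environment set up, such a list could be handed to `subprocess.run(..., check=True)`; the paths in any call are placeholders chosen by the caller.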
Context Space and Local Space
• Map (not copy!) local directory to context space temporarily
legion_export_dir <localdir> <contextdir>
• Does NOT make copy of local directory
• Merely provides Legion-like access to local directory
– Use legion_cat on local files
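The difference between copying a directory into context space and mapping it can be shown with a toy model. In this sketch a Python dictionary stands in for a local directory, and `ImportedTree` and `ExportedDir` are hypothetical classes, not Legion types: the first snapshots the contents, the second reads through to the live directory.

```python
from typing import Dict


class ImportedTree:
    """Copy semantics: contents are snapshotted when the tree is imported."""

    def __init__(self, local_dir: Dict[str, str]) -> None:
        self._files = dict(local_dir)  # independent copy of the contents

    def cat(self, name: str) -> str:
        return self._files[name]


class ExportedDir:
    """Map semantics: every read goes through to the live local directory."""

    def __init__(self, local_dir: Dict[str, str]) -> None:
        self._local = local_dir  # reference to the original, no copy made

    def cat(self, name: str) -> str:
        return self._local[name]
```

After a local file changes, the mapped view returns the new contents while the imported copy still returns the old ones, which is the behaviour the "Map (not copy!)" bullet describes.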
Making Context Space…
• Local sub-directory with Legion NFS daemon
– Use cat on context files
• FTP directory with FTP interface
• Windows directory with Samba interface
• URL tree with HTTP interface
I/O Performance
[Figure: Large Read Aggregate Bandwidth for NFS, lnfsd and LegionFS]
– X-Axis = number of clients (1 to 50) simultaneously performing 1MB reads on 10MB files
– Y-Axis = total read bandwidth (MB/sec, 0 to 200)
– Each point = average of multiple runs
– Clients = 400MHz Intels, NFS Server = 800MHz Intel
Flexible Context Space
[Diagram labels: Disk, Directory, Context, legion_import_tree, legion_export_dir; interfaces: Samba, NFS, FTP, HTTP]
Access Control
• MayI for each object implements access control on a per-function basis
• Users named by login object
• Sets of users grouped by contexts
legion_change_permissions [+-rwx] [-v] <group/user context> <target context>
legion_change_permissions +r /users/fred /home/grimshaw/myfile
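Per-function access control, as described above, means the object itself is asked before each of its functions runs. The following Python sketch is a toy model under that reading: `GuardedObject`, `may_i`, `allow`, and `deny` are hypothetical names and do not reflect Legion's actual MayI interface. Callers are identified by the context name of their login object.

```python
from typing import Dict, Set


class GuardedObject:
    """Toy object that consults may_i() before running any of its functions."""

    def __init__(self) -> None:
        # Function name -> names of login objects allowed to call it.
        self._acl: Dict[str, Set[str]] = {}
        self._data = ""

    def allow(self, function: str, user: str) -> None:
        self._acl.setdefault(function, set()).add(user)

    def deny(self, function: str, user: str) -> None:
        self._acl.get(function, set()).discard(user)

    def may_i(self, function: str, caller: str) -> bool:
        return caller in self._acl.get(function, set())

    def read(self, caller: str) -> str:
        if not self.may_i("read", caller):
            raise PermissionError(f"{caller} may not call read")
        return self._data

    def write(self, caller: str, data: str) -> None:
        if not self.may_i("write", caller):
            raise PermissionError(f"{caller} may not call write")
        self._data = data
```

In this model, `allow("read", "/users/fred")` plays the role of the `+r` example on the slide: fred can read the object but still cannot write it.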
Access Control Example
Unified Console
[Diagram: User creates tty object; User starts running program; Legion passes tty LOID to program; Program produces stdout, stderr; User shares tty LOID. Other labels: TTY, File, Prog.]
TTY Object
• Redirect run-time output to central (or multiple) consoles
• Connect and disconnect dynamically
• Debug quickly and simply
• Monitor status and errors easily
• Share console with others
legion_tty <ttyobj>
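The console behaviour listed above (several consoles, dynamic connect and disconnect, sharing) amounts to fanning program output out to whoever is currently attached. A minimal sketch, with the hypothetical class `TtyObject` and in-memory lists standing in for real consoles:

```python
from typing import Dict, List, Tuple


class TtyObject:
    """Toy console object: forwards program output to all connected consoles."""

    def __init__(self) -> None:
        self._consoles: Dict[str, List[Tuple[str, str]]] = {}

    def connect(self, console: str) -> List[Tuple[str, str]]:
        """Attach a console; returns the list that will receive its output."""
        return self._consoles.setdefault(console, [])

    def disconnect(self, console: str) -> None:
        """Detach a console; it receives nothing emitted afterwards."""
        self._consoles.pop(console, None)

    def emit(self, stream: str, text: str) -> None:
        """Called by the running program for each stdout/stderr line."""
        for lines in self._consoles.values():
            lines.append((stream, text))
```

A console that connects late sees only output emitted after it attached, and one that disconnects stops receiving output while the program keeps running.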
User View
Web Interface
Logging In
Listing Contents Of A Context
Control Window
Status Window
StdOut Window
StdErr Window
Listing Classes (Contents of /class)
Listing Hosts (Contents of /hosts)
List Attributes Of An Object
Start A Run
Check The Status Of A Job
Start An Amber (BioGrid) Run
Check The Status Of An Amber Run
Graphically Check An Amber Run
Interact With Amber Run
Start A Hawley-Hydro Run
Check The Status Of A Hydro Run
Graphically Check A Hydro Run
Run RenderGrid Jobs (P-Space Jobs)
Check The Status of A RenderGrid Job
Check Accounting Logs
User View
Windows Interface
Windows Browser
Context Space in Windows
• Ability to export local directories into Legion’s context space
• Easy-to-use interface
• Ability of users to control when shared directories are visible to other users
Access Control
• Ability of users to specify access control policies
• Fine-grained nature of policies
– Allow/Deny read access to users or groups
– Allow/Deny write access to users or groups
• Ease with which access rights can be changed
• Speed at which access rights are propagated through Legion space
Windows Legion FTP Daemon
Windows Job Sandbox
Windows Process Control
National Legion Net
Summary
• Philosophy
– Grid as a Single Virtual Machine
– Provide mechanisms; let others build policies
• Architecture
– Object-based, integrated
– Default policies for scheduling, security, …
• User Interfaces
– Command-line, Web, Windows, FTP, HTTP, …
Future Directions
• Improved user interfaces
• More robust system
• Research activities - University of Virginia
• Commercial activities - Avaki Corporation
• Legion-G?
• Continued participation at GGFs
• Continued support for nationwide grid, grid applications