Building a Campus Grid with Existing
Resources
LabMan Conference, Notre Dame
June 8-9, 2009
Preston Smith
Purdue University
Special Thanks
• Thanks to the Condor Team at Wisconsin for
graciously allowing us to borrow from their tutorial
materials!
Outline
• Supercomputers on Campus
– Campus Grids
– High-Throughput Computing
– The impact of the campus grid
• The Condor Software
– Condor 101, at 200 mph
• Condor from an administrator’s view
– Policies
– Networking
– Security
– Virtual Appliance
Campus Grids
• Campus grids link computing resources within
universities and research institutes, often including
geographically distributed sites.
– Dedicated computing resources
– Idle non-dedicated computing resources
• Workstations
• Student Labs
– Campus grids build computation resources out of an
institution’s existing investment in computer resources
Supercomputers on Campus
• Purdue’s Campus Grid currently has 23,000 cores
– There are only 21 systems on the 11/2008 Top 500 list
with 16,000 or more cores.
• Theoretical peak capacity of the campus grid is 177
Teraflops
– This would place it at #12 on the 11/2008 Top 500 list
• Acquiring a resource of this scale is expensive!
– $3 million for compute nodes alone
– Requires 2000 square feet of floor space, plus power and
cooling
BoilerGrid
• Purdue’s Campus Grid – West Lafayette
Campus
• 23,000 cores
– x86_64, ia32, ia64 Linux
• Idle HPC nodes in Rosen Center clusters
– Solaris, MacOS X
– Windows
• Instructional lab systems at the main West Lafayette campus
and Purdue’s regional campuses
BoilerGrid
• Backfilling on idle HPC cluster nodes
– Condor runs on idle cluster nodes (nearly 10,000
cores today) when a node isn’t busy with PBS
(primary scheduler) jobs
BoilerGrid
• Windows systems (~7000 cores)
– Instructional Labs
• Purdue’s TLT division has run Condor on labs since 2001
– Supporting student rendering, some faculty research
– Library terminals
• Dedicated Condor resources
– GPU rendering cluster
– FPGA computation accelerator
BoilerGrid around Campus
• To date, the bulk of BoilerGrid cycles are provided by ITaP, Purdue’s
central IT
– Rosen Center for Advanced Computing (RCAC) – Research Computing
• Community Clusters – See http://www.isgtw.org/?pid=1001247
– Teaching and Learning Technologies (TLT) – Student Labs
• Centrally operated Linux clusters provide approximately 12k cores
• Centrally operated student labs provide 7k Windows cores
• That’s actually a lot of cores now, but there’s more around a large
campus like Purdue
– 27,317 machines, to be exact
– Can the campus grid cover most of campus?
Target: All of Campus
• Green Computing is big everywhere, and
Purdue is no exception
• CIO’s challenge – power-save your
idle computers, or run Condor and
join BoilerGrid
– University’s President runs Condor on her
PC
• Centrally supported workstations
have Condor available for install
through SCCM.
Thou shalt turn
off thy computer
or run Condor
Other Campus Grids
• Grid Laboratory of Wisconsin (GLOW)
– University of Wisconsin, Madison
• FermiGrid
– Fermi National Accelerator Lab
• Clemson University
• Rochester Institute of Technology
DiaGrid
• New name for our effort to spread the
campus grid gospel beyond Purdue’s
borders
– Perhaps institutions that wear red or green and
may be rivals on the gridiron or hardwood
wouldn’t like being in something named
“Boiler”.
• We’re regularly asked about
implementing a Purdue-style campus
grid at institutions without HPC on their
campus.
– Federate our campus grids into something far
greater than what one institution can do alone
DiaGrid Partners
• Sure, it’d make a good
basketball tournament…
• Purdue - West Lafayette
• Purdue Regionals
– Calumet
– North Central
– IPFW
– Statewide Technology
– Cooperative Extension Offices
• Indiana University
• Notre Dame
• Indiana State
• Wisconsin (GLOW)
– Via JobRouter
• Louisville
• Your Campus??
National scale: TeraGrid
• The Purdue Condor Pool is a resource available for allocation to
anybody in the nation today
• NSF now recognizes high-throughput computing resources as a critical
part of the nation’s cyberinfrastructure portfolio going forward.
– Not just megaclusters, XT5s, Blue Waters, etc, but loosely-coupled as well
• NSF vision for HPC - Sharing among academic institutions to optimize
the accessibility and use of HPC as supported at the campus level
– This matches closely with our goal to spread the gospel of the campus grid via DiaGrid
High Throughput Computing
• Like the Top 500 List, High Performance
Computing is often measured by floating point
operations per second (FLOPS)
• High Throughput Computing is concerned with
how many floating point operations per month or
per year users can extract from their computing
environment, rather than the number of such
operations the environment can provide per
second or minute.
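• A rough sense of scale (assuming a workstation that
sustains about 1 GFLOPS around the clock):
– Per second: 10^9 floating point operations
– Per 30-day month: 10^9 × 86,400 × 30 ≈ 2.6 × 10^15 operations
– HTC asks how much of that monthly capacity users actually
harvest across all of their machines, not how fast any single
machine runs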
Impact - Disciplines
• Supply Chain Simulations
• Structural Biology (viruses)
• Astrophysics
• Particle Physics
• Mathematics
• Economics
• Communication
• Materials Science
• Hydrology
• Bioinformatics
Impact
[Charts: BoilerGrid growth by year, 2003–2011 – hours delivered, pool size, jobs run, and unique users]
Condor
The Condor Software
• Available as a free download from
http://www.cs.wisc.edu/condor
• Download Condor for your operating system
– Available for most UNIX (including Linux and
Apple’s OS X) platforms
– Windows NT / XP / Vista
Full featured system
• Flexible scheduling policy engine via ClassAds
– Preemption, suspension, requirements,
preferences, groups, quotas, settable fair-share,
system hold…
• Facilities to manage BOTH dedicated CPUs
(clusters) and non-dedicated resources
(desktops)
• Transparent Checkpoint/Migration for many
types of serial jobs
• No shared file-system required
• Federate clusters w/ a wide array of Grid
Middleware
Full featured system
• Workflow management (inter-dependencies)
• Support for many job types – serial, parallel, etc.
• Fault-tolerant: can survive crashes, network outages,
no single point of failure.
• Development APIs: via SOAP / web services, DRMAA
(C), Perl package, GAHP, flexible command-line tools,
MW
• Platforms: Linux i386/IA64, Windows 2k/XP/Vista,
MacOS, FreeBSD, Solaris, IRIX, HP-UX, Compaq
Tru64, … lots.
– IRIX and Tru64 are no longer supported by current releases
of Condor
Condor – at 200 mph
• We could talk about Condor all day..
– So just the highlights
Meet Phil.
He is a scientist
with a big
problem.
Phil’s Application …
Run a Parameter Sweep of F(x,y,z) for 200 values
of x, 100 values of y and 30 values of z
– 200×100×30 = 600,000 combinations
– F takes on average 6 hours to compute on a
“typical” workstation (total = 600,000 × 6 = 3,600,000
hours: 410 years)
– F requires a “moderate” (512 MB) amount of memory
– F performs “moderate” I/O - (x,y,z) is 5 MB and
F(x,y,z) is 50 MB
I have 600,000
simulations to run.
Where can I get
help?
NSF won’t fund the
Blue Gene that I
requested.
While sharing a beverage with some
colleagues, Phil shares his problem.
Somebody asks, “Have you tried Condor?”
Phil Installs a
“Personal Condor” on his machine…
• What do we mean by a “Personal” Condor?
– Condor on your own workstation
– No root / administrator access required
– No system administrator intervention needed
• After installation, Phil submits his jobs to his
Personal Condor…
Phil’s Condor Pool
F(3,4,5)
600k Condor
jobs
personal
Condor
Phil's
workstation
Personal Condor?!
What’s the benefit of a Condor
“Pool” with just one user and one
machine?
Condor will ...
• Keep an eye on your jobs and will keep you
posted on their progress
• Implement your policy on the execution order
of the jobs
• Keep a log of your job activities
• Add fault tolerance to your jobs
• Implement your policy on when the jobs can
run on your workstation
Definitions
• Job
– The Condor representation of your work
• Machine
– The Condor representation of computers that
can perform the work
• Match Making
– Matching a job with a machine “Resource”
Job
Jobs state their requirements and preferences:
I need a Linux/x86 platform
I want the machine with the most memory
I prefer a machine in the chemistry department
Machine
Machines state their requirements and preferences:
Run jobs only when there is no keyboard activity
I prefer to run Phil’s jobs
I am a machine in the physics department
Never run jobs belonging to Dr. Smith
The Magic of Matchmaking
• Jobs and machines state their requirements and
preferences
• Condor matches jobs with machines
based on requirements and preferences
Using the Vanilla Universe
• The Vanilla Universe:
– Allows running almost any
“serial” job
– Provides automatic file
transfer, etc.
– Like vanilla ice cream
•Can be used in just about any
situation
Make your job batch-ready
Must be able to run in the
background
• No interactive input
• No windows
• No GUI
Create a Submit Description File
• A plain ASCII text file
• Condor does not care about file extensions
• Tells Condor about your job:
– Which executable, universe, input, output and error files
to use, command-line arguments, environment variables,
any special requirements or preferences (more on this
later)
• Can describe many jobs at once (a “cluster”), each
with different input, arguments, output, etc.
Simple Submit Description File
# Simple condor_submit input file
# (Lines beginning with # are comments)
# NOTE: the words on the left side are not
#       case-sensitive, but filenames are!
Universe   = vanilla
Executable = my_job
Output     = output.txt
Queue
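• A single submit file can also queue a whole cluster of jobs; a
minimal sketch (the file names here are illustrative), using
$(Process) to give each job its own input and output:

# $(Process) expands to 0, 1, …, 99 – one value per queued job
Universe   = vanilla
Executable = my_job
Input      = input.$(Process)
Output     = output.$(Process)
Queue 100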
Run condor_submit
• You give condor_submit the name of the
submit file you have created:
– condor_submit my_job.submit
• condor_submit:
– Parses the submit file, checks for errors
– Creates a “ClassAd” that describes your job(s)
– Puts job(s) in the Job Queue
ClassAd ?
• Condor’s internal data representation
– Similar to classified ads (as the name implies)
– Represent an object & its attributes
•Usually many attributes
– Can also describe what an object matches with
ClassAd Details
• ClassAds can contain a lot of details
– The job’s executable is analysis.exe
– The machine’s load average is 5.6
• ClassAds can specify requirements
– I require a machine with Linux
• ClassAds can specify preferences
– This machine prefers to run jobs from the physics group
ClassAd Details (continued)
• ClassAds are:
– semi-structured
– user-extensible
– schema-free
– Attribute = Expression
ClassAd Example
Example:
MyType       = "Job"        (String)
TargetType   = "Machine"    (String)
ClusterId    = 1377         (Number)
Owner        = "roy"        (String)
Cmd          = "sim.exe"    (String)
Requirements =              (Boolean)
    (Arch == "INTEL")
    && (OpSys == "LINUX")
    && (Disk >= DiskUsage)
    && ((Memory * 1024) >= ImageSize)
…
The Dog ClassAd
Type  = "Dog"
Color = "Brown"
Price = 12

ClassAd for the “Job”
...
Requirements =
    (Type == "Dog") &&
    (Color == "Brown") &&
    (Price <= 15)
...
Phil’s Condor Pool
F(3,4,5)
600k Condor
jobs
personal
Condor
Phil's
workstation
Phil can still only run one
job at a time, however.
Good News
(Boss)
The Boss says Phil
can add his coworkers’ desktop
machines into his
Condor pool as well…
but only if they can
also submit jobs.
Adding nodes
• Phil installs Condor on the desktop machines,
and configures them with his machine as the
central manager
– The central manager:
•Central repository for the whole pool
•Performs job / machine matching, etc.
• These are “non-dedicated” nodes, meaning
that they can't always run Condor jobs
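• A hypothetical sketch of what each desktop’s configuration might
contain (the host name is made up for illustration):

# Point every machine at Phil's workstation, the pool's central manager
CONDOR_HOST = phil-workstation.example.edu
# Desktops both submit and execute jobs
DAEMON_LIST = MASTER, SCHEDD, STARTD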
Phil’s Condor Pool
600k Condor
jobs
Condor Pool
Now, Phil and his coworkers can run multiple
jobs at a time so their
work completes sooner.
How can my jobs
access their data files?
Condor File Transfer
• ShouldTransferFiles = YES
– Always transfer files to execution site
• ShouldTransferFiles = NO
– Rely on a shared filesystem
• ShouldTransferFiles = IF_NEEDED
– Will automatically transfer the files if the submit and execute
machines are not in the same FileSystemDomain
Universe              = vanilla
Executable            = my_job
Log                   = my_job.log
ShouldTransferFiles   = IF_NEEDED
Transfer_input_files  = dataset.$(Process), common.data
Transfer_output_files = TheAnswer.dat
Queue 600
Phil’s Condor Pool
600k Condor
jobs
Condor Pool
With the additional
resources, Phil and his coworkers can get their jobs
completed even faster.
Dedicated Cluster
Now what?
• Some of the machines in the pool can’t run my jobs
– Not enough RAM
– Not enough scratch disk space
– Required software not installed
– Etc.
Specify Requirements
• An expression (syntax similar to C or Java)
• Must evaluate to True for a match to be made
Universe     = vanilla
Executable   = my_job
Log          = my_job.log
InitialDir   = run_$(Process)
Requirements = Memory >= 256 && Disk > 10000
Queue 600
Advanced Requirements
• Requirements can match custom attributes in your
Machine Ad
– Can be added by hand to each machine
Universe     = vanilla
Executable   = my_job
Log          = my_job.log
InitialDir   = run_$(Process)
Requirements = Memory >= 256 && Disk > 10000 \
               && (HasMATLAB =?= TRUE)
Queue 600
And, Specify Rank
• All matches which meet the requirements can
be sorted by preference with a Rank
expression.
• The higher the Rank, the better the match
Universe     = vanilla
Executable   = my_job
Log          = my_job.log
Arguments    = -arg1 -arg2
InitialDir   = run_$(Process)
Requirements = Memory >= 256 && Disk > 10000
Rank         = (KFLOPS * 10000) + Memory
Queue 600
What does the IT shop
need to know?
The IT administrator should know:
– Condor’s daemons
– Policy Configuration
– Security
– Virtualization
Typical Condor Pool
[Diagram: every machine runs a condor_master. The Central Manager also runs the collector and negotiator; a Submit-Only machine adds a schedd; an Execute-Only machine adds a startd; Regular Nodes run both a schedd and a startd. Arrows mark spawned processes and ClassAd communication pathways.]
Job Startup
[Diagram: the schedd on the submit machine and the startd on the execute machine are matched via the Central Manager’s negotiator and collector. At startup the schedd spawns a shadow, the startd spawns a starter, the starter runs the job, and the job’s Condor syscall library communicates back to the shadow.]
Ok, now what?
• Default configuration is pretty sane
– Only start a job when the keyboard is idle for > 15
minutes and there is no CPU load
– Terminate a job when the keyboard or mouse is
used, or when the CPU is busy for more than two
minutes
• Can one customize how Condor behaves?
Policy Expressions
• Allow machine owners to specify job priorities,
restrict access, and implement local policies
Policy Configuration
(The Boss)
• I asked the computer
lab folks to add
nodes into Condor…
but the jobs from
their users have
priority there
New Settings for the lab
machines
• Prefer lab jobs
START    = True
RANK     = Department == "Lab"
SUSPEND  = False
CONTINUE = True
PREEMPT  = False
KILL     = False
Submit file with Custom
Attribute
• Prefix an entry with “+” to add to job ClassAd
Executable  = 3dsmax.exe
Universe    = vanilla
+Department = "Lab"
queue
More Complex RANK
• Give the machine’s owners (psmith and jpcampbe)
highest priority, followed by Lab, followed by the
Physics department, followed by everyone else.
More Complex RANK
IsOwner = (Owner == "psmith" || Owner == "jpcampbe")
IsTLT   = (Department =!= UNDEFINED && Department == "Lab")
IsPhys  = (Department =!= UNDEFINED && Department == "Physics")
RANK    = $(IsOwner)*20 + $(IsTLT)*10 + $(IsPhys)
Policy Configuration
(The Boss)
• So far this is okay,
but... Condor can
use staff desktops
when they would
otherwise be idle
Defining Idle
• One possible definition:
– No keyboard or mouse activity for 5 minutes
– Load average below 0.3
Desktops should
• START jobs when the machine becomes idle
• SUSPEND jobs as soon as activity is detected
• PREEMPT jobs if the activity continues for 5
minutes or more
• KILL jobs if they take more than 5 minutes
to preempt
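• One way to express that policy in the desktops’ Condor
configuration – a minimal sketch using standard machine attributes
(KeyboardIdle, LoadAvg, CondorLoadAvg, Activity,
EnteredCurrentActivity); the macro names and the 5-minute / 0.3
thresholds below are illustrative, not BoilerGrid’s exact settings:

StartIdleTime  = 5 * 60
BackgroundLoad = 0.3
ActivityTimer  = (CurrentTime - EnteredCurrentActivity)
# Start only after 5 idle minutes with little non-Condor load
START   = (KeyboardIdle > $(StartIdleTime)) && \
          ((LoadAvg - CondorLoadAvg) <= $(BackgroundLoad))
# Suspend as soon as the owner is back
SUSPEND = (KeyboardIdle < 60) || \
          ((LoadAvg - CondorLoadAvg) > $(BackgroundLoad))
# Preempt if the suspension lasts 5 minutes or more
PREEMPT = (Activity == "Suspended") && ($(ActivityTimer) > 300)
# Kill jobs that take more than 5 minutes to vacate
KILL    = $(ActivityTimer) > 300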
Policies
• Policies are nearly infinitely customizable!
– If you can describe it, you can make Condor do it!
• A couple examples follow
Custom Machine Attributes
• Can add attributes to a machine’s ClassAd, typically
done in the local config file
HAS_MATLAB    = TRUE
NETWORK_SPEED = 1000
MATLAB_PATH   = "c:\matlab\bin\matlab.exe"
STARTD_EXPRS  = HAS_MATLAB, MATLAB_PATH, NETWORK_SPEED
Custom Machine Attributes
• Jobs can now specify Rank and Requirements using
new attributes:
Requirements = (HAS_MATLAB =?= UNDEFINED ||
HAS_MATLAB==TRUE)
Rank = NETWORK_SPEED =!= UNDEFINED &&
NETWORK_SPEED
START policies
• Time of Day Policy
– WorkHours = ( (ClockMin >= 480 && ClockMin < 1020) && \
(ClockDay > 0 && ClockDay < 6) )
AfterHours = ( (ClockMin < 480 || ClockMin >= 1020) || \
(ClockDay == 0 || ClockDay == 6) )
# Only start jobs after hours.
START = $(AfterHours) && $(CPUIdle) && KeyboardIdle >
$(StartIdleTime)
# Consider the machine busy during work hours,
# or if the keyboard or CPU are busy.
MachineBusy = ( $(WorkHours) || $(CPUBusy) || $(KeyboardBusy) )
START policies
• Policy to keep your network from saturating from
off-campus jobs
SmallRemoteJob = ( DiskUsage <= 30000 && \
                   FileSystemDomain != "my.filesystem.domain" )
# Only start jobs that don’t bring along
# huge amounts of data from off-campus.
START = $(SmallRemoteJob) && $(START)
Security
Host/IP Address Security
• The basic security model in Condor
– Stronger security available (Encrypted communications,
cryptographic authentication)
• Can configure each machine in your pool to allow
or deny certain actions from different groups of
machines
Advanced Security
Features
•AUTHENTICATION – Who is allowed
•ENCRYPTION - Private communications, requires
AUTHENTICATION.
•INTEGRITY - Checksums
Security Features
• Features individually set as REQUIRED,
PREFERRED, OPTIONAL, or NEVER
• Can be set as a default and per access level (READ, WRITE,
etc.)
• All default to OPTIONAL
• Leave NEGOTIATION at OPTIONAL
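• For example, a pool might insist on authentication and integrity
for writes while leaving reads and the negotiator at the defaults –
a hypothetical sketch, not BoilerGrid’s actual settings:

SEC_DEFAULT_AUTHENTICATION = OPTIONAL
SEC_DEFAULT_ENCRYPTION     = OPTIONAL
SEC_DEFAULT_INTEGRITY      = OPTIONAL
SEC_WRITE_AUTHENTICATION   = REQUIRED
SEC_WRITE_INTEGRITY        = REQUIRED
SEC_NEGOTIATOR_AUTHENTICATION = OPTIONAL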
Authentication Complexity
• Authentication comes at a price: complexity
• Authentication between machines requires an
authentication system
• Condor supports several existing authentication
systems
– We don’t want to create yet another one
AUTHENTICATION_METHODS
• Authentication requires one or more methods:
– FS
– FS_REMOTE
– GSI
– Kerberos
– NTSSPI
– CLAIMTOBE
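• To enable one of these, list the methods Condor should try – a
hypothetical example using filesystem authentication for local
commands and Kerberos between machines:

SEC_DEFAULT_AUTHENTICATION         = REQUIRED
SEC_DEFAULT_AUTHENTICATION_METHODS = FS, KERBEROS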
Networking
Networking
• Each submit node and potential execute node must
– Be able to communicate with each other
– Full bidirectional communication
• Firewalls are a problem
– We can deal with that, see next slide
• NAT is more of an issue…
Networking
• Firewalls
– Port 9618 needs to be open to your central manager,
from all of your execute machines
– Define range for dynamic ports
HIGHPORT = 50500
LOWPORT = 50000
– And open corresponding ports in firewall
– Condor can install its own exception in Windows firewall
configuration
Virtualization
Condor’s VM Universe
[Diagram: the schedd on the submit machine hands the job to the startd on the execute machine, which launches a virtual machine; the job runs inside the VM.]
Condor’s VM Universe
• Rather than submit a program into potentially
unknown execution environments, why not submit
the environment?
• The VM image is the job
• Job output is the modified VM image
• VMWare and Xen are supported
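• A minimal VM-universe submit file might look like this (the paths
and names are illustrative, loosely following the VMware example in
the Condor manual):

universe   = vm
executable = vmware_test_job
log        = vmware_test_job.log
vm_type    = vmware
vm_memory  = 512
vmware_dir = /path/to/vm_image_dir
vmware_should_transfer_files = True
queue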
Virtual Condor Appliance
• Engineering is Purdue’s largest non-central IT organization – 4000
machines
– Already a BoilerGrid partner, providing nearly 1000 cores of Linux cluster nodes to
BoilerGrid.
• But what about desktops? What about Windows?
– Engineering is interested... But…
• Engineering leadership wants the ability to sandbox Condor away from
systems holding research or business data.
• Can we do this?
Virtual Condor Appliance
• Sure!
• Distribute virtual machine images running a
standard OS and Condor Configuration
– CentOS 5.2
– Virtual private p2p networking
– Encryption, authentication
Virtual Condor Appliance
• For us and partners on campus, this is a win
– Machine owners get their sandbox
– Our support load to bring new machine owners online gets easier
– Execution environments become consistent
• Much of the support load with new “sites” is firewall and Condor
permissions.
– Virtual machines and virtual “IPOP” network makes that all go away.
• Not only native installers for campus users, but now a VM image
– With installer to run virtual nodes as a service
– Systray app to forward keyboard/mouse events to virtual guests
• Not dependent on any one virtualization implementation – we can prepare and
distribute VM images for KVM, VirtualBox, VMware, Xen, and so on.
– Just VMWare currently
– We’ll offer more in the future.
Condor Week 2009
Whew!!!
I could also talk lots about…
• GCB: Living with firewalls & private networks
• Federated Grids/Clusters
• APIs and Portals
• MW
• Database Support (Quill)
• High Availability Fail-over
• Compute On-Demand (COD)
• Dynamic Pool Creation (“Glide-in”)
• Role-based prioritization and accounting
• Strong security, incl privilege separation
• Data movement scheduling in workflows
•…
Conclusion
• Campus Grids are effective ways of bringing high-performance computing to campus
– Using the institution’s existing investment in computing
• The Condor software is an excellent framework for
implementing a Campus Grid
– Flexible
– Powerful
– Minimal extra work for lab administrators!
• Just one more package in your image
• Virtualization with Condor
– Improve security of machine owners’ systems
– Improve grid manageability
– Consistency
The End
Questions?
Interested in a campus grid at your
institution?
Want to join DiaGrid?
Download: http://www.rcac.purdue.edu/boilergrid