Advanced Scheduler Tips and Tricks
Matthew Scholz
Why are you here?
• “My job isn’t starting quickly”
• “I want to do MORE work on the system”
• “My job keeps dying on the cluster”
Overview
• Scheduling Priorities
• Cluster Resources
• Better Resource Requests
• Array Jobs
• Long jobs (BLCR)
Scheduling Priorities
• Jobs that use more resources get higher priority
(because these are hard to schedule)
• Smaller jobs are “backfilled” to fit in the holes
created by the bigger jobs
• Eligible jobs acquire more priority as they sit in the
queue
• Jobs can be in three basic states:
– Blocked, eligible or running
Cluster Resources
• #PBS -l nodes=1:ppn=4,walltime=4:00:01,mem=4gb
• Resources you request relate to time in queue
• Factors
– Priority (number of cores requested)
– Available Resources (number of cores available)
– RAM/Core
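As a rough illustration of where the request line above lives, here is a minimal job script sketch. The fallback values for runs outside the scheduler and the core-counting step are assumptions added for illustration, not part of the original slides.

```shell
#!/bin/bash
# Resource request from the slide: 1 node, 4 cores, just over 4 hours, 4 GB total.
#PBS -l nodes=1:ppn=4,walltime=04:00:01,mem=4gb

# Start in the directory the job was submitted from; fall back to the
# current directory when the script is run outside the scheduler.
cd "${PBS_O_WORKDIR:-$PWD}"

# Under Torque, PBS_NODEFILE has one line per allocated core.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    ncores=$(( $(wc -l < "$PBS_NODEFILE") ))
else
    ncores=1   # assumption: treat a dry run outside PBS as a single core
fi
echo "running on ${ncores} core(s) in $(pwd)"
```

Because the #PBS lines are ordinary comments to bash, the same file can be dry-run on a login node before it is submitted.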
Cluster Resources 2
Year  Name     Description                                   ppn  Memory  Nodes  Total Cores
2007  intel07  Quad-core 2.3GHz Intel Xeon E5345               8  8GB       124    992
2009  amd09    Sun Fire X4600 (Fat Node), AMD Opteron 8384    32  256GB       3     96
2010  gfx10    NVIDIA CUDA Node (feature=GBE)                  8  18GB       32    256
2010  intel10  Intel Xeon E5620 (2.40 GHz)                     8  24GB      192   1536
2011  intel11  Intel Xeon 2.66 GHz E7-8837                    32  512GB       2     64
                                                              32  1TB         1     32
                                                              64  2TB         2    128
2014  intel14  Intel Xeon E5-2670 v2 (2.6 GHz)                20  64GB      128   2560
                                                              20  256GB      24    480
               2 NVIDIA K20 GPUs (feature=gpgpu)              20  128GB      40    800
               2 Xeon Phi 5110P (feature=phi)                 20  128GB      28    560
Total*                                                                      576   7504
* Does not include Condor Cluster
Cluster Resources 3
• Buy-in Priority
– Investigators have helped make the cluster larger by
purchasing some of the nodes.
– These nodes are “reserved”
• Buy-in users: jobs of up to 1 week on these nodes
• Non-buy-in users: jobs of up to 4 hours on these nodes
QUESTIONS?
Better Resource Requests
• RAM/Core vs. RAM/Node
– When requesting memory with -l mem=XGB, remember that the total is
divided PER core.
– E.g. ppn=4,mem=4GB == 1GB/core
– Each node can provide at most the total RAM of the machine. For the
best chance of starting quickly, keep your request at or below the
AVERAGE RAM/core of the node type.
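The per-core arithmetic can be sketched in a few lines of shell. The request values come from the example above and the node figures from the intel14 row of the resource table; the comparison at the end is only an illustrative rule of thumb, not scheduler behavior.

```shell
#!/bin/bash
# Values from the example on the slide: ppn=4, mem=4GB.
ppn=4
mem_gb=4
mem_per_core_mb=$(( mem_gb * 1024 / ppn ))
echo "request: ${mem_per_core_mb} MB per core"

# Average RAM/core of one node type from the resource table
# (intel14: 20 cores and 64GB per node).
node_cores=20
node_mem_gb=64
node_avg_mb=$(( node_mem_gb * 1024 / node_cores ))
echo "node average: ${node_avg_mb} MB per core"

# Illustrative check only: compare the request with the node average.
if [ "$mem_per_core_mb" -le "$node_avg_mb" ]; then
    echo "request is at or below the node's average RAM/core"
else
    echo "request is above the node's average RAM/core"
fi
```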
Better Resource Requests
• Walltime:
– <= 4 hours: more machines are available
– Up to 1 week of walltime is allowed
• feature=gbe
– ~320 cores available, not on InfiniBand (the high-speed
interconnect)
– If nodes=1, you can also request feature=gbe
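To see on which side of the 4-hour boundary a walltime request falls, a small helper can convert HH:MM:SS to seconds. This is a sketch: the function name walltime_to_seconds is hypothetical, and it assumes the walltime is always written with three colon-separated fields.

```shell
#!/bin/bash
# Convert a walltime written as HH:MM:SS into seconds.
walltime_to_seconds() {
    old_ifs=$IFS
    IFS=:
    set -- $1          # split the argument on ':' into $1 $2 $3
    IFS=$old_ifs
    # Strip one leading zero so values such as "08" are not read as octal.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

limit=$(( 4 * 3600 ))   # the 4-hour boundary from the slide

for wt in 03:59:00 04:00:00 4:00:01; do
    secs=$(walltime_to_seconds "$wt")
    if [ "$secs" -le "$limit" ]; then
        echo "$wt (${secs}s): within 4 hours, more machines available"
    else
        echo "$wt (${secs}s): longer than 4 hours"
    fi
done
```

Note that a walltime of 4:00:01 is one second over the boundary, so trimming a request slightly can change which machines a job is eligible for.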
QUESTIONS?
Array Jobs
• Pleasantly parallel workflows:
– I need to sort 50 files, and generate 50 new files
– (NOT: I need to sort 50 files into 1 large file)
– Jobs are independent of each other, but have the same
behavior
• #PBS -t 1-40
• (OR)
• #PBS -t 2,4,8
Array Jobs
• What does this DO?
• -t 1-2 submits _2_ jobs
• Each job is identical, EXCEPT
– When job starts, environment variable is set:
• $PBS_ARRAYID=1 (-t 1)
• $PBS_ARRAYID=2 (-t 2)
• Etc
– A script can read this variable to pick a different input
(or run a different workflow) in each job
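A minimal array job sketch for the "sort 50 files into 50 new files" case might look like this. The file naming scheme (input_N.txt, sorted_N.txt) and the resource values are assumptions made up for illustration.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=1,walltime=01:00:00,mem=1gb
#PBS -t 1-50

# Work in the submission directory; fall back to the current directory
# when the script is tested outside the scheduler.
cd "${PBS_O_WORKDIR:-$PWD}"

# PBS_ARRAYID is set by the scheduler for each job in the array.
# Outside PBS it is unset, so default to 1 for a dry run.
task_id=${PBS_ARRAYID:-1}

# Hypothetical naming scheme: one input and one output file per task.
infile="input_${task_id}.txt"
outfile="sorted_${task_id}.txt"

if [ -f "$infile" ]; then
    sort "$infile" > "$outfile"
    echo "task ${task_id}: wrote ${outfile}"
else
    echo "task ${task_id}: ${infile} not found, nothing to do"
fi
```

Each of the 50 jobs runs the same script and differs only in the value of PBS_ARRAYID, which is what keeps the jobs independent of each other.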
QUESTIONS?
Long jobs (BLCR)
• Berkeley Lab Checkpoint/Restart
– Wrapper around a program.
– Can save entire program (checkpoint) for restart later
– We have a powertool! (longjob)
• Uses:
– Jobs that need to run > 1 week
– Jobs that are taking too long to start (can be run in 4 hour
chunks)
BLCR Example
(commands)
cd
mkdir examples
cd examples
module load powertools
getexample velvet_blcr
cd velvet_blcr
nano velveth_blcr.qsub
• Command line tools to make a new examples dir in your homedir,
and grab the example
• Nano to edit (you can use your editor of choice)
Breakdown of commands
• Must have powertools module loaded in script
• Use $PBS_O_WORKDIR
• Important variables:
• BLCR_WAIT_SEC
– how long to run before beginning checkpoint
– MUST be less than walltime (enough to allow save)
• BLCR_OUTPUT
– Name of output file
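One way to keep BLCR_WAIT_SEC safely below the walltime is to derive it from the walltime minus a save margin. This is only a sketch: the 10-minute margin and the output file name are hypothetical values, and the real margin depends on how long your program's memory takes to save.

```shell
#!/bin/bash
# Walltime requested for each chunk, in seconds (4 hours here).
walltime_sec=$(( 4 * 3600 ))

# Hypothetical margin left for writing the checkpoint; tune it to the
# memory footprint of your own program.
save_margin_sec=$(( 10 * 60 ))

# Run this long before beginning the checkpoint; must be less than walltime.
export BLCR_WAIT_SEC=$(( walltime_sec - save_margin_sec ))

# Name of the output file (hypothetical name).
export BLCR_OUTPUT="output.txt"

echo "BLCR_WAIT_SEC=${BLCR_WAIT_SEC} (walltime ${walltime_sec}s)"
```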
BLCR continued
# If the checkpoint file does not exist, this is the first run
if [ ! -f checkfile.blcr ]
then
    # Make a new working directory named after the job
    WORK="${PBS_O_WORKDIR}/${PBS_JOBID}"
    mkdir -p "${WORK}"
    # Run main simulation program
    cd "${WORK}"
fi
• If statement, to see if it is the first time running:
if so, make a new directory, for safety
BLCR
• longjob command
– ONE command, if you are running a pipeline, put it all into
a separate script file
– Remember that the WORK will be done in a new subdir
(no relative paths)
• Remember: it takes time to save the memory footprint, so leave
enough walltime for the checkpoint to finish
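Since the checkpoint wrapper takes ONE command, a multi-step pipeline can be folded into a single script that uses absolute paths. The sketch below assumes a hypothetical script name (pipeline.sh) and hypothetical file names. How the script is then handed to longjob should follow the velvet_blcr example rather than this sketch.

```shell
#!/bin/bash
# pipeline.sh (hypothetical name): every step of the pipeline in one script,
# so the checkpoint wrapper only has a single command to run.

# Absolute path to the submission directory; falls back to the current
# directory when the script is run outside PBS.
SUBMIT_DIR="${PBS_O_WORKDIR:-$(pwd)}"

# Build absolute paths, because the work is done in a new subdirectory
# and relative paths would no longer resolve.
INPUT="${SUBMIT_DIR}/reads.txt"          # hypothetical input file
OUTPUT="${SUBMIT_DIR}/read_counts.txt"   # hypothetical output file

if [ -f "$INPUT" ]; then
    # Two chained steps standing in for a real pipeline.
    sort "$INPUT" | uniq -c > "$OUTPUT"
    echo "wrote $OUTPUT"
else
    echo "input not found: $INPUT" >&2
fi
```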
Last chance:
QUESTIONS?