Slides on CHTC from 692 Seminar

When and How to Use Large-Scale
Computing: CHTC and HTCondor
Lauren Michael, Research Computing Facilitator
Center for High Throughput Computing
STAT 692, November 15, 2013
Topics We’ll Cover Today
› Why to Access Large-Scale Computing Resources
› CHTC Services and Campus-Shared Computing
› What is High-Throughput Computing (HTC)?
› What is HTCondor and How Do You Use It?
› Maximizing Computational Throughput
› How to Run R on Campus-Shared Resources
When should you use outside computing resources?
1. your computing work won’t run at all on your computer(s) (they lack sufficient RAM, disk, etc.)
2. your computing work will take too long on your own computer(s)
3. you would like to off-load certain processes in favor of running others on your computer(s)
CHTC Services
Center for High Throughput Computing, est. 2006
› Large-scale, campus-shared computing systems
   high-throughput computing (HTC) grid and high-performance computing (HPC) cluster
   all standard services provided free-of-charge
   automatic access to the national Open Science Grid (OSG)
   hardware buy-in options for priority access
   information about other computing resources
› Support for using our systems
   consultation services, training, and proposal assistance
   solutions for many software packages (including Python, Matlab, and R)
HTCondor: CHTC’s R&D Arm
› R&D for HTCondor and other HTC software
› Services provided to the campus community
 HTC Software
• HTCondor: manage your compute cluster
• DAGMan: manage computing workflows
• Bosco: submit locally, run globally
 Software Engineering Expertise & Consulting
• CHTC-operated Build-and-Test Lab (BaTLab)
 Software Security Consulting
Your Problems become Our Research!
http://chtc.cs.wisc.edu
Quick Facts

                       Jul’10–Jun’11   Jul’11–Jun’12   Jul’12–Jun’13
Million Hours Served         45              70              97
Research Projects            54             106             120
Departments                  35              52              52
Off-Campus Projects          10              13              15
[Map: Researchers who use the CHTC are located all over campus (red buildings).]
CHTC Staff
Director: Miron Livny, miron@cs.wisc.edu
(also OSG Technical Director and WID’s CTO)
Campus Support: chtc@cs.wisc.edu
› 2+ Research Computing Facilitators, including Lauren Michael (lead), lmichael@wisc.edu
› 3 Systems Administrators
› +4-8 Part-time Students
› HTCondor Development Team
› OSG Software Team
HTC versus HPC
› high-throughput computing (HTC)
   many independent processes that can each run on one or a few processors (“cores” or “threads”) on the same computer
   mostly standard programming methods
   best accelerated by: access to as many cores as possible
› high-performance computing (HPC)
   sharing the workload of interdependent processes over multiple cores to reduce overall compute time
   OpenMP, MPI, or other multi-threaded programming methods
   requires: access to many cores within the same tightly-networked cluster, plus access to shared files
“parallel” is confusing
› essentially means: spread computing work out over multiple processors
› Use of the words “parallel” and “parallelize” can apply to HTC or HPC when referring to programs
› It’s important to be clear!
Topics We’ll Cover Today
› Why to Access Large-Scale Computing Resources
› CHTC Services and Campus-Shared Computing
› What is High-Throughput Computing (HTC)?
› What is HTCondor and How Do You Use It?
› Maximizing Computational Throughput
› How to Run R on Campus-Shared Resources
What is HTCondor?
› match-maker of computing work and computers
› “job scheduler”
   matches are made based upon necessary RAM, CPUs, disk space, etc., as requested by the user
   jobs re-run if interrupted
› works beyond “clusters” to coordinate distributed computers for maximum throughput
› coordinates data transfers between users and distributed computers
› can coordinate servers, desktops, and laptops
How HTCondor Works
[Diagram: the Central Manager (of the pool) matches jobs waiting in the queue on the Submit Node(s) (where jobs are submitted; e.g., job1.1 and job1.2 from user1, job2.1 from user2) to Execute Node(s) (where jobs run). Input files flow from the submit node to the execute nodes, and output files flow back.]
Submit nodes available to YOU
[Diagram: submit hosts and the pools they can reach. Stat dept servers submit to the CS Pool by default; simon.stat.wisc.edu and the CHTC submit nodes submit to the CHTC Pool by default. Jobs can also reach the Campus Grid via “flocking” and the Open Science Grid via “glidein”.]
Basic HTCondor Submission
› Prepare programs and files
› Write submit file(s)
› Submit jobs to the queue
› Monitor the jobs
› (Remove bad jobs)
Preparing Programs and Files
› Make programs portable
   compile code to a simple binary
   statically link code dependencies
   consider CHTC’s tools for packaging Matlab, Python, and R
› Consider using a shell script (or other “wrapper”) to run multiple commands for you (see the sketch below)
   create a local install of software
   set environment variables
   then, run your code
› Stage all files on a submit node
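To make the wrapper idea concrete, here is a minimal sketch of a shell script that could serve as a job's executable. Every file name in it (mysoftware.tar.gz, myprogram, input.in) is a placeholder, not a CHTC-provided file; it simply mirrors the three steps listed above.

   #!/bin/bash
   # run_job.sh - hypothetical wrapper used as the HTCondor job's "executable"
   set -e                                   # stop if any command fails

   tar xzf mysoftware.tar.gz                # create a local install of software shipped with the job
   export PATH=$PWD/mysoftware/bin:$PATH    # set environment variables to point at it

   ./myprogram input.in > results.out       # then, run your code; results.out is transferred back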
HTC Components
1. Cut up computing work into many independent pieces
   (CHTC can consult)
2. Make programs portable, minimize dependencies
   (CHTC can consult, or may have prepared solutions)
3. Learn how to submit jobs
   (CHTC can help you a lot!)
4. Maximize your overall throughput on available computational resources
   (CHTC can help you a lot!)
Basic HTCondor Submit File

# This is a comment
universe = vanilla
output = process.out
error = process.err
log = process.log
executable = cosmos
arguments = cosmos.in 4
should_transfer_files = YES
transfer_input_files = cosmos.in
when_to_transfer_output = ON_EXIT
request_memory = 100
request_disk = 100000
request_cpus = 1
queue

Notes:
› basic jobs are “vanilla” universe
› output and error are where system output and error will go
› log is where HTCondor stores info about how your job ran
› executable is your single program or a shell script
› the program will be run as: ./cosmos cosmos.in 4
› request_memory is in MB; request_disk is in KB
› queue with no number after it will submit only one job
Basic HTCondor Submit File
# This is a comment
universe = vanilla
output = process.out
error = process.err
log = process.log
executable = cosmos
arguments = cosmos.in 4
should_transfer_files = YES
transfer_input_files = cosmos.in
when_to_transfer_output = ON_EXIT
request_memory = 100
request_disk = 100000
request_cpus = 1
queue
Initial File Organization
In folder test/:
   cosmos
   cosmos.in
   submit.txt
HTCondor Multi-Job Submit File
# This is a comment
universe = vanilla
output = $(Process).out
error = $(Process).err
log = $(Cluster).log
executable = cosmos
arguments = cosmos_$(Process).in
should_transfer_files = YES
transfer_input_files = cosmos_$(Process).in
when_to_transfer_output = ON_EXIT
request_memory = 100
request_disk = 100000
request_cpus = 1
queue 3
test/
   cosmos
   cosmos_0.in
   cosmos_1.in
   cosmos_2.in
   submit.txt
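One way to produce the per-job input files that the submit file above expects (cosmos_$(Process).in with queue 3) is a small loop run once on the submit node. This is only a sketch; the file contents here (a made-up seed parameter) stand in for however you actually split your work.

   #!/bin/bash
   # make_inputs.sh - hypothetical helper, run once on the submit node
   # Creates cosmos_0.in, cosmos_1.in, cosmos_2.in to match "queue 3".
   for i in 0 1 2; do
       echo "seed = $i" > cosmos_$i.in    # per-job parameter; format is made up
   done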
HTCondor Multi-Folder Submit File
# This is a comment
universe = vanilla
InitialDir = $(Process)
output = $(Process).out
error = $(Process).err
log = /home/user/test/$(Cluster).log
executable = /home/user/test/cosmos
arguments = cosmos.in
should_transfer_files = YES
transfer_input_files = cosmos.in
when_to_transfer_output = ON_EXIT
request_memory = 100
request_disk = 100000
request_cpus = 1
queue 3
test/
   cosmos
   cosmos.in
   submit.txt
   0/
      cosmos.in
   1/
      cosmos.in
   2/
      cosmos.in
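A corresponding sketch for the multi-folder layout: create one numbered directory per job (matching InitialDir = $(Process)) and place that job's cosmos.in inside it. Again, the input file contents are placeholders.

   #!/bin/bash
   # make_folders.sh - hypothetical helper matching InitialDir = $(Process) and "queue 3"
   for i in 0 1 2; do
       mkdir -p $i                        # one working directory per job
       echo "seed = $i" > $i/cosmos.in    # each directory gets its own input file
   done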
Submitting Jobs
[lmichael@simon test]$ condor_submit submit.txt
Submitting job(s)...
3 job(s) submitted to cluster 29747.
[lmichael@simon test]$
Checking the Queue

[lmichael@simon test]$ condor_q lmichael

-- Submitter: simon.stat.wisc.edu : <144.92.142.159:9620?sock=3678_5c57_3> : simon.stat.wisc.edu
 ID       OWNER     SUBMITTED    RUN_TIME    ST PRI SIZE  CMD
 29747.0  lmichael  2/15 09:06   0+00:01:34  R  0   9.8   cosmos cosmos.in
 29747.1  lmichael  2/15 09:06   0+00:00:00  I  0   9.8   cosmos cosmos.in
 29747.2  lmichael  2/15 09:06   0+00:00:00  I  0   9.8   cosmos cosmos.in

3 jobs; 0 completed, 0 removed, 2 idle, 1 running, 0 held, 0 suspended
[lmichael@simon test]$

View all user jobs in the queue: condor_q
Log Files
000 (29747.001.000) 02/15 09:29:17 Job submitted from host:
<144.92.142.159:9620?sock=3678_5c57_3>
...
001 (29747.001.000) 02/15 09:33:59 Job executing on host:
<144.92.142.153:9618?sock=17172_f1f3_3>
...
005 (29747.001.000) 02/15 09:39:01 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
0 - Total Bytes Sent By Job
0 - Total Bytes Received By Job
Partitionable Resources :    Usage  Request  Allocated
   Cpus                 :                 1          1
   Disk (KB)            :   225624   100000     645674
   Memory (MB)          :       85     1000       1024
Removing Jobs
› Remove a single job: condor_rm 29747.0
› Remove all jobs of a cluster: condor_rm 29747
› Remove all of your jobs: condor_rm lmichael
Topics We’ll Cover Today
› Why to Access Large-Scale Computing Resources
› CHTC Services and Campus-Shared Computing
› What is High-Throughput Computing (HTC)?
› What is HTCondor and How Do You Use It?
› Maximizing Computational Throughput
› How to Run R on Campus-Shared Resources
Maximizing Throughput
› The Philosophy of HTC
› The Art of HTC
› Other Best-Practices
The Philosophy of HTC
› break up your work into many ‘smaller’ jobs
   single CPU, short run times, small input/output data
› run on as many processors as possible
   single CPU and low RAM needs
   take everything with you; make programs portable
   use the “right” submit node for the right “resources”
› automate as much as you can
› (share your processors with others to increase everyone’s throughput)
Success Stories
› Edgar Spalding: studies the effects of genes on plant growth outcomes
› GeoDeepDive Project: extracts and compiles “dark data” from PDFs of publications in the geosciences
We want HTC to revolutionize your research!
The Art of HTC
carrying out the philosophy, well
› Tuning job requests for memory and disk
› Matching run times to the maximum number of available processors
› Automation
Tuning Job Resource Requests
Problem: Don’t know what your job needs?
› If you don’t ask for enough memory and disk:
   Your jobs will be kicked off for going over, and will have to be retried (though HTCondor will automatically request more for you)
› If you ask for too much:
   Your jobs won’t match to as many available “slots” as they could
Tuning Job Resource Requests
Solution: Testing is Key!!!
1. Run just a few jobs at first to determine memory and disk needs from the log files (see the example command after this list)
   If your first request is not enough, HTCondor will retry the jobs and request more until they finish.
   It’s okay to request a lot (1 GB each) for a few tests.
2. Change the “request” lines to a better value
3. Submit a large batch
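To pull the measured usage out of your test jobs' log files, something like the following is enough; the *.log pattern matches the log names used in the submit files earlier, and the -A 4 context size is just a guess at how many lines the resource table spans.

   # print the "Partitionable Resources" usage/request/allocated table from each log
   grep -A 4 "Partitionable Resources" *.log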
Time-Matching (submit file additions)
[Diagram: match job run times to the pools reachable from each submit host.
   CS Pool (default from Stat dept servers): 4 hrs?
   CHTC Pool (default from simon.stat.wisc.edu and the CHTC submit nodes): <24 hrs (up to 72)*
   Campus Grid: <4 hrs; add +WantFlocking = true to the submit file
   Open Science Grid: <2 hrs; add +WantGlidein = true to the submit file]
Time-Tuning: Batching
› Problem: Jobs of less than 5 minutes are bad for overall throughput
   more time is spent on matching and data transfers than on your job’s processes
   ideal run time is between 5 minutes and 2 hours (OSG)
› Solution: Use a shell script (or other method) to run multiple processes within a single job (see the sketch below)
   avoids transfer of intermediate files between sequential, related processes
   debugging can be a bit trickier
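A minimal sketch of that kind of batching wrapper, assuming a placeholder program (./myprogram) and a handful of made-up input files; how you group short tasks into one job is up to you.

   #!/bin/bash
   # batch.sh - hypothetical wrapper that runs several short tasks inside one HTCondor job,
   # so matching and file-transfer overhead is paid once instead of per task
   set -e
   for task in task_00 task_01 task_02 task_03 task_04; do
       ./myprogram ${task}.in > ${task}.out   # intermediate files stay on the execute node
   done
   tar czf results.tar.gz task_*.out          # one compressed output file is transferred back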
Time-Tuning: Checkpointing
› The best way to run longer jobs without losing progress to eviction.
Two Ways:
1. Compile your code with condor_compile and use the “standard” universe within HTCondor
2. Implement self-checkpointing (see the sketch below)
*Consult HTCondor’s online manual or contact the CHTC for help
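For option 2, a heavily simplified sketch of a self-checkpointing wrapper follows. Everything in it is a placeholder: the program, its hypothetical --resume flag, and the checkpoint file name. It also assumes the checkpoint file comes back to the submit node if the job is interrupted (for example via when_to_transfer_output = ON_EXIT_OR_EVICT), which you should confirm against the HTCondor manual or with the CHTC.

   #!/bin/bash
   # checkpoint_wrapper.sh - hypothetical self-checkpointing wrapper.
   # The program itself must periodically write state.ckpt and know how to
   # resume from it; the wrapper only decides which mode to start in.
   if [ -f state.ckpt ]; then
       ./myprogram --resume state.ckpt     # continue from the last saved checkpoint
   else
       ./myprogram input.in                # fresh start
   fi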
Automate Tasks
› Use $(Process)
› Shell scripts to run multiple tasks within the same job
   including environment preparation
› Hardcode arguments, calculate them (random number generation), or use parameter files/tables
› Use HTCondor’s DAGMan feature (see the example below)
   “directed acyclic graph”
   create complex workflows of dependent jobs, and submit them all at once
   additional helpful features: success checks and more
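As a taste of DAGMan, a minimal DAG input file could look like this; A.sub and B.sub are hypothetical submit files like the ones earlier in these slides, and job B starts only after job A finishes successfully. It would be submitted with condor_submit_dag, the same command used for the R workflow later on.

   # workflow.dag - minimal hypothetical DAGMan input file
   JOB A A.sub
   JOB B B.sub
   PARENT A CHILD B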
Non-Throughput Considerations
Remember that you are sharing with others
› “Be Kind to Your Submit Node”
   avoid transfers of large files through the submit node (large: >10 GB per batch; ~10 MB/job x 1000+ jobs)
   • transfer files from another server as part of your job (wget and curl; see the sketch below)
   • compress where appropriate; delete unnecessary files
   • remember: “new” files are copied back to submit nodes
   avoid running multiple CPU-intensive executables
› Test all new batches, and scale up gradually
   3 jobs, then 100s, then 1000s, then ...
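A tiny sketch of the "fetch large inputs inside the job" idea; the URL is entirely made up, and in practice you would point it at wherever your large data actually lives.

   #!/bin/bash
   # fetch_and_run.sh - hypothetical wrapper that pulls a large input directly
   # on the execute node instead of routing it through the submit node
   set -e
   wget -q http://example.org/mylab/bigdata.tar.gz   # placeholder URL
   tar xzf bigdata.tar.gz
   ./myprogram bigdata/ > summary.out                # only the small summary comes back
   rm -rf bigdata bigdata.tar.gz                     # "new" files are copied back, so clean up first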
Topics We’ll Cover Today
› Why to Access Large-Scale Computing Resources
› CHTC Services and Campus-Shared Computing
› What is High-Throughput Computing (HTC)?
› What is HTCondor and How Do You Use It?
› Maximizing Computational Throughput
› How to Run R on Campus-Shared Resources
Running R on HTC Resources: The Best Way
› Problem: R programs don’t easily compile to a binary
› Solution: Take R with your job!
   CHTC has tools just for R (and Python, and Matlab)
› Installed on CS/Stat submit nodes, simon, and CHTC submit nodes
1. Build R Code with chtc_buildRlibs
› Copy your R code and any R library tar.gz files to the submit node
› Run the following command:
   chtc_buildRlibs --rversion=sl5-R-2.10.1 \
       --<library1>.tar.gz,<library2>.tar.gz
› R versions supported: 2.10.1, 2.13.1, 2.15.1 (use the closest version below yours)
› Get back sl5-RLIBS.tar.gz and sl6-RLIBS.tar.gz (you’ll use these in the next step)
2. Download the “ChtcRun” Package
› download ChtcRun.tar.gz, according to the guide (wget)
› un-tar it: tar xzf ChtcRun.tar.gz
› View ChtcRun contents:
   process.template   (submit file template)
   mkdag              (script that will ‘create’ jobs based upon your staged data)
   Rin/               (example data staging folder)
3. Prepare data and process.template
› Stage data as such:
   ChtcRun/
      data/
         1/         input.in <specific_files>
         2/         input.in <specific_files>
         job3/      input.in <specific_files>
         test4/     input.in <specific_files>
      shared/       <RLIBS.tar.gz> <program>.R <shared_files>
› Modify process.template with respect to:
   request_memory and request_disk, if you know them
   +WantFlocking = true OR +WantGlidein = true
4. Run mkdag and submit jobs
› In ChtcRun, execute the mkdag script
   (Examples at the top of “./mkdag --help”)
   ./mkdag --data=Rin --outputdir=Rout \
       --cmdtorun=soartest.R --type=R \
       --version=R-2.10.1 --pattern=meanx
   “pattern” indicates a portion of a filename that you expect to be created by successful completion of any single job
› A successful mkdag run will instruct you to navigate to the ‘outputdir’ and submit the jobs as a single DAG:
   condor_submit_dag mydag.dag
5. Monitor Job Completion
› Check jobs in the queue as they’re gradually added and completed (condor_q)
› Check other files in your ‘outputdir’:
   Rout/
      mydag.dag.dagman.out   (updated table of job stats)
      1/   process.log, process.out, process.err, ChtcWrapper1.out
      2/   process.log, process.out, process.err, ChtcWrapper2.out
      .../
After testing a small number of jobs, submit many!
(up to many 10,000s; the number submitted at once is throttled for you)
What Next?
1. Use a Stat server to submit shorter jobs to the CS pool.
2. Obtain access to simon.stat.wisc.edu from Mike Camilleri (mikec@stat.wisc.edu), and submit longer jobs to the CHTC Pool.
3. Meet with the CHTC to submit jobs to the entire UW Grid and to the national Open Science Grid.
   chtc.cs.wisc.edu, click “Get Started”
User support for HTCondor users at UW: chtc@cs.wisc.edu