PhD Seminar Series: Introduction
Chris Cantwell
Mathematics Institute, University of Warwick
January 2008

Outline

PhD Seminars
- Introduction
- Organisation

PBS and CSC
- Overview
- Submitting and Monitoring Jobs
- Working with Jobs
- More PBS fun
- Conclusion

PhD Seminars

- Series of informal seminars run for Masters and PhD students.
- Main purpose is not to discuss our research.
- Share knowledge and experience.
- Provide practical examples of common numerical methods...
- ...but remain as language-independent as possible.
- Help students in the 'CSC suburbs' who may not have as much peer support.
- Provide a friendly atmosphere for improving presentation skills.

Topics Covered

Some of the topics we hope to cover over the term are:

- Using CSC resources
  - Clusters and CSC's systems
  - PBS
- Numerical libraries and how to use them in your own research
  - Linear Algebra: BLAS, LAPACK
  - Fourier Transform: FFTW
  - Others...
- Scientific Packages
  - Mathematics: Mathematica, MATLAB and Maple
  - LaTeX, Beamer

Organisation

- PS017a 'fishbowl'
- Breakfast provided!
- Initially on even weeks. Will be held weekly if enough speakers are confirmed. Any volunteers?
:-)

This session

The remainder of this session will discuss PBS, a key software tool used for managing computation jobs on CSC's systems.

PBS Pro and CSC
Chris Cantwell
Mathematics Institute, University of Warwick
January 2008

What is PBS?

PBS stands for Portable Batch System.

- Set of queues
- Jobs are placed in one queue
- A job is a script
- Each job specifies a set of required resources
- Jobs are executed when sufficient resources are available

Where is PBS used? A rough guide...

- CoW
  - 'Cluster of Workstations'.
  - 184 (as of yesterday) individual desktop computers.
  - Many different architectures and specifications.
  - Mainly for serial codes.
  - Everyone with a CSC account has access.
- SGI Altix (skua.csc.warwick.ac.uk)
  - Shared memory machine (112GB) with 56 processors.
  - Single OS image across all processors.
  - Mainly for highly parallel or large memory codes.
  - Access must be granted.
- New Cluster...
  - Collection of 240 identical hosts.
  - Each with two dual-core processors = 960 cores.
  - Separate memory for each host.
  - Suitable for parallel or serial codes.
  - Access must be granted.
Why use PBS?

Mainly useful for long jobs you don't want to leave running on your laptop.

- Harness idle computing power.
- Easily share the available computing resources of CSC.
- Optimise the use of those resources.
- Allows fair usage for all users.
- Simplifies running code on multiple hosts.

Using PBS

So, you want to run your code on the CoW?

From our user perspective, PBS is just a set of utilities for managing jobs and a 'black-box' server which schedules the jobs and runs them.

To use PBS, we can either use:

- A set of command-line utilities: mainly qsub, qstat, qdel, qalter, etc.
- A graphical interface: xpbs

Command-line utilities are preferable if you are confident with them. For the purpose of demonstrations in this session, we will use the CoW.

A Quick Glossary of Terms

- job: a single unit of work.
- queue: a named, ordered container for jobs within a server.
- vnode: virtual node, a set of usable resources on a machine.
- host: a machine with its own OS, having one or more vnodes.
- chunk: a set of resources allocated to a job. Chunks can be split across vnodes but not hosts.
A Diagrammatic Explanation of Chunk

[Diagram: a single host containing vnode1, vnode2 and vnode3.]

Submitting a job

First write a PBS script which does the job you want done.

- A PBS script is just a normal script!
- Usually a shell script (e.g. bash).
- Use special comments to give instructions to PBS, such as what resources your job requires.

Submit the script to the PBS system.

- Use the qsub command:
    qsub <script-name>
- Can also specify resources on the command-line.
- If qsub runs successfully, it will return a Job ID.
- Use this ID with other PBS commands.
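The submission step can be sketched as a few lines of shell. This is only an illustration: it assumes a script called simple.pbs (the example name used in this session) in the current directory, and the job_number helper is a hypothetical convenience, not part of PBS. The Job ID form 12345.foo follows the one used in the qmove example later in the session.

```shell
#!/bin/bash
# Sketch: submit a script and keep the Job ID that qsub prints.

# Hypothetical helper: strip the server suffix from a Job ID,
# e.g. 12345.foo -> 12345.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Only attempt a submission where PBS is installed and the script exists.
if command -v qsub >/dev/null 2>&1 && [ -f simple.pbs ]; then
    job_id=$(qsub simple.pbs)    # qsub prints the Job ID on stdout
    echo "Submitted job ${job_id} (number $(job_number "$job_id"))"
fi
```

Capturing the Job ID in a variable at submission time saves copying it by hand into later commands.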
Which queue will my job go in?

qsub will choose the best queue based on the resources you've asked for.

For example, the queues on the CoW include:

- serial_short
- serial_long
- mpi_short and mpi_long
- and some others...

You can force a job into a specific queue.

- Some research groups have their own queues.
- Best just to let PBS do its job!

Example 1

Here is a simple PBS script for use on the CoW.

    #!/bin/bash
    #PBS -l select=1:ncpus=1:mem=100mb
    #PBS -l walltime=00:10:00
    #PBS -V
    sleep 120

- First line specifies the script interpreter.
- Lines beginning with a hash are comments...
- ...except lines starting with #PBS.
- These specify instructions to the PBS system.
- -l indicates we're giving PBS a resource list.
- More details to follow...
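Because a PBS script is just a normal script, it can be checked locally before it is submitted. The sketch below saves the example to a scratch file and runs a syntax check. The scratch directory and variable names are illustrative only. The resource request is written with colons between the chunk resources, as in the select=10:ncpus=1:mem=2gb request used for MPI jobs later in the session.

```shell
#!/bin/bash
# Sketch: save the example script to a file and check it locally.
# The scratch directory is only for this illustration.
workdir=$(mktemp -d)
script="$workdir/simple.pbs"

cat > "$script" <<'EOF'
#!/bin/bash
#PBS -l select=1:ncpus=1:mem=100mb
#PBS -l walltime=00:10:00
#PBS -V
sleep 120
EOF

# To bash the #PBS lines are ordinary comments, so a syntax check passes.
bash -n "$script" && echo "syntax ok"

# Count the directive lines that PBS itself would read.
directives=$(grep -c '^#PBS' "$script")
echo "PBS directives: $directives"
```

A syntax check of this kind only catches shell errors. It says nothing about whether PBS will accept the resource request.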
Example 1 (continued)

    #PBS -l select=1:ncpus=1:mem=100mb

- select indicates we are asking for one or more chunks.
- ncpus specifies the number of processors required.
- mem specifies the amount of memory (mb, gb).

    #PBS -l walltime=00:10:00

- We ask for 10 minutes of walltime.
- This is different from CPU time (cput).

These are the usual resources requested when submitting a simple serial job; without them your job won't be able to do much!

Monitoring your jobs

To monitor jobs, use the qstat command.

- By default, shows all jobs in all queues.
- Command-line switches can be used to tailor the display to our needs, e.g. to display only your jobs.

A more detailed view of a specific job can be seen using the -f switch and a Job ID:

    qstat -f <job-id>

Let's run a job and see what else is running on the CoW at the moment...

Example: simple.pbs
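One way to put qstat to work is sketched below. qstat and its -u and -f switches are standard PBS commands, but the wait_for_job helper and its default 30-second interval are hypothetical. The helper assumes qstat exits with a non-zero status once the server no longer knows the job, which is the usual behaviour but worth checking on the system you use.

```shell
#!/bin/bash
# Sketch: list your own jobs, and wait for one job to leave the queue.

# Hypothetical helper: wait_for_job <job-id> [seconds-between-checks]
# Returns once qstat no longer reports the job.
wait_for_job() {
    while qstat "$1" >/dev/null 2>&1; do
        sleep "${2:-30}"
    done
}

# Only query the server where PBS is installed.
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$(id -un)"     # show only jobs belonging to you
    # qstat -f <job-id>      # full details of a single job
fi
```

Polling is a blunt tool, so a generous interval is kinder to the server.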
Job Output

As well as any output files your job might produce, PBS will always create two files:

    [job-name].o[job-id] and [job-name].e[job-id]

- These contain everything sent to the UNIX streams stdout and stderr, respectively.
- i.e. everything that would normally be output to the terminal by your PBS script and code.

If your job fails to run correctly, these two files are invaluable in determining the cause.

Any files produced by your program will be output into your home directory (unless you use an absolute path!).

Example: fileout.pbs

Environment Variables

- By default, environment variables defined on your local machine aren't necessarily replicated to the job.
- Use #PBS -V to transfer all currently defined variables to your job.
- Can specify variables manually:
    #PBS -v myvar=myvalue

Example: envvar.pbs

Altering a Job

You can alter properties or the resource requests for a job after it has been submitted.

- Your new request must satisfy the limits of the queue it is in.
- If your job has started running, you can only change the cput and walltime resources...
- ...and then you can only reduce these.

Alter a job using the qalter command.
For example, change the name and reduce the walltime:

    qalter -N mynewjobname <job-id>
    qalter -l walltime=00:10:00 <job-id>

Example: Let's change the name of another job, long.pbs

Deleting a job

You can delete a queued or running job if necessary.

- For this, use the qdel command and the Job ID:
    qdel <job-id>
- Clearly you can't delete other people's jobs!

Example: Let's delete long.pbs

Choosing specific resources

Our previous example requested a computer with one processor and 100MB of RAM available.

- Can choose a specific hardware architecture on which to run your code.
- Useful if you have compiled your code on a specific architecture.
    #PBS -l select=1:ncpus=1:x86_64=true

This requests a single 64-bit processor. Can also request a specific network to run your job on:

    #PBS -l select=1:ncpus=1:astro=true

Other Useful Switches

- PBS can email you when your job reaches certain stages. These are:
  - a: your job is aborted by the system.
  - b: your job begins.
  - e: your job ends.
  - n: don't send any email.

  Specify a string indicating when PBS should send email using the -m switch:

    #PBS -m ae

  Then specify an email address to send to:

    #PBS -M someone@somewhere.com

- Redirect output and error files using the #PBS -o and #PBS -e switches.

Example: switch.pbs

Selecting a Queue

By default, PBS will choose the most appropriate queue for your job based on your chosen resources.

- Some research groups have their own queues.
- These may operate on their own computing resources.
- You should only select a queue if you've been told to use a specific queue.

Moving a Job to a different Queue

A job can also be moved to a different queue after it has been submitted.
- Use the qmove command.
- Specify the new queue, and the job to move.
- Job must satisfy requirements of destination queue.

For example:

    qmove myqueue 12345.foo

Job Arrays

These allow multiple jobs which differ only by a single index to be managed as a single unit.

- Use a script in the same way as with single jobs.
- Differentiate each subjob using the $PBS_ARRAY_INDEX and $PBS_JOBID shell variables.
- All other parameters are identical.

To create a job array use the #PBS -J [X]-[Y]:[Z] switch.

- X: starting index
- Y: ending index (inclusive)
- Z: optional step

Can use the array index as an index to an array of parameters in the script.

Examples: array.pbs, array2.pbs

Submitting MPI Jobs

- Request resources, for example:
    #PBS -l select=10:ncpus=1:mem=2gb
- The resources of each chunk are those required for each rank of your MPI job.
- You can optionally place the chunks depending on the requirements of your job.
Options are:

- free: chunks can be placed on any host (default).
- pack: all chunks are on the same host.
- scatter: all chunks are from separate hosts.

e.g.

    #PBS -l place=pack

Note: Currently when running MPI jobs, you need to confine your job to a single network.

Conclusion

This is only an overview of the most common ways of using PBS. We've covered:

- Writing PBS scripts.
- Submitting and monitoring jobs using qsub and qstat.
- Managing and altering jobs using qalter and qdel.
- Using job arrays and multi-processor jobs.

There are many more features available.

Available Resources

The following resources will be made available for download and use from the website.

- This presentation.
- A PBS Quick Reference.
- All the examples shown today.

Also use the CSC User Forum if you have questions: Local → User Forum

References

Some places to find additional information:

- CSC Website: www.csc.warwick.ac.uk
- PBS Professional 9.1 User's Guide, Altair, October 24th 2007.
- Information on the CoW: Local → Desktop Computing → Support → CoW
- To use the CoW you need a CSC computer account: Local → Desktop Computing → Your Account → Application Form
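Worked example: job arrays

As a closing illustration of the job arrays covered earlier in the session, here is a minimal sketch of a script that picks one parameter per subjob. It is not the array.pbs example from the session. The parameter values, the pick_param helper and the commented-out program name are all hypothetical, and the index falls back to 1 so that the script can be tried outside PBS.

```shell
#!/bin/bash
#PBS -l select=1:ncpus=1:mem=100mb
#PBS -l walltime=00:10:00
#PBS -J 1-3

# Hypothetical helper: pick_param N value1 value2 ...
# Prints the Nth value (counting from 1) from the list that follows N.
pick_param() {
    shift "$1"              # drop N itself and the N-1 values before the one wanted
    printf '%s\n' "$1"
}

# PBS sets PBS_ARRAY_INDEX for each subjob; default to 1 for a local dry run.
index=${PBS_ARRAY_INDEX:-1}
param=$(pick_param "$index" 0.1 0.5 1.0)

echo "Subjob ${index} uses parameter ${param}"
# ./my_program "$param"     # hypothetical executable taking the parameter
```

Keeping the parameter list inside the script means all subjobs share one file and differ only in the index, which is what the -J switch is designed for.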