Using the High Performance Computing Cluster at TSL
Graham Etherington
TSL Bioinformatics Support Officer

What is the HPC Cluster?
• Like a large collection of high-memory computers.
• Controlled by management software (LSF), which allocates tasks to each computer (node).
[Diagram: access node, LSF, hard disk storage]

HPC Node Resources
• Shared with TGAC
• Currently:
  – 29 x 128GB nodes
  – 4 x 256GB nodes
• Planned:
  – 88 x 128GB nodes
  – 9 x 256GB nodes

What does the HPC allow us to do?
• Send computing tasks somewhere else, leaving your PC free for other tasks.
• Carry out memory-intensive jobs: 4GB RAM on a PC, up to 256GB RAM on the cluster.
• Install programs once, for use by everyone.

Logging in to the HPC
• Open a terminal window and type:
  $ ssh hpc.tsl.ac.uk
• You’ll be asked for your password; enter your current NBI password. When your NBI password changes, it will also change on the cluster.
• The first time you log in, your account will be created.

The log-in screen
Displays information for:
• The number of jobs you have running.
• The number of jobs you have queued.
• The total number of jobs running on the cluster.
• The total number of jobs queued on the cluster.

Task 1
• Start up your computers.
• Log on using the username on the yellow tag on your computer.
• Password: “Learning26”
• Complete Task 1 in the tutorial.

Where stuff is
• Your home directory
  /usr/users/sl/<username>
  shortcut: ~
• Data for everyone (blastdbs, indexes, reference sequences, external read sets, genome annotations, etc.)
  /usr/users/sl/data/
• Programs
  /tsl/software/testing/

Task 2
• Complete Task 2 in the tutorial.

Submitting jobs
  $ bsub -q TSL-Test128 -We 5 "echo hello"
• Breaking it down:
  – bsub = submit the following job
  – -q = use the following queue on the cluster
  – -We = estimated run time of the job (in minutes)
  – "echo hello" = the task we want to run (in quotes)
• What queues are available?
  – TSL-Test128
  – TSL-Test256
  – TSL-Prod128
  – TSL-Prod256

Submitting jobs
• How do I run my own script?
  $ bsub -q TSL-Test128 -We 5 "source perl-5.16.3; perl SqrtTwoNumbers.pl 1 2"
• ‘source’? What’s that about?
  – It stipulates which version of a program to run (e.g. “source bwa-0.7.3”, “source bwa-0.7.4”, etc.).
  – Often there is only one choice, but you still need to source it.

Submitting jobs
• Programs are always installed in the ‘Testing’ environment:
  /tsl/software/testing/
• Strict directory-naming scheme:
  /tsl/software/testing/<program_name>/<version>
  – e.g.
    /tsl/software/testing/bwa/0.7.3/
    /tsl/software/testing/bwa/0.7.4/

Submitting jobs
• How do I run an installed program?
  $ bsub -q TSL-Test128 -We 5 "source <program>-<version>; <program> [opts]"
  e.g.
  $ bsub -q TSL-Test128 -We 5 "source bwa-0.7.4; bwa index ref.fa"
• Some things to note:
  – hyphen between program and version
  – semi-colon after the source command

Task 3
• Complete Task 3 in the tutorial.

Submitting jobs
  $ bsub -q TSL-Test128 -We 10 "source bwa-0.7.4; bwa mem -t 4 -c 5000 -O 5 -P -B 5 -U 10 ref.fa left.fq right.fq > map.sam"
• Make life easier by using a bash script
  – store the commands and parameters for commonly used programs

run_bwa.sh
  #!/bin/bash
  # Arguments: $1 = reference, $2 = left reads, $3 = right reads, $4 = output SAM file
  source bwa-0.7.4
  bwa mem -t 4 -c 5000 -O 5 -P -B 5 -U 10 "$1" "$2" "$3" > "$4"

  $ bsub -q TSL-Test128 -We 10 "./run_bwa.sh ref.fa left.fq right.fq map.sam"
• Edit the bash script to change the parameters.

Task 4
• Complete Task 4 in the tutorial.

Installing software
• Remember: strict directory-naming scheme
  – /tsl/software/testing/<program_name>/<version>
• Inside each <version> there will be two subdirectories, ‘src’ and ‘x86_64/bin/’.
• ‘src’ is for source code and ‘x86_64/bin/’ is for binaries (the actual executables).
• After installation, a wrapper needs to be created:
  $ create-software-testing-wrapper <program_name>-<version>

Which queue?
• Testing
  – newly installed software.
  – a script you’re developing or running for the first time.
• Production
  – ‘Stable’ software and scripts.

Task 5
• Complete Task 5 in the tutorial.
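
The submission pattern from the "Submitting jobs" slides (queue, estimated run time, then a quoted "source <program>-<version>; <program> [opts]" string) can be sketched as a small shell function. This is only an illustration: build_bsub is a hypothetical helper name, not something installed on the cluster, and it only prints the command so that it can be checked before being run by hand.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration, not cluster software):
# compose a bsub command line following the pattern shown in the slides.
# Usage: build_bsub QUEUE MINUTES PROGRAM VERSION COMMAND [ARGS...]
build_bsub() {
    local queue=$1 minutes=$2 program=$3 version=$4
    shift 4
    # Hyphen between program and version, semi-colon after the source command.
    printf 'bsub -q %s -We %s "source %s-%s; %s"\n' \
        "$queue" "$minutes" "$program" "$version" "$*"
}

# Dry run: print the command instead of submitting it.
build_bsub TSL-Test128 5 bwa 0.7.4 bwa index ref.fa
# prints: bsub -q TSL-Test128 -We 5 "source bwa-0.7.4; bwa index ref.fa"
```

Printing rather than submitting makes it easy to spot a missing hyphen or semi-colon first. The sketch does not handle commands that themselves contain double quotes; for anything that complex, a script such as run_bwa.sh is probably the better route.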