Advanced High Performance Computing Workshop (HPC 201)
Charles J Antonelli, Mark Champe, Seth Meyer
LSAIT ARS
October 2014

Roadmap
Flux review
Globus Connect
Advanced PBS: array & dependent scheduling
Tools
GPUs on Flux
Scientific applications: R, Python, MATLAB
Parallel programming
Debugging & profiling

Flux review

The Flux cluster
(cluster diagram)

A Flux node
12-40 Intel cores
48 GB - 1 TB RAM
8 GPUs (GPU Flux); each GPU contains 2,688 GPU cores
Local disk

Programming Models
Two basic parallel programming models:

Message-passing
The application consists of several processes running on different nodes and communicating with each other over the network.
Used when the data are too large to fit on a single node and simple synchronization is adequate.
"Coarse parallelism" or "SPMD"
Implemented using MPI (Message Passing Interface) libraries.

Multi-threaded
The application consists of a single process containing several parallel threads that communicate with each other using synchronization primitives.
Used when the data can fit into a single process and the communication overhead of the message-passing model is intolerable.
"Fine-grained parallelism" or "shared-memory parallelism"
Implemented using OpenMP (Open Multi-Processing) compilers and libraries.

Both
The two models can be combined in a single application (e.g., MPI between nodes and OpenMP within a node).

Using Flux
Three basic requirements:
A Flux login account
A Flux allocation
An MToken (or a Software Token)

Logging in to Flux
ssh flux-login.engin.umich.edu
Requires campus wired or MWireless; otherwise use the VPN, or ssh login.itd.umich.edu first.

Cluster batch workflow
You create a batch script and submit it to PBS. PBS schedules your job, and it enters the flux queue. When its turn arrives, your job executes the batch script; your script has access to all Flux applications and data. When your script completes, anything it sent to standard output and error is saved in files stored in your submission directory. You can ask that email be sent to you when your job starts, ends, or aborts. You can check on the status of your job at any time, or delete it if it's not doing what you want. A short time after your job completes, it disappears from PBS. (A short end-to-end example follows the sample batch scripts below.)

Loosely-coupled batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l procs=12,pmem=1gb,walltime=00:05:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe
#Your Code Goes Below:
cat $PBS_NODEFILE
cd $PBS_O_WORKDIR
mpirun ./c_ex01

Tightly-coupled batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l nodes=1:ppn=12,mem=4gb,walltime=00:05:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe
#Your Code Goes Below:
cd $PBS_O_WORKDIR
matlab -nodisplay -r script

GPU batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l nodes=1:gpus=1,walltime=00:05:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe
#Your Code Goes Below:
cat $PBS_NODEFILE
cd $PBS_O_WORKDIR
matlab -nodisplay -r gpuscript
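To make the batch workflow concrete, here is a minimal submit-and-monitor sketch using standard Torque/PBS commands; the script name loose.pbs and the job id 12345 are placeholders.

qsub loose.pbs            # submit the batch script; qsub prints the job id
qstat -u uniqname         # list your jobs and their states (Q = queued, R = running)
checkjob 12345            # ask the scheduler why the job is (or is not) running
qdel 12345                # delete the job if it's not doing what you want
ls yourjobname.o12345     # after completion, with -j oe the combined output lands here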
Copying data
From Linux or Mac OS X, use scp or sftp.

Non-interactive (scp):
scp localfile uniqname@flux-xfer.engin.umich.edu:remotefile
scp -r localdir uniqname@flux-xfer.engin.umich.edu:remotedir
scp uniqname@flux-login.engin.umich.edu:remotefile localfile
Use "." as the destination to copy to your Flux home directory:
scp localfile uniqname@flux-xfer.engin.umich.edu:.
or to your Flux scratch directory:
scp localfile uniqname@flux-xfer.engin.umich.edu:/scratch/allocname/uniqname

Interactive (sftp):
sftp uniqname@flux-xfer.engin.umich.edu

From Windows, use WinSCP
U-M Blue Disc: http://www.itcs.umich.edu/bluedisc/

Globus Online
Features
High-speed data transfer, much faster than SCP or SFTP
Reliable & persistent
Minimal client software: Mac OS X, Linux, Windows

GridFTP endpoints
Gateways through which data flow
Exist for XSEDE, OSG, …
UMich: umich#flux, umich#nyx
Add your own client endpoint!
Add your own server endpoint: contact flux-support@umich.edu

More information
http://cac.engin.umich.edu/resources/login-nodes/globus-gridftp

Advanced PBS

Job Arrays
• Submit copies of identical jobs
• Invoked via qsub -t:
qsub -t array-spec pbsbatch.txt
where array-spec can be
m-n
a,b,c
m-n%slotlimit
e.g. qsub -t 1-50%10
Fifty jobs, numbered 1 through 50, of which at most ten run simultaneously
• $PBS_ARRAYID records the array identifier

Dependent scheduling
• Submit a job to become eligible for execution at a given time
• Invoked via qsub -a:
qsub -a [[[[CC]YY]MM]DD]hhmm[.SS] …
qsub -a 201412312359 j1.pbs
j1.pbs becomes eligible one minute before New Year's Day 2015
qsub -a 1800 j2.pbs
j2.pbs becomes eligible at six PM today (or tomorrow, if submitted after six PM)

Dependent scheduling
• Submit a job to run after specified job(s)
• Invoked via qsub -W:
qsub -W depend=type:jobid[:jobid]…
where type can be
after        schedule this job after the jobids have started
afterany     schedule this job after the jobids have finished
afterok      schedule this job after the jobids have finished with no errors
afternotok   schedule this job after the jobids have finished with errors
qsub first.pbs                              # assume it receives jobid 12345
qsub -W depend=afterany:12345 second.pbs
Schedules second.pbs after first.pbs completes

Dependent scheduling
• Submit a job to run before specified job(s)
• Requires the dependent jobs to be scheduled first
• Invoked via qsub -W:
qsub -W depend=type:jobid[:jobid]…
where type can be
before       the jobids are scheduled after this job starts
beforeany    the jobids are scheduled after this job completes
beforeok     the jobids are scheduled after this job completes with no errors
beforenotok  the jobids are scheduled after this job completes with errors
on:N         wait for N job completions
qsub -W depend=on:1 second.pbs              # assume it receives jobid 12345
qsub -W depend=beforeany:12345 first.pbs
Schedules second.pbs after first.pbs completes
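As an illustrative sketch of chaining jobs from the command line (the script names are placeholders), you can capture the job id that qsub prints and pass it to -W depend=...:

# submit the first job and capture the full job id printed by qsub
JOBID=$(qsub first.pbs)
# second.pbs becomes eligible only after first.pbs finishes with no errors
qsub -W depend=afterok:$JOBID second.pbs
# an array of 50 tasks, at most 10 running at once; each task can read $PBS_ARRAYID
qsub -t 1-50%10 array.pbs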
Troubleshooting
showq [-r][-i][-b][-w user=uniq]   # running/idle/blocked jobs
qstat -f jobno                     # full info, incl. gpu
qstat -n jobno                     # nodes/cores where job is running
diagnose -p                        # job priority and components
pbsnodes                           # nodes, states, properties
pbsnodes -l                        # list nodes marked down
checkjob [-v] jobno                # why job jobno is not running
mdiag -a                           # allocations & users (flux)
freenodes                          # aggregate node/core busy/free
mdiag -u uniq                      # allocations for uniq (flux)
mdiag -a alloc_flux                # cores active, allocated (flux)

Scientific applications

Scientific Applications
R (incl. snow and multicore)
R with GPU (GpuLm, dist)
SAS, Stata
Python, SciPy, NumPy, BioPy
MATLAB with GPU
CUDA Overview
CUDA C (matrix multiply)

Python
Python software available on Flux:
EPD: The Enthought Python Distribution provides scientists with a comprehensive set of tools to perform rigorous data analysis and visualization.
https://www.enthought.com/products/epd/
biopython: Python tools for computational molecular biology
http://biopython.org/wiki/Main_Page
numpy: Fundamental package for scientific computing
http://www.numpy.org/
scipy: Python-based ecosystem of open-source software for mathematics, science, and engineering
http://www.scipy.org/

Debugging & profiling

Debugging with GDB
Command-line debugger
Start programs or attach to running programs
Display source program lines
Display and change variables or memory
Plant breakpoints and watchpoints
Examine stack frames
Excellent tutorial documentation: http://www.gnu.org/s/gdb/documentation/

Compiling for GDB
Debugging is easier if you ask the compiler to generate extra source-level debugging information. Add the -g flag to your compilation:
icc -g serialprogram.c -o serialprogram
or
mpicc -g mpiprogram.c -o mpiprogram
GDB will work without symbols, but you need to be fluent in machine instructions and hexadecimal.
Be careful using -O with -g. Some compilers won't optimize code when debugging; most will, but you sometimes won't recognize the resulting source code at optimization level -O2 and higher. Use -O0 -g to suppress optimization.

Running GDB
Two ways to invoke GDB:
Debugging a serial program:
gdb ./serialprogram
Debugging an MPI program:
mpirun -np N xterm -e gdb ./mpiprogram
This gives you N separate GDB sessions, each debugging one rank of the program.
Remember to use the -X or -Y option to ssh when connecting to Flux, or you can't start xterms there.

Useful GDB commands
gdb exec               start gdb on executable exec
gdb exec core          start gdb on executable exec with core file core
l [m,n]                list source
disas                  disassemble the function enclosing the current instruction
disas func             disassemble function func
b func                 set breakpoint at entry to func
b line#                set breakpoint at source line#
b *0xaddr              set breakpoint at address addr
i b                    show breakpoints
d bp#                  delete breakpoint bp#
r [args]               run program with optional args
bt                     show stack backtrace
c                      continue execution from breakpoint
step                   single-step one source line
next                   single-step, don't step into functions
stepi                  single-step one instruction
p var                  display contents of variable var
p *var                 display value pointed to by var
p &var                 display address of var
p arr[idx]             display element idx of array arr
x 0xaddr               display hex word at addr
x *0xaddr              display hex word pointed to by addr
x/20x 0xaddr           display 20 words in hex starting at addr
i r                    display registers
i r ebp                display register ebp
set var = expression   set variable var to expression
q                      quit gdb
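A minimal session sketch tying the commands above together; serialprogram and the variable count are placeholders for your own program and data.

icc -g -O0 serialprogram.c -o serialprogram   # compile with symbols, optimization off
gdb ./serialprogram
(gdb) b main        # stop at entry to main
(gdb) r             # run the program
(gdb) next          # step over one source line
(gdb) p count       # print the (placeholder) variable count
(gdb) bt            # show the stack backtrace
(gdb) c             # continue to completion or the next breakpoint
(gdb) q             # quit gdb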
Debugging with DDT
Allinea's Distributed Debugging Tool is a comprehensive graphical debugger designed for the complex task of debugging parallel code.
Advantages include:
Provides a GUI interface to debugging, with capabilities similar to, e.g., Eclipse or Visual Studio
Supports parallel debugging of MPI programs
Scales much better than GDB

Running DDT
Compile with -g:
mpicc -g mpiprogram.c -o mpiprogram
Load the DDT module:
module load ddt
Start DDT:
ddt mpiprogram
This starts a DDT session, debugging all ranks concurrently.
Remember to use the -X or -Y option to ssh when connecting to Flux, or you can't start DDT there.
http://arc.research.umich.edu/flux-and-other-hpc-resources/flux/software-library/
http://content.allinea.com/downloads/userguide.pdf

Application Profiling with MAP
Allinea's MAP tool is a statistical application profiler designed for the complex task of profiling parallel code.
Advantages include:
Provides a GUI interface to profiling
Observe cumulative results, drill down for details
Supports parallel profiling of MPI programs
Handles most of the details under the covers

Running MAP
Compile with -g:
mpicc -g mpiprogram.c -o mpiprogram
Load the MAP module:
module load ddt
Start MAP:
map mpiprogram
This starts a MAP session: it runs your program, gathers profile data, and displays summary statistics.
Remember to use the -X or -Y option to ssh when connecting to Flux, or you can't start MAP there.
http://content.allinea.com/downloads/userguide.pdf

Resources
U-M Advanced Research Computing Flux software library: http://arc.research.umich.edu/flux-and-other-hpc-resources/flux/software-library/
U-M Advanced Research Computing Flux pages: http://arc.research.umich.edu/resources-services/flux/
CAEN HPC Flux pages: http://cac.engin.umich.edu/
CAEN HPC YouTube channel: http://www.youtube.com/user/UMCoECAC
For assistance: flux-support@umich.edu
Read by a team of people including unit support staff. They cannot help with programming questions, but can help with operational Flux and basic usage questions.

References
1. Supported Flux software, http://arc.research.umich.edu/flux-and-other-hpc-resources/flux/software-library/ (accessed June 2014).
2. Free Software Foundation, Inc., "GDB User Manual," http://www.gnu.org/s/gdb/documentation/ (accessed June 2014).
3. Intel C and C++ Compiler 14 User and Reference Guide, https://software.intel.com/en-us/compiler_14.0_ug_c (accessed June 2014).
4. Intel Fortran Compiler 14 User and Reference Guide, https://software.intel.com/en-us/compiler_14.0_ug_f (accessed June 2014).
5. Torque Administrator's Guide, http://docs.adaptivecomputing.com/torque/4-2-8/torqueAdminGuide-4.2.8.pdf (accessed June 2014).
6. Submitting GPGPU Jobs, https://sites.google.com/a/umich.edu/engincac/resources/systems/flux/gpgpus (accessed June 2014).
7. Allinea user guide, http://content.allinea.com/downloads/userguide.pdf (accessed October 2014).