HCC Workshop
Department of Earth and Atmospheric Sciences
September 23/30, 2014

Introduction to LINUX
● Operating system like Windows or OS X (but different)
● OS used by HCC
● Means of communicating with the computer
● Command Line Interface (instead of a GUI)

The Terminal Window
After logging in to your account, you enter commands in a terminal window.
(Screenshot: terminal window, annotated “You are here”.)
Working in LINUX, you are always located somewhere in your directory tree.
The most common commands navigate the directory tree and create, edit, move, and remove files and directories.

Getting Started - Logging in
● Quick start guide for Windows
● Quick start guide for Mac/Linux

Your Directories on HCC Computers
Home Directory
● Smaller space than work (quota limited by group)
● /home/<group>/<userid> or just $HOME
● Use for: program binaries, source code, configuration files, etc.
Work Directory
● Not backed up
● /work/<group>/<userid> or just $WORK
● Use for: temporary scratch space (not long-term storage), I/O for running jobs
Commands shown: pwd (“print working directory”) and cd (“change directory”).
Your directories are not shared/mounted across all HCC clusters (i.e., Tusker and Crane have separate file systems).

Other commands to cover
● Move/rename files: mv
    Example: mv oldname newname
● Copy files: cp
    Example: cp myfile myfilecopy
● Change the current directory: cd
    Example: cd mydirectory
● See contents of the directory: ls
● See directory contents with more details: ls -l
● Print working directory: pwd
● Print file to screen: more
    Example: more myfile
● Merge multiple files: cat
    Example: cat file1 file2 > combinedfile
● Create empty file: touch
    Example: touch newfile
● Search for a string in a file: grep
    Example: grep mystring myfile
● Compress & decompress files: tar, zip, gzip
● Remove files: rm
    Example: rm filetodelete
● Remove entire directory: rm -rf
    Example: rm -rf directorytodelete
See more at: http://www.comptechdoc.org/os/linux/usersguide/linux_ugbasics.html

Exercise: Using the commands above:
1. Create a directory named “mydir”.
2. Change into the directory and create a file named “myfile”.
3. Delete “myfile” and “mydir”.

Basic exercise - create directory/file

One editor: nano (or any other you prefer)
● Start nano by running ‘nano’.
● Save: Control + o
● Exit: Control + x

Exercise - Create submit script
Using nano, create the text file below. Save it as ‘helloworld.submit’. We will submit it to SLURM later.

    #!/bin/sh
    #SBATCH --time=00:05:00
    #SBATCH --job-name=helloworld
    #SBATCH --error=helloworld.%J.err
    #SBATCH --output=helloworld.%J.out
    #SBATCH --qos=short

    echo "this is my first slurm job"
    sleep 30

Cluster Basics
● Every job must be run through a scheduler (see: Submitting Jobs using SLURM)
● Software is managed through the module software (see: Using Module)
● Every user has two directories to use (see: Where to Put Data)

SLURM (Simple Linux Utility for Resource Management)
● Define the job name
    #SBATCH --job-name=MY_Test_Job
● Emails for job notification
    #SBATCH --mail-user=my@mail.com
    #SBATCH --mail-type=ALL
    # (ALL = BEGIN, END, FAIL, REQUEUE)
● Requested run time
    #SBATCH --time=7-0
    # Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds"
● Requested number of cores
    #SBATCH --ntasks=16
  or
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=16
● Request memory for your job
    #SBATCH --mem-per-cpu=128
    # Set the memory requirements for the job in MB. Your job will be allocated exclusive access to that amount of RAM. If it uses more than that amount, Slurm will kill it.
● Set a pattern for the output file
    #SBATCH --output=<filename pattern>
    # The filename pattern may contain one or more replacement symbols, which are a percent sign "%" followed by a letter (e.g. %J).
    # Supported replacement symbols are:
    #   %J  Job allocation number.
    #   %N  Main node name.
● Submit an interactive job
Submitting an interactive job is done with the command srun.
    $ srun --pty $SHELL
or, to allocate 4 cores per node:
    $ srun --nodes=1 --ntasks-per-node=4 --mem-per-cpu=1024 --pty $SHELL

Exercise - Submit hello world job

Globus Connect
Use Globus Connect to:
● Transfer files between HCC clusters (Tusker, Crane, & Sandhills)
● Transfer files between your laptop/PC and HCC clusters
● Share files with colleagues

Globus Connect
Getting Started (full instructions in HCC-DOCS):
● Sign up for a Globus account
● Install Globus Connect Personal on your computer
● Activate HCC endpoint(s): hcc#tusker, hcc#crane, hcc#sandhills
● Log in to your Globus account (online) and start making transfers
Important:
● /home is read-only via Globus Connect (you can transfer from /home, but not to /home).
● /work is readable and writable via Globus Connect (you can transfer files to and from /work).

Exercise: Transfer files with Globus
Option 1: Transfer file with Globus
1. Download the example query file to your laptop: matlab_demo.tar.gz
2. Transfer the file from your laptop to your Crane work directory using Globus
3. Unpack the tar archive on Crane:
    cd $WORK
    tar -xzf matlab_demo.tar.gz
Option 2: Copy file into your work directory
1. Log in to Crane
2. Copy the query file to work with the following command:
    cp -r /work/demo/shared/matlab_demo $WORK

Exercise - Submit MATLAB Job
Invert a large (10^4 x 10^4) random square matrix
● Using only 1 compute core
● Using 10 cores (open a matlabpool with 10 workers)
● Compare times
Use a Job Array to invert 5 (or more!) matrices at once

Exercise - Submit MATLAB Job
1. Copy the demo files to $WORK.
2. Examine the contents of the Matlab and Slurm scripts.
3. Submit the first job.
4. Monitor the job.
5. Edit the Slurm submit script (change the number of tasks) and re-submit.
6. Compare the output files (can also use ‘cat’ or ‘tail’).
7. Examine the submit script for the job array and then submit it.
8. Examine the job array output.
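
The submit and monitor steps used in the exercises above can be sketched as a short shell script, shown here for the hello world job. This is only an illustration under stated assumptions: it recreates ‘helloworld.submit’ with the contents from the nano exercise, then uses the standard SLURM commands sbatch (submit a batch script) and squeue (list jobs in the queue). The check with ‘command -v’ is an addition for this sketch, not part of the workshop material, so the script does nothing harmful when run on a machine without SLURM.

```shell
#!/bin/sh
# Sketch of the hello world exercise: write the submit script, submit it,
# and check on it. Assumes sbatch and squeue are on the PATH, which is
# normally true only on a cluster login node.

# Recreate helloworld.submit (same contents as in the nano exercise).
cat > helloworld.submit <<'EOF'
#!/bin/sh
#SBATCH --time=00:05:00
#SBATCH --job-name=helloworld
#SBATCH --error=helloworld.%J.err
#SBATCH --output=helloworld.%J.out
#SBATCH --qos=short

echo "this is my first slurm job"
sleep 30
EOF

if command -v sbatch >/dev/null 2>&1; then
    # Submit the job; sbatch prints the ID assigned to it.
    sbatch helloworld.submit
    # List your own pending and running jobs.
    squeue -u "$USER"
    # Once the job has finished, its output is in helloworld.<jobid>.out:
    #   cat helloworld.*.out
else
    # Guard added for this sketch only (assumption, not workshop material).
    echo "sbatch not found: run this on a cluster login node"
fi
```

The same pattern applies to the MATLAB exercise: submit the script with sbatch, watch it with squeue, then read the output files with ‘cat’ or ‘tail’.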