Introduction to Unix
Christy Avery
March 17, 2008
Research computing at UNC has changed quite a bit. The previous clusters, Baobab and sunny,
were discontinued in mid-2007. The current cluster, Emerald, has many of the same features
found on Baobab and sunny (if not more), but handles data storage and management quite
differently.
The main difference is that YOUR data, files, logs, etc. are now stored on a secured-access
scratch space. What does this mean for you? It means that files are periodically deleted
(generally after 30 days of inactivity). Every time you use Emerald, get in the habit of
transferring all of your files back to your personal AFS space (i.e. the H drive) or your PC.
The good thing is that the scratch space is huge, so you don’t have to worry about running out
of disk space when running simulations, etc. A good place to start if you get stuck is
help.unc.edu. I also recommend buying “Unix for Dummies” or a similar reference if you need
to use Unix frequently.
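Because scratch files are deleted after a period of inactivity, it can help to list the files that have not been touched recently before they disappear. The following is only a minimal sketch, not an Emerald tool: the function name list_stale_files and the 25-day threshold are made up for illustration, and it assumes that "inactivity" can be approximated by a file's modification time (the cluster's actual purge rule may use a different timestamp).

```shell
#!/bin/sh
# list_stale_files DIR DAYS   (hypothetical helper, not an Emerald command)
# Print regular files under DIR whose modification time is more than DAYS days old.
# Assumption: "inactivity" is approximated by modification time (find -mtime).
list_stale_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Example (hypothetical path and threshold):
#   list_stale_files /netscr/your_onyen 25
```

Anything this prints is a candidate for copying back to your H drive or PC first.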
To get started, you need to set up your Emerald profile in SSH Secure Shell:
Go to profiles
Add profile
Type “Emerald”, click “Add to Profiles”
Go to “Edit Profiles”
Select Emerald
Host name: emerald.unc.edu
User name: your ONYEN
Terminal answerback: vt100
Tunneling: check “Tunnel X11 connections”
Keyboard: check “Backspace sends Delete” and “Line wrap”
*Note: SSH Secure Shell is available at https://software.unc.edu/available.php#SSH
How to add programs
Emerald doesn’t automatically register you as a user of all computing programs available at
UNC. Instead, you need to request access. This is done using the “ipm” Unix command and
only has to be done once. Also remember that Unix is case sensitive and that all commands are
executed through the SSH Secure Shell software (see link above).
How to                                          Command
query the available programs                    ipm query –all
query the available sas programs                ipm query sas
add sas                                         ipm add sas
add stata                                       ipm add stata
see the programs you’ve subscribed to           ipm query –current
remove a program (e.g. an old version)          ipm remove sas-912
How to get to your scratch space, add a folder, and navigate between folders
Unix is command-line driven. This means that you don’t use a mouse to navigate between
subdirectories (i.e. folders). Your net-scratch (netscr) space should already be set up if you
requested an account through onyen.unc.edu. For example, mine is /netscr/christya.
How to                                          Command
see what (sub)directory you’re in               pwd
change directories                              cd
move up one directory                           cd ../
move up two directories                         cd ../../
move up three directories (and so on…)          cd ../../../
move down one directory                         cd directory1
move down two directories                       cd directory1/directory2
move up two then down two                       cd ../../directory1/directory2
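The navigation commands above can be tried safely in a throwaway directory. This is a small practice sketch: the directory names directory1 and directory2 are the placeholders from the table, and the practice tree is created with mktemp so that nothing in your real scratch space is touched.

```shell
#!/bin/sh
# Build a small practice tree in a temporary directory.
practice=$(mktemp -d)
mkdir -p "$practice/directory1/directory2"

cd "$practice/directory1/directory2"   # start two levels down
cd ../                                 # move up one directory
up_one=$(pwd)                          # now in .../directory1
cd ../                                 # move up one more, back at the top
cd directory1/directory2               # move down two directories at once
cd ../../directory1/directory2         # up two, then down two: same place
final=$(pwd)

echo "$up_one"
echo "$final"
```

Running pwd after each cd is a quick way to confirm where you ended up.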
Let’s say you want to add a folder named “temp” under your ONYEN directory.
How to                                          Command
add a folder                                    mkdir temp
look at directory contents                      ls
go to temp (i.e. change directory to temp)      cd temp
How to add a file to your temp NETSCR folder
Copying programs from your H drive or PC to your scratch space can be done two ways. The
easiest is through the SSH Secure File Transfer interface available through SSH Secure Shell.
Next, type /netscr/YOURONYEN (using your own ONYEN) in the right-hand box and H: in the
left-hand box.
Now, make a new file in your H drive entitled “test.txt”. Right click on the white space in your
H drive folder and click New… Text Document. Open it and type your ONYEN. Save and close.
Rename the file test.txt.
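Since Unix is command-line driven, avoid spaces in file and folder names: the command line
treats a space as the separator between arguments, so a name containing spaces has to be
quoted every time you type it. I also recommend keeping file names as short as possible.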
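If a file with spaces in its name does end up on the cluster, it can be renamed. The sketch below uses a made-up helper called safe_name (not a standard Unix command) that replaces each space with an underscore; the file name in the example is hypothetical.

```shell
#!/bin/sh
# safe_name NAME   (hypothetical helper)
# Print NAME with every space replaced by an underscore.
safe_name() {
    printf '%s\n' "$1" | tr ' ' '_'
}

# Example: rename a file whose name contains spaces.
# The quotes are required so the shell treats the old name as one argument.
#   mv "my test file.txt" "$(safe_name "my test file.txt")"
```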
Go back to your SSH Secure File Transfer window. You’ll probably need to click “refresh” on
the H drive side (the refresh button has two green arrows pointing in opposite horizontal
directions). Now, click test.txt and drag it into the temp folder. Go back to your SSH Secure
Shell window. Navigate to the temp subdirectory.
How to view your .txt file                      Command
from the temp subdirectory                      more test.txt
from the ONYEN subdirectory                     more temp/test.txt
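The two commands above differ only in the path, because a file name is always read relative to the directory you are currently in. The sketch below illustrates this in a temporary directory that stands in for your ONYEN folder; it uses cat instead of more so that it runs without pausing for keystrokes, and the file contents are a placeholder.

```shell
#!/bin/sh
top=$(mktemp -d)                 # stands in for /netscr/your_onyen
mkdir "$top/temp"
echo "your_onyen" > "$top/temp/test.txt"

cd "$top/temp"
from_temp=$(cat test.txt)        # path relative to temp

cd "$top"
from_top=$(cat temp/test.txt)    # path relative to the directory above temp

echo "$from_temp"
echo "$from_top"
```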
While you can use “more” to view your file, you need to use a text editor if you want to change
the file. Text editors are a more advanced topic and won’t be covered today. I often find that it
is easier to edit the file on a PC and simply copy over a new version.
How to run SAS on Emerald
There are two ways to run SAS on Emerald. (Remember that you have to use the ipm add
command first.) The program and dataset should be copied over to the same netscr subdirectory
(for example: /netscr/YOURONYEN/sasProgs). If you have a short program (i.e. one that would
only take 5 or so minutes to run), you may use the bsas command.
How to                                          Command
run short sas programs                          bsas yourprogramname.sas
*Remember that you have to be in the same subdirectory as the program and dataset when you
type the sas command.
However, if you have a longer program, you have to submit it to LSF (software that allocates
resources fairly among all running jobs). Trust me, you will receive a phone call or email from
the IT people if you submit a long SAS program using the short bsas command. Instead, you have
to submit it to the blade server (more info at http://help.unc.edu/4372).
How to                                          Command
run a longer sas program                        bsub -R blade sas my_prog.sas
To run SAS on one of the high memory AIX UNIX nodes you need to specify the p5aix
resource:
bsub -Ip -q int -R p5aix sas -memsize 7G -sortsize 7136M my_prog.sas
The above example allows SAS to access up to 7 gigabytes of memory on the compute node for
your job. Whenever you set memsize, you should make sure to set sortsize at least 32
megabytes smaller. Note that bsub options come after the word "bsub" but before the word "sas"
and that SAS options come after the word "sas".
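The sortsize value can be worked out with shell arithmetic. The sketch below assumes 1 gigabyte = 1024 megabytes, which matches the 7G / 7136M pair in the example above (7 × 1024 − 32 = 7136); the function name sortsize_mb is made up for illustration.

```shell
#!/bin/sh
# sortsize_mb MEM_GB   (hypothetical helper)
# Print a sortsize value in megabytes that is 32 MB below MEM_GB gigabytes,
# taking 1 GB = 1024 MB.
sortsize_mb() {
    echo $(( $1 * 1024 - 32 ))
}

mem_gb=7
echo "-memsize ${mem_gb}G -sortsize $(sortsize_mb "$mem_gb")M"
# prints: -memsize 7G -sortsize 7136M
```

This gives the largest sortsize that still respects the 32-megabyte margin; a smaller value is also acceptable.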
UNIX/Linux is unforgiving about writing over existing files. If you ask SAS to write a dataset or
file that already exists, it will simply write over the old one.
If you run SAS jobs without changing to a different directory, all files will be written in your
working directory. To avoid cluttering up your working directory, create a subdirectory using the
mkdir command, and run all SAS jobs from there instead. Then, all your SAS files will be
nicely organized.
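One way to guard against silently overwriting earlier results is to rename any existing output file before rerunning a job. This is only a sketch: keep_old_copy is a hypothetical helper, and the directory and file names in the example are placeholders.

```shell
#!/bin/sh
# keep_old_copy FILE   (hypothetical helper)
# If FILE exists, rename it with a timestamp suffix so a new run cannot overwrite it.
# If FILE does not exist, do nothing.
keep_old_copy() {
    if [ -e "$1" ]; then
        mv "$1" "$1.$(date +%Y%m%d%H%M%S).bak"
    fi
}

# Example (hypothetical directory and file names):
#   mkdir -p sasRuns && cd sasRuns
#   keep_old_copy my_prog.log
#   keep_old_copy my_prog.lst
```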
Viewing a job's log or output files with an editor while it is running may cause the job to, at best,
halt or, at worst, crash. Instead, view these files with the command more or less (both start
from the beginning; in less, type G to jump to the end). Similarly, if your job uses any other
files, such as permanent datasets or raw data files, do not alter these while the job is running.
LSF will send you an email when your SAS program is complete. Although you will see a .log
and .lst file while the program is running, as mentioned above you should not open these in an
editor until the program is finished. I typically transfer these files to my H drive and view
them in SAS.
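Once the job has finished, a quick read-only check of the log can show whether anything went wrong before you transfer the files. The sketch below assumes that SAS error messages begin a log line with the word ERROR; the function name count_log_errors and the file name are made up for illustration.

```shell
#!/bin/sh
# count_log_errors LOGFILE   (hypothetical helper)
# Print how many lines of LOGFILE start with "ERROR".
# grep only reads the file; it never modifies it.
# grep exits non-zero when the count is 0, so "|| true" keeps a clean log
# from being reported as a failure.
count_log_errors() {
    grep -c '^ERROR' "$1" || true
}

# Example (hypothetical file name):
#   count_log_errors my_prog.log
#   grep -n '^ERROR' my_prog.log    # show the matching lines with line numbers
```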
Checking the status of jobs on Emerald
How to                                              Command
check the status of a job on Emerald                bhist OR bjobs
cancel a job (using any program) on Emerald         bkill jobid
Other considerations
Both SAS and Stata require a command to point to the starting dataset (i.e. a libname or use
statement, respectively). Remember that you first need to transfer your dataset over to the Unix
folder and then give its correct location. This is when the pwd command is handy. For example,
my libname/use statement could read:
libname christy "/netscr/christya/temp";
use "/netscr/christya/temp/temp.dta", clear
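To avoid typing the path by hand, the shell can print these statements for whatever directory you are currently in. This is a small sketch; the function names sas_libname and stata_use are made up, and the library and file names in the example are the ones used above.

```shell
#!/bin/sh
# sas_libname LIBREF   (hypothetical helper)
# Print a SAS libname statement pointing at the current directory.
sas_libname() {
    printf 'libname %s "%s";\n' "$1" "$(pwd)"
}

# stata_use FILE   (hypothetical helper)
# Print a Stata use statement for FILE in the current directory.
stata_use() {
    printf 'use "%s/%s", clear\n' "$(pwd)" "$1"
}

# Example, run from the folder that holds your dataset:
#   sas_libname christy
#   stata_use temp.dta
```

The printed line can then be pasted into the SAS or Stata program.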
For additional help with UNIX:
http://help.unc.edu/?id=5288
REMEMBER TO TRANSFER ALL FILES YOU WANT TO KEEP BACK TO YOUR H
DRIVE OR PC. THEY WILL BE DELETED AFTER A MONTH OF INACTIVITY.