1 LINUX BASICS

Linux traditionally operates in text mode or in graphical mode. Since CPU power and RAM are no longer expensive, every Linux user can afford to work in graphical mode and will usually do so. This does not mean that you don't have to know about text mode: we will work in the text environment throughout this course, using a terminal window.

Linux encourages its users to acquire knowledge and to become independent. Inevitably, you will have to read a lot of documentation to achieve that goal; that is why, as you will notice, we refer to extra documentation for almost every command, tool and problem listed in this book. The more docs you read, the easier it will become and the faster you will leaf through manuals. Make reading documentation a habit as soon as possible. When you don't know the answer to a problem, referring to the documentation should become second nature.

Linux is an implementation of UNIX. The Linux operating system is written in the C programming language. "De gustibus et coloribus non disputandum est": there's a Linux for everyone. Linux uses GNU tools, a set of freely available standard tools for handling the operating system.

Below are a few basic commands that can be used on a Linux system.

Quickstart commands

    Command            Meaning
    ls                 Displays a list of files in the current working directory, like the dir command in DOS
    cd directory       Change directories
    passwd             Change the password for the current user
    file filename      Display the file type of the file named filename
    cat textfile       Print the content of textfile on the screen
    pwd                Display the present working directory
    exit or logout     Leave this session
    man command        Read man pages on command
    info command       Read Info pages on command
    apropos string     Search the whatis database for strings

Source(s): http://tldp.org/LDP/intro-linux/html/

2 BASIC MPI

Overview

MPI (Message Passing Interface) is a standard interface for the message passing paradigm of parallel computing.
This is a model of cooperative processes, working on separate data spaces and interchanging messages when they need to share or communicate data. The model implies an active role on the part of both the sending process and the receiving process.

The machine model underlying MPI is that of a multicomputer: a distributed memory machine with processors tied together via some kind of interconnection fabric. The multicomputer model ignores the underlying topology of the fabric, and presumes there are two types of memory accesses: local and distant. However, it is possible to impose a topology model on MPI, an advanced topic reserved for later.

Although MPI has an underlying distributed memory model, it can be used for

1. Distributed memory machines
2. Shared memory machines
3. Arrays of SMPs ("clusters")
4. Networks of workstations
5. Heterogeneous networks of machines

This is because the logical programming model does not have to match the physical machine architecture. Generally you can get more efficient code if your programming model matches the machine's physical hardware, but the advantage of MPI is that by aiming at the absolute minimal assumption about the hardware, your code will be portable across the wide variety of actual systems listed above.

Basic MPI Concepts

An MPI program consists of multiple processes, each with its own address space. Each process runs the same program (SPMD model), but has a unique number that identifies it. If there are p MPI processes participating in a single program, a process's identifying number is an integer between 0 and p-1 and is called its "rank". In the statement of most SPMD algorithms, there are lines like

    if (myid == 0) {
        ...
    } else {
        ...
    }

which is read "if my process identification number is zero, then do the following; otherwise, do something else". This is how MIMD programs are built on an SPMD model: each process runs the same program, which branches depending on the process's ID number.
In MPI, that ID number is called its rank. Note that most distributed memory machines can be run directly in MIMD mode, with each processor actually running a completely different program. However, SPMD is the model most often used to emulate MIMD actions. The reasons are psychological: it is easier to have a single source code to write and examine.

Each message in MPI consists of the data sent and a header. The header contains

1. The rank of the sender
2. The rank of the receiver
3. A message identifier number called its tag
4. A communicator identification

The MPI standard guarantees that the integers 0-32767 can be used as valid tag numbers, but most implementations allow far more.

One basic concept in MPI is that of a communicator group: a set of MPI processes that are grouped together in working on a problem, and which can send messages to each other. For the start, we will use the default communicator group MPI_COMM_WORLD, which sets a single context and involves all the processes running. This is a predefined communicator, of type MPI_Comm. Later we will cover more details about the concepts of communicators and contexts. But to get an idea of why different communicator groups may be needed in a single program, consider what happens if we are running an MPI program that calls a math library which was also built to use parallelism via MPI. To keep process number 3 in our program from getting confused with process number 3 as defined by the library, we need an additional identifier to distinguish them (and to keep one from receiving a message intended for the other). This additional identification is the communicator group.

Basic MPI Functions

Although MPI has over 120 different functions that can be invoked, all parallel programs can be built using just six:

1. MPI_INIT() initializes MPI in a program.
2. MPI_COMM_SIZE() returns the number of cooperating processes.
3. MPI_COMM_RANK() returns the process identifier for the process that invokes it.
4. MPI_SEND() sends a message.
5. MPI_RECV() receives a message.
6. MPI_FINALIZE() cleans up and terminates MPI.

For our purposes, there are a few more that are useful right from the start:

1. MPI_BCAST(): sends a message from one processor to all the others in the specified communicator group.
2. MPI_ALLREDUCE(): performs a reduction operation, and makes the reduced scalar available to all participating processes.

The last one is useful for most dot products, since it is typically the case that the resulting scalar is needed by all the processors. For performance evaluation, we can also use:

1. MPI_WTIME(): returns a double that gives the number of seconds since either the beginning of the program, or 1 January 1970.
2. MPI_WTICK(): returns a double that gives the clock resolution.

We will not use all of them immediately. Here are some details about the ones we need right away; this is for the C language versions.

    MPI_Init(&argc, &argv);

This must be called at the beginning of the parallel program, after whatever global initializations need to be performed. Its two arguments are the two that a C function main() takes (so that they can be passed along to all the MPI processes).

    MPI_Comm_size(MPI_Comm comm, &p);

This takes the communicator group as first argument (which will be MPI_COMM_WORLD for all our beginning programs). It returns in the second argument the number of processes participating in the communicator group.

    MPI_Comm_rank(MPI_Comm comm, &myrank);

This takes the communicator group as first argument and returns the rank of the calling process in the second.

    MPI_Finalize();

Takes no arguments; it just cleans up things. Try leaving it off your program on the burrow to see what happens.
    MPI_Send(void* message, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
    MPI_Recv(void* message, int count, MPI_Datatype datatype, int src, int tag, MPI_Comm comm, MPI_Status *status);

The last two are the most basic send/receive pair, and their required arguments are

o void* message: the beginning address of the block of memory containing the message.
o MPI_Datatype datatype: one of the allowed types, generally corresponding to a C datatype. For example, these include (with C type in parentheses)
    o MPI_CHAR (char)
    o MPI_INT (signed int)
    o MPI_DOUBLE (double)
    o MPI_FLOAT (float)
o int count: the message consists of a "count" number of the given datatype.
o int src: the source of the message (part of the MPI_Recv calling sequence).
o int dest: the destination for the message (part of the MPI_Send calling sequence).
o int tag: the tag for the message.

Note that there is no argument giving the "buffsize" as was mentioned earlier in generic send/receive functions. In MPI, if the sent message is too large to fit into the receiving buffer, it causes either a segmentation fault (the best case) or weird corruption of your data.

The final argument of the MPI_Recv function gives information about the message as actually received. It is a C structure with at least three fields: the source (MPI_SOURCE), the tag (MPI_TAG), and an error code (MPI_ERROR). So if, for example, the receive used the wild-card MPI_ANY_SOURCE for the source field, then status->MPI_SOURCE will contain the rank of the process that sent the received message. Note that the MPI_Status variable does not necessarily have a field for the count of data items actually sent; you should use the function MPI_Get_count() for that.

Because it is often the case that one process needs to send or receive data to or from all other processes, MPI provides collective communication functions. If you are lucky, the vendor has optimized them for the machine topology; in MPICH they use a tree algorithm like the one above.
Here are some of the functions:

1. int MPI_Bcast(void *msg, int count, MPI_Datatype datatype, int root, MPI_Comm comm): sends a message from process root to all others in the communicator group. This function must be called by all participating processes. Also, count and datatype must match on all processes, unlike MPI_Send and MPI_Recv.

2. MPI_Reduce(void *operand, void *result, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm): combines the operands stored in operand and leaves the answer in result on the process with rank "root". Here count, datatype, and the operation op must be the same on all processes. The types of operations are MAX, MIN, SUM, PROD, and various logical operations. Warning: although only root gets the result, all the participating processes must provide void *result, and all must provide the actual storage space it uses.

3. Often you want a global reduction operation with the result left on every process, not just a root one. Instead of following MPI_Reduce with an MPI_Bcast, use MPI_Allreduce(). In general, the word "All" embedded in an MPI function name means that the operation result ends up in all tasks in the communicator group.

Source(s): http://www.cs.indiana.edu/classes/b673/notes/mpi1.html

In addition to the above, here are a few links with more information on MPI:

Tutorial material on MPI available on the Web: http://www-unix.mcs.anl.gov/mpi/tutorial/index.html
About parallel processing: http://linuxgazette.net/issue65/joshi.html

3 OPEN MPI

Open MPI provides a high performance platform for parallel applications on both homogeneous and heterogeneous platforms. The application developer is not unduly burdened by the cost of heterogeneous application development, as Open MPI transparently handles process start-up, communication, and data conversion.
Open MPI also determines all architecture and networking properties on a per-peer basis, and selects the most efficient mode of communication with the given peer, in order to maximize performance. Our design to support multiple, disparate networks can offer a performance increase in some situations.

Open MPI defines a job as the execution of either a single application, or multiple communicating applications, each potentially on a separate computing system. Within that context, the project generally breaks the definition of heterogeneity into four broad categories:

Processor heterogeneity
    Dealing with differences in processor speed, internal architecture (e.g., multicore), and communication issues caused by the transfer of data between processors with different data representations (different endianness, floating point representation, etc.).

Network heterogeneity
    Using different network protocols between processes in a job. This includes multiple network protocols between two processes and different network protocols to different processes in the job.

Run-Time Environment heterogeneity
    Executing a job across multiple resources (e.g., multiple clusters) that are locally administered by possibly different scheduling systems, authentication realms, etc.

Binary heterogeneity
    Coordinating execution of different binaries, either through the coupling of different applications or as part of a complex single application.

Bootstrapping Startup

OpenRTE provides transparent, scalable support for high-performance applications in heterogeneous environments. OpenRTE originated within the Open MPI project but has since spun off into its own effort, though the two projects remain closely coordinated. The ability of Open MPI to transparently operate in heterogeneous environments is largely due to the support services provided by OpenRTE, so it is useful to understand the OpenRTE architecture.
The OpenRTE system comprises four major functional groups:

General Purpose Registry (GPR)
    A centralized publish/subscribe data repository used to store administrative information (e.g., communication contact information and process state) supporting OpenRTE and Open MPI operations.

Resource Management (RM) group
    A collection of four subsystems supporting the identification and allocation of resources to an application, mapping of processes to specific nodes, and launch of processes.

Support Services
    A suite of subsystems that provide general support for OpenRTE operations, including services to generate unique process names, I/O forwarding between application processes and the user, and OpenRTE's communication subsystem.

Error Management (EM) group
    Several subsystems that collectively monitor the state of processes within an application and respond to changes in those states that indicate errors have occurred.

Point-to-Point Design

Open MPI provides point-to-point message transfer facilities via multiple MCA frameworks. The point-to-point architecture consists of four main layers: the Byte Transport Layer (BTL), the BTL Management Layer (BML), the Point-to-Point Messaging Layer (PML) and the MPI layer. Two additional frameworks are shown, the Memory Pool (MPool) and the Registration Cache (Rcache).

Message Scheduling

To effectively utilize multiple network interconnects, Open MPI provides a mechanism to schedule a single message across these network resources. This mechanism is currently isolated at the BTL and BML levels in such a way as to allow other components to implement an effective scheduling policy. The BML also provides a simple round robin scheduling policy which other components may use as appropriate. For point-to-point communication the PML uses both round robin and custom scheduling based on a variety of factors. Interconnects may exhibit widely different performance characteristics, which a scheduling policy should take into account.
These performance characteristics are exported by each BTL and include both bandwidth and latency. During BTL initialization the BML prioritizes each BTL based on these characteristics, allowing upper level components such as OB1 to choose the appropriate interconnect(s) to communicate with a peer. In addition to performance characteristics, the BML groups interconnects based on capabilities such as send/receive and RDMA. These groupings are cached on a data structure associated with each peer for efficient access. These groupings include Eager (Low Latency), Send/Receive and RDMA capable BTLs.

Source: http://www.open-mpi.org/papers/heteropar-2006/heteropar-2006-paper.pdf

4 TORQUE

TORQUE Resource Manager (Terascale Open-Source Resource and QUEue Manager)

TORQUE is an open source resource manager providing control over batch jobs and distributed compute nodes. It is a community effort based on the original *PBS project and, with more than 1,200 patches, has incorporated significant advances in the areas of scalability, fault tolerance, and feature extensions contributed by NCSA, OSC, USC, the U.S. Dept of Energy, Sandia, PNNL, U of Buffalo, TeraGrid, and many other leading edge HPC organizations. This version may be freely modified and redistributed subject to the constraints of the included license.

TORQUE can integrate with Moab Workload Manager® to improve overall utilization, scheduling and administration on a cluster. Customers who purchase Moab Workload Manager also receive free support for TORQUE.
Feature Set

TORQUE provides enhancements over standard OpenPBS in the following areas:

Fault Tolerance
    o Additional failure conditions checked/handled
    o Node health check script support

Scheduling Interface
    o Extended query interface providing the scheduler with additional and more accurate information
    o Extended control interface allowing the scheduler increased control over job behavior and attributes
    o Allows the collection of statistics for completed jobs

Scalability
    o Significantly improved server to MOM communication model
    o Ability to handle larger clusters (over 15 TF/2,500 processors)
    o Ability to handle larger jobs (over 2000 processors)
    o Ability to support larger server messages

Usability
    o Extensive logging additions
    o More human readable logging (i.e. no more 'error 15038 on command 42')

Source: http://www.clusterresources.com/pages/products/torque-resourcemanager.php

5 OPEN PBS

OpenPBS is the original version of the Portable Batch System. It is a flexible batch queueing system developed for NASA in the early to mid-1990s. It operates on networked, multi-platform UNIX environments.

Today's computing challenges often demand resources and skills not available within a single organization or at one location. Users need cost-effective services and tools that allow easy sharing of costly computing resources without knowing the specifics of each system environment.

A few of the features of OpenPBS include:

o New comprehensive job-submission language that provides users a common syntax that is independent of hardware architectures.
o Flexibility that enables IT managers to replace or add new hardware platforms without modifying job-submission mechanisms.
o Automatic and efficient job placement on any hardware platform, including clusters, SMP, NUMA and new massively parallel architectures.
o New node-virtualization functionality, which extends scalability and provides finer control over individual hardware components. This functionality results in higher levels of system availability, while maintaining excellent performance levels.
o Tight integrations with five additional MPI libraries (IBM POE, Intel MPI, MPICH2, MPICH-GM/MX and SGI MPT), which enhance job process management.
o Optimized backfilling capabilities that increase the throughput of smaller jobs without delaying the execution of larger jobs.
o Guaranteed exclusivity to computational nodes, which enables maximum utilization and repeatability of the jobs.
o Facilitation of the powerful SGI Altix and IBM Blue Gene feature-sets for faster, more consistent execution times, as well as better control, monitoring and cleanup of jobs.
o Response to numerous customer requests for increased usability, reliability and enhanced scheduling algorithms.

Source: http://www.openpbs.org/about.html

A few useful links on OpenPBS:

User Manual: http://www.hrz.uni-dortmund.de/S2/lido_doc/openpbs.pdf
OpenPBS public home: http://www-unix.mcs.anl.gov/openpbs/

6 HOW TO DO SCRIPTS

Introduction

A shell script is a program interpreted and executed by the shell, which is essentially a command line interpreter. So, think of a shell script as a list of commands that are run in sequence. This guide covers scripts created for the Bourne shell and is based on Reg Quinton's Introduction to Shell Programming.

Creating a Script

Suppose you often type the command

    find . -name file -print

and you'd rather type a simple command, say

    sfind file

Create a shell script:

    % cd ~/bin
    % emacs sfind
    % page sfind
    find . -name $1 -print
    % chmod a+x sfind
    % rehash
    % cd /usr/local/bin
    % sfind tcsh
    ./shells/tcsh

Observations

This quick example is far from adequate, but some observations:

1. Shell scripts are simple text files created with an editor.
2. Shell scripts are marked as executable:

       % chmod a+x sfind

3. Scripts should be located in your search path, and ~/bin should be in your search path.
4. You likely need to rehash if you're a Csh (tcsh) user (but not again when you log in).
5. Arguments are passed from the command line and referenced, for example, as $1.

#!/bin/sh

All Bourne Shell scripts should begin with the sequence

    #!/bin/sh

From the man page for exec(2):

    "On the first line of an interpreter script, following the "#!", is the name of a program which should be used to interpret the contents of the file. For instance, if the first line contains "#! /bin/sh", then the contents of the file are executed as a shell script."

You can get away without this, but you shouldn't. All good scripts state the interpreter explicitly. Long ago there was just one (the Bourne Shell) but these days there are many interpreters -- Csh, Ksh, Bash, and others.

Comments

Comments are any text beginning with the pound (#) sign. A comment can start anywhere on a line and continues until the end of the line.

Search Path

All shell scripts should include a search path specification:

    PATH=/usr/ucb:/usr/bin:/bin; export PATH

A PATH specification is recommended -- often a script will fail for some people because they have a different or incomplete search path. The Bourne Shell does not export environment variables to children unless explicitly instructed to do so by using the export command.

Argument Checking

A good shell script should verify that the arguments supplied (if any) are correct.

    if [ $# -ne 3 ]; then
        echo 1>&2 Usage: $0 19 Oct 91
        exit 127
    fi

This script requires three arguments and gripes accordingly.

Exit status

All Unix utilities should return an exit status.

    # is the year out of range for me?
    if [ $year -lt 1901 -o $year -gt 2099 ]; then
        echo 1>&2 Year \"$year\" out of range
        exit 127
    fi

    etc...
    # All done, exit ok
    exit 0

A non-zero exit status indicates an error condition of some sort, while a zero exit status indicates things worked as expected. On BSD systems there's been an attempt to categorize some of the more common exit status codes. See /usr/include/sysexits.h.

Using exit status

Exit codes are important for those who use your code. Many constructs test on the exit status of a command. The conditional construct is:

    if command; then
        command
    fi

For example,

    if tty -s; then
        echo Enter text end with \^D
    fi

Your code should be written with the expectation that others will use it. Making sure you return a meaningful exit status will help.

Stdin, Stdout, Stderr

Standard input, output, and error are file descriptors 0, 1, and 2. Each has a particular role and should be used accordingly:

    # is the year out of range for me?
    if [ $year -lt 1901 -o $year -gt 2099 ]; then
        echo 1>&2 Year \"$year\" out of my range
        exit 127
    fi

    etc...

    # ok, you have the number of days since Jan 1, ...
    case `expr $days % 7` in
    0)
        echo Mon;;
    1)
        echo Tue;;

    etc...

Error messages should appear on stderr, not on stdout! Output should appear on stdout. As for input/output dialogue:

    # give the fellow a chance to quit
    if tty -s ; then
        echo This will remove all files in $* since ...
        echo $n Ok to proceed? $c; read ans
        case "$ans" in
            n*|N*)
                echo File purge abandoned;
                exit 0 ;;
        esac
        RM="rm -rfi"
    else
        RM="rm -rf"
    fi

Note: this code behaves differently if there's a user to communicate with (i.e. if the standard input is a tty rather than a pipe, a file, etc. See tty(1)).

Language Constructs

For loop iteration

Substitute values for variable and perform task:

    for variable in word ...
    do
        command
    done

For example:

    for i in `cat $LOGS`
    do
        mv $i $i.$TODAY
        cp /dev/null $i
        chmod 664 $i
    done

Alternatively you may see:

    for variable in word ...; do command; done

Case

Switch to statements depending on pattern match:

    case word in
    [ pattern [ | pattern ... ] )
        command ;; ] ...
    esac

For example:

    case "$year" in

    [0-9][0-9])
        year=19${year}
        years=`expr $year - 1901`
        ;;
    [0-9][0-9][0-9][0-9])
        years=`expr $year - 1901`
        ;;
    *)
        echo 1>&2 Year \"$year\" out of range ...
        exit 127
        ;;
    esac

Conditional Execution

Test exit status of command and branch:

    if command
    then
        command
    [ else
        command ]
    fi

For example:

    if [ $# -ne 3 ]; then
        echo 1>&2 Usage: $0 19 Oct 91
        exit 127
    fi

Alternatively you may see:

    if command; then command; [ else command; ] fi

While/Until Iteration

Repeat task while command returns good exit status.

    {while | until} command
    do
        command
    done

For example:

    # for each argument mentioned, purge that directory
    while [ $# -ge 1 ]; do
        _purge $1
        shift
    done

Alternatively you may see:

    while command; do command; done

Variables

Variables are sequences of letters, digits, or underscores beginning with a letter or underscore. To get the contents of a variable you must prepend the name with a $. Numeric variables (e.g. $1, etc.) are positional variables for argument communication.

o Variable Assignment

Assign a value to a variable by variable=value. For example:

    PATH=/usr/ucb:/usr/bin:/bin; export PATH

or

    TODAY=`(set \`date\`; echo $1)`

o Exporting Variables

Variables are not exported to children unless explicitly marked.

    # We MUST have a DISPLAY environment variable
    if [ "$DISPLAY" = "" ]; then
        if tty -s ; then
            echo "DISPLAY (`hostname`:0.0)? \c";
            read DISPLAY
        fi
        if [ "$DISPLAY" = "" ]; then
            DISPLAY=`hostname`:0.0
        fi
        export DISPLAY
    fi

Likewise for variables like PRINTER, which you want honored by lpr(1). From a user's .profile:

    PRINTER=PostScript; export PRINTER

Note that the C shell exports all environment variables.

o Referencing Variables

Use $variable (or, if necessary, ${variable}) to reference the value.

    # Most users have a /bin of their own
    if [ "$USER" != "root" ]; then
        PATH=$HOME/bin:$PATH
    else
        PATH=/etc:/usr/etc:$PATH
    fi

The braces are required for concatenation constructs.

    $p_01
        The value of the variable "p_01".
    ${p}_01
        The value of the variable "p" with "_01" pasted onto the end.

o Conditional Reference

    ${variable-word}
        If the variable has been set, use its value, else use word.

            POSTSCRIPT=${POSTSCRIPT-PostScript}; export POSTSCRIPT

    ${variable:-word}
        If the variable has been set and is not null, use its value, else use word.

These are useful constructions for honoring the user environment, i.e. the user of the script can override variable assignments. Programs like lpr(1) honor the PRINTER environment variable; you can do the same trick with your shell scripts.

    ${variable:?word}
        If variable is set use its value, else print out word and exit. Useful for bailing out.

o Arguments

Command line arguments to shell scripts are positional variables:

    $0, $1, ...
        The command and arguments, with $0 the command and the rest the arguments.

    $#
        The number of arguments.

    $*, $@
        All the arguments as a blank separated string. Watch out for "$*" vs. "$@".

And, some commands:

    shift
        Shift the positional variables down one and decrement the number of arguments.

    set arg arg ...
        Set the positional variables to the argument list.

Command line parsing uses shift:

    # parse argument list
    while [ $# -ge 1 ]; do
        case $1 in
            process arguments...
        esac
        shift
    done

A use of the set command:

    # figure out what day it is
    TODAY=`(set \`date\`; echo $1)`

    cd $SPOOL

    for i in `cat $LOGS`
    do
        mv $i $i.$TODAY
        cp /dev/null $i
        chmod 664 $i
    done

o Special Variables

    $$
        Current process id. This is very useful for constructing temporary files.

            tmp=/tmp/cal0$$
            # clean up on exit (0); exit on hangup, interrupt, pipe and terminate signals
            trap "rm -f $tmp /tmp/cal1$$ /tmp/cal2$$" 0
            trap exit 1 2 13 15
            /usr/lib/calprog >$tmp

    $?
        The exit status of the last command.

            $command
            # Run target file if no errors and ...
            if [ $? -eq 0 ]
            then
                etc...
            fi

Quotes/Special Characters

Special characters to terminate words:

    ; & ( ) | ^ < > new-line space tab

These are for command sequences, background jobs, etc. To quote any of these use a backslash (\) or bracket with quote marks ("" or '').
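Quoting also decides how the positional parameters are split into words, which is the "$*" versus "$@" distinction flagged under Arguments. The following is a minimal sketch, not part of the original guide; the function names count_args, via_star, via_at and via_unquoted are made up for illustration.

```shell
#!/bin/sh
# count_args: hypothetical helper that reports how many arguments it received.
count_args() {
    echo $#
}

# via_star: "$*" joins all arguments into one single word.
via_star() {
    count_args "$*"
}

# via_at: "$@" keeps every argument as its own word, embedded spaces included.
via_at() {
    count_args "$@"
}

# via_unquoted: without quotes the arguments are split again on blanks.
via_unquoted() {
    count_args $*
}

# Two arguments are passed in each call; the first contains a space.
via_star "a b" c
via_at "a b" c
via_unquoted "a b" c
```

With the two arguments "a b" and c, the three calls report one, two and three words respectively. When a script passes its own arguments on to another command, "$@" is usually the form that preserves them unchanged.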
Single Quotes

Within single quotes all characters are quoted -- including the backslash. The result is one word.

    grep :${gid}: /etc/group | awk -F: '{print $1}'

Double Quotes

Within double quotes you have variable substitution (i.e. the dollar sign is interpreted) but no file name generation (i.e. * and ? are quoted). The result is one word.

    if [ ! "${parent}" ]; then
        parent=${people}/${group}/${user}
    fi

Back Quotes

Back quotes mean run the command and substitute the output.

    if [ "`echo -n`" = "-n" ]; then
        n=""
        c="\c"
    else
        n="-n"
        c=""
    fi

and

    TODAY=`(set \`date\`; echo $1)`

Functions

Functions are a powerful feature that isn't used often enough. Syntax is

    name ()
    {
        commands
    }

For example:

    # Purge a directory
    _purge()
    {
        # there had better be a directory
        if [ ! -d $1 ]; then
            echo $1: No such directory 1>&2
            return
        fi

        etc...
    }

Within a function the positional parameters $1, $2, etc. are the arguments to the function (not the arguments to the script).

Within a function use return instead of exit.

Functions are good for encapsulations. You can pipe, redirect input, etc. to functions. For example:

    # deal with a file, add people one at a time
    do_file()
    {
        while parse_one

        etc...
    }

    etc...

    # take standard input (or a specified file) and do it.
    if [ "$1" != "" ]; then
        cat $1 | do_file
    else
        do_file
    fi

Sourcing commands

You can execute shell scripts from within shell scripts. A couple of choices:

    sh command
        This runs the shell script as a separate shell. For example, on Sun machines in /etc/rc:

            sh /etc/rc.local

    . command
        This runs the shell script from within the current shell script. For example:

            # Read in configuration information
            . /etc/hostconfig

What are the virtues of each? What's the difference? The second form is useful for configuration files where environment variables are set for the script. For example:

    for HOST in $HOSTS; do

        # is there a config file for this host?
        if [ -r ${BACKUPHOME}/${HOST} ]; then
            . ${BACKUPHOME}/${HOST}
        fi
        etc...
Using configuration files in this manner makes it possible to write scripts that are automatically tailored for different situations.

Some Tricks

Test

The most powerful command is test(1).

    if test expression; then

        etc...

and (note the matching bracket argument)

    if [ expression ]; then

        etc...

On System V machines this is a builtin (check out the command /bin/test). On BSD systems (like the Suns) compare the command /usr/bin/test with /usr/bin/[.

Useful expressions are:

    test { -w, -r, -x, -s, ... } filename
        Is the file writeable, readable, executable, non-empty, etc.?

    test n1 { -eq, -ne, -gt, ... } n2
        Are the numbers equal, not equal, greater than, etc.?

    test s1 { =, != } s2
        Are the strings the same or different?

    test cond1 { -o, -a } cond2
        Binary or; binary and; use ! for unary negation.

For example:

    if [ $year -lt 1901 -o $year -gt 2099 ]; then
        echo 1>&2 Year \"$year\" out of range
        exit 127
    fi

Learn this command inside out! It does a lot for you.

String matching

The test command provides limited string matching tests. A more powerful trick is to match strings with the case switch.

    # parse argument list
    # (the exact matches -c and -p come first; otherwise -c* and -p* would shadow them)
    while [ $# -ge 1 ]; do
        case $1 in
        -c)     shift; rate=$1 ;;
        -c*)    rate=`echo $1 | cut -c3-`;;
        -p)     shift; prefix=$1 ;;
        -p*)    prefix=`echo $1 | cut -c3-`;;
        -*)     echo $Usage; exit 1 ;;
        *)      disks=$*; break ;;
        esac

        shift

    done

Of course getopt would work much better.

SysV vs BSD echo

On BSD systems to get a prompt you'd say:

    echo -n Ok to proceed?; read ans

On SysV systems you'd say:

    echo Ok to proceed? \c; read ans

In an effort to produce portable code we've been using:

    # figure out what kind of echo to use
    if [ "`echo -n`" = "-n" ]; then
        n=""; c="\c"
    else
        n="-n"; c=""
    fi

    etc...

    echo $n Ok to proceed? $c; read ans

Is there a person?

The Unix tradition is that programs should execute as quietly as possible, especially for pipelines, cron jobs, etc.

User prompts aren't required if there's no user.

    # If there's a person out there, prod him a bit.
    if tty -s; then
        echo Enter text end with \^D
    fi

The tradition also extends to output.

    # If the output is to a terminal, be verbose
    if tty -s <&1; then
        verbose=true
    else
        verbose=false
    fi

Beware: just because stdin is a tty doesn't mean that stdout is too. User prompts should be directed to the user terminal.

    # If there's a person out there, prod him a bit.
    if tty -s; then
        echo Enter text end with \^D >&0
    fi

Have you ever had a program stop waiting for keyboard input when the output is directed elsewhere?

Creating Input

We're familiar with redirecting input. For example:

    # take standard input (or a specified file) and do it.
    if [ "$1" != "" ]; then
        cat $1 | do_file
    else
        do_file
    fi

alternatively, redirection from a file:

    # take standard input (or a specified file) and do it.
    if [ "$1" != "" ]; then
        do_file < $1
    else
        do_file
    fi

You can also construct files on the fly.

    rmail bsmtp <<EOF
    rcpt to:
    data
    from: <$1@newshost.uwo.ca>
    to:
    Subject: Signon $2

    subscribe $2 Usenet Feeder at UWO
    .
    quit
    EOF

Note that variables are expanded in the input.

String Manipulations

One of the more common things you'll need to do is parse strings. Some tricks:

    TIME=`date | cut -c12-19`
    TIME=`date | sed 's/.* .* .* \(.*\) .* .*/\1/'`
    TIME=`date | awk '{print $4}'`
    TIME=`set \`date\`; echo $4`
    TIME=`date | (read u v w x y z; echo $x)`

With some care, redefining the input field separators can help.

    #!/bin/sh
    # convert IP number to in-addr.arpa name

    name()
    {
        set `IFS=".";echo $1`
        echo $4.$3.$2.$1.in-addr.arpa
    }

    if [ $# -ne 1 ]; then
        echo 1>&2 Usage: bynum IP-address
        exit 127
    fi

    add=`name $1`

    nslookup <<EOF | grep "$add" | sed 's/.*= //'
    set type=any
    $add
    EOF

Pattern Matching

There are two kinds of pattern matching available: matching from the left and matching from the right. The operators, with their functions:
Operator                 Function                                   Example
${foo#t*is}              deletes the shortest possible match        export foo="this is a test"
                         from the left                              echo ${foo#t*is}
                                                                    is a test
${foo##t*is}             deletes the longest possible match         export foo="this is a test"
                         from the left                              echo ${foo##t*is}
                                                                    a test
${foo%t*st}              deletes the shortest possible match        export foo="this is a test"
                         from the right                             echo ${foo%t*st}
                                                                    this is a
${foo%%t*st}             deletes the longest possible match         export foo="this is a test"
                         from the right                             echo ${foo%%t*st}
                                                                    (empty line)

Substitution

Another kind of variable mangling you might want to employ is substitution. There are four substitution operators in Bash.

Operator                 Function                                   Example
${foo:-bar}              If $foo exists and is not null, return     export foo=""
                         $foo. If it doesn't exist, or is null,     echo ${foo:-one}
                         return bar.                                one
                                                                    echo $foo

${foo:=bar}              If $foo exists and is not null, return     export foo=""
                         $foo. If it doesn't exist, or is null,     echo ${foo:=one}
                         set $foo to bar and return bar.            one
                                                                    echo $foo
                                                                    one
${foo:+bar}              If $foo exists and is not null, return     export foo="this is a test"
                         bar. If it doesn't exist, or is null,      echo ${foo:+bar}
                         return a null.                             bar
${foo:?"error message"}  If $foo exists and isn't null, return      export foo="one"
                         its value. If it doesn't exist, or is      for i in foo bar baz; do
                         null, print the error message. If no           eval echo \${$i:?}
                         error message is given, it prints          done
                         "parameter null or not set".               one
                         Note: In a non-interactive shell, this     bash: bar: parameter null or not set
                         will abort the current script. In an       bash: baz: parameter null or not set
                         interactive shell, this will just print
                         the error message.

Debugging

The shell has a number of flags that make debugging easier:

sh -n command
    Read the shell script but don't execute the commands, i.e. check syntax.

sh -x command
    Display commands and arguments as they're executed.
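A rough sketch of the two flags in action follows. The two throwaway scripts and their locations under /tmp are invented for the illustration, and the exact trace format printed by sh -x can differ slightly from one shell to another.

```shell
#!/bin/sh
# Hypothetical throwaway script with a deliberate syntax error (missing "fi").
bad=${TMPDIR:-/tmp}/bad.$$
printf 'if true; then\n  echo hello\n' > "$bad"

# sh -n parses the script without running it; a non-zero exit status
# means the syntax check failed.
if sh -n "$bad" 2>/dev/null; then
    syntax=ok
else
    syntax=broken
fi
echo "syntax check: $syntax"

# sh -x writes each command to stderr before running it, so the trace is
# captured here together with the script's normal output.
good=${TMPDIR:-/tmp}/good.$$
printf 'foo="this is a test"\necho ${foo#t*is}\n' > "$good"
trace=`sh -x "$good" 2>&1`
echo "$trace"

rm -f "$bad" "$good"
```

Running the syntax check first is cheap and catches the typos; the trace is for the cases where the script parses fine but does the wrong thing.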
In a lot of my shell scripts you'll see

    # Uncomment the next line for testing
    # set -x

Source: http://linux.dbw.org/shellscript_howto.html

7 JOB CONTROL

Job control, a feature standardized by POSIX.1 and mandated by many standards, allows a single terminal to run multiple jobs. Each job is a group of one or more processes, usually connected by pipes. Mechanisms are provided to move jobs between the foreground and the background and to prevent background jobs from accessing the terminal.

Jobs can either be in the foreground or in the background. There can only be one job in the foreground at a time. The foreground job is the job with which you interact: it receives input from the keyboard and sends output to your screen, unless, of course, you have redirected input or output. On the other hand, jobs in the background do not receive input from the terminal; in general, they run along quietly without the need for interaction.

Some jobs take a long time to finish and don't do anything interesting while they are running. Compiling programs is one such job, as is compressing a large file. There's no reason why you should sit around being bored while these jobs complete their tasks; just run them in the background. While jobs run in the background, you are free to run other programs.

Jobs may also be suspended. A suspended job is a job that is temporarily stopped. After you suspend a job, you can tell the job to continue in the foreground or the background as needed. Resuming a suspended job does not change the state of the job in any way: the job continues to run where it left off.

Suspending a job is not the same as interrupting a job. When you interrupt a running process (by pressing the interrupt key, which is usually Ctrl-C), the process is killed for good. Once the job is killed, there's no hope of resuming it. You must run the command again.
Also, some programs trap the interrupt, so that pressing Ctrl-C won't immediately kill the job. This is to let the program perform any necessary cleanup operations before exiting. In fact, some programs don't let you kill them with an interrupt at all.

Job Control Commands

Nice & Renice

The nice command is used to alter an initial job priority. On Linux systems this is fairly simple: the lower the nice value, the higher the priority. The range on a Linux system is -20 (the highest priority) to 19 (the lowest). Using nice is pretty simple. Let's say we want to make sure that a compile and install for fetchmail has a pretty high priority. We might do the following (as root, since only root may use a negative nice value):

    nice -n -5 make

We have lowered the nice number and raised the job priority initially for this task.

The renice command is used to alter the nice value of a job after it has been started. It is important to note that only root may alter the nice value of jobs it does not own, and non-root users may only alter their nice values between 0 and 19 (the former is obviously so users may not tamper with other users, while the latter protects the privileged processes of the system). An example of using renice on a single process might look like so:

    renice 5 -p 10023

where we set the nice value of PID 10023 to 5. The renice command can also affect an entire group of processes. For instance, let us say we wanted all of user jdoe's processes to have a nice value of 12. The syntax on Linux would be:

    renice 12 -u jdoe

While running a command (job) you can pause/suspend it with Ctrl-Z and kill it with Ctrl-C.

While running a job you can     Shortcut
suspend a job                   Ctrl-Z
terminate a job                 Ctrl-C

Running a job in background

The & is used to put the job in the background so my terminal is free for me to keep working. The system will inform me when the jobs are done.

It is important not to log out while background jobs are running. On most systems you will see a warning message stating that there are running jobs.
If this is ignored, the jobs will be terminated.

jobs
    Lists the jobs running in the background, giving the job number.

    It is all too easy to confuse jobs and processes. Certain builtins, such as kill, disown, and wait, accept either a job number or a process number as an argument. The fg, bg and jobs commands accept only a job number.

        bash$ sleep 100 &
        [1] 1384

        bash$ jobs
        [1]+  Running                 sleep 100 &

    "1" is the job number (jobs are maintained by the current shell), and "1384" is the process number (processes are maintained by the system). To kill this job/process, either kill %1 or kill 1384 works.

disown
    Remove job(s) from the shell's table of active jobs.

fg, bg
    The fg command switches a job running in the background into the foreground. The bg command restarts a suspended job, and runs it in the background. If no job number is specified, then the fg or bg command acts upon the currently running job.

wait
    Stop script execution until all jobs running in the background have terminated, or until the job number or process ID specified as an option terminates. Returns the exit status of the waited-for command.

    You may use the wait command to prevent a script from exiting before a background job finishes executing.

    Optionally, wait can take a job identifier as an argument, for example, wait %1 or wait $PPID.

suspend
    This has a similar effect to Control-Z, but it suspends the shell (the shell's parent process should resume it at an appropriate time).

logout
    Exit a login shell, optionally specifying an exit status.

times
    Gives statistics on the system time used in executing commands, in the following form:

        0m0.020s 0m0.020s

    This capability is of very limited value, since it is uncommon to profile and benchmark shell scripts.
kill
    Forcibly terminate a process by sending it an appropriate terminate signal.

builtin
    Invoking builtin BUILTIN_COMMAND runs the command "BUILTIN_COMMAND" as a shell builtin, temporarily disabling both functions and external system commands with the same name.

enable
    This either enables or disables a shell builtin command. As an example, enable -n kill disables the shell builtin kill, so that when Bash subsequently encounters kill, it invokes /bin/kill.

    The -a option to enable lists all the shell builtins, indicating whether or not they are enabled. The -f filename option lets enable load a builtin as a shared library (DLL) module from a properly compiled object file.

autoload
    This is a port to Bash of the ksh autoloader. With autoload in place, a function with an "autoload" declaration will load from an external file at its first invocation. This saves system resources.

    Note that autoload is not a part of the core Bash installation. It needs to be loaded in with enable -f.

Links:
http://www.linuxplanet.com/linuxplanet/tutorials/2116/1/
http://www.linux.com/guides/abs-guide/x6689.shtml
http://linuxreviews.org/beginner/jobs/

8 HOW TO MANAGE EXPERIMENTS

I found this link; I don't know how much it will help you. Please let me know if it doesn't work:
http://mylinuxsaga.blogspot.com/search?q=LINUX