BASIC BASH SCRIPTING
Katherine Holcomb
UVACSE

Uses for Scripting

• Automate repetitive tasks
  • Could also use languages like Perl or Python, but it’s still handy to know some bash
• Write job scripts for the PBS queueing system
  • For most users these must be shell scripts (bash, ksh, or tcsh)
• Anything you can type on the command line can be put into a script

Example

    #!/bin/bash
    #This is a bash script
    echo "Hello world!"
    echo "I am done now!" #my comment

• The first line tells the system which shell to start to execute the commands. It is sometimes called a “shebang.”

Executing Your Script

• Suppose the previous script was in a file hello.sh
• By default it is just text. You must explicitly make it executable

    chmod a+x hello.sh

• Then you can just invoke it by name at the command line

    ./hello.sh

Comments and Continuations

• All text from the # to the end of the line is ignored
• To continue a statement on to the next line, type \<enter>

    #This is a comment
    echo "This line is too long I \
    want it to be on two lines"

• Multiple statements can be placed on the same line when separated by the ;

    a=20; b=3

Sourcing

• If you execute the script as a standalone script, it runs in a new shell; the shebang names the shell to start. If the script has no shebang you can type

    bash myscript.sh

• Sometimes you just want to bundle up some commands that you repeat regularly and execute them within the current shell. For this the shebang is not needed (a sourced script treats it as a comment), and you must source the script

    . <script>
    source <script>

Variables

• Like all programming languages, the shell permits you to define variables.
• Shell variables and environment variables are set and referenced the same way; the difference is that environment variables are exported to child processes. Environment variables are conventionally written with all capital letters and are used to set properties. Shell variables are used for assignments and such.
• Arrays are supported
• Scalar variables are referenced with $
• When assigning variables do not put spaces around the equals sign!
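
The assignment and referencing rules above can be tried out with a short sketch like the one below. The names greeting, count, message, and MY_DATA_DIR are hypothetical and exist only for this illustration; the path is not expected to exist.

```bash
#!/bin/bash
# Hypothetical names for illustration only: greeting, count, message, MY_DATA_DIR.

greeting="Hello"          # correct: no spaces around the equals sign
count=3

# Reference scalars with $; braces separate the name from adjacent text
message="${greeting}, world (${count} times)"
echo "$message"

# export turns a shell variable into an environment variable,
# which child processes inherit
export MY_DATA_DIR="/tmp/mydata"
child_view=$(bash -c 'echo "$MY_DATA_DIR"')
echo "Child shell sees MY_DATA_DIR as: $child_view"

# A variable that was not exported is invisible to a child process
unexported_view=$(bash -c 'echo "${count:-unset}"')
echo "Child shell sees count as: $unexported_view"
```

Writing `greeting = "Hello"` with spaces would not assign anything; the shell would instead try to run a command named greeting.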
String Expressions and Variables

• Bash isn’t very good at math, so much of what we do with it involves strings.
• String expressions
  • Single quotes (‘hard quotes’): the expression is taken literally (no substitutions)
  • Double quotes (“soft quotes”): the $, \ (backslash), and ` (backtick) are taken to be shell characters
    • $ indicates a variable name follows
    • \ escapes the following character
    • `command` : command is executed and its standard output is substituted
    • In newer versions $(command) is synonymous

Example

    echo `date`
    echo 'This is worth $2'
    var=2
    echo "This is worth $2"
    echo "This is worth $var"
    echo "This is worth \$$var"

Conditionals

• The shell has an if else construct:

    if [[ <condition> ]]
    then
        commands
    elif [[ <condition> ]]    #optional
    then
        commands
    else                      #optional
        more commands
    fi

Conditions

• The conditions take different forms depending on what you want to compare.
• String comparison operators:
  • = != < > -z -n
  • equal, not equal, lexically before, lexically after (both these may need to be escaped), zero length, not zero length.
  • Can use == but it behaves differently in [] versus [[]]

Example: String Comparisons

    if [[ $name == "Tom" ]]; then
        echo Tom, you are assigned to Room 12
    elif [[ $name == "Harry" ]]; then
        echo Harry, please go to Room 3
    elif [[ -z $name ]]; then
        echo You did not tell me your name
    else
        echo Your name is $name
    fi

Numeric Comparisons

• Numeric comparisons are possible: -eq -ne -gt -ge -lt -le

    if [[ $a -eq 2 ]]; then    #note ;
        stuff
    fi

Testing File Properties

• This is a very common occurrence in bash scripts.
• There are many operators for this. These are the most common:

    -e <file> : file exists
    -f <file> : file exists and is a regular file
    -s <file> : file exists and has length > 0
    -d <dir>  : exists and is a directory

Example

    if [[ -d mydir ]]; then
        if [[ -f the_file ]]; then
            cp the_file mydir
        fi
    else
        mkdir mydir
        echo "Created mydir"
    fi

Other Conditional Operators

    !  : not; negates what follows
    -a : and; for compound conditionals in []. With [[]] use &&
    -o : or; for compound conditionals in []. With [[]] use ||

Case Statement

    case expression in
        pattern1 )
            statements ;;
        pattern2 )
            statements ;;
        pattern3 )
            statements ;;
    esac

Example

    case $filename in
        *.f)
            echo "Fortran 77 source file"
            ;;
        *.c)
            echo "C source file"
            ;;
        *.py)
            echo "Python script file"
            ;;
        *)  #optional, indicates default
            echo "I don't know what this file is"
            ;;
    esac

Loops

• The bash for loop is a little different from the equivalent in C/C++/Fortran (but is similar to Perl or Python)

    for variable in iterator
    do
        commands
    done

Examples

    for i in 1 2 3 4 5 ; do
        echo "I am on step $i"
    done

    for i in {1..5}    #bash 3.0 and up
    do
        echo "I am on step $i"
    done

    for i in 0{1..9} {10..100}; do
        echo "File extension will be $i"
    done

Three-Expression For

    for (( EXPR1; EXPR2; EXPR3 ))
    do
        statements
    done

• Example:

    for (( i=0; i<$IMAX; i++ )); do
        echo $name"."$i
    done

while loop

    while [ condition ]
    do
        command
        command
        command
    done

• One of the commands in the while loop must update the condition so that it eventually becomes false.

break

• If you need to exit a loop before its termination condition is reached you can use the break statement.

    while [ condition ]
    do
        if [ disaster ]; then
            break
        fi
        command
        command
    done

continue

• To skip the remaining commands for an iteration use continue

    for i in iterator; do
        if [[ something ]]; then
            continue
        fi
        command
        command
    done

Bash Arithmetic

• We said earlier that bash is bad at arithmetic, but some basic operations can be performed.
• Expressions to be evaluated numerically must be enclosed in double parentheses. It works only with integers.

    x=$((4+20))
    i=$(($i+1))

• If you need more advanced math (even just fractions!) you must use a tool such as bc.
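
To tie the loop and arithmetic slides together, here is a minimal sketch that combines a three-expression for loop, continue, break, and integer arithmetic. The variable names limit, cap, total, quotient, and remainder are invented for this example.

```bash
#!/bin/bash
# Sum the odd numbers from 1 to limit, stopping early once the sum exceeds cap.
# limit, cap, and total are hypothetical names chosen for this sketch.
limit=10
cap=20
total=0

for (( i=1; i<=limit; i++ )); do
    if [[ $((i % 2)) -eq 0 ]]; then
        continue              # skip even numbers
    fi
    total=$((total + i))      # integer arithmetic inside $(( ))
    if [[ $total -gt $cap ]]; then
        break                 # leave the loop before i reaches limit
    fi
done
echo "Stopped at i=$i with total=$total"    # prints: Stopped at i=9 with total=25

# Integer division truncates; $(( )) has no fractions
quotient=$((7 / 2))
remainder=$((7 % 2))
echo "7/2 = $quotient remainder $remainder"  # prints: 7/2 = 3 remainder 1
```

The truncated quotient is the reason fractional results need an external tool.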
bc

• If you really, really, really must do math in a bash script, most of the time you must use bc
• The syntax is very peculiar

    x=$(echo "3*8+$z" | bc)

Command-Line Arguments

• Many bash scripts need to read arguments from the command line. The arguments are indicated by special variables $0, $1, $2, etc.
• $0 is the name of the command itself
• The subsequent ones are the arguments
• To handle a variable number of arguments, use the shift built-in.
• The special variable $# is the number of arguments (not counting the command name)

Example

    #!/bin/bash
    USAGE="Usage: $0 dir1 dir2 dir3 ... dirN"
    if [ "$#" -eq 0 ]; then
        echo "$USAGE"
        exit 1
    fi
    while [ $# -gt 0 ]; do
        echo "$1"
        shift
    done

More Useful Example

• while plus case

    while [ $# -gt 0 ]; do
        case "$1" in
            -v) verbose="on";;
            -*) echo >&2 "USAGE: $0 [-v] [file]"
                exit 1;;
            *)  break;;    # default
        esac
        shift
    done

Functions

    function name() {
        statement
        statement
        statement
        VALUE=integer
        return $VALUE
    }

• The keyword function is optional in newer versions of bash. The parentheses are always left empty.
• Function definitions must precede any invocations.

Function Arguments

• Function arguments are passed in the caller. In the function they are treated like command-line arguments.

    #!/bin/bash
    function writeout() {
        echo $1
    }
    writeout "Hello World"

Variables

• Variables set in the function are global to the script!

    #!/bin/bash
    myvar="hello"
    myfunc() {
        myvar="one two three"
        for x in $myvar
        do
            echo $x
        done
    }
    myfunc
    echo $myvar $x

Making Local Variables

• We can use the keyword local to avoid clobbering our global variables.

    #!/bin/bash
    myvar="hello"
    myfunc() {
        local x
        local myvar="one two three"
        for x in $myvar ; do
            echo $x
        done
    }
    myfunc
    echo $myvar $x

Return Values

• Strictly speaking, a function returns only its exit status.
• The returned value must be an integer (in the range 0 to 255)
• You can get the value with $?
• e.g.

    myfunc $1 $2
    result=$?

String Operations

• Bash has a number of built-in string operations.
• Concatenation
  • Just write them together (literals should be quoted)

    newstring=$oldstring".ext"

• String length

    ${#string}

• Extract substring
  • Strings count from 0: the first character is numbered 0
  • Extract from pos to the end

    ${string:pos}

  • Extract len characters starting at pos

    ${string:pos:len}

Clipping Strings

• It is very common in bash scripts to clip off part of a string so it can be remodeled.
• Delete shortest match from front of string

    ${string#substring}

• Delete longest match from front of string

    ${string##substring}

• Delete shortest match from back of string

    ${string%substring}

• Delete longest match from back of string

    ${string%%substring}

Arrays

• Array variables exist but have a somewhat unusual syntax.
• Arrays are zero based so the first index is 0
• Initialize arrays with a list enclosed in parentheses
• Obtaining the value of an item in an array requires use of ${}

    val=${arr[$i]}
    ${arr[@]}     # All of the items in the array (@ or * work here and on the next line)
    ${#arr[@]}    # Number of items in the array
    ${#arr[0]}    # Length of item zero

Example

    my_arr=(1 2 3 4 5 6)
    for num in ${my_arr[@]}; do
        echo $num
    done

Realistic Example

    jpg_files=(`ls *jpg`)
    for file in ${jpg_files[*]}; do
        if [[ -n $file ]]; then
            convert $file ${file%%".jpg"}.png
        fi
    done

Herefiles

• A herefile or here document is a block of text that is dynamically generated when the script is run

    CMD << Delimiter
    line
    line
    Delimiter

Example

    #!/bin/bash
    # 'echo' is fine for printing single line messages,
    # but somewhat problematic for message blocks.
    # A 'cat' here document overcomes this limitation.
    cat <<End-of-message
    ------------------------------------
    This is line 1 of the message.
    This is line 2 of the message.
    This is line 3 of the message.
    This is line 4 of the message.
    This is the last line of the message.
    ------------------------------------
    End-of-message

Regular Expressions

• Regular expressions are generalizations of the wildcards often used for simple file commands.
• A regular expression consists of a pattern that attempts to match text.
• It contains one or more of:
  • A character set
  • An anchor (to the line position)
  • Modifiers
• Without an anchor or repeat it will find the leftmost match and stop.

Regex: Character sets and modifiers

• The character set is the set of characters that must be matched literally.
• Modifiers expand the potential matches.
• * matches any number of repeats of the pattern before it, including 0 matches (note that this is different from its use in the shell).
• ? matches 0 or 1 occurrences of the pattern before it (also different from the shell wildcard).
• + matches one or more, but not 0, occurrences.
• . matches any single character, except newline

More Modifiers

• \ escapes the following character, meaning it is to be used literally and not as a regex symbol.
• \ can also indicate nonprinting characters, e.g. \t for tab.
• () group the pattern into a subexpression
• | pipe is or
• gray|grey is equivalent to gr(a|e)y

Regex: Ranges and Repetition

• [] enclose a set of characters to be matched.
• - indicates a range of characters (must be a subsequence of the ASCII sequence)
• {n} where n is a non-negative integer, indicates exactly n repetitions of the preceding character or range.
• {n,m} matches n to m occurrences.
• {n,} matches n or more occurrences.

Regex: Anchors and Negation

• ^ inside square brackets negates what follows it
• ^ outside square brackets means “beginning of target string”
• $ means “end of target string”
• . matches “any character in the indicated position”
• Note: the “target string” is usually but not always a line in a file.

Examples

• AABB* matches
  • AAB
  • AABB
  • AABBBB
• But not
  • AB
  • ABB
  • ABBBB

Examples (Cont.)

• [a-zA-Z] matches any letter
• [^a-z] matches any character except lower-case letters
• .all matches call, ball, mall, and so forth. It also matches shall (since it contains hall). It does not match all by itself, because the . must match one character.
• Regex patterns are said to be greedy since they find the longest match that the pattern allows.
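
The example patterns above can be checked directly from the shell. The sketch below assumes a grep that accepts -E (extended regular expressions, as GNU grep does); the word lists are made-up sample data and the variable names are hypothetical.

```bash
#!/bin/bash
# Try the slide patterns with grep -E. The input words are sample data only.

# AABB* : "AAB" followed by zero or more extra B's.
# ^ and $ anchor the pattern to the whole line.
matched=$(printf '%s\n' AAB AABB AABBBB AB ABB ABBBB | grep -E '^AABB*$')
echo "AABB* matches:" $matched

# gr(a|e)y : alternation inside a group
colors=$(printf '%s\n' gray grey gruy | grep -E '^gr(a|e)y$')
echo "gr(a|e)y matches:" $colors

# .all with no anchors matches anywhere in the line, so "shall" matches
# because it contains "hall". "all" alone does not match: the . needs a character.
words=$(printf '%s\n' call ball shall all | grep -E '.all')
echo ".all matches:" $words
```

Single quotes around each pattern keep the shell from expanding *, $, and | before grep sees them.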
Extensions and Shortcuts

• Most shells and languages support some shortcuts:

    \w : [A-Za-z0-9_]
    \s : [ \t\r\n]  (some flavors have a few more rare whitespace characters)
    \d : [0-9]
    \D : [^\d]
    \W : [^\w]
    \S : [^\s]

• NB [\D\S] is not the same as [^\d\s]; in fact it matches anything. [^\d\s] matches a but not 1

Grep, Sed and Awk

• grep or egrep can be used with regular expressions.
• sed is the stream editor. It is used to script editing of files.
• awk is a programming language to extract data and print reports.

grep examples

• grep "regex" filename
• The quotes are often needed; they keep the shell from interpreting special characters in the regex
• We assume GNU grep on Linux; egrep is equivalent to grep -E, which enables extended regular expressions
• egrep matches anywhere in the string
• grep ^root /etc/passwd
• grep :$ /etc/passwd

sed examples

• Sed operates on standard input and outputs to stdout unless told otherwise. Thus you must redirect.
• An option (-i in GNU sed) will tell it to overwrite the old file. This is often a mistake (if you do this, be sure to debug the sed command carefully).
• sed 'command' < old > new
• Note hard quotes; best practice is to use them
• sed 's/day/night/' < old > new
• Remember, the pattern matches anywhere in the line, not only whole words; day/night here changes Sunday to Sunnight

awk examples

• awk 'pattern {actions}' file
• Similar to C, awk “action” statements end with ;
• For each record (line) awk splits on whitespace. The results are stored as fields and can be referenced as $1, $2, etc. The entire line is $0. NF is the number of fields in the line, so $NF is the last field.
• awk 'pattern1 {actions} pattern2 {actions}' file
• awk '/Smith/' employees.txt
• awk '{print $2, $NF;}' employees.txt

Some Resources

• Linux Shell Scripting Tutorial: http://bash.cyberciti.biz/guide/Main_Page
• How-To: http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
• Advanced Scripting Tutorial: http://tldp.org/LDP/abs/html/

More Resources

• Regex Tutorials
  http://www.regular-expressions.info/
  http://www.zytrax.com/tech/web/regex.htm
• sed tutorial
  http://www.grymoire.com/Unix/Sed.html
• awk tutorial
  http://www.grymoire.com/Unix/Awk.html (but note that we have gawk on Linux)
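
Returning to the sed and awk slides, the following sketch runs the example commands on a small file. The contents of employees.txt, its field layout, and the variable names are invented here purely for illustration; the file is created in a temporary directory and removed afterwards.

```bash
#!/bin/bash
# The employees.txt contents and field layout are invented for this sketch.
workdir=$(mktemp -d)
cat > "$workdir/employees.txt" <<EOF
Ann Smith Engineering 52000
Bob Jones Sales 41000
Cy Smith Sales 47000
EOF

# awk: for lines matching Smith, print the second field and the last field
smiths=$(awk '/Smith/ {print $2, $NF;}' "$workdir/employees.txt")
echo "$smiths"

# awk: NF is the number of fields; NR == 1 restricts the action to the first line
field_count=$(awk 'NR == 1 {print NF;}' "$workdir/employees.txt")
echo "Fields on line 1: $field_count"

# sed: replace the first "day" on the line; the match is inside "Sunday"
nights=$(echo "Sunday is a fine day" | sed 's/day/night/')
echo "$nights"    # prints: Sunnight is a fine day

rm -r "$workdir"
```

Only the first match on each line is replaced here; appending g to the sed command (s/day/night/g) would replace every match.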