Bash Scripting Tutorial

BASIC BASH
SCRIPTING
Katherine Holcomb
UVACSE
Uses for Scripting
• Automate repetitive tasks
• Could also use languages like Perl or Python but it’s still
handy to know some bash
• Write job scripts for the PBS queueing system
• For most users these must be shell scripts (bash, ksh,
or tcsh)
• Anything you can type on the command line can
be put into a script
Example
#!/bin/bash
#This is a bash script
echo "Hello world!"
echo "I am done now!" #my comment
• The first line tells the system to start the shell and
execute the commands. It is sometimes called a
“shebang.”
Executing Your Script
• Suppose the previous script was in a file
hello.sh
• By default it is just text. You must explicitly make
it executable
chmod a+x hello.sh
• Then you can just invoke it by name at the
command line
./hello.sh
Comments and Continuations
• All text from the # to the end of the line is ignored
• To continue a statement on to the next line, type
\<enter>
#This is a comment
echo "This line is too long I \
want it to be on two lines"
• Multiple statements can be placed on the same
line when separated by the ;
a=20; b=3
Sourcing
• If you execute the script as a standalone program,
the shebang starts a new shell to run it. Without a
shebang you can start the new shell yourself by typing
bash myscript.sh
• Sometimes you just want to bundle up some
commands that you repeat regularly and execute
them within the current shell. To do this you
must source the script. A shebang is not needed
(if present it is treated as an ordinary comment)
. <script> or source <script>
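The difference between executing and sourcing can be sketched as follows. The file name setvars.sh, the temporary directory, and the variable names are assumptions made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper file, created here only for the demonstration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/setvars.sh" <<'EOF'
greeting="set by setvars.sh"
EOF

unset greeting
bash "$tmpdir/setvars.sh"            # runs in a new (child) shell
after_exec="${greeting:-unset}"      # the current shell never sees the assignment

source "$tmpdir/setvars.sh"          # runs in the current shell (same as ". file")
after_source="${greeting:-unset}"

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
rm -r "$tmpdir"
```

Only the sourced run leaves greeting defined, because only then do the commands run inside the current shell.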
Variables
• Like all programming languages, the shell permits
you to define variables.
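• Shell variables and environment variables are set
and referenced the same way. The difference is
that environment variables have been exported,
so child processes inherit them. They are
conventionally written with all capital letters and
are used to set properties. Plain shell variables
are visible only in the current shell and are used
for ordinary assignments.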
• Arrays are supported
• Scalar variables are referenced with $
• When assigning variables do not put spaces
around the equals sign!
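A minimal sketch of assignment and of the effect of export. The variable names (count, name, plain, MY_SETTING) are hypothetical, and the example assumes that no variable called plain already exists in your environment.

```shell
#!/bin/bash
count=3                 # no spaces around =
name="Ada Lovelace"     # quote values that contain spaces
echo "name=$name count=$count"

# A plain shell variable is not passed on to child processes...
plain="shell only"
child_plain=$(bash -c 'echo "${plain:-unset}"')

# ...but an exported (environment) variable is.
export MY_SETTING="exported"
child_env=$(bash -c 'echo "${MY_SETTING:-unset}"')

echo "child sees plain: $child_plain"
echo "child sees MY_SETTING: $child_env"
```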
String Expressions and Variables
• Bash isn’t very good at math so much of what we
do with it involves strings.
• String expressions
• Single quotes (‘hard quotes’): the expression is
evaluated literally (no substitutions)
• Double quotes (“soft quotes”): the $, \
(backslash), and ` (backtick) are taken to be shell
characters
$ indicates a variable name follows
\ escapes the following character
`command` : command is executed and its standard
output is substituted
--In newer versions $(command) is synonymous
Example
echo `date`
echo 'This is worth $2'
var=2
echo "This is worth $2"
echo "This is worth $var"
echo "This is worth \$$var"
Conditionals
• The shell has an if else construct:
if [[ <condition> ]]
then
commands
elif [[ <condition> ]]   #optional
then
commands
else                     #optional
more commands
fi
Conditions
• The conditions take different forms depending on
what you want to compare.
• String comparison operators:
• = != < > -z -n
• equal, not equal, lexically before, lexically after
(both these may need to be escaped), zero
length, not zero length.
• Can use == but behaves differently in [] versus
[[]]
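The following sketch illustrates the string operators inside [[ ]], including the pattern-matching behavior of ==. The file name and variable names are hypothetical. Inside single brackets an unquoted *.txt would instead be expanded by the shell against the files in the current directory, which is one reason the two forms behave differently.

```shell
#!/bin/bash
file="report.txt"

# Inside [[ ]] an unquoted right-hand side of == is a wildcard pattern.
if [[ $file == *.txt ]]; then
    pattern_result="match"
else
    pattern_result="no match"
fi

# Quoting the right-hand side forces a literal comparison.
if [[ $file == "*.txt" ]]; then
    literal_result="match"
else
    literal_result="no match"
fi

# -z and -n test for empty and non-empty strings.
empty=""
[[ -z $empty ]] && empty_result="empty"
[[ -n $file ]] && nonempty_result="not empty"

# < compares lexically; inside [[ ]] it does not need to be escaped.
[[ "apple" < "banana" ]] && order_result="apple sorts first"

echo "$pattern_result / $literal_result / $empty_result / $nonempty_result / $order_result"
```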
Example: String Comparisons
if [[ $name == "Tom" ]]; then
echo Tom, you are assigned to Room 12
elif [[ $name == "Harry" ]]; then
echo Harry, please go to Room 3
elif [[ -z $name ]]; then
echo You did not tell me your name
else
echo Your name is $name
fi
Numeric Comparisons
• Numeric comparisons are possible:
-eq -ne -gt -ge -lt -le
if [[ $a -eq 2 ]]; then   #note ;
stuff
fi
Testing File Properties
• This is a very common occurrence in bash
scripts.
• There are many operators for this. These are the
most common:
-e <file> : file exists
-f <file> : file exists and is a regular file
-s <file> : file exists and has length > 0
-d <dir>  : exists and is a directory
Example
if [[ -d mydir ]]; then
if [[ -f the_file ]]; then
cp the_file mydir
fi
else
mkdir mydir
echo "Created mydir"
fi
Other Conditional Operators
! : not ; negates what follows
-a : and ; for compound conditionals in [ ]
use && with [[ ]]
-o : or ; for compound conditionals in [ ]
use || with [[ ]]
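A short sketch of compound conditions in both bracket styles. The variable values are hypothetical, and the path /no/such/directory is assumed not to exist.

```shell
#!/bin/bash
a=5
name="Tom"

# && and || combine tests inside [[ ]]; ! negates the test that follows it.
if [[ $a -gt 2 && $name == "Tom" ]]; then
    both="yes"
fi
if [[ $a -gt 10 || $name == "Tom" ]]; then
    either="yes"
fi
if [[ ! -d /no/such/directory ]]; then
    missing="yes"
fi

# With single brackets the older -a and -o operators are used instead.
if [ "$a" -gt 2 -a "$name" = "Tom" ]; then
    old_style="yes"
fi

echo "both=$both either=$either missing=$missing old_style=$old_style"
```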
Case Statement
case expression in
pattern1 )
statements ;;
pattern2 )
statements ;;
pattern3 )
statements ;;
esac
Example
case $filename in
*.f)
echo "Fortran 77 source file"
;;
*.c)
echo "C source file"
;;
*.py)
echo "Python script file"
;;
*)   #optional, indicates default
echo "I don't know what this file is"
;;
esac
Loops
• The bash for loop is a little different from the
equivalent in C/C++/Fortran (but is similar to Perl
or Python)
for variable in iterator
do
commands
done
Examples
for i in 1 2 3 4 5 ; do
echo "I am on step $i"
done
for i in {1..5} #bash 3.0 and up
do
echo "I am on step $i"
done
for i in 0{1..9} {10..100}; do
echo "File extension will be $i"
done
Three-Expression For
for (( EXPR1; EXPR2; EXPR3 ))
do
statements
done
• Example:
for (( i=0; i<$IMAX; i++ )); do
echo "$name.$i"
done
while loop
while [ condition ]
do
command
command
command
done
One of the commands in the while loop must
update the condition so that it eventually becomes
false.
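As a concrete illustration, here is a small countdown loop in which the last statement of the body updates the condition. The variable names are hypothetical.

```shell
#!/bin/bash
count=3
log=""
while [ "$count" -gt 0 ]; do
    log="$log$count "
    count=$((count - 1))    # this update is what eventually ends the loop
done
echo "counted: $log"
echo "final count: $count"
```

Without the line that decrements count, the condition would stay true and the loop would never end.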
break
• If you need to exit a loop before its termination
condition is reached you can use the break
statement.
while [ condition ]
do
if [ disaster ]; then
break
fi
command
command
done
continue
• To skip the commands for an iteration use
continue
for i in iterator; do
if [[ something ]]; then
continue
fi
command
command
done
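A runnable sketch that combines break and continue in one loop. The numbers and the variable kept are chosen only for illustration.

```shell
#!/bin/bash
kept=""
for i in 1 2 3 4 5 6 7 8; do
    if [[ $((i % 2)) -eq 0 ]]; then
        continue            # skip even numbers, go on to the next iteration
    fi
    if [[ $i -gt 5 ]]; then
        break               # leave the loop at the first odd number above 5
    fi
    kept="$kept$i "
done
echo "kept: $kept"
```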
Bash Arithmetic
• We said earlier that bash is bad at arithmetic, but
some basic operations can be performed.
• Expressions to be evaluated numerically must be
enclosed in double parentheses. It works only
with integers.
x=$((4+20))
i=$(($i+1))
If you need more advanced math (even just
fractions!) you must use bc.
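A brief sketch of what integer-only arithmetic means in practice. The variable names are hypothetical.

```shell
#!/bin/bash
x=$((4 + 20))
i=7
i=$((i + 1))            # the $ in front of a variable is optional inside (( ))
quotient=$((7 / 2))     # integer division discards the fraction
remainder=$((7 % 2))
power=$((2 ** 10))
echo "x=$x i=$i quotient=$quotient remainder=$remainder power=$power"
```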
bc
• If you really, really, really must do math in a bash
script, most of the time you must use bc
• The syntax is very peculiar
x=$(echo "3*8+$z" | bc)
Command-Line Arguments
• Many bash scripts need to read arguments from
the command line. The arguments are indicated
by special variables $0, $1, $2, etc.
• $0 is the name of the command itself
• The subsequent ones are the arguments
• To process a variable number of arguments, use
the shift built-in, which discards $1 and renumbers
the remaining arguments.
• The special variable $# is the number of
arguments (not counting the command name)
Example
#!/bin/bash
USAGE="Usage: $0 dir1 dir2 dir3 ...dirN"
if [ "$#" -eq 0 ]; then
echo "$USAGE"
exit 1
fi
while [ $# -gt 0 ]; do
echo "$1"
shift
done
More Useful Example
• while plus case
while [ $# -gt 0 ]; do
case "$1" in
-v) verbose="on";;
-*)
echo >&2 "USAGE: $0 [-v] [file]"
exit 1;;
*) break;;   # default
esac
shift
done
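The same while plus case pattern can be wrapped in a function so that it is easy to try with different argument lists. This is only a sketch: parse_args, verbose, and files are hypothetical names, and inside a function $1 and $# refer to the function's own arguments.

```shell
#!/bin/bash
# parse_args is a hypothetical helper; it sets the globals verbose and files.
parse_args() {
    verbose="off"
    while [ $# -gt 0 ]; do
        case "$1" in
            -v) verbose="on" ;;
            -*) echo >&2 "USAGE: $0 [-v] [file ...]"
                return 1 ;;
            *)  break ;;        # first non-option argument: stop parsing
        esac
        shift
    done
    files=("$@")                # whatever is left after the options
}

parse_args -v a.txt b.txt
echo "verbose=$verbose, ${#files[@]} file(s): ${files[*]}"
```

Calling parse_args with an unknown option such as -x prints the usage message to standard error and returns a nonzero status.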
Functions
function name() {
statement
statement
statement
VALUE=integer
return $VALUE
}
• the keyword function is optional in newer versions
of bash. The parentheses are always left empty.
• Function definitions must precede any invocations.
Function Arguments
• Function arguments are supplied by the caller, after
the function name. Inside the function they are
referenced like command-line arguments ($1, $2, ...).
#!/bin/bash
function writeout() {
echo "$1"
}
writeout "Hello World"
Variables
• Variables set in the function are global to
the script!
#!/bin/bash
myvar="hello"
myfunc() {
myvar="one two three"
for x in $myvar
do
echo $x
done
}
myfunc
echo $myvar $x
Making Local Variables
• We can use the keyword local to avoid
clobbering our global variables.
#!/bin/bash
myvar="hello"
myfunc() {
local x
local myvar="one two three"
for x in $myvar ; do
echo $x
done
}
myfunc
echo $myvar $x
Return Values
• Strictly speaking, a function returns only its exit
status.
• The returned value must be an integer from 0 to 255
• You can get the value with $?
• e.g.
myfunc $1 $2
result=$?
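Because the return value is only an exit status, a common workaround for handing back data is to print the result and capture it with command substitution. The sketch below shows both styles. The function names is_even and add are hypothetical.

```shell
#!/bin/bash
# is_even and add are hypothetical helper names.
is_even() {
    # The status of the last command becomes the function's exit status:
    # 0 (success) when $1 is even, 1 otherwise.
    [ $(( $1 % 2 )) -eq 0 ]
}

add() {
    # A sum can exceed 255, so print it instead of returning it.
    echo $(( $1 + $2 ))
}

is_even 4
status=$?                 # 0 here, because 4 is even

if is_even 7; then parity="even"; else parity="odd"; fi

sum=$(add 200 300)        # capture the printed result
echo "status=$status parity=$parity sum=$sum"
```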
String Operations
• Bash has a number of built-in string operations.
• Concatenation
• Just write them together (literals should be quoted)
newstring=$oldstring".ext"
• String length
${#string}
• Extract substring
• Strings count from 0—first character is numbered 0
• Extract from pos to the end
${string:pos}
• Extract len characters starting at pos
${string:pos:len}
Clipping Strings
• It is very common in bash scripts to clip off part of
a string so it can be remodeled.
• Delete shortest match from front of string
${string#substring}
• Delete longest match from front of string
${string##substring}
• Delete shortest match from back of string
${string%substring}
• Delete longest match from back of string
${string%%substring}
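The length, substring, and clipping operators can be tried on a single path. The path below is hypothetical. Note that the part after #, ##, %, or %% is a shell wildcard pattern, so * can stand for any run of characters.

```shell
#!/bin/bash
path="/home/user/data/archive.tar.gz"   # hypothetical path

length=${#path}
first_five=${path:0:5}
from_six=${path:6}

file=${path##*/}        # delete longest match of */ from the front
dir=${path%/*}          # delete shortest match of /* from the back
stem=${file%%.*}        # delete longest match of .* from the back
ext=${file#*.}          # delete shortest match of *. from the front
last_ext=${file##*.}    # delete longest match of *. from the front

echo "length=$length first_five=$first_five from_six=$from_six"
echo "file=$file dir=$dir stem=$stem ext=$ext last_ext=$last_ext"
```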
Arrays
• Array variables exist but have a somewhat unusual
syntax.
• Arrays are zero based so the first index is 0
• Initialize arrays with a list enclosed in parentheses
• Obtaining the value of an item in an array requires
use of ${}
val=${arr[$i]}
${arr[@]} # All of the items in the array
${#arr[@]} # Number of items in the array
(@ or * work in both of these)
${#arr[0]} # Length of item zero
Example
my_arr=(1 2 3 4 5 6)
for num in ${my_arr[@]}; do
echo $num
done
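One detail worth seeing in action is quoting: when elements may contain spaces, "${arr[@]}" in double quotes keeps each element in one piece. A small sketch with hypothetical data:

```shell
#!/bin/bash
fruits=("apple" "kiwi" "blood orange")   # hypothetical data
fruits+=("fig")                          # append an element

count=${#fruits[@]}
first=${fruits[0]}
first_len=${#fruits[0]}

# Quoted: each element is one loop item, even if it contains a space.
quoted=0
for f in "${fruits[@]}"; do
    quoted=$((quoted + 1))
done

# Unquoted: "blood orange" is split into two words.
unquoted=0
for f in ${fruits[@]}; do
    unquoted=$((unquoted + 1))
done

echo "count=$count first=$first first_len=$first_len quoted=$quoted unquoted=$unquoted"
```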
Realistic Example
jpg_files=(`ls *jpg`)
for file in ${jpg_files[*]}; do
if [[ -n $file ]]; then
convert $file ${file%%".jpg"}.png
fi
done
Herefiles
• A herefile or here document is a block of text that
is dynamically generated when the script is run
CMD << Delimiter
line
line
Delimiter
Example
#!/bin/bash
# 'echo' is fine for printing single line messages,
# but somewhat problematic for message blocks.
# A 'cat' here document overcomes this limitation.
cat <<End-of-message
------------------------------------
This is line 1 of the message.
This is line 2 of the message.
This is line 3 of the message.
This is line 4 of the message.
This is the last line of the message.
------------------------------------
End-of-message
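The "dynamically generated" part can be seen by putting a variable inside the here document. In this sketch (the variable names are hypothetical) the first block expands $user, while quoting the delimiter in the second block passes the text through literally.

```shell
#!/bin/bash
user="alice"    # hypothetical value

# Unquoted delimiter: variables and command substitutions are expanded.
expanded=$(cat <<EOF
Hello $user
EOF
)

# Quoted delimiter: the text is taken literally.
literal=$(cat <<'EOF'
Hello $user
EOF
)

echo "$expanded"
echo "$literal"
```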
Regular Expressions
• Regular expressions are generalizations of the
wildcards often used for simple file commands.
• A regular expression consists of a pattern that
attempts to match text.
• It contains one or more of:
• A character set
• An anchor (to the line position)
• Modifiers
• Without an anchor or repeat it will find the
leftmost match and stop.
Regex: Character sets and modifiers
• The character set is the set of characters that
must be matched literally.
• Modifiers expand the potential matches.
• * matches any number of repeats of the pattern
before it (note that this is different from its use in
the shell) including 0 matches.
• ? matches 0 or 1 occurrence of the pattern before it
(also different from the shell wildcard).
• + matches one or more, but not 0, occurrences.
• . matches any single character, except newline
More Modifiers
• \ escapes the following character, meaning it is
to be used literally and not as a regex symbol.
• \ can also indicate nonprinting characters, e.g.
\t for tab.
• () group the pattern into a subexpression
• | pipe is or
• gray|grey is equivalent to gr(a|e)y
Regex: Ranges and Repetition
• [] enclose a set of characters to be matched.
• - indicates a range of characters (must be a
subsequence of the ASCII sequence)
• {n} where n is a non-negative integer, indicates exactly n
repetitions of the preceding character or range.
• {n,m} matches n to m occurrences.
• {n,} matches n or more occurrences.
Regex: Anchors and Negation
• ^ inside square brackets negates what follows it
• ^ outside square brackets means “beginning of
target string”
• $ means “end of target string”
• . Matches “any character in the indicated
position”
• Note: the “target string” is usually but not always
a line in a file.
Examples
• AABB* matches
• AAB
• AABB
• AABBBB
• But not
• AB
• ABB
• ABBBB
Examples (Cont.)
• [a-zA-Z] matches any letter
• [^a-z] matches anything except lower-case
letters
• .all matches call, ball, mall, and so forth, but
not all by itself, since . requires a character.
Also matches shall (since it contains hall).
• Regex patterns are said to be greedy since they
find a match with the most generous
interpretation of the pattern.
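These patterns can be checked directly in bash, whose [[ string =~ regex ]] operator uses extended regular expressions and matches anywhere in the string. In the sketch below, matches is a hypothetical helper function. Note that .all does not match the bare word all, because the dot requires a character in that position.

```shell
#!/bin/bash
# matches is a hypothetical helper: succeeds if regex $1 matches string $2.
matches() {
    [[ $2 =~ $1 ]]
}

results=""
for s in AAB AABBBB AB ABBBB; do
    if matches 'AABB*' "$s"; then
        results="$results$s:yes "
    else
        results="$results$s:no "
    fi
done
echo "$results"

# .all needs one character in front of "all".
matches '.all' "ball" && ball="yes"
matches '.all' "shall" && shall="yes"
matches '.all' "all" || bare_all="no"

# Anchors restrict where the match may occur.
matches '^.all$' "shall" || anchored_shall="no"
echo "ball=$ball shall=$shall bare_all=$bare_all anchored_shall=$anchored_shall"
```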
Extensions and Shortcuts
• Most shells and languages support some shortcuts:
• \w : [A-Za-z0-9_]
• \s : [ \t\r\n]
some flavors have a few more
rare whitespace characters
• \d : [0-9]
• \D : [^\d]
• \W : [^\w]
• \S : [^\s]
• NB [\D\S] is not the same as [^\d\s]; in fact it
matches anything. [^\d\s] matches a but not 1
Grep, Sed and Awk
• grep or egrep can be used with regular
expressions.
• sed is the stream editor. It is used to script
editing of files.
• awk is a programming language to extract data
and print reports.
grep examples
• grep "regex" filename
• The quotes are often needed; they keep the shell from
interpreting characters such as *, $, and | itself
• We assume Gnu grep on Linux; egrep is equivalent to
grep -E, which enables extended regular expressions
• grep matches anywhere in the line unless anchored
• grep '^root' /etc/passwd
• grep ':$' /etc/passwd
sed examples
• Sed operates on standard input and outputs to
stdout unless told otherwise. Thus you must
redirect.
• An option (-i in Gnu sed) will tell it to overwrite the
old file. This is often a mistake (if you do this, be sure
to debug the sed command carefully).
• sed 'command' < old > new
• Note hard quotes; best practice is to use them
• sed 's/day/night/' < old > new
• Remember, the pattern matches anywhere in the line,
not just whole words; day/night here changes
Sunday to Sunnight. Only the first match on each
line is replaced unless you append g.
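The substitution can be tried on a single line of text without touching any files. The sample sentence and variable names are hypothetical.

```shell
#!/bin/bash
line="Sunday is a long day"    # hypothetical input

# Without g, only the first match on the line is replaced.
first_only=$(echo "$line" | sed 's/day/night/')

# With g, every match on the line is replaced.
every=$(echo "$line" | sed 's/day/night/g')

# Anchoring the pattern limits it to the word at the end of the line.
end_only=$(echo "$line" | sed 's/ day$/ night/')

echo "$first_only"
echo "$every"
echo "$end_only"
```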
awk examples
• awk 'pattern {actions}' file
• Similar to C, awk "action" lines end with ;
• For each record (line) awk splits on whitespace.
The results are stored as fields and can be
referenced as $1, $2, etc. The entire line is $0,
NF is the number of fields in the line, and $NF
is the last field.
• awk 'pattern1 {actions} pattern2
{actions}' file
• awk '/Smith/' employees.txt
• awk '{print $2, $NF;}' employees.txt
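A small sketch of fields, NF, and patterns, using a few made-up records in place of employees.txt. The data and variable names are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample data standing in for employees.txt.
data="Smith Jane Engineering 52000
Jones Bob Sales 41000
Smith Al Support 38000"

# A /regex/ pattern selects lines; the action prints the second field.
smiths=$(echo "$data" | awk '/Smith/ {print $2}')

# NF is the number of fields; $NF is the last field itself.
field_counts=$(echo "$data" | awk '{print NF}')
last_fields=$(echo "$data" | awk '{print $NF}')

# The END block runs once, after the last line has been read.
total=$(echo "$data" | awk '{sum += $NF} END {print sum}')

echo "$smiths"
echo "$field_counts"
echo "$last_fields"
echo "total=$total"
```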
Some Resources
• Linux Shell Scripting Tutorial:
http://bash.cyberciti.biz/guide/Main_Page
• How-To:
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
• Advanced Scripting Tutorial
http://tldp.org/LDP/abs/html/
More Resources
• Regex Tutorials
http://www.regular-expressions.info/
http://www.zytrax.com/tech/web/regex.htm
• sed tutorial
http://www.grymoire.com/Unix/Sed.html
• awk tutorial
http://www.grymoire.com/Unix/Awk.html
(but note that we have gawk on Linux)