BASH
Michael Musick
CSC 415: Programming Languages

The History of bash

Origins

The origin of bash, the Bourne Again shell, named in tribute to Steve Bourne's shell, is a fairly simple and linear story. Brian Fox, who wrote the first versions of bash, officially released it on January 10, 1988. He continued to improve the shell until 1993, when Chet Ramey, who had been working with Fox since 1989, took over. In 1995 Ramey began work on release 2.0, which was released in December 1996. To this day Chet Ramey still maintains bash, which is now up to release 4.3. The shell was created for the GNU project, whose purpose was to create a UNIX-compatible operating system that would replace all commercial UNIX utilities with freely distributable ones. Thanks to this connection with the GNU project, all versions of bash from 0.99 to the most recent releases have been available to the public free of cost. (Newham 26)

Current Events

Recently a bug was found in the bash shell that is said to be just as bad as, if not worse than, the Heartbleed bug. It has been aptly named the Shellshock bug. The versions stated to be affected are all versions up to 4.3, which is roughly 25 years' worth of bash versions. What the bug does is dangerous: according to an article by Troy Hunt, it "allows remote attackers to execute arbitrary code", and he later refers to it as a "code injection attack". In other words, an attacker can execute a shell command of their choosing on any vulnerable machine. The ramifications of a bug like this are nearly limitless, and the damage that can be done depends on what the attacker wants to do. Fortunately, a patch for this glaring bug was released almost immediately after its discovery.
(Hunt)

Names, Bindings, and Scopes

When talking about bash and how it works, it is best to start with its variables, since they are one of the more peculiar traits of most interpreted languages. Much like in JavaScript, the rules for declaring a variable are very relaxed. A declaration looks like this:

    COLOR="black"
    NUMBER="9"

To see what is stored inside a variable, prefix its name with $ to get its content, for example:

    echo "This is a string: $COLOR"
    echo "And this is a number: $NUMBER"

This is basic material, but it shows how simple the syntax of bash can be. Even JavaScript marks the end of a statement with a semicolon, whereas bash has a different use for semicolons, which is covered later. Returning to variables, note how the variable was declared: the form name=value, with no spaces, is required. Putting a space on either side of the equals sign results in an error. In the examples above, all upper case letters were used for the names. This is not required, but it is a common convention. Variables in Bash are case sensitive, so COLOR and color are different variables, and using upper case consistently helps readability.

Reserved variables in Bash come in two types: Bourne shell variables and Bash reserved variables. The difference is that Bourne shell variables are reserved in most shells and treated the same way in each, whereas Bash reserved variables are used only by Bash and are not treated as reserved in other shells. Examples of reserved Bourne shell variables are HOME, which holds the current user's home directory, and CDPATH, which is used as a search path for the cd built-in command.
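The assignment and expansion rules described so far can be tried out with a short script. This is only a minimal sketch: the lowercase variable color and the helper names color_line and number_line are made up for illustration and do not come from any of the cited guides.

```shell
#!/bin/bash
# Assignment uses name=value with no spaces around the equals sign.
COLOR="black"
NUMBER="9"

# Names are case sensitive, so this is a different variable from COLOR.
color="white"

# Inside double quotes, $NAME expands to the stored value.
color_line="This is a string: $COLOR"
number_line="And this is a number: $NUMBER"

echo "$color_line"
echo "$number_line"
echo "Lowercase color holds: $color"

# HOME is a Bourne shell variable normally set for the current user.
echo "Home directory: $HOME"
```

Writing COLOR = "black" with spaces would instead make the shell look for a command named COLOR, which is why the spacing rule matters.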
As for Bash reserved variables, the list is much larger. Two examples are BASH_VERSION, which holds the version number of the current instance of bash, and EUID, which holds the effective numeric user ID of the current user. These are just a few of the basic Bash and Bourne shell reserved variables.

Scoping in Bash is very simple: a variable that is just assigned is treated as a global variable, regardless of where the assignment appears. Inside a function this can be changed by adding local to the declaration. For example, "var=23" creates a global variable by default, while "local var=23" inside a function creates a variable that is local to that function. Bash has no global keyword; because variables default to global, none is needed, although Bash 4.2 and later accept "declare -g var=23" to create a global variable explicitly from inside a function. This is important to remember because in Bash, before a function is called, all variables declared within the function are invisible outside the body of the function, not just the ones that were declared as local. This approach to scoping makes for a very flexible language.

Data Types

Bash has only four kinds of data: strings, integers, constants, and arrays. All variables in Bash are treated as strings unless the context in which they are used calls for something else, such as arithmetic. This is a common feature of interpreted languages. Bash can also fix the kind of a variable in advance through the declare built-in, which is a large part of working with data types in Bash. As previously mentioned there are only four data types in this language, but declare gives a user many more options for what they want to do with their variables.
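The scoping rules from the previous section and the effect of declare can be sketched together in a few lines. The function name show_scope and the variable names are hypothetical and chosen only for this illustration.

```shell
#!/bin/bash
# show_scope is a hypothetical function used only for this sketch.
show_scope() {
    local inner=23   # local: visible only while show_scope runs
    outer=42         # no 'local': becomes global once the function has run
}

before_call="${outer:-unset}"    # the function has not been called yet
show_scope
after_call="${outer:-unset}"     # now set by the function
inner_outside="${inner:-unset}"  # the local variable never leaks out

echo "outer before call: $before_call"
echo "outer after call:  $after_call"
echo "inner outside:     $inner_outside"

# declare -i gives a variable the integer attribute, so the right-hand
# side of an assignment is evaluated as arithmetic.
declare -i total=5
total="total + 10"
echo "total: $total"
```

Without the integer attribute, the last assignment would simply store the text "total + 10" as a string.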
Using declare is similar to the previously mentioned way of declaring local variables. The following example declares a variable as an integer:

    declare -i num1=12

This gives the variable the integer attribute. It is not a commonly used practice, since it limits what a programmer can do with the variable. Strings are, ironically, the most common data type in Bash, simply because every variable that has not been declared otherwise with declare is treated as a string until it is used in a context that needs something else, and the only other options are array, constant, or integer.

There are multiple ways to create an array in Bash. The first is simply assigning values to individual elements:

    Unix[0]='Ubuntu'
    Unix[1]='Debian'

The other way is to initialize the array during declaration:

    declare -a example=(value1 value2 value3 value4)

An important thing to remember is that an element containing white space needs to be enclosed in quotes, like this:

    declare -a example=('value one' 'value two' 'value three' 'value4')

Elements without white space, such as value4, do not strictly need the quotes, but quoting every element consistently does no harm. The syntax to print the elements of an array is:

    echo ${example[@]}

This prints all of the elements of the array. To see a particular element, replace the @ with its numeric index, remembering that array indexes in Bash start at 0. The number of elements in an array is given by ${#example[@]}.
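The array operations above can be combined into one small script. This is a sketch that reuses the array names from the text; the loop variable item is an assumption added for illustration.

```shell
#!/bin/bash
# Element-by-element assignment.
Unix[0]='Ubuntu'
Unix[1]='Debian'

# Initialization at declaration; quotes keep 'value one' as a single element.
declare -a example=('value one' 'value two' value3)

echo "All elements:  ${example[@]}"
echo "First element: ${example[0]}"
echo "Length:        ${#example[@]}"

# Quoting "${example[@]}" keeps each element intact while looping.
for item in "${example[@]}"; do
    echo "item: $item"
done
```

Leaving out the double quotes in the loop would split 'value one' into two separate words, which is the most common surprise when working with arrays that contain white space.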
This is just a small sample of the ways to manipulate arrays in Bash. There are also commands for deleting arrays, such as the unset built-in, and there is a simple way to concatenate two arrays into a single array:

    Unixexample=("${Unix[@]}" "${example[@]}")

The new array is called Unixexample and contains the elements of both arrays, starting with whichever array appears first in the command; in this case the first elements come from the Unix array. Among the data types, arrays are the most complex part of the language. Multi-dimensional arrays are currently not supported by Bash. There have been claims that it is possible to simulate a multi-dimensional array, but the feature is not officially supported.

Declaring a constant in Bash is much simpler. All a user has to do is use declare:

    declare -r example="sample"

This creates the variable example with the string value sample and marks it read-only, so it cannot be modified by the script or by any of the functions within it.

The only data types not covered yet are the most basic ones, integers and strings. An integer can be declared in either of the following ways:

    example1=19
    declare -i example2=23

The first simply assigns a numeric value to the variable named example1, and the second uses declare. Both are viable options, although the first is the more frequently used. Strings are the easiest data type to manage in Bash, since all variables are initially treated as strings. The following is an example of declaring a string in Bash.
    SAMPLESTRING="this is a string"

There is no declare option that sets a variable's type to string, since string is already the default type for all variables in Bash.

Expressions and Assignment Statements

Operator precedence in Bash is straightforward and easy to remember, even for someone who has only completed a basic algebra class. Multiplication and division come before addition and subtraction, while the compound logical operators, && (and), || (or), -a (an alternative and operator used inside test brackets), and -o (an alternative or operator), have low precedence. Operators of equal precedence are usually evaluated left to right. The list below gives the order of precedence for operators in Bash from highest to lowest.

1. var++ var-- the post-increment and post-decrement operators
2. ++var --var the pre-increment and pre-decrement operators
3. ! ~ the negation operators (invert the sense of the following operator)
4. ** the exponentiation operator, raises a number to the power n
5. * / % the multiplication, division, and modulo operators
6. + - the addition and subtraction operators
7. << >> left and right shift
8. -z -n unary comparison, tests whether a string is or is not null
9. -e -f -t -x etc. unary comparison for file tests
10. < -lt > -gt <= -le >= -ge compound comparison for strings and integers
11. -nt -ot -ef compound comparison for file tests
12. == -eq != -ne equality and inequality test operators for strings and integers
13. & (AND) bitwise
14. ^ (XOR) exclusive OR, bitwise
15. | (OR) bitwise
16. ?: ternary operator (condition?result-if-true:result-if-false)
17. = assignment (not to be confused with the equality test)
18. *= /= %= += -= <<= >>= &= combination assignments (times-equal etc.)
19. , comma (links a sequence of operations)
(Cooper)

With this list and the guidelines mentioned above, operator precedence in Bash should be fairly easy to understand. Most of the operators explain themselves. The more confusing ones, such as -lt, -ge, and the other compound comparison operators, are just abbreviations: -lt is less than, -gt is greater than, -ge is greater than or equal to, and so on. File test operators are a bit stranger and are used to check whether a file has a certain property; each test returns true if the file has that property. Each file test is a separate unary operator, so testing several properties means writing several tests and joining them with a logical operator such as && or -a; they cannot be combined into a single option. For example, -x checks whether the file has execute permission, and -f checks whether the file is a regular file and not a directory or device file. (Cooper)

Two features that are commonplace in less archaic languages have yet to be implemented in Bash: operator overloading and type conversions. It is easy to understand why type conversion has not been implemented, since with such a limited number of data types there is little need for it. Looking back at the operator precedence list, it is easy to see that assignment statements work much as they do in most other languages. There are unary assignment operators, such as post-increment and pre-increment, as well as compound assignments such as +=, -=, and *=, which make basic arithmetic simple.
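A few of the precedence rules and comparison operators discussed above can be checked directly. This is a rough sketch under stated assumptions: the variable names are invented for the example, and the root directory / is assumed to exist so that the file tests have something to examine.

```shell
#!/bin/bash
# Arithmetic is evaluated inside $(( )); * binds tighter than +.
a=$(( 2 + 3 * 4 ))       # 2 + 12
b=$(( (2 + 3) * 4 ))     # parentheses override precedence
c=$(( 2 ** 3 ** 2 ))     # ** groups right to left: 2 ** 9

# Compound and unary assignment operators.
count=5
(( count += 3 ))
(( count++ ))

# Integer comparison with -lt inside test brackets.
if [ "$a" -lt "$b" ]; then smaller="a"; else smaller="b"; fi

# Each file test is its own operator; two tests are joined with &&.
target="/"
if [ -d "$target" ] && [ ! -f "$target" ]; then
    kind="directory"
else
    kind="other"
fi

echo "a=$a b=$b c=$c count=$count smaller=$smaller kind=$kind"
```

The double-parenthesis form is used for the assignments because the operators in the list apply to arithmetic evaluation, not to plain string assignment.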
Mixed-mode assignment is not needed and is therefore not implemented in Bash; given the limited data types available, it is not in high demand among Bash users.

Statement Level Control Structures

It was stated previously that the semicolons that have been strangely absent from this language so far would come into play. This is where they finally get some attention. The best place to start is with the basics, if-else statements. The structure of if-else statements is somewhat different in Bash. It is best to start with an example and then explain its structure.

    if [ $1 -eq 2 ]; then
        return 1
    elif [ $(($1 % 2)) -eq 0 ]; then
        return 0
    else
        for (( d=3; d*d <= $1; d+=2 )); do
            if [ $(($1 % $d)) -eq 0 ]; then
                return 0
            fi
        done
        return 1
    fi

This example is part of a script that helps to determine whether a chosen number is prime. Because it uses return, the fragment belongs inside a function body. This is where semicolons come into use in Bash: the semicolon ends the condition, which is located within the brackets, so that then can follow on the same line. The spaces around the brackets are required. The elif is Bash's version of else if; it again requires brackets for the condition and a semicolon to end the conditional part of the statement. The fi at the end closes the if statement.

This control structure also contains another popular control structure, the for loop. As in most other languages, a for loop starts with the for keyword. Within the double parentheses are the initialization, the condition, and the step, which is almost always an increment or decrement. What is less familiar from other languages is that a semicolon then ends the loop header and the do keyword introduces the commands to run within the loop. The keyword done marks the end of the for loop.
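To show the if, elif, and for constructs working together, the fragment can be wrapped in a complete function. This is a sketch, not the original script: the name is_prime, the guard for numbers below 2, and the list of candidates are assumptions, and the function returns 0 for a prime number, which follows the usual shell convention of 0 meaning success and is the reverse of the return values in the fragment above.

```shell
#!/bin/bash
# is_prime is a hypothetical name; returns 0 (success) when $1 is prime.
is_prime() {
    local n=$1 d
    if [ "$n" -lt 2 ]; then
        return 1                 # 0 and 1 are not prime
    elif [ "$n" -eq 2 ]; then
        return 0                 # 2 is the only even prime
    elif [ $(( n % 2 )) -eq 0 ]; then
        return 1                 # other even numbers are not prime
    else
        # Try odd divisors up to the square root of n.
        for (( d = 3; d * d <= n; d += 2 )); do
            if [ $(( n % d )) -eq 0 ]; then
                return 1
            fi
        done
        return 0
    fi
}

# Collect the primes from a small list of candidates.
primes=""
for candidate in 1 2 9 17 21 23; do
    if is_prime "$candidate"; then
        primes="$primes $candidate"
    fi
done
echo "Primes found:$primes"
```

Because the function reports its answer through its exit status, it can be used directly as the condition of an if statement, with no brackets needed.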
The next control structure is the while loop. The following example shows the syntax for a while loop in Bash.

    COUNTER=0
    while [ $COUNTER -lt 10 ]; do
        echo The counter is $COUNTER
        let COUNTER=COUNTER+1
    done

Like the if statement, the while loop uses brackets for its condition, and a semicolon ends the condition portion. Like the for loop, it has the do keyword to introduce the body of the loop and the done keyword to close it; the loop exits once the condition is no longer true.

The next control structure that can be used in bash is the case statement. The following example, which shows the syntax of the case statement, is taken from the Bash Guide for Beginners at tldp.org.

    case $space in
    [1-6]*)
        Message="All is quiet."
        ;;
    [7-8]*)
        Message="Start thinking about cleaning out some stuff. There's a partition that is $space % full."
        ;;
    9[1-8])
        Message="Better hurry with that new disk... One partition is $space % full."
        ;;
    99)
        Message="I'm drowning here! There's a partition at $space %!"
        ;;
    *)
        Message="I seem to be running with a nonexistent amount of disk space..."
        ;;
    esac

There is more to their example than this, but much of it is not relevant to the syntax of the case statement. Case statements are similar to if statements but can be more specific in their structure; this snippet checks disk usage and sets a warning message that depends on how much space is left. The syntax for case statements in Bash is unusual, but semicolons still play an important role: a double semicolon terminates each clause. The keyword case opens the statement and, much like with the if statement, the keyword that ends it is just case spelled backwards, esac.

The final control structure covered in bash is the select construct. The syntax for a select statement is as follows.
    select WORD [in LIST]; do RESPECTIVE-COMMANDS; done

The main use of the select construct is to present a list for a user to choose from. The user selects a word, or option, from the list; once the selection is made, the respective commands are executed, and the list either repeats or the select construct terminates, depending on what the respective commands are. Control structures like these help expand the versatility of Bash, which, given its limited capabilities, is something this language needs a great deal.

Subprograms

Subprograms are a very simple concept in Bash. They are functions that are created within a bash script and receive their own parameters whenever they are called. Subprograms are essential for larger programs. The following is a short sample of a hello world program using subprograms.

    #!/bin/bash
    function quit {
        exit
    }
    function e {
        echo $1
    }
    e Hello
    e World
    quit

(Garrels)

This script creates a function quit and a function e. It then reaches the main portion of the program, which calls the function e and passes the string "Hello" into $1, and then calls it again with the string "World" in $1. This prints Hello and World on the screen, each on its own line, and then the quit function is called, which exits the program. This is a very basic example of how subprograms are used. Subprograms also make recursion possible, which larger and more complex programs and scripts often rely on. The code sample also demonstrates how values are passed to subprograms in Bash. It may sound like a complex concept, but the arguments are simply listed after the function name in the call and appear inside the function as the positional parameters $1, $2, and so on.

Other Issues of Bash

One of the major issues with Bash is that it shows its age in comparison to other languages. There is no support for abstract data types, encapsulation constructs or concurrency.
Because Bash runs entirely at the command prompt, there is also no support for object-oriented programming, exception handling, or event handling. The archaic structure of the language means that it lacks a lot of modern features that most users take for granted. Putting that aside, there are not many other issues that need to be addressed with the language.

Evaluation

The readability and writability of Bash go hand in hand with the syntax of the language. The readability is not as bad as it may seem, but given the unusual structure of the language it can be difficult to read through and understand a Bash script the first time. The same goes for writability: the language has some very strange rules for how it is structured compared to most programming languages. Once a user gets used to reading Bash scripts it slowly gets easier, but that is a small comfort given how difficult it is to get started. As for reliability, putting Shellshock aside, the language has not had many major issues in the past 25 years. The code executes efficiently and is effective at doing what it is needed for. As for cost, all of the software is free to use thanks to the GNU project, which leaves only the cost of training people to use the language, so using Bash is extremely affordable.

Bibliography

Bezroukov, N. (2014, February 19). Softpanorama. Retrieved from Bash Control Structures: http://www.softpanorama.org/Scripting/Shellorama/control_structures.shtml
Cooper, M. (2014, March 10). Advanced Bash-Scripting Guide. Retrieved from The Linux Documentation Project: http://www.tldp.org/LDP/abs/html/index.html
Garrels, M. (2008, December 27). Bash Guide for Beginners. Retrieved from The Linux Documentation Project: http://www.tldp.org/LDP/Bash-Beginners-Guide/html/index.html
Hunt, T. (2014, September 25). Troyhunt.com. Retrieved from Everything you need to know about the Shellshock Bash bug: http://www.troyhunt.com/2014/09/everything-you-need-to-know-about.html
Mikkey, M. G. (2014, July 27). The Linux Documentation Project. Retrieved from BASH Programming Introduction HOW-TO: http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html#toc8
Newham, C. (2009). Learning the bash Shell, 3rd Edition. O'Reilly Media.
Ramey, C. (2014, September 24). BASH. Retrieved from The GNU Bourne Again Shell: http://tiswww.case.edu/php/chet/bash/bashtop.html