Unix/Linux basics for Biolinux

Table of Contents

1. What is Linux
    1.1 A brief history of Linux
    1.2 Free Software pre-Linux
    1.3 The kernel
    1.4 1991, a fateful year
    1.5 Linux is introduced
    1.6 Linux, at first, not for everybody
    1.7 Linux Today
2. SSH (Logging in/out)
    2.1 The "Bash" shell
    2.2 Typing Tricks
3. Moving around / Basic commands
    3.1 Command syntax
    3.2 Figuring out where you are
    3.3 Changing to another directory
    3.4 Seeing what's there
4. Files and Directories
    4.1 Unix File Names
    4.2 Creating Files
    4.3 Copying Files
    4.4 Moving (Renaming) Files
    4.5 Viewing Files
    4.6 Searching for something in a File
    4.7 Deleting Files
    4.8 Creating Directories
    4.9 Deleting Directories
5. Introduction to File Ownership/Permissions
    5.1 File Ownership
    5.2 File Permissions
    5.3 Representing File Permissions
        5.3a Symbolic Representation
        5.3b Numeric Representation
6. Obtaining Help
    6.1 The "man" command
    6.2 The "info" command
    6.3 The "--help" option
    6.4 Using “apropos” to find commands
    6.5 Biolinux applications documentation
7. Using Pipes
    7.1 General Approach
    7.2 Some Examples
8. TAR and GZIP
    8.1 TAR
    8.2 GZIP
    8.3 Using TAR and GZIP together
9. Linux Filesystem
10. A UNIX Command Summary

1. What is Linux?

Linux is an operating system that evolved from a kernel created by Linus Torvalds when he was a student at the University of Helsinki. Most people have a general idea of what Linux is, but for both political and practical reasons it needs to be explained further. To say that Linux is an operating system means that it's meant to be used as an alternative to other operating systems like MS-DOS, the various versions of MS Windows, Mac OS, Solaris and others. Linux is not a program like a word processor and is not a set of programs like an office suite.

1.1 A brief history of Linux

When Linus Torvalds was studying at the University of Helsinki, he was using a Unix-like operating system called 'Minix'.
Linus and other users sent requests for modifications and improvements to Minix's creator, Andrew Tanenbaum, but he felt that they weren't necessary. That's when Linus decided to create his own operating system that would take into account users' comments and suggestions for improvements.

1.2 Free Software pre-Linux

This philosophy of asking for users' comments and suggestions and using them to improve computer programs was not new. Richard Stallman, who worked at the Massachusetts Institute of Technology, had been advocating just such an approach to computer programming and use since the early 1970's. He was a pioneer in the concept of 'free software', always pointing out that 'free' means 'freedom', not zero cost. Finding it difficult to continue working under conditions that he felt went against his concept of 'free software', he left MIT in 1984 to work on the GNU project. The goal of GNU was to produce software that was free to use, distribute and modify. Linus Torvalds' goal in 1991 was basically the same: to produce an operating system that took into account user feedback.

1.3 The kernel

We should point out here that the focal point of any operating system is its 'kernel'. Without going into great detail, the kernel is what tells the processor, the big chip that controls your computer, to do what you want the program that you're using to do. To use a metaphor, if you go to your favorite Italian restaurant and order 'Spaghetti alla Bolognese', this dish is like your operating system. There are a lot of things that go into making that dish, like pasta, tomato sauce, meatballs and cheese. Well, the kernel is like the pasta. Without pasta, that dish doesn't exist. You might as well find some bread and make a sandwich. On the other hand, a plate of just pasta is fairly unappetizing. Without a kernel, an operating system doesn't exist. Without programs, a kernel is useless.

1.4 1991, a fateful year

In 1991, conditions were ideal for the creation of Linux.
In essence, Linus Torvalds had a kernel but no programs of his own, while Richard Stallman and GNU had programs but no working kernel. Read the two men's own words about this:

    Linus: "Sadly, a kernel by itself gets you nowhere. To get a working system you need a shell, compilers, a library etc."

    RMS: "The GNU Hurd is not ready for production use. Fortunately, another kernel is available. [It is called] Linux."

So, by combining the necessary programs provided by GNU in Cambridge, Massachusetts with a kernel developed by Linus Torvalds in Helsinki, Finland, Linux was born. Due to the physical distances involved, the means used to get Linus' kernel together with the GNU programs was the Internet, then in its infancy. We can say then that Linux is an operating system that came to life on the Internet. The Internet would also be crucial in Linux's subsequent development as the means of coordinating the work of all the developers that have made Linux into what it is today.

1.5 Linux is introduced

Late in 1991, Linus Torvalds had his kernel and a few GNU programs wrapped around it so it would work well enough to show other people what he had done. And that's what he did. The first people to see Linux knew that Linus was on to something. At this point, though, he needed more people to help him. Here's what Linus had to say back in 1991:

    "Are you without a nice project and dying to cut your teeth on an OS you can try to modify for your needs?... This post might just be for you."

People all over the world decided to take him up on it. At first, only people with extensive computer programming knowledge were able to do anything with that early public version of Linux. These people started to offer their help. The version numbers of Linux were getting higher and higher. People began writing programs specifically to be run under Linux.
Developers began writing drivers so that different video cards, sound cards and other gadgets inside and outside your computer could be used with Linux. Nevertheless, throughout most of the first part of the 1990's Linux did not get out of the 'GURU' stage. GURU is a term that has evolved to mean anyone who has special expertise in a particular subject. That is, you had to have special expertise in how computers worked to be able to install Linux in those days.

1.6 Linux, at first, not for everybody

Other popular software companies sold you a CD or a set of floppies and a brief instruction booklet, and in probably less than half an hour you could install a fully working operating system on your PC. The only ability you needed was knowing how to read. Those companies had that intention when they sat down and developed their operating systems. Linus Torvalds didn't have that in mind when he developed Linux. It was just a hobby for him. Later on, companies like Red Hat made it their goal to bring Linux to the point where it could be installed just like any other operating system, by anyone who can follow a set of simple instructions, and they have succeeded. For some reason, though, Linux hasn't completely shaken off that 'Gurus only' image that it took on at the beginning. That is mostly the result of articles in the popular, quasi-technical press by writers whose experience with Linux has been quite limited.

1.7 Linux Today

Today, Linux is enjoying a favorable press for the most part. This comes from the fact that Linux has proven to be a tremendously stable and versatile operating system, particularly as a network server. When Linux is deployed as a web server or in corporate networks, its down-time is almost negligible. There have been cases of Linux servers running for more than a year without re-booting and then only being taken down for a brief period for routine maintenance. Its cost effectiveness has sold it more than anything else.
Linux can be installed on a home PC as well as a network server for a fraction of the cost of other companies' software packages. More reliability and less cost - it's ideal.

If you're reading this, you're obviously here to learn how to use Linux. Any learning experience means opening up to new ideas and new ways of doing things. As mentioned before, Linux is in the UNIX family of operating systems. UNIX is primarily designed to be used by professionals. You will have to learn some UNIX concepts in this lesson, but that doesn't mean that Linux is a professionals-only operating system. Quite the contrary. Most major versions of Linux are designed to be as user-friendly and as easy to install as any other operating system on the market today. Now that you know what Linux is and how good it is, there's one more thing we have to do - log in and start using it!

2. SSH (Logging in/out)

Provided you have a Biolinux account, a safe and secure connection can be made to a Biolinux server using a protocol called SSH, which stands for Secure SHell. However, in order to run graphical Linux applications remotely you also need to run something called X Windows. If you are logging into a Biolinux server for the first time, then you will need to configure X Windows on your PC so that it is tunneled through your SSH connection. This will enable you to securely run graphical applications available on the Biolinux server.
Instructions for doing this can be found at:

    http://watson-bios.grid.cf.ac.uk/pdf/ssh_tunnelling_X-Win32.pdf

2.1 The "Bash" shell

When you first connect to a Biolinux server your current working directory will be your home directory and you will see something like the following:

    ************* Welcome to your local Bio-Linux system *************
    http://envgen.nox.ac.uk

    The following software is available:

    ACT APE ARB ARTEMIS ATV BIOCONDUCTOR BLAST BLIXEM CLUSTAL(X|W) DOTTER
    EMBOSS FASTA FASTDNAML FORESTER GDE GENESPRING HAPPY HMMER JALVIEW
    MAXD MAXDLOAD MCL MRBAYES MSPCRUNCH MVIEW NETBLAST NJPLOT PAML
    PARTIGENE PEDRO PHYLIP PRIMER3 QTLCARTOGRAPHER R/BQTL R/QTL R/QTLSIM
    RASMOL READSEQ SEAVIEW SPLITSTREE STADEN STARS T-COFFEE TRACE2DBEST
    TREE-PUZZLE TRIBE-MCL WISE2 XMGRACE

    The following languages are available:

    C, C++, Perl/BioPerl, Python/BioPython, Ruby/BioRuby, Java/BioJava, R

    [manager@bioremediation manager]$

The SSH connection gives you access to something known as a “shell” or “terminal”, and the “$” sign at the bottom of the screen is known as the command prompt. A "shell" is a program which interprets commands, either typed in directly by the user, or contained in a file called a "shell script", which is a simple interpreted program. The equivalents in Windows would be "command processor" for shell, "COMMAND.COM" or "CMD.EXE" instead of bash, and ".BAT files" instead of shell scripts. The shell in Unix/Linux is much more powerful than the DOS terminal in Windows. Linux has a variety of different shells which all vary slightly, but the most popular is "bash". This is your current shell on Biolinux.

To run any Biolinux application simply type the executable name followed by an ampersand:

    artemis &

The ampersand is important as it tells the shell to run the application in the background, so that you keep control of the shell and are able to carry out other tasks.

To exit your shell and SSH connection:

1. Close all Biolinux graphical applications you may have running.
2. Type “exit” at the command prompt and press Enter.

2.2 Typing Tricks

When you're at the bash prompt, you can use the up- and down-arrow keys to recall previously typed commands. You can also press “Ctrl-R” and start typing part of another command to find the last command that contains the letters you are typing. Thus if you want to find the last change-directory command, type "[Ctrl-R]cd", and the command line will display the last "cd" command you typed.

If you start typing a filename or directory name, you can press “[Tab]” and bash will complete the file or directory name for you, assuming that such a file exists and is the only one that starts with the typed-in part. For example, if you type "ls br[Tab]", bash will complete the filename to "brushtopbm", if this file exists and is the only file starting with "br".

If you make a mistake at the command line and wish to interrupt, just type “Ctrl-C” to cancel.

To copy and paste in the shell simply highlight the text you wish to copy, and then, having positioned the cursor, press both mouse buttons at the same time to paste. This also works for graphical applications. Also, if you have a wheel mouse you can press the wheel to paste instead. A mouse for a Unix machine traditionally has three buttons instead of two. However, Biolinux machines should be configured by default to allow Windows users to emulate the missing third button by using one of the above methods.

Also see:

    man history
    man scp
    man ssh

3. Moving around / Basic commands

3.1 Command syntax

UNIX commands begin with a command name, often an abbreviation of the command's action (such as cp for "copy" or mv for "move"). Most commands accept "flags" and "arguments". A flag identifies some optional capability and begins with a hyphen. An argument is usually the name of a file, such as one to be read.
For example, the command line

    cat -n stuff

calls the cat program (to "concatenate" files). In this case, cat reads the file named "stuff" and displays it. The -n flag tells cat to number the lines in the display. The hyphen that precedes a flag is a special character used to distinguish flags from filenames. In the example above, the hyphen prevents cat from trying to display a file named "n".

Commands can contain other special characters as well. The shell interprets such characters, and sometimes replaces them with other values, before it passes the command with its flags and arguments to the program that actually executes the command.

Remember that uppercase and lowercase letters are not interchangeable. Thus if you tried to give the cat command by typing CAT, it would fail; there is no program named (uppercase) CAT.

3.2 Figuring out where you are

As previously stated, when you first connect to a Biolinux machine your current working directory will always be your home directory.

    pwd

Prints the current (working) directory, like so:

    $ pwd
    /home/user1

3.3 Changing to another directory

    cd [directory]

Changes the current working directory. To back up a directory, you'd do:

    cd ..

To back up two directory levels, you'd do:

    cd ../..

To change to a subdirectory in your current directory, you can just type the name of that subdirectory (remember to use tab to save typing the entire directory name):

    cd public_html

To change to some other directory on the system, you must type the full path name:

    cd /tmp

For a short cut to your home directory just type:

    cd

3.4 Seeing what's there

    ls [-options] [name]

List the current directory's contents.
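The navigation commands from Sections 3.2 and 3.3 can be tried together with ls in a short practice session. The following is only a sketch: it works inside a throwaway directory created with mktemp, and the names "projects", "blast" and "notes.txt" are invented for illustration rather than taken from a real Biolinux system.

```shell
# Build a small scratch tree so the commands have something to act on.
# All directory and file names below are made up for this sketch.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/projects/blast"
touch "$sandbox/projects/notes.txt"

cd "$sandbox/projects/blast"   # change directory using a full path name
pwd                            # print where we are now
cd ..                          # back up one level, into "projects"
ls                             # shows "blast" and "notes.txt"
cd ../..                       # back up two levels, above the scratch tree
cd "$sandbox"                  # return to the top of the scratch tree
cd projects                    # a plain name is enough for a subdirectory
here=$(pwd)                    # capture the output of pwd in a variable
echo "$here"
```

Running pwd after each cd is a cheap way to confirm where a relative path such as ".." has actually taken you.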
By itself, ls just prints a columnar list of files in your directory:

    $ ls
    News                 letters              public_html
    admin                lynx_bookmarks.html  scripts
    bin                  mail                 tempo.el
    biz                  moo                  tiny.world
    bkup                 perl-mode.el         tmp
    html-helper-mode.el  perlmods

Here are a few other options that can format the listing or display additional information about the files:

    -a    list all files, including those starting with a "."
    -d    list directories like other files, rather than displaying their contents
    -k    list file sizes in kilobytes
    -l    long (verbose) format: show permissions, ownership, size, and modification date
    -t    sort the listing according to modification time (most recently modified files first)
    -X    sort the files according to file extension
    -1    display the listing in 1 column

Options can be combined; in this example, we show a verbose listing of files by last modification date:

    $ ls -lt
    total 94
    drwx------   6 kira kira  1024 Feb 28 19:42 admin
    drwx------   2 kira kira  1024 Feb 28 19:41 scripts
    drwx------  13 kira kira  1024 Feb 28 19:39 perlmods
    drwxr-xr-x   2 kira kira  1024 Feb 28 19:38 public_html
    drwx------   7 kira kira  1024 Feb 28 19:00 moo
    drwx------   2 kira kira  1024 Feb 26 18:45 mail
    -rw-------   1 kira kira    29 Feb 21 16:29 tiny.world
    drwx------   2 kira kira  1024 Feb 20 09:25 bin
    drwx------   2 kira kira  1024 Feb 14 01:29 tmp
    drwx------   2 kira kira  1024 Feb 12 13:40 letters
    drwx------   2 kira kira  1024 Feb  1 17:19 biz
    drwx--x--x   2 kira kira  1024 Jan  9 20:59 News
    drwx------   2 kira kira  1024 Nov 13 09:02 bkup
    -rw-r--r--   1 kira kira   592 Nov  8 18:12 lynx_bookmarks.html
    -rw-rw-r--   1 kira kira 23815 Oct 25 07:35 tempo.el
    -rw-rw-r--   1 kira kira 25802 Oct 25 07:35 perl-mode.el
    -rw-rw-r--   1 kira kira 27491 Oct 25 07:35 html-helper-mode.el

Also, you can specify a filename or directory to list:

    $ ls -l public_html/
    total 1
    -rwxr-xr-x 1 kira kira 436 Feb 28 19:52 index.html

The verbose listing shows the file permissions of a given file:

    -rwxr-xr-x

Directories have a "d" in the first column; regular files have a "-".
The remaining 9 characters indicate the owner, group, and world permissions of the file. An "r" indicates it's readable; "w" is writable, and "x" is executable. A dash in the column instead of a letter means that particular permission is turned off. So, "-rwxr-xr-x" is a plain file that is read-write-execute by the owner, and read-execute by group and world. "drwx------" is a directory that is read-write-execute by owner, and group and world have no permissions at all. This is explained in more detail later on in Section 5, 'Introduction to File Ownership/Permissions'.

4. Files and Directories

4.1 Unix File Names

Unix filenames cannot contain slashes, and should not contain spaces or other unusual characters. (Such characters are technically allowed, but they will make your life miserable, because referring to them requires a backslash in front of each one.) Also, file names are case sensitive, so if you create a script and upload it as "SCRIPT.TXT", and then attempt to run it as “script.txt”, it won't work, because Unix can't find "script.txt" in your directory.

4.2 Creating Files

You can create files by editing them with an editor, or ftp'ing them into your directory. Biolinux includes gedit, a very simple text editor. To use it, just type

    gedit newfile.txt &

There's also a way you can create an empty file without editing it: the touch command.

    touch filename

The main use of touch is to update the timestamp on a file; if you touch an existing file, it changes the last modification date of that file to now. However, if the file doesn't exist, touch creates an empty file. This may be useful for creating counter data files or output logs:

    touch outlog
    chmod 666 outlog

4.3 Copying Files

    cp [options] source dest

Copies the source file to the destination. The source file remains after this.
Options:

    -b    back up files that are about to be overwritten or removed
    -i    interactive mode; if dest exists, you'll be asked whether to overwrite the file
    -p    preserve the original file's ownership, group, permissions, and timestamp

4.4 Moving (Renaming) Files

    mv [options] source destination

Moves the source file to the destination. The source file ceases to exist after this. Options:

    -b    back up files that are about to be overwritten or removed
    -i    interactive mode; if dest exists, you'll be asked whether to overwrite the file

4.5 Viewing Files

    more filename
    less filename

These two commands allow you to page through a file. less is often preferred because it allows you to back up in a file. Both commands scroll through the file, starting at the first line, and displaying one page at a time. Press the space bar to continue to the next page. In less, pressing "b" instead of the space bar will back up to the previous page. Alternatively just use the up and down arrows on the keyboard. A variety of other scrolling and searching options exist; consult the man pages for a detailed listing.

    head [options] filename
    tail [options] filename

head displays lines from the beginning of a file. If no options are given, the default is 10 lines. The -n option (long form --lines) can be used to specify the number of lines to display. For example, to list the first 5 lines of a file, you'd do:

    head -n 5 filename

tail is similar, except it shows lines from the end of a file. Again, with no arguments, it shows the last 10 lines. Try:

    tail filename

4.6 Searching for something in a File

    grep [options] pattern filenames
    fgrep [options] string filenames

grep and fgrep search a file or files for a given pattern. fgrep (for "fixed-string grep", sometimes read as "fast grep") only searches for literal strings; grep is a full-blown regular-expression matcher. Some of the valid options are:

    -i    case-insensitive search
    -n    show the line number along with the matched line
    -v    invert match, e.g. find all lines that do NOT match
    -w    match entire words, rather than substrings

An example: if you wanted to find all instances of the word "Fred" in a file, case-insensitive but whole words only (e.g. don't match "Frederick"), and display the line numbers:

    $ grep -inw "Fred" fnord
    3:Fred
    9:Fred

There are a great many other options to grep. Check the man page for more information.

4.7 Deleting Files

    rm [options] filenames

Deletes the named file(s). Options:

    -f    force; delete files without prompting
    -i    interactive; prompts whether you want to delete the file
    -R    recursively delete all files in directories

4.8 Creating Directories

    mkdir dirname

Creates the named directory. If a full path is not given, the directory is created as a subdirectory of your current working directory. You must have write permissions on the current directory to create a new directory.

4.9 Deleting Directories

    rmdir dirname

Deletes the named directory. If the directory is not empty, this will fail. To remove a directory together with everything inside it, use "rm -rf dirname" instead, but BE CAREFUL! It deletes the whole directory tree without asking.

5. Introduction to File Ownership/Permissions

One of the key things that makes Unix different from Windows is its system of file ownership and permissions.

5.1 File Ownership

All files on a Unix system are owned by a user and a group. Users own files they create, and root can change the ownership of files through the "chown" utility. A file's group is initially set to the primary group of the user that creates it. Since users can belong to more than one group (and often do), a user can change the group of any file they own with the "chgrp" utility to any group they belong to. You can use the "groups" utility to list what groups you (and others) belong to.

Commands:

    chown [owner] [file]
    chmod [permissions] [file]
    chgrp [group] [file]
    groups [user]

5.2 File Permissions

File permissions control who can do what to a file. They're divided into four sections.
Firstly, there are “Extended Permissions”, which are for the more advanced user and thus beyond the scope of this section. Secondly, “User Permissions” control what the user who owns a file can do to it. Thirdly, “Group Permissions” control what members of the group that owns a file can do to it. And fourthly, “World or Other Permissions” control what everyone else can do to the file.

Each section is represented internally as three bits, which gives three on/off fields for each. For user, group, and other, the fields are:

    read - The user/group can read the contents of the file. For directories, this means the user/group can list its contents.

    write - The user/group can write to the file, changing the data in it however they want. For directories, it gives permission to create/remove files, so you can't actually DELETE a file unless you've got write perms to the directory it’s in.

    execute - The file can be run as a program by the user/group. For directories, it means you can cd into the directory.

5.3 Representing File Permissions

There are two ways to represent file permissions: symbolic and numeric.

5.3a Symbolic Representation

File modes are represented using letters and symbols familiar to humans. These are fairly easy to understand:

    chmod u+rwx,go+rx index.html

u is the owner's ("user's") permissions; g is the group permissions, and o is "other" or world permissions. The + sign turns the stated permissions on; a - sign turns them off. So, if you want to change a file so that it's group writable, but not readable or executable, you'd do:

    chmod g+w,g-rx filename

5.3b Numeric Representation

This is a bit more confusing, so I'll go into more detail here. As each field of the file's permissions is three bits, each field can be represented as a single octal (base 8) digit. The maximum base-10 value three bits can have is seven, which is also the maximum value for a single digit in base 8. Read perms (bit 3) are represented by a 4.
Write perms (bit 2) are represented by a 2. Execute perms (bit 1) are represented by a 1. To specify multiple permissions, just add their values together. For example, to specify read and write perms, you'd use a 6. To specify all of read, write, and execute, you'd use a 7. To specify none, use a 0.

Now, to actually specify these, you use four digits. The first specifies extended file perms (this is optional), and is almost always 0. The second specifies user permissions, the third group permissions, and the fourth world permissions. So to specify user read, write and execute, group read and execute, and world read, you'd use:

    0754

or simply

    754

References:

    info chmod
    info groups
    info chgrp

6. Obtaining help

6.1 The "man" command

All commands on the Linux system are described online in a collection of files called "man pages", because they were originally pages from the UNIX Programmer's Manual. Try it now - type in "man ls". The resulting page will describe the command, then describe every option, then give further details about the program, the author, and so on. This information is shown using the "less" command. For now, it is sufficient to know that you can use the up and down arrow, “PgUp” and “PgDn” keys to move around, and the “Q” key to quit.

6.2 The "info" command

Another source of online help is the "info" command. Some Linux commands may supply both "man" and "info" documentation. As a general rule, "info" documentation is more verbose and descriptive, like a user guide, while "man" documentation is more like a reference manual, giving lists of options and parameters, and the meaning of each. Try typing "info ls" now. The method for moving around in "info" is quite similar to "man": you can also use the arrows and “PgUp/PgDn” to move, and “Q” to quit. The main difference is that info pages can contain "menus" of links which lead to other pages.
To follow a link, move the text cursor to it with the arrow keys, and press Enter.

6.3 The "--help" option

Most (but not all) programs have a “--help” option which displays a very short description of their main options and parameters. Try typing "ls --help" to see. This will produce more than one screenful of information, so you'll have to use the terminal's scrollbar to see what was displayed. The "--help" information rarely says anything that isn't also found in the "man" documentation, so it's rarely needed, except in a tiny number of programs which do not supply any other form of documentation.

6.4 Using “apropos” to find commands

If you know the name of a command, you can view its man page at your terminal. If you don't know its name, you can use the “apropos” command, which searches through the header lines of the man pages for whatever keyword you supply and shows you a list of the man pages it finds. To use it, type

    apropos topic

where topic is a word describing what you want to know. The “apropos” command does not require an exact case match and will also find partial words.

6.5 Biolinux applications documentation

All documentation for Biolinux software applications can be found under the following path:

    /usr/software/documentation

In order to view a documentation file you'll need to first know what type of file it is. This is usually obvious from its extension. Below are the different file viewers you'll need for different file types.

For a text editor type:

    gedit documentation.txt &

For a pdf viewer type:

    xpdf documentation.pdf &

For a simple Web browser type:

    mozilla documentation.html &

7. Using Pipes

One of the features that makes UNIX so flexible is the ability to combine two or more utilities to perform a more complex function. Pipes are the mechanisms that allow you to combine several utilities together.
Pipes let you take the standard output of one command and use it as the standard input of another command. The vertical bar (|) is used to designate a pipe and is placed between the two commands.

7.1 General Approach

1. To use pipes, you must have at least two commands: a first command that sends its output to standard output, and a second command that reads its input from standard input. You can connect more than two commands with pipes. Usually the middle components of the pipeline are filters.
2. Type the commands in the order you want them executed, separating each command from the next with the vertical bar, |.
3. Press the return key, and your commands will be executed.

7.2 Some Examples

Copy the file 'namesfile.txt' from the '/tmp/workshop' directory into your home directory. Now, try the following command.

cat namesfile.txt | wc -l

Here we are using a pipe to combine the “cat” and “wc” utilities. (See "man wc".)

Now try the following and see if you can work out what each command should do before executing it. Use the man pages to look up the commands being used beforehand.

ls | wc -l
who | sort
cat namesfile.txt | sort
grep -i e namesfile.txt | less
ls -l /home | grep users | wc -l

8. TAR and GZIP

8.1 TAR

tar stands for “Tape ARchive”. It was originally designed for tape backups, but it can be used to create a tar file anywhere on the filesystem. tar creates one "tar file" (also known as a "tarball") out of several files and directories. A tar file isn't compressed; it's just a heap of files assembled together in "one container". So, the tar file will take up the same amount of space as all the individual files combined, plus a little extra. A tar file can be compressed by using gzip.

Here are some examples:

tar -xvf example.tar
Extract the contents of example.tar and display the files as they are extracted.
tar -cf backup.tar /home/ftp/pub
Create a tar file named backup.tar from the contents of the directory /home/ftp/pub.

tar -tvf example.tar
List the contents of example.tar to the screen.

Now, an example:

1. Log in to the linux box and make sure you are in the home directory.
2. Do an “ls -al” to see what is in the home directory. You should have a directory called "Desktop". “cd Desktop” into that directory and look around. Type "cd" to bring yourself back to the home directory. Again, confirm that you are in the home directory.
3. Type in the following command: "tar -cvf desktop.tar Desktop" (please pay attention to CaSe).
4. Do an “ls -al” again to see what is in the directory now. You should see a file called “desktop.tar”.
5. Now type in "mv Desktop Desktop.old" to rename the Desktop directory. Do an “ls -l” to confirm that the directory name has been changed and that there is no longer a directory named Desktop in the home directory.
6. “cd” into the Desktop.old directory and confirm that the files are the same as what you saw in #2.
7. “cd” back to your home directory (to move up (back) one directory, you can type in "cd ..").
8. Now type in the command "tar -xvf desktop.tar" to extract the contents of the archive.
9. Do an “ls -al” again. You should see the original Desktop directory. “cd” into it and make sure the files are in it.
10. Remove the tar file and the Desktop.old directory if everything worked properly ("rm desktop.tar", "rm -rf Desktop.old").

8.2 GZIP

gzip (GNU zip) is the standard UNIX compression tool. It's common to first "tar" files and then compress them afterwards using gzip. These files are normally given the extension .tar.gz to show that they are tar archives zipped up with gzip. You may also see the extension .tgz. An archive that is compressed with gzip can be opened with WinZip and PkZip. So, a file zipped up on a UNIX box can be unzipped on a Windows box.
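Before looking at the individual gzip commands, the overall tar-then-gzip workflow described above can be sketched in a throwaway directory. This is only an illustration: the directory name "sample" and the files inside it are made up, and "mktemp -d" is used so that nothing in your home directory is touched.

```shell
# Work in a temporary directory so no real files are affected.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: a directory holding two small files.
mkdir sample
echo "alpha" > sample/a.txt
echo "beta" > sample/b.txt

# Step 1: bundle the directory into a single, uncompressed tar file.
tar -cf sample.tar sample

# Step 2: compress the tar file. gzip replaces sample.tar with sample.tar.gz.
gzip sample.tar

# List the contents of the compressed archive without extracting it
# (z = read through gzip, t = list, f = archive file name follows).
listing=$(tar -ztf sample.tar.gz)
echo "$listing"
```

After running this, only sample.tar.gz remains in the temporary directory alongside the original "sample" directory; the intermediate sample.tar is gone.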
Here are some examples:

To compress a file using gzip, execute the following command:

gzip filename.tar

(where filename.tar is the name of the file you wish to compress). The result of this operation is a file called “filename.tar.gz”. Note: by default, gzip will delete the filename.tar file.

To decompress a file using gzip, execute the following command:

gzip -d filename.tar.gz

The result of this operation is a file called “filename.tar”. Note: by default, gzip will delete the filename.tar.gz file.

You can also decompress the file using the command:

gunzip filename.tar.gz

This is the same as using the “gzip -d” command. There are many options that you can use with gzip. See "man gzip" to learn more.

8.3 Using TAR and GZIP together

Rather than typing two commands, most Linux users will use gzip within tar like so:

tar -zcvf folder.tar.gz folder

creates a compressed archive called “folder.tar.gz” from a directory called “folder”, including all its subdirectories, and:

tar -zxvf folder.tar.gz

will decompress and extract the compressed archive “folder.tar.gz”.

9. Linux Filesystem

UNIX/Linux filesystems are very different from Windows. UNIX/Linux is organized in a hierarchy, starting with "slash." Slash is this: /

The rest of the directories and subdirectories continue down from slash. For example, the directory /etc/httpd/ is a subdirectory of the /etc (pronounced "etsie") directory, and /etc/ is a directory in slash.

Unlike Windows, which represents drives as letters (e.g., a:, c:, e:), Linux fits all devices, including hard drives, cdroms, and other storage media, into the normal directory structure. Devices are mounted onto the file system tree and, as a user, you might not be able to tell if a directory is actually located on your own hard drive or on a hard drive of another computer. Linux allows a great level of flexibility in terms of its file systems.
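To see mounting in practice, here is a minimal sketch using "df" (listed in the command summary in section 10), which reports which file system holds a given directory and where that file system is mounted. The output differs from machine to machine, so no particular device names are assumed here.

```shell
# Show the file system that holds the root directory and where it is mounted.
# -P selects the portable output format: one line per file system.
df -P /

# Do the same for the current working directory; it may live on a
# different file system (for example, a separately mounted /home).
df -P .

# Capture just the mount point for / (the last field of the second line).
root_mount=$(df -P / | awk 'NR==2 {print $NF}')
echo "root is mounted on: $root_mount"
```

If the two "df" lines show different entries in the "Mounted on" column, the current directory sits on a different file system from the root directory, even though both appear in the same tree.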
You can literally put in four drives and "mount" them to various directories in your tree. Here are some common mounts you should know about:

/
    (Called root.) This is equivalent to C:\ in the DOS/Windows world. You cannot run a Linux system without the root partition. All other file systems are mounted somewhere beneath it.

swap
    Unless you have massive amounts of memory, you are going to need some swap space. In Windows, you have Win386.swp; here, it's a separate drive partition.

/boot
    This contains the necessary stuff to start the machine, including the base kernel. This partition is optional, but usually present. You will not need to mess around in here if you have a running system.

/usr*
    This is the directory where global executables are stored. It can be read-only, if you want. Generally speaking, most software is installed here by default. All Biolinux applications are installed here within a subdirectory called ‘software’. Documentation and examples for these applications can be found within a subdirectory of software, that is, /usr/software/documentation

/dev
    This is the directory where all of your devices are. There are a few useful examples for you to know. /dev/hda is the first IDE hard drive. /dev/hdb is the second. /dev/sda would be the first SCSI drive, and /dev/sg0 is the first generic SCSI device (for example, a tape library's robotic arm).

/etc
    This is where most configuration files are stored. You will spend a lot of time in here if you are an administrator. Most files require "root" access to change.

/var
    Many of the system log files are here, as well as spools (mail, printer...).

/bin
    This directory is the home of binary executables. These include the common commands we have already learned, like ls, cat, gzip and tar.

/home*
    This is where all users' home directories are located.

* = Important to biolinux users

10. A Unix Command Summary

The following commands are grouped by function. Each has a man page describing it in full.
Access Control
exit - terminate a shell (see "man sh")
passwd - change login password

Communications
talk - talk to another logged-in user (full screen)
write - write to another logged-in user

Programming Tools
awk - pattern scanning and processing language
/bin/time - time a command
kill - kill a process
perl - popular script interpreter
sh - Bourne shell command interpreter

Documentation
apropos - locate commands by keyword lookup
info - start the InfoExplorer program (ADS only)
man - find manual information about commands
whatis - describe what a command is
whereis - locate source, binary, or man page for a program

Editors
gedit - easy to use text editor
emacs - screen-oriented text editor
pico - simple, screen-oriented text editor (easiest for new users)
sed - stream-oriented text editor able to perform basic text transformations
vi - full-screen text editor
vim - full-screen text editor ("vi-improved")

File and Directory Management
cd - change working directory
chmod - change the protection of a file or directory
cmp - compare two files
comm - select/reject lines common to two sorted files
compress - compress a file
cp - copy files
crypt - encrypt/decrypt files (CCWF only)
diff - compare the contents of two ASCII files
file - determine file type
grep - search a file for a pattern
gzip - compress or expand files
ln - make a link to a file
ls - list the contents of a directory
mkdir - create a directory
mv - move or rename files and directories
pwd - show the full pathname of your working directory
rm - delete (remove) files
rmdir - delete (remove) directories
sort - sort or merge files
tar - create or extract archives
uncompress - restore compressed file
uniq - report (or delete) repeated lines in a file
wc - count lines, words, and characters in a file

File Display and Printing
cat - show the contents of a file; catenate files
fold - fold long lines to fit output device
head - show first few lines of a file
more - display a file, one screen at a time
less - display a file, one screen at a time with scroll option
page - like "more", but prints screens top to bottom
tail - show the last part of a file
zcat - display a compressed file

File Transfer
ftp - transfer files between network hosts
scp - secure transfer of files between networked UNIX hosts

Miscellaneous
alias - define synonym commands
chsh - change default login shell
clear - clear terminal screen
echo - echo arguments
stty - set terminal options

News/Networks
netstat - show network status (on UTS, /usr/sbin/netstat)
ssh - secure-shell version of rsh
telnet - run Telnet to log in to remote host

Process Control
bg - put suspended process into background
fg - bring process into foreground
jobs - list processes

Status Information
date - show date and time
df - summarize free disk space
du - summarize disk space used
env - display environment
finger - look up user information
history - list previously issued commands (C shell, bash, and ksh only)
last - indicate last login of users
manpath - show search path for man pages
printenv - print out environment
ps - show process status
pwd - print full pathname of working directory
set - set shell variables (C shell, bash, and ksh only)
stty - set terminal options
uptime - show system load, how long system has been up
w - show who is on system, what command each job is executing
who - show who is logged onto the system
whois - Internet user name directory service
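As a closing illustration, a few of the commands listed above (sort, uniq, head and wc) can be chained with pipes as described in section 7. This is just a sketch: the file names.txt and the names in it are invented sample data, created in a temporary directory.

```shell
# Work in a temporary directory so no real files are affected.
cd "$(mktemp -d)"

# Hypothetical sample data: one name per line, with repeats.
printf '%s\n' alice bob alice carol bob alice > names.txt

# sort groups identical lines together, uniq -c counts each group,
# sort -rn orders the counts largest first, head -n 1 keeps the top line.
top=$(sort names.txt | uniq -c | sort -rn | head -n 1)
echo "$top"

# wc -l counts the lines in the file.
total=$(wc -l < names.txt)
echo "$total"
```

The first result is the most frequent name preceded by its count, and the second is the total number of lines in the file. Each stage is a small filter, which is why the same commands can be recombined for many different tasks.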