Lecture 1: Introduction, Basic UNIX
Advanced Programming Techniques
Summer 2003

Unix Programming Environment [1]
Objective: To introduce students to the basic features of Unix and the Unix Philosophy (a collection of combinable tools and an environment that supports their use)
  Basic commands
  File system
  Shell
  Filters (wc, grep, sort, awk)
[1] Many of the examples for this lecture come from the UNIX Prog. Env. and AWK books shown (see lecture outline for full references)

Logging In
To log in to a Unix machine you can either:
  sit at the console (the computer itself)
  access it remotely, e.g. via SSH
The system prompts you for your username and password.
Usernames and passwords are case sensitive!

CS Dept. Accounts
See http://www.cs.drexel.edu/~kschmidt/Ref/csLogin.html
All CS machines (that you have access to) run Linux
  tux machines – 64-bit
  lab machines – 32-bit
Not administered by Drexel IRT

Username
  (typically) a sequence of alphanumeric characters, no more than 8 characters long
  the primary identifying attribute of your account
  (usually) used as an email address
  the name of your home directory is usually related to your username

Password
  a password is a secret string that only the user knows (not even the system knows!)
  when you enter your password, the system encrypts it and compares the result to a stored string
  passwords should have at least 6 characters
  it's a good idea to mix case and include numbers and/or special characters (don't use anything that appears in a dictionary!)

Home Directory
The user's personal directory. E.g.,
    /home/kschmidt
    /home/vzaychik
Where all your files go (hopefully organised into subdirectories)
Mounted from a file server – available (seamlessly) on *any* department machine you log into

Home Directory (cont.)
Your current directory when you log in
cd (by itself) takes you home
Location of many startup and customization files.
E.g.: .vimrc .bashrc .bash_profile .forward .plan .mozilla/ .elm/ .logout

Operating Systems
An Operating System controls (manages) hardware and software.
  it provides support for peripherals such as keyboard, mouse, screen, disk drives, …
  software applications use the OS to communicate with peripherals
  the OS typically manages (starts, stops, pauses, etc.) applications

Unix and Users
Most flavors of Unix (there are many) provide the same set of applications to support humans (commands and shells).
Although these user interface programs are not part of the OS directly, they are standardized enough that what you learn on one flavor of Unix carries over to the others.

Flavors of Unix
There are many versions of Unix that are used by lots of people:
  SysV (from AT&T)
  BSD (from Berkeley)
  Solaris (Sun)
  IRIX (SGI)
  AIX (IBM)
  LINUX (free software)

The power of Unix
Open source, portability
You can extend the basic functionality of Unix:
  customize the shell and user interface
  string together a series of Unix commands to create new functionality
  create custom commands that do exactly what you want

Structure of the UNIX system
Layers, from top to bottom:
  Applications
  Shell
  Kernel (OS)
  Hardware
There are many standard applications:
  • file system commands
  • text editors
  • compilers
  • text processing

Kernel (OS)
  Interacts directly with the hardware through device drivers
  Provides sets of services to programs, insulating these programs from the underlying hardware
  Manages memory, controls access, maintains the file system, handles interrupts, allocates resources of the computer
  Programs interact with the kernel through system calls

Files and File Names
A file is a basic unit of storage (usually storage on a disk).
Every file has a name. Filenames are case-sensitive!
Unix file names can contain any characters (although some make it difficult to access the file) except the null character and the slash (/).
Unix file names can be long!
  how long depends on your specific flavor of Unix

Directories
A directory is a special kind of file: Unix uses a directory to hold information about other files.
We often think of a directory as a container that holds other files (or directories).
A directory is the same idea as a folder on Windows.

More about File Names
Review: every file has a name.
Each file in the same directory must have a unique name.
Files that are in different directories can have the same name.

The Filesystem (e.g.)
/
  bin
  etc
  home/
    hollid2
      netprog
      unix
    scully
      X
  tmp
  usr
    bin
      ls
      who
    etc

Unix Filesystem
The filesystem is a hierarchical system of organizing files and directories.
The top level in the hierarchy is called the "root" and holds all files and directories.
The name of the root directory is /

Pathnames
The pathname of a file includes the file name, the name of the directory that holds the file, the name of the directory that holds that directory, and so on, up to the root.
The pathname of every file in a given filesystem is unique.

Pathnames (cont.)
To create a pathname you start at the root (so you start with "/"), then follow the path down the hierarchy (including each directory name), and you end with the filename.
In between every directory name you put a "/".

Pathname Examples
/
  bin/
  etc/
  home/
    hollid2/
      netprog
      unix/
        Syllabus      /home/hollid2/unix/Syllabus
    scully/
      X
  tmp/
  usr/
    bin/
      ls              /usr/bin/ls
      who
    local/

Absolute Pathnames
The pathnames described in the previous slides start at the root.
These pathnames are called "absolute pathnames".
Special absolute:
  ~kschmidt/ – /home/kschmidt (for users' home directories only)
  ~/ – Your home directory (so, relative to login, $USER)

Relative Pathnames
Prefixed w/the current directory, $PWD
So, relative to the current working directory
    $ cd /home/hollid2
    $ pwd
    /home/hollid2
    $ ls unix/Syllabus
    unix/Syllabus
    $ ls X
    ls: X: No such file or directory
    $ ls /home/scully/X
    /home/scully/X

Special Relative paths…
.
– The current directory
.. – The parent directory
    $ pwd
    /home/hollid2
    $ ls ./netprog
    ./netprog
    $ ls ../scully
    X

Disk vs. Filesystem
The entire hierarchy can actually include many disk drives.
  some directories can be on other computers
/
  bin
  etc
  users
    hollid2
    scully
  tmp
  usr

Commands for Traversing Filesystem
ls – lists contents of a directory
  -a – all files
  -l – long listing
pwd – print working (current) directory
cd – change directory
  w/out argument, takes you home

man Pages
To get information about anything that's been properly installed, use man:
    man ls
    man cat
    man man
Linux boxes also have info pages

The ls command
The ls command displays the names of files (by default, the unhidden files in the current directory).
If you give it the name of a directory as a command line argument, it will list all the (unhidden) files in the named directory.

Command Line Options
We can modify the output format of the ls program with a command line option.
The ls command supports a bunch of options:
  -l long format (include file times, owner and permissions)
  -a all (shows hidden* files as well as regular files)
  -F include a special character to indicate file types
  -C place into columns
*hidden files have names that start with "."

cd – change directory
The cd command changes the current working directory.
The general form is:
    cd [directoryname]

Viewing files
cat – concatenate, send to stdout. View contents of text files
less, more – paging utilities (hit 'h' for help, 'q' to quit)
od – octal dump. For viewing raw data in octal, hex, control chars, etc.
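The pathname rules above are easy to try out without touching real files. The following is a minimal sketch, assuming bash and a throwaway directory created with mktemp; the directory and file names mirror the slides' example tree, while the variable names (scratch, rel_listing, and so on) are made up for illustration.

```shell
#!/bin/bash
# Build a scratch copy of the lecture's example tree (the location is an
# assumption of this sketch; the names hollid2, scully, unix come from the slides).
scratch=$(mktemp -d)
mkdir -p "$scratch/home/hollid2/unix" "$scratch/home/hollid2/netprog" "$scratch/home/scully"
touch "$scratch/home/hollid2/unix/Syllabus" "$scratch/home/scully/X"

cd "$scratch/home/hollid2" || exit 1

# A relative pathname is resolved against the current working directory.
rel_listing=$(ls unix/Syllabus)

# "." names the current directory, ".." names its parent.
dot_listing=$(ls ./unix)
sibling_listing=$(ls ../scully)

# cd .. moves up one level; pwd reports where we ended up.
cd ..
parent_name=$(basename "$(pwd)")

echo "relative: $rel_listing"
echo "dot:      $dot_listing"
echo "sibling:  $sibling_listing"
echo "parent:   $parent_name"

# Leave the scratch tree before removing it.
cd / && rm -r "$scratch"
```

Running the same ls commands with absolute pathnames (starting from "$scratch") would give the same files regardless of the current directory, which is the practical difference between the two forms.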
Copying, removing, linking
rm – remove file
    rm ~/tmp/download
mv – move (rename) file
    mv old.file ../otherDir/new.name
cp – copy file
    cp someDir/file someDir/file.copy
ln – create hard (inode) or soft (symbolic) links to a file

Commands for directories
mkdir – make directory
rmdir – remove directory
Directories can also be moved or renamed (mv), and copied (cp -r)

Commands for Archiving
tar – Tape Archive
  makes a large file from many files
gzip, gunzip
  compression utility
tar on Linux does compression with the z option:
    $ tar czf 571back.tgz CS571
    $ tar xzf assn1.tgz

File attributes
Every file has some attributes:
  Access Times:
    when the file was created
    when the file was last changed
    when the file was last read
  Size
  Owners (user and group)
  Permissions
  Type – directory, link, regular file, etc.

File Time Attributes
Time Attributes:
  when the file was last changed:   ls -l
  sort by modification time:        ls -lt

File Owners
Each file is owned by a user.
You can find out the username of the file's owner with the -l or -o option to ls:
    [jjohnson@ws44 winter]$ ls -l
    total 24
    drwxr-xr-x 7 jjohnson users   80 Jan 3 2005 cs265/
    -rw------- 1 jjohnson users 8258 Jan 3 2005 cs265.html
    -rw-r--r-- 1 jjohnson users 8261 Jan 3 2005 cs265.html~

ls -l
    $ ls -l foo
    -rw-rw---- 1 hollingd grads 13 Jan 10 23:05 foo
Fields, left to right: permissions, (link count), owner, group, size, time, name

File Permissions
Each file has a set of permissions that control who can mess with the file.
There are three types of permissions:
  read – abbreviated r
  write – abbreviated w
  execute – abbreviated x
There are 3 sets of permissions:
  1. user
  2. group
  3. other (the world, everybody else)

ls -l and permissions
    -rwxrwxrwx
  first character – type of file:
    - – plain file
    d – directory
    l – symbolic link
  next three characters (rwx) – User
  next three characters (rwx) – Group
  last three characters (rwx) – Others

rwx
Files:
  r - allowed to read
  w - allowed to write
  x - allowed to execute
Directories:
  r - allowed to see the names of the files
  w - allowed to add and remove files
  x - allowed to enter the directory

Changing Permissions
The chmod command changes the permissions associated with a file or directory.
There are a number of forms of chmod; this is the simplest:
    chmod mode file

chmod – numeric modes
Consider the permissions for each set of users (user, group, other) as a 3-bit number:
  r – 4
  w – 2
  x – 1
A permission (mode) for all 3 classes is a 3-digit octal number:
  755 – rwxr-xr-x
  644 – rw-r--r--
  700 – rwx------

chmod - examples
    $ chmod 700 CS571
    $ ls -o Personal
    drwx------ 10 kschmidt 4096 Dec 19 2004 CS571/
    $ chmod 755 public_html
    $ chmod 644 public_html/index.html
    $ ls -ao public_html
    drwxr-xr-x 16 kschmidt 4096 Jan  8 10:15 .
    drwx--x--x 92 kschmidt 8192 Jan  8 13:36 ..
    -rw-r--r--  5 kschmidt  151 Nov 16 19:18 index.html
    $ chmod 644 .plan
    $ ls -o .plan
    -rw-r--r--  5 kschmidt  151 Nov 16 19:18 .plan

chmod – symbolic modes
Can be used to set, add, or remove permissions
Mode has the following form:
    [ugoa][+-=][rwx]
  u – user
  g – group
  o – other
  a – all
  + add permission
  - remove permission
  = set permission

chmod examples
    $ ls -al foo
    -rwxrwx--x 1 hollingd grads foo
    $ chmod g-wx foo
    $ ls -al foo
    -rwxr----x 1 hollingd grads foo
    $ chmod u-r .
    $ ls
    ls: .: Permission denied

Shell as a user interface
A shell is a command interpreter, an interface between a human (or another program) and the OS
  runs a program, perhaps the ls program
  allows you to edit a command line
  can establish alternative sources of input and destinations for output for programs
Is, itself, just another program

Bourne-again Shell (bash)
We'll teach bash in this course
  Extension of the Bourne Shell (sh)
  Contains many of the Korn Shell (ksh) extensions
You may use the shell of your choice (tcsh, zsh, etc.), but that's on you.

Session Startup
Once you log in, your shell will be started and it will display a prompt.
(For our examples, we will use $ as the prompt. It is not part of the input.)
When the shell is started it looks in your home directory for some customization files.
You can change the shell prompt, your PATH, and a bunch of other things by creating customization files.

Customization
Each shell supports some customization.
  User prompt
  Where to find mail
  Shortcuts
The customization takes place in startup files – files that are read by the shell when it starts up

Startup files
  sh, ksh:  /etc/profile (system defaults), ~/.profile
  bash:     ~/.bash_profile, ~/.bashrc, ~/.bash_logout
  csh:      ~/.cshrc, ~/.login, ~/.logout

Incorrect login
You will receive the "Password:" prompt even if you type an incorrect or nonexistent login name
Can you guess why?

Entering Commands
The shell prints a prompt and waits for you to type in a command.
The first token on the line is taken to be a command.
Commands come in 2 flavors:
  shell builtins – commands that the shell interprets directly
  external programs (utilities) – standalone programs on disk (directories in your $PATH are searched, in order)

Interpreting a Command - type
When a command is seen, the shell:
  1. Checks for aliases
  2. Checks for user-defined functions
  3. Looks for a builtin
  4. Checks directories in $PATH for a utility
Use Bash's type builtin to see what the shell is using:
    kschmidt@ws60 kschmidt> type echo
    echo is a shell builtin
    kschmidt@ws60 kschmidt> type chmod
    chmod is /bin/chmod

Command Options and Arguments
Standardized command syntax (applies to most commands):
    command option(s) arguments
Options modify the way in which a command works; they are often single letters prefixed with a dash (and can sometimes be combined after a single dash)

Getting help
man (manual) – original Unix help (flat, single page)
    $ man who
    $ man man
info – 2-d system, emacs-like navigation
    $ info who
The resource frame on the class page
Internet – google, wikipedia
The Linux Documentation Project (http://www.tldp.org/)
Safari online
Friends, group-mates, and others

Some simple commands
date – print current date
who – print who is currently logged in
finger usr – more information about usr
ls -ao – lists (long) all files in a directory
du -sh –
  disk usage summary, human readable
quota

Logging off
exit command
  Exits the shell
  If it is the login (top-level) shell, then it disconnects you
A shell is just another program that is running.
  Can recursively invoke shells
Please don't just disconnect w/out exiting

Standard I/O
When you enter a command the shell creates a subshell to run the process or script.
The shell establishes 3 I/O channels:
  Standard Input (0) – keyboard
  Standard Output (1) – screen
  Standard Error (2) – screen
These streams may be redirected to/from a file, or even another command

Programs and Standard I/O
    Standard Input (STDIN) -> Program -> Standard Output (STDOUT)
                                      -> Standard Error (STDERR)

Terminating Standard Input
If standard input is your keyboard, you can type stuff in that goes to a program.
To end the input you press Ctrl-D (^D), the end-of-file (EOF) character, on a line by itself; this ends the input stream.
The shell is a program that reads from standard input.
What happens when you give the shell ^D? (see the bash set command, ignoreeof)

Shell metacharacters
Some characters have special meaning to the shell:
  I/O redirection   < > |
  wildcards         * ? [ ]
  others            & ; $ ! \ ( ) space tab newline
These must be escaped or quoted to inhibit special behavior

Wildcards
  * – matches 0 or more characters
  ? – matches exactly 1 character
  [<list>] – matches any single character in <list>
E.g.
    ls *.cc     – list all C++ source files in directory
    ls a*       – list all files that start w/'a'
    ls a*.jpeg  – list all JPEGs that start w/'a'
    ls *        – (make sure you have a subdirectory, and try it)

Wildcards (more examples)
    ls file?        - matches file1, file2, but not file nor file22
    ls file?.*.DEL  - matches file1.h.DEL, file9.cc.DEL, file3..DEL but not file8.DEL nor file.html.DEL
These are not regular expressions!

Wildcards - classes
[abc…] matches any of the enclosed characters
    ls T[eE][sS][tT].doc
[a-z] matches any character in a range
    ls [a-zA-Z]*
[!abc…] matches any character except those listed.
    ls [!0-9]*

Shell Variables
bash uses shell variables to store information
Shell variables are used to affect the behavior of the shell, and many other programs
We can access these variables:
  set new values for some to customize the shell
  find out the value of some to help accomplish a task

Setting/Viewing Variables
To assign (in sh, ksh, bash):
    VAR=someString
    OTHER_VAR="I have whitespace"
Note, no whitespace around the '='!
To view (dereference) a variable:
    $ echo $VAR
    someString
    $ echo $OTHER_VAR
    I have whitespace

Example Shell Variables
  PWD       current working directory
  PATH      list of directories to look for commands
  HOME      home directory of user
  USER      user's login name
  TERM      what kind of terminal you have
  HISTFILE  where your command history is saved

Shell maintains variables
Some common ones:
  $PATH – list of directories to search for utilities
  $PS1 – Primary prompt
  $HOME – user's home directory
  $USER – user's login name
  $PWD – current working directory

Displaying Shell Variables
Prefix the name of a shell variable with "$".
The echo command will do:
    echo $HOME
    echo $PATH
You can use these variables on any command line:
    ls -al $HOME

Setting Shell Variables
You can change the value of a shell variable with an assignment command (this is a shell builtin command):
    HOME=/etc
    PATH=/usr/bin:/usr/etc:/sbin
    NEWVAR="blah blah blah"

set command (shell builtin)
The set command with no parameters will print out a list of all the shell variables.
You'll probably get a pretty long list…
Depending on your shell, you might get other stuff as well...
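The wildcard and variable rules above can be checked in a few lines. This is a minimal sketch, assuming bash; LECTURE_GREETING and the scratch file names are hypothetical, chosen only for this example. It also shows one point the slides mention later: child processes only see variables that have been exported.

```shell
#!/bin/bash
# LECTURE_GREETING is a made-up variable name used only for this sketch.
LECTURE_GREETING="I have whitespace"     # note: no whitespace around '='
echo "$LECTURE_GREETING"                 # dereference with $
echo "home directory: $HOME"

# A plain shell variable stays inside this shell; export makes it visible
# to programs the shell starts.
seen_before=$(bash -c 'echo "${LECTURE_GREETING:-<unset>}"')
export LECTURE_GREETING
seen_after=$(bash -c 'echo "${LECTURE_GREETING:-<unset>}"')
echo "child before export: $seen_before"
echo "child after export:  $seen_after"

# Wildcards are expanded by the shell before the command runs, so echo
# is enough to see what a pattern matches.
scratch=$(mktemp -d)                     # throwaway directory (an assumption)
touch "$scratch/file1" "$scratch/file22" "$scratch/9lives" "$scratch/notes.txt"
cd "$scratch" || exit 1
one_char=$(echo file?)                   # ? matches exactly one character
not_digit=$(echo [!0-9]*)                # names that do not start with a digit
echo "file?   matches: $one_char"
echo "[!0-9]* matches: $not_digit"

# Leave the scratch directory before removing it.
cd / && rm -r "$scratch"
```

Using echo rather than ls to test a pattern is a handy habit: it shows exactly what the shell hands to the command, with nothing added by the command itself.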
Quoting – escape character, \
Use the backslash to inhibit the special meaning of the following character:
    $ echo $USER
    kschmidt
    $ echo \$USER
    $USER
    $ echo a\\b
    a\b

Quoting – double quotes
Double quotes inhibit all behavior except variable substitution, $, command substitution, `, and the escape, \
    $ echo "$USER is $USER"
    kschmidt is kschmidt
    $ echo "\$USER is $USER"
    $USER is kschmidt
    $ echo "I said, \"Wait a moment\""
    I said, "Wait a moment"

Quoting – single quotes
Single quotes inhibit all special behavior
May not contain a single quote
    $ echo 'I said "Wait!"'
    I said "Wait!"
    $ echo 'My name is $USER'
    My name is $USER
    $ mv rambleOnByLedZeppelin 'ramble on - led zeppelin'

Input Redirection
The shell can attach things other than your keyboard to standard input.
  A file (the contents of the file are fed to a program as if you typed it).
  A pipe (the output of another program is fed as input as if you typed it).

Output Redirection
The shell can attach things other than your screen to standard output (or stderr).
  A file (the output of a program is stored in file).
  A pipe (the output of a program is fed as input to another program).

Redirecting stdout
Use ">" after a command (and its arguments) to send output to a file:
    ls > lsout
If lsout previously existed, it will be truncated (its old contents are gone), unless noclobber is set (see bash)

Redirecting stdin
To tell the shell to get standard input from a file, use the "<" character:
    sort < nums
The command above would sort the lines in the file nums and send the result to stdout.

You can do both!
    sort < nums > sortednums
    tr a-z A-Z < letter > rudeletter

Appending Output
Use >> to append output to a file:
    ls /etc >> foo
    ls /usr >> foo
Easy way to concatenate files:
    cat rest_of_file >> my_file

Redirecting stderr
stderr is file descriptor 2, so:
    gcc buggy.c 2> error.log
    grep '[Vv]era' *.html > log 2> errorlog
To send both to the same place (stdout is file descriptor 1):
    find . \
      -name 'core*' > coreList 2>&1

Pipes – connecting processes
A pipe is a holder for a stream of data.
A pipe can be used to hold the output of one program and feed it to the input of another.
    prog1 STDOUT -> pipe -> STDIN prog2

Asking for a pipe
Separate 2 commands with the "|" character.
The shell does all the work!
    ls | sort
    ls | sort > sortedls

Process Control
Processes are run in a subshell (by default)
Subshells inherit exported variables
Each process has an ID (pid) and a parent (ppid)
Use the ps utility to look at some processes:
    $ ps
      PID TTY      TIME     CMD
      350 pts/4    00:00:00 bash
    22251 pts/4    00:00:00 vim
    22300 pts/4    00:00:00 ps

Process Control (cont.)
Use the -f option for a long listing:
    $ ps -f
    UID        PID  PPID C STIME TTY   TIME     CMD
    kschmidt   350   349 0 10:06 pts/4 00:00:00 -bash
    kschmidt 22251   350 0 17:32 pts/4 00:00:00 vim myHomework
    kschmidt 22437   350 0 17:36 pts/4 00:00:00 ps -f
Use the -e option to see more processes (all of them).
    $ ps -e | grep xmms
    29940 pts/0 00:33:47 xmms

Killing a process (not usually nice)
The kill command sends a signal to a process (the given pid)
By default, it sends TERM (terminate), which asks the process to finish, so that it may do clean-up
Use -9 to send a KILL (can't be ignored), but the process gets no chance to clean up
My mp3 player hangs once in a while:
    $ kill -9 29940

Job Control
The shell allows you to manage jobs
  place jobs in the background
  move a job to the foreground
  suspend a job
  kill a job

Background jobs
If you follow a command line with "&", the shell will run the job in the background.
  you don't need to wait for the job to complete; you can type in a new command right away
  you can have a bunch of jobs running at once
  you can do all this with a single terminal (window)
    ls -lR > saved_ls &

Listing jobs
The command jobs will list all background jobs:
    $ jobs
    [1] Running ls -lR > saved_ls &
The shell assigns a number to each job (this one is job number 1).

Suspending and Resuming the Foreground Job
You can suspend the foreground job by pressing ^Z (Ctrl-Z).
Suspend means the job is stopped, but not dead.
The job will show up in the jobs output.
You give fg a job number (as reported by the jobs command) preceded by a %.
Without an argument, fg brings the last job forward
    $ jobs
    [1] Stopped ls -lR > saved_ls &
    $ fg %1
    ls -lR > saved_ls

Placing a suspended job in the background
If it's in the foreground, suspend it
Use bg, just as you did fg, to let a suspended job continue in the background:
    $ bg %3

Killing a job
kill may also take a job number or even a job name, introduced by %:
    $ find . -name core\* -print > corefiles &
    $ firefox&
    $ jobs
    [1]- Running find . -name …
    [2]+ Running firefox
    $ kill %2

Editors
A text editor is used to create and modify text files.
The most commonly used editors in the Unix community:
  vi (vim on Linux)
    $ vimtutor
  emacs
    $ emacs
    Then, hit ctrl-h t (that's control-h, followed by 't')
You must learn at least one of these editors

The Unix Philosophy
Stringing small utilities together with pipes and redirection to accomplish non-trivial tasks easily
E.g., find the 3 largest items (files or subdirectories) in the current directory:
    $ du -s * | sort -nr | head -3
    120180  Files
    22652   Zaychik
    9472    tweedledee.tgz

pipes and combining filters
Connect the output of one command to the input of another command to obtain a composition of filters
    who | wc -l
    ls | sort -f
    ls -s | sort -n
    ls -l | sort -nr -k5
    ls -l | grep '^d'

Further information
http://www.geek-girl.com/unix.html
Many tutorials and references available online!

filters
Programs that read some input (but don't change it), perform a simple transformation on it, and write some output (to stdout)
Some common filters…
  wc – word count (line count, character count)
  tr – translate
  grep, egrep – search files using regular expressions
  sort – sorts files by line (lexically or numerically)
  cut – select portions of a line
  uniq – removes identical adjacent lines
  head, tail – displays first (last) n lines of a file
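To see how the filters listed above compose, here is a minimal sketch, assuming bash and the standard tr, sort, uniq, and head utilities. It counts the most frequent words in a short sample sentence; the sentence and the variable names (text, top) are invented for this example and do not come from the lecture.

```shell
#!/bin/bash
# Sample input, made up for this sketch.
text='the shell runs the command and the shell waits'

# Each stage is a small filter; the pipe feeds one stage's stdout to the
# next stage's stdin:
#   tr       puts one word on each line
#   sort     groups identical words (uniq only merges adjacent lines)
#   uniq -c  prefixes each distinct word with its count
#   sort -nr orders by count, largest first
#   head -2  keeps the two most frequent words
top=$(printf '%s\n' "$text" |
  tr -s ' ' '\n' |
  sort |
  uniq -c |
  sort -nr |
  head -2)

echo "$top"
```

None of the individual tools knows anything about word frequencies; the result comes entirely from the order in which they are connected. Swapping head -2 for tail -2, or adding a grep stage before the first sort, changes the question being answered without writing a new program.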