RHA030 - Workbook

Workbook 7
Standard I/O and Pipes
Pace Center for Business and Technology
Key Concepts
• Terminal based programs tend to read information from one source, and
write information to one destination.
• The source programs read from is referred to as Standard In (stdin), and is
usually connected to a terminal's keyboard.
• The destination programs write to is referred to as Standard Out (stdout),
and is usually connected to a terminal's display.
• When using the bash shell, stdout can be redirected using > or >>, and
stdin can be redirected using <.
This workbook describes how you can redirect where input is read from and where output
goes. The output of one command can be used as the input for another command, allowing
simple commands to be used together to perform more complicated tasks.
Three types of programs
In Linux (and Unix), programs can generally be grouped into the following three designs.
Graphical Programs
Graphical programs are designed to run in the X graphical environment. They expect the
user to be using a mouse, and use common graphical components, such as popup menus
and buttons, for user input. The mozilla web browser is an example of a graphical program.
Screen Programs
Screen based programs expect to use a text console. They make use of the entire display,
and handle text placement and screen redraws in sophisticated ways. They do not require a
mouse, and are appropriate for terminals and virtual consoles. The vi and nano text editors,
and links web browser, are examples of screen based programs.
Terminal Programs
Terminal programs collect input and display output in a stream, seldom if ever redrawing
the screen, as if writing directly to a printer that does not allow the cursor to move back up
the page. Because of their simplicity, terminal based programs are often called simply
commands. ls, grep, and useradd are examples of terminal based programs.
This chapter focuses on the last type of program. Do not let the simplicity of the way
these commands receive input and produce output fool you. You will find that many of
these commands are very sophisticated, and allow you to use the command line interface in
powerful ways.
Standard in (stdin) and Standard out (stdout)
Terminal based programs generally read information as a stream
from a single source, such as a terminal's keyboard. Likewise,
they generally write information as a stream to a single
destination, such as a display. In Linux (and Unix), the input
stream is referred to as Standard In (usually abbreviated stdin),
and the output stream is referred to as Standard Out (usually
abbreviated stdout).
Usually, stdin and stdout are connected to the terminal that runs
the command. Sometimes, in order to automate commonly
repeated commands, or in order to record the output of a
command for later inclusion in a report or email, people find it
convenient to redirect stdin from or stdout into files.
Redirecting stdout
Writing Output to a File
When a terminal based program generates output, it generally writes that
output to its stdout stream, without knowing what is connected to the
receiving end of that stream. Usually, the stdout stream is connected to the
terminal that started the process, so the output is written to the terminal's
display. The bash shell uses > to redirect a process's stdout stream to a file.
For example, suppose the machine elvis is using becomes very sluggish and
non-responsive. In order to diagnose the problem, elvis would like to examine
the currently running processes. Because the machine is so sluggish, however,
he wants to collect the information now, but analyze it later. He can redirect
the output of the ps aux command into the file sluggish.txt, and come back to
examine the file when the machine is more responsive.
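The command elvis runs is not reproduced in this text. A minimal sketch, assuming he is working in a directory where he is allowed to create sluggish.txt, might look like this:

```shell
# Take the snapshot now: stdout goes to sluggish.txt instead of the display.
ps aux > sluggish.txt
```

The saved snapshot can be read later with any file viewer, once the machine is responsive again.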
Notice that no output is displayed to the terminal. The ps command writes to
stdout, as it always does, but stdout is redirected by the bash shell to the file
sluggish.txt. The user elvis can examine the file later, at a more convenient
time.
Appending Output to a File
If the file sluggish.txt already existed, its original contents would be lost. This
is often referred to as clobbering a file. To append a command's output to a
file, rather than clobbering it, bash uses >>.
Suppose that elvis wanted to record a timestamp of when the sluggish
behavior was happening, as well as a list of currently running processes. He
could first create (or clobber) the file with the output of the date command,
using >, and then append to it the output of the ps aux command using >>.
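The two steps described above can be sketched as follows, again assuming sluggish.txt lives in the current directory:

```shell
# Create (or clobber) the file with a timestamp.
date > sluggish.txt

# Append the process list after the timestamp instead of clobbering it.
ps aux >> sluggish.txt
```

Reversing the two operators would lose the timestamp, because the second command's > would clobber whatever the first command had written.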
Redirecting stdin
Just as bash uses > to coax commands into delivering their output
somewhere other than the display, bash uses < to cause them to read input
from somewhere other than the keyboard. The user elvis is still trying to
figure out why his machine was acting sluggish. He talked to his local system
administrator, who thought that looking at the list of currently running
processes sounded like a good idea, and asked elvis to mail him a copy.
Using the terminal based mail command, elvis first writes an email message
to the administrator "manually", from the keyboard. The mail command
expects a recipient as an argument, and the subject line can be specified with
the -s command line switch. The email body is then entered from the
keyboard. The end of the message text is signaled by a lone period on a line.
For his follow-up message, elvis can easily mail the output of the ps command
he recorded in the file sluggish.txt. He just redirects the mail command's stdin
stream to be read from the file.
The system administrator will receive an email from elvis, with "ps output" as
its subject line, and the contents of the file sluggish.txt as its body.
In the first case, the mail process's stdin was connected to the terminal, and
the message body was provided by the keyboard. In the second case, bash
arranged for the mail process's stdin to be connected to the file sluggish.txt,
and the message body was provided by its contents. The mail command
doesn't change its basic behavior: It reads the body of the email message
from stdin.
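The mail sessions themselves are not reproduced in this text. Because sending real mail depends on the local mail setup, the sketch below uses wc as a stand-in for any command that reads its input from stdin, and shows the mail form only as a comment. The recipient name sysadmin and the contents of the snapshot file are assumptions made for illustration.

```shell
# Build a small stand-in for sluggish.txt (hypothetical contents).
snapshot=$(mktemp)
printf 'USER PID COMMAND\nelvis 4242 bash\n' > "$snapshot"

# bash connects wc's stdin to the file, so wc never waits for the keyboard.
lines=$(wc -l < "$snapshot")
echo "lines read from stdin: $lines"

# The same redirection applied to mail, with a hypothetical recipient:
#   mail -s "ps output" sysadmin < sluggish.txt
rm -f "$snapshot"
```

In both cases the command is unaware of the change: it simply reads stdin until it reaches the end of the input.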
Under the Hood: Open Files and File Descriptors
Open Files and File Descriptors
To fully appreciate how processes manage Standard In, Standard Out, and
files, we must introduce the concept of a file descriptor. In order to read
information from or write information to a file, a process must open the file.
Linux (and Unix) processes keep track of the files they currently have open by
assigning each an integer. The integer is called a file descriptor.
The Linux kernel provides an easy way to examine the open files and file
descriptors of a currently running process, using the /proc file system. Every
process has an associated subdirectory under /proc, named after its PID
(process ID). The process's subdirectory in turn has a subdirectory called fd
(for file descriptor). Within the /proc/pid/fd subdirectory, a symbolic link
exists for every file the process has open. The name of the symbolic link is the
open file's integer file descriptor, and the symbolic link resolves to the open
file itself.
In the following, elvis cats the file /usr/share/hwdata/oui.txt, and then almost
immediately suspends the program with a CTRL+Z.
Using the ps command to look up the process's PID, elvis next examines the
process's /proc/pid/fd directory.
Not surprisingly, the cat process has the file /usr/share/hwdata/oui.txt open
(it must be able to read the file to display its contents). Perhaps a little
surprising, it is not the only, or even the first, file that the process has open.
The cat command has three open files before it, or, more exactly, the same
file open three times: /dev/tty1.
As a Linux (and Unix) convention, every process inherits three open files upon
startup. The first, file descriptor 0, is Standard In. The second, file descriptor
1, is Standard Out, and the third, file descriptor 2, is Standard Error (to be
discussed in the next Lesson). What open files did the cat command inherit
from the bash shell that started it? The device node /dev/tty1 for all three.
Recall that /dev/tty1 is the device node which connects to the console serial
driver within the kernel. Whatever elvis types can be read from this file, and
whatever is written to this file is displayed on elvis's terminal. What happens
if the cat process reads from stdin? It reads input from elvis's keyboard. What
happens if it writes to stdout? Whatever is written is displayed on elvis's
terminal.
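The terminal sessions for this walkthrough are not reproduced in this text. A rough way to see the three inherited descriptors for yourself, assuming a Linux system with /proc mounted, is sketched below. It inspects ls and the current shell rather than a suspended cat process, so the link targets will differ from elvis's /dev/tty1.

```shell
# /proc/self refers to whichever process opens it, here the ls command.
# Descriptors 0, 1, and 2 are its inherited stdin, stdout, and stderr.
ls -l /proc/self/fd

# Show where the current shell's own three standard streams point.
for fd in 0 1 2; do
    echo "fd $fd -> $(readlink /proc/$$/fd/$fd)"
done
```

When run from an interactive terminal, all three links usually resolve to the same terminal device node.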
Redirection
In the next example, elvis cats the /usr/share/hwdata/oui.txt file, but this
time redirects stdout to the file /tmp/foo. Again, elvis suspends the command
in mid-stride with the CTRL+Z control sequence.
Using the same technique as above, elvis examines the files opened by the
cat command, and the file descriptors associated with them.
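The listing elvis sees is not reproduced in this text. The effect of the redirection on file descriptor 1 can be sketched without suspending anything, assuming a Linux system with /proc mounted. The scratch directory and the use of readlink in place of cat are choices made for this illustration.

```shell
# Work in a scratch directory; pwd -P avoids symbolic links in the path.
scratch=$(cd "$(mktemp -d)" && pwd -P)

# readlink prints where its own descriptor 1 points. bash redirected stdout
# before starting readlink, so the answer is the output file itself.
readlink /proc/self/fd/1 > "$scratch/foo"
cat "$scratch/foo"
```

The file ends up containing its own path: descriptor 1 no longer refers to the terminal, but to the file named after the > operator.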
What happens when elvis redirects both Standard Out and Standard In?
When the cat command is called without arguments (i.e., without any
filenames of files to display), it displays Standard In instead. Rather than
opening a specified file (using file descriptor 3, as above), the cat command
reads from stdin instead.
What is the effective difference between the following three commands?
There is none. In order to appreciate the real benefit of designing commands
to read from Standard In in lieu of named files, we must wait until pipes are
introduced in a subsequent Lesson.
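The three commands being compared are not reproduced in this text. The sketch below shows three forms that behave identically, which may or may not be the original three. The choice of /etc/passwd as the input file is an assumption made so the example runs on most Linux systems.

```shell
src=/etc/passwd
out=$(mktemp -d)

cat "$src" > "$out/a"      # cat opens the named file itself
cat < "$src" > "$out/b"    # bash opens the file and hands it to cat as stdin
< "$src" cat > "$out/c"    # same redirections, written before the command name

# cmp is silent and succeeds when two files are byte-for-byte identical.
cmp "$out/a" "$out/b" && cmp "$out/a" "$out/c" && echo "all three outputs are identical"
```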
Examples
Chapter 1. Standard In and Standard Out
Automating Graph Generation with gnuplot
About: 20 minutes
http://csis.pace.edu/adelgado/rha-030/scripts/workbook-07/chapter-1/Gnuplot-lab.htm
Chapter 2. Standard Error
Key Concepts
• Unix programs commonly report error conditions to a
destination called Standard Error (stderr).
• Usually, stderr is connected to a terminal's display, and
error messages are found intermixed with standard
output.
• When using the bash shell, the stderr stream can be
redirected to a file using 2>.
• When using bash, the stderr stream can be combined
with the stdout stream using 2>&1 or >&.
Standard Error (stderr)
We have discussed the standard input and output streams, stdin and stdout,
and how to use > and < in the bash command line to redirect them. We are
now ready to confuse matters a little by introducing a second output stream,
commonly used for reporting error conditions, called Standard Error (often
abbreviated stderr).
In the following sequence, elvis is using the head -1 command to generate a
list of the first lines of all the files in the /etc/rc.d directory.
The head command, when fed multiple file names as arguments,
conveniently prints a decorated header naming each file, followed by the
specified number of lines from the start of that file (in this case, one). When the
head command encounters a directory, however, it merely complains. Next, elvis
runs the same command,
redirecting stdout to the file rcsummary.out.
Most of the output is obediently redirected to the file rcsummary.out, but the
directory complaints are still displayed. Although not obvious at the outset,
the head command is really sending output to two independent streams.
Normal output is written to Standard Out, but error messages are written to a
separate stream called Standard Error (often abbreviated stderr). Usually,
both streams are connected to the terminal, and so the two are difficult to
distinguish. By redirecting stdout, however, the information written to stderr
is obvious.
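The terminal output for this sequence is not reproduced in this text. The behavior can be sketched with a scratch directory standing in for /etc/rc.d; the file names and contents below are invented for illustration, and head -n 1 is the current spelling of head -1.

```shell
# Scratch stand-in for /etc/rc.d: two files and one subdirectory.
demo=$(mktemp -d)
printf 'first line of a\nsecond line\n' > "$demo/a.conf"
printf 'first line of b\n' > "$demo/b.conf"
mkdir "$demo/init.d"

# stdout goes to the file; head's complaint about the directory does not.
# head exits non-zero because of the directory, hence the "|| true".
head -n 1 "$demo"/* > "$demo.out" || true
cat "$demo.out"
```

The complaint about init.d appears on the terminal as soon as head runs, while the first lines of the files only appear when the saved file is displayed.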
Redirecting stderr
Just as bash uses > to redirect stdout, bash uses 2> to redirect stderr. For
example, elvis repeats the head command from above, but instead of
redirecting stdout to rcsummary.out, he redirects stderr to the file
rcsummary.err.
The output is the complement to the previous example. We now see the
normal output displayed to the screen, but no error messages. Where did the
error messages go? It shouldn't be hard to guess.
In the following example, both > and 2> are used to redirect stdout and stderr
independently.
In this case, the standard output can be found in the file rcsummary.out, error
messages can be found in rcsummary.err, and nothing is left over to be
displayed to the screen.
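A sketch of redirecting the two streams independently, using an invented scratch directory in place of /etc/rc.d and scratch file names in place of rcsummary.out and rcsummary.err:

```shell
# Scratch stand-in for /etc/rc.d: two files and one subdirectory.
demo=$(mktemp -d)
printf 'first line of a\n' > "$demo/a.conf"
printf 'first line of b\n' > "$demo/b.conf"
mkdir "$demo/init.d"

# Normal output goes to one file, error messages to another.
head -n 1 "$demo"/* > "$demo.out" 2> "$demo.err" || true

echo "--- stdout ---"; cat "$demo.out"
echo "--- stderr ---"; cat "$demo.err"
```

Nothing is displayed by the head command itself; everything shown comes from the two cat commands afterwards.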
Combining stdout and stderr: Old School
Often, someone would like to redirect the combined stdout and stderr
streams to a single file. As a first attempt, elvis tries the following command.
Upon examining the file rcsummary.both, however, elvis doesn't find what he
expects.
The bash shell opened the file rcsummary.both twice, but treated each open file
independently. When stdout and stderr both wrote to the file, they clobbered each
other's information. What is needed instead is some way to tell bash to effectively
combine stderr and stdout into a single stream, and then redirect that stream to a
single file. As you would expect, there is such a way.
Although awkward, the last token 2>&1 should be thought of as saying "take stderr,
and send it wherever stdout is currently going". Now rcsummary.both contains the
expected output.
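Neither command is reproduced in this text. Based on the description, the first attempt presumably named the same file after both > and 2>, which is an assumption; the sketch below contrasts that with 2>&1, using an invented scratch directory in place of /etc/rc.d.

```shell
# Scratch stand-in for /etc/rc.d: two files and one subdirectory.
demo=$(mktemp -d)
printf 'first line of a\n' > "$demo/a.conf"
printf 'first line of b\n' > "$demo/b.conf"
mkdir "$demo/init.d"

# First attempt: the file is opened twice, and the streams overwrite each other.
head -n 1 "$demo"/* > "$demo.both" 2> "$demo.both" || true

# Fix: send stdout to the file, then send stderr wherever stdout is going.
head -n 1 "$demo"/* > "$demo.both" 2>&1 || true
cat "$demo.both"
```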
Combining stdout and stderr: New School
Using 2>&1 to combine stdout and stderr was introduced in the Bourne shell (sh), the
traditional Unix shell. Because bash is designed to be backwards compatible with sh, it
supports the syntax as well. The syntax, however, is inconvenient. Besides being
difficult to write, the order of the redirections is important. Using ">out.txt 2>&1" and
"2>&1 >out.txt" does not have the same effect!
In order to simplify things, bash uses >& to combine both stdout and stderr, as in the
following example.
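The original example is not reproduced in this text. The sketch below shows the >& form and the effect of ordering with the older syntax, using an invented scratch directory in place of /etc/rc.d. Because >& followed by a file name is a bash extension, the sketch invokes bash explicitly for that line.

```shell
# Scratch stand-in for /etc/rc.d: two files and one subdirectory.
demo=$(mktemp -d)
printf 'first line of a\n' > "$demo/a.conf"
printf 'first line of b\n' > "$demo/b.conf"
mkdir "$demo/init.d"

# New school: one operator sends both stdout and stderr to the file.
bash -c 'head -n 1 "$1"/* >& "$1.both"' _ "$demo" || true

# Old school, where order matters:
head -n 1 "$demo"/* > "$demo.right" 2>&1 || true   # both streams end up in the file
head -n 1 "$demo"/* 2>&1 > "$demo.wrong" || true   # stderr still goes where stdout used to go
```

In the last command, 2>&1 is processed first, while stdout still points at its original destination, so only the normal output lands in the file.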
Summary
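The following table summarizes the syntax used by the bash shell for redirecting stdin,
stdout, and stderr, as covered in this and the previous lesson.
Syntax            Effect
cmd < file        Redirect stdin to read from file
cmd > file        Redirect stdout to file, clobbering any existing contents
cmd >> file       Redirect stdout to file, appending to any existing contents
cmd 2> file       Redirect stderr to file
cmd > file 2>&1   Combine stdout and stderr and redirect both to file (sh compatible)
cmd >& file       Combine stdout and stderr and redirect both to file (bash)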
Examples
Chapter 2. Standard Error
Using /dev/null to filter out stderr
The user elvis has recently learned that, besides the /home/elvis and /tmp
directories he's familiar with, he may also own files in the /var directory.
These files are usually spooling files for received but not yet viewed email,
print jobs waiting to be sent to the printer, etc.
Curious, he uses the find command to find all files within the /var directory
that he owns.
Although the find command appropriately reported the /var/spool/mail/elvis file, the
output is difficult to find among all of the "Permission denied" error messages being
reported from various subdirectories of /var. In order to help separate the wheat from the
chaff, elvis redirects stderr to some file in the /tmp directory.
While this works, elvis is left with a file called /tmp/foo that he really didn't want. In
situations like this, when a user wants to discard a stream of information, experienced Unix
users usually redirect output to a pseudo device called /dev/null.
As the following long listing shows, /dev/null is a character device node, like those used for
conventional device drivers.
When a user writes to /dev/null, the information is merely discarded by the kernel. When a
user reads from /dev/null, they encounter an immediate end of file. Notice that /dev/null is
one of the few files in Red Hat Enterprise Linux that has world writable permissions by
default.
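The commands for this example are not reproduced in this text. A sketch of the same idea follows, with the current user standing in for elvis; what find reports will depend on the system it is run on.

```shell
# /dev/null is a character device: note the leading "c" in the mode field.
ls -l /dev/null

# Discard error messages rather than collecting them in a scratch file.
# find exits non-zero if it was denied access anywhere, hence the "|| true".
find /var -user "$(id -un)" 2> /dev/null || true

# Reading from /dev/null hits end of file immediately: zero bytes arrive.
bytes=$(wc -c < /dev/null)
echo "bytes read from /dev/null: $bytes"
```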
Questions
Chapter 2. Standard Error
1 and 2
Chapter 3. Pipes
Key Concepts
• The stdout stream from one process can be
connected to the stdin stream of another process,
using what Unix calls a "pipe".
• Many commands in Unix are designed to operate as
a filter, reading input from stdin and sending output
to stdout.
• bash uses "|" to create a pipe between two
commands.
Pipes
In the previous Lessons, we have seen that a process's output can be
redirected to somewhere other than the terminal display, or that a process
can be asked to read input from some location other than the terminal
keyboard. One of the most common, and most powerful, forms of redirection
is a combination of the two, where the output (Standard Out) of one
command is "piped" directly into the input (Standard In) of another
command, forming what Linux (and Unix) refers to as a pipe.
When two commands are joined by a pipe, the stdout stream of the first
process is tied directly to the stdin stream of the second process, so that
multiple processes can be combined in a sequence. In order to create a pipe
using bash, the two commands are joined with a vertical bar |. (On most
keyboards, this character is found on the same key as the backslash, above
the RETURN key.) All processes that are joined in a pipe are referred to as a
process group.
As an example, consider prince, who is trying to find the largest files underneath the
/etc directory. He begins by composing a find command that will list all files with a size
greater than 100 kilobytes.
Observing that the find command seems to list the files in no particular order, prince
decides he would like the files to be listed alphabetically. He could redirect the output
to a file, and then sort the file. Instead, he takes advantage of the fact that the sort
command, when invoked without arguments, looks to Standard In for the data to sort.
He pipes the output of his find command into sort.
The files are now listed in alphabetical order.
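The commands prince runs are not reproduced in this text. A sketch of the pipe follows; the -size +100k test is an assumption about how he expressed the size limit, and the sorted list is saved to a scratch file only so that it can be inspected afterwards.

```shell
list=$(mktemp)

# find writes matching file names to stdout; the pipe feeds them to sort's
# stdin. Errors about unreadable directories are discarded.
find /etc -size +100k 2> /dev/null | sort > "$list"
cat "$list"
```

No intermediate file is needed between find and sort; the pipe carries the data directly from one process to the other.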
Filtering output using grep
The traditional Unix grep command is commonly used in pipes to reduce data
to only the "interesting" parts. The grep command will be discussed in detail
in a later Workbook. Here, we introduce grep in its simplest form.
The grep command is used to search for and extract lines which contain a
specified string of text. For example, in the following, prince prints all lines
that contain the text "root" from the /etc/passwd file.
The first argument to the grep command is the string of text to be searched
for, and any remaining arguments are files to be searched for the text. If the
grep command is called with only one argument (a string to be searched for,
but no files to search), it looks to Standard In as its source of data on which to
operate.
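The two ways of calling grep described above can be sketched as follows, assuming /etc/passwd contains at least one line mentioning root, as it does on typical Linux systems:

```shell
# Two arguments: grep opens /etc/passwd itself.
grep root /etc/passwd

# One argument: grep filters whatever arrives on stdin.
grep root < /etc/passwd
```

Both commands print the same lines; only the way grep receives its data differs.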
In the following, prince has so many files in his home directory that he is having trouble
keeping track of them. He's trying to find a directory called templates that he created a few
months ago. He uses the locate command to help him find it.
Unfortunately for prince, there are many files which contain the text templates in their
name on the system, and prince becomes overwhelmed with lines and lines of output. In
order to reduce the information to more relevant files, prince next takes stdout from the
locate command, and creates a pipe to stdin of the grep command, "grepping" for the word
"prince".
Because the grep command is not given a file to search, it looks to stdin, where it finds the
stdout stream of the locate command. Filtering the stream, grep only duplicates to its
stdout lines that matched the specified text, "prince". The rest were discarded. The user
prince easily finds his directory under ~/proj, as well as another directory created by the
application quanta.
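The locate output is not reproduced in this text, and locate is not installed on every system. The sketch below uses a shell function with invented paths as a stand-in for locate templates; the real command would presumably be locate templates | grep prince.

```shell
# Hypothetical stand-in for the output of "locate templates".
listing() {
    printf '%s\n' \
        /usr/share/doc/app/templates \
        /home/prince/proj/templates \
        /home/prince/.quanta/templates \
        /usr/lib/office/templates
}

# grep is given no file, so it filters the stream arriving on its stdin.
listing | grep prince
```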
Pipes and stderr
In the next example, prince is curious to see where he shows up in the system's
configuration files, and "greps" for his name in the /etc directory.
Again, prince is overwhelmed by the amount of output from this command. He tries
the same trick, "grepping" it down for all lines that contain the word "passwd".
While stdout from the first grep command was appropriately filtered, stderr is
unaffected, and still gets displayed to the screen. How would prince go about
suppressing stderr as well?
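One possible answer, sketched with a shell function standing in for the first grep command: redirect stderr to /dev/null before the pipe. The function and the lines it prints are invented for illustration.

```shell
# Hypothetical stand-in for "grep prince /etc/*": output on both streams.
search() {
    echo "/etc/passwd:prince:x:502:502::/home/prince:/bin/bash"
    echo "/etc/group:prince:x:502:"
    echo "grep: /etc/shadow: Permission denied" >&2
}

# Only stdout enters the pipe; the complaint still reaches the terminal.
search | grep passwd

# Discarding stderr before the pipe leaves just the filtered stdout.
search 2> /dev/null | grep passwd
```

The redirection belongs to the command on the left of the pipe, because that is the process producing the error messages.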
Commands as filters
The concept of a pipe extends naturally, so that multiple commands can be used
together, each reading information from stdin, somehow modifying or filtering the
information, and passing the result to stdout. In a subsequent Workbook, you will find
that there are many standard Linux (and Unix) commands that are designed for this
purpose, including some that you are already familiar with: grep, head, tail, cut, sort,
sed, and awk, to name a few.
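As a rough illustration of several filters chained together, the sketch below counts how many accounts use each login shell. It assumes the usual colon-separated /etc/passwd layout with the shell in the seventh field, and it uses uniq, a standard filter not named in the list above.

```shell
# cut extracts field 7, sort groups equal lines, uniq -c counts each group,
# sort -rn puts the largest count first, and head keeps the top three.
shells=$(cut -d: -f7 /etc/passwd | sort | uniq -c | sort -rn | head -n 3)
printf '%s\n' "$shells"
```

Each command reads the previous command's stdout on its stdin, so no intermediate files are created.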
Examples
Listing Processes by Name
Often, one would like to list information about processes which are running a specific
command. While ps aux tables a lot of information about currently running processes,
the number of processes running on the machine can make the output overwhelming.
The grep command can help simplify the output.
In the following, prince would like to list information about the processes which are
implementing his web server, the httpd command. He lists all processes, but then
reduces the output to only those lines which contain the text httpd.
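The command and its output are not reproduced in this text. A sketch follows; the second form, which wraps the first letter in brackets, is an addition not taken from the workbook.

```shell
# List every process, then keep only the lines that mention httpd.
# grep exits non-zero when nothing matches, hence the "|| true".
ps aux | grep httpd || true

# The pattern [h]ttpd still matches "httpd", but not the text "[h]ttpd"
# that appears on grep's own command line in the process list.
ps aux | grep '[h]ttpd' || true
```

With the first form, the grep process itself often appears in the results, because its own command line contains the text httpd.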
Questions
Chapter 3. Pipes
1 and 2