Lecture 14

- Log into Linux.
- Copy files from /home/hwang/cs375/lecture14
- Project 4 is posted to the course webpage.
- Questions?

Thursday, February 27 CS 375 UNIX System Programming - Lecture 14

Outline

- Introduction to processes
- Creating new processes
- Avoiding zombie processes

Note: this material is covered in both BLP and AUP (as indicated in the syllabus), as well as in the man pages for the various system calls.

Introduction

Suppose we are writing an application that needs to produce a nice graph, convert an image from GIF to PNG format, etc. We could solve this problem in one of the following ways:

- write the required routines from scratch
- find and use library routines that do the work
- use existing programs such as gnuplot or the PNM utilities

The third approach can be very powerful, and it is often the easiest as well. Why not use an existing utility that does what is needed and is known to be efficient and robust?

UNIX allows us to run other programs from our program and to communicate with them using a number of different interprocess communication (IPC) methods. It allows us to do this in a manner that is completely transparent to the user of our program. That is, we can use gnuplot to produce our graphs, but the user does not need to know that.

In BLP Chapter 11 and AUP Chapter 5, we learn to run other programs from our program. In BLP Chapters 13-15 and AUP Chapters 6-8, we will learn how to communicate (pass data) with these other programs.

What is a Process?

A program is a collection of instructions and data kept in a file.

A process is a running program. It consists of segments containing instructions, user data, and system data.
The instruction and user-data segments are initialized from the program. System data includes the current directory, open file descriptors, total CPU time, etc.

Parent and Child Processes

- A new child process is created by the kernel on behalf of a parent process (via fork( )).
- A child inherits most of the system data from the parent. For example, files that are open in the parent are also open in the child.
- Each process has a unique process ID (PID, a positive int). getpid( ) and getppid( ) return the process IDs of the current process and of its parent.
- The init process has PID 1.

Process Groups

- Each process belongs to exactly one process group, which is identified by a process group ID (PGID).
- When a process is created, it is a member of its parent's process group.
- One process is the process group leader. The PGID is the same as the leader's PID.
- Only one process group at a time is the foreground process group of a terminal. The terminal device driver sends tty signals (interrupt, quit, etc.) to each process in the foreground process group.

Additional Process Information

- At the command line, use "ps ajx" to list the PIDs, PPIDs, and PGIDs of all processes.
- A parent may wait for a child to terminate by using the wait( ) or waitpid( ) routines.
- If a child ends before the parent calls wait( ), the child lives on as a "zombie" process until the parent calls wait( ) or ends.
- If a parent ends, a child's parent process ID is set to 1. (The child is adopted by init.)

Process Creation: system( )

The system( ) function (declared in <cstdlib>) can be used to run another program (and thereby create a new process) from an existing process:

    system("program_name");

The parent process waits until the program completes.
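As a minimal sketch of the system( ) call described above, the helper below runs a command and reports its exit status. The name run_command is hypothetical and is not one of the lecture files. The sketch assumes a POSIX system, where the value returned by system( ) can be decoded with the WIFEXITED and WEXITSTATUS macros from <sys/wait.h>.

```cpp
#include <cstdlib>      // std::system
#include <sys/wait.h>   // WIFEXITED, WEXITSTATUS

// Run a shell command with system() and return its exit status,
// or -1 if the command could not be started or did not exit normally.
// run_command is a hypothetical helper name used only for this sketch.
int run_command(const char *command)
{
    int status = std::system(command);  // the caller blocks until the command finishes
    if (status == -1)
        return -1;                      // the child could not be created or waited for
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     // normal termination: extract the exit value
    return -1;                          // e.g. the command was killed by a signal
}
```

A call such as run_command("ls -al /tmp") would list the directory on the terminal and return the exit value of ls.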
(See sys_xmpl.cpp)

The program is run just as if the following had been entered at the command prompt:

    $ sh -c program_name

(system( ) hands the command to /bin/sh, which on many Linux systems is bash or dash.)

Process Creation: fork( )/exec( )

- The system( ) function is built upon the fork( ) and exec( ) routines (both declared in <unistd.h>) and the wait( ) routine (declared in <sys/wait.h>).
- fork( ) creates a child process that is a clone of the parent (identical instruction, user-data, and system-data segments).
- The exec( ) routine reinitializes a process from a designated program (a file on disk).
- fork( ) and exec( ) are usually used together.

The exec( ) System Calls

There are six exec( ) calls. The arguments to execl( ) are a variable-length list ending with a null pointer:

    int execl(const char *path, const char *arg0, ..., (char *)0);

Here is an example:

    execl("/bin/ls", "ls", "-al", "/tmp", (char *)0);

Recall that exec( ) does not create a new process. The code and user-data segments are reinitialized from the indicated program. The system-data segment is not overwritten.

exec( ) is the only way to execute a program under UNIX. (See exc_xmpl.cpp)

The other exec( ) calls are execlp( ), execle( ), execv( ), execvp( ), and execve( ):

- execv*( ) routines expect the arguments to be passed in a null-terminated array (i.e. char *argv[ ]).
- exec*p( ) routines search the directories listed in the PATH environment variable when the program name contains no slash (the others require a pathname to the file).
- exec*e( ) routines take a pointer to a new environment array (the others pass on the caller's current environment).

The fork( ) System Call

If we only had exec( ), there would be only one process running on the system. We could run different programs only by having each program call exec( ) to start the next one.

The fork( ) system call is the only way to create a new process.
The new (child) process's instruction, user-data, and system-data segments are almost exact copies of the old (parent) process's segments. (The PIDs and PPIDs will, of course, be different.)

The child gets copies of the parent's open file descriptors. The file offset is kept in the system file table and is shared by both processes.

When fork( ) returns, the parent and the child get different return values (of type pid_t, defined in <sys/types.h>). The child gets 0, while the parent gets the PID of the child. (If fork( ) fails, no child is created and the parent gets -1.)

Usually the child will do an exec( ), while the parent either waits for the child or goes off to do something else.

For example, the following code:

    cout << "Start of test" << endl;
    pid = fork();
    cout << "fork() returned " << pid << endl;

might display (the order of the last two lines depends on which process executes first):

    Start of test
    fork() returned 0
    fork() returned 17625

See frk_xmpl1.cpp

Examine frk_xmpl2.cpp. How many processes will be created? Can output with count=3 appear before output with count=1?

See vssh_xmpl.cpp for a very simple shell program.

The wait( ) System Call

wait( ) causes the parent to sleep until any child terminates:

    int status;
    switch (fork()) {
    case -1:                      // fork failed, no child exists
        perror("fork");
        break;
    case 0:                       // the child
        execvp(program, args);
        _exit(127);               // reached only if execvp failed
    default:                      // the parent
        pid = wait(&status);
        break;
    }

It is not necessary for a parent to wait. See wait_xmpl.cpp for example code.

Using wait( )

wait( ) returns the child's PID and passes back the child's exit status in the status argument.
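The fork( ), exec( ), and wait( ) pattern above can be sketched as one small helper. This is an illustration under stated assumptions, not code from the lecture files: the name run_and_wait is hypothetical, the exit value 127 for a failed exec is a shell convention rather than a requirement, and the caller is assumed to have no other children, since wait( ) collects whichever child terminates first.

```cpp
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // wait, WIFEXITED, WEXITSTATUS
#include <unistd.h>     // fork, execvp, _exit

// Run `program` with the null-terminated argument array `args` in a child
// process, wait for it, and return its exit value (or -1 on error).
// run_and_wait is a hypothetical helper name used only for this sketch.
int run_and_wait(const char *program, char *const args[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;              // fork failed: no child was created

    if (pid == 0) {             // child: fork() returned 0
        execvp(program, args);  // replaces the child's code and data on success
        _exit(127);             // reached only if execvp failed
    }

    // parent: fork() returned the child's PID
    int status = 0;
    pid_t done = wait(&status); // sleep until a child terminates
    if (done != pid || !WIFEXITED(status))
        return -1;              // unexpected child, or abnormal termination
    return WEXITSTATUS(status);
}
```

The child calls _exit( ) rather than exit( ) so that any output buffered before the fork is not flushed a second time by the child.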
The exit status is encoded and can be decoded using macros in <sys/wait.h>:

    WIFEXITED(status)      Non-zero if normal termination
    WEXITSTATUS(status)    Exit status (normal termination)
    WIFSIGNALED(status)    Non-zero if signal termination
    WTERMSIG(status)       Signal number (if signal termination)
    WIFSTOPPED(status)     Non-zero if stopped

A stopped process is different from a terminated one (it can be restarted).

Using waitpid( )

waitpid( ) allows you to (1) wait on a specific process, (2) check status without blocking, and (3) support job control.

    pid_t waitpid(pid_t pid, int *status, int opts);

- If pid == -1, waitpid( ) waits for any child.
- If pid > 0, waitpid( ) waits for the child with that PID.
- status is defined just as for wait( ).
- If opts is WNOHANG, waitpid( ) checks the status and returns immediately instead of blocking (it returns 0 if no child has changed state yet).

Zombie Processes

Recall that a parent process creates a child process and that the parent can wait for the child to end, receiving the child's termination status (normal or signal) and exit value via the wait( ) system call.

In order for the child to terminate completely, its status information must be collected by a wait( ) call. Until this happens, the child process hangs around, not executing anything, but not dead either (hence the name "zombie").

See zmb_xmpl1.cpp and zmb_xmpl2.cpp

It is not good for a system to accumulate zombie processes. Each one still occupies an entry in the process table and produces no results. This is particularly bad if the parent process is a server daemon that is not expected to exit in the normal case.
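One way a long-running parent might keep zombies from accumulating is to poll with waitpid( ) and WNOHANG from time to time. The sketch below is only an illustration of that idea; the name reap_finished_children is hypothetical, and where and how often a real server would call it is left open.

```cpp
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WNOHANG
#include <unistd.h>     // fork, _exit, usleep (used by callers of this sketch)

// Collect every child that has already terminated, without blocking.
// Returns the number of children reaped by this call.
// reap_finished_children is a hypothetical helper name used only for this sketch.
int reap_finished_children()
{
    int reaped = 0;
    int status = 0;
    // pid == -1 means "any child". With WNOHANG, waitpid returns 0 when
    // children exist but none has terminated yet, and -1 when there are
    // no children at all, so the loop ends in both cases.
    while (waitpid(-1, &status, WNOHANG) > 0)
        ++reaped;
    return reaped;
}
```

A server could call this helper once per pass through its main loop, so that finished children are collected without the server ever blocking in wait( ).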
Avoiding Zombies

If we want to "disown" a child process, we can call fork( ) twice (aka double fork):

    if ((pid = fork()) == 0) {
        // child process
        if ((pid = fork()) == 0) {
            // "grandchild" process
            // its parent becomes init when the child exits
            execvp(program, args);
            _exit(127);    // reached only if execvp failed
        }
        // child process, so exit
        // "grandchild" is adopted by init
        exit(0);
    }
    waitpid(pid, NULL, 0);    // wait for child process
    // we're the parent - go on and do our own thing
    // we don't have to worry about the "grandchild"
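The double fork above can be wrapped in a helper with basic error checks, as in the following sketch. The name spawn_detached and its return convention are hypothetical and not part of the lecture files. The sketch also assumes the orphaned grandchild is adopted by init; on some modern Linux systems a designated "subreaper" process may adopt it instead, which has the same effect for the caller.

```cpp
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>     // fork, execvp, _exit

// Start `program` in a grandchild process that the caller never has to wait for.
// Returns 0 if the intermediate child was created and reaped cleanly, -1 otherwise.
// spawn_detached is a hypothetical helper name used only for this sketch.
int spawn_detached(const char *program, char *const args[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;                          // first fork failed

    if (pid == 0) {                         // child
        pid_t grandchild = fork();
        if (grandchild == 0) {              // grandchild
            execvp(program, args);
            _exit(127);                     // reached only if execvp failed
        }
        // The child exits at once, so the grandchild is orphaned and adopted.
        _exit(grandchild == -1 ? 1 : 0);    // report a failed second fork
    }

    // Parent: the child exits almost immediately, so this wait is short
    // and leaves no zombie behind.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After spawn_detached( ) returns, the caller has no child left to wait for, because the grandchild's exit status is collected by whichever process adopted it.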