CS449, Introduction to Systems Software, Spring 2007
TA: Ricardo Villamarín-Salomón
Lab #10

Part I
Process ID

UNIX identifies processes using a unique integer called the process ID:

- The process that executes the request for creation of a process is called the parent of that process. The created process is called the child.
- The parent process ID identifies the parent of the process.
- Use the functions getpid and getppid to obtain these IDs.

SYNOPSIS

    #include <unistd.h>
    #include <sys/types.h>

    pid_t getpid(void);
    pid_t getppid(void);

Listing 1 below shows an example of their use [1]:

Listing 1

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>

    int main(void) {
        printf("Process ID: %ld\n", (long)getpid());
        printf("Parent process ID: %ld\n", (long)getppid());
        return 0;
    }

[1] You may copy the source file from my directory using (notice the dot at the end):
    $ cp ~rmv4/public/cs0449/r10/r10p1.c .

Part II
Process creation and the UNIX fork

A process (“the parent”) can create a new process (a “child”) by calling fork. This function copies the parent's memory image so that the new process receives a copy of the address space of the parent. Both processes continue at the instruction after the fork statement, each executing in its own memory image.

SYNOPSIS

    #include <unistd.h>

    pid_t fork(void);

The fork function returns 0 to the child but returns the child's process ID to the parent, so that the two processes can tell themselves apart and execute different code. When fork fails, it returns -1 and sets the errno variable [2]. Possible values of errno are:

EAGAIN: The system-imposed limit on the total number of processes under execution by a single user has been exceeded, or the total amount of system memory available is temporarily insufficient to duplicate this process.
ENOMEM: Insufficient storage space is available.

In the program below, both parent and child execute the x = 1 assignment statement after returning from fork.
Listing 2

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int x;

        x = 0;
        fork();
        x = 1;
        printf("I am process %ld and my x is %d\n", (long)getpid(), x);
        return 0;
    }

Before the fork of Listing 2, one process (the parent) executes with a single variable x. After the fork, two independent processes execute, each with its own copy of that variable. Parent and child execute independently, so they do not modify the same memory locations. However, the two processes execute the same instructions, because the code of Listing 2 does not test the return value of fork. Listing 3 shows how to test the return value of fork.

Listing 3 [3]

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>

    int main(void) {
        int x = 0;
        pid_t child_pid;

        child_pid = fork();
        if (child_pid == -1) {
            perror("Failed to fork");
            return 1;
        }
        if (child_pid == 0)   /* child code */
            printf("I am child %ld and my x is: %d\n", (long)getpid(), ++x);
        else                  /* parent code */
            printf("I am parent %ld and my x is: %d\n", (long)getpid(), x+2);
        return 0;
    }

[2] This external variable is set when errors occur, but it is not cleared by calls that succeed.
[3] The routine perror() produces a message on the standard error output, describing the last error encountered during a call to a system or library function. The error number is taken from errno.

What does the program in Listing 4 do?
Listing 4

    /* 1) Create a file called r10p4.c containing this code
     * 2) Compile it: gcc -o procs r10p4.c
     * 3) Run it (you may try other values instead of 5): ./procs 5
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main (int argc, char *argv[]) {
        pid_t childpid = 0;
        int i, n;

        if (argc != 2){   /* check for valid number of command-line arguments */
            fprintf(stderr, "Usage: %s processes\n", argv[0]);
            return 1;
        }
        n = atoi(argv[1]);   /* converts a string to an integer (if possible) */
        for (i = 1; i < n; i++)
            if ((childpid = fork()) <= 0)
                break;
        fprintf(stderr, "i:%d process ID:%ld parent ID:%ld child ID:%ld\n",
                i, (long)getpid(), (long)getppid(), (long)childpid);
        return 0;
    }

The fork function creates a new process by making a copy of the parent's image in memory. The child inherits parent attributes such as environment and privileges. The child also inherits some of the parent's resources such as open files and devices [4].

[4] Not every parent attribute or resource is inherited by the child, but discussion of that is beyond the scope of this recitation.

Part III
The wait system call

The wait function causes the caller (parent) to suspend its execution until a child's status becomes available or until the caller receives a signal. A process status can be available after termination or after the process has been stopped. The waitpid function allows a parent to wait for a particular child. This function also allows a parent to check whether a child has terminated without blocking.

SYNOPSIS

    #include <sys/wait.h>

    pid_t wait(int *stat_loc);
    pid_t waitpid(pid_t pid, int *stat_loc, int options);

The waitpid function takes three parameters: a pid, a pointer to a location for returning the status and a flag specifying options. Possible values of pid are:

-1: wait for any child process (this is the same behavior which wait exhibits).
>0: wait for the child whose process ID is equal to the value of pid.
== 0, < -1: values related to process groups, which are not covered here.

The options parameter of waitpid is the bitwise inclusive OR (a single |) of zero or more flags. The WNOHANG option causes waitpid to return even if the status of a child is not immediately available. The WUNTRACED option causes waitpid to report the status of unreported child processes that have been stopped. Check the man page on waitpid for a complete specification of its parameters.

The status information from the child process is stored in the object that stat_loc points to, unless stat_loc is a null pointer (i.e. == NULL).

The return value is normally the process ID of the child process whose status is reported. If there are child processes but none of them has a status waiting to be reported, waitpid will block until one does. However, if the WNOHANG option was specified, waitpid will return zero instead of blocking.

The function wait is a simplified version of waitpid, and is used to wait until any one child process terminates:

    wait (&status);

is equivalent to:

    waitpid (-1, &status, 0);

If wait or waitpid returns because the status of a child is reported, these functions return the process ID of that child. If an error occurs, these functions return -1 and set errno. If called with the WNOHANG option, waitpid returns 0 to report that there are possible unwaited-for children but that their status is not available. The following table lists the mandatory errors for wait and waitpid.

    errno     Cause
    ECHILD    caller has no unwaited-for children (wait), or
              process or process group specified by pid does not exist (waitpid), or
              process group specified by pid does not have a member that is a
              child of caller (waitpid)
    EINTR     function was interrupted by a signal
    EINVAL    options parameter of waitpid was invalid

What is the output of the program in Listing 5?
Listing 5

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <errno.h>
    #include <sys/wait.h>

    int main(){
        pid_t pid = fork();

        if( pid == 0 ){
            long i;
            int m = 1;
            for( i = 0; i < 90000000; i++)   /* busy loop to keep the child running */
                m *=m;
            printf("I'm the child and i'm done\n");
        }
        else{
            waitpid( -1, NULL, 0);
            printf("I'm the parent and i'm so busy. Sorry i can't wait!\n");
        }
        return 0;
    }

What is the output of the program in Listing 6?

Listing 6

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <errno.h>
    #include <sys/wait.h>

    int main(){
        pid_t pid = fork();

        if( pid == 0 ){
            long i;
            int m = 1;
            for( i = 0; i < 90000000; i++)   /* busy loop to keep the child running */
                m *=m;
            printf("I'm the child and i'm done\n");
        }
        else{
            waitpid( -1, NULL, WNOHANG);
            printf("I'm the parent and I'm so busy. Sorry I can't wait!\n");
        }
        return 0;
    }

Part IV
The exec functions

The fork function creates a copy of the calling process, but many applications require the child process to execute code that is different from that of the parent. The exec family of functions replaces the current process image with a new process image. The traditional way to use the fork/exec combination is for the child to execute the new program (with a call to exec) while the parent continues to execute the original code.

SYNOPSIS

    #include <unistd.h>

    int execl(const char *path, const char *arg0, ... /*, (char *)0 */);
    int execle(const char *path, const char *arg0, ... /*, (char *)0, char *const envp[] */);
    int execlp(const char *file, const char *arg0, ... /*, (char *)0 */);
    int execv(const char *path, char *const argv[]);
    int execve(const char *path, char *const argv[], char *const envp[]);
    int execvp(const char *file, char *const argv[]);

If unsuccessful, all exec functions return -1 and set errno. If any of these functions returns at all, the call was unsuccessful. The following table lists the mandatory errors for the exec functions.
    errno          Cause
    E2BIG          size of new process's argument list and environment list is
                   greater than system-imposed limit of ARG_MAX bytes
    EACCES         search permission on directory in path prefix of new process
                   is denied, new process image file execution permission is
                   denied, or new process image file is not a regular file and
                   cannot be executed
    EINVAL         new process image file has appropriate permission and is in a
                   recognizable executable binary format, but system cannot
                   execute files with this format
    ELOOP          a loop exists in resolution of path or file argument
    ENAMETOOLONG   the length of path or file exceeds PATH_MAX, or a pathname
                   component is longer than NAME_MAX
    ENOENT         component of path or file does not name an existing file, or
                   path or file is an empty string
    ENOEXEC        image file has appropriate access permission but has an
                   unrecognized format (does not apply to execlp or execvp)
    ENOTDIR        a component of the image file path prefix is not a directory

The six variations of the exec function differ in the way that command-line arguments and the environment are passed. They also differ in whether a full pathname must be given for the executable (execlp() and execvp() do not require a full pathname).

The execl family (execl, execlp and execle) passes the command-line arguments in an explicit list and is useful if you know the number of command-line arguments at compile time.

The program in Listing 7 calls the ls shell command with a command-line argument of -l. The program assumes that ls is located in the /bin directory. The execl function uses its character-string parameters to construct an argv array for the command to be executed. Since argv[0] is the program name, it is the second argument of execl. Notice that the first argument of execl, the pathname of the command, also includes the name of the executable. The path parameter to execl is the pathname of a process image file specified either as a fully qualified pathname or relative to the current directory.
The individual command-line arguments are then listed, followed by a (char *)0 pointer (a NULL pointer).

Listing 7

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t childpid;

        childpid = fork();
        if (childpid == -1) {
            perror("Failed to fork");
            return 1;
        }
        if (childpid == 0) {   /* child code */
            execl("/bin/ls", "ls", "-l", (char *)0);
            perror("Child failed to exec ls");   /* reached only if execl fails */
            return 1;
        }
        if (childpid != wait(NULL)) {   /* parent code */
            perror("Parent failed to wait due to signal or error");
            return 1;
        }
        return 0;
    }

The execv family (execv, execvp and execve) passes the command-line arguments in an argument array. Each argi parameter (arg0, arg1, ...) represents a pointer to a string, and argv and envp represent NULL-terminated arrays of pointers to strings. See Listing 8 below.

On UNIX systems, the main() function in C can take an optional third argument containing the environment. This is an array of strings whose last element is NULL. Each string has the form "VAR=VAL", where VAR is the variable name and VAL is the value of VAR. The program in Listing 9 prints all the environment variables and their values. Create, compile and run it to see what is passed in the third parameter of main. The functions execle() and execve() allow you to pass this third parameter to a program.
Listing 8

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t childpid;
        char *arguments[4];

        arguments[0] = "/bin/ls";
        arguments[1] = "-l";
        arguments[2] = "/etc";   /* it is a specific directory */
        arguments[3] = NULL;

        childpid = fork();
        if (childpid == -1) {
            perror("Failed to fork");
            return 1;
        }
        if (childpid == 0) {   /* child code */
            execv("/bin/ls", arguments);
            perror("Child failed to exec ls");
            return 1;
        }
        if (childpid != wait(NULL)) {   /* parent code */
            perror("Parent failed to wait due to signal or error");
            return 1;
        }
        return 0;
    }

Listing 9

    #include <stdio.h>

    int main( int argc, char *argv[], char *envp[]) {
        int i;

        for (i=0; envp[i]!= NULL; i++)
            printf("Envp[%d] = %s\n", i, envp[i]);
        return 0;
    }

References

1. Kay A. Robbins, Steven Robbins. Practical UNIX Programming. Prentice Hall PTR, 1st edition.
2. Chee Yap. C Notes: Quick Reference. http://cs.nyu.edu/~yap/prog/c/#main