Shell (Part 1)

Process
• A process is an instance of an application running
• If there are two instances of an application running then there are two processes
• Example: Let us say there are two users on gaul
    • Both run grep on a file
    • This results in two processes
• A process is more than the program code (sometimes known as the text section)
• Process information includes:
    • Process stack: Includes temporary data (function parameters, return addresses and local variables)
    • Data section: Includes global variables
    • Heap: Memory that is dynamically allocated during process run-time

[Figure: Process in Memory]

Process State
• As a process executes it changes state
• Possible states:
    • New: The process is being created
    • Running: Instructions are being executed
    • Waiting: The process is waiting for some event to occur, e.g., I/O completion
    • Terminated: The process has finished execution

Process Control Block
• Each process is represented in the operating system by a process control block
• It holds information related to program execution, i.e., context information including:
    • Process identifier (PID)
    • Process state
    • Program counter and other program-related information
    • CPU registers
    • CPU-scheduling information
    • Memory-management information
    • I/O status information

Operating System Services
• An operating system provides an environment for the execution of programs
• Services are provided to programs and users of programs
• OS services (user perspective):
    • User interface: Command-Line Interface (CLI), Graphical User Interface (GUI)
    • Program execution: The OS needs to load a program into memory, run that program, and end its execution, either normally or abnormally (indicating an error)
    • I/O operations: A running program may require I/O, which may involve a file or an I/O device
    • File-system manipulation: Programs need to read and write files and directories, create and delete them, and search them
• The OS also provides operations for ensuring efficient operation.
• Examples: resource allocation, accounting, protection and security

The Shell
• The command interpreter is a program that accepts commands from the user and interprets them (a type of user interface)
• The shell refers to an executing command interpreter
    • Accepts commands such as "ls" or "ps"
    • The commands make use of system calls
    • A system call allows a command to request a service from the operating system
• A shell is a process, i.e., a program being executed
    • We will overuse the term "shell" to refer to the program as well as the process
• There are different shells that you can use in a Unix-based system including:
    • Bourne shell
    • C shell
    • bash shell
    • tcsh shell
    • and many more
• When a user logs in, a shell is started up
• The shell has the terminal as standard input and standard output
• The shell starts out by printing the prompt, a character such as a dollar sign or percent sign, e.g., hanan%
• The user enters a command, e.g., date
• The user can specify that standard output be redirected to a file, e.g.,
    date > file
• Standard input can be redirected, e.g.,
    sort < file1 > file2
• The output of one program can be used as the input to another program:
    cat file1 file2 file3 | sort >/dev/lp &

High-Level View of Shell Code
    while (1) {
        get a line from the user
        execute the command found in the line
    }

Details Not Highlighted
• Making sure that the line from the user is correct
• How is shell termination handled?
• The execution of a command is done by a separate process (child process) from the shell process
• For simple commands, the shell process waits for the child process to terminate so that it can print the prompt
• If a child process is put in the background (using &) then the shell process can continue without waiting for the child process to terminate

The Concept of Fork
• The Unix system call for process creation is called fork()
• The fork system call creates a child process that is a duplicate of the parent
• Child inherits state from parent process
    • Same program instructions, variables have the same values, same position in the code
• Parent and child have separate copies of that state
• Child has the same open file descriptors as the parent
    • Parent and child file descriptors point to a common entry in the system open file descriptor table
    • More on this later

[Figure: fork() as a diagram. In the parent, pid = fork() returns a new PID, e.g. pid == 56; in the child, pid == 0. The program is shared; the data is copied.]

Process Creation Using Fork
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main()
    {
        pid_t pid;
        int status = 0;

        pid = fork();
        if (pid < 0) {            /* fork failed: no child was created */
            perror("fork()");
            return 1;
        }
        if (pid > 0) {            /* parent: fork returned the child's PID */
            printf("I am parent\n");
            wait(&status);        /* block until the child terminates */
        } else {                  /* child: pid is zero */
            printf("I am child\n");
        }
        return 0;
    }

• The fork system call returns twice: it returns zero to the child and the child's process ID (pid) to the parent
• The perror function writes a message to standard error describing the last error encountered during a call to a system or library function (see the man page)
• The wait function blocks the parent process until the child terminates; it does not terminate the parent
• In the else branch pid is zero, which indicates the child process

Fork System Call
• If fork() succeeds it returns the child PID to the parent and returns 0 to the child
• If fork() fails, it returns -1 to the parent (no child is created) and sets errno
• A program almost always uses this difference to do different things in the parent and child processes
• Failure occurs, for example, when the limit on the number of processes that can be created is reached
• The pid_t data type represents process identifiers
• Other calls:
    • pid_t getpid(): returns the PID of the calling process
    • pid_t getppid(): returns the PID of the parent process

fork() Example 1
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main()
    {
        pid_t pid;
        int i, sum = 0;

        pid = fork();
        if (pid > 0) {            /* parent */
            for (i = 0; i < 10; i++)
                sum = sum + i;
            printf("parent: sum is %d\n", sum);
            wait(0);
        } else if (pid == 0) {    /* child */
            for (i = 0; i < 10; i++)
                sum = sum - i;
            printf("child: sum is %d\n", sum);
        }
        return 0;
    }

• What is the value of sum in the parent and child processes after pid = fork()?
    0
• What is the value of sum in the parent and child processes at the print statements?
    parent: sum is 45
    child: sum is -45
• Remember that sum was 0 before the fork took place
• When the fork took place the process was duplicated, which means that a copy is made of each variable; sum was duplicated
• Since sum was 0 just before the fork, sum is 0 right after the fork in both the child and parent processes

fork() Example 2
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main()
    {
        pid_t pid;
        int i;

        pid = fork();
        if (pid > 0) {            /* parent */
            for (i = 0; i < 1000; i++)
                printf("\t\t\tPARENT %d\n", i);
            wait(0);
        } else if (pid == 0) {    /* child */
            for (i = 0; i < 1000; i++)
                printf("CHILD %d\n", i);
        }
        return 0;
    }

• What is the possible output?

fork() Example 2: Possible Output
[Figure: two possible outputs, shown for the first ten iterations only. In the first, PARENT 0 through PARENT 9 are printed, followed by CHILD 0 through CHILD 9. In the second, the two processes are interleaved: PARENT 0 to PARENT 6 come first, and the lines CHILD 0 to CHILD 9 and PARENT 7 to PARENT 9 are mixed after them.]
• Lots of possible outputs!!

Execution
• Processes get a share of the CPU before giving it up to give another process a turn
• The switching between the parent and child depends on many factors: machine load, system process scheduling
• Output interleaving is nondeterministic
    • Cannot determine the output by looking at the code

How Many Processes Are Created by This Program?
    #include <stdio.h>
    #include <unistd.h>

    int main()
    {
        fork();
        fork();
        fork();
    }

Process Structure?
• What does the process structure look like?
• Is this a tree or a chain of processes?

    pid_t childpid = 0;
    for (i = 1; i < n; i++)
        if ((childpid = fork()) == 0)
            break;

Process Structure?
• What does the process structure look like?
• Is this a tree or a chain of processes?

    pid_t childpid = 0;
    for (i = 1; i < n; i++)
        if ((childpid = fork()) > 0)
            break;

Process File Descriptor Table
• Every process has a process file descriptor table
• Each entry represents something that can be read from or written to, e.g.,
    • file
    • screen
    • pipe (later)

System File Descriptor Table
• The OS maintains a system file descriptor table in the OS's memory space
    • Every open file in the system has an entry in this table
    • One file can be associated with many entries in this table (if opened by many processes)
    • These entries contain among other things:
        • the file descriptor's permissions
        • # links
        • the file offset, which is moved by read and write
    • An entry is removed when 0 links point to it

OS File Structures
[Figure: the parent's file descriptor table has entries 0 to 4, holding stdin, stdout, stderr and in_file; the in_file entry points to an entry in the system file table that holds file info, e.g., the read offset.]
• Assume that there was something like this in a program:

    FILE *in_file;
    in_file = fopen("list.txt", "r");

Fork and Files
• In a fork what gets copied is the file descriptor table
• Thus the parent and child point to the same entry in the system file table

[Figure: after the fork, the parent and child file descriptor tables each have entries 0 to 4, holding stdin, stdout, stderr and in_file; the in_file entries of both tables point to the same system file table entry (file info, e.g., read offset).]

• Open a file before a fork
    • The child process gets a copy of its parent's file descriptor table
    • The child and parent share a file offset pointer because the open came before the fork
• Open a file after a fork
    • Assume that parent and child each open a file after the fork
    • They get their own entries in the system file descriptor table
    • This implies that the file position information is different

Question
• Suppose that foobar.txt consists of the 6 ASCII characters foobar. Then what is the output of the following program?

    #include <stdio.h>

    int main()
    {
        FILE *fd1, *fd2;
        char c;

        fd1 = fopen("foobar.txt", "r");
        fd2 = fopen("foobar.txt", "r");
        fscanf(fd1, "%c", &c);
        fscanf(fd2, "%c", &c);
        printf("c = %c\n", c);
        return 0;
    }

Answer
The descriptors fd1 and fd2 each have their own open file table entry, so each descriptor has its own file position for foobar.txt.
Thus, the read from fd2 reads the first byte of foobar.txt, and the output is c = f and not c = o

Question
    #include <stdio.h>

    int main()
    {
        FILE *fd1, *fd2;

        fd1 = fopen("foo.txt", "w");
        fd2 = fopen("foo.txt", "w");
        fprintf(fd1, "%s", "Hanan");
        fprintf(fd2, "%s", "Lutfiyya");
        return 0;
    }

Question
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main()
    {
        FILE *fd1, *fd2;
        char c;
        pid_t pid;

        fd1 = fopen("foobar.txt", "r");
        fd2 = fopen("foobar.txt", "r");
        pid = fork();
        if (pid > 0) {
            fscanf(fd1, "%c", &c);
            printf("parent: c = %c\n", c);
        } else if (pid == 0) {
            fscanf(fd2, "%c", &c);
            printf("child: c = %c\n", c);
        }
        return 0;
    }

Fork and Files
• Be careful
• It is much better to open files after forks
• Even so you need to be careful if there is writing
    • This often requires that a process has mutually exclusive access during the write (more later in the course)

Wait
• Parent waits for a child (system call)
    • Blocks until a child terminates
    • Returns the pid of the child process
    • Returns -1 if the caller has no children left to wait for
    • status receives the child's termination status

    #include <sys/types.h>
    #include <sys/wait.h>
    pid_t wait(int *status);

• Parent waits for a specific child to terminate

    pid_t waitpid(pid_t pid, int *status, int options);

Process Creation Using Fork
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main()
    {
        pid_t pid;
        int status = 0;

        pid = fork();
        if (pid < 0) {            /* fork failed */
            perror("fork()");
            return 1;
        }
        if (pid > 0) {            /* parent */
            printf("I am parent\n");
            pid = wait(&status);
        } else {                  /* child */
            printf("I am child\n");
            exit(status);
        }
        return 0;
    }

• The fork syscall returns twice: it returns zero to the child and the child's process ID (pid) to the parent
• The parent uses wait to sleep until the child exits; wait returns the child's pid and status
• Wait variants allow waiting on a specific child, or notification of stops and other signals

More about Process Operations
• In Unix-based systems, a hierarchy of processes is formed
• In Unix, we can obtain a listing of processes by using the ps command
    • ps -el will list complete information for all processes

Exec
• The term exec refers to a family of functions, each of which replaces the program of the calling process with a newly loaded program
• A call to a function from the exec family loads a binary file into memory (destroying the memory image of the program calling it)
• The new program starts executing from the beginning (where main begins)
• On success, exec never returns; on failure, exec returns -1
• The different versions differ primarily in the way parameters are passed
• The exec family consists of these functions: execvp, execlp, execv, execve, execl, execle
    • Functions with p in their name (execvp, execlp) search for the program in the directories listed in the PATH environment variable; functions without p must be given the path of the program file
    • Functions with v in their name (execv, execvp, execve) take the arguments as an array; functions with l (execl, execlp, execle) take them as a list in the call
    • Functions with e accept an array of environment variables

Versions of exec
• Versions of exec offered by the C library:

    int execl(const char *path, const char *arg, ...);
    int execlp(const char *file, const char *arg, ...);
    int execle(const char *path, const char *arg, ..., char *const envp[]);
    int execv(const char *path, char *const argv[]);
    int execvp(const char *file, char *const argv[]);
    int execve(const char *filename, char *const argv[], char *const envp[]);

Exec Example
• Program A:

    int i = 5;
    printf("%d\n", i);
    execl("B", "B", NULL);
    printf("%d\n", i);

• Program B:

    main()
    {
        printf("hello\n");
    }

• What is the output of program A?
    5
    hello
• Why is it not this?
    5
    hello
    5
• The execl command replaces the instructions in the process with instructions for program B
• It starts at the first instruction (starts at main)

Exec Example
• Program A:

    char *prog_argv[2];
    int i = 5;
    prog_argv[0] = "B";
    prog_argv[1] = NULL;
    printf("%d\n", i);
    execv(prog_argv[0], prog_argv);
    printf("%d\n", i);

• Program B:

    main()
    {
        printf("hello\n");
    }

• Same functionality as the program on the previous slide
• Used execv instead of execl
    • execv uses an array to pass arguments
    • execl uses a list to pass arguments

Exec Example
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int i = 5;
        int status;

        status = execlp("ls", "ls", "-l", "a.c", NULL);
        if (status != 0) {        /* reached only if execlp failed */
            perror("you goofed");
            printf("errno is %d\n", errno);
        }
        printf("%d\n", i);
    }

• In this example, note that the command is ls -l a.c
• Each argument is in the list
• Question: What would cause the perror function to be executed?

Exec Example
    int main(int argc, char *argv[])
    {
        char *prog1_argv[4];
        int i = 5;

        prog1_argv[0] = "ls";
        prog1_argv[1] = "-l";
        prog1_argv[2] = "a.c";
        prog1_argv[3] = NULL;
        execvp(prog1_argv[0], prog1_argv);
        …
        printf("%d\n", i);
    }

• Same example as that on the previous slide, but execvp is used, which requires an array

Fork and Exec
• Child process may choose to execute some other program than the parent by using one of the exec calls
• Exec overlays a new program on the existing process
• Child will not return to the old program unless exec fails. This is an important point to remember.
• File descriptors are preserved

[Figure: fork() and execv(). The initial process calls fork, which returns a new PID to the original process, and the original process continues. In the new copy of the parent, fork() returns pid=0 and the child runs as a clone of the parent until execv(new_program, argv[]) is called, at which point new_program replaces it.]

Example
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main()
    {
        pid_t pid;

        pid = fork();
        if (pid < 0) {
            perror("fork()");
        } else if (pid > 0) {     /* parent: wait for the child to finish */
            wait(NULL);
            printf("Child Complete\n");
        } else {                  /* child: replace itself with ls -l a.c */
            execlp("ls", "ls", "-l", "a.c", NULL);
        }
        return 0;
    }

Exec and Shell
• As we can see from the example on the previous slide, a process, which can be a shell, can fork a child and the child can execute a command using exec