Programming Interface Main Points • Process Concept • Creating and managing processes – fork, exec, wait • Communicating between processes • Example: implementing a shell Objectives • To introduce the notion of a process -a program in execution, which forms the basis of all computation • To describe the various features of processes, including scheduling, creation and termination, and communication Process vs. Program • Review: – Program • Process is an instance of an executing program. • Relationship: – Program can create many processes – Many processes can be running the same program Process Concept • An operating system executes a variety of programs: – Batch system – jobs – Time-shared systems – user programs or tasks • Textbook uses the terms job and process almost interchangeably Operating System View of Process • At start of process, the Operating System’s duties include: – Load program into memory – Allocate memory for program data – Set up kernel bookkeeping: Process ID Priority User IDs Process State Address of executing instruction Address of return instruction Typical Process Layout • Libraries provide the glue between user processes and the OS – libc linked in with all C programs – Provides printf, malloc, and a whole slew of other routines necessary for programs Activation Records Stack OBJECT1 OBJECT2 Heap HELLO WORLD GO BIG RED CS! Data printf(char * fmt, …) { create the string to be printed SYSCALL 80 } malloc() { … } strcmp() { … } Library Text main() { printf (“HELLO WORLD”); printf(“GO BIG RED CS”); ! Program Stack and Stack Frames square (int x) { return x*x } doCalc(int val) { printf (“square is %d\n”, square(val) } main (int x, int y) { key = 9999 doCalc (key) } STACK Frames for C run-time start up functions Frame for main Frame for doCalc Frame for square Process in Memory Full System Layout • The OS is omnipresent and steps in where necessary to aid application execution Kernel Activation Records USER OBJECT1 OBJECT2 LINUX syscall_entry_point() { … } – Typically resides in high memory • When an application needs to perform a privileged operation, it needs to invoke the OS OS Stack OS Heap OS Data OS Text Activation Records OBJECT1 OBJECT2 HELLO WORLD GO BIG RED CS! printf(char * fmt, …) { Stack Heap Data Library main() { … } Program Process State • As a process executes, it changes state – new: The process is being created – running: Instructions are being executed – waiting: The process is waiting for some event to occur – ready: The process is waiting to be assigned to a processor – terminated: The process has finished execution Diagram of Process State Process Creation Parent process create children processes, which, in turn create other processes, forming a tree of processes UNIX examples fork system call creates new process Child process is almost an exact duplicate of parent process exec system call used after a fork to replace the process’ memory space with a new program Process Creation (Cont) • Generally, process identified and managed via a process identifier (pid) • Resource sharing – Parent and children share all resources – Children share subset of parent’s resources – Parent and child share no resources • Execution – Parent and children execute concurrently – Parent waits until children terminate • Address space – Child duplicate of parent – Child has a program loaded into it Process Termination • Process executes last statement and asks the operating system to delete it (exit) – Output data from child to parent (via wait) – Process’ resources are deallocated by operating system • Parent may terminate execution of children processes (abort) – Child has exceeded allocated resources – Task assigned to child is no longer required – If parent is exiting • Some operating system do not allow child to continue if its parent terminates – All children terminated - cascading termination Process Termination Parent Process It is useful for the parent to know how and when the child terminates Waiting on a child process: Wait() system call waits for one (any one) of the child processes to end Returns the process ID (PID) of the child being terminated Returns status of the child Kernel adds the process CPU times and resource statistic to the running total for all children of the parent process Releases resources held by child to system pool SIGCHLD signal Q.E.D. Orphans and Zombies Orphan Parents ends before Child process Who becomes the parent? INIT, the ancestor of all processes. INIT will periodically clean up orphans Zombie Child terminates before the parent performs wait() Kernel turns process into ZOMBIE Most resources held by child are returned to the system pool Un-killable processes – Are waiting for their parent to issue wait() Wait for parents to exit when they will become orphans Big problem for daemons. Process Control Block (PCB) Bookkeeping/Management of a process Includes: • • • • • • • • • Process ID Program Counter CPU Registers CPU scheduling information Priority Process state Memory management information Accounting information List of I/O devices allocated to process Process Control Block (PCB) UNIX Process Management • UNIX fork – system call to create a copy of the current process, and start it running – No arguments! • UNIX exec – system call to change the program being run by the current process • UNIX wait – system call to wait for a process to finish • UNIX signal – system call to send a notification to another process Process Creation Parent my perform other actions here Child statuses passed to parent Memory of parent copied to child Kernel restarts parent C Program Forking Separate Process int main() { pid_t pid; /* fork another process */ pid = fork(); if (pid < 0) { /* error occurred */ fprintf(stderr, "Fork Failed"); exit(-1); } else if (pid == 0) { /* child process */ execlp("/bin/ls", "ls", NULL); } else { /* parent process */ /* parent will wait for the child to complete */ wait (NULL); printf ("Child Complete"); exit(0); } } UNIX Process Management Question: What does this code print? int child_pid = fork(); if (child_pid == 0) { // I'm the child process printf("I am process #%d\n", getpid()); return 0; } else { // I'm the parent process printf("I am parent of process #%d\n", child_pid); return 0; } Implementing UNIX fork Steps to implement UNIX fork – Create and initialize the process control block (PCB) in the kernel – Create a new address space – Initialize the address space with a copy of the entire contents of the address space of the parent – Inherit the execution context of the parent (e.g., any open files) – Inform the scheduler that the new process is ready to run Implementing UNIX exec • Steps to implement UNIX fork – Load the program into the current address space – Copy arguments into memory in the address space – Initialize the hardware context to start execution at ``start'' Windows CreateProcess • System call to create a new process to run a program – Create and initialize the process control block (PCB) in the kernel – Create and initialize a new address space – Load the program into the address space – Copy arguments into memory in the address space – Initialize the hardware context to start execution at ``start'’ – Inform the scheduler that the new process is ready to run Windows CreateProcess API (simplified) if (!CreateProcess( NULL, // No module name (use command line) argv[1], // Command line NULL, // Process handle not inheritable NULL, // Thread handle not inheritable FALSE, // Set handle inheritance to FALSE 0, // No creation flags NULL, // Use parent's environment block NULL, // Use parent's starting directory &si, // Pointer to STARTUPINFO structure &pi ) // Pointer to PROCESS_INFORMATION structure ) Context Switch • When CPU switches to another process, the system must save the state of the old process and load the saved state for the new process via a context switch • Context of a process represented in the PCB • Context-switch time is overhead; the system does no useful work while switching • Time dependent on hardware support CPU Switch From Process to Process CPU Switch From Process to Process Process Scheduling • Multiprogramming • To have CPU always active, we switch between processes. • Process Scheduler – Select an available process from a set of processes for program execution. – May force a process to become idle, so another can run. Process Scheduling Queues • Job queue – set of all processes in the system • Ready queue – set of all processes residing in main memory, ready and waiting to execute • Device queues – set of processes waiting for an I/O device • Processes migrate among the various queues Representation of Process Scheduling Interprocess Communication • Processes within a system may be independent or cooperating • Cooperating process can affect or be affected by other processes, including sharing data • Reasons for cooperating processes: – – – – Information sharing Computation speedup Modularity Convenience • Cooperating processes need interprocess communication (IPC) • Two models of IPC – Shared memory – Data Transfer Cooperating Processes • Independent process cannot affect or be affected by the execution of another process • Cooperating process can affect or be affected by the execution of another process • Advantages of process cooperation – Information sharing – Computation speed-up – Modularity – Convenience Communications Models Message Passing Shared Memory Producer-Consumer Problem • Paradigm for cooperating processes, producer process produces information that is consumed by a consumer process IPC Data Transfer – Message Passing • Mechanism for processes to communicate and to synchronize their actions • Message system – processes communicate with each other without resorting to shared variables • IPC facility provides two operations: – send(message) – message size fixed or variable – receive(message) • If P and Q wish to communicate, they need to: – establish a communication link between them – exchange messages via send/receive • Implementation of communication link – physical (e.g., shared memory, hardware bus) – logical (e.g., logical properties) Implementation Questions • How are links established? • Can a link be associated with more than two processes? • How many links can there be between every pair of communicating processes? • What is the capacity of a link? • Is the size of a message that the link can accommodate fixed or variable? • Is a link unidirectional or bi-directional? Direct Communication • Processes must name each other explicitly: – send (P, message) – send a message to process P – receive(Q, message) – receive a message from process Q • Properties of communication link – Links are established automatically – A link is associated with exactly one pair of communicating processes – Between each pair there exists exactly one link – The link may be unidirectional, but is usually bidirectional Indirect Communication • Messages are directed and received from mailboxes (also referred to as ports) – Each mailbox has a unique id – Processes can communicate only if they share a mailbox • Properties of communication link – Link established only if processes share a common mailbox – A link may be associated with many processes – Each pair of processes may share several communication links – Link may be unidirectional or bi-directional Indirect Communication • Operations – create a new mailbox – send and receive messages through mailbox – destroy a mailbox • Primitives are defined as: send(A, message) – send a message to mailbox A receive(A, message) – receive a message from mailbox A Indirect Communication • Mailbox sharing – P1, P2, and P3 share mailbox A – P1, sends; P2 and P3 receive – Who gets the message? • Solutions – Allow a link to be associated with at most two processes – Allow only one process at a time to execute a receive operation – Allow the system to select arbitrarily the receiver. Sender is notified who the receiver was. Synchronization • Message passing may be either blocking or nonblocking • Blocking is considered synchronous – Blocking send has the sender block until the message is received – Blocking receive has the receiver block until a message is available • Non-blocking is considered asynchronous – Non-blocking send has the sender send the message and continue – Non-blocking receive has the receiver receive a valid message or null Buffering • Queue of messages attached to the link; implemented in one of three ways 1.Zero capacity – 0 messages Sender must wait for receiver (rendezvous) 2.Bounded capacity – finite length of n messages Sender must wait if link full 3.Unbounded capacity – infinite length Sender never waits Examples of IPC System PIPES Data streaming Direct communication Ordinary pipes Producer – Consumer fashion Cannot be accessed outside the process that creates it. Usually Parent – Child communication Deleted when process terminates Named pipes Alias: FIFO in unix Appear as typical files in the system Examples of IPC Systems Shared Memory • POSIX Shared Memory – Process first creates shared memory segment segment id = shmget(IPC PRIVATE, size, S IRUSR | S IWUSR); – Process wanting access to that shared memory must attach to it shared memory = (char *) shmat(id, NULL, 0); – Now the process could write to the shared memory sprintf(shared memory, "Writing to shared memory"); – When done a process can detach the shared memory from its address space shmdt(shared memory); Examples of IPC Systems - Mach • Mach communication is message based – Even system calls are messages – Each task gets two mailboxes at creation- Kernel and Notify – Only three system calls needed for message transfer msg_send(), msg_receive(), msg_rpc() – Mailboxes needed for commuication, created via port_allocate() Examples of IPC Systems – Windows XP • Message-passing centric via local procedure call (LPC) facility – Only works between processes on the same system – Uses ports (like mailboxes) to establish and maintain communication channels – Communication works as follows: • The client opens a handle to the subsystem’s connection port object • The client sends a connection request • The server creates two private communication ports and returns the handle to one of them to the client • The client and server use the corresponding port handle to send messages or callbacks and to listen for replies Examples of IPC Systems - POSIX POSIX Message Passing Processes can exchange messages by using four system calls: msgget(mailbox_name, IPC_CREAT) Converts a mailbox name to a message queue ID (msqid). It will create the mailbox if necessary. Returns the msqid. msgsnd(msqid, message@, message_size) Sends the message to the mailbox msgrcv( msqid, message@, message_size, priority, synch/asynch) Receives the message from the mailbox msgctl(msqid, IPC_RMID, dummyParam@) Release the mailbox from process resources Communications in Client-Server Systems • Sockets • Remote Procedure Calls Sockets • A socket is defined as an endpoint for communication • Concatenation of IP address and port • The socket 161.25.19.8:1625 refers to port 1625 on host 161.25.19.8 • Communication consists between a pair of sockets Socket Communication Socket Communication Application Application Socket Transport layer Socket Transport layer Network layer Network layer Link Layer Link Layer Physical Layer Physical Layer Remote Procedure Calls • Remote procedure call (RPC) abstracts procedure calls between processes on networked systems • Stubs – client-side proxy for the actual procedure on the server • The client-side stub locates the server and marshalls the parameters • The server-side stub receives this message, unpacks the marshalled parameters, and peforms the procedure on the server Execution of RPC System Call : fork() • Create a child process • Returns to both the child and parent System Call : exec() • Run application in current process exec(prog, args) Execl( pathname, arg[0], arg[1]…. 0) Excelp(arg[0], arg[1]…. 0) http://www.yolinux.com/TUTORIALS/ForkExecPr ocesses.html System Call: wait() • Pause until child process has exited wait(): Blocks calling process until the child process terminates. If child process has already teminated, the wait() call returns immediately. if the calling process has multiple child processes, the function returns when one returns. waitpid(): Options available to block calling process for a particular child process not the first one • http://www.yolinux.com/TUTORIALS/ForkExecProcesse s.html System call : system() • Invokes the command processor to execute a command. • Uses fork() exec() and wait() • The call "blocks" and waits for the task to be performed before continuing. system(“ls –l”); UNIX Signal • Facility for one process to send another instant notification • Send an interrupt to a process signal(processID, type) • Signal Handling example – http://www.yolinux.com/TUTORIALS/C++Signals.h tml System Call : dup2() • Replace the tofileDesc file descriptor with a copy of the fromfiledesc file descriptor. Used for replacing stdin or stdout or both in a child process dup2(fromFileDesc, toFileDesc) • Example: http://www.cs.loyola.edu/~jglenn/702/S2005/Exam ples/pipe.html Shell • A shell is a job control system – Allows programmer to create and manage a set of programs to do some task – Windows, MacOS, Linux all have shells • Example: to compile a C program cc –c sourcefile1.c cc –c sourcefile2.c ln –o program sourcefile1.o sourcefile2.o Implementing a Shell char *prog, **args; int child_pid; // Read and parse the input a line at a time while (readAndParseCmdLine(&prog, &args)) { child_pid = fork(); // create a child process if (child_pid == 0) { exec(prog, args); // I'm the child process. Run program // NOT REACHED } else { wait(child_pid); // I'm the parent, wait for child return 0; } } POSIX • POSIX, an acronym for "Portable Operating System Interface", is a family of standards specified by the IEEE for maintaining compatibility between operating systems. POSIX defines the application programming interface (API), along with command line shells and utility interfaces, for software compatibility with variants of Unix and other operating systems. http://en.wikipedia.org/wiki/POSIX