The ‘process’ abstraction An introductory exploration of ‘process management’ as conducted by Linux ‘program’ versus ‘process’ • A ‘program’ is a sequence of instructions • It’s an algorithm, expressed in a computer language, but it does nothing on its own • A ‘process’ is a succession of actions that are performed when a program executes • It’s a dynamically changing entity, which has an existence over time, but it requires a processor and it uses various resources Process management • Some operating systems were designed to manage only one process at a time (e.g. CP/M, PC-DOS) • But most modern systems are designed to manage multiple processes concurrently (e.g., Windows, UNIX/Linux) • Not surprisingly, these latter systems are much more complex: they must cope with issues of ‘cooperation’ and ‘competition’ Tracking a ‘process’ • As a process-manager, the OS must keep track of each process that is currently in existence, to facilitate cooperation among them and to mediate competing demands • It uses some special data-structures to do this (with assistance from a timing-device) • The various processes ‘take turns’ using the processor’s time, and share access to the system’s peripheral hardware devices The process control block • The OS kernel creates a data-structure for each current process, known as a process control block or task record or ‘task_struct’ • In Linux, these structures are all arranged in a doubly-linked list (i.e., the ‘task list’) task list task struct task struct task struct task struct Several dozen fields • Dozens of separate items of information are kept in a Linux ‘task_struct’; e.g.: – pid; – state; – priority; – start_time; – sleep_time; – ... // process ID-number // current task-state // current task-priority // time when task was created // time when task began sleeping // … many more items … • This info is used by the Linux ‘scheduler’ The ‘states’ of a Linux process • A process can be in one of several states: – 0: TASK_RUNNING – 1: TASK_INTERRUPRIBLE – 2: TASK_UNINTERRUPTIBLE – 4: TASK_ZOMBIE – 8: TASK_STOPPED state-transition diagram UNINTERRUPTIBLE INTERRUPTIBLE RUNNING event create schedule signal wait preempt signal STOPPED CPU terminate ZOMBIE A process needs ‘resources’ • Each task may acquire ownership of some system resources while it is executing • For example, a task needs to use some system memory (code, data, heap, stack) • And a task typically needs to read or write to some disk-files or to peripheral devices • Usually a task will want to call subroutines which belong to a shared runtime library • Every task needs to use some CPU time The OS is a ‘resource manager’ • Sometimes a task is temporarily granted ‘exclusive’ access to a system resource (e.g., each task has its own private stack) • But often a task will be allowed to ‘share’ access to a system resource (e.g., many processes call the same runtime libraries) • Acquisition of resources by a task is kept track of within the task’s ‘task_struct’ Examples: memory and files • The ‘task_struct’ record has fields (named ‘mm’ and ‘files’) which hold pointers to OS structures (‘mm_struct’ and ‘files_struct’) that describe the particular memory-areas and files which the task currently ‘owns’ mm_struct task_struct mm files files_struct Demo-module: ‘tasklist.c’ • This kernel module creates a pseudo-file (named ‘/proc/tasklist’) that will display a list of all the current processes • It traverses the linked-list of all the process control blocks (i.e., the ‘task_struct’ nodes) • It prints each task’s pid, state, and name • It’s a ‘big’ proc-file (over 3K), so it needs to utilize the ‘start’, ‘offset’, ‘count, and ‘eof’ parameters to the ‘proc_read’ function Demo: ‘sleep.c’ • For kernel version 2.4.26, Linux stores two timestamp-values in every ‘task_struct’: – start_time; // time when task was created – sleep_time; // time when task went to sleep • The ‘/proc/sleep’ file shows each process ‘state’ and ‘sleep_time’ • EXERCISE: Add the needed code to show each process’s ‘start_time’, too (‘big’ proc)