* S 4 DESIGN AND IMPLEMENTAT ION OF AN EXAMPLE OPERATING SYSTEM A by JOHN DANI EL DeTREVI LLE 1, S.B., University of South Carolina (1970) SUBMI TTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCI ENCE at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY June, 1972 Signa t ure Redacted 4 Signature of Authort Department of E\4t ical Engineering, May 17, 1972 edacted F a Signature A q . Certified by The iF)Supervi sor , ,1 Accepted b Chairman -- D 'p nature Redacted ar t ee o 'Archives A (JUN 271972) -1- agraa tuens A DESIGN AND IMPLEMENTATION OF AN EXAMPLE OPERATING SYSTEM by John Daniel DeTreville Submitted to the Department of Electrical Engineering on May 17, 1972, in partial fulfillment of the requirements for the Degree of Master of Science. ABSTRACT In software courses dealing with the principles'of design of operating systems for digital computers, it is useful to be able to give the students a case-study of a specially-designed example operating system. The design and implementation of such an example operating system is presented in this thesis, along with guidelines for its use in a classroom environment. The body of this thesis, then, is concerned with the details of the example operating system as designed and as implemented on the IBM System/360. The system consists of two major sections: a nucleus, which performs the basic operations of process management, memory management, and device management, and a top-level supervisor, which utilizes the features provided by the nucleus. These sections are presented from various viewpoints, and then the details of the operation of their composite modules and routines are given in increasing level of detail. Effort is made in this presentation to split the functions of the system, and particularly of the nucleus, into parts along the same manner in which these functions are typically split in computer science courses. These sections also serve as the primary of the example operating system. documentation An appendix serves to illustrate the use of the example operating system in the classroom. General guidelines are presented, and then a number of typical homework assignments for the students, based on the workings of the example operating system, are outlined, along with information on their teaching value. THESIS SUPERVISOR: John J. Donovan TITLE: Associate Professor of Electrical Engineering -2- -3- Acknowledgements ACKNOWLEDGEMENTS Well, first off, I to have thank John Professor Donovan. He was the one who first suggested this topic to me; suggestions as he has continued to offer helpful supervisor. Throughout the closely with me, thesis and was work, very Stuart Madnick worked understanding concerning delivery slippages for the system and for the documentation. Fellow graduate students Jerry Goodman were forced to sit through Jolinson several and Leonard expositions of the principles of the operating system and should be thanked for this. Plus which, thanks to everyone else who's had a part in this thesis. This thesis was edited on the Project MAC Multics time-sharing system; the final copy was produced using the RUNOFF program. Work reported herein was supported (in part) by Project MAC, an M.I.T. research program sponsored by the Advanced under Department of Defense, Research Projects Agency, Research Contract Number Nonr-4102(01). Office of Naval Reproduction of this document, in whole or in part, is permitted for any purpose of the United States government. -4- Table of Contents - - - - - - - - Table of Contents Goals of - - - - - - - - - - - - - - the System - - - -3 -4 -5 -7 - - Technical Acknowledgements Chapter 1, Overview of the System -2 - Acknowl edgements - - - - - 1 - - - - - - - - - - -- Abstract - - - - Chapter 0, - - - - Title page - - - TABLE OF CONTENTS 11 - - - - - - - - - - Classroom use Appendix B, The References - - - - - IBM System/360 - - - - - - - - - - - - - - - - - 23 24 36 39 42 60 64 66 70 - Appendix A, 16 17 19 21 22 71 - Supervisor process User program 13 84 - Device handlers 1/0 interrupts - Chapter 3, Details of Operation Databases - - - - - Routines - - - - - Traffic controller - - - Primitives - - - Chapter 2, Modules of the System - Process management, lower modu 1 eMemory Management module Process Management, upper modu 1e Device Management module Supervisor Process module User program - - - - - -8 87 -5- Chapter 0 Goal s CHAPTER 0 GOALS OF THE SYSTEM topic The af this thesis is design the Implementation of an example operating system for use pedogogical aid in computer science courses. to use this system as an example in 6.802, a at on M.I.T. the of principles It is and as a intended course taught operating systems. This system meets a large number of pedogogical goals. is a simple operating system, making it It to students learn. is This systems as OS/360 or rMultics, in whose easy for the to such other contrast purpose main is not pedogogical. It go. In is its a small operating system, present form, it occupies as operating systems about (three-quarters of a box) of assembly-language code. not include language processors instead implements a basic or management, etc., programs, but dynamic memory allocation, to which any top-level and associated programs could be easily fitted. features It does system nucleus which provides such features as multiprocessing, device utility cards 1500 supervisor These basic are the ones of the most importance in a course on -6- Chapter 0 Goals operating systems. include This implementation does supervisor to top-level simple a provide processing of the job streams. Thus, in the operating system is complete, that could students imagine it being used for some real-world application. (Such an application for this system would be controlling a large number of devices, real-time with additional background computing.) Lastly, this operating system is explicitly designed in a modular manner, which is not often the case in real-world operating systems. In particular, management, processor or the relevant sections for memory management, management, can easily be separated from system. This In rest device of the is important in a classroom environment since these functions are other. the or usually real-world taught separately from each systems, some of this modularity is usually sacrificed for performance. Chapter 0 Goals Technical acknowledgements -7- TECHNICAL ACKNOWLEDGEMENTS Most of the features first intr oduced of of a systems. operating The in various discrete 1evels im plemented system were and are here taken from other els ewhere, papers in the field, o r from other idea system operating this (here, the nucleus and top-level) has been best expressed by Dijkstra CDij 68), who introduced the concept of a also semaphore. The form of the message system communicati on comes from for termi nology and in concepts introduced by Saltzer (Sal 66) and (Cor 65), from which several modules, and the process control implemented in were Multics ideas in this area were taken. The view of the operat ing system management also A great part of elucidated the concept of a system nucleus. the who 70), (Han Hansen interprocess as a specific group of resource breakdown in this respect, comes from Donovan (Don 72). Additi onally, several key points were taken operating systems as OS/360 (both MFT (IBM :3)), CTSS (Cri 65) and IBSYS (IBM :1). from such :2) and MVT (IBM -8- Chapter 1 Overview OVERVIEW OF THE SYSTEM The example operating system for the IBM System/360. instruction It system (user example of a supported, prog rams routines for non-standard devi ces, control requires the standard but Any currently is provided only for card readers support and line printers The a multiprogramming and both st ore and fetch protection. set, types of external devices can be standard is system can provide their own however). operating system currently runs under the System/360 simulator, which simulates environment with 32K-bytes of storage, two card readers, an and four line printers. form of not relocatable; the is in the Memory allocation for user programs variable-length partitions (although System/360 hardware will not allow this). Users can specify of 2K-byte increments requires memory required in from 2K-bytes upwards. The about 6K-bytes for its terms system own routines; currently the rest of is available for user programs. The streams storage of amount the system coming is in capable from of supporting different input multiple devices job (here, -9- Chapter 1 Overview readers), with the output being directed to different output devices (here, printers). Each job stream consists of a number of jobs, stacked in order. A job consists of a $JOB card, followed deck, followed by optional data. by an object The format of the $JOB card is: $JOB, core,name=devtype, name=devtype,... as in: $JOB, 8K,FI LEA=IN, FI LEB=OUT The "core" field gives the amount of core required the job (in the example, 8K). for The "name=devtype" field can be repeated any number of times (including zero). The "name" gives the name by which the user program can reference this device (these names are up to the user, and very flexible, device-independent referencing of devices is provided) while the "devtype" tells the type of device to be assigned. There are currently three possibilities for this field. The type "IN" specifies the system input unit (here, the card reader). The type "OUT" specifies the system output unit (here, the line printer). The type EXCP indicates a non-standard device, for which the user will supply his handler routine. Currently, own the $JOB card can contain at -10- Chapter 1 Overview most one reference to IN, one reference to OUT, and one reference to EXCP. The object deck, which card, has the same format deck. External as references object deck gives the text to partition, and immediately follows the $JOB the are be standard not OS/360 object allowed, however. The loaded into the user's specifies the starting point; relocation is performed during loading. Following the object deck is the card input data to the program, user if there is a reference to "IN" on the $JOB card. The user's program may specify parallel take place this, between within along it, and with flexible different parallel processing to there are features to control facilities for communication execution paths of the program. The system automatically schedules the various user programs in the system, and their parallelly-executing sections, in such a way as to tend to maximize the use of the CPU and the data channels. -11- Chapter 2 Modules MODULES OF THE SYSTEM As described in chapter 0, the example operating system is implemented in two levels: supervisor. However, for a nucleus and top-level a clarity, we can split the system down more finely to view it as being implemented in a number of layers. One property of this layered construction is that each successive layer, from the bottom up, depends only on the existence of those layers existing below it, and does not directly depend on higher layers (this feature being of importance for pedogogical reasons). The five layers (or "modules") of the example operating system are: Process Management, lower module Memory Management module Process Management, upper module Device Management Supervisor Process module (where, of course, these). the user program depends (lowest) (highest) on all of A few features of this breakdown are of note. First, the functions of process management have been split into a lower module and an upper module, one on either side of memory management. This is because certain parts of -12- Chapter 2 Modules process management (those in the memory management routines, module) upper but on depend memory management itself depends on certain process management routines, thus placing these in a lower module below memory management. As shown, the module depends on process supervisor process management (both modules), memory management, and device management. The dichotomy between the nucleus and the top-level when it supervisor is not as clear In was presented and the as breakdown in chapter 0; however, the modules below the supervisor process module may nucleus, this supervisor be viewed as the process module itself as the currently-implemented supervisor. Finally, follows that it will the be seen supervisor from process the discussion module and the user program itself operate in very nearly the same This is an unusual aspect that environment. of this operating system, and comes from the nucleus vs. top-level supervisor breakdown. -13- Chapter 2 Modules Process man., lower PROCESS MANAGEMENT, LOWER MODULE This module of process multiplexing, and management additionally provides processor basic primitive provides This synchronization. operations for inter-process module may be viewed as the "multiprogramming support" module. This module provides the basic support for processes. A process may be viewed as the executi on of a within program certain constraints (e.g., within a certain area of memory); for a more precise Multiprogramming definition,' allows Saltzer (Sal 66). processes to run see these independently, in parallel with each other. The "traffic and schedules runs controller" this of section processes which are eligible to run (a unless process in the system is usually eligible to run is module it waiting for the completion of some external event (e.g., it might be waiting for the completion operation; alternatively, synchronizing relevant with other it might processes: of be see an in input/output the action of below for the primitives). Processes are schedled by the traffic controller in a round-robin fashion (see the writeups on the traffic controller in chapter 3 for more details on this) Chapter 2 Modules Process man., lower -14- and run until they become temporarily ineligible to run, (in until a certain amount of time passes or above, described this system, fifty milliseconds), called the quantum process is In either case, the traffic controller of to continue eligible the which after time, as running later. then selects another process and causes it to begins running. provides also module This primitives for the called are These primitives synchronization of processes. the basic "P" and "V" operations (as named by Dijkstra (Dij 68)). Their operation is as follows. Both the "P" and "V" operations act on has which associated integer value. an has two possib le effects. greater value. process signal a "semaphore", The "P" operation If the value of the semaphore is zero, it causes one to besubtracted from that than less, If the value of the semaphore is zero or issui ng the another from ineligible). This begins operation "P" process (thus signalling will becoming be the to wait for a temporarily caused by another process execut ing a "V" operation on the semaphore. 1 o peration also The "V" courses of action. If follows there are one it adds two possible no processes currently waiting as a result of performing a "P" semaphore, of operation on one to the value of the semaphore. this If, Process man., -15- Chapter 2 Modules is however, there are processes waiting, a signal lower sent to one of then to stop waiting. The and "V" operations, together with semaphores, "P" a have several useful applications. For example, semaphore with initial value one can be associated with some data base base. to serve as a lock on that data data base perform accessing that associated semaphore before perform acessing If all processes a "P" operation on the the data base, "V' operations on that semaphore afterwards, it may be seen that only one process can ever access the data at any base. and one base time, thus insuring the integrity of that data Memory man. -16- Chapter 2 Modules MEMORY MANAGEMENT MODULE This dynamic module performs allocation and the operations freeing of necessary memory. for It provides routines which will on request allocate a block of memory of a given size and of a given address-alignment (e.g., on a double-word boundary), or which will on request free a block of of memory a given size at a location. given accomplish this, the module keeps a list of "free" taking from processed. or adding to this list as Communication with this module is To storage, requests by are means of explicit calls. The memory management module uses the management lower module in the following respect. should ever process If there come a request to allocate memory which cannot be satisfied (i.e., there is currently no such contiguous block of memory free) the allocation routine must wait until some other process synchronization frees primitives some provided memory. by The the management lower module are used for this (for more see the appropriate writeups in the next chapter). basic process detail, Process man., upper -17- Chapter 2 Modules PROCESS MANAGEMENT, UPPER MODULE module This processes (e.g., their and system), also for routines provides creation provides the deletion and control of within the a sophisticated inter-process Communication communication routine with buffered messages. with this module is in the form of explicit calls. Firstly, this module provides routines for the control are of processes. Most importantly, there of a given name may be created and started to run processes at a given location, with certain register to It passed contents as arguments, and there are calls by which a process within the system may be level, which by calls of course, it stopped is and destroyed. At this entirely possible to ignore the existence of the traffic controller, in that we may view all processes, say, as continually running; this view, if taken by this module, would be perfectly consistent. This module also provides a method may be sent send, to a between process of by which messages processes. There is a call saying to a certain name, the k-character message "such-and-such"; there is also a call saying to read the next message waiting to be read, which returns the text -18- Chapter 2 Modules of upper Process man., message and the name of the sender. Messages may be the of arbitrary length and any number messages of may be wa iting to be read by a process. seen be may As from the above, this is the level at which we introduce the concept of the "name" of chosen are Names process. a at process-creation time and are used to reference processes within this upper module. This module also introduces the concept of a group (see chapter 3) which is the set of processes associated with a job within the system. This upper module depends directly on both the process management lower module and the memory management module. used the "P" and "V" primitives to provide synchronization between the sender and the receiver in Also, space needed processing. processing. traffic controller. management module is called to allocate or free memory processes, message this module runs in the process of its caller, since there is an implicit dependency on the The It of to to store provide system information concerning buffers for message temporary Device man. -19- Chapter 2 Modules DEVICE MANAGEMENT MODULE the input/output commands to external devices. appropriate Unlike before, those issue to necessary This module provides the routines higher all like but levels' (the supervisor process module and the user program), this module runs in a separate process (in it, messages sent to and this with Communication pair). device) one this case, status the of (group, per is by result is process the returned via messages which it sends back. routine, when created, The is provided with sufficient information to issue input/output occurs, issue to is purpose these Its instructions. major then, after an interrupt and interpret the status information available for the construction of a return message. module depends directly on the process management This lower module in that it uses semaphores as locks against two simultaneously processes the device (here, information held attempting semaphore about the interrupt (by performing a "P" initial value is a device) to lock and the access on to a block same of wait for an operation on a semaphore with zero, in order to wait until another routine in this module, which is entered upon interrupts, performs a Chapter 2 Modules -20- "V" operation on this semaphore). Supervisor Chapter 2 Modules -21- Supervisor SUPERVISOR PROCESS MODULE This module serves as the top-level supervisor, the features provided by the previous modules to interface It uses create an This module runs in for processing of user jobs. a number of separate processes, one per job-stream (thus one per group). Its purpose is to, for each job in sequence, allocate a partition for it to run in (the size being specified on $JOB card) and create and start processes for Interface to It then the device module (see chapter 3 for more details). loads the user's starts a process in supervisor process deck Into the partition and creates and that can partition. At this When completion is process cleans up (i.e., destroys have created for this point, the stop running for a moment and wait for a message signalling completion to come program. the job, signalled, the processes any and from frees allocated), and goes on to the next job. the the user supervisor It might partition -22- Chapter 2 Modules User program THE USER PROGRAM The user allocated for program it first one process to create completed, others by runs the in of memory supervisor process. There is at in which it runs, but there is the ability When the job is to run in parallel. there are primitives provided by which program can signal either successful The partition the the user completion. user program can, of course, any of the other modules in the system. use the facilities of -23- Chapter 3 Details CHAPTER 3 DETAILS OF THE OPERATION OF THE EXAPIPLE OPERATING SYSTEM This chapter the data bases of modules. goes the into detail on the organization of system and the operation First, we will examine the databases. of the Databases -24- Chapter 3 Details DATABASES This section presents all the major data bases present in the system, explaining their structure and their use. The descriptions here consist of a PL/-style structure declaration, followed by details on each of the fields. PROCESS CONTROL BLOCK The Process Control Block (PCB) stores the information associated with a process. There process in the system. The PCB is one PCB for every is defined approximately by: declare 1 pcb based (pcbptr) aligned, 2 name char (8), 2 next_pcbthis-group ptr, 2 last_pcbthisgroup ptr, 2 next_pcball ptr, 2 last_pcb_ all ptr, 2 stopped bit (1), 2 blocked bit (1), 2 interruptsavearea like (save_area), 2 message-semaphore_common like (semaphore), 2 message-semaphore-receiver like (semaphore), 2 first_message ptr, 2 nextsemaphorewaiter ptr, 2 insmc bit(1), 2 stopwaiting bit (1), 2 stoppersemaphore like (semaphore), 2 stoppeesemaphore like (semaphore), 2 faultsave-area like (save_area), 2 memory_allocator_save_area like (save_area), 2 memory_freersave-area like (save_area); Chapter 3 Details -25- Databases PCB (contd.) where the fields are as follows. "Iame": this contains an eight character field field giving the name for this process. A name is supplied for process by the process that creates it; to reference processes in calls to a these names are used process the management upper module routines. "next Dcb this group": is a pointer to the next PCB in this group. The set of processes in the system is divided into a number of mutually-exclusive subsets known as groups; the processes in circular list. circular list. each group are chained together in a This pointer is the forward pointer in that All names within a group are unique, and naming of processes is always relative to a group. "last Dcb this group": circular list of the PCB's "next ocb all": together in the backward pointer for the being chained in this group. addition to their in the above-mentioned group-oriented fashion, all PCB's in the system are independently linked into circular list. management controller) lower This is for module (in the a single purposes of the process particular, the traffic which is not concerned with groups (these being Chapter 3 Details -26- Databases PCB (contd.) within an upper-module concept This management). process pointer is the forward pointer within this chain. "last Dcb all": backward pointer for the chain of the all PCB's. this bit is zero when the associated process "Stopped": is not stopped, it and one when considered runnable by the traffic When stopped. process a A process is not controller first created, is started stopped state, and may be is. some by if is it is in the process other performing a "start process" primitive for it. it A non-stopped process may be stopped by another process performing a "stop usually it, for primitive process" as a prelude to destroying it. this bit is zero when the associated process "blocked": is and blocked, not blocked considered not is A controller. blocked if it non-positive when one it is. A process which is runnable by traffic the process is normally not blocked, but can go performs a "P" operation on a value. It will non-blocked) whenever some be other waked process with semaphore up (made performs to a go "VI' operation on the semaphore (for a more precise statement of how is this structure and primitives). performed, on the see the functioning writeups of the on "P" semaphore and "V"1 -27- Chapter 3 Details Databases PCB (contd.) "interrupt save area": a save area. below in the save_area section, and its Its format is given purpose is described below in the routines section. "messave semaphore common": initial this is a semaphore with value of one. Whenever a routine is entered to send a message to this process or to read a message sent to process, this semaphore this used as a lock on the message is chain pointed to by first_message, during the period when the message chain is being searched or modified. "messaze semaphore receiver": this semaphore is used to regulate a process receiving messages. 0. Whenever initial value is a message is sent to this process, the sending process performs a "VI" operation adding on Its this to its value. Whenever an attempt one to read a message sent to this process, operation performed wait until on this semaphore. waiting to be read by format message chain. process receiver" is being made Thus, if there are message" routine there are. "first message": this message thus there is first a "P" no messages currently readable, the "read will semaphore, for is a pointer to the first this process. information If there are no (i.e., if is zero), the the on value of on the the structure of the the of writeup See messages value message readable by this "message semaphore this pointer is Chapter 3 Details -28- Databases PCB (contd.) meaningless. "next semahore waiter": all waiting on a single semaphore chained currently together in a list. The head of this list is pointed to by a field linear in the semaphore, and the together by this PCB's field. and the negative in The writeup on semaphore format) zero, are processes the list are length of the list is taken to be the of the value as linked (see the maximum of stored in the semaphore. "in smc sma": section, this bit and is zero if one if it "system must complete". the process is not in is. an The term "smc" stands for If a process an attempt to stop that process will is in an smc wait until leaves the smc section. There are primitives section, that process to enter and leave an smc section. "stoD waiting": this bit is zero, if there is no stop request waiting for this process for when it leaves any section it smc may be in, and one if there is. This bit is set by the "stop process" primitive. "stoDper semaphore: stopping a process this semaphore is used to in an srnc section. attempting to do the stopping performs a this semaphore, which has an initial "P" The wait on process operation on value of zero, whenever -29- Chapter 3 Details it Databases PCB (contd.) notices that the process that it the "in smc" bit set attempts to execute a is trying to stop has on. When the process to be stopped "leave smc" section notices that the "stop waiting" bit is on, it primitive and performs a V" operation on this semaphore, and then a "P" operation on the "stoppee semaphore" (described next). "stoDnee semaphore": this semaphore, which has initial value of zero, is used to stop a process stop postponed until it which left an snc section. has had a It performs a "P" operation on this semaphore, thus blocking itself until the stopping process can execute another stop request. "fault save area": a save Its format is given area. below in the save-area section, and its purpose is described below in the routines section. "memory allocator save area": a save area. is given below in the save_area section, and its Its format urpose is described below in the routines section. "memory freer save area": a save given below in the save_area section, area. and described below in the routines section. its Its format is purpose is -30- Chapter 3 Details Databases RUNNI NG This a pointer to the PCB of the process currently is being run It is set by the traffic controller, . whenever and is used is necessary to access the PCB for "this" (the it current) process. NEXTTRY This is a pointer to the PCB of the will controller traffic next process try to schedule after it next entered. This pointer is set by the traffic to the be same as a V operation the is controller the "next pcb all" pointer in the PCB pointed to be RUNNING, of that but may be modified by the execution (see the writeup on the V operation for more deatils). NEXTTRYMOD IF IED This is a bit which is zero if modified if it has. NEXTTRY has not been since being set by the traffic controller, and one -31- Chapter 3 Details Databases SAVE AREA Whenever system routines need save the conditions appertaining when they can be restored upon exit, it it areas for storing was entered, so that uses one of the four save areas in the PCB. These save areas look like: declare 1 savearea based (savearea-ptr), 2 oldpsw like (psw), 2 oldregisters(0:15) like (register), 2 next_savearea ptr, 2 temporary(3) bit(32); where the fields are used as follows: "old sw": this is the old upon psw entry to the routine. "old resisters": these are the old general-purpose registers upon entry to the routine. "next save area": this pointer is currently unused. See Appendix A, Minor Flaws section, as to why. "temporarv": words. It this is a temporary storage area of three is used mainly to contruct argument lists, which must be held in storage, for calling other routines. If a -32- Chapter 3 Details routine Databases Sa ve area (contd.) ever needs more than three words of temporaries, uses these three words to contruct a call to the it memory allocator. S EMA PHOR ES Semaphores are used for low-l evel basic, synchronization of processes. They can be accessed by "P" or "V" operations, or their value field may be examined directly. Their format is as follows: declare 1 semaphore based (semaphore-ptr), 2 value fixed (31,0), 2 first_waiter ptr; where the fields are used as follows: ".value": is the val ue of the semaphore. A "P" operation always subtracts one from this value, and a "V" operation always adds one. The ini tial value of the semaphore must be non-negative. "first waiter": for semaphores with negative values, this is the pointer to the first in a list of PCB's together by their the length of the list operation chained "next semaphore waiter" pointers, where is the magnitude of the value. A "P" on a semaphore with non-positive value places the -33- Chapter 3 Details Semaphores PCB for its process at the end of the list, operation on Databases (contd.) whereas a "V" a semaphore with negative value modifies this pointer to in essence take the first PCB off the list (after which it receives a wake-up): thus we have a FIFO queue. FREE STORAGE BLOCK All of free storage allocate requests) is blocks. storage All free (storage chained ever adjacent in memory: together blocks order of increasing size. No two which can be given to free storage are chained together in free rather, in storage these blocks two would are be collapsed together into a single larger block. Whenever an allocation Is requested writeups (see the routine for more information) a block is unlinked from the free storage list, and the remainder or remainders is linked back onto the list. onto the list, When a block is freed, it is linked back after collapsing any adjacent blocks. The format of a free storage block (FSB) declare 1 freestorageblock based (fsbptr), 2 next ptr, 2 size fixed (31,0), 2 unused (freestorage_block.size-8) char(1); is: -34- Chapter 3 Details Databases FSB (contd.) where: "next": a pointer ascending-size-order chain to of the next FSB's. FSB in the For the last FSB in the chain, this field contai ns al 1 zeros (nul 1). "Size": this field contains the size of this size The stored in bytes, but all FSB's contain an integral is number of words (since all module FSB. (all requests to the memory management calls are made by the system) specify that the allocated area is to be on a doubleword boundary). "unused": this field fills out the remainder of the block and contains nothing of importance. FREE STORAGE BLOCK POINTER This pointer (FSBPTR) points to the first free storage block in the ascending-order-of-size chain. If there are no blocks in the chain, this pointer contains all zeros (null). FREE STORAGE BLOCK SEMAPHORE Chapter 3 Details -35- Databases FSB semaphore (contd.) This semaphore, with initial to the free management examines storage list by serving as a lock. Any memory routine, or value one, controls access upon modifies the entering free a section storage where it list, does a "P" operation on this semaphore, thus ensuring the integrity this data base, and does a "V" after it of stops using this data base. MEMORY SEMAPHORE This semaphore, waiting for memory. but is unable, semaphore, thus it with initial value zero, controls If a process attempts to allocate memory performs blocking a "P" itself. operation When reattempts the allocation, again possibly it on this reawakens, blocking it itself, until the request can be satisfied. A sequence of "V" operations is performed on this semaphore whenever a block of memory is freed. The number of "V" operations performed is equal to the number of processes waiting on the semaphore at that time. -36- Chapter 3 Details Routines ROUTINES We now discuss the details of the composite routines of the into the modules module and an system operating a (with the lower example the of realm divided As Chapter 2 system. of process management memory module), upper management, device management, the top-level supervisor, and user programs, so we can routines. here divide system of set the Here, though, the distinction will be on how they The following are brief expositions gain control. of this division. A detailed explanation of the operation of each of these groups follows in subsequent sections. The traffic controller, which forms part of module process of a is detected, operation is performed on a semaphore with "P" non-positive value, to relinquished lower management, constitutes the first case. It gains control whenever a timer runout trap whenever the it or whenever (this control is specifically last is a special case: see the following documentation for its use). The synchronization primitives "P" and "V", the remainder of the which lower module of process management, together with all of memory management and all of the module of process form upper management, form the second case. ? hese -37- Chapter 3 Details Routines routines gain control by virtue of a specific call means of (here, by the SVC instruction). They are passed an argument list and perform requested functions. Device management gaining reside control. ihe is split dev ice-handling created communication with the by them two is top-level by means methods of routines themselves one per (group, in a special process , originally between device) pair, supervisor, of the and message primitives provided by the upper module of process control. One function, though, performed by process management is the fielding and handling of routines for doing this are input/output invoked interrupts. The whenever interrupt occurs, run for a very short time in of whoever was an process running at the time of the interrupt, just long enough to store status process the such information and wake up the waiting for that interrupt, before returning to the processing of the interrupted process. The supervisor process module runs in a special set processes, one for each job stream. and started at IPL (Initial It of is initially created Program Load) time, and arguments by the IPL routine concerning its operation. passed Finally, processes supervisor. Routines -38- Chapter 3 Details of the user's its own, program being runs started in by a process or the top-level -39- Chapter 3 Details Traffic controller DETAILS OF THE TRAFFIC CONTROLLER The traffic controller serves the purpose of creating a processes", It runs "between process-oriented environment. as it were. traffic The basic the performs controller task of processor multiplexi ng, sharing the machine's CPU among various processes runnable said to be runnable if at any one time. A process is both the "s topped" and "blocked" bits I t runs until of the PCB are off. When a process Is ent ered, its quantum of time is consumed (a timer ru nou t trap is to wait occur 50 for after a process is ente red), synchronization is operation value), or traffic ms. performed it with on specifically a another by schedule IPL routine, a proce ss relinquishes (a "P" cont rol to the special this ent ry is only to start the sy stem going, or by the input/output interrupt handler, to it needs to via a special call (th is is an SVC call controller the set semaphore w Ith non-positive to routine XPER, for which see below; used the process in the case quickl y of fo r needing real-time response. See the appropriat e sections for each of these). All entries immediately to the traffic to routine GETWORK. controller transfer This routine runs with all -40- Chapter 3 Details Traffic controller interrupts off, in key 0 (thus allowing part of memory). it to access The old registers and PSW of the process "interrupt save area". just left are stored in the PCB's Beginning with the process pointed to by cell it searches NEXTTRY, through all PCB's in the system, going through the pcb.nextpcb_all chain, finding the first PCB neither stopped or blocked. which resets cell NEXTTRY is If there does exist such a one, the traffic controller resets cell RUNNING to point to PCB, any that to point to the next PCB in the "all" chain, sets the timer to interrupt in 50 milliseconds, and then proceeds to call a standard system service routine to reload the registers and PSW from the interruptsave_area in the PCB pointed to by the new RUNNING, thus beginning to run this new process. If all PCB's in the "all" chain are stopped or blocked, the traffic controller loads a PSW with location set to GETWORK, but with the wait state set on. As the input-output interrupts explains, occurs, the old PSW's wait bit reloaded. Thus, is when set section such an interrupt off before it is the traffic controller hereby waits until an input-output interrupt occurs before it has another scheduling on (since there can be, processes can be runnable until such in an this try at situation, no interrupt occurs, t here although afterwards). instead Traffic controller -41- Chapter 3 Details of is always one System/360 efficient immediately The traffic controller utilizes the wait state scheduling, say, an ever-present, ever-runnable process with an infinite loop in it, relates) runnable th is system simulator; on since (as chapter 1 is really run under the control of a this simulating the simulator wait vastly is state than it is for infinite loo ps running, for at total, with a simulated of several s econds for even a small job stream. more time Primitives -42- Chapter 3 Details REQUEST-DRIVEN ROUTINES PROVIDING SYSTEM PRIMITIVES section includes those routines which are entered This due to a request for a system primitive function. These by invoked the SVC are instruction which, on the System/360, provides for a one-byte operand; this byte is used to store an mnemonic for the operation to be performed. A list of the allowable mnemonics, processes them, their MA as for the fault save save area; note that these used here are archaic and no longer really relate to the function performed (Fixing which stands for the memory allocator save area, and MF stands for the memory freeer names routine and the save area which they use (I stands for the interrupt save area, F stands area, the function, Minor Flaws by section) the has save area: Appendix further information on this). (in A process control, lower module) P the "P" synchronization primitive rout. XP (1) V the "V" synchronization primitive rout. XV (1) Primitives -43- Chapter 3 Detail s (in memory management) (in A allocate a block of memory rout. XA (MA) F free a block of memory rout. XF (MA) B l ink a block onto the FSB chain rout. XB (MF) process management, upper module) C create process rou t. XC (F) D destroy process rout. XD (F) H halt job and signal rout. XH (F) I insert a PCB into the chain rout. XI (I) J remove a PCB from the chain rout. XJ (I) N find a PCB given its name rout. XN (MF) R read message rout. XR (F) S send message rout. XS (F) Y start process rout. XY (MF) Z stop process rout. Xz (I) . enter traffic I enter smc section rout. XEXC( I) , leave smc section rout. XCOM( I) ? abnormally rout. XQUE( I) supervisor rout. XPER( I) controller terminate this job The save areas are so assigned tha t, modules calling on one another, overlaid. Note also that the calls no whi ch in the save course area might of is ever cause the Primitives Chapter 3 Details traffic to be entered (namely XP and XPER) both controller use the I save area, the the as same traffic controller itself. routines run in key 0, allowing arbitrary of these All Routines XP, the overriding references, memory protection XI, XJ, XN, XZ, XPER, XV, XB, XQUE operate with interrupts off, while the with mechanism. XEXC, others XCOM, and operate on. The advantage to having interrupts off interrupts is the assurance that no other processes can run while that routine is in operation. All which operate with interrupts on (namely routines XA, XF, XB, XC, XD, XR, routine upon XY, XS, and XZ) call the XEXC entry and the XCOM rout ine upon exi t. This is to assure that a process will not be de stroyed whe n it has some important system lock set. Only some these routin es are callable direct] y by of the user. These are the XC, XD, X H, XN, XR, XS, XY, XZ, XQUE routines; the routines (there is a testing 0, rest check usable are this for on by the sy stem entry, done by if the key under which the call er was operating was indicating a system procedure, user only and procedure. Only these or non-zero, routines Ind icating listed, a which are primitives in the process management upper module, are safe Chapter 3 Details for the -45- Primitives user to call; allowing a direct call to the others could allow a breach of security. When an automatic SVC instruction transfer is executed, there is to a special SVC-handling routine. This routine looks up the mnemonic for the operation, stores registers and transfers to operates under old the PSW key 0). routine (the SVC handler If the called routine needs to call it again uses the SVC-style call. Arguments are passed in an area pointed to by Any the in the appropriate save area, then appropriate another service routine, 2. an register returned values are placed in this argument list. A return is effected by reloading the old registers and old purpose and PSW. Following is a description of the functioning of the above-mentioned routines. Primi tives Chapter 3 Details ROUTINE XP This routine implements the "1p" 2 entry, The routine is assumed to point to a semaphore . register primitive. Upon subtracts one from the value, and then returns i f the result is non-negative. the if result was positive, however, then the PCB for this process is inserted at the end of the list of processes on waiting this semaphore. See the writeups on the "first waiter" field in the and semaphore, waiter" field in the PCB for more information. the "blocked" bit in the PCB to from one, semaphore "next the keeping It then sets the process being scheduled to be run until a wakeup is sent to it by the XV routine (see below). Control then transfers directly to the traffic controller. ROUTINE XV This routine implements the "V" primitive. Upon entry, register 2 is assumed to point to a semaphore. adds one to the value of that semaphore. The routine If the value is now greater than zero, it returns. If the value is zero or less, that were processes waiting on the semaphore. means that there The first is taken Primitives Routine XV (contd) -47- Chapter 3 Details from the list (see the references in referenced the XP writeup) and a wakeup is performed for that process, setting its "blocked" bit to a zero, allowing the process to be scheduled once again. To allow the target process to be scheduled soon, the XV routine typically sets the NEXTTRY pointer traffic used however, bit is already a one. Thus, if one process wakes that up a large number of others in the course of its the the controller to point to the target PCB, and sets the "nexttry modified" bit to a one. This is not done, when by to first execution, be awakened is the first to be run after the current one. The XV routine then returns. ROUTINE XA This routine serves to allocate a block of entry, register and supplied (for example, expressed in by the user, and then a power of two, indicating what type of boundary the allocation on On 2 is assumed to point to an argument list, which contains the size of the block desired, bytes, memory. is desired an argument of eight would indicate that doubleword alignment was desired), and then a fullword space -48- Chapter 3 Details into Primitives Routine XA (contd.) which the routine returns the address of the allocated area. The routine first semaphore". does a P" operation on the "fsb It then goes through the free storage list until it finds a block large enough to contain the requested area. routine calculates which pa rts of that block need to be The relinked onto the free storage list (i.e., those on either side of the a llocated area: overlap the allocated area is selected as close to the beginning of the possible with the specified ali gnment, breakage) and primitive. performs The this which block as in order to minimize relinking using the "B" of th e allocated area is inserted address into its place in the argument 1 ist and the routine performs a "V" operation on the "fsb semaphore", If, (i.e., there is finding a suitable block) operation on the "fsb operation on the semaphore", "memory writeup for this semaphore, until no way to satisfy the request the routine reaches the end of the free storage without "V" however, and then returns. list then the routine does a and then does a "P" semaphore". As explained in the this has the effect someone frees some memory. The routine, of waiting after waiting, returns to the beginning and re-attempts the allocation. -49- Chapter 3 Details Primitives ROUTINE XB This routine performs the function of linking of storage block onto the free storage list, with the assumption that compacting of the new area with existing FSB's possible (thus On entry, it assumes that register 2 points to list, a which is not XB can be called directly from routine XA). contains the size of an argument the area to be freed, measured in bytes, and the address of the area. Since this routine is called only by those routines (XA XF) this that have routine already does not and done a "P" on the "FSB semaphore", need to worry about database interference from other processes. The routine searches through the free storage list to find the point at which the new block should be inserted, and then performs the relinking, formatting the new block to look like a FSB. It then returns. ROUTINE XF Primitives Routine XF (contd.) -50- Chapter 3 Details This routine performs of freeing areas of storage, whenever compacting of the new area with current FSB's might be necessary. to On entry, it assumes that register 2 points an argument list, which contains the size of the area to be freed, measured in bytes, and the address of the area. The routine first does a semaphore", "P" operation to lock the free storage list. on the It then searches through the free storage list to see if these are any defining a region "fsb FSB's of memory contiguous with the one being freed. If there are, they are removed from the free list, and the address and size of the block being freed are recomputed. routine When the does a end of the list is storage reached, the "B" operation, calling routine XB, to link the block onto the free storage list. The routine then examines semaphore" to the on of the "memory determine how many processes are waiting for memory to be freed, and then performs operations value that number of "V" the "memory" semaphore (obviously, since the XB routine is called only the XA and XF routines, performing these wakeups here is sufficient). The routine then returns. ROUTINE XC Routine XC (contd.) the This routine performs entry, Primitives -51- Chapter 3 Details register 2 function. process" "create On is assumed to point to an argument list consisting solely of a name for the process to be created. The routine first checks that the name is in used group. If it is, an error routine is entered this (calling routine XQUE). If not, however, the routine already not routine calls XA to allocate an area for the new PCB. The new PCB has the name stored into it, the stopped bit turned on, blocked bit turned off, the semaphores set to their initial values (as noted in their writeups), and the turned the off. It then calls the Xl into the two chains. "in bit smc" routine to link this PCB The routine then returns. ROUTINE XD process" This routine performs the "destroy function. On entry, register 2 is assumed to point to an argument list consisting solely of a name for the process to be destroyed. The process to be destroyed must been previously have stopped using the "stop proces s" primitive (routine XZ). The routine uses the XN routi ne to determine address of the PCB for the process to be destroyed (if is no such process, an error routine is entered, the there calling Chapter 3 Details routine "P" -52- Primitives Routine XD (contd.) XQUE). This process must be in the stopped state. A operation is performed on the "message semaphore common", thus keeping other processes from sending messages. Then all messages waiting to be read gone by storage for the process are Finally, routine XJ through, and their storage freed. is called to unlink the PCB from the this two chains, and the PCB is freed (all freeing here is done by the XF routine), and this routine returns. ROUTINE XH This routine performs the "halt this job" function. takes no argument The sole It list. processing performed by this routine is to send a message (via routine XS) to process "*IBSUP" (see the writeup job on should pervade, and the supervisor process module) stating that the be halted then now, perform and a that ever performed on that no "V' operation (this being done after the value of that semaphore is set to zero). Thus, the XH returns. conditions "P" operation on a standard semaphore, used just for this purpose, is normal routine never Primitives -53- Chapter 3 Details ROUTINE XI This routine performs the function of chaining a PCB into the two PCB chains. On entry, register 2 is assumed to point to a PCB. The PCB is chained onto the all-PCB chain immediately following the PCB for the this-group chain running process. running immediately and process, fol lowing the onto PCB the for the The routine then returns. ROUTINE XJ This rout ine performs the function of the two PCB chains. On entry, from removing a PCB register 2 is assumed to contain a pointer to the PCB. The PCB is removed from the two chains by modifying the links. This routine does not free the storage; that is left up to the cal ler. It then returns. Primitives -54- Chapter 3 Details ROUTINE XN This routine serves the function of finding a PCB for a process, given the name of that process. On entry, register 2 points to an argument list consisting of a name and an area into which it returns a pointer to the PCB. The chain, it routine looks along the "next pcb this group" starting with the PCB for the running process, until finds a PCB containing the desired name, upon which case it stores the pointer in the argument list and it until finds that there is none such, returns, or in which case it stores zeros in the argument list and returns. ROUTINE XR This routine performs the "read message" entry, register function. On 2 points to an argument list containing an area for the name of the sender, a number giving the size of the area. area supplied to receive the text, followed by that The routine first "message Primitives Routine XR (contd.) -55- Chapter 3 Details semaphore a performs "P" receiver" semaphore, which serves to wait on operation in the "P" "message semaphore common" semaphore, the these lock the message chain (both of ones a does then a message has arrived. The routine until its on operation process's current are semaphores to the PCB, of course). Then, the the name of the first message is taken off the message list, process is found and placed into the argument list, sending and then the text is moved into the or padded with blanks as necessary. truncated area, receiving being Then the size field in the argument list is modified to contain the number of signifigant characters transmitted. for the message in rout ine does a message the "V" on list "message Finally, is the storage and the semaphore" and freed, common returns.. ROUTINE XS This routine performs the "send message" entry, function. On register 2 points to an argument list containing the name of the destination process, a count for the number of characters in the text, followed by the text itself. The routine first finds the PCB for the process by the given name: if there is none such, it enters an error routine d Primitives Routine XS (contd.) -56- Chapter 3 Details If there is, (by calling routine XQUE). s a "P" on the "message semaphore that common" the routine semaphore after allocating a region of memory large enough PCB, to hold the message block. Then that message block is down in the onto end moved of the message list (the length of the value message list being determined by the PCB's that of "message semaphore receiver" semaphore), and filled with the sender's name, the count of the text, The routine does then a semaphore common" semaphore, and the a then "V" itself. on the "message operation "V" text the on "message semaphore receiver" semaphore, then returns. ROUTINE XY This serves to start a process that is in the routine "stopped" state. On entry, register 2 contains a pointer to an argument list consisting of the name of the process to be started, an address in memory at which start running, and a to pointer a specifying the registers which should process in the beginning (this the process block be of loaded should 16 words for that is the method of passing arguments to newly-created processes). The routine first gets a pointer process of the given name (if to the PCB of the there is none such, an error Primitives -57- Chapter 3 Details Routine XY (contd.) routine is entered, calling routine XQUE). that "interrupt PCB's It then stores in save area" the registers specified, and a PSW with the address specified (and with the remainder of the PSW identical to ol d PSW existing when this the The "stopped" bit in the PCB routine was called). then is turned off, and the routine returns. ROUTINE XZ routine performs the "stop process" operation. On This entry, register 2 points to an argument list, consisting solely of the name of the process to be stopped. The first routine gets process of the given name (if a pointer to the PCB of the there is none such, routine is entered, calling routine XQUE). an error If that PCB's "in smc" bit is off, the routine simple turns the "stopped" bit on and returns. If, the however, "in smc" bit is on, then the "stop waiting" bit is turned on, and a "P" operation is on the "stopper semaphore". When the process next begins to run (see the writeups on these data bases), the be performed stopped has left performed normally. process to the smc section, and the stop can be Primitives Routine XZ (contd.) -58- Chapter 3 Details It is not There is one exception to the above. for a process whose (this lack of an allowed name does not begin with an asterisk being asterisk used the by supervisor process module to indicate user processes) to stop processes whose names do begin with asterisks presence (the of an asterisk similarly being used to indicate system processes). ROUTINE XPER This routine controller. to serves transfer to the traffic It simply causes a transfer to routine GETWORK. ROUTINE XEXC This routine serves to notify that an smc section is about to be entered. The routine sets the " in smc" bit in its PCB on, and then returns. ROUTINE XCOM This being left. routine serves It turns the "in to notify that an smc section is smc" bit to zero. Then, if the Primitives Routine XCOM (contd.) -59- Chapter 3 Details "stop waiting"bit in the PCB is off, it If, returns. however, the "stop waiting" bit is one, then it is reset to zero, a "V" operation is performed on the "stopper semaphore", and a "P" operation is performed on the "stoppee semaphore" (see the writeups of these data bases for more information). ROUTINE XQUE This routine serves to abnormally terminate a job. operates the same way that the XH routine operates above) but with a message that an error has occurred. It (see Device handlers -60- Chapter 3 Details DEVICE-HANDLING PROCESSES for the control of external provide To the supervisor process, and passed by created is process arguments (through registers, when it of address external the started running in and device, has operation back when the completed, been these it messages containing are There three for reader, printer, and user-supplied one routines: of input/output requested information on the result of the request. such type its handling for It is requests in the form of messages; receives messages sends is started) giving the which it controls. device routine a This pair. device) (job, each for is a special process devices, there handler-interface. READER HANDLER This routine controls card readers. When first entered, in process, its register 3, and the has the unit control block address in it protection key register 4. messages. If the messages are not of where xxxx The routine represents then to goes the be used held in into a loop reading form "READxxxx", a four-byte binary storage address, the message is rejected and another message is read. -61- Chapter 3 Details If, however, Device handlers Reader handler (contd.) is message the the correct, routine performs a "P" operation on the "user semaphore" in the UCB, to keep other iterim. processes format). The request, with received. bl ock command for the address the The is count in the CC W is a "read" specified as special its for writeup sp ecified operation the in message characters, and a 11 flags ei ghty controlling chaining, special the device The routine then con structs a channel contr ol block (see the channel Then the using from interrupts, etc. set t o zero. a CAW is constructed, w ith the address of the CCW, and key operation is key the being on done pa ssed the Then originally. a "P" "CAW semaphore", the rr achi ne's location for the CAW is loaded, the "start i/o" instruction for the appropriate reader is issued, and a "V" operation is done on the "caw semaphore". Then a "P" operation is done on "wait semaphore" on the UCB, thus waiting until an interrupt occurs. The status as stored by the interrupt routine indicates successful or is then unsuccessful examined. completion request, the process examines the card, as described If, however, the operation has not completed, If of it the below. the process goes back to wait some more. The process invoker. now examines the process name of its If the name starts with an asterisk, it is assumed to be part of the supervisor process module (see the section Device handlers Reader handler (contd.) -62- Chapter 3 Details on the supervisor process module for naming conventions) and a message reading either "YES" or is returned. However, success of the request, the on depending "NO", if the name of the invoking process does not start with an asterisk, it to assumed be a is process, and the situation is more user complicated. First, card, if the card just read in was a $JOB this indicates the end of the data for this program. Thus, a "NO" message is sent back to the caller, after the card is in a special area and the user's copy is zero'd out. Then the process enters requests with saved a a loop "NO" it wherein until answers all user It gets a system request, at which point it stores the card in the specified area and returns the message "YES". Otherwise, the message "YES" or "NO" is returned as for a ssystem request. PRINTER HANDLER This routine handles output for printers. is directly analogous to that of Messages are of the form "PRINxxxx", the address to be printed from. the where Its operation reader xxxx handler. represents Device handlers Printer handler (contd.) -63- Chapter 3 Details There is no data checking based on process name. EXCP HANDLER This handles routine where the user routine. The wishes method similar, but the to of messages "EXCPxxxxcccccccc", where the interface for those devices provide initialization to sent input/output own his and operation is are it of the form xxxx represents the unit address and cccccccc represents the channel command word to be used. There is a UCB for EXCP devices (one per group) but it does not contain a unit address. When an interrupt occurs for an EXCP device, interface returns ssssssss is the channel. a status EXCP message of the form "ssssssss", where returned the by device and the (Thus, the routine having called the EXCP process will be the next to be scheduled, since operation "V"-causing with the fact that EXCP required" the performed UCB's here. have the this is the only This fact, coupled "fast processing bit on, provides for very fast processing of EXCP interrupts.) Chapter 3 Details - INPUT/OUTPUT The 1/0 Interrupts -64 INTERRUPTS routine iointerrupt input/output interrupts request. The routine occurring operates handl ing performs after of the input/output an in (essentially) whatever process was running when the interrupt occurred. The routine returns performs a small number of operations, and then to the normal processing of that process. it First, finds the UCB for the device causing the interrupt (see the writeup on Channel Command Blocks for how this is done). It then OR's the new status information with structure the status information as stored in that UCB (the of the System/360 is such that this performs a "V"1 operation on the UCB. the Normally, routine is meaningful) and then "wait at returns modified might be on: this (the reloading the old registers and old PSW been in semaphore" old that point by PSW has by this point to turn off the wait bit if it for the signifigance of this, see the Traffic Controller writeup). If, UCB is on, into the however, the "fast processing required" bit in the the routine used the traffic XPER routine to transfer controller to run a new process (see the writeups for the "wait semaphore" and the "fast processing required" bit, along with the writeup on the XV routine, for -65- Chapter 3 Details more information). modified to point I/0 Interrupts Since the NEXTTRY field has usually been to the PCB for the device handler, this does indeed provide fast processing. Chapter 3 Details -66- Supervisor DETAILS OF THE SUPERVISOR PROCESS The supervisor process, which serves in this system the general capacity of the top-level supervisor, serves to provide processing of the job stream, the in system as giving structure to the interface between the user jobs and the nucleus. There is one supervisor process per job-stream, created by the IPL routine (see below). Since these are created in separate groups, there is no communication between them; they are invisible to each other. The supervisor processes, as was mentioned above, are created by the IPL routine. This entered upon an IPL (initial runs free of the rest of is the routine that is Program Load). The IPL routine the system; most particularly, there is no point in the IPL routine, until the very end, at which the (Thus, traffic there is controller a flag set can be such logically that if the traffic controller does get entered during this period, IPL routine makes an will stop cold. Such a little before explicit transfer to it, transfer could be entered. the the system caused by too core, for example.) The IPL routine first sets up a PCB for itself, in a permanently-allocated area, and sets pointers RUNNING and Chapter 3 Details NEXTTRY -67- to point to that Supervisor PCB, as well as all the chains within the PCB. Next, all available memory (not used by system) the is placed on the free list, and the protection keys associated with all of memory are set to zero. Other PCB's are now created, one process, and each in for each supervisor a group by itself (thus the "create process" primitive cannot process "*IBSUP". Then there is a call to "start is named be directly used here); process" for each of these processes, starting them each in the Job Stream Processor routine, with register 3 pointing to an argument list containing a pointer to the UCB for the reader and a pointer to the UCB for the printer for this job stream (all UCB's are stored memory), in a permanently-assigned part of and register 4 containing the protection key which should be used for user programs in this job stream. Then the IPL routine modifies its PCB to read "stopped", and only then does it transfer (via routine XPER) to the traffic controller, to begin running the supervisor processes. Upon entry, the supervisor proces.ses first create and start processes called "*IN" pointer to and "*OUT", passing them a their UCB's in register 3, and their protection key in register 4. each job in turn. Then it begins to sequentially process Chapter 3 -68- Supervisor Details reading Upon a $JOB card, the first card of each job, to the supervisor process first sends it the determines then allocates that amount, lying on a 2K the required and then memory of amount It printer. the of (because boundary hardware), and sets the key associated with protection (as that area to the one to be assigned to the user program the IPL routine). by sent will USERPROG, which program first The become begins to run the process that the user in. scans the device assignment fields, then process It then creates a process called form these fields being of the (Note name=devtype. that "name' cannot begin with an asterisk ("*"): see below.) For it creates a process called "name" and starts each of these, that process in an interface routine for that "devtype". For used "devtype" either IN or OUT, the routine reads messages from that that routine as waits appropriate, for a process, and then sends that reply to the For original calling process. interface one from the user process, sends them unchanged to process "*IN" or "*OUT", reply is is the "devtype" one being EXCP, the earlier in this described chapter under the Device Handler heading. The supervisor module then begins to object deck into his performed as necessary. partition of read core. the user's Relocation is When this is. completed, the process Supervisor -69- Chapter 3 Details PCB USERPROG's has its blocked bit temporarily turned on. Then the supervisor process starts the user process location, specified and the modifies the PSW so that it then will run in user mode under at the key. specified Then the bit is turned off and the user process can start to blocked run. starts At this point, the supervisor process to wait for a message to be returned. This can be either a "success" or a "failure" message. In either case, the content is message printed on the printer, and of the the supervisor process destroys all of the processes created for or by the job, frees the partition of memory allocated, and goes user to the next job. Whenever supervisor (which it there are no more jobs to be run, the process enters an infinite loop reading messages will never get). Chapter 3 Details -70- User programs DETAILS OF USER PROGRAMS User programs run in a process or process of their own. They start out, as was mentioned called USERPROG. However, more free creation of above, processes in one process be may created; subsidiary processes is encouraged. User processes cannot have names starting with an asterisk ("*"), nor can they destroy a process whose name does begin with an asterisk. Thus, the supervisor is protected routines, XH and against the user. There two are XQUE, which signal success and failure, respectively. XH Is usually the program, wheras XQUE routine invoked by the user, condition (XQUE can take English-language message). is usually upon an Both of these by called by a system detection argument called of an error In the form of an routines send a message to the supervisor process and then block themselves. Appendix A Classroom use -71- USE OF THE EXAMPLE OPERATING SYSTEM IN THE CLASSROOM ENVIRONMENT This operating system course on the principles of was designed for design of its use in a operating systems. There are two categories of such use. First, it can be presented, either whole or piecemeal, to the students for their examination as a case study. When learning about the generalized aspects of operating systems, it is useful be able to tie each of these down by seeing how a real operating system has handled such problems. The second category of use, however, is the more important. After the students have studied the example operating system and become familiar with its workings, they can be assigned problems concerned with adding new features not included in the original version. Thus, detection (such one and a facility assignment may be to add a facili prevention of ty for the memory-allocation not currently being provided). easier to have students add such features to deadlocks It is much an existing -72- Appendix A Classroom use operating system than to have them program, as it were, in a vacuum, since their additions can easily be checked out by running test cases which would cause, say, memory-allocation deadlocks, and noticing taken. Without an whether desired actions special grading program which the students' programs as subroutines, providing them with a pseudo-environment similar to (but not identical the are example operating system available, the analogue is to construct a calls the to) environment these programs would have embedded within a real system. Provision of quite difficult timing (in behavior of this type of facility becomes when there is of necessity a dependence on device management, for example) or on the the programs running on the system at the time (with respect to some aspects of memory management and other modules). There has yet been little direct experience with this system in the classroom in this regard, so there information yet available. on knowledge may be gotten from (Cor classroom-oriented how best is little to do this. Some 63), a report on the assembler used in 6.251 at M.I.T. early sixties; certain aspects of this text are with respect to assignments given to the students. CAP in the analogous Appendix A Classroom use These categories. assignments -73- may be broken down into a few Fixing minor flaws -74- Appendix A Cl,assroom use FIXING MINOR OPERATIONAL FLAWS PRESENT IN THE EXISTING SYSTEM having This category includes assignments dealing with the correct students a minor flaw in the system, which is not really critical, but which possibly has tricky aspects. Currently, after using "stop the primitive process" (routine XZ), the stopped process cannot be restarted by the "start process" primitive (routine XY), process be might stopped the since in the middle of the "leave smc section" primitive (routine XCOM) stopped waiting a on semaphore no other process will ever access (at to why, see the which appropriate documentation). An assignment to test for such a condition the in "start process" students' basic understanding of the routine would test the data its and system bases. Currently, semaphores are not usable directly by user programs (although messages can be utilized to provide the same operational capability). This is because certain system information, a pointer to the PCB of the first process waiting on the semaphore, is stored in the semaphore itself. Therefore, if a semaphore could be stored in the user's partition, this pointer's integrity could not be guaranteed. -75- Appendix A Classroom use Fixing minor flaws Thus, the system currently allows only and "V' routines, and these call ing routines "P" the call pass only system semaphores, stored outside partition and arguments. The students could facility accessible usable only for be system user's any to assigned provide semaphores used in in a future an fro m awareness a entry a system area , at the name for the and pseudo-P created pseudo-V requests. The basic thing the students would get from apart as routines, would have to have user's req uest, and passing back to of the user's programs for semaphore-type by creating semaphore by be synchroniz ation. Basically, it point to routines system this, gaining a familiarity with P and V, would be an of the problems of memory allocation, as implemente d within this system, since full records will have to be kept of what name is associated with and also of what what semaphore, semaphores were allocated by the user's programs, so that these can be freed at the end of the bookkeeping This serves to job. get the student down into the details very quickly. The system as now implemented performs One assignment expect another USER:username, is for the students field where on the username $JOB is no accounting. to modify the system to card, of the name of the form the user to which this work is to be charged. At the end of the job, an Fixing minor flaws -76- Appendix A Classroom use accounting tailsheet is printed containing the username, the resources used, and the total cost. problem, what a 370/155 and its peripheral devices what paper cost then asked to develop implementing various goals for this pricing this per month, cost, what an operator costs per hour supplies and what types of functions he must perform and to could be given basic information on students the As an adjunct Nielson (Nie 70) is a helpful various the be would scheme for each at which The paper by implemented. in reading collateral schemes pricing installation job, this regard. there then And small bugs noticed in the system are during its development but left in for possible problems for the student. These category. As an example, interrupt might be traffic controller best probab ly serve it pending because PO ssible is the warm-up that process when of, in say I, a timer goes to the a performing "P" operation on a semaphore with value 0. When the next process running, is scheduled and begins it will get the timer runout trap right off, even though it was meant for the last process. This is perhaps not crucia I but it is certainly the notion of what a inelegant: it tends to undermin e process is. A simple test will corr ect this one. Another inelegancy in the system is caused by the large Appendix A Classroom use number (4) -77- of save areas Fixing minor flaws in the PCB; these save areas together take up the vast majority of the bit of history development of certain is the perhaps system in it order was PCB's here: decided during that the perhaps modules might neeed to call each other recursively, or at least in very complicated pat terns. Thus, whereby any system routine cou 1d allocate automatic storage, in the PL/I styl e, the (A space. a facility add ed. was of block a However, memory allocator would be nece ssary to do this, thus it needed space for its allocator called save the are a in the PCB. The memory memory free er, which th us needed its own space in the PCB, since it couldn't all ocate its own, being unable to call back on th e allocator. The timer runout routine needed space, since a t imer runout could time, even in the middle o f memory mana gement. any And the routines themselves needed a sa ve area, for use before coul d allocate an the automatic region into they wou1 d move their old save area that region, f reeing that part of the PCB up for other area, which that this facility w so they au toma tic area. After the allocation of cal s; this explains the "next save save at occur deeply imbedded remo ved from the PCB certain amount area" pointer in the currently unused. The problem here is no longer needed, but the scheme was hat not one of the save areas coul.d be The student could possibly, with a of rethinking, devise a new method for save -78- Appendix A Classroom use Fixing minor flaws areas which does not use up quirte so much memory. These important small as a "warm-up" whole, problems are not terribly but have their pedogogical uses in getting the students used to working on the system. Major modifications -79- Appendix A Classroom use MAJOR MODIFICATIONS TO EXISTING MODULES This category includes those changes to the system which are only modifications to the now-present structure, but which have major to impact the system, and are important pedogogically. the processes running i n the system all if Currently, start to request memory to the extent to provide it, and there is of the memory allocator ability be no process freeing memory, a deadlock will in processes the system memory, or waiting for for waiting performance, memory. if might well for other waiting either processes, are runnable there (because being performed by traffic the a detection a would will This test, which ever deadlock be operations would be controller as z n extension to a currently perfo rms, would cause it randomly-selected of or input-ou tput no are One in the syst em are currently runnable is performed). similar test which it throw processes no-one that currently system it does not stop the systerr cold. A student to detect that all blocked, and themselves degrade could be assigned one of two types of projec ts. be free to pro cesses seriously can All reached. be waiting on other which processes This the surpassing of job off situation the system. Here, is easy, but to the the -80- Appendix A Classroom use modifications Major modifications to the system will cause problems. Currently, the method to throw a job off the system has as part sending a be allocated. words, if The requi res were ever to to be impossible to the removed far A have wou1 d controller, the most basic part of the system, traffic calling on the supervisor process, a section of very satisfy ourselves in this p osI tion. find pedogogically more important problem is that we the memory required here is jus t a few amount but would be very likely we it to the supervisor process in its group. message The problem here is that sending a message to of viewpoint of from it. This modularity, the system is a definite p roblem from resolution its and is non-obvious. possible Another memory-allocation deadlocks. It student assignment would deadlocks be deadlock it use the memory while the get caught in the controller to more likely same deadlock. An important, but intellectually difficult assignment would be to traffic allocator. completes its compute-bound section, it might free enough memory to allow the others to run, but it will partial has a long compute-bound program running, which has no current use for programs which When detect might be that with five job streams into the system, four of them are involved in a fifth to associated with extend the detect this type of deadlock coming on, and deal with mentioned Major modifications -81- Appendix A Classroom use as it above (of in the course, detection deadlock problem all of the problems noted in reference to that problem still such first apply). The specifics of would, of course, depend on the instructor; there is certainly no easy, general solution. Another mod ification of similar dealing deadlocks, with would complexity, be to and improve also the device-allocatio n procedure. Currently, each job can request one reader one and printer, where which device is to be assigned is dete rmined before the request is made. certainly the not most efficient manner of This is proceeding (particularly si nce there are twice as nany printers present on the system o n which we run as there are readers!) but is simple moreover completely avoids the problem of deadlocks. An assignment to the students to improve the situation would be a good Moreover, to one a rea 1 test their knowledge there deadlocks. improvement in system performance might be gained if, say, a reader used by a job were to whenever of be released were no more cards to be read for that job, thus allowing a then-created supervi sor process to begin process the next supervisor process). to job (this would change the status of the Appendix A Classroom use -82- Major additions MAJOR ADDITIONS TO THE EXISTING SYSTEM There is finally the category of major additions of new to features the system. These can, of course, take on any form. A few representative ones are considered below. One assignment would be to add routines to to handle other types of devices, the as such disk. This assignment in itself might be of small teaching it system value, but the way for further assignments. The modifications lays would come in the Job Stream Processor and in the device management routines. If there were disk management, there could then come an assignment to, let us say, optimize the coming to in the disk by a order of requests modification to the disk management module. Here can be clearly seen the advantage of a having real modifications: operating there is system which to base a clumsy disk management, furthermore, there is no need to on write disk-environment simulator. Once there is the possibility of adding a file system. might vary, but Specifics of this it might be noticed that, from the user's viewpoint, the fields on the $JOB card make easy provision -83- Appendix A Classroom use Major additions for this. With disk management and a simple file system (perhaps even none at all, for simple versions) we can perform and output spooling. And with spooling, addition of various job scheduling algorithms becomes Exact input details instructor. here, of course, a practical are best assignment. left to the -84- Appendix B System/360 APPENDIX B THE IBM SYSTEM/360 This appendix explains the structure System/360 to a degree suitable to allow a of the reader who IB is unfamiliar with the System/360 to understand the terminology used in this thesis. The System/360 digital computer. is a general-purpose, stored-program Its words are 32 bits long and its bytes are 8 bits long. Internal character coding is in EBCDIC, a variant of BCD. Addressing of data fields is by a 24-bit address giving the beginning byte's address. Each 2K-byte block of memory can have a protection key from 0-15 associated with it, used as described below. The main processor state register is the Program Status Word (PSW). The It stores the following. instruction counter, which is the address of the byte following the current instruction. The program/supervisor mode switch. This bit is used to distinguish between program mode, where certain priviledged instructions may not be executed, and supervisor mode, where Appendix B System/360 all -85- instructions may be executed. The instruction to reload the complete PSW instructions to reload sensitive sections of the PSW, or to is a priviledged as instruction, are reset storage keys, or to perform input/output. The program conjunction above. protection with the key. storage This protection If it is zero, the program may attached memory; if it key is used keys access in mentioned any part of is non-zero, he can access only those sections of memory with a matching key. A running/waiting flag. This bit indicates whether program is running in state (normal) program in wait state is not in waiting for the occurrence or in wait state. A execution, of the whatever but is rather interrupts are enab 1ed. Interrupt inhibit, or flags. delay These fl age are used to allow, interrupts, depending on the setting of the flags, and the nature of the device. Other fields, not important here. Input/output is performed by means of the instruction, the Channel Address (CAW), a word stored in a permanently-assigned address of memory. The CAW contains the protection key by i/o" which specifies the device address to be used. This instruction sends to the channel Word "start the operation, and contains to be used the address of the first -86- Appendix B System/360 Channel Command Word (CCW) the channel which the contain operations fetches the CCW's, to be performed, along with flags to control such options as command chaining or special interrupts. Input/output interrupts occur when the appropriate bit in the PSW allows them, or wait until it Channel request. stored, does. They store a Status Word (CSW) which indicates the status of the Then, as on and a new other PSW, interrupts, the old PSW is stored in a permanently-assigned location, is loaded (thus effecting a transfer). References -87REFERENCES (Cor 63) Corbatd, F.J., Poduska, J.W., and Saltzer, J.H., Advanced Computer Programming: A Casa Study of_ a Classroom Assembly Progra..m, M.I.T. Press, Massachusetts Institute of Technology, Cambridge, Massachusetts, 1963. (Cor 65) Corbat6., F.J., jet al.,. "A New Remote Accessed Man-Machine System," AFI PS Conf. Proc. 22 (1965 FJCC), pp. 185-247. (Cri 65) Crisman, P.A., ed., lb& Compatible Time-Sharing System: A Programmer's Guide, (second edition), M.I.T. Press, Massachusetts Institute of Technology, Cambridge, Massachusetts, 1965. (Dij 68) Dijkstra, E.W., Multiprogramming "The Structure System," Comm. 1968), pp. 341-346. (Don 72) Donovan, J.J., Systems Computer Science Series, Programming, 1972. (Han 70) Hansen, P. B., "The Nucleus of System," Comm. ACM JA, 238-241. (IBM 'THE' of the ACM 11,. 5 (May, a McGraw-Hill Multiprogramming (April 1970), pp. :1) IBM System Reference Library, J 3 7090-7094 IubSYS Qprating Monitor System, System Version I 3_, 7090- 36, Number Number (iBSYS), ile Form c28-6248-7 1966. (IBM :2) Sv stem/360 IBM IBM System Reference Library, Planning for Multipro r ammi n Operating System, Witha Fixed Number of Tasks, Version HL Hi), File Number S360-36, Form c27-6939-0. (IBM :3) iBM IBM System Reference Library, M.V.T. Control Program Logic Summar., S360-36, Form Y28-6658-0. (Nie 70) Nielson, Sy stem/360 Fil e Number Computer of "The Allocation Is Pr icing the Answer?", Comm. AIQU 13, 8 (August, 1970), pp. 467-474. Resources N.R., -- (Sal 66) Saltzer, J.H., "Traff c Control in a Multiplexed Computer System," M.I T. Ph.D. Thesi s, June, 1966. (Available as Project MAC Technical Report TR-30.)