Storage Allocation
Operating System
Hebrew University
Spring 2007

Background
• A program must be brought into memory and placed within a process for it to be run.
• Input queue – collection of processes on the disk that are waiting to be brought into memory to run the program.
• User programs go through several steps before being run.

Binding of Instructions and Data to Memory
• Address binding of instructions and data to memory addresses can happen at three different stages:
  – Compile time: If the memory location is known in advance, absolute code can be generated; the code must be recompiled if the starting location changes.
  – Load time: Relocatable code must be generated if the memory location is not known at compile time.
  – Execution time: Binding is delayed until run time if the process can be moved during its execution from one memory segment to another. This needs hardware support for address maps (e.g., base and limit registers).

[Figure: Multi-step Processing of a User Program]

Division of Responsibility
• Compiler: generates one object file for each source code file, containing information for that file. The information is incomplete, since each source file generally uses some things defined in other source files.
• Linker/Loader: combines all of the object files for one program into a single object file, which is complete and self-sufficient.
• Operating system: loads object files into memory, allows several different processes to share memory at once, and provides facilities for processes to get more memory after they have started running.
• Run-time library: provides dynamic allocation routines, such as malloc and free in C.

Logical vs. Physical Address Space
• The concept of a logical address space that is bound to a separate physical address space is central to proper memory management.
  – Logical address – generated by the CPU; also referred to as virtual address.
  – Physical address – address seen by the memory unit.
• Logical and physical addresses are the same under compile-time and load-time address binding; logical (virtual) and physical addresses differ under execution-time address binding.

Memory Management Unit (MMU)
• Hardware device that maps virtual addresses to physical addresses.
• In the MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory.
• The user program deals with logical addresses; it never sees the real physical addresses.

[Figure: Dynamic Relocation using a Relocation Register]

Contiguous Allocation
• Single partition allocation (one process)
• Relocation register scheme used to protect user processes from each other, and from changing operating system code and data.
• Relocation register contains the value of the smallest physical address.
• Limit register contains the range of logical addresses: each logical address must be less than the limit register.

[Figure: Hardware for Relocation and Limit Registers]

Contiguous Allocation
• Multiple partition allocation (multiple processes)
• Hole – block of available memory; holes of various sizes are scattered throughout memory.
• When a process arrives, it is allocated memory from a hole large enough to accommodate it.
• Operating system maintains information about:
  – a) allocated partitions
  – b) free partitions (holes)

[Figure: Contiguous Allocation]

Dynamic Storage Allocation Problem
• Algorithms differ in how they manage the free holes:
• Best fit: search the whole list on each allocation, choose the hole that comes closest to matching the needs of the allocation, save the excess for later. During release operations, merge adjacent free blocks.
• First fit: scan for the first hole that is large enough. Also merge on releases.
• Worst fit: select the largest block; this produces the largest leftover hole.

So Which One?
• All could leave many small and useless holes.
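The three placement policies from the Dynamic Storage Allocation Problem slide can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: holes are tracked as a plain array of sizes in KB, merging of adjacent free blocks on release is not modeled, and the names `policy_t`, `choose_hole` and `serve_requests` are hypothetical, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; hole and request sizes are in KB. */
typedef enum { FIRST_FIT, BEST_FIT, WORST_FIT } policy_t;

/* Return the index of the hole chosen for `request`, or -1 if none fits. */
int choose_hole(const int *holes, size_t n, int request, policy_t policy)
{
    int chosen = -1;
    assert(request > 0);
    for (size_t i = 0; i < n; i++) {
        if (holes[i] < request)
            continue;                      /* hole too small */
        if (chosen < 0) {
            chosen = (int)i;               /* first adequate hole seen */
            if (policy == FIRST_FIT)
                break;                     /* first fit stops here */
        } else if (policy == BEST_FIT && holes[i] < holes[chosen]) {
            chosen = (int)i;               /* tighter fit, smaller leftover */
        } else if (policy == WORST_FIT && holes[i] > holes[chosen]) {
            chosen = (int)i;               /* larger hole, larger leftover */
        }
    }
    return chosen;
}

/* Serve each request in order, shrinking the chosen hole in place.
 * Returns the number of requests that could be satisfied. */
int serve_requests(int *holes, size_t n, const int *requests, size_t m,
                   policy_t policy)
{
    int served = 0;
    for (size_t r = 0; r < m; r++) {
        int h = choose_hole(holes, n, requests[r], policy);
        if (h < 0)
            continue;                      /* request cannot be placed */
        holes[h] -= requests[r];           /* leftover remains as a smaller hole */
        served++;
    }
    return served;
}
```

With holes of 20 and 15 and requests of 12 then 16, `serve_requests` places both requests under `BEST_FIT` but only the first under `FIRST_FIT`; with requests of 12, 14 and 7 the outcome is reversed.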
  – Assuming 1K blocks:
• Best-Fit sometimes performs better: assume holes of 20K and 15K; requests for 12K followed by 16K can be satisfied only by best-fit. (Best-fit puts 12K in the 15K hole and keeps the 20K hole for the 16K request; first-fit puts 12K in the 20K hole, leaving 8K and 15K.)
• But First-Fit can also perform better: assume holes of 20K and 15K; requests for 12K, followed by 14K and 7K, can be satisfied only by first-fit. (First-fit leaves 8K after the 12K request, which later holds the 7K request; best-fit leaves only 3K and 6K.)
• In practice (based on trace-driven simulation):
  – First-Fit is usually better than Best-Fit
  – First-Fit and Best-Fit are better than Worst-Fit

Fragmentation
• External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous.
• Internal Fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used.
• Reduce external fragmentation by compaction:
  – Shuffle memory contents to place all free memory together in one large block.
• Compaction is possible only if relocation is dynamic, and is done at execution time.

Segmentation
• Memory management scheme that supports the user view of memory.
• A program is a collection of segments. A segment is a logical unit such as:
  – main program
  – procedure
  – function
  – object
  – local variables, global variables
  – common block
  – stack
  – symbol table, arrays

[Figure: User's View of a Program]

[Figure: Logical View of Segmentation]

Segmentation Architecture
• Logical address consists of a two-tuple:
  – <segment-number, offset>
• Segment table – maps two-dimensional logical addresses to physical addresses; each table entry has:
  – base – contains the starting physical address where the segment resides in memory.
  – limit – specifies the length of the segment.
• Segment table base register (STBR) – points to the segment table's location in memory.
• Segment table length register (STLR) – indicates the number of segments used by a program.
• Segment number s is legal if s < STLR.
• Relocation.
  – dynamic
  – by segment table
• Sharing
  – shared segments
  – same segment number
• Allocation
  – first fit/best fit
  – external fragmentation

Protection
• With each entry in the segment table associate:
  – validation bit = 0 means illegal segment
  – read/write/execute privileges
• Protection bits are associated with segments; code sharing occurs at the segment level.
• Since segments vary in length, memory allocation is a dynamic storage allocation problem.

[Figure: Segmentation Hardware]

[Figure: Segmentation Example]

The User Program
• Information stored in memory is used in many different ways. Some possible classifications are:
  – Role in Programming Language:
    • Instructions (operations and the operands in the operations).
    • Variables (change as the program runs: locals, globals, parameters, dynamic storage).
    • Constants (used as operands, but never change: pi, for example).
  – Changeability:
    • Read-only: (code, constants).
    • Read & write: (variables).

Variables and Locations
• Can we allocate all the required memory to our program in advance (at compilation time)?
  – It is in general impossible to know at compile time how many variables are going to be created at run time:
    • recursive procedures containing local variables
    • dynamic variables
• We cannot solve all the allocations at compile time.
• Do we want to allocate the required memory in advance?
  – We want to be efficient in the management of memory: we only want to allocate storage for a variable when needed, and we want to de-allocate when possible.

Variable Lifetime
• Lifetime of a variable – the period of time during execution in which the variable has storage allocated for it.
  – Global variables: Lifetime is the entire runtime of the program.
  – Local variables (variables declared in a procedure): Lifetime is during activation of the procedure.
  – User-allocated variables (aka dynamic variables: variables created with new/malloc and destroyed with delete/free): Lifetime is from user allocation to user de-allocation.

Storage Allocation Policies
• We have the following storage allocation policies:
  – Static Allocation
  – Dynamic Allocation of local variables
  – Dynamic Allocation of user-allocated variables

Static Allocation (for globals only)
• Done at compile time
• Lifetime = entire runtime of program
• Advantage: efficient execution time

Dynamic Allocation of local variables
• Done at run time
• Lifetimes = duration of procedure activation
• Advantage: efficient storage use
  – Two Methods:
    • Stack Allocation (all variables allocated within the procedure scope)
    • Heap Allocation (unless declared 'static')

Dynamic Allocation of user-allocated variables
• Done at run time
• Lifetimes = until the user deletes it (or until it is garbage-collected)
• Advantage: permits creation of dynamic structures, like lists, trees, etc.
• Heap Allocation

Memory Management
• The memory of the machine is divided into three parts:
  1. A space for the globals and the code of the program
  2. The stack, for the locals
  3. The heap, for the dynamic variables
• The division between 1 and 2 is only logical: physically the globals and the code are (usually) placed at the base of the stack.

[Figure: The Segments]

Managing the Information
• One of the steps in creating a process is to load its information into main memory, creating the necessary segments.
• Information comes from a file that gives the size and contents of each segment.
• The file is called an object file.

Dynamic Memory Allocation
• Q. Why is static allocation not sufficient for everything?
• A. Unpredictability: we cannot predict ahead of time how much memory, or in what form, will be needed:
  – Recursive procedures.
  – OS does not know how many jobs there will be or which programs will be run.
  – Complex data structures, e.g. linker symbol table.
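The three kinds of storage discussed in the preceding slides can be seen side by side in a short C sketch. It is an illustration only: the names `calls_made`, `sum_to` and `make_squares` are hypothetical, and the point is that neither the recursion depth nor the array size is known at compile time.

```c
#include <assert.h>
#include <stdlib.h>

/* Static allocation: one copy, alive for the entire run of the program. */
int calls_made = 0;

/* Stack allocation: every activation gets its own `local` in a new stack
 * frame. How many frames exist at once depends on the run-time value of n. */
int sum_to(int n)
{
    int local = n;               /* storage released when this call returns */
    calls_made++;
    if (n <= 0)
        return 0;
    return local + sum_to(n - 1);
}

/* Heap allocation: the size is only known at run time, and the block
 * outlives this call until the caller passes it to free(). */
int *make_squares(size_t count)
{
    int *squares = malloc(count * sizeof *squares);
    if (squares == NULL)
        return NULL;             /* allocation failed (or count was 0) */
    for (size_t i = 0; i < count; i++)
        squares[i] = (int)(i * i);
    return squares;
}
```

A caller would use `sum_to(4)` without ever managing the frames itself, whereas the result of `make_squares(4)` must eventually be released with `free`, which is exactly the user-controlled lifetime described for user-allocated variables.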
• If all storage must be reserved in advance (statically), then it will be used inefficiently (enough will be reserved to handle the worst possible case).

Dynamic Memory Allocation – Cont.
• Dynamic allocation can be handled in one of two general ways:
  – Stack allocation (hierarchical): restricted, but simple and efficient.
  – Heap allocation: more general, but less efficient and more difficult to implement.

Stack-Based Allocation
• Memory allocation and freeing are partially predictable (as usual, we do better when we can predict the future).
• Allocation is hierarchical: memory is freed in the opposite order from allocation.
• If alloc(A) then alloc(B) then alloc(C), then it must be free(C) then free(B) then free(A).

Example
• Procedure call:
  – Program calls Y, which calls X. Each call pushes another stack frame on top of the stack. Each stack frame has space for variables, parameters, and return addresses.
• Stacks are also useful for lots of other things: tree traversal, top-down recursive descent parsers, etc.

Stack-Based Organization
• A stack-based organization keeps all the free space together in one place.

Handling the Stack
Let us consider the various operations that take place when a procedure is called:
1. Caller processes actual parameters (evaluation, address calculation) and stores them
2. Caller stores some control information (e.g., return address)
3. Control transferred from Caller to Callee
4. Callee allocates storage for locals
5. Callee executes
6. Callee deallocates storage for locals
7. Callee stores return value (if there is any)
8. Control transferred back to Caller
9. Caller deallocates storage used for control information and actual parameters

Heap-based Allocation
• Allocation and release are unpredictable.
• Heaps are used for arbitrary list structures and complex data organizations.
• Example: payroll system.
  We do not know when employees will join and leave the company, and must be able to keep track of all of them using the least possible amount of storage.

Heap-based Organization
• Inevitably end up with lots of holes.
  – Goal: reuse the space in holes to keep the number of holes small and their size large.
• Fragmentation: inefficient use of memory due to holes that are too small to be useful. In stack allocation, all the holes are together in one big chunk.

The Free List
• Typically, heap allocation schemes use a free list to keep track of the storage that is not in use.
• Algorithms differ in how they manage the free list:
• Best fit: keep a linked list of free blocks, search the whole list on each allocation, choose the block that comes closest to matching the needs of the allocation, save the excess for later. During release operations, merge adjacent free blocks.
• First fit: just scan the list for the first hole that is large enough. Also merge on releases.

Bit Map Allocation
• Used for allocation of storage that comes in fixed-size chunks (e.g. disk blocks, or 32-byte chunks).
• Keep a large array of bits, one for each chunk. If a bit is 0 the chunk is in use; if it is 1 the chunk is free.

Reclamation Methods
• How do we know when memory can be freed?
• It is easy when a chunk is only used in one place.
• Reclamation is hard when information is shared: it cannot be recycled until all of the sharers are finished.
• Sharing is indicated by the presence of pointers to the data. Without a pointer, the data cannot be accessed (it cannot be found).

Problems in Reclamation
• There are two problems:
  – Dangling pointers: better not recycle storage while it is still being used.
  – Core leaks (memory leaks): better not "lose" storage by forgetting to free it even when it cannot ever be used again.

Reference Counts
• Keep track of the number of outstanding pointers to each chunk of memory.
• When this goes to zero, free the memory.
• Example: file descriptors in Unix.
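The reference-counting rule above (free a chunk when its count of outstanding pointers reaches zero) can be sketched in C as follows. This is a minimal illustration, not a real library: `rc_buf`, `rc_new`, `rc_retain` and `rc_release` are hypothetical names, and the counts are adjusted by hand here only because plain C has no way to do it automatically. The `live_buffers` counter exists purely to make the moment of release observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
typedef struct {
    int    refs;   /* number of outstanding pointers to this buffer */
    size_t size;
    char  *data;
} rc_buf;

int live_buffers = 0;  /* demonstration only: buffers not yet freed */

/* Create a buffer owned by one pointer (the return value). */
rc_buf *rc_new(size_t size)
{
    rc_buf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(size ? size : 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->refs = 1;
    b->size = size;
    live_buffers++;
    return b;
}

/* Record one more pointer to the buffer. */
rc_buf *rc_retain(rc_buf *b)
{
    assert(b != NULL && b->refs > 0);
    b->refs++;
    return b;
}

/* Drop one pointer; the memory is freed when the last one goes away. */
void rc_release(rc_buf *b)
{
    assert(b != NULL && b->refs > 0);
    if (--b->refs == 0) {
        free(b->data);
        free(b);
        live_buffers--;
    }
}
```

Two holders that each call `rc_release` once leave `live_buffers` at zero only after the second release, so neither can leave the other with a dangling pointer.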
• Works fine for hierarchical structures. The reference counts must be managed automatically (by the system) so no mistakes are made in incrementing and decrementing them.

[Figure: Example]

Garbage Collection
• Storage is not freed explicitly (using a free operation), but rather implicitly: just delete pointers.
• When the system needs storage, it searches through all of the pointers (it must be able to find them all!) and collects things that are not used.
• Makes life easier for the application programmer, but garbage collectors are incredibly difficult to program and debug.

How does garbage collection work?
• Must be able to find all objects.
• Must be able to find all pointers to objects.
• Pass 1: mark. Go through all pointers that are known to be in use. Mark each object pointed to, and recursively mark all objects it points to.
• Pass 2: sweep. Go through all objects, free up those that are not marked.

[Figure: Garbage Collection]
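The two passes described above can be sketched on a toy heap. This is an illustration under strong simplifying assumptions, none of which come from the slides: objects live in a fixed array, pointers are array indices (-1 means no pointer), each object has at most two pointer fields, and the set of roots is handed to the collector explicitly. All names (`object`, `heap`, `gc_alloc`, `gc_mark`, `gc_sweep`, `gc_collect`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 8
#define MAX_FIELDS  2

/* Toy object: pointers to other objects are indices into `heap`. */
typedef struct {
    bool in_use;
    bool marked;
    int  fields[MAX_FIELDS];
} object;

object heap[MAX_OBJECTS];   /* zero-initialized: every slot starts free */

/* Take a free slot, or return -1 if the heap is full. */
int gc_alloc(void)
{
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = true;
            heap[i].marked = false;
            for (int f = 0; f < MAX_FIELDS; f++)
                heap[i].fields[f] = -1;
            return i;
        }
    }
    return -1;
}

/* Pass 1: mark the object and, recursively, everything it points to.
 * The `marked` check stops the recursion on cycles. */
void gc_mark(int idx)
{
    if (idx < 0 || idx >= MAX_OBJECTS)
        return;
    if (!heap[idx].in_use || heap[idx].marked)
        return;
    heap[idx].marked = true;
    for (int f = 0; f < MAX_FIELDS; f++)
        gc_mark(heap[idx].fields[f]);
}

/* Pass 2: free every unmarked object and clear marks for the next cycle.
 * Returns the number of objects freed. */
int gc_sweep(void)
{
    int freed = 0;
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = false;
            freed++;
        }
        heap[i].marked = false;
    }
    return freed;
}

/* Run both passes starting from the given root pointers. */
int gc_collect(const int *roots, size_t nroots)
{
    for (size_t r = 0; r < nroots; r++)
        gc_mark(roots[r]);
    return gc_sweep();
}
```

If object a points to b, and objects c and d point only to each other, collecting with a as the sole root frees c and d. Such a cycle is never freed by reference counts alone, since each object in it keeps a nonzero count.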