Memory Manager in OS
COP4610, 5/29/2016

Outline
• Memory management
  – Introduction
  – Memory allocation strategies
• Segmentation
• Paging

Background
• A program must be brought into memory and placed within a process for it to be executed
  – A program is a file on disk
  – The CPU reads instructions from main memory and reads/writes data in main memory
    • Determined by the computer architecture
  – Address binding of instructions and data to memory addresses

Review: Instruction Execution
• Instruction fetch (IF): MAR ← PC; IR ← M[MAR]
• Instruction decode (ID): A ← Rs1; B ← Rs2; PC ← PC + 4
• Execution (EXE) – depends on the instruction
• Memory access (MEM) – depends on the instruction
• Write-back (WB)

[Figure: a simple processor datapath – S1/S2/Dest buses, ALU with inputs A and B, register file (R0, R1, ..., ia(PC), IR, PSW), control unit, MAR, MDR, and memory.
MAR = memory address register; MDR = memory data register; IR = instruction register]

Creating a load module
[Figure]

Linker function
• Combine the object modules into a load module
• Relocate the object modules as they are being loaded
• Link the object modules together as they are being loaded
• Search libraries for external references not defined in the object modules

Relocation
[Figure]

Linking
[Figure]

A Sample Code Segment
    ...
    static int gVar;
    ...
    int proc_a(int arg){
        ...
        gVar = 7;
        put_record(gVar);
        ...
    }

The Relocatable Object Module
Data segment:
    Relative Address   Generated variable space
    0000               ...
    0036               [space for gVar variable]
    ...
    0049               (last location in the data segment)
Code segment:
    Relative Address   Generated code
    0000               ...
    0008               entry proc_a
    ...
    0220               load  =7, R1
    0224               store R1, 0036
    0228               push  0036
    0232               call  'put_record'
    ...
    0400               External reference table
    0404               'put_record'  0232
    ...
    0500               External definition table
    0540               'proc_a'  0008
    ...
    0600               (symbol table)
    ...
    0799               (last location in the code segment)

The Absolute Program
Code segment:
    Relative Address   Generated code
    0000               (other modules)
    ...
    1008               entry proc_a
    ...
    1220               load  =7, R1
    1224               store R1, 0136
    1228               push  1036
    1232               call  2334
    ...
    1399               (end of proc_a)
    ...                (other modules)
    2334               entry put_record
    ...
    2670               (optional symbol table)
    ...
    2999               (last location in the code segment)
Data segment:
    Relative Address   Generated variable space
    ...
    0136               [space for gVar variable]
    ...
    1000               (last location in the data segment)

The Program Loaded at Location 4000 (relative 0000 → physical 4000)
    Address    Generated code
    ...        (other processes' programs)
    4000       (other modules)
    ...
    5008       entry proc_a
    ...
    5036       [space for gVar variable]
    ...
    5220       load  =7, R1
    5224       store R1, 7136
    5228       push  5036
    5232       call  6334
    ...
    5399       (end of proc_a)
    ...        (other modules)
    6334       entry put_record
    ...
    6670       (optional symbol table)
    ...
    6999       (last location in the code segment)
    7000       (first location in the data segment)
    ...
    7136       [space for gVar variable]
    ...
    8000       (other processes' programs)

Variations in program linking/loading
[Figure]

Normal linking and loading
[Figure]

Load-time dynamic linking
[Figure]

Run-time dynamic linking
[Figure]

UNIX Style Memory Layout for a Process
[Figure]

Memory Management

Requirements on Memory Designs
• The primary memory access time must be as small as possible
• The perceived primary memory must be as large as possible
• The memory system must be cost effective

Functions of Memory Manager
• Allocate primary memory space to processes
• Map the process address space into the allocated portion of the primary memory
• Minimize access times using a cost-effective amount of primary memory

The External View of the Memory Manager
[Figure: application programs reach the memory manager alongside the process, file, and device managers; Windows-side calls shown: VirtualAlloc(), VMQuery(), VirtualLock(), VirtualFree()]
[Figure: Windows side also includes ZeroMemory(); UNIX-side calls shown: exec(), shmalloc(), sbrk(), getrlimit()]

Storage Hierarchies

The Basic Memory Hierarchy
• CPU registers
• Primary memory (executable memory), e.g. RAM
• Secondary memory, e.g. disk or tape

Contemporary Memory Hierarchy
• From faster access (top) to larger storage (bottom):
  – CPU registers
  – L1 cache memory
  – L2 cache memory
  – "Main" memory                (primary/executable)
  – Rotating magnetic memory     (secondary)
  – Optical memory
  – Sequentially accessed memory

Exploiting the Hierarchy
• Upward moves are (usually) copy operations
  – Require allocation in upper memory
  – Image exists in both higher & lower memories
  – Updates are first applied to upper memory
• Downward move is (usually) destructive
  – Destroy image in upper memory
  – Update image in lower memory
• Place frequently-used info high, infrequently-used info low in the hierarchy
• Reconfigure as the process changes phases

Overview of Memory Management Techniques
• Memory allocation strategies
  – View the process address space and the primary memory as contiguous address spaces
• Paging- and segmentation-based techniques
  – View the process address space and the primary memory as a set of pages/segments
  – Map an address in the process space to a memory address
• Virtual memory
  – Extension of paging/segmentation-based techniques
  – To run a program, only the current pages/segments need to be in primary memory

Memory Allocation Strategies
• There are two different levels in memory allocation

Two levels of memory management
[Figure]

Memory Management System Calls
• In UNIX, the system call is brk
  – Increases the amount of memory allocated to a process

Malloc and New functions
• They are user-level memory allocation functions, not system calls

Memory Management in a Process
[Figure]

Issues in a memory allocation
algorithm
• Memory layout/organization – how to divide the memory into blocks for allocation?
  – Fixed partition method: divide the memory once, before any bytes are allocated
  – Variable partition method: divide it up as you are allocating the memory
• Memory allocation – select which piece of memory to allocate to a request
• Memory organization and memory allocation are closely related
• It is a very general problem
  – Variations of this problem occur in many places, for example disk space management

Fixed-Partition Memory Allocation
• Statically divide the primary memory into fixed-size regions
  – Regions can have different sizes or the same size
• A process/request can be allocated to any region that is large enough

Fixed-Partition Memory Allocation – cont.
• Advantages
  – Easy to implement
  – Good when the sizes of memory requests are known
• Disadvantages
  – Cannot handle variable-size requests effectively
  – Might need to use a large block to satisfy a request for a small size
  – Internal fragmentation – the difference between the request and the allocated region size; space allocated to a process but not used
    • It can be significant if the requests vary considerably in size

Queue for each block size
[Figure]

Allocate a large block to a small request?
[Figure]

Variably-sized memory requests
[Figure]

Variable Partition Memory Allocation
• Grant only the size requested
  – Example (512 bytes total):
    allocate(r1, 100), allocate(r2, 200), allocate(r3, 200),
    free(r2), allocate(r4, 10), free(r1), allocate(r5, 200)

Issues in Variable Partition Memory Allocation
• Where are the free memory blocks?
  – Keeping track of the memory blocks
  – List method and bitmap method
• Which memory blocks to allocate?
  – There may exist multiple free memory blocks that can satisfy a request. Which block to use?
• Fragmentation must be minimized
• How to keep track of free and allocated memory blocks?

The block list method
[Figure]

Information of each memory block
• Address: block start address
• Size: size of the block
• Allocated: whether the block is free or allocated

    struct list_node {
        int address;
        int size;
        int allocated;
        struct list_node *next;
    };

After allocating P5
[Figure]

After freeing P3
[Figure]

Design issues in the list method
• Linked list or doubly linked list?
• Keep track of allocated blocks in the list?
  – Two separate lists
    • Can be used for error checking
• Where to keep the block list
  – Statically reserved space for the linked list
  – Use space within each block for memory management

Reserving space for the block list
[Figure]

Block list with list headers
[Figure]

Bitmap method
[Figure]

Which free block to allocate
• How to satisfy a request of size n from a list of free blocks
  – First-fit: allocate the first free block that is big enough
  – Next-fit: allocate the next free block that is large enough, continuing from where the previous search ended
  – Best-fit: allocate the smallest free block that is big enough
    • Must search the entire list, unless it is ordered by size
    • Produces the smallest leftover hole
  – Worst-fit: allocate the largest free block
    • Must also search the entire list
    • Produces the largest leftover hole

Fragmentation
• Without mapping, programs require physically contiguous memory
  – External fragmentation: total memory space exists to satisfy a request, but it is not contiguous
  – Internal fragmentation: allocated memory may be slightly larger than the requested memory; this size difference is memory internal to a partition that is not being used

Fragmentation – cont.
• Reduce external fragmentation by compaction
  – Shuffle memory contents to place all free memory together in one large block
  – Compaction is possible only if relocation is dynamic and done at execution time

Compacting memory
[Figure]

Memory Mapping
• How to reduce internal fragmentation?
  – Large blocks mean large internal fragmentation
  – Wasted memory
• Divide the memory into small segments
• Memory mapping
  – A logical address from the CPU is mapped to a physical address

Logical vs. Physical Address Space
• The concept of a logical address space that is bound to a separate physical address space is central to proper memory management
  – Logical address – generated by the CPU; also referred to as a virtual address
  – Physical address – the address seen by the memory unit

A Simple Memory Mapping
• Through base and bound registers
  – A simple mapping between logical and physical addresses
  – Logical address + base => physical address

Separate code and data spaces
• Two sets of base and bound registers
  – One set for the code segment
  – One set for the data segment

Code/data memory relocation
[Figure]

Segmentation
• Memory-management scheme that supports the user view of memory
• A program is a collection of segments. A segment is a logical unit such as: main program, procedure, function, local variables, global variables, common block, stack, symbol table, arrays

Logical View of Segmentation
[Figure: segments 1–4 of the user space mapped to scattered regions of the physical memory space]

Segmentation – cont.
• Divide the logical address space into segments (variable-sized chunks of memory)
• Each segment has a base and bound register
  – So segments do not need to be contiguous in the physical address space
  – But the logical address space is still contiguous

Two segmented address spaces
[Figure]

Segmentation Architecture
• A logical address consists of a two-tuple: <segment-number, offset>
• Segment table – maps two-dimensional logical addresses to one-dimensional physical addresses; each table entry has:
  – base – contains the starting physical address where the segment resides in memory
  – limit – specifies the length of the segment

Segmentation memory mapping
[Figure]

Contiguous 24K logical address space
[Figure]

Memory Protection
• Within each segment, through the bound register
  – If the offset of an address is larger than the bound, the address is illegal

Memory Protection – cont.
• Protection. With each entry in the segment table associate:
  – validation bit = 0 => illegal segment
  – read/write/execute privileges
• Protection bits are associated with segments; code sharing occurs at the segment level
• Since segments vary in length, memory allocation is a dynamic storage-allocation problem

Sharing of segments
[Figure]

Segments to pages
• Large segments do not help the internal and external fragmentation problems
  – So we need small segments
• Small segments are usually full
  – So we don't need a length register – just make them all the same length
• Identical-length segments are called pages
• We use page tables instead of segment tables
  – Base register but no limit register

Paging
• The physical address space of a process can be noncontiguous; the process is allocated physical memory whenever the latter is available
• Divide physical memory into fixed-size blocks called frames (size is a power of 2)
• Divide logical memory into blocks of the same size called pages
• Keep track of all free frames
• To run a program of size n pages, we need to find n free frames and load the program
• Paging suffers internal fragmentation

Page and page frame
• Page – the information
  – can be stored in memory (in a page frame)
  – can be stored on disk
  – multiple copies are possible
• Page frame – the physical memory that holds a page
  – a resource to be allocated

Address Translation Scheme
• Set up a page table to translate logical to physical addresses
• An address generated by the CPU is divided into:
  – Page number (p) – used as an index into a page table, which contains the base address of each page in physical memory
  – Page offset (d) – combined with the base address to define the physical memory address that is sent to the memory unit

Address Translation Architecture
[Figure]

Paging Example
[Figure]

Hardware register page table
[Figure]

Problems with page tables in registers
• Practical limit on the number of pages
• Time to save and load the page registers on context switches
• Cost of hardware registers
• Solution: put the page table in memory and have a single register that points to it

Page tables in memory
[Figure]

Page table mapping
[Figure]

Problems with page tables in memory
• Every data memory access requires a corresponding page-table memory access
  – The memory usage has doubled
  – And program speed is cut in half
• Solution: cache page table entries
  – called a translation lookaside buffer, or TLB

Associative Register
• Associative registers – parallel search on (page #, frame #) pairs
• Address translation (A´, A´´)
  – If A´ is in an associative register, get the frame # out
  – Otherwise get the frame # from the page table in memory
  – Only applies within one process

TLB flow chart
[Figure]

TLB lookup
[Figure]

Effective Access Time
• Associative lookup = ε time units
• Assume the memory cycle time is 1 microsecond
• Hit ratio (α) – percentage of times that a page number is found in the associative registers; the ratio is related to the number of associative registers
• Effective Access Time (EAT):
    EAT = (1 + ε)α + (2 + ε)(1 – α) = 2 + ε – α

Why TLBs work
• Memory access is not random; that is, not all locations in the address space are equally likely to be referenced
• References are localized because of
  – sequential code execution
  – loops in code
  – groups of data accessed together
  – data accessed many times
• This property is called locality
• TLB hit rates are 90+%

Good and bad cases for paging
[Figure]

Memory Protection
• Memory protection is implemented by associating protection bits with each frame
• A valid-invalid bit is attached to each entry in the page table:
  – "valid" indicates that the associated page is in the process' logical address space, and is thus a legal page
  – "invalid" indicates that the page is not in the process' logical address space
Page table entry
[Figure]

Page table protection
• Three bits control access: read, write, execute
• Possible protection modes:
  – 000: page cannot be accessed at all
  – 001: page is read only
  – 010: page is write only
  – 100: page is execute only
  – 011: page can be read or written
  – 101: page can be read as data or executed
  – 110: write or execute, unlikely to be used
  – 111: any access is allowed

Large address spaces
• Large address spaces need large page tables
• Large page tables take a lot of memory
  – 32-bit addresses, 4K pages => 1M pages
  – 1M pages => 4 Mbytes of page tables
• But most of these page tables are rarely used, because of locality

Two-level paging
• Solution: reuse a good idea
  – We paged programs because their memory use is highly localized
  – So let's page the page tables
• Two-level paging: a tree of page tables
  – The master page table is always in memory
  – Secondary page tables can be on disk

Two-level paging
[Figure]

Another view of two-level paging
[Figure]

Two-Level Page-Table Scheme
[Figure]

Two-Level Paging Example
• A logical address (on a 32-bit machine with 4K page size) is divided into:
  – a page number consisting of 20 bits
  – a page offset consisting of 12 bits
• Since the page table is paged, the page number is further divided into:
  – a 10-bit page number
  – a 10-bit page offset
• Thus, a logical address is as follows:

    |    page number    | page offset |
    |   p1   |   p2     |      d      |
       10 bits  10 bits     12 bits

Address-Translation Scheme
• Address-translation scheme for a two-level 32-bit paging architecture
[Figure]

Two-level paging
• Benefits
  – The page table need not be in contiguous memory
  – Allows page faults on secondary page tables
    • Takes advantage of locality
  – Can easily have "holes" in the address space
    • This allows better protection from overflows of arrays, etc.
• Problems
  – Two memory accesses to get to a PTE
  – Not enough for really large address spaces
    • Three-level paging can help here

Three-level paging
[Figure]

Multilevel Paging and Performance
• Since each level is stored as a separate table in memory, converting a logical address to a physical one may take four memory accesses
• Even though the time needed for one memory access is quintupled, caching permits performance to remain reasonable
• A cache hit rate of 98 percent yields:
    effective access time = 0.98 x 120 + 0.02 x 520 = 128 nanoseconds
  – only a 28 percent slowdown in memory access time

Inverted Page Table
• One entry for each real page of memory
  – One inverted page table shared among all processes
• Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns the page
• Decreases the memory needed to store each page table, but increases the time needed to search the table when a page reference occurs
• Use a hash table to limit the search to one, or at most a few, page-table entries

Inverted Page Table Architecture
[Figure]

Shared Pages
• Shared code
  – One copy of read-only (reentrant) code shared among processes (i.e., text editors, compilers, window systems)
  – Shared code must appear in the same location in the logical address space of all processes
• Private code and data
  – Each process keeps a separate copy of the code and data
  – The pages for the private code and data can appear anywhere in the logical address space

Shared Pages Example
[Figure]

Internal fragmentation
[Figure]

Segmentation with Paging
• Combine the advantages of segmentation and paging
  – Page the segments
• MULTICS
  – The MULTICS system solved the problems of external fragmentation and lengthy search times by paging the segments
  – The solution differs from pure segmentation in that the segment-table entry contains not the base address of the segment, but rather the base address of a page table for this segment

MULTICS Address Translation Scheme
[Figure]

Segmentation with Paging – Intel 386
• As shown in the following diagram, the Intel 386 uses segmentation with paging for memory management, with a two-level paging scheme

Intel 80386 address translation
[Figure]

Comparing Memory-Management Strategies
• Hardware support
• Performance
• Fragmentation
• Relocation
• Swapping
• Sharing
• Protection