Operating Systems
Martin O Hanlon

Memory Management

In a multiprogramming computer the operating system resides in part of memory and the rest is used by multiple processes. The task of subdividing the memory not used by the operating system among the various processes is called memory management.

Memory management requirements

Relocation - programs are not loaded into a fixed point in memory but may reside in various areas.

Protection - each process should be protected against unwanted interference by other processes, whether accidental or intentional. All memory references generated by a process must be checked at run time to ensure that they reference only memory that has been allocated to that process.

Sharing - protection mechanisms must be flexible enough to allow a number of processes to access the same area of memory. For example, if a number of processes are executing the same program, it is advantageous to allow each process to access the same copy of the program.

Logical Organisation - programs are normally organised into modules, some of which are non-modifiable and some of which contain data that may be modified. How does the operating system organise RAM, which is a linear address space, to reflect this logical structure?

Physical Organisation - computer memory is organised into at least two levels: main memory and secondary memory. The task of moving information between the two levels of memory should be the system's responsibility (not the programmer's).

Relocation

[Fig 1: how a program is loaded from disk.]

The first step in the creation of an active process is to load a program into main memory and create a process image, as shown above. The actual program may consist of a number of compiled or assembled modules that are linked together to resolve any references between modules (see Fig 2 below). The loader places the program in memory.
It may do so in three ways:

1. Absolute loading
2. Relocatable loading
3. Dynamic run-time loading

[Fig 2: modules linked together into a load module.]

Absolute Loading

An absolute loader requires that a given module always be loaded into the same location in memory. Therefore any references in the program must correspond exactly to the actual memory locations when the program is loaded. The assignment of specific address values to memory references within a program can be done either by the programmer or at compile or assembly time. There are several disadvantages to the former approach. Firstly, every programmer would have to know exactly how programs are loaded into main memory. Secondly, if any modifications are made to the program that involve insertions or deletions in the body of the program, then all addresses would have to be altered. For this reason it is preferable to allow memory references within programs to be expressed symbolically (see Fig 3 below) and have those symbolic references resolved at compilation or assembly. Every reference to an instruction or item of data is initially represented by a symbol. In preparing the module for input to an absolute loader, the assembler or compiler converts all these references to specific addresses. The absolute loader then loads the program at the addresses specified.

[Fig 3: symbolic, absolute and relative representations of a program module.]

Relocatable Loading

A huge disadvantage of binding references to specific addresses prior to loading is that the resulting load module can only be placed in one region of main memory. As memory is normally shared between a number of processes, it is undesirable to state in advance into which region of memory a particular module should be loaded; it would be better to make that decision at load time. What we need is a module that can be loaded into any part of available memory.
To satisfy this requirement, the assembler/compiler produces not actual main memory addresses (absolute addresses) but addresses that are relative to some known point, such as the start of the program. This is shown in Fig 3 above. The start of the load module is assigned the relative address 0, and all other memory references within the module are expressed relative to the beginning of the module. With all references expressed in relative form it is quite easy for the loader to place the module in the desired location: if the module is to be loaded beginning at location x, then the loader must simply add x to each memory reference as it loads the module into memory. To assist in this task, the load module must include information that tells the loader where the address references are and how they are to be interpreted - usually relative to the program origin, but possibly relative to some other point in the program, such as the current location.

Dynamic Run-Time Loading

The relocatable loader is an obvious advance over the absolute loader; however, in a multiprogramming environment it is still inadequate. In a multiprogramming environment there is often a need to swap process images in and out of main memory to maximise the use of the processor (or for some other operational reason). To implement this it will be necessary to reload the process at different memory locations, depending on what is available at the time. Thus a program, once loaded, may be swapped out and reloaded at a different location. This would not be possible with the relocatable loader, where memory references are bound to absolute addresses at initial load time. The solution is to defer the calculation of an absolute address until it is actually needed, at run time. The load module is brought into main memory with all memory references in relative form, as in (c) of Fig 3 above.
It is not until an instruction is actually executed that the absolute address is calculated. To ensure that this conversion of addresses takes place as fast as possible, it is done by special processor hardware (the Memory Management Unit). Dynamic address calculation provides complete flexibility: a program can be loaded into any region of main memory and logical addresses are converted to physical addresses on the fly. Fig 4 below shows how the address translation is achieved. When a process is assigned to the Running state, a special register called the base register is loaded with the starting address in main memory of the process. A bounds register, which indicates the ending location of the program, is also used. During the course of program execution, relative addresses are encountered. These include the contents of the instruction register, instruction addresses that occur in branch and call instructions, and data addresses that occur in load and store instructions. Each such relative address goes through two steps of manipulation by the processor. First, the value in the base register is added to the relative address to produce an absolute address. Second, the resulting address is compared to the value in the bounds register. If the address is within bounds then instruction execution proceeds; otherwise an interrupt indicating an error is generated. This scheme allows programs to be swapped in and out of memory during execution and also provides a measure of protection: each process image is isolated by the contents of the base and bounds registers and is safe from unwanted access by other processes.

Single User, Single Process System

In this system only one user is present and one process is running at any time. Memory management is easy: memory is divided into two parts, one for the operating system and one for the program currently being executed.
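The two-step base/bounds translation described above can be sketched in a few lines. This is a minimal illustration, not from the notes; the register values and the exception name are made up for the example.

```python
# Sketch of the MMU's dynamic address translation: add the base register
# to each relative address, then check the result against the bounds register.

class MemoryViolation(Exception):
    """Raised when an address falls outside the process's allocation."""
    pass

def translate(relative_addr, base, bounds):
    absolute = base + relative_addr      # step 1: add the base register
    if absolute > bounds:                # step 2: compare with the bounds register
        raise MemoryViolation(f"address {absolute} outside {base}-{bounds}")
    return absolute

# A process loaded at location 4000, ending at 4999:
print(translate(250, base=4000, bounds=4999))   # -> 4250
```

A reference such as `translate(1500, base=4000, bounds=4999)` would raise the violation, modelling the error interrupt.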
[Diagram: main memory split into two parts - the operating system and the user program.]

Fixed Partitioning

Main memory is divided into a number of fixed-size partitions; the partitions may all be the same size, or of unequal sizes.

- Internal fragmentation
- Fixed number of processes in memory
- Contiguous memory

[Diagram: example of fixed partitioning - the operating system is always present in memory, with Partition 1 (200k), Partition 2 (300k) and Partition 3 (400k, no longer used) above it.]

Dynamic Partitioning

Partitions are created dynamically, and each process is loaded into a partition of exactly the same size as the process.

- External fragmentation
- Periodic compaction of memory space needed
- Placement algorithms: best-fit, first-fit and next-fit

[Diagram: the effects of dynamic partitioning.]

Placement Algorithms

Because of the overheads involved in compaction, the operating system must be clever in deciding how to assign processes to memory.

Best-fit: allocates the smallest hole that is big enough.
First-fit: allocates the first hole that is big enough, starting from the beginning of memory.
Next-fit: starts the scan from the last allocation and allocates the next hole that is big enough.

Deciding which placement algorithm is best depends on the size of the processes and the exact sequence of process swapping that takes place. In general, first-fit usually results in the best and fastest allocation. Next-fit usually results in allocation from the last block of memory, which is in general the largest. Best-fit results in compaction having to be carried out more frequently than the others, because it leaves lots of small fragments scattered throughout memory.

[Diagram: an example memory configuration after a number of placement and swapping-out operations. The last operation allocated a 14 Mb process into an unused 22 Mb block; the left side shows how a 16 Mb request might be satisfied under each algorithm.]

Paging

Paging was introduced to alleviate the problem of fragmentation. Memory is divided into page frames, all of equal size.
The logical address space is divided into pages of equal size. The memory manager:

1. determines the number of pages in the program;
2. locates enough empty page frames to hold them;
3. loads all of the pages into memory (the pages need not be contiguous).

A number of tables need to be maintained for this system to operate:

1. Job Table - holds, for each job, the size of the job and the memory location of its Page Table.
2. Page Table - holds, for each active job, each page number and its page frame memory address.
3. Memory Map Table - holds, for each page frame, its location and whether it is free or busy.

Logical Address

A reference to a memory location, independent of the current assignment of data to memory; a translation must be made to a physical address before memory access can be achieved (e.g. MOV Reg, 8192).

Physical Address

An actual location in main memory.

Page Addressing

Memory address translation takes place at run time. Reading a word from memory involves translating a virtual or logical address, consisting of a page number and offset, into an actual physical address, consisting of a frame number and offset. This process makes use of the Page Table entries.

[Diagram: a logical address (page number, offset) is translated into a physical address (frame number, offset).]

If the system used 16-bit addresses it could utilise memory in this fashion: bits 15-11 hold the page number and bits 10-0 hold the displacement. With this set-up the system would have 32 pages (2^5), each of 2048 bytes (2^11).

Example: if the logical address 0010100000101010 was encountered, this represents offset 42 on page 5. The Page Table would be accessed to see the mapping of page 5.
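The page/offset split in the example above can be checked with a short sketch, assuming the layout just described (a 5-bit page number over an 11-bit displacement):

```python
# Split a 16-bit logical address into page number and displacement,
# using the 5-bit page / 11-bit offset layout from the example.

OFFSET_BITS = 11

def split(logical):
    page = logical >> OFFSET_BITS                 # top 5 bits
    offset = logical & ((1 << OFFSET_BITS) - 1)   # low 11 bits
    return page, offset

page, offset = split(0b0010100000101010)
print(page, offset)   # -> 5 42
```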
[Diagram: the logical address 00101 00000101010 is translated via the Page Table - the entry for page 00101 (5) holds frame 10111 - giving the physical address 10111 00000101010.]

Static paging:

- No external fragmentation
- Fixed-size pages
- Internal fragmentation (only on the last page)
- Non-contiguous memory (via the page table)

Paging Example

Let's say memory is only 1024 bytes (2^10 address locations) and the page frame size is 256 bytes (2^8). The logical address space then consists of four pages:

page 0: 00,0000,0000 - 00,1111,1111
page 1: 01,0000,0000 - 01,1111,1111
page 2: 10,0000,0000 - 10,1111,1111
page 3: 11,0000,0000 - 11,1111,1111

and the Page Table maps page 0 to frame 01, page 1 to frame 11, page 2 to frame 00 and page 3 to frame 10.

Where is logical address 00,0000,0010? Page 0 maps to frame 01, so the physical address is 01,0000,0010 (offset 0000,0010 in frame 1).

Where is logical address 11,0000,1111? Page 3 maps to frame 10, so the physical address is 10,0000,1111 (offset 0000,1111 in frame 2).

Segmentation (static)

The concept of segmentation is based on the common practice by programmers of structuring their programs in modules - logical groupings of code. With segmented memory allocation, each job is divided into several segments of different sizes, one for each module, each containing pieces that perform related functions. A subroutine is an example of one such logical group. This is fundamentally different from a paging scheme, which divides the job into several pages all of the same size, each of which often contains pieces from more than one program module. A second important difference is that main memory is no longer divided into page frames, because the size of each segment is different - some are large and some are small. Therefore, as with dynamic partitions, memory is allocated in a dynamic manner. When a program is compiled or assembled, the segments are set up according to the program's structural modules. Each segment is numbered and a Segment Map Table (SMT) is generated for each job; it contains the segment numbers, their lengths, and (when each is loaded into memory) its location in memory.
The system maintains three tables:

1. The Job Table (as with static paging).
2. The Segment Map Table, listing details about each segment (one table per job).
3. The Memory Map Table (as before).

As with paging, the instructions within each segment are ordered sequentially, but the segments need not be contiguous in memory; we only need to know where each segment is stored. The contents of the segments themselves are contiguous (in this scheme). To access a specific location within a segment we can perform an operation similar to the one used with paged memory management; the only difference is that we work with segments instead of pages. The addressing scheme requires the segment number and the displacement within that segment, and, because the segments are of different sizes, the displacement must be verified to make sure it is not outside the segment's range. A segmented address reference requires the following steps:

1. Extract the segment number and the displacement from the logical address.
2. Use the segment number to index the segment table and obtain the segment base address and length.
3. Check that the displacement is not greater than the segment length; if it is, an invalid address is signalled.
4. Generate the required physical address by adding the displacement to the base address.

Main points:

- The benefits of segmentation include modularity of programs, sharing and protection.
- There is a maximum segment size that the programmer must be aware of.
- No internal fragmentation.
- Unequal-sized segments.
- Non-contiguous memory.
- Some external fragmentation.

Segmentation greatly facilitates the sharing of procedures or data between a number of processes. Because a segment normally holds either program code or data, different segments within the same process can have different protections set for them.
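The four translation steps above can be sketched as follows. The segment table contents are made up for illustration, and the logical address is assumed to be already split into segment number and displacement (step 1):

```python
# Sketch of segmented address translation: index the segment table,
# verify the displacement against the segment length, then add the base.

segment_table = {0: (2000, 500), 1: (6000, 300)}  # seg no -> (base, length)

def translate(segment, displacement):
    base, length = segment_table[segment]   # step 2: index the segment table
    if displacement >= length:              # step 3: verify the displacement
        raise ValueError("invalid address")
    return base + displacement              # step 4: base + displacement

print(translate(1, 120))   # -> 6120
```

A displacement past the segment's end, such as `translate(1, 400)`, signals the invalid address.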
While protection is possible in a paging environment, it is far more difficult to implement: the programmer is unaware of which pages hold a procedure or data area that he may wish to share. To share it, the programmer would have to keep track of the exact pages that held the procedure and then assign the necessary protection to those pages.

Virtual Memory - Demand Paging

Demand paging introduces the concept of loading only part of the program into memory for processing. When a process begins to run, pages are brought into memory only as they are needed. It was the first widely used scheme that removed the restriction of having the entire job in memory from the beginning to the end of its processing. With demand paging, jobs are still divided into equally sized pages that initially reside in secondary storage. Demand paging takes advantage of the fact that programs are written sequentially, so that while one section, or module, is being processed all of the other modules are idle. Demand paging allows users to run processes with less main memory than would be required under any of the schemes described earlier; it can give the appearance of an almost infinite amount of physical memory. With a virtual memory system, the logical address space available to a program is totally independent of the amount of physical memory: for instance, a program may have a logical address space of 2^20 bytes and be capable of running on a system with only 2^16 bytes of physical memory.

Three tables are used in demand paging:

1. Job Table.
2. Page Map Table - part or all of it is loaded into memory, depending on the size of the virtual address space of the process. For example, on the VAX architecture a process can have 2^31 = 2 Gb of virtual address space.
If each page is 2^9 = 512 bytes, then there need to be 2^22 entries in the page table per process; most systems therefore store page tables in virtual memory rather than real memory.

3. Memory Map Table.

Page Table

Three extra bits are required in each page table entry for demand paging:

- whether the page is present in memory (P bit)
- whether the page has been modified or not (M bit)
- whether the page has been referenced recently (R bit)

[Diagram: a virtual/logical address consists of a page number and an offset; a page table entry contains the P, M and R bits, other control bits, and the frame number.]

Page Fault

When a program tries to use a page which is not in memory, a page fault occurs. This is one of the reasons for a process to be suspended for I/O.

Algorithm:

1. Bind the virtual address to a physical address.
2. If the address is illegal, terminate the process.
3. Find the page table entry containing the address (look up the page table).
4. If the page is in memory:
   get the data and finish the instruction;
   advance to the next instruction;
   return to step 1;
   else generate a page interrupt and call the page interrupt handler.
5. The page interrupt handler takes the following action:
   find a free frame in memory (memory map table);
   if there is no free memory frame then:
   select a page to be swapped out using a page replacement algorithm;
   update the process's page table;
   if the contents of the page have been modified then write the page to disk;
   endif
6. Load the requested page into the freed frame.
7. Update the page table.
8. Update the memory map table.

When an executable file is first used to create a new process, only a portion of the program and data for that file may be loaded into real memory. Later, as page faults occur, new portions of the program and data are loaded. It is only at the time of first loading that virtual memory pages are created and assigned to locations on one of the devices (hard disks) used for swapping.

Locality of Reference

Well-structured programs obey the principle of locality of reference.
What this means is that the address references generated by a process tend to cluster within narrow ranges, which are likely to be contained in a few pages. The implication is that the needs of a process over a period of time can be satisfied by keeping a small number of pages resident (not necessarily adjacent), which leads to a relatively small number of page faults. The set of pages that a process is currently using is called its working set. If the entire working set is in memory, the process will run without incurring many page faults until it moves into another execution phase; at different stages of a process's execution there will be different working sets in memory. If the available memory is too small to hold the working set, the process will run slowly as it incurs many page faults. A situation can arise where a page swapped out of memory is needed again almost immediately; this can lead to a condition called thrashing, where the processor spends most of its time swapping pages rather than executing instructions.

In a multiprogramming system, processes are often moved to disk to let other processes have a turn on the CPU. When a process is again scheduled for running, the question arises as to which pages to bring into memory. Many systems try to keep track of the working set of each process, so that all the pages belonging to the set can be brought in in their entirety; this eliminates the numerous costly page faults that would otherwise occur. Because the working set varies slowly with time, it is possible to make a reasonable assumption as to which pages will be needed when the program is restarted, on the basis of its working set when it was suspended.

Multi-Level Page Tables

Motivation: the problem with page tables is that they can take up too much memory.
Consider a process on a system with a 32-bit virtual address space. It is laid out as follows in its virtual space: the program always starts at virtual address 0; the heap always starts on the first page boundary after the end of the program; the stack always starts at the 4 GB mark and works its way down.

[Diagram: virtual address space layout - an 8 MB program at address 0, a 4 MB heap above it, and a 2 MB stack just below the 4 GB mark.]

So how big is the gap between heap and stack in most programs? It is huge. How many 4 Kb pages are being used in the above process (8 MB program + 4 MB heap + 2 MB stack = 14 MB)? 14 MB / 4 Kb = (14 * 2^20) / (4 * 2^10) = 3.5 * 2^10 pages (approximately 4 * 2^10). What does that say about most of the 2^20 page table entries for this process? They will be empty. So do we need to keep the empty page table entries in memory? No; they will never be used, so why keep them in memory? Let's just keep the ones we need.

Multi-Level Page Table Implementation

A multi-level page table system (here, a 32-bit address with a 20-bit page number and 4 Kb pages) works as follows. The virtual address is now divided slightly differently: there are two page table indexes. The first index is used to index the outer page table. Each entry in the outer page table can point to a second-level page table; the present bit in the outer page table entry indicates whether a second-level page table exists for that entry. The second index is used to index the second-level page table pointed to by the outer page table. The entry in the second-level page table may contain the page frame number of the desired page: its present bit indicates the presence or absence of the page in physical memory. If the present bit is 1, the page frame number is contained in the entry; if it is 0, a page fault occurs and the virtual page must be brought into physical memory.

In a 32-bit system the virtual address might be broken up as a 10-bit first index, a 10-bit second index and a 12-bit offset.

What size are the pages in this system? They are 4 Kb, because the offset has 12 bits.
What is the size of each second-level page table? Each second-level page table has 1024 entries, because the second index is 10 bits. If each entry is 4 bytes, as we assumed previously, that makes each table 4 Kb. It is interesting that the second-level page tables are exactly the same size as a page - is it a coincidence, or was it done on purpose?

What is the size of the outer page table? The same size, for the same reasons.

How much memory is represented by each entry in a second-level page table? Each entry points to a single page, which is 4 Kb.

How much memory does each second-level page table represent? Each table points to up to 1024 pages of 4 Kb each, so 4 MB.

How many entries in the outer page table would be used in the example process above? The program is 8 MB, so it needs 2 entries; the heap is 4 MB, so that is another entry; and the stack is 2 MB, which needs a full entry. What about the second-level page table for the outer page table entry for the stack - is it full? No, it will be half empty.

So by using a two-level page table, for our example, we only need to keep 5 tables in memory: the outer page table and 4 inner page tables. While this example would have required 4 MB of page table space with a single-level page table, it requires only 20 Kb with a two-level page table.

What is the maximum amount of memory that the two-level page table could take up? 4 Kb more than the single-level page table: if every second-level page table is full, there will be 4 MB worth of second-level page tables plus one outer page table. So in that worst-case scenario the two-level page table doesn't help. Luckily, programs that large are very rare, so most of the time a two-level page table is very helpful. It is also possible to use three or more levels of page table.
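The two-level lookup described above can be sketched as follows. This is a toy illustration: small dictionaries stand in for the 1024-entry tables, and the present bit is modelled by whether a key exists.

```python
# Walk a two-level page table using the 10/10/12 address split:
# 10-bit outer index, 10-bit inner index, 12-bit offset.

def walk(vaddr, outer):
    index1 = vaddr >> 22             # top 10 bits: outer page table index
    index2 = (vaddr >> 12) & 0x3FF   # next 10 bits: second-level index
    offset = vaddr & 0xFFF           # low 12 bits: offset within the page
    inner = outer.get(index1)        # missing key ~ present bit of 0
    if inner is None:
        raise LookupError("no second-level table")
    frame = inner.get(index2)
    if frame is None:
        raise LookupError("page fault")
    return (frame << 12) | offset    # frame number + offset = physical address

outer = {0: {3: 7}}                  # virtual page 3 maps to frame 7
print(hex(walk(0x3ABC, outer)))      # -> 0x7abc
```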
The Other Problem With Page Tables

Paging solved the problem of not being able to run enough processes at a given time, or not being able to run really large processes. Multi-level page tables solved the problem of the page table taking up too much memory.

Let's consider the following instruction on a machine that doesn't use any virtual memory:

LOAD BX, 200

This instruction gets what is in memory address 200 and puts it into the BX register. How many memory references does it take to execute this instruction? Just one: it accesses location 200 to get the data and then puts that value into the register BX. Of course, the instruction had to be fetched from memory before it could be executed, so we should probably count that also. Therefore, it takes 2 references:

1. One to fetch the instruction.
2. One to fetch the data from location 200.

How many memory references would that same instruction take on a machine with a single-level page table (including fetching the instruction)? Four:

1. Look up the physical address of the instruction in the page table.
2. Fetch the instruction from that physical memory location.
3. Look up the physical address of the data in the page table (i.e. translate the virtual address 200 to a physical address).
4. Fetch the data from memory.

How many memory references would that same instruction take on a machine with a two-level page table? Six:

1. Look up the second-level page table for the instruction in the outer page table.
2. Look up the physical address of the instruction in the second-level page table.
3. Fetch the instruction from that physical memory location.
4. Look up the second-level page table for the data in the outer page table.
5. Look up the physical address of the data in the second-level page table (i.e. translate the virtual address 200 to a physical address).
6. Fetch the data from memory.

Memory accesses take time.
A single-level page table makes the machine twice as slow as it would be without virtual memory; a two-level page table makes it three times as slow. This is a big problem, and no one would use virtual memory if it couldn't be fixed.

Translation Look-Aside Buffers

[Diagram: use of a translation look-aside buffer.]

The Translation Look-Aside Buffer (TLB) is a small amount of associative memory (16, 32 or 64 words). This memory is very fast and very expensive, and resides on the CPU chip so that no main memory references are needed to access its contents. The left column of the associative memory is the key value: it holds the virtual page number. The right column is the value of the associated memory cell and holds the physical page frame that contains the virtual page. When the associative memory is presented with a key value, it looks at all of the cells in the memory simultaneously for that key; if the key is found, the value in the associated memory cell is output.

When a memory request is made, the page indices are sent simultaneously to the page translation system (which could be a single-level or multi-level paging system) and to the TLB. The paging system begins its lookup process; at the same time, the TLB searches for the page index in its associative memory. If the page index is found in the TLB, the physical page frame returned by the TLB is used and the lookup by the paging system is cancelled. If the page index is not found in the TLB, the paging system continues its lookup and finds the page frame in the same way as discussed previously; the index and its associated page frame are then put into the TLB after the paging system finishes the lookup.
This may involve removing one of the page indices currently in the TLB to make room for the new one. If a page index is removed from the TLB, it might be the one that has not been used for the longest period of time. (There are lots of different algorithms for picking the page index to remove; these are very similar to the page replacement algorithms discussed below.)

Does the TLB make a significant impact on the performance of the system? It might appear not. Consider a 4 MB process on a system with 4 Kb pages: that process will have about 1000 pages. If the TLB has 32 entries, then only 32 of those 1000 pages can be in the TLB, so one might think that only 32/1000 of page requests would be found there. That might be true if the process were making random memory requests; luckily, most processes do not behave that way.

Locality of Reference

Locality of reference basically says that most processes make a large number of references to a small number of pages, as opposed to a small number of references to a large number of pages. Locality of reference can be summarised by the 90/10 rule of thumb: a process spends 90% of its time in 10% of its code. (It is not clear that this also holds for a program's data, but we will ignore that for now.)

What does that do for us as far as the TLB goes? Given that about 90% of references will be to 10% of the pages, we would do very well if the TLB could hold those 10% of the pages that are used most often. 10% of 1000 is 100, so we can't quite hold all 100 page translations in the TLB; however, holding the 32 most often used pages will ensure that most page requests can be handled by the TLB. Most machines that have virtual memory use some combination of multi-level paging with a TLB: Intel Pentium, Macintosh, Sun Sparc, DEC Alpha. A flowchart showing the use of the TLB is given above.
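The TLB-in-front-of-a-page-table behaviour described above can be sketched as follows. This is a toy model: a real TLB searches all its entries in parallel in hardware, a dict lookup stands in for that here, and the eviction policy (drop the oldest entry) is an assumption for the example.

```python
# Sketch of a TLB check in front of a page table lookup.

TLB_SIZE = 32

def lookup(page, tlb, page_table):
    """Return (frame, hit): the frame number and whether the TLB hit."""
    if page in tlb:                    # TLB hit: no page table access needed
        return tlb[page], True
    frame = page_table[page]           # TLB miss: fall back to the paging system
    if len(tlb) >= TLB_SIZE:           # make room by evicting the oldest entry
        tlb.pop(next(iter(tlb)))
    tlb[page] = frame                  # install the new translation
    return frame, False

tlb, page_table = {}, {5: 23}
print(lookup(5, tlb, page_table))   # -> (23, False)  first access misses
print(lookup(5, tlb, page_table))   # -> (23, True)   now it hits
```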
Note that, by the principle of locality, it is hoped that most page references will involve page table entries in the cache.

Page Size

The page size of a system has a significant impact on overall performance.

Internal fragmentation: the smaller the page size, the less internal fragmentation, and less internal fragmentation means better use of main memory. However, a smaller page size also means more pages per process, and therefore larger page tables. In multiprogramming environments this means page tables may themselves be in virtual memory, which can cause a double page fault: one to bring in the required portion of the page table and one to bring in the required process page.

The graphs below show the effect on page faults of two variables: the page size, and the number of frames allocated to a process. The left graph shows that as page size increases, the number of page faults initially increases, because the principle of locality of reference is weakened; eventually, as the page size approaches the size of the process, the faults begin to decrease. The right graph shows that, for a fixed page size, faults decrease as the number of frames in memory grows. Thus a software policy (the amount of memory to allocate to each process) affects a hardware design decision (page size).

Of course, the actual physical size of memory is important: more memory should reduce page faults. However, as main memory grows, the address space used by applications also grows, reducing performance; and modern programming techniques such as object-oriented programming (which encourages the use of many small program and data modules, with references scattered over a large number of objects) reduce the locality of reference within a process.

A small page size reduces internal fragmentation. A large page size reduces the number of pages needed, thereby reducing the size of the page table (the page table takes up less memory).
A large page size reduces the overhead of swapping pages in or out: in addition to the processing time required to handle a page fault, transferring two 1K blocks of data from disk takes almost twice as long as transferring one 2K block.

A smaller page size, with its finer resolution, is better able to target the process's locality of reference. This reduces the amount of unused information copied back and forth between memory and swapping storage. It also reduces the amount of unused information stored in main memory, making more memory available for useful purposes.

Page Replacement

When a page fault occurs and all frames are occupied, a decision must be made as to which page to swap out. The page replacement algorithm is used to choose that page. Note that if there is no free frame, a busy one must be swapped out and the new one swapped in. This means a double transfer, which takes time. However, if we can find a busy page that has not been modified since it was loaded, then it does not need to be written back out to disk. The modified bit in the page table entry indicates whether a page has been modified or not. Thus the I/O overhead is cut by half if the page is unmodified.

Page Replacement Algorithms

Optimal Replacement

The best policy is one which swaps out the page that will not be used for the longest period of time. This gives the lowest possible page fault rate for a fixed number of frames. Consider the following string of 20 page references to be processed:

7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1

If we assume an address space of three page frames, the pattern of page faults is shown below.

[Fig: optimal page replacement for this reference string with three page frames]

Unfortunately the optimal scheme is impossible to implement, as we would need to know in advance what pages will be demanded. However, it can be used as a standard by which to judge other schemes.
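Because we have the whole reference string in advance here, the optimal policy can be simulated by looking ahead. A minimal sketch, evicting the resident page whose next use is furthest in the future (or that is never used again):

```python
def optimal_faults(refs, frames=3):
    """Belady's optimal replacement: on a fault, evict the resident page
    whose next reference is furthest away, or one never referenced again."""
    resident, faults = set(), 0
    for i, page in enumerate(refs):
        if page in resident:
            continue
        faults += 1
        if len(resident) >= frames:
            future = refs[i + 1:]
            # a page never referenced again gets distance infinity
            victim = max(resident,
                         key=lambda p: future.index(p) if p in future
                                       else float('inf'))
            resident.remove(victim)
        resident.add(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(optimal_faults(refs))   # 9 faults for this string with three frames
```

Nine faults is the floor: no implementable policy can do better on this string with three frames, which is why the optimal scheme serves as the benchmark.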
First In First Out

FIFO replaces the oldest page first. We do not need to record how long each page has been in memory if the pages are placed in a FIFO queue. This is easy to implement, but it suffers from being too simple and does not always give good results.

Least Recently Used

An alternative scheme is to replace the Least Recently Used (LRU) page: the page that has not been used for the longest period of time. To see how this works we use the same string 7, 0, 1, 2, 0, 3, 0, 4 as before. Assume that page 4 is requested. At this stage, pages 0, 2 and 3 are in memory. Scanning backwards, we see that of the three pages in memory, page 2 is the least recently used. The frame holding this page is the one that will be sacrificed to make way for page 4.

By the principle of locality, the LRU policy should and does work well. However, it is difficult to implement. Each page would need a tag with the time it was last referenced, and even if the hardware supported such a scheme the overheads would be tremendous.

Not Recently Used (NRU) is an approximation of LRU. With this policy each page table entry contains a modified bit and a reference bit, as shown above. When a page is loaded, the operating system sets the reference bit to 1. Periodically the operating system resets the reference bit to 0; whenever the page is accessed, the bit is set back to 1. When selecting a page for replacement the system uses two criteria: the first priority is to select a page with a reference bit of 0, which indicates that it has not been accessed within a period of time; the second priority is to select a page whose modified bit is 0, indicating that it has not been modified while in memory.

Another approximation of LRU is called aging. Aging is achieved by adding an extra byte to a page table entry. When a page is accessed, the hardware sets the most significant bit in the reference byte.
Periodically, as with NRU above, the contents of this byte are adjusted by shifting all the bits right. A page with the lowest value in its reference byte is selected for removal. As with NRU, the operating system may pick any page from among all the pages with the lowest value.

Over the years designers have tried to implement algorithms that approximate the performance of LRU but without the overheads. Many of these are referred to as the clock policy. The simplest form of clock policy requires the association of an additional bit with each frame, referred to as the use bit. When a page is first loaded into a frame in memory, the use bit for that frame is set to 1. When the page is subsequently referenced (after the reference that generated the page fault), its use bit is set to 1 again.

For the page replacement algorithm, the set of frames that are candidates for replacement is considered to be a circular buffer, with which a pointer is associated. When a page is replaced, the pointer is set to indicate the next frame in the buffer. When it comes time to replace a page, the operating system scans the buffer to find a frame with a use bit set to zero. Each time it encounters a frame with a use bit of 1, it resets that bit to zero. If any of the frames in the buffer has a use bit of zero at the beginning of this process, the first such frame encountered is chosen for replacement. If all of the frames have a use bit of 1, then the pointer makes one complete cycle through the buffer, setting all the use bits to zero, and stops at its original position, replacing the page in that frame.

We can see that this policy is similar to FIFO, except that, in the clock policy, any frame with a use bit of 1 is passed over by the algorithm. The policy is referred to as a clock policy because we can visualize the page frames as laid out in a circle. A number of operating systems have employed some variation of this simple clock policy (for example, Multics).
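The three implementable policies just described can be compared on the same 20-page reference string used earlier, with three frames. This is a minimal simulation sketch; the fault counts it produces place clock between LRU and FIFO, consistent with clock being an approximation of LRU:

```python
from collections import OrderedDict, deque

REFS = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]

def fifo_faults(refs, frames=3):
    """FIFO: evict the page that has been resident longest."""
    queue, resident, faults = deque(), set(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) >= frames:
            resident.remove(queue.popleft())
        queue.append(page)
        resident.add(page)
    return faults

def lru_faults(refs, frames=3):
    """LRU: evict the page unused for the longest time."""
    recency, faults = OrderedDict(), 0   # order: least to most recently used
    for page in refs:
        if page in recency:
            recency.move_to_end(page)
            continue
        faults += 1
        if len(recency) >= frames:
            recency.popitem(last=False)
        recency[page] = None
    return faults

def clock_faults(refs, frames=3):
    """Clock: circular sweep that passes over, and clears, use bits of 1."""
    page_in, use, hand, faults = [None] * frames, [0] * frames, 0, 0
    for page in refs:
        if page in page_in:
            use[page_in.index(page)] = 1    # page referenced: set its use bit
            continue
        faults += 1
        while use[hand]:                    # frame recently used:
            use[hand] = 0                   # reset bit and advance pointer
            hand = (hand + 1) % frames
        page_in[hand], use[hand] = page, 1  # replace the page in this frame
        hand = (hand + 1) % frames
    return faults

print(fifo_faults(REFS), lru_faults(REFS), clock_faults(REFS))  # 15 12 14
```

On this string FIFO incurs 15 faults, LRU 12 and clock 14, against the optimal policy's 9: the use bit buys clock most of LRU's benefit at a fraction of its cost.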
The figure below provides an example of the simple clock policy mechanism. A circular buffer of n main memory frames is available for page replacement. Just prior to the replacement of a page from the buffer with incoming page 727, the next-frame pointer points at frame 2, which contains page 45. The clock policy is now executed. Because the use bit for page 45 in frame 2 is equal to 1, this page is not replaced. Instead, the use bit is set to zero and the pointer advances. Similarly, page 191 in frame 3 is not replaced; its use bit is set to zero and the pointer advances. In the next frame, frame 4, the use bit is set to 0. Therefore, page 556 is replaced with page 727. The use bit is set to 1 for this frame and the pointer advances to frame 5, completing the page replacement procedure.

Virtual Segmented Systems

In the simple segmented systems previously considered, all the segments of an active process were resident in real memory. In a virtual segmented scheme, the segments of a process are loaded independently and hence may be stored in any available memory positions, or may not be in memory at all.

Virtual segmented systems have certain advantages:
It facilitates the handling of growing data structures. With segmented virtual memory, a data structure can be assigned its own segment and the operating system will expand and shrink the segment as needed.
It lends itself to protection and sharing.
The logical structure of the process is reflected in the physical structure, which reinforces the principle of locality.

The segmented address translation process shown below is similar to the simple segmentation scheme, except that, firstly, the segment table entry, usually called a segment descriptor, must now have a number of additional bits (see below) and, secondly, the logical address is now a virtual address, typically much larger than the real address.
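The checks a segment descriptor supports can be sketched as follows. The field names and the errors raised are illustrative only, not any particular hardware's:

```python
from dataclasses import dataclass

@dataclass
class SegmentDescriptor:
    base: int        # base address of the segment in real memory
    limit: int       # size of segment, used to detect addressing errors
    present: bool    # segment-in-memory bit
    used: bool       # segment-used bit, for monitoring usage
    writable: bool   # one of the protection bits

def translate(seg_table, s, d, write=False):
    """Translate virtual address (s, d) to a real address, signalling the
    faults and errors the hardware would raise. A sketch only."""
    desc = seg_table[s]
    if not desc.present:
        raise RuntimeError("segment fault: bring segment into real memory")
    if d >= desc.limit:
        raise IndexError("addressing error: displacement exceeds segment limit")
    if write and not desc.writable:
        raise PermissionError("protection violation: segment is read-only")
    desc.used = True                  # record the access for usage monitoring
    return desc.base + d

table = [SegmentDescriptor(base=0x4000, limit=0x1000, present=True,
                           used=False, writable=False)]
print(hex(translate(table, 0, 0x20)))   # 0x4020
```

A displacement beyond the limit, a write to a read-only segment, or a reference to an absent segment would each be trapped before any real memory access takes place.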
The diagram also shows the presence of a segment table register which points to the start of the segment table for the current process. The segmentation virtual address consists of two components, (s, d). The s value indexes the segment table for the appropriate descriptor, while the d value gives the displacement within the segment.

The typical contents of a segment descriptor are listed below:
base address of segment
segment limit; i.e. the size of the segment, used to detect addressing errors
segment-in-memory bit; indicates whether the segment is currently in real memory
segment-used bit; indicates whether the segment has been accessed since the previous reset of this bit, used to monitor usage of the segment
protection bits; specify read, write and execute access to the segment

Combined Paging and Segmentation

The respective merits of paging and segmentation can be combined into a paged-segmented system. Segmentation is visible to the programmer and meets many of the programmer's requirements. Paging provides transparent memory management which eliminates wasted space and facilitates complex optimisation techniques. A combined paged-segmented system appears to the programmer to be a simple segmented system; however, the segments defined are subdivided at the physical level into a number of fixed-length pages. Pages do not bridge segments; a segment smaller than a page will use a whole page, and typically one page within a segment will be partially unfilled. The diagram below shows how a process could be divided into paged segments.

Address translation in a paged-segmented system combines the translations of the individual paging and segmentation systems. A memory reference is specified by a three-element value (s, p, d) where:
s is the segment number
p is the page within this segment
d is the displacement within this page
The figure below shows the complete translation process.
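The (s, p, d) translation can be sketched in a few lines, assuming a hypothetical 1 KB page size and hand-built tables:

```python
PAGE_SIZE = 1024   # hypothetical 1 KB pages

# seg_tables[s] is the page table for segment s; each entry maps a page
# number within that segment to the physical frame holding it.
seg_tables = {
    0: [5, 9],   # segment 0 occupies two pages, in frames 5 and 9
    1: [2],      # segment 1 fits in a single page, in frame 2
}

def translate(s, p, d):
    """Translate (segment, page, displacement) to a physical address."""
    page_table = seg_tables[s]    # segment descriptor -> its own page table
    frame = page_table[p]         # virtual page number -> physical frame
    return frame * PAGE_SIZE + d  # frame base plus displacement

print(translate(0, 1, 100))   # frame 9 -> 9 * 1024 + 100 = 9316
```

Note that each segment really is its own address space here: segment 0 and segment 1 each have an independent page table, and only the descriptor lookup ties them together.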
The segment number, s, is used as an index to a segment table, as before. The segment descriptor, among other things, contains a pointer to a page table for that segment; note that each segment is treated as an independent address space, each with its own page table. The page number, p, is used to index this page table, which translates the virtual page number p to a physical page number. The displacement d is then used to locate the exact address within this physical page frame.

A paged-segmented system is clearly quite complex, but provides a powerful environment for modern computers, giving the programmer control over process structure while efficiently managing memory space. Among current systems using such a scheme are Microsoft Windows, OS/2 and IBM MVS/ESA.

Segmentation lends itself to the implementation of protection and sharing policies. To achieve sharing, it is possible for a segment to be referenced in the segment tables of more than one process. A similar mechanism is available in a paging system; however, in that case the page structure of programs and data is not visible to the programmer, making protection and sharing more difficult to specify.

Virtual Memory Advantages

A job's size is no longer restricted to the size of main memory (or the free space within main memory).
Memory is used more efficiently, because the only sections of a job stored in memory are those needed immediately, while those not needed remain in secondary storage.
It allows an effectively unlimited degree of multiprogramming.
It eliminates external fragmentation, and combined segmentation and paging minimises internal fragmentation.
It allows sharing of code and data.

These far outweigh the disadvantages:
Increased hardware costs.
Increased overheads for handling paging interrupts.
Increased software complexity to prevent thrashing.