VIRTUAL MEMORY CS6410 9/22/2011 DAN DENG Virtual Memory • Separates • Logical memory as perceived by users • Physical memory in the system • Provides for • Larger programs • Process isolation • Memory sharing Comparing Asbestos and Mach VM • Similarities • Underlying IPC mechanism • Mach VM • Simplify support for diverse architectures • Efficiently support memory sharing • Asbestos • Applications specify protection policies • Kernel enforces policies to isolate private data Machine-Independent Virtual Memory Management for Paged Uniprocessor and Multiprocessor Architectures Richard Rashid - Chief Executive Officer, Microsoft Research (joined Microsoft in 1991) Lead developer of Mach Avadis Tevanian – Chief Software Technology Officer (2003-2006) and Senior VP of Software Engineering (1997-2003) at Apple Inc., Witness for the United States DoJ testifying against Microsoft in United States v. Microsoft, William Bolosky (CMU) – Microsoft Research David Black (CMU) Michael Young (CMU) David Golub (CMU) Robert Baron (CMU) Jonathan Chew (CMU) Mach Takeaways • Hardware-independent virtual memory (VM) is possible • Hardware-dependent structures can be kept small • Mach can be easily adapted to a variety of architectures • Flaws • Incomplete Evaluation – microbenchmarks • Use of IPC for all tasks turned out costly • Memory remapping between programs became more expensive Context • Successor to Accent • Accent pioneered many novel OS concepts • Accent limited by inability to execute UNIX applications • Accent had strong ties to a single hardware architecture • Explosion of hardware architectures and memory structures • Many of them were multiprocessor systems • Solution • Mach – cleanly defined, UNIX-based, highly portable OS Design Principles • Support for diverse architectures • Simplified kernel structure • Distributed operation • Integrated memory management and IPC • Heterogeneous system support Approach • Simple, extensible kernel • As little as possible inside the kernel • What is in the kernel must be powerful • Concentrate on communication facilities • IPC handles all kernel requests and data movement among tasks • System-wide protection by protecting the IPC mechanism • Key to efficiency • Integration of memory management and IPC Abstractions • Task • Unit of resource allocation • Thread • Unit of execution • Port • Communication channel for IPC • Message • Collection of objects used in communication • Memory Object • Collection of data provided and managed by a task Improving Portability • Perform software memory management • Manage all important virtual memory information in software • Communicate with hardware using a well-defined interface • Machine dependent code • Implement the bare minimum to allow the kernel to run • Export a clean interface to the software Implementation • Split hardware-independent and hardware-dependent memory management • Hardware-independent: • Resident Page Table • Address Map • Memory Object • Hardware-dependent: • Pmap • Hardware-independent parts are the driving force Hardware-Independent • Resident Page table • Indexed by physical page number • Entries are linked into • Memory object list • Memory allocation queue • Object/offset hash bucket • Address map • Sorted, doubly-linked list of entries • Maps a range of virtual addresses to area of a memory object • Memory Object • Resembles a UNIX file • Managed by a pager Hardware-Dependent • Pmaps – management of physical address maps • VAX – pmap corresponds to VAX page table • IBM RT PC – pmap corresponds to allocated segment registers • Pmaps are small • Code is confined to a single module • Interface with Mach is relatively small • Modifying pmap requires little knowledge of Mach • Pmap does not have to keep track of all valid mappings VM-IPC Integration • Memory management based on use of memory objects, which are represented by ports • IPC in memory management • Memory objects may reside on remote systems • Kernel caches the contents of memory objects in local memory • Memory-management use in IPC • Mach passes messages by moving pointers to shared memory objects • Mach uses VM remapping to transfer large messages Adapting Mach • Code for VM originally ran on VAX machines • IBM RT PC • Approx. 3 months for pmap module • Sequent Balance • 5 weeks – bootable system • Sun 3, Encore MultiMAX Performance Table 7.1 The cost of various measures of virtual memory performance for Mach, ACIS 4.2a, SunOS 3.2, and 4.3bsd UNIX Table 7.2 Cost of compiling the entire Mach kernel and a set of 13C programs on a VAX 8650 with 36 MB of memory under both Mach and 4.3bsd UNIX Perspective • Achieved Goals • Sophisticated, hardware-independent VM system possible • Good micro-benchmark performance • Ongoing research • Later evaluation demonstrated poor performance Labels and Event Processes in the Asbestos Operating System Petros Efstathopoulos – Researcher at Symantec, works on OS kernel Projects, PhD from UCLA in 2008 Maxwell Krohn– Founder and Chief Scientist of OkCupid.com (built on OKWS), Creator SFSLite and OK Web Server, PhD from MIT in 2008 Steve VanDeBogart – PhD from UCLA in 2009 Cliff Frey – CS Meng from MIT in 2005, Meraki David Ziegler – CS Meng from MIT in 2005 Labels and Event Processes in the Asbestos Operating System Eddie Kohler – CS Professor at Harvard. Creator of Click Modular Router. David Mazieres – CS Professor at Standard University, Secure Systems Group. Creator of SFS and libasync. Francs Kaashoek – CS Professor at MIT, member of CSAIL, Parallel and Distributed Operating Systems group. Creator of Chord. Robert Morris – CS Professor at MIT, member of CSAIL, Parallel and Distributed Operating Systems group. Creator of Chord. Asbestos Takeaways • Asbestos operating system • Applications use labels to define their own isolation policies • Kernel enforces isolation policies • Isolate processes by tracking and limiting the flow of information • OK web server on Asbestos • Performs comparably to Apache • Provides better security properties than Apache • Lessons/Flaws • Label checking scales linearly with number of compartments • Kernel must be invoked to update send labels Context • Problem • Web servers and networked systems are insecure • Lack of good primitives for security • Monolithic code with many privileges • Exploits: XSS, SQL Injection, Buffer Overrun • Lack of isolation of user data • Breaches divulge private information on a massive scale • TJX – 45 million stolen CC numbers in 2007 • Sony PSN – $171 million in stolen accounts Example The Problem Problem The POST Kernel Kernel Write() /bin/sh Alice Apache Bob • • By exploiting a If Bob compromises software flaw, Bob can If Bob compromises access Alice’s data the system, system, he he can can the access Alice's Alice's data data access Kernel /submit_order.cgi /submit_order.cgi /alice_private_data.dat Alice 123 Main Alice 123 Main St. 123 MainAlice St. 4275-8204-4009-7915 St. 4275-8204-4009-7915 4275-8204Bob Bob 4009-7915 456 Elm St. 456 Elm St. 5829-7640-4607-1273 5829-7640-4607-1273 Goal • Bob can’t access Alice's data without Alice's permission • Alice’s and Bob’s data are isolated • Complications • Bugs in the applications • Alice's data may travel through several processes • There may be hundreds of thousands of users • To isolate, must prevent inappropriate data flow ow Control Kernel The Problem The Problem Information Flow Control ormation Flow Control Kernel Kernel Kernel /submit_order.cgi Kernel Apache Kernel Alice /submit_order.cgi 123 Main St. 4275-8204-4009-7915 /submit_order.cgi /submit_order.cgi Bob abel data with its Alice 456 Elm St./submit_order.cgi wner (contaminate 123 Main St. /bob_private_data.dat /alice_private_data.dat Alice 5829-7640-4607-1273 4275-8204-4009-7915 ithKeep respect to its Alice • If Bob compromises Label track data w.r.t. its Base access of control who 123 Main St. • If Bob compromises wner) 123 MainAlice St. Alice owner (contaminate Bob Bob 4275-8204-4009-7915 the decisions connection on labels is for the system, system, he can can 4275-8204-4009-7915 Main St. 456 Elm St. he 456 w.r.t. the to its owner) Elm St. 123 5829123 Main St. 4275Bob 4275-8204-4009-7915 5829-7640-4607-1273 Bob access Alice's data 7640-4607-1273 8204-4009-7915 456 Elm St. access Alice's data Bob 456 Elm St. 5829-7640-4607-1273 456 Elm St. 5829-7640-4607-1273 5829-7640-4607-1273 Asbestos Goals • Large-scale • Changing population of 100,000s of users • Efficient • Unprivileged • Minimum privilege required to do its job • Isolated • One user can’t gain inappropriate access to other users’ data • Isolation policy • Application-defined, OS-enforced Compared to MAC • Mandatory access control systems • Inflexible, administrator-established policies • Fixed number of levels and compartments • Central authority, no privilege delegation • Asbestos • Applications can define flexible policies • Almost unlimited* number of compartments • Any application can dynamically create compartments Asbestos Overview • Processes communicate using ports • Asbestos labels • Each process has a send label and a receive label • Labels determine where messages can be sent • Contamination of receiver updated • Event processes • Efficiently support/isolate many concurrent users Asbestos Labels • Applications have send and receive labels Information Flow Control • Labels consist of strings of handles and levels • Handle Kernel Information Flow Control • 61-bit # identifying a compartment • Applications can Kernelcreate compartments Label • data with its owner (contaminate with respect to its owner) Label data with its Creating a compartment returns its handle /submit_order.cgi • Level Alice 123 Main St.= level (*, 0-3) /submit_order.cgi owner (contaminate • Contamination & Privilege 4275-8204-4009-7915 with respect to its owner) • Example send label Bob{ Alice – } = {A 3, B 3, 1} 123 Main 456St.Elm St. 4275-8204-4009-7915 5829-7640-4607-1273 • Label size linear Bob in # compartments 456 Elm St. 5829-7640-4607-1273 Contamination Levels • Asbestos’s four levels • 2 levels for send(1)/receive(2) defaults, 1 above & below defaults • User messages using PS of 3 • System defaults (2) denies user-tainted messages • Raising PR requires special privilege • User messages using PS of 2 • System defaults (2) allows communication • must explicitly lower the PR to deny user messages Implementing Clearance Checks Policy enforcement • How does the clearance check work? •• Kernel performs Labels form aclearance lattice checks on message send • Labels form a lattice • Partial ordering • Partial ordering – Sender's send label must be less than or equal to the • Sender's send label ⊑ destination's receive label destination's receive • L1 ⊑ L2 iff L1(h) ≤ L2(h) label for all h Send label updated a upper least bound upper operator bound •• Send label updated with with a least operator v v v v The Problem Basic Label Return example mation Flowto Control Control Example Kernel mation Flow Control Flow Control Information Kernel ow Control Information Flow Control Kernel ow Control Information FlowBasic ControlLabel Information Flow Control cgicgi script Kernel Kernel Bob’s ahttpd script Bob's Backend ahttpd• If Bob compromises DB a with its user The kernel contaminates /submit_order.cgi ntaminate The kernelthe validates system, the message with thethat the he can/submit_order.cgi ubmit_order.cgi ect to itsKernel Kernel destination has clearance to sender’s contamination At delivery, the destination kernel access Alice's data Label data with itsAlice receive the contamination of Kernel Alice User a with its Label data with its 123 Main St. takes on/submit_order.cgi the contamination /submit_order.cgi owner (contaminate 123 Main St. 4275-8204-4009-7915 /submit_order.cgi the message /submit_order.cgi Alice ntaminate owner (contaminate with respect to its 4275-8204-4009-7915 of the message • If Bob compromises 123 Main ect toKernel itsSendAlice with respect to its owner) cgi script Bob Alice Bob Alice's Bob's Alice's Backe 4275-8204 Label 123 Main St.owner) 123 Main Alice St. 456 Elm St. Alice 456 Elm St. Label data with its the5829-7640-4607-1273 system,/submit_order.cgi he can 4275-8204-4009-7915 4275-8204-4009-7915 ahttpd ahttpd ahttpd DB 123 Main St. 123 Main St. 5829-7640-4607-1273 /submit_order.cgi owner (contaminate Bob Send Bob 4275-8204-4009-7915 Bob with respect to its4275-8204-4009-7915 Label data withAlice's its access data Receive 456 Elm S 456 Elm St. 456 Elm St. Label Alice /submit_order.cgi owner) owner (contaminate Bob Bob Alice 5829-7640-4607-1273 5829-7640-4607-1273 Label123 5829-7640 Main St. 123 Main456 St. Elm St. with respect 456 Elmto St.its Kernel Alice’sKernel ahttpd Alice's ahttpd 4275-8204-4009-7915 User Recv Label Bob 456 Elm St. 5829-7640-4607-1273 Kernel 5829-7640-4607-1273 owner) Example 4275-8204-4009-7915 5829-7640-4607-1273 Alice 123 Main St. Bob 456 Elm St. 4275-8204-4009-7915 5829-7640-4607-1273 Bob User Kernel The Problem Return example mation Flowto Control Kernel mation Flow Control Information Flow Control ow Control Information Flow Control Kernel Kernel Kernel Kernel cgi script Alice’sKernel ahttpd Flow Bob’s ahttpd ow Control Information Control • Flow If BobControl compromises a with its Information user The kernel kernel validates contaminates The that the /submit_order.cgi nformation Flow Control ntaminate the system, he can the message with the destination Kernel has clearance to sender’s contamination contamination of kernel access Alice's Label data with itsAlice receive the Kernel a with its Label data with its 123 Main St. /submit_order.cgi owner (contaminate /submit_order.cgi Kernelthe message 4275-8204-4009-7915 /submit_order.cgi /submit_order.cgi ntaminate owner (contaminate with respect to its • If Bob compromises ect to itsSendAlice with respect to its owner) Bob Alice Label123 Main Label owner) St. 123 Main Alice St. AliceElm St. data with its456 ect to itsKernel /submit_order.cgi data Alice's ahttpd Alice Basic Lab the system, he can 5829-7640-4607-1273 123 Main St. /submit_order.cgi owner (contaminate /submit_order.cgi 123 Main St. Bob 4275-8204-4009-7915 User with respect to its4275-8204-4009-7915 Label data withAlice's its access data Receive 456 Elm S Label data with its owner) /submit_order.cgi Exampl owner (contaminate 4275-8204-4009-7915 Bob 456 Elm St. Alice 5829-7640-4607-1273 owner (contaminate 123 Main St. with respect 4275-8204-4009-7915 to its owner) 4275-8204-4009-7915 Bob 456 Elm St. Bob Bob Alice 5829-7640-4607-1273 /submit_order.cgi 123 Main St. Elm St. with respect 456 Elmto St.its 456 4275-8204-4009-7915 5829-7640-4607-1273 owner) 5829-7640-4607-1273 Alice Alice 123 Main St. Bob Bob 123 Main St. 456 Elm St. 456 Elm St. 4275-8204-4009-7915 4275-8204-4009-7915 5829-7640-4607-1273 5829-7640-4607-1273 Bob Label 123 Main 4275-8204 Kernel Send 5829-7640 Efficient Isolation • Threads don’t provide enough isolation • One process per user • Processes consume too much memory The code for a typical event process-based server resembles the • Solution is an event-process event-driven dispatch loop above: abstraction 1. 2. 3. 4. 5. 6. 7. ep checkpoint(&msg); if (!state.initialized) { initialize state(state); state.reply = new port(); } process msg(msg, state); ep yield(); Intuitively, many event process entities now share the event loop, each with its own isolated state. The two ep system calls manage control flow transfer between events. The single “state” variable refers to a different user in each event process. When the base process first calls the ep checkpoint system call, though this ca cumulating ta An event p restrict its lab a specific use in Section 5.2 process’s sen that just the e The base p does it know o the base proc way to chang aware of each based commu represented b Event Process Abstraction • Fork memory state for each new session • No event process will run until message is available • Kernel scheduling cost is little higher than a single process • Small differences anticipated, stored efficiently • Event loop allows shared execution state • Light weight context switches • Event process isolates state • Each event process is only contaminated by one user • One process per service with one event process per user Evaluation • Asbestos web server • Based on the OKWS • Enforces user isolation within workers via event processes • Apache web server • Version 1.3.33 • Application • Response is a string of characters whose length depends on client’s parameters • OKWS had comparable throughput and latency to Apache Label Costs • Size of labels in the system increases with # sessions • IPC operations take more time as # sessions increase • Linear performance degradation as labels increase in size Perspective • Asbestos labels make MAC (mandatory access control) tractable • Labels provide decentralized policy specification • Kernel enforces information flow control • Event processes efficiently supports many users • The OK web server on Asbestos • Performs comparably to Apache • Provides better security properties than Apache Backup slides Delegating to user processes • Using IPC, a task can • Allocate a region of VM • De-allocate a region of VM • Set the protection status of a region of VM • Specify the inheritance of a region of VM • Create and manage a memory object Sharing Memory • Copy-on-write for sharing memory objects • Shadow objects • Rely on original object for unmodified data • Shadow objects may also be shadowed • Read/write sharing must use the sharing map • Sharing map is identical in organization to address map Multiprocessor Issues • TLBs Consistency • Solutions • Force interrupts to all CPUs • Wait until timer interrupt • Temporarily allow inconsistency