os-intro.ppt

Classical Operating Systems: A Quick Tour
Jeff Chase
The Birth of a Program
myprogram.c
int j;
char* s = "hello\n";
myprogram.o
assembler
object
file
data
int p() {
j = write(1, s, 6);
return(j);
}
data
data
data
…..
compiler
p:
store this
store that
push
jsr _write
ret
etc.
myprogram.s
libraries
and other
objects
linker
data
program
myprogram
(executable file)
What’s in an Object File or Executable?
Header “magic number”
indicates type of image.
Section table an array
of (offset, len, startVA)
program sections
Used by linker; may be
removed after final link
step and strip.
header
text
program instructions
p
data
idata
immutable data (constants)
“hello\n”
wdata
writable global/static data
j, s
symbol
table
relocation
records
j, s ,p,sbuf
int j = 327;
char* s = "hello\n";
char sbuf[512];
int p() {
int k = 0;
j = write(1, s, 6);
return(j);
}
Memory and the CPU
0
OS code
CPU
OS data
Program A
data
Data
R0
x
Rn
PC
Program B
Data
x
registers
code library
2^n
main memory
Threads
A thread is a schedulable stream of control.
defined by CPU register values (PC, SP)
suspend: save register values in memory
resume: restore registers from memory
Multiple threads can execute independently:
They can run in parallel on multiple CPUs...
- physical concurrency
…or arbitrarily interleaved on a single CPU.
- logical concurrency
Each thread must have its own stack.
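A minimal user-level sketch of suspend and resume, assuming a system whose C library still provides the <ucontext.h> calls (getcontext, makecontext, swapcontext). The names worker, run_two_contexts, and trace are made up for this illustration. Each swapcontext call saves the current register values in memory and loads another saved set, and the worker runs on its own separately allocated stack.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;   /* saved register sets, held in memory */
static char worker_stack[64 * 1024];      /* the worker's private stack */
static int trace[4];                      /* records the interleaving order */
static int trace_len = 0;

static void worker(void) {
    trace[trace_len++] = 1;
    swapcontext(&worker_ctx, &main_ctx);  /* suspend worker, resume main */
    trace[trace_len++] = 3;
    /* returning from worker resumes uc_link, i.e. main_ctx */
}

int run_two_contexts(void) {
    trace_len = 0;
    if (getcontext(&worker_ctx) != 0)
        return -1;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);
    assert(worker_ctx.uc_stack.ss_sp == worker_stack);

    swapcontext(&main_ctx, &worker_ctx);  /* run worker until it yields */
    trace[trace_len++] = 2;
    swapcontext(&main_ctx, &worker_ctx);  /* resume worker where it stopped */
    trace[trace_len++] = 4;
    return trace_len;
}
```

After run_two_contexts() returns, trace holds 1, 2, 3, 4: the two streams of control were interleaved on one CPU (logical concurrency), with no kernel thread switch involved.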
A Peek Inside a Running Program
0
CPU
common runtime
x
your program
code library
your data
R0
heap
Rn
PC
SP
x
y
registers
y
stack
high
“memory”
address space
(virtual or physical)
Two Threads Sharing a CPU
concept
reality
context
switch
A Program With Two Threads
“on deck” and
ready to run
address space
0
common runtime
x
program
code library
running
thread
CPU
data
R0
Rn
PC
SP
y
x
y
stack
registers
stack
high
“memory”
Thread Context Switch
switch
out
switch
in
address space
0
common runtime
x
program
code library
data
R0
CPU
1. save registers
Rn
PC
SP
y
x
y
stack
registers
2. load registers
stack
high
“memory”
Threads vs. Processes
1. The process is a kernel abstraction for an
independent executing program.
data
includes at least one “thread of control”
also includes a private address space (VAS)
2. Threads may share a process/address space
Every thread must exist within some process VAS.
Processes may be “multithreaded”.
Threads can access memory only as enabled in their VAS.
data
Memory Protection
Paged virtual memory provides protection as follows:
• Each process (user or OS) has its own virtual memory
space.
• The OS maintains the page tables for all processes.
• A reference outside the process's allocated space causes an
exception that lets the OS decide what to do.
• Memory sharing between processes is done by mapping different
virtual spaces to common physical frames.
[Kedem, CPS 104, Fall05]
The Program and the Process VAS
Process text segment
is initialized directly
from program text
section.
sections
Process data
segment(s) are
initialized from idata
and wdata sections.
Text and idata segments
may be write-protected.
BSS
“Block Started by Symbol”
(uninitialized global data)
e.g., heap and sbuf go here.
header
text
text
data
idata
data
wdata
symbol
table
relocation
records
program
BSS
user stack
args/env
kernel
process VAS
Process stack and BSS
(e.g., heap) segment(s) are
zero-filled.
segments
Process BSS segment may be
expanded at runtime with a
system call (e.g., Unix sbrk)
called by the heap manager
routines.
Args/env strings copied
in by kernel when the
process is created.
Virtual Address Translation
29
Example: typical 32-bit
architecture with 8KB pages.
00
Virtual address translation maps a
virtual page number (VPN) to a
physical page frame number (PFN):
the rest is easy.
virtual address
VPN
0
13
offset
address
translation
Deliver exception to
OS if translation is not
valid and accessible in
requested mode.
physical address
{
PFN
+
offset
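The split described above can be sketched in a few lines of C, assuming the slide's example of 32-bit addresses and 8KB pages (a 13-bit offset, leaving 19 bits of VPN). The helper names vpn_of, offset_of, and make_pa are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  13u                    /* 8KB pages: 2^13 bytes */
#define PAGE_SIZE   (1u << PAGE_SHIFT)
#define OFFSET_MASK (PAGE_SIZE - 1u)

/* Virtual page number: the high-order 19 bits of the address. */
uint32_t vpn_of(uint32_t va) { return va >> PAGE_SHIFT; }

/* Byte offset within the page: the low-order 13 bits, unchanged by translation. */
uint32_t offset_of(uint32_t va) { return va & OFFSET_MASK; }

/* Physical address: the frame number placed above the same offset. */
uint32_t make_pa(uint32_t pfn, uint32_t offset) {
    assert(offset < PAGE_SIZE);
    return (pfn << PAGE_SHIFT) | offset;
}
```

For example, virtual address 0x4ABC has VPN 2 and offset 0xABC; if VPN 2 maps to PFN 5, the physical address is 0xAABC.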
A Simple Page Table
Each process/VAS has
its own page table.
Virtual addresses are
translated relative to
the current page table.
process page table
PFN 0
PFN 1
PFN i
In this example, each
VPN j maps to PFN j,
but in practice any
physical frame may be
used for any virtual page.
PFN i
+
offset
page #i
offset
user virtual address
physical memory
page frames
The page tables are
themselves stored in
memory; a protected
register holds a pointer to
the current page table.
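A toy version of a single-level page table lookup, under stated assumptions: 8KB pages as in the earlier example, a tiny 8-page address space, and an invented pte_t layout with valid and writable bits. A real MMU does this in hardware; returning false here stands in for delivering an exception to the OS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 13u   /* 8KB pages */
#define NUM_VPAGES 8u    /* toy address space: 8 virtual pages */

/* Hypothetical page table entry: frame number plus protection bits. */
typedef struct {
    uint32_t pfn;
    bool valid;
    bool writable;
} pte_t;

/* Translate va using the given table. Returns true and stores the physical
 * address in *pa, or returns false where the machine would raise a fault. */
bool translate(const pte_t *table, uint32_t va, bool is_write, uint32_t *pa) {
    assert(table != NULL && pa != NULL);
    uint32_t vpn = va >> PAGE_SHIFT;
    if (vpn >= NUM_VPAGES)
        return false;                      /* outside the address space */
    const pte_t *pte = &table[vpn];
    if (!pte->valid)
        return false;                      /* no translation: page fault */
    if (is_write && !pte->writable)
        return false;                      /* protection fault */
    *pa = (pte->pfn << PAGE_SHIFT) | (va & ((1u << PAGE_SHIFT) - 1u));
    return true;
}
```

Because each process has its own table, the same virtual address can translate to different frames, or to none, depending on which table is current.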
Virtual Memory as a Cache
executable
file
virtual
memory
(big)
header
text
text
data
idata
data
wdata
symbol
table, etc.
BSS
program
sections
physical
memory
(small)
backing
storage
pageout/eviction
user stack
args/env
page fetch
kernel
process
segments
virtual-to-physical
translations
physical
page frames
Completing a VM Reference
start
here
probe
page table
load
TLB
probe
TLB
access
valid?
load
TLB
zero-fill
fetch
from disk
page on
disk?
MMU
access
physical
memory
raise
exception
OS
allocate
frame
page
fault?
signal
process
Processes and the Kernel
processes
in private
virtual
address
spaces
data
The kernel sets
up process
execution
contexts to
“virtualize” the
machine.
data
system call traps
...and upcalls (e.g.,
signals)
shared kernel
code and data
in shared
address space
CPU and devices force entry to the kernel to handle exceptional events.
Threads or
processes
enter the
kernel for
services.
Architectural Foundations of OS Kernels
• One or more privileged execution modes (e.g., kernel mode)
protected device control registers
privileged instructions to control basic machine functions
• System call trap instruction and protected fault handling
User processes safely enter the kernel to access shared OS services.
• Virtual memory mapping
OS controls virtual-physical translations for each address space.
• Device interrupts to notify the kernel of I/O completion etc.
Includes timer hardware and clock interrupts to periodically return
control to the kernel as user code executes.
• Atomic instructions for coordination on multiprocessors
Example: Mac OS X
Classical View: The Questions
The basic issues/questions in classical OS are how to:
• allocate memory and storage to multiple programs?
• share the CPU among concurrently executing programs?
• suspend and resume programs?
• share data safely among concurrent activities?
• protect one executing program’s storage from another?
• protect the code that implements the protection, and
mediates access to resources?
• prevent rogue programs from taking over the machine?
• allow programs to interact safely?
The Access Control Model
1. Isolation Boundary to prevent attacks outside
access-controlled channels
2. Access Control for channel traffic
3. Policy management
Authorization
Authentication
Principal
Do
operation
Reference
monitor
Object
Source
Request
Guard
Resource
1. Isolation
boundary
2. Access
control
3.
Policy
Policy
Audit
log
[Butler Lampson – Accountability and Freedom]
Isolation
• I am isolated if anything that goes wrong is my fault
– Actually, my program’s fault
Program
Data
Boundary
Creator
Host
policy G
U
A
R
D
• Attacks on:
– Program
– Isolation
– Policy
G
U
A
R
policy D
guard
Services
Authentication
Principal
Do
operation
Source
Request
Authorization
Reference
Object
monitor
Guard
Resource
1. Isolation boundary
2. Access
control
Policy
Audit log
3.
Policy
[Butler Lampson – Accountability and Freedom]
Access Control Mechanisms:
The Gold Standard
• Authenticate principals: Who made a request
– Mainly people, but also channels, servers, programs
(encryption implements channels, so key is a principal)
• Authorize access: Who is trusted with a resource
– Group principals or resources, to simplify management
– Can define by a property, e.g. “type-safe” or “safe for scripting”
• Audit: Who did what when?
• Lock = Authenticate + Authorize
• Deter = Authenticate + Audit
Authorization
Authentication
Principal
Do
operation
Reference
monitor
Object
Source
Request
Guard
Resource
1. Isolation boundary
2. Access control
Policy
Audit log
3.
Policy
[Butler Lampson – Accountability and Freedom]
Kernel Mode
0
CPU mode (a field
in some status
register) indicates
whether the CPU is
running in a user
program or in the
protected kernel.
OS code
CPU
OS data
Program A
data
Data
mode
R0
x
Some instructions or
register accesses are
only legal when the
CPU is executing in
kernel mode.
Rn
PC
Program B
Data
x
registers
code library
2^n
main memory
physical
address
space
The Kernel
• Today, all “real” operating systems have protected kernels.
The kernel resides in a well-known file: the “machine”
automatically loads it into memory (boots) on power-on/reset.
Our “kernel” is called the executive in some systems (e.g., XP).
• The kernel is (mostly) a library of service procedures shared
by all user programs, but the kernel is protected:
User code cannot access internal kernel data structures directly,
and it can invoke the kernel only at well-defined entry points
(system calls).
• Kernel code is like user code, but the kernel is privileged:
The kernel has direct access to all hardware functions, and
defines the machine entry points for interrupts and exceptions.
Protecting Entry to the Kernel
Protected events and kernel mode are the architectural
foundations of kernel-based OS (Unix, XP, etc).
• The machine defines a small set of exceptional event types.
• The machine defines what conditions raise each event.
• The kernel installs handlers for each event at boot time.
e.g., a table in kernel memory read by the machine
The machine transitions to kernel mode
only on an exceptional event.
The kernel defines the event handlers.
Therefore the kernel chooses what code
will execute in kernel mode, and when.
user
trap/return
kernel
interrupt or
exception
The Role of Events
A CPU event is an “unnatural” change in control flow.
Like a procedure call, an event changes the PC.
Also changes mode or context (current stack), or both.
Events do not change the current space!
The kernel defines a handler routine for each event type.
Event handlers always execute in kernel mode.
The specific types of events are defined by the machine.
Once the system is booted, every entry to the kernel occurs as a
result of an event.
In some sense, the whole kernel is a big event handler.
CPU Events: Interrupts and Exceptions
An interrupt is caused by an external event.
device requests attention, timer expires, etc.
An exception is caused by an executing instruction.
CPU requires software intervention to handle a fault or trap.
          unplanned    deliberate
sync      fault        syscall trap
async     interrupt    AST
control flow
AST: Asynchronous System Trap
Also called a software interrupt or an
Asynchronous or Deferred Procedure Call
(APC or DPC)
Note: different “cultures” may use some of these terms (e.g.,
trap, fault, exception, event, interrupt) slightly differently.
exception.cc
event handler (e.g.,
ISR: Interrupt Service
Routine)
Protection Rings
• Kernel/user mode not enough!
– Need a mode for hypervisor
• x86 rings
– Has built-in security levels (Rings 0, 1, 2, 3)
– Ring 0: OS software (most privileged)
– Ring 3: User software
– Rings 1 & 2: Not used
• Xen guest OS executes on, e.g., Ring 1
Increasing Privilege Level
Ring 0
Ring 1
Ring 2
Ring 3
[Fischbach]
Example: System Call Traps
User code invokes kernel services by initiating system call traps.
• Programs in C, C++, etc. invoke system calls by linking to a
standard library of procedures written in assembly language.
the library defines a stub or wrapper routine for each syscall
stub executes a special trap instruction (e.g., chmk or callsys or int)
syscall arguments/results passed in registers or user stack
Alpha CPU architecture
read() in Unix libc.a library (executes in user mode):
#define SYSCALL_READ 27        # code for a read system call
move arg0…argn, a0…an          # syscall args in registers A0..AN
move SYSCALL_READ, v0          # syscall dispatch code in V0
callsys                        # kernel trap
move r1, _errno                # errno = return status
return
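On Linux with glibc (an assumption; other systems differ), the same idea can be observed from C through the generic syscall() library routine, which loads the syscall number and arguments and executes the trap instruction. The wrappers raw_write and raw_getpid below are illustrative names, not part of any library.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers to declare syscall() */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Trap into the kernel with the write syscall number, bypassing the
 * write() stub. Returns the byte count, or -1 with errno set. */
long raw_write(int fd, const char *s) {
    assert(s != NULL);
    return syscall(SYS_write, fd, s, strlen(s));
}

/* Same trap mechanism, different dispatch code. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}
```

Calling raw_write(1, "hello\n") should behave like the write(1, s, 6) call in the earlier myprogram.c example, because the libc write() stub ends in the same trap.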
Faults
Faults are similar to system calls in some respects:
• Faults occur as a result of a process executing an instruction.
Fault handlers execute on the process kernel stack; the fault handler
may block (sleep) in the kernel.
• The completed fault handler may return to the faulted context.
But faults are different from syscall traps in other respects:
• Syscalls are deliberate, but faults are “accidents”.
divide-by-zero, dereference invalid pointer, memory page fault
• Not every execution of the faulting instruction results in a fault.
may depend on memory state or register contents
Process Internals
thread
virtual address space
+
stack
process descriptor (PCB)
+
The address space is
represented by page
table, a set of
translations to physical
memory allocated from a
kernel memory manager.
The thread has a saved user
context as well as a system
context.
The kernel must
initialize the process
memory with the
program image to run.
The kernel can manipulate
the user context to start the
thread in user mode
wherever it wants.
Each process has a thread
bound to the VAS.
user ID
process ID
parent PID
sibling links
children
resources
Process state includes
a file descriptor table,
links to maintain the
process tree, and a
place to store the exit
status.
Kernel Stacks and Trap/Fault Handling
Processes
execute user
code on a user
stack in the user
portion of the
process virtual
address space.
Each process has a
second kernel stack
in kernel space (the
kernel portion of the
address space).
data
stack
stack
stack
syscall
dispatch
table
stack
System calls
and faults run
in kernel mode
on the process
kernel stack.
System calls run
in the process
space, so copyin
and copyout can
access user
memory.
The syscall trap handler makes an indirect call through the system
call dispatch table to the handler for the specific system call.
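The indirect call through a dispatch table can be sketched in plain C as an array of function pointers indexed by syscall number. Everything here is a toy: toy_sys_add, toy_sys_max, TOY_ENOSYS, and trap_dispatch are invented for illustration and do not correspond to any real kernel's table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ENOSYS (-38L)   /* hypothetical "no such syscall" result */

typedef long (*syscall_fn)(long a0, long a1);

/* Two stand-in handlers; a real kernel has one per system call. */
static long toy_sys_add(long a0, long a1) { return a0 + a1; }
static long toy_sys_max(long a0, long a1) { return a0 > a1 ? a0 : a1; }

/* The dispatch table: the syscall number is the index. */
static const syscall_fn dispatch_table[] = { toy_sys_add, toy_sys_max };
#define NUM_SYSCALLS (sizeof dispatch_table / sizeof dispatch_table[0])

/* What a trap handler does after saving user state: validate the number
 * supplied by user code, then make the indirect call. */
long trap_dispatch(long number, long a0, long a1) {
    if (number < 0 || (size_t)number >= NUM_SYSCALLS)
        return TOY_ENOSYS;
    assert(dispatch_table[number] != NULL);
    return dispatch_table[number](a0, a1);
}
```

The bounds check matters: the number comes from untrusted user code, so the handler must never index past the table.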
The Classical OS Model in Unix
A Lasting Achievement?
“Perhaps the most important achievement of Unix is to
demonstrate that a powerful operating system for
interactive use need not be expensive…it can run on
hardware costing as little as $40,000.”
The UNIX Time-Sharing System*
D. M. Ritchie and K. Thompson
DEC PDP-11/24
http://histoire.info.online.fr/pdp11.html
Elements of the Unix
1. rich model for IPC and I/O: “everything is a file”
file descriptors: most/all interactions with the outside world are
through system calls to read/write from file descriptors, with a
unified set of syscalls for operating on open descriptors of
different types.
2. simple and powerful primitives for creating and
initializing child processes
fork: easy to use, expensive to implement
Command shell is an “application” (user mode)
3. general support for combining small simple programs to
perform complex tasks
standard I/O and pipelines
The Shell
The Unix command interpreters run as ordinary user
processes with no special privilege.
This was novel at the time Unix was created: other systems
viewed the command interpreter as a trusted part of the OS.
Users may select from a range of interpreter programs
available, or even write their own (to add to the confusion).
csh, sh, ksh, tcsh, bash: choose your flavor...or use perl.
Shells use fork/exec/exit/wait to execute commands composed
of program filenames, args, and I/O redirection symbols.
Shells are general enough to run files of commands (scripts) for
more complex tasks, e.g., by redirecting shell’s stdin.
Shell’s behavior is guided by environment variables.
Using the shell
• Commands: ls, cat, and all that
• Current directory: cd and pwd
• Arguments: echo
• Signals: ctrl-c
• Job control, foreground, and background: &, ctrl-z, bg, fg
• Environment variables: printenv and setenv
• Most commands are programs: which, $PATH, and /bin
• Shells are commands: sh, csh, ksh, tcsh, bash
• Pipes and redirection: ls | grep a
• Files and I/O: open, read, write, lseek, close
• stdin, stdout, stderr
• Users and groups: whoami, sudo, groups
Other application programs
nroff
sh
who
cpp
a.out
Kernel
comp
date
Hardware
cc
wc
as
ld
vi
ed
grep
Other application programs
Questions about Processes
A process is an execution of a program within a private
virtual address space (VAS).
1. What are the system calls to operate on processes?
2. How does the kernel maintain the state of a process?
Processes are the “basic unit of resource grouping”.
3. How is the process virtual address space laid out?
What is the relationship between the program and the process?
4. How does the kernel create a new process?
How to allocate physical memory for processes?
How to create/initialize the virtual address space?
Process Creation
Two ways to create a process
• Build a new empty process from scratch
• Copy an existing process and change it appropriately
Option 1: New process from scratch
• Steps
Load specified code and data into memory;
Create empty call stack
Create and initialize PCB (make look like context-switch)
Put process on ready list
• Advantages: No wasted work
• Disadvantages: Difficult to set up process correctly and to express
all possible options
Process permissions, where to write I/O, environment variables
Example: Windows NT has a call with 10 arguments
[Remzi Arpaci-Dusseau]
Process Creation
Option 2: Clone existing process and change
• Example: Unix fork() and exec()
Fork(): Clones calling process
Exec(char *file): Overlays file image on calling process
• Fork()
Stop current process and save its state
Make copy of code, data, stack, and PCB
Add new PCB to ready list
Any changes needed to PCB?
• Exec(char *file)
Replace current data and code segments with those in specified file
• Advantages: Flexible, clean, simple
• Disadvantages: Wasteful to copy memory and then immediately
overwrite it
[Remzi Arpaci-Dusseau]
Process Creation in Unix
int pid;
int status = 0;
pid = fork();
if (pid < 0) {
/* fork failed: no child was created */
perror("fork");
} else if (pid > 0) {
/* parent */
…..
pid = wait(&status);
} else {
/* child */
…..
exit(status);
}
The fork syscall returns
twice: it returns a zero to the
child and the child process ID
(pid) to the parent.
Parent uses wait to sleep until
the child exits; wait returns
child pid and status.
Wait variants allow wait on a
specific child, or notification of
stops and other signals.
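A small self-contained sketch of the fork/exit/wait pattern, assuming a POSIX system. The function name run_child is made up; it uses waitpid, one of the wait variants, to wait for a specific child.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with the given code; the parent
 * sleeps in waitpid until the child exits, then returns the reaped
 * exit status (or -1 on any error). */
int run_child(int code) {
    assert(code >= 0 && code <= 255);   /* exit status is one byte */
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0)
        _exit(code);                    /* child: fork returned zero */

    /* parent: fork returned the child's pid */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit rather than exit so that it does not flush stdio buffers inherited from the parent a second time.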
Unix Fork/Exec/Exit/Wait Example
fork parent
fork child
initialize
child context
exec
int pid = fork();
Create a new process that is a clone of
its parent.
exec*(“program” [, argvp, envp]);
Overlay the calling process virtual
memory with a new program, and
transfer control to it.
exit(status);
Exit with status, destroying the process.
Note: this is not the only way for a
process to exit!
wait
exit
int pid = wait*(&status);
Wait for exit (or other status change) of
a child, and “reap” its exit status.
Note: child may have exited before
parent calls wait!
How are Unix shells implemented?
while (1) {
char *cmd = getcmd();
int retval = fork();
if (retval == 0) {
// This is the child process
// Set up the child’s process environment here
// E.g., where is standard I/O, how to handle signals?
exec(cmd); // stands for one of the exec* calls, e.g. execlp(cmd, cmd, (char *)NULL)
// exec does not return if it succeeds
printf("ERROR: Could not execute %s\n", cmd);
exit(1);
} else {
// This is the parent process; wait for child to finish
int pid = retval;
waitpid(pid, NULL, 0);
}
}
[Remzi Arpaci-Dusseau]
The Concept of Fork
fork creates a child process that is a clone of the parent.
• Child has a (virtual) copy of the parent’s virtual memory.
• Child is running the same program as the parent.
• Child inherits open file descriptors from the parent.
(Parent and child file descriptors point to a common entry in the
system open file table.)
• Child begins life with the same register values as parent.
The child process may execute a different program in its
context with a separate exec() system call.
What’s So Cool About Fork
1. fork is a simple primitive that allows process creation
without troubling with what program to run, args, etc.
Serves the purpose of “lightweight” processes (like threads?).
2. fork gives the parent program an opportunity to initialize
the child process…e.g., the open file descriptors.
Unix syscalls for file descriptors operate on the current process.
Parent program running in child process context may open/close
I/O and IPC objects, and bind them to stdin, stdout, and stderr.
Also may modify environment variables, arguments, etc.
3. Using the common fork/exec sequence, the parent (e.g., a command
interpreter or shell) can transparently cause children to read/write from
files, terminal windows, network connections, pipes, etc.
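A hedged sketch of point 3, assuming a POSIX system with /bin/echo installed. Between fork and exec, the child (still running the parent's code) rebinds its stdout to a pipe; the exec'ed program then writes to the pipe without knowing it. The function name capture_echo is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "/bin/echo word" with its stdout redirected into a pipe, and copy
 * what it prints into buf (NUL-terminated). Returns bytes captured or -1. */
ssize_t capture_echo(const char *word, char *buf, size_t cap) {
    assert(word != NULL && buf != NULL && cap > 0);
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: parent's program running in the child's context */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);    /* bind pipe write end to stdout */
        close(fds[1]);
        execl("/bin/echo", "echo", word, (char *)NULL);
        _exit(127);                     /* reached only if exec failed */
    }
    /* parent: read the child's output until end of file */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);              /* reap the child */
    return (ssize_t)total;
}
```

This is roughly what a shell does for each stage of a pipeline such as ls | grep a, except that the shell connects the pipe to another child instead of reading it itself.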
Unix File Descriptors
Unix processes name I/O and IPC objects by integers
known as file descriptors.
• File descriptors 0, 1, and 2 are reserved by convention
for standard input, standard output, and standard error.
“Conforming” Unix programs read input from stdin, write
output to stdout, and errors to stderr by default.
• Other descriptors are assigned by syscalls to open/create
files, create pipes, or bind to devices or network sockets.
pipe, socket, open, creat
• A common set of syscalls operate on open file
descriptors independent of their underlying types.
read, write, dup, close
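A short sketch of descriptors as per-process names for shared kernel objects, assuming a POSIX system with a writable /tmp. The function name offset_seen_by_dup and the scratch file template are invented. A descriptor made by dup refers to the same open file table entry as the original, so a write through one advances the file offset seen through the other.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg_len bytes through one descriptor, then report the current
 * file offset as seen through a dup'ed descriptor. Returns -1 on error. */
long offset_seen_by_dup(const char *msg, size_t msg_len) {
    assert(msg != NULL);
    char path[] = "/tmp/fd-demo-XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);                /* open/create: assigns a descriptor */
    if (fd < 0)
        return -1;
    unlink(path);                          /* file persists until last close */

    long result = -1;
    int fd2 = dup(fd);                     /* second name, same open file entry */
    if (fd2 >= 0) {
        if (write(fd, msg, msg_len) == (ssize_t)msg_len)
            result = (long)lseek(fd2, 0, SEEK_CUR);
        close(fd2);
    }
    close(fd);
    return result;
}
```

The same sharing applies to descriptors inherited across fork, which is why a parent and child writing to the same inherited file append after one another instead of overwriting.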
Unix File Descriptors Illustrated
user space
kernel
file
pipe
process file
descriptor
table
File descriptors are a special
case of kernel object handles.
socket
system open file
table
The binding of file descriptors to objects is
specific to each process, like the virtual
translations in the virtual address space.
tty
Disclaimer:
this drawing is
oversimplified.
Kernel Object Handles
Instances of kernel abstractions may be viewed as “objects”
named by protected handles held by processes.
• Handles are obtained by create/open calls, subject to
security policies that grant specific rights for each handle.
• Any process with a handle for an object may operate on the
object using operations (system calls).
Specific operations are defined by the object’s type.
• The handle is an integer index to a kernel table.
file
Microsoft Windows object handles
Unix file descriptors
port
object
handles
user space
kernel
etc.
Unix Philosophy
Rule of Modularity: Write simple parts connected by clean interfaces.
Rule of Composition: Design programs to be connected to other
programs.
Rule of Separation: Separate policy from mechanism; separate interfaces
from engines.
Rule of Representation: Fold knowledge into data so program logic can
be stupid and robust.
Rule of Transparency: Design for visibility to make inspection and
debugging easier.
Rule of Repair: When you must fail, fail noisily and as soon as possible
Rule of Extensibility: Design for the future, because it will be here
sooner than you think.
Rule of Robustness: Robustness is the child of transparency and
simplicity.
[Eric Raymond]
Unix Philosophy: Simplicity
Rule of Economy: Programmer time is expensive; conserve it in
preference to machine time.
Rule of Clarity: Clarity is better than cleverness.
Rule of Simplicity: Design for simplicity; add complexity only where you
must.
Rule of Parsimony: Write a big program only when it is clear by
demonstration that nothing else will do.
Rule of Generation: Avoid hand-hacking; write programs to write
programs when you can.
Rule of Optimization: Prototype before polishing. Get it working before
you optimize it.
[Eric Raymond]
Unix Philosophy: Interfaces
Rule of Least Surprise: In interface design, always
do the least surprising thing.
Rule of Silence: When a program has nothing
surprising to say, it should say nothing.
Rule of Diversity: Distrust all claims for “one true way”.
[Eric Raymond]
Introduction to Virtual Addressing
virtual
memory
(big?)
User processes
address memory
through virtual
addresses.
text
data
physical
memory
(small?)
The kernel controls
the virtual-physical
translations in effect
for each space.
BSS
The kernel and the
machine collude to
translate virtual
addresses to
physical addresses.
user stack
args/env
kernel
virtual-to-physical
translations
The machine does not
allow a user process
to access memory
unless the kernel
“says it’s OK”.
The specific mechanisms for
implementing virtual address translation
are machine-dependent.
Multics (1965-2000)
• “Multiplexed 24x7 computer utility”
Multi-user “time-sharing”, interactive and batch
Processes, privacy, security, sharing, accounting
“Decentralized system programming”
• Virtual memory with “automatic page turning”
• “Two-dimensional VM” with segmentation
Avoids “complicated overlay techniques”
Modular sharing and dynamic linking
• Hierarchical file system with symbolic names, automatic backup
“Single-level store”
• Dawn of “systems research” as an academic enterprise
Multics Concepts
• Segments as granularity of sharing
Protection rings and the segment protection level
• Segments have symbolic names in a hierarchical name space
• Segments are “made known” before access from a process.
Apply access control at this point.
• Segmentation is a cheap way to extend the address space.
Note: paging mechanisms are independent of segmentation.
• Dynamic linking resolves references across segments.
How does the Unix philosophy differ?
Segmentation (1)
One-dimensional address space
growing tables
tables may bump
[Tanenbaum]
Variable Partitioning
Variable partitioning is the strategy of parking differently sized cars
along a street with no marked parking space dividers.
Wasted space
from external
fragmentation
Fixed Partitioning
Wasted space from internal fragmentation
Segmentation with Paging: MULTICS (1)
Descriptor segment points to page tables
Segment descriptor – numbers are field lengths
Segmentation with Paging: MULTICS (2)
A 34-bit MULTICS virtual address
Segmentation with Paging: MULTICS (3)
Conversion of a 2-part MULTICS address into a main memory address
[Tanenbaum]
The Virtual Address Space
A typical process VAS includes:
• user regions in the lower half
V->P mappings specific to each process
accessible to user or kernel code
• kernel regions in upper half
shared by all processes, but accessible only to kernel code
• Windows on IA32 subdivides kernel region into an
unpaged half and a (mostly) paged upper half at
0xC0000000 for page tables and I/O cache.
• Win95/98 used the lower half of system space as a
system-wide shared region.
[Figure: address space layout from 0 (0x0) to 2^n - 1 (0xffffffff): text, data, BSS (sbrk()), user stack (jsr), and args/env in the lower half; kernel text and kernel data in the upper half, starting at 2^(n-1).]
A VAS for a private address space system (e.g., Unix, NT/XP) executing on a typical 32-bit system (e.g., x86).
Process and Kernel Address Spaces
[Figure: an n-bit virtual address space beside a 32-bit virtual address space, with data regions marked and these boundary addresses:]
0 = 0x0
2^(n-1) - 1 = 0x7FFFFFFF
2^(n-1) = 0x80000000
2^n - 1 = 0xFFFFFFFF
The OS Directs the MMU
The OS controls the operation of the MMU to select:
(1) the subset of possible virtual addresses that are valid for
each process (the process virtual address space);
(2) the physical translations for those virtual addresses;
(3) the modes of permissible access to those virtual addresses;
read/write/execute
(4) the specific set of translations in effect at any instant.
need rapid context switch from one address space to another
MMU completes a reference only if the OS “says it’s OK”.
MMU raises an exception if the reference is “not OK”.
Example: Windows/IA32
There is lots more to say about address translation, but we
don’t want to spend too much time on it now.
• Each address space has a page directory
• One page: 4K bytes, 1024 4-byte entries (PTEs)
• Each PDIR entry points to a “page table”
• Each “page table” is one page with 1024 PTEs
• Each PTE maps one 4K page of the address space
• Each page table maps 4MB of memory: 1024*4K
• One PDIR for a 4GB address space, max 4MB of tables
• Load PDIR base address into a register to activate the VAS
[from Tanenbaum]
Top-level
page table
32 bit address with 2 page table fields
Two-level page tables
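The IA32 layout above implies a 10/10/12 split of each 32-bit virtual address: 10 bits of page directory index, 10 bits of page table index, and a 12-bit offset. The sketch below walks a toy two-level table in software. The types pte_t and pdir_t and the function names are invented, and the directory holds C pointers for simplicity, whereas real PDIR entries hold physical frame numbers and flag bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field extraction for a 32-bit address with 4K pages. */
uint32_t pdir_index(uint32_t va)  { return va >> 22; }             /* top 10 bits */
uint32_t ptab_index(uint32_t va)  { return (va >> 12) & 0x3FFu; }  /* next 10 bits */
uint32_t page_offset(uint32_t va) { return va & 0xFFFu; }          /* low 12 bits */

/* Simplified entry and directory types (assumptions, not the IA32 format). */
typedef struct { uint32_t pfn; int valid; } pte_t;
typedef struct { pte_t *tables[1024]; } pdir_t;

/* Two-level walk: returns 0 and stores the physical address in *pa,
 * or returns -1 where the hardware would raise a page fault. */
int walk(const pdir_t *pd, uint32_t va, uint32_t *pa) {
    assert(pd != NULL && pa != NULL);
    const pte_t *pt = pd->tables[pdir_index(va)];
    if (pt == NULL)
        return -1;                      /* no page table for this 4MB region */
    const pte_t *pte = &pt[ptab_index(va)];
    if (!pte->valid)
        return -1;                      /* page not mapped */
    *pa = (pte->pfn << 12) | page_offset(va);
    return 0;
}
```

Regions with no mappings need no page table at all, which is why a sparse 4GB address space usually needs far less than the 4MB maximum of tables.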
What did we just do?
We used special machine features to “virtualize” a core
resource: memory.
• Each process/space only gets some of the memory.
• The OS decides how much you get.
• The OS decides what parts of the program and its data are in
memory, and what parts you will have to wait for.
• You can’t tell exactly what you have.
• The OS isolates each process from its competitors.
Virtualization involves a clean abstract interface with a level
of indirection that enables the system to interpose on
important actions, securely and transparently, in order to
hide details of the system below the interface.
Mode, Space, and Context
At any time, the state of each processor (core) is defined by:
1. mode: given by the mode bit(s)
Is the CPU executing in the protected kernel or a user program?
2. space: defined by V->P translations currently in effect
What address space is the CPU running in? Once the system is
booted, it always runs in some virtual address space.
3. context: given by register state and execution stream
Is the CPU executing a thread/process, or an interrupt handler?
Where is the stack?
These are important because the mode/space/context
determines the meaning and validity of key operations.
VM Internals: Mach/BSD Example
address
space (task)
start, len,
prot
start, len,
prot
start, len,
prot
start, len,
prot
memory
objects
vm_map
lookup
enter
putpage
getpage
pmap_page_protect
pmap_clear_modify
pmap_is_modified
pmap_is_referenced
pmap_clear_reference
pmap
pmap_enter()
pmap_remove()
One pmap (physical map)
per virtual address space.
page cells (vm_page_t)
array indexed by PFN
page table
system-wide
phys-virtual map
Memory Objects
•Memory objects “virtualize” VM backing
storage policy.
• source and sink for pages
•triggered by faults
object->putpage(page)
object->getpage(offset, page, mode)
•...or OS eviction policy
memory object
• manage their own storage
• external pager has some control:
•prefetch
•prewrite
•protect/enable
• can be shared via vm_map()
•(Mach extended mmap syscall)
swap
pager
vnode
pager
anonymous VM
mapped files
extern
pager
DSM
databases
reliable VM
etc.
Windows/NT Processes
• A raw NT process is just a virtual address space, a handle
table, and an (initially empty) list of threads.
• Processes are themselves objects named by handles,
supporting specific operations.
• create threads
• map sections (VM regions)
• NtCreateProcess returns an object handle for the process.
• Creator may specify a separate (assignable) “parent” process.
• Inherit VAS from designated parent, or initialize as empty.
• Handles can be inherited; creator controls per-handle
inheritance.
Sharing the CPU
We have seen how an operating system can share and
“virtualize” one hardware resource: memory.
How does an OS share the CPU among multiple running
programs (processes)?
• Safely
• Fairly (?)
• Efficiently
Sharing Disks
How should the OS mediate/virtualize/share the disk(s)
among multiple users or programs?
• Safely
• Fairly
• Securely
• Efficiently
• Effectively
• Robustly