Virtual Memory (Review)

 Programs refer to virtual memory addresses
  [Figure: virtual address space from 00∙∙∙∙∙∙0 to FF∙∙∙∙∙∙F, accessed by instructions such as movl (%ecx),%eax]
  Conceptually very large array of bytes
  Each byte has its own address
  Actually implemented with a hierarchy of different memory types
  System provides address space private to particular “process”

 Allocation: compiler and run-time system
  Decide where in the single virtual address space each program object is to be stored

 But why virtual and not physical memory?
Problem 1: How Does Everything Fit?

 64-bit addresses: 16 exabytes of virtual address space
 Physical main memory: a few gigabytes
 And there are many processes ….
Problem 2: Memory Management

 Each of process 1 … process n needs its own stack, heap, .text, .data, …
 All of them must share one physical main memory
 What goes where?
Problem 3: How To Protect

 Process i and process j live side by side in physical main memory; neither should be able to touch the other’s data
Problem 4: How To Share?

 Sometimes process i and process j should see the very same bytes of physical main memory
Solution: Level Of Indirection

 [Figure: the virtual memory of process 1 … process n is connected to physical memory through a per-process mapping]

 Each process gets its own private memory space
 Solves the previous problems
Address Spaces

 Linear address space: ordered set of contiguous non-negative integer addresses:
  {0, 1, 2, 3, … }

 Virtual address space: set of N = 2^n virtual addresses
  {0, 1, 2, 3, …, N-1}

 Physical address space: set of M = 2^m physical addresses
  {0, 1, 2, 3, …, M-1}

 Clean distinction between data (bytes) and their attributes (addresses)
 Each object can now have multiple addresses
 Every byte in main memory has:
  One physical address
  One (or more) virtual addresses

A System Using Physical Addressing

 [Figure: CPU sends a physical address (PA) directly to main memory (bytes 0: … M-1:); memory returns the data word]

 Used in “simple” systems with embedded microcontrollers
  In devices such as cars, elevators, digital picture frames, ...
A System Using Virtual Addressing

 [Figure: CPU emits a virtual address (VA); the MMU on the CPU chip translates it to a physical address (PA) for main memory (bytes 0: … M-1:); memory returns the data word]

 Used in all modern desktops, laptops, workstations
 One of the great ideas in computer science
 MMU checks the cache
Why Virtual Addressing?

 Simplifies memory management for programmers
  Each process gets an identical, full, private, linear address space

 Isolates address spaces
  One process can’t interfere with another’s memory, because they operate in different address spaces
  User process cannot access privileged information
  Different sections of address spaces have different permissions
Why Virtual Memory?

 Efficient use of limited main memory (RAM)
  Use RAM as a cache for the parts of a virtual address space
   Some non-cached parts stored on disk
   Some (unallocated) non-cached parts stored nowhere
  Keep only active areas of virtual address space in memory
  Transfer data back and forth as needed

VM as a Tool for Caching

 Virtual memory: array of N = 2^n contiguous bytes
  Think of the array (allocated part) as being stored on disk
 Physical main memory (DRAM) = cache for allocated virtual memory
 Blocks are called pages; size = 2^p bytes

 [Figure: virtual pages VP 0 … VP 2^(n-p)-1 live on disk, each either unallocated, cached, or uncached; the cached ones occupy physical pages PP 0 … PP 2^(m-p)-1 in DRAM, the rest of which is empty]
Memory Hierarchy: Core 2 Duo
(Not drawn to scale; L1/L2 caches use 64 B blocks)

 Level                    Size      Latency          Throughput to level above
 L1 I-cache / D-cache     32 KB     3 cycles         16 B/cycle
 L2 unified cache         ~4 MB     14 cycles        8 B/cycle
 Main memory              ~4 GB     100 cycles       2 B/cycle
 Disk                     ~500 GB   millions         1 B per 30 cycles

 Miss penalty (latency) to main memory: 30x
 Miss penalty (latency) to disk: 10,000x
DRAM Cache Organization

 DRAM cache organization driven by the enormous miss penalty
  DRAM is about 10x slower than SRAM
  Disk is about 10,000x slower than DRAM
   (for the first byte; faster for subsequent bytes)

 Consequences
  Large page (block) size: typically 4-8 KB, sometimes 4 MB
  Fully associative
   Any VP can be placed in any PP
   Requires a “large” mapping function – different from CPU caches
  Highly sophisticated, expensive replacement algorithms
   Too complicated and open-ended to be implemented in hardware
  Write-back rather than write-through

Address Translation: Page Tables

 A page table is an array of page table entries (PTEs) that maps virtual pages to physical pages. Here: 8 VPs
  Per-process kernel data structure in DRAM

 [Figure: memory-resident page table (PTE 0 … PTE 7) in DRAM; each entry holds a valid bit and a physical page number or disk address. Valid entries point to VP 1, VP 2, VP 7, VP 4 cached in physical memory (PP 0 … PP 3); invalid entries are null or point to uncached pages (VP 3, VP 6) in virtual memory on disk]
Address Translation With a Page Table

 Page table base register (PTBR) holds the page table address for the current process
 The virtual address is split into a virtual page number (VPN) and a virtual page offset (VPO)
 The VPN indexes the page table; a valid entry supplies the physical page number (PPN)
  Valid bit = 0: page not in memory (page fault)
 Physical address = PPN concatenated with the physical page offset (PPO), which equals the VPO
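
Before the worked figures, a minimal C sketch of this lookup may help. It assumes a flat single-level table, 4 KB pages, and a toy pte_t of my own invention; a real MMU does all of this in hardware:

#include <stdint.h>
#include <stdbool.h>

#define P 12                        /* page offset bits: 2^12 = 4 KB pages (assumed) */

typedef struct {
    bool     valid;                 /* is the page resident in DRAM? */
    uint64_t ppn;                   /* physical page number */
} pte_t;

/* Translate va using a single-level page table.
 * Returns true on success; false signals a page fault. */
bool translate(pte_t *page_table, uint64_t va, uint64_t *pa)
{
    uint64_t vpn = va >> P;                 /* virtual page number */
    uint64_t vpo = va & ((1ULL << P) - 1);  /* virtual page offset */
    pte_t pte = page_table[vpn];            /* one memory access */

    if (!pte.valid)
        return false;                       /* page fault: OS must page it in */

    *pa = (pte.ppn << P) | vpo;             /* PPO == VPO */
    return true;
}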
Page Hit

 Page hit: reference to VM word that is in physical memory

 [Figure: the virtual address selects a valid PTE (here for VP 2), which points to the cached copy in DRAM]
Page Miss

 Page miss: reference to VM word that is not in physical memory

 [Figure: the virtual address selects an invalid PTE (here for VP 3), which points to the uncached copy on disk]
Handling Page Fault

 Page miss causes page fault (an exception)

 [Figure: the reference to VP 3 hits an invalid PTE; control transfers to the kernel]
Handling Page Fault

 Page miss causes page fault (an exception)
 Page fault handler selects a victim to be evicted (here VP 4)

 [Figure: VP 4 currently occupies PP 3; its PTE will be invalidated]
Handling Page Fault

 Page miss causes page fault (an exception)
 Page fault handler selects a victim to be evicted (here VP 4)

 [Figure: VP 3 has been paged in over VP 4 in PP 3; PTE 3 is now valid and PTE 4 points back to disk]
Handling Page Fault

 Page miss causes page fault (an exception)
 Page fault handler selects a victim to be evicted (here VP 4)
 Offending instruction is restarted: page hit!

 [Figure: the retried reference to VP 3 now finds a valid PTE pointing into DRAM]
Why does it work? Locality

 Virtual memory works because of locality

 At any point in time, programs tend to access a set of active virtual pages called the working set
  Programs with better temporal locality will have smaller working sets

 If (working set size < main memory size)
  Good performance for one process after compulsory misses

 If ( SUM(working set sizes) > main memory size )
  Thrashing: performance meltdown where pages are swapped (copied) in and out continuously
VM as a Tool for Memory Management

 Key idea: each process has its own virtual address space
  It can view memory as a simple linear array
  Mapping function scatters addresses through physical memory
   Well-chosen mappings simplify memory allocation and management

 [Figure: virtual address spaces for process 1 and process 2 (VP 1, VP 2, …, addresses 0 … N-1) map via address translation into one physical address space (DRAM, 0 … M-1, PP 2, PP 6, PP 8, …); both processes map a page to PP 6 (e.g., read-only library code)]
VM as a Tool for Memory Management

 Memory allocation
  Each virtual page can be mapped to any physical page
  A virtual page can be stored in different physical pages at different times

 Sharing code and data among processes
  Map virtual pages to the same physical page (here: PP 6)

 [Figure: as before — the mappings of both processes meet at PP 6, e.g., read-only library code]
Simplifying Linking and Loading

 Linking
  Each program has a similar virtual address space
  Code, stack, and shared libraries always start at the same address

 Loading
  execve() allocates virtual pages for .text and .data sections = creates PTEs marked as invalid
  The .text and .data sections are copied, page by page, on demand by the virtual memory system

 [Figure: Linux process layout — kernel virtual memory above 0xc0000000 (invisible to user code); user stack (created at runtime, %esp = stack pointer); memory-mapped region for shared libraries at 0x40000000; run-time heap (created by malloc) growing up to brk; read/write segment (.data, .bss) and read-only segment (.init, .text, .rodata) loaded from the executable file starting at 0x08048000; unused region down to 0]
VM as a Tool for Memory Protection

 Extend PTEs with permission bits
 Page fault handler checks these before remapping
  If violated, send process SIGSEGV (segmentation fault)

 Process i:      SUP   READ   WRITE   Address
  VP 0:          No    Yes    No      PP 6
  VP 1:          No    Yes    Yes     PP 4
  VP 2:          Yes   Yes    Yes     PP 2

 Process j:      SUP   READ   WRITE   Address
  VP 0:          No    Yes    No      PP 9
  VP 1:          Yes   Yes    Yes     PP 6
  VP 2:          No    Yes    Yes     PP 11

 (Physical address space: PP 2, PP 4, PP 6, PP 8, PP 9, PP 11, …)
Address Translation: Page Hit

 1) Processor sends virtual address (VA) to MMU
 2-3) MMU sends PTE address (PTEA) to and fetches PTE from the page table in cache/memory
 4) MMU sends physical address (PA) to cache/memory
 5) Cache/memory sends data word to processor
Address Translation: Page Fault

 1) Processor sends virtual address (VA) to MMU
 2-3) MMU sends PTE address (PTEA) to and fetches PTE from the page table in cache/memory
 4) Valid bit is zero, so MMU triggers page fault exception
 5) Handler identifies victim page (and, if dirty, pages it out to disk)
 6) Handler pages in new page and updates PTE in memory
 7) Handler returns to original process, restarting faulting instruction
Speeding up Translation with a TLB

 Page table entries (PTEs) are cached in L1 like any other memory word
  PTEs may be evicted by other data references
  PTE hit still requires a 1-cycle delay

 Solution: Translation Lookaside Buffer (TLB)
  Small hardware cache in MMU
  Maps virtual page numbers to physical page numbers
  Contains complete page table entries for a small number of pages
TLB Hit

 [Figure: (1) CPU sends VA to MMU; (2) MMU sends VPN to TLB; (3) TLB returns the PTE; (4) MMU sends PA to cache/memory; (5) cache/memory returns the data word]

 A TLB hit eliminates a memory access
TLB Miss

 [Figure: (1) CPU sends VA to MMU; (2) MMU sends VPN to TLB — miss; (3) MMU sends PTE address (PTEA) to cache/memory; (4) cache/memory returns the PTE to the TLB; (5) MMU sends PA to cache/memory; (6) cache/memory returns the data word]

 A TLB miss incurs an additional memory access (the PTE)
 Fortunately, TLB misses are rare (WHY?)
From virtual address to memory location

 • Translation Lookaside Buffer (TLB) is a special cache just for the page table.
 • Usually fully associative.

 [Figure: CPU issues a virtual address to the TLB — hit: physical address; miss: consult the page table in main memory. The physical address goes to the cache — hit: data; miss: main memory]
Translation Lookaside Buffer

 Virtual-to-physical translations are cached in a TLB.

 [Figure: 32-bit virtual address = 20-bit virtual page number (bits 31-12) + 12-bit page offset (bits 11-0). Each TLB entry holds valid and dirty bits, a tag, and a 20-bit physical page number; on a TLB hit, physical page number + page offset form the physical address. That physical address is then split into a 16-bit tag, 14-bit cache index, and 2-bit byte offset to access the cache, which delivers a 32-bit data word on a cache hit]
What Happens on a Context Switch?

 Page table is per process
 So are the TLB’s contents
 Two options:
  TLB flush: invalidate all TLB entries on every switch
  TLB tagging: add an address-space identifier to each entry
Review of Abbreviations

 Components of the virtual address (VA)
  TLBI: TLB index
  TLBT: TLB tag
  VPO: virtual page offset
  VPN: virtual page number

 Components of the physical address (PA)
  PPO: physical page offset (same as VPO)
  PPN: physical page number
  CO: byte offset within cache line
  CI: cache index
  CT: cache tag
Simple Memory System Example

 Addressing
  14-bit virtual addresses
  12-bit physical addresses
  Page size = 64 bytes (6-bit page offset)

 Virtual address: bits 13-6 = VPN (virtual page number), bits 5-0 = VPO (virtual page offset)
 Physical address: bits 11-6 = PPN (physical page number), bits 5-0 = PPO (physical page offset)
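
A small C sketch of this address split, using the slide’s parameters (14-bit VAs, 64-byte pages) and, as a preview, the example address from the worked translations below:

#include <stdint.h>
#include <stdio.h>

#define VPO_BITS 6                        /* 64-byte pages => 6 offset bits */

int main(void)
{
    uint16_t va  = 0x03D4;                /* example VA used later in the slides */
    uint16_t vpn = va >> VPO_BITS;        /* bits 13-6 */
    uint16_t vpo = va & 0x3F;             /* bits 5-0  */
    printf("VPN = 0x%02X, VPO = 0x%02X\n",
           (unsigned)vpn, (unsigned)vpo); /* prints VPN = 0x0F, VPO = 0x14 */
    return 0;
}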
Simple Memory System Page Table
 Only the first 16 entries (out of 256) are shown

 VPN  PPN  Valid     VPN  PPN  Valid
 00   28   1         08   13   1
 01   –    0         09   17   1
 02   33   1         0A   09   1
 03   02   1         0B   –    0
 04   –    0         0C   –    0
 05   16   1         0D   2D   1
 06   –    0         0E   11   1
 07   –    0         0F   0D   1
Simple Memory System TLB

 16 entries, 4-way set associative (4 sets)
 VPN (bits 13-6) splits into TLBT (bits 13-8) and TLBI (bits 7-6); bits 5-0 are the VPO

 Set   Tag PPN V    Tag PPN V    Tag PPN V    Tag PPN V
 0     03  –   0    09  0D  1    00  –   0    07  02  1
 1     03  2D  1    02  –   0    04  –   0    0A  –   0
 2     02  –   0    08  –   0    06  –   0    03  –   0
 3     07  –   0    03  0D  1    0A  34  1    02  –   0
Simple Memory System Cache

 16 lines, 4-byte block size
 Physically addressed, direct mapped
 Physical address splits into CT (bits 11-6, tag), CI (bits 5-2, index), CO (bits 1-0, byte offset)

 Idx  Tag  Valid  B0 B1 B2 B3     Idx  Tag  Valid  B0 B1 B2 B3
 0    19   1      99 11 23 11     8    24   1      3A 00 51 89
 1    15   0      –  –  –  –      9    2D   0      –  –  –  –
 2    1B   1      00 02 04 08     A    2D   1      93 15 DA 3B
 3    36   0      –  –  –  –      B    0B   0      –  –  –  –
 4    32   1      43 6D 8F 09     C    12   0      –  –  –  –
 5    0D   1      36 72 F0 1D     D    16   1      04 96 34 15
 6    31   0      –  –  –  –      E    13   1      83 77 1B D3
 7    16   1      11 C2 DF 03     F    14   0      –  –  –  –
Address Translation Example #1

 Virtual Address: 0x03D4 = 00 0011 1101 0100 (binary; VPN | VPO)
  VPN: 0x0F   TLBI: 3   TLBT: 0x03
  TLB Hit? Y   Page Fault? N   PPN: 0x0D

 Physical Address: 0x354 = 0011 0101 0100 (binary; PPN | PPO)
  CO: 0   CI: 0x5   CT: 0x0D
  Cache Hit? Y   Byte: 0x36
Address Translation Example #2

 Virtual Address: 0x0020 = 00 0000 0010 0000 (binary; VPN | VPO)
  VPN: 0x00   TLBI: 0   TLBT: 0x00
  TLB Hit? N   Page Fault? N   PPN: 0x28

 Physical Address: 0xA20 = 1010 0010 0000 (binary; PPN | PPO)
  CO: 0   CI: 0x8   CT: 0x28
  Cache Hit? N   Byte: Mem
Address Translation Example #3

 Virtual Address: 0x0B8F = 00 1011 1000 1111 (binary; VPN | VPO)
  VPN: 0x2E   TLBI: 2   TLBT: 0x0B
  TLB Hit? N   Page Fault? Y   PPN: TBD

 Physical Address: cannot be formed until the fault is serviced
  CO: ___   CI: ___   CT: ___   Hit? ___   Byte: ____
Summary

 Programmer’s view of virtual address space
  Each process has its own private linear address space
  Cannot be corrupted by other processes

 System view of VAS & virtual memory
  Uses memory efficiently by caching virtual memory pages
   Efficient only because of locality
  Simplifies memory management and programming
  Simplifies protection by providing a convenient interpositioning point to check permissions
Allocating Virtual Pages

 Example: Allocating VP 5

 [Figure: page table as before; PTE 5 is currently null — VP 5 does not exist yet]
Allocating Virtual Pages

 Example: Allocating VP 5
 Kernel allocates VP 5 on disk and points PTE 5 to it

 [Figure: VP 5 now appears in virtual memory on disk; PTE 5 holds its disk address but remains invalid — the page is not cached in DRAM]
Page Table Size

 Given:
  4 KB (2^12) page size
  48-bit address space
  4-byte PTE

 How big is the page table?
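
A quick worked computation (the next slide reaches the same number): 2^48 bytes of address space / 2^12 bytes per page = 2^36 pages, so a flat table needs 2^36 PTEs × 2^2 bytes = 2^38 bytes = 256 GB — per process.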
Multi-Level Page Tables

 Given:
  4 KB (2^12) page size
  48-bit address space
  4-byte PTE

 Problem:
  Would need a 256 GB page table!
   2^48 * 2^-12 * 2^2 = 2^38 bytes

 Common solution
  Multi-level page tables
  Example: 2-level page table
   Level 1 table: each PTE points to a (level 2) page table
   Level 2 table: each PTE points to a page (paged in and out like other data)
   Level 1 table stays in memory
   Level 2 tables paged in and out
A Two-Level Page Table Hierarchy

 [Figure: level 1 page table with PTE 0 … PTE 1023. PTE 0 points to a level 2 table mapping VP 0 … VP 1023; PTE 1 points to a level 2 table mapping VP 1024 … VP 2047 — together 2K allocated VM pages for code and data. PTEs 2-7 are null, covering a gap of 6K unallocated VM pages. PTE 8 points to a level 2 table whose first 1023 entries are null (1023 unallocated pages) and whose PTE 1023 maps VP 9215 — 1 allocated VM page for the stack. The remaining (1K - 9) level 1 PTEs are null]
Translating with a k-level Page Table

 Virtual address (bits n-1 … 0) = VPN 1 | VPN 2 | … | VPN k | VPO (bits p-1 … 0)
 VPN 1 indexes the level 1 page table, whose entry points to a level 2 page table; in general, VPN i indexes the level i table
 The level k page table entry supplies the PPN
 Physical address (bits m-1 … 0) = PPN | PPO (bits p-1 … 0)
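
A minimal C sketch of this k-level walk, under assumed parameters that happen to match x86-64 (k = 4 levels, 9-bit VPN fields, 4 KB pages); the table_t layout is illustrative, not a real MMU structure, and validity checks / faults are omitted for brevity:

#include <stdint.h>

#define K        4      /* number of levels (assumed)            */
#define IDX_BITS 9      /* bits per VPN field (assumed)          */
#define P        12     /* page offset bits (4 KB pages)         */

typedef struct table table_t;
struct table {                        /* 512 entries * 8 B = one 4 KB page */
    union {
        table_t *next;                /* levels 1..k-1: next-level table   */
        uint64_t ppn;                 /* level k: physical page number     */
    } entry[1 << IDX_BITS];
};

uint64_t translate(table_t *level1, uint64_t va)
{
    table_t *t = level1;
    /* Walk levels 1..k-1, consuming VPN fields from high bits to low */
    for (int level = 1; level < K; level++) {
        int shift = P + (K - level) * IDX_BITS;
        t = t->entry[(va >> shift) & ((1 << IDX_BITS) - 1)].next;
    }
    uint64_t ppn = t->entry[(va >> P) & ((1 << IDX_BITS) - 1)].ppn;
    return (ppn << P) | (va & ((1ULL << P) - 1));   /* PPN | PPO */
}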
x86-64 Paging

 Origin
  AMD’s way of extending x86 to a 64-bit instruction set
  Intel has followed with “EM64T”

 Requirements
  48-bit virtual address: 256 terabytes (TB)
   Not yet ready for full 64 bits
    – Nobody can buy that much DRAM yet
    – Mapping tables would be huge
  52-bit physical address = 40 bits for PPN
   Requires 64-bit table entries
  Keep traditional x86 4 KB page size, and same size for page tables
   (4096 bytes per PT) / (8 bytes per PTE) = only 512 entries per page
Intel Core i7

 [Figure: processor package with 4 cores. Per core: registers and instruction fetch; L1 d-cache and L1 i-cache (32 KB, 8-way); L1 d-TLB (64 entries, 4-way) and L1 i-TLB (128 entries, 4-way); MMU (addr translation); L2 unified cache (256 KB, 8-way); L2 unified TLB (512 entries, 4-way). Shared by all cores: L3 unified cache (8 MB, 16-way); QuickPath interconnect (4 links @ 25.6 GB/s, 102.4 GB/s total) to other cores and the I/O bridge; DDR3 memory controller (3 x 64 bit @ 10.66 GB/s, 32 GB/s total, shared by all cores) to main memory]
Intel Core i7

 How many caches (including TLBs) are on this chip?
 High end of Intel’s “Core” brand: 731M transistors, 1366 pins.
 Quad-core Core i7 announced late 2008; a six-core addition was launched in March 2010
Review of Abbreviations

 Components of the virtual address (VA)
  TLBI: TLB index
  TLBT: TLB tag
  VPO: virtual page offset
  VPN: virtual page number

 Components of the physical address (PA)
  PPO: physical page offset (same as VPO)
  PPN: physical page number
  CO: byte offset within cache line
  CI: cache index
  CT: cache tag
Overview of Core i7 Address Translation

 [Figure: CPU issues a virtual address (VA) = 36-bit VPN + 12-bit VPO. The VPN splits into a 32-bit TLBT and 4-bit TLBI for the L1 TLB (16 sets, 4 entries/set). On a TLB miss, four 9-bit fields VPN1-VPN4 walk the page tables starting at CR3, PTE by PTE, yielding a 40-bit PPN. Physical address (PA) = 40-bit PPN + 12-bit PPO, interpreted as 40-bit CT, 6-bit CI, 6-bit CO for the L1 d-cache (64 sets, 8 lines/set); an L1 miss goes to L2, L3, and main memory, and the 32/64-bit result returns to the CPU]
TLB Translation

 1. Partition the 36-bit VPN into a 32-bit TLBT and a 4-bit TLBI.
 2. Is the PTE for this VPN cached in set TLBI?
 3. Yes: check permissions, build physical address from the 40-bit PPN and 12-bit VPO/PPO
 4. No: read the PTE (and others as necessary) from memory via page table translation and build the physical address

 [Figure: virtual address = 36-bit VPN (32-bit TLBT + 4-bit TLBI) + 12-bit VPO; a TLB hit yields the 40-bit PPN directly, a miss falls back to the page table walk; physical address = PPN + 12-bit PPO]
TLB Miss: Page Table Translation

 [Figure: CR3 points to the page global directory. The four 9-bit fields of the virtual address — VPN1 (page global directory), VPN2 (page upper directory), VPN3 (page middle directory), VPN4 (page table) — select the L1, L2, L3, and L4 PTEs in turn; the L4 PTE supplies the 40-bit PPN, and the 12-bit VPO becomes the PPO of the physical address]
BONUS SLIDES
PTE Formats

 Level 1-3 PTE (P=1): XD (63) | Unused (62-52) | Page table physical base addr (51-12) | Unused (11-9) | G (8) | PS (7) | Unused (6) | A (5) | CD (4) | WT (3) | U/S (2) | R/W (1) | P=1 (0)
 Level 4 PTE (P=1):   XD (63) | Unused (62-52) | Page physical base address (51-12) | Unused (11-9) | G (8) | 0 (7) | D (6) | A (5) | CD (4) | WT (3) | U/S (2) | R/W (1) | P=1 (0)
 P=0: remaining bits available for OS (page table / page location on disk)

 P:   page table is present in memory
 R/W: read-only or read+write
 U/S: user or supervisor mode access
 WT:  write-through or write-back cache policy for this page table
 CD:  cache disabled or enabled
 A:   accessed (set by MMU on reads and writes, cleared by OS)
 D:   dirty (set by MMU on writes, cleared by OS)
 PS:  page size 4K (0) or 4MB (1), for level 1 PTE only
 G:   global page (don’t evict from TLB on task switch)
 Page table physical base address: 40 most significant bits of physical page table address
 XD:  disable or enable instruction fetches from this page
L1 Cache Access

 Partition the physical address into CO, CI, and CT
 Use CT to determine if the line containing the word at address PA is cached in set CI
  No: check L2
  Yes: extract the word at byte offset CO and return it to the processor

 [Figure: 40-bit CT, 6-bit CI, 6-bit CO index the L1 d-cache (64 sets, 8 lines/set); an L1 miss goes to L2, L3, and main memory; an L1 hit returns the 32/64-bit data]
Speeding Up L1 Access: A “Neat Trick”

 Observation
  Bits that determine CI are identical in the virtual and physical address (the 12-bit VPO passes through translation with no change)
  Can index into the cache while address translation is taking place
  Generally we hit in the TLB, so the PPN bits (CT bits) are available quickly
  “Virtually indexed, physically tagged”
  Cache carefully sized to make this possible

 [Figure: virtual address = 36-bit VPN + 12-bit VPO; translation replaces VPN with PPN but leaves the low bits untouched, so CI and CO come straight from the VPO while the tag check waits only for CT = PPN]
Linux VM “Areas”

 pgd:
  Address of level 1 page table
 vm_prot:
  Read/write permissions for all pages in this area
 vm_flags:
  Shared/private status of all pages in this area

 [Figure: task_struct → mm_struct (mm, pgd) → linked list of vm_area_struct (vm_end, vm_start, vm_prot, vm_flags, vm_next), one per area of process virtual memory: shared libraries (0x40000000), data (0x0804a020), text (0x08048000), down to 0]
Linux Page Fault Handling

 Is the VA legal?
  i.e., is it in an area defined by a vm_area_struct?
  If not (#1), then signal segmentation violation

 Is the operation legal?
  i.e., can the process read/write this area?
  If not (#2), then signal protection violation

 Otherwise
  Valid address (#3): handle fault

 [Figure: three faulting references into process virtual memory, checked against the vm_area_struct list — (1) a read outside any area, (2) a write to the read-only text area, (3) a legal read in the data area]
Memory Mapping

 Creation of new VM area done via “memory mapping”
  Create new vm_area_struct and page tables for area

 Area can be backed by (i.e., get its initial values from):
  Regular file on disk (e.g., an executable object file)
   Initial page bytes come from a section of a file
  Nothing (e.g., .bss), aka “anonymous file”
   First fault will allocate a physical page full of 0's (demand-zero)
   Once the page is written to (dirtied), it is like any other page

 Dirty pages are swapped back and forth between memory and a special swap file.
 Key point: no virtual pages are copied into physical memory until they are referenced!
  Known as “demand paging”
  Crucial for time and space efficiency
User-Level Memory Mapping

 void *mmap(void *start, int len,
            int prot, int flags, int fd, int offset)

 [Figure: len bytes starting at offset (bytes) in the disk file specified by file descriptor fd are mapped to len bytes starting at start (or an address chosen by the kernel) in process virtual memory]
User-Level Memory Mapping

 void *mmap(void *start, int len,
            int prot, int flags, int fd, int offset)

 Map len bytes starting at offset offset of the file specified by file descriptor fd, preferably at address start
  start: may be 0 for “pick an address”
  prot: PROT_READ, PROT_WRITE, ...
  flags: MAP_PRIVATE, MAP_SHARED, ...

 Returns a pointer to the start of the mapped area (may not be start)

 Example: fast file-copy
  Useful for applications like Web servers that need to quickly copy files.
  mmap() allows file transfers without copying into user space.
mmap() Example: Fast File Copy

/*
 * a program that uses mmap to copy
 * the file input.txt to stdout
 */
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main() {
    struct stat stat;
    int fd, size;
    char *bufp;

    /* open the file & get its size */
    fd = open("./input.txt", O_RDONLY);
    fstat(fd, &stat);
    size = stat.st_size;

    /* map the file to a new VM area */
    bufp = mmap(0, size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* write the VM area to stdout */
    write(1, bufp, size);
    exit(0);
}
Exec() Revisited

 To run a new program p in the current process using exec():

 Free vm_area_struct’s and page tables for old areas

 Create new vm_area_struct’s and page tables for new areas
  Stack, BSS, data, text, shared libs.
  Text and data backed by the ELF executable object file
  BSS and stack initialized to zero (demand-zero)

 Set PC to entry point in .text
  Linux will fault in code and data pages as needed

 [Figure: process-specific data structures (page tables, task and mm structs) map the process VM — stack at 0xc0… (%esp), kernel code/data/stack above it (kernel VM, same for each process), memory-mapped region for shared libraries (libc.so .data and .text), runtime heap (via malloc) up to brk, then .data and .text loaded from p’s executable object file, with a forbidden region at 0 — onto physical memory]
Fork() Revisited

 To create a new process using fork():
  Make copies of the old process’s mm_struct, vm_area_struct’s, and page tables.
   At this point the two processes share all of their pages.
   How to get separate spaces without copying all the virtual pages from one space to another?
    – “Copy on Write” (COW) technique.
  Copy-on-write
   Mark PTE's of writeable areas as read-only
   Flag vm_area_struct’s for these areas as private “copy-on-write”
   Writes by either process to these pages will cause page faults
    – Fault handler recognizes copy-on-write, makes a copy of the page, and restores write permissions.

 Net result:
  Copies are deferred until absolutely necessary (i.e., when one of the processes tries to modify a shared page).
Discussion

 How does the kernel manage stack growth?
 How does the kernel manage heap growth?
 How does the kernel manage dynamic libraries?
 How can multiple user processes share writable data?
 How can mmap be used to access file contents in arbitrary (non-sequential) order?
9.9: Dynamic Memory Allocation

 Motivation
  Size of data structures may be known only at runtime

 Essentials
  Heap: demand-zero memory immediately after the bss area, grows upward
  Allocator manages the heap as a collection of variable-sized blocks.

 Two styles of allocators:
  Explicit: allocation and freeing both explicit
   C (malloc and free), C++ (new and delete)
  Implicit: allocation explicit, freeing implicit
   Java, Lisp, ML
   Garbage collection: automatically freeing unused blocks
  Tradeoffs: ease of (correct) use, runtime overhead
Heap Management

 Allocators request additional heap memory from the kernel using the sbrk() function:
  error = sbrk(amt_more)
 sbrk() extends the heap by moving the “brk” pointer

 [Figure: process layout — kernel virtual memory (protected from user code), stack (%esp), run-time heap (via malloc) topped by the “brk” ptr, uninitialized data (.bss), initialized data (.data), program text (.text), down to 0]
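
Note that on Linux sbrk() actually returns the previous break pointer rather than an error code; a minimal sketch of its use:

#include <unistd.h>

void *p = sbrk(4096);        /* grow the heap by 4096 bytes                */
if (p == (void *) -1) {      /* (void *) -1 signals failure                */
    /* out of memory: the kernel refused to extend the heap */
}
/* otherwise p points to the start of the newly added region */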
Heap Management

 Classic CS problem
  Handle arbitrary request sequences
  Respond immediately to allocation requests
  Meet alignment requirements
  Avoid modifying allocated blocks
  Maximize throughput and memory utilization
  Avoid fragmentation

 Specific issues to consider
  How are free blocks tracked?
  Which free block to pick for the next allocation?
  What to do with the remainder of a free block when part is allocated?
  How to coalesce freed blocks?
Heap Management

 Block format
  Allocators typically maintain a header, optional padding
  Stepping beyond block bounds can really mess up the allocator

 Format of allocated and free blocks:
  header: size | a
   a = 1: allocated block
   a = 0: free block
   size: block size
  payload: application data (allocated blocks only)
  optional padding
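
A small C sketch of one common way to pack this header, in the spirit of the CS:APP allocator (the macro names here are illustrative): because block sizes are multiples of the alignment, the low bits of the size word are free to hold the allocated flag.

#include <stdint.h>

typedef uint32_t word_t;                      /* one header word (assumed 4 bytes) */

#define PACK(size, alloc)  ((size) | (alloc)) /* combine size and a-bit  */
#define GET_SIZE(hdr)      ((hdr) & ~0x7u)    /* mask off low bits       */
#define GET_ALLOC(hdr)     ((hdr) & 0x1u)     /* a = 1: allocated        */

/* Example: a 24-byte allocated block:
 *   word_t hdr = PACK(24, 1);
 *   GET_SIZE(hdr) == 24, GET_ALLOC(hdr) == 1
 */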
9.10: Garbage Collection

 Related to dynamic memory allocation
  Garbage collection: automatically reclaiming allocated blocks that are no longer used
  Need arises when blocks are not explicitly freed
  Also a classic CS problem
9.11: Memory-Related Bugs

 Selected highlights
  Dereferencing bad pointers
  Reading uninitialized memory
  Overwriting memory
  Referencing nonexistent variables
  Freeing blocks multiple times
  Referencing freed blocks
  Failing to free blocks
Dereferencing Bad Pointers

 The classic scanf bug

int val;
scanf("%d", val);   /* BUG: passes val’s (garbage) value where a pointer is expected */
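
The corrected call passes the address of val:

int val;
scanf("%d", &val);   /* pass a pointer so scanf can store the result */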
Reading Uninitialized Memory

 Assuming that heap data is initialized to zero

/* return y = Ax */
int *matvec(int **A, int *x)
{
    int *y = malloc(N*sizeof(int));   /* malloc does not zero memory */
    int i, j;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            y[i] += A[i][j]*x[j];     /* BUG: accumulates into uninitialized y[i] */
    return y;
}
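
One fix, as a sketch: request zeroed memory with calloc (or explicitly set y[i] = 0 before the inner loop):

int *y = calloc(N, sizeof(int));   /* zero-initialized, unlike malloc */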
Overwriting Memory

 Allocating the (possibly) wrong sized object

int **p;
p = malloc(N*sizeof(int));        /* BUG: sizeof(int) may be smaller than sizeof(int *) */
for (i = 0; i < N; i++)
{
    p[i] = malloc(M*sizeof(int)); /* these stores can run past the allocated block */
}
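
The corrected allocation sizes the array by the pointer type it actually holds:

p = malloc(N*sizeof(int *));   /* N pointers, not N ints */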
Overwriting Memory

 Off-by-one error

int **p;
p = malloc(N*sizeof(int *));
for (i = 0; i <= N; i++)          /* BUG: <= runs one element past the end */
{
    p[i] = malloc(M*sizeof(int));
}
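
The fix is the loop bound: valid indices are 0 through N-1.

for (i = 0; i < N; i++)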
Overwriting Memory

 Not checking the max string size

char s[8];
gets(s);    /* reads “123456789” from stdin: 9 chars + ’\0’ into an 8-byte buffer */

 Basis for classic buffer overflow attacks
  1988 Internet worm
  Modern attacks on Web servers
  AOL/Microsoft IM war
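
A safer sketch bounds the read with fgets, which never writes more than the given size, including the terminating NUL:

char s[8];
fgets(s, sizeof(s), stdin);   /* reads at most 7 chars + ’\0’ */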
Overwriting Memory

 Referencing a pointer instead of the object it points to
  Code below is intended to remove the first item in a binary heap of *size items, then reheapify the remaining items.

int *BinheapDelete(int **binheap, int *size)
{
    int *packet;
    packet = binheap[0];
    binheap[0] = binheap[*size - 1];
    *size--;                          /* BUG: decrements the pointer, not the count */
    Heapify(binheap, *size, 0);
    return packet;
}

 Problem: postfix -- binds more tightly than unary *
  Programmer intended (*size)--
  Compiler interprets it as *(size--)
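
The one-character fix parenthesizes the dereference:

(*size)--;   /* decrement the count, not the pointer */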
Other Pointer Pitfalls

 Misunderstanding pointer arithmetic
  Code below is intended to scan an array of ints and return a pointer to the first occurrence of val.

int *search(int *p, int val)
{
    while (*p && *p != val)
        p += sizeof(int);   /* BUG: advances sizeof(int) elements, not one */
    return p;
}
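
Pointer arithmetic already scales by the element size, so the loop should simply increment:

p++;   /* advance by one int, i.e., sizeof(int) bytes */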
Referencing Nonexistent Variables

 Forgetting that local variables disappear when a function returns

int *foo()
{
    int val;
    return &val;   /* BUG: val’s stack frame is gone after the return */
}
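
One fix, sketched: allocate the value on the heap so it survives the return; the caller then owns it and must eventually free it:

int *foo()
{
    int *valp = malloc(sizeof(int));   /* lives until explicitly freed */
    return valp;
}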
Freeing Blocks Multiple Times

 Nasty!

x = malloc(N*sizeof(int));
/* do some stuff with x */
free(x);

y = malloc(M*sizeof(int));
/* do some stuff with y */
free(x);   /* BUG: double free — x was already released */
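
A common defensive habit (a sketch, not a cure-all): NULL the pointer after freeing, since free(NULL) is defined to be a no-op:

free(x);
x = NULL;   /* a second free(x) is now harmless */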
Referencing Freed Blocks

 Evil!

x = malloc(N*sizeof(int));
/* do some stuff with x */
free(x);
...
y = malloc(M*sizeof(int));
for (i = 0; i < M; i++)
    y[i] = x[i]++;   /* BUG: x is dangling — its block may now belong to y */
Failing to Free Blocks (Memory Leaks)

 Slow, long-term killer!

foo()
{
    int *x = malloc(N*sizeof(int));
    ...
    return;   /* BUG: x goes out of scope and the block is never freed */
}
Failing to Free Blocks (Memory Leaks)

 Freeing only part of a data structure

struct list {
    int val;
    struct list *next;
};

foo() {
    struct list *head = malloc(sizeof(struct list));
    head->val = 0;
    head->next = NULL;
    /* create, manipulate rest of the list */
    ...
    free(head);   /* BUG: frees only the head node; the rest of the list leaks */
    return;
}
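
A sketch of a complete teardown walks the list, saving each next pointer before freeing the node that holds it:

void free_list(struct list *head)
{
    while (head != NULL) {
        struct list *next = head->next;   /* save before freeing */
        free(head);
        head = next;
    }
}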
Before You Sell Book Back…

 Consider useful content of remaining chapters

 Chapter 10: System-level I/O
  Unix file I/O
  Opening and closing files
  Reading and writing files
  Reading file metadata
  Sharing files
  I/O redirection
  Standard I/O
Chapter 11

 Network programming
  Client-server programming model
  Networks
  Global IP Internet: IP addresses, domain names, DNS servers
  Sockets
  Web servers
Chapter 12

 Concurrent programming
  CP with processes (e.g., fork, exec, waitpid)
  CP with I/O multiplexing
   Ask kernel to suspend process, returning control when certain I/O events have occurred
  CP with threads, shared variables, semaphores for synchronization
Class Wrap-up

 Final exam in testing center: both days of finals
  Check testing center hours, days!
  50 multiple choice questions
  Covers all chapters (4-7 questions each from chapters 2-9)
  3-hour time limit: but unlikely to use all of it
  Review midterm solutions, chapter review questions
  Remember:
   Final exam score replaces lower midterm scores
   4 low quizzes will be dropped in computing overall quiz score

 Assignments
  Deadline for late labs is tomorrow (15 June).
  Notify instructor immediately if you are still working on a lab
  All submissions from here on out: send instructor an email
Reminder + Request

 From class syllabus:
  Must complete all labs to receive a passing grade in class
  Must receive a passing grade on the final to pass the class

 Please double check all scores on Blackboard
  Contact TA for problems with labs, homework
  Contact instructor for problems with posted exam or quiz scores
Parting Thought
Again and again I admonish my students both in Europe and
in America: “Don’t aim at success – the more you aim at it
and make it a target, the more you are going to miss it. For
success, like happiness, cannot be pursued; it must ensue,
and it only does so as the unintended side-effect of one’s
personal dedication to a cause greater than oneself or as the
by-product of one’s surrender to a person other than oneself.
Happiness must happen, and the same holds for success: you
have to let it happen by not caring about it. I want you to
listen to what your conscience commands you to do and go
on to carry it out to the best of your knowledge. Then you
will live to see that in the long run – in the long run, I say! –
success will follow you precisely because you had forgotten
to think of it.”
Viktor Frankl