15-213 Recitation 7 Office Hours: Wed 2:00-3:00PM Greg Reshko March 31

15-213 Recitation 7
Greg Reshko
Office Hours: Wed 2:00-3:00PM
March 31st, 2003
Outline

Virtual Memory
 - Paging
 - Page faults
 - TLB
 - Address translation
Malloc Lab
 - Lots of hints and ideas
Virtual Memory

Reasons
 - Use RAM as a cache for disk
 - Easier memory management
 - Protection
 - Enable 'partial swapping'
 - Share memory efficiently
Physical memory

[Figure: the CPU sends physical addresses directly to memory, whose locations are numbered 0, 1, ..., N-1.]
Virtual Memory

[Figure: the CPU issues virtual addresses; the page table maps each one to a physical address in memory or to a location on disk.]
Paging: Purpose

Solves two problems
 - External memory fragmentation
 - Long delay to swap a whole process

Divide memory more finely
 - Page – small logical memory region
 - Frame – small physical memory region

Any page can map to any frame
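Paging: Address Mapping

[Figure: a logical address is split into Page and Offset. The page number indexes the page table (entries such as f29, f34), which supplies a Frame; Frame and Offset together form the physical address.]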
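
The mapping on this slide can be sketched in a few lines of C. This is only an illustration: the 6-bit offset, the four-entry table, and the names `page_table` and `translate` are assumptions made for the example, not anything from the lecture code.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 6u                        /* assumed: 64-byte pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)
#define NUM_PAGES   4u                        /* assumed: tiny table for illustration */

/* Hypothetical page table: index = page number, value = frame number. */
static const uint32_t page_table[NUM_PAGES] = { 29, 34, 7, 8 };

/* Translate a logical address: swap the page number for a frame number
 * and carry the offset over unchanged. */
static uint32_t translate(uint32_t logical)
{
    uint32_t page   = logical >> OFFSET_BITS;
    uint32_t offset = logical & OFFSET_MASK;
    assert(page < NUM_PAGES);                 /* no fault handling in this sketch */
    return (page_table[page] << OFFSET_BITS) | offset;
}
```

For instance, logical address 0x7F is page 1, offset 63, so under these assumptions it lands in frame 34 at the same offset.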
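Paging: Multi-Level

[Figure: the logical address is split into P1, P2, and Offset. P1 indexes the Page Directory (entries such as f07, f08), which selects one of the Page Tables; P2 indexes that page table (entries such as f29, f34, f25), which supplies the Frame; Frame and Offset together form the physical address.]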
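
A two-level walk might look roughly like the following. The field widths (4-bit P1, 4-bit P2, 8-bit offset) and all type and function names here are assumptions chosen to keep the sketch small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed split of a 16-bit logical address: 4-bit P1, 4-bit P2, 8-bit offset. */
#define P_BITS   4u
#define OFF_BITS 8u
#define ENTRIES  (1u << P_BITS)
#define NO_FRAME UINT32_MAX                   /* marks an unmapped entry */

typedef struct { uint32_t frame[ENTRIES]; } page_table_t;
typedef struct { page_table_t *table[ENTRIES]; } page_dir_t;

/* Walk directory -> page table -> frame. Returns NO_FRAME if unmapped. */
static uint32_t walk(const page_dir_t *dir, uint32_t logical)
{
    assert(dir != NULL);
    uint32_t p1  = (logical >> (OFF_BITS + P_BITS)) & (ENTRIES - 1u);
    uint32_t p2  = (logical >> OFF_BITS) & (ENTRIES - 1u);
    uint32_t off = logical & ((1u << OFF_BITS) - 1u);

    const page_table_t *pt = dir->table[p1];
    if (pt == NULL || pt->frame[p2] == NO_FRAME)
        return NO_FRAME;
    return (pt->frame[p2] << OFF_BITS) | off;
}
```

One reason to prefer this layout is that a directory slot left as NULL costs no page table at all, so sparse address spaces stay cheap.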
Page Faults

Virtual address not in memory
 - This means it is on disk
 - Go to disk, fetch the page, load it into memory, get back to the process

[Figure: two copies of the CPU / page table / memory / disk diagram, showing the state before and after the fault is serviced.]
Copy-on-Write

"Simulated" Copy
 - Copy page table entries to new process
 - Mark PTEs read-only in old and new

What really happens
 - Process writes to page
 - Page fault handler is called
 - Copy page into empty frame
 - Mark read-write in both PTEs

Result
 - Faster and less work
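
The steps above can be mimicked in user-space C with a toy page table. Everything here (`pte_t`, `cow_fork`, `cow_write`, the reference counts) is a hypothetical simulation, not kernel code. It also makes one simplifying choice that differs slightly from the slide: only the writer's PTE is flipped to read-write at fault time, and the other PTE becomes read-write when its owner next writes and finds it is the sole user of the frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE  64
#define NUM_FRAMES 4

static unsigned char frames[NUM_FRAMES][PAGE_SIZE]; /* toy physical memory */
static int refcount[NUM_FRAMES];                    /* PTEs pointing at each frame */
static int next_free = 1;                           /* frame 0 is set up by the caller */

typedef struct { int frame; bool writable; } pte_t;

/* "Simulated" copy at fork: duplicate the PTE, mark both read-only. */
static pte_t cow_fork(pte_t *parent)
{
    parent->writable = false;
    refcount[parent->frame]++;
    return *parent;
}

/* A store to a read-only PTE plays the role of the page fault. */
static void cow_write(pte_t *pte, int offset, unsigned char value)
{
    if (!pte->writable) {
        if (refcount[pte->frame] > 1) {             /* still shared: copy now */
            assert(next_free < NUM_FRAMES);
            int fresh = next_free++;
            memcpy(frames[fresh], frames[pte->frame], PAGE_SIZE);
            refcount[pte->frame]--;
            refcount[fresh] = 1;
            pte->frame = fresh;
        }
        pte->writable = true;                       /* this mapping is private now */
    }
    frames[pte->frame][offset] = value;
}
```

Pages that are never written are never copied, which is where the "faster and less work" result comes from.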
Relevance to Fork

Why is paging good for fork and exec?
 - Fork produces two very similar processes
    Same code, data, and stack
 - Copying all pages is expensive
    Many will never be modified (especially in exec)
 - Share pages instead
    i.e. just mark them as read only and duplicate when necessary
Address Translation:
General Idea

Mapping between virtual and physical addresses

[Figure: the processor issues a virtual address (V) to the hardware address translation mechanism, which is part of the on-chip memory management unit (MMU). On success it produces a physical address (P) for main memory. On a page fault, the fault handler runs and the OS transfers the page from secondary memory to main memory (only on a miss).]
Address Translation:
In terms of address itself

Higher bits of the address get mapped from virtual address to physical.
Lower bits (page offset) stay the same.

virtual address:   bits n–1..p = virtual page number,   bits p–1..0 = page offset
   (address translation maps the page number only)
physical address:  bits m–1..p = physical page number,  bits p–1..0 = page offset
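TLB

Translation Lookaside Buffer
 - Small hardware cache in MMU
 - Maps virtual page numbers to physical page numbers

[Figure: the CPU sends a VA to the TLB lookup. On a TLB hit the PA goes straight to the cache; on a TLB miss the full translation is performed first. On a cache hit the data returns to the CPU; on a cache miss it comes from main memory.]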
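Address Translation with TLB

[Figure: the virtual address (bits n–1..p = virtual page number, bits p–1..0 = page offset) is looked up in the TLB, whose entries hold valid, tag, and physical page number. A tag match on a valid entry is a TLB hit and yields the physical address. The physical address is then split into tag, index, and byte offset for the cache, whose lines hold valid, tag, and data. A tag match on a valid line is a cache hit and yields the data.]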
Example

Motivation:


A detailed example of end-to-end address translation
Same as in the book and lecture


I just want to make sure it makes perfect sense
Do practice problems at home

Ask questions if anything is unclear
Example: Description

Memory is byte addressable
Accesses are to 1-byte words
Virtual addresses are 14 bits
Physical addresses are 12 bits
Page size is 64 bytes
TLB is 4-way set associative with 16 total entries
L1 d-cache is physically addressed and direct mapped, with 4-byte line size and 16 total sets
Example: Addresses

14-bit virtual addresses
12-bit physical addresses
Page size = 64 bytes

Virtual address:   bits 13-6 = VPN (Virtual Page Number),   bits 5-0 = VPO (Virtual Page Offset)
Physical address:  bits 11-6 = PPN (Physical Page Number),  bits 5-0 = PPO (Physical Page Offset)
Example: Page Table

VPN  PPN  Valid    VPN  PPN  Valid
00   28   1        08   13   1
01   –    0        09   17   1
02   33   1        0A   09   1
03   02   1        0B   –    0
04   –    0        0C   –    0
05   16   1        0D   2D   1
06   –    0        0E   11   1
07   –    0        0F   0D   1
…
Example: TLB

16 entries
4-way associative

Virtual address:  bits 13-8 = TLBT, bits 7-6 = TLBI (TLBT and TLBI together are the VPN), bits 5-0 = VPO

Set  Tag  PPN  Valid    Tag  PPN  Valid    Tag  PPN  Valid    Tag  PPN  Valid
0    03   –    0        09   0D   1        00   –    0        07   02   1
1    03   2D   1        02   –    0        04   –    0        0A   –    0
2    02   –    0        08   –    0        06   –    0        03   –    0
3    07   –    0        03   0D   1        0A   34   1        02   –    0
Example: Cache

16 lines
4-byte line size
Direct mapped

Physical address:  bits 11-6 = CT (same bits as the PPN), bits 5-2 = CI, bits 1-0 = CO (CI and CO together are the PPO)

Index  Tag  Valid  B0  B1  B2  B3    Index  Tag  Valid  B0  B1  B2  B3
0      19   1      99  11  23  11    8      24   1      3A  00  51  89
1      15   0      –   –   –   –     9      2D   0      –   –   –   –
2      1B   1      00  02  04  08    A      2D   1      93  15  DA  3B
3      36   0      –   –   –   –     B      0B   0      –   –   –   –
4      32   1      43  6D  8F  09    C      12   0      –   –   –   –
5      0D   1      36  72  F0  1D    D      16   1      04  96  34  15
6      31   0      –   –   –   –     E      13   1      83  77  1B  D3
7      16   1      11  C2  DF  03    F      14   0      –   –   –   –
Example: Address Translation

Virtual Address 0x03D4
Split into offset and page number
 - 0x03D4 = 00001111010100
 - VPO = 010100 = 0x14
 - VPN = 00001111 = 0x0F
Let's see if this is in the TLB
 - 0x03D4 = 00001111010100
 - TLBI = 11 = 0x03
 - TLBT = 000011 = 0x03
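
The bit slicing above is easy to check with shifts and masks. The sketch below hard-codes the widths from this example (6 offset bits, 2 TLB index bits); the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t vpn, vpo, tlbi, tlbt; } va_fields_t;

/* Field widths from the example: 14-bit VA, 64-byte pages (6 offset bits),
 * 16-entry 4-way TLB => 4 sets => 2 index bits. */
static va_fields_t split_va(uint32_t va)
{
    va_fields_t f;
    assert(va < (1u << 14));
    f.vpo  = va & 0x3Fu;        /* bits 5..0  */
    f.vpn  = va >> 6;           /* bits 13..6 */
    f.tlbi = f.vpn & 0x3u;      /* low 2 bits of the VPN */
    f.tlbt = f.vpn >> 2;        /* remaining 6 bits of the VPN */
    return f;
}
```

Calling `split_va(0x03D4)` gives VPO 0x14, VPN 0x0F, TLBI 0x3, and TLBT 0x03, matching the values worked out by hand.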
Example: Address Translation

Virtual Address 0x03D4
TLB lookup
 - This address is in the TLB (second entry, set 0x3)
 - PPN = 0x0D = 001101
 - PPO = VPO = 0x14 = 010100
 - PA = PPN concatenated with PPO = 001101010100

Cache
 - PA = 0x354 = 001101010100
 - CT = 001101 = 0x0D
 - CI = 0101 = 0x05
 - CO = 00 = 0x0
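
As a sanity check on this step, here is a small C sketch that joins PPN and PPO into the physical address and then slices out the cache fields. Note that the join is a shift-and-OR (concatenation), not an arithmetic sum. The names `pa_fields_t` and `build_pa` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t pa, ct, ci, co; } pa_fields_t;

/* Widths from the example: 6-bit PPN, 6-bit PPO, 4-byte lines, 16 sets. */
static pa_fields_t build_pa(uint32_t ppn, uint32_t ppo)
{
    pa_fields_t f;
    assert(ppn < (1u << 6) && ppo < (1u << 6));
    f.pa = (ppn << 6) | ppo;    /* concatenate PPN and PPO */
    f.co = f.pa & 0x3u;         /* bits 1..0: byte within the line */
    f.ci = (f.pa >> 2) & 0xFu;  /* bits 5..2: cache set index */
    f.ct = f.pa >> 6;           /* bits 11..6: cache tag */
    return f;
}
```

With PPN 0x0D and PPO 0x14 this produces PA 0x354, CT 0x0D, CI 0x5, and CO 0x0.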
Example: Address Translation

Virtual Address 0x03D4
Cache Hit
 - Tag in set 0x5 matches CT
 - Data at offset CO is 0x36
 - Data returned to MMU
 - Data returned to CPU
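
The final lookup can be sketched as a direct-mapped cache read. Only two lines of the example's cache table are filled in below (sets 0x5 and 0xA); every other set is left invalid to keep the sketch short. The type and function names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t tag; bool valid; uint8_t bytes[4]; } cache_line_t;

/* Two lines copied from the example's cache table; the rest stay invalid. */
static const cache_line_t cache[16] = {
    [0x5] = { 0x0D, true, { 0x36, 0x72, 0xF0, 0x1D } },
    [0xA] = { 0x2D, true, { 0x93, 0x15, 0xDA, 0x3B } },
};

/* Direct-mapped lookup: returns true and stores the byte on a hit. */
static bool cache_read(uint32_t pa, uint8_t *out)
{
    assert(out != NULL);
    uint32_t co = pa & 0x3u;
    uint32_t ci = (pa >> 2) & 0xFu;
    uint32_t ct = (pa >> 6) & 0x3Fu;
    const cache_line_t *line = &cache[ci];
    if (!line->valid || line->tag != ct)
        return false;           /* miss: invalid line or tag mismatch */
    *out = line->bytes[co];
    return true;
}
```

Reading PA 0x354 hits in set 0x5 and returns byte B0, which is 0x36.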
Lab 6 Hints and Ideas

Due April 16


40 points for performance
20 points for correctness
5 points for style

Get the correctness points this week




Get a feel for how hard the lab is
You'll probably need the time
Starting a couple days before is a BAD idea!
How to get the correctness points

We provide mm-helper.c which contains the code from the book
 - malloc works
 - free works (with coalescing)
 - Heap checking doesn't work
 - realloc doesn't work

Implement a dumb version of realloc
 - malloc new block, memcpy, free old block, return new block
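
A minimal sketch of that recipe follows. It does not use the lab's allocator: `toy_malloc`, `toy_free`, and `toy_size` are hypothetical stand-ins built on the C library so the example compiles on its own. In the lab you would call your own malloc and free and read the old size from the block header instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in allocator (hypothetical names): each block remembers its
 * payload size in a small header placed in front of the payload. */
typedef union { size_t size; max_align_t align; } toy_header_t;

static void *toy_malloc(size_t size)
{
    toy_header_t *h = malloc(sizeof *h + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1;
}

static void toy_free(void *ptr)
{
    if (ptr != NULL) free((toy_header_t *)ptr - 1);
}

static size_t toy_size(const void *ptr)
{
    assert(ptr != NULL);
    return ((const toy_header_t *)ptr - 1)->size;
}

/* The "dumb" realloc: malloc new block, memcpy, free old block, return. */
static void *naive_realloc(void *ptr, size_t size)
{
    if (ptr == NULL) return toy_malloc(size);
    if (size == 0) { toy_free(ptr); return NULL; }

    void *fresh = toy_malloc(size);
    if (fresh == NULL) return NULL;          /* old block stays valid */

    size_t old = toy_size(ptr);
    memcpy(fresh, ptr, old < size ? old : size);
    toy_free(ptr);
    return fresh;
}
```

The one detail that is easy to get wrong is the copy length: it has to be the smaller of the old and new sizes, or shrinking a block overruns the new one.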
How to get the correctness points

Implement heap checking
 - Have to add a request id field to each allocated block (tricky)
 - Hint: need padding to maintain 8-byte alignment of the user pointer

In the book's code bp is always the same as the user pointer

  | Size+a | Payload… | Footer |
           ^ bp

The 4 bytes immediately before bp contain the size of the payload
 - The 3 least significant bits of size are unused (because of alignment)
 - The lowest bit indicates whether the block is allocated or not
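
Packing the size and the allocated bit into one word can be sketched as below. The helper names are made up for this example (the book does the same job with macros), and `memcpy` is used for the word access only to keep the sketch free of alignment and aliasing concerns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WSIZE 4u   /* header word size in bytes */

/* Pack a size (multiple of 8) and an allocated bit into one 4-byte word. */
static uint32_t pack(uint32_t size, uint32_t alloc)
{
    assert((size & 0x7u) == 0 && alloc <= 1u);
    return size | alloc;
}

/* The header word lives in the 4 bytes immediately before bp. */
static uint32_t read_header(const void *bp)
{
    uint32_t word;
    memcpy(&word, (const char *)bp - WSIZE, sizeof word);
    return word;
}

static void write_header(void *bp, uint32_t word)
{
    memcpy((char *)bp - WSIZE, &word, sizeof word);
}

/* Mask off the low 3 bits to get the size; the lowest bit is the flag. */
static uint32_t block_size(const void *bp)  { return read_header(bp) & ~0x7u; }
static int      block_alloc(const void *bp) { return (int)(read_header(bp) & 0x1u); }
```

Because sizes are multiples of 8, the low three bits of the size are always zero, which is what frees them up to carry the flag.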
How to get the correctness points

Need to change block layout to look like this:

  | ID | Size+a | Payload… | Footer |
                ^ bp

This changes how the implicit list has to be traversed
But size is at the same place relative to bp
How to get the correctness points

Or change block layout to look like this:

  | Size+a | ID | Payload… | Footer |
                ^ bp

All accesses to what was size now access id, but you can be clever and make size 4 bytes larger
Could even make bp point to id
Most code would just work
How to get the correctness points

Once malloc, free, and realloc work with the id field, write heapcheck
 - Iterate over the whole heap and print out allocated blocks
 - Need to read the id field…

That's it for correctness
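
One possible shape for such a heap walk is sketched below. It assumes the "ID, Size+a, Payload, Footer" layout, assumes that the stored size is the total block size in bytes so that adding it to bp reaches the next block, and assumes a header with size 0 marks the end of the heap. All of those are assumptions for the sketch; adjust them to whatever your mm.c actually does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout of each block, with bp pointing at the payload:
 *   [id: 4 bytes][size|alloc: 4 bytes][payload ...][footer]
 * A header whose size field is 0 ends the heap. */
static uint32_t word_at(const char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Walk the implicit list, print every allocated block, return the count. */
static int heapcheck(const char *first_bp)
{
    assert(first_bp != NULL);
    int allocated = 0;
    for (const char *bp = first_bp; ; ) {
        uint32_t hdr  = word_at(bp - 4);
        uint32_t size = hdr & ~0x7u;
        if (size == 0)
            break;                            /* end-of-heap marker */
        if (hdr & 0x1u) {
            printf("id %u: %u bytes at %p\n",
                   (unsigned)word_at(bp - 8), (unsigned)size, (const void *)bp);
            allocated++;
        }
        bp += size;                           /* step to the next block */
    }
    return allocated;
}
```

Returning the count as well as printing makes it easy to compare against the number of outstanding requests while debugging.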
Hints

Remember that pointer arithmetic behaves differently depending on the type of pointer
Consider using structs/unions to eliminate some messy pointer code
Get things working with the short trace file first:
./mdriver -f short1-bal.rep
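
To make the first two hints concrete, here is a small sketch: the same `p + 1` moves a different number of bytes depending on the pointer type, and a header struct can replace hand-written offset arithmetic. The struct and its field names are purely illustrative and are not taken from mm-helper.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance covered by "p + 1" when p is a char pointer. */
static ptrdiff_t step_as_char(void *base)
{
    char *p = base;
    return (char *)(p + 1) - (char *)base;
}

/* Byte distance covered by "p + 1" when p is a uint32_t pointer. */
static ptrdiff_t step_as_u32(void *base)
{
    uint32_t *p = base;
    return (char *)(p + 1) - (char *)base;
}

/* Hypothetical header struct: named fields instead of casts and offsets. */
typedef struct {
    uint32_t id;
    uint32_t size_alloc;
} block_header_t;

/* Step back from the payload pointer to the header in front of it. */
static block_header_t *header_of(void *bp)
{
    assert(bp != NULL);
    return (block_header_t *)((char *)bp - sizeof(block_header_t));
}
```

The habit worth forming is to cast to `char *` before doing any byte-granular arithmetic, and only then cast to the type you want to read.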
To get the best performance



Red-Black trees
Ternary trees
Other interesting data structures
That’s it for hints…
Good Luck!