Assignment Slides

Seoul National University
Computer Architecture Project #2
Cache Simulator
Objectives

To understand cache memory
• Organization
  – Set associativity
• Operation
  – Cache read & write, hit & miss
  – LRU replacement policy
• Performance
  – Hit/miss ratio, miss penalty


To develop your own cache simulator
[Figure: the memory access pattern, the cache organization, and the display option are inputs to the Cache Simulator, which outputs hit/miss results and performance.]
General Cache Organization (S, E, B)
• S = 2^s sets
• E = 2^e lines per set
• B = 2^b bytes per cache block (the data)
• Each line holds a valid bit (v), a tag, and B data bytes (0, 1, 2, …, B-1)

If E = 1 (e = 0), “Direct Mapped Cache”
else if S = 1 (s = 0), “Fully Associative Cache”
else “E-Way Set Associative Cache”

Cache size: C = S x E x B data bytes
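The (S, E, B) parameters can be sketched in C as follows. This is only an illustration: the names cache_geometry_t, cache_size_bytes, and cache_kind are hypothetical, and the classification assumes that a direct mapped cache has one line per set (E = 1, i.e. e = 0) and a fully associative cache has a single set (S = 1, i.e. s = 0).

```c
#include <stddef.h>

/* Hypothetical geometry record; the fields follow the slide's symbols. */
typedef struct {
    unsigned s;  /* set index bits:  S = 2^s sets            */
    unsigned e;  /* line index bits: E = 2^e lines per set   */
    unsigned b;  /* block bits:      B = 2^b bytes per block */
} cache_geometry_t;

/* C = S x E x B data bytes (valid bits and tags are not counted). */
size_t cache_size_bytes(cache_geometry_t g)
{
    return ((size_t)1 << g.s) * ((size_t)1 << g.e) * ((size_t)1 << g.b);
}

/* Classify the organization from the number of lines and sets. */
const char *cache_kind(cache_geometry_t g)
{
    if (g.e == 0)
        return "Direct Mapped Cache";      /* E = 1 */
    if (g.s == 0)
        return "Fully Associative Cache";  /* S = 1 */
    return "E-Way Set Associative Cache";
}
```

For example, s = 4, e = 0, b = 4 describes a direct mapped cache with 16 x 1 x 16 = 256 data bytes.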
E-way Set Associative Cache (Here: E = 2)
E = 2: Two lines per set
Assume that cache block size is 8 bytes
Address of short int:  | t bits (tag) | 0…01 (set index) | 100 (block offset) |

[Figure: four sets of two lines each; every line holds a valid bit (v), a tag, and data bytes 0-7. The set index bits 0…01 select the set ("find set").]
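The "find set" step amounts to slicing the address into tag, set index, and block offset. A minimal C sketch, assuming s set index bits and b block offset bits with s + b < 64; the names addr_fields_t and split_address are hypothetical:

```c
#include <stdint.h>

/* Hypothetical container for the three address fields. */
typedef struct {
    uint64_t tag;     /* remaining high-order t bits      */
    uint64_t set;     /* middle s bits: selects the set   */
    uint64_t offset;  /* low b bits: byte within a block  */
} addr_fields_t;

/* Split an address; assumes s + b < 64 so every shift is well defined. */
addr_fields_t split_address(uint64_t addr, unsigned s, unsigned b)
{
    addr_fields_t f;
    f.offset = addr & ((UINT64_C(1) << b) - 1);
    f.set    = (addr >> b) & ((UINT64_C(1) << s) - 1);
    f.tag    = addr >> (s + b);
    return f;
}
```

With 8-byte blocks (b = 3) and, as an assumed value, s = 2, the address 0x0C (binary 01100) splits into set index 01 and block offset 100, matching the bit pattern on the slide.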
E-way Set Associative Cache (Here: E = 2)
E = 2: Two lines per set
Assume that cache block size is 8 bytes
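Address of short int:  | t bits (tag) | 0…01 (set index) | 100 (block offset) |

[Figure: the two lines of the selected set, each with a valid bit (v), a tag, and data bytes 0-7.]
• compare both: valid? + match: yes = hit
• block offset: selects the bytes within the matching block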
E-way Set Associative Cache (Here: E = 2)
E = 2: Two lines per set
Assume that cache block size is 8 bytes
Address of short int:  | t bits (tag) | 0…01 (set index) | 100 (block offset) |

[Figure: the two lines of the selected set, each with a valid bit (v), a tag, and data bytes 0-7.]
• compare both: valid? + match: yes = hit
• block offset 100: the short int (2 bytes) is here, at bytes 4-5 of the block

No match:
• One line in set is selected for eviction and replacement
• Replacement policies: random, least recently used (LRU), …
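The hit test ("valid? + match") and the search for a free line before any eviction could look like the sketch below. The names line_t, find_hit, and find_free are hypothetical; choosing a victim when the set is full is left to the replacement policy.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      valid;  /* the "v" bit */
    uint64_t tag;
} line_t;

/* Return the index of the line that hits, or -1 on a miss. */
int find_hit(const line_t *set, size_t E, uint64_t tag)
{
    for (size_t i = 0; i < E; i++)
        if (set[i].valid && set[i].tag == tag)
            return (int)i;
    return -1;
}

/* Return the index of an invalid (free) line, or -1 if the set is full,
   in which case a replacement policy must select a line to evict. */
int find_free(const line_t *set, size_t E)
{
    for (size_t i = 0; i < E; i++)
        if (!set[i].valid)
            return (int)i;
    return -1;
}
```

Note that a tag match on an invalid line does not count as a hit, which is why both conditions are checked together.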
LRU Replacement Policy

Theoretically…
Address:  1  2  3  4  1  2  3  1  2  3  4  5
Set:      1  2  3  4  1  2  3  1  2  3  4  5
             1  2  3  4  1  2  3  1  2  3  4
                1  2  3  4  1  2  3  1  2  3

Practically…
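One common practical way to implement LRU is to stamp each line with the value of an access counter and evict the line with the smallest stamp. The sketch below takes that approach as an assumption; it is not necessarily the scheme the slide illustrates. The names lru_line_t, lru_stats_t, and lru_access are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      valid;
    uint64_t tag;
    uint64_t last_used;  /* access counter value at the last touch */
} lru_line_t;

typedef struct {
    unsigned hits, misses, evictions;
} lru_stats_t;

/* Access one tag in a single set of E lines.
   `now` must be strictly larger on every call. */
void lru_access(lru_line_t *set, size_t E, uint64_t tag, uint64_t now,
                lru_stats_t *st)
{
    size_t victim = 0;

    for (size_t i = 0; i < E; i++) {
        if (set[i].valid && set[i].tag == tag) {  /* hit: refresh stamp */
            set[i].last_used = now;
            st->hits++;
            return;
        }
    }

    st->misses++;
    for (size_t i = 0; i < E; i++) {
        if (!set[i].valid) {                      /* free line: use it */
            victim = i;
            break;
        }
        if (set[i].last_used < set[victim].last_used)
            victim = i;                           /* oldest stamp so far */
    }
    if (set[victim].valid)
        st->evictions++;

    set[victim].valid = 1;
    set[victim].tag = tag;
    set[victim].last_used = now;
}
```

Replaying the address sequence 1 2 3 4 1 2 3 1 2 3 4 5 on a set of three lines (an assumed set size) gives 3 hits, 9 misses, and 6 evictions, and leaves 3, 4, and 5 in the set.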
Performance

(Average Access Time) = (Hit Time) + (Miss Rate) × (Miss Penalty)
= (Hit Time) + [1 – (Hit Rate)] × (Miss Penalty)

Example




Suppose cache hit time is 1 cycle,
Miss penalty is 100 cycles,
and hit rate is 97%.
Then average access time is:
1 cycle + ( 1 – 0.97 ) × 100 cycles = 1 + 0.03 × 100 = 4 cycles.
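The formula translates directly into code. A minimal sketch, with the hypothetical function name average_access_time and all times expressed in cycles:

```c
/* Average access time = hit time + miss rate x miss penalty.
   hit_rate is a fraction between 0.0 and 1.0. */
double average_access_time(double hit_time, double miss_penalty,
                           double hit_rate)
{
    return hit_time + (1.0 - hit_rate) * miss_penalty;
}
```

Calling average_access_time(1.0, 100.0, 0.97) reproduces the 4 cycles of the example, up to floating-point rounding.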
Requirements of the cache simulator (1)

The cache simulator (hereinafter referred to as CSIM) shall support arbitrary numbers of sets and lines, and an arbitrary block size.
• You should implement a way to provide the numbers of sets and lines, and the block size, as inputs to CSIM.

CSIM shall read a trace file line by line and process it.
• You should determine whether each memory operation is a cache hit or miss.
• You should implement the LRU replacement policy.

CSIM shall report the result of the cache simulation.
• You should report these three basic results: the numbers of hits, misses, and evictions.
• You should be able to report the average access time of the cache simulation.
• You should be able to report whether each memory access in the trace file results in a cache hit or miss.
Restrictions & Advice

Implement a method for input parameters.
• You should implement it by argument passing. (full credit)
• If you can’t, you can use standard input such as scanf(). (low credit)

Evaluate only data cache performance.
• Therefore, you should ignore instruction loads.
• You should assume that memory accesses are properly aligned. Therefore, you can ignore the requested size in the trace file.
• You should evaluate your CSIM with at least 3 different sets of trace data. You can use the one provided with this project.

Calculate the average access time using the assumptions below:
• Hit time = 1 cycle, miss penalty = 100 cycles.

Compile your CSIM without warnings.
How to trace memory accesses

“valgrind”
• GPL-licensed programming tool for memory debugging, memory leak detection, and profiling. (from http://en.wikipedia.org/wiki/Valgrind)
• Usage: >> valgrind --log-fd=1 --tool=lackey -v --trace-mem=yes ls -l
  – Valgrind prints out the memory accesses of “ls -l” on stdout, so you need to capture them by:
    >> valgrind --log-fd=1 --tool=lackey -v --trace-mem=yes ls -l > ls.trace
• Output format: [space]operation address,size

Output          Type              Example             Naccess  [space]
I 0400d7d4,8    Instruction load  All instructions    1        X
L 04f6b868,8    Data Load         movl (%eax), %ebx   1        O
S 7ff0005c8,8   Data Store        movl %eax, (%ebx)   1        O
M 0421c7f0,4    Data Modify       incl (%ecx)         2        O
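A trace line in the "[space]operation address,size" format can be parsed with sscanf. The sketch below is one possible approach; trace_rec_t, parse_trace_line, and data_accesses are hypothetical names. Lines that do not match the format, such as valgrind's own log messages, are rejected.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    char     op;    /* 'I', 'L', 'S' or 'M'                */
    uint64_t addr;  /* hexadecimal address from the trace  */
    unsigned size;  /* requested size in bytes             */
} trace_rec_t;

/* Parse one trace line. Returns 1 on success, 0 if the line does not
   have the form "[space]operation address,size". */
int parse_trace_line(const char *line, trace_rec_t *rec)
{
    unsigned long long addr;

    /* The leading blank in the format skips the optional [space]. */
    if (sscanf(line, " %c %llx,%u", &rec->op, &addr, &rec->size) != 3)
        return 0;
    rec->addr = addr;
    return rec->op == 'I' || rec->op == 'L' ||
           rec->op == 'S' || rec->op == 'M';
}

/* Number of data cache accesses per operation. Instruction loads
   return 0 here because only the data cache is evaluated. */
int data_accesses(char op)
{
    switch (op) {
    case 'L':
    case 'S':
        return 1;
    case 'M':
        return 2;  /* a load followed by a store */
    default:
        return 0;
    }
}
```

Returning 0 accesses for 'I' is a design choice for a data-cache-only simulator; the Naccess column itself lists one access for an instruction load.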
Reference Cache Simulator

Usage: >>./csim [-v] -s <s> -E <E> -b <b> -t <trace file>





-v: Optional verbose flag that displays trace info
-s <s>: Number of set index bits (S = 2^s is the number of sets)
-E <E>: Associativity (number of lines per set)
-b <b>: Number of block bits (B = 2^b is the block size)
-t <trace file>: Name of the valgrind trace to replay

[Figure: cache organization with S = 2^s sets; each line holds a valid bit (v), a tag, and B = 2^b data bytes (0, 1, 2, …, B-1). Cache size: C = S x E x B data bytes.]
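The command line above can be parsed with the POSIX getopt() function. The following is a sketch under the assumption that all of -s, -E, -b, and -t are required; csim_opts_t and parse_args are hypothetical names, not part of the reference simulator.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getopt() in strict modes */
#include <stdlib.h>
#include <unistd.h>

typedef struct {
    int         verbose;     /* -v */
    int         s, E, b;     /* -s, -E, -b */
    const char *trace_path;  /* -t */
} csim_opts_t;

/* Returns 0 on success, -1 on an unknown flag or a missing option. */
int parse_args(int argc, char *argv[], csim_opts_t *o)
{
    int c;

    o->verbose = 0;
    o->s = o->E = o->b = -1;
    o->trace_path = NULL;

    while ((c = getopt(argc, argv, "vs:E:b:t:")) != -1) {
        switch (c) {
        case 'v': o->verbose = 1;         break;
        case 's': o->s = atoi(optarg);    break;
        case 'E': o->E = atoi(optarg);    break;
        case 'b': o->b = atoi(optarg);    break;
        case 't': o->trace_path = optarg; break;
        default:  return -1;              /* unknown flag */
        }
    }
    if (o->s < 0 || o->E < 1 || o->b < 0 || o->trace_path == NULL)
        return -1;
    return 0;
}
```

A main function would typically call parse_args(argc, argv, &opts) first and print the usage line when it returns -1. atoi() performs no error checking, so a stricter version might use strtol() instead.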
Cache Simulation Example (1)


Usage: >>./csim [-v] -s <s> -E <E> -b <b> -t <trace file>
Example: >>./csim -v -s 4 -E 1 -b 4 -t ./traces/yi.trace
• Number of set index bits = 4 (16 sets)
• Associativity = 1 (Direct Mapped Cache)
• Number of block bits = 4 (16 bytes in a cache block)

Output
L 10,1 miss
M 20,1 miss hit
….
hits: 4 misses: 5 eviction: 3
Cache Simulation Example (2)

Example memory access pattern
Oper.    Address  Byte
Load     0x10     1
Modify   0x20     1
Load     0x22     1
Store    0x18     1
Load     0x110    1
Load     0x210    1
Modify   0x12     1

Cache state (16 sets of one line each; every line holds V, Tag, and data bytes 0-F):

S     V  Tag
0-F   I
Cache Simulation Example (3)
Access pattern: the same seven operations as in Example (2).
After access 1 (Load 0x10, miss):

S     V  Tag
0     I
1     V  0x0
2-F   I

Hit: 0   Miss: 1   Evict: 0
Cache Simulation Example (4)
Access pattern: the same seven operations as in Example (2).
After access 2 (Modify 0x20, miss then hit):

S     V  Tag
0     I
1     V  0x0
2     V  0x0
3-F   I

Hit: 1   Miss: 2   Evict: 0
Cache Simulation Example (5)
Access pattern: the same seven operations as in Example (2).
After access 3 (Load 0x22, hit):

S     V  Tag
0     I
1     V  0x0
2     V  0x0
3-F   I

Hit: 2   Miss: 2   Evict: 0
Cache Simulation Example (6)
Access pattern: the same seven operations as in Example (2).
After access 4 (Store 0x18, hit):

S     V  Tag
0     I
1     V  0x0
2     V  0x0
3-F   I

Hit: 3   Miss: 2   Evict: 0
Cache Simulation Example (7)
Access pattern: the same seven operations as in Example (2).
After access 5 (Load 0x110, miss with eviction):

S     V  Tag
0     I
1     V  0x1
2     V  0x0
3-F   I

Hit: 3   Miss: 3   Evict: 1
Cache Simulation Example (8)
Access pattern: the same seven operations as in Example (2).
After access 6 (Load 0x210, miss with eviction):

S     V  Tag
0     I
1     V  0x2
2     V  0x0
3-F   I

Hit: 3   Miss: 4   Evict: 2
Cache Simulation Example (9)
Access pattern: the same seven operations as in Example (2).
After access 7 (Modify 0x12, miss with eviction, then hit):

S     V  Tag
0     I
1     V  0x0
2     V  0x0
3-F   I

Hit: 4   Miss: 5   Evict: 3

Average Access Time
= 1 + (5 / 9) × 100 ≈ 56.56 cycles
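The whole walk-through can be checked with a small direct mapped simulation. This is a sketch for the example's parameters only (s = 4, E = 1, b = 4), so no replacement policy is needed; all names (dm_line_t, dm_stats_t, dm_touch, replay_example, average_cycles) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int valid; uint64_t tag; } dm_line_t;
typedef struct { unsigned hits, misses, evictions; } dm_stats_t;
typedef struct { char op; uint64_t addr; } dm_access_t;

/* The seven operations of the example access pattern. */
static const dm_access_t example_trace[] = {
    {'L', 0x10}, {'M', 0x20}, {'L', 0x22}, {'S', 0x18},
    {'L', 0x110}, {'L', 0x210}, {'M', 0x12},
};

/* One access to a direct mapped cache with 2^s sets and 2^b-byte blocks. */
static void dm_touch(dm_line_t *sets, unsigned s, unsigned b,
                     uint64_t addr, dm_stats_t *st)
{
    dm_line_t *line = &sets[(addr >> b) & ((UINT64_C(1) << s) - 1)];
    uint64_t tag = addr >> (s + b);

    if (line->valid && line->tag == tag) {
        st->hits++;
        return;
    }
    st->misses++;
    if (line->valid)
        st->evictions++;  /* the only line of the set is replaced */
    line->valid = 1;
    line->tag = tag;
}

/* Replay the example with s = 4 and b = 4. */
dm_stats_t replay_example(void)
{
    dm_line_t sets[16] = {{0, 0}};  /* all lines start invalid */
    dm_stats_t st = {0, 0, 0};
    size_t n = sizeof example_trace / sizeof example_trace[0];

    for (size_t i = 0; i < n; i++) {
        dm_touch(sets, 4, 4, example_trace[i].addr, &st);
        if (example_trace[i].op == 'M')  /* modify = load, then store */
            dm_touch(sets, 4, 4, example_trace[i].addr, &st);
    }
    return st;
}

/* Average access time with hit time 1 cycle and miss penalty 100 cycles. */
double average_cycles(dm_stats_t st)
{
    unsigned total = st.hits + st.misses;
    return total ? 1.0 + 100.0 * st.misses / total : 0.0;
}
```

replay_example() yields 4 hits, 5 misses, and 3 evictions, and average_cycles() then gives 1 + (5 / 9) × 100, about 56.56 cycles.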
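Report Writing Guidelines (1)

[Figure: Design, Implementation, Testing]

Include the following:
• Design requirements
  – Redefine the given CSIM design requirements to fit your own CSIM
• Implementation
  – Describe how your CSIM works and how it reflects the design requirements
  – Describe how to use your CSIM and how it outputs the simulation results
• Testing
  – Describe how you verified the CSIM requirements
  – Perform the verification with at least 3 sets of trace data; additionally, describe how you obtained the trace data
  – Attach captured images that show what your CSIM implementation does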

Report Writing Guidelines (2)

[Figure: Design, Coding, Testing]

Include the following:
• Performance evaluation
  – Measure the performance of each cache structure (direct mapped, E-way set associative, and fully associative cache) and compare the results
Evaluation Criteria

Title            Details                 Pts.  Description
CSIM (70)        Submission              10    Warning: -0.5 pt. each / Error: -1 pt. each
                 Parameter Input         10    Argument passing: 10 pts., other methods: 5 pts.
                 Cache Organization      5     Using dynamic allocation: 5 pts.
                                               - Using arrays: 2 pts.
                 Cache Operation         20    Correct handling of hit/miss: 10 pts.
                                               Replacement policy (LRU): 5 pts.
                                               - Implementing random replacement: 3 pts.
                                               Displaying the result (hit/miss) of each memory access: 4 pts.
                                               - Providing an option to choose whether to display the results: 1 pt.
                 Performance evaluation  5     Providing an accurate average access time
                 Comments                10    Comment each line as far as possible
Report (30)      Design requirements     7
                 Implementation          7
                 Testing                 8
                 Performance evaluation  8
Late submission  Per day                 -5    Submissions are accepted up to one week after the deadline
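How to Submit

Submit the deliverables in the submission list below by e-mail.
• E-mail address: yonghunlee@archi.snu.ac.kr
• E-mail subject: “[CSIM]StudentID_Name”
• Compress the deliverables into “StudentID_Name.zip” or “StudentID_Name.tar” before submitting

Submission list
• CSIM source code
• Project report
• Trace files used to verify CSIM

Deadline: Wednesday, December 18, 2013, 23:59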