A Few Subtle Insights About UCP
Moinuddin K. Qureshi
Work on UCP done while at:
First Things First
I thank Xing and Rajeev for:
1. Validating that UCP (based on misses) works
2. Re-Validating that UCP (based on IPC) is slightly better
than one based on misses: ~1%-3%
As mentioned, this is not the 1st (or 2nd, or 3rd, or 4th …)
paper to provide this insight
Critique 1: UCP(MPKI) ≠ UCP
Consider two apps, A and B, with identical miss rate curves
[Figure: curves for A and B; x-axis: Num Ways in 4-way Cache]
UCP(MPKI) gives 2 ways to both: A & B
A & B both access cache 1 per 100 inst,
Cache Hit: 1 Cycle, Memory: 100 cycles
A has 99 integer ops (1 cycle each):
CPI_A = (99 + 1 + MissRatePerc)/100
B has 99 FP ops (10 cycles each):
CPI_B = (990 + 1 + MissRatePerc)/100
UCP(MPKC) → 4 ways to A: IPC_best, WS_best
UCP(MICRO’06) optimizes perf more than UCP(MPKI)
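The arithmetic behind the two-app example can be checked with a short sketch. The CPI formulas and the rate of one cache access per 100 instructions come from the slide; the miss-rate inputs (50% and 25%) are illustrative assumptions, since the curve values are not recoverable here, and all function names are hypothetical.

```python
# Sketch of the two-app example. Formulas follow the slide; the miss-rate
# inputs used in the demo loop are illustrative assumptions.

ACCESSES_PER_KILO_INST = 10  # 1 cache access per 100 instructions


def cpi_a(miss_rate_pct):
    # Per 100 instructions: 99 integer ops (1 cycle each), 1 cache hit cycle,
    # plus miss_rate_pct cycles of stall (miss fraction x 100-cycle memory).
    return (99 + 1 + miss_rate_pct) / 100


def cpi_b(miss_rate_pct):
    # Same as A, but the 99 FP ops take 10 cycles each.
    return (990 + 1 + miss_rate_pct) / 100


def mpki(miss_rate_pct):
    # Misses per kilo-instruction.
    return ACCESSES_PER_KILO_INST * miss_rate_pct / 100


def mpkc(miss_rate_pct, cpi):
    # Misses per kilo-cycle: 1000 cycles cover 1000 / cpi instructions.
    return mpki(miss_rate_pct) / cpi


if __name__ == "__main__":
    for pct in (50, 25):
        a, b = cpi_a(pct), cpi_b(pct)
        print(f"miss rate {pct}%: MPKI={mpki(pct):.2f} "
              f"MPKC_A={mpkc(pct, a):.2f} MPKC_B={mpkc(pct, b):.2f}")
```

At any given miss rate the two apps have the same MPKI, but A, which retires instructions roughly ten times faster, sees far more misses per cycle. This is consistent with the slide's point that a per-cycle metric steers cache toward A.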
Critique 2: Dynamic can beat Static Optima
Critique 3: Not all Misses are Created Equal
[Figure: plot with axes MPKI and CPI]
Problem with the linear CPI model of Xing
UCP: The last 4.5 years …
Things I would have liked to see in literature:
1. Non-Integer Way Partition
2. Utility Based Cache Insertion
3. Prefetch Aware Cache Partition
Extension 1: Probabilistic Way Partition
Common criticism of way partitioning: we can only allocate
an integer number of ways
A simple way to avoid this is Probabilistic Way Partition.
Say you want to allocate 3.5 ways to application A
Then on a cache miss, consult a random number generator
If Randval > 50% of Randmax, then A gets 4 ways, else 3 ways
On average, A will end up getting 3.5 ways in the cache
Can go finer, say we want to allocate 4.125 ways to B
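The scheme above can be sketched in a few lines. This is a minimal software model under stated assumptions, not a hardware design: the function name `ways_for_this_miss` is hypothetical, and Python's `random.Random` stands in for the random number generator the slide mentions.

```python
import math
import random


def ways_for_this_miss(target_ways, rng):
    """Pick an integer way quota for one miss so the long-run mean is target_ways."""
    base = math.floor(target_ways)
    frac = target_ways - base
    # With probability `frac` grant one extra way, otherwise grant `base` ways.
    # For 3.5 ways this is the slide's 50/50 choice between 4 and 3.
    return base + (1 if rng.random() < frac else 0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    for target in (3.5, 4.125):
        samples = [ways_for_this_miss(target, rng) for _ in range(100_000)]
        print(target, sum(samples) / len(samples))
```

For the finer-grained case, a target of 4.125 ways corresponds to granting 5 ways with probability 1/8 and 4 ways otherwise.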
Extension 2: Utility Based Cache Insertion
One can achieve the effect of partitioning by intelligent insertion
In a 16-way cache, a given application A can insert at 16 locations
If N applications share the cache, the decision space is 16^N
An efficient hardware scheme that obtains the best decision in
this decision space will outperform both UCP and TADIP
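To see how quickly this decision space grows, here is a small sketch. It assumes each application independently picks one insertion position, so the choices multiply across applications; the function names are hypothetical and the enumeration is only practical for tiny cases.

```python
from itertools import product


def decision_space_size(num_positions, num_apps):
    # One insertion position per application, chosen independently.
    return num_positions ** num_apps


def enumerate_decisions(num_positions, num_apps):
    # Every joint assignment as a tuple: element i is app i's insertion position.
    return list(product(range(num_positions), repeat=num_apps))


if __name__ == "__main__":
    # Tiny case that can be listed exhaustively: 4 positions, 2 apps.
    print(len(enumerate_decisions(4, 2)))
    # The slide's 16-way cache shared by 4 apps.
    print(decision_space_size(16, 4))
```

With 16 positions and 4 applications there are already 65,536 joint choices, which suggests why exhaustive search is unattractive and an efficient hardware scheme is called for.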
Extension 3: Prefetch Aware Partitioning
How does one do partitioning under prefetching?
For applications whose dataset is prefetchable, we may
not want to give cache space (even if they have high utility)
In fact, sometimes it’s a win-win to give more cache to irregular
apps, as it leaves more bandwidth available for prefetching
What is the right way to extend UCP to prefetches?
Summary
UCP: Partitioning based on misses works (simple)
Several works have shown that UCP based on IPC works slightly better
There are several extensions of UCP still unexplored:
-- Let me know if you are interested in exploring
Questions/comments:
moinqureshi@gmail.com