Application Slowdown Model: Quantifying and Controlling Impact of Interference at Shared Caches and Main Memory
Lavanya Subramanian, Vivek Seshadri, Arnab Ghosh, Samira Khan, Onur Mutlu
Carnegie Mellon University, Intel Labs, University of Virginia
Problem: Shared Resource Interference
[Figure: cores sharing a cache and main memory]

Impact of Shared Resource Interference
[Figure: two bar charts of slowdown (axis from 0 to 6): leslie3d (core 0) run with gcc (core 1), and leslie3d (core 0) run with mcf (core 1)]
1. High application slowdowns
2. Unpredictable application slowdowns

Our Goal
Provide high and predictable performance in the presence of shared resource interference

Our Approach
1. Build a model to estimate slowdowns
2. Leverage our model for slowdown-aware resource management
Application Slowdown Model (ASM)
Observation: Proxy for Performance
For a memory-bound application, Performance ∝ Cache access rate
[Figure: Slowdown vs. (Shared Cache Access Rate Alone / Shared Cache Access Rate Shared) for astar, lbm, and bzip2; both axes run from 1 to 2.2]

Slowdown = Performance Alone / Performance Shared
         = Cache Access Rate Alone (CARAlone) / Cache Access Rate Shared (CARShared)

Challenge: Estimating Cache Access Rate Alone

Minimize memory bandwidth contention:
Using priority (Subramanian et al., HPCA 2013)
1. Run alone
2. Run with another application
3. Run with another application: highest priority
[Figure: for each of the three cases, the request buffer state at main memory and the resulting service order over time units]

Quantify shared cache capacity contention:
Using auxiliary tag stores (Pomerene et al., 1989)
[Figure: core, shared cache with an auxiliary tag store, and main memory]
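As a rough illustration of how an auxiliary tag store can expose contention misses, the sketch below tracks, next to the shared cache, the tags one application would hold if it had the cache to itself. An access that misses in the shared cache but hits in the auxiliary tag store is counted as a contention miss. The class and function names are hypothetical, and the single fully associative LRU set is a simplifying assumption, not the organization used by the authors.

```python
from collections import OrderedDict


class LRUTags:
    """Tags of one fully associative set with LRU replacement (assumed model)."""

    def __init__(self, num_ways):
        self.num_ways = num_ways
        self.tags = OrderedDict()  # least recently used entry comes first

    def access(self, tag):
        """Touch `tag`; return True on a hit, False on a miss."""
        hit = tag in self.tags
        if hit:
            self.tags.move_to_end(tag)  # mark as most recently used
        else:
            if len(self.tags) >= self.num_ways:
                self.tags.popitem(last=False)  # evict the LRU entry
            self.tags[tag] = None
        return hit


def count_contention_misses(trace, app, num_ways):
    """Count misses of `app` that would have been hits had it run alone.

    `trace` is a list of (application, address) pairs in access order.
    """
    shared = LRUTags(num_ways)  # cache shared by all applications
    aux = LRUTags(num_ways)     # auxiliary tag store: `app` alone
    contention = 0
    for who, addr in trace:
        shared_hit = shared.access((who, addr))
        if who == app:
            alone_hit = aux.access(addr)
            if alone_hit and not shared_hit:
                contention += 1
    return contention


# Application A reuses two lines; application B evicts them in between.
example_trace = [("A", "a"), ("A", "b"), ("B", "x"), ("B", "y"),
                 ("A", "a"), ("A", "b")]
print(count_contention_misses(example_trace, "A", num_ways=2))  # 2
```

In this toy trace both of A's re-accesses miss only because B displaced its lines, so both are counted as contention misses.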
Auxiliary tag store counts #contention misses
Highest priority → Little interference (almost as if application were run alone)
Enables estimation of miss service time

Cache Contention Cycles = #Contention Misses × Average Miss Service Time
  #Contention Misses: from auxiliary tag store, when given high priority
  Average Miss Service Time: measured when given high priority

Remove contention cycles when estimating CARAlone
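The arithmetic above can be followed with a small worked sketch. It assumes that CARAlone is estimated from the periods in which the application has highest priority, and that "removing" contention cycles means subtracting them from the cycle count of those periods. The poster does not spell out these details, so the function and its parameter names are illustrative only.

```python
def estimate_slowdown(hp_accesses, hp_cycles, contention_misses,
                      avg_miss_service_time, shared_accesses, shared_cycles):
    """Estimate slowdown as CARAlone / CARShared (simplified sketch).

    hp_* values are measured while the application has highest priority
    at the memory controller; shared_* values cover the whole shared run.
    """
    # Cache Contention Cycles = #Contention Misses x Average Miss Service Time
    contention_cycles = contention_misses * avg_miss_service_time

    # Assumption: remove those cycles from the high-priority time window.
    alone_cycles = hp_cycles - contention_cycles
    if alone_cycles <= 0:
        raise ValueError("contention cycles exceed the measured window")

    car_alone = hp_accesses / alone_cycles
    car_shared = shared_accesses / shared_cycles
    return car_alone / car_shared


# 1000 accesses in 100,000 high-priority cycles, of which
# 100 contention misses x 200 cycles = 20,000 cycles are removed.
print(estimate_slowdown(1000, 100_000, 100, 200, 5000, 800_000))  # about 2.0
```

With these made-up numbers the application would issue cache accesses twice as fast alone as it does when sharing, giving an estimated slowdown of about 2.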
Average error of ASM: 10%; Average error of previous models: 30%
Leveraging the Application Slowdown Model
Coordinated Resource Allocation Schemes
[Figure: cores sharing a cache and main memory]

Slowdown-aware cache capacity partitioning
Previous work: Reduce miss counts;
Our proposal: Reduce slowdowns

Slowdown-aware memory bandwidth partitioning
Allocate memory bandwidth proportional to slowdowns

16-core system, 100 workloads
[Figure: Fairness (lower is better) and Performance vs. Number of Channels for FRFCFS-NoPart, FRFCFS+UCP, TCM+UCP, PARBS+UCP, and ASM-Cache-Mem]
Significant fairness benefits across different channel counts

Providing Slowdown Guarantees
• Cache allocation with the goal of meeting a slowdown bound
• Allocate just enough cache space to critical application
• Allocate remaining cache space to other applications
[Figure: Slowdown of h264ref, mcf, sphinx3, and soplex under Naive-QoS, ASM-QoS-2.5, ASM-QoS-3, ASM-QoS-3.5, and ASM-QoS-4]
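The two allocation ideas on this poster, memory bandwidth in proportion to slowdowns and just enough cache space for a critical application, can be sketched as follows. Both functions are hypothetical illustrations: they assume that slowdown estimates are already available from the model, and that the critical application's estimated slowdown is known for each candidate number of cache ways. Neither is taken from the authors' implementation.

```python
def partition_bandwidth(slowdowns, total_bandwidth=1.0):
    """Split bandwidth among applications in proportion to their slowdowns."""
    total = sum(slowdowns.values())
    return {app: total_bandwidth * s / total for app, s in slowdowns.items()}


def allocate_cache_for_bound(slowdown_by_ways, bound, total_ways):
    """Give the critical application the fewest ways that meet `bound`.

    `slowdown_by_ways` maps a way count to the estimated slowdown of the
    critical application at that allocation (assumed input).
    Returns (ways for the critical application, ways left for the others).
    """
    for ways in range(1, total_ways + 1):
        if slowdown_by_ways[ways] <= bound:
            return ways, total_ways - ways
    # No allocation meets the bound: fall back to the whole cache.
    return total_ways, 0


shares = partition_bandwidth({"app0": 3.0, "app1": 1.0})
print(shares)  # app0 gets three quarters of the bandwidth, app1 one quarter

estimates = {1: 4.2, 2: 3.4, 3: 2.9, 4: 2.4}  # made-up slowdown estimates
print(allocate_cache_for_bound(estimates, bound=3.0, total_ways=4))  # (3, 1)
```

The second function stops at the first allocation that satisfies the bound, which leaves as much cache space as possible for the other applications.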