pptx - University of Michigan

advertisement
Archipelago: A Polymorphic
Cache Design for Enabling Robust
Near-Threshold Operation
Amin Ansari, Shuguang Feng,
Shantanu Gupta, and Scott Mahlke
University of Michigan, Ann Arbor
HPCA-17
February 16, 2011
University of Michigan
Electrical Engineering and Computer Science
Matching Power Consumption and Utilization
More than
50% of all
computers
Large SRAM
structures limit
the Min Vdd
[Webber et. al.]
More than
80% of
times idle
[Roth et. al.]
DVS to
improve
battery life
Logic cells
can operate
close to Vth
Core i7 achieves
37% power reduction
in idle state.
2
University of Michigan
Electrical Engineering and Computer Science
Bit-Error-Rate for an SRAM Cell

Extremely fast growth in failure rate with decreasing Vdd
3
University of Michigan
Electrical Engineering and Computer Science
Our Goal

Enabling DVS to push core’s Vdd down to
o
o
Ultra low voltage region ( < 650mV )
While preserving correct functionality of on-chip caches

Proposing a highly flexible and FT
cache architecture that can efficiently
tolerate these SRAM failures

Minimizing our overheads in highpower mode
4
University of Michigan
Electrical Engineering and Computer Science
Archipelago (AP)
data chunk
Thisautonomous
particular cache
has
By forming
islands,
a6
single
AP only
saves
out offunctional
8 lines. line.
1
2
Island
1
3
4
Island
2
5
6
sacrificial line
7
8
sacrificial line
University of Michigan
Electrical Engineering and Computer Science
Baseline AP Architecture

Two lines
have collision, if they have at least one faulty
in
Addedchunk
modules:
Fault map address
Sacrificial line
Memory
Mapand 15 are collision
+ Memory
the
position
(10
free) map
Datasame
line
+ Fault map
There should be
no
collision
between
lines
within
a group
G3
Input Address
+ MUXing layer
[Group 3 (G3) contains lines 4, 10, and 15]

First Bank
Second Bank
S
Fault Map
MUXing layer
G3
-
-
Functional Block
6
Two type of lines:
+ data line
+ sacrificial line
University of Michigan
Electrical Engineering and Computer Science
AP with Relaxed Group Formation

Sacrificial lines do not contribute to the effective capacity
o
We want to minimize the total number of groups
Second Bank
First Bank
S
S
First Bank
Second Bank
S
7
University of Michigan
Electrical Engineering and Computer Science
Semi-Sacrificial Lines
First Bank


Semi-sacrificial line guarantees the parallel access
In contrast to a sacrificial line, it also contributes to
the effective cache capacity
Sacrificial line
MUXing Layer
Second Bank
Semi-sacrificial line
8
University of Michigan
Electrical Engineering and Computer Science
AP with Semi-Sacrificial Lines
Memory Map
Input Address
G3
First Bank
Second Bank
S
semisacrificial
line
way0
way1
way0
way1
Fault Map
MUXing layer
G3
Functional Block
9
University of Michigan
Electrical Engineering and Computer Science
AP Configuration

We model the problem as a graph:
o
o

Each node is a line of the cache.
Edge when there is no collision between nodes
A collision free group forms a clique
o
Group formation


Finding the cliques
To maximize the number of
functional lines, we need to
minimize the number of groups.
o
minimum clique cover (MCC).
10
University of Michigan
Electrical Engineering and Computer Science
AP Configuration Example
First Bank
Second Bank
1
2
3
4
5
G1(1)
G2(1)
G2(S)
G1(2)
G2(2)
6
7
8
9
10
D
G2(3)
G1(3)
G1(S)
G2(4)
10
1
Island or Group 2
7
9
2
4
Island or Group 1
8
5
6
11
3
Disabled
University of Michigan
Electrical Engineering and Computer Science
Operation Modes
High power mode (AP is turned off)


There is no non-functional lines in this case
Clock gating to reduce dynamic power of SRAM structures
Low power mode
o
During the boot time in low-power mode



BIST scans cache for potential faulty cells
Processor switches back to high power mode
Forms groups and configure the HW
12
University of Michigan
Electrical Engineering and Computer Science
Evaluation Methodology

Performance [DEC Alpha 21364]
o

SimAlpha that is based on SimpleScalar OoO [SPEC2K]
Delay, power and area
o
o
o
o
Wattch and hot-leakage for power of processor
Artisan memory-compiler for our SRAM structures
CACTI for baseline on-chip caches (64KB, 2MB)
Synopsys design-compiler and power-compiler for


Miscellaneous logic (e.g. bypass MUXes and comparators)
Given set of cache parameters (e.g. Vdd)
o
o
Monte Carlo (with 1000 iterations) using our modified MCC
Determining disabled portion of caches (for 99% yield)
13
University of Michigan
Electrical Engineering and Computer Science
Minimum Achievable Vdd
14
University of Michigan
Electrical Engineering and Computer Science
Overheads
Overheads for L1 and L2 caches
o
10T used to protect the fault map, tag array, and memory map
fault map (10T)
miscellaneous logic
memory map (10T)
tag overhead (10T)
14
Percentage of Overhead

12
10
High Power Mode
8
6
4
2
0
L1 area
L2 area
L1 leakage
power
15
L2 leakage
power
L1 dynamic
power
L2 dynamic
power
University of Michigan
Electrical Engineering and Computer Science
Performance Loss

One extra cycle latency for L1 and 2 cycles for L2
16
University of Michigan
Electrical Engineering and Computer Science
Summary of Benefits
Larger leakage power
savings for deeper
technology nodes
17
University of Michigan
Electrical Engineering and Computer Science
Comparison with Alternative Methods
100
10T
Recently
Proposed
Cache Area Overhead (%)
66% area
overhead
Conventional
ZC
ECC-2
10
SECDED
Row
Red
AP
BF
Disabled: 25%
Disabled: 9%
1
0.5
1
1.5
2
2.5
3
Power (at minimum Vdd) Normalized to Archipelago
18
University of Michigan
Electrical Engineering and Computer Science
Conclusion

DVS is widely used to deal with high power dissipation
o

We proposed a highly flexible cache architecture
o

Minimum achievable voltage is bounded by SRAM structures
To tolerate failures when operating in near-threshold region
Using our approach
o
o
o
Vdd of processor can be reduced to 375mV
79% dynamic power saving and 51% leakage power saving
< 10% area overhead and performance overheads
19
University of Michigan
Electrical Engineering and Computer Science
Thank You
http://cccp.eecs.umich.edu
20
University of Michigan
Electrical Engineering and Computer Science
Download