Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches
Rudrajit Datta and Nur A. Touba
Computer Engineering Research Center
Dept. of Electrical and Computer Engineering
University of Texas at Austin
Motivation
▪ For memories with high defect rates
• Reduce check-bit overhead
• Increase reliability
▪ Applicable to low-voltage caches
Agenda
▪ Introduction
▪ Proposed Approach
▪ Application
▪ Related Work
▪ Orthogonal Latin Square (OLS) Codes
▪ Customization
▪ Results
▪ Conclusion
Introduction
▪ Tolerate high defect rates in memories
• Occur in memories operating at ultra-low voltages
• Expected in future nanoscale technologies
– E.g., nanoscale crossbar architectures
▪ Conventional method
• ECC selected based on
– Maximum number of defects expected per word
Introduction
[Figure: conventional ECC datapath. Blocks: Data, Check Bit Generator, Memory (Information Bits, Check Bits), Decoder, Corrected Data; the check-bit paths are labeled cfull.]
Observations
▪ A priori information is available on the location of defects
• Through post-manufacturing memory tests
– Obtain a defect map
• Use this information to customize the code
– Reduce check-bit storage in memories/caches
Proposed Approach
[Figure: proposed ECC datapath. Blocks: Data, Check Bit Generator (cfull), Switch Network (cfull to cused), Memory (Information Bits, Check Bits; cused), Switch Network (cused to cfull), Decoder, Corrected Data, Config. Bits.]
Proposed Approach
▪ Customize the code by disabling rows of the H-matrix
• Possible if a modular code is used for ECC
• Current work looks at OLS codes
[Figure: configuration bits, shown as 1 0 1 0.]
Application - Low-voltage Caches
▪ Microprocessor voltage is lowered while idle
• Reduces power
▪ Caches and memories are susceptible to errors at lower voltages
• Unreliable below Vccmin
▪ Enable reliable cache operation at lower voltages
• At lower voltages, use part of the cache to store extra check bits
Related Work
▪ Word-disable and Bit-fix [Wilkerson 08]
• Defect map
– Identify vulnerable bits
• Mitigates only persistent errors
• Uses up half of the cache to store extra check bits
▪ Two-dimensional ECC [Kim 07]
• Slow
• Complicated decoding
▪ Multi-bit segmented ECC [Chishti 09]
• Orthogonal Latin Square (OLS) code
– Single-step decodable
• High redundancy
Key Takeaways
▪ Have full ECC on chip
• Can handle all defect maps
▪ Generate defect map
• Disable part of the original code
• Reduces check-bit redundancy
• Retains the capability of the original code w.r.t. the defect map
One Step Majority Decoding
▪ t-error-correcting code: each information bit is available in 2t+1 independent copies
▪ One copy: the bit itself
▪ The rest: 2t independent parity equations
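[Figure: one-step majority decoder. Parity sums (+) over data bits dp, dq, ds and check bits cp, cq, cs, together with di, feed a Majority Voter that outputs the corrected di.]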
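The voting step can be sketched in a few lines of Python. This is only an illustration of one-step majority decoding for a single bit; the function name and the example values are hypothetical and are not taken from the slides.

```python
def majority_decode_bit(received_bit, parity_estimates):
    """One-step majority decoding of a single information bit d_i.

    received_bit: the stored copy of d_i (0 or 1).
    parity_estimates: the 2t estimates of d_i, each recomputed from one
        independent parity equation (check bit XOR the other data bits).
    """
    votes = [received_bit] + list(parity_estimates)
    ones = sum(votes)
    # Strict majority over the 2t + 1 copies.
    return 1 if 2 * ones > len(votes) else 0


# t = 2: the bit itself plus 2t = 4 parity estimates, 5 votes in total.
# The true value is 1 and two of the five copies are corrupted.
print(majority_decode_bit(0, [1, 1, 0, 1]))  # prints 1
```

With 2t+1 copies, up to t corrupted copies still leave a correct strict majority.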
Orthogonal Latin Square Codes
▪ Latin Square
• m x m array
• Each row and each column is a permutation of the digits 0, 1, ..., m-1
▪ Orthogonal Latin Squares
• Each ordered pair of elements (r, c, s) appears only once
▪ m^2 data bits, 2tm check bits, t-error correctable [Hsiao 70]
▪ Single-step decodable
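As a rough illustration of where the 2tm check bits come from, the sketch below builds the parity groups of a small OLS code. It assumes m is prime so that the squares L_k[r][c] = (k*r + c) mod m are mutually orthogonal; that construction and the name `ols_check_groups` are assumptions made for this example, and codes with non-prime m (such as m = 16 for 256 data bits) need a different construction.

```python
from itertools import combinations


def ols_check_groups(m, t):
    """Return the 2*t*m parity groups (lists of data-bit indices) of an OLS code.

    Data bit (r, c) of the m x m array has index r * m + c.
    Assumes m is prime (hypothetical simplification for this sketch).
    """
    if 2 * t > m + 1:
        raise ValueError("this construction needs 2t <= m + 1")
    checks = []
    # One parity equation per row of the array.
    checks += [[r * m + c for c in range(m)] for r in range(m)]
    # One parity equation per column.
    checks += [[r * m + c for r in range(m)] for c in range(m)]
    # Each extra Latin square adds m equations, one per symbol s.
    for k in range(1, 2 * t - 1):
        for s in range(m):
            checks.append([r * m + c for r in range(m) for c in range(m)
                           if (k * r + c) % m == s])
    return checks


m, t = 5, 2
checks = ols_check_groups(m, t)
per_bit = {sum(i in g for g in checks) for i in range(m * m)}
shared = max(sum(a in g and b in g for g in checks)
             for a, b in combinations(range(m * m), 2))
print(len(checks), per_bit, shared)  # prints 20 {4} 1
```

The printed values show 2tm = 20 check equations, every data bit in exactly 2t = 4 of them, and no two data bits sharing more than one equation, which is the property that makes one-step majority decoding possible.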
Proposed Scheme
▪ Implement full OLS code on chip
▪ Run memory tests
• Generate defect map
– At manufacturing time or at boot time
• Identify vulnerable bits
▪ Disable rows in OLS H-matrix
• On a chip-by-chip basis, based on the defect map
• Correct all erasures plus ‘e’ random errors in each cache line
• Reduce redundancy while providing the same reliability
Definitions
▪ “Good row” for information bit di
• Row of the OLS H-matrix
• No ‘1’ in any erasure position other than bit di
– Holds true for all lines in the cache
▪ “Bad row” for information bit di
• Row of the OLS H-matrix
• ‘1’ in one or more erasure positions apart from bit di
– Holds for at least one line of the cache
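The two definitions can be expressed as a small classifier. The sketch below is illustrative only: the function name, the data layout (rows as 0/1 lists, one set of erased positions per cache line), and the toy matrix are assumptions, and the toy matrix is not an OLS H-matrix.

```python
def classify_rows(h_rows, erasure_maps, i):
    """Label each H-matrix row as 'G' (good) or 'B' (bad) for information bit i.

    h_rows: list of rows, each a list of 0/1 over the data-bit positions.
    erasure_maps: one set of erased bit positions per cache line.
    Rows that do not check bit i are labelled '-'.
    """
    labels = []
    for row in h_rows:
        if not row[i]:
            labels.append("-")
            continue
        # Positions checked by this row, other than bit i itself.
        others = {j for j, bit in enumerate(row) if bit and j != i}
        # Bad if some other checked position is an erasure in at least one line.
        bad = any(others & erasures for erasures in erasure_maps)
        labels.append("B" if bad else "G")
    return labels


# Hypothetical toy example: 4 data bits, 3 rows, 2 cache lines.
h_rows = [[1, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1]]
erasure_maps = [{1}, {3}]
print(classify_rows(h_rows, erasure_maps, 0))  # prints ['B', 'G', '-']
```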
“Good Rows” & “Bad Rows”
[Figure: example with information bits d0 to d7, two cache lines (line1, line2) with erasure positions marked E, and three H-matrix rows (H-row1 to H-row3) whose entries are classified per bit as good (G) or bad (B).]
Necessary and Sufficient Conditions
▪ Tolerate ‘e’ random errors
• (number of “good rows”) − (number of “bad rows”) ≥ 2(e + 1)
▪ Original code is t-error correcting
• (Max vulnerable bits in any line) + e ≤ t
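A worked check of the two inequalities, taken at face value from the slide, might look as follows. The function name and the numbers are hypothetical and chosen only to show one passing and one failing case.

```python
def customization_is_valid(good_rows, bad_rows, max_erasures_per_line, e, t):
    """Check both slide conditions for one information bit."""
    margin_ok = (good_rows - bad_rows) >= 2 * (e + 1)
    capability_ok = (max_erasures_per_line + e) <= t
    return margin_ok and capability_ok


# Hypothetical numbers: t = 4 code, at most 2 erasures per line, e = 1.
print(customization_is_valid(6, 2, 2, 1, 4))  # 6 - 2 = 4 >= 4 and 2 + 1 <= 4: True
print(customization_is_valid(5, 2, 2, 1, 4))  # 5 - 2 = 3 < 4: False
```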
Row Selection
▪ Covering problem
• Select enough good rows for each information bit di
• Until constraint is satisfied
• NP-complete problem
• Apply heuristics
[Figure: the H-row1 to H-row3 good/bad (G/B) example, annotated with the “good rows” − “bad rows” count for each information bit.]
Covering Problem
▪ Solve for the cache line with the maximum number of erasures first
▪ Apply the solution to all other cache lines
▪ If unsatisfactory, add erasures from one of the unsolved lines
▪ Repeat until the solution fits the entire cache
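The loop above can be sketched as follows. The slides do not specify the heuristic, so the greedy row-scoring rule here is an assumption, as are all function names and the small example (an OLS-style code with m = 3, t = 2, one erasure, e = 0). Rows are sets of data-bit positions.

```python
def margin(i, enabled, h_rows, erasures):
    """(good rows) - (bad rows) for bit i over the enabled rows, for one erasure set."""
    total = 0
    for r in enabled:
        if i in h_rows[r]:
            total += -1 if (h_rows[r] - {i}) & erasures else 1
    return total


def greedy_cover(h_rows, n_bits, erasures, need):
    """Enable one row at a time until every bit has margin >= need, else None."""
    def score(rows):
        ms = [margin(i, rows, h_rows, erasures) for i in range(n_bits)]
        return sum(min(x, need) for x in ms), sum(ms), min(ms)

    enabled = set()
    while True:
        capped, raw, worst = score(enabled)
        if worst >= need:
            return enabled
        best, best_gain = None, (0, 0)
        for r in range(len(h_rows)):
            if r not in enabled:
                c, s, _ = score(enabled | {r})
                gain = (c - capped, s - raw)  # assumed scoring rule
                if gain > best_gain:
                    best, best_gain = r, gain
        if best is None:
            return None  # heuristic is stuck
        enabled.add(best)


def select_rows(h_rows, n_bits, line_erasures, e):
    """Solve the worst line first, then merge in erasures of unsolved lines."""
    need = 2 * (e + 1)
    order = sorted(range(len(line_erasures)), key=lambda l: -len(line_erasures[l]))
    working = set(line_erasures[order[0]])
    while True:
        enabled = greedy_cover(h_rows, n_bits, working, need)
        if enabled is None:
            return None
        unsolved = [l for l in order
                    if any(margin(i, enabled, h_rows, line_erasures[l]) < need
                           for i in range(n_bits))]
        if not unsolved:
            return enabled
        working |= line_erasures[unsolved[0]]


# Hypothetical example: 9 data bits, 12 rows (rows, columns, two Latin squares).
m = 3
h_rows = [frozenset(r * m + c for c in range(m)) for r in range(m)]
h_rows += [frozenset(r * m + c for r in range(m)) for c in range(m)]
for k in (1, 2):
    h_rows += [frozenset(r * m + c for r in range(m) for c in range(m)
                         if (k * r + c) % m == s) for s in range(m)]
line_erasures = [{0}, set()]
enabled = select_rows(h_rows, 9, line_erasures, e=0)
print(enabled)
```

A greedy rule like this can enable more rows than strictly necessary; it only illustrates the solve, apply, and merge structure of the procedure.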
Implementation
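[Figure: decoder implementation for bit di. Each parity term over data bits dp, dq, ds and check bits cp, cq, cs is gated (&) by a control bit ctlp, ctlq, ctls; the results feed an Adjustable Majority Threshold Voter, controlled by ctl, that outputs the corrected di.]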
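A behavioural sketch of such a voter is given below. It assumes that a disabled row simply contributes no vote and that the threshold is a strict majority of the votes that remain; the gate-level arrangement of the figure is not reproduced, and the function name and values are hypothetical.

```python
def adjustable_vote(stored_bit, estimates, ctl):
    """Threshold vote over the stored bit and the enabled parity estimates.

    estimates[k]: value of d_i recomputed from parity row k.
    ctl[k]: 1 if row k is enabled on this chip, 0 if it was disabled.
    """
    votes = [stored_bit] + [est for est, on in zip(estimates, ctl) if on]
    threshold = len(votes) // 2 + 1  # strict majority of the remaining votes
    return 1 if sum(votes) >= threshold else 0


# Rows 1 and 3 are disabled, so only the stored bit and two estimates vote.
print(adjustable_vote(1, [1, 0, 0, 1], [1, 0, 1, 0]))  # prints 1
```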
Experimental Results
Results for Word Size of 256 Bits and Bit-Error Rate of 10^-3

Cache Size   Conventional OLS    Customized OLS     Reduction in
(Bytes)      check bits          check bits         Max. Check Bits (%)
             Avg      Max        Avg      Max
16 KB        155      224        117      145       35.27
32 KB        166      256        125      148       42.19
64 KB        175      256        134      156       39.06
128 KB       208      256        163      177       30.86
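The reduction column follows directly from the two Max columns. A short check, using only the numbers reported above (the helper name `pct_reduction` is hypothetical):

```python
def pct_reduction(conventional_max, customized_max):
    """Percentage reduction in the maximum number of check bits."""
    return round(100 * (conventional_max - customized_max) / conventional_max, 2)


# (conventional Max, customized Max) per cache size, from the table above.
max_check_bits = {"16 KB": (224, 145), "32 KB": (256, 148),
                  "64 KB": (256, 156), "128 KB": (256, 177)}
for size, (conv, cust) in max_check_bits.items():
    print(size, pct_reduction(conv, cust))
# prints 35.27, 42.19, 39.06 and 30.86 for the four cache sizes
```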
Experimental Results
Results for Constant Cache Size of 64 KB

Word Size   Bit-Error   Conventional OLS    Customized OLS
(Bits)      Rate        check bits          check bits
                        Avg      Max        Avg      Max
256         10^-3       175      256        138      156
256         10^-4       98       128        84       107
256         10^-5       66       102        64       68
484         10^-3       295      396        198      230
484         10^-4       143      176        117      139
484         10^-5       92       132        89       115
Experimental Results
[Figure: results for a 64 KB cache, 484-bit word, 10^-3 bit-error rate.]
Conclusion
▪ Post-manufacturing customization
• Reduces large check-bit overhead
• Provides requisite reliability
• Applicable to systems with high defect rates