Soft Error

AKT211 – CAO
08 – Computer Memory (2)
Ghifar
Parahyangan Catholic University
Oct 31, 2011
Last Course Review
Computer Memory System
• Memory Characteristics
• Memory Hierarchy
• RAM Basic Technology
• Semiconductor
• SRAM vs DRAM
• Advanced RAM Organization
• SDRAM vs DDR-RAM
Outline
Error Correction
Single error correction
Double error correction
THE BASICS OF ERROR CORRECTION
Semiconductor System Error
• Hard Failures
– A permanent physical defect, so that the affected memory cells cannot reliably store data
– Cells become stuck at 0 or 1, or switch erratically between 0 and 1
• Soft Error
– A random, nondestructive event that alters the contents of one or more memory cells without damaging the memory
– Caused by power supply problems or alpha particles
• Most modern main memory systems include
logic for both detecting and correcting
errors
Single-bit error
• Only 1 bit in the data unit has changed
Error Correcting Code (ECC) Function
Simplest form of Error Detection
• Using a parity bit
– A bit that is added to ensure that the number of bits with the value ‘1’ in a set of bits is even or odd
– Detects a 1-bit error only; it cannot reliably detect multi-bit errors and cannot correct any error
– E.g.: no error
A wants to transmit: 1001
A computes parity bit value : 1^0^0^1 = 0
A adds parity bit and sends : 10010
B receives : 10010
B computes parity : 1^0^0^1 = 0
B reports correct transmission after observing expected result
– E.g.: 1 bit error
A wants to transmit: 1001
A computes parity bit value : 1^0^0^1 = 0
A adds parity bit and sends : 10010
*** TRANSMISSION ERROR ***
B receives : 11010
B computes parity : 1^1^0^1 = 1
B reports incorrect transmission after observing unexpected result
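The two exchanges above can be reproduced with a minimal Python sketch. The function names `parity_bit`, `send` and `check` are illustrative only, and even parity is assumed, as in the example.

```python
from functools import reduce
from operator import xor


def parity_bit(bits: str) -> int:
    """Even-parity bit: XOR of all bits in a string of '0'/'1' characters."""
    return reduce(xor, (int(b) for b in bits), 0)


def send(data: str) -> str:
    """A appends the parity bit to the data bits."""
    return data + str(parity_bit(data))


def check(frame: str) -> bool:
    """B recomputes the parity and compares it with the received parity bit."""
    data, received_parity = frame[:-1], int(frame[-1])
    return parity_bit(data) == received_parity


print(send("1001"))     # 10010
print(check("10010"))   # True  (no error)
print(check("11010"))   # False (1-bit error detected)
print(check("11110"))   # True  (2-bit error goes unnoticed)
```

The last call shows the limitation stated above: two flipped bits cancel each other in the XOR, so the frame is accepted.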
Hamming Error-Correcting Code
• linear error-correcting code
• can detect up to d-1 bit errors
• can correct up to ⌊(d-1)/2⌋ bit errors
• d is the minimum Hamming distance between all pairs of code words
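As a rough illustration of these bounds, the sketch below computes d for two small codes chosen here only as examples (a 3-bit repetition code and 2 data bits plus even parity). They are not taken from the slides.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code_words) -> int:
    """Minimum Hamming distance d over all pairs of code words."""
    return min(hamming_distance(a, b) for a, b in combinations(code_words, 2))


# Illustrative codes (assumptions, not from the lecture)
repetition = ["000", "111"]
even_parity = ["000", "011", "101", "110"]

for name, code in (("repetition", repetition), ("even parity", even_parity)):
    d = min_distance(code)
    print(name, "d =", d, "detects", d - 1, "corrects", (d - 1) // 2)
```

For the even-parity code d = 2, so one error can be detected and none corrected, which agrees with the parity-bit behaviour described earlier.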
Hamming (7, 4) Code
• encodes 4 data bits (d1, d2, d3, d4) into 7
bits by adding 3 parity bits (p1, p2, p3)
• single error correction
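A minimal sketch of Hamming (7, 4) encoding and single error correction follows. It assumes the commonly used bit order p1 p2 d1 p3 d2 d3 d4 (positions 1 to 7) with even parity. The slides do not state the layout, so treat this ordering and the function names as assumptions.

```python
def encode74(d1, d2, d3, d4):
    """Encode 4 data bits into 7 bits: p1 p2 d1 p3 d2 d3 d4 (assumed layout)."""
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode74(code):
    """Return (data bits, error position); position 0 means no error found."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3
    if pos:
        c[pos - 1] ^= 1          # invert the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], pos


word = encode74(1, 0, 1, 1)
print(word)                      # [0, 1, 1, 0, 0, 1, 1]
word[4] ^= 1                     # flip position 5 to simulate a soft error
print(decode74(word))            # ([1, 0, 1, 1], 5)
```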
Hamming (7,4) Example
HAMMING ALGORITHM GENERALIZATION FOR SINGLE ERROR CORRECTION
Generalization of the Hamming
Single Error Correction
• The comparison logic receives as input two K-bit values
• A bit-by-bit comparison is done by taking the XOR
• The result is called the syndrome word
– A value of 0 indicates that no error was detected
– Otherwise, the position of the bit in error can be determined from the syndrome word
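The comparison can be sketched as below. The helper name `syndrome_word` and the 4-bit sample values are hypothetical and serve only to show the bit-by-bit XOR.

```python
def syndrome_word(stored: str, recomputed: str) -> str:
    """Bit-by-bit XOR of two K-bit check values given as '0'/'1' strings."""
    if len(stored) != len(recomputed):
        raise ValueError("both check values must have K bits")
    return "".join(str(int(a) ^ int(b)) for a, b in zip(stored, recomputed))


print(syndrome_word("1011", "1011"))  # 0000 -> no error detected
print(syndrome_word("1011", "1110"))  # 0101 -> non-zero, an error was detected
```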
Required criteria for Hamming Error
Correction
• If the syndrome contains all 0s, no error has been detected
• If the syndrome contains one and only one bit set to 1, then an error has occurred in one of the K check bits. No correction is needed
• If the syndrome contains more than one bit set to 1, then the numerical value of the syndrome indicates the position of the data bit in error
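These three criteria map directly onto a small decision function. The sketch below assumes at most one bit is in error and takes the syndrome as an integer. The name `interpret_syndrome` is illustrative.

```python
def interpret_syndrome(s: int) -> str:
    """Apply the three criteria to a syndrome given as an integer value."""
    if s == 0:
        return "no error detected"
    if (s & (s - 1)) == 0:       # exactly one bit set, i.e. a power of two
        return f"check bit at position {s} is in error; no correction needed"
    return f"data bit at position {s} is in error; invert it"


print(interpret_syndrome(0b0000))
print(interpret_syndrome(0b0100))
print(interpret_syndrome(0b0110))
```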
SEC Step-by-Step
1. Determine how long the code
(check bits) must be
2. Determine the stored position
for each bit in M data bits and K
check bits
3. Construct the appropriate XOR
function that match with the
required criteria
1. Determine how long the code must be
• M : number of data bits
• K : number of check bits
• The K-bit syndrome must be able to identify an error in any of the M data bits or K check bits, in addition to the no-error case, so we must have :
$2^K - 1 \geq M + K$
• e.g.: for a word of 8 data bits (M=8), we have
K=3 : $2^3 - 1 = 7 < 8 + 3 = 11$ (too few check bits)
K=4 : $2^4 - 1 = 15 > 8 + 4 = 12$ (sufficient)
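The smallest K for a given M can be found by trying K = 1, 2, 3, ... until the inequality holds. This is only a sketch, and the function name `check_bits_needed` is illustrative.

```python
def check_bits_needed(m: int) -> int:
    """Smallest K such that 2**K - 1 >= M + K."""
    k = 1
    while 2 ** k - 1 < m + k:
        k += 1
    return k


for m in (8, 16, 32, 64):
    print(m, "data bits need", check_bits_needed(m), "check bits")
```

For M = 8 the loop rejects K = 3 (7 < 11) and stops at K = 4 (15 ≥ 12), matching the example above.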
2. Determine the stored position
Let’s see the explanation!
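The slide defers this step to an oral explanation. As a hedged sketch, the code below assumes the conventional Hamming layout, in which check bits occupy the positions that are powers of two (1, 2, 4, 8, ...) and data bits fill the remaining positions in order. The labels C1, C2, ... and D1, D2, ... are naming assumptions.

```python
def layout(m: int, k: int) -> dict:
    """Map positions 1..M+K to labels: check bits at powers of two, data elsewhere."""
    labels = {}
    data_index = 1
    for pos in range(1, m + k + 1):
        if (pos & (pos - 1)) == 0:       # 1, 2, 4, 8, ... hold check bits
            labels[pos] = f"C{pos}"
        else:
            labels[pos] = f"D{data_index}"
            data_index += 1
    return labels


for pos, label in layout(8, 4).items():
    print(f"{pos:2d} ({pos:04b}) {label}")
```

With this placement, the binary position number of every data bit has at least two 1s, while every check bit position has exactly one.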
3. Construct the XOR Function
Again, let’s see the explanation!
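A possible construction, assuming the conventional layout with check bits at power-of-two positions: each check bit is the XOR of all data bits whose position number has the corresponding binary digit set. The sketch also assumes that a data word is written D_M ... D_1 from left to right and that check bits are written from the highest position down to C1. With these assumptions it reproduces the check bits 0111 that the exercises give for the data word 00111001.

```python
def check_equations(m: int, k: int) -> dict:
    """Return {check-bit position: [indices of the data bits XORed together]}."""
    equations = {2 ** i: [] for i in range(k)}
    data_index = 0
    for pos in range(1, m + k + 1):
        if (pos & (pos - 1)) == 0:       # power of two: a check-bit position
            continue
        data_index += 1
        for c in equations:
            if pos & c:                  # position number contains this digit
                equations[c].append(data_index)
    return equations


def check_bits(data: str, k: int) -> str:
    """Check bits for a data word written D_M ... D_1 (assumed bit order)."""
    d = {i + 1: int(b) for i, b in enumerate(reversed(data))}
    out = []
    for c, members in sorted(check_equations(len(data), k).items(), reverse=True):
        bit = 0
        for i in members:
            bit ^= d[i]
        out.append(str(bit))
    return "".join(out)


for c, members in check_equations(8, 4).items():
    print(f"C{c} =", " XOR ".join(f"D{i}" for i in members))
print(check_bits("00111001", 4))         # 0111
```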
Hamming SEC-DED Code
• Nowadays, semiconductor memory is more commonly equipped with a single-error-correcting, double-error-detecting (SEC-DED) code
• Needs 1 extra parity bit that indicates whether the total number of 1s is even or odd
• Enhances the reliability of the memory, at the cost of added complexity
• E.g. :
– The IBM 30xx implementations used an 8-bit SEC-DED code for each 64 bits of data in main memory
• The size of main memory is therefore actually about 12% larger than is apparent to the user
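The role of the extra parity bit can be sketched on a small code: Hamming (7, 4) in the assumed layout p1 p2 d1 p3 d2 d3 d4, extended with one overall even-parity bit. This is an illustration of the SEC-DED idea under those assumptions, not the IBM 30xx implementation.

```python
def encode84(d1, d2, d3, d4):
    """Hamming (7,4) code word plus one overall even-parity bit (assumed layout)."""
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    word = [p1, p2, d1, p3, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit
    return word + [overall]              # total number of 1s is now even


def check84(word) -> str:
    """Classify a received 8-bit word using the syndrome and the overall parity."""
    syndrome = 0
    for pos in range(1, 8):
        if word[pos - 1]:
            syndrome ^= pos              # XOR of the positions that hold a 1
    parity_ok = sum(word) % 2 == 0
    if syndrome == 0 and parity_ok:
        return "no error"
    if not parity_ok:
        if syndrome == 0:
            return "single error in the extra parity bit"
        return f"single error at position {syndrome}, correctable"
    return "double error detected, not correctable"


w = encode84(1, 0, 1, 1)
print(check84(w))                        # no error
w[4] ^= 1
print(check84(w))                        # single error at position 5, correctable
w[1] ^= 1
print(check84(w))                        # double error detected, not correctable
print(8 / 64)                            # 0.125, the check-bit overhead for 64 data bits
```

A single error always makes the total number of 1s odd, while a double error leaves it even but produces a non-zero syndrome, which is how the two cases are told apart.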
Hamming SEC-DED Code (2)
Any Questions?
Reference
• Chapter 5.2: Error Correction (Stallings, William. Computer Organization and Architecture, 8th ed. Prentice Hall. 2010)
Exercises
1. Using the Hamming algorithm, how many check bits are required if the data word is 1024 bits long?
2. An 8-bit data word stored in memory is 11000010. Using the Hamming algorithm, determine the check bits that would be stored in memory.
3. For the 8-bit data word 00111001, the stored check bits are 0111. Suppose an error occurs when the word is read from memory. When the data word is read back from memory, the check bits calculated are 1101. What is the actual value of the erroneous data word that was read?
Week 8 Assignment
• Construct the XOR equations that determine the SEC code (check bits) using the Hamming algorithm for a 16-bit data word. What are the resulting check bits for the input data word 0101000000111001? Simulate how the Hamming algorithm corrects the error when an error occurs in data bit position 5 (0101000000101001). Explain your answer as completely as possible.
THANK YOU