PRS Encoder and Decoder

advertisement
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 1 of 22
NAND Flash Memories
For Spacecraft
By
ECC Technologies, Inc. (“ECC Tek”)
4750 Coventry Road East
Minnetonka, MN 55345-3909
www.ecctek.com
Phone: 952-935-2885
Fax: 952-935-2491
Email: phil.white@ecctek.com
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 2 of 22
Notice
This document does not contain any information ECC Technologies, Inc. considers confidential so you
may copy and distribute the document freely.
Phil White
President
ECC Technologies, Inc. (ECC Tek)
4750 Coventry Road East
Minnetonka, MN 55345-3909
Phone:
Fax:
E-mail:
Website:
www.ecctek.com
952-935-2885
952-935-2491
phil.white@ecctek.com
www.ecctek.com.
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 3 of 22
Revision History
05-02-10
pew
05-07-10
05-10-10
05-12-10
pew
pew
pew
05-12-10
pew
05-13-10
pew
www.ecctek.com
Created initial version of this document based on previous documents and
drawings.
Created v1 of this document.
Created v2 of this document.
Added sections to handle NAND Flash memory configurations with and without
data path width converters.
Created v3 of this document. Widened the second sample configuration to handle
65-bit wide data instead of 48 as in v2.
Stopped using version numbers so that we can focus on developing one final
document. Readers will know what version they have by looking at the Revision
History and the date of the document.
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 4 of 22
Sections
1. Introduction ............................................................................................................................................5
2. Brief History of ECC in NAND Flash ...................................................................................................6
3. Issues Involved in Continuing to Increase t in Binary BCH Codes .......................................................6
4. N-Page Blocks Divided into Segments ..................................................................................................7
5. 2D Encoding Concept ............................................................................................................................8
6. Hardware Encoding and Decoding of Segments ...................................................................................8
7. Correcting Erasures Plus Errors .............................................................................................................9
8. Repeated Decodings.............................................................................................................................10
9. Interface to the NAND Flash Memory ................................................................................................10
10. High-Level Block Diagram of Data Collection and Downloading System .........................................12
11. NAND Flash Memory Configurations With Data Path Width Converters .........................................13
11.1. 2D RS Encoding for First Sample Configuration ........................................................................15
11.2. 2D RS Decoding for First Sample Configuration ........................................................................16
11.3. Percentage Redundancy for First Sample Configuration .............................................................17
12. NAND Flash Memory Configurations Without Data Path Width Converters ....................................18
12.1. 2D RS Encoding for Second Sample Configuration ....................................................................20
12.2. 2D RS Decoding for Second Sample Configuration ....................................................................20
12.3. Percentage Redundancy for Second Sample Configuration .........................................................22
Figures
Figure 1 N Pages Divided into Segments .................................................................................................. 7
Figure 2 Encoded Data for 2D RS ECC Schemes ..................................................................................... 8
Figure 3 Encoding and Decoding of 2D Arrays in Hardware ................................................................... 9
Figure 4 64-bit to 44-bit Data Path Width Converter .............................................................................. 11
Figure 5 64-bit to 40-bit Data Path Width Converter .............................................................................. 11
Figure 6 Spacecraft Data Collection and Downloading System .............................................................. 12
Figure 7 Encoded 2R RS Array for First Sample Configuration ............................................................. 14
Figure 8 2D RS Encoding for First Sample Configuration...................................................................... 15
Figure 9 2D RS Decoding for First Sample Configuration ..................................................................... 16
Figure 10 Percentage Redundancy for First Sample Configuration ........................................................ 17
Figure 11 Encoded 2D RS Array for Second Sample Configuration ...................................................... 19
Figure 12 2D RS Encoding for Second Sample Configuration ............................................................... 20
Figure 13 2D RS Decoding for Second Sample Configuration ............................................................... 21
Figure 14 Percentage Redundancy for Second Sample Configuration .................................................... 22
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 5 of 22
1. Introduction
This document describes how to apply ECC Tek’s two-dimensional (2D) RS error-correction (ECC)
system to create fault-tolerant NAND Flash memories for spacecraft.
Many commercial companies are currently developing NAND Flash storage systems. For example,
Fusion-IO has recently received an additional $45M in funding and Pliant Technology has recently
received an additional $27M. Established hard disk drive (HDD) companies Seagate and Western
Digital also sell NAND Flash SSD devices.
NAND Flash solid state drives (SSDs) are attractive as replacements for HDDs because they are faster,
take less power, and are more shock resistant than HDDs. Unlike HDDs, SSDs have no moving parts
and are silent.
However, there are serious problems associated with using NAND Flash devices that need to be solved.
NAND Flash devices were originally designed as electrically erasable programmable read only
memories (EEPROMs) and were not originally designed for random access memories. NAND Flash
devices cannot be reprogrammed/rewritten an unlimited number of times as HDDs can because they
degrade with use. Also, blocks of storage cells in NAND Flash devices must be erased before sub
blocks (Pages) can be programmed/written.
The wear-out and failure characteristics of NAND Flash devices require the use of a powerful errorcorrection system to allow NAND Flash memories to be reliable over a long period of time.
The use of NAND Flash memories in spacecraft will also require the ECC system to tolerate entire chip
failures.
This document presents a general methodology for implementing highly reliable and fault-tolerant
NAND Flash memories. In order to quickly and effectively convey the concepts involved, two sample
configurations are described. The first configuration requires data path width converters and the second
one does not. The two configurations show designers how to create other configurations with the same
characteristics as the two samples. The first configuration is optimized to reduce gate count. The
second configuration is optimized to make it as easy as possible to match the data path width of the
NAND Flash memory to the widths of other system data paths. Readers should keep in mind that other
cases can be easily created with different Reed-Solomon (RS) symbol sizes and different levels of
correction. The methodology is not limited to the two described sample cases.
ECC Tek strongly believes there is very little risk in implementing the methodology described in this
document because ECC Tek has more than 30 years experience in designing ECC circuits, the idea of
correcting erasures proposed in this document has been implemented in a slightly different form in
multi-track tape drives to tolerate the failure of multiple tracks, and the idea of using error-correction
software as a recovery procedure in the event that hardware correction fails was used in the first ReedSolomon HDD implementations at what is now Seagate.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 6 of 22
With a 2D RS scheme as described in this document the designer of a NAND Flash memory has the
freedom of arbitrarily increasing the level of protection and also the number of failed chips the system
can tolerate. No other existing ECC system gives designers that freedom.
2. Brief History of ECC in NAND Flash
Shortly after NAND Flash devices were first manufactured, an ECC system was recommended by
Samsung that corrected t=1 bit in each Page. Pages were then ~512 bytes. The t=1 ECC recommended
by Samsung is very similar to a Hamming code which is a binary BCH code that can correct t=1 bit.
ECC Tek has designed and licensed numerous ECC designs for NAND Flash – a number of them are
currently in mass production. The original ECC design ECC Tek licensed for NAND Flash was a
Reed-Solomon (RS) system that operated on 10-bit symbols and corrected t=5 10-bit symbol errors.
Two of the bits in each symbol were forced to 0 in the data field so that the data field actually consisted
of 8-bit bytes. ECC Tek’s first RS system for NAND Flash was implemented in 2005 and went into
mass production shortly thereafter.
In 2007, ECC Tek developed and licensed its first programmable binary BCH encoder and decoder
designs for use with NAND Flash devices which could correct up to t=18 bits. Shortly thereafter,
ECC Tek developed and licensed a programmable binary BCH ECC system that could correct up to
t=30 bits. In the last 3 years, ECC Tek has licensed a number of similar binary BCH encoder and
decoder designs for NAND Flash.
ECC Tek has developed and synthesized a binary BCH design that can correct up to t=44 bits in 1024byte Pages, and there has been interest in binary BCH designs for NAND Flash that can correct up to
t=60 bits.
3. Issues Involved in Continuing to Increase t in Binary BCH Codes
When the number of bits correctable, t, and/or the data field length, K, increases, the complexity of
binary BCH encoders and decoders increase exponentially which limits the degree to which t and K can
be increased. ECC Tek has estimated that correcting t=60 bits in 1024-byte data field lengths will
require around 250K gates in an ASIC when a fairly low level of parallelism is implemented in the
decoder. Implementing a t=60 binary BCH design in an FPGA may not be practical at the present time.
When implementing binary BCH encoders and decoders, the codeword contains binary symbols “bits”
and a Galois or Finite Field must be used to uniquely identify, tag, address or locate each codeword
symbol/bit. For NAND Flash Page sizes larger than 1024 bytes but less than 2048 bytes, a locator field
with m=14-bit elements must be used. Fourteen-bit finite field elements are relatively large.
The number of redundant bits required by a t-bit error-correcting binary BCH code is r < mt where m is
the width of the locator field elements. Usually the number of redundant bits required is r = mt.
When implementing RS codes, the codeword contains nonbinary symbols and smaller locator fields can
be used than what are required for binary BCH codes. For example a RS locator field with 10-bit
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 7 of 22
elements was used in ECC Tek’s first RS design for Flash, and a locator field with 13-bit elements was
used in ECC Tek’s first binary BCH design. Both designs were for 512-byte data field lengths.
Apparently as time passes and feature sizes decrease the reliability of NAND Flash devices is decreasing
because ECC Tek continues to receive requests to correct more and more bits – up to t=60 bits per Page.
Seven years earlier only 1 bit per Page was corrected. This trend will most likely continue especially
with multi-level cell (MLC) NAND Flash devices.
Designing binary BCH encoders and decoders with very large t is difficult because the amount of time it
takes to compute the error locator polynomial, L(x), from the Syndrome, S(x), and the complexity (gate
count) of the decoder increases rapidly as t is increased. Synthesis results indicate that a t=50 binary
BCH decoder would probably take ~250K gates in an ASIC if the decoder was designed so that it could
keep up with continuous input data when correcting 50 bits in each received word.
The above-mentioned ECC schemes do not include any provision to correct errors caused by entire chip
failures. In order to achieve fault tolerance, some type of 2D ECC scheme must be used.
4. N-Page Blocks Divided into Segments
Data will be written to N NAND Flash chips in N-Page Blocks which are divided into logical Segments
as illustrated in Figure 1. Think of each column in Figure 1 as containing a single column of “bits”.
Assume one Page contains approximately 4K bytes.
1 2 3
N
Segment 1
Segment 2
Segment n
Figure 1 N Pages Divided into Segments
Each Segment will contain N vertical codewords.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 8 of 22
For binary BCH codes currently being implemented, each vertical codeword is approximately 512 or
1024 bytes with 4 or 8 Segments per Page.
ECC Tek is proposing the use of RS vertical codewords for the 2D ECC system instead of binary BCH
codewords and much smaller Segments than what are currently being used. The maximum height (in
number of symbols) of one vertical codeword in a Segment is determined by the size of the RS column
symbol being used.
5. 2D Encoding Concept
Think of the data in one Segment as a matrix or two-dimensional array of data items. Think of the
columns as vertical RS codewords and the rows that contain data as horizontal RS codewords.
Figure 2 illustrates how redundant data items are appended onto freely chosen data items in a 2D ECC
encoding scheme. Most likely 2D encoding schemes will be widely implemented in the near future. In
Figure 2, the rows are encoded first and then the columns.
Freely Chosen Data Item
Redundant Data Items
Redundant data items are mathematical
functions of the freely chosen data items or
previously computed redundant data items.
2D Array
Figure 2 Encoded Data for 2D RS ECC Schemes
6. Hardware Encoding and Decoding of Segments
A high-level block diagram of a NAND Flash storage device that encodes and decodes Segments in
hardware using a parallel Reed-Solomon (PRS) row encoder and decoder and multiple RS column
encoders and decoders is illustrated in Figure 3.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 9 of 22
Data In
PRS Row Encoder
RS
Column
Encoder
RS
Column
Encoder
RS
Column
Encoder
NAND
Flash
Chip 1
NAND
Flash
Chip 2
NAND
Flash
Chip N
RS
Column
Decoder
RS
Column
Decoder
RS
Column
Decoder
PRS Row Decoder
Data Out
Figure 3 Encoding and Decoding of 2D Arrays in Hardware
When writing, rows are encoded by the PRS row encoder, columns are encoded by multiple RS column
encoders, and a 2D encoded Segment, as illustrated in Figure 2, is written to N NAND Flash chips.
When reading, columns read from the N NAND Flash chips are decoded by multiple RS column
decoders and rows are decoded by a PRS row decoder. The PRS row decoder will pass through the
vertical redundancy unaltered if any of the row decodings for one Segment fails. If none of the row
decodings for one Segment fails, the 2D decoder has determined that either no errors occurred or that all
of the errors have been properly corrected.
Since multiple column encoders and decoders are required, the complexity of one column encoder and
decoder pair must be reasonable in order for the 2D scheme to be practical – especially when the
2D ECC scheme is implemented in FPGAs as is the standard practice for spacecraft. Since we are
considering relatively small symbol sizes, there will necessarily be multiple Segments per Page. For
example, if we consider 4096-byte Pages and 6-bit column symbols, there will be more than 92
Segments per Page.
7. Correcting Erasures Plus Errors
Both the RS column and row encoders and decoders implemented in hardware can be designed to
correct erasures plus errors. An erasure means a codeword location where the codeword symbol in that
location has a high probability of being in error, but is not necessarily in error.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 10 of 22
The PRS row decoder should always be designed to correct errors plus erasures, but the RS column
decoders would probably be designed to correct errors only. When a column decoding fails, the decode
fail signal from the column decoder indicates that all of the row symbols in that vertical codeword are
erased. The PRS row decoder can correct t errors and s erasures as long as 2t + s < R where R is the
number of redundant symbols per row codeword.
8. Repeated Decodings
Both the row and column decoders can output entire corrected codewords (both corrected data and
corrected redundancy) so both column and row decoding on each Segment can be repeated an unlimited
number of times if desired or necessary.
If none of the row decodings on a Segment fail, then it is assumed that the data in that segment either
has no errors or that all of the errors have been correctly corrected. If any of the row decodings fail, the
PRS row decoder will pass the corrected column redundancy through the PRS row decoder without
altering it so that column and row decodings can be repeated if desired.
It is possible to duplicate the column and row hardware decoders multiple times or to feed the data back
to the input by using more FIFOs and muxes, but that would probably be overkill since one column
decoding followed by one row decoding will most likely correct all of the errors almost all of the time.
Most likely the best way to implement multiple decodings would be to do repeated decodings in
software only when needed. Software correction time is probably not critical because repeated
decodings would rarely be done. That way, row and column decodings can be repeated any number of
times and no additional hardware is required so it keeps the complexity of the ECC hardware to a
minimum.
When an error pattern occurs that is too severe for one column and row decoding to correct it (which
may never happen or may happen only once a year), then some means would need to be provided for the
software to access and redecoded the codewords resulting from one correction and repeat the correction
any number of times in an attempt to recover the data. Since ECC Tek licenses both C and Verilog
code, the C code can be used for that purpose.
9. Interface to the NAND Flash Memory
N is the number of NAND Flash chips that are written and read simultaneously. In the first sample
configuration N=15, the row symbol width is 4 and the number of redundant symbols is 4 so the input
and output data path widths for the NAND Flash memory can be up to 11 symbols = 44 bits.
What if we want to use a 44-bit wide NAND Flash memory in a 64-bit system?
To do that, we need a 64-bit-to-44-bit converter at the input of the NAND Flash memory and a
44-bit-to-64-bit converter at the output.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 11 of 22
For the input converter in this case we need to find the smallest integers x and y such that 64x = 44y.
Note that x/y=44/64=22/32=11/16. Since 11 and 16 have no common factors, we need 11, 64-bit
registers to convert a 64-bit data path to a 44-bit data path as illustrated in Figure 4.
The circuit shown in Figure 4 can be thought of as a FIFO with different input and output widths.
Control logic similar to a standard FIFO’s control logic can be developed.
There are several ways to handle the clock with a 64-bit-to-44-bit input converter. One way is to pause
five clock cycles for every 11 clock cycles while writing to the NAND Flash memory. Another way is
to implement a state machine which will monitor the circuit and issue a pause signal to the input as
needed. Another way is to use a higher frequency clock for the NAND Flash memory than what is used
for the input logic.
64
1
2
1
2
3
4
5
6
7
8
9
10
11
SM
16
44
Figure 4 64-bit to 44-bit Data Path Width Converter
Other converters are simpler such as the 64-bit-to-40-bit input converter shown in Figure 5.
64
1
SM
2
3
4
5
40
1
2
8
40
Figure 5 64-bit to 40-bit Data Path Width Converter
Converters required at the output follow the same pattern as those shown and are easy to design.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 12 of 22
10. High-Level Block Diagram of Data Collection and Downloading System
We will describe how to implement highly reliable, fault-tolerant NAND Flash memories to be used in
spacecraft data collection and downloading systems as illustrated in Figure 6.
Sensors, Cameras, etc.
Data Collector
pause
control
State
Machine
Main Data
FIFO
(Dual-Port RAM)
status
Trasmitters
Earth
Figure 6 Spacecraft Data Collection and Downloading System
Assume that the Data Collector can be paused between words written to the Main Data
FIFO (MDFIFO).
The MDFIFO can be implemented using a dual-port RAM. Assume the read and write address registers
are initially reset to 0. When a word is written to the MDFIFO, the write address is incremented by 1.
When a word is read from the MDFIFO, the read address is incremented by 1. Whenever the write and
read addresses equal the end of the RAM, they are reset to 0.
Assume that we have a “full_level” variable that is incremented by 1 every time a word is written to the
MDFIFO and decremented by 1 every time a word is read from the MDFIFO. If a word is written and
read in the same clock cycle, the full_level is not changed. The full_level variable indicates the number
of words that are currently in the MDFIFO. In order for the Transmitter to read the MDFIFO, the
full_level must be > 1.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 13 of 22
If the full_level is close to the size of the RAM, the pause signal to the Data Collector will be asserted
indicating that inputs to the MDFIFO must be paused to avoid an overflow condition.
The above-described method of implementing a synchronous FIFO is the method that ECC Tek uses in
all of its ECC designs for the Data FIFO (DFIFO) in its decoders.
The remainder of this document describes how the MDFIFO can be designed using NAND Flash
memory chips so that it will be highly reliable over time and contain an arbitrary level of fault-tolerance.
11. NAND Flash Memory Configurations With Data Path Width Converters
Data path width converters may sometimes be required if designers wish to minimize the complexity
(gate count) of the ECC circuits. Since most spacecraft electronics are implemented in FPGAs, it is
often important and necessary to reduce gate counts so the circuits will fit into a small number of
FPGAs.
We will use 6-bit column symbols and 4-bit row symbols for the first sample configuration with 4
redundant row symbols per horizontal codeword and 6 redundant column symbols per vertical
codeword. For that case, up to 44-bit data words can be input in parallel.
With 6-bit column symbols, the maximum height of a vertical RS codeword is 63 6-bit symbols = 378
bits. Therefore, the height of each Segment must be < 63, 6-bit symbols. For the first sample
configuration, the height of one Segment was chosen to be 60, 6-bit symbols because
60 symbols * 6 bits/symbol = 360 bits which is a multiple of 4, 6 and 8 so that one column of a Segment
contains an equal number of 4-bit, 6-bit and 8-bit quantities/symbols.
With NAND Flash Page sizes of ~4096 bytes, there will ~92 Segments per Page.
A 2D encoded array with 4-bit row symbols outlined in red and 6-bit column symbols outlined in blue is
illustrated in Figure 7. All of the bits from one column are written to one NAND Flash chip. It doesn’t
matter whether the NAND Flash chip inputs and outputs 8-bits at a time or 16-bits at a time or more.
The only thing that matters is that all of the bits from one column are written to one NAND Flash chip
so that if one NAND Flash chip fails, only one column of bits will be affected.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 14 of 22
Vertical
Codeword
6-bit
Symbols
Horizontal
Codeword
4-bit
Symbols
1
2
3
4
5
6
11 12 13 14 15
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
1
1
0
1
1
1
1
0
1
0
0
1
0
0
1
0
1
1
0
1
1
1
0
1
1
0
1
1
1
0
1
1
0
1
1
1
0
1
1
1
0
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
1
0
1
1
0
0
1
0
0
1
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
1
0
0
1
0
1
1
0
0
0
0
0
0
0
1
1
0
1
1
0
1
1
1
0
0
1
1
1
1
0
1
0
1
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
0
1
1
1
1
0
0
1
0
0
0
1
0
0
0
0
1
0
1
0
0
0
0
1
0
0
1
0
0
1
1
0
1
1
0
1
0
0
1
1
1
0
1
0
1
0
1
0
1
0
1
1
1
1
1
0
1
1
0
1
0
0
1
1
1
0
1
0
1
0
1
0
1
0
1
1
1
4-bit
Symbol
6-bit
Symbol
Figure 7 Encoded 2R RS Array for First Sample Configuration
The above scheme allows up to 15 4-bit symbols per horizontal codeword and up to 63 6-bit symbols
per vertical codeword.
Most of the time, the column decoders will be able to correctly recover the data without the aid of the
row decoder. However, if a NAND Flash chip fails, then many column decodings in one column will
fail indicating that all of the symbols in those codewords should be considered erasures by the row
decoder.
With 4 redundant row symbols, up to t errors and s NAND Flash chip failures can be corrected in each
row as long as 2t + s < 4 which is probably more than sufficient. In other words, with no soft errors, up
to 4 NAND Flash chips can fail with no loss of data.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 15 of 22
With 6 redundant 6-bit column symbols, up to 3 soft errors in 60 can be corrected in each vertical
codeword. Some work should be done to determine how many vertical symbols need to be corrected for
a specific type of NAND Flash device, but it’s OK for the column decoding to fail every once in a while
because the row decoder can correct for multiple column decoding failures.
11.1. 2D RS Encoding for First Sample Configuration
A block diagram of the 2D RS encoder hardware for the first sample configuration is shown in Figure 8.
Pause
Input
OR
Input
1
2
3
4
5
6
7
8
9
10 11
Parallel RS Encoder
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15
Row Symbol
to
Column Symbol
Converter
Row Symbol
to
Column Symbol
Converter
Serial RS Encoder
Serial RS Encoder
Write Interface
to Flash Chip
Write Interface
to Flash Chip
Flash Chip
Flash Chip
Figure 8 2D RS Encoding for First Sample Configuration
Data can be continuously input to the 2D RS encoder as long as the pause input signal is not asserted.
Whenever the encoder reaches the end of a data field, it will request the input to pause while it outputs
the redundancy for the current N vertical codewords. Once the pause signal is deasserted, the input can
continue to input data words.
All of the pieces of logic shown in Figure 8 have been previously designed and licensed by ECC Tek.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 16 of 22
11.2. 2D RS Decoding for First Sample Configuration
A block diagram of the 2D RS decoder for the first sample configuration is shown in Figure 9.
Flash Chip 1
Flash Chip N
Read
Interface to
Flash Chip
Read
Interface to
Flash Chip
RS
Column 1
Decoder
RS
Column N
Decoder
Fail
Fail
F
I
F
O
State
Machine
Control
F
I
F
O
Col Symbol
to
Row Symbol
Converter
Fail
Col Symbol
to
Row Symbol
Converter
PRS Row Decoder
Output
Figure 9 2D RS Decoding for First Sample Configuration
The FIFOs are needed to store corrected vertical codewords from one Segment because a column
decoder cannot know if decoding has failed until the entire codeword has been outputted, but the PRS
Row Decoder needs to receive an erasure indication at the same time it receives an input word. The Fail
signal from the column decoder is the erasure indication. A column decoding failure means the entire
vertical corrected codeword stored in the decoder’s output FIFO has been “erased”.
For the first sample configuration, the Column Decoder FIFO only needs to store 60 6-bit symbols.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 17 of 22
11.3. Percentage Redundancy for First Sample Configuration
Figure 10 shows the first sample configuration has 34% redundancy. The percentage redundancy is a
function of the level of protection provided. The more protection provided, the higher the percentage
redundancy.
One Segment
4 bits
11 bits
324 bits
Row
Redundancy
Data
36 bits
360 bits
Column
Redundancy
15 bits
Total number of bits in Segment = 360 x 15 = 5400
Number of Redundant bits in Segment = 1836
Percentage Redundancy = 1836/5400 = .34 = 34%
Figure 10 Percentage Redundancy for First Sample Configuration
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 18 of 22
12. NAND Flash Memory Configurations Without Data Path Width Converters
This section describes how to design 2D RS NAND Flash memories that do not require data path width
converters. Generally speaking, these configurations will be more complex and take more gates to
implement than configurations with data path width converters.
We will use 6-bit column symbols and 5-bit row symbols for the second sample configuration with 4
redundant row symbols per horizontal codeword and 6 redundant column symbols per vertical
codeword. For this case, up to 135-bit data words can be input in parallel, but we will input 65-bit data
words = 13 5-bit symbols. The 65th bit can be forced to 0 or it can be used for some type of auxiliary
data.
With 6-bit column symbols, the maximum height of a vertical RS codeword is 63 6-bit symbols = 378
bits.
In this second configuration, the row and column symbol boundaries are aligned every 30 bits since 5
and 6 have no common factors.
To determine how often the row and column symbol boundaries align, we use the same method as we
used for the data path width converters. That is, we find the lowest x and y so that 6x=5y. Note that
x/y=5/6 which cannot be reduced further so boundaries in this case will be aligned only every 30 bits.
In the first configuration the row and column boundaries were aligned every 12 bits since 4/6=2/3.
Although Verilog code has not been developed for these designs yet, it appears that the symbol
boundaries must line up at the end of the data. In other words, it appears that the height (in bits) of a
data field shown in Figure 14 in a Segment for this configuration must be 30i where i is an integer.
It also appears that the height of the column redundancy field shown in Figure 14 can be any number of
6-bit symbols since the PRS row decoder ignores the column redundancy fields.
The height of one Segment was chosen to be 60, 6-bit symbols because
60 symbols * 6 bits/symbol = 360 bits which is a multiple of 5, 6 and so that one column of a Segment
contains an equal number of 5-bit and 6-bit quantities/symbols.
With NAND Flash Page sizes of ~4096 bytes, there will ~92 Segments per Page.
A 2D encoded array with 5-bit row symbols outlined in red and 6-bit column symbols outlined in blue is
illustrated in Figure 11. All of the bits from one column are written to one NAND Flash chip. It doesn’t
matter whether the NAND Flash chip inputs and outputs 8-bits at a time or 16-bits at a time or more.
The only thing that matters is that all of the bits from one column are written to one NAND Flash chip
so that if one NAND Flash chip fails, only one column of bits will be affected.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 19 of 22
Vertical
Codeword
6-bit
Symbols
Horizontal
Codeword
5-bit
Symbols
1
2
3
4
5
6
13 14 15 16 17
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
1
1
0
1
1
1
1
0
1
0
0
1
0
0
1
0
1
1
0
1
1
1
0
1
1
0
1
1
1
0
1
1
0
1
1
1
0
1
1
1
0
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
1
0
1
1
0
0
1
0
0
1
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
1
0
0
1
0
1
1
0
0
0
0
0
0
0
1
1
0
1
1
0
1
1
1
0
0
1
1
1
1
0
1
0
1
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
0
1
1
1
1
0
0
1
0
0
0
1
0
0
0
0
1
0
1
0
0
0
0
1
0
0
1
0
0
1
1
0
1
1
0
1
0
0
1
1
1
0
1
0
1
0
1
0
1
0
1
1
1
1
1
0
1
1
0
1
0
0
1
1
1
0
1
0
1
0
1
0
1
0
1
1
1
5-bit
Symbol
6-bit
Symbol
Figure 11 Encoded 2D RS Array for Second Sample Configuration
The above scheme allows up to 31 5-bit symbols per horizontal codeword and up to 63 6-bit symbols
per vertical codeword.
Most of the time, the column decoders will be able to correctly recover the data without the aid of the
row decoder. However, if a NAND Flash chip fails, then many column decodings in one column will
fail indicating that all of the symbols in that codeword should be considered erasures by the row
decoder.
With 4 redundant row symbols, up to t errors and s NAND Flash chip failures can be corrected in each
row as long as 2t + s < 4 which is probably more than sufficient. In other words, with no soft errors, up
to 4 NAND Flash chips can fail with no loss of data.
With 6 redundant 6-bit column symbols, up to 3 soft errors in 60 can be corrected in each vertical
codeword. Some work should be done to determine how many vertical symbols need to be corrected for
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 20 of 22
a specific type of NAND Flash device, but it’s OK for the column decoding to fail every once in a while
because the row decoder can correct for multiple column decoding failures.
12.1. 2D RS Encoding for Second Sample Configuration
A block diagram of the 2D RS encoder hardware for the second sample configuration is shown in
Figure 12.
Pause
Input
OR
Input
1
2
3
11 12 13
Parallel RS Encoder
1
2
3
15 16 17
Row Symbol
to
Column Symbol
Converter
Row Symbol
to
Column Symbol
Converter
Serial RS Encoder
Serial RS Encoder
Write Interface
to Flash Chip
Write Interface
to Flash Chip
Flash Chip
Flash Chip
Figure 12 2D RS Encoding for Second Sample Configuration
Data can be continuously input to the 2D RS encoder as long as the pause input signal is not asserted.
Whenever the encoder reaches the end of a data field, it will request the input to pause while it outputs
the redundancy for the current codeword. Once the pause signal is deasserted, the input can continue to
input data words.
All of the pieces of logic shown in Figure 12 have been previously designed and licensed by ECC Tek.
12.2. 2D RS Decoding for Second Sample Configuration
A block diagram of the 2D RS decoder for the second configuration is shown in Figure 13.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Flash Chip 1
Flash Chip N
Read
Interface to
Flash Chip
Read
Interface to
Flash Chip
RS
Column 1
Decoder
RS
Column N
Decoder
Fail
Fail
F
I
F
O
State
Machine
Control
F
I
F
O
Col Symbol
to
Row Symbol
Converter
Fail
Page 21 of 22
Col Symbol
to
Row Symbol
Converter
PRS Row Decoder
Output
Figure 13 2D RS Decoding for Second Sample Configuration
The FIFOs are needed to store corrected vertical codewords from one Segment because a column
decoder cannot know if decoding has failed until the entire codeword has been outputted, but the PRS
Row Decoder needs to receive an erasure indication at the same time it receives an input word. The Fail
signal from the column decoder is the erasure indication. A column decoding failure means the entire
vertical corrected codeword stored in the decoder’s output FIFO has been “erased”.
For the sample configuration, the Column Decoder FIFO only needs to store 60 6-bit symbols.
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
NAND Flash Memories For Spacecraft
ECC Technologies, Inc.
Page 22 of 22
12.3. Percentage Redundancy for Second Sample Configuration
Figure 14 shows the second sample configuration has 29.5% redundancy. The percentage redundancy is
a function of the level of protection provided. The more protection provided, the higher the percentage
redundancy.
One Segment
4 bits
13 bits
330 bits
Row
Redundancy
Data
36 bits
366 bits
Column
Redundancy
17 bits
Total number of bits in Segment = 366 x 17 = 6222
Number of Redundant bits in Segment = 1932
Percentage Redundancy = 1932/6222 = .31 = 31%
Figure 14 Percentage Redundancy for Second Sample Configuration
www.ecctek.com
phil.white@ecctek.com
May 13, 2010
Download