Towards an information and coding theory for non-volatile memories
Luis A. Lastras-Montaño, IBM Research
Joint work with: A. Elfadel, M. M. Franceschini, A. Jagmohan, J. Karidis, T. Mittelholzer, M. Sharma, M. Wegman
UCSD Non-volatile Memory Workshop 2010

INTRODUCTION
A new branch of coding and information theory is being born, devoted to solid state storage class memories (SCM).
◮ This theory has the opportunity to unlock exciting tradeoffs.
◮ The previous 5 talks are evidence of this statement - focus on FLASH.
The new theory is not simply about more "Error Correction Coding".
◮ Classical channel capacity ideas play an important role, but particular characteristics of SCM are leading to new concepts that could play important practical roles.

OUR RESEARCH
◮ Our group works on a server's memory system.
◮ Our interest in storage class memory is influenced by our application focus.
What we want from an SCM:
◮ Highly reliable - we expect no data integrity escapes and an uncorrectable error rate similar to other computing system components.
◮ Significantly denser than DRAM.
◮ Sufficient memory writes - with wear leveling, we want the memory to last for the duration of the system's useful life.
We believe that Phase Change Memory (PCM) has the potential of becoming our desired SCM.
◮ PCM is the focus of the rest of the talk.

WE WANT TO ANSWER:
I am a communication/information sciences researcher. How can I help?
We will give some answers to this question in the case of PCM.

A PRIMER ON PCM
◮ PCM is in essence a programmable variable resistor with a very large dynamic range, 10^3 Ω to 10^6 Ω.
◮ Similar material is used in rewritable DVDs.
[Figure: cross-section of a PCM cell; labels: top electrode, polycrystalline GST, amorphous, bottom electrode]
The memory cell may be seen as a composite of material in two phases: amorphous (high resistance) and crystalline (low resistance).
◮ PCM is organized in an array of cells.
◮ Cells in the same row are accessed simultaneously during a read/write operation.
◮ Multiple bits/cell (multiple levels) are feasible and key to PCM's expected success.
◮ There are no fundamental restrictions on what we might write onto individual cells in a row.
  ◮ This is a fundamental advantage over FLASH.
  ◮ We may choose not to overwrite existing content on individual cells.

WRITING TO PCM
◮ Phase changes (and hence resistance changes) are induced using heat.
◮ In turn, heat is created by passing electrical current through the cell.
◮ The shape of the current pulse determines the final state of the cell.
◮ A sharp, short RESET pulse melts the cell and then quenches it, creating an amorphous region.
◮ A longer, smaller amplitude SET pulse crystallizes the cell through annealing, putting it into a polycrystalline state.
◮ Amorphous → high resistance.
◮ Polycrystalline → low resistance.

MEMORY MODEL
[Figure: memory array organization; labels: row, read/write circuitry, bitline, memory controller]
◮ Think of each cell as an entity with an analog read and write channel.
◮ The read channel retrieves the state of the cell. The write channel's statistical response in general depends on the current state of a cell.
◮ Think of each row as a collection of (generally parallel) read/write channels that are used on a read or write.
◮ Think of a memory as a collection of rows.

USING REWRITING TO GET 2 BITS/CELL
◮ Assume the memory cell has additive Gaussian write noise.
◮ We place 4 bins (each denoting a cell "level") in the available input range.
◮ We stimulate the cell with an input equal to the center of the target bin.
◮ We continue stimulating it until its value is inside the bin.
[Figure: density vs. resistance for levels 0 to 3, comparing "Single write attempt" and "Multiple write attempts"]
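The rewriting procedure in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input range [0, 8], the noise standard deviation of 1, and the function name `write_and_verify` are hypothetical choices, not values taken from the talk.

```python
import random

def write_and_verify(level, num_levels=4, lo=0.0, hi=8.0, sigma=1.0,
                     rng=None, max_attempts=10_000):
    """Stimulate a cell at the center of its target bin until the value lands in the bin.

    Returns (final_value, number_of_attempts). All numeric defaults are
    illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()
    width = (hi - lo) / num_levels          # equal-width bins over the input range
    bin_lo = lo + level * width
    bin_hi = bin_lo + width
    center = 0.5 * (bin_lo + bin_hi)
    for attempt in range(1, max_attempts + 1):
        value = center + rng.gauss(0.0, sigma)   # additive Gaussian write noise
        if bin_lo <= value < bin_hi:             # verify: stop once inside the bin
            return value, attempt
    raise RuntimeError("cell did not converge within max_attempts")

if __name__ == "__main__":
    rng = random.Random(0)
    value, attempts = write_and_verify(level=2, rng=rng)
    print(f"stored value {value:.3f} after {attempts} attempt(s)")
```

With a single attempt the written value is spread as a full Gaussian around the bin center; with repeated attempts it is confined to the bin, at the price of a random number of writes.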

SUMMARY OF IMPORTANT PCM READ CHANNEL CHARACTERISTICS
A resistance read after programming in general changes over time. This change is dominated by multiple effects:
◮ Resistance drift: Trend is "upwards", roughly as a line in a resistance/time log-log plot. Starts immediately after programming.
◮ Recrystallization: Trend is "downwards". Any amorphous content in the cell starts to recrystallize if the temperature of the cell is sufficiently high. This is a concern for storage applications.
[Figure: "Drift: resistance vs time -- cell 5"; log R (1e+04 to 1e+07) vs. time [s] (1 to 10000)]
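A straight line in a log-log plot corresponds to a power law in time. The sketch below assumes the form R(t) = R0 (t/t0)^ν; the exponent ν, the reference time t0, and the numeric values are hypothetical and are not taken from the measured data in the figure.

```python
import math

def drifted_resistance(r0, t, nu, t0=1.0):
    """Power-law drift R(t) = r0 * (t / t0) ** nu (assumed model, t >= t0)."""
    if t < t0:
        raise ValueError("model is only used for t >= t0")
    return r0 * (t / t0) ** nu

def loglog_slope(r_a, t_a, r_b, t_b):
    """Slope of the line through two points in a log R vs. log t plot."""
    return (math.log10(r_b) - math.log10(r_a)) / (math.log10(t_b) - math.log10(t_a))

if __name__ == "__main__":
    r0, nu = 1e5, 0.1                      # illustrative values only
    r_10 = drifted_resistance(r0, 10.0, nu)
    r_1000 = drifted_resistance(r0, 1000.0, nu)
    print(loglog_slope(r_10, 10.0, r_1000, 1000.0))   # recovers nu up to rounding
```

Under this assumption the slope of the log-log line is the drift exponent, so a positive exponent gives the "upwards" trend described above.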

SUMMARY OF IMPORTANT PCM WRITE CHANNEL CHARACTERISTICS
◮ Repeated applications of the same programming signal in general do not result in the same programmed resistance.
◮ (heterogeneity I) Different memory cells react differently to the same programming signal.
◮ (heterogeneity II) Different memory cells can in general reach different resistance ranges.
◮ (iterative programming) To attain multibit/cell operation, a "write-and-verify" feedback algorithm is used.
[Figure: sequences of W and R steps plotted against time; vertical axis 0 to 7]
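A toy version of a write-and-verify feedback loop for heterogeneous cells might look as follows. Everything specific here is an assumption made for illustration: the linear cell response `gain * pulse + offset`, the proportional correction rule, and all numeric values. The talk does not specify the actual programming algorithm.

```python
import random

def program_cell(target, gain, offset, noise_sigma=0.01, tol=0.05,
                 step=0.5, rng=None, max_iters=200):
    """Write, read back, and correct the pulse until the read value is within tol.

    gain and offset are per-cell parameters unknown to the programmer
    (hypothetical linear cell model). Returns (read_value, iterations).
    """
    if rng is None:
        rng = random.Random()
    pulse = target                      # first guess ignores the cell's parameters
    for iteration in range(1, max_iters + 1):
        # WRITE: the same pulse gives a different result per cell and per attempt
        read = gain * pulse + offset + rng.gauss(0.0, noise_sigma)
        # VERIFY: compare the read-back value against the target
        error = target - read
        if abs(error) <= tol:
            return read, iteration
        pulse += step * error           # feedback: proportional correction
    raise RuntimeError("cell did not converge within max_iters")

if __name__ == "__main__":
    rng = random.Random(0)
    for gain, offset in [(0.5, -0.2), (1.0, 0.0), (1.5, 0.3)]:
        read, iters = program_cell(2.0, gain, offset, rng=rng)
        print(f"gain={gain}, offset={offset}: read {read:.3f} after {iters} iteration(s)")
```

In this toy model, cells that respond weakly to the pulse need more iterations than cells that respond strongly, which is one way heterogeneity shows up as a variable number of write and read steps.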

THE REWRITABLE CHANNEL MODEL (LASTRAS ET AL., ISITA'08)
[Figure: rewritable channel model; labels: the write outcome in general depends on the earlier state, resistance drift, recrystallization, cell heterogeneity, read-and-verify]

STORAGE CAPACITY - FOCUS ON DRIFT NOISE (FRANCESCHINI ET AL., ICC'10)
◮ Model drift (empirically) as a Wiener process with an increment that has a positive mean.
◮ Storage capacity then depends on the time between the write and its subsequent read.
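To see why the elapsed time matters, the sketch below simulates a Wiener process with positive mean increment and estimates the mean and spread of the stored value at read time. The parameters `mu` and `sigma`, the unit-free state variable, and the discretization are illustrative assumptions, not the fitted model from the cited paper.

```python
import math
import random

def simulate_drift(t_read, mu=0.2, sigma=0.5, steps=50, rng=None):
    """One sample path of dX = mu dt + sigma dW, started at X(0) = 0."""
    if rng is None:
        rng = random.Random()
    dt = t_read / steps
    state = 0.0
    for _ in range(steps):
        # Gaussian increment with positive mean mu*dt and variance sigma^2 * dt
        state += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return state

def drift_statistics(t_read, num_cells=4000, mu=0.2, sigma=0.5, rng=None):
    """Sample mean and standard deviation of the drifted value over many cells."""
    if rng is None:
        rng = random.Random()
    samples = [simulate_drift(t_read, mu, sigma, rng=rng) for _ in range(num_cells)]
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
    return mean, math.sqrt(var)

if __name__ == "__main__":
    rng = random.Random(0)
    for t in (1.0, 10.0):
        mean, std = drift_statistics(t, rng=rng)
        print(f"t={t}: mean shift {mean:.2f}, spread {std:.2f}")
```

In this model the mean shift grows like mu·t and the spread like sigma·√t, so levels written close together become harder to tell apart as the time between write and read grows.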

STORAGE CAPACITY - FOCUS ON CELL HETEROGENEITY (JAGMOHAN ET AL., ICC'10)
◮ We use empirical PCM data to model the heterogeneity in achievable resistance ranges of an ensemble of cells.
◮ We assume that the encoder knows these achievable resistances but the decoder does NOT. This corresponds to the "Channel State Information available at the Transmitter" (CSIT) information theoretic setting.
◮ Naïve coding uses only those levels that can be reached by every cell to store information.
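The naïve scheme in the last bullet amounts to intersecting the achievable ranges of all cells. A minimal sketch follows, assuming each cell is described by a (low, high) achievable interval and that levels are placed at a fixed spacing; the spacing and the example intervals are hypothetical.

```python
import math

def naive_levels(ranges, spacing):
    """Return the range reachable by every cell and how many levels fit in it.

    ranges:  list of (low, high) achievable intervals, one per cell
    spacing: minimum separation between adjacent levels (assumed fixed)
    """
    common_lo = max(lo for lo, _ in ranges)   # every cell must reach at least this
    common_hi = min(hi for _, hi in ranges)   # no cell may be asked to exceed this
    if common_hi < common_lo:
        return (common_lo, common_hi), 0      # no level is reachable by all cells
    count = int(math.floor((common_hi - common_lo) / spacing)) + 1
    return (common_lo, common_hi), count

if __name__ == "__main__":
    # Illustrative intervals, e.g. in units of log10(resistance)
    cells = [(3.0, 6.0), (3.5, 5.5), (3.25, 6.0)]
    common, count = naive_levels(cells, spacing=0.5)
    print(common, count, f"{math.log2(count):.2f} bits/cell")
```

The weakest cells set the common range, which is why the naïve scheme leaves room for coding that exploits the encoder's knowledge of each cell.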

ENDURANCE CODING
◮ Suppose you want 2x endurance. You might buy 2x the memory (naïve).
◮ There is an exciting way of managing endurance better through coding.
◮ The basic assumption is that reducing resistance does not cause wear, but that increasing it does cause wear.
◮ Before a write, a read of the cells is performed so as to select a "kind" way to write the message.
◮ The capacity/endurance tradeoff of PCM has been computed for this model (Lastras et al., ISIT'09).
[Figure: bits/cell (0 to 6) vs. mean time between resets (10^0 to 10^4, log scale); curves: Conditional entropy bound, Multicell codes, Waterfall codes, Naïve]

THE WATERFALL CODE

params1                      params2
physical   information       physical   information
content    message           content    message
   0          0                 0          0
   1          1                 1          1
   2          0                 2          2
   3          1                 3          3
   4          0                 4          0
   5          1                 5          1
   6          0                 6          2
   7          1                 7          3
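Both tables map a physical level to a message by taking the level modulo the number of messages (2 for params1, 4 for params2). One plausible reading of how such a code is used for writing is sketched below. The write rule is an assumption made for illustration: levels are assumed to be indexed so that moving to a higher index is the wear-free direction, and a reset (which costs wear) returns the cell to the bottom of the range. The function names are hypothetical.

```python
def waterfall_read(level, q):
    """Message stored at a physical level, as in the tables: level mod q."""
    return level % q

def waterfall_write(current_level, message, q, num_levels=8):
    """Choose the next physical level for a message; returns (new_level, reset_used).

    Assumed rule: move to the nearest level at or above the current one that
    encodes the message; if none exists, reset and write from the bottom.
    """
    if not 0 <= message < q:
        raise ValueError("message must be in range(q)")
    step = (message - current_level) % q      # distance to the next matching level
    candidate = current_level + step
    if candidate < num_levels:
        return candidate, False               # wear-free move
    return message, True                      # reset, then write the lowest matching level

if __name__ == "__main__":
    level, resets = 0, 0
    for message in [1, 0, 1, 1, 0, 1, 0, 1, 0]:
        level, reset_used = waterfall_write(level, message, q=2)
        resets += reset_used
        assert waterfall_read(level, 2) == message
    print(level, resets)
```

With fewer messages per cell (params1), the cell can absorb more writes before a reset is needed, which is the capacity versus endurance tradeoff in miniature.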

REWRITABLE CHANNEL THEORY
◮ One of the unique characteristics of multibit/cell non-volatile memories is that one often requires multiple attempts at getting a desired value written.
◮ Reasons include inherent write noise, as well as unknown cell parameters and cell heterogeneity.
◮ We have been developing an information theory that incorporates these elements, starting with write noise.

SIMPLE MODEL
◮ Memory comprises n memory cells written/read in parallel.
◮ The behavior of each memory cell is statistically independent from other cells.
◮ Input stimulus has alphabet $\mathcal{X} \subset \mathbb{R}$.
◮ State space for cell has alphabet $\mathcal{Y} \subset \mathbb{R}$.
◮ The relation between a cell state and an input stimulus is given by a probability law $\mu_{S|X}$ (the write noise).
◮ The state of the cell can be read noiselessly and without disturbing the state.
◮ Much work needs to be done to obtain information theoretic results for more realistic models.
[Figure: parallel channels $\mu_{S|X}$, each mapping an input stimulus to a resulting state]

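SINGLE LETTER ITERATED REWRITABLE CHANNEL
[Figure: $X \in \mathcal{X}$ is passed repeatedly through $\mu_{S|X}$, producing $Y_i$; the loop continues while $Y_i \notin D$ and closes when $Y_i \in D$, with $D \subset \mathcal{Y}$, giving the output $Y_L \in \mathcal{Y}$]
For each $i \ge 1$, $Y_i$ is the result of passing the random variable $X$ through a copy of the channel $\mu_{S|X}$. Define $L \triangleq \min\{i \ge 1 : Y_i \in D\}$ as the number of iterations for convergence.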
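As a short worked step, assume the copies of the channel act independently given the input, and condition on $X = x$ and a fixed target set $D = d$. Writing $p(x, d)$ for the per-attempt success probability (a symbol introduced here for convenience), the number of iterations is geometric:

```latex
\begin{align*}
p(x, d) &= \mu_{S|X}(d \mid x), \\
\Pr\{L = k \mid X = x, D = d\} &= \bigl(1 - p(x, d)\bigr)^{k-1}\, p(x, d), \qquad k \ge 1, \\
E\{L \mid X = x, D = d\} &= \frac{1}{p(x, d)}.
\end{align*}
```

Under this assumption, a narrow target set $D$ (small $p$) buys precision in the final state at the price of more write iterations on average.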

INFORMATION THEORETIC CHARACTERIZATION OF THE CAPACITY/COST TRADEOFF
◮ Define the expected cost as $E\{L\}$.
◮ Suppose we are encoding information over a large number of cells in parallel.
◮ What is the maximum number of bits one can store reliably with a cost constraint $E\{L\} \le \kappa$?
◮ The answer is (ISITA'08)
$$C(\kappa) = \sup_{X \in \mathcal{X},\; D \subset \mathcal{Y},\; E\{L\} \le \kappa} I(X, D; Y_L).$$

THE UNIFORM NOISE MODEL
$Q_{Y|X} : \mathcal{Y} \times \mathcal{X} \to [0, 1]$
$\mathcal{Y} = [-a/2,\; 1 + a/2]$, $\mathcal{X} = [0, 1]$
$Y = x + N$, with $N$ uniform in $[-a/2,\; a/2]$.

CLOSED FORM EXPRESSION FOR CAPACITY OF THE UNIFORM NOISE CHANNEL
Let $a$ be a given noise parameter, and let $N \ge 2$ be the integer such that $\frac{1}{N-1} \le a < \frac{1}{N-2}$. (ISITA'08)

Mittelholzer et al. (ICC'09); Lastras et al. (ITA'10); Mittelholzer et al. (ISIT'10); Lastras et al. (ISIT'10); Franceschini et al. (Sig. Proc. Mag.)

This work is still very much in a nascent stage. Key concepts are yet to be incorporated:
◮ Dependence of write outcome on existing state.
◮ Uncertainty due to unknown parameters.
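Returning to the uniform noise model, the cost of iterating can be checked numerically. The sketch below rewrites a cell with $Y = x + N$, $N$ uniform in $[-a/2, a/2]$, until the value falls in a target interval. The choice of interval, the value of `a`, and the function name are illustrative assumptions; when the interval lies inside the noise support around `x`, each attempt succeeds with probability (interval width)/a, so the average number of attempts should be close to a/(interval width).

```python
import random

def attempts_until_hit(x, d_lo, d_hi, a, rng=None, max_attempts=100_000):
    """Rewrite with uniform noise until the cell value lands in [d_lo, d_hi].

    Returns (final_value, number_of_attempts).
    """
    if rng is None:
        rng = random.Random()
    for attempt in range(1, max_attempts + 1):
        y = x + rng.uniform(-a / 2, a / 2)   # Y = x + N, N uniform in [-a/2, a/2]
        if d_lo <= y <= d_hi:
            return y, attempt
    raise RuntimeError("target interval was not hit within max_attempts")

if __name__ == "__main__":
    rng = random.Random(0)
    a, x, d_lo, d_hi = 0.5, 0.5, 0.45, 0.55   # illustrative values
    trials = [attempts_until_hit(x, d_lo, d_hi, a, rng)[1] for _ in range(20000)]
    print(sum(trials) / len(trials), "vs. predicted", a / (d_hi - d_lo))
```

Shrinking the target interval packs more distinguishable levels into the range but raises the expected number of write attempts, which is the tradeoff that $C(\kappa)$ quantifies.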

CONCLUSIONS
◮ Non-volatile memories are an emerging field in many disciplines:
  ◮ New basic technology
  ◮ Systems architecture
  ◮ Information, coding theory, signal processing and algorithm design/analysis
◮ We (information theorists) believe we can bring value. These technologies need our tools and insights.
◮ Questions?