Shaping Methods for Low Density Lattice Codes
Meir Feder
Dept. of Electrical Engineering-Systems, Tel-Aviv University
* Joint work with Naftali Sommer and Ofir Shalvi

Lattice Codes for Continuous Valued Channels
• Shannon: Capacity achieved by random codes in the Euclidean space
• Lattice codes are the Euclidean space analogue of linear codes
• Can achieve the capacity of the AWGN Channel:
  • High SNR proof: de Buda, Loeliger, and others
  • Any SNR proof: Urbanke and Rimoldi, Erez and Zamir
• Specific coding and decoding of Lattice codes:
  • Some lattice codes are associated with good finite-alphabet codes (e.g., Leech Lattice ~ Golay Code).
  • In most proposed “lattice codes” a finite alphabet (binary) code is used, with proper mapping to the Euclidean space (“Construction A”)
  • Sommer et al. 2008: Low Density Lattice Codes, efficiently decoded lattice codes constructed by direct mapping into the Euclidean space!

Lattice and Lattice codes
• An n-dimensional lattice in $\mathbb{R}^m$: Linear combination of n linearly independent vectors, with integer coefficients
• A lattice point x (in $\mathbb{R}^m$) of a lattice G: x = Gi where i is an n-dimensional integer vector; G is a matrix whose columns are linearly independent vectors in $\mathbb{R}^m$
• Lattice code: The lattice points inside a “shaping domain” B
• Basic cell, Voronoi cell: volume |G|

Lattice capacity in AWGN channel
• The AWGN channel capacity with power limit P and noise variance $\sigma^2$: $\frac{1}{2}\log(1 + P/\sigma^2)$
• Poltyrev defined the capacity of AWGN channel without restrictions; performance limited by density of the code-points.
• For lattices, the density is determined by |G|.
• Poltyrev’s capacity: $\sigma^2 < \frac{|G|^{2/n}}{2\pi e}$ (when normalized to |G| = 1, Poltyrev’s capacity: $\sigma^2 < \frac{1}{2\pi e}$)
• With proper shaping and lattice decoding, a lattice achieving Poltyrev’s capacity also attains the AWGN capacity, at any SNR (Erez and Zamir)

Low Density Lattice Codes
• Observation: A lattice codeword is x = Gi; Define the matrix $H = G^{-1}$ as the “parity check” matrix, since $Hx = G^{-1}x = i$ = integer $\Rightarrow \mathrm{frac}\{Hx\} = 0$
• y = x + n is the observed vector. Define the “Syndrome”: $s = \mathrm{frac}\{Hy\} = \mathrm{frac}\{H(x + n)\} = \mathrm{frac}\{Hn\}$
• Low Density Lattice Code (LDLC): A lattice code with sparse parity check matrix H

The bi-partite graph of LDLC
[Figure: bipartite graph with variable nodes $x_1, \dots, x_k, \dots, x_n$ (each with its observation $y_1, \dots, y_k, \dots, y_n$; observation vector y = x + n) connected to check nodes $i_1, \dots, i_k, \dots, i_n$ by edges labeled $h_1, h_2, h_3$]
• Regular LDLC: row and column degrees of H are equal to a common degree d
• A “Latin Square LDLC”: Regular LDLC where every row and column have the same non-zero values, except possible change in order and random signs

Iterative Decoding for LDLC
• An iterative scheme for calculating the PDF $f(x_k \mid y)$, k = 1,…,n
• Message passing algorithm between variable nodes and check nodes. Messages are PDF estimates
• The check node constraint: $\sum_i h_i x_{k_i}$ = integer. Leads to “convolution step”
• The variable node constraint: Get estimates of the considered variable PDF from the check nodes and the observation. Leads to “Product step”

The iterative algorithm
Example (slide show): 25 iterations, 4 nodes

Simulation Results
• Latin square LDLC with coefficients as discussed.
• Block sizes: 100, 1000, 10000, 100000
• Degree: d = 5 for n = 100, and d = 7 for all others
• Comparison with Poltyrev’s capacity

The Shaping Challenge of LDLC
• LDLC (and lattice codes in general) are used at high SNR, where the spectral efficiency, in (bit/sec)/Hz, is high

Lattice Shaping: Power Limited Lattice
• Communication with infinite lattice does not make sense (even at high SNR). It requires infinite power. Need to attain Shannon, not Poltyrev, capacity
• Shaping for lattice codes is essential. It is required for more than the 1.53dB shaping gain

Encoding:
• Lattice encoding is relatively complex: evaluating x = Gi is $O(n^2)$ as G is not sparse
• Nevertheless, efficient encoding can be done by solving efficiently (say, by Jacobi method) the sparse linear equation Hx = i
• Further simplification by incorporating encoding with shaping

“Nested Lattice” Shaping
• The original paper (Sommer et al. 2008) proposed “nested lattice” shaping:
  • The information symbols are chosen over a limited integer range, depending on the desired rate
  • The evaluated codeword is then “shaped” into the Voronoi region of a coarse lattice (of the same structure)
  • This is done by finding the closest coarse lattice point and subtracting it from the evaluated codeword, which requires LDLC decoding!
  • Unfortunately, LDLC decoding for nested lattice shaping does not work well: LDLC “quantization”

Shaping methods that work
• Suppose H is constrained to be triangular. Clearly in this case H can no longer be regular or “Latin square”, only approximately:
  • The triangular H can be randomly designed similarly to the design of regular LDLC in the original paper.
  • Since symbols (integers) that correspond to initial columns are less protected, coarser constellations can be used for these columns, with minimal rate loss: […]

Hypercube shaping
• Similar to Tomlinson-Harashima filter in ISI channels
• Denote the original integer vector by […]. The shaped codeword corresponds to another integer vector […].
• Let […] be the constellation size of the i-th integer. The shaped vector satisfies: […]
• The correcting integer component […] is chosen so that the corresponding code […]
• This choice can be done sequentially and efficiently since H is triangular and sparse: […]

Systematic shaping
• A novel notion, “Systematic Lattice”: A lattice where the integer “information” symbols can be obtained by rounding its continuous-valued components!
• Systematic lattice construction: Let the modified integer be […] where […], i.e. […]
• This is done sequentially and efficiently: […] (can be interpreted as a generalization of Laroia’s pre-coding scheme)
• Standard shaping methods (e.g. trellis shaping) can be combined with systematic LDLC, over slightly larger constellation, to attain most of the possible shaping gain. This added shaping does not change the decoder!

Nested Lattice shaping
• The nested lattice shaping proposed in [Sommer et al. 2008] can be implemented, and yields good results, when H is triangular
• This is similar to hypercube shaping, but now the correcting integer is not chosen independently, but as a vector: […]
• We have: […]
• Nested lattice shaping is performed by choosing […] that minimizes […]
• This can be complicated: LDLC decoding may not work for “quantization”. However, since H is triangular, sequential decoding algorithms can be used to possibly attain much of the shaping gain

Shaping methods for Non-triangular H
• An arbitrary LDLC “parity check” matrix H can be decomposed as […] where T is triangular and Q is orthonormal (QR decomposition of […])
• Let […] be the modified integer vector. The desired […] is such that the codeword after shaping (satisfying […]) is either restricted to be in a hypercube, or has minimal energy.
• Thus […], where […]
• Since T is triangular, the methods above can be applied to find […] (and hence […]) so that […] is in the hypercube or has minimal power.
• The transmitted LDLC codeword is […] with equivalent shaping properties.
  It can be evaluated directly by solving, with linear complexity, the sparse equations: […]

Performance
• Block size 10000. 1-3 bits/deg-of-freedom (average 2.935)
• Maximal degree 7: 1, h,…,h
• Up to 0.4dB can be gained by better rate handling.
• Incorporating Trellis or shell shaping with systematic construction will gain additional ~1.2-1.3dB.

Performance with Non-triangular H
• Simulated LDLC matrix of size 1000, due to high QR decomposition complexity.
• Constrained the number of non-zero elements of T (recall […]) to N = 500000 (full matrix), 200000, 100000, 50000
• Maximal constellation size was 8 (3 bits/degree of freedom). Tuning the constellation to […] led to average 2.9 bits/dimension.
• At this rate, Shannon capacity ~17.4dB. Capacity with uniform distribution ~18.9dB.
• Simulation results for hypercube/systematic shaping and the various choices of non-zero elements N (at Pe=10E-5):

  N (non-zero elements of T)    Result     Distance to the uniform capacity
  500000 (full matrix)          20.5dB     1.6dB
  200000                        20.6dB     1.7dB
  100000                        21.1dB     2.2dB
  50000                         22.2dB     3.3dB

• Note: At block size 1000, for Latin square LDLC (see Sommer et al. 2008) the distance from Poltyrev’s capacity was ~1.5dB.

Summary
• Shaping for lattice codes is essential. It is required for power-limited communication with finite lattice. For more than the 1.53dB shaping gain!
• Shaping methods for LDLC that work:
  • By constraining H to be triangular and sparse, shaping (leading to power constrained lattice coding) becomes easily implementable
  • Introduction of the notion: “systematic lattice codes”
  • The methods can be adapted for non-triangular H using QR decomposition
• LDLC can potentially operate over AWGN at less than 1dB from the Shannon, Gaussian bound, at any number of bits/dimension.
• Together with efficient, parametric decoding (see e.g., Kurkoski and Dauwels 2008, Yona and Feder 2009) Low Density Lattice Codes can shift from theory to practice!!
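The sequential shaping for a triangular, sparse H mentioned in the summary can be illustrated with a minimal sketch of hypercube shaping. The slide equations were not preserved, so the notation here is an assumption: `b` is the information integer vector, `L[i]` the constellation size of the i-th integer, `k[i]` the correcting integer, and `x` the shaped codeword satisfying H x = b - L*k. The sketch also assumes H is lower triangular with a unit diagonal and uses a dense array only for readability.

```python
import numpy as np

def hypercube_shape(H, b, L):
    """Sketch of sequential hypercube shaping (Tomlinson-Harashima style).

    Assumptions (not taken from the slides): H is lower triangular with
    ones on the diagonal; b holds the information integers; L[i] is the
    constellation size of the i-th integer.
    Returns x and k such that H @ x == b - L * k and |x[i]| <= L[i] / 2.
    """
    n = len(b)
    x = np.zeros(n)
    k = np.zeros(n, dtype=int)
    for i in range(n):
        # Contribution of the codeword components already determined.
        partial = H[i, :i] @ x[:i]
        # Pick the correcting integer that pulls x[i] into [-L[i]/2, L[i]/2].
        k[i] = int(np.round((b[i] - partial) / L[i]))
        # Unit diagonal: row i of H x = b - L*k gives x[i] directly.
        x[i] = b[i] - L[i] * k[i] - partial
    return x, k

# Toy example: a small random lower-triangular H (not a real LDLC design).
rng = np.random.default_rng(0)
n = 8
H = np.eye(n) + np.tril(rng.uniform(-0.5, 0.5, (n, n)), k=-1)
L = np.full(n, 4)
b = rng.integers(0, 4, n)
x, k = hypercube_shape(H, b, L)
```

Because each row is resolved using only earlier components, the loop costs one sparse row product per symbol; with a sparse H stored row by row this would be linear in the number of non-zero elements.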
Further Work on LDLC
• Prove that the class of LDLC indeed attain capacity
• Complete convergence proof
• Choose better code parameters
• Irregular LDLC
• More efficient decoding algorithm: Compete with LDPC + Multilevel: Currently ~order of magnitude more complex, yet perform better (especially if compared with Gaussian shaping performance)
• LDLC concept attractive and natural for the MIMO application as well (space-time lattice). A “small” channel matrix keeps H sparse.

BACKUP

Regular and Latin Square LDLC
• Regular LDLC: row and column degrees of H are equal to a common degree d
• A “Latin Square LDLC” is a regular LDLC where every row and column have the same non-zero values, except possible change in order and random signs
• Example: n=6, d=3, {1, 0.8, 0.5} (before normalization): […]

Iterative Decoding for LDLC
• An iterative scheme for calculating the PDF $f(x_k \mid y)$, k = 1,…,n
• Message passing algorithm between variable nodes and check nodes. Messages are PDF estimates
Initialization:
• Each variable node $x_k$ sends to all its check nodes the PDF […]

Check node message
[Figure: check node $i_k$ connected to variable nodes $x_{k_1}, x_{k_2}, x_{k_3}$ by edges labeled $h_1, h_2, h_3$]
• The relation at the check node k: $\sum_i h_i x_{k_i}$ = integer $\Rightarrow x_{k_j} = (\text{integer} - \sum_{i \neq j} h_i x_{k_i}) / h_j$
• The message sent by check node k to the variable node $x_{k_j}$ is its updated PDF, given the previously sent PDF’s of $x_{k_i}$, $i \neq j$

Calculating the Check node message
Recall check node equation: $x_{k_j} = (\text{integer} - \sum_{i \neq j} h_i x_{k_i}) / h_j$
• Convolution step: […]
• Stretching step: […]
• Periodic extension step: […] is the message finally sent to $x_{k_j}$

Variable node message
[Figure: variable node $x_k$ with observation $y_k$, connected to check nodes $i_{k_1}, i_{k_2}, i_{k_3}$]
• Variable node $x_k$ receives estimates $Q_l(x)$ of its PDF from the check nodes it connects to
• It sends back to $i_{k_j}$ an updated PDF based on the independent evidence of all other check nodes and observed $y_k$
• Specifically, the message is calculated in 2 steps:
  • Product step: […]
  • Normalization step: […]

PDF waveforms at the Variable node
• Periodic check node messages with period $1/h_i$
• Larger period has also larger variance

Final decoding
• After enough iterations, each check node has the updated PDF estimates of all the variables it connects to. Based on that, it generates the full PDF of the LHS of the check equation.
• Specifically, if the check equation is: […] Then, it performs a convolution step: […] followed by a decision step for the unknown $b_k$: […]

Further highlights of Previous Work
• Latin Square LDLC: Analysis, Convergence
• Let $|h_1| > \dots > |h_d|$. Define α = […]
  • “Necessary” condition for convergence: α < 1
  • Good performance with a choice {1, h, h, …} where h is such that α < 1
  • For d=2, the convergence point corresponds to the integer b that minimizes […] where W is a weight that depends on H. Different than ML, close at low noise!
  • For d>2 the analysis is complex. Conjecture: similar result to the d=2 case.
• Decoding by PDF iterations: Δ sample resolution of PDF, K integer span, L=K/Δ sample size, n code size, d degree, t iterations:
  • Computation complexity: O([…])
  • Storage complexity: O([…])

LDLC with small block
[Figure: “lattices of dimension 16”. SER (from 10^-1 down to 10^-8) versus distance from capacity [dB] (1 to 7). Curves: lower bound, BW16, LDLC degree = 3]

Parametric decoding of LDLC
• In LDLC the PDF’s (messages) are Gaussian mixtures
• However, the number of mixture components is too high to follow
• Recently, Kurkoski & Dauwels (ISIT 2008) proposed to keep the number of mixture components small, by combining several mixtures into a smaller number of mixture components
• This approach has been further simplified recently, taking into account LDLC mixture properties (Yona and Feder, ISIT 2009)
• Main concepts/algorithm:
  • A Gaussian mixture is approximated by a single Gaussian that has the same mean and variance as the mixture. This minimizes the divergence between the mixture and the chosen Gaussian
  • Algorithm: Given a Gaussian mixture (list of mixture elements):
    • Choose the strongest element in the mixture of the LDLC iteration, at each step/node
    • Take mixture elements whose mean is close within A to the chosen element.
    • Combine them all into a single Gaussian
    • A is chosen to reflect the “uncertainty”: different for variable and check nodes
    • Remove these chosen elements from the list. Repeat until there are M Gaussians; the resulting M-element mixture is the iterated PDF
• Reasonable performance loss for high complexity decrease!

Complexity, Performance
• Note that the check node messages are periodic, and so may require infinite replications. In the newly proposed algorithm we take only K replications.
• Thus: Variable node message contains M Gaussians at most; Check node message contains KM Gaussians at most.
• Storage complexity: […]
• Computation complexity:
  • Straight-forward: […]
  • Sort: […]
• Compare with Kurkoski & Dauwels: […] (recently Kurkoski claimed that a single Gaussian may work too, by density evolution; however, we could not validate that in practice over simulated codes)
• Practical choice: M=2,4 or even 6, K=2,3. Storage and computation complexity is competitive with the alternative, finite-alphabet, options

Efficiently Decoded LDLC

Encoding
• Unfortunately, evaluating x = Gi is $O(n^2)$ since G is not sparse
• Nevertheless, efficient encoding can be done by solving the sparse linear equation Hx = i
• Can be done via Jacobi’s method: An iterative solution with linear complexity
• Further simplification when incorporated with the shaping methods now described.

Convergence of the Variances (cont’d)
• Basic Recursion, for i=1,2,…,d: […]
• Example for d=3: […]

Convergence of the Means (cont’d)
• Asymptotic behavior of “narrow” consistent Gaussians
  • m denotes the vector of n “narrow” mean values, one from each variable node
  • Equivalent to the Jacobi method for solving systems of sparse linear equations
• Asymptotic behavior of “wide” consistent Gaussians
  • e denotes the vector of the errors between the “wide” mean values and the coordinates of the corresponding lattice point
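
The Jacobi method referred to in the Encoding and Convergence of the Means slides can be sketched as follows for the encoding equation Hx = i. This is an illustrative sketch, not the authors' implementation: the function name `jacobi_encode`, the toy matrix, and the stopping rule are assumptions, and the toy H is built to be strictly diagonally dominant so that the iteration is guaranteed to converge.

```python
import numpy as np

def jacobi_encode(H, i_vec, iters=200, tol=1e-10):
    """Sketch: solve H x = i_vec by Jacobi iteration.

    Assumes H has a non-zero diagonal and that the iteration converges
    (e.g. H is strictly diagonally dominant); both are assumptions of
    this sketch, not properties stated in the slides.
    """
    d = np.diag(H)                 # diagonal entries of H
    R = H - np.diag(d)             # off-diagonal remainder
    x = np.zeros(len(i_vec))
    for _ in range(iters):
        # Each component is updated from the previous iterate only.
        x_new = (i_vec - R @ x) / d
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x

# Toy sparse-looking H: unit diagonal plus two entries of magnitude 0.3
# per row (hypothetical values, not a real LDLC design).
rng = np.random.default_rng(1)
n = 50
H = np.eye(n)
for r in range(n):
    cols = rng.choice([c for c in range(n) if c != r], size=2, replace=False)
    H[r, cols] = 0.3 * rng.choice([-1.0, 1.0], size=2)
i_vec = rng.integers(-3, 4, size=n).astype(float)
x = jacobi_encode(H, i_vec)
```

The dense product `R @ x` is used here only for brevity; storing H in a sparse format would make each iteration cost proportional to the number of non-zero elements, which is what gives the linear complexity mentioned on the slide.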