Study of LDPC coded MIMO system

Yaoyu Tao (taoyaoyu@stanford.edu)
EE359 Final Report, Fall 2014
Under the guidance of Professor Andrea Goldsmith

I. Introduction

Communication systems using multiple antennas at both the transmitter and the receiver have recently received increased attention due to their ability to provide large capacity increases in a wireless fading environment [1]. However, approaching the capacity of such multi-input multi-output (MIMO) channels requires channel codes that overcome time-varying fading, intersymbol interference, and noise. Moreover, practical MIMO systems cannot estimate the channel state information (CSI) perfectly, as is assumed in theory, which degrades MIMO performance even further. This challenge calls for a coded MIMO system with very good channel codes. Channel codes such as convolutional codes (typically decoded with the Viterbi algorithm) and low-density parity-check (LDPC) codes have been widely used with high-order constellations on single-input single-output (SISO) channels to achieve bandwidth efficiency. Among the channel codes known to date, the LDPC code is probably the most promising candidate because of its capacity-approaching performance. It was invented by Gallager [2] several decades ago but drew little attention in practical system design because of its high encoding and decoding complexity. Over the past decade, however, LDPC coded systems have moved from theory to commercial products, helped by rapid progress in the semiconductor industry. A large body of research has brought LDPC codes into MIMO systems to push them closer to capacity. A variety of binary LDPC coded MIMO systems [3]-[5] have been designed that greatly enhance the reliability and data rate of communication systems. Nonbinary LDPC codes [11] have been shown to perform even closer to the Shannon limit than binary LDPC codes.
With the same parameters, the bipartite graph of a nonbinary LDPC code is usually much sparser than that of a binary code. Consequently, the higher girth of nonbinary LDPC graphs helps to avoid short cycles and also reduces the effect of stopping or trapping sets. Nonbinary LDPC coded MIMO systems [6]-[7] have been proposed and studied, and they achieve better performance than binary LDPC coded systems.

In this project, we survey state-of-the-art LDPC coded MIMO systems, covering both binary and nonbinary LDPC codes. Most of the system designs focus on the receiver, as it is the critical and most complicated part of the system. Section II describes a serial concatenated binary LDPC coded MIMO system with an iterative receiver structure [3]. We also study design optimizations of this binary LDPC coded MIMO system with OFDM [4] and with EP-based soft decisions [5]. We further discuss the challenges of the binary LDPC coded MIMO system and how nonbinary LDPC codes are brought into MIMO systems for further improvement. Nonbinary LDPC coded MIMO systems [6]-[7] are then studied and analyzed in Section III. Finally, we carry out simple MATLAB simulations of LDPC coded MIMO systems in Section IV. Section V concludes the report and discusses future research directions in this area.

II. Binary LDPC coded MIMO System ([3]-[5])

Binary LDPC coded MIMO systems have been studied extensively in recent years and have been adopted in commercial products. Paper [3] describes a classical model of a serial concatenated communication system with LDPC coded MIMO; paper [4] combines LDPC coded MIMO with OFDM and carries out a detailed analysis of performance and optimization possibilities; the most recent paper, [5], introduces a performance enhancement technique based on EP-based soft decisions. We go through these papers in this section.

II.a Classical System Architecture [3]

The system architecture of a serial concatenated LDPC-coded MIMO system with $N_t$ transmit antennas and $N_r$ receive antennas is proposed in [3]. The information source is encoded by a rate-$r_c$ binary LDPC encoder. The coded bits are interleaved bit-wise and grouped into vectors of $N = mN_t$ address bits $c = [c_0, c_1, \ldots, c_{N-1}]$. The mapping device maps each coded binary vector into a length-$N_t$ symbol vector $x$ with entries chosen from a constellation $A$, where $|A| = 2^m$. The channel output vector $y$ is given by

$$y = Hx + n,$$

where $H$ is an $N_r \times N_t$ channel matrix with i.i.d. complex Gaussian entries of zero mean and unit variance, and $n$ is a length-$N_r$ complex additive white Gaussian noise (AWGN) vector with covariance matrix $E[nn^*] = 2\sigma^2 I_{N_r}$. The system spectral efficiency is then $r_c m N_t$. It is also assumed that the fading coefficients are perfectly known at the receiver.

A turbo principle for decoding the received signal is developed in [3]. It decomposes the receiver into separate soft-input soft-output components that exchange extrinsic information iteratively. An iterative de-mapper/decoder graph is shown in Figure 1. There are two levels of iteration. One is the de-mapper-decoder loop, which includes both the de-mapper and the decoder; the other is the decoder loop, within the decoder only. In each de-mapper-decoder loop iteration, the de-mapper takes as inputs $y$ and the a priori information $L_{M,a}$ on the coded bits $c$, and produces extrinsic information $L_{M,e}$. The LDPC decoder takes the a priori input $L_{D,a}$, a deinterleaved version of $L_{M,e}$, and runs the sum-product decoding algorithm for the decoder-loop iterations, generating the extrinsic output $L_{D,e}$. Then $L_{D,e}$ is interleaved to become $L_{M,a}$ for the next de-mapper-decoder loop iteration. The de-mapper-decoder loop iteration count is denoted by $l_d$.

Figure 1.
Iterative de-mapper/decoder graph [3]

II.a.1 Iterative Channel De-mapper (Equalizer) [3]

At the receiver side, an optimal a posteriori probability (APP) de-mapper and a suboptimal MMSE de-mapper are introduced. The APP de-mapper computes the extrinsic information for each coded bit $c_i$, $i = 0, \ldots, N-1$, conditioned on the vector channel output $y$, as follows:

Equation (2) from [3]

where $L_{M,a}(c_j) = \log\frac{P(c_j = 0)}{P(c_j = 1)}$ is the a priori information, $A_i^b$ is the set of length-$N_t$ symbol vectors with $c_i = b$, $b \in \{0, 1\}$, and $J_{x,i}$ is the set of indices $j \in \{0, 1, \ldots, N-1\}$, $j \neq i$, within symbol vector $x$ with $c_j = 0$. Here the coded bits are assumed to be independent of each other and $p(y|x)$ is a multivariate Gaussian density function. The computational complexity is exponential in $N = mN_t$: the APP de-mapper costs $O(|A|^{N_t})$ and can only be used in practice for a small constellation $A$ and a small number of transmit antennas. Therefore, [3] also introduces a suboptimal de-mapper based on soft interference cancellation and linear MMSE filtering, whose complexity is cubic in the matrix dimension because of the matrix inversion operations. The suboptimal de-mapper thus has lower complexity for medium-to-large $N_t$.

The MMSE de-mapper first computes the soft estimates

$$\bar{x}_k = E[x_k] = \sum_{i=0}^{2^m-1} a_i \Pr(x_k = a_i)$$

of all the symbols within the transmitted vector $x$, for $k = 0$ to $N_t - 1$. Then, for each symbol $k$, soft interference cancellation is performed on the vector channel output $y$, giving

$$y_k = y - H\bar{x}_k,$$

where $\bar{x}_k$ denotes the vector of soft estimates with its $k$th entry set to zero. In paper [3], an MMSE filter is also applied to each $y_k$ to further suppress the residual interference plus noise, i.e., $s_k = w_k^* y_k$. Here $w_k$ is chosen to minimize the mean square error between the symbol $x_k$ and the filter output $s_k$, and is derived in [8] as

Equation (5) from [3]

where $h_k$ is the $k$th column of $H$ and $E|x_k|^2 = \sum_{i=0}^{2^m-1} |a_i|^2 \Pr(x_k = a_i)$.
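
The cancellation-and-filtering steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the derivation of [3]: the filter uses the standard linear MMSE form $w_k = E|x_k|^2 (H D_k H^* + 2\sigma^2 I)^{-1} h_k$, which may differ in presentation from Equation (5) of [3], and the names `mmse_sic_outputs`, `P` and `sigma2` are hypothetical.

```python
import numpy as np

def mmse_sic_outputs(y, H, A, P, sigma2):
    """Soft interference cancellation followed by linear MMSE filtering (sketch).

    y: (Nr,) received vector; H: (Nr, Nt) channel matrix;
    A: (M,) constellation points; P: (Nt, M) with P[k, i] = Pr(x_k = a_i);
    sigma2: noise variance per real dimension, so E[n n^*] = 2*sigma2*I.
    The filter form below is the textbook LMMSE solution (an assumption here).
    """
    Nr, Nt = H.shape
    x_bar = P @ A                      # soft estimates E[x_k]
    e2 = P @ np.abs(A) ** 2            # second moments E|x_k|^2
    var = e2 - np.abs(x_bar) ** 2      # residual variance left after cancellation
    s = np.zeros(Nt, dtype=complex)
    W = np.zeros((Nr, Nt), dtype=complex)
    for k in range(Nt):
        x_bar_k = x_bar.copy()
        x_bar_k[k] = 0                 # never cancel the symbol of interest
        y_k = y - H @ x_bar_k          # soft interference cancellation
        d = var.copy()
        d[k] = e2[k]                   # symbol k keeps its full energy
        R = (H * d) @ H.conj().T + 2 * sigma2 * np.eye(Nr)  # second moment of y_k
        W[:, k] = e2[k] * np.linalg.solve(R, H[:, k])        # LMMSE filter w_k
        s[k] = W[:, k].conj() @ y_k    # filter output s_k = w_k^* y_k
    return s, W
```

With uniform priors `P` the soft estimates are zero and the routine reduces to a plain linear MMSE equalizer; as the priors fed back from the decoder sharpen, more of the interference is removed before filtering.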
Furthermore, paper [3] follows paper [8] and assumes that $s_k$ is the output of an equivalent AWGN channel with $x_k$ as its input, i.e., $s_k = \alpha_k x_k + \eta_k$, where $\alpha_k = E[s_k x_k^*]/E[x_k x_k^*] = w_k^* h_k$, and $\eta_k$ is a zero-mean Gaussian random variable with variance $\nu_k^2 = E|s_k - \alpha_k x_k|^2 = E|x_k|^2(\alpha_k - |\alpha_k|^2)$. The extrinsic information of each coded bit is then given by an expression similar to Equation (2) from [3], with $y$ and $x$ replaced by $s_k$ and $x_k$, respectively.

II.a.2 Iterative LDPC Decoder [3]

Once the de-mapper has produced its output information, the LDPC decoder is used to decode the coded bits. Thesis [9] gives a comprehensive study of LDPC codes and their encoding and decoding techniques. An LDPC code can be described by a bipartite graph with check nodes on one side and bit nodes on the other. Messages are exchanged between bit nodes and check nodes for error correction. The number of check nodes connected to a bit node is denoted by $d_v$, and the number of bit nodes connected to a check node is denoted by $d_c$. Here we also assume that the LDPC codes are regular, with constant $d_v$ and $d_c$.

First consider message passing from a bit node to its $d_v$ incident check nodes. Given the a priori information $L_{D,a}$ from the de-mapper and the messages from the incident check nodes, for $i = 0, \ldots, d_v - 1$ the decoder computes

$$L_{B,e}^{i} = L_{D,a} + \sum_{j=0,\, j \neq i}^{d_v-1} L_{B,a}^{j},$$

where $L_{B,a}$ is set to zero at $l_d = 1$ (the first iteration of the de-mapper-decoder loop). Intuitively, the bit-to-check message from bit $j$ to check $i$ is computed from the prior information on bit $j$ and the information from all check nodes adjacent to bit $j$ except check $i$. In the first iteration the a priori information alone is used as the bit-to-check message, since no check-to-bit information is available yet. Next we consider the message passing from check node to its incident dc bit nodes.
The rule for the message updating is

$$\tanh\left(\frac{L_{C,e}^{i}}{2}\right) = \prod_{j=0,\, j \neq i}^{d_c-1} \tanh\left(\frac{L_{C,a}^{j}}{2}\right),$$

where $L_{C,a}$ is the interleaved version of $L_{B,e}$. The bit/check message updates are repeated for $l_c$ decoder-loop iterations, after which the decoder outputs the extrinsic information $L_{D,e} = \sum_{j=0}^{d_v-1} L_{B,a}^{j}$. After $L_{D,e}$ is generated, we can either keep $L_{B,a}$ for the next de-mapper-decoder loop iteration or reset $L_{B,a}$ to zero. In general, resetting $L_{B,a}$ results in slower convergence and even performance loss for small $l_c$. In the non-resetting case, however, the value of $l_c$ has little effect on the performance.

In [3], extrinsic information transfer (EXIT) chart analysis is applied to study the convergence behavior. The proposed LDPC coded system is shown to achieve performance close to the channel capacity with the APP de-mapper and optimized LDPC codes. In this report, we skip the EXIT analysis and focus on the system model of the LDPC coded MIMO system.

II.b Deploying OFDM to Binary LDPC Coded MIMO System [4]

Going beyond the LDPC coded MIMO system introduced in [3], paper [4] adds an OFDM scheme, carries out a performance analysis and explores design optimization. The paper focuses on the optimization and performance comparison of an LDPC coded MIMO OFDM system for a fixed target data rate (e.g., 100 Mbps). It analyzes different numbers of antennas at the transmitter and receiver, different soft-input soft-output demodulation schemes, including maximum a posteriori (MAP) and MMSE, and different MIMO channel models (spatially uncorrelated and spatially correlated, the latter being more practical).

The system architecture of the LDPC coded MIMO system with OFDM [4] is given in Figure 2. Compared with the system without OFDM in [3]: 1) at the transmitter side, coded symbols are parallelized and passed through an IFFT in each branch for frequency assignment; 2) at the receiver side, FFT blocks are used in each branch to recover the symbols.
A serial-concatenated receiver with an iterative demodulation/decoding scheme is also used, as in [3]. Using the same parameter names as in [3], paper [4] describes a system with $K$ subcarriers, $N_t$ transmit antennas and $N_r$ receive antennas, with QAM modulated symbols in $\bar{P}$ OFDM slots.

Figure 2. a) LDPC coded MIMO-OFDM transmitter
Figure 2. b) LDPC coded MIMO-OFDM receiver

On the transmitter side, paper [4] discusses MIMO OFDM modulation with spatial correlation (which is the practical case in multiple-antenna systems). According to [10], the frequency domain channel response matrix at the $k$th subcarrier ($k = 0, \ldots, K-1$, for a total of $K$ subcarriers) and the $p$th slot is given by

$$H[p,k] = \sum_{l=0}^{L-1} R_l^{1/2}\, H_l[p]\, S_l^{1/2} \exp\left(\frac{-j 2\pi k l}{K}\right) \qquad \text{Equation (1) from [4]}$$

Here $R_l = R_l^{1/2} R_l^{1/2}$ and $S_l = S_l^{1/2} S_l^{1/2}$ are the receive and transmit spatial-correlation matrices, which are determined by the spacing and the angle spread of the MIMO antennas. Assuming proper cyclic-prefix insertion and sampling, the MIMO OFDM system with $K$ subcarriers decouples the frequency-selective channel into $K$ correlated flat-fading channels described by

$$y[p,k] = \sqrt{\frac{\mathrm{SNR}}{N_t}}\, H[p,k]\, x[p,k] + z[p,k], \quad k = 0, \ldots, K-1, \quad p = 0, \ldots, \bar{P}-1 \qquad \text{Equation (2) from [4]}$$

where $H[p,k] \in \mathbb{C}^{N_r \times N_t}$ is the matrix of complex channel frequency responses defined in Equation (1) from [4]; $x[p,k]$ and $y[p,k]$ are the transmitted and received signals at the $k$th subcarrier and the $p$th slot; $z[p,k]$ is the additive noise with i.i.d. entries, $z[p,k] \sim \mathcal{N}_c(0, I)$; and SNR denotes the average signal-to-noise ratio at each receive antenna.

On the receiver side, the LDPC code is again assumed to be regular, with constant $d_c$ and $d_v$ and rate $r_c$. LDPC decoding in [4] is carried out in the same way as in [3], using sum-product iterative decoding. The difference between the system in [4] and that in [3] lies in the de-mapper (demodulator/equalizer).
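
To make the channel model concrete, the sketch below builds the per-subcarrier matrices of Equation (1) from [4] for a single slot and applies the flat-fading model of Equation (2) from [4]. It is an illustration under assumptions: the function names (`psd_sqrt`, `freq_response`, `received`), the array layouts and the use of a Hermitian square root for $R_l^{1/2}$ and $S_l^{1/2}$ are choices made here, not taken from [4].

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.conj().T

def freq_response(H_taps, R, S, K):
    """H[k] = sum_l R_l^{1/2} H_l S_l^{1/2} exp(-j 2 pi k l / K) for one slot.

    H_taps: (L, Nr, Nt) i.i.d. tap matrices; R: (L, Nr, Nr); S: (L, Nt, Nt).
    Returns an array of shape (K, Nr, Nt), one matrix per subcarrier.
    """
    L = H_taps.shape[0]
    taps = np.stack([psd_sqrt(R[l]) @ H_taps[l] @ psd_sqrt(S[l]) for l in range(L)])
    k = np.arange(K)[:, None]
    l = np.arange(L)[None, :]
    phase = np.exp(-2j * np.pi * k * l / K)          # shape (K, L)
    return np.einsum("kl,lij->kij", phase, taps)

def received(Hk, x, snr, rng):
    """y[k] = sqrt(SNR/Nt) H[k] x[k] + z[k] with z ~ Nc(0, I), for all subcarriers."""
    K, Nr, Nt = Hk.shape
    z = (rng.standard_normal((K, Nr)) + 1j * rng.standard_normal((K, Nr))) / np.sqrt(2)
    return np.sqrt(snr / Nt) * np.einsum("kij,kj->ki", Hk, x) + z
```

When all correlation matrices are identities, `freq_response` is simply a zero-padded $K$-point DFT of the tap matrices, which is a convenient sanity check on the sign and indexing of the exponent.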
Assuming perfect CSI at the receiver, it is clear from Equation (2) of [4] that the demodulation of the received signal at a particular subcarrier and a particular slot can be carried out independently. As in Figure 2. b), at the $q$th turbo iteration the soft MIMO OFDM demodulator computes the extrinsic information of the LDPC code bit $b_i$ as

$$L^{q}_{D \to L}(b_i) = g\left(y, \{L^{q-1}_{D \leftarrow L}(b_j)\}_j\right) \qquad \text{Equation (6) from [4]}$$

Here $y$ is the received data; $\{L^{q-1}_{D \leftarrow L}(b_j)\}_j$ is the extrinsic information computed by the LDPC decoder in the previous turbo iteration; and $g(\cdot)$ is the demodulation function, discussed next. At a given subcarrier and time slot, $N_t$ symbols, or $N_t \log_2|\Omega|$ LDPC code bits, are transmitted from the transmit antennas, where $\Omega$ is the QAM constellation (denoted $A$ in Section II.a). In a maximum a posteriori (MAP) MIMO OFDM demodulator, $L^{q}_{D \to L}(b_i)$, $i = 1, \ldots, N_t \log_2|\Omega|$, is computed as in Equation (7) from [4] (the detailed calculation is provided in [4]):

$$L^{q}_{D \to L}(b_i) \triangleq \log\frac{P(b_i = +1 \mid y)}{P(b_i = -1 \mid y)} - \underbrace{\log\frac{P(b_i = +1)}{P(b_i = -1)}}_{L^{q-1}_{D \leftarrow L}(b_i)}$$

$$= \log\frac{\sum_{x^+ \in C_i^+} \exp\left(-\left\| y - \sqrt{\frac{\mathrm{SNR}}{N_t}}\, H x^+ \right\|^2 + \sum_{j=1}^{N_t \log_2|\Omega|} \{x^+\}_j \frac{L^{q-1}_{D \leftarrow L}(b_j)}{2}\right)}{\sum_{x^- \in C_i^-} \exp\left(-\left\| y - \sqrt{\frac{\mathrm{SNR}}{N_t}}\, H x^- \right\|^2 + \sum_{j=1}^{N_t \log_2|\Omega|} \{x^-\}_j \frac{L^{q-1}_{D \leftarrow L}(b_j)}{2}\right)} - L^{q-1}_{D \leftarrow L}(b_i) \qquad \text{Equation (7) from [4]}$$

where $C_i^+$ is the set of $x$ for which the $i$th LDPC-coded bit is "+1", and $C_i^-$ is similarly defined; $\{x^+\}_j$ denotes the corresponding $j$th binary bit in symbol vector $x^+$, and similarly for $\{x^-\}_j$.

The soft MAP demodulator in Equation (7) from [4] has complexity $O(|A|^{N_t})$, like the APP de-mapper in [3]. Hence a suboptimal MMSE demodulator is also given in [4] as Equation (17).
$$L^{q}_{D \to L}(b_{j,i}) \triangleq \log\frac{P(b_{j,i} = +1 \mid z_j)}{P(b_{j,i} = -1 \mid z_j)} - \underbrace{\log\frac{P(b_{j,i} = +1)}{P(b_{j,i} = -1)}}_{L^{q-1}_{D \leftarrow L}(b_{j,i})}$$

$$= \log\frac{\sum_{x_j^+ \in C_{j,i}^+} \exp\left(-\frac{|z_j - \mu_j x_j^+|^2}{\nu_j^2} + \sum_{l=1}^{\log_2|\Omega|} \{x_j^+\}_l \frac{L^{q-1}_{D \leftarrow L}(b_{j,l})}{2}\right)}{\sum_{x_j^- \in C_{j,i}^-} \exp\left(-\frac{|z_j - \mu_j x_j^-|^2}{\nu_j^2} + \sum_{l=1}^{\log_2|\Omega|} \{x_j^-\}_l \frac{L^{q-1}_{D \leftarrow L}(b_{j,l})}{2}\right)} - L^{q-1}_{D \leftarrow L}(b_{j,i}) \qquad \text{Equation (17) from [4]}$$

Here $b_{j,i}$ denotes the $i$th code bit carried by the symbol $x_j$ sent from the $j$th transmit antenna. The detailed derivation can be found in Equations (8)-(17) of [4]. In short, $z_j$ is the output of a linear MMSE filter $w_j$, chosen to minimize the mean square error between the transmit symbol $x_j$ and the filter output $z_j$. Conditioned on $x_j$, the filter output $z_j$ is modeled as Gaussian with mean $\mu_j x_j$ and variance $\nu_j^2$.

Numerical results on error rates under different parameter setups are provided in [4] for LDPC coded MIMO-OFDM 2x2 and 4x4 systems. The conclusion is that LDPC coding greatly improves the error rate of MIMO-OFDM systems. The authors also use density evolution to obtain optimized irregular LDPC codes that bring the overall performance even closer to capacity. Again, we focus on the LDPC coded MIMO system and do not go deep into LDPC code design and analysis.

II.c Improve LDPC-coded MIMO systems with EP-based soft-decisions [5]

In [3] and [4], MMSE detectors are used at the receiver side. Paper [5] shows that a MIMO system using the Expectation Propagation (EP) algorithm in the soft-output detector can achieve performance gains of one order of magnitude over MMSE soft-output detectors for rate-1/2 LDPC codes. Since the same LDPC code is used, the gain can only be explained by EP providing much more reliable estimates of the symbol posterior probabilities than MMSE. EP has been shown to be a powerful and efficient method for implementing the receiver detector in high-order, high-dimensional MIMO systems with much higher data rates.
The MMSE approximation to the true posterior distribution $\Pr(x|y)$ (as before, $x$ is the vector of transmitted symbols and $y$ the vector of received symbols) replaces the prior over the transmitted symbols by a zero-mean, independent, component-wise Gaussian. Intuitively it makes sense to choose the parameters of the Gaussian prior in this way, because it matches the first two moments of the input distribution. However, it is not the best choice, since what we are interested in matching is the posterior distribution. Paper [5] therefore proposes an algorithm in which the prior distribution is optimized so that the approximating Gaussian posterior matches the first two moments of the true posterior distribution.

EP is a Bayesian machine learning technique for constructing tractable approximations to a given probability distribution. In our case, EP is used to approximate $\Pr(x|y)$ by a Gaussian distribution $q_{EP}(x) = \mathcal{N}(x : \mu_{EP}, \Sigma_{EP})$ that matches the first two moments of $\Pr(x|y)$, namely (in the notation of [5], where the transmitted vector is written $u$)

$$\mu_{EP} = \mathbb{E}_{p(u|y)}[u] \qquad \text{Equation (8) from [5]}$$

$$\Sigma_{EP} = \mathrm{CoVar}_{p(u|y)}[u] \qquad \text{Equation (9) from [5]}$$

This condition is known as moment matching. While direct computation of the moments of $\Pr(x|y)$ requires $|A|^{2m}$ operations, [5] develops an iterative algorithm that approaches the solution at polynomial complexity. We do not go into this algorithm here and assume its output is available; the LLR of the $j$th coded bit of the $i$th symbol can then be approximated as

$$\mathrm{LLR}_{i,j} = \log\frac{\sum_{u_i \in \mathcal{B}_j(1)} \mathcal{N}(u_i : \mu_{i,EP}, \Sigma_{i,EP})}{\sum_{u_i \in \mathcal{B}_j(0)} \mathcal{N}(u_i : \mu_{i,EP}, \Sigma_{i,EP})}$$

Here $\mu_{i,EP}$ is the $i$th component of the mean vector $\mu_{EP}$ and $\Sigma_{i,EP}$ is the $i$th component of $\mathrm{diag}(\Sigma_{EP})$. Moreover, $\mathcal{B}_j(c) = \{x \in A \mid \mathrm{Gray}_j(x) = c\}$, where $\mathrm{Gray}_j(x)$ is the bit in the $j$th position of the Gray encoding of symbol $x$.

Figure 3 (Fig. 4 from [5]). Performance of EP with 10 iterations compared to MMSE, with BP channel decoding and a 16QAM constellation.

Figure 3 above shows the performance improvement obtained with EP-based soft decoding.
The gain of EP over MMSE at a WER of $10^{-3}$ is above 3 dB in SNR, which is significant. The authors also compare EP with the Gaussian tree approximation (GTA), with both soft and hard decisions, under the same setup, and EP again shows a performance improvement.

III. Non-binary LDPC coded MIMO System ([6]-[7])

Nonbinary LDPC codes were first investigated in [11], where it is shown that nonbinary LDPC codes constructed over higher-order Galois fields outperform binary codes on binary symmetric channels and binary Gaussian channels. These codes have been studied continuously in information theory for more than a decade. Unlike binary LDPC coded MIMO systems, which have been brought into commercial use, nonbinary LDPC coded MIMO systems are still at the theoretical development stage, with few publications on real implementations [12]-[14]. The first nonbinary LDPC silicon design was reported last year [13] by the author and a teammate at the University of Michigan, Ann Arbor.

Nonbinary LDPC codes have several advantages over binary LDPC codes: 1) the error floor of nonbinary LDPC codes is much lower than that of binary LDPC codes, which makes them very promising for applications that require a very low error rate; 2) since nonbinary LDPC codes are defined over a higher-order Galois field GF(q) and symbols are elements of GF(q), the decoder can take symbol likelihood vectors directly, without converting symbol likelihoods to bit likelihoods.

Applications of nonbinary LDPC codes to MIMO channels are studied in the selected papers [6] and [7], which show performance improvements over binary LDPC coded MIMO systems. Specifically, [6] provides detailed comparisons of performance and receiver complexity between binary and nonbinary codes; [7] proposes a low-complexity layered BP-based detection and decoding scheme for nonbinary LDPC coded MIMO systems, aimed at practical adoption.

III.a Nonbinary LDPC coded MIMO System [6]

Paper [6] proposes the use of QC nonbinary LDPC cycle codes for MIMO channels. Two schemes are introduced and investigated: joint MIMO detection and channel decoding (JDD) and separate MIMO detection and channel decoding (SDD), as shown in Figure 4.

Figure 4. A schematic block diagram of SDD (without feedback loop) and JDD (with feedback loop) systems

The JDD system is similar to the iterative de-mapper-decoder model introduced in [3] for the binary LDPC case, with feedback from the output of the decoder to the input of the de-mapper (detector).

To illustrate how nonbinary LDPC codes are deployed in a MIMO system, assume the LDPC code is defined over GF(q), where $q = 2^p$. At the transmitter side, a sequence of information bits is first mapped to a sequence of symbols in GF(q) (every $p$ bits are mapped to a single nonbinary symbol) through a bit-to-symbol mapper $g$, before being passed to the nonbinary LDPC encoder. As before, $N_t$ denotes the number of transmit antennas and $N_r$ the number of receive antennas. At the output of the LDPC encoder, every group of $n_0$ coded nonbinary LDPC symbols $s = \{s_1, \ldots, s_{n_0}\}$ is mapped to a group of $N_t$ constellation symbols $x = (x_1, \ldots, x_t) = \phi(s)$, with $t = N_t$, through the mapper $\phi$. Given the constellation size $M = 2^{m_0}$, we have $p\,n_0 = t\,m_0$. The sequence of constellation symbols is then passed to the transmit filter and sent through the $N_t$ transmit antennas.

The receiver performs optimal maximum a posteriori probability (MAP) detection to compute the prior probabilities for each group of $N_t$ transmitted constellation symbols. These prior probabilities are then passed (after the inverse mapper $\phi^{-1}$) to the LDPC decoder for iterative decoding. After a finite number of iterations, hard decisions on the nonbinary symbols are made at the output of the LDPC decoder, and these are de-mapped to the sequence of estimated information bits.
When $n_0 = 1$, the MAP detector produces prior likelihoods for each GF(q) symbol that can be used directly for nonbinary LDPC decoding over GF(q). Based on this observation, [6] concludes that it is sufficient to perform MIMO detection only once, followed by channel decoding; this corresponds to an SDD system that performs separate detection and decoding. When $n_0 > 1$, the prior probabilities of the group of $n_0$ nonbinary symbols are dependent, because the symbols are mapped to complex symbols that are transmitted simultaneously. It is then necessary to pass soft information about the dependent symbols from the LDPC decoder back to the MAP detector to produce updated symbol-wise probabilities. This corresponds to a JDD system that performs joint detection and decoding.

Examples of SDD and JDD systems are also given in [6]. Suppose that 16QAM modulation is used. An example of an SDD system has $q = 256$ and $n_0 = 1$: a single coded GF(256) symbol is mapped to two 16QAM symbols, which are transmitted simultaneously. An example of a JDD system has $q = 16$ and $n_0 = 2$: every two GF(16) coded symbols are mapped to two 16QAM symbols, which are transmitted simultaneously through the two transmit antennas.

The MAP detector works as follows. Again assume that the channel matrix $H$ is known at the receiver. Given each received signal vector $y$, MAP detection is used to determine the a posteriori probabilities (APP) of each nonbinary symbol $s_j$, $j = 1, \ldots, n_0$, by computing the log-likelihood-ratio vector (LLRV) over GF(q). Let $\{0, \alpha_1, \ldots, \alpha_{q-1}\}$ denote the elements of GF(q), with $\alpha_0 = 0$. The LLRV of $s_j$ is defined by $z = \{z_0, z_1, \ldots, z_{q-1}\}$, where $z_i = \ln[p(s_j = 0)/p(s_j = \alpha_i)]$. From the MIMO equation $y = Hx + n$ we have

$$z_i = \ln\frac{\sum_{s:\, s_j = 0} \exp\left[-\|y - H\phi(s)\|^2 / (2\sigma^2)\right] p(s)}{\sum_{s:\, s_j = \alpha_i} \exp\left[-\|y - H\phi(s)\|^2 / (2\sigma^2)\right] p(s)} \qquad \text{Equation (2) from [6]}$$

where $p(s)$ denotes the prior probabilities of $s$, which are passed from the LDPC decoder. Subsequently, these LLRV values are passed to the LDPC decoder for iterative decoding.
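
The LLRV of Equation (2) from [6] can be evaluated by brute-force enumeration when $q^{n_0}$ is small. The sketch below is illustrative only: GF(q) elements are represented by integer labels 0..q-1 (label 0 standing for the zero element), no field arithmetic is needed for detection, and the function name `llrv_map`, the mapper argument `phi` and the toy QPSK mapping in the usage lines are hypothetical choices, not the mapping of [6].

```python
import itertools
import numpy as np

def llrv_map(y, H, phi, q, n0, j, sigma2, prior=None):
    """LLRV of symbol s_j: z[i] = ln p(s_j = 0 | y) / p(s_j = alpha_i | y).

    phi maps a tuple of n0 integer labels to a length-Nt complex symbol vector.
    prior(s) returns p(s) for a label tuple s; a uniform prior is used if None.
    """
    log_sum = np.full(q, -np.inf)            # log of each sum in Equation (2)
    for s in itertools.product(range(q), repeat=n0):
        metric = -np.linalg.norm(y - H @ phi(s)) ** 2 / (2 * sigma2)
        if prior is not None:
            metric += np.log(prior(s))
        # accumulate in the log domain to avoid underflow at high SNR
        log_sum[s[j]] = np.logaddexp(log_sum[s[j]], metric)
    return log_sum[0] - log_sum

# Toy usage: GF(4) labels mapped one-to-one onto QPSK points, two antennas (n0 = 2).
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
phi = lambda s: QPSK[list(s)]
H = np.array([[1.2, 0.4], [0.3, 0.7]], dtype=complex)
y = H @ phi((2, 1))
z = llrv_map(y, H, phi, q=4, n0=2, j=0, sigma2=0.05)
```

With this sign convention the most likely element has the smallest (most negative) entry of `z`, and `z[0]` is always zero. The enumeration visits $q^{n_0}$ hypotheses, which is the exponential cost that motivates the low-complexity detectors discussed in Section III.b.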
Performance comparisons between binary and nonbinary coded systems (both JDD and SDD) are carried out in [6], as shown in Figure 5 below.

Figure 5 (Fig. 4 from [6]). Performance comparisons between binary and nonbinary LDPC coded MIMO systems

Paper [6] shows that, compared with the JDD scheme that is widely used in binary LDPC MIMO systems and widely studied in nonbinary LDPC MIMO systems, the best performance can be achieved by an SDD system with optimized nonbinary LDPC codes.

III.b Low-Complexity layered BP-based Detection and Decoding for nonbinary LDPC coded MIMO System [7]

Paper [7], presented at ICC 2014, introduces a low-complexity layered BP-based detection and decoding scheme that greatly reduces complexity with negligible performance loss. The LDPC coded MIMO model in this paper focuses on the JDD model, with feedback from the channel decoder to the MIMO detector, but also covers the SDD model without iterative feedback. A joint factor graph (JFG) representation is used in [7]. It consists of two parts: the upper part is the factor graph representation of the MIMO spatial multiplexing (SM) detector, and the lower part is the factor graph representation of the parity check matrix of the nonbinary LDPC decoder. A detection principle based on the BP algorithm was chosen to address the main drawback of MIMO detection, its computational complexity. Furthermore, compared with an ML detector, a BP detector has a shorter decoding delay, which results in lower latency and reduced memory requirements in hardware implementations. A Vertical Shuffle Schedule (VSS) is applied to the BP algorithm over the entire JFG, since it enables fast convergence of the iterative process. Assume GF(64) is used.

Figure 6 (Fig. 2 from [7]).
The JFG representation of the detector and decoder for a 2x2 LDPC coded MIMO system

The MIMO-BP detector consists of three stages. The first stage sends a message, as a priori information, from the $i$th candidate symbol $S_i$ to the $j$th connected received symbol $y_j$. The message returned from $y_j$ to $S_i$ can be expressed as the LLR vector

$$L^{(t)}_{y_j \to S_i} = \left[L[S_i, \alpha_0];\; L[S_i, \alpha_1];\; \ldots;\; L[S_i, \alpha_{63}]\right] \qquad \text{Equation (4) from [7]}$$

where $L[S_i, \alpha_k]$ is the LLR of the GF(q) element $\alpha_k$ at symbol $S_i$, whose derivation can be found in Equations (5)-(9) of [7]. The second stage starts after the candidate symbol $S_i$ is updated. The message sent from the symbol $S_i$ to the connected received symbol $y_j$ is computed as

$$L^{(t)}_{S_i \to y_j} = \sum_{k=1,\, k \neq j}^{N_r} L^{(t)}_{y_k \to S_i} + L^{(t-1)}_{\mathrm{dec} \to S_i} = L^{(t)}_{S_i} - L^{(t)}_{y_j \to S_i} \qquad \text{Equation (12) from [7]}$$

Finally, at the third stage, the candidate symbol information $L^{(t)}_{S_i}$ is updated as in Equation (11) from [7], and the detected symbols send extrinsic information to the nonbinary LDPC decoder as in Equation (13) from [7]:

$$L^{(t)}_{S_i} = \sum_{j=1}^{N_r} L^{(t)}_{y_j \to S_i} + L^{(t-1)}_{\mathrm{dec} \to S_i} \qquad \text{Equation (11) from [7]}$$

$$L^{(t)}_{S_i \to \mathrm{dec}} = L^{(t)}_{S_i} - L^{(t-1)}_{\mathrm{dec} \to S_i} \qquad \text{Equation (13) from [7]}$$

Note that $L^{(t-1)}_{\mathrm{dec} \to S_i}$ is the a priori LLR vector, i.e., the extrinsic information produced by the nonbinary LDPC decoder (written "dec" here) during the previous iteration $(t-1)$. The nonbinary LDPC decoding in [7] is the same as discussed in [6], and we do not repeat it here.

A VSS schedule is used. The shuffle schedule over the complete JFG enables fast convergence, since the variable nodes processed towards the end of a decoding step can profit from the updates of the nodes processed at the beginning of that step. To reduce the complexity, a message truncation scheme is then proposed in [7] on top of the VSS BP detection and decoding. The idea is to truncate the less reliable likelihoods in the messages exchanged on the JFG and keep only the reliable ones.
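
The truncation idea can be sketched as follows. This is a minimal illustration, not the scheme of [7]: it assumes each message entry is a log-probability-like score where larger means more reliable, and the names `truncate_message`, `expand_message`, `n_keep` and the common floor value used for discarded entries are hypothetical choices made here.

```python
import numpy as np

def truncate_message(llr, n_keep):
    """Keep the n_keep most reliable entries of a length-q message.

    llr[i] is assumed to be a score for GF(q) element i, larger meaning
    more reliable (a convention chosen for this sketch).
    Returns (indices, values), ordered from most to least reliable.
    """
    llr = np.asarray(llr, dtype=float)
    idx = np.argsort(-llr, kind="stable")[:n_keep]
    return idx, llr[idx]

def expand_message(idx, vals, q, floor_offset=0.0):
    """Rebuild a length-q vector; discarded entries share one floor value."""
    full = np.full(q, vals.min() - floor_offset)
    full[idx] = vals
    return full

# Example: a GF(4) message truncated to its two most reliable entries.
idx, vals = truncate_message([0.1, 3.0, -2.0, 1.5], n_keep=2)
approx = expand_message(idx, vals, q=4)
```

Storing and exchanging only `n_keep` index/value pairs instead of `q` values is what reduces memory and per-node work; how the discarded entries are approximated on reconstruction is a design choice that trades accuracy against complexity.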
Instead of exchanging a full-length LLRV (length $q$ for GF(q)) between nodes in the nonbinary LDPC decoder, only the top $n_m$ entries are kept, with $n_m < q$. To reduce the computational complexity further, it is also possible to update only a subset ($n_v < n_m$) of S1 with the extrinsic LLRs on S2. In other words, the intrinsic information on S1 at the output of the LDPC decoder serves as a pointer that determines the subset of the $n_v$ most reliable S1 values that will be updated by the $n_m$ S2 extrinsic LLRs. A similar approach applies at the detector level. For a particular symbol $S_i$, a subset of $n_c$ Euclidean distance values is updated with the decoder extrinsic information. This subset contains the $n_c$ symbol indices with the highest LLR reliability at the output of the detector during the previous (decoder-detector) iteration. Detailed equations can be found in Section IV of [7].

The authors show that with the proposed low-complexity detection/decoding technique the number of operations drops substantially when $n_c$, $n_v$ and $n_m$ are much smaller than $q$ for high-order GF(q). Table 1 from [7] compares the ML detector, the full-complexity detector and the proposed low-complexity detector.

Table 1 from [7]. Comparisons of Number of Operations

The performance of the proposed BP detector is shown in Figure 7 below. The performance loss compared with a full-complexity BP or ML design is negligible for a proper choice of the $n_c$, $n_v$ and $n_m$ parameters. These parameters also allow a tradeoff between performance loss and complexity: the red curve has the lowest complexity, at the cost of a larger BER loss.

Figure 7 (Fig. 8 from [7]). BER performance comparisons of low-complexity MIMO receivers

In conclusion, the BP-based detection/decoding scheme with truncation is a very promising technique for future practical adoption.

IV. MATLAB Simulation of LDPC Coded System

In this section, we carry out simple MATLAB simulations of a 2x2 LDPC coded MIMO system.
The system we simulate is a serial concatenated communication link with $N_t$ transmit antennas and $N_r$ receive antennas. Our primary goal is to study the error rate (under different SNRs) and to compare the system architectures of binary and nonbinary coded MIMO systems. Due to limited time, we obtained a working binary LDPC coded MIMO system, while the nonbinary LDPC coded MIMO system still has bugs. Note that the LDPC and nonbinary LDPC codes we use have identical binary bit representations.

The LDPC H matrix is constructed in a simple quasi-cyclic manner, corresponding to a QC-LDPC code [15]. The block length is set to 960, the same length as the nonbinary LDPC code we used in [12], for a fair comparison. To keep it simple, the LDPC coded MIMO system uses a non-iterative architecture with BPSK modulation and demodulation. We also assume that the channel matrix H of the flat-fading MIMO channel (set to [1.2 0.4; 0.3 0.7] in the simulation) is perfectly known at the receiver side.

A BER curve of the system is shown in Figure 8. Note that there is a severe error floor at high SNR: the BER stops decreasing at the $10^{-6}$ level. This is probably due mainly to our naively constructed QC-LDPC code, which has many trapping sets. Further optimization of the QC-LDPC code could significantly lower the error floor.

Figure 8. BER versus SNR (dB) for the 2x2 binary LDPC coded MIMO system with QC-LDPC code. Setup: 1. rate-1/2 binary QC-LDPC code of length 960; 2. BPSK modulation; 3. zero-forcing equalizer; 4. channel matrix set to [1.2 0.4; 0.3 0.7].

The simulation is simple at this stage because of the limited time available. If possible, we will continue this project with a more comprehensive and deeper study and analysis.

V. Conclusions and Future Research Directions

In this report, LDPC coded MIMO systems are studied with both binary LDPC codes and nonbinary LDPC codes.
Binary LDPC coded MIMO systems have shown capacity-approaching capability with a serial concatenated architecture and an iterative de-mapper-decoder chain [3]. In [4], a MIMO-OFDM system with binary LDPC codes is studied carefully and design optimizations are provided for higher data rate and better error performance. Paper [5] discusses another way of improving LDPC coded MIMO systems, EP-based soft-decisions, which achieves a substantial performance improvement over similar approaches in the literature. Recent research in this area has focused on developing systems with nonbinary LDPC codes, which greatly enhance performance. Paper [6] introduces nonbinary LDPC codes into MIMO systems and shows their capability and feasibility. Paper [7] tackles the problem of high receiver complexity and develops a low-complexity detection-decoding technique that greatly reduces complexity with negligible performance loss, which makes practical implementation very promising.

Looking ahead, future directions for LDPC coded MIMO systems fall into two categories: binary LDPC systems and nonbinary LDPC systems. On the binary LDPC side, since practical systems are well studied and already commercialized with a well-defined flow, improvements on the theory side may be applied quickly to practical products. Local optimizations such as improved MIMO equalization techniques for practical adoption are always attractive; new code and algorithm designs with better performance or a lower error floor (which becomes critical for applications that require very low BER) are also attractive. Global optimization of an end-to-end binary LDPC coded MIMO system, including transmitters, antennas and receivers, is desired so that the entire system operates at its optimal point.
In addition, implementation-friendly theory is another good direction, as suitability for practical adoption becomes more and more important in theory development. On the nonbinary LDPC side, the main research direction is to further reduce complexity with negligible performance loss, because the complexity of nonbinary LDPC is still too high compared to binary LDPC even though the performance gain is significant. Efficient architectures and hardware implementations of nonbinary LDPC coded MIMO systems will be another hot topic, as a practical nonbinary LDPC decoder was successfully mapped to a silicon chip for the first time last year [13]-[14].

References

[1] Goldsmith, Andrea, "Wireless Communications," Cambridge University Press, 2005.
[2] Gallager, R.G., "Low-density parity-check codes," Information Theory, IRE Transactions on, vol. 8, no. 1, pp. 21-28, January 1962.
[3] Jilei Hou; Siegel, P.H.; Milstein, L.B., "Design of multi-input multi-output systems based on low-density parity-check codes," Communications, IEEE Transactions on, vol. 53, no. 4, pp. 601-611, April 2005.
[4] B. Lu; G. Yue; X. Wang, "Performance Analysis and Design Optimization of LDPC-Coded MIMO OFDM Systems," Signal Processing, IEEE Transactions on, vol. 52, no. 2, pp. 348-361, Feb. 2004.
[5] Cespedes, J.; Olmos, P.M.; Sanchez-Fernandez, M.; Perez-Cruz, F., "Improved performance of LDPC-coded MIMO systems with EP-based soft-decisions," Information Theory (ISIT), 2014 IEEE International Symposium on, pp. 1997-2001, June 29-July 4, 2014.
[6] Ronghui Peng; Rong-Rong Chen, "Application of Nonbinary LDPC Cycle Codes to MIMO Channels," Wireless Communications, IEEE Transactions on, vol. 7, no. 6, pp. 2020-2026, June 2008.
[7] Haroun, A.; Nour, C.A.; Arzel, M.; Jego, C., "Low-complexity layered BP-based detection and decoding for a NB-LDPC coded MIMO system," Communications (ICC), 2014 IEEE International Conference on, pp. 5107-5112, 10-14 June 2014.
[8] X. Wang and H. V. Poor, "Iterative (turbo) soft interference cancellation and decoding for coded CDMA," Communications, IEEE Transactions on, vol. 47, pp. 1046-1061, Jul. 1999.
[9] John L. Fan, "Constrained Coding and Soft Iterative Decoding for Storage," Kluwer Academic Publishers, 2001 (PhD thesis at Stanford University).
[10] H. Bölcskei, D. Gesbert, and A. J. Paulraj, "On the capacity of OFDM-based spatial multiplexing systems," Communications, IEEE Transactions on, pp. 225-234, Feb. 2002.
[11] M. C. Davey and D. MacKay, "Low-density parity check codes over GF(q)," IEEE Commun. Lett., vol. 2, pp. 165-167, June 1998.
[12] Y. Tao, Y. Park, Z. Zhang, "High-throughput architecture and implementation of regular (2, dc) nonbinary LDPC decoders," IEEE Int. Symp. Circuits Syst., pp. 2626-2628, 2012.
[13] Y. Park, Y. Tao, Z. Zhang, "A 1.15Gb/s Fully Parallel Nonbinary LDPC Decoder with Fine-grained Dynamic Clock Gating," IEEE Int. Solid-State Circuits Conf., pp. 422-423, Feb. 2013.
[14] Y. Park, Y. Tao, Z. Zhang, "A Fully Parallel Nonbinary LDPC Decoder With Fine-Grained Dynamic Clock Gating," IEEE Journal of Solid-State Circuits, vol. 50, no. 2, Feb. 2015.
[15] M. Fossorier, "Quasi-Cyclic Low-Density Parity-Check Codes from Circulant Permutation Matrices," Information Theory, IEEE Transactions on, vol. 50, no. 8, August 2004.