Mixed Block Length Parallel Concatenated Codes

by Matthew M. Everett

B.S., Electrical Engineering, Stanford University, 2001

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Master of Science in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, May 2003.

© Massachusetts Institute of Technology 2003. All rights reserved.

Author: Matthew M. Everett, Department of Electrical Engineering and Computer Science, May 21, 2003

Certified by: Wayne G. Phoel, Technical Staff, MIT Lincoln Laboratory, Thesis Supervisor

Certified by: Leonard McMillan, Associate Professor, Thesis Supervisor

Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Students

Abstract

Iterative decoding of parallel concatenated codes provides excellent protection from noise over digital communication channels. The block size of these codes significantly determines their characteristics. Codes with larger block sizes have better performance, but they require more time and resources to decode. This thesis presents a new type of parallel concatenated code whose constituent encoders do not all have the same block length. The freedom to choose different block sizes for different constituent codes gives system designers additional flexibility when making tradeoffs among error performance, code latency, and code complexity. In this thesis, we describe methods for encoding and decoding bit streams using this new code.
We explore how changing the different parameters of the proposed code affects its characteristics, and we demonstrate the code's performance using simulations.

Thesis Supervisor: Wayne G. Phoel
Title: Technical Staff, MIT Lincoln Laboratory

Thesis Supervisor: Leonard McMillan
Title: Associate Professor

Acknowledgments

First and foremost, I would like to thank my advisors, Dr. Wayne Phoel and Professor Leonard McMillan. Without their wisdom and guidance, this thesis would not have been possible. Leonard, thank you for giving me the flexibility and support I needed to follow my interests to Lincoln Laboratory. Wayne, thank you for your advice and encouragement throughout this project. I would also like to thank my family and friends for always having faith in me and for sending me electronic greeting cards whenever I started to feel too stressed out. Thanks also to Dean for helping me in the initial stages of my research; Tom, Shane, Lori, Scott, and Kit for helping me with both the organization and the details of my thesis; Jen for proofreading it; and Rob, Bob, Jason, and Matt for the "249 Grammar Hotline." Finally, I would like to thank the Computer Graphics Group and the Advanced Satellite Systems and Operations Group for their support and friendship.

Contents

1 Introduction
  1.1 Thesis Overview
2 Background
  2.1 Channel Model
  2.2 Channel Capacity
  2.3 Forward Error Correction
    2.3.1 Linear Block Codes
    2.3.2 Code Performance
    2.3.3 Criteria for Selecting FEC Codes
3 Turbo Codes and Multiple Turbo Codes
  3.1 Turbo Codes
    3.1.1 Encoding
    3.1.2 Decoding
    3.1.3 Interleaver Design
  3.2 Multiple Turbo Codes
    3.2.1 Encoding
    3.2.2 Decoding
4 Mixed Block Length Parallel Concatenated Codes
  4.1 Extending Turbo Codes
    4.1.1 Block Alignment
    4.1.2 Block Lengths
  4.2 Encoding
  4.3 Decoding
    4.3.1 First Stage
    4.3.2 Stopping Criterion
    4.3.3 Second Stage
  4.4 Interleaver Design
  4.5 Hypotheses
5 Results
  5.1 Methodology
  5.2 Reference Turbo Codes
  5.3 Baseline MPCC
    5.3.1 Constituent Decoder Configuration
    5.3.2 Puncturing Rate
    5.3.3 Large Block Length
    5.3.4 Interleaver Design
6 Conclusions
  6.1 MPCC Review
    6.1.1 Optimal MPCC Parameters
    6.1.2 MPCC Properties
  6.2 Applications
    6.2.1 Satellite Communications
    6.2.2 Video Communications
  6.3 Future Work
A Discrete-Time AWGN Channels

List of Figures

2-1 A single-user, point-to-point digital communication system
3-1 A turbo encoder
3-2 A rate-1/2 RSC encoder
3-3 State diagram for the RSC encoder in Figure 3-2
3-4 A turbo decoder
3-5 A multiple turbo encoder with three constituent codes
3-6 A turbo constituent decoder modified for multiple turbo decoding
3-7 Possible configurations for multiple turbo constituent decoders
3-8 Parallel constituent decoder configuration for a turbo code
4-1 Combining small and large block length codes
4-2 MPCC alignment strategies
4-3 MPCC constituent encoders
4-4 An MPCC encoder using the suggested transmission order
4-5 Flow chart for the MPCC decoding algorithm
4-6 First stage of the MPCC decoder
4-7 MPCC constituent decoders
4-8 A two-stage interleaver
5-1 (a) BER and (b) BLER vs. Eb/N0 for the reference turbo codes
5-2 BLER at the output of the first decoder stage vs. Eb/N0 for the baseline MPCC
5-3 (a) BER and (b) BLER at the output of the second decoder stage vs. Eb/N0 for the baseline MPCC
5-4 BER vs. Eb/N0 and decoder configuration for the baseline MPCC
5-5 BLER at the output of the first decoder stage vs. puncturing rate at 1.5 dB for the baseline MPCC
5-6 (a) BER and (b) BLER at the output of the second decoder stage vs. puncturing rate at 1.5 dB for the baseline MPCC
5-7 BER vs. puncturing rate and N at 1.5 dB for the baseline MPCC
5-8 BER vs. Eb/N0 and N at a puncturing rate of 1/16 for the baseline MPCC
5-9 BER vs. Eb/N0 and interleaver design method for the baseline MPCC
A-1 Additive White Gaussian Noise channel

List of Tables

3.1 Multiple turbo constituent decoder configurations
4.1 Turbo code characteristics
4.2 Desired code characteristics
4.3 MPCC second stage configurations
5.1 Properties of the reference turbo codes
5.2 Properties of the baseline MPCC
5.3 Regions of the graph in Figure 5-6

List of Acronyms

AWGN  Additive White Gaussian Noise
BER   Bit Error Rate
BLER  Block Error Rate
BPSK  Binary Phase Shift Keying
FEC   Forward Error Correction
MAP   Maximum A Posteriori
ML    Maximum Likelihood
MPCC  Mixed Block Length Parallel Concatenated Code
MPEG  Moving Pictures Experts Group
PCC   Parallel Concatenated Code
PSD   Power Spectral Density
RSC   Recursive Systematic Convolutional
TCM   Trellis Coded Modulation

Chapter 1

Introduction

Nearly all digital communication systems currently under development employ error control codes in conjunction with iterative decoding. The most common implementations use a type of parallel concatenated code called turbo codes. Turbo codes have the useful property that arbitrarily low error rates can be achieved by selecting sufficiently large block lengths [2]. The downside to this technique is that codes with large block lengths require more time and resources to decode. As a result, practical turbo codes are limited in the performance they can deliver. Recent research has focused on producing new codes that maintain the good error performance of existing codes but that also add new features to make them more practical for real-world systems. This thesis presents a new type of parallel concatenated code that uses multiple block lengths for the different constituent codes.
The constituent codes with small block lengths provide early estimates of the transmitted sequence, and these estimates can be improved when necessary by the large block length code. We call this new type of code a mixed block length parallel concatenated code (MPCC). This modification to traditional turbo codes is designed to give system designers additional flexibility when making tradeoffs among error performance, code latency, and code complexity. For example, it is possible to design an MPCC that provides better error performance than a similar turbo code without being significantly more complex. The focus of this thesis is exploring the new properties of MPCCs and how these properties can be used in real systems.

1.1 Thesis Overview

This thesis is organized into six chapters. Chapter 2 provides important background material and explains the basic principles of digital communications and forward error correction. Chapter 3 describes turbo codes, turbo code interleaver design, and multiple turbo codes. We introduce here the basic algorithms for encoding and decoding turbo and multiple turbo codes. These codes form the basis for the MPCCs we present in this thesis. Chapter 4 covers the MPCC encoding and decoding algorithms. We explain the motivation behind our particular modification to turbo codes and make some predictions about MPCC properties. Chapter 5 presents the results from our simulations. We compare our new code to turbo codes, and we explore the effect that different MPCC parameters have on overall code performance. Finally, Chapter 6 summarizes the conclusions of this research project. We review the most important results from our simulations, suggest some applications for which our new code may be ideally suited, and propose areas for future research.

Chapter 2

Background

This chapter provides background material relevant to the work in this thesis.
It describes the standard digital communication model and introduces both the additive white Gaussian noise (AWGN) channel and forward error correction (FEC).

Digital communication systems send information from one location to another. The problem is sending this information reliably, quickly, and efficiently. Designers create useful systems by balancing these three objectives. One of the first steps in understanding how to make these tradeoffs is to look at the system's different parts. We present here the most commonly used model for digital communication systems, explained in more detail by Proakis [17].

Every communication system has a transmitter, a channel, and a receiver. Figure 2-1 shows how these elements fit together. The channel is not under the designer's control, so we treat it as a given. The transmitter and receiver must be designed to work optimally with this channel.

[Figure 2-1: A single-user, point-to-point digital communication system]

The transmitter has three major components: the source encoder, the channel encoder, and the modulator. Each component plays a specific role in the communications process. The source encoder improves the versatility and the efficiency of the system. The channel encoder reduces the frequency of errors, and the modulator makes efficient use of bandwidth.

Since the signal from the information source is not necessarily digital, the source encoder takes that signal and converts it into a compressed binary stream. This conversion serves two purposes. First, it separates the design of the rest of the system from the nature of the information being sent. By changing only the source encoder, the same communication system can be used to transmit images, video, data, or voice.
Second, the source encoder ensures that the system is achieving its maximum efficiency by removing redundant data from the input. The output of an ideal source encoder is a bit sequence that is compressed down to or below the entropy rate of the information source. With this level of compression, each bit in the output is equally likely to be a '1' or a '0'. We use this assumption to simplify our analysis of linear block codes in Section 2.3.1.

The output of the source encoder is passed to the channel encoder. The channel encoder adds redundancy so that the receiver can correct any errors introduced by the channel. This technique is referred to as FEC, and we describe FEC systems in detail in Section 2.3.

The output of the channel encoder is given to the modulator. The modulator generates a waveform from the bits coming out of the channel encoder. It takes b bits at a time and transmits a single symbol for these b bits. This symbol is chosen from a set of 2^b possible symbols. Once the modulator has transmitted the channel symbols, the channel delivers a noisy version of this signal to the receiver.

At the receiver, the process happens in reverse. First, the demodulator estimates the transmitted channel symbols from the received signal. Next, these estimated channel symbols are fed to the channel decoder, which uses the redundancy added by the channel encoder to remove any errors that were introduced by the channel. Finally, the source decoder delivers a reconstructed version of the original signal to the destination.

2.1 Channel Model

We use the AWGN channel model for its mathematical simplicity and its proven ability to analyze communication systems. It adequately captures the major features of many communication systems, even though it simplifies the actual workings of the channel.
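As a concrete sketch of the channel described in this section, the following simulation (our illustration, not code from the thesis) sends BPSK symbols, with bit '1' mapped to +1 and bit '0' to -1, through discrete-time Gaussian noise and evaluates the Gaussian likelihood of each soft decision. The symbol energy is normalized to 1, so the noise variance is N0/2Es = 1/(2 Es/N0).

```python
import math
import random

def bpsk_modulate(bits):
    """Map bit 1 -> +1 and bit 0 -> -1 (the BPSK convention used in this thesis)."""
    return [1.0 if b else -1.0 for b in bits]

def awgn_channel(x, es_n0, rng):
    """Discrete-time AWGN channel: y_i = x_i + z_i, z_i ~ N(0, N0/(2 Es))."""
    sigma = math.sqrt(1.0 / (2.0 * es_n0))
    return [xi + rng.gauss(0.0, sigma) for xi in x]

def likelihood(y_i, x_i, es_n0):
    """Gaussian density of y_i given x_i, with mean x_i and variance N0/(2 Es)."""
    var = 1.0 / (2.0 * es_n0)
    return math.exp(-(y_i - x_i) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

bits = [1, 0, 1, 1, 0]
x = bpsk_modulate(bits)
y = awgn_channel(x, es_n0=10.0, rng=random.Random(1))
# At this high Es/N0, the transmitted symbol is the more likely hypothesis for every bit.
hard = [1 if likelihood(yi, +1.0, 10.0) > likelihood(yi, -1.0, 10.0) else 0 for yi in y]
```

Comparing the two likelihoods per bit is exactly the hard decision a demodulator would make; soft-decision decoders instead keep the values y_i themselves.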
The AWGN channel is a continuous-time system such that the following relationship holds:

    y(t) = x(t) + z(t)

where x(t) is the input, y(t) is the output, and z(t) is a Gaussian random process with power spectral density (PSD) N_0/2. The signal x(t) is produced by the modulator from a sequence of bits, x, that came from the channel encoder. For the purposes of this thesis, we assume binary phase shift keying (BPSK) for the modulation to simplify our representation of the transmitted symbols:

    x_i = +1 when bit i is a '1'
    x_i = -1 when bit i is a '0'

At the receiver, the demodulator makes a soft decision on each transmitted bit. We call this sequence of soft decisions y.

We wish to simplify the AWGN channel by treating it as a discrete-time system that has no intersymbol interference and whose inputs and outputs are the vectors x and y, respectively. We are therefore primarily interested in the probabilistic relationship between these two vectors. We understand in making this simplification that the data rate of the discrete-time system is limited by the bandwidth of the continuous-time system. Section 2.2 discusses exactly how continuous-time bandwidth determines discrete-time channel capacity.

Appendix A derives the relationship between the vectors x and y, and we reproduce the result here:

    p_{y_i|x_i}(y_i | x_i) = N(x_i, N_0/2E_s)    (2.1)

where N(μ, σ²) is a Gaussian distribution with mean μ and variance σ², and E_s/N_0 is the ratio of symbol energy to noise density. Since the Gaussian distribution has a well understood and easily analyzed form, this uncomplicated result is one of the main reasons we use the AWGN channel model in our analysis.

2.2 Channel Capacity

Theory predicts that a channel's maximum sustainable data rate is limited by its bandwidth and signal-to-noise ratio. We are interested in finding this limit because it gives us a good method for evaluating real-world systems.
We can know when a system is performing near its theoretical capacity and when it needs significant improvement.

Channel capacity, C, is a function of both signal-to-noise ratio, E_b/N_0, where E_b is the average energy transmitted per bit, and bandwidth, W; Shannon derived the following relationship for channel capacity [19]:

    C = W log_2( 1 + (E_b C)/(N_0 W) )

The capacity C directly determines the maximum sustainable data rate. When the data rate being sent over a channel, D, is less than the channel capacity, data can be sent error-free. Conversely, when D > C, error-free communication is impossible. Although sending data with zero errors is only a theoretical possibility, real systems can minimize how frequently errors occur by using FEC.

2.3 Forward Error Correction

Noise is unavoidable in real-world communication systems. No matter how much energy we use to transmit each channel symbol, there is always some probability that it will be received in error. We can, however, mitigate the effects of noise using FEC.

FEC algorithms compensate for noise by adding redundancy to the transmitted data. This redundancy can then be used to correct errors in the received bit stream. The FEC encoder takes a sequence of bits at one bit rate and produces another sequence of bits at a higher bit rate. We define the code rate of an FEC code to be the ratio of the input bit rate to the output bit rate. In other words, if every k bits at the input produce on average n bits at the output, we say that the code has rate k/n. Codes with lower rates can correct more errors but have lower throughput.

The FEC decoder uses the redundancy in the encoded bit stream to correct errors that were introduced by the channel. It uses the statistics given by the channel model about the received bits to determine the most likely transmitted sequence.

2.3.1 Linear Block Codes

We examine here a category of FEC codes called linear block codes.
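The simplest example of such a code is the rate-1/3 repetition code, sketched below (our illustration; this toy code is not one studied in the thesis). Each input bit is transmitted three times, and the decoder corrects any single hard error per group by majority vote:

```python
def repeat_encode(bits, n=3):
    """Rate-1/n repetition code: each input bit produces n identical output bits."""
    return [b for bit in bits for b in [bit] * n]

def repeat_decode(coded, n=3):
    """Majority vote over each group of n received hard decisions."""
    return [1 if sum(coded[i:i + n]) > n // 2 else 0
            for i in range(0, len(coded), n)]

msg = [1, 0, 1, 1]
tx = repeat_encode(msg)   # 12 transmitted bits for 4 information bits
rx = tx[:]
rx[0] ^= 1                # the channel flips one bit in the first group
rx[7] ^= 1                # ...and one bit in the third group
decoded = repeat_decode(rx)
```

Note that this code is linear: encoding the modulo-2 sum of two inputs gives the modulo-2 sum of their codewords, which is the defining property introduced below.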
Block codes operate on fixed-sized groups of bits at a time and are partly defined by their code rate, r, and the number of bits in each input word, N. A code with input block size N and code rate r produces N/r output bits per block.

The encoder performs a fixed, one-to-one mapping from the set of input words to the set of output words. The input is a vector of length N, which we call d, and the output is a vector of length N/r, which we call x. If we let D be the set of all possible input words and X be the set of all possible output words, we can represent the operation of the encoder as a map from D to X:

    x = f(d)

Linear block codes are block codes that also have the following property:

    f(d_1 + d_2) = f(d_1) + f(d_2)

We discuss only binary codes in this thesis, so we can interpret the addition in this equation to be binary addition over a two-element finite field.

The decoder receives the vector y, a noisy version of x. It uses this vector to determine d_ML, which is the maximum likelihood (ML) sequence estimate of the vector d. We can express this problem as an optimization problem:

    d_ML = arg max_d p_{d|y}(d | y)

We use Bayes' Rule to substitute:

    d_ML = arg max_d [ p_{y|d}(y | d) p_d(d) / p_y(y) ]

Since all possible inputs are equally likely, p_d(d) is a constant. Furthermore, p_y(y) is independent of d in the maximization. We remove these from the formula to find:

    d_ML = arg max_d p_{y|d}(y | d)    (2.2)

Lin and Costello give more information in [11] about encoding and decoding linear block codes, although they do not discuss turbo-like codes. Unfortunately, it is not always possible in practice to do an exhaustive search over all possible vectors d to determine the one that is most likely. Iterative decoding provides a practical alternative to an exhaustive search. Iterative decoders produce a series of estimates of d_ML that converge on the correct value.

2.3.2 Code Performance

The performance of a block code depends on how likely it is that a sequence is decoded correctly.
This probability depends on two factors. The first is signal-to-noise ratio. As E_b/N_0 increases, it becomes less and less likely that a sequence will be decoded incorrectly. The second is a property inherent to each code called its distance spectrum [7].

To define the distance spectrum, we look at the cause of errors in linear block codes. First, we note that the probability of error is independent of which codeword is transmitted because the code is linear. We can therefore restrict ourselves to looking at errors that occur when the all-zero codeword is transmitted. We classify error events based on the weight d of the incorrectly detected codeword. Because we are assuming a discrete-time, memoryless channel, it is far more likely that the all-zero codeword will be incorrectly detected as a low-weight codeword than as a high-weight one. We define B_d to be the average number of bit errors caused by the all-zero codeword being incorrectly received as a codeword of weight d. The value of B_d for each value d depends on the specific code being analyzed.

We can find an upper bound on the expected bit error rate (BER) using the union bound [16]:

    BER ≤ (1/N) Σ_{d=d_min}^{N/r} B_d Q( √(2 d r E_b/N_0) )

where d_min is the code's minimum Hamming distance, r is its rate, and E_b/N_0 is the signal-to-noise ratio. We define the function Q(x) to be the probability that a zero-mean, unit-variance Gaussian random variable is greater than x:

    Q(x) = (1/√(2π)) ∫_x^∞ e^{-u²/2} du

The distance spectrum is the set of all pairs (d, B_d). We can improve the performance of a code either by decreasing B_d for low-weight codewords or by eliminating low-weight codewords altogether.
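The union bound is easy to evaluate numerically. The sketch below (our illustration) implements Q(x) via the complementary error function and sums the bound over a distance spectrum; the spectrum values are invented for the example, not taken from any code in this thesis.

```python
import math

def Q(x):
    """Tail probability of a zero-mean, unit-variance Gaussian: Pr{X > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def union_bound_ber(spectrum, N, r, eb_n0):
    """BER <= (1/N) * sum over d of B_d * Q(sqrt(2 d r Eb/N0)).

    spectrum: dict mapping codeword weight d to B_d, the average number of
    bit errors for weight-d error events; eb_n0 is a linear (not dB) ratio."""
    return sum(B_d * Q(math.sqrt(2.0 * d * r * eb_n0))
               for d, B_d in spectrum.items()) / N

# Hypothetical spectrum; the d_min = 5 term dominates the sum at high Eb/N0.
spectrum = {5: 3.0, 6: 1.5, 8: 0.5}
bound_1db = union_bound_ber(spectrum, N=1000, r=0.5, eb_n0=10 ** 0.1)
bound_3db = union_bound_ber(spectrum, N=1000, r=0.5, eb_n0=10 ** 0.3)
```

As expected, the bound falls as E_b/N_0 grows.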
2.3.3 Criteria for Selecting FEC Codes

System designers consider several criteria when choosing FEC codes for real-world systems:

- bit error rate, the probability that a given bit will be received in error;
- code latency, the delay the encoder and decoder add to the communication system; and
- code complexity, the resources required to implement the encoder and decoder.

The particular application for which a communication system is being designed determines the relative importance of these three factors.

Bit error rate is a factor in selecting codes for almost every application. When bits are received incorrectly, the information they represent is either lost or must be retransmitted. Too many dropped packets or retransmissions degrade the performance of any communication system. Codes with low bit error rates are therefore necessary to ensure that these events are as infrequent as possible.

Code latency is most important in applications that require low system latency such as voice communication and critical, real-time data communication. For voice communication, low latency is needed to allow end users to talk comfortably with each other. For real-time data communication, low latency ensures that information is still relevant by the time it is received and decoded.

Code complexity plays a role in picking codes when there are resource constraints on the encoder and decoder. One example of such a system is a satellite communication system. The electronics on the satellite must meet certain size, power, heat, and cost constraints. These restrictions limit the complexity of the code the satellite can implement and place a premium on codes that can achieve good bit error rates with low complexity.

In the next chapter, we introduce two FEC codes that work well for a broad range of applications: turbo codes and multiple turbo codes. They provide excellent error performance without being overly complex or introducing too much delay into the communication system.
Chapter 3

Turbo Codes and Multiple Turbo Codes

This chapter discusses turbo codes, turbo code interleaver design, and multiple turbo codes. These codes form the basis for the MPCCs we propose in Chapter 4.

3.1 Turbo Codes

Turbo codes provide excellent error correction with reasonable complexity of implementation; the performance of Berrou's original turbo codes comes within 0.7 dB of the Shannon limit when using large block lengths [2]. The success of turbo codes makes them an ideal basis for developing new codes. In this section, we describe the turbo code encoding and decoding algorithms as well as turbo code interleaver design.

3.1.1 Encoding

Turbo codes are a type of parallel concatenated code (PCC). The output of a PCC encoder can contain systematic bits and parity bits, although some systems transmit only the parity bits [12]. Systematic bits come directly from the original input sequence and give the decoder its initial estimate of the decoded sequence. We refer to the systematic sequence as x^s. Parity bits are alternate representations of the input sequence and provide the decoder with the redundancy it needs to correct errors in the received systematic bits. In turbo codes, there are two parity sequences, which we call x^p1 and x^p2. For the codes we discuss in this thesis, all three sequences, x^s, x^p1, and x^p2, are included in the final output sequence.

Turbo encoders have four components: two recursive systematic convolutional (RSC) constituent encoders, an interleaver, and an optional puncturing unit. The interleaver is designated in block diagrams with the capital Greek letter "Π". Figure 3-1 shows a complete turbo encoder.

[Figure 3-1: A turbo encoder]

The two RSC encoders take uncoded information bits as inputs and produce a sequence of systematic and parity bits as outputs. The interleaver decorrelates the outputs of the two RSC encoders by shuffling the input to the second encoder.
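This shuffling can be sketched directly (our illustration, using 0-based indices; a seeded random permutation stands in for a carefully designed interleaver). The permutation pi is a one-to-one map, so the decoder can invert it exactly:

```python
import random

def make_interleaver(N, seed=0):
    """Build a permutation pi of {0, ..., N-1}; pi[i] is where input bit i lands."""
    pi = list(range(N))
    random.Random(seed).shuffle(pi)
    return pi

def interleave(d, pi):
    """Place bit d[i] at position pi[i] of the permuted sequence."""
    out = [0] * len(d)
    for i, bit in enumerate(d):
        out[pi[i]] = bit
    return out

def deinterleave(d_perm, pi):
    """Undo the permutation; the decoder's deinterleavers perform this step."""
    return [d_perm[pi[i]] for i in range(len(pi))]
```

Feeding interleave(d, pi) to the second RSC encoder while the first encoder sees d is what decorrelates the two parity sequences.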
The puncturing unit increases the rate of the turbo code by removing some parity bits from the outputs of the RSC encoders.

Recursive Systematic Convolutional Encoders

The RSC encoder is a shift register with both feed forward and feedback taps. Mathematically, we can use a generator matrix of rational transfer functions to represent the RSC encoder. The numerator polynomial of each element describes the feed forward taps, and the denominator describes the feedback taps. Figure 3-2 shows a rate-1/2 encoder with memory v = 2 and generator matrix [1, (1 + D²)/(1 + D + D²)], where D^i denotes a delay of i.

[Figure 3-2: A rate-1/2 RSC encoder with memory v = 2 and generator matrix [1, (1 + D²)/(1 + D + D²)]]

Consider a rate-k/n RSC encoder. The shift register starts in the all-zero state at the beginning of each block. For each transition, the encoder does three things. It accepts k input bits, passes these k bits to the output as systematic bits, and generates n - k parity bits. These parity bits are based on the encoder's current state and the present value of its inputs, making it convenient to interpret the RSC encoder as a Mealy state machine [14]. A binary encoder with memory v has M_s states, where M_s = 2^v. Figure 3-3 shows the state transition diagram corresponding to the encoder in Figure 3-2. The encoder continues to make transitions until all N bits in the input block are processed. At this point, some turbo codes add an extra v bits to the input sequence to force at least one of the RSC encoders into the all-zero state. Other turbo codes do not add these extra bits and permit the RSC encoders to terminate in any state, causing a slight degradation in performance.

[Figure 3-3: State diagram for the RSC encoder in Figure 3-2. Transitions are labeled with the input bit followed by the two output bits.]

Interleaver

The interleaver decorrelates the outputs of the two RSC encoders by permuting the input to one of the encoders. We label this permuted sequence d̃. The interleaver reorders the elements of d according to the function π(i) such that d̃_π(i) = d_i, where π(i) is a one-to-one mapping of the integers {1, ..., N} onto itself. Decorrelating the outputs of the encoders makes it less likely that low-weight error sequences will occur in both outputs concurrently. We argue in Section 2.3.2 that reducing the number of low-weight error sequences improves code performance.

Puncturing Unit

The puncturing unit increases the rate of the turbo code by removing some parity bits from each output block. These bits are normally removed in a regular, repeating pattern. For example, we can convert a rate-1/3 turbo code consisting of two rate-1/2 constituent encoders into a rate-2/3 turbo code by removing three out of every four parity bits from both constituent encoders.

3.1.2 Decoding

The goal of the turbo decoder is to solve the optimization problem stated in Equation 2.2. The turbo decoder uses belief propagation to make a series of soft decisions on each systematic bit that converge on a correct solution [10]. The turbo decoder consists of two constituent decoders connected by interleavers and deinterleavers. The interleavers are identical to the one used in the encoder, and the deinterleavers undo the permutation of the interleavers. Figure 3-4 shows a complete turbo decoder.

[Figure 3-4: A turbo decoder]

To decode a received block, the input sequence y is first broken into three components: y^s, the sequence of demodulator soft decisions corresponding to the systematic bits; y^p1, the soft decisions for parity bits from the first encoder; and y^p2, the soft decisions for the parity bits from the second encoder. Parity bits that were punctured by the encoder are included in the sequences y^p1 and y^p2 by assigning them soft decisions of zero. This value indicates that the bit is equally likely to be a '1' or a '0'.
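Puncturing at the encoder and the corresponding zero-filling at the decoder can be sketched as follows (our illustration; the alternating keep pattern is an arbitrary example, and soft values stand in for one parity stream):

```python
def puncture(parity, keep):
    """Transmit parity value i only where the repeating pattern keep has a 1."""
    return [b for i, b in enumerate(parity) if keep[i % len(keep)]]

def depuncture(received, keep, n):
    """Rebuild a length-n parity sequence, inserting soft decisions of zero
    (meaning 'equally likely 1 or 0') at the punctured positions."""
    it = iter(received)
    return [next(it) if keep[i % len(keep)] else 0.0 for i in range(n)]

soft_parity = [0.9, -1.1, 0.4, 0.7]        # demodulator soft decisions
sent = puncture(soft_parity, keep=[1, 0])  # every other parity value removed
restored = depuncture(sent, keep=[1, 0], n=4)
```

The restored sequence has the original length, so the constituent decoders can run unchanged; the zeros simply contribute no channel information.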
The vectors y^s and y^p1 are then passed to the first decoder along with an a priori probability, p_{1,i}(j), that each decoded bit d_i is equal to j:

    p_{1,i}(0) = Pr{d_i = 0}
    p_{1,i}(1) = Pr{d_i = 1}

Initially, we assume that all possible decoded sequences are equally likely and set both p_{1,i}(0) and p_{1,i}(1) equal to 1/2 for every bit i. Later, we improve these probabilities using the redundant information contained in the received sequence.

The first constituent decoder runs the Maximum A Posteriori (MAP) algorithm. We describe the details of this algorithm in the next section. The MAP algorithm produces two outputs. The first is the logarithm of the a posteriori probability ratio for each decoded bit d_i:

    Λ(d_i) = ln( Pr{d_i = 1 | y} / Pr{d_i = 0 | y} )

This value is used only after the final iteration of the decoder to make a hard decision for each bit. The decoder decides that bit i is a '1' when Λ(d_i) > 0 and a '0' when Λ(d_i) < 0. Either '1' or '0' can be chosen with equal probability of error when the soft decision is equal to zero.

The second output is a value for each bit i called the extrinsic information, Λ^e_{1,i}. The extrinsic information is the information embedded by the first constituent encoder that is not explicitly a function of y^s or p_1(j). One feature of Λ^e_{1,i} is that it is uncorrelated with both y^s and y^p1. As a result, we can use Λ^e_1 to form new a priori probabilities for the second decoder, p_2(j).

Next, the second constituent decoder runs the MAP algorithm using y^s, y^p2, and p_2(j) as its inputs. The resulting extrinsic information Λ^e_2 is then used to find new values for p_1(j), and the process repeats. We continue alternating between the two constituent decoders until some stopping criterion is met. After the final iteration, the soft decisions Λ(d) from the last constituent decoder are converted to hard decisions and used as the output of the entire decoder. Vucetic and Yuan provide a detailed explanation of iterative turbo decoding using the MAP algorithm in [21].
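The quantities exchanged in this loop are easy to manipulate directly. The sketch below (our illustration, 0-based) converts an extrinsic value, treated as a log-probability ratio, into the a priori probabilities handed to the next constituent decoder, and makes the final hard decisions from the soft decisions:

```python
import math

def llr_to_probs(llr):
    """If L = ln(p(1)/p(0)) and p(0) + p(1) = 1, then p(1) = 1/(1 + e^-L)."""
    p1 = 1.0 / (1.0 + math.exp(-llr))
    return 1.0 - p1, p1

def hard_decisions(llrs):
    """After the final iteration: '1' when Lambda(d_i) > 0, otherwise '0'."""
    return [1 if L > 0 else 0 for L in llrs]
```

An extrinsic value of zero maps back to the uninformative prior (1/2, 1/2) that the first iteration starts from.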
Maximum A Posteriori Algorithm

The MAP algorithm was first implemented efficiently by Bahl et al. [1]. Its basic goal is to determine an a posteriori probability ratio for each decoded bit. The MAP algorithm makes hard decisions on a bit-by-bit basis, rather than selecting the most likely sequence of bits. We discuss here the algorithm for rate-1/2 RSC codes; it is easy to generalize to rate-k/n RSC codes.

The MAP algorithm in the first decoder takes three inputs: y^s, the sequence of received systematic bit estimates; y_1^p, the sequence of received parity bit estimates from the first constituent encoder; and p_1(j), the a priori probabilities. The MAP algorithm uses these inputs to determine an a posteriori probability ratio for each decoded bit, d_i:

Λ(d_i) = ln [ Pr{d_i = 1 | y} / Pr{d_i = 0 | y} ]    (3.1)

To determine Λ(d_i), we start by recalling that an RSC encoder with memory ν can be interpreted as a Mealy state machine with M_s states. We number these states 0, 1, ..., M_s − 1, and we let S_i = l denote the event in which the encoder is in state l after transition i.

We then group soft decisions based on which encoder transition produced them. We define the vector y_i to be the set of n soft decisions corresponding to transition i. We refer to the individual soft decisions in this vector as y_{i,m}, where m can range from 0 to n − 1. We wish to compare each vector y_i against the encoder's possible outputs for that transition. We define the vector x̂_j(l) to be the set of bits generated when the input bit j ∈ {0, 1} is presented to a state machine in state l. Again, we refer to individual bits in this vector as x̂_{j,m}(l), with m ranging from 0 to n − 1.

We can now compute the probability that transition i takes the state machine from state l′ to l when the systematic bit for that transition is j:

γ_i^j(l′, l) = p_i(j) (πN_0)^{−n/2} exp( −(1/N_0) Σ_{m=0}^{n−1} (y_{i,m} − x̂_{j,m}(l′))² )  when the transition is possible

γ_i^j(l′, l) = 0  otherwise

The factor (πN_0)^{−n/2} cancels out in Equation 3.1.
Next, we compute two conditional probabilities, α_i(l) and β_i(l). We define α_i(l) to be the probability that S_i = l conditioned on the bit estimates from the first i transitions and β_i(l) to be the same probability conditioned on the bit estimates from the remaining transitions. We calculate both values recursively:

α_i(l) = Σ_{l′=0}^{M_s−1} Σ_{j∈{0,1}} γ_i^j(l′, l) α_{i−1}(l′)

β_i(l) = Σ_{l′=0}^{M_s−1} Σ_{j∈{0,1}} γ_{i+1}^j(l, l′) β_{i+1}(l′)

We now combine α_i(l) and β_i(l) to compute the log-probability ratio that each bit is a '1' or a '0':

Λ(d_i) = ln [ Σ_{l=0}^{M_s−1} Σ_{l′=0}^{M_s−1} α_{i−1}(l′) γ_i^1(l′, l) β_i(l) / Σ_{l=0}^{M_s−1} Σ_{l′=0}^{M_s−1} α_{i−1}(l′) γ_i^0(l′, l) β_i(l) ]

This soft decision has three terms: one corresponding to x_i^s; one corresponding to p_{1,i}(j); and the extrinsic information, Λ^e_{1,i}:

Λ(d_i) = ln [ p_{1,i}(1) / p_{1,i}(0) ] + (4/N_0) y_i^s + Λ^e_{1,i}

We now solve for Λ^e_{1,i}:

Λ^e_{1,i} = Λ(d_i) − ln [ p_{1,i}(1) / p_{1,i}(0) ] − (4/N_0) y_i^s    (3.2)

The results of the MAP algorithm are therefore Λ(d_i), the soft decisions, and Λ^e_1, the extrinsic information. Vucetic and Yuan describe the MAP algorithm in greater detail in [21].

Log-MAP Algorithm

The Log-MAP algorithm is identical to the MAP algorithm except that it operates in the log domain. Replacing γ_i^j(l′, l), α_i(l), and β_i(l) with their logarithms, γ̄_i^j(l′, l), ᾱ_i(l), and β̄_i(l), we find:

γ̄_i^j(l′, l) = ln γ_i^j(l′, l)

ᾱ_i(l) = ln α_i(l) = ln Σ_{l′=0}^{M_s−1} Σ_{j∈{0,1}} e^{γ̄_i^j(l′, l) + ᾱ_{i−1}(l′)}

β̄_i(l) = ln β_i(l) = ln Σ_{l′=0}^{M_s−1} Σ_{j∈{0,1}} e^{γ̄_{i+1}^j(l, l′) + β̄_{i+1}(l′)}

Λ(d_i) = ln Σ_{l,l′} e^{ᾱ_{i−1}(l′) + γ̄_i^1(l′, l) + β̄_i(l)} − ln Σ_{l,l′} e^{ᾱ_{i−1}(l′) + γ̄_i^0(l′, l) + β̄_i(l)}

We can simplify these equations by defining the function max*(·, ·):

max*(x_1, x_2, ...) = ln(e^{x_1} + e^{x_2} + ...)

max*(x, y) = max(x, y) + ln(1 + e^{−|x−y|}) = max(x, y) + δ(|x − y|)

where δ(|x − y|) = ln(1 + e^{−|x−y|}). We can now define ᾱ_i(l), β̄_i(l), and Λ(d_i) in terms of max*(·, ·):

ᾱ_i(l) = max*_{0≤l′≤M_s−1, j∈{0,1}} { ᾱ_{i−1}(l′) + γ̄_i^j(l′, l) }

β̄_i(l) = max*_{0≤l′≤M_s−1, j∈{0,1}} { β̄_{i+1}(l′) + γ̄_{i+1}^j(l, l′) }

Λ(d_i) = max*_{0≤l,l′≤M_s−1} [ ᾱ_{i−1}(l′) + γ̄_i^1(l′, l) + β̄_i(l) ] − max*_{0≤l,l′≤M_s−1} [ ᾱ_{i−1}(l′) + γ̄_i^0(l′, l) + β̄_i(l) ]

Log-MAP is both faster and less likely to cause roundoff errors than the MAP algorithm. Many computers execute the addition and subtraction operations required by Log-MAP more quickly than the multiplications and divisions used by MAP.
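The max* operation is small enough to show directly. The following is an illustrative sketch of ours (the thesis does not give code); `max_star` is a hypothetical helper name implementing the identity max*(x, y) = max(x, y) + ln(1 + e^{−|x−y|}).

```python
import math

def max_star(x, y):
    """max*(x, y) = ln(e^x + e^y), computed stably as max(x, y)
    plus the correction term delta(|x - y|) = ln(1 + e^-|x-y|)."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

# max* turns a log-domain sum into a fold: ln(e^a + e^b + e^c)
# is obtained by applying max_star pairwise over the terms.
terms = [0.3, -1.2, 2.5]
acc = terms[0]
for t in terms[1:]:
    acc = max_star(acc, t)
assert abs(acc - math.log(sum(math.exp(t) for t in terms))) < 1e-12
```

Because the correction term is computed from |x − y| only, it is exactly the quantity that the Max-Log-MAP simplification drops and that a lookup table can approximate.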
Additionally, addition and subtraction are less likely to result in overflow and underflow problems than multiplication and division. For these reasons, we base the decoding algorithm we use in this thesis on the Log-MAP algorithm.

One interesting issue with the Log-MAP algorithm is the implementation of the function δ(·), which is both expensive to compute and generally close to zero. Some turbo decoders assume that it is always equal to zero, replacing the max*(·, ·) operation with the simpler max(·, ·) function. This modified algorithm is called the Max-Log-MAP algorithm. Using Max-Log-MAP results in some performance loss but simplifies hardware implementation.

A good compromise between computing δ(·) for every max*(·, ·) operation and ignoring it completely is to implement it with a lookup table. Previous research has shown that using an eight-value lookup table for δ(·) causes negligible loss in performance [18]. We use the lookup table strategy in our implementation of Log-MAP. Viterbi describes in more depth the differences between the MAP, Log-MAP, and Max-Log-MAP algorithms [20].

3.1.3 Interleaver Design

The purpose of the turbo code interleaver is to prevent low-weight input sequences from generating low-weight parity sequences in both RSC encoders. There are several commonly used ways of achieving this goal, including random interleavers and spread-random interleavers. Both use randomness to decrease the number of low-weight codewords.

Random Interleavers

The first common interleaver type is the random interleaver. Like all interleavers, random interleavers perform the same permutation on every block of input bits. The permutation of a random interleaver, however, is selected pseudo-randomly from all possible permutations when the system is designed.

Random interleavers break up most low-weight parity sequences. These sequences occur when a small number of '1's occur in close proximity in the input sequence.
Since the random interleaver moves bits that are nearby each other to random locations, it is unlikely that bits that are close to each other in the original sequence will remain close in the permuted sequence. We have no guarantee, however, that all groups of bits will be successfully broken up. We can therefore improve on the performance of the random interleaver by adding some constraints on the selected permutation. These constraints should ensure that the interleaver spreads out nearby bits.

Spread-Random Interleavers

One successful type of constrained random interleaver is the spread-random interleaver. A spread-random interleaver has one parameter, S, that determines how much adjacent bits should be spread out. The spread-random condition with parameter S requires that for any pair of bits x_i and x_j:

|i − j| < S  ⟹  |π(i) − π(j)| > S    (3.3)

where π(i) is the position of bit i in the permuted sequence. This constraint guarantees that low-weight error patterns whose systematic component has length less than S get broken up by the interleaver. Spread-random interleavers are designed by selecting the indices of the permutation function π(i) pseudo-randomly and rejecting any that do not meet the spread-random condition [7].

There is a limit on how large we can make S before Equation 3.3 becomes difficult or impossible to satisfy. Spread-random interleavers of size N can generally be constructed for parameter S < S_max, where S_max = ⌊√(N/2)⌋ [5]. This upper bound on S limits the performance of spread-random interleavers.

Spreading Factors

Heegard and Wicker generalize spread-random interleavers to include two parameters, S and T, which they call spreading factors [10]. The modified constraint takes the following form:

|i − j| < S  ⟹  |π(i) − π(j)| > T    (3.4)

This condition reduces to Equation 3.3 for the case S equal to T. We take advantage of the additional flexibility offered by this more general constraint when we design interleavers for MPCCs.
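The rejection-based construction described above can be sketched as follows. This is an illustration of ours under the two-parameter constraint of Equation 3.4, with hypothetical function names; it is not the thesis's generator.

```python
import random

def spread_random_interleaver(n, s, t, max_restarts=1000, seed=0):
    """Build a permutation pi of range(n) such that |i - j| < s
    implies |pi[i] - pi[j]| > t. Candidate positions are drawn
    pseudo-randomly and rejected if they violate the constraint;
    the construction restarts if it paints itself into a corner."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        remaining = list(range(n))
        pi = []
        ok = True
        for i in range(n):
            # Only the previous s - 1 indices are within distance s of i,
            # so the new image must be more than t away from their images.
            recent = pi[max(0, i - (s - 1)):]
            cands = [p for p in remaining
                     if all(abs(p - q) > t for q in recent)]
            if not cands:
                ok = False
                break
            choice = rng.choice(cands)
            pi.append(choice)
            remaining.remove(choice)
        if ok:
            return pi
    raise RuntimeError("could not satisfy spread constraint; lower s or t")
```

Setting t equal to s recovers the one-parameter condition of Equation 3.3; pushing s toward ⌊√(N/2)⌋ makes the rejection step fail more and more often, which is the practical face of the S_max bound.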
3.2 Multiple Turbo Codes

Multiple turbo codes were first introduced by Divsalar and Pollara. They differ from traditional turbo codes by having more than two constituent codes [4]. This modification changes the encoding process only slightly but has a significant impact on decoding.

3.2.1 Encoding

Multiple turbo encoders work exactly like regular turbo encoders except that they generate more than two parity sequences. We label these sequences x_1^p, x_2^p, …. These sequences are all included in the final output sequence of the encoder. Figure 3-5 shows a multiple turbo encoder with three constituent codes.

3.2.2 Decoding

Multiple turbo decoders differ from turbo decoders by having more than two constituent decoders. This change raises three issues:

1. we must now correctly interleave and deinterleave the data passed among the multiple constituent decoders;
2. we must change how we calculate our a priori probabilities; and
3. we must determine the configuration of the constituent decoders and the order in which they will operate on the received sequence.

Figure 3-5: A multiple turbo encoder with three constituent codes

First, we address the problem of interleaving and deinterleaving data by incorporating the interleavers and deinterleavers into the constituent decoders, as suggested by Divsalar and Pollara [4]. By interleaving the data at the input of the constituent decoder and deinterleaving it at the output, we can connect any one constituent decoder to any other constituent decoder without inserting interleavers or deinterleavers between them. Figure 3-6 shows the modified constituent decoder for the second constituent code.

Second, multiple turbo decoders form their a priori probability estimates for each bit by combining the extrinsic information from more than one constituent decoder.
We recall that the extrinsic information from a constituent decoder depends only on the redundant information embedded by the corresponding constituent encoder. Since the redundant information from different constituent encoders is uncorrelated, we can assume that the extrinsic information from different constituent decoders is also uncorrelated. We therefore approximate the a priori probabilities for a given decoder m by adding the extrinsic information from the other decoders [4]:

ln [ p_{m,i}(1) / p_{m,i}(0) ] = Σ_{m′≠m} Λ^e_{m′,i}

Figure 3-6: A turbo constituent decoder modified for multiple turbo codes to include interleaver and deinterleaver

Using this definition of p_{m,i}(j), we can use the same MAP algorithm we used in regular turbo decoding.

Last, the number of possible configurations for multiple turbo decoders is larger than for regular turbo decoders. Figure 3-7 illustrates four of the configurations suggested by Divsalar and Pollara in [4] and by Han and Takeshita in [8]. Table 3.1 shows these same four configurations in a list format. Each entry represents one unit of time and contains the constituent decoders that are active during that time step. The pattern repeats past the end of the list.

Name                   Configuration
Serial                 1, 2, 3
Extended Serial        1, 2, …
Extended Master-Slave  1, (2, 3)
Parallel               (1, 2, 3)

Table 3.1: Multiple turbo constituent decoder configurations

It is interesting to note that for normal, two-code turbo codes, all four of these configurations reduce to the serial one. Figure 3-8 shows this effect for the parallel structure. One critical assumption that Divsalar and Pollara make in analyzing multiple turbo codes is that all the constituent codes have the same block length.
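A minimal numerical sketch of this combining rule (illustrative only, with our own function name): the log-ratio of the a priori probabilities for decoder m is the sum of the other decoders' extrinsic values, which can then be inverted back into a probability pair.

```python
import math

def a_priori_from_extrinsic(extrinsics, m):
    """Given extrinsic log-ratios extrinsics[m'][i] for each constituent
    decoder m', return (p_m(1), p_m(0)) for every bit i of decoder m by
    summing the extrinsic information from all the other decoders."""
    n_bits = len(extrinsics[0])
    probs = []
    for i in range(n_bits):
        log_ratio = sum(lam[i] for k, lam in enumerate(extrinsics) if k != m)
        p1 = 1.0 / (1.0 + math.exp(-log_ratio))  # invert ln(p1/p0), p0 + p1 = 1
        probs.append((p1, 1.0 - p1))
    return probs

# Example with three constituent decoders and two bits:
ext = [[0.5, -1.0], [0.2, 0.3], [-0.1, 0.4]]
p = a_priori_from_extrinsic(ext, 0)   # decoder 0 uses decoders 1 and 2 only
```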
Figure 3-7: Possible configurations for multiple turbo constituent decoders: (a) serial; (b) extended serial; (c) extended master-slave; and (d) parallel [4, 8]

Figure 3-8: The parallel constituent decoder configuration for a turbo code reduces to two copies of the serial one

The next chapter introduces the concept of mixed block lengths to the multiple turbo code construction, enabling further gains in performance.

Chapter 4

Mixed Block Length Parallel Concatenated Codes

We present in this chapter mixed block length parallel concatenated codes (MPCCs). MPCCs extend small block length turbo codes by adding a third, large block length constituent code. We first explain why it is desirable to combine the properties of small and large block length turbo codes in this way. We then describe the MPCC encoding and decoding algorithms and discuss MPCC interleaver design. We conclude by listing our hypotheses about MPCC characteristics.

4.1 Extending Turbo Codes

Small and large block length turbo codes both have desirable properties. Table 4.1 shows how well turbo codes with different block lengths meet the criteria for FEC selection outlined in Section 2.3.3. We see that increasing a turbo code's block length has three effects.

First, it becomes easier to lower bit error rates by further separating adjacent codewords [6].

Second, code latency increases. This longer latency has two components. The first is the time required to receive a complete encoded block. Although the decoder can start decoding as soon as the first systematic and parity bits arrive, it cannot do the majority of the decoding work until the entire block has been received.
Block Length       Small  Large
Error Performance  Fair   Excellent
Latency            Good   Poor
Complexity         Good   Poor

Table 4.1: Turbo code characteristics

The second component of code latency is the amount of time needed to decode each block. Longer blocks take longer to decode because the time required for each MAP iteration is proportional to the size of the block being processed. Since turbo codes with larger block lengths spend more time both waiting for blocks to arrive and decoding blocks, they have higher code latency than small block length turbo codes.

Third, decoder complexity increases. As block lengths become larger, the memory needed to store both the interleaver and the internal metrics for the MAP algorithm also increases.

We seek a way of combining the bit error rate of large block length codes with the low latency and low complexity of small block length codes. Table 4.2 shows the characteristics of the desired code.

Block Length  Error Performance  Latency  Complexity
Small         Fair               Good     Good
Large         Excellent          Poor     Poor
Mixed         Good               Good     Good

Table 4.2: Desired code characteristics

To achieve this set of features, we start with a turbo code that has small block length N_s, and we add a constituent code with larger block length N_l. Figure 4-1 shows how we add this larger block length constituent code to a traditional turbo code. We keep the overall code rate the same by changing the puncturing rates of the three constituent codes. We define the puncturing rate of the large constituent code to be P_l. We also define the variable δ to be the percentage of parity bits that come from the large constituent code. For the MPCCs we study in Chapter 5, δ is equal to 1 − P_l.
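As a small illustration of this bookkeeping (a sketch of ours, not code from the thesis): for an overall rate-1/2 code, each group of systematic bits is matched by an equal number of surviving parity bits, a fraction δ of which come from the large constituent code. The δ = 1/16 value below is the baseline setting reported later, in Table 5.2.

```python
def parity_budget(n_eq, delta):
    """For an overall rate-1/2 MPCC: n_eq parity bits survive
    puncturing per group of n_eq systematic bits. A fraction delta
    comes from the large constituent code; the rest is split
    between the two small constituent codes."""
    large = round(delta * n_eq)
    small = n_eq - large
    return small, large

small, large = parity_budget(8192, 1.0 / 16.0)
```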
Figure 4-1: Combining small and large block length codes

Although this thesis discusses only MPCCs with two small constituent codes and a single larger one, it is easy to generalize to other combinations of block sizes.

4.1.1 Block Alignment

One issue that is important to the implementation of MPCCs is the alignment of the larger constituent code with respect to the two smaller ones. There are two possible alignment strategies: aligned and staggered. The aligned strategy aligns the beginning of the first block of the larger constituent code with the first bit of the input sequence, and the staggered strategy does not. Figure 4-2 shows both approaches to block alignment.

Figure 4-2: MPCC alignment strategies: (a) aligned and (b) staggered

We evaluate these strategies based on two criteria: error performance and ease of implementation. In terms of bit error rates, the performance of staggered MPCCs and aligned MPCCs is similar. The significant low-weight error sequences are the same for both codes except for a small number involving bits near block boundaries. As the size of the larger constituent code increases relative to the smaller ones, these edge cases become insignificant.

From an implementation perspective, however, we strongly prefer aligned MPCCs. Aligned MPCCs are much easier to implement because we can treat them as block codes with block length N_eq:

N_eq = lcm(N_s, N_l)    (4.1)

where lcm(·, ·) denotes the least common multiple. In other words, we can encode and decode each group of N_eq bits independently of each other. The MPCC encoding and decoding algorithms in this chapter use this approach. Staggered MPCCs, on the other hand, are not block codes.
Correct decoding of staggered MPCCs requires careful handling of bits near block boundaries and forces some bits to be decoded more than once. Since this added complexity brings little performance improvement, we choose to study only aligned MPCCs in this thesis.

4.1.2 Block Lengths

We note that the error performance of an aligned MPCC is a function of N_s and N_l, while its complexity is a function of N_eq. We therefore recommend that N_l be made an integer multiple of N_s so that N_eq is equal to N_l:

N_l = B_s · N_s = N_eq    (4.2)

for some positive integer B_s. If this constraint is not met, we can both lower N_eq and increase N_l by making N_l equal to the next larger multiple of N_s. Since these new values improve the error performance of the MPCC and lower its complexity, all MPCCs should satisfy Equation 4.2.

4.2 Encoding

MPCC¹ encoding is similar to multiple turbo encoding, except that one constituent code has a different block length than the other two. As we explain in Section 4.1.1, we handle the different block sizes by treating the MPCC as a block code with block length N_eq.

The two smaller constituent encoders take each group of N_eq bits and break the bits into B_s blocks of N_s bits each, where B_s = N_eq/N_s. We label these smaller blocks d[1], ..., d[B_s], and we use a similar convention whenever we break other vectors into smaller groups. The constituent encoders then interleave and encode these blocks of size N_s independently. The larger constituent encoder, on the other hand, interleaves and encodes each block of N_eq bits without separating it into smaller blocks, since N_eq is equal to N_l. Figure 4-3 illustrates the encoding operation for both the small and large block length constituent encoders. The MPCC encoder then punctures the parity bits from the constituent encoders and places them in the output sequence along with the systematic bits.
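The block bookkeeping of Equations 4.1 and 4.2 and the encoder's splitting convention can be sketched as follows. This is our own illustrative code; `small_encode` and `large_encode` stand in for RSC constituent encoders and are hypothetical placeholders.

```python
from math import gcd

def n_eq(n_s, n_l):
    """Equivalent block length of an aligned MPCC (Eq. 4.1): lcm(N_s, N_l).
    When N_l = B_s * N_s (Eq. 4.2), this collapses to N_l."""
    return n_s * n_l // gcd(n_s, n_l)

def split_blocks(d, n_s):
    """Break a group of N_eq bits into B_s = N_eq / N_s blocks of N_s
    bits: d[1], ..., d[B_s] in the thesis's notation (0-indexed here)."""
    assert len(d) % n_s == 0
    return [d[i:i + n_s] for i in range(0, len(d), n_s)]

def mpcc_encode_parity(d, n_s, small_encode, large_encode):
    """The small constituent encoders interleave and encode each N_s
    block independently; the large encoder processes all N_eq bits
    at once."""
    small_parity = [small_encode(block) for block in split_blocks(d, n_s)]
    large_parity = large_encode(d)
    return small_parity, large_parity
```

With identity functions substituted for the encoders, the sketch just demonstrates the splitting: the small path sees B_s separate blocks while the large path sees one.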
We suggest an arrangement of systematic and parity bits that has the following properties:

* each block of parity bits from the small constituent codes should be placed immediately after the corresponding block of systematic bits, and
* parity bits from the large constituent code should be placed at the end of each codeword.

¹From this point forward, we use the term MPCC to imply an aligned MPCC that satisfies Equation 4.2.

Figure 4-3: MPCC constituent encoders for: (a) the small block length constituent codes and (b) the large block length constituent code

Figure 4-4: An MPCC encoder using the transmission order suggested in the text. The puncturing unit is omitted for clarity.

Figure 4-4 shows an MPCC encoder with a transmission order that meets these requirements.

4.3 Decoding

Like the MPCC encoder, the MPCC decoder works with blocks of N_eq bits at a time. These blocks are sent through a two-stage decoding process. The first stage uses only the two smaller constituent codes to decode the received block. At the end of this stage, the decoder decides whether or not there are still any errors in the current block. If there are not, then it stops decoding. Otherwise, it starts the second phase of the decoding process in which all three constituent codes are used to correct any remaining errors. Figure 4-5 shows the flow chart for this operation.

Figure 4-5: Flow chart for the MPCC decoding algorithm

Figure 4-6: First stage of the MPCC decoder

4.3.1 First Stage

The first stage of the decoder uses only the two small constituent codes.
It first breaks each block of N_eq bits into B_s blocks of N_s bits. It then decodes these smaller blocks separately using the turbo decoding algorithm from Section 3.1.2. Figure 4-6 shows the complete first decoder stage. Note that implementing this stage does not actually require B_s separate turbo decoders; a single decoder can be reused for each block. We show B_s decoders in the diagram to illustrate the independence of the decoding operations.

4.3.2 Stopping Criterion

At the end of the first stage, we must decide whether or not any errors remain in the current block. For this thesis, we assume that we always know when a block has been correctly decoded and enter the second stage of the decoder only when necessary. If we were implementing MPCCs in a real system, we would have to use some nonideal stopping criterion to make this determination. Matache et al. describe several stopping rules that are suitable for this purpose [13]. The quality of the stopping rule affects the performance of the MPCC because a failure to detect errors at the output of the first decoder stage increases the bit error rate of the whole system.

4.3.3 Second Stage

The second stage uses all three constituent codes to correct any errors that remain after the first stage. Decoding with all the constituent codes presents four specific challenges:

1. we must correctly interleave and deinterleave the data passed among the multiple constituent decoders;
2. we must change how we calculate our a priori probabilities;
3. we must determine the configuration of the constituent decoders and the order in which they will operate on the received sequence; and
4. we must exchange information between constituent decoders that have different block lengths.

Section 3.2.2 on multiple turbo decoding discusses in detail how to solve the first two problems, but the last two require more careful consideration.
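The two-stage control flow of Section 4.3 (the flow chart of Figure 4-5, with the stopping criterion of Section 4.3.2 deciding whether the second stage runs) amounts to a short loop. This is an illustrative sketch of ours; `turbo_decode_small`, `decode_all_codes`, and `has_errors` are placeholder names for the stage-one decoder, the three-code stage-two decoder, and the stopping rule.

```python
def mpcc_decode(received_block, turbo_decode_small, decode_all_codes, has_errors):
    """Two-stage MPCC decoding: try the two small constituent codes
    first, and fall back to all three constituent codes only when
    the stopping criterion says errors remain."""
    decisions = turbo_decode_small(received_block)    # stage one
    if has_errors(decisions):
        decisions = decode_all_codes(received_block)  # stage two
    return decisions
```

Because the fallback runs only when `has_errors` fires, the common case pays only the cost of the small-block turbo decoder, which is the basis of the latency claim in Section 4.5.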
First, we must reconsider our list of constituent decoder configurations because MPCC constituent codes, unlike those for multiple turbo codes, no longer all have the same block length. Second, the problem of passing data between constituent codes with different block lengths is unique to MPCC decoding. The next two sections explain how we address these issues.

Constituent Decoder Configuration

We make two changes to our list of constituent decoder configurations from Section 3.2.2 to make it suitable for MPCCs. First, we introduce two variants on the extended serial configuration. These variants spend more time than the extended serial configuration on the two small constituent codes. Since the constituent decoders no longer all have the same block length, it may be advantageous to use a more asymmetrical configuration.

Name                   Configuration
Extended Serial        3, 1, 2
Variant One            3, 1, 2, 1, 2
Variant Two            3, 1, 2, 1, 2, 1, 2, 1, 2
Extended Master-Slave  3, (1, 2)
Parallel               (1, 2, 3)

Table 4.3: MPCC second stage configurations

Second, we change all the configurations to start with the large constituent decoder. By the time a block of data reaches the second decoder stage, the first stage has already run multiple iterations using the two smaller codes. Any further iterations with these two codes at the beginning of the second stage would not provide a significant reduction in the bit error rate. Table 4.3 lists the five configurations we study in this thesis.

Different Interleaver Sizes

We handle different interleaver sizes in the second decoder stage using a similar strategy to the one we used for the MPCC encoder. Each constituent decoder works with groups of N_eq bits, accepting N_eq systematic bits and producing N_eq soft decisions at a time. The small constituent decoders break each block of N_eq bits into B_s blocks of N_s bits. Next, they interleave and decode each of these small blocks separately.
Finally, they deinterleave the soft decisions and reassemble them into a large block of size N_eq. The large constituent decoder interleaves, decodes, and deinterleaves each large group of N_eq bits without separating it into smaller blocks. Figure 4-7 illustrates the decoding operation for both the small and large block length constituent decoders.

4.4 Interleaver Design

For the small interleaver, we use a spread-random interleaver because it breaks apart most low-weight error sequences [10]. Constructing the large interleaver, however, is more complicated. One option is to use a second spread-random interleaver. We evaluate the performance of using two spread-random interleavers in Section 5.3.4.

The other possibility is to separate the interleaving process into two parts with a two-stage interleaver. The first stage breaks each group of N_l bits into blocks of N_s bits, interleaves these smaller blocks, and reassembles them. The second stage passes this reassembled group of bits through a B_s × N_s block interleaver. Vucetic and Yuan describe block interleavers in [21]. Figure 4-8 shows a complete two-stage interleaver.

We want to construct the first interleaver stage so that the entire interleaver satisfies the spread-random constraint as much as possible:

|i − j| < S  ⟹  |π(i) − π(j)| > S    (4.3)

Since the block interleaver spreads out adjacent bits at the output of the first interleaver stage by an additional factor of B_s, we want the first interleaver stage to meet the following criterion:

|i − j| < S  ⟹  |π_s(i) − π_s(j)| > S/B_s = T

where T = S/B_s. This constraint is identical to the two-parameter spread-random constraint from Equation 3.4. We therefore use the two-parameter spread-random
Figure 4-7: MPCC constituent decoders for (a) the small block length constituent codes and (b) the large block length constituent code

Figure 4-8: A two-stage interleaver

interleaver for the first interleaver stage. Since the largest value for S that we can use in Equation 4.3 is S_max = ⌊√(N_l/2)⌋, we select the following values for S and T:

S = ⌊√(N_l/2)⌋
T = S/B_s

The advantage to using this method is that it requires less memory to store the mapping function π(i) than a spread-random interleaver of equal size. A spread-random interleaver with size N_l requires N_l · log(N_l) bits to store the function π(i). The two-stage interleaver requires only N_s · log(N_s) bits to store π_s(i) from the first interleaver stage and no memory for the second interleaver stage. Block interleavers can be realized using only trivial addressing operations and do not require any memory for a lookup table. The disadvantage to using the two-stage interleaver is that it is less random than a spread-random interleaver and therefore performs slightly worse.

4.5 Hypotheses

We make two claims about MPCCs:

1. they need only the first decoder stage to decode most received blocks, and
2. they have improved error rates compared to their underlying, small block length turbo codes.

The first claim is important because it affects the average latency and complexity of the MPCC decoder. Consider the expected behavior of the two-stage MPCC decoder. At higher signal-to-noise ratios, the first stage successfully decodes a large percentage of the received blocks. The complexity and latency of the first decoder stage, however, are the same as those of the original block length N_s turbo code. If the probability of entering the second decoder stage is sufficiently small, then the extra cost we are forced to pay when we use the second stage becomes insignificant.
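This expected-cost argument can be made concrete with a one-line calculation. The sketch below is ours; the 12-pass stages match the baseline code of Table 5.2, and the 22% second-stage rate is the value reported for the baseline MPCC at 1.5 dB in Chapter 5.

```python
def average_map_passes(p_second_stage, passes_stage1=12, passes_stage2=12):
    """Expected MAP passes per block for a two-stage MPCC decoder:
    every block pays for stage one; only a fraction p_second_stage
    of blocks also pays for stage two."""
    return passes_stage1 + p_second_stage * passes_stage2

# If only 22% of blocks need the second stage, the average cost stays
# close to that of the 12-pass small-block first stage:
avg = average_map_passes(0.22)
```

As p_second_stage falls toward zero with increasing Eb/N0, the average cost converges to the first-stage cost alone, which is the substance of the first hypothesis.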
We therefore conclude that the average complexity and latency of MPCCs is close to that of the original code. On the other hand, at these same higher signal-to-noise ratios, the error performance of an MPCC should be better than that of a block length N_s turbo code. The encoder can use the longer interleaver to break up longer error patterns and achieve a lower bit error rate.

Chapter 5

Results

This chapter presents simulation results that show the improved properties of MPCCs over traditional turbo codes. We start by describing the error performance of four reference turbo codes that form the basis for the MPCCs we study in this chapter. We then propose a baseline MPCC and compare it to the reference turbo codes. Finally, we measure how changing the parameters of the baseline MPCC affects its overall characteristics.

5.1 Methodology

We use software simulations to evaluate the performance of turbo codes and MPCCs. For each code, we repeatedly encode, transmit, and decode blocks of bits until we meet both of the following conditions:

* we have processed at least 50 blocks with errors and
* we have processed at least 200 blocks total.

These conditions ensure that we have a good sample of error patterns at both high and low signal-to-noise ratios, respectively. Most of our simulations measure bit error rate vs. signal-to-noise ratio. For the simulations that keep the signal-to-noise ratio constant, we choose a value of Eb/N0 = 1.5 dB because it lies in the waterfall region of the codes we study in this chapter.

5.2 Reference Turbo Codes

We start by examining the performance of four reference turbo codes with the properties outlined in Table 5.1. Figure 5-1 shows the bit error rate (BER) and block error rate (BLER) for each of these codes. Since larger interleavers are able to break up larger error patterns, turbo codes with larger block lengths should perform better, and our results confirm this prediction [6].
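The stopping conditions above translate directly into a simulation loop; here is an illustrative sketch of ours (the `run_one_block` callback, returning whether the decoded block contained errors, is a placeholder for a full encode/transmit/decode run).

```python
def simulate_until_confident(run_one_block, min_error_blocks=50,
                             min_total_blocks=200):
    """Run block simulations until we have seen at least 50 blocks
    with errors AND at least 200 blocks total, then report the
    observed block error rate."""
    total = errors = 0
    while errors < min_error_blocks or total < min_total_blocks:
        total += 1
        if run_one_block():
            errors += 1
    return errors / total

# Toy stand-in for a real simulation: every fourth block "fails".
counter = {"n": 0}
def fake_block():
    counter["n"] += 1
    return counter["n"] % 4 == 0

bler = simulate_until_confident(fake_block)
```

The error-block minimum dominates at high signal-to-noise ratios (errors are rare, so many blocks must be run), while the total-block minimum dominates at low signal-to-noise ratios, which is exactly the trade-off the two conditions in the text are designed to cover.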
Property      Value
Code Rate     1/2
RSC Code      K = 5, G = 35/23
MAP Passes    24
Block Size    N ∈ {1024, 2048, 4096, 8192}
Interleaver   Spread-Random

Table 5.1: Properties of the reference turbo codes. K is the constraint length of the RSC code, and G is the generator function in octal notation.

[Figure 5-1: (a) BER and (b) BLER vs. Eb/No for the reference turbo codes]

5.3 Baseline MPCC

We propose a baseline MPCC based on the block length 1024 reference turbo code. Table 5.2 lists the properties of the proposed code. We keep the code rate, RSC code, and number of MAP passes constant in order to ensure a fair comparison with the reference turbo codes. Figures 5-2 and 5-3 show the performance of the proposed MPCC relative to the block length 1024 and 8192 turbo codes. These graphs demonstrate two major properties of the baseline MPCC. First, Figure 5-2 shows that the second stage of the MPCC decoder is not needed for every received block. At a signal-to-noise ratio of 1.5 dB, the second stage is necessary for only 22% of the received blocks, and this percentage continues to go down as Eb/No increases. Second, Figure 5-3 shows that the MPCC performs better than the reference turbo code with block length 1024 below bit error rates of 10^-3. For example, the proposed MPCC performs better than the reference turbo code by 0.1 dB at a bit error rate of 10^-5.

Property           Value
Code Rate          1/2
RSC Code           K = 5, G = 35/23
MAP Passes         First Stage: 12; Second Stage: 12
Block Size         Ns = 1024, Nl = 8192
Small Interleaver  Spread-Random, S = 20
Large Interleaver  Spread-Random, S = 65
Configuration      Extended Serial
Puncturing Rate    ζ = 1/16

Table 5.2: Properties of the baseline MPCC

[Figure 5-2: BLER at the output of the first decoder stage vs. Eb/No for the baseline MPCC]

The error floor for the baseline MPCC occurs at a level too low for our software to detect, but we expect it to be lower than that of the block length 1024 turbo code because the MPCC has a larger effective block length. Now that we have established the performance of the baseline MPCC, the next sections study how changing the parameters in Table 5.2 affects overall code performance.
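Table 5.2 specifies spread-random (S-random) interleavers for the baseline MPCC, with spreads S = 20 and S = 65. The thesis does not reproduce the construction procedure here; the following is a minimal sketch of the standard S-random rejection method (the function name and restart limit are our own, not from the thesis):

```python
import random

def s_random_interleaver(n, s, max_restarts=100):
    """Draw a spread-random (S-random) permutation of length n.

    An accepted index must differ by at least s from each of the
    indices chosen in the s preceding positions, so bits that are
    close together before interleaving land far apart afterwards.
    The greedy fill can stall, so we restart from a fresh shuffle.
    """
    for _ in range(max_restarts):
        pool = list(range(n))
        random.shuffle(pool)
        perm = []
        while pool:
            for k, cand in enumerate(pool):
                if all(abs(cand - prev) >= s for prev in perm[-s:]):
                    perm.append(pool.pop(k))
                    break
            else:  # no remaining index satisfies the spread: stall
                break
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-random permutation found; reduce s")
```

For block length N, spreads up to roughly sqrt(N/2) are usually found quickly, which is consistent with the S = 20 and S = 65 values in Table 5.2 for N = 1024 and N = 8192.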
[Figure 5-3: (a) BER and (b) BLER at the output of the second decoder stage vs. Eb/No for the baseline MPCC]

[Figure 5-4: BER vs. Eb/No and decoder configuration for the baseline MPCC. The BLER curve is similar to the BER curve.]

5.3.1 Constituent Decoder Configuration

We first attempt to determine the best constituent decoder configuration for the second MPCC decoder stage. Figure 5-4 presents the results of our simulations for each of the five configurations discussed in Section 4.3.3. The "Extended Serial," "Variant 1," and "Extended Master-Slave" configurations all perform similarly, while the "Parallel" and "Variant 2" configurations perform somewhat worse than the other three. The "Parallel" configuration underperforms the "Extended Serial" configuration by approximately 0.1 dB at a bit error rate of 10^-5. Han and Takeshita find similar behavior for the parallel configuration in multiple turbo decoders [8]. Our experiment demonstrates that this result applies to MPCCs as well. The "Variant 2" configuration has a higher error floor than the other four configurations. The most likely cause for this raised error floor is insufficient use of the large constituent decoder.
"Variant 2" uses the large constituent decoder only twice per block, while the other four configurations use it three or four times.

[Figure 5-5: BLER at the output of the first decoder stage vs. ζ at 1.5 dB for the baseline MPCC]

5.3.2 Puncturing Rate

We wish to find the value of ζ that optimizes the performance of the baseline MPCC. We therefore measure the bit and block error rates of the baseline MPCC at 1.5 dB as we vary ζ over the interval [0, 1]. Figures 5-5 and 5-6 show the results of our simulations. From our data, we conclude that the optimal value is ζ = 1/16. The next sections justify this assertion and point out other interesting features of our simulation results.

First Stage BLER Curve

Figure 5-5 shows the block error rates for the small (1024-bit) blocks and the large (8192-bit) blocks at the output of the first decoder stage as a function of ζ. Although the first stage decodes using only the small constituent codes, the value we are most interested in is the large block error rate. Every large block that is in error at the end of the first stage must be passed to the second stage for further processing. We expect both block error rates to increase as we increase ζ: making ζ larger takes parity bits away from the two small constituent codes that the first decoder stage uses to decode each received block. The first decoder stage should therefore correct fewer errors, and our simulation results substantiate this prediction. We conclude that if two values of ζ give the same bit error rate, we should choose the smaller value, because doing so minimizes the number of blocks that must be passed to the second decoder stage.

[Figure 5-6: (a) BER and (b) BLER at the output of the second decoder stage vs. ζ at 1.5 dB for the baseline MPCC]

Second Stage BER Curve

The second stage BER graph illustrates how ζ affects overall MPCC performance. We find first that the error rate is not monotonic as we vary ζ, and second that we can separate the graph into the three regions shown in Table 5.3.

Region  Name                  Range
I       Turbo-Like Region     0 ≤ ζ ≤ 1/64
II      MPCC Region           1/64 < ζ < 5/16
III     Inverted MPCC Region  5/16 ≤ ζ ≤ 1

Table 5.3: Regions of the graph in Figure 5-6

In the turbo-like region, there are so few bits in the large constituent code that only the two small constituent decoders are effective at correcting errors.
Our bit error rate therefore goes up as we increase ζ, because we are removing parity bits from the two most effective constituent codes. The minimum in this region occurs at ζ = 0, corresponding to a pure block length 1024 turbo code. In the MPCC region, there are enough bits in the large constituent code that the MPCC works as intended. The small constituent codes correct most of the errors in the received sequence, and the large constituent code corrects the rest. The global minimum for the MPCC bit error rate occurs in the range 1/16 ≤ ζ ≤ 1/8. We select ζ = 1/16 as the optimal value for two reasons. First, the bit error rate is approximately constant throughout the interval [1/16, 1/8]. Second, the large block error rate at the output of the first decoder stage is three times lower for ζ = 1/16 than for ζ = 1/8. In the inverted MPCC region, 5/16 ≤ ζ ≤ 1, the MPCC works in reverse. The large constituent code corrects most of the errors, and the small codes correct the remaining ones. There is a local minimum at ζ = 5/8. We do not want to operate in this region because a block length Nl turbo code provides better error performance with the same complexity and latency.

Second Stage BLER Curve

The interesting feature of the second stage BLER curve is the ratio between the two block error rates. This ratio indicates whether the block error probabilities for each small block are independent. We first define Pr{e_l} and Pr{e_s} to be the large and small block error probabilities, respectively. We then derive a formula for the ratio between the two error probabilities for small values of Pr{e_s}, where Bs is the number of small blocks per large block:

Pr{e_l} / Pr{e_s} = [1 − (1 − Pr{e_s})^Bs] / Pr{e_s}
                  ≈ [1 − (1 − Bs Pr{e_s})] / Pr{e_s}
                  = Bs

The ratio should therefore be approximately equal to Bs when the small block error probabilities are independent, and less than Bs when they are not. We find the ratio to be roughly equal to Bs = 8 for small values of ζ and to decrease as ζ increases.
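The small-Pr{e_s} approximation above is easy to check numerically. A minimal sketch (the function name is ours) evaluates the exact ratio for Bs = 8, the number of small blocks per large block in the baseline MPCC:

```python
def bler_ratio(p_small, b_s):
    """Exact Pr{e_l}/Pr{e_s} when the b_s small blocks inside one
    large block fail independently with probability p_small."""
    return (1.0 - (1.0 - p_small) ** b_s) / p_small

B_S = 8  # one 8192-bit large block contains 8 small blocks of 1024 bits
for p in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"Pr{{e_s}} = {p:.0e}  ratio = {bler_ratio(p, B_S):.4f}")
```

The printed ratio climbs toward Bs = 8 as Pr{e_s} shrinks, matching the first-order expansion, so a measured ratio noticeably below 8 indicates dependence between small-block errors.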
We expect this behavior because the dependence between errors in the small blocks should increase as the large constituent code begins to correct more and more errors.

5.3.3 Large Block Length

In this section, we determine the effect that changing the large block size Nl has on the properties of the baseline MPCC for different values of ζ and Eb/No. Figures 5-7 and 5-8 show the results of our simulations.

[Figure 5-7: BER vs. ζ and Nl at 1.5 dB for the baseline MPCC]
The error performance for the MPCC at the points , = 0 and j = 1 are constant as we change N, but the minima in the MPCC and inverted MPCC region get deeper as N increases. Increasing N therefore does not change the ideal value for , but does improve the error performance of the MPCC. Varying Eb/No Figure 5-8 shows how bit error rate varies with N, and Eb/No at a constant , of 1/16. The waterfall region occurs earlier and has a steeper slope as N increases. Although the error floor is too low to detect in software, the error floor should be lower for larger values of N1 . 5.3.4 Interleaver Design We simulate both methods discussed in Section 4.4 for designing the large interleaver. Figure 5-9 shows the performance of both the spread-random interleaver and the two-stage interleaver. The two methods perform similarly for signal-to-noise ratios between 0 dB and 1.5 dB. The error floor for both interleavers was below a level we could detect in software, but we expect the error floor of the two-stage interleaver to be higher because the interleaver is less random. 68 id, .. . .. . . .. . . .... . .. .. . .. . . .. . . .... . .. . . . .. . . . . .. . . . .. .. . . . .. . . . .. . . .. . . .. . . .. . . .. . .. . . .. . . .. . . . .. . . . . .. . . . .. . . . .. . . . . .. . . . .. .. . . . .. . . . .. ... .. . . .. . . .. . . .. . .. . . .. . . .... . . .. . . . . .. ... . .. . . . .. . . . 10-1 . . .. . . . .. . . .. ... .. . . .. . . .. . . .. . ... .. . .. . .. . . . .. . . . .. .. . . . . .. . . . .. . . . .. .. . . .... . . . . .. . . . .. . . . .. 7 .. ... . . .. ... . . .. . . .. . . . .. . . . .. . . . .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . ... . . . . . . . . . . . . .. . . . .. . . . . .. . . . .. . . . .. . .. . . . .. . . .. . I . .. . . .. . . .. ... .. . .. . . .. . . . .. . . .. . . . . .. . 10 -2 . ... . .. . .. . . . . . . . . . . . . . . . .. . . . . .. .. . . . .. . . . .. I . . .. cc W Co .. . . .. . . . I. .. . . . . .. . 
. . . .. .. . . . .. . . .. . . .. . . .. . . .. . . .. . .. . . .. . . ... .. . . .. . . . . .... . . .. . . . . .. .. . . .. ... .. . . .. .. . . . . .. . . . . . . . . .. .. . . .. . . ... .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . .. .. . . .. . . . . .. . . . .. . . . .. . . . . .. . . .. . . . .. . . . . .. . . . .. .. . . . .. . . . .. . . .. . . .. . . .. . . .. . .. . .. . . . .. . . . .. . . . .. . . . . .. . . .. . . . . .. . .. . . . . ... .. . . . .. . . . .... . . . .. . . ................... ... ...... . . .... . .. . .. ... . . .. ... .. . . . .. . . . ...... . ... .. . . .. . . . .. . . .. . . .. . . .. . . .. . .. . . . .. . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . . . . .. . . .. . . .. . . .. . .. . . . .. . . .. . . . .. . . . .. . . . . .. . . 10-4 .. . . . . . . . . .. . . .. . . 10-3 .. . . .. . .. . .. . . .. .. . .. . . .. .. . . . .. . . . .. . . .. .. . . . .. . . . .. .. . . . .. . . . . . . . .. . .. . . . .. . . .. . . . .. .. . . . .. . . . . . . . .. . .. . . .... . .. . .. . . . .. ... .. . . . .. .. . . . .. . . . .. . . .. ... ... .. . . . .. . . . . .. . . .. . . . .. . . . . . . .. . . . .... . . . .. . . . .. . .. .. . . . . .. .. . . . .... . . . .. . . .. . . .. .. . . . . .. . . .. . . . . .. . . . .. . . . .. . .. .. . . . . .. . . -9- Spread-Random Interleaver --: ..: ... ...... ......... .............. ..... ..... m Two-Stage Interleaver-_ 10. 0 0.25 0.5 0.75 1.25 1.5 E b /No (dB) Fimire Q __ 5-9: BER vs. EbINO and interleaver design method for the baseline MPCC. The BLER curve is similar, showing no appreciable difference between the two interleavers. 69 70 Chapter 6 Conclusions This chapter discusses some of the implications of MPCCs, including benefits and potential applications. We review the characteristics of MPCCs and suggest areas for future work. 
6.1 MPCC Review

We introduce in Chapter 4 the notion of mixed block length parallel concatenated codes (MPCCs). MPCCs are small block length turbo codes modified to include a third, large block length constituent code. This modification makes MPCCs a type of multiple turbo code, and we borrow techniques from multiple turbo coding when we design the MPCC encoder and decoder [4]. Even though MPCCs bear a resemblance to both turbo codes and multiple turbo codes, they are a generalization of these two ideas. Removing the constraint that all constituent codes have the same block size raises new issues that do not arise when working with regular turbo codes. This thesis begins to address these issues and demonstrates that MPCCs provide an interesting new framework for developing error correcting codes.

6.1.1 Optimal MPCC Parameters

We study in Chapter 5 how to optimize different MPCC parameters. This section summarizes our findings for the baseline MPCC described in Table 5.2. These results are correct only for the baseline MPCC and may not apply to other MPCCs. First, Figure 5-4 shows that the extended serial configuration is the best constituent decoder configuration for the second decoder stage. Second, Figures 5-5 and 5-6 illustrate that the ideal value of ζ for the baseline MPCC is ζ = 1/16. Last, Figure 5-9 demonstrates that two-stage interleavers and spread-random interleavers perform equally well in the waterfall region but may have different error floors.

6.1.2 MPCC Properties

We make two claims in Chapter 4 regarding MPCCs:

1. they need only the first decoder stage to decode most received blocks, and
2. they have improved error rates compared to their underlying, small block length turbo codes.

The simulation results we present in Chapter 5 reinforce both arguments.
Figure 5-2 shows that the first stage of the decoder successfully decodes approximately 80% of the received blocks at a signal-to-noise ratio of 1.5 dB, and this percentage continues to go up as the signal-to-noise ratio increases. Furthermore, Figure 5-3 illustrates that modifying a block length 1024 turbo code to include a block length 8192 constituent code does improve the error performance of the code below a bit error rate of 10^-3. The coding gain from using an MPCC is 0.1 dB at a bit error rate of 10^-5.

6.2 Applications

Although MPCCs have interesting theoretical properties, there are also several real-world applications for which they are well suited, including satellite and video communications. Satellite systems can take advantage of the low average complexity of MPCCs, and video systems can take advantage of the low latency. The next sections describe how the properties of MPCCs benefit these two applications.

6.2.1 Satellite Communications

Power and size constraints on both satellites and terminals make satellite communication systems challenging to design. This section explains how MPCCs can help engineers meet these demanding constraints.

Satellite Design

Satellites can benefit from the reduced average complexity of MPCCs. Consider a satellite that is acting as a router for a large number of channels. It may be impossible to use a large block length turbo code because the satellite cannot support a complex decoder for every channel. MPCCs have a two-stage decoder, however, and the two stages have different complexities. Furthermore, the second stage is not needed for every block. Chapter 5 indicates that at a signal-to-noise ratio of 1.5 dB, the first stage successfully decodes approximately 80% of all blocks. Consequently, we can implement the MPCC decoder in the following way. Assume the satellite handles 1000 channels. We include 1000 copies of the first stage of the decoder.
These first-stage decoders are dedicated to each channel and are relatively simple to implement. We also include 200 copies of the second stage. These second-stage decoders are more complex, but there are fewer of them. Every five channels share a second-stage decoder, and the two stages are connected with a queuing system. When the first stage fails to decode a received block of Ns bits, it adds the containing block of Neq bits to the queue for the second stage. When the second stage is finished with a block, it returns it to the correct channel and starts decoding the next block in the queue. This arrangement may meet the constraints on the satellite design when using a large block length turbo code would not.

Terminal Design

Consider a satellite that is broadcasting to two types of receivers: a small, man-portable receiver and a large, vehicle-based receiver. Assume that the decoder on the man-portable device is more constrained than the one on the vehicle. If we were to use a regular turbo code, we would need to select a block length that could be decoded by the smaller receiver, even if the larger receiver can afford to implement a more complex decoder. When using turbo codes, the performance of both receivers is constrained by the limitations of the smaller one. With MPCCs, however, we can transmit a single MPCC-encoded stream that can be decoded by the smaller receiver at one bit error rate and by the larger receiver at an improved bit error rate. We pick a block length for the small constituent codes that the smaller receiver can decode, and we pick a block length for the large constituent code that the larger receiver can decode. When the smaller receiver receives a block of Neq bits, it treats it as Bs blocks of Ns bits, each encoded with a block length Ns turbo code. It does not implement the second decoder stage and discards the parity bits from the third constituent code.
The larger receiver, on the other hand, implements both stages of the decoder and achieves a lower bit error rate. In this way, the performance of a broadcast system is not limited by the most constrained receiver.

6.2.2 Video Communications

We have two competing interests in real-time video communication systems. First, we want low latency. Real-time video systems assign each frame a deadline. If the decoder cannot successfully decode a frame by its deadline, then the system cannot display that frame. Second, we want low bit error rates. Each bit error can cause the loss of several frames, especially when using MPEG compression [15]. An MPEG-encoded video sequence consists of I-, P-, and B-pictures. According to the MPEG standard: I-pictures are coded independently, entirely without reference to other pictures. P- and B-pictures are compressed by coding the differences between the picture and reference I- or P-pictures, thereby exploiting the similarities from one picture to the next [15]. In other words, there are dependencies between frames. If an I- or P-picture is lost, every dependent frame is also lost. It is therefore important to receive I- and P-pictures correctly. Unfortunately, we may not be able to use a large block length turbo code because of its high latency. MPCCs help solve this problem. For each received frame, we initially decode using only the first stage of the decoder. We select block lengths for the two small constituent codes so that the first decoder stage can finish decoding each frame before its deadline. Whenever the first decoder stage is successful, we display the frame immediately. When the first stage is unsuccessful, however, we pass the frame to the second stage to correct the remaining errors. Although we may not be able to display that particular frame in time, we can avoid losing dependent frames.

6.3 Future Work

This section outlines several areas for future work on MPCCs.
The first way to extend MPCCs is to explore different combinations of constituent code sizes. The algorithms outlined in Chapter 4 generalize to any combination of block sizes. For example, it would be interesting to add a fourth constituent code with a block length Nll such that Nll > Nl. The decoder for this code would have three stages. The first would decode with the two smaller codes, the second with the first three codes, and the third with all four. Alternatively, one could add a fourth constituent code with block size Nl. This change might increase or decrease the efficiency of the second decoder stage, and it might affect the ideal value of ζ.

Second, this thesis studied only rate-1/2 MPCCs. It would be interesting to look at both higher and lower rate MPCCs. Higher rate codes would have less flexibility to allocate parity bits among the different constituent codes, and lower rate codes would have more. The effect of code rate on latency, complexity, and performance is a subject for future study.

Third, we can explore trellis coded modulation (TCM) using MPCCs. TCM improves bit error rates by combining error correction and modulation [3]. Using a two-stage MPCC demodulator might bring some of the same benefits as using a two-stage MPCC decoder.

Last, a hardware implementation of MPCCs would allow an investigation of MPCC behavior at lower bit error rates. We were unable to detect the error floor for our proposed code because it lies below the lowest error rate our software simulations can measure. Hardware systems have much higher throughput and can measure error rates of 10^-8 and below.

Appendix A

Discrete-Time AWGN Channels

In this appendix, we derive the relationship between the input and output of a discrete-time AWGN channel based on our model for continuous-time AWGN channels.
To begin our discussion, we recall three things:

- the input and output of a continuous-time AWGN channel are the continuous-time signals x(t) and y(t);
- the signal x(t) is produced by a sequence of bits x that come from the source encoder; and
- the output of the demodulator is a sequence of soft decisions y that are estimates for each transmitted bit.

We define the input and output of the discrete-time AWGN channel to be the vectors x and y and wish to determine their relationship. We follow here the reasoning suggested by Haykin [9].

We start by determining how y(t) relates to x(t). The output signal y(t) is generated by adding a Gaussian white noise process, N(t), to the input signal. The process N(t) is a zero-mean, stationary, Gaussian random process with autocorrelation function:

    K_NN(τ) = (N_0/2) δ(τ)                                   (A.1)

This function corresponds to a constant power spectral density:

    S_NN(f) = N_0/2

The output can now be represented as a function of the input:

    y(t) = x(t) + N(t).                                      (A.2)

[Figure A-1: Additive White Gaussian Noise channel: y(t) is formed by adding N(t) to x(t).]

Figure A-1 shows the system described by Equation A.2.

We use binary phase shift keying (BPSK) to demonstrate the simple relationship between the vectors x and y. First, we look at what a BPSK signal looks like. A BPSK signal is a superposition of time-shifted pulses, with each pulse representing a single bit. These pulses are all based on a single prototype pulse p(t) and are spaced a fixed distance apart in time, T. Each pulse's amplitude, a_i, is determined by the value of its corresponding bit, x_i. In BPSK, we let:

    a_i = x_i

Given a_i, each modulated pulse has the same form: a_i p(t - iT). When selecting a specific prototype pulse for a system, we are interested in minimizing intersymbol interference. We would ideally like each received symbol, y_i, to depend only on the corresponding transmitted bit, x_i.
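The pulse-superposition view of BPSK above can be illustrated numerically. The sketch below builds x(t) = Σ_i a_i p(t - iT) on a sampling grid and adds sampled white noise per Equation A.2. The sampling rate fs, the noise discretization (per-sample variance (N_0/2)·fs), and the rectangular test pulse used in the example are assumptions made for illustration, not choices from the text.

```python
import numpy as np

def bpsk_waveform(symbols, pulse, T, fs):
    # x(t) = sum_i a_i p(t - iT), sampled at rate fs.
    # `symbols` are assumed already antipodal (a_i = x_i in {-1, +1}).
    a = np.asarray(symbols, dtype=float)
    t = np.arange(len(a) * T * fs) / fs
    x = np.zeros_like(t)
    for i, ai in enumerate(a):
        x += ai * pulse(t - i * T)
    return t, x

def awgn(x, N0, fs, rng=None):
    # Sampled white noise with two-sided PSD N0/2 has per-sample
    # variance (N0/2) * fs under this discretization assumption.
    rng = rng if rng is not None else np.random.default_rng(0)
    return x + rng.normal(0.0, np.sqrt(N0 / 2.0 * fs), size=x.shape)
```

With a pulse confined to one symbol interval there is no intersymbol interference by construction; the sinc pulse discussed next achieves the same property while overlapping in time.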
Given a pulse spacing of T, there is a constraint we can place on the pulses, called the Nyquist criterion, that eliminates intersymbol interference. The criterion requires:

    ∫ p(t) p(t - iT) dt = E_s δ_i    for each integer i

where E_s is the energy per channel symbol and δ_i equals 1 when i = 0 and 0 otherwise. In other words, pulses spaced in time by an integer multiple of T are orthogonal to each other. An example of a pulse that meets this criterion is the sinc pulse:

    p(t) = √(E_s/T) sinc(t/T)

Now, the pulse for each bit x_i becomes:

    a_i √(E_s/T) sinc((t - iT)/T).

The modulator combines these individual pulses to form the modulated signal. The signal x(t) at the output of the modulator is the superposition of these pulses. We define x(t) in terms of the sequence of pulse amplitudes a:

    x(t) = Σ_i a_i √(E_s/T) sinc((t - iT)/T)

Once this combined signal is transmitted, it is corrupted by the AWGN channel to form y(t). Because the noise has a constant PSD, the optimal receiver is a filter matched to the transmitter pulse shape. The receiver therefore determines the vector y by passing the signal y(t) through a matched filter and sampling at t = iT:

    y_i = (1/E_s) ∫ y(t) p(t - iT) dt

When we condition y on the transmitted sequence x, the vector y is a linear functional of the Gaussian random process y(t). Each element of y is therefore a Gaussian random variable. Each of these Gaussian random variables is completely defined by its mean and variance. We know that the received symbols are independent because the pulse shape satisfies the Nyquist criterion. We also know that N(t) is zero-mean and has the autocorrelation function described in Equation A.1. Combining these observations, we find the mean and variance for each received symbol:

    μ(y_i) = a_i = x_i                                       (A.3)

    σ²(y_i) = N_0 / (2 E_s)                                  (A.4)

The relationship between x and y therefore takes the following simple form:

    p_{y_i|x_i}(y_i | x_i) = 𝒩(x_i, N_0/(2E_s)) = √(E_s/(π N_0)) exp(-E_s (y_i - x_i)² / N_0)

where 𝒩(μ, σ²) is a Gaussian distribution with mean μ and variance σ².

Bibliography

[1] L. R. Bahl, J. Cocke, F. Jelinek, and J.
Raviv. Optimal decoding of linear codes for minimizing symbol error rate. IEEE Transactions on Information Theory, 20(2):284-287, March 1974.

[2] C. Berrou and A. Glavieux. Near optimum error correcting coding and decoding: Turbo Codes. IEEE Transactions on Communications, 44(10):1261-1271, October 1996.

[3] E. Biglieri, D. Divsalar, P. J. McLane, and M. K. Simon. Introduction to Trellis-Coded Modulation with Applications. Macmillan Publishing Company, New York, 1991.

[4] D. Divsalar and F. Pollara. Multiple Turbo Codes for deep-space communications. The Telecommunications and Data Acquisition Progress Report, 42(121), May 1995.

[5] S. Dolinar and D. Divsalar. Weight distributions for Turbo Codes using random and nonrandom permutations. The Telecommunications and Data Acquisition Progress Report, 42(122), August 1995.

[6] S. Dolinar, D. Divsalar, and F. Pollara. Code performance as a function of block size. The Telecommunications and Mission Operations Progress Report, 42(133), May 1998.

[7] W. Feng, J. Yuan, and B. Vucetic. A code-matched interleaver design for Turbo Codes. IEEE Transactions on Communications, 50(6):926-937, June 2002.

[8] J. Han and O. Y. Takeshita. On the decoding structure for multiple Turbo Codes. In Proceedings IEEE International Symposium on Information Theory, page 98, June 2001.

[9] S. Haykin. Digital Communications. John Wiley and Sons, New York, 1988.

[10] C. Heegard and S. Wicker. Turbo Coding. Kluwer Academic Publishers, Boston, 1999.

[11] S. Lin and D. J. Costello. Error Control Coding: Fundamentals and Applications. Prentice Hall, Inc., Englewood Cliffs, 1983.

[12] P. C. Massey and D. J. Costello. New low-complexity turbo-like codes. In Proceedings of the IEEE Information Theory Workshop, pages 70-72, September 2001.

[13] A. Matache, S. Dolinar, and F. Pollara. Stopping rules for turbo decoders.
The Telecommunications and Mission Operations Progress Report, 42(142), August 2000.

[14] G. H. Mealy. A method for synthesizing sequential circuits. Bell System Technical Journal, 34(5):1045-1079, September 1955.

[15] J. L. Mitchell, W. B. Pennebaker, C. E. Fogg, and D. J. LeGall. MPEG Video Compression Standard. Chapman and Hall, New York, 1996.

[16] L. Perez, J. Seghers, and D. J. Costello. A distance spectrum interpretation of Turbo Codes. IEEE Transactions on Information Theory, 42(6):1698-1709, November 1996.

[17] J. Proakis. Digital Communications. McGraw-Hill, Inc., New York, 1995.

[18] P. Robertson, P. Hoeher, and E. Villebrun. Optimal and sub-optimal maximum a posteriori algorithms suitable for turbo decoding. European Transactions on Telecommunications and Related Technologies, 8(2):119-125, March-April 1997.

[19] C. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27(3):379-423, July 1948.

[20] A. J. Viterbi. An intuitive justification and a simplified implementation of the MAP decoder for convolutional codes. IEEE Journal on Selected Areas in Communications, 16(2):260-264, February 1998.

[21] B. Vucetic and J. Yuan. Turbo Codes: Principles and Applications. Kluwer Academic Publishers, Boston, 2000.