A Comparative Analysis of Physical-Layer Rateless Coding Architectures

by

David Luis Romero

B.S. Electrical Engineering, New Mexico State University, 2009

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering at the Massachusetts Institute of Technology, June 2014.

© Massachusetts Institute of Technology 2014. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, May 21, 2014

Certified by: Gregory W. Wornell, Professor of Electrical Engineering and Computer Science, Thesis Supervisor

Certified by: Dr. Adam R. Margetts, Technical Staff, MIT Lincoln Laboratory, Thesis Supervisor

Accepted by: Leslie A. Kolodziejski, Chairman, Department Committee on Graduate Students

A Comparative Analysis of Physical-Layer Rateless Coding Architectures
by David Luis Romero

Submitted to the Department of Electrical Engineering and Computer Science on May 21, 2014, in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering.

Abstract

An analysis of rateless codes implemented at the physical layer is developed. Our model takes into account two aspects of practical communication system design that are abstracted away in many existing works on the subject. In particular, our model assumes that: (1) practical error detection methods are used to determine when to terminate decoding; and (2) performance and reliability as observed at the transport layer are the metrics of interest. Within the context of these assumptions, we then evaluate two recently proposed high-performing rateless codes. Using our analysis to guide an empirical study, the process of selecting the best rateless code for a given set of system constraints is illustrated.

Thesis Supervisor: Gregory W. Wornell
Title: Sumitomo Professor of Engineering

Thesis Supervisor: Dr. Adam R. Margetts
Title: Technical Staff, MIT Lincoln Laboratory

Acknowledgments

I would like to express my utmost gratitude to my advisor, Professor Gregory W. Wornell, whose deep insight and broad expertise guided me throughout the course of this work, and without whom this thesis would not have been possible. I would also like to thank Dr. Adam R. Margetts, whose generous sharing of expertise and time has benefited me tremendously since before the beginning of this work. A special thank you is reserved for Professor Uri Erez, whose expertise and assistance are reflected in many parts of this work.

I would also like to acknowledge the generous support provided to me by the MIT Lincoln Laboratory Lincoln Scholars fellowship, which made my graduate study possible.

Finally, I would like to acknowledge the support and encouragement that I received from my family and friends while completing this work, and, moreover, extend immense gratitude to my Father and Mother, who showed me how to continue moving forward when faced with difficulty.

"Once upon a time they used to represent victory as winged. But her feet were heavy and blistered, covered with blood and dust."
Ilya Ehrenburg, The Fall of Paris

Contents

1 Introduction  13
  1.1 Background  15
2 Preliminaries  17
  2.1 Notation  17
  2.2 System Model  22
3 System Analysis  29
  3.1 Error Detection Model and Analysis  30
  3.2 Message Error Probability  36
  3.3 Approximate Decoding Error Probability  41
  3.4 Packet Error Probability  44
  3.5 Packet Throughput  48
  3.6 Constrained System Models  57
4 System Design and Practical Rateless Codes  63
  4.1 Layered Codes  63
    4.1.1 Decoding Errors Under Successive Decoding  65
  4.2 Spinal Codes  69
  4.3 Design of Constrained Systems  71
    4.3.1 Latency Constrained System Design  71
    4.3.2 Reliability Constrained System Design  83
    4.3.3 Short Packet Lengths  87
5 Discussion and Conclusion  95
A Layered System Architectures  99
B Design Details  101
  B.1 Layered Rateless Code Design  101
    B.1.1 Choosing ε_tgt  101
    B.1.2 Mean Squared Error Estimate of Effective SNR  102
  B.2 Complexity of Practical Rateless Codes  103
    B.2.1 Layered Rateless Codes  104
    B.2.2 Spinal Codes  108
    B.2.3 Simulation Aided Complexity Analysis  111

List of Figures

2-1 System model  23
2-2 Illustration of the structure of a packet.  26
2-3 Illustration of the structure of a rateless codeword.  27
3-1 Fraction of throughput dedicated to information bits, ρ_t/R with R = 1/3, as a function of undetected error probability, p_ue, for various information block lengths.  35
3-2 Single decoding attempt probability of error example, k = 256, γ_db = 0 dB.  43
3-3 Upper bound on probability of message error, k = 1024, γ_db = 5 dB.  44
3-4 Numerical examples of upper and lower bounds on the overall packet error probability as a function of k, γ_db = 1 dB.  47
3-5 Bounds on throughput, γ_db = 1 dB.  61
3-6 Bounds on throughput, γ_db = 10 dB.  62
4-1 Packet error probability upper bounds for UDP design example, γ_db = 1 dB.  73
4-2 Packet error probability bounds for UDP design example, γ_db = 1 dB.  74
4-3 Throughput upper bounds and simulation results for layered, and spinal codes for UDP design example, γ_db = 1 dB.  77
4-4 Packet error probability upper bounds and simulation results for layered, and spinal codes for UDP design example, γ_db = 1 dB.  78
4-5 Throughput upper bounds and simulation results for layered, and spinal codes for UDP design example, γ_db = 8 dB.  79
4-6 Packet error probability upper bounds and simulation results for layered, and spinal codes for UDP design example, γ_db = 8 dB.  80
4-7 Throughput upper bounds and simulation results for layered, and spinal codes for TCP design example, γ_db = 1 dB.  85
4-8 Throughput upper bounds and simulation results for layered, and spinal codes for TCP design example, γ_db = 8 dB.  86
4-9 Packet error probability bounds for UDP design example with short packet length, γ_db = 1 dB.  88
4-10 Throughput upper bounds and simulation results for layered, and spinal codes for UDP design example with short packet length, γ_db = 1 dB.  90
4-11 Packet error probability upper bounds and simulation results for layered, and spinal codes for UDP design example with short packet length, γ_db = 1 dB.  91
4-12 Throughput upper bounds and simulation results for layered, and spinal codes for UDP design example with short packet length, γ_db = 8 dB.  92
4-13 Packet error probability upper bounds and simulation results for layered, and spinal codes for UDP design example with short packet length, γ_db = 8 dB.  93
A-1 Three layer communication system architecture.  100
B-1 Throughput efficiency for M = L = 7 layered codes vs SNR. Various choices of ε_tgt are shown.  102
B-2 MSE and measured effective SNR when attempting to decode layer 7.  103
B-3 Average receiver operations per goodbit for spinal and layered rateless codes.  112

List of Tables

2.1 Deterministic Scalar Quantities  18
2.2 Events  19
2.3 Random Scalars, Vectors, and Matrices  20
2.4 Layered Code Parameters  21
2.5 Spinal Code Parameters  21
3.1 Summary Of Basic Error Events  31
4.1 Simulation Configuration, k = {256, 512, 1024}  76
4.2 Spinal Code Throughput Comparison, k = 1024, γ_db = 1 dB  81
4.3 Receiver Ops. Per Goodbit, k = 1024, γ_db = 1 dB  82
4.4 Short Packet Length Simulation Configuration, k = {64, 128, 256}  89
B.1 Real Arithmetic Operations Required To Compute And Apply UMMSE Combining Weights For Layered Rateless Code.  106
B.2 Real Operations Required By The Turbo Decoder To Decode The Layered Rateless Code  108
B.3 Real Operations Required By a Spinal Code Receiver.  110

Chapter 1

Introduction

Rateless codes are forward error correcting codes for which the number of encoded symbols used to communicate a given message varies with the channel.
This is in contrast to traditional fixed rate codes, which attempt to communicate using a fixed number of encoded symbols and are not capable of dynamically adapting to each channel realization. The highly adaptive nature of rateless codes makes them an attractive choice for systems that operate where the communication channel is unknown, or varies over time in an unknown manner.

A number of methods exist that can be used to enable a system to adapt to uncertainties in a channel. In a broad sense, all of these methods fall into the category of variable rate codes.¹ One sub-class of such codes is based on shortening a block of information symbols, which are subsequently encoded into a fixed length codeword [1]. Another sub-class of variable rate codes is designed to encode a fixed number of information symbols into a variable length codeword using either code extension, or puncturing and incremental transmission [2]. Still other variable rate codes use some combination of shortening, and extension or puncturing [3]. In this thesis, we consider only variable rate codes that encode a fixed number of information symbols into a variable length codeword, and refer to these as rateless codes.

¹ Certain classes of these codes are sometimes referred to as rate compatible codes.

Rateless codes have long been of interest to the coding community. After practical constructions were successfully applied to erasure channel models [4], an interest in developing rateless codes for noisy, physical layer channel models emerged. It has since been demonstrated that practical constructions of such codes are possible, and can achieve good performance [5, 6, 7, 8].

Typical analyses of rateless codes appearing in the literature focus on performance as observed at the physical layer (PHY) and medium access control layer (MAC) of the system (see Appendix A). Additionally, many existing analyses assume a model in which the receiver behaves as if knowledge of the transmitted message is available after decoding, enabling the decoder to detect with probability 1 whether decoding decisions are valid. Such perspectives are useful for distinguishing code performance from other characteristics of the system. However, once a set of candidate rateless codes with good performance has been identified, a system designer must evaluate their performance when coupled within the system. Such an evaluation must take into account performance and reliability as observed at the application layer of the system, as these metrics most closely affect user experience. Furthermore, systems must utilize an error detection scheme that provides some imperfect level of reliability, as a "genie" aided receiver is not possible in practice.

The above considerations must be taken into account when designing any communication system. However, when considering rateless codes it is particularly important to take the effects of imperfect error detection into account, as decoding and error detection are typically executed repeatedly for each message, leading to an aggregate probability of undetected error that is substantially higher than in the case of fixed rate codes. Additionally, because application level performance and reliability are impacted by protocols that operate within the transport layer of the system (see Appendix A), it is important to consider such protocols when undertaking a detailed system analysis.
In this thesis, we analyze the performance and reliability of rateless codes with the goal of contributing a more detailed system analysis than what has typically appeared previously in the literature. While previous works have abstracted away certain details to produce useful coarse-grained analyses of rateless codes, we develop a finer-grained analysis by considering a model employing fewer analytical simplifications. In particular, we analyze performance and reliability from the perspective of the transport layer of the system, while assuming use of a practical error detection method.

Our analysis is executed in two phases. First, a model of rateless codes is presented and analyzed. The model is anchored at the transport layer of the system, and is used to capture the dominant effects of a practical system employing such codes. The analysis yields a set of bounds on the throughput and probability of error that one can expect to observe at the application layer of the system, given a set of design parameters. Second, simulation of two high performing practical rateless codes is used, along with our analysis, to illustrate how a system designer can select an appropriate rateless code given a set of constraints on the system. The primary goal of this thesis is to provide insight and practical examples of interest to system designers who are considering the use of rateless codes.

The remainder of the thesis is organized as follows. In the following section, a brief summary of previous work on rateless codes is provided. In Chapter 2, notation is defined, and a detailed description of the basic system model for rateless codes is given. In Chapter 3, a basic system model of rateless codes is analyzed. In Chapter 4, layered and spinal codes are described, and Monte Carlo simulation results are presented within the context of illustrative design examples. Finally, the thesis concludes with a brief discussion in Chapter 5.

1.1 Background

Several fundamentally different approaches to rateless coding have been introduced over time. One notable example is automatic repeat request (ARQ) systems. In an ARQ system, the transmitter resends previous messages upon receiving feedback from the receiver that a codeword was decoded in error. The receiver may use some method of combining the repeated transmissions. Other examples are hybrid-ARQ (HARQ) systems, which encode messages using a forward error correction (FEC) code, then send a subset of the parity symbols, only sending additional parity symbols upon request from the receiver. HARQ systems have been practically deployed, for example in the current LTE standard [9].

A more sophisticated approach to rateless coding is that of fountain codes, which can, in principle, generate an infinite stream of distinct encoded symbols to be sent across the channel. A notable example of fountain codes designed for erasure channel models are the Raptor codes of [10]. Raptor codes were later adapted to physical layer channel models in [5]. Spinal codes [7], which are one of the two practical rateless codes considered in this thesis, fit into the fountain paradigm, and have been shown to perform well when applied to physical layer channel models. Another approach suited to physical layer channel models is to construct a rateless code using layering and repetition of a set of base codebooks. This is the approach taken for the layered codes of [6], which are the other of the two practical codes considered in this thesis.
In general, spinal codes have been shown to perform well for short information block lengths, while layered codes employing turbo codes for the base codebook have been shown to perform well at long information block lengths. Because of their demonstrated performance in different information block length regimes, it is of interest to compare and contrast the performance of the two codes over the span of short and long information block lengths.

Chapter 2

Preliminaries

2.1 Notation

Events and sets are denoted using calligraphic characters (A). The complement of event (or set) A is denoted A^c. The cardinality of a set, A, is denoted |A|, while the probability of event A is denoted P(A). Random quantities are denoted using lower case or upper case sans serif characters (b, or B, respectively), whereas deterministic quantities appear as either lower case or upper case serifed characters (b, or B, respectively). Vectors (deterministic or random) are denoted using lower case, boldface characters (v if deterministic, v if random), while matrices are denoted using upper case, boldface characters (G if deterministic, G if random).

Deterministic scalar quantities, events, and random scalars, vectors, and matrices are summarized in Tables 2.1, 2.2, and 2.3, respectively. Parameters that are particular to the layered and spinal rateless codes, which are evaluated in this thesis, are defined in Tables 2.4 and 2.5, respectively. Though some quantities such as vectors and events are initially defined using indices and subscripts, if there is no risk of ambiguity or confusion, such indices and subscripts are dropped during exposition for convenience.

Table 2.1: Deterministic Scalar Quantities

  Symbol                    Description
  k                         Number of information bits encoded into a single variable rate codeword.
  b                         Number of bits dedicated to error detection for each k bit message.
  t                         Number of k - b bit subpackets contained in one packet.
  n                         Number of channel uses.
  n_i                       Cumulative number of channel uses corresponding to a single message after i decoding attempts.
  n_IRU                     Number of channel uses observed between consecutive decoding attempts; i.e., the length of an incremental redundancy unit (IRU).
  m                         Maximum number of IRUs (i.e., decoding attempts) corresponding to one message.
  n_trials                  Number of Monte Carlo simulation trials.
  p_ue \triangleq P(P|E)    Probability of false CRC pass given a decoding error.
  P_l                       Overall error probability for latency constrained cases.
  P_r                       Overall error probability for reliability constrained cases.
  C                         Shannon channel capacity.
  C_fb                      Capacity of feedback channel.
  R                         System throughput.
  R_l                       Throughput for latency constrained cases.
  R_r                       Throughput for reliability constrained cases.
  ρ_t                       Instantaneous throughput.
  γ                         Average channel signal to noise ratio, linear units.
  γ_db                      Average channel signal to noise ratio, decibel units.
  n_tx                      Maximum number of packet transmissions dedicated to one packet.

Table 2.2: Events

  Symbol/Definition                                                                       Description
  E_i = E(n_i)                                                                            Decoding error given n_i channel uses.
  P_i = P(n_i)                                                                            Checksum (CRC) pass event.
  Z_i^{und} = Z^{und}(n_i) \triangleq E_i \cap P_i                                        Undetected decoding error given n_i channel uses.
  Z_i^{det} = Z^{det}(n_i) \triangleq E_i \cap P_i^c                                      Detected decoding error given n_i channel uses.
  S_i = S(n_i) \triangleq E_i^c \cap P_i                                                  Detected decoding success given n_i channel uses.
  E_l^{und} \triangleq \bigcup_{i=1}^{m} (Z_i^{und} \cap \bigcap_{j=1}^{i-1} Z_j^{det})   Undetected decoding error, l-th codeword.
  D_l \triangleq E_l^{und} \cup (\bigcap_{i=1}^{m} Z_i^{det})                             Message error event, l-th message.
  F \triangleq \bigcup_{l=1}^{t} D_l                                                      Packet error event given a packet of t messages.
Table 2.3: Random Scalars, Vectors, and Matrices

  Symbol/Definition   Description
  p                   Packet containing t length k - b bit subpackets.
  u_l                 l-th subpacket of length k - b bits.
  m_l                 l-th message of length k bits, passed to the encoder.
  m̂_l                 Decoding decision corresponding to the l-th message in a given packet.
  x^{n_m}             Length n_m vector of encoded symbols.
  x_i^{n_i}           The i-th IRU, a length n_i - n_{i-1} (n_i ≤ n_m) vector of encoded symbols.
  y^{n}               Length n vector of noisy received symbols.
  y_i^{n_i}           The i-th noisy received IRU, a length n_i - n_{i-1} (n_i ≤ n_m) vector of noisy received symbols.
  w^{n}               Length n vector containing samples of the noise process.
  w_i^{n_i}           Length n_i - n_{i-1} vector of noise samples corresponding to the i-th IRU, where n_i ≤ n_m.
  T                   Number of packet transmissions before packet decoding success.
  N_l^*               Number of channel uses dedicated to the l-th codeword prior to terminating decoding.

Table 2.4: Layered Code Parameters

  M       Maximum number of redundancy blocks for layered rateless codes.
  L       Number of layers corresponding to one layered rateless codeword.
  G       Complex layer combining matrix.
  ε_tgt   Target bit error rate used to design G.
  r_b     Rate of the base code when the complex transmit symbol and FEC code rate are taken into account.
  N       Length of each turbo base code, in channel uses.

Table 2.5: Spinal Code Parameters

  k'      Number of message bits (out of k) which make up one hash key in a spinal encoder.
  v       Number of bits which specify each spinal value.
  c'      The alphabet of complex transmit symbols has cardinality 2^{2c'}; i.e., c' bits are mapped to each of the real and imaginary parts of the symbol.
  B       Beamwidth of the spinal decoder.
  s_p     Number of passes over the spine.

2.2 System Model

Our analysis assumes a discrete time baseband forward channel model that is corrupted by additive white Gaussian noise (AWGN). Rateless coded communication takes place over the forward channel, which is characterized by its Shannon capacity, C, in units of bits per channel use, where a channel use is defined as the unit of time required to send one complex symbol over the forward channel. Note that the forward channel can equivalently be characterized by its average signal to noise ratio (SNR), γ_db, in units of decibels.

A feedback channel is used to inform the transmitter when decoding success (or failure) has been detected by the decoder. The feedback channel can be used as frequently as once per forward channel use. The transmitter uses feedback information to determine whether to begin transmitting the next message, or to send additional encoded symbols so that the receiver can again attempt to decode. The simple acknowledgment/negative acknowledgment (ACK/NACK) feedback scheme considered in this thesis implies that one bit of information per use of the feedback channel is required. The feedback channel is characterized by its capacity, C_fb. Throughout this thesis, it is assumed that the feedback channel has zero delay, and C_fb ≥ 1 bit per channel use.

It is common when analyzing rateless codes (e.g. [7, 8]) to assume a genie aided system. Perfect error detection is possible under this assumption, as it enables the decoder to compare the estimated and true messages, thus allowing the decoder to detect with probability 1 whether a decoding decision is correct. When a genie aided system is not assumed (which is the configuration of interest in this thesis), it is common to dedicate a number of the information bits that make up each message to error detection. If b out of k message bits are used for error detection, then these b bits function as a "checksum" that is used to verify the integrity of the decoding decision (i.e., the remaining k - b bits) with some level of reliability. While other methods of error detection that do not require a b bit overhead are possible under certain rateless coding schemes,¹ these alternative methods are not considered in this thesis because they tend to be coupled to the particular type of rateless code under consideration. In many practical systems, a b bit cyclic redundancy check (CRC) is used for error detection. Because it is ubiquitous, and is not coupled to a particular type of rateless code, this method is assumed for the remainder of this thesis.

¹ For example, error detection for the incremental redundancy rateless coding scheme presented in [8] can be accomplished using the syndrome of the LDPC code. As another example, rateless codes employing iterative decoders can leverage soft reliability metrics for reliability based error detection [11].
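As a concrete illustration of this kind of checksum, the short Python sketch below appends a CRC to a (k - b)-bit subpacket and re-checks it at the receiver. It is only a minimal sketch: it uses the 32-bit CRC provided by Python's binascii module (so b = 32 here), whereas the thesis leaves the CRC length b and polynomial as design parameters.

```python
import binascii
import os

B = 32  # CRC length in bits; binascii.crc32 is a convenient stand-in for a generic b-bit CRC

def append_crc(subpacket: bytes) -> bytes:
    """Form a k-bit message: (k - b) information bits followed by a b-bit CRC."""
    crc = binascii.crc32(subpacket).to_bytes(B // 8, "big")
    return subpacket + crc

def check_crc(message: bytes) -> bool:
    """Return True if the trailing CRC matches the leading information bits."""
    info, crc = message[:-(B // 8)], message[-(B // 8):]
    return binascii.crc32(info).to_bytes(B // 8, "big") == crc

# Transmitter side: protect one (k - b)-bit subpacket.
u = os.urandom(128)      # 1024 information bits
m = append_crc(u)        # k = 1024 + 32 bits handed to the rateless encoder

# Receiver side: a decoding decision either passes or fails the check.
assert check_crc(m)                            # correct decision: detected success
corrupted = bytes([m[0] ^ 0x01]) + m[1:]
assert not check_crc(corrupted)                # typical erroneous decision: detected error
```

An erroneous decision can still pass the check with small probability, which is the undetected-error mechanism analyzed in Chapter 3.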
In terms of functional units, the system consists of one transmitter and one receiver. The transmitter consists of a packet queue, a segmentation/queueing unit, a CRC encoder, and a rateless encoder. The receiver is made up of a rateless decoder, a CRC decoder, a reassembly/buffer unit, and a packet buffer. A functional block diagram of the system is illustrated in Figure 2-1.

Figure 2-1: System model. (The diagram shows the transmitter chain of packet queue, segmentation/queueing unit, CRC encoder, and rateless encoder; the forward and feedback channels; and the receiver chain of rateless decoder, CRC decoder, reassembly/buffer unit, and packet buffer.)

It is assumed that the packet queue contains an endless stream of transport layer packets of information bits. Each packet is denoted p, and is made up of t subpackets, each of which is made up of k - b bits. Hence, each packet consists of t(k - b) bits. It is assumed that b' out of the t(k - b) bits in each packet form a b' bit CRC which is used to verify the integrity of the decoding decisions after all such decisions corresponding to a given packet have been made. Note that transport layer packets model data units that are passed from the transport layer to the MAC and PHY layers of a communication system (see Appendix A).

We next consider the operation of the system, which can be described by the following sequence of events.

1. A large packet of bits, p, is segmented into t distinct subpackets, each denoted u_l for l = 1, 2, ..., t, where each u_l contains k - b bits.

2. A b bit CRC is computed and appended to each u_l, resulting in t distinct k bit messages, each denoted m_l.

3. The first message, m_1, is passed to the rateless encoder, which maps m_1 to x^{n_m} = {x_1, x_2, ..., x_{n_m}}, where each scalar x_j ∈ x^{n_m} is a complex encoded transmit symbol that costs one channel use to send over the channel. Note that for a rateless code, in principle, n_m could be infinite.

4. The first incremental redundancy unit (IRU), a subset of the encoded symbols x^{n_m}, namely x_1^{n_1} = {x_1, x_2, ..., x_{n_1}} ⊂ x^{n_m}, is sent over the channel.

5. After observing the channel output, the decoder makes a decision, denoted m̂_1, based on the received sequence y^{n_1} = x^{n_1} + w^{n_1}, where w^{n_1} is a length n_1 vector of complex baseband samples of a white Gaussian process.

6. The decoder computes a checksum for m̂_1 to verify the integrity of the decoding decision. If the checksum passes, an ACK is sent over the feedback channel, which signals to the transmitter that it can commence encoding and transmitting m_2.
If the checksum fails, a NACK is sent over the feedback channel,² causing the transmitter to send the second IRU, x_2^{n_2}, over the forward channel.

7. If necessary, additional IRUs are transmitted, and steps 5-6 are executed after the receiver observes each IRU, until the checksum corresponding to m̂_1 passes, at which time the system begins executing Step 3 for the codeword corresponding to m_2.

8. Once all t message decisions, m̂_l, have been acquired by the receiver, the CRC bits are removed and the resulting subpacket estimates, û_l for l ∈ 1, 2, ..., t, are reassembled into an estimate of the transmitted packet, p̂ = [û_1, û_2, ..., û_t].

9. The integrity of p̂ is checked using the b' bit CRC checksum. If the checksum passes, the receiver informs the transmitter via the noiseless feedback channel, and the system starts the process over at Step 1 for the next packet in the queue. Otherwise, the packet is retransmitted, or dropped.

Note that all t of the m_l ∈ p must be correctly estimated, else the packet must be retransmitted or dropped, depending on the network protocol; a simplified sketch of the per-message transmit-and-feedback loop appears at the end of this section.

² Note that the lack of an ACK can be considered a NACK given an error free feedback channel.

Figure 2-2 illustrates the structure of a packet, and its relation to its constituent subpackets and messages. Figure 2-3 illustrates the structure of a rateless codeword and its relation to its constituent IRUs. Note that Figures 2-2 and 2-3 also help to elucidate some of the notation defined in Tables 2.1-2.3.

Figure 2-2: Illustration of the structure of a packet.

Figure 2-3: Illustration of the structure of a rateless codeword.
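The following Python sketch summarizes this loop for one message. It is a toy model, not an implementation of any particular rateless code: decoding is abstracted by a hypothetical rate threshold, the parameter values are illustrative, and an incorrect decision is allowed to slip past the checksum with probability p_ue, as in the analysis of Chapter 3.

```python
import random

def send_message(k=256, n_iru=4, n_max=8192, p_ue=2**-16, capacity=1.2, seed=0):
    """Toy model of steps 3-7: send IRUs until the CRC passes or n_max channel uses are spent.

    Decoding is abstracted: an attempt after n channel uses is assumed to succeed once the
    realized rate k/n falls below a hypothetical threshold tied to the channel capacity, and
    an incorrect decision still passes the CRC with probability p_ue.
    """
    rng = random.Random(seed)
    threshold = 0.9 * capacity               # hypothetical rate at which decoding starts to succeed
    n = 0
    while n < n_max:
        n += n_iru                           # transmitter sends one more IRU
        if k / n < threshold:
            return "detected_success", n     # CRC passes, ACK fed back, move on to the next message
        if rng.random() < p_ue:
            return "undetected_error", n     # erroneous decision slips past the CRC
        # otherwise the CRC fails: a NACK is fed back and another IRU is requested
    return "detected_failure", n             # give up after m = n_max / n_iru attempts

print(send_message())
```

In an actual system the threshold test would be replaced by the decoder of whichever rateless code is in use; the surrounding ACK/NACK logic is the part that the system model above is concerned with.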
Chapter 3

System Analysis

Many existing works on rateless channel codes [6, 7] are based on analysis and empirical results corresponding to performance as observed at the MAC and/or PHY layers of a communication system architecture (see Appendix A). Furthermore, many works assume that the decoder is omniscient, in that there exists no uncertainty pertaining to detecting when a correct decoding decision is made. While such analysis is useful for isolating code performance from other effects in the system, it can be inadequate for designers who are ultimately interested in the overall performance of such codes when they are integrated into the overall system architecture. Motivated by this fact, this chapter develops an analysis of rateless codes for the AWGN channel. Important features that differentiate this development from other works on rateless codes are:

1. Imperfect error detection is used at the receiver.

2. Performance and reliability as observed at the transport layer are the metrics of interest.

Integrating features 1) and 2) throughout, our development yields a model that is useful to system designers who are considering the tradeoffs involved in choosing between different rateless codes, and guides the selection of important design parameters (e.g., information block length, CRC length). The resulting set of equations and inequalities that represent the model can help guide the beginning stages of the design process. Rather than having to immediately resort to hours (or days) of simulation, our analytical model provides a compact representation of the performance and reliability that one can expect to observe given a rateless code, a set of design parameters, and system constraints. Throughout the analysis, important equations and inequalities are placed in rectangular boxes for emphasis.

3.1 Error Detection Model and Analysis

In this section, a CRC method of error detection is defined and analyzed. Our analysis shows that, given a b bit error detection CRC, the probability of undetected error conditioned on an incorrect decoding decision, denoted p_ue, can be approximated as:

    p_ue \approx 2^{-b}    (3.1)

Under rateless coding, several kinds of errors are possible. For example, say that a particular receiver has acquired a sequence of noisy observations corresponding to a transmitted rateless codeword made up of i IRUs (see Figure 2-3). The decoding error event, E_i (equivalently denoted E(n_i)), occurs if m̂_i ≠ m, where m̂_i is the tentative decoding decision given all information available at the receiver, and m is the message that was encoded at the transmitter. The probability of a decoding error occurring, P(E_i), is generally dependent upon the channel SNR, the block length n_i, and on the particular method of encoding and decoding that is used.

Another error that is possible under rateless coding (and, more generally, under any communication scheme) occurs when a decoder fails to detect an error in a tentative decoding decision. The event associated with error detection, P_i, occurs when the error detection check returns a positive result.¹ Under practical error detection methods, P(P_i) is dependent upon the particular method of error detection, and the bit error rate of the tentative decoding decision [12]. To simplify our analysis, we assume an error detection model that depends only on the method of error detection, and on whether the i-th decoding decision is correct. Error events corresponding to a codeword of length n_i channel uses, and their interactions and associated penalties, are summarized in Table 3.1.

¹ Note that this indicates either an undetected error, or a detected success.

Table 3.1: Summary Of Basic Error Events

             E_i                 E_i^c
  P_i        Undetected Error    Detected Success
  P_i^c      Detected Error      False Detected Error

When employing rateless coding, error events corresponding to a given message can be characterized by a sequence of events (and their complements) such as those listed in Table 3.1. Stated more formally in terms of our notation, error events for a given message, m, occur over a sequence of decoding attempts that are executed on a sequence of codewords of increasing lengths {n_1, n_2, ..., n_i}, where n_i is the length at which either E_i \cap P_i or E_i^c \cap P_i occurs. In principle, n_i can go to infinity, though in practice it is typically limited to some system dependent value n_m. For notational convenience, the undetected and detected error events are defined, respectively, for a length n_i codeword as:

    Z_i^{und} \triangleq E_i \cap P_i    (3.2)

    Z_i^{det} \triangleq E_i \cap P_i^c    (3.3)

Similarly, the event corresponding to a detected decoding success given a length n_i codeword is defined as:

    S_i \triangleq E_i^c \cap P_i    (3.4)

Assuming a maximum codeword length of n_m, the overall undetected error event, E_l^{und}, corresponding to the l-th message, where l ∈ 1, 2, ..., t, is defined as the following disjoint union:

    E_l^{und} \triangleq Z_1^{und} \cup (Z_2^{und} \cap Z_1^{det}) \cup ... \cup (Z_m^{und} \cap \bigcap_{j=1}^{m-1} Z_j^{det}) = \bigcup_{i=1}^{m} (Z_i^{und} \cap \bigcap_{j=1}^{i-1} Z_j^{det})    (3.5)
Next, the event Z_1^{und} = E_1 \cap P_1 is considered in order to gain basic insight into the performance-reliability tradeoff that exists for a rateless code. The undetected error event, Z_1^{und}, incurs a penalty on the overall probability of error of the system, as the receiver erroneously believes that m̂ = m. Note that the false detected error event in Table 3.1, E_1^c \cap P_1^c, is assumed to have probability 0.

Because the event Z_1^{und} is very detrimental from a system reliability standpoint, as it potentially allows many erroneous bits to pass through the MAC and PHY layers toward the user application, it is useful to first understand P(Z_1^{und}), the probability of an undetected error given a decoding error on a codeword of length n_1. Using elementary laws of probability, we rewrite the probability of undetected error as:

    P(Z_1^{und}) = P(E_1 \cap P_1) = P(P_1 | E_1) P(E_1)    (3.6)

Referring to the right side of (3.6), note that P(E_1) is a function of the channel SNR (which partially determines the probability of a decoding error), and of the particular rateless code that is used. From the perspective of analyzing a given error detection model, we are interested in P(P_1 | E_1), which depends only on the method of error detection, conditional on whether a decoding error has occurred. Note that the decomposition in (3.6) conveniently enables an analysis from this perspective.² For convenience, we define:

    p_ue \triangleq P(P_1 | E_1)    (3.7)

Note that in the case of the genie aided system discussed in Section 2.2, we have p_ue = 0. As stated previously, we consider systems that use a b-bit CRC for error detection. This method allows a designer to decrease p_ue by increasing b. Thus, as reliability increases, throughput decreases.

² It has been shown [12] that, for certain classes of cyclic redundancy check polynomials that can be used for error detection, the model used in this work for p_ue is accurate in the SNR regime where the bit error rate of the CRC codeword is high. This turns out to be a reasonable model for rateless coding, because many decoding attempts occur when the rateless codeword is much shorter than what is required to successfully decode, leading to a high bit error rate at the output of the decoder.

It is probabilistically possible for a CRC to pass when a decoding error has occurred (i.e., for a CRC collision to occur). This can be understood by thinking of a b-bit CRC as partitioning the set of possible binary messages, A, where |A| = 2^{k-b}, into a sequence of 2^b subsets, {A_j}, for j = 0, 1, ..., 2^b - 1. Each A_j contains all of the length k - b bit binary messages which collide with each other under the given b-bit CRC; i.e., |A_j| = 2^{k-2b} for each j = 0, 1, ..., 2^b - 1. Under the assumption that each erroneous m̂ is equiprobable given that a decoding error has occurred, the probability that a set of decoded message bits will erroneously pass the CRC can be expressed and upper bounded as:

    (|A_j| - 1) / (2^{k-b} - 1) = (2^{k-2b} - 1) / (2^{k-b} - 1) < 2^{-b}    (3.8)

where (3.8) follows from the fact that the ratio is less than 2^{-b} whenever 2^b > 1, or equivalently, when b \geq 1. A lower bound can be derived by considering the error in the upper bound as follows:

    2^{-b} - (2^{k-2b} - 1)/(2^{k-b} - 1) = ( 2^{-b}(2^{k-b} - 1) - (2^{k-2b} - 1) ) / (2^{k-b} - 1)
                                          = (1 - 2^{-b}) / (2^{k-b} - 1)    (3.9)
                                          < 1 / (2^{k-b} - 1)    (3.10)

where (3.9) is strictly greater than 0 for b > 0 and k > b, and (3.10) is true under the former condition on b. Using (3.8) and (3.10) we arrive at:

    2^{-b} - ε < p_ue < 2^{-b}    (3.11)

where ε = 1/(2^{k-b} - 1). Clearly, ε → 0 quickly for reasonable values of k - b (for example, if k = 256 and b = 24 we have ε ≈ 1.4489 × 10^{-70}). Thus, under the current model for error detection, we are justified in using the approximation p_ue ≈ 2^{-b}, which is given in (3.1).
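The collision bounds above are easy to verify numerically. The minimal Python sketch below evaluates the exact collision probability of (3.8) and the gap ε of (3.11) with exact rational arithmetic; the specific (k, b) pairs are illustrative choices, not prescriptions from the text.

```python
from fractions import Fraction

def collision_probability(k: int, b: int) -> Fraction:
    """Exact probability that an erroneous, equiprobable decision passes a b-bit CRC,
    i.e. (|A_j| - 1) / (2^(k-b) - 1) with |A_j| = 2^(k-2b), as in (3.8)."""
    return Fraction(2 ** (k - 2 * b) - 1, 2 ** (k - b) - 1)

for k, b in [(256, 24), (1024, 16), (1024, 32)]:
    p_exact = collision_probability(k, b)
    upper = Fraction(1, 2 ** b)                 # 2^{-b}, the approximation in (3.1)
    eps = Fraction(1, 2 ** (k - b) - 1)         # the gap term in (3.11)
    assert upper - eps < p_exact < upper        # the two-sided bound (3.11)
    print(f"k={k:5d} b={b:2d}  p_ue ~ 2^-b = {float(upper):.3e}  gap < {float(eps):.3e}")
```

For any reasonable k - b the gap is negligible, which is why the remainder of the analysis works with p_ue ≈ 2^{-b}.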
To make the throughput-reliability tradeoff concrete, consider the relationship between the instantaneous throughput, ρ_t \triangleq (k - b)/N^*, and p_ue. Using the approximation given in (3.1), ρ_t can be expressed as:

    ρ_t \approx ( k + \log_2(p_ue) ) / N^*    (3.12)

Note that, for a rateless code, the number of channel uses, N^*, required for the CRC to pass is a random variable. For illustration purposes, in this section we take N^* = n_1. Figure 3-1 illustrates the throughput-reliability tradeoff for several information block lengths (as labeled in the figure legend). Note that the sequence of curves is characterized by a slope that increases as the value of k is decreased, illustrating the performance cost associated with various levels of reliability.

Figure 3-1: Fraction of throughput dedicated to information bits, ρ_t/R with R = 1/3, as a function of undetected error probability, p_ue, for various information block lengths (legend: k = 256, 512, 1024, 6144).

3.2 Message Error Probability

In Section 3.1, it was illustrated that CRC based error detection provides a mechanism to control the reliability of decoding decisions. Furthermore, it was shown that each measure of reliability has some associated performance cost. A model for this method of error detection was presented, and the associated performance-reliability tradeoff was illustrated for the case of a single decoding attempt.

In the rateless setting, a sequence of decoding attempts, in which error detection is performed after each attempt, may be necessary. Hence, the effect of error detection on the overall reliability of each encoded message is more complex than was illustrated in the single decoding attempt case. In this section, the results developed in Section 3.1 are extended to characterize the overall reliability associated with transmitting a single message.

Before proceeding with a summary of the main results and corresponding analysis, let us define the primary event of interest. Let D_l denote the error event corresponding to the l-th of t messages that make up a transport layer packet. A message error event occurs if either an undetected decoding error occurs, or some agreed upon maximum number of detected decoding attempts, m, has been reached.³ Thus, D_l can be defined as:

    D_l \triangleq E_l^{und} \cup \bigcap_{i=1}^{m} Z_i^{det}    (3.13)

³ In this case, the transmitter and receiver may agree to "give up" on message l, or perhaps on the entire packet, depending on the application of interest. This issue will be addressed when we consider constrained system models in Section 3.6.

The main results developed in this section are as follows. We derive lower and upper bounds on the overall probability of undetected error corresponding to message l (as defined in (3.5)), which can be expressed as:

    P(E_l^{und}) \geq \sum_{i=1}^{m} p_ue (1 - p_ue)^{i-1} \prod_{j=1}^{i} P(E_j)    (3.14)
    P(E_l^{und}) \leq \sum_{i=1}^{m} p_ue (1 - p_ue)^{i-1} P(E_i)    (3.15)

The upper bound given in (3.15) is then used, along with a bound on the probability of the overall detected decoding failure event, P(\bigcap_{i=1}^{m} Z_i^{det}), to express the following upper bound on P(D_l), the overall probability of error associated with message l:

    P(D_l) \leq \sum_{i=1}^{m} p_ue (1 - p_ue)^{i-1} P(E_i) + (1 - p_ue)^{m} P(E_m)    (3.16)

To begin our analysis, we express the probability of the event (3.13) as:

    P(D_l) = P( E_l^{und} \cup \bigcap_{i=1}^{m} Z_i^{det} ) = P(E_l^{und}) + P( \bigcap_{i=1}^{m} Z_i^{det} )    (3.17)

where (3.17) uses the fact that the undetected decoding error event and \bigcap_{i=1}^{m} Z_i^{det} are disjoint (viz., (3.3), (3.5)).

The contribution of the second term on the right side of (3.17) is first considered. This term can be rewritten and upper bounded as:

    P( \bigcap_{i=1}^{m} Z_i^{det} ) = P(Z_m^{det}) \prod_{i=1}^{m-1} P( Z_i^{det} | \bigcap_{j=i+1}^{m} Z_j^{det} )    (3.18)
                                     \leq (1 - p_ue)^{m-1} P(Z_m^{det})    (3.19)
                                     = (1 - p_ue)^{m-1} P(P_m^c | E_m) P(E_m) = (1 - p_ue)^{m} P(E_m)    (3.20)

where (3.18) follows from the chain rule for joint probabilities, (3.19) follows from the fact that \prod_{i=1}^{m-1} P( Z_i^{det} | \bigcap_{j=i+1}^{m} Z_j^{det} ) \leq (1 - p_ue)^{m-1}, and (3.20) is based on the definition (3.7). To bring system operational insight into the sequence of expressions (3.18)-(3.20), consider the following: if the product due to the chain rule in (3.18) is collapsed, we get:

    \prod_{i=1}^{m-1} P( Z_i^{det} | \bigcap_{j=i+1}^{m} Z_j^{det} ) = P( \bigcap_{i=1}^{m-1} Z_i^{det} | Z_m^{det} )    (3.21)

which is the joint probability that the first m - 1 transmissions result in a sequence of detected decoding failures given that the m-th decoding attempt resulted in a detected decoding failure. Intuitively, if a message cannot be decoded using a codeword of length n_m, then there is a high probability that it could not be decoded at any length n_i < n_m, which suggests that (3.19) may be reasonably tight.

Given (3.20) and the discussion above, a preliminary upper bound on the overall message error probability can be expressed as:

    P(D_l) \leq P(E_l^{und}) + (1 - p_ue)^{m} P(E_m)    (3.22)

We next focus on the first term on the right side of (3.22), the overall probability of undetected error corresponding to the l-th of t messages associated with a given packet. The exact evaluation of this term is difficult, even for moderate m. However, in what follows we show that using a small amount of operational insight enables us to derive upper and lower bounds on P(E_l^{und}) that are easily evaluated. We begin by expressing P(E_l^{und}) as follows:

    P(E_l^{und}) = P( \bigcup_{i=1}^{m} ( Z_i^{und} \cap \bigcap_{j=1}^{i-1} Z_j^{det} ) ) = \sum_{i=1}^{m} P( Z_i^{und} \cap \bigcap_{j=1}^{i-1} Z_j^{det} )    (3.23)

where (3.23) follows from the fact that E_l^{und} is equal to the disjoint union in (3.5). From Equation (3.23), it can be seen that P(E_l^{und}) can be bounded by bounding each term in the summation. Hence, we expand each term as follows:

    P( Z_i^{und} \cap \bigcap_{j=1}^{i-1} Z_j^{det} ) = P( Z_i^{und} | \bigcap_{j=1}^{i-1} Z_j^{det} ) \prod_{j=1}^{i-1} P( Z_j^{det} | \bigcap_{k=j+1}^{i-1} Z_k^{det} )    (3.24)
        = p_ue P( E_i | \bigcap_{j=1}^{i-1} Z_j^{det} ) \prod_{j=1}^{i-1} (1 - p_ue) P( E_j | \bigcap_{k=j+1}^{i-1} Z_k^{det} )    (3.25)

where (3.24) follows from the chain rule for joint probabilities, and (3.25) follows from the definitions of Z_i^{und} and Z_i^{det}, along with the fact that, under the current error detection model, P(P_i | E_i \cap A) = P(P_i | E_i) for any arbitrary event A.

Now, consider the term P( E_i | \bigcap_{j=1}^{i-1} Z_j^{det} ), which is the conditional probability of a decoding error occurring on attempt i, given that detected decoding errors have occurred for the previous decoding attempts j = 1, 2, ..., i - 1 for the current codeword.
Using Bayes' rule, this conditional probability can be expressed as follows:

    P( E_i | \bigcap_{j=1}^{i-1} Z_j^{det} ) = P( \bigcap_{j=1}^{i-1} Z_j^{det} | E_i ) P(E_i) / P( \bigcap_{j=1}^{i-1} Z_j^{det} )    (3.26)

From an operational perspective, we argue that P( \bigcap_{j=1}^{i-1} Z_j^{det} | E_i ) \geq P( \bigcap_{j=1}^{i-1} Z_j^{det} ), because conditioning on a decoding error on the i-th decoding attempt clearly cannot decrease the probability that decoding attempts j = 1, 2, ..., i - 1 result in errors for a stationary channel. Stated in other words, for a fixed message size k, knowledge of future decoding errors does not decrease the probability of past decoding errors for a codeword which increases in length with each decoding attempt, as it does for rateless codes. Hence, it can be concluded that:

    P( E_i | \bigcap_{j=1}^{i-1} Z_j^{det} ) \geq P(E_i)    (3.27)

Now consider the terms in the product of (3.25). Following the same operational argument that was used to justify (3.27), each of these terms can be lower bounded as:

    P( E_j | \bigcap_{k=j+1}^{i-1} Z_k^{det} ) \geq P(E_j)    (3.28)

Combining the inequalities (3.27) and (3.28) into (3.25) results in the lower bound given in (3.14).

To derive an upper bound on the terms in the summation of (3.23), consider the following sequence of equations and inequalities:

    P( Z_i^{und} \cap \bigcap_{j=1}^{i-1} Z_j^{det} ) = P(Z_i^{und}) P( \bigcap_{j=1}^{i-1} Z_j^{det} | Z_i^{und} )    (3.29)
        = p_ue P(E_i) \prod_{j=1}^{i-1} P( Z_j^{det} | Z_i^{und} \cap \bigcap_{k=j+1}^{i-1} Z_k^{det} )    (3.30)
        \leq p_ue P(E_i) (1 - p_ue)^{i-1} P( E_{i-1} | Z_i^{und} )    (3.31)

where (3.29) follows from the chain rule for joint probabilities, and (3.30) results from the definitions of Z_i^{det} and Z_i^{und}, and from the assumption that error detection is conditionally independent of any additional events given the current decoding outcome. The inequality in (3.31) is based on the fact that P( E_j | Z_i^{und} \cap \bigcap_{k=j+1}^{i-1} Z_k^{det} ) \leq 1, and on the same operational insight that led to (3.27) and (3.28). Note that (3.31) can be extended by dropping the conditional probability P( E_{i-1} | Z_i^{und} ), as this term is likely close to 1 due to conditioning on a future decoding error. The upper bound given in (3.15) follows. Note that applying (3.15) to (3.22) yields the message error probability bound given in (3.16).

As an additional point, note that if it is assumed that the system is operating in an SNR regime where successful decoding is possible for some system capable value of n_m, then P(E_m) = 0, and the overall message error probability can be lower bounded by (3.14).

From a computational standpoint, the bounds in (3.14)-(3.16) are attractive as a design tool because they are each a function of only p_ue and the marginal probabilities of decoding error events over m decoding attempts. The only component of these bounds that has not yet been developed, but is required for evaluation, is a model for the marginal probability of decoding error, P(E_i). In Section 3.3, such a model is described.

3.3 Approximate Decoding Error Probability

In this section, a method of approximating the marginal probability of decoding error is presented. The method is based on random coding arguments, thus providing an optimistic characterization of performance and reliability that can be used as a benchmark when considering practical rateless codes.

The model we consider is based on error exponent analysis and jointly typical sets. Conceptually, the decoder for this model can be thought of as making decisions based on whether the received sequence is jointly typical with a unique codeword from the codebook. This approach is referred to as jointly typical decoding [13]. The model assumes the following:
1. There exists a codebook of 2^k unique codewords, each of which corresponds to a unique k bit message. The codebook is known to both the transmitter and receiver.

2. Each codeword, x_i^{n_m} = {x_1, x_2, ..., x_{n_m}} for i ∈ {1, 2, ..., 2^k}, is a length n_m sequence of encoded symbols, where each symbol is drawn iid according to some probability distribution p(x). Additionally, each encoded symbol satisfies the power constraint P; that is, for any symbol x_j ∈ x_i^{n_m}, E[|x_j|^2] = P.

3. The encoded symbols are sent incrementally over the channel, which outputs y^n, a distorted version of n encoded symbols, where n ≤ n_m.

4. Each encoded symbol sent over the channel is distorted according to the transition distribution p(y|x). As stated in Section 2.2, the channel transition distribution of interest in this work is complex AWGN with noise variance σ², centered at the transmitted symbol x.

5. Each time the channel outputs additional distorted symbols corresponding to a given k bit message, the receiver attempts to decode by looking in the codebook for a unique codeword of length n that is jointly typical with the received sequence, according to the distribution p(x^n, y^n) = \prod_{i=1}^{n} p(x_i, y_i).

Additional details on jointly typical sequences and decoding can be found in Chapter 7 of [13], which also provides a detailed derivation of the error exponent for such a decoder. The important result of this derivation (for the current purpose) is that the average marginal probability of decoding error corresponding to a length n_i codeword can be bounded as:

    P(E_i) \leq 2^{ -n_i ( \log_2(1 + γ) - k/n_i - δ ) }    (3.32)

with SNR γ = P/σ², and small constant δ > 0. The bound in (3.32) is most accurate in cases where the code block length, n_i, is long.

Figure 3-2 illustrates (3.32) for a k = 256 bit rateless code over a complex AWGN channel with capacity C = 1 bit per channel use (γ_db = 0 dB), plotted as a function of coding rate, k/n_i (equivalent to spectral efficiency in this case). Note that, though the error exponent curve represents an upper bound, for the remainder of this thesis we use the approximation:

    P(E_i) \approx 2^{ -n_i ( \log_2(1 + γ) - k/n_i ) }    (3.33)

Figure 3-2: Single decoding attempt probability of error example, k = 256, γ_db = 0 dB.

Figure 3-3 illustrates the upper bound in (3.16) plotted as a function of p_ue for a complex AWGN channel with capacity C ≈ 2.06 bits per channel use (γ_db = 5 dB). In this example, the information block length is k = 1024 bits, n_IRU = 1, and n_m = 8192 (i.e., decoding attempts are executed after every encoded symbol is received, up to n_m = 8192 symbols).

Figure 3-3: Upper bound on probability of message error, k = 1024, γ_db = 5 dB.
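As a minimal illustration of how the approximation (3.33) feeds into the message-error bound (3.16), the following Python sketch evaluates the bound for the Figure 3-3 configuration (k = 1024, γ_db = 5 dB, n_IRU = 1, n_m = 8192). This is an illustrative evaluation of the formulas above, not a reproduction of the thesis software, and the sweep over b is chosen only to mirror the p_ue axis of the figure.

```python
import math

def p_decode_error(n: int, k: int, snr_db: float) -> float:
    """Approximation (3.33): P(E_i) ~ 2^-(n*log2(1+snr) - k), clipped to 1."""
    capacity = math.log2(1.0 + 10.0 ** (snr_db / 10.0))
    margin = n * capacity - k            # bits of margin accumulated after n channel uses
    return 1.0 if margin <= 0.0 else 2.0 ** -margin

def message_error_upper_bound(k: int, b: int, snr_db: float, n_iru: int, n_max: int) -> float:
    """Upper bound (3.16): sum_i p_ue (1-p_ue)^(i-1) P(E_i) + (1-p_ue)^m P(E_m)."""
    p_ue = 2.0 ** -b                     # approximation (3.1)
    m = n_max // n_iru                   # maximum number of decoding attempts
    total, survive = 0.0, 1.0            # survive tracks (1 - p_ue)^(i-1)
    for i in range(1, m + 1):
        total += p_ue * survive * p_decode_error(i * n_iru, k, snr_db)
        survive *= 1.0 - p_ue
    return total + survive * p_decode_error(m * n_iru, k, snr_db)

# Sweep the CRC length b (hence p_ue = 2^-b), mirroring the axis of Figure 3-3.
for b in (8, 16, 24, 32):
    bound = message_error_upper_bound(k=1024, b=b, snr_db=5.0, n_iru=1, n_max=8192)
    print(f"b = {b:2d}  (p_ue = 2^-{b}):  P(D_l) <= {bound:.3e}")
```

Under this model the bound is dominated by the many early attempts for which P(E_i) is close to 1, so it scales roughly in proportion to p_ue, which is the qualitative behavior shown in Figure 3-3.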
3.4 Packet Error Probability

In Section 3.2, a bound on the overall message error probability was developed. When evaluated assuming a particular model for encoding, decoding, error detection, and channel impairments, the upper bound given in (3.16) can provide insight into the reliability corresponding to the transmission of a single message encoded using rateless coding. Typically, however, user experience is more closely associated with the reliability and performance that are observed at the transport layer of the system, as opposed to that observed at the MAC and PHY layers. Using this fact as motivation, this section focuses on characterizing the level of reliability that is observed at the transport layer. We begin by defining the packet error event, F, and then give a summary of the main results of this section.

A packet error event occurs when the message error event, D_l, occurs for at least one of the t constituent messages corresponding to a given packet. Hence, F is defined as:

    F \triangleq \bigcup_{l=1}^{t} D_l    (3.34)

The main results of this section consist of the following lower and upper bounds on P(F), the probability of error corresponding to a transport layer packet:

    P(F) \geq 1 - ( 1 - \sum_{i=1}^{m} p_ue (1 - p_ue)^{i-1} \prod_{j=1}^{i} P(E_j) )^t    (3.35)

    P(F) \leq 1 - ( 1 - \sum_{i=1}^{m} p_ue (1 - p_ue)^{i-1} P(E_i) - (1 - p_ue)^{m} P(E_m) )^t    (3.36)

The analysis leading to (3.35) and (3.36) proceeds as follows. Using results developed in Section 3.2, the probability corresponding to (3.34) can be expressed as:

    P(F) = P( \bigcup_{l=1}^{t} D_l ) = 1 - P( \bigcap_{l=1}^{t} D_l^c ) = 1 - \prod_{l=1}^{t} P(D_l^c)    (3.37)
         = 1 - ( 1 - P(D_l) )^t    (3.38)

where (3.37) and (3.38) result from the assumption that error events are independent and identically distributed across codewords. A lower bound on (3.38) is derived as follows:

    P(F) = 1 - ( 1 - P(D_l) )^t \geq 1 - ( 1 - P(E_l^{und}) )^t    (3.39)

Applying (3.14) to (3.39) results in (3.35). To derive an upper bound, simply apply (3.16) to (3.38), which yields (3.36).

Figure 3-4 illustrates the bounds in (3.35) and (3.36). In this example, C ≈ 1.2 bits per channel use (γ_db = 1 dB), n_IRU = 4, n_m = 8192, and tk = 4096 bits. The corresponding upper and lower bounds are plotted as a function of the information block length per codeword, k. The two cases correspond to two different choices of b (b = 16 and b = 24), as labeled in the figure legend. The plot shows that the probability of a packet error event is tightly bounded above and below by a nearly constant function of k. The fact that the bounds are nearly constant for fixed b as k varies is not surprising when one considers that, because n_IRU, n_m, and tk are held fixed in the example, the number of possible decoding attempts over all codewords in a given packet remains constant as k varies. This can be further understood by considering the union in (3.5).

In addition to illustrating the upper and lower bounds derived in this section, the curves in Figure 3-4 show the dramatic effect that increasing b can have on the reliability of a packet. Also, note that in the example of Figure 3-4, it is assumed that decoding is attempted every 4 complex channel uses. This affects the overall probability of error because the number of opportunities for an undetected decoding error to occur increases as decoding attempts become more frequent. Because of this effect, there is, once again, a tradeoff between the performance a system can achieve and the overall probability of error. The former would have the system be configured to attempt decoding as frequently as possible, resulting in a fine tuning of the realized rate to each realization of the channel. The latter would have the system attempt to decode only when there is a high degree of confidence⁴ that decoding will be successful, so as to minimize the number of terms in the union given in (3.5). We return to this point in Chapter 5, as it suggests an interesting area of future work that is beyond the scope of this thesis.

⁴ A confidence metric could be computed based on the noisy observations, e.g., log likelihoods could be used for this purpose.

Figure 3-4: Numerical examples of upper and lower bounds on the overall packet error probability as a function of k, γ_db = 1 dB.
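The packet-level bounds (3.35) and (3.36) can be evaluated with the same decoding-error approximation, as in the sketch below, which mirrors the Figure 3-4 configuration (γ_db = 1 dB, n_IRU = 4, n_m = 8192, tk = 4096 bits). As before, this is an illustrative evaluation under the approximations (3.1) and (3.33), not the software used to generate the figure.

```python
import math

def p_decode_error(n: int, k: int, snr_db: float) -> float:
    """Approximation (3.33) for the marginal decoding error probability."""
    capacity = math.log2(1.0 + 10.0 ** (snr_db / 10.0))
    margin = n * capacity - k
    return 1.0 if margin <= 0.0 else 2.0 ** -margin

def packet_error_bounds(k, b, t, snr_db, n_iru, n_max):
    """Evaluate the lower bound (3.35) and upper bound (3.36) on P(F)."""
    p_ue = 2.0 ** -b
    m = n_max // n_iru
    lower_sum, upper_sum = 0.0, 0.0
    survive, err_product = 1.0, 1.0               # (1 - p_ue)^(i-1) and prod_j P(E_j)
    for i in range(1, m + 1):
        p_err = p_decode_error(i * n_iru, k, snr_db)
        err_product *= p_err
        lower_sum += p_ue * survive * err_product  # term of (3.14)
        upper_sum += p_ue * survive * p_err        # term of (3.15)
        survive *= 1.0 - p_ue
    p_msg_upper = upper_sum + survive * p_decode_error(m * n_iru, k, snr_db)   # (3.16)
    return 1.0 - (1.0 - lower_sum) ** t, 1.0 - (1.0 - p_msg_upper) ** t        # (3.35), (3.36)

for k in (256, 512, 1024, 2048):
    t = 4096 // k                                 # hold the packet payload tk = 4096 bits fixed
    lo, hi = packet_error_bounds(k, b=16, t=t, snr_db=1.0, n_iru=4, n_max=8192)
    print(f"k = {k:4d}, t = {t:2d}:  {lo:.3e} <= P(F) <= {hi:.3e}")
```

Sweeping k while holding tk fixed reproduces the qualitative observation above: for a fixed b, the bounds are nearly flat in k, while changing b shifts them by orders of magnitude.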
3.5 Packet Throughput

In this section, the throughput of a transport layer packet is characterized. We begin by defining the notion of packet throughput, then give a brief summary of the main results.
Then, prior to going into the detailed derivations of these results, a generic example showing how the results can be used to construct bounds that can be evaluated numerically is provided. Note that for the remainder of our development, it is assumed that any undetected decoding errors are subsequently detected using the b' bit CRC embedded in the packet (this assumption is equivalent to assuming a moderately large value of b').

First, consider a packet consisting of $t(k-b)$ useful information bits (we continue to assume the error detection model of Section 3.1 and the packet structure of Section 3.4 but, for ease of exposition, include the b' bit packet CRC in this block of "useful" information bits). The packet throughput is defined, in units of bits per channel use, as:

\[
R \triangleq \frac{t(k-b)\, P(\tilde{\mathcal{F}}^c)}{E\left[\sum_{j=1}^{T} \sum_{l=1}^{t} N^*_l\right]} \tag{3.40}
\]

where $P(\tilde{\mathcal{F}})$ is the overall packet error probability, which takes into account the possibility of up to $n_{rtx} - 1$ packet retransmissions, and the expectation in the denominator is taken with respect to the $N^*_l$ and T, which are both nonnegative integer valued random variables. Each $N^*_l$ is assumed iid, and takes on value $n \in \mathbb{Z}_{>0}$ when codeword l is successfully decoded using a rateless code which occupies n channel uses. T is distributed as a geometric random variable that, for a given packet, takes on the value equal to the number of attempted packet transmissions required for all t codewords to be successfully decoded. Formally, $N^*_l$ and T can be defined as follows:

\[
N^*_l \triangleq \min\left\{ n \in \mathbb{Z}_{>0} : \mathcal{S}(n) \cup \mathcal{Z}_{und}(n) \text{ occurs} \right\} \tag{3.41}
\]
\[
P(T = j) = P(\mathcal{F}^c)\, P(\mathcal{F})^{j-1} \tag{3.42}
\]

for $j = 1, 2, \ldots, n_{rtx}$. Note that $n_{rtx}$ is a protocol dependent parameter, and is not necessarily under the control of the designer in the scenarios considered in this work. In what follows, we consider only two distinct values of $n_{rtx}$; in particular, $n_{rtx} = 1$ and $n_{rtx} = \infty$.

The reasoning behind the definition in (3.40) is as follows. If $n_{trials}$ distinct packets must be communicated to the receiver, the ratio of the total number of useful information bits sent across the channel to the total number of channel uses can be expressed as:

\[
\frac{n_{trials}\, t(k-b)}{\sum_{i=1}^{n_{trials}} \sum_{j=1}^{T_i} \sum_{l=1}^{t} N^*_l} \tag{3.43}
\]

When $n_{trials}$ is large, some fraction, $P(\tilde{\mathcal{F}})$, of the packets will be decoded in error. Hence, we can rewrite (3.43) as:

\[
\frac{P(\tilde{\mathcal{F}}^c)\, n_{trials}\, t(k-b)}{\sum_{i=1}^{n_{trials}} \sum_{j=1}^{T_i} \sum_{l=1}^{t} N^*_l}
= \frac{P(\tilde{\mathcal{F}}^c)\, t(k-b)}{\frac{1}{n_{trials}} \sum_{i=1}^{n_{trials}} \sum_{j=1}^{T_i} \sum_{l=1}^{t} N^*_l} \tag{3.44}
\]

Taking the limit of (3.44) as $n_{trials} \to \infty$ results in (3.40).

Before proceeding with the analysis, we provide a summary of the important results that are developed in this section. Note that for the remainder of this thesis, we make frequent use of the fact that events can be parameterized by the code block length at which they occur, rather than indexing them according to the decoding attempt at which they occur. For example, for the decoding error event we have $\mathcal{E}(n_i) = \mathcal{E}_i$ (as previously described in the second paragraph of Section 3.1), and for the decoding success event we have $\mathcal{S}(n_i) = \mathcal{S}_i$.

Bounding the expectation in the denominator of (3.40) is accomplished using the law of iterated expectation. This entails developing bounds on certain conditional expectations, as described next. First, upper and lower bounds are developed for the conditional expectation of $N^*_l$ given $\mathcal{F}^c$. Note that this is the expected code block length given knowledge that a packet error event does not occur for the packet containing the codeword pertaining to $N^*_l$.
The bounds are given as:

\[
E\left[N^*_l \mid \mathcal{F}^c\right] \le \sum_{n=1}^{\infty} (1-p_{ue})^{n}\, P(\mathcal{E}(n)) \tag{3.45}
\]
\[
E\left[N^*_l \mid \mathcal{F}^c\right] \ge \sum_{n=1}^{\infty} \left[ 1 - \sum_{i=1}^{n} (1-p_{ue})^{i-1}\, P(\mathcal{E}(i-1) \mid \mathcal{S}(i))\, P(\mathcal{E}^c(i)) \right] \tag{3.46}
\]

Note that when evaluating (3.46), $P(\mathcal{E}(i-1) \mid \mathcal{S}(i))$ will be approximated as $P(\mathcal{E}(i-1))$.

Next, upper and lower bounds on the conditional expectation of $N^*_l$ given $\mathcal{F}$ are derived. This is accomplished by applying the law of iterated expectation, then deriving upper and lower bounds on the conditional expectation of $N^*_l$ given $\mathcal{D}_l$. The bounds are shown to be:

\[
E\left[N^*_l \mid \mathcal{D}_l\right] \le \sum_{n=1}^{\infty} (1-p_{ue})^{n}\, P(\mathcal{E}(n)) \tag{3.47}
\]
\[
E\left[N^*_l \mid \mathcal{D}_l\right] \ge \sum_{n=1}^{\infty} \left[ 1 - \sum_{i=1}^{n} P(\mathcal{E}(i))\, p_{ue}(1-p_{ue})^{i-1}\, P(\mathcal{E}(i-1) \mid \mathcal{Z}_{und}(i)) \right] \tag{3.48}
\]

where $P(\mathcal{E}(i-1) \mid \mathcal{Z}_{und}(i))$ can be approximated as $P(\mathcal{E}(i-1))$. It is then shown that the bounds in (3.47) and (3.48), along with the appropriate bounds on $P(\mathcal{D}_l)$ from Section 3.2, can be used to construct upper and lower bounds based on the expression:

\[
E\left[N^*_l \mid \mathcal{F}\right] = E\left[N^*_l \mid \mathcal{D}_l^c\right] P(\mathcal{D}_l^c) + E\left[N^*_l \mid \mathcal{D}_l\right] P(\mathcal{D}_l) \tag{3.49}
\]

Finally, it is shown that R can be expressed as:

\[
R = \frac{(k-b)\, P(\tilde{\mathcal{F}}^c)}{E\left[N^*_l \mid \mathcal{F}^c\right] P(\mathcal{F}^c) + \Big( E\left[N^*_l \mid \mathcal{F}\right] + E\left[N^*_l \mid \mathcal{F}^c\right] P(\mathcal{F}^c) \Big) \dfrac{P(\mathcal{F})}{P(\mathcal{F}^c)}} \tag{3.50}
\]

and that the results in (3.45)-(3.48), along with appropriate results from previous sections, can be used to construct upper and lower bounds on (3.50), which can then be numerically evaluated. For example, an upper bound can be constructed as follows:

1. Select the desired values of k and b and substitute them into (3.50). Note that selecting b implies the value of $p_{ue}$.

2. Substitute (3.33) for any and all marginal decoding error probabilities involved in the construction of the desired bound (parameterized appropriately for the desired configuration).

3. Substitute the product of (3.46) and the quantity 1 minus (3.36) into the denominator of (3.50), in place of the product $E[N^*_l \mid \mathcal{F}^c]\, P(\mathcal{F}^c)$ (there are two such instances in the denominator of (3.50)).

4. Construct a lower bound on (3.49) by substituting the product of (3.48) and (3.14), and the product of (3.46) and the quantity 1 minus (3.15), into (3.49) in place of $E[N^*_l \mid \mathcal{D}_l]\, P(\mathcal{D}_l)$ and $E[N^*_l \mid \mathcal{D}_l^c]\, P(\mathcal{D}_l^c)$, respectively (note that we have made use of the fact that $E[N^*_l \mid \mathcal{D}_l^c] = E[N^*_l \mid \mathcal{F}^c]$).

5. Substitute the lower bound constructed in step 4 into (3.50) in place of $E[N^*_l \mid \mathcal{F}]$.

6. Compute the ratio $P(\mathcal{F})/P(\mathcal{F}^c)$, using (3.35) in place of $P(\mathcal{F})$ and the quantity 1 minus (3.35) in place of $P(\mathcal{F}^c)$. Substitute the result into (3.50).

7. Compute $P(\tilde{\mathcal{F}}^c)$ appropriately and substitute the result into (3.50). For example, if the transport layer protocol specifies that $n_{rtx} = 1$, compute $P(\tilde{\mathcal{F}}^c) = 1 - P(\mathcal{F})$, using (3.35) in place of $P(\mathcal{F})$.

The example above illustrates a generic procedure that can be used to evaluate the bounds developed in this thesis. In a given design scenario with a particular set of system specifications, some of the steps enumerated above may require slight modifications, but the basic procedure will remain the same.
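To make the procedure concrete, the following Python sketch wires the main steps together for the no-retransmission case ($n_{rtx} = 1$). It is a skeleton only: `p_decode_error` is an illustrative stand-in for the decoding model of Section 3.3, the per-codeword accumulation mirrors the general shape of the Section 3.2 bounds rather than reproducing them exactly, and `expected_stop_length` is a deliberately crude proxy for (3.45)-(3.46). All function names and parameter values are hypothetical.

```python
import math

def p_decode_error(n, k, capacity):
    """Illustrative stand-in for the Section 3.3 decoding model: a single decoding
    attempt over n channel uses fails with probability roughly 2^-(n*C - k),
    clipped to 1 while the accumulated mutual information is below k bits."""
    margin = n * capacity - k
    return 1.0 if margin <= 0 else min(1.0, 2.0 ** (-margin))

def attempt_grid(n_first, n_iru, n_max):
    """Channel-use counts at which decoding is attempted."""
    return list(range(n_first, n_max + 1, n_iru))

def codeword_error_bounds(k, b, grid, capacity):
    """Rough lower/upper bounds on the per-codeword message error probability: an
    undetected error can slip past the b-bit check (p_ue = 2^-b) at any attempt,
    and the upper bound also charges a residual failure at the final attempt."""
    p_ue = 2.0 ** (-b)
    lower, survive = 0.0, 1.0          # survive = (1 - p_ue)^(attempts so far)
    for n in grid:
        lower += survive * p_ue * p_decode_error(n, k, capacity)
        survive *= 1.0 - p_ue
    upper = min(1.0, lower + survive * p_decode_error(grid[-1], k, capacity))
    return lower, upper

def packet_error_bounds(p_cw_low, p_cw_high, t):
    """Packet-level bounds via P(F) = 1 - (1 - P(D_l))^t."""
    return 1.0 - (1.0 - p_cw_low) ** t, 1.0 - (1.0 - p_cw_high) ** t

def expected_stop_length(k, grid, capacity):
    """Crude proxy for E[N*]: decoding cannot stop before the first attempt, and each
    later gap of channel uses is charged the probability that decoding had not yet
    succeeded at the previous attempt (undetected-error stopping is ignored)."""
    total = grid[0]
    for i in range(len(grid) - 1):
        total += (grid[i + 1] - grid[i]) * p_decode_error(grid[i], k, capacity)
    return total

if __name__ == "__main__":
    capacity = math.log2(1.0 + 10 ** 0.1)   # AWGN capacity at 1 dB SNR, ~1.2 bits/use
    grid = attempt_grid(n_first=4, n_iru=4, n_max=8192)
    for k in (256, 512, 1024, 2048, 4096):
        t = 4096 // k
        pf_low, pf_high = packet_error_bounds(*codeword_error_bounds(k, 16, grid, capacity), t)
        r_upper = (k - 16) * (1.0 - pf_low) / expected_stop_length(k, grid, capacity)
        print(f"k = {k:4d}: packet error in [{pf_low:.2e}, {pf_high:.2e}], R <= ~{r_upper:.2f}")
```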
Next, we begin the analysis and derivations of (3.45)-(3.50). To evaluate the denominator of (3.40), first apply the law of iterated expectation:

\[
E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l\right]
= E\left[ E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T\right]\right]
= E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T = 1\right] P(T=1)
+ E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T > 1\right] P(T>1) \tag{3.51}
\]

The conditioning in the expectations of (3.51) corresponds to the packet error event $\mathcal{F}$ as follows:

\[
\mathcal{F}^c = \{T = 1\} \tag{3.52}
\]
\[
\mathcal{F} = \{T > 1\} \tag{3.53}
\]

Each expectation in (3.51) is next considered separately. The first expectation in (3.51) is conditioned on the event for which all t constituent codewords of a given packet are successfully decoded. In this case, there is no need for the packet to be retransmitted. Manipulating the expectation and expressing it in terms of $\mathcal{F}$, we arrive at the expression:

\[
E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T = 1\right]
= \sum_{l=1}^{t} E\left[N^*_l \mid \mathcal{F}^c\right]
= t\, E\left[N^*_l \mid \mathcal{F}^c\right] \tag{3.54}
\]

Because $N^*_l$ is a nonnegative integer valued random variable, the expectation in (3.54) can be expressed as:

\[
E\left[N^*_l \mid \mathcal{F}^c\right] = \sum_{n} P\left(N^*_l > n \mid \mathcal{F}^c\right) \tag{3.55}
\]

Note that, because the decoding error event corresponding to a codeword of length $n_i$ can be expressed as

\[
\mathcal{E}_i = \mathcal{E}(n_i), \tag{3.56}
\]

the events $\mathcal{Z}_{und}$ and $\mathcal{Z}_{det}$ can similarly be parameterized by a given number of channel uses. To begin evaluating (3.55) numerically, we rewrite the event of interest involving $N^*_l$ in terms of $\mathcal{S}(n)$:

\[
\{N^*_l > n\} \cap \mathcal{F}^c = \left(\bigcap_{i=1}^{n} \mathcal{Z}_{det}(i)\right) \cap \left(\bigcup_{i=n+1}^{n_m} \mathcal{S}(i)\right) \tag{3.57}
\]

Computing the probability of the event on the right hand side of (3.57) is difficult, but an upper bound can be derived by considering the following disjoint union:

\[
\left[\bigcup_{j=n+1}^{n_m} \left( \left(\bigcap_{i=1}^{j-1} \mathcal{Z}_{det}(i)\right) \cap \mathcal{Z}^c_{det}(j) \right)\right] \cup \left( \bigcap_{i=1}^{n_m} \mathcal{Z}_{det}(i) \right) \tag{3.58}
\]

which is equivalent to the intersection:

\[
\bigcap_{i=1}^{n} \mathcal{Z}_{det}(i) \tag{3.59}
\]

Rewriting the probability of the event given in (3.59) in terms of (3.58), and rearranging terms, results in the following upper bound on each of the probabilities in the summation of (3.55):

\[
P\left(N^*_l > n \mid \mathcal{F}^c\right) \le P\left(\bigcap_{i=1}^{n} \mathcal{Z}_{det}(i)\right) \le (1-p_{ue})^{n}\, P(\mathcal{E}(n)) \tag{3.60}
\]

where (3.60) results from the same chain rule argument that resulted in (3.19). Substituting (3.60) into (3.55) yields the result given in (3.45). To derive a lower bound on the probabilities in the summation of (3.55), we write the complement of the event of interest as:

\[
\{N^*_l \le n\} \cap \mathcal{F}^c = \bigcup_{i=1}^{n} \left( \mathcal{S}(i) \cap \left( \bigcap_{j=1}^{i-1} \mathcal{Z}_{det}(j) \right) \right) \tag{3.61}
\]

Because the events in the union on the right hand side of (3.61) are disjoint, the corresponding probability is:

\[
P\left(N^*_l \le n \mid \mathcal{F}^c\right) = \sum_{i=1}^{n} P\left( \mathcal{S}(i) \cap \left( \bigcap_{j=1}^{i-1} \mathcal{Z}_{det}(j) \right) \right) \tag{3.62}
\]

Applying the chain rule to each term in the summation of (3.62) results in the following sequence of relations:

\[
P\left( \mathcal{S}(i) \cap \left( \bigcap_{j=1}^{i-1} \mathcal{Z}_{det}(j) \right) \right)
= P(\mathcal{S}(i)) \prod_{j=1}^{i-1} P\left( \mathcal{Z}_{det}(j) \,\Big|\, \mathcal{S}(i) \cap \left( \bigcap_{k=j+1}^{i-1} \mathcal{Z}_{det}(k) \right) \right)
= P(\mathcal{E}^c(i)) (1-p_{ue})^{i-1} \prod_{j=1}^{i-1} P\left( \mathcal{E}(j) \,\Big|\, \mathcal{S}(i) \cap \left( \bigcap_{k=j+1}^{i-1} \mathcal{Z}_{det}(k) \right) \right) \tag{3.63}
\]
\[
\le P(\mathcal{E}^c(i))\, (1-p_{ue})^{i-1}\, P(\mathcal{E}(i-1) \mid \mathcal{S}(i)) \tag{3.64}
\]

where (3.63) results from the definitions of $\mathcal{S}(i)$ and $\mathcal{Z}_{det}(j)$ (recall that in the error detection model defined in Section 3.1, all decoding successes are detected; that is, $P(\mathcal{P} \mid \mathcal{E}^c) \triangleq 1$), and the fact that, given $\mathcal{E}(j)$, the detection outcome at attempt j is conditionally independent of decoding attempts for $j' \neq j$. The inequality in (3.64) results from taking the smallest term in the product in (3.63) and noting that the product of the remaining terms is bounded above by 1 (from an operational perspective, $P(\mathcal{E}(i-1) \mid \mathcal{S}(i))$ is the smallest term in the product because decoding errors are least likely to occur for j nearly i, and the nearest j can be to i in the given product is $j = i - 1$). A lower bound on the probability of the complement of (3.61) can now be expressed using (3.64):

\[
P\left(N^*_l > n \mid \mathcal{F}^c\right) \ge 1 - \sum_{i=1}^{n} P(\mathcal{E}^c(i))\,(1-p_{ue})^{i-1}\, P(\mathcal{E}(i-1) \mid \mathcal{S}(i)) \tag{3.65}
\]

which implies the lower bound given in (3.46).

The second expectation in (3.51) is conditioned on the event corresponding to a given packet requiring at least one retransmission. This event occurs when at least one of the decoding decisions made during a given packet's initial decoding attempt is erroneous and the CRC fails to detect the error. We begin to characterize this expectation by expanding it into the following recurrence relation:

\[
E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T > 1\right]
= E\left[\sum_{l=1}^{t} N^*_l \,\Big|\, \mathcal{F}\right] + E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l\right] \tag{3.66}
\]
\[
= E\left[\sum_{l=1}^{t} N^*_l \,\Big|\, \mathcal{F}\right]
+ E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T = 1\right] P(T=1)
+ E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T > 1\right] P(T>1) \tag{3.67}
\]

where (3.66) is a consequence of the memoryless property of the AWGN channel.
Rearranging terms in (3.67) and rewriting the events involving T in terms of $\mathcal{F}$ yields:

\[
E\left[\sum_{j=1}^{T}\sum_{l=1}^{t} N^*_l \,\Big|\, T > 1\right]
= \frac{E\left[\sum_{l=1}^{t} N^*_l \mid \mathcal{F}\right] + E\left[\sum_{l=1}^{t} N^*_l \mid \mathcal{F}^c\right] P(\mathcal{F}^c)}{P(\mathcal{F}^c)} \tag{3.68}
\]
\[
= \frac{t\left( E\left[N^*_l \mid \mathcal{F}\right] + E\left[N^*_l \mid \mathcal{F}^c\right] P(\mathcal{F}^c) \right)}{P(\mathcal{F}^c)} \tag{3.69}
\]

The conditional expectation $E[N^*_l \mid \mathcal{F}]$ is the only term in (3.69) which we have not previously established how to evaluate. To close this gap, we once again use the law of iterated expectation, along with the fact that $N^*_l$ is conditionally independent of $\mathcal{F}$ given $\mathcal{D}_l$:

\[
E\left[N^*_l \mid \mathcal{F}\right] = E\left[N^*_l \mid \mathcal{F}, \mathcal{D}_l^c\right] P(\mathcal{D}_l^c) + E\left[N^*_l \mid \mathcal{F}, \mathcal{D}_l\right] P(\mathcal{D}_l) \tag{3.70}
\]
\[
= E\left[N^*_l \mid \mathcal{D}_l^c\right] P(\mathcal{D}_l^c) + E\left[N^*_l \mid \mathcal{D}_l\right] P(\mathcal{D}_l) \tag{3.71}
\]

Note that $E[N^*_l \mid \mathcal{D}_l^c] = E[N^*_l \mid \mathcal{F}^c]$, which we have previously established how to evaluate. The remaining expectation in (3.49) can be expressed as:

\[
E\left[N^*_l \mid \mathcal{D}_l\right] = \sum_{n} P\left(N^*_l > n \mid \mathcal{D}_l\right) \tag{3.72}
\]

To compute the probabilities in the summation in (3.72), note that:

\[
\{N^*_l > n\} \cap \mathcal{D}_l = \left(\bigcap_{i=1}^{n} \mathcal{Z}_{det}(i)\right) \cap \left(\bigcup_{i=n+1}^{n_m} \mathcal{Z}_{und}(i)\right) \tag{3.73}
\]

Because $\left(\bigcap_{i=1}^{n} \mathcal{Z}_{det}(i)\right) \cap \left(\bigcup_{i=n+1}^{n_m} \mathcal{Z}_{und}(i)\right) \subset \bigcap_{i=1}^{n} \mathcal{Z}_{det}(i)$, an upper bound on (3.72) is given by (3.47). For a lower bound on the terms in the summation of (3.72), consider the complementary event:

\[
\{N^*_l \le n\} \cap \mathcal{D}_l = \bigcup_{i=1}^{n} \left( \mathcal{Z}_{und}(i) \cap \left( \bigcap_{j=1}^{i-1} \mathcal{Z}_{det}(j) \right) \right) \tag{3.74}
\]

which is another union of disjoint events. The probability of the corresponding event can be expressed as a summation of probabilities, each of which was previously upper bounded in (3.31). Applying this result yields the following upper bound:

\[
P\left(N^*_l \le n \mid \mathcal{D}_l\right) \le \sum_{i=1}^{n} P(\mathcal{E}(i))\, p_{ue}(1-p_{ue})^{i-1}\, P(\mathcal{E}(i-1) \mid \mathcal{Z}_{und}(i)) \tag{3.75}
\]

Approximating $P(\mathcal{E}(i-1) \mid \mathcal{Z}_{und}(i))$ as $P(\mathcal{E}(i-1))$ and taking the complement of (3.75) results in the desired lower bound given in (3.48).

Finally, note that after substituting (3.54) and (3.69) into (3.40), we arrive at the packet throughput expression given in (3.50). The expression in (3.50) can be numerically evaluated by applying the bounds on $P(\mathcal{D}_l)$ and $P(\mathcal{F})$ that were derived in Sections 3.1, 3.2 and 3.4, along with the approximation given in Section 3.3. Its main utility to system designers is as a tool that can be used to begin modeling systems that use rateless coding. In particular, our results enable different choices of design parameters, and their resulting effects on system performance, to be evaluated prior to resorting to simulation.

3.6 Constrained System Models

In this section, the results developed in Sections 3.1-3.5 are used to analyze two distinct models of communication, each of which is assumed to incorporate rateless channel codes. Motivation for the analysis of these two models is based on observations first made in Section 3.1, where the inherent tradeoff between system throughput and reliability was illustrated. Each model is loosely based on one of the common network communication protocols belonging to the transport layer of a communication system architecture (see Appendix A). Furthermore, each model is meant to exemplify a different system constraint that is representative of those which system designers are typically faced with. As an interesting comparison, we compare the bounds provided by one of our models to the variable length feedback (VLF($\epsilon$)) codes of [14]. The VLF($\epsilon$) bound provides a coarse grained view of variable rate codes that is based on transmitting a single codeword. The goal of this analysis is to provide useful insight to system designers who may be considering incorporating rateless codes into their designs.

The first model is referred to as the latency constrained model. This model is useful for applications in which delay is more costly than dropped packets, and can be considered an archetype for the well known User Datagram Protocol (UDP).
The second model, which will be referred to as the reliability constrained model, may be used for delay tolerant applications that require a high degree of reliability, and can be thought of as modeling the Transmission Control Protocol (TCP). In what follows, "reliability constrained model" and "TCP model" are used interchangeably, and likewise for "latency constrained model" and "UDP model". The most important distinction between the two models stems from the fact that the reliability constrained model attempts to decrease the probability of error observed by the application by retransmitting a packet when a packet error event occurs. This is in contrast to the latency constrained model, which drops erroneous packets and moves on to the next packet in queue (thus, the UDP model subjects the application layer to a level of reliability equal to the packet error probability).

The overall probability of error and throughput for the latency constrained model are:

\[
P_l \triangleq P(\mathcal{F}) \tag{3.76}
\]
\[
R_l \triangleq \frac{(k-b)\, P(\mathcal{F}^c)}{E\left[N^*_l\right]} \tag{3.77}
\]

where the expectation in the denominator of (3.77) is expressed as:

\[
E\left[N^*_l\right] = E\left[N^*_l \mid \mathcal{F}^c\right] P(\mathcal{F}^c) + E\left[N^*_l \mid \mathcal{F}\right] P(\mathcal{F}) \tag{3.78}
\]

Lower and upper bounds on (3.78) can be derived directly using our previous development in Section 3.5, along with the appropriate bounds on $P(\mathcal{F})$ from Section 3.4. These lead to upper and lower bounds on (3.77).

The probability of error and throughput for the reliability constrained model, assuming there is no limit on the number of packet retransmissions in the event that any packet is found to be in error, are given as:

\[
P_r \triangleq 0 \tag{3.79}
\]
\[
R_r \triangleq R \quad \text{with } P(\tilde{\mathcal{F}}^c) = 1 \text{ (see (3.50))} \tag{3.80}
\]

To illustrate the utility of Equations (3.76)-(3.80), along with the bounds derived in previous sections, consider the following scenario. Suppose a designer would like to design a system where the packet length may not exceed 4096 bits, and the overall system probability of error must be less than or equal to $1 \times 10^{-10}$. Because the packet length and the overall probability of error are specified, the designer must choose k and b so that the specifications are met or exceeded. Choosing k and b will also determine the number of codewords which are used to transmit a packet, t.

Figure 3-5 illustrates upper bounds on (3.77) and (3.80) for a low SNR case ($\gamma_{dB}$ = 1 dB), while Figure 3-6 illustrates the high SNR case ($\gamma_{dB}$ = 10 dB). For the marginal decoding error probabilities, the error exponent model described in Section 3.3 was used. The bounds are plotted as a function of the information block length of each codeword, k, for choices of b which enable the system error rate specification to be met or exceeded.

As can be seen in Figures 3-5 and 3-6, the UDP model provides a bound for both high and low SNR cases. Noteworthy is the fact that the UDP model, for both SNR regimes, offers a more pessimistic bound on throughput than what is provided by the VLF($\epsilon$) lower bound of [14]. The VLF($\epsilon$) bound, plotted in green in each figure, is included in the comparison because it could be used as a tool to gain insight into such a design. However, because the UDP model takes a more fine grained perspective into account, it is a more appropriate analytical tool for the current design task. Also note that the TCP model throughput bound is higher than that of the UDP model for shorter values of k.
If it is assumed that the bounds are tight, then this would suggest that there is significant loss incurred by utilizing such a lengthy CRC (b = 45 bits) in order to meet the system error rate specification in the UDP case. As can be seen in both SNR examples, the upper bounds for both models converge to C as k increases. It is interesting to note that these effects are a direct manifestation of the throughput-reliability tradeoff first discussed in Section 3.1, and illustrated in Figure 3-1, which showed that, when using CRC error detection, reliability has a higher throughput cost when shorter values of k are used.

Figure 3-5: Bounds on throughput, $\gamma_{dB}$ = 1 dB. (Curves: capacity, VLF($\epsilon$) lower bound, TCP: b = 24, UDP: error rate 1e-10 (b = 45).)

Figure 3-6: Bounds on throughput, $\gamma_{dB}$ = 10 dB. (Curves: capacity, VLF($\epsilon$) lower bound, TCP: b = 24, UDP: error rate 3e-11 (b = 45).)

Chapter 4 System Design and Practical Rateless Codes

In this chapter, two practical rateless codes are evaluated. Each code is evaluated using Monte Carlo simulation, and the analysis of Chapter 3 is used to guide a design process that can be used to select the "best" practical rateless code for a given system. We begin by describing each practical code.

4.1 Layered Codes

Layered rateless codes [6] utilize an approach in which weighted linear combinations of encoded symbols from a number of distinct messages are incrementally transmitted over the channel. The encoded symbols corresponding to each distinct message, which we refer to as base codewords, are generated using a fixed-rate base code. After a sufficient number of observations by the receiver, the decoder attempts to obtain an estimate of each message using a successive cancellation decoder. If decoding fails for any one of the messages, additional weighted linear combinations of the base codewords are transmitted, and successive cancellation decoding is restarted using all of the available observations. Though the construction is not limited to large values of k, it has been shown [6] that choosing relatively long information block length turbo codes for the base codebook results in good performance.

The adjective "layered" refers to the fact that L base codewords (i.e., "layers") are repetitiously transmitted over the channel, where each repetition, which is referred to as a redundancy block, is constructed as a distinct linear combination of the base codewords. At the encoder, a set of combining weights makes up an $M \times L$ matrix of complex coefficients, denoted G, where M is the maximum number of distinct redundancy blocks. If the l-th base codeword is denoted $c_l$, and the m-th redundancy block is denoted $x_m$, then the M redundancy blocks output from the layered encoder are of the form:

\[
\begin{bmatrix} x_1 \\ \vdots \\ x_M \end{bmatrix} = G \begin{bmatrix} c_1 \\ \vdots \\ c_L \end{bmatrix} \tag{4.1}
\]

where, in what follows, any subset of the $x_m$'s corresponding to a particular set of L $c_l$'s will be referred to as a layered codeword. Because of the repetition structure of the layered architecture, some method of combining available redundancy blocks at the receiver is required prior to executing each successive decoding attempt.
This step of the receiver algorithm can be implemented using symbol by symbol, unbiased minimum mean square error (UMMSE) combining of the observed redundancy blocks (maximal ratio combining (MRC) can also be used, as it is for the implementation described in [15], though UMMSE generally enables higher performance). Note that decoding can be attempted when a non-integer number of redundancy blocks has been observed by the receiver. Further, it is shown in Section IX of [6] that good performance can be achieved in such a configuration, so long as at least 1 full redundancy block has been observed.

In [6] it was shown analytically that the layered construction can achieve capacity approaching performance given a capacity approaching base code and an appropriately designed matrix G. It was also demonstrated, using simulation of several different configurations, that layered codes can achieve good performance when using an analytically optimized G, along with a long information block length turbo base code. In this thesis, we further contribute to the results on the layered construction by analyzing their expected performance when used in the kinds of systems described in Section 3.6. We begin by describing in the following subsection how this architecture fits into the models developed in Chapter 3.

4.1.1 Decoding Errors Under Successive Decoding

The error detection model of Section 3.1 assumes that, given n channel uses of a particular rateless codeword, the probability of an undetected decoding error, $P(\mathcal{Z}_{und}(n))$, is equal to $p_{ue} P(\mathcal{E}(n)) \approx 2^{-b} P(\mathcal{E}(n))$, which is a function only of the number of message bits dedicated to error detection, b, and the probability of decoding error, $P(\mathcal{E}(n))$. Additionally, in our previous treatment it was assumed that $P(\mathcal{E}(n))$ could be approximated simply by setting it equal to the error probability given by either one of the two decoding models of Section 3.3. In this section, it is shown that under the successive cancellation decoding inherent in the layered architecture, the decoding error probability and the undetected decoding error probability for a given number of channel uses have a slightly different characterization. In particular, it is shown that a different mapping from the decoding models of Section 3.3 to the decoding error probability $P(\mathcal{E}(n))$ is required. The main result of this section is that the required mapping is given by the following relation:

\[
P(\mathcal{E}(n)) = 1 - \prod_{l=1}^{L} P(\mathcal{S}(n, l)) \tag{4.2}
\]

where $\mathcal{S}(n, l)$ is similar to the detected decoding error event defined in (3.4), though for this case it must be parameterized by the current number of channel uses, n, and the layer $l \in [1, L]$ that is currently being decoded.
To understand why this is the case, consider the effective SNR at the input to the decoder when attempting to decode layer 1 E [1, L]. In particular, this is the effective SNR after any previously decoded layers have been subtracted from the received signal, and UMMSE combining of the remaining redundancy blocks has been executed. This effective SNR is determined by the SNR of the communication channel, and the layers that have yet to be decoded. As such, this effective SNR is a function of the number of received redundancy blocks, and the particular layer that is being decoded. This means that the decoding error events corresponding to individual layers must be parameterized by n, and ,, the effective SNR corresponding to layer 1. Thus, given an estimate of 'yj for each n and each 1 E [1, L], the probability of decoding error can be estimated using one of the decoding models of Section 3.3. The discussion in the above paragraph implies that a method for obtaining estimates of the effective SNR is required by our modified model of the layered architecture. In practice (and as suggested in Section IX of [6]), a reasonable estimate of this effective SNR can be obtained by computing the expected mean square error (MSE) corresponding to each layer after cancellation of previously decoded layers and UMMSE combining is taken into account. See Section B.1.2 of Appendix B for an illustration of the validity of using the MSE estimate for this purpose. 66 Let l be an estimate of the effective SNR corresponding to layer 1 for a particular 1 layered codeword which has occupied n complex channel uses. Assume that obtained as described above. l has been Let Slayer (n, -Yl) be a new error event corresponding to a decoding error given n channel uses when the effective SNR is -Y, where the subscript 1 denotes explicit dependence on the particular layer that is being decoded by the successive cancelation decoder. Denote the event corresponding to a detected decoding success at layer I given n channel uses as S(n, 1), where S(n, 1) is defined as: S(n, l) (4.4) Eaye, (n,yi) n P E Note that we need not parameterize P because we continue to assume the error detection scheme given in Equation (3.7), which is conditionally independent of n and yl given the associated decoding result. Similar to (4.4), the undetected decoding error event defined in (3.2) can be parameterized by 1 as: Zund(n, (4.5) 1) = Elayer (n, 7y) n P and similar for the detected decoding error event, Zdet(n, 1) (see definition Equation (3.3)). Because the expected MSE for layer 1 is computed under the assumption that layers 1+1, ..., L have been successfully decoded and cancelled, the resulting effective SNR estimate can be used to approximate the conditional probability of decoding error. This can be accomplished using the error exponent approximation of Section 3.3 as: L I? (Slayer (n, ni) 1 S(n, ) ~ 2-n(c( - ) (4.6) j=I+1 Equation (4.6) enables the use of the models of Section 3.6 to analyze the layered architecture. We next show, in detail, how this can be accomplished. 67 Decoding success for a layered codeword is defined as the intersection of detected decoding successes of all L layers which make up the codeword for some n. 
Decoding success for a layered codeword is defined as the intersection of detected decoding successes of all L layers which make up the codeword for some n. For a given n, the probability of this event can be computed as:

\[
P\left(\bigcap_{l=1}^{L} \mathcal{S}(n, l)\right) = \prod_{l=1}^{L} \left(1 - P\left(\mathcal{E}_{layer}(n, \gamma_l) \,\Big|\, \bigcap_{j=l+1}^{L} \mathcal{S}(n, j)\right)\right) \tag{4.7}
\]

where we have used the chain rule for probabilities and the fact that:

\[
P\left(\mathcal{S}(n, l) \,\Big|\, \bigcap_{j=l+1}^{L} \mathcal{S}(n, j)\right)
= P\left(\mathcal{E}^c_{layer}(n, \gamma_l) \cap \mathcal{P} \,\Big|\, \bigcap_{j=l+1}^{L} \mathcal{S}(n, j)\right)
= P\left(\mathcal{P} \,\Big|\, \mathcal{E}^c_{layer}(n, \gamma_l),\, \bigcap_{j=l+1}^{L} \mathcal{S}(n, j)\right) P\left(\mathcal{E}^c_{layer}(n, \gamma_l) \,\Big|\, \bigcap_{j=l+1}^{L} \mathcal{S}(n, j)\right) \tag{4.8}
\]
\[
= 1 - P\left(\mathcal{E}_{layer}(n, \gamma_l) \,\Big|\, \bigcap_{j=l+1}^{L} \mathcal{S}(n, j)\right) \tag{4.9}
\]

Note that if we take the complement of the relation given in (4.7), we obtain the probability of decoding error given in (4.2). This result is key to applying the results of Chapter 3 to the analysis of layered codes. In particular, Equations (4.6), (4.9), and (4.2) show how to map the decoding models of Section 3.3 to the probability of decoding error given n channel uses under the successive cancellation decoder employed by the layered architecture.

Next, consider an undetected decoding error given n channel uses. Under successive cancellation decoding, such an undetected error occurs if i out of L decoding attempts result in undetected errors, where $1 \le i \le L$, and the remaining $L - i$ attempts each result in successful decodings. Hence, the number of ways in which an undetected decoding error can occur is combinatorial in L. In particular, there are $\sum_{i=1}^{L} \binom{L}{i} = 2^L - 1$ distinct ways in which an undetected error can occur for a given n. Because each of the $2^L - 1$ events is distinct, the overall event can be defined as the following disjoint union:

\[
\mathcal{Z}_{und}(n) = \bigcup_{i=1}^{2^L - 1} \left[ \left( \bigcap_{l \in \mathcal{B}_i} \mathcal{Z}_{und}(n, l) \right) \cap \left( \bigcap_{l' \in \mathcal{B}_i^c} \mathcal{S}(n, l') \right) \right] \tag{4.10}
\]

where $\mathcal{B}_i \subseteq [1, L]$ is the i-th set of indices, with $\mathcal{B}_i \neq \mathcal{B}_j$ for $i \neq j$, $i, j \in \{1, 2, \ldots, 2^L - 1\}$. The probability corresponding to (4.10) is a summation of $2^L - 1$ terms. Note that the first $\binom{L}{1}$ of the terms are proportional to $p_{ue}$, while the next $\binom{L}{2}$ are proportional to $p_{ue}^2$, and so on. Thus, despite the fact that the number of terms in the sum grows exponentially with L, many of the terms are very small, even for moderate choices of b.

Additional operational insight reveals that for a number of the first decoding attempts executed on a codeword given n channel uses, only 1 out of the $2^L - 1$ terms is present in the summation. In particular, if the probability of decoding the L-th layer (which is the first layer the decoder attempts to decode) is equal to zero, then the only way for an undetected decoding error to occur is if all L decoding attempts for that particular n result in undetected errors, an event whose probability is upper bounded by $p_{ue}^L$. Thus, for the layered architecture, the probability of an undetected decoding error given n channel uses can be upper bounded as:

\[
P(\mathcal{Z}_{und}(n)) \le
\begin{cases}
\sum_{i=1}^{L} \binom{L}{i}\, p_{ue}^{i} & \text{if } P(\mathcal{S}(n, L)) > 0 \\
p_{ue}^{L} & \text{otherwise}
\end{cases} \tag{4.11}
\]

Note that error propagation due to cancellation of erroneous codewords makes successfully decoding subsequent layers after an undetected error occurs less likely. This can be thought of as a feature of the layered architecture, due to the successive cancellation decoding. Because of this feature, it is likely that (4.11) is somewhat loose.

4.2 Spinal Codes

Spinal codes [7] were recently shown to achieve good performance for relatively short information block lengths. The basic idea underlying spinal codes is to encode each message, m, into a series of correlated spine values by sequentially applying a hash function to distinct k' bit segments of m. A sequence of random number generators (RNGs) seeded by the resulting spine values is then used to generate symbols which are mapped to an appropriate transmit symbol before being sent out on the channel.
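To make the encoding idea concrete, the following sketch chains a hash over k'-bit message segments to produce spine values, then seeds an RNG with each spine value to emit channel symbols. It illustrates the principle only: the hash, RNG, segment width, and constellation mapping used here are arbitrary stand-ins, not the specific functions used by the spinal codes of [7] or by the libwireless implementation [16].

```python
import hashlib
import random

def spine_values(message_bits, k_prime=4):
    """Hash the message k'-bit segment by k'-bit segment; each spine value depends on
    all segments seen so far, which is what couples the transmitted symbols."""
    spine = b"\x00"                       # arbitrary initial state
    spines = []
    for i in range(0, len(message_bits), k_prime):
        segment = "".join(str(b) for b in message_bits[i:i + k_prime])
        spine = hashlib.sha256(spine + segment.encode()).digest()
        spines.append(spine)
    return spines

def encode_pass(spines, pass_index=0, levels=64):
    """One pass over the spine: seed an RNG with (spine value, pass index) and map the
    draw to a PAM-like constellation point.  Sending further passes is what makes the
    code rateless: each pass contributes fresh symbols for the same message."""
    symbols = []
    for spine in spines:
        rng = random.Random(spine + bytes([pass_index]))
        level = rng.randrange(levels)              # uniform constellation index
        symbols.append(2 * level - (levels - 1))   # center the PAM levels around zero
    return symbols

if __name__ == "__main__":
    msg = [random.randint(0, 1) for _ in range(32)]   # toy 32-bit message
    spines = spine_values(msg, k_prime=4)
    print(len(spines), encode_pass(spines, pass_index=0)[:4], encode_pass(spines, pass_index=1)[:4])
```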
The spinal decoder described in [7] is an approximate maximum likelihood decoder in which decoder complexity and performance can be traded off by varying the configuration of the decoder.

One of the main attributes that makes spinal codes of interest in this work (and in general) is the good performance they have been shown to achieve at short message information block lengths. This makes spinal codes particularly appealing for use in low latency applications, where modern high performance codes (e.g., turbo, LDPC) typically do not achieve high performance. Though the construction of the spinal code itself does not appear to limit its good performance to the regime of short k, the decoder presented in [7] can inhibit performance as k increases when the beamwidth of the decoder, B, is held constant. The beamwidth of the spinal decoder determines the number of paths that are retained as the decoder sequentially steps through the tree of possible decoding decisions. At each step, B paths back to the root of the tree are retained, and all other tentative paths are pruned. Hence, as k is increased (and, subsequently, the required code block length), the tree lengthens, and the number of opportunities for the correct path to be pruned from the tree increases. This means that B must be increased in order to maintain good performance as k increases. However, increasing B increases the computational complexity of the decoder, making it prohibitive to increase the beamwidth in practical systems. A more detailed discussion and analysis of the parameters associated with spinal codes can be found in Section 8.5 of [7]. For the analysis at hand, we use this reference as a guide to aid in choosing the best performing configuration, though we are also limited to configurations that are supported by [16].

In [7], the authors analyze and demonstrate spinal code performance in various configurations for AWGN and Rayleigh fading channels. In [17], the authors develop a link layer protocol for spinal codes which attempts to dynamically select values of $n_{IRU}$ by learning the complementary distribution function (CDF) of the decoder stopping trials (similar to Equation (3.41)), then applying a dynamic program to determine the IRU length after every feedback. In this thesis, we contribute to the work on spinal codes by analyzing their performance when they are used in the kinds of systems described in Section 3.6. The operation of spinal codes is such that the models of Chapter 3 can be used directly, without any need for modifications such as those which had to be made for the layered codes.

4.3 Design of Constrained Systems

The main objective of this section is to illustrate how to choose the "best" rateless code for a given design scenario. We also seek to illustrate how the analysis of Chapter 3 can be helpful to the design process. The illustration is carried out by comparing results obtained via Monte Carlo simulation of layered and spinal codes to appropriately parameterized versions of the models developed in Chapter 3 (and refined for the layered case in Section 4.1.1).

4.3.1 Latency Constrained System Design

Consider a system in which the transport packet length is at most 7168 bits, and the transport layer protocol is UDP. Because packet retransmissions are not allowed, the system can be modeled using (3.76) and (3.77). Suppose it has been determined that the application can tolerate a maximum packet error rate of $3 \times 10^{-2}$, and that the system should be designed to operate in environments with $\gamma_{dB} \ge 1$ dB. Given this setup, we would like to answer the following questions:

Q1) What values of k and b can be chosen to meet a given system error probability specification?

Q2) Which rateless code under consideration performs best given the answer to Q1?
Given this setup, we would like to answer the following questions: Q1) What values of k and b can be chosen to meet a given system error probability specification? Q2) Which rateless code under consideration performs best given the answer to Q1? 71 To begin answering Q1, the recall the packet error probability upper bound given in (3.36). Simply evaluating this bound for different choices of k and b, yields insight into where to begin the design. To illustrate this, Figures 4-1, and 4-2 show (3.36) plotted as a function of k for two distinct choices of b. Figure 4-1 illustrates the worst case SNR that the system is specified to operate in high SNR case (-Ydb = (-Ydb = 1 dB), while Figure 4-2 illustrates a relatively 8 dB). The markers on each curve specify values of k that divide 7168 an integer number of times and are greater than b, and are, thus, candidate choices for the codeword information block length. The bounds are clearly above the specified packet error rate for low SNR, and slightly below it at high SNR for the b = 14 case. For b = 16, the bounds are below the desired packet error rate for both low, and high SNR. These observations lead to the conclusion that b > 16. Note that in Figures 4-1 and 4-2, it is assumed that decoding is first attempted after 16 channel uses have been observed by the receiver, and then is attempted again every 8 complex channel uses. Recall that, from the perspective of wanting to match the rate of the code to the particular channel realization, it is desirable to attempt decoding as frequently as possible. However, when practical error detection is used, more frequent decoding leads to a higher probability of undetected error because there are more terms in the union of (3.5) (i.e. m is larger). Additional observation of Figures 4-1 and 4-2 reveals an important, and intuitive design guideline: given a specified operating environment and associated worst case SNR, it is most important during the early stages of design to consider the worst case to help guide the design process. Indeed, the worst case reliability illustrated in Figure 4-1 obviated the need to consider the results in Figure 4-2. While this observation can help guide the design process away from unnecessary tasks, it is generally important to characterize the system over a wide range of conditions at some critical point in the design process. Because the upper bounds shown in Figures 4-1 and 4-2 are relatively constant over the range of k values, it can be concluded that throughput must be taken into account to 72 properly choose k. Though the exact value of k still remains to be chosen, the packet error model of Section 3.4 been leveraged to help answer Q1. . . ... . . . . . . . . . . . . . . . . . . .. . . . . . . . . .. . . . . . . . . . .. . . . . . . . . .. . . . . . . . . . . . . . . . . . . .. . . . . . . . . . ........ ...................................... ..... ........... ......... ..... ....... .................... .... ......... .... .... ............................. .......... ........... ............... ......................... ................... . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .. . . . . . . . . .. . . . . . . . . . . . . . . . . . . ... . . . . . . . . .I .......................... .............................................I .............. ............... ............ .......... .........................................I ......... 10-1 :7 ....................... .......... ...................... ............................. ... 
Figure 4-1: Packet error probability upper bounds for UDP design example, $\gamma_{dB}$ = 1 dB.

Given the preliminary conclusions reached for Q1 (i.e., that b >= 16, and that k may be selected as any value which divides 7168 an integer number of times), we now wish to answer Q2; that is, to select a rateless code, along with a corresponding information block length, k, that performs best in this scenario. To begin to answer this question, bounds on the throughput, $R_l$, given in (3.77) can be evaluated. Because a candidate set of b and k values has already been determined, simulations of the layered and spinal codes targeting these configurations can also be executed (all spinal code simulations were executed using the useful and well documented libwireless package [16]). Figures 4-3 and 4-5 show such simulation results for low and high SNR cases ($\gamma_{dB}$ = 1 and 8 dB, respectively), along with upper bounds on $R_l$.

Figure 4-2: Packet error probability bounds for UDP design example, $\gamma_{dB}$ = 8 dB.

Table 4.1 summarizes the parameters used to configure each simulation shown in Figures 4-3 and 4-5. Recall that notation that is specific to the layered and spinal codes is described in Tables 2.4 and 2.5, respectively. For the spinal code simulations, k', c', and B were chosen based on recommendations in Section 8.5 of [7], while $n_1$ and $n_{IRU}$ are a consequence of the "8-way" puncturing schedule (between each decoding attempt, symbols corresponding to every 8th spine value are transmitted; see [7], Section 5) which is available in the libwireless package for simulating spinal codes [16].
For the layered code simulations, $r_b$ was selected to correspond to the LTE turbo code of [18] along with the use of a complex QPSK transmit symbol; L was chosen so that throughput is not limited for SNRs below approximately $\gamma_{dB}$ = 14 dB (it is required that $L \times r_b \ge \log_2(1+\gamma)$ hold to avoid limiting throughput for an SNR of $\gamma$); M was chosen to maximize the number of unique redundancy blocks (given the chosen L) without imposing any loss due to the layered architecture (see [6], Section V); and $\epsilon_{tgt}$ was chosen based on previous simulation results executed over the SNR range of interest (see Figure B-1, Appendix B). For both codes, simulations were executed with equal values of $n_{IRU}$ for a given information block length, k. Hence, the frequency with which feedback is acquired by the encoder is the same for both codes.

Note that the layered codes are configured so that decoding attempts begin after the first full redundancy block is received. This configuration was chosen because it was shown ([6], Figure 3) that attempting to decode when less than one redundancy block has been observed by the decoder results in low performance. On the other hand, due to the puncturing schedule of the spinal code simulation, decoding attempts begin after $n_{IRU}$ channel uses are observed. From a maximum throughput perspective, this means that the spinal codes have an advantage over the layered architecture in its current configuration. Indeed, the maximum throughput for the spinal codes is $8 \times k' \times 2 = 64$ bits per complex channel use, whereas the layered architecture can achieve only up to $L \times r_b = 14/3$ bits per complex channel use. However, in practice systems generally do not operate in environments where C > 64 bits per channel use.

There is also a disadvantage for the spinal codes given this configuration, which can be understood by recalling the union of disjoint events given in Equation (3.5). In particular, the number of decoding attempts may be greater in the case of spinal codes because such attempts commence when fewer channel uses have been observed. This generally leads to a larger number of decoding attempts for the same expected $N^*$, as compared to layered codes. This means that, compared to the case of the layered codes, there will be a larger number of disjoint events in the union of (3.5), and, hence, a larger number of terms in the summation of the corresponding probability of undetected error for the spinal codes. This effect can be mitigated for the spinal codes by computing a reliability metric based only on the channel observations, and using this metric to decide when to begin decoding. This enhancement is not implemented for the simulations presented in this work.

Table 4.1: Simulation Configuration, k = {256, 512, 1024}

(a) Spinal Codes
    k'        4
    v         6
    c'        6
    B         256
    n_1       {4, 8, 16}
    n_IRU     {4, 8, 16}
    Mapping   PAM-2^{c'} per real dim.
    b         16

(b) Layered Codes
    r_b       2/3
    L         7
    M         7
    n_1       {390, 774, 1542}
    n_IRU     {4, 8, 16}
    eps_tgt   1 x 10^-3
    Mapping   QPSK
    b         16

The stand-alone markers plotted in Figures 4-3 and 4-5 illustrate the results of the simulations described above. The value corresponding to each marker was computed as:

\[
\hat{R}_l = \frac{t_k (k-b) \sum_{j=1}^{n_{trials}} \mathbb{1}\{\mathcal{F}_j^c\}}{\sum_{j=1}^{n_{trials}} \sum_{i=1}^{t_k} n^*_{i,j}} \tag{4.12}
\]

where $t_k$ is the number of messages (codewords) that make up a packet when the information block length is k, $n_{trials}$ is the number of packets simulated, $\mathbb{1}\{\mathcal{F}_j^c\}$ is an indicator function, and $n^*_{i,j}$ is the realization of $N^*_i$. Note that (4.12) converges to (3.77) in the limit as $n_{trials} \to \infty$. Figures 4-4 and 4-6 show the probability of error for each simulation, along with that predicted by the models when configured with applicable parameters from Table 4.1.
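The estimator in (4.12) is straightforward to compute from per-packet simulation logs. The sketch below assumes each simulated packet is summarized by a flag indicating whether it was decoded correctly and by the list of realized stopping lengths $n^*_i$ of its $t_k$ codewords; the record format and function name are hypothetical.

```python
def empirical_throughput(k, b, packets):
    """Evaluate the estimator of (4.12): useful bits delivered by correctly decoded
    packets, divided by the total channel uses spent on all simulated packets.

    `packets` is an iterable of (packet_ok, stop_lengths) pairs, where packet_ok is
    True when every codeword of the packet was decoded correctly, and stop_lengths
    holds the realized n*_i (channel uses) of its t_k codewords."""
    useful_bits = 0
    channel_uses = 0
    for packet_ok, stop_lengths in packets:
        if packet_ok:
            useful_bits += len(stop_lengths) * (k - b)   # t_k codewords of k - b useful bits
        channel_uses += sum(stop_lengths)                # channel uses are spent either way
    return useful_bits / channel_uses

if __name__ == "__main__":
    # Toy log: three simulated packets with t_k = 2 codewords each (hypothetical numbers).
    log = [(True, [400, 410]), (True, [395, 430]), (False, [800, 790])]
    print(empirical_throughput(k=512, b=16, packets=log))
```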
First, note that the upper bounds on R, provided by our model give a less optimistic forecast of performance as compared to the VLF(E) lower bound5 of [14]. This comparison is used simply to illustrate the importance of considering 5 Note that the VLF(E) bounds shown in each figure were computed using k's for each code. 76 E = maxpl over all candidate 1.3 - 1.2 - 4) - - x 0.9 2 0 .8 - Layered.Simulation La. e e.Mode ... Mo..e .. -.. . .. .. .. . ... . . .. . . .. -.. ... A 1.1 x ... ... .... . -Spina . C 0.7- - -- . . --- - - -- Shannon Capacity VLF(es na VLF(Elaee 0.6 9- Layered Model - -G - Spinal Model 0.5 .. .. . - -.. ... x . ..... -... .. -. . . ... .. . A 0.4' 0 100 200 300 400 500 600 Layered Sim ulation Spinal Simulation 800 700 900 - 1000 Information Block Length k(bits) Figure 4-3: Throughput upper bounds and simulation results for layered, and spinal codes for UDP design example, -Ydb = 1 dB. the performance limitations that are imposed by the architecture of the entire system, as opposed to considering only the performance that the rateless code may achieve. Next, note that for this low SNR case, spinal codes achieve higher performance when k = 256, but there is a "crossover" in the simulated throughput for k = 512, where the layered codes achieve higher simulated throughput. This is also the case, albeit to a larger extent, for k = 1024. This suggests that, given each system in the current configuration, if the system is to operate primarily in low SNR environments, the highest performance can be achieved by choosing the layered construction configured with a base code with corresponding information block length k > 512. One explanation for this performance is as follows. The packet length in this example is long enough such that the turbo base code 77 10 1 10. - 10 . .. .. 0 10.. +0 (10 s 10- -0- 0 10 Layre.Mde .... 300 400 Spna0Mde 500 600 700 800 900 1000 Information Block Length k(bits) Figure 4-4: Packet error probability upper bounds and simulation results for layered, and spinal codes for UDP design example, -ydb = 1 dB. can be chosen at a relatively long block length, e.g. k = 1024. Additionally, recall that the beamwidth of the spinal decoder, B, constrains spinal code performance as k grows and B is held constant. The crossover observed in this case is likely caused to a combination of these two characteristics. Further on in this example we return to this issue by experimenting with the beamwidth of the spinal decoder in an effort to determine how sensitive spinal code performance is to a change in B within the context of the UDP model. Moreover, in Section (4.3.3), we investigate a short packet length design example. However, given that the system was specified to operate in high SNR environments, results for such environments must also be considered when deciding which rateless code to use. 78 I I I I I - I I I I -- -- - ---- - VLF- sp--"-0 - F 1.5 ...... I ........... ............ - - - - - --- 4.5 ...... ............. - -- - - Shannon Capacity --- VLF(s, spinal -VLFI.laered 0 1 ... . .5 . ... . . .. . . .................. ......... . . ... . .. . .... .. ... 0 100 200 300 e- Spinal Model Layered Model a Spinal Simulation a Layered Simulation .. . ...... .-...- 400 500 600 700 800 900 1000 Information Block Length k(bits) Figure 4-5: Throughput upper bounds and simulation results for layered, and spinal codes for UDP design example, ?Ydb = 8 dBe. 
Turning to the results shown in Figure 4-5, it can be seen that the conclusion that layered codes of long block length should be chosen for the design does not hold as strongly for this SNR. The highest empirical throughput at this SNR is achieved by the spinal code at a block length of k = 512. For k = 1024, the spinal code has a negligible advantage over the layered code. Because this advantage is minuscule in comparison to that which the layered code holds in the low SNR case, the reasonable conclusion remains that, for the current configuration, layered codes should be selected, although it has now been shown that the information block length should be chosen as k = 1024.

The preceding discussion focused on comparing the two codes based only on throughput.
Clearly, because the system under design is specified to operate at a packet error rate of less than $3 \times 10^{-2}$, the empirical packet error rates of the two codes must be taken into account prior to making a final design decision.

Figure 4-6: Packet error probability upper bounds and simulation results for layered and spinal codes for UDP design example, $\gamma_{dB}$ = 8 dB.

Turning attention to Figures 4-4 and 4-6, note that both codes meet or exceed the specified packet error rate of $3 \times 10^{-2}$ under the current configuration. Note that there are no empirical packet error rate measurements for the layered codes appearing in Figure 4-6 for k = 512 and k = 1024. This is due to the fact that undetected errors were not observed when simulating layered codes in the current configuration for these values of k at $\gamma_{dB}$ = 8 dB. That is, for this set of Monte Carlo trials, $n_{trials} = 6 \times 10^4$ layered codewords were simulated for each value of k, and not a single undetected error was observed for k = 512 and k = 1024. This means that, with high probability, the codeword error rate, $P(\mathcal{D}_l)$, is only a small multiplicative constant greater than about $1 \times 10^{-5}$, meaning the packet error rate is, within a narrow confidence interval, around $1 \times 10^{-4}$ (these estimates are based on the application of Chebyshev's inequality and a normal approximation to the binomial indicator function, $\mathbb{1}\{\mathcal{D}_l\}$). This low observed packet error probability suggests that our claim in Section 4.1.1 that (4.11) is a loose upper bound is correct, and also illustrates that successive cancellation decoding provides additional robustness against undetected errors for the layered codes.

Based on the simulated throughputs and packet error rates observed thus far, it can be concluded that one should either choose the layered codes with k = 1024, or reevaluate spinal code performance after modifying some of the parameters of Table 4.1a. Efforts toward the latter option can be aided by observations based on Figures 4-3 and 4-5. In particular, note that the empirical throughput of the layered code increases monotonically for both SNR cases. In contrast, that of the spinal code increases from k = 256 to k = 512, then decreases for k = 1024 in both SNR cases. This decrease is not predicted by the analytical model, and is counterintuitive from the perspective of random coding theory. However, it is shown in Section 4 of [7] that the performance of the spinal decoder can be increased by increasing the beamwidth, B. The results of simulating spinal codes with B = 512 and 1024 are shown in Table 4.2. Both experiments are configured with $\gamma_{dB}$ = 1 dB, k = 1024, and other parameters as given in Table 4.1a.

Table 4.2: Spinal Code Throughput Comparison, k = 1024, $\gamma_{dB}$ = 1 dB
    B       Throughput (R_l, bits per channel use)
    256     0.87
    512     0.90
    1024    0.90

As can be seen in Table 4.2, increasing B from 256 to 512 results in a slight increase in throughput, but increasing B further does not help. Hence, spinal codes are still outperformed by the layered codes for k = 1024, $\gamma_{dB}$ = 1 dB.
However, is is shown in Section 4 of [7] that the performance of the spinal decoder can be increased by increasing the beamwidth, B. The results of simulating spinal codes with B = 512 and 1024 are shown in Table 4.2. For both experiments are configured with Ydb = 1 dB, k = 1024, and other parameters as given in Table 4.la. Table 4.2: Spinal Code Throughput Comparison, k = 1024, Ydb B Throughput (R 1 , bits per channel use) 256 0.87 512 0.90 1024 0.90 = 1 dB As can be seen in Table 4.2, increasing B from 256 to 512 results in a slight increase in throughput, but increasing B further does not help. Hence, spinal codes are still outperformed by the layered codes for k = 1024, 6 _Ydb = 1 dB. These estimates are based on the application of Chebyshev's inequality, and a normal approximation to the binomial indicator function, 11{D} 81 An additional consideration that must be taken into account when increasing B is the resulting increase in computational complexity of decoding, as spinal decoding complexity is proportional to B. A detailed treatment of the computational complexity of the two codes that considers, for example, the degree to which each decoding algorithm can be optimized is beyond the scope of this thesis. However, one can begin to acquire an understanding of the relative complexity cost that must be paid to achieve the throughput increases illustrated in Table 4.2 by using the simulation aided complexity analysis given in Appendix B, Section B.2. This analysis is used to estimate the average number of receiver operations per goodbit 7 for this case. The results, which are given in Table 4.3, show that a 14% increase in the number of receiver operations per goodbit must be paid for increasing B from 256 to 512 as shown in Table 4.2. Table 4.3: Receiver Ops. Per Goodbit, k = 1024, Spinal (B = 256) 5.743 x -Ydb = Spinal (B = 512) Layered 6.554 x 106 4.289 x 106 106 1 dB Whether or not the additional complexity of the spinal code is prohibitive in practice will depend on aspects such as the hardware platform on which the system will be deployed, size, weight, and power constraints on the system, as well as the degree to which each decoding algorithm can be paralellized. In [19], the authors develop an efficient architecture for decod- ing spinal codes, and a brief comparison is made to the LTE turbo decoder. The comparison in [19], along with our simulation aided complexity analysis suggests that, although spinal decoding requires a higher number of receiver operations per goodbit, the turbo decoder of the layered architecture requires a larger memory footprint. A detailed discussion of these issues is beyond the scope of this thesis. 7 Recall that a goodbit is defined as a bit which is decoded correctly. For example, if K bits are decoded over some interval of time, and the empirical error rate for these K bits is p,, the number of goodbits recieved is equal to K x p,. 82 Given the above analysis of throughput, reliability, and computational complexity of the candidate rateless codes for the specified latency constrained system, a reasonable solution to Q2 is to choose the layered architecture with an information block length of k = 1024. Indeed, under the scenario presented in this section, it has been illustrated that this design decision will result in a system which will exhibit higher expected throughput, a lower packet error probability, and also a lower computational complexity than choosing the spinal code in any of the considered configurations. 
If the memory resources on the implementation fabric of the receiver are limited, spinal codes are capable of delivering throughput similar to that of the layered codes for k = 256 and 512, although it has been shown that this decision will result in at least an order of magnitude increase in packet error probability.

Finally, note that to further balance the comparison of these two rateless codes, it would be ideal if each code were configured with a CRC length, b, such that the packet error probabilities of each code are equal. This would enable a comparison based purely on the performance that each rateless code is capable of within the setting of interest. However, we defer such experiments for now, due to limitations of the software simulator.

4.3.2 Reliability Constrained System Design

Consider a system in which the transport packet length is at most 7168 bits, and the transport layer protocol is TCP. In this scenario the system can be modeled using (3.79) and (3.80). Once again, the system should be designed to operate in environments with γdb > 1 dB. Because we model TCP as providing unlimited packet retransmissions, which can, in principle, drive the system error rate as low as desired, a particular specification on the probability of error is not given.

Because packet retransmissions incur a large penalty on throughput, one should seek to minimize such occurrences by minimizing the probability of packet error. However, because of the throughput-reliability tradeoff that exists, such a minimization must take into account the cost that a given level of reliability incurs on performance. Thus, instead of seeking answers to Q1 and Q2 as in the latency constrained case, we wish to answer the following:

Q3) Which one of the two codes, and associated values of k and b, provides the highest throughput under the given constraint on packet length?

To begin designing the MAC and PHY layers of the system such that the best possible tradeoff between performance and reliability is achieved, we turn to the models and our simulation results. The parameters used for simulation of layered and spinal codes are the same as those in Table 4.1. Upper bounds on system throughput, along with corresponding simulation results for γdb = 1 dB and 8 dB, are shown in Figures 4-7 and 4-8, respectively. The figures, once again, illustrate the choices of k that divide 7168 an integer number of times. It is interesting to note that the model suggests that the layered code can outperform the spinal code for all values of k which are considered in the low SNR case, and the layered codes do, in simulation, outperform the spinal codes for two out of three values of k which were simulated. In the high SNR case, the models suggest that the spinal code can outperform the layered code for all considered values of k, and the simulation results show that this is the case for two out of the three k's which were simulated.

Figure 4-7: Throughput upper bounds and simulation results for layered and spinal codes for TCP design example, γdb = 1 dB.
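Because the TCP-like model's equations (3.79)–(3.80) are not reproduced here, the following sketch is only a simplified illustration, under stated assumptions, of why packet errors are costly when unlimited retransmissions are available: if attempts are independent and each fails with probability p, a delivered packet costs 1/(1 − p) transmissions on average, so effective throughput scales roughly by (1 − p). This is not the thesis's model; it only motivates the tradeoff discussed above.

```python
def effective_throughput(per_attempt_rate: float, packet_error_prob: float) -> float:
    """Toy retransmission model: with unlimited, independent retransmission
    attempts, the expected number of transmissions per delivered packet is
    1 / (1 - p), so long-run throughput scales by (1 - p)."""
    return per_attempt_rate * (1.0 - packet_error_prob)

for p in (1e-1, 1e-2, 1e-3):
    print(f"p = {p:.0e}: throughput {effective_throughput(0.9, p):.4f}")
```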
Given the configurations and results shown in Figures 4-7 and 4-8, one can reasonably answer Q3 by choosing the layered code with base codes for which k = 1024. However, this conclusion does not take into account the effect of alternative choices of b. Could varying the value of b for the spinal code result in higher throughput, in spite of the throughput-reliability tradeoff that exists? Because the simulation resources for varying b are not available for the spinal code, we cannot explore the parameter space for the spinal code empirically. Because of this, we turn to the model, which suggests that the throughput of the spinal code could (based on its upper bound) exceed that of the layered code if b were increased from 16 to 24. However, one must also recall that the model's throughput upper bound does not take into account the beamwidth related effects on the spinal decoder. Furthermore, recall the effect that was observed in previous simulations, in which the empirical throughput of the spinal code reached a maximum value at k = 512, then decreased when k was increased to 1024. It is therefore likely that the throughput of the spinal code, configured with b = 24, will also decrease as k increases, as opposed to following the trajectory suggested by the model.

Figure 4-8: Throughput upper bounds and simulation results for layered and spinal codes for TCP design example, γdb = 8 dB.

4.3.3 Short Packet Lengths

In Sections 4.3.1 and 4.3.2, it was shown that the chosen layered construction provides good performance and reliability for latency and reliability constrained scenarios, so long as the information block length of the base code is chosen to be relatively large (k = 1024 in our examples). Allowing such a value of k implies a relatively long transport packet length, which must be proportional to L × k. Codes such as the turbo code, which constitute the base codebook for the layered construction, do not perform particularly well for short block lengths. While the layered construction does not preclude us from using a high performing, short block length code (such as an algebraic code) for the base codebook, we defer such an investigation for future consideration.

Many applications (e.g., voice communications) have strict latency requirements. For the model of interest in this thesis, a strict latency requirement implies a short transport layer packet. Performance results presented in [7] suggest that spinal codes, due to their capacity approaching performance at short information block lengths, are particularly well suited for such applications. Given the above considerations, one must ask: are spinal codes the better choice when designing a system with a strict latency requirement, when compared to the layered construction considered in this thesis? As it turns out, the answer is yes. The following design example illustrates how to arrive at this conclusion.

Consider the design of a system for which the transport packet length is at most 1024 bits, the transport layer protocol is UDP (as would be appropriate for an application with a strict latency requirement), and the packet error rate must not exceed 1 × 10^-2.
Additionally, assume the system is expected to operate in environments for which γdb > 1 dB. As in Section 4.3.1, we seek to answer Q1 and Q2. Because of the short packet length, the values of M and L for the layered codes must be chosen smaller than those in the previous examples. For this example, L = M = 4 is chosen.⁸

⁸Recall that this choice will limit throughput for SNRs above γ = 2^(L × r_b) − 1, or γdb ≈ 7.3 dB.
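As a quick numerical check of the SNR ceiling quoted in the footnote above, the short sketch below evaluates γ = 2^(L·r_b) − 1 for L = 4 and a per-layer rate r_b = 2/3 bits per channel use; the value of r_b is an assumption inferred from the layered configuration listed later in Table 4.4.

```python
import math

L, r_b = 4, 2.0 / 3.0            # number of layers and per-layer rate (bits/channel use)
gamma_max = 2 ** (L * r_b) - 1   # SNR at which L layers of rate r_b saturate capacity
print(f"gamma_max = {gamma_max:.2f} ({10 * math.log10(gamma_max):.1f} dB)")
# prints roughly 5.35, i.e. about 7.3 dB, matching the footnote
```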
Figure 4-9 shows that b = 16 is sufficient for both codes to meet the system error rate specification for k ∈ {64, 128, 256}, thus answering Q1. This knowledge, as in Section 4.3.1, enables us to target simulations with these parameter choices. A summary of the parameters used to configure the simulations in this example is given in Table 4.4.

Figure 4-9: Packet error probability bounds for UDP design example with short packet length, γdb = 1 dB.

Table 4.4: Short Packet Length Simulation Configuration, k ∈ {64, 128, 256}

(a) Spinal Codes: k' = 4, v = 6, c = 6, B = 256, n_i = {1, 2, 4}, n_IRU = {1, 2, 4}, Mapping: PAM-2^c per real dim., b = 16
(b) Layered Codes: r_b = 2/3, L = 4, M = 4, n_i = {102, 198, 390}, n_IRU = {1, 2, 4}, ε_tgt = 1 × 10^-3, Mapping: QPSK, b = 16

Figures 4-10 and 4-12 illustrate simulation results of the short packet UDP case, along with throughput upper bounds on R_1 and, again, VLF(ε) lower bounds for each rateless code, for comparison. For both SNRs, the bounds on R_1 suggest that the spinal code may outperform the layered code at each value of k, which agrees with the simulation results. Though the empirical throughput of the layered construction increases quickly with k (whereas, in contrast, the spinal code's performance approaches a peak at k = 128), the layered codes are still outperformed for each of the values of k which could be chosen given the packet length and the values of L and M. These observations confirm that the spinal code is better suited for the short block length regime, when compared to the layered construction with a turbo base codebook.

Figures 4-11 and 4-13 illustrate the probability of error observed in simulation and the upper bounds provided by the UDP model. Unlike the longer packet length case of Section 4.3.1, undetected packet errors were observed for the layered codes in both SNR regimes, at each considered value of k. Note that the confidence levels corresponding to the Monte Carlo simulations of the short packet length cases are similar to those given in Section 4.3.1 for the long packet length cases. Also, observe that the empirical packet error probabilities in Figure 4-13 corresponding to the layered codes lie above the upper bound given by our model. This phenomenon suggests that, for short information block lengths, the layered construction with turbo base code is not well modeled by our UDP model. Further investigation as to why this is the case is deferred.

Given the performance results in Figures 4-10 and 4-12, the clear answer to Q2 is to select spinal codes with k = 256. This choice will provide the highest average throughput, and a probability of error meeting that specified for the system. As was previously mentioned, further investigation of the layered construction configured with alternative base codes is beyond the scope of this thesis, though it is, perhaps, an interesting area warranting future work.

Figure 4-10: Throughput upper bounds and simulation results for layered and spinal codes for UDP design example with short packet length, γdb = 1 dB.
Figure 4-11: Packet error probability upper bounds and simulation results for layered and spinal codes for UDP design example with short packet length, γdb = 1 dB.

Figure 4-12: Throughput upper bounds and simulation results for layered and spinal codes for UDP design example with short packet length, γdb = 8 dB.
Figure 4-13: Packet error probability upper bounds and simulation results for layered and spinal codes for UDP design example with short packet length, γdb = 8 dB.

Chapter 5

Discussion and Conclusion

In this thesis, we have presented an analysis of practical rateless codes. The analysis takes into account two important features of practical communication systems that are typically ignored in other analyses appearing in the literature. We began by defining and characterizing a model of rateless coding that accounts for practical error detection methods and packetized communication protocols at the transport layer. Our analysis resulted in the derivation of a set of parametric bounds corresponding to the performance and reliability of a system that uses rateless codes for error correction, and CRC for detecting decoding decision errors.
This set of parametric bounds was then used to guide a Monte Carlo based analysis of layered and spinal rateless codes. This analysis illustrated a process that can be followed when investigating rateless codes for system design. Furthermore, it was shown that under the UDP and TCP like models developed in this thesis, the layered codes perform well when the transport packet length is sufficiently long to support a relatively long turbo base codebook. For short transport packet lengths that are characteristic of latency constrained applications, spinal codes perform best for the configurations considered in this work. During the course of our analysis, a number of additional and fascinating insights emerged, including suggestions for future work, which we discuss next.

Due to the structure of the spinal decoder, in particular the beamwidth parameter, spinal codes are inherently well suited to operate using short information block lengths, and not well suited for moderate to large information block lengths. On the other hand, the turbo base code that the layered code is configured with in this work is inherently well suited for cases where the information block length is relatively large. An interesting topic for future work would consist of characterizing the performance of layered codes when configured with algebraic codes for the base codebook, as they have been shown to perform well in the short information block length regime.

While analyzing a well known performance-reliability tradeoff resulting from CRC error detection overhead, it was observed that another closely related tradeoff exists for rateless codes. Indeed, our definition and subsequent analysis of the codeword error detection event illustrated that a performance-reliability tradeoff also exists as a result of choosing the frequency with which decoding occurs. In particular, we saw that increasing the frequency of decoding enables a fine tuning of the realized rate, but incurs a penalty on the overall probability of error. Investigating the optimal frequency of decoding under a detailed system model such as ours could be a useful area for further work. Recent efforts in this general area can be found in [20].

An interesting area related to the above line of work is the investigation of methods for increasing the reliability of rateless codes that use CRC error detection. This is inspired by our analysis of layered rateless codes, as it was observed that the successive cancellation decoding architecture inherently provides additional reliability against undetected errors under the CRC model. Additional gains can be obtained by leveraging the reliability metrics computed at the receiver. For example, log-likelihoods can be used to dynamically determine how many noisy encoded symbols should be observed before decoding is attempted. Such information can decrease the number of decoding attempts (and error detection checks), resulting in a substantial decrease in the overall probability of undetected error.

Finally, eliminating the need for CRC overhead has been investigated previously. However, with the proliferation of soft reliability based iterative decoders (such as the turbo decoder), useful methods of leveraging the input and output soft information utilized by these decoders remain to be fully exploited for rateless codes. An example of recent efforts in this direction can be found in [11].
Appendix A

Layered System Architectures

One aspect of this work which distinguishes it from previous works is that we analyze performance and reliability from the perspective of the transport layer of a communication system. In communication system design, it is common practice to partition the system into a layered conceptual model. Well known examples include the Open Systems Interconnection model (OSI), and the Transmission Control Protocol/Internet Protocol suite (TCP/IP). For the purpose of our analysis, we define a simple abstraction of these common layered architectures which contains only the layers important to the work in this thesis. Though not as elaborate as either of the standard layered system architectures mentioned above, our model can easily be used as a proxy by designers who wish to study system performance and reliability tradeoffs when designing the medium access control (MAC) and physical (PHY) layers of a communication system. The three layer model which we employ is illustrated in Figure A-1.

The communication link is modeled as consisting of one sender and one receiver. At the sender, it is assumed that information, in the form of a bit stream, is sourced from the application layer, then passed to the transport layer where a transport layer packet, or more succinctly, a packet, is constructed. The packet is then passed to the MAC/PHY layer, where it is segmented into distinct messages and encoded prior to being transmitted over the channel.

Figure A-1: Three layer communication system architecture (Application Layer: information source/sink; Transport Layer: packet construction/retransmission and protocol; MAC/PHY Layer: scheduling and transmission, rateless coding).

At the receiver, the sequence of channel observations is decoded at the MAC/PHY layer. Once successful decoding has been achieved with a high level of confidence, the resulting message estimate is passed up to the transport layer. When all of the constituent messages corresponding to a packet have been decoded, they are reassembled, resulting in an estimate of the transmitted packet. The integrity of the packet estimate is then checked for any undetected errors prior to being passed up to the receiver application layer. The MAC portion of the MAC/PHY layer controls scheduling of sets of encoded transmit symbols, referred to as incremental redundancy units (IRUs), while the transport layer is responsible for scheduling packet transmissions. If the system employs a protocol which prescribes retransmissions of erroneous packets, scheduling of such retransmissions is handled by the transport layer.

Appendix B

Design Details

B.1 Layered Rateless Code Design

In this section, several details are provided regarding the design and configuration of the particular layered rateless codes discussed in Chapter 4.

B.1.1 Choosing ε_tgt

When designing the G matrix, L, M, and ε_tgt must be chosen prior to executing the numerical optimization¹ required by the design. As was discussed in Section 4.3, L and M can be chosen based on considerations regarding the maximum desired throughput and the number of unique redundancy blocks of encoded symbols. However, a method for choosing ε_tgt is not as apparent. For the layered codes which were simulated for this thesis, in order to choose an appropriate value of ε_tgt, that is, a value which maximizes performance for a given L and M, Monte Carlo simulations were executed over a wide range of SNRs for several values of ε_tgt. These simulations were carried out after L and M had been chosen.

¹For this thesis, this optimization is implemented using the Matlab script layeropt.m, by Mitchell D. Trott.
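A minimal sketch of this selection procedure is given below, assuming a hypothetical simulate_throughput_efficiency(eps_tgt, snr_db) routine standing in for the thesis's Monte Carlo simulator; the candidate values and SNR grid are illustrative only.

```python
import numpy as np

def choose_eps_tgt(candidates, snrs_db, simulate_throughput_efficiency):
    """Pick the eps_tgt whose average simulated throughput efficiency over the
    SNR range of interest is highest.  `simulate_throughput_efficiency` is a
    placeholder for a Monte Carlo run of the layered code at one (eps, SNR)."""
    scores = {
        eps: float(np.mean([simulate_throughput_efficiency(eps, snr) for snr in snrs_db]))
        for eps in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores

# Example usage with a dummy stand-in simulator:
best, scores = choose_eps_tgt(
    candidates=(1e-2, 1e-3, 1e-5),
    snrs_db=np.arange(0, 21, 1),
    simulate_throughput_efficiency=lambda eps, snr: np.random.rand(),  # stand-in only
)
```

Averaging over the full SNR grid mirrors the "entire SNR range" criterion used with Figure B-1; a designer targeting a narrower operating region could restrict the grid accordingly.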
A subset of the results is shown in Figure B-1. Clearly, throughput efficiency is best for ε_tgt = 1 × 10^-3 when the entire SNR range is taken into account.

Figure B-1: Throughput efficiency for M = L = 7 layered codes vs. SNR. Various choices of ε_tgt are shown.

B.1.2 Mean Squared Error Estimate of Effective SNR

This section illustrates the validity of using the computed mean squared error as an estimate of the effective SNR under the successive cancellation decoder. For this example, a Monte Carlo simulation of the L = M = 7 layered codes was executed, and the effective SNR after UMMSE combining, taking into account residual interference from all undecoded layers, was measured. That is, a set of measurements of the SNR that the turbo decoder observes was taken. Figure B-2 illustrates the results of these measurements for γdb = 1 dB when decoding layer 7, averaged over all trials. Also plotted are the MSE based estimates. All results are plotted versus the decoding attempt index.

Figure B-2: MSE based estimate and measured effective SNR when attempting to decode layer 7.

From observing Figure B-2, it is clear that the MSE based estimate provides a very good estimate of the average observed effective SNR.

B.2 Complexity of Practical Rateless Codes

In this analysis, complexity is characterized by counting the number of real operations that must be executed by a receiver employing each respective code. The resulting parameterized expressions are then used, along with data obtained by Monte Carlo simulations of each code, to estimate the expected number of receiver operations per goodbit for each code over a range of operating SNRs. In our consideration, processing required to produce channel and noise parameter estimates is not taken into account because such estimates are required by each rateless code of interest in this thesis.

B.2.1 Layered Rateless Codes

Expressions for the complexity of each unique processing step for the layered rateless coding receiver are provided in this section. When using the layered rateless code (and, more generally, any rateless code), several decoding attempts may be necessary to successfully decode all layers, where each attempt is carried out using an increasing number of received redundancy blocks. In the expressions given in Tables B.1 and B.2, m_rb ≤ M is the number of received redundancy blocks,² L is the number of layers, and N is the length of each redundancy block, in channel uses.³ For the Turbo base code, u is the constraint length of the constituent convolutional encoders (i.e., each encoder has 2^u possible states), 1/ρ is the code rate at the output of the Turbo encoder, k is the number of information bits per Turbo codeword (i.e., k is the number of information bits per layer), and I_d is the number of decoder iterations. The expressions given in Tables B.1 and B.2 correspond to the complexity of a single decoding attempt; that is, they do not take into account the cumulative complexity of decoding a particular message.
However, the numerical examples which follow illustrate cumulative complexity based on simulated performance of the layered codes, as that is of most practical value in an analysis such as ours.

Once a set of redundancy blocks is received, they must be combined so that the successive cancellation decoder can begin attempting to decode each of the L layers. MMSE combining of the available redundancy blocks is used. The vector of MMSE combining weights, v, is derived for the AWGN channel of interest as the solution to:

v = argmin_v E[ ||v† y − c_l||² ]    (B.1)

where the input-output relationship for the channel is given as:

y = Gc + w    (B.2)

where y is the m_rb × N matrix of received symbols, G is an m_rb × L matrix of layer combining weights, c is an L × N matrix of L length-N base codewords, and w is an m_rb × N matrix of i.i.d. zero-mean Gaussian noise samples with variance σ². The solution to Eq. (B.1) is:

v = R_yy^{-1} R_cy    (B.3)

The received auto-covariance matrix and received cross-covariance matrix above can be shown to be:

R_yy = G_{m_rb,l} G_{m_rb,l}† + R_zz    (B.4)

R_cy = G_{:,l}    (B.5)

where G_{m_rb,l} is the upper-left m_rb × l submatrix of G, † denotes the conjugate transpose of a matrix or vector, R_zz is the noise covariance matrix, and G_{:,l} is the l-th column of G_{m_rb,l}.

²Note that in our implementation of layered codes, m_rb can be a non-integer.
³Equivalently, N is the length of each base codeword.

To compute the MMSE combining weights v for a given layer l, and a given number of received redundancy blocks m_rb, the receiver proceeds as follows:

1. Compute (B.4):
(a) Compute the m_rb × l complex matrix product G_{m_rb,l} G_{m_rb,l}†: requires m_rb² l complex multiplications and m_rb² (l − 1) complex additions, and must be computed for each l ∈ [1, L], for a total of m_rb² L(L + 1)/2 complex multiplications and m_rb² (L − 1)L/2 complex additions.
(b) Add R_zz = σ² I_{m_rb × m_rb} to the previous result: requires m_rb real additions, and must be computed one time for fixed m_rb.
(c) Invert (B.4): requires on the order of m_rb³ complex multiplications and a like number of complex additions, and must be computed L times.

2. Compute (B.5):
(a) Requires m_rb complex multiplications, and must be computed for each of the L decoded layers.

3. Compute (B.3):
(a) This is a complex m_rb × m_rb matrix times m_rb × 1 vector multiply, which requires m_rb² complex multiplications and m_rb(m_rb − 1) complex additions, and must be computed for each layer.

The total number of real arithmetic operations required for the MMSE portion of the receiver based on layered rateless codes is summarized in Table B.1.

Table B.1: Real Arithmetic Operations Required To Compute And Apply UMMSE Combining Weights For Layered Rateless Code.

Computation | Mults. | Adds.
MMSE Weights: R_yy | (see steps 1(a)-(c) above) | (see steps 1(a)-(c) above)
MMSE Weights: R_cy | 4 L m_rb | 0
MMSE Weights: v = R_yy^{-1} R_cy | 4 L m_rb² | 2 L m_rb (m_rb − 1)
Apply MMSE: v† y | 4 L m_rb N | 2 L (m_rb − 1) N

After MMSE combining is complete, log-likelihood ratios for the even and odd indexed bits are computed as:

LLR_odd(y) = −2ν Re(v† y)    (B.6)

LLR_even(y) = −2ν Im(v† y)    (B.7)
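The combining and LLR steps in (B.1)–(B.7) can be prototyped in a few lines of NumPy. The sketch below follows the reconstruction above; the noise model, the scaling factor nu, and the toy dimensions are assumptions for illustration and do not reproduce the thesis's receiver implementation.

```python
import numpy as np

def mmse_weights(G: np.ndarray, m_rb: int, l: int, sigma2: float) -> np.ndarray:
    """MMSE combining weights for layer l (1-indexed) given m_rb received
    redundancy blocks, per (B.3)-(B.5): v = Ryy^{-1} Rcy, with
    Ryy = G_ml G_ml^H + sigma2*I and Rcy the l-th column of G_ml."""
    G_ml = G[:m_rb, :l]                                  # upper-left m_rb x l block of G
    Ryy = G_ml @ G_ml.conj().T + sigma2 * np.eye(m_rb)   # (B.4)
    Rcy = G_ml[:, l - 1]                                 # (B.5)
    return np.linalg.solve(Ryy, Rcy)                     # avoids an explicit inverse

def qpsk_llrs(v: np.ndarray, Y: np.ndarray, nu: float) -> np.ndarray:
    """Combine the m_rb x N received blocks and form odd/even-bit LLRs as in
    (B.6)-(B.7); nu stands in for the cached scaling factor."""
    z = v.conj() @ Y                                     # length-N combined sequence v^H y
    return np.stack([-2.0 * nu * z.real, -2.0 * nu * z.imag])

# Toy usage: L = M = 4 layers/blocks, N = 102 channel uses, m_rb = 3 blocks received.
rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Y = rng.standard_normal((3, 102)) + 1j * rng.standard_normal((3, 102))
v = mmse_weights(G, m_rb=3, l=1, sigma2=1.0)
llrs = qpsk_llrs(v, Y, nu=1.0)
```

Note that np.linalg.solve is used rather than forming R_yy^{-1} explicitly; the operation counts in Table B.1 correspond to the explicit-inverse formulation described in the steps above.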
For the complexity listed below, QPSK mapping of the coded bits to signals is assumed. The Turbo decoder is assumed to utilize the log-MAP algorithm, which has complexity that was analyzed in [21]. Note that after successful decoding of each of the first L − 1 layers, the k information bits must be re-encoded, which involves interleaving of the k information bits for one of the two constituent convolutional encoders which make up the Turbo encoder. Because the complexity of the Turbo encoder for successive cancellation consists only of interleaving and a number of xor and shift register operations which is linear in k, only table lookups for the interleaver are counted in the estimate of complexity. It is also assumed that the factor −2ν is cached, and therefore reading it from memory is negligible.

1. Compute (B.6) and (B.7): Requires 2 real multiplications per complex received symbol for each of the L length-N layers.

2. Log-MAP decoding: Required arithmetic complexity is reported in [21] and listed in Table B.2 for convenience. Decoding is carried out for each of the L layers.

3. Re-encoding: Requires k real table look-ups of the interleaving indices, and must be carried out for each of the first L − 1 layers.

4. Mapping the re-encoded layers to QPSK requires N complex (2N real) table look-ups (1 complex table look-up for each pair of encoded bits). Must be carried out for each of the first L − 1 decoded layers.

5. Multiply G_{:,l} by the re-encoded and re-mapped length-N complex codeword: This requires m_rb N complex multiplications for each of the first L − 1 layers.

6. Cancellation: Requires N complex additions (subtractions) per received redundancy block.

The total number of real arithmetic and logical operations (in this case, table look-ups and max(·) evaluations) for the Turbo decoder is summarized in Table B.2.

Table B.2: Real Operations Required By The Turbo Decoder To Decode The Layered Rateless Code.

Computation | Mults. | Adds. | Table LUs + Max
LLR: QPSK | 2LN | 0 | 0
Turbo Decoding: Log-MAP | 2 I_d k L (2^u (2ρ + 4)) | 2 I_d k L (2^u (2ρ + 14) − 5) | 4 I_d k L (2^{u+2} − 1)
Cancellation: Re-Encoding | 4 m_rb N (L − 1) | 0 | (k + 2N)(L − 1)
Cancellation: Cancellation | 0 | 2 m_rb (L − 1) N | 0

B.2.2 Spinal Codes

In this section we provide expressions for the computational complexity of each unique processing step for a receiver which employs spinal codes. To begin, we explain the operation of the spinal decoder and define notation. The spinal decoder is best viewed as an approximate maximum likelihood decoder; i.e., it aims to solve the optimization:

M̂ = argmin_{M ∈ A} ||y − x(M)||²    (B.8)

where x(M) is the codeword corresponding to message M, y is the received sequence, A is the set of all possible messages, and M̂ is the result of decoding. In [7] the authors explain that the spinal decoder breaks the distance in Equation (B.8) into a sum over spine values (and decoder passes):

||y − x(M)||² = Σ_{i=1}^{k/k'} Σ_{j=1}^{s_p} ||y_{i,j} − x_{i,j}(M)||²    (B.9)

where x_{i,j}(M) is the encoded symbol value for spine value i on decoder pass j for a particular hypothesis, k is the length of M in bits, k' is the number of bits (out of the k bits in M) that are used to generate each x_{i,j}(M), y_{i,j} is the noisy received version of x_{i,j}(M), and s_p is the number of passes over the spine which have been transmitted at the time of decoding. The receiver decoding complexity in the expressions to follow will also be a function of the decoder depth d, and beamwidth B (see [7]). Equation (B.9) must be computed for each decoding attempt. In what follows, we let x = x(M), and x_{i,j} = x_{i,j}(M), for convenience.

To compute (B.9), the encoder must be replayed at each symbol. This can be accomplished as follows:

1. Compute (B.9):
(a) Encode hypotheses, x: Requires (k/k' − d) 2^{dk'} hash evaluations, and (k/k' − d) 2^{dk'} s_p random number generator evaluations to generate all possible encoded symbols for each of the B branches in the beam, assuming s_p passes.
(b) Compute subtractions (y_{i,j} − x_{i,j}): Requires (k/k' − d) 2^{dk'} s_p complex subtractions for each of the B branches in the beam, assuming s_p passes.
(c) Compute products ||·||²: Requires (k/k' − d) 2^{dk'+1} s_p real multiplications, and (k/k' − d) 2^{dk'} s_p real additions, for each of the B branches in the beam, assuming s_p passes.
(d) Compute summations Σ_i Σ_j (·): Requires (k/k' − d) 2^{dk'} (s_p − 1) real additions for each of the B branches in the beam, assuming s_p passes.

2. Prune decoding tree:
(a) Prune decoding tree to beamwidth B: Requires (k/k' − d) B 2^{k'} comparisons.

The total number of real arithmetic operations, as well as other significant computing operations, is summarized in Table B.3.

Table B.3: Real Operations Required By a Spinal Code Receiver.

Computation | Mults. | Adds. | Hash + RNG + Compares
Distance Metric: x | 0 | 0 | (k/k' − d) 2^{dk'} B (s_p + 1)
Distance Metric: (y_{i,j} − x_{i,j}) | 0 | (k/k' − d) 2^{dk'+1} B s_p | 0
Distance Metric: ||·||² | (k/k' − d) 2^{dk'+1} B s_p | (k/k' − d) 2^{dk'} B s_p | 0
Distance Metric: Σ(·) | 0 | (k/k' − d) 2^{dk'} B (s_p − 1) | 0
Decode: Prune Tree | 0 | 0 | (k/k' − d) B 2^{k'}
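To make the beam-pruned evaluation of (B.9) concrete, the following sketch implements an approximate-ML decoder with beamwidth B and depth d = 1. The hash and symbol-generation functions are illustrative stand-ins (SHA-256 and a seeded NumPy generator), not the hash family or constellation mapping of [7], and the PAM-like levels are arbitrary.

```python
import hashlib
import numpy as np

def _hash(spine: int, chunk: int) -> int:
    # Illustrative stand-in for the spinal hash function of [7].
    digest = hashlib.sha256(f"{spine}:{chunk}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

def _symbol(spine: int, j: int) -> float:
    # Illustrative stand-in for the per-pass RNG and constellation mapping.
    rng = np.random.default_rng(spine + 1_000_003 * j)
    return float(rng.integers(0, 64)) - 31.5      # crude 64-level PAM-like symbol

def beam_decode(y: np.ndarray, n_chunks: int, k_prime: int, B: int, s_p: int):
    """Approximate-ML spinal decoding with beamwidth B (depth d = 1): expand
    each surviving spine by all 2^k' chunk hypotheses, accumulate the partial
    distance of (B.9), and keep the B lowest-distance paths."""
    beam = [([], 0, 0.0)]                          # (decoded chunks, spine value, distance)
    for i in range(n_chunks):
        candidates = []
        for chunks, spine, dist in beam:
            for m in range(2 ** k_prime):
                new_spine = _hash(spine, m)
                d_inc = sum((y[i, j] - _symbol(new_spine, j)) ** 2 for j in range(s_p))
                candidates.append((chunks + [m], new_spine, dist + d_inc))
        beam = sorted(candidates, key=lambda c: c[2])[:B]   # prune to beamwidth B
    return beam[0][0]                              # chunk sequence with smallest distance

# Example shape: y holds one real received value per spine position and pass, e.g.
#   y = np.zeros((16, 1)); beam_decode(y, n_chunks=16, k_prime=4, B=64, s_p=1)
```

Per spine position the decoder scores B·2^{k'} candidate extensions over s_p passes, which matches the operation counts in Table B.3 up to constant factors.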
B.2.3 Simulation Aided Complexity Analysis

In this section we present graphical results based on the expressions presented in Sections B.2.1 and B.2.2. All results in this section were generated by averaging over Monte Carlo simulation results for each of the rateless codes of interest. The Monte Carlo data was substituted into the expressions given in the previous sections of this appendix, and average receiver complexity was computed. Prior to presenting the results, we disclose the values of the relevant constant parameters for each code.

The layered code is configured with L = 7 layers and up to M = 7 redundancy blocks, and uses the rate 1/3, k = 6144 information bit LTE turbo code with QPSK constellation as the base code for each layer. The receiver was configured to execute partial redundancy block decoding [6] with n_IR = 230. The spinal code is configured for k = 256 information bits, using k' = 4 bits to generate each spine value, beamwidth B = 256, and depth d = 1.

The average number of receiver operations per goodbit for each code is plotted over a wide range of SNRs in Figure B-3. Note that the results in Figure B-3 are not presented so that the spinal and layered codes in their stated configurations can be directly compared (indeed, it is not a fair comparison in this configuration, as the information block lengths are vastly different). Instead, Figure B-3 is presented to illustrate the method which was used to generate the results given in Table 4.3.
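As a small illustration of how the per-trial operation counts and the goodbit definition (footnote 7) combine into the plotted metric, consider the sketch below; the operation total, message count, and error rate are hypothetical placeholders, not values from the thesis's simulations.

```python
def ops_per_goodbit(total_ops: float, k: int, n_messages: int, bit_error_rate: float) -> float:
    """Average receiver operations per correctly decoded bit: the total operation
    count divided by the number of goodbits, K*(1 - p_e) with K = k*n_messages."""
    goodbits = k * n_messages * (1.0 - bit_error_rate)
    return total_ops / goodbits

# Hypothetical numbers, for illustration only.
print(f"{ops_per_goodbit(3.5e11, k=1024, n_messages=60_000, bit_error_rate=1e-3):.3e}")
```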
Figure B-3: Average receiver operations per goodbit for spinal and layered rateless codes.

Bibliography

[1] X. Liu, X. Wu, and C. Zhao, "Shortening for irregular QC-LDPC codes," Communications Letters, IEEE, vol. 13, no. 8, pp. 612-614, August 2009.

[2] F. Babich, G. Montorsi, and F. Vatta, "Design of rate-compatible punctured turbo (RCPT) codes," in Communications, 2002. ICC 2002. IEEE International Conference on, vol. 3, 2002, pp. 1701-1705.

[3] N. B. Chang, "Rate Adaptive Non-Binary LDPC Codes with Low Encoding Complexity," Nov. 2011.

[4] M. Luby, "LT codes," in Proceedings of the 43rd Symposium on Foundations of Computer Science, ser. FOCS '02. Washington, DC, USA: IEEE Computer Society, 2002, pp. 271-. [Online]. Available: http://dl.acm.org/citation.cfm?id=645413.652135

[5] R. J. Barron, C. K. Lo, and J. M.
Shapiro, "Global design methods for raptor codes using binary and higher-order modulations," in Proceedings of the 28th IEEE conference on Military communications, ser. MILCOM'09. Piscataway, NJ, USA: IEEE Press, 2009, pp. 746-752. [Online]. Available: http://d.acm.org/citation.cfm?id=1856821.1856930 [6] U. Erez, M. D. Trott, and G. W. Wornell, "Rateless Coding for Gaussian Channels," IEEE Trans. Inf. Theory, vol. 58, no. 10, pp. 530-547, Feb. 2012. [7] J. Perry, P. Iannucci, K. E. Fleming, H. Balakrishnan, and D. Shah, "Spinal Codes," in A CM SIGCOMM, Helsinki, Finland, August 2012. [8] D. L. Romero, N. B. Chang, and A. R. Margetts, "Practical Non-Binary Rateless Codes for Wireless Channels," in Forty Seventh Asilomar Conference on Signals, Systems and Computers (ASILOMAR), November 2013. [9] 3rd Generation Partnership Project (3GPP). (2012) LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); Medium Access Control (MAC) Protocol Specification. 3GPP TS 36.321. 3rd Generation Partnership Project. [Online]. Available: http://www.3gpp.org/ftp/Specs/html-info/36321.htm [10] A. Shokrollahi, "Raptor codes," IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2551-2567, June 2006. [11] A. R. Williamson, T.-Y. Chen, and R. D. Wesel, "Reliability-based error detection for feedback communication with low latency," CoRR, vol. abs/1305.4560, 2013. 113 [12] J. Wolf and I. Blakeney, R.D., "An exact evaluation of the probability of undetected error for certain shortened binary crc codes," in Military Communications Conference, 1988. MILCOM 88, Conference record. 21st Century Military Communications - What's Possible? 1988 IEEE, Oct 1988, pp. 287-292 vol.1. [13] T. Cover and J. Thomas, Elements of Information Theory. Hoboken, NJ: John Wiley Sons, Inc., 2006. [14] Y. Polyanskiy, H. V. Poor, and S. Verddl, "Feedback in the non-asymptotic regime," IEEE Trans. Inf. Theory, vol. 57, no. 8, pp. 4903-4925, 2011. [15] A. Gudipati and S. Katti, "Strider: Automatic Rate Adaptation and Collision Handling," in SIGCOMM, Aug. 2011. [16] J. Perry. (2013, June) libwireless. [Online]. Available: http://www.yonch.com/wireless [17] P. Iannucci, J. Perry, H. Balakrishnan, and D. Shah, "No Symbol Left Behind: A Link-Layer Protocol for Rateless Codes," in ACM MobiCom, Istanbul, Turkey, August 2012. [18] 3rd Generation Partnership Project (3GPP). (2012) LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and Channel Coding. 3GPP TS 36.212. 3rd Generation Partnership Project. [Online]. Available: http://www.3gpp.org/ftp/Specs/htmlinfo/36212.htm [19] P. A. Iannucci, K. E. Fleming, J. Perry, H. Balakrishnan, and D. Shah, "A hardware spinal decoder," in Proceedings of the Eighth ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ser. ANCS '12. New York, NY, USA: ACM, 2012, pp. 151-162. [Online]. Available: http://doi.acm.org/10.1145/2396556.2396593 [20] A. R. Williamson, T.-Y. Chen, and R. D. Wesel, "A rate-compatible sphere-packing analysis of feedback coding with limited retransmissions," CoRR, vol. abs/1202.1458, 2012. [21] P.-Y. Wu, "On the complexity of turbo decoding algorithms," in Vehicular Technology Conference, 2001. VTC 2001 Spring. IEEE VTS 53rd, vol. 2, 2001, pp. 1439-1443 vol.2. 114 This work is sponsored in part by the Department of the Air Force under Air Force Contract #FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government. 115