© Copyright 2007, Atri Rudra

List Decoding and Property Testing of Error Correcting Codes

Atri Rudra

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, University of Washington, 2007.

Program Authorized to Offer Degree: Computer Science and Engineering

University of Washington Graduate School

This is to certify that I have examined this copy of a doctoral dissertation by Atri Rudra and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made.

Chair of the Supervisory Committee: Venkatesan Guruswami

Reading Committee: Paul Beame, Venkatesan Guruswami, Dan Suciu

Date:

In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with “fair use” as prescribed in the U.S. Copyright Law. Requests for copying or reproduction of this dissertation may be referred to ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346, 1-800-521-0600, to whom the author has granted “the right to reproduce and sell (a) copies of the manuscript in microform and/or (b) printed copies of the manuscript made from microform.”

Signature

Date

University of Washington

Abstract

List Decoding and Property Testing of Error Correcting Codes

Atri Rudra

Chair of the Supervisory Committee: Associate Professor Venkatesan Guruswami, Department of Computer Science and Engineering

Error correcting codes systematically introduce redundancy into data so that the original information can be recovered when parts of the redundant data are corrupted. Error correcting codes are used ubiquitously in communication and data storage.
The process of recovering the original information from corrupted data is called decoding. Given the limitations imposed by the amount of redundancy used by the error correcting code, an ideal decoder should efficiently recover from as many errors as is information-theoretically possible. In this thesis, we consider two relaxations of the usual decoding procedure: list decoding and property testing.

A list-decoding algorithm is allowed to output a small list of possibilities for the original information that could have resulted in the given corrupted data. This relaxation allows for efficient correction of significantly more errors than is possible with the usual decoding procedure, which is constrained to output only the transmitted information. We present the first explicit error correcting codes, along with efficient list-decoding algorithms, that can correct a number of errors approaching the information-theoretic limit. This meets one of the central challenges in the theory of error correcting codes. We also present explicit codes defined over smaller symbols that, using efficient list-decoding algorithms, can correct significantly more errors than existing codes while using the same amount of redundancy. We prove that an existing algorithm for a specific code family called Reed-Solomon codes is optimal for “list recovery,” a generalization of list decoding.

Property testing of error correcting codes entails “spot checking” corrupted data to quickly determine whether the data is very corrupted or has few errors. Such spot checkers are closely related to the beautiful theory of Probabilistically Checkable Proofs. We present spot checkers that access only a nearly optimal number of data symbols for an important family of codes called Reed-Muller codes. Our results are the first for certain classes of such codes.
We define a generalization of the “usual” testers for error correcting codes by endowing them with the very natural property of “tolerance,” which allows slightly corrupted data to pass the test.

TABLE OF CONTENTS

List of Figures
List of Tables

Chapter 1: Introduction
1.1 Basics of Error Correcting Codes
1.1.1 Historical Background and Modeling the Channel Noise
1.2 List Decoding
1.2.1 Going Beyond Half the Distance Bound
1.2.2 Why is List Decoding Any Good?
1.2.3 The Challenge of List Decoding (and What Was Already Known)
1.3 Property Testing of Error Correcting Codes
1.3.1 A Brief History of Property Testing of Codes
1.4 Contributions of This Thesis
1.4.1 List Decoding
1.4.2 Property Testing
1.4.3 Organization of the Thesis

Chapter 2: Preliminaries
2.1 The Basics
2.1.1 Basic Definitions for Codes
2.1.2 Code Families
2.1.3 Linear Codes
2.2 Preliminaries and Definitions Related to List Decoding
2.2.1 Rate vs. List Decodability
2.2.2 Results Related to the q-ary Entropy Function
2.3 Definitions Related to Property Testing of Codes
2.4 Common Families of Codes
2.4.1 Reed-Solomon Codes
2.4.2 Reed-Muller Codes
2.5 Basic Finite Field Algebra

Chapter 3: List Decoding of Folded Reed-Solomon Codes
3.1 Introduction
3.2 Folded Reed-Solomon Codes
3.2.1 Description of Folded Reed-Solomon Codes
3.2.2 Why Might Folding Help?
3.2.3 Relation to Parvaresh-Vardy Codes
3.3 Problem Statement and Informal Description of the Algorithms
3.4 Trivariate Interpolation Based Decoding
3.4.1 Facts about Trivariate Interpolation
3.4.2 Using Trivariate Interpolation for Folded RS Codes
3.4.3 Root-finding Step
3.5 Codes Approaching List Decoding Capacity
3.6 Extension to List Recovery
3.7 Bibliographic Notes and Open Questions

Chapter 4: Results via Code Concatenation
4.1 Introduction
4.1.1 Code Concatenation and List Recovery
4.2 Capacity-Achieving Codes over Smaller Alphabets
4.3 Binary Codes List Decodable up to the Zyablov Bound
4.4 Unique Decoding of a Random Ensemble of Binary Codes
4.5 List Decoding up to the Blokh-Zyablov Bound
4.5.1 Multilevel Concatenated Codes
4.5.2 Linear Codes with Good Nested List Decodability
4.5.3 List Decoding Multilevel Concatenated Codes
4.5.4 Putting it Together
4.6 Bibliographic Notes and Open Questions

Chapter 5: List Decodability of Random Linear Concatenated Codes
5.1 Introduction
5.2 Preliminaries
5.3 Overview of the Proof Techniques
5.4 List Decodability of Random Concatenated Codes
5.5 Using Folded Reed-Solomon Code as Outer Code
5.5.1 Preliminaries
5.5.2 The Main Result
5.6 Bibliographic Notes and Open Questions

Chapter 6: Limits to List Decoding Reed-Solomon Codes
6.1 Introduction
6.2 Overview of the Results
6.2.1 Limitations to List Recovery
6.2.2 Explicit “Bad” List Decoding Configurations
6.2.3 Proof Approach
6.3 BCH Codes and List Recovering Reed-Solomon Codes
6.3.1 Main Result
6.3.2 Implications for Reed-Solomon List Decoding
6.3.3 Implications for List Recovering Folded Reed-Solomon Codes
6.3.4 A Precise Description of Polynomials with Values in Base Field
6.3.5 Some Further Facts on BCH Codes
6.4 Explicit Hamming Balls with Several Reed-Solomon Codewords
6.4.1 Existence of Bad List Decoding Configurations
6.4.2 Low Rate Reed-Solomon Codes
6.4.3 High Rate Reed-Solomon Codes
6.5 Bibliographic Notes and Open Questions

Chapter 7: Local Testing of Reed-Muller Codes
7.1 Introduction
7.1.1 Connection to Coding Theory
7.1.2 Overview of Our Results
7.1.3 Overview of the Analysis
7.2 Preliminaries
7.2.1 Facts from Finite Fields
7.3 Characterization of Low Degree Polynomials
7.4 A Tester for Low Degree Polynomials
7.4.1 The Tester
7.4.2 Analysis of the Test Algorithm
7.4.3 Proof of Lemma 7.11
7.4.4 Proof of Lemma 7.12
7.4.5 Proof of Lemma 7.13
7.5 A Lower Bound and Improved Self-correction
7.5.1 A Lower Bound
7.5.2 Improved Self-correction
7.6 Bibliographic Notes

Chapter 8: Tolerant Locally Testable Codes
8.1 Introduction
8.2 Preliminaries
8.3 General Observations
8.4 Tolerant Testers for Binary Codes
8.4.1 PCP of Proximity
8.4.2 The Code
8.5 Product of Codes
8.5.1 Tolerant Testers for Tensor Products of Codes
8.5.2 Robust Testability of Product of Codes
8.6 Tolerant Testing of Reed-Muller Codes
8.6.1 Bivariate Polynomial Codes
8.6.2 General Reed-Muller Codes
8.7 Bibliographic Notes and Open Questions

Chapter 9: Concluding Remarks
9.1 Summary of Contributions
9.2 Directions for Future Work
9.2.1 List Decoding
9.2.2 Property Testing

Bibliography

LIST OF FIGURES

1.1 Bad Example for Unique Decoding
1.2 The Case for List Decoding
2.1 The q-ary Entropy Function
3.1 Rate vs. List Decodability for Various Codes
3.2 Folding of the Reed-Solomon Code
3.3 Relation between Folded RS and Parvaresh-Vardy Codes
4.1 List Decodability of Our Binary Codes
4.2 Capacity-Achieving Code over Small Alphabet
4.3 Unique Decoding of Random Ensembles of Binary Codes
4.4 Different Variables in the Proof of Theorem 4.5
5.1 Geometric Interpretations of Functions
7.1 Definition of a Pseudoflat
8.1 The Reduction from Valiant’s Result

LIST OF TABLES

4.1 Comparison of the Blokh-Zyablov and Zyablov Bounds

ACKNOWLEDGMENTS

I moved to Seattle from Austin to work with Venkat Guruswami and I have never regretted my decision. This thesis is a direct result of Venkat’s terrific guidance over the last few years. Venkat was very generous with his time and was always patient when I told him about my crazy ideas (most of the time gently pointing out why they would not work), for which I am very grateful. Most of the results in this thesis are the outcome of our collaborations. I am also indebted to Don Coppersmith, Charanjit Jutla, Anindya Patthak and David Zuckerman for collaborations that led to portions of this thesis.
I would like to thank Paul Beame and Dan Suciu for agreeing to be on my reading committee and for their thoughtful and helpful comments on the earlier drafts of this thesis. Thanks also to Anna Karlin for being on my supervisory committee. I gratefully acknowledge the financial support from NSF grant CCF-0343672 and Venkat Guruswami’s Sloan Research Fellowship.

My outlook on research has changed dramatically since my senior year at IIT Kharagpur, when I was decidedly a non-theory person. Several people have contributed to this wonderful change and I am grateful to all of them. First, I would like to thank my undergraduate advisor P. P. Chakrabarti for teaching the wonderful course on Computational Complexity that opened my eyes to the beauty of theoretical computer science. My passion for theory was further cemented in the two wonderful years I spent at IBM India Research Lab. Thanks to Charanjit Jutla and Vijay Kumar for nurturing my nascent interest in theoretical computer science and for being wonderful mentors and friends ever since. Finally, I am grateful to David Zuckerman for his wonderful courses on Graph Theory and Combinatorics and Pseudorandomness at Austin, which gave me the final confidence to pursue theory.

I have been very lucky to have the privilege of collaborating with many wonderful researchers over the years. Thanks to all the great folks at IBM Research with whom I had the pleasure of working as a full-time employee (at IBM India) as well as an intern (at IBM Watson and IBM Almaden). Thanks in particular to Nikhil Bansal, Don Coppersmith, Pradeep Dubey, Lisa Fleischer, Rahul Garg, T. S. Jayram, Charanjit Jutla, Robi Krauthgamer, Vijay Kumar, Aranyak Mehta, Vijayshankar Raman, J. R. Rao, Pankaj Rohatgi, Baruch Schieber, Maxim Sviridenko and Akshat Verma for the wonderful time I had during our collaborations.
My stay at UT Austin was wonderful and I am grateful to Anna Gál, Greg Plaxton and David Zuckerman for their guidance and our chats about sundry research topics. Thanks a bunch to my roommate and collaborator Anindya Patthak for the wonderful times. Also a big thanks to my friends in Austin for the best possible first one and a half years I could have had in the US: Kartik, Sridhar, Peggy, Chris, Kurt, Peter, Walter, Nick, Maria.

I thought it would be hard to beat the friendly atmosphere at UT, but things were just as nice (if not better) at UW. A big thanks to Anna Karlin and Paul Beame for all their help and advice as well as the good times we had during our research collaborations. Special thanks to Paul and Venkat for their kind and helpful advice on the tricky situations that cropped up during my job search. I had a great time collaborating with my fellow students at UW (at theory night and otherwise): Matt Cary, Ning Chen, Neva Cherniavsky, Roee Engelberg, Thach Nguyen, Prasad Raghavendra, Ashish Sabharwal, Gyanit Singh and Erik Vee. Thanks to my office mates Eytan Adar and Chris Ré for our enjoyable chats. The CSE department at UW is an incredibly wonderful place to work; thanks to everyone for making my stay at UW as much fun as it has been.

Thanks to my parents and sister for being supportive of all my endeavors. Finally, the best thing about my move to Seattle was that I met my wife here. Carole, thanks for everything: you are more than what this desi could have dreamed of in a wife.

DEDICATION

To my family, in the order I met them: Ma, Baba, Purba and Carole

Chapter 1

INTRODUCTION

Corruption of data is a fact of life. Error-correcting codes (or just codes) are clever ways of representing data so that one can recover the original information even if parts of it are corrupted.
The basic idea is to judiciously introduce redundancy so that the original information can be recovered even when parts of the (redundant) data have been corrupted.

Perhaps the most natural and common application of error correcting codes is for communication. For example, when packets are transmitted over the Internet, some of the packets get corrupted or dropped. To deal with this, multiple layers of the TCP/IP stack use a form of error correction called CRC Checksum [87]. Codes are used when transmitting data over the telephone line or via cell phones. They are also used in deep space communication and in satellite broadcast (for example, TV signals are transmitted via satellite). Codes also have applications in areas not directly related to communication. For example, codes are used heavily in data storage. CDs and DVDs work fine even in the presence of scratches precisely because they use codes. Codes are used in Redundant Array of Inexpensive Disks (RAID) [24] and error correcting memory [23]. Codes are also deployed in other applications such as paper bar codes, for example, the bar code used by UPS called MaxiCode [22].

In this thesis, we will think of codes in the communication scenario. In this framework, there is a sender who wants to send (say) k message symbols over a noisy channel. The sender first encodes the k message symbols into n symbols (called a codeword) and then sends it over the channel. The receiver gets a received word consisting of n symbols. The receiver then tries to decode and recover the original k message symbols. The main challenge in coding theory is to come up with “good” codes along with efficient encoding and decoding algorithms. In the next section, we will define more precisely the notion of codes and the noise model. Typically, the definition of a code gives the encoding algorithm “for free.” The decoding procedure is generally the more challenging algorithmic task. In this thesis, we concentrate more on the decoding aspect of the problem.
In particular, we will consider two relaxations of the “usual” decoding problem, in which the algorithm either outputs the original message that was sent or gives up (when too many errors have occurred). The two relaxations are called list decoding and property testing. The motivations for considering these two notions of decoding are different: list decoding is motivated by a well known limit on the number of errors one can decode from using the usual notion of decoding, while property testing is motivated by a notion of “spot-checking” of received words that has applications in complexity theory. Before we delve into more details of these notions, let us first review the basic definitions that we will need.

1.1 Basics of Error Correcting Codes

We will now discuss some of the basic notions of error correcting codes that are needed to put forth the contributions of this thesis.1 These are the following.

Encoding. The encoding function with parameters k, n is a function E : Σ^k → Σ^n, where Σ is called the alphabet. The encoding function E takes a message m ∈ Σ^k and converts it into a codeword E(m). We will refer to the algorithm that implements the encoding function as an encoder.

Error Correcting Code. An error correcting code, or just a code, corresponding to an encoding function E is just the image of the encoding function. In other words, it is the collection of all the codewords. A code C with encoding function E : Σ^k → Σ^n is said to have dimension k and block length n. In this thesis, we will focus on codes of large block length.

Rate. The ratio R = k/n is called the rate of a code. This notion captures the amount of redundancy used in the code. This is an important parameter of a code which will be used throughout this thesis.

Decoding. Consider the basic setup for communication. A sender has a message that it sends as a codeword after encoding. During transmission the codeword gets distorted due to errors.
The receiver gets a noisy received word from which it has to recover the original message. This “reverse” process of encoding is achieved via a decoding function D : Σ^n → Σ^k. That is, given a received word, the decoding function picks a message that it thinks was the message that was sent. We will refer to the algorithm that implements the decoding function as a decoder.

Distance. The minimum distance (or just distance) of a code is a parameter that captures how much two different codewords differ. More formally, the distance between any two codewords is the number of coordinates in which they differ. The (minimum) distance of a code is the minimum distance between any two distinct codewords in the code.

1 We will define some more “advanced” notions later.

1.1.1 Historical Background and Modeling the Channel Noise

The notions of encoding, decoding and the rate appeared in the seminal work of Shannon [94]. The notions of codes and the minimum distance were put forth by Hamming [67]. Shannon modeled the noise probabilistically. For such a channel, he defined a real number called the capacity, which is an upper bound on the rate of a code for which one can have reliable communication. Shannon also proved that this bound is achievable: for any rate less than the capacity of the channel, there exist codes with which one can have reliable communication. This striking result essentially kick-started the fields of information theory and coding theory.

Perhaps an undesirable aspect of Shannon’s noise model is that its effectiveness depends on how well the noise is modeled. In some cases it might not be possible to accurately model the channel. In such a scenario, one option is to model the noise adversarially. This was proposed by Hamming. In Hamming’s noise model, we think of the channel as an adversary who has full freedom in picking the location as well as the nature of the errors to be introduced. The only restriction is on the number of errors.
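To make the distance definitions above concrete, here is a short sketch that computes the Hamming distance between words and the minimum distance of a code; the repetition code used as the example is purely illustrative and not a code studied in this thesis.

```python
from itertools import combinations

def hamming_distance(u, v):
    """Number of coordinates in which two words differ."""
    assert len(u) == len(v)
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(code):
    """Minimum distance between any two distinct codewords."""
    return min(hamming_distance(c1, c2) for c1, c2 in combinations(code, 2))

# Toy example: the binary repetition code of block length 5
# (k = 1, n = 5, rate 1/5). Its two codewords differ everywhere,
# so its minimum distance is 5.
rep5 = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
print(minimum_distance(rep5))  # 5
```

Note that this brute-force computation examines all pairs of codewords, which is only feasible for tiny codes; it is meant to illustrate the definitions, not to be used at scale.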
We will consider this noise model in the thesis.

Alphabet Size and the Noise Model. We would like to point out that the noise model is intimately tied to the alphabet. A symbol in the alphabet is the “atomic” unit on which the noise acts. In other words, a symbol that is fully corrupted and a symbol that is partially corrupted are treated as the same. That is, the smaller the size of the alphabet, the more fine-grained the noise. This implies that the decoder has to take care of more error patterns for a code defined over a smaller alphabet. As a concrete example, say we want to design a decoder that can handle 50% of errors. Consider a code C that is defined over an alphabet of size four (i.e., each symbol consists of two bits). Now, let e be an error pattern in which every alternate bit of a codeword in C is flipped. Note that this implies that all the symbols of the codeword have been corrupted, and hence the decoder does not need to recover from e. However, if C were defined over the binary alphabet, then the decoder would have to recover from e. Thus, it is harder to design decoders for codes over smaller alphabets.

Further, the noise introduced by the channel should be independent of the message length. However, in this thesis, we will study codes that are defined over alphabets whose size depends on the message length. In particular, the number of bits required to represent any symbol in the alphabet will be logarithmic in the message length. The reason for this is two-fold. First, as discussed in the paragraph above, designing decoding algorithms is strictly easier for codes over larger alphabets. Second, we will use such codes as a starting point to design codes over fixed-size alphabets.

With the basic definitions in place, we now turn our attention to the two relaxations of the decoding procedure that will be the focus of this thesis.

1.2 List Decoding

Let us look at the decoding procedure in more detail.
Upon getting the noisy received word, the decoder has to output a message (or equivalently a codeword) that it thinks was actually transmitted. If the output message is different from the message that was actually transmitted then we say that a decoding error has taken place. For the first part of the thesis, we will consider decoders that do not make any decoding error. Instead, we will consider the following notion called unique decoding. For any received word, a unique decoder either outputs the message that was transmitted by the sender or reports a decoding failure.

One natural question to ask is how many errors such a unique decoder can tolerate. That is, is there a bound on the number of errors (say ρn, so ρ is the fraction of errors) such that for any error pattern with at most ρn errors in total, the decoder always outputs the transmitted codeword? We first argue that ρ ≤ 1 − R. Note that a codeword of n symbols really contains only k symbols of information. Thus, the receiver should have at least k uncorrupted symbols among the n symbols in the received word to have any hope of recovering the transmitted message. In other words, the information-theoretic limit on the number of errors from which one can recover is n − k. This implies that ρ ≤ (n − k)/n = 1 − R. Can this information-theoretic limit be achieved?

Before answering the question above, we argue that the limit must also satisfy ρ < d/(2n), where we assume that the distance d of the code is even. Consider two distinct messages m1, m2 such that the distance between E(m1) and E(m2) is exactly d. Now say that the sender sends the codeword E(m1) over the channel, and the channel introduces d/2 errors and distorts the codeword into a received word y that is at a distance of d/2 from both E(m1) and E(m2) (see Figure 1.1).
Now, when the decoder gets y as an input, it has no way of knowing whether the original transmitted codeword was E(m1) or E(m2).2 Thus, the decoder has to output a decoding failure when it receives y, and so we have ρ < d/(2n). How far is d/(2n) from the information-theoretic bound of 1 − R? Unfortunately, the gap is quite big. By the so-called Singleton bound, d ≤ n − k + 1, or d/n ≤ 1 − R + 1/n. Thus, the limit of d/(2n) is at most (essentially) half the information-theoretic bound. We note that even though the limits differ by “only a small constant,” in practice the potential to correct twice the number of errors is a big gain.

Before we delve further into this gap between the information-theoretic limit and the half-the-distance bound, we next argue that the bound of d/2 is in fact tight in the following sense. If ρn ≤ d/2 − 1, then for an error pattern with at most ρn errors, there is always a unique transmitted codeword. Suppose that this were not true, and let E(m1) be the transmitted codeword and y be a received word such that y is within distance ρn of both E(m1) and E(m2). Then by the triangle inequality, the distance between E(m1) and E(m2) is at most 2ρn ≤ d − 2 < d, which contradicts the fact that d is the minimum distance of the code (also see Figure 1.1).

2 Throughout this thesis, we will be assuming that the only communication between the sender and the receiver is through the channel and that they do not share any side information/channel.

Figure 1.1: Bad example for unique decoding. The picture on the left shows two codewords E(m1) and E(m2) that differ in exactly d positions, while the received word y differs from both E(m1) and E(m2) in d/2 positions. The picture on the right is another view of the same example: every n-symbol vector is drawn as a point, and the distance between any two points is the number of positions in which they differ. Thus, E(m1) and E(m2) are at distance d, and y is at distance d/2 from both. Further, note that any point that is strictly contained within one of the balls of radius d/2 has a unique close-by codeword.

Thus, as long as ρn ≤ d/2 − 1, the decoder can output the transmitted codeword. So if one wants to do unique decoding, then one can correct up to half the distance of the code (but no further). Due to this “half the distance barrier,” much effort has been devoted to designing codes with as large a distance as possible.

However, all the discussion above has not addressed one important aspect of decoding. We argued that for ρn ≤ d/2 − 1, there exists a unique transmitted codeword. However, the argument sheds no light on whether the decoder can find such a codeword efficiently. Of course, before we can formulate the question more precisely, we need to state what we mean by efficient decoding. We will formulate the notion more formally later on, but for now we will say that a decoder is efficient if its running time is polynomial in the block length of the code (which is the number of symbols in the received word). As a warm up, let us consider the following naive decoding algorithm: the decoder goes through all the codewords in the code and outputs the codeword that is closest to the received word. The problem with this brute-force algorithm is that its running time is exponential in the block length for constant-rate codes (which will be the focus of the first part of the thesis), and thus it is not an efficient algorithm. There is a rich body of beautiful work that focuses on designing efficient unique-decoding algorithms for many families of codes. These are discussed in detail in standard coding theory texts such as [80, 104].

We now return to the gap between half the distance and the information-theoretic limit of 1 − R.

1.2.1 Going Beyond Half the Distance Bound

Let us revisit the bound of half the minimum distance on unique decoding.
The bound follows from the fact that there exists an error pattern for which one cannot do unique decoding. However, such bad error patterns are rare. This follows from the nature of the space that the codewords (and the received words) “sit” in. In particular, one can think of a code of block length n as consisting of non-overlapping spheres of radius d/2, where the codewords are the centers of the spheres (see Figure 1.2). The argument for the half the distance bound uses the fact that at least two such spheres touch. The touching point corresponds to the received word y that was used in the argument in the last section. However, because of the way the spheres pack in high dimension (recall the dimension of such a space is equal to the block length n of the code), almost every point in the ambient space has a unique close-by codeword at distances well beyond d/2 (see Figure 1.2).

Figure 1.2: Four close-by codewords E(m1), E(m2), E(m3) and E(m4) with two possible received words y and y′. E(m1), E(m2) and y form the bad example of Figure 1.1. However, the bad examples lie on the dotted lines. For example, y′ is at a distance more than d/2 from its (unique) closest codeword. In high dimension, the space outside the balls of radius d/2 contains almost the entire ambient space.

Thus, by insisting on always getting back the original codeword, we are giving up on correcting error patterns from which we can recover the original codeword. One natural question one might ask is if one can somehow meaningfully relax this stringent constraint. In the late 1950s, Elias and Wozencraft independently proposed a nice relaxation of unique decoding that gets around the barrier of the half the distance bound [34, 106].
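The geometric picture above can be explored empirically. The following toy experiment (our own illustrative sketch with made-up parameters, not a construction from this thesis) picks a small random binary code and estimates how often a uniformly random received word has a strictly unique closest codeword, which the brute-force nearest-codeword decoder would then recover:

```python
import random
from itertools import combinations

def hamming(u, v):
    """Hamming distance: number of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v))

random.seed(1)
n, M = 15, 8  # toy block length and number of codewords
code = [tuple(random.randrange(2) for _ in range(n)) for _ in range(M)]
# Minimum distance of the toy code, by exhaustive pairwise comparison.
d = min(hamming(c1, c2) for c1, c2 in combinations(code, 2))

trials, unique = 2000, 0
for _ in range(trials):
    y = tuple(random.randrange(2) for _ in range(n))
    dists = sorted(hamming(c, y) for c in code)
    if dists[0] < dists[1]:  # strictly unique closest codeword
        unique += 1

print(f"minimum distance d = {d}")
print(f"fraction of received words with a unique closest codeword: "
      f"{unique / trials:.2f}")
```

As the block length grows, this fraction is typically large, matching the intuition that the bad configurations of Figure 1.1 are rare.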
Under list decoding, the (list) decoder needs to output a “small” list of answers with the guarantee that the transmitted codeword is present in the list.3 More formally, for a given error bound e and a received word y, the list-decoding algorithm has to output all codewords that are at a distance at most e from y. Note that when e is an upper bound on the number of errors that can be introduced by the channel, the list returned by the list-decoding algorithm will contain the transmitted codeword. There are two immediate questions that arise: (i) Is list decoding a useful relaxation of unique decoding? (ii) Can we correct a number of errors that is close to the information theoretic limit using list decoding? Before we address these questions, let us first concentrate on a new parameter that this new definition throws into the mix: the worst case list size. Unless mentioned otherwise, we will use L to denote this parameter. Note that the running time of the decoding algorithm is at least L, as the decoder has to output every codeword in the list. Since we are interested in efficient, polynomial time, decoding algorithms, this puts an a priori requirement that L be a polynomial in the block length of the code. For a constant rate code, which has exponentially many codewords, the polynomial bound on L is very small compared to the total number of codewords. This bound is what we meant by small lists while defining list decoding.

Maximum Likelihood Decoding

We would like to point out that list decoding is not the only meaningful relaxation of unique decoding. Another relaxation called maximum likelihood decoding (or MLD) has been extensively studied in coding theory. Under MLD, the decoder must output the codeword that is closest to the received word. Note that if the number of errors is at most (d − 1)/2, then MLD and unique decoding coincide. Thus, MLD is indeed a generalization of unique decoding. MLD and list decoding are incomparable relaxations.
On the one hand, if one can list decode efficiently up to the maximum number of errors that the channel can introduce, then one can do efficient MLD. On the other hand, MLD does not put any restriction on the number of errors it needs to tolerate (whereas such a restriction is necessary for efficient list decoding). The main problem with MLD is that it turns out to be computationally intractable in general [17, 79, 4, 31, 37, 91] as well as for specific families of codes [66]. In fact, there is no non-trivial family of codes known for which MLD can be done in polynomial time. However, list decoding is computationally tractable for many interesting families of codes (some of which we will see in this thesis). We now turn to the questions that we raised about list decoding.

3 The condition on the list size being small is important. Otherwise, here is a trivial list-decoding algorithm: output all codewords in the code. This, however, is a very inefficient and, more pertinently, a useless algorithm. We will specify more carefully what we mean by small lists soon.

1.2.2 Why is List Decoding Any Good?

We will now devote some time to address the question of whether list decoding is a meaningful relaxation of the unique decoding problem. Further, what does one do when the decoder outputs a list? In the communication setup, where the receiver might not have any side information, the receiver can still use a list-decoding algorithm to do “normal” decoding. It runs the list-decoding algorithm on the received word. If the list returned has just one codeword in it, then the receiver accepts that codeword as the transmitted codeword. If the list has more than one codeword, then it declares a decoding failure. First we note that this is no worse than the original unique decoding setup. Indeed, if the number of errors is at most d/2 − 1, then by the discussion in Section 1.2 the list is going to contain exactly one codeword and we would be back in the unique decoding regime.
However, as was argued in the last section, for most error patterns (with total number of errors well beyond d/2) there is a unique close-by codeword. In other words, the list size for such error patterns would be one. Thus, list decoding allows us to correct more error patterns than was possible with unique decoding. We now return to the question of whether list decoding can allow us to correct errors up to the information theoretic limit of 1 − R. In short, the answer is yes. Using random coding arguments one can show that for any ε > 0, with high probability a random code of rate R has the potential to correct up to a 1 − R − ε fraction of errors with a worst case list size of O(1/ε) (see Chapter 2 for more details). Further, one can show that for such codes, the list size is one for most received words.4

Other Applications of List Decoding

In addition to the immense practical potential of correcting more than half the distance number of errors in the communication setup, list decoding has found many surprising applications outside of the coding theory domain. The reader is referred to the survey by Sudan [98] and the thesis of Guruswami [49] (and the references therein) for more details on these applications. A key feature in all these applications is that there is some side information that one can use to sift through the list returned by the list-decoding algorithm to pick the “correct” codeword. A good analogy is that of a spell checker. Whenever a word is misspelt, the spell checker returns to the user a list of possible words that the user might have intended to use. The user can then prune this list to choose the word that he or she had intended to use. Indeed, even in the communication setup, if the sender and the receiver can use a side channel (or have some shared information) then one can use list decoding to do “unambiguous” decoding [76].

4 This actually follows using the same arguments that Shannon used to establish his seminal result.
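The list-decoding task discussed in this section can be written down directly as a brute-force procedure (a toy sketch: it scans the whole code and is therefore exponential time for constant rate codes, which is exactly why the algorithmic results of this thesis are needed):

```python
def hamming(u, v):
    """Hamming distance between two equal-length vectors."""
    return sum(a != b for a, b in zip(u, v))

def list_decode(code, y, e):
    """Return all codewords within Hamming distance e of the received word y."""
    return [c for c in code if hamming(c, y) <= e]

# Toy code: the binary repetition code of block length 4 (distance d = 4,
# so unique decoding handles only 1 error).
code = [(0, 0, 0, 0), (1, 1, 1, 1)]
y = (1, 1, 0, 0)  # 2 errors: ambiguous for unique decoding
print(list_decode(code, y, e=1))  # -> [] (no codeword within 1 error)
print(list_decode(code, y, e=2))  # -> both codewords: a list of size 2
```

With error bound e = 2 the decoder returns both codewords; a unique decoder would have to declare failure, while a list decoder hands the short list to the receiver (or to whatever side information is available).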
1.2.3 The Challenge of List Decoding (and What Was Already Known)

In the last section, we argued that list decoding is a meaningful relaxation of unique decoding. More encouragingly, we mentioned that random codes have the potential to correct errors up to the information theoretic limit using list decoding. However, there are two major issues with the random codes result. First, these codes are not explicit. In real world applications, if one wants to communicate messages then one needs an explicit code. However, depending on the application, one might argue that doing a brute-force search for such a code might work, as this is a “one-off” cost that one has to pay. The second and perhaps more serious drawback is that the lack of structure in random codes implies that it is hard to come up with efficient list-decoding algorithms for such codes. Note that for decoding, one cannot use a brute-force list-decoding algorithm. Thus, the main challenge of list decoding is to come up with explicit codes along with efficient list-decoding (and encoding) algorithms that can correct errors close to the information theoretic limit of 1 − R. The first non-trivial list-decoding algorithm is due to Sudan [97], which built on the results in [3]. Sudan devised a list-decoding algorithm for a specific family of codes called Reed-Solomon codes [90] (widely used in practice [105]), which could correct beyond the half the distance barrier for Reed-Solomon codes of rate at most 1/3. This result was then extended to work for all rates by Guruswami and Sudan [63]. It is worthwhile to note that even though list decoding was introduced in the late 1950s, these results came nearly forty years later. There was no improvement to the Guruswami-Sudan result until the recent work of Parvaresh and Vardy [85], who designed a code that is related to Reed-Solomon codes and presented efficient list-decoding algorithms that could correct more errors than the Guruswami-Sudan algorithm.
However, the result of Parvaresh and Vardy does not meet the information theoretic limit (see Chapter 3 for more details). Further, for list decoding Reed-Solomon codes there has been no improvement over [63]. This concludes our discussion on the background for list decoding. We now turn to another relaxation of decoding that constitutes the second part of this thesis.

1.3 Property Testing of Error Correcting Codes

Consider the following communication scenario in which the channel is very noisy. The decoder, upon getting a very noisy received word, does its computation and ultimately reports a decoding failure. Typically, the decoding algorithm is an expensive procedure and it would be nice if one could quickly test if the received word is “far” from any codeword (in which case it should reject the received word) or is “close” to some codeword (in which case it should accept the received word). In the former case, we would not run our expensive decoding algorithm, and in the latter case, we would then proceed to run the decoding algorithm on the received word. The notion of efficiency that we are going to consider for such spot checkers is going to be a bit different from that for decoding algorithms. We will require the spot checker to probe only a few positions in the received word during the course of its computation. Intuitively this should be possible, as spot checking is a strictly easier task than decoding. Further, the fact that the spot checkers need to make their decision based on a portion of the received word should make spot checking very efficient. For example, if one could design spot checkers that look at only a constant number of positions (independent of the block length of the code), then we would have spot checkers that run in constant time. However, note that since the spot checker cannot look at the whole received word, it cannot possibly predict accurately if the received word is “far” from all the codewords or is “close” to some codeword.
Thus, this notion of testing is a relaxation of usual decoding, as one sacrifices accuracy of the answer while gaining in terms of the number of positions that one needs to probe. A related notion of such spot checkers is that of locally testable codes (LTCs). LTCs have been the subject of much research over the years and there has been heightened activity and progress on them recently [46, 11, 74, 14, 13, 44]. LTCs are codes that have spot checkers like those discussed above with one crucial difference: they only need to differentiate between the case when the received word is far from all codewords and the case when it is a codeword. LTCs arise in the construction of Probabilistically Checkable Proofs (PCPs) [5, 6] (see the survey by Goldreich [44] for more details on the interplay between LTCs and PCPs). Note that in the notion of an LTC, there is no requirement on the spot checker for input strings that are very close to a codeword. This “asymmetry” in the way the spot checker accepts and rejects an input reflects the way PCPs are defined, where the emphasis is on rejecting “wrong” proofs. Such spot checkers fall under the general purview of property testing (see for example the surveys by Ron [92] and Fischer [38]). In property testing, for some property P, given an object as an input, the spot checker has to decide if the given object satisfies the property P or is “far” from satisfying P. LTCs are a special case of property testing in which the property is membership in some code and the objects are received words. The ideal LTCs are codes with constant rate and linear distance that can be tested by probing only a constant number of positions in the received word. However, unlike the situation in list decoding (where one can show the existence of codes with the “ideal” properties), it is not known if such LTCs exist.
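As a toy illustration of a spot checker (our own minimal example, not a construction from this thesis), consider the binary repetition code {0^n, 1^n}. A tester that probes random pairs of positions and compares them always accepts a codeword, and a word whose fraction of ones is p is rejected per probed pair with probability exactly 2p(1 − p):

```python
import random

def repetition_tester(y, probes=1):
    """Spot checker for the repetition code {0^n, 1^n}: probe `probes`
    random pairs of positions and accept iff every probed pair agrees.
    Codewords are always accepted; a word with a p fraction of ones is
    rejected with probability 2*p*(1-p) per probed pair."""
    n = len(y)
    for _ in range(probes):
        i, j = random.randrange(n), random.randrange(n)
        if y[i] != y[j]:
            return False
    return True

random.seed(2)
codeword = (1,) * 20
far_word = tuple(i % 2 for i in range(20))  # half zeros, half ones
print(repetition_tester(codeword, probes=10))  # a codeword always passes
print(repetition_tester(far_word, probes=10))  # rejected with high probability
```

Note the asymmetry mentioned above: the tester never rejects a codeword, while words far from the code are rejected with probability that grows with the number of probes.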
1.3.1 A Brief History of Property Testing of Codes

The field of codeword testing started with the work of Blum, Luby and Rubinfeld [21] (who actually designed spot checkers for a variety of numerical problems), and later developed into the broader field of property testing [93, 45]. LTCs were first explicitly defined in [42, 93] and the systematic study of whether ideal LTCs (as discussed at the end of the last section) exist was initiated in [46]. Testing for Reed-Muller codes in particular has garnered a lot of attention [21, 9, 8, 36, 42, 93, 7, 1, 74], as they were crucial building blocks in the construction of PCPs [6, 5]. Kaufman and Litsyn [73] gave a sufficient condition on an important class of codes that implies that the code is an LTC. Ben-Sasson and Sudan [13] built LTCs from a variant of PCPs called Probabilistically Checkable Proofs of Proximity; this “method” of constructing LTCs was initiated by Ben-Sasson et al. [11].

1.4 Contributions of This Thesis

The contributions of this thesis are in two parts. The first part deals with list decoding while the second part deals with property testing of codes.

1.4.1 List Decoding

This thesis advances our understanding of list decoding. Our results can be roughly divided into three parts: (i) List decodable codes of optimal rate over large alphabets, (ii) List decodable codes over small alphabets, and (iii) Limits to list decodability. We now look at each of these in more detail.

List Decodable Codes over Large Alphabets

Recall that for codes of rate R, it is information theoretically not possible to correct beyond a 1 − R fraction of errors. Further, using a random coding argument one can show the existence of codes that can correct up to a 1 − R − ε fraction of errors for any ε > 0 (using list decoding).
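The random coding claim can be mimicked at toy scale (a Monte Carlo sketch under our own tiny parameters; the actual argument appears in Chapter 2): pick a random code and verify by exhaustive search that every Hamming ball of a given radius contains few codewords.

```python
import random
from itertools import product

def hamming(u, v):
    """Hamming distance between two equal-length vectors."""
    return sum(a != b for a, b in zip(u, v))

def is_list_decodable(code, n, q, e, L):
    """Brute-force check that every Hamming ball of radius e in [q]^n
    contains at most L codewords, i.e., the code is (e/n, L)-list decodable.
    Takes q^n time -- feasible only for toy parameters."""
    return all(
        sum(1 for c in code if hamming(c, y) <= e) <= L
        for y in product(range(q), repeat=n)
    )

random.seed(3)
q, n, k = 2, 10, 3
# Pick each of the q^k codewords independently and uniformly at random,
# as in the random coding argument.
code = [tuple(random.randrange(q) for _ in range(n)) for _ in range(q ** k)]
print(is_list_decodable(code, n, q, e=2, L=2))
```

The exhaustive check is of course exponential in n; the point of the explicit constructions in this thesis is to get the same error-correction guarantees with efficient algorithms.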
Since the first non-trivial algorithm of Sudan [97], there has been a lot of effort in designing explicit codes along with efficient list-decoding algorithms that can correct errors close to the information theoretic limit. In Chapter 3, we present the culmination of this line of work by presenting explicit codes (which are in turn extensions of Reed-Solomon codes) along with a polynomial time list-decoding algorithm that can correct a 1 − R − ε fraction of errors (for every rate 0 < R < 1 and any ε > 0). This answers a question that has been open for close to 50 years and meets one of the central challenges in coding theory. This work was done jointly with Venkatesan Guruswami and was published in the proceedings of the 38th Symposium on Theory of Computing (STOC), 2006 [58] and is under review for the journal IEEE Transactions on Information Theory.

List Decodable Codes over Small Alphabets

The codes mentioned in the last subsection are defined over alphabets whose size increases with the block length of the code. As discussed in Section 1.1.1, this is not a desirable feature. In Chapter 4, we show how to use our codes from Chapter 3 along with known techniques of code concatenation and expander graphs to design codes over alphabets of size 2^poly(1/ε) that can still correct up to a 1 − R − ε fraction of errors for any ε > 0. To get to within ε of the information theoretic limit of 1 − R, it is known that one needs an alphabet of size 2^Ω(1/ε) (see Chapter 2 for more details). However, if one were interested in codes over alphabets of fixed size, the situation is different. First, it is known that for fixed size alphabets, the information theoretic limit is much smaller than 1 − R (see Chapter 2 for more details). Again, one can show that random codes meet this limit. In Chapter 4, we present explicit codes along with efficient list-decoding algorithms that correct errors up to the so-called Blokh-Zyablov bound.
These results are currently the best known via explicit codes, though the number of errors that can be corrected is much smaller than the limit achievable by random codes. This work was done jointly with Venkatesan Guruswami and appears in two different papers. The first was published in the proceedings of the 38th Symposium on Theory of Computing (STOC), 2006 [58] and is under review for the journal IEEE Transactions on Information Theory. The second paper will appear in the proceedings of the 11th International Workshop on Randomization and Computation (RANDOM) [60].

Explicit codes over fixed alphabets, considered in Chapter 4, are constructed using code concatenation. However, as mentioned earlier, the fraction of errors that such codes can tolerate via list decoding is far from the information theoretic limit. A natural question to ask is whether one can use concatenated codes to achieve the information theoretic limit. In Chapter 5 we give a positive answer to this question in the following sense. We present a random ensemble of concatenated codes that with high probability meet the information theoretic limit: that is, they can potentially list decode as large a fraction of errors as general random codes, though with larger lists. This work was done jointly with Venkatesan Guruswami and is an unpublished manuscript [61].

Limits to List Decoding Reed-Solomon Codes

The results discussed in the previous two subsections are of the following flavor. We know that random codes allow us to list decode up to a certain number of errors, and that this is optimal. Can we design more explicit codes (maybe with efficient list-decoding algorithms) that can correct close to the number of errors that can be corrected by random codes? However, consider the scenario where one is constrained to work with a certain family of codes, say Reed-Solomon codes. Under this restriction, what is the largest number of errors from which one can hope to list decode?
The result of Guruswami and Sudan [63] says that one can efficiently correct up to n − √(kn) many errors for Reed-Solomon codes. However, is this the best possible? In Chapter 6, we give some evidence that the Guruswami-Sudan algorithm might indeed be the best possible. Along the way we also give some explicit constructions of “bad list-decoding configurations.” A bad list-decoding configuration refers to a received word y along with an error bound e such that there are super-polynomially (in n) many Reed-Solomon codewords within a distance of e from y. This work was done jointly with Venkatesan Guruswami and appears in two different papers. The first was published in the proceedings of the 37th Symposium on Theory of Computing (STOC), 2005 [56] as well as in the IEEE Transactions on Information Theory [59]. The second paper is an unpublished manuscript [62].

1.4.2 Property Testing

We now discuss our results on property testing of error correcting codes.

Testing Reed-Muller Codes

Reed-Muller codes are generalizations of Reed-Solomon codes: Reed-Muller codes are based on multivariate polynomials while Reed-Solomon codes are based on univariate polynomials. Local testing of Reed-Muller codes was instrumental in many constructions of PCPs. However, the testers were only designed for Reed-Muller codes over large alphabets. In fact, the size of the alphabet of such codes depends on the block length of the codes. In Chapter 7, we present near-optimal local testers for Reed-Muller codes defined over (a class of) alphabets of fixed size. This work was done jointly with Charanjit Jutla, Anindya Patthak and David Zuckerman and was published in the proceedings of the 45th Symposium on Foundations of Computer Science (FOCS), 2004 [72] and is currently under review for the journal Random Structures and Algorithms.
Tolerant Locally Testable Codes

Recall that the notion of spot checkers that we were interested in had to accept the received word when it is close to some codeword and reject it when it is far from all codewords (as opposed to LTCs, which are only required to accept when the received word is a codeword). Surprisingly, such testers were not considered in the literature before. In Chapter 8, we define such testers, which we call tolerant testers. Our results show that in general LTCs do not imply tolerant testability, though most LTCs that achieve the best parameters also have tolerant testers. As a slight aside, we look at a certain strong form of local testability (called robust testability) of certain products of codes. Products of codes are also special cases of certain concatenated codes considered in Chapter 4. We show that in general, certain products of codes cannot be robustly testable. This work on tolerant testing was done jointly with Venkatesan Guruswami and was published in the proceedings of the 9th International Workshop on Randomization and Computation (RANDOM) [57]. The work on robust testability of products of codes is joint work with Don Coppersmith and is an unpublished manuscript [26].

1.4.3 Organization of the Thesis

We start with some preliminaries in Chapter 2. In Chapter 3, we present the main result of this thesis: codes with optimal rate over large alphabets. This result is then used to design new codes in Chapters 4 and 5. We present codes over small alphabets in Chapter 4, which are constructed by a combination of the codes from Chapter 3 and code concatenation. In Chapter 5, we show that certain random codes constructed by code concatenation also achieve the list-decoding capacity. In Chapter 6, we present some limitations to list decoding Reed-Solomon codes. We switch gears in Chapter 7 and present new local testers for Reed-Muller codes. We present our results on tolerant testability in Chapter 8.
We conclude with the major open questions in Chapter 9.

Chapter 2

PRELIMINARIES

In this chapter we will define some basic concepts and notation that will be used throughout this thesis. We will also review some basic results in list decoding that will set the stage for our results. Finally, we will look at some specific families of codes that will crop up frequently in the thesis.

2.1 The Basics

We first fix some notation that will be used frequently in this work, most of which is standard. For any integer m ≥ 1, we will use [m] to denote the set {1, ..., m}. Given positive integers n and q, we will denote the set of all length-n vectors over [q] by [q]^n. Unless mentioned otherwise, all vectors in this thesis will be row vectors. log x will denote the logarithm of x in base 2, and ln x will denote the natural logarithm of x. For bases other than 2 and e, we will specify the base of the logarithm explicitly: for example, the logarithm of x in base q will be denoted by log_q x. A finite field with q elements will be denoted by F_q or GF(q). For any real x in the range 0 ≤ x ≤ 1, we will use H_q(x) = x log_q(q − 1) − x log_q x − (1 − x) log_q(1 − x) to denote the q-ary entropy function. For the special case of q = 2, we will simply use H(x) for H_2(x). For more details on the q-ary entropy function, see Section 2.2.2. For any finite set S, we will use |S| to denote the size of the set. We now move on to the definitions of the basic notions of error correcting codes.

2.1.1 Basic Definitions for Codes

Let q ≥ 2 be an integer.

Code, Block length, Alphabet size: An error correcting code (or simply a code) C is a subset of [q]^n for positive integers q and n. The elements of C are called codewords. The parameter q is called the alphabet size of C. In this case, we will also refer to C as a q-ary code. When q = 2, we will refer to C as a binary code. The parameter n is called the block length of the code.
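The q-ary entropy function H_q defined above can be computed directly; a minimal sketch, using the convention 0 · log 0 = 0:

```python
import math

def entropy_q(q, x):
    """q-ary entropy H_q(x) = x log_q(q-1) - x log_q(x) - (1-x) log_q(1-x)
    for 0 <= x <= 1, with the convention 0 * log 0 = 0."""
    assert q >= 2 and 0.0 <= x <= 1.0
    log_q = lambda z: math.log(z) / math.log(q)
    h = x * log_q(q - 1)
    if 0.0 < x < 1.0:
        h -= x * log_q(x) + (1.0 - x) * log_q(1.0 - x)
    return h

print(entropy_q(2, 0.5))   # H(1/2) = 1.0
print(entropy_q(4, 0.75))  # H_q attains its maximum 1 at x = 1 - 1/q
```

This function governs the volume of Hamming balls (Proposition 2.1 below) and hence every rate bound proved by the random coding arguments in this chapter.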
Dimension and Rate: For a q-ary code C, the quantity k = log_q |C| is called the dimension of the code (this terminology makes more sense for certain classes of codes called linear codes, which we will discuss shortly). For a q-ary code C with block length n, its rate is defined as the ratio R = log_q |C| / n. Often it will be useful to use the following alternate way of looking at a code. We will think of a q-ary code C with block length n and |C| = M as a function C : [M] → [q]^n. Every element x in [M] is called a message and C(x) is its associated codeword. If M = q^k, then we will think of the message as a length-k vector in [q]^k. Viewed this way, C provides a systematic way to add redundancy such that messages of length k over [q] are mapped to n symbols over [q].

(Minimum) Distance and Relative Distance: Given any two vectors u = ⟨u_1, ..., u_n⟩ and v = ⟨v_1, ..., v_n⟩ in [q]^n, their Hamming distance (or simply distance), denoted by Δ(u, v), is the number of positions in which they differ. In other words, Δ(u, v) = |{i | u_i ≠ v_i}|. The (minimum) distance of a code C is the minimum Hamming distance between any two distinct codewords in the code. More formally,

    dist(C) = min over c1 ≠ c2 in C of Δ(c1, c2).

The relative distance of a code C of block length n is defined as δ = dist(C)/n.

2.1.2 Code Families

The focus of this thesis will be on the asymptotic performance of decoding algorithms. For such analysis to make sense, we need to work with an infinite family of codes instead of a single code. In particular, an infinite family of q-ary codes C is a collection {C_i}_{i ≥ 1}, where for every i, C_i is a q-ary code of length n_i and n_i > n_{i−1}. The rate of the family C is defined as

    R(C) = lim inf over i of { log_q |C_i| / n_i }.

The relative distance of such a family is defined as

    δ(C) = lim inf over i of { dist(C_i) / n_i }.

From this point on, we will overload notation by referring to an infinite family of codes simply as a code.
In particular, from now on, whenever we talk about a code C of length n, rate R and relative distance δ, we will implicitly assume the following. We will think of n as large enough so that its rate R and relative distance δ are (essentially) the same as the rate and the relative distance of the corresponding infinite family of codes. Given this implicit understanding, we can talk about the asymptotics of different algorithms. In particular, we will say that an algorithm that works with a code of block length n is efficient if its running time is O(n^c) for some fixed constant c.

2.1.3 Linear Codes

We will now consider an important sub-class of codes called linear codes.

Definition 2.1. Let q be a prime power. A q-ary code C of block length n is said to be linear if it is a linear subspace (over the field F_q) of the vector space F_q^n.

The size of a q-ary linear code is q^k for some integer k. In fact, k is the dimension of the corresponding subspace in F_q^n. Thus, the dimension of the subspace is the same as the dimension of the code. (This is the reason behind the terminology of dimension of a code.) We will denote a q-ary linear code of dimension k, length n and distance d as an [n, k, d]_q code. (For a general code with the same parameters, we will refer to it as an (n, k, d)_q code.) Most of the time, we will drop the distance part and just refer to the code as an [n, k]_q code. Finally, we will drop the dependence on q if the alphabet size is clear from the context. We now make some easy observations about q-ary linear codes. First, the zero vector is always a codeword. Second, the minimum distance of a linear code is equal to the minimum Hamming weight of the non-zero codewords, where the Hamming weight of a vector is the number of positions with non-zero values. Any [n, k]_q code C can be defined in the following two ways. C can be defined as the set {xG | x ∈ F_q^k}, where G is a k × n matrix over F_q. G is called a generator matrix of C.
C can also be characterized by the following subspace: {c | c ∈ F_q^n and Hc^T = 0}, where H is an (n − k) × n matrix over F_q. H is called the parity check matrix of C. The code with H as its generator matrix is called the dual of C and is generally denoted by C^⊥. The above two representations imply the following two things for an [n, k]_q code C. First, given the generator matrix G and a message x ∈ F_q^k, one can compute C(x) using O(nk) field operations (by multiplying x with G). Second, given a received word y ∈ F_q^n and the parity check matrix H for C, one can check if y ∈ C using O(n(n − k)) operations (by computing Hy^T and checking if it is the all zeros vector).

Finally, given a q-ary linear code C, we can define the following equivalence relation: x ∼ y if and only if x − y ∈ C. It is easy to check that since C is linear this indeed is an equivalence relation. In particular, ∼ partitions F_q^n into equivalence classes. These are called cosets of C (note that one of the cosets is the code C itself). In particular, every coset is of the form y + C, where either y = 0 or y ∉ C, and y + C is shorthand for {y + c | c ∈ C}. We are now ready to talk about definitions and preliminaries for list decoding and property testing of codes.

2.2 Preliminaries and Definitions Related to List Decoding

Recall that list decoding is a relaxation of the decoding problem, where given a received word, the idea is to output all “close-by” codewords. More precisely, given an error bound, we want to output all codewords that lie within the given error bound from the received word. Note that this introduces a new parameter into the mix: the worst case list size. We will shortly define the notion of list decoding that we will be working with in this thesis. Given integers q ≥ 2 and n ≥ 1, a radius 0 ≤ e ≤ n, and a vector x ∈ [q]^n, we define the Hamming ball around x of radius e to be the set of all vectors in [q]^n that are at Hamming distance at most e from x.
That is,

    B_q(x, e) = {y | y ∈ [q]^n and Δ(y, x) ≤ e}.

We will need the following well known result.

Proposition 2.1 ([80]). Let q ≥ 2 and e, n ≥ 1 be integers such that e ≤ (1 − 1/q)n, and define ρ = e/n. Then the following relations are satisfied:

    |B_q(0, e)| = Σ_{i=0}^{e} (n choose i) (q − 1)^i ≤ q^{H_q(ρ)·n},    (2.1)

    |B_q(0, e)| ≥ q^{H_q(ρ)·n − o(n)}.    (2.2)

We will be using the following definition quite frequently.

Definition 2.2 (List-Decodable Codes). Let C be a q-ary code of block length n. Let L ≥ 1 be an integer and 0 < ρ < 1 be a real. Then C is called (ρ, L)-list decodable if every Hamming ball of radius ρn has at most L codewords in it. That is, for every y ∈ [q]^n, |B_q(y, ρn) ∩ C| ≤ L.

In the definition above, the parameter L can depend on the block length of the code. In such cases, we will explicitly denote the list size by L(n), where n is the block length. We will also frequently use the notion of list-decoding radius, which is defined next.

Definition 2.3 (List-Decoding Radius). Let C be a q-ary code of block length n. Let 0 < ρ < 1 be a real and define e = ρn. C is said to have a list-decoding radius of ρ (or e) with list size L if ρ (or e) is the maximum value for which C is (ρ, L)-list decodable.

We will frequently use the term list-decoding radius without explicitly mentioning the list size, in which case the list size is assumed to be at most some fixed polynomial in the block length. Note that one way to show that a code C has a list-decoding radius of at least ρ is to present a polynomial time list-decoding algorithm that can list decode C up to a ρ fraction of errors. Thus, by abuse of notation, given an efficient list-decoding algorithm for a code that can list decode a ρ fraction (or e number) of errors, we will say that the list-decoding algorithm has a list-decoding radius of ρ (or e). In most places, we will be exclusively talking about list-decoding algorithms, in which case we will refer to their list-decoding radius as decoding radius or just radius.
In such a case, the code under consideration is said to be list decodable up to the corresponding decoding radius (or just radius). Whenever we are talking about a different notion of decoding (say unique decoding), we will refer to the maximum fraction of errors that a decoder can correct by qualifying the decoding radius with the specific notion of decoding (for example, unique decoding radius).

We will also use the following generalization of list decoding.

Definition 2.4 (List-Recoverable Codes). Let $C$ be a $q$-ary code of block length $n$. Let $\ell, L \geq 1$ be integers and $0 \leq \rho < 1$ be a real. Then $C$ is called $(\rho, \ell, L)$-list recoverable if the following is true. For every sequence of sets $S_1, \ldots, S_n$, where $S_i \subseteq [q]$ and $|S_i| \leq \ell$ for every $1 \leq i \leq n$, there are at most $L$ codewords $c = (c_1, \ldots, c_n) \in C$ such that $c_i \in S_i$ for at least $(1 - \rho)n$ positions $i$. Further, code $C$ is said to be $(\rho, \ell)$-list recoverable in polynomial time if it is $(\rho, \ell, L(n))$-list recoverable for some polynomially bounded function $L(\cdot)$, and moreover there is a polynomial time algorithm to find the at most $L(n)$ codewords that are solutions to any $(\rho, \ell, L(n))$-list recovery instance.

List recovery has been implicitly studied in several works; the name itself was coined in [52]. Note that a $(\rho, 1, L)$-list recoverable code is a $(\rho, L)$-list decodable code and hence, list recovery is indeed a generalization of list decoding. List recovery is useful in list decoding codes obtained by a certain code composition procedure. The natural list decoder for such a code is a two stage algorithm, where in the first stage the "inner" codes are list decoded to get a sequence of lists, from which one needs to recover codewords from the "outer" code(s). For such an algorithm to be efficient, the outer codes need to be list recoverable.

We next look at the most fundamental tradeoff that we would be interested in for list decoding.

2.2.1 Rate vs. List Decodability

In this subsection, we will consider the following question.
Given limits $L \geq 1$ and $0 < \rho < 1$ on the worst case list size and the fraction of errors that we want to tolerate, what is the maximum rate that a $(\rho, L)$-list decodable code can achieve? The following results were implicit in [110] but were formally stated and proved in [35]. We present the proofs for the sake of completeness. We first start with a positive result.

Theorem 2.1 ([110, 35]). Let $q \geq 2$ be an integer and let $0 < \gamma \leq 1$ be a real. For any integer $L \geq 1$ and any real $0 < \rho < 1 - 1/q$, there exists a $(\rho, L)$-list decodable $q$-ary code with rate at least $1 - H_q(\rho) - \frac{1}{L+1} - \gamma$.

Proof. We will prove the result by using the probabilistic method [2]. Choose a code $C$ of block length $n$ and dimension $k = \left(1 - H_q(\rho) - \frac{1}{L+1} - \gamma\right)n$ at random. That is, pick each of the $q^k$ codewords in $C$ uniformly (and independently) at random from $[q]^n$. We will show that with high probability, $C$ is $(\rho, L)$-list decodable. Let $|C| = M = q^k$. We first fix the received word $y \in [q]^n$. Consider an $(L+1)$-tuple of codewords $c_1, \ldots, c_{L+1}$ in $C$. Now if all these codewords fall in a Hamming ball of radius $\rho n$ around $y$, then $C$ is not $(\rho, L)$-list decodable. In other words, this $(L+1)$-tuple forms a counter-example for $C$ having the required list decodable properties. What is the probability that such an event happens? For any fixed codeword $c \in C$, the probability that it lies in $B_q(y, \rho n)$ is exactly
$$\frac{|B_q(y, \rho n)|}{q^n}.$$
Now since every codeword is picked independently, the probability that the tuple $(c_1, \ldots, c_{L+1})$ forms a counter-example is
$$\left(\frac{|B_q(y, \rho n)|}{q^n}\right)^{L+1} \leq q^{(H_q(\rho) - 1)n(L+1)},$$
where the inequality follows from Proposition 2.1 (and the fact that the volume of a Hamming ball is translation invariant). Since there are $\binom{M}{L+1} \leq M^{L+1}$ different choices of $(L+1)$-tuples of codewords from $C$, the probability that there exists at least one $(L+1)$-tuple that lies in $B_q(y, \rho n)$ is at most (by the union bound):
$$M^{L+1} \cdot q^{(H_q(\rho) - 1)n(L+1)} = q^{(R - 1 + H_q(\rho))n(L+1)},$$
where $R = k/n$ is the rate of $C$.
Finally, since there are at most $q^n$ choices for $y$, the probability that there exists some Hamming ball with $L+1$ codewords from $C$ is at most
$$q^n \cdot q^{(R - 1 + H_q(\rho))n(L+1)} \leq q^{-\gamma(L+1)n},$$
where the last inequality follows as $R \leq 1 - H_q(\rho) - \frac{1}{L+1} - \gamma$. Thus, with probability at least $1 - q^{-\gamma(L+1)n}$ (for large enough $n$), $C$ is a $(\rho, L)$-list decodable code, as desired.

The following is an immediate consequence of the above theorem.

Corollary 2.2. Let $q \geq 2$ be an integer and $0 < \rho < 1 - 1/q$. For every $\varepsilon > 0$, there exists a $q$-ary code with rate at least $1 - H_q(\rho) - \varepsilon$ that is $(\rho, \lceil 1/\varepsilon \rceil)$-list decodable.

We now move to an upper bound on the rate of good list decodable codes.

Theorem 2.3 ([110, 35]). Let $q \geq 2$ be an integer and $0 < \rho \leq 1 - 1/q$. For every $\varepsilon > 0$, there does not exist any $q$-ary code with rate $1 - H_q(\rho) + \varepsilon$ that is $(\rho, L(n))$-list decodable for any function $L(n)$ that is polynomially bounded in $n$.

Proof. The proof, like that of Theorem 2.1, uses the probabilistic method. Let $C$ be any $q$-ary code of block length $n$ with rate $R = 1 - H_q(\rho) + \varepsilon$. Pick a received word $y$ uniformly at random from $[q]^n$. Now, the probability that for some fixed $c \in C$, $\Delta(y, c) \leq \rho n$ is
$$\frac{|B_q(c, \rho n)|}{q^n} \geq q^{(H_q(\rho) - 1)n - o(n)},$$
where the inequality follows from Proposition 2.1. Thus, the expected number of codewords within a Hamming ball of radius $\rho n$ around $y$ is at least
$$|C| \cdot q^{(H_q(\rho) - 1)n - o(n)} = q^{(R - 1 + H_q(\rho))n - o(n)},$$
which by the value of $R$ is $q^{\varepsilon n - o(n)}$. Since the expected number of codewords is exponential, this implies that there exists a received word $y$ that has exponentially many codewords from $C$ within a distance $\rho n$ from it. Thus, $C$ cannot be $(\rho, L(n))$-list decodable for any polynomially bounded (in fact, any subexponential) function $L$.

List decoding capacity

Theorems 2.1 and 2.3 say that to correct a $\rho$ fraction of errors using list decoding with small list sizes, the best rate that one can hope for, and can achieve, is $1 - H_q(\rho)$. We will call this quantity the list-decoding capacity.
The terminology is inspired by the connection of the results above to Shannon's theorem for the special case of the $q$-symmetric channel (which we will denote by $qSC_\rho$). In this channel, every symbol (from $[q]$) remains untouched with probability $1 - \rho$ while it is changed to each of the other symbols in $[q]$ with probability $\frac{\rho}{q-1}$. Shannon's theorem states that one can have reliable communication with codes of rate less than $1 - H_q(\rho)$ but not with rates larger than $1 - H_q(\rho)$. Thus, Shannon's capacity for $qSC_\rho$ is $1 - H_q(\rho)$, which matches the expression for the list-decoding capacity.

Note that in $qSC_\rho$, the expected fraction of errors when a codeword is transmitted is $\rho$. Further, as the errors on each symbol occur independently, the Chernoff bound implies that with high probability the fraction of errors is concentrated around $\rho$. However, Shannon's proof crucially uses the fact that these (roughly) $\rho$ fraction of errors occur randomly. What Theorems 2.1 and 2.3 say is that even with a $\rho$ fraction of adversarial errors¹ one can have reliable communication via codes of rate $1 - H_q(\rho)$ with list decoding using lists of sufficiently large constant size.

¹ Where both the location and the nature of errors are arbitrary.

We now consider the list-decoding capacity in some more detail. First we note the following special case of the expression for list-decoding capacity for large enough alphabets. When $q$ is $2^{\Omega(1/\varepsilon)}$, $1 - \rho - \varepsilon$ is a good approximation of $1 - H_q(\rho)$ (see Proposition 2.2). Recall that in Section 1.2, we saw that $1 - \rho$ is the information theoretic limit for codes over any alphabet. The discussion above states that we match this bound for large alphabets.

The proof of Theorem 2.1 uses a general random code. A natural question to ask is if one can prove Theorem 2.1 for special classes of codes: for example, linear codes. For $q = 2$ it is known that Theorem 2.1 is true for linear codes [51].
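A minimal simulation sketch of the $q$-symmetric channel described above, illustrating the Chernoff-style concentration of the empirical error fraction around $\rho$. The field size, block length, tolerance, and random seed below are illustrative assumptions, not values from the text.

```python
import random

def q_symmetric_channel(codeword, q, rho, rng):
    """Transmit over qSC_rho: each symbol is kept with probability 1 - rho,
    otherwise it is replaced by one of the other q - 1 symbols uniformly."""
    out = []
    for sym in codeword:
        if rng.random() < rho:
            out.append(rng.choice([s for s in range(q) if s != sym]))
        else:
            out.append(sym)
    return out

rng = random.Random(0)                      # fixed seed for reproducibility
q, rho, n = 4, 0.2, 100000
sent = [rng.randrange(q) for _ in range(n)]
received = q_symmetric_channel(sent, q, rho, rng)
err_frac = sum(a != b for a, b in zip(sent, received)) / n
# With n = 100000 the standard deviation of err_frac is about 0.0013,
# so the empirical error fraction is very close to rho.
assert abs(err_frac - rho) < 0.01
```

This is exactly the random-error model that Shannon's proof relies on; the point of Theorems 2.1 and 2.3 is that the same rate is attainable even when the $\rho n$ errors are placed adversarially.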
However, unlike general codes, where Theorem 2.1 (with any $\gamma > 0$) holds for random codes with high probability, the result in [51] does not hold with high probability. For $q > 2$, it is only known that random linear codes (with high probability) are $(\rho, L)$-list decodable with rate at least $1 - H_q(\rho) - \frac{1}{\lfloor \log_q(L+1) \rfloor} - o(1)$.

Achieving List-Decoding Capacity with Explicit Codes

There are two unsatisfactory aspects of Theorem 2.1: (i) the codes are not explicit and (ii) there is no efficient list-decoding algorithm. In light of Theorem 2.1, we can formalize the challenge of list decoding that was posed in Section 1.2.3 as follows:

Grand Challenge. Let $q \geq 2$ be an integer and let $0 < \rho < 1 - 1/q$ and $\varepsilon > 0$ be reals. Give an explicit² construction of a $q$-ary code $C$ with rate $1 - H_q(\rho) - \varepsilon$ that is $(\rho, O(1/\varepsilon))$-list decodable. Further, design a polynomial time list-decoding algorithm that can correct a $\rho$ fraction of errors while using lists of size $O(1/\varepsilon)$.

We still do not know how to meet the above grand challenge in its entirety. In Chapter 3, we will show how to meet the challenge above for large enough alphabets (with lists of larger size).

² By explicit construction, we mean an algorithm that in time polynomial in the block length of the code can output some succinct description of the code. For a linear code, such a description could be the generator matrix of the code.

2.2.2 Results Related to the q-ary Entropy Function

We conclude our discussion on list decoding by recording a few properties of the $q$-ary entropy function that will be useful later.

We first start with a calculation where the $q$-ary entropy function naturally pops up. This hopefully will give the reader a feel for the function (and as a bonus will pretty much prove the lower bound in Proposition 2.1). Let $0 \leq x \leq 1$ and $q \geq 2$. We claim that the quantity $\binom{n}{xn}(q-1)^{xn}$ is approximated very well by $q^{H_q(x)n}$ for large enough $n$. To see this, let us first use Stirling's approximation of $m!$ by $(m/e)^m$ (for large enough $m$)³ to approximate $\binom{n}{xn}$:
$$\binom{n}{xn} \approx \frac{(n/e)^n}{(xn/e)^{xn} \cdot ((1-x)n/e)^{(1-x)n}} = \frac{n^n}{(xn)^{xn} \cdot ((1-x)n)^{(1-x)n}} = q^{n(-x \log_q x - (1-x)\log_q(1-x))}.$$
Thus, we have
$$\binom{n}{xn}(q-1)^{xn} \approx q^{n\left(x \log_q(q-1) - x\log_q x - (1-x)\log_q(1-x)\right)} = q^{H_q(x)n},$$
as desired.

³ There is a $\mathrm{poly}(m)$ multiplicative factor that we are ignoring.

Figure 2.1 gives a pictorial view of the $q$-ary entropy function for the first few values of $q$.

Figure 2.1: A plot of $H_q(x)$ for $q = 2, 3$ and $4$. The maximum value of $1$ is achieved at $x = 1 - 1/q$.

We now look at the $q$-ary entropy function for large $q$.

Proposition 2.2. For small enough $\varepsilon > 0$, $1 - H_q(\rho) \geq 1 - \rho - \varepsilon$ for every $0 < \rho \leq 1 - 1/q$ if and only if $q$ is $2^{\Omega(1/\varepsilon)}$.

Proof. We first note that
$$H_q(\rho) = \rho \log_q(q-1) - \rho \log_q \rho - (1-\rho)\log_q(1-\rho) = \rho \log_q(q-1) + \frac{H(\rho)}{\log_2 q},$$
where $H(\cdot)$ denotes the binary entropy function. Now if $q \geq 2^{1/\varepsilon}$, we get that $H_q(\rho) \leq \rho + \varepsilon$, as $\log_q(q-1) \leq 1$ and $H(\rho) \leq 1$.

Next we claim that for small enough $\varepsilon$, if $q \geq 1/\varepsilon$ then $\log_q(q-1) \geq 1 - \varepsilon$. Indeed, $\log_q(q-1) = 1 + \log_q(1 - 1/q) = 1 - \frac{\ln(q/(q-1))}{\ln q} \geq 1 - \frac{1}{(q-1)\ln q}$, which is at least $1 - \varepsilon$ for $q \geq 1/\varepsilon$. Finally, if $q = 2^{o(1/\varepsilon)}$, then for fixed $\rho$, $H(\rho)/\log_2 q = \omega(\varepsilon)$. Then $\rho \log_q(q-1) + H(\rho)/\log_2 q \geq \rho - \varepsilon + \omega(\varepsilon)$, which implies that $1 - H_q(\rho) < 1 - \rho - \varepsilon$, as desired.

Next, we look at the entropy function when its value is very close to $1$.

Proposition 2.3. For small enough $\varepsilon > 0$,
$$H_q\!\left(1 - \frac{1}{q} - \varepsilon\right) \leq 1 - c_q \varepsilon^2,$$
where $c_q$ is a constant that only depends on $q$.

Proof. The intuition behind the proof is the following. Since the derivative of $H_q(x)$ is zero at $x = 1 - 1/q$, in the Taylor expansion of $H_q(1 - 1/q - \varepsilon)$ the $\varepsilon$ term will vanish. We will now make this intuition more concrete. We will think of $q$ as fixed and $1/\varepsilon$ as growing. In particular, we will assume that $\varepsilon < 1/q$.
Consider the following (in)equalities. Writing $1 - \frac{1}{q} - \varepsilon = \frac{q-1}{q}\left(1 - \frac{q\varepsilon}{q-1}\right)$ and $\frac{1}{q} + \varepsilon = \frac{1}{q}(1 + q\varepsilon)$, and using $\log_q(q-1) - \log_q\frac{q-1}{q} = 1$ and $-\log_q\frac{1}{q} = 1$, the terms involving $\log_q(q-1)$ cancel and we get
$$H_q\!\left(1 - \frac{1}{q} - \varepsilon\right) = 1 - \frac{1}{\ln q}\left[\left(1 - \frac{1}{q} - \varepsilon\right)\ln\!\left(1 - \frac{q\varepsilon}{q-1}\right) + \left(\frac{1}{q} + \varepsilon\right)\ln(1 + q\varepsilon)\right]. \tag{2.3}$$
Using the expansion $\ln(1+x) = x - x^2/2 + x^3/3 - \cdots$ (valid for $|x| < 1$) and collecting the $\varepsilon^3$ and smaller terms into a single $O(\varepsilon^3)$ term,
$$\left(1 - \frac{1}{q} - \varepsilon\right)\ln\!\left(1 - \frac{q\varepsilon}{q-1}\right) = -\varepsilon + \frac{q\varepsilon^2}{2(q-1)} + O(\varepsilon^3), \qquad \left(\frac{1}{q} + \varepsilon\right)\ln(1 + q\varepsilon) = \varepsilon + \frac{q\varepsilon^2}{2} + O(\varepsilon^3). \tag{2.4}$$
Adding the two terms, the linear terms cancel and the $\varepsilon^2$ terms sum to $\frac{q^2\varepsilon^2}{2(q-1)}$, so that
$$H_q\!\left(1 - \frac{1}{q} - \varepsilon\right) = 1 - \frac{\varepsilon^2}{\ln q}\left(\frac{q^2}{2(q-1)} + O(\varepsilon)\right) \leq 1 - \frac{q^2 \varepsilon^2}{4(q-1)\ln q}, \tag{2.5}$$
where the last step is true assuming $\varepsilon$ is small enough. Thus Proposition 2.3 holds with $c_q = \frac{q^2}{4(q-1)\ln q}$.

We will also work with the inverse of the $q$-ary entropy function. Note that $H_q(\cdot)$ on the domain $[0, 1 - 1/q]$ is a bijective map into $[0, 1]$. Thus, we define $H_q^{-1}(y) = x$ such that $H_q(x) = y$ and $0 \leq x \leq 1 - 1/q$. Finally, we will need the following lower bound.

Lemma 2.4. For every $0 \leq y \leq 1$ and for every small enough $\varepsilon > 0$,
$$H_q^{-1}(y - \varepsilon^2/c_q') \geq H_q^{-1}(y) - \varepsilon,$$
where $c_q' \geq 1$ is a constant that depends only on $q$.

Proof. It is easy to check that $H_q^{-1}(y)$ is a strictly increasing convex function in the range $y \in [0, 1]$. This implies that the increments of $H_q^{-1}$ grow as $y$ increases: for every $0 < y \leq 1$ and (small enough) $\delta > 0$,
$$H_q^{-1}(y) - H_q^{-1}(y - \delta) \leq H_q^{-1}(1) - H_q^{-1}(1 - \delta).$$
Pick $c_q' = \max(1, 1/c_q)$ and $\delta = \varepsilon^2/c_q'$, so that $\delta \leq c_q \varepsilon^2$. Proposition 2.3, along with the facts that $H_q^{-1}(1) = 1 - 1/q$ and that $H_q^{-1}$ is increasing, then gives $H_q^{-1}(1 - \delta) \geq 1 - 1/q - \varepsilon$, which completes the proof.

2.3 Definitions Related to Property Testing of Codes

We first start with some generic definitions. Let $q \geq 2$ and $n \geq 1$ be integers and let $0 \leq \delta \leq 1$ be a real.
Given a vector $x \in [q]^n$ and a subset $S \subseteq [q]^n$, we say that $x$ is $\delta$-close to $S$ if there exists a $y \in S$ such that $\delta(x, y) \leq \delta$, where $\delta(x, y) = \Delta(x, y)/n$ is the relative Hamming distance between $x$ and $y$. Otherwise, $x$ is $\delta$-far from $S$.

Given a $q$-ary code $C$ of block length $n$, an integer $t \geq 1$ and a real $0 < \delta < 1$, we say that a randomized algorithm $T$ is a $(t, \delta)$-tester for $C$ if the following conditions hold:

- (Completeness) For every codeword $y \in C$, $\Pr[T \text{ accepts } y] = 1$, that is, $T$ always accepts a codeword.
- (Soundness) For every $y \in [q]^n$ that is $\delta$-far from $C$, $\Pr[T \text{ accepts } y] \leq 1/3$, that is, with probability at least $2/3$, $T$ rejects $y$.
- (Query Complexity) For every random choice made by $T$, the tester only probes at most $t$ positions in $y$.

We remark that the above definition only makes sense when $C$ has large distance. Otherwise we could choose $C = [q]^n$, and the trivial tester that accepts all received words is a $(0, \delta)$-tester. For this thesis, we will adopt the convention that whenever we are talking about testers for a code $C$, $C$ will have some non-trivial distance (in most cases $C$ will have linear distance).

The above kind of tester is also called a one-sided tester as it never makes a mistake in the completeness case. Also, the choice of $2/3$ in the soundness case is arbitrary in the following sense. The probability of rejection can be made $1 - \gamma$ for any $\gamma > 0$, as long as we are happy with $O_\gamma(t)$ many queries, which is fine for this thesis as we will be interested in the asymptotics of the query complexity. The number of queries (or $t$) can depend on $n$.

Note that there is a gap in the definition of the completeness and soundness of a tester. In particular, the tester can have arbitrary output when the received word $y$ is not a codeword but is still $\delta$-close to $C$. In particular, the tester can still reject received words that are (very) close to codewords. We will revisit this in Chapter 8.
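As a toy illustration of the definitions above (this example is not from the thesis; the repetition code and all parameters are assumptions chosen for illustration), here is a one-sided tester for the $q$-ary repetition code $\{(a, a, \ldots, a) : a \in [q]\}$, which has maximal distance. It probes a constant number of positions regardless of the block length, so it is also a local tester in the sense defined next.

```python
import random

def repetition_tester(y, trials=32, rng=random.Random(0)):
    """One-sided tester for the repetition code: probe a few random pairs of
    positions and accept iff every probed pair agrees. Queries at most
    2 * trials positions, independent of the block length."""
    n = len(y)
    for _ in range(trials):
        i, j = rng.randrange(n), rng.randrange(n)
        if y[i] != y[j]:
            return False      # y is certainly not a codeword (one-sidedness)
    return True               # a true codeword is always accepted

assert repetition_tester([7] * 1000)        # completeness: codeword accepted
far_word = [0, 1] * 500                     # this word is 1/2-far from the code
assert not repetition_tester(far_word)      # each probe rejects w.p. 1/2
```

For a word that is $\delta$-far from the repetition code, each probed pair disagrees with probability at least roughly $\delta$, so $O(1/\delta)$ probes drive the acceptance probability below $1/3$.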
We say that a $(t, \delta)$-tester is a local tester if it makes a sublinear number of queries⁴, that is, $t = o(n)$, and $\delta$ is some small enough constant. A code is called a Locally Testable Code (or LTC) if it has a local tester. We also say that a local tester for a code $C$ allows for locally testing $C$.

⁴ Recall that in this thesis we are implicitly dealing with code families.

2.4 Common Families of Codes

In this section, we will review some code families that will be used frequently in this thesis.

2.4.1 Reed-Solomon Codes

Reed-Solomon codes (named after their inventors [90]) are linear codes based on univariate polynomials over finite fields. More formally, an $[n, k+1]_q$ Reed-Solomon code with $k \leq n - 1$ and $q \geq n$ is defined as follows. Let $\alpha_1, \ldots, \alpha_n$ be distinct elements from $\mathbb{F}_q$ (which is why we needed $q \geq n$). Every message $m = (m_0, \ldots, m_k) \in \mathbb{F}_q^{k+1}$ is thought of as a degree-$k$ polynomial over $\mathbb{F}_q$ by assigning the $k+1$ symbols to the $k+1$ coefficients of a degree-$k$ polynomial. In other words, $f_m(X) = m_0 + m_1 X + \cdots + m_k X^k$. The codeword corresponding to $m$ is defined as follows:
$$RS(m) = (f_m(\alpha_1), \ldots, f_m(\alpha_n)).$$
Now a degree-$k$ polynomial can have at most $k$ roots in any field. This implies that any two distinct degree-$k$ polynomials can agree in at most $k$ places. In other words,

Proposition 2.5. An $[n, k+1]_q$ Reed-Solomon code is an $[n, k+1, n-k]_q$ code.

By the Singleton bound (see for example [80]), the distance of any code of dimension $k+1$ and length $n$ is at most $n - k$. Thus, Reed-Solomon codes have the optimal distance: such codes are called Maximum Distance Separable (or MDS) codes. The MDS property, along with its nice algebraic structure, has made the Reed-Solomon code the center of a lot of research in coding theory. In particular, the algebraic properties of these codes have been instrumental in the algorithmic progress in list decoding [97, 63, 85]. In addition to their nice theoretical applications, Reed-Solomon codes have found widespread use in practical applications.
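The definition and Proposition 2.5 can be sketched in a few lines of code. This is an illustrative sketch over a small prime field (the field $\mathbb{F}_{13}$, degree $k = 3$, and the two example polynomials are arbitrary choices, not from the text); the key point is that two distinct degree-$k$ polynomials agree in at most $k$ evaluation points.

```python
def rs_encode(message, alpha, p):
    """Reed-Solomon encoding over the prime field F_p: view the message
    (m_0, ..., m_k) as f(X) = m_0 + m_1 X + ... + m_k X^k and evaluate f
    at the distinct points alpha = (alpha_1, ..., alpha_n)."""
    def f(x):
        result = 0
        for coeff in reversed(message):   # Horner's rule, reduced mod p
            result = (result * x + coeff) % p
        return result
    return [f(a) for a in alpha]

p, k = 13, 3
alpha = list(range(p))                    # n = 13 distinct evaluation points
c1 = rs_encode([0, 0, 0, 1], alpha, p)    # f(X) = X^3
c2 = rs_encode([0, 1, 0, 0], alpha, p)    # g(X) = X
agree = sum(a == b for a, b in zip(c1, c2))
# X^3 - X = X(X-1)(X+1) has exactly the roots {0, 1, 12} in F_13,
# so the two codewords agree in exactly k = 3 places and differ in n - k.
assert agree == 3 and agree <= k
```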
In particular, these codes are used in CDs, DVDs and other storage media, deep space communications, DSL and paper bar codes. We refer the reader to [105] for more details on some of these applications of Reed-Solomon codes.

2.4.2 Reed-Muller Codes

Reed-Muller codes are a generalization of Reed-Solomon codes. For integers $\ell \geq 1$ and $k \geq 1$, the message space is the set of all polynomials over $\mathbb{F}_q$ in $\ell$ variables that have total degree at most $k$. The codeword corresponding to a message is the evaluation of the corresponding $\ell$-variate polynomial at $n$ distinct points in $\mathbb{F}_q^{\ell}$ (note that this requires $q^{\ell} \geq n$). Finally, note that when $\ell = 1$ and $n = q$, we get an $[n, k+1]_q$ Reed-Solomon code. Interestingly, Reed-Muller codes [82, 89] were discovered before Reed-Solomon codes.

2.5 Basic Finite Field Algebra

We will be using a fair amount of finite field algebra in this thesis. In this section, we recap some basic notions and facts about finite fields.

A field consists of a set of elements that is closed under addition, multiplication and (both additive and multiplicative) inversion. It also has two special elements, $0$ and $1$, which are the additive and multiplicative identities respectively. A field is called a finite field if its set of elements is finite. The integers modulo some prime $p$ form the finite field $\mathbb{F}_p$.

The ring of univariate polynomials with coefficients from a field $\mathbb{F}$ will be denoted by $\mathbb{F}[X]$. A polynomial $E(X)$ is said to be irreducible if for every way of writing $E(X) = A(X) \cdot B(X)$, either $A(X)$ or $B(X)$ is a constant polynomial. A polynomial is called monic if the coefficient of its leading term is $1$. If $E(X)$ is an irreducible polynomial of degree $d$ over a field $\mathbb{F}$, then the quotient ring $\mathbb{F}[X]/(E(X))$, consisting of all polynomials in $\mathbb{F}[X]$ modulo $E(X)$, is itself a finite field and is called a field extension of $\mathbb{F}$. The extension field also forms a vector space of dimension $d$ over $\mathbb{F}$. All finite fields are either $\mathbb{F}_p$ for some prime $p$ or extensions of such a prime field.
Thus, the number of elements in a finite field is a prime power. Further, for any prime power $q$ there exists only one finite field with $q$ elements (up to isomorphism), denoted $\mathbb{F}_q$. For any $q$ that is a power of a prime $p$, the field $\mathbb{F}_q$ has characteristic $p$. The multiplicative group of non-zero elements of a field $\mathbb{F}_q$, denoted by $\mathbb{F}_q^*$, is known to be cyclic. In other words, $\mathbb{F}_q^* = \{1, \gamma, \gamma^2, \ldots, \gamma^{q-2}\}$ for some element $\gamma \in \mathbb{F}_q \setminus \{0\}$. $\gamma$ is also called the primitive element or generator of $\mathbb{F}_q^*$.

The following property of finite fields will be crucial. Any polynomial $f(X)$ of degree at most $d$ in $\mathbb{F}[X]$ has at most $d$ roots, where $\alpha \in \mathbb{F}$ is a root of $f(X)$ if $f(\alpha) = 0$.

We will also be interested in finding roots of univariate polynomials (over extension fields), for which we will use a classical algorithm due to Berlekamp [16].

Theorem 2.4 ([16]). Let $p$ be a prime. There exists a deterministic algorithm that, on input a polynomial in $\mathbb{F}_{p^t}[X]$ of degree $d$, can find all the irreducible factors (and hence the roots) in time polynomial in $d$, $p$ and $t$.

Chapter 3

LIST DECODING OF FOLDED REED-SOLOMON CODES

3.1 Introduction

Even though list decoding was defined in the late 1950s, there was essentially no algorithmic progress that could harness the potential of list decoding for nearly forty years. The work of Sudan [97] and improvements to it by Guruswami and Sudan in [63] achieved efficient list decoding up to a $\rho_{GS}(R) = 1 - \sqrt{R}$ fraction of errors for Reed-Solomon codes of rate $R$. Note that $1 - \sqrt{R} > \rho_U(R) = (1 - R)/2$ for every rate $R$, $0 < R < 1$, so this result showed that list decoding can be effectively used to go beyond the unique decoding radius for every rate (see Figure 3.1). The ratio $\rho_{GS}(R)/\rho_U(R)$ approaches $2$ for rates $R \to 0$, enabling error-correction when the fraction of errors approaches 100%, a feature that has found numerous applications outside coding theory, see for example [98], [49, Chap. 12].
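The "at most $d$ roots" fact, and the root-finding task that Theorem 2.4 solves efficiently, can be illustrated by brute force over a tiny prime field. This sketch is an illustration only: it is exhaustive search, not Berlekamp's algorithm, and the polynomials and fields chosen are arbitrary examples.

```python
def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (low-order coefficients first) at x over F_p."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

def roots_mod_p(coeffs, p):
    """All roots in F_p of a polynomial, by exhaustive search. (Berlekamp's
    algorithm achieves this in time poly(deg, p) via factoring; brute force
    suffices for tiny fields.)"""
    return [a for a in range(p) if poly_eval(coeffs, a, p) == 0]

# f(X) = X^2 + 1 over F_5 has roots 2 and 3 (since 2^2 = 4 = -1 mod 5), ...
assert roots_mod_p([1, 0, 1], 5) == [2, 3]
# ... and a degree-d polynomial never has more than d roots in a field.
assert len(roots_mod_p([1, 0, 1], 13)) <= 2
```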
Unfortunately, the improvement provided by [63] over unique decoding diminishes for larger rates, which is actually the regime of greater practical interest. For rates $R \to 1$, the ratio $\rho_{GS}(R)/\rho_U(R)$ approaches $1$, and already for rate $R = 1/2$ the ratio is at most $1.18$. Thus, while the results of [97, 63] demonstrated that list decoding always, for every rate, enables correcting more errors than unique decoding, they fell short of realizing the full quantitative potential of list decoding (recall that the list-decoding capacity promises error correction up to a $1 - R = 2\rho_U(R)$ fraction of errors).

The bound $\rho_{GS}(R)$ stood as the best known decoding radius for efficient list decoding (for any code) for several years. In fact, constructing $(\rho, L)$-list decodable codes of rate $R$ for $\rho > \rho_{GS}(R)$ and polynomially bounded $L$, regardless of the complexity of actually performing list decoding to radius $\rho$, itself was elusive. Some of this difficulty was due to the fact that $1 - \sqrt{R}$ is the largest radius for which small list size can be shown generically, via the so-called Johnson bound, which argues about the number of codewords in Hamming balls using only information on the relative distance of the code, cf. [48].

In a recent breakthrough paper [85], Parvaresh and Vardy presented codes that are list-decodable beyond the $1 - \sqrt{R}$ radius for low rates $R$. The codes they suggest are variants of Reed-Solomon (or simply RS) codes obtained by evaluating $s \geq 1$ correlated polynomials at elements of the underlying field (with $s = 1$ giving RS codes). For any $s \geq 1$, they achieve the list-decoding radius $\rho_{PV}^{(s)}(R) = 1 - \sqrt[s+1]{s^s R^s}$. For rates $R \to 0$, choosing $s$ large enough, they can list decode up to radius $1 - O(R \log(1/R))$, which approaches the capacity $1 - R$. However, for $R \geq 1/16$, the best choice of $s$ (the one that maximizes $\rho_{PV}^{(s)}(R)$) is in fact $s = 1$, which reverts back to RS codes and the list-decoding radius $1 - \sqrt{R}$.
(See Figure 3.1 where the bound $1 - \sqrt[3]{4R^2}$ for the case $s = 2$ is plotted; except for very low rates, it gives a small improvement over $\rho_{GS}(R)$.) Thus, getting arbitrarily close to capacity for some rate, as well as beating the $1 - \sqrt{R}$ bound for every rate, both remained open before our work¹.

In this chapter, we describe codes that get arbitrarily close to the list-decoding capacity for every rate (for large alphabets). In other words, we give explicit codes of rate $R$ together with polynomial time list decoding up to a fraction $1 - R - \varepsilon$ of errors for every rate $R$ and arbitrary $\varepsilon > 0$. As mentioned in Section 2.2.1, this attains the best possible tradeoff one can hope for between the rate and list-decoding radius. This is the first result that approaches the list-decoding capacity for any rate (and over any alphabet).

Our codes are simple to describe: they are folded Reed-Solomon codes, which are in fact exactly Reed-Solomon codes, but viewed as codes over a larger alphabet by careful bundling of codeword symbols. Given the ubiquity of RS codes, this is an appealing feature of our result, and in fact our methods directly yield better decoding algorithms for RS codes when errors occur in phased bursts (a model considered in [75]).

Our result extends easily to the problem of list recovery (recall Definition 2.4). The biggest advantage here is that we are able to achieve a rate that is independent of the size of the input lists. This is an extremely useful feature that will be used in Chapters 4 and 5 to design codes over smaller alphabets. In particular, we will construct new codes from folded Reed-Solomon codes that achieve list-decoding capacity over constant sized alphabets (the folded Reed-Solomon codes are defined over alphabets whose size increases with the block length of the code).

Our work builds on existing work of Guruswami and Sudan [63] and Parvaresh and Vardy [85].
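The trade-offs discussed above are easy to evaluate numerically. The sketch below (an illustration; the sample rates are arbitrary) compares the Guruswami-Sudan radius $1 - \sqrt{R}$ with the Parvaresh-Vardy radius $1 - \sqrt[s+1]{s^s R^s}$, confirming that $s = 2$ beats $s = 1$ only below rate $1/16$.

```python
def rho_gs(R):
    """Guruswami-Sudan list-decoding radius for RS codes: 1 - sqrt(R)."""
    return 1 - R ** 0.5

def rho_pv(R, s):
    """Parvaresh-Vardy radius for order s: 1 - (s^s * R^s)^(1/(s+1))."""
    return 1 - (s ** s * R ** s) ** (1 / (s + 1))

assert abs(rho_pv(0.09, 1) - rho_gs(0.09)) < 1e-12   # s = 1 is plain RS
# s = 2 improves on 1 - sqrt(R) at low rates ...
assert rho_pv(0.01, 2) > rho_gs(0.01)
# ... the two radii coincide exactly at R = 1/16 ...
assert abs(rho_pv(1 / 16, 2) - rho_gs(1 / 16)) < 1e-12
# ... and above that rate, s = 1 is the better choice.
assert rho_pv(0.2, 2) < rho_gs(0.2)
```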
See Figure 3.1 for a comparison of our work with previously known list-decoding algorithms (for various codes). We start with the description of our code in Section 3.2 and give some intuition why these codes might have good list decodable properties. We present the main ideas in our list-decoding algorithms for the folded Reed-Solomon codes in Section 3.3. In Section 3.4, we present and analyze a polynomial time list-decoding algorithm for folded RS codes of rate $R$ that can correct roughly a $1 - \sqrt[3]{R^2}$ fraction of errors. In Section 3.5, we extend the results in Section 3.4 to present codes that can be efficiently list decoded up to the list-decoding capacity. Finally, we extend our results to list recovery in Section 3.6.

3.2 Folded Reed-Solomon Codes

In this section, we will define a simple variant of Reed-Solomon codes called folded Reed-Solomon codes. By choosing parameters suitably, we will design a list-decoding algorithm that can decode close to the optimal fraction $1 - R$ of errors with rate $R$.

¹ Independent of our work, Alex Vardy (personal communication) constructed a variant of the code defined in [85] which could be list decoded with fraction of errors more than $1 - \sqrt{R}$ for all rates $R$. However, his construction gives only a small improvement over the $1 - \sqrt{R}$ bound and does not achieve the list-decoding capacity.

Figure 3.1: List-decoding radius $\rho$ plotted against the rate $R$ of the code for known algorithms (the curves shown: unique decoding radius, Guruswami-Sudan, Parvaresh-Vardy, and the list-decoding capacity achieved in this chapter). The best possible trade-off, i.e., list-decoding capacity, is $\rho = 1 - R$, and our work achieves this.

3.2.1 Description of Folded Reed-Solomon Codes

Consider an $[n, k+1]_q$ Reed-Solomon code $C$ consisting of evaluations of degree-$k$ polynomials over $\mathbb{F}_q$ at the set $\mathbb{F}_q^*$. Note that $n = q - 1$.
Let $\gamma$ be a generator of the multiplicative group $\mathbb{F}_q^*$, and let the evaluation points be ordered as $1, \gamma, \gamma^2, \ldots, \gamma^{n-1}$. Using all nonzero field elements as evaluation points is one of the most commonly used instantiations of Reed-Solomon codes.

Let $m \geq 1$ be an integer parameter called the folding parameter. For ease of presentation, we will assume that $m$ divides $n = q - 1$.

Definition 3.1 (Folded Reed-Solomon Code). The $m$-folded version of the RS code $C$, denoted $FRS_{\mathbb{F}_q, \gamma, m, k}$, is a code of block length $N = n/m$ over $\mathbb{F}_q^m$, where $n = q - 1$. The encoding of a message $f(X)$, a polynomial over $\mathbb{F}_q$ of degree at most $k$, has as its $j$'th symbol, for $0 \leq j < n/m$, the $m$-tuple $(f(\gamma^{jm}), f(\gamma^{jm+1}), \ldots, f(\gamma^{jm+m-1}))$. In other words, the codewords of $FRS_{\mathbb{F}_q, \gamma, m, k}$ are in one-one correspondence with those of the RS code $C$ and are obtained by bundling together consecutive $m$-tuples of symbols in codewords of $C$.

The way the above definition is stated, the message alphabet is $\mathbb{F}_q$ while the codeword alphabet is $\mathbb{F}_q^m$, whereas in our definition of codes, both the alphabets were the same. This can be easily taken care of by bundling $m$ consecutive message symbols from $\mathbb{F}_q$ to make the message alphabet $\mathbb{F}_q^m$. We will, however, state our results with the message symbols coming from $\mathbb{F}_q$, as this simplifies our presentation.

We illustrate the above construction for the choice $m = 4$ in Figure 3.2.

Figure 3.2: Folding of the Reed-Solomon code with parameter $m = 4$.

The polynomial $f(X)$ is the message, whose Reed-Solomon encoding consists of the values of $f$ at $x_0, x_1, \ldots, x_{n-1}$ where $x_i = \gamma^i$. Then, we perform a folding operation by bundling together tuples of $4$ symbols to give a codeword of length $n/4$ over the alphabet $\mathbb{F}_q^4$.

Note that the folding operation does not change the rate $R$ of the original Reed-Solomon code. The relative distance of the folded RS code also meets the Singleton bound and is at least $1 - R$.
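The folding operation of Definition 3.1 is mechanical and can be sketched directly. This is an illustration over the tiny field $\mathbb{F}_7$ with generator $\gamma = 3$ and folding parameter $m = 2$ (all arbitrary example choices); the bundling step is exactly the `fold` function.

```python
def rs_codeword(f_coeffs, q, gamma):
    """RS encoding of f at the points 1, gamma, gamma^2, ..., gamma^(q-2),
    over a prime field F_q (q prime, gamma a generator of F_q^*)."""
    def f(x):
        result = 0
        for c in reversed(f_coeffs):
            result = (result * x + c) % q
        return result
    return [f(pow(gamma, i, q)) for i in range(q - 1)]

def fold(codeword, m):
    """Bundle consecutive m-tuples: a length-n word over F_q becomes a
    length-n/m word over the alphabet F_q^m. This is the whole operation."""
    assert len(codeword) % m == 0
    return [tuple(codeword[j * m:(j + 1) * m]) for j in range(len(codeword) // m)]

q, gamma, m = 7, 3, 2            # 3 generates F_7^* = {1, 3, 2, 6, 4, 5}
c = rs_codeword([1, 1], q, gamma)  # message f(X) = 1 + X
folded = fold(c, m)                # block length drops from 6 to 3
assert len(folded) == (q - 1) // m
assert all(len(sym) == m for sym in folded)
```

Note that the rate is unchanged: the same $k+1$ field elements of information are spread over $n$ field elements either way; only the symbol grouping differs.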
Remark 3.1 (Origins of the term "folded RS codes"). The terminology of folded RS codes was coined in [75], where an algorithm to correct random errors in such codes was presented (for a noise model similar to the one used in [27, 18]: see Section 3.7 for more details). The motivation was to decode RS codes from many random "phased burst" errors. Our decoding algorithm for folded RS codes can also be likewise viewed as an algorithm to correct beyond the $1 - \sqrt{R}$ bound for RS codes if errors occur in large, phased bursts (the actual errors can be adversarial).

3.2.2 Why Might Folding Help?

Since folding seems like such a simplistic operation, and the resulting code is essentially just a RS code but viewed as a code over a large alphabet, let us now understand why it can possibly give hope to correct more errors compared to the $1 - \sqrt{R}$ bound for RS codes.

Consider the folded RS code with folding parameter $m$. First of all, decoding the folded RS code up to a fraction $\rho$ of errors is certainly not harder than decoding the RS code up to the same fraction $\rho$ of errors. Indeed, we can "unfold" the received word of the folded RS code and treat it as a received word of the original RS code and run the RS list-decoding algorithm on it. The resulting list will certainly include all folded RS codewords within distance $\rho$ of the received word, and it may include some extra codewords which we can, of course, easily prune.

In fact, decoding the folded RS code is a strictly easier task. To see why, say we want to correct a fraction $1/m$ of errors. Then, if we use the RS code, our decoding algorithm ought to be able to correct an error pattern that corrupts every $m$'th symbol in the RS encoding of $f(X)$ (i.e., corrupts $f(\gamma^{mi})$ for $0 \leq i < n/m$). However, after the folding operation, this error pattern corrupts every one of the symbols over the larger alphabet $\mathbb{F}_q^m$, and thus need not be corrected.
In other words, for the same fraction of errors, the folding operation reduces the total number of error patterns that need to be corrected, since the channel has less flexibility in how it may distribute the errors.

It is of course far from clear how one may exploit this to actually correct more errors. To this end, algebraic ideas that exploit the specific nature of the folding and the relationship between a polynomial $f(X)$ and its shifted counterpart $f(\gamma X)$ will be used. These will become clear once we describe our algorithms later in the chapter.

We note that the above simplification of the channel is not attained for free, since the alphabet size increases after the folding operation. For folding parameter $m$ that is an absolute constant, the increase in alphabet size is moderate and the alphabet remains polynomially large in the block length. (Recall that the RS code has an alphabet size that is linear in the block length.) Still, having an alphabet size that is a large polynomial is somewhat unsatisfactory. Fortunately, existing alphabet reduction techniques, which are used in Chapter 4, can handle polynomially large alphabets, so this does not pose a big problem.

3.2.3 Relation to Parvaresh-Vardy Codes

In this subsection, we relate folded RS codes to the Parvaresh-Vardy (PV) codes [85], which among other things will help make the ideas presented in the previous subsection more concrete.

The basic idea in the PV codes is to encode a polynomial $f$ of degree $k$ by the evaluations of $s \geq 2$ polynomials $f_0 = f, f_1, \ldots, f_{s-1}$, where $f_i(X) = f_{i-1}(X)^d \bmod E(X)$ for an appropriate power $d$ (and some irreducible polynomial $E(X)$ of some appropriate degree); let us call $s$ the order of such a code. Our first main idea is to pick the irreducible polynomial $E(X)$ (and the parameter $d$) in such a manner that every polynomial $f$ of degree at most $k$ satisfies the following identity: $f(\gamma X) = f(X)^d \bmod E(X)$, where $\gamma$ is the generator of the underlying field.
Thus, a folded RS code with bundling using a γ as above is in fact exactly the PV code of order s = m for the set of evaluation points {1, γ^m, γ^{2m}, ..., γ^{(n/m − 1)m}}. This is nice as it shows that PV codes can meet the Singleton bound (since folded RS codes do), but as such does not lead to any better codes for list decoding.

We now introduce our second main idea. Let us compare the m-folded RS code to a PV code of order 2 (instead of order m, where m divides n) for the set of evaluation points

    {1, γ, ..., γ^{m−2}, γ^m, γ^{m+1}, ..., γ^{2m−2}, ..., γ^{n−m}, γ^{n−m+1}, ..., γ^{n−2}}.

We find that in the PV encoding of f, for every 0 ≤ i ≤ n/m − 1 and every 0 < j ≤ m − 2, f(γ^{mi+j}) appears exactly twice (once as f(γ^{mi+j}) and another time as f_1(γ^{mi+j−1})), whereas it appears only once in the folded RS encoding. (See Figure 3.3 for an example when m = 4 and s = 2.)

[Figure 3.3: The correspondence between a folded Reed-Solomon code (with m = 4) and the Parvaresh-Vardy code (of order s = 2) over the same set of evaluation points. The correspondence for the first block in the folded RS codeword and the first three blocks in the PV codeword is shown explicitly in the left corner of the figure.]

In other words, the PV and folded RS codes carry the same information, but the rate of the folded RS code is bigger by a factor of (2m − 2)/m ≈ 2. Decoding the folded RS code from a fraction p of errors reduces to correcting the same fraction p of errors for the PV code. But the rate vs. list-decoding radius trade-off is better for the folded RS code, since it has (for large enough m, almost) twice the rate of the PV code.
In other words, our folded RS codes are chosen such that they are compressed forms of suitable PV codes, and thus have better rate than the corresponding PV code for similar error-correction performance. This is where our gain is, and using this idea we are able to construct folded RS codes of rate R that are list decodable up to radius roughly 1 − R^{s/(s+1)} for any s ≥ 1. Picking s large enough lets us get within any desired ε of list-decoding capacity.

3.3 Problem Statement and Informal Description of the Algorithms

We first start by stating more precisely the problem we will solve in the rest of the chapter. We will give list-decoding algorithms for the folded Reed-Solomon code FRS_{F_q,γ,m,k} of rate R. More precisely, for every 1 ≤ s ≤ m and δ > 0, given a received word y = ((y_0, ..., y_{m−1}), ..., (y_{n−m}, ..., y_{n−1})) (where recall n = q − 1), we want to output in polynomial time all codewords in FRS_{F_q,γ,m,k} that disagree with y in at most a 1 − (1 + δ)(mR/(m − s + 1))^{s/(s+1)} fraction of positions. In other words, we need to output all degree k polynomials f(X) such that for at least a (1 + δ)(mR/(m − s + 1))^{s/(s+1)} fraction of 0 ≤ i ≤ n/m − 1, f(γ^{im+j}) = y_{im+j} (for every 0 ≤ j ≤ m − 1). By picking the parameters m, s and δ carefully, we will get folded Reed-Solomon codes of rate R that can be list decoded up to a 1 − R − ε fraction of errors (for any ε > 0).

We will now present the main ideas needed to design our list-decoding algorithm. Readers familiar with the list-decoding algorithms of [97, 63, 85] can skip the rest of this section. For the ease of presentation we will start with the case s = m. As a warm up, let us consider the case s = m = 1. Note that for m = 1, we are interested in list decoding Reed-Solomon codes. More precisely, given the received word y = (y_0, ..., y_{n−1}), we are interested in all degree k polynomials f(X) such that for at least a (1 + δ)√R fraction of positions 0 ≤ i ≤ n − 1, f(γ^i) = y_i.
We now sketch the main ideas of the algorithms in [97, 63]. The algorithms have two main steps: the first is an interpolation step and the second is a root-finding step. In the interpolation step, the list-decoding algorithm finds a bivariate polynomial Q(X, Y) that fits the input. That is, for every position i, Q(γ^i, y_i) = 0. Such a polynomial Q(·,·) can be found in polynomial time if we search for one with large enough total degree (this amounts to solving a system of linear equations). After the interpolation step, the root-finding step finds all factors of Q(X, Y) of the form Y − f(X). The crux of the analysis is to show that for every degree k polynomial f(X) that satisfies f(γ^i) = y_i for at least a (1 + δ)√R fraction of positions i, Y − f(X) is indeed a factor of Q(X, Y).

However, the above is not true for every bivariate polynomial Q(X, Y) that satisfies Q(γ^i, y_i) = 0 for all positions i. The main ideas in [97, 63] were to introduce more constraints on Q(X, Y). In particular, the work of Sudan [97] added the constraint that a certain weighted degree of Q(X, Y) is below a fixed upper bound. Specifically, Q(X, Y) was restricted to have a non-trivially bounded (1, k)-weighted degree. The (1, k)-weighted degree of a monomial X^i Y^j is i + jk, and the (1, k)-weighted degree of a bivariate polynomial Q(X, Y) is the maximum (1, k)-weighted degree among its monomials. The intuition behind defining such a weighted degree is that given Q(X, Y) with (1, k)-weighted degree D, the univariate polynomial Q(X, f(X)), where f(X) is some degree k polynomial, has total degree at most D. The upper bound D is chosen carefully such that if f(X) is a codeword that needs to be output, then Q(X, f(X)) has more than D zeroes and thus Q(X, f(X)) ≡ 0, which in turn implies that Y − f(X) divides Q(X, Y).
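The two-step structure above can be made concrete with a self-contained toy implementation (our own illustrative sketch, not code from [97]; the field F_13 and all parameters are arbitrary choices) of the multiplicity-1 interpolation step, together with a check of the key counting claim: when the agreement t exceeds the weighted degree bound D, the univariate polynomial Q(X, f(X)) is forced to be identically zero.

```python
# Toy Sudan-style interpolation over F_13 (illustrative; parameters are ours).
p, k, D = 13, 1, 4                        # field, degree bound, (1,k)-weighted degree bound

def f(x):                                 # the "transmitted" polynomial f(X) = 2X + 3
    return (2 * x + 3) % p

points = [(x, f(x)) for x in range(10)]   # 10 evaluation points
for idx in (1, 5, 8):                     # introduce 3 errors (agreement t = 7 > D = 4)
    x, y = points[idx]
    points[idx] = (x, (y + 1) % p)

# Monomials X^a Y^b with (1,k)-weighted degree a + k*b <= D: 15 unknowns > 10 equations.
monomials = [(a, b) for b in range(D // k + 1) for a in range(D - k * b + 1)]

# One homogeneous linear condition Q(x_i, y_i) = 0 per point.
M = [[pow(x, a, p) * pow(y, b, p) % p for (a, b) in monomials] for (x, y) in points]

def nonzero_nullvector(M, p):
    """Gaussian elimination mod p; returns a nonzero solution of M v = 0."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    pivots, r = {}, 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if M[i][c]), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], p - 2, p)      # inverse via Fermat's little theorem
        M[r] = [v * inv % p for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                fct = M[i][c]
                M[i] = [(M[i][j] - fct * M[r][j]) % p for j in range(cols)]
        pivots[c] = r
        r += 1
        if r == rows:
            break
    free = next(c for c in range(cols) if c not in pivots)
    v = [0] * cols
    v[free] = 1                           # one free variable set to 1 => nonzero solution
    for c, row in pivots.items():
        v[c] = -M[row][free] % p
    return v

q_coeffs = nonzero_nullvector(M, p)
assert any(q_coeffs)                      # Q is a nonzero polynomial

def polymul(u, v):
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] = (out[i + j] + a * b) % p
    return out

# Compose: R(X) = Q(X, f(X)) as a univariate coefficient vector (low degree first).
fx = [3, 2]                               # f(X) = 3 + 2X
R = [0] * (D + 1)
for (a, b), qc in zip(monomials, q_coeffs):
    term = [0] * a + [qc]                 # qc * X^a
    for _ in range(b):
        term = polymul(term, fx)          # times f(X)^b
    for i, c in enumerate(term):
        R[i] = (R[i] + c) % p

# Since t = 7 > D = 4, R(X) has more roots than its degree, so R(X) ≡ 0,
# i.e., Y − f(X) divides Q(X, Y) and f survives to the root-finding step.
print(all(c == 0 for c in R))             # True
```

The guarantee in the last line is exactly the weighted-degree argument of the text: R(X) has degree at most D = 4 but vanishes at the 7 agreement points, so it must be the zero polynomial.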
To get to the bound of 1 − (1 + δ)√R, Guruswami and Sudan in [63] added a further constraint on Q(X, Y), requiring it to have r roots at (γ^i, y_i), where r is some parameter (in [97] r = 1, while in [63] r is roughly 1/δ).

We now consider the next non-trivial case, s = m = 2 (the ideas for this case can be easily generalized to the general s = m case). Note that now, given the received word ((y_0, y_1), (y_2, y_3), ..., (y_{n−2}, y_{n−1})), we want to find all degree k polynomials f(X) such that for at least a 2(1 + δ)R^{2/3} fraction of positions 0 ≤ i ≤ n/2 − 1, f(γ^{2i}) = y_{2i} and f(γ^{2i+1}) = y_{2i+1}. As in the previous case, we will have an interpolation and a root-finding step. The interpolation step is a straightforward generalization of the m = 1 case: we find a trivariate polynomial Q(X, Y, Z) that fits the received word, that is, for every 0 ≤ i ≤ n/2 − 1, Q(γ^{2i}, y_{2i}, y_{2i+1}) = 0. Further, Q(X, Y, Z) has an upper bound on its (1, k, k)-weighted degree (which is a straightforward generalization of the (1, k)-weighted degree for the bivariate case) and has a multiplicity of r at every point. These straightforward generalizations and their various properties are recorded in Section 3.4.1.

For the root-finding step, it suffices to show that for every degree k polynomial f(X) that needs to be output, Q(X, f(X), f(γX)) ≡ 0. This, however, does not follow from the weighted degree and multiple root properties of Q(X, Y, Z). Here we will need two new ideas, the first of which is to show that for some irreducible polynomial E(X) of degree q − 1, f(X)^q ≡ f(γX) mod E(X) (this is Lemma 3.4). The second idea, due to Parvaresh and Vardy [85], is the following. We first obtain the bivariate polynomial (over an appropriate extension field) T(Y, Z) = Q(X, Y, Z) mod E(X). Note that by our first idea, we are looking for solutions on the curve Z = Y^q (Y corresponds to f(X) and Z corresponds to f(γX) in the extension field).
The crux of the argument is to show that all the polynomials f(X) that need to be output correspond (in the extension field) to some root of the equation T(Y, Y^q) = 0. See Section 3.4.3 for the details.

As was mentioned earlier, the extension of the s = m = 2 case to the general s = m case is fairly straightforward (and is presented in part as Lemma 3.6). To go from s = m to any s ≤ m requires another simple idea: we will reduce the problem of list decoding the folded Reed-Solomon code with folding parameter m to the problem of list decoding a folded Reed-Solomon code with folding parameter s. We then use the algorithm outlined in the previous paragraph for the folded Reed-Solomon code with folding parameter s. A careful tracking of the agreement parameter in the reduction brings down the final agreement fraction (that is required for the original folded Reed-Solomon code with folding parameter m) from roughly m(1 + δ)R^{s/(s+1)} (which can be obtained without the reduction) to (1 + δ)(mR/(m − s + 1))^{s/(s+1)}. This reduction is presented in detail in Section 3.4 for the s = 2 case. The generalization to any s ≤ m is presented in Section 3.5.

3.4 Trivariate Interpolation Based Decoding

As mentioned in the previous section, the list-decoding algorithm for RS codes from [97, 63] is based on bivariate interpolation. The key factor driving the agreement parameter t needed for the decoding to be successful was the ((1, k)-weighted) degree D of the interpolated bivariate polynomial. Our quest for an improved algorithm for folded RS codes will be based on trying to lower this degree D by using more degrees of freedom in the interpolation. Specifically, we will try to use trivariate interpolation of a polynomial Q(X, Y_1, Y_2) through n points in F_q³. This enables us to perform the interpolation with D = O((k²n)^{1/3}), which is much smaller than the Ω(√(kn)) bound for bivariate interpolation. In principle, this could lead to an algorithm that works for agreement fraction R^{2/3} instead of R^{1/2}.
Of course, this is a somewhat simplistic hope and additional ideas are needed to make this approach work. We now turn to the task of developing a trivariate interpolation based decoder and proving that it can indeed decode up to a 1 − R^{2/3} fraction of errors (approximately).

3.4.1 Facts about Trivariate Interpolation

We begin with some basic definitions and facts concerning trivariate polynomials.

Definition 3.2. For a polynomial Q(X, Y_1, Y_2) ∈ F_q[X, Y_1, Y_2], its (1, k, k)-weighted degree is defined to be the maximum value of ℓ + k·j_1 + k·j_2 taken over all monomials X^ℓ Y_1^{j_1} Y_2^{j_2} that occur with a nonzero coefficient in Q(X, Y_1, Y_2). If Q(X, Y_1, Y_2) ≡ 0, its (1, k, k)-weighted degree is 0.

Definition 3.3 (Multiplicity of zeroes). A polynomial Q(X, Y_1, Y_2) over F_q is said to have a zero of multiplicity r ≥ 1 at a point (α, β_1, β_2) ∈ F_q³ if Q(X + α, Y_1 + β_1, Y_2 + β_2) has no monomial of degree less than r with a nonzero coefficient. (The degree of the monomial X^ℓ Y_1^{j_1} Y_2^{j_2} equals ℓ + j_1 + j_2.)

Lemma 3.1. Let {(α_i, y_{i1}, y_{i2})}_{i=1}^{n} be an arbitrary set of triples from F_q³. Let Q(X, Y_1, Y_2) ∈ F_q[X, Y_1, Y_2] be a nonzero polynomial of (1, k, k)-weighted degree at most D that has a zero of multiplicity r at (α_i, y_{i1}, y_{i2}) for every i ∈ [n]. Let f(X), g(X) be polynomials of degree at most k such that for at least t > D/r values of i ∈ [n], we have f(α_i) = y_{i1} and g(α_i) = y_{i2}. Then Q(X, f(X), g(X)) ≡ 0.

Proof. If we define P(X) := Q(X, f(X), g(X)), then P(X) is a univariate polynomial of degree at most D, and for every i ∈ [n] for which f(α_i) = y_{i1} and g(α_i) = y_{i2}, (X − α_i)^r divides P(X). Therefore if rt > D, then P(X) has more roots (counting multiplicities) than its degree, and so it must be the zero polynomial.

Lemma 3.2.
Given an arbitrary set of triples {(α_i, y_{i1}, y_{i2})}_{i=1}^{n} from F_q³ and an integer parameter r ≥ 1, there exists a nonzero polynomial Q(X, Y_1, Y_2) over F_q of (1, k, k)-weighted degree at most D such that Q(X, Y_1, Y_2) has a zero of multiplicity r at (α_i, y_{i1}, y_{i2}) for all i ∈ [n], provided D³/(6k²) > n·C(r+2, 3), where C(a, b) denotes the binomial coefficient "a choose b". Moreover, we can find such a Q(X, Y_1, Y_2) in time polynomial in n, r by solving a system of homogeneous linear equations over F_q.

Proof. We begin with the following claims: (i) the condition that Q(X, Y_1, Y_2) has a zero of multiplicity r at (α_i, y_{i1}, y_{i2}) for all i ∈ [n] amounts to n·C(r+2, 3) homogeneous linear conditions in the coefficients of Q; and (ii) the number of monomials in Q(X, Y_1, Y_2), i.e., the number, say N_3(k, D), of triples (ℓ, j_1, j_2) of nonnegative integers that obey ℓ + k·j_1 + k·j_2 ≤ D, is at least D³/(6k²). Hence, if D³/(6k²) > n·C(r+2, 3), then the number of unknowns exceeds the number of equations, and we are guaranteed a nonzero solution.

To complete the proof, we prove the two claims. To prove the first claim, it suffices to show that for an arbitrary point (α, β, η), the condition that Q(X, Y, Z) has multiplicity r at (α, β, η) amounts to C(r+2, 3) homogeneous linear constraints. By the definition of multiplicity of roots, this amounts to setting the coefficients of all monomials of total degree less than r in Q(X + α, Y + β, Z + η) to zero. In particular, the coefficient of the monomial X^{i_1} Y^{i_2} Z^{i_3} in Q(X + α, Y + β, Z + η) is a linear form in the coefficients of Q, so the condition on the multiplicity of the roots of Q(X, Y, Z) at (α, β, η) follows if, for every triple (i_1, i_2, i_3) such that i_1 + i_2 + i_3 < r, the following is satisfied:

    Σ_{ℓ_1 ≥ i_1, ℓ_2 ≥ i_2, ℓ_3 ≥ i_3} C(ℓ_1, i_1) C(ℓ_2, i_2) C(ℓ_3, i_3) α^{ℓ_1 − i_1} β^{ℓ_2 − i_2} η^{ℓ_3 − i_3} q_{ℓ_1, ℓ_2, ℓ_3} = 0,

where q_{ℓ_1, ℓ_2, ℓ_3} is the coefficient of X^{ℓ_1} Y^{ℓ_2} Z^{ℓ_3} in Q(X, Y, Z). The claim follows by noting that the number of nonnegative integral solutions to i_1 + i_2 + i_3 < r is C(r+2, 3).
To prove the second claim, following [85], we will first show that the number N_3(k, D) is at least as large as the volume of the 3-dimensional region P = {(x, y_1, y_2) : x, y_1, y_2 ≥ 0 and x + k·y_1 + k·y_2 ≤ D}. Consider the correspondence between monomials in F_q[X, Y, Z] and unit cubes in R³: X^{i_1} Y^{i_2} Z^{i_3} ↔ U(i_1, i_2, i_3), where U(i_1, i_2, i_3) = [i_1, i_1 + 1) × [i_2, i_2 + 1) × [i_3, i_3 + 1). Note that the volume of each such cube is 1. Thus, N_3(k, D) is the volume of the union U of the cubes U(i_1, i_2, i_3) over nonnegative integers i_1, i_2, i_3 with i_1 + k·i_2 + k·i_3 ≤ D. It is easy to see that P ⊆ U. To complete the claim we will show that the volume of P equals D³/(6k²). Indeed, the volume of P is

    ∫_0^{D/k} ∫_0^{(D − k·y_2)/k} (D − k·y_1 − k·y_2) dy_1 dy_2 = ∫_0^{D/k} (D − k·y_2)²/(2k) dy_2 = D³/(6k²),

where the last equality follows by substituting z = D − k·y_2.

3.4.2 Using Trivariate Interpolation for Folded RS Codes

Let us now see how trivariate interpolation can be used in the context of decoding the folded RS code C = FRS_{F_q,γ,m,k} of block length N = n/m. (Throughout this section, we denote n = q − 1.) Given a received word z ∈ (F_q^m)^N for C that needs to be list decoded, we define y ∈ F_q^n to be the corresponding "unfolded" received word. (Formally, let the j'th symbol of z be (z_{j,0}, ..., z_{j,m−1}) for 0 ≤ j < N. Then y is defined by y_{jm+l} = z_{j,l} for 0 ≤ j < N and 0 ≤ l < m.)

Suppose that f(X) is a polynomial whose encoding agrees with z on at least t locations (note that the agreement is on symbols from F_q^m). Then, here is an obvious but important observation: for at least t(m − 1) values of i, 0 ≤ i < n, both the equalities f(γ^i) = y_i and f(γ^{i+1}) = y_{i+1} hold. Define the notation g(X) = f(γX). Therefore, if we consider the triples (γ^i, y_i, y_{i+1}) ∈ F_q³ for i = 0, 1, ..., n − 1 (with the convention y_n = y_0), then for at least t(m − 1) triples, we have f(γ^i) = y_i and g(γ^i) = y_{i+1}.
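The two counting facts used in the proof of Lemma 3.2 are easy to check numerically. The following short sketch (our own, with arbitrarily chosen parameters) verifies both the monomial-count lower bound N_3(k, D) ≥ D³/(6k²) and the fact that the number of multiplicity conditions per point is C(r+2, 3).

```python
# Numerical sanity check (illustrative, not from the thesis) of the two
# counting facts in Lemma 3.2.
from math import comb

def num_monomials(k, D):
    # triples (a, b1, b2) of nonnegative integers with a + k*b1 + k*b2 <= D
    return sum(1
               for b1 in range(D // k + 1)
               for b2 in range((D - k * b1) // k + 1)
               for a in range(D - k * b1 - k * b2 + 1))

def num_constraints(r):
    # triples (i1, i2, i3) of nonnegative integers with i1 + i2 + i3 < r
    return sum(1 for i1 in range(r) for i2 in range(r - i1)
               for i3 in range(r - i1 - i2))

for k, D in [(3, 20), (5, 40), (10, 35)]:
    assert num_monomials(k, D) >= D**3 / (6 * k**2)

for r in range(1, 8):
    assert num_constraints(r) == comb(r + 2, 3)

print("counting facts verified")
```

The volume argument of the text is exactly why the first assertion holds: each qualifying monomial owns a unit cube, and those cubes cover the simplex-like region of volume D³/(6k²).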
This suggests that by interpolating a polynomial Q(X, Y_1, Y_2) through these n triples and employing Lemma 3.1, we can hope that f(X) will satisfy Q(X, f(X), f(γX)) ≡ 0, and then somehow use this to find f(X). We formalize this in the following lemma. The proof follows immediately from the preceding discussion and Lemma 3.1.

Lemma 3.3. Let z ∈ (F_q^m)^N and let y ∈ F_q^n be the unfolded version of z. Let Q(X, Y_1, Y_2) be any nonzero polynomial over F_q of (1, k, k)-weighted degree at most D that has a zero of multiplicity r at (γ^i, y_i, y_{i+1}) for i = 0, 1, ..., n − 1. Let t be an integer such that t > D/((m − 1)r). Then every polynomial f(X) ∈ F_q[X] of degree at most k whose encoding according to FRS_{F_q,γ,m,k} agrees with z on at least t locations satisfies Q(X, f(X), f(γX)) ≡ 0.

Lemmas 3.2 and 3.3 motivate the following approach to list decoding the folded RS code FRS_{F_q,γ,m,k}. Here z ∈ (F_q^m)^N is the received word and y = (y_0, y_1, ..., y_{n−1}) ∈ F_q^n is its unfolded version. The algorithm uses an integer multiplicity parameter r ≥ 1, and is intended to work for an agreement parameter 1 ≤ t ≤ N.

Algorithm Trivariate-FRS-decoder:

Step 1 (Trivariate Interpolation) Define the degree parameter

    D = ⌊(k²·n·r(r+1)(r+2))^{1/3}⌋ + 1.  (3.1)

Interpolate a nonzero polynomial Q(X, Y_1, Y_2) with coefficients from F_q with the following two properties: (i) Q has (1, k, k)-weighted degree at most D, and (ii) Q has a zero of multiplicity r at (γ^i, y_i, y_{i+1}) for i = 0, 1, ..., n − 1 (where y_n = y_0). (Lemma 3.2 guarantees the feasibility of this step as well as its computability in time polynomial in n and r.)

Step 2 (Trivariate "Root-finding") Find a list of all degree ≤ k polynomials f(X) ∈ F_q[X] such that Q(X, f(X), f(γX)) ≡ 0. Output those whose encoding agrees with z on at least t locations.

Ignoring the time complexity of Step 2 for now, we can already claim the following result concerning the error-correction performance of this algorithm.

Theorem 3.1.
The algorithm Trivariate-FRS-decoder successfully list decodes the folded Reed-Solomon code FRS_{F_q,γ,m,k} up to a radius

    N − ⌊ (m/(m−1)) · ((1 + 1/r)(1 + 2/r))^{1/3} · N · (k/n)^{2/3} ⌋ − 2.

Proof. By Lemma 3.3, we know that any f(X) whose encoding agrees with z on t or more locations will be output in Step 2, provided t > D/((m − 1)r). For the choice of D in (3.1), this condition is met for the choice

    t = ⌊ (m/(m−1)) · ((1 + 1/r)(1 + 2/r))^{1/3} · N · (k/n)^{2/3} ⌋ + 2.

Indeed, we have

    D/((m−1)r) ≤ ( (k²·n·r(r+1)(r+2))^{1/3} + 1 ) / ((m−1)r)
               = (m/(m−1)) · N · (k/n)^{2/3} · ((1 + 1/r)(1 + 2/r))^{1/3} + 1/((m−1)r)
               < t,

where the first inequality follows from (3.1) and the fact that ⌊x⌋ ≤ x for any real x ≥ 0, while the last step follows from the fact that x < ⌊x⌋ + 1 for any real x ≥ 0. The decoding radius is equal to N − t, and recalling that n = mN, we get the bound claimed in the theorem.

The rate of the folded Reed-Solomon code is R = (k + 1)/n > k/n, and so the fraction of errors corrected (for large enough r) is 1 − (m/(m−1))·R^{2/3}. Letting the parameter m grow, we can approach a decoding radius of 1 − R^{2/3}.

3.4.3 Root-finding Step

In light of the above discussion, the only missing piece in our decoding algorithm is an efficient way to solve the following trivariate "root-finding" problem: given a nonzero polynomial Q(X, Y_1, Y_2) with coefficients from a finite field F_q, a primitive element γ of the field F_q, and an integer parameter k < q − 1, find the list of all polynomials f(X) of degree at most k such that Q(X, f(X), f(γX)) ≡ 0. The following simple algebraic lemma is at the heart of our solution to this problem.

Lemma 3.4. Let γ be a primitive element that generates F_q^*. Then we have the following two facts:

1. The polynomial E(X) := X^{q−1} − γ is irreducible over F_q.

2. Every polynomial f(X) ∈ F_q[X] of degree less than q − 1 satisfies f(γX) = f(X)^q mod E(X).

Proof.
The fact that E(X) = X^{q−1} − γ is irreducible over F_q follows from a known, precise characterization of all irreducible binomials, i.e., polynomials of the form X^a − c; see for instance [77, Chap. 3, Sec. 5]. For completeness, and since this is an easy special case, we now prove this fact. Suppose E(X) is not irreducible and some irreducible polynomial h(X) ∈ F_q[X] of degree b, 1 ≤ b < q − 1, divides it. Let ζ be a root of h(X) in the extension field F_{q^b}. We then have ζ^{q^b − 1} = 1. Also, h(ζ) = 0 implies E(ζ) = 0, which implies ζ^{q−1} = γ. These equations together imply γ^{(q^b − 1)/(q − 1)} = 1. Now, γ is primitive in F_q, so that γ^e = 1 iff e is divisible by q − 1. We conclude that q − 1 must divide 1 + q + q² + ··· + q^{b−1}. This is, however, impossible since 1 + q + q² + ··· + q^{b−1} ≡ b (mod q − 1) and 0 < b < q − 1. This contradiction proves that E(X) has no factor of degree less than q − 1, and is therefore irreducible.

For the second part, we have the simple but useful identity f(X)^q = f(X^q) that holds for all polynomials f ∈ F_q[X]. Therefore, f(X)^q − f(γX) = f(X^q) − f(γX). Since X^q ≡ γX (mod X^q − γX), the polynomial f(X^q) − f(γX) is divisible by X^q − γX, and thus also by X^{q−1} − γ. Hence f(X)^q ≡ f(γX) (mod E(X)), which implies that f(X)^q mod E(X) = f(γX), since the degree of f(γX) is less than q − 1.

Armed with this lemma, we are ready to tackle the trivariate root-finding problem.

Lemma 3.5. There is a deterministic algorithm that on input a finite field F_q, a primitive element γ of the field F_q, a nonzero polynomial Q(X, Y_1, Y_2) ∈ F_q[X, Y_1, Y_2] of degree less than q in Y_1, and an integer parameter k < q − 1, outputs a list of all polynomials f(X) of degree at most k satisfying the condition Q(X, f(X), f(γX)) ≡ 0. The algorithm has run time polynomial in q.

Proof. Let E(X) = X^{q−1} − γ. We know by Lemma 3.4 that E(X) is irreducible.
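The identity in the second part of Lemma 3.4 is easy to verify exhaustively in a tiny example. The following sketch (ours; q = 5 and γ = 2 are illustrative choices, with polynomials represented as coefficient lists, low degree first) checks f(γX) = f(X)^q mod E(X) for all 625 polynomials of degree less than q − 1 = 4 over F_5.

```python
# Illustrative check (assumed small parameters, not from the thesis) of
# Lemma 3.4 over F_5: for every f with deg f < 4, f(γX) = f(X)^q mod E(X),
# where γ = 2 is primitive mod 5 and E(X) = X^4 − 2.
from itertools import product

q, gamma = 5, 2
E = [(-gamma) % q, 0, 0, 0, 1]           # E(X) = X^4 - 2, low degree first

def polymul(u, v):
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] = (out[i + j] + a * b) % q
    return out

def polymod(u, m):
    u = u[:]
    while len(u) >= len(m):              # cancel the leading term repeatedly
        c, d = u[-1], len(u) - len(m)
        if c:
            for i in range(len(m)):
                u[d + i] = (u[d + i] - c * m[i]) % q
        u.pop()
    return u + [0] * (len(m) - 1 - len(u))

def polypow(u, e, m):
    out = [1]
    for _ in range(e):
        out = polymod(polymul(out, u), m)
    return out

for coeffs in product(range(q), repeat=4):          # every f with deg f < 4
    f = list(coeffs)
    f_gamma_x = [(c * pow(gamma, i, q)) % q for i, c in enumerate(f)]
    assert polymod(f_gamma_x, E) == polypow(f, q, E)
print("f(γX) ≡ f(X)^5 (mod X^4 − 2) for all 625 polynomials over F_5")
```

The check succeeds for the reason given in the proof: f(X)^5 = f(X^5) in characteristic 5, and X^5 ≡ 2X modulo X^4 − 2.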
We first divide out the largest power of E(X) that divides Q(X, Y_1, Y_2) to obtain Q_0(X, Y_1, Y_2), where Q(X, Y_1, Y_2) = E(X)^b · Q_0(X, Y_1, Y_2) for some b ≥ 0 and E(X) does not divide Q_0(X, Y_1, Y_2). Note that since E(X)^b is a nonzero polynomial, if f(X) satisfies Q(X, f(X), f(γX)) ≡ 0, then Q_0(X, f(X), f(γX)) ≡ 0 as well, so we will work with Q_0 instead of Q. Let us view Q_0(X, Y_1, Y_2) as a polynomial T_0(Y_1, Y_2) with coefficients from F_q[X]. Further, reduce each of the coefficients modulo E(X) to get a polynomial T(Y_1, Y_2) with coefficients from the extension field F_{q^{q−1}} (which is isomorphic to F_q[X]/(E(X)), as E(X) is irreducible over F_q). We note that T(Y_1, Y_2) is a nonzero polynomial since Q_0(X, Y_1, Y_2) is not divisible by E(X).

In view of Lemma 3.4, it suffices to find degree ≤ k polynomials f(X) satisfying Q_0(X, f(X), f(X)^q) ≡ 0 (mod E(X)). In turn, this means it suffices to find elements Γ ∈ F_{q^{q−1}} satisfying T(Γ, Γ^q) = 0. If we define the univariate polynomial A(Y_1) := T(Y_1, Y_1^q), this is equivalent to finding all Γ ∈ F_{q^{q−1}} such that A(Γ) = 0, or in other words the roots in F_{q^{q−1}} of A(Y_1).

Now A(Y_1) is a nonzero polynomial, since A(Y_1) ≡ 0 iff Y_2 − Y_1^q divides T(Y_1, Y_2), and this cannot happen as T(Y_1, Y_2) has degree less than q in Y_1. The degree of A(Y_1) is at most dq, where d is the total degree of Q(X, Y_1, Y_2). The characteristic of F_{q^{q−1}} is at most q, and its degree over the prime field is at most q·log q. Therefore, by Theorem 2.4 we can find all roots of A(Y_1) by a deterministic algorithm running in time polynomial in d and q. Each of the roots will be a polynomial in F_q[X] of degree less than q − 1. Once we find all the roots, we prune the list and only output those roots f(X) that have degree at most k and satisfy Q_0(X, f(X), f(γX)) ≡ 0.

With this, we have a polynomial time implementation of the algorithm Trivariate-FRS-decoder.
There is the technicality that the degree of Q(X, Y_1, Y_2) in Y_1 should be less than q. This degree is at most D/k, which by the choice of D in (3.1) is at most (r + 2)(n/k)^{1/3} + 1. For a fixed r and growing q, the degree is much smaller than q. (In fact, for constant rate codes, the degree is a constant independent of n.) By letting m, r grow in Theorem 3.1, and recalling that the running time is polynomial in n and r, we can conclude the following main result of this section.

Theorem 3.2. For every ε > 0 and R, 0 < R < 1, there is a family of m-folded Reed-Solomon codes for m in O(1/ε) that have rate at least R and that can be list decoded up to a fraction 1 − (1 + ε)R^{2/3} of errors in time polynomial in the block length and 1/ε.

Remark 3.2 (Optimality of degree q of the relation between f(X) and f(γX)). Let F_{q^{q−1}} be the extension field F_q[X]/(E(X)); its elements are in one-one correspondence with polynomials of degree less than q − 1 over F_q. Let Γ : F_{q^{q−1}} → F_{q^{q−1}} be such that for every f(X) ∈ F_{q^{q−1}}, Γ(f(X)) = A(f(X)) for some polynomial A over F_{q^{q−1}}. (In the above, we had Γ(f(X)) = f(X)^q mod E(X), so that A, as a polynomial over F_{q^{q−1}}, was A(Z) = Z^q, and hence had degree q.) Any such map Γ is an F_q-linear function on F_{q^{q−1}}, and is therefore a linearized polynomial, cf. [77, Chap. 3, Sec. 4], which has only terms with exponents that are powers of q (including q^0 = 1). It turns out that for our purposes Γ cannot have degree 1, and so it must have degree at least q.

3.5 Codes Approaching List Decoding Capacity

Given that trivariate interpolation improved the decoding radius achievable with rate R from 1 − R^{1/2} to 1 − R^{2/3}, it is natural to attempt to use higher order interpolation to improve the decoding radius further. In this section, we discuss the quite straightforward technical changes needed for such a generalization.

Consider again the m-folded RS code C = FRS_{F_q,γ,m,k}. Let s be an integer in the range 1 ≤ s ≤ m.
We will develop a decoding algorithm based on interpolating an (s+1)-variate polynomial Q(X, Y_1, Y_2, ..., Y_s). The definitions of the (1, k, k, ..., k)-weighted degree (with k repeated s times) of Q and the multiplicity at a point (α, β_1, β_2, ..., β_s) ∈ F_q^{s+1} are straightforward extensions of Definitions 3.2 and 3.3.

As before, let y = (y_0, y_1, ..., y_{n−1}) be the unfolded version of the received word z ∈ (F_q^m)^N of the folded RS code that needs to be decoded. For convenience, define y_j = y_{j mod n} for j ≥ n. Following algorithm Trivariate-FRS-decoder, for suitable integer parameters D, r, the interpolation phase of the (s+1)-variate FRS decoder will fit a nonzero polynomial Q(X, Y_1, ..., Y_s) with the following properties:

1. It has (1, k, k, ..., k)-weighted degree at most D.

2. It has a zero of multiplicity r at (γ^i, y_i, y_{i+1}, ..., y_{i+s−1}) for i = 0, 1, ..., n − 1.

The following is a straightforward generalization of Lemmas 3.2 and 3.3.

Lemma 3.6. (a) Provided D^{s+1}/((s+1)!·k^s) > n·C(r+s, s+1), a nonzero polynomial Q(X, Y_1, ..., Y_s) with the following properties exists: Q(X, Y_1, ..., Y_s) has (1, k, ..., k)-weighted degree at most D and has roots with multiplicity r at (γ^i, y_i, y_{i+1}, ..., y_{i+s−1}) for every i ∈ {0, ..., n−1}. Moreover, such a Q(X, Y_1, ..., Y_s) can be found in time polynomial in n, r and D^{s+1}/k^s.

(b) Let t be an integer such that t > D/((m − s + 1)r). Then every polynomial f(X) ∈ F_q[X] of degree at most k whose encoding according to FRS_{F_q,γ,m,k} agrees with the received word z on at least t locations satisfies Q(X, f(X), f(γX), ..., f(γ^{s−1}X)) ≡ 0.

Proof. The first part follows from (i) a simple lower bound on the number of monomials X^a Y_1^{b_1} ··· Y_s^{b_s} with a + k·b_1 + ··· + k·b_s ≤ D, which gives the number of coefficients of Q(X, Y_1, ..., Y_s), and (ii) an estimation of the number of (s+1)-variate monomials of total degree less than r, which gives the number of interpolation conditions per (s+1)-tuple.
We now briefly justify these claims. By a generalization of the argument in Lemma 3.2, one can lower bound the number of monomials X^a Y_1^{b_1} ··· Y_s^{b_s} with a + k·b_1 + ··· + k·b_s ≤ D by the volume of P_s = {(x, y_1, y_2, ..., y_s) : x, y_1, ..., y_s ≥ 0 and x + k·y_1 + k·y_2 + ··· + k·y_s ≤ D}. We will use induction on s to prove that the volume of P_s is D^{s+1}/((s+1)!·k^s). The proof of Lemma 3.2 shows this for s = 2. Now assume that the volume of P_{s−1} is D^s/(s!·k^{s−1}). Note that the subset of P_s where the value of y_s = z is fixed is exactly P_{s−1} with D replaced by D − kz. Thus, the volume of P_s is exactly

    ∫_0^{D/k} (D − k·y_s)^s / (s!·k^{s−1}) dy_s = D^{s+1}/((s+1)!·k^s),

where the equality follows by substituting u = D − k·y_s.

Further, a straightforward generalization of the argument in the proof of Lemma 3.2 shows that the condition on the multiplicity of the polynomial Q(X, Y_1, ..., Y_s) is satisfied if for every i ∈ {0, ..., n−1} and every tuple (j_0, j_1, ..., j_s) such that j_0 + j_1 + ··· + j_s < r, the following holds:

    Σ_{ℓ_0 ≥ j_0, ℓ_1 ≥ j_1, ..., ℓ_s ≥ j_s} C(ℓ_0, j_0) C(ℓ_1, j_1) ··· C(ℓ_s, j_s) (γ^i)^{ℓ_0 − j_0} y_i^{ℓ_1 − j_1} ··· y_{i+s−1}^{ℓ_s − j_s} q_{ℓ_0, ℓ_1, ..., ℓ_s} = 0,

where q_{ℓ_0, ℓ_1, ..., ℓ_s} is the coefficient of the monomial X^{ℓ_0} Y_1^{ℓ_1} ··· Y_s^{ℓ_s} in Q(X, Y_1, ..., Y_s). The number of nonnegative integral solutions to j_0 + j_1 + ··· + j_s < r is exactly C(r+s, s+1). Thus, the total number of constraints is n·C(r+s, s+1), and the condition in part (a) of the lemma implies that the system of homogeneous linear equations has more variables than constraints. Hence, a solution can be found in time polynomial in the number of variables and the number of constraints.

The second part is similar to the proof of Lemma 3.3. If f(X) has agreement on at least t locations of z, then for at least t(m − s + 1) of the (s+1)-tuples (γ^i, y_i, y_{i+1}, ..., y_{i+s−1}), we have f(γ^{i+j}) = y_{i+j} for j = 0, 1, ..., s − 1. As in Lemma 3.1, we conclude that P(X) := Q(X, f(X), f(γX), ..., f(γ^{s−1}X)) has a zero of multiplicity r at γ^i for each such (s+1)-tuple. Also, by design P(X) has degree at most D.
Hence, if t(m − s + 1)r > D, then P(X) has more zeroes (counting multiplicities) than its degree, and thus P(X) ≡ 0.

Note that the lower bound condition on D above is met with the choice

    D = ⌊ (k^s · n · r(r+1) ··· (r+s))^{1/(s+1)} ⌋ + 1.  (3.2)

The task of finding the list of all degree ≤ k polynomials f(X) ∈ F_q[X] satisfying Q(X, f(X), f(γX), ..., f(γ^{s−1}X)) ≡ 0 can be solved using ideas similar to the proof of Lemma 3.5. First, by dividing out by E(X) enough times, we can assume that not all coefficients of Q(X, Y_1, ..., Y_s), viewed as a polynomial in Y_1, ..., Y_s with coefficients in F_q[X], are divisible by E(X). We can then go modulo E(X) to get a nonzero polynomial T(Y_1, Y_2, ..., Y_s) over the extension field F_{q^{q−1}} = F_q[X]/(E(X)). Now, by Lemma 3.4, we have f(γ^j X) = f(X)^{q^j} mod E(X) for every j ≥ 1. Therefore, the task at hand reduces to the problem of finding all roots Γ ∈ F_{q^{q−1}} of the polynomial A(Y_1), where A(Y_1) = T(Y_1, Y_1^q, ..., Y_1^{q^{s−1}}). There is the risk that A(Y_1) is the zero polynomial, but it is easily seen that this cannot happen if the total degree of T is less than q. This will be the case since the total degree is at most D/k, which is at most (r + s)(n/k)^{1/(s+1)}, much smaller than q.

The degree of the polynomial A(Y_1) is at most q^s, and therefore all its roots in F_{q^{q−1}} can be found in q^{O(s)} time (by Theorem 2.4). We conclude that the "root-finding" step can be accomplished in polynomial time.

The algorithm works for agreement t > D/((m − s + 1)r), which for the choice of D in (3.2) is satisfied if

    t ≥ (1 + s/r) · (k^s n)^{1/(s+1)} / (m − s + 1) + 2.

Indeed,

    D/((m − s + 1)r) ≤ ( (k^s · n · r(r+1) ··· (r+s))^{1/(s+1)} + 1 ) / ((m − s + 1)r)
                     ≤ ( (1 + 1/r)(1 + 2/r) ··· (1 + s/r) )^{1/(s+1)} · (k^s n)^{1/(s+1)} / (m − s + 1) + 1
                     ≤ (1 + s/r) · (k^s n)^{1/(s+1)} / (m − s + 1) + 1
                     < t,

where the first inequality follows from (3.2) along with the fact that ⌊x⌋ ≤ x for any real x ≥ 0, while the third inequality follows by upper bounding 1 + j/r by 1 + s/r for every 0 ≤ j ≤ s.
We record these observations in the following, which is a multivariate generalization of Theorem 3.1.

Theorem 3.3. For every integer m ≥ 1 and every s, 1 ≤ s ≤ m, the (s+1)-variate FRS decoder successfully list decodes the m-folded Reed-Solomon code FRS_{F_q,γ,m,k} up to a radius N − t as long as the agreement parameter t satisfies

    t ≥ (1 + s/r) · (k^s n)^{1/(s+1)} / (m − s + 1) + 2.  (3.3)

The algorithm runs in q^{O(s)} time and outputs a list of size at most q^s.

Recalling that the block length of FRS_{F_q,γ,m,k} is N = n/m and the rate is (k + 1)/n, the above algorithm can decode a fraction of errors approaching

    1 − (1 + s/r) · (m/(m − s + 1)) · R^{s/(s+1)}  (3.4)

using lists of size at most q^s. By picking r and m large enough compared to s, the decoding radius can be made larger than 1 − (1 + ε)R^{s/(s+1)} for any desired ε > 0. We state this result formally below.

Theorem 3.4. For every 0 < ε ≤ 1, integer s ≥ 1 and 0 < R < 1, there is a family of m-folded Reed-Solomon codes for m = O(s/ε) that have rate at least R and which can be list decoded up to a 1 − (1 + ε)R^{s/(s+1)} fraction of errors in time (Nm)^{O(s)}, outputting a list of size at most (Nm)^{O(s)}, where N is the block length of the code. The alphabet size of the code as a function of the block length N is (Nm)^{O(m)}.

Proof. We first instantiate the parameters r and m in terms of s and ε (one convenient choice; the exact constants are flexible):

    r = ⌈3s/ε⌉  and  m = ⌈3s/ε⌉ + s.

Note that as ε ≤ 1, m = O(s/ε). With the above choice, we have

    (1 + s/r) · (m/(m − s + 1)) ≤ (1 + ε/3)² ≤ 1 + ε.

Together with the bound (3.4) on the decoding radius, we conclude that the (s+1)-variate decoding algorithm certainly list decodes up to a fraction 1 − (1 + ε)R^{s/(s+1)} of errors. The worst case list size is q^s, and the claim on the list size follows by recalling that q = n + 1 and n = Nm. The alphabet size is q^m = (Nm + 1)^m. The running time has two major components: (1) interpolating the (s+1)-variate polynomial Q, which by Lemma 3.6 takes time polynomial in n and r^{s+1}; and (2) finding all the roots of the interpolated polynomial, which takes q^{O(s)} time.
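The trade-off in (3.4) is easy to explore numerically. The sketch below (our own illustration; the growth schedule m = r = 200s and the rate R = 1/2 are arbitrary choices) shows the decoding radius climbing from the Guruswami-Sudan bound 1 − √R toward the capacity 1 − R as s grows.

```python
# Illustrative numerics (not from the thesis): the decoding radius
# 1 − (1 + s/r)·(m/(m−s+1))·R^(s/(s+1)) from (3.4) approaches the
# list-decoding capacity 1 − R as s (and with it m and r) grows.
R = 0.5
capacity = 1 - R                       # = 0.5

def radius(s, m, r):
    return 1 - (1 + s / r) * (m / (m - s + 1)) * R ** (s / (s + 1))

prev = radius(1, 1, 10**6)             # s = m = 1, huge r: essentially 1 − √R
for s in [2, 4, 8, 16, 32]:
    m = r = 200 * s                    # grow folding and multiplicity with s
    cur = radius(s, m, r)
    assert cur > prev                  # radius improves at every step here
    prev = cur

assert capacity - prev < 0.02          # within 2% of capacity at s = 32
print(f"radius at s=32: {prev:.4f} (capacity {capacity})")
```

This mirrors the statement of Theorem 3.4: for any target slack ε, a folding parameter and multiplicity of order s/ε suffice to push the radius above 1 − (1 + ε)R^{s/(s+1)}.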
Of the two, the time complexity of the root-finding step dominates, and is q^{O(s)} = (Nm)^{O(s)}. In the limit of large s, the decoding radius approaches the list-decoding capacity 1 − R, leading to our main result.

Theorem 3.5 (Explicit capacity-approaching codes). For every 0 < R < 1 and 0 < ε ≤ R, there is a family of folded Reed-Solomon codes that have rate at least R and which can be list decoded up to a 1 − R − ε fraction of errors in time (and outputs a list of size at most) (N/ε²)^{O(ε^{-1} log(1/R))}, where N is the block length of the code. The alphabet size of the code as a function of the block length N is (N/ε²)^{O(1/ε²)}.

Proof. Given ε and R, we will apply Theorem 3.4 with the choice

    s := ⌈ log(1/R) / log(1+ε) ⌉  and  δ := ε(1−R) / ( R(1+ε) ).    (3.5)

Note that as ε ≤ R, δ < 1. Thus, the list-decoding radius guaranteed by Theorem 3.4 is at least

    1 − (1+δ) R^{s/(s+1)} = 1 − R (1+δ) (1/R)^{1/(s+1)}
                          ≥ 1 − R (1+δ)(1+ε)    (by the choice of s in (3.5))
                          = 1 − (R + ε)    (using the value of δ).

We now turn our attention to the time complexity of the decoding algorithm and the alphabet size of the code. To this end we first claim that m is O(1/ε²). First we note that by the choice of s,

    s ≤ 1 + log(1/R)/log(1+ε) ≤ 1 + 2 ln(1/R)/ε ≤ 3 ε^{-1} ln(1/R),

where the second inequality follows from the fact that ln(1+x) ≥ x/2 for 0 < x ≤ 1, and the last step uses ln(1/R) ≥ 1 − R > ε (we may assume ε < 1 − R, since otherwise the claimed decoding radius is negative and the theorem is trivial). Thus, we have

    m ≤ 4s/δ = 4s R(1+ε)/(ε(1−R)) ≤ 8sR/(ε(1−R)) ≤ 24 R ln(1/R)/(ε²(1−R)) ≤ 24/ε²,

where for the last step we used ln(1/R) ≤ 1/R − 1 for 0 < R ≤ 1. The claims on the running time, worst case list size and alphabet size of the code now follow from Theorem 3.4 and the facts that m is O(1/ε²) and s is O(ε^{-1} log(1/R)).

Remark 3.3 (Upper bound on ε in Theorem 3.5). A version of Theorem 3.5 can also be proven for ε > R. The reason it is stated for ε ≤ R is that we generally think of ε as much smaller than R (this is certainly true when we apply the generalization of Theorem 3.5 (Theorem 3.6) in Chapters 4 and 5). However, if one wants ε > R, first note that the theorem is trivial for ε ≥ 1 − R.
Then if R < ε < 1 − R, one can carry out a proof similar to the one above. However, in this range δ = ε(1−R)/(R(1+ε)) can be strictly greater than 1. In such a case we apply Theorem 3.4 with δ = 1 (note that applying Theorem 3.4 with a smaller δ than what we want only increases the decoding radius). This implies that m ≤ 4s, in which case both the worst case list size and the alphabet size become (N log(1/R)/ε)^{O(ε^{-1} log(1/R))}.

Remark 3.4 (Minor improvement to decoding radius). It is possible to slightly improve the bound of (3.4) to

    1 − ( 1 + s/(m−s+1) ) ( m/(m−s+1) )^{s/(s+1)} R^{s/(s+1)}

with essentially no effort. The idea is to not use a roughly (s−1)/m fraction of the (s+1)-tuples for interpolation. Specifically, we omit the tuples that straddle two consecutive blocks, i.e., those beginning at γ^i with i mod m > m − s. This does not affect the number of tuples for which we have agreement (this remains at least t(m−s+1)), but the number of interpolation conditions is reduced to n(m−s+1)/m = N(m−s+1). This translates into the stated improvement in list-decoding radius. For clarity of presentation, we simply chose to use all tuples for interpolation.

Remark 3.5 (Average list size). Theorem 3.5 states that the worst case list size (over all possible received words) is polynomial in the block length of the code (for fixed R and ε). One might also be interested in the average list size (over all possible received words within distance ρN of some codeword). It is known that for Reed-Solomon codes of rate R the average list size is O(1) even for ρ close to 1 − R [81]. Since folded Reed-Solomon codes are just Reed-Solomon codes with symbols bundled together, the arguments in [81] extend easily to show that even for folded Reed-Solomon codes, the average list size is O(1).

3.6 Extension to List Recovery

We now present a very useful generalization of the list decoding result of Theorem 3.5 to the setting of list recovery.
Recall that under the list recovery problem, one is given as input for each codeword position not just one but a set of several, say ℓ, alphabet symbols. The goal is to find and output all codewords which agree with some element of the input sets on sufficiently many positions. Codes for which this more general problem can be solved turn out to be extremely valuable as outer codes in concatenated code constructions. In short, this is because one can pass a set of possibilities from decodings of the inner codes and then list recover the outer code with those sets as the input. If we only had a list-decodable code at the outer level, we would be forced to make a unique choice in decoding the inner codes, thus losing valuable information.

This is a good time to recall the definition of list recoverable codes (Definition 2.4). Theorem 3.5 can be generalized to list recover the folded RS codes. Specifically, for a FRS code with parameters as in Section 3.5, for an arbitrary constant ℓ ≥ 1, we can (ζ, ℓ)-list recover in polynomial time provided

    1 − ζ ≥ ( 1 + s/(m−s+1) ) ( m/(m−s+1) ) ( ℓ R^s )^{1/(s+1)},    (3.6)

where N = n/m. We briefly justify this claim. The generalization of the list-decoding algorithm of Section 3.5 is straightforward: instead of one interpolation condition for each symbol of the received word, we just impose |S_i| ≤ ℓ many interpolation conditions for each position i ∈ {1, 2, …, N} (where S_i is the i'th input set in the list recovery instance). The number of interpolation conditions is at most ℓn, and so replacing n by ℓn in the bound of Lemma 3.6 guarantees successful decoding.² This in turn implies that the condition on the number of agreements of (3.3) generalizes to the one in (3.6).³ This simple generalization to list recovery is a positive feature of all interpolation based decoding algorithms [97, 63, 85], beginning with the one due to Sudan [97].

Picking m and s large enough (m ≫ s ≫ 1) in (3.6), we get (ζ, ℓ)-list recoverable codes with rate R for ζ ≤ 1 − (ℓ R^s)^{1/(s+1)}.
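The key feature of the bound above, namely that the dependence on ℓ becomes negligible as s grows, can be seen numerically. A small sketch (illustrative values; the function name is ours): the factor ℓ^{1/(s+1)} tends to 1, so the radius bound tends to 1 − R regardless of the input-list size ℓ.

```python
def list_recovery_radius(R, l, s):
    # 1 - (l * R^s)^(1/(s+1)): the simplified radius bound from the text
    return 1 - (l * R**s) ** (1 / (s + 1))

R, l = 0.5, 20
radii = [list_recovery_radius(R, l, s) for s in (1, 5, 20, 100)]
assert all(a < b for a, b in zip(radii, radii[1:]))       # improves with s
# for large s the radius is within 0.01 of the capacity 1 - R, despite l = 20
assert abs(list_recovery_radius(R, l, 1000) - (1 - R)) < 0.01
```

Note that for small s the bound can even be negative (vacuous); the gain from folding shows up only once s is moderately large.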
Now comes the remarkable fact: we can pick a suitable s ≫ log ℓ and perform (ζ, ℓ)-list recovery with ζ ≤ 1 − R − ε, which is independent of ℓ! We state the formal result below (Theorem 3.5 is the special case ℓ = 1).

Theorem 3.6. For every integer ℓ ≥ 1, for all R, 0 < R < 1 and 0 < ε ≤ R, and for every prime p, there is an explicit family of folded Reed-Solomon codes over fields of characteristic p that have rate at least R and which are (1−R−ε, ℓ, L(N))-list recoverable in polynomial time, where L(N) = (N/ε²)^{O(ε^{-1} log(ℓ/R))}. The alphabet size of a code of block length N in the family is (N/ε²)^{O(ε^{-2} log(ℓ/R))}.

Proof. (Sketch) Using the exact same arguments as in the proof of Theorem 3.4 applied to the agreement condition of (3.6), we get that one can list recover in polynomial time as long as ζ ≤ 1 − (1+δ)(ℓ R^s)^{1/(s+1)}, for any 0 < δ ≤ 1. The arguments to obtain an upper bound of 1 − R − ε are similar to the ones employed in the proof of Theorem 3.5. However, s needs to be defined in a slightly different manner:

    s := ⌈ log(ℓ/R) / log(1+ε) ⌉.

Also, this implies that m is O(ε^{-2} log(ℓ/R)), which implies the claimed bound on the alphabet size of the code as well as on L(N).

² In fact, this extension also works when the average size of the sets is at most ℓ, that is, ∑_{i=1}^{N} |S_i| ≤ ℓN.

³ We will also need a condition ensuring that ℓ is not too large compared to q; this is required to argue that in the “root finding” step, the “final” polynomial R(Y_1) is not the zero polynomial. The condition is met for constant rate codes as long as ℓ is polynomially smaller than q^{s+1} (recall that we think of q as growing while s and m remain fixed). In all our applications of list recovery for folded Reed-Solomon codes, the parameter ℓ will be a constant, so this is not a concern.

We also note that increasing the folding parameter m only helps improve the result (at the cost of a larger alphabet). In particular, we have the following corollary of the theorem above.

Corollary 3.7.
For every integer ℓ ≥ 1, for all constants 0 < ε ≤ R, for all R and R' with 0 < R' ≤ R < 1, and for every prime p, there is an explicit family of folded Reed-Solomon codes over fields of characteristic p that have rate at least R and which can be (1−R−ε, ℓ, L(N))-list recovered in polynomial time, where for codes of block length N, L(N) = (N/ε²)^{O(ε^{-1} log(ℓ/R))} and the code is defined over an alphabet of size (N/ε²)^{O(ε^{-2} log(ℓ/R'))}.

Note that one can trivially increase the alphabet of a code by thinking of every symbol as coming from a larger alphabet. However, this trivial transformation decreases the rate of the code. Corollary 3.7 states that for folded Reed-Solomon codes, we can increase the alphabet while retaining the rate and the list recoverability properties. At this point this extra feature is an odd one to state explicitly, but we will need this result in Chapter 4.

Remark 3.6 (Soft Decoding). The decoding algorithm for folded RS codes from Theorem 3.5 can be further generalized to handle soft information, where for each codeword position i the decoder is given as input a non-negative weight w_{i,z} for each possible alphabet symbol z. The weights w_{i,z} can be used to encode confidence information concerning the likelihood of the i'th symbol of the codeword being z. For any δ > 0, for a suitable choice of parameters, our codes of rate R over alphabet Σ have a soft decoding algorithm that outputs all codewords c = (c_1, c_2, …, c_N) that satisfy

    ∑_{i=1}^{N} w_{i,c_i} ≥ (1+δ) ( k^s · ∑_{i=1}^{N} ∑_{z∈Σ} w_{i,z}^{s+1} )^{1/(s+1)}.

For s = m = 1, this soft decoding condition is identical to the one for Reed-Solomon codes in [63].

3.7 Bibliographic Notes and Open Questions

We have solved the qualitative problem of achieving list-decoding capacity over large alphabets. Our work could, however, be improved with respect to some parameters. The size of the list needed to perform list decoding to a radius that is within ε of capacity grows as N^{O(ε^{-1} log(1/R))}, where N is the block length of the code.
It remains an open question to bring this list size down to a constant independent of N, or even to f(ε)·N^c with an exponent c independent of ε (we recall that the existential random coding arguments work with a list size of O(1/ε)).

The results in this chapter were first reported in [58]. We would like to point out that the presentation in this chapter is somewhat different from the original papers [85, 58] in terms of technical details, organization, as well as chronology. Our description closely follows that of a survey by Guruswami [50]. With the benefit of hindsight, we believe this alternate presentation to be simpler and more self-contained than the description in [58], which used the results of Parvaresh-Vardy as a black box. Below, we discuss some technical aspects of the original development of this material, in order to shed light on the origins of our work.

Two independent works by Coppersmith and Sudan [27] and Bleichenbacher, Kiayias and Yung [18] considered the variant of RS codes where the message consists of two (or more) independent polynomials over some field F, and the encoding consists of the joint evaluation of these polynomials at elements of F (so this defines a code over F²).⁴ A naive way to decode these codes, which are also called “interleaved Reed-Solomon codes,” would be to recover the two polynomials individually, by running separate instances of the RS decoder. Of course, this gives no gain over the performance of RS codes. The hope in these works was that something can possibly be gained by exploiting the fact that errors in the two polynomials happen at “synchronized” locations. However, these works could not give any improvement over the 1 − √R bound known for RS codes for worst-case errors. Nevertheless, for random errors, where each error replaces the correct symbol by a uniformly random field element, they were able to correct well beyond a fraction 1 − √R of errors.
In fact, as the order of interleaving (i.e., the number of independent polynomials) grows, the radius approaches the optimal value 1 − R. This model of random errors is not very practical or interesting in a coding-theoretic setting, though the algorithms are interesting from an algebraic viewpoint.

The algorithm of Coppersmith and Sudan bears an intriguing relation to multivariate interpolation. Multivariate interpolation essentially amounts to finding a non-trivial linear dependence among the rows of a certain matrix (which consists of the evaluations of appropriate monomials at the interpolation points). The algorithm in [27] instead finds a non-trivial linear dependence among the columns of this same matrix! The positions corresponding to columns not involved in this dependence are erased (they correspond to error locations), and the codeword is recovered from the remaining symbols using erasure decoding.

In [84], Parvaresh and Vardy gave a heuristic decoding algorithm for these interleaved RS codes based on multivariate interpolation. However, the provable performance of these codes coincided with the 1 − √R bound for Reed-Solomon codes. The key obstacle in improving this bound was the following: for the case when the messages are pairs (f(X), g(X)) of degree k polynomials, two algebraically independent relations were needed to identify both f(X) and g(X). The interpolation method could only provide one such relation in general (of the form Q(X, f(X), g(X)) = 0 for a trivariate polynomial Q(X, Y, Z)). This still left too much ambiguity in the possible values of (f(X), g(X)). (The approach in [84] was to find several interpolation polynomials, but there was no guarantee that they were not all algebraically dependent.)

⁴ The resulting code is in fact just a Reed-Solomon code where the evaluation points belong to the subfield F of the extension field of degree two over F.
Then, in [85], Parvaresh and Vardy put forth the ingenious idea of obtaining the extra algebraic relation essentially “for free” by enforcing it as an a priori condition satisfied at the encoder. Specifically, instead of letting the second polynomial g(X) be an independent degree k polynomial, their insight was to make it correlated with f(X) by a specific algebraic condition, such as g(X) = f(X)^d mod E(X) for some integer d and an irreducible polynomial E(X) of degree k+1. Then, once we have the interpolation polynomial Q(X, Y, Z), f(X) can be obtained as follows: reduce the coefficients of Q(X, Y, Z) modulo E(X) to get a polynomial T(Y, Z) with coefficients from F_q[X]/(E(X)), and then find the roots of the univariate polynomial T(Y, Y^d). This was the key idea in [85] to improve the 1 − √R decoding radius for rates less than 1/16. For rates R → 0, their decoding radius approached 1 − O(R log(1/R)).

This modification, compared to using independent polynomials, does not come for free, however. In particular, since one sends at least twice as much information as in the original RS code, there is no way to construct codes with rate more than 1/2 in the PV scheme. If we use s ≥ 2 correlated polynomials for the encoding, we incur a factor 1/s loss in the rate. This proves quite expensive, and as a result the improvements over RS codes offered by these codes are only manifest at very low rates.

The central idea behind our work is to avoid this rate loss by making the correlated polynomial g(X) essentially identical to the first (say g(X) = f(γX)). Then the evaluations of g(X) can be inferred as a simple cyclic shift of the evaluations of f(X), so intuitively there is no need to explicitly include those evaluations in the encoding.

Chapter 4

RESULTS VIA CODE CONCATENATION

4.1 Introduction

In Chapter 3, we presented efficient list-decoding algorithms for folded Reed-Solomon codes that can correct a 1 − R − ε fraction of errors with rate R (for any ε > 0).
One drawback of folded Reed-Solomon codes is that they are defined over alphabets whose size is polynomial in the block length of the code. This is an undesirable feature of the code, and we address this issue in this chapter.

First, we show how to convert folded Reed-Solomon codes to a related code that can still be list decoded up to a 1 − R − ε fraction of errors with rate R (for any ε > 0). However, unlike folded Reed-Solomon codes, these codes are defined over alphabets of size 2^{O(ε^{-4} log(1/ε))}. Recall that codes that can be list decoded up to a 1 − R − ε fraction of errors need alphabets of size 2^{Ω(1/ε)} (see Section 2.2.1).

Next, we will show how to use folded Reed-Solomon codes to obtain codes over fixed alphabets (for example, binary codes). We will present explicit linear codes over fixed alphabets that achieve tradeoffs between rate and fraction of errors that satisfy the so-called Zyablov and Blokh-Zyablov bounds (along with efficient list-decoding algorithms that achieve these tradeoffs). The codes list decodable up to the Blokh-Zyablov bound tradeoff are the best known to date for explicit codes over fixed alphabets. However, unlike Chapter 3, these results do not get close to the list-decoding capacity (see Figure 4.1). In particular, for binary codes, if a 1/2 − γ fraction of errors is targeted, our codes have rate Ω(γ³). By contrast, codes at list-decoding capacity would have rate Ω(γ²). Unfortunately (as has been mentioned before), the only codes that are known to achieve list-decoding capacity are random codes, for which no efficient list-decoding algorithms are known. Previous to our work, the best known explicit codes had rate Ω(γ⁴) [51] (these codes also had efficient list-decoding algorithms).

We choose to present the codes that are list decodable up to the Zyablov bound (even though the codes that are list decodable up to the Blokh-Zyablov bound have a better rate vs. list decodability tradeoff) for the following reasons: (i) the construction is much simpler, and these codes give the same asymptotic rate in the high error regime; and (ii) the worst case list sizes and the code construction time are asymptotically smaller.

All our codes are based on code concatenation (and a generalization called multilevel code concatenation). We next turn to an informal description of code concatenation.

Figure 4.1: Rate R of our binary codes plotted against the error-correction radius ρ of our algorithms, along with the Zyablov and Blokh-Zyablov bounds. The best possible trade-off, i.e., list-decoding capacity ρ = H^{-1}(1−R), is also plotted.

4.1.1 Code Concatenation and List Recovery

Concatenated codes were defined in the seminal thesis of Forney [40]. Concatenated codes are constructed from two different codes that are defined over alphabets of different sizes. Say we are interested in a code over [q] (in this chapter, we will always think of q ≥ 2 as being a fixed constant). Then the outer code C_out is defined over [Q], where Q = q^k for some positive integer k. The second code, called the inner code, is defined over [q] and is of dimension k (note that the message space of C_in and the alphabet of C_out have the same size). The concatenated code, denoted by C = C_out ∘ C_in, is defined as follows. Let the rate of C_out be R, and let the block lengths of C_out and C_in be N and n respectively. Define K = RN and r = k/n. The input to C is a vector m = (m_1, …, m_K) ∈ ([q]^k)^K. Let C_out(m) = (x_1, …, x_N). The codeword in C corresponding to m is defined as follows:

    C(m) = ( C_in(x_1), C_in(x_2), …, C_in(x_N) ).

It is easy to check that C has rate rR, dimension kK and block length nN. Notice that to construct a q-ary code C we use another q-ary code C_in. However, the nice thing about C_in is that it has small block length.
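The concatenation operation just described is purely mechanical and can be sketched in a few lines. In the sketch below (Python; all names are ours), the inner code is a toy random binary linear code given by a generator matrix, chosen for illustration only, not the carefully searched "good" inner code that the constructions in this chapter actually require.

```python
import random

q, k_in, n_in = 2, 3, 6        # inner code maps {0, ..., q^k_in - 1} -> [q]^n_in
random.seed(0)

# toy inner code: a random k_in x n_in binary generator matrix (illustration only)
G = [[random.randrange(q) for _ in range(n_in)] for _ in range(k_in)]

def inner_encode(sym):
    bits = [(sym >> j) & 1 for j in range(k_in)]     # outer symbol as k_in bits
    return tuple(sum(b * g for b, g in zip(bits, col)) % q
                 for col in zip(*G))                 # bits * G over GF(2)

def concat_encode(outer_codeword):
    # outer codeword: N symbols over [q^k_in]; output: N * n_in q-ary symbols
    out = []
    for sym in outer_codeword:
        out.extend(inner_encode(sym))
    return out

outer_cw = [5, 0, 7, 3]                              # a pretend C_out output, N = 4
cw = concat_encode(outer_cw)
assert len(cw) == len(outer_cw) * n_in               # block length nN
```

The overall rate is the product r·R, exactly as in the definition: each outer symbol of k_in bits is stretched to n_in bits by the inner code.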
In particular, since R and r are constants (and typically Q and N are polynomially related), n = O(log N). This implies that we can use exponential time (in n) to search for a “good” inner code. Further, one can use the brute-force algorithm to (list) decode C_in.

Table 4.1: Values of the rate at different decoding radii ρ for the list-decoding capacity (R_cap), the Zyablov bound (R_Z) and the Blokh-Zyablov bound (R_BZ) in the binary case. For larger ρ, the Zyablov and Blokh-Zyablov rates are 0 up to three decimal places, hence we have not shown them.

    ρ      R_cap    R_Z      R_BZ
    0.01   0.919    0.572    0.739
    0.02   0.858    0.452    0.624
    0.03   0.805    0.375    0.539
    0.05   0.713    0.273    0.415
    0.10   0.531    0.141    0.233
    0.15   0.390    0.076    0.132
    0.20   0.278    0.041    0.073
    0.25   0.188    0.020    0.037
    0.30   0.118    0.009    0.017
    0.35   0.065    0.002    0.006

Finally, we motivate why we are interested in list recovery. Consider the following natural decoding algorithm for the concatenated code C_out ∘ C_in. Given a received word in ([q]^n)^N, we divide it into N blocks from [q]^n. Then we use a decoding algorithm for C_in to get an intermediate received word to feed into a decoding algorithm for C_out. Now one can use unique decoding for C_in and list decoding for C_out. However, this loses information in the first step. Instead, one can use the brute-force list-decoding algorithm for C_in to get a sequence of lists (each of which is a subset of [Q]). Now we use a list-recovery algorithm for C_out to get the final list of codewords. The natural algorithm above is used to design codes over fixed alphabets that are list decodable up to the Zyablov bound in Section 4.3, and (along with expanders) to design codes that achieve list-decoding capacity but have much smaller alphabet size than folded Reed-Solomon codes in Section 4.2.
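The capacity and Zyablov columns of Table 4.1 can be reproduced numerically. The sketch below (our helper names; H^{-1} computed by bisection) evaluates 1 − H(ρ) and the Zyablov tradeoff max_r r(1 − ρ/H^{-1}(1−r)), i.e., the optimization over inner and outer rates behind Theorem 4.2; the Blokh-Zyablov column needs the multilevel concatenation machinery and is omitted here.

```python
from math import log2

def H(x):                      # binary entropy
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def H_inv(y):                  # inverse of H on [0, 1/2], by bisection
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return lo

def capacity_rate(rho):
    return 1 - H(rho)

def zyablov_rate(rho, steps=2000):
    # max over inner rate r of r * (1 - rho / H_inv(1 - r))
    best = 0.0
    for i in range(1, steps):
        r = i / steps
        delta = H_inv(1 - r)
        if delta > rho:
            best = max(best, r * (1 - rho / delta))
    return best

assert abs(capacity_rate(0.10) - 0.531) < 0.001   # matches Table 4.1
assert abs(zyablov_rate(0.10) - 0.141) < 0.005    # matches Table 4.1
```

The gap between the two columns at every radius is exactly the price paid for the concatenated structure.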
4.2 Capacity-Achieving Codes over Smaller Alphabets

Theorem 3.5 has two undesirable aspects: both the alphabet size and the worst-case list size output by the list-decoding algorithm are a polynomial of large degree in the block length. We now show that the alphabet size can be reduced to a constant that depends only on the distance to capacity.

Theorem 4.1. For every R, 0 < R < 1, and every ε > 0, there is a polynomial time constructible family of codes over an alphabet of size 2^{O(ε^{-4} log(1/ε))} that have rate at least R and which can be list decoded up to a fraction 1 − R − ε of errors in polynomial time.

Proof. The theorem is proved using the code construction scheme used by Guruswami and Indyk in [54] for linear time unique decodable codes with optimal rate, with different components appropriate for list decoding plugged in. We briefly describe the main ideas behind the construction and proof below.

The high level approach is to concatenate two codes C_out and C_in, and then redistribute the symbols of the resulting codeword using an expander graph. Assume that ε < (1−R)/6, and let δ = ε². (We will decode up to a fraction 1 − R − 5ε of errors; applying the construction with ε/5 in place of ε then gives the claimed bound.)

The outer code C_out will be a code of rate 1 − 2ε over an alphabet Σ_out that can be (ε, O(1/ε²))-list recovered in polynomial time (to recall definitions pertaining to list recovery, see Definition 2.4), as guaranteed by Theorem 3.6. That is, the rate of C_out will be close to 1, and it can be (ζ, ℓ)-list recovered for large ℓ and ζ → 0.

The inner code C_in will be a (1 − R − 3ε, O(1/ε²))-list decodable code with near-optimal rate, say rate at least R + 2ε. Such a code is guaranteed to exist over an alphabet of size (1/ε)^{O(1/ε)} using random coding arguments. A naive brute-force search for such a code, however, is too expensive, since we need a code with |Σ_out| codewords. Guruswami and Indyk [52], see also [49, Sec. 9.3], prove that there is a small (quasi-polynomial sized) sample space of pseudolinear codes in which most codes have the needed property.
Furthermore, they also present a deterministic polynomial time construction of such a code (using derandomization techniques); see [49, Sec. 9.3.3].

The concatenation of C_out and C_in gives a code C_concat of rate at least (1−2ε)(R+2ε) ≥ R over an alphabet Σ of size |Σ| = (1/ε)^{O(1/ε)}. Moreover, given a received word of the concatenated code, one can find all codewords that agree with the received word on a fraction R + 3ε of locations in at least a (1−ε) fraction of the inner blocks. Indeed, we can do this by running the natural list-decoding algorithm, call it A, for C_concat, which decodes each of the inner blocks to a radius of 1 − R − 3ε, returning up to ℓ = O(1/ε²) possibilities for each block, and then (ε, ℓ)-list recovers C_out.

The last component in this construction is a d = O(1/ε³)-regular bipartite expander graph, which is used to redistribute symbols of the concatenated code in a manner such that an overall agreement on a fraction R + 5ε of the redistributed symbols implies a fractional agreement of at least R + 3ε on most (specifically a fraction 1 − ε) of the inner blocks of the concatenated code. In other words, the expander redistributes symbols in a manner that “smoothens” the distribution of errors evenly among the various inner blocks (except for possibly an ε fraction of the blocks). This expander based redistribution incurs no loss in rate, but increases the alphabet size to (1/ε)^{O(ε^{-3})} = 2^{O(ε^{-4} log(1/ε))}.

We now discuss some details of how the expander is used. Suppose that the block length of the folded Reed-Solomon code C_out is N₁ and that of C_in is N₂. Let us assume that N₂ is a multiple of d, say N₂ = ad for an integer a (if this is not the case, we can make it so by padding at most d − 1 dummy symbols at a negligible loss in rate). Therefore codewords of C_in, and hence also of C_concat, can be thought of as being composed of blocks of d symbols each. Let N = N₁N₂/d, so that codewords of C_concat can be viewed as elements in (Σ^d)^N.
º%0 á Let °O i!f4y!f# be a 7 -regular bipartite graph with ~ vertices on each side (i.e., ·¸J·£õ·¸4a·£ ~ ), with the property that for every subset ÷ t4 of size at least i4~nXe~ , the number of vertices belonging to that have at most i4nãqe<`7 of their neighbors in ÷ is at most ÝÃ~ (for ÝuÊ ). It is a well-known fact (used also in [54]) that if ° is picked to be the double cover of a Ramanujan expander of degree 7%tU6C Ý , then ° will have such a property. We now define our final code 1 X z°i1¦ 8¡ {2Ñ& formally. The codewords in ºP0 1§á v¡ . Given a codeword Üs+{1 8¡ , its 1 X are in one-one correspondence with those of º%0 á º%0 á ~07 symbols (each belonging to & ) are placed on the ~07 edges of ° , with the 7 symbols in its Í ’th block (belonging to & , as defined above) being placed on the 7 edges incident 56 Expander graph º »>¼ °± ² « « ¯ ¨ Codeword in °± ² Codeword in ¶ H³ ´µ »8½ ¨ ©®­ ª «8¬ ¬ ¨ ·¹¸ © »v¾>¿ ° ±² ¯ © Codeword °À Á ²yÀ ÂÄà in Figure 4.2: The code 1 X used in the proof of Theorem 4.1. We start with a codeword È Æ^ËÁZ!¥¤¥¤¥¤!Ë in 1\b . Then every symbol is encoded by 19Î to form a codeword in 1 × × 9 (this intermediate codeword is marked by the dotted box). The symbols in the codeword for 1 × × are divided into chunks of 7 symbols and then redistributed along the edges of 9 an expander ° of degree 7 . In the figure, we use 7öp for clarity. Also the distribution of three symbols @ , ; and Ü (that form a symbol in the final codeword in 1 X ) is shown. on the Í ’th vertex of (in some fixed order). The codeword in 1 X corresponding to Ü has as its Í ’th symbol the collection of 7 symbols (in some fixed order) on the 7 edges incident on the Í ’th vertex of 4 . See Figure 4.2 for a pictorial view of the construction. 1 8¡ , and is thus at least 4 . Its alphabet Note that the rate of 1 X is identical to that Z Z 0 á We will now argue how 1 X can size is ·Ä&}· 5aG6X " LV " ¹»º¼ Ñ® , as ºPclaimed. be list decoded up to a fraction `GIF4QIXe of errors. 
Given a received word y ∈ (Σ^d)^N, the following is the natural algorithm to find all codewords of C* with agreement at least (R+5ε)N with y. Redistribute symbols backwards according to the expander to compute the received word y' for C_concat that would result in y. Then run the earlier-mentioned decoding algorithm A on y'.

We now briefly argue the correctness of this algorithm. Let c ∈ C* be a codeword with agreement at least (R+5ε)N with y. Let c' denote the codeword of C_concat that leads to c after symbol redistribution by G, and finally suppose c'' is the codeword of C_out that yields c' upon concatenation by C_in. By the expansion properties of G, it follows that all but a δ fraction of the d-long blocks of y' have agreement at least (R+4ε)d with the corresponding blocks of c'. By an averaging argument, this implies that at least a fraction 1 − √δ of the N₁ blocks of c' that correspond to the inner encodings of the N₁ symbols of c'' agree with at least a fraction (1−√δ)(R+4ε) ≥ (1−ε)(R+4ε) ≥ R + 3ε of the symbols of the corresponding block of y'. As argued earlier, this in turn implies that the decoding algorithm A for C_concat, when run on input y', will output a polynomial size list that will include c''.

4.3 Binary Codes List Decodable up to the Zyablov Bound

Concatenating the folded Reed-Solomon codes with suitable inner codes also gives us polynomial time constructible binary codes that can be efficiently list decoded up to the Zyablov bound, i.e., up to twice the radius achieved by the standard GMD decoding of concatenated codes [41]. The optimal list recoverability of the folded Reed-Solomon codes plays a crucial role in establishing such a result.

Theorem 4.2. For all 0 < R, r < 1 and all ε > 0, there is a polynomial time constructible family of binary linear codes of rate at least R·r which can be list decoded in polynomial time up to a fraction (1−R)H^{-1}(1−r) − ε of errors.

Proof.
Let γ > 0 be a small constant that will be fixed later. We will construct binary codes with the claimed property by concatenating two codes C₁ and C₂. For C₁, we will use a folded Reed-Solomon code over a field of characteristic 2 with block length n₁, rate at least R, and which can be (1−R−γ, ℓ)-list recovered in polynomial time for ℓ = O(1/γ). Let the alphabet size of C₁ be 2^M, where M is O(γ^{-2}(1−R)^{-1} log(1/γ) log n₁) (by Theorem 3.6, such a C₁ exists). For C₂, we will use a binary linear code of dimension M and rate at least r which is (ρ, ℓ)-list decodable for ρ = H^{-1}(1−r−γ). Such a code is known to exist via a random coding argument that employs the semi-random method [51]. Also, a greedy construction of such a code, obtained by constructing its M basis elements in turn, is presented in [51], and this process takes 2^{O(M)} time. We conclude that the necessary inner code can be constructed in n₁^{O(γ^{-2}(1−R)^{-1} log(1/γ))} time. The code C₁, being a folded Reed-Solomon code over a field of characteristic 2, is F₂-linear, and therefore when concatenated with a binary linear inner code such as C₂, results in a binary linear code. The rate of the concatenated code is at least R·r.

The decoding algorithm proceeds in a natural way. Given a received word, we break it up into blocks corresponding to the various inner encodings. Each of these blocks is list decoded up to a radius ρ, returning a set of at most ℓ possible candidates for each outer codeword symbol. The outer code is then (1−R−γ, ℓ)-list recovered using these sets, each of which has size at most ℓ, as input. To argue about the fraction of errors this algorithm corrects, we note that the algorithm fails to recover a codeword only if on more than a fraction 1−R−γ of the inner blocks the codeword differs from the received word on more than a fraction ρ of symbols. It follows that the algorithm correctly list decodes up to a radius (1−R−γ)ρ = (1−R−γ)H^{-1}(1−r−γ). If we pick an appropriate γ = Θ(ε), then by Lemma 2.4, H^{-1}(1−r−γ) ≥ H^{-1}(1−r) − ε/2 (and 1−R−γ ≥ 1−R−ε/2), which implies (1−R−γ)H^{-1}(1−r−γ) ≥ (1−R)H^{-1}(1−r) − ε, as desired.

Optimizing over the choice of the inner and outer code rates r and R in the above result, we can decode up to the Zyablov bound; see Figure 4.1. For an analytic expression, see (4.2) with s = 1.

Remark 4.1. In particular, decoding up to the Zyablov bound implies that we can correct a fraction 1/2 − ε of errors with rate Ω(ε³) for small ε > 0, which is better than the rate of Ω(ε³/log(1/ε)) achieved in [55]. However, our construction and decoding complexity are N^{O(log(1/ε))}, whereas these are at most f(ε)·N^c for an absolute constant c in [55]. Also, we bound the worst-case list size by N^{O(log(1/ε))}, while the list size needed in the construction in [55] is (1/ε)^{O(log log(1/ε))}.

4.4 Unique Decoding of a Random Ensemble of Binary Codes

We will digress a bit to talk about a consequence of (the proof of) Theorem 4.2. One of the biggest open questions in coding theory is to come up with explicit binary codes that lie on the Gilbert-Varshamov (or GV) bound. In particular, these are codes that achieve relative distance δ with rate 1 − H(δ). There exist ensembles of binary codes for which a randomly picked code lies on the GV bound with high probability. Coming up with an explicit construction of such a code, however, has turned out to be an elusive task.

Given this bleak state of affairs, some attention has been paid to the following problem: give a probabilistic construction of binary codes that meet the GV bound (with high probability), together with efficient (encoding and) decoding up to half the distance of the code. Zyablov and Pinsker [110] give such a construction for binary codes of rate about 0.02 with subexponential time decoding algorithms.
Guruswami and Indyk [53] give such a construction for binary linear codes of rates up to about $10^{-4}$, with polynomial time encoding and decoding algorithms. Next we briefly argue that Theorem 4.2 can be used to extend the result of [53] to work for rates up to about $0.02$. In other words, we get the rate achieved by the construction of [110] but (like [53]) with polynomial time encoding and decoding (up to half the GV bound).

We start with a brief overview of the construction of [53], which is based on code concatenation. The outer code is chosen to be a Reed-Solomon code (of, say, length $N$ and rate $R$), while there are $N$ linear binary inner codes of rate $r$ (recall that in the "usual" code concatenation only one inner code is used) that are chosen uniformly (and independently) at random. A result of Thommesen [102] states that with high probability such a code lies on the GV bound, provided the rates of the codes satisfy $R \leq \theta(r)/r$, where $\theta(r) = 1 - H(1 - 2^{r-1})$. Guruswami and Indyk then give list-decoding algorithms for such codes such that for overall rates $rR$ up to about $10^{-4}$, the fraction of errors they can correct is at least $\frac{1}{2} H^{-1}(1 - rR)$ (that is, more than half the relative distance on the GV bound), while also satisfying the constraint in Thommesen's result.

Given Theorem 4.2, here is the natural way to extend the result of [53]. We pick the outer code of rate $R$ to be a folded Reed-Solomon code (with the list recoverability properties required in Theorem 4.2), and pick $N$ independent binary linear codes of rate $r$ as the inner codes. It is not hard to check that the proof of Thommesen also works when the outer code is folded Reed-Solomon. That is, the construction just mentioned lies on the GV bound with high probability.
It is also easy to check that the proof of Theorem 4.2 works when all the inner codes are different (the list decoding of the inner code in Theorem 4.2 is done by brute force, which of course one can still do when all the $N$ inner codes are different). Thus, if $rR \leq \theta(r)$, we can list decode up to a $(1 - R)H^{-1}(1 - r)$ fraction of errors and at the same time have the property that, with high probability, the constructed code lies on the GV bound. Thus, all we now need to do is to check what is the maximum rate $rR$ one can achieve while at the same time satisfying $rR \leq \theta(r)$ and $(1 - R)H^{-1}(1 - r) \geq \frac{1}{2} H^{-1}(1 - rR)$. This rate turns out to be around $0.02$ (see Figure 4.3). Thus, we have argued the following.

Theorem 4.3. There is a probabilistic polynomial time procedure to construct codes whose rate vs. distance tradeoff meets the Gilbert-Varshamov bound with high probability for all rates up to $0.02$. Furthermore, these codes can be decoded in polynomial time up to half the relative distance.

One might hope that this method, along with ideas of multilevel concatenated codes (about which we will talk next), can be used to push the overall rate significantly up from the $0.02$ that we achieve here. However, the following simple argument shows that one cannot go beyond a rate of $0.04$. If we are targeting list decoding up to a $\rho$ fraction of errors (and use code concatenation), then the inner rate $r$ must be at most $1 - H(\rho)$ (see for example (4.2)). Now, by the Thommesen condition, the overall rate is at most $\theta(r)$. It is easy to check that $\theta$ is an increasing function. Thus, the maximum overall rate that we can achieve is $\theta(1 - H(\rho))$ — this is the curve titled "Limit of the method" in Figure 4.3. One can see from Figure 4.3 that the maximum rate for which this curve still beats half the GV bound is at most $0.04$.
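The optimization behind Theorem 4.3 can be sketched as a grid search over the inner and outer rates. This is a rough numerical approximation under our own helper names and a coarse grid, so it only approximates the constant quoted above.

```python
import math

def H(x):
    """Binary entropy function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def H_inv(y):
    """Inverse of H on [0, 1/2] by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def theta(r):
    """Thommesen's function theta(r) = 1 - H(1 - 2^(r-1))."""
    return 1 - H(1 - 2 ** (r - 1))

def max_rate_beating_half_gv(steps=100):
    """Largest overall rate r*R on the grid such that r*R <= theta(r)
    (Thommesen condition) and the list-decoding radius (1-R)*H_inv(1-r)
    is at least half the GV distance 0.5*H_inv(1 - r*R)."""
    best = 0.0
    for i in range(1, steps):
        r = i / steps
        for j in range(1, steps):
            R = j / steps
            R0 = r * R
            if R0 <= theta(r) and (1 - R) * H_inv(1 - r) >= 0.5 * H_inv(1 - R0):
                best = max(best, R0)
    return best
```

Refining the grid pushes the answer toward the constant stated in Theorem 4.3; the search also confirms that $\theta$ is increasing, which drives the limiting argument above.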
4.5 List Decoding up to the Blokh-Zyablov Bound

We now present linear codes over any fixed alphabet that can be constructed in polynomial time and can be efficiently list decoded up to the so-called Blokh-Zyablov bound (Figure 4.1). This achieves a sizable improvement over the tradeoff achieved by the codes from Section 4.3 (see Figure 4.1 and Table 4.1).

[Figure 4.3 plots $\rho$ (fraction of errors) against $R_0$ (overall rate) for the curves "Half the GV bound," "Truncated Zyablov bound" and "Limit of the method."]

Figure 4.3: Tradeoffs for decoding a certain random ensemble of concatenated codes. "Half the GV bound" is the curve $\frac{1}{2} H^{-1}(1 - R_0)$, while "Truncated Zyablov bound" is the limit up to which we can list decode the concatenated codes while still satisfying the Thommesen condition (that is, for inner and outer rates $r$ and $R$, $R_0 = rR \leq \theta(r)$). "Limit of the method" is the best tradeoff one can hope for using list decoding of code concatenation along with the Thommesen result.

Our codes are constructed via multilevel concatenated codes. We will provide a formal definition later on — we just sketch the basic idea here. For an integer $s \geq 1$, a multilevel concatenated code $C$ over $\mathbb{F}_q$ is obtained by combining $s$ "outer" codes $C_{out}^0, C_{out}^1, \ldots, C_{out}^{s-1}$, of the same block length, say $N$, over large alphabets of size say $q^{a_0}, q^{a_1}, \ldots, q^{a_{s-1}}$, respectively, with a suitable "inner" code. The inner code is of dimension $a_0 + a_1 + \cdots + a_{s-1}$. Given messages $m^0, m^1, \ldots, m^{s-1}$ for the $s$ outer codes, the encoding as per the multilevel generalized concatenation codes proceeds by first encoding each $m^j$ as per $C_{out}^j$. Then for every $1 \leq i \leq N$, the collection of the $i$th symbols of $C_{out}^j(m^j)$ for $0 \leq j \leq s-1$, which can be viewed as a string over $\mathbb{F}_q$ of length $a_0 + a_1 + \cdots + a_{s-1}$, is encoded by the inner code. For $s = 1$ this reduces to the usual definition of code concatenation.
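The stacking-and-column-encoding step just described can be made concrete on a toy example over $\mathbb{F}_2$; the specific generator matrices below are arbitrary choices for illustration, not codes from the construction.

```python
def encode_outer(msg, G):
    """Encode msg (list of bits) with generator matrix G (list of rows) over GF(2)."""
    n = len(G[0])
    return [sum(msg[k] * G[k][i] for k in range(len(G))) % 2 for i in range(n)]

def multilevel_encode(msgs, outer_gens, G_in):
    """Multilevel concatenation: stack the s outer codewords into an s x N
    matrix, then encode each length-s column with the inner generator G_in."""
    rows = [encode_outer(m, G) for m, G in zip(msgs, outer_gens)]
    N = len(rows[0])
    codeword_cols = []
    for i in range(N):
        col = [row[i] for row in rows]      # i-th symbol of every outer codeword
        codeword_cols.append(encode_outer(col, G_in))
    return codeword_cols

# s = 2 levels: a repetition code and a rate-1/2 code, inner code of dimension 2.
G0 = [[1, 1, 1, 1]]                  # outer code 0 (rate 1/4)
G1 = [[1, 0, 1, 0], [0, 1, 0, 1]]    # outer code 1 (rate 1/2)
Gin = [[1, 1, 0], [0, 1, 1]]         # inner code mapping F_2^2 -> F_2^3
cw = multilevel_encode([[1], [1, 0]], [G0, G1], Gin)
```

With a single level ($s = 1$) the same routine degenerates to classical concatenation: one outer codeword, each symbol encoded separately by the inner code.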
We present a list-decoding algorithm for $C$, given list-recovery algorithms for the outer codes and list-decoding algorithms for the inner code and some of its subcodes. What makes this part more interesting than the usual code concatenation (like that in Section 4.3) is that the inner code, in addition to having good list decodability, also needs to have good list decodability for certain subcodes. Specifically, the subcodes of dimension $a_j + a_{j+1} + \cdots + a_{s-1}$ of the inner code, obtained by arbitrarily fixing the first $a_0 + \cdots + a_{j-1}$ symbols of the message, must have better list-decodability properties for increasing $j$ (which is intuitively possible since they have lower rate). In turn, this allows the outer codes $C_{out}^j$ to have rates increasing with $j$, leading to an overall improvement in the rate for a certain list-decoding radius.

To make effective use of the above approach, we also prove, via an application of the probabilistic method, that a random linear code over $\mathbb{F}_q$ has the required stronger condition on list decodability. By applying the method of conditional expectations ([2]), we can construct such a code deterministically in time singly exponential in the block length of the code (which is polynomial if the inner code encodes messages of length $O(\log N)$). Note that constructing such an inner code, given the existence of such codes, is easy in quasi-polynomial time by trying all possible generator matrices. The lower time complexity is essential for constructing the final code $C$ in polynomial time.

4.5.1 Multilevel Concatenated Codes

We will be working with multilevel concatenation coding schemes [33]. We start this section with the definition of multilevel concatenated codes. As the name suggests, these are generalizations of concatenated codes. Recall that for a concatenated code, we start with a code $C_{out}$ over a large alphabet (called the outer code).
Then we need a code $C_{in}$ that maps all symbols of the larger alphabet to strings over a smaller alphabet (called the inner code). The encoding for the concatenated code (denoted by $C_{out} \circ C_{in}$) is done as follows. We think of the message as being a string over the large alphabet and then encode it using $C_{out}$. Then we use $C_{in}$ to encode each of the symbols in the codeword of $C_{out}$ to get our codeword (in $C_{out} \circ C_{in}$) over the smaller alphabet.

Multilevel code concatenation generalizes the usual code concatenation in the following manner. Instead of there being one outer code, there are multiple outer codes. In particular, we "stack" codewords from these multiple outer codes and construct a matrix. The inner code then acts on the columns of this intermediate matrix. We now formally define multilevel concatenated codes.

There are $s \geq 1$ outer codes, denoted by $C_{out}^0, C_{out}^1, \ldots, C_{out}^{s-1}$. For every $0 \leq j \leq s-1$, $C_{out}^j$ is a code of block length $N$ and rate $R_j$, defined over a field $\mathbb{F}_{Q_j}$. The inner code $C_{in}$ is a code of block length $n$ and rate $r$ that maps tuples from $\mathbb{F}_{Q_0} \times \mathbb{F}_{Q_1} \times \cdots \times \mathbb{F}_{Q_{s-1}}$ to symbols in $\mathbb{F}_q^n$. In other words,

$C_{in} : \mathbb{F}_{Q_0} \times \mathbb{F}_{Q_1} \times \cdots \times \mathbb{F}_{Q_{s-1}} \to \mathbb{F}_q^n.$

The multilevel concatenated code, denoted by $(C_{out}^0 \times C_{out}^1 \times \cdots \times C_{out}^{s-1}) \circ C_{in}$, is a map of the following form:

$(C_{out}^0 \times C_{out}^1 \times \cdots \times C_{out}^{s-1}) \circ C_{in} : \mathbb{F}_{Q_0}^{R_0 N} \times \mathbb{F}_{Q_1}^{R_1 N} \times \cdots \times \mathbb{F}_{Q_{s-1}}^{R_{s-1} N} \to (\mathbb{F}_q^n)^N.$

We now describe the encoding scheme. Given a message $(m^0, m^1, \ldots, m^{s-1}) \in \mathbb{F}_{Q_0}^{R_0 N} \times \mathbb{F}_{Q_1}^{R_1 N} \times \cdots \times \mathbb{F}_{Q_{s-1}}^{R_{s-1} N}$, we first construct an $s \times N$ matrix $M$ whose $j$th row is the codeword $C_{out}^j(m^j)$. Note that every column of $M$ is an element of the set $\mathbb{F}_{Q_0} \times \mathbb{F}_{Q_1} \times \cdots \times \mathbb{F}_{Q_{s-1}}$. Let the $i$th column (for $1 \leq i \leq N$) be denoted by $M_i$. The codeword corresponding to the multilevel concatenated code $C$ is defined as follows:

$((C_{out}^0 \times C_{out}^1 \times \cdots \times C_{out}^{s-1}) \circ C_{in})(m^0, m^1, \ldots, m^{s-1}) = (C_{in}(M_1), C_{in}(M_2), \ldots, C_{in}(M_N)).$

(The codeword can naturally be thought of as an $n \times N$ matrix, whose $i$th column corresponds to the inner codeword encoding the $i$th symbols of the $s$ outer codewords.)

For the rest of the chapter, we will only consider outer codes over the same alphabet, that is, $Q_0 = Q_1 = \cdots = Q_{s-1} = Q$. Further, $Q = q^a$ for some integer $a \geq 1$. Note that if $C_{out}^0, \ldots, C_{out}^{s-1}$ and $C_{in}$ are all $\mathbb{F}_q$-linear, then so is $(C_{out}^0 \times C_{out}^1 \times \cdots \times C_{out}^{s-1}) \circ C_{in}$.

The gain from using multilevel concatenated codes comes from looking at the inner code $C_{in}$ along with its subcodes. For the rest of the section, we will consider the case when $C_{in}$ is linear (though the ideas can easily be generalized to general codes). Let $G \in \mathbb{F}_q^{as \times n}$ be the generator matrix for $C_{in}$, and let $r_0 = as/n$ denote the rate of $C_{in}$. For $0 \leq j \leq s-1$, define $r_j = r_0(1 - j/s)$, and let $G_j$ denote the $r_j n \times n$ submatrix of $G$ containing the last $r_j n$ rows of $G$. Denote the code generated by $G_j$ by $C_{in}^j$; the rate of $C_{in}^j$ is $r_j$. For our purposes we will actually look at the subcodes of $C_{in}$ where one fixes the first $aj$ message symbols, for $0 \leq j \leq s-1$. Note that for every $j$ these are just cosets of $C_{in}^j$. We will be looking at codes $C_{in}$ which, in addition to having good list-decoding properties as a "whole," also have good list-decoding properties for each of their subcodes $C_{in}^j$.

The multilevel concatenated code $C = (C_{out}^0 \times \cdots \times C_{out}^{s-1}) \circ C_{in}$ has rate $R(C)$ that satisfies

$R(C) = \frac{r_0}{s} \sum_{j=0}^{s-1} R_j.$ (4.1)

The Blokh-Zyablov bound is the trade-off between rate and relative distance obtained when the outer codes meet the Singleton bound (i.e., $C_{out}^j$ has relative distance $1 - R_j$), and the various subcodes $C_{in}^j$ of the inner code, including the whole inner code $C_{in} = C_{in}^0$, lie on the Gilbert-Varshamov bound (i.e., $C_{in}^j$ has relative distance $\delta_j \geq H_q^{-1}(1 - r_j)$). The multilevel concatenated code then has relative distance at least $\min_{0 \leq j \leq s-1} (1 - R_j) H_q^{-1}(1 - r_j)$.
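A numerical sketch of this trade-off: for a target relative distance, put the outer rates on the Singleton bound, put the inner subcodes of rates $r_j = r(1 - j/s)$ on the GV bound, and optimize the common inner rate. The helper names are ours, the binary alphabet is assumed, and $H^{-1}$ is computed by bisection.

```python
import math

def H(x):
    """Binary entropy function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def H_inv(y):
    """Inverse of H on [0, 1/2] by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def blokh_zyablov_rate(delta, s, steps=400):
    """Best rate of an s-level binary multilevel concatenated code with
    relative distance delta: outer codes on the Singleton bound, inner
    subcodes of rates r_j = r(1 - j/s) on the GV bound."""
    best = 0.0
    for i in range(1, steps):
        r = i / steps * (1 - H(delta))
        total = 0.0
        for j in range(s):
            rj = r * (1 - j / s)
            d_in = H_inv(1 - rj)                 # GV distance of the j-th inner subcode
            total += max(0.0, 1 - delta / d_in)  # outer rate forced by distance >= delta
        best = max(best, (r / s) * total)
    return best
```

For a fixed distance, increasing the number of levels $s$ strictly improves the achievable rate, while the rate always stays below the GV bound $1 - H(\delta)$.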
Expressing the rate in terms of distance, the Blokh-Zyablov bound says that there exist multilevel concatenated codes $C$ with relative distance at least $\delta$ and rate

$R_{BZ}^{(s)}(C) = \max_{0 < r \leq 1 - H_q(\delta)} \; \frac{r}{s} \sum_{j=0}^{s-1} \left( 1 - \frac{\delta}{H_q^{-1}(1 - r + jr/s)} \right).$ (4.2)

As $s$ increases, the trade-off approaches the integral form

$R_{BZ}(C) = 1 - H_q(\delta) - \delta \int_0^{1 - H_q(\delta)} \frac{dx}{H_q^{-1}(1 - x)}.$ (4.3)

The convergence of $R_{BZ}^{(s)}(C)$ to $R_{BZ}(C)$ happens quite quickly, even for small values of $s$.

Nested List Decoding

We will need to work with the following generalization of list decoding. The definition looks more complicated than it really is.

Definition 4.1 (Nested linear list-decodable code). Given a linear code $C$ in terms of some generator matrix $G \in \mathbb{F}_q^{m \times n}$, an integer $s$ that divides $m$, a vector $\mathbf{L} = (L_0, L_1, \ldots, L_{s-1})$ of integers, a vector $\boldsymbol{\rho} = (\rho_0, \rho_1, \ldots, \rho_{s-1})$ of reals with $0 \leq \rho_j \leq 1$, and a vector $\mathbf{r} = (r_0, \ldots, r_{s-1})$ of reals where $r_0 = m/n$ and $0 < r_{s-1} < \cdots < r_1 < r_0$, the code $C$ is called $(\mathbf{r}, \boldsymbol{\rho}, \mathbf{L})$-nested list decodable if the following holds: for every $0 \leq j \leq s-1$, $C^j$ is a rate $r_j$ code that is $(\rho_j, L_j)$-list decodable, where $C^j$ is the subcode of $C$ generated by the last $r_j n$ rows of the generator matrix $G$.

4.5.2 Linear Codes with Good Nested List Decodability

In this section, we will prove the following result concerning the existence (and constructibility) of linear codes over any fixed alphabet with good nested list decodability.

Theorem 4.4. For any integer $s \geq 1$, reals $0 < r_{s-1} < r_{s-2} < \cdots < r_1 < r_0 < 1$ and $\epsilon > 0$, let $\rho_j = H_q^{-1}(1 - r_j - 2\epsilon)$ for every $0 \leq j \leq s-1$. Let $\mathbf{r} = (r_0, \ldots, r_{s-1})$, $\boldsymbol{\rho} = (\rho_0, \rho_1, \ldots, \rho_{s-1})$ and $\mathbf{L} = (L_0, L_1, \ldots, L_{s-1})$, where $L_j = q^{O(1/\epsilon)}$. For large enough $n$, there exists a linear code (over the fixed alphabet $\mathbb{F}_q$) that is $(\mathbf{r}, \boldsymbol{\rho}, \mathbf{L})$-nested list decodable. Further, such a code can be constructed in time $q^{O(n)}$.

Proof.
We will show the existence of the required codes via a simple application of the probabilistic method (in fact, we will show that a random linear code has the required properties with high probability). We will then use the method of conditional expectations ([2]) to derandomize the construction with the claimed time complexity.

Define $n_j = r_j n$ for every $0 \leq j \leq s-1$. We will pick a random $n_0 \times n$ matrix $G$ with entries picked independently and uniformly from $\mathbb{F}_q$. We will show that the linear code $C$ generated by $G$ has good nested list decodability with high probability. Let $C^j$, for $0 \leq j \leq s-1$, be the code generated by the "bottom" $n_j$ rows of $G$. Recall that we have to show that with high probability $C^j$ is $(\rho_j, L_j)$-list decodable for every $0 \leq j \leq s-1$ ($C^j$ obviously has rate $r_j$). Finally, for integers $J, t \geq 1$ and a prime power $q$, let $\mathrm{Ind}(q, t, J)$ denote the collection of subsets $\{x_1, x_2, \ldots, x_J\} \subseteq \mathbb{F}_q^t$ such that the vectors $x_1, \ldots, x_J$ are linearly independent over $\mathbb{F}_q$. We recall the following two straightforward facts: (i) given any $L + 1$ distinct vectors from $\mathbb{F}_q^t$, at least $\lceil \log_q(L+1) \rceil$ of them are linearly independent; (ii) any set of linearly independent vectors in $\mathbb{F}_q^t$ is mapped to a set of independent, uniformly random vectors in $\mathbb{F}_q^n$ by a random $t \times n$ matrix over $\mathbb{F}_q$. The first claim is obvious. For the second claim, first note that for any $x \in \mathbb{F}_q^t$ and a random $t \times n$ matrix $A$ (where each entry is chosen uniformly and independently at random from $\mathbb{F}_q$), the values at the different positions of $x \cdot A$ are independent. Further, the value at position $1 \leq i \leq n$ is given by $x \cdot A_i$, where $A_i$ is the $i$th column of $A$. Now for a fixed nonzero $x$, $x \cdot A_i$ takes values in $\mathbb{F}_q$ uniformly at random (note that $A_i$ is a random vector from $\mathbb{F}_q^t$). Finally, linearly independent vectors $x_1, \ldots, x_J$ can be mapped by a suitable invertible linear map to the standard basis vectors $e_1, \ldots, e_J$, and the values $e_1 \cdot A_i, \ldots, e_J \cdot A_i$ are obviously independent.
We now move on to the proof of existence of linear codes with good nested list decodability. We will do the proof in a manner that will facilitate the derandomization of the construction. Define $J = \lceil \log_q(L+1) \rceil$, where $L = \max_j L_j$. For any vector $y \in \mathbb{F}_q^n$, integer $0 \leq j \leq s-1$, subset $T = \{x_1, \ldots, x_J\} \in \mathrm{Ind}(q, n_j, J)$ and any collection $F$ of subsets $S_1, S_2, \ldots, S_J \subseteq \{1, \ldots, n\}$, each of size at most $\rho_j n$, define an indicator variable $X(j, y, T, F)$ in the following manner: $X(j, y, T, F) = 1$ if and only if for every $1 \leq i \leq J$, $C^j(x_i)$ differs from $y$ in exactly the set $S_i$. Note that if for some $0 \leq j \leq s-1$ there are $L_j + 1$ codewords in $C^j$, all of which differ from some received word $y$ in at most $\rho_j n$ places, then this set of codewords is a "counter-example" showing that $C$ is not $(\mathbf{r}, \boldsymbol{\rho}, \mathbf{L})$-nested list decodable. Since the $L_j + 1$ codewords correspond to some set $T$ of $J$ linearly independent messages (by fact (i)), the counter-example implies that $X(j, y, T, F) = 1$ for some collection of subsets $F$. In other words, the indicator variables capture the set of bad events we would like to avoid. Finally, define the sum of all the indicator variables:

$X_C = \sum_{j=0}^{s-1} \sum_{y \in \mathbb{F}_q^n} \sum_{T \in \mathrm{Ind}(q, n_j, J)} \sum_{F} X(j, y, T, F).$

Note that if $X_C = 0$, then $C$ is $(\mathbf{r}, \boldsymbol{\rho}, \mathbf{L})$-nested list decodable as required. Thus, we can prove the existence of such a $C$ if we can show that $\mathbb{E}[X_C] < 1$. By linearity of expectation, we have

$\mathbb{E}[X_C] = \sum_{j=0}^{s-1} \sum_{y} \sum_{T} \sum_{F} \mathbb{E}[X(j, y, T, F)].$ (4.4)

Fix some arbitrary $j$, $y$, $T = \{x_1, \ldots, x_J\}$ and $F = \{S_1, \ldots, S_J\}$ (in their corresponding domains). Then we have

$\mathbb{E}[X(j, y, T, F)] = \Pr[X(j, y, T, F) = 1] = \prod_{i=1}^{J} \Pr[C^j(x_i) \text{ differs from } y \text{ in exactly the positions in } S_i]$ (4.5)

$= \prod_{i=1}^{J} \frac{(q-1)^{|S_i|}}{q^n},$ (4.6)

where the second and third equalities follow from the definition of the indicator variable, the fact that the vectors in $T$ are linearly independent, and the fact that a random matrix maps linearly independent vectors to independent, uniformly random vectors in $\mathbb{F}_q^n$ (fact (ii)). Using (4.6) in (4.4), we get

$\mathbb{E}[X_C] \leq \sum_{j=0}^{s-1} q^n \cdot q^{J n_j} \left( \sum_{w=0}^{\lfloor \rho_j n \rfloor} \binom{n}{w} (q-1)^w q^{-n} \right)^{J} \leq \sum_{j=0}^{s-1} q^{n} \cdot q^{J r_j n} \cdot q^{-Jn(1 - H_q(\rho_j))} = \sum_{j=0}^{s-1} q^{n(1 - 2\epsilon J)} \leq s \, q^{-\epsilon J n}.$ (4.7)

The first inequality follows by upper bounding the number of choices of $y$ by $q^n$ and the number of sets of $J$ linearly independent vectors in $\mathbb{F}_q^{n_j}$ by $q^{J n_j}$; the second inequality follows from Proposition 2.1 (the volume bound $\sum_{w \leq \rho n} \binom{n}{w}(q-1)^w \leq q^{H_q(\rho) n}$) together with $n_j = r_j n$; the equality uses $H_q(\rho_j) = 1 - r_j - 2\epsilon$; and the final inequality follows from $\epsilon J \geq 1$, which holds by the choice $J = \lceil \log_q(L+1) \rceil$ and $L_j = q^{O(1/\epsilon)}$. Thus, (4.7) shows that a random linear code $C$ is (in fact, with high probability) $(\mathbf{r}, \boldsymbol{\rho}, \mathbf{L})$-nested list decodable. This could also have been proved using a simpler argument; the advantage of the argument above is that we can now apply the method of conditional expectations to derandomize the proof.

The algorithm to deterministically generate a linear code $C$ that is $(\mathbf{r}, \boldsymbol{\rho}, \mathbf{L})$-nested list decodable is as follows. The algorithm consists of $n$ steps. At step $1 \leq i \leq n$, we choose the $i$th column of the generator matrix to be the value $v^i \in \mathbb{F}_q^{n_0}$ that minimizes the conditional expectation $\mathbb{E}[X_C \mid G_1 = v^1, \ldots, G_{i-1} = v^{i-1}, G_i = v^i]$, where $G_i$ denotes the $i$th column of $G$ and $v^1, \ldots, v^{i-1}$ are the column vectors chosen in the previous $i-1$ steps. This algorithm works provided that, for any $1 \leq i \leq n$ and vectors $v^1, \ldots, v^i$, we can exactly compute $\mathbb{E}[X_C \mid G_1 = v^1, \ldots, G_i = v^i]$.
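The column-by-column greedy choice driven by a conditional expectation is the classical method of conditional expectations. A self-contained toy illustration (derandomizing a uniformly random graph cut, purely for illustration and unrelated to the code construction) runs the same loop: fix one decision at a time while never letting the conditional expectation of the objective drop.

```python
def derandomized_cut(num_vertices, edges):
    """Method of conditional expectations for MAX-CUT.

    A uniformly random side assignment cuts each edge with probability 1/2,
    so E[cut] = len(edges) / 2.  Placing vertices one at a time on the side
    that cuts the most edges to already-placed neighbours keeps the
    conditional expectation from decreasing, so the final deterministic cut
    has size >= len(edges) / 2.
    """
    side = {}
    for v in range(num_vertices):
        # Only edges to already-placed neighbours matter: edges with an
        # unplaced endpoint contribute 1/2 to the conditional expectation
        # regardless of where v goes.
        gain0 = sum(1 for (a, b) in edges
                    if (a == v and side.get(b) == 1) or (b == v and side.get(a) == 1))
        gain1 = sum(1 for (a, b) in edges
                    if (a == v and side.get(b) == 0) or (b == v and side.get(a) == 0))
        side[v] = 0 if gain0 >= gain1 else 1
    cut = sum(1 for (a, b) in edges if side[a] != side[b])
    return side, cut
```

The code construction above is the same scheme with "side of a vertex" replaced by "value of a generator-matrix column" and "expected cut size" replaced by the expected number of bad events $\mathbb{E}[X_C]$.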
Indeed, from (4.4), the quantity $\mathbb{E}[X_C \mid G_1 = v^1, \ldots, G_i = v^i]$ equals

$\sum_{j=0}^{s-1} \sum_{y} \sum_{T} \sum_{F} \mathbb{E}[X(j, y, T, F) \mid G_1 = v^1, \ldots, G_i = v^i].$

Thus, we would be done if we can compute, for every value of $j$, $y$, $T = \{x_1, \ldots, x_J\}$ and $F = \{S_1, \ldots, S_J\}$, the conditional expectation $\mathbb{E}[X(j, y, T, F) \mid G_1 = v^1, \ldots, G_i = v^i]$. Note that fixing the first $i$ columns of $G$ fixes the values of all codewords in the first $i$ positions. Thus, the indicator variable is $0$ (in other words, the conditional expectation we need to compute is $0$) if for some message the corresponding codeword does not disagree with $y$ in the first $i$ positions exactly as dictated by $F$. More formally, $X(j, y, T, F) = 0$ if the following holds for some $1 \leq i' \leq i$ and $1 \leq \ell \leq J$: $x_\ell \cdot G_{i'} \neq y_{i'}$ if $i' \notin S_\ell$, and $x_\ell \cdot G_{i'} = y_{i'}$ otherwise. However, if none of these conditions holds, then by an argument similar to the one used to obtain (4.6), one can show that

$\mathbb{E}[X(j, y, T, F) \mid G_1 = v^1, \ldots, G_i = v^i] = \prod_{\ell=1}^{J} \frac{(q-1)^{|S'_\ell|}}{q^{n-i}},$

where $S'_\ell = S_\ell \setminus \{1, 2, \ldots, i\}$ for every $1 \leq \ell \leq J$.

To complete the proof, we need to estimate the time complexity of the above algorithm. There are $n$ steps, and at every step $i$ the algorithm has to consider $q^{n_0} \leq q^n$ choices for $v^i$. For every choice of $v^i$, the algorithm has to compute the conditional expectations of the indicator variables for all possible values of $(j, y, T, F)$. It is easy to check that there are at most $s \cdot q^n \cdot q^{Jn} \cdot 2^{Jn} = s \, q^{O(Jn)}$ such possibilities. Finally, the computation of the conditional expectation of a fixed indicator variable takes time polynomial in $n$. Thus, in all, the total time taken is $q^{O(Jn)} = q^{O(n)}$ (recall that $J = O(1/\epsilon)$ is independent of $n$), as required.

4.5.3 List Decoding Multilevel Concatenated Codes

In this section, we will see how one can list decode multilevel concatenated codes, provided the outer codes have good list recoverability and the inner code has good nested list decodability.
We have the following result, which generalizes Theorem 4.2 for (usual) concatenated codes (the case $s = 1$).

Theorem 4.5. Let $s \geq 1$ and $\ell \geq 1$ be integers. Let $0 < R_0 < R_1 < \cdots < R_{s-1} < 1$ and $0 < r_0 < 1$ be rationals, let $0 < \gamma_0, \ldots, \gamma_{s-1} < 1$ and $0 < \rho_0, \ldots, \rho_{s-1} < 1$ be reals, and let $L \geq 1$ be an integer. Let $q$ be a prime power and let $Q = q^a$ for some integer $a \geq 1$. Further, let $C_{out}^j$ (for $0 \leq j \leq s-1$) be an $\mathbb{F}_q$-linear code over $\mathbb{F}_Q$ of rate $R_j$ and block length $N$ that is $(\gamma_j, \ell, L)$-list recoverable. Finally, let $C_{in}$ be a linear $(\mathbf{r}, \boldsymbol{\rho}, \boldsymbol{\ell})$-nested list decodable code over $\mathbb{F}_q$ of rate $r_0$ and block length $n = as/r_0$, where $\mathbf{r} = (r_0, \ldots, r_{s-1})$ with $r_j = (1 - j/s) r_0$, $\boldsymbol{\rho} = (\rho_0, \ldots, \rho_{s-1})$ and $\boldsymbol{\ell} = (\ell, \ell, \ldots, \ell)$. Then $C = (C_{out}^0 \times \cdots \times C_{out}^{s-1}) \circ C_{in}$ is a linear $(\min_j \{\gamma_j \rho_j\}, L^s)$-list decodable code. Further, if the outer code $C_{out}^j$ can be list recovered in time $T_j(N)$ and the inner code $C_{in}^j$ can be list decoded in time $t_j(n)$ (for the $j$th level), then $C$ can be list decoded in time $O\big( \sum_{j=0}^{s-1} L^j (T_j(N) + N \cdot t_j(n)) \big)$.

Proof. Given list-recovery algorithms for the codes $C_{out}^j$ and list-decoding algorithms for $C_{in}$ and its subcodes $C_{in}^j$, we will design a list-decoding algorithm for $C$. Recall that the received word is an $n \times N$ matrix over $\mathbb{F}_q$. Each consecutive "chunk" of $n/s$ rows should be decoded to a codeword of one outer code. The details follow.

Before we describe the algorithm, we will need to fix some notation. Define $\delta = \min_j \gamma_j \rho_j$. Let $Y \in (\mathbb{F}_q^n)^N$ be the received word, which we will think of as an $n \times N$ matrix over $\mathbb{F}_q$ (note that $s$ divides $n$). For any $n \times N$ matrix $M$ and for any $1 \leq i \leq N$, let $M_i \in \mathbb{F}_q^n$ denote the $i$th column of the matrix $M$. Finally, for every $0 \leq j \leq s-1$, let $C_{in}^j$ denote the subcode of $C_{in}$ generated by all but the first $aj$ rows of the generator matrix of $C_{in}$.

We are now ready to describe our algorithm. Recall that the algorithm needs to output all codewords in $C$ that differ from $Y$ in at most a $\delta$ fraction of positions. For ease of exposition, we will consider an algorithm that outputs matrices from $C_{out}^0 \times \cdots \times C_{out}^{s-1}$.
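The two decoding primitives the algorithm relies on, list decoding a short inner code and list recovering an outer code, can both be realized by brute force over the codebook when the codes are small (this is also how the inner decoding is eventually instantiated). A toy sketch with our own helper names:

```python
def hamming(u, v):
    """Hamming distance between two equal-length sequences."""
    return sum(a != b for a, b in zip(u, v))

def list_decode(codebook, received, max_errors):
    """All codewords within Hamming distance max_errors of the received word."""
    return [c for c in codebook if hamming(c, received) <= max_errors]

def list_recover(codebook, lists, max_errors):
    """All codewords whose i-th symbol lies outside the i-th input list
    in at most max_errors positions."""
    return [c for c in codebook
            if sum(c[i] not in lists[i] for i in range(len(c))) <= max_errors]

# Toy inner code: 3-fold repetition over F_2.
rep3 = [(0, 0, 0), (1, 1, 1)]
```

Both routines take time linear in the codebook size, which is why brute force is affordable only for inner codes of block length $O(\log N)$.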
The algorithm has $s$ phases. At the end of phase $j$ ($0 \leq j \leq s-1$), the algorithm will have a list of matrices (called $\mathcal{M}_j$) from $C_{out}^0 \times \cdots \times C_{out}^j$, where each matrix in $\mathcal{M}_j$ is a possible submatrix of some matrix that will be in the final list output by the algorithm. The following steps are performed in phase $j$ (where we assume that the list-decoding algorithm for $C_{in}^j$ returns a list of messages, while the list-recovery algorithm for $C_{out}^j$ returns a list of codewords).

1. Set $\mathcal{M}_j$ to be the empty set.

2. For every tuple $c = (c^0, \ldots, c^{j-1}) \in \mathcal{M}_{j-1}$, repeat the following steps (if this is the first phase, that is $j = 0$, repeat the following steps once):

(a) Let $G'$ be the first $aj$ rows of the generator matrix of $C_{in}$. Let $Y' = (G')^T \cdot c$, where we think of $c$ as an $aj \times N$ matrix over $\mathbb{F}_q$. Let $Y'' = Y - Y'$ (for $j = 0$ we use the convention that $Y'$ is the all-zeros matrix). For every $1 \leq i \leq N$, use the list-decoding algorithm for $C_{in}^j$ on column $Y''_i$ for up to a $\rho_j$ fraction of errors to obtain a list $S_i \subseteq (\mathbb{F}_Q)^{s-j}$. Let $T_i \subseteq \mathbb{F}_Q$ be the projection of every vector in $S_i$ onto its first component.

(b) Run the list-recovery algorithm for $C_{out}^j$ on the collection of lists $\{T_i\}_i$ obtained from the previous step, for up to a $\gamma_j$ fraction of errors. Store the set of codewords returned in $\mathcal{L}_j$.

(c) Add $\{(c^0, \ldots, c^{j-1}, v) \mid v \in \mathcal{L}_j\}$ to $\mathcal{M}_j$.

At the end, remove all matrices $M \in \mathcal{M}_{s-1}$ for which the codeword $(C_{in}(M_1), C_{in}(M_2), \ldots, C_{in}(M_N))$ is at distance more than $\delta$ from $Y$. Output the remaining matrices as the final answer.

We will first talk about the running time of the algorithm. It is easy to check that each repetition of steps 2(a)-(c) takes time $O(T_j(N) + N \cdot t_j(n))$. To compute the final running time, we need a bound on the number of times step 2 is repeated in phase $j$. The number of repetitions is exactly $|\mathcal{M}_{j-1}|$, so we need to bound $|\mathcal{M}_{j-1}|$. By the list-recoverability property of $C_{out}^j$, we can bound $|\mathcal{L}_j|$ by $L$.
This implies that $|\mathcal{M}_j| \leq L \cdot |\mathcal{M}_{j-1}|$, and therefore by induction we have

$|\mathcal{M}_i| \leq L^{i+1}$ for $i = 0, 1, \ldots, s-1$. (4.8)

Thus, the overall running time and the size of the list output by the algorithm are as claimed in the statement of the theorem.

We now argue the correctness of the algorithm. That is, we have to show that for every $M \in C_{out}^0 \times \cdots \times C_{out}^{s-1}$ such that $(C_{in}(M_1), C_{in}(M_2), \ldots, C_{in}(M_N))$ is at distance at most $\delta$ from $Y$ (call such an $M$ a good matrix), $M \in \mathcal{M}_{s-1}$. In fact, we will prove a stronger claim: for every good matrix $M$ and every $0 \leq j \leq s-1$, $M^j \in \mathcal{M}_j$, where $M^j$ denotes the submatrix of $M$ that lies in $C_{out}^0 \times \cdots \times C_{out}^j$ (that is, the first $j+1$ "rows" of $M$). For the rest of the argument, fix an arbitrary good matrix $M$. Assume that the stronger claim holds for $j-1$ (with $j \leq s-1$); in other words, $M^{j-1} \in \mathcal{M}_{j-1}$. We need to show that $M^j \in \mathcal{M}_j$.

For concreteness, let $M = (c^0, \ldots, c^{s-1})$. As $M$ is a good matrix and $\delta \leq \gamma_j \rho_j$, $C_{in}(M_i)$ can disagree with $Y_i$ on at least a $\rho_j$ fraction of positions for at most a $\gamma_j$ fraction of the column indices $i$. The next crucial observation is that for any column index $i$, $C_{in}(M_i) = (G')^T (c^0_i, \ldots, c^{j-1}_i) + (G \setminus G')^T (c^j_i, \ldots, c^{s-1}_i)$, where $G'$ is as defined in step 2(a), $G \setminus G'$ is the submatrix of the generator matrix $G$ of $C_{in}$ obtained by "removing" $G'$, and $c^{j'}_i$ is the $i$th component of the vector $c^{j'}$. Figure 4.4 might help the reader visualize the different variables. Note that $G \setminus G'$ is the generator matrix of $C_{in}^j$. Thus, for at most a $\gamma_j$ fraction of column indices $i$, $(c^j_i, \ldots, c^{s-1}_i) \cdot (G \setminus G')$ disagrees with $Y_i - Y'_i$ on at least a $\rho_j$ fraction of places, where $Y'$ is as defined in step 2(a) and $Y'_i$ denotes the $i$th column of $Y'$. As $C_{in}^j$ is $(\rho_j, \ell)$-list decodable, for at least a $1 - \gamma_j$ fraction of column indices $i$, $M'_i$ will be in $S_i$ (where $M'_i$ is $M_i$ projected onto its last $s - j$ coordinates and $S_i$ is as defined in step 2(a)). In other words, $c^j_i \in T_i$ for at least a $1 - \gamma_j$ fraction of the $i$'s. Further, as $|S_i| \leq \ell$, we also have $|T_i| \leq \ell$.
This, together with the list-recoverability property of $C_{out}^j$, implies that $c^j \in \mathcal{L}_j$, where $\mathcal{L}_j$ is as defined in step 2(b). Finally, step 2(c) implies that $M^j \in \mathcal{M}_j$, as required.

The proof of correctness of the algorithm, along with (4.8), shows that $C$ is $(\delta, L^s)$-list decodable, which completes the proof.

[Figure 4.4 depicts the decomposition $C_{in}(M_i) = (G')^T (c^0_i, \ldots, c^{j-1}_i) + (G \setminus G')^T (c^j_i, \ldots, c^{s-1}_i)$ used in the proof.]

Figure 4.4: Different variables in the proof of Theorem 4.5.

4.5.4 Putting it Together

We combine the results we have proved in the last couple of subsections to get the main result of this section.

Theorem 4.6. For every fixed field $\mathbb{F}_q$, reals $\epsilon > 0$, $0 < \delta < 1$ and $0 < r \leq 1 - H_q(\delta)$, and every integer $s \geq 1$, there exist linear codes $C$ over $\mathbb{F}_q$ of block length $N$ that are $(\delta - \epsilon, L(N))$-list decodable with rate $R$ satisfying

$R \geq \frac{r}{s} \sum_{j=0}^{s-1} \left( 1 - \frac{\delta}{H_q^{-1}(1 - r + jr/s)} \right) - \epsilon,$ (4.9)

where $L(N)$ is polynomial in $N$ (with an exponent depending on $s$ and $\epsilon$). Finally, $C$ can be constructed and list decoded in time polynomial in $N$ (for fixed $\epsilon$ and $s$).

Proof. Let $\zeta > 0$ be a small real whose value we will fix later. For every $0 \leq j \leq s-1$, define $r_j = r(1 - j/s)$ and $R_j = 1 - \delta / H_q^{-1}(1 - r_j)$. The code $C$ is going to be the multilevel concatenated code $(C_{out}^0 \times \cdots \times C_{out}^{s-1}) \circ C_{in}$, where $C_{out}^j$ is a code from Corollary 3.7 of rate $R_j$ and block length $N'$ (over $\mathbb{F}_{q^a}$), and $C_{in}$ is an $((r_0, \ldots, r_{s-1}), \boldsymbol{\rho}, \mathbf{L})$-nested list decodable code as guaranteed by Theorem 4.4, where $\rho_j = H_q^{-1}(1 - r_j - 2\zeta)$ for $0 \leq j \leq s-1$ and each entry of $\mathbf{L}$ is $\ell = q^{O(1/\zeta)}$. Finally, we will use the property of $C_{out}^j$ that it is $(1 - R_j - \zeta, \ell, L)$-list recoverable for any $0 < \zeta \leq R_j$. Corollary 3.7 implies that such codes exist with list size

$L = (N'/\zeta^2)^{O(\zeta^{-1} \log(1/R_{\min}))},$ (4.10)

where $R_{\min} = \min_j R_j = 1 - \delta / H_q^{-1}(1 - r)$. Further, as the codes from Corollary 3.7 are $\mathbb{F}_q$-linear, $C$ is a linear code. The claims on the list decodability of $C$ follow from the choices of $R_j$ and $r_j$, Corollary 3.7 and Theorems 4.4 and 4.5.
In particular, note that we invoke Theorem 4.5 with the parameters $\gamma_j = 1 - R_j - \zeta$ and $\rho_j = H_q^{-1}(1 - r_j - 2\zeta)$, which (by Lemma 2.4) implies that $\gamma_j \rho_j \geq \delta - \epsilon$ as long as $\zeta$ is chosen to be a suitably small multiple of $\epsilon$. Now $\log(1/R_j) \leq \log(1/R_{\min})$, where $R_{\min} = \min_j R_j = 1 - \delta / H_q^{-1}(1 - r)$. Finally, we use the fact that $\ln(1/x) \leq 1/x - 1$ for any $0 < x < 1$ to get $\log(1/R_{\min}) \leq 1/R_{\min} - 1 = O\big( \delta / (H_q^{-1}(1 - r) - \delta) \big)$. The claimed upper bound on $L(N)$ follows, as does the bound on the decoding time (by Theorem 4.5).

By the choices of $R_j$ and $r_j$ and by (4.1), the rate of $C$ is as claimed. The construction time for $C$ is dominated by the time required to construct $C_{in}$, which by Theorem 4.4 is $q^{O(n)}$, where $n$ is the block length of $C_{in}$. Note that $n = as/r$, which together with (4.10) implies that the construction time is polynomial in $N$ (for fixed $\zeta$ and $s$); here we also use the bound $H_q^{-1}(1 - r_j) \leq 1$.

We finally consider the running time of the list-decoding algorithm. We list decode the inner code(s) by brute force, which takes $q^{O(n)}$ time; that is, $t_j(n) = q^{O(n)}$. Thus, Corollary 3.7, Theorem 4.5 and the bound on $L$ imply the claimed running time complexity.

Choosing the parameter $r$ in the above theorem so as to maximize (4.9) gives us linear codes over any fixed field whose rate vs. list-decoding radius tradeoff meets the Blokh-Zyablov bound (4.2). As $s$ grows, the trade-off approaches the integral form (4.3) of the Blokh-Zyablov bound.

4.6 Bibliographic Notes and Open Questions

We managed to reduce the alphabet size needed to approach capacity to a constant independent of $N$. However, this involved a brute-force search for a rather large code. Obtaining a "direct" algebraic construction over a constant-sized alphabet (such as variants of algebraic-geometric codes, or AG codes) might help in addressing these two issues.
To this end, Guruswami and Patthak [55] define correlated AG codes and describe list-decoding algorithms for those codes, based on a generalization of the Parvaresh-Vardy approach to the general class of algebraic-geometric codes (of which Reed-Solomon codes are a special case). However, relating folded AG codes to correlated AG codes as we did for Reed-Solomon codes requires bijections on the set of rational points of the underlying algebraic curve that have some special, hard-to-guarantee property. This step seems like a highly intricate algebraic task, especially so in the interesting asymptotic setting of a family of asymptotically good AG codes over a fixed alphabet.

Our proof of existence of the requisite inner codes with good nested list decodability (and in particular the derandomization of the construction of such codes using conditional expectations) is similar to the one used to establish list decodability properties of random "pseudolinear" codes in [52] (see also [49, Sec. 9.3]).

Concatenated codes were defined in the seminal thesis of Forney [40]. Their generalization to linear multilevel concatenated codes was introduced by Blokh and Zyablov [20], and general multilevel concatenated codes were introduced by Zinoviev [108]. Our list-decoding algorithm is inspired by the argument for the "unequal error protection" property of multilevel concatenated codes [109].

The results on capacity-achieving list decodable codes over small alphabets (Section 4.2) and binary linear codes that are list decodable up to the Zyablov bound (Section 4.3) appeared in [58]. The result on linear codes that are list decodable up to the Blokh-Zyablov bound (Section 4.5) appeared in [60].

The biggest and perhaps most challenging question left unresolved by our work is the following.

Open Question 4.1. For every $0 < \rho < 1/2$ and every $\epsilon > 0$, give an explicit construction of binary codes that are $(\rho, O(1/\epsilon))$-list decodable with rate $1 - H(\rho) - \epsilon$.
Further, design polynomial time list-decoding algorithms that can correct up to a $\rho$ fraction of errors. In fact, just resolving the above question for any fixed $\rho$ (even with an exponential time list-decoding algorithm) is wide open at this point.

Chapter 5

LIST DECODABILITY OF RANDOM LINEAR CONCATENATED CODES

5.1 Introduction

In Chapter 2, we saw that for any fixed alphabet of size $q\ge2$ there exist codes of rate $R$ that can be list decoded up to a $H_q^{-1}(1-R-\varepsilon)$ fraction of errors with lists of size $O(1/\varepsilon)$. For linear codes one can show a similar result with lists of size $q^{O(1/\varepsilon)}$. These results are shown by choosing the code at random. However, as we saw in Chapter 4, the known explicit constructions of codes over fixed alphabets are nowhere close to achieving list-decoding capacity.

The linear codes in Chapter 4 are based on code concatenation. A natural question to ask is whether linear codes based on code concatenation can get us to list-decoding capacity over fixed alphabets. In this chapter, we answer the question above in the affirmative. In particular, in Section 5.4 we show that if the outer code is a random linear code and the inner codes are also (independent) random linear codes, then the resulting concatenated codes get to within $\varepsilon$ of the list-decoding capacity with lists of constant size depending only on $\varepsilon$. In Section 5.5, we also show a similar result when the outer code is the folded Reed-Solomon code from Chapter 3. However, we can only show the latter result with polynomial-sized lists.

The way to interpret the results in this chapter is the following. We exhibit an ensemble of random linear codes with more structure than general random (linear) codes that achieve the list-decoding capacity. This structure gives rise to the hope of being able to list decode such a random ensemble of codes up to the list-decoding capacity. Furthermore, for designing explicit codes that meet the list-decoding capacity, one can concentrate on concatenated codes.
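Since the capacity expression $H_q^{-1}(1-R)$ recurs throughout this chapter, it is worth noting that it is easy to evaluate numerically: $H_q$ is strictly increasing on $[0,1-1/q]$, so its inverse can be computed by bisection. The following Python helper is an illustration of ours (the function names are not from the text):

```python
import math

def Hq(q, x):
    """q-ary entropy: x*log_q(q-1) - x*log_q(x) - (1-x)*log_q(1-x)."""
    if x <= 0.0:
        return 0.0
    return (x * math.log(q - 1, q) - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def Hq_inv(q, y):
    """Inverse of Hq on [0, 1-1/q], by bisection (Hq is increasing there)."""
    lo, hi = 0.0, 1.0 - 1.0 / q
    for _ in range(100):
        mid = (lo + hi) / 2
        if Hq(q, mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# List-decoding capacity: rate R codes can be list decoded up to a
# Hq_inv(q, 1 - R) fraction of errors, and no further.
q, R = 2, 0.5
rho = Hq_inv(q, 1 - R)
print(rho)
```

For $q=2$ and $R=1/2$ this gives a radius of roughly $0.11$, i.e., binary rate-$1/2$ codes can in principle be list decoded from about an $11\%$ fraction of errors.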
Another corollary of our result is that we need fewer random bits to construct a code that achieves the list-decoding capacity. In particular, a general random linear code requires a number of random bits that grows quadratically with the block length. On the other hand, random concatenated codes with the folded Reed-Solomon code as the outer code require a number of random bits that grows quasi-linearly with the block length.

The results in this chapter (and their proofs) are inspired by the following results due to Blokh and Zyablov [19] and Thommesen [102]. Blokh and Zyablov show that random concatenated linear binary codes (where both the outer and inner codes are chosen uniformly at random) have, with high probability, the same minimum distance as general random linear codes. Thommesen shows a similar result when the outer code is the Reed-Solomon code. The rate versus distance tradeoff achieved by random linear codes satisfies the so-called Gilbert-Varshamov (GV) bound. However, as with list decodability of binary codes, explicit codes that achieve the GV bound are not known. Coming up with such explicit constructions is one of the biggest open questions in coding theory.

5.2 Preliminaries

We will consider outer codes that are defined over $\mathbb{F}_Q$, where $Q=q^k$ for some fixed $q\ge2$. The outer code will have rate $R$ and block length $N$. The outer code $C_{out}$ will either be a random linear code over $\mathbb{F}_Q$ or the folded Reed-Solomon code from Chapter 3. In the case when $C_{out}$ is random, we will pick $C_{out}$ by selecting $RN$ vectors uniformly at random from $\mathbb{F}_Q^N$ to form the rows of its generator matrix. For every position $1\le i\le N$, we will choose an inner code $C_{in}^i$ to be a random linear code over $\mathbb{F}_q$ of block length $n$ and rate $r=k/n$. In particular, we will work with the corresponding generator matrices $G_i$, where every $G_i$ is a random $k\times n$ matrix over $\mathbb{F}_q$.
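The randomness comparison made above can be made concrete with back-of-the-envelope arithmetic. The snippet below is an illustration of ours; the specific parameter values are arbitrary, with the inner block length set to $\Theta(\log N)$ as in the folded Reed-Solomon setting.

```python
import math

def bits_general(m, rate, q=2):
    """Random bits for a fully random linear code: a (rate*m) x m generator
    matrix over GF(q), i.e., quadratic in the block length m."""
    return int(rate * m) * m * math.ceil(math.log2(q))

def bits_concatenated(N, n, r, q=2):
    """Random bits when only the N inner codes are random: N independent
    k x n generator matrices with k = r*n, i.e., linear in N for fixed n."""
    k = int(r * n)
    return N * k * n * math.ceil(math.log2(q))

# Illustrative numbers (not from the thesis): outer block length N,
# inner block length n on the order of log N.
N = 2 ** 16
n = 4 * int(math.log2(N))
m = n * N                      # block length of the concatenated code
g = bits_general(m, 0.25)
c = bits_concatenated(N, n, 0.5)
print(g, c)
```

Here `g` grows like $m^2$ while `c` grows only quasi-linearly in $m$ (as $m\log^2 m$ up to constants), which is the gap claimed above.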
All the generator matrices $G_i$ (as well as the generator matrix for $C_{out}$, when we choose a random $C_{out}$) are chosen independently. This fact will be used crucially in our proofs.

Given the outer code $C_{out}$ and the inner codes $C_{in}^i$, the resulting concatenated code $C=C_{out}\circ(C_{in}^1,\ldots,C_{in}^N)$ is constructed as follows.$^1$ For every codeword $\mathbf{u}=(u_1,\ldots,u_N)\in C_{out}$, the following codeword is in $C$:

$$\mathbf{u}G \;=\; (u_1G_1,\, u_2G_2,\, \ldots,\, u_NG_N),$$

where the operations are over $\mathbb{F}_q$.

We will need the following notions of the weight of a vector. Given a vector $\mathbf{v}\in\mathbb{F}_q^{nN}$, its Hamming weight is denoted by $wt(\mathbf{v})$. Given a vector $\mathbf{y}=(y_1,\ldots,y_N)\in(\mathbb{F}_q^n)^N$ and a subset $S\subseteq[N]$, we will use $wt_S(\mathbf{y})$ to denote the Hamming weight over $\mathbb{F}_q$ of the subvector $(y_i)_{i\in S}$. Note that $wt(\mathbf{y})=wt_{[N]}(\mathbf{y})$.

We will need the following lemma due to Thommesen, which is stated in a slightly different form in [102]. For the sake of completeness we also present its proof.

Lemma 5.1 ([102]). Given a fixed outer code $C_{out}$ of block length $N$ and an ensemble of random inner linear codes of block length $n$ given by generator matrices $G_1,\ldots,G_N$, the following is true. Let $\mathbf{y}\in\mathbb{F}_q^{nN}$. For any codeword $\mathbf{u}\in C_{out}$, any non-empty subset $S\subseteq[N]$ such that $u_i\ne0$ for all $i\in S$, and any integer $h\le n|S|(1-1/q)$:

$$\Pr\big[wt_S(\mathbf{u}G-\mathbf{y})\le h\big]\;\le\; q^{-n|S|\left(1-H_q\left(\frac{h}{n|S|}\right)\right)},$$

where the probability is taken over the random choices of $G_1,\ldots,G_N$.

$^1$Note that this is a slightly more general form of code concatenation than the one considered in Chapter 4. We did consider the current generalization briefly in Section 4.4.

Proof. Let $|S|=s$ and w.l.o.g. assume that $S=[s]$. As the choices for $G_1,\ldots,G_N$ are made independently, it is enough to show that the claimed probability holds for the random choices of $G_1,\ldots,G_s$. For any $1\le i\le s$ and any $y_i\in\mathbb{F}_q^n$, since $u_i\ne0$, we have $\Pr[u_iG_i=y_i]=q^{-n}$. Further, these probabilities are independent for every $i$. Thus, for any $\mathbf{y}=(y_1,\ldots,y_s)\in(\mathbb{F}_q^n)^s$, $\Pr[u_iG_i=y_i\text{ for every }1\le i\le s]=q^{-ns}$.
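The encoding map $\mathbf{u}\mapsto(u_1G_1,\ldots,u_NG_N)$ and the subset weight $wt_S$ are straightforward to realize in code. The following Python sketch is ours and only illustrative, for $q=2$, with each outer symbol viewed as a length-$k$ bit vector (the $\mathbb{F}_q$-linear view of $\mathbb{F}_{q^k}$ used throughout this chapter):

```python
import random

def encode_concatenated(u, Gs, k, n):
    """Concatenated encoding over GF(2): the i-th outer symbol u[i]
    (a length-k bit vector) is multiplied by the i-th k x n inner
    generator matrix Gs[i]; the resulting blocks are juxtaposed."""
    codeword = []
    for ui, Gi in zip(u, Gs):
        block = [0] * n
        for row in range(k):
            if ui[row]:
                block = [b ^ g for b, g in zip(block, Gi[row])]
        codeword.append(block)
    return codeword

def wt_S(y, S):
    """Hamming weight of the blocks of y indexed by S (wt_S in the text)."""
    return sum(b for i in S for b in y[i])

random.seed(1)
k, n, N = 3, 6, 5
Gs = [[[random.randint(0, 1) for _ in range(n)] for _ in range(k)]
      for _ in range(N)]                  # independent random inner codes
u = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0], [1, 1, 1]]
y = encode_concatenated(u, Gs, k, n)
```

Note that the zero outer symbol `u[1]` always encodes to an all-zero block, and that $wt(\mathbf{y})=wt_{[N]}(\mathbf{y})$, as remarked above.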
This implies that:

$$\Pr\big[wt_S(\mathbf{u}G-\mathbf{y})\le h\big]\;=\;q^{-ns}\sum_{j=0}^{h}\binom{ns}{j}(q-1)^j.$$

The claimed result follows from Proposition 2.1.

Figure 5.1: Geometric interpretations of the functions $\alpha_q(\cdot)$ and $f_{x,q}(\cdot)$.

For $0\le z\le1$ define

$$\alpha_q(z)\;=\;1-H_q\big(1-q^{z-1}\big). \qquad(5.1)$$

We will need the following property of the function above.

Lemma 5.2. Let $q\ge2$ be an integer. For every $0\le z\le1$, $\alpha_q(z)\le z$.

Proof. The proof follows from the subsequent sequence of relations:

$$\alpha_q(z)=1-H_q(1-q^{z-1})=1-(1-q^{z-1})\log_q(q-1)+(1-q^{z-1})\log_q(1-q^{z-1})+(z-1)q^{z-1}$$
$$=1+(1-q^{z-1})\log_q\left(\frac{1-q^{z-1}}{q-1}\right)+(z-1)q^{z-1}\;\le\;1-(1-q^{z-1})+(z-1)q^{z-1}\;=\;z\,q^{z-1}\;\le\;z,$$

where the first inequality follows from the fact that $1-q^{z-1}\le1-1/q$, which implies that $\log_q\big(\frac{1-q^{z-1}}{q-1}\big)\le-1$, and the last inequality follows from $q^{z-1}\le1$.

We will consider the following function:

$$f_{x,q}(\theta)\;=\;(1-\theta)^{-1}\cdot H_q^{-1}(1-\theta x),\qquad 0\le\theta,x\le1.$$

We will need the following property of this function.$^2$

Lemma 5.3 ([102]). Let $q\ge2$ be an integer. For any $0<x\le1$ and $0\le y\le\alpha_q(x)/x$,

$$\min_{0\le\theta\le y}f_{x,q}(\theta)\;=\;(1-y)^{-1}\cdot H_q^{-1}(1-xy).$$

Proof. The proof follows from the subsequent geometric interpretations of $\alpha_q$ and $f_{x,q}$. See Figure 5.1 for a pictorial illustration of the arguments used in this proof (for $q=2$).

First, we claim that for any $0\le\varepsilon\le1$, $\alpha_q$ satisfies the following property: the line segment between $\big(\alpha_q(\varepsilon),\,H_q^{-1}(1-\alpha_q(\varepsilon))\big)$ and $(\varepsilon,0)$ is tangent to the curve $H_q^{-1}(1-v)$ at $v=\alpha_q(\varepsilon)$. Thus, we need to show that

$$\frac{H_q^{-1}(1-\alpha_q(\varepsilon))}{\varepsilon-\alpha_q(\varepsilon)}\;=\;\big(H_q^{-1}\big)'\big(1-\alpha_q(\varepsilon)\big). \qquad(5.2)$$

One can check that $H_q^{-1}(1-\alpha_q(\varepsilon))=1-q^{\varepsilon-1}$ and that $\big(H_q^{-1}\big)'(1-\alpha_q(\varepsilon))=\Big(\log_q\frac{(q-1)q^{\varepsilon-1}}{1-q^{\varepsilon-1}}\Big)^{-1}$. Now,

$$\varepsilon-\alpha_q(\varepsilon)=(\varepsilon-1)(1-q^{\varepsilon-1})+(1-q^{\varepsilon-1})\log_q(q-1)-(1-q^{\varepsilon-1})\log_q(1-q^{\varepsilon-1})$$
$$=\big(1-q^{\varepsilon-1}\big)\cdot\log_q\left(\frac{(q-1)q^{\varepsilon-1}}{1-q^{\varepsilon-1}}\right),$$

$^2$Lemma 5.3 was proven in [102] for a specific setting of the parameters.
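The inequality of Lemma 5.2 is also easy to check numerically. The following sketch is ours; it evaluates $\alpha_q$ directly from (5.1) on a grid and verifies $\alpha_q(z)\le z$:

```python
import math

def Hq(q, x):
    """q-ary entropy function."""
    if x <= 0.0:
        return 0.0
    return (x * math.log(q - 1, q) - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def alpha(q, z):
    """alpha_q(z) = 1 - H_q(1 - q^(z-1)), as in (5.1)."""
    return 1.0 - Hq(q, 1.0 - q ** (z - 1.0))

# Numerical check of Lemma 5.2: alpha_q(z) <= z on [0, 1].
for q in (2, 3, 4, 16):
    for i in range(101):
        z = i / 100
        assert alpha(q, z) <= z + 1e-9
```

The endpoints match the derivation above: $\alpha_q(0)=1-H_q(1-1/q)=0$ and $\alpha_q(1)=1-H_q(0)=1$, with $\alpha_q(z)$ strictly below $z$ in between.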
Here we present the straightforward extension to general parameters.

This proves (5.2), where we have used the expression for $\big(H_q^{-1}\big)'$ and the fact that $1-q^{\varepsilon-1}=H_q^{-1}(1-\alpha_q(\varepsilon))$.

We now claim that $f_{x,q}(\theta)$ is the intercept on the "$y$-axis" of the line segment through $\big(\theta,\,H_q^{-1}(1-\theta x)\big)$ and $(1,0)$. Indeed, the "$y$-coordinate" increases by $H_q^{-1}(1-\theta x)$ along the line segment from $1$ to $\theta$. Thus, when the line segment crosses the "$y$-axis", it crosses at an intercept of $1/(1-\theta)$ times the "gain" going from $1$ to $\theta$. The lemma follows from the fact that the function $H_q^{-1}(1-x\theta)$ is a decreasing (strictly) convex function of $\theta$ and thus, the minimum of $f_{x,q}(\theta)$ over $0\le\theta\le y$ occurs at $\theta=y$, provided $xy\le\alpha_q(x)$.

5.3 Overview of the Proof Techniques

In this section, we will highlight the main ideas in our proofs. Our proofs are inspired by Thommesen's proof of the following result [102]: binary linear concatenated codes with an outer Reed-Solomon code and independently and randomly chosen inner codes meet the Gilbert-Varshamov bound.$^3$

Given that our proof builds on the proof of Thommesen, we start out by reviewing the main ideas in his proof. The outer code $C_{out}$ in [102] is a Reed-Solomon code of length $N$ and rate $R$ (over $\mathbb{F}_Q$) and the inner codes (over $\mathbb{F}_q$ such that $Q=q^k$ for some $k\ge1$) are generated by $N$ randomly chosen $k\times n$ generator matrices $G=(G_1,\ldots,G_N)$, where $r=k/n$. Note that since the final code will be linear, to show that with high probability the concatenated code will have relative distance close to $H_q^{-1}(1-rR)$, it is enough to show that the probability of the Hamming weight of $\mathbf{u}G$ over $\mathbb{F}_q$ being at most $\big(H_q^{-1}(1-rR)-\varepsilon\big)nN$ (for some Reed-Solomon codeword $\mathbf{u}=(u_1,\ldots,u_N)$) is small. Let us now concentrate on a fixed codeword $\mathbf{u}\in C_{out}$. Now note that if for some $1\le i\le N$, $u_i=0$, then for every choice of $G_i$, $u_iG_i=\mathbf{0}$. Thus, only the non-zero symbols of $\mathbf{u}$ contribute to $wt(\mathbf{u}G)$.
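The two distributional facts that drive this argument, namely that a zero outer symbol contributes nothing, while for a non-zero symbol $u_i$ the block $u_iG_i$ is uniformly distributed over $\mathbb{F}_q^n$, can be verified exhaustively for tiny parameters. The following Python check is ours, for $q=2$ with $k=n=2$, enumerating all $2^{kn}$ generator matrices:

```python
from itertools import product
from collections import Counter

def mat_vec(u, G, n):
    """u * G over GF(2), with G given as a tuple of k row vectors."""
    out = [0] * n
    for ui, row in zip(u, G):
        if ui:
            out = [o ^ r for o, r in zip(out, row)]
    return tuple(out)

k, n = 2, 2
rows = list(product((0, 1), repeat=n))
matrices = list(product(rows, repeat=k))        # all 2^(k*n) matrices G_i
for u in [(1, 0), (0, 1), (1, 1)]:              # every non-zero outer symbol
    counts = Counter(mat_vec(u, G, n) for G in matrices)
    # u * G_i is uniform over GF(2)^n when G_i is uniform and u != 0
    assert len(counts) == 2 ** n
    assert all(c == len(matrices) // 2 ** n for c in counts.values())
```

Each of the $2^n$ possible blocks appears for exactly the same number of generator matrices, which is the uniformity claim used in the proof of Lemma 5.1.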
Further, for a non-zero $u_i$, $u_iG_i$ takes all the values in $\mathbb{F}_q^n$ with equal probability over the random choices of $G_i$. Also, for two different non-zero positions $i_1\ne i_2$ in $\mathbf{u}$, the random variables $u_{i_1}G_{i_1}$ and $u_{i_2}G_{i_2}$ are independent (as the choices for $G_{i_1}$ and $G_{i_2}$ are independent). This implies that $\mathbf{u}G$ takes each of its $q^{n\cdot wt(\mathbf{u})}$ possible values with the same probability. Thus, the total probability that $\mathbf{u}G$ has a Hamming weight of at most $h$ is

$$\sum_{j=0}^{h}\binom{n\,wt(\mathbf{u})}{j}(q-1)^j\,q^{-n\,wt(\mathbf{u})}\;\le\;q^{-n\,wt(\mathbf{u})\left(1-H_q\left(\frac{h}{n\,wt(\mathbf{u})}\right)\right)}.$$

The rest of the argument follows by doing a careful union bound of this probability over all non-zero codewords in $C_{out}$ (using the known weight distribution of Reed-Solomon codes$^4$).

Let us now try to extend the idea above to show a similar result for list decoding of a code similar to the one above (the inner codes are the same but we might change the outer code). We want to show that any Hamming ball of radius at most $h=\big(H_q^{-1}(1-rR)-\varepsilon\big)nN$ has at most $L$ codewords from the concatenated code $C$ (assuming we want to show that $L$ is the worst case list size). To show this, let us look at a set of $L+1$ codewords from $C$ and try to prove that the probability that all of them lie within some ball of radius $h$ is small. Let $\mathbf{u}^1,\ldots,\mathbf{u}^{L+1}$ be the corresponding codewords in $C_{out}$. As a warm up, let us try and show this for a Hamming ball centered around $\mathbf{0}$. Thus, we need to show that the probability that all of the $L+1$ codewords $\mathbf{u}^1G,\ldots,\mathbf{u}^{L+1}G$ have Hamming weight at most $h$ is small. Note that this reverts back to the setup of Thommesen, that is, any fixed codeword has weight at most $h$ with small probability.

$^3$A binary code of rate $R$ satisfies the Gilbert-Varshamov bound if it has relative distance at least $H^{-1}(1-R)$.

$^4$In fact, the argument works just as well for any code that has a weight distribution that is close to that of the Reed-Solomon code. In particular, it also works for folded Reed-Solomon codes; we alluded to this fact in Section 4.4.
However, we need all the codewords to have small weight. Extending Thommesen's proof would be straightforward if the random variables corresponding to each of the $\mathbf{u}^jG$ having small weight were independent. In particular, if we could show that for every position $1\le i\le N$, all the non-zero symbols in $\{u_i^1,u_i^2,\ldots,u_i^{L+1}\}$ are linearly independent$^5$ over $\mathbb{F}_q$, then the generalization of Thommesen's proof would be immediate.

Unfortunately, the notion of independence discussed above does not hold for every $(L+1)$-tuple of codewords from $C_{out}$. A fairly common trick to get independence when dealing with linear codes is to look at messages that are linearly independent. It turns out that if $C_{out}$ is a random linear code over $\mathbb{F}_Q$, then we have a good approximation of the notion of independence above. Specifically, we show that with very high probability, for any linearly independent (over $\mathbb{F}_Q$) set of messages$^6$ $\mathbf{m}^1,\ldots,\mathbf{m}^J$, the set of codewords $\mathbf{u}^1=C_{out}(\mathbf{m}^1),\ldots,\mathbf{u}^J=C_{out}(\mathbf{m}^J)$ has the following approximate independence property: for most of the positions $1\le i\le N$, most of the non-zero symbols in $\{u_i^1,\ldots,u_i^J\}$ are linearly independent over $\mathbb{F}_q$. It turns out that this approximate notion of independence is enough for Thommesen's proof to go through. Generalizing this argument to the case when the Hamming ball is centered around an arbitrary vector from $\mathbb{F}_q^{nN}$ is straightforward.

We remark that the notion above crucially uses the fact that the outer code is a random linear code. However, the argument is a bit more tricky when $C_{out}$ is fixed to be (say) the Reed-Solomon code. Now even if the messages $\mathbf{m}^1,\ldots,\mathbf{m}^J$ are linearly independent, it is not clear that the corresponding codewords will satisfy the notion of independence in the above paragraph. Interestingly, we can show that this notion of independence is equivalent to showing good list recoverability properties for $C_{out}$.
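The observation behind footnote 6, that any $L+1$ distinct messages contain at least $\lceil\log_Q(L+1)\rceil$ linearly independent ones (a span of dimension $j$ contains only $Q^j$ vectors), can be illustrated with a greedy extraction over $\mathbb{F}_2$. The Python sketch below is ours, using the usual incremental Gaussian elimination on bit masks:

```python
import math

def independent_subset(vectors):
    """Greedily extract an F_2-linearly independent subset. Vectors are
    ints used as bit masks; pivot[b] stores a reduced basis vector whose
    highest set bit is b (incremental Gaussian elimination)."""
    pivot, chosen = {}, []
    for v in vectors:
        cur = v
        while cur:
            b = cur.bit_length() - 1
            if b not in pivot:
                pivot[b] = cur
                chosen.append(v)       # v is outside the span so far
                break
            cur ^= pivot[b]            # reduce by the existing basis
    return chosen

# 7 distinct non-zero vectors in GF(2)^3: any such family must contain
# at least ceil(log2(7 + 1)) = 3 independent vectors.
msgs = [1, 2, 3, 4, 5, 6, 7]
ind = independent_subset(msgs)
assert len(ind) >= math.ceil(math.log2(len(msgs) + 1))
```

The greedy pass keeps exactly a maximal independent subfamily, so in the proofs one may replace the $(L+1)$-tuple by its $J=\lceil\log_Q(L+1)\rceil$ independent representatives.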
Reed-Solomon codes are however not known to have optimal list recoverability (which is what is required in our case). In fact, the results in Chapter 6 show that this is impossible for Reed-Solomon codes in general. However, as we saw in Chapter 3, folded Reed-Solomon codes do have optimal list recoverability, and we exploit this fact in this chapter.

$^5$Recall that $\mathbb{F}_Q$ is isomorphic to $\mathbb{F}_q^k$ and hence, we can think of the symbols in $\mathbb{F}_Q$ as vectors over $\mathbb{F}_q$.

$^6$Any set of $L+1$ messages need not be linearly independent. However, it is easy to see that some subset of $J\ge\lceil\log_Q(L+1)\rceil$ of the messages is indeed linearly independent. Hence, we can continue the argument by replacing $L+1$ with $J$.

5.4 List Decodability of Random Concatenated Codes

In this section, we will look at the list decodability of concatenated codes when both the outer code and the inner codes are (independent) random linear codes. The following is the main result of this section.

Theorem 5.1. Let $q$ be a prime power and let $0<r<1$ be an arbitrary rational. Let $0<\varepsilon<\alpha_q(r)$ be an arbitrary real, where $\alpha_q(r)$ is as defined in (5.1), and let $0<R\le(\alpha_q(r)-\varepsilon)/r$ be a rational. Then the following holds for large enough integers $n,N$ for which there exist integers $k$ and $K$ that satisfy $k=rn$ and $K=RkN$. Let $C_{out}$ be a random linear code over $\mathbb{F}_{q^k}$ that is generated by a random $RN\times N$ matrix over $\mathbb{F}_{q^k}$. Let $C_{in}^1,\ldots,C_{in}^N$ be random linear codes over $\mathbb{F}_q$, where $C_{in}^i$ is generated by a random $k\times n$ matrix $G_i$ and the random choices for $C_{out},G_1,\ldots,G_N$ are all independent. Then the concatenated code $C=C_{out}\circ(C_{in}^1,\ldots,C_{in}^N)$ is a $\Big(H_q^{-1}(1-rR)-\varepsilon,\;Q^{O\left(\frac{1}{\varepsilon(1-R)}\right)}\Big)$-list decodable code with probability at least $1-q^{-\Omega(nN)}$ over the choices of $C_{out},G_1,\ldots,G_N$, where $Q=q^k$. Further, with high probability, $C$ has rate $rR$.

In the rest of this section, we will prove the above theorem.

Define $Q=q^k$. Let $L$ be the worst-case list size that we are shooting for (we will fix its value at the end).
The first observation is that any $(L+1)$-tuple of messages $(\mathbf{m}^1,\ldots,\mathbf{m}^{L+1})\in\big(\mathbb{F}_Q^{RN}\big)^{L+1}$ contains at least $J=\lceil\log_Q(L+1)\rceil$ many messages that are linearly independent over $\mathbb{F}_Q$. Thus, to prove the theorem it suffices to show that with high probability, no Hamming ball over $\mathbb{F}_q^{nN}$ of radius $\big(H_q^{-1}(1-rR)-\varepsilon\big)nN$ contains a $J$-tuple of codewords $\big(C(\mathbf{m}^1),\ldots,C(\mathbf{m}^J)\big)$, where $\mathbf{m}^1,\ldots,\mathbf{m}^J$ are linearly independent over $\mathbb{F}_Q$.

Define $\rho=H_q^{-1}(1-rR)-\varepsilon$. For every $J$-tuple of linearly independent messages $(\mathbf{m}^1,\ldots,\mathbf{m}^J)\in\big(\mathbb{F}_Q^{RN}\big)^J$ and received word $\mathbf{y}\in\mathbb{F}_q^{nN}$, define an indicator random variable $I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)$ as follows: $I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)=1$ if and only if for every $1\le j\le J$, $wt\big(C(\mathbf{m}^j)-\mathbf{y}\big)\le\rho nN$. That is, it captures the bad event that we want to avoid. Define

$$X_C=\sum_{\mathbf{y}\in\mathbb{F}_q^{nN}}\ \sum_{(\mathbf{m}^1,\ldots,\mathbf{m}^J)\in\mathcal{I}(Q,RN,J)}I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J),$$

where $\mathcal{I}(Q,RN,J)$ denotes the collection of tuples of $J$ $\mathbb{F}_Q$-linearly independent vectors from $\mathbb{F}_Q^{RN}$. We want to show that with high probability $X_C=0$. By Markov's inequality, the theorem would follow if we can show that

$$\mathbb{E}[X_C]=\sum_{\mathbf{y}\in\mathbb{F}_q^{nN}}\ \sum_{(\mathbf{m}^1,\ldots,\mathbf{m}^J)\in\mathcal{I}(Q,RN,J)}\mathbb{E}\big[I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)\big]\ \text{ is }\ q^{-\Omega(nN)}. \qquad(5.3)$$

Note that the number of distinct possibilities for $\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J$ is upper bounded by $q^{nN}\cdot Q^{RNJ}=q^{nN(1+rRJ)}$. Fix some arbitrary choice of $\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J$. To prove (5.3), we will show that

$$\mathbb{E}\big[I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)\big]=\Pr\big[I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)=1\big]\ \text{ is }\ q^{-nN(1+rRJ)-\Omega(nN)}. \qquad(5.4)$$

Before we proceed, we need some more notation. Given codewords $\mathbf{c}^1,\ldots,\mathbf{c}^J\in C_{out}$, we define $Z(\mathbf{c}^1,\ldots,\mathbf{c}^J)=(Z_1,\ldots,Z_N)$ as follows. For every $1\le i\le N$, $Z_i\subseteq[J]$ denotes the largest subset such that the elements $(c_i^j)_{j\in Z_i}$ are linearly independent over $\mathbb{F}_q$ (in case of a tie, choose the lexically first such set), where $\mathbf{c}^j=(c_1^j,\ldots,c_N^j)$. A subset of $\mathbb{F}_Q$ is linearly independent over $\mathbb{F}_q$ if its elements, when viewed as vectors from $\mathbb{F}_q^k$ (recall that $\mathbb{F}_{q^k}$ is isomorphic to $\mathbb{F}_q^k$), are linearly independent over $\mathbb{F}_q$. If $j\in Z_i$, then we will call $c_i^j$ a good symbol.
Note that a good symbol is always non-zero. We will also define another partition of all the good symbols, $T(\mathbf{c}^1,\ldots,\mathbf{c}^J)=(T_1,\ldots,T_J)$, by setting $T_j=\{i\mid j\in Z_i\}$ for $1\le j\le J$.

Since $\mathbf{m}^1,\ldots,\mathbf{m}^J$ are linearly independent over $\mathbb{F}_Q$, the corresponding codewords in $C_{out}$ are distributed uniformly and independently in $\mathbb{F}_Q^N$. In other words, for any fixed $(\mathbf{c}^1,\ldots,\mathbf{c}^J)\in\big(\mathbb{F}_Q^N\big)^J$,

$$\Pr_{C_{out}}\Big[\bigwedge_{j=1}^{J}C_{out}(\mathbf{m}^j)=\mathbf{c}^j\Big]=Q^{-NJ}. \qquad(5.5)$$

Recall that we denote the (random) generator matrices for the inner code $C_{in}^i$ by $G_i$ for every $1\le i\le N$. Also note that every $(\mathbf{c}^1,\ldots,\mathbf{c}^J)\in\big(\mathbb{F}_Q^N\big)^J$ has a unique $Z(\mathbf{c}^1,\ldots,\mathbf{c}^J)$. In other words, the $2^{NJ}$ choices of $Z$ partition the tuples in $\big(\mathbb{F}_Q^N\big)^J$. Let $h=\rho nN$. Consider the following calculation (where the dependence of $Z$ and $T$ on $\mathbf{c}^1,\ldots,\mathbf{c}^J$ has been suppressed for clarity):

$$\Pr\big[I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)=1\big]=\sum_{(\mathbf{c}^1,\ldots,\mathbf{c}^J)}\Pr\Big[\bigwedge_{j=1}^{J}wt(\mathbf{c}^jG-\mathbf{y})\le h\Big]\cdot\Pr\Big[\bigwedge_{j=1}^{J}C_{out}(\mathbf{m}^j)=\mathbf{c}^j\Big] \qquad(5.6)$$

$$=Q^{-NJ}\sum_{(\mathbf{c}^1,\ldots,\mathbf{c}^J)}\Pr\Big[\bigwedge_{j=1}^{J}wt(\mathbf{c}^jG-\mathbf{y})\le h\Big] \qquad(5.7)$$

$$\le Q^{-NJ}\sum_{(\mathbf{c}^1,\ldots,\mathbf{c}^J)}\Pr\Big[\bigwedge_{j=1}^{J}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\Big] \qquad(5.8)$$

$$=Q^{-NJ}\sum_{(\mathbf{c}^1,\ldots,\mathbf{c}^J)}\prod_{j=1}^{J}\Pr\Big[wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\Big] \qquad(5.9)$$

In the above, (5.6) follows from the fact that the (random) choices for $C_{out}$ and $G=(G_1,\ldots,G_N)$ are all independent. (5.7) follows from (5.5). (5.8) follows from the simple fact that for every $\mathbf{x}\in\mathbb{F}_q^{nN}$ and $T\subseteq[N]$, $wt_T(\mathbf{x})\le wt(\mathbf{x})$. (5.9) follows from the subsequent argument. By the definition of conditional probability, $\Pr\big[\bigwedge_{j=1}^{J}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\big]$ is the same as

$$\Pr\Big[wt_{T_J}(\mathbf{c}^JG-\mathbf{y})\le h\ \Big|\ \bigwedge_{j=1}^{J-1}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\Big]\cdot\Pr\Big[\bigwedge_{j=1}^{J-1}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\Big].$$

Now, as all symbols corresponding to $T_J$ are good symbols, for every $i\in T_J$ the value of $c_i^JG_i$ is independent of the values of $\{c_i^1G_i,\ldots,c_i^{J-1}G_i\}$. Further, since each of $G_1,\ldots,G_N$ is chosen independently (at random), the event $wt_{T_J}(\mathbf{c}^JG-\mathbf{y})\le h$ is
independent of the event $\bigwedge_{j=1}^{J-1}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h$. Thus,

$$\Pr\Big[\bigwedge_{j=1}^{J}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\Big]=\Pr\Big[wt_{T_J}(\mathbf{c}^JG-\mathbf{y})\le h\Big]\cdot\Pr\Big[\bigwedge_{j=1}^{J-1}wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\Big].$$

Inductively applying the argument above gives (5.9).

Further, grouping the tuples $(\mathbf{c}^1,\ldots,\mathbf{c}^J)$ according to their (unique) partition $Z$, applying Lemma 5.1 to each factor in (5.9) (which is possible since all the symbols in the positions in $T_j$ are good and hence non-zero), and rearranging and grouping the summands ((5.10), (5.11) and (5.13)), we obtain

$$\Pr\big[I(\mathbf{y},\mathbf{m}^1,\ldots,\mathbf{m}^J)=1\big]\;\le\;2^{NJ}\sum_{(T_1,\ldots,T_J)}\prod_{j=1}^{J}E_{T_j},\quad\text{where }E_{T_j}=q^{-kN+(k+J)T_j}\cdot\min\Big(1,\,q^{-nT_j\left(1-H_q\left(\frac{h}{nT_j}\right)\right)}\Big)$$

and, abusing notation, $T_j$ now denotes $|T_j|$. Here (5.12) uses the following argument: given a fixed $Z=(Z_1,\ldots,Z_N)$, the number of tuples $(\mathbf{c}^1,\ldots,\mathbf{c}^J)$ such that $Z(\mathbf{c}^1,\ldots,\mathbf{c}^J)=Z$ is at most $\prod_{i=1}^{N}q^{k|Z_i|}\cdot q^{|Z_i|(J-|Z_i|)}$, where $q^{k|Z_i|}$ is an upper bound on the number of tuples of $|Z_i|$ linearly independent vectors from $\mathbb{F}_q^k$, and the factor $q^{|Z_i|(J-|Z_i|)}$ follows from the fact that every bad symbol $\{c_i^j\}_{j\notin Z_i}$ has to take a value that is a linear combination of the symbols $\{c_i^j\}_{j\in Z_i}$. Finally, there are at most $2^{NJ}$ distinct choices for $Z$.

We now proceed to upper bound $E_{T_j}$ by $q^{-nN\left(\frac{1+rRJ}{J}\right)-\Omega\left(\frac{nN}{J}\right)}$ for every $1\le j\le J$. Note that this will imply the claimed result, as there are at most $(N+1)^J=q^{o(nN)}$ choices for the different values of the $T_j$'s.

We first start with the case when $T_j<T_*$, where $T_*=(1-R-\gamma)N$, for some parameter $0<\gamma<1-R$ to be defined later. In this case we use the fact that $\Pr\big[wt_{T_j}(\mathbf{c}^jG-\mathbf{y})\le h\big]\le1$, so that $E_{T_j}\le q^{-kN+(k+J)T_j}$. Thus, we would be done if we can show that

$$r\Big(\frac{T_j}{N}-(1-R)\Big)+\frac{JT_j}{nN}+\frac{1}{J}\;\le\;-\frac{\delta}{J},$$

for a $\delta>0$ that we will choose soon.
The above would be satisfied if

$$T_j\le\left(1-R-\frac{1+\delta}{rJ}-\frac{J}{rn}\right)N,$$

which holds, as $T_j<T_*=(1-R-\gamma)N$, provided $\gamma\ge\frac{1+\delta}{rJ}+\frac{J}{rn}$. Note that if $n\ge J^2$ and if we set $\delta=\varepsilon/2$, it is enough to choose $\gamma=\frac{3}{rJ}$.

We now turn our attention to the case when $T_j\ge T_*$. The arguments are very similar to the ones employed by Thommesen in the proof of his main theorem in [102]. In this case, by Lemma 5.1 we have

$$E_{T_j}\le q^{-kN+(k+J)T_j-nT_j\left(1-H_q\left(\frac{h}{nT_j}\right)\right)}.$$

The above implies that $E_{T_j}$ is at most the required bound provided we show that for every $T_*\le T\le N$,

$$\frac{h}{nT}\le H_q^{-1}\left(1-r\left(1-\frac{(1-R-\gamma)N}{T}\right)-\frac{2N}{JT}\right),$$

that is, provided (5.14) holds. By Lemma 5.4, as long as $J\ge\frac{4c_q}{\delta(1-R)}$ (and the conditions on $\gamma$ are satisfied), the above can be satisfied by picking

$$h/(nN)=H_q^{-1}(1-rR)-2\delta=\rho,\qquad\text{with }\delta=\varepsilon/2,$$

as required. We now verify that the conditions on $\gamma$ in Lemma 5.4 are satisfied by our choice of $\gamma=\frac{3}{rJ}$. Note that if we choose $J=\Big\lceil\frac{4c_q}{\delta(1-R)}\Big\rceil$, we will have $\gamma\le\frac{3\delta(1-R)}{4c_qr}$. Now, as $R<1$, we also have $\gamma\le\frac{\delta}{rc_q}$. Finally, we show that $\gamma\le(1-R)/2$. Indeed,

$$\gamma\le\frac{3\delta(1-R)}{4c_qr}\le\frac{3\varepsilon(1-R)}{4c_qr}\le\frac{3\alpha_q(r)(1-R)}{4c_qr}\le\frac{1-R}{2},$$

where the first inequality follows from the fact that $\delta\le\varepsilon$, the second inequality follows from the assumption on $\varepsilon$, and the third inequality follows from Lemma 5.2 together with $c_q\ge2$.

Note that $J=O\Big(\frac{1}{\varepsilon(1-R)}\Big)$, which implies $L\le Q^J=Q^{O\left(\frac{1}{\varepsilon(1-R)}\right)}$, as claimed in the statement of the theorem.

We still need to argue that with high probability the rate of the code $C=C_{out}\circ(C_{in}^1,\ldots,C_{in}^N)$ is $rR$. One way to argue this would be to show that with high probability all of the generator matrices have full rank.
However, this is not the case: in fact, with some non-negligible probability at least one of them will not have full rank. However, we claim that with high probability $C$ has positive distance. Note that as $C$ is a linear code, this implies that every pair of distinct messages $\mathbf{m}^1\ne\mathbf{m}^2\in\mathbb{F}_Q^{RN}$ is mapped to distinct codewords, which implies that $C$ has $Q^{RN}$ codewords, as desired. We now briefly argue why $C$ has positive distance. The proof above in fact implies that with high probability $C$ has distance about $H_q^{-1}(1-rR)\,nN$. It is easy to see that to establish positive distance, it is enough to show that with high probability $\sum_{\mathbf{m}\ne\mathbf{0}}I(\mathbf{0},\mathbf{m})=0$. Note that this is a special case of our proof, with $J=1$ and $\mathbf{y}=\mathbf{0}$, and hence, with probability at least $1-q^{-\Omega(nN)}$, the code $C$ has large distance. The proof is complete.

Remark 5.1. In a typical use of concatenated codes, the block lengths of the inner and outer codes satisfy $n=\Theta(\log N)$, in which case the concatenated code of Theorem 5.1 is list decodable with lists of size $N^{O\left(\frac{1}{\varepsilon(1-R)}\right)}$. However, the proof of Theorem 5.1 also works with smaller $n$. In particular, as long as $n\ge J^2$, the proof of Theorem 5.1 goes through. Thus, with $n$ in $O(J^2)$, one can get concatenated codes that are list decodable up to the list-decoding capacity with lists of size that is a constant depending only on $q$, $\varepsilon$ and $R$.

Lemma 5.4. Let $q$ be a prime power, and let $1\le n\le N$ be integers. Let $0<r,R<1$ be rationals and let $\delta>0$ be a real such that $R\le(\alpha_q(r)-\delta)/r$ and $\delta\le\alpha_q(r)$, where $\alpha_q(r)$ is as defined in (5.1). Let $\gamma$ be a real such that $\gamma\le\min\Big\{\frac{\delta}{rc_q},\frac{1-R}{2}\Big\}$, where $c_q$ is the constant, depending only on $q$, from Lemma 2.4. Then for all integers $J\ge\frac{4c_q}{\delta(1-R)}$ and $h\le\big(H_q^{-1}(1-rR)-2\delta\big)nN$ the following is satisfied. For every integer $(1-R-\gamma)N\le T\le N$,

$$\frac{h}{nT}\;\le\;H_q^{-1}\left(1-r\left(1-\frac{(1-R-\gamma)N}{T}\right)-\frac{2N}{JT}\right). \qquad(5.14)$$

Proof. Using the fact that $H_q^{-1}$ is an increasing function, (5.14) is satisfied if the following is true
Let Yx be a real such that YxEõÖ true (where TX9÷`GI4QIãYúZ~ ): E ´~ Ä é Ó Ö Ï Ï­ Ä 0 T ~wGI4I8Yú V~ åZ X² ¯ GÐI8G GI I ¤ ~ ÿ T ÿ T XÐ Ò ÿ ê þ F þ þ Define a new variable }÷GúI ~´GjIg4KI YÁN6eT . Note that as T X ÷GjIg4KI YÁ~öEtTE ~ , ; E oE54xn Y . Also TU6Ú~öz`GIF4QI8Yú\`GI > åZ . Thus, the above inequality would be satisfied if 0 20 é Ó !4 E÷`GÐIF4QIãYÁ ý6Ï Ö 6Ï ­ } ´~ 20 GI £ åZ ² ¯ åZ þ 0 GI8G sI V ¤ `GIF4I8YÁ¤Ò ÿ ê åZ Again using the fact that ² ¯ is an increasing function along with the fact that YFE 4PN6<V , we get that the above is satisfied if !4 Ó é Ep`GI4QI8YÁ ý¤Ï Ö 6Ï ­ } ´~ 20 `GI £ åZ ² ¯ åZ þ 0 GI8G sI 0 `G-I ¤ `GÐIF4}¤Ò ÿ ê 0 × Ø Z å¾ , then7 ² ¯ åZ . GIãG sI Zå 2 q² ¯ åZ G(I G >IFÝ . Since Ô ; åZ for every E yEt4[n Y , GjIg4_I YÁGjI > Ý*EQÝ , the above equation would be satisfied Ò By Lemma 2.4, if l 30 if 0 10 Ó Ep`GI4QI8YÁ ¤ý Ð Ö 6Ï ­ } ¯ >äIlÝX¤ ´~ Note that the assumptions YE Ý 6U<G<Üë¯ tE Ý<6G (as ÝHE G and Üë¯ G ) and 4 E ¯ <G>PIzÝ<`6G , Ó we have 4vn/YùE ¯ ?G>N6©G . Thus, by using Lemma 5.3 we get that G*I4õI YÁcÖ ­ ý¤Ð 6Ï } ¯ £]h² ¯ åZ GI G£4õIGYÁ . By Lemma 2.4, the facts that åZ åZ åZ YSEQÝ 6C<G<Ü ¯ and ² ¯ is increasing, we have ² ¯ GYI G£4{I GYÁ9t² ¯ `GYI G£4PIgÝ . This åZ implies that (5.14) is satisfied if D6C^~[9Et² ¯ GIãG£4PäIFV<Ý , as desired. !4 7 We also use the fact that !4 0 B"C c is increasing. 84 5.5 Using Folded Reed-Solomon Code as Outer Code In this section, we will prove a result similar to Theorem 5.1, with the outer code being the folded Reed-Solomon code from Chapter 3. The proof will make crucial use of the list recoverability of folded Reed-Solomon codes. Before we begin we will need the following definition and results. 5.5.1 Preliminaries We will need the following notion of independence. Definition 5.1 (Independent tuples). Let 1 be a code of block length ~ and rate È 4 defined ; over ¯ . Let Ò{|G and EQTCZ\!¤¥¤¥¤!fT E~ be integers. Let _õÆ TCZ!¥¤¥¤¥¤!NT . 
An ordered tuple of codewords $(\mathbf{c}^1,\ldots,\mathbf{c}^J)$, with each $\mathbf{c}^j\in C$, is said to be $(\mathbf{d},\mathbb{F}_q)$-independent if the following holds: $d_1=wt(\mathbf{c}^1)$ and, for every $1<j\le J$, $d_j$ is the number of positions $i$ such that $c_i^j$ is $\mathbb{F}_q$-independent of the vectors $\{c_i^1,\ldots,c_i^{j-1}\}$, where $\mathbf{c}^j=(c_1^j,\ldots,c_N^j)$.

Note that for any tuple of codewords $(\mathbf{c}^1,\ldots,\mathbf{c}^J)$ there exists a unique $\mathbf{d}$ such that the tuple is $(\mathbf{d},\mathbb{F}_q)$-independent.

The next result will be crucial in our proof.

Lemma 5.5. Let $C$ be a folded Reed-Solomon code of block length $N$ that is defined over $\mathbb{F}_Q$ with $Q=q^k$, as guaranteed by Theorem 3.6. For any $L$-tuple of codewords from $C$, where $L\ge J\cdot\ell_0$ and $\ell_0$ denotes the bound on the output list size for list recovering $C$ guaranteed by Theorem 3.6, there exists a sub-tuple of $J$ codewords such that the $J$-tuple is $(\mathbf{d},\mathbb{F}_q)$-independent, where $\mathbf{d}=\langle d_1,\ldots,d_J\rangle$ with $d_j\ge(1-R-\varepsilon)N$ for every $1\le j\le J$.

Proof. The proof is constructive. In particular, given an $L$-tuple of codewords, we will construct a $J$ sub-tuple with the required property. The correctness of the procedure will hinge on the list recoverability of the folded Reed-Solomon code as guaranteed by Theorem 3.6.

We will construct the final sub-tuple iteratively. In the first step, pick any non-zero codeword in the $L$-tuple and call it $\mathbf{c}^1$. Note that as $C$ has distance $(1-R)N$ (and $\mathbf{0}\in C$), $\mathbf{c}^1$ is non-zero in at least $d_1\ge(1-R)N\ge(1-R-\varepsilon)N$ many places. Note that $\mathbf{c}^1$ is vacuously independent of the "previous" codewords in these positions. Now, say that the procedure has chosen codewords $\mathbf{c}^1,\ldots,\mathbf{c}^s$ such that the tuple is $(\mathbf{d}',\mathbb{F}_q)$-independent for $\mathbf{d}'=\langle d_1,\ldots,d_s\rangle$, where for every $1\le j\le s$, $d_j\ge(1-R-\varepsilon)N$. For every $1\le i\le N$, define $S_i$ to be the $\mathbb{F}_q$-span of the vectors $\{c_i^1,\ldots,c_i^s\}$ in $\mathbb{F}_q^k$. Note that $|S_i|\le q^s$. Call $\mathbf{c}=(c_1,\ldots,c_N)\in C$ a bad codeword if there does not exist any $d_{s+1}\ge(1-R-\varepsilon)N$ such that $(\mathbf{c}^1,\ldots,\mathbf{c}^s,\mathbf{c})$ is $(\mathbf{d},\mathbb{F}_q)$-independent for $\mathbf{d}=\langle d_1,\ldots,d_s,d_{s+1}\rangle$.
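The profile $\mathbf{d}$ of Definition 5.1 can be computed position by position with incremental Gaussian elimination. The following Python sketch is ours and only illustrative, over $\mathbb{F}_2$, with the symbols of $\mathbb{F}_{2^k}$ written as $k$-bit masks:

```python
def independence_profile(tuple_of_codewords):
    """Compute the profile d = (d_1, ..., d_J) of Definition 5.1 over
    GF(2): d_j counts positions i where the j-th codeword's symbol (a bit
    mask, viewed as a vector over GF(2)) lies outside the GF(2)-span of
    the first j-1 codewords' symbols at position i."""
    J = len(tuple_of_codewords)
    N = len(tuple_of_codewords[0])
    d = [0] * J
    spans = [{} for _ in range(N)]        # per-position xor bases
    for j in range(J):
        for i in range(N):
            cur = tuple_of_codewords[j][i]
            while cur:
                b = cur.bit_length() - 1
                if b not in spans[i]:
                    d[j] += 1             # symbol is outside the span
                    spans[i][b] = cur
                    break
                cur ^= spans[i][b]
    return d

# Toy example (symbols in GF(2^2) written as 2-bit masks).
c1 = [1, 2, 0, 3]
c2 = [2, 2, 1, 3]
print(independence_profile([c1, c2]))     # -> [3, 2]
```

Consistent with the definition, $d_1$ is simply the Hamming weight of the first codeword, and every tuple determines a unique profile $\mathbf{d}$.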
In other words, $\mathbf{c}$ is a bad codeword if and only if some set $E\subseteq[N]$ with $|E|\ge(R+\varepsilon)N$ satisfies $c_i\in S_i$ for every $i\in E$. Put differently, $\mathbf{c}$ satisfies the condition of being in the output list for list recovering $C$ with input sets $S_1,\ldots,S_N$ and agreement fraction $R+\varepsilon$. Thus, by Theorem 3.6, the number of such bad codewords is at most $\ell_0$. Thus, as long as at each step there are strictly more than $\ell_0$ codewords from the original $L$-tuple of codewords left, we can continue this greedy procedure. Note that we can continue this procedure $J$ times, as long as $J\le L/\ell_0$. The proof is complete.

Finally, we will need a bound on the number of independent tuples for folded Reed-Solomon codes.

Lemma 5.6. Let $C$ be a folded Reed-Solomon code of block length $N$ and rate $0<R<1$ that is defined over $\mathbb{F}_Q$, where $Q=q^k$. Let $J\ge1$ and $0\le d_1,\ldots,d_J\le N$ be integers, and define $\mathbf{d}=\langle d_1,\ldots,d_J\rangle$. Then the number of $(\mathbf{d},\mathbb{F}_q)$-independent tuples in $C$ is at most

$$2^{NJ}\prod_{j=1}^{J}q^{(j-1)(N-d_j)}\,Q^{\max(d_j-(1-R)N,\,0)+1}.$$

Proof. Given a tuple $(\mathbf{c}^1,\ldots,\mathbf{c}^J)$ that is $(\mathbf{d},\mathbb{F}_q)$-independent, define $E_j\subseteq[N]$ with $|E_j|=d_j$, for $1\le j\le J$, to be the set of positions $i$ where $c_i^j$ is linearly independent of the values $\{c_i^1,\ldots,c_i^{j-1}\}$. We will estimate the number of $(\mathbf{d},\mathbb{F}_q)$-independent tuples by first estimating a bound $U_j$ on the number of choices for the $j$th codeword in the tuple (given a fixed choice of the first $j-1$ codewords). To complete the proof, we will show that

$$U_j\le2^N\,q^{(j-1)(N-d_j)}\,Q^{\max(d_j-(1-R)N,\,0)+1}.$$

A codeword $\mathbf{c}\in C$ can be the $j$th codeword in the tuple in the following way. For every position in $[N]\setminus E_j$, the value of $\mathbf{c}$ can take at most $q^{j-1}$ values (as in these positions the value has to lie in the $\mathbb{F}_q$-span of the values of the first $j-1$ codewords in that position).
Since $C$ is folded Reed-Solomon, once we fix the values at positions in $[N]\setminus E_j$, the codeword will be completely determined once any $\max\big(RN-(N-d_j),0\big)+1=\max\big(d_j-(1-R)N,0\big)+1$ positions in $E_j$ are chosen (w.l.o.g. assume that they are the "first" so many positions). The number of choices for $E_j$ is $\binom{N}{d_j}\le2^N$. Thus, we have

$$U_j\le2^N\,q^{(j-1)(N-d_j)}\,Q^{\max(d_j-(1-R)N,\,0)+1},$$

as desired.

5.5.2 The Main Result

We will now prove the following result.

Theorem 5.2. Let $q$ be a prime power and let $0<r<1$ be an arbitrary rational. Let $0<\varepsilon<\alpha_q(r)$ be an arbitrary real, where $\alpha_q(r)$ is as defined in (5.1), and let $0<R\le(\alpha_q(r)-\varepsilon)/r$ be a rational. Then the following holds for large enough integers $n,N$ for which there exist integers $k$ and $K$ that satisfy $k=rn$ and $K=RkN$. Let $C_{out}$ be a folded Reed-Solomon
¯ -independent. For the rest of the proof, we will call a Z Ò -tuple of 1\b codewords WÉ !¥¤¥¤¤¥!fÉ Ô a good tuple if it is 8! ¯ -independent for some È KzÆ TCZ!¥¤¥¤¤!NT Ô , where T p`GI4QI8YÁZ~ for every GsE .ECÒ . åZ Z Define @L² ¯ GÐIF4 G>YI´ . For every good Ò -tuple of 1 b codewords WÉ !¥¤¥¤¥¤!NÉ Ô Z and received word d+r"t¯ , define an indicator variable ^d8!NÉ !¥¤¤¥¤!NÉèÔ> as follows. /d8!NÉ Z !¥¤¤¥¤!NÉèÔ>oùG if and only if for every GKE xE Ò , :_WÉ íöI,d.E @>~ . That is, it captures the bad event that we want to avoid. Define 1 1 1 Y Y O À ü [¾ Z ;: : !]S^ ] ; Ù |Ù Ø ß Ù good O À We want to show that with high probability O À would follow if we can show that: ã O À ¡" ü Ø ¾ Z Ù |Ù Y ^d8!NÉ ü ü ;: : ]S^ ] ß Ù good À ß ã( Y /d8!NÉ ß Z !¥¤¥¤¤¥!fÉ Ô \¤ . By Markov’s inequality, the theorem Z !¥¤¥¤¥¤!NÉ Ô å ½¡úEt tµ ¤ (5.15) Z a final bit of notation. For a good tuple WÉ !¥¤¥¤¥¤!NÉ Ô WÉ Z !¥¤¥¤¥¤!NÉ Ô # ~u¡ to be the set of positions Í such åZ Z set ¢XÉ Î !¥¤¥¤¥¤!NÉ Î ¦ . Note that since the tuple is good, · H É Z !¥¤¥¤¥¤!NÉ Ô ·cp`GIF4IãYúZ~ . Let [q@´~ . 
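The quantity tracked throughout this argument, namely the largest number of codewords in any Hamming ball of a given radius, can be computed by brute force for toy codes. The sketch below is purely illustrative: the [7,4] binary Hamming code stands in for the codes analyzed here, and the routine works for any small binary code.

```python
from itertools import product

def hamming(x, y):
    """Hamming distance between two equal-length tuples."""
    return sum(a != b for a, b in zip(x, y))

def max_list_size(code, radius):
    """Largest number of codewords in any Hamming ball of the given radius,
    over all centers in {0,1}^n; a binary code is (radius, L)-list decodable
    exactly when this quantity is at most L."""
    n = len(next(iter(code)))
    return max(sum(hamming(c, y) <= radius for c in code)
               for y in product((0, 1), repeat=n))

# Toy stand-in: the [7,4] binary Hamming code, built from a generator matrix.
G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]
code = {tuple(sum(row[i] for row, bit in zip(G, msg) if bit) % 2
              for i in range(7))
        for msg in product((0, 1), repeat=4)}
```

Since the [7,4] Hamming code is perfect with minimum distance 3, every radius-1 ball contains exactly one codeword, so `max_list_size(code, 1)` returns 1.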
Consider the following sequence of inequalities, (5.16)-(5.23), where below we suppress the dependence of T_j on (u^1, ..., u^s) for clarity. Starting from

    E[X_C] = Σ_{y} Σ_{good (u^1, ..., u^s)} Pr[ Δ(C(u^j), y) ≤ ρnN for every 1 ≤ j ≤ s ],   (5.16)

we first restrict each of the s events to the positions in T_j (5.17), then write the probability as a product of conditional probabilities, one for each j (5.18), bound each conditional probability using Lemma 5.1 (5.19), rearrange the summand and use the fact that the tuple is good (5.20), bound the number of choices for y and invoke Lemma 5.6 (5.21), compare the two exponents of Q (5.22), and finally rearrange the terms (5.23).

In the above, (5.16) follows from the definition of the indicator variable. (5.17) follows from the simple fact that for every vector u of length N and every T ⊆ [N], Δ(u, y) ≥ Δ(u_T, y_T). (5.18) follows by an argument similar to the one used to argue (5.9) from (5.8) in the proof of Theorem 5.1: basically, we need to write out the probability as a product of conditional probabilities and then use the facts that the tuple (u^1, ..., u^s) is good and the choices for M_1, ..., M_N are independent. (In Theorem 5.1 the tuple of codewords was not ordered, while the tuples here are ordered; however, it is easy to check that the argument in Theorem 5.1 also works for ordered tuples as long as the induction is applied in the right order.) (5.19) follows from Lemma 5.1. (5.20) follows from rearranging the summand and using the fact that the tuple is good (and hence d_j ≥ (1 − R − γ)N). (5.21) follows from the fact that there are q^{nN} choices for y (as the final code will be linear, it would suffice to only look at received words of Hamming weight at most ρnN, but this gives a negligible improvement to the final result, so we simply bound the number of choices for y by q^{nN}) and Lemma 5.6. (5.22) follows from the facts that d_j − (1 − R)N + 1 ≤ d_j − (1 − R − γ)N (for N ≥ 1/γ) and that d_j ≥ (1 − R − γ)N. (5.23) follows by rearranging the terms.
Now, (5.23) will imply (5.15) (along with the fact that H_q^{-1}(·) is increasing) if we can show that, for every (1 − R − γ)N ≤ d ≤ N, the error fraction induced by ρ on the corresponding d positions is smaller, by a margin of δ = Θ(ε), than the corresponding value of H_q^{-1}. Thus, by Lemma 5.4 (and using the arguments from the proof of Theorem 5.1 to show that the conditions of Lemma 5.4 are satisfied), we can select s in Θ(1/ε) (and γ in Θ(ε(1 − R))), and pick ρ = H_q^{-1}(1 − Rr) − ε, as desired. This, along with Lemma 5.5, implies that we can set

    L = (N/γ)^{O(s · γ^{-1} · log(1/R))},

as required. Using arguments similar to those in the proof of Theorem 5.1, one can show that the code C_out ∘ (C^1_in, ..., C^N_in) with high probability has rate rR.

Remark 5.2. The idea of using list recoverability to argue independence can also be used to prove Theorem 5.1. That is, first show that with good probability a random linear outer code will have good list recoverability; then the argument in this section can be used to prove Theorem 5.1. However, this gives worse parameters than the proof presented in Section 5.4. In particular, by a straightforward application of the probabilistic method, one can show that a random linear code of rate R over F_Q is (R + γ, ℓ, Q^{O(ℓ/γ)})-list recoverable [49, Sec 9.3.2]. In the proof of Theorem 5.2, the relevant input lists have size ℓ roughly q^s, where s is roughly 1/ε.
Thus, if we used the arguments in the proof of Theorem 5.2, we would be able to prove Theorem 5.1, but with lists of size roughly Q^{O(q^{1/ε})}, which is worse than the list size guaranteed by Theorem 5.1.

5.6 Bibliographic Notes and Open Questions

The material presented in this chapter appears in [61]. Theorem 5.1 in some sense generalizes the following result of Blokh and Zyablov [19]. Blokh and Zyablov show that the concatenated code where both the outer and inner codes are chosen to be random linear codes with high probability lies on the Gilbert-Varshamov bound; that is, its relative minimum distance is (at least) H_q^{-1}(1 − R) for rate R. The arguments used in this chapter also generalize Thommesen's proof that concatenated codes obtained by setting Reed-Solomon codes as outer codes and independent random inner codes lie on the Gilbert-Varshamov bound [102]. In particular, by taking y to be the all-zero vector and s = 1 in the proof of Theorem 5.2, one can recover Thommesen's proof. Note that when s = 1, a codeword u is ((d), F_q)-independent precisely when its Hamming weight is d. Thus, the proof of Thommesen only required knowledge of the weight distribution of the Reed-Solomon code. However, for our purposes we need the stronger form of independence in the proof of Theorem 5.2, for which we used the strong list-recoverability property of folded Reed-Solomon codes.

Theorem 5.2 leads to the following intriguing possibility.

Open Question 5.1. Can one list decode the concatenated codes from Theorem 5.2 up to the fraction of errors for which Theorem 5.2 guarantees them to be list decodable (with high probability)?

Current list-decoding algorithms for concatenated codes work in two stages. In the first stage, the inner code(s) are list decoded, and in the second stage the outer code is list recovered (for example, see Chapter 4). In particular, the fact that in these algorithms the first stage is oblivious to the outer code seems to be a bottleneck.
Somehow “merging” the two stages might lead to a positive resolution of the question above.

Chapter 6

LIMITS TO LIST DECODING REED-SOLOMON CODES

6.1 Introduction

In Chapters 3 and 4 we were interested in the following question: Can one construct explicit codes, along with efficient list-decoding algorithms, that correct errors up to the list-decoding capacity? Note that in the question above, we have the freedom to pick the code. In this chapter, we turn the question around by focusing on a fixed code and then asking for the best possible tradeoff between rate and the fraction of errors (that can be corrected via efficient list decoding) for the given code. We will primarily focus on Reed-Solomon codes.

Reed-Solomon codes are an important and extensively studied family of error-correcting codes. The codewords of a Reed-Solomon code (henceforth, RS code) over a field F are obtained by evaluating low-degree polynomials at distinct elements of F. The rate versus distance tradeoff for Reed-Solomon codes meets the Singleton bound, which, along with the code's nice algebraic properties, gives RS codes a prominent place in coding theory. As a result, the problem of decoding RS codes has received much attention. As we already saw in Section 3.1, in terms of the fraction of errors corrected, the best known polynomial time list-decoding algorithm today can, for Reed-Solomon codes of rate R, correct up to a 1 − √R fraction of errors ([97, 63]). The performance of the algorithm in [63] matches the so-called Johnson bound (cf. [64]), which gives a general lower bound on the number of errors one can correct using small lists in any code, as a function of the distance of the code.

As we saw in Chapter 3, there are explicit codes known that have better list-decoding properties than Reed-Solomon codes. However, Reed-Solomon codes have been instrumental in all the algorithmic progress in list decoding (see Section 3.1 for more details on these developments).
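For concreteness, the gap between the Johnson-bound radius 1 − √R attained by the algorithms of [97, 63] and the information-theoretic limit 1 − R (the list-decoding capacity over large alphabets) can be tabulated numerically; a small illustrative sketch:

```python
from math import sqrt

def johnson_radius(R):
    """Fraction of errors handled by the Guruswami-Sudan algorithm for a
    Reed-Solomon code of rate R (matches the Johnson bound)."""
    return 1.0 - sqrt(R)

def capacity_radius(R):
    """List-decoding capacity over large alphabets: 1 - R."""
    return 1.0 - R

# The gap sqrt(R) - R is strictly positive for every rate 0 < R < 1.
gaps = {R: capacity_radius(R) - johnson_radius(R)
        for R in (0.05, 0.25, 0.5, 0.75)}
```

For example, at rate R = 0.25 the Johnson radius is 0.5 while capacity allows correcting a 0.75 fraction of errors.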
In addition, Reed-Solomon codes have important practical applications. Thus, given the significance (both theoretical and practical) of Reed-Solomon codes, it is an important question to pin down the optimal tradeoff between the rate and list decodability of Reed-Solomon codes.

This chapter is motivated by the question of whether the Guruswami-Sudan result is the best possible (i.e., whether the Johnson bound is "tight" for Reed-Solomon codes). By this we mean whether attempting to decode with a larger error parameter might lead to super-polynomially large lists as output, which of course would preclude a polynomial time algorithm. While we don't quite show this to be the case, we give evidence in this direction by demonstrating that in the more general setting of list recovery (to which the algorithm of Guruswami and Sudan [63] also applies) its performance is indeed the best possible. We also present constructions of explicit "bad list-decoding configurations" for Reed-Solomon codes. The details follow.

6.2 Overview of the Results

6.2.1 Limitations to List Recovery

The algorithm in [63] in fact solves the following more general polynomial reconstruction problem in polynomial time: Given n' distinct pairs (α_i, y_i) ∈ F², output a list of all polynomials p of degree at most k that satisfy p(α_i) = y_i for more than √(kn') values of i ∈ {1, 2, ..., n'} (we stress that the α_i's need not be distinct). In particular, the algorithm can solve the list recovery problem (see Definition 2.4). As a special case, it can solve the following "error-free" or "noiseless" version of the list recovery problem.

Definition 6.1 (Noiseless List Recovery). For a q-ary code C of block length n, the noiseless list recovery problem is the following. We are given a set S_i ⊆ F_q of possible symbols for the i'th symbol for each position i, 1 ≤ i ≤ n, and the goal is to output all codewords c = (c_1, ..., c_n) such that c_i ∈ S_i for every i.
When each S_i has at most ℓ elements, we refer to the problem as noiseless list recovery with input lists of size ℓ. Note that if a code C is (1, ℓ, L)-list recoverable, then L is the worst-case output list size when one solves the noiseless list recovery problem on C with input lists of size ℓ. The Guruswami-Sudan algorithm [63] can solve the noiseless list recovery problem for Reed-Solomon codes with input lists of size ℓ < n/k in polynomial time. That is, Reed-Solomon codes are (1, n/k − 1, f(n))-list recoverable for some polynomially bounded function f(·). In Section 6.3, we demonstrate that this latter performance is the best possible with surprising accuracy — specifically, we show that when ℓ = n/k, there are settings of parameters for which the list of output polynomials needs to be super-polynomially large in n (Theorem 6.3). In fact, our result also applies to the model considered by Ar et al. [3], where the input lists are "mixtures of codewords." In particular, in their model the lists at every position take values from a collection of ℓ fixed codewords. As a corollary, this rules out an efficient solution to the polynomial reconstruction problem that works even under the slightly weaker condition on the agreement parameter: t > √(kn') − k/2. (This in turn rules out, for every δ ≤ k/2, a solution to the polynomial reconstruction problem that works as long as t > √(kn') − δ.) In this respect, the "square root" bound achieved by [63] is optimal, and any improvement to their list-decoding algorithm which works with agreement fraction t/n < √R, where R = (k + 1)/n is the rate of the code, or in other words one that works beyond the Johnson bound, must exploit the fact that the evaluation points α_i are distinct (or "almost distinct").

While this part on the tightness of the Johnson bound remains speculative at this stage, for the problem of list recovery itself our work proves that RS codes are indeed sub-optimal, as we describe below.
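Definition 6.1 is easy to realize by brute force for a toy code; the sketch below (the field size, degree bound, and input lists are this example's own choices) recovers all polynomials of degree at most 1 over F_5 consistent with the given lists:

```python
from itertools import product

q = 5                      # toy prime field F_5
alphas = list(range(q))    # evaluation points of a full-length RS code
k = 1                      # degree bound

def noiseless_list_recover(lists):
    """All degree-<=k polynomials (coefficient tuples, constant term first)
    whose value at alphas[i] lies in lists[i] for every position i."""
    out = []
    for coeffs in product(range(q), repeat=k + 1):
        vals = [sum(c * pow(a, j, q) for j, c in enumerate(coeffs)) % q
                for a in alphas]
        if all(v in S for v, S in zip(vals, lists)):
            out.append(coeffs)
    return out

# With input lists {0, 1} everywhere, only the constants 0 and 1 survive:
# a non-constant linear polynomial is a bijection of F_5, so its image
# cannot fit inside a 2-element list.
small = noiseless_list_recover([{0, 1}] * q)
full = noiseless_list_recover([set(range(q))] * q)
```

With the full field as every input list, all q^{k+1} = 25 polynomials are output; the constant codewords always qualify, which is the observation used for the mixture-of-codewords model above.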
By our work, Reed-Solomon codes that are list recoverable with input lists of size ℓ must have rate at most 1/ℓ. On the other hand, Guruswami and Indyk [52] prove that there exists a fixed R > 0 (in fact, R can be close to 1) such that for every integer ℓ there are codes of rate R which are list recoverable given input lists of size ℓ (the alphabet size and output list size will necessarily grow with ℓ, but the rate itself is independent of ℓ). Note that in Chapter 3, we showed that folded Reed-Solomon codes are explicit list recoverable codes with optimal rate.

6.2.2 Explicit "Bad" List Decoding Configurations

The result mentioned above presents an explicit bad list recovery configuration, i.e., an input instance to the list recovery problem with a super-polynomial number of solutions. To prove results on limitations of list decoding, such as the tightness of the Johnson bound, we need to demonstrate a received word y with super-polynomially many codewords that agree with y in t or more places. A simple counting argument establishes the existence of such received words that have agreement t with (n choose t)·q^{k+1−t} many codewords [70, 25]. In particular, this implies the following for q = Θ(n). For k = n^δ with δ < 1 (in which case we say that the Reed-Solomon code has low rate), one can get t = c·k for a constant c > 1; and for k in Θ(n) (in which case we say that the Reed-Solomon code has high rate), one can get t = k + ω(1). In Section 6.4.2, we demonstrate an explicit construction of such a received word with a super-polynomial number of codewords with agreement t up to (2 − ε)k (for any ε > 0), where k = n^δ for any δ < 1. Note that such a construction is trivial for t ≤ k + 1, since we can interpolate degree-k polynomials through any set of at most k + 1 points. In Section 6.4.3, we demonstrate an explicit construction of such a received word with a super-polynomial number of codewords with agreement t up to k + O(log n), when k is Θ(n).

In general, the quest for explicit constructions of this sort (namely, small Hamming balls with several codewords) is well motivated. If achieved with appropriate parameters, they will lead to a derandomization of the inapproximability result for computing the minimum distance of a linear code [32]. However, for this application it is important to get super-polynomially many codewords in a ball of radius ρ times the distance of the code for some constant ρ < 1. Unfortunately, neither of our explicit constructions achieves ρ smaller than 1 − o(1). As another motivation, we point out that the current best trade-off between rate and relative distance (for a code over a constant-sized alphabet) is achieved by a non-linear code comprising precisely a bad list-decoding configuration in certain algebraic-geometric codes [107]. Unfortunately, the associated received word is only shown to exist by a counting argument, and its explicit specification will be required to get explicit codes with these parameters.

6.2.3 Proof Approach

We show our result on list recovering Reed-Solomon codes by proving a super-polynomial (in n = q^m) bound on the number of polynomials over F_{q^m} of degree k that take values in F_q at every point in F_{q^m}, for any prime power q, where k is roughly q^{m−1}. Note that this implies that there can be a super-polynomial number of solutions to list recovery when input list sizes are q. We establish this bound on the number of such polynomials by exploiting a folklore connection of such polynomials to a classic family of cyclic codes called BCH codes, followed by an (exact) estimation of the size of BCH codes with certain parameters. We also write down an explicit collection of polynomials, obtained by taking F_q-linear combinations of translated norm functions, all of which take values only in F_q. By the BCH bound, we conclude that this in fact is a precise description of the collection of all such polynomials.
Our explicit construction of a received word y with several RS codewords (for low-rate RS codes) with non-trivial agreement with y is obtained using ideas from [25] relating to representations of elements in an extension finite field by products of distinct linear factors. Our explicit construction for high-rate RS codes is obtained by looking at cosets of certain prime fields.

6.3 BCH Codes and List Recovering Reed-Solomon Codes

6.3.1 Main Result

We will work with polynomials over F_{q^m} of characteristic p, where q is a power of the prime p, and m ≥ 1. Our goal in this section is to prove the following result, and in Section 6.3.2 we will use it to state corollaries on limits to the list decodability of Reed-Solomon codes. (We will only need a lower bound on the number of polynomials with the stated property, but the result below in fact gives an exact estimate, which in turn is used in Section 6.3.4 to give a precise characterization of the concerned polynomials.)

Theorem 6.1. Let q be a prime power, and let m ≥ 1 be an integer. Then, the number of univariate polynomials in F_{q^m}[X] of degree at most (q^m − 1)/(q − 1) which take values in F_q when evaluated at every point in F_{q^m} is exactly q^{2^m}. That is,

    |{ P ∈ F_{q^m}[X] : deg(P) ≤ (q^m − 1)/(q − 1) and P(γ) ∈ F_q for all γ ∈ F_{q^m} }| = q^{2^m}.

In the rest of this section, we prove Theorem 6.1. The proof is based on a connection of polynomials with the stated property to a family of cyclic codes called BCH codes, followed by an estimation of the size (or dimension) of the associated BCH code. The latter estimation itself uses only basic algebra; in particular, one can prove Theorem 6.1 using finite field theory and the Fourier transform without resorting to coding terminology. However, the connection to BCH codes is well known, and we use this body of prior work to modularize our presentation.

We begin with the definition of BCH codes. (What we define are actually referred to more specifically as narrow-sense primitive BCH codes, but we will just use the term BCH codes for them.) We point the reader to [80], Ch. 7, Sec. 6, and Ch. 9, Secs. 1-3, for detailed background information on BCH codes.
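Theorem 6.1 can be checked by brute force in the smallest non-degenerate case q = 3, m = 2 (for q = 2 the degree bound equals q^m − 1, so every F_2-valued function on F_{2^m} trivially qualifies). The sketch below models F_9 as F_3[i] with i² = −1; that representation, and the exhaustive search itself, are choices made for this example:

```python
from itertools import product

# F_9 modeled as F_3[i] with i^2 = -1; an element is a pair (a, b) = a + b*i.
F9 = [(a, b) for a in range(3) for b in range(3)]
add = lambda x, y: ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)
mul = lambda x, y: ((x[0] * y[0] - x[1] * y[1]) % 3,
                    (x[0] * y[1] + x[1] * y[0]) % 3)
F3 = {(a, 0) for a in range(3)}          # the subfield F_3 inside F_9

def takes_values_in_subfield(coeffs):
    """Does the polynomial with these coefficients (constant term first)
    map every point of F_9 into F_3?"""
    for x in F9:
        acc = (0, 0)
        for c in reversed(coeffs):       # Horner evaluation
            acc = add(mul(acc, x), c)
        if acc not in F3:
            return False
    return True

# Theorem 6.1 with q = 3, m = 2: degree bound (3^2 - 1)/(3 - 1) = 4, so 5
# coefficients, and the predicted count is q^(2^m) = 3^4 = 81.
count = sum(takes_values_in_subfield(cs) for cs in product(F9, repeat=5))
```

Since the degree bound 4 is less than |F_9| = 9, distinct coefficient tuples give distinct functions, so the exhaustive count over all 9^5 tuples matches the theorem's q^{2^m}.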
Definition 6.2. Let γ be a primitive element of F_{q^m}, and let n = q^m − 1. The BCH code BCH_{q,m,d} of designed distance d is a linear code of block length n over F_q defined as:

    BCH_{q,m,d} = { (c_0, c_1, ..., c_{n−1}) ∈ F_q^n : c(γ^i) = 0 for i = 1, 2, ..., d − 1, where c(X) = c_0 + c_1 X + ... + c_{n−1} X^{n−1} ∈ F_q[X] }.

We will omit one or more of the subscripts in BCH_{q,m,d} for notational convenience when they are clear from the context.

In our proof, we will use the following well-known result. For the sake of completeness, we present its proof here.

Lemma 6.1 (BCH codes are subfield subcodes of RS codes). Let q be a prime power and m ≥ 1 an integer. Let n = q^m − 1, let d be an integer in the range 1 < d < n, and let γ be a primitive element of F_{q^m}. Then the set of codewords of BCH_{q,m,d} may be written as

    { (P(1), P(γ), ..., P(γ^{n−1})) ∈ F_q^n : P ∈ F_{q^m}[X], deg(P) ≤ n − d, and P(β) ∈ F_q for all β ∈ F_{q^m} }.

Proof. Our goal is to prove that the two sets

    S_1 = { (c_0, c_1, ..., c_{n−1}) : c(γ^i) = 0 for i = 1, 2, ..., d − 1, where c(X) = c_0 + c_1 X + ... + c_{n−1} X^{n−1} ∈ F_q[X] },
    S_2 = { (P(1), P(γ), ..., P(γ^{n−1})) : P ∈ F_{q^m}[X], deg(P) ≤ n − d, and P(β) ∈ F_q for all β ∈ F_{q^m} }

are identical. We will do so by showing both the inclusions S_2 ⊆ S_1 and S_1 ⊆ S_2.

We begin with showing S_2 ⊆ S_1. Let P(X) = Σ_{j=0}^{n−d} a_j X^j ∈ F_{q^m}[X] be a polynomial of degree at most n − d that takes values in F_q. Then, for i = 1, 2, ..., d − 1, we have

    Σ_{ℓ=0}^{n−1} P(γ^ℓ) γ^{ℓi} = Σ_{ℓ=0}^{n−1} Σ_{j=0}^{n−d} a_j γ^{ℓj} γ^{ℓi} = Σ_{j=0}^{n−d} a_j Σ_{ℓ=0}^{n−1} (γ^{i+j})^ℓ = 0,

where in the last step we use that Σ_{ℓ=0}^{n−1} β^ℓ = 0 for every β ∈ F_{q^m} \ {1}, and γ^{i+j} ≠ 1 since 1 ≤ i + j ≤ n − 1 and γ is primitive. Therefore, (P(1), P(γ), ..., P(γ^{n−1})) ∈ S_1.

We next proceed to show the inclusion S_1 ⊆ S_2. Suppose (c_0, c_1, ..., c_{n−1}) ∈ S_1. For 0 ≤ j ≤ n − 1, define (this is the "inverse Fourier transform")

    a_j = (1/n) Σ_{ℓ=0}^{n−1} c_ℓ γ^{−ℓj},

where by 1/n we mean the multiplicative inverse of n·1 in the field F_{q^m}. Note that a_j = (1/n)·c(γ^{−j}), where c(X) = Σ_{i=0}^{n−1} c_i X^i. So, by the definition of S_1, it follows that a_j = 0 for j > n − d. Therefore the polynomial P ∈ F_{q^m}[X] defined by

    P(X) = Σ_{j=0}^{n−1} a_j X^j = Σ_{j=0}^{n−d} a_j X^j

has degree at most n − d. We now claim that P(γ^ℓ) = c_ℓ for 0 ≤ ℓ ≤ n − 1. Indeed,

    P(γ^ℓ) = Σ_{j=0}^{n−1} a_j γ^{jℓ} = Σ_{j=0}^{n−1} ( (1/n) Σ_{i=0}^{n−1} c_i γ^{−ij} ) γ^{jℓ} = (1/n) Σ_{i=0}^{n−1} c_i Σ_{j=0}^{n−1} γ^{j(ℓ−i)} = c_ℓ,

where in the last step we used the fact that Σ_{j=0}^{n−1} γ^{j(ℓ−i)} = 0 whenever i ≠ ℓ, and equals n·1 when i = ℓ. Therefore, (c_0, c_1, ..., c_{n−1}) = (P(1), P(γ), ..., P(γ^{n−1})). We are pretty much done, except that we have to check also that P(0) ∈ F_q (since we wanted P(β) ∈ F_q for all β ∈ F_{q^m}, including β = 0). Note that P(0) = a_0 = (1/n) Σ_{i=0}^{n−1} c_i. Since n = q^m − 1, we have n·1 = −1 in F_{q^m}, and so 1/n = −1 ∈ F_q. This, together with the fact that c_i ∈ F_q for every i, implies that P(0) ∈ F_q as well, completing the proof.

In light of the above lemma, in order to prove Theorem 6.1, we have to prove that |BCH_{q,m,d}| = q^{2^m} when d = (q^m − 1) − (q^m − 1)/(q − 1). We turn to this task next. We begin with the following bound on the size of BCH codes [15, Ch. 12]. For the sake of completeness, we also give a proof sketch.

Lemma 6.2 (Dimension of BCH Codes). For integers i and j, let [i]_j be a shorthand for i mod j. Define

    S(q, n, d) = { i : 0 ≤ i ≤ n − 1 and [i·q^j]_n ≤ n − d for all j, 0 ≤ j ≤ m − 1 }.   (6.1)

Then |BCH_{q,m,d}| = q^{|S(q,n,d)|} for n = q^m − 1. (Note that for this value of n, if i = i_0 + i_1 q + ... + i_{m−1} q^{m−1}, then [i·q]_n = i_{m−1} + i_0 q + i_1 q² + ... + i_{m−2} q^{m−1}, and so [i·q^j]_n is obtained by a simple cyclic shift of the q-ary representation of i.)

Proof. It follows from Definition 6.2 that the BCH codewords are simply the polynomials c(X) over F_q of degree at most n − 1 that vanish at γ^i for 1 ≤ i ≤ d − 1. Note that if c(X), c'(X) are two such polynomials, then so is c(X) + c'(X). Moreover, since γ^n = 1, X·c(X) mod (X^n − 1) also vanishes at each designated γ^i.
It follows that if c(X) is a codeword, then so is g(X)·c(X) mod (X^n − 1) for every polynomial g(X) ∈ F_q[X]. In other words, BCH_{q,m,d} is an ideal in the quotient ring R = F_q[X]/(X^n − 1). It is well known that R is a principal ideal ring, i.e., a ring in which every ideal is generated by one element [77, Chap. 1, Sec. 3]. Therefore there is a unique monic polynomial g(X) ∈ F_q[X] such that

    BCH_{q,m,d} = { g(X)·h(X) : h(X) ∈ F_q[X] with deg(h) ≤ n − 1 − deg(g) }.

It follows that |BCH_{q,m,d}| = q^{n − deg(g)}, and so it remains to prove that deg(g) = n − |S(q, n, d)|, where S(q, n, d) is defined as in (6.1). It is easily argued that g(X) is the monic polynomial of lowest degree over F_q that has γ^i, for every i with 1 ≤ i ≤ d − 1, as roots. It is well known ([80, Chap. 7, Sec. 5]) that g(X) is then given by

    g(X) = ∏_{s ∈ C(1) ∪ C(2) ∪ ... ∪ C(d−1)} (X − γ^s),

where C(i) is the cyclotomic coset of i. (In other words, C(i) = { i, [i·q]_n, [i·q²]_n, ..., [i·q^{ℓ_i − 1}]_n }, where ℓ_i is the smallest integer such that [i·q^{ℓ_i}]_n = i.) For ease of notation, define C_{1,d} = C(1) ∪ C(2) ∪ ... ∪ C(d−1). To complete the proof we will show that

    |C_{1,d}| = n − |S(q, n, d)|.   (6.2)

To prove (6.2), we claim that for every 0 ≤ i ≤ n − 1, i ∈ C_{1,d} if and only if n − i ∉ S(q, n, d). To see that this is true, note that n − i ∉ S(q, n, d) if and only if there is a j, 0 ≤ j < m, such that [(n − i)·q^j]_n > n − d; in other words, [i·q^j]_n = d' for some 0 < d' < d. This implies that n − i ∉ S(q, n, d) if and only if i ∈ C(d') ⊆ C_{1,d}, which proves the claim.

Let us now use the above to compute the size of BCH_{q,m,d} where d = (q^m − 1) − (q^m − 1)/(q − 1). We need to compute the quantity |S(q, n, d)|, i.e., the number of i, 0 ≤ i < q^m − 1, such that [i·q^j]_n ≤ (q^m − 1)/(q − 1) = 1 + q + ... + q^{m−1} for each j = 0, 1, ..., m − 1. This condition is equivalent to saying that if i = i_0 + i_1 q + ... + i_{m−1} q^{m−1} is the q-ary expansion of i, then all the integers whose q-ary representations are cyclic shifts of (i_0, i_1, ..., i_{m−1}) are at most 1 + q + ... + q^{m−1}. Clearly, this condition is satisfied if and only if for each j = 0, 1, ..., m − 1, i_j ∈ {0, 1}.
There are 2^m choices for i with this property, and hence we conclude that |S(q, n, d)| = 2^m when d = (q^m − 1) − (q^m − 1)/(q − 1).
Thus, only Í ; satisfies the required then the / ILG-I#U th shift with be at least $ condition, which implies claimed bound in the theorem. 6.3.2 Implications for Reed-Solomon List Decoding In the result of Theorem 6.1, if we imagine keeping *Q fixed and letting grow, then for the choice [t©$ and a÷ ©$lIQGN6C JIG (so that L ), Theorem 6.1 immediately ' gives us the following “negative” result on polynomial reconstruction algorithms and ReedSolomon list decoding.4 Theorem 6.3. For every prime power S2 , there exist infinitely many pairs of integers "! such that q for which there are Reed-Solomon codes of dimension ÑnLG and ' block length , such that noiselessly list recovering them with input lists of size requires ' super-polynomial (in fact e ¾ ) output list size. SÓ[Ô Õ The above result is exactly tight in the following sense. It is easy to argue combinatori ally (via the “Johnson type” bounds, cf. [64]) that when RÛ , the number of codewords 5 [Ñ We remark that we used the notation ~ . will take 4 5 Ñ ~ ' u t in the previous subsection, but for this Subsection we 98 is polynomially bounded. Moreover [63] presents a polynomial time algorithm to recover all the solution codewords in this case. As was mentioned in the introduction, our results also show the tightness of noiselessly list recovering Reed-Solomon codes in the special setting of Ar, Lipton, Rubinfeld and Sudan [3]. One of the problems considered in [3] is that of noiselessly list recovering Reed-Solomon codes with list size , when the set ¶jÎ at every position Í is the set of values of fixed codewords at position Í . Note that our lower bound also works in this restricted model if one takes the fixed codewords to be the constant codewords. The algorithm in [63] solves the more general problem of finding all polynomials of degree at most which agree with at least _ out of ú distinct pairs vÎ!ZY£ÎW whenever _} . 
The following corollary states that, in light of Theorem 6.3, this is essentially the best possible trade-off one can hope for from such a general algorithm. We view this as providing the message that a list-decoding algorithm for Reed-Solomon codes that works with fractional agreement _N6 that is less than 4 where 4 is the rate, must exploit the fact that the evaluation points Î are distinct or almost distinct (by which we mean that no BÎ ; is repeated too many times). Note that for small values of 4 (close to ), our result covers even an improvement of the necessary fractional agreement by i4P which is substantially smaller than 4 . Corollary 6.4. Suppose ¢ is an algorithm that takes as input j distinct pairs âÎ!ZY>ÎW9+] for an arbitrary field and outputs a list of all polynomials U of degree at most for which UYvÎWùY£Î for more than I ' pairs. Then, there exist inputs under which ¢ must output a list of super-polynomial size. Proof. Note that in the list recovery setting of Theorem 6.3, the total number of pairs Y Q R~³ ntG , and the agreement parameter _kQ . Then ' ' > I V ×Ö R . E . G8n nLG 2 I V Q Ö G8n I V [±_3¤ 2 I XV V Therefore there can be super-polynomially many candidate polynomials to output even when the agreement parameter _ satisfies _Ð > I~C6<V . 6.3.3 Implications for List Recovering Folded Reed-Solomon Codes In this subsection, we will digress a bit and see what the ideas in Section 6.3.1 imply about list recoverability of folded Reed-Solomon codes. Recall that a folded Reed-Solomon code with folding parameter is just a Reed-Solomon code with consecutive evaluation pointsbundled together (see Chapter 3). In particular, if we start with an ³!ë¡ Reed 56[ folded Reed-Solomon code. Solomon code, then we get an ~%Y6_! 99 It is not too hard to check that the one can generalize Theorem 6.1 to show the following. þ codewords from an Let @öG be an integer and be a prime power. Then there are ¯ þ ¯ þ åZ . ! 
¯ åZ 2 folded Reed-Solomon code such that every symbol of such a codeword takes $8 $ þ þ a value in ^ ¯ $ . The set of folded Reed-Solomon codewords are just the BCH codewords from Theorem 6.1, with consecutive positions in the BCH codeword “folded” into one symbol. Thus, this shows that an ~]! folded Reed-Solomon code (with folding parameter ) cannot be noiselessly list recovered with input lists of size $ . Let us now recall the algorithmic results for (noiselessly) list recovering folded Reed Solomon codes. From (3.6) it follows that an ~g! folded Reed-Solomon code (with folding parameter ) can be noiselessly list recovered with input lists of size if ~O .G8n Á G 2 þ HI ÁÐnQGCÿ  þp ~ ! ; where G§E ÁgEõ , and G[FÁ are parameters that we can choose. Thus for any 0  G , then we can satisfy the above condition if E÷GIl<  Z þ ~ ÿ  HI Ð Á nQG ÿ þ  Z ¤ if (6.3) The bound above unfortunately is much smaller than the bound of ~6 Z$ , unlike the case of Reed-Solomon codes where the two bounds were (surprisingly) tight. For the case ; when , the bound in (6.3) is at least WU ~[ , however one can show that for any ݧ Z å choose Áo ÷ wG-IxÝ<6eV> , in which case the bound in (6.3) ~6 $8 . Indeed, one can Zå Zå Z `G*Ie $8 Z½å Z . The claimed expression W < Ý < 6 £ V is ~6 $8 j ~o6 $ $8 follows by noting that ~o6 >4-`G while ÝX!` and are all `G . 6.3.4 A Precise Description of Polynomials with Values in Base Field ¯ åZ We proved in Section 6.3.1, for ö ¯ o åZ , there are exactly o polynomials over ¯ o of degree ö or less that evaluate to a value in ¯ at every point in ¯ o . The proof of this obtains the coefficients of such polynomials using a “Fourier transform” of codewords of an associated BCH code, and as such gives little insight into the structure of these polynomials. One of the natural questions to ask is: Can we say something more concrete about the structure of these o polynomials? 
In this section, we answer this question by giving an exact description of the set of all these q^{2^m} polynomials. We begin with the following well-known fact, which simply states that the "norm" function of F_{q^m} over F_q takes values only in F_q.

Lemma 6.3. For all x in F_{q^m}, x^{(q^m − 1)/(q − 1)} is in F_q.

Theorem 6.5. Let q be a prime power, and let m ≥ 1. Let γ be a primitive element of F_{q^m}. Then there are exactly q^{2^m} univariate polynomials in F_{q^m}[X] of degree at most e = (q^m − 1)/(q − 1) that take values in F_q when evaluated at every point in F_{q^m}, and these are precisely the polynomials in the set

S = { Σ_{i=0}^{2^m − 1} a_i (X + γ^i)^e : a_0, a_1, ..., a_{2^m − 1} in F_q }.

Proof. By Lemma 6.3, clearly every polynomial P in S satisfies P(x) in F_q for all x in F_{q^m}. The claim that there are exactly q^{2^m} polynomials over F_{q^m} of degree e or less that take values only in F_q was already established in Theorem 6.1. So the claimed result, that S precisely describes the set of all these polynomials, follows if we show that |S| = q^{2^m}. Note that by definition |S| ≤ q^{2^m}. To show |S| ≥ q^{2^m}, it clearly suffices to show (by linearity) that if

Σ_{i=0}^{2^m − 1} a_i (X + γ^i)^e = 0   (6.4)

as a polynomial in F_{q^m}[X], then a_0 = a_1 = ... = a_{2^m − 1} = 0. We will prove this by setting up a full-rank homogeneous system of linear equations that the a_i's must satisfy. For this we need Lucas' theorem, stated below.

Lemma 6.4 (Lucas' Theorem, cf. [47]). Let p be a prime. Let a and b be non-negative integers with p-ary expansions a = a_0 + a_1 p + ... + a_s p^s and b = b_0 + b_1 p + ... + b_s p^s respectively. Then C(a, b) ≡ Π_i C(a_i, b_i) (mod p); in particular, C(a, b) is nonzero modulo p if and only if b_i ≤ a_i for all i in {0, 1, ..., s}.

Define the set

B = { Σ_{j in T} q^j : T ⊆ {0, 1, ..., m − 1} }.

Applying Lemma 6.4 with p being the characteristic of the field F_q, and noting that every base-p digit of e = 1 + q + q^2 + ... + q^{m−1} is either 0 or 1, we see that when operating in the field F_{q^m}, the binomial coefficient C(e, i) in the expansion of (X + γ^i)^e is 1 if i is in B and 0 otherwise.
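Lucas' theorem as used here is easy to check numerically. The sketch below (the helper name is ours) computes C(a, b) mod p digit by digit and compares against direct computation:

```python
from math import comb

def lucas_binom_mod(a: int, b: int, p: int) -> int:
    """C(a, b) mod p via Lucas' theorem: multiply binomials of base-p digits."""
    result = 1
    while a > 0 or b > 0:
        ai, bi = a % p, b % p
        if bi > ai:  # some digit of b exceeds that of a, so the coefficient is 0 mod p
            return 0
        result = (result * comb(ai, bi)) % p
        a //= p
        b //= p
    return result

# Sanity check against direct computation for a small prime.
p = 3
for a in range(40):
    for b in range(a + 1):
        assert lucas_binom_mod(a, b, p) == comb(a, b) % p
```

In particular, when every base-p digit of e is 0 or 1 (as for e = 1 + q + ... + q^{m−1} above), C(e, i) mod p is 1 exactly when every digit of i is at most the corresponding digit of e, which is the fact used in the proof.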
It follows that (6.4) holds if and only if Σ_{i=0}^{2^m − 1} a_i γ^{i·b} = 0 for all b in B, which by the definition of B and the fact that e = 1 + q + q^2 + ... + q^{m−1} is equivalent to

Σ_{i=0}^{2^m − 1} a_i (γ^b)^i = 0 for all b in B.   (6.5)

Let us label the 2^m elements { γ^b : b in B } as β_0, β_1, ..., β_{2^m − 1} (note that these are distinct elements of F_{q^m}, since γ is primitive and the exponents b are distinct integers in the range [0, q^m − 2]). The coefficient matrix of the homogeneous system of equations (6.5) with unknowns a_0, ..., a_{2^m − 1} is then the Vandermonde matrix

[ 1  β_0          β_0^2          ...  β_0^{2^m − 1}         ]
[ 1  β_1          β_1^2          ...  β_1^{2^m − 1}         ]
[ .  .            .              ...  .                     ]
[ 1  β_{2^m − 1}  β_{2^m − 1}^2  ...  β_{2^m − 1}^{2^m − 1} ]

which has full rank. Therefore, the only solution to the system (6.5) is a_0 = a_1 = ... = a_{2^m − 1} = 0, as desired.

6.3.5 Some Further Facts on BCH Codes

The results in the previous subsections show that a large number (q^{2^m}) of polynomials over F_{q^m} take on values in F_q at every evaluation point, and this proved the tightness of the "square-root" bound for agreement t = q^m and total number of points N = q^{m+1} (recall Corollary 6.4). It is a natural question whether a similarly large list size can be shown at other points (t, N), specifically for slightly smaller N and t. For example, what if we consider list recovery with input lists of size q − 1, say the lists F_q \ {0} at every point? In particular, how many polynomials of degree at most e = (q^m − 1)/(q − 1) take on values in F_q \ {0} at t points in F_{q^m}? It is easily seen that when t = q^m, there are precisely q − 1 such polynomials, namely the constant polynomials that equal an element of F_q \ {0}. Indeed, by the Johnson bound, since the agreement t = q^m exceeds sqrt(kN) for this setting of parameters, we should not expect a large list size. However, even for the slightly smaller amount of agreement t = q^m − 1, there are only about linearly (in q^m) many codewords, as Lemma 6.5 below shows. Hence obtaining a super-polynomial number of codewords at other points on the square-root bound, when the agreement t is less than the block length, remains an interesting question, which perhaps the BCH code connection just by itself cannot resolve.
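The norm map that drives Lemma 6.3 and these BCH constructions can be verified directly in a small field. The sketch below builds GF(9) as F_3[x]/(x^2 + 1) (our choice of irreducible modulus; any irreducible quadratic works) and checks that a ↦ a^{(9−1)/(3−1)} = a^4 always lands in the base field F_3:

```python
# GF(9) = F_3[x]/(x^2 + 1); elements are pairs (c0, c1) representing c0 + c1*x.
P = 3

def gf9_mul(a, b):
    # (a0 + a1 x)(b0 + b1 x) = a0 b0 + (a0 b1 + a1 b0) x + a1 b1 x^2, with x^2 = -1.
    c0 = (a[0] * b[0] - a[1] * b[1]) % P
    c1 = (a[0] * b[1] + a[1] * b[0]) % P
    return (c0, c1)

def gf9_pow(a, e):
    r = (1, 0)
    for _ in range(e):
        r = gf9_mul(r, a)
    return r

elements = [(c0, c1) for c0 in range(P) for c1 in range(P)]
norm_exp = (9 - 1) // (3 - 1)  # = 4, the exponent in Lemma 6.3

for a in elements:
    n = gf9_pow(a, norm_exp)
    assert n[1] == 0, "norm values must lie in the base field F_3"
```

The same check works for any q and m; the key identity is that a nonzero norm value, raised to the power q − 1, gives a^{q^m − 1} = 1, so norm values are (q − 1)-th roots of unity, i.e., elements of F_q.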
Lemma 6.5. Let q be a prime power and let m ≥ 1. For a polynomial P over F_{q^m}[X], define its Hamming weight to be |{ x in F_{q^m} : P(x) ≠ 0 }|. Then there are exactly (q − 1)·q^m univariate polynomials in F_{q^m}[X] of degree at most e = (q^m − 1)/(q − 1) that take values in F_q when evaluated at every point in F_{q^m} and that have Hamming weight q^m − 1. Furthermore, these are precisely the polynomials in the set T = { a·(X − β)^e : β in F_{q^m}, a in F_q \ {0} }.

Proof. It is obvious that all the polynomials in T satisfy the required properties and are distinct polynomials. We next show that any polynomial of degree at most e that satisfies the required properties belongs to T, completing the proof. Let P be a polynomial of degree at most e that satisfies the required properties; we must show that P is in T. Let y in F_{q^m} be the unique point with P(y) = 0. Clearly, for each x in F_{q^m} \ {y}, the ratio P(x)/(x − y)^e lies in F_q \ {0} (both numerator and denominator are nonzero elements of F_q, using Lemma 6.3). By a pigeonhole argument, there must exist some a in F_q \ {0} such that P(x) = a·(x − y)^e for at least (q^m − 1)/(q − 1) = e values of x in F_{q^m} \ {y}. Since P(y) = 0 = a·(y − y)^e, the degree-e polynomials P(X) and a·(X − y)^e agree on at least e + 1 field elements, which means that they must be equal to each other. Thus the polynomial P belongs to T and the proof is complete.

6.4 Explicit Hamming Balls with Several Reed-Solomon Codewords

Throughout this section, we will be concerned with an [n, k + 1]_q Reed-Solomon code RS[n, k + 1] over F_q. We will be interested in a received word r in F_q^n such that a super-polynomial number of codewords of RS[n, k + 1] agree with r on t or more positions, and the aim will be to prove such a result for t non-trivially larger than k. We start with the existential result.

6.4.1 Existence of Bad List Decoding Configurations

It is easy to prove the existence of a received word r with at least C(n, t)/q^{t − k − 1} codewords having agreement at least t with r. One way to see this is that this quantity is the expected number of such codewords for a received word that is the evaluation of a random polynomial of degree t [70].^5 We have the following lower bound:

C(n, t)/q^{t − k − 1} ≥ (n/t)^t · q^{k + 1 − t},

so the number of codewords is super-polynomial in n whenever k·log n − t·log t grows faster than log n (taking n = q). Now when k = n^δ for some δ > 0 and t is at most a small enough constant multiple of k·log n / log k, the quantity k·log n − t·log t is Ω(k·log n), which implies that the number of RS codewords with agreement t with the received word is q^{Ω(k)}. On the other hand, if the rate is constant, say k = Θ(n), let t = k + a where a = ε·n/log q for a small constant ε (we also assume t ≤ n/2). Then C(n, t) ≥ 2^t = 2^{Ω(n)} while q^{t − k − 1} ≤ 2^{ε·n}, so we get 2^{Ω(n)} RS codewords with agreement t = k + Θ(k/log q) with the received word.

In the remainder of the chapter, we will try to match these parameters with explicit received words. We will refer to Reed-Solomon codes with constant rate as high rate Reed-Solomon codes and to Reed-Solomon codes with inverse polynomial rate as low rate Reed-Solomon codes.

6.4.2 Low Rate Reed-Solomon Codes

Another argument for the existence of a bad list-decoding configuration (from the previous subsection), as suggested in [25], is based on an element v in F_{q^d} \ F_q, for some positive integer d, that can be written as a product Π_{a in A}(γ + a) for at least C(q, t)/q^d subsets A ⊆ F_q with |A| = t; the existence of such a v again follows by a trivial counting argument. Here we use the result due to Cheng and Wan [25] that for certain settings of parameters and fields such a v can be explicitly specified with only a slight loss in the number of subsets A, and thereby get an explicit received word r with several close-by codewords from RS[q, k + 1].

^5 The bound can be improved slightly by using a random monic polynomial.

Theorem 6.6 ([25]). Let ε > 0 be arbitrary. Let q be a prime power, let d be a positive integer, and let γ be such that F_q(γ) = F_{q^d}. For any v in F_{q^d} \ {0}, let N_t(v) denote the number of t-tuples (a_1, a_2, ..., a_t) of distinct a_i in F_q such that v = Π_{i=1}^{t} (γ + a_i). If t ≥ (2/ε + 1)(d + 1) and q is sufficiently large as a function of t, d, and ε (polynomially large suffices), then for all v in F_{q^d} \ {0},

N_t(v) ≥ (t − 1)! · q^{t − d − 1}.

Proof. From the proof of Theorem 3 in [25], one obtains N_t(v) ≥ t!·(I_1 − I_2), where I_1 counts solutions without the distinctness requirement and I_2 bounds the number of tuples with a repeated coordinate. Lower bounding I_1 and upper bounding I_2 using the assumed bounds on t, d, and q combines to prove the theorem; we omit the calculation.

We now state the main result of this section concerning Reed-Solomon codes:

Theorem 6.7. Let ε > 0 be an arbitrary real, q a prime power, and d a positive integer such that t and d satisfy the conditions of Theorem 6.6. Then for every k in the range t − d ≤ k ≤ t − 1, there exists an explicit received word r in F_q^q such that there are at least q^{t − 2d − 1} codewords of RS[q, k + 1] that agree with r in at least t positions.

We will prove the above theorem at the end of this section. Letting q grow (with d, t, k chosen appropriately), we can get super-polynomially many codewords with agreement (1 + δ)·k for some δ = δ(ε) > 0 for a Reed-Solomon code of dimension close to linear in q; alternatively, we can get super-polynomially many codewords with agreement tending to 2k with dimension still being q^{Θ(1)}. We record these as two corollaries below (for the sake of completeness, we sketch the proofs). We note that the non-explicit bound C(n, t)/q^{t − k − 1} gives a super-polynomial number of codewords for agreement t up to about k·log n / log k, whereas our explicit construction can give agreement at most 2k.

Corollary 6.8.
For all 0 < δ < 1 and all primes p, there exists ε_δ > 0 such that for every large enough power q of p, there exists an explicit r in F_q^q and a dimension parameter k = q^{Θ(1)} such that the Reed-Solomon code RS[q, k + 1] contains a super-polynomial (in q) number of codewords with agreement at least (2 − δ)·k with r.

Proof (sketch). Given δ, choose t and k with t = (2 − δ)k and d = t − k = (1 − δ)k (rounding issues can be absorbed into δ). The ratio t/(d + 1) tends to (2 − δ)/(1 − δ) > 2, so the hypotheses of Theorem 6.6 hold for a suitable constant ε = ε(δ) once k is large enough, provided q is sufficiently large compared to t and d. Choosing q in an appropriate range (a suitable prime power exists for every large enough k), we get k = q^{Θ(1)}, and Theorem 6.7 yields at least

q^{t − 2d − 1} = q^{δ·k − 1}

codewords with agreement t = (2 − δ)k, which is super-polynomial in q since k = q^{Θ(1)} grows with q. Finally, since having agreement (2 − δ)k implies having agreement (2 − δ')k for any δ' ≥ δ, the claim for all 0 < δ < 1 follows.

Corollary 6.9. For all 0 < δ < 1 and all primes p, there exists ε_δ > 0 such that for every large enough power q of p, there is an explicit r in F_q^q such that the Reed-Solomon code RS[q, k + 1] with k = Θ(q^{1 − ε_δ}) contains a super-polynomial (in q) number of codewords with agreement at least (1 + δ)·k with r.

Proof (sketch). The proof is similar to the proof of Corollary 6.8, and hence most details are skipped. Here one chooses t = (1 + δ)k and d = t − k = δ·k, so the ratio t/d = (1 + δ)/δ is large when δ is small; the hypotheses of Theorem 6.6 then allow q to be taken polynomially close to t, giving dimension k = Θ(q^{1 − ε_δ}). Theorem 6.7 then yields at least q^{t − 2d − 1} = q^{(1 − δ)k − 1} codewords with agreement (1 + δ)k, which is super-polynomial in q.

(Proof of Theorem 6.7.) In what follows, we fix h(X) to be a polynomial of degree d ≥ 2 that is irreducible over F_q. For the rest of this proof we will denote F_q[X]/(h(X)) by F_{q^d}. Also note that for any root γ of h, F_q(γ) = F_{q^d}.
By Theorem 6.6, we have ~It <_3I such that Þo 587 9 ø S ^ å å Z for every — let us denote by ~ this latter quantity. Recalling the definition of G ö , we have that for any ! U , N |Iò , or equivalently G fnP W` 8Wk ; N . Since # is the irreducible polynomial of over ¯ , this implies that #a ¬ divides ¬ 8/¬ÁnîG ¬" in ¯ »¬¡ . I Finally we define H ¬ to be a polynomial of degree a±_jn such that ï Þ H /¬#. ¬"kL /¬ 8 /¬ÁnîG/¬\¤ (6.6) H Clearly H I@C equals GCI@UN6<#aI@U for each @_+ñ¢ Á and thus the polynomial ø agrees with d on at least _ positions. To complete the proof we will give a lower bound on the number of distinct polynomials in the collection ¢©H ø ¦ . For a fixed , out of the ~ choices for , _&" choices of would lead to the same6 polynomial of degree _ . Since ¯ ç p åZ ~ % ~ , there are at least j choices of pairs ! U . Clearly for £ZF@ Ï the Ø G polynomials ¬" and ¬" are distinct, however we could have ¬ /¬( Ø Ø Ø ¬"` Ø ¬" (both are equal to say ¶ ¬" ) leading to H ¬"-üH Ø ¬" . However the degree of ¶ is at most _en 5µn , and hence ¶ can have at most µn roots, and therefore Ù ¬8n @C with · H*·>±_ . It follows that no single degree at most ' D factors of the form 9 ð polynomial is counted more than ' times in the collection ¢©H ¦ , and hence there must be at least ò x ò rS Z Q I GZ~ _&" ' ò ò e' &_ " ' ò î ó Ü l Ü ø Ãz÷z÷÷à õ isÁ aÃz÷zsolution 5ú )À 7õ Ì If ôöõ of the equation ù ÷ ÷ à . permutation ü on t distinct polynomials among them, where we used ~%z<_IG , Z jS Z LX' å> I . since =_ún 6 c abc6h Ik /å å Z and W S Z a I G<_IaG8 a n then so is Üø z à ÷ z ÷ ! ÷ à öô õû õû for any wcv{ w { 106 6.4.3 High Rate Reed-Solomon Codes We now consider the case of constant rate Reed-Solomon codes. We start with the main result of this subsection. Theorem 6.10. Let ÷mV be an integer. Let UF'@Uwn|G be a prime and define _P ;\ for any G*RC;}R @*ItG . Let the received word Å be the evaluation of 4o<O_Ð O over X . 
Then there are 9 many codewords in 4s¶- [8UY!ë÷;³IG`]nG¡| that agree with Å in A at least _ places. ý To get some interesting numbers, let’s instantiate the parameters in the above theorem. First we need the following result (we will prove this later in the subsection): ; R E|G6CWÜ ~ I G , where GPR,Ü R±q , there exists infinitely many Lemma 6.6. For every x with prime Uu @cSntG such that @ is oW . Corollary 6.11. 3 3 Let U be a prime that satisfies Lemma 6.6 for some . Then there exists at least Ve p æ codewords in 4P¶( =Uj!b0õP/j\!NT]p0I*nG\¡| with agreement Z Z _k5}nñy/ Ñi . ý Ó Proof. Set y ; !`GòIÝ<@Q$ònqG ; for some Ý[ . Thus, _ ö>;(IqG÷ !`GòI,Ý<@c $u Z Z y8@U3( y/j . Further, _( ;\õ*n,QÊno^ ÑW . The last part follows from Z f . Finally, the number of codewords is at least 9 A åZ LV y9fj the fact that [ïoW åZÚ 3 3 A p VX ® . Ó 9 If one is satisfied with super polynomially many codewords, say V " for some */jk × 4( ¨ª©£«kY , then choosing g (for some suitable constant Ü ), gives an agree¹»å º½¼ × " ment _35Pn¥ þ9 æ ¹»º½¼ 2 . " ® . 9 9 ¹»º¼ " Proof of Theorem 6.10. The basic idea is to find a “lot” of _ -tuples ?;Z\!Z;£!¥¤¥¤¥¤!Z;X iò+w , (where for every Í- Ï , ;eÎ9± Ï ; ) such that the polynomial 6B O B ] ?O{³ Î Û Z <O I¥;XÎ is actually of the form x O n ü ^å ; Ü O Û Z where Üf /å can be .7 The above is equivalent to showing that <;CZ!¥¤¥¤¥¤!Z;X i satisfy the following equations ;   Z n8; nQ¥^;  ; Áò÷G<!ëVU!¤¥¤¥¤ë_IG We give an “explicit” description of at least 9 distinct ?;UZ\!¥¤¤¥¤!Z;X Ñ such tuples. 7 öÿ Ë { höÿ Then v h Inku w n u I 5 is of degree a r A u t as needed. (6.7) 107 Let ´ X be generated by Y and set ö Y9 . Note that the order of is exactly . Now consider the “orbits” in ( X under the action of . It is not too hard to see that for ; EzÍ}R @ , the Í áË orbit is the set Y Î ¢ , where ¢M ¢cG£!N8!N !¤¥¤¥¤!f åZ ¦ . ; We will call Y Î the “representative” of the Í áË orbit. Consider all subsets ¢Í ý !¤¥¤¥¤!`Í åZë¦ ¢ !¥G£!¥¤¥¤¤¥!6@-IlGe¦ A of size ; . 
Each such subset corresponds to a tuple (b_1, ..., b_t) in the following manner (recall that t = kℓ): for the subset {i_0, ..., i_{k−1}}, define b_{aℓ + u + 1} = g^{i_a}·h^u, where 0 ≤ a < k and 0 ≤ u < ℓ. Note that each such subset yields a distinct tuple, so there are C(s, k) distinct tuples. To complete the proof, we will now verify that (6.7) holds for every such tuple. Indeed, by construction, for j = 1, ..., ℓ − 1:

Σ_{i=1}^{t} b_i^j = Σ_{a=0}^{k−1} (g^{i_a})^j · Σ_{u=0}^{ℓ−1} (h^j)^u = Σ_{a=0}^{k−1} (g^{i_a})^j · (h^{ℓj} − 1)/(h^j − 1) = 0,

where the last equality follows from the fact that the order of h is ℓ (so h^j ≠ 1 while h^{ℓj} = 1).

We now turn to the proof of Lemma 6.6. First we need the following result, which is a special case of Linnik's theorem:

Theorem 6.12 ([78]). There exists a constant c_L, 1 < c_L < 6, such that for all sufficiently large T, there exists a prime q with q ≤ T^{c_L} and q ≡ 1 (mod T).

Proof of Lemma 6.6. Fix any θ with 0 < θ ≤ 1/(c_L − 1). The basic idea of the proof is to "redistribute" the product b·T as s·ℓ, with s = Θ(ℓ^θ). Let T = 2^a be sufficiently large so that it satisfies the condition of Theorem 6.12. Thus, by Theorem 6.12, q = bT + 1 is prime for some 1 ≤ b ≤ T^{c_L − 1}. Let 2^i ≤ b < 2^{i+1} for some i in {0, 1, ..., (c_L − 1)a − 1}. We consider two cases depending on whether i ≤ i_0 := ⌊θa⌋ or not. First consider the case i ≤ i_0. Here define j = ⌊(θa − i)/(1 + θ)⌋, and let s = b·2^j and ℓ = 2^{a−j}. Note that 0 ≤ j ≤ a, so s and ℓ are well defined and s·ℓ = b·2^a = q − 1; using 2^i ≤ b < 2^{i+1}, both s and ℓ are within constant factors of 2^{θ(a+i)/(1+θ)} and 2^{(a+i)/(1+θ)} respectively, so s = Θ(ℓ^θ) as required. Now consider the case i > i_0. In this case define j = ⌊θ(a + i)/(1 + θ)⌋ and let s = 2^j and ℓ = b·2^{a−j}; again s·ℓ = q − 1, and s and ℓ are well defined since θi ≤ a. As before, we bound s/ℓ^θ from below and above.
The same calculation as in the first case, using 2^i ≤ b < 2^{i+1} and the fact that θi ≤ a (which holds since i ≤ (c_L − 1)a and θ ≤ 1/(c_L − 1)), shows that s = Θ(ℓ^θ) in this case as well, which completes the proof of Lemma 6.6.

Smooth Variation of the Agreement

In this section, we will see how to get rid of the "restriction" that t has to be a multiple of ℓ in Theorem 6.10.

Theorem 6.13. Let ℓ ≥ 2 and 0 ≤ a < ℓ be integers. Let q = ℓs + 1 be a prime and define t = kℓ + a for any 1 ≤ k ≤ s − 1. Let the received word r be the evaluation of f'(X) = X^{t−a}·E(X) over F_q \ {0}, where E(X) is an explicit polynomial of degree a specified below. Then there are C(s − 1, k) many codewords in RS[q − 1, ℓ(k − 1) + 1 + a] that agree with r in at least t places.

Since the proof is very similar to that of Theorem 6.10, we will just sketch the main ideas here. The basic argument used earlier was that every t-tuple (b_1, ..., b_t) was chosen so that the polynomial P(X) = Π_i (X − b_i) and X^{kℓ} agree on all coefficients of degree above ℓ(k − 1), and the RS codewords were simply the polynomials X^{kℓ} − P(X). Now the simple observation is that for any fixed polynomial E(X) of degree a, we can get RS codewords of dimension ℓ(k − 1) + 1 + a by considering the polynomials E(X)·(X^{kℓ} − P(X)). The new agreement is with the new received word f'(X) = X^{kℓ}·E(X), and the new agreement exceeds the old one by the number of roots of E(X) that are not in the set {b_1, ..., b_t}. Thus, we can now vary the dimension by picking polynomials E(X) of different degrees. However, the gain in agreement might go down, as an arbitrary polynomial E(X) of degree a might not have a roots, and even then, some of them might be in the set {b_1, ..., b_t}. To get around this, while choosing the tuples (b_1, ..., b_t) we will not pick any elements from one of the s cosets (recall that the tuples (b_1, ..., b_t) are just unions of k of the s cosets formed by the orbits of multiplication by h = g^s, where g generates F_q \ {0}). This reduces the number of tuples from C(s, k) to C(s − 1, k).
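The coset construction in the proof of Theorem 6.10 can be checked concretely for a toy prime, say q = 13, ℓ = 3, s = 4 (our choice): the product of the linear factors over any union of k cosets of the order-ℓ subgroup is a polynomial in X^ℓ, so X^t minus that product has degree at most ℓ(k − 1):

```python
from itertools import combinations

q, ell = 13, 3                 # q = ell*s + 1 prime, so s = 4
s = (q - 1) // ell

def polymul(a, b):             # polynomial multiplication mod q, lowest degree first
    r = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] = (r[i + j] + x * y) % q
    return r

g = 2                          # 2 generates F_13 \ {0} (checked by hand)
h = pow(g, s, q)               # element of order ell
cosets = [[(pow(g, i, q) * pow(h, u, q)) % q for u in range(ell)] for i in range(s)]

k = 2
t = k * ell
codewords = set()
for chosen in combinations(range(s), k):
    roots = [c for i in chosen for c in cosets[i]]
    prod = [1]
    for c in roots:
        prod = polymul(prod, [(-c) % q, 1])
    # f(X) = X^t - prod(X) should have degree at most ell*(k-1)
    f = [(-c) % q for c in prod]
    f[t] = (f[t] + 1) % q
    assert all(f[j] == 0 for j in range(ell * (k - 1) + 1, t + 1))
    # f agrees with the received word X^t exactly where prod vanishes, i.e. on the t roots
    agree = [x for x in range(1, q)
             if pow(x, t, q) == sum(f[j] * pow(x, j, q) for j in range(len(f))) % q]
    assert set(roots) <= set(agree)
    codewords.add(tuple(f))

assert len(codewords) == len(list(combinations(range(s), k)))  # C(s, k) distinct codewords
```

Here each coset product collapses to X^ℓ − g^{ℓi}, so the product over k cosets involves only powers of X^ℓ, which is exactly why the power sums in (6.7) vanish.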
Now we pick an arbitrary subset of the unused coset of size exactly a, say {c_1, ..., c_a}, and pick E(X) = Π_{i=1}^{a}(X − c_i). Note that this implies an agreement of t = kℓ + a, as desired.

6.5 Bibliographic Notes and Open Questions

Results in Section 6.3 and Section 6.4.2 appeared in [59], while those in Section 6.4.3 are from [62]. Our work, specifically the part that deals with precisely describing the collection of polynomials that take values only in F_q, bears some similarity to [51], which also exhibited limits to list recoverability of codes. One of the simple yet powerful ideas used in [51], and also in the work on extractor codes [101], is that polynomials which are e-th powers of a lower degree polynomial take only values in the multiplicative subgroup consisting of the e-th powers in the field. Specifically, the construction in [101, 51] yields a collection of codewords for list recovery that is super-polynomially large only when the input lists (the sets S_i in Definition 6.1) are asymptotically large.

In our work, we also use e-th powers, but the value of e is such that the e-th powers form a subfield of the field. Therefore, one can also freely add polynomials which are e-th powers, and the sum still takes on values in the subfield. This lets us demonstrate a much larger collection of polynomials which take on only a small number of possible values at every point in the field. Proving bounds on the size of this collection of polynomials used techniques that were new to this line of study.

The technique behind our results in Section 6.4.2 is closely related to that of the result of Cheng and Wan [25] on connections between Reed-Solomon list decoding and the discrete logarithm problem over finite fields. However, our aim is slightly different compared to theirs, in that we want to get a large collection of codewords close by to a received word. In particular, in Theorem 6.6 we get an estimate on N_t(v), while Cheng and Wan only require N_t(v) > 0. Also, Cheng and Wan consider equation (6.6) only with the choice G(X) = 1.

Ben-Sasson, Kopparty and Radhakrishnan in [12], exploiting the sparsity of linearized polynomials, have shown the following. For every δ in (0, 1) there exist Reed-Solomon codes of block length n and sub-linear dimension that contain super-polynomially many codewords agreeing with a received word well beyond the Johnson bound. They also show that for constant rate Reed-Solomon codes (where the rate is R), there exists a received word that has agreement R'·n (where R' > R) with roughly n^{log n} codewords. The received word in the above constructions, however, is not explicit. Ben-Sasson et al. also construct an explicit received word that agrees with super-polynomially many Reed-Solomon codewords in Θ(k) many places, where k + 1 is the dimension of the code. However, their results do not give explicit bad list decoding configurations for constant rate Reed-Solomon codes. The results in [12] do not work for prime fields, while the results on explicit received words in this chapter do work for prime fields.

We conclude with some open questions.

Open Question 6.1. We have shown that Reed-Solomon codes of rate 1/ℓ cannot be list recovered in polynomial time with input lists of size ℓ when ℓ is a prime power. Can one show a similar result for other values of ℓ? Using the density of primes and our work, one can bound the rate by Θ(1/ℓ), but if it is true it would be nice to show that it is at most 1/ℓ for every ℓ.

We have shown that the square-root bound for polynomial reconstruction is the best possible given general pairs (α_i, y_i) as input. It remains a big challenge to determine whether this is the case also when the α_i's are all distinct, or equivalently:

Open Question 6.2. Is the Johnson bound the true list decoding radius of Reed-Solomon codes?
We conjecture this to be the case in the following sense: there exist a field F and a subset of evaluation points S such that for the Reed-Solomon code defined over F and S, the answer to the question above is yes. One approach that might give at least partial results would be to use some of our ideas (in particular those using the norm function, possibly extended to other symmetric functions of the automorphisms of F_{q^m} over F_q) together with ideas in the work of Justesen and Hoholdt [70], who used the Trace function to demonstrate that a linear number of codewords can occur at the Johnson bound. Further, the work of Ben-Sasson et al. [12] gives evidence for this for Reed-Solomon codes of rate close to 0.

Open Question 6.3. Can one show an analog of Theorem 6.6 on products of linear factors for the case when t is linear in the field size q? (The currently known results work only for t up to about sqrt(q).)

This is an interesting field theory question in itself, and furthermore it might help towards showing the existence of a super-polynomial number of Reed-Solomon codewords with agreement t = (1 + ε)k for some ε > 0 at constant rate (i.e., when k is linear in q). It is important for the latter, however, that we show that N_t(v) is very large for some special field element v in an extension field, since by a trivial counting argument it follows that there exist v in F_{q^d} \ {0} for which N_t(v) ≤ t!·C(q, t)/(q^d − 1).

Chapter 7

LOCAL TESTING OF REED-MULLER CODES

From this chapter onwards, we will switch gears and talk about property testing of codes.

7.1 Introduction

A low degree tester is a probabilistic algorithm which, given a degree parameter t and oracle access to a function f on n arguments (which take values from some finite field F), has the following behavior. If f is the evaluation of a polynomial on n variables with total degree at most t, then the low degree tester must accept with probability one. On the other hand, if f is "far" from being the evaluation of every polynomial on n variables with degree at most t, then the tester must reject with constant probability. The tester can query the function f to obtain the evaluation of f at any point. However, the tester must accomplish its task using as few probes as possible.

Low degree testers play an important part in the construction of Probabilistically Checkable Proofs (or PCPs). In fact, different parameters of low degree testers (for example, the number of probes and the amount of randomness used) directly affect the parameters of the corresponding PCPs, as well as various inapproximability results obtained from such PCPs ([36, 5]). Low degree testers also form the core of the proof of MIP = NEXPTIME in [9].

Blum, Luby, and Rubinfeld designed the first low degree tester, which handled the linear case, i.e., t = 1 ([21]), although with a different motivation. This was followed by a series of works that gave low degree testers for larger values of the degree parameter ([93, 42, 7]). However, these subsequent results, as well as others which use low degree testers ([9, 43]), only work when the degree is smaller than the size of the field. Alon et al. proposed a low degree tester for any nontrivial degree parameter over the binary field F_2 [1]. A natural open problem was to give a low degree tester for all degrees over finite fields of size between two and the degree parameter. In this chapter we (partially) solve this problem by presenting a low degree test for multivariate polynomials over any prime field F_p.

7.1.1 Connection to Coding Theory

The evaluations of polynomials in n variables of degree at most t are the well known Reed-Muller codes (note that when n = 1, we have the Reed-Solomon codes, which we considered in Chapter 6). In particular, the evaluations of polynomials in n variables of degree at most t over F_q form the Reed-Muller code RM_q(t, n) with parameters t and n.
These codes have length q^n (see [28, 29, 69] for more details, including their dimension). Therefore, a function f has degree at most t if and only if (the vector of evaluations of) the function f is a valid codeword in RM_q(t, n). In other words, low degree testing is equivalent to locally testing Reed-Muller codes.

7.1.2 Overview of Our Results

It is easiest to describe our tester over F_2. To test if f has degree at most t, pick t + 1 vectors y_1, ..., y_{t+1} and b from F_2^n, and test if

Σ_{S ⊆ {1, ..., t+1}} f(b + Σ_{i in S} y_i) = 0,

where for notational convenience we take Σ_{i in ∅} y_i = 0 (and we will stick to this convention throughout this chapter). We remark here that a polynomial of degree at most t always passes the test, whereas a polynomial of degree greater than t gets caught with some non-negligible probability η. To obtain a constant rejection probability we repeat the test O(1/η) times.

The analysis of our test follows the same general structure developed by Rubinfeld and Sudan in [93] and borrows techniques from [93, 1]. The presence of a doubly-transitive group suffices for the analysis given in [93]. Essentially, we show that the presence of a doubly-transitive group acting on the coordinates of the dual code does indeed allow us to localize the test; however, this gives a weaker result. We use techniques developed in [1] for better results, although the adaptation is not immediate. In particular, the interplay between certain geometric objects described below and their polynomial representations plays a pivotal role in getting results that are only about a quadratic factor away from the optimal query complexity.

In coding theory terminology, we show that Reed-Muller codes over prime fields are locally testable. We further consider a new basis for Reed-Muller codes over prime fields that in general differs from the minimum weight basis. This allows us to present a novel exact characterization of the multivariate polynomials of degree at most t in n variables over prime fields.
Our basis has a clean geometric structure in terms of flats [69], and unions of parallel flats but with different weights assigned to different parallel flats1 . The equivalent polynomial and geometric representations allow us to provide an almost optimal test. Main Result Our results may be stated quantitatively as follows. For a given Ø integer _òHU§ILG and a ] Z ; Z given real , our testing algorithm queries at W. n8_µhU X 2 points to determine 1 ý The natural basis given in [28, 29] assigns the same weight to each parallel flat. 113 whether can be described by a polynomial of degree at most _ . If is indeed a polynomial of degree at most _ , our algorithm always accepts, and if has a relative Hamming distance at least from every degree _ polynomial, then our algorithm rejects with probability Z at least . (In the case _]RUSIzG , our tester still works but more efficient testers are known). Our result is almost optimal since any such testing algorithm must query in at ] Z least s n U p X many points (see Corollary 7.5). We extend our analysis also to obtain a self-corrector for (as defined in [21]), in case the function is reasonably close to a degree _ polynomial. Specifically, we show that the value of the function at any given point ¬+l may be obtained with good probability random points. Using pairwise independence we can achieve by querying on yU even higher probability by querying on U random points and using majority logic decoding. ý 7.1.3 Overview of the Analysis The design of our tester and its analysis follows the following general paradigm first formalized by Rubinfeld and Sudan [93]. The analysis also uses additional ideas used in [1]. In this section, we review the main steps involved. The first step is coming up with an exact characterization for functions that have low degree. 
The characterization identifies a collection of subsets of points and a predicate, such that an input function is of low degree if and only if, for every subset in the collection, the predicate is satisfied by the evaluation of the function at the points of the subset. The second step entails showing that the characterization is robust; that is, the following natural tester is indeed a local tester (see Section 2.3 for a formal definition): pick one of the subsets in the collection uniformly at random and check whether the predicate is satisfied by the evaluation of the function on the points in the chosen subset. Note that the number of queries made by the tester is bounded above by the size of the largest subset in the collection.

There is a natural characterization of polynomials of low degree via their alternative interpretation as a Reed-Muller (RM) code: since the RM code is linear, a function is of low degree if and only if it is orthogonal to every codeword in the dual of the corresponding RM code. The problem with this characterization is that the resulting local tester would have to make as many queries as the maximum number of non-zero positions in a dual codeword, which can be large. To get around this problem, instead of considering all codewords in the dual of the RM code, we consider a collection of dual codewords that have few non-zero positions. For the characterization to remain exact, this collection has to generate the dual code. We use the well-known fact that the dual of an RM code is again an RM code (with different parameters). Thus, to obtain a collection of low-weight dual codewords that generates the dual of an RM code, it is enough to find low-weight codewords that generate every RM code. To this end we show that the characteristic vector of any affine subspace (also called a flat in RM terminology [69]) generates certain RM codes.
To complete the characterization, we show that any RM code can be generated by flats together with certain weighted characteristic vectors of affine subspaces (which we call pseudoflats). To prove this, we view the affine subspaces as intersections of (a fixed number of) hyperplanes, and alternatively represent the characteristic vectors as polynomials.

To prove that the above exact characterization is robust, we use the self-correcting approach of [21, 93]. Given an input f, we define a related function g as follows: the value g(x) is defined to be the most frequently occurring value, or plurality, of f at suitably correlated random points. The major part of the analysis is to show that if f disagrees with every low-degree polynomial in many places, then the tester rejects with high probability. The analysis proceeds by first showing that f and g agree on most points. We then show that if the tester rejects with low enough probability, then g is a low-degree polynomial; in other words, if f is far enough from all low-degree polynomials, then the tester rejects with high probability. To complete the proof, the case in which f is close to some low-degree polynomial is handled separately.

7.2 Preliminaries

Throughout this chapter, p denotes a prime and q = p^s (for some positive integer s) denotes a prime power. In this chapter we will mostly deal with prime fields, and we therefore restrict most definitions to the prime-field setting. For any t ∈ [n(p−1)], let P_t denote the family of all functions over F_p^n that are polynomials of total degree at most t (and, w.l.o.g., individual degree at most p−1) in n variables. In particular, f ∈ P_t if there exist coefficients a_e ∈ F_p, for every exponent vector e = (e_1, ..., e_n) with e_i ∈ {0, 1, ..., p−1} and Σ_i e_i ≤ t, such that

    f(x_1, ..., x_n) = Σ_e a_e · Π_{i=1}^n x_i^{e_i}.        (7.1)

The codeword corresponding to a function f will be the evaluation vector of f.
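This monomial/evaluation-vector correspondence can be checked directly on a tiny instance: the dimension of the space of evaluation vectors of degree-≤t polynomials equals the number of reduced monomials of total degree at most t. A minimal sketch (the parameters p = 3, n = 2 and the helper names are illustrative assumptions):

```python
from itertools import product

p, n = 3, 2
pts = list(product(range(p), repeat=n))

def eval_vec(e):
    """Evaluation vector of the monomial x_1^{e_1} x_2^{e_2} (pow(0,0)=1)."""
    return [pow(x[0], e[0], p) * pow(x[1], e[1], p) % p for x in pts]

def monomials(t):
    # Exponent vectors with individual degree <= p-1 and total degree <= t.
    return [e for e in product(range(p), repeat=n) if sum(e) <= t]

def rank_mod_p(rows):
    # Gaussian elimination over F_p.
    rows, rank = [list(r) for r in rows], 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                c = rows[i][col]
                rows[i] = [(a - c * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

# dim RM_p(t, n) should equal the number of reduced monomials of degree <= t.
dims = {t: rank_mod_p([eval_vec(e) for e in monomials(t)]) for t in range(5)}
```

For p = 3, n = 2 this gives dimensions 1, 3, 6, 8, 9 for t = 0, ..., 4; in particular t = n(p−1) = 4 spans all p^n = 9 coordinates.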
We recall the definition of the (primitive) Reed-Muller code as described in [69, 29].

Definition 7.1. Let V = F_q^n be the vector space of n-tuples over the field F_q, for n ≥ 1. For any r with 0 ≤ r ≤ n(q−1), the r-th order Reed-Muller code RM_q(r, n) is the subspace of F_q^{q^n} consisting of (the evaluation vectors of) all n-variable polynomial functions (reduced modulo x_i^q − x_i) of degree at most r.

This implies that the code corresponding to the family of functions P_t is RM_p(t, n). Therefore, a characterization for one simply translates into a characterization for the other.

We will be using terminology defined in Section 2.3; we now briefly review the definitions that are relevant to this chapter. For any two functions f, g : F_p^n → F_p, the relative distance δ(f, g) ∈ [0, 1] between f and g is defined as δ(f, g) := Pr_{x ∈ F_p^n}[f(x) ≠ g(x)]. For a function f and a family of functions F (defined over the same domain and range), we say f is ε-close to F, for some 0 < ε < 1, if there exists a g ∈ F with δ(f, g) ≤ ε; otherwise f is ε-far from F. A one-sided testing algorithm (one-sided tester) for P_t is a probabilistic algorithm that is given query access to a function f and a distance parameter ε, 0 < ε < 1. If f ∈ P_t, then the tester should always accept f (perfect completeness), and if f is ε-far from P_t, then the tester should reject f with probability at least 1/2.

For vectors x, y ∈ F_p^N, the dot (scalar) product of x and y, denoted x·y, is defined to be Σ_{i=1}^N x_i y_i, where x_i denotes the i-th coordinate of x.

To motivate the next notation, which we will use frequently, we give a definition.

Definition 7.2. For any 0 ≤ k ≤ n, a k-flat in F_p^n is a k-dimensional affine subspace. Let y_1, ..., y_k ∈ F_p^n be linearly independent vectors and b ∈ F_p^n be a point. Then the subset

    L = { Σ_{i=1}^k c_i y_i + b : c_i ∈ F_p for all i ∈ [k] }

is a k-dimensional flat. We will say that L is generated by y_1, ..., y_k at b. The incidence vector of the points in a given k-flat will be referred to as the codeword corresponding to the given k-flat.

Given a function f : F_p^n → F_p, for y_1, ..., y_ℓ, b ∈ F_p^n we define

    T_f(y_1, ..., y_ℓ, b) := Σ_{c ∈ F_p^ℓ} f(b + Σ_{i=1}^ℓ c_i y_i),        (7.2)

which is the sum of the evaluations of f over the ℓ-flat generated by y_1, ..., y_ℓ at b. Alternatively, as we will see later in Observation 7.4, this can also be interpreted as the dot product of the codeword corresponding to the ℓ-flat generated by y_1, ..., y_ℓ at b and the codeword corresponding to f.

While flats are well known, we also define a new geometric object, called a pseudoflat. A k-pseudoflat is a union of p−1 parallel (k−1)-flats.

Definition 7.3. Let L_1, ..., L_{p−1} be parallel (k−1)-flats (k ≥ 1) such that, for some y ∈ F_p^n and all j ∈ [p−2], L_{j+1} = y + L_j, where for any set S ⊆ F_p^n and y ∈ F_p^n we define y + S := { y + x : x ∈ S }. We define a k-pseudoflat to be the union of the sets of points L_1 to L_{p−1}. Further, given an r (where 1 ≤ r ≤ p−2) and a k-pseudoflat, we define its (k, r)-pseudoflat vector as follows. Let I_j be the incidence vector of L_j, for j ∈ [p−1]. Then the (k, r)-pseudoflat vector is defined to be Σ_{j=1}^{p−1} j^r · I_j. We will also refer to the (k, r)-pseudoflat vector as a codeword.

Let K be a k-pseudoflat, the union of the (k−1)-flats L_1, ..., L_{p−1}, where L_j is generated by y_2, ..., y_k at b + j·y_1 and y_1, ..., y_k are linearly independent. Then we say that the (k, r)-pseudoflat vector corresponding to K, as well as the pseudoflat K itself, are generated by y_1, ..., y_k at b, exponentiated along y_1.

See Figure 7.1 for an illustration of Definition 7.3.

[Figure 7.1: Illustration of a 2-pseudoflat K over F_p (each L_j is a 1-flat, i.e., a line). The left picture shows the points of K; the right picture shows the corresponding (2, 1)-pseudoflat vector.]

Given a function f : F_p^n → F_p, for y_1, ..., y_ℓ,
b ∈ F_p^n, and any i ∈ [p−2], we define

    T_f^i(y_1, ..., y_ℓ, b) := Σ_{c ∈ F_p^ℓ} c_1^i · f(b + Σ_{j=1}^ℓ c_j y_j)        (7.3)

(by the convention 0^0 = 1, we have T_f^0 = T_f). As we will see later in Observation 7.5, this can also be interpreted as the dot product of the codeword corresponding to the (ℓ, i)-pseudoflat vector generated by y_1, ..., y_ℓ at b, exponentiated along y_1, and the codeword corresponding to f.

7.2.1 Facts from Finite Fields

In this section we spell out some facts about finite fields that will be used later. We begin with a simple lemma.

Lemma 7.1. For any t ∈ [q−1], Σ_{a ∈ F_q} a^t ≠ 0 if and only if t = q−1.

Proof. First note that Σ_{a ∈ F_q} a^t = Σ_{a ∈ F_q^*} a^t. Observing that a^{q−1} = 1 for any a ∈ F_q^*, it follows that Σ_{a ∈ F_q} a^{q−1} = q−1 = −1 ≠ 0.

Next we show that Σ_{a ∈ F_q} a^t = 0 for all t ∈ [q−2]. Let g be a generator of F_q^*. The sum can be re-written as Σ_{j=0}^{q−2} g^{jt} = (g^{(q−1)t} − 1)/(g^t − 1). The denominator is non-zero for t ∈ [q−2], and thus the fraction is well defined. The proof is complete by noting that g^{(q−1)t} = 1.

This immediately implies the following lemma.

Lemma 7.2. Let t_1, ..., t_ℓ ∈ [q−1]. Then

    Σ_{c_1, ..., c_ℓ ∈ F_q} c_1^{t_1} · c_2^{t_2} ⋯ c_ℓ^{t_ℓ} ≠ 0  if and only if  t_1 = t_2 = ⋯ = t_ℓ = q−1.        (7.4)

Proof. Note that the left-hand side can be rewritten as Π_{i=1}^ℓ ( Σ_{c_i ∈ F_q} c_i^{t_i} ).

We will need to transform products of variables into powers of linear functions in those variables. With this motivation, we present the following identity.

Lemma 7.3. For each ℓ with 0 < ℓ ≤ p−1, there exists c_ℓ ∈ F_p^* such that

    Σ_{i=1}^{ℓ} (−1)^{ℓ−i} S_i = c_ℓ · Π_{i=1}^{ℓ} x_i,  where  S_i = Σ_{T ⊆ [ℓ], |T| = i} ( Σ_{j ∈ T} x_j )^ℓ.        (7.5)

Proof. Consider the left-hand side of (7.5). Note that all the monomials appearing in it have degree exactly ℓ. Also note that Π_{i=1}^ℓ x_i appears only in S_ℓ and nowhere else. Now consider any other monomial of degree ℓ, with a support of size s, where 0 < s < ℓ: w.l.o.g. assume that this monomial is x_1^{i_1} x_2^{i_2} ⋯ x_s^{i_s}, with i_1 + ⋯ + i_s = ℓ.
Now note that, for any T ⊆ [ℓ] with [s] ⊆ T, this monomial appears with coefficient (ℓ choose i_1, ..., i_s) in the expansion of (Σ_{j∈T} x_j)^ℓ (and with coefficient 0 otherwise). Further, for every j ≥ s, the number of choices of T ⊆ [ℓ] with |T| = j and [s] ⊆ T is exactly (ℓ−s choose j−s). Therefore, summing up the coefficients of the monomial in the various summands S_j (along with the (−1)^{ℓ−j} factors), we get that its coefficient on the left-hand side of (7.5) is

    (ℓ choose i_1, ..., i_s) · Σ_{j=s}^{ℓ} (−1)^{ℓ−j} (ℓ−s choose j−s) = (ℓ choose i_1, ..., i_s) · (1−1)^{ℓ−s} = 0.

Moreover, it is clear that c_ℓ ≡ ℓ! (mod p), and c_ℓ ≠ 0 for this choice of ℓ.

7.3 Characterization of Low Degree Polynomials over F_p

In this section we present an exact characterization of the family P_t over prime fields. Specifically, we prove the following.

Theorem 7.1. Let t = (p−1)·k + R. (Note 0 ≤ R ≤ p−2.) Let r = p−2−R. Then a function f belongs to P_t if and only if, for every y_1, ..., y_{k+1}, b ∈ F_p^n, we have

    T_f^r(y_1, ..., y_{k+1}, b) = 0.        (7.6)

As mentioned previously, a characterization of the family P_t is equivalent to a characterization of RM_p(t, n), and it turns out to be easier to characterize the latter. Our goal is therefore to determine whether a given word belongs to the RM code. Since we deal with a linear code, a simple strategy is to check whether the given word is orthogonal to all the codewords in the dual code. Though this yields a characterization, it is computationally inefficient. Note, however, that the dot product is linear in its input; therefore, checking orthogonality against a basis of the dual code suffices. To make the test computationally efficient, we look for a basis of small-weight codewords. The above theorem is essentially a clever restatement of this idea.

We recall the following useful lemma, which can be found in [69].

Lemma 7.4. RM_q(r, n) is a linear code with block length q^n and minimum distance (R+1)·q^Q, where R is the remainder and Q the quotient resulting from dividing n(q−1) − r by q−1.
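The minimum-distance formula of Lemma 7.4 can be verified exhaustively on a small code. The sketch below (the instance p = 3, n = 2 and all names are illustrative assumptions) enumerates every codeword of RM_3(t, 2) for t = 0, ..., 3 and compares the minimum weight against (R+1)·p^Q:

```python
from itertools import product

p, n = 3, 2
pts = list(product(range(p), repeat=n))

def eval_vec(e):
    return [pow(x[0], e[0], p) * pow(x[1], e[1], p) % p for x in pts]

def min_weight(t):
    """Minimum Hamming weight over all non-zero codewords of RM_p(t, n)."""
    basis = [eval_vec(e) for e in product(range(p), repeat=n) if sum(e) <= t]
    best = len(pts)
    for coeffs in product(range(p), repeat=len(basis)):
        if any(coeffs):
            w = sum(1 for j in range(len(pts))
                    if sum(c * b[j] for c, b in zip(coeffs, basis)) % p)
            best = min(best, w)
    return best

def predicted(t):
    # Lemma 7.4: distance (R+1) * p^Q, with Q, R the quotient and
    # remainder of dividing n(p-1) - t by p-1.
    Q, R = divmod(n * (p - 1) - t, p - 1)
    return (R + 1) * p ** Q

observed = [min_weight(t) for t in range(4)]
expected = [predicted(t) for t in range(4)]
```

For this instance the distances come out as 9, 6, 3, 2 for t = 0, 1, 2, 3.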
Then RM_q(r, n)^⊥ = RM_q(n(q−1) − r − 1, n). Since the dual of an RM code is again an RM code (of the appropriate order), we need generators of RM codes of arbitrary order. We first establish that flats and pseudoflats (of suitable dimension and exponent) do indeed generate the Reed-Muller codes. We then end the section with a proof of Theorem 7.1 and a few remarks.

We begin with a few simple observations about flats. Note that an ℓ-flat L is the intersection of n−ℓ hyperplanes in general position. Equivalently, it consists of all points x that satisfy n−ℓ linear equations over F_p (one equation for each hyperplane):

    for all i ∈ [n−ℓ]:  Σ_{s=1}^n w_{is} x_s = b_i,

where (w_{is})_s and b_i define the i-th hyperplane. General position means that the matrix {w_{is}} has rank n−ℓ. Note that then the characteristic function (and, by abuse of notation, the incidence vector) of L can be written as

    L(x_1, ..., x_n) = Π_{i=1}^{n−ℓ} ( 1 − ( Σ_{s=1}^n w_{is} x_s − b_i )^{p−1} ),        (7.7)

which equals 1 if x ∈ L and 0 otherwise.

We now record a lemma that will be used later in this section.

Lemma 7.5. For ℓ < n, the incidence vector of any (ℓ+1)-flat is a linear sum of the incidence vectors of ℓ-flats.

Proof. Let W be an (ℓ+1)-flat; we want to show that its incidence vector is a linear combination of those of ℓ-flats. Let W be generated by y_1, ..., y_{ℓ−1}, z_1, z_2 at b. For each non-zero vector c = (c_1, c_2) ∈ F_p^2, define

    v_c := c_1 z_1 + c_2 z_2.

Clearly there are p^2 − 1 such v_c. Now, for each such c, define an ℓ-flat W_c generated by y_1, ..., y_{ℓ−1}, v_c at b. Denoting the incidence vector of a flat W by I_W, we claim that

    I_W = (p−1) · Σ_{c ≠ 0} I_{W_c}.        (7.8)

Since the vectors y_1, ..., y_{ℓ−1}, z_1, z_2 are all linearly independent, we can divide the proof into three sub-cases (points outside W contribute 0 to both sides):

- x ∈ W is of the form b + Σ_i γ_i y_i, for some γ_1, ..., γ_{ℓ−1} ∈ F_p: then each flat W_c contributes 1 to the right-hand side of (7.8), and therefore the right-hand side is (p−1)(p^2−1) = 1 in F_p.

- x ∈ W is of the form b + δ_1 z_1 + δ_2 z_2 with (δ_1, δ_2) ≠ 0: then the flats that contribute are exactly those with v_c = λ(δ_1 z_1 + δ_2 z_2), for λ = 1, ..., p−1. Therefore, the right-hand side of (7.8) is (p−1)(p−1) = 1 in F_p.

- x ∈ W is of the form b + Σ_i γ_i y_i + δ_1 z_1 + δ_2 z_2 with (δ_1, δ_2) ≠ 0: then, again, exactly the flats with v_c = λ(δ_1 z_1 + δ_2 z_2), λ ∈ F_p^*, contribute, and the right-hand side of (7.8) is (p−1)(p−1) = 1 in F_p.

As mentioned previously, we give an explicit basis for RM_p(r, n). For the special case p = 2, our basis coincides with the min-weight basis given in [29].² In general, however, our basis differs from the min-weight basis provided in [29]. The following proposition shows that the incidence vectors of flats form a basis for the Reed-Muller codes of orders that are multiples of p−1.

Proposition 7.6. RM_p((p−1)(n−ℓ), n) is generated by the incidence vectors of the ℓ-flats.

²The equations of the hyperplanes are slightly different in our case; nonetheless, both of them define the same basis generated by the min-weight codewords.

Proof. We first show that the incidence vectors of the ℓ-flats are in RM_p((p−1)(n−ℓ), n). Recall that an ℓ-flat is the intersection of n−ℓ independent hyperplanes. Therefore, using (7.7), its incidence vector can be represented by a polynomial of degree at most (n−ℓ)(p−1) in x_1, ..., x_n. Therefore the incidence vectors of ℓ-flats are in RM_p((p−1)(n−ℓ), n).

We prove that RM_p((p−1)(n−ℓ), n) is generated by ℓ-flats by induction on n−ℓ. When n−ℓ = 0, the code consists of the constants, which are clearly generated by the n-flat, i.e., the whole space.

To prove the claim for an arbitrary n−ℓ > 0, we show that any monomial of total degree at most (p−1)(n−ℓ) can be written as a linear sum of the incidence vectors of ℓ-flats. Let the monomial be x_1^{e_1} ⋯ x_m^{e_m}. Rewrite the monomial as a product in which each variable x_j is repeated e_j times, and group the factors into products of p−1 (not necessarily distinct) variables, as far as possible. Rewrite each complete group using (7.5) with ℓ = p−1.
For any incomplete group, of size d < p−1, use the same identity after setting the last p−1−d variables to the constant 1. After expansion, the monomial can be seen to be a sum of products of at most n−ℓ linear terms, each raised to the power p−1. We can add to it a polynomial of degree at most (p−1)(n−ℓ) − 1 so as to represent the resulting polynomial as a sum of polynomials, each of the form (7.7). Each such non-zero polynomial is the incidence vector of an ℓ'-flat, ℓ' ≥ ℓ. By induction, the polynomial we added is generated by (ℓ+1)-flats. Thus, by Lemma 7.5, our given monomial is generated by ℓ-flats.

This leads to the following observation.

Observation 7.4. Consider an ℓ-flat generated by y_1, ..., y_ℓ at b, and denote the incidence vector of this flat by I. Then the right-hand side of (7.2) may be identified as I · f, where I and f denote the vectors corresponding to the respective codewords and · is the dot (scalar) product.

To generate a Reed-Muller code of arbitrary order, we need pseudoflats. Note that the points of a k-pseudoflat may alternatively be viewed as the union of intersections of n−k+1 hyperplanes, where the union is parameterized by a hyperplane that does not take one particular value. Concretely, it is the set of points x which satisfy the following constraints over F_p:

    Σ_{s=1}^n w_{is} x_s = b_i for all i ∈ [n−k],  and  Σ_{s=1}^n w_s x_s ≠ b_0.

Thus the values taken by the points of a k-pseudoflat in its corresponding (k, r)-pseudoflat vector are given by the polynomial

    Π_{i=1}^{n−k} ( 1 − ( Σ_{s=1}^n w_{is} x_s − b_i )^{p−1} ) · ( Σ_{s=1}^n w_s x_s − b_0 )^r.        (7.9)

Remark 7.1. Note the difference between (7.9) and the basis polynomial in [29] that (along with the action of the affine general linear group) yields the min-weight codewords:

    g(x_1, ..., x_n) = Π_{i=1}^{n−k} ( 1 − (x_i − u_i)^{p−1} ) · Π_{j=1}^{r} ( x_{n−k+1} − v_j ),

where u_1, ..., u_{n−k}, v_1, ..., v_r ∈ F_p.

The next claim shows that the code generated by the incidence vectors of k-flats is a subcode of the code generated by the (k, r)-pseudoflat vectors.
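Proposition 7.6 and Observation 7.4 can be checked exhaustively for 1-flats (lines) in a tiny case. The sketch below (the instance p = 3, n = 2, ℓ = 1 and all names are illustrative assumptions) verifies that the incidence vectors of lines span exactly RM_3((p−1)(n−ℓ), n) = RM_3(2, 2), by comparing ranks over F_3:

```python
from itertools import product

p, n = 3, 2
pts = list(product(range(p), repeat=n))

def rank_mod_p(rows):
    # Gaussian elimination over F_p.
    rows, rank = [list(r) for r in rows], 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                c = rows[i][col]
                rows[i] = [(a - c * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

# All distinct 1-flats (lines) {b + c*y : c in F_p} in F_3^2, as point sets.
lines = {frozenset(tuple((b[j] + c * y[j]) % p for j in range(n))
                   for c in range(p))
         for y in pts if y != (0, 0) for b in pts}
line_vecs = [[1 if x in L else 0 for x in pts] for L in lines]

# Monomial basis of RM_3(2, 2) = RM_p((p-1)(n-1), n).
rm_basis = [[pow(x[0], e[0], p) * pow(x[1], e[1], p) % p for x in pts]
            for e in product(range(p), repeat=n) if sum(e) <= 2]

ranks = (rank_mod_p(line_vecs), rank_mod_p(rm_basis),
         rank_mod_p(line_vecs + rm_basis))
```

The three ranks coincide (all equal to 6), so the span of the 12 line vectors and RM_3(2, 2) contain each other.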
Claim 7.7. The (k, r)-pseudoflat vectors, where k ≥ 1 and r ∈ [p−2], generate a code containing the incidence vectors of the k-flats.

Proof. Let I be the incidence vector of a k-flat L generated by y_1, ..., y_k at b. Since a pseudoflat vector corresponding to a k-pseudoflat (as well as the incidence vector of a k-flat) assigns the same value to all points in the same (k−1)-flat, we can describe I (as well as any (k, r)-pseudoflat vector) by the values it takes on each of the p parallel (k−1)-flats generated by y_2, ..., y_k at b + m·y_1, for m = 0, ..., p−1. In particular, I = (1, 1, ..., 1) in this notation. For each s ∈ F_p, let E_s be the pseudoflat generated by y_1, ..., y_k at b + s·y_1, exponentiated along y_1, and let e_s be the corresponding (k, r)-pseudoflat vector. By Definition 7.3, e_s assigns the value j^r to the (k−1)-flat generated by y_2, ..., y_k at b + (s+j)·y_1. Rewriting e_s in terms of its values on the p parallel (k−1)-flats listed above yields e_s = ((0−s)^r, (1−s)^r, ..., (p−1−s)^r). Let λ_0, ..., λ_{p−1} denote p variables taking values in F_p. Then a solution to the system of equations

    Σ_{s ∈ F_p} λ_s (m−s)^r = 1  for every 0 ≤ m ≤ p−1

implies that Σ_s λ_s e_s = I, which suffices to establish the claim. Consider the identity

    Σ_{s ∈ F_p} s^{p−1−r} (m−s)^r = (−1)^{r+1}  for every m ∈ F_p,

which may be verified by expanding and applying Lemma 7.1. Setting λ_s = (−1)^{r+1} s^{p−1−r} establishes the claim.

The next proposition complements Proposition 7.6. Together they say that, by choosing pseudoflats appropriately, Reed-Muller codes of any given order can be generated. This gives an equivalent representation of Reed-Muller codes; an exact characterization then follows from this alternate representation.

Proposition 7.8. For every r ∈ [p−2], the linear code generated by the (k, r)-pseudoflat vectors is equivalent to RM_p((p−1)(n−k) + r, n).

Proof. For the forward direction, consider a k-pseudoflat. Its corresponding (k, r)-pseudoflat vector is given by an equation of the form (7.9). Thus the codeword corresponding to the evaluation vector of this pseudoflat can be represented by a polynomial of degree at most (p−1)(n−k) + r.
This completes the forward direction. Since monomials of degree at most (p−1)(n−k) are generated by the incidence vectors of k-flats, Claim 7.7 establishes the proposition for such monomials. Thus, to prove the other direction of the proposition, we restrict our attention to monomials of degree at least (p−1)(n−k) + 1 and show that these monomials are generated by (k, r)-pseudoflat vectors. Consider any such monomial, and let its degree be (p−1)(n−k) + r', with 1 ≤ r' ≤ r. Rewrite it as in Proposition 7.6. Since the degree of the monomial is (p−1)(n−k) + r', we will be left with an incomplete group of degree r'. We make any incomplete group complete by adding 1's (as necessary) to the product. We then use Lemma 7.3 to rewrite each complete group as a linear sum of powered terms. After expansion, the monomial can be seen to be a sum of products of at most n−k linear terms raised to the power p−1, times a linear term raised to the power r'. Each such polynomial is generated either by a (k, r')-pseudoflat vector or by a k-flat. Claim 7.7 completes the proof.

The following is analogous to Observation 7.4.

Observation 7.5. Consider a k-pseudoflat generated by y_1, ..., y_k at b, exponentiated along y_1, and let e be its corresponding (k, r)-pseudoflat vector. Then the right-hand side of (7.3) may be interpreted as e · f.

Now we prove the exact characterization.

Proof of Theorem 7.1: The proof follows directly from Lemma 7.4, Propositions 7.6 and 7.8, and Observations 7.4 and 7.5. Indeed, by Observations 7.4 and 7.5, the tests in (7.6) determine precisely whether the dot product of f with every vector in a generating set of the dual space of RM_p(t, n) evaluates to zero: since (p−1)n − t − 1 = (p−1)(n−k−1) + r, the dual code is generated by the (k+1, r)-pseudoflat vectors when r ≥ 1 (Proposition 7.8) and by the incidence vectors of (k+1)-flats when r = 0 (Proposition 7.6). ∎

Remark 7.2. One can obtain an alternate characterization from Remark 7.1, which we state here without proof. Let t = (p−1)·k + R (note 0 ≤ R ≤ p−2), and let r = p−2−R. Fix a set S ⊆ F_p with |S| = r. Define the polynomial g_S(x) := Π_{u ∈ S} (x − u) if S is non-empty, and g_S(x) := 1 otherwise.
Then a function f belongs to P_t if and only if, for every y_1, ..., y_{k+1}, b ∈ F_p^n, we have

    Σ_{c ∈ F_p^{k+1}} g_S(c_1) · f(b + Σ_{i=1}^{k+1} c_i y_i) = 0.

Moreover, this characterization can also be extended to certain degrees over more general fields, i.e., over F_q (see the next remark).

Remark 7.3. The exact characterization of low-degree polynomials claimed in [42] may be proved using duality. Note that their proof works as long as the dual code has a min-weight basis (see [29]). Suppose the polynomial has degree d ≤ q − q/p − 1; then the dual of RM_q(d, n) is RM_q((q−1)n − d − 1, n) and therefore has a min-weight basis. Note that the dual code then has minimum weight d+2. Therefore, assuming that the minimum-weight codewords constitute a basis (that is, the span of all codewords with the minimum Hamming weight is the same as the code), any d+2 evaluations of the original polynomial on a line are dependent, and vice versa.

7.4 A Tester for Low Degree Polynomials over F_p

In this section we present and analyze a one-sided tester for P_t. The analysis of the algorithm roughly follows the proof structure given in [93, 1]. We emphasize that the generalization from [1] to our case is not straightforward. As in [93, 1], we define a self-corrected version of the (possibly corrupted) function being tested. A straightforward adaptation of the analysis given in [93] gives reasonable bounds; a better bound is achieved by following the techniques developed in [1]. There, it is shown that the self-corrector function can be interpolated with overwhelming probability. However, their approach appears to use special properties of F_2, and it is not clear how to generalize their technique to arbitrary prime fields. We give a clean formulation that relies on representing the flats through polynomials, as described earlier. In particular, Claims 7.14 and 7.15 and their generalizations appear to require our new polynomial-based view.
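The duality fact that drives the characterization of Section 7.3, RM_p(t, n)^⊥ = RM_p((p−1)n − t − 1, n), can be sanity-checked directly on a small case before turning to the tester. A minimal sketch (the parameters p = 3, n = 2 and the helper names are illustrative assumptions):

```python
from itertools import product

p, n = 3, 2
pts = list(product(range(p), repeat=n))

def basis(t):
    """Monomial evaluation vectors spanning RM_p(t, n)."""
    return [[pow(x[0], e[0], p) * pow(x[1], e[1], p) % p for x in pts]
            for e in product(range(p), repeat=n) if sum(e) <= t]

def all_orthogonal(B, D):
    return all(sum(a * b for a, b in zip(u, v)) % p == 0
               for u in B for v in D)

checks = {}
for t in range(4):
    B, D = basis(t), basis((p - 1) * n - t - 1)
    # Orthogonality plus complementary dimensions pin down the dual code.
    checks[t] = (all_orthogonal(B, D), len(B) + len(D) == p ** n)
```

For every t, each pair of basis vectors is orthogonal and the two dimensions add up to the block length p^n, which together identify RM_p((p−1)n − t − 1, n) as the dual.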
7.4.1 Tester in F_p

In this subsection we describe the algorithm when the underlying field is F_p.

Algorithm Test-P_t in F_p:
0. Let t = (p−1)·k + R, 0 ≤ R < p−1. Denote r = p−2−R.
1. Uniformly and independently at random select y_1, ..., y_{k+1}, b ∈ F_p^n.
2. If T_f^r(y_1, ..., y_{k+1}, b) ≠ 0, then reject; else accept.

Theorem 7.2. The algorithm Test-P_t in F_p is a one-sided tester for P_t. Moreover, if f is at relative distance δ from P_t, a single invocation rejects with probability at least min( c·p^{k+1}·δ, c/((k+1)·p^{k+1}) ), for some constant c > 0.

Corollary 7.3. Repeating the algorithm Test-P_t Θ( 1/(p^{k+1}·ε) + (k+1)·p^{k+1} ) times, the probability of error can be reduced to less than 1/2.

We will provide a general proof framework. However, for the ease of exposition, we prove the main technical lemmas for the case of F_3; the proof idea in the general case is similar, and the details are omitted. Therefore we will essentially prove the following.

Theorem 7.4. The algorithm Test-P_t in F_3 is a one-sided tester for P_t. Moreover, if f is at relative distance δ from P_t, a single invocation rejects with probability at least min( c·3^{k+1}·δ, c/((k+1)·3^{k+1}) ), for some constant c > 0.

7.4.2 Analysis of Algorithm Test-P_t

In this subsection we analyze the algorithm described in Section 7.4.1. From Theorem 7.1 it is clear that if f ∈ P_t, then the tester accepts. Thus, the bulk of the proof is to show that if f is ε-far from P_t, then the tester rejects with significant probability. Our proof structure follows that of the analysis of the test in [1]. In particular, let f be the function to be tested for membership in P_t, and assume we perform the test T_f^i for the appropriate i (namely i = r) as required by the algorithm described in Section 7.4.1. For such an i, we define a function g_i : F_p^n → F_p in terms of the values f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1). For a random variable X, denote by plurality_X[·] the most frequently occurring value of the bracketed expression over the randomness of X, with ties broken arbitrarily.
With this meaning of plurality, for all i ∈ [p−2] ∪ {0}, g_i can be written as

    g_i(y) := plurality_{y_1, ..., y_{k+1}} [ f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ].        (7.10)

Further, we define

    δ_i := Pr_{y_1, ..., y_{k+1}, b} [ T_f^i(y_1, ..., y_{k+1}, b) ≠ 0 ],        (7.11)

which is exactly the probability that a single invocation of the test rejects. The next lemma follows from a Markov-type argument.

Lemma 7.9. For a fixed f : F_p^n → F_p, let g_i and δ_i be defined as above. Then δ(f, g_i) ≤ 2δ_i.

Proof. If, for some y ∈ F_p^n, we have Pr_{y_1,...,y_{k+1}}[ f(y) = f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ] > 1/2, then g_i(y) = f(y). Thus, we only need to worry about the set of points y such that Pr_{y_1,...,y_{k+1}}[ f(y) = f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ] ≤ 1/2. If the fraction of such points were more than 2δ_i, then, since for uniformly random y, y_1, ..., y_{k+1} the tuple (y − y_1, y_2, ..., y_{k+1}, y_1) is itself uniformly distributed, we would have

    δ_i = Pr_{y, y_1, ..., y_{k+1}} [ T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ≠ 0 ] > (2δ_i)·(1/2) = δ_i,

a contradiction. Therefore we obtain δ(f, g_i) ≤ 2δ_i.

Note that, by the definition of plurality, Pr_{y_1,...,y_{k+1}}[ g_i(y) = f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ] ≥ 1/p. We now show that this probability is actually much higher. The next lemma gives a weak bound in that direction, following the analysis in [93]; for the sake of completeness, we present the proof.

Lemma 7.10. For every y ∈ F_p^n,

    Pr_{y_1, ..., y_{k+1}} [ g_i(y) = f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ] ≥ 1 − 2·p^{k+1}·δ_i.

Proof. We use ȳ = (y_1, ..., y_{k+1}) and ỹ = (ỹ_1, ..., ỹ_{k+1}) to denote tuples of k+1 vectors from F_p^n, drawn uniformly and independently, and consider the two corresponding candidate values for g_i(y),

    v(ȳ) := f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1)  and  v(ỹ) := f(y) − T_f^i(y − ỹ_1, ỹ_2, ..., ỹ_{k+1}, ỹ_1).

Expanding each candidate as a weighted sum of values of f on the points of the corresponding flat through y, and regrouping terms, the difference of the two candidates can be written as a signed sum

    v(ȳ) − v(ỹ) = Σ_{s=1}^{N} ± T_f^i( z_1^{(s)}, ..., z_{k+1}^{(s)}, z^{(s)} ),        (7.12)

where N ≤ 2(p^{k+1} − 1) and each tuple (z_1^{(s)}, ..., z_{k+1}^{(s)}, z^{(s)}), considered in isolation, is uniformly distributed.
Each summand in (7.12) is therefore non-zero with probability at most δ_i. By the union bound,

    Pr_{ȳ, ỹ} [ v(ȳ) = v(ỹ) ] ≥ 1 − 2(p^{k+1} − 1)·δ_i ≥ 1 − 2·p^{k+1}·δ_i.        (7.13)

The lemma now follows from the observation that the probability that the same object is drawn from a set in two independent trials lower-bounds the probability of drawing the most likely object in one trial: suppose the objects are ordered so that q_j is the probability of drawing object j, with q_1 ≥ q_2 ≥ ⋯. Then the probability of drawing the same object twice is Σ_j q_j² ≤ Σ_j q_1·q_j = q_1.

However, when the degree being tested is larger than the field size, we can improve the above lemma considerably. The following lemma strengthens Lemma 7.10 when t ≥ p−1, or equivalently k ≥ 1. We now focus on the F_3 case; the proof appears in Section 7.4.3.

Lemma 7.11. For every y ∈ F_p^n,

    Pr_{y_1, ..., y_{k+1}} [ g_i(y) = f(y) − T_f^i(y − y_1, y_2, ..., y_{k+1}, y_1) ] ≥ 1 − 2(p+1)·δ_i.

Lemma 7.11 will be instrumental in proving the next lemma, which shows that a sufficiently small δ_i implies that g_i is the self-corrected version of the function f (the proof appears in Section 7.4.4).

Lemma 7.12. Over F_3, if δ_i
is at most 1/(c·(k+1)·3^{k+1}) for a suitable absolute constant c, then the function g_i belongs to P_t (assuming n ≥ k+1).

By combining Lemma 7.9 and Lemma 7.12, we obtain that if f is Ω(1/((k+1)·3^{k+1}))-far from P_t, then δ_i is at least Ω(1/((k+1)·3^{k+1})). We next consider the case in which δ_i is small. By Lemma 7.9, in this case the distance δ := δ(f, g_i) is small. The next lemma shows that in this case the test rejects with probability close to 3^{k+1}·δ. This follows from the fact that, in this case, with probability close to 3^{k+1}·δ over the selection of y_1, ..., y_{k+1}, b, the functions f and g_i differ in precisely one of the 3^{k+1} points Σ_i c_i y_i + b (where (c_1, ..., c_{k+1}) ranges over F_3^{k+1}). Observe that if they do, then the test rejects.

Lemma 7.13. Suppose 0 ≤ δ_i ≤ 1/(2·3^{k+1}), and let δ denote the relative distance between f and g_i. Then, when y_1, ..., y_{k+1}, b are chosen uniformly at random, the probability that f(z) ≠ g_i(z) for exactly one point z among the 3^{k+1} points Σ_i c_i y_i + b (where (c_1, ..., c_{k+1}) ∈ F_3^{k+1}) is at least 3^{k+1}·δ/2.

Observe that in this case δ_i is at least Ω(3^{k+1}·δ). The proof of Lemma 7.13 is deferred to Section 7.4.5.

Proof of Theorem 7.4: Clearly, if f belongs to P_t, then by Theorem 7.1 the tester accepts with probability 1. Therefore, assume δ(f, P_t) > 0, and let δ := δ(f, g_r), where r is as in algorithm Test-P_t. If δ_r ≥ 1/(c·(k+1)·3^{k+1}), then the rejection probability is at least 1/(c·(k+1)·3^{k+1}), which matches the second term in the minimum. Otherwise, by Lemma 7.12, g_r ∈ P_t, and hence δ ≥ δ(f, P_t); by Lemma 7.13, the rejection probability δ_r is then at least Ω(3^{k+1}·δ) ≥ Ω(3^{k+1}·δ(f, P_t)), which matches the first term. ∎

Remark 7.4. Theorem 7.2 follows from a similar argument.

7.4.3 Proof of Lemma 7.11

Observe that the goal of Lemma 7.11 is to show that, at any fixed point y, if g_i is interpolated from a random flat, then w.h.p. the interpolated value is the most popular vote. To ensure this, we show that if g_i is interpolated on two independently chosen random flats, then the probability that these interpolated values are the same, that is, the collision probability, is large.
To estimate this collision probability, we show that the difference of the two interpolation values can be rewritten as a sum of values of T_f^i on a small number of random flats. Thus, if the test passes often (that is, if T_f^i evaluates to zero w.h.p.), then this sum, by a simple union bound, evaluates to zero often, which proves that the collision probability is high. The improvement over Lemma 7.10 arises because we express the differences involving T_f^i as a telescoping series, which essentially reduces the number of events in the union bound. To do this we will need the following claims. We note that a similar claim for p = 2 was proven in [1] by expanding the terms on both sides; however, that approach does not give much insight into the general case, i.e., p > 2. We provide proofs that have a much cleaner structure, based on the underlying geometric objects, i.e., flats and pseudoflats.

Claim 7.14. For every j ∈ {2, ..., k+1} and all y, z, w, y_2, ..., y_{j−1}, y_{j+1}, ..., y_{k+1}, b ∈ F_p^n, define

    S(u, v) := T_f(u, y_2, ..., y_{j−1}, v, y_{j+1}, ..., y_{k+1}, b).

Then the following holds:

    S(y, z) − S(y, w) = Σ_{β ∈ F_p^*} [ S(y + βz, w) − S(y + βw, z) ].

Proof. Assume that y, z and w are linearly independent; if not, then both sides are equal and the identity is trivially satisfied. To interpret the left-hand side, recall the definition of T_f and note that the set of points of the flat generated by y, y_2, ..., y_{j−1}, z, y_{j+1}, ..., y_{k+1} at b does not depend on the order of its generators; a similar remark applies to the terms on the right-hand side.

We claim that it is enough to prove the result for k = 1 and b = 0. A linear transformation (that is, renaming the coordinate system appropriately) reduces the case k = 1, b ≠ 0 to the case k = 1, b = 0. We now show how to reduce the case k > 1 to the k = 1 case.
Fix some values c_2, …, c_{k+1} ∈ F_p and note that one can write

c_1·y_1 + c_2·y_2 + ⋯ + c_{k+1}·y_{k+1} + b  as  c_1·y_1 + (b + b'),  where b' = Σ_{i=2}^{k+1} c_i·y_i.

Thus each point of the (k+1)-flat is a point of a 1-flat with base point shifted by b', and each of the H_f terms can be rewritten as a sum, over the p^k choices of (c_2, …, c_{k+1}), of evaluations on 1-flats. Note that for a fixed vector (c_2, …, c_{k+1}), the value of b' is the same in every term. Finally, note that the equality in the general case is satisfied if the corresponding equalities hold in each k = 1 case.

Now consider the space generated by y_1 and w at 0. Note that H_f(y_1; 0) (with y_1 ≠ 0) is just f·I_1, where I_1 is the incidence vector of the flat generated by y_1, cut out by a single linear equation. Over F_p, the incidence vector I_1 is therefore equivalent to a polynomial of the form 1 − ℓ_1^{p−1} for a suitable linear form ℓ_1; similarly, the other term equals f·I_2 with I_2 given by a polynomial 1 − ℓ_2^{p−1}. Identity (7.14) states that the product (1 − ℓ_1^{p−1})(1 − ℓ_2^{p−1}) can be rewritten as a sum, over t ∈ F_p, of products of the correspondingly shifted incidence polynomials. Now observe that each shifted polynomial is again the incidence vector of a flat: for instance, the flat generated by y_1 − t·w and w. Therefore, interpreting (7.14) in terms of incidence vectors of flats, Observation 7.4 completes the proof, assuming (7.14) is true.

We complete the proof by proving (7.14). Consider the sum Σ_{t∈F_p}(u + t·v)^{p−1}. Expanding the terms and rearranging the sums, only the powers of t that are multiples of p − 1 survive; by Lemma 7.1 the sum evaluates to −v^{p−1}, and symmetrically Σ_{t∈F_p}(v + t·u)^{p−1} = −u^{p−1}. Substituting these evaluations into the expansion of the left-hand side proves (7.14).

We will also need the following claim.

Claim 7.15. For every i ∈ {1, …, p−2} and every j ∈ {1, …, k+1}, for every y_1, …, y_{k+1}, b, w ∈ F_p^n, define T^i_j(b; w) from H^i_f exactly as T_j(b; w) was defined from H_f in Claim 7.14. Then there exists c_i ∈ F_p such that T^i_j(b; w) equals c_i times the corresponding sum, over t ∈ F_p, of differences of H^i_f at the shifted base points.
Proof. As in the proof of Claim 7.14, we only need to prove the claim for k = 1 and b = 0. Observe that H^i_f(y_1; 0) = f·E, where E denotes the (1, i)-pseudoflat vector of the pseudoflat generated by y_1 and w at 0, exponentiated along w. As before, E can be identified with a polynomial over F_p, now of the form x^i·(1 − ℓ^{p−1}) for a suitable coordinate function x and linear form ℓ, and similarly for the other terms. To complete the proof, we need to prove a polynomial identity (7.15) analogous to (7.14), which carries an extra multiplicative constant c_i ∈ F_p on its right-hand side.

Before we prove the identity, note that (−1)^{p−1} = 1 in F_p: this holds trivially for p = 2, and for odd p because p − 1 is even. The power sums Σ_{t∈F_p}(u + t·v)^m that arise are evaluated by expanding and applying Lemma 7.1, exactly as in the proof of (7.14). Now consider the double sum (7.16) obtained by expanding the left-hand side of (7.15) with the binomial theorem: exchanging the order of summation, applying Lemma 7.1 to the inner power sums, and collecting the surviving terms, one obtains, after substituting and simplifying, the right-hand side of (7.15), which completes the proof.

Finally, we will also need the following claim.

Claim 7.16. For every i ∈ {1, …, p−2}, every j ∈ {1, …, k+1} and every y_1, …, y_{k+1}, b, w ∈ F_p^n, there exists c_i ∈ F_p such that the difference H^i_f(w, y_2, …, y_{k+1}; b) − H^i_f(y_1, y_2, …, y_{k+1}; b) can be written as a sum, over t ∈ F_p, of differences of H^i_f evaluations at shifted arguments, plus c_i times a correction sum of the same form.

Proof. As in the proof of Claim 7.15, the proof boils down to proving a polynomial identity over F_p. We use the power-sum evaluations above and Claim 7.15 to expand the last term.
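The repeated appeals to Lemma 7.1 in these expansions are, as the surrounding computations suggest, uses of the standard power-sum identity over F_p: for k ≥ 1, Σ_{t∈F_p} t^k equals −1 when p − 1 divides k, and 0 otherwise. This is what makes all but the extreme terms of the binomial expansions vanish. A quick numerical check (the attribution to Lemma 7.1 is an inference from context):

```python
def power_sum(p, k):
    # sum of t^k over all t in the field F_p, reduced mod p
    return sum(pow(t, k, p) for t in range(p)) % p

for p in (5, 7, 11):
    for k in range(1, 2 * p):
        # p - 1 represents -1 in F_p
        expected = (p - 1) if k % (p - 1) == 0 else 0
        assert power_sum(p, k) == expected
```

For k ≥ 1 divisible by p − 1, every nonzero t contributes t^k = 1 by Fermat's little theorem, giving p − 1 ≡ −1; otherwise the terms cancel.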
We need one more simple fact before we can prove Lemma 7.11. For a probability vector v ∈ [0,1]^p (i.e., Σ_{i=1}^p v_i = 1), we have max_i{v_i} = max_i{v_i}·Σ_{i=1}^p v_i ≥ Σ_{i=1}^p v_i², and hence 1 − max_i{v_i} ≤ 1 − ‖v‖₂².

Proof of Lemma 7.11: We first prove the lemma for the interpolation of f. Fix b ∈ F_p^n and let

Y := Pr_{ȳ,z̄}[ H_f(b − y_1, y_2, …, y_{k+1}; y_1) = H_f(b − z_1, z_2, …, z_{k+1}; z_1) ],

where ȳ = (y_1, …, y_{k+1}) and z̄ = (z_1, …, z_{k+1}) are chosen uniformly and independently. Recall that we want to lower bound the collision probability of the interpolated value at b; by the definitions, that collision probability is at least Y. Note that

1 − Y = Pr_{ȳ,z̄}[ H_f(b − y_1, y_2, …, y_{k+1}; y_1) − H_f(b − z_1, z_2, …, z_{k+1}; z_1) ≠ 0 ].

We will now use a hybrid argument. For any choice of y_1, …, y_{k+1} and z_1, …, z_{k+1}, we can write the difference above as a telescoping sum in which consecutive terms differ in a single parameter: starting from H_f(b − y_1, y_2, …, y_{k+1}; y_1), the arguments y_{k+1}, y_k, …, y_2 are replaced one at a time by z_{k+1}, z_k, …, z_2, and finally the remaining arguments involving y_1 are replaced by those involving z_1.

Consider any pair of consecutive terms in the resulting sum. The two H_f evaluations differ only in a single parameter, so we may apply Claim 7.14 and obtain that their difference equals a sum, over t ∈ F_p, of differences of H_f evaluations whose parameters are
shifted accordingly. Recall that b is fixed and y_2, …, y_{k+1}, z_2, …, z_{k+1} ∈ F_p^n are chosen uniformly at random, so all the parameters on the right-hand side of the resulting equation are independent and uniformly distributed. Similarly, one can expand each of the remaining pairs of consecutive terms into four evaluations of H_f, with all parameters being independent and uniformly distributed³. Finally, notice that the parameters in the two evaluations of the last pair are also independent and uniformly distributed. Further recall that, by the definition of η,

Pr[ H_f(u_1, …, u_{k+1}; u) ≠ 0 ] ≤ η

for independent and uniformly distributed arguments u_1, …, u_{k+1}, u. Thus, by the union bound, we have

1 − Y = Pr_{ȳ,z̄}[ H_f(b − y_1, y_2, …, y_{k+1}; y_1) − H_f(b − z_1, z_2, …, z_{k+1}; z_1) ≠ 0 ] ≤ 2(k+1)η.   (7.17)

Therefore Y ≥ 1 − 2(k+1)η. A similar argument proves the lemma for the functions H^i_f. The only catch is that H^i_f(·) is not symmetric, in particular in its first argument. Thus we use the additional identity given in Claim 7.16 to resolve the issue, and get four extra terms compared with the case of H_f, which results in the claimed bound.

Remark 7.5. Analogously, in the case of H^i we have: for every b ∈ F_p^n, the corresponding collision probability is at least 1 − (2(p−1)(k+1) + (p−1) + 1)·η_i. The proof is similar to that of Lemma 7.11, where the analogue of Y is shown to satisfy the same lower bound.

7.4.4 Proof of Lemma 7.12

From Theorem 7.1, it suffices to prove that if η < 1/((2k+5)p^{k+1}) then H_{g_f}(y_1, …, y_{k+1}; b) = 0 for every y_1, …, y_{k+1}, b ∈ F_p^n. Fix the choice of y_1, …, y_{k+1}, b, and define ȳ := (y_1, …, y_{k+1}). We will express H_{g_f}(ȳ; b) as a sum of evaluations of H_f with random arguments.

³Since H_f(·) is symmetric in all but its last argument.
We uniformly select (k+1)² random variables z_{i,j} over F_p^n, for 1 ≤ i ≤ k+1 and 1 ≤ j ≤ k+1, and define z̄_j := (z_{1,j}, …, z_{k+1,j}). We also select uniformly k+1 random variables w_j over F_p^n for 1 ≤ j ≤ k+1. We use the z_{i,j}'s and w_j's to set up the random arguments. Now, by Lemma 7.11, for every α ∈ F_p^{k+1} (i.e., think of α as an ordered (k+1)-tuple over F_p), with probability at least 1 − 2(k+1)η over the choice of the z̄_j's and w_j's, the equality

g_f(α·ȳ + b) = −H_f(α·ȳ + b − α·z̄_1 − w_1, α·z̄_2 + w_2, …, α·z̄_{k+1} + w_{k+1}; α·z̄_1 + w_1)   (7.18)

holds, where for vectors u, v̄ of length k+1 we write u·v̄ = Σ_i u_i v_i. Let E_1 be the event that (7.18) holds for all α ∈ F_p^{k+1}. By the union bound,

Pr[E_1] ≥ 1 − p^{k+1}·2(k+1)η.   (7.19)

Assume that E_1 holds. We now need the following claims. Let c̄ = (c_1, …, c_{k+1}) be a (k+1)-dimensional vector over F_p.

Claim 7.17. If (7.18) holds for all α ∈ F_p^{k+1}, then H_{g_f}(ȳ; b) can be written as a sum, over all c̄ ∈ F_p^{k+1}, of evaluations of H_f whose k+2 arguments are fixed linear combinations of b, the y_i's, the z_{i,j}'s and the w_j's, and are independent and uniformly distributed for each fixed c̄.   (7.20)

Proof. Expand H_{g_f}(ȳ; b) according to the definition of H, substitute (7.18) for each point α·ȳ + b of the flat, and group the resulting terms according to c̄; each group is itself an evaluation of H_f, by the definition of the flat sums.

Claim 7.18. If (7.18) holds for all α ∈ F_p^{k+1}, then H_{g_f}(ȳ; b) also admits the analogous expansion (7.21), in which the arguments of each H_f evaluation are shifted as dictated by Claim 7.14.

Proof. Analogous to the proof of Claim 7.17.

Let E_2 be the event that every evaluation of H_f appearing in the expansions of Claims 7.17 and 7.18 equals 0. By the definition of η and the union bound, we have

Pr[E_2] ≥ 1 − 3p^{k+1}·η.   (7.22)

Suppose that η < 1/((2k+5)p^{k+1}) holds. Then by (7.19) and (7.22), the probability that E_1 and E_2 both hold is strictly positive. In other words, there exists a choice of the z_{i,j}'s and w_j's for which all summands in either Claim 7.17 or Claim 7.18, whichever is appropriate, are 0. This implies that H_{g_f}(y_1, …, y_{k+1}; b) = 0, and hence g_f belongs to P_t.

Remark 7.6. Over F_2 an analogous (and simpler) statement holds, with the corresponding threshold on η.

In the case of general i ≥ 1, we can generalize (7.18) in a straightforward manner. Let E'_1 denote the event that all such events hold. We can similarly obtain

Pr[E'_1] ≥ 1 − p^{k+1}·(2(p−1)(k+1) + (p−1) + 1)·η_i.   (7.23)

Claim 7.19. Assume that the equivalent of (7.18) holds for all α ∈ F_p^{k+1}. Then H^i_g(ȳ; b) can be written as a sum, over all c̄ ∈ F_p^{k+1}, of evaluations of H^i_f with independent, uniformly distributed arguments, with coefficients depending on c_1.   (7.24)

Proof.
ü' Z ^ ' Z !sÒCZ;8ItÒCZµIGPGeZjn <Ò <G Ñ Û ' Z ÒCZP; bb 8 Let #P be the event analogous to the event #J in Claim 7.18. Then by the definition of 8 Î and the union bound, we have Z 8 Τ (7.25) Pr ¸# ¡j|GI!V U ' Z Then if we are given that ÎÐR ¥åZ ¥åZ Z , then the probability that # Z and # ® ' Î p ; hold is strictly positive. Therefore, this implies H P < ;cZ\!äZ! ; Z6! ;³ . ' T 7.4.5 Proof of Lemma 7.13 For each 1z+g ' Z , let O be the indicator random variable whose value is 1 if and only if È À µW1l÷lnð;ò Ï W1Z÷lnò; , where ÷zõÆ?;cZ!¥¤¥¤¥¤!Z; ' Z . Clearly, Pr O pG¡tÝ for every 1 . It follows that the random variable O O which counts theÀ number of points Ç Z À À XÝ . It is not of the required form in which µ/ÇCP Ï ^Ç has expectation ã O0¡³5' Ý difficult to check that the random variables O are pairwise independent, since for any two Z distinct 1-ZoùW1(Z½»Z\!¥¤¤¥¤!f13Îæ ZN and 1Ð.ùiÀ 1иZ\!¥¤¥¤¥¤!f1Ð ZN , the sums Î' Û Z 1(ZÑ Î;XÎjn ; ' ' Z and Î' Û Z 1Ð Î;XÎn; attain each pair of distinct values in " with equal probability when the vectors are chosen randomly and independently. Since O ’s are pairwise independent, À Var O[¡ Var O ¡ . Since O ’s are boolean random variables, we note À À ¡ã O À ¡ItHã( O ¡/ ã O ¡IQâã O ¡/ E¥ã O ¡i¤ À À À À À À n ã( O[¡ . Next we use the following Thus we obtain Var O[¡E ã O[¡ , so ã( O ¡E ã O[¡ ñ Var O well known inequality which holds for a random variable O values, Pr O âã O[¡/ ¡j âã O[¡^ ¡ ¤ ã O attains value Í with probability UDÎ , then we have Indeed if O ; taking nonnegative, integer ü ÎUT ý ÍUÎ ü ÎUT ý Í UÎ UÎ ER ü ÎVT ý ÍUÎ ü ÎUT ý UÎ ( ã O[¡Á Pr O ; ¡Ñ! where the inequality follows by the Cauchy-Schwartz inequality. 
In our case, this implies

Pr[X > 0] ≥ E[X]² / E[X²] ≥ E[X]² / (E[X] + E[X]²) = E[X] / (1 + E[X]).

Therefore,

E[X] ≥ Pr[X = 1] + 2·Pr[X ≥ 2] = Pr[X = 1] + 2·(Pr[X > 0] − Pr[X = 1]) ≥ 2·E[X]/(1 + E[X]) − Pr[X = 1].

After simplification we obtain

Pr[X = 1] ≥ (2/(1 + E[X]) − 1)·E[X] = ((1 − E[X])/(1 + E[X]))·E[X].

The proof is complete by recalling that E[X] = p^{k+1}δ.

7.5 A Lower Bound and Improved Self-correction

7.5.1 A Lower Bound

The next theorem is a simple modification of a theorem in [1] and essentially implies that our result is almost optimal.

Proposition 7.20. Let F be any family of functions F_p^n → F_p that corresponds to a linear code C. Let d denote the minimum distance of the code C and let d^⊥ denote the minimum distance of the dual code of C. Every one-sided testing algorithm for the family F must perform Ω(d^⊥) queries, and if the distance parameter ε is at most d/(2n), then Ω(1/ε) is also a lower bound for the necessary number of queries.

Lemma 7.4 and Proposition 7.20 give us the following corollary.

Corollary 7.5. Every one-sided tester for testing P_t with distance parameter ε must perform Ω(max{1/ε, (1 + ((t+1) mod (p−1)))·p^{⌊(t+1)/(p−1)⌋}}) queries.

7.5.2 Improved Self-correction

From Lemmas 7.9, 7.11 and 7.12 the following corollary is immediate:

Corollary 7.6. Consider a function f: F_p^n → F_p that is ε-close to a degree-t polynomial g: F_p^n → F_p, where ε < 1/(2p^{k+1}). (Assume t ≥ 1.) Then the function f can be self-corrected. That is, for any given x ∈ F_p^n, it is possible to obtain the value g(x) with probability at least 1 − 2p^{k+1}ε by querying f on 2p^{k+1} points.

An analogous result may be obtained for the general case. We, however, improve the above corollary slightly. The above corrector does not allow any error among the p^{k+1} points it queries. We obtain a stronger result by querying f on a slightly larger flat S, but allowing some errors. Errors are handled by decoding the induced Reed-Muller code on S.

Proposition 7.21. Consider a function f: F_p^n → F_p that is ε-close to a degree-t polynomial g: F_p^n → F_p.
Then the function f can be self-corrected. That is, for ℓ ≥ k + 1 and for any given x ∈ F_p^n, the value of g(x) can be obtained with probability at least 1 − ((p−1)·p^{2(k+1)}·ε)/((1 − εp^{k+1})²·(p^ℓ − 1)), with p^ℓ queries to f.

Proof. Our goal is to correct RM_p(t, n) at the point x. Assume t = (p−1)·k + R, where 0 ≤ R ≤ p − 2. Then the relative distance η of the code satisfies 2p^{−(k+1)} ≤ η ≤ p^{−k}. Recall that the local testability test requires a (k+1)-flat, i.e., it tests a linear constraint over the p^{k+1} points b + Σ_i c_i y_i, where y_i ∈ F_p^n.

We choose a slightly larger flat, i.e., an ℓ-flat with ℓ ≥ k + 1 to be chosen later. We consider the code restricted to this ℓ-flat, with the point x as its origin, and query f on the whole flat. It is known that a majority logic decoding algorithm exists that can decode Reed-Muller codes up to half the minimum distance for any choice of parameters (see [99]). Thus if the number of errors on the flat is small, we can recover g(x).

Let the relative distance of f from the closest codeword be ε' ≤ ε and let B be the set of points where it disagrees with that codeword. Let the random ℓ-flat be S = { x + Σ_{i=1}^ℓ r_i·w_i : r_i ∈ F_p }, with w_1, …, w_ℓ chosen uniformly at random. For each r = (r_1, …, r_ℓ) ≠ 0, let Y_r take the value 1 if x + Σ_i r_i w_i ∈ B and 0 otherwise. Let Y = Σ_{r≠0} Y_r and m = p^ℓ − 1. We would like to bound the probability

Pr[ |Y − ε'm| ≥ (η/2 − ε')·m ],

since majority logic decoding succeeds whenever the number of errors on S is below half the minimum distance. Since Pr[Y_r = 1] = ε' for every r, by linearity we get E[Y] = ε'm. Now

Var[Y] = Σ_r Σ_{r'} Cov[Y_r, Y_{r'}] ≤ m·ε'(1 − ε') + m·(p−2)·ε'(1 − ε') ≤ m·(p−1)·ε'(1 − ε').

The above follows from the fact that when r' is not a scalar multiple of r, the corresponding events are independent and therefore Cov[Y_r, Y_{r'}] = 0; and when Y_r and Y_{r'} are dependent (r' = λr for one of the p − 2 values λ ∉ {0, 1}), then Cov[Y_r, Y_{r'}] = E[Y_r·Y_{r'}] − E[Y_r]·E[Y_{r'}] ≤ ε'(1 − ε').
Therefore, by Chebyshev's inequality we have (assuming ε < p^{−(k+1)})

Pr[ |Y − ε'm| ≥ (η/2 − ε')·m ] ≤ ((p−1)·ε'(1 − ε')) / (m·(η/2 − ε')²).

Now note that η/2 − ε' ≥ p^{−(k+1)} − ε' ≥ (1 − εp^{k+1})·p^{−(k+1)}. We thus have

Pr[ |Y − ε'm| ≥ (η/2 − ε')·m ] ≤ ((p−1)·p^{2(k+1)}·ε) / ((1 − εp^{k+1})²·(p^ℓ − 1)),

which is the failure probability claimed in the proposition.

7.6 Bibliographic Notes

The results presented in this chapter appear in [72]. As was mentioned earlier, the study of low degree testing (along with self-correction) dates back to the work of Blum, Luby and Rubinfeld ([21]), where an algorithm was required to test whether a given function is linear. The approach in [21] was later naturally extended to yield testers for low degree polynomials over fields larger than the total degree. Roughly, the idea is to project the given function onto a random line and then test if the projected univariate polynomial has low degree. Specifically, for a purported degree-t function f: F_p^n → F_p, the test works as follows. Pick vectors x and y from F_p^n (uniformly at random), and distinct r_1, …, r_{t+1} from F_p arbitrarily. Query the oracle representing f at the t+1 points x + r_i·y and extrapolate to a degree-t polynomial q in one variable r. Now test, for a random r ∈ F_p, whether q(r) = f(x + r·y) (for details see [93],[42]). Similar ideas are also employed to test whether a given function is a low degree polynomial in each of its variables (see [36, 8, 6]).

Alon et al. give a tester over F_2 for any degree up to the number of inputs to the function (i.e., for any non-trivial degree) [1]. In other words, their work shows that Reed-Muller codes are locally testable. Under the coding theory interpretation, their tester picks a random minimum-weight codeword from the dual code and checks if it is orthogonal to the input vector. It is important to note that these minimum-weight codewords generate the Reed-Muller code.
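This dual-codeword check of Alon et al. can be simulated directly for small parameters. The sketch below assumes the standard subset-sum form of their test over F_2 (summing f over the subset sums of t+1 chosen direction vectors); the concrete functions, dimensions and direction choices are illustrative only.

```python
import itertools, random

def subset_sum_test(f, n, t, ys):
    # XOR of f over all subset sums of the t+1 direction vectors ys;
    # this vanishes for every choice of ys when deg(f) <= t over F_2,
    # since it is a (t+1)-fold finite derivative of f
    acc = 0
    for r in range(t + 2):
        for S in itertools.combinations(range(t + 1), r):
            pt = [0] * n
            for i in S:
                pt = [a ^ b for a, b in zip(pt, ys[i])]
            acc ^= f(pt)
    return acc == 0

n, t = 4, 2
g = lambda x: x[0] & x[1] ^ x[2]   # degree 2: x1*x2 + x3 over F_2
h = lambda x: x[0] & x[1] & x[2]   # degree 3 monomial: x1*x2*x3

random.seed(1)
for _ in range(50):
    ys = [[random.randint(0, 1) for _ in range(n)] for _ in range(t + 1)]
    assert subset_sum_test(g, n, t, ys)          # degree <= t: always passes

e = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]   # unit directions
assert not subset_sum_test(h, n, t, e)           # the cubic term is detected
```

The degree-2 function passes for every random choice of directions, while the degree-3 monomial is caught already by the unit directions, whose subset sums include the all-ones pattern on its support.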
Specifically, their test works as follows: given a function f: {0,1}^n → {0,1}, to test if the given function has degree at most t, pick t+1 vectors y_1, …, y_{t+1} ∈ {0,1}^n and test if

Σ_{S ⊆ {1,…,t+1}} f( Σ_{i∈S} y_i ) = 0.

Independent of [72], Kaufman and Ron, generalizing a characterization result of [42], gave a tester for low degree polynomials over general finite fields (see [74]). They show that a given polynomial is of degree at most t if and only if the restriction of the polynomial to every affine subspace of suitable dimension is of degree at most t. Following this
This provides an alternate proof to the result of Kaufman and Ron [74] described earlier. Specifically the result says that a given polynomial is of degree at most _ if and Z only if the restriction of the polynomial to every affine subspace of dimension ¯ å ¯ (and higher) is of degree at most _ . Recently Kaufman and Litsyn ([73]) show that the dual of BCH codes are locally testable. They also give a sufficient condition for a code to be locally testable. The condition roughly says that if the number of fixed length codewords in the dual of the union of the code and its -far coset is suitably smaller than the same in the dual of the code, then the code is locally testable. Their argument is more combinatorial in nature and needs the knowledge of weight-distribution of the code and thus differs from the self-correction approach used in this work. 4 Since the coefficients can be written as linear sums of the evaluations of the polynomial, this is equivalent to check several linear constraints 141 Chapter 8 TOLERANT LOCALLY TESTABLE CODES In this chapter, we revisit the notion of local testers (as defined in Section 2.3) that was the focus of Chapter 7. 8.1 Introduction In the definition of LTCs, there is no requirement on the tester for input strings that are very close to a codeword (it has to reject “far” away received words). This “asymmetry” in the way the tester accepts and rejects an input reflects the way Probabilistically Checkable Proofs (or PCPs) [6, 5] are defined, where we only care about accepting perfectly correct proofs with high probability. However, the crux of error-correcting codes is to tolerate and correct a few errors that could occur during transmission of the codeword (and not just be able to detect errors). In this context, the fact that a tester can reject received words with few errors is not satisfactory. 
A more desirable (and stronger) requirement in this scenario would be the following: we would like the tester to make a quick decision on whether or not the purported codeword is close to some codeword. If the tester declares that there is probably a close-by codeword, we then use a decoding algorithm to decode the received word. If, on the other hand, the tester rejects, then we assume with high confidence that the received word is far away from all codewords and do not run our expensive decoding algorithm.

In this chapter, we introduce the concept of tolerant testers. These are testers which reject (w.h.p.) received words far from every codeword (like the "standard" local testers) and accept (w.h.p.) close-by received words (unlike the "standard" ones, which only need to accept codewords). We will refer to codes that admit a tolerant tester as tolerant LTCs. In particular, we get tolerant testers that (i) make O(1) queries and work with codes of nearly constant rate, and (ii) make a sub-linear number of queries and work with codes of constant rate.

8.2 Preliminaries

Recall that for any two vectors u, v ∈ [q]^n, δ(u, v) denotes the (relative) Hamming distance between them. We will abuse the notation a bit and, for any S ⊆ [q]^n, use δ(u, S) to denote min_{v∈S} δ(u, v). We now formally define a tolerant tester.

Definition 8.1. For any linear code C over F_q of block length n and distance d, and 0 ≤ c_1 ≤ c_2 ≤ 1, a (c_1, c_2)-tolerant tester T for C with query complexity q(n) (or simply q when
A tester has perfect completeness if it accepts any codeword with probability G . As ; pointed out earlier, local testers are just !NÜb -tolerant testers with perfect completeness. We will refer to these as standard testers henceforth. Note that our definition of tolerant testers is per se not a generalization of standard testers since we do not require perfect completeness for the case when the input Ç is a codeword. However, all our constructions will inherit this property from the standard testers we obtain them from. Recall one of the applications of tolerant testers mentioned earlier: a tolerant tester is used to decide if the expensive decoding algorithm should be used. In this scenario, one would like to set the parameters ÜZ and Ü such that the tester is tolerant up to the decoding radius. For example, if we have an unique decoding algorithm which can correct up to Ä Z Z errors, a particularly appealing setting of parameters would be ÜeZ and Ü as close to as possible. However, we would not be able to achieve such large ÜeZ . In general we will ×WØ aim for positive constants ÜZ and Ü with × being as small as possible while minimizing UY/Y . One might hope that the existing standard testers could also be tolerant testers. We give a simple example to illustrate the fact that this is not the case in general. Consider the tester for the Reed-Solomon (RS) codes of dimension "naG : pick "nV points uniformly at random and check if the degree univariate polynomial obtained by interpolating on the first -nxG points agrees with the input on the last point. It is well known that this is a standard tester [96]. However, this is not a tolerant tester. Assume we have an input which differs from a åZ degree polynomial in only one point. Thus, for Z choices of n,V points, the tester would reject, that is, the rejection probability is T Ù ' X pÙ Ø T ' which is greater than Z for p high rate RS codes. 
Another pointer towards the inherent difficulty in coming up with a tolerant tester is the work of Fischer and Fortnow [39] which shows that there are certain boolean properties which have a standard tester with constant number of queries but for which every tolerant Z tester requires at least ú queries. In this chapter, we examine existing standard testers and convert some standard testers into tolerant ones. In Section 8.3 we record a few general facts which will be useful in 143 performing this conversion. The ultimate goal, if this can be realized at all, would be to construct tolerant LTCs of constant rate which can be tested using aG queries (we remark that such a construction has not been obtained even without the requirement of tolerance). In this work, we show that we can achieve either constant number of queries with slightly sub-constant rate (Section 8.4) as well as constant rate with sub-linear number of queries (Section 8.5.1). That is, something non-trivial is possible in both the domains: (a) constant rate, and (b) constant number of queries. Specifically, in Section 8.4 we discuss binary ; codes which encode bits into codewords of length [5 E[Y ¨ª©<« for any , and can be tolerant tested using aG6Xe queries. In Section 8.5.1, following [14], we will study the simple construction of LTCs using products of codes — this yields asymptotically good } codes which are tolerant testable using a sub-linear number of queries for any desired ; Yt . An interesting common feature of the codes in Section 8.4 and 8.5.1 is that they can be constructed from any code that has good distance properties and which in particular need not admit a local tester with sub-linear query complexity. In Section 8.6 we discuss the tolerant testability of Reed-Muller codes, which were considered in Chapter 7. The overall message from this chapter is that a lot of the work on locally testable code constructions extends fairly easily to also yield tolerant locally testable codes. 
However, there does not seem to be a generic way to “compile” a standard tester to a tolerant tester for an arbitrary code. ¥ 8.3 General Observations In this section we will spell out some general properties of tolerant testers and subsequently use them to design tolerant testers for some existing codes. All the testers we refer to are non-adaptive testers which decide on the locations to query all at once based only on the random choices. The motivation for the definition below will be clear in Section 8.4. È ; È Definition 8.2. Let RqEvG . A tester H is fÆÁ£Z!NZ !Æ Á!f¥ !N -smooth if there exists a set VK÷ "¡ where ·1Vo·>tj with the following properties: H queries at most XZ points in V , and for every ¬F+>V , the probability that each of  these queries equals location ¬ is at most , and ¿ \¿ H queries at most points in "¡ [ V , and for every ¬0+F ¡ [ V , the probability that ÂØ each of these queries equals location ¬ is at most å . ¿ \Á¿ È ; ; È As a special case a fÆG£!N !Æ ! !¥G -smooth tester makes a total of queries each of them distributed uniformly among the possible probe points. The following lemma follows easily by an application of the union bound. ; È È ; Lemma 8.1. For any RvqR G , a fÆÁ£Z\!NZ !ÆÁ!N¥ N! -smooth !fÜ\b -tolerant tester H Z½å with perfect completeness is a WÜZ\!NÜb -tolerant tester H , where ÜZ3 ¡ Ý ¯ ÂN Zå N ¯ Ø Â Ø à . Ä E N N 144 × Proof. The soundness follows from ø the assumption on H . Assume ÝU/ÇB!âµoE Ä and let {+_â be the closest codeword to Ç . Suppose that differs from Ç in a set V of ;CT places among locations in V , and a set of v_Iº;T placesø among locations in ¡ [ V , where ; ¥ ) we have vEöÜZ and E;tE . The probability thatø any of the at most <Z (resp. ÂØ å  queries of H into V (resp. ¡ [ V ) falls in V- (resp. ) is at most BhÄ (resp. Z å BëÐÄ ). N N , it accepts (since H has perfect Clearly, whenever H does not query a location in VJ completeness). 
Thus, an easy calculation shows that the probability that $T$ rejects $r$ is at most
$$q_1 p_1 \beta_1 + q_2 p_2 \beta_2 \;\leq\; c_1 \cdot \max\left\{\frac{p_1 q_1}{\alpha},\ \frac{p_2 q_2}{1-\alpha}\right\},$$
which is $1/3$ for the choice of $c_1$ stated in the lemma. □

The above lemma is not useful for us unless the relative distance and the number of queries are constants. Next we sketch how to design tolerant testers from existing robust testers with certain properties. We first recall the definition of robust testers from [14]. A standard tester $T$ has two inputs: an oracle for the received word $r$, and a random string $s$. Depending on $s$, $T$ generates query positions $i_1, \ldots, i_q$, fixes a circuit $C_s$, and then accepts if $C_s(r(s)) = 1$, where $r(s) = \langle r_{i_1}, \ldots, r_{i_q} \rangle$. The robustness of $T$ on inputs $r$ and $s$, denoted by $\rho^T(r, s)$, is defined to be the minimum, over all strings $y$ such that $C_s(y) = 1$, of $\delta(r(s), y)$. The expected robustness of $T$ on $r$ is the expected value of $\rho^T(r, s)$ over the random choices of $s$, and is denoted by $\rho^T(r)$.

A standard tester $T$ is said to be $c$-robust for $C$ if for every $r \in C$ the tester accepts with probability $1$, and for every $r \in \Sigma^n$, $\delta(r, C) \leq c \cdot \rho^T(r)$.

The tolerant version $T'$ of the standard $c$-robust tester $T$ is obtained by accepting an oracle $r$ on random input $s$ if $\rho^T(r, s) \leq \tau$, for some threshold $\tau$. (Throughout the chapter, $\tau$ will denote the threshold.) We will sometimes refer to such a tester as one with threshold $\tau$. Recall that the standard tester $T$ accepts if $\rho^T(r, s) = 0$. We next show that $T'$ is sound. The following lemma follows from the fact that $T$ is $c$-robust.

Lemma 8.2. Let $0 \leq \tau \leq 1$, and let $c_2 \geq c\left(\tau + \frac{2}{3}\right)$. For any $r \in \Sigma^n$, if $\delta(r, C) \geq c_2$, then the tolerant tester $T'$ with threshold $\tau$ rejects $r$ with probability at least $2/3$.

Proof. Let $r \in \Sigma^n$ be such that $\delta(r, C) \geq c_2$. By the definition of robustness, the expected robustness $\rho^T(r)$ is at least $c_2/c$, and thus at least $\tau + 2/3$ by the choice of $c_2$.
By a standard averaging argument, $\rho^T(r, s) \leq \tau$ on at most a $1/3$ fraction of the random choices of $s$ for $T$ (and hence $T'$). Therefore, $\rho^T(r, s) > \tau$ with probability at least $2/3$ over the choice of $s$, and thus $T'$ rejects $r$ with probability at least $2/3$. □

We next mention a property of the query pattern of $T$ that makes $T'$ tolerant. Let $R$ be the set of all possible choices for the random string $s$, and for each $s$ let $Q(s)$ be the set of positions queried by $T$.

Definition 8.3. A tester $T$ has a partitioned query pattern if there exists a partition $R_1 \cup \cdots \cup R_t$ of the random choices $R$ of $T$, for some $t$, such that for every $j$:
$\bigcup_{s \in R_j} Q(s) = \{1, \ldots, n\}$, and for all $s, s' \in R_j$ with $s \neq s'$, $Q(s) \cap Q(s') = \emptyset$.

Lemma 8.3. Let $T$ have a partitioned query pattern. For any $r \in \Sigma^n$, if $\delta(r, C) \leq c_1$, where $c_1 = \tau/3$, then the tolerant tester $T'$ with threshold $\tau$ rejects with probability at most $1/3$.

Proof. Let $R_1, \ldots, R_t$ be the partition of $R$, the set of all random choices of the tester $T$. For each $j$, by the properties of $R_j$, $\mathbb{E}_{s \in R_j}\left[\rho^T(r, s)\right] \leq \delta(r, C)$. By an averaging argument, together with the assumption on $\delta(r, C)$ and the value of $c_1$, at least a $2/3$ fraction of the choices of $s$ in $R_j$ have $\rho^T(r, s) \leq \tau$, and on these $T'$ accepts. Recalling that $R_1, \ldots, R_t$ is a partition of $R$, for at least $2/3$ of the choices of $s$ in $R$, $T'$ accepts. This completes the proof. □

8.4 Tolerant Testers for Binary Codes

One of the natural goals in the study of tolerant codes is to design explicit tolerant binary codes with constant relative distance and as large a rate as possible. In the case of standard testers, Ben-Sasson et al. [11] give binary locally testable codes which map $k$ bits to $n = k \cdot \exp(\log^{\varepsilon} k)$ bits for any $\varepsilon > 0$, and which are testable with $O(1/\varepsilon)$ queries. Their construction uses objects called PCPs of Proximity (PCPP), which they also introduce in [11].
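Before turning to the construction, the recipe of Section 8.3 — threshold the distance of the tester's local view to the nearest accepting view, relying on a partitioned query pattern for tolerance — can be illustrated on a toy example. The following sketch is illustrative only: the repetition code, the block-reading tester, and the parameter choices are assumptions for the example, not part of the constructions in this chapter.

```python
from fractions import Fraction

# Toy instantiation of the Section 8.3 recipe for the repetition code
# C = {0^n, 1^n}: seed j reads block j of size b.  The blocks are disjoint
# and cover [n], so the tester has a partitioned query pattern (t = 1).
# The tolerant tester accepts iff its local view is tau-close to an
# accepting view (here: an all-0 or all-1 block).

def robustness(view):
    # distance of the local view to the nearest accepting view
    ones = sum(view)
    return Fraction(min(ones, len(view) - ones), len(view))

def accept_prob(r, b, tau):
    # exact acceptance probability of the tolerant tester
    # (the seed, i.e. the block index, is uniformly random)
    views = [r[j * b:(j + 1) * b] for j in range(len(r) // b)]
    accepted = sum(1 for v in views if robustness(v) <= tau)
    return Fraction(accepted, len(views))
```

With block size $4$ and threshold $\tau = 1/4$, a codeword with a single flipped bit is always accepted, while a word that is $1/2$-far from the code is always rejected — the qualitative behavior that Lemmas 8.2 and 8.3 guarantee in general.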
In this section, we show that a simple modification to their construction yields tolerant testable binary codes which map $k$ bits to $n = k \cdot \exp(\log^{\varepsilon} k)$ bits for any $\varepsilon > 0$. We note that a similar modification is used by Ben-Sasson et al. [11] to give relaxed locally decodable codes, but with worse parameters (specifically, they obtain codes with block length $k^{1+\varepsilon}$).

8.4.1 PCP of Proximity

We start with the definition¹ of a Probabilistically Checkable Proof of Proximity (PCPP). A pair language is simply a language whose elements are naturally pairs of strings, i.e., some collection of pairs $(x, y)$. A notable example is CIRCUITVAL $= \{\langle C, x \rangle \mid$ the Boolean circuit $C$ evaluates to $1$ on assignment $x\}$.

Definition 8.4. Fix $0 \leq \delta \leq 1$. A probabilistic verifier $V$ is a PCPP for a pair language $L$ with proximity parameter $\delta$ and query complexity $q(\cdot)$ if the following conditions hold:

(Completeness) If $(x, y) \in L$, then there exists a proof $\pi$ such that $V$ accepts by accessing the oracle $y \circ \pi$ with probability $1$.

(Soundness) If $y$ is $\delta$-far from $L(x) = \{y' \mid (x, y') \in L\}$, then for all proofs $\pi$, $V$ accepts by accessing the oracle $y \circ \pi$ with probability strictly less than $1/4$.

(Query complexity) For any input $x$ and proof $\pi$, $V$ makes at most $q(|x|)$ queries into $y \circ \pi$.

¹The definition here is a special case of the general PCPP defined in [11], which is sufficient for our purposes.

Note that a PCPP differs from a standard PCP in that it has a more relaxed soundness condition, but its queries into the input part $y$ are also counted in its query complexity. Ben-Sasson et al. give constructions of PCPPs with the following guarantees.

Lemma 8.4 ([11]). Let $\gamma > 0$ be arbitrary. There exists a PCP of proximity for the pair language CIRCUITVAL $= \{\langle C, x \rangle \mid C$ is a Boolean circuit and $C(x) = 1\}$ whose proof length, for input circuits of size $s$, is at most $s \cdot \exp(\log^{\gamma} s)$, and whose verifier of proximity has query complexity
$O\!\left(\max\left\{\frac{1}{\delta},\ \frac{1}{\gamma}\right\}\right)$ for any proximity parameter $\delta$ that satisfies $\delta \geq \frac{1}{\log\log s}$. Furthermore, the queries of the verifier are non-adaptive, and each of the queries which lie in the input part $x$ is uniformly distributed among the locations of $x$.

The fact that the queries to the input part are uniformly distributed follows by an examination of the verifier construction in [11]. In fact, in the extended version of that paper, the authors make this fact explicit and use it in their construction of relaxed locally decodable codes (LDCs). To achieve a tolerant LTC using the PCPP, we will need all queries of the verifier to be somewhat uniformly, or smoothly, distributed. We now proceed to make the queries of the PCPP verifier that fall into the "proof part" $\pi$ near-uniform. This follows a fairly general method, suggested in [11] to smoothen out the query distribution, which the authors used to obtain relaxed locally decodable codes from the PCPP. We will obtain tolerant LTCs instead, and in fact will manage to do so without a substantial increase in the encoding length (i.e., the encoding length will remain $k \cdot \exp(\log^{\varepsilon} k)$). In contrast, the best encoding length achieved for relaxed LDCs in [11] is $k^{1+\varepsilon}$ for constant $\varepsilon > 0$. We begin with the definition of a mapping that helps smoothen out the query distribution.

Definition 8.5. Given any $r \in \Sigma^n$ and $\bar{p} = \langle p_i \rangle_{i=1}^n$ with $p_i \geq 0$ for all $i \in [n]$ and $\sum_{i=1}^n p_i = 1$, we define the mapping $\mathrm{Repeat}(\cdot, \cdot)$ as follows: $\mathrm{Repeat}(r, \bar{p}) \in \Sigma^N$ is the string in which $r_i$ is repeated $\lfloor 2 n p_i \rfloor$ times, and $N = \sum_{i=1}^n \lfloor 2 n p_i \rfloor$.

We now show why this mapping is useful. A similar fact appears in [11], but for the sake of completeness we present a proof here.

Lemma 8.5. For any $r \in \Sigma^n$, let a non-adaptive verifier $T$ (with oracle access to $r$) make $q(n)$ queries, and let $p_i$ be the probability that each of these queries probes location $i \in [n]$. Let $\bar{p} = \langle p_i \rangle_{i=1}^n$, and consider the map $\mathrm{Repeat}(\cdot, \bar{p}) : \Sigma^n \to \Sigma^N$.
Then there exists another tester $T'$ for strings of length $N$ with the following properties:

1. $T'$ makes $2q(n)$ queries on $r' = \mathrm{Repeat}(r, \bar{p})$, each of which probes location $j$, for any $j \in [N]$, with probability at most $2/N$, and

2. for every $r \in \Sigma^n$, the decision of $T'$ on $r'$ is identical to that of $T$ on $r$.

Further, $n \leq N \leq 2n$.

Proof. We first add $q(n)$ dummy queries to $T$, each of which is uniformly distributed, and then permute the $2q(n)$ queries in a random order. Note that each of the $2q(n)$ queries is now identically distributed. Moreover, any position in $r$ is probed with probability at least $\frac{1}{2n}$ by each of the $2q(n)$ queries. For the rest of the proof we therefore assume that $T$ makes $2q(n)$ queries, each of which probes any $i \in [n]$ with probability $p_i \geq \frac{1}{2n}$. Let $m_i = \lfloor 2np_i \rfloor$; note that $m_i \leq 2np_i$ and $m_i \geq 1$. Recalling that $\sum_{i=1}^n p_i = 1$, we have $n \leq N \leq 2n$. $T'$ simply simulates $T$ in the following manner: if $T$ queries $r_i$ for any $i \in [n]$, then $T'$ queries one of the $m_i$ copies of $r_i$ in $r'$, chosen uniformly at random. It is clear that the decision of $T'$ on $r' = \mathrm{Repeat}(r, \bar{p})$ is identical to that of $T$ on $r$. We now look at the query distribution of $T'$: $T'$ queries any $j \in [N]$ with $r'_j = r_i$ with probability $p'_j = p_i/m_i$. Since $2np_i \geq 1$ and $\lfloor t \rfloor \geq t/2$ for $t \geq 1$, we have $p'_j \leq \frac{2 p_i}{2 n p_i} = \frac{1}{n}$. We showed earlier that $N \leq 2n$, which implies $p'_j \leq \frac{2}{N}$, as required. □

One might wonder whether we can use Lemma 8.5 to smoothen out the queries made by the verifier of an arbitrary LTC to obtain a tolerant LTC — that is, whether the above allows one to compile the verifier of any LTC, in a black-box manner, into a tolerant verifier. We now argue (informally) that this technique alone will not work. Let $C_1$ be an $[n, k, d]_{\Sigma}$ LTC with a standard tester $T_1$ that makes identically distributed queries with distribution $p_i$, $1 \leq i \leq n$, such that $p_i \geq \frac{1}{2n}$ for each $i$.
Create a new $[n+1, k, d]_{\Sigma}$ code $C_2$ whose $(n+1)$'th coordinate is just a copy of the $n$'th coordinate; i.e., corresponding to each codeword $\langle c_1, c_2, \ldots, c_n \rangle \in \Sigma^n$ of $C_1$, we have a codeword $\langle c_1, c_2, \ldots, c_n, c_n \rangle \in \Sigma^{n+1}$ of $C_2$. Consider the following tester $T_2$ for $C_2$: given oracle access to $r \in \Sigma^{n+1}$, with probability $1/2$ check whether $r_n = r_{n+1}$, and with probability $1/2$ run the tester $T_1$ on the first $n$ coordinates of $r$. Clearly, $T_2$ is a standard tester for $C_2$. Now consider what happens in the conversion procedure of Lemma 8.5 applied to $\langle C_2, T_2 \rangle$ to obtain $\langle C_2', T_2' \rangle$. Note that by Lemmas 8.5 and 8.3, $T_2'$ is tolerant. Let $\bar{p} = \langle p_1, \ldots, p_{n+1} \rangle$ be the query distribution of $T_2$. Since $T_2$ queries $(r_n, r_{n+1})$ with probability $1/2$, the combined number of locations of $r' = \mathrm{Repeat}(r, \bar{p})$ corresponding to $r_n, r_{n+1}$ will be about $1/2$ of the total length $N$. Now let $r'$ be obtained from a codeword of $C_2'$ by corrupting just these locations. The tester $T_2'$ will accept such an $r'$ with probability at least $1/2$, which contradicts the soundness requirement, since $r'$ is (about) $1/2$-far from $C_2'$. Therefore, using the behavior of the original tester $T_2$ only as a black box, we cannot in general argue that the construction of Lemma 8.5 maintains good soundness.

Applying the transformation of Lemma 8.5 to the proximity verifier and proof of proximity of Lemma 8.4, we conclude the following.

Proposition 8.6. Let $\gamma > 0$ be arbitrary. There exists a PCP of proximity for the pair language CIRCUITVAL $= \{\langle C, x \rangle \mid C$ is a Boolean circuit and $C(x) = 1\}$ with the following properties:

1. The proof length, for input circuits of size $s$, is at most $s \cdot \exp(\log^{\gamma} s)$, and

2. for any proximity parameter $\delta$ that satisfies $\delta \geq \frac{1}{\log\log s}$, the verifier of proximity has query complexity $O\!\left(\max\left\{\frac{1}{\delta}, \frac{1}{\gamma}\right\}\right)$.

Furthermore, the queries of the verifier are non-adaptive, with the following properties:
1. Each query made to one of the locations of the input $x$ is uniformly distributed among the locations of $x$, and

2. each query to one of the locations in the proof of proximity $\pi$ probes each location with probability at most $2/|\pi|$ (and is thus distributed nearly uniformly among the locations of $\pi$).

8.4.2 The Code

We now outline the construction of the locally testable code from [11]. The idea behind the construction is to use a PCPP to aid in checking whether the received word is a codeword or is far from being one. Details follow.

Suppose we have a binary code $C_0 : \{0,1\}^k \to \{0,1\}^m$ of distance $d$, defined by a parity check matrix $H \in \{0,1\}^{(m-k) \times m}$ that is sparse, i.e., each of whose rows has only an absolute constant number of $1$'s. Such a code is referred to as a low-density parity check (LDPC) code. For the construction below, we will use any such code that is asymptotically good (i.e., has rate $k/m$ and relative distance $d/m$ both bounded below by positive constants as $m \to \infty$). Explicit constructions of such codes are known using expander graphs [95]. Let $V$ be the verifier of a PCP of proximity for membership in $C_0$; more precisely, the proof of proximity of an input string $w \in \{0,1\}^m$ will be a proof that $\hat{C}_0(w) = 1$, where $\hat{C}_0$ is a linear-sized circuit which performs the parity checks required by $H$ on $w$ (the circuit has size $O(m)$, since $H$ is sparse and $C_0$ has positive rate). Denote by $\pi(x)$ the proof of proximity guaranteed by Proposition 8.6 for the claim that the input $C_0(x)$ is a member of $C_0$ (i.e., satisfies the circuit $\hat{C}_0$). By Proposition 8.6, and the fact that the size of $\hat{C}_0$ is $O(m)$, the length of $\pi(x)$ can be made at most $m \cdot \exp(\log^{\gamma} m)$.

The final code is defined as $C_1(x) = \big( C_0(x)^t, \pi(x) \big)$, where $t = \left\lceil \frac{|\pi(x)| \log n}{m} \right\rceil$. The repetition of the code part $C_0(x)$ is required in order to ensure good distance, since the length of the proof part $\pi(x)$ typically dominates and we have no guarantee on how far
apart $\pi(x_1)$ and $\pi(x_2)$ are for $x_1 \neq x_2$. For the rest of this section, let $\ell$ denote the proof length. The tester $T_1$ for $C_1$, on an input $r = (w_1, \ldots, w_t, \pi) \in \{0,1\}^{tm + \ell}$, picks $i \in [t]$ at random and runs the PCPP verifier $V$ on $w_i \circ \pi$. It also performs a few rounds of the following consistency check: pick $i_1, i_2 \in [t]$ and $j \in [m]$ at random, and check whether $(w_{i_1})_j = (w_{i_2})_j$. Ben-Sasson et al. [11] show that $T_1$ is a standard tester. However, $T_1$ need not be a tolerant tester. To see this, note that the proof part of $C_1$ forms only a $\approx 1/\log n$ fraction of the total length. Now consider a received word $r = (w, \ldots, w, \pi')$, where $w \in C_0$ but $\pi'$ is not a correct proof for $w$ being a valid codeword of $C_0$. Note that $r$ is close to $C_1$. However, $T_1$ is not guaranteed to accept $r$ with high probability.

The problem with the construction above is that the proof part is too small; a natural fix is to make the proof part a constant fraction of the codeword. We will show that this suffices to make the code tolerant testable. We also remark that a similar idea was used by Ben-Sasson et al. to give efficient constructions of relaxed locally decodable codes [11].

Construction 8.1. Let $0 < \beta < 1$ be a parameter, let $C_0 : \{0,1\}^k \to \{0,1\}^m$ be a good² binary code, and let $V$ be a PCP of proximity verifier for membership in $C_0$. Finally, let $\pi(x)$ be the proof corresponding to the claim that $C_0(x)$ is a codeword of $C_0$. The final code is defined as $C_2(x) = \big( C_0(x)^{t_1}, \pi(x)^{t_2} \big)$, with $t_1 = \left\lceil \frac{(1-\beta)\, t_2 \ell}{\beta m} \right\rceil$ and $t_2 = \lceil \log \ell \rceil$.³

²This means that $C_0$ has constant rate and constant relative distance, and that encoding can be done by circuits of nearly linear size.

³The logarithmic factor of overhead in $t_2$ is overkill — a suitably large constant would do — but since the proof length will anyway be larger than $m$ by more than a polylogarithmic factor in the constructions we use, we can afford this additional factor, and it eases the presentation somewhat.

For the rest of the section, the proof length $|\pi(x)|$ will be denoted by $\ell$. Further, the proximity parameter and the number of queries made by the PCPP verifier will be denoted by $\delta_0$ and $q$ respectively. Finally, let $\rho_0$ denote the relative distance of the code $C_0$.

The tester $T_2$ for $C_2$ is the natural generalization of $T_1$. For a parameter $t$ (to be instantiated later) and input $r = (w_1, \ldots, w_{t_1}, \pi_1, \ldots, \pi_{t_2}) \in \{0,1\}^{t_1 m + t_2 \ell}$, $T_2$ does the following:

1. Repeat the next two steps twice.
2. Pick $i \in [t_1]$ and $j \in [t_2]$ at random, and run $V$ on $w_i \circ \pi_j$.

3. Do $t$ repetitions of the following: pick $i_1, i_2 \in [t_1]$ and $j \in [m]$ at random, and check whether $(w_{i_1})_j = (w_{i_2})_j$.

The following lemma captures the properties of the code $C_2$ and its tester $T_2$.

Lemma 8.7. The code $C_2$ of Construction 8.1 and the tester $T_2$ above (with parameters $\beta$ and $t$, respectively) have the following properties:

1. The code $C_2$ has block length $n = t_1 m + t_2 \ell$, with minimum distance lower bounded by $(1 - \beta)\rho_0 n$.

2. $T_2$ makes a total of $Q = 2(q + 2t)$ queries.

3. $T_2$ is $\big((1,\ 2(q + 2t)),\ (2,\ 2q),\ 1 - \beta\big)$-smooth.

4. $T_2$ is a $(c_1, c_2)$-tolerant tester with
$$c_1 = \frac{1}{3 \max\left\{\frac{2(q + 2t)}{1 - \beta},\ \frac{4q}{\beta}\right\}} \qquad \text{and} \qquad c_2 = \beta + 3(1 - \beta)\delta_0.$$

Proof. From the definition of $C_2$, it has block length $n = t_1 m + t_2 \ell$, and the code part occupies a $(1-\beta)$ fraction of it. Further, as $C_0$ has relative distance $\rho_0$, $C_2$ has relative distance at least $(1 - \beta)\rho_0$: two distinct codewords differ in at least a $\rho_0$ fraction of each of the $t_1$ copies of the code part.

$T_2$ makes the same number of queries as $V$, namely $q$, in Step 2; in Step 3, $T_2$ makes $2t$ queries. As $T_2$ repeats Steps 2 and 3 twice, we get the desired query complexity. To show the smoothness of $T_2$, we need to exhibit an appropriate subset $S \subseteq [n]$ with $|S| = (1 - \beta)n$: let $S$ be the set of indices of the code part, i.e., $S = [t_1 m]$. $T_2$ makes $2t$ queries in $S$ in Step 3, each of which is uniformly distributed. Further, by Proposition 8.6, in Step 2 $T_2$ makes at most $q$ queries in $S$, which are uniformly distributed, and at most $q$ queries in $[n] \setminus S$, each of which is within a factor $2$ of being queried uniformly at random. To complete the proof of the smoothness property, note that $T_2$ repeats Steps 2 and 3 twice.
The tolerance of $T_2$ follows from the smoothness property and Lemma 8.1. For the soundness part, note that if $r = (w_1, \ldots, w_{t_1}, \pi_1, \ldots, \pi_{t_2}) \in \{0,1\}^{t_1 m + t_2 \ell}$ is $\delta$-far from $C_2$, then $(w_1, \ldots, w_{t_1})$ is at least $\frac{\delta - \beta}{1 - \beta}$-far from the repetition code $C^* = \{ C_0(x)^{t_1} \mid x \in \{0,1\}^k \}$. For $\delta \geq c_2$, with the choice of $c_2$ in the lemma, we have $\frac{\delta - \beta}{1-\beta} \geq 3\delta_0$. The rest of the proof follows the proof in [11] (see also [68, Chap. 12]) of the soundness of the tester $T_1$ for the code $C_1$; for the sake of completeness, we spell it out here. We will show that a single invocation of Steps 2 and 3 results in $T_2$ accepting with probability at most $1/2$; the two repetitions of Steps 2 and 3 reduce this error to at most $1/4$.

Let $\bar{w} \in \{0,1\}^m$ be the string such that $\bar{w}^{t_1}$ is the "repetition sequence" closest to $(w_1, \ldots, w_{t_1})$, that is, one that minimizes $\sum_{i=1}^{t_1} \Delta(w_i, \bar{w})$. We now consider two cases.

Case 1: $\sum_{i=1}^{t_1} \Delta(w_i, \bar{w}) \geq \delta_0 t_1 m / 3$. In this case, a single execution of the test in Step 3 rejects with probability
$$\Pr_{i_1, i_2, j}\big[(w_{i_1})_j \neq (w_{i_2})_j\big] \;=\; \frac{1}{t_1^2 m} \sum_{i_1=1}^{t_1} \sum_{i_2=1}^{t_1} \Delta(w_{i_1}, w_{i_2}) \;\geq\; \frac{1}{t_1 m} \sum_{i=1}^{t_1} \Delta(w_i, \bar{w}) \;\geq\; \frac{\delta_0}{3},$$
where the first inequality follows from the choice of $\bar{w}$, and the second inequality follows from the case hypothesis. Thus, after the $t$ repetitions of Step 3 with $t = \Theta(1/\delta_0)$, the test accepts with probability at most $(1 - \delta_0/3)^t \leq 1/3 \leq 1/2$.

Case 2: $\sum_{i=1}^{t_1} \Delta(w_i, \bar{w}) < \delta_0 t_1 m / 3$. In this case we have the following (where, for any set $P$ of vectors and a vector $\bar{w}$, we write $\Delta(\bar{w}, P) = \min_{y \in P} \Delta(\bar{w}, y)$):
$$\delta(\bar{w}, C_0) \;\geq\; \delta\big((w_1, \ldots, w_{t_1}),\, C^*\big) - \frac{1}{t_1 m} \sum_{i=1}^{t_1} \Delta(w_i, \bar{w}) \;\geq\; 3\delta_0 - \frac{\delta_0}{3} \;\geq\; 2\delta_0, \qquad (8.1)$$
where the first inequality follows from the triangle inequality, and the last inequality follows from the case hypothesis (and the fact that $(w_1, \ldots, w_{t_1})$ is $3\delta_0$-far from $C^*$). Also, by the case hypothesis, for an average $i$, $\Delta(w_i, \bar{w}) \leq \delta_0 m/3$; thus, by a Markov argument, at most one third of the $w_i$'s are $\delta_0$-far from $\bar{w}$.
Since $\bar{w}$ is $2\delta_0$-far from $C_0$ (by (8.1)), this implies (together with the triangle inequality) that at least two thirds of the $w_i$'s are $\delta_0$-far from $C_0$. Thus, by the soundness of the PCPP, for each such $w_i$ the test in Step 2 accepts with probability at most $1/4$. The total acceptance probability in this case is therefore at most $\frac{1}{3} + \frac{2}{3} \cdot \frac{1}{4} = \frac{1}{2}$, as desired.

Thus, in both cases, a single invocation of Steps 2 and 3 accepts with probability at most $1/2$, as required. □

Fix any $0 < \varepsilon \leq 1/2$, and set $\beta = \varepsilon$, $\delta_0 = \varepsilon$ and $t = \Theta(1/\varepsilon)$. With these settings, we get $c_2 = \varepsilon + 3(1-\varepsilon)\varepsilon = O(\varepsilon)$, and $q = O(1/\varepsilon)$ from Proposition 8.6 with the choice $\gamma = \varepsilon$. Finally, $Q = 2(q + 2t) = O(1/\varepsilon)$. Substituting the parameters into $c_1$, we get $c_1 = \Omega(\varepsilon^2)$. Also note that the minimum distance $d \geq (1-\beta)\rho_0 n \geq \rho_0 n / 2$. Thus, we have the following result for tolerant testable binary codes.

Theorem 8.1. There exists an absolute constant $\gamma_0 > 0$ such that for every $\varepsilon$, $0 < \varepsilon \leq 1/2$, there exists an explicit binary linear code $C : \{0,1\}^k \to \{0,1\}^n$, where $n = k \cdot \exp(\log^{\varepsilon} k)$, with minimum distance $d \geq \gamma_0 n$, which admits a $(c_1, c_2)$-tolerant tester with $c_2 = O(\varepsilon)$, $c_1 = \Omega(\varepsilon^2)$ and query complexity $O(1/\varepsilon)$.

The claim about explicitness follows from the fact that the PCPP of Lemma 8.4, and hence of Proposition 8.6, has an explicit construction. The claim about linearity follows from the fact that the PCPP for CIRCUITVAL is a linear function of the input when the circuit computes linear functions — this aspect of the construction is discussed in detail in Chapter 9 of [68].

8.5 Product of Codes

The tensor product of codes (or simply the product of codes) is a simple way to construct new codes from existing codes, such that the constructed codes have testers with sub-linear query complexity even though the original code need not admit a sub-linear complexity tester [14]. We start with the definition of the product of codes.

Definition 8.6 (Tensor Product of Codes).
Given $C_1$ and $C_2$ that are $[n_1, k_1, d_1]$ and $[n_2, k_2, d_2]$ codes, their tensor product, denoted by $C_1 \otimes C_2$, consists of all $n_2 \times n_1$ matrices such that every row of the matrix is a codeword in $C_1$ and every column is a codeword in $C_2$.

It is well known that $C_1 \otimes C_2$ is an $[n_1 n_2, k_1 k_2, d_1 d_2]$ code. A special case in which we will be interested is $C_1 = C_2 = C$. In this case, given an $[n, k, d]_q$ code $C$, the product of $C$ with itself, denoted $C^2$, is an $[n^2, k^2, d^2]_q$ code such that a codeword (viewed as an $n \times n$ matrix) restricted to any row or column is a codeword in $C$. It can be shown that this is equivalent to the following [100]: given the $k \times n$ generator matrix $G$ of $C$, $C^2$ is precisely the set of matrices $\{ G^T M G \mid M \in \mathbb{F}_q^{k \times k} \}$.

8.5.1 Tolerant Testers for Tensor Products of Codes

A very natural test for $C^2$ is to randomly choose a row or a column and then check whether the restriction of the received word to that row or column is a codeword in $C$ (which can be done, for example, by querying all the points in the row or column). Unfortunately, as we will see in Section 8.5.2, this test is not robust in general. Ben-Sasson and Sudan [14] considered the more general product of codes $C^{\ell}$ (where $C^{\ell}$ denotes $C$ tensored with itself $\ell - 1$ times), along with the following general tester: choose at random $b \in \{1, \ldots, \ell\}$ and $i \in \{1, \ldots, n\}$, and check whether the received word (which is an element of $\mathbb{F}_q^{n^{\ell}}$), when its $b$'th coordinate is restricted⁴ to $i$, is a codeword in $C^{\ell - 1}$. It is shown in [14] that this test is robust: if a received word is far from $C^{\ell}$, then many of the tested substrings will be far from $C^{\ell - 1}$. This tester lends itself to recursion: the test for $C^{\ell-1}$ can be reduced to a test for $C^{\ell-2}$, and so on, until we need to check whether a word in $\mathbb{F}_q^{n^2}$ is a codeword of $C^2$. This last check can be done by querying all $n^2$ points, out of the $n^{\ell}$ points in the original received word, thus leading to a sub-linear query complexity.

⁴For the $\ell = 2$ case, $b$ signifies either row or column, and $i$ denotes the row/column index.
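To make Definition 8.6 concrete, membership in $C \otimes C$ is exactly the row/column condition, as the following illustrative sketch shows. The base code here — the even-weight (parity-check) code — and the function names are toy assumptions for the example, not codes used in the chapter.

```python
def in_parity_code(v):
    # toy base code C: all even-weight binary vectors of length n
    # (an [n, n-1, 2] linear code)
    return sum(v) % 2 == 0

def in_tensor_square(M):
    # C (x) C: every row AND every column of the matrix must lie in C
    return all(in_parity_code(row) for row in M) and \
           all(in_parity_code(col) for col in zip(*M))
```

For instance, the all-ones $2 \times 2$ matrix has even-weight rows and columns and hence lies in $C \otimes C$, while flipping a single entry violates both one row and one column constraint — reflecting the fact that the minimum distance of the product code is $d_1 d_2$.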
As shown in [14], the reduction can be done in $\log \ell$ stages by the standard halving technique. Thus, even though $C$ itself might not have a tester with small query complexity, we can test $C^{\ell}$ with a sub-linear number of queries.

We now give a tolerant version of the test for products of codes given by Ben-Sasson and Sudan [14]. In what follows, $\ell$ will be a power of two. As mentioned above, the tester $T$ for the tensor product $C^{\ell}$ reduces the test to checking whether some restriction of the given string belongs to $C^2$. For the rest of this section, with a slight abuse of notation, let $r \in \mathbb{F}_q^{n^2}$ denote the final restriction being tested. In what follows, we assume that by looking at all points of any $r \in \mathbb{F}_q^{n^2}$ one can determine whether $\delta(r, C^2) \leq \tau$ in time polynomial in $n$. The tolerant version of the test of [14] is the simple modification mentioned in Section 8.3: reduce the test on $C^{\ell}$ to $C^2$ as in [14], and then accept if $r$ is $\tau$-close to $C^2$.

First we make the following observation about the test in [14]. The test recurses $\log \ell$ times to reduce the test to $C^2$: at step $j$, the current code $C^{\ell/2^{j-1}}$ is viewed as the tensor square of $C^{\ell/2^{j}}$, and the test chooses a random coordinate $b_j$ (this will just be a random bit) and fixes the value of the $b_j$'th coordinate to an index $i_j$ (where $i_j$ takes values in the range $1 \leq i_j \leq n^{\ell/2^{j}}$). The key observation is that for each fixed choice of $b_1, \ldots, b_{\log \ell}$, distinct choices of $i_1, \ldots, i_{\log \ell}$ correspond to querying disjoint sets of $n^2$ points in the original string, which together form a partition of all the coordinates of $r$. In other words, $T$ has a partitioned query pattern, which will be useful to argue tolerance. For soundness, we use the results in [14], which show that their tester is $c^{\log \ell}$-robust for an absolute constant $c$. Applying Lemmas 8.2 and 8.3, therefore, we have the following result.

Theorem 8.2. Let $\ell$ be a power of two, and let $0 < \tau \leq 1$.
There exist $0 < c_1 < c_2 \leq 1$ with $c_2 = c^{\log \ell}\left(\tau + \frac{2}{3}\right)$ (where $c$ is the robustness constant from [14]) and $c_1 = \tau/3$, such that the proposed tolerant tester for $C^{\ell}$ is a $(c_1, c_2)$-tolerant tester with query complexity $N^{2/\ell}$, where $N$ is the block length of $C^{\ell}$. Further, $c_1$ and $c_2$ are constants (independent of $N$) if $\ell$ is a constant and $C$ has constant relative distance.

Corollary 8.3. For every $\gamma > 0$, there is an explicit family of asymptotically good binary linear codes which are tolerant testable using $N^{\gamma}$ queries, where $N$ is the block length of the concerned code. (The rate, relative distance and thresholds $c_1, c_2$ for the tolerant testing depend on $\gamma$.)

8.5.2 Robust Testability of Products of Codes

Recall that a standard tester for a code is robust if, for every received word that is far from the code, the tester not only rejects with high probability, but with high probability the tester's local view of the received word is far from any accepting view (see Section 8.3 for a more formal definition). As mentioned before, for the product code $C_1 \otimes C_2$ there is a natural tester (which we call $T^{\otimes}$): flip a coin; if it is heads, check whether a random row is a codeword in $C_1$; if it is tails, check whether a random column is a codeword in $C_2$. This test is indeed robust in a couple of special cases — for example, when both $C_1$ and $C_2$ are Reed-Solomon codes (see Section 8.6.1 for more details), and when both $C_1$ and $C_2$ are themselves tensor products of a code [14]. P. Valiant showed that there exist linear codes $C_1$ and $C_2$ such that $C_1 \otimes C_2$ is not robustly testable [103]. Valiant constructs linear codes $C_1$, $C_2$ and a matrix $r$ such that every row (and column) of $r$ is "close" to some codeword in $C_1$ (and $C_2$, respectively), while $r$ is "far" from every codeword in $C_1 \otimes C_2$ (where close and far are in the sense of Hamming distance). However, Valiant's construction does not work when $C_1$ and $C_2$ are the same code. In this section, we give a reduction from Valiant's construction that exhibits a code $C$ such that $C^2$ is not robustly testable.
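The interpretation of $T^{\otimes}$'s expected robustness developed below — the average of the row distances and the column distances — can be computed directly for small examples. The following sketch is illustrative: the brute-force base code and the matrices are toy choices, not Valiant's codes.

```python
from fractions import Fraction

def dist_to_code(v, code):
    # relative Hamming distance of v to the nearest codeword (brute force,
    # feasible only for tiny codes)
    n = len(v)
    return min(Fraction(sum(a != b for a, b in zip(v, c)), n) for c in code)

def expected_robustness(M, code_rows, code_cols):
    # T(x) picks a uniformly random row (prob 1/2) or column (prob 1/2);
    # its robustness on that seed is the distance of the local view to the
    # corresponding code, so the expectation is the average of the two
    # per-line averages.
    rows = [dist_to_code(row, code_rows) for row in M]
    cols = [dist_to_code(list(col), code_cols) for col in zip(*M)]
    return Fraction(1, 2) * (sum(rows) / len(rows) + sum(cols) / len(cols))
```

With the length-2 repetition code $\{00, 11\}$ in both roles, a codeword has expected robustness $0$, while corrupting one entry of a $2 \times 2$ codeword raises it to $1/4$ — one row and one column each become $1/2$-far.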
Preliminaries and Known Results

A code $C$ is said to be robustly testable if it has an $\Omega(1)$-robust tester. For a given code $C$ of block length $n$ over $\mathbb{F}_q$ and a vector $r \in \mathbb{F}_q^n$, the (relative) Hamming distance of $r$ to the closest codeword in $C$ is denoted by $\delta_C(r)$.

Asking whether $T^{\otimes}$ is a robust tester has the following nice interpretation. The queries of $T^{\otimes}$ are either rows or columns of the received word $r$; let the row or column corresponding to the random seed $s$ be denoted by $r_s$. Then the robustness of $T^{\otimes}$ on inputs $(r, s)$, namely $\rho^{T^{\otimes}}(r, s)$, is just $\delta_{C_1}(r_s)$ when $s$ corresponds to a row, and $\delta_{C_2}(r_s)$ when $s$ corresponds to a column. Therefore, the expected robustness of $T^{\otimes}$ on $r$ is the average of the following two quantities: the average relative distance of the rows of $r$ from $C_1$, and the average relative distance of the columns of $r$ from $C_2$. In particular, if $T^{\otimes}$ is $\Omega(1)$-robust, then for every received word $r$ such that all rows (and columns) of $r$ are $o(1)$-close to $C_1$ (and $C_2$), $r$ is $o(1)$-close to $C_1 \otimes C_2$. P. Valiant proved the following result.

Theorem 8.4 ([103]). There exist linear codes $C_1$ and $C_2$ with parameters $[n_1, k_1, d_1 = \Omega(n_1)]$ and $[n_2, k_2, d_2 = \Omega(n_2)]$, and an $n_2 \times n_1$ received word $r$, such that every row of $r$ is $O(1/n_1)$-close to $C_1$ and every column of $r$ is a codeword of $C_2$, but $r$ is $\Omega(1)$-far from $C_1 \otimes C_2$.

Note that in the above construction $n_1 \neq n_2$, and in particular $C_1$ and $C_2$ are not the same code.

Reduction from the Construction of Valiant

In this section, we prove the following result.

Theorem 8.5. Let $C_1 \neq C_2$ be $[n_1, k_1, d_1 = \Omega(n_1)]$ and $[n_2, k_2, d_2 = \Omega(n_2)]$ codes, respectively (with $n_2 \geq n_1$), and let $r$ be an $n_2 \times n_1$ matrix such that every row (and column) of $r$ is $\epsilon_1(n_1)$-close to $C_1$ ($\epsilon_2(n_2)$-close to $C_2$), but $r$ is $\delta$-far from $C_1 \otimes C_2$. Then there exists a linear code $C$ with parameters $[n, k, d = \Omega(n)]$ and a received word $r'$ such that every row (and column) of $r'$ is $\epsilon_1(n_1)/2$-close (and $\epsilon_2(n_2)/2$-close) to $C$, but $r'$ is $\delta/4$-far from $C^2$.

Proof.
We first assume that $n_1$ divides $n_2$, and let $a = n_2/n_1$. For any $x \in \mathbb{F}^{k_1}$ and $y \in \mathbb{F}^{k_2}$, define
$$C(x, y) = \big( (C_1(x))^a,\ C_2(y) \big).$$
Thus, $k = k_1 + k_2$ and $n = a n_1 + n_2 = 2 n_2$. Also, as $d_1 = \Omega(n_1)$ and $d_2 = \Omega(n_2)$, we have $d = \Omega(n)$. We now construct the $n \times n$ matrix $r'$ from $r$: the lower-left $n_2 \times n_2$ sub-matrix of $r'$ contains the matrix $r^a$, where $r^a$ is the horizontal concatenation of $a$ copies of $r$ (recall that $r$ is an $n_2 \times n_1$ matrix); every other entry of $r'$ is $0$. See Figure 8.1 for an example with $a = 2$.

Figure 8.1: The construction of the new received word $r'$ from $r$ for the case when $n_2 = 2 n_1$ and $a = 2$. The shaded boxes represent $r$, and the unshaded regions are all $0$s.

Let $m$ be the codeword in $C_1 \otimes C_2$ closest to $r$, and construct $m'$ from $m$ in the same manner as $r'$ was constructed from $r$. We first claim that $m'$ is the codeword in $C^2$ closest to $r'$.⁵ For the sake of contradiction, assume that there is some other codeword $m''$ in $C^2$ such that $\Delta(r', m'') < \Delta(r', m')$. For any $2n_2 \times 2n_2$ matrix $M$, let $\widehat{M}$ denote its lower-left $n_2 \times n_2$ sub-matrix. Note that, by the definition of $C^2$, $\widehat{m''} = p^a$ for some $p \in C_1 \otimes C_2$. Further, as $r'$ (necessarily) has $0$s everywhere other than in $\widehat{r'}$, the assumption $\Delta(r', m'') < \Delta(r', m')$ implies $\Delta(r, p) < \Delta(r, m)$, which contradicts the definition of $m$.

Finally, it is easy to see that
$$\delta_{C^2}(r') = \frac{\Delta(r', m')}{(2 n_2)^2} = \frac{a \cdot \Delta(r, m)}{(2 n_2)^2} = \frac{\Delta(r, m)}{4 n_1 n_2} = \frac{\delta}{4},$$
and that if for any row (or column) the relative distance of $r$ restricted to that row (or column) from $C_1$ (resp. $C_2$) is at most $\eta$, then for every row (or column) the relative distance of $r'$ restricted to that row (or column) from $C$ is at most $\eta/2$. This completes the proof for the case when $n_1$ divides $n_2$.
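The matrix construction in the proof above (for the case $n_1 \mid n_2$) can be sketched directly; the function name and the list-of-rows matrix representation are illustrative assumptions.

```python
def lift_received_word(r, n1, n2):
    # Build r' for the combined code C(x, y) = (C1(x)^a, C2(y)), a = n2 // n1.
    # Assumes n1 divides n2; the block length of C is n = a*n1 + n2 = 2*n2.
    assert n2 % n1 == 0
    a = n2 // n1
    n = 2 * n2
    rp = [[0] * n for _ in range(n)]     # all zeros outside the shaded block
    for i in range(n2):
        wide = r[i] * a                  # horizontal concatenation of a copies
        for j in range(n2):
            rp[n - n2 + i][j] = wide[j]  # lower-left n2 x n2 sub-matrix
    return rp
```

The differences between $r'$ and $m'$ all live in the lower-left block, which occupies a quarter of the $(2n_2)^2$ entries — this is exactly where the factor $\delta/4$ in the distance calculation comes from.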
For the case when $n_1$ does not divide $n_2$, a similar construction works if one defines $C$ in the following manner (for any $x \in \mathbb{F}^{k_1}$ and $y \in \mathbb{F}^{k_2}$):
$$C(x, y) = \Big( (C_1(x))^{m/n_1},\ (C_2(y))^{m/n_2} \Big),$$
where $m = \mathrm{lcm}(n_1, n_2)$. The received word $r'$ in this case has as its lower-left $m \times m$ sub-matrix the matrix obtained by vertically concatenating $m/n_2$ copies of $r^{m/n_1}$, and has $0$s everywhere else. □

⁵Note that the all-zeros word is a codeword of $C$, as the all-zeros vector is a codeword in both $C_1$ and $C_2$, and likewise the all-zeros matrix is a codeword of $C_1 \otimes C_2$.

Theorems 8.4 and 8.5 imply the following result.

Corollary 8.6. There exists a linear code $C$ with linear distance such that the tester $T^{\otimes}$ is not $\Omega(1)$-robust for $C^2$.

8.6 Tolerant Testing of Reed-Muller Codes

In this section, we discuss testers for codes based on multivariate polynomials.

8.6.1 Bivariate Polynomial Codes

As we saw in Section 8.5.2, one cannot have a robust standard tester for $C^2$ in general. In this subsection, we consider the special case when $C = \mathrm{RS}[n, d+1, n-d]_q$, that is, the Reed-Solomon code based on evaluations of degree-$d$ polynomials over $\mathbb{F}_q$ at $n$ distinct points in the field. We show that the tester for $C^2$ considered in Section 8.5.2 is tolerant for this special case. It is well known (see, for example, Proposition 2 in [88]) that in this case $C^2$ is the code whose codewords are the evaluations of bivariate polynomials over $\mathbb{F}_q$ of degree $d$ in each variable. The problem of low-degree testing for bivariate polynomials is a well-studied one; in particular, we use the work of Polishchuk and Spielman [88], who analyze a tester using axis-parallel lines. Call a bivariate polynomial one of degree $(d_1, d_2)$ if the maximum degrees of the two variables are $d_1$ and $d_2$ respectively. In what follows, we denote by $r' \in \mathbb{F}_q^{n \times n}$ the received word to be tested (thought of as an $n \times n$ matrix), and we let $Q(x, y)$ be the degree-$(d, d)$ polynomial whose encoding is closest to $r'$.
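The basic primitive behind the axis-parallel testers that follow is the distance of a single row or column to its nearest degree-$d$ univariate polynomial. For toy parameters this can be computed by brute force, as in the following illustrative sketch; the field size $q = 5$ and the degree bound $d = 1$ are assumptions for the example, not parameters from the text.

```python
import itertools

q, d = 5, 1   # toy field F_q and degree bound (illustrative choices)

def poly_eval(coeffs, x):
    # evaluate coeffs[0] + coeffs[1]*x + ... over F_q via Horner's rule
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % q
    return y

def dist_to_degree_d(row):
    # relative distance of the row (values at x = 0, 1, ..., len-1) to the
    # nearest degree-<=d univariate polynomial, by brute force over all
    # q^(d+1) coefficient vectors
    best = len(row)
    for coeffs in itertools.product(range(q), repeat=d + 1):
        best = min(best, sum(1 for x, v in enumerate(row)
                             if poly_eval(coeffs, x) != v))
    return best / len(row)

def tolerant_row_test(row, tau):
    # accept iff the row is tau-close to some degree-d polynomial
    return dist_to_degree_d(row) <= tau
```

For example, the evaluations of $2x + 1$ over $\mathbb{F}_5$ are at distance $0$, while corrupting one of the five points yields relative distance $1/5$ — still accepted by the tolerant test with threshold $\tau = 1/5$. (In practice one would use polynomial-time Reed-Solomon decoding rather than brute force.)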
We now specify the tolerant tester $T'$. The upper bound of $1 - \sqrt{k/n}$ on the threshold $\tau$ comes from the fact that this is the largest radius for which decoding an $\mathrm{RS}[n, k+1]$ code is known to be solvable in polynomial time [63].

1. Fix $\tau$ where $0 \le \tau \le 1 - \sqrt{k/n}$.
2. With probability $1/2$ each, choose $b = 0$ or $b = 1$.
   - If $b = 0$, choose a row $r$ uniformly at random and reject if $\delta(M(r, \cdot), p) > \tau$ for every univariate polynomial $p$ of degree $k$; accept otherwise.
   - If $b = 1$, choose a column $c$ uniformly at random and reject if $\delta(M(\cdot, c), p) > \tau$ for every univariate polynomial $p$ of degree $k$; accept otherwise.

The following theorem shows that $T'$ is a tolerant tester.

Theorem 8.7. There exists an absolute constant $c_0 > 0$ such that for $\tau \le 1 - \sqrt{k/n}$, the tester $T'$ with threshold $\tau$ is a $(c_1, c_2, N)$-tolerant tester for $C^2$ (where $C = \mathrm{RS}[n, k+1, n-k]_q$), where $c_1 = \tau/(2c_0)$, $c_2 = 2c_0$, and $N$ is the block length of $C^2$.

Proof. To analyze $T'$, let $R(r, \cdot)$ be the closest degree-$k$ univariate polynomial (breaking ties arbitrarily) for each row $r$. Similarly, construct $C(\cdot, c)$ for each column $c$. We will use the following refinement of the bivariate testing lemma of [88]:

Lemma 8.8 ([88, 13]). There exists a universal constant $c_0$ such that the following holds. If $8k \le n$, then $\delta(M, C^2) \le c_0 \cdot (\delta(M, R) + \delta(M, C))$.

The following proposition shows that the standard-tester version of $T'$ (that is, $T'$ with $\tau = 0$) is a robust tester.

Proposition 8.9. $T'$ with $\tau = 0$ is a $2c_0$-robust tester, where $c_0$ is the constant from Lemma 8.8.

Proof. By the definition of the row polynomial $R$, for any row index $r$, the robustness of the tester with $b = 0$ and row $r$ is $\rho(M, \langle 0, r \rangle) = \delta(M(r, \cdot), R(r, \cdot))$. Similarly, for $b = 1$ we have $\rho(M, \langle 1, c \rangle) = \delta(M(\cdot, c), C(\cdot, c))$.
Now the expected robustness of the test is given by

$$\rho(M) \;=\; \Pr[b = 0] \cdot \mathbb{E}_r\big[\delta(M(r, \cdot), R(r, \cdot))\big] \,+\, \Pr[b = 1] \cdot \mathbb{E}_c\big[\delta(M(\cdot, c), C(\cdot, c))\big] \;=\; \tfrac{1}{2}\big(\delta(M, R) + \delta(M, C)\big).$$

Using Lemma 8.8, we get $\delta(M, C^2) \le 2 c_0 \cdot \rho(M)$, as required.

From the description of $T'$, it is clear that it has a partitioned query pattern. There are two partitions: one for the rows (corresponding to the choice $b = 0$) and one for the columns (corresponding to the choice $b = 1$). Lemmas 8.2 and 8.3 prove Theorem 8.7, where $c_0$ is the constant from Lemma 8.8.

8.6.2 General Reed-Muller Codes

We now turn our attention to testing of general Reed-Muller codes. Recall that $\mathrm{RM}_q(k, m)$ is the linear code consisting of evaluations of $m$-variate polynomials over $\mathbb{F}_q$ of total degree at most $k$ at all points in $\mathbb{F}_q^m$.⁶ To test codewords of $\mathrm{RM}_q(k, m)$, we need to, given a function $f : \mathbb{F}_q^m \to \mathbb{F}_q$ as a table of values, test if $f$ is close to an $m$-variate polynomial of total degree $k$. We will do this using the following natural and by now well-studied low-degree test, which we call the lines test: pick a random line in $\mathbb{F}_q^m$ and check if the restriction of $f$ to the line is a univariate polynomial of degree at most $k$. In order to achieve tolerance, we will modify the above test to accept if the restriction of $f$ to the picked line is within distance $\tau$ of some degree-$k$ univariate polynomial, for a threshold $\tau$. Using the analysis of the low-degree test from [42], we can show the following.

⁶The results of the previous section were for polynomials whose degree in each individual variable was bounded by some value; here we study the total degree case.

Theorem 8.8. For $0 \le \tau \le 1 - \sqrt{k/q}$ and $q = \Omega(k)$, $\mathrm{RM}_q(k, m)$ is $(c_1, c_2, N)$-testable with $c_1 = \tau/(2c)$ and $c_2 = 2c$, where $N = q^m$ and $d$ are the block length and the distance of the code.

Proof.
Recall that our goal is to test if a given function $f : \mathbb{F}_q^m \to \mathbb{F}_q$ is close to an $m$-variate polynomial of total degree $k$. For any $x, h \in \mathbb{F}_q^m$, the line passing through $x$ in direction $h$ is given by the set $\ell_{x,h} = \{x + t h \mid t \in \mathbb{F}_q\}$. Further, define $p_{x,h}(\cdot)$ to be the univariate polynomial of degree at most $k$ which is closest (in Hamming distance) to the restriction of $f$ to $\ell_{x,h}$. We will use the following result.

Theorem 8.9 ([42]). There exists a constant $c$ such that for all $k$, if $q$ is a prime power that is at least $ck$, then given a function $f : \mathbb{F}_q^m \to \mathbb{F}_q$ with

$$\gamma \;=\; \mathbb{E}_{x,h}\Big[\Pr_{t \in \mathbb{F}_q}\big[p_{x,h}(t) \ne f(x + t h)\big]\Big]$$

sufficiently small, there exists an $m$-variate polynomial $g$ of total degree at most $k$ such that $\delta(f, g) \le 2\gamma$.

The above result clearly implies that the lines test is robust, which we record in the following corollary.

Corollary 8.10. There exists a constant $c$ such that the lines test for $\mathrm{RM}_q(k, m)$ with $q \ge ck$ is $2$-robust.

The lines test picks a random line by choosing $x$ and $h$ randomly. Consider the case when $h$ is fixed. It is not hard to check that there is a partition $\mathbb{F}_q^m = S_1 \cup \cdots \cup S_{q^{m-1}}$, where each $S_i$ has size $q$ and is a line in direction $h$. In other words:

Proposition 8.10. The lines test has a partitioned query pattern.

The proposed tolerant tester for $\mathrm{RM}_q(k, m)$ is as follows: pick $x, h \in \mathbb{F}_q^m$ uniformly at random and check if the input restricted to $\ell_{x,h}$ is $\tau$-close to some univariate polynomial of degree $k$. If so accept, otherwise reject. When the threshold $\tau$ satisfies $\tau \le 1 - \sqrt{k/q}$, the test can be implemented in polynomial time [63]. From Corollary 8.10, Proposition 8.10, and Lemmas 8.2 and 8.3, the above is indeed a tolerant tester for $\mathrm{RM}_q(k, m)$, and Theorem 8.8 follows.

8.7 Bibliographic Notes and Open Questions

The results in this chapter (other than those in Section 8.5.2) were presented in [57]. Results in Section 8.5.2 appear in [26]. In the general context of property testing, the notion of tolerant testing was introduced by Parnas et al.
[83] along with the related notion of distance approximation. Parnas et al. also give tolerant testers for clustering. We feel that codeword testing is a particularly natural instance in which to study tolerant testing. (In fact, if LTCs had been defined solely from a coding-theoretic viewpoint, without their relevance and applications to PCPs in mind, we feel it is likely that the original definition itself would have required tolerant testers.)

The question of whether the natural tester for $C_1 \otimes C_2$ is a robust one was first explicitly asked by Ben-Sasson and Sudan [14]. P. Valiant showed that in general the answer to the question is no. Dinur, Sudan and Wigderson [30] further showed that the answer is positive if at least one of $C_1$ or $C_2$ is a smooth code, for a certain notion of smoothness. They also showed that any non-adaptive and bi-regular binary linear LTC is a smooth code. A bi-regular LTC has a tester that in every query probes the same number of positions, and every bit in the received word is queried by the same number of queries. The latter requirement (in the terminology of this chapter) is that the tester is $(\langle 1, p \rangle, \langle 0, 0 \rangle, 1)$-smooth, where the tester makes $p$ queries. The result of [30], however, only works with constant query complexity. Note that for such an LTC, Lemma 8.1 implies that the code is also tolerant testable.

Obtaining non-trivial lower bounds on the block length of codes that are locally testable with very few (even 3) queries is an extremely interesting question. This problem has remained open and has resisted even moderate progress despite all the advances in constructions of LTCs. The requirement of having a tolerant local tester is a stronger one. While we have seen that we can get tolerance with parameters similar to those of the best known LTCs, it remains an interesting question whether the added requirement of tolerance makes the task of proving lower bounds more tractable. In particular:

Open Question 8.1.
Does there exist a code with constant rate and linear distance that has a tolerant tester making a constant number of queries?

This seems like a good first step towards understanding whether locally testable codes with constant rate and linear distance exist, a question which is arguably one of the grand challenges in this area. For interesting work in this direction, which proves that such codes, if they exist, cannot also be cyclic, see [10].

The standard testers for Reed-Muller codes considered in Section 8.6 (and hence the tolerant testers derived from them) work only when the size of the field is larger than the degree of the polynomial being tested. Results in Chapter 7 and those in [74] give a standard tester for Reed-Muller codes that works over all fields. These testers do have a partitioned query pattern; however, it is not clear if they are robust, so our techniques for converting them into tolerant testers fail. It would be interesting to resolve the following.

Open Question 8.2. Design tolerant testers for RM codes over any finite field.

Chapter 9

CONCLUDING REMARKS

9.1 Summary of Contributions

In this thesis, we looked at two different relaxations of the decoding problem for error correcting codes: list decoding and property testing.

In list decoding, we focused on achieving the best possible tradeoff between the rate of a code and the fraction of errors that can be handled by an efficient list-decoding algorithm. Our first result was an explicit construction of a family of codes, along with efficient list-decoding algorithms, that achieves the list-decoding capacity. That is, for any $0 < R < 1$, we presented folded Reed-Solomon codes of rate $R$ along with polynomial-time list-decoding algorithms that can correct up to a $1 - R - \varepsilon$ fraction of errors (for any $\varepsilon > 0$).
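To put the $1 - R - \varepsilon$ bound in perspective, here is a small numeric comparison (illustrative only; the sample rates are our own choice) of three error-correction radii for a code of rate $R$: the classical unique-decoding radius $(1-R)/2$, the Guruswami-Sudan list-decoding radius $1 - \sqrt{R}$ for Reed-Solomon codes [63], and the list-decoding capacity $1 - R$ that the folded Reed-Solomon codes approach:

```python
# For a few rates R, compare the fraction of errors correctable by:
#   unique decoding:         (1 - R) / 2
#   Guruswami-Sudan (RS):    1 - sqrt(R)
#   list-decoding capacity:  1 - R   (approached by folded RS, up to eps)
import math

def radii(R):
    """Return (unique, guruswami_sudan, capacity) error fractions for rate R."""
    return ((1 - R) / 2, 1 - math.sqrt(R), 1 - R)

for R in (0.1, 0.25, 0.5, 0.75):
    u, gs, cap = radii(R)
    print(f"R={R:.2f}  unique={u:.3f}  Guruswami-Sudan={gs:.3f}  capacity={cap:.3f}")
```

Since $(\sqrt{R} - 1)^2 \ge 0$ implies $(1-R)/2 \le 1 - \sqrt{R} \le 1 - R$, each successive radius dominates the previous one at every rate, and the gap closed by capacity-achieving codes is largest at high rates.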
This was the first result to achieve the list-decoding capacity for any rate (and over any alphabet), and it answered one of the central open questions in coding theory. We also constructed explicit codes that achieve the tradeoff above with alphabets of size $2^{\mathrm{poly}(1/\varepsilon)}$, which is not that much bigger than the optimal size of $2^{\Theta(1/\varepsilon)}$.

For alphabets of fixed size, we presented explicit codes along with efficient list-decoding algorithms that can correct a fraction of errors up to the so-called Blokh-Zyablov bound. In particular, these give binary codes of rate $\Omega(\varepsilon^3)$ that can be list decoded up to a $1/2 - \varepsilon$ fraction of errors. These codes have rates that come close to the optimal rate of $\Omega(\varepsilon^2)$ that can be achieved by random codes with exponential-time list-decoding algorithms. A key ingredient in designing codes over smaller alphabets was coming up with optimal list-recovery algorithms. We also showed that the list-recovery algorithm for Reed-Solomon codes due to Guruswami and Sudan is the best possible. In addition, we presented some explicit bad list-decoding configurations for Reed-Solomon codes.

Our contributions in property testing of error correcting codes are two-fold. First, we presented local testers for Reed-Muller codes that use a near-optimal number of queries to test membership in Reed-Muller codes over fixed alphabets. Second, we defined a natural variant of local testers called tolerant testers and showed that they can have parameters comparable to those of the best known LTCs.

9.2 Directions for Future Work

Even though we made some algorithmic progress in list decoding and property testing of error correcting codes, many questions are still left unanswered. We have highlighted open questions throughout the thesis. In this section, we focus on some of the prominent ones (and related questions that we did not discuss earlier).
9.2.1 List Decoding

The focus of this thesis in list decoding was on the optimal tradeoff between the rate and list decodability of codes. We first highlight the algorithmic challenges in this vein (most of which have been discussed in the earlier chapters).

- The biggest unresolved question from this thesis is to come up with explicit codes over fixed alphabets that achieve the list-decoding capacity. In particular, is there a polynomial-time construction of binary codes of rate $\Omega(\varepsilon^2)$ that can be list decoded in polynomial time up to a $1/2 - \varepsilon$ fraction of errors? (Open Question 4.1)

- A less ambitious goal than the one above would be to give a polynomial-time construction of a binary code with rate $\Omega(\varepsilon)$ that can be list decoded up to a $1 - \varepsilon$ fraction of erasures. Erasures are a weaker noise model that we have not considered in this thesis: the only kind of error allowed is the "dropping" of a symbol during transmission, and it is assumed that the receiver knows which symbols have been erased. For this weaker noise model, one can show that for rate $R$ the optimal fraction of erasures that can be list decoded is $1 - R$.

- Another less ambitious goal would be to resolve the following question. Is there a polynomial-time construction of a code that can be list decoded up to a $1/2 - \varepsilon$ fraction of errors with rate that is asymptotically better than $\varepsilon^3$?

- Even though we achieved list-decoding capacity for large alphabets (that is, for a rate-$R$ code, list decoding a $1 - R - \varepsilon$ fraction of errors), the worst-case list size was $n^{O(1/\varepsilon)}$, which is very far from the $O(1/\varepsilon)$ worst-case list size achievable by random codes. A big open question is to come up with explicit codes that achieve the list-decoding capacity with constant worst-case list size. A less ambitious goal would be to reduce the worst-case list size to $n^c$ for some constant $c$ that is independent of $\varepsilon$.
(See Section 3.7)

We now look at some questions that relate to the combinatorial aspects of list decoding.

- For a rate-$R$ Reed-Solomon code, can one list decode more than a $1 - \sqrt{R}$ fraction of errors in polynomial time? (Open Question 6.2)

- To get to within $\varepsilon$ of list-decoding capacity, can one prove a lower bound on the worst-case list size? For random codes it is known that lists of size $O(1/\varepsilon)$ suffice, but no general lower bound is known. For high fractions of errors, tight bounds are known [65].

- For codes of rate $R$ over a fixed alphabet of size $q \ge 2$, can one show the existence of linear codes that have $O(1/\varepsilon)$ many codewords in any Hamming ball of fractional radius $H_q^{-1}(1 - R) - \varepsilon$? (See the discussion in Section 2.2.1)

9.2.2 Property Testing

Here are some open questions concerning property testing of codes.

- The biggest open question in this area is to answer the following. Are there codes of constant rate and linear distance that can be locally tested with a constant number of queries?

- A less ambitious (but perhaps still very challenging) goal is to show that the answer to the question above is no for 3 queries.

- Can one show that the answer to the first question (or even the second) is no if one also adds the extra requirement of tolerant testability? (Open Question 8.1)

BIBLIOGRAPHY

[1] Noga Alon, Tali Kaufman, Michael Krivelevich, Simon Litsyn, and Dana Ron. Testing Reed-Muller codes. IEEE Transactions on Information Theory, 51(11):4032–4039, 2005.

[2] Noga Alon and Joel Spencer. The Probabilistic Method. John Wiley and Sons, Inc., 1992.

[3] Sigal Ar, Richard Lipton, Ronitt Rubinfeld, and Madhu Sudan. Reconstructing algebraic functions from mixed data. SIAM Journal on Computing, 28(2):488–511, 1999.

[4] Sanjeev Arora, László Babai, Jacques Stern, and Z. Sweedyk. The hardness of approximate optima in lattices, codes, and systems of linear equations. Journal of Computer and System Sciences, 54:317–331, 1997.
[5] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. Journal of the ACM, 45(3):501–555, 1998.

[6] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characterization of NP. Journal of the ACM, 45(1):70–122, 1998.

[7] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Combinatorica, 23(3):365–426, 2003.

[8] László Babai, Lance Fortnow, Leonid A. Levin, and Mario Szegedy. Checking computations in polylogarithmic time. In Proceedings of the 23rd Annual ACM Symposium on the Theory of Computing (STOC), pages 21–31, 1991.

[9] László Babai, Lance Fortnow, and Carsten Lund. Non-deterministic exponential time has two-prover interactive protocols. Computational Complexity, 1:3–40, 1991.

[10] László Babai, Amir Shpilka, and Daniel Stefankovic. Locally testable cyclic codes. IEEE Transactions on Information Theory, 51(8):2849–2858, 2005.

[11] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil Vadhan. Robust PCPs of proximity, shorter PCPs and applications to coding. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 1–10, 2004.

[12] Eli Ben-Sasson, Swastik Kopparty, and Jaikumar Radhakrishnan. Subspace polynomials and list decoding of Reed-Solomon codes. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 207–216, 2006.

[13] Eli Ben-Sasson and Madhu Sudan. Simple PCPs with poly-log rate and query complexity. In Proceedings of the 37th ACM Symposium on Theory of Computing (STOC), pages 266–275, 2005.

[14] Eli Ben-Sasson and Madhu Sudan. Robust locally testable codes and products of codes. Random Structures and Algorithms, 28(4):387–402, 2006.

[15] Elwyn Berlekamp. Algebraic Coding Theory. McGraw Hill, New York, 1968.

[16] Elwyn Berlekamp. Factoring polynomials over large finite fields. Mathematics of Computation, 24:713–735, 1970.
[17] Elwyn R. Berlekamp, Robert J. McEliece, and Henk C. A. van Tilborg. On the inherent intractability of certain coding problems. IEEE Transactions on Information Theory, 24:384–386, 1978.

[18] Daniel Bleichenbacher, Aggelos Kiayias, and Moti Yung. Decoding of interleaved Reed Solomon codes over noisy data. In Proceedings of the 30th International Colloquium on Automata, Languages and Programming (ICALP), pages 97–108, 2003.

[19] E. L. Blokh and Victor V. Zyablov. Existence of linear concatenated binary codes with optimal correcting properties. Prob. Peredachi Inform., 9:3–10, 1973.

[20] E. L. Blokh and Victor V. Zyablov. Linear Concatenated Codes. Moscow: Nauka, 1982 (in Russian).

[21] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. Journal of Computer and System Sciences, 47(3):549–595, 1993.

[22] Donald G. Chandler, Eric P. Batterman, and Govind Shah. Hexagonal, information encoding article, process and system. US Patent Number 4,874,936, October 1989.

[23] C. L. Chen and M. Y. Hsiao. Error-correcting codes for semiconductor memory applications: A state-of-the-art review. IBM Journal of Research and Development, 28(2):124–134, 1984.

[24] Peter M. Chen, Edward K. Lee, Garth A. Gibson, Randy H. Katz, and David A. Patterson. RAID: High-performance, reliable secondary storage. ACM Computing Surveys, 26(2):145–185, 1994.

[25] Qi Cheng and Daqing Wan. On the list and bounded distance decodability of Reed-Solomon codes. SIAM Journal on Computing, 37(1):195–209, 2007.

[26] Don Coppersmith and Atri Rudra. On the robust testability of product of codes. Electronic Colloquium on Computational Complexity (ECCC), Technical Report TR05-104, 2005.

[27] Don Coppersmith and Madhu Sudan. Reconstructing curves in three (and higher) dimensional spaces from noisy data. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing (STOC), pages 136–142, June 2003.
[28] Philippe Delsarte, Jean-Marie Goethals, and Florence Jessie MacWilliams. On generalized Reed-Muller codes and their relatives. Information and Control, 16:403–442, 1970.

[29] Peng Ding and Jennifer D. Key. Minimum-weight codewords as generators of generalized Reed-Muller codes. IEEE Transactions on Information Theory, 46:2152–2158, 2000.

[30] Irit Dinur, Madhu Sudan, and Avi Wigderson. Robust local testability of tensor products of LDPC codes. In Proceedings of the 10th International Workshop on Randomization and Computation (RANDOM), pages 304–315, 2006.

[31] Rodney G. Downey, Michael R. Fellows, Alexander Vardy, and Geoff Whittle. The parametrized complexity of some fundamental problems in coding theory. SIAM Journal on Computing, 29(2):545–570, 1999.

[32] Ilya Dumer, Daniele Micciancio, and Madhu Sudan. Hardness of approximating the minimum distance of a linear code. IEEE Transactions on Information Theory, 49(1):22–37, 2003.

[33] Ilya I. Dumer. Concatenated codes and their multilevel generalizations. In V. S. Pless and W. C. Huffman, editors, Handbook of Coding Theory, volume 2, pages 1911–1988. North Holland, 1998.

[34] Peter Elias. List decoding for noisy channels. Technical Report 335, Research Laboratory of Electronics, MIT, 1957.

[35] Peter Elias. Error-correcting codes for list decoding. IEEE Transactions on Information Theory, 37:5–12, 1991.

[36] Uriel Feige, Shafi Goldwasser, László Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. Journal of the ACM, 43(2):268–292, 1996.

[37] Uriel Feige and Daniele Micciancio. The inapproximability of lattice and coding problems with preprocessing. Journal of Computer and System Sciences, 69(1):45–67, 2004.

[38] Eldar Fischer. The art of uninformed decisions: A primer to property testing. Bulletin of the European Association for Theoretical Computer Science, (75):97–126, 2001.

[39] Eldar Fischer and Lance Fortnow.
Tolerant versus intolerant testing for boolean properties. Theory of Computing, 2(9):173–183, 2006.

[40] G. David Forney. Concatenated Codes. MIT Press, Cambridge, MA, 1966.

[41] G. David Forney. Generalized Minimum Distance decoding. IEEE Transactions on Information Theory, 12:125–131, 1966.

[42] Katalin Friedl and Madhu Sudan. Some improvements to total degree tests. In Proceedings of the 3rd Israel Symposium on Theory and Computing Systems (ISTCS), pages 190–198, 1995.

[43] Peter Gemmell, Richard Lipton, Ronitt Rubinfeld, Madhu Sudan, and Avi Wigderson. Self-testing/correcting for polynomials and for approximate functions. In Proceedings of the 23rd Symposium on the Theory of Computing (STOC), pages 32–42, 1991.

[44] Oded Goldreich. Short locally testable codes and proofs (survey). ECCC Technical Report TR05-014, 2005.

[45] Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. Journal of the ACM, 45(4):653–750, 1998.

[46] Oded Goldreich and Madhu Sudan. Locally testable codes and PCPs of almost linear length. In Proceedings of the 43rd Symposium on Foundations of Computer Science (FOCS), pages 13–22, 2002.

[47] Andrew Granville. The arithmetic properties of binomial coefficients. http://www.cecm.sfu.ca/organics/papers/granville/, 1996.

[48] Venkatesan Guruswami. Limits to list decodability of linear codes. In Proceedings of the 34th ACM Symposium on Theory of Computing (STOC), pages 802–811, 2002.

[49] Venkatesan Guruswami. List Decoding of Error-Correcting Codes. Number 3282 in Lecture Notes in Computer Science. Springer, 2004. (Winning thesis of the 2002 ACM Doctoral Dissertation Competition.)

[50] Venkatesan Guruswami. Algorithmic results in list decoding. In Foundations and Trends in Theoretical Computer Science (FnT-TCS), volume 2. NOW Publishers, 2006.

[51] Venkatesan Guruswami, Johan Håstad, Madhu Sudan, and David Zuckerman. Combinatorial bounds for list decoding.
IEEE Transactions on Information Theory, 48(5):1021–1035, 2002.

[52] Venkatesan Guruswami and Piotr Indyk. Expander-based constructions of efficiently decodable codes. In Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 658–667, 2001.

[53] Venkatesan Guruswami and Piotr Indyk. Efficiently decodable codes meeting Gilbert-Varshamov bound for low rates. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 756–757, 2004.

[54] Venkatesan Guruswami and Piotr Indyk. Linear-time encodable/decodable codes with near-optimal rate. IEEE Transactions on Information Theory, 51(10):3393–3400, October 2005.

[55] Venkatesan Guruswami and Anindya C. Patthak. Correlated Algebraic-Geometric codes: Improved list decoding over bounded alphabets. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), October 2006.

[56] Venkatesan Guruswami and Atri Rudra. Limits to list decoding Reed-Solomon codes. In Proceedings of the 37th ACM Symposium on Theory of Computing (STOC), pages 602–609, May 2005.

[57] Venkatesan Guruswami and Atri Rudra. Tolerant locally testable codes. In Proceedings of the 9th International Workshop on Randomization and Computation (RANDOM), pages 306–317, 2005.

[58] Venkatesan Guruswami and Atri Rudra. Explicit capacity-achieving list-decodable codes. In Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), pages 1–10, May 2006.

[59] Venkatesan Guruswami and Atri Rudra. Limits to list decoding Reed-Solomon codes. IEEE Transactions on Information Theory, 52(8), August 2006.

[60] Venkatesan Guruswami and Atri Rudra. Better binary list-decodable codes via multilevel concatenation. In Proceedings of the 11th International Workshop on Randomization and Computation (RANDOM), 2007. To appear.

[61] Venkatesan Guruswami and Atri Rudra. Concatenated codes can achieve list decoding capacity. Manuscript, June 2007.
[62] Venkatesan Guruswami and Atri Rudra. Explicit bad list decoding configurations for Reed-Solomon codes of constant rate. Manuscript, May 2006.

[63] Venkatesan Guruswami and Madhu Sudan. Improved decoding of Reed-Solomon and algebraic-geometric codes. IEEE Transactions on Information Theory, 45:1757–1767, 1999.

[64] Venkatesan Guruswami and Madhu Sudan. Extensions to the Johnson bound. Manuscript, February 2001.

[65] Venkatesan Guruswami and Salil Vadhan. A lower bound on list size for list decoding. In Proceedings of the 9th International Workshop on Randomization and Computation (RANDOM), pages 318–329, 2005.

[66] Venkatesan Guruswami and Alexander Vardy. Maximum-likelihood decoding of Reed-Solomon codes is NP-hard. IEEE Transactions on Information Theory, 51(7):2249–2256, 2005.

[67] Richard W. Hamming. Error detecting and error correcting codes. Bell System Technical Journal, 29:147–160, April 1950.

[68] Prahladh Harsha. Robust PCPs of Proximity and Shorter PCPs. PhD thesis, Massachusetts Institute of Technology, 2004.

[69] Edward F. Assmus Jr. and Jennifer D. Key. Polynomial codes and finite geometries. In V. S. Pless and W. C. Huffman, editors, Handbook of Coding Theory, volume 2, pages 1269–1343. North Holland, 1998.

[70] Jørn Justesen and Tom Høholdt. Bounds on list decoding of MDS codes. IEEE Transactions on Information Theory, 47(4):1604–1609, May 2001.

[71] Charanjit S. Jutla, Anindya C. Patthak, and Atri Rudra. Testing polynomials over general fields. Manuscript, 2004.

[72] Charanjit S. Jutla, Anindya C. Patthak, Atri Rudra, and David Zuckerman. Testing low-degree polynomials over prime fields. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 423–432, 2004.

[73] Tali Kaufman and Simon Litsyn. Almost orthogonal linear codes are locally testable. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 317–326, 2005.

[74] Tali Kaufman and Dana Ron.
Testing polynomials over general fields. SIAM Journal on Computing, 36(3):779–802, 2006.

[75] Victor Y. Krachkovsky. Reed-Solomon codes for correcting phased error bursts. IEEE Transactions on Information Theory, 49(11):2975–2984, November 2003.

[76] Michael Langberg. Private codes or succinct random codes that are (almost) perfect. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 325–334, October 2004.

[77] Rudolf Lidl and Harald Niederreiter. Introduction to Finite Fields and their Applications. Cambridge University Press, Cambridge, MA, 1986.

[78] Yu. V. Linnik. On the least prime in an arithmetic progression. I. The basic theorem. Mat. Sbornik N. S., 15(57):139–178, 1944.

[79] Antoine C. Lobstein. The hardness of solving subset sum with preprocessing. IEEE Transactions on Information Theory, 36:943–946, 1990.

[80] Florence Jessie MacWilliams and Neil J. A. Sloane. The Theory of Error-Correcting Codes. Elsevier/North-Holland, Amsterdam, 1981.

[81] Robert J. McEliece. On the average list size for the Guruswami-Sudan decoder. In 7th International Symposium on Communications Theory and Applications (ISCTA), July 2003.

[82] D. E. Muller. Application of boolean algebra to switching circuit design and to error detection. IEEE Transactions on Computers, 3:6–12, 1954.

[83] Michal Parnas, Dana Ron, and Ronitt Rubinfeld. Tolerant property testing and distance approximation. Journal of Computer and System Sciences, 72(6):1012–1042, 2006.

[84] Farzad Parvaresh and Alexander Vardy. Multivariate interpolation decoding beyond the Guruswami-Sudan radius. In Proceedings of the 42nd Allerton Conference on Communication, Control and Computing, 2004.

[85] Farzad Parvaresh and Alexander Vardy. Correcting errors beyond the Guruswami-Sudan radius in polynomial time. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 285–294, 2005.

[86] Anindya C. Patthak.
Error Correcting Codes: Local-testing, List-decoding, and Applications to Cryptography. PhD thesis, University of Texas at Austin, 2007.

[87] Larry L. Peterson and Bruce S. Davis. Computer Networks: A Systems Approach. Morgan Kaufmann Publishers, San Francisco, 1996.

[88] A. Polishchuk and D. A. Spielman. Nearly-linear size holographic proofs. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing (STOC), pages 194–203, 1994.

[89] Irving S. Reed. A class of multiple-error-correcting codes and the decoding scheme. IEEE Transactions on Information Theory, 4:38–49, 1954.

[90] Irving S. Reed and Gustav Solomon. Polynomial codes over certain finite fields. SIAM Journal on Applied Mathematics, 8:300–304, 1960.

[91] Oded Regev. Improved inapproximability of lattice and coding problems with preprocessing. IEEE Transactions on Information Theory, 50:2031–2037, 2004.

[92] Dana Ron. Property testing. In S. Rajasekaran, P. M. Pardalos, J. H. Reif, and J. D. P. Rolim, editors, Handbook of Randomization, pages 597–649, 2001.

[93] Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM Journal on Computing, 25(2):252–271, 1996.

[94] Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.

[95] Michael Sipser and Daniel Spielman. Expander codes. IEEE Transactions on Information Theory, 42(6):1710–1722, 1996.

[96] Madhu Sudan. Efficient Checking of Polynomials and Proofs and the Hardness of Approximation Problems. ACM Distinguished Theses Series. Lecture Notes in Computer Science, no. 1001, Springer, 1996.

[97] Madhu Sudan. Decoding of Reed-Solomon codes beyond the error-correction bound. Journal of Complexity, 13(1):180–193, 1997.

[98] Madhu Sudan. List decoding: Algorithms and applications. SIGACT News, 31:16–27, 2000.

[99] Madhu Sudan. Lecture notes on algorithmic introduction to coding theory, Fall 2001. Lecture 15.

[100] Madhu Sudan.
Lecture notes on algorithmic introduction to coding theory, Fall 2001. Lecture 6.

[101] Amnon Ta-Shma and David Zuckerman. Extractor codes. IEEE Transactions on Information Theory, 50(12):3015–3025, 2004.

[102] Christian Thommesen. The existence of binary linear concatenated codes with Reed-Solomon outer codes which asymptotically meet the Gilbert-Varshamov bound. IEEE Transactions on Information Theory, 29(6):850–853, November 1983.

[103] Paul Valiant. The tensor product of two codes is not necessarily robustly testable. In Proceedings of the 9th International Workshop on Randomization and Computation (RANDOM), pages 472–481, 2005.

[104] Jacobus H. van Lint. Introduction to Coding Theory. Graduate Texts in Mathematics 86, Springer-Verlag, Berlin, third edition, 1999.

[105] Stephen B. Wicker and Vijay K. Bhargava, editors. Reed-Solomon Codes and Their Applications. John Wiley and Sons, Inc., September 1999.

[106] John M. Wozencraft. List decoding. Quarterly Progress Report, Research Laboratory of Electronics, MIT, 48:90–95, 1958.

[107] Chaoping Xing. Nonlinear codes from algebraic curves improving the Tsfasman-Vladut-Zink bound. IEEE Transactions on Information Theory, 49(7):1653–1657, 2003.

[108] Victor A. Zinoviev. Generalized concatenated codes. Prob. Peredachi Inform., 12(1):5–15, 1976.

[109] Victor A. Zinoviev and Victor V. Zyablov. Codes with unequal protection. Prob. Peredachi Inform., 15(4):50–60, 1979.

[110] Victor V. Zyablov and Mark S. Pinsker. List cascade decoding. Problems of Information Transmission, 17(4):29–34, 1981 (in Russian); pages 236–240 (in English), 1982.

VITA

Atri Rudra was born in Dhanbad, India, which is also where he grew up. He received a Bachelor of Technology in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur in 2000. After spending two years at the IBM India Research Lab in New Delhi, he joined the graduate program at the University of Texas at Austin in 2002.
He moved to the Computer Science and Engineering department at the University of Washington in 2004, where he earned his Master of Science and Doctor of Philosophy degrees in 2005 and 2007, respectively, under the supervision of Venkatesan Guruswami. Beginning September 2007, he will be an Assistant Professor at the University at Buffalo, New York.