Noisy Connections: A Survey of Interactive Coding and its Borders with Other Topics
Allison Bishop Lewko, Columbia University
Featuring works by Schulman, Haeupler, Brakerski, Kalai, Jain, Rao, Vitercik, Dodis, Chung, Pass, Telang

Two-Party Computation with Communication Errors
[Figure: Alice (input x) and Bob (input y) exchange bits; the channel flips one of them.]
*The sender does not know an error occurred, and the rest of the computation is wrong!
We consider a strong adversary who can corrupt a constant fraction of all bits (fixed communication length).

What Makes Interactive Coding Distinct from Error-Correcting Codes?
The interactive coding problem for 2 parties:
• First formulated and studied by Schulman (1992)
• For m rounds of interaction, just using error-correcting codes can only achieve error rate < 1/m
• Goal is a constant relative error rate and constant (multiplicative) overhead in communication

Expressing the Protocol as a Tree
[Figure: protocol tree with alternating levels: at her levels Alice speaks (a function of her input and the transcript so far); at his levels Bob speaks (a function of his input and the transcript so far).]

Execution of the Protocol with No Errors
[Figure: a root-to-leaf path in the protocol tree.]
Path in tree = transcript of communication

Simulating the Protocol Tree Path Under Errors
[Figure: corrupted bits send the two parties down different branches.]
• Errors cause Bob and Alice to have differing views of the simulated transcript
Approach:
1. Provide a mechanism to detect disagreement
2. Provide a mechanism to move back toward agreement
3. Once re-synched, try again to proceed down the protocol tree

Communicating "Pebble" Movements
• Each party has a "pebble" it moves around the protocol tree
• We can use a 4-symbol alphabet, "Down Left", "Down Right", "Back Up", "Hold", to describe pebbles that move along one branch of the tree at a time (or stay put)
• The goal is to communicate the sequence of pebble moves so each party can know where the other party's pebble is
• We want to encode a dynamic string of characters L, R, B, H so that it is decoded properly at moments in time when there are not too many past errors

Encoding Movements via Tree Codes [Schulman 92]
Tree code: edges are labeled by symbols from a constant-size alphabet so that any two paths have a constant fraction of differing symbols from their lowest common ancestor onwards.
[Figure: example tree code with alphabet {1, 2, 3, 4, 5}.]
Example: the strings 1, 2, 5 and 3, 2, 4 differ in 2 out of 3 symbols.

Interactive Coding from Tree Codes
Suppose we have a 4-ary tree code:
• Encode a sequence of moves "L, R, B, H, …" by the labels of the corresponding edges in the tree code (one symbol = one edge down the tree code)
• Decode by finding the path in the tree code of the right length with closest Hamming distance (a toy sketch of this encode/decode step appears after the next slide)
One technicality: we don't want final pebble moves to change the simulated transcript, so we can't hold when we reach the bottom of the protocol tree. We need to pad with dummy layers at the bottom (easy enough to do).

Intuition for Why This Works
• Define a good event as: both parties correctly decode and know where the other party's pebble is.
• When this happens, "progress" is made (either in moving forward or in getting closer to syncing up)
• A bad event is a decoding error. Only a bounded amount of damage is done, as pebbles move only one edge at a time.
[Figure: a decoding error of depth L corresponds to an interval of length cL containing a constant fraction of errors; such bad intervals can cover only a bounded fraction of the time.]
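Aside (not from the slides): a minimal Python sketch of the encode/decode shape just described, where each pebble move contributes one tree-code symbol and decoding picks the path of the right length closest in Hamming distance. The edge labels here are random and purely illustrative; they do not carry the tree-code distance guarantee.

```python
import itertools
import random

# A toy sketch (not from the talk) of tree-code style encoding/decoding of
# pebble moves.  The edge labels are just random, so this does NOT satisfy
# the tree-code distance property; building labels that do is exactly the
# hard part noted on the "bad news" slide below.  A largish alphabet is used
# only so accidental label collisions are rare in this toy; real tree codes
# achieve their guarantee with a small constant-size alphabet.

MOVES = ["L", "R", "B", "H"]        # Down-Left, Down-Right, Back-Up, Hold
ALPHABET = list(range(256))
labels = {}                         # (prefix of moves, next move) -> edge symbol

def edge_label(prefix, move):
    """Lazily assign a symbol to the tree-code edge for `move` after `prefix`."""
    key = (prefix, move)
    if key not in labels:
        labels[key] = random.choice(ALPHABET)
    return labels[key]

def encode(moves):
    """One tree-code symbol per pebble move, read going down the tree code."""
    return [edge_label(tuple(moves[:i]), m) for i, m in enumerate(moves)]

def decode(received):
    """Return the move sequence of the right length whose encoding is closest
    to `received` in Hamming distance (brute force over all paths)."""
    best, best_dist = None, None
    for cand in itertools.product(MOVES, repeat=len(received)):
        dist = sum(a != b for a, b in zip(encode(list(cand)), received))
        if best_dist is None or dist < best_dist:
            best, best_dist = list(cand), dist
    return best

true_moves = ["L", "R", "R", "B", "L", "H"]
word = encode(true_moves)
word[2] = -1                        # the adversary corrupts one symbol
# With a single corruption this almost always recovers the moves; an actual
# tree code would turn that likelihood into a guarantee.
print(decode(word), true_moves)
```

The brute-force search over all paths is acceptable here only because the toy depth is tiny; it is not meant to suggest an efficient decoder.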
Now That You Think Tree Codes are Cool…
Some bad news: we don't know how to efficiently construct them.
Some progress on this: [B12, MS14], but still no unconditional, poly-time deterministic construction.
Randomized constructions are known, but we still want efficient decoding too.

Efficient Solution: (tiny) TCs + Hashing [BK12]
Let's revisit the higher-level approach:
1. Provide a mechanism to detect disagreement
2. Provide a mechanism to move back toward agreement
3. Once re-synched, try again to proceed down the protocol tree
Observation:
- We can build short tree codes by brute force in poly time
- Naïve concatenation: use TC1 for a while, then use TC2, etc.
Problem: we lose the ability to detect/correct errors in the more distant past.
Solution: hash the entire simulated transcript to detect any lingering disagreement.

[BK12] Protocol Overview
[Figure: alternating phases, repeated: a chunk simulated under a short tree code, then a hash check.]
• Divide the original protocol into smallish chunks; use a short tree code within each
• Hash the entire simulated transcript so far + the chunk number to detect disagreement
• Back up when disagreement is found
Note: the hash length must be long enough to avoid collisions whp, and the chunk length should be similar to avoid communication blowup from the hash phases.

Even Simpler Efficient Solution – no TCs! [H14]
Observation: hash collisions aren't really so bad! If they happen at a constant rate, we can handle them like errors.
• We can make the chunks and hashes constant length; now we don't even need short TCs to get a constant error rate with constant communication overhead.
• Independent hash keys are picked each time, so we can use a Chernoff bound to suitably control the overall effect of hash collisions on simulation progress.

Simplest Protocol Overview
[Figure: alternating chunk phases (simulating the original protocol) and hash-check phases, sent in the clear.]
• Divide the original protocol into constant-size chunks
• Hash the entire simulated transcript so far + the chunk number to detect disagreement
• Back up when disagreement is found
Note: the chunk length should be similar to the hash length to avoid communication blowup from the hash phases. Hash collisions happen at a bounded constant frequency whp.
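Aside (my own toy, not the [H14] protocol): a small Python sketch of the chunk / hash-check rhythm from the last two slides. Two parties alternate between simulating a constant-size chunk and comparing short hashes of their entire simulated transcripts, backing up one chunk on a mismatch. The hash exchange is assumed error-free here, and `next_chunk` is a hypothetical stand-in for running the original protocol.

```python
import hashlib
import random

# A toy illustration of the chunk / hash-check rhythm described above.  It is
# NOT the actual [H14] protocol: the hash exchange itself arrives error-free,
# and "the original protocol" is just a deterministic function of the
# transcript so far.  The point is only the control flow: compare hashes,
# extend by one chunk when they agree, back up one chunk when they do not.

CHUNK_LEN = 4        # constant-size chunks
TOTAL_CHUNKS = 10    # length of the error-free protocol, in chunks
HASH_LEN = 4         # short hashes (hex chars); collisions possible but rare

def next_chunk(transcript):
    """Stand-in for running one chunk of the original (noiseless) protocol."""
    return hashlib.sha256("".join(transcript).encode()).hexdigest()[:CHUNK_LEN]

def short_hash(transcript, key):
    """Short keyed hash of the simulated transcript so far."""
    data = (str(key) + "|" + "|".join(transcript)).encode()
    return hashlib.sha256(data).hexdigest()[:HASH_LEN]

rng = random.Random(7)
alice, bob = [], []              # each party's view of the simulated transcript

for round_no in range(200):      # cap rounds so the toy always halts
    if len(alice) >= TOTAL_CHUNKS and alice == bob:
        break
    # Hash-check phase: a fresh key each round, echoing the "independent hash
    # keys" bullet above (here the key is just the round number, for simplicity).
    if short_hash(alice, round_no) == short_hash(bob, round_no):
        # Views look consistent: simulate one more chunk.  Alice sends it, and
        # the adversary may corrupt what Bob records.
        chunk = next_chunk(alice)
        alice.append(chunk)
        bob.append(chunk if rng.random() > 0.2 else "????")
    else:
        # Views disagree: both parties back up one chunk.
        if alice:
            alice.pop()
        if bob:
            bob.pop()

print("in sync:", alice == bob, "- chunks simulated:", len(alice))
```

Because corrupted chunks are caught by the very next hash check in this simplified setting, disagreement never runs deep; in the real protocol, analyzing how far the views can drift (and how hash collisions behave) is where the work lies.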
Applications/Extensions 1: Interactive Coding Meets Cryptography
What happens when we apply interactive coding in situations where we want to preserve more than just correctness and (roughly) communication complexity?
Example: "knowledge preserving" interactive coding [CPT13]
Question: can we ensure that parties don't learn anything more about the other's input than they would learn in the error-free setting?
Answer: No! (at least not with a good error rate).
The main intuition is that errors will force a "back track," so that some unnecessary function of an input will be computed and sent.

IP = PSPACE over Adversarial Channels [DL]
It turns out: correctness and soundness can be preserved over adversarial channel errors!
[Figure: the verifier sends a challenge; the prover responds "What? Channel errors! Let's go back."]
A natural concern: can a cheating prover use the guise of channel errors to avoid answering tough challenges?
Main idea: the verifier can pick fresh randomness after going back; amplification is used to prevent polynomially many tries from helping the prover too much.

Applications/Extensions 2: Multi-party Protocols
Interactive coding for multi-party protocols [RS94, GMS11, JKL15]
• Network of n parties, which can communicate via pairwise channels
• Goal is to compute a joint function of the inputs over these channels
• Many models: synchronous vs. asynchronous, noisy vs. adversarial, etc.
• Many measures: communication complexity, computation, rounds, links, etc.

Basic Idea: Reduce to the 2-Party Case
[Figure: n parties P1, …, P6 connected pairwise.]
Problems:
1. Θ(n²) communication links
2. Need to synchronize

One Approach [JKL15]
[Figure: a star network on parties P1, …, Pn.]
Properties:
• Resilient to a constant/n fraction of adversarial errors
• Constant blowup in communication
• Efficiency preserving (ongoing work)
Assumptions:
1. The protocol is semi-adaptive
2. There exists one party connected to all the rest

High-Level Description
[Figure: parties P1, …, P6 all connected to a central party P*.]
1. All communication goes through P*
2. Make each pairwise protocol (Pi, P*) error resilient:
   • Use a variant of [Schulman93] (inefficient)
   • Use a variant of [Brakerski-K12]? (efficient, but yields O(log n) communication blowup)
   • Use a variant of [Haeupler14] (efficient)
Problem 1: P* needs to synchronize the global protocol.
Problem 2: errors need to be detected fast (after O(n) communication).
Solution 2: after each (global) chunk of n bits, all parties speak.

Passing the Burden of Being P* [LV]
Challenge: P* maintains a view of each pairwise transcript to check consistency; we can't pass these all to a new P* without lots of communication overhead!
Idea: replace Hash(transcript so far) with an iterated hash. Let t1, t2, t3, … be the chunks of the transcript, and compute the hash used to check agreement as Hash(t3, Hash(t2, Hash(t1))).
*Now we can pass short hashes instead of long transcripts to a new P* to maintain the ability to detect prior disagreements (a small code sketch of this chained hash appears at the end of this survey).

3. A More Speculative Connection
Recently, King and Saia [KS13] presented an expected poly-time Byzantine Agreement algorithm against a computationally unbounded, adaptive asynchronous adversary. [LL13] presented an impossibility result for a kind of "mobile" adversary who can corrupt a changing set of players and reset their memories upon releasing them to corrupt others.
Intriguing question: adversarial network channels can be defined to model each of these adversaries, so can we classify a "worst-case" adversarial network against which Byzantine Agreement is possible?

4. Connection Between Formulas and Communication [KW88]
Boolean function f : {0,1}^n → {0,1}
Alice holds x s.t. f(x) = 1; Bob holds y s.t. f(y) = 0. They communicate to find an index i s.t. x_i ≠ y_i.
How many bits need to be sent in the worst case?

Communication Complexity = Formula Depth [KW88]
[Figure: Alice and Bob walk down a formula for f, announcing "Left" or "Right" at each AND/OR gate, until they reach a leaf x_i (here i = 4) with x_i ≠ y_i.]

Carrying Error-Resilience through the Karchmer-Wigderson Connection [KLR12]
[Diagram: we know the [KW88] correspondence between error-free computation and error-free communication; we want error-resilient computation; we build an analogous compiler between error-resilient communication and error-resilient computation.]

Communication with Errors: An Easier Model (Sender Feedback)
[Figure: a bit is flipped; the sender notices ("Oops!") and says "Redo".]
*The sender knows an error occurred.

Short-Circuit Errors
The true output of a gate is replaced by the value of one of its inputs.
[Figure: a formula of AND/OR gates with one gate short-circuited to an input value.]

Recovery from Non-Fatal Short-Circuits
Result: we can allow a fixed constant fraction of errors per path.
Efficiency: formula depth is multiplied by a constant.
Example: allow one error per path.

Some Further Directions
• What other kinds of circuit errors can we correct?
• What kinds of bounds on the size of error-resilient circuits can we prove?
• What other properties of 2-party or multi-party computations can/can't be preserved under channel errors?
• What are the "right" network adversarial models for various applications?
• How can we unify this with distributed computing theory, where correctness is relaxed and not a fixed function of the inputs?
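Closing aside (hypothetical code, not from [LV]): a minimal sketch of the iterated hash from the "Passing the Burden of Being P*" slide above. The coordinator keeps only a chained digest Hash(t3, Hash(t2, Hash(t1))) per pairwise channel, which is short enough to hand to a new P* while preserving the ability to detect disagreement about the transcript prefix.

```python
import hashlib

# A minimal sketch of the iterated ("chained") transcript hash from the
# "Passing the Burden of Being P*" slide.  Names and parameters here are
# illustrative, not taken from [LV].

def chain(prev_digest: bytes, chunk: str) -> bytes:
    """Fold one transcript chunk into the running digest:
    new_digest = Hash(chunk, prev_digest)."""
    return hashlib.sha256(chunk.encode() + prev_digest).digest()

def iterated_hash(chunks):
    """Hash(t_k, ... Hash(t_2, Hash(t_1)) ...), computed left to right."""
    digest = b""
    for t in chunks:
        digest = chain(digest, t)
    return digest

# P* keeps only the short running digest for each pairwise channel...
transcript_at_p_star = ["t1", "t2", "t3"]
digest_at_p_star = iterated_hash(transcript_at_p_star)

# ...and hands that digest (not the full transcript) to the new P*.  The new
# P* keeps folding in future chunks and compares digests with each P_i to
# detect any lingering disagreement about the shared prefix.
digest_at_new_p_star = chain(digest_at_p_star, "t4")
party_view = iterated_hash(["t1", "t2", "t3", "t4"])
print("prefixes consistent:", digest_at_new_p_star == party_view)
```

The design point, as on the slide, is that the chained digest commits to the entire prefix while staying constant size, so the hand-off costs a short message rather than the whole pairwise transcript.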