Some new(ish) results in network information theory
Muriel Médard, RLE, EECS, MIT
Newton Institute Workshop, January 2010

Collaborators
• MIT: David Karger, Anna Lee
• Technical University of Munich: Ralf Koetter† (previously UIUC)
• California Institute of Technology: Michelle Effros, Tracey Ho (previously MIT, UIUC, Lucent)
• Stanford: Andrea Goldsmith, Ivana Maric
• University of South Australia: Desmond Lun (previously MIT, UIUC)

Overview
• New results in network information theory have generally centered around min-cut max-flow theorems because of multicast
  – Network coding
    • Algebraic model
    • Random codes
    • Correlated sources
    • Erasures
    • Errors
  – High-SNR networks
• What happens when we do not have multicast
  – The problem of non-multicast network coding
  – Equivalence

Starting without noise [KM01, 02, 03]
(Slides whose figures and equations did not survive extraction: The meaning of connection · The basic problem · Linear codes · Linear network system · Transfer matrix · An algebraic flow theorem · Multicast)

Multicast theorem
We recover the min-cut max-flow theorem of [ACLY00]

One source, disjoint connections plus multicasts

Avoiding roots
• The effect of the network is that of a transfer matrix from sources to receivers
• To recover symbols at the receivers, we require sufficient degrees of freedom – an invertible matrix in the coefficients of all nodes
• We just need to avoid roots in each of the submatrices corresponding to the individual receivers
• The realization of the determinant of the matrix will be non-zero with high probability if the coefficients are chosen independently and randomly over a large enough field – similar to the Schwartz-Zippel bound [HKMKE03, HMSEK03]
• Random and distributed choice of coefficients, followed by matrix inversion at the receiver
• Is independence crucial?
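As a small illustration of the "avoiding roots" argument, here is a hedged Monte Carlo sketch (the butterfly-style network, the field sizes, and the function name are illustrative choices, not from the slides): the bottleneck node picks random local coefficients a, b over GF(p), and we estimate the probability that both receivers' transfer submatrices are simultaneously invertible. A Schwartz-Zippel-style bound predicts failure probability at most deg/p = 2/p, so success should approach 1 as the field grows.

```python
import random

def butterfly_success_prob(p, trials=20000, seed=0):
    # Butterfly-style network, source symbols (x1, x2) over GF(p).
    # The bottleneck node sends z = a*x1 + b*x2 with random local
    # coefficients a, b. Receiver 1 sees (x1, z): matrix [[1, 0], [a, b]].
    # Receiver 2 sees (x2, z): matrix [[0, 1], [a, b]].
    # Both are invertible iff b != 0 and a != 0 (dets are b and -a mod p).
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        a, b = rng.randrange(p), rng.randrange(p)
        det1 = b % p        # det of receiver 1's transfer matrix
        det2 = (-a) % p     # det of receiver 2's transfer matrix
        if det1 and det2:
            ok += 1
    return ok / trials
```

Over GF(2) the empirical success probability is near 1/4, while over GF(257) it is near (256/257)^2 ≈ 0.992, consistent with the 1 − 2/p style bound.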
Connection to Slepian-Wolf
• Also a question of min-cut, in terms of entropies, for some senders and a receiver
• Here again, a random approach is optimal
• Different approaches: random linear codes, binning, etc.

Extending to a whole network
• Use distributed random linear codes
• Redundancy is removed or added in different parts of the network depending on available capacity
• No knowledge of source entropies is needed at interior network nodes
• For the special case of a single source and depth one, the network coding error exponents reduce to known error exponents for Slepian-Wolf coding [Csi82]

Proof outline
• Sets of joint types
• Sequences of each type
• Cardinality bounds
• Probability of source vectors of certain types
• Extra step to bound the probability of being mapped to zero at any node

Why is this useful? Joint source-network coding
• Sum of rates on virtual links: = R for the joint solution, > R for the separate solution (R = 3)
• Joint solution (cost 9) vs. separate solution (cost 10.5) [Lee et al. 07]
• (Figure: links labeled physical, for receiver 1, for receiver 2)

What next?
• Random coding approaches work well for arbitrary networks with multicast
• Decoding is easy if we have independent or linearly dependent sources; it may be difficult if we have arbitrary correlation
  – need minimum-entropy decoding
  – can add some structure
  – can add some error [CME08]
• What happens when the channels are not perfect?

Erasure reliability – single flow
• End-to-end erasure coding: capacity is (1 − ε_AB)(1 − ε_BC) packets per unit time.
• As two separate channels: capacity is min(1 − ε_AB, 1 − ε_BC) packets per unit time.
  – Can use block erasure coding on each channel, but delay is a problem.
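The two capacity expressions on the erasure slide can be made concrete with a minimal sketch (the erasure probabilities below are arbitrary example values, not from the slides):

```python
def end_to_end_capacity(eps_ab, eps_bc):
    # One erasure code over the whole path: a packet must survive
    # both the A->B link and the B->C link.
    return (1 - eps_ab) * (1 - eps_bc)

def min_cut_capacity(eps_ab, eps_bc):
    # Coding per hop (or network coding at the relay): throughput is
    # limited only by the worse of the two links.
    return min(1 - eps_ab, 1 - eps_bc)

# Example with eps_AB = 0.2, eps_BC = 0.1:
#   end-to-end: 0.8 * 0.9 = 0.72 packets per unit time
#   min-cut:    min(0.8, 0.9) = 0.80 packets per unit time
```

The gap between the two expressions is exactly what the relay's store-and-recode capability recovers.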
Network coding: minimum cut is capacity
• For erasures, correlated or not, we can in the multicast case deal with average flows uniquely [LME04], [LMK05], [DGPHE04]
• Nodes store received packets in memory
• Random linear combinations of memory contents are sent out
• Delay expressions generalize Jackson networks to the innovative packets
• Can be used in a rateless fashion

Scheme for erasure reliability
• We have k message packets w1, w2, . . . , wk (fixed-length vectors over Fq) at the source.
• (Uniformly) random linear combinations of w1, w2, . . . , wk are injected into the source's memory according to a process of rate R0.
• At every node, (uniformly) random linear combinations of memory contents are sent out;
  – received packets are stored into memory;
  – every packet carries a length-k vector over Fq representing the transformation it is of w1, w2, . . . , wk (the global encoding vector), sent in the packet overhead, so no side information about delays is needed.

Outline of proof
• Keep track of the propagation of innovative packets: packets whose auxiliary encoding vectors (transformation with respect to the n packets injected into the source's memory) are linearly independent across particular cuts.
• Can show that, if R0 is less than capacity and the input process is Poisson, then the propagation of innovative packets through any node forms a stable M/M/1 queueing system in steady state.
  – We obtain delay expressions using, in effect, a generalization of Jackson networks for innovative packets
• We may also obtain error exponents over the network
• Erasures can be handled readily, and we have a fairly good characterization of the network-wide behavior

The case with errors
• There is separation between network coding and channel coding when the links are point-to-point [SYC06], [Bor02]
• Both papers address multicast, in which case the tightness of the min-cut max-flow bounds can be exploited and random approaches work
• The deleterious effects of the channel are not crucial for capacity results, but we may not have a handle on error exponents and other detailed aspects of behavior
• Is the point-to-point assumption crucial?

What happens when we do not have point-to-point links?
• Consider the case where interference is the dominating factor (high SNR)
• Recently considered in the high-gain regime [ADT08, 09]
• In the high-SNR regime (different from high gain in networks because of noise amplification), analog network coding is near-optimal [MGM10]
• In all of these cases, min-cut max-flow holds and random arguments also hold

(Figure slides: Model as additive channels over a finite field · Model as error-free links · Multiple access region)

Beyond multicast
• Solutions are no longer easy, even in the scalar linear case
• Linearity may not suffice [DFZ04]: examples from matroids
• It no longer suffices to avoid roots
• Many problems regarding solvability remain open – achievability approaches will not work

Linear general case

The non-multicast case
• Even for linear systems, we do not know how to perform network coding over error-free networks
• Random schemes will not work
• What can we say about systems with errors for point-to-point links?
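Returning to the erasure-reliability scheme from the earlier slides, here is a toy simulation of it on a two-hop tandem network (a sketch under stated assumptions: GF(2) instead of a larger Fq for simplicity, illustrative erasure probabilities, and per-slot rather than Poisson injections). The source sends random combinations of its k message packets, the relay stores what it receives and recodes from memory, and the sink decodes by Gaussian elimination on the global encoding vectors once it holds k linearly independent packets.

```python
import random

def xor_vec(a, b):
    # Addition over GF(2) is bitwise XOR.
    return [x ^ y for x, y in zip(a, b)]

def decode_gf2(received, k):
    """Gaussian elimination over GF(2).
    received: list of (global encoding vector, payload) pairs.
    Returns the k message packets, or None if rank < k."""
    rows = [(c[:], p[:]) for c, p in received]
    reduced = []
    for col in range(k):
        pivot = None
        for i, (c, _) in enumerate(rows):
            if c[col]:
                pivot = rows.pop(i)
                break
        if pivot is None:
            return None  # not yet full rank across this cut
        for i, (c, p) in enumerate(rows):
            if c[col]:
                rows[i] = (xor_vec(c, pivot[0]), xor_vec(p, pivot[1]))
        reduced.append(pivot)
    for i in range(k - 1, -1, -1):  # back-substitution
        for j in range(i):
            if reduced[j][0][i]:
                reduced[j] = (xor_vec(reduced[j][0], reduced[i][0]),
                              xor_vec(reduced[j][1], reduced[i][1]))
    return [p for _, p in reduced]

def rlnc_two_hop(k=8, plen=16, eps=(0.2, 0.3), seed=7):
    """Source -> relay -> sink, each link an independent erasure channel.
    The relay stores received packets and sends random GF(2)
    recombinations of its memory; no feedback or scheduling is needed."""
    rng = random.Random(seed)
    msgs = [[rng.randint(0, 1) for _ in range(plen)] for _ in range(k)]

    def random_combo(memory):
        c_out, p_out = [0] * k, [0] * plen
        for c, p in memory:
            if rng.randint(0, 1):
                c_out, p_out = xor_vec(c_out, c), xor_vec(p_out, p)
        return (c_out, p_out)

    # Source memory: the messages with unit global encoding vectors.
    source = [([1 if j == i else 0 for j in range(k)], msgs[i][:])
              for i in range(k)]
    relay, sink = [], []
    while True:
        pkt = random_combo(source)
        if rng.random() > eps[0]:            # survives source -> relay
            relay.append(pkt)
        if relay and rng.random() > eps[1]:  # relay recodes toward sink
            sink.append(random_combo(relay))
        decoded = decode_gf2(sink, k)
        if decoded is not None:
            return decoded == msgs
```

Because the global encoding vector travels in each packet's overhead, the sink needs no side information about which packets were erased or when; it simply accumulates degrees of freedom, in the rateless spirit of the slides.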
• The best we can hope for is an equivalence between error-free systems and systems with errors
  – Random schemes handle errors locally
  – The global problem of non-multicast network coding is inherently combinatorial and difficult

Beyond multicast
All channels have capacity 1 bit/unit time.
Are these two networks essentially the same?
Intuitively, since the "noise" is uncorrelated with any other random variable, it cannot help.

Decoding is restrictive
The characteristic of a noisy link vs. a capacitated bit pipe: a noisy channel allows for a larger set of strategies than a pipe.

Equivalence: dual to Shannon theory
Shannon theory: a noisy channel Y = X + N supports throughput C = I(X; Y).
Dual to Shannon theory: by emulating noisy channels as noiseless channels with the same link capacity, we can apply existing tools for noiseless channels (e.g., network coding) to obtain new results for networks with noisy links. This provides a new method for finding network capacity.

Network equivalence
• We consider a factorization of networks into hyperedges
• We do not provide solutions to networks, since we are dealing with inherently combinatorial and intractable problems
• We instead consider mimicking the behavior of noisy links over error-free links
• We can generate equivalence classes among networks, subsuming all possible coding schemes [KEM09]
  – R_noiseless ⊆ R_noisy is easy, since the maximum rate on the noiseless channels equals the capacity of the noisy links: one can transmit at the same rates on both.
  – R_noisy ⊆ R_noiseless is hard, since one must show that the capacity region is not increased by transmitting over links at rates above the noisy link capacity. We prove this using the theory of types.
• Metrics other than capacity may not be the same for both networks (e.g.,
error exponents), particularly because of feedback
• For links other than point-to-point, we can generate bounds
• These may be distortion-based rather than capacity-based, unlike the small examples we have customarily considered

Conclusions
• Random approaches stand us in good stead for multicast systems under a variety of settings (general networks, correlated sources, erasures, errors, interference…)
• For more general network connections, the difficulty arises even in the absence of channel complications; we may have equivalences between networks
• Can we combine the two to figure out how to break up networks in a reasonable way to allow useful modeling?