A Comprehensive View of Duality in Multiuser Source Coding and Channel Coding

S. Sandeep Pradhan, University of Michigan, Ann Arbor
Joint work with K. Ramchandran, University of California, Berkeley

Acknowledgements: Jim Chou (Univ. of California), Phillip Chou (Microsoft Research), David Tse (Univ. of California), Pramod Viswanath (Univ. of Illinois), Michael Gastpar (Univ. of California), Prakash Ishwar (Univ. of California), Martin Vetterli (EPFL)

Outline
• Motivation, related work, and background
• Duality between source coding and channel coding
  – Role of the source distortion measure and the channel cost measure
• Extension to the case of side information
• MIMO source coding and channel coding with one-sided collaboration
• Future work: extensions to multiuser joint source-channel coding
• Conclusions

Motivation
• Expanding applications of MIMO source and channel coding
• Explore a unifying thread through these diverse problems
• We consider source coding with side information (SCSI) and channel coding with side information (CCSI) as functional duals
• We consider two pairs of functional duals:
  1. Distributed source coding ↔ 2. Broadcast channel coding
  3. Multiple description source coding ↔ 4. Multiple access channel coding

It all starts with Shannon
"There is a curious and provocative duality between the properties of a source with a distortion measure and those of a channel. This duality is enhanced if we consider channels in which there is a "cost" associated with the different input letters, and it is desired to find the capacity subject to the constraint that the expected cost not exceed a certain quantity..."

Related work (incomplete list)
• Duality between source coding and channel coding:
  – Shannon (1959)
  – Csiszár and Körner (textbook, 1981)
  – Cover and Thomas (textbook, 1991): covering vs. packing
  – Eyuboglu and Forney (1993): quantizing vs. modulation; boundary/granular gains vs.
shaping/coding gains
  – Laroia, Farvardin and Tretter (1994): SVQ versus shell mapping
• Duality between source coding with side information (SCSI) and channel coding with side information (CCSI):
  – Chou, Pradhan and Ramchandran (1999)
  – Barron, Wornell and Chen (2000)
  – Su, Eggers and Girod (2000)
  – Cover and Chiang (2001)

Notation: source coding
[diagram: X → Encoder → Decoder → X̂]
• Source alphabet X with distribution p(x); reconstruction alphabet X̂
• Distortion measure d(x, x̂) on X × X̂; distortion constraint D: E d(X, X̂) ≤ D
• Encoder: X^L → {1, 2, ..., 2^{LR}}; decoder: {1, 2, ..., 2^{LR}} → X̂^L
• Minimum rate of representing X with distortion D is the rate-distortion function
  R(D) = min over p(x̂|x) with E d(X, X̂) ≤ D of I(X; X̂)

Notation: channel coding
[diagram: m → Encoder → X̂ → Channel → X → Decoder → m̂]
• Input and output alphabets X̂ and X; conditional distribution p(x|x̂)
• Cost measure w(x̂) on X̂; cost constraint W: E w(X̂) ≤ W
• Encoder: {1, 2, ..., 2^{LR}} → X̂^L; decoder: X^L → {1, 2, ..., 2^{LR}}
• Maximum rate of communication with cost W is the capacity-cost function
  C(W) = max over p(x̂) with E w(X̂) ≤ W of I(X; X̂)

• Note: the source encoder and the channel decoder are mappings with the same domain and range; similarly, the channel encoder and the source decoder share a domain and range.

Inspiration for the cost-measure/distortion-measure analysis
Gastpar, Rimoldi and Vetterli '00: "To code or not to code?"
[diagram: S → Encoder f(·) → X → Channel p(y|x) → Y → Decoder g(·) → Ŝ]
• Source p(s); channel p(y|x)
• For a given pair p(s) and p(y|x), there exist a distortion measure d(s, ŝ) and a cost measure w(x) such that uncoded mappings at the encoder and decoder are optimal in terms of end-to-end achievable performance.
• Bottom line: any source can be "matched" optimally to any channel if you are allowed to pick the distortion and cost measures for the source and channel.

Role of distortion measures (Fact 1)
[diagram: X ~ p(x) → Quantizer p(x̂|x) → X̂]
• Given a source p(x), let p(x̂|x) be an arbitrary quantizer.
• Then there exists a distortion measure d(x, x̂) such that
  p(x̂|x) = argmin over p'(x̂|x) with X ~ p(x), E d(X, X̂) ≤ D of I(X; X̂),
  namely d(x, x̂) = −c log p'(x|x̂) + d₀(x), where p'(x|x̂) is the backward channel induced by the source and the quantizer, c > 0, and d₀(x) is an arbitrary function of x.
• Bottom line: any given quantizer p(x̂|x) is the optimal quantizer for any source p(x), provided you are allowed to pick the distortion measure.

Role of cost measures (Fact 2)
[diagram: X̂ ~ p'(x̂) → Channel p(x|x̂) → X]
• Given a channel p(x|x̂), let p'(x̂) be an arbitrary input distribution.
• Then there exists a cost measure w(x̂) such that
  p'(x̂) = argmax over p(x̂) with (X|X̂) ~ p(x|x̂), E w(X̂) ≤ W of I(X; X̂),
  namely w(x̂) = c D( p(x|x̂) ‖ p'(x) ), where p'(x) is the output distribution induced by p'(x̂) and c > 0.
• Bottom line: any given input distribution p'(x̂) is the optimal input for any channel p(x|x̂), provided you are allowed to pick the cost measure.

Now we are ready to characterize duality.

Duality between classical source coding and channel coding
[diagram: X ~ p(x) → optimal quantizer p*(x̂|x) → X̂ ~ p*(x̂)]
Theorem 1a: For a given source coding problem with source p(x), distortion measure d(x, x̂), and distortion constraint D, let the optimal quantizer be
  p*(x̂|x) = argmin over p(x̂|x) with X ~ p(x), E d(X, X̂) ≤ D of I(X; X̂),
inducing (via Bayes' rule) the distributions
  p*(x|x̂) = p(x) p*(x̂|x) / Σ_x p(x) p*(x̂|x)  and  p*(x̂) = Σ_x p(x) p*(x̂|x).
REVERSAL OF ORDER: run the backward channel p*(x|x̂) from X̂ to X.
[diagram: X̂ ~ p*(x̂) → channel p*(x|x̂) → X ~ p(x)]
Then there exists a unique dual channel coding problem with channel p*(x|x̂), input alphabet X̂, output alphabet X, cost measure w(x̂), and cost constraint W, such that:
 (i) R(D) = C(W);
 (ii) p*(x̂) = argmax over p(x̂) with (X|X̂) ~ p*(x|x̂), E w(X̂) ≤ W of I(X; X̂),
where w(x̂) = c₁ D( p*(x|x̂) ‖ p(x) ) and W = E under p*(x̂) of w(X̂).
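Theorem 1a and Fact 2 can be checked numerically on a small discrete example. The sketch below is illustrative only: the Bernoulli(0.7) source, Hamming distortion, and slope parameter s = 2 are assumptions, and the constant c₁ is set to 1. It computes one point of R(D) by Blahut-Arimoto iteration, forms the dual channel p*(x|x̂) and the cost measure w(x̂) = D(p*(x|x̂) ‖ p(x)), and confirms by grid search over binary inputs that p*(x̂) maximizes I(X; X̂) subject to E w(X̂) ≤ W, with the maximum equal to R(D).

```python
import numpy as np

def blahut_arimoto_rd(p_x, d, s, iters=2000):
    """One point on the R(D) curve for source p_x, distortion matrix d, slope s."""
    q = np.full(d.shape[1], 1.0 / d.shape[1])      # output marginal p(xhat)
    for _ in range(iters):
        a = q[None, :] * np.exp(-s * d)            # unnormalized test channel
        p_cond = a / a.sum(axis=1, keepdims=True)  # p(xhat | x)
        q = p_x @ p_cond
    D = float(np.sum(p_x[:, None] * p_cond * d))
    R = float(np.sum(p_x[:, None] * p_cond * np.log2(p_cond / q[None, :])))
    return R, D, p_cond, q

def mutual_info(p_in, chan):
    """I(input; output) in bits for input p_in and row-stochastic channel chan."""
    out = p_in @ chan
    return sum(p_in[i] * np.sum(chan[i] * np.log2(chan[i] / out))
               for i in range(len(p_in)) if p_in[i] > 0)

p_x = np.array([0.7, 0.3])                         # assumed source distribution
d = np.array([[0.0, 1.0], [1.0, 0.0]])             # Hamming distortion
R, D, p_cond, q_star = blahut_arimoto_rd(p_x, d, s=2.0)

# Dual channel: backward conditional p*(x | xhat), rows indexed by xhat.
joint = p_x[:, None] * p_cond
chan = (joint / q_star[None, :]).T
# Cost measure w(xhat) = D(p*(x|xhat) || p(x)); cost constraint W = E w(Xhat).
w = np.array([np.sum(chan[j] * np.log2(chan[j] / p_x)) for j in range(2)])
W = float(q_star @ w)

# Grid search over binary input distributions (1-t, t) subject to E w <= W.
best_I, best_t = -1.0, None
for t in np.linspace(0.0, 1.0, 2001):
    p_in = np.array([1.0 - t, t])
    if p_in @ w <= W + 1e-9:
        I = mutual_info(p_in, chan)
        if I > best_I:
            best_I, best_t = I, t

print(R, D, best_I, best_t, q_star[1])   # best_I ~ R, best_t ~ q_star[1]
```

The grid search recovers p*(x̂) as the constrained capacity-achieving input, as Theorem 1a(ii) predicts; the Lagrange multiplier implicit in w makes the cost constraint active exactly at p*(x̂).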
Interpretation of functional duality
For a given source coding problem, we can associate a specific channel coding problem such that:
• both problems induce the same optimal joint distribution p*(x, x̂);
• the optimal encoder for one is functionally identical to the optimal decoder for the other in the limit of large block length;
• an appropriate channel cost measure is associated.
Source coding: the distortion measure is as important as the source distribution.
Channel coding: the cost measure is as important as the channel conditional distribution.

Source coding with side information
[diagram: X → Encoder → Decoder → X̂, with side information S at the decoder]
• The encoder needs to compress the source X.
• The decoder has access to correlated side information S.
• Studied by Slepian and Wolf '73, Wyner and Ziv '76, Berger '77.
• Applications: sensor networks, digital upgrade, diversity coding for packet networks.

Channel coding with side information
[diagram: m → Encoder → X̂ → Channel → X → Decoder → m̂, with side information S at the encoder]
• The encoder has access to some information S related to the statistical nature of the channel.
• The encoder wishes to communicate over this cost-constrained channel.
• Studied by Gelfand and Pinsker '81, Costa '83, Heegard and El Gamal '85.
• Applications: watermarking, data hiding, precoding for known interference, multi-antenna broadcast channels.

Duality (loose sense)
• CCSI: side information at the encoder only; the channel code is "partitioned" into a bank of source codes.
• SCSI: side information
at the decoder only; the source code is "partitioned" into a bank of channel codes.

Source coding with side information at the decoder (SCSI): Wyner-Ziv '76
[diagram: X → Encoder → U → Decoder (with S) → X̂]
• Conditional source p(x|s); side information p(s); context-dependent distortion measure d(x, x̂, s)
• Encoder f: X^L → {1, 2, ..., 2^{RL}}; decoder g: {1, 2, ..., 2^{RL}} × S^L → X̂^L
• Rate-distortion function:
  R*(D) = min over p(u|x), p(x̂|u, s) of [ I(X; U) − I(S; U) ]
  such that (S − X − U), (X − (U, S) − X̂), and E d_S(X, X̂) ≤ D.
• Intuition (natural Markov chains):
  – (S − X − U): the side information S is not present at the encoder.
  – (X − (U, S) − X̂): the source X is not present at the decoder.
• Note: p*(x, s, x̂, u) = p(s) p(x|s) p*(u|x) p*(x̂|s, u) completely determines the optimal joint distribution.

SCSI: Gaussian example (reconstruction of X − S)
• Conditional source: X = S + V with p(v) ~ N(0, N)
• Side information: p(s) ~ N(0, Q)
• Distortion measure: d_S(x, x̂) = ((x − s) − x̂)², i.e., mean-squared-error reconstruction of x − s, with E d_S(X, X̂) ≤ D
[diagram: optimal test channel: p*(u|x) adds Gaussian noise q to X (gain (N − D)/N); the decoder p*(x̂|u, s) is the MMSE estimator from (U, S); the backward channel is p*(x|x̂, s): X = X̂ + S + Z]

Channel coding with side information at the encoder (CCSI): Gelfand-Pinsker '81
[diagram: Encoder (with S) → X̂ → Channel p(x|x̂, s) → X → Decoder → U]
• Conditional channel p(x|x̂, s); side information p(s); cost measure w(x̂, s)
• Encoder f: {1, 2, ..., 2^{RL}} × S^L → X̂^L; decoder g: X^L → {1, 2, ..., 2^{RL}}
• Capacity-cost function:
  C*(W) = max over p(u|s), p(x̂|u, s) of [ I(X; U) − I(S; U) ]
  such that (X − (X̂, S) − U), (X − (U, S) − X̂), and E w_S(X̂) ≤ W.
• Intuition (natural Markov chains):
  – (X − (X̂, S) − U): the channel does not care about U.
  – (X − (U, S) − X̂): the encoder does not have access to X.
• Note: p*(x, s, x̂, u) = p(s) p*(u|s) p*(x̂|s, u) p(x|x̂, s) completely determines the optimal joint distribution.

CCSI: Gaussian example (known interference), Costa '83
• Conditional channel: X = X̂ + S + Z with p(z) ~ N(0, D)
• Side information: p(s) ~ N(0, Q)
• Cost measure: w_S(x̂) = x̂² (a power constraint on X̂), with cost constraint
E w(X̂) ≤ W.
[diagram: Costa's scheme: the encoder p*(x̂|u, s) acts as an MMSE precoder; the channel is X = X̂ + S + Z with Z ~ N(0, D); the induced test channel p*(u|x) has the same gain (N − D)/N as in the SCSI example]

[diagram: side-by-side comparison of CCSI and SCSI: both are built from the same three blocks p*(u|x), p*(x̂|u, s), and p*(x|x̂, s); in SCSI they appear as the induced test channel, encoder, and decoder, while in CCSI the roles of encoder and decoder are reversed]

Theorem 2a: Given p(x|s), p(s), d_S(x, x̂), and D, find the optimal p*(u|x) and p*(x̂|u, s) minimizing [ I(X; U) − I(S; U) ], inducing
  p*(x, s, x̂, u) = p(s) p(x|s) p*(u|x) p*(x̂|s, u),
and obtain p*(x|x̂, s) by Bayes' rule. If the Markov chain (U − (X̂, S) − X) is satisfied (the natural CCSI constraint), then there exists a dual CCSI problem with channel p*(x|x̂, s), side information p(s), cost measure w_S(x̂), and cost constraint W, such that:
 (i) the rate-distortion bound R*(D) equals the capacity-cost bound C*(W);
 (ii) p*(u|s) and p*(x̂|s, u) achieve the capacity-cost optimality C*(W);
 (iii) w_S(x̂) = c₁ D( p*(x|x̂, s) ‖ p(x|s) ) + λ(s) for some function λ of s, with W = E over p(s) p*(x̂|s) of w_S(X̂).

Markov chains and duality
  SCSI: (S − X − U) and (X − (U, S) − X̂)
  CCSI: (X − (X̂, S) − U) and (X − (U, S) − X̂)
Both sides share the same optimal joint distribution p(s, x, u, x̂).
[diagram: the SCSI encoder/decoder pair and the CCSI encoder/decoder pair, with the roles of encoder and decoder swapped]

Duality implication: generalization of the Wyner-Ziv no-rate-loss case
• CCSI (Cohen and Lapidoth 2000; Erez, Shamai and Zamir 2000): extension of Costa's result for X = X̂ + S + Z to arbitrary S with no rate loss.
[diagram: U → Encoder → X̂ → channel adding S and Z → X → Decoder → U]
• New result: Wyner-Ziv's no-rate-loss result can be extended to arbitrary source and side information as long as X = S + V, where V is Gaussian, for the MSE distortion measure.
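The Gaussian SCSI example above can be verified with a few lines of covariance algebra. The sketch below is illustrative (the values N = 1, Q = 2, D = 0.25 are assumptions): it builds the Wyner-Ziv test channel U = X + e with noise variance ND/(N − D), evaluates R = I(X; U) − I(S; U) from Gaussian mutual informations, and checks that it matches the known closed form ½ log₂(N/D) while the MMSE estimate of X − S from (U, S) meets the distortion target D.

```python
import numpy as np

# Illustrative parameters (assumptions): X = S + V with V ~ N(0, N_v),
# S ~ N(0, Q), target distortion D on reconstructing X - S = V.
N_v, Q, D = 1.0, 2.0, 0.25
sigma2 = N_v * D / (N_v - D)          # test-channel noise variance, U = X + e

def gauss_mi(cov, idx_a, idx_b):
    """I(A; B) in bits for jointly Gaussian variables with covariance cov."""
    det_a = np.linalg.det(cov[np.ix_(idx_a, idx_a)])
    det_b = np.linalg.det(cov[np.ix_(idx_b, idx_b)])
    det_ab = np.linalg.det(cov[np.ix_(idx_a + idx_b, idx_a + idx_b)])
    return 0.5 * np.log2(det_a * det_b / det_ab)

# Covariance of (S, X, U) with X = S + V and U = X + e, all noises independent.
cov = np.array([
    [Q, Q,       Q               ],
    [Q, Q + N_v, Q + N_v         ],
    [Q, Q + N_v, Q + N_v + sigma2],
])

rate = gauss_mi(cov, [1], [2]) - gauss_mi(cov, [0], [2])  # I(X;U) - I(S;U)

# MMSE of V = X - S given (S, U): the residual variance should equal D.
c_v_su = np.array([0.0, N_v])                  # cov(V, [S, U])
sigma_su = cov[np.ix_([0, 2], [0, 2])]
mmse = N_v - c_v_su @ np.linalg.solve(sigma_su, c_v_su)

print(rate, 0.5 * np.log2(N_v / D), mmse)
```

With these numbers the rate evaluates to ½ log₂(N/D) = 1 bit, and the residual distortion equals D, consistent with the slide's claim that the test channel with gain (N − D)/N plus an MMSE decoder is optimal.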
[diagram: the corresponding SCSI system: X → Encoder → U → Decoder (with S) → X̂]

Functional duality in MIMO source and channel coding with one-sided collaboration
• For ease of illustration, we consider a 2-input, 2-output system.
• We consider only the sum rate and a single distortion/cost measure.
• We consider functional duality in the distributional sense.
• Future and ongoing work: duality in the coding sense.

MIMO source coding with one-sided collaboration
[diagram: X₁ → Encoder-1 → M₁ → Decoder-1 → X̂₁; X₂ → Encoder-2 → M₂ → Decoder-2 → X̂₂; a test channel maps (X̂₁, X̂₂) back to (X₁, X₂)]
• Either the encoders or the decoders (but not both) collaborate.

MIMO channel coding with one-sided collaboration
[diagram: M₁ → Encoder-1 → X̂₁ and M₂ → Encoder-2 → X̂₂ → Channel → X₁ → Decoder-1 → M₁ and X₂ → Decoder-2 → M₂]
• Either the encoders or the decoders (but not both) collaborate.

Distributed source coding
[diagram: the MIMO source coding system with collaborating decoders]
• Two correlated sources with a given joint distribution p(x₁, x₂) and a joint distortion measure d(x₁, x₂, x̂₁, x̂₂).
• The encoders DO NOT collaborate; the decoders DO collaborate.
• Problem: for a given joint distortion D, find the minimum sum rate R.
• Achievable rate region: Berger '77.

Distributed source coding: achievable sum-rate region
  R_DS(D) = min [ I(X₁; U₁) + I(X₂; U₂) − I(U₁; U₂) ]
  such that (U₁ − X₁ − X₂ − U₂), (X₁X₂ − U₁U₂ − X̂₁X̂₂), and E[d] ≤ D.
 1. The two encoders cannot see each other's source.
 2. The decoder cannot see the sources.

Broadcast channel coding
[diagram: the MIMO channel coding system with collaborating encoders]
• Broadcast channel with a given conditional distribution p(x₁, x₂ | x̂₁, x̂₂) and a joint cost measure w(x̂₁, x̂₂).
• The encoders DO collaborate; the decoders DO NOT collaborate.
• Problem: for a given joint cost W, find the maximum sum rate R.
• Achievable rate region: Marton '79.

Broadcast channel coding: achievable sum-rate region
  R_BC(W) = max [ I(X₁; U₁) + I(X₂; U₂) − I(U₁; U₂) ]
  such that (U₁U₂ − X̂₁X̂₂ − X₁X₂), (X₁X₂ − U₁U₂ − X̂₁X̂₂), and E[w] ≤ W.
 1. The channel only cares about its input. 2.
The encoder does not have the channel output.

Duality (loose sense) between distributed source coding and broadcast channel coding
• Distributed source coding: collaboration at the decoder only; uses Wyner-Ziv coding, in which the source code is "partitioned" into a bank of channel codes.
• Broadcast channel coding: collaboration at the encoder only; uses Gelfand-Pinsker coding, in which the channel code is "partitioned" into a bank of source codes.

Theorem 3a: duality of the Markov structure on p(x₁, x₂, u₁, u₂, x̂₁, x̂₂)
• Distributed source coding: (U₁ − X₁ − X₂ − U₂) and (X₁X₂ − U₁U₂ − X̂₁X̂₂)
• Broadcast channel coding: (U₁U₂ − X̂₁X̂₂ − X₁X₂) and (X₁X₂ − U₁U₂ − X̂₁X̂₂)

Example: 2-in, 2-out Gaussian linear channel (Caire, Shamai, Yu, Cioffi, Viswanath, Tse)
[diagram: (X̂₁, X̂₂) → H → add noise (N₁, N₂) → (X₁, X₂)]
• Sum-power cost: w(x̂₁, x̂₂) = x̂₁² + x̂₂²
• Marton's sum rate is shown to be tight.
• Using Sato's bound, the capacity of the broadcast channel depends only on the marginal channels.
• For the optimal input distribution, if we keep the noise variances fixed and change the noise correlation, at one point we get (U₁ − X₁ − X₂ − U₂); this noise is also called the worst-case noise. At that point we have duality!
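The worst-case-noise idea can be illustrated numerically. The sketch below is only a sketch under stated assumptions: the channel matrix H, unit noise variances, power P = 1, and the input covariance held fixed at P·I are all illustrative choices (the full Sato bound would also optimize the input covariance). It sweeps the noise correlation ρ while keeping the noise variances fixed, and shows that the cooperative upper bound ½ log₂ det(I + Σ(ρ)⁻¹ H Q Hᵀ) attains its minimum at an interior worst-case correlation, even though every ρ gives the same marginal channels.

```python
import numpy as np

H = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative channel matrix
P = 1.0                                   # illustrative per-antenna power
Q = P * np.eye(2)                         # input covariance (held fixed here)

def coop_bound(rho):
    """Cooperative (Sato-style) sum rate with noise correlation rho."""
    sigma = np.array([[1.0, rho], [rho, 1.0]])   # unit variances, correlation rho
    m = np.eye(2) + np.linalg.solve(sigma, H @ Q @ H.T)
    return 0.5 * np.log2(np.linalg.det(m))

rhos = np.linspace(-0.95, 0.95, 3801)
rates = np.array([coop_bound(r) for r in rhos])
i_min = int(np.argmin(rates))
rho_star, c_min = rhos[i_min], rates[i_min]

# Changing rho never changes the marginal channels, yet the cooperative
# bound varies; the minimizing rho_star is the worst-case noise.
print(rho_star, c_min, coop_bound(0.0))
```

For this H the minimum lands near ρ ≈ 0.37, strictly below the independent-noise bound, matching the slide's point that tightening Sato's bound means searching over noise correlations with the marginals fixed.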
Multiple access channel coding with independent message sets
[diagram: M₁ → Encoder-1 → X̂₁ and M₂ → Encoder-2 → X̂₂ → Channel → (X₁, X₂) → collaborating decoders → (M₁, M₂)]
• Multiple access channel with a given conditional distribution p(x₁, x₂ | x̂₁, x̂₂) and a joint cost measure w(x̂₁, x̂₂).
• The encoders DO NOT collaborate; the decoders DO collaborate.
• Problem: for a given joint cost W, find the maximum sum rate R.
• Capacity-cost function (Ahlswede '71):
  C_MA(W) = max I(X₁X₂; X̂₁X̂₂) such that X̂₁, X̂₂ are independent and E[w] ≤ W.

Multiple description source coding problem
[diagram: X → Encoder → (M₁, M₂); Decoder-1 uses M₁ to produce X̂₁; Decoder-2 uses M₂ to produce X̂₂; Decoder-0 uses both to produce X̂₀]
Another version with essentially the same coding techniques, which is "amenable" to duality:
[diagram: the same system, with Decoder-0 placed after Decoder-1 and Decoder-2]

"Multiple description source coding with no excess sum rate"
[diagram: the two-encoder, two-decoder system with its test channel]
• Two correlated sources with a given joint distribution p(x₁, x₂) and a joint distortion measure d(x₁, x₂, x̂₁, x̂₂).
• The encoders DO collaborate; the decoders DO NOT collaborate.
• Problem: for a given joint distortion D, find the minimum sum rate R.
• Rate-distortion region (Ahlswede '85):
  R_MD(D) = min I(X₁X₂; X̂₁X̂₂) such that X̂₁, X̂₂ are independent and E[d] ≤ D.

Duality (loose sense) between multiple description coding and multiple access channel coding
• MD coding with no excess sum rate: collaboration at the encoder only; uses a successive refinement strategy.
• MAC with independent message sets: collaboration at the decoder only; uses a successive cancellation strategy.

Theorem 4a: For multiple description coding with no excess sum rate, given p(x₁, x₂), d(x₁, x₂, x̂₁, x̂₂), D, source alphabets X₁, X₂, and reconstruction alphabets X̂₁, X̂₂, find the optimal conditional distribution p*(x̂₁, x̂₂ | x₁, x₂), inducing p*(x̂₁, x̂₂) and p*(x₁, x₂ | x̂₁, x̂₂). Then there exists a multiple access channel with channel distribution p*(x₁, x₂ | x̂₁, x̂₂), input alphabets X̂₁, X̂₂, and output alphabets X₁,
X₂, together with a joint cost measure w(x̂₁, x̂₂), such that:
 1) the sum-rate-distortion bound R_MD(D) = min I(X₁X₂; X̂₁X̂₂) equals the sum capacity-cost bound C_MA(W) = max I(X₁X₂; X̂₁X̂₂);
 2) p*(x̂₁, x̂₂) and p*(x₁, x₂ | x̂₁, x̂₂) achieve optimality for this MA channel coding problem;
 3) the joint cost measure is w(x̂₁, x̂₂) = c₁ D( p*(x₁, x₂ | x̂₁, x̂₂) ‖ p(x₁, x₂) ).
Similarly, for a given MA channel coding problem with independent message sets, there exists a dual MD source coding problem with no excess sum rate.

Example: Given a Gaussian MA channel X = H X̂ + N with Cov(N) = I and sum-power cost w(x̂₁, x̂₂) = x̂₁² + x̂₂² with W = 2P:
[diagram: (X̂₁, X̂₂) → H → add (N₁, N₂) → (X₁, X₂) → Decoder]
• The sum-capacity optimization gives Cov(X̂) = P·I, hence Cov(X) = P(H Hᵀ) + I and
  C_MA(2P) = ½ log₂ det( I + P H Hᵀ ).
• Dual MD coding problem: source X Gaussian with Cov(X) = P(H Hᵀ) + I and quadratic distortion d(x, x̂) = (x − H x̂)ᵀ (x − H x̂).
[diagram: the dual test channel built from the same blocks, with an auxiliary matrix A and noises Z₁, Z₂]

What is addressed in this work:
• Duality in empirical per-letter distributions
• Extension of the Wyner-Ziv no-rate-loss result to more arbitrary cases
• The underlying connection between four multiuser communication problems
What is left to be addressed:
• Duality in optimal source codes and channel codes
• Rate loss in dual problems
• Joint source-channel coding in dual problems

Conclusions
• Distributional relationship between MIMO source coding and channel coding
• Functional characterization: swappable encoder and decoder codebooks
• Highlighted the importance of source distortion and channel cost measures
• Cross-leveraging of advances in the applications of these fields
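As a numerical footnote to the Gaussian MAC example above, the sketch below (the symmetric H and power P are illustrative assumptions, not values from the talk) checks that among independent Gaussian inputs with covariance diag(p₁, 2P − p₁), the symmetric split p₁ = P maximizes the sum rate ½ log₂ det(I + H diag(p) Hᵀ), matching C_MA(2P) = ½ log₂ det(I + P H Hᵀ).

```python
import numpy as np

H = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative symmetric channel
P = 1.0                                   # per-user power; sum power 2P

def sum_rate(p1):
    """Sum rate with independent Gaussian inputs of powers (p1, 2P - p1)."""
    q = np.diag([p1, 2.0 * P - p1])
    return 0.5 * np.log2(np.linalg.det(np.eye(2) + H @ q @ H.T))

# Sweep the power split; concavity plus symmetry put the optimum at p1 = P.
p1_grid = np.linspace(0.0, 2.0 * P, 2001)
rates = np.array([sum_rate(p) for p in p1_grid])
p1_star = p1_grid[int(np.argmax(rates))]

c_ma = 0.5 * np.log2(np.linalg.det(np.eye(2) + P * H @ H.T))
print(p1_star, rates.max(), c_ma)   # symmetric split attains C_MA(2P)
```

The equal-power optimum here reflects the symmetry of this particular H; for an asymmetric channel the optimal split among independent inputs would generally be unequal.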