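LECTURE 9

Last time:
• Channel capacity
• Binary symmetric channels
• Erasure channels
• Maximizing capacity

Lecture outline
• Maximizing capacity: Arimoto-Blahut
• Convergence
• Examples

Arimoto-Blahut

Lemma 1:
\[
I(X;Y) = \max_{\hat P_{X|Y}} \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_X(x)\, P_{Y|X}(y|x) \log\left( \frac{\hat P_{X|Y}(x|y)}{P_X(x)} \right)
\]

Proof:
\[
I(X;Y) = \sum_{x\in\mathcal{X}} \sum_{y\in\mathcal{Y}} P_{X|Y}(x|y)\, P_Y(y) \log\left( \frac{P_{X|Y}(x|y)}{P_X(x)} \right)
\]
Recall:
\[
P_{X|Y}(x|y) = \frac{P_X(x)\, P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X(x')\, P_{Y|X}(y|x')}
\quad\text{and}\quad
P_Y(y) = \sum_{x'\in\mathcal{X}} P_X(x')\, P_{Y|X}(y|x')
\]
Then, for any \hat P_{X|Y},
\[
\begin{aligned}
& I(X;Y) - \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_X(x)\, P_{Y|X}(y|x) \log\left( \frac{\hat P_{X|Y}(x|y)}{P_X(x)} \right) \\
&= I(X;Y) - \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_Y(y)\, P_{X|Y}(x|y) \log\left( \frac{\hat P_{X|Y}(x|y)}{P_X(x)} \right) \\
&= \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_Y(y)\, P_{X|Y}(x|y) \log\left( \frac{P_{X|Y}(x|y)}{\hat P_{X|Y}(x|y)} \right) \\
&\ge \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_Y(y)\, P_{X|Y}(x|y) - \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_Y(y)\, \hat P_{X|Y}(x|y) \qquad \text{(using } \log(x) \ge 1 - \tfrac{1}{x}\text{)} \\
&= 0
\end{aligned}
\]
with equality when \hat P_{X|Y} = P_{X|Y}, so the maximum is attained and equals I(X;Y).

Capacity is
\[
C = \max_{P_X} \max_{\hat P_{X|Y}} \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} P_X(x)\, P_{Y|X}(y|x) \log\left( \frac{\hat P_{X|Y}(x|y)}{P_X(x)} \right)
\]

For fixed P_X, the RHS is maximized when
\[
\hat P_{X|Y}(x|y) = \frac{P_X(x)\, P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X(x')\, P_{Y|X}(y|x')}
\]

For fixed \hat P_{X|Y}, the RHS is maximized when
\[
P_X(x) = \frac{\exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \hat P_{X|Y}(x|y) \right)}{\sum_{x'\in\mathcal{X}} \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x') \log \hat P_{X|Y}(x'|y) \right)}
\]

Combining the two means maximization when
\[
\begin{aligned}
P_X(x) &= \frac{\exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \hat P_{X|Y}(x|y) \right)}{\sum_{x'\in\mathcal{X}} \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x') \log \hat P_{X|Y}(x'|y) \right)} \\
&= \frac{P_X(x) \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \frac{P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X(x')\, P_{Y|X}(y|x')} \right)}{\sum_{x'\in\mathcal{X}} P_X(x') \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x') \log \frac{P_{Y|X}(y|x')}{\sum_{x''\in\mathcal{X}} P_X(x'')\, P_{Y|X}(y|x'')} \right)}
\end{aligned}
\]
Note also that \sum_{x\in\mathcal{X}} P_X(x) = 1.

This may be very hard to solve.

Proof: The first two statements follow immediately from our lemma.

For any value of x where \hat P_{X|Y}(x|y) = 0, P_X(x) should be set to 0 to obtain the maximum.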
To find the maximum over the PMF P_X, let us first ignore the constraint of positivity and use a Lagrange multiplier for the constraint \sum_x P_X(x) = 1. Then
\[
\frac{\partial}{\partial P_X(x)} \left\{ \sum_{x'\in\mathcal{X}} \sum_{y\in\mathcal{Y}} P_X(x')\, P_{Y|X}(y|x') \log\left( \frac{\hat P_{X|Y}(x'|y)}{P_X(x')} \right) + \lambda \left( \sum_{x'\in\mathcal{X}} P_X(x') - 1 \right) \right\} = 0
\]
Equivalently,
\[
-\log(P_X(x)) - 1 + \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \hat P_{X|Y}(x|y) + \lambda = 0
\]
so
\[
P_X(x) = \frac{\exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \hat P_{X|Y}(x|y) \right)}{\sum_{x'\in\mathcal{X}} \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x') \log \hat P_{X|Y}(x'|y) \right)}
\]
(the normalization corresponds to choosing λ so that the P_X(x) sum to 1).

What about the positivity constraint we did not use? The solution we found satisfies it.

Convergence of Arimoto-Blahut

Let P_X^0 be a PMF and let
\[
P_X^{r+1}(x) = P_X^r(x)\, \frac{c_x\left( P_X^r(x_1), \ldots, P_X^r(x_{|\mathcal{X}|}) \right)}{\sum_{x'\in\mathcal{X}} c_{x'}\left( P_X^r(x_1), \ldots, P_X^r(x_{|\mathcal{X}|}) \right) P_X^r(x')}
\]
where
\[
c_x\left( P_X^r(x_1), \ldots, P_X^r(x_{|\mathcal{X}|}) \right) = \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \frac{P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X^r(x')\, P_{Y|X}(y|x')} \right)
\]
Then the sequence I^r of values of I(X;Y), for X taking the PMF P_X^r, converges to C from below.

Proof:
For any given P_X^r, we can increase mutual information by taking
\[
\hat P^r_{X|Y}(x|y) = \frac{P_X^r(x)\, P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X^r(x')\, P_{Y|X}(y|x')}
\]
With \hat P^r_{X|Y} fixed, then choose P_X^{r+1} by
\[
P_X^{r+1}(x) = \frac{\exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log \hat P^r_{X|Y}(x|y) \right)}{\sum_{x'\in\mathcal{X}} \exp\left( \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x') \log \hat P^r_{X|Y}(x'|y) \right)}
\]
If we define
\[
J^r = \sum_{x\in\mathcal{X}} \sum_{y\in\mathcal{Y}} P_X^{r+1}(x)\, P_{Y|X}(y|x) \log\left( \frac{\hat P^r_{X|Y}(x|y)}{P_X^{r+1}(x)} \right)
\]
then
\[
I^r \le J^r \le I^{r+1} \le J^{r+1} \le \ldots
\]
This is a non-decreasing sequence that is bounded above (by C), therefore it converges to a limit.

Why is the limit C?
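A quick sanity check before the general argument, worked for a binary symmetric channel with crossover probability p. This channel and the uniform starting PMF are illustrative choices, not taken from the slide; logarithms are natural, and H(p) denotes the binary entropy in nats. The sketch only shows that the uniform PMF is a fixed point of the update and that the corresponding I^0 already equals the known BSC capacity.

```latex
% Assumed setup: BSC with crossover p, uniform start P_X^0(0) = P_X^0(1) = 1/2.
% Then the output PMF is uniform as well: P_Y(0) = P_Y(1) = 1/2.
\[
c_x = \exp\Big( \sum_{y} P_{Y|X}(y|x) \log \frac{P_{Y|X}(y|x)}{1/2} \Big)
    = \exp\big( \log 2 + (1-p)\log(1-p) + p \log p \big)
    = \exp\big( \log 2 - H(p) \big), \qquad x \in \{0, 1\}.
\]
% Both c_x are equal, so the multiplicative update leaves the PMF unchanged:
\[
P_X^{1}(x) = P_X^{0}(x)\, \frac{c_x}{\tfrac12 c_0 + \tfrac12 c_1} = P_X^{0}(x),
\]
% and the mutual information at this fixed point is the BSC capacity:
\[
I^0 = \sum_{x} P_X^{0}(x) \log c_x = \log 2 - H(p) = C .
\]
```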
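Let P_X^* be a capacity-achieving PMF. Then
\[
\begin{aligned}
\sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{P_X^{r+1}(x)}{P_X^r(x)} \right)
&= \sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{c_x\left( P_X^r(x_1), \ldots, P_X^r(x_{|\mathcal{X}|}) \right)}{\sum_{x'\in\mathcal{X}} c_{x'}\left( P_X^r(x_1), \ldots, P_X^r(x_{|\mathcal{X}|}) \right) P_X^r(x')} \right) \\
&= \sum_{x\in\mathcal{X}} P_X^*(x) \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log\left( \frac{P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X^r(x')\, P_{Y|X}(y|x')} \right) \\
&\quad - \sum_{x\in\mathcal{X}} P_X^*(x) \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log\left( \sum_{x'\in\mathcal{X}} P_X^r(x') \exp\left( \sum_{y'\in\mathcal{Y}} P_{Y|X}(y'|x') \log \frac{P_{Y|X}(y'|x')}{\sum_{x''\in\mathcal{X}} P_X^r(x'')\, P_{Y|X}(y'|x'')} \right) \right)
\end{aligned}
\]

By considering the K-L distance, we have that
\[
\sum_{x\in\mathcal{X}} \sum_{y\in\mathcal{Y}} P_X^*(x)\, P_{Y|X}(y|x) \log\left( \frac{\sum_{x'\in\mathcal{X}} P_X^*(x')\, P_{Y|X}(y|x')}{\sum_{x'\in\mathcal{X}} P_X^r(x')\, P_{Y|X}(y|x')} \right) \ge 0
\]
so
\[
\begin{aligned}
\sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{P_X^{r+1}(x)}{P_X^r(x)} \right)
&\ge \sum_{x\in\mathcal{X}} P_X^*(x) \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log\left( \frac{P_{Y|X}(y|x)}{\sum_{x'\in\mathcal{X}} P_X^*(x')\, P_{Y|X}(y|x')} \right) \\
&\quad - \sum_{x\in\mathcal{X}} P_X^*(x) \sum_{y\in\mathcal{Y}} P_{Y|X}(y|x) \log\left( \sum_{x'\in\mathcal{X}} P_X^r(x') \exp\left( \sum_{y'\in\mathcal{Y}} P_{Y|X}(y'|x') \log \frac{P_{Y|X}(y'|x')}{\sum_{x''\in\mathcal{X}} P_X^r(x'')\, P_{Y|X}(y'|x'')} \right) \right)
\end{aligned}
\]
The first term on the right is I(X;Y) under P_X^*, that is, C. In the second term the logarithm does not depend on x or y, and it equals J^r.

Hence
\[
\sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{P_X^{r+1}(x)}{P_X^r(x)} \right) \ge C - J^r
\]

Sum over r:
\[
\begin{aligned}
\sum_{r=0}^{m} (C - J^r)
&\le \sum_{r=0}^{m} \sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{P_X^{r+1}(x)}{P_X^r(x)} \right) \\
&= \sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{P_X^{m+1}(x)}{P_X^0(x)} \right) \\
&\le \sum_{x\in\mathcal{X}} P_X^*(x) \log\left( \frac{P_X^*(x)}{P_X^0(x)} \right)
\end{aligned}
\]
The last step uses the non-negativity of the K-L distance between P_X^* and P_X^{m+1}. The bound is finite provided P_X^0(x) > 0 wherever P_X^*(x) > 0.

C − J^r is non-negative and non-increasing, with bounded sum, so it goes to 0; hence J^r converges to C.

In practice, convergence can be very slow.

Example

Other types of maximization
• Interior point methods
• Cutting plane algorithms

MIT OpenCourseWare
http://ocw.mit.edu

6.441 Information Theory
Spring 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.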
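The iteration from this lecture can be tried numerically. The following is a minimal sketch in plain Python, not the lecture's own code: the function names `arimoto_blahut` and `binary_entropy`, the nested-list channel layout `channel[x][y]`, and the BSC with crossover 0.1 are all assumptions made for illustration. It uses natural logarithms (nats) and assumes a strictly positive starting PMF. The stopping rule relies on the standard bound C ≤ max_x log c_x, which does not appear on the slides; together with J^r ≤ C it gives a certified gap.

```python
import math


def arimoto_blahut(channel, p0=None, tol=1e-9, max_iter=10_000):
    """Run the multiplicative update P^{r+1}(x) = P^r(x) c_x / sum_x' c_x' P^r(x').

    channel[x][y] holds P_{Y|X}(y|x). Returns the final input PMF and a list
    of (I^r, J^r) pairs in nats. Assumes p0 is strictly positive (uniform if
    omitted), so that every reachable output symbol has positive probability.
    """
    n_x, n_y = len(channel), len(channel[0])
    p_x = list(p0) if p0 is not None else [1.0 / n_x] * n_x
    history = []
    for _ in range(max_iter):
        # Output PMF induced by the current input PMF P_X^r.
        p_y = [sum(p_x[x] * channel[x][y] for x in range(n_x)) for y in range(n_y)]
        # log c_x = sum_y P(y|x) log(P(y|x) / P_Y^r(y)); zero-probability terms add 0.
        log_c = [
            sum(q * math.log(q / p_y[y]) for y, q in enumerate(channel[x]) if q > 0)
            for x in range(n_x)
        ]
        z = sum(p * math.exp(lc) for p, lc in zip(p_x, log_c))
        i_r = sum(p * lc for p, lc in zip(p_x, log_c))  # I^r = sum_x P^r(x) log c_x
        j_r = math.log(z)                               # J^r = log sum_x P^r(x) c_x
        history.append((i_r, j_r))
        p_x = [p * math.exp(lc) / z for p, lc in zip(p_x, log_c)]
        # Assumed stopping rule: J^r <= C <= max_x log c_x, so this gap bounds C - J^r.
        if max(log_c) - j_r < tol:
            break
    return p_x, history


def binary_entropy(p):
    """Binary entropy H(p) in nats, for 0 < p < 1."""
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


# Illustrative run: BSC with crossover 0.1, started from a skewed PMF.
bsc = [[0.9, 0.1], [0.1, 0.9]]
p_x, history = arimoto_blahut(bsc, p0=[0.9, 0.1])
print("iterations:", len(history))
print("final PMF:", p_x)
print("J^r at stop:", history[-1][1], "closed form:", math.log(2) - binary_entropy(0.1))
```

The `history` list makes it easy to inspect the chain I^r ≤ J^r ≤ I^{r+1} and to compare the partial sums of C − J^r with the K-L bound derived above. Starting from a PMF closer to the boundary of the simplex is one way to observe the slow convergence mentioned on the slide.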