Capacity Limits of Wireless Channels with Multiple Antennas: Challenges, Insights, and New Mathematical Methods
Andrea Goldsmith, Stanford University
Coauthors: T. Holliday, S. Jafar, N. Jindal, S. Vishwanath
Princeton-Rutgers Seminar Series, Rutgers University, April 23, 2003

Future Wireless Systems
Ubiquitous communication among people and devices:
- Nth-generation cellular
- Nth-generation WLANs
- Wireless entertainment
- Wireless ad hoc networks
- Sensor networks
- Smart homes/appliances
- Automated cars/factories
- Telemedicine/learning
- All this and more…

Challenges
- The wireless channel is a randomly varying broadcast medium with limited bandwidth.
- Fundamental capacity limits and good protocol designs for wireless networks are open problems.
- Hard energy and delay constraints change fundamental design principles.
- Many applications fail miserably with a "generic" network approach: hence the need for cross-layer design.

Outline
- Wireless channel capacity
- Capacity of MIMO channels: imperfect channel information, channel correlations
- Multiuser MIMO channels: duality and dirty paper coding
- Lyapunov exponents and capacity

Wireless Channel Capacity
- Fundamental limit on data rates.
- Capacity: the set of simultaneously achievable rates {R_1, …, R_n}.
[Figure: capacity regions in (R_1, R_2, R_3) space]
Main drivers of channel capacity:
- Bandwidth and power
- Statistics of the channel
- Channel knowledge and how it is used
- Number of antennas at TX and RX

MIMO Channel Model
n TX antennas and m RX antennas, with gain h_ij from TX antenna j to RX antenna i:

y = Hx + n,  n ~ N(0, σ² I)

[Figure: 3x3 antenna array with gains h_11 through h_33]
The model applies to any channel described by a matrix (e.g., ISI channels).

What's so great about MIMO?
Fantastic capacity gains (Foschini/Gans '96, Telatar '99): capacity grows linearly with the number of antennas when the channel is known perfectly at TX and RX:

C = max_{Q: Tr(Q) ≤ P} B log |I + H^T Q H / σ²| = max_{{P_i}: Σ_i P_i ≤ P} Σ_{i=1}^{rank(H^T Q H)} B log(1 + λ_i P_i / σ²)

where the λ_i are the eigenvalues of H^T H.
- Vector codes (or scalar codes with successive interference cancellation) are optimal.
- Assumptions: perfect channel knowledge and spatially uncorrelated fading, so rank(H^T Q H) = min(n, m).
- What happens when these assumptions are relaxed?

Realistic Assumptions
- No transmitter knowledge of H: capacity is much smaller.
- No receiver knowledge of H: capacity does not increase as the number of antennas increases (Marzetta/Hochwald '99).
- Will the promise of MIMO be realized in practice?

Partial Channel Knowledge
[Figure: transmitter → channel (y = Hx + n, n ~ N(0, σ² I)) → receiver, with channel information q fed back to the transmitter]
- Model the channel as H ~ N(μ, Σ).
- The receiver knows the channel H perfectly.
- The transmitter has partial information q about H: H ~ p(H | q).

Partial Information Models
- Channel mean information: the mean is measured, the covariance is unknown: H ~ N(μ, I).
- Channel covariance information: the mean is unknown, the covariance is measured: H ~ N(0, Σ).
- We have developed necessary and sufficient conditions for the optimality of beamforming, obtained for both MISO and MIMO channels; the optimal transmission strategy is also known.

Beamforming
Scalar codes with transmit precoding: a single codeword stream is weighted across the antennas.
[Figure: codeword c weighted by coefficients c_1, …, c_n onto TX antennas x_1, …, x_n]
- Transforms the MIMO system into a SISO system.
- Greatly simplifies encoding and decoding.
- Channel information indicates the best direction in which to beamform.
- "Sufficient" knowledge is needed for optimality.

Optimality of Beamforming: Mean Information
[Figure: region of mean-information parameters where beamforming is optimal]

Optimality of Beamforming: Covariance Information
[Figure: region of covariance-information parameters where beamforming is optimal]

No Tx or Rx Knowledge
- Increasing n_T beyond the coherence time T in a block-fading channel does not increase capacity (Marzetta/Hochwald '99); this result assumes uncorrelated fading.
- We have shown that with correlated fading, adding TX antennas always increases capacity. Small transmit antenna spacing is good!
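Since the talk gives no code, here is a minimal numerical sketch of the water-filling computation behind the capacity formula above (all function and variable names are my own; B = 1 and σ² = 1 by default), illustrating the roughly linear growth of capacity with the number of antennas:

```python
import numpy as np

def waterfill_capacity(H, P, sigma2=1.0):
    """Capacity (bits/channel use, B = 1) with perfect CSI at TX and RX:
    water-filling the total power P over the eigenmodes of H."""
    g = np.linalg.svd(H, compute_uv=False) ** 2 / sigma2  # eigenmode gains, descending
    g = g[g > 1e-12]
    for k in range(len(g), 0, -1):
        mu = (P + np.sum(1.0 / g[:k])) / k   # candidate water level with k active modes
        p = mu - 1.0 / g[:k]                 # per-mode power allocation
        if p[-1] > 0:                        # weakest of the k modes still active
            break
    return float(np.sum(np.log2(1.0 + g[:k] * p)))

# Capacity grows roughly linearly in the number of antennas n:
rng = np.random.default_rng(1)
for n in (1, 2, 4, 8):
    H = rng.normal(size=(n, n)) / np.sqrt(2)   # i.i.d. (spatially uncorrelated) fading
    print(n, round(waterfill_capacity(H, P=10.0), 2))
```

For a single antenna this reduces to log2(1 + h² P / σ²), the usual SISO formula.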
Impact of Spatial Correlations on Channel Capacity
- Perfect RX and TX knowledge: correlation hurts (Boche/Jorswieck '03)
- Perfect RX knowledge, no TX knowledge: hurts (BJ '03)
- Perfect RX knowledge, TX knows the correlation: helps
- TX and RX only know the correlation: helps

Gaussian Broadcast and Multiple Access Channels
Broadcast channel (BC): one transmitter to many receivers.
Multiple access channel (MAC): many transmitters to one receiver.
- Transmit power constraint
- Perfect TX and RX knowledge
[Figure: BC downlink and MAC uplink with time-varying gains h_i(t)]

Comparison of MAC and BC
Differences:
- Shared vs. individual power constraints
- Near-far effect in the MAC
Similarities:
- Optimal BC "superposition" coding is also optimal for the MAC (a sum of Gaussian codewords)
- Both decoders exploit successive decoding and interference cancellation

MAC-BC Capacity Regions
- The MAC capacity region is known for many cases: a convex optimization problem.
- The BC capacity region is typically only known for (parallel) degraded channels, and the formulas are often not convex.
- Can we find a connection between the BC and MAC capacity regions? Duality.

Dual Broadcast and MAC Channels
Gaussian BC and MAC with the same channel gains and the same noise power at each receiver.
[Figure: BC with input x(n), power P, gains h_1(n), …, h_M(n), noises z_1(n), …, z_M(n), outputs y_1(n), …, y_M(n); MAC with inputs x_1(n), …, x_M(n), powers P_1, …, P_M, the same gains, noise z(n), output y(n)]

The BC from the MAC

C_MAC(P_1, P_2; h_1, h_2) ⊆ C_BC(P_1 + P_2; h_1, h_2)

[Figure: MAC regions for (P_1, P_2) = (0.5, 1.5), (1, 1), (1.5, 0.5) inside the BC region; blue = BC, red = MAC]

MAC with a sum-power constraint:

C_BC(P; h_1, h_2) = ∪_{0 ≤ P_1 ≤ P} C_MAC(P_1, P − P_1; h_1, h_2) = C_MAC^Sum(P; h_1, h_2)

- Power is pooled between the MAC transmitters, with no transmitter coordination.
- Same capacity region!
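As an illustrative sketch (not from the talk; noise power normalized to 1, function names my own), the two-user scalar AWGN MAC shows the successive-decoding structure mentioned above: each of the two SIC decoding orders achieves one corner of the pentagon region, and both meet the sum-rate bound log2(1 + h1²P1 + h2²P2):

```python
import numpy as np

def sic_corner(h1, h2, P1, P2, decode_first):
    """Rates when user `decode_first` is decoded first (treating the other
    user as noise) and then cancelled before the second user is decoded."""
    if decode_first == 1:
        r1 = np.log2(1 + h1**2 * P1 / (1 + h2**2 * P2))  # sees user 2 as noise
        r2 = np.log2(1 + h2**2 * P2)                     # clean after cancellation
    else:
        r2 = np.log2(1 + h2**2 * P2 / (1 + h1**2 * P1))
        r1 = np.log2(1 + h1**2 * P1)
    return r1, r2

h1, h2, P1, P2 = 1.0, 0.6, 4.0, 2.0
sum_bound = np.log2(1 + h1**2 * P1 + h2**2 * P2)
for first in (1, 2):
    r1, r2 = sic_corner(h1, h2, P1, P2, first)
    print(first, np.isclose(r1 + r2, sum_bound))  # both orders hit the sum-rate bound
```

The decoding order only trades rate between the users; the sum rate is the same at both corners.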
BC to MAC: Channel Scaling
- Scale user 1's power gain by α and its power by 1/α, so that the received SNR h_1² P_1 is unchanged: the MAC capacity region is unaffected by the scaling.
- The scaled MAC capacity region is a subset of the scaled BC capacity region for any scaling α.
[Figure: MAC with powers P_1, P_2 and scaled gain for user 1; BC with power P_1/α + P_2]

The BC from the MAC
[Figure: blue = scaled BC regions for several values of α, red = MAC region]

C_MAC(P_1, P_2; h_1, h_2) = ∩_{α > 0} C_BC(P_1/α + P_2; √α h_1, h_2)

Duality: Constant AWGN Channels
BC in terms of the MAC:

C_BC(P; h_1, h_2) = ∪_{0 ≤ P_1 ≤ P} C_MAC(P_1, P − P_1; h_1, h_2)

MAC in terms of the BC:

C_MAC(P_1, P_2; h_1, h_2) = ∩_{α > 0} C_BC(P_1/α + P_2; √α h_1, h_2)

What is the relationship between the optimal transmission strategies?

Transmission Strategy Transformations
Equate the rates and solve for the powers (noise power σ²):

R_1^M = log(1 + h_1² P_1^M / (σ² + h_2² P_2^M)) = log(1 + h_1² P_1^B / σ²) = R_1^B
R_2^M = log(1 + h_2² P_2^M / σ²) = log(1 + h_2² P_2^B / (σ² + h_2² P_1^B)) = R_2^B

Opposite decoding orders: the stronger user (user 1) is decoded last in the BC, while the weaker user (user 2) is decoded last in the MAC.

Duality Applies to Different Fading Channel Capacities
- Ergodic (Shannon) capacity: maximum rate averaged over all fading states.
- Zero-outage capacity: maximum rate that can be maintained in all fading states.
- Outage capacity: maximum rate that can be maintained in all non-outage fading states.
- Minimum-rate capacity: a minimum rate is maintained in all states; maximize the average rate in excess of the minimum.
Explicit transformations exist between the transmission strategies.

Duality: Minimum-Rate Capacity
[Figure: MAC region in terms of the BC; blue = scaled BC, red = MAC]
- The BC region is known; the MAC region can only be obtained through duality.
- What other unknown capacity regions can be obtained by duality?
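The rate-equating transformation above can be checked numerically. In this sketch (noise power normalized to 1, function names my own), every BC power split maps to MAC powers that achieve the same rate pair, and the mapping preserves the total power, which is exactly why the BC region equals the sum-power MAC region:

```python
import numpy as np

def bc_rates(h1, h2, p1b, p2b):
    # Superposition coding: user 1 (stronger) decoded last, interference-free.
    r1 = np.log2(1 + h1**2 * p1b)
    r2 = np.log2(1 + h2**2 * p2b / (1 + h2**2 * p1b))
    return r1, r2

def bc_to_mac_powers(h1, h2, p1b, p2b):
    # Solve the rate-equating equations for the dual MAC powers
    # (user 2, the weaker user, is decoded last in the MAC).
    p2m = p2b / (1 + h2**2 * p1b)
    p1m = p1b * (1 + h2**2 * p2m)
    return p1m, p2m

def mac_rates(h1, h2, p1m, p2m):
    r1 = np.log2(1 + h1**2 * p1m / (1 + h2**2 * p2m))
    r2 = np.log2(1 + h2**2 * p2m)
    return r1, r2

h1, h2, P = 1.0, 0.5, 2.0
for p1b in (0.0, 0.5, 1.3, 2.0):
    p2b = P - p1b
    p1m, p2m = bc_to_mac_powers(h1, h2, p1b, p2b)
    print(np.allclose(bc_rates(h1, h2, p1b, p2b),
                      mac_rates(h1, h2, p1m, p2m)),
          round(p1m + p2m, 10))   # rates match and the sum power stays P
```

A short calculation shows p1m + p2m = p1b + p2b for every split, so no extra total power is ever needed on the MAC side.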
Dirty Paper Coding (Costa '83)
Basic premise: if the interference is known at the transmitter, the channel capacity is the same as if there were no interference.
- Accomplished by cleverly distributing the writing (codewords) and coloring their ink.
- The decoder must know how to read these codewords.
[Figure: dirty paper coding on a clean channel vs. a dirty channel]

Modulo Encoding/Decoding
- Received signal Y = X + S with −1 ≤ X ≤ 1; S is known to the transmitter but not to the receiver.
- The modulo operation removes the interference effects: set X so that Y modulo [−1, 1) equals the desired message (e.g., 0.5); the receiver demodulates modulo [−1, 1).
[Figure: number line from −7 to +7 partitioned into modulo-[−1, 1) bins, showing S and X]

Broadcast MIMO Channel
t TX antennas; r_1, r_2 RX antennas; perfect CSI at TX and RX:

y_1 = H_1 x + n_1,  n_1 ~ N(0, I_{r_1}),  H_1 of size r_1 × t
y_2 = H_2 x + n_2,  n_2 ~ N(0, I_{r_2}),  H_2 of size r_2 × t

A non-degraded broadcast channel.

Capacity Results
- Non-degraded broadcast channel: with multiple transmit/receive antennas, receivers are not necessarily "better" or "worse" than one another, and the capacity region for the general case is unknown.
- Pioneering work by Caire/Shamai (Allerton '00): two TX antennas and two RXs (one antenna each); dirty paper coding/lattice precoding (extended by Yu/Cioffi); computationally very complex; a MIMO version of the Sato upper bound.

Dirty-Paper Coding (DPC) for the MIMO BC
Coding scheme:
- Choose a codeword for user 1.
- Treat this codeword as interference to user 2.
- Pick the signal for user 2 using "pre-coding".
Receiver 2 experiences no interference:

R_2 = log det(I + H_2 S_2 H_2^T)

The signal for receiver 2 interferes with receiver 1:

R_1 = log [ det(I + H_1 (S_1 + S_2) H_1^T) / det(I + H_1 S_2 H_1^T) ]

The encoding order can be switched.

Dirty Paper Coding in Cellular
[Figure: dirty paper coding applied to a cellular downlink]

Does DPC achieve capacity?
- DPC yields an achievable region for the MIMO BC, which we call the dirty-paper region.
- Is this region the capacity region?
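A toy sketch of the modulo encoding/decoding slide above (the interval [−1, 1) and the example message 0.5 follow the slide; the function names are my own). The transmitter pre-subtracts the known interference S modulo 2, so the receiver recovers the message with a single modulo operation and no knowledge of S:

```python
import numpy as np

def mod2(v):
    """Reduce v into the interval [-1, 1) (modulo length 2)."""
    return (v + 1.0) % 2.0 - 1.0

def dpc_encode(d, S):
    return mod2(d - S)       # transmitted signal, always in [-1, 1)

def dpc_decode(y):
    return mod2(y)           # receiver needs no knowledge of S

rng = np.random.default_rng(2)
d = 0.5                      # desired message (the slide's example)
for S in rng.uniform(-7, 7, size=5):
    X = dpc_encode(d, S)
    Y = X + S                # noiseless "dirty" channel
    print(round(dpc_decode(Y), 6))  # recovers 0.5 every time
```

The key point the slide makes is that the transmit signal X stays bounded in [−1, 1) no matter how large S is.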
We use duality, dirty paper coding, and Sato's upper bound to address this question.

MIMO MAC with Sum Power
The MAC with a sum-power constraint: transmitters share power but code independently.

C_MAC^Sum(P) = ∪_{0 ≤ P_1 ≤ P} C_MAC(P_1, P − P_1)

Theorem: the dirty-paper BC region equals the dual sum-power MAC region:

C_BC^DPC(P) = C_MAC^Sum(P)

Transformations: MAC to BC
Show that any rate vector achievable in the sum-power MAC is also achievable with DPC on the BC: C_MAC^Sum(P) ⊆ C_BC^DPC(P).
- A sum-power MAC strategy for a point (R_1, …, R_N) has a given input covariance matrix and encoding order.
- We find the corresponding PSD covariance matrices and encoding order that achieve (R_1, …, R_N) with DPC on the BC.
- The rank-preserving transform "flips the effective channel" and reverses the order.
- Side result: beamforming is optimal for the BC with one RX antenna at each mobile.

Transformations: BC to MAC
Show that any rate vector achievable with DPC on the BC is also achievable in the sum-power MAC: C_BC^DPC(P) ⊆ C_MAC^Sum(P).
- We find the transformation between the optimal DPC strategy and the optimal sum-power MAC strategy: "flip the effective channel" and reverse the order.

Computing the Capacity Region
C_BC^DPC(P) = C_MAC^Sum(P)
- The DPC region is hard to compute directly (Caire/Shamai '00), while the MIMO MAC capacity region is "easy" to compute.
- Obtain the DPC region by solving for the sum-power MAC and applying the theorem; fast iterative algorithms have been developed.
- This greatly simplifies calculation of the DPC region and of the associated transmit strategy.

Sato Upper Bound on the BC Capacity Region
Based on receiver cooperation: the BC sum-rate capacity is bounded by the cooperative capacity.
[Figure: a joint receiver processes y_1 and y_2]

C_BC^sum-rate(P, H) ≤ max_{Σ_x} (1/2) log |I + H Σ_x H^T|

The Sato Bound for the MIMO BC
- Introduce noise correlation between the receivers: the BC capacity region is unaffected, since it depends only on the noise marginals.
- Tight bound (Caire/Shamai '00): the cooperative capacity with the worst-case noise correlation:

C_BC^sum-rate(P, H) ≤ inf_{Σ_z} max_{Σ_x} (1/2) log |I + Σ_z^{−1/2} H Σ_x H^T Σ_z^{−1/2}|

- An explicit formula for the worst-case noise covariance exists.
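A small numerical sanity check (illustrative, not the authors' algorithm; names my own, natural-log units): for any PSD input covariances S1, S2, the DPC rate pair computed from the log-det formulas above stays inside Sato's cooperative bound, evaluated here with uncorrelated unit noise by water-filling the stacked channel [H1; H2] under the same total power:

```python
import numpy as np

def dpc_rates(H1, H2, S1, S2):
    """DPC rate pair from the log-det formulas (user 1 sees user 2's
    signal as interference; natural-log units)."""
    I1, I2 = np.eye(H1.shape[0]), np.eye(H2.shape[0])
    r2 = np.log(np.linalg.det(I2 + H2 @ S2 @ H2.T))
    r1 = (np.log(np.linalg.det(I1 + H1 @ (S1 + S2) @ H1.T))
          - np.log(np.linalg.det(I1 + H1 @ S2 @ H1.T)))
    return r1, r2

def coop_capacity(H, P):
    """Cooperative-receiver capacity with unit noise: water-filling the
    total power P over the stacked channel's eigenmodes."""
    g = np.linalg.svd(H, compute_uv=False) ** 2
    g = g[g > 1e-12]
    for k in range(len(g), 0, -1):
        mu = (P + np.sum(1.0 / g[:k])) / k
        p = mu - 1.0 / g[:k]
        if p[-1] > 0:
            break
    return float(np.sum(np.log(1.0 + g[:k] * p)))

rng = np.random.default_rng(3)
H1, H2 = rng.normal(size=(1, 2)), rng.normal(size=(1, 2))
A, B = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
S1, S2 = A @ A.T, B @ B.T                 # arbitrary PSD input covariances
P = float(np.trace(S1 + S2))
r1, r2 = dpc_rates(H1, H2, S1, S2)
bound = coop_capacity(np.vstack([H1, H2]), P)
print(r1 + r2 <= bound + 1e-9)  # True: DPC rates lie inside the Sato bound
```

This only checks the easy direction (achievability below the bound); the theorem in the talk is the hard direction, that the bound is met with equality at the sum-rate point under worst-case noise.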
By Lagrangian duality, the cooperative BC sum rate equals the sum-rate capacity of the MIMO MAC, so it can be computed from the MAC.

Sum-Rate Proof
- DPC achievable (obvious): C_BC^DPC(P) ⊆ C_BC(P).
- Duality: C_BC^DPC(P) = C_MAC^Sum(P).
- Sato bound: C_BC^sum-rate(P) ≤ C_BC^Coop(P).
- Lagrangian duality (compute from the MAC): the cooperative sum rate with worst-case noise equals the sum-rate capacity of C_MAC^Sum(P).
Chaining these, the DPC sum rate equals the BC sum-rate capacity.
(*The same result was obtained by Vishwanath/Tse for one RX antenna.)

MIMO BC Capacity Bounds
[Figure: single-user capacity bounds, dirty-paper achievable region, BC sum-rate point, and Sato upper bound]
Does the DPC region equal the capacity region?

Full Capacity Region
- DPC gives us an achievable region.
- The Sato bound touches it only at the sum-rate point.
- We need a tighter bound to prove that DPC is optimal.

A Tighter Upper Bound
[Figure: receiver 1 is also given y_2]
- Give the data of one user to the other users: the channel becomes a degraded BC.
- The capacity region of a degraded BC is known, yielding a tight upper bound on the original channel's capacity.
- This bound and duality prove that DPC achieves capacity under a Gaussian input restriction; it remains to be shown that Gaussian inputs are optimal.

Full Capacity Region Proof

C_BC^DPC(P) ⊆ C_BC(P) ⊆ C_BC^DSM(P)   (tight upper bound)
C_BC^DSM(P) = C_MAC^DSM(P)   (duality)
C_MAC^DSM(P) = C_MAC^Sum(P)   (worst-case noise diagonalizes; compute from the MAC)
C_MAC^Sum(P) = C_BC^DPC(P)   (duality)

Final result: C_BC^DPC(P) = C_BC(P) for Gaussian inputs.

Time-varying Channels with Memory
Time-varying channels with finite memory induce infinite memory in the channel output.
Capacity of time-varying infinite-memory channels is known only in terms of a limit:

C = lim_{n→∞} max_{p(X^n)} (1/n) I(X^n; Y^n)

Closed-form capacity solutions are known in only a few cases, such as the Gilbert-Elliott and finite-state Markov channels.

A New Characterization of Channel Capacity
Capacity using Lyapunov exponents:

C = max_{p(x)} [λ(X) + λ(Y) − λ(X, Y)]

where the Lyapunov exponent is

λ(X) = lim_{n→∞} (1/n) log || B_{X_1} B_{X_2} ⋯ B_{X_n} ||

for B_{X_i} a random matrix whose entries depend on the input symbol X_i. Similar definitions hold for λ(Y) and λ(X, Y); the matrices B_{Y_i} and B_{X_i Y_i} depend on the input and the channel.

Lyapunov Exponents and Entropy
The Lyapunov exponent equals an entropy rate under certain conditions: entropy as a product of random matrices, a connection between information theory and dynamical systems theory.

λ(X) = −lim_{n→∞} (1/n) log P(X_1, …, X_n)
λ(Y) = −lim_{n→∞} (1/n) log P(Y_1, …, Y_n)
λ(X, Y) = −lim_{n→∞} (1/n) log P((X_1, Y_1), …, (X_n, Y_n))

- We still have only a limiting expression for entropy.
- Sample entropy has poor convergence properties.

Lyapunov Direction Vector
The vector p_n is the "direction" associated with λ(X), for any initial vector m.
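The prediction-filter machinery can be sketched as follows (illustrative, not the authors' code; the model and all names are my own). Running the normalized filter over a hidden-Markov output accumulates log P(Y_1, …, Y_n) from the per-step normalization constants, giving a sample estimate of the entropy rate; as a sanity check, a chain whose two states emit identically produces an i.i.d. output with a known entropy rate:

```python
import numpy as np

def sample_entropy_rate(T, E, ys, pi):
    """-(1/n) log2 P(y_1..y_n) via the normalized HMM prediction filter.
    T[i, j]: state transition probs; E[i, y]: emission probs; pi: initial dist."""
    alpha = pi * E[:, ys[0]]
    c = alpha.sum(); alpha = alpha / c; logp = np.log2(c)
    for y in ys[1:]:
        alpha = (alpha @ T) * E[:, y]      # one prediction-filter step
        c = alpha.sum(); alpha = alpha / c; logp += np.log2(c)
    return -logp / len(ys)

rng = np.random.default_rng(4)
T = np.array([[0.9, 0.1], [0.2, 0.8]])
p = 0.3
E = np.array([[1 - p, p], [1 - p, p]])     # both states emit Bernoulli(p), so
H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))  # the output is i.i.d. with known rate
ys = (rng.random(50000) < p).astype(int)
est = sample_entropy_rate(T, E, ys, np.array([0.5, 0.5]))
print(abs(est - H) < 0.02)   # sample entropy is close to the true rate
```

For a genuinely hidden chain the same filter runs unchanged, but there is no closed form to compare against, which is the convergence issue the slides raise.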
It also defines the conditional channel state probability:

p_n = m B_{X_1} B_{X_2} ⋯ B_{X_n} / || m B_{X_1} B_{X_2} ⋯ B_{X_n} ||_1 = P(Z_{n+1} | X_1, …, X_n)

The vector has a number of interesting properties:
- It is the standard prediction filter in hidden Markov models.
- Under certain conditions, its stationary distribution can be used to directly compute λ(X).

Computing Lyapunov Exponents
Define p as the stationary distribution of the direction vector p_n.
[Figure: the trajectory p_n, p_{n+1}, p_{n+2}, … converging to p]
We prove that these Lyapunov exponents can be computed in closed form as

λ(X) = E_{p, X}[ log || p B_X || ]

This result is a significant advance in the theory of Lyapunov exponent computation.

Computing Capacity
Closed-form formula for mutual information:

I(X; Y) = λ(X) + λ(Y) − λ(X, Y)

- We prove continuity of the Lyapunov exponents with respect to the input distribution and the channel, so mutual information can be maximized over the channel input distribution to obtain capacity.
- Numerical results for time-varying SISO and MIMO channel capacity have been obtained.
- We have also developed a new CLT and confidence-interval methodology for sample entropy.

Sensor Networks
- Energy is a driving constraint.
- Data flows to a centralized location.
- Low per-node rates, but up to 100,000 nodes.
- Data is highly correlated in time and space.
- Nodes can cooperate in transmission and reception.

Energy-Constrained Network Design
- Each node can only send a finite number of bits.
- Short-range networks must consider transmit, analog hardware, and processing energy.
- Transmit energy per bit is minimized by sending each bit over many dimensions (time/bandwidth product).
- Delay vs. energy tradeoffs for each bit.
- Sophisticated techniques for modulation, coding, etc. are not necessarily energy-efficient.
- Sleep modes save energy but complicate networking.
New network design paradigm:
- Bit allocation must be optimized across all protocols.
- Delay vs. throughput vs. node/network lifetime tradeoffs.
- Optimization of node cooperation (coding, MIMO, etc.).

Results to Date
Modulation optimization: adaptive MQAM vs.
MFSK for a given delay and rate, taking RF hardware/processing tradeoffs into account.
- MIMO vs. MISO vs. SISO under an energy constraint: SISO has the best performance at short distances (< 100 m).
- Optimal adaptation with delay/energy constraints.
- Minimum-energy routing.

Conclusions
- Shannon capacity gives fundamental data-rate limits for wireless channels.
- Many capacity problems remain open for time-varying multiuser MIMO channels.
- Duality and dirty paper coding are powerful tools for solving new capacity problems and for simplifying their computation.
- Lyapunov exponents are a powerful new tool for solving capacity problems.
- Cooperative communications in sensor networks is an interesting new area of research.