Lecture 2 The Capacity of Wireless Channels

MIMO Communication Systems
Lecture 2
The Capacity of Wireless Channels
Prof. Chun-Hung Liu
Dept. of Electrical and Computer Engineering
National Chiao Tung University
Fall 2015
Outline
• Capacity of SISO Channels (Chapter 4 in Goldsmith’s Book)
• Capacity in AWGN
• Capacity of Flat-Fading Channels
• Channel and System Model
• Channel Distribution Information (CDI) Known
• Channel Side Information at Receiver
• Channel Side Information at Transmitter and Receiver
• Capacity of MIMO Channels (Sections 10.1–10.3 in Goldsmith’s Book)
Introduction to Shannon Channel Capacity
• Channel capacity was pioneered by Claude Shannon (1916–2001) in the late 1940s, using a mathematical theory of communication based on the notion of mutual information between the input and output of a channel.
• C. E. Shannon, “A Mathematical Theory of Communication,” Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, July/October 1948.
• Proposed entropy as a measure of information.
• Showed that digital communication is possible.
• Derived the channel capacity formula.
• Showed that reliable information transmission is always possible if the transmitting rate does not exceed the channel capacity.
[Photo: Claude Shannon in 1948 (32 years old)]
Introduction to Shannon Channel Capacity
The fundamental problem of communication is that of reproducing at one
point, either exactly or approximately, a message selected at another point.
~ Claude Shannon
[Figure: a generic channel]
Channel capacity:
$$C = B\log_2(1 + \mathrm{SNR})$$
Introduction to Shannon Channel Capacity
• Consider a discrete-time additive white Gaussian noise (AWGN) channel with channel input/output relationship $y[i] = x[i] + n[i]$, where $x[i]$ is the channel input at time $i$, $y[i]$ is the corresponding channel output, and $n[i]$ is a white Gaussian noise random process.
• Assume a channel bandwidth $B$ and transmit power $P$.
• The channel SNR, the power in $x[i]$ divided by the power in $n[i]$, is constant and given by $\gamma = P/(N_0 B)$, where $N_0/2$ is the power spectral density of the noise.
• The capacity of this channel is given by Shannon’s well-known formula $C = B\log_2(1+\gamma)$, where the capacity units are bits per second (bps).
• Shannon’s coding theorem proves that a code exists that achieves data rates arbitrarily close to capacity with arbitrarily small probability of bit error.
Introduction to Shannon Channel Capacity
• Shannon proved that channel capacity equals the mutual information of the channel maximized over all possible input distributions:
$$C = \max_{p(x)} I(X;Y). \qquad (4.3)$$
• For the AWGN channel, the maximizing input distribution is Gaussian, which results in the channel capacity $C = B\log_2(1+\gamma)$.
• At the time that Shannon developed his theory of information, data rates over standard telephone lines were on the order of 100 bps. Thus it was believed that Shannon capacity, which predicted speeds of roughly 30 kbps over the same telephone lines, was not a very useful bound for real systems.
• However, breakthroughs in hardware, modulation, and coding techniques have brought today’s commercial modems very close to the speeds predicted by Shannon in the 1950s.
I. Capacity of SISO Channels
Mutual Information
• The converse theorem shows that any code with rate $R > C$ has a probability of error bounded away from zero.
• For a memoryless time-invariant channel with random input $x$ and random output $y$, the channel’s mutual information is defined as
$$I(X;Y) = \sum_{x\in\mathcal{X},\, y\in\mathcal{Y}} p(x,y)\log\left(\frac{p(x,y)}{p(x)\,p(y)}\right),$$
where the sum is taken over all possible input and output pairs $x\in\mathcal{X}$, $y\in\mathcal{Y}$, with $\mathcal{X}$ and $\mathcal{Y}$ the input and output alphabets.
• Mutual information can also be written in terms of the entropy in the channel output $y$ and conditional output $y|x$ as
$$I(X;Y) = H(Y) - H(Y|X),$$
where $H(Y) = -\sum_y p(y)\log p(y)$ and $H(Y|X) = -\sum_{x,y} p(x,y)\log p(y|x)$.
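As a quick sanity check on this definition, here is a minimal Python sketch (nothing beyond NumPy; the binary symmetric channel and its crossover probability are illustrative choices, not from the slides) that evaluates $I(X;Y)$ directly from the sum above:

```python
import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )."""
    p_xy = p_x[:, None] * p_y_given_x          # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                     # output marginal p(y)
    mask = p_xy > 0                            # restrict the sum to the support
    ratio = p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]
    return float(np.sum(p_xy[mask] * np.log2(ratio)))

# Illustrative binary symmetric channel with crossover probability 0.1
p = 0.1
bsc = np.array([[1 - p, p], [p, 1 - p]])
print(mutual_information(np.array([0.5, 0.5]), bsc))  # ≈ 1 - H_b(0.1) ≈ 0.531 bits
```

For the uniform input the result matches the familiar BSC capacity $1 - H_b(p)$, which is one concrete instance of the maximization in (4.3).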
The AWGN Channel
• The most important continuous alphabet channel is the Gaussian channel depicted in the following figure. This is a time-discrete channel with output $Y_i$ at time $i$, where $Y_i$ is the sum of the input $X_i$ and the noise $Z_i$.
• The noise $Z_i$ is drawn i.i.d. from a Gaussian distribution with variance $N$. Thus
$$Y_i = X_i + Z_i, \qquad Z_i \sim \mathcal{N}(0, N).$$
• The noise $Z_i$ is assumed to be independent of the signal $X_i$.
• If the noise variance is zero or the input is unconstrained, the capacity of the channel is infinite.
[Figure: A Gaussian channel]
The AWGN Channel
• Assume an average power constraint: for any codeword $(x_1, x_2, \ldots, x_n)$ transmitted over the channel, we require that
$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 \le P.$$
• Assume that we want to send 1 bit over the channel in one use of the channel.
• Given the power constraint, the best that we can do is to send one of two levels, $+\sqrt{P}$ or $-\sqrt{P}$. The receiver looks at the corresponding received $Y$ and tries to decide which of the two levels was sent.
• Assuming that both levels are equally likely (this would be the case if we wish to send exactly 1 bit of information), the optimum decoding rule is to decide that $\sqrt{P}$ was sent if $Y > 0$ and that $-\sqrt{P}$ was sent if $Y \le 0$.
Probability of Error in AWGN Channels
• The probability of error with such a decoding scheme is
$$P_e = \tfrac{1}{2}\,\mathbb{P}\big[Y < 0 \mid X = \sqrt{P}\,\big] + \tfrac{1}{2}\,\mathbb{P}\big[Y > 0 \mid X = -\sqrt{P}\,\big]$$
$$= \tfrac{1}{2}\,\mathbb{P}\big[Z < -\sqrt{P} \mid X = \sqrt{P}\,\big] + \tfrac{1}{2}\,\mathbb{P}\big[Z > \sqrt{P} \mid X = -\sqrt{P}\,\big]$$
$$= \mathbb{P}\big[Z > \sqrt{P}\,\big] = 1 - \Phi\big(\sqrt{P/N}\big),$$
where $\Phi(x)$ is the cumulative normal function
$$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt = 1 - Q(x).$$
• Using such a scheme, we have converted the Gaussian channel into a discrete binary symmetric channel with crossover probability $P_e$.
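A short numerical check of this error probability, as a hedged Python sketch (the sample values of $P$ and $N$ are illustrative): it compares the closed form $1 - \Phi(\sqrt{P/N}) = Q(\sqrt{P/N})$ against a Monte Carlo simulation of the two-level scheme.

```python
import math
import numpy as np

def Q(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

P, N = 1.0, 0.5                                         # illustrative signal power, noise variance
pe_exact = Q(math.sqrt(P / N))

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=1_000_000)
x = np.where(bits == 1, math.sqrt(P), -math.sqrt(P))    # two-level signaling
y = x + rng.normal(0.0, math.sqrt(N), size=x.size)      # AWGN with variance N
pe_sim = np.mean((y > 0) != (bits == 1))                # sign detector

print(pe_exact, pe_sim)                                 # both ≈ Q(sqrt(2)) ≈ 0.0786
```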
Differential Entropy
• Let $X$ now be a continuous r.v. with cumulative distribution $F(x) = \mathbb{P}[X \le x]$, and let $f(x) = \frac{d}{dx}F(x)$ be the density function.
• Let $S = \{x : f(x) > 0\}$ be the support set.
• Definition of differential entropy $h(X)$:
$$h(X) = -\int_{S} f(x)\log f(x)\, dx.$$
Since we integrate over only the support set, there is no worry about $\log 0$.
Differential Entropy of Gaussian Distribution
• If $X \sim \mathcal{N}(0, \sigma^2)$, let’s compute $h(X)$ in nats:
$$h(X) = -\int f(x)\ln f(x)\, dx = \mathbb{E}\!\left[\frac{X^2}{2\sigma^2}\right] + \frac{1}{2}\ln(2\pi\sigma^2) = \frac{1}{2}\ln\!\left(2\pi e\,\sigma^2\right) \text{ nats} = \frac{1}{2}\log_2\!\left(2\pi e\,\sigma^2\right) \text{ bits}.$$
• Note: $h(X)$ is only a function of the variance $\sigma^2$, not the mean. Why?
• So the entropy of a Gaussian is monotonically related to the variance.
Gaussian Distribution Maximizes Differential Entropy
• Definition: the relative entropy (or Kullback–Leibler distance) $D(g\|\phi)$ between two densities $g$ and $\phi$ is defined by
$$D(g\|\phi) = \int g(x)\log\left(\frac{g(x)}{\phi(x)}\right) dx \ge 0. \quad (\text{Why?})$$
• Theorem: let random variable $X$ have zero mean and variance $\sigma^2$. Then $h(X) \le \frac{1}{2}\log_2(2\pi e\,\sigma^2)$, with equality iff $X \sim \mathcal{N}(0, \sigma^2)$.
Proof: let $g(x)$ be any density satisfying $\int x^2 g(x)\,dx = \sigma^2$, and let $\phi(x)$ be the density of a Gaussian random variable with zero mean and variance $\sigma^2$. Note that $\log\phi(x)$ is a quadratic form: $\log\phi(x) = -\frac{x^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)$. Then
$$0 \le D(g\|\phi) = \int g(x)\log\left(\frac{g(x)}{\phi(x)}\right) dx = -h(g) - \int g(x)\log\phi(x)\,dx = -h(g) - \int \phi(x)\log\phi(x)\,dx \;(\text{Why?}) = -h(g) + h(\phi).$$
• Therefore, the Gaussian distribution maximizes the entropy over all distributions with the same variance.
Capacity of Gaussian Channels
• Definition: the information capacity of the Gaussian channel with power constraint $P$ is
$$C = \max_{f(x):\,\mathbb{E}[X^2] \le P} I(X;Y),$$
where $I(X;Y)$ is the mutual information between $X$ and $Y$.
• We can calculate the information capacity as follows. Expanding $I(X;Y)$, we have
$$I(X;Y) = h(Y) - h(Y|X) = h(Y) - h(X+Z|X) = h(Y) - h(Z|X) = h(Y) - h(Z),$$
since $Z$ is independent of $X$. Now $h(Z) = \frac{1}{2}\log 2\pi e N$. Also,
$$\mathbb{E}[Y^2] = \mathbb{E}[(X+Z)^2] = \mathbb{E}[X^2] + 2\,\mathbb{E}[X]\,\mathbb{E}[Z] + \mathbb{E}[Z^2] = P + N,$$
since $X$ and $Z$ are independent and $\mathbb{E}[Z] = 0$.
Capacity in AWGN
• Given $\mathbb{E}[Y^2] = P + N$, the entropy of $Y$ is bounded by $\frac{1}{2}\log 2\pi e(P+N)$ (the Gaussian distribution maximizes the entropy for a given variance).
• Applying this result to bound the mutual information, we obtain
$$I(X;Y) = h(Y) - h(Z) \le \frac{1}{2}\log 2\pi e(P+N) - \frac{1}{2}\log 2\pi e N = \frac{1}{2}\log\left(1 + \frac{P}{N}\right).$$
• Hence, the information capacity of the Gaussian channel is
$$C = \max_{\mathbb{E}[X^2] \le P} I(X;Y) = \frac{1}{2}\log\left(1 + \frac{P}{N}\right),$$
and the maximum is attained when $X \sim \mathcal{N}(0, P)$.
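As a numeric sanity check, here is a minimal Python sketch (the sample powers are illustrative): it estimates $h(Y)$ and $h(Z)$ from simulated data with a simple histogram entropy estimator and confirms that $I(X;Y) = h(Y) - h(Z)$ matches $\frac{1}{2}\log_2(1+P/N)$ for a Gaussian input.

```python
import numpy as np

def diff_entropy_bits(samples, bins=2000):
    """Histogram estimate of differential entropy h = -integral f log2 f dx."""
    hist, edges = np.histogram(samples, bins=bins, density=True)
    dx = edges[1] - edges[0]
    p = hist[hist > 0]
    return float(-np.sum(p * np.log2(p) * dx))

rng = np.random.default_rng(1)
P, N = 4.0, 1.0                          # illustrative signal and noise powers
X = rng.normal(0, np.sqrt(P), 2_000_000)
Z = rng.normal(0, np.sqrt(N), 2_000_000)
Y = X + Z

hY = diff_entropy_bits(Y)                # ≈ 0.5*log2(2*pi*e*(P+N))
hZ = diff_entropy_bits(Z)                # ≈ 0.5*log2(2*pi*e*N)
print(hY - hZ, 0.5 * np.log2(1 + P / N)) # both ≈ 1.161 bits per channel use
```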
Bandlimited Channels
• The output of a channel can be described as the convolution
$$Y(t) = [X(t) + Z(t)] * c(t),$$
where $X(t)$ is the signal waveform, $Z(t)$ is the waveform of the white Gaussian noise, and $c(t)$ is the impulse response of an ideal bandpass filter, which cuts out all frequencies greater than $B$.
• Nyquist–Shannon sampling theorem: suppose that a function $f(t)$ is bandlimited to $B$; namely, the spectrum of the function is 0 for all frequencies greater than $B$. Then the function is completely determined by samples of the function spaced $\frac{1}{2B}$ seconds apart.
Proof: let $F(\omega)$ be the Fourier transform of $f(t)$. Then
$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega = \frac{1}{2\pi}\int_{-2\pi B}^{2\pi B} F(\omega)\, e^{i\omega t}\, d\omega,$$
Bandlimited Channels
since $F(\omega)$ is zero outside the band $-2\pi B \le \omega \le 2\pi B$. If we consider samples spaced $\frac{1}{2B}$ seconds apart, the value of the signal at the sample points can be written
$$f\!\left(\frac{n}{2B}\right) = \frac{1}{2\pi}\int_{-2\pi B}^{2\pi B} F(\omega)\, e^{i\omega n/2B}\, d\omega.$$
• The right-hand side of this equation is also the definition of the coefficients of the Fourier series expansion of the periodic extension of the function $F(\omega)$, taking the interval $-2\pi B$ to $2\pi B$ as the fundamental period.
• The sample values $f(\frac{n}{2B})$ determine the Fourier coefficients and, by extension, they determine the value of $F(\omega)$ in the interval $(-2\pi B, 2\pi B)$.
• Consider the function
$$\mathrm{sinc}(t) = \frac{\sin(2\pi B t)}{2\pi B t}.$$
(This function is 1 at $t = 0$ and is 0 for $t = n/2B$, $n \ne 0$.)
Bandlimited Channels
• The spectrum of this function is constant in the band $(-B, B)$ and is zero outside this band.
• Now define
$$g(t) = \sum_{n=-\infty}^{\infty} f\!\left(\frac{n}{2B}\right) \mathrm{sinc}\!\left(t - \frac{n}{2B}\right).$$
• From the properties of the sinc function, it follows that $g(t)$ is bandlimited to $B$ and is equal to $f(n/2B)$ at $t = n/2B$. Since there is only one function satisfying these constraints, we must have $g(t) = f(t)$. This provides an explicit representation of $f(t)$ in terms of its samples.
• What does this theorem tell us? A general function has an infinite number of degrees of freedom: the value of the function at every point can be chosen independently. The Nyquist–Shannon sampling theorem shows that a bandlimited function has only $2B$ degrees of freedom per second. The values of the function at the sample points can be chosen independently, and this specifies the entire function.
Bandlimited Channels
• If a function is bandlimited, it cannot be limited in time. But we can consider functions that have most of their energy in bandwidth $B$ and most of their energy in a finite time interval, say $(0, T)$.
• Now let’s look back at the problem of communication over a bandlimited channel. Assuming that the channel has bandwidth $B$, we can represent both the input and the output by samples taken $\frac{1}{2B}$ seconds apart.
• If the noise has power spectral density $N_0/2$ watts/hertz and bandwidth $B$ hertz, the noise has power $\frac{N_0}{2}\cdot 2B = N_0 B$, and each of the $2BT$ noise samples in time $T$ has variance $N_0 B T/(2BT) = N_0/2$.
• Looking at the input as a vector in the $2TB$-dimensional space, we see that the received signal is spherically normally distributed about this point with covariance $\frac{N_0}{2}\mathbf{I}$.
Capacity of Bandlimited Channels
• Let the channel be used over the time interval $[0, T]$. In this case, the energy per sample is $PT/(2BT) = P/(2B)$, the noise variance per sample is $N_0/2$, and hence the capacity per sample is
$$C = \frac{1}{2}\log_2\left(1 + \frac{P/2B}{N_0/2}\right) = \frac{1}{2}\log_2\left(1 + \frac{P}{N_0 B}\right) \text{ (bits per sample)}.$$
• Since there are $2B$ samples each second, the capacity of the channel can be rewritten as
$$C = B\log_2\left(1 + \frac{P}{N_0 B}\right) \text{ (bits/s, bps)}.$$
(This equation is one of the most famous formulas of information theory. It gives the capacity of a bandlimited Gaussian channel with noise spectral density $N_0/2$ watts/Hz and power $P$ watts.)
• If we let $B \to \infty$ in the above capacity formula, we obtain
$$C = \frac{P}{N_0}\log_2 e.$$
(For infinite-bandwidth channels, the capacity grows linearly with the power.)
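The saturation of capacity with bandwidth can be seen numerically; a minimal Python sketch (the power and noise density values are illustrative, not from the slides):

```python
import numpy as np

P, N0 = 1e-2, 1e-9            # illustrative: 10 mW transmit power, 1e-9 W/Hz noise density

def capacity_bps(B):
    """Bandlimited AWGN capacity C = B log2(1 + P/(N0 B))."""
    return B * np.log2(1.0 + P / (N0 * B))

for B in [3e3, 3e4, 3e5, 3e6, 3e9]:
    print(f"B = {B:10.0f} Hz -> C = {capacity_bps(B)/1e6:8.3f} Mbps")

# Capacity approaches (P/N0) log2(e) ≈ 14.43 Mbps as B grows without bound
print("B -> infinity limit:", (P / N0) * np.log2(np.e) / 1e6, "Mbps")
```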
Capacity of Flat-Fading Channels
• Assume a discrete-time channel with stationary and ergodic time-varying gain $\sqrt{g[i]}$, $0 \le g[i]$, and AWGN $n[i]$, as shown in Figure 4.1; the channel output is $y[i] = \sqrt{g[i]}\,x[i] + n[i]$.
• The channel power gain $g[i]$ follows a given distribution $p(g)$; e.g., for Rayleigh fading $p(g)$ is exponential. The channel gain $g[i]$ can change at each time $i$, either as an i.i.d. process or with some correlation over time.
Capacity of Flat-Fading Channels
• In a block-fading channel, $g[i]$ is constant over some block length $T$, after which time $g[i]$ changes to a new independent value based on the distribution $p(g)$.
• Let $\bar{P}$ denote the average transmit signal power, $N_0/2$ the noise power spectral density of $n[i]$, and $B$ the received signal bandwidth. The instantaneous received signal-to-noise ratio (SNR) is then $\gamma[i] = \bar{P}\,g[i]/(N_0 B)$, and its expected value over all time is $\bar{\gamma} = \bar{P}\,\mathbb{E}[g]/(N_0 B)$.
• The channel gain $g[i]$, also called the channel side (state) information (CSI), changes during the transmission of the codeword.
• The capacity of this channel depends on what is known about $g[i]$ at the transmitter and receiver. We will consider three different scenarios regarding this knowledge: channel distribution information (CDI), receiver CSI, and transmitter and receiver CSI.
Capacity of Flat-Fading Channels
• First consider the case where the channel gain distribution $p(g)$ or, equivalently, the distribution of SNR $p(\gamma)$ is known to the transmitter and receiver.
• For i.i.d. fading the capacity is given by (4.3), but solving for the capacity-achieving input distribution, i.e., the distribution achieving the maximum in (4.3), can be quite complicated depending on the fading distribution.
• For these reasons, finding the capacity-achieving input distribution and corresponding capacity of fading channels under CDI remains an open problem for almost all channel distributions.
• Now consider the case where the CSI $g[i]$ is known at the receiver at time $i$. Equivalently, $\gamma[i]$ is known at the receiver at time $i$.
• Also assume that both the transmitter and receiver know the distribution of $g[i]$. In this case, there are two channel capacity definitions that are relevant to system design: Shannon capacity, also called ergodic capacity, and capacity with outage.
Capacity of Flat-Fading Channels
• As for the AWGN channel, Shannon capacity defines the maximum data rate that can be sent over the channel with asymptotically small error probability.
• Note that for Shannon capacity the rate transmitted over the channel is constant: the transmitter cannot adapt its transmission strategy relative to the CSI.
• Capacity with outage is defined as the maximum rate that can be transmitted over a channel with some outage probability, corresponding to the probability that the transmission cannot be decoded with negligible error probability.
• The basic premise of capacity with outage is that a high data rate can be sent over the channel and decoded correctly except when the channel is in a deep fade.
• By allowing the system to lose some data in the event of deep fades, a higher data rate can be maintained than if all data must be received correctly regardless of the fading state.
Capacity of Flat-Fading Channels
• The Shannon capacity of a fading channel with receiver CSI for an average power constraint $\bar{P}$ can be obtained as
$$C = \int_0^{\infty} B\log_2(1+\gamma)\, p(\gamma)\, d\gamma. \qquad (4.4)$$
• By Jensen’s inequality,
$$\mathbb{E}[B\log_2(1+\gamma)] \le B\log_2(1+\mathbb{E}[\gamma]) = B\log_2(1+\bar{\gamma}),$$
where $\bar{\gamma} = \mathbb{E}[\gamma]$ is the average SNR on the channel.
• Here we see that the Shannon capacity of a fading channel with receiver CSI only is less than the Shannon capacity of an AWGN channel with the same average SNR.
• In other words, fading reduces Shannon capacity when only the receiver has CSI.
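To see Jensen’s inequality at work, here is a minimal Python sketch (the bandwidth and average SNR are illustrative choices, not from the slides) that Monte Carlo-estimates (4.4) for Rayleigh fading, where the SNR $\gamma$ is exponentially distributed, and compares against the AWGN capacity at the same average SNR:

```python
import numpy as np

B = 30e3                     # illustrative bandwidth, Hz
avg_snr = 10.0               # illustrative average SNR (linear)

rng = np.random.default_rng(2)
gamma = rng.exponential(avg_snr, size=1_000_000)   # Rayleigh fading -> exponential SNR

c_fading = np.mean(B * np.log2(1.0 + gamma))       # eq. (4.4), Monte Carlo estimate
c_awgn = B * np.log2(1.0 + avg_snr)                # AWGN at the same average SNR

print(c_fading / 1e3, "kbps <", c_awgn / 1e3, "kbps")   # Jensen's inequality in action
```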
Capacity of Flat-Fading Channels
• Example 4.2: consider a flat-fading channel with i.i.d. channel gain $g[i]$, which can take on three possible values $g_1$, $g_2$, $g_3$ with probabilities $p_1$, $p_2$, $p_3$. The transmit power is 10 mW, the noise spectral density is $N_0$ W/Hz, and the channel bandwidth is 30 kHz. Assume the receiver has knowledge of the instantaneous value of $g[i]$ but the transmitter does not. Find the Shannon capacity of this channel and compare with the capacity of an AWGN channel with the same average SNR.
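The example’s numeric gain values and probabilities were lost in extraction, so the sketch below uses hypothetical stand-in SNR states and probabilities purely to illustrate the computation; only the 30 kHz bandwidth comes from the example:

```python
import numpy as np

B = 30e3                                   # bandwidth from the example, Hz
# Hypothetical three-state SNR distribution (stand-in values, not the slide's)
gammas = np.array([0.83, 83.3, 333.3])
probs = np.array([0.1, 0.5, 0.4])

c_csir = B * np.sum(probs * np.log2(1.0 + gammas))   # eq. (4.4), discrete form
gamma_bar = np.sum(probs * gammas)
c_awgn = B * np.log2(1.0 + gamma_bar)                # AWGN at the same average SNR

print(f"Receiver-CSI capacity:            {c_csir/1e3:7.1f} kbps")
print(f"AWGN capacity at same average SNR: {c_awgn/1e3:6.1f} kbps")
```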
Capacity of Flat-Fading Channels
• Capacity with outage applies to slowly varying channels, where the instantaneous SNR $\gamma$ is constant over a large number of transmissions (a transmission burst) and then changes to a new value based on the fading distribution.
• If the channel has received SNR $\gamma$ during a burst, then data can be sent over the channel at rate $B\log_2(1+\gamma)$ with negligible probability of error.
• Capacity with outage allows bits sent over a given transmission burst to be decoded at the end of the burst with some probability that these bits will be decoded incorrectly.
• Specifically, the transmitter fixes a minimum received SNR $\gamma_{\min}$ and encodes for a data rate $C = B\log_2(1+\gamma_{\min})$.
• The data is correctly received if the instantaneous received SNR is greater than or equal to $\gamma_{\min}$.
Capacity of Flat-Fading Channels
• If the received SNR is below $\gamma_{\min}$ then the receiver declares an outage. The probability of outage is thus $P_{\mathrm{out}} = \mathbb{P}[\gamma < \gamma_{\min}]$.
• The average rate correctly received over many transmission bursts is $C_o = (1 - P_{\mathrm{out}})\, B\log_2(1+\gamma_{\min})$, since data is only correctly received on $1 - P_{\mathrm{out}}$ of the transmissions.
• Capacity with outage is typically characterized by a plot of capacity versus outage, as shown in Figure 4.2, where we plot the normalized capacity $C/B = \log_2(1+\gamma_{\min})$ as a function of outage probability $P_{\mathrm{out}} = \mathbb{P}[\gamma < \gamma_{\min}]$ for a Rayleigh fading channel with average SNR $\bar{\gamma}$.
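A hedged Python sketch of such a curve (the average SNR is an illustrative stand-in, since the figure’s value was lost in extraction): for Rayleigh fading the SNR is exponential, so $\gamma_{\min}$ follows from the inverse CDF.

```python
import numpy as np

avg_snr = 100.0      # illustrative average SNR (20 dB); the figure's value was lost

p_out = np.linspace(0.01, 0.5, 6)
# For Rayleigh fading: P[gamma < g_min] = 1 - exp(-g_min/avg_snr)
gamma_min = -avg_snr * np.log(1.0 - p_out)

c_norm = (1.0 - p_out) * np.log2(1.0 + gamma_min)   # correctly received rate, bps/Hz
for po, c in zip(p_out, c_norm):
    print(f"P_out = {po:.2f} -> C/B = {c:.2f} bps/Hz")
```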
Capacity of Flat-Fading Channels
• When both the transmitter and receiver have CSI, the transmitter can adapt its transmission strategy relative to this CSI, as shown in Figure 4.3.
• In this case, there is no notion of capacity versus outage where the transmitter sends bits that cannot be decoded, since the transmitter knows the channel and thus will not send bits unless they can be decoded correctly.
Capacity of Flat-Fading Channels
• Now consider the Shannon capacity when the channel power gain $g[i]$ is known to both the transmitter and receiver at time $i$.
• Let $s[i]$ be a stationary and ergodic stochastic process representing the channel state, which takes values on a finite set $\mathcal{S}$ of discrete memoryless channels.
• Let $C_s$ denote the capacity of a particular channel $s\in\mathcal{S}$, and $p(s)$ denote the probability, or fraction of time, that the channel is in state $s$. The capacity of this time-varying channel is then given by
$$C = \sum_{s\in\mathcal{S}} C_s\, p(s). \qquad (4.6)$$
• The capacity of an AWGN channel with average received SNR $\gamma$ is $C_\gamma = B\log_2(1+\gamma)$.
Capacity of Flat-Fading Channels
• From (4.6), the capacity of the fading channel with transmitter and receiver side information is
$$C = \int_0^{\infty} B\log_2(1+\gamma)\, p(\gamma)\, d\gamma,$$
the same as (4.4) with receiver CSI only.
• Let us now allow the transmit power $P(\gamma)$ to vary with $\gamma$, subject to an average power constraint $\bar{P}$:
$$\int_0^{\infty} P(\gamma)\, p(\gamma)\, d\gamma \le \bar{P}. \qquad (4.8)$$
• Define the fading channel capacity with average power constraint as
$$C = \max_{P(\gamma):\,\int P(\gamma)p(\gamma)\,d\gamma = \bar{P}} \int_0^{\infty} B\log_2\left(1 + \frac{\gamma P(\gamma)}{\bar{P}}\right) p(\gamma)\, d\gamma. \qquad (4.9)$$
(Can this capacity be achieved?)
Capacity of Flat-Fading Channels
• Figure 4.4 shows the main idea of how to achieve the capacity in (4.9).
• To find the optimal power allocation $P(\gamma)$, we form the Lagrangian
$$J(P(\gamma)) = \int_0^{\infty} B\log_2\left(1 + \frac{\gamma P(\gamma)}{\bar{P}}\right) p(\gamma)\, d\gamma - \lambda \int_0^{\infty} P(\gamma)\, p(\gamma)\, d\gamma.$$
• Next we differentiate the Lagrangian and set the derivative equal to zero:
$$\frac{\partial J(P(\gamma))}{\partial P(\gamma)} = \left[\left(\frac{B\gamma/\bar{P}}{1 + \gamma P(\gamma)/\bar{P}}\right)\frac{1}{\ln 2} - \lambda\right] p(\gamma) = 0.$$
• Solving for $P(\gamma)$ with the constraint that $P(\gamma) \ge 0$ yields the optimal power adaptation that maximizes (4.9) as
$$\frac{P(\gamma)}{\bar{P}} = \begin{cases} \dfrac{1}{\gamma_0} - \dfrac{1}{\gamma}, & \gamma \ge \gamma_0 \\[4pt] 0, & \gamma < \gamma_0 \end{cases} \qquad (4.12)$$
for some “cutoff” value $\gamma_0$.
• If $\gamma[i]$ is below this cutoff then no data is transmitted over the $i$th time interval, so the channel is only used at time $i$ if $\gamma[i] \ge \gamma_0$.
• Substituting (4.12) into (4.9) then yields the capacity formula
$$C = \int_{\gamma_0}^{\infty} B\log_2\left(\frac{\gamma}{\gamma_0}\right) p(\gamma)\, d\gamma. \qquad (4.13)$$
Capacity of Flat-Fading Channels
• The multiplexing nature of the capacity-achieving coding strategy indicates that (4.13) is achieved with a time-varying data rate, where the rate corresponding to instantaneous SNR $\gamma$ is $B\log_2(\gamma/\gamma_0)$.
• Note that the optimal power allocation policy (4.12) only depends on the fading distribution $p(\gamma)$ through the cutoff value $\gamma_0$. This cutoff value is found from the power constraint.
• Rearranging the power constraint and replacing the inequality with equality (since using the maximum available power will always be optimal) yields $\int_0^{\infty} P(\gamma)\, p(\gamma)\, d\gamma = \bar{P}$. Using the optimal power allocation (4.12), we have
$$\int_{\gamma_0}^{\infty} \left(\frac{1}{\gamma_0} - \frac{1}{\gamma}\right) p(\gamma)\, d\gamma = 1. \qquad (4.15)$$
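Since (4.15) rarely has a closed-form solution, $\gamma_0$ is found numerically. A minimal Python sketch (the exponential SNR distribution and its mean are illustrative assumptions; `power_integral` is a hypothetical helper name) solves (4.15) by bisection over Monte Carlo samples and then evaluates the capacity (4.13):

```python
import numpy as np

rng = np.random.default_rng(3)
gamma = rng.exponential(10.0, size=500_000)   # illustrative Rayleigh-fading SNRs, mean 10

def power_integral(g0):
    """Left-hand side of (4.15), estimated by Monte Carlo."""
    used = gamma[gamma >= g0]
    return np.sum(1.0 / g0 - 1.0 / used) / gamma.size

# Bisection: the integral decreases monotonically in gamma_0
lo, hi = 1e-6, 20.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if power_integral(mid) > 1.0 else (lo, mid)
gamma0 = 0.5 * (lo + hi)

# Capacity (4.13), normalized by bandwidth
c_norm = np.mean(np.where(gamma >= gamma0, np.log2(gamma / gamma0), 0.0))
print(f"cutoff gamma_0 ≈ {gamma0:.3f}, C/B ≈ {c_norm:.3f} bps/Hz")
```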
Capacity of Flat-Fading Channels
• Note that this expression only depends on the distribution $p(\gamma)$. The value of $\gamma_0$ cannot be solved for in closed form for typical continuous pdfs $p(\gamma)$ and thus must be found numerically.
• Since $\gamma$ is time-varying, the maximizing power adaptation policy of (4.12) is a “water-filling” formula in time, as illustrated in Figure 4.5. The water-filling terminology refers to the fact that the line $1/\gamma$ sketches out the bottom of a bowl, and power is poured into the bowl to a constant water level of $1/\gamma_0$.
[Figure 4.5: water-filling power adaptation, plotting $P(\gamma)/\bar{P}$ against the curve $1/\gamma$ with water level $1/\gamma_0$.]
Capacity of Flat-Fading Channels
• Example 4.4: assume the same channel as in the previous example, with a bandwidth of 30 kHz and the same three possible received SNRs and probabilities. Find the ergodic capacity of this channel assuming both transmitter and receiver have instantaneous CSI.
Solution: we know the optimal power allocation is water-filling, and we need to find the cutoff value $\gamma_0$ that satisfies the discrete version of (4.15), given by
$$\sum_{i} \left(\frac{1}{\gamma_0} - \frac{1}{\gamma_i}\right) p_i = 1. \qquad (4.17)$$
We first assume that all channel states are used to obtain $\gamma_0$, i.e., assume $\gamma_0 \le \min_i \gamma_i$, and see if the resulting cutoff value is below that of the weakest channel. If not, then we have an inconsistency and must redo the calculation assuming at least one of the channel states is not used. Applying (4.17) to our channel model yields the cutoff and capacity, as sketched numerically below.
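A hedged Python sketch of this procedure, reusing the same hypothetical three-state channel as before (the slide’s numbers were lost in extraction); it solves (4.17) in closed form for $1/\gamma_0$, dropping the weakest state whenever the cutoff is inconsistent:

```python
import numpy as np

B = 30e3
# Hypothetical three-state channel (stand-in values, consistent with the earlier sketch)
gammas = np.array([0.83, 83.3, 333.3])
probs = np.array([0.1, 0.5, 0.4])

# Try using all states, then drop the weakest until consistent with eq. (4.17)
order = np.argsort(gammas)[::-1]               # strongest SNR first
g, p = gammas[order], probs[order]
for k in range(len(g), 0, -1):
    gk, pk = g[:k], p[:k]
    # Solve sum_i (1/g0 - 1/gamma_i) p_i = 1 for 1/g0
    inv_g0 = (1.0 + np.sum(pk / gk)) / np.sum(pk)
    g0 = 1.0 / inv_g0
    if g0 <= gk.min():                         # consistent: all k used states exceed cutoff
        break

c = B * np.sum(pk * np.log2(gk / g0))          # eq. (4.13), discrete form
print(f"gamma_0 = {g0:.3f}, ergodic capacity = {c/1e3:.1f} kbps")
```

With these stand-in numbers the first pass gives a cutoff above the weakest state, so that state is excluded and the calculation is redone with the two stronger states, mirroring the consistency check described above.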
Capacity of Flat-Fading Channels
• Comparing with the results of the previous example, we see that this rate is only slightly higher than for the case of receiver CSI only, and is still significantly below that of an AWGN channel with the same average SNR. That is because the average SNR for this channel is relatively high: for low-SNR channels, capacity in flat fading can exceed that of the AWGN channel with the same SNR by taking advantage of the rare times when the channel is in a very good state.
• Zero-outage capacity and channel inversion: now consider a suboptimal transmitter adaptation scheme where the transmitter uses the CSI to maintain a constant received power, i.e., it inverts the channel fading.
• The channel then appears to the encoder and decoder as a time-invariant AWGN channel. This power adaptation, called channel inversion, is given by $P(\gamma)/\bar{P} = \sigma/\gamma$, where $\sigma$ equals the constant received SNR that can be maintained under the transmit power constraint (4.8).
Capacity of Flat-Fading Channels
• The constant $\sigma$ thus satisfies $\int (\sigma/\gamma)\, p(\gamma)\, d\gamma = 1$, so $\sigma = 1/\mathbb{E}[1/\gamma]$.
• Fading channel capacity with channel inversion is just the capacity of an AWGN channel with SNR $\sigma$:
$$C = B\log_2(1+\sigma) = B\log_2\left(1 + \frac{1}{\mathbb{E}[1/\gamma]}\right). \qquad (4.18)$$
• The capacity-achieving transmission strategy for this capacity uses a fixed-rate encoder and decoder designed for an AWGN channel with SNR $\sigma$. This has the advantage of maintaining a fixed data rate over the channel regardless of channel conditions.
• For this reason the channel capacity given in (4.18) is called zero-outage capacity.
• Zero-outage capacity can exhibit a large data rate reduction relative to Shannon capacity in extreme fading environments. For example, in Rayleigh fading $\mathbb{E}[1/\gamma]$ is infinite, and thus the zero-outage capacity given by (4.18) is zero.
Capacity of Flat-Fading Channels
• Example 4.5: assume the same channel as in the previous example, with a bandwidth of 30 kHz and the same three possible received SNRs and probabilities. Assuming transmitter and receiver CSI, find the zero-outage capacity of this channel. Here $\sigma = 1/\mathbb{E}[1/\gamma]$ with $\mathbb{E}[1/\gamma] = \sum_i p_i/\gamma_i$, and the zero-outage capacity follows from (4.18); see the sketch below.
• The outage capacity is defined as the maximum data rate that can be maintained in all non-outage channel states times the probability of non-outage.
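A minimal Python sketch of the zero-outage computation, again on the hypothetical three-state channel used in the earlier sketches (stand-in values, not the slide’s):

```python
import numpy as np

B = 30e3
# Hypothetical three-state channel (stand-in values, consistent with earlier sketches)
gammas = np.array([0.83, 83.3, 333.3])
probs = np.array([0.1, 0.5, 0.4])

e_inv = np.sum(probs / gammas)               # E[1/gamma]
sigma = 1.0 / e_inv                          # constant received SNR under inversion
c_zero_outage = B * np.log2(1.0 + sigma)     # eq. (4.18)
print(f"sigma = {sigma:.2f}, zero-outage capacity = {c_zero_outage/1e3:.1f} kbps")
```

Note how the weak state dominates $\mathbb{E}[1/\gamma]$ and drags the zero-outage capacity well below the ergodic capacity, which is exactly the effect the truncated inversion policy below is designed to mitigate.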
Capacity of Flat-Fading Channels
• Outage capacity is achieved with a truncated channel inversion policy for power adaptation that only compensates for fading above a certain cutoff fade depth $\gamma_0$:
$$\frac{P(\gamma)}{\bar{P}} = \begin{cases} \dfrac{\sigma}{\gamma}, & \gamma \ge \gamma_0 \\[4pt] 0, & \gamma < \gamma_0. \end{cases}$$
• Since the channel is only used when $\gamma \ge \gamma_0$, the power constraint (4.8) yields $\sigma = 1/\mathbb{E}_{\gamma_0}[1/\gamma]$, where
$$\mathbb{E}_{\gamma_0}[1/\gamma] \triangleq \int_{\gamma_0}^{\infty} \frac{1}{\gamma}\, p(\gamma)\, d\gamma.$$
• The outage capacity associated with a given outage probability $P_{\mathrm{out}}$ and corresponding cutoff $\gamma_0$ is given by
$$C(P_{\mathrm{out}}) = B\log_2\left(1 + \frac{1}{\mathbb{E}_{\gamma_0}[1/\gamma]}\right)\mathbb{P}[\gamma \ge \gamma_0].$$
Capacity of Flat-Fading Channels
• We can also obtain the maximum outage capacity by maximizing outage capacity over all possible $\gamma_0$:
$$C = \max_{\gamma_0} B\log_2\left(1 + \frac{1}{\mathbb{E}_{\gamma_0}[1/\gamma]}\right)\mathbb{P}[\gamma \ge \gamma_0].$$
• This maximum outage capacity will still be less than the Shannon capacity (4.13), since truncated channel inversion is a suboptimal transmission strategy.
• Example 4.6: assume the same channel as in the previous example, with a bandwidth of 30 kHz and the same three possible received SNRs and probabilities. Find the outage capacity of this channel and the associated outage probabilities for cutoff values $\gamma_0$ between the channel states. Which of these cutoff values yields a larger outage capacity? (See the sketch below.)
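A hedged Python sketch of this sweep over cutoff values, once more on the hypothetical three-state channel used in the earlier sketches (stand-in values, not the slide’s):

```python
import numpy as np

B = 30e3
# Hypothetical three-state channel, sorted ascending (stand-in values)
gammas = np.array([0.83, 83.3, 333.3])
probs = np.array([0.1, 0.5, 0.4])

# Sweep cutoffs that exclude 0, 1, or 2 of the weakest states
for k in range(len(gammas)):
    used = slice(k, None)
    p_nonout = probs[used].sum()                      # P[gamma >= gamma_0]
    e_inv_trunc = np.sum(probs[used] / gammas[used])  # E_{gamma_0}[1/gamma]
    c_out = B * np.log2(1.0 + 1.0 / e_inv_trunc) * p_nonout
    print(f"states used: {3-k}, P_out = {1-p_nonout:.2f}, C = {c_out/1e3:.1f} kbps")
```

The cutoff with the largest printed rate is the maximum outage capacity for this (hypothetical) channel.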
II. Capacity of MIMO Channels
Circular Complex Gaussian Vectors
• A complex random vector is of the form $\mathbf{x} = \mathbf{x}_R + j\mathbf{x}_I$, where $\mathbf{x}_R$ and $\mathbf{x}_I$ are real random vectors.
• Complex Gaussian random vectors are ones in which $[\mathbf{x}_R, \mathbf{x}_I]^t$ is a real Gaussian random vector.
• The distribution is completely specified by the mean and covariance matrix of the real vector $[\mathbf{x}_R, \mathbf{x}_I]^t$.
• Define the mean $\boldsymbol{\mu} = \mathbb{E}[\mathbf{x}]$, the covariance matrix $\mathbf{K} = \mathbb{E}[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^*]$, and the pseudo-covariance matrix $\mathbf{J} = \mathbb{E}[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^t]$, where $\mathbf{A}^*$ is the transpose of the matrix $\mathbf{A}$ with each element replaced by its complex conjugate, and $\mathbf{A}^t$ is just the transpose of $\mathbf{A}$.
• Note that in general the covariance matrix $\mathbf{K}$ of the complex random vector $\mathbf{x}$ by itself is not enough to specify the full second-order statistics of $\mathbf{x}$. Indeed, since $\mathbf{K}$ is Hermitian, i.e., $\mathbf{K} = \mathbf{K}^*$, the diagonal elements are real and the elements in the lower and upper triangles are complex conjugates of each other.
Circular Complex Gaussian Vectors
• In wireless communication, we are almost exclusively interested in complex random vectors that have the circular symmetry property: $\mathbf{x}$ is circular symmetric if $e^{j\theta}\mathbf{x}$ has the same distribution as $\mathbf{x}$ for any $\theta$.
• For a circular symmetric complex random vector $\mathbf{x}$, $\mathbb{E}[\mathbf{x}] = \mathbb{E}[e^{j\theta}\mathbf{x}] = e^{j\theta}\mathbb{E}[\mathbf{x}]$ for any $\theta$; hence the mean $\boldsymbol{\mu} = \mathbf{0}$. Moreover, $\mathbb{E}[\mathbf{x}\mathbf{x}^t] = \mathbb{E}[(e^{j\theta}\mathbf{x})(e^{j\theta}\mathbf{x})^t] = e^{j2\theta}\mathbb{E}[\mathbf{x}\mathbf{x}^t]$ for any $\theta$; hence the pseudo-covariance matrix $\mathbf{J}$ is also zero.
• Thus, the covariance matrix $\mathbf{K}$ fully specifies the first- and second-order statistics of a circular symmetric random vector.
Circular Complex Gaussian Vectors
• If the complex random vector is also Gaussian, $\mathbf{K}$ in fact specifies its entire statistics. A circular symmetric Gaussian random vector with covariance matrix $\mathbf{K}$ is denoted $\mathcal{CN}(\mathbf{0}, \mathbf{K})$.
• Some special cases:
• A complex Gaussian random variable $w = w_R + j w_I$ with i.i.d. zero-mean Gaussian real and imaginary components is circular symmetric. In fact, a circular symmetric Gaussian random variable must have i.i.d. zero-mean real and imaginary components.
• A collection of $n$ i.i.d. $\mathcal{CN}(0,1)$ random variables forms a standard circular symmetric Gaussian random vector $\mathbf{w}$, denoted $\mathcal{CN}(\mathbf{0}, \mathbf{I})$. The density function of $\mathbf{w}$ can be explicitly written as
$$f(\mathbf{w}) = \frac{1}{\pi^n}\exp\left(-\|\mathbf{w}\|^2\right).$$
• $\mathbf{U}\mathbf{w}$ has the same distribution as $\mathbf{w}$ for any complex orthogonal matrix $\mathbf{U}$ (such a matrix is called a unitary matrix and is characterized by the property $\mathbf{U}\mathbf{U}^* = \mathbf{I}$).
Narrowband MIMO Model
• Here we consider a narrowband MIMO channel. A narrowband point-to-point communication system with $M_t$ transmit and $M_r$ receive antennas is shown in the following figure.
Narrowband MIMO Model
• The channel is represented by the $M_r \times M_t$ matrix $\mathbf{H}$ of channel gains, or simply as $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$.
• We assume a channel bandwidth of $B$ and complex Gaussian noise with zero mean and covariance matrix $\sigma_n^2\mathbf{I}_{M_r}$, where typically $\sigma_n^2 = N_0 B$.
• For simplicity, given a transmit power constraint $P$ we will assume an equivalent model with a noise power of unity and transmit power $\rho = P/\sigma_n^2$, where $\rho$ can be interpreted as the average SNR per receive antenna under unity channel gain.
• This power constraint implies that the input symbols satisfy
$$\sum_{i=1}^{M_t} \mathbb{E}[x_i x_i^*] = \rho, \qquad (10.1)$$
or equivalently $\mathrm{Tr}(\mathbf{R_x}) = \rho$, where $\mathrm{Tr}(\mathbf{R_x})$ is the trace of the input covariance matrix $\mathbf{R_x} = \mathbb{E}[\mathbf{x}\mathbf{x}^H]$.
Parallel Decomposition of the MIMO Channel
• When both the transmitter and receiver have multiple antennas, there is another mechanism for performance gain, called (spatial) multiplexing gain.
• The multiplexing gain of a MIMO system results from the fact that a MIMO channel can be decomposed into a number $R_H$ of parallel independent channels.
• By multiplexing independent data onto these independent channels, we get an $R_H$-fold increase in data rate in comparison to a system with just one antenna at the transmitter and receiver.
• Consider a MIMO channel with $M_r \times M_t$ channel gain matrix $\mathbf{H}$ known to both the transmitter and the receiver.
• Let $R_H$ denote the rank of $\mathbf{H}$. From matrix theory, for any matrix $\mathbf{H}$ we can obtain its singular value decomposition (SVD) as
$$\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H, \qquad (10.2)$$
where the $M_r \times M_r$ matrix $\mathbf{U}$ and the $M_t \times M_t$ matrix $\mathbf{V}$ are unitary and $\boldsymbol{\Sigma}$ is an $M_r \times M_t$ diagonal matrix of singular values $\{\sigma_i\}$ of $\mathbf{H}$.
Parallel Decomposition of the MIMO Channel
• Since $R_H$ cannot exceed the number of columns or rows of $\mathbf{H}$, $R_H \le \min(M_t, M_r)$. If $\mathbf{H}$ is full rank, which is sometimes referred to as a rich scattering environment, then $R_H = \min(M_t, M_r)$.
• The parallel decomposition of the channel is obtained by defining a transformation on the channel input and output $\mathbf{x}$ and $\mathbf{y}$ through transmit precoding ($\mathbf{x} = \mathbf{V}\tilde{\mathbf{x}}$) and receiver shaping ($\tilde{\mathbf{y}} = \mathbf{U}^H\mathbf{y}$), as shown in the following figure.
• The transmit precoding and receiver shaping transform the MIMO channel into $R_H$ parallel single-input single-output (SISO) channels with input $\tilde{\mathbf{x}}$ and output $\tilde{\mathbf{y}}$, since from the SVD we have that
$$\tilde{\mathbf{y}} = \mathbf{U}^H(\mathbf{H}\mathbf{V}\tilde{\mathbf{x}} + \mathbf{n}) = \mathbf{U}^H\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H\mathbf{V}\tilde{\mathbf{x}} + \mathbf{U}^H\mathbf{n} = \boldsymbol{\Sigma}\tilde{\mathbf{x}} + \tilde{\mathbf{n}},$$
where $\tilde{\mathbf{n}} = \mathbf{U}^H\mathbf{n}$ has the same distribution as $\mathbf{n}$.
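A minimal NumPy sketch of this decomposition (the 4×4 i.i.d. complex Gaussian channel realization is an illustrative assumption): it verifies that precoding with $\mathbf{V}$ and shaping with $\mathbf{U}^H$ turn $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$ into decoupled scalar channels $\tilde{y}_i = \sigma_i \tilde{x}_i + \tilde{n}_i$.

```python
import numpy as np

rng = np.random.default_rng(4)
Mr, Mt = 4, 4
# Illustrative i.i.d. complex Gaussian channel realization
H = (rng.normal(size=(Mr, Mt)) + 1j * rng.normal(size=(Mr, Mt))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)               # H = U @ diag(s) @ Vh
x_tilde = rng.normal(size=Mt) + 1j * rng.normal(size=Mt)
n = rng.normal(size=Mr) + 1j * rng.normal(size=Mr)

x = Vh.conj().T @ x_tilde                 # transmit precoding: x = V x~
y = H @ x + n
y_tilde = U.conj().T @ y                  # receiver shaping: y~ = U^H y

# y~ should equal Sigma x~ + n~, i.e. R_H decoupled scalar channels
print(np.allclose(y_tilde, s * x_tilde + U.conj().T @ n))   # True
```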
Parallel Decomposition of the MIMO Channel
• This parallel decomposition is shown in the following figure:
[Figure: the MIMO channel transformed into $R_H$ parallel SISO channels $\tilde{y}_i = \sigma_i \tilde{x}_i + \tilde{n}_i$.]
MIMO Channel Capacity
• The capacity of a MIMO channel is an extension of the mutual information formula for a SISO channel, given earlier, to a matrix channel.
• Specifically, the capacity is given in terms of the mutual information between the channel input vector $\mathbf{x}$ and output vector $\mathbf{y}$ as
$$C = \max_{p(\mathbf{x})} I(\mathbf{X};\mathbf{Y}) = \max_{p(\mathbf{x})} \left[H(\mathbf{Y}) - H(\mathbf{Y}|\mathbf{X})\right]. \qquad (10.5)$$
• The definition of entropy yields that $H(\mathbf{Y}|\mathbf{X}) = H(\mathbf{N})$, the entropy in the noise. Since this noise $\mathbf{n}$ has fixed entropy independent of the channel input, maximizing mutual information is equivalent to maximizing the entropy in $\mathbf{y}$.
• The entropy of $\mathbf{y}$ depends on its covariance matrix, which for the narrowband MIMO model is given by
$$\mathbf{R_y} = \mathbb{E}[\mathbf{y}\mathbf{y}^H] = \mathbf{H}\mathbf{R_x}\mathbf{H}^H + \mathbf{I}_{M_r}, \qquad (10.6)$$
where $\mathbf{R_x}$ is the covariance of the MIMO channel input.
• The mutual information can be shown to be
$$I(\mathbf{X};\mathbf{Y}) = B\log_2\det\left(\mathbf{I}_{M_r} + \mathbf{H}\mathbf{R_x}\mathbf{H}^H\right). \qquad (10.7)$$
• The MIMO capacity is achieved by maximizing the mutual information over all input covariance matrices $\mathbf{R_x}$ satisfying the power constraint:
$$C = \max_{\mathbf{R_x}:\,\mathrm{Tr}(\mathbf{R_x})=\rho} B\log_2\det\left(\mathbf{I}_{M_r} + \mathbf{H}\mathbf{R_x}\mathbf{H}^H\right), \qquad (10.8)$$
where $\det(\mathbf{A})$ denotes the determinant of the matrix $\mathbf{A}$.
• Now let us consider the case of the channel known at the transmitter.
MIMO Channel Capacity
• Substituting the matrix SVD of $\mathbf{H}$ into $C$ and using properties of unitary matrices, we get the MIMO capacity with CSIT and CSIR as
$$C = \max_{\rho_i:\,\sum_i \rho_i \le \rho} \sum_{i} B\log_2\left(1 + \sigma_i^2 \rho_i\right). \qquad (10.9)$$
• Since $\rho = P/\sigma_n^2$, the capacity (10.9) can also be expressed in terms of the power allocation $P_i$ to the $i$th parallel channel as
$$C = \max_{P_i:\,\sum_i P_i \le P} \sum_{i} B\log_2\left(1 + \frac{\gamma_i P_i}{P}\right), \qquad (10.10)$$
where $\rho_i = P_i/\sigma_n^2$ and $\gamma_i = \sigma_i^2 P/\sigma_n^2$ is the SNR associated with the $i$th channel at full power.
• Solving the optimization leads to a water-filling power allocation for the MIMO channel:
$$\frac{P_i}{P} = \begin{cases} \dfrac{1}{\gamma_0} - \dfrac{1}{\gamma_i}, & \gamma_i \ge \gamma_0 \\[4pt] 0, & \gamma_i < \gamma_0 \end{cases} \qquad (10.11)$$
for some cutoff value $\gamma_0$.
MIMO Channel Capacity
• The resulting capacity is then
$$C = \sum_{i:\,\gamma_i \ge \gamma_0} B\log_2\left(\frac{\gamma_i}{\gamma_0}\right). \qquad (10.12)$$
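A hedged Python sketch of (10.9)–(10.12) (the 4×4 i.i.d. complex Gaussian realization and the SNR are illustrative; `mimo_waterfill_capacity` is a hypothetical helper name): it water-fills over the squared singular values, dropping weak eigenchannels until the cutoff is consistent.

```python
import numpy as np

def mimo_waterfill_capacity(H, rho, B=1.0):
    """Capacity with CSIT/CSIR via water-filling over singular values, eqs. (10.9)-(10.12).

    rho: total transmit SNR (noise power normalized to 1); B: bandwidth in Hz.
    """
    sigma2 = np.linalg.svd(H, compute_uv=False) ** 2
    gammas = np.sort(sigma2 * rho)[::-1]          # per-channel SNRs at full power
    for k in range(len(gammas), 0, -1):
        g = gammas[:k]
        inv_g0 = (1.0 + np.sum(1.0 / g)) / k      # solve sum_i (1/g0 - 1/gamma_i) = 1
        g0 = 1.0 / inv_g0
        if g0 <= g.min():                         # all k used channels exceed the cutoff
            break
    return B * np.sum(np.log2(g / g0))            # eq. (10.12)

rng = np.random.default_rng(5)
H = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))) / np.sqrt(2)
print(mimo_waterfill_capacity(H, rho=10.0), "bps/Hz")
```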
• Now consider the case of the channel unknown at the transmitter (uniform power allocation).
• Suppose now that the receiver knows the channel but the transmitter does not. Without channel information, the transmitter cannot optimize its power allocation or input covariance structure across antennas.
• If the distribution of $\mathbf{H}$ follows the ZMSW channel gain model, there is no bias in terms of the mean or covariance of $\mathbf{H}$.
• Thus, it seems intuitive that the best strategy should be to allocate equal power to each transmit antenna, resulting in an input covariance matrix equal to the scaled identity matrix $\mathbf{R_x} = \frac{\rho}{M_t}\mathbf{I}_{M_t}$.
MIMO Channel Capacity
• It is shown that under these assumptions this input covariance matrix indeed maximizes the mutual information of the channel.
• For an $M_t$-transmit, $M_r$-receive antenna system, this yields mutual information given by
$$I = B\log_2\det\left(\mathbf{I}_{M_r} + \frac{\rho}{M_t}\mathbf{H}\mathbf{H}^H\right). \qquad (10.13)$$
• Using the SVD of $\mathbf{H}$, we can express this as
$$I = \sum_{i=1}^{R_H} B\log_2\left(1 + \frac{\rho\,\sigma_i^2}{M_t}\right),$$
where $\sigma_i$ are the singular values of $\mathbf{H}$ and $R_H \le \min(M_t, M_r)$ is the number of nonzero singular values.
• The mutual information of the MIMO channel (10.13) thus depends on the specific realization of the matrix $\mathbf{H}$, in particular on its singular values $\{\sigma_i\}$.
MIMO Channel Capacity
• In capacity with outage, the transmitter fixes a transmission rate $C$, and the outage probability associated with $C$ is the probability that the transmitted data will not be received correctly or, equivalently, the probability that the channel $\mathbf{H}$ has mutual information less than $C$. This probability is given by
$$P_{\mathrm{out}} = \mathbb{P}\left[B\log_2\det\left(\mathbf{I}_{M_r} + \frac{\rho}{M_t}\mathbf{H}\mathbf{H}^H\right) < C\right]. \qquad (10.14)$$
• Note that for fixed $M_r$, under the ZMSW model the law of large numbers implies that
$$\lim_{M_t\to\infty} \frac{1}{M_t}\mathbf{H}\mathbf{H}^H = \mathbf{I}_{M_r}. \qquad (10.15)$$
• Substituting this into (10.13) yields that the mutual information in the asymptotic limit of large $M_t$ becomes a constant equal to $C = M_r B\log_2(1+\rho)$.
MIMO Channel Capacity
• We can make two important observations from the results in (10.14) and (10.15):
• As SNR grows large, capacity grows linearly with $M = \min\{M_t, M_r\}$ for any $M_t$ and $M_r$.
• At very low SNRs, transmit antennas are not beneficial: capacity only scales with the number of receive antennas, independent of the number of transmit antennas.
• Fading channels — channel known at transmitter (water-filling): with a short-term power constraint, the capacity is
$$C = \mathbb{E}_{\mathbf{H}}\left[\max_{\mathbf{R_x}:\,\mathrm{Tr}(\mathbf{R_x})=\rho} B\log_2\det\left(\mathbf{I}_{M_r} + \mathbf{H}\mathbf{R_x}\mathbf{H}^H\right)\right]. \qquad (10.16)$$
MIMO Channel Capacity
• A less restrictive constraint is a long-term power constraint, where we can use different powers for different channel realizations subject to the average power constraint over all channel realizations.
• The ergodic capacity in this case is
$$C = \max_{\rho(\mathbf{H}):\,\mathbb{E}_{\mathbf{H}}[\rho(\mathbf{H})]=\rho}\; \mathbb{E}_{\mathbf{H}}\left[\max_{\mathbf{R_x}:\,\mathrm{Tr}(\mathbf{R_x})=\rho(\mathbf{H})} B\log_2\det\left(\mathbf{I}_{M_r} + \mathbf{H}\mathbf{R_x}\mathbf{H}^H\right)\right]. \qquad (10.17)$$
• Channel unknown at transmitter: ergodic capacity and capacity with outage.
• Consider now a time-varying channel with random matrix $\mathbf{H}$ known at the receiver but not the transmitter. The transmitter assumes a ZMSW distribution for $\mathbf{H}$.
• The two relevant capacity definitions in this case are ergodic capacity and capacity with outage.
• Ergodic capacity defines the maximum rate, averaged over all channel realizations, that can be transmitted over the channel for a transmission strategy based only on the distribution of $\mathbf{H}$.
MIMO Channel Capacity
• This leads to the transmitter optimization problem, i.e., finding the optimum input covariance matrix to maximize ergodic capacity subject to the transmit power constraint. Mathematically, the problem is to characterize the optimum $\mathbf{R_x}$ to maximize
$$C = \max_{\mathbf{R_x}:\,\mathrm{Tr}(\mathbf{R_x})=\rho} \mathbb{E}_{\mathbf{H}}\left[B\log_2\det\left(\mathbf{I}_{M_r} + \mathbf{H}\mathbf{R_x}\mathbf{H}^H\right)\right], \qquad (10.18)$$
where the expectation is with respect to the distribution on the channel matrix $\mathbf{H}$, which for the ZMSW model is i.i.d. zero-mean, circularly symmetric, and of unit variance.
• As in the case of scalar channels, the optimum input covariance matrix that maximizes ergodic capacity for the ZMSW model is the scaled identity matrix $\frac{\rho}{M_t}\mathbf{I}_{M_t}$. Thus the ergodic capacity is given by
$$C = \mathbb{E}_{\mathbf{H}}\left[B\log_2\det\left(\mathbf{I}_{M_r} + \frac{\rho}{M_t}\mathbf{H}\mathbf{H}^H\right)\right]. \qquad (10.19)$$
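A hedged Monte Carlo sketch of (10.19) in Python (the 4×4 setting matches the figure that follows, but the SNR, trial count, and the 10% outage level are illustrative assumptions; `mi_uniform` is a hypothetical helper name); the empirical quantile of the per-realization mutual information also illustrates the outage formulation (10.14):

```python
import numpy as np

def mi_uniform(H, rho, B=1.0):
    """Mutual information (10.13) with uniform power allocation; bps/Hz for B = 1."""
    Mr, Mt = H.shape
    M = np.eye(Mr) + (rho / Mt) * (H @ H.conj().T)
    return B * np.log2(np.linalg.det(M).real)

rng = np.random.default_rng(6)
Mt = Mr = 4
rho = 10.0                                  # illustrative SNR (10 dB)
trials = 20_000
mi = np.array([mi_uniform((rng.normal(size=(Mr, Mt)) +
                           1j * rng.normal(size=(Mr, Mt))) / np.sqrt(2), rho)
               for _ in range(trials)])

print("ergodic capacity (10.19):", mi.mean(), "bps/Hz")
print("10%-outage rate:", np.quantile(mi, 0.10), "bps/Hz")   # cf. eq. (10.14)
```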
MIMO Channel Capacity
• The ergodic capacity of a 4×4 MIMO system with i.i.d. complex Gaussian channel gains is shown in Figure 10.4.
MIMO Channel Capacity
• Capacity with outage is defined similarly to the definition for static channels described previously, although now capacity with outage applies to a slowly varying channel where the channel matrix $\mathbf{H}$ is constant over a relatively long transmission time and then changes to a new value.
• As in the static channel case, the channel realization and corresponding channel capacity are not known at the transmitter, yet the transmitter must still fix a transmission rate to send data over the channel.
• For any choice of this rate $C$, there will be an outage probability associated with $C$, which defines the probability that the transmitted data will not be received correctly.
• The outage capacity can sometimes be improved by not allocating power to one or more of the transmit antennas, especially when the outage probability is high. This is because outage capacity depends on the tail of the probability distribution.
• With fewer antennas, less averaging takes place and the spread of the tail increases.
MIMO Channel Capacity
• The capacity with outage of a 4×4 MIMO system with i.i.d. complex Gaussian channel gains is shown in Figure 10.5.