Free Deconvolution for MIMO Capacity Estimation
Øyvind Ryan
Centre of Mathematics for Applications,
University of Oslo,
P.O. Box 1053 Blindern,
0316 Oslo, Norway
Phone: +47 93 24 83 21
Fax: +47 22 85 43 49
Email: oyvindry@ifi.uio.no

Mérouane Debbah
SUPELEC,
3 rue Joliot-Curie,
91192 GIF SUR YVETTE CEDEX, France
Phone: +33 1 69 85 20 07
Fax: +33 1 69 85 12 59
Email: merouane.debbah@supelec.fr
Abstract—In many channel measurement applications, one needs
to estimate some characteristics of the channels based on a limited
set of measurements. This is mainly due to the highly time varying
characteristics of the channel. In this contribution, it will be shown
how free probability can be used for channel capacity estimation in
MIMO systems. Free probability has already been shown to be an
invaluable tool for describing the asymptotic behaviour of many systems
when the dimensions of the system get large (i.e. the number of
antennas). In particular, introducing the notion of free deconvolution, we
provide hereafter an asymptotically (in the number of antennas) unbiased
capacity estimator (w.r.t. the number of observations) for MIMO channels
impaired with noise. Another unbiased estimator (for any number of
observations) is also constructed by slightly modifying the free probability
based estimator. The framework is then extended to MIMO channels
with phase off-set and frequency drift for which no estimator has been
provided so far.
I. INTRODUCTION
Random matrices, and in particular limit distributions of sample
covariance matrices, have proved to be a useful tool for modelling systems, for instance in digital communications [1], nuclear
physics [2] and mathematical finance [3]. A typical random matrix
model is the information-plus-noise model,

$$W_n = \frac{1}{N}(R_n + \sigma X_n)(R_n + \sigma X_n)^H. \tag{1}$$

$R_n$ and $X_n$ are assumed independent random matrices of dimension
$n \times N$, where $X_n$ contains i.i.d. standard (i.e. mean 0, variance
1) complex Gaussian entries. The matrix in (1) can be thought of as the sample
covariance matrix of the random vectors $r_n + \sigma x_n$. Here $r_n$ can be interpreted as a vector containing the system characteristics (direction
of arrival for instance in radar applications or impulse response
in channel estimation applications), and $x_n$ represents additive noise,
with $\sigma$ a measure of the strength of the noise. Classical signal
processing estimation methods consider the case where the number
of observations is far bigger than the dimensions of the system, for
which equation (1) can be shown to be approximately

$$W_n \approx \Gamma_n + \sigma^2 I_n. \tag{2}$$
Here, Γn is the true covariance of the signal. In this case, one can
separate the signal from the noise subspace and infer (based only on
the statistics of the signal) on the characteristics of the input signal.
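The regime behind approximation (2) can be illustrated numerically. The following is a minimal sketch, assuming a randomly drawn stand-in for $R_n$ and illustrative sizes (the helper names `complex_gaussian` and `sample_covariance_gap` are hypothetical, not from the paper): it forms $W_n$ as in (1) and measures how far it is from $\Gamma_n + \sigma^2 I_n$, once with $N \gg n$ and once with $N$ comparable to $n$.

```python
import numpy as np

def complex_gaussian(rng, shape):
    """i.i.d. standard complex Gaussian entries (mean 0, variance 1)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def sample_covariance_gap(n, N, sigma, rng):
    """Largest entrywise gap between W_n of (1) and Gamma_n + sigma^2 I_n of (2)."""
    R = complex_gaussian(rng, (n, N))      # stand-in signal matrix (assumption)
    X = complex_gaussian(rng, (n, N))      # noise matrix
    Y = R + sigma * X
    W = Y @ Y.conj().T / N                 # equation (1)
    Gamma = R @ R.conj().T / N             # Gamma_n = (1/N) R_n R_n^H
    return np.max(np.abs(W - (Gamma + sigma**2 * np.eye(n))))

rng = np.random.default_rng(0)
gap_many = sample_covariance_gap(n=4, N=50_000, sigma=0.5, rng=rng)  # N >> n
gap_few = sample_covariance_gap(n=4, N=8, sigma=0.5, rng=rng)        # N comparable to n
print(gap_many, gap_few)
```

With many observations the gap is small, while with few observations it is not, which is the situation the rest of the paper addresses.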
However, in many situations, one can gather only a limited number
of observations during which the characteristics of the signal do
not change. In order to model this case, $n$ and $N$ will be increased
so that $\lim_{n\to\infty} \frac{n}{N} = c$, i.e. the number of observations is increased
at the same rate as the number of parameters of the system.
Previous contributions have already dealt with this problem. In [4],
Dozier and Silverstein explain how one can use the eigenvalue
distribution of $\Gamma_n = \frac{1}{N} R_n R_n^H$ to estimate the eigenvalue distribution
of $W_n$. In [5], [6], an algorithm was provided to estimate one from
the other, using the concept of multiplicative free convolution, which
admits a convenient implementation. In this paper, channel capacity
estimation in MIMO systems is used as a benchmark application by
using the connection between free probability theory and systems
of the form (1). For MIMO channels with and without frequency
off-sets, we derive explicit unbiased estimators which perform much
better than classical ones.
This paper is organized as follows. Section II presents the problem under consideration. Section III provides the basic concepts
needed on free probability, including multiplicative and additive free
(de)convolution. In section IV, we formalize the free probability
approach as an estimator, and explain some of its shortcomings
for MIMO models with frequency off-sets. Another estimator, called
the unbiased capacity estimator, is then formalized to address the
shortcomings of the free probability based estimator. In section V,
simulations of the free probability based and the unbiased estimators
are performed and compared.
In the following, upper (lower) case boldface
symbols will be used for matrices (column vectors), whereas lower case
symbols will represent scalar values. $(\cdot)^T$ will denote the transpose
operator, $(\cdot)^*$ conjugation and $(\cdot)^H = ((\cdot)^T)^*$ hermitian transpose.
$I_n$ will represent the $n \times n$ identity matrix. $\mathrm{Tr}_n$ will denote the
non-normalized trace on $n \times n$ matrices, while $\mathrm{tr}_n = \frac{1}{n}\mathrm{Tr}_n$ denotes
the normalized trace. Also, we will throughout the paper use $c$ as a
shorthand notation for the ratio between the number of rows and the
number of columns in the random matrix model being considered,
i.e. $c = \lim_{n\to\infty} \frac{n}{N}$ for systems of the form (1).
II. STATEMENT OF THE PROBLEM
In usual time varying measurement methods for MIMO systems,
one validates models [7] by determining how the model fits with
actual capacity measurements. In this setting, one has to be extremely
cautious about the measurement noise, especially for far field measurements where the signal strength can be lower than the noise.
The MIMO measured channel in the frequency domain can be
modelled by:

$$\hat{H}_i = \frac{1}{\sqrt{n}} D_i^r H D_i^t + \sigma X_i \tag{3}$$

where $\hat{H}_i$, $H$ and $X_i$ are respectively the $n \times n$ measured MIMO
matrix ($n$ is the number of receiving and transmitting antennas; without loss of
generality, we consider the same number of antennas at the transmitter and receiver), the
$n \times n$ MIMO channel and the $n \times n$ noise matrix with i.i.d. zero
mean unit variance Gaussian entries. We suppose that the channel
H, although time varying, stays constant (block fading assumption)
during L blocks. Dri and Dti are diagonal matrices which represent
phase off-sets and phase drifts (which are impairments due to the
antennas and not the channel) at the receiver and transmitter given
respectively by (these are supposed to vary on a block basis):
$$D_i^r = \mathrm{diag}[e^{j\phi_1^i}, \ldots, e^{j\phi_n^i}] \quad \text{and} \quad D_i^t = \mathrm{diag}[e^{j\theta_1^i}, \ldots, e^{j\theta_n^i}],$$

where the phases $\phi_j^i$ and $\theta_j^i$ are random, independent, and uniformly
distributed. We will also consider the simpler model without phase
off-sets and phase drifts, i.e.

$$\hat{H}_i = \frac{1}{\sqrt{n}} (H + \sigma X_i). \tag{4}$$
The capacity of a channel with channel matrix $H$ and signal to
noise ratio $\rho = \frac{1}{\sigma^2}$ is given by

$$C = \frac{1}{n} \log\det\left(I + \frac{1}{n\sigma^2} HH^H\right) = \frac{1}{n} \sum_{l=1}^{n} \log\left(1 + \frac{1}{\sigma^2}\lambda_l\right) \tag{5}$$

where $\lambda_l$ are the eigenvalues of $\frac{1}{n}HH^H$. The problem consists
therefore of estimating the eigenvalues of $\frac{1}{n}HH^H$ based on few
observations $\hat{H}_i$, which is paramount for modelling purposes. Note
that the capacity expression supposes that the channel is perfectly
known at the receiver and not at the transmitter. In practice, with
the noise impairment, the channel will never be estimated perfectly
and therefore expression (5) is not achievable. However, for MIMO
modelling purposes, for which the capacity is often the matching
metric, one needs to compare the capacity of the model with
expression (5).
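The two forms of expression (5) can be checked against each other numerically. The following is a minimal sketch, assuming the natural logarithm (the base is not fixed in (5)) and hypothetical function names; it computes the capacity once from the determinant and once from the eigenvalues of $\frac{1}{n}HH^H$.

```python
import numpy as np

def capacity_logdet(H, sigma2):
    """Capacity (5), determinant form; natural logarithm assumed."""
    n = H.shape[0]
    G = H @ H.conj().T / n                       # (1/n) H H^H
    _, logdet = np.linalg.slogdet(np.eye(n) + G / sigma2)
    return logdet / n

def capacity_eigen(H, sigma2):
    """Capacity (5), eigenvalue form; natural logarithm assumed."""
    n = H.shape[0]
    lam = np.linalg.eigvalsh(H @ H.conj().T / n)
    lam = np.clip(lam, 0.0, None)                # guard against tiny negative round-off
    return np.sum(np.log1p(lam / sigma2)) / n

rng = np.random.default_rng(0)
H = (rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))) / np.sqrt(2)
print(capacity_logdet(H, 0.1), capacity_eigen(H, 0.1))
```

The eigenvalue form is the one the estimators of section IV rely on, since they estimate the $\lambda_l$ rather than $H$ itself.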
There are different methods currently used for channel capacity
estimation. Usual methods discard, through an ad-hoc threshold
procedure, all channels $\hat{H}_i$ for which the channel to noise ratio
($\frac{1}{n\sigma^2}\mathrm{Trace}(HH^H)$) is lower than a threshold and then compute

$$\tilde{C}(\sigma^2) = \frac{1}{n} \log_2 \det\left(I + \frac{1}{\sigma^2} \Big(\frac{1}{M}\sum_{i=1}^{M} \hat{H}_i\Big)\Big(\frac{1}{M}\sum_{i=1}^{M} \hat{H}_i\Big)^H\right)$$
where $M \le L$ is the number of channels having a signal to noise
ratio higher than the threshold. One of the drawbacks of this method
is that one will not analyse the true capacity but only the capacity
of the “good channels”. Moreover, one has to limit the channel
measurement campaign (in order to have enough channels higher
than the threshold) only to regions which are close enough (in terms of actual
distance) to the base station.
Other methods, in order to have a capacity estimation at a given
signal to noise ratio (different from the measured one with noise variance $\sigma^2$), normalize each channel realization $\hat{H}_i$ and then compute
for a different value of the noise variance $\sigma_1^2$ (for example 10dB) the
capacity estimate $\tilde{C}(\sigma_1^2)$. In the case where $\sigma^2$ is high and $\sigma_1^2$ is low,
one usually finds a high capacity estimate as one measures only the
noise, which is known to have a high multiplexing gain.
In this contribution, we will provide a neat framework, based on
free deconvolution, for channel capacity estimation that circumvents
all the previous drawbacks. Moreover, we will deal with model (3),
for which no solution has been provided in the literature so far.
III. FRAMEWORK FOR FREE CONVOLUTION
Free probability [8] is a theory for non-commutative random variables (like matrices), which has grown into an entire field of research
through the pioneering work of Voiculescu. These random variables
are elements in what is called a noncommutative probability space.
Free probability is analogous to classical probability in the sense that
a general linear functional (such as the trace on the set of matrices)
takes the place of the expectation, and the concept of freeness [8]
takes the place of the concept of independence. It turns out that many
types of random matrices (for which our expectation is the (classical)
expectation of the trace) exhibit asymptotic freeness [8], meaning that
the freeness relation holds only asymptotically when the matrices get
large. For instance, random matrices $\frac{1}{\sqrt{n}}A_{n1}, \frac{1}{\sqrt{n}}A_{n2}, \ldots$, where the
$A_{ni}$ are $n \times n$ with all entries independent and standard Gaussian
(i.e. mean 0 and variance 1), are asymptotically free [8].
In free probability, one also has a transform analogous to the
logarithm of the Fourier transform. One also has a convolution, called
free convolution, which is linearized by this transform. We will denote
additive free convolution by $\mu_1 \boxplus \mu_2$, and multiplicative free convolution by $\mu_1 \boxtimes \mu_2$, where $\mu_1$ and $\mu_2$ are probability measures.
In free probability, these operations are formally defined by associating the moments of the probability measures $\mu_1$ and $\mu_2$ with
the moments of non-commutative random variables $a_1$ and $a_2$ which
are free, forming their sum/product, and associating the moments
of the sum/product with another probability measure. We will also
find the concepts of additive and multiplicative free deconvolution
useful: Given probability measures $\mu$ and $\mu_2$, when there is a unique
probability measure $\mu_1$ such that $\mu = \mu_1 \boxplus \mu_2$ ($\mu = \mu_1 \boxtimes \mu_2$), we
will write $\mu_1 = \mu \boxminus \mu_2$ ($\mu_1 = \mu \boxslash \mu_2$ respectively). We say that $\mu_1$
is the additive (respectively multiplicative) free deconvolution of $\mu$
with $\mu_2$.
One important measure is the Marčenko-Pastur law $\mu_c$ [9],
characterized by the density

$$f^{\mu_c}(x) = \left(1 - \frac{1}{c}\right)^+ \delta(x) + \frac{\sqrt{(x-a)^+ (b-x)^+}}{2\pi c x}, \tag{6}$$

where $(z)^+ = \max(0, z)$, $a = (1 - \sqrt{c})^2$ and $b = (1 + \sqrt{c})^2$.
It is known that $\mu_c$ describes asymptotic eigenvalue distributions of
Wishart matrices. These have the form $\frac{1}{N}RR^H$, where $R$ is an $n \times N$
($\frac{n}{N} \to c$) random matrix with independent standard Gaussian entries.
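As an illustration of (6), the first moments of the Marčenko-Pastur law can be compared with those of a simulated Wishart matrix. The sketch below is an assumption-laden check rather than part of the paper's method: it takes $c = 0.5$ (so the point mass at zero vanishes), integrates the continuous part of the density with a midpoint rule, and uses complex Gaussian entries; the function names are hypothetical. The values 1 and $1 + c$ for the first two moments are standard facts about $\mu_c$.

```python
import numpy as np

def mp_density(x, c):
    """Continuous part of the Marcenko-Pastur density (6)."""
    a = (1 - np.sqrt(c)) ** 2
    b = (1 + np.sqrt(c)) ** 2
    inside = np.clip(x - a, 0.0, None) * np.clip(b - x, 0.0, None)
    return np.sqrt(inside) / (2 * np.pi * c * x)

def mp_moment(j, c, points=200_000):
    """j-th moment of mu_c for c < 1, by a midpoint rule on [a, b]."""
    a = (1 - np.sqrt(c)) ** 2
    b = (1 + np.sqrt(c)) ** 2
    edges = np.linspace(a, b, points + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    return np.sum(mid ** j * mp_density(mid, c)) * (b - a) / points

def wishart_moment(j, n, N, rng):
    """Normalized trace of ((1/N) R R^H)^j for a Gaussian n x N matrix R."""
    R = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    eig = np.linalg.eigvalsh(R @ R.conj().T / N)
    return np.mean(eig ** j)

c = 0.5
rng = np.random.default_rng(0)
print(mp_moment(1, c), mp_moment(2, c))                  # approximately 1 and 1 + c
print(wishart_moment(1, 200, 400, rng), wishart_moment(2, 200, 400, rng))
```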
By the empirical eigenvalue distribution of an $n \times n$ random matrix
$X$ we will mean the random atomic measure

$$\frac{1}{n}\left(\delta(\lambda_1(X)) + \cdots + \delta(\lambda_n(X))\right),$$

where $\lambda_1(X), \ldots, \lambda_n(X)$ are the (random) eigenvalues of $X$. The
empirical eigenvalue distribution of $X$ is also denoted $\mu_X$. To make
the connection between models (3), (4) and the model (1), we need
the following result from free probability [5]:
Theorem 1: Assume that the empirical eigenvalue distribution of
$\Gamma_n = \frac{1}{N} R_n R_n^H$ converges in distribution almost surely to a
compactly supported probability measure $\mu_\Gamma$. Then we have that the
empirical eigenvalue distribution of $W_n$ also converges in distribution almost surely to a compactly supported probability measure $\mu_W$
uniquely identified by

$$\mu_W \boxslash \mu_c = (\mu_\Gamma \boxslash \mu_c) \boxplus \mu_{\sigma^2 I}. \tag{7}$$
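When we have $L$ observations $\hat{H}_i$ in a MIMO system as in (4) or
(3), we will form the $n \times nL$ random matrices

$$\hat{H}_{1 \ldots L} = \frac{1}{\sqrt{n}} H_{1 \ldots L} + \frac{\sigma}{\sqrt{L}} X_{1 \ldots L} \tag{8}$$

with

$$\begin{aligned}
\hat{H}_{1 \ldots L} &= \frac{1}{\sqrt{L}} \left[\hat{H}_1, \hat{H}_2, \ldots, \hat{H}_L\right], \\
H_{1 \ldots L} &= \frac{1}{\sqrt{L}} \left[D_1^r H D_1^t, D_2^r H D_2^t, \ldots, D_L^r H D_L^t\right], \\
X_{1 \ldots L} &= \left[X_1, X_2, \ldots, X_L\right].
\end{aligned}$$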
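For the case $L = 1$, the formula

$$\mathrm{tr}_n\left(\left(D_1^r H D_1^t \left(D_1^r H D_1^t\right)^H\right)^j\right) = \mathrm{tr}_n\left(\left(HH^H\right)^j\right) \tag{9}$$

can be combined with theorem 1 to give the approximation

$$\mu_{\hat{H}_1 \hat{H}_1^H} \boxslash \mu_1 = \left(\mu_{\frac{1}{n}HH^H} \boxslash \mu_1\right) \boxplus \mu_{\sigma^2 I} \tag{10}$$

for a single observation. This approximation works well when $n$ is
large. For many observations, note that $H_{1 \ldots L} H_{1 \ldots L}^H = HH^H$ when
there is no phase off-set and phase drift, so that the approximation

$$\mu_{\hat{H}_{1 \ldots L} \hat{H}_{1 \ldots L}^H} \boxslash \mu_{\frac{1}{L}} = \left(\mu_{\frac{1}{n}HH^H} \boxslash \mu_{\frac{1}{L}}\right) \boxplus \mu_{\sigma^2 I} \tag{11}$$

applies and generalizes (10).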
Note that the ratio between the number of rows and columns in
the matrices $H_{1 \ldots L}$, $X_{1 \ldots L}$ and $\hat{H}_{1 \ldots L}$ is $c = \frac{1}{L}$, considering the
horizontal stacking of the observations in a larger matrix. It is only
this stacking which will be considered in this paper, so this value of
$c$ will apply for all random matrix models we will consider.
When phase off-set and phase drift are added, it is much harder to
adapt theorem 1 to produce the moments of $\frac{1}{n}HH^H$. The reason is
that theorem 1 really helps us to find the moments of $\frac{1}{n}H_{1 \ldots L}H_{1 \ldots L}^H$.
In the case without phase off-set and phase drift, this is enough
since these moments are equal to the moments of $\frac{1}{n}HH^H$. However,
equality between these moments does not hold when phase off-set
and phase drift are added. In section IV, we will instead define an
estimator for the channel capacity which does not stack observations
into the matrix $H_{1 \ldots L}$ at all.
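The distinction between a single observation and stacked observations can be checked numerically. The sketch below uses illustrative sizes ($n = 4$, $L = 3$) and hypothetical helper names. It verifies identity (9) for one observation, and shows that once the phase matrices differ between blocks, the second moment of the stacked matrix $H_{1 \ldots L}H_{1 \ldots L}^H$ no longer equals that of $HH^H$.

```python
import numpy as np

def moment(A, j):
    """Normalized trace tr_n(A^j)."""
    return np.trace(np.linalg.matrix_power(A, j)).real / A.shape[0]

def phase_matrix(rng, n):
    """Diagonal matrix of independent, uniformly distributed phases."""
    return np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, n)))

rng = np.random.default_rng(1)
n, L = 4, 3
H = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
HH = H @ H.conj().T

# One block D_i^r H D_i^t per observation, with fresh phases each time.
blocks = [phase_matrix(rng, n) @ H @ phase_matrix(rng, n) for _ in range(L)]

single = blocks[0] @ blocks[0].conj().T          # left-hand side of (9)
H_stack = np.hstack(blocks) / np.sqrt(L)         # H_{1...L} as in (8)
stacked = H_stack @ H_stack.conj().T

for j in (1, 2, 3):
    print(j, moment(single, j), moment(stacked, j), moment(HH, j))
```

The first moment survives stacking because the phase matrices are unitary, but higher moments shrink when blocks with different receiver phases are averaged.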
IV. NEW ESTIMATORS FOR CHANNEL CAPACITY
In this section, two new channel capacity estimators are defined.
First, a free probability based estimator is introduced, which will be
shown to be asymptotically unbiased w.r.t. the number of observations. Then, by slightly modifying this we will construct what we
call the unbiased capacity estimator. This estimator will be shown
to be unbiased for any number of observations.
A. The free probability based capacity estimator
The free probability based estimator can take advantage of knowledge of the rank of the channel matrix: With a lower rank, it can
perform estimation with less computation.
Definition 1: By a free probability based estimator for the capacity
of a channel (with channel matrix $H$) we will mean an estimator
which performs the following steps:
1) Compute the first $r$ moments $\hat{h}_1, \ldots, \hat{h}_r$ of the sample covariance matrix
$\hat{H}_{1 \ldots L}\hat{H}_{1 \ldots L}^H$ (i.e. compute $\hat{h}_j = \mathrm{tr}_n\left(\left(\hat{H}_{1 \ldots L}\hat{H}_{1 \ldots L}^H\right)^j\right)$
for $1 \le j \le r$), where $r$ is the rank of $\frac{1}{n}HH^H$,
2) use (11) to estimate the first $r$ moments $h_1, \ldots, h_r$ of $\frac{1}{n}HH^H$,
3) estimate the $r$ nonzero eigenvalues $\lambda_1, \ldots, \lambda_r$ of $\frac{1}{n}HH^H$ from
$h_1, \ldots, h_r$,
4) substitute these in (5).
Steps 2 and 3 in definition 1 need some elaboration. To address
step 3, we remark that the $k$ nonzero eigenvalues of a matrix can
be calculated from its first $k$ moments, by using the Newton-Girard
formulas as described in [10]. An accompanying implementation of
this can be found in [11].
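Step 3 can be sketched as follows. This is not the implementation of [11] but a minimal illustration under the assumption that the moments are normalized traces of an $n \times n$ matrix with $r$ nonzero eigenvalues; the function name is hypothetical. The normalized moments are converted to power sums, the Newton-Girard recursion yields the elementary symmetric polynomials, and the eigenvalues are the roots of the resulting degree-$r$ polynomial.

```python
import numpy as np

def eigenvalues_from_moments(h, n):
    """Recover r nonzero eigenvalues from normalized moments h_j = tr_n(A^j), j = 1..r."""
    r = len(h)
    p = [n * hj for hj in h]            # power sums p_j = sum_l lambda_l^j
    e = [1.0]                           # elementary symmetric polynomials, e_0 = 1
    for k in range(1, r + 1):
        # Newton-Girard: k e_k = sum_{i=1}^{k} (-1)^(i-1) e_{k-i} p_i
        acc = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(acc / k)
    # The nonzero eigenvalues are the roots of x^r - e_1 x^(r-1) + e_2 x^(r-2) - ...
    coeffs = [(-1) ** k * e[k] for k in range(r + 1)]
    return np.sort(np.roots(coeffs).real)

# Example: a 10 x 10 matrix of rank 3 with nonzero eigenvalues 1, 2, 3.
n = 10
h = [(1.0 ** j + 2.0 ** j + 3.0 ** j) / n for j in (1, 2, 3)]
print(eigenvalues_from_moments(h, n))
```

With noisy moment estimates the roots may acquire small imaginary parts or become negative; the sketch simply keeps the real parts and does not address that case.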
To address step 2, a Matlab implementation [11] which performs
free (de)convolution was developed and used for simulations in this
paper. The following algorithm for computing free convolution with
the Marčenko-Pastur law in terms of moments appeared first in [6].
It is easiest to describe for $c = 1$:
Proposition 1: If $m_n$ are the moments of $\mu$, and $M_n$ are the
moments of $\mu \boxtimes \mu_1$, then we have that

$$M_n = \sum_{k \le n} m_k \, \mathrm{coef}_{n-k}\left((1 + M_1 z + M_2 z^2 + \cdots)^k\right), \tag{12}$$

where $\mathrm{coef}_k$ is the coefficient of $z^k$ in the power series to which it is applied.
In [12], Matlab implementations of this can be found, together
with a straightforward generalization to free convolution with any
Marčenko-Pastur law. From the results in section III we conclude
that the free probability based estimator is asymptotically unbiased
(w.r.t. the number of observations) for systems of type (4) (but not for
systems of type (3)).
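The recursion (12) and its use in step 2 of definition 1 can be sketched for a single observation ($L = 1$, $c = 1$). The code below is not the Matlab implementation of [11] or [12]; the function names are hypothetical. It assumes that $\mu_{\sigma^2 I}$ is the point mass at $\sigma^2$, so that additive free (de)convolution with it reduces to shifting the measure. Under that assumption, (10) gives the moments of $\frac{1}{n}HH^H$ by deconvolving with $\mu_1$, shifting by $-\sigma^2$, and convolving with $\mu_1$ again.

```python
from math import comb
import numpy as np

def _coef_of_power(series, k, order):
    """Coefficient of z^order in (series)^k, where series[i] multiplies z^i."""
    base = np.asarray(series[:order + 1], dtype=float)
    acc = np.zeros(order + 1)
    acc[0] = 1.0
    for _ in range(k):
        acc = np.convolve(acc, base)[:order + 1]   # truncated polynomial product
    return acc[order]

def mp_convolve(m):
    """Moments M_1..M_r of mu (x) mu_1 from moments m_1..m_r of mu, via (12)."""
    M = []
    for n in range(1, len(m) + 1):
        P = [1.0] + M                              # 1 + M_1 z + ... + M_{n-1} z^(n-1)
        M.append(sum(m[k - 1] * _coef_of_power(P, k, n - k) for k in range(1, n + 1)))
    return M

def mp_deconvolve(M):
    """Invert (12): moments of mu from the moments of mu (x) mu_1."""
    m = []
    for n in range(1, len(M) + 1):
        P = [1.0] + list(M[:n - 1])
        known = sum(m[k - 1] * _coef_of_power(P, k, n - k) for k in range(1, n))
        m.append(M[n - 1] - known)                 # the k = n term has coefficient 1
    return m

def shift_moments(m, s):
    """Moments of the measure shifted by s (additive free convolution with a point mass)."""
    full = [1.0] + list(m)
    return [sum(comb(j, i) * s ** (j - i) * full[i] for i in range(j + 1))
            for j in range(1, len(m) + 1)]

def free_moment_estimate(h_hat, sigma2):
    """Step 2 of definition 1 for L = 1, following (10)."""
    return mp_convolve(shift_moments(mp_deconvolve(h_hat), -sigma2))

# Convolving a unit point mass with mu_1 returns mu_1, whose moments are Catalan numbers.
print(mp_convolve([1.0, 1.0, 1.0, 1.0]))
```

For other values of $c$ the recursion needs the generalization mentioned in the text, which this sketch does not cover.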
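B. The unbiased capacity estimator
The definition below for the unbiased capacity estimator is motivated by computing expected values of mixed moments of Gaussian
and deterministic matrices [10]:
Definition 2: The unbiased capacity estimator is defined for rank
$r \le 4$, with the requirement that the channel matrix $H$ is hermitian
if $r = 4$, in the following way:
1) For each observation, perform the following:
a) Compute the first $r$ moments $\hat{h}_{i1}, \ldots, \hat{h}_{ir}$ of the sample covariance matrix
$\hat{H}_i\hat{H}_i^H$ (i.e. compute $\hat{h}_{ij} = \mathrm{tr}_n\left(\left(\hat{H}_i\hat{H}_i^H\right)^j\right)$
for $1 \le j \le r$),
b) find estimates $h_{i1}, h_{i2}, h_{i3}$ of the first three moments of
$\frac{1}{n}HH^H$ by solving

$$\begin{aligned}
\hat{h}_{i1} &= h_{i1} + \sigma^2 \\
\hat{h}_{i2} &= h_{i2} + 4\sigma^2 h_{i1} + 2\sigma^4 \\
\hat{h}_{i3} &= h_{i3} + 6\sigma^2 h_{i2} + 3\sigma^2 h_{i1}^2 + 3\sigma^4\left(5 + \frac{2}{n^2}\right) h_{i1} + \sigma^6\left(5 + \frac{1}{n^2}\right),
\end{aligned} \tag{13}$$

and an estimate $h_{i4}$ of the fourth moment by solving the
additional formulas

$$\begin{aligned}
\widehat{hd}_{i2} &= hd_{i2} + 2\sigma^2\left(1 + \frac{1}{n}\right) h_{i1} + \sigma^4\left(1 + \frac{2}{n}\right) \\
\hat{h}_{i4} &= h_{i4} + 8\sigma^2 h_{i3} + 8\sigma^2 h_{i2} h_{i1} + 4\sigma^4\left(7 + \frac{6}{n^2}\right) h_{i2} + 4\sigma^4\left(7 + \frac{2}{n}\right) h_{i1}^2 \\
&\quad + 4\sigma^4\left(\frac{1}{n} + \frac{2}{n^2}\right) hd_{i2} + 4\sigma^6\left(14 + \frac{2}{n} + \frac{17}{n^2} + \frac{8}{n^3}\right) h_{i1} + \sigma^8\left(14 + \frac{10}{n^2}\right)
\end{aligned} \tag{14}$$

when $r = 4$, where

$$\widehat{hd}_{i2} = \mathrm{tr}_n\left(\left(\mathrm{diag}\left(\hat{H}_i\hat{H}_i^H\right)\right)^2\right)$$

($hd_{ij}$ is used as notation for the moments of a diagonal
matrix, i.e. the d stands for diagonal).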
2) Form the estimates $h_j = \frac{1}{L}\sum_{i=1}^{L} h_{ij}$, $1 \le j \le r$, of the first $r$
moments of $\frac{1}{n}HH^H$,
3) estimate the $r$ nonzero eigenvalues $\lambda_1, \ldots, \lambda_r$ of $\frac{1}{n}HH^H$ from
$h_1, \ldots, h_r$,
4) substitute these in (5).
A Matlab implementation performing these steps can be found
in [11]. Without the assumption of $H$ being hermitian, the expression
for the rank 4 estimator would be more complex. The following
proposition shows that the estimator of definition 2 actually qualifies
for its name. Its proof (found in [10]) is the background for the
formulas in definition 2.
Proposition 2: The unbiased capacity estimator is actually unbiased (for any number of observations), i.e.

$$E(h_j) = \mathrm{tr}_n\left(\left(\frac{1}{n}HH^H\right)^j\right), \quad 1 \le j \le 4.$$

If we restrict to $L = 1$ observation, the following holds:
1) The free probability based and the unbiased estimator coincide
for the first two moments $h_1$ and $h_2$.
2) The third moment $h_3$ in the free probability based estimator is
biased, with bias given by

$$-\frac{6\sigma^4 \, \mathrm{tr}_n\left(\frac{1}{n}HH^H\right) + \sigma^6}{n^2}.$$

This holds regardless of whether model (3) or model (4) is used.
In this paper, the unbiased capacity estimator is used for systems
with phase off-set and phase drift (where the free probability based
estimator fails). Free capacity estimation is used for systems without
phase off-set and phase drift, and has an implementation which can
be adapted to channel matrices with any rank.
V. CHANNEL CAPACITY ESTIMATION
Several candidates for channel capacity estimators for (4) have
been used in the literature. We will consider the following:

$$\begin{aligned}
C_1 &= \frac{1}{nL}\sum_{i=1}^{L} \log_2 \det\left(I + \frac{1}{\sigma^2}\hat{H}_i\hat{H}_i^H\right) \\
C_2 &= \frac{1}{n} \log_2 \det\left(I + \frac{1}{L\sigma^2}\sum_{i=1}^{L}\hat{H}_i\hat{H}_i^H\right) \\
C_3 &= \frac{1}{n} \log_2 \det\left(I + \frac{1}{\sigma^2}\Big(\frac{1}{L}\sum_{i=1}^{L}\hat{H}_i\Big)\Big(\frac{1}{L}\sum_{i=1}^{L}\hat{H}_i\Big)^H\right)
\end{aligned} \tag{15}$$

A. Channels without phase off-set and phase drift
In figure 1, $C_1$, $C_2$ and $C_3$ are compared with the free probability
based and unbiased estimators for various numbers of observations,
with $\sigma^2 = 0.1$, and a $10 \times 10$ channel matrix of rank 3. The
channel considered has no phase drift or phase off-set. Among the classical
estimators, only $C_3$ gives values close to the true capacity, while $C_1$ and $C_2$
are seen to have a high bias. The free probability based and unbiased
estimators are seen to give values closer to the true capacity than $C_3$.

[Figure: capacity versus number of observations; curves: True capacity, Classical capacity estimation ($C_1$), Classical capacity estimation ($C_2$), Classical capacity estimation ($C_3$), Free capacity estimation, Unbiased capacity estimation.]
Fig. 1. Comparison of various classical capacity estimators and the two
new capacity estimators introduced in this paper for various numbers of
observations, model (4). $\sigma^2 = 0.1$ and $n = 10$. The rank of $H$ was 3.

B. Channels with phase off-set and phase drift
In figure 2, the $C_3$ estimator is compared with the free probability
based estimator, the unbiased estimator and the true capacity, for
various numbers of observations, and with the same $\sigma$ and channel
matrix as in figure 1. Phase off-set and phase drift have also been
introduced. In this case, the free probability based estimator and the
$C_3$ estimator seem to be biased.

[Figure: capacity versus number of observations; curves: True capacity, Classical capacity estimation ($C_3$), Free capacity estimation, Unbiased capacity estimation.]
Fig. 2. Comparison of capacity estimators which worked for model (4) for
increasing number of observations. Model (3) is used. $\sigma^2 = 0.1$ and $n = 10$.
The rank of $H$ was 3.

In figure 3, we have varied $\sigma$, used 10 observations, and also
formed a rank 3 channel matrix with $n = 4$. It is seen that
the deviation from the true capacity is small, which provides a
very good candidate for channel estimation in highly time-varying
environments. In [10], the estimator was also tested with only one
observation, for which the deviation from the true capacity was
higher, but still quite small for small $\sigma$. In [10], the estimator was
also tested for one observation with $n = 10$, for which the deviation
from the true capacity was smaller.
Finally, let us use a channel matrix of rank 4. In this case we have
increased the number of observations further to predict the channel
capacity. In figure 4, unbiased capacity estimation is performed for
a rank 4 channel matrix with $n = 4$ and 1600 observations.
In [10], a comparison is also done for 50 observations.
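The moment corrections of definition 2 can be sketched for $r \le 3$ as follows. This is not the Matlab implementation of [11]; the function names and the simulation parameters are hypothetical, and the fourth-moment formulas (14) are left out. The sketch assumes that the noise is scaled by $\frac{1}{\sqrt{n}}$ together with the channel, as in model (4), which is the scaling under which $E(\hat{h}_{i1}) = h_1 + \sigma^2$. It also assumes that the finite-size terms of (13) are in $\frac{1}{n^2}$. Each observation is corrected separately and the corrected moments are then averaged, so no stacking is involved and the random phases of model (3) do no harm.

```python
import numpy as np

def observed_moments(H_hat, r):
    """First r moments of the sample covariance matrix H_hat H_hat^H."""
    eig = np.linalg.eigvalsh(H_hat @ H_hat.conj().T)
    return [np.mean(eig ** j) for j in range(1, r + 1)]

def unbiased_moments_single(h_hat, sigma2, n):
    """Solve (13) for the moment estimates of one observation (up to three moments)."""
    s = sigma2
    h = [h_hat[0] - s]
    if len(h_hat) >= 2:
        h.append(h_hat[1] - 4 * s * h[0] - 2 * s ** 2)
    if len(h_hat) >= 3:
        h.append(h_hat[2] - 6 * s * h[1] - 3 * s * h[0] ** 2
                 - 3 * s ** 2 * (5 + 2 / n ** 2) * h[0]
                 - s ** 3 * (5 + 1 / n ** 2))
    return h

def unbiased_moments(H_hats, sigma2, r):
    """Average the per-observation estimates over all observations."""
    n = H_hats[0].shape[0]
    per_obs = [unbiased_moments_single(observed_moments(Hh, r), sigma2, n)
               for Hh in H_hats]
    return np.mean(per_obs, axis=0)

def gaussian(rng, n):
    return (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)

def phases(rng, n):
    return np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, n)))

# Simulated observations with phase off-sets (noise scaling as in model (4), an assumption).
rng = np.random.default_rng(0)
n, L, sigma2 = 4, 2000, 0.1
H = gaussian(rng, n)
H_hats = [(phases(rng, n) @ H @ phases(rng, n) + np.sqrt(sigma2) * gaussian(rng, n)) / np.sqrt(n)
          for _ in range(L)]

estimate = unbiased_moments(H_hats, sigma2, r=2)
true = [np.mean(np.linalg.eigvalsh(H @ H.conj().T / n) ** j) for j in (1, 2)]
print(estimate, true)
```

The resulting moments can then be passed to a Newton-Girard step and to (5), as in steps 3 and 4 of definition 2.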
[Figure: capacity versus $\sigma$; curves: True capacity, Unbiased capacity estimation.]
Fig. 3. The unbiased estimator for $L = 10$ observations and $n = 4$, with
varying values of $\sigma$. Model (3). The rank of $H$ was 3.

[Figure: capacity versus $\sigma$; curves: True capacity, Unbiased capacity estimation.]
Fig. 4. The unbiased estimator for $L = 1600$ observations and $n = 4$, with
varying values of $\sigma$. Model (3). The rank of $H$ was 4.

VI. CONCLUSION
In this paper, we have shown that free probability provides a neat
framework for estimating the channel capacity for certain MIMO
systems. In the case of highly time varying environments, where one
can rely only on a set of limited noisy measurements, we have provided an asymptotically unbiased estimator of the channel capacity.
A modified estimator called the unbiased estimator (in comparison
with the asymptotically unbiased estimator) was also introduced to
take into account the bias in the case of finite dimensions and was
proved to be adequate for low rank channel matrices. Moreover,
although the results are based on asymptotic claims (in the number
of observations), simulations show that the estimators work well for
a very low number of observations also. Even when considering
discrepancies such as phase drifts and phase off-set, the algorithm,
based on the unbiased estimator, provided very good performance.

ACKNOWLEDGMENT
Mérouane Debbah is supported by Alcatel-Lucent within the
Alcatel-Lucent Chair on flexible radio at SUPELEC.

REFERENCES
[1] E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans.
Telecomm. ETT, vol. 10, no. 6, pp. 585–596, Nov. 1999.
[2] T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, “Random matrix
theories in quantum physics: Common concepts,” Physica Rep., vol. 299, pp. 190–, 1998.
[3] J.-P. Bouchaud and M. Potters, Theory of Financial Risks-From Statistical Physics to Risk Management. Cambridge: Cambridge University
Press, 2000.
[4] B. Dozier and J. W. Silverstein, “On the empirical distribution of
eigenvalues of large dimensional information-plus-noise type matrices,”
J. Multivariate Anal., vol. 98, no. 4, pp. 678–694, 2007.
[5] Ø. Ryan and M. Debbah, “Multiplicative free convolution and
information-plus-noise type matrices,” 2007,
http://arxiv.org/abs/math.PR/0702342.
[6] ——, “Free deconvolution for signal processing applications,”
Submitted to IEEE Trans. on Information Theory, 2007,
http://arxiv.org/abs/cs.IT/0701025.
[7] J. P. Kermoal, L. Schumacher, K. I. Pedersen, P. E. Mogensen, and
F. Frederiksen, “A stochastic MIMO radio channel model with experimental validation,” IEEE Journal on Selected Areas in Communications,
vol. 20, no. 6, pp. 1211–1225, 2002.
[8] F. Hiai and D. Petz, The Semicircle Law, Free Random Variables and
Entropy. American Mathematical Society, 2000.
[9] A. M. Tulino and S. Verdú, Random Matrix Theory and Wireless
Communications. www.nowpublishers.com, 2004.
[10] Ø. Ryan and M. Debbah, “Channel capacity estimation using free
probability theory,” Submitted to IEEE Trans. Signal
Process., 2007, http://arxiv.org/abs/0707.3095.
[11] Ø. Ryan, Tools for estimating channel capacity, 2007,
http://ifi.uio.no/~oyvindry/channelcapacity/.
[12] ——, Computational tools for free convolution, 2007,
http://ifi.uio.no/~oyvindry/freedeconvsignalprocapps/.