Bounds on the Capacity of the Additive Inverse
Gaussian Noise Channel
Hui-Ting Chang and Stefan M. Moser
Department of Electrical and Computer Engineering
National Chiao Tung University (NCTU)
Hsinchu, Taiwan
Email: {hui-ting.chang,stefan.moser}@ieee.org
Abstract—A recently proposed model describing communication based on the exchange of chemical molecules in a drifting liquid medium is investigated, and new analytical upper and lower bounds on the capacity are presented. The bounds are asymptotically tight, i.e., if the average-delay constraint is loosened to infinity or if the drift velocity of the liquid medium tends to infinity, the corresponding asymptotic capacities are derived precisely.
I. INTRODUCTION
Recently, Srinivas, Adve, and Eckford [1] presented an intriguing new channel model describing communication in a fluid medium via the emission of molecules. The basic idea is that in certain situations, e.g., when nano devices try to communicate with each other or with some receiving station, it might not be possible to use traditional signal propagation via electromagnetic waves because, for example, the nano device is too small to accommodate the minimal necessary antenna size or does not possess enough power.
So the question arises as to how communication could be
established in such a setup. If we assume that the nano device
is inside a certain liquid medium (e.g., blood in a blood
vessel), then one can think of communication based on the
emission of chemical substances. Such a system, of course,
will behave fundamentally differently from the usual information transmission systems. It is therefore a very interesting task to
try to model this communication scenario and analyze it.
The work in [1] is a first attempt at this: there the additive inverse Gaussian noise (AIGN) channel is introduced. The model is simplistic and neglects many properties of a real system; nevertheless, it also shows a Shannon-like beauty because it simplifies as much as possible without losing the essentials. It definitely is of fundamental importance with big impact on practice, seeing that nano devices are a very hot topic in current research worldwide. It also has huge potential for future research as it allows one to gradually incorporate additional effects that might have an influence on a real system.
While we believe that the main and most important contribution of [1] is the description of the AIGN channel, the
authors also present in [1] a first analysis of the channel’s
capacity behavior. In this paper, we build on these results.
We will present new upper and lower bounds on capacity and
establish the exact asymptotic capacity in the cases when either
the drift velocity of the fluid or the average-delay constraint
of the transmitter tends to infinity.
The remainder of this paper is structured as follows. In
Section II we will introduce the channel model and its underlying assumptions in more detail. In Section III we summarize
a couple of mathematical properties and identities related to
the given model, and Section IV reviews the bounds of [1].
In Section V we present our new upper and lower bounds
on capacity, while the exact asymptotic capacity formulas
are given in Section VI. Finally, Section VII outlines the
basic ideas underlying the derivations, and in Section VIII
we discuss the results.
In this paper all logarithms log(·) refer to the natural
logarithm, i.e., all results are stated in nats. For engineering
reasons, though, we have scaled the plots given in Figs. 1–4
to be in bits.
II. CHANNEL MODEL
The basic idea of the system under consideration is that a
transmitter emits a molecule at a certain time into a fluid that
itself is drifting at constant speed. The molecule is then transported by the fluid and by its inherent Brownian motion in random directions. Once it reaches the receiver, the molecule
is removed from the fluid. To simplify our model, we make
the following assumptions:
• The Brownian motion is described by a Wiener process with variance σ².
• We only consider a one-dimensional setup where the position of the molecule is described by a single random coordinate W. The transmitter is set at coordinate 0 and the receiver is along the moving direction of the fluid at a certain distance d. Without loss of generality we will set d = 1.
• The fluid is moving with constant drift velocity v > 0.
• Once the molecule reaches W = d, it is absorbed and not released again.
• The transmitter and receiver are perfectly synchronized and have potentially infinite time to wait for the arrival of the molecules.
• There are no other molecules of the same type interfering with the communication.
• The channel is memoryless, i.e., the trajectories of individual information-carrying molecules are independent.
Moreover, in case one molecule overtakes another, the receiver can still distinguish between them and put them into the correct order.
The basic ideas behind these simplifications are related to
Shannon’s approach when he introduced the additive Gaussian
noise channel [2] and also focused solely on the impact
of the noise, but neglected many other aspects like delay,
synchronization, or interference.
A Wiener process is described by independent Gaussian
position increments, i.e., for any time interval τ , the increment
in position ∆W is Gaussian distributed with mean vτ and variance σ²τ. The increments of nonoverlapping intervals are
assumed to be independent. Here, v denotes the drift velocity
of the fluid and σ² is a channel parameter that describes the
strength of the Brownian motion and relates to the viscosity
of the fluid, the temperature, the size and structure of the
molecules, etc.
In our setup of the communication, the positions of transmitter and receiver are fixed, i.e., we need to “invert” the Wiener process to describe the random time it takes for the molecule to travel from position 0 to position d = 1. This random time N has an inverse Gaussian distribution¹ that is described by its probability density function (PDF)

   f_N(n) = √(λ/(2πn³)) exp(−λ(n − µ)²/(2µ²n)) I{n > 0}.   (1)

Here, I{·} denotes the indicator function that takes on the values 1 or 0 depending on whether the statement holds or not. The PDF (1) depends on two parameters: µ denotes the average traveling time

   µ = d/v = 1/v   (2)

and λ relates to the Brownian motion parameter σ² via

   λ = d²/σ² = 1/σ².   (3)

Usually we write N ∼ IG(µ, λ).

¹The name is a bit unfortunate: it is called “inverse” because we have “inverted” the problem of random position for given time to random time for given position. However, an inverse Gaussian random variable has nothing to do with 1/G for G being Gaussian distributed.

To transport the information, the transmitter is now assumed to modulate the emission time X of its molecule (time-position modulation). After emission, the molecule takes the random time N to travel to the receiver, i.e., the receiver registers the arrival time

   Y = X + N,   (4)

where X and N are independent, X ⊥⊥ N. Hence, given some emission time x ≥ 0, the channel output has a conditional PDF

   f_{Y|X}(y|x) = √(λ/(2π(y − x)³)) exp(−λ(y − x − µ)²/(2µ²(y − x))) I{y > x}.   (5)

In addition to the nonnegativity constraint on the input X,

   X ≥ 0,   (6)

for practicability reasons, the transmitter is also subject to an average-delay constraint m:

   E[X] ≤ m.   (7)

Note that other constraints are possible, e.g., it would be quite reasonable to constrain the maximum delay. We have deferred the investigation of such assumptions to our future research.

We refer to the model (4) above as the additive inverse Gaussian noise (AIGN) channel. It is straightforward to see that Shannon’s Channel Coding Theorem [2] can be applied to (4), resulting in a capacity

   C = max_{f_X : E[X] ≤ m} I(X; Y).   (8)

Note that the capacity depends strongly on the two most important channel parameters: the allowed average delay m and the fluid’s drift velocity v.
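As a quick illustration of the model, one can generate first-passage times of the drifting Wiener process and compare them against the inverse Gaussian PDF (1). The following is a minimal simulation sketch (not from the paper); it assumes SciPy's parametrization invgauss(µ/λ, scale=λ) for IG(µ, λ) and uses the illustrative values v = 2 and λ = 1/4 from Fig. 1:

    import numpy as np
    from scipy import stats

    d, v, sigma2 = 1.0, 2.0, 4.0        # distance, drift velocity, Brownian parameter
    mu, lam = d / v, d**2 / sigma2      # (2) and (3): mu = 0.5, lambda = 0.25

    # Euler simulation of first-passage times of the drifting Wiener process.
    rng = np.random.default_rng(0)
    dt, n_paths = 1e-3, 2000
    samples = np.empty(n_paths)
    for i in range(n_paths):
        w, t = 0.0, 0.0
        while w < d:
            w += v * dt + np.sqrt(sigma2 * dt) * rng.standard_normal()
            t += dt
        samples[i] = t

    # Compare with IG(mu, lam) from (1); the small discretization bias vanishes as dt -> 0.
    print("mean:", samples.mean(), "theory:", mu)             # E[N] = mu
    print("var: ", samples.var(), "theory:", mu**3 / lam)     # Var(N) = mu^3 / lam
    print("KS:  ", stats.kstest(samples,
                                stats.invgauss(mu / lam, scale=lam).cdf).statistic)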
III. MATHEMATICAL PRELIMINARIES

In [1] the authors state the differential entropy of N ∼ IG(µ, λ) using some complicated expression involving derivatives with respect to the order of modified Bessel functions of the second kind K_ν. Luckily, these expressions can be simplified considerably:

   h(N) = (1/2) log(2πeµ³/λ) + (3/2) e^{2λ/µ} Ei(−2λ/µ),   (9)

where Ei(·) denotes the exponential integral function

   Ei(−t) ≜ −∫_t^∞ (e^{−τ}/τ) dτ.   (10)

The general moments of N are given as

   E[N^ν] = √(2λ/π) e^{λ/µ} µ^{ν−1/2} K_{ν−1/2}(λ/µ),   ν ∈ R.   (11)

In particular, this means that

   E[N] = µ,
   E[N²] = µ² + µ³/λ,
   Var(N) = µ³/λ,
   E[1/N] = 1/µ + 1/λ,   (12)
   E[1/N²] = 1/µ² + 3/(µλ) + 3/λ²,   (13)
   Var(1/N) = 1/(µλ) + 2/λ².   (14)

Moreover, we have

   E[log N] = log µ + e^{2λ/µ} Ei(−2λ/µ).   (15)
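These identities are easy to confirm numerically. The following short sketch (our own check, with illustrative parameter values) compares the closed forms (9), (11), and (15) against direct numerical integration of the PDF (1):

    import numpy as np
    from scipy import integrate, special

    mu, lam = 0.5, 0.25                  # illustrative values (v = 2, lambda = 1/4)

    def log_f(n):                        # log of the IG PDF (1)
        return (0.5 * np.log(lam / (2 * np.pi * n**3))
                - lam * (n - mu)**2 / (2 * mu**2 * n))

    # h(N): numerical integral versus the closed form (9).
    h_num = integrate.quad(lambda n: -np.exp(log_f(n)) * log_f(n), 1e-9, np.inf)[0]
    h_cls = (0.5 * np.log(2 * np.pi * np.e * mu**3 / lam)
             + 1.5 * np.exp(2 * lam / mu) * special.expi(-2 * lam / mu))
    print("h(N):", h_num, h_cls)

    # General moments (11), including the negative orders behind (12) and (13).
    for nu in (1.0, 2.0, -1.0, -2.0):
        m_num = integrate.quad(lambda n: n**nu * np.exp(log_f(n)), 1e-9, np.inf)[0]
        m_cls = (np.sqrt(2 * lam / np.pi) * np.exp(lam / mu)
                 * mu**(nu - 0.5) * special.kv(nu - 0.5, lam / mu))
        print("E[N^%g]:" % nu, m_num, m_cls)

    # E[log N] versus (15).
    e_num = integrate.quad(lambda n: np.log(n) * np.exp(log_f(n)), 1e-9, np.inf)[0]
    e_cls = np.log(mu) + np.exp(2 * lam / mu) * special.expi(-2 * lam / mu)
    print("E[log N]:", e_num, e_cls)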
Similarly to Gaussian random variables, inverse Gaussians are closed under scaling: for any α > 0,

   N ∼ IG(µ, λ)  ⟹  αN ∼ IG(αµ, αλ).   (16)

However, while the sum of two independent Gaussians is Gaussian again, this property only holds for inverse Gaussians with similar parameters: if

   λ₁/µ₁² = λ₂/µ₂²,   (17)

then it holds that for N₁ ∼ IG(µ₁, λ₁) and N₂ ∼ IG(µ₂, λ₂) with N₁ ⊥⊥ N₂,

   N₁ + N₂ ∼ IG(µ₁ + µ₂, (√λ₁ + √λ₂)²).   (18)
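Both closure properties can be checked by simulation. A minimal sketch (parameter values are our own; a Kolmogorov–Smirnov statistic near zero indicates agreement):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    def ig_rvs(mu, lam, size):
        # IG(mu, lam) samples; SciPy's invgauss(mu/lam, scale=lam) is IG(mu, lam).
        return stats.invgauss(mu / lam, scale=lam).rvs(size=size, random_state=rng)

    mu1, lam1 = 0.5, 0.25
    mu2 = 2.0
    lam2 = lam1 * (mu2 / mu1)**2                  # enforce condition (17)
    s = ig_rvs(mu1, lam1, 100_000) + ig_rvs(mu2, lam2, 100_000)

    # (18): the sum should again be inverse Gaussian.
    lam_s = (np.sqrt(lam1) + np.sqrt(lam2))**2
    ks = stats.kstest(s, stats.invgauss((mu1 + mu2) / lam_s, scale=lam_s).cdf)
    print(ks.statistic)

    # (16): scaling closure, alpha * N ~ IG(alpha*mu, alpha*lam).
    alpha = 3.0
    ks2 = stats.kstest(alpha * ig_rvs(mu1, lam1, 100_000),
                       stats.invgauss(mu1 / lam1, scale=alpha * lam1).cdf)
    print(ks2.statistic)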
Finally, it is interesting to note that the inverse Gaussian distribution is differential-entropy maximizing when the following three constraints are given:

   E[N] = α₁,   E[N⁻¹] = α₂,   E[log N] = α₃.   (19)
IV. KNOWN BOUNDS ON CAPACITY
In [1], two analytical bounds on capacity are presented. Firstly, an upper bound is derived based on the fact that the differential entropy h(Y) under an average constraint E[Y] ≤ m + µ is maximized by an exponential distribution:

   h(Y) ≤ 1 + log(m + µ).   (20)

This leads to the bound

   C = max{h(Y) − h(N)}   (21)
     ≤ (1/2) log(λe(m + µ)²/(2πµ³)) − (3/2) e^{2λ/µ} Ei(−2λ/µ).   (22)

Fig. 1. Bounds on capacity as a function of m. The drift velocity has been set to v = 2 and the channel parameter λ is assumed to be λ = 1/4.
Secondly, a lower bound is given that is based on (18). In the definition of capacity the maximization is dropped and instead a suboptimal input X ∼ IG(m, λm²/µ²) is chosen. Note that this choice makes sure that X and N satisfy (17), i.e., we get Y ∼ IG(m + µ, λ(m + µ)²/µ²). This yields

   C ≥ h(Y) − h(N)   (23)
     = (1/2) log((m + µ)/µ) + (3/2) e^{2λ(m+µ)/µ²} Ei(−2λ(m + µ)/µ²) − (3/2) e^{2λ/µ} Ei(−2λ/µ).   (24)

Both bounds are depicted in Figs. 1 and 2 as a function of the average-delay constraint m and in Figs. 3 and 4 as a function of the drift velocity v, respectively.

Fig. 2. Bounds on capacity as a function of m identical to Fig. 1, but zoomed in at low values of m.
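Both known bounds are straightforward to evaluate numerically. The following sketch transcribes (22) and (24) (implementation and printed values in bits are ours; the parameters match the setting of Fig. 1):

    import numpy as np
    from scipy.special import expi

    def upper_22(m, mu, lam):
        # Known upper bound (22), in nats.
        return (0.5 * np.log(lam * np.e * (m + mu)**2 / (2 * np.pi * mu**3))
                - 1.5 * np.exp(2 * lam / mu) * expi(-2 * lam / mu))

    def lower_24(m, mu, lam):
        # Known lower bound (24), based on the suboptimal IG input, in nats.
        # (For very large arguments, exp(a) * expi(-a) should be replaced by an
        # asymptotic expansion to avoid overflow.)
        a = 2 * lam * (m + mu) / mu**2
        return (0.5 * np.log((m + mu) / mu)
                + 1.5 * np.exp(a) * expi(-a)
                - 1.5 * np.exp(2 * lam / mu) * expi(-2 * lam / mu))

    mu, lam = 0.5, 0.25                  # v = 2 and lambda = 1/4 as in Fig. 1
    for m in (0.5, 2.0, 10.0):
        print(m, upper_22(m, mu, lam) / np.log(2),
              lower_24(m, mu, lam) / np.log(2))    # values in bits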
V. NEW BOUNDS ON CAPACITY
In the following we will present our bounds on capacity.
Similarly to Section IV, we will state the results using only
the channel parameters µ and λ as well as the delay constraint
m.
From an engineering point of view, there are two interesting
scenarios: we can either plot the capacity as a function of
the given average-delay constraint m or as a function of
the given drift velocity v. The former corresponds to the
traditional situation of capacity as a function of the cost,
which usually is power, but here has become delay. The
latter is less traditional as the drift velocity is a channel
parameter. However, information theorists often consider the power constraint also as being “part of the channel”, i.e., belonging to that part of a system that the system designer has no control over. So, it is actually not that unorthodox to plot the capacity as a function of some channel parameter. The adaptations of the given analytical formulas for these two cases are straightforward using (2).
We start with some upper bounds.
Theorem 1: The capacity of the AIGN channel as defined in (4) is upper-bounded by each of the following three bounds:
   C ≤ (1/2) log(1 + λm/(µ(m + µ))) + (3/2) log(1 + m(1/µ + 1/λ));   (25)
   C ≤ ((η − 1)/2) log λ + (1/2) log µ + (1/2) e^{2λ/µ} Ei(−2λ/µ)
       + (1/2) log( √(2λ/π) e^{λ/µ} µ^{−η−1/2} K_{η+1/2}(λ/µ) − (m + µ)^{−η} )
       + ((η + 2)/2) log(1 + m(1/µ + 1/λ)) − log η;   (26)
   C ≤ (1/2) log(λ/µ) + (1/2) log(1 + m(1/µ + 1/λ)) + log(1 + m/µ) − λ/(2(µ + λ)) − e^{2λ/µ} Ei(−2λ/µ).   (27)
Here, Ei(·) is defined in (10) and K_ν(·) represents the order-ν modified Bessel function of the second kind. In the second
bound (26), 0 ≤ η ≤ 1 is a parameter that can be optimized
over. A suboptimal, but good choice for η is
   η ≜ min{ 2/(log(1 + m/λ) + log 2 + γ), 1 },   (28)
where γ ≈ 0.577 denotes Euler’s constant.
These bounds are shown in Figs. 1–4.
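For numerical evaluation, e.g., to reproduce the curves in the figures, the closed-form bounds translate directly into code. The sketch below simply transcribes (25) and (27) as printed above (in nats, converted to bits when printing; parameter values are the illustrative ones of Fig. 1, and SciPy's expi stands in for Ei(·)):

    import numpy as np
    from scipy.special import expi

    def upper_25(m, mu, lam):
        # Transcription of bound (25), in nats.
        return (0.5 * np.log(1 + lam * m / (mu * (m + mu)))
                + 1.5 * np.log(1 + m * (1 / mu + 1 / lam)))

    def upper_27(m, mu, lam):
        # Transcription of bound (27), in nats.
        return (0.5 * np.log(lam / mu)
                + 0.5 * np.log(1 + m * (1 / mu + 1 / lam))
                + np.log(1 + m / mu)
                - lam / (2 * (mu + lam))
                - np.exp(2 * lam / mu) * expi(-2 * lam / mu))

    mu, lam = 0.5, 0.25                  # v = 2 and lambda = 1/4 as in Fig. 1
    for m in (0.01, 0.1, 2.0, 10.0):
        print(m, upper_25(m, mu, lam) / np.log(2),
              upper_27(m, mu, lam) / np.log(2))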
Next we will state a lower bound.

Theorem 2: The capacity of the AIGN channel as defined in (4) is lower-bounded by the following bound:

   C ≥ log(m/λ) + µ/m − λ/µ + kλ + (3/2) log(λ/µ) − (3/2) e^{2λ/µ} Ei(−2λ/µ)
       − log( 1 + (1/m) e^{λ/µ} √(λm/(2 + k²λm)) K₁(√(2λ/m + k²λ²))
              + (1/(2m)) e^{λ/µ + kλ} √(λm/(1 + k²λm)) K₁(2√(λ/m + k²λ²)) )
       − (1/2) log(2π/e),   (29)

where

   k ≜ √(1/µ² − 2/(mλ))   (30)

must be real, i.e., the bound is only valid if

   m ≥ 2µ²/λ.   (31)

Note that this lower bound can be simplified considerably, but for the price of losing its tightness for large values of m or v: it can be shown that the complicated second-to-last term (second and third row) can be lower-bounded by −log(1 + 1/2):

   C ≥ log(m/λ) + µ/m − λ/µ + kλ + (3/2) log(λ/µ) − (3/2) e^{2λ/µ} Ei(−2λ/µ) − log(3/2) − (1/2) log(2π/e).   (32)
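Despite its bulk, the lower bound is a direct transcription job for numerical work. The sketch below implements (29) and (32) literally as printed above (in nats; parameter values are the illustrative ones used in the figures), guarding the validity condition (31):

    import numpy as np
    from scipy.special import expi, k1

    def lower_29(m, mu, lam):
        # Transcription of the lower bound (29); valid only when (31) holds.
        if m < 2 * mu**2 / lam:
            return -np.inf                           # condition (31) violated
        k = np.sqrt(1 / mu**2 - 2 / (m * lam))       # definition (30)
        ei = np.exp(2 * lam / mu) * expi(-2 * lam / mu)
        first = (np.log(m / lam) + mu / m - lam / mu + k * lam
                 + 1.5 * np.log(lam / mu) - 1.5 * ei)
        bracket = (1
                   + np.exp(lam / mu) / m
                     * np.sqrt(lam * m / (2 + k**2 * lam * m))
                     * k1(np.sqrt(2 * lam / m + k**2 * lam**2))
                   + np.exp(lam / mu + k * lam) / (2 * m)
                     * np.sqrt(lam * m / (1 + k**2 * lam * m))
                     * k1(2 * np.sqrt(lam / m + k**2 * lam**2)))
        return first - np.log(bracket) - 0.5 * np.log(2 * np.pi / np.e)

    def lower_32(m, mu, lam):
        # Simplified bound (32): the -log(1 + ...) term replaced by -log(3/2).
        if m < 2 * mu**2 / lam:
            return -np.inf
        k = np.sqrt(1 / mu**2 - 2 / (m * lam))
        ei = np.exp(2 * lam / mu) * expi(-2 * lam / mu)
        return (np.log(m / lam) + mu / m - lam / mu + k * lam
                + 1.5 * np.log(lam / mu) - 1.5 * ei
                - np.log(1.5) - 0.5 * np.log(2 * np.pi / np.e))

    mu, lam = 0.5, 0.25                              # v = 2, lambda = 1/4
    for m in (2.0, 5.0, 10.0):
        print(m, lower_29(m, mu, lam) / np.log(2),
              lower_32(m, mu, lam) / np.log(2))      # in bits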
Fig. 3. Bounds on capacity as a function of v. The average-delay constraint has been set to m = 2 and the channel parameter λ is assumed to be λ = 1/4.

Fig. 4. Bounds on capacity as a function of v identical to Fig. 3, but zoomed in at low values of v.

VI. ASYMPTOTIC CAPACITIES
The upper bound (22) and the lower bound (29) are asymptotically tight, i.e., when we let v or m tend to infinity, these
two bounds coincide. Hence, we can state the exact asymptotic
capacity.
Theorem 3: The capacity of the AIGN channel as defined in (4) behaves asymptotically, when the average-delay constraint m is loosened to infinity while all other parameters are kept constant, as follows:

   lim_{m↑∞} ( C(m) − log m ) = (1/2) log(λe/(2πµ³)) − (3/2) e^{2λ/µ} Ei(−2λ/µ).   (33)
In the asymptotic regime when the drift velocity v of the fluid medium tends to infinity while all other parameters are kept constant, the capacity is as follows:

   lim_{v↑∞} ( C(v) − (3/2) log v ) = (1/2) log(λm²e/(2π)).   (34)
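Since the upper bound (22) is asymptotically tight, its offsets must converge to the constants of Theorem 3. The following sketch checks this numerically (the helper exi is our own addition, switching to a two-term asymptotic expansion of e^x Ei(−x) for large x to avoid overflow):

    import numpy as np
    from scipy.special import expi

    def exi(x):
        # e^x * Ei(-x); asymptotic expansion for large x to avoid overflow.
        x = float(x)
        if x < 50:
            return np.exp(x) * expi(-x)
        return -1 / x + 1 / x**2 - 2 / x**3

    def upper_22(m, mu, lam):            # known bound (22), asymptotically tight
        return (0.5 * np.log(lam * np.e * (m + mu)**2 / (2 * np.pi * mu**3))
                - 1.5 * exi(2 * lam / mu))

    mu, lam, m = 0.5, 0.25, 2.0

    # (33): C(m) - log m approaches a constant as m grows.
    limit_33 = 0.5 * np.log(lam * np.e / (2 * np.pi * mu**3)) - 1.5 * exi(2 * lam / mu)
    for mm in (1e1, 1e3, 1e5):
        print(mm, upper_22(mm, mu, lam) - np.log(mm), "->", limit_33)

    # (34): C(v) - (3/2) log v approaches a constant as v grows (mu = 1/v by (2)).
    limit_34 = 0.5 * np.log(lam * m**2 * np.e / (2 * np.pi))
    for v in (1e1, 1e3, 1e5):
        print(v, upper_22(m, 1 / v, lam) - 1.5 * np.log(v), "->", limit_34)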
VII. PROOF OUTLINES
A. Upper Bounds
The upper bounds on capacity are all based on the duality
technique that we have successfully used in our previous work,
see, e.g., [3] or [4]. For an arbitrary choice of a distribution
R(·) on the channel output alphabet, we have
   C ≤ E_{Q*}[ D( f_{Y|X}(·|X) ‖ R(·) ) ],   (35)
where D(·‖·) is the relative entropy [5] and Q∗ denotes the
(unknown!) capacity-achieving input distribution. To be able
to use this technique, we need to find an elaborate choice of
R(·) that is simple enough to allow the evaluation of (35),
but that at the same time is good enough to result in a close
bound. Moreover, we need to find ways of dealing with the
expectation over the unknown Q∗ .
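As a concrete example, choosing R(·) exponential with mean m + µ makes (35) easy to evaluate: the divergence then depends on the input only through E[X], so that for any input with E[X] = m the expectation equals the value of (22). A short Monte-Carlo sketch of this (our own verification code; the exponential input used here is just one admissible choice of Q):

    import numpy as np
    from scipy import stats
    from scipy.special import expi

    mu, lam, m = 0.5, 0.25, 2.0
    rng = np.random.default_rng(2)
    N_dist = stats.invgauss(mu / lam, scale=lam)       # IG(mu, lam) in SciPy terms

    x = rng.exponential(m, 200_000)                    # some input with E[X] = m
    n = N_dist.rvs(size=x.size, random_state=rng)
    y = x + n

    # R(.) exponential with mean m + mu; note f_{Y|X}(y|x) = f_N(y - x).
    log_R = -np.log(m + mu) - y / (m + mu)
    d_hat = np.mean(N_dist.logpdf(n) - log_R)          # Monte-Carlo estimate of (35)

    bound_22 = (0.5 * np.log(lam * np.e * (m + mu)**2 / (2 * np.pi * mu**3))
                - 1.5 * np.exp(2 * lam / mu) * expi(-2 * lam / mu))
    print("E_Q[D(f_{Y|X} || R)] ~", d_hat, "   bound (22):", bound_22)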
As discussed in Section IV, the basic idea of the upper
bound (22) was to upper-bound the output entropy by its
maximum possible value, which will be achieved if the output
happens to be exponentially distributed. It therefore does not
come as a surprise that if we choose R(·) to be an exponential
distribution, we can fully rederive (22) from (35).
The upper bound (25) is based on R(·) being an IG distribution, which explains why for small m it gets close to the lower bound (24) (which is based on the IG distribution, too).
The other two upper bounds (26) and (27) stem from
different versions of a derivation based on R(·) being a power
inverse Gaussian distribution [6]: for y > 0,
   R(y) = √(α/(2πβ³)) (β/y)^{1+η/2} exp( −(α/(2η²β)) [ (y/β)^{η/2} − (β/y)^{η/2} ]² ),   (36)
where α, β > 0, and η ∈ R \ {0} are free parameters. The
family of power inverse Gaussian distributions contains the IG
distribution as a special case for the choice η = 1.
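Two quick numerical checks of (36), namely that it is a proper density for any admissible η and that η = 1 indeed recovers the IG density (1), can be scripted as follows (a sketch with arbitrary parameter values of our own choosing):

    import numpy as np
    from scipy import integrate, stats

    def power_ig_pdf(y, alpha, beta, eta):
        # Power inverse Gaussian density (36).
        u = (y / beta)**(eta / 2) - (beta / y)**(eta / 2)
        return (np.sqrt(alpha / (2 * np.pi * beta**3)) * (beta / y)**(1 + eta / 2)
                * np.exp(-alpha / (2 * eta**2 * beta) * u**2))

    alpha, beta = 0.25, 0.5
    for eta in (0.3, 1.0, 2.5):
        total = integrate.quad(lambda y: power_ig_pdf(y, alpha, beta, eta),
                               1e-12, np.inf)[0]
        print(eta, total)                    # should be ~1 for every eta

    # eta = 1 recovers the IG(beta, alpha) density of (1).
    y = np.linspace(0.05, 3, 5)
    print(power_ig_pdf(y, alpha, beta, 1.0))
    print(stats.invgauss(beta / alpha, scale=alpha).pdf(y))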
B. Lower Bound

The lower bound was inspired by the fact that (22) is implicitly based on an output that is exponentially distributed. For large v or m, the impact of the noise N will decrease, i.e., it is a good guess that an exponential input might work well. The lower bound follows from a lengthy derivation based on the PDF of Y when in (4) X ∼ Exp(1/m) [7]:

   f_Y(y) = (1/m) e^{−y/m + λ/µ} [ e^{−kλ} Q( −√(kλ) (√(ky) − 1/√(ky)) ) + e^{kλ} Q( √(kλ) (√(ky) + 1/√(ky)) ) ],   (37)

where k is defined in (30) and where this form of the PDF is only valid if condition (31) is satisfied. Here, Q(·) denotes the Q-function defined as

   Q(α) ≜ (1/√(2π)) ∫_α^∞ e^{−t²/2} dt.   (38)
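The density (37) also lends itself to a direct Monte-Carlo estimate of the mutual information achieved by the exponential input, which by (8) is a lower bound on capacity. A sketch (our own code; m is chosen so that (31) holds strictly):

    import numpy as np
    from scipy import stats

    mu, lam, m = 0.5, 0.25, 3.0                  # (31): m >= 2 mu^2 / lam = 2
    k = np.sqrt(1 / mu**2 - 2 / (m * lam))       # (30)
    rng = np.random.default_rng(3)
    N_dist = stats.invgauss(mu / lam, scale=lam)
    Q = stats.norm.sf                            # the Q-function (38)

    def f_Y(y):                                  # output PDF (37) for X ~ Exp(1/m)
        r = np.sqrt(k * y)
        return (np.exp(-y / m + lam / mu) / m
                * (np.exp(-k * lam) * Q(-np.sqrt(k * lam) * (r - 1 / r))
                   + np.exp(k * lam) * Q(np.sqrt(k * lam) * (r + 1 / r))))

    x = rng.exponential(m, 300_000)
    n = N_dist.rvs(size=x.size, random_state=rng)
    y = x + n

    # Sanity check of (37): E[Y] should be m + mu ...
    print("mean check:", y.mean(), "vs", m + mu)

    # ... and I(X;Y) = E[log f_{Y|X}(Y|X) - log f_Y(Y)] for this (suboptimal)
    # input is a lower bound on capacity.
    i_hat = np.mean(N_dist.logpdf(n) - np.log(f_Y(y)))
    print("I(X;Y) with X ~ Exp(1/m):", i_hat / np.log(2), "bits")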
VIII. DISCUSSION

We should point out that in [1] the authors have already concluded from numerical analysis that their upper bound (22) is very tight. We have now formally proven this by providing an analytical lower bound that is tight in the asymptotic regime.

Due to the tightness of the known upper bound (22), it obviously is very difficult to find improved upper bounds. So, our focus in the search for upper bounds lay mainly on the low-delay and low-velocity regimes. We tried in particular to find a bound that would show that the strange behavior of (22), which increases again as v ↓ 0, is an artifact of the bounding technique. We were able to find bounds that were strictly better than (22) and, in particular, we did find bounds that grew more or less monotonically in v, as we would expect. However, there is still considerable room for improvement.

Note that the capacity for v = 0 is strictly larger than zero because even if there is no drift, the molecules still have a positive probability of arriving due to the Brownian motion. It is very weird that in this situation of v = 0, the noise that hurts communication at large speeds becomes the only means of communication and is therefore highly beneficial!

In the case when the capacity is analyzed as a function of the average-delay constraint m, the general picture is better. While the bound (22) remains strictly bounded away from zero, we have found upper bounds that tend to zero as m ↓ 0 (both (25) and (26), but the former is considerably simpler). Unfortunately, the slopes of convergence of our upper bound (25) and of the lower bound (24) do not coincide. The exact asymptotic capacity for v = 0 and the capacity growth rates for v ↓ 0 and m ↓ 0 are projects of our future research.

ACKNOWLEDGMENTS

This work was supported by the National Science Council under NSC 100–2628–E–009–003.
REFERENCES
[1] K. V. Srinivas, R. S. Adve, and A. W. Eckford, “Molecular communication in fluid media: The additive inverse Gaussian noise channel,” Dec. 2010, arXiv:1012.0081v2 [cs.IT]. [Online]. Available: http://arxiv.org/abs/1012.0081v2
[2] C. E. Shannon, “A mathematical theory of communication,” Bell System
Techn. J., vol. 27, pp. 379–423 and 623–656, Jul. and Oct. 1948.
[3] A. Lapidoth and S. M. Moser, “Capacity bounds via duality with
applications to multiple-antenna systems on flat fading channels,” IEEE
Trans. Inf. Theory, vol. 49, no. 10, pp. 2426–2467, Oct. 2003.
[4] A. Lapidoth, S. M. Moser, and M. A. Wigger, “On the capacity of free-space optical intensity channels,” IEEE Trans. Inf. Theory, vol. 55, no. 10, pp. 4449–4461, Oct. 2009.
[5] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed.
New York: John Wiley & Sons, 2006.
[6] T. Kawamura and K. Iwase, “Characterizations of the distributions of
power inverse Gaussian and others based on the entropy maximization
principle,” J. Japan Statist. Soc., vol. 33, no. 1, pp. 95–104, Jan. 2003.
[7] W. Schwarz, “On the convolution of inverse Gaussian and exponential
random variables,” Commun. in Statistics — Theory and Methods, vol. 31,
no. 12, pp. 2113–2121, 2002.