by David A. Castanon

advertisement
ESL-P-822
"Sianalina and uncertainty:
a case study."
by
David A. Castanon
Nils R. Sandell
Abstract:
This paper studies the well-known counter example of Witsenhausen
when the initial uncertainty is small. Using an asymptotic approach, it is
established that linear strategies are asymptotically optimal over a large
class of nonlinear strategies. This serves as a guideline for optimal
solutions of non-classical problems with very noisy communication channels.
Acknowledge Contacts DOE-E(49-18)-2087, ONR-N0014-C-0346
Electronic Systems Laboratory
Massachusetts Institute of Technology
Cambridge, MA 02139 Room 35-308
*Submitted to 1978 IEEE Conference on Decision and Control, and for publication
in the IEEE Trans. on automatic control
1. Introduction
The key theoretical result in classical discrete time stochastic
control is the so-called nonlinear separation theoren[l,2].
This theorem
implies that the optimal control laws depend on the past observations and
controls only through the conditional probability distribution of the
present state, given these variables, and that this distribution is obtained by solution of a nonlinear filtering problem independent of the
choice of control laws.
stochastic control
blem.
The most important practical result in classical
is the solution of the linear-quadratic-Gaussian pro-
This solution is a linear feedback of Kalman filter state estimates
which is easily computed and implemented, and thus, has served as a basis
for practical control law synthesis
[3].
One of the key differences in decentralized control is the presence of
decision makers with different available information.
Decision problems
with "classical" information patterns are characterized by the presence of
one central decision maker endowed with perfect recall.
In finite-state
games, the absence of perfect recall gives rise to signalling information
sets.
Signalling is the act of transfering information from one decision
maker to another through the use of decision variables.
Perhaps the best
known example of signalling in decentralized control was presented by
Witsenhausen
[5].
This example is discussed in detail in this paper.
control-sharing team problem discussed by Sandell and Athans
[6] is another
example of how control actions are used to convey information.
-2-
The
2.
The Witsenhausen CounterexamPle
(5)
random variables with variances
Let x0 , V be zero-mean independentf
Consider the following stochastic control problem
and 1 respectively.
State equations
(1)
xl = xo + u0
Observation equations
x2
x
y0
xO
= x
Y
=
-
1
1
(2)
+V
k2 uO+
J
Admissible Controllers
uo = 'O (Y)
u =
(3)
x2-
Cost function
where yl, Y2
(4)
y1 (yl)
are Borel functions.
This information pattern is
known at stage 1.
Hence,
there is
nonclassical, since the value of Yv
information to signal,
Yl provides a vehicle of communication.
·
'
Fiqure I
Coontroller 2v
+
-'
.....
" 1(encoder)[
(e_ c
Problem:
-3-
not
Figcxe .1 provides a diagram of the
Noise
Co'trol'ler i
is
and the observation
decision problem involved.
Source
a
Minimize E{(x
-u
(decoder)
)'2 + ku
}
Note the similarities between the diagram in Figure 1 and classical
If one denoted Controller 1 as the encoder,
communications problems.
Controller 2 as the decoder, a classical communications problem would be
Minimize E
{(u1 - x0 )
}
E {u2} < C.
subject to
Thus, the Witsenhausen Counterexample is somewhat different in that one
does not try to reconstruct the source; instead one seeks to reconstruct
the output of the encoder.
Witsenhausen established that the optimal decoder was the conditional
estimate of xl given Yl'
However, the statistics of x l and y 1 depend on
the form of the encoder.
Assuming that V is Gaussian, the encoder problem
can be reduced to the following equivalent problem:
2
Let h(x) =
Df(y)
(27T e
=
)
; define
(5)
fh(y-f(x))dF(x)
Nf(y) = f f(x)h(y-f(x))dF(x)
(6)
gf(y) =
(7)
Nf(Y)
Df (Y)
*
*
J2(f) = J(f,gf)
(8)
where F(x) is the distribution of x0 .
Witsenhausen showed that gf(y) was
the conditional expectation of f(x) with respect to y.
The resulting
problem for the encoder was to minimize the expected value of J2(f) with
the choice of f(x) = x + yo(x).
E{J(f)}
2
Furthermore,
2
= 1 - I(Df)
+ k2 E
1 + k2 E
{ (x-f(x))
{ (x - f(x))2 }f
4-
-4-
}
cold D (y)
dyf
D (Y)
2
dy
(9)
where I(Df) is the Fisher information content of the density Df
density of y).
(the
Hence, the problem of choosing the optimal encoder
consists of a tradeoff between maximizing the information available
(signalling) and minimizing the use of control at the first stage.
Witsenhausen also proved the following results:
a. There exists an f(x) which minimizes E {J2 (f)J for arbitrary dis-.
tributions F(x).
b. The linear strategy f(x) = Xx
value of
X
is a stationary point for a particular
when F(x) is Gaussian.
c. For large values of G , the strategy f(x) = X x is not optimal.
There exist numerous stationary points, and the optimal strategies
Y01 Y1 are nonlinear.
When
Y is large, there is a lot of uncertainty in the systen.
In
this case, it becomes optimal to use signalling to reduce the cost, as
indicated by c above. However, it is not clear that signalling would be
so evident if channel capacity was limited or the distortion noise V had
a large variance.
Ideally, one would like to recognize problems where
signalling would be present in terms of the parameters of the problem.
The next section studies the Witsenhausen counterexample when the variance
G2 is small.
3. Asymptotic Analysis of the Witsenhausen Counterexample
Assume that
a is very small, and that F(x) is Gaussian.
Then, one
can obtain approximate expressions for the elements defined in equations
(5)-(8), representing them as power series in a
I(a,y)
defined by the integral equation
-5-
.
Consider a function
I(ar,y)
7
1
=
h(x,y)e-x /2
dx
(10)
Assume h(x,y) can be represented as
m
i
h(x,y)
lim
where
+ R(
ai(y)
R (x,y)
m
< M(y) for all y, and there exists b
0 such that
x
for all y,
(xy) ex
R
Definition 1 [71
/2b
dx <
C(y).
i
: The series
i=
of mth order for h(x,y) for x small if
m
h(x,y) lim
x >
for each y.
Lemma 1:
i=0
ai (y ) xi
i!
=0
xmI
The expansion is uniform if the limit is uniform in
y.
The series I(a,y) given by
m/2
I (ay)
=
a
k=
a2k
(11)
k
2
k!
is a uniform asymptotic expansion of I(a,y) for small a , and it is
uniform if M and C are independent of y.
Proof:
See Appendix.
Assume f(x) can be expanded in a series about x=O, as
m
i
f(x)
a. x
i=l
where
lim
+ R
(x)
i!
R (x)
m
= 0. Then, unsing Lemma 1 and equations (5)-(11) one
m
x
gets the following result (after considerable algebra):
x-+O
-6-
Theorem 1: An asymptotic expansion of J2 (f) to sixth order in a for small
is given by:
J 2 (f)
=
2 2
*
2
2
2
kao + a (k (al-l) + k a2
2
4
22
+ a
2
+ka4+
4
4k a3(al-1) + 4ala3
3 11 3
5k2a a
24
1
+ 2a2) +
2
6 (k a a6+l0k2a2 +
-2
0 6
3
24 2
+ 6k2 (a -l)a5 +6ala5+ 12a2a4 + 10a3
2
1
5
15
24
3
48ala3+ 24al
Proof:
4 a
)
+ o(
24a2a2
12
).
See Appendix.
The asymptotic expansion in Theorem 1 gives an approximate cost;
there are many strategies f(x), differing slightly, which yield the same
cost.
The difficulty in optimizing the expansion in Theorem 1 with
regards to the parameters a. of f is that the expansion is not valid for
arbitrary f.
This leads to the next theorem.
Theorem 2 : Suppose f is restricted to belong to a set r
expansion of Theorem 1 is uniformly valid.
f(x) =
k2__
k2
1 +
2
2
on which the
Then, the strategy
k4
+
x
2
(k
+ 1)
minimizes the asymptotic cost up to 6th order.
Proof:
See Appendix.
Typical conditions for obtaining the set r are uniform restrictions on
the magnitude of the coefficients a..
No such representation is given
here because it is unnecessarily restrictive.
-7-
4.
DISCUSSION
The asymptotic analysis of section 3 implies that, as the uncertainty
in the initial state decreases, linear controllers are optimal relative to
a class of nonlinear controllers.
lers with bounded coefficients.
This class includes polynomial controlThis suggests the existence of a region of
values for a where linear controllers are optimal in general.
Unfortunately,
the results are not as strong as one would like due to their asymptotic
nature and to the restrictions that must be placed on the class of admissible
control laws to obtain uniformly valid expansions.
Problems with nonclassical information pattern are of great importance,
while analytical results for these problems are virtually nonexistent.
This
paper
suggests that linear controllers are approximately optimal in
limiting cases, such as when there is little information to signal, or when
the channel is very noisy or has limited capacity.
-8-
Appendix
Proof of Lemma 1:
Consider the difference
+(,y)
2/22 002 dx
ak((x)
kk=o
1
/
_oo
a
a/>
-
=
I (a,y)
+ i
I(G,y)
r00
1
a
V2
I(a,y) .
k
From (10),
-x
2/2a2/2
dx
R (x,y) e
+
-
e-x
Rm (x,y) e
dx
-0
Hence
I(ay) -
I(O
Im.
,y)
1
f
1i'I
R (
)
Rm(xy)
-x
dx
/, '-o)m+ll
Now
limit
R (x,y)
<
_______
M(y),
so there exists
2-0O
(y) >
0
m+l
x
such that, for all Ixl < E(Y),
Rm
(x,y)|
< M(y)x
Thus
2
Rm
f
M(y)x
o
2 Tr a
+
co
R
2
V2 7T a
(xy)
x --2/2a2 dx
ex
dx
(E(Y)
m+l
< 2 M(y)a
(m+l)!
+
-9-
2C(y)
1
e-( 2
1
-2)
2
2(y)
e
dx
Thus,
limit I(
- I (I,y)
a->O I~a"Y) y)
am
and the limit is uniform if C and M,
(thus E(y)) are independent of y.
q.e.d.
Proof of Theorem 1:
Both Df(y) and Nf(Y)
Lemma 1.
are integrals of the form I(a,y) described in
Expansion of f(x) up to fifth order and straight substitution yields
g*f(Y)
~
f(o) + a2 R2 (Y) +
2
8
- 2R2 (y)d2 h(xy)
2
dx
R4 (y)
h(o,y)
+
6
48dx
R6
(y )
- 3R4
d 2 h(x,y)
(y )
d2
2
0
h(o,y)
6R 2 (y)/d
2
2(Y
+
2
h(x,y)
h(xy)2-
o 3R 2 (y) d
0
~dx
h(o,
where
R2 n (Y) =d
2n
(f(x) - f(o)) h(x,y)
d2 n
2
h(o,y)
and
h(x,y) = e/1
Now, from
(5),
2
(f(x)-y)
(6) and
d D (Y) = g*(y)
dy
f
(7), it is established that
y
Df(y)
-10-
2
h(x,y)
d 2
h(o,y)
+ o(6)
Using an asymptotic expansion to 6th order for the integral yields
1
Vrz
To
I(Df)
z2
h(oy)
2
2
o - 2R 2(y)
h(x,y)
2
R
-
h(o,y)
+ a
z
d4
h(x,y)l
o
+
2R2 (Y)
2
-
6R2 2(y)
dx 2
2
-
2R4(y)
h(o,y)
+6
6
( 2
h(x,y)
h(o,y)
dx
+ 6R 2 (y)R
4
2R6 (Y)
(y)-
22
2
=1- al
+
4
4 (-2aa3 + 2a
12aaa
dy; z = y
(
2
4
2
- a2 ) + a
-
f(o)
a3
+ 24a
-
6
(-3ala
- 6a2a
12
1
13
2
5a2+
+
dx
are easily computed in terms of the a. coefficients
The integrals defining I(D)
I(D)
d2 h(xy)
24
12a6 ) + o(c6
Similarly, E {x-f(x)}
2a2+ C2 (a
+ a a 2 + a4 (3a2
aa
a
I(3a 2
2
+ 6(a-l)a
)
o4
5
)
+ o(o
+ 4a (aa
3
1-
+ aoa
4
+ 4a3(a
f)
-
+ 2a2 +la2
6
3
+ 1aa
4a5
6)
Combininq these.two formulas vields Theorem 1.
a.e.d.
Proof of Theorem 2.
Theorem 1 gives an asymptotic approximation to the cost J2!(f), as
2 (a5,...,a).
-11-
J2 (ao ,...,a 5 ) can be written as
2
) ~
J2(
k (a +a
k2 (a2+
+
4
2
2
a+a
8
+ a
a4)
42
6
a
48
2
a4+
)
(a (1-2a 2 a 2) + a4 2)
2
1
4
1-2a2a 2
+ J2
5) + 0( 6)
(ala5a
24
2
2
22
J2 (al'a3'a 5 ) = a (k (a -1) + al ) + k a3(al-)a4
where
+a a a
4
- C a
6
2 2
a (5k a
+
12
3
+
3k
2
(a -1)a +3a a5 + 5a3
1
+
5
3
5
2
24
3
ala
+ 12a
6
3
The first 3 terms are positive, and achieve a minimum at a=a2 =a4 =a6 =0independent of al
1 a 3 ,a5 .
Min
Hence, the minimization reduces to
J 2 (ala
3 ,a5 ).
a.,a3 ,aS
TaKing partial derivatives yields the necessary conditions
2
2 2
2k a
(al-1) + 2aal
2
2
+ k a3
4
34
-4a a
2
+ k a
Da1
6
4
2
+
a5
=(a-1)
a4
J
2
-
6ala 3 + 6a)
2
CT
+
a6
O
k a
63
3
+ 5a
12a)
3
2a3
Because of the order of the approximation, these two equations are equivalent to
2
2
2
2
2 3
2k (al -1) + 2al + k a3a + a a - 4a a
2
2k (a -1) + 2a
1
1
+ 5
3
2
2
2
k a3 ( )+5 a3
3
30
3
Combining yields
a 2(l+k2 ) =
2 a
3
(a4
3
which implies
and k (al -1) + a
2
a3
0O(
)
- 2a 2 a3 1
(a )
-12-
0 O(
23
4a a
4
4)
O(a
Thus, a 3 is essentially zero, to the order of the expansion, since no
value of a 3 of order a2 can contribute to the cost in the expansion up to
sixth order.
This implies
1l+k2
(l+k )3
up to contributing costs of sixth order, and
a3
0.
q.e.d.
-13-
Download