ESL-P-822 "Sianalina and uncertainty: a case study." by David A. Castanon Nils R. Sandell Abstract: This paper studies the well-known counter example of Witsenhausen when the initial uncertainty is small. Using an asymptotic approach, it is established that linear strategies are asymptotically optimal over a large class of nonlinear strategies. This serves as a guideline for optimal solutions of non-classical problems with very noisy communication channels. Acknowledge Contacts DOE-E(49-18)-2087, ONR-N0014-C-0346 Electronic Systems Laboratory Massachusetts Institute of Technology Cambridge, MA 02139 Room 35-308 *Submitted to 1978 IEEE Conference on Decision and Control, and for publication in the IEEE Trans. on automatic control 1. Introduction The key theoretical result in classical discrete time stochastic control is the so-called nonlinear separation theoren[l,2]. This theorem implies that the optimal control laws depend on the past observations and controls only through the conditional probability distribution of the present state, given these variables, and that this distribution is obtained by solution of a nonlinear filtering problem independent of the choice of control laws. stochastic control blem. The most important practical result in classical is the solution of the linear-quadratic-Gaussian pro- This solution is a linear feedback of Kalman filter state estimates which is easily computed and implemented, and thus, has served as a basis for practical control law synthesis [3]. One of the key differences in decentralized control is the presence of decision makers with different available information. Decision problems with "classical" information patterns are characterized by the presence of one central decision maker endowed with perfect recall. In finite-state games, the absence of perfect recall gives rise to signalling information sets. Signalling is the act of transfering information from one decision maker to another through the use of decision variables. Perhaps the best known example of signalling in decentralized control was presented by Witsenhausen [5]. This example is discussed in detail in this paper. control-sharing team problem discussed by Sandell and Athans [6] is another example of how control actions are used to convey information. -2- The 2. The Witsenhausen CounterexamPle (5) random variables with variances Let x0 , V be zero-mean independentf Consider the following stochastic control problem and 1 respectively. State equations (1) xl = xo + u0 Observation equations x2 x y0 xO = x Y = - 1 1 (2) +V k2 uO+ J Admissible Controllers uo = 'O (Y) u = (3) x2- Cost function where yl, Y2 (4) y1 (yl) are Borel functions. This information pattern is known at stage 1. Hence, there is nonclassical, since the value of Yv information to signal, Yl provides a vehicle of communication. · ' Fiqure I Coontroller 2v + -' ..... " 1(encoder)[ (e_ c Problem: -3- not Figcxe .1 provides a diagram of the Noise Co'trol'ler i is and the observation decision problem involved. Source a Minimize E{(x -u (decoder) )'2 + ku } Note the similarities between the diagram in Figure 1 and classical If one denoted Controller 1 as the encoder, communications problems. Controller 2 as the decoder, a classical communications problem would be Minimize E {(u1 - x0 ) } E {u2} < C. subject to Thus, the Witsenhausen Counterexample is somewhat different in that one does not try to reconstruct the source; instead one seeks to reconstruct the output of the encoder. Witsenhausen established that the optimal decoder was the conditional estimate of xl given Yl' However, the statistics of x l and y 1 depend on the form of the encoder. Assuming that V is Gaussian, the encoder problem can be reduced to the following equivalent problem: 2 Let h(x) = Df(y) (27T e = ) ; define (5) fh(y-f(x))dF(x) Nf(y) = f f(x)h(y-f(x))dF(x) (6) gf(y) = (7) Nf(Y) Df (Y) * * J2(f) = J(f,gf) (8) where F(x) is the distribution of x0 . Witsenhausen showed that gf(y) was the conditional expectation of f(x) with respect to y. The resulting problem for the encoder was to minimize the expected value of J2(f) with the choice of f(x) = x + yo(x). E{J(f)} 2 Furthermore, 2 = 1 - I(Df) + k2 E 1 + k2 E { (x-f(x)) { (x - f(x))2 }f 4- -4- } cold D (y) dyf D (Y) 2 dy (9) where I(Df) is the Fisher information content of the density Df density of y). (the Hence, the problem of choosing the optimal encoder consists of a tradeoff between maximizing the information available (signalling) and minimizing the use of control at the first stage. Witsenhausen also proved the following results: a. There exists an f(x) which minimizes E {J2 (f)J for arbitrary dis-. tributions F(x). b. The linear strategy f(x) = Xx value of X is a stationary point for a particular when F(x) is Gaussian. c. For large values of G , the strategy f(x) = X x is not optimal. There exist numerous stationary points, and the optimal strategies Y01 Y1 are nonlinear. When Y is large, there is a lot of uncertainty in the systen. In this case, it becomes optimal to use signalling to reduce the cost, as indicated by c above. However, it is not clear that signalling would be so evident if channel capacity was limited or the distortion noise V had a large variance. Ideally, one would like to recognize problems where signalling would be present in terms of the parameters of the problem. The next section studies the Witsenhausen counterexample when the variance G2 is small. 3. Asymptotic Analysis of the Witsenhausen Counterexample Assume that a is very small, and that F(x) is Gaussian. Then, one can obtain approximate expressions for the elements defined in equations (5)-(8), representing them as power series in a I(a,y) defined by the integral equation -5- . Consider a function I(ar,y) 7 1 = h(x,y)e-x /2 dx (10) Assume h(x,y) can be represented as m i h(x,y) lim where + R( ai(y) R (x,y) m < M(y) for all y, and there exists b 0 such that x for all y, (xy) ex R Definition 1 [71 /2b dx < C(y). i : The series i= of mth order for h(x,y) for x small if m h(x,y) lim x > for each y. Lemma 1: i=0 ai (y ) xi i! =0 xmI The expansion is uniform if the limit is uniform in y. The series I(a,y) given by m/2 I (ay) = a k= a2k (11) k 2 k! is a uniform asymptotic expansion of I(a,y) for small a , and it is uniform if M and C are independent of y. Proof: See Appendix. Assume f(x) can be expanded in a series about x=O, as m i f(x) a. x i=l where lim + R (x) i! R (x) m = 0. Then, unsing Lemma 1 and equations (5)-(11) one m x gets the following result (after considerable algebra): x-+O -6- Theorem 1: An asymptotic expansion of J2 (f) to sixth order in a for small is given by: J 2 (f) = 2 2 * 2 2 2 kao + a (k (al-l) + k a2 2 4 22 + a 2 +ka4+ 4 4k a3(al-1) + 4ala3 3 11 3 5k2a a 24 1 + 2a2) + 2 6 (k a a6+l0k2a2 + -2 0 6 3 24 2 + 6k2 (a -l)a5 +6ala5+ 12a2a4 + 10a3 2 1 5 15 24 3 48ala3+ 24al Proof: 4 a ) + o( 24a2a2 12 ). See Appendix. The asymptotic expansion in Theorem 1 gives an approximate cost; there are many strategies f(x), differing slightly, which yield the same cost. The difficulty in optimizing the expansion in Theorem 1 with regards to the parameters a. of f is that the expansion is not valid for arbitrary f. This leads to the next theorem. Theorem 2 : Suppose f is restricted to belong to a set r expansion of Theorem 1 is uniformly valid. f(x) = k2__ k2 1 + 2 2 on which the Then, the strategy k4 + x 2 (k + 1) minimizes the asymptotic cost up to 6th order. Proof: See Appendix. Typical conditions for obtaining the set r are uniform restrictions on the magnitude of the coefficients a.. No such representation is given here because it is unnecessarily restrictive. -7- 4. DISCUSSION The asymptotic analysis of section 3 implies that, as the uncertainty in the initial state decreases, linear controllers are optimal relative to a class of nonlinear controllers. lers with bounded coefficients. This class includes polynomial controlThis suggests the existence of a region of values for a where linear controllers are optimal in general. Unfortunately, the results are not as strong as one would like due to their asymptotic nature and to the restrictions that must be placed on the class of admissible control laws to obtain uniformly valid expansions. Problems with nonclassical information pattern are of great importance, while analytical results for these problems are virtually nonexistent. This paper suggests that linear controllers are approximately optimal in limiting cases, such as when there is little information to signal, or when the channel is very noisy or has limited capacity. -8- Appendix Proof of Lemma 1: Consider the difference +(,y) 2/22 002 dx ak((x) kk=o 1 / _oo a a/> - = I (a,y) + i I(G,y) r00 1 a V2 I(a,y) . k From (10), -x 2/2a2/2 dx R (x,y) e + - e-x Rm (x,y) e dx -0 Hence I(ay) - I(O Im. ,y) 1 f 1i'I R ( ) Rm(xy) -x dx /, '-o)m+ll Now limit R (x,y) < _______ M(y), so there exists 2-0O (y) > 0 m+l x such that, for all Ixl < E(Y), Rm (x,y)| < M(y)x Thus 2 Rm f M(y)x o 2 Tr a + co R 2 V2 7T a (xy) x --2/2a2 dx ex dx (E(Y) m+l < 2 M(y)a (m+l)! + -9- 2C(y) 1 e-( 2 1 -2) 2 2(y) e dx Thus, limit I( - I (I,y) a->O I~a"Y) y) am and the limit is uniform if C and M, (thus E(y)) are independent of y. q.e.d. Proof of Theorem 1: Both Df(y) and Nf(Y) Lemma 1. are integrals of the form I(a,y) described in Expansion of f(x) up to fifth order and straight substitution yields g*f(Y) ~ f(o) + a2 R2 (Y) + 2 8 - 2R2 (y)d2 h(xy) 2 dx R4 (y) h(o,y) + 6 48dx R6 (y ) - 3R4 d 2 h(x,y) (y ) d2 2 0 h(o,y) 6R 2 (y)/d 2 2(Y + 2 h(x,y) h(xy)2- o 3R 2 (y) d 0 ~dx h(o, where R2 n (Y) =d 2n (f(x) - f(o)) h(x,y) d2 n 2 h(o,y) and h(x,y) = e/1 Now, from (5), 2 (f(x)-y) (6) and d D (Y) = g*(y) dy f (7), it is established that y Df(y) -10- 2 h(x,y) d 2 h(o,y) + o(6) Using an asymptotic expansion to 6th order for the integral yields 1 Vrz To I(Df) z2 h(oy) 2 2 o - 2R 2(y) h(x,y) 2 R - h(o,y) + a z d4 h(x,y)l o + 2R2 (Y) 2 - 6R2 2(y) dx 2 2 - 2R4(y) h(o,y) +6 6 ( 2 h(x,y) h(o,y) dx + 6R 2 (y)R 4 2R6 (Y) (y)- 22 2 =1- al + 4 4 (-2aa3 + 2a 12aaa dy; z = y ( 2 4 2 - a2 ) + a - f(o) a3 + 24a - 6 (-3ala - 6a2a 12 1 13 2 5a2+ + dx are easily computed in terms of the a. coefficients The integrals defining I(D) I(D) d2 h(xy) 24 12a6 ) + o(c6 Similarly, E {x-f(x)} 2a2+ C2 (a + a a 2 + a4 (3a2 aa a I(3a 2 2 + 6(a-l)a ) o4 5 ) + o(o + 4a (aa 3 1- + aoa 4 + 4a3(a f) - + 2a2 +la2 6 3 + 1aa 4a5 6) Combininq these.two formulas vields Theorem 1. a.e.d. Proof of Theorem 2. Theorem 1 gives an asymptotic approximation to the cost J2!(f), as 2 (a5,...,a). -11- J2 (ao ,...,a 5 ) can be written as 2 ) ~ J2( k (a +a k2 (a2+ + 4 2 2 a+a 8 + a a4) 42 6 a 48 2 a4+ ) (a (1-2a 2 a 2) + a4 2) 2 1 4 1-2a2a 2 + J2 5) + 0( 6) (ala5a 24 2 2 22 J2 (al'a3'a 5 ) = a (k (a -1) + al ) + k a3(al-)a4 where +a a a 4 - C a 6 2 2 a (5k a + 12 3 + 3k 2 (a -1)a +3a a5 + 5a3 1 + 5 3 5 2 24 3 ala + 12a 6 3 The first 3 terms are positive, and achieve a minimum at a=a2 =a4 =a6 =0independent of al 1 a 3 ,a5 . Min Hence, the minimization reduces to J 2 (ala 3 ,a5 ). a.,a3 ,aS TaKing partial derivatives yields the necessary conditions 2 2 2 2k a (al-1) + 2aal 2 2 + k a3 4 34 -4a a 2 + k a Da1 6 4 2 + a5 =(a-1) a4 J 2 - 6ala 3 + 6a) 2 CT + a6 O k a 63 3 + 5a 12a) 3 2a3 Because of the order of the approximation, these two equations are equivalent to 2 2 2 2 2 3 2k (al -1) + 2al + k a3a + a a - 4a a 2 2k (a -1) + 2a 1 1 + 5 3 2 2 2 k a3 ( )+5 a3 3 30 3 Combining yields a 2(l+k2 ) = 2 a 3 (a4 3 which implies and k (al -1) + a 2 a3 0O( ) - 2a 2 a3 1 (a ) -12- 0 O( 23 4a a 4 4) O(a Thus, a 3 is essentially zero, to the order of the expansion, since no value of a 3 of order a2 can contribute to the cost in the expansion up to sixth order. This implies 1l+k2 (l+k )3 up to contributing costs of sixth order, and a3 0. q.e.d. -13-