Stochastic Analysis and Applications
Publication details: http://www.informaworld.com/smpp/title~content=t713597300
To cite this article: Basak, Gopal K., Borkar, Vivek S. and Ghosh, Mrinal K. (1997) 'Ergodic control of degenerate diffusions', Stochastic Analysis and Applications, 15:1, 1-17.
DOI: 10.1080/07362999708809460
URL: http://dx.doi.org/10.1080/07362999708809460
STOCHASTIC ANALYSIS AND APPLICATIONS, 15(1), 1-17 (1997)
ERGODIC CONTROL OF DEGENERATE
DIFFUSIONS
Gopal K. Basak*
Dept. of Mathematics
Hong Kong University of Science and Technology
Clear Water Bay, Kowloon, Hong Kong
mabasak@uxmail.ust.hk
Vivek S. Borkar
Dept. of Electrical Engineering
Indian Institute of Science
Bangalore 560 012, India
vborkar@vidyut.ee.iisc.ernet.in
Mrinal K. Ghosh
Dept. of Mathematics
Indian Institute of Science
Bangalore 560 012, India
mrinal@math.iisc.ernet.in
ABSTRACT
We study the ergodic control problem of degenerate diffusions on ℝ^d. Under a certain Liapunov type stability condition we establish the existence of an optimal control. We then study the corresponding HJB equation and establish the existence of a unique viscosity solution in a certain class.
* Research partially supported by grant no. D4087 DAG 93/94 SC29.
† Research supported by grant no. 26/01/92-G from DAE, Govt. of India.
Copyright © 1997 by Marcel Dekker, Inc.
1. INTRODUCTION
We address the problem of controlling degenerate diffusions by continually monitoring the drift. The objective is to minimize the long-run average (ergodic) cost over all admissible controls. The state X(t) of the system at time t is governed by
$$dX(t) = b(X(t), u(t))\,dt + \sigma(X(t))\,dW(t), \qquad X(0) = X_0,$$
where b : ℝ^d × U → ℝ^d and σ : ℝ^d → ℝ^{d×d} are respectively the drift vector and the diffusion matrix, and U is a given compact metric space which is the control set. X_0 is a prescribed ℝ^d-valued random variable, W(·) is a d-dimensional standard Wiener process independent of X_0, and u(·) is a non-anticipative control process taking values in U. Such a control is called an admissible control. Our aim is to minimize over all admissible controls the quantity
$$\limsup_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T c(X(t), u(t))\,dt\right],$$
where c is the running cost function. If the minimum is attained for some u(·), that u(·) is called an optimal control. Under certain conditions we will show that there exists an optimal control. We then study the corresponding Hamilton-Jacobi-Bellman (HJB for short) equation. The HJB equation for this problem is given by
$$\inf_{u \in U}\left[\frac{1}{2}\sum_{i,j=1}^d a_{ij}(x)\,\frac{\partial^2 \phi(x)}{\partial x_i\, \partial x_j} + \sum_{i=1}^d b_i(x, u)\,\frac{\partial \phi(x)}{\partial x_i} + c(x, u) - \rho\right] = 0,$$
where a(x) = σ(x)σ(x)', φ : ℝ^d → ℝ and ρ is a scalar. Under certain conditions we establish the existence of a unique viscosity solution (φ, ρ) of the above equation. Finally we characterize the optimal control via this unique solution.
The existence of an optimal control for this problem has been studied in [3], [4], [6], [10]. In these works the existence of an optimal control has been established under suitable conditions; however, one does not obtain the existence of an optimal control for an arbitrary initial law. Under a certain Liapunov type stability condition we remove this limitation. To the best of our knowledge the viscosity solution of the HJB equation for the ergodic control problem has not been studied thus far.
Our paper is organized as follows. Section 2 deals with notation
and preliminaries. In Section 3 we establish the existence of an optimal
control. The HJB equation is studied in Section 4.
2. NOTATION AND PRELIMINARIES
Let V be a compact metric space and U = P(V) the space of probability measures on V endowed with the topology of weak convergence. We consider the d-dimensional controlled diffusion process X(·) = [X_1(·), ..., X_d(·)]' described by
$$dX(t) = \int_V b(X(t), v)\,u(t)(dv)\,dt + \sigma(X(t))\,dW(t), \qquad X(0) = X_0, \tag{2.1}$$
where b(·,·) = [b_1(·,·), ..., b_d(·,·)]' : ℝ^d × V → ℝ^d is the drift vector, σ(·) = [σ_{ij}(·)] : ℝ^d → ℝ^{d×d} is the diffusion matrix, X_0 is an ℝ^d-valued random variable with a prescribed law, W(·) = [W_1(·), ..., W_d(·)]' is a d-dimensional standard Wiener process independent of X_0, and u(·) is a U-valued process with measurable sample paths satisfying the following nonanticipativity property: for t ≥ s, W(t) − W(s) is independent of {u(r), W(r), r ≤ s}. Such a process is called an admissible relaxed control. It is called an admissible precise control if u(·) = δ_{v(·)} for a V-valued process v(·) satisfying the same nonanticipativity condition. We make the following assumptions. We denote various constants by C_i.
(A1) (i) b is continuous and for x, y ∈ ℝ^d, v ∈ V,
$$|b(x, v) - b(y, v)| \le C_1\, |x - y| .$$
(ii) σ is continuous and for x, y ∈ ℝ^d,
$$\|\sigma(x) - \sigma(y)\| \le C_2\, |x - y| .$$
Under (A1), for a prescribed admissible control the equation (2.1) has a unique strong solution. For v ∈ V, f ∈ W^{2,p}_{loc}(ℝ^d), x ∈ ℝ^d, let
$$L^{v} f(x) = \frac{1}{2} \sum_{i,j=1}^{d} a_{ij}(x)\, \frac{\partial^2 f(x)}{\partial x_i\, \partial x_j} + \sum_{i=1}^{d} b_i(x, v)\, \frac{\partial f(x)}{\partial x_i}, \tag{2.2}$$
where a_{ij}(·) = Σ_{k=1}^{d} σ_{ik}(·) σ_{jk}(·), and for u ∈ U,
$$L^{u} f(x) = \int_V L^{v} f(x)\, u(dv) .$$
An admissible control (relaxed or precise) u(·) is called feedback if u(·) is progressively measurable with respect to the natural filtration of X(·). In such a case (2.1) will not in general admit a strong solution. We say that (X(·), u(·)) is a stationary relaxed solution of the controlled martingale problem for L if (X(·), u(·)) is a stationary process such that u(·) is a relaxed feedback control and X(·) is the corresponding controlled diffusion.
We now state the following result, which plays a very crucial role in the existence of an optimal control. For a proof see [9].

Theorem 2.1. If μ ∈ P(ℝ^d × V) satisfies
$$\int_{\mathbb{R}^d \times V} L^{v} f(x)\, \mu(dx, dv) = 0 \qquad \text{for all } f \in C_c^{\infty}(\mathbb{R}^d),$$
then there exists a stationary relaxed solution (X(·), u(·)) of the controlled martingale problem for L such that for each t ≥ 0,
$$E\!\left[\int_V f(X(t), v)\, u(t)(dv)\right] = \int_{\mathbb{R}^d \times V} f\, d\mu \qquad \text{for all } f \in C_b(\mathbb{R}^d \times V).$$
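To make the condition in Theorem 2.1 concrete, the following sketch (our own illustration, not part of the paper) checks it numerically for an uncontrolled one-dimensional Ornstein-Uhlenbeck diffusion, where the generator and the stationary law are known in closed form; the toy dynamics, the test function and all names in the code are assumptions made for this demonstration.

```python
# Toy check of the stationarity condition in Theorem 2.1 (illustrative only):
# for dX = -X dt + dW the stationary law is N(0, 1/2) and the generator is
# Lf(x) = 0.5 f''(x) - x f'(x); the integral of Lf against the stationary law
# should vanish for smooth, rapidly decaying test functions f.
import numpy as np
from scipy.integrate import quad

def stationary_density(x, var=0.5):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def f(x):        # a smooth rapidly decaying test function (stand-in for C_c^infinity)
    return np.exp(-x**2) * np.sin(x)

def f1(x):       # f'
    return np.exp(-x**2) * (np.cos(x) - 2.0 * x * np.sin(x))

def f2(x):       # f''
    return np.exp(-x**2) * (-4.0 * x * np.cos(x) + (4.0 * x**2 - 3.0) * np.sin(x))

def Lf(x):       # generator of the OU diffusion applied to f
    return 0.5 * f2(x) - x * f1(x)

value, _ = quad(lambda x: Lf(x) * stationary_density(x), -10.0, 10.0)
print(f"integral of Lf against the stationary law: {value:.2e}")   # numerically ~ 0
```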
Let c : ℝ^d × V → ℝ be the cost function. We make the following assumption on c.

(A2) c is continuous and for x, y ∈ ℝ^d, v ∈ V,
$$|c(x, v) - c(y, v)| \le C_3\, |x - y| .$$
Our aim is to minimize the long run average (or ergodic) cost
$$\limsup_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T \!\!\int_V c(X(t), v)\, u(t)(dv)\, dt\right]$$
over all admissible relaxed controls. We will carry out our program under the following 'stability' assumption.
(A3) There exist a symmetric positive definite matrix Q and a constant α > 0 such that, for all x, y ∈ ℝ^d and v ∈ V, the coefficients b and σ satisfy (uniformly in v) a Liapunov type contraction condition of the kind used in [2].
The following example shows that the assumptions (A1) and (A3) arise naturally.

Example. Let V = [0, 1], b(x, v) = Bx + Dv, σ_1(x) = x and σ_i(x) = 0 for all i ≠ 1, where σ_j denotes the j-th column of σ. Here B, D are constant d × d matrices and all eigenvalues of B have negative real part. It is easy to see that (A1) is satisfied. Now notice that there exists a positive definite matrix Q such that B'Q + QB = −I. Using this, one obtains (A3) for x ≠ y.
We will now establish the 'asymptotic flatness of the flow' under (A1), (A3). We closely follow the arguments in [2].

Lemma 2.1. Let u(·) be any admissible relaxed control and let X(x, t) be the corresponding solution with initial condition X(0) = x. Then under (A1), (A3) there exist constants C_4 > 0, C_5 > 0, independent of u(·), such that for any x, y ∈ ℝ^d,
$$E\,|X(x, t) - X(y, t)| \le C_4\, e^{-C_5 t}\, |x - y| . \tag{2.5}$$
Proof. Consider the Liapunov function
$$w(x) = (x' Q x)^{1/2} . \tag{2.6}$$
The function w may be modified near the origin to make it a C² function on all of ℝ^d. Write ∂_k to denote partial differentiation with respect to the k-th coordinate. Then, using (A3), for x ≠ y and any u ∈ U,
$$L^{u} w(x - y) \le -\gamma\, \{(x - y)' Q (x - y)\}^{1/2}$$
for some γ > 0. Let τ = inf{t ≥ 0 : X(x, t) = X(y, t)} (possibly +∞). Now, by Itô's formula and the above bound,
$$E\big[w(X(x, t \wedge \tau) - X(y, t \wedge \tau))\big] \le w(x - y) - \gamma\, E\!\left[\int_0^{t \wedge \tau} w(X(x, s) - X(y, s))\, ds\right] . \tag{2.7}$$
For t ≥ τ, X(x, t) = X(y, t) a.s. by the pathwise uniqueness of the solution of (2.1). Thus by Gronwall's inequality it follows that for any t ≥ 0,
$$E\big[w(X(x, t) - X(y, t))\big] \le e^{-\gamma t}\, w(x - y) . \tag{2.8}$$
Finally, using the positive definiteness of Q, (2.5) follows from (2.8). ∎
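The following simulation (our own illustration, not from the paper) makes the estimate (2.5) visible for the linear example above: two copies of the diffusion driven by the same Wiener path and the same open-loop control coalesce exponentially fast. The matrices, the control path, the step size and the horizon are arbitrary choices.

```python
# Euler-Maruyama illustration of asymptotic flatness (2.5) for the example
# b(x,v) = Bx + Dv, sigma(x) = [x, 0, ..., 0]: two copies started at different
# points but sharing the Wiener increments and the control path approach
# each other exponentially fast.
import numpy as np

rng = np.random.default_rng(1)
d, dt, n_steps = 3, 1e-3, 20000

A = rng.standard_normal((d, d))
B = A - (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(d)   # stable drift matrix
D = rng.standard_normal((d, d))

def drift(x, t):
    v = 0.5 * (1.0 + np.sin(t)) * np.ones(d)     # an arbitrary open-loop control path in [0,1]^d
    return B @ x + D @ v

def diffusion_increment(x, dW):
    # sigma(x) has x as its first column and zeros elsewhere, so sigma(x) dW = x * dW[0]
    return x * dW[0]

x, y = np.ones(d), -2.0 * np.ones(d)
print("initial gap:", np.linalg.norm(x - y))
for n in range(n_steps):
    t = n * dt
    dW = np.sqrt(dt) * rng.standard_normal(d)    # the SAME Wiener increment for both copies
    x = x + drift(x, t) * dt + diffusion_increment(x, dW)
    y = y + drift(y, t) * dt + diffusion_increment(y, dW)
print("final gap:  ", np.linalg.norm(x - y))     # many orders of magnitude smaller
```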
The arguments in [1, Proposition 3.1] can be mimicked to get the following result.

Lemma 2.2. Let u(·) be any admissible relaxed control and X(x, t) the corresponding process with initial condition X(0) = x. Then under (A1) and (A3) there exist a constant δ > 0 and a function g(·) such that for any t > 0,
$$E\big[|X(x, t)|^{1+\delta}\big] \le g(x) .$$

3. EXISTENCE RESULTS
In this section we establish the existence of an optimal control under (A1), (A2), (A3). Our arguments closely follow those in [4, Chapter 6]; therefore we present only brief sketches to illustrate the main ideas. Let
$$\Gamma = \left\{ \mu \in P(\mathbb{R}^d \times V) \;\Big|\; \int_{\mathbb{R}^d \times V} L^{v} f(x)\, \mu(dx, dv) = 0 \ \text{ for all } f \in C_c^{\infty}(\mathbb{R}^d) \right\} . \tag{3.1}$$
For μ ∈ Γ, let
$$\rho_\mu = \int c\, d\mu . \tag{3.2}$$
Using (A2), Theorem 2.1 and Lemma 2.2, it can be shown that
$$\sup_{\mu \in \Gamma} \int |c|^{1+\delta}\, d\mu < \infty, \tag{3.3}$$
where δ > 0 is as in Lemma 2.2. Let
$$\rho^* = \inf_{\mu \in \Gamma} \rho_\mu . \tag{3.4}$$
Lemma 3.1. The set Γ in (3.1) is compact.

Proof. In view of [4, Lemma 5.1] it suffices to show that there exists a C² Liapunov function w : ℝ^d → ℝ such that
$$\lim_{|x| \to \infty} L^{v} w(x) = -\infty \qquad \text{uniformly in } v . \tag{3.5}$$
Now consider the Liapunov function w(x) = (x'Qx)^{1/2}, with the suitable modification near the origin already considered in the proof of Lemma 2.1. This clearly satisfies (3.5). Hence the result follows from [4, Lemma 5.1]. ∎
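For the linear example of Section 2 the condition (3.5) can be checked explicitly; the sketch below (ours, not part of the paper) evaluates L^v w(x) along a ray and shows it decreasing roughly linearly in |x|. The matrices, the fixed control value and the evaluation points are arbitrary choices.

```python
# Numerical illustration that w(x) = (x'Qx)^{1/2} satisfies (3.5) for the
# example b(x,v) = Bx + Dv, a(x) = sigma(x) sigma(x)' = x x':
# L^v w(x) -> -infinity as |x| -> infinity, uniformly over bounded v.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(2)
d = 3
A = rng.standard_normal((d, d))
B = A - (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(d)
D = rng.standard_normal((d, d))
Q = solve_continuous_lyapunov(B.T, -np.eye(d))          # B'Q + QB = -I

def generator_w(x, v):
    # closed-form gradient and Hessian of w(x) = (x'Qx)^{1/2}
    q = x @ Q @ x
    grad = Q @ x / np.sqrt(q)
    hess = Q / np.sqrt(q) - np.outer(Q @ x, Q @ x) / q**1.5
    a = np.outer(x, x)                                   # a(x) = x x' for this example
    return 0.5 * np.sum(a * hess) + (B @ x + D @ v) @ grad

direction = rng.standard_normal(d)
v_fixed = np.ones(d)                                     # any fixed control value in [0,1]^d
for r in [1, 10, 100, 1000]:
    print(f"|x| scale {r:5d}:  L^v w(x) = {generator_w(r * direction, v_fixed):12.2f}")
# the printed values decrease roughly linearly in the scale, consistent with (3.5)
```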
Corollary 3.1. There exists a μ* ∈ Γ such that
$$\int c\, d\mu^* = \rho^* .$$

Proof. From (3.3) it follows that μ ↦ ∫ c dμ is continuous on Γ. Since Γ is compact the result follows. ∎

Let (X*(·), u*(·)) be the stationary relaxed solution of the controlled martingale problem corresponding to μ*. Then, if the law of (X*(0), u*(0)) is such that
$$E\!\left[\int_V f(X^*(0), v)\, u^*(0)(dv)\right] = \int_{\mathbb{R}^d \times V} f\, d\mu^* \qquad \text{for all } f \in C_b(\mathbb{R}^d \times V),$$
then
$$\lim_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T \!\!\int_V c(X^*(s), v)\, u^*(s)(dv)\, ds\right] = \int c\, d\mu^* = \rho^* . \tag{3.6}$$
Now fix the probability space and the Wiener process W(·) on which u*(·) is defined. Consider the corresponding process X*(·) with varying initial law. Using (2.5) we can argue as in [4, Theorem 1.2.2] that (3.6) holds for any initial law of X*(0). We can then mimic the arguments in [4, Chapter 6] to conclude the following.
Theorem 3.1. Under any admissible relaxed control u(·) and any initial law, the corresponding solution X(·) of (2.1) satisfies
$$\liminf_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T \!\!\int_V c(X(s), v)\, u(s)(dv)\, ds\right] \ge \rho^* . \tag{3.7}$$
Thus u*(·) as above is optimal.
Remark 3.1. Note that (3.7) establishes a much stronger optimality of
u*(.), viz., the most "pessimistic" average cost under u*(.) is no worse
than the most "optimistic" average cost under any other admissible
control.
4. HAMILTON-JACOBI-BELLMAN EQUATION
The HJB equation for the ergodic control problem is
$$\inf_{u \in U}\left[ L^{u} \phi(x) + \bar{c}(x, u) - \rho \right] = 0, \tag{4.1}$$
where c̄(x, u) = ∫_V c(x, v) u(dv) for u ∈ U, φ : ℝ^d → ℝ is a suitable function and ρ is a scalar. Solving the HJB equation means finding a suitable pair (φ, ρ) satisfying (4.1) in an appropriate sense. For the nondegenerate case, under certain assumptions, one can find a unique solution (φ, ρ) of (4.1) in a certain class such that φ is C² [4, Chapter 6]. To obtain analogous results for the degenerate case we introduce the notion of a viscosity solution of (4.1) ([7], [8]). Let φ ∈ C(ℝ^d) and ρ ∈ ℝ.
Definition 4.1. The pair (φ, ρ) is said to be a viscosity solution of (4.1) if for any ψ ∈ C²(ℝ^d),
$$\inf_{u \in U}\left[ L^{u} \psi(x) + \bar{c}(x, u) - \rho \right] \ge 0 \tag{4.3}$$
at each local maximum x of (φ − ψ), and
$$\inf_{u \in U}\left[ L^{u} \psi(x) + \bar{c}(x, u) - \rho \right] \le 0 \tag{4.4}$$
at each local minimum x of (φ − ψ).
Let
$$G = \left\{ f : \mathbb{R}^d \to \mathbb{R} \;\Big|\; f(0) = 0, \ f \text{ is Lipschitz continuous} \right\} . \tag{4.5}$$
We will show that (4.1) has a unique viscosity solution in G × ℝ. To this end we follow the traditional vanishing discount method. Let λ > 0. For an admissible relaxed control u(·), let
$$J_\lambda(x, u(\cdot)) = E\!\left[\int_0^{\infty} e^{-\lambda t} \int_V c(X(t), v)\, u(t)(dv)\, dt \;\Big|\; X(0) = x\right] . \tag{4.6}$$
Let φ_λ denote the discounted value function, i.e.
$$\phi_\lambda(x) = \inf_{u(\cdot)} J_\lambda(x, u(\cdot)) , \tag{4.7}$$
the infimum being over all admissible relaxed controls. Let
$$H = \left\{ f : \mathbb{R}^d \to \mathbb{R} \;\Big|\; f \text{ is continuous and } |f(x)| \le C(1 + |x|) \text{ for some constant } C \right\} . \tag{4.8}$$
By the results of [8] (see also [7]), φ_λ is the unique viscosity solution in H of the following HJB equation for the discounted control problem:
$$\inf_{u \in U}\left[ L^{u} \phi_\lambda(x) + \bar{c}(x, u) - \lambda\, \phi_\lambda(x) \right] = 0 . \tag{4.9}$$
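To illustrate the vanishing discount idea used below (our own toy example, not from the paper), consider the uncontrolled scalar diffusion dX = −X dt + dW with running cost c(x) = x²: there is nothing to optimise, the discounted value at the origin is available in closed form, and λ φ_λ(0) converges to the ergodic cost 1/2 as λ → 0. All names in the sketch are ours.

```python
# Vanishing discount on a toy problem: for dX = -X dt + dW and c(x) = x^2,
# E[X(t)^2 | X(0)=0] = (1 - exp(-2t))/2, so the discounted value at 0 is
# phi_lambda(0) = 1/(lambda (lambda + 2)); the stationary law N(0, 1/2)
# gives the ergodic cost rho = 1/2, and lambda * phi_lambda(0) -> rho.
def discounted_value_at_zero(lam: float) -> float:
    return 1.0 / (lam * (lam + 2.0))

for lam in [1.0, 0.1, 0.01, 0.001]:
    print(f"lambda = {lam:7.3f}   lambda * phi_lambda(0) = {lam * discounted_value_at_zero(lam):.4f}")
# printed values approach the ergodic cost 0.5
```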
Theorem 4.1. Under (A1), (A2) and (A3), (4.1) has a viscosity solution in G × ℝ.
Proof. For x, y ∈ ℝ^d and any admissible relaxed control u(·) we have, by (A2) and (2.5),
$$\big|J_\lambda(x, u(\cdot)) - J_\lambda(y, u(\cdot))\big| \le C_3 \int_0^{\infty} e^{-\lambda t}\, E\,|X(x, t) - X(y, t)|\, dt \le \frac{C_3 C_4}{C_5}\, |x - y| .$$
Thus there exists a constant C_6 independent of λ such that
$$|\phi_\lambda(x) - \phi_\lambda(y)| \le C_6\, |x - y| . \tag{4.10}$$
Let φ̄_λ(x) = φ_λ(x) − φ_λ(0). Then φ̄_λ is the unique viscosity solution in H to
$$\inf_{u \in U}\left[ L^{u} \bar\phi_\lambda(x) + \bar{c}(x, u) - \lambda\, \bar\phi_\lambda(x) - \lambda\, \phi_\lambda(0) \right] = 0 . \tag{4.11}$$
Let {λ_n} be a sequence such that λ_n → 0 as n → ∞. From (4.10) it follows, using Ascoli's theorem, that φ̄_{λ_n} converges to a function φ ∈ C(ℝ^d) uniformly on compact subsets of ℝ^d along a suitable subsequence of {λ_n}. By (2.9), λ φ_λ(0) is bounded in λ. Hence along a suitable subsequence of {λ_n} (still denoted by {λ_n} by an abuse of notation) λ_n φ_{λ_n}(0) converges to a scalar ρ as n → ∞. Thus by the stability property of viscosity solutions ([7], [8]) it follows that the pair (φ, ρ) is a viscosity solution of (4.1). Clearly φ(0) = 0 and from (4.10) it follows that φ is Lipschitz continuous. Thus (φ, ρ) ∈ G × ℝ. ∎
Theorem 4.2. Under (A1), (A2), (A3), if (φ, ρ) ∈ G × ℝ is a viscosity solution of (4.1), then ρ = ρ*.
Proof. Let γ > 0. Since (φ, ρ) is a viscosity solution of (4.1), φ is a viscosity solution of
$$\inf_{u \in U}\left[ (L^{u} - \gamma)\,\phi(x) + \bar{c}(x, u) - \rho + \gamma\, \phi(x) \right] = 0 . \tag{4.12}$$
Therefore, by the uniqueness of viscosity solutions ([7], [8]), it can be shown as in [8] that
$$\phi(x) = \inf_{u(\cdot)} E\!\left[\int_0^{\infty} e^{-\gamma t}\, \big\{\bar{c}(X(t), u(t)) - \rho + \gamma\, \phi(X(t))\big\}\, dt \;\Big|\; X(0) = x\right] . \tag{4.13}$$
Let μ ∈ Γ and let (X(·), u(·)) be the stationary solution of the martingale problem corresponding to μ. Fix the probability space and the Wiener process W(·) on which u(·) is defined and consider the process X(·) with varying initial condition. Then for any x ∈ ℝ^d,
$$\phi(x) \le E\!\left[\int_0^{\infty} e^{-\gamma t}\, \big\{\bar{c}(X(t), u(t)) - \rho + \gamma\, \phi(X(t))\big\}\, dt \;\Big|\; X(0) = x\right] . \tag{4.14}$$
Letting γ → 0 and arguing as before using (2.5), we obtain ρ ≤ ∫ c dμ = ρ_μ. Since μ ∈ Γ is arbitrary, ρ ≤ ρ*.

To get the reverse inequality, let u^γ(·) be an optimal relaxed feedback control for the cost criterion in (4.13). Let F_t = σ(X(s), s ≤ t). Then
$$e^{-\gamma t}\, \phi(X(t)) + \int_0^t e^{-\gamma s}\, \big\{\bar{c}(X(s), u^{\gamma}(s)) - \rho + \gamma\, \phi(X(s))\big\}\, ds$$
is an F_t-martingale ([4]). By Lemma 2.2 it is uniformly integrable. Therefore, letting γ → 0, we can argue as in [5, Lemma 1.1] to show that for some relaxed feedback control ū(·),
$$\phi(X(t)) + \int_0^t \big(\bar{c}(X(s), \bar{u}(s)) - \rho\big)\, ds$$
is an F_t-martingale. Taking expectations, dividing by t and letting t → ∞, we obtain
$$\rho \ge \liminf_{t\to\infty}\, \frac{1}{t}\, E\!\left[\int_0^t \bar{c}(X(s), \bar{u}(s))\, ds\right] \ge \rho^* . \qquad ∎$$
Theorem 4.3. Under (A1), (A2) and (A3), let (φ, ρ*) ∈ G × ℝ be a viscosity solution of (4.1). Then:
(i) If u(·) is any admissible relaxed control, then φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-submartingale, where F_t = σ(X(s), s ≤ t).
(ii) If for some admissible relaxed control u(·), φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-martingale, then u(·) is optimal.
(iii) If u(·) is an admissible relaxed control which is optimal and the process (X(·), u(·)) is stationary with E[∫_V f(X(t), v) u(t)(dv)] = ∫_{ℝ^d×V} f dμ, μ ∈ P(ℝ^d × V), for each t, then φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-martingale.
Proof. (i) Let u(·) be any admissible relaxed control. By (4.12),
$$e^{-\gamma t}\, \phi(X(t)) + \int_0^t e^{-\gamma s}\, \big\{\bar{c}(X(s), u(s)) - \rho^* + \gamma\, \phi(X(s))\big\}\, ds$$
is an F_t-submartingale [4]. Letting γ → 0, (i) follows.

(ii) Suppose φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-martingale under u(·). Taking X(0) = x, we have
$$\phi(x) = E\!\left[\phi(X(t)) + \int_0^t \big(\bar{c}(X(s), u(s)) - \rho^*\big)\, ds\right] .$$
Therefore, letting t → ∞ and using Lemma 2.2,
$$\lim_{t\to\infty}\, \frac{1}{t}\, E\!\left[\int_0^t \bar{c}(X(s), u(s))\, ds\right] = \rho^* .$$
Thus u(·) is optimal.

(iii) Let M_t = φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds. By (i), M_t is an F_t-submartingale, and by stationarity and the optimality of u(·), E M_t = ∫ φ dμ for each t. Suppose M_t is not a martingale. Then there exist t > s > 0 and A ∈ F_s with P(A) > 0 such that E[M_t 1_A] > E[M_s 1_A], and hence E M_t > E M_s, which is a contradiction. ∎
Theorem 4.4. The equation (4.1) has a unique viscosity solution in G × ℝ.
Proof. Let (φ, ρ) ∈ G × ℝ and (ψ, ρ') ∈ G × ℝ be two viscosity solutions of (4.1). Then by Theorem 4.2, ρ = ρ' = ρ*. Let u*(·) be an optimal relaxed feedback control such that the process (X(·), u*(·)) is stationary and satisfies the condition in Theorem 4.3(iii) for each t. Then by Theorem 4.3,
$$\phi(X(t)) + \int_0^t \big(\bar{c}(X(s), u^*(s)) - \rho^*\big)\, ds \quad \text{and} \quad \psi(X(t)) + \int_0^t \big(\bar{c}(X(s), u^*(s)) - \rho^*\big)\, ds$$
are F_t-martingales, and therefore so is φ(X(t)) − ψ(X(t)). By Lemma 2.2 it is uniformly integrable and therefore converges a.s. Hence φ − ψ is a constant c' on the support M of μ̄, where μ̄ is the marginal of μ on ℝ^d. Without loss of generality assume that c' ≥ 0 (otherwise consider ψ − φ). If 0 ∈ M, then c' = 0. If 0 ∉ M, assume c' > 0. Then for some ε > 0, φ − ψ > c'/2 on an ε-neighbourhood M_ε of M, and 0 ∉ M_ε. Let
$$\tau_\varepsilon = \inf\{\, t > 0 \mid X(t) \in \partial M_\varepsilon \,\} .$$
We claim that E[τ_ε] < ∞ for any x ∉ M_ε. Indeed, let x ∉ M_ε and y ∈ M. Then, since M is an invariant set for X(·) under u*(·),
$$P\big(X(x, t) \notin M_\varepsilon\big) \le \frac{E\,|X(x, t) - X(y, t)|}{\varepsilon} \le \frac{C_4\, |x - y|}{\varepsilon}\, e^{-C_5 t}$$
by (2.5) and Chebyshev's inequality. Thus P(τ_ε > t) decays exponentially in t, and hence E[τ_ε] < ∞.
We can mimic the arguments in [1, Proposition 3.1] to show that, for some δ > 0, the estimates (4.15a) and (4.15b) hold. In view of (4.15a) and (4.15b), the optimal control problems of minimizing (4.16) and a second, analogous criterion over all admissible relaxed controls are well posed. Optimal relaxed controls for these problems can be shown to exist [4]. Let ū(·) be an optimal relaxed feedback control for (4.16). Then the corresponding stopped expression involving φ is a martingale, while the one involving ψ is a submartingale. Thus ψ(X(τ_ε ∧ t)) − φ(X(τ_ε ∧ t)) is a submartingale. Taking x = 0 and letting t → ∞, we get c' ≤ 0, a contradiction. Thus c' = 0. By the same argument we have ψ(x) − φ(x) ≤ 0 for any x ∈ ℝ^d. Similarly we can show that φ(x) − ψ(x) ≤ 0 for any x ∈ ℝ^d. Hence φ ≡ ψ. ∎
REFERENCES

[1] G.K. Basak, A class of limit theorems for singular diffusions, J. Multivariate Analysis 39 (1991), 44-59.

[2] G.K. Basak and R.N. Bhattacharya, Stability in distribution for a class of singular diffusions, Annals of Probability 20 (1992), 312-321.

[3] A.G. Bhatt and V.S. Borkar, Occupation measures for controlled Markov processes: characterization and optimality, Preprint.

[4] V.S. Borkar, Optimal Control of Diffusion Processes, Pitman Research Notes in Mathematics No. 203, Longman Scientific and Technical, Harlow, 1989.

[5] V.S. Borkar, On extremal solutions to stochastic control problems, Appl. Math. Optim. 24 (1991), 317-330.

[6] V.S. Borkar, A note on ergodic control of degenerate diffusions, Preprint.

[7] W.H. Fleming and H.M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, 1991.

[8] P.L. Lions, Optimal control of diffusion processes and Hamilton-Jacobi-Bellman equations, Part II: viscosity solutions and uniqueness, Communications in Partial Differential Equations 8 (1983), 1229-1276.

[9] R.H. Stockbridge, Time-average control of a martingale problem: existence of a stationary solution, Annals of Probability 18 (1990), 190-205.

[10] R.H. Stockbridge, Time-average control of a martingale problem: a linear programming formulation, Annals of Probability 18 (1990), 206-217.