Stochastic Analysis and Applications
Publication details: http://www.informaworld.com/smpp/title~content=t713597300
To cite this article: Basak, Gopal K., Borkar, Vivek S. and Ghosh, Mrinal K. (1997) 'Ergodic control of degenerate diffusions', Stochastic Analysis and Applications, 15:1, 1-17.
DOI: 10.1080/07362999708809460
URL: http://dx.doi.org/10.1080/07362999708809460
STOCHASTIC ANALYSIS AND APPLICATIONS, 15(1), 1-17 (1997)
ERGODIC CONTROL OF DEGENERATE
DIFFUSIONS
Gopal K. Basak*
Dept. of Mathematics
Hong Kong University of Science and Technology
Clear Water Bay, Kowloon, Hong Kong
mabasak@uxmail.ust.hk
Vivek S. Borkar
Dept. of Electrical Engineering
Indian Institute of Science
Bangalore 560 012, India
vborkar@vidyut.ee.iisc.ernet.in
Mrinal K. Ghosh
Dept. of Mathematics
Indian Institute of Science
Bangalore 560 012, India
mrinal@math.iisc.ernet.in
ABSTRACT
We study the ergodic control problem of degenerate diffusions on ℝ^d. Under a certain Liapunov type stability condition we establish the existence of an optimal control. We then study the corresponding HJB equation and establish the existence of a unique viscosity solution in a certain class.
* Research partially supported by grant no. D4087 DAG 93/94 SC29.
† Research supported by grant no. 26/01/92-G from DAE, Govt. of India.
Copyright © 1997 by Marcel Dekker, Inc.
1. INTRODUCTION
We address the problem of controlling degenerate diffusions by continually monitoring the drift. The objective is to minimize the long-run average (ergodic) cost over all admissible controls. The state X(t) of the system at time t is governed by
$$dX(t) = b(X(t), u(t))\,dt + \sigma(X(t))\,dW(t), \qquad X(0) = X_0,$$
where b : ℝ^d × U → ℝ^d and σ : ℝ^d → ℝ^{d×d} are respectively the drift vector and the diffusion matrix, and U is a given compact metric space which is the control set. X_0 is a prescribed ℝ^d-valued random variable, W(·) is a d-dimensional standard Wiener process independent of X_0, and u(·) is a non-anticipative control process taking values in U. Such a control is called an admissible control. Our aim is to minimize over all admissible controls the quantity
$$\limsup_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T c(X(t), u(t))\,dt\right],$$
where c is the running cost function. If the minimum is attained for some u(·), that u(·) is called an optimal control. Under certain conditions we will show that there exists an optimal control. We then study the corresponding Hamilton-Jacobi-Bellman (HJB for short) equation. The HJB equation for this problem is given by
$$\inf_{u \in U}\left[\frac{1}{2}\sum_{i,j=1}^d a_{ij}(x)\,\frac{\partial^2 \phi(x)}{\partial x_i\, \partial x_j} + \sum_{i=1}^d b_i(x, u)\,\frac{\partial \phi(x)}{\partial x_i} + c(x, u) - \rho\right] = 0,$$
where a(x) = σ(x)σ(x)', φ : ℝ^d → ℝ and ρ is a scalar. Under certain conditions we establish the existence of a unique viscosity solution (φ, ρ) of the above equation. Finally we characterize the optimal control via this unique solution.
The existence of an optimal control for this problem has been studied in [3], [4], [6], [10]. In these works the existence of an optimal control has been established under suitable conditions; however, one does not obtain the existence of an optimal control for an arbitrary initial law. Under a certain Liapunov type stability condition we remove this limitation. To the best of our knowledge the viscosity solution of the HJB equation for the ergodic control problem has not been studied thus far.
Our paper is organized as follows. Section 2 deals with notation
and preliminaries. In Section 3 we establish the existence of an optimal
control. The HJB equation is studied in Section 4.
2. NOTATION AND PRELIMINARIES
Let V be a compact metric space and U = P(V) the space of probability measures on V endowed with the topology of weak convergence. We consider the d-dimensional controlled diffusion process X(·) = [X_1(·), ..., X_d(·)]' described by
$$dX(t) = \int_V b(X(t), v)\,u(t)(dv)\,dt + \sigma(X(t))\,dW(t), \qquad X(0) = X_0, \tag{2.1}$$
where b(·,·) = [b_1(·,·), ..., b_d(·,·)]' : ℝ^d × V → ℝ^d is the drift vector, σ(·) = [σ_{ij}(·)] : ℝ^d → ℝ^{d×d} is the diffusion matrix, X_0 is an ℝ^d-valued random variable with a prescribed law, W(·) = [W_1(·), ..., W_d(·)]' is a d-dimensional standard Wiener process independent of X_0, and u(·) is a U-valued process with measurable sample paths satisfying the following nonanticipativity property: for t ≥ s, W(t) − W(s) is independent of {u(r), W(r), r ≤ s}. Such a process is called an admissible relaxed control. It is called an admissible precise control if u(·) = δ_{v(·)} for a V-valued process v(·) satisfying the same nonanticipativity condition. We make the following assumptions. We denote various constants by C_i.
(A1) (i) b is continuous and for x, y ∈ ℝ^d, v ∈ V,
$$|b(x, v) - b(y, v)| \le C_1\, |x - y| .$$
(ii) σ is continuous and for x, y ∈ ℝ^d,
$$\|\sigma(x) - \sigma(y)\| \le C_2\, |x - y| .$$
Under (A1), for a prescribed admissible control the equation (2.1) has a unique strong solution. For v ∈ V, f ∈ W^{2,p}_{loc}(ℝ^d), x ∈ ℝ^d, let
$$L^{v} f(x) = \frac{1}{2} \sum_{i,j=1}^{d} a_{ij}(x)\, \frac{\partial^2 f(x)}{\partial x_i\, \partial x_j} + \sum_{i=1}^{d} b_i(x, v)\, \frac{\partial f(x)}{\partial x_i}, \tag{2.2}$$
where a_{ij}(·) = Σ_{k=1}^{d} σ_{ik}(·) σ_{jk}(·), and for u ∈ U,
$$L^{u} f(x) = \int_V L^{v} f(x)\, u(dv) .$$
An admissible control (relaxed or precise) u(·) is called feedback if u(·) is progressively measurable with respect to the natural filtration of X(·). In such a case (2.1) will not in general admit a strong solution. We say that (X(·), u(·)) is a stationary relaxed solution of the controlled martingale problem for L if (X(·), u(·)) is a stationary process such that u(·) is a relaxed feedback control and X(·) is the corresponding controlled diffusion.
We now state the following result, which plays a very crucial role in the existence of an optimal control. For a proof see [9].

Theorem 2.1. If μ ∈ P(ℝ^d × V) satisfies
$$\int_{\mathbb{R}^d \times V} L^{v} f(x)\, \mu(dx, dv) = 0 \qquad \text{for all } f \in C_c^{\infty}(\mathbb{R}^d),$$
then there exists a stationary relaxed solution (X(·), u(·)) of the controlled martingale problem for L such that for each t ≥ 0,
$$E\!\left[\int_V f(X(t), v)\, u(t)(dv)\right] = \int_{\mathbb{R}^d \times V} f\, d\mu \qquad \text{for all } f \in C_b(\mathbb{R}^d \times V).$$
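To make the condition in Theorem 2.1 concrete, the following sketch (our own illustration, not part of the paper) checks it numerically for an uncontrolled one-dimensional Ornstein-Uhlenbeck diffusion, where the generator and the stationary law are known in closed form; the toy dynamics, the test function and all names in the code are assumptions made for this demonstration.

```python
# Toy check of the stationarity condition in Theorem 2.1 (illustrative only):
# for dX = -X dt + dW the stationary law is N(0, 1/2) and the generator is
# Lf(x) = 0.5 f''(x) - x f'(x); the integral of Lf against the stationary law
# should vanish for smooth, rapidly decaying test functions f.
import numpy as np
from scipy.integrate import quad

def stationary_density(x, var=0.5):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def f(x):        # a smooth rapidly decaying test function (stand-in for C_c^infinity)
    return np.exp(-x**2) * np.sin(x)

def f1(x):       # f'
    return np.exp(-x**2) * (np.cos(x) - 2.0 * x * np.sin(x))

def f2(x):       # f''
    return np.exp(-x**2) * (-4.0 * x * np.cos(x) + (4.0 * x**2 - 3.0) * np.sin(x))

def Lf(x):       # generator of the OU diffusion applied to f
    return 0.5 * f2(x) - x * f1(x)

value, _ = quad(lambda x: Lf(x) * stationary_density(x), -10.0, 10.0)
print(f"integral of Lf against the stationary law: {value:.2e}")   # numerically ~ 0
```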
Let c : ℝ^d × V → ℝ be the cost function. We make the following assumption on c.

(A2) c is continuous and for x, y ∈ ℝ^d, v ∈ V,
$$|c(x, v) - c(y, v)| \le C_3\, |x - y| .$$
Our aim is to minimize the long run average (or ergodic) cost
$$\limsup_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T \!\!\int_V c(X(t), v)\, u(t)(dv)\, dt\right]$$
over all admissible relaxed controls. We will carry out our program under the following 'stability' assumption.
(A3) There exist a symmetric positive definite matrix Q and a constant α > 0 such that, for all x, y ∈ ℝ^d and v ∈ V, the coefficients b and σ satisfy (uniformly in v) a Liapunov type contraction condition of the kind used in [2].
The following example shows that the assumptions (A1) and (A3) arise naturally.

Example. Let V = [0, 1], b(x, v) = Bx + Dv, σ_1(x) = x and σ_i(x) = 0 for all i ≠ 1, where σ_j denotes the j-th column of σ. Here B, D are constant d × d matrices and all eigenvalues of B have negative real part. It is easy to see that (A1) is satisfied. Now notice that there exists a positive definite matrix Q such that B'Q + QB = −I. Using this, one obtains (A3) for x ≠ y.
We will now establish the 'asymptotic flatness of the flow' under (A1), (A3). We closely follow the arguments in [2].

Lemma 2.1. Let u(·) be any admissible relaxed control and let X(x, t) be the corresponding solution with initial condition X(0) = x. Then under (A1), (A3) there exist constants C_4 > 0, C_5 > 0, independent of u(·), such that for any x, y ∈ ℝ^d,
$$E\,|X(x, t) - X(y, t)| \le C_4\, e^{-C_5 t}\, |x - y| . \tag{2.5}$$
Proof. Consider the Liapunov function
$$w(x) = (x' Q x)^{1/2} . \tag{2.6}$$
The function w may be modified near the origin to make it a C² function on all of ℝ^d. Write ∂_k to denote partial differentiation with respect to the k-th coordinate. Then, using (A3), for x ≠ y and any u ∈ U,
$$L^{u} w(x - y) \le -\gamma\, \{(x - y)' Q (x - y)\}^{1/2}$$
for some γ > 0. Let τ = inf{t ≥ 0 : X(x, t) = X(y, t)} (possibly +∞). Now, by Itô's formula and the above bound,
$$E\big[w(X(x, t \wedge \tau) - X(y, t \wedge \tau))\big] \le w(x - y) - \gamma\, E\!\left[\int_0^{t \wedge \tau} w(X(x, s) - X(y, s))\, ds\right] . \tag{2.7}$$
For t ≥ τ, X(x, t) = X(y, t) a.s. by the pathwise uniqueness of the solution of (2.1). Thus by Gronwall's inequality it follows that for any t ≥ 0,
$$E\big[w(X(x, t) - X(y, t))\big] \le e^{-\gamma t}\, w(x - y) . \tag{2.8}$$
Finally, using the positive definiteness of Q, (2.5) follows from (2.8). ∎
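The following simulation (our own illustration, not from the paper) makes the estimate (2.5) visible for the linear example above: two copies of the diffusion driven by the same Wiener path and the same open-loop control coalesce exponentially fast. The matrices, the control path, the step size and the horizon are arbitrary choices.

```python
# Euler-Maruyama illustration of asymptotic flatness (2.5) for the example
# b(x,v) = Bx + Dv, sigma(x) = [x, 0, ..., 0]: two copies started at different
# points but sharing the Wiener increments and the control path approach
# each other exponentially fast.
import numpy as np

rng = np.random.default_rng(1)
d, dt, n_steps = 3, 1e-3, 20000

A = rng.standard_normal((d, d))
B = A - (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(d)   # stable drift matrix
D = rng.standard_normal((d, d))

def drift(x, t):
    v = 0.5 * (1.0 + np.sin(t)) * np.ones(d)     # an arbitrary open-loop control path in [0,1]^d
    return B @ x + D @ v

def diffusion_increment(x, dW):
    # sigma(x) has x as its first column and zeros elsewhere, so sigma(x) dW = x * dW[0]
    return x * dW[0]

x, y = np.ones(d), -2.0 * np.ones(d)
print("initial gap:", np.linalg.norm(x - y))
for n in range(n_steps):
    t = n * dt
    dW = np.sqrt(dt) * rng.standard_normal(d)    # the SAME Wiener increment for both copies
    x = x + drift(x, t) * dt + diffusion_increment(x, dW)
    y = y + drift(y, t) * dt + diffusion_increment(y, dW)
print("final gap:  ", np.linalg.norm(x - y))     # many orders of magnitude smaller
```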
The arguments in [1, Proposition 3.1] can be mimicked to get the following result.

Lemma 2.2. Let u(·) be any admissible relaxed control and X(x, t) the corresponding process with initial condition X(0) = x. Then under (A1) and (A3) there exist a constant δ > 0 and a function g(·) such that for any t > 0,
$$E\big[|X(x, t)|^{1+\delta}\big] \le g(x) .$$

3. EXISTENCE RESULTS
In this section we establish the existence of an optimal control under (A1), (A2), (A3). Our arguments closely follow those in [4, Chapter 6]; therefore we present only brief sketches to illustrate the main ideas. Let
$$\Gamma = \left\{ \mu \in P(\mathbb{R}^d \times V) \;\Big|\; \int_{\mathbb{R}^d \times V} L^{v} f(x)\, \mu(dx, dv) = 0 \ \text{ for all } f \in C_c^{\infty}(\mathbb{R}^d) \right\} . \tag{3.1}$$
For μ ∈ Γ, let
$$\rho_\mu = \int c\, d\mu . \tag{3.2}$$
Using (A2), Theorem 2.1 and Lemma 2.2, it can be shown that
$$\sup_{\mu \in \Gamma} \int |c|^{1+\delta}\, d\mu < \infty, \tag{3.3}$$
where δ > 0 is as in Lemma 2.2. Let
$$\rho^* = \inf_{\mu \in \Gamma} \rho_\mu . \tag{3.4}$$
Lemma 3.1. The set Γ in (3.1) is compact.

Proof. In view of [4, Lemma 5.1] it suffices to show that there exists a C² Liapunov function w : ℝ^d → ℝ such that
$$\lim_{|x| \to \infty} L^{v} w(x) = -\infty \qquad \text{uniformly in } v . \tag{3.5}$$
Now consider the Liapunov function w(x) = (x'Qx)^{1/2}, with the suitable modification near the origin already considered in the proof of Lemma 2.1. This clearly satisfies (3.5). Hence the result follows from [4, Lemma 5.1]. ∎
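For the linear example of Section 2 the condition (3.5) can be checked explicitly; the sketch below (ours, not part of the paper) evaluates L^v w(x) along a ray and shows it decreasing roughly linearly in |x|. The matrices, the fixed control value and the evaluation points are arbitrary choices.

```python
# Numerical illustration that w(x) = (x'Qx)^{1/2} satisfies (3.5) for the
# example b(x,v) = Bx + Dv, a(x) = sigma(x) sigma(x)' = x x':
# L^v w(x) -> -infinity as |x| -> infinity, uniformly over bounded v.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(2)
d = 3
A = rng.standard_normal((d, d))
B = A - (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(d)
D = rng.standard_normal((d, d))
Q = solve_continuous_lyapunov(B.T, -np.eye(d))          # B'Q + QB = -I

def generator_w(x, v):
    # closed-form gradient and Hessian of w(x) = (x'Qx)^{1/2}
    q = x @ Q @ x
    grad = Q @ x / np.sqrt(q)
    hess = Q / np.sqrt(q) - np.outer(Q @ x, Q @ x) / q**1.5
    a = np.outer(x, x)                                   # a(x) = x x' for this example
    return 0.5 * np.sum(a * hess) + (B @ x + D @ v) @ grad

direction = rng.standard_normal(d)
v_fixed = np.ones(d)                                     # any fixed control value in [0,1]^d
for r in [1, 10, 100, 1000]:
    print(f"|x| scale {r:5d}:  L^v w(x) = {generator_w(r * direction, v_fixed):12.2f}")
# the printed values decrease roughly linearly in the scale, consistent with (3.5)
```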
Corollary 3.1. There exists a μ* ∈ Γ such that
$$\int c\, d\mu^* = \rho^* .$$

Proof. From (3.3) it follows that μ ↦ ∫ c dμ is continuous on Γ. Since Γ is compact the result follows. ∎

Let (X*(·), u*(·)) be the stationary relaxed solution of the controlled martingale problem corresponding to μ*. Then, if the law of (X*(0), u*(0)) is such that
$$E\!\left[\int_V f(X^*(0), v)\, u^*(0)(dv)\right] = \int_{\mathbb{R}^d \times V} f\, d\mu^* \qquad \text{for all } f \in C_b(\mathbb{R}^d \times V),$$
then
$$\lim_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T \!\!\int_V c(X^*(s), v)\, u^*(s)(dv)\, ds\right] = \int c\, d\mu^* = \rho^* . \tag{3.6}$$
Now fix the probability space and the Wiener process W(·) on which u*(·) is defined. Consider the corresponding process X*(·) with varying initial law. Using (2.5) we can argue as in [4, Theorem 1.2.2] that (3.6) holds for any initial law of X*(0). We can then mimic the arguments in [4, Chapter 6] to conclude the following.
Theorem 3.1. Under any admissible relaxed control u(·) and any initial law, the corresponding solution X(·) of (2.1) satisfies
$$\liminf_{T\to\infty}\, \frac{1}{T}\, E\!\left[\int_0^T \!\!\int_V c(X(s), v)\, u(s)(dv)\, ds\right] \ge \rho^* . \tag{3.7}$$
Thus u*(·) as above is optimal.
Remark 3.1. Note that (3.7) establishes a much stronger optimality of
u*(.), viz., the most "pessimistic" average cost under u*(.) is no worse
than the most "optimistic" average cost under any other admissible
control.
4. HAMILTON-JACOBI-BELLMAN EQUATION
The HJB equation for the ergodic control problem is
$$\inf_{u \in U}\left[ L^{u} \phi(x) + \bar{c}(x, u) - \rho \right] = 0, \tag{4.1}$$
where c̄(x, u) = ∫_V c(x, v) u(dv) for u ∈ U, φ : ℝ^d → ℝ is a suitable function and ρ is a scalar. Solving the HJB equation means finding a suitable pair (φ, ρ) satisfying (4.1) in an appropriate sense. For the nondegenerate case, under certain assumptions, one can find a unique solution (φ, ρ) of (4.1) in a certain class such that φ is C² [4, Chapter 6]. To obtain analogous results for the degenerate case we introduce the notion of a viscosity solution of (4.1) ([7], [8]). Let φ ∈ C(ℝ^d) and ρ ∈ ℝ.
Definition 4.1. The pair (φ, ρ) is said to be a viscosity solution of (4.1) if for any ψ ∈ C²(ℝ^d),
$$\inf_{u \in U}\left[ L^{u} \psi(x) + \bar{c}(x, u) - \rho \right] \ge 0 \tag{4.3}$$
at each local maximum x of (φ − ψ), and
$$\inf_{u \in U}\left[ L^{u} \psi(x) + \bar{c}(x, u) - \rho \right] \le 0 \tag{4.4}$$
at each local minimum x of (φ − ψ).
Let
$$G = \left\{ f : \mathbb{R}^d \to \mathbb{R} \;\Big|\; f(0) = 0, \ f \text{ is Lipschitz continuous} \right\} . \tag{4.5}$$
We will show that (4.1) has a unique viscosity solution in G × ℝ. To this end we follow the traditional vanishing discount method. Let λ > 0. For an admissible relaxed control u(·), let
$$J_\lambda(x, u(\cdot)) = E\!\left[\int_0^{\infty} e^{-\lambda t} \int_V c(X(t), v)\, u(t)(dv)\, dt \;\Big|\; X(0) = x\right] . \tag{4.6}$$
Let φ_λ denote the discounted value function, i.e.
$$\phi_\lambda(x) = \inf_{u(\cdot)} J_\lambda(x, u(\cdot)) , \tag{4.7}$$
the infimum being over all admissible relaxed controls. Let
$$H = \left\{ f : \mathbb{R}^d \to \mathbb{R} \;\Big|\; f \text{ is continuous and } |f(x)| \le C(1 + |x|) \text{ for some constant } C \right\} . \tag{4.8}$$
By the results of [8] (see also [7]), φ_λ is the unique viscosity solution in H of the following HJB equation for the discounted control problem:
$$\inf_{u \in U}\left[ L^{u} \phi_\lambda(x) + \bar{c}(x, u) - \lambda\, \phi_\lambda(x) \right] = 0 . \tag{4.9}$$
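To illustrate the vanishing discount idea used below (our own toy example, not from the paper), consider the uncontrolled scalar diffusion dX = −X dt + dW with running cost c(x) = x²: there is nothing to optimise, the discounted value at the origin is available in closed form, and λ φ_λ(0) converges to the ergodic cost 1/2 as λ → 0. All names in the sketch are ours.

```python
# Vanishing discount on a toy problem: for dX = -X dt + dW and c(x) = x^2,
# E[X(t)^2 | X(0)=0] = (1 - exp(-2t))/2, so the discounted value at 0 is
# phi_lambda(0) = 1/(lambda (lambda + 2)); the stationary law N(0, 1/2)
# gives the ergodic cost rho = 1/2, and lambda * phi_lambda(0) -> rho.
def discounted_value_at_zero(lam: float) -> float:
    return 1.0 / (lam * (lam + 2.0))

for lam in [1.0, 0.1, 0.01, 0.001]:
    print(f"lambda = {lam:7.3f}   lambda * phi_lambda(0) = {lam * discounted_value_at_zero(lam):.4f}")
# printed values approach the ergodic cost 0.5
```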
Theorem 4.1. Under (A1), (A2) and (A3), (4.1) has a viscosity solution in G × ℝ.
Proof. For x, y ∈ ℝ^d and any admissible relaxed control u(·) we have, by (A2) and (2.5),
$$\big|J_\lambda(x, u(\cdot)) - J_\lambda(y, u(\cdot))\big| \le C_3 \int_0^{\infty} e^{-\lambda t}\, E\,|X(x, t) - X(y, t)|\, dt \le \frac{C_3 C_4}{C_5}\, |x - y| .$$
Thus there exists a constant C_6 independent of λ such that
$$|\phi_\lambda(x) - \phi_\lambda(y)| \le C_6\, |x - y| . \tag{4.10}$$
Let φ̄_λ(x) = φ_λ(x) − φ_λ(0). Then φ̄_λ is the unique viscosity solution in H to
$$\inf_{u \in U}\left[ L^{u} \bar\phi_\lambda(x) + \bar{c}(x, u) - \lambda\, \bar\phi_\lambda(x) - \lambda\, \phi_\lambda(0) \right] = 0 . \tag{4.11}$$
Let {λ_n} be a sequence such that λ_n → 0 as n → ∞. From (4.10) it follows, using Ascoli's theorem, that φ̄_{λ_n} converges to a function φ ∈ C(ℝ^d) uniformly on compact subsets of ℝ^d along a suitable subsequence of {λ_n}. By (2.9), λ φ_λ(0) is bounded in λ. Hence along a suitable subsequence of {λ_n} (still denoted by {λ_n} by an abuse of notation) λ_n φ_{λ_n}(0) converges to a scalar ρ as n → ∞. Thus by the stability property of viscosity solutions ([7], [8]) it follows that the pair (φ, ρ) is a viscosity solution of (4.1). Clearly φ(0) = 0 and from (4.10) it follows that φ is Lipschitz continuous. Thus (φ, ρ) ∈ G × ℝ. ∎
Theorem 4.2. Under (A1), (A2), (A3), if (φ, ρ) ∈ G × ℝ is a viscosity solution of (4.1), then ρ = ρ*.
Proof. Let γ > 0. Since (φ, ρ) is a viscosity solution of (4.1), φ is a viscosity solution of
$$\inf_{u \in U}\left[ (L^{u} - \gamma)\,\phi(x) + \bar{c}(x, u) - \rho + \gamma\, \phi(x) \right] = 0 . \tag{4.12}$$
Therefore, by the uniqueness of viscosity solutions ([7], [8]), it can be shown as in [8] that
$$\phi(x) = \inf_{u(\cdot)} E\!\left[\int_0^{\infty} e^{-\gamma t}\, \big\{\bar{c}(X(t), u(t)) - \rho + \gamma\, \phi(X(t))\big\}\, dt \;\Big|\; X(0) = x\right] . \tag{4.13}$$
Let μ ∈ Γ and let (X(·), u(·)) be the stationary solution of the martingale problem corresponding to μ. Fix the probability space and the Wiener process W(·) on which u(·) is defined and consider the process X(·) with varying initial condition. Then for any x ∈ ℝ^d,
$$\phi(x) \le E\!\left[\int_0^{\infty} e^{-\gamma t}\, \big\{\bar{c}(X(t), u(t)) - \rho + \gamma\, \phi(X(t))\big\}\, dt \;\Big|\; X(0) = x\right] . \tag{4.14}$$
Letting γ → 0 and arguing as before using (2.5), we obtain ρ ≤ ∫ c dμ = ρ_μ. Since μ ∈ Γ is arbitrary, ρ ≤ ρ*.

To get the reverse inequality, let u^γ(·) be an optimal relaxed feedback control for the cost criterion in (4.13). Let F_t = σ(X(s), s ≤ t). Then
$$e^{-\gamma t}\, \phi(X(t)) + \int_0^t e^{-\gamma s}\, \big\{\bar{c}(X(s), u^{\gamma}(s)) - \rho + \gamma\, \phi(X(s))\big\}\, ds$$
is an F_t-martingale ([4]). By Lemma 2.2 it is uniformly integrable. Therefore, letting γ → 0, we can argue as in [5, Lemma 1.1] to show that for some relaxed feedback control ū(·),
$$\phi(X(t)) + \int_0^t \big(\bar{c}(X(s), \bar{u}(s)) - \rho\big)\, ds$$
is an F_t-martingale. Taking expectations, dividing by t and letting t → ∞, we obtain
$$\rho \ge \liminf_{t\to\infty}\, \frac{1}{t}\, E\!\left[\int_0^t \bar{c}(X(s), \bar{u}(s))\, ds\right] \ge \rho^* . \qquad ∎$$
Theorem 4.3. Under (A1), (A2) and (A3), let (φ, ρ*) ∈ G × ℝ be a viscosity solution of (4.1). Then:
(i) If u(·) is any admissible relaxed control, then φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-submartingale, where F_t = σ(X(s), s ≤ t).
(ii) If for some admissible relaxed control u(·), φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-martingale, then u(·) is optimal.
(iii) If u(·) is an admissible relaxed control which is optimal and the process (X(·), u(·)) is stationary with E[∫_V f(X(t), v) u(t)(dv)] = ∫_{ℝ^d×V} f dμ, μ ∈ P(ℝ^d × V), for each t, then φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-martingale.
Proof. (i) Let u(·) be any admissible relaxed control. By (4.12),
$$e^{-\gamma t}\, \phi(X(t)) + \int_0^t e^{-\gamma s}\, \big\{\bar{c}(X(s), u(s)) - \rho^* + \gamma\, \phi(X(s))\big\}\, ds$$
is an F_t-submartingale [4]. Letting γ → 0, (i) follows.

(ii) Suppose φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds is an F_t-martingale under u(·). Taking X(0) = x, we have
$$\phi(x) = E\!\left[\phi(X(t)) + \int_0^t \big(\bar{c}(X(s), u(s)) - \rho^*\big)\, ds\right] .$$
Therefore, letting t → ∞ and using Lemma 2.2,
$$\lim_{t\to\infty}\, \frac{1}{t}\, E\!\left[\int_0^t \bar{c}(X(s), u(s))\, ds\right] = \rho^* .$$
Thus u(·) is optimal.

(iii) Let M_t = φ(X(t)) + ∫_0^t (c̄(X(s), u(s)) − ρ*) ds. By (i), M_t is an F_t-submartingale, and by stationarity and the optimality of u(·), E M_t = ∫ φ dμ for each t. Suppose M_t is not a martingale. Then there exist t > s > 0 and A ∈ F_s with P(A) > 0 such that E[M_t 1_A] > E[M_s 1_A], and hence E M_t > E M_s, which is a contradiction. ∎
Theorem 4.4. The equation (4.1) has a unique viscosity solution in G × ℝ.
Proof. Let (φ, ρ) ∈ G × ℝ and (ψ, ρ') ∈ G × ℝ be two viscosity solutions of (4.1). Then by Theorem 4.2, ρ = ρ' = ρ*. Let u*(·) be an optimal relaxed feedback control such that the process (X(·), u*(·)) is stationary and satisfies the condition in Theorem 4.3(iii) for each t. Then by Theorem 4.3,
$$\phi(X(t)) + \int_0^t \big(\bar{c}(X(s), u^*(s)) - \rho^*\big)\, ds \quad \text{and} \quad \psi(X(t)) + \int_0^t \big(\bar{c}(X(s), u^*(s)) - \rho^*\big)\, ds$$
are F_t-martingales, and therefore so is φ(X(t)) − ψ(X(t)). By Lemma 2.2 it is uniformly integrable and therefore converges a.s. Hence φ − ψ is a constant c' on the support M of μ̄, where μ̄ is the marginal of μ on ℝ^d. Without loss of generality assume that c' ≥ 0 (otherwise consider ψ − φ). If 0 ∈ M, then c' = 0. If 0 ∉ M, assume c' > 0. Then for some ε > 0, φ − ψ > c'/2 on an ε-neighbourhood M_ε of M, and 0 ∉ M_ε. Let
$$\tau_\varepsilon = \inf\{\, t > 0 \mid X(t) \in \partial M_\varepsilon \,\} .$$
We claim that E[τ_ε] < ∞ for any x ∉ M_ε. Indeed, let x ∉ M_ε and y ∈ M. Then, since M is an invariant set for X(·) under u*(·),
$$P\big(X(x, t) \notin M_\varepsilon\big) \le \frac{E\,|X(x, t) - X(y, t)|}{\varepsilon} \le \frac{C_4\, |x - y|}{\varepsilon}\, e^{-C_5 t}$$
by (2.5) and Chebyshev's inequality. Thus P(τ_ε > t) decays exponentially in t, and hence E[τ_ε] < ∞.
We can mimic the arguments in [1, Proposition 3.1] to show that, for some δ > 0, the estimates (4.15a) and (4.15b) hold. In view of (4.15a) and (4.15b), the optimal control problems of minimizing (4.16) and a second, analogous criterion over all admissible relaxed controls are well posed. Optimal relaxed controls for these problems can be shown to exist [4]. Let ū(·) be an optimal relaxed feedback control for (4.16). Then the corresponding stopped expression involving φ is a martingale, while the one involving ψ is a submartingale. Thus ψ(X(τ_ε ∧ t)) − φ(X(τ_ε ∧ t)) is a submartingale. Taking x = 0 and letting t → ∞, we get c' ≤ 0, a contradiction. Thus c' = 0. By the same argument we have ψ(x) − φ(x) ≤ 0 for any x ∈ ℝ^d. Similarly we can show that φ(x) − ψ(x) ≤ 0 for any x ∈ ℝ^d. Hence φ ≡ ψ. ∎
REFERENCES

[1] G.K. Basak, A class of limit theorems for singular diffusions, J. Multivariate Analysis 39 (1991), 44-59.

[2] G.K. Basak and R.N. Bhattacharya, Stability in distribution for a class of singular diffusions, Annals of Probability 20 (1992), 312-321.

[3] A.G. Bhatt and V.S. Borkar, Occupation measures for controlled Markov processes: characterization and optimality, Preprint.

[4] V.S. Borkar, Optimal Control of Diffusion Processes, Pitman Research Notes in Mathematics No. 203, Longman Scientific and Technical, Harlow, 1989.

[5] V.S. Borkar, On extremal solutions to stochastic control problems, Appl. Math. Optim. 24 (1991), 317-330.

[6] V.S. Borkar, A note on ergodic control of degenerate diffusions, Preprint.

[7] W.H. Fleming and H.M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, 1991.

[8] P.L. Lions, Optimal control of diffusion processes and Hamilton-Jacobi-Bellman equations, Part II: viscosity solutions and uniqueness, Communications in Partial Differential Equations 8 (1983), 1229-1276.

[9] R.H. Stockbridge, Time-average control of a martingale problem: existence of a stationary solution, Annals of Probability 18 (1990), 190-205.

[10] R.H. Stockbridge, Time-average control of a martingale problem: a linear programming formulation, Annals of Probability 18 (1990), 206-217.