CONTROLLED EQUILIBRIUM SELECTION IN STOCHASTICALLY PERTURBED DYNAMICS

advertisement
arXiv:1504.04889v1 [math.PR] 19 Apr 2015
CONTROLLED EQUILIBRIUM SELECTION IN
STOCHASTICALLY PERTURBED DYNAMICS
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Abstract. We consider a dynamical system with finitely many equilibria and perturbed
by small noise, in addition to being controlled by an ‘expensive’ control. We study the
invariant distribution of the controlled process as the variance of the noise becomes vanishingly small. It is shown that depending on the relative magnitudes of the noise variance
and the ‘running cost’ for control, one can identify three regimes, in each of which the optimal control forces the invariant distribution of the process to concentrate near equilibria
that can be characterized according to the regime. We also show that in the vicinity of
the points of concentration the density of invariant distribution approximates the density
of a Gaussian, and we explicitly solve for its covariance matrix.
1. Introduction
The study of dynamical systems has a long and profound history. A lot of effort has
been devoted to understand the behavior of the system when it is perturbed by an additive
noise [6, 16, 22]. Small noise diffusions have found applications in climate modeling [4, 5],
electrical engineering [9, 27], finance [15] and many other areas. In this article we consider
a controlled dynamical system with small noise, which is modelled as a d−dimensional
controlled diffusion X = [X1 , . . . , Xd ]T governed by the stochastic integral equation
Z t
(1.1)
Xt = X0 +
m(Xs ) + ε Us ds + εν Wt , t ≥ 0 .
0
Here:
• m = [m1 , . . . , md ]T : Rd → Rd is a bounded C ∞ function with bounded derivatives,
• W is a standard Brownian motion in Rd ,
• U is an Rd −valued control process with measurable paths satisfying the nonanticipativity condition: for t > s, Wt − Ws is independent of X0 , Wr , Ur , r ≤ s. As pointed
out in [10, p. 18], we may, without loss of generality, consider U to be adapted to
the natural filtration of X. (The set of such ‘admissible’ controls is denoted by U.)
• 0 < ε ≪ 1,
• ν > 0.
We view this as a perturbation of the o.d.e. (for ordinary differential equation)
ẋ(t) = m x(t) ,
(1.2)
Date: April 21, 2015.
2000 Mathematics Subject Classification. 35R60, 93E20.
Key words and phrases. Controlled diffusion, equilibrium selection, large deviations, small noise, expensive controls, ergodic control, HJB equation, ergodic LQG.
1
2
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
perturbed by the ‘small noise’ εν Wt (‘small’ because ε ≪ 1) and the control εUt . In the
case when the control Ut ≡ 0, Freidlin and Wentzell developed a general framework for the
analysis of small noise perturbed dynamical systems in [16] that is based on the theory of
large deviations. The goal of this article is to study the effect of the additional control when
the parameter ε tends to 0. We assume that the set of nonwandering points of (1.2) consists
of finitely many hyperbolic equilibria, and that these are contained in some bounded open
set which is positively invariant under the flow. The objective is to minimize the long run
average (or ergodic) cost
Z T 1
2
1
(1.3)
ℓ(Xs ) + 2 |Us | ds ,
E
lim sup
T →∞ T
0
over all admissible controls. Here ℓ : Rd → R+ is a prescribed smooth, Lipschitz function
satisfying the condition:
lim ℓ(x) = ∞ .
|x|→∞
Since ε is small, this implies that the control is expensive. Under a stochastic Lyapunov
condition we introduce later, the cost is finite for U ≡ 0, ensuring in particular that the
set of controls with finite cost is nonempty. It is quite evident from ergodic theory that
for U = 0 the limit (1.3) is the moment of ℓ with respect to the invariant measure of
(1.1). It is not hard to show that as ε ց 0, the collection of invariant measures is tight
and concentrates on the set of equilibria of m. To find the actual support of the limit, in
the case of multiple equilibria, one often looks at the large deviation properties of these
invariant measures [16]. There are several studies in literature that deal with the large
deviation principle of invariant measures of dynamical systems. Among the most relevant
to the present are [14, 24] which obtain a large deviation principle for invariant measures
(more precisely, invariant densities) of (1.1) under the assumption that there is only one
equilibrium point. This is generalized to the multiple equilibria in [8]. A large deviation
principle for invariant measures for a class of reaction-diffusion systems is established in [13].
None of the above mentioned studies have any control component in their dynamics. One
of our motivations for this work is to add a control in the dynamics and study its effects on
the noise in the selection of equilibria.
Remark 1.1. The vector field m is assumed bounded for simplicity. The reader however
might notice that the regularity results in [3], on which the characterization of optimality
is based (see Theorem 2.2), the hypotheses in [3, Section 4.6.1] permit m to be unbounded
as long as
|m(x)|2
< ∞.
lim sup
ℓ(x)
|x|→∞
Provided that this condition is satisfied, the assumption that the drift is bounded can be
waived, and all the results of this paper hold unaltered, with the proofs requiring no major
modification.
Let us now state the following result on existence of solutions to (1.1) whose proof is
given in Appendix A.
R t
Lemma 1.1. Under the condition E 0 |Us |2 ds < ∞, ∀ t ≥ 0, the diffusion in (1.1) has a
unique weak solution.
CONTROLLED EQUILIBRIUM SELECTION
3
The qualitative properties of the dynamics are best understood if we consider the special
case d = 1, and m = − dF
dx for some continuously differentiable F : R → R. Then the
trajectory of (1.2) converges to a critical point of F . In fact, generically (i.e., for x(0) in an
open dense set) it converges to a stable one, i.e., to a local minimum. If one views the graph
of F as a ‘landscape’, the local minima are the bottoms of its ‘valleys’. The behavior of the
stochastically perturbed (albeit uncontrolled) version of this model, notably the analysis of
where the stationary probability distribution concentrates, has been of considerable interest
to physicists (see, e.g., [23, Chapter 8] or [16, Chapter 6]). Recent work on ‘stochastic
resonance’ (see, e.g., [21]) introduces an additional external input to the dynamics that
may be viewed as a control. This is chosen so that it ‘resonates’ with the noise in an
appropriate sense and induces transitions between valleys. The model in (1.1) goes a step
further and considers the full-fledged optimal control version of this, wherein one tries to
induce a preferred equilibrium behavior through a feedback control. The reason the latter
has to be ‘expensive’ is because this captures the physically realistic situation that one
can ‘tweak’ the dynamics but cannot replace it by something altogether different without
incurring considerable expense. The function ℓ captures the relative preference among
different points in the state space. Let us also point out that this problem can also be
viewed as a multi-scale diffusion problem as the control and noise are scaled differently.
The following hypothesis on the vector field m is in effect throughout the paper.
Hypothesis 1.1. The set S := {x ∈ Rd : m(x) = 0} is finite and its elements are hyperbolic, i.e., the Jacobian matrix Dm(x) of m at each point x ∈ S has no eigenvalues on the
imaginary axis. Also there exist a twice continuously differentiable function V̄ : Rd → R+ ,
and a bounded set K ⊂ Rd containing S, with the following properties:
(H1) c1 |x|2 ≤ V̄(x) ≤ c2 |x|2 for some positive constants c1 , c2 , and all x ∈ Kc .
(H2) ∇V̄ is Lipschitz and satisfies
hm(x), ∇V̄ (x)i < −2γ|x|
(1.4)
for some γ > 0, and all x ∈ Kc .
We need some definitions.
Definition 1.1. Let Ss ⊂ S denote the set of stable equilibria of (1.2), i.e., the set of points
z ∈ S for which the eigenvalues of Dm(z) have negative real parts.
Throughout the paper β∗ε denotes the optimal value of (1.3), η∗ε denotes the stationary
probability distribution of the process X under an optimal stationary Markov control (which
is denoted as v∗ε ), and ̺ε∗ its density (for existence and uniqueness see Theorem 2.2). We
also define the ‘running cost’ R[v] : Rd → R corresponding to a stationary Markov control
v : Rd → Rd by
1
R[v](x) := ℓ(x) + |v(x)|2 .
2
d
We say that a set K ⊂ R is stochastically stable (or that η∗ε concentrates on K) if it
is compact, and limεց0 η∗ε (K c ) = 0. It is evident that the class S of stochastically stable
sets, if nonempty, is closed under intersections. Hence we define the minimal stochastically
stable set S by S := ∩K∈S K.
4
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Definition 1.2. For a square matrix M ∈ Rd×d , let E+ (M ) denote the sum of its eigenvalues
bz
that lie in the open right half complex plane. For z ∈ S, and with M ≡ Dm(z), we let Q
b
and Σz be the symmetric, nonnegative definite, square matrices solving the pair of equations
bz + Q
bz M = Q
b z2 ,
M TQ
(1.5)
b z T = −I .
bz + Σ
bz M − Q
bz Σ
M −Q
bz , Σ
b z ) of
By Lemma 3.4, which appears later in the paper there exists a unique pair (Q
b z is
symmetric positive semidefinite matrices solving (1.5). It is also evident by (1.5) that Σ
invertible.
The main results of the paper are summarized in Theorems 1.1–1.3 that follow.
Theorem 1.1. The minimal stochastically stable set S is a subset of S for all ν > 0.
We define
Z1 := Arg min ℓ(z) ,
J1 := min ℓ(z) ,
z∈S
z∈S
Z2 := Arg min ℓ(z) + E+
z∈S
Z3 := Arg min ℓ(z) ,
Dm(z) ,
J2 := min ℓ(z) + E+ Dm(z) ,
z∈S
J3 := min ℓ(z) .
z∈S s
z∈S s
β∗ε
depend on ν as follows:
The set S and the optimal value
(i) For ν > 1 (‘supercritical’ regime), S ⊂ Z1 , and β∗ε = J1 + O(ε2ν−2 ) if ν ∈ (1, 2),
while O(ε2 ) ≤ β∗ε − J1 ≤ O(ε2ν−2 ) for ν ≥ 2.
(ii) For ν < 1 (‘subcritical’ regime), S ⊂ Z3 , and O(ε2−2ν ) ≤ β∗ε − J3 ≤ O(ε4ν−2 ) if
ν ∈ (2/3, 1), while β∗ε = J3 + O(εν ) if ν ≤ 2/3.
(iii) For ν = 1 (‘critical’ regime), S ⊂ Z2 , limεց0 β∗ε = J2 , and β∗ε ≤ J2 + O(ε2 ).
To prove Theorem 1.1, we first identify the optimal control for ε > 0 from the associated
Hamilton–Jacobi–Bellman equation (HJB) and establish a rigorous connection of the HJB
studied in [2] with the ergodic control problem. It is not hard to show that the optimal
invariant measures η∗ε concentrate on S as ε ց 0 (see Lemma 3.1). In Theorem 1.1 we
actually identify three regimes, corresponding to different values of ν and characterize the
limiting β∗ε . For ν > 1 one can find a control U under which the invariant measure of
the dynamics (1.1) concentrates on a point in S. Construction of invariant measures with
similar properties is also possible for z ∈ Ss when ν < 1. The important difference is that
for ν < 1 the optimal invariant measure η∗ε cannot concentrate on S \ Ss (see Lemma 3.3).
To show this fact we construct a suitable Lyapunov function for the Morse-Smale dynamics
(see Theorem 2.1). The analysis in the critical regime ν = 1 turns out to be more subtle
than other two regimes. To establish the results for this regime we study the ergodic control
problem for LQG systems (Lemma 3.4). Depending on some moment results (Lemma 3.5)
we scale the space suitably and show that the resulting invariant measures are also tight.
In particular, we examine the asymptotic behavior of η∗ε and show that under an appropriate spatial scaling it ‘converges’ to a Gaussian distribution in the vicinity of the minimal
stochastically stable set.
CONTROLLED EQUILIBRIUM SELECTION
5
Theorem 1.2. Let ν ∈ (0, 2). Suppose that for z ∈ S there exists a sequence εn ց 0, such
that for some open neighborhood N of z whose closure does not contain any other elements
of S it holds that lim inf εn ց0 η∗εn (N ) > 0. Then along this sequence we have
T b −1 ενd ̺ε∗ εν x + z
1
x Σz x
−
−
−
→
,
exp
−
b z |1/2
εց0 (2π)d/2 |det Σ
η∗ε (N )
2
b z denotes the determinant of Σ
bz.
uniformly on compact sets, where det Σ
The following theorem shows that
in the supercritical case, the set S contains only those
points z ∈ Z1 for which E+ Dm(z) is minimal. In particular, if Z1 ∩ Ss 6= ∅, then S ⊂ Z3 .
Theorem 1.3. We assume ν ∈ (1, 2). Define the subset Z1∗ ⊂ Z1 by
Z1∗ := Arg min E+ Dm(z) .
z∈Z1
If z ∈ S \ Z1∗ and N is a neighborhood of z whose closure does not contain other elements
of S, then η∗ε (N ) −−−→ 0 . Moreover, for z ∈ Z1∗ , it holds that
εց0
β∗ε = ℓ(z) + ε2ν−2 E+ Dm(z) + o(ε2ν−2 ) .
(1.6)
Remark 1.2. It is evident from Theorem 1.1 (i) that ν = 2 is a critical
value. While for
ν ∈ (1, 2) and a point z ∈ Z1∗ \ Ss we have lim inf εց0 ε2−2ν β∗ε − ℓ(z) > 0, which means
R
R
that the control effort Rd 12 |v∗ε |2 dη∗ε exceeds Rd ℓ dη∗ε for all ε sufficiently small, the opposite
can occur if ν > 2. A simple example where this happens is the one-dimensional model
with data m(x) = x and ℓ(x) = (x + 1)2 . For this example, direct substitution shows that
the solution of the HJB equation (see (2.7)) is
V ε (x) =
2
1 − b(ε)
x + b(ε) ,
b(ε)
β∗ε = 1 + ε2ν
where
1 − b(ε)
− 2b(ε) 1 − b(ε) ,
b(ε)
−1
p
b(ε) := 2ε2 1 + 2ε2 + 1 + 2ε2
.
A simple calculation shows that if ν ∈ (1, 2) then lim inf εց0 ε2−2ν (β∗ε − 1) > 0, while if
ν > 2, then lim supεց0 ε−2 (β∗ε − 1) < 0. Therefore (1.6) does not hold for this example
when ν > 2.
Theorem 1.1 is the combination of Lemma 3.1, Lemma 3.3, Corollary 3.1, Corollary 3.2
and Theorems 3.1–3.3 presented in Section 3. Theorem 1.2 follows from Lemmas 3.7 and
3.8— see also Theorems 3.5 and 3.6 for related results. The proof of Theorem 1.3 can be
found in Section 3.5.
6
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
1.1. Notation. The following notation is used in this paper. The symbols R, and C denote
the fields of real numbers, and complex numbers, respectively. Also, N denotes the set of
natural numbers. The Euclidean norm on Rd is denoted by | · |, and h· , ·i denotes the inner
product. A ball of radius r > 0 in Rd around a point x is denoted by Br (x), or as Br if
x = 0. For a compact set K, Br (K) denotes the open r-neighborhood of K. For a set
A ⊂ Rd , we use Ā, Ac , and ∂A to denote the closure, the complement, and the boundary of
A, respectively. We write A ⋐ B to indicate that Ā ⊂ B. We define Cbk (Rd ), k ≥ 0, as the
set of functions whose i-th derivatives, i = 1, . . . , k, are continuous and bounded in Rd and
denote by Cck (Rd ) the subset of Cbk (Rd ) with compact support. The space of all probability
measures on a Polish space X with the Prohorov topology is denoted by P(X ). The density
of the d-dimensional Gaussian distribution with mean 0 and covariance matrix Σ is denoted
by ρΣ .
The symbols O(x) and o(x) denote the sets of functions f : Rd → R having the property
lim sup
|x|ց0
|f (x)|
< ∞,
|x|
and
lim sup
|x|ց0
|f (x)|
= 0,
|x|
respectively. Abusing the notation, O(x) and o(x) occasionally denote generic members of
these sets. Also κ1 , κ2 , . . . are generic constants whose definition differs from place to place.
In Section 2 we present a basic property of gradient-like flows (Theorem 2.1), and characterize the optimal control problem via a HJB equation (Theorem 2.2). Section 3 is devoted
to the study of the support of the limit, as ε ց 0, of the optimal stationary probability
distribution η∗ε . Appendix A contains the more technical proofs of Lemma 1.1 and Theorem 2.2.
2. Gradient-like flows and the optimal control problem
Recall the function V̄ defined in Hypothesis 1.1. Since ∇V̄ is Lipschitz, ∆V̄ is bounded
and thus (1.4) implies that for
Lε0 f (x) :=
ε2ν
∆f (x) + hm(x), ∇f (x)i
2
∀ x ∈ Rd ,
f ∈ C 2 (Rd ) ,
we have
Lε0 V̄(x) ≤ γ0 − γ |x|
∀ ε ∈ (0, 1) ,
for some constant γ0 > 0. This is the ‘stochastic Lyapunov condition’ that implies in
particular that the process X with U ≡ 0 has a unique stationary probability distribution
η0ε , and
Z T
Z
1
γ0
lim
|x| η0ε (dx) ≤
|Xt | dt =
E
∀ ε ∈ (0, 1) .
(2.1)
T →∞ T
γ
Rd
0
In view of Theorem 2.2 and Proposition 2.3 of [19], this follows from the ergodic theory
of Markov processes. Since Rℓ is Lipschitz, (2.1) implies that there exists a constant C
independent of ε such that ℓ dη0ε ≤ C < ∞ . Moreover, from [8] there exists a unique
Lipschitz continuous function Z ≥ 0, such that minRd Z = 0, Z(x) → ∞ as |x| → ∞ and
Z ∞
2
1
φ̇(s) + m(φ(s)) ds + Z(xi ) ,
φ(0) = x ,
Z(x) =
inf
φ : φ(t)→xi , xi ∈S 2 0
CONTROLLED EQUILIBRIUM SELECTION
7
and, if ̺ε0 denotes the density of η0ε , then −ε2 ln ̺ε0 (x) → Z(x) uniformly on compact subsets
of Rd as ε ց 0. The function Z is generally referred to as the quasi-potential.
2.1. Gradient-Like Morse–Smale dynamical systems. It is well known from the theory of dynamical systems that if the set of nonwandering points of a flow on a compact
manifold consists of hyperbolic fixed points, then the associated vector field is generically
gradient-like (see Definition 2.1 and Theorem 2.1 below). This is also the case under Hypothesis 1.1, since the ‘point at infinity’ is a source for the flow of m.
The theorem below is well known [20, 25]. What we have added in its statement is the
assertion that the energy function can be chosen in a manner that its Laplacian at critical
points of positive index is negative.
Recall that a function f : Rd → R is called inf-compact if the set {x ∈ Rd : f (x) ≤ C} is
compact (or empty) for every C ∈ R. We start with the following definition.
Definition 2.1. We say that V ∈ C ∞ (Rd ) is an energy function if it is inf-compact, and has
a finite set S = {z1 , . . . , zn } of critical points, which are all nondegenerate. A C ∞ vector
field m on Rd is called gradient-like relative to an energy function V provided that the set
of nonwandering points of its flow is S, that every point in S is a hyperbolic singular point
of m, and
m(x) · ∇V(x) < 0
∀x ∈ Rd \ S .
If m satisfies these properties, we also say that m is adapted to V.
For a hyperbolic singular point x̂ of a vector field m, we let Ws (x̂) and Wu (x̂) denote
the stable and unstable manifolds of the flow. Recall that the index of x̂ is defined as the
dimension of Wu (x̂).
Theorem 2.1. Suppose that a C ∞ vector field m in Rd satisfies (H1)–(H2) and
(a) The set of nonwandering points of its flow is a finite set S = {z1 , . . . , zn } of hyperbolic singular points.
(b) If y and z are singular points of m, then Ws (y) and Wu (z) intersect transversally
(if they intersect).
Then m is adapted to an energy function V which satisfies
(i) V(zi ) 6= V(zj ) for i 6= j.
(ii) ∆V(z) < 0, for all z ∈ S \ Ss where Ss , as defined earlier, denotes the set of singular
points of index 0, i.e., the stable equilibria of the flow of m.
(iii) For each z ∈ S there exists some open neighborhood Nz of z and a constant C0 > 0
such that
−C0−1 |x − z|2 ≤ hm(x), ∇V(x)i ≤ −C0 |x − z|2
∀x ∈ Nz .
(iv) There exists some open neighborhood Q of S and a constant C0′ > 0 such that
−
1
hm(x), ∇V(x)i ≤ |∇V(x)|2 ≤ −C0′ hm(x), ∇V(x)i
C0′
∀x ∈ Q .
8
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Proof. Let x̂ be a critical point of m of index q > 0. Translating the coordinates we may
assume that x̂ ≡ 0. Since m(0) = 0, then m(x) takes the form
m(x) = M x + o(x)
locally around x = 0, where M = Dm(0) is the Jacobian of m at 0. By hypothesis M has
exactly q (d − q) eigenvalues in the open right half (left half) complex space. Therefore
since the corresponding eigenspaces are invariant under M , there exists a linear coordinate
transformation T such that, in the new coordinates x̃ = T (x), the linear map x 7→ M x has
the matrix representation M̃ = T M T −1 and M̃ = diag(M̃1 , −M̃2 ), where M̃1 and M̃2 are
square Hurwitz matrices of dimension d − q and q respectively. By the Lyapunov theorem
there exist positive definite matrices Q̃i , i = 1, 2, satisfying
M̃1T Q̃1 + Q̃1 M̃1 = −Id−q ,
M̃2T Q̃2 + Q̃2 M̃2 = −Iq ,
(2.2)
where Id−q and Iq are the identity matrices of dimension d − q and q, respectively. Let θ > 1
be such that
θ trace T T diag(0, Q̃2 )T > trace T T diag(Q̃1 , 0)T ,
(2.3)
and define V in some neighborhood of 0 by
V(x) := a + xT T T diag(Q̃1 , −θ Q̃2 )T x ,
(2.4)
where a is a constant to be determined later. By (2.3) we obtain ∆V(0) < 0, and thus (ii)
holds.
We have
hm(x), ∇V(x)i = xT M T T T diag(Q̃1 , −θ Q̃2 )T + T T diag(Q̃1 , −θ Q̃2 )T M x + o(|x|2 ) .
Expanding we obtain
T T diag(Q̃1 , −θ Q̃2 )T M = T T diag(Q̃1 , −θ Q̃2 )T T −1 M̃ T
= T T diag(Q̃1 M̃1 , θ Q̃2 M̃2 )T .
By (2.2) we have
hm(x), ∇V(x)i = −xT T T diag(Id−q , θIq )T x + o(|x|2 ) .
Therefore, since θ > 1, we have
−|T x|2 + o(|x|2 ) ≤ hm(x), ∇V(x)i ≤ −θ |T x|2 + o(|x|2 ) ,
thus establishing (iii).
Property (iv) follows by (iii) and (2.4).
As shown in [25] one can select n distinct real numbers ai and define V on S by setting
V(zi ) = ai in a consistent manner: if zi and zj are the α- and ω-limit points of some
trajectory then ai > aj . Thus V can be defined in nonoverlapping neighborhoods of the
singular points by (2.4) so as to satisfy (i). This function can then be extended to Rd by
the construction in [25].
The energy function V can be constructed in a manner so that it agrees, outside some
ball, with the Lyapunov function V̄ in Hypothesis 1.1. This is stated in the following lemma.
CONTROLLED EQUILIBRIUM SELECTION
9
Lemma 2.1. Under the assumptions of Theorem 2.1, the energy function V can be selected
so that V = V̄ on the complement of some ball which contains S.
Proof. Recall the definition of K in Hypothesis 1.1. It is evident that we can find radii
0 < R1 < R2 < R3 < R4 and balls centered at the origin such that S ⊂ BR1 , K ⊂ BR3 , and
the following are satisfied
c1 := sup V < c2 :=
BR1
c3 :=
sup
BR3 \BR2
inf
BR3 \BR2
V,
V,
V < c4 := inf
c
BR
4
V̄ .
c̄2 := sup V̄ < c̄3 := inf
c
BR2
BR
3
Let ψ : R → R be a smooth non-decreasing function such that ψ(t) = t for t ≤ c1 , ψ(t) = c4
for t ≥ c4 , and whose derivative is strictly positive on the interval [c1 , c3 ]. Similarly let
ψ̄ : R → R be a smooth non-decreasing function such that ψ̄(t) = 0 for t ≤ c̄2 − c4 and
ψ̄(t) = t for t ≥ c̄3 − c4 . Let G := ψ ◦ V + ψ̄ ◦ (V̄ − c4 ). By construction G agrees with
c . It can also be easily verified that sup
V on BR1 and with V̄ on BR
BR4 \BR1 m · ∇G < 0.
4
Replacing V with G we obtain a new energy function V such that |∇V| is Lipschitz.
2.2. Existence of an optimal control and the HJB equation. Recall that a stationary Markov control is of the form Ut = v(Xt ) where v : Rd → Rd is a measurable map.
Theorem 2.2 below establishes the existence of a stationary control that is optimal. Before stating this result we mention the occupation measure based formulation that is quite
related to the ergodic control problem considered here [7, 18]. Define, for f ∈ C 2 (Rd ),
ε2ν
∆f (x) + hm(x) + εu, ∇f (x)i .
2
Now consider the following minimization problem
Z
ℓ(x) + 21 |u|2 π(dx, du) ,
inf
Lε [f ](x, u) :=
π∈P(Rd ×Rd )
subject to:
Z
Rd ×Rd
Rd ×Rd
ε
L [f ](x, u) π(dx, du) = 0
∀f ∈
(2.5)
Cc2 (Rd ) .
It is not hard to show that the value of this minimization problem does not exceed the
optimal value β∗ε of the ergodic cost (1.3) under the controlled dynamics (1.1). Using
the lower semi-continuity of the cost it is also easy to show that there exists a minimizer
π∗ε ∈ P(Rd ×Rd ) which attains the optimal value of (2.5). In [18] it is shown that there exists
a càdlàg stationary process (X ∗ , v∗ε (X ∗ )) with distribution π∗ε that satisfies the martingale
problem with respect to the generator Lε . But it is not obvious that one can construct a
weak solution to (1.1) with control v∗ε . We adopt an analytic approach to find an optimal
stationary control and characterize the optimal value β∗ε .
Theorem 2.2. The HJB equation for the ergodic control problem, given by
i
h
ε2ν
ε
ε
2
1
∆V + min hm + εu, ∇V i + ℓ + 2 |u| = β ε ,
2
u∈Rd
(2.6)
10
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
has a solution (V ε , β ε ) in C 2 (Rd ) × R, where V ε is unique in the class of functions in
C 2 (Rd ) satisfying lim|x|→∞ V ε (x) = ∞, V ε (0) = 0, and β ε is uniquely specified as β ε = β∗ε .
Moreover, Ut = v∗ε (Xt ), t ≥ 0, where v∗ε = −ε∇V ε is an optimal control.
Proof. The proof is contained in Appendix A.
Remark 2.1. In view of the smoothness assumptions on the coefficients and the cost function,
standard elliptic regularity theory allows us to improve the regularity of V ε to C k (Rd ), for
any k ≥ 2.
From (2.6) it follows that
ε2
ε2ν
∆V ε + m · ∇V ε − |∇V ε |2 + ℓ = β∗ε .
2
2
(2.7)
3. The support of the limit of the stationary probability distribution
Throughout the rest of the paper η∗ε denotes the stationary probability distribution under
the optimal stationary control v∗ε in Theorem 2.2. We start the analysis with the following
lemma which states that η∗ε concentrates on S as ε ց 0.
Lemma 3.1. The family {η∗ε , ε ∈ (0, 1)} is tight, and any sub-sequential limit as ε ց 0 has
support on S.
Proof. Since ℓ is inf-compact and V ε is bounded below, it follows by (2.6) that the stationary
control v∗ε defined in Theorem 2.2 is stable. Let η∗ε be the invariant probability measure of
the SDE
dXt = m(Xt ) + εv∗ε (Xt ) dt + εν dWt .
Also by Theorem 2.2 we have
Z Rd
Recall that
η0ε
ℓ(x) + 12 |v∗ε (x)|2 η∗ε (dx) = β∗ε .
is the invariant probability measure of (1.1) under the control U ≡ 0. Define
Z
ε
ℓ(x) η0ε (dx) .
β0 :=
Rd
By (2.1) we have
Z
Rd
ℓ(x) η∗ε (dx) ≤ β∗ε ≤ β0ε ≤
γ0
γ
∀ ε ∈ (0, 1) .
(3.1)
Since ℓ is inf-compact, (3.1) implies that {η∗ε , ε ∈ (0, 1)} is tight. Let x(t) be the solution
of (1.2). Therefore, if Cm denotes a Lipschitz constant of m and X0 = x(0), we obtain
Z t
Z t
(3.2)
|Xs − x(s)| ds + ε |v∗ε (Xs )| ds + εν |Wt | .
|Xt − x(t)| ≤ Cm
0
0
Hence applying Gronwall’s inequality we obtain from (3.2) that
Z t
Cm t
ε
ν
|Xs − x(s)| ≤ e
ε |v∗ (Xs )| ds + ε sup |Ws | ,
0
s≤t
s ≤ t.
(3.3)
CONTROLLED EQUILIBRIUM SELECTION
11
In turn, for any δ > 0, (3.3) implies that
Z t
δe−Cm t
δe−Cm t
ε
Px(0) |Xt −x(t)| ≥ δ ≤ Px(0)
|v∗ (Xs )| ds ≥
,
+Px(0) sup |Ws | ≥
2ε
2εν
s≤t
0
for t > 0 . By Jensen’s inequality we have
Z t
Z t
δ2 e−2Cm t
δe−Cm t
ε
2
ε
≤ Px(0)
|v∗ (Xs )| ds ≥
Px(0)
|v∗ (Xs )| ds ≥
2ε
4tε2
0
0
Z
t
4tε2 2Cm t
ε
2
≤
e
E
|v
(X
)|
ds
.
s
x(0)
∗
δ2
0
Therefore for any compact set K ⊂ Rd we have
Z
Z
ε
4t2 ε2 2Cm t
e
|v∗ε (x)|2 η∗ε (dx)
Px |Xt − x(t)| ≥ δ η∗ (dx) ≤
δ2
Rd
K
+ sup Px sup |Ws | ≥
s≤t
x∈K
δ −Cm t
e
. (3.4)
2εν
It is clear that the right hand side of (3.4) tends to 0 as ε ց 0. Suppose that η∗ε → η̄ along
some subsequence as ε ց 0. We claim that for any f ∈ Cb (Rd ),
Z
Z
f (x) η̄(dx) ,
(3.5)
ft (x) η̄(dx) =
Rd
Rd
where ft (x) := f (x(t)), and x(t) is the solution of (1.2) with initial condition x(0) = x.
Since the ω-limit set of the trajectories of (1.2) is supported on S, (3.5) shows that η̄ has
support on S. Next we prove (3.5). It is enough to prove the claim for a bounded Lipschitz
function f . Since η∗ε is an invariant probability measure we have
Z
Z
ε
f (x) η∗ε (dx) ,
Ex [f (Xt )] η∗ (dx) =
Rd
Rd
v∗ε .
where X solves (1.1) with control
Hence to prove (3.5) it is enough to show that
Z
Z
ε
ft (x) η̄(dx)
Ex [f (Xt )] η∗ (dx) −−−→
εց0
Rd
Rd
Rd
for all bounded, Lipschitz functions f . Since ft :
→ R is a bounded continuous function
it suffices to show that, for any compact set K, we have
Z
Ex [f (Xt )] − ft (x) η∗ε (dx) −−−→ 0 .
(3.6)
K
εց0
But (3.6) follows by the Lipschitz property of f and (3.4).
We next consider the three separate cases, i.e., the supercritical, subcritical and critical
regimes. We use the following running example:
Example 3.1. Let m be a vector field in R, of the form m = −∇F for F a ‘double well
4
3
potential’ given as follows: F (x) := x4 − x3 − x2 on [−10, 10], with F suitably extended
so that it is globally Lipschitz and does not have any critical points outside the interval
[−10, 10]. Then ∇F vanishes at exactly three points: −1, 0, 2. Of these, 0 is a local
maximum, hence an unstable equilibrium for the o.d.e. ẋ(t) = m(x(t)), and both −1 and
12
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
2 are local minima, hence stable equilibria thereof. Let ℓ(x) = c|x|2 on [−10, 10] for a
suitable c > 0, modified suitably outside [−10, 10] to render it globally Lipschitz. Note that
5
, F (2) = − 38 . Thus x = 2 is the unique global minimum of F .
F (0) = 0, F (−1) = − 12
3.1. Supercritical regime (ν > 1). Let z ∈ S and consider the control
1
Utε = v(Xt ) := − m(Xt ) + (Xt − z) , t ≥ 0 .
ε
Then X is given by
dXt = −(Xt − z) dt + εν dWt , t ≥ 0 .
(3.7)
Since (3.7) has a unique strong solution and U is adapted to the family of sub-σ-fields
generated by X, the control U satisfies the nonanticipativity condition and is therefore admissible. Moreover, a standard argument using the Lipschitz property of m and Gronwall’s
inequality shows that, for some K > 0, we have
Z t
Z t
K
2
ε 2
|Xs | ds < ∞
∀t ≥ 0,
|Us | ds ≤
E 1+
E
ε
0
0
as desired.
The diffusion in (3.7) has a Gaussian stationary distribution µε with mean z and variance
O(ε2ν ). In particular,
Z
Z
|m(x)|2 + |x − z|2 ε−2 µε (dx)
|v|2 dµε ≤ 2
Rd
Rd
C
≤ 2
ε
Z
Rd
|x − z|2 µε (dx)
≤ O(ε2ν−2 )
(3.8)
for some C > 0, where the second inequality follows from the Lipschitz property of m
combined with the fact that m(z) = 0. It follows by (3.8) that the corresponding ergodic
cost is ℓ(z) + O(ε2ν−2 ) ≈ ℓ(z) for sufficiently small ε. In particular, this in conjunction with
Lemma 3.1 leads to:
Theorem 3.1. For ν > 1, it holds that
lim β∗ε = min {ℓ(z) : z ∈ S} .
εց0
and
β∗ε ≤ min {ℓ(z) : z ∈ S} + O ε2ν−2
∀ε ∈ (0, 1) .
3.1.1. An asymptotically optimal control that uses the energy function. It is worth mentioning here that if z ∈ Ss , then an asymptotically optimal control can be synthesized from an
energy function. Let V be an energy function that attains a unique global minimum at z
and such that |∇V| is Lipschitz and inf-compact. Such a function exists by Theorem 2.1
and Lemma 2.1. Consider the control
1
v ε (x) := − m(x) + ∇V(x) , t ≥ 0 .
ε
Then X is given by
dXt = −∇V(Xt ) dt + εν dWt , t ≥ 0 .
CONTROLLED EQUILIBRIUM SELECTION
Let µε denote its unique stationary probability distribution, and Lεvε :=
controlled extended generator. Since
2
Z
Rd
ε2ν
2 ∆ + ∇V
· ∇ its
ε2ν
k∆Vk∞ − |∇V|2 ,
2
Lεvε V ≤
it follows that
13
|∇V|2 dµε ≤ ε2ν k∆Vk∞ .
Note that µε has density ̺ε (x) = C(ε) e−
fore we have
Z
Z
ε
2 ε
|v (x)| µ (dx) ≤ 2
2V(x)
ε2ν
Rd
Rd
≤ 2
Z
Rd
, where C(ε) is a normalizing constant. There-
|m(x)|2 + |∇V(x)|2 ε−2 µε (dx)
ε−2 |m(x)|2 µε (dx) + ε2ν−2 k∆Vk∞
≤ O(ε2ν−2 ) + ε2ν−2 k∆Vk∞ .
For the last inequality we used the fact that m is bounded, m(z) = 0, and that V is locally
quadratic around z.
In Example 3.1, ℓ attains its minimum over {−1, 0, 2} at 0. Thus in the supercritical
regime we have β∗ε ≈ ℓ(0) = 0.
3.2. Subcritical regime (ν < 1). Recall that Ss is the collection of stable equilibrium
points. The following lemma holds for any ν > 0. It shows that if z ∈ Ss then there exists
a Markov stationary control with asymptotic cost O(εn ) for any n ∈ N, that renders {z}
stochastically stable.
Lemma 3.2. For any ν > 0 and z ∈ Ss there exists a Markov control v ε , for ε ∈ (0, 1),
with the following properties:
(a) With µε denoting the invariant probability measure of (1.1) under the control v ε , it
holds that
Z
1
|v ε (x)|2 µε (dx) = 0
∀n ∈ N.
lim sup n
ε
d
εց0
R
(b) If Σ is the symmetric positive definite solution of the matrix Lyapunov equation
T
Dm(z) Σ + Σ Dm(z)
= −I, and Cℓ a Lipschitz constant for ℓ, then
Z
q
ε
1
lim sup ν
ℓ(x) − ℓ(z) µ (dx) ≤ Cℓ trace Σ .
ε
εց0
Rd
(c) If ν ∈ (2/3, 1), then β∗ε ≤ J3 + O(ε4ν−2 ).
Proof. We let M := Dm(z). In order to simplify the notation, we translate the origin so
that z ≡ 0. Let R be the symmetric positive definite solution to the Lyapunov equation
M T R + RM = −3R2 .
(3.9)
14
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
We define the control v ε by
ε
v (x) :=
( −1
ε M x − m(x)
if |x| ≥ εν/2 ,
0
otherwise.
2ν
controlled extended genLet Lεvε := ε2 ∆ + m(x) + εv ε (x) · ∇ denote the corresponding
xT Rx
2ν
ε
erator. We apply the function F (x) := ε exp ε2ν
to Lvε .
For |x| ≥ εν/2 , using (3.9), we obtain
Lεvε F (x) =
≤
xT Rx
ε2ν trace(R) + |Rx|2 + xT (M T R + RM T )x e ε2ν
xT Rx
ε2ν trace(R) − 2|Rx|2 ) e ε2ν
≤ −|Rx|2 e
For |x| ≤ εν/2 , we have
Lεvε F (x) =
≤
xT Rx
ε2ν
∀ε ≤
2 −1/ν
.
trace(R)R−1 xT Rx
ε2ν trace(R) + |Rx|2 + 2m(x) · Rx e ε2ν
xT Rx
ε2ν trace(R) − 2|Rx|2 + 2|M x − m(x)||Rx| e ε2ν .
(3.10)
(3.11)
However, for some constant κ0 > 0, it holds that |M x − m(x)| ≤ κ0 |x|2 for all sufficiently
small |x|. Therefore, for some ε0 > 0 we have
− 2|Rx|2 + 2|M x − m(x)||Rx| ≤ −|Rx|2
However, since
ε2ν trace(R) − |Rx|2 ≤ 0
for |x| ≤ ε /2 ,
ν
∀ ε ≤ ε0 .
(3.12)
2 p
trace(R) ,
for |x| ≥ εν R−1 it follows by (3.11)–(3.12) that there exist a positive constant κ1 , not depending on ε, such
that
n
ν
∀ ε ≤ ε0 .
(3.13)
sup Lεvε F (x) : |x| ≤ ε /2 ≤ κ1 ε2ν
By (3.10) and (3.13), for some ε′0 > 0, we obtain
Lεvε F (x) ≤ κ1 ε2ν − |Rx|2 e
By (3.14) we obtain
Z
−2 exp ε−ν R−1 |x| ≥ εν/2
which implies part (a).
Consider the ‘scaled’ diffusion
|x|2 µε (dx) ≤
xT Rx
ε2ν
Z
∀ ε ≤ ε′0 .
|x| ≥ εν/2
|x|2 e
≤ R−1 κ1 ε2ν
dX̂t = b̂ε (X̂t ) dt + dWt ,
t ≥ 0,
xT Rx
ε2ν
(3.14)
µε (dx)
∀ ε ≤ ε′0 ,
(3.15)
CONTROLLED EQUILIBRIUM SELECTION
15
where
m(εν x) + ε v ε (εν x)
.
εν
ε
ε
and let µ̂ denote its invariant probability measure. It ̺ and ̺ˆε denote the densities of
µε and µ̂ε respectively, then ενd ̺ε (εν x) = ̺ˆε (x) for all x ∈ Rd . The (discontinuous) drift
b̂ε converges uniformly to M x as ε ց 0. Therefore µ̂ε converges to
R the Gaussian density
ρΣ as ε ց 0, uniformly on compact sets. Moreover, supε∈(0,1) Rd |x|4 µ̂ε (dx) < ∞ by
(3.15). Therefore, by uniform integrability, recalling that z ≡ 0, and the Cauchy-Schwartz
inequality we obtain we obtain
Z
1/2
Z
ε
|x|2 ε
1
ℓ(x) − ℓ(0) µ (dx) ≤ Cℓ lim
µ (dx)
lim
2ν
εց0
εց0 εν
Rd ε
Rd
q
≤ Cℓ trace Σ .
For part (c) apply the control v ε (x) = ε−1 M x − m(x) for x ∈ Rd . Using the bound
|M x − m(x)| ≤ κ1 |x|2 for some constant κ1 , we obtain
Z
Z
ε 2
ε
κ21 ε−2 |x|4 µε (dx) ∈ O(ε4ν−2 ) .
|v | dµ ≤
b̂ε :=
Rd
Rd
Also, using the Taylor series expansion of ℓ, and the fact that µε is Gaussian with mean 0
and covariance matrix ∝ ε2ν I, we have
Z
ℓ(x) − ℓ(0) µε (dx) ∈ O(ε2ν ) .
(3.16)
Rd
This completes the proof.
From Lemma 3.2 we see that we can always find a stable admissible control such that
the corresponding invariant probability measure concentrates on a stable equilibrium point.
Now we proceed to show that for ν < 1, η∗ε does not concentrate on S \ Ss .
→ 0.
Lemma 3.3. Suppose ν < 1. Then η∗ε (S \ Ss ) −−−
εց0
Proof. We argue by contradiction. If the statement of the theorem is not true then there
exists x̂ ∈ S \ Ss and a decreasing sequence {εn }, with εn ց 0, such that
lim inf η∗εn Br (x̂) > 0
(3.17)
n→∞
for all balls Br (x̂) of radius r > 0 centered at x̂. In the sequel all limits as ε ց 0 are meant
to be along this sequence {εn }. Without loss of generality we may assume that V satisfies
(i)–(iv) in Theorem 2.1.
Let δ > 0 be such that the interval (V(x̂) − 3δ, V(x̂) + 3δ) contains no other critical values
of V other than V(x̂) (such a δ exists by (i) of Theorem 2.1). Let ϕ : R → R be a smooth
function such that
(a) ϕ(V(x̂) + y) = y for y ∈ (V(x̂) − δ, V(x̂) + δ) ;
(b) ϕ′ ∈ [0, 1] on (V(x̂) − 2δ, V(x̂) + 2δ) ;
(c) ϕ′ ≡ 0 on (V(x̂) − 2δ, V(x̂) + 2δ)c .
16
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
By Theorem 2.1 (ii) there exists r > 0 such that supx∈Br (x̂) ∆V(x) < 0. We may also
choose this r small enough so that
Br (x̂) ⊂ support ϕ′ (V(·)) ⊂ Brc (S \ {x̂}) .
Thus
Z
Rd
′
ϕ (V)∆V
dη∗ε
Z
=
′
ϕ (V)∆V
Br (x̂)
dη∗ε
+
Z
Brc (S)
ϕ′ (V)∆V dη∗ε .
(3.18)
Since V is inf-compact, it follows that ϕ ◦ V is constant outside a compact set. Therefore
the support of ϕ′ (V(·)) is compact, and as a result ∆V is bounded on this set. By (3.18),
since η∗ε Brc (S) ց 0 as ε ց 0, there exists ε0 > 0 such that
Z
ϕ′ (V)∆V dη∗ε < 0
∀ ε < ε0 .
(3.19)
Rd
By the infinitesimal characterization of an invariant probability measure we have
Z
Lε [ϕ ◦ V] x, v∗ε (x) η∗ε (dx) = 0 ,
Rd
which we write as
Z
Z
ε2ν
ϕ′ (V) (v∗ε · ∇V) dη∗ε
ϕ′ (V)∆V dη∗ε + ε
2 Rd
d
R
Z
Z
ε2ν
′′
2
ε
ϕ′ (V)hm, ∇Vi dη∗ε . (3.20)
+
ϕ (V) |∇V| dη∗ = −
2 Rd
d
R
By the Cauchy–Schwarz inequality we have
Z
1/2
1/2 Z
Z
′
ε 2
ε
′
2
ε
′
ε
ε
ϕ (V) |v∗ | dη∗
ϕ (V) |∇V| dη∗
ϕ (V)(v∗ · ∇V) dη∗ ≤
Rd
Rd
Rd
≤
Let
ζ
ε
p
:=
2β∗ε sup
Rd
Z
Rd
′
p
ϕ′ (V)
Z
Rd
2
ϕ (V) |∇V|
dη∗ε
ϕ′ (V) |∇V|2 dη∗ε
1/2
1/2
.
.
By (3.19)–(3.21) and Theorem 2.1 (iv) we obtain
Z
p
p
ε2ν
1
ε
ε 2
′
ε
(ζ ) − ε 2β∗ sup ϕ (V) ζ −
ϕ′′ (V) |∇V|2 dη∗ε ≤ 0
C0′
2 Rd
Rd
(3.21)
(3.22)
∀ ε < ε0 .
(3.23)
Using the quadratic formula on (3.23) we obtain ζ ε ∈ O(εν ). On the other hand, since
ϕ′′ (V) = 0 on some open neighborhood of S, it follows that
Z
ϕ′′ (V) |∇V|2 dη∗ε −−−→ 0 .
Rd
εց0
Therefore, since by (3.20)–(3.21) we have
Z
Z
p
p
ε2ν
ε2ν
′
ε
ε
ε
′
ϕ (V)∆V dη∗ + ε 2β∗ sup ϕ (V) ζ +
ϕ′′ (V) |∇V|2 dη∗ε ≥ 0 ,
2 Rd
2
d
d
R
R
CONTROLLED EQUILIBRIUM SELECTION
it follows that
lim inf
εց0
Z
Rd
17
ϕ′ (V)∆V dη∗ε ≥ 0 .
However, this contradicts (3.17), since ∆V(x̂) < 0 .
Lemmas 3.2 and 3.3 imply the following:
Theorem 3.2. If ν < 1, then
(
min {ℓ(z) : z ∈ Ss } + O(εν )
ε
β∗ ≤
min {ℓ(z) : z ∈ Ss } + O(ε4ν−2 )
if ν ∈ (0, 32 ) ,
if ν ∈ ( 23 , 1) .
In Example 3.1, this leads to β∗ε ≈ ℓ(−1) = c.
3.3. Critical regime (ν = 1). Recall that a square matrix is called Hurwitz if its eigenvalues lie in the open left half complex plane. We need the following definition.
Definition 3.1. Let A denote the collection of all d × d Hurwitz matrices. For M ∈ Rd×d
define
1
Λ(M ) := min trace (A − M ) ΣA (A − M )T ,
(3.24)
A∈A 2
where ΣA is the (unique) symmetric solution of the Lyapunov equation
A ΣA + ΣA AT = −I .
(3.25)
Suppose z ∈ S, and without loss of generality assume that z = 0. Consider a family of
ε of the form
controls vA
Ax − m(x)
ε
vA
(x) :=
,
ε
ε of the diffusion
where A ∈ Rd×d is Hurwitz. The stationary probability distribution ηA
dXt = AXt dt + ε dWt
is Gaussian, with zero mean, and covariance matrix ε−2 ΣA , with ΣA the solution of (3.25).
Let M = Dm(0). Then for some constants κ1 = κ1 (A) and κ2 we have
|Ax − m(x)|2 = |(A − M )x|2 + ((A − M )x, M x − m(x) + |M x − m(x)|2 .
Using the Taylor series for M x − m(x) we write M x − m(x) = f (x) + g(x), where f is an
even quadratic function, and |g(x)| ≤ κ|x|3 . Using also the Taylor series for ℓ as in (3.16),
it follows that there exists a constant κ3 such that
Z ε
(dx) ≤ ℓ(0) + Λ(M ) + ε2 κ3 .
(3.26)
ℓ(x) + 12 |v∗ε (x)|2 ηA
β∗ε ≤
Rd
Therefore, by (3.26) we obtain
lim sup β∗ε ≤ min ℓ(y) + Λ Dm(y) + O(ε2 ) .
εց0
y∈S
(3.27)
This establishes the second part of Theorem 1.1 (iii).
Remark 3.1. If y ∈ Ss then Dm(y) is Hurwitz, and by choosing A = Dm(y) in (3.24), it
follows that Λ(Dm(y)) = 0.
18
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
It is worthwhile at this point to present the following one-dimensional example, which
shows how the limiting value of β∗ε bifurcates as we cross the critical regime.
Example 3.2. Let d = 1, m(x) = M x, and ℓ(x) = 12 Lx2 , with M > 0 and L > 0. Then
the solution to (2.7) is:
√
M + M 2 + Lε2 2
ε
x ,
V =
2ε2
p
ε2ν−2
β∗ε =
M + M 2 + Lε2 .
2
ε
ε
Note that β∗ → ℓ(0) = 0, β∗ → M , and β∗ε → ∞, as ε ց 0, when ν > 1, ν = 1, and ν < 1,
respectively.
Recall from Theorem 1.1 that E+ (M ) denotes the sum of eigenvalues of a matrix M that
lie in the open right half complex plane. The following result is certainly not new, but since
we could not locate it in this form in the literature, a proof is included.
Lemma 3.4. Suppose that M ∈ Rd×d has no eigenvalues on the imaginary axis, and let Q
be the (unique) positive semidefinite symmetric solution of the matrix Riccati equation
M T Q + QM = Q2 ,
(3.28)
(M − Q)Σ + Σ(M − Q)T = −I ,
(3.29)
satisfying
for some symmetric positive definite matrix Σ. Let U∗SM denote the class of stationary
Markov controls v with the following properties:
(i) v : Rd → Rd is continuous and has at most linear growth.
(ii) The diffusion dXt = M Xt − v(Xt ) dt + dWt is positive recurrent and its invariant
probability distribution ηv has finite second moments.
Let Λ(M ) be as defined in (3.24). Then we have the following properties:
(a) It holds that
Z
2
1
(3.30)
inf∗
2 |v(x)| ηv (dx) = E+ (M ) = Λ(M ) .
v∈USM
Rd
(b) The Markov control v(x) = −Qx is the unique control in U∗SM which attains this
infimum and it holds that Λ(M ) = 21 trace(Q).
Proof. It is well known that there exists at most one symmetric matrix Q satisfying (3.28)(3.29) [12, Theorem 3, p. 150]. For κ > 0, consider the ergodic control problem of minimizing
Z T 1
1
κ |Xs |2 + |Us |2 ds
JM (U ) := lim sup
(3.31)
E
2
T →∞ T
0
over U∗SM , subject to the linear controlled diffusion
Z t
M Xs + Us ds + Wt ,
Xt = X0 +
0
t ≥ 0.
(3.32)
CONTROLLED EQUILIBRIUM SELECTION
19
As is well known, an optimal stationary Markov control for this problem takes the form
Ut = −Qκ Xt , where Qκ is the unique positive definite symmetric solution to the matrix
Riccati equation (see [12, Theorem 1, p. 147])
Ψκ (x)
1 T
2 (x Qκ x)
Q2κ − M T Qκ − Qκ M = 2 κI .
is a solution of the associated HJB equation
i
h
1
1
∆Ψκ (x) + min M x + u · ∇Ψκ (x) + 12 |u|2 + κ |x|2 = trace(Qκ ) .
d
2
2
u∈R
Moreover,
=
(3.33)
(3.34)
The HJB equation (3.34) characterizes the optimal cost, i.e.,
inf∗
U ∈USM
JM (U ) =
1
trace(Qκ ) .
2
Since the stationary probability distribution of (3.32) under the control Ut = −Qκ Xt is
Gaussian, it follows by (3.31) that the matrix Aκ := M − Qκ minimizes
1
Jκ (A) := κ trace ΣA + trace (A − M ) ΣA (A − M )T
2
over all Hurwitz matrices A, where ΣA is as in (3.25) (compare with (3.24)). Combining
this with (3.34) we have
1
inf Jκ (A) = trace(Qκ ) .
A∈A
2
1
It follows that Λ(M ) ≤ 2 trace(Qκ ) for all κ > 0. Note also that κ 7→ 12 trace(Qκ ) is
decreasing. It is straightforward to show that
Λ(M ) = lim
κց0
1
trace(Qκ ) .
2
(3.35)
It is also well known that Qκ′ − Qκ is nonnegative definite if κ′ ≥ κ. Therefore Qκ has a
unique limit Q as κ ց 0 which is nonnegative definite and satisfies (3.28).
Next we prove that M − Q is Hurwitz. Let T be a coordinate transformation such
that M̃ := T M T −1 = diag(M̃1 , −M̃2 ), where M̃1 and M̃2 are square Hurwitz matrices of
dimension d − q and q respectively. We may select T = [T1 , T2 ]T with T1 ∈ R(d−q)×d and
T2 ∈ Rq×d so that the columns of each submatrix Ti are orthonormal. By the orthonormality
of the columns of T1 and T2 we have
!
T
I
T
T
1
d−q
2
TTT =
,
(3.36)
Iq
T2 T1T
where Id−q and Iq are the identity matrices of dimension d − q and q, respectively. Let
Q̃ := (T T )−1 QT −1 . Then (3.28) takes the form
M̃ T Q̃ + Q̃M̃ = Q̃T T T Q̃ .
We write Q̃ in the block form
Q̃ =
Q̃1
R
RT Q̃2
!
(3.37)
20
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
with Q̃1 ∈ R(d−q)×(d−q) , R ∈ R(d−q)×d and Q̃2 ∈ Rq×q . Block multiplication in (3.37) results
in
!
Q̃
1
M̃1T Q̃1 + Q̃1 M̃1 = Q̃1 R T T T
,
(3.38)
RT
and since M̃1 is Hurwitz, Q̃1 is positive semidefinite, and also the right-hand side is positive
semidefinite, we must have Q̃1 = 0. However this means that the right hand side of (3.38)
equals RRT , which implies that R = 0. Hence by (3.37) we obtain
M̃2T Q̃2 + Q̃2 M̃2 = −Q̃22 .
(3.39)
We claim that Q̃2 is positive definite. Indeed, since by (3.39) the nullspace of Q̃2 is
invariant under M̃2 , if Q̃2 is not invertible, then this implies that for some y ∈ Cq , with
y 6= 0, we have Q̃2 y = 0, and M̃2 y = λy for some λ ∈ C.
Since M̃2 is Hurwitz, λ must have
negative real part. If we define ỹ := T −1 y0 , with y0 ∈ Cd , then we have
−1
T
−1 0
(M − Q)ỹ = (T M̃ T − T Q̃T )T
y
= −λỹ .
(3.40)
However, M − Qκ is Hurwitz for all κ > 0 by (3.33), and thus, by continuity, M − Q cannot
have an eigenvalue with positive real part. This contradicts (3.40) and proves the claim.
If M −Q is not Hurwitz, then for some nonzero vector x ∈ Cd , and ρ ∈ C with nonnegative
real part, we must have (M − Q)x = ρx. The identity (M − Q)T Q + Q(M − Q) = −Q2
and the fact that Q is nonnegative definite imply that Qx = 0, and therefore M x = ρx.
Moreover, Qx = 0 implies that Q̃T x = 0, and since Q̃2 is invertible we must have T2T x = 0.
Hence T2T M x = 0, and we obtain M̃1 T1T x = T1T M x = ρT1T x, which contradicts the fact
that M̃1 is Hurwitz. Therefore M − Q must be Hurwitz. It follows that Q satisfies (3.29)
for some symmetric positive definite matrix Σ. Parenthetically, we mention that
−1
Z ∞
M̃2 t M̃2T t
.
dt
e e
Q̃2 =
0
It remains to calculate the trace of Q. By (3.36) we obtain
trace(Q) = trace T T diag(0, Q̃2 )T
= trace diag(0, Q̃2 )T T T
= trace(Q̃2 ) .
(3.41)
On the other hand, by (3.39) we have
trace(Q̃2 ) = − trace(M̃2T + M̃2 )
= −2 trace(M̃2 ) .
Thus by (3.35) and (3.41)–(3.42) we have shown that E+ (M ) = Λ(M ) =
that v(x) = −Qx attains the infimum over all linear controls.
(3.42)
1
2
trace(Q) and
CONTROLLED EQUILIBRIUM SELECTION
21
Now let v̂ ∈ U∗SM be any control. Since v̂ is suboptimal for the problem associated with
(3.34), then from (3.34) we obtain
1
1
∆Ψκ (x) + M x + v̂(x) · ∇Ψκ (x) + 12 |v̂(x)|2 + κ |x|2 = trace(Qκ ) + fκ (x) , (3.43)
2
2
where fκ (x) = 12 |Qk x − v̂(x)|2 . Applying Itô’s formula to (3.43), and using the fact that ηv̂
has finite second moments and Ψκ is quadratic, a standard argument gives
Z 1
κ |x|2 − fκ (x) + 12 |v̂(x)|2 ηv̂ (dx) = trace(Qκ ) .
(3.44)
2
Rd
R
1
2 η (dx) < ∞, and lim
Since
v̂
κց0 2 trace(Qκ ) = Λ(M ), it follows by (3.44) that
R 1 Rd |x|
2
Rd 2 |v̂(x)| ηv̂ (dx) ≥ Λ(M ). Hence (3.30) holds. Suppose v̂ is optimal. Then taking limits
in (3.44) we must have
Z
fκ (x) ηv̂ (dx) = 0 .
lim
κց0
Rd
Since fκ is nonnegative and locally equi-continuous, and ηv̂ has density, by Fatou’s lemma
we deduce that fκ → 0 as κ ց 0, uniformly on compacta. Therefore, v̂(x) = −Qx for all
x ∈ Rd .
We need the following lemma, which is valid for all ν.
Lemma 3.5. For any bounded domain G, such that S ⊂ G, there exists a constant κ̂0 =
κ̂0 (G, ν) such that
2
Z
dist(x, S)
η∗ε (dx) ≤ κ̂0
∀ν > 0,
sup
ε2(ν∧2)
ε∈(0,1) G
where dist(x, S) denotes the Euclidean distance of x from the set S.
Proof. Let V be an energy function as in Theorem 2.1 which agrees with the Lyapunov
function V̄ in Hypothesis 1.1 on the complement of some ball containing S (see Lemma 2.1).
We fix some bounded domain G which contains S, and choose some number δ such that
δ ≥ supx∈G V(x). Let ϕ : R → R be a smooth function such that
(a) ϕ(y) = y for y ∈ (−∞, δ) ;
(b) ϕ′ ∈ (0, 1) on (δ, 2δ) ;
(c) ϕ′ ≡ 0 on [2δ, ∞) .
If ν ≤ 1, then the steps in the proof of Lemma 3.3 show that all the terms on the left
hand side of (3.20) are of O(ε2ν ). Therefore, there exists a constant C0 > 0 such that
Z
hm, ∇Vi ε
sup
dη∗ ≥ −C0 .
ε2ν
ε∈(0,1) {x : V(x) ≤ δ}
Thus, by Theorem 2.1 (iii) we can find a constant κ̂0 such that
2
Z
dist(x, S)
η∗ε (dx) ≤ κ̂0 .
sup
ε2ν
ε∈(0,1) {x : V(x) ≤ δ}
22
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Next we turn to the case ν > 1. Define
Z
ϕ′ (V) |v∗ε |2 dη∗ε .
Gε :=
Rd
By (3.21)–(3.22) we obtain
Z
′
(V)(v∗ε
· ∇V) dη∗ε ≤ ζε
p
Gε .
(3.45)
By (3.45) and Theorem 2.1 (iv), we have (compare with (3.23))
p
1
ε 2
(ζ
)
−
ε
Gε ζ ε + O(ε2ν ) ≤ 0 .
C0′
(3.46)
Rd
ϕ
Moreover, by Theorem 2.1 (iii) and Hypothesis 1.1, for some constant κ1 > 0, it holds that
Z
2
dist(x, S) η∗ε (dx) ≤ κ1 (ζ ε )2 .
(3.47)
{x : V(x) ≤ δ}
Let G be a bounded open set containing S such that ℓ(x) > min {ℓ(z) : z ∈ S} for all
x ∈ Gc . Since ℓ is Lipschitz, we have ℓ(x) − min {ℓ(z) : z ∈ S} ≥ Cℓ dist(x, S) in G. Using
the Cauchy–Schwartz inequality, we deduce from (3.47) that
Z
(3.48)
ℓ dη∗ε − min {ℓ(z) : z ∈ S} ≥ −κ2 ζ ε
Rd
for some constant κ2 > 0. Thus by (3.48) and Theorem 3.1 we have
−κ2 ζ ε ≤ β∗ε − min {ℓ(z) : z ∈ S} ≤ O(ε2ν−2 )
which implies that
Gε ≤ κ3 ζ ε ∨ ε2ν−2
(3.49)
for some constant
the quadratic formula to (3.46) and using (3.49) we
√ κ3 > ν0. Applying
2ν−2
obtain Gε ∈ O ε Gε ∨ ε ∨ ε
, which implies that
p
ν
ε Gε ∈ O ε2 ∨ ε1+ /2 ∨ εν .
√
√
Therefore ε Gε ∈ O(εν ) when ν ∈ (1, 2), and ε Gε ∈ O(ε2 ) when ν ≥ 2. The result then
follows by applying the quadratic formula to (3.46) and using (3.47).
Note that above proof gives us ζ ε ∈ O(ε2ν−2 ) for ν ∈ [1, 2]. In the proof of Lemma 3.5
we have established the following useful fact.
R
2ν−2
ε
ε 2
Corollary
3.1.
For
ν
∈
[1,
2)
we
have
Rd |v∗ | dη∗ ∈ O(ε ), whereas if ν ≥ 2 then
R
R
ε
2ν−2 ). In particular,
ε
2
ε
2
Rd ℓdη∗ − min{ℓ(z) : z ∈ S} ∈ O(ε
Rd |v∗ | dη∗ ∈ O(ε ). Moreover,
using Theorem 3.1 we obtain that
(
min{ℓ(z) : z ∈ S} + O(ε2ν−2 ) for ν ∈ (1, 2) ,
β∗ε =
min{ℓ(z) : z ∈ S} + O(ε2 )
for ν ≥ 2 .
We need an estimate on the growth of ∇V ε . First a definition.
CONTROLLED EQUILIBRIUM SELECTION
23
Definition 3.2. For the rest of the paper {Bz : z ∈ S} is some collection of nonempty,
disjoint balls, with each Bz centered around z, and we define BS := ∪z∈S Bz . We let
r : [0, 1] → [0, 1] be a continuous increasing function with r(0) = 0 and such that
r̆(ε) :=
For z ∈ S, we define
r(ε)
→∞
εν
as ε ց 0 .
Vbzε (x) := V ε (εν x + z) ,
and analogously, the ‘scaled’ vector field and penalty by
m
b εz (x) :=
m(εν x + z)
,
εν
ℓbεz (x) := ℓ(εν x + z) .
We define the ‘scaled’ density ̺ˆεz (x) := ενd ̺ε∗ (εν x + z), and denote by η̂zε the corresponding
probability measure in Rd . We also define

 ε ̺ˆεz (x)
if εν x + z ∈ Br̆(ε) (z) ,
η∗ (Br(ε) (z))
ε
̺˘z (x) :=
0
otherwise,
˚
̺εz (x)
:=


̺ˆεz (x)
η∗ε (Bz )
0
if εν x + z ∈ Bz ,
otherwise.
and η̆zε (dx) = ̺˘εz (x) dx, η̊zε (dx) = ˚
̺εz (x) dx.
Lemma 3.6. Assume ν ∈ (0, 2], and let Ṽ ε := ε2 V ε . There exists a constant c0 such that
|∇Ṽ ε (x)| ≤ c0 (1 + |x|)
∀ ε ∈ (0, 1) ,
∀ x ∈ Rd .
(3.50)
Also, the same applies to V̆zε := ε2(1−ν) Vbzε , with Vbzε as in Definition 3.2.
Proof. The function f ε := ε2−2ν V ε satisfies
ε2ν
ε2ν
∆f ε + m · ∇f ε −
|∇f ε |2 = ε2−2ν β∗ε − ℓ .
2
2
(3.51)
Thus for ν ∈ (0, 1], the result follows by applying [19, Lemma 5.1] to (3.51) and using the
assumptions on the growth of m and ℓ.
We next turn to the case ν ∈ (1, 2]. Let z ∈ Z1 (see Theorem 1.1). The function V̆zε
satisfies
1
1
∆V̆zε (x) + m
b εz (x) · ∇V̆zε (x) − |∇V̆zε (x)|2 = ε2(1−ν) β∗ε − ℓbεz (x) .
(3.52)
2
2
Since ℓ is Lipschitz, the map x 7→ ε2(1−ν) ℓbεz (x) − ℓ(z) is bounded on compact subsets of
Rd , uniformly in ε ∈ (0, 1). By Theorem 1.1 (i), which is established in Corollary 3.1, the
constants ε2(1−ν) β∗ε − ℓ(z) are uniformly bounded in ε ∈ (0, 1). Applying [19, Lemma 5.1]
24
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
to (3.52) it follows that V̆zε satisfies (3.50). Therefore, we have
∇x V ε (x + z) = ε−ν ∇y Vbzε (y)y=ε−ν x
=
=
and the proof is complete.
ε−ν
ε2(1−ν)
c0 1 + |ε−ν x|
c0 ν
ε + |x| ,
2
ε
We continue with a version of Lemma 3.5 for unbounded domains.
Proposition 3.1. Let ν ∈ (0, 2]. Then for any k ∈ N and r > 0, there exist constants
ε0 > 0 and κ̂1 such that
2k
Z
dist(x, S)
η∗ε (dx) ≤ κ̂1 .
sup
2∧2ν
ε
c
ε∈(0,ε0 ) Br (S)
Proof. Let Ṽ ε := ε2 V ε . By Lemma 3.6 the function Ṽ ε = ε2 V ε is locally bounded, uniformly
ε
in ε > 0. Applying the function V 2k eṼ to the operator
ε2ν
Lε∗ :=
∆ + (m + ε2 ∇V ε ) · ∇ ,
2
and using the identity Lε∗ Ṽ ε = ε2 β∗ε − ℓ − 12 |∇Ṽ ε |2 , after some calculations we obtain
2k Ṽ ε 1 − ε2ν 2k ∇V 2
ε
ε
2k Ṽ ε 2
ε
2ν ∆V
−
∇Ṽ +
L∗ V e
= V e
ε β∗ − ℓ + kε
V
2 1 − ε2ν V m · ∇V
|∇V|2
2k
2ν
+ 2k
+ k (2k − 1)ε +
.
(3.53)
V
1 − ε2ν
V2
By Hypothesis 1.1 and parts (iii) and (iv) of Theorem 2.1, we can add a positive constant
to V so that
|∇V|2
2k
m · ∇V
m · ∇V
2ν
+ (2k − 1)ε +
≤
on Rd ,
(3.54)
2
2ν
2
V
1−ε
V
V
for all sufficiently small ε. Let κ0 be a bound of β∗ε + k |∆V|
V , and define
1 − ε2ν 2k ∇V 2
ε
G0 := ℓ +
∇Ṽ +
.
2ε2 1 − ε2ν V Using (3.54), we write (3.53) as
2k Ṽ ε m(x) · ∇V(x)
1
ε
2k
Ṽ ε (x)
2−2∧2ν
L V e (x) ≤ V (x) e
.
κ0 − ε
G0 (x) + k 2∧2ν
ε2∧2ν ∗
ε
V(x)
(3.55)
Let F ε (x) denote the right-hand side of (3.55). Since ℓ is inf-compact, there exists r0 > 0
such that F ε < 0 on Brc0 . On the other hand, since m(x) · ∇V(x) is approximately quadratic
near S and negative on S c , it follows that there exists κ > 0, and ε0 > 0, such that
{x : F ε (x) > 0} ⊂ Bρ(ε) (S)
∀ ε < ε0 ,
CONTROLLED EQUILIBRIUM SELECTION
25
ε
where ρ(ε) := κε1∧ν . By Lemma 3.6, κ0 V 2k (x) eṼ (x) is locally bounded, uniformly in ε > 0.
Let κ1 be such a bound on Bρ(ε) (S). Then
2k Ṽ ε 1
m(x) · ∇V(x)
ε
2k
Ṽ ε (x)
c
L V e (x) ≤ κ1 − V (x) e
IBρ(ε)
G0 (x) − k 2∧2ν
(S) (x) (3.56)
ε2∧2ν ∗
ε
V(x)
for all x ∈ Rd . As a result of the strong maximum principle, V ε attains its infimum over
Rd in the set {x ∈ Rd : ℓ(x) ≤ β∗ε }. Therefore, Ṽ ε is bounded below in Rd by Lemma 3.6.
Thus, from (3.56) we obtain
Z
m(x) · ∇V(x)
κ2
∀ ε ≤ ε0 .
(3.57)
V 2k−1 (x) η∗ε (dx) ≤
2∧2ν
c
ε
k inf Rd eṼ ε
B
(S)
ρ(ε)
for some constant κ2 . Since by Hypothesis 1.1, V has strict quadratic growth and |m · ∇V|
has strict linear growth, then (3.57) implies that for some constant κ3 , we must have
Z
4k−1 ε
1
η∗ (dx) ≤ κ3
∀ε ≤ ε0 .
dist(x,
S)
2∧2ν
B c (S) ε
ρ(ε)
This finishes the proof.
As a corollary to the above result we obtain the following.
Corollary 3.2. For ν ∈ (0, 1), we have β∗ε − J3 ≥ O εν∧(2−2ν) .
Proof. Recall from Theorem 1.1 that J3 = min{ℓ(z) : z ∈ Ss }. Consider z ∈ S \ Ss ,
and let ϕ, Br (z), and ζ ε , be as in Lemma 3.3. Recall from the proof of Lemma 3.3 that
supBr (z) ∆V < 0. Thus, using Theorem 2.1 (iv) and Young’s inequality, we deduce from
(3.20) that
Z
Z
κ1 ε 2
1
′
ε
2−2ν
ϕ′ (V) |v∗ε |2 dη∗ε + κ3 (ζ ε )2
ϕ (V)∆V dη∗ + 2ν (ζ ) ≤ κ2 ε
−
2 Br (z)
ε
d
R
for some positive constants κi , i = 1, 2, 3 . It follows that for some r0 ≤ r we have
η∗ε (Br0 (z)) ∈ O(ε2−2ν ). Therefore there exists a neighborhood B of S \ Ss such that
η∗ε (B) ∈ O(ε2−2ν ). Let K ⊂ Rd be a compact set satisfying ℓ(x) − J3 ≥ 0 for x ∈
/ K,
and define G1 := ∪z∈S Br0 (z) \ B and G2 := K \ (G1 ∪ B). We can choose r0 and B in
such a way that {Br0 (z) : z ∈ S} are disjoint and B ∩ Ss = ∅. By Proposition 3.1 we have
η∗ε (G2 ) ∈ O(ε2ν ). Therefore
Z
ℓ dη∗ε − min{ℓ(z) : z ∈ Ss }
β∗ε − J3 ≥
Rd
≥ −κ4
η∗ε (B) +
Z
G1
dist(x, S) dη∗ε
≥ −O(ε2−2ν ) − O(εν ) − O(ε2ν )
≥ −O εν∧(2−2ν) ,
for some constant κ4 , where we use Lemma 3.5 in the third inequality.
We proceed to prove the converse inequality to (3.27).
+ η∗ε (G2 )
26
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Lemma 3.7. Assume ν = 1, and let {Bz : z ∈ S} and r(ε) be as in Definition 3.2. Suppose
that for some z ∈ S and some sequence εn ց 0 it holds that η∗εn (Bz ) → ξz > 0. All limits
are assumed along this sequence. Normalize V ε so that V ε (z) = 0, and let M ≡ Dm(z).
Let Vbzε be as in Definition 3.2. Then
(a) The sequence Vbzεn converges to some function V ∈ C 2 (Rd ), along a subsequence of
εn ց 0 (also denoted as {εn }), and V satisfies
h
i
1
(3.58)
∆V (x) + min M x − u · ∇V (x) + 21 |u|2 + ℓ(z) = β̄ .
2
u∈Rd
(b) It holds that β̄ ≥ ℓ(z) + E+ Dm(z) .
(c) The diffusion
dXt = M Xt − ∇V (Xt ) dt + dWt
(3.59)
is positive recurrent, and its invariant probability measure η̄ has finite second moments.
(d) The densities ̺˘εz and ˚
̺εz in Definition 3.2 converge to the density ̺¯ of η̄, uniformly
on compact sets.
(e) It holds that
Z
lim inf
εn ց0
Br(εn ) (z)
ℓ(x) + 12 |v∗εn (x)|2 η∗εn (dx) ≥ ξz ℓ(z) + E+ Dm(z) .
Proof. By (2.7) we obtain
1
1 bε
∆ Vz + m
b εz · ∇Vbzε − |∇Vbzε |2 + ℓbεz = β∗ε .
2
2
By [19, Lemma 5.1] (see also Lemma 3.6) and the assumptions on m and ℓ, there exists a
constant c0 such that
|∇Vbzε (x)| ≤ c0 (1 + |x|)
∀ ε ∈ (0, 1) , ∀ x ∈ Rd .
(3.60)
It then follows that Vbzε is locally bounded in C 2,α (Rd ) for any α ∈ (0, 1).
Taking limits in (2.6) along εn ց 0 we obtain a function V ∈ C 2 (Rd ) and a constant β̄
which satisfy (3.58). This proves part (a).
In order to show that the diffusion in (3.59) is positive recurrent, consider the diffusion
dXt = m
b εz (Xt ) − ∇Vbzε (Xt ) dt + dWt ,
(3.61)
Recall from Definition 3.2 that η̂zε and ̺ˆεz denote the invariant probability measure of (3.61)
and its density, respectively. Let
1
b εz − ∇Vbzε · ∇
Lbεz := ∆ + m
2
denote the extended generator
of
(3.61).
By Lemma 3.5 and the Markov inequality we
κ̂0
ε
obtain η∗ Bz \ Bnε (z) ≤ n2 for all n ∈ N. It follows that {η̊zεn : n ∈ N} is a tight family
of measures. Since, by the Markov inequality just mentioned, we have
κ̂0 ε2
,
(3.62)
η∗ε Br(ε) (z) ≥ η∗ε (Bz ) − 2
r (ε)
CONTROLLED EQUILIBRIUM SELECTION
27
and η∗εn (Bz ) → ξz > 0 as εn ց 0 the family {η̆zεn : n ∈ N} is also tight. By the Harnack
inequality the family {ˆ
̺εzn : n ∈ N} is locally bounded, and locally Hölder equi-continuous,
and the same of course applies to {˘
̺εzn : n ∈ N}. Moreover, the tightness of {η̆zεn : n ∈ N}
implies the uniform integrability of {˘
̺εzn : n ∈ N}. Select any subsequence, also denoted by
ε
n
{εn } along which ̺˘z converges locally uniformly, and denote the
R limit by ̺¯. By uniform
integrability, ̺˘εzn also converges in L1 (Rd ), as n → ∞, and hence Rd ̺¯(x) dx = 1. Therefore
η̄(dx) := ̺¯(x) dx is a probability measure. Let f be a smooth function with compact
support, and
1
∆ + M x − ∇V (x) · ∇ .
2
L̄ :=
Then
Z
Rd
̺εzn (x) dx
Lbεzn f (x)˘
Z
ε
ε
L̄f (x)¯
̺(x) dx ≤ −
Lbzn f (x) ̺˘zn (x) − ̺¯(x) dx
Rd
Rd
Z
ε
n
+ Lbz f (x) − L̄f (x) ̺¯(x) dx .
d
Z
(3.63)
R
Since ̺˘εzn → ̺¯ in L1 (Rd ), the first term on the right hand side of (3.63) converges to 0 as
n → ∞. Similarly since m
b εzn (x) → M x and Vbzεn → V uniformly on compacta, the second
term also converges to 0. Since η̂zε is an invariant probability measure of (3.61), by the
R
̺εzn (x) dx = 0, for all large enough εn , which implies
definition
of ̺˘εzn we have Rd Lbεzn f (x)˘
R
that Rd L̄f (x)¯
̺(x) dx = 0. Hence, η̄ is an infinitesimal invariant probability measure of
(3.59), and since the diffusion is regular, it is also an invariant probability measure. This
proves part (d).
Since the diffusion in (3.59) has an invariant probability measure, it follows that it is
positive recurrent. By Lemma 3.5 we have
sup
ε∈(0,1)
Z
{εν x+z ∈ Bz }
|x|2 η̂zε (dx) < ∞ .
R
By Fatou’s lemma we obtain Rd |x|2 η̄(dx) < ∞. This completes the proof of part (c).
R
Since V has at most quadratic growth by (3.60), we have Rd |V (x)| η̄(dx) < ∞. Thus,
by Lemma 4.12 in [17], if Exdenotes
the expectation operator for the process governed by
(3.59), it is the case that Ex V (Xt ) converges as t → ∞. Integrating both sides of (3.58)
with respect to η̄, we deduce that
Z
Rd
1
2
|∇V (x)|2 η̄(dx) = β̄ − ℓ(z) .
(3.64)
Since the Markov control v(Xt ) = −∇V (Xt ) is in general suboptimal for the problem in
(3.30), then by Lemma 3.4 we must have E+ (M ) ≤ β̄ − ℓ(z). This proves part (b).
28
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
ε
By (3.62) we have ̺̺˘ˆεz → ξz along {εn } as n → ∞, uniformly on compacts. Therefore,
z
using Fatou’s lemma, we obtain by part (b) that
Z
ℓ(x) + 21 |v∗εn (x)|2 η∗εn (dx)
lim inf
εn ց0
Br(εn ) (z)
= lim inf
εn ց0
Z
̺ˆεn (x)
ℓbεzn (x) + 12 |∇Vbzεn (x)|2 εzn
η̆zεn (dx)
̺
˘
(x)
ν
z
{ε x+z ∈ Br̆(ε) (z)}
This proves part (e) and thus completes the proof.
≥ ξz β̄ ≥ ξz E+ (M ) + ℓ(z) . (3.65)
The statement in Theorem 1.1 (iii) follows from the following result.
Theorem 3.3. Recall the definition of J2 from Theorem 1.1. Provided ν = 1, it holds that
lim β∗ε = J2 .
εց0
Proof. In view of (3.27), it is enough to show that lim inf εց0 β∗ε ≥ J2 . Choose any sequence
εn
εP
n ց 0 such that η∗ (Bz ) → ξz , for all z ∈ S. Let S o := {z ∈ S : ξz > 0}. Thus
z∈S o ξz = 1. Therefore, by Lemma 3.7 (e) we have
X Z
ℓ(x) + 12 |v∗εn (x)|2 η∗εn (dx)
lim inf β∗εn ≥
n→∞
z∈S o
≥
X
z∈S o
Br(εn ) (z)
ξz ℓ(z) + E+ Dm(z)
≥ J2 ,
and the proof is complete.
(3.66)
Recall the definition of Z2 from Theorem 1.1, and let the function r be as in Definition 3.2.
The inequality in (3.66) suggests that the control effort concentrates in the vicinity of Z2 .
This is asserted in the theorem that follows.
c (Z ) → 0 as ε ց 0. Moreover,
Theorem 3.4. Suppose ν = 1. Then η∗ε Br(ε)
2
Z
|v∗ε (x)|2 η∗ε (dx) = 0 .
lim
εց0
c
Br(ε)
(Z2 )
Proof. The first claim follows by Lemma 3.5 and Theorem 3.3.
To prove the second claim, given given any sequence εn ց 0, we extract a subsequence
also denoted by εn along which limn→∞ η∗εn (BP
z ) → ξz for all z ∈ S. As in the proof of
Theorem 3.3, let So := {z ∈ S : ξz > 0}. Then z∈So ξz = 1, and So ⊂ Z2 by Theorem 3.3.
Then, by (3.65) we have
Z
ℓ(x) + 21 |v∗εn (x)|2 η∗εn (dx) ≥ J2 ,
lim inf
n→∞
Br(εn ) (S o )
CONTROLLED EQUILIBRIUM SELECTION
and hence, using Theorem 3.3 we obtain
Z
lim inf
n→∞
c
Br(ε
(S o )
n)
29
|v∗εn (x)|2 η∗εn (dx) = 0 ,
and the proof is complete.
We next show that, under the hypothesis of Theorem 1.2. the scaled density ˚
̺εz (x) in
Definition 3.2 converges to that of a Gaussian distribution as ε ց 0. This establishes
Theorem 1.2 for the critical regime.
Theorem 3.5. Let ν = 1, and suppose that for some z ∈ S and a sequence εn ց 0, we
bz , Σ
b z ) be the pair of matrices
have limn→∞ η∗εn (Bz ) = ξz > 0. Set M ≡ Dm(z), and let (Q
which solves (1.5). The following hold:
(a) Provided that we normalize V ε by setting V ε (z) = 0, it holds that V εn (εn x + z) →
1
Tb
2,α (Rd ), α ∈ (0, 1).
2 x Qz x as n → ∞, in C
(b) The density ˚
̺zεn in Definition 3.2 converges as n → ∞ (uniformly on compact sets)
b z.
to the density of a Gaussian with mean 0 and covariance matrix Σ
Proof. Since β̄ in Lemma 3.7 satisfies β̄ ≤ lim supεց0 β∗ε , it follows by Theorem 3.3 that
β̄ = J2 . Therefore, by (3.64) we have
Z
2
1
2 |∇V (x)| η̄(dx) = E+ Dm(z) .
Rd
b z x for all x ∈ Rd , as claimed, follows by Lemma 3.4 (b). This proves the
That ∇V (x) = Q
first part of the result.
b z x, the invariant probability measure η̄∗ of (3.59) is Gaussian with
Since ∇V (x) = Q
b z as stated. By Lemma 3.7 (d), ̺˘ε → ̺¯ and ̺ˆεzε → ξz as ε ց 0,
covariance matrix Σ
z
̺˘z
uniformly on compact sets. Since η∗εn (Bz ) → ξz , the second part of the theorem follows and
the proof is complete.
It is interesting to note that part (a) of Theorem 3.5 holds for any z ∈ Z2 . We state this
as follows.
Theorem 3.6. Suppose ν = 1. Recall the definition of Z2 from Theorem 1.1. Then for
any z ∈ Z2 , and subject to the normalization V ε (z) = 0, it holds that
uniformly on compact sets.
1
bz x ,
Vbzε (x) −−−→ xT Q
ε→0 2
Proof. Let M = Dm(z). Let εn ց 0 be any sequence, and extract a subsequence, also
denoted as {εn } along which Vbzεn → V ∈ C 2 (Rd ). Taking limits as εn ց 0 in (2.6), and
b z , we obtain
using Theorem 3.3 and the fact that E+ Dm(z) = 21 trace Q
h
i
1
1
2
1
bz ) .
∆V (x) + min M x + u · ∇V (x) + 2 |u| = trace(Q
d
2
2
u∈R
30
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
b z x we obtain
By the suboptimality of u(x) = −∇Q
1
b z x|2 = 1 trace(Q
b z ) + f (x) ,
b z x · ∇V (x) + 1 |Q
∆V (x) + M x − Q
(3.67)
2
2
2
for some nonnegative function f . However, since ∇V has linear growth, then both f and V
have at most quadratic growth. Hence,
R repeating the argument that led to the derivation
b denotes the
of (3.44), we obtain from (3.67) that Rd f (x)ρΣ (x) dx = 0, hence f ≡ 0. If E
bz )Xt and identity diffusion
expectation operator for the diffusion Xt with linear drift (M − Q
matrix then the stochastic representation of the solution of (3.67) gives
Z τ̂r 1 Tb
1 b
2
b
b
q(x) := x Qz x = lim inf Ex
|Qz Xt | − trace(Qz ) dt ,
rց0
2
2
0
where τ̂r is the first hitting time of the ball Br [1, Theorem 3.7.12]. Applying Dynkin’s
formula to (3.67), and since f ≡ 0, we obtain V ≥ q. Also with L̄ := 12 ∆ + M x − ∇V · ∇,
we have L̄(V −q) ≤ 0. Since V (0) = q(0), we have V = q on Rd by the comparison principle,
and the proof is complete.
Remark 3.2. In summary, when ν = 1, the control cost for having η∗ε concentrate around a
point z ∈ S is E+ (Dm(z)). Therefore a point z ′ ∈ S will be chosen over a point z ∈ S, if
and only if ℓ(z ′ ) + E+ (Dm(z ′ )) < ℓ(z) + E+ (Dm(z)). Recall also that E+ (Dm(z)) = 0 for
z ∈ Ss .
In Example 3.1, Dm(0) = 2. Therefore, applying Theorem 3.3 and Lemma 3.4, we
conclude that in the critical regime the invariant probability distribution η∗ε concentrates
at x = 0 if c > 2, and at x = −1 if c < 2.
3.4. The supercritical and subcritical regimes revisited. We return to the analysis
of the supercritical and subcritical regimes, in order to determine the asymptotic behavior
of the invariant probability distribution in the vicinity of the minimal stochastically stable
set. In these regimes there are two scales. If we center the coordinates around a point is S,
then we have V ε (x) ∈ O(ε−2 x), while − log ̺ε∗ (x) ∈ O(ε−2ν x). We start with the subcritical
regime.
3.4.1. Subcritical Regime. Recall the notation introduced in Definition 3.2. Let z ∈ Z3 be
such that for some sequence εn ց 0 it holds that lim inf n→∞ η∗εn (Bz ) > 0. Let r be as in
Definition 3.2.
Normalize V ε by setting V ε (z) = 0, and let M := Dm(z). We scale the space as 1/εν , to
obtain
ε2(1−ν) b ε
1 bε
∆Vz (x) + m
b εz (x) · ∇Vbzε (x) −
|∇Vz (x)|2 + ℓbεz (x) = β∗ε .
(3.68)
2
2
By Lemma 3.6, ∇V̆zε = ε2(1−ν) ∇Vbzε is locally bounded and has at most linear growth. We
write (3.68) in the form (3.52) which corresponds to the scaled diffusion
bt = m
bt ) − ε2(1−ν) ∇Vbzε (X
bt ) dt + dW
ct ,
dX
b εz (X
(3.69)
and the associated HJB equation
h
1
ε
∆V̆z (x) + min m
b εz (x) + ε1−ν u · ∇V̆zε (x) +
2
u∈Rd
ε2(1−ν)
|u|2
2
i
= ε2(1−ν) β∗ε − ℓbεz (x) . (3.70)
CONTROLLED EQUILIBRIUM SELECTION
31
Letting ũ = ε1−ν u and taking limits in (3.69)–(3.70) along some subsequence εn ց 0, we
obtain a function V ∈ C 2 (Rd ) of at most quadratic growth satisfying
bt ) dt + dW
ct ,
bt = M X
bt − ∇V (X
(3.71)
dX
and
i
h
1
∆V (x) + min M x + ũ · V (x) + 21 |ũ|2 = 0 .
2
ũ∈Rd
Following the proofs of Lemma 3.7 and Theorem 3.5, and using Lemma 3.5, we deduce that
the densities ̺˘εz and ˚
̺εz in Definition 3.2 converge as εn ց 0 to the density of a Gaussian
b z , uniformly on compact sets. Also, V (x) = 1 xT Q
b z x.
distribution with covariance matrix Σ
2
bz = 0 by Remark 3.1 and Lemma 3.4.
However, since Dm(z) is Hurwitz, then Q
3.4.2. Supercritical Regime. We assume ν ∈ (1, 2), and we use the same scaling and definitions as in Section 3.4.1 for the subcritical
regime, except that z ∈ Z1 .
It is clear that ε2(1−ν) ℓbεz (x) − ℓ(z) → 0 as ε ց 0. By Corollary 3.1 the constants
ε2(1−ν) β∗ε − ℓ(z) are bounded, uniformly in ε ∈ (0, 1). Extract a subsequence εn ց 0
2(1−ν)
along which εn
β∗ε − ℓ(z) converges to a constant, and a further subsequence along
which V̆zε converges to V ∈ C 2 (Rd ), uniformly on compact sets. We obtain
h
i
1
∆V (x) + min M x + ũ · V (x) + 21 |ũ|2 = β̂ .
2
ũ∈Rd
With ũ = ε1−ν u, as defined earlier, and noting that the scaled diffusion in (3.71) has
invariant probability distribution η̂zε we obtain by (3.70) that
Z bε
Z
ℓz (x) − ℓ(z) ε
ν
2 ε
2(1−ν)
ε
|ũ(ε x + z)| η̂z (dx) +
η̂
(dx)
=
ε
β
−
ℓ(z)
.
(3.72)
z
∗
ε2ν−2
Rd
Rd
Let G be a bounded open set containing S such that ℓ(x) > ℓ(z) for all x ∈ Gc . Since
z ∈ J1 and ℓ is Lipschitz, we have ℓ(x) − ℓ(z) ≥ Cℓ dist(x, S) in G. Hence using the fact
that 2ν − 2 < ν together with the Cauchy–Schwartz inequality and Lemma 3.5 it follows
that the second term in (3.72) vanishes as
ε ց 0. Therefore, following the arguments in
Lemma 3.7 it follows that β̂ ≥ E+ Dm(z) , which implies that
Z
ε2(1−ν)
|v∗ε (εν x + z)|2 η̊zε (dx) ≥ E+ Dm(z) .
lim inf
2
εց0
εν x+z ∈ Bz
b z (x − z) applied to the nonHowever, under the control uε (x) = − 1ε m(x) − M (x − z) + Q
scaled problem, and with µε denoting the stationary distribution of the controlled process
we obtain
Z
ε2(1−ν)
|u(x)|2 µε (dx) = E+ Dm(z) .
lim
2
εց0 Rd
Therefore β̂ = E+ Dm(z) . Hence arguing as in Theorem 3.3 we can show that
Z
ε2(1−ν)
(3.73)
|v∗ε (εν x + z)|2 η̊zε (dx) = E+ Dm(z) .
lim
2
εց0
εν x+z ∈ Bz
bz x, and that ˚
̺εz converges
Moreover, by the proof of Theorem 3.5 we obtain V (x) = 12 xT Q
bz.
to the density of a Gaussian distribution with covariance matrix Σ
32
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
3.5. On Theorem 1.3. We summarize the conclusions of Sections 3.4.1–3.4.2 in the following lemma.
Lemma 3.8. Assume ν ∈ (0, 2) and let {Bz : z ∈ S} be as in Definition 3.2. Suppose
that for some z ∈ S and some sequence εn ց 0 it holds that lim inf n→∞ η∗εn (Bz ) > 0. Let
ı = 1, 2, or 3, according to as ν > 1, ν = 1, or ν < 1, respectively. Then, the density
˚
̺εzn in Definition 3.2 converges as n → ∞ (uniformly on compact sets) to the density of a
b z.
Gaussian with mean 0 and covariance matrix Σ
The ergodic control problem in (1.1) and (1.3) is of course equivalent to minimizing
Z T 1
1
2
ℓ(Xs ) + 2 |Ũs − m(Xs )| ds
E
lim sup
2ε
T →∞ T
0
over all admissible controls Ũ , subject to
Z t
Xt = X0 +
Ũs ds + εν Wt ,
0
t ≥ 0.
It follows that if v∗ε is an optimal stationary Markov control for (1.1), (1.3), then
ṽ∗ε := m(x) + εv∗ε
is an optimal control for (3.31), (3.24), and vice-versa.
e z [v](x) by
Suppose z ∈ S. We define the ‘running cost’ R
e z [v](x) := ℓ(x) − ℓ(z) + 1 |v(x) − m(x)|2 .
R
2ε2
Then
Z
e z [ṽ ε ](x) η ε (dx) .
R
β∗ε = ℓ(z) +
∗
∗
Rd
b z )(x − z), with M = Dm(z),
Suppose ν ≥ 1. Using the suboptimal control ṽz (x) = (M − Q
ε
instead of ṽ∗ we obtain
Z
ℓ(x) − ℓ(z) ε
1
b z D 2 ℓ(z) .
lim
ρΣb (x) dx =
trace Σ
(3.74)
2ν
z
εց0 Rd
ε
2
b z . We write
where ρεb denotes the Gaussian density with covariance matrix ε2ν Σ
Σz
where
Rz (x) :=
b z (x − z)|2 + Rz (x) ,
|ṽz (x) − m(x)|2 = |Q
b z )(x − z) − m(x) .
M (x − z) − m(x) · (M − 2Q
(3.75)
Since |Rz (x)| ≤ C|x − z|3 for some constant C, and the third moments of ρεb are 0, it
Σz
follows by Lemma 3.4 and (3.75) that
Z
1
2
ε
2ν
(3.76)
d 2ε2ν |ṽz (x) − m(x)| ρΣb z (x) dx − E+ Dm(z) ∈ O(ε ) .
R
By (3.74) and (3.76), we obtain
Z
1
e z [ṽz ](x)ρε (x) dx = E+ Dm(z) .
R
lim 2ν−2
b
Σz
εց0 ε
Rd
CONTROLLED EQUILIBRIUM SELECTION
33
We are now ready for the proof of the theorem.
Proof of Theorem 1.3. Suppose first that ν ∈ (1, 2). Select the collection of balls {Bz : z ∈
S} so as to satisfy
∀z ∈ S \ Z1 .
(3.77)
inf ℓ > min ℓ
S
Bz
Let εn ց 0 be any sequence. Extract some subsequence also denoted by {εn } along which
η∗εn (Bz ) converges to ξz ∈ [0, 1] for all z ∈ Z1 . Let Z1′ := {z ∈ Z1 : ξz > 0} . Also define
δ(ε) := sup
max η∗εn (Bz ) .
εn <ε z∈S\Z1′
→ 0. In the rest of the proof
By Theorem 1.1 (i) and the definition of Z1′ we have δ(ε) −−−
εց0
all limits or estimates are meant to be along the subsequence {εn }. By (3.73) we have
Z
e z [ṽ ε ](x)̺ε (x) dx ≥ ε2ν−2 η ε (Bz ) E+ Dm(z) − δ1 (ε)
∀ z ∈ Z1′ ,
(3.78)
R
∗
∗
∗
Bz
where δ1 (ε) → 0 as ε ց 0. Let z ∗ ∈ Z1∗ . By the Cauchy–Schwartz inequality and Lemma 3.5
we obtain
2
1/2
Z
Z
ℓ(x) − ℓ(z)
|ℓ(x) − ℓ(z)| ε
ε
< ∞ . (3.79)
sup
η∗ (dx) ≤ sup
η∗ (dx)
εν
ε2ν
Bz
ε∈(0,1) Bz
ε∈(0,1)
It follows by (3.77), (3.78)–(3.79), and Proposition 3.1 that
X Z
XZ
ε
ε
ε
∗
e
ℓ(x) − ℓ(z) η∗ε (dx)
Rz [ṽ∗ ](x) η∗ (dx) +
β∗ − ℓ(z ) ≥
z∈Z1′
Bz
z∈Z\Z1′
+
≥ ε2ν−2
X
z∈Z1′
Z
BcS
Bz
ℓ(x) − ℓ(z) η∗ε (dx)
η∗ε (Bz ) E+ Dm(z) − δ1 (ε) + O(εν ) + O(ε2 ) .
b z )(x − z ∗ ), we have by (3.76) that
On the other hand, with ṽz ∗ (x) = (M − Q
Z
e z ∗ [ṽz ∗ ](x)ρε (x) dx ≤ ε2ν−2 E+ Dm(z ∗ ) + O(ε2 ) .
R
b ∗
Σ
z
Rd
Combining (3.80)–(3.81), and since ṽz ∗ is suboptimal, we obtain
X
η∗ε (Bz ) E+ Dm(z) ≤ E+ Dm(z ∗ ) + δ2 (ε) ,
(3.80)
(3.81)
(3.82)
z∈Z1′
with δ2 (ε) −−−→ 0. By the definition of δ we have
εց0
X
z∈Z1′
η∗ε (Bz ) ≥ 1 − δ(ε) .
(3.83)
34
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Therefore, by (3.82)–(3.83) we obtain
X
η∗ε (Bz ) E+ Dm(z) − E+ Dm(z ∗ ) −−−→ 0 ,
εց0
z∈Z1′
which together with (3.83) and the definition of Z1′ implies that
η∗ε (Bz ) −−−→ 0
εց0
∀z ∈ Z1 \ Z1∗ .
Also, by (3.80) and (3.83) we have
β∗ε − J1 ≥ ε2ν−2 E+ Dm(z ∗ ) + o ε2ν−2
∀ z ∗ ∈ Z1∗ ,
which together with the upper bound in (3.81) implies (1.6), and the proof is complete.
4. Concluding remarks
In general, Morse–Smale flows may contain hyperbolic closed orbits, and it would be
desirable to extend the results of the paper accordingly. An energy function V as in Theorem 2.1 may be constructed to account for critical elements that are closed orbits [20, 25].
Note that under the control used in Section 3.1.1 the optimal stationary probability distribution concentrates on the minimum of V. In the case that z ∈ Rd belongs to a stable
periodic orbit with period T0 , we can construct V so that it attains its minimum on this
closed orbit. In this manner, if φt denotes the flow of the vector field m, then under the
control used in Section 3.1.1 we obtain
Z T0
Z
1
ℓ φt (z) dt .
ℓ(x) µε (dx) −−−→
εց0 T0 0
Rd
The same can be done in the subcritical regime, by modifying the proof of Lemma 3.2. We
leave it up to the reader to verify that Lemma 3.1 still holds if the set of critical elements
S contains hyperbolic closed orbits. Let us define
Z T0
1
ℓ̆(z) :=
ℓ φt (z) dt ,
T0 0
when z belongs to a closed orbit, and ℓ̆(z) = ℓ(z), when m(z) = 0. Then, provided
Arg minz∈S ℓ̆(z) contains only stable critical elements, then the support of the limit of the
stationary probability distribution lies in Ss , and this is true in any of the three regimes.
However, the full analysis when unstable closed orbits are involved, seems to be much more
difficult.
Appendix A. Proofs of Lemma 1.1 and Theorem 2.2
Proof of Lemma 1.1. Let (Ω, F, Ft , P) be a complete probability space, and {Wt } be a ddimensional standard Brownian motion defined on it. Consider any admissible control U .
Let
o
n
Rt
τN := inf t ≥ 0 : 0 |Us |2 ds ≥ N ε−1 , N ≥ 1 .
Since
E
Z
0
T
|Us |2 ds
< ∞
∀ T ∈ (0, ∞) ,
CONTROLLED EQUILIBRIUM SELECTION
it follows that τN ↑ ∞ a.s., as N → ∞. Define
Z
Z t∧τN
ε2−2ν t∧τN
1−ν
2
ΛN (t) := exp −ε
|Us | ds ,
hUs , dWs i −
2
0
0
35
t ∈ (0, ∞) .
Let P0 and E0 [ · ] denote the law of the process with U ≡ 0 and the expectation operator
under this law, respectively. In turn PN and EN denotes the measure defined by d PN/d P0 =
ΛN (T ) and the corresponding expectation, respectively. Then the laws of (X · ∧τN , U · ∧τN )
under PN and P coincide. Therefore
E0 ΛN (T ) ln ΛN (T ) = EN ln ΛN (T )
Z
Z T ∧τN
ε2−2ν T ∧τN
2
1−ν
|Us | ds
= EN −ε
hUs , dWs i −
2
0
0
Z T ∧τN
1
2−2ν
2
= EN ε
|Us | ds
2
0
Z T
1
−2ν
2
≤ E ε
|Us | ds
2
0
< ∞.
The third equality follows from the fact that under (Ω, Ft∧τN , PN ),
Z t∧τN
Wt∧τN +
ε−ν Us ds , t ≥ 0 ,
0
is a Brownian motion stopped at τN . Since x(ln(x))+ ≤ x ln(x) + e−1 , it follows that
+ < ∞.
sup E0 ΛN (T ) ln(ΛN (T ))
N
By [11, Theorem 1.3.4, p. 10], {ΛN (T ∧ τN )} are uniformly integrable and converge a.s. and
in L1 to Λ(T ), which then satisfies E0 [Λ(T )] = 1. Defining the probability measure P′ by
d P′/d P0 = Λ(T ), we obtain
Z t
Wt +
ε−ν Us ds , t ≥ 0 ,
0
as a (Ω, Ft , P′ ) Brownian motion. This gives a weak solution for the control Us over [0, T ]
under the probability measure P0 . To show the weak uniqueness we see that given two
solutions we can always define P′ as above and get two weak solutions of (1.1) with U = 0
under the law of P′ . But we have weak uniqueness in the latter. Hence weak uniqueness
follows.
The rest of this section is devoted to the proof of Theorem 2.2. Without loss of generality
we fix ε = 1. We consider the equation
i
h
1
∆V + min hm + u, ∇V i + ℓ + 12 |u|2 = β .
2
u∈Rd
If V is regular then the above equation can be rewritten as
1
1
ℓ − |∇V |2 = β − ∆V − m · ∇V .
(A.1)
2
2
36
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
It is shown in [3] that there exists a unique pair (V, β) ∈ C 2 (R) × R satisfying (A.1) such
that V (x) → ∞ as |x| → ∞. The existence of this solution is established as a limit of the
solution Vα to the discounted problem [2, p. 175]
ℓ−
1
1
|∇Vα |2 = αVα − ∆Vα − m · ∇Vα .
2
2
(A.2)
2,s
In [2, Theorem 4.18, p. 177] it is proved that (A.2) has a solution in Wloc
(Rd ), 1 ≤ s < ∞.
Then using standard elliptic pde results we can improve the solution so that Vα ∈ C 2 (Rd ).
1,2
The solution Vα to (A.2) is obtained in [2] as a strong Wloc
(Rd ) limit of the solution to the
Neumann boundary value problem
ℓ−
1
1
|∇vR |2 = αvR − ∆vR − m · ∇vR ,
2
2
∂vR
= 0
∂n
on
in BR (0) ,
(A.3)
∂BR (0) ,
as R → ∞ along some sub-sequence. Here n denotes the unit outward normal on ∂BR (0).
In fact, there exists vR ∈ L∞ (BR (0)) ∩ W 1,2 (BR (0)) solving the Neumann problem. In the
following lemma we obtain an estimate of the growth of Vα .
Lemma A.1. For every α > 0, the solution Vα of (A.2), which is obtained as a limit of
vR as R → ∞, is bounded below and has at most linear growth.
Proof. The proof mimics the calculations in [3]. By the proof of [3, Theorem 4.18] there
exist constants κ0 and R0 which do not depend on α such that
α vR (x) ≥ −κ0
∀ x ∈ BR (0) ,
∀ R > R0 ,
(A.4)
and for all α ∈ (0, 1). We divide the proof in a number of steps.
Step 1. The calculation here is based on the computations done in [3, pp. 179–180] (mentioned as Local bound from above). Choose 0 < 4r < 1. Consider a smooth function ψ such
that ψ = 1 on B2r (x0 ), ψ = 0 outside of B4r (x0 ) and 0 ≤ ψ ≤ 1. It is clear that we can
choose ψ such that |∇ψ| ≤ cr , where cr does not depends on x0 as we can translate this ψ
to any point. Now we choose x0 in BR (0) such that |x0 | + 4r < R.
Multiplying (A.3) with ψ 2 and integrating on BR (0) we have (see [2, p. 179])
Z
Z
Z
Z
Z
1
2 2
2
2
ℓ ψ2 ,
|∇vR | ψ =
(m · ∇vR )ψ +
(∇vR · ∇ψ)ψ −
vR ψ +
α
2 BR
BR
BR
BR
BR
which can also be written as
Z
Z
Z
2
(∇vR · ∇ψ)ψ −
vR ψ +
α
B4r (x0 )
B4r (x0 )
B4r (x0 )
1
+
2
Z
(m · ∇vR )ψ 2
2
B4r (x0 )
|∇vR | ψ
2
=
Z
B4r (x0 )
ℓ ψ 2 . (A.5)
CONTROLLED EQUILIBRIUM SELECTION
37
Applying Young’s inequality to the second and third terms of (A.5) and using (A.4) we
obtain
Z
Z
|∇vR |2 ψ 2
|∇vR |2 ≤
B4r (x0 )
B2r (x0 )
≤ 4
Z
2
ℓψ +
Z
2
B4r (x0 )
B4r (x0 )
Z
≤ κ1 (r) 1 +
ℓψ
B4r (x0 )
2
|∇ψ − mψ| − α
Z
vR ψ
B4r (x0 )
2
∀ α ∈ (0, 1) ,
and all R sufficiently large, where κ1 (r) is a constant depending on kmk∞ , κ0 , and r. Since
ℓ is nonnegative and Lipschitz continuous we have
ℓ(x) ≤ κ2 1 + |x| ≤ κ2 1 + |x0 | + |x − x0 | .
Therefore for some constant κ3 (r), independent of R, we have
Z
|∇vR |2 ≤ κ3 (r) 1 + |x0 |
∀ α ∈ (0, 1) ,
(A.6)
B2r (x0 )
and all R sufficiently large.
Step 2. Let us first consider d = 1. Then vR is absolutely continuous. Since vR → Vα
1,2
strongly in Wloc
(R) there exists a point, say 0, such that vR (0) → Vα (0) as R → ∞. Thus
|vR (0)| is bounded. Now let x ∈ R and choose R such that |x| + 4r < R. Let N be such
that 4rN ≥ |x| > 4r(N − 1). We have
Z x
′
|vR (x)| ≤ |vR (0)| + vR (s) ds
0
≤ |vR (0)| +
= |vR (0)| +
≤ |vR (0)| +
Z
4rN
0
′
|vR
(s)| ds
N
−1 Z 4r(i+1)
X
i=0
√
4r
4ri
′
|vR
(s)| ds
N
−1Z 4r(i+1)
X
i=0
4ri
′
|vR
(s)|2 ds
1/2
So if we use the estimate (A.6) then it follows that
√
1/2
|vR (x)| ≤ |vR (0)| + 4rN κ3 (r)(2 + |x|)
√
1/2
4r(|x| + 4r)
≤ |vR (0)| +
κ3 (r)(2 + |x|)
4r
≤ κ4 (r) 1 + |x|2 .
.
38
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Here, κ4 (r) does not depend on R. So in the limit we obtain |Vα (x)| ≤ κ4 (r) 1 + |x|2 and
that Vα is bounded from below. Now rewriting (A.2) as
h
i
1
∆Vα + min hm + u, ∇Vα i + ℓ + 12 |u|2 = αVα ,
2
u∈Rd
and using the fact that the control U ≡ 0 is sub-optimal, we obtain
Z t
Vα (x) ≤ Ex
e−αs ℓ(Xs0 ) ds + Ex e−αt Vα (Xt0 ) .
(A.7)
0
In (A.7) X 0 denotes the solution to (1.1) for U ≡ 0. It is straightforward to show using
Hypothesis 1.1 that
Z ∞
i
Ex
e−αs ℓ(Xs0 ) ds ≤ κ5 1 + |x| ,
0
for a constant κ5 which depends on α, and that Ex |Xt0 |2 e−αt → 0 as t → ∞. Therefore
from (A.7), using also the fact that Vα is bounded from below, we conclude that Vα has at
most linear growth.
Step 3. Let d ≥ 2. We can use the Green function in this case, as used in [3, p. 179]. Again
we consider x0 ∈ BR (0) such that |x0 | + 4r < R. We consider a Green function Gx0 (x)
on B2r (x0 ). We also use the estimates in [3, (4.83), p. 179] for Green functions where the
constants depend only on r. Now we consider a test function η such that η = 1 on Br (x0 ),
η = 0 outside of B3r/2 (x0 ), and 0 ≤ η ≤ 1.
Now we test (A.3) with η 2 G and we have (the first expression on [3, p. 180])
Z
Z
Z
1
1
∇vR · ∇η 2 G +
div(vR ∇η 2 )G
vR η 2 G +
vR (x0 ) + α
2
2
B2r (x0 )
B2r (x0 )
B2r (x0 )
Z
Z
Z
1
2
2 2
ℓη 2 G .
m · ∇vR η G = −
|∇vR | η G +
−
2
B2r (x0 )
B2r (x0 )
B2r (x0 )
At this point we observe that because of the choice of r, G is positive on B2r (x0 ) (for d = 2
we use that 2r < 1, see [3, p. 73]). Also m · ∇vR ≤ |m|2 + 14 |∇vR |2 . Therefore
Z
Z
Z
1
1
vR η 2 G +
vR (x0 ) + α
∇vR · ∇η 2 G +
div(vR ∇η 2 )G
2
2
B2r (x0 )
B2r (x0 )
B2r (x0 )
Z
Z
κ2 (1 + |x0 | + 2r)η 2 G .
|m|2 η 2 G +
≤
B2r (x0 )
B2r (x0 )
Now we observe that ∇η is nonzero for |x − x0 | ≥ r and x ∈ B3r/2 (x0 ) where we can bound
G (from (4.83) of [3]). Again using the lower bound for vR and the gradient estimate in
(A.6) and the L1 bound of G (see [2, (4.38)]) we have
vR (x0 ) ≤ κ6 (r) 1 + |x0 | .
Here, the constant κ6 (r) does not depend on R and x0 (Observe that one can translate η
and G so that the bounds on ∇η, ∇2 η, and G do not change as x0 varies). Hence passing to
the limit as R ր ∞, we obtain Vα (x) ≤ κ6 (r)(1 + |x|) and that Vα is bounded from below.
This concludes the proof.
CONTROLLED EQUILIBRIUM SELECTION
39
Lemma A.2. For every α > 0, Vα (x) → ∞ as |x| → ∞. Moreover, there exists a compact
set B such that minRd Vα = minB Vα for all α ∈ (0, 1).
Proof. Since Vα is bounded from below we may add a suitable constant to both sides of
(A.2) and assume that Vα ≥ 0 (ℓ is translated by a constant accordingly). Define for
x0 ∈ R d ,
ζ(x) := c1 (1 − |x − x0 |2 ), for |x − x0 | ≤ 1 ,
where c1 is a constant to be chosen later. Let θ(x) = Vα (x) − ζ(x) in B1 (x0 ). Then in
B1 (x0 ) we have
1
1
1
− ∆θ − m · ∇θ + αθ = − ∆Vα − m · ∇Vα + αVα + ∆ζ + m · ∇ζ − αζ
2
2
2
1
1
= − |∇Vα |2 + ℓ + c1 (−2d) − 2c1 m · (x − x0 ) − αζ .
2
2
Therefore in B1 (x0 )
1
1
1
− ∆θ − m · ∇θ + (∇Vα + ∇ζ) · ∇θ + αθ = − |∇ζ|2 + ℓ − dc1 − 2c1 m · (x − x0 ) − αζ
2
2
2
≥ ℓ − 2c21 − M0 c1 − αc1 ,
where M0 is a constant that depends on the bound of m. Since ℓ(x) → ∞ as |x| → ∞ we
can choose c1 = c1 (|x0 |) such that c1 (|x0 |) → ∞ as |x0 | → ∞ and
ℓ(x) − 2c21 (x) − M0 c1 (x) − αc1 (x) > 0
∀ x ∈ B1 (x0 ) .
Therefore
1
(∇Vα + ∇ζ) · ∇θ + αθ > 0
in B1 (x0 ) .
2
Since θ ≥ 0 on ∂B1 (x0 ) and α > 0 applying the maximum principle we have θ ≥ 0 in
B1 (x0 ). In particular, Vα (x0 ) ≥ c1 (|x0 |). This proves that Vα (x) → ∞ as |x| → ∞.
In particular, Vα attains its minimum at some x̂ ∈ Rd . Also ∆Vα (x̂) ≥ 0. Thus from
(A.2) we obtain
1
ℓ(x̂) = αVα (x̂) − ∆Vα (x̂) ≤ αVα (x̂) .
(A.8)
2
RLet η0 be the invariant probability measure corresponding to the control U ≡ 0. Then
Rd ℓ dη0 ≤ κ for some constant κ. Applying Lemma A.3 we obtain
Z ∞
Z
Z
κ
Ex
e−αs ℓ(Xs0 ) ds dη0 ≤
Vα (x̂) ≤
Vα dη0 ≤
.
α
d
R
0
−∆θ − m · ∇θ +
Therefore using (A.8) we have ℓ(x̂) ≤ κ. The result follows using the fact that ℓ(x) → ∞
as |x| → ∞.
Lemma A.3. For all x ∈ Rd ,
Vα (x) = inf EU
x
U ∈U
U
where E
Z
0
∞
e−αs ℓ(Xs ) + 12 |Us |2 ds ,
denotes the expectation under the law of (X, U ).
40
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
Proof. From Lemma A.1 we have |Vα (x)| ≤ M1 + M2 |x| and Vα is bounded from below. Let
us write (A.2) in the form of
i
h
1
(A.9)
∆Vα + min hm + u, ∇Vα i + ℓ + 12 |u|2 = αVα .
2
u∈Rd
Let U ∈ U be any admissible control such that
Z ∞
2
−αs
U
1
ℓ(Xs ) + 2 |Us | ds < ∞ ,
e
Ex
0
and
Xt = x +
Since EU
x
R t
0
Z
t
m(Xs ) ds +
Z
t
Us ds + Wt ,
0
0
|Us |2 ds is finite it is easy to see that
Z t
U
U
Ex sup |X(s)| ≤ C |x| + t + Ex
|Us | ds
< ∞,
0≤s≤t
0
Therefore multiplying both sides by e−αt we get
−αt
e
t ≥ 0.
EU
x [|Xt |]
−αt
≤ C(|x| + t)e
−αt
≤ C(|x| + t)e
−α
t
2
+ Ce
EU
x
Z
t
s
−α
2
e
0
|Us | ds
Z
1/2 √ − α t U t −αs
2
2
+ C te
e
|Us | ds
Ex
0
−αt
≤ C(|x| + t)e
√
α
+ C te− 2 t EU
x
Z
t
−αs
e
0
2
|Us | ds
1/2
−−−→ 0 .
t→∞
Now applying Itô’s formula to (A.9) we have
−αt
Vα (x) − e
EU
x [Vα (Xt )]
Hence using Lemma A.1 we have
Vα (x) ≤ EU
x
Z
≤
∞
0
EU
x
Z
0
t
−αs
e
2
1
ℓ(Xs ) + 2 |Us | ds .
e−αs ℓ(Xs ) +
1
2
|Us |2 ds .
(A.10)
Since (A.10) holds for all U ∈ U, the desired upper bound to Vα is established.
To show the converse inequality let vα be a measurable selector from the minimizer of
(A.9). With τR denoting the first exit time from a ball of radius R we then have
Z t∧τR
2 vα
−αs
1
Vα (x) = Ex
ℓ(Xs ) + 2 vα (Xs )
e
ds + Evxα e−α(t∧τR ) Vα (Xt∧τR ) . (A.11)
0
We claim that τR → ∞, Pvxα -a.s. Suppose that τR 9 ∞ as R → ∞ in Pvxα . Then there
exists t0 > 0 and ε0 > 0 such that Pvxα (τR < t0 ) > ε0 for all R > 0. Let R0 > 0 be such that
min Vα >
∂BR0
2 αt0
e Vα (x) .
ε0
CONTROLLED EQUILIBRIUM SELECTION
41
Such R0 exists by Lemma A.2. Since ℓ is nonnegative, we obtain
Evxα e−α(t0 ∧τR0 ) Vα (Xt0 ∧τR0 ) > ε0 e−αt0 min Vα
∂BR0
> 2Vα (x) ,
which contradicts (A.11), since ℓ is nonnegative. Therefore τR → ∞, as R → ∞, in
probability, and therefore also Pvxα -a.s., since it is monotone in R.
Taking limits in (A.11) as R → ∞, and applying Fatou’s lemma, we obtain
Z t
2 −αs
vα
1
ℓ(Xs ) + 2 vα (Xs ) ds + Evxα e−αt Vα (Xt ) ,
e
Vα (x) ≥ Ex
0
and taking limits once more as t → ∞, using the fact that Vα is bounded below, we have
Z ∞
2 −αs
vα
1
ℓ(Xs ) + 2 vα (Xs )
e
Vα (x) ≥ Ex
ds .
(A.12)
0
By (A.10) we must have equality in (A.12). Moreover (A.10) and (A.12) imply that every measurable selector from the minimizer of (A.9), and in particular the control Ut =
−∇Vα (Xt ), is optimal for the α-discounted problem and that Vα is the associated value
function. Since the solution of (1.1) under the control Ut = −∇Vα (Xt ) is regular, and since
the drift m − ∇Vα is locally Lipschitz, it follows that (1.1) under the optimal α-discounted
control Ut = −∇Vα (Xt ) has a unique strong solution.
The following is a special case of the Hardy–Littlewood theorem [26, Theorem 2.2].
Lemma A.4. Let {an } be a sequence of non-negative real numbers. Then
lim sup (1 − β)
βր1
∞
X
n=1
β n an ≤ lim sup
N →∞
N
1 X
an .
N n=1
Concerning the proof of Lemma A.4, note thatPif the right hand side of the above display
n
is finite then the set { ann } is bounded. Therefore ∞
n=1 β an in finite for every β < 1. Hence
we can apply [26, Theorem 2.2] to obtain the result.
Continuing, we define
Z T 1
2
1
ℓ(Xs ) + 2 |Us | ds .
E
β∗ := inf lim sup
U ∈U T →∞ T
0
To be more precise we should define β∗ as a function of the initial condition x. But we shall
see that the infimum does not depend on the initial condition.
Lemma A.5. Let β be given by (A.1). Then β ≤ β∗ .
Proof. Fix x ∈ Rd . Let U ∈ U be such that
Z T 1
2
1
ℓ(Xs ) + 2 |Us | ds < ∞ .
Ex
lim sup
T →∞ T
0
It is easy to see that
Z T Z N 1
1
2
1
lim sup
ℓ(Xs ) + 2 |Us | ds = lim sup
ℓ(Xs ) +
Ex
Ex
T →∞ T
N →∞ N
0
0
1
2
2
|Us |
ds ,
42
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
where N runs over the set of natural numbers. Define
Z n 2
1
ℓ(Xs ) + 2 |Us | ds ,
an := Ex
n−1
n ≥ 1.
Therefore applying Lemma A.4 we have
Z T ∞
X
1
2
1
β n an .
lim sup
Ex
ℓ(Xs ) + 2 |Us | ds ≥ lim sup (1 − β)
T →∞ T
βր1
0
(A.13)
n=1
For α > 0, define β = e−α and thus we have
Z n
∞
∞
X
X
−αs
n
−α
2
1
ℓ(Xs ) + 2 |Us | ds
e
Ex
β an ≥ e
n−1
n=1
n=1
−α
= e
Ex
Z
∞
−αs
e
0
1
2
2
ℓ(Xs ) + |Us |
ds .
Therefore combining (A.13) with Lemma A.3 we obtain
Z T 1
2
1
lim sup
ℓ(Xs ) + 2 |Us | ds ≥ lim sup (1 − e−α )e−α Vα (x) .
Ex
T →∞ T
αց0
0
Since αVα (x) → β as α ց 0 from [2] we obtain from above that
Z T 1
2
1
ℓ(Xs ) + 2 |Us | ds ≥ β .
Ex
lim sup
T →∞ T
0
U ∈ U being arbitrary this concludes the proof.
Proof of Theorem 2.2. From Lemma A.5 we have β ≤ β∗ where β is given by (A.1). It is
easy to see that v∗ = −∇V is a minimizing selector of
i
h
1
∆V + min hm + u, ∇V i + ℓ + 12 |u|2 = β ,
2
u∈Rd
and the HJB takes the following form
1
1
∆V + hm, ∇V i − |∇V |2 + ℓ = β .
2
2
By [19, Lemma 5.1] we have
|∇V (x)| ≤ C1 + C2 |x|
for some constants C1 , C2 . Since v∗ is locally Lipschitz with at most linear growth there
exists a unique strong solution to (1.1) with U = v∗ . Applying Itô’s formula to V we have
Z T ∧τR β − ℓ(Xs ) − 21 |∇V (Xs )|2 ds ,
Ex V (XT ∧τR ) − V (x) = Ex
0
where τR denotes the exit time from the ball of radius R > 0 around 0. Since V is bounded
from below and τR → ∞ as R → ∞ we have
Z T 1
Ex
ℓ(Xt ) + 12 |∇V (Xt )|2 dt ≤ β .
lim sup
T
0
CONTROLLED EQUILIBRIUM SELECTION
43
Hence β = β∗ . Also this proves that v∗ is an optimal stable Markov control and by the
ergodic theorem
Z ℓ(x) + 21 |∇V (x)|2 dη∗ ,
β =
Rd
where η∗ is the invariant probability measure corresponding to the control v∗ .
Acknowledgment
This work was initiated during Vivek Borkar’s visit to the Department of Electrical
Engineering, Technion, supported by Technion. Thanks are due to Prof. Rami Atar for
suggesting the problem as well as for valuable discussions. The work of Ari Arapostathis
was supported in part by the Office of Naval Research through grant N00014-14-1-0196,
and in part by a grant from the POSTECH Academy-Industry Foundation. The work of
Anup Biswas was supported in part by an award from the Simons Foundation (# 197982)
to The University of Texas at Austin and in part by the Office of Naval Research through
the Electric Ship Research and Development Consortium. This research of Anup Biswas
was also supported in part by an INSPIRE faculty fellowship. The work of Vivek Borkar
was supported in part by a J. C. Bose Fellowship from the Department of Science and
Technology, Government of India.
References
[1] A. Arapostathis, V. S. Borkar, and M. K. Ghosh. Ergodic control of diffusion processes, volume 143 of
Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 2011.
[2] A. Bensoussan and J. Frehse. On Bellman equations of ergodic control in Rn . J. Reine Angew. Math.,
429:125–160, 1992.
[3] A. Bensoussan and J. Frehse. Regularity results for nonlinear elliptic systems and applications, volume
151 of Applied Mathematical Sciences. Springer-Verlag, Berlin, 2002.
[4] R. Benzi, G. Parisi, A. Sutera, and A. Vulpiani. A theory of stochastic resonance in climatic change.
SIAM J. Appl. Math., 43(3):565–478, 1983.
[5] N. Berglund and B. Gentz. Metastability in simple climate models: pathwise analysis of slowly driven
Langevin equations. Stoch. Dyn., 2(3):327–356, 2002. Special issue on stochastic climate models.
[6] N. Berglund and B. Gentz. Noise-induced phenomena in slow-fast dynamical systems. A sample-paths
approach. Probability and its Applications (New York). Springer-Verlag London, Ltd., London, 2006.
[7] A. G. Bhatt and V. S. Borkar. Occupation measures for controlled Markov processes: characterization
and optimality. Ann. Probab., 24(3):1531–1562, 1996.
[8] A. Biswas and V. S. Borkar. Small noise asymptotics for invariant densities for a class of diffusions: a
control theoretic view. J. Math. Anal. Appl., 360(2):476–484, 2009. Erratum at arXiv:1107.2277.
[9] B. Z. Bobrovsky, M. M. Zakai, and O. Zeitouni. Error bounds for the nonlinear filtering of signals with
small diffusion coefficients. IEEE Trans. Inform. Theory, 34(4):710–721, 1988.
[10] V. S. Borkar. Optimal control of diffusion processes, volume 203 of Pitman Research Notes in Mathematics Series. Longman Scientific & Technical, Harlow; copublished in the United States with John
Wiley & Sons, Inc., New York, 1989.
[11] V. S. Borkar. Probability theory. An advanced course. Universitext. Springer-Verlag, New York, 1995.
[12] R. W. Brockett. Finite dimensional linear systems. John Wiley & Sons, 1970.
[13] S. Cerrai and M. Röckner. Large deviations for invariant measures of stochastic reaction-diffusion systems with multiplicative noise and non-Lipschitz reaction term. Ann. Inst. H. Poincaré Probab. Statist.,
41(1):69–105, 2005.
[14] M. V. Day. Recent progress on the small parameter exit problem. Stochastics, 20(2):121–150, 1987.
[15] J. Feng, M. Forde, and J.-P. Fouque. Short-maturity asymptotics for a fast mean-reverting Heston
stochastic volatility model. SIAM J. Financial Math., 1(1):126–141, 2010.
44
ARI ARAPOSTATHIS, ANUP BISWAS, AND VIVEK S. BORKAR
[16] M. I. Freidlin and A. D. Wentzell. Random perturbations of dynamical systems, volume 260 of
Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, New York, second edition, 1998.
[17] N. Ichihara. Large time asymptotic problems for optimal stochastic control with superlinear cost. Stochastic Process. Appl., 122(4):1248–1275, 2012.
[18] T. G. Kurtz and R. H. Stockbridge. Existence of Markov controls and characterization of optimal
Markov controls. SIAM J. Control Optim., 36(2):609–653, 1998.
[19] G. Metafune, D. Pallara, and A. Rhandi. Global properties of invariant measures. J. Funct. Anal.,
223(2):396–424, 2005.
[20] K. R. Meyer. Energy functions for Morse Smale systems. Amer. J. Math., 90:1031–1040, 1968.
[21] F. Moss. Stochastic resonance: From the ice ages to the monkey’s ear. In George H. Weiss, editor,
Contemporary Problems in Statistical Physics, chapter 5, pages 205–253. SIAM, Philadelphia, 1994.
[22] E. Olivieri and M. E. Vares. Large deviations and metastability, volume 100 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 2005.
[23] Z. Schuss. Theory and applications of stochastic differential equations. John Wiley & Sons, Inc., New
York, 1980. Wiley Series in Probability and Statistics.
[24] S. J. Sheu. Asymptotic behavior of the invariant density of a diffusion Markov process with small
diffusion. SIAM J. Math. Anal., 17(2):451–460, 1986.
[25] S. Smale. On gradient dynamical systems. Ann. of Math. (2), 74:199–206, 1961.
[26] R. Sznajder and J. A. Filar. Some comments on a theorem of Hardy and Littlewood. J. Optim. Theory
Appl., 75(1):201–208, 1992.
[27] O. Zeitouni and M. Zakai. On the optimal tracking problem. SIAM J. Control Optim., 30(2):426–439,
1992.
Department of Electrical and Computer Engineering, The University of Texas at Austin,
1 University Station, Austin, TX 78712
E-mail address: ari@ece.utexas.edu
Department of Mathematics, Indian Institute of Science Education and Research, Dr. Homi
Bhabha Road, Pune 411008, India
E-mail address: anup@iiserpune.ac.in
Department of Electrical Engineering, Indian Institute of Technology, Powai, Mumbai
E-mail address: borkar@ee.iitb.ac.in
Download