MINIMISING THE TIME TO REACH A TARGET AND RETURN
SAUL JACKA
Abstract. Motivated in part by a problem in Metropolis-coupled Markov chain Monte Carlo, we seek to minimise, in a suitable sense, the time it takes a (regular) diffusion with instantaneous reflection at 0 and 1 to travel from the origin to 1 and then return. The control mechanism is that we are allowed to choose the diffusion's drift at each point in [0,1]. We consider the static and dynamic versions of this problem, where, in the dynamic version, we are only able to choose the drift at each point at the time of first visiting that point. The dynamic version leads to a novel type of stochastic control problem.
December 13, 2012
1. Introduction and problem motivation
1.1. Introduction. Suppose that X^µ is the diffusion on [0,1] started at 0 and given by

dX_t^µ = σ(X_t^µ) dB_t + µ(X_t^µ) dt on (0,1),

with instantaneous reflection at 0 and 1 (see [11] or [7] for details). Where there is no risk of confusion we omit the superscript µ.

Formally, we define T_x to be the first time that the diffusion reaches x; then we define S = S(X), the shuttle time (between 0 and 1), by

S(X) := inf{t > T_1(X) : X_t = 0}.
In this article, we consider the following problem (and several variants and generalizations):

Problem 1.1. Minimise the expected shuttle time E[S]; i.e. find

inf_µ E[S(X^µ)],

where the infimum is taken over a suitably large class of drifts µ, to be specified in more detail later.
Given the symmetry of the problem it is tempting to hypothesise that the optimal choice of µ is 0. We shall soon see that, in general, this is false, although it is true when σ ≡ 1. (Indeed, when σ ≡ 1 and f ≡ 1, Theorem 1.4 below gives inf E[S] = (∫_0^1 √2 du)² = 2, which is precisely the expected shuttle time of driftless reflected Brownian motion on [0,1].)

We actually want to try to minimise S (and additive functionals of X^µ evaluated at S) in as general a way as possible, so we extend Problem 1.1 in the following ways:
Key words: SHUTTLE-TIME; DIFFUSION; BIRTH AND DEATH PROCESS;
MCMCMC; STOCHASTIC CONTROL; SIMULATED TEMPERING.
AMS 2010 subject classifications: Primary 60J25; secondary 60J27, 60J60, 93E20.
The author is most grateful to Gareth Roberts for suggesting this problem.
The author would like to thank David Hobson, Jon Warren and Sigurd Assing for stimulating
discussions on this problem and the participants at the EPSRC Workshop on Topics in Control, held
at the University of Warwick in November 2011, for further feedback.
Problem 1.2. Find

inf_µ E[∫_0^S f(X_t^µ) dt]

for suitable positive functions f;
and
Problem 1.3. Find

sup_µ E[exp(−∫_0^S α(X_t^µ) dt)]

for suitable positive functions α.
We shall also solve the corresponding discrete statespace problems (in both discrete
and continuous time).
Although we will prove more general versions, it seems appropriate to give a preliminary statement of our results in this context.
Theorem 1.4. Suppose that σ > 0, that f is a non-negative Borel-measurable function on [0,1] and that, denoting Lebesgue measure by λ,

(1.1) √f/σ ∈ L¹([0,1], λ);

then

inf_{measurable µ} E[∫_0^S f(X_t^µ) dt] = (∫_0^1 √(2f(u)/σ²(u)) du)².

If, in addition, √(f/σ²) is continuously differentiable and strictly positive on (0,1), then the optimal drift is µ̂ given by

µ̂ = −(1/2)(ln √(f/σ²))′.
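Theorem 1.4 lends itself to a quick numerical sanity check. The following Python sketch (our own illustration, not part of the paper; the choice σ(x) = 1 + x, f ≡ 1, the step size and the sample count are all assumptions) simulates the reflected Euler scheme under zero drift and under the drift µ̂ of the theorem, and compares the estimated E[S] with the theoretical infimum (∫_0^1 √2/(1+u) du)² = 2(ln 2)²:

import numpy as np

def shuttle_time(mu, sigma, dt=1e-3, rng=None):
    """One sample of the shuttle time S (0 -> 1 -> 0) for the Euler scheme
    of dX = mu(X)dt + sigma(X)dB on [0,1] with reflection at 0 and 1."""
    rng = rng or np.random.default_rng()
    x, t, hit_one = 0.0, 0.0, False
    sqdt = np.sqrt(dt)
    while True:
        x += mu(x) * dt + sigma(x) * sqdt * rng.standard_normal()
        t += dt
        if x >= 1.0:
            hit_one = True
        if hit_one and x <= 0.0:
            return t
        x = abs(x)               # reflect at 0
        x = 1.0 - abs(1.0 - x)   # reflect at 1

sigma = lambda x: 1.0 + x             # illustrative sigma
mu_hat = lambda x: 0.5 / (1.0 + x)    # -(1/2)(ln sqrt(f/sigma^2))' with f = 1
rng = np.random.default_rng(0)
for name, mu in [("zero drift", lambda x: 0.0), ("optimal drift", mu_hat)]:
    print(name, np.mean([shuttle_time(mu, sigma, rng=rng) for _ in range(500)]))
print("lower bound:", 2.0 * np.log(2.0) ** 2)   # (int_0^1 sqrt(2)/(1+u) du)^2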
Theorem 1.5. Suppose that σ > 0, that α is a non-negative Borel-measurable function on [0,1] and that

(1.2) √α/σ ∈ L¹([0,1], λ);

then

sup_{measurable µ} E[exp(−∫_0^S α(X_t^µ) dt)] = cosh⁻²(∫_0^1 √(2α(u)/σ²(u)) du).

If, in addition, √(α/σ²) is continuously differentiable and strictly positive on (0,1), then the optimal drift is µ̂ given by

µ̂ = −(1/2)(ln √(α/σ²))′.
We will eventually solve the problems dynamically, i.e. we will solve the corresponding stochastic control problems. However, we shall need to be careful about what these are, as the problem is essentially non-Markovian. Normally in stochastic control problems one can choose the drift of a controlled diffusion at each time point, but this is not appropriate here. In this context, it is appropriate that the drift is 'frozen' once we have had to choose it for the first time. We shall formally model this in Section 4.
1.2. Problem motivation. The problem models one arising in simulated tempering, a form of Markov chain Monte Carlo (MCMC). Essentially the level corresponds to the temperature in a "heat bath". The idea is that, when simulating a draw from a highly multimodal distribution, we use a reversible Markov process to move between low and high temperature states (and thus smear out the modes temporarily) so that the Markov chain can then move around the statespace; then at low temperature we sample from the true distribution (see [2]).
2. Notation and some general formulae
We assume the usual Markovian setup, so that X^µ lives on a suitable filtered space (Ω, F, (F_t)_{t≥0}), with the usual family of probability measures (P_x)_{x∈[0,1]} corresponding to the possible initial values.
Definition 2.1. Denote by s^µ the standardised scale function for X^µ and by m^µ the corresponding speed measure. Since X is regular and reflection is instantaneous we have:

s^µ(x) = ∫_0^x exp(−2 ∫_0^u µ(t)/σ²(t) dt) du,

m^µ([0,x]) := m^µ(x) = 2 ∫_0^x exp(2 ∫_0^u µ(t)/σ²(t) dt)/σ²(u) du = 2 ∫_0^x du/(σ²(u) s^µ′(u)),

(see [10]).
From now on, we shall consider the more general case where we only know that (dropping the µ-dependence) s and m are absolutely continuous with respect to Lebesgue measure, so that, denoting the respective Radon–Nikodym derivatives by s′ and m′, we have

s′ m′ = 2/σ²  Lebesgue-a.e.
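In code, Definition 2.1 is just a pair of cumulative integrals. A minimal sketch (illustrative µ and σ, crude left-endpoint quadrature, not from the paper) computing s^µ′ and checking that the two printed expressions for the speed density agree:

import numpy as np

x = np.linspace(0.0, 1.0, 1001)
dx = x[1] - x[0]
mu = np.sin(np.pi * x)       # illustrative drift
sigma = 1.0 + 0.5 * x        # illustrative diffusion coefficient

# s'(x) = exp(-2 int_0^x mu/sigma^2 dt) and s(x) = int_0^x s'(u) du
s_prime = np.exp(-2.0 * np.cumsum(mu / sigma**2) * dx)
s = np.cumsum(s_prime) * dx

# the two expressions for the speed density m' agree:
m_from_def = 2.0 * np.exp(2.0 * np.cumsum(mu / sigma**2) * dx) / sigma**2
m_from_scale = 2.0 / (sigma**2 * s_prime)
assert np.allclose(m_from_def, m_from_scale)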
For such a pair we shall denote the corresponding diffusion by X^s. We underline that we are only considering regular diffusions with "martingale part" ∫σ dB or, more precisely, diffusions X with scale functions s such that

ds(X_t) = s′(X_t)σ(X_t) dB_t,

so that, for example, sticky points are excluded (see [5] for a description of driftless sticky BM and its construction; see also [4] for other problems arising in solving stochastic differential equations). Note that our assumptions do allow generalised drift: if s is the difference between two convex functions (which we will not necessarily assume) then

(2.1) X_t^s = x + ∫_0^t σ(X_u) dB_u − (1/2) ∫_ℝ L_t^a(X) s″(da)/s′⁻(a),

where s′⁻ denotes the left-hand derivative of s, s″ denotes the signed measure induced by s′⁻, and L_t^a(X) denotes the local time at a accumulated by time t by X (see [10], Chapter VI, for details).
Definition 2.2. For each y ∈ [0,1], we denote by φ_y the function

φ_y : x ↦ E_x[∫_0^{T_y} f(X_s) ds],
where, as is usual, the subscript x denotes the initial value of X under the corresponding law Px .
Theorem 2.3. For 0 ≤ x ≤ y, φ_y is given by

(2.2) φ_y(x) = ∫_x^y ∫_{u=0}^v f(u) m′(u) s′(v) du dv,

while for 0 ≤ y ≤ x, φ_y is given by

(2.3) φ_y(x) = ∫_y^x ∫_{u=v}^1 f(u) m′(u) s′(v) du dv.

In particular,

(2.4) E_0[∫_0^S f(X_t^s) dt] = ∫_0^1 ∫_0^1 f(u) m′(u) s′(v) du dv.
Proof: This follows immediately from Proposition VII.3.10 of [10] on observing that, with instantaneous reflection at the boundaries, the speed measure is continuous. □
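As a numerical illustration of Theorem 2.3 (again a sketch with illustrative data, not the paper's), one can evaluate (2.2) and (2.3) on a grid and check that φ_1(0) + φ_0(1) equals the right-hand side of (2.4):

import numpy as np

n = 800
x = np.linspace(0.0, 1.0, n + 1)
dx = x[1] - x[0]
sigma = 1.0 + 0.5 * x
s_prime = np.exp(-2.0 * np.cumsum(np.sin(np.pi * x) / sigma**2) * dx)
m_prime = 2.0 / (sigma**2 * s_prime)
f = np.ones_like(x)

inner_low = np.cumsum(f * m_prime) * dx                 # int_0^v f m' du
phi_1_at_0 = np.sum(inner_low * s_prime) * dx           # (2.2) with x = 0, y = 1
inner_up = np.cumsum((f * m_prime)[::-1])[::-1] * dx    # int_v^1 f m' du
phi_0_at_1 = np.sum(inner_up * s_prime) * dx            # (2.3) with y = 0, x = 1

rhs = np.sum(f * m_prime) * dx * np.sum(s_prime) * dx   # RHS of (2.4)
print(phi_1_at_0 + phi_0_at_1, rhs)                     # agree up to O(dx)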
We give similar formulae for the discounted problem:

Definition 2.4. We denote by ψ_y the function

ψ_y : x ↦ E_x[exp(−∫_0^{T_y} α(X_s) ds)].
Theorem 2.5.
(i) Either

(2.5) ∫_0^1 α(u) dm(u) < ∞,

or

E_0[exp(−∫_0^S α(X_s) ds)] = 0,

in which case

∫_0^S α(X_s) ds = ∞ a.s.

(ii) Now suppose that (2.5) holds. For each n, denote by I_n(x) the integral

I_n(x) := ∫_{0≤u_1≤v_1≤u_2≤...≤v_n≤x} α(u_1)...α(u_n) dm(u_1)...dm(u_n) ds(v_1)...ds(v_n),

and by Ĩ_n(x) the integral

Ĩ_n(x) := ∫_{x≤v_1≤u_1≤v_2≤...≤u_n≤1} α(u_1)...α(u_n) dm(u_1)...dm(u_n) ds(v_1)...ds(v_n),

with I_0 = Ĩ_0 ≡ 1. Now define G and G̃ by

(2.6) G(x) := Σ_{n=0}^∞ I_n(x) and G̃(x) := Σ_{n=0}^∞ Ĩ_n(x).

Then the sums in (2.6) are convergent and, for x ≤ y,

ψ_y(x) = G(x)/G(y),

while for x ≥ y

ψ_y(x) = G̃(x)/G̃(y).
Proof:
(i) Note first that s(1) < ∞ follows from regularity.
We consider the case where x ≤ y. Now suppose that α is bounded. It follows that ψ_y > 0 for each y, since ∫_0^{T_y} α(X_s) ds is a.s. finite for bounded α. Now, setting

N_t = e^{−∫_0^{t∧T_y} α(X_s) ds} ψ(X_{t∧T_y}),

it is clear that N_t = E[exp(−∫_0^{T_y} α(X_s) ds) | F_{t∧T_y}] and is thus a continuous martingale. Then, writing

ψ(X_{t∧T_y}) = exp(∫_0^{t∧T_y} α(X_s) ds) N_t,

it follows that

ψ(X_{t∧T_y}) − ∫_0^{t∧T_y} αψ(X_u) du = ∫_0^t exp(∫_0^{u∧T_y} α(X_s) ds) dN_u,

and hence is a martingale. Thus we conclude that ψ is in the domain of A_y, the extended or martingale generator for the stopped diffusion X^{T_y}, and

(2.7) A_y ψ_y = αψ.

Since the speed and scale measures for X and X^{T_y} coincide on [0,y], and using the fact that ψ′(0) = 0, we conclude from Theorem VII.3.12 of [10] that

(2.8) ψ_y(x) = ψ_y(0) + ∫_{v=0}^x ∫_{u=0}^v s′(v) α(u) ψ_y(u) m′(u) du dv for x < y.

A similar argument establishes that

ψ_y(x) = ψ_y(1) + ∫_{v=x}^1 ∫_{u=v}^1 s′(v) α(u) ψ_y(u) m′(u) du dv for x > y.
Now either

min(ψ_1(0), ψ_0(1)) = 0,

in which case

E_0[exp(−∫_0^S α(X_s) ds)] = ψ_1(0)ψ_0(1) = 0,

or

(2.9) min(ψ_1(0), ψ_0(1)) = c > 0.

Suppose that (2.9) holds; then (since ψ_1 is increasing) it follows from (2.7) that

(2.10) ψ_1(1−) ≥ c + ∫_{v=0}^1 ∫_{u=0}^v s′(v) c α(u) m′(u) du dv
= c(1 + ∫_{u=0}^1 ∫_{v=u}^1 s′(v) α(u) m′(u) dv du)
≥ c(1 + (s(1) − s(1/2)) ∫_0^{1/2} α(u) m′(u) du).

Similarly, we deduce that

ψ_0(0+) ≥ c(1 + s(1/2) ∫_{1/2}^1 α(u) m′(u) du).

Thus, if (2.5) fails, (2.9) cannot hold (since if (2.5) fails then at least one of ∫_0^{1/2} α(u)m′(u) du and ∫_{1/2}^1 α(u)m′(u) du is infinite) and so we must have ψ_1(0)ψ_0(1) = 0.

To deal with unbounded α, take a monotone, positive sequence α_n increasing to α and take limits.
(ii) Suppose now that (2.5) holds. Setting

G(x) = ψ_1(x)/ψ_1(0),

we see that G satisfies equation (2.7) with G(0) = 1. Convergence of the series Σ I_n and Σ Ĩ_n follows from the bounds on I_n and Ĩ_n contained in the following lemma.
Lemma 2.6. Let

B(y) := ∫_0^y α(u) dm(u) and B̃(y) := ∫_y^1 α(u) dm(u);

then

(2.11) I_n(x) ≤ (s(x)B(x))ⁿ/(n!)² and Ĩ_n(x) ≤ (s̃(x)B̃(x))ⁿ/(n!)²,

where s̃(x) := s(1) − s(x).
Now we establish the first inequality in (2.11) by induction. The initial inequality is trivially satisfied. It is obvious from the definition that

I_{n+1}(x) = ∫_{v=0}^x ∫_{u=0}^v α(u) I_n(u) dm(u) ds(v),

and so, assuming that I_n(·) ≤ (s(·)B(·))ⁿ/(n!)²:

(2.12) I_{n+1}(x) ≤ ∫_{v=0}^x ∫_{u=0}^v α(u) (s(u)ⁿ B(u)ⁿ/(n!)²) dm(u) ds(v)
≤ ∫_{v=0}^x ∫_{u=0}^v α(u) (B(u)ⁿ/(n!)²) dm(u) s(v)ⁿ ds(v)  (since s is increasing)
= ∫_{v=0}^x (B(v)^{n+1}/(n!(n+1)!)) s(v)ⁿ ds(v)
≤ (B(x)^{n+1}/(n!(n+1)!)) ∫_{v=0}^x s(v)ⁿ ds(v)  (since B is increasing)
= (s(x)B(x))^{n+1}/((n+1)!)²,

establishing the inductive step. A similar argument establishes the second inequality.
Now by iterating equation (2.7) we obtain

G(x) = Σ_{k=0}^{n−1} I_k(x) + ∫_{0≤u_1≤v_1≤u_2≤...≤v_n≤x} α(u_1)...α(u_n) G(u_n) dm(u_1)...dm(u_n) ds(v_1)...ds(v_n).

Since G is bounded by 1/ψ_1(0), we see that

0 ≤ G(x) − Σ_{k=0}^{n−1} I_k(x) ≤ (1/ψ_1(0)) I_n(x).

A similar argument establishes that

0 ≤ G̃(x) − Σ_{k=0}^{n−1} Ĩ_k(x) ≤ (1/ψ_0(1)) Ĩ_n(x),

and so we obtain (2.6) by taking limits as n → ∞. □
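The series G of (2.6) is also computable by Picard iteration of the integral equation (2.8). The sketch below (our own check, not the paper's; σ ≡ 1, constant α, natural scale s(x) = x so that m′ ≡ 2) verifies the closed form this forces, namely G(x) = cosh(√(2α) x), consistent with Theorem 1.5:

import numpy as np

n, alpha = 2000, 1.5
x = np.linspace(0.0, 1.0, n + 1)
dx = x[1] - x[0]
m_prime = 2.0 * np.ones_like(x)    # sigma = 1 and s' = 1 give m' = 2

G = np.ones_like(x)                # start from I_0 = 1
for _ in range(60):                # Picard iteration of (2.8)
    inner = np.cumsum(alpha * G * m_prime) * dx   # int_0^v alpha G m' du
    G = 1.0 + np.cumsum(inner) * dx               # 1 + int_0^x inner s'(v) dv

print(G[-1], np.cosh(np.sqrt(2.0 * alpha)))       # both ~ cosh(sqrt(3))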
3. Preliminary results
For now we will state and prove more general, but still non-dynamic versions of
Theorems 1.4 and 1.5.
We define our constrained control set as follows:
Definition 3.1. Given a scale function s_0 ∼ λ and C, a Borel subset of [0,1], we define the constrained control set M_{s_0}^C by

(3.1) M_{s_0}^C = {scale functions s : ds|_C = ds_0|_C and s ∼ λ}.

The corresponding controlled diffusion X^s has scale function s and speed measure m given by

m′ = 2/(σ² s′).
Theorem 3.2. For any scale function s ∼ λ, define the measure I^s on ([0,1], B([0,1])) by

I^s(D) := ∫_D f(u) m(du) = ∫_D 2f(u)/(σ²(u) s′(u)) du

and the measure J by

J(D) := ∫_D √(2f(u)/σ²(u)) du;

then, given a scale function s_0,

inf_{s∈M_{s_0}^C} E_0[∫_0^S f(X_t^s) dt] = (√(s_0(C) I^{s_0}(C)) + J(C^c))².

The optimal choice of s is given by

s(dx) = s_0(dx) on C; s(dx) = √(s_0(C)/I^{s_0}(C)) √(2f(x)/σ²(x)) dx on C^c.
Proof: Note first that, from Theorem 2.3,

(3.2) E_0[∫_0^S f(X_t^s) dt] = φ_1(0) + φ_0(1) = ∫_{v=0}^1 ∫_{u=0}^1 f(u) s(dv) m(du)
= s_0(C) I^{s_0}(C) + ∫_{C^c} [I^{s_0}(C) s(dv) + s_0(C) f(v) m(dv)]
+ (1/2) ∫_{C^c} ∫_{C^c} [f(u) s(dv) m(du) + f(v) m(dv) s(du)],

where the factor 1/2 in the last term in (3.2) arises from the fact that we have symmetrised the integrand. Now, for s ∈ M_{s_0}^C, we can rewrite (2.4) as

(3.3) E_0[∫_0^S f(X_t^s) dt] = s_0(C) I^{s_0}(C) + ∫_{C^c} [I^{s_0}(C) s′(v) + s_0(C) 2f(v)/(σ²(v)s′(v))] dv
+ (1/2) ∫_{C^c} ∫_{C^c} [(2f(u)/σ²(u))(s′(v)/s′(u)) + (2f(v)/σ²(v))(s′(u)/s′(v))] du dv.

We now utilise the very elementary fact that, for a, b ≥ 0,

(3.4) inf_{x>0} [ax + b/x] = 2√(ab), and if a, b > 0 this is attained at x = √(b/a).

Applying this to the third term on the right-hand side of (3.3), we see from (3.4) that it is bounded below by

∫_{C^c} ∫_{C^c} √(4f(u)f(v)/(σ²(u)σ²(v))) du dv = (∫_{C^c} √(2f(u)/σ²(u)) du)² = J²(C^c),

and this bound is attained when s′(x) is a constant multiple of √(2f(x)/σ²(x)) a.e. on C^c.

Turning to the second term in (3.3), we see from (3.4) that it is bounded below by

∫_{C^c} 2√(s_0(C) I^{s_0}(C)) √(2f(v)/σ²(v)) dv = 2√(s_0(C) I^{s_0}(C)) J(C^c),

and this is attained when s′(x) = √(2f(x)/σ²(x)) √(s_0(C)/I^{s_0}(C)) a.e. on C^c.

Thus we see that the infimum of the RHS of (3.3) is attained by setting s′(x) equal to √(s_0(C)/I^{s_0}(C)) √(2f(x)/σ²(x)) on C^c, and this gives the stated value for the infimum. □
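To make Theorem 3.2 concrete, here is a small sketch (hypothetical f, σ and C, with s_0 corresponding to drift 1; none of these choices are from the paper) computing the optimal value (√(s_0(C)I^{s_0}(C)) + J(C^c))² and the optimal scale density:

import numpy as np

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
f = 1.0 + x**2                  # illustrative cost
sigma = np.ones_like(x)
s0_prime = np.exp(-2.0 * x)     # s_0 corresponding to drift 1 (sigma = 1)
C = x <= 0.5                    # constraint set

s0_C = np.sum(s0_prime[C]) * dx
I_C = np.sum((2.0 * f / (sigma**2 * s0_prime))[C]) * dx   # I^{s_0}(C)
J_Cc = np.sum(np.sqrt(2.0 * f / sigma**2)[~C]) * dx       # J(C^c)

value = (np.sqrt(s0_C * I_C) + J_Cc) ** 2
s_prime = np.where(C, s0_prime,
                   np.sqrt(s0_C / I_C) * np.sqrt(2.0 * f / sigma**2))
print(value)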
In the exponential case we only deal with constraints on s on [0, y].
Theorem 3.3. Assume that (2.5) holds and define

σ̃²(x) := σ²(x)/α(x).

(i) Let G be as in equation (2.6), so that (at least formally)

(α/2)(σ̃² s′ (G′/s′)′ − 2G) = (1/2)σ² G″ + µG′ − αG = 0,

and let G̃* satisfy the "adjoint equation"

(3.5) (α/2)((σ̃² s′ G̃*′)′/s′ − 2G̃*) = 0

with boundary conditions G̃*(0) = 1 and G̃*′(0) = 0, so that

G̃*(x) = 1 + ∫_{v=0}^x ∫_{u=0}^v (2α(v)G̃*(v)/(σ²(v)s′(v))) s′(u) du dv
= 1 + ∫_{v=0}^x ∫_{u=0}^v α(v) G̃*(v) dm(v) ds(u);

then G̃* is given by

(3.6) G̃*(x) = Σ_{n=0}^∞ Ĩ*_n(x),

where

(3.7) Ĩ*_n(x) := ∫_{0≤u_1≤v_1≤...≤v_n≤x} α(v_1)...α(v_n) ds(u_1)...ds(u_n) dm(v_1)...dm(v_n).

(ii) The optimal payoff for Problem 1.3 is given by

sup_{s∈M_{s_0}^{[0,y]}} E_0[exp(−∫_0^S α(X_t^s) dt)] = ψ̂(y),

where

ψ̂(y) = (√(GG̃*) cosh F(y) + √(σ̃² G′ G̃*′) sinh F(y))^{−2}

(all functions evaluated at y), with

F(y) := ∫_y^1 √(2α(u)/σ²(u)) du = ∫_y^1 (√2/σ̃(u)) du.

The payoff is attained by setting σ̃(x)s′(x) = √(G G̃*′/(G′ G̃*))(y) for all x ≥ y (if y = 0, any constant value for σ̃(x)s′(x) will do).
Proof:
(i) This is proved in the same way as equation (2.6) in Theorem 2.5.
(ii) First we define

(3.8) I*_n(x) := ∫_{x≤u_1≤v_1≤...≤v_n≤1} α(v_1)...α(v_n) ds(u_1)...ds(u_n) dm(v_1)...dm(v_n)

and

(3.9) G*(x) := Σ_{n=0}^∞ I*_n(x).

To prove (ii) we use the following representations (which the reader may easily verify):

(3.10) I_n(1) = Σ_{m=0}^n I_m(y) I*_{n−m}(y) − σ̃²(y) Σ_{m=1}^n I′_m(y) (I*_{n−m})′(y),

and

(3.11) Ĩ_n(0) = Σ_{m=0}^n Ĩ_m(y) Ĩ*_{n−m}(y) − σ̃²(y) Σ_{m=1}^n Ĩ′_m(y) (Ĩ*_{n−m})′(y).

It follows from these equations that

(3.12) E_0[exp(−∫_0^S α(X_t) dt)] = {(G(y)G*(y) − σ̃²(y)G′(y)(G*)′(y)) (G̃(y)G̃*(y) − σ̃²(y)G̃′(y)(G̃*)′(y))}^{−1}.

Now essentially the same argument as in the proof of Theorem 3.2 will work, as follows. Multiplying out the expression on the RHS of (3.12), we obtain the sum of the three terms:

(a) (1/2) G(y)G̃*(y) Σ_{m≥0,n≥0} [Ĩ_n(y) I*_m(y) + Ĩ_m(y) I*_n(y)];
(b) (1/2) G′(y)(G̃*)′(y) Σ_{m≥0,n≥0} [Ĩ′_n(y)(I*_m)′(y) + Ĩ′_m(y)(I*_n)′(y)]; and
(c) Σ_{m≥1,n≥0} [G(y)(G̃*)′(y) I*_n(y) Ĩ′_m(y) + G′(y)G̃*(y) (I*_m)′(y) Ĩ_n(y)],

where in the first two terms we have symmetrised the sums.

Using (2.6), (c) becomes

(3.13) Σ_{m≥1,n≥0} ∫_{D_{m,n}(y)} [ G(y)(G̃*)′(y) (t′(v_1)...t′(v_n) t′(w_1)...t′(w_m))/(t′(u_1)...t′(u_n) t′(z_1)...t′(z_{m−1}))
+ G′(y)G̃*(y) (t′(u_1)...t′(u_n) t′(z_1)...t′(z_{m−1}))/(t′(v_1)...t′(v_n) t′(w_1)...t′(w_m)) ] dλ̃(u,v,w,z),

where

D_{m,n}(x) = {(u,v,w,z) ∈ ℝⁿ × ℝⁿ × ℝᵐ × ℝ^{m−1} : x ≤ u_1 ≤ v_1 ≤ ... ≤ v_n ≤ 1 and x ≤ w_1 ≤ z_1 ≤ ... ≤ w_m ≤ 1},

t is the measure with Radon–Nikodym derivative t′ = σ̃s′, and λ̃ denotes the measure with Radon–Nikodym derivative 1/σ̃. Clearly each term in the sum in (3.13) is minimised by taking t′ constant and equal to √(G(y)(G̃*)′(y)/(G′(y)G̃*(y))) a.e. on [y,1].

The first two terms, (a) and (b), are each minimised by taking t′ constant a.e. on [y,1]. Substituting this value for t′ back in, we obtain the result. □
Remark 3.4. We see that, in general, in both Theorems 3.2 and 3.3 the optimal scale function has a discontinuous derivative. In the case where C = [0,y) there is a discontinuity in s′ at y. This will correspond to partial reflection at y (as in skew Brownian motion; see [10] or [7]) and will give rise to a singular drift, at least at y.
Remark 3.5. We may easily extend Theorems 3.2 and 3.3 to the cases where f or α vanishes on part of [0,1]. In the case where N := {x : f(x) = 0} is non-empty, observe first that the cost functional does not depend on the amount of time the diffusion spends in N, so that every value for ds|_N which leaves the diffusion recurrent will give the same expected cost. If λ(N) = 1 then the problem is trivial; otherwise, define the revised statespace S = [0, 1 − λ(N)] and solve the problem on this revised interval with the cost function f̃(x) := f(g⁻¹(x)), where

g : t ↦ λ([0,t] ∩ N^c)

and

g⁻¹ : x ↦ inf{t : g(t) = x}.

This gives us a diffusion and scale function s_S which minimises the cost functional on S. Then we can extend this to a solution of the original problem by taking

ds = ds_0 1_N + ds̃ 1_{N^c},

where ds_0 is any finite measure equivalent to λ and ds̃ is the Lebesgue–Stieltjes measure given by

s̃([0,t]) = s_S([0, λ([0,t] ∩ N^c)]) = s_S(g(t)).

An exactly analogous method will work in the discounted problem.
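The statespace collapse of Remark 3.5 is straightforward to implement; here is a sketch (our own illustration) with a hypothetical cost f vanishing on N = [0.4, 0.6]:

import numpy as np

x = np.linspace(0.0, 1.0, 10001)
dx = x[1] - x[0]
f = np.where((x >= 0.4) & (x <= 0.6), 0.0, 1.0)   # N = {f = 0} = [0.4, 0.6]

g = np.cumsum(f > 0.0) * dx        # g(t) = lambda([0,t] ∩ N^c)

def g_inv(y):
    """g^{-1}(y) = inf{t : g(t) = y}."""
    return x[min(np.searchsorted(g, y), x.size - 1)]

# the revised statespace is [0, 1 - lambda(N)] = [0, 0.8], with cost f(g^{-1}(.))
print(g[-1], g_inv(0.4))           # ~ 0.8 and ~ 0.4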
4. The dynamic control problems
We now turn to the dynamic versions of Problems 1.2 and 1.3.
A moment's consideration shows that it is not appropriate to model the dynamic version of the problem by allowing the drift to be chosen adaptively. If we were permitted to do this then we could choose a very large positive drift until the diffusion reaches 1, and then a very large negative drift to force it back down to 0. The corresponding optimal payoffs for Problems 1.2 and 1.3 would be 0 and 1 respectively. We choose, instead, to consider the problem where the drift may be chosen dynamically at each level, but only when the diffusion first reaches that level. Formally, reverting to the finite-drift setup, we are allowed to choose controls from the collection M of adapted processes µ with the constraint that

(4.1) µ_t = µ_{T_{X_t}},

or, continuing the generalised setup, to choose scale measures dynamically, in such a way that s′(X_t) is adapted.
Although these are very non-standard control problems, we are able to solve them, mainly because we can exhibit an explicit solution: to wit, following the same control as in the "static" case.
Remark 4.1. Note that this last statement would not be true if our constraint was not on a set of the form [0,y]. To see this, consider the case where our constraint is on the set [y,1]. If the controlled diffusion starts at x > 0 then there is a positive probability that it will not reach zero before hitting 1, in which case the drift will not have been chosen at levels below I_{T_1}, the infimum of X on [0,T_1]. Consequently, on the way down we can set the drift to be very large and negative below I_{T_1}. Thus the optimal dynamic control will achieve a strictly lower payoff than the optimal static one in this case. We do not pursue this problem further here but intend to do so in a sequel.
We need to define the set of admissible controls quite carefully and two approaches
suggest themselves: the first is to restrict controls to choosing a drift with the property
(4.1) whilst the second is to allow suitable random scale functions.
Both approaches have their drawbacks: in the first case we know from Remark 3.4
to expect that, in general, the optimal control will not be in this class, whilst, in the
second, it is not clear how large a class of random scale functions will be appropriate. In the interests of ensuring that an optimal control exists, we adopt the second
approach. From now on, we fix the Brownian Motion B on the filtered probability
space (Ω, F, (Ft )t≥0 , P).
Definition 4.2. By an equivalent random scale function we simply mean a random, finite Borel measure on [0,1] a.s. equivalent to Lebesgue measure, and we define

(4.2) M := {equivalent random scale functions s : there exists a martingale Y^s with Y_t^s = ∫_0^t (s′∘s⁻¹ · σ∘s⁻¹)(Y_u^s) dB_u}.

We define the corresponding controlled process X^s by

X_t^s = s⁻¹(Y_t^s).

For any s_0 ∈ M we then define the constrained control set M_{s_0}^y by

(4.3) M_{s_0}^y = {s ∈ M : ds|_{[0,y)} = ds_0|_{[0,y)}}.
Remark 4.3. Note that M contains all deterministic equivalent scale functions. An example of a random element of M when σ ≡ 1 is s, given by

ds|_{[0,1/2)} = dλ; ds(x)|_{[1/2,1]} = dλ 1_{(T_{1/2}(B)<1)} + exp(−2(x − 1/2)) dλ 1_{(T_{1/2}(B)≥1)},

corresponding to X^s having drift 1 above level 1/2 if and only if that level is not reached before time 1.
In general, the martingale constraint is both about existence of a solution to the corresponding stochastic differential equation and about imposing a suitable progressive
measurability condition on the random scale function.
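The random scale function of Remark 4.3 can be simulated directly. A rough sketch (σ ≡ 1; for simplicity the decision is based on the simulated path itself rather than the driving Brownian motion, and the Euler discretisation is our own choice):

import numpy as np

def sample_path(T=4.0, dt=1e-3, rng=None):
    """Simulate X^s of Remark 4.3 (sigma = 1): no drift on [0, 1/2); above
    1/2 the drift is 1 iff level 1/2 was not visited before time 1."""
    rng = rng or np.random.default_rng()
    n = int(T / dt)
    x = np.zeros(n)
    upper_drift = None                     # frozen at the first visit to 1/2
    for k in range(1, n):
        if upper_drift is None and x[k - 1] >= 0.5:
            upper_drift = 0.0 if k * dt < 1.0 else 1.0
        if upper_drift is not None and x[k - 1] >= 0.5:
            drift = upper_drift
        else:
            drift = 0.0
        x[k] = x[k - 1] + drift * dt + np.sqrt(dt) * rng.standard_normal()
        x[k] = abs(x[k]); x[k] = 1.0 - abs(1.0 - x[k])   # reflect at 0 and 1
    return x

path = sample_path(rng=np.random.default_rng(3))
print(path.max())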
Theorem 4.4. For each s ∈ M, let M_t^s denote the running maximum of the controlled process X^s. Then for each s_0 ∈ M, the optimal payoff (or Bellman) process V^{s_0}, defined by

V_t^{s_0} := essinf_{s∈M_{s_0}^{M_t^{s_0}}} E[∫_0^S f(X_u^s) du | F_t],

is given by

(4.4) V_t^{s_0} = v_t := ∫_0^t f(X_u^{s_0}) du + (√(s_0(M_t^{s_0}) I^{s_0}(M_t^{s_0})) + J(M_t^{s_0}))² − φ_{X_t^{s_0}}(0) for M_t^{s_0} < 1,
and
v_t = ∫_0^{t∧S} f(X_u^{s_0}) du + φ_0(X_{t∧S}^{s_0}) for M_t^{s_0} = 1

(where φ is formally given by equations (2.2) and (2.3) with s = s_0, and, here and in the proof, I^{s_0}(y) and J(y) abbreviate I^{s_0}([0,y]) and J([y,1])), and the optimal control is to take

(4.5) s′(x) = √(s_0(M_t^{s_0})/I^{s_0}(M_t^{s_0})) √(2f(x))/σ(x) for x ≥ M_t^{s_0}.
Proof: Consider the candidate Bellman process v_t. We use the following facts: that

(4.6) N_t := ∫_0^{t∧T_1} f(X_u^{s_0}) du − φ_{X_{t∧T_1}^{s_0}}(0) = E[∫_0^{T_1} f(X_u^{s_0}) du | F_{t∧T_1}] − φ_1(0)

is a martingale; that

(4.7) N′_t := φ_0(X_t^{s_0}) + ∫_0^t f(X_u^{s_0}) du

is equal to E[∫_0^S f(X_u^{s_0}) du | F_t] on the stochastic interval [[T_1, S]], and hence is a martingale on that interval; that M^{s_0} is a continuous, increasing process; and that φ_1(0) + φ_0(1) = s_0(1)I^{s_0}(1) (so that v is continuous at T_1(X^{s_0})). Then

dv_t = 2(√(s_0(M_t^{s_0}) I^{s_0}(M_t^{s_0})) + J(M_t^{s_0}))
× [(1/2) s_0′(M_t^{s_0}) √(I^{s_0}(M_t^{s_0})/s_0(M_t^{s_0})) + (1/2)(I^{s_0})′(M_t^{s_0}) √(s_0(M_t^{s_0})/I^{s_0}(M_t^{s_0})) − √(2f(M_t^{s_0})/σ²(M_t^{s_0}))] dM_t^{s_0}
+ dN_t 1_{(M_t^{s_0}<1)} + dN′_t 1_{(M_t^{s_0}=1)}
= 2(√(s_0(M_t^{s_0}) I^{s_0}(M_t^{s_0})) + J(M_t^{s_0}))
× [(1/2) s_0′(M_t^{s_0}) √(I^{s_0}(M_t^{s_0})/s_0(M_t^{s_0})) + (f(M_t^{s_0})/(σ²(M_t^{s_0}) s_0′(M_t^{s_0}))) √(s_0(M_t^{s_0})/I^{s_0}(M_t^{s_0})) − √(2f(M_t^{s_0})/σ²(M_t^{s_0}))] dM_t^{s_0}
+ dN_t 1_{(M_t^{s_0}<1)} + dN′_t 1_{(M_t^{s_0}=1)}.

Now, since s_0′, I^{s_0} and J are non-negative, it follows from (3.4) that

dv_t ≥ dN̄_t, where dN̄_t = dN_t 1_{(M_t^{s_0}<1)} + dN′_t 1_{(M_t^{s_0}=1)},

with equality if

(4.8) s′(M_t^{s_0}) = √(2 s_0(M_t^{s_0}) f(M_t^{s_0}) / (I^{s_0}(M_t^{s_0}) σ²(M_t^{s_0}))).

Then the usual submartingale argument (see, for example, [9], Chapter 11), together with the fact that v is bounded (by assumption (1.1)), gives us (4.4).

It is easy to check that s given by (4.8) is in M_{s_0}^{M_t^{s_0}}. The fact that the optimal choice of s satisfies (4.5) follows on substituting s′(x) = √(2 s_0(y) f(x)/(I^{s_0}(y) σ²(x))) in the formulae for s and I^s and observing that the ratio s(x)/I^s(x) is then constant on [y,1]. □
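The proof suggests a direct implementation of the dynamic policy: the scale density is frozen at all levels at or below the running maximum, and each time the maximum increases we extend s′ by the rule (4.5). A schematic sketch on a level grid (all names and numerical choices are illustrative; the point is the freezing-and-extension mechanism, not the discretisation):

import numpy as np

n = 500
levels = np.linspace(0.0, 1.0, n + 1)
dlev = levels[1] - levels[0]
f = 1.0 + levels**2                  # illustrative cost
sigma = np.ones(n + 1)               # illustrative diffusion coefficient

s_prime = np.full(n + 1, np.nan)     # nan = not yet visited, so not yet chosen
s_prime[0] = 1.0                     # arbitrary value at level 0

def extend_policy(max_idx):
    """Called when the running maximum first reaches levels[max_idx]:
    fix s' there via (4.5), using only the already-frozen levels below."""
    known = ~np.isnan(s_prime)
    s0_M = np.sum(s_prime[known]) * dlev                            # s_0([0, M])
    I_M = np.sum(2.0 * f[known] / (sigma[known]**2 * s_prime[known])) * dlev
    s_prime[max_idx] = (np.sqrt(s0_M / I_M)
                        * np.sqrt(2.0 * f[max_idx]) / sigma[max_idx])

for i in range(1, n + 1):   # the maximum sweeps upwards on first passage
    extend_policy(i)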
Theorem 4.5. The Bellman process for Problem 1.3 is given by

V_t^{s_0} := esssup_{s∈M_{s_0}^{M_t^{s_0}}} E[exp(−∫_0^S α(X_u^s) du) | F_t] = v_t := e^{−∫_0^t α(X_u^{s_0}) du} ψ(X_t^{s_0}, M_t^{s_0}),

where

ψ(x,y) = G(x)ψ̂(y) if y < 1, and ψ(x,y) = G̃(x) if y = 1,

and

ψ̂(y) = (√(GG̃*) cosh F(y) + √(σ̃² G′ G̃*′) sinh F(y))^{−2}

(as in Theorem 3.3(ii)), with

F(y) := ∫_y^1 du/σ̃(u).

The payoff is attained by setting

(4.9) σ̃(x)s′(x) = √(G G̃*′/(G′ G̃*))(M_t^{s_0}) for all x ≥ M_t^{s_0}.
Proof: The proof is very similar to that of Theorem 4.4. Note that ψ is continuous at the point (1,1). Thus, for a suitable bounded martingale n,

dv_t = exp(−∫_0^t α(X_u^{s_0}) du) ψ_y(X_t^{s_0}, M_t^{s_0}) 1_{(M_t^{s_0}<1)} dM_t^{s_0} + dn_t
= exp(−∫_0^t α(X_u^{s_0}) du) G(X_t^{s_0}) ψ̂′(M_t^{s_0}) 1_{(M_t^{s_0}<1)} dM_t^{s_0} + dn_t
= −2 exp(−∫_0^t α(X_u^{s_0}) du) G(X_t^{s_0}) (√(GG̃*) cosh F(M_t^{s_0}) + √(σ̃² G′ G̃*′) sinh F(M_t^{s_0}))^{−3}
× {[(√(GG̃*))′ − √(G′G̃*′)] cosh F(M_t^{s_0}) + [(√(σ̃² G′ G̃*′))′ − √(GG̃*/σ̃²)] sinh F(M_t^{s_0})} dM_t^{s_0} + dn_t,

where ψ_y denotes the partial derivative of ψ in y. Now

(√(GG̃*))′ = (1/2) G̃*′ √(G/G̃*) + (1/2) G′ √(G̃*/G) ≥ √(G′G̃*′), using (3.4),

with equality attained when

(4.10) G G̃*′ = G′ G̃*.

Similarly, defining the measure m^α so that dG̃*/dm^α = σ̃² s′ G̃*′ (with dm^α proportional to α dm), and writing

σ̃² G′ G̃*′ = (G′/s′)(σ̃² s′ G̃*′) = (dG/ds)(dG̃*/dm^α),

we have

(√(σ̃² G′ G̃*′))′ = (√((dG/ds)(dG̃*/dm^α)))′
= (1/2)(m^α)′ (d²G/(dm^α ds)) √((dG̃*/dm^α)/(dG/ds)) + (1/2) s′ (d²G̃*/(ds dm^α)) √((dG/ds)/(dG̃*/dm^α))
≥ √(GG̃*/σ̃²),

using (3.4) together with the equations satisfied by G and G̃*, with equality when

(4.11) s′ = (1/σ̃) √((G/G̃*)(dG̃*/dm^α)/(dG/ds)).

Now we can easily see (by writing dG/ds = (1/s′)G′ and dG̃*/dm^α = σ̃² s′ G̃*′) that (4.11) implies (4.10), so the standard supermartingale argument establishes that

V_t = v_t.

That the optimal choice of s′ is as given in (4.9) follows on observing that, with this choice of s′,

(σ̃(G′G̃* − GG̃*′))′(x) = 0 for x ≥ y,

and

G′(y)G̃*(y) − G(y)G̃*′(y) = 0. □
5. The discrete statespace case
5.1. Additive functional case. Suppose that X is a discrete-time birth and death process on S = {0, ..., N}, with transition matrix P given by

P_{n,n+1} = p_n, P_{n,n−1} = q_n = 1 − p_n, and p_N = q_0 = 0.
We define

w_n := q_n/p_n and W_n := Π_{k=1}^n w_k,

with the usual convention that the empty product is 1. Note that s, given by

s(n) := Σ_{k=0}^{n−1} W_k,

is the discrete scale function, in that s(0) = 0, s is strictly increasing on S and s(X_t) is a martingale.
Remark 5.1. Note that when we choose p_n or w_n we are implicitly specifying s(n+1) − s(n), so we shall denote this quantity by ∆s(n), and we shall denote by ds the Lebesgue–Stieltjes measure on S := {0, 1, ..., N−1} given by

ds(x) = ∆s(x).

Let f be a positive function on S and define

f̃(n) := (1/2)(f(n) + f(n+1)) for 0 ≤ n ≤ N−1.
Theorem 5.2. If we define

φ_y(x) := E_x[Σ_{t=0}^{T_y−1} f(X_t^{s_0})],

then for x ≤ y

(5.1) φ_y(x) = f(x) + ... + f(y−1) + Σ_{v=x}^{y−1} Σ_{u=0}^{v−1} 2f̃(u) W_v/W_u,

while for y ≤ x

(5.2) φ_y(x) = f(y+1) + ... + f(x) + Σ_{v=y}^{x−1} Σ_{u=v+1}^{N−1} 2f̃(u) W_v/W_u.
Proof: It is relatively easy to check that φ satisfies the linear recurrence

φ(x) = p_x φ(x+1) + q_x φ(x−1) + f(x)

with the right boundary conditions. □
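The recurrence in this proof gives an easy numerical check of (5.1); a minimal sketch (hypothetical p_n and f, our own illustration):

import numpy as np

N = 10
rng = np.random.default_rng(1)
p = np.empty(N + 1); p[0] = 1.0; p[N] = 0.0
p[1:N] = rng.uniform(0.2, 0.8, N - 1)
q = 1.0 - p
f = rng.uniform(0.5, 2.0, N + 1)

w = np.ones(N + 1); w[1:N] = q[1:N] / p[1:N]
W = np.cumprod(w)                   # W[n] = w_1 ... w_n, with W[0] = 1
ftil = 0.5 * (f[:-1] + f[1:])       # f~(n) for 0 <= n <= N-1

# phi_N(0) from (5.1)
phi_formula = sum(f[v] + sum(2.0 * ftil[u] * W[v] / W[u] for u in range(v))
                  for v in range(N))

# phi_N(0) by solving phi(x) = p_x phi(x+1) + q_x phi(x-1) + f(x), phi(N) = 0
A = np.zeros((N, N)); b = f[:N].copy()
for z in range(N):
    A[z, z] = 1.0
    if z + 1 < N:
        A[z, z + 1] = -p[z]
    if z - 1 >= 0:
        A[z, z - 1] = -q[z]
phi_rec = np.linalg.solve(A, b)
print(phi_formula, phi_rec[0])      # should agree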
It follows from this that optimal payoffs are given by essentially the same formulae as in the continuous case. Thus we now define the constrained control set M_{s_0}^C by

(5.3) M_{s_0}^C := {scale functions s : ds|_C = ds_0|_C}.
Remark 5.3. By convention we shall always assume that 0 ∈ C since we cannot
control W0 and hence cannot control ∆s(0).
Theorem 5.4. For any scale function s, define the measure I^s by

I^s(D) := Σ_{k∈D} f̃(k)/W_k for D ⊆ S,

and the measure J by

J(D) := Σ_{k∈D} √(2f̃(k)) for D ⊆ S;

then, given a scale function s_0,

inf_{s∈M_{s_0}^C} E_0[Σ_{t=0}^S f(X_t^s)] = (√(s_0(C) I^{s_0}(C)) + J(C^c))².

The optimal choice of s is given by

∆s(x) = W_x = W_x^0 (the value determined by s_0) on C, and ∆s(x) = W_x = √(s_0(C)/I^{s_0}(C)) √(2f̃(x)) on C^c.

Remark 5.5. Note that all complements are taken with respect to S.
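A direct implementation of Theorem 5.4 (illustrative data, not the paper's; C = {0, ..., 4} in a chain with N = 10):

import numpy as np

N = 10
x = np.arange(N)                      # sites 0, ..., N-1 carrying ds
ftil = 1.0 + 0.1 * x                  # illustrative f~
W0 = np.exp(-0.3 * x)                 # weights W determined by s_0
C = x < 5                             # constraint set (0 in C)

s0_C = W0[C].sum()                    # s_0(C)
I_C = (ftil[C] / W0[C]).sum()         # I^{s_0}(C)
J_Cc = np.sqrt(2.0 * ftil[~C]).sum()  # J(C^c)

value = (np.sqrt(s0_C * I_C) + J_Cc) ** 2
W_opt = np.where(C, W0, np.sqrt(s0_C / I_C) * np.sqrt(2.0 * ftil))
print(value, W_opt)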
The dynamic problem translates in exactly the same way: we define the constrained control set

M_{s_0}^y := {s : ds|_{{0,...,y−1}} = ds_0|_{{0,...,y−1}}}, y ≥ 1;

then we have the following:
Theorem 5.6. For each s ∈ M, let M_t^s denote the running maximum of the controlled process X^s. Then for each s_0 ∈ M, the optimal payoff (or Bellman) process V^{s_0}, defined by

V_t^{s_0} := essinf_{s∈M_{s_0}^{M_t^{s_0}}} E[Σ_{u=0}^{S−1} f(X_u^s) | F_t],

is given by

(5.4) V_t^{s_0} = v_t := Σ_{u=0}^t f(X_u^{s_0}) + (√(s_0(M_t^{s_0}) I^{s_0}(M_t^{s_0})) + J(M_t^{s_0}))² − φ_{X_t^{s_0}}(0) for M_t^{s_0} < N,
and
v_t = Σ_{u=0}^t f(X_u^{s_0}) + φ_0(X_t^{s_0}) for M_t^{s_0} = N

(where φ is formally given by equations (5.1) and (5.2) with s = s_0), and the optimal control is to take

(5.5) W(x) = √(s_0(M_t^{s_0})/I^{s_0}(M_t^{s_0})) √(2f̃(x)) for x ≥ M_t^{s_0}.
5.2. The discounted problem. Suppose that for each x ∈ S, 0 ≤ r_x ≤ 1; then define

σ_i² := (1 − r_{i−1} r_i)^{−1}, with r_{−1} taken to be 1,

and σ(i_1, i_2, ..., i_l) := Π_{m=1}^l σ_{i_m}. Now set

A_k(x) = {(u,v) : 0 ≤ u_1 < v_1 < ... < v_k < x},
Ã_k(x) = {(u,v) : x ≤ v_1 < u_1 < ... < u_k < N},

and W_m^σ = σ_m W_m, where W_m is as before. Note that A_k(x) and Ã_k(x) will be empty for large values of k.
Theorem 5.7. For x ≤ y

(5.6) E_x[Π_{t=0}^{T_y−1} r_{X_t^{s_0}}] = r_x ... r_{y−1} G(x)/G(y),

where

G(x) := 1 + Σ_{k=1}^∞ Σ_{(u,v)∈A_k(x)} (1/σ(u,v)) Π_{m=1}^k W_{v_m}^σ/W_{u_m}^σ,

while for x ≥ y

(5.7) E_x[Π_{t=0}^{T_y−1} r_{X_t^{s_0}}] = r_{y+1} ... r_x G̃(x)/G̃(y),

where

G̃(x) := 1 + Σ_{k=1}^∞ Σ_{(u,v)∈Ã_k(x)} (1/σ(u,v)) Π_{m=1}^k W_{v_m}^σ/W_{u_m}^σ.
Proof: Define d_x := E_x[Π_{t=0}^{T_{x+1}−1} r(X_t^{s_0})]; then

d_x = r_x(p_x + q_x d_{x−1} d_x).

Setting r_x t_x/t_{x+1} = d_x, we see that

t_{x+1} = (1 + w_x) t_x − w_x r_x r_{x−1} t_{x−1},

or

t_{x+1} − t_x − w_x(t_x − t_{x−1}) = w_x(1 − r_{x−1} r_x) t_{x−1}.

Substituting t_x = G(x), it is easy to check that this is satisfied. Now boundary conditions give equation (5.6). The proof of equation (5.7) is essentially the same. □
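The recursion for d_x in this proof can be checked against a Monte Carlo estimate of the expectation in (5.6); a sketch with hypothetical p_n and constant r (our own illustration):

import numpy as np

N, r_const = 5, 0.9
rng = np.random.default_rng(2)
p = np.empty(N + 1); p[0] = 1.0; p[N] = 0.0
p[1:N] = rng.uniform(0.3, 0.7, N - 1)
r = np.full(N + 1, r_const)

def mc_discounted(x, y, n_paths=20_000):
    """Estimate E_x[prod_{t < T_y} r_{X_t}] by simulation."""
    total = 0.0
    for _ in range(n_paths):
        z, disc = x, 1.0
        while z != y:
            disc *= r[z]
            z += 1 if rng.random() < p[z] else -1
        total += disc
    return total / n_paths

# d_x = r_x (p_x + q_x d_{x-1} d_x)  =>  d_x = r_x p_x / (1 - r_x q_x d_{x-1})
d = np.empty(N)
for z in range(N):
    prev = d[z - 1] if z > 0 else 0.0    # q_0 = 0 kills the second term
    d[z] = r[z] * p[z] / (1.0 - r[z] * (1.0 - p[z]) * prev)
print(mc_discounted(2, 3), d[2])         # should agree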
Now, with this choice of σ, we get the same results as before:
Theorem 5.8. Suppose G and G̃ are as defined in Theorem 5.7; we set

G*(x) := 1 + Σ_{k=1}^∞ Σ_{(u,v)∈A*_k(x)} (1/σ(u,v)) Π_{m=1}^k W_{v_m}^σ/W_{u_m}^σ

and

G̃*(x) := 1 + Σ_{k=1}^∞ Σ_{(u,v)∈Ã*_k(x)} (1/σ(u,v)) Π_{m=1}^k W_{v_m}^σ/W_{u_m}^σ,

where

A*_k(x) = {(u,v) : x ≤ u_1 < v_1 < ... < v_k < N}

and

Ã*_k(x) = {(u,v) : 0 ≤ v_1 < u_1 < ... < u_k < x}.

Then the optimal payoff in the discrete version of Problem 1.3 is given by

sup_{s∈M^{{0,...,y}}} E_0[Π_{t=0}^{S−1} r_{X_t^{s_0}}] = ψ̂(y) := (√(G(y)G̃*(y)) Σ_n F_{2n}(y) + √(∆G(y)∆G̃*(y)) Σ_n F_{2n+1}(y))^{−2},

where

F_k(y) := Σ_{y≤x_1<...<x_k<N} 1/σ(x),

∆G(y) := Σ_{n=1}^∞ Σ_{0≤u_1<v_1<...<v_{n−1}<u_n<y} (1/σ(u,v)) Π_m W_{v_m}^σ / Π_m W_{u_m}^σ,

and

∆G̃*(y) := Σ_{n=1}^∞ Σ_{0≤u_1<v_1<...<v_{n−1}<u_n<y} (1/σ(u,v)) Π_m W_{u_m}^σ / Π_m W_{v_m}^σ.
Theorem 5.9. The Bellman process for the dynamic version of Problem 1.3 is given by

V_t^s = Π_{u=0}^{t−1} r_{X_u^s} G(X_t^s) ψ̂(M_t^s) if M_t^s < N,
and
V_t^s = Π_{u=0}^{t−1} r_{X_u^s} G̃(X_t^s) if M_t^s = N.
5.3. Continuous-time and discrete-time with waiting. Now we consider the cases where the birth and death process may wait in a state and where it forms a continuous-time Markov chain.

By solving the problem in the generality of Theorems 5.4 to 5.9 we are able to deal with these two cases very easily. First, in the discrete-time case with waiting, where

P_{n,n−1} = q_n; P_{n,n} = e_n; and P_{n,n+1} = p_n

(we stress that we take the holding probabilities e_n to be fixed and not controllable), we can condition on the first exit time from each state, so that we replace P by P* given by

P*_{n,n−1} = q*_n := q_n/(1 − e_n); P*_{n,n} = 0; and P*_{n,n+1} = p*_n := p_n/(1 − e_n).

Of course we must now modify the performance functional to allow for the time spent waiting in a state. Thus for the additive case we must replace f by f* given by

f*(n) = f(n)/(1 − e_n),

whilst in the multiplicative case we replace r by r* given by

r*(n) := Σ_{t=0}^∞ e_n^t (1 − e_n) r(n)^{t+1} = (1 − e_n) r(n)/(1 − e_n r(n)).
Then, in the case of a continuous-time birth and death process with birth and death rates λ_n and µ_n, we obtain P as the transition matrix for the corresponding jump chain, so P_{n,n−1} = µ_n/(λ_n + µ_n) and P_{n,n+1} = λ_n/(λ_n + µ_n) (see [6] or [11]). We allow for the exponential holding times by setting

f*(n) = f(n)/(λ_n + µ_n)

and

r*(n) = (λ_n + µ_n)/(α(n) + λ_n + µ_n).

Thus our results are still given by Theorems 5.4–5.9.
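These reductions are one-liners in code; a sketch (hypothetical rates, our own illustration) producing the jump-chain inputs for Theorems 5.4–5.9 from continuous-time data:

import numpy as np

N = 8
lam = np.linspace(1.0, 2.0, N + 1); lam[N] = 0.0   # birth rates (none at N)
mu = np.linspace(2.0, 1.0, N + 1);  mu[0] = 0.0    # death rates (none at 0)
f = np.ones(N + 1)
alpha = 0.5 * np.ones(N + 1)

total = lam + mu
p = lam / total                    # jump-chain up-probabilities
f_star = f / total                 # additive cost per visit
r_star = total / (alpha + total)   # multiplicative discount per visit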
6. Examples and some concluding remarks
We first consider the original time-minimisation problem with general σ.
Example 6.1. Suppose that f = 1 and we seek to solve Problem 1.2. Thus C = ∅, y = 0, and the optimal choice of s′ according to Theorem 3.2 is

s′ = √2/σ.

Notice that it follows that (with this choice of scale function)

ds(X_t^s) = √2 dB_t,

and

E_0[S] = s(1)².
Example 6.2. If we extend the previous example by assuming that s is given on [0,y), then we will still have s′ proportional to 1/σ on [y,1], and so on this interval s(X^s) will behave like a multiple of Brownian motion with partial reflection at y (at least if s′(y−) exists).
Example 6.3. We now consider the additive functional case with general f. Then, from Theorem 3.2, the optimal choice of s′ is √(2f)/σ. With this choice of s, we see that

⟨s(X^s)⟩_t = ∫_0^t f(X_u^s) du,

so that

E_0[∫_0^S f(X_u^s) du] = E[⟨s(X^s)⟩_S].
Example 6.4. If we turn now to the discounted case and take α constant and σ = 1, we see that the optimal choice of s′ is constant, corresponding to zero drift. Thus we obtain the same optimal control for each α. This suggests that, possibly, the optimum is actually a stochastic minimum for the shuttle time. Whilst we cannot contradict this for initial position 0, the corresponding statement for a general starting position is false.

To see this, let s_0 correspond to drift 1 on [0,y]. Then a simple calculation shows that the optimal choice of s′ on [y,1] is

√( 2α (cosh(√(1+2α) y) + (1/√(1+2α)) sinh(√(1+2α) y)) / (cosh(√(1+2α) y) − (1/√(1+2α)) sinh(√(1+2α) y)) ).

It is clear that this choice depends on α, and hence there cannot be a stochastic minimum since, were one to exist, it would achieve the minimum in each discounted problem.
Remark 6.5. For cases where a stochastic minimum is attained in a control problem
see, for example, [8] or [3].
References
[1] Assing, S. and Schmidt, W. M.: "Continuous Strong Markov Processes in Dimension One", Springer, Berlin-Heidelberg-New York, 1998.
[2] Atchadé, Y., Roberts, G. O. and Rosenthal, J. S.: Towards optimal scaling of Metropolis-coupled Markov chain Monte Carlo, Statistics and Computing, 21(4), 555–568, 2011.
[3] Connor, S. and Jacka, S.: Optimal co-adapted coupling for the symmetric random walk on the hypercube, J. Appl. Probab. 45, 703–713, 2008.
[4] Engelbert, H. J.: Existence and non-existence of solutions of one-dimensional stochastic equations, Probability and Mathematical Statistics, 20(2), 343–358.
[5] Engelbert, H. J. and Schmidt, W.: On solutions of one-dimensional stochastic differential equations without drift, Z. Wahrsch. Verw. Gebiete 68, 287–314, 1985.
[6] Feller, W.: "An Introduction to Probability Theory and Its Applications, Vol. II", 2nd Edition, Wiley, New York, 1971.
[7] Itô, K. and McKean, H. P.: "Diffusion Processes and their Sample Paths", Springer, Berlin-Heidelberg-New York, 1974.
[8] Jacka, S.: Keeping a satellite aloft: two finite fuel stochastic control models, J. Appl. Probab. 36, 1–20, 1999.
[9] Øksendal, B.: "Stochastic Differential Equations", 6th Edition, Springer, Berlin-Heidelberg-New York, 2003.
[10] Revuz, D. and Yor, M.: "Continuous Martingales and Brownian Motion", 3rd Edition, corrected 3rd printing, Springer, Berlin-Heidelberg-New York, 2004.
[11] Williams, D.: "Diffusions, Markov Processes, and Martingales: Vol. I", Wiley, New York, 1979.
Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
E-mail address: s.d.jacka@warwick.ac.uk