MINIMISING THE TIME TO REACH A TARGET AND RETURN

SAUL JACKA

Abstract. Motivated in part by a problem in Metropolis-coupled Markov chain Monte Carlo, we seek to minimise, in a suitable sense, the time it takes a (regular) diffusion with instantaneous reflection at 0 and 1 to travel from the origin to 1 and then return. The control mechanism is that we are allowed to choose the diffusion's drift at each point in [0,1]. We consider the static and dynamic versions of this problem, where, in the dynamic version, we are only able to choose the drift at each point at the time of first visiting that point. The dynamic version leads to a novel type of stochastic control problem.

December 13, 2012

Key words: SHUTTLE-TIME; DIFFUSION; BIRTH AND DEATH PROCESS; MCMCMC; STOCHASTIC CONTROL; SIMULATED TEMPERING.
AMS 2010 subject classifications: Primary 60J25; secondary 60J27, 60J60, 93E20.
The author is most grateful to Gareth Roberts for suggesting this problem. The author would like to thank David Hobson, Jon Warren and Sigurd Assing for stimulating discussions on this problem, and the participants at the EPSRC Workshop on Topics in Control, held at the University of Warwick in November 2011, for further feedback.

1. Introduction and problem motivation

1.1. Introduction. Suppose that $X^\mu$ is the diffusion on $[0,1]$ started at 0 and given by
$$dX^\mu_t = \sigma(X^\mu_t)\,dB_t + \mu(X^\mu_t)\,dt$$
on $(0,1)$, with instantaneous reflection at 0 and 1 (see [11] or [7] for details). Where there is no risk of confusion we omit the superscript $\mu$.

Formally, we define $T_x$ to be the first time that the diffusion reaches $x$, and then define $S = S(X)$, the shuttle time (between 0 and 1), by
$$S(X) \stackrel{def}{=} \inf\{t > T_1(X) : X_t = 0\}.$$
In this article, we consider the following problem (and several variants and generalisations):

Problem 1.1. Minimise the expected shuttle time $\mathbb E S$; i.e. find
$$\inf_\mu \mathbb E[S(X^\mu)],$$
where the infimum is taken over a suitably large class of drifts $\mu$, to be specified in more detail later.

Given the symmetry of the problem, it is tempting to hypothesise that the optimal choice of $\mu$ is 0. We shall soon see that, in general, this is false, although it is true when $\sigma \equiv 1$.

We actually want to minimise $S$ (and additive functionals of $X^\mu$ evaluated at $S$) in as general a way as possible, so we extend Problem 1.1 in the following ways:

Problem 1.2. Find
$$\inf_\mu \mathbb E\Bigl[\int_0^S f(X^\mu_t)\,dt\Bigr]$$
for suitable positive functions $f$; and

Problem 1.3. Find
$$\sup_\mu \mathbb E\Bigl[\exp\Bigl(-\int_0^S \alpha(X^\mu_t)\,dt\Bigr)\Bigr]$$
for suitable positive functions $\alpha$.

We shall also solve the corresponding discrete-statespace problems (in both discrete and continuous time).

Although we will prove more general versions, it seems appropriate to give a preliminary statement of results in this context.

Theorem 1.4. Suppose that $\sigma > 0$, that $f$ is a non-negative Borel-measurable function on $[0,1]$ and that, denoting Lebesgue measure by $\lambda$,
(1.1) $$\frac{\sqrt f}{\sigma} \in L^1([0,1],\lambda);$$
then
$$\inf_{\text{measurable }\mu} \mathbb E\Bigl[\int_0^S f(X^\mu_t)\,dt\Bigr] = \Bigl(\int_0^1 \sqrt{\frac{2f(u)}{\sigma^2(u)}}\,du\Bigr)^2.$$
If, in addition, $\sqrt{f/\sigma^2}$ is continuously differentiable and strictly positive on $(0,1)$, then the optimal drift is $\hat\mu$ given by
$$\hat\mu = -\frac{\sigma^2}{2}\Bigl(\ln\sqrt{\frac{f}{\sigma^2}}\Bigr)'.$$

Theorem 1.5. Suppose that $\sigma > 0$, that $\alpha$ is a non-negative Borel-measurable function on $[0,1]$ and that
(1.2) $$\frac{\sqrt\alpha}{\sigma} \in L^1([0,1],\lambda);$$
then
$$\sup_{\text{measurable }\mu} \mathbb E\Bigl[\exp\Bigl(-\int_0^S \alpha(X^\mu_t)\,dt\Bigr)\Bigr] = \cosh^{-2}\Bigl(\int_0^1 \sqrt{\frac{2\alpha(u)}{\sigma^2(u)}}\,du\Bigr).$$
If, in addition, $\sqrt{\alpha/\sigma^2}$ is continuously differentiable and strictly positive on $(0,1)$, then the optimal drift is $\hat\mu$ given by
$$\hat\mu = -\frac{\sigma^2}{2}\Bigl(\ln\sqrt{\frac{\alpha}{\sigma^2}}\Bigr)'.$$
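To make Problem 1.1 concrete before developing any theory, the following minimal Monte Carlo sketch (ours, not the paper's: the Euler-Maruyama step, the reflection-by-mirroring and the sample sizes are illustrative assumptions) estimates the expected shuttle time for a given drift. With $\sigma \equiv 1$ and $f \equiv 1$, Theorem 1.4 gives optimal value $(\int_0^1\sqrt2\,du)^2 = 2$, attained by zero drift, which the estimate below should roughly reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_shuttle_time(mu, sigma, dt=1e-3, n_paths=500):
    """Crude Euler-Maruyama estimate of E[S] for
    dX = mu(X) dt + sigma(X) dB on (0,1), reflected at 0 and 1,
    where S is the first return to 0 after first hitting 1.
    The time step dt controls the discretisation bias."""
    total = 0.0
    for _ in range(n_paths):
        x, t, reached_one = 0.0, 0.0, False
        while True:
            x += mu(x) * dt + sigma(x) * np.sqrt(dt) * rng.standard_normal()
            t += dt
            if x >= 1.0:                      # instantaneous reflection at 1
                reached_one, x = True, 2.0 - x
            if x <= 0.0:
                if reached_one:
                    break                     # shuttle completed
                x = -x                        # instantaneous reflection at 0
        total += t
    return total / n_paths

# sigma = 1, f = 1: zero drift is optimal and E[S] = 2 by Theorem 1.4
print(mean_shuttle_time(lambda x: 0.0, lambda x: 1.0))
```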
We will eventually solve the problems dynamically, i.e. we will solve the corresponding stochastic control problems. However, we shall need to be careful about what these are, as the problem is essentially non-Markovian. Normally in stochastic control problems one can choose the drift of a controlled diffusion at each time point, but this is not appropriate here. In this context, it is appropriate that the drift is 'frozen' once we have had to choose it for the first time. We shall formally model this in Section 4.

1.2. Problem motivation. The problem models one arising in simulated tempering, a form of Markov chain Monte Carlo (MCMC). Essentially, the level corresponds to the temperature in a "heat bath". The idea is that, when simulating a draw from a highly multimodal distribution, we use a reversible Markov process to move between low and high temperature states (and thus smear out the modes temporarily) so that the Markov chain can move around the statespace; then at low temperature we sample from the true distribution (see [2]).

2. Notation and some general formulae

We assume the usual Markovian setup, so that $X^\mu$ lives on a suitable filtered space $(\Omega,\mathcal F,(\mathcal F_t)_{t\ge0})$, with the usual family of probability measures $(\mathbb P_x)_{x\in[0,1]}$ corresponding to the possible initial values.

Definition 2.1. Denote by $s^\mu$ the standardised scale function for $X^\mu$ and by $m^\mu$ the corresponding speed measure. Since $X$ is regular and reflection is instantaneous we have
$$s^\mu(x) = \int_0^x \exp\Bigl(-2\int_0^u \frac{\mu(t)}{\sigma^2(t)}\,dt\Bigr)\,du,$$
$$m^\mu([0,x]) = m^\mu(x) \stackrel{def}{=} 2\int_0^x \frac{\exp\bigl(2\int_0^u \frac{\mu(t)}{\sigma^2(t)}\,dt\bigr)}{\sigma^2(u)}\,du = 2\int_0^x \frac{du}{\sigma^2(u)\,s^{\mu\prime}(u)}$$
(see [10]).

From now on, we shall consider the more general case where we only know that (dropping the $\mu$-dependence) $s$ and $m$ are absolutely continuous with respect to Lebesgue measure, so that, denoting the respective Radon-Nikodym derivatives by $s'$ and $m'$, we have
$$s'\,m' = \frac{2}{\sigma^2}\quad\text{Lebesgue-a.e.}$$
For such a pair we shall denote the corresponding diffusion by $X^s$. We underline that we are only considering regular diffusions with "martingale part" $\int\sigma\,dB$ or, more precisely, diffusions $X$ with scale functions $s$ such that $ds(X_t) = s'(X_t)\sigma(X_t)\,dB_t$, so that, for example, sticky points are excluded (see [5] for a description of driftless sticky Brownian motion and its construction; see also [4] for other problems arising in solving stochastic differential equations). Note that our assumptions do allow generalised drift: if $s$ is the difference between two convex functions (which we will not necessarily assume) then
(2.1) $$X^s_t = x + \int_0^t \sigma(X^s_u)\,dB_u - \frac12\int_{\mathbb R} L^a_t(X^s)\,\frac{s''(da)}{s'_-(a)},$$
where $s'_-$ denotes the left-hand derivative of $s$, $s''$ denotes the signed measure induced by $s'_-$, and $L^a_t(X)$ denotes the local time at $a$ accumulated by time $t$ by $X$ (see [10], Chapter VI, for details).

Definition 2.2. For each $y\in[0,1]$, we denote by $\phi_y$ the function
$$\phi_y : x \mapsto \mathbb E_x\Bigl[\int_0^{T_y} f(X_s)\,ds\Bigr],$$
where, as is usual, the subscript $x$ denotes the initial value of $X$ under the corresponding law $\mathbb P_x$.

Theorem 2.3. For $0\le x\le y$, $\phi_y$ is given by
(2.2) $$\phi_y(x) = \int_x^y\!\int_{u=0}^v f(u)\,m'(u)\,s'(v)\,du\,dv,$$
while for $0\le y\le x$, $\phi_y$ is given by
(2.3) $$\phi_y(x) = \int_y^x\!\int_{u=v}^1 f(u)\,m'(u)\,s'(v)\,du\,dv.$$
In particular,
(2.4) $$\mathbb E_0\Bigl[\int_0^S f(X^s_t)\,dt\Bigr] = \int_0^1\!\int_0^1 f(u)\,m'(u)\,s'(v)\,du\,dv.$$

Proof: This follows immediately from Proposition VII.3.10 of [10] on observing that, with instantaneous reflection at the boundaries, the speed measure is continuous.
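As a quick sanity check on (2.4) (our own illustrative script, not part of the paper), one can evaluate the double integral numerically for any given drift; note that the integrand factorises, so (2.4) equals $(\int_0^1 f\,m')(\int_0^1 s')$.

```python
import numpy as np

def expected_cost(mu, sigma, f, n=20001):
    """Evaluate (2.4), E_0[int_0^S f(X_t) dt] = (int_0^1 f m')(int_0^1 s'),
    for the reflected diffusion with drift mu and volatility sigma, using
    s'(x) = exp(-2 int_0^x mu/sigma^2) and m' = 2/(sigma^2 s')."""
    x = np.linspace(0.0, 1.0, n)
    g = 2.0 * mu(x) / sigma(x) ** 2
    cum = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) / 2.0 * np.diff(x))))
    s_prime = np.exp(-cum)
    m_prime = 2.0 / (sigma(x) ** 2 * s_prime)
    return np.trapz(f(x) * m_prime, x) * np.trapz(s_prime, x)

# zero drift, sigma = 1, f = 1: value 2, matching the Monte Carlo estimate above
one = lambda x: np.ones_like(x)
print(expected_cost(lambda x: np.zeros_like(x), one, one))
```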
We give similar formulae for the discounted problem:

Definition 2.4. We denote by $\psi_y$ the function
$$\psi_y : x \mapsto \mathbb E_x\Bigl[\exp\Bigl(-\int_0^{T_y}\alpha(X_s)\,ds\Bigr)\Bigr].$$

Theorem 2.5. (i) Either
(2.5) $$\int_0^1 \alpha(u)\,dm(u) < \infty,$$
or
$$\mathbb E_0\Bigl[\exp\Bigl(-\int_0^S\alpha(X_s)\,ds\Bigr)\Bigr] = 0,$$
in which case $\int_0^S\alpha(X_s)\,ds = \infty$ a.s.

(ii) Now suppose that (2.5) holds. For each $n$, denote by $I_n(x)$ the integral
$$I_n(x) \stackrel{def}{=} \int_{0\le u_1\le v_1\le u_2\le\dots\le v_n\le x} \alpha(u_1)\dots\alpha(u_n)\,dm(u_1)\dots dm(u_n)\,ds(v_1)\dots ds(v_n),$$
and by $\tilde I_n(x)$ the integral
$$\tilde I_n(x) \stackrel{def}{=} \int_{x\le v_1\le u_1\le v_2\le\dots\le u_n\le 1} \alpha(u_1)\dots\alpha(u_n)\,dm(u_1)\dots dm(u_n)\,ds(v_1)\dots ds(v_n),$$
with $I_0 = \tilde I_0 \equiv 1$. Now define $G$ and $\tilde G$ by
(2.6) $$G(x) \stackrel{def}{=} \sum_{n=0}^\infty I_n(x)\quad\text{and}\quad \tilde G(x) \stackrel{def}{=} \sum_{n=0}^\infty \tilde I_n(x).$$
Then the sums in (2.6) are convergent, and for $x\le y$
$$\psi_y(x) = \frac{G(x)}{G(y)},$$
while for $x\ge y$
$$\psi_y(x) = \frac{\tilde G(x)}{\tilde G(y)}.$$

Proof: (i) Note first that $s(1)<\infty$ follows from regularity. We consider the case where $x\le y$. Suppose first that $\alpha$ is bounded. It follows that $\psi_y>0$ for each $y$, since $\int_0^{T_y}\alpha(X_s)\,ds$ is a.s. finite for bounded $\alpha$. Now, setting
$$N_t = e^{-\int_0^{t\wedge T_y}\alpha(X_s)\,ds}\,\psi_y(X_{t\wedge T_y}),$$
it is clear that $N_t = \mathbb E\bigl[\exp\bigl(-\int_0^{T_y}\alpha(X_s)\,ds\bigr)\,\big|\,\mathcal F_{t\wedge T_y}\bigr]$, so that $N$ is a continuous martingale. Then, writing
$$\psi_y(X_{t\wedge T_y}) = \exp\Bigl(\int_0^{t\wedge T_y}\alpha(X_s)\,ds\Bigr)\,N_t,$$
it follows that
$$\psi_y(X_{t\wedge T_y}) - \int_0^{t\wedge T_y}\alpha\psi_y(X_u)\,du = \int_0^t \exp\Bigl(\int_0^{u\wedge T_y}\alpha(X_s)\,ds\Bigr)\,dN_u,$$
and hence is a martingale. Thus we conclude that $\psi_y$ is in the domain of $\mathcal A_y$, the extended or martingale generator for the stopped diffusion $X^{T_y}$, and
(2.7) $$\mathcal A_y\psi_y = \alpha\psi_y.$$
Since the speed and scale measures for $X$ and $X^{T_y}$ coincide on $[0,y]$, and using the fact that $\psi_y'(0)=0$, we conclude from Theorem VII.3.12 of [10] that
(2.8) $$\psi_y(x) = \psi_y(0) + \int_{v=0}^x\!\int_{u=0}^v s'(v)\,\alpha(u)\,\psi_y(u)\,m'(u)\,du\,dv\quad\text{for } x<y.$$
A similar argument establishes that
$$\psi_y(x) = \psi_y(1) + \int_{v=x}^1\!\int_{u=v}^1 s'(v)\,\alpha(u)\,\psi_y(u)\,m'(u)\,du\,dv\quad\text{for } x>y.$$
Now either $\min(\psi_1(0),\psi_0(1)) = 0$, in which case
$$\mathbb E_0\Bigl[\exp\Bigl(-\int_0^S\alpha(X_s)\,ds\Bigr)\Bigr] = \psi_1(0)\,\psi_0(1) = 0,$$
or
(2.9) $$\min(\psi_1(0),\psi_0(1)) = c > 0.$$
Suppose that (2.9) holds; then (since $\psi_1$ is increasing) it follows from (2.8) that
(2.10) $$\psi_1(1-) \ge c + \int_{v=0}^1\!\int_{u=0}^v s'(v)\,c\,\alpha(u)\,m'(u)\,du\,dv = c\Bigl(1 + \int_{u=0}^1\!\int_{v=u}^1 s'(v)\,\alpha(u)\,m'(u)\,dv\,du\Bigr) \ge c\Bigl(1 + \bigl(s(1)-s(\tfrac12)\bigr)\int_0^{1/2}\alpha(u)\,m'(u)\,du\Bigr).$$
Similarly, we deduce that
$$\psi_0(0+) \ge c\Bigl(1 + s(\tfrac12)\int_{1/2}^1\alpha(u)\,m'(u)\,du\Bigr).$$
Thus, if (2.5) fails, (2.9) cannot hold (since if (2.5) fails then at least one of $\int_0^{1/2}\alpha(u)m'(u)\,du$ and $\int_{1/2}^1\alpha(u)m'(u)\,du$ is infinite), and so we must have $\psi_1(0)\psi_0(1) = 0$. To deal with unbounded $\alpha$, take a monotone, positive sequence $\alpha_n$ increasing to $\alpha$ and take limits.

(ii) Suppose now that (2.5) holds. Setting
$$G(x) = \frac{\psi_1(x)}{\psi_1(0)},$$
we see that $G$ satisfies equation (2.7) with $G(0)=1$. Convergence of the series $\sum I_n$ and $\sum\tilde I_n$ follows from the bounds on $I_n$ and $\tilde I_n$ contained in the following lemma.

Lemma 2.6. Let
$$B(y) \stackrel{def}{=} \int_0^y \alpha(u)\,dm(u)\quad\text{and}\quad \tilde B(y) \stackrel{def}{=} \int_y^1 \alpha(u)\,dm(u);$$
then
(2.11) $$I_n(x) \le \frac{(s(x)B(x))^n}{(n!)^2}\quad\text{and}\quad \tilde I_n(x) \le \frac{(\tilde s(x)\tilde B(x))^n}{(n!)^2},$$
where $\tilde s(x) \stackrel{def}{=} s(1)-s(x)$.

We establish the first inequality in (2.11) by induction. The initial inequality is trivially satisfied. It is obvious from the definition that
$$I_{n+1}(x) = \int_{v=0}^x\!\int_{u=0}^v \alpha(u)\,I_n(u)\,dm(u)\,ds(v),$$
and so, assuming that $I_n(\cdot) \le (s(\cdot)B(\cdot))^n/(n!)^2$:
(2.12) $$I_{n+1}(x) \le \int_{v=0}^x\!\int_{u=0}^v \alpha(u)\,\frac{s(u)^nB(u)^n}{(n!)^2}\,dm(u)\,ds(v) \le \int_{v=0}^x\!\int_{u=0}^v \alpha(u)\,\frac{B(u)^n}{(n!)^2}\,dm(u)\,s(v)^n\,ds(v)\quad(\text{since } s \text{ is increasing})$$
$$= \int_{v=0}^x \frac{B(v)^{n+1}\,s(v)^n}{n!\,(n+1)!}\,ds(v) \le \frac{B(x)^{n+1}}{n!\,(n+1)!}\int_0^x s(v)^n\,ds(v)\quad(\text{since } B \text{ is increasing}) = \frac{(s(x)B(x))^{n+1}}{((n+1)!)^2},$$
establishing the inductive step. A similar argument establishes the second inequality.

Now, by iterating equation (2.7), we obtain
$$G(x) = \sum_{k=0}^{n-1} I_k(x) + \int_{0\le u_1\le v_1\le u_2\le\dots\le v_n\le x}\alpha(u_1)\dots\alpha(u_n)\,G(u_1)\,dm(u_1)\dots dm(u_n)\,ds(v_1)\dots ds(v_n).$$
Since $G$ is bounded by $1/\psi_1(0)$, we see that
$$0 \le G(x) - \sum_{k=0}^{n-1} I_k(x) \le \frac{1}{\psi_1(0)}\,I_n(x).$$
A similar argument establishes that
$$0 \le \tilde G(x) - \sum_{k=0}^{n-1}\tilde I_k(x) \le \frac{1}{\psi_0(1)}\,\tilde I_n(x),$$
and so we obtain (2.6) by taking limits as $n\to\infty$.
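The series (2.6) is also easy to evaluate numerically. The sketch below is ours (the trapezoidal discretisation, the truncation level and the function name G_series are illustrative assumptions); it implements the recursion $I_{n+1}(x) = \int_0^x\bigl(\int_0^v \alpha\,I_n\,dm\bigr)\,ds$ used in the proof. For constant $\alpha$, $\sigma\equiv1$ and natural scale ($s'\equiv1$, so $m'\equiv2$), one can check from (2.7) that $G(x) = \cosh(\sqrt{2\alpha}\,x)$, which gives a closed-form test of the code and of $\psi_y(x) = G(x)/G(y)$.

```python
import numpy as np

def G_series(alpha, s_prime, m_prime, x, n_terms=25):
    """Evaluate G(x) = sum_n I_n(x) of (2.6) via the recursion
    I_{n+1}(x) = int_0^x ( int_0^v alpha I_n dm ) ds."""
    def cumtrapz(y):
        return np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) / 2.0 * np.diff(x))))
    I = np.ones_like(x)          # I_0
    G = np.ones_like(x)
    for _ in range(n_terms):
        I = cumtrapz(cumtrapz(alpha * I * m_prime) * s_prime)
        G += I
    return G

x = np.linspace(0.0, 1.0, 2001)
a = 2.0
# natural scale (s' = 1), sigma = 1, so m' = 2: then G(x) = cosh(sqrt(2a) x)
G = G_series(a * np.ones_like(x), np.ones_like(x), 2.0 * np.ones_like(x), x)
print(G[-1], np.cosh(np.sqrt(2.0 * a)))   # psi_y(x) = G(x)/G(y) by Theorem 2.5
```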
3. Preliminary results

For now we will state and prove more general, but still non-dynamic, versions of Theorems 1.4 and 1.5. We define our constrained control set as follows:

Definition 3.1. Given a scale function $s_0\sim\lambda$ and $C$, a Borel subset of $[0,1]$, we define the constrained control set $\mathcal M^C_{s_0}$ by
(3.1) $$\mathcal M^C_{s_0} = \{\text{scale functions } s : ds|_C = ds_0|_C \text{ and } s\sim\lambda\}.$$
The corresponding controlled diffusion $X^s$ has scale function $s$ and speed measure $m$ given by $m' = \frac{2}{\sigma^2 s'}$.

Theorem 3.2. For any scale function $s\sim\lambda$, define the measure $I^s$ on $([0,1],\mathcal B([0,1]))$ by
$$I^s(D) \stackrel{def}{=} \int_D f(u)\,m(du) = \int_D \frac{2f(u)}{\sigma^2(u)\,s'(u)}\,du,$$
and the measure $J$ by
$$J(D) \stackrel{def}{=} \int_D \sqrt{\frac{2f(u)}{\sigma^2(u)}}\,du;$$
then, given a scale function $s_0$,
$$\inf_{s\in\mathcal M^C_{s_0}} \mathbb E_0\Bigl[\int_0^S f(X^s_t)\,dt\Bigr] = \Bigl(\sqrt{s_0(C)\,I^{s_0}(C)} + J(C^c)\Bigr)^2.$$
The optimal choice of $s$ is given by
$$s(dx) = \begin{cases} s_0(dx) & \text{on } C,\\[1mm] \sqrt{\dfrac{s_0(C)}{I^{s_0}(C)}}\,\sqrt{\dfrac{2f(x)}{\sigma^2(x)}}\,dx & \text{on } C^c.\end{cases}$$

Proof: Note first that, from Theorem 2.3,
(3.2) $$\mathbb E_0\Bigl[\int_0^S f(X^s_t)\,dt\Bigr] = \phi_1(0)+\phi_0(1) = \int_0^1\!\int_0^1 f(u)\,s(dv)\,m(du) = s_0(C)I^{s_0}(C) + \int_{C^c}\bigl[I^{s_0}(C)\,s(dv) + s_0(C)\,f(v)\,m(dv)\bigr] + \frac12\int_{C^c}\!\int_{C^c}\bigl[f(u)\,s(dv)\,m(du) + f(v)\,m(dv)\,s(du)\bigr],$$
where the factor $\frac12$ in the last term in (3.2) arises from the fact that we have symmetrised the integrand. Now, for $s\in\mathcal M^C_{s_0}$, we can rewrite (3.2) as
(3.3) $$\mathbb E_0\Bigl[\int_0^S f(X^s_t)\,dt\Bigr] = s_0(C)I^{s_0}(C) + \int_{C^c}\Bigl[I^{s_0}(C)\,s'(v) + s_0(C)\,\frac{2f(v)}{\sigma^2(v)\,s'(v)}\Bigr]\,dv + \frac12\int_{C^c}\!\int_{C^c}\Bigl[\frac{2f(u)}{\sigma^2(u)}\frac{s'(v)}{s'(u)} + \frac{2f(v)}{\sigma^2(v)}\frac{s'(u)}{s'(v)}\Bigr]\,du\,dv.$$
We now utilise the very elementary fact that, for $a,b\ge0$,
(3.4) $$\inf_{x>0}\Bigl[ax + \frac bx\Bigr] = 2\sqrt{ab},\quad\text{and if } a,b>0 \text{ this is attained at } x = \sqrt{b/a}.$$
Applying this to the third term on the right-hand side of (3.3), we see from (3.4) that it is bounded below by
$$\int_{C^c}\!\int_{C^c}\sqrt{\frac{4f(u)f(v)}{\sigma^2(u)\sigma^2(v)}}\,du\,dv = \Bigl(\int_{C^c}\sqrt{\frac{2f(u)}{\sigma^2(u)}}\,du\Bigr)^2 = J^2(C^c),$$
and this bound is attained when $s'(x)$ is a constant multiple of $\sqrt{2f(x)/\sigma^2(x)}$ a.e. on $C^c$. Turning to the second term in (3.3), we see from (3.4) that it is bounded below by
$$\int_{C^c} 2\sqrt{s_0(C)\,I^{s_0}(C)}\,\sqrt{\frac{2f(v)}{\sigma^2(v)}}\,dv = 2\sqrt{s_0(C)\,I^{s_0}(C)}\;J(C^c),$$
and this is attained when $s'(x) = \sqrt{\frac{s_0(C)}{I^{s_0}(C)}}\sqrt{\frac{2f(x)}{\sigma^2(x)}}$ a.e. on $C^c$. Thus we see that the infimum of the RHS of (3.3) is attained by setting $s'(x)$ equal to $\sqrt{\frac{s_0(C)}{I^{s_0}(C)}}\sqrt{\frac{2f(x)}{\sigma^2(x)}}$ on $C^c$, and this gives the stated value for the infimum.
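As a numerical illustration of Theorem 3.2 (again our own check, with an arbitrarily chosen non-constant $\sigma$), take $C=\emptyset$ and $f\equiv1$: the optimal value $J([0,1])^2 = (\int_0^1\sqrt2/\sigma)^2$ is strictly smaller than the zero-drift value $(\int_0^1 2/\sigma^2)(\int_0^1 1)$ unless $\sigma$ is constant, by the Cauchy-Schwarz inequality, confirming the remark after Problem 1.1 that $\mu = 0$ is in general suboptimal.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)
sigma = 1.0 + 0.8 * x            # an arbitrary non-constant volatility
f = np.ones_like(x)

# zero drift (s' = 1, m' = 2/sigma^2): cost from (2.4) = (int f m')(int s')
cost_zero_drift = np.trapz(2.0 * f / sigma**2, x) * np.trapz(np.ones_like(x), x)

# Theorem 3.2 with C empty: optimal cost J([0,1])^2, J(dx) = sqrt(2 f)/sigma dx
cost_optimal = np.trapz(np.sqrt(2.0 * f) / sigma, x) ** 2

print(cost_zero_drift, cost_optimal)   # ~1.111 vs ~1.080: optimal is smaller
```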
In the exponential case we only deal with constraints on $s$ on $[0,y]$.

Theorem 3.3. Assume that (2.5) holds and define
$$\tilde\sigma^2(x) \stackrel{def}{=} \frac{\sigma^2(x)}{\alpha(x)}.$$
(i) Let $G$ be as in equation (2.6), so that (at least formally)
$$\frac{\tilde\sigma^2 s'}{2}\Bigl(\frac{G'}{s'}\Bigr)' - G = \frac1\alpha\Bigl(\frac12\sigma^2G'' + \mu G' - \alpha G\Bigr) = 0,$$
and let $\tilde G^*$ satisfy the "adjoint equation"
(3.5) $$\frac12\,\frac{\bigl(\tilde\sigma^2 s'\,\tilde G^{*\prime}\bigr)'}{s'} - \tilde G^* = 0$$
with boundary conditions $\tilde G^*(0)=1$ and $\tilde G^{*\prime}(0)=0$, so that
$$\tilde G^*(x) = 1 + \int_{v=0}^x\!\int_{u=0}^v \frac{2\alpha(v)}{\sigma^2(v)\,s'(v)}\,\tilde G^*(u)\,s'(u)\,du\,dv = 1 + \int_{v=0}^x\!\int_{u=0}^v \tilde G^*(u)\,ds(u)\,\alpha(v)\,dm(v);$$
then $\tilde G^*$ is given by
(3.6) $$\tilde G^*(x) = \sum_{n=0}^\infty \tilde I^*_n(x),$$
where
(3.7) $$\tilde I^*_n(x) \stackrel{def}{=} \int_{0\le u_1\le v_1\le\dots\le v_n\le x}\alpha(v_1)\dots\alpha(v_n)\,ds(u_1)\dots ds(u_n)\,dm(v_1)\dots dm(v_n).$$

(ii) The optimal payoff for Problem 1.3 is given by
$$\sup_{s\in\mathcal M^{[0,y]}_{s_0}} \mathbb E_0\Bigl[\exp\Bigl(-\int_0^S\alpha(X^s_t)\,dt\Bigr)\Bigr] = \hat\psi(y),$$
where
$$\hat\psi(y) = \Bigl(\sqrt{G\tilde G^*}\,\cosh F(y) + \sqrt{\tfrac12\,\tilde\sigma^2\,G'\,\tilde G^{*\prime}}\,\sinh F(y)\Bigr)^{-2},$$
with
$$F(y) \stackrel{def}{=} \int_y^1 \frac{\sqrt2\,du}{\tilde\sigma(u)} = \int_y^1\sqrt{\frac{2\alpha(u)}{\sigma^2(u)}}\,du.$$
The payoff is attained by setting $\tilde\sigma(x)\,s'(x) = \sqrt{\dfrac{G(y)\,\tilde G^{*\prime}(y)}{G'(y)\,\tilde G^*(y)}}$ for all $x\ge y$ (if $y=0$, any constant value for $\tilde\sigma(x)s'(x)$ will do).

Proof: (i) This is proved in the same way as equation (2.6) in Theorem 2.5.
(ii) First we define
(3.8) $$I^*_n(x) \stackrel{def}{=} \int_{x\le u_1\le v_1\le\dots\le v_n\le 1}\alpha(v_1)\dots\alpha(v_n)\,ds(u_1)\dots ds(u_n)\,dm(v_1)\dots dm(v_n)$$
and
(3.9) $$G^*(x) \stackrel{def}{=} \sum_{n=0}^\infty I^*_n(x).$$
To prove (ii) we use the following representations (which the reader may easily verify):
(3.10) $$I_n(1) = \sum_{m=0}^n I_m(y)\,I^*_{n-m}(y) - \frac{\tilde\sigma^2(y)}2\sum_{m=1}^n I_m'(y)\,(I^*_{n-m})'(y)$$
and
(3.11) $$\tilde I_n(0) = \sum_{m=0}^n \tilde I_m(y)\,\tilde I^*_{n-m}(y) - \frac{\tilde\sigma^2(y)}2\sum_{m=1}^n \tilde I_m'(y)\,(\tilde I^*_{n-m})'(y).$$
It follows from these equations that
(3.12) $$\mathbb E_0\Bigl[\exp\Bigl(-\int_0^S\alpha(X_t)\,dt\Bigr)\Bigr] = \Bigl[\Bigl(G(y)G^*(y) - \tfrac{\tilde\sigma^2(y)}2\,G'(y)(G^*)'(y)\Bigr)\Bigl(\tilde G(y)\tilde G^*(y) - \tfrac{\tilde\sigma^2(y)}2\,\tilde G'(y)(\tilde G^*)'(y)\Bigr)\Bigr]^{-1}.$$
Now essentially the same argument as in the proof of Theorem 3.2 will work, as follows. Multiplying out the product inverted on the RHS of (3.12), we obtain the sum of the three terms:

(a) $\frac12\,G(y)\tilde G^*(y)\sum_{m\ge0,n\ge0}\bigl[I^*_n(y)\tilde I_m(y) + I^*_m(y)\tilde I_n(y)\bigr]$;

(b) $\frac{\tilde\sigma^4(y)}8\,G'(y)(\tilde G^*)'(y)\sum_{m\ge0,n\ge0}\bigl[(I^*_n)'(y)\,\tilde I_m'(y) + (I^*_m)'(y)\,\tilde I_n'(y)\bigr]$; and

(c) $-\frac{\tilde\sigma^2(y)}2\sum_{m\ge1,n\ge0}\bigl[G(y)(\tilde G^*)'(y)\,I^*_n(y)\,\tilde I_m'(y) + G'(y)\tilde G^*(y)\,(I^*_n)'(y)\,\tilde I_m(y)\bigr]$,

where, in the first two terms, we have symmetrised the sums. Using (2.6), (c) becomes
(3.13) $$\sum_{m\ge1,n\ge0}\int_{D_{m,n}(y)}\Bigl[G(y)(\tilde G^*)'(y)\,\frac{t'(u_1)\dots t'(u_n)\,t'(z_1)\dots t'(z_{m-1})}{t'(v_1)\dots t'(v_n)\,t'(w_1)\dots t'(w_m)} + G'(y)\tilde G^*(y)\,\frac{t'(v_1)\dots t'(v_n)\,t'(w_1)\dots t'(w_m)}{t'(u_1)\dots t'(u_n)\,t'(z_1)\dots t'(z_{m-1})}\Bigr]\,d\tilde\lambda(u,v,w,z),$$
where
$$D_{m,n}(x) = \{(u,v,w,z)\in\mathbb R^n\times\mathbb R^n\times\mathbb R^m\times\mathbb R^{m-1} : x\le u_1\le v_1\le\dots\le v_n\le 1 \text{ and } x\le w_1\le z_1\le\dots\le w_m\le 1\},$$
$t$ is the measure with Radon-Nikodym derivative $t' = \tilde\sigma s'$, and $\tilde\lambda$ denotes the measure with Radon-Nikodym derivative $1/\tilde\sigma$. Clearly each term in the sum in (3.13) is minimised by taking $t'$ constant and equal to $\sqrt{\frac{G(y)\,\tilde G^{*\prime}(y)}{G'(y)\,\tilde G^*(y)}}$ a.e. on $[y,1]$. The first two terms, (a) and (b), are each minimised by taking $t'$ constant a.e. on $[y,1]$. Substituting this value for $t'$ back in, we obtain the result.

Remark 3.4. We see that, in general, in both Theorems 3.2 and 3.3 the optimal scale function has a discontinuous derivative. In the case where $C = [0,y)$ there is a discontinuity in $s'$ at $y$. This will correspond to partial reflection at $y$ (as in skew Brownian motion; see [10] or [7]) and will give rise to a singular drift, at least at $y$.

Remark 3.5. We may easily extend Theorems 3.2 and 3.3 to the cases where $f$ or $\alpha$ vanishes on part of $[0,1]$. In the case where $N \stackrel{def}{=} \{x : f(x) = 0\}$ is nonempty, observe first that the cost functional does not depend on the amount of time the diffusion spends in $N$, so that every value for $ds|_N$ which leaves the diffusion recurrent will give the same expected cost. If $\lambda(N) = 1$ then the problem is trivial; otherwise define the revised statespace $S = [0,1-\lambda(N)]$ and solve the problem on this revised interval with the cost function $\tilde f(x) \stackrel{def}{=} f(g^{-1}(x))$, where $g : t\mapsto\lambda([0,t]\cap N^c)$ and $g^{-1} : x\mapsto\inf\{t : g(t) = x\}$. This gives us a diffusion and scale function $s_S$ which minimises the cost functional on $S$.
Then we can extend this to a solution of the original problem by taking $ds = ds_0\,\mathbf 1_N + d\tilde s\,\mathbf 1_{N^c}$, where $ds_0$ is any finite measure equivalent to $\lambda$ and $d\tilde s$ is the Lebesgue-Stieltjes measure given by $\tilde s([0,t]) = s_S(\lambda([0,t]\cap N^c)) = s_S(g(t))$. An exactly analogous method will work in the discounted problem.

4. The dynamic control problems

We now turn to the dynamic versions of Problems 1.2 and 1.3. A moment's consideration shows that it is not appropriate to model the dynamic version of the problem by allowing the drift to be chosen adaptively. If we were permitted to do this, then we could choose a very large positive drift until the diffusion reaches 1 and then a very large negative drift to force it back down to 0. The corresponding optimal payoffs for Problems 1.2 and 1.3 would be 0 and 1 respectively.

We choose, instead, to consider the problem where the drift may be chosen dynamically at each level, but only when the diffusion first reaches that level. Formally, reverting to the finite-drift setup, we are allowed to choose controls from the collection $\mathcal M$ of adapted processes $\mu$ with the constraint that
(4.1) $$\mu_t = \mu_{T_{X_t}},$$
or, continuing the generalised setup, to choose scale measures dynamically, in such a way that $s'(X_t)$ is adapted. Although these are very non-standard control problems, we are able to solve them, mainly because we can exhibit an explicit solution: to wit, following the same control as in the "static" case.

Remark 4.1. Note that this last statement would not be true if our constraint were not on a set of the form $[0,y]$. To see this, consider the case where our constraint is on the set $[y,1]$. If the controlled diffusion starts at $x>0$ then there is a positive probability that it will not reach zero before hitting 1, in which case the drift will not have been chosen at levels below $I_{T_1}$, the infimum of the path on $[0,T_1]$. Consequently, on the way down we can set the drift to be very large and negative below $I_{T_1}$. Thus the optimal dynamic control will achieve a strictly lower payoff than the optimal static one in this case. We do not pursue this problem further here but intend to do so in a sequel.

We need to define the set of admissible controls quite carefully, and two approaches suggest themselves: the first is to restrict controls to choosing a drift with the property (4.1), whilst the second is to allow suitable random scale functions. Both approaches have their drawbacks: in the first case we know from Remark 3.4 to expect that, in general, the optimal control will not be in this class, whilst, in the second, it is not clear how large a class of random scale functions will be appropriate. In the interests of ensuring that an optimal control exists, we adopt the second approach.

From now on, we fix the Brownian motion $B$ on the filtered probability space $(\Omega,\mathcal F,(\mathcal F_t)_{t\ge0},\mathbb P)$.

Definition 4.2. By an equivalent random scale function we simply mean a random, finite Borel measure on $[0,1]$ a.s. equivalent to Lebesgue measure, and we define
(4.2) $$\mathcal M = \Bigl\{\text{equivalent random scale functions } s : \text{there exists a martingale } Y^s \text{ with } Y^s_t = \int_0^t (s'\circ s^{-1})\,(\sigma\circ s^{-1})(Y^s_u)\,dB_u\Bigr\}.$$
We define the corresponding controlled process $X^s$ by $X^s_t = s^{-1}(Y^s_t)$. For any $s_0\in\mathcal M$ we then define the constrained control set $\mathcal M^y_{s_0}$ by
(4.3) $$\mathcal M^y_{s_0} = \{s\in\mathcal M : ds|_{[0,y)} = ds_0|_{[0,y)}\}.$$

Remark 4.3. Note that $\mathcal M$ contains all deterministic equivalent scale functions.
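To see what the freezing constraint (4.1) means operationally, here is a toy simulation (entirely ours, and not the paper's construction: levels are bucketed to a grid of width h, $\sigma\equiv1$ is assumed, and choose_mu is an arbitrary hypothetical callback). The drift at a level is obtained from the controller exactly once, at the first visit to that level, and is reused on every subsequent visit.

```python
import numpy as np

rng = np.random.default_rng(7)

def shuttle_with_frozen_drift(choose_mu, h=1e-3, dt=1e-4):
    """Illustration of constraint (4.1) with sigma = 1: the drift at each
    level is fixed at the moment that level is first visited. Levels are
    discretised to a grid of width h; choose_mu(level) is called once per
    level and its value is frozen thereafter."""
    mu = {}                                   # level bucket -> frozen drift
    x, t, reached_one = 0.0, 0.0, False
    while True:
        k = int(x / h)                        # current level bucket
        if k not in mu:
            mu[k] = choose_mu(k * h)          # drift chosen on first visit only
        x += mu[k] * dt + np.sqrt(dt) * rng.standard_normal()
        t += dt
        if x >= 1.0:
            reached_one, x = True, 2.0 - x
        if x <= 0.0:
            if reached_one:
                return t                      # shuttle time S
            x = -x

print(shuttle_with_frozen_drift(lambda level: 0.0))
```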
An example of a random element of $\mathcal M$ when $\sigma\equiv1$ is $s$, given by
$$ds|_{[0,\frac12)} = d\lambda;\qquad ds(x)|_{[\frac12,1]} = e^{-2(x-\frac12)}\,d\lambda\,\mathbf 1_{(T_{1/2}(B)<1)} + d\lambda\,\mathbf 1_{(T_{1/2}(B)\ge1)},$$
corresponding to $X^s$ having drift 1 above level $\frac12$ if and only if it reaches that level before time 1. In general, the martingale constraint is both about existence of a solution to the corresponding stochastic differential equation and about imposing a suitable progressive-measurability condition on the random scale function.

Theorem 4.4. For each $s\in\mathcal M$, let $M^s_t$ denote the running maximum of the controlled process $X^s$. Then, for each $s_0\in\mathcal M$, the optimal payoff (or Bellman) process $V^{s_0}$, defined by
$$V^{s_0}_t \stackrel{def}{=} \operatorname*{essinf}_{s\in\mathcal M^{M^{s_0}_t}_{s_0}} \mathbb E\Bigl[\int_0^S f(X^s_u)\,du\,\Big|\,\mathcal F_t\Bigr],$$
is given by
(4.4) $$V^{s_0}_t = v_t \stackrel{def}{=} \begin{cases}\displaystyle\int_0^t f(X^{s_0}_u)\,du + \Bigl(\sqrt{s_0(M^{s_0}_t)\,I^{s_0}(M^{s_0}_t)} + J(M^{s_0}_t)\Bigr)^2 - \phi_{X^{s_0}_t}(0) & \text{for } M^{s_0}_t < 1,\\[2mm] \displaystyle\int_0^{t\wedge S} f(X^{s_0}_u)\,du + \phi_0(X^{s_0}_{t\wedge S}) & \text{for } M^{s_0}_t = 1\end{cases}$$
(where $\phi$ is formally given by equations (2.2) and (2.3) with $s=s_0$, and we abbreviate $s_0(y) = s_0([0,y])$, $I^{s_0}(y) = I^{s_0}([0,y])$ and $J(y) = J([y,1])$), and the optimal control is to take
(4.5) $$s'(x) = \sqrt{\frac{s_0(M^{s_0}_t)}{I^{s_0}(M^{s_0}_t)}}\;\frac{\sqrt{2f(x)}}{\sigma(x)}\qquad\text{for } x\ge M^{s_0}_t.$$

Proof: Consider the candidate Bellman process $v_t$. We use the facts that
(4.6) $$N_t \stackrel{def}{=} \int_0^{t\wedge T_1} f(X^{s_0}_u)\,du - \phi_{X^{s_0}_{t\wedge T_1}}(0) = \mathbb E\Bigl[\int_0^{T_1} f(X^{s_0}_u)\,du\,\Big|\,\mathcal F_{t\wedge T_1}\Bigr] - \phi_1(0)$$
is a martingale; that
(4.7) $$N'_t \stackrel{def}{=} \phi_0(X^{s_0}_t) + \int_0^t f(X^{s_0}_u)\,du$$
is equal to $\mathbb E\bigl[\int_0^S f(X^{s_0}_u)\,du\,\big|\,\mathcal F_t\bigr]$ on the stochastic interval $[[T_1,S]]$, and hence is a martingale on that interval; that $M^{s_0}$ is a continuous, increasing process; and that $\phi_1(0)+\phi_0(1) = s_0(1)\,I^{s_0}(1)$ (so that $v$ is continuous at $T_1(X^{s_0})$). Then
$$dv_t = 2\Bigl(\sqrt{s_0 I^{s_0}} + J\Bigr)(M^{s_0}_t)\times\Bigl[\frac{\bigl(s_0'\,I^{s_0} + s_0\,I^{s_0\prime}\bigr)(M^{s_0}_t)}{2\sqrt{s_0 I^{s_0}}\,(M^{s_0}_t)} - \sqrt{\frac{2f(M^{s_0}_t)}{\sigma^2(M^{s_0}_t)}}\Bigr]\,dM^{s_0}_t + dN_t\,\mathbf 1_{(M^{s_0}_t<1)} + dN'_t\,\mathbf 1_{(M^{s_0}_t=1)},$$
where $s_0'$ denotes the density of $ds_0$ at the running maximum and $I^{s_0\prime} = 2f/(\sigma^2 s_0')$. Now, since $s_0$, $I^{s_0}$ and $J$ are non-negative, it follows from (3.4) that $dv_t \ge d\bar N_t$, where $d\bar N_t = dN_t\mathbf 1_{(M^{s_0}_t<1)} + dN'_t\mathbf 1_{(M^{s_0}_t=1)}$, with equality if
(4.8) $$s_0'(M^{s_0}_t) = \sqrt{\frac{2\,s_0(M^{s_0}_t)\,f(M^{s_0}_t)}{I^{s_0}(M^{s_0}_t)\,\sigma^2(M^{s_0}_t)}}.$$
Then the usual submartingale argument (see, for example, [9], Chapter 11), together with the fact that $v$ is bounded (by assumption (1.1)), gives us (4.4). It is easy to check that $s$ given by (4.8) is in $\mathcal M^{M^{s_0}_t}_{s_0}$. The fact that the optimal choice of $s$ satisfies (4.5) follows on substituting $s'(x) = \sqrt{2s_0(y)f(x)/(I^{s_0}(y)\sigma^2(x))}$ in the formulae for $s$ and $I^s$ and observing that the ratio $s(x)/I^s(x)$ is then constant on $[y,1]$.

Theorem 4.5. The Bellman process for Problem 1.3 is given by
$$V^{s_0}_t \stackrel{def}{=} \operatorname*{esssup}_{s\in\mathcal M^{M^{s_0}_t}_{s_0}} \mathbb E\Bigl[\exp\Bigl(-\int_0^S\alpha(X^s_u)\,du\Bigr)\Big|\,\mathcal F_t\Bigr] = v_t \stackrel{def}{=} e^{-\int_0^t\alpha(X^{s_0}_u)\,du}\,\psi(X^{s_0}_t, M^{s_0}_t),$$
where
$$\psi(x,y) = \begin{cases} G(x)\,\hat\psi(y) & \text{if } y<1,\\ \tilde G(x) & \text{if } y=1,\end{cases}$$
and
$$\hat\psi(y) = \Bigl(\sqrt{G\tilde G^*}\,\cosh F(y) + \sqrt{\tfrac12\,\tilde\sigma^2\,G'\,\tilde G^{*\prime}}\,\sinh F(y)\Bigr)^{-2}$$
(as in Theorem 3.3(ii)), with
$$F(y) \stackrel{def}{=} \int_y^1 \frac{\sqrt2\,du}{\tilde\sigma(u)}.$$
The payoff is attained by setting
(4.9) $$\tilde\sigma(x)\,s'(x) = \sqrt{\frac{G\,\tilde G^{*\prime}}{G'\,\tilde G^*}}\,(M^{s_0}_t)\qquad\text{for all } x\ge M^{s_0}_t.$$

Proof: The proof is very similar to that of Theorem 4.4. Note that $\psi$ is continuous at the point $(1,1)$.
Thus, for a suitable bounded martingale $n$ (and writing $\psi_y$ for $\partial\psi/\partial y$),
$$dv_t = \exp\Bigl(-\int_0^t\alpha(X^{s_0}_s)\,ds\Bigr)\,\psi_y(X^{s_0}_t, M^{s_0}_t)\,dM^{s_0}_t\,\mathbf 1_{(M^{s_0}_t<1)} + dn_t = \exp\Bigl(-\int_0^t\alpha(X^{s_0}_s)\,ds\Bigr)\,G(X^{s_0}_t)\,\hat\psi'(M^{s_0}_t)\,\mathbf 1_{(M^{s_0}_t<1)}\,dM^{s_0}_t + dn_t$$
$$= -2\exp\Bigl(-\int_0^t\alpha(X^{s_0}_s)\,ds\Bigr)\,G(X^{s_0}_t)\Bigl(\sqrt{G\tilde G^*}\cosh F(M^{s_0}_t) + \sqrt{\tfrac12\tilde\sigma^2G'\tilde G^{*\prime}}\sinh F(M^{s_0}_t)\Bigr)^{-3}\times\Bigl\{\Bigl[\bigl(\sqrt{G\tilde G^*}\bigr)' - \sqrt{G'\tilde G^{*\prime}}\Bigr]\cosh F(M^{s_0}_t) + \Bigl[\bigl(\sqrt{\tfrac12\tilde\sigma^2G'\tilde G^{*\prime}}\bigr)' - \frac{\sqrt{2G\tilde G^*}}{\tilde\sigma}\Bigr]\sinh F(M^{s_0}_t)\Bigr\}\,dM^{s_0}_t + dn_t.$$
Now
$$\bigl(\sqrt{G\tilde G^*}\bigr)' = \frac12\sqrt{\frac{G}{\tilde G^*}}\,\tilde G^{*\prime} + \frac12\sqrt{\frac{\tilde G^*}{G}}\,G' \ge \sqrt{G'\tilde G^{*\prime}},$$
using (3.4), with equality attained when
(4.10) $$G'\tilde G^* = G\tilde G^{*\prime}.$$
Similarly, defining $m_\alpha$ by setting $dm_\alpha = \alpha\,dm$, so that $\frac{dG}{ds} = \frac{G'}{s'}$, $\frac{d\tilde G^*}{dm_\alpha} = \frac{\tilde\sigma^2 s'\tilde G^{*\prime}}2$ and $\tfrac12\tilde\sigma^2G'\tilde G^{*\prime} = \frac{dG}{ds}\,\frac{d\tilde G^*}{dm_\alpha}$,
$$\Bigl(\sqrt{\tfrac12\tilde\sigma^2G'\tilde G^{*\prime}}\Bigr)' = \Bigl(\sqrt{\frac{dG}{ds}\,\frac{d\tilde G^*}{dm_\alpha}}\Bigr)' = \frac12\sqrt{\frac{d\tilde G^*/dm_\alpha}{dG/ds}}\,\Bigl(\frac{dG}{ds}\Bigr)' + \frac12\sqrt{\frac{dG/ds}{d\tilde G^*/dm_\alpha}}\,\Bigl(\frac{d\tilde G^*}{dm_\alpha}\Bigr)' = \frac{G}{\tilde\sigma^2 s'}\sqrt{\frac{d\tilde G^*/dm_\alpha}{dG/ds}} + \frac{s'\tilde G^*}{2}\sqrt{\frac{dG/ds}{d\tilde G^*/dm_\alpha}} \ge \frac{\sqrt{2G\tilde G^*}}{\tilde\sigma},$$
using (3.4) again, with equality when
(4.11) $$s' = \frac{1}{\tilde\sigma}\sqrt{\frac{2G\,\bigl(d\tilde G^*/dm_\alpha\bigr)}{\tilde G^*\,\bigl(dG/ds\bigr)}}.$$
Now we can easily see (by writing $\frac{dG}{ds} = \frac{G'}{s'}$ and $\frac{d\tilde G^*}{dm_\alpha} = \frac{\tilde\sigma^2 s'\tilde G^{*\prime}}2$) that (4.11) implies (4.10), so the standard supermartingale argument establishes that $V_t = v_t$. That the optimal choice of $s'$ is as given in (4.9) follows on observing that, with this choice of $s'$, $\bigl(\tilde\sigma(G'\tilde G^* - G\tilde G^{*\prime})\bigr)'(x) = 0$ for $x\ge y$, and $G'(y)\tilde G^*(y) - G(y)\tilde G^{*\prime}(y) = 0$.

5. The discrete statespace case

5.1. Additive functional case. Suppose that $X$ is a discrete-time birth and death process on $S = \{0,\dots,N\}$, with transition matrix $P$ given by $P_{n,n+1} = p_n$ and $1-p_n = q_n = P_{n,n-1}$, with $p_N = q_0 = 0$. We define
$$w_n \stackrel{def}{=} \frac{q_n}{p_n}\quad\text{and}\quad W_n = \prod_{k=1}^n w_k,$$
with the usual convention that the empty product is 1. Note that $s$, given by
$$s(n) \stackrel{def}{=} \sum_{k=0}^{n-1} W_k,$$
is the discrete scale function, in that $s(0)=0$, $s$ is strictly increasing on $S$ and $s(X_t)$ is a martingale.

Remark 5.1. Note that when we choose $p_n$ or $w_n$ we are implicitly specifying $s(n+1)-s(n)$, so we shall denote this quantity by $\Delta s(n)$, and we shall denote by $ds$ the Lebesgue-Stieltjes measure on $\check S \stackrel{def}{=} \{0,1,\dots,N-1\}$ given by $ds(x) = \Delta s(x)$.

Let $f$ be a positive function on $S$ and define
$$\tilde f(n) = \frac12\bigl(f(n)+f(n+1)\bigr)\quad\text{for } 0\le n\le N-1.$$

Theorem 5.2. If we define
$$\phi_y(x) = \mathbb E_x\Bigl[\sum_{t=0}^{T_y-1} f(X^{s_0}_t)\Bigr],$$
then for $x\le y$
(5.1) $$\phi_y(x) = f(x)+\dots+f(y-1) + \sum_{v=x}^{y-1}\sum_{u=0}^{v-1}\frac{2\tilde f(u)\,W_v}{W_u},$$
while for $y\le x$
(5.2) $$\phi_y(x) = f(y+1)+\dots+f(x) + \sum_{v=y}^{x-1}\sum_{u=v+1}^{N-1}\frac{2\tilde f(u)\,W_v}{W_u}.$$

Proof: It is relatively easy to check that $\phi_y$ satisfies the linear recurrence
$$\phi(x) = p_x\,\phi(x+1) + q_x\,\phi(x-1) + f(x)$$
with the right boundary conditions.

It follows from this that optimal payoffs are given by essentially the same formulae as in the continuous case. Thus we now define the constrained control set $\mathcal M^C_{s_0}$ by
(5.3) $$\mathcal M^C_{s_0} = \{\text{scale functions } s : ds|_C = ds_0|_C\}.$$

Remark 5.3. By convention we shall always assume that $0\in C$, since we cannot control $W_0$ and hence cannot control $\Delta s(0)$.

Theorem 5.4. For any scale function $s$, define the measure $I^s$ by
$$I^s(D) \stackrel{def}{=} \sum_{k\in D}\frac{\tilde f(k)}{W_k}\quad\text{for } D\subseteq\check S,$$
and the measure $J$ by
$$J(D) \stackrel{def}{=} \sum_{k\in D}\sqrt{2\tilde f(k)}\quad\text{for } D\subseteq\check S;$$
then, given a scale function $s_0$,
$$\inf_{s\in\mathcal M^C_{s_0}}\mathbb E_0\Bigl[\sum_{t=0}^{S-1} f(X^s_t)\Bigr] = \Bigl(\sqrt{s_0(C)\,I^{s_0}(C)} + J(C^c)\Bigr)^2.$$

Remark 5.5. Note that all complements are taken with respect to $\check S$.
The optimal choice of $s$ is given by
$$\Delta s(x) = W_x = \begin{cases} W^0_x & \text{on } C,\\[1mm] \sqrt{\dfrac{s_0(C)}{I^{s_0}(C)}}\;\sqrt{2\tilde f(x)} & \text{on } C^c,\end{cases}$$
where $W^0$ denotes the $W$-sequence of $s_0$.

The dynamic problem translates in exactly the same way: we define the constrained control set
$$\mathcal M^y_{s_0} = \{s : ds|_{\{0,\dots,y-1\}} = ds_0|_{\{0,\dots,y-1\}}\},\quad y\ge1;$$
then we have the following:

Theorem 5.6. For each $s\in\mathcal M$, let $M^s_t$ denote the running maximum of the controlled process $X^s$. Then, for each $s_0\in\mathcal M$, the optimal payoff (or Bellman) process $V^{s_0}$, defined by
$$V^{s_0}_t \stackrel{def}{=} \operatorname*{essinf}_{s\in\mathcal M^{M^{s_0}_t}_{s_0}}\mathbb E\Bigl[\sum_{u=0}^{S-1} f(X^s_u)\,\Big|\,\mathcal F_t\Bigr],$$
is given by
(5.4) $$V^{s_0}_t = v_t \stackrel{def}{=} \begin{cases}\displaystyle\sum_{u=0}^{t-1} f(X^{s_0}_u) + \Bigl(\sqrt{s_0(M^{s_0}_t)\,I^{s_0}(M^{s_0}_t)} + J(M^{s_0}_t)\Bigr)^2 - \phi_{X^{s_0}_t}(0) & \text{for } M^{s_0}_t < N,\\[2mm] \displaystyle\sum_{u=0}^{(t\wedge S)-1} f(X^{s_0}_u) + \phi_0(X^{s_0}_{t\wedge S}) & \text{for } M^{s_0}_t = N\end{cases}$$
(where $\phi$ is formally given by equations (5.1) and (5.2) with $s=s_0$), and the optimal control is to take
(5.5) $$W_x = \sqrt{\frac{s_0(M^{s_0}_t)}{I^{s_0}(M^{s_0}_t)}}\;\sqrt{2\tilde f(x)}\qquad\text{for } x\ge M^{s_0}_t.$$

5.2. The discounted problem. Suppose that for each $x\in S$, $0\le r_x\le 1$, and define
$$\sigma_i^2 \stackrel{def}{=} (1 - r_{i-1}r_i)^{-1},\ \text{with } r_{-1} \text{ taken to be } 1,\quad\text{and}\quad \sigma(i_1,i_2,\dots,i_l) \stackrel{def}{=} \prod_{m=1}^l\sigma_{i_m}.$$
Now set
$$A_k(x) = \{(u,v) : 0\le u_1 < v_1 < \dots < v_k < x\},\qquad \tilde A_k(x) = \{(u,v) : x\le v_1 < u_1 < \dots < u_k < N\},$$
and $W^\sigma_m = \sigma_m W_m$, where $W_m$ is as before. Note that $A_k(x)$ and $\tilde A_k(x)$ will be empty for large values of $k$.

Theorem 5.7. For $x\le y$,
(5.6) $$\mathbb E_x\Bigl[\prod_{t=0}^{T_y-1} r_{X^{s_0}_t}\Bigr] = r_x\dots r_{y-1}\,G(x)/G(y),$$
where
$$G(x) \stackrel{def}{=} 1 + \sum_{k=1}^\infty\sum_{(u,v)\in A_k(x)}\frac{1}{\sigma(u,v)}\prod_{m=1}^k\frac{W^\sigma_{v_m}}{W^\sigma_{u_m}},$$
while for $x\ge y$,
(5.7) $$\mathbb E_x\Bigl[\prod_{t=0}^{T_y-1} r_{X^{s_0}_t}\Bigr] = r_{y+1}\dots r_x\,\tilde G(x)/\tilde G(y),$$
where
$$\tilde G(x) \stackrel{def}{=} 1 + \sum_{k=1}^\infty\sum_{(u,v)\in\tilde A_k(x)}\frac{1}{\sigma(u,v)}\prod_{m=1}^k\frac{W^\sigma_{v_m}}{W^\sigma_{u_m}}.$$

Proof: Define $d_x = \mathbb E_x\bigl[\prod_{t=0}^{T_{x+1}-1} r_{X^{s_0}_t}\bigr]$; then
$$d_x = r_x\bigl(p_x + q_x\,d_{x-1}\,d_x\bigr).$$
Setting $r_x\,t_x/t_{x+1} = d_x$, we see that
$$t_{x+1} = (1+w_x)\,t_x - w_x\,r_x\,r_{x-1}\,t_{x-1},$$
or
$$t_{x+1} - t_x - w_x(t_x - t_{x-1}) = w_x(1 - r_{x-1}r_x)\,t_{x-1}.$$
Substituting $t_x = G(x)$, it is easy to check that this is satisfied, and the boundary conditions then give equation (5.6). The proof of equation (5.7) is essentially the same.

Now, with this choice of $\sigma$, we get the same results as before:

Theorem 5.8. Suppose $G$ and $\tilde G$ are as defined in Theorem 5.7, and set
$$G^*(x) \stackrel{def}{=} 1 + \sum_{k=1}^\infty\sum_{(u,v)\in A^*_k(x)}\frac{1}{\sigma(u,v)}\prod_{m=1}^k\frac{W^\sigma_{v_m}}{W^\sigma_{u_m}}$$
and
$$\tilde G^*(x) \stackrel{def}{=} 1 + \sum_{k=1}^\infty\sum_{(u,v)\in\tilde A^*_k(x)}\frac{1}{\sigma(u,v)}\prod_{m=1}^k\frac{W^\sigma_{v_m}}{W^\sigma_{u_m}},$$
where
$$A^*_k(x) = \{(u,v) : x\le u_1 < v_1 < \dots < v_k < N\}\quad\text{and}\quad \tilde A^*_k(x) = \{(u,v) : 0\le v_1 < u_1 < \dots < u_k < x\}.$$
Then the optimal payoff for the discrete version of Problem 1.3 is given by
$$\sup_{s\in\mathcal M^{\{0,\dots,y\}}_{s_0}}\mathbb E_0\Bigl[\prod_{t=0}^{S-1} r_{X^{s_0}_t}\Bigr] = \hat\psi(y) \stackrel{def}{=} \Bigl(\sqrt{G(y)\,\tilde G^*(y)}\,\sum_n F_{2n}(y) + \sqrt{\Delta G(y)\,\Delta\tilde G^*(y)}\,\sum_n F_{2n+1}(y)\Bigr)^{-2},$$
where
$$F_k(y) \stackrel{def}{=} \sum_{y\le x_1<\dots<x_k<N}\frac{1}{\sigma(x)},$$
$$\Delta G(y) \stackrel{def}{=} \sum_{n=1}^\infty\;\sum_{0\le u_1<v_1<\dots<v_{n-1}<u_n<y}\frac{1}{\sigma(u,v)}\,\frac{\prod_{m=1}^{n-1}W^\sigma_{v_m}}{\prod_{m=1}^n W^\sigma_{u_m}}$$
and
$$\Delta\tilde G^*(y) \stackrel{def}{=} \sum_{n=1}^\infty\;\sum_{0\le u_1<v_1<\dots<v_{n-1}<u_n<y}\frac{1}{\sigma(u,v)}\,\frac{\prod_{m=1}^n W^\sigma_{u_m}}{\prod_{m=1}^{n-1}W^\sigma_{v_m}}.$$

Theorem 5.9. The Bellman process for the dynamic version of Problem 1.3 is given by
$$V^s_t = \begin{cases}\displaystyle\prod_{u=0}^{t-1} r_{X^s_u}\;G(X^s_t)\,\hat\psi(M^s_t) & \text{if } M^s_t < N,\\[2mm] \displaystyle\prod_{u=0}^{t-1} r_{X^s_u}\;\tilde G(X^s_t) & \text{if } M^s_t = N.\end{cases}$$
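The discrete statements are directly implementable. The sketch below is ours (it fixes $C=\{0\}$, since by Remark 5.3 $\Delta s(0) = W_0 = 1$ cannot be controlled, and the function name is hypothetical); it computes the optimal up-probabilities of Theorem 5.4 from $W_x = \sqrt{s_0(C)/I^{s_0}(C)}\,\sqrt{2\tilde f(x)}$ on $C^c$, recovering $w_n = W_n/W_{n-1}$ and $p_n = 1/(1+w_n)$.

```python
import numpy as np

def optimal_up_probabilities(f):
    """Theorem 5.4 with C = {0} (Remark 5.3: W_0 = Delta s(0) = 1 is fixed).
    f is the cost vector (f(0), ..., f(N)); returns p_1, ..., p_{N-1},
    the up-probabilities of the cost-minimising birth and death chain."""
    f = np.asarray(f, dtype=float)
    f_tilde = (f[:-1] + f[1:]) / 2.0            # f~(n) = (f(n) + f(n+1))/2
    # s_0(C) = W_0 = 1 and I^{s_0}(C) = f~(0)/W_0, so on C^c:
    W = np.sqrt(2.0 * f_tilde / f_tilde[0])     # W_x for x = 1, ..., N-1
    W[0] = 1.0                                  # W_0 itself is uncontrolled
    w = W[1:] / W[:-1]                          # w_n = q_n/p_n = W_n/W_{n-1}
    return 1.0 / (1.0 + w)                      # p_n = 1/(1 + w_n)

# constant cost: the only deviation from p = 1/2 is at n = 1, reflecting
# the uncontrollable step Delta s(0) (the discrete analogue of Remark 3.4)
print(optimal_up_probabilities(np.ones(6)))
```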
5.3. Continuous-time and discrete-time with waiting. Now we consider the cases where the birth and death process may wait in a state and where it forms a continuous-time Markov chain. By solving the problem in the generality of Theorems 5.4 to 5.9 we are able to deal with these two cases very easily.

First, in the discrete-time case with waiting, where
$$P_{n,n-1} = q_n;\quad P_{n,n} = e_n;\quad P_{n,n+1} = p_n$$
(we stress that we take the holding probabilities $e_n$ to be fixed and not controllable), we can condition on the first exit time from each state, so that we replace $P$ by $P^*$ given by
$$P^*_{n,n-1} = q^*_n = \frac{q_n}{1-e_n};\quad P^*_{n,n} = 0;\quad P^*_{n,n+1} = p^*_n = \frac{p_n}{1-e_n}.$$
Of course, we must now modify the performance functional to allow for the time spent waiting in a state. Thus, in the additive case we must replace $f$ by $f^*$ given by
$$f^*(n) = \frac{f(n)}{1-e_n},$$
whilst in the multiplicative case we replace $r$ by $r^*$ given by
$$r^*(n) \stackrel{def}{=} \sum_{t=0}^\infty e_n^t(1-e_n)\,r(n)^{t+1} = \frac{(1-e_n)\,r(n)}{1 - e_n\,r(n)}.$$
Then, in the case of a continuous-time birth and death process with birth and death rates $\lambda_n$ and $\mu_n$, we obtain $P$ as the transition matrix for the corresponding jump chain, so $P_{n,n-1} = \frac{\mu_n}{\lambda_n+\mu_n}$ and $P_{n,n+1} = \frac{\lambda_n}{\lambda_n+\mu_n}$ (see [6] or [11]). We allow for the exponential holding times by setting
$$f^*(n) = \frac{f(n)}{\lambda_n+\mu_n}\quad\text{and}\quad r^*(n) = \frac{\lambda_n+\mu_n}{\alpha(n)+\lambda_n+\mu_n}.$$
Thus our results are still given by Theorems 5.4-5.9.

6. Examples and some concluding remarks

We first consider the original time-minimisation problem with general $\sigma$.

Example 6.1. Suppose that $f=1$ and we seek to solve Problem 1.2. Thus $C=\emptyset$, $y=0$, and the optimal choice of $s'$ according to Theorem 3.2 is
$$s' = \frac{\sqrt2}{\sigma}.$$
Notice that it follows that (with this choice of scale function)
$$ds(X^s_t) = \sqrt2\,dB_t,\quad\text{and}\quad \mathbb E_0[S] = s(1)^2.$$

Example 6.2. If we extend the previous example by assuming that $s$ is given on $[0,y)$, then we will still have $s'$ proportional to $1/\sigma$ on $[y,1]$, and so on this interval $s(X^s)$ will behave like a multiple of Brownian motion, with partial reflection at $y$ (at least if $s'(y-)$ exists).

Example 6.3. We now consider the additive-functional case with general $f$. From Theorem 3.2, the optimal choice of $s'$ is $\frac{\sqrt{2f}}{\sigma}$. With this choice of $s$, we see that
$$\langle s(X^s)\rangle_t = 2\int_0^t f(X^s_u)\,du,$$
so that
$$\mathbb E_0\Bigl[\int_0^S f(X^s_u)\,du\Bigr] = \frac12\,\mathbb E\bigl[\langle s(X^s)\rangle_S\bigr].$$

Example 6.4. If we turn now to the discounted case and take $\alpha$ constant and $\sigma=1$, we see that the optimal choice of $s'$ is constant, corresponding to zero drift. Thus we obtain the same optimal control for each $\alpha$. This suggests that, possibly, the optimum is actually a stochastic minimum for the shuttle time. Whilst we cannot contradict this for initial position 0, the corresponding statement for a general starting position is false. To see this, let $s_0$ correspond to drift 1 on $[0,y]$. Then a simple calculation shows that the optimal choice of $s'$ on $[y,1]$ is
$$\frac{1}{\sqrt{2\alpha}}\;\sqrt{\frac{\cosh(\sqrt{1+2\alpha}\,y) + \frac{1}{\sqrt{1+2\alpha}}\sinh(\sqrt{1+2\alpha}\,y)}{\cosh(\sqrt{1+2\alpha}\,y) - \frac{1}{\sqrt{1+2\alpha}}\sinh(\sqrt{1+2\alpha}\,y)}}.$$
It is clear that this choice depends on $\alpha$, and hence there cannot be a stochastic minimum since, were one to exist, it would achieve the minimum in each discounted problem.

Remark 6.5. For cases where a stochastic minimum is attained in a control problem see, for example, [8] or [3].

References

[1] Assing, S. and Schmidt, W. M.: "Continuous Strong Markov Processes in Dimension One", Springer, Berlin-Heidelberg-New York, 1998.
[2] Atchade, Y., Roberts, G. O. and Rosenthal, J. S.: Towards Optimal Scaling of Metropolis-Coupled Markov Chain Monte Carlo, Statistics and Computing, 21(4), 555-568, 2011.
[3] Connor, S. and Jacka, S.: Optimal co-adapted coupling for the symmetric random walk on the hypercube, J. Appl. Probab. 45, 703-713, 2008.
[4] Engelbert, H. J.: Existence and non-existence of solutions of one-dimensional stochastic equations, Probability and Mathematical Statistics, 20(2), 343-358.
[5] Engelbert, H. J. and Schmidt, W.: On solutions of one-dimensional stochastic differential equations without drift, Z. Wahrsch. Verw. Gebiete 68, 287-314, 1985.
[6] Feller, W.: "An Introduction to Probability Theory and its Applications, Vol. II", 2nd Edition, Wiley, New York, 1971.
[7] Itô, K. and McKean, H. P.: "Diffusion Processes and their Sample Paths", Springer, Berlin-Heidelberg-New York, 1974.
[8] Jacka, S.: Keeping a satellite aloft: two finite fuel stochastic control models, J. Appl. Probab. 36, 1-20, 1999.
[9] Øksendal, B.: "Stochastic Differential Equations", 6th Edition, Springer, Berlin-Heidelberg-New York, 2003.
[10] Revuz, D. and Yor, M.: "Continuous Martingales and Brownian Motion", 3rd Edition, corrected 3rd printing, Springer, Berlin-Heidelberg-New York, 2004.
[11] Williams, D.: "Diffusions, Markov Processes, and Martingales: Vol. I", Wiley, New York, 1979.

Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
E-mail address: s.d.jacka@warwick.ac.uk