OPTIMAL PORTFOLIO-CONSUMPTION WITH HABIT FORMATION AND

advertisement
OPTIMAL PORTFOLIO-CONSUMPTION WITH HABIT FORMATION AND
PARTIAL OBSERVATIONS: THE FULLY EXPLICIT SOLUTIONS APPROACH
XIANG YU
Abstract. We consider a model of optimal investment and consumption with both habit-formation
and partial observations in incomplete Itô processes markets. The individual investor develops
addictive consumption habits gradually while he can only observe the market stock prices but
not the instantaneous rates of return, which follow Ornstein-Uhlenbeck processes. Applying the
Kalman-Bucy filtering theorem and Dynamic Programming arguments, we solve the associated HJB
equation fully explicitly for this path dependent stochastic control problem in the case of power
utility preferences. We will provide the optimal investment and consumption policies in explicit
feedback forms using rigorous verification arguments.
1. Introduction
Habit Formation preference has become a popular alternative tool of traditional time separable
von Neumann-Morgenstern utilities on consumption optimization problems during recent years.
It has been observed that the time additivity property of traditional preferences is not consistent
with many empirical experiments, for instance, the celebrated Equity Premium Puzzle. (see Mehra
and Prescott [12] and Constantinides [4]). The literature in financial economics has been arguing the past consumption pattern has the impact on individual’s current consumption decisions,
and the preference should depend both on the consumption rate process and the corresponding
consumption history integral. In particular, the path dependent linear habit formation preference
RT
E[ 0 U (t, ct − Zt )dt] has been widely accepted, where index Zt stands for the accumulative consumption history. The definition that instantaneous utility function is decreasing in Zt indicates
that an increase in consumption today increases current utility but depresses all future utilities
through the induced increase in future standards of living.
Various works studied the continuous time habit-formation optimization problem in complete
Itô processes markets under full information. Detemple and Zapatero [5] and [6] employed the
martingale approach and derived the closed form optimal consumption by establishing some recursive stochastic differential equations. In the same framework, Schroder and Skiadas [16] made an
insightful observation that under some appropriate assumptions, to solve the optimal portfolio seRT
lection with utilities incorporating linear habit formation E[ 0 U (t, ct − Zt )dt] is equivalent to solve
RT
the time additive utility maximization E[ 0 U (t, c0t )dt] in some isomorphic financial markets with∗
∗
out habit formation, where the optimal policies are related as c0∗
t = ct − Zt . Munk [13] adopts the
Date: December 12, 2011.
(The Preliminary Version).
2010 Mathematics Subject Classification. Primary 91G10; secondary 93E11 93E20.
Key words and phrases. Utility maximization, consumption habit formation, Kalman-Bucy Filtering, pathdependent stochastic control, auxiliary ODEs, verification lemma.
1
2
XIANG YU
Market Isomorphism result in his complete market model with stochastic interests rates processes,
and provided closed form optimal strategies in some special cases. Taking advantage of the first
order condition in the same complete market, possibly non-Markovian, Egglezos and Karatzas [7]
exploited the interplay of stochastic partial differential equations to the utility maximization with
linear addictive habit formation, and obtained some stochastic feedback formulae for the optimal
portfolio and consumption choices. Although many nice results have already been explored, the
problem remains open when the financial market turns into incomplete.
In our present work, we are considering the habit forming investor in incomplete Itô processes
financial markets, together with additional partial observations constraint. We are facing the case
that the individual investor develops his own consumption habits during the whole investment
process and meanwhile has only access to the stock price information FS . In other words, he can
not observe the mean rate of return process µt and the corresponding Brownian motion Wt which
appeared in the stock price dynamics. In our model, we will assume µt follows the mean reverting
Ornstein Uhlenbeck process driven by another Brownian motion Bt .
Optimal investment problems under incomplete information have been studied by numerous authors, and we only list a very small amount of them: Lakner [11] applied martingale methods and
derived the structure of the optimal investment strategies. The linear diffusion model is studied by
Brendle [2], who derived explicit results for the value of information on optimal investment with
power and exponential utilities using dynamic programming approach. The effects of learning on
the composition of the optimal portfolios are studied in Brennan [3] and Xia [17]. Björk, Davis
and Landén [1] considered the market model with unobservable rates of returns that are allowed
to be arbitrary semimartingales, and they provided a unified treatment of a large class of partially
observed investment problems and explicit representation of the optimal terminal wealth and portfolio strategies.
Our contributions are two folds. From the modeling perspective, we are considering the utility
maximization problem with consumption habit formation in the incomplete financial markets setting, where there exist Brownian motions Wt and Bt that are not perfectly correlated. Together,
our individual investor withstands the incomplete information restriction, that he only has access
to the public stock prices. The combination of these two scenarios is not only a novel framework
and mathematically interesting, but also it covers many economically realistic constraints that the
individual investor is facing in his daily lives. On the other hand, mathematically speaking, we
solved the relatively complicated nonlinear HJB equation in a fully explicit form, and derived the
FtS -adapted optimal investment and consumption polices in feedback form via rigorous verification
arguments.
The structure of the present paper is as follows: Section 2 introduces the market model and the
concept of habit formation process. Section 3 briefly reviewed the stochastic filtering result and
Kalman Bucy filtering Theorem. The utility maximization problem with addictive habit formation
and partial observations is defined in Section 4. By applying the Kalman Bucy filtering theorem and
Dynamic Programming Principle, we formally derived the Hamilton-Jacobi-Bellman(HJB) equation for the power utility preference, and we provide the decoupled form solution of this nonlinear
PDE, which reduces the algorithm to solving some auxiliary ODEs with constant coefficients. Based
HABIT FORMATION AND PARTIAL OBSERVATIONS
3
on these classical solutions, the explicit feedback form of the optimal investment and consumption
policies will be obtained. Section 5 contains the rigorous proofs of the verification arguments. At
last, four cases of fully explicit solutions of some auxiliary ODEs are presented in the Appendix A.
2. Market Model and Consumption Habit Formation
On a probability space (Ω, F, P) equipped with the filtration F = (Ft )0≤t≤T , we consider a
financial market with one riskfree bond and one stock account for a “small investor” in a finite time
horizon [0, T ]. The price of the bond St0 solves:
dSt0 = rt St0 dt,
0≤t≤T
with initial price S00 = 1, and without loss of generality, we assume the interest rate rt ≡ 0, for all
t ∈ [0, T ], this can be achieved by the standard change of numéraire.
The stock price St is modeled as a diffusion process solving:
(2.1)
dSt = µt St dt + σS St dWt ,
0 ≤ t ≤ T,
with S0 = s > 0, where the drift process µt is Ft -progressively measurable, and satisfying the
mean-reverting Ornstein Uhlenbeck SDE:
(2.2)
dµt = −λ(µt − µ̄)dt + σµ dBt ,
0 ≤ t ≤ T,
Wt and Bt are Brownian motions defined on (Ω, F, P) and they are correlated with the coefficient
ρ ∈ [−1, 1]. We assume the initial value of the drift process µ0 is a F0 measurable Gaussian
random variable, satisfying µ0 ∼ N(η, θ), which is independent of Brownian motions (Bt )0≤t≤T and
(Wt )0≤t≤T . We also assume all the coefficients σS , λ, µ̄, σµ ≥ 0 are nonnegative constants.
Now, at each time t ∈ [0, T ], the investor chooses a consumption rate ct ≥ 0, and decides the
amounts πt of her wealth to invest in the stock, and the rest in bank. Then, in this self-financing
market model, the investor’s total wealth process Xt will follow the dynamics:
(2.3)
dXt = (πt µt − ct )ds + σS πt dWt ,
0 ≤ t ≤ T,
with the initial wealth X0 = x > 0.
In this paper, we adopt the notation Zt = Z (t; ct ) as “Habit Formation” process or “the standard
of living” process to describe his “consumption habits level” . We assume the accumulative index
Zt follows the dynamics:
(2.4)
dZt = (δ(t)ct − α(t)Zt )dt,
0 ≤ t ≤ T,
where Z0 = z ≥ 0 is called the initial habit, and α(t), δ(t) are nonnegative continuous functions.
Equivalently, (2.4) stipulates
Z t
Rt
Rt
(2.5)
Zt = ze− 0 α(u)du +
δ(u)e− u α(v)dv cu du, 0 ≤ t ≤ T.
0
Zt is defined as an exponentially weighted average of the initial habit and the past consumption.
Here, functions α(t) and δ(t) measure, respectively, the persistence of the past level and the intensity
4
XIANG YU
of consumption history.
In our current work, we are considering the case of “addictive habits”, i.e., we require investor’s
current consumption strategies shall never fall below the standard of living level,
(2.6)
ct ≥ Zt , ∀0 ≤ t ≤ T,
a.s..
We will see in the future that this additional consumption budget constraint implies initial wealth
must be sufficiently large to sustain habits and ensure the existence of optimal policies.
3. Stochastic filtering with partial observations
From now on, we shall make the assumption that the investor can observe the stock price
process St which are published and available to the public, however, the drift process µt and
the information of Brownian motions (Wt )0≤t≤T and (Bt )0≤t≤T are unknown. We shall call this
as the “partial observations information” scenario. This investment and consumption optimization problem with incomplete information will be modeled by requiring the investment strategy
(πt )0≤t≤T and consumption policy (ct )0≤t≤T must be only adapted to the partial observation filtration FS = (FtS )0≤t≤T where FtS = σ{Su : 0 ≤ u ≤ t}, which is strictly smaller than the
background full information F = (Ft )0≤t≤T . To this end, we need first to prepare the stochastic
filtering theorem.
3.1. General filtering model. Consider a probability space (Ω, F, P) equipped with a background filtration F = (Ft )0≤t≤T . All processes are assumed to be F progressively measurable. The
stochastic signal state process Rt : Ω −→ Rn is given by the SDE
dRt = A(t, Rt )dt + C(t, Rt )dBt ,
0 ≤ t ≤ T,
where Bt denotes a Ft -Brownian Motion. We assume Rt is not directly observable.
And an observation process Ht is gives as:
dHt = D(t, Rt )dt + E(t, Rt )dWt ,
0 ≤ t ≤ T,
where, Wt is another Ft -Brownian motion, correlated with Bt with the coefficient ρ ∈ [−1, 1].
The filtering problem is to find the best estimate R̂t based on the observation of Ht : which means
to compute the conditional distribution of the signal process R̂t , given the observation filtration
F̂t = σ(Hs : 0 ≤ s ≤ t):
i
h
P Rt ∈ AF̂t , 0 ≤ t ≤ T, for A ∈ F̂T ,
equivalently, to compute the conditional expectation:
i
h
E f (Rt )F̂t , 0 ≤ t ≤ T,
for a suitable class of test functions f : R → R.
HABIT FORMATION AND PARTIAL OBSERVATIONS
5
3.2. The Kalman Bucy filtering. For simplicity, we look at the one dimensional signal state
process Rt and observation process Ht , defined by linear SDEs:
dRt = A(t)Rt dt + C(t)dBt
(3.1)
dHt = D(t)Rt dt + E(t)dWt ,
0 ≤ t ≤ T,
where, A(t), C(t) 6= 0, D(t), E(t) 6= 0 are deterministic functions satisfying the integrability
condition:
Z Th
i
|A(t)| + |D(t)| + C 2 (t) + E 2 (t) dt < +∞.
0
Suppose R0 is a Gaussian random variable with R0 ∼ N (µ, θ), which is independent of Brownian
motions (Bt )0≤t≤T and (Wt )0≤t≤T . And H0 = s, where s is a constant.
One can see (Rt , Ht ) is a Gaussian vector process, so the conditional distribution of Rt given
F̂t = σ(Hs : 0 ≤ s ≤ t) is also Gaussian, and the entire distribution is determined by its mean R̂t
and variance Ω̂t :
h i
R̂t = E Rt F̂t ,
i
h
Ω̂t = E (Rt − R̂t )2 F̂t ,
with initial values
h i
h i
R̂0 = E R0 F̂0 = E R0 = µ
i
i
h i
h
h
Ω̂0 = E (R0 − R̂0 )2 F̂0 = E (R0 − µ)2 = Var R0 = θ.
And the linear filtering problem becomes to compute the R̂t and Ω̂t given their initial values. The
celebrated Kalman-Bucy filter theorem gives us the algorithm for these estimations.
Theorem 3.1 (One-dimensional Kalman-Bucy Filtering).
For the one dimensional signal state process hRt andi observation process Ht following the linear
SDEs (3.1), the conditional expectation R̂t = E Rt F̂t satisfies the SDE:
dR̂t = A(t)R̂t dt +
h D(t)
E(t)
i
Ω̂t + C(t)ρ dŴt ,
0 ≤ t ≤ T,
with R̂0 = µ, where (Ŵt )0≤t≤T is called the Innovation Process, and definded as:
D(t) dŴt = dWt +
Rt − R̂t dt, 0 ≤ t ≤ T, Ŵ0 = 0.
E(t)
and Ŵt is an F̂t progressively measurable
Motion.
h Bownian
i
The conditional variance Ω̂t = Var Rt F̂t satisfies the deterministic Riccati equation:
h
dΩ̂t
D2 (t)
C(t)D(t)ρ i
= − 2 Ω̂2t + 2 A(t) −
Ω̂t + C 2 (t)(1 − ρ2 ),
dt
E (t)
E(t)
with the initial condition Ω̂0 = θ.
0 ≤ t ≤ T,
6
XIANG YU
4. Utility Maximization with Kalman-Bucy Filtering
4.1. Dynamic Programming on Partial Observations Filtration FS .
Apply the above Kalman-Bucy Stochastic Filtering Theorem, we can first define the Innovation
Process in our market model as:
i
1 h
1 dSt
(4.1)
dŴt ,
− µ̂t dt , 0 ≤ t ≤ T,
(µt − µ̂t )dt + σS dWt =
σS
σS St
which is a Brownian motion under partial observations filtration FtS .
by Kalman-Bucy filtering theorem, the conditional estimations of drift process: µ̂t =
i
hMoreover,
E µt FtS satisfies the SDE:
Ω̂(t) + σ σ ρ S µ
dµ̂t = − λ(µ̂t − µ̄)dt +
dŴt ,
σS
(4.2)
Ω̂(t) + σS σµ ρ Ω̂(t) + σS σµ ρ dS
t
µ̂t dt + λµ̄dt +
= −λ−
, 0 ≤ t ≤ T,
St
σS2
σS2
h i
with µ̂0 = E µ0 F0S = η. So we can solve µ̂t as the strong solution of the SDE (4.2) by knowing
the stock price process St .
i
h
And the conditional variance Ω̂t = E (µt − µ̂t )2 FtS satisfies the Riccati ODE:
h
(Ω̂t + σS σµ ρ)2 i
dΩ̂t = σµ2 − 2λΩ̂t −
dt, 0 ≤ t ≤ T,
σS2
i
h
with Ω̂(0) = E (µ0 − η)2 F0S = θ, which has an explicit solution as:
(4.3)
√
√
(4.4)
Ω̂t = Ω̂(t; θ) =
kσS
k1 exp(2( σSk )t) + k2
√
k
− (λ +
k1 exp(2( σS )t) − k2
σµ ρ 2
)σ , 0 ≤ t ≤ T,
σS S
where:
k = λ2 σS2 S + 2σS σµ λρ + σµ2 ,
√
k1 = kσS + (λσS2 + σS σµ ρ) + θ,
√
k2 = − kσS + (λσS2 + σS σµ ρ) + θ.
By observation, we see Ω̂(t) tends to the value
q
(4.5)
θ∗ = σS λ2 σS2 + 2σS σµ λρ + σµ2 − (λσS2 + σS σµ ρ)
as time t → +∞, which we call as “steady state learning” (see also Brennan [3]). This convergence
property of Ω̂(t) tells us the precision of the drift estimate goes from an initial condition to a steady
state in the long time run, and after large time T , new return observations contribute to updating
the estimated value of the state variable, but seldom reduce the variance of the estimation error.
Moreover, by the Riccati ODE (4.3), we clearly have the boundedness
(4.6)
min(θ, θ∗ ) ≤ Ω̂(t) ≤ max(θ, θ∗ ),
t ∈ [0, T ].
HABIT FORMATION AND PARTIAL OBSERVATIONS
7
Under the observation filtration (FtS )0≤t≤T , we can instead rewrite stock price dynamics (2.1)
driven by the innovation process Ŵt as:
(4.7)
dSt = µ̂t St dt + σS St dŴt ,
0 ≤ t ≤ T.
Notice we are now seeking the optimal investment and consumption strategies πt and ct which
are only progressively measurable with respect to the partial observations filtration FtS , where the
living standard process Zt satisfies the ODE
dZt = (δ(t)ct − α(t)Zt )dt,
(4.8)
0 ≤ t ≤ T,
and under the partial observations filtration FtS , the wealth process dynamics (2.3) under πt and
ct will be rewritten as:
dXt = (πt µ̂t − ct )dt + σS πt dŴt ,
(4.9)
0 ≤ t ≤ T.
Our goal now is to maximize the consumption with linear habit formation and terminal wealth
by power utility preference under the partial observations filtration FtS :
i
h Z T (c − Z )p
(XT )p s
s
ds +
(4.10)
V (t, x, z, η, θ) = sup E
Xt = x, Zt = z, µ̂t = η, Ω̂t = θ ,
p
p
π,c∈A
t
where we take the risk aversion coefficient p < 1 and p 6= 0.
Recall that the conditional variance Ω̂t is a deterministic function of time, we can set Ω̂t = θ as
the parameter instead of the variable, and the dimension of the value function can be reduced as:
h Z T (c − Z )p
i
(XT )p s
s
(4.11)
V (t, x, z, η; θ) = sup E
ds +
X
=
x,
Z
=
z,
µ̂
=
η
;
t
t
t
p
p
π,c∈A
t
for each fixed initial value Ω̂(t) = θ.
An investment and consumption pair process (πt , ct ) is said in the Admissible Control Space
A: if it is FtS -progressively measurable, and satisfies the integrability conditions:
Z T
Z T
2
(4.12)
πt dt < +∞, a.s. and
ct dt < +∞, a.s.
0
0
with the addictive habits constraint that: ct ≥ Zt , ∀t ∈ [0, T ]. Moreover, no bankruptcy is allowed,
i.e., the investor’s wealth remains nonnegative: Xt ≥ 0, 0 ≤ t ≤ T .
By the definition of V (t, x, z, η) and Dynamic Programming Principle, we can formally derive
the Hamilton-Jacobi-Bellman (HJB) equation for the value function as:
2
h
Ω̂(t) + σS σµ ρ
V
+
max
− cVx + cδ(t)Vz
Vt − α(t)zVz − λ(η − µ̄)Vη +
ηη
c
2σS2
(4.13)
h
i
(c − z)p i
1
+ max πηVx + σS2 π 2 Vxx + Vxη Ω̂(t) + σS σµ ρ π = 0,
+
π
p
2
8
XIANG YU
with the terminal condition V (T, x, z, η) =
xp
p .
4.2. Decoupled Reduced Form Solutions.
If V (t, x, z, η) is smooth enough, the first order condition formally derives
−ηVx − Ω̂(t) + σS σµ ρ Vx η
π ∗ (t, x, z, η) =
,
σS2 Vxx
(4.14)
1
p−1
c∗ (t, x, z, η) = z + Vx − δ(t)Vz
.
which achieve the maximum over control policies π and c respectively.
Plugging forms of (4.14) for π ∗ and c∗ , the HJB equation becomes:
2
Ω̂(t) + σS σµ ρ
η Ω̂(t) + σS σµ ρ V V
x xη
Vt − α(t)zVz − λ(η − µ̄)Vη +
Vηη −
Vxx
2σS2
σS2
(4.15)
2
2
p − 1
p
Ω̂(t)
+
σ
σ
ρ
2
2
S µ
Vxη
η V
p−1
= 0.
−
z
V
−
δ(t)V
−
V
−
δ(t)V
− 2 x −
x
z
x
z
2
Vxx
p
2σS Vxx
2σS
From definition (4.11) of the value function, and dynamics (4.9) and (4.8) for Xt and Zt respectively, it’s easy to show that if V (t, x, z, η) is finite, then it is homogeneous in (x, z) with degree p,
i.e., for any x > 0, z ≥ 0 and the positive constant k, we have V (t, kx, kz, η) = k p V (t, x, z, η). It
therefore makes sense for us to seek the value function of the form:
h
ip
(x − w(t, η)z)
(4.16)
V (t, x, z, η) =
M (t, η)
p
p
for some test functions w(t, η) and M (t, η) to be determined. By the virtue of V (T ) = xp , we will
require M (T, η) = 1 and w(T, η) = 0.
After we do the direct substitution in the above Equation (4.15) and divide the equation on both
sides by (x + w(t, η)z)p , the HJB equation becomes
(4.17)
2
h
− wt z + α(t)wz − (1 + δ(t)w)z + λ(η − µ̄)wη −
Ω̂(t)+σS σµ ρ
2
2σS
η Ω̂(t)+σS σµ ρ
wηη +
2
σS
wη
i
x − w(t, η)z
2
Ω̂(t)
+
σ
σ
ρ
η
Ω̂(t)
+
σ
σ
ρ
2
µ
S
S
S
1
λ(η − µ̄)
η
+ Mt −
Mη +
Mηη −
M−
Mη
2
p
p
2pσS
2(p − 1)σS2
(p − 1)σS2
2
p
Ω̂(t) + σS σµ ρ M 2 p − 1 p
p−1
η
−
1
+
δ(t)w(t,
η)
M p−1 = 0.
−
2
M
p
2(p − 1)σS
M
Since the Equation (4.17) holds for all values of x and z, we naturally set the unknown priori
function w(t, η) = w(t) as a deterministic function in time t and satisfies:
(4.18)
− wt (t) + α(t)w(t) − (1 + δ(t)w(t)) = 0
HABIT FORMATION AND PARTIAL OBSERVATIONS
with the terminal condition w(T ) = 0, which is equivalent to:
Z T
Z s
(δ(v) − α(v))dv ds.
exp
(4.19)
w(t) =
9
0 ≤ t ≤ T.
t
t
Now we can substitute the function w(t) into the equation (4.17) above, and simplify it as:
2
p
Ω̂(t)
+
σ
σ
ρ
2
µ
S
p
pη
p−1
Mt +
M
+
M
+
(1
−
p)
1
+
δ(t)w(t)
M p−1
ηη
2
2
2(1 − p)σ1
2σS
(4.20)
2
h
p M2
Ω̂(t)
+
σ
σ
ρ
S µ
η(Ω̂(t) + σS σµ ρ)p η
+ − λ(η − µ̄) +
= 0.
M
+
η
2
2
M
(1 − p)σS
2(1 − p)σS
Now in order to solve the above nonlinear PDE (4.20), we can set the power transform as
M (t, η) = N (t, η)1−p
(4.21)
And the nonlinear PDE (4.20) for M (t, η) reduces to the linear parabolic PDE for N (t, η) as:
Nt +
(4.22)
pη 2
2
Ω̂(t) + σS σµ ρ
2 N (t, η) +
2(1 − p)2 σS
2σS2
η Ω̂(t) + σS σµ ρ p i
p
p−1
Nηη + 1 + δ(t)w(t)
h
+ − λ(η − µ̄) +
(1 − p)σS2
Nη (t, η) = 0
with N (T, η) = 1.
For the above linear PDE (4.22) of N (t, η), we can further solve it explicitly as:
Z T
p
p−1
1 + δ(s)w(s)
N (t, η) =
exp A(s, t)η 2 + B(s, t)η + C(s, t) ds
t
(4.23)
+ exp A(T, t)η 2 + B(T, t)η + C(T, t) ,
where we have for 0 ≤ t ≤ s ≤ T , A(s, t) = A(t; s), B(s, t) = B(t; s) and C(s, t) = C(t; s) satisfying
the following ODEs:
2
h
i
p
Ω̂(t)
+
σ
σ
ρ
2
Ω̂(t)
+
σ
σ
ρ
S µ
S µ
p
(4.24) At (t; s)+
+2 −λ+
A(t; s)+
A2 (t; s) = 0;
2
2
2
2
2(1 − p) σS
σS (1 − p)
σS
(4.25)
h
Bt (t; s) + − λ +
p Ω̂(t) + σS σµ ρ i
σS2 (1 − p)
(4.26)
Ct (t; s) + λµ̄B(t; s) +
2
2 Ω̂(t) + σS σµ ρ
B(t; s) + 2λµ̄A(t; s) +
2
Ω̂(t) + σS σµ ρ 2σS2
σS2
B 2 (t; s) + 2A(t; s) = 0;
with terminal conditions: A(s; s) = B(s; s) = C(s; s) = 0.
A(t; s)B(t; s) = 0;
10
XIANG YU
We remark that the above ODEs are similar to the ODEs obtained by Brendle [2] for terminal
wealth optimization problem with partial observations, and Brendle [2] made an insightful observation that we can actually solve the above ODEs by solving some auxiliary ODEs with constant
coefficients:
Theorem 4.1. For 0 ≤ t ≤ s ≤ T , consider the following auxiliary ODEs for a(t; s), b(t; s), c(t; s),
f (t; s) and g(t; s):
2pρσµ p
2(1 − p + pρ2 ) 2 2
,
σµ a (t; s) + 2λ −
a(t; s) −
(4.27)
at (t; s) = −
1−p
(1 − p)σS
2(1 − p)σS2
(4.28)
bt (t; s) = −
pρσµ 2(1 − p + pρ2 ) 2
σµ a(t; s)b(t; s) − 2λµ̄a(t; s) + λ −
b(t; s),
1−p
(1 − p)σS
ct (t; s) = −σµ2 a(t; s) −
(4.29)
(1 − p + pρ2 )σµ2 2
b (t; s) − λµ̄b(t; s),
2(1 − p)
ft (t; s) = −2(1 − ρ2 )σµ2 f 2 (t; s) + 2
(4.30)
λσS + ρσµ
1
f (t; s) + 2 ,
σS
2σS
gt (t; s) = σµ2 (1 − ρ2 )(f (t; s) − a(t; s)),
(4.31)
with the terminal conditions a(s; s) = b(s; s) = c(s; s) = f (s; s) = g(s; s) = 0, and if we adopt the
convention 00 = 0, then for the functions defined by:
Ã(t; s) =
a(t; s)
,
(1 − p) 1 − 2a(t; s)Ω̂(t)
B̃(t; s) =
b(t; s)
,
(1 − p) 1 − 2a(t; s)Ω̂(t)
Ω̂(t)
1 h
1−p
b2 (t; s) −
c(t; s) + log 1 − 2a(t; s)Ω̂(t)
1−p
2
1 − 2a(t; s)Ω̂(t)
i
p
− log 1 − 2f (t; s)Ω̂(t) − pg(t; s) ,
2
we have the equivalence that:
C̃(t; s) =
(4.32)
A(t; s) = Ã(t; s), B(t; s) = B̃(t; s), C(t; s) = C̃(t; s),
0 ≤ t ≤ s ≤ T.
Remark 4.1. The equivalence result (4.32) reveals the fact that to solve the ODEs (4.24), (4.25),
(4.26) with variable dependent coefficients is equivalent to solve the auxiliary ODEs (4.27), (4.28),
(4.29), (4.30) and (4.31) with constant coefficients in an order, i.e, we solve the Riccati ODE (4.27)
first, and substitute the solution a(t; s) into ODE (4.28) and solve for the solution b(t; s), and etc.
Actually, we can even solve out fully explicit solutions for a(t; s), b(t; s), c(t; s), f (t; s) and
g(t; s). We list all four different cases of fully explicit solutions in the Appendix depending on the
risk aversion coefficient p and the market coefficients σS , σµ , λ and ρ. By simple substitutions,
HABIT FORMATION AND PARTIAL OBSERVATIONS
11
we can therefore solve the ODEs (4.24), (4.25), (4.26) for A(t; s), B(t; s) and C(t; s) in the fully
explicit expression.
Lemma 4.1. Suppose the risk aversion constant p and the market coefficients σS , σµ , λ, ρ satisfy
λ2 σS2
p
λσS
,
≤
≤
2
1−p
(2λρσµ + σµ )
ρσµ
(4.33)
and the solution of (4.27) satisfies |1 − a(t; s)Ω̂(t)| ≥ > 0 for a constant on 0 ≤ s ≤ t ≤ T , then
there exist uniform constants bounds K̄1 > 0, K̄2 > 0 and K̄3 > 0 such that
A(t; s) ≤ K̄1 ,
(4.34)
B(t; s) ≤ K̄2 ,
C(t; s) ≤ K̄3 ,
0 ≤ t ≤ s ≤ T.
Proof. Under Assumption (4.33), the explicit solution a(t; s) is bounded and we observe the form
of ODEs (4.28), (4.29), then b(t; s), c(t; s) are bounded on 0 ≤ t ≤ s ≤ T if a(t; s) is bounded.
Now since the ODE (4.30) is well defined independent of the risk aversion constant p, and it always
admits a bounded solution f (t; s) < 0 and |1 − f (t; s)Ω̂(t)| > 1 > 0 for 0 ≤ t ≤ s ≤ T and
hence we deduce g(t; s) is bounded for 0 ≤ t ≤ s ≤ T . Combine these with the assumption
|1 − a(t; s)Ω̂(t)| ≥ > 0 on 0 ≤ t ≤ s ≤ T , we can conclude A(t; s), B(t; s) and C(t; s) are all
uniformly bounded on 0 ≤ t ≤ s ≤ T by substitution results in Theorem 4.1.
Now, for t ∈ [0, T ], η ∈ (−∞, +∞), we can define the effective domain for the pair (x, z) as:
(4.35)
(x, z) ∈ Dt = {(x0 , z 0 ) ∈ (0, +∞) × [0, +∞); x0 ≥ w(t)z 0 }, 0 ≤ t ≤ T,
and the function
Z
p
[(x − w(t)z)]p h T p−1
Ve (t, x, z, η) =
exp A(s, t)η 2 + B(s, t)η + C(s, t) ds
1 + δ(s)w(s)
p
t
(4.36)
i1−p
+ exp A(T, t)η 2 + B(T, t)η + C(T, t)
is well defined on [0, T ] × Dt × R and it’s the classical solution of the HJB equation (4.13), where
RT
Rs
w(t) = t exp( t (δ(v)−α(v))dv)ds, and A(s, t), B(s, t), C(s, t) are defined in (4.24), (4.25), (4.26).
In our main result below, we want to verify the above classical solution Ve (t, x, z, η) equals our primal value function defined in (4.11), however, the effective domain of Ve (t, x, z, η) deduces some
constraints on the optimal wealth process Xt∗ and habit formation process Zt∗ . In particular, when
t = 0, we have to pose the initial wealth-habit budget constraint that x > w(0)z.
4.3. The Main Result.
Theorem 4.2 (The Verification Theorem).
Under the initial wealth-habit budget constraint x > w(0)z, either if risk aversion constant
(4.37)
p < 0;
12
XIANG YU
or if
(4.38)
0 < p < 1,
together with market coefficients σS , σµ , λ, Θ, ρ
satisfy the additional assumption (4.33) and
(4.39)
λ2 σS4
p(1 + p)
<
.
(1 − p)2
4(Θ + σS σµ ρ)2
where Θ , max{θ, θ∗ } and θ∗ is define in (4.5), moreover, we assume the upper boundedness K̄1
in (4.34) of A(t; s) in Lemma 4.1 satisfies
(4.40)
8K̄1 + 1 <
λσS2
.
(Θ + σS σµ ρ)2
Then, define V (t, x, z, η) as in (4.11) and define Ve (t, x, z, η) as in (4.36), we have the equivalence:
(4.41)
Ve (t, x, z, η) = V (t, x, z, η).
And the optimal investment policy πt∗ and optimal consumption policy c∗t are given in the feedback
form: πt∗ = π ∗ (t, Xt∗ , Zt∗ , µ̂t ) and c∗t = c∗ (t, Xt∗ , Zt∗ , µ̂t ), 0 ≤ t ≤ T , where the function π ∗ (t, x, z, η) :
[0, T ] × Dt × R −→ R is defined by:
h
Ω̂(t)
+
σ
σ
ρ
µ
S
Nη (t, η) i
η
(4.42)
π ∗ (t, x, z, η) =
(x − w(t)z), 0 ≤ t ≤ T.
+
N (t, η)
(1 − p)σS2
σS2
c∗ (t, x, z, η) : [0, T ] × Dt × R −→ R+ is defined by:
(4.43)
c∗ (t, x, z, η) = z +
(x − w(t)z)
, 0 ≤ t ≤ T.
1
1 + δ(t)w(t) 1−p N (t, η)
And the optimal wealth process Xt∗ , for 0 ≤ t ≤ T , is given explicitly by:
Z t
Z t (µ̂ )2
µ̂u
N (t, µ̂t )
u
∗
(4.44)
Xt = (x − w(0)z)
exp
du
+
d
Ŵ
+ w(t)Zt∗ ,
u
2
N (0, η)
(1
−
p)σ
S
0
0 2(1 − p)σS
where w(t) and N (t, η) are defined in (4.19) and (4.23) respectively.
Remark 4.2. The more complex structure of feedback forms of optimal investment and consumption policies is the consequence of the time non-separability of the instantaneous utility with habit
π∗
c∗
formation. We can see the portfolio/wealth ratio X
∗ and consumption/wealth ratio X ∗ are now
Z∗
depending on the habit-formation/wealth ratio X
∗:
h
Ω̂(t)
+
σ
σ
ρ
∗
µ
S
Nη (t, µ̂t ) i
π
Z∗
µ̂t
=
+
(1
−
w(t)
),
X∗
N (t, µ̂t )
X∗
(1 − p)σS2
σS2
and
Z∗
c∗
1
w(t)
=
+
1
−
,
1
1
∗
X∗
1 + δ(t)w(t) 1−p N (t, µ̂t )
1 + δ(t)w(t) 1−p N (t, µ̂t ) X
HABIT FORMATION AND PARTIAL OBSERVATIONS
13
Moreover, although function c∗ (·, x, ·, ·) remains still linear and increasing in x > 0, c∗ (·, ·, z, ·)
is not necessary increasing in z ≥ 0, which shows the increase of initial habit dose not necessarily
imply the increase of optimal consumption stream. And since the dependence of c∗ (t, x, z, η) on the
discounting factors α(t) and δ(t) are even more complicated, the optimal consumption process c∗t is
not necessarily monotone in the habit formation process Zt∗ .
5. Proof of The Verification Theorem
We will first show the consumption constraint ct ≥ Zt implies the constraint on the controlled
wealth process by the following proposition:
Proposition 5.1. The admissible space A is not empty if and only if x > w(0)z. Moreover,
for each pair of investment and consumption policy (π, c) ∈ A, the controlled wealth process Xtπ,c
satisfies the constraint:
Xtπ,c ≥ w(t)Zt ,
(5.1)
0 ≤ t ≤ T,
where the deterministic function w(t) is defined in (4.18) and refers the cost of subsistence consumption per unit of standard of living at time t.
R
t
Proof. On one hand, let’s assume x ≥ w(0)z, then we can always take πt ≡ 0, and ct = z exp 0 (δ(v)−
α(v))dv for t ∈ [0, T ], it is easy to verify Xtπ,c ≥ 0 and ct ≡ Zt so that (π, c) ∈ A, and hence A is
not empty.
On the other hand, starting from t = 0 with the wealth x and the standard of living z, the
addictive habits constraint ct ≥ Zt , 0 ≤ t ≤ T implies the consumption must always exceed the
subsistence consumption c̃t = Z(t; c̃t ) which satisfies
dc̃t = (δ(t) − α(t))c̃t dt, c̃0 = z, 0 ≤ t ≤ T,
And ct ≥ c̃t is equivalent to
(5.2)
ct ≥ z exp
Z
t
(δ(v) − α(v))dv , 0 ≤ t ≤ T.
0
Define the exponential local martingale
Z
Z t µ̂
1 t µ̂2v v
e
Ht = exp −
dŴv −
dv , 0 ≤ t ≤ T.
2 0 σS2
0 σS
e is a true martingale with respect to (Ω, F S , P), since µ̂t follows the
Bene’s condition implies H
dynamics (4.2).
e as dPe = H
e T , Girsanov theorem states that
Now define the probability measure P
dP
Z t
µ̂v
f
Wt , Ŵt +
dv, 0 ≤ t ≤ T
0 σS
14
XIANG YU
e (F S )0≤t≤T ).
is a Brownian Motion under (P,
t
Then we can rewrite the wealth process dynamics as:
Z T
Z T
fv ,
XT +
cv dv = x +
πv σS dW
0
0
Since we have XT ≥ 0, it’s easy to see that
e we have:
take the expectation under P,
Rt
0
e and
fv is a supermartingale under (Ω, FS , P),
πv σS dW
hZ
e
x≥E
T
i
cv dv .
0
Follow the inequality (5.2), we will further have:
i
Z v
hZ T
e
(δ(u) − α(u))du dv .
exp
x ≥ zE
0
0
Since δ(t) and α(t) are deterministic functions, we easily arrive x ≥ w(0)z.
In general, for ∀t ∈ [0, T ], follow the same procedure, we can then take conditional expectation
under filtration FtS , and get
hZ T
Z v
i
e
Xt > Zt E
exp
(δ(u) − α(u))du dv FtS ,
t
t
again since δ(t), α(t) are deterministic, we obtain Xt ≥ w(t)Zt , 0 ≤ t ≤ T .
5.1. The Case p < 0 .
(The PROOF OF THEOREM 4.2).
First, for any pair of admissible control (πt , ct ) ∈ A, denote G πt ,ct as the generator of the process
Ve (t, Xt , Zt , µ̂t ) under control (πt , ct ). Ito’s lemma gives
i
h
h
i h
Ω̂(t) + σS σµ ρ i
(5.3)
d Ve (t, Xt , Zt , µ̂t ) = G πt ,ct Ve (t, Xt , Zt , µ̂t ) dt + Vex σS πt + Veη
dŴt ,
σS
e (t, x , z , η) is the classical solution of HJB equation (4.13), choose the localizing sequence
Recall V
τn , we integrate the equation (5.3) on [t, τn ∧T ], and take the expectation under probability measure
Px,z,η , and let’s denote E = Ex,z,η , we have
h Z τn ∧T (c − Z )p i
h
i
s
s
e
(5.4)
V (t, x, z, η) ≥ E
ds + E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) .
p
t
Now, we follow the idea by Janeček and Sı̂rbu [9], let’s fix this pair of control choice (πt , ct ) ∈
A = Ax , where we denote Ax as the admissible space with initial endowment x. And for ∀ > 0, it is
clear that Ax ⊆ Ax+ , and (πt , ct ) ∈ Ax+ . Also it is clear that Xtx+ = Xtx + = Xt +, 0 ≤ t ≤ T .
Follow the same procedure above, and notice process Zt keeps the same under the consumption
policy ct , then under probability measure Px,z,η , we can obtain:
h Z τn ∧T (c − Z )p i
h
i
s
s
e
ds + E Ve (τn ∧ T, Xτn ∧T + , Zτn ∧T , µ̂τn ∧T ) .
(5.5)
V (t, x + , z, η) ≥ E
p
t
HABIT FORMATION AND PARTIAL OBSERVATIONS
15
By Monotone Convergence Theorem, we first know:
h Z τn ∧T (c − Z )p i
h Z T (c − Z )p i
s
s
s
s
(5.6)
lim E
ds = E
ds .
n→+∞
p
p
t
t
For simplicity, let’s denote Yt = Xt − w(t)Zt , we know by definition (4.36) that:
1
Ve (τn ∧ T, Xτn ∧T + , Zτn ∧T , µ̂τn ∧T ) = (Yτn ∧T + )p Nτ1−p
.
n ∧T
p
Proposition 5.1 gives Xt ≥ w(t)Zt for 0 ≤ t ≤ T under any admissible control pair (πt , ct ), we
know Yτn ∧T + ≥ > 0, ∀0 ≤ t ≤ T . Since also p < 0, we will have
sup(Yτn ∧T + )p < p < +∞.
(5.7)
n
Now from Remark A.1, we already derived that A(t; s) ≤ 0, ∀0 ≤ t ≤ s ≤ T . Combining this
with the fact that w(s), δ(s) are continuous and hence bounded on [0, T ] and when p < 0, we also
have 1 − a(t)Ω̂(t) > 0 and 1 − f (t)Ω̂(t) > 0 as well as a(t; s), b(t; s), c(t; s), f (t; s) and g(t; s) are
all bounded for 0 ≤ t ≤ s ≤ T , we deduce that the explicit solutions B(t; s) and C(t; s) are both
bounded on [0, T ], hence we have:
N (t, η) ≤ k1 exp(kη) for some large constants k, k1 > 1,
which shows the existence of some constants k̄, k̄1 > 1 such that
1−p
sup Nτ1−p
≤
sup
k
exp
k
µ̂
≤
k̄
exp
k̄
sup
µ̂
.
1
t
1
t
n ∧T
n
t∈[0,T ]
t∈[0,T ]
We recall that µ̂t satisfies the Ornstein Uhlenbeck diffusion (4.2), which gives:
Z t
Ω̂(u) + σS σµ ρ
dŴu .
µ̂t = e−tλ η + µ̄(1 − e−tλ ) +
eλ(u−t)
σS
0
Hence, there exists positive constants l and l1 > 1 large enough, such that:
sup µ̂t ≤ l + sup l1 Ŵt ,
t∈[0,T ]
t ∈ [0, T ].
t∈[0,T ]
Using the exponential distribution of maximum of the Brownian Motion, there exists some positive
constants ¯l > 1 and ¯l1 such that
h
i
h
i
¯l1 E exp sup ¯lŴt < +∞.
(5.8)
E supNτ1−p
≤
n ∧T
n
t∈[0,T ]
At last, by the above (5.7) and (5.8), we can conclude that
h
i
E sup Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) < +∞.
n
By virtue of Dominated Convergence Theorem, we can deduce:
h
i
h1
i
lim E Ve (τn ∧ T, Xτn ∧T + , Zτn ∧T , µ̂τn ∧T ) = E (YT + )p N (T, µ̂T )
n→+∞
p
h (X + )p i
h Xp i
T
=E
>E T .
p
p
16
XIANG YU
Combine this with equation (5.5), and notice the pair of control (πt , ct ) ∈ A, we will see that:
h Z T (c − Z )p
Xp i
s
s
e
V (t, x + , z, η) ≥ sup E
ds + T = V (t, x, z, η).
p
p
π,c∈A
t
Notice Ve (t, x + , z, η) is continuous in variable x, and since > 0 is arbitrary, we can take the limit
as:
Ve (t, x, z, η) = lim Ve (t, x + , z, η) ≥ V (t, x, z, η).
→0
On the other hand, for πt∗ and c∗t defined by (4.42) and (4.43) respectively, we first want to show
the SDE for wealth process:
dXt∗ = (πt∗ µt − c∗t )dt + σS πt∗ dŴt ,
(5.9)
0 ≤ t ≤ T,
with initial condition x > w(0)z has an unique strong solution and also satisfies Xt∗ > w(t)Zt∗ , ∀s ∈
[0, T ].
Denote Yt∗ = Xt∗ − w(t)Zt∗ , and apply Ito’s lemma and substitute πt∗ as defined by (4.42), we
can get:
h
i
dYt∗ = πt∗ µ̂t − c∗t − wt (t)Zt∗ − w(t)δ(t)c∗t + w(t)α(t)Zt∗ dt + πt∗ σS dŴt
(5.10)
h
= − wt (t) + w(t)α(t) Zt∗ − (1 + w(t)δ(t))c∗t +
+
Ω̂(t) + σS σµ ρ N
η
σS2
N
i
h
µ̂t Yt∗ dt +
µ̂t
+
(1 − p)σS
µ̂2t
Yt∗
2
(1 − p)σS
Ω̂(t) + σS σµ ρ N i
η
σS
N
Yt∗ dŴt .
Recall the definition of w(t) by (4.18) and substitute c∗t defined by (4.43) into (5.10) above, we
will further have
−p
1−p
h
1 + δ(t)w(t)
Ω̂(t)
+
σ
σ
ρ
2
S µ
Nη i ∗
µ̂t
dYt∗ = −
+
+
µ̂t Yt dt
N
N
(1 − p)σS2
σS2
h
Ω̂(t)
+
σ
σ
ρ
S µ
Nη i ∗
µ̂t
+
+
Y dŴt .
(1 − p)σS
σS
N t
In order to solve Xt∗ in a more explicit formula, we define the auxiliary process by:
Γt =
N (t, µ̂t )
,
Yt∗
∀0 ≤ t ≤ T.
By Ito’s lemma, we can derive the SDE for process Γt as:
2
h
Ω̂(t)
+
σ
σ
ρ
µ̂
Ω̂(t)
+
σ
σ
ρ
p
µ
t
µ
S
S
Γt
dΓt =
Nt − λ(µ̂t − µ̄)Nη +
Nηη +
Nη
2
2
N
2σS
(1 − p)σS
(5.11)
−p
i
h −µ̂
i
pµ̂2t
1−p
t
+
N
dt
+
Γ
dŴt .
+ 1 + δ(t)w(t)
t
(1 − p)σS
(1 − p)2 σS2
HABIT FORMATION AND PARTIAL OBSERVATIONS
17
Recall that N (t, η) satisfies the linear PDE (4.23), we can simplify (5.11) to be:
h
i
h −µ̂
i
pµ̂2t
t
dΓt = Γt
dt
+
Γ
dŴt .
t
(1 − p)σS
2(1 − p)2 σS2
Hence, we can finally get the above SDE has an unique strong solution as:
Z t
Z t
µ̂u
µ̂2u
du
−
d
Ŵ
,
Γt = Γ0 exp −
u
2
0 (1 − p)σS
0 2(1 − p)σS
N (0,η)
Initial condition Γ0 = x−w(0)z
> 0 implies Γt > 0, ∀ 0 ≤ t ≤ T . And, hence, we finally proved
that the SDE (5.9) has an unique strong solution defined by (4.44) and the solution Xt∗ satisfies
the wealth process constraint (5.1)
Now, we proceed to verify πt∗ and c∗t are actually in the admissible space A.
First, by the definition (4.42) and (4.43), it’s clear that πt∗ and c∗t are FtS progressively measurable,
and by the path continuity of Yt∗ = Xt∗ − w(t)Zt∗ , hence, of πt∗ and c∗t , it’s easy to show that:
Z T
Z T
∗ 2
(πt ) dt < +∞,
and
c∗t dt < +∞, a.s.
Xt∗
0
∗
w(t)Zt ,
0
Also, since
>
∀s ∈ [0, T ], by the definition of c∗t , we know the consumption constraint
c∗t > Zt∗ , ∀0 ≤ t ≤ T is satisfied. And hence (πt∗ , c∗t ) ∈ A.
Given the pair of control policy (πt∗ , c∗t ) as above, following the same steps and the definition of
stopping time τn , instead of (5.4), we can now instead get the equality:
h
i
h Z τn ∧T (c∗ − Z ∗ )p i
t
t
e
dt + E Ve (τn ∧ T, Xτ∗n ∧T , Zτ∗n ∧T , µ̂τn ∧T ) .
V (t, x, z, η) = E
p
t
And hence, we apply Monotone Convergence Theorem again:
h Z T (c∗ − Z ∗ )p i
h Z τn ∧T (c∗ − Z ∗ )p i
t
t
t
t
dt = E
dt ,
lim E
n→+∞
p
p
t
t
When p < 0, we have function Ve (t, x, z, η) < 0 by it’s definition, and by Fatou’s lemma,
h
i
h
i
h (X ∗ )p i
T
lim E Ve (τn ∧ T, Xτ∗n ∧T , Zτ∗n ∧T , µ̂τn ∧T ) ≤ E Ve (T, XT∗ , ZT∗ , µ̂T ) = E
.
n→+∞
p
Therefore, it gives
hZ
e
V (t, x, z, η) = E
T
t
(XT∗ )p i
(c∗t − Zt∗ )p
dt +
≤ V (t, x, z, η)
p
p
which completes the whole proof.
5.2. The Case: 0 < p < 1 .
We proceed to prove the following two Lemmas which play important roles in the proof of the
second part of our main result.
Lemma 5.1. If constant k > 0 satisfies:
(5.12)
k<
λ2 σS2
2(Θ + σS σµ ρ)2
18
XIANG YU
there exists a constant Λ1 independent of t, such that
i
h
Z t
k µ̂2s ds ≤ Λ1 < +∞, ∀t ∈ [0, T ].
E exp
0
2
Proof. It is easy to choose an
increasing
sequence
of smooth functions Qn (y) % ky as n → ∞ such
0
00 that 0 ≤ Qn (y) ≤ n with Qn (y) and Qn (y) uniformly bounded. And for each fixed t ∈ [0, T ]
and η, we define:
i
h
Z t
Qn (µ̂s )ds ,
φ(t, η) = E exp
0
where µ̂0 = η.
Similar to the proof of Feynman-Kac formula, the function φ(t, η) is a classical solution of the
linear parabolic equation:
2
Ω̂(t) + σS σµ ρ
(5.13)
φt =
φηη − λ(η − µ̄)φη + Qn (η)φ,
2σS2
with initial condition φ(0, η) = 1. See also Lemma 1.12 in Pang [14] for details.
First, it’s clear that constant 0 is a subsolution of the above equation. Moreover, under assumption (5.12), it’s easy to show that for each t ∈ [0, T ], the equation:
2
2 Ω̂(t) + σS σµ ρ
x2 − 2λx + k = 0
σS2
λ−
has two positive real roots x1 =
positive constant a such that:
r
2(Ω̂(t)+σS σµ ρ)2
λ+ λ2 −
k
2
r
2(Ω̂(t)+σS σµ ρ)2
k
λ2 −
2
σ
S
(Ω̂(t)+σS σµ ρ)2
2
σ2
S
r
λ + λ2 −
0<a<
2
σ
S
(Ω̂(t)+σS σµ ρ)2
2
σ2
S
and x2 =
(Θ2 +σS σµ ρ)2
k
2
σS
(Θ1 +σS σµ ρ)2
2
σS
. And for any
,
with Θ1 = max(θ, θ∗ ) and Θ2 = min(θ, θ∗ ), and the positive constant b such that:
b>a
(Θ1 + σS σµ ρ)2
λ2 µ̄2 a2
−
,
2
(Θ +σ σ ρ)2
σS
2a2 1 σS2 µ − 2aλ + k
S
aη 2 )
it’s easy to verify that f (t, η) = exp(bt +
satisfies:
2
Ω̂(t) + σS σµ ρ
ft ≥
fηη − λ(η − µ̄)fη + kη 2 ,
2σS2
with the initial condition f (0, η) ≥ 1.
And since Qn (η) < kη 2 , we get function f (t, η) is the supersolution of the equation (5.13), and
0, f (t, η) is the coupled subsolution and supersolution. Theorem 7.2 from Pao [15] shows that
2
function φ(t, η) satisfies: 0 ≤ φ(t, η) ≤ f (t, η) ≤ ebT +aη ≡ Λ1 , and hence Monotone Convergence
Theorem leads to:
h
Z t
i
E exp
k µ̂2s ds ≤ Λ1 < +∞, ∀t ∈ [0, T ].
0
HABIT FORMATION AND PARTIAL OBSERVATIONS
19
Lemma 5.2. If constant k̄ > 0 satisfies
(5.14)
k̄ <
λσS2
,
(Θ + σS σµ ρ)2
for fixed constant κ > 0, there exists a constant Λ2 independent of t, and
h
i
E exp k̄(µ̂t + κ)2 ≤ Λ2 < ∞, t ∈ [0, T ].
Proof. Similar to the proof of Lemma 5.1, we again construct an increasing sequence of functions
{Qn (y)} for n ∈ N such that limn→+∞ Qn (y) = k̄(y + κ)2 . And for each fixed t ∈ [0, T ] and η, we
define:
h
i
ψ(t, η) = E exp Qn (µ̂t ) ,
where µ̂0 = η.
Then a direct corollary of Theorem 5.6.1 of Friedman [8] gives the function ψ(t, η) is a classical
solution of the linear parabolic equation:
2
Ω̂(t) + σS σµ ρ
ψηη − λ(η − µ̄)ψη ,
(5.15)
ψt =
2σS2
with initial condition ψ(0, η) = eQn (η) .
Under assumption (5.14), and choose any constant a such that
k̄ < a <
where x1 =
2
λσS
(Ω̂(t)+σS σµ ρ)2
λσS2
,
(Θ + σS σµ ρ)2
is one real root of the algebraic equation:
2
2 Ω̂(t) + σS σµ ρ
σS2
x2 − 2λx = 0
for each fixed t ∈ [0, T ]. And choose any positive constant b such that
2
2
2a κ(Θ+σS σµ ρ)2
−
aλκ
−
aλµ̄
2
2
2
2
2
(Θ + σS σµ ρ)
2a (Θ + σS σµ ρ) κ
σS
+
+ 2aλµ̄κ −
,
b>a
2
2
(Θ+σS σµ ρ)2
σS
σS
2a2
− 2aλ
2
σS
It is easy to verify that f (t, η) = exp(bt + a(η + κ)2 ) satisfies
2
Ω̂(t) + σS σµ ρ
ft ≥
fηη − λ(η − µ̄)fη ,
2σS2
2
with the initial condition f (0, η) = ea(η+κ) ≥ ψ(0, η), hence we get the function f (t, η) is the
supersolution of the equation (5.15), and it is trivial to show g(t, η) ≡ 0 is the subsolution, there
fore 0, f (t, η) are the coupled subsolution and supersolution. Again by Theorem 7.2 from Pao
2
[15], that function ψ(t, η) satisfies: 0 ≤ ψ(t, η) ≤ f (t, η) ≤ ebT +a(η+κ) ≡ Λ2 , hence Monotone
20
XIANG YU
Convergence Theorem implies:
h
i
E exp k̄(µ̂ + κ)2t
≤ Λ2 < +∞, ∀t ∈ [0, T ].
(The PROOF OF THEOREM 4.2, CONTINUED).
For any pair of admissible control (πt , ct ) ∈ A, similar to the case for p < 0, choose the same
localizing sequence τn such that
h Z τn ∧T (c − Z )p i
h
i
s
s
e
(5.16)
V (t, x, z, η) ≥ E
ds + E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) .
p
t
Now, by monotone convergence theorem, we first know:
h Z T (c − Z )p i
h Z τn ∧T (c − Z )p i
s
s
s
s
ds = E
ds .
lim E
n→+∞
p
p
t
t
And for 0 < p < 1, Ve (t, Xt , Zt , µ̂t ) ≥ 0 for all t ∈ [0, T ] by the definition (4.36) and (5.1), and
Fatou’s lemma yields that:
h
i
h
i
h Xp i
lim E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) ≥ E Ve (T, XT , ZT , µ̂T ) = E T ,
n→+∞
p
which implies that:
hZ
Ve (t, x, z, η) ≥ sup E
π,c∈A
πt∗
t
T
Xp i
(cs − Zs )p
ds + T = V (t, x, z, η).
p
p
c∗t
On the other hand, for the
and
defined by (4.42), (4.43), again follow the same procedure
in the proof for case p < 0, we can show πt∗ and c∗t are actually in the admissible space A.
Now, by policies πt∗ and c∗t , similarly, we can now get the equality:
h Z τn ∧T (c∗ − Z ∗ )p i
h
i
s
s
e
V (t, x, z, η) = E
ds + E Ve (τn ∧ T, Xτ∗n ∧T , Zτ∗n ∧T , µ̂τn ∧T ) .
p
t
By the definition of Ve (t, x, z, η) and Holder inequality, we know that:
h
i
h Y ∗ p
i
N (T ∧ τn , µ̂T ∧τn )
E Ve (T ∧ τn , XT∗ ∧τn , ZT∗ ∧τn , µ̂T ∧τn ) ≤ k1 E
N T ∧τn
Z T ∧τn
h
Z T ∧τn 2pµ̂
i 1
2p2 µ̂2u
2
u
≤ k2 E exp
dŴu −
du
2σ2
(1
−
p)σ
(1
−
p)
S
0
0
S
h
Z T ∧τn (p2 + p)µ̂2 i 1
2
u
· E exp
du N 2 (T ∧ τn , µ̂T ∧τn )
2σ2
(1
−
p)
0
S
for some positive constants
K1 , K2 , which are independent
of n.
R
R t 2p2 µ̂2u
t 2pµ̂u
Notice that exp 0 (1−p)σS dŴu − 0 (1−p)2 σ2 du is a positive local martingale, therefore, a
S
supermartingale, we can apply the bounded stopping theorem and get:
Z T ∧τn
h
Z T ∧τn 2pµ̂
i
2p2 µ̂2u
u
E exp
dŴu −
du
≤ 1.
(1 − p)σS
(1 − p)2 σS2
0
0
HABIT FORMATION AND PARTIAL OBSERVATIONS
Thus, we can get:
hZ
e
V (t, x, z, η) ≤ E
τn ∧T
t
21
i 1
Z T ∧τn (p2 + p)µ̂2 (cs − zs )p i h
2
u
2
du
N
(T
∧
τ
,
µ̂
)
.
ds + E exp
n
T
∧τ
n
2
2
p
(1 − p) σS
0
And now, we observe that:
h
Z
sup exp
T ∧τn
i
(p2 + p)µ̂2u 2
du
N
(T
∧
τ
,
µ̂
)
n
T
∧τ
n
(1 − p)2 σS2
n
0
h
i
h
Z T ∧τn (p2 + p)µ̂2 i
u
2
du
·
sup
≤ sup exp
N
(T
∧
τ
,
µ̂
)
n T ∧τn
(1 − p)2 σS2
n
n
0
Z T (p2 + p)µ̂2 h
i
u
2
du
·
sup
≤ exp
N
(T
∧
τ
,
µ̂
)
n
T
∧τ
n
2 2
n
0 (1 − p) σS
Z
T (2p2 + 2p)µ̂2 h
i
u
4
du
+
sup
N
(T
∧
τ
,
µ̂
)
.
≤ exp
n
T
∧τ
n
(1 − p)2 σS2
n
0
Hence, we have:
n
h
Z
≤ E exp
0
T
T ∧τn
i
(p2 + p)µ̂2u 2
du N (T ∧ τn , µ̂T ∧τn )
(1 − p)2 σS2
0
i
h
(2p2 + 2p)µ̂2u i
4
N
(T
∧
τ
,
µ̂
)
.
du
+
E
sup
n
T
∧τ
n
(1 − p)2 σS2
n
h
Z
E sup exp
We first recall that under Assumption (4.33), Lemma 4.1 implies that there exists constants k,
k1 such that
2
N (t, η) ≤ keK̄1 (η+k1 ) ,
where A(t; s) ≤ K̄1 for all 0 ≤ t ≤ s ≤ T , and hence we have
2
sup N 4 (T ∧ τn , µ̂T ∧τn ) ≤ sup ke4K̄1 (µ̂t +k1 ) .
n
t∈[0,T ]
Then we just need to show that
h
i
2
E sup ke4K̄1 (µ̂t +k1 ) < ∞.
(5.17)
t∈[0,T ]
Define ϕ(x) , e4K̄1 (x+k1
)2
and apply Ito’s lemma, we have
2
2
h
Ω̂(t) + σS σµ ρ Ω̂(t) + σS σµ ρ i
dϕ(µ̂t ) =ϕ(µ̂t ) − 8K̄1 λ − 8K̄12
µ̂2t + 8K̄1 λµ̄µ̂t − 4K̄1
dt + dLt
σS2
σS2
≤ϕ(µ̂t )k2 dt + dLt ,
where k2 > 0 is a upper bound constant, and the local martingale part is:
Ω̂(t) + σS σµ ρ
dLt , ϕ(µ̂t )8K̄1 µ̂t
dŴt .
σS
From which we can derive that
Z
h
i
E sup ϕ(µ̂t ) ≤ ϕ(η) +
t∈[0,T ]
0
T
h
i
h
i
k2 E sup ϕ(µ̂s ) dt + E sup Lt ,
s∈[0,t]
t∈[0,T ]
22
XIANG YU
Burhholder-Davis-Gundy Inequality and Jensen’s Inequality imply that
2
h
i
hZ T
i 1
Ω̂(t) + σS σµ ρ
2
2 8K̄1 (µ̂t +k1 )2
64K̄12
E sup Lt ≤ k3 E
µ̂
e
dt ,
t
2
σS
0
t∈[0,T ]
However, under Assumption (4.40) and by Lemma 5.2, there exists a constant Λ2 such that
h
i
h
i
2
2
E µ̂2t e8K̄1 (µ̂t +k1 ) ≤ E e(8K̄1 +1)(µ̂t +k1 ) ≤ Λ2 < ∞, t ∈ [0, T ].
h
i
Hence we obtain the boundedness E sup Lt ≤ k4 < ∞ for some constant k4 , and
t∈[0,T ]
Z
h
i
E sup ϕ(µ̂t ) ≤ ϕ(η) +
t∈[0,T ]
0
T
h
i
k2 E sup ϕ(µ̂s ) dt + k4 ,
s∈[0,t]
The Gronwall’s Inequality verifies (5.17).
For the first part, by coefficients Assumption (4.39), we can therefore apply Lemma 5.1, and it
yields that:
h
Z T (2p2 + 2p)µ̂2 i
u
E exp
du < Λ1 < +∞,
2σ2
(1
−
p)
0
S
for some constant Λ1 > 0. We can then get that:
h
i
sup E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) < Λ < ∞
n
for some constant Λ > 0, independent of n. Therefore, Dominated Convergence Theorem leads to:
i
h (X ∗ )p i
h
T
,
lim E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) = E
n→+∞
p
together with Monotone Convergence Theorem, we deduce
h Z T (c∗ − Z ∗ )p
(XT∗ )p i
s
s
e
V (t, x, z, η) = E
ds +
≤ V (t, x, z, η),
p
p
t
which completes the whole proof.
Acknowledgements. I sincerely thank my advisor Mihai Sı̂rbu for numerous helpful comments
on the topic of this paper as well as reading the manuscript of it.
Appendix A. Fully Explicit Solutions
Follow the arguments by Kim and Omberg [10], we can even solve the auxiliary ODEs (4.27),
(4.28), (4.29), (4.30) and (4.31) fully explicitly depending on the risk aversion constant p and all
the market coefficients σS , σµ , λ, ρ:
HABIT FORMATION AND PARTIAL OBSERVATIONS
23
A.1. The Normal Solution. The condition for the Normal solution is
∆ , λ2 −
(A.1)
and then we define:
√
ξ=
∆=
q
pσµ2
2λpρσµ
−
> 0,
(1 − p)σS2
(1 − p)σS2
(1 − p + pρ2 ) 2
σµ ,
1−p
p
γ3 =
,
(1 − p)σS2
γ22 − γ1 γ3 ,
γ1 =
pρσµ
,
γ2 = −λ +
(1 − p)σS
q
(1 − ρ2 )σµ2 + (λσS + ρσµ )2
ξ1 =
.
σS
We can solve the equations (4.27), (4.28), (4.29), (4.30) and (4.31) as:
a(t; s) =
b(t; s) =
c(t; s) =
p(1 − e2ξ(t−s) )
h
2(1 − p)σS2 2ξ − (ξ + γ2 )(1 − e2ξ(t−s) )
i,
pλµ̄(1 − eξ(t−s) )2
h
i,
2(1 − p)σS2 ξ 2ξ − (ξ + γ2 )(1 − e2ξ(t−s) )
p
2(1 − p)σS2
−
λ2 µ̄2
ξ2
−
σµ2
ξ + γ2
h
i
pλ2 µ̄2 (ξ + 2γ2 )e2ξ(t−s) − 4γ2 eξ(t−s) + 2γ2 − ξ
h
i
(s − t) +
2(1 − p)σS2 ξ 3 2ξ − (ξ + γ2 ) 1 − e2ξ(t−s)
2ξ − (ξ + γ )(1 − e2ξ(t−s) ) pσµ2
2
log
,
2ξ
2(1 − p)σS2 (ξ 2 − γ22 )
1 − e2ξ1 (t−s)
1
,
2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s)
(σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) 1
µ
µ
S 1
S
S 1
S
g(t; s) = log
ξ
(t−s)
1
2
2σS ξ1 e
ρσµ p
2
) + (σS ξ − λσS −
(σS ξ + λσS + 1−p
(1 − p)(1 − ρ )
−
log
2
2(1 − p + pρ )
2σS ξeξ(t−s)
2
ρσµ (s − t)
ρ λ(s − t)
−
−
,
2
2(1 − p + pρ ) 2(1 − p + pρ2 )σS
f (t; s) = −
ρσµ p 2ξ(t−s) 1−p )e
The condition for the bounded Normal solution is
(A.2)
γ3 > 0, or γ1 > 0, or γ2 < 0.
The condition for the explosive solution and the critical point is
γ3 < 0, γ1 < 0, and γ2 > 0,
γ + ξ 1
2
s−t=
log
.
2ξ
γ2 − ξ
Remark A.1. By observation, if p < 0, the conditions (A.1) and (A.2) hold, and we have a(t; s) ≤ 0
is a bounded solution as well as 1 − 2a(t; s)Ω̂(t) > 1 > 0 and 1 − f (t; s)Ω̂(t) > 1 > 0, hence we can
24
XIANG YU
finally conclude the solutions of ODEs (4.24), (4.25), (4.26) are all bounded on 0 ≤ t ≤ s ≤ T . We
a(t)
also notice that A(t) =
≤ 0, on 0 ≤ t ≤ s ≤ T .
(1−p)(1−2a(t)Ω̂(t))
A.2. The Hyperbolic Solution. The condition for the Hyperbolic solution is
∆ , λ2 −
pσµ2
2λpρσµ
−
= 0,
(1 − p)σS2
(1 − p)σS2
together with
pρσµ
6= 0,
(1 − p)σS
Then we can solve (4.27), (4.28), (4.29), (4.30) and (4.31) as:
γ2 = −λ +
a(t; s) =
−1
2γ1 (s − t −
b(t; s) = −
c(t; s) =
1
γ2 )
−
2λµ̄
4γ1 γ2 (s − t −
γ2 σµ2 (s
2γ1
− t)
+
γ2
,
2γ1
1
γ2 )
−
γ2 λµ̄(s − t +
λ2 µ̄2 γ22 (s
1
γ2 )
2γ1
−t−
24γ1 (s − t
4
γ2 )(s −
− γ12 )
,
t)3
+
σµ2 log (s − t)γ2 − 2
2γ1
,
1 − e2ξ1 (t−s)
1
,
2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s)
(σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) (λσ + ρσ )
1
µ
µ
µ
S
S 1
S
S 1
S
−
g(t; s) = log
(s − t)
2
2σS
2σS ξ1 eξ1 (t−s)
i
σµ2 (1 − ρ2 ) h
+
log 1 + γ2 (t − s) − γ2 (s − t) .
2γ1
The condition for the bounded Hyperbolic solution is
f (t; s) = −
γ2 < 0.
The condition for the explosive solution and the critical point is
γ2 > 0,
and s − t =
1
.
γ2
A.3. The Polynomial solution. The condition for the Polynomial solution is
∆ , λ2 −
pσµ2
2λpρσµ
−
= 0,
(1 − p)σS2
(1 − p)σS2
together with
pρσµ
= 0,
(1 − p)σS
Then we can solve (4.27), (4.28), (4.29), (4.30) and (4.31) as:
p
a(t; s) =
(s − t),
2(1 − p)σS2
p
b(t; s) =
λµ̄(s − t)2 ,
2(1 − p)σS2
p
p
c(t; s) = −
σµ2 (s − t)2 +
λ2 µ̄2 (s − t)3 ,
2
4(1 − p)σS
6(1 − p)σS2
γ2 = −λ +
HABIT FORMATION AND PARTIAL OBSERVATIONS
25
1 − e2ξ1 (t−s)
1
,
2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s)
(σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) (λσ + ρσ )
1
µ
µ
µ
S 1
S
S 1
S
S
(s − t)
g(t; s) = log
−
2
2σS
2σS ξ1 eξ1 (t−s)
σµ2 (1 − ρ2 )p
−
(s − t)2 .
4(1 − p)σS2
All Polynomial solutions are well behaved.
f (t; s) = −
A.4. The Tangent solution. The condition for the Tangent solution is
∆ , λ2 −
Now, we define
ζ=
√
pσµ2
2λpρσµ
−
< 0,
(1 − p)σS2
(1 − p)σS2
−∆,
$ = tan−1
γ 2
,
ζ
Then we can solve (4.27), (4.28), (4.29), (4.30) and (4.31) as:
γ2
ζ
tan ζ(s − t) + $ −
,
a(t; s) =
2γ1
2γ1
i
λµ̄ h
b(t; s) =
− 1 − tan($) tan(ζ(s − t) + $) + sec($) sec(ζ(s − t) + $) ,
γ1
p
i
2λ2 µ̄2 γ2 γ22 + ζ 2 h
c(t; s) =
sec ($) − sec (ζ(s − t) + $)
2γ1 ζ
i
2
2
λ µ̄ (2γ2 + ζ 2 ) h
+
tan(ζ(s
−
t)
+
$)
−
tan($)
2γ1 ζ 3
2
2
λ µ̄ (γ22 + ζ 2 ) − γ2 ζ 2 σµ2
σµ2
−
+
log
sec($)
cos(ζ(s
−
t)
+
$)
,
2γ1 ζ 2
2γ1
1
1 − e2ξ1 (t−s)
,
2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s)
(σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) (λσ + ρσ )
1
µ
µ
µ
S
S 1
S
S 1
S
−
g(t; s) = log
(s − t)
ξ
(t−s)
1
2
2σ
2σS ξ1 e
S
cos(ζ(t − s) + $) i
h 1
γ2
log (s − t) .
− σµ2 (1 − ρ2 )
−
2γ1
cos($)
2γ1
All Tangent solutions are explosive solutions and the critical point is
f (t; s) = −
s−t=
π
1
γ2
− tan−1 ( ).
2ζ
ζ
ζ
References
[1] T. Björk, M. H. A. Davis, and C. Landén, Optimal investment under partial information, Math. Methods Oper.
Res. 71 (2010), no. 2, 371–399.
[2] S. Brendle, Portfolio selection under incomplete information, Stochastic Process. Appl. 116 (2006), no. 5, 701–
723.
26
XIANG YU
[3] M. J. Brennan, The role of learning in dynamic portfolio decisions, European Finance Review 1 (1998), no. 3,
295–306.
[4] G.M. Constantinides, Habit formation: a resolution of the equity premium puzzle, Working paper series, Center
for Research in Security Prices, Graduate School of Business, University of Chicago, 1988.
[5] J. B. Detemple and F. Zapatero, Asset prices in an exchange economy with habit formation, Econometrica 59
(1991), no. 6, 1633–57.
, Optimal consumption-portfolio policies with habit formation, Mathematical Finance 2 (1992), no. 4,
[6]
251–274.
[7] N. Egglezos and I. Karatzas, Utility maximization with habit formation: Dynamic programming and stochastic
pdes, Siam Journal on Control and Optimization 48 (2009), 481–520.
[8] A. Friedman, Stochastic differential equations and applications, Vol. 1., Academic Press, New York., 1975.
[9] K. Janeček and M. Sı̂rbu., Optimal investment with high-watermark performance fee, (2011), submitted for
publication.
[10] T.S. Kim and E. Omberg, Dynamic nonmyopic portfolio behavior, Review of Financial Studies 9 (1996), no. 1,
141–161.
[11] P. Lakner, Optimal trading strategy for an investor: the case of partial information, Stochastic Processes and
their Applications 76 (1998), no. 1, 77–97.
[12] R. Mehra and E. C. Prescott, The equity premium: A puzzle, Journal of Monetary Economics 15 (1985), no. 2,
145 – 161.
[13] C. Munk, Portfolio and consumption choice with stochastic investment opportunities and habit formation in
preferences, Journal of Economic Dynamics and Control 32 (2008), no. 11, 3560–3589.
[14] T. Pang, Stochastic control theory and its applications to financial economics, Ph.D thesis, Brown University,
Providence, Rhode Island (2002).
[15] C. V. Pao, Nonlinear parabolic and elliptic equations, Plenum Press, New York, 1992.
[16] M. Schroder and C. Skiadas, An isomorphism between asset pricing models with and without linear habit formation, Review of Financial Studies 15 (2002), no. 4, 1189–1221.
[17] Y. Xia, Learning about predictability: The effects of parameter uncertainty on dynamic asset allocation, The
Journal of Finance 56 (2001), no. 1, 205–246.
Xiang Yu, Department of Mathematics, The University of Texas at Austin, USA
E-mail address: xiangyu@math.utexas.edu
Download