OPTIMAL PORTFOLIO-CONSUMPTION WITH HABIT FORMATION AND PARTIAL OBSERVATIONS: THE FULLY EXPLICIT SOLUTIONS APPROACH XIANG YU Abstract. We consider a model of optimal investment and consumption with both habit-formation and partial observations in incomplete Itô processes markets. The individual investor develops addictive consumption habits gradually while he can only observe the market stock prices but not the instantaneous rates of return, which follow Ornstein-Uhlenbeck processes. Applying the Kalman-Bucy filtering theorem and Dynamic Programming arguments, we solve the associated HJB equation fully explicitly for this path dependent stochastic control problem in the case of power utility preferences. We will provide the optimal investment and consumption policies in explicit feedback forms using rigorous verification arguments. 1. Introduction Habit Formation preference has become a popular alternative tool of traditional time separable von Neumann-Morgenstern utilities on consumption optimization problems during recent years. It has been observed that the time additivity property of traditional preferences is not consistent with many empirical experiments, for instance, the celebrated Equity Premium Puzzle. (see Mehra and Prescott [12] and Constantinides [4]). The literature in financial economics has been arguing the past consumption pattern has the impact on individual’s current consumption decisions, and the preference should depend both on the consumption rate process and the corresponding consumption history integral. In particular, the path dependent linear habit formation preference RT E[ 0 U (t, ct − Zt )dt] has been widely accepted, where index Zt stands for the accumulative consumption history. The definition that instantaneous utility function is decreasing in Zt indicates that an increase in consumption today increases current utility but depresses all future utilities through the induced increase in future standards of living. Various works studied the continuous time habit-formation optimization problem in complete Itô processes markets under full information. Detemple and Zapatero [5] and [6] employed the martingale approach and derived the closed form optimal consumption by establishing some recursive stochastic differential equations. In the same framework, Schroder and Skiadas [16] made an insightful observation that under some appropriate assumptions, to solve the optimal portfolio seRT lection with utilities incorporating linear habit formation E[ 0 U (t, ct − Zt )dt] is equivalent to solve RT the time additive utility maximization E[ 0 U (t, c0t )dt] in some isomorphic financial markets with∗ ∗ out habit formation, where the optimal policies are related as c0∗ t = ct − Zt . Munk [13] adopts the Date: December 12, 2011. (The Preliminary Version). 2010 Mathematics Subject Classification. Primary 91G10; secondary 93E11 93E20. Key words and phrases. Utility maximization, consumption habit formation, Kalman-Bucy Filtering, pathdependent stochastic control, auxiliary ODEs, verification lemma. 1 2 XIANG YU Market Isomorphism result in his complete market model with stochastic interests rates processes, and provided closed form optimal strategies in some special cases. Taking advantage of the first order condition in the same complete market, possibly non-Markovian, Egglezos and Karatzas [7] exploited the interplay of stochastic partial differential equations to the utility maximization with linear addictive habit formation, and obtained some stochastic feedback formulae for the optimal portfolio and consumption choices. Although many nice results have already been explored, the problem remains open when the financial market turns into incomplete. In our present work, we are considering the habit forming investor in incomplete Itô processes financial markets, together with additional partial observations constraint. We are facing the case that the individual investor develops his own consumption habits during the whole investment process and meanwhile has only access to the stock price information FS . In other words, he can not observe the mean rate of return process µt and the corresponding Brownian motion Wt which appeared in the stock price dynamics. In our model, we will assume µt follows the mean reverting Ornstein Uhlenbeck process driven by another Brownian motion Bt . Optimal investment problems under incomplete information have been studied by numerous authors, and we only list a very small amount of them: Lakner [11] applied martingale methods and derived the structure of the optimal investment strategies. The linear diffusion model is studied by Brendle [2], who derived explicit results for the value of information on optimal investment with power and exponential utilities using dynamic programming approach. The effects of learning on the composition of the optimal portfolios are studied in Brennan [3] and Xia [17]. Björk, Davis and Landén [1] considered the market model with unobservable rates of returns that are allowed to be arbitrary semimartingales, and they provided a unified treatment of a large class of partially observed investment problems and explicit representation of the optimal terminal wealth and portfolio strategies. Our contributions are two folds. From the modeling perspective, we are considering the utility maximization problem with consumption habit formation in the incomplete financial markets setting, where there exist Brownian motions Wt and Bt that are not perfectly correlated. Together, our individual investor withstands the incomplete information restriction, that he only has access to the public stock prices. The combination of these two scenarios is not only a novel framework and mathematically interesting, but also it covers many economically realistic constraints that the individual investor is facing in his daily lives. On the other hand, mathematically speaking, we solved the relatively complicated nonlinear HJB equation in a fully explicit form, and derived the FtS -adapted optimal investment and consumption polices in feedback form via rigorous verification arguments. The structure of the present paper is as follows: Section 2 introduces the market model and the concept of habit formation process. Section 3 briefly reviewed the stochastic filtering result and Kalman Bucy filtering Theorem. The utility maximization problem with addictive habit formation and partial observations is defined in Section 4. By applying the Kalman Bucy filtering theorem and Dynamic Programming Principle, we formally derived the Hamilton-Jacobi-Bellman(HJB) equation for the power utility preference, and we provide the decoupled form solution of this nonlinear PDE, which reduces the algorithm to solving some auxiliary ODEs with constant coefficients. Based HABIT FORMATION AND PARTIAL OBSERVATIONS 3 on these classical solutions, the explicit feedback form of the optimal investment and consumption policies will be obtained. Section 5 contains the rigorous proofs of the verification arguments. At last, four cases of fully explicit solutions of some auxiliary ODEs are presented in the Appendix A. 2. Market Model and Consumption Habit Formation On a probability space (Ω, F, P) equipped with the filtration F = (Ft )0≤t≤T , we consider a financial market with one riskfree bond and one stock account for a “small investor” in a finite time horizon [0, T ]. The price of the bond St0 solves: dSt0 = rt St0 dt, 0≤t≤T with initial price S00 = 1, and without loss of generality, we assume the interest rate rt ≡ 0, for all t ∈ [0, T ], this can be achieved by the standard change of numéraire. The stock price St is modeled as a diffusion process solving: (2.1) dSt = µt St dt + σS St dWt , 0 ≤ t ≤ T, with S0 = s > 0, where the drift process µt is Ft -progressively measurable, and satisfying the mean-reverting Ornstein Uhlenbeck SDE: (2.2) dµt = −λ(µt − µ̄)dt + σµ dBt , 0 ≤ t ≤ T, Wt and Bt are Brownian motions defined on (Ω, F, P) and they are correlated with the coefficient ρ ∈ [−1, 1]. We assume the initial value of the drift process µ0 is a F0 measurable Gaussian random variable, satisfying µ0 ∼ N(η, θ), which is independent of Brownian motions (Bt )0≤t≤T and (Wt )0≤t≤T . We also assume all the coefficients σS , λ, µ̄, σµ ≥ 0 are nonnegative constants. Now, at each time t ∈ [0, T ], the investor chooses a consumption rate ct ≥ 0, and decides the amounts πt of her wealth to invest in the stock, and the rest in bank. Then, in this self-financing market model, the investor’s total wealth process Xt will follow the dynamics: (2.3) dXt = (πt µt − ct )ds + σS πt dWt , 0 ≤ t ≤ T, with the initial wealth X0 = x > 0. In this paper, we adopt the notation Zt = Z (t; ct ) as “Habit Formation” process or “the standard of living” process to describe his “consumption habits level” . We assume the accumulative index Zt follows the dynamics: (2.4) dZt = (δ(t)ct − α(t)Zt )dt, 0 ≤ t ≤ T, where Z0 = z ≥ 0 is called the initial habit, and α(t), δ(t) are nonnegative continuous functions. Equivalently, (2.4) stipulates Z t Rt Rt (2.5) Zt = ze− 0 α(u)du + δ(u)e− u α(v)dv cu du, 0 ≤ t ≤ T. 0 Zt is defined as an exponentially weighted average of the initial habit and the past consumption. Here, functions α(t) and δ(t) measure, respectively, the persistence of the past level and the intensity 4 XIANG YU of consumption history. In our current work, we are considering the case of “addictive habits”, i.e., we require investor’s current consumption strategies shall never fall below the standard of living level, (2.6) ct ≥ Zt , ∀0 ≤ t ≤ T, a.s.. We will see in the future that this additional consumption budget constraint implies initial wealth must be sufficiently large to sustain habits and ensure the existence of optimal policies. 3. Stochastic filtering with partial observations From now on, we shall make the assumption that the investor can observe the stock price process St which are published and available to the public, however, the drift process µt and the information of Brownian motions (Wt )0≤t≤T and (Bt )0≤t≤T are unknown. We shall call this as the “partial observations information” scenario. This investment and consumption optimization problem with incomplete information will be modeled by requiring the investment strategy (πt )0≤t≤T and consumption policy (ct )0≤t≤T must be only adapted to the partial observation filtration FS = (FtS )0≤t≤T where FtS = σ{Su : 0 ≤ u ≤ t}, which is strictly smaller than the background full information F = (Ft )0≤t≤T . To this end, we need first to prepare the stochastic filtering theorem. 3.1. General filtering model. Consider a probability space (Ω, F, P) equipped with a background filtration F = (Ft )0≤t≤T . All processes are assumed to be F progressively measurable. The stochastic signal state process Rt : Ω −→ Rn is given by the SDE dRt = A(t, Rt )dt + C(t, Rt )dBt , 0 ≤ t ≤ T, where Bt denotes a Ft -Brownian Motion. We assume Rt is not directly observable. And an observation process Ht is gives as: dHt = D(t, Rt )dt + E(t, Rt )dWt , 0 ≤ t ≤ T, where, Wt is another Ft -Brownian motion, correlated with Bt with the coefficient ρ ∈ [−1, 1]. The filtering problem is to find the best estimate R̂t based on the observation of Ht : which means to compute the conditional distribution of the signal process R̂t , given the observation filtration F̂t = σ(Hs : 0 ≤ s ≤ t): i h P Rt ∈ AF̂t , 0 ≤ t ≤ T, for A ∈ F̂T , equivalently, to compute the conditional expectation: i h E f (Rt )F̂t , 0 ≤ t ≤ T, for a suitable class of test functions f : R → R. HABIT FORMATION AND PARTIAL OBSERVATIONS 5 3.2. The Kalman Bucy filtering. For simplicity, we look at the one dimensional signal state process Rt and observation process Ht , defined by linear SDEs: dRt = A(t)Rt dt + C(t)dBt (3.1) dHt = D(t)Rt dt + E(t)dWt , 0 ≤ t ≤ T, where, A(t), C(t) 6= 0, D(t), E(t) 6= 0 are deterministic functions satisfying the integrability condition: Z Th i |A(t)| + |D(t)| + C 2 (t) + E 2 (t) dt < +∞. 0 Suppose R0 is a Gaussian random variable with R0 ∼ N (µ, θ), which is independent of Brownian motions (Bt )0≤t≤T and (Wt )0≤t≤T . And H0 = s, where s is a constant. One can see (Rt , Ht ) is a Gaussian vector process, so the conditional distribution of Rt given F̂t = σ(Hs : 0 ≤ s ≤ t) is also Gaussian, and the entire distribution is determined by its mean R̂t and variance Ω̂t : h i R̂t = E Rt F̂t , i h Ω̂t = E (Rt − R̂t )2 F̂t , with initial values h i h i R̂0 = E R0 F̂0 = E R0 = µ i i h i h h Ω̂0 = E (R0 − R̂0 )2 F̂0 = E (R0 − µ)2 = Var R0 = θ. And the linear filtering problem becomes to compute the R̂t and Ω̂t given their initial values. The celebrated Kalman-Bucy filter theorem gives us the algorithm for these estimations. Theorem 3.1 (One-dimensional Kalman-Bucy Filtering). For the one dimensional signal state process hRt andi observation process Ht following the linear SDEs (3.1), the conditional expectation R̂t = E Rt F̂t satisfies the SDE: dR̂t = A(t)R̂t dt + h D(t) E(t) i Ω̂t + C(t)ρ dŴt , 0 ≤ t ≤ T, with R̂0 = µ, where (Ŵt )0≤t≤T is called the Innovation Process, and definded as: D(t) dŴt = dWt + Rt − R̂t dt, 0 ≤ t ≤ T, Ŵ0 = 0. E(t) and Ŵt is an F̂t progressively measurable Motion. h Bownian i The conditional variance Ω̂t = Var Rt F̂t satisfies the deterministic Riccati equation: h dΩ̂t D2 (t) C(t)D(t)ρ i = − 2 Ω̂2t + 2 A(t) − Ω̂t + C 2 (t)(1 − ρ2 ), dt E (t) E(t) with the initial condition Ω̂0 = θ. 0 ≤ t ≤ T, 6 XIANG YU 4. Utility Maximization with Kalman-Bucy Filtering 4.1. Dynamic Programming on Partial Observations Filtration FS . Apply the above Kalman-Bucy Stochastic Filtering Theorem, we can first define the Innovation Process in our market model as: i 1 h 1 dSt (4.1) dŴt , − µ̂t dt , 0 ≤ t ≤ T, (µt − µ̂t )dt + σS dWt = σS σS St which is a Brownian motion under partial observations filtration FtS . by Kalman-Bucy filtering theorem, the conditional estimations of drift process: µ̂t = i hMoreover, E µt FtS satisfies the SDE: Ω̂(t) + σ σ ρ S µ dµ̂t = − λ(µ̂t − µ̄)dt + dŴt , σS (4.2) Ω̂(t) + σS σµ ρ Ω̂(t) + σS σµ ρ dS t µ̂t dt + λµ̄dt + = −λ− , 0 ≤ t ≤ T, St σS2 σS2 h i with µ̂0 = E µ0 F0S = η. So we can solve µ̂t as the strong solution of the SDE (4.2) by knowing the stock price process St . i h And the conditional variance Ω̂t = E (µt − µ̂t )2 FtS satisfies the Riccati ODE: h (Ω̂t + σS σµ ρ)2 i dΩ̂t = σµ2 − 2λΩ̂t − dt, 0 ≤ t ≤ T, σS2 i h with Ω̂(0) = E (µ0 − η)2 F0S = θ, which has an explicit solution as: (4.3) √ √ (4.4) Ω̂t = Ω̂(t; θ) = kσS k1 exp(2( σSk )t) + k2 √ k − (λ + k1 exp(2( σS )t) − k2 σµ ρ 2 )σ , 0 ≤ t ≤ T, σS S where: k = λ2 σS2 S + 2σS σµ λρ + σµ2 , √ k1 = kσS + (λσS2 + σS σµ ρ) + θ, √ k2 = − kσS + (λσS2 + σS σµ ρ) + θ. By observation, we see Ω̂(t) tends to the value q (4.5) θ∗ = σS λ2 σS2 + 2σS σµ λρ + σµ2 − (λσS2 + σS σµ ρ) as time t → +∞, which we call as “steady state learning” (see also Brennan [3]). This convergence property of Ω̂(t) tells us the precision of the drift estimate goes from an initial condition to a steady state in the long time run, and after large time T , new return observations contribute to updating the estimated value of the state variable, but seldom reduce the variance of the estimation error. Moreover, by the Riccati ODE (4.3), we clearly have the boundedness (4.6) min(θ, θ∗ ) ≤ Ω̂(t) ≤ max(θ, θ∗ ), t ∈ [0, T ]. HABIT FORMATION AND PARTIAL OBSERVATIONS 7 Under the observation filtration (FtS )0≤t≤T , we can instead rewrite stock price dynamics (2.1) driven by the innovation process Ŵt as: (4.7) dSt = µ̂t St dt + σS St dŴt , 0 ≤ t ≤ T. Notice we are now seeking the optimal investment and consumption strategies πt and ct which are only progressively measurable with respect to the partial observations filtration FtS , where the living standard process Zt satisfies the ODE dZt = (δ(t)ct − α(t)Zt )dt, (4.8) 0 ≤ t ≤ T, and under the partial observations filtration FtS , the wealth process dynamics (2.3) under πt and ct will be rewritten as: dXt = (πt µ̂t − ct )dt + σS πt dŴt , (4.9) 0 ≤ t ≤ T. Our goal now is to maximize the consumption with linear habit formation and terminal wealth by power utility preference under the partial observations filtration FtS : i h Z T (c − Z )p (XT )p s s ds + (4.10) V (t, x, z, η, θ) = sup E Xt = x, Zt = z, µ̂t = η, Ω̂t = θ , p p π,c∈A t where we take the risk aversion coefficient p < 1 and p 6= 0. Recall that the conditional variance Ω̂t is a deterministic function of time, we can set Ω̂t = θ as the parameter instead of the variable, and the dimension of the value function can be reduced as: h Z T (c − Z )p i (XT )p s s (4.11) V (t, x, z, η; θ) = sup E ds + X = x, Z = z, µ̂ = η ; t t t p p π,c∈A t for each fixed initial value Ω̂(t) = θ. An investment and consumption pair process (πt , ct ) is said in the Admissible Control Space A: if it is FtS -progressively measurable, and satisfies the integrability conditions: Z T Z T 2 (4.12) πt dt < +∞, a.s. and ct dt < +∞, a.s. 0 0 with the addictive habits constraint that: ct ≥ Zt , ∀t ∈ [0, T ]. Moreover, no bankruptcy is allowed, i.e., the investor’s wealth remains nonnegative: Xt ≥ 0, 0 ≤ t ≤ T . By the definition of V (t, x, z, η) and Dynamic Programming Principle, we can formally derive the Hamilton-Jacobi-Bellman (HJB) equation for the value function as: 2 h Ω̂(t) + σS σµ ρ V + max − cVx + cδ(t)Vz Vt − α(t)zVz − λ(η − µ̄)Vη + ηη c 2σS2 (4.13) h i (c − z)p i 1 + max πηVx + σS2 π 2 Vxx + Vxη Ω̂(t) + σS σµ ρ π = 0, + π p 2 8 XIANG YU with the terminal condition V (T, x, z, η) = xp p . 4.2. Decoupled Reduced Form Solutions. If V (t, x, z, η) is smooth enough, the first order condition formally derives −ηVx − Ω̂(t) + σS σµ ρ Vx η π ∗ (t, x, z, η) = , σS2 Vxx (4.14) 1 p−1 c∗ (t, x, z, η) = z + Vx − δ(t)Vz . which achieve the maximum over control policies π and c respectively. Plugging forms of (4.14) for π ∗ and c∗ , the HJB equation becomes: 2 Ω̂(t) + σS σµ ρ η Ω̂(t) + σS σµ ρ V V x xη Vt − α(t)zVz − λ(η − µ̄)Vη + Vηη − Vxx 2σS2 σS2 (4.15) 2 2 p − 1 p Ω̂(t) + σ σ ρ 2 2 S µ Vxη η V p−1 = 0. − z V − δ(t)V − V − δ(t)V − 2 x − x z x z 2 Vxx p 2σS Vxx 2σS From definition (4.11) of the value function, and dynamics (4.9) and (4.8) for Xt and Zt respectively, it’s easy to show that if V (t, x, z, η) is finite, then it is homogeneous in (x, z) with degree p, i.e., for any x > 0, z ≥ 0 and the positive constant k, we have V (t, kx, kz, η) = k p V (t, x, z, η). It therefore makes sense for us to seek the value function of the form: h ip (x − w(t, η)z) (4.16) V (t, x, z, η) = M (t, η) p p for some test functions w(t, η) and M (t, η) to be determined. By the virtue of V (T ) = xp , we will require M (T, η) = 1 and w(T, η) = 0. After we do the direct substitution in the above Equation (4.15) and divide the equation on both sides by (x + w(t, η)z)p , the HJB equation becomes (4.17) 2 h − wt z + α(t)wz − (1 + δ(t)w)z + λ(η − µ̄)wη − Ω̂(t)+σS σµ ρ 2 2σS η Ω̂(t)+σS σµ ρ wηη + 2 σS wη i x − w(t, η)z 2 Ω̂(t) + σ σ ρ η Ω̂(t) + σ σ ρ 2 µ S S S 1 λ(η − µ̄) η + Mt − Mη + Mηη − M− Mη 2 p p 2pσS 2(p − 1)σS2 (p − 1)σS2 2 p Ω̂(t) + σS σµ ρ M 2 p − 1 p p−1 η − 1 + δ(t)w(t, η) M p−1 = 0. − 2 M p 2(p − 1)σS M Since the Equation (4.17) holds for all values of x and z, we naturally set the unknown priori function w(t, η) = w(t) as a deterministic function in time t and satisfies: (4.18) − wt (t) + α(t)w(t) − (1 + δ(t)w(t)) = 0 HABIT FORMATION AND PARTIAL OBSERVATIONS with the terminal condition w(T ) = 0, which is equivalent to: Z T Z s (δ(v) − α(v))dv ds. exp (4.19) w(t) = 9 0 ≤ t ≤ T. t t Now we can substitute the function w(t) into the equation (4.17) above, and simplify it as: 2 p Ω̂(t) + σ σ ρ 2 µ S p pη p−1 Mt + M + M + (1 − p) 1 + δ(t)w(t) M p−1 ηη 2 2 2(1 − p)σ1 2σS (4.20) 2 h p M2 Ω̂(t) + σ σ ρ S µ η(Ω̂(t) + σS σµ ρ)p η + − λ(η − µ̄) + = 0. M + η 2 2 M (1 − p)σS 2(1 − p)σS Now in order to solve the above nonlinear PDE (4.20), we can set the power transform as M (t, η) = N (t, η)1−p (4.21) And the nonlinear PDE (4.20) for M (t, η) reduces to the linear parabolic PDE for N (t, η) as: Nt + (4.22) pη 2 2 Ω̂(t) + σS σµ ρ 2 N (t, η) + 2(1 − p)2 σS 2σS2 η Ω̂(t) + σS σµ ρ p i p p−1 Nηη + 1 + δ(t)w(t) h + − λ(η − µ̄) + (1 − p)σS2 Nη (t, η) = 0 with N (T, η) = 1. For the above linear PDE (4.22) of N (t, η), we can further solve it explicitly as: Z T p p−1 1 + δ(s)w(s) N (t, η) = exp A(s, t)η 2 + B(s, t)η + C(s, t) ds t (4.23) + exp A(T, t)η 2 + B(T, t)η + C(T, t) , where we have for 0 ≤ t ≤ s ≤ T , A(s, t) = A(t; s), B(s, t) = B(t; s) and C(s, t) = C(t; s) satisfying the following ODEs: 2 h i p Ω̂(t) + σ σ ρ 2 Ω̂(t) + σ σ ρ S µ S µ p (4.24) At (t; s)+ +2 −λ+ A(t; s)+ A2 (t; s) = 0; 2 2 2 2 2(1 − p) σS σS (1 − p) σS (4.25) h Bt (t; s) + − λ + p Ω̂(t) + σS σµ ρ i σS2 (1 − p) (4.26) Ct (t; s) + λµ̄B(t; s) + 2 2 Ω̂(t) + σS σµ ρ B(t; s) + 2λµ̄A(t; s) + 2 Ω̂(t) + σS σµ ρ 2σS2 σS2 B 2 (t; s) + 2A(t; s) = 0; with terminal conditions: A(s; s) = B(s; s) = C(s; s) = 0. A(t; s)B(t; s) = 0; 10 XIANG YU We remark that the above ODEs are similar to the ODEs obtained by Brendle [2] for terminal wealth optimization problem with partial observations, and Brendle [2] made an insightful observation that we can actually solve the above ODEs by solving some auxiliary ODEs with constant coefficients: Theorem 4.1. For 0 ≤ t ≤ s ≤ T , consider the following auxiliary ODEs for a(t; s), b(t; s), c(t; s), f (t; s) and g(t; s): 2pρσµ p 2(1 − p + pρ2 ) 2 2 , σµ a (t; s) + 2λ − a(t; s) − (4.27) at (t; s) = − 1−p (1 − p)σS 2(1 − p)σS2 (4.28) bt (t; s) = − pρσµ 2(1 − p + pρ2 ) 2 σµ a(t; s)b(t; s) − 2λµ̄a(t; s) + λ − b(t; s), 1−p (1 − p)σS ct (t; s) = −σµ2 a(t; s) − (4.29) (1 − p + pρ2 )σµ2 2 b (t; s) − λµ̄b(t; s), 2(1 − p) ft (t; s) = −2(1 − ρ2 )σµ2 f 2 (t; s) + 2 (4.30) λσS + ρσµ 1 f (t; s) + 2 , σS 2σS gt (t; s) = σµ2 (1 − ρ2 )(f (t; s) − a(t; s)), (4.31) with the terminal conditions a(s; s) = b(s; s) = c(s; s) = f (s; s) = g(s; s) = 0, and if we adopt the convention 00 = 0, then for the functions defined by: Ã(t; s) = a(t; s) , (1 − p) 1 − 2a(t; s)Ω̂(t) B̃(t; s) = b(t; s) , (1 − p) 1 − 2a(t; s)Ω̂(t) Ω̂(t) 1 h 1−p b2 (t; s) − c(t; s) + log 1 − 2a(t; s)Ω̂(t) 1−p 2 1 − 2a(t; s)Ω̂(t) i p − log 1 − 2f (t; s)Ω̂(t) − pg(t; s) , 2 we have the equivalence that: C̃(t; s) = (4.32) A(t; s) = Ã(t; s), B(t; s) = B̃(t; s), C(t; s) = C̃(t; s), 0 ≤ t ≤ s ≤ T. Remark 4.1. The equivalence result (4.32) reveals the fact that to solve the ODEs (4.24), (4.25), (4.26) with variable dependent coefficients is equivalent to solve the auxiliary ODEs (4.27), (4.28), (4.29), (4.30) and (4.31) with constant coefficients in an order, i.e, we solve the Riccati ODE (4.27) first, and substitute the solution a(t; s) into ODE (4.28) and solve for the solution b(t; s), and etc. Actually, we can even solve out fully explicit solutions for a(t; s), b(t; s), c(t; s), f (t; s) and g(t; s). We list all four different cases of fully explicit solutions in the Appendix depending on the risk aversion coefficient p and the market coefficients σS , σµ , λ and ρ. By simple substitutions, HABIT FORMATION AND PARTIAL OBSERVATIONS 11 we can therefore solve the ODEs (4.24), (4.25), (4.26) for A(t; s), B(t; s) and C(t; s) in the fully explicit expression. Lemma 4.1. Suppose the risk aversion constant p and the market coefficients σS , σµ , λ, ρ satisfy λ2 σS2 p λσS , ≤ ≤ 2 1−p (2λρσµ + σµ ) ρσµ (4.33) and the solution of (4.27) satisfies |1 − a(t; s)Ω̂(t)| ≥ > 0 for a constant on 0 ≤ s ≤ t ≤ T , then there exist uniform constants bounds K̄1 > 0, K̄2 > 0 and K̄3 > 0 such that A(t; s) ≤ K̄1 , (4.34) B(t; s) ≤ K̄2 , C(t; s) ≤ K̄3 , 0 ≤ t ≤ s ≤ T. Proof. Under Assumption (4.33), the explicit solution a(t; s) is bounded and we observe the form of ODEs (4.28), (4.29), then b(t; s), c(t; s) are bounded on 0 ≤ t ≤ s ≤ T if a(t; s) is bounded. Now since the ODE (4.30) is well defined independent of the risk aversion constant p, and it always admits a bounded solution f (t; s) < 0 and |1 − f (t; s)Ω̂(t)| > 1 > 0 for 0 ≤ t ≤ s ≤ T and hence we deduce g(t; s) is bounded for 0 ≤ t ≤ s ≤ T . Combine these with the assumption |1 − a(t; s)Ω̂(t)| ≥ > 0 on 0 ≤ t ≤ s ≤ T , we can conclude A(t; s), B(t; s) and C(t; s) are all uniformly bounded on 0 ≤ t ≤ s ≤ T by substitution results in Theorem 4.1. Now, for t ∈ [0, T ], η ∈ (−∞, +∞), we can define the effective domain for the pair (x, z) as: (4.35) (x, z) ∈ Dt = {(x0 , z 0 ) ∈ (0, +∞) × [0, +∞); x0 ≥ w(t)z 0 }, 0 ≤ t ≤ T, and the function Z p [(x − w(t)z)]p h T p−1 Ve (t, x, z, η) = exp A(s, t)η 2 + B(s, t)η + C(s, t) ds 1 + δ(s)w(s) p t (4.36) i1−p + exp A(T, t)η 2 + B(T, t)η + C(T, t) is well defined on [0, T ] × Dt × R and it’s the classical solution of the HJB equation (4.13), where RT Rs w(t) = t exp( t (δ(v)−α(v))dv)ds, and A(s, t), B(s, t), C(s, t) are defined in (4.24), (4.25), (4.26). In our main result below, we want to verify the above classical solution Ve (t, x, z, η) equals our primal value function defined in (4.11), however, the effective domain of Ve (t, x, z, η) deduces some constraints on the optimal wealth process Xt∗ and habit formation process Zt∗ . In particular, when t = 0, we have to pose the initial wealth-habit budget constraint that x > w(0)z. 4.3. The Main Result. Theorem 4.2 (The Verification Theorem). Under the initial wealth-habit budget constraint x > w(0)z, either if risk aversion constant (4.37) p < 0; 12 XIANG YU or if (4.38) 0 < p < 1, together with market coefficients σS , σµ , λ, Θ, ρ satisfy the additional assumption (4.33) and (4.39) λ2 σS4 p(1 + p) < . (1 − p)2 4(Θ + σS σµ ρ)2 where Θ , max{θ, θ∗ } and θ∗ is define in (4.5), moreover, we assume the upper boundedness K̄1 in (4.34) of A(t; s) in Lemma 4.1 satisfies (4.40) 8K̄1 + 1 < λσS2 . (Θ + σS σµ ρ)2 Then, define V (t, x, z, η) as in (4.11) and define Ve (t, x, z, η) as in (4.36), we have the equivalence: (4.41) Ve (t, x, z, η) = V (t, x, z, η). And the optimal investment policy πt∗ and optimal consumption policy c∗t are given in the feedback form: πt∗ = π ∗ (t, Xt∗ , Zt∗ , µ̂t ) and c∗t = c∗ (t, Xt∗ , Zt∗ , µ̂t ), 0 ≤ t ≤ T , where the function π ∗ (t, x, z, η) : [0, T ] × Dt × R −→ R is defined by: h Ω̂(t) + σ σ ρ µ S Nη (t, η) i η (4.42) π ∗ (t, x, z, η) = (x − w(t)z), 0 ≤ t ≤ T. + N (t, η) (1 − p)σS2 σS2 c∗ (t, x, z, η) : [0, T ] × Dt × R −→ R+ is defined by: (4.43) c∗ (t, x, z, η) = z + (x − w(t)z) , 0 ≤ t ≤ T. 1 1 + δ(t)w(t) 1−p N (t, η) And the optimal wealth process Xt∗ , for 0 ≤ t ≤ T , is given explicitly by: Z t Z t (µ̂ )2 µ̂u N (t, µ̂t ) u ∗ (4.44) Xt = (x − w(0)z) exp du + d Ŵ + w(t)Zt∗ , u 2 N (0, η) (1 − p)σ S 0 0 2(1 − p)σS where w(t) and N (t, η) are defined in (4.19) and (4.23) respectively. Remark 4.2. The more complex structure of feedback forms of optimal investment and consumption policies is the consequence of the time non-separability of the instantaneous utility with habit π∗ c∗ formation. We can see the portfolio/wealth ratio X ∗ and consumption/wealth ratio X ∗ are now Z∗ depending on the habit-formation/wealth ratio X ∗: h Ω̂(t) + σ σ ρ ∗ µ S Nη (t, µ̂t ) i π Z∗ µ̂t = + (1 − w(t) ), X∗ N (t, µ̂t ) X∗ (1 − p)σS2 σS2 and Z∗ c∗ 1 w(t) = + 1 − , 1 1 ∗ X∗ 1 + δ(t)w(t) 1−p N (t, µ̂t ) 1 + δ(t)w(t) 1−p N (t, µ̂t ) X HABIT FORMATION AND PARTIAL OBSERVATIONS 13 Moreover, although function c∗ (·, x, ·, ·) remains still linear and increasing in x > 0, c∗ (·, ·, z, ·) is not necessary increasing in z ≥ 0, which shows the increase of initial habit dose not necessarily imply the increase of optimal consumption stream. And since the dependence of c∗ (t, x, z, η) on the discounting factors α(t) and δ(t) are even more complicated, the optimal consumption process c∗t is not necessarily monotone in the habit formation process Zt∗ . 5. Proof of The Verification Theorem We will first show the consumption constraint ct ≥ Zt implies the constraint on the controlled wealth process by the following proposition: Proposition 5.1. The admissible space A is not empty if and only if x > w(0)z. Moreover, for each pair of investment and consumption policy (π, c) ∈ A, the controlled wealth process Xtπ,c satisfies the constraint: Xtπ,c ≥ w(t)Zt , (5.1) 0 ≤ t ≤ T, where the deterministic function w(t) is defined in (4.18) and refers the cost of subsistence consumption per unit of standard of living at time t. R t Proof. On one hand, let’s assume x ≥ w(0)z, then we can always take πt ≡ 0, and ct = z exp 0 (δ(v)− α(v))dv for t ∈ [0, T ], it is easy to verify Xtπ,c ≥ 0 and ct ≡ Zt so that (π, c) ∈ A, and hence A is not empty. On the other hand, starting from t = 0 with the wealth x and the standard of living z, the addictive habits constraint ct ≥ Zt , 0 ≤ t ≤ T implies the consumption must always exceed the subsistence consumption c̃t = Z(t; c̃t ) which satisfies dc̃t = (δ(t) − α(t))c̃t dt, c̃0 = z, 0 ≤ t ≤ T, And ct ≥ c̃t is equivalent to (5.2) ct ≥ z exp Z t (δ(v) − α(v))dv , 0 ≤ t ≤ T. 0 Define the exponential local martingale Z Z t µ̂ 1 t µ̂2v v e Ht = exp − dŴv − dv , 0 ≤ t ≤ T. 2 0 σS2 0 σS e is a true martingale with respect to (Ω, F S , P), since µ̂t follows the Bene’s condition implies H dynamics (4.2). e as dPe = H e T , Girsanov theorem states that Now define the probability measure P dP Z t µ̂v f Wt , Ŵt + dv, 0 ≤ t ≤ T 0 σS 14 XIANG YU e (F S )0≤t≤T ). is a Brownian Motion under (P, t Then we can rewrite the wealth process dynamics as: Z T Z T fv , XT + cv dv = x + πv σS dW 0 0 Since we have XT ≥ 0, it’s easy to see that e we have: take the expectation under P, Rt 0 e and fv is a supermartingale under (Ω, FS , P), πv σS dW hZ e x≥E T i cv dv . 0 Follow the inequality (5.2), we will further have: i Z v hZ T e (δ(u) − α(u))du dv . exp x ≥ zE 0 0 Since δ(t) and α(t) are deterministic functions, we easily arrive x ≥ w(0)z. In general, for ∀t ∈ [0, T ], follow the same procedure, we can then take conditional expectation under filtration FtS , and get hZ T Z v i e Xt > Zt E exp (δ(u) − α(u))du dv FtS , t t again since δ(t), α(t) are deterministic, we obtain Xt ≥ w(t)Zt , 0 ≤ t ≤ T . 5.1. The Case p < 0 . (The PROOF OF THEOREM 4.2). First, for any pair of admissible control (πt , ct ) ∈ A, denote G πt ,ct as the generator of the process Ve (t, Xt , Zt , µ̂t ) under control (πt , ct ). Ito’s lemma gives i h h i h Ω̂(t) + σS σµ ρ i (5.3) d Ve (t, Xt , Zt , µ̂t ) = G πt ,ct Ve (t, Xt , Zt , µ̂t ) dt + Vex σS πt + Veη dŴt , σS e (t, x , z , η) is the classical solution of HJB equation (4.13), choose the localizing sequence Recall V τn , we integrate the equation (5.3) on [t, τn ∧T ], and take the expectation under probability measure Px,z,η , and let’s denote E = Ex,z,η , we have h Z τn ∧T (c − Z )p i h i s s e (5.4) V (t, x, z, η) ≥ E ds + E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) . p t Now, we follow the idea by Janeček and Sı̂rbu [9], let’s fix this pair of control choice (πt , ct ) ∈ A = Ax , where we denote Ax as the admissible space with initial endowment x. And for ∀ > 0, it is clear that Ax ⊆ Ax+ , and (πt , ct ) ∈ Ax+ . Also it is clear that Xtx+ = Xtx + = Xt +, 0 ≤ t ≤ T . Follow the same procedure above, and notice process Zt keeps the same under the consumption policy ct , then under probability measure Px,z,η , we can obtain: h Z τn ∧T (c − Z )p i h i s s e ds + E Ve (τn ∧ T, Xτn ∧T + , Zτn ∧T , µ̂τn ∧T ) . (5.5) V (t, x + , z, η) ≥ E p t HABIT FORMATION AND PARTIAL OBSERVATIONS 15 By Monotone Convergence Theorem, we first know: h Z τn ∧T (c − Z )p i h Z T (c − Z )p i s s s s (5.6) lim E ds = E ds . n→+∞ p p t t For simplicity, let’s denote Yt = Xt − w(t)Zt , we know by definition (4.36) that: 1 Ve (τn ∧ T, Xτn ∧T + , Zτn ∧T , µ̂τn ∧T ) = (Yτn ∧T + )p Nτ1−p . n ∧T p Proposition 5.1 gives Xt ≥ w(t)Zt for 0 ≤ t ≤ T under any admissible control pair (πt , ct ), we know Yτn ∧T + ≥ > 0, ∀0 ≤ t ≤ T . Since also p < 0, we will have sup(Yτn ∧T + )p < p < +∞. (5.7) n Now from Remark A.1, we already derived that A(t; s) ≤ 0, ∀0 ≤ t ≤ s ≤ T . Combining this with the fact that w(s), δ(s) are continuous and hence bounded on [0, T ] and when p < 0, we also have 1 − a(t)Ω̂(t) > 0 and 1 − f (t)Ω̂(t) > 0 as well as a(t; s), b(t; s), c(t; s), f (t; s) and g(t; s) are all bounded for 0 ≤ t ≤ s ≤ T , we deduce that the explicit solutions B(t; s) and C(t; s) are both bounded on [0, T ], hence we have: N (t, η) ≤ k1 exp(kη) for some large constants k, k1 > 1, which shows the existence of some constants k̄, k̄1 > 1 such that 1−p sup Nτ1−p ≤ sup k exp k µ̂ ≤ k̄ exp k̄ sup µ̂ . 1 t 1 t n ∧T n t∈[0,T ] t∈[0,T ] We recall that µ̂t satisfies the Ornstein Uhlenbeck diffusion (4.2), which gives: Z t Ω̂(u) + σS σµ ρ dŴu . µ̂t = e−tλ η + µ̄(1 − e−tλ ) + eλ(u−t) σS 0 Hence, there exists positive constants l and l1 > 1 large enough, such that: sup µ̂t ≤ l + sup l1 Ŵt , t∈[0,T ] t ∈ [0, T ]. t∈[0,T ] Using the exponential distribution of maximum of the Brownian Motion, there exists some positive constants ¯l > 1 and ¯l1 such that h i h i ¯l1 E exp sup ¯lŴt < +∞. (5.8) E supNτ1−p ≤ n ∧T n t∈[0,T ] At last, by the above (5.7) and (5.8), we can conclude that h i E sup Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) < +∞. n By virtue of Dominated Convergence Theorem, we can deduce: h i h1 i lim E Ve (τn ∧ T, Xτn ∧T + , Zτn ∧T , µ̂τn ∧T ) = E (YT + )p N (T, µ̂T ) n→+∞ p h (X + )p i h Xp i T =E >E T . p p 16 XIANG YU Combine this with equation (5.5), and notice the pair of control (πt , ct ) ∈ A, we will see that: h Z T (c − Z )p Xp i s s e V (t, x + , z, η) ≥ sup E ds + T = V (t, x, z, η). p p π,c∈A t Notice Ve (t, x + , z, η) is continuous in variable x, and since > 0 is arbitrary, we can take the limit as: Ve (t, x, z, η) = lim Ve (t, x + , z, η) ≥ V (t, x, z, η). →0 On the other hand, for πt∗ and c∗t defined by (4.42) and (4.43) respectively, we first want to show the SDE for wealth process: dXt∗ = (πt∗ µt − c∗t )dt + σS πt∗ dŴt , (5.9) 0 ≤ t ≤ T, with initial condition x > w(0)z has an unique strong solution and also satisfies Xt∗ > w(t)Zt∗ , ∀s ∈ [0, T ]. Denote Yt∗ = Xt∗ − w(t)Zt∗ , and apply Ito’s lemma and substitute πt∗ as defined by (4.42), we can get: h i dYt∗ = πt∗ µ̂t − c∗t − wt (t)Zt∗ − w(t)δ(t)c∗t + w(t)α(t)Zt∗ dt + πt∗ σS dŴt (5.10) h = − wt (t) + w(t)α(t) Zt∗ − (1 + w(t)δ(t))c∗t + + Ω̂(t) + σS σµ ρ N η σS2 N i h µ̂t Yt∗ dt + µ̂t + (1 − p)σS µ̂2t Yt∗ 2 (1 − p)σS Ω̂(t) + σS σµ ρ N i η σS N Yt∗ dŴt . Recall the definition of w(t) by (4.18) and substitute c∗t defined by (4.43) into (5.10) above, we will further have −p 1−p h 1 + δ(t)w(t) Ω̂(t) + σ σ ρ 2 S µ Nη i ∗ µ̂t dYt∗ = − + + µ̂t Yt dt N N (1 − p)σS2 σS2 h Ω̂(t) + σ σ ρ S µ Nη i ∗ µ̂t + + Y dŴt . (1 − p)σS σS N t In order to solve Xt∗ in a more explicit formula, we define the auxiliary process by: Γt = N (t, µ̂t ) , Yt∗ ∀0 ≤ t ≤ T. By Ito’s lemma, we can derive the SDE for process Γt as: 2 h Ω̂(t) + σ σ ρ µ̂ Ω̂(t) + σ σ ρ p µ t µ S S Γt dΓt = Nt − λ(µ̂t − µ̄)Nη + Nηη + Nη 2 2 N 2σS (1 − p)σS (5.11) −p i h −µ̂ i pµ̂2t 1−p t + N dt + Γ dŴt . + 1 + δ(t)w(t) t (1 − p)σS (1 − p)2 σS2 HABIT FORMATION AND PARTIAL OBSERVATIONS 17 Recall that N (t, η) satisfies the linear PDE (4.23), we can simplify (5.11) to be: h i h −µ̂ i pµ̂2t t dΓt = Γt dt + Γ dŴt . t (1 − p)σS 2(1 − p)2 σS2 Hence, we can finally get the above SDE has an unique strong solution as: Z t Z t µ̂u µ̂2u du − d Ŵ , Γt = Γ0 exp − u 2 0 (1 − p)σS 0 2(1 − p)σS N (0,η) Initial condition Γ0 = x−w(0)z > 0 implies Γt > 0, ∀ 0 ≤ t ≤ T . And, hence, we finally proved that the SDE (5.9) has an unique strong solution defined by (4.44) and the solution Xt∗ satisfies the wealth process constraint (5.1) Now, we proceed to verify πt∗ and c∗t are actually in the admissible space A. First, by the definition (4.42) and (4.43), it’s clear that πt∗ and c∗t are FtS progressively measurable, and by the path continuity of Yt∗ = Xt∗ − w(t)Zt∗ , hence, of πt∗ and c∗t , it’s easy to show that: Z T Z T ∗ 2 (πt ) dt < +∞, and c∗t dt < +∞, a.s. Xt∗ 0 ∗ w(t)Zt , 0 Also, since > ∀s ∈ [0, T ], by the definition of c∗t , we know the consumption constraint c∗t > Zt∗ , ∀0 ≤ t ≤ T is satisfied. And hence (πt∗ , c∗t ) ∈ A. Given the pair of control policy (πt∗ , c∗t ) as above, following the same steps and the definition of stopping time τn , instead of (5.4), we can now instead get the equality: h i h Z τn ∧T (c∗ − Z ∗ )p i t t e dt + E Ve (τn ∧ T, Xτ∗n ∧T , Zτ∗n ∧T , µ̂τn ∧T ) . V (t, x, z, η) = E p t And hence, we apply Monotone Convergence Theorem again: h Z T (c∗ − Z ∗ )p i h Z τn ∧T (c∗ − Z ∗ )p i t t t t dt = E dt , lim E n→+∞ p p t t When p < 0, we have function Ve (t, x, z, η) < 0 by it’s definition, and by Fatou’s lemma, h i h i h (X ∗ )p i T lim E Ve (τn ∧ T, Xτ∗n ∧T , Zτ∗n ∧T , µ̂τn ∧T ) ≤ E Ve (T, XT∗ , ZT∗ , µ̂T ) = E . n→+∞ p Therefore, it gives hZ e V (t, x, z, η) = E T t (XT∗ )p i (c∗t − Zt∗ )p dt + ≤ V (t, x, z, η) p p which completes the whole proof. 5.2. The Case: 0 < p < 1 . We proceed to prove the following two Lemmas which play important roles in the proof of the second part of our main result. Lemma 5.1. If constant k > 0 satisfies: (5.12) k< λ2 σS2 2(Θ + σS σµ ρ)2 18 XIANG YU there exists a constant Λ1 independent of t, such that i h Z t k µ̂2s ds ≤ Λ1 < +∞, ∀t ∈ [0, T ]. E exp 0 2 Proof. It is easy to choose an increasing sequence of smooth functions Qn (y) % ky as n → ∞ such 0 00 that 0 ≤ Qn (y) ≤ n with Qn (y) and Qn (y) uniformly bounded. And for each fixed t ∈ [0, T ] and η, we define: i h Z t Qn (µ̂s )ds , φ(t, η) = E exp 0 where µ̂0 = η. Similar to the proof of Feynman-Kac formula, the function φ(t, η) is a classical solution of the linear parabolic equation: 2 Ω̂(t) + σS σµ ρ (5.13) φt = φηη − λ(η − µ̄)φη + Qn (η)φ, 2σS2 with initial condition φ(0, η) = 1. See also Lemma 1.12 in Pang [14] for details. First, it’s clear that constant 0 is a subsolution of the above equation. Moreover, under assumption (5.12), it’s easy to show that for each t ∈ [0, T ], the equation: 2 2 Ω̂(t) + σS σµ ρ x2 − 2λx + k = 0 σS2 λ− has two positive real roots x1 = positive constant a such that: r 2(Ω̂(t)+σS σµ ρ)2 λ+ λ2 − k 2 r 2(Ω̂(t)+σS σµ ρ)2 k λ2 − 2 σ S (Ω̂(t)+σS σµ ρ)2 2 σ2 S r λ + λ2 − 0<a< 2 σ S (Ω̂(t)+σS σµ ρ)2 2 σ2 S and x2 = (Θ2 +σS σµ ρ)2 k 2 σS (Θ1 +σS σµ ρ)2 2 σS . And for any , with Θ1 = max(θ, θ∗ ) and Θ2 = min(θ, θ∗ ), and the positive constant b such that: b>a (Θ1 + σS σµ ρ)2 λ2 µ̄2 a2 − , 2 (Θ +σ σ ρ)2 σS 2a2 1 σS2 µ − 2aλ + k S aη 2 ) it’s easy to verify that f (t, η) = exp(bt + satisfies: 2 Ω̂(t) + σS σµ ρ ft ≥ fηη − λ(η − µ̄)fη + kη 2 , 2σS2 with the initial condition f (0, η) ≥ 1. And since Qn (η) < kη 2 , we get function f (t, η) is the supersolution of the equation (5.13), and 0, f (t, η) is the coupled subsolution and supersolution. Theorem 7.2 from Pao [15] shows that 2 function φ(t, η) satisfies: 0 ≤ φ(t, η) ≤ f (t, η) ≤ ebT +aη ≡ Λ1 , and hence Monotone Convergence Theorem leads to: h Z t i E exp k µ̂2s ds ≤ Λ1 < +∞, ∀t ∈ [0, T ]. 0 HABIT FORMATION AND PARTIAL OBSERVATIONS 19 Lemma 5.2. If constant k̄ > 0 satisfies (5.14) k̄ < λσS2 , (Θ + σS σµ ρ)2 for fixed constant κ > 0, there exists a constant Λ2 independent of t, and h i E exp k̄(µ̂t + κ)2 ≤ Λ2 < ∞, t ∈ [0, T ]. Proof. Similar to the proof of Lemma 5.1, we again construct an increasing sequence of functions {Qn (y)} for n ∈ N such that limn→+∞ Qn (y) = k̄(y + κ)2 . And for each fixed t ∈ [0, T ] and η, we define: h i ψ(t, η) = E exp Qn (µ̂t ) , where µ̂0 = η. Then a direct corollary of Theorem 5.6.1 of Friedman [8] gives the function ψ(t, η) is a classical solution of the linear parabolic equation: 2 Ω̂(t) + σS σµ ρ ψηη − λ(η − µ̄)ψη , (5.15) ψt = 2σS2 with initial condition ψ(0, η) = eQn (η) . Under assumption (5.14), and choose any constant a such that k̄ < a < where x1 = 2 λσS (Ω̂(t)+σS σµ ρ)2 λσS2 , (Θ + σS σµ ρ)2 is one real root of the algebraic equation: 2 2 Ω̂(t) + σS σµ ρ σS2 x2 − 2λx = 0 for each fixed t ∈ [0, T ]. And choose any positive constant b such that 2 2 2a κ(Θ+σS σµ ρ)2 − aλκ − aλµ̄ 2 2 2 2 2 (Θ + σS σµ ρ) 2a (Θ + σS σµ ρ) κ σS + + 2aλµ̄κ − , b>a 2 2 (Θ+σS σµ ρ)2 σS σS 2a2 − 2aλ 2 σS It is easy to verify that f (t, η) = exp(bt + a(η + κ)2 ) satisfies 2 Ω̂(t) + σS σµ ρ ft ≥ fηη − λ(η − µ̄)fη , 2σS2 2 with the initial condition f (0, η) = ea(η+κ) ≥ ψ(0, η), hence we get the function f (t, η) is the supersolution of the equation (5.15), and it is trivial to show g(t, η) ≡ 0 is the subsolution, there fore 0, f (t, η) are the coupled subsolution and supersolution. Again by Theorem 7.2 from Pao 2 [15], that function ψ(t, η) satisfies: 0 ≤ ψ(t, η) ≤ f (t, η) ≤ ebT +a(η+κ) ≡ Λ2 , hence Monotone 20 XIANG YU Convergence Theorem implies: h i E exp k̄(µ̂ + κ)2t ≤ Λ2 < +∞, ∀t ∈ [0, T ]. (The PROOF OF THEOREM 4.2, CONTINUED). For any pair of admissible control (πt , ct ) ∈ A, similar to the case for p < 0, choose the same localizing sequence τn such that h Z τn ∧T (c − Z )p i h i s s e (5.16) V (t, x, z, η) ≥ E ds + E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) . p t Now, by monotone convergence theorem, we first know: h Z T (c − Z )p i h Z τn ∧T (c − Z )p i s s s s ds = E ds . lim E n→+∞ p p t t And for 0 < p < 1, Ve (t, Xt , Zt , µ̂t ) ≥ 0 for all t ∈ [0, T ] by the definition (4.36) and (5.1), and Fatou’s lemma yields that: h i h i h Xp i lim E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) ≥ E Ve (T, XT , ZT , µ̂T ) = E T , n→+∞ p which implies that: hZ Ve (t, x, z, η) ≥ sup E π,c∈A πt∗ t T Xp i (cs − Zs )p ds + T = V (t, x, z, η). p p c∗t On the other hand, for the and defined by (4.42), (4.43), again follow the same procedure in the proof for case p < 0, we can show πt∗ and c∗t are actually in the admissible space A. Now, by policies πt∗ and c∗t , similarly, we can now get the equality: h Z τn ∧T (c∗ − Z ∗ )p i h i s s e V (t, x, z, η) = E ds + E Ve (τn ∧ T, Xτ∗n ∧T , Zτ∗n ∧T , µ̂τn ∧T ) . p t By the definition of Ve (t, x, z, η) and Holder inequality, we know that: h i h Y ∗ p i N (T ∧ τn , µ̂T ∧τn ) E Ve (T ∧ τn , XT∗ ∧τn , ZT∗ ∧τn , µ̂T ∧τn ) ≤ k1 E N T ∧τn Z T ∧τn h Z T ∧τn 2pµ̂ i 1 2p2 µ̂2u 2 u ≤ k2 E exp dŴu − du 2σ2 (1 − p)σ (1 − p) S 0 0 S h Z T ∧τn (p2 + p)µ̂2 i 1 2 u · E exp du N 2 (T ∧ τn , µ̂T ∧τn ) 2σ2 (1 − p) 0 S for some positive constants K1 , K2 , which are independent of n. R R t 2p2 µ̂2u t 2pµ̂u Notice that exp 0 (1−p)σS dŴu − 0 (1−p)2 σ2 du is a positive local martingale, therefore, a S supermartingale, we can apply the bounded stopping theorem and get: Z T ∧τn h Z T ∧τn 2pµ̂ i 2p2 µ̂2u u E exp dŴu − du ≤ 1. (1 − p)σS (1 − p)2 σS2 0 0 HABIT FORMATION AND PARTIAL OBSERVATIONS Thus, we can get: hZ e V (t, x, z, η) ≤ E τn ∧T t 21 i 1 Z T ∧τn (p2 + p)µ̂2 (cs − zs )p i h 2 u 2 du N (T ∧ τ , µ̂ ) . ds + E exp n T ∧τ n 2 2 p (1 − p) σS 0 And now, we observe that: h Z sup exp T ∧τn i (p2 + p)µ̂2u 2 du N (T ∧ τ , µ̂ ) n T ∧τ n (1 − p)2 σS2 n 0 h i h Z T ∧τn (p2 + p)µ̂2 i u 2 du · sup ≤ sup exp N (T ∧ τ , µ̂ ) n T ∧τn (1 − p)2 σS2 n n 0 Z T (p2 + p)µ̂2 h i u 2 du · sup ≤ exp N (T ∧ τ , µ̂ ) n T ∧τ n 2 2 n 0 (1 − p) σS Z T (2p2 + 2p)µ̂2 h i u 4 du + sup N (T ∧ τ , µ̂ ) . ≤ exp n T ∧τ n (1 − p)2 σS2 n 0 Hence, we have: n h Z ≤ E exp 0 T T ∧τn i (p2 + p)µ̂2u 2 du N (T ∧ τn , µ̂T ∧τn ) (1 − p)2 σS2 0 i h (2p2 + 2p)µ̂2u i 4 N (T ∧ τ , µ̂ ) . du + E sup n T ∧τ n (1 − p)2 σS2 n h Z E sup exp We first recall that under Assumption (4.33), Lemma 4.1 implies that there exists constants k, k1 such that 2 N (t, η) ≤ keK̄1 (η+k1 ) , where A(t; s) ≤ K̄1 for all 0 ≤ t ≤ s ≤ T , and hence we have 2 sup N 4 (T ∧ τn , µ̂T ∧τn ) ≤ sup ke4K̄1 (µ̂t +k1 ) . n t∈[0,T ] Then we just need to show that h i 2 E sup ke4K̄1 (µ̂t +k1 ) < ∞. (5.17) t∈[0,T ] Define ϕ(x) , e4K̄1 (x+k1 )2 and apply Ito’s lemma, we have 2 2 h Ω̂(t) + σS σµ ρ Ω̂(t) + σS σµ ρ i dϕ(µ̂t ) =ϕ(µ̂t ) − 8K̄1 λ − 8K̄12 µ̂2t + 8K̄1 λµ̄µ̂t − 4K̄1 dt + dLt σS2 σS2 ≤ϕ(µ̂t )k2 dt + dLt , where k2 > 0 is a upper bound constant, and the local martingale part is: Ω̂(t) + σS σµ ρ dLt , ϕ(µ̂t )8K̄1 µ̂t dŴt . σS From which we can derive that Z h i E sup ϕ(µ̂t ) ≤ ϕ(η) + t∈[0,T ] 0 T h i h i k2 E sup ϕ(µ̂s ) dt + E sup Lt , s∈[0,t] t∈[0,T ] 22 XIANG YU Burhholder-Davis-Gundy Inequality and Jensen’s Inequality imply that 2 h i hZ T i 1 Ω̂(t) + σS σµ ρ 2 2 8K̄1 (µ̂t +k1 )2 64K̄12 E sup Lt ≤ k3 E µ̂ e dt , t 2 σS 0 t∈[0,T ] However, under Assumption (4.40) and by Lemma 5.2, there exists a constant Λ2 such that h i h i 2 2 E µ̂2t e8K̄1 (µ̂t +k1 ) ≤ E e(8K̄1 +1)(µ̂t +k1 ) ≤ Λ2 < ∞, t ∈ [0, T ]. h i Hence we obtain the boundedness E sup Lt ≤ k4 < ∞ for some constant k4 , and t∈[0,T ] Z h i E sup ϕ(µ̂t ) ≤ ϕ(η) + t∈[0,T ] 0 T h i k2 E sup ϕ(µ̂s ) dt + k4 , s∈[0,t] The Gronwall’s Inequality verifies (5.17). For the first part, by coefficients Assumption (4.39), we can therefore apply Lemma 5.1, and it yields that: h Z T (2p2 + 2p)µ̂2 i u E exp du < Λ1 < +∞, 2σ2 (1 − p) 0 S for some constant Λ1 > 0. We can then get that: h i sup E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) < Λ < ∞ n for some constant Λ > 0, independent of n. Therefore, Dominated Convergence Theorem leads to: i h (X ∗ )p i h T , lim E Ve (τn ∧ T, Xτn ∧T , Zτn ∧T , µ̂τn ∧T ) = E n→+∞ p together with Monotone Convergence Theorem, we deduce h Z T (c∗ − Z ∗ )p (XT∗ )p i s s e V (t, x, z, η) = E ds + ≤ V (t, x, z, η), p p t which completes the whole proof. Acknowledgements. I sincerely thank my advisor Mihai Sı̂rbu for numerous helpful comments on the topic of this paper as well as reading the manuscript of it. Appendix A. Fully Explicit Solutions Follow the arguments by Kim and Omberg [10], we can even solve the auxiliary ODEs (4.27), (4.28), (4.29), (4.30) and (4.31) fully explicitly depending on the risk aversion constant p and all the market coefficients σS , σµ , λ, ρ: HABIT FORMATION AND PARTIAL OBSERVATIONS 23 A.1. The Normal Solution. The condition for the Normal solution is ∆ , λ2 − (A.1) and then we define: √ ξ= ∆= q pσµ2 2λpρσµ − > 0, (1 − p)σS2 (1 − p)σS2 (1 − p + pρ2 ) 2 σµ , 1−p p γ3 = , (1 − p)σS2 γ22 − γ1 γ3 , γ1 = pρσµ , γ2 = −λ + (1 − p)σS q (1 − ρ2 )σµ2 + (λσS + ρσµ )2 ξ1 = . σS We can solve the equations (4.27), (4.28), (4.29), (4.30) and (4.31) as: a(t; s) = b(t; s) = c(t; s) = p(1 − e2ξ(t−s) ) h 2(1 − p)σS2 2ξ − (ξ + γ2 )(1 − e2ξ(t−s) ) i, pλµ̄(1 − eξ(t−s) )2 h i, 2(1 − p)σS2 ξ 2ξ − (ξ + γ2 )(1 − e2ξ(t−s) ) p 2(1 − p)σS2 − λ2 µ̄2 ξ2 − σµ2 ξ + γ2 h i pλ2 µ̄2 (ξ + 2γ2 )e2ξ(t−s) − 4γ2 eξ(t−s) + 2γ2 − ξ h i (s − t) + 2(1 − p)σS2 ξ 3 2ξ − (ξ + γ2 ) 1 − e2ξ(t−s) 2ξ − (ξ + γ )(1 − e2ξ(t−s) ) pσµ2 2 log , 2ξ 2(1 − p)σS2 (ξ 2 − γ22 ) 1 − e2ξ1 (t−s) 1 , 2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s) (σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) 1 µ µ S 1 S S 1 S g(t; s) = log ξ (t−s) 1 2 2σS ξ1 e ρσµ p 2 ) + (σS ξ − λσS − (σS ξ + λσS + 1−p (1 − p)(1 − ρ ) − log 2 2(1 − p + pρ ) 2σS ξeξ(t−s) 2 ρσµ (s − t) ρ λ(s − t) − − , 2 2(1 − p + pρ ) 2(1 − p + pρ2 )σS f (t; s) = − ρσµ p 2ξ(t−s) 1−p )e The condition for the bounded Normal solution is (A.2) γ3 > 0, or γ1 > 0, or γ2 < 0. The condition for the explosive solution and the critical point is γ3 < 0, γ1 < 0, and γ2 > 0, γ + ξ 1 2 s−t= log . 2ξ γ2 − ξ Remark A.1. By observation, if p < 0, the conditions (A.1) and (A.2) hold, and we have a(t; s) ≤ 0 is a bounded solution as well as 1 − 2a(t; s)Ω̂(t) > 1 > 0 and 1 − f (t; s)Ω̂(t) > 1 > 0, hence we can 24 XIANG YU finally conclude the solutions of ODEs (4.24), (4.25), (4.26) are all bounded on 0 ≤ t ≤ s ≤ T . We a(t) also notice that A(t) = ≤ 0, on 0 ≤ t ≤ s ≤ T . (1−p)(1−2a(t)Ω̂(t)) A.2. The Hyperbolic Solution. The condition for the Hyperbolic solution is ∆ , λ2 − pσµ2 2λpρσµ − = 0, (1 − p)σS2 (1 − p)σS2 together with pρσµ 6= 0, (1 − p)σS Then we can solve (4.27), (4.28), (4.29), (4.30) and (4.31) as: γ2 = −λ + a(t; s) = −1 2γ1 (s − t − b(t; s) = − c(t; s) = 1 γ2 ) − 2λµ̄ 4γ1 γ2 (s − t − γ2 σµ2 (s 2γ1 − t) + γ2 , 2γ1 1 γ2 ) − γ2 λµ̄(s − t + λ2 µ̄2 γ22 (s 1 γ2 ) 2γ1 −t− 24γ1 (s − t 4 γ2 )(s − − γ12 ) , t)3 + σµ2 log (s − t)γ2 − 2 2γ1 , 1 − e2ξ1 (t−s) 1 , 2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s) (σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) (λσ + ρσ ) 1 µ µ µ S S 1 S S 1 S − g(t; s) = log (s − t) 2 2σS 2σS ξ1 eξ1 (t−s) i σµ2 (1 − ρ2 ) h + log 1 + γ2 (t − s) − γ2 (s − t) . 2γ1 The condition for the bounded Hyperbolic solution is f (t; s) = − γ2 < 0. The condition for the explosive solution and the critical point is γ2 > 0, and s − t = 1 . γ2 A.3. The Polynomial solution. The condition for the Polynomial solution is ∆ , λ2 − pσµ2 2λpρσµ − = 0, (1 − p)σS2 (1 − p)σS2 together with pρσµ = 0, (1 − p)σS Then we can solve (4.27), (4.28), (4.29), (4.30) and (4.31) as: p a(t; s) = (s − t), 2(1 − p)σS2 p b(t; s) = λµ̄(s − t)2 , 2(1 − p)σS2 p p c(t; s) = − σµ2 (s − t)2 + λ2 µ̄2 (s − t)3 , 2 4(1 − p)σS 6(1 − p)σS2 γ2 = −λ + HABIT FORMATION AND PARTIAL OBSERVATIONS 25 1 − e2ξ1 (t−s) 1 , 2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s) (σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) (λσ + ρσ ) 1 µ µ µ S 1 S S 1 S S (s − t) g(t; s) = log − 2 2σS 2σS ξ1 eξ1 (t−s) σµ2 (1 − ρ2 )p − (s − t)2 . 4(1 − p)σS2 All Polynomial solutions are well behaved. f (t; s) = − A.4. The Tangent solution. The condition for the Tangent solution is ∆ , λ2 − Now, we define ζ= √ pσµ2 2λpρσµ − < 0, (1 − p)σS2 (1 − p)σS2 −∆, $ = tan−1 γ 2 , ζ Then we can solve (4.27), (4.28), (4.29), (4.30) and (4.31) as: γ2 ζ tan ζ(s − t) + $ − , a(t; s) = 2γ1 2γ1 i λµ̄ h b(t; s) = − 1 − tan($) tan(ζ(s − t) + $) + sec($) sec(ζ(s − t) + $) , γ1 p i 2λ2 µ̄2 γ2 γ22 + ζ 2 h c(t; s) = sec ($) − sec (ζ(s − t) + $) 2γ1 ζ i 2 2 λ µ̄ (2γ2 + ζ 2 ) h + tan(ζ(s − t) + $) − tan($) 2γ1 ζ 3 2 2 λ µ̄ (γ22 + ζ 2 ) − γ2 ζ 2 σµ2 σµ2 − + log sec($) cos(ζ(s − t) + $) , 2γ1 ζ 2 2γ1 1 1 − e2ξ1 (t−s) , 2σS (σS ξ1 + λσS + ρσµ ) + (σS ξ1 − λσS − ρσµ )e2ξ1 (t−s) (σ ξ + λσ + ρσ ) + (σ ξ − λσ − ρσ )e2ξ1 (t−s) (λσ + ρσ ) 1 µ µ µ S S 1 S S 1 S − g(t; s) = log (s − t) ξ (t−s) 1 2 2σ 2σS ξ1 e S cos(ζ(t − s) + $) i h 1 γ2 log (s − t) . − σµ2 (1 − ρ2 ) − 2γ1 cos($) 2γ1 All Tangent solutions are explosive solutions and the critical point is f (t; s) = − s−t= π 1 γ2 − tan−1 ( ). 2ζ ζ ζ References [1] T. Björk, M. H. A. Davis, and C. Landén, Optimal investment under partial information, Math. Methods Oper. Res. 71 (2010), no. 2, 371–399. [2] S. Brendle, Portfolio selection under incomplete information, Stochastic Process. Appl. 116 (2006), no. 5, 701– 723. 26 XIANG YU [3] M. J. Brennan, The role of learning in dynamic portfolio decisions, European Finance Review 1 (1998), no. 3, 295–306. [4] G.M. Constantinides, Habit formation: a resolution of the equity premium puzzle, Working paper series, Center for Research in Security Prices, Graduate School of Business, University of Chicago, 1988. [5] J. B. Detemple and F. Zapatero, Asset prices in an exchange economy with habit formation, Econometrica 59 (1991), no. 6, 1633–57. , Optimal consumption-portfolio policies with habit formation, Mathematical Finance 2 (1992), no. 4, [6] 251–274. [7] N. Egglezos and I. Karatzas, Utility maximization with habit formation: Dynamic programming and stochastic pdes, Siam Journal on Control and Optimization 48 (2009), 481–520. [8] A. Friedman, Stochastic differential equations and applications, Vol. 1., Academic Press, New York., 1975. [9] K. Janeček and M. Sı̂rbu., Optimal investment with high-watermark performance fee, (2011), submitted for publication. [10] T.S. Kim and E. Omberg, Dynamic nonmyopic portfolio behavior, Review of Financial Studies 9 (1996), no. 1, 141–161. [11] P. Lakner, Optimal trading strategy for an investor: the case of partial information, Stochastic Processes and their Applications 76 (1998), no. 1, 77–97. [12] R. Mehra and E. C. Prescott, The equity premium: A puzzle, Journal of Monetary Economics 15 (1985), no. 2, 145 – 161. [13] C. Munk, Portfolio and consumption choice with stochastic investment opportunities and habit formation in preferences, Journal of Economic Dynamics and Control 32 (2008), no. 11, 3560–3589. [14] T. Pang, Stochastic control theory and its applications to financial economics, Ph.D thesis, Brown University, Providence, Rhode Island (2002). [15] C. V. Pao, Nonlinear parabolic and elliptic equations, Plenum Press, New York, 1992. [16] M. Schroder and C. Skiadas, An isomorphism between asset pricing models with and without linear habit formation, Review of Financial Studies 15 (2002), no. 4, 1189–1221. [17] Y. Xia, Learning about predictability: The effects of parameter uncertainty on dynamic asset allocation, The Journal of Finance 56 (2001), no. 1, 205–246. Xiang Yu, Department of Mathematics, The University of Texas at Austin, USA E-mail address: xiangyu@math.utexas.edu