Maximum Likelihood Estimation of Hidden Markov Processes

Halina Frydman and Peter Lakner
New York University
June 11, 2001
Abstract
We consider the process $dY_t = u_t\,dt + dW_t$, where $u$ is a process not necessarily adapted to $\mathcal{F}^Y$ (the filtration generated by the process $Y$) and $W$ is a Brownian motion. We obtain a general representation for the likelihood ratio of the law of the $Y$ process relative to Brownian measure. This representation involves only one basic filter (the expectation of $u$ conditional on the observed process $Y$). This generalizes the result of Kailath and Zakai (1971), where it is assumed that the process $u$ is adapted to $\mathcal{F}^Y$. In particular, we consider the model in which $u$ is a functional of $Y$ and of a random element $X$ which is independent of the Brownian motion $W$. For example, $X$ could be a diffusion or a Markov chain. This result can be applied to the estimation of an unknown multidimensional parameter $\theta$ appearing in the dynamics of the process $u$, based on continuous observation of $Y$ on the time interval $[0,T]$. For a specific hidden diffusion financial model in which $u$ is an unobserved mean-reverting diffusion, we give an explicit form for the likelihood function of $\theta$. For this model we also develop a computationally explicit E-M algorithm for the estimation of $\theta$. In contrast to the likelihood ratio, the algorithm involves evaluation of a number of filtered integrals in addition to the basic filter.
1  Introduction
Let $(\Omega, \mathcal{F}, P)$, $\{\mathcal{F}_t;\ t \le T\}$ be a filtered complete probability space and $W = \{W_t;\ t \le T\}$ a standard Brownian motion. We consider a process $Y$ with the decomposition
$$dY_t = u_t(\theta)\,dt + dW_t, \qquad (1)$$
where $u$ is a measurable, adapted process and $\theta$ is an unknown, vector-valued parameter. We assume that $Y$ is observed continuously on the interval $[0,T]$, but that neither $u = \{u_t;\ t \le T\}$ nor $W$ is observed. The purpose of this work is to develop methods for the maximum likelihood estimation of $\theta$.
Our interest in this work was motivated by the following model from finance. Suppose that we observe continuously a single security whose price process $\{S_t;\ t \le T\}$ follows the equation
$$dS_t = \mu_t S_t\,dt + \sigma S_t\,dW_t, \qquad (2)$$
where $\sigma > 0$ is a known constant, but the drift coefficient $\mu = \{\mu_t;\ t \le T\}$ is an unobserved mean-reverting process with the dynamics
$$d\mu_t = \alpha(\upsilon - \mu_t)\,dt + \beta\,dW_t^1. \qquad (3)$$
Here $\alpha > 0$, $\beta > 0$, and $\upsilon$ are unknown parameters, and $W^1$ is a standard Brownian motion independent of $W$. The initial value of the security, $S_0$, is an observable constant, whereas the initial value of the drift, $\mu_0$, is an unknown constant. The model in (2) and (3) has recently been found to be very flexible in the modeling of security prices. Its versions have been used extensively in the area of dynamic asset management (see, e.g., Bielecki and Pliska (1999), Lakner (1998), Kim and Omberg (1996), and Haugh and Lo (2000)). We now transform the model defined by (2) and (3) to the form in (1) by introducing the following changes of variables. The reason for the transformation is that we want the law of $\{Y_t;\ t \le T\}$ to be equivalent to that of a Brownian motion.
Let
$$X_t = \frac{1}{\beta}\,\mu_t - \frac{1}{\beta}\,\mu_0, \qquad Y_t = \frac{1}{\sigma}\log\left(\frac{S_t}{S_0}\right) + \frac{\sigma}{2}\,t.$$
Then the hidden process is
$$dX_t = \alpha(\delta - X_t)\,dt + dW_t^1, \qquad (4)$$
where $\delta = (\upsilon - \mu_0)/\beta$, and the observed process is
$$dY_t = (\lambda X_t + \nu)\,dt + dW_t, \qquad (5)$$
where $\lambda = \beta/\sigma$ and $\nu = \mu_0/\sigma$. Let $\theta = (\alpha, \delta, \lambda, \nu)$. We note that $X_0 = 0$ and $Y_0 = 0$, and that equation (5) is in the form (1) with $u_t(\theta) = \lambda X_t + \nu$.
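For the reader's convenience we spell out the Itô computation behind (4) and (5). Applying Itô's formula to (2) gives $d\log S_t = (\mu_t - \sigma^2/2)\,dt + \sigma\,dW_t$, so
$$dY_t = \frac{\mu_t}{\sigma}\,dt + dW_t = \left(\frac{\beta}{\sigma}\cdot\frac{\mu_t - \mu_0}{\beta} + \frac{\mu_0}{\sigma}\right)dt + dW_t = (\lambda X_t + \nu)\,dt + dW_t,$$
while
$$dX_t = \frac{1}{\beta}\,d\mu_t = \frac{\alpha}{\beta}(\upsilon - \mu_t)\,dt + dW_t^1 = \alpha\left(\frac{\upsilon - \mu_0}{\beta} - X_t\right)dt + dW_t^1 = \alpha(\delta - X_t)\,dt + dW_t^1.$$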
We summarize our results. Our main result is a new formula for the likelihood ratio of the law of the process $Y$, specified in (1), with respect to the Brownian measure. Naturally, the likelihood ratio is a function of $\theta$. This result generalizes a result in Kailath and Zakai (1971) that gives a formula for the likelihood ratio in the case when $u$ is adapted to $\mathcal{F}^Y = \{\mathcal{F}_t^Y;\ t \le T\}$, the filtration generated by the observed process $Y$. Here we extend this result to the case when $u$ is not adapted to $\mathcal{F}^Y$. We note that in our financial model example (5), $u$ is not adapted to $\mathcal{F}^Y$. We establish the new formula under a stronger (Theorem 1) and a weaker (Theorem 2) set of sufficient conditions. These conditions allow $u$ to be a functional of $Y$ and of a random element $X$ which is independent of the Brownian motion $W$. For example, $X$ could be a diffusion or a Markov chain. The likelihood ratio involves only one basic filter, that is, the expectation of $u$ conditional on the observed data. The new formula makes it possible to directly maximize the likelihood function with respect to the parameters. It may also be useful in developing likelihood ratio tests of various hypotheses about the parameters of hidden Markov processes. For the financial model in (4) and (5) we give an explicit form of the likelihood function.
We also consider the Expectation-Maximization (E-M) algorithm approach to the maximization of the log-likelihood function for the model in (4) and (5). Dembo and Zeitouni (1986) extended the E-M algorithm to the parameter estimation of hidden continuous-time random processes and, in particular, to the parameter estimation of hidden diffusions. The financial model in (4) and (5) generalizes an example in Dembo and Zeitouni (1986). For this model we develop a computationally explicit E-M algorithm for the maximum likelihood estimation of $\theta$. The expectation step of this algorithm requires evaluation of filtered integrals, that is, evaluation of the expectations of stochastic integrals conditional on the observed data. As remarked by Elliott et al. (1997, p. 203), Dembo and Zeitouni (1986) express the filtered integrals in the form of nonadapted stochastic integrals which are not well defined. In our development we express the required filtered integral in terms of adapted stochastic integrals (Proposition 4). In contrast to the likelihood function, the E-M algorithm involves a number of filters in addition to the basic filter.

The development of likelihood ratio tests based on the new formula for the likelihood function, and of simulations to assess the accuracy of the maximum likelihood estimators, is left for a future investigation.
This paper is organized as follows. The new formula for the likelihood ratio is established in Section 2. Section 3 presents the extended E-M algorithm and some of its properties. In Sections 4 and 5 we apply the E-M algorithm to the estimation of $\theta$ in the model in (4) and (5). The Appendix contains an auxiliary result used in the derivation of the likelihood ratio in Section 2.
2  The Likelihood Ratio
In this section we generalize a result of Kailath and Zakai (1971). We consider the process in (1), but to simplify notation we suppress the dependence of $u$ on $\theta$, so that
$$dY_t = u_t\,dt + dW_t. \qquad (6)$$
Our standing assumption is
$$\int_0^T u_t^2\,dt < \infty. \qquad (7)$$
Let $P_y$ be the measure induced by $Y$ on $\mathcal{B}(C[0,T])$ and $P_w$ the Brownian measure on the same class of Borel sets. We know from Kailath and Zakai (1971) that (7) implies $P_y \ll P_w$. The same authors give a formula for the likelihood ratio $\frac{dP_y}{dP_w}$ for the case when $u$ is adapted to $\mathcal{F}^Y$. Here we are going to extend this result to the case when $u$ is not adapted to $\mathcal{F}^Y$.

Let $\{v_t;\ t \le T\}$ be the $\mathcal{F}^Y$-optional projection of $u$; thus
$$E[u_t \mid \mathcal{F}_t^Y] = v_t, \quad \text{a.s.}, \qquad (8)$$
whenever
$$E|u_t| < \infty.$$
Theorem 1  In addition to assumption (7), we assume that
$$E|u_t| < \infty \quad \text{for all } t \in [0,T], \qquad (9)$$
$$\int_0^T v_s^2\,ds < \infty \quad \text{a.s.}, \qquad (10)$$
and
$$E\left[\exp\left\{-\int_0^T u_s\,dW_s - \frac{1}{2}\int_0^T u_s^2\,ds\right\}\right] = 1. \qquad (11)$$
Then $P_y$ is equivalent to $P_w$ and
$$\frac{dP_y}{dP_w} = \exp\left\{\int_0^T v_s\,dY_s - \frac{1}{2}\int_0^T v_s^2\,ds\right\}. \qquad (12)$$
Remark: The formula on the right-hand side of (12) is an $\mathcal{F}_T^Y$-measurable random variable, and thus it is also a path-functional of $\{Y_t;\ t \le T\}$. Hence this expression represents a random variable and a functional on $C[0,T]$ simultaneously, and in (12) it appears in its capacity as a functional on the space of continuous functions.
Proof. We define the process
$$Z_t = \exp\left\{-\int_0^t u_s\,dW_s - \frac{1}{2}\int_0^t u_s^2\,ds\right\}, \quad t \in [0,T], \qquad (13)$$
and the probability measure $P_0$ by
$$\frac{dP_0}{dP} = Z_T,$$
which is a bona fide probability measure by condition (11). We are going to use repeatedly the following form of the "Bayes theorem": for every $\mathcal{F}_t$-measurable random variable $V$ we have
$$E_0[V \mid \mathcal{F}_t^Y] = \frac{1}{E[Z_t \mid \mathcal{F}_t^Y]}\,E[Z_t V \mid \mathcal{F}_t^Y], \quad t \in [0,T], \qquad (14)$$
where $E_0$ is the expectation operator corresponding to the probability measure $P_0$. According to Girsanov's theorem, $Y$ is a Brownian motion under $P_0$. Also, one can see easily that
$$\frac{dP_y}{dP_w} = E_0\left[\frac{1}{Z_T}\,\Big|\,\mathcal{F}_T^Y\right] = \left(E[Z_T \mid \mathcal{F}_T^Y]\right)^{-1}.$$
The equivalence of $P_y$ and $P_w$ already follows, since $Z_T > 0$ (under $P$ and $P_0$), thus also
$$E_0\left[\frac{1}{Z_T}\,\Big|\,\mathcal{F}_T^Y\right] > 0,$$
hence the same expression, as a path-functional of $\{Y_t;\ t \le T\}$, is $P_y$-almost everywhere positive. We still need to show (12). We start with the identity
$$\frac{1}{Z_t} = 1 + \int_0^t \frac{1}{Z_s}\,u_s\,dY_s.$$
Theorem 12.1 in Lipster and Shiryayev (1978) implies that
$$E_0\left[\frac{1}{Z_t}\,\Big|\,\mathcal{F}_t^Y\right] = 1 + \int_0^t E_0\left[\frac{1}{Z_s}\,u_s\,\Big|\,\mathcal{F}_s^Y\right] dY_s, \qquad (15)$$
as long as the following two conditions hold:
$$E_0\left[\frac{1}{Z_s}\,|u_s|\right] < \infty, \qquad (16)$$
$$\int_0^T \left(E_0\left[\frac{1}{Z_s}\,u_s\,\Big|\,\mathcal{F}_s^Y\right]\right)^2 ds < \infty \quad \text{a.s.} \qquad (17)$$
However, (16) follows from condition (9). In order to show (17) we use the rule (14) and compute
$$\int_0^T \left(E_0\left[\frac{1}{Z_s}\,u_s\,\Big|\,\mathcal{F}_s^Y\right]\right)^2 ds = \int_0^T \left(E[Z_s \mid \mathcal{F}_s^Y]\right)^{-2} v_s^2\,ds = \int_0^T \left(E_0\left[\frac{1}{Z_s}\,\Big|\,\mathcal{F}_s^Y\right]\right)^2 v_s^2\,ds.$$
Now we observe that by (15), $(s,\omega) \mapsto E_0\left[\frac{1}{Z_s}\mid\mathcal{F}_s^Y\right]$ is a $P_0$-local martingale (and also a martingale, but this is not important at this point), adapted to the filtration generated by the $P_0$-Brownian motion $Y$, thus it must be continuous (Karatzas and Shreve (1988), Problem 3.4.16). This fact and condition (10) imply (17). Now using (15) and (14) we get
$$E_0\left[\frac{1}{Z_t}\,\Big|\,\mathcal{F}_t^Y\right] = 1 + \int_0^t \frac{1}{E[Z_s \mid \mathcal{F}_s^Y]}\,v_s\,dY_s = 1 + \int_0^t E_0\left[\frac{1}{Z_s}\,\Big|\,\mathcal{F}_s^Y\right] v_s\,dY_s,$$
which implies (12).
In the rest of this section we are going to strengthen Theorem 1 by eliminating condition (11). This will again be a generalization of a result by Kailath and Zakai (1971), where $u$ is assumed to be $\mathcal{F}^Y$-adapted. Our objective here is to have a result which applies to a situation when $u$ depends on $Y$ and possibly on another process which is independent of $W$. This other process may be a Brownian motion, or possibly a Markov chain. In order to achieve this generality we assume that $\mathcal{X}$ is a complete separable metric space and $X : \Omega \to \mathcal{X}$ is an $\mathcal{F}/\mathcal{B}(\mathcal{X})$-measurable random element, independent of the Brownian motion $W$. We denote by $P_X$ the measure induced by $X$ on $\mathcal{B}(\mathcal{X})$. We assume that
$$u_s = \Phi(Y_0^s, s, X), \quad s \in [0,T], \qquad (18)$$
where $Y_0^s$ represents the path of $Y$ up to time $s$. Before spelling out the exact conditions on $\Phi$ we note that $X$ may be a Brownian motion if $\mathcal{X} = C[0,T]$, or a Markov chain if $\mathcal{X} = D[0,T]$ (the space of right-continuous functions with left limits) with the Skorohod topology. Other examples are also available.

The conditions on $\Phi$ are the following. First, a notation: for every $f \in C([0,T])$ and $s \in [0,T]$ we denote by $f^s \in C[0,s]$ the restriction of $f$ to $[0,s]$. We assume that, for every fixed $s \in [0,T]$, $\Phi(\cdot, s, \cdot)$ is a mapping from $C[0,s] \times \mathcal{X}$ to $\mathbb{R}$ such that

(i) the $C[0,T] \times [0,T] \times \mathcal{X} \to \mathbb{R}$ mapping given by $(f, s, x) \mapsto \Phi(f^s, s, x)$ is measurable with respect to the appropriate Borel classes;

(ii) for every $s \in [0,T]$ and $f \in C[0,s]$ the random variable $\Phi(f, s, X)$ is $\mathcal{F}_s$-measurable;

(iii) the following "functional Lipschitz" condition (Protter, 1990) holds: for every $f, g \in C([0,T])$ and $P_X$-almost every $x \in \mathcal{X}$ there exists an increasing (finite) process $K_t(x)$ such that
$$\left|\Phi(f^t, t, x) - \Phi(g^t, t, x)\right| \le K_t(x)\,\sup_{u \le t}|f(u) - g(u)| \quad \text{for all } t \in [0,T].$$

Our basic setup now is $Y$ with the decomposition (6), $u$ of the form (18), where $X$ is a random element independent of $W$ and $\Phi$ a functional satisfying (i), (ii) and (iii).
Theorem 2  Suppose that (7) holds, i.e., in our case
$$\int_0^T \Phi^2(Y_0^s, s, X)\,ds < \infty, \qquad (19)$$
and also
$$\int_0^T \Phi^2(W_0^s, s, X)\,ds < \infty. \qquad (20)$$
Then
$$E\left[\exp\left\{-\int_0^T u_s\,dW_s - \frac{1}{2}\int_0^T u_s^2\,ds\right\}\right] = 1. \qquad (21)$$
Remark: It is interesting to observe that if $\Phi$ does not depend on $Y$, i.e., $u$ is independent of $W$, then (21) follows from (7) only.

Remark: Conditions (19) and (20) are obviously satisfied if the mapping $s \mapsto \Phi(f^s, s, x)$ is continuous for every $(f, x) \in C[0,T] \times \mathcal{X}$.
Proof. The main idea of the proof is to derive our theorem from the results of Kailath and Zakai by conditioning on $X = x$. Let $Q(x, \cdot)$, $x \in \mathcal{X}$, be a regular conditional probability measure on $\mathcal{F}$ given $X$. The existence of such a regular version is well known in this situation. It is shown in the Appendix that the independence of the Brownian motion $W$ and $X$ implies that $\{(W_t, \mathcal{F}_t^W);\ t \le T\}$ remains a Brownian motion under $Q(x, \cdot)$ for $P_X$-almost every $x \in \mathcal{X}$. We also show in the Appendix that if $A$ is a zero-probability event under $P$ then for $P_X$-almost every $x \in \mathcal{X}$ we have $Q(x, A) = 0$. (The "$P_X$-almost every" is important here; obviously $Q(x, \cdot)$ is not absolutely continuous with respect to $P$.) Accordingly, for $P_X$-almost every $x \in \mathcal{X}$,
$$\int_0^T \Phi^2(Y_0^s, s, x)\,ds < \infty, \quad Q(x, \cdot)\text{-a.s.},$$
and
$$\int_0^T \Phi^2(W_0^s, s, x)\,ds < \infty, \quad Q(x, \cdot)\text{-a.s.}$$
Since $\mathcal{X}$ is assumed to be a complete separable metric space, we have
$$Q\big(x, \{\omega \in \Omega : X(\omega) = x\}\big) = 1$$
for $P_X$-almost every $x \in \mathcal{X}$ (Karatzas and Shreve (1988), Theorem 5.3.19). It follows that under $Q(x, \cdot)$ our equations (6) and (18) become
$$dY_t = \Phi(Y_0^t, t, x)\,dt + dW_t.$$
It follows from Theorem V.3.7 in Protter (1990) and our condition (iii) that the above equation has a unique (strong) solution. Thus the process $Y$ must be adapted to the $Q(x, \cdot)$-augmentation of $\mathcal{F}^W$. However, now we are back in the situation studied by Kailath and Zakai (1971). For every $x \in \mathcal{X}$ let $P_Y(x, \cdot)$ be the measure induced by $Y$ on $\mathcal{B}(C[0,T])$ under the conditional probability measure $Q(x, \cdot)$. Theorems 2 and 3 of Kailath and Zakai (1971) imply that for $P_X$-almost every $x \in \mathcal{X}$ we have $P_Y(x, \cdot) \sim P_w$ and
$$\frac{dP_w}{dP_Y(x, \cdot)} = \exp\left\{-\int_0^T \Phi(Y_0^t, t, x)\,dY_t + \frac{1}{2}\int_0^T \Phi^2(Y_0^t, t, x)\,dt\right\}.$$
This implies that the integral of the right-hand side with respect to the measure $P_Y(x, \cdot)$ is 1, which can be expressed as
$$E\left[\exp\left\{-\int_0^T u_t\,dY_t + \frac{1}{2}\int_0^T u_t^2\,dt\right\}\,\Big|\,X = x\right] = 1$$
for $P_X$-almost every $x \in \mathcal{X}$. Formula (21) now follows.
We summarize the results of Theorems 1 and 2 in the following theorem.

Theorem 3  Let $X : \Omega \to \mathcal{X}$ be a measurable mapping taking values in the complete separable metric space $\mathcal{X}$. Assume that $X$ is independent of the Brownian motion $W$, $Y$ has the form (6), and $u$ is as specified in (18), where $\Phi$ satisfies conditions (i), (ii) and (iii). Additionally we assume
$$E\left|\Phi(Y_0^s, s, X)\right| < \infty \quad \text{for all } s \in [0,T],$$
$$\int_0^T v_s^2\,ds < \infty \quad \text{a.s.},$$
$$\int_0^T \Phi^2(Y_0^s, s, X)\,ds < \infty \quad \text{a.s.},$$
$$\int_0^T \Phi^2(W_0^s, s, X)\,ds < \infty \quad \text{a.s.},$$
where $v$ is the optional projection of $u$ onto $\mathcal{F}^Y$. Then $P_y \sim P_w$ and
$$\frac{dP_y}{dP_w} = \exp\left\{\int_0^T v_s\,dY_s - \frac{1}{2}\int_0^T v_s^2\,ds\right\}.$$
Furthermore,
$$E\left[\exp\left\{-\int_0^T u_s\,dW_s - \frac{1}{2}\int_0^T u_s^2\,ds\right\}\right] = E\left[\exp\left\{-\int_0^T v_s\,dW_s - \frac{1}{2}\int_0^T v_s^2\,ds\right\}\right] = 1. \quad \blacksquare$$

For our example of the security price model we compute explicitly the required optional projection in Section 5.
3  E-M algorithm
We briefly describe the extended E-M algorithm presented in Dembo and Zeitouni (1986). In the next section we apply this algorithm to the estimation of $\theta$ in the model defined in (4) and (5). Let $\mathcal{F}_T^X$ be the $\sigma$-field generated by $\{X_t;\ 0 \le t \le T\}$, $\mathcal{F}_T^Y$ the $\sigma$-field generated by $\{Y_t;\ 0 \le t \le T\}$, and $\mathcal{F}_T^{X,Y}$ the $\sigma$-field generated by $\{X_t;\ 0 \le t \le T\}$ and $\{Y_t;\ 0 \le t \le T\}$. We identify $\Omega$ with $C[0,T] \times C[0,T]$ and $\mathcal{F}$ with the Borel sets of $\Omega$. Let $X$ be the first coordinate mapping process and $Y$ the second coordinate mapping process, i.e., for $(\omega_1, \omega_2) \in C[0,T] \times C[0,T]$ we have $X_t(\omega_1, \omega_2) = \omega_1(t)$ and $Y_t(\omega_1, \omega_2) = \omega_2(t)$. Let $\{P_\theta\}$ be a family of probability measures such that $P_\theta \sim P_\phi$ for all $\theta, \phi \in \Theta$, where $\Theta$ is a parameter space. Also let $E_\theta$ denote the expectation under the measure $P_\theta$. We denote by $P_\theta^Y$ the restriction of $P_\theta$ to $\mathcal{F}_T^Y$. Then obviously
$$\frac{dP_\theta^Y}{dP_\phi^Y} = E_\phi\left(\frac{dP_\theta}{dP_\phi}\,\Big|\,\mathcal{F}_T^Y\right). \qquad (22)$$
The maximum likelihood estimator $\hat{\theta}$ of $\theta$ is defined as
$$\hat{\theta} = \arg\max_\theta\left(\frac{dP_\theta^Y}{dP_\phi^Y}\right) \quad \text{for some fixed } \phi \in \Theta.$$
It should be clear that this definition does not depend on the choice of $\phi$. The maximization of $dP_\theta^Y / dP_\phi^Y$ is equivalent to the maximization of the log-likelihood function
$$L_\phi(\theta) \triangleq \log\frac{dP_\theta^Y}{dP_\phi^Y}. \qquad (23)$$
Now by (22) and (23),
$$L_\phi(\theta') - L_\phi(\theta) = \log\frac{dP_{\theta'}^Y}{dP_\theta^Y} = \log E_\theta\left(\frac{dP_{\theta'}}{dP_\theta}\,\Big|\,\mathcal{F}_T^Y\right),$$
and from Jensen's inequality,
$$L_\phi(\theta') - L_\phi(\theta) \ge E_\theta\left(\log\frac{dP_{\theta'}}{dP_\theta}\,\Big|\,\mathcal{F}_T^Y\right), \qquad (24)$$
with equality if and only if $\theta' = \theta$. The E-M algorithm is based on (24) and proceeds as follows. In the first step we select an initial value $\theta_0$ for $\theta$. In the $(n+1)$st iteration, $n \ge 0$, the E-M algorithm is applied to produce $\theta_{n+1}$. Each iteration consists of two steps: the expectation, or E, step and the maximization, or M, step. In the E-step of the $(n+1)$st iteration the following quantity is computed:
$$Q(\theta, \theta_n) \triangleq E_{\theta_n}\left(\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right).$$
In the M-step, $Q(\theta, \theta_n)$ is maximized with respect to $\theta$ to obtain $\theta_{n+1}$. Iterations are repeated until the sequence $\{\theta_n\}$ converges. It is clear from (24) that
$$L_\phi(\theta_{n+1}) \ge L_\phi(\theta_n), \qquad (25)$$
with equality if and only if $\theta_{n+1} = \theta_n$. Thus at each iteration of the E-M algorithm the likelihood function increases. In the M-step the maximizer over $\theta$ is obtained as the solution of
$$\frac{\partial}{\partial\theta}\,Q(\theta, \theta_n) = \frac{\partial}{\partial\theta}\,E_{\theta_n}\left(\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right) = 0, \qquad (26)$$
or, if we can interchange the differentiation and expectation operations, the maximizer over $\theta$ is the solution of
$$E_{\theta_n}\left(\frac{\partial}{\partial\theta}\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right) = 0.$$
A corollary of (25) is that the maximum likelihood estimator is a fixed point of the E-M algorithm. For a summary of the convergence properties of the E-M algorithm and additional references we refer the reader to Dembo and Zeitouni (1986).
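Schematically, the iteration can be organized as the following loop, where `e_step` and `m_step` stand for the model-specific computations developed in the next two sections; the function names and the stopping rule below are our own illustration rather than part of the algorithm's specification.

```python
def em_algorithm(theta0, e_step, m_step, max_iter=100, tol=1e-6):
    """Generic E-M loop: alternate the E-step (compute the conditional
    expectations that Q(., theta_n) depends on, given F_T^Y under P_theta_n)
    and the M-step (maximize Q(., theta_n) over theta)."""
    theta = theta0
    for _ in range(max_iter):
        stats = e_step(theta)        # filtered/smoothed quantities under the current theta_n
        theta_new = m_step(stats)    # closed-form maximizer of Q(., theta_n)
        if max(abs(a - b) for a, b in zip(theta_new, theta)) < tol:
            return theta_new         # the sequence has (numerically) converged
        theta = theta_new
    return theta
```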
4  Estimation of the drift of the security price process with the E-M algorithm
We develop the E-M algorithm for the estimation of $\theta = (\alpha, \delta, \lambda, \nu)$ in the security price model in (4) and (5). By Girsanov's theorem, if the $X$ process is also completely observed, the likelihood function of $\theta$ is given by
$$\frac{dP_\theta}{dP_{\mathbf{W}}} = \exp\left\{\int_0^T \alpha(\delta - X_s)\,dX_s - \frac{1}{2}\int_0^T \alpha^2(\delta - X_s)^2\,ds\right\} \times \exp\left\{\int_0^T (\lambda X_s + \nu)\,dY_s - \frac{1}{2}\int_0^T (\lambda X_s + \nu)^2\,ds\right\},$$
where $\mathbf{W} = (W^1, W)$ is a two-dimensional Brownian motion and $P_{\mathbf{W}}$ is the associated Brownian measure on $\mathcal{B}(C[0,T] \times C[0,T])$. Hence
$$\log\frac{dP_\theta}{dP_{\theta_n}} = \int_0^T \alpha(\delta - X_s)\,dX_s - \frac{1}{2}\int_0^T \alpha^2(\delta - X_s)^2\,ds + \int_0^T (\lambda X_s + \nu)\,dY_s - \frac{1}{2}\int_0^T (\lambda X_s + \nu)^2\,ds + f(\theta_n),$$
where $f(\theta_n)$ is a term which does not depend on $\theta$. Since, for our problem, we can interchange the operations of differentiation and integration, the equations in (26) take the form
$$E_{\theta_n}\left[\frac{\partial}{\partial\alpha}\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right] = E_{\theta_n}\left[\int_0^T(\delta - X_s)\,dX_s - \alpha\int_0^T(\delta - X_s)^2\,ds\,\Big|\,\mathcal{F}_T^Y\right] = 0, \qquad (27)$$
$$E_{\theta_n}\left[\frac{\partial}{\partial\delta}\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right] = E_{\theta_n}\left[\alpha X_T - \int_0^T \alpha^2(\delta - X_s)\,ds\,\Big|\,\mathcal{F}_T^Y\right] = 0, \qquad (28)$$
$$E_{\theta_n}\left[\frac{\partial}{\partial\lambda}\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right] = E_{\theta_n}\left[\int_0^T X_s\,dY_s - \int_0^T(\lambda X_s + \nu)X_s\,ds\,\Big|\,\mathcal{F}_T^Y\right] = 0, \qquad (29)$$
$$E_{\theta_n}\left[\frac{\partial}{\partial\nu}\log\frac{dP_\theta}{dP_{\theta_n}}\,\Big|\,\mathcal{F}_T^Y\right] = E_{\theta_n}\left[Y_T - \int_0^T(\lambda X_s + \nu)\,ds\,\Big|\,\mathcal{F}_T^Y\right] = 0. \qquad (30)$$
Let
$$m(t,T;\theta_n) = E_{\theta_n}(X_t \mid \mathcal{F}_T^Y), \qquad n(t,T;\theta_n) = E_{\theta_n}(X_t^2 \mid \mathcal{F}_T^Y).$$
Then, recalling that $\int_0^T X_s\,dX_s = \frac{1}{2}X_T^2 - \frac{T}{2}$, equation (27) becomes
$$\delta\,m(T,T;\theta_n) - \frac{1}{2}\,n(T,T;\theta_n) + \frac{T}{2} - \alpha\delta^2 T + 2\alpha\delta\int_0^T m(s,T;\theta_n)\,ds - \alpha\int_0^T n(s,T;\theta_n)\,ds = 0. \qquad (31)$$
Equation (28) takes the form
$$m(T,T;\theta_n) - \alpha\delta T + \alpha\int_0^T m(s,T;\theta_n)\,ds = 0. \qquad (32)$$
Multiplying (32) by $\delta$ and subtracting the result from (31) gives
$$-\frac{1}{2}\,n(T,T;\theta_n) + \frac{T}{2} + \alpha\delta\int_0^T m(s,T;\theta_n)\,ds - \alpha\int_0^T n(s,T;\theta_n)\,ds = 0. \qquad (33)$$
From equations (32) and (33) we obtain updated values of $\alpha$ and $\delta$:
$$\alpha_{n+1} = \frac{-\frac{1}{2}T\,n(T,T;\theta_n) + \frac{T^2}{2} + m(T,T;\theta_n)\int_0^T m(s,T;\theta_n)\,ds}{T\int_0^T n(s,T;\theta_n)\,ds - \left[\int_0^T m(s,T;\theta_n)\,ds\right]^2}, \qquad (34)$$
$$\delta_{n+1} = \frac{m(T,T;\theta_n)}{T\,\alpha_{n+1}} + \frac{\int_0^T m(s,T;\theta_n)\,ds}{T}. \qquad (35)$$
Similarly we write equation (30) as
$$Y_T - \lambda\int_0^T m(s,T;\theta_n)\,ds - \nu T = 0, \qquad (36)$$
and equation (29) as
$$E_{\theta_n}\left[\int_0^T X_s\,dY_s\,\Big|\,\mathcal{F}_T^Y\right] - \lambda\int_0^T n(s,T;\theta_n)\,ds - \nu\int_0^T m(s,T;\theta_n)\,ds = 0. \qquad (37)$$
We compute the first term of (37) in the following proposition. Let
$$m_s(\theta_n) = E_{\theta_n}(X_s \mid \mathcal{F}_s^Y),$$
and to simplify notation we will write
$$m(t,T) = m(t,T;\theta), \qquad n(t,T) = n(t,T;\theta), \qquad m_s = m_s(\theta).$$
Proposition 4
$$E_\theta\left[\int_0^T X_s\,dY_s\,\Big|\,\mathcal{F}_T^Y\right] = \int_0^T m_s\,dY_s - \int_0^T m_s(\nu + \lambda m_s)\,ds + \nu\int_0^T m(s,T)\,ds + \lambda\int_0^T m(s,T)\,m_s\,ds.$$
Proof. Let
$$\widetilde{W}_t = Y_t - \nu t - \lambda\int_0^t m_s\,ds. \qquad (38)$$
By Lemma 11.3 in Lipster and Shiryayev (1978), the process $(\widetilde{W}_t, \mathcal{F}_t^Y)$, $0 \le t \le T$, is a Brownian motion. By Theorem 12.1 of Lipster and Shiryayev (1978), $m_t$ satisfies the following equation:
$$dm_t = \alpha(\delta - m_t)\,dt - \lambda\gamma_t(\nu + \lambda m_t)\,dt + \lambda\gamma_t\,dY_t, \quad m_0 = 0, \qquad (39)$$
where $\gamma_t = V(X_t \mid \mathcal{F}_t^Y)$ is a deterministic function satisfying
$$\frac{d\gamma_t}{dt} = -\lambda^2\gamma_t^2 - 2\alpha\gamma_t + 1, \quad \gamma_0 = 0. \qquad (40)$$
From (38),
$$d\widetilde{W}_t = dY_t - (\nu + \lambda m_t)\,dt, \qquad (41)$$
and thus
$$dm_t = \alpha(\delta - m_t)\,dt + \lambda\gamma_t\,d\widetilde{W}_t. \qquad (42)$$
We see from (42) that $m_t$ is $\mathcal{F}_t^{\widetilde{W}}$-adapted. Then $Y_t$ is $\mathcal{F}_t^{\widetilde{W}}$-adapted by (38). Hence
$$E_\theta\left[\int_0^T X_s\,dY_s\,\Big|\,\mathcal{F}_T^Y\right] = E_\theta\left[\int_0^T X_s\,dY_s\,\Big|\,\mathcal{F}_T^{\widetilde{W}}\right] = E_\theta\left[\int_0^T X_s\,d\widetilde{W}_s + \int_0^T X_s(\nu + \lambda m_s)\,ds\,\Big|\,\mathcal{F}_T^{\widetilde{W}}\right]$$
$$= E_\theta\left[\int_0^T X_s\,d\widetilde{W}_s\,\Big|\,\mathcal{F}_T^{\widetilde{W}}\right] + \nu\int_0^T m(s,T)\,ds + \lambda\int_0^T m(s,T)\,m_s\,ds. \qquad (43)$$
Now by Theorem 5.14 in Lipster and Shiryayev (1977, p. 185),
$$E_\theta\left[\int_0^T X_s\,d\widetilde{W}_s\,\Big|\,\mathcal{F}_T^{\widetilde{W}}\right] = \int_0^T m_s\,d\widetilde{W}_s, \qquad (44)$$
provided that
$$E|X_s| < \infty, \quad 0 \le s \le T, \qquad (45)$$
and
$$P\left(\int_0^T \left[E\big(|X_s|\,\big|\,\mathcal{F}_s^{\widetilde{W}}\big)\right]^2 ds < \infty\right) = 1. \qquad (46)$$
Conditions (45) and (46) hold in our case, and thus by (44), (43) and (41),
$$E_\theta\left[\int_0^T X_s\,dY_s\,\Big|\,\mathcal{F}_T^Y\right] = \int_0^T m_s\,d\widetilde{W}_s + \nu\int_0^T m(s,T)\,ds + \lambda\int_0^T m(s,T)\,m_s\,ds$$
$$= \int_0^T m_s\,dY_s - \int_0^T m_s(\nu + \lambda m_s)\,ds + \nu\int_0^T m(s,T)\,ds + \lambda\int_0^T m(s,T)\,m_s\,ds,$$
which is the statement of the Proposition.
By Proposition 4, equation (37) becomes
$$0 = \int_0^T m_s(\theta_n)\,dY_s - \nu\int_0^T m_s(\theta_n)\,ds - \lambda\int_0^T m_s^2(\theta_n)\,ds + \lambda\int_0^T m(s,T;\theta_n)\,m_s(\theta_n)\,ds - \lambda\int_0^T n(s,T;\theta_n)\,ds. \qquad (47)$$
From (36) and (47) the updated values for $\lambda$ and $\nu$ are
$$\lambda_{n+1} = \frac{T\int_0^T m_s(\theta_n)\,dY_s - Y_T\int_0^T m_s(\theta_n)\,ds}{T\int_0^T\left[m_s^2(\theta_n) + n(s,T;\theta_n) - m(s,T;\theta_n)\,m_s(\theta_n)\right]ds - \int_0^T m(s,T;\theta_n)\,ds\int_0^T m_s(\theta_n)\,ds} \qquad (48)$$
and
$$\nu_{n+1} = \frac{Y_T - \lambda_{n+1}\int_0^T m(s,T;\theta_n)\,ds}{T}. \qquad (49)$$
To implement the E-M algorithm using equations (34), (35), (48), and (49) we need expressions for $m_s$, $m(s,T)$ and $n(s,T)$. These expressions are obtained in the next section.
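Given those quantities on a discrete time grid, the M-step itself reduces to the closed-form updates (34), (35), (48) and (49). The following sketch is our own discretization (trapezoid rule for the $ds$-integrals, a left-point sum for the $dY$-integral); the function and argument names are illustrative only.

```python
import numpy as np

def m_step(t, Y, m_filt, m_smooth, n_smooth):
    """One M-step: closed-form updates (34), (35), (48), (49).

    t        : time grid 0 = t_0 < ... < t_N = T
    Y        : observed path on the grid
    m_filt   : m_s(theta_n)     = E[X_s | F_s^Y]   on the grid
    m_smooth : m(s, T; theta_n) = E[X_s | F_T^Y]   on the grid
    n_smooth : n(s, T; theta_n) = E[X_s^2 | F_T^Y] on the grid
    """
    T = t[-1]
    dY = np.diff(Y)
    int_m   = np.trapz(m_smooth, t)            # int_0^T m(s,T) ds
    int_n   = np.trapz(n_smooth, t)            # int_0^T n(s,T) ds
    int_mf  = np.trapz(m_filt, t)              # int_0^T m_s ds
    int_m2  = np.trapz(m_filt**2, t)           # int_0^T m_s^2 ds
    int_mm  = np.trapz(m_filt * m_smooth, t)   # int_0^T m(s,T) m_s ds
    int_mdY = np.sum(m_filt[:-1] * dY)         # int_0^T m_s dY_s (left-point sum)

    # (34)-(35): updates for alpha and delta
    alpha = (-0.5 * T * n_smooth[-1] + 0.5 * T**2 + m_smooth[-1] * int_m) \
            / (T * int_n - int_m**2)
    delta = m_smooth[-1] / (T * alpha) + int_m / T

    # (48)-(49): updates for lambda and nu
    lam = (T * int_mdY - Y[-1] * int_mf) \
          / (T * (int_m2 + int_n - int_mm) - int_m * int_mf)
    nu = (Y[-1] - lam * int_m) / T
    return alpha, delta, lam, nu
```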
5  Expressions for Filters
We obtain $m_t$ as the solution of equation (39). Letting
$$\Phi_t = \exp\left(-\alpha t - \lambda^2\int_0^t \gamma_s\,ds\right), \qquad (50)$$
the solution is
$$m_t = \Phi_t\left[\lambda\int_0^t \gamma_s\,\Phi_s^{-1}(dY_s - \nu\,ds) + \alpha\delta\int_0^t \Phi_s^{-1}\,ds\right]. \qquad (51)$$
The function $\gamma_t$ is obtained as the solution of (40) and is given by
$$\gamma_t = \frac{b}{\lambda^2}\,\frac{\exp(2bt) - d}{\exp(2bt) + d} - \frac{\alpha}{\lambda^2}, \qquad (52)$$
where
$$b = \sqrt{\alpha^2 + \lambda^2}, \qquad d = \frac{b - \alpha}{b + \alpha}.$$
For the evaluation of $m_t$ it is useful to note the following expressions. Substituting (52) into (50) gives
$$\Phi_t = \exp\left(-b\int_0^t \frac{(b+\alpha)\exp(bs) - (b-\alpha)\exp(-bs)}{(b+\alpha)\exp(bs) + (b-\alpha)\exp(-bs)}\,ds\right) = \exp\left\{-\ln\left[(b+\alpha)\exp(bs) + (b-\alpha)\exp(-bs)\right]\Big|_0^t\right\} = \frac{2b}{(b+\alpha)\exp(bt) + (b-\alpha)\exp(-bt)},$$
and also
$$\gamma_s\,\Phi_s^{-1} = \frac{\sinh(bs)}{b}.$$
We note that the likelihood function of $\theta$ involves only the filter $m_s$, and is of the following explicit form:
$$\exp\left\{\int_0^T\left[\lambda m_s(\theta) + \nu\right]dY_s - \frac{1}{2}\int_0^T\left[\lambda m_s(\theta) + \nu\right]^2 ds\right\}.$$
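On a discretely sampled path the filter $m_t$ and the logarithm of this likelihood can be approximated directly from (39) and the closed form (52) for $\gamma_t$. The sketch below is our own discretization (an Euler step for (39), left-point sums for the two integrals), not code from the paper.

```python
import numpy as np

def filter_and_loglik(theta, t, Y):
    """Approximate m_t from (39), gamma_t from (52), and
    log L = int (lam*m_s + nu) dY_s - 0.5 * int (lam*m_s + nu)^2 ds."""
    alpha, delta, lam, nu = theta
    b = np.sqrt(alpha**2 + lam**2)
    d = (b - alpha) / (b + alpha)
    # closed-form conditional variance (52)
    gamma = (b / lam**2) * (np.exp(2*b*t) - d) / (np.exp(2*b*t) + d) - alpha / lam**2
    dt, dY = np.diff(t), np.diff(Y)
    # Euler scheme for (39): dm = alpha(delta - m)dt - lam*gamma*(nu + lam*m)dt + lam*gamma*dY
    m = np.zeros_like(t)
    for k in range(len(dt)):
        m[k+1] = m[k] + (alpha*(delta - m[k]) - lam*gamma[k]*(nu + lam*m[k])) * dt[k] \
                 + lam*gamma[k] * dY[k]
    drift = lam * m + nu
    loglik = np.sum(drift[:-1] * dY) - 0.5 * np.sum(drift[:-1]**2 * dt)
    return m, gamma, loglik
```

Maximizing this approximate log-likelihood numerically over $\theta = (\alpha, \delta, \lambda, \nu)$ is the direct alternative to the E-M iteration developed above.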
Application of Lemma 12.3 and Theorem 12.3 in Lipster and Shiryayev (1978, pp. 36-37) to our model shows that
$$m(s,t) = \left(\varphi_s^t\right)^{-1} m_t - \alpha\delta\int_s^t\left(\varphi_s^u\right)^{-1}du - \lambda\int_s^t\left(\varphi_s^u\right)^{-1}\gamma(u,s)\,(dY_u - \nu\,du), \quad s \le t, \qquad (53)$$
and
$$\gamma(s,t) = \left[1 + \lambda^2\gamma_s\int_s^t\left(\varphi_s^u\right)^2 du\right]^{-1}\gamma_s, \quad s \le t, \qquad (54)$$
respectively, where $\varphi_s^t$, $t \ge s$, is the solution of
$$\frac{d\varphi_s^t}{dt} = \left(-\alpha - \lambda^2\gamma(t,s)\right)\varphi_s^t, \quad \varphi_s^s = 1, \qquad (55)$$
and, for $t > s$, $\gamma(t,s) = V(X_t \mid \mathcal{F}_t^Y, X_s)$ satisfies
$$\frac{d\gamma(t,s)}{dt} = -2\alpha\gamma(t,s) + 1 - \lambda^2\gamma(t,s)^2, \quad \gamma(s,s) = 0. \qquad (56)$$
It can be easily verified that the solution to (56) is
$$\gamma(t,s) = \frac{b}{\lambda^2}\,\frac{\exp[2b(t-s)] - d}{\exp[2b(t-s)] + d} - \frac{\alpha}{\lambda^2}, \quad t \ge s. \qquad (57)$$
Using (57), the solution to (55) is
$$\varphi_s^t = \exp\left(-b\int_s^t \frac{\exp[2b(u-s)] - d}{\exp[2b(u-s)] + d}\,du\right) = (1+d)\,\frac{\exp[b(t-s)]}{\exp[2b(t-s)] + d}.$$
For the evaluation of $m(s,t)$ it is useful to note that
$$\left(\varphi_s^t\right)^{-1}\gamma(t,s) = \frac{\sinh[b(t-s)]}{b},$$
and to evaluate $\gamma(s,t)$ we compute the integral
$$\int_s^t\left(\varphi_s^u\right)^2 du = \frac{4b^2}{(b+\alpha)^2}\int_s^t \frac{\exp[2b(u-s)]}{\left\{\exp[2b(u-s)] + d\right\}^2}\,du = \frac{2b}{(b+\alpha)^2}\left[-\frac{1}{\exp[2b(u-s)] + d}\right]\Bigg|_s^t = \frac{\exp[2b(t-s)] - 1}{(b+\alpha)\exp[2b(t-s)] + b - \alpha}.$$
Finally, we note that
$$n(s,t) = \gamma(s,t) + m(s,t)^2, \quad 0 \le s \le t.$$
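For a discretely sampled path, the smoothed quantities needed in the E-step can be evaluated from (53), (54), the closed-form expressions above and this identity. The sketch below is our own discretization (trapezoid rule and left-point sums); it takes the filter $m_t$ as input (for its terminal value) and uses the same parameter ordering as before.

```python
import numpy as np

def smoother(theta, t, Y, m_filt):
    """Smoothed moments m(s,T) = E[X_s|F_T^Y], gamma(s,T) = V(X_s|F_T^Y)
    and n(s,T) = E[X_s^2|F_T^Y], via (53), (54) and n = gamma + m^2."""
    alpha, delta, lam, nu = theta
    b = np.sqrt(alpha**2 + lam**2)
    d = (b - alpha) / (b + alpha)
    gamma_filt = (b/lam**2)*(np.exp(2*b*t) - d)/(np.exp(2*b*t) + d) - alpha/lam**2  # (52)
    dt, dY = np.diff(t), np.diff(Y)
    T, m_T = t[-1], m_filt[-1]
    m_sm = np.zeros_like(t)
    g_sm = np.zeros_like(t)
    for i, s in enumerate(t):
        u = t[i:]                                                        # grid points u >= s
        phi_inv = (np.exp(2*b*(u - s)) + d) / ((1 + d)*np.exp(b*(u - s)))  # (phi_s^u)^{-1}
        w = np.sinh(b*(u - s)) / b                   # (phi_s^u)^{-1} * gamma(u, s)
        # (53): m(s,T) from the terminal filter value m_T
        m_sm[i] = (phi_inv[-1]*m_T
                   - alpha*delta*np.trapz(phi_inv, u)
                   - lam*np.sum(w[:-1]*(dY[i:] - nu*dt[i:])))
        # (54), with the closed-form value of int_s^T (phi_s^u)^2 du
        E = np.exp(2*b*(T - s))
        int_phi2 = (E - 1.0) / ((b + alpha)*E + b - alpha)
        g_sm[i] = gamma_filt[i] / (1.0 + lam**2*gamma_filt[i]*int_phi2)
    n_sm = g_sm + m_sm**2
    return m_sm, g_sm, n_sm
```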
Appendix
Proposition: Suppose that $\{(W_t, \mathcal{F}_t);\ t \le T\}$ is a Brownian motion, $\mathcal{X}$ is a complete separable metric space, $X : \Omega \to \mathcal{X}$ is a measurable mapping, and $\{Q(x, \cdot);\ x \in \mathcal{X}\}$ is a family of regular conditional probability measures for $\mathcal{F}$ given $X$. Let $P_X$ be the measure induced by $X$ on $\mathcal{B}(\mathcal{X})$. Then

(a) for every $A \in \mathcal{F}$ such that $P(A) = 0$ we have $Q(x, A) = 0$ for $P_X$-almost every $x \in \mathcal{X}$;

(b) for $P_X$-almost every $x \in \mathcal{X}$, the process $\{(W_t, \mathcal{F}_t^W);\ t \le T\}$ is a Brownian motion under the probability measure $Q(x, \cdot)$.

Proof: Statement (a) is an obvious consequence of the definition of a regular conditional probability measure. Indeed, for every $A \in \mathcal{F}$ we have
$$Q(x, A) = P(A \mid X = x), \quad P_X\text{-a.e. } x \in \mathcal{X},$$
and thus
$$Q(X, A) = P(A \mid X), \quad P\text{-a.s.}$$
This implies that if $A$ is a $P$-zero event then $E[Q(X, A)] = P(A) = 0$, and (a) now follows. Next we are going to prove (b). It follows from (a) that the process $W$ has $Q(x, \cdot)$-almost surely continuous paths for $P_X$-almost every $x \in \mathcal{X}$. We need to show that the finite-dimensional distributions of $W$ under $Q(x, \cdot)$ coincide with those under the probability measure $P$. Let us fix a positive integer $n$ and time points $0 \le t_1 \le \dots \le t_n \le T$. We denote by $\mathbb{Q}$ the set of rational numbers. For every $q = (q_1, \dots, q_n) \in \mathbb{Q}^n$ there exists a set $B_q \in \mathcal{B}(\mathcal{X})$ with $P_X(B_q) = 1$ such that for all $x \in B_q$
$$Q\big(x, W(t_1) \le q_1, \dots, W(t_n) \le q_n\big) = P\big(W(t_1) \le q_1, \dots, W(t_n) \le q_n \mid X = x\big) = P\big(W(t_1) \le q_1, \dots, W(t_n) \le q_n\big),$$
where the last step follows from the independence of $X$ and $W$ under $P$. Now we define
$$B = \bigcap_{q \in \mathbb{Q}^n} B_q,$$
and note that $P_X(B) = 1$ since $\mathbb{Q}^n$ is countable. It follows that the equation above holds for every $x \in B$, which completes the proof.
Acknowledgement
Halina Frydman completed part of this work while visiting the Department of Biostatistics at the University of Copenhagen. She thanks the department for its hospitality.
REFERENCES

Bielecki, T. R. and Pliska, S. R. (1999). Risk Sensitive Dynamic Asset Management. Journal of Applied Mathematics and Optimization 39: 337-360.

Dembo, A. and Zeitouni, O. (1986). Parameter Estimation of Partially Observed Continuous Time Stochastic Processes. Stochastic Processes and their Applications 23: 91-113.

Elliott, R. J., Aggoun, L. P. and Moore, J. B. (1997). Hidden Markov Models. Springer-Verlag.

Haugh, M. B. and Lo, A. W. (2000). Asset Allocation and Derivatives. Draft, MIT Sloan School of Management.

Kailath, T. and Zakai, M. (1971). Absolute Continuity and Radon-Nikodym Derivatives for Certain Measures Relative to Wiener Measure. The Annals of Mathematical Statistics 42: 130-140.

Karatzas, I. and Shreve, S. E. (1988). Brownian Motion and Stochastic Calculus. Springer-Verlag.

Kim, T. and Omberg, E. (1996). Dynamic Nonmyopic Portfolio Behavior. Review of Financial Studies 9: 141-161.

Lakner, P. (1998). Optimal Trading Strategy for an Investor: the Case of Partial Information. Stochastic Processes and their Applications 76: 77-97.

Lipster, R. S. and Shiryayev, A. N. (1977). Statistics of Random Processes, Vol. 1. Springer-Verlag.

Lipster, R. S. and Shiryayev, A. N. (1978). Statistics of Random Processes, Vol. 2. Springer-Verlag.

Protter, P. (1990). Stochastic Integration and Differential Equations. Springer-Verlag.
Department of Statistics and Operations Research
Stern School of Business, New York University
44 West 4th Street, New York, NY 10012
E-mail: hfrydman@stern.nyu.edu
E-mail: plakner@stern.nyu.edu
Abbreviated Title: Hidden Markov Processes