Click here

advertisement
PAPER:
Maximum likelihood estimation of the Heston stochastic volatility
model using asset and option prices: an application of nonlinear
filtering theory
Francesca Mariani, Graziella Pacelli, Francesco Zirilli
1.
2.
3.
4.
5.
6.
Abstract
Introduction
Filtering problem Filtering Movie
Estimation problem Estimation Movie
References
FORTRAN code
Warning: If your browser does not allow you to see the content of this website click here to
download a word file that reproduces the material contained here.
1. Abstract
Let us suppose that the dynamics of the stock prices and of their stochastic variance is described by
the Heston model, that is by a system of two stochastic differential equations with a suitable initial
condition. Our aim is to estimate the parameters of the Heston model and one component of the
initial condition, that is the initial stochastic variance, from the knowledge of the stock and option
prices observed at discrete times. The option prices considered refer to an European call on the
stock whose prices are described by the Heston model. The method proposed to solve this problem
is based on a filtering technique to construct a likelihood function and on the maximization of the
likelihood function obtained. The estimated parameters and initial value component are
characterized as being a maximizer of the likelihood function subject to some constraints. The
solution of the filtering problem, used to construct the likelihood function, is based on an integral
representation of the fundamental solution of the Fokker-Planck equation associated to the Heston
model, on the use of the wavelet expansions presented in [1], [2], [3] to approximate the integral
kernel appearing in the representation formula of the fundamental solution, on a simple truncation
procedure to exploit the sparsifying properties of the wavelet expansions and on the use of the Fast
Fourier Transform (FFT). The use of these techniques generates a very efficient and fully
parallelizable numerical procedure to solve the filtering problem, this last fact makes possible to
evaluate very efficiently the likelihood function and its gradient. As a byproduct of the solution of
the filtering problem we have developed a stochastic variance tracking technique that gives very
good results in numerical experiments. The maximum likelihood problem used in the estimation
procedure is a low dimensional constrained optimization problem, its solution with ad hoc
techniques is justified by the computational cost of evaluating the likelihood function and its
gradient. We use parallel computing and a variable metric steepest ascent method to solve the
maximum likelihood problem. Some numerical examples of the estimation problem using synthetic
data obtained with a parallel implementation of the previous numerical method are presented. Very
impressive speed up factors are obtained in the numerical examples using the parallel
implementation of the numerical method proposed. This website contains two animations and some
auxiliary material that helps the understanding of the paper [7] and makes available to the interested
users the computer programs used to produce the numerical experience presented here and in [7] .
2. Introduction
Assuming the stock prices and their stochastic volatilities or variances as state variables and the
stock prices and option prices observed at discrete times as observations, we study the problem of
forecasting (tracking) the unobserved and time-varying stochastic volatility or variance when the
model parameters and initial condition are known using the observations (filtering problem) and the
problem of estimating the model parameters and the unknown initial volatility or variance using the
observations (estimation problem). We consider these problems in the context of the Heston
stochastic volatility model.
Let St and vt be the state variables at time t>0 that is, respectively, the stock price and the stochastic
variance at time t>0. The Heston model assumes that the dynamics of the state variables (S t,vt), t>0,
is described by the following system of stochastic differential equations [4]:
dS t  μ S t dt  v t S t dWt1 ,
t  0,
(1)
dv t  γ(θ  v t )dt  ε v t dWt2 ,
t  0,
(2)
where Wt1, Wt2, t>0, are standard Wiener processes such that W01=W02=0, dWt1, dWt2 are their
stochastic differentials and dWt1dWt2 = dt where    denotes the mean of  and [-1,1] is a
constant known as correlation parameter;  is a real constant related to the behaviour of the mean
of the stock log returns, associated to the stochastic process St , t>0. Note that the stochastic
variance vt, t>0, is assumed to be described by a mean-reverting process with speed 0 and
parameters 0 and 0, that is by equation (2). The equations (1), (2) must be equipped with an
initial condition, that is:
~
S0  S0 ,
v ~
v .
0
0
~
Here S0 and
(3)
(4)
~v
0 denote random variables concentrated with probability one in a point; for
~
~
~v
S
0
0
simplicity, we continue to denote these points respectively with
and
. We note that S0 , the
~
stock price at t=0, is observable while v 0 , the stochastic variance at t=0, cannot be observed. The
quantities , , ,  and  are real constants and are the model parameters that we want to estimate.
Hence, the parameters and unknown initial condition component of the Heston model that we want
~
to estimate are , , , ,  and v 0 . We assume that 2/2 1, this assumption guarantees that if
~v
0 is positive with probability one, vt, solution of (1), (2), remains positive with probability one for
t>0.
We suppose to observe the prices of the stock and the prices of a European call option of known
maturity and strike price on the underlying whose price St, t>0, satisfies (1), (2), (3), (4) at
prescribed discrete times 0<t1<t2<…<tn<tn+1 =+, let Ft= {(Si,Ci) : ti t}, t>0, be the set of the
observations up to time t>0 of the stock prices St and of the option prices Ct that is for i=1,2,...,n let
Si be the observation of the stock price and Ci be the observation of the option price at time ti and let
Θ  (  ,  , ε, θ, ρ, ~
v 0 ) T be the vector of the model parameters and of the initial stochastic variance,
~ ~
where the superscript T denotes the transpose operator. We use the notation t0 =0 and F0={( S0 , v0 )},
this notation simplifies some of the formulae that follow. The option prices considered refer to the
prices of an European call option on the underlying whose prices are described by S t, t>0, with
known strike price E and maturity time T such that T>tn, we denote with C(S,v,t;E,T,) the price of
this option in the Heston stochastic volatility model when 0t<T, St=S and vt=v. We assume that
the option prices observed are given by the theoretical option prices in the Heston model affected by
a Gaussian error, that is: for i=1,2,...,n, Ci=C(Si,vi,ti; E,T,)+ui, where ui is a Gaussian error term;
that is we assume that ui is sampled from a Gaussian variable with mean zero and known variance
 i and C(Si,vi,ti; E,T,) is the option price in the Heston model assuming known the (unobserved)
variance vi at time ti.
3. Filtering problem
Let us assume the vector  known, we focus on the stock price and stochastic variance forecasting
problem from the observed stock and option prices. This problem is a filtering problem, where the
~
~
stochastic differential equations (1), (2) represent the “state equations”, S0 and v 0 at time t are the
0
initial condition and, for i=1,2,...,n, (Si ,Ci) at time ti represent the observations. Note that in the
~
filtering problem v 0 is known through the knowledge of . The solution of the previous filtering
problem is based on the knowledge of the joint probability density function p(S,v,t| Ft,),
(S,v)(0,+)  (0,+) , t>0, of the state variables, that is of the stock prices and of the stochastic
variance, conditioned to the observations Ft, t>0, and to the initial condition (3), (4).
We introduce the stock log returns:
~
x t  log(S t / S0 ),
(5)
t  0,
~
without loss of generality, we can assume S0= S0 =1. From (1), (2), (3), (4) and (5) it is easy to see
that (xt,vt), t>0, satisfy the following equations:
1
v t )dt  v t dWt1 ,
2
dv t  γ(θ  v t )dt  ε v t dWt2 ,
dx t  (μ 
t  0,
t  0,
with initial condition:
x0  ~
x 0  0,
v ~
v .
0
(8)
(9)
0
~
The joint probability density function p(x, v, t | Ft , Θ), (x,v)(-,+)  (0, +) , t>0, of the state
variables xt, vt, t>0, conditioned to the observations Ft, t>0, is solution of the following equations
[5]: for i=0,1,...,n:
~
p 1  2 (v~
p) 1  2 (ε 2 v~
p)  2 (ε ρ v~
p)  2 ((μ  v/2) ~
p) (γ (θ  v) ~
p)





, (x, v)  (-,)  (-,), t i  t 
2
2
t 2 x
2 v
xv
x
v
(10)
with initial condition: for i=0:
~
p (x, v, t | Ft 0 , Θ )  δ(x  ~
x 0 )δ(v  ~
v 0 ),
(x, v)  (,)  (,),
(11)
and for i=1,2,...,n:

 1 ~
δ(x  x i )
~
p(x, v, t i | Ft i , Θ) 
exp  
C(x, v, t i ; E, T, Θ)  C i
2πi
 2i
(x, v)  (-,)  (-,),

2
~
p(x, v, t i | Ft i 1 , Θ)

 
2
~
 exp   1 C
~
(x
,
v,
t
;
E,
T,
Θ
)

C
p(x i , v,
i
i
i
0  2i


(12)
~
where for i=1,2,...,n: xi=log(Si) and C (x,v,ti;E,T, ) is the value in the Heston model at time t=ti
x  x v ti  v
when t i
,
of a call option on the stock whose price is St, t>0, with strike price E and
maturity time T.
~
Once we know the joint probability density p (x,v,t|Ft,), (x,v)  (-,+)  (0, +), t>0, we can
forecast the values of the random variables (xt,vt), t>0, as follows:


x̂ t|Θ 
v̂ t|Θ 

~
 dx  xp(x, v, t | Ft , Θ)dv,

0



0
 dx  vp(x, v, t | F , Θ)dv,
~
t
The quantities
x̂ t|Θ , v̂ t|Θ , t  0,
t  0,
(13)
t  0.
(14)
can be regarded as the solution of the filtering problem.
Let G(x,v,t,x',v',t'|), (x,v), (x',v') (-,+)  (0, +) , t>t'>0, be the fundamental solution of (10),
we have (see [6] pag.605):


1

 2
G(x, v, x' , v' , t' | Θ)  2exp  (x  x' μτ)   dk  exp(k ( x  x' μτ))exp  2 (A(τ , k, l; )  B(τ , k, l; Θ)v' ilv
2
  
ε
(x, v), (x' , v' )  (,)  (0, ), t  t'  0, τ  t  t' 
where  denotes the imaginary unit an A and B are explicitly known elementary functions [6]
pag.605. Using the fundamental solution of (10) given in (15) the solution of (10), (11), (12) can
be written as follows:
~
p(x, v, t | Ft 0 , Θ)  G(x, v, t, ~
x0,~
v 0 , t 0 | Θ),
(16)
(x, v)  (,)  (0,), t 0  t  t 1 ,
and for i=1,2,…,n:

~
p(x, v, t | Ft i , Θ)   G(x, v, t, x i , v' , t i | Θ)f i (v' )dv' ,
(x, v)  (,)  (0,), t i  t  t i 1 ,
0
(17)
where
f i (v' ) 
δ(x  x i )~
p (x, v' , t i | Ft i 1 , Θ)π 1 (x, v' , t i | Ft i , Θ)

,
v'  0,
~
 p(x i , v, t i | Ft i1 , Θ)π1 (x i , v, t i | Ft i , Θ)dv
0
(18)
and
π1 (x, v' , t i | Ft i , Θ) 
(19)
δ(x  x i )
2πi

 1 ~
exp  
C(x, v' , t i ; E, T, Θ)  C i
 2i
 ,
2

(x, v' )  (,)  (0,).
~
p (x, v, t | Ft i , Θ)
The evaluation of
, (x,v) (-,+)  (0, +), t>0, for i=1,2,...,n, on a grid of N2
points in the (x,v) variables with a naïve quadrature procedure applied to formula (17) using N
quadrature nodes in each variable requires O(N5) complex multiplications when N goes to infinity.
In fact (17) due to (15) must be regarded as a three dimensional integral (i.e. an integral of the v' , k,
l variables). Using
 a wavelet expansion [1], [2], [3] of the integral kernel of (17) and of the function (18),
 a truncation procedure that exploits the sparsifying properties of the wavelet expansion,
 the Fast Fourier Transform (FFT),
we reduce the computational cost of the previous computation from O(N5) to N1O(N2log N)
complex multiplications when N goes to infinity, where N1 is independent of N. Note that N1 is the
number of the wavelet functions remaining in the expansion after the use of the truncation
procedure.
Using the estimates (13), (14), we can forecast the values of the stock log return x t, t>0, and of the
stochastic variance vt, t>0, coherently with the observations Ft, t0. The continuous update in time
(x̂ , v̂ ), t  0,
of the value of the couple t|Θ t|Θ
is the “trajectory tracking” of (6), (7), (8), (9). The
v̂ t|Θ , t  0,
continuous update in time of
is the “variance tracking”.
x̂ v̂
The quality of the estimated values ( t | , t | ), t>0, depends on the variance of the random
variables xt, vt, t>0, conditioned to the observations, that is we can estimate the quality of the
estimates (13), (14) computing respectively the quantities:



0



0
Σ(x t | Ft , Θ)   dx  (x  xˆ t|Θ ) 2 ~
p(x, v, t | Ft i , Θ)dv,
Σ(v t | Ft , Θ)   dx  (v  vˆ t|Θ ) 2 ~
p(x, v, t | Ft i , Θ)dv,
t  0,
(20)
t  0.
(21)
The estimates (13), (14) are good, when the variances (20), (21) are small.
Click here to download a digital movie showing the trajectory tracking (right top
corner), the evolution in time of the variance of the estimated variance that is the
evolution in time of (21) (right bottom corner) and the evolution in time of the joint
probability density function conditioned to the observations (left side) obtained solving
the filtering problem .
In the trajectory tracking shown in the digital movie and in Figure1 the solid line represents the true
trajectory made of the values (xt,vt), t[0,24/252.5], obtained computing with Euler method one
~
~
trajectory of the stochastic differential equations (6), (7), (8), (9) with x 0 =0 and v 0 =0.5; the
dashed line represents the trajectory followed by the forecasted values (
x̂ t | v̂ t |
,
), t[0,24/252.5],
obtained solving the filtering problem computing the integrals (13), (14) and using the data shown
in Table 1. The asterisks and the circles indicate, respectively, the values of (x t,vt) and of
(x̂ t|Θ , v̂ t|Θ )
at the observation times t=ti, i=0,1,...,5. We have chosen  = (0.026, 5.94, 0.306,
0.01159, -0.576, 0.5)T. We have considered the option with strike price E = 0.5 and maturity time T
= 126/252.5.
We assume that the non adimensional components of the parameter vector  are expressed in
annualized unit, that is in the unit 1/year. We use a “year” made of the average number of trading
days per year, that is 252.5 days, so that for example t=24/252.5 corresponds to the end of the
trading day number 24 of the regular year.
i
ti
xi
C(xi,vi,ti|E,)
0
0
0
1
4/252.5 -0.00371625 0.13279039
2
8/252.5 -0.08059790 0.06952851
3 12/252.5 0.00498269 0.12195051
4 16/252.5 -0.00590680 0.10456021
5 20/252.5 0.08869340 0.18950671
Ci
0.13440918
0.06835849
0.12349829
0.10599176
0.18758168
Table 1. The data of the filtering problem
Below we show two figures (Figure 1 and Figure 2) that help the understanding of the quality of the
estimates obtained solving the filtering problem. Figure 1 shows the trajectory tracking obtained as
solution of the filtering problem. In particular the solid line represents the
true trajectory made by the values (xt,vt), t[0,24/252.5], obtained solving with the Euler method
one trajectory of the stochastic differential equations (6), (7), (8), (9), the dashed line represents the
x̂ v̂
trajectory followed by the forecasted values ( t | , t | ), t[0,24/252.5]. Figure 2 shows (vt | Ft, )
versus time when t[0,24/252.5].
Figure 1. Trajectory tracking
Figure 2. (vt|Ft ,) versus time t
4. Estimation problem
In Section 3 we have assumed  known to solve the filtering problem but in real situations when
the Heston model is used  is not known and must be estimated from the knowledge of the
x̂ v̂
observations. The dynamics of the stochastic processes (xt,vt), t>0, and the estimates ( t | , t | ),
t>0,
obtained from the solution of the filtering problem, depend on the vector
Θ  (  ,  , ε, θ, ρ, ~
v 0 ) T . Let M  { :   0,   0, θ  0, 2θ /  2  1, 1    1} be the set of the
~
admissible vectors . The observations (xi,Ci), i=1,2,...,n,and the initial condition x 0 are
realizations (samples) of the random variable xt, t0, at the observation times and at t=t0=0 and of a
random variable, the option price Ct, t>0, derived from (xt,vt), t>0, at the observation times so that
we can hope that the observations can be used to characterize the vector . Therefore, in order to
derive forecasts of the stock log returns and of the stochastic variance and in order to obtain
derivative prices that are coherent with the observed derivative prices it is essential to have a
reliable estimate of the vector . That is we search an admissible vector  that makes most likely
~
the observations x 0 and (x ,C ), i=1,2,...,n.
i
i
We solve the problem of estimating the vector  from the observations using the maximum
likelihood method. The maximum likelihood method consists in searching the vector M that
maximizes the following (log-)likelihood function:
n 1

i 0
0
F(Θ)   log  ~
p(x i 1 , v, t i 1 | Ft i , Θ)π1 (x i 1 , v, t i | Ft i , Θ)dv,
Θ  M.
(22)
Note that (22) represents the sum of the logarithms of the integrals of the product between the
~
π (x , v, t i 1 | Fti , Θ)
transition probability densities p (xi+1,v,ti+1|Ft i ,) and the weight functions 1 i 1
,
i=0,1,...,n-1. The addition of these weight functions to the expression of the likelihood function
considered in (22) has the purpose of taking care of the fact that also the option prices C i,
i=1,2,...,n, are observed, assigning bigger weights to the values of the v variable that make more
likely the observation of Ci at time t=ti, i=1,2,...,n. The function F() represents the (log-)likelihood
~
of the parameter vector  given the observations Ft, t>0, and x 0 at time t=t0.
In order to maximize the (log-)likelihood function shown in (22), we use an iterative optimization
procedure that searches the maximum likelihood estimate * solution of the maximization
problem:
max F(Θ),
ΘM
(23)
starting from an initial guess 0M.
The iterative procedure that solves the optimization problem (23) using 0 as initial guess finds
during the iterations the admissible vectors k,, k=1,2,...,iter, where iter is the maximum number of
iterations allowed. The vectors k, k=0,1,…,iter, are such that F(k+1)F(k), k=0,1,…,iter-1.
We use as data of the estimation problem the same data used for the filtering problem that is the
data shown in Table 1. These are synthetic data that were generated using =true = (0.026, 5.94,
0.306, 0.01159, -0.576, 0.5)T. We choose as initial guess of the optimization procedure the vector
0=true(1+0.4), where  is a 6-dimensional standard Gaussian variable with independent
components.
We fix the maximum number of iterations allowed iter=40 and we use the variable metric steepest
ascent method described in [7] and we use one of the following two stopping criteria: the first
criterion consists in checking at every iteration the condition:

  (Θ kj  Θ kj 1 ) 2
 j1
6

  (Θ kj 1 ) 2
j1

6
1
2



  δ1 ,


(24)
where kj is the j-th component of the vector k, j=1,2,…,6, k=0,1,…,iter, 1 is a given tolerance
and in stopping the iteration when (24) is satisfied for the first time, the second one consists in
checking at every iteration the condition:
| F(Θ )  F(Θ
k
| F(Θ
k 1
k 1
)|
)|
 δ2 ,
(25)
where 2 is a given tolerance and in stopping the iteration when (25) is satisfied for the first time.
Choosing 1 = 2 =0.001 if we use the stopping criterion (24) the variable metric steepest ascent
procedure stops at the iteration k=20, if we use the stopping criterion (25) the variable metric
steepest ascent procedure stops at the iteration k=10, where 10=(0.02, 4.74, 0.24, 0.02, -0.63, 0.4)
and 20=(0.02,4.748,0.24, 0.02, -0.615, 0.422) and we have F(10) 19.1014 , F(20)  19.7356,
F(0)  11.6194, and F(true)  21.7477 .
Click here to download a digital movie showing the variance tracking for several values
of the parameter vector k M , obtained during the iterative optimization procedure
and the variance tracking corresponding to the parameter vector =true.
The evaluation of the (log-)likelihood function F() is very expensive computationally. We reduce
the computational cost of solving the filtering and estimation problems using parallel computing.
Using np processors the wall clock time required to execute the program that finds the solution of
T
the filtering problem reduces approximately of a factor np. In particular if we denote with n p the
wall clock time needed to evaluate the (log-)likelihood function F(), for a given choice of M,
using np processors, we can measure the efficiency of the parallel implementation considered
computing the speed up factor s(np), where
s(n p ) 
Tn p
T1
,
p  1,2,... .
(26)
In Table 2 we show as a function of the number of processors np the speed up factor and the wall
T
clock time n p needed to evaluate the (log-)likelihood function associated to the data shown in
Table 1 on IBM SP4 machine with 32 processors.
np
1
2
4
8
16
s(np)
1
1.9927
3.9424
7.6549
14.8287
Tn p (seconds)
4200
2107
1065
548.6
283.23
T
Table 2. Speed up factors s(np) and wall clock time n p
needed to evaluate the likelihood function associated to
the data of Table1 versus the number np of
processors
employed
5. References
[1]
L.Fatone, M.C.Recchioni, F.Zirilli New Scattering Problems and Numerical Methods in
Acoustics, in Recent Research Developments in Acoustics,S.G.Pandalai Managing Editor,
Transworld Research Network, Kerala, India, 2 (2005), 39-69.
[2]
L.Fatone, M.C.Recchioni, F.Zirilli New wavelet bases made of piecewise polynomial
functions: approximation theory, quadrature rules and applications to kernel sparsification
and image compression, in preparation.
[3]
L.Fatone, G.Rao, M.C.Recchioni, F.Zirilli, High Performance Algorithms Based on a New
Wavelet Expansion for Time Dependent Acoustic Obstacle Scattering, submitted to Journal
of Computational Physics.
[4]
S.L.Heston, A Closed-Form Solution for Options with Stochastic Volatility with
Applications to Bond and Currency Options , Review of Financial Studies, 6 (1993), 327343.
[5]
A.H. Jazwinski, Stochastic Processes and Filtering Theory , Academic Press, New York,
(1970).
[6]
A.Lipton, Mathematical Methods for Foreign Exchange, , World Scientific Pubblishing Co.
Pte. Ltd, Singapore, (2001).
[7]
F.Mariani, G.Pacelli, F.Zirilli, Maximum Likelihood Estimation of the Heston Stochastic
Volatility Model Using Asset and Option Prices: an Application of Nonlinear Filtering
Theory, , submitted to Optimization Letters.
6. FORTRAN Code
The FORTRAN code distributed in this website computes the solution of the estimation problem
(23) using the formulae outlined in Sections 3 and 4 and derived in [7] . The data of the estimation
problem are those described in Section 4 and must be provided by the user. The FORTRAN code
consists of two programs. The first program preproc.f reads and writes on files the data inserted by
the user, initializes the parameter vector 0 and computes the number of the wavelet coefficients
needed after the truncation procedure in order to approximate efficiently the integral kernel
contained in (17) associated to the initial guess 0.
click here to download
The second program estimate.f starting from the initial guess 0 generates a sequence of
admissible vectors k, k=1,2,…,iter, using a variable metric steepest ascent procedure. At every
iteration the program solves the filtering problem associated to the current vector parameter k.
click here to download
Both programs use parallel computing and MPI as message passing library. Therefore they can run
on parallel machines.
Entry n.
Download