
Variance Risk Premium Dynamics
Job Market Paper
Viktor Todorov
Duke University
Current Draft: January 3, 2007
∗†
Abstract
This paper uses high-frequency S&P 500 index futures data and data on the VIX index to provide an
arbitrage-free explanation of the variance risk premium and its dynamics. Using the high-frequency
data only, I select a semiparametric two-factor stochastic volatility model, containing jumps in the price
and the stochastic variance. For this model I derive prices of diffusive and jump risk that determine
the variance risk premium. Unlike other studies of the variance risk premium, this study allows compensation for both stochastic volatility and jumps to be reflected in the variance risk premium. The
price of jump risk considered here is novel and allows the jump risk premium to depend on the level of
past price jumps. Using the selected stochastic volatility model and the prices of risk, I conduct a joint
inference and detect a non-trivial variance risk premium. The estimation results show that the variance
risk premium varies significantly over time. It increases in periods of high volatility and straight after
big jumps. The empirical findings of this paper suggest habit persistence in investors’ fear of jumps,
i.e., after a market crash investors are willing to pay more to protect themselves from future market drops.
Key words: Change of measure, continuous-time stochastic volatility model, diffusive risk, jump risk,
Lévy process, quadratic variation, realized multipower variation, variance risk premium, variance swap
rate.
JEL classification: G12, C51, C52.
∗ Author's Contact: Viktor Todorov, viktor.todorov@duke.edu. Department of Economics, Duke University, Box 90097,
Durham NC 27708.
† I would like to thank the members of my committee, George Tauchen (chair), Tim Bollerslev, Ron Gallant and Han Hong,
for many discussions and encouragement along the way. I also thank seminar participants in the Duke Economics and Finance
seminars, Javier Cicco, Pedro Duarte, Paul Dudenhefer, Silvana Krasteva, Jonathan Mattingly and Barbara Rossi for helpful
comments. I benefited from discussions with Jean Jacod, Albert Shiryaev, Ernesto Mordecki, Mark Podolskij and other
seminar participants at the Conference on Stochastics in Science in Honor of Ole Barndorff-Nielsen, Guanajuato, Mexico,
March 2006.
1 Introduction
A central topic in finance concerns the risk premium that investors require for bearing different risks. Much
of the work so far has been centered on explaining the equity risk premium, i.e. the compensation for the
variation in asset prices (the price risk). However, the price risk is not the only risk from holding assets
that investors face. Over the last few decades the financial econometrics literature has provided strong
and unambiguous evidence that the variances of financial assets exhibit significant variation over time (e.g.
Bollerslev, Engle, and Nelson (1994) and more recently Andersen, Bollerslev, and Diebold (2005a)). This
variation introduces an additional source of risk from holding assets, referred to as variance risk. The
importance of the variance risk for the investors is directly underlined by the development and trading
of variance swap contracts, i.e., forward contracts on future variance.¹ Investors generally dislike the
randomness of the future variance and in equilibrium require a premium for accepting it. This is known
as the variance risk premium.
The main goal of this paper is to analyze the dynamics of the variance risk premium. Studying the
dynamics of the variance risk premium is important for at least two reasons. First, given the increased
interest in the direct trading of variance contracts, we need to be able to price these products. Moreover,
in many cases the variance products are part of a portfolio containing the underlying asset. Thus, the
pricing of the variance should not be done in isolation, but rather in a way that is consistent with the
pricing of the underlying asset. If the variance risk premium is constant, the substantial literature on
modeling and forecasting the returns variance (e.g. Andersen, Bollerslev, and Diebold (2005a)) is directly
applicable for pricing the variance products. Things are different, however, if the variance risk premium
has time-variation. Second, the analysis of the dynamics of the variance risk premium has implications for
the existence and properties of the pricing kernel, also known as the stochastic discount factor. In that
sense the importance of the analysis in the paper goes well beyond the pricing of variance products.
In this paper I provide an arbitrage-free explanation of the dynamics of the variance risk premium.
The analysis is based on high-frequency data on the S&P 500 index and data on the variance swap rate
(the VIX index). Using a general semiparametric stochastic volatility model and flexible prices of risk in
this model, I am able to account for the dynamics of the variance risk premium implied by the data.
Previous studies of the dynamics of the variance risk premium include Bollerslev, Gibson, and Zhou
(2005) and Wu (2005).²,³ There are two substantial differences between these two papers and the current
work, which also summarize the major contributions of this paper to the existing literature. The first
difference is the separation of the price jumps from the continuous price component. Bollerslev, Gibson,
and Zhou (2005) do not allow for price jumps in their model, while Wu (2005) allows them but he does not
consider their separation from the continuous price component in the estimation. In contrast, in this paper
I allow for jumps in the model and use the high-frequency data on the index to separate the continuous
from the discontinuous component of the price. The advantage of this separation is that it allows one to isolate
the effect of the price jumps on the variance risk premium and hence permits a deeper analysis of the
determinants of the variance risk premium.
¹ These contracts give exposure only to variance risk and hence provide an instrument to hedge against it. The recent
theoretical results in Carr and Wu (2004) and Britten-Jones and Neuberger (2000) imply that the variance swap can be
replicated with a static portfolio of option contracts written on the underlying asset. In 2003 CBOE adopted these theoretical
results in calculating its volatility index, the VIX index, and as a result the new VIX index reflects the theoretical price of
a variance swap contract. The ability to replicate the VIX index (i.e. the variance swap) directly with standard options
further increased the interest in trading future variance. As a result, in 2004 CBOE launched trading of futures contracts on
the VIX index and at the beginning of 2006 it started trading option contracts written on the VIX index as well.
² Other studies of the variance risk premium include Bakshi and Kapadia (2003), Carr and Wu (2004) and Bakshi and
Madan (2006). These papers, however, do not consider modeling the dynamics of the variance risk premium.
³ Of course, since the variance risk premium (and its dynamics) can be determined from the pricing kernel, all papers which
consider estimation of the pricing kernel are also indirectly related to the current study of the dynamics of the variance risk
premium. An incomplete list includes Bates (2000), Chernov and Ghysels (2000), Ait-Sahalia, Wang, and Yared (2001), Pan
(2002), Rosenberg and Engle (2002), Eraker (2004), Santa-Clara and Shu (2005), Broadie, Chernov, and Johannes (2006). A
major difference between the current study and these papers is the data. The current paper uses high-frequency data on the
underlying index and data on a variance swap (a portfolio of options), while the above-cited papers use low-frequency data on the
underlying asset and data on a set of options. As discussed later, the use of high-frequency data is crucial for the analysis
here.
The second difference between Bollerslev, Gibson, and Zhou (2005) and Wu (2005) and the current
paper is the source of the variance risk premium. Variance risk premium in Bollerslev, Gibson, and Zhou
(2005) and Wu (2005) is associated with the compensation for the time-variation in the conditional variance.
However, when the model contains price jumps, the variance of the asset will vary over time even if there
is no time-variation in the conditional variance of the returns. The variance risk premium in general,
therefore, reflects also compensation demanded by investors for the presence of price jumps. In contrast
to the above-cited papers, I allow compensation for both presence of price jumps and time-variation in the
conditional variance to determine the variance risk premium. In fact, it is the flexible specification of the
compensation for jump risk, considered in the present paper, that allows me to explain the dynamics of
the variance risk premium implied by the data.
The analysis of the dynamics of the variance risk premium has two major building blocks. The first is
the specification of a model for the dynamics of the underlying index, while the second one is specification
of prices of risk in the model (i.e. specification of a valid pricing kernel). The paper starts with a selection of
a stochastic volatility model. I work with a very general semiparametric model which nests the affine-jump
diffusion models of Duffie, Pan, and Singleton (2000) (with constant jump intensity) and the jump-diffusion
jump-driven stochastic volatility models of Todorov (2006a) (which include the non-Gaussian OU model of
Barndorff-Nielsen and Shephard (2001)). Empirically relevant features of the model include the presence of
jumps both in the price and the stochastic variance (the spot variance of the continuous price component)
as well as the multifactor-type structure of the stochastic variance. The modeling of the jumps in the
price and the variance is quite flexible and allows for all possible dependencies between them. Using only
high-frequency data on the underlying index, I estimate different specifications of the general stochastic
volatility model and select one of them for the analysis of the variance risk premium. The selected model
has two variance factors. One is diffusive (modelled as a square-root process) and very persistent. The
other variance factor is driven by jumps and has a very short memory.
In the selected model, the variation both in the price and in the stochastic variance of the underlying
asset is driven by diffusive shocks (modelled with a Brownian motion) and jumps (modelled with a general
pure-jump Lévy process). Therefore, to determine the variance risk premium, we need prices of diffusive
and jump risks in the model. The compensation for diffusive risk, considered here, is the generalized affine
price of risk, as recently defined in Cheridito, Filipović, and Kimmel (2005) in the context of affine diffusion
models. The price of jump risk that is used is novel and quite flexible. It allows jumps to have very different
behavior under the physical and the risk-neutral measure. For example, the compensation for the jumps
allows for a situation where the jumps are time-homogeneous under the physical measure and yet exhibit
significant persistence under the risk-neutral measure. This flexibility turns out to be empirically relevant.
Following Todorov (2006b), the estimation in the present paper is based on matching moments of
realized multipower variation. Realized multipower variation statistics aggregate high-frequency data on
a daily level and provide a good approximation of latent quantities of the model (see Barndorff-Nielsen,
Graversen, Jacod, Podolskij, and Shephard (2005)⁴). In the estimation I treat the realized multipower
variation statistics as their (unobservable) asymptotic limits. This introduces error in the parameter
estimation. The error converges in probability to zero for the general stochastic volatility model used in
the paper under the condition that the number of intraday observations goes to infinity. Further, under
the additional condition that the number of intraday observations increases slightly faster than the number
of days in the sample, $T$, this approximation error is asymptotically negligible, i.e., it is of order $o_p(1/\sqrt{T})$.
A final remark regarding the estimation is related to the jump specification. In the estimation, the
distribution of the jumps in the price and the variance is left unspecified. Instead, only certain moments
of the jumps are estimated. The advantage of this approach is that the results of the paper are immune
to misspecification of the distribution of the jumps. This is particularly relevant for the dependence
between the jumps in the price and the variance. The estimation results indicate that this dependence
is statistically significant. The estimated dependence between the jumps, however, is different from that
implied by most parametric specifications for the jumps used in the financial literature. This finding
underscores the advantage of the estimation approach adopted here of not modeling the jumps
parametrically.
⁴ Their asymptotic behavior in the case of no price jumps, as the number of intraday observations goes to infinity, is derived
in Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005). These results are partially extended to the case when
the price process contains jumps, which is the case of interest in this paper, by Barndorff-Nielsen, Shephard, and Winkel
(2006) and Jacod (2006a,b).
My main empirical findings can be summarized as follows. I find a non-trivial variance risk premium.
Its estimated mean is 0.6827, while the sample mean of the variance swap rate is 1.6542 (both estimates
are in daily variance units). Further, the variance risk premium shows significant variation over time. An
estimated lower bound for its variance is 0.3401, while the sample variance of the variance swap rate is
1.2775. I find that both price jumps and stochastic variance are important determinants of the variance risk
premium. The dependence of the variance risk premium on the price jumps, to the best of my knowledge,
is a new finding. The empirical evidence indicates that after a big jump in the price, the variance risk
premium increases and takes a while to revert to its mean. This is explained by a compensation for jumps
that depends on a very persistent state variable, which, in turn, is related to the price jumps. At the
same time, the price jumps in the model are time-homogeneous (under the physical measure), since they
are modelled as a Lévy process, and further the estimation results show that their effect on the stochastic
variance disappears quickly. Thus, the empirical finding of a persistent jump risk premium suggests a habit
persistence in investors’ fear of jumps: immediately after a market crash investors are willing to pay more
to protect themselves against future market drops.
Finally, my findings for the importance of the time-varying jump risk premium are consistent with the
results of Bates (2000), Pan (2002) and Santa-Clara and Shu (2005), among others. However, there are
two major differences between this study and the above-cited papers in the modeling of the time-varying
jump risk premium. First, in this study the compensation for jump risk depends on the past jumps in the
stock market index and this accounts for the observed dependence of the variance risk premium on past
price jumps. Second, the jump risk premium here is not directly linked to a state variable in the model
such as the variance jump factor. This is important since the estimation results show that the jump risk
premium, although related to past jumps, has much longer memory than does the variance jump factor.
The remainder of the paper is organized as follows. Section 2 introduces the general stochastic volatility model for the dynamics of the underlying asset under the physical measure. I discuss how the model
can capture key empirical features of asset prices and derive the moments to be used later in the estimation. Section 3 describes the estimation technique based on the realized multipower variation statistics
constructed from the high-frequency data. This Section also contains an asymptotic result for realized multipower variation based inference in the context of the stochastic volatility model used here. In Section 4
I estimate different specifications of the general stochastic volatility model, introduced in Section 2, and
select one of them to be used for the analysis of the variance risk premium. The estimation is done using
only high-frequency data on the underlying asset. In Section 5 I construct a measure for the variance risk
premium and report significant empirical evidence for time-variation in this measure. Section 6 derives
general prices of diffusive and jump risk within the selected stochastic volatility model and discusses their
implication for the variance risk premium. In this Section I also test the different specifications of diffusive
and jump risk using high-frequency data on the underlying asset and data on the variance swap rate.
Section 7 concludes. All the proofs are given in Appendices at the end of the paper.
2 Dynamics under the Physical Measure
In this Section I specify the general stochastic volatility model and define key quantities associated with it
that are used for the definition and estimation of the variance risk premium. Later in Section 4 I estimate
different specifications of the general stochastic volatility model, introduced in this Section, and select
one of them for the analysis of the variance risk premium. The current Section also discusses the main
characteristics of the model and argues for its flexibility. Finally, moments of the return process, to be
used in the estimation, are also derived.
2.1 The Stochastic Volatility Model
I fix a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with $\mathbf{F} = (\mathcal{F}_t)_{t\in\mathbb{R}}$ its filtration. On this space I denote by $F(t)$
the price at time $t$ of a futures contract on the stock market index expiring at some future date. I assume
for $f(t) = \log(F(t))$ the following dynamics under the physical measure $\mathbb{P}$:
$$f(t) = f(0) + \int_0^t \alpha(s)\,ds + \int_0^t \sigma(s-)\,dW(s) + \int_0^t\int_{\mathbb{R}^n_0} h(\mathbf{x})\,\tilde{\mu}(ds, d\mathbf{x}), \tag{1}$$
$$\sigma^2(t) = V^c(t) + V^j(t), \tag{2}$$
$$V^c(t) = \sum_{i=1}^{p} V_i^c(t), \quad\text{and}\quad dV_i^c(t) = \kappa_i\left(\theta_i - V_i^c(t)\right)dt + \sigma_{iv}\sqrt{V_i^c(t)}\,dB_i(t), \quad i = 1, \dots, p, \tag{3}$$
$$V^j(t) = \int_{-\infty}^{t}\int_{\mathbb{R}^n_0} g(t-s)\,k(\mathbf{x})\,\mu(ds, d\mathbf{x}), \tag{4}$$
where $(W(t), B_1(t), \dots, B_p(t))$ is a $(p+1)$-dimensional Brownian motion with $B_1(t), \dots, B_p(t)$ independent
of each other and having correlation coefficients $\rho_1, \rho_2, \dots, \rho_p$ respectively with $W(t)$; $\mathbf{x}$ is an $n$-dimensional
vector in $\mathbb{R}^n_0$; $\mu$ is a time-homogeneous Poisson random measure with compensator $\nu$ such that $\nu(dt, d\mathbf{x}) =
dt\,G(d\mathbf{x})$ for some $G : \mathbb{R}^n_0 \to \mathbb{R}_+$; $g : \mathbb{R}_+ \to \mathbb{R}_+$, $h : \mathbb{R}^n_0 \to \mathbb{R}$ and $k : \mathbb{R}^n_0 \to \mathbb{R}_+$; and $\tilde{\mu} := \mu - \nu$ is the
compensated measure.
Sufficient conditions for the existence of all processes in the model (1)-(4) are given in Section 3
(Assumption 4). The futures price in (1) has three components. The first is the drift term which is
absolutely continuous. In this paper it is left unspecified.
The second component of the price is a continuous local martingale. Its time-variation is determined
by the process $\sigma^2(t)$. I refer to $\sigma^2(t)$ as the stochastic variance, since it determines the time-variation
in the conditional variance of the returns.⁵ $\sigma^2(t)$ is a sum of two factors. The first factor, $V^c(t)$, is the
continuous component of the stochastic variance. I model it as a sum of square-root processes as in the
standard affine stochastic volatility models (Duffie, Pan, and Singleton (2000) and Duffie, Filipović, and
Schachermayer (2003)).
The second component of the stochastic variance, $V^j(t)$, is its discontinuous part.⁶ I model $V^j(t)$ as a
moving average of a pure jump Lévy process. To guarantee nonnegativity of $V^j(t)$ I define it as an integral
with respect to the random measure $\mu$ and not with respect to its compensated version $\tilde{\mu}$. Further, I
restrict $k(\cdot) > 0$ and $g(\cdot) > 0$ as already specified in the definition of the stochastic volatility model. A
more familiar representation for $V^j(t)$ is (with the normalization $g(0) = 1$)
$$V^j(t) = \sum_{s \le t} g(t-s)\,\Delta V^j(s).$$
This shows that $V^j(t)$ is a weighted sum of past variance jumps. The impact of the past jumps on the
current level of $V^j(t)$ is determined by the function $g(\cdot)$. In other words, $g(\cdot)$ controls the persistence in
the process $V^j(t)$. This modeling of the discontinuous component of the stochastic variance follows the
general dynamics of the jump-driven stochastic volatility models introduced in Todorov (2006a) (which
include also the non-Gaussian OU model of Barndorff-Nielsen and Shephard (2001) and its extensions in
Brockwell (2001a) and Brockwell and Marquardt (2005)). In these models the stochastic variance is driven
solely by nonnegative jumps.
⁵ This is because the jump martingale is time-homogeneous.
⁶ $V^j(t)$ is discontinuous provided $g(0) \neq 0$, which will be assumed.
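The weighted-sum representation of $V^j(t)$ can be illustrated with a minimal sketch. The exponential kernel, its decay rate, and the jump times and sizes below are all assumptions made for illustration (an exponential kernel corresponds to an OU-type memory); they are not the specification or the estimates of this paper.

```python
import math

def exp_kernel(u, decay=2.0):
    """Hypothetical memory kernel g(u) = exp(-decay * u) for u >= 0, so g(0) = 1."""
    return math.exp(-decay * u) if u >= 0 else 0.0

def jump_variance_factor(t, jump_times, jump_sizes, g=exp_kernel):
    """V^j(t): sum over past jump times s <= t of g(t - s) times the variance jump at s."""
    return sum(g(t - s) * dv for s, dv in zip(jump_times, jump_sizes) if s <= t)

# Made-up variance jumps: times in trading days, sizes in daily variance units.
jump_times = [0.5, 1.25, 3.0]
jump_sizes = [0.4, 0.1, 0.7]

for t in (0.4, 0.5, 1.0, 3.0, 6.0):
    print(f"V^j({t}) = {jump_variance_factor(t, jump_times, jump_sizes):.4f}")
```

With a faster-decaying kernel the factor returns to zero more quickly after each jump, which is the sense in which $g(\cdot)$ controls persistence.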
The last component of the price in equation (1) is a jump martingale and as a result is defined as
an integral with respect to the compensated martingale measure $\tilde{\mu}$. This notation is less familiar in the
empirical finance literature. If the price jumps are of finite variation, e.g. all compound Poisson processes,
we have
$$\int_0^t\int_{\mathbb{R}^n_0} h(\mathbf{x})\,\tilde{\mu}(ds, d\mathbf{x}) = \int_0^t\int_{\mathbb{R}^n_0} h(\mathbf{x})\,\mu(ds, d\mathbf{x}) - t\int_{\mathbb{R}^n_0} h(\mathbf{x})\,G(d\mathbf{x}). \tag{5}$$
The second term in the above equation is deterministic and linear in $t$, so it can be absorbed into the drift term. For the first term
in (5) we have
$$\int_0^t\int_{\mathbb{R}^n_0} h(\mathbf{x})\,\mu(ds, d\mathbf{x}) = \sum_{0 < s \le t} \Delta f(s),$$
which is more familiar and shows that this integral is simply a sum of all price jumps up to time $t$. The
reason the price jumps are written as in equation (1) is that this allows considering more general cases in
which the decomposition in (5) does not work, i.e. the case of infinite variation price jumps.⁷ Therefore,
the jumps in the price are allowed to be completely general as far as their activity is concerned.⁸
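For the finite-variation case, the decomposition in (5) can be checked by simulation. The sketch below assumes compound Poisson price jumps with Gaussian jump sizes; the intensity, the jump-size distribution and all function names are hypothetical choices for illustration, not part of the model estimated in the paper. For such jumps $\int h(\mathbf{x})G(d\mathbf{x})$ equals the intensity times the mean jump size.

```python
import random

def simulate_compound_poisson(t_end, intensity, draw_jump, rng):
    """Jump times and sizes of a compound Poisson process on (0, t_end]."""
    times, sizes = [], []
    t = rng.expovariate(intensity)
    while t <= t_end:
        times.append(t)
        sizes.append(draw_jump(rng))
        t += rng.expovariate(intensity)
    return times, sizes

def raw_jump_sum(t, times, sizes):
    """Integral of h against mu over (0, t]: the plain sum of price jumps."""
    return sum(j for s, j in zip(times, sizes) if s <= t)

def compensated_jump_sum(t, times, sizes, intensity, mean_jump):
    """Right-hand side of (5): raw sum minus t * intensity * mean_jump."""
    return raw_jump_sum(t, times, sizes) - t * intensity * mean_jump

# Hypothetical parameters (not estimates from the paper).
INTENSITY = 0.5   # expected number of jumps per trading day
MEAN_JUMP = -0.5  # mean jump size
SD_JUMP = 1.0     # standard deviation of the jump size
T_END = 10.0
N_PATHS = 20000

rng = random.Random(0)
draw = lambda r: r.gauss(MEAN_JUMP, SD_JUMP)
raw_total = comp_total = 0.0
for _ in range(N_PATHS):
    times, sizes = simulate_compound_poisson(T_END, INTENSITY, draw, rng)
    raw_total += raw_jump_sum(T_END, times, sizes)
    comp_total += compensated_jump_sum(T_END, times, sizes, INTENSITY, MEAN_JUMP)

avg_raw = raw_total / N_PATHS
avg_compensated = comp_total / N_PATHS
print(f"average raw jump sum:         {avg_raw:.3f}")
print(f"average compensated jump sum: {avg_compensated:.3f}")
```

Under these assumptions the raw sum drifts at rate `INTENSITY * MEAN_JUMP` per day, while the compensated sum averages out near zero, consistent with its martingale property.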
I proceed with defining key variables, associated with the stochastic volatility model (1)-(4), to be used
throughout the paper. The return of holding the futures contract over the period $(t, t+a]$ is denoted by
$r_a(t) = f(t+a) - f(t)$. The quadratic variation (hereafter QV) of the futures price $f(t)$ over the period
$(t, t+a]$ is given by
$$[f, f]_{(t,t+a]} = \int_t^{t+a} \sigma^2(s)\,ds + \int_t^{t+a}\int_{\mathbb{R}^n_0} h^2(\mathbf{x})\,\mu(ds, d\mathbf{x}). \tag{6}$$
The first term in the quadratic variation is due to the continuous martingale in the futures price. This is
the continuous part of the quadratic variation. I refer to it as Integrated Variance (hereafter IV)
$$IV_a(t) = \int_t^{t+a} \sigma^2(s)\,ds. \tag{7}$$
The second component of the quadratic variation is due to the discontinuous martingale in the price. It
can be written as
$$\int_t^{t+a}\int_{\mathbb{R}^n_0} h^2(\mathbf{x})\,\mu(ds, d\mathbf{x}) = a\int_{\mathbb{R}^n_0} h^2(\mathbf{x})\,G(d\mathbf{x}) + \int_t^{t+a}\int_{\mathbb{R}^n_0} h^2(\mathbf{x})\,\tilde{\mu}(ds, d\mathbf{x}). \tag{8}$$
The first term in (8) is a constant due to the time-homogeneity property of the Lévy processes. The second
term in (8) is a jump martingale with jumps equal to $h^2(\mathbf{x})$ (i.e. the squares of the price jumps).
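As a worked illustration of the constant term in (8), suppose, purely as an assumption for this example, that the price jumps are compound Poisson with intensity $\lambda$ per day and normally distributed sizes with mean $\mu_J$ and variance $\sigma_J^2$, and that $h(x) = x$. The numerical values below are hypothetical and are not estimates from the paper.

```latex
% Assumed example: G(dx) = \lambda \, \phi(x; \mu_J, \sigma_J^2) \, dx and h(x) = x,
% where \phi denotes the normal density.
\begin{align*}
\int_{\mathbb{R}_0} h^2(x)\, G(dx)
  &= \lambda \, \mathbb{E}\left[J^2\right]
   = \lambda \left(\mu_J^2 + \sigma_J^2\right), \\
\mathbb{E}\left[\int_t^{t+a}\int_{\mathbb{R}_0} h^2(x)\, \mu(ds, dx)\right]
  &= a \, \lambda \left(\mu_J^2 + \sigma_J^2\right).
\end{align*}
% With the hypothetical values \lambda = 0.2, \mu_J = -0.3, \sigma_J = 0.8 and a = 1:
% 0.2 \times (0.09 + 0.64) = 0.146.
```

The expectation of the compensated integral in (8) is zero, so the whole expected jump contribution to the quadratic variation comes from the constant term.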
Further, it is convenient to decompose IV into two components corresponding to the continuous and
jump components of $\sigma^2(t)$:
$$IV_a(t) = IV_a^c(t) + IV_a^j(t), \tag{9}$$
where
$$IV_a^c(t) = \int_t^{t+a} V^c(s)\,ds, \quad\text{and}\quad IV_a^j(t) = \int_t^{t+a} V^j(s)\,ds. \tag{10}$$
⁷ In intuitive terms a function is of finite variation if the length of its trajectory over a finite interval is finite. If this is not the case the
function is of infinite variation. If the jumps are of infinite variation we need to compensate them in order to be able to define
the last integral in (1). In this case the integral is defined as a stochastic integral, see e.g. Jacod and Shiryaev (2003).
⁸ In the empirical part the activity of the price jumps is restricted and the infinite variation case is excluded. However,
for this Section I keep the model as general as possible and allow for infinite variation price jumps since the analysis in this
Section covers this case as well and nothing is gained from excluding it.
The second integral in (10) can be expressed as another integral with respect to the random measure $\mu$.
This is easily done using Fubini's theorem:⁹
$$IV_a^j(t) = \int_{-\infty}^{t+a}\int_{\mathbb{R}^n_0} H_a(t, s)\,k(\mathbf{x})\,\mu(ds, d\mathbf{x}), \tag{11}$$
where
$$H_a(t, s) = \begin{cases} \int_t^{t+a} g(z-s)\,dz & \text{if } s < t, \\ \int_s^{t+a} g(z-s)\,dz & \text{if } t \le s < t+a. \end{cases} \tag{12}$$
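The interchange of integrals behind (11)-(12) can be checked numerically. The sketch below assumes an exponential kernel $g(u) = e^{-\rho u}$, for which $H_a(t,s)$ has a closed form, and compares the direct time integral of $V^j$ with the sum of $H_a(t,s)$-weighted jumps. The kernel, the decay rate and the jump data are hypothetical.

```python
import math

DECAY = 2.0  # hypothetical decay rate of the assumed kernel g(u) = exp(-DECAY * u)

def g(u):
    return math.exp(-DECAY * u) if u >= 0 else 0.0

def H(t, s, a=1.0):
    """H_a(t, s) from (12), in closed form for the exponential kernel."""
    if s >= t + a:
        return 0.0  # jumps after t + a do not enter the integral in (11)
    lower = max(s, t)
    return (math.exp(-DECAY * (lower - s)) - math.exp(-DECAY * (t + a - s))) / DECAY

def V_j(z, jump_times, jump_sizes):
    """V^j(z) as a kernel-weighted sum of past variance jumps."""
    return sum(g(z - s) * dv for s, dv in zip(jump_times, jump_sizes) if s <= z)

def iv_jump_direct(t, jump_times, jump_sizes, a=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of V^j over (t, t + a]."""
    dz = a / steps
    return sum(V_j(t + (i + 0.5) * dz, jump_times, jump_sizes) for i in range(steps)) * dz

def iv_jump_via_H(t, jump_times, jump_sizes, a=1.0):
    """Right-hand side of (11) for a finite number of jumps."""
    return sum(H(t, s, a) * dv for s, dv in zip(jump_times, jump_sizes))

# Made-up variance jumps before, inside and after the window (2, 3].
jump_times = [-1.5, 0.2, 2.3, 2.9, 4.0]
jump_sizes = [0.6, 0.3, 0.5, 0.2, 0.9]

direct = iv_jump_direct(2.0, jump_times, jump_sizes)
via_H = iv_jump_via_H(2.0, jump_times, jump_sizes)
print(f"direct integral: {direct:.5f}, via H: {via_H:.5f}")
```

The two numbers agree up to the discretization error of the midpoint rule, and the jump at time 4.0 contributes nothing because it occurs after the window ends.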
Note that the quadratic variation of the price varies over time. There are two reasons for this. The
first is that $\sigma^2(t)$ has time-variation. The second reason for the randomness of the quadratic variation is
the presence of jumps in the price process. This observation is important for the analysis of the variance
risk premium.
Finally, the unit of measurement in this paper is a trading day and if a = 1, in order to simplify
notation, I will omit the dependence on a in the notation of all the quantities defined above.
2.2 Model Characteristics
I continue with a short discussion of the empirically relevant features of the stochastic volatility model
(1)-(4).
Price Jumps. The presence of jumps in the price process has two implications. The first consequence
of price jumps is that the generated distributions (both conditional and unconditional) of the returns are
much more general. Thus, for example, price jumps (together with the time-varying stochastic variance) can
easily account for the observed fat-tailedness in the unconditional return distribution. This implication of
the price jumps could be detected even with the use of low-frequency data. Indeed, the studies of Andersen,
Benzoni, and Lund (2002), Chernov, Gallant, Ghysels, and Tauchen (2003) and Eraker, Johannes, and
Polson (2003) provide empirical evidence, based on daily financial returns, in favor of parametric models
containing price jumps. The second implication of price jumps is the discontinuity of the price trajectory.
This is a pathwise implication of the presence of jumps in the price. Naturally, if we want to separate the
price jumps from the continuous martingale component of the price, based on their difference in pathwise
behavior, we need high-frequency observations. Recently, Barndorff-Nielsen and Shephard (2004, 2006)
developed nonparametric tests for the presence of price jumps based on realized multipower variation
statistics. These statistics are constructed from high-frequency data and behave differently depending on
whether the price contains jumps.¹⁰ Using the tests of Barndorff-Nielsen and Shephard (2004, 2006),¹¹
Barndorff-Nielsen and Shephard (2006), Andersen, Bollerslev, and Diebold (2005b) and Huang and Tauchen
(2005) find strong empirical evidence for a non-trivial jump component in the price. Further, Ait-Sahalia
(2004) and Ait-Sahalia and Jacod (2005), working in a time-homogeneous setting, show theoretically that
jumps could be disentangled in a parametric estimation with the use of high-frequency data. Bollerslev
and Zhou (2002), Jiang and Oomen (2006), and Todorov (2006a) use high-frequency data to estimate
different parametric models with and without price jumps. These studies find strong support for models
containing price jumps. Thus, overall, there is overwhelming evidence for the presence of price jumps and
their inclusion is necessary. In the model here the jumps in the price are¹²
$$\Delta f(t) = h(\mathbf{x}). \tag{13}$$
⁹ The integral in the definition of the second variance factor $V^j(t)$ is with respect to $\mu$ and thus could be defined pathwise.
¹⁰ In the estimation I use realized multipower variation statistics and Section 3 contains the definitions and properties of the
ones used in this paper.
¹¹ These tests are valid asymptotically, as the intraday sampling interval goes to zero. However, the Monte Carlo analysis in
Huang and Tauchen (2005) suggests that they are good jump detectors for the frequencies at which the high-frequency data
is recorded.
¹² This notation is a bit loose, but underlines the fact that the jumps are time-homogeneous.
As seen from equation (13), the jumps in the price are time-homogeneous, i.e., they are modelled as a Lévy
process. In the empirical part in Section 4 I find that for the data used in this study there is no significant
time-variation in the price jumps, at least when looking at their quadratic variation only. Therefore, in
this study I restrict the price jumps to be time-homogeneous. Indeed, the empirical evidence (e.g. Andersen
et al. (2005b)) suggests that the continuous and discontinuous martingale components of the futures price
process differ substantially in their persistence. This is why in this paper these two components of the
price are modelled separately, instead of working with a more parsimonious model where $\sigma^2(t)$ determines
the time-variation both of the continuous and discontinuous martingales or even working with a model
where the price is a pure jump process. An extension of the current work is to allow for time-variation in
the jumps, which can be relevant especially if the jumps are identified not only through their quadratic
variation.¹³
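The pathwise separation of jumps from the continuous component discussed above can be sketched with the two simplest realized multipower variation statistics: realized variance, which picks up both components of the quadratic variation, and the bipower variation of Barndorff-Nielsen and Shephard (2004), which is robust to jumps. The simulated day below (constant variance, one jump, 5000 intraday returns) is a made-up example and not the sampling scheme or data of this paper; the statistics actually used are defined in Section 3.

```python
import math
import random

def realized_variance(returns):
    """Sum of squared intraday returns; estimates the total quadratic variation."""
    return sum(r * r for r in returns)

def bipower_variation(returns):
    """(pi / 2) * sum of |r_i| * |r_{i-1}|; estimates the continuous part only."""
    scale = math.pi / 2.0
    return scale * sum(abs(a) * abs(b) for a, b in zip(returns[1:], returns[:-1]))

# Hypothetical simulated day: constant spot variance plus one price jump.
N_INTRADAY = 5000
DAILY_VARIANCE = 1.0
JUMP_SIZE = 1.5

rng = random.Random(42)
step_sd = math.sqrt(DAILY_VARIANCE / N_INTRADAY)
returns = [rng.gauss(0.0, step_sd) for _ in range(N_INTRADAY)]
returns[N_INTRADAY // 2] += JUMP_SIZE  # add the jump to a single intraday return

rv = realized_variance(returns)
bv = bipower_variation(returns)
print(f"realized variance: {rv:.3f}")
print(f"bipower variation: {bv:.3f}")
print(f"difference (jump part of QV, true value {JUMP_SIZE ** 2}): {rv - bv:.3f}")
```

In this sketch the difference between the two statistics approximates the squared jump, while the bipower variation stays close to the integrated variance.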
Stochastic Variance. Another important feature of the financial data is the persistence in the returns
variance. Since in the model here the price jumps are time-homogeneous, their conditional variance is
constant. Therefore, persistence in the returns variance can be generated only through time-variation in
$\sigma^2(t)$. This, in turn, can be done in the following way. First, the continuous component of the stochastic
variance, $V^c(t)$, is a sum of independent square-root processes. Secondly, the jump variance component,
$V^j(t)$, is modelled as a moving average of past jumps. A typical choice for the function $g(\cdot)$ in equation (4)
is a CARMA (continuous-time autoregressive moving average) kernel (see Brockwell (2001a,b)), but other
choices like the fractionally-integrated CARMA kernels introduced in Brockwell and Marquardt (2005) are
also possible. The choice of the function $g(\cdot)$ and the number of factors in the continuous variance part
determine the persistence of the stochastic variance. In Section 4 I provide further details on the particular
choice used in the empirical implementation.
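How the factor structure shapes persistence can be seen from the autocorrelation function of $\sigma^2(t)$. The sketch below assumes one square-root factor and a jump factor with an exponential kernel and compound Poisson variance jumps, treats the two factors as independent, and uses the standard stationary results for these processes (variance $\theta\sigma_v^2/(2\kappa)$ and autocorrelation $e^{-\kappa h}$ for the square-root factor; variance $\lambda m_2/(2\rho)$ and autocorrelation $e^{-\rho h}$ for the jump factor). All parameter values are hypothetical.

```python
import math

def sqrt_factor_variance(kappa, theta, sigma_v):
    """Stationary variance of a square-root process: theta * sigma_v^2 / (2 * kappa)."""
    return theta * sigma_v ** 2 / (2.0 * kappa)

def shot_noise_variance(intensity, second_moment, decay):
    """Stationary variance of a jump factor with kernel exp(-decay * u) and
    compound Poisson jumps: intensity * E[jump^2] / (2 * decay)."""
    return intensity * second_moment / (2.0 * decay)

def variance_autocorrelation(lag, components):
    """Autocorrelation of a sum of independent factors, each given as
    (stationary_variance, mean_reversion_rate)."""
    total = sum(v for v, _ in components)
    return sum(v * math.exp(-rate * lag) for v, rate in components) / total

# Hypothetical parameters: a slowly mean-reverting diffusive factor and a
# quickly mean-reverting jump-driven factor (time unit: one trading day).
components = [
    (sqrt_factor_variance(kappa=0.02, theta=1.0, sigma_v=0.1), 0.02),
    (shot_noise_variance(intensity=0.3, second_moment=1.0, decay=1.5), 1.5),
]

for lag in (1, 5, 22, 66):
    print(f"lag {lag:3d} days: autocorrelation = {variance_autocorrelation(lag, components):.3f}")
```

Under these assumptions the autocorrelation drops quickly over the first few days, as the jump factor decays, and then declines slowly at the rate of the persistent diffusive factor.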
An important feature of the stochastic variance $\sigma^2(t)$ (e.g. Eraker, Johannes, and Polson (2003) among
others) is its ability to increase rather quickly. Such sudden changes in the stochastic variance are hard to
generate with a continuous path process such as the square-root process. However, they are naturally
generated by allowing for jumps in the variance. In the model here the jumps in the variance are (assuming
$g(\cdot)$ is a continuous function)
$$\Delta\sigma^2(t) = g(0)\,k(\mathbf{x}). \tag{14}$$
Jump Dependence. The modeling of the jumps in the price and the variance is quite flexible. In
equations (1) and (4) the jump sizes in the price and the variance are expressed as functions of jumps in an
$n$-dimensional space. This way all possible dependencies between the jumps can be captured in a practical
and intuitive way. I demonstrate with several examples, which have been used in the financial literature,
the flexibility of the jump modeling used here. I start with two examples which use a one-dimensional
Poisson measure, i.e. in which $\mathbf{x} = x$. The first of these two examples is of a perfect linear dependence
between the jumps in the price and the variance. Such modeling is used in the non-Gaussian OU model of
Barndorff-Nielsen and Shephard (2001) (which is nested in the general stochastic volatility model here).
In the setting of the model (1)-(4) this type of dependence is generated with
$$h(x) \propto x, \qquad k(x) \propto x.$$
Note that, since the jumps in the variance are restricted to be positive, this dependence has the potentially
limiting feature that the price jumps are of the same sign.¹⁴
13 Time-variation in the price jumps can be generated either by introducing time-variation in the compensator ν (e.g. the
time-changed Lévy processes considered in Carr, Geman, Madan, and Yor (2003)) or by time-changing the jump size (e.g. the
COGARCH model of Klüppelberg, Lindner, and Maller (2004) and the pure-jump jump-driven stochastic volatility models in
Todorov (2006a)).
14 Another restrictive feature of this modeling of the jumps is that the price jumps are constrained to be of finite variation.
However, in the empirical part I exclude infinite variation price jumps from the analysis.
The second example, where the jumps in the price and the variance are modelled using a one-dimensional
measure, is when the variance jumps are proportional to the squared price jumps, i.e.
$$h(x) \propto x, \qquad k(x) \propto x^2.$$
This dependence structure is used in Todorov (2006a). It induces a non-linear relationship between the price
and the variance jumps. Note, however, that it implies perfect linear dependence between the squared
price jumps and the jumps in the variance. This modeling of the jumps resembles the modeling of the
conditional variance in GARCH models. It is potentially more flexible than the previous case since the
price jumps can be of either sign and are not restricted to those of finite variation.
The above two examples consider the use of one-dimensional Poisson measures in modeling the price
and variance jumps. However, the analysis here encompasses more general cases, where $\mathbf{x}$ can be
multidimensional. For example, independent jumps in the setting here can be modelled as
$$\mathbf{x} = (x_1, x_2), \qquad h(\mathbf{x}) = h(x_1), \qquad k(\mathbf{x}) = k(x_2),$$
with a compensator $G(\cdot)$ such that
$$\int_{\mathbb{R}^2_0} 1_{\{x_1 x_2 \neq 0\}}\, G(d\mathbf{x}) = 0,$$
i.e., the measure $G(\cdot)$ is concentrated on the two axes in $\mathbb{R}^2_0$. Finally, the Lévy copula approach of Tankov
(2003), which describes the dependence of a two-dimensional Lévy process in its full generality, could
also be analyzed in the setting here.
For the purposes of the analysis in this paper I leave h(·), k(·) and G(·) unspecified for two reasons.
First, their parametric (or semiparametric) modeling does not simplify the analysis here. Secondly, since
we do not have a clear idea of the dependence structure of the jumps, it is better to leave this structure
unspecified and let the data “choose” the right one. This way potential misspecification problems can be
avoided.
Leverage Effect. Another empirically relevant feature of the model is the “leverage effect”, i.e. the
(negative) linear relationship between the price and variance innovations. In this study I am not interested
in measuring the “leverage effect”. However, for the model to be empirically realistic it should
allow for a flexible way of generating a “leverage effect”. Therefore, I briefly explain how the model can
account for this feature of the data. Since the price and the variance are both driven by Brownian motions
as well as Poisson jumps, the “leverage effect” could be generated in two different ways in this model. One
way, which has been used predominantly in the financial literature, is to correlate the Brownian motions
in the price and in the variance. It is interesting to note that if this correlation is zero, the Brownian
innovations in the price and in the variance are independent. This could be restrictive, since we
could have no “leverage effect” while the innovations in the price and in the variance are still dependent
(e.g. the standard GARCH model in discrete time or the jump-driven stochastic volatility models in
Todorov (2006a)). The second way of generating a “leverage effect” is through dependence between the
jumps in the price and the variance. In the financial literature jumps are usually associated with very
big changes. However, here the jumps could be specified as infinitely active (having an infinite number
of jumps in any finite interval). Thus, a link between the jumps in the price and in the variance does not
necessarily capture only the link between extreme changes in the price and the variance.
2.3 Moments of the Return Process
I end this Section with a Theorem regarding moments of the return process which will be used later in
the estimation. The theorem provides further insight into the mechanism through which the stochastic
volatility model could account for the stylized features of the financial data.
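Theorem 1 (Moments of the return process)
In the stochastic volatility model (1)-(4) assume that $\kappa_i > 0$, $\theta_i > 0$ and $\sigma_{iv}^2 \le 2\kappa_i\theta_i$ for
$i = 1, ..., p$; $\int_0^\infty g(s)ds < \infty$ and $\int_0^\infty g^2(s)ds < \infty$; $\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x}) < \infty$
and $\int_{\mathbb{R}^n_0} k^2(\mathbf{x})G(d\mathbf{x}) < \infty$; $\int_{\mathbb{R}^n_0} h^2(\mathbf{x})G(d\mathbf{x}) < \infty$ and
$\int_{\mathbb{R}^n_0} h^4(\mathbf{x})G(d\mathbf{x}) < \infty$. Then if $\alpha(t) = 0$ we have
$$\mathrm{Var}(r_a(t)) = a\sum_{i=1}^{p}\theta_i + a\int_0^\infty g(s)ds\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x}) + a\int_{\mathbb{R}^n_0} h^2(\mathbf{x})G(d\mathbf{x}), \tag{15}$$
$$\begin{aligned}
E\left(r_a^4(t)\right) = {} & a\int_{\mathbb{R}^n_0} h^4(\mathbf{x})G(d\mathbf{x}) + 3a^2\left(\int_{\mathbb{R}^n_0} h^2(\mathbf{x})G(d\mathbf{x})\right)^2 \\
& + 6a^2\left(\sum_{i=1}^{p}\theta_i + \int_0^\infty g(u)du\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x})\right)\int_{\mathbb{R}^n_0} h^2(\mathbf{x})G(d\mathbf{x}) \\
& + 3a^2\left(\sum_{i=1}^{p}\theta_i\right)^2 + 6a^2\sum_{i=1}^{p}\theta_i\int_0^\infty g(u)du\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x}) + 3a^2\left(\int_0^\infty g(u)du\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x})\right)^2 \\
& + 6\int_0^a\int_{-\infty}^{s} H_s(0,u)g(s-u)\,du\,ds\int_{\mathbb{R}^n_0} k^2(\mathbf{x})G(d\mathbf{x}) + 6\int_0^a H_a(0,u)du\int_{\mathbb{R}^n_0} h^2(\mathbf{x})k(\mathbf{x})G(d\mathbf{x}) \\
& + 6\sum_{i=1}^{p}\frac{\sigma_{iv}^2\theta_i}{2\kappa_i^3}\left(a\kappa_i - 1 + e^{-a\kappa_i}\right) + O(a^{5/2}).
\end{aligned} \tag{16}$$
If in addition $b \ge a$, we have
$$\begin{aligned}
\mathrm{Cov}\left(r_a^2(0), r_a^2(b)\right) = {} & \sum_{i=1}^{p} e^{-\kappa_i(b-a)}\left(\frac{1-e^{-\kappa_i a}}{\kappa_i}\right)^2\frac{\sigma_{iv}^2\theta_i}{2\kappa_i} + \int_{-\infty}^{a} H_a(0,u)H_a(b,u)du\int_{\mathbb{R}^n_0} k^2(\mathbf{x})G(d\mathbf{x}) \\
& + \int_0^a H_a(b,u)du\int_{\mathbb{R}^n_0} h^2(\mathbf{x})k(\mathbf{x})G(d\mathbf{x}) + O(a^{5/2}).
\end{aligned} \tag{17}$$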
The assumption of a zero drift term is not crucial for the results in the Theorem. It is assumed in order to
avoid trivial complications. In the case when the Brownian motions in the price and $\sigma^2(t)$ are independent,
the error terms in equations (16) and (17) are exactly zero. In this case the proof of the Theorem follows
essentially from the results in Todorov (2006a). When the Brownian motions in the price and $\sigma^2(t)$ are
correlated, the proof of (16) and (17) involves bounding terms coming from this correlation. The proofs
are easy, but somewhat tedious; therefore, they are given in a separate Appendix available upon request.$^{15}$
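As a plausibility check on the last explicit term in (16), the following short derivation sketches where the factor $(a\kappa_i - 1 + e^{-a\kappa_i})$ can come from. It is an illustration only, not the proof from the Appendix. It assumes that each square-root factor $V_i^c$ is stationary with the standard autocovariance $\mathrm{Cov}(V_i^c(s), V_i^c(u)) = \frac{\sigma_{iv}^2\theta_i}{2\kappa_i}e^{-\kappa_i|s-u|}$, and that the term arises from integrating this autocovariance over $0 \le u \le s \le a$.

```latex
\begin{align*}
6\int_0^a\!\!\int_0^s \mathrm{Cov}\bigl(V_i^c(s),V_i^c(u)\bigr)\,du\,ds
 &= 6\,\frac{\sigma_{iv}^2\theta_i}{2\kappa_i}\int_0^a\!\!\int_0^s e^{-\kappa_i(s-u)}\,du\,ds \\
 &= 6\,\frac{\sigma_{iv}^2\theta_i}{2\kappa_i}\int_0^a \frac{1-e^{-\kappa_i s}}{\kappa_i}\,ds \\
 &= 6\,\frac{\sigma_{iv}^2\theta_i}{2\kappa_i^3}\,\bigl(a\kappa_i - 1 + e^{-a\kappa_i}\bigr).
\end{align*}
```

For small $a$ the bracket is approximately $a^2\kappa_i^2/2$, so this term is of order $a^2$, the same order as the other second-order terms in (16).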
3 Realized Power Variation Based Inference
This Section introduces the estimation technique used in the paper and states an asymptotic result for the
resulting estimator. All results in this Section follow from Todorov (2006b), where the estimation of much
more general stochastic volatility models is considered. The discussion here is in the context of GMM, as
this is the estimation method used in the paper.$^{16}$ The analysis is general in the sense that I do not specify
the moment conditions in the GMM estimator. The theoretical result here provides justification for the
estimation in Section 4 and Section 6. These two Sections contain the particular moment conditions and
other details on the estimation.
Estimation of stochastic volatility models has a long history in the empirical financial literature. One
difficulty in the estimation comes from the fact that $\sigma(t)$ is stochastic and unobservable. The presence of
price jumps additionally complicates the estimation problem. The availability of reliable high-frequency
data over the last two decades provides a way of making some of the unobservable quantities practically
observable. This can significantly simplify the estimation and in addition can provide big efficiency gains.
In this paper I aggregate the high-frequency data into realized multipower variation statistics, which are
proxies for certain latent processes associated with the stochastic volatility model (e.g. QV and IV). The
inference is based on these realized multipower variation statistics.
15 The Appendix with the proof of Theorem 1 can be downloaded from www.duke.edu/∼vst2.
16 Exactly the same analysis applies to M-type estimators.
I start by defining the realized multipower variation statistics that are used in this paper; see
Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005) for a general definition. The first
statistic is the daily Realized Variance (hereafter abbreviated as RV). It is defined over a day t as
$$RV_\delta(t) = \sum_{i=1}^{M} r_\delta^2(t + (i-1)\delta), \tag{18}$$
where $M = \lfloor 1/\delta \rfloor$. This statistic has been used extensively in finance (see Andersen, Bollerslev, and
Diebold (2005a) and references therein). Its usefulness for inference comes from the fact that
for δ close to zero, i.e. with high-frequency observations, it is close to QV defined in equation (6).$^{17}$
The second realized multipower variation statistic used in this paper is the Realized Tripower Variation
(hereafter abbreviated as TV). It is defined over a day t as
$$TV_\delta(t) = \mu_{2/3}^{-3}\sum_{i=3}^{M} |r_\delta(t+(i-3)\delta)|^{2/3}\,|r_\delta(t+(i-2)\delta)|^{2/3}\,|r_\delta(t+(i-1)\delta)|^{2/3}, \tag{19}$$
where $\mu_a = E(|u|^a)$ and $u \sim N(0,1)$. Its usefulness comes from the fact that for δ close to zero TV
is close to IV, defined in equation (7).$^{18}$ It should be mentioned that it is more common to use the Realized
Bipower Variation for testing for jumps and estimation of IV (e.g. Barndorff-Nielsen and Shephard (2004,
2006)).$^{19}$ However, the Monte Carlo analysis in Todorov (2006b) shows that for estimating moments of
IV, TV performs better for values of δ comparable with those of the available high-frequency data (see
also the Monte Carlo evidence in Barndorff-Nielsen et al. (2006)).
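The definitions of RV in (18), TV in (19) and the Realized Bipower Variation (BV) from the footnote can be sketched in a few lines of Python. This is an illustrative sketch, not the code used in the paper. The function names are hypothetical, the input is assumed to be the list of intraday returns for one day, and $\mu_a$ is computed from the standard closed form $E|u|^a = 2^{a/2}\Gamma((a+1)/2)/\sqrt{\pi}$ for a standard normal $u$.

```python
import math


def mu(a):
    """Absolute moment E|u|^a of a standard normal variable u."""
    return 2.0 ** (a / 2.0) * math.gamma((a + 1.0) / 2.0) / math.sqrt(math.pi)


def realized_variance(r):
    """RV as in (18): sum of squared intraday returns over the day."""
    return sum(x * x for x in r)


def bipower_variation(r):
    """BV: mu_1^{-2} times the sum of products of adjacent absolute returns."""
    total = sum(abs(r[i - 1]) * abs(r[i]) for i in range(1, len(r)))
    return mu(1.0) ** -2 * total


def tripower_variation(r):
    """TV as in (19): products of three adjacent absolute returns, each to the power 2/3."""
    p = 2.0 / 3.0
    total = sum(
        abs(r[i - 2]) ** p * abs(r[i - 1]) ** p * abs(r[i]) ** p
        for i in range(2, len(r))
    )
    return mu(p) ** -3 * total


# Hypothetical day: 80 small returns and one large jump-like return.
returns = [0.001] * 80
returns[40] = 0.05
rv = realized_variance(returns)
tv = tripower_variation(returns)
```

In this made-up example the single large return raises RV by roughly its square, while TV moves much less, which is the sense in which TV is robust to price jumps.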
For high-frequency data δ is close to zero and consequently, as argued above, RV and TV are close to
QV and IV respectively. Therefore, estimation based on RV and TV is approximately the
same as inference based on the unobservable QV and IV (this statement is made precise in
Theorem 2). At the same time, estimation based on QV and IV is easy to carry out, since in this case we
have observations of the latent integrated variance and the squared jumps over the days in the sample.
A problem with implementing inference based on realized multipower variation statistics is that in
many cases closed-form expressions for their moments are not available. However, given that
for δ close to zero RV and TV are close to QV and IV respectively, the moments of RV and TV can be
approximated with the corresponding ones of QV and IV. Of course, this approximation introduces an error
in the estimation, the magnitude of which is controlled by δ. In Theorem 2 I provide conditions under
which the consistency and the efficiency of the estimation are not affected by the approximation error. I
proceed with stating the assumptions needed for Theorem 2.
17 This statement can be made more formal. As δ → 0 RV converges in probability to QV. Under certain integrability
conditions, Todorov (2006b) shows that the convergence holds also in moments. Finally, a CLT result is also available; see
Jacod (2006a,b).
18 As for RV, this statement can be made formal. As δ → 0 TV converges in probability to IV. Also, provided certain
integrability conditions are satisfied, the convergence holds in moments. A CLT result also holds, provided the activity of the
price jumps is restricted; see Barndorff-Nielsen et al. (2006).
19 The Realized Bipower Variation over a day t is formally defined as
$$BV_\delta(t) = \mu_1^{-2}\sum_{i=2}^{M} |r_\delta(t+(i-2)\delta)|\,|r_\delta(t+(i-1)\delta)|.$$
Assumption 1. θ is a vector of parameters of the stochastic volatility model (1)-(4), {z(t)} is a data
vector consisting of daily statistics associated with the price process f (t), whose only “infeasible” elements
are IV and QV (and possibly lags of them). The infeasible estimator is defined as
$$\hat\theta_{nf} = \operatorname*{argmin}_{\theta\in\Theta}\; m_T(\theta)'\,\hat W\, m_T(\theta), \tag{20}$$
where $m_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} m(z(t),\theta)$; $\hat W \xrightarrow{p} W$ and $W$ is a positive definite matrix. Assume that
$\hat\theta_{nf}$ in (20) is consistent and asymptotically normal.
Assumption 2. ẑ(t) is constructed from z(t) by replacing QV with RV and IV with TV. The feasible
estimator θ̂f is defined as
$$\hat\theta_{f} = \operatorname*{argmin}_{\theta\in\Theta}\; \hat m_T(\theta)'\,\hat W\, \hat m_T(\theta), \tag{21}$$
where $\hat m_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} m(\hat z(t),\theta)$.
Assumption 3. α < 4/5, where α is the Blumenthal-Getoor index (Blumenthal and Getoor (1961)) of
the price jumps defined as
$$\alpha = \inf\left\{\gamma \ge 0 : \int_{\mathbb{R}^n_0} 1_{\{|h(\mathbf{x})|\le 1\}}\,|h(\mathbf{x})|^{\gamma}\,G(d\mathbf{x}) < \infty\right\}. \tag{22}$$
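Assumption 4. $\alpha(t)$ is stationary and $E|\alpha(t)|^p < \infty$; $\kappa_i > 0$, $\theta_i > 0$ and $\sigma_{iv}^2 \le 2\kappa_i\theta_i$ for $i = 1, ..., p$;
$g(\cdot)$ is bounded around zero and $\int_0^\infty |g(s)|^p ds < \infty$ for every $p > 0$;
$\int_{\mathbb{R}^n_0} 1_{\{|h(\mathbf{x})|>\epsilon\}}\, e^{\lambda|h(\mathbf{x})|}\,G(d\mathbf{x}) < \infty$ and
$\int_{\mathbb{R}^n_0} 1_{\{k(\mathbf{x})>\epsilon\}}\, e^{\lambda k(\mathbf{x})}\,G(d\mathbf{x}) < \infty$ for some $\epsilon > 0$ and $\lambda > 0$.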
For the next assumptions we need the following notation. For an arbitrary matrix $A = [a_{ij}]$ I denote
by $||A|| = \sqrt{\sum_{ij} a_{ij}^2}$ its Euclidean norm.
Assumption 5. ||m(z + y, θ) − m(z, θ)|| ≤ ||C(θ)||||P (z + y) − P (z)|| for every z and y and some matrix
valued functions C(·) and P (·) such that P (z) has at most polynomial growth.
Assumption 6. ||∇θ m(z + y, θ) − ∇θ m(z, θ)|| ≤ ||C(θ)||||P (z + y) − P (z)|| for every z and y and some
matrix valued functions C(·) and P (·) such that P (z) has at most polynomial growth.
Assumption 7. $\nabla_z m(z, \theta_0)$ exists, is continuous in z and has at most polynomial growth in z.
Before stating the asymptotic result for θ̂f , I make a few remarks regarding the assumptions.
Remark 1. θ̂nf is an infeasible estimator which is used as a benchmark for the feasible one θ̂f . For
computing θ̂nf the econometrician has access to the unobservable QV and IV. Note that {z(t)} can contain
other variables besides QV and IV. However, if this is the case, then these variables are observable (e.g.
daily returns). In other words, the focus here is the error in the estimation coming from the substitution
of QV with RV and IV with TV.
Remark 2. Depending on what enters in the data vector {z(t)}, θ can include the whole parameter vector
of the stochastic volatility model (1)-(4) or only part of it. For example, if {z(t)} contains only QV and IV
(and possibly lags of them) then we will not be able to estimate the parameters controlling the drift α(t).
This is because the estimation in this case is based only on the continuous and discontinuous components
of the quadratic variation QV and the drift term does not participate in the latter.
Remark 3. The conditions for the consistency and asymptotic normality of θ̂nf are well known (see e.g.
Newey and McFadden (1994) and Wooldridge (1994)) and therefore are omitted here.
Remark 4. In the case when TV participates in the (feasible) data vector, the quality of ẑ(t) as a
proxy for z(t) depends on how good TV is in disentangling price jumps. This, in turn, depends crucially
on the activity of the price jumps. Intuitively, if the price contains many small jumps, then it will be
harder to separate these small jumps from the continuous price movements. The activity of the price
jumps is indexed with the Blumenthal-Getoor index given in (22). The index is in the interval [0, 2]. In
intuitive terms, the index measures the smallest power for which the sum of the absolute jumps raised
to it is still finite. As seen from the definition of the index, it concerns the behavior of the very small
jumps. For finite activity jump processes this index is 0. Another example is the α-(tempered) stable
process, where the Blumenthal-Getoor index coincides with the α parameter of the process. In Figure 1
I plot simulated trajectories of jump processes with different values of the Blumenthal-Getoor index and
a simulated trajectory of Brownian motion. As seen from the figure, for higher values of the Blumenthal-Getoor index the trajectories of the jump processes look very similar to the trajectory of the Brownian
motion. Therefore, intuitively, the index is measuring how close the jumps are to a continuous process.
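To make the definition in (22) concrete, consider a worked example under assumptions made here only for illustration: one-dimensional jumps with $h(x) = x$, and a Lévy measure with the stable-like density $G(dx) = c|x|^{-1-\beta}dx$ near zero, where $c > 0$ and $0 < \beta < 2$. The symbols $c$ and $\beta$ are not part of the notation used elsewhere in the paper.

```latex
\[
\int_{|x|\le 1} |x|^{\gamma}\, c\,|x|^{-1-\beta}\,dx
 = 2c\int_0^1 x^{\gamma-\beta-1}\,dx
 = \begin{cases}
     \dfrac{2c}{\gamma-\beta}, & \gamma > \beta,\\[1.5ex]
     \infty, & \gamma \le \beta.
   \end{cases}
\]
```

The integral is finite exactly for $\gamma > \beta$, so the infimum in (22) equals $\beta$, in line with the statement that for stable-type processes the index coincides with the stability parameter. In this illustration Assumption 3 would hold exactly when $\beta < 4/5$.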
Remark 5. Assumption 3 is needed only when TV is included in the data vector ẑ(t). Further, it
could be weakened for the consistency result in Theorem 2. Assumption 3 coincides with the condition
in Barndorff-Nielsen et al. (2006) under which the asymptotic distribution of TV (as δ → 0) is unaffected
by the presence of price jumps. This is not a coincidence, of course. Under the condition T δ → 0, which is
imposed in part (b) of Theorem 2, we need $\sup_\delta \delta^{-1}E\left(TV_\delta(t) - IV(t)\right)^2 < K$ for some constant K so that
the error θ̂f − θ̂nf is $o_p(1/\sqrt{T})$. Note also that Assumption 3 rules out infinite variation price jumps. The
Monte Carlo analysis in Todorov (2006b) shows that for practical applications this assumption is relevant.
Remark 6. Assumption 4 puts integrability conditions on the components of the stochastic volatility
model. It is used for proving uniform integrability of powers of RV and TV. The conditions involving the
continuous variance factors, $V^c(t)$, are well known (see e.g., Feller (1951)). Under these conditions $V^c(t)$ is
strictly positive and stationary. The conditions in Assumption 4 involving the jumps in the price and the
variance guarantee that all moments of the price and variance jumps are finite. These conditions involve
only the behavior of the big (price or variance) jumps (unlike the Blumenthal-Getoor index which measures
the behavior of the Lévy measure for the small jumps). The integrability condition for the powers of the
function g(·) is necessary for the discontinuous variance factor $V^j(t)$ to have all its moments finite (see
Rajput and Rosiński (1989)). This condition is automatically satisfied when g(·) is a sum of exponentials
(with negative exponents), which is the case for the CARMA kernels with distinct negative autoregressive
roots that are used in the empirical part.
Remark 7. Assumptions 5-7 concern the moment conditions m(·, ·) used in the GMM estimation. They are satisfied if, for example, m(·, ·) is polynomial in z, which is the case for the estimators in
Sections 4 and 6.
The asymptotic properties of the feasible estimator θ̂f are analyzed in the next Theorem.
Theorem 2 (Consistency and asymptotic normality of θ̂f )
(a) Suppose Assumptions 1-5 hold. Then for T → ∞ and δ → 0 we have
$$\hat\theta_f \xrightarrow{p} \theta_0. \tag{23}$$
(b) Suppose Assumptions 1-7 hold. Then if T → ∞, δ → 0 and T δ → 0 we have
$$\sqrt{T}\left(\hat\theta_f - \theta_0\right) \xrightarrow{d} N\left(0, \mathrm{Avar}(\hat\theta_{nf})\right). \tag{24}$$
Note that for the consistency of θ̂f we do not need a condition for the relative speed at which T → ∞
and δ → 0. For the asymptotic normality result, however, we need such a condition. This is so because in
this case the error θ̂f − θ̂nf has to satisfy a stronger condition, namely to be $o_p(1/\sqrt{T})$. Theorem 2 shows that
for this it suffices to have T δ → 0, i.e. the number of intraday observations should increase slightly faster
than the number of days in the sample.
The proof of Theorem 2 can be found in Todorov (2006b).
4 Selection of the Model under the Physical Measure
4.1 High-Frequency Data and Initial Data Analysis
I estimate different model specifications (all falling in the general stochastic volatility model (1)-(4)) using
high-frequency data on the S&P 500 index futures contract. The data covers the period January 2, 1990,
to November 29, 2002. There are 80 five-minute return observations in each day covering the day trading
session from 9:30am till 4:15pm.
For each of the days in the sample I calculate RV and TV, using the high-frequency returns over that
day and equations (18) and (19).$^{20}$ Figure 2 plots the returns over the day as well as TV and
JV = RV − TV. The last variable is a measure of the sum of squared jumps over the day. As seen from the
TV series, integrated variance has spikes, which is suggestive of the presence of jumps in the stochastic
variance $\sigma^2(t)$ (the spot variance of the continuous price component). Another interesting observation from
Figure 2 is that most of the days on which TV is high are days on which JV is high as well. This suggests
that jumps in $\sigma^2(t)$ are linked with the price jumps.
I continue the initial data analysis by investigating the persistence in IV and the squared daily price
jumps. The two panels of Figure 3 show the first 100 autocorrelations of TV and JV respectively. As seen
from the Figure, IV and the sum of squared price jumps differ significantly in their persistence. On one
hand, IV is a very persistent process. On the other hand, the squared jumps over the day show almost no
persistence. Therefore, even if there is time-variation in the price jumps, it should be such that it yields
almost no persistence when looking at the squared jumps over the days.
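The two ingredients of this initial analysis, the daily jump measure JV and the sample autocorrelations, can be sketched as follows. This is an illustrative sketch rather than the paper's code: the function names are hypothetical, the truncation of TV at RV follows the finite sample correction described in the footnote on the construction of TV, and the standard (biased) sample autocorrelation estimator is an assumption, since the text does not say which estimator is used.

```python
def jump_variation(rv, tv):
    """Daily JV = RV - min(TV, RV), so the squared-jump measure is never negative."""
    return [r - min(t, r) for r, t in zip(rv, tv)]


def autocorrelation(x, lag):
    """Standard sample autocorrelation of the series x at a positive lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Hypothetical daily series, for illustration only.
rv_series = [1.0, 2.0, 1.5, 3.0, 1.2]
tv_series = [0.8, 2.5, 1.4, 2.0, 1.2]
jv_series = jump_variation(rv_series, tv_series)
acf_tv_lag1 = autocorrelation(tv_series, 1)
```

Applying `autocorrelation` to the TV and JV series for lags 1 to 100 would produce the kind of plot described for Figure 3.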
4.2 Model Selection
In this Subsection I estimate several specifications of the stochastic volatility (hereafter abbreviated as SV)
model (1)-(4) and select one of them to be used in the subsequent analysis of the variance risk premium.
Below I specify each of the candidate models. In each of them I keep the price jumps, since we saw in the
previous Subsection that the data suggest a nontrivial price jump contribution. To save space I do not report
estimation results for SV models without a price jump component, since they are overwhelmingly
rejected. I classify the models according to the driving factor of the stochastic variance $\sigma^2(t)$, that is,
whether the stochastic variance is determined by diffusion processes only, by jumps only,
or by both types of processes. For convenience and to avoid unnecessary repetition I state in
each of the cases only the stochastic variance specification (the equation for the evolution of the price is
always the same and is given by (1)).
4.2.1 Model Specifications
Diffusive SV Model
In this model the stochastic volatility has only a continuous component (i.e. $V^j(t) = 0$). This model is
nested in the affine jump-diffusion models of Duffie, Pan, and Singleton (2000). Here I look at up to two
factors, i.e., the stochastic variance specification is
$$V(t) = V^c(t) = V_1^c(t) + V_2^c(t), \tag{25}$$
$$dV_i^c(t) = \kappa_i\left(\theta_i - V_i^c(t)\right)dt + \sigma_{iv}\sqrt{V_i^c(t)}\,dB_i(t), \qquad i = 1, 2. \tag{26}$$
To avoid identification problems in the estimation I make the following re-parametrization. I set
$\theta = \theta_1 + \theta_2$ and $\sigma_i = \sigma_{iv}\sqrt{\theta_i/(2\kappa_i)}$ for $i = 1, 2$; $\sigma_i^2$ is the variance of $V_i^c(t)$. Therefore, the
stochastic variance parameters I estimate are $\theta$ and $\kappa_i$, $\sigma_i$ for $i = 1, 2$. The nonnegativity and stationarity
restrictions in Assumption 4 imply
$$\theta > 0, \qquad \kappa_1 > 0, \qquad \kappa_2 > 0, \qquad \sigma_1 + \sigma_2 < \theta.$$
I estimate both a one-factor and a two-factor model.
20 On the days on which $TV_\delta(t) > RV_\delta(t)$, I replace the value of $TV_\delta(t)$ with that of $RV_\delta(t)$, i.e. I compute $TV_\delta(t) \wedge RV_\delta(t)$.
This is a finite sample correction, since QV (t) ≥ IV (t) always, and guarantees that the estimate for the squared price jumps
is always nonnegative.
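The re-parametrization above and the autocorrelation structure it implies can be sketched in Python. This is a hedged illustration, not the paper's code. The function names and parameter values are hypothetical, and the autocorrelation formula assumes that the two square-root factors are independent and stationary, so that each has autocovariance $\sigma_i^2 e^{-\kappa_i h}$ at lag $h$.

```python
import math


def stationary_sd(theta_i, kappa_i, sigma_iv):
    """sigma_i = sigma_iv * sqrt(theta_i / (2 * kappa_i)): stationary std. dev. of one factor."""
    return sigma_iv * math.sqrt(theta_i / (2.0 * kappa_i))


def feller_condition_holds(theta_i, kappa_i, sigma_iv):
    """Check kappa_i > 0, theta_i > 0 and sigma_iv^2 <= 2 * kappa_i * theta_i."""
    return kappa_i > 0 and theta_i > 0 and sigma_iv ** 2 <= 2.0 * kappa_i * theta_i


def two_factor_acf(h, kappas, sigmas):
    """Autocorrelation of V_1 + V_2 at lag h >= 0, assuming independent factors."""
    weights = [s * s for s in sigmas]
    num = sum(w * math.exp(-k * h) for w, k in zip(weights, kappas))
    return num / sum(weights)


# Hypothetical values: one fast and one slow mean-reverting factor.
kappas = [2.0, 0.05]
sigmas = [stationary_sd(0.5, 2.0, 1.0), stationary_sd(0.5, 0.05, 0.1)]
acf_at_lag_10 = two_factor_acf(10.0, kappas, sigmas)
```

With one fast and one slow factor the autocorrelation drops quickly at short lags and then decays slowly, a pattern that a single exponential cannot reproduce.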
Jump-Driven SV Model
In this model the stochastic variance is driven only by positive jumps, i.e. $V^c(t) = 0$. This model falls
into the class of the jump-driven stochastic volatility models of Todorov (2006a). Here I use the following
specification for the moving average function g(·) in equation (4):
$$V(t) = V^j(t) = \int_{-\infty}^{t}\int_{\mathbb{R}^n_0} g(t-s)\,k(\mathbf{x})\,\mu(ds, d\mathbf{x}), \tag{27}$$
$$g(u) = \frac{b_0+\rho_1}{\rho_1-\rho_2}\,e^{\rho_1 u} + \frac{b_0+\rho_2}{\rho_2-\rho_1}\,e^{\rho_2 u}, \qquad u \ge 0. \tag{28}$$
The expression in (28) is a (normalized) CARMA(2,1) kernel (see Brockwell (2001b) for details). The
reason I work with it here is that it induces the same type of autocorrelation function for the stochastic
variance $\sigma^2(t)$ as that implied by a two-factor affine jump-diffusion model. Thus, the models in the different
classes here are given a fair comparison. To ensure nonnegativity of the kernel as well as to guarantee
(weak) stationarity of the stochastic variance I impose the following parameter restrictions (see Todorov
and Tauchen (2006))
$$b_0 \ge -\max\{\rho_1, \rho_2\} > 0.$$
I look also at a CARMA(1,0) kernel, which is the analogue of the one-factor Diffusive SV model. The
CARMA(1,0) kernel is a restriction of the CARMA(2,1) kernel. The restriction is $b_0 = -\min\{\rho_1, \rho_2\}$.
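The kernel in (28) and the restrictions on it can be checked numerically with a short Python sketch. The function names and parameter values below are hypothetical and serve only as an illustration. The sketch evaluates $g(u)$, checks the restriction $b_0 \ge -\max\{\rho_1, \rho_2\} > 0$, and shows that setting $b_0 = -\min\{\rho_1, \rho_2\}$ removes one of the two exponentials.

```python
import math


def carma21_kernel(u, b0, rho1, rho2):
    """Normalized CARMA(2,1) kernel g(u) from (28), defined for u >= 0 and rho1 != rho2."""
    if u < 0:
        raise ValueError("the kernel is defined for u >= 0 only")
    first = (b0 + rho1) / (rho1 - rho2) * math.exp(rho1 * u)
    second = (b0 + rho2) / (rho2 - rho1) * math.exp(rho2 * u)
    return first + second


def restrictions_hold(b0, rho1, rho2):
    """Check b0 >= -max(rho1, rho2) > 0 (both roots negative, kernel nonnegative)."""
    return b0 >= -max(rho1, rho2) > 0


# Hypothetical parameter values.
rho1, rho2, b0 = -2.0, -0.1, 0.5
grid = [0.25 * j for j in range(200)]
values = [carma21_kernel(u, b0, rho1, rho2) for u in grid]

# CARMA(1,0) special case: b0 = -min(rho1, rho2) leaves a single exponential.
b0_restricted = -min(rho1, rho2)
single = carma21_kernel(3.0, b0_restricted, rho1, rho2)
```

The kernel is normalized so that $g(0) = 1$ for any admissible $b_0$, which follows directly from (28) because the two coefficients sum to one.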
Jump-Diffusive SV Model
This model has both a diffusive and a jump component in the variance:
$$V(t) = V^c(t) + V^j(t), \tag{29}$$
$$V^c(t) = V_1^c(t), \qquad dV_1^c(t) = \kappa_1\left(\theta_1 - V_1^c(t)\right)dt + \sigma_{1v}\sqrt{V_1^c(t)}\,dB_1(t), \tag{30}$$
$$V^j(t) = \int_{-\infty}^{t}\int_{\mathbb{R}^n_0} g(t-s)\,k(\mathbf{x})\,\mu(ds, d\mathbf{x}), \tag{31}$$
$$g(u) = e^{\rho_1 u}, \qquad u \ge 0. \tag{32}$$
This variance specification generates the same autocorrelation structure for $\sigma^2(t)$ as that implied by the
CARMA(2,1)-jump-driven SV model and the two-factor affine jump-diffusion model. To avoid identification
problems in the estimation, the model is re-parameterized as follows. I set
$\theta = \theta_1 - \frac{1}{\rho_1}\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x})$ and $\sigma_1 = \sigma_{1v}\sqrt{\theta_1/(2\kappa_1)}$. Here $\theta$ is the mean of the stochastic
variance and $\sigma_1^2$ is the variance of the diffusive variance component $V^c(t)$. In other words, as for the
Diffusive SV model, I do not estimate separately the means of the different variance components (note that
in the Jump-Driven SV model this problem is automatically avoided since it has a single factor). The
nonnegativity and stationarity of $\sigma^2(t)$ imply the following conditions on the parameters, which are imposed
in the estimation:
$$\kappa_1 > 0, \qquad \theta > 0, \qquad \rho_1 < 0, \qquad \sigma_1 < \theta.$$
4.2.2 Details on the estimation
Turning to the estimation of the different model specifications, I use a GMM-type estimator and apply the
general result in Theorem 2. In all estimated models I do not specify the Lévy processes in the price and
the variance. Instead, I treat as parameters only cumulants which are needed for calculating the moment
conditions in the GMM. In particular, I estimate
$$\int_{\mathbb{R}^n_0} h^2(\mathbf{x})G(d\mathbf{x}), \quad \int_{\mathbb{R}^n_0} h^4(\mathbf{x})G(d\mathbf{x}), \quad \int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x}), \quad \int_{\mathbb{R}^n_0} k^2(\mathbf{x})G(d\mathbf{x}) \quad\text{and}\quad \int_{\mathbb{R}^n_0} h^2(\mathbf{x})k(\mathbf{x})G(d\mathbf{x}).$$
As mentioned above, in the case of the Jump-Diffusive SV model I do not estimate $\int_{\mathbb{R}^n_0} k(\mathbf{x})G(d\mathbf{x})$ separately
to avoid identification problems. Also, in the case of the Diffusive SV model, since the variance does not
contain jumps, the last three quantities above are not estimated. Further, in the estimation of
the models I impose the following constraint on the cumulants
$$0 \le \int_{\mathbb{R}^n_0} h^2(\mathbf{x})k(\mathbf{x})G(d\mathbf{x}) \le \sqrt{\int_{\mathbb{R}^n_0} h^4(\mathbf{x})G(d\mathbf{x})\int_{\mathbb{R}^n_0} k^2(\mathbf{x})G(d\mathbf{x})}.$$
This constraint guarantees that there exists a two-dimensional Lévy process (for the jumps in the price
and the variance) with cumulants equal to the estimated ones.
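The constraint on the cumulants is a Cauchy-Schwarz bound, and a small Python sketch can show how it might be checked for candidate parameter values. This is an illustration only: the function names are hypothetical, the relative tolerance is an added numerical safeguard, and the discrete "measure" with two atoms is a made-up example in which $h(x) = x$ and $k(x) = x^2$, so the upper bound holds with equality.

```python
import math


def cumulants_admissible(h4, k2, h2k, rel_tol=1e-12):
    """Check 0 <= int h^2 k dG <= sqrt(int h^4 dG * int k^2 dG), up to a small tolerance."""
    if h4 < 0 or k2 < 0:
        return False
    upper = math.sqrt(h4 * k2)
    return 0.0 <= h2k <= upper * (1.0 + rel_tol)


def discrete_cumulants(atoms, h, k):
    """Cumulants for a hypothetical measure G made of (x, weight) atoms."""
    h4 = sum(w * h(x) ** 4 for x, w in atoms)
    k2 = sum(w * k(x) ** 2 for x, w in atoms)
    h2k = sum(w * h(x) ** 2 * k(x) for x, w in atoms)
    return h4, k2, h2k


# Made-up atoms: jump marks x with intensities w.
atoms = [(-0.02, 3.0), (0.01, 5.0)]
h4, k2, h2k = discrete_cumulants(atoms, lambda x: x, lambda x: x * x)
ok = cumulants_admissible(h4, k2, h2k)
```

In an optimizer or sampler, parameter draws that fail this check could simply be rejected, although the text does not say how the constraint is enforced in practice.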
Turning to the moment conditions in the GMM, for the estimation of all the models specified above, I
match the following statistics
1. Mean, Variance and Autocorrelation of IV
2. Mean and Variance of QV
3. Mean of Realized Fourth Variation (hereafter FV), which is defined below
For the autocorrelation of IV I use lags 1, 3 and 6 as well as the average autocorrelation over lags 11-20,
21-30 and 31-40. The higher order autocorrelations are averaged because they
are estimated with less precision. Altogether I end up with 11 moment conditions. I make the following
additional observations regarding the estimation.
• I use Theorem 2 and substitute in the estimation the unobservable IV with TV. In addition, I make
use of the following CLT result. Under the assumptions of Theorem 2, as shown in Barndorff-Nielsen
et al. (2005) and Barndorff-Nielsen et al. (2006), we have
$$\delta^{-1/2}\left(TV_\delta(t) - \int_t^{t+1}\sigma^2(u)du\right) \xrightarrow{\text{law}} \frac{\sqrt{A}}{\mu_{2/3}^{3}}\int_t^{t+1}\sigma^2(u)dW(u), \tag{33}$$
where W is a Wiener process defined on an extension of the probability space and is independent of
the futures log-price process f . The constant A is given by
$$A = \mu_{4/3}^{3} - 5\mu_{2/3}^{6} + 2\mu_{2/3}^{2}\mu_{4/3}^{2} + 2\mu_{2/3}^{4}\mu_{4/3},$$
where $\mu_a$ is defined after equation (19). Based on this result we have the following approximation
$$TV_\delta(t) \approx IV(t) + \sqrt{\frac{K}{M}\left(\int_t^{t+1}\sigma^4(u)du\right)}\;\epsilon_t, \tag{34}$$
where $K = A/\mu_{2/3}^{6} \approx 3.0613$ and $(\epsilon_t)$ is an i.i.d. sequence with $\epsilon_t \sim N(0, 1)$. A similar approximation for RV,
for the case of no price jumps, is used in Andersen, Bollerslev, and Meddahi (2005) for the purposes
of constructing volatility forecasts. Note that $\epsilon_t$ is independent of IV (t) and is i.i.d. (with mean 0).
This means that, using the asymptotic refinement in equation (34), the mean and autocovariance
of TV are approximated by the mean and the autocovariance of IV. For the approximation of the
variance of TV, in addition to the variance of IV, we have a term of order O(δ) reflecting the effect
of $\epsilon_t$:
$$\mathrm{Var}\left(TV_\delta\right) \approx \mathrm{Var}\left(IV(t)\right) + \frac{K}{M}E\left(\int_t^{t+1}\sigma^4(u)du\right). \tag{35}$$
I use the approximation in equation (35) in the estimation. That is, the sample variance of TV is
matched to the expression in (35). Under the conditions of Theorem 2, there is no asymptotic effect
from adding the term $\frac{K}{M}E\left(\int_t^{t+1}\sigma^4(u)du\right)$ to the variance of IV, i.e. Theorem 2 continues to hold.
This approximation can be viewed as a small sample correction (i.e. for a finite number of intraday
observations)$^{21}$ (see Todorov (2006b)). Finally, for the moments of IV I use Theorem 1 in Todorov
(2006a).
• In the estimation the unobservable QV is replaced by RV, using Theorem 2. As for IV, I make a
small sample correction to the variance of QV. I match the sample variance of RV to an approximation
of the variance of RV, which is derived using Theorem 1.$^{22}$ This approximation can be written as
a sum of the variance of QV and an additional O(δ) term. Under the conditions of Theorem 2 the
correction has no asymptotic effect.
• FV is a particular case of realized power variation. It is defined for a day t as
$$FV_\delta(t) = \sum_{i=1}^{M} r_\delta^4(t + (i-1)\delta). \tag{36}$$
It can be shown that for every t, $FV_\delta(t) \xrightarrow{p} \int_{\mathbb{R}^n_0} h^4(\mathbf{x})G(d\mathbf{x})$ as δ → 0 (see Woerner (2006) and Jacod
(2006a)). Thus, FV is a measure of the sum of the price jumps raised to the power four over the
day. I use FV here to completely identify the second order moments of the jumps in the price
and the variance. To calculate the mean of FV I use equation (16) in Theorem 1. This is a high
order approximation. The error of the approximation is of magnitude $O(\delta^{3/2})$ and comes from
the “leverage effect” associated with the link between the diffusive variance factor and the diffusive
price innovation. This error will be zero for the Jump-Driven SV model and the Jump-Diffusive SV
model, provided that in the latter model the continuous innovations in the price and the variance are
independent. I neglect this error in the estimation.$^{23}$
• For calculating the optimal weighting matrix for the GMM-type estimator I use a Parzen kernel with
a lag-length of 80.
• The estimation is performed using the MCMC approach of Chernozhukov and Hong (2003), which treats
the Laplace transform of the objective function as an unnormalized likelihood function and applies
MCMC to the pseudo posterior. The point estimates are the resulting mode of the pseudo posterior.
21 In the estimation it does not have a very significant effect. For example, for the parameter estimates of the Jump-Diffusive
model $\mathrm{Var}(IV(t)) \approx 1.14$ and $\frac{K}{M}E\left(\int_t^{t+1}\sigma^4(u)du\right) \approx 0.08$.
22 The error of approximating the true variance of RV is of magnitude $O(\delta^{1/2})$ and is induced by the “leverage effect” coming
from the link between the diffusive component of the stochastic variance and the diffusive price innovations. In the case of
the Jump-Driven SV models this error is zero.
23 Note that the effect of this error is not covered by Theorem 2. However, numerical experiments suggest that for practical
purposes the error is negligible.
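The constant K in the approximation (34) and the correction term in (35) can be reproduced with a short Python sketch. It is an illustration rather than the paper's code: the function names are hypothetical, $\mu_a$ is computed from the standard closed form $E|u|^a = 2^{a/2}\Gamma((a+1)/2)/\sqrt{\pi}$, and the value passed for the expected integrated quarticity is a made-up input, not an estimate from the paper.

```python
import math


def mu(a):
    """Absolute moment E|u|^a of a standard normal variable u."""
    return 2.0 ** (a / 2.0) * math.gamma((a + 1.0) / 2.0) / math.sqrt(math.pi)


def tripower_constant():
    """K = A / mu_{2/3}^6, with A as given in the text after (33)."""
    m23 = mu(2.0 / 3.0)
    m43 = mu(4.0 / 3.0)
    a_const = (
        m43 ** 3
        - 5.0 * m23 ** 6
        + 2.0 * m23 ** 2 * m43 ** 2
        + 2.0 * m23 ** 4 * m43
    )
    return a_const / m23 ** 6


def tv_variance_correction(expected_quarticity, m):
    """Second term of (35): (K / M) * E[integral of sigma^4 over the day]."""
    return tripower_constant() / m * expected_quarticity


k_const = tripower_constant()
# Hypothetical input: expected integrated quarticity of 1.0 with M = 80 intraday returns.
correction = tv_variance_correction(1.0, 80)
```

Evaluating `tripower_constant()` gives a value close to the K ≈ 3.0613 quoted in the text, and the correction shrinks proportionally to 1/M as more intraday observations are used.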
4.2.3 Estimation results
The estimation results are reported in Tables 1-3. Below I summarize the key findings from the estimation.
• One-factor type stochastic volatility models cannot match the autocorrelation in IV (respectively
TV). This claim holds true regardless of whether the stochastic variance $\sigma^2(t)$ is specified as a sum
of diffusions or as purely jump-driven. Both one-factor type models estimated here produce a very poor
fit, as can be seen from their corresponding J-statistics$^{24}$ reported in the first columns of Tables 1 and
2 respectively. On the other hand, inclusion of an additional variance factor significantly improves
the fit. A two-factor type SV model can match the autocorrelation in IV (respectively TV). Figure 4
plots the fit to the autocorrelation of TV, implied by the parameter estimates for the Jump-Diffusive
SV Model (29)-(32) (an almost identical autocorrelation structure is implied by the model estimates
for the CARMA(2,1) jump-driven SV model). As seen from the Figure, the autocorrelation of TV
is well matched for lags up to forty (in the estimation I matched the autocorrelation of TV up to
lag forty). After lag forty the model-implied autocorrelation slightly underestimates the empirically
observed one. However, it is still well within the 95% confidence interval.
• The Diffusive SV Models estimated here produce a very poor fit to the data, as seen from the results
in Table 1. The reason for this is that the square-root processes cannot generate enough variance
in IV to match the empirically observed one. On the other hand, models which contain jumps in
the stochastic variance $\sigma^2(t)$ can naturally generate enough variance in IV. These models with jumps
in $\sigma^2(t)$ provide a good fit to the data, as seen from the estimation results in Tables 2 and 3. This
observation is in line with the findings in Eraker, Johannes, and Polson (2003), where lower frequency
stock market data is used, and in Broadie, Chernov, and Johannes (2006), where options data is used.
• The estimation results for the models containing jumps in the variance (given in Tables 2 and 3) show
that there is a relationship between the jumps in the variance and the jumps in the price. That is,
$\int_{\mathbb{R}^n_0} h^2(x)k(x)G(dx)$ is statistically different from zero25 . This finding rejects independence between
the jumps in the price and the variance. It is also in line with the observation made at the beginning
of the current Section regarding the positive link between the JV and TV series. It should be noted
that perfect linear dependence between the price jumps and the variance jumps or perfect linear
dependence between the squared price jumps and the variance jumps can be shown to be rejected;
see the analysis in Todorov (2006b). Another dependence structure popular in the literature, in which
the jumps in the price and the variance are compound Poisson, always arrive together and have
independent jump sizes that are normally and exponentially distributed respectively, can also be rejected. As
already discussed in Section 2, here I do not model parametrically the link between the jumps in the
variance and the jumps in the price. Therefore, the results in the paper are not driven by a (possibly
misspecified) parametric model for the jumps.
The estimated two-factor type models are nonnested. To compare them formally we can use a model
selection criterion (MSC) as proposed in Andrews (1999), Andrews and Lu (2001) and Hong et al. (2003)
among others. The MSC can be written as
$$MSC = J - s(\#\text{moments} - \#\text{parameters}) \times k_T,$$
24
The J-statistic is the GMM test for overidentifying restrictions.
25
Note that under the null hypothesis the parameter $\int_{\mathbb{R}^n_0} h^2(x)k(x)G(dx)$ is on the boundary of the parameter space. In this
case the asymptotic distribution (under the null) is truncated normal (see Andrews (2002)); as a result the 5% significance
critical value is 1.65.
where J is the test of overidentifying restrictions for the model, s(·) is an increasing function and $k_T$ is a
sequence satisfying $k_T \to \infty$ and $k_T = o(T)$. This model selection criterion is consistent, i.e. asymptotically
we choose the model with the best fit to the moment conditions, which is most parsimonious26 . In the
case here, all models are estimated with the same number of moment conditions. The Jump-Driven and the Jump-Diffusive SV models have the same number of parameters, which exceeds the number of parameters of the
Diffusive SV model by one. Therefore, the difference in MSC of the Jump-Driven and Jump-Diffusive
SV models is coming only from the difference in their J-statistics. When comparing these two models
with the Diffusive SV model, a small correction to the difference of the corresponding J-statistics should
be made to account for the fact that the Diffusive SV model is more parsimonious. However, given the
high value of the J-statistic for the Diffusive SV model, this correction for parsimony cannot change our
conclusions about the inferior performance of this model. Thus, the two best performing models are the
CARMA(2,1) jump-driven SV model (parameter estimates reported in the second column of Table 2) and
the Jump-Diffusive SV Model (parameter estimates reported in Table 3). The two models produce an
almost identical fit to the moments used in the estimation.
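The comparison logic above can be sketched numerically. The snippet below assumes s(x) = x and k_T = ln(T), a BIC-type choice that satisfies k_T → ∞ and k_T = o(T); both choices and the function name are assumptions for illustration, and no J-statistics from Tables 1-3 are reproduced.

```python
import math

def model_selection_criterion(j_stat, n_moments, n_params, n_obs):
    """MSC = J - s(#moments - #parameters) * k_T with s(x) = x and k_T = ln(T) (assumed)."""
    k_T = math.log(n_obs)
    return j_stat - (n_moments - n_params) * k_T

def msc_difference(j_a, j_b, n_moments, n_params, n_obs):
    """MSC(A) - MSC(B) for two models with equal numbers of moments and parameters."""
    return (model_selection_criterion(j_a, n_moments, n_params, n_obs)
            - model_selection_criterion(j_b, n_moments, n_params, n_obs))
```

When two models share the number of moments and parameters, the penalty term cancels and `msc_difference` reduces to the difference of the J-statistics, which is the situation described for the Jump-Driven and Jump-Diffusive SV models.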
In the subsequent analysis I decide to work with the Jump-Diffusive SV Model for the following reason.
As already discussed in Section 2, the “leverage effect” in the models analyzed here can be captured by
dependence in the diffusive and/or jump innovations in the price and the variance. The CARMA(2,1)
jump-driven SV model can put too much “burden” on the jump specification, since it is only through the
link between the jumps in the price and the variance that this effect can be generated in this model. In
contrast, the Jump-Diffusive SV Model is more flexible in that regard as it can allow for “leverage” coming
from dependence between the diffusive innovations in the price and the variance. Thus, for the subsequent
analysis of the variance risk premium, I work with the Jump-Diffusive SV Model. Following Tauchen
(1985), in Table 4 I report the t-statistics associated with each of the moments used in the estimation.
The results in Table 4 suggest that the Jump-Diffusive Volatility Model has no problem with fitting any
of the moments used in the estimation27 .
I finish the present Section with a short comment on the parameter estimates of the Jump-Diffusive
Model. In line with many other studies in the literature (Andersen, Benzoni, and Lund (2002), Alizadeh,
Brandt, and Diebold (2002), Chernov, Gallant, Ghysels, and Tauchen (2003)) here I find one of the variance
factors to be slowly mean reverting, having a half-life of approximately twenty (business) days, while the
other one to be quickly mean reverting with a half-life of approximately half a day. Perhaps not surprisingly
the quickly mean-reverting factor is the jump component of the variance, while the slowly mean-reverting
variance factor is the continuous component of the variance.
5 Initial Analysis of the Variance Risk Premium
In this Section I start the analysis of the variance risk premium, using the selected model for the futures
price. The variance risk premium is formally defined as the wedge between (conditional) expectation of the
future quadratic variation under the risk-neutral and the physical measure. Thus, the daily-standardized
risk premium for the time-variation in the quadratic variation over the next a days is
$$VR_a(t) = \frac{1}{a}E^Q\left([f,f]_{(t,t+a]} \mid \mathcal{F}_t\right) - \frac{1}{a}E^P\left([f,f]_{(t,t+a]} \mid \mathcal{F}_t\right), \tag{37}$$
where EQ (·) denotes expectation under the risk-neutral measure, known also as equivalent martingale
measure. In case a superscript is not put on the expectation operator, the expectation is always assumed
26
If none of the compared models can fit asymptotically the moment conditions, i.e. for none of the compared models
$m_0(\theta_0) = 0$ (where $m_0(\theta) = E(m(z(t), \theta))$), then for the consistency of the MSC criterion we also need $k_T/\sqrt{T} \to \infty$.
27
However, these t-statistics should be interpreted carefully. In particular, if a moment condition fails, then in general this
affects the consistency of the whole parameter vector. This, in turn, leads to inconsistency even of correct moments. The
diagnostic tests are used here just as one more device of checking if the model has difficulty in matching the moments used in
the estimation.
to be under the physical measure.
The variance risk premium reflects the compensation demanded by investors for two features of the
price process. The first is the time-variation in the variance of the continuous price component σ 2 (t). The
second feature, compensation for which is reflected in the variance risk premium, is the presence of price
jumps. In this Section I construct a measure for the variance risk premium, using VIX index data and the
selected Jump-Diffusive SV model, and analyze the dynamics of the variance risk premium. The empirical
findings in this Section are used as an important guidance for constructing prices of diffusive and jump
risk, which is done in the next Section.
5.1 Variance Risk Premium Measure and Its Properties
I start with constructing a measure for the variance risk premium. In addition to the high-frequency
futures data I use also data on the variance swap rate. The variance swap is a forward contract on the
future quadratic variation28 . At expiration it pays the difference between the quadratic variation over the
horizon of the contract and the fixed variance swap rate (for further details see e.g. Demeterfi, Derman,
Kamal, and Zou (1999)). The variance swap contract involves no initial payment. Therefore, the price of
the contract is equal to the expected value under the risk-neutral measure of the future quadratic variation
over the contract’s horizon. Thus, for a contract with a length of a days (recall our unit of measurement
is (trading) day) the daily-standardized variance swap rate is
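$$SW_a(t) = \frac{1}{a}E^Q\left([f,f]_{(t,t+a]} \mid \mathcal{F}_t\right) = \frac{1}{a}E^Q\left(\int_t^{t+a}\sigma^2(s)\,ds + \int_t^{t+a}\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx) \,\Big|\, \mathcal{F}_t\right). \tag{38}$$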
For now assume that we have daily data on variance swap rates with a contract horizon of a days. Later in
this Section I provide details on the variance swap data. With the realized power variation statistics TV
and RV, calculated from the high-frequency data, we have a very good approximation for the integrated
variance and the quadratic variation respectively. On the other hand, the variance swap rate gives the
conditional expected value of the quadratic variation under the risk-neutral measure. Therefore, studying
the joint behavior of the variance swap rate and TV and RV can allow us to construct a good proxy for
the variance risk premium. I use the Jump-Diffusive SV model (29)-(32) for this (recall from Section 4
that this is our final model choice for the dynamics under the physical measure). The first term in the
variance risk premium formula (37) is the theoretical value of the variance swap and we have data on it.
Thus, in order to construct a proxy for the variance risk premium, we need to construct an estimate for the
conditional expectation of the future quadratic variation under the physical measure. For the discontinuous
component of the quadratic variation this is easy. Its conditional expectation is equal to the unconditional
one since the price jumps are a Lévy process
$$E^P\left(\int_t^{t+a}\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx) \,\Big|\, \mathcal{F}_t\right) = a\int_{\mathbb{R}^n_0} h^2(x)\,G(dx). \tag{39}$$
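Equation (39) can be checked by simulation in a simple special case. The sketch below assumes compound Poisson price jumps with normally distributed sizes and h(x) = x, so that the right-hand side equals a × intensity × jump variance. This parametric jump specification is purely illustrative, since the jumps are not modelled parametrically in this paper, and the function names are hypothetical.

```python
import random

def simulated_jump_qv(a_days, intensity, jump_sd, rng):
    """Sum of squared jumps of a compound Poisson process over a_days."""
    t, total = rng.expovariate(intensity), 0.0
    while t <= a_days:
        total += rng.gauss(0.0, jump_sd) ** 2
        t += rng.expovariate(intensity)
    return total

def expected_jump_qv(a_days, intensity, jump_sd):
    """a * integral of x^2 G(dx) when G = intensity * N(0, jump_sd^2) and h(x) = x."""
    return a_days * intensity * jump_sd ** 2

def monte_carlo_jump_qv(a_days, intensity, jump_sd, n_sims=5000, seed=0):
    """Monte Carlo average of the jump part of the quadratic variation."""
    rng = random.Random(seed)
    draws = (simulated_jump_qv(a_days, intensity, jump_sd, rng) for _ in range(n_sims))
    return sum(draws) / n_sims
```

Because the increments of a Lévy process are independent of the past, the simulation does not need any conditioning information, which is exactly why the conditional and unconditional expectations coincide.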
Turning to the continuous part of the quadratic variation, its conditional expectation is different from the
unconditional one since σ 2 (t) is time-varying. The conditional expectation of IV is a linear function of its
two variance factors (since both of them are AR(1)-type processes). However, this conditional expectation
is not available to the econometrician. At the same time, the Jump-Diffusive SV model (and in fact all SV
models estimated in Section 4) implies an ARMA(2,2) process for the daily IV with coefficients determined
from the structural parameters of the model. This fact could be used to calculate the linear projection of
the integrated variance on the past values of TV and JV (which in turn are being used as a proxy for IV
and QV-IV respectively). The details are provided in Appendix A. I denote this linear projection as
$$P(IV_a(t) \mid \mathcal{G}_t) = \beta + \beta_1 TV_\delta(t-1) + \gamma_1 JV_\delta(t-1) + \dots + \beta_t TV_\delta(0) + \gamma_t JV_\delta(0), \tag{40}$$
28
In practice the unobservable quadratic variation is substituted with the realized variance.
where $\mathcal{G}_t = \sigma(TV_\delta(t-1), \dots, TV_\delta(1), JV_\delta(t-1), \dots, JV_\delta(1))$, i.e. $\mathcal{G}_t$ is the information created from the past
realizations of TV and JV, and we have $\mathcal{G}_t \subset \mathcal{F}_t$. $\beta, \beta_1, \gamma_1, \dots$ are the linear projection coefficients. Thus, a
feasible measure for the premium at date t for the variance risk over the next a days is
$$RP_a(t) = SW_a(t) - \int_{\mathbb{R}^n_0} h^2(x)\,G(dx) - \frac{1}{a}P(IV_a(t) \mid \mathcal{G}_t). \tag{41}$$
RPa (t) is a proxy for V Ra (t). The approximation comes from substituting the conditional expectation of
the future integrated variance with its linear projection on the past values of TV and JV. This introduces
error for several reasons. First, the information set of the investor might be much larger. Secondly,
P (IVa (t)|Gt ) is just a linear projection and since the distribution of IV is non-Gaussian (as suggested by the
empirical evidence in Section 4), the linear projection does not coincide with the conditional expectation.
Note that I work in a semiparametric setting since the jumps are not modelled parametrically. This means
that, in general, it will not be possible in this setting to determine the coefficients from projecting also on
nonlinear functions of TV and JV.
How useful is the proposed measure for the variance risk premium? Our interest is in answering the
question whether there is time-variation in the variance risk premium and if so what determines it. It is
easy to show that, if an asymptotic approximation for TV and JV given in Appendix A holds exactly, we
have the following
$$\mathrm{Cov}(RP_a(t), TV_\delta(t-j)) = \mathrm{Cov}(VR_a(t), IV(t-j)), \tag{42}$$
$$\mathrm{Cov}(RP_a(t), JV_\delta(t-j)) = \mathrm{Cov}(VR_a(t), QV(t-j) - IV(t-j)). \tag{43}$$
Below I illustrate how we can make use of these covariances to identify the dynamics of the variance
risk premium.
Consider first the case of a constant variance risk premium. In this case the conditional expectation
of the future quadratic variation under the risk-neutral measure differs from the one under the physical measure only by a constant. That
is, in the case of a constant variance risk premium, the variance swap rate is
$$SW^c_a(t) := \frac{1}{a}E\left(IV_a(t) + \int_t^{t+a}\int_{\mathbb{R}^n_0} h^2(x)\,G(dx) \,\Big|\, \mathcal{F}_t\right) + K, \tag{44}$$
where K is some constant. Therefore, using (30), (31) and (32) it is easy to derive
$$SW^c_a(t) = K_0 + \frac{1 - e^{-\kappa_1 a}}{a\kappa_1}V^c(t) + \frac{e^{\rho_1 a} - 1}{a\rho_1}V^j(t), \tag{45}$$
for some constant K0 . That is, in the case of a constant variance risk premium, the variance swap rate is
a linear combination of the variance risk factors. Note that the coefficients in front of the variance factors
in SWac (t) are not free and are determined by the persistence in these factors under the physical measure.
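To get a feel for the size of the restricted coefficients in (45), the sketch below evaluates them for a one-month horizon. The mean-reversion rates are back-of-the-envelope values implied by the approximate half-lives mentioned at the end of Section 4 (about twenty days and about half a day); they are not the estimates from Table 3, the sign convention ρ1 < 0 is an assumption needed for the jump factor to decay, and the function name is hypothetical.

```python
import math

def constant_premium_loadings(kappa1, rho1, a_days):
    """Loadings on V^c(t) and V^j(t) in equation (45)."""
    load_c = (1.0 - math.exp(-kappa1 * a_days)) / (a_days * kappa1)
    load_j = (math.exp(rho1 * a_days) - 1.0) / (a_days * rho1)
    return load_c, load_j

# Illustrative rates implied by half-lives of about 20 days and half a day.
kappa1 = math.log(2.0) / 20.0
rho1 = -math.log(2.0) / 0.5  # assumed negative: V^j decays at rate |rho1|
load_c, load_j = constant_premium_loadings(kappa1, rho1, a_days=22)
```

With these illustrative rates the loading on the slowly mean-reverting factor is roughly 0.70, while the loading on the quickly mean-reverting factor is roughly 0.03, so under a constant premium a shock to the fast factor would move the one-month swap rate very little.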
A natural generalization is to consider a variance risk premium specification under which the variance
swap is a linear combination of the variance factors, but the coefficients in front of them are not restricted.
This corresponds to a variance risk premium, which is linear in the variance factors29 . Virtually all
measure changes considered in the finance literature imply a variance risk premium that is linear in the
variance factors. In this case we have time-variation in the variance risk premium and this time-variation
is determined solely by the two variance factors. Thus, under such specification of the variance risk
premium, the variance swap rate is
$$SW^v_a(t) := K_0 + K_c V^c(t) + K_j V^j(t), \tag{46}$$
29
In the next Section I derive prices of diffusive and jump risk which support such variance risk premium specification.
where $K_0$, $K_c$ and $K_j$ are some constants. Importantly, the coefficients $K_c$ and $K_j$ are left unrestricted30 .
The variance swap rate corresponding to the constant variance risk premium scenario can be recovered by
constraining $K_c = \frac{1 - e^{-\kappa_1 a}}{a\kappa_1}$ and $K_j = \frac{e^{\rho_1 a} - 1}{a\rho_1}$ in equation (46).
Can we distinguish these two scenarios for the variance risk premium using our measure RPa (t)? I
work with the variance swap specification SWav (t) in (46), since SWac (t) is a constrained version of it.
Then, it is easy to show that
$$\mathrm{Cov}(RP_a(t), TV_\delta(t-i)) = 0 \ \text{ for } i = 1, 2, \dots \iff K_c = \frac{1 - e^{-\kappa_1 a}}{a\kappa_1} \ \text{ and } \ K_j = \frac{e^{\rho_1 a} - 1}{a\rho_1}.$$
In other words, provided there is time-varying risk premium with time-variation determined by the variance
factors, our risk premium measure RP should be correlated with the past values of TV.
Further, with the measure RP we can investigate whether both the jump and diffusion parts of the
stochastic variance σ 2 (t) determine the time-variation in the variance risk premium. It is easy to derive
$$\mathrm{Cov}(RP_a(t), JV_\delta(t-i)) = 0 \iff K_j = \frac{e^{\rho_1 a} - 1}{a\rho_1} \ \text{ or } \ \int_{\mathbb{R}^n_0} h^2(x)k(x)\,G(dx) = 0.$$
That is, the measure RP will be correlated with the past squared price jumps, provided the jumps in the
price and the variance are dependent and in addition the variance risk premium depends on the variance
jump factor $V^j(t)$. The empirical evidence in Section 4 indicates $\int_{\mathbb{R}^n_0} h^2(x)k(x)\,G(dx) \neq 0$. Thus, with the
covariance between RP and TV and RP and JV, we can differentiate constant variance risk premium from
variance risk premium that is linear in the variance factors. Further, because of the link between the price
and variance jumps, using these covariances, we can also determine if both variance factors determine the
variation in the variance risk premium.
5.2 VIX Index
In this Subsection I provide details on the construction of the variance swap rate. In the empirical study
I use a one-month variance swap rate. The variance swap is an over-the-counter derivative product, but
its theoretical value can be replicated by a portfolio of standard European-style options. The theoretical
results in Carr and Wu (2004), Britten-Jones and Neuberger (2000) (see also Bakshi and Madan (2000)
and Carr and Madan (2001)) show that we have the following for the variance swap rate
$$SW_a(t) = \frac{e^{r_{t,a}}}{a}\int_0^\infty \frac{2Q(K,t,a)}{K^2}\,dK + \epsilon_{t,a}, \tag{47}$$
where $r_{t,a}$ denotes a risk-free interest rate at time t for the period (t, t + a) and Q(K, t, a) is the time t
price of an out-of-the-money European-style option with strike price K and time to expiration a. $\epsilon_{t,a}$ is an
approximation error, which will be zero if the price does not contain jumps31 . The numerical experiments
in Carr and Wu (2004) suggest that this error is not significant for practical purposes. In fact, since in this
paper I am interested in the time-variation of the variance risk premium, taking into account this error
will not change any of the conclusions in the paper.
The result in (47) means that the option data on each of the days can be used to construct the one-month variance swap rate. This theoretical result is used by the CBOE in the calculation of the new VIX
index, which is therefore a proxy for the one-month variance swap rate32 . The index is based on European
30
The constant $K_0$ in (46) and (44) might differ of course. Our interest here is in the coefficients $K_c$ and $K_j$.
31
As shown in Carr and Wu (2004), for the general SV model in (1)-(4), the error is
$\epsilon_{t,a} = -\frac{2}{a}\int_t^{t+a}\int_{\mathbb{R}^n_0}\left(e^{h(x)} - 1 - h(x) - \frac{h^2(x)}{2}\right)\nu^Q(ds,dx)$, where $\nu^Q(\cdot,\cdot)$ is the compensator of the jump measure µ under the
risk-neutral measure.
32
The VIX index was introduced by CBOE in 1993. The old VIX index (also called VXO) computes the average Black and
Scholes (1973) implied volatility with strikes close to the current index level and two nearest maturities, so that one month
implied volatility is interpolated. The old VIX index was based on options written on the S&P 100 index. In 2003 CBOE
changed the calculation of the VIX index and also calculated the new VIX index back to 1990. The new VIX index uses
market prices, instead of implied volatilities (unlike VXO) and is approximating the variance swap rate.
options written on the S&P 500 index. The formula for the implied variance, used in the calculation of
the new VIX index, is a discretization of the portfolio of continuum of options in (47), i.e.
$$\sigma^2(t,h) = \frac{2}{h}\sum_i \frac{\Delta K_i}{K_i^2}e^{r_{t,h}}Q(K_i,t,h) - \frac{1}{h}\left(\frac{F_{t,h}}{K_0} - 1\right)^2, \tag{48}$$
where h is time to expiration of the option contracts (measured in calendar years, thus differing from our
convention of trading time) used in the calculation; rt,h is the risk-free interest rate at time t for the period
till expiration (t, t + h); Ft,h is the forward index level derived from the index option prices; K0 is the first
strike below the forward index level Ft,h ; Q(Ki , t, h) is the midpoint of the bid-ask spread for each option
with strike Ki ; Ki is the strike of the out-of-the-money option, which is a call if Ki > Ft,h and a put if
$K_i < F_{t,h}$; $\Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}$, while for the lowest strike it is just the difference between the lowest and
next higher strike and similarly for the highest strike it is just the difference between the highest strike
and the next lower strike.
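The discretization in (48) can be sketched as follows. The function `implied_variance` follows the description above, including the treatment of the lowest and highest strikes; the helper `black76_otm_price` exists only to generate synthetic option prices with a flat volatility and a zero interest rate, so that the output can be compared with a known variance. All names, the strike grid and the test inputs are assumptions for illustration, and this is not the CBOE's production procedure.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black76_otm_price(forward, strike, vol, h):
    """Undiscounted Black (1976) price of the out-of-the-money option (synthetic data only)."""
    sd = vol * math.sqrt(h)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    call = forward * norm_cdf(d1) - strike * norm_cdf(d2)
    # Put for strikes below the forward (via put-call parity), call otherwise.
    return call - (forward - strike) if strike < forward else call

def implied_variance(strikes, prices, forward, rate_factor, h):
    """Discretized formula (48); rate_factor stands for exp(r_{t,h})."""
    k0 = max(k for k in strikes if k <= forward)  # first strike below the forward
    n = len(strikes)
    total = 0.0
    for i, (k, q) in enumerate(zip(strikes, prices)):
        if i == 0:
            dk = strikes[1] - strikes[0]          # lowest strike
        elif i == n - 1:
            dk = strikes[-1] - strikes[-2]        # highest strike
        else:
            dk = 0.5 * (strikes[i + 1] - strikes[i - 1])
        total += dk / (k * k) * rate_factor * q
    return 2.0 / h * total - (forward / k0 - 1.0) ** 2 / h
```

With synthetic prices generated from a flat 20% volatility, the function should return a value close to 0.04, up to discretization and truncation error from the finite strike grid.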
Since on each day we do not have data on options with time to expiration exactly one calendar month,
the CBOE calculates the new VIX index by using the following linear interpolation
$$VIX(t) = 100\sqrt{\left\{h_1\sigma^2(t,h_1)\frac{N_{h_2} - N_{30}}{N_{h_2} - N_{h_1}} + h_2\sigma^2(t,h_2)\frac{N_{30} - N_{h_1}}{N_{h_2} - N_{h_1}}\right\}\frac{365}{30}}, \tag{49}$$
where h1 and h2 are the two nearest maturities from the available options, and Nh1 and Nh2 are the number
of calendar days to expiration of the options.
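A minimal sketch of the interpolation in (49) is given below. It assumes that N30 equals 30 calendar days and that h1 and h2 are the calendar-day counts divided by 365; the function name and the inputs in the usage note are hypothetical.

```python
import math

def vix_from_two_maturities(var1, n1, var2, n2, n30=30, days_per_year=365):
    """Equation (49): interpolate h * sigma^2 to 30 calendar days, annualize, take the root."""
    h1, h2 = n1 / days_per_year, n2 / days_per_year
    w1 = (n2 - n30) / (n2 - n1)   # weight on the nearer maturity
    w2 = (n30 - n1) / (n2 - n1)   # weight on the farther maturity
    total_var_30 = h1 * var1 * w1 + h2 * var2 * w2
    return 100.0 * math.sqrt(total_var_30 * days_per_year / n30)
```

A simple sanity check: if both maturities carry the same annualized implied variance of 0.04, for instance with 16 and 44 calendar days to expiration, the interpolation returns an index level of 20.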
In the calculation of the VIX index a calendar-counting convention is used. That is, the year consists
of 365 days and in computing the time to expiration for the options, the actual number of days is being
used. However, in this paper I adopted a trading-time counting. That is, a unit of time here is one trading
day. I do not consider the overnight returns in the analysis of the S&P 500 futures. I continue to use the
trading-time convention and assume that each month consists of 22 trading days. I use the VIX index to
calculate a daily-standardized variance swap rate with one month horizon (corresponding to 22 trading
days) according to the following formula
$$SW_{22}(t) = \frac{30}{365}\,\frac{1}{22}\,VIX^2(t). \tag{50}$$
The above is just a transformation, so that the variance swap rate is reported in daily variance units and
thus is directly comparable with RV and TV over the days in the sample.
The data for the VIX index consists of closing prices for every date for which we have high-frequency
data on the S&P 500 futures contract. Thus, we have data covering the period January 2, 1990, to
November 29, 2002, for a total of 3256 daily observations. For each of the days in the sample I calculate
from the VIX index a one-month variance swap rate according to formula (50).
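The unit conversion in (50) amounts to one line of arithmetic. The sketch below is illustrative; the function name and the example index level are assumptions.

```python
def daily_variance_swap_rate(vix, calendar_days=30, days_per_year=365, trading_days=22):
    """Equation (50): SW_22(t) = (30/365) * (1/22) * VIX(t)^2."""
    return calendar_days / days_per_year / trading_days * vix ** 2

example_rate = daily_variance_swap_rate(20.0)  # hypothetical VIX level of 20
```

For a VIX level of 20 this gives a daily-standardized rate of about 1.49, in the same units (daily percentage variance) as RV and TV.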
5.3 Initial Analysis of the Variance Risk Premium
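The variance swap rate can be decomposed as
$$SW_a(t) = VR_a(t) + E^P\left(\frac{1}{a}[f,f]_{(t,t+a]} \,\Big|\, \mathcal{F}_t\right).$$
Using this formula we have
$$E(VR_a(t)) = E(SW_a(t)) - \frac{1}{a}E([f,f]_{(t,t+a]}),$$
$$\mathrm{Var}(VR_a(t)) \geq \left(\sqrt{\mathrm{Var}(SW_a(t))} - \frac{1}{a}\sqrt{\mathrm{Var}\left(E([f,f]_{(t,t+a]} \mid \mathcal{F}_t)\right)}\right)^2.$$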
Thus, using the parameter estimates of the Jump-Diffusive SV model as well as the variance swap rate
data, we have an estimate for the first two moments of the variance risk premium. The estimated mean of
the variance risk premium is 0.6827, while the mean of the variance swap is 1.6542. This shows that the
variance risk premium is rather significant. An estimate for the lower bound of the variance of the variance
risk premium is 0.3401, while the estimated variance of the variance swap is 1.2775. Thus, the variance
risk premium shows significant variation over time and this accounts for a large part of the variation of the
variance swap rate. These results underline the importance of studying the variance risk premium and in
particular its dynamics.
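The variance bound used above follows from the decomposition of the swap rate: since the correlation between the premium and the physical expectation is at most one in absolute value, the standard deviation of the premium is at least the absolute difference of the other two standard deviations. The sketch below encodes this bound and the arithmetic implied by the reported means; the function name is hypothetical, and the derived quantities are simple consequences of the figures quoted in the text rather than additional estimates.

```python
import math

def variance_lower_bound(var_sw, var_expected_qv):
    """Lower bound (sd(SW) - sd(E^P[QV]/a))^2 for Var(VR)."""
    return (math.sqrt(var_sw) - math.sqrt(var_expected_qv)) ** 2

mean_sw, mean_vr = 1.6542, 0.6827      # means reported in the text
mean_expected_qv = mean_sw - mean_vr   # implied mean of the physical expectation
premium_share = mean_vr / mean_sw      # share of the swap rate due to the premium
```

On these figures the premium accounts for roughly 41% of the average variance swap rate.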
I finish this Section with an analysis of the measure RP (evaluated at the parameter estimates reported
in Table 3). I use the high-frequency data and the parameter estimates of the Jump-Diffusive SV model,
reported in Table 3, to estimate the linear projection in (40). Appendix A contains details on the Kalman
filter used in the construction of the linear projection. Using the estimated linear projection and the data
on the variance swap rate I construct the measure for the variance risk premium. Figure 5 plots the RP
series together with TV and JV. As seen from the Figure, the RP series is quite persistent and is generally
above zero. We can distinguish three periods in the sample. The first is the beginning of the sample
and it covers roughly all of 1990 and half of 1991. This period is characterized by average values of
the variance of the continuous component and high price jump activity. As seen from the RP series, this
period is associated with a high variance risk premium as well. The second period in the sample lasts
from the middle of 1991 till approximately the beginning of 1996. This period is relatively tranquil and
is associated with very low levels of the variance (both of the continuous and the jump component). Our
measure for the variance risk premium RP shows very little variation and is close to zero in this period.
The last part of the sample covers the period from 1996 till 2002. This period has a very high variance
from the continuous component. We can detect many days in this period in which there are big changes
in TV. The same holds true for JV, i.e. an indication for many days with nontrivial price jumps. The
RP measure for this period is relatively high. Also, it shows significant variation. It peaks in the days of
high variance and after this shows slow decay. Thus, Figure 5 suggests that the jumps and the level of
IV are important factors determining the variance risk premium. To investigate further this conjecture I
compute the covariance between RP and past values of TV and JV. In Figure 6 I plot these covariances.
I summarize the findings from the analysis of the Figure as follows.
• The top panel of Figure 6 shows that RP covaries positively with past values of TV. The 95% lower
bound for these covariances is well above zero. This indicates that these covariances are statistically
different from zero. In Table 5 I report the results from Wald tests for zero covariance. The null
hypothesis of these tests is that all covariances between RP and past values of TV up to a certain lag
are equal to zero. The results in Table 5 confirm the evidence for the strong dependence between RP
and past values of TV. As seen from equation (42) this means that the variance risk premium has
time-variation, which depends on the level of the variance of the continuous price component σ 2 (t).
• The bottom panel of Figure 6 shows that RP covaries positively with past values of JV. The 95%
lower bound for these covariances is above zero. As for the covariance between RP and past values
of TV, in Table 5 I report Wald tests for zero covariances between RP and past values of JV up to
a certain lag. These tests indicate that there is a statistically significant relation between RP and
past values of JV. This, in turn, implies that the past values of the squared price jumps are also a
determinant for the time-variation in the variance risk premium (in addition to the variance of the
continuous price component).
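The covariances discussed in the two points above are sample covariances between RP at date t and TV or JV at earlier dates. A minimal sketch of that calculation is given below; the function name is hypothetical, and the confidence bounds and Wald tests reported in Figure 6 and Table 5 are not reproduced.

```python
def lagged_covariance(rp, x, lag):
    """Sample covariance between rp[t] and x[t - lag] for lag >= 1."""
    pairs = [(rp[t], x[t - lag]) for t in range(lag, len(rp))]
    n = len(pairs)
    mean_rp = sum(p[0] for p in pairs) / n
    mean_x = sum(p[1] for p in pairs) / n
    return sum((p[0] - mean_rp) * (p[1] - mean_x) for p in pairs) / n
```

Calling `lagged_covariance(rp, tv, j)` for j = 1, 2, ... traces out the kind of lag profile that is then compared with zero.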
6 Modeling and Inference for Time-Varying Variance Risk Premium
The main question I try to answer in this Section is whether we can “rationalize” the empirical evidence of
Section 5 for the time-variation in the variance risk premium. Can we find prices for the different risks in
the Jump-Diffusive SV model, which are consistent with no arbitrage and support the empirical findings
in Section 5? I start by deriving very general prices of risks in the Jump-Diffusive SV model, which are
consistent with no arbitrage. Following that, I conduct a joint inference using the high-frequency data and
the variance swap data and test the various specifications for the prices of risk in the model.
In order to avoid confusion I restate our final choice for the evolution of the futures price process
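$$df(t) = \alpha(t)\,dt + \sigma(t)\left(\rho\,dB_1(t) + \sqrt{1-\rho^2}\,dB_2(t)\right) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu(dt,dx), \tag{51}$$
$$\sigma^2(t) = V^c(t) + V^j(t), \tag{52}$$
$$dV^c(t) = \kappa_1(\theta_1 - V^c(t))\,dt + \sigma_{1v}\sqrt{V^c(t)}\,dB_1(t), \tag{53}$$
$$V^j(t) = e^{\rho_1 t}V^j(0) + \int_0^t\int_{\mathbb{R}^n_0} e^{\rho_1(t-s)}k(x)\,\mu(ds,dx), \tag{54}$$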
where (B1 , B2 ) is a standard Brownian motion33 .
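To make the structure of the two variance factors concrete, the sketch below simulates (53) and (54) on a fixed time grid. It is only an illustration under strong assumptions: the variance jumps are taken to be compound Poisson with exponentially distributed sizes, which is a hypothetical parametric choice (the paper leaves the jumps nonparametric); the price equation (51) and the dependence between price and variance jumps are not simulated; and the square-root diffusion is discretized with a simple truncated Euler scheme. All parameter values in the usage note are made up.

```python
import math
import random

def simulate_variance_factors(n_days, steps_per_day, kappa1, theta1, sigma1v,
                              rho1, jump_rate, mean_jump, seed=0):
    """Euler-type simulation of V^c in (53) and V^j in (54) on a fixed grid."""
    rng = random.Random(seed)
    dt = 1.0 / steps_per_day
    vc, vj = theta1, 0.0
    path = []
    for _ in range(n_days * steps_per_day):
        vc_pos = max(vc, 0.0)                       # truncation keeps the root real
        d_b1 = rng.gauss(0.0, math.sqrt(dt))
        vc += kappa1 * (theta1 - vc_pos) * dt + sigma1v * math.sqrt(vc_pos) * d_b1
        vj *= math.exp(rho1 * dt)                   # exact decay between jumps
        if rng.random() < jump_rate * dt:           # at most one jump per step
            vj += rng.expovariate(1.0 / mean_jump)  # hypothetical exponential k(x)
        path.append((max(vc, 0.0), vj))
    return path
```

For example, `simulate_variance_factors(2000, 20, 0.035, 0.5, 0.1, -1.386, 0.5, 0.4)` produces a slowly moving continuous factor and a spiky, quickly decaying jump factor whose long-run average is close to jump_rate × mean_jump / |rho1|.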
6.1 Prices of Risk and Change of Measure
The fundamental theorem of asset pricing implies, under some technical conditions, that no arbitrage is
equivalent to the existence of an Equivalent Martingale Measure (hereafter abbreviated as EMM) under
which the discounted gain process associated with an asset is a local martingale 34 . The futures contract
involves no initial payment and as a result, assuming that the contract is continuously marked to market,
the futures price F (t) is a local martingale under the EMM (see e.g. Duffie (2001)35 ).
Turning to the specification of the EMM, the presence of jumps in the futures price renders the market
essentially incomplete. That is, we cannot complete it by including in the investor’s portfolio a finite
number of securities36 . This means that in general there are infinitely many EMMs which are consistent
with no arbitrage. Recall from the previous Section that P denotes the physical measure and Q the
risk-neutral measure, i.e. the EMM. The change of measure is specified with the following density process
$$\frac{dQ}{dP}\Big|_{\mathcal{F}_T} = Z(T) = \mathcal{E}\left(\int_0^T \psi_1(s)\,dB_1(s) + \int_0^T \psi_2(s)\,dB_2(s) + \int_0^T\int_{\mathbb{R}^n_0}(Y(s,x) - 1)\,\tilde\mu(ds,dx)\right) \tag{55}$$
where $\mathcal{E}(\cdot)$ is the stochastic exponential37 . The stochastic processes ψ1 (t) and ψ2 (t) are predictable and
the stochastic function Y (t, x) is nonnegative and predictable.
The technical conditions for Z(T ) to define an EMM are discussed in Appendix B. Here I assume that
these conditions are satisfied and focus on analysis of their implications for the variance risk premium.
33
Note that I made a slight change in notation here. I decomposed the Brownian motion in the price process into two
orthogonal components. The reason for this is that the pricing kernel is easier to be written with respect to a standard
Brownian motion.
34
More formally, as shown in Delbaen and Schachermayer (1998), the condition “No free lunch with vanishing risk” is
equivalent to the existence of an equivalent measure under which the discounted gain process is a σ-martingale. The notion of
“No free lunch with vanishing risk” can be viewed as a slight modification of no arbitrage opportunities. σ-martingales include
local martingales, but the opposite does not hold. For more see Harrison and Kreps (1979), Harrison and Pliska (1981) and
the more recent work of Delbaen and Schachermayer (1994, 1998). Here I will avoid these complications and I will assume
that an equivalent martingale measure exists, i.e. the discounted gain process will be a local martingale under it.
35
This is subject to a boundedness condition on the interest rate process, but this assumption can be relaxed; see Pozdnyakov
and Steele (2004).
36
Except trivial cases, for example when the jumps are a standard Poisson process or more generally when the set of possible
jump sizes is finite. See the discussion in Cont and Tankov (2004) and Cont, Tankov, and Voltchkova (2005).
37
Recall that the stochastic (Doléans-Dade) exponential of a given semimartingale X is defined as the solution of dY = Y− dX
and Y0 = 1. It has the property that if X is a local martingale, so is Y (see e.g. Jacod and Shiryaev (2003) for further
properties).
Under the measure Q, $B_1(t) - \int_0^t \psi_1(s)\,ds$ and $B_2(t) - \int_0^t \psi_2(s)\,ds$ are Brownian motions and the jump
measure µ has compensator Y (t, x)dtG(dx) (recall, under the measure P, µ has compensator dtG(dx)).
The stochastic processes ψ1 (t), ψ2 (t) and the (stochastic) function Y (t, x) determine the prices of the
different risks in the stochastic volatility model (51)-(54). ψ1 (t) and ψ2 (t) are the prices for the diffusion
type risk in the price and the variance, while Y (t, x) determines the compensation for the presence of jumps
in the price and the variance. ψ1 (t) and Y (t, x) determine together the variance risk premium, which we
are after. ψ1 (t) and ψ2 (t) determine the compensation for the diffusive price risk. A typical assumption
is that this compensation is proportional to the stochastic variance; see Pan (2002) for example. In this paper
I am not interested in it and as a result I will leave ψ2 (t) unspecified. The drift term α(t) contains the
compensation for the diffusive and jump price risks, i.e. (see Theorem 3 in Appendix B)
$$\alpha(t) = -\frac{1}{2}\sigma^2(t) - \rho\,\sigma(t)\psi_1(t) - \sqrt{1-\rho^2}\,\sigma(t)\psi_2(t) - \int_{\mathbb{R}^n_0}\left(Y(t,x)(e^{h(x)} - 1) - h(x)\right)G(dx). \tag{56}$$
In general $\rho \neq 0$ (because of the “leverage effect”) and therefore the drift term contains information about the
variance risk premium. However, since the estimation in this paper is based on realized power variation
statistics constructed from high-frequency data, the effect of the time-varying drift term α(t) is negligible.
Therefore, I do not make use of α(t) for the identification of the variance risk premium in this paper.
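The effect of Y(t, x) on the jump compensator can be illustrated in the simplest special case: a standard Poisson process with constant Y and no diffusive part (ψ1 = ψ2 = 0). In that case the stochastic exponential in (55) reduces to exp((1 − Y)λT) Y^{N_T}, and the jump intensity under Q is Yλ. The Monte Carlo sketch below checks that this density has mean one under P and that it reweights the expected number of jumps to YλT. The setup, the parameter values and the function names are all assumptions for illustration and are far simpler than the specification studied in this paper.

```python
import math
import random

def poisson_draw(mean, rng):
    """Poisson variate obtained by counting unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def density_z(n_jumps, intensity, y, horizon):
    """Stochastic exponential of (y - 1) * (N_T - intensity * T) for constant y."""
    return math.exp((1.0 - y) * intensity * horizon) * y ** n_jumps

def check_measure_change(intensity=2.0, y=1.5, horizon=1.0, n_sims=20000, seed=0):
    """Return Monte Carlo estimates of E^P[Z(T)] and E^P[Z(T) * N_T]."""
    rng = random.Random(seed)
    sum_z = sum_zn = 0.0
    for _ in range(n_sims):
        n = poisson_draw(intensity * horizon, rng)
        z = density_z(n, intensity, y, horizon)
        sum_z += z
        sum_zn += z * n
    # E^P[Z] should be near 1, and E^Q[N_T] = E^P[Z N_T] near y * intensity * T.
    return sum_z / n_sims, sum_zn / n_sims
```

With Y > 1 the risk-neutral measure makes jumps more frequent than under the physical measure, which is the channel through which a jump risk premium enters quantities such as the variance swap rate.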
The variance risk premium contains compensation for diffusive and jump type risk, since the model has
price jumps and in addition σ 2 (t) contains both a diffusive and a jump factor. The pricing of the diffusive
risk is well studied, while pricing of jump risk is less so. Before analyzing different specifications for ψ1 (t)
and Y (t, x), I briefly discuss the pricing of jump risk and how it differs from pricing diffusive risk.
The stochastic function Y (t, x) specifies compensation for each possible jump size x at each point of
time t. This is fundamentally different from the pricing of diffusive risk, where at each point of time we
have a single price (e.g. ψ1 (t) for the Brownian motion B1 (t)). This explains why the market is incomplete
in the presence of jumps. When there are only diffusive risks we need to include in the portfolio a finite
number of instruments (i.e. assets and different derivatives on them) which have sensitivity with respect
to those diffusive risks and this completes the market. Intuitively, the diffusive risks have a local Gaussian
behavior and an appropriately weighted set of instruments sensitive to those risks could completely eliminate
(hedge) them. The situation is very different in the presence of jumps. In this case, it is not enough
to include instruments which are sensitive towards jumps. In the presence of jumps we need a hedging
instrument for each possible jump size. Thus, provided the jumps have an infinite number of possible
jump sizes, the market cannot be completed by a finite number of instruments. Therefore, for example,
the compensation for price jump risk reflected in the drift term α(t) in equation (56) is an “average” over
the compensation for all possible jump sizes at time t.
I turn now to modeling the variance (and jump) risk premium, i.e. modeling ψ1 (t) and Y (t, x) in the
stochastic volatility model (51)-(54). The typical way of specifying the change of measure (i.e. the prices
of risk) is such that the model is of the same class under both measures (physical and risk-neutral). The
main reason for analyzing such measure changes is analytical tractability. For the SV model (51)-(54),
used here, this means that under the risk-neutral measure the jumps in the price and the variance are
again Lévy processes, and the stochastic variance is a sum of a square-root process and a jump-driven OU
process, possibly with different parameters. This, however, might be too restrictive, particularly for the
jumps. Therefore, I consider also measure changes for jumps which go beyond the “structure-preserving”
ones. In the next two subsections I analyze the pricing of the diffusive and the jump risks determining
the variance risk premium, i.e. ψ1 (t) and Y (t, x). It is convenient for the subsequent analysis to split the
variance risk premium as follows
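$$VR_a(t) = VR^c_a(t) + VR^j_a(t),$$
where
$$VR^c_a(t) = \frac{1}{a}E^Q\left(\int_t^{t+a} V^c(s)\,ds \,\Big|\, \mathcal{F}_t\right) - \frac{1}{a}E^P\left(\int_t^{t+a} V^c(s)\,ds \,\Big|\, \mathcal{F}_t\right),$$
$$VR^j_a(t) = \frac{1}{a}E^Q\left(\int_t^{t+a}\!\!\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx) + \int_t^{t+a} V^j(s)\,ds \,\Big|\, \mathcal{F}_t\right) - \frac{1}{a}E^P\left(\int_t^{t+a}\!\!\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx) + \int_t^{t+a} V^j(s)\,ds \,\Big|\, \mathcal{F}_t\right). \qquad (57)$$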
Thus, $VR^c_a(t)$ is the part of the variance risk premium which is due to the compensation for the
time-variation in the continuous variance factor $V^c(t)$. It is determined by ψ1(t). The other component of
the variance risk premium, $VR^j_a(t)$, is determined only by the price of jump risk Y(t, x). It consists of
compensation for the time-variation in $V^j(t)$ as well as a compensation for the presence of jumps in the
price. I refer to $VR^c_a(t)$ as diffusive variance risk premium and to $VR^j_a(t)$ as jump variance risk premium.
6.1.1 Specification of ψ1(t)
All the proofs for the measure changes considered in this subsection are given in Appendix B. Let
$\phi_1(t) = \psi_1(t)\,\sigma_{1v}\sqrt{V^c(t)}$. Then, for any specification of ψ1(t), $V^c(t)$ has the following dynamics under the measure
Q,
$$dV^c(t) = \big(\kappa_1\theta_1 - \kappa_1 V^c(t) + \phi_1(t)\big)\,dt + \sigma_{1v}\sqrt{V^c(t)}\,dB_1^Q(t), \qquad (58)$$
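and this implies
$$\frac{1}{a}E^Q\left(\int_t^{t+a} V^c(s)\,ds \,\Big|\, \mathcal{F}_t\right) = \frac{1-e^{-\kappa_1 a}}{\kappa_1 a}\,V^c(t) + \frac{a\kappa_1 + e^{-a\kappa_1} - 1}{a\kappa_1}\,\theta_1 + \int_t^{t+a}\frac{e^{\kappa_1(u-t)}-1}{a\kappa_1}\,E^Q\big(\phi_1(u)\,|\,\mathcal{F}_t\big)\,du,$$
therefore
$$VR^c_a(t) = \int_t^{t+a}\frac{e^{\kappa_1(u-t)}-1}{a\kappa_1}\,E^Q\big(\phi_1(u)\,|\,\mathcal{F}_t\big)\,du.$$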
Thus, the time-variation in φ1 (t) determines the time-variation in the variance risk premium coming from
the compensation for the diffusive variance risk. I analyze two different specifications for ψ1 (t). The first
corresponds to a constant diffusive variance risk premium. It is given by
D1. $\psi_1(t) = \dfrac{\lambda}{\sigma_{1v}\sqrt{V^c(t)}}$, where λ is a constant such that $\lambda \ge \dfrac{\sigma_{1v}^2}{2} - \kappa_1\theta_1 \ge 0$.
Under this specification ψ1 (t) is inversely proportional to the square-root of the diffusive variance factor.
An implication of this specification for ψ1 (t) is that φ1 (t) is constant. Therefore, under specification D1
V Rac (t) = const. I further generalize this specification to allow for time-variation in V Rac (t). This is done
in D2.
D2. $\psi_1(t) = \dfrac{\lambda_0 + \lambda_1 V^c(t)}{\sigma_{1v}\sqrt{V^c(t)}}$, where λ0 and λ1 are constants such that $\lambda_0 \ge \dfrac{\sigma_{1v}^2}{2} - \kappa_1\theta_1 \ge 0$.
First, we are trivially back to the constant diffusive variance risk premium case if λ1 = 0. This
specification for ψ1 (t) is called extended affine price of risk in Cheridito, Filipović, and Kimmel (2005).
It is easy to derive that in this case V Rac (t) = const0 + const1 × V c (t), i.e. the diffusive variance risk
premium is an affine function of the diffusive variance factor. This specification implies that the only
relevant information at a given time for the diffusive variance risk premium is the level of the diffusive
variance factor itself. This change of measure, with the restriction λ0 = 0 imposed, is one of the most
frequently used for empirical finance applications. The fully general case, i.e. when λ0 ≠ 0, has only
recently been used by Wu (2005) and Cheridito, Filipović, and Kimmel (2005) and formally justified in
Cheridito, Filipović, and Kimmel (2005) (see also Cheridito, Filipović, and Yor (2005)).
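The affine form of the diffusive premium under D2 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it uses the observation that, with φ1(t) = λ0 + λ1 V^c(t), the drift in (58) is again linear, with mean-reversion speed κ1 − λ1 and long-run mean (κ1θ1 + λ0)/(κ1 − λ1); it assumes κ1 − λ1 > 0 and that conditional means under both measures follow the usual linear-drift formula. The function names and the parameter values in the usage lines are hypothetical and are not estimates from the paper.

```python
import math


def avg_weight(kappa, a):
    """Weight (1 - exp(-kappa*a)) / (kappa*a) placed on the current variance level."""
    x = kappa * a
    if x == 0.0:
        raise ValueError("kappa * a must be non-zero")
    return -math.expm1(-x) / x


def expected_avg_variance(v0, kappa, theta, a):
    """(1/a) * E[ integral of V over [t, t+a] | V(t) = v0 ] for a linear-drift factor."""
    w = avg_weight(kappa, a)
    return w * v0 + (1.0 - w) * theta


def diffusive_vrp_d2(v0, kappa1, theta1, lam0, lam1, a):
    """Diffusive variance risk premium under D2: Q-expectation minus P-expectation.

    Assumption (hypothetical helper, not from the paper): under Q the factor has
    speed kappa1 - lam1 > 0 and long-run mean (kappa1*theta1 + lam0)/(kappa1 - lam1).
    """
    kappa_q = kappa1 - lam1
    if kappa_q <= 0.0:
        raise ValueError("this sketch assumes kappa1 - lam1 > 0")
    theta_q = (kappa1 * theta1 + lam0) / kappa_q
    q_part = expected_avg_variance(v0, kappa_q, theta_q, a)
    p_part = expected_avg_variance(v0, kappa1, theta1, a)
    return q_part - p_part


if __name__ == "__main__":
    # Illustrative values only.
    for v0 in (0.5, 1.0, 1.5):
        print(v0, diffusive_vrp_d2(v0, kappa1=2.0, theta1=0.5, lam0=0.1, lam1=0.3, a=1.0))
```

Equal steps in the initial variance level produce equal steps in the premium, which is the affine relation stated in the text; setting `lam1=0.0` removes the dependence on the variance level, as in D1.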
Further generalizations of the specification could also be considered, in two directions. The first is to
include additional information besides that already contained in the process $V^c(t)$, for example the past
level of jumps in the price or the variance. The second generalization is related to the persistence in
$VR^c_a(t)$. Under D2 the memory of the diffusive variance risk premium is tied to the memory of the
diffusive variance risk factor. However, the information contained in $V^c(t)$ could be “summarized” in a
different state variable (still adapted to the filtration generated by $V^c(t)$), which differs in its degree of
persistence. I do not consider these alternatives here for two reasons. The first is that it seems relatively
harder to do this in an analytically tractable way. The second reason is that I make such a generalization
for the price of jump risk and we can only observe the sum of the jump and diffusive variance risk premium.
6.1.2 Specification of Y(t, x)
In this subsection I model the price of jump risk, i.e. Y (t, x). All the proofs for the measure changes
considered in this subsection can be found in Appendix B. Each of the cases for Y (t, x) generalizes the
previous one. The simplest specification for Y (t, x) is when it is only a function of x (i.e. of the jump
size). This case is considered in specification J1.
J1. Y(t, x) = ϑ(x), where ϑ(x) > 0,
$$\int_{\mathbb{R}^n_0}\Big(\sqrt{\vartheta(x)} - 1\Big)^2 G(dx) < \infty \quad\text{and}\quad \int_{\mathbb{R}^n_0}\big|(\vartheta(x) - 1)\,h(x)\big|\,G(dx) < \infty.$$
Note that, since under this specification Y(t, x) is non-stochastic and does not depend on t, the jumps
under the measure Q are again Lévy processes. Such a measure change for jump-driven OU processes (i.e. the SV
model (51)-(54) with $V^c(t) = 0$) is analyzed in Nicolato and Venardos (2003). Under such a measure change
$VR^j_a(t) = const$, that is, there is no time-variation in the jump variance risk premium. This specification
might be too restrictive in view of the empirical evidence in Section 5. Also, Pan (2002) finds that (for the
particular model she is using) time-varying jump risk premia are very important in reconciling the spot and
option price data as well as explaining volatility “smirks” in the cross section of options. A natural way
to introduce time-variation in the jump risk premium is by specifying it as an affine function of the jump
variance component. This is done with the following specification.
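J2. $Y(t,x) = \vartheta_0(x) + \vartheta_1(x)V^j(t-)$, where $\vartheta_0(x) \ge 0$ and $\vartheta_1(x) \ge 0$ and in addition
$$\int_{\mathbb{R}^n_0}(\vartheta_0(x) - 1)^2\,G(dx) < \infty, \qquad \int_{\mathbb{R}^n_0}\vartheta_1^2(x)\,G(dx) < \infty \quad\text{and}\quad \int_{\mathbb{R}^n_0} k(x)\vartheta_1(x)\,G(dx) < -\rho_1.$$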
First, for ϑ1(x) = 0 we are back to specification J1. Under specification J2, it is easy to show that the
jump variance risk premium is linear in the jump variance factor, i.e. $VR^j_a(t) = const_0 + const_1 \times V^j(t)$. Under
J2, the jumps (in the price and the variance) have time-varying intensity under the risk-neutral measure,
while they are time-homogeneous (i.e. have no time-variation) under the physical measure. However, it
should be noted that in terms of tractability little is lost. The model still falls in the class of generalized
affine models as defined in Duffie, Filipović, and Schachermayer (2003). I generalize further this jump risk
premium specification in J3.
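J3. $Y(t,x) = \vartheta_0(x) + \vartheta_1(x)\tau(t-)$, where $\vartheta_0(x) \ge 0$ and $\vartheta_1(x) \ge 0$ and in addition
$$\tau(t) = e^{\rho_\tau t}\tau(0) + \int_0^t\!\!\int_{\mathbb{R}^n_0} e^{\rho_\tau(t-s)}\zeta(x)\,\mu(ds,dx), \qquad \rho_\tau < 0, \qquad \zeta(x) \ge 0,$$
$$\int_{\mathbb{R}^n_0}\zeta(x)\,G(dx) < \infty \quad\text{and}\quad \int_{\mathbb{R}^n_0}\zeta^4(x)\,G(dx) < \infty,$$
$$\int_{\mathbb{R}^n_0}(\vartheta_0(x) - 1)^2\,G(dx) < \infty, \qquad \int_{\mathbb{R}^n_0}\vartheta_1^2(x)\,G(dx) < \infty \quad\text{and}\quad \int_{\mathbb{R}^n_0}\zeta(x)\vartheta_1(x)\,G(dx) < -\rho_\tau.$$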
Under specification J3, the jump variance risk premium can be written as V Raj (t) = const0 + const1 ×
τ (t), i.e. it is linear in τ (t). The state variable τ (t) is a Lévy-driven OU process. For ρτ = ρ1 and
ζ(x) = k(x), we have τ (t) = V j (t) and thus we recover specification J2. Specification J3 generalizes J2
in two directions. First, by allowing ζ(·) to differ from k(·), I allow for different information, besides the
one contained in the jump variance factor V j (t), to enter the state variable τ (t). This can be done by
setting ζ(x) to depend on elements in the vector x on which the function k(x), determining the jumps in
the variance, does not depend. The second generalization is to allow the persistence in the state variable
τ (t) to differ from that of the jump variance component V j (t). That is, τ (t) and V j (t) might have the
same information (i.e. the jumps in the variance) but this information could be synthesized in a different
way. This difference in persistence can be achieved by letting ρτ ≠ ρ1. It should be pointed out that
specification J3 is still analytically tractable and under the risk-neutral measure the model is again in the
generalized affine class.
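To make the role of the state variable τ(t) concrete, the following Python sketch evaluates the defining formula of J3 for a finite list of realized jumps. It is a minimal illustration, assuming the jump times and the corresponding values ζ(x) are given as inputs; the function names and the numbers in the usage lines are hypothetical. The helper `half_life` uses the standard relation ln 2 / |ρ| for an exponential kernel.

```python
import math


def tau_at(t, jump_times, zeta_values, rho_tau, tau0=0.0):
    """tau(t) = exp(rho_tau*t)*tau0 + sum over jumps s in (0, t] of exp(rho_tau*(t-s))*zeta."""
    if rho_tau >= 0.0:
        raise ValueError("J3 requires rho_tau < 0")
    total = math.exp(rho_tau * t) * tau0
    for s, z in zip(jump_times, zeta_values):
        if z < 0.0:
            raise ValueError("J3 requires zeta(x) >= 0")
        if 0.0 < s <= t:
            total += math.exp(rho_tau * (t - s)) * z
    return total


def half_life(rho):
    """Time for the effect of a jump to decay to one half under an exponential kernel."""
    return math.log(2.0) / abs(rho)


if __name__ == "__main__":
    # Illustrative values only: two jumps, slow decay.
    rho_tau = -0.02
    for day in (0.0, 5.0, 10.0, 40.0, 80.0):
        print(day, tau_at(day, [5.0, 10.0], [1.0, 0.5], rho_tau))
    print("half-life:", half_life(rho_tau))
```

Calling the same function with ρ1 in place of ρτ and k(x) in place of ζ(x) gives the jump variance factor, which is the sense in which J3 nests J2.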
Finally, J3 can be generalized even further. For example, τ(t) can have a kernel which is a mixture of
exponentials (e.g. a CARMA(2,1) kernel). This will offer more flexibility in modeling the persistence in the
jump variance risk premium. I did not make this generalization in order to keep the modeling parsimonious.
Also, the time-variation in jumps of different size and/or sign might be modelled differently. Examples
are big jumps being more persistent than small ones and/or negative jumps having more persistence as
compared with positive ones (under the risk-neutral measure of course). All these scenarios are plausible
but I do not explore them here for reasons of identification. The analysis of the jump risk premium here is
based on using the conditional second moment of the risk-neutral distribution of the return process (i.e. the
variance swap contract). Other conditional moments of the risk-neutral distribution of the return process
are needed in order to identify different time-variation in the different jumps. Such analysis is beyond the
scope of the paper.
Overall, I conclude that for the purposes of exploring the variance risk premium the measure change
for the jumps given in J3 is general enough and at the same time very easy to work with.
6.2 Inference for Time-Varying Variance Risk Premium
In this Subsection I conduct a joint inference using high-frequency futures data and data on the variance
swap. I work with specification D2 for ψ1 (t) and specification J3 for Y (t, x). As argued above, we have
D1 ⊂ D2 and J1 ⊂ J2 ⊂ J3.
The different cases for the diffusive and jump risk, analyzed here, have testable implications. Mainly,
the covariances between the variance risk premium and past values of the daily IV and squared price jumps
are different. Therefore, these moments could be used to discriminate between the different cases. In the
previous Subsection I showed that $VR^c_a(t) = const_0 + const_1 \times V^c(t)$ under specification D2 and that
$VR^j_a(t) = const_0 + const_1 \times \tau(t)$ under specification J3. Therefore, under D2 and J3 we have
$$\mathrm{Cov}\left(VR_a(t),\ \int_{t-i-a}^{t-i}\sigma^2(s)\,ds\right) = K^c_\phi\,e^{-\kappa_1 i} + K^c_\tau\,e^{\rho_\tau i}, \qquad (59)$$
and
$$\mathrm{Cov}\left(VR_a(t),\ \int_{t-i-a}^{t-i}\!\!\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx)\right) = K^j_\tau\,e^{\rho_\tau i}, \qquad (60)$$
where $K^c_\phi$, $K^c_\tau$, $K^j_\tau$ and $\rho_\tau < 0$ are some constants.
Further, the constant $K^c_\tau$ is proportional to $\int_{\mathbb{R}^n_0} k(x)\zeta(x)\,G(dx) \ge 0$ and $K^j_\tau$ is proportional to
$\int_{\mathbb{R}^n_0} h^2(x)\zeta(x)\,G(dx) \ge 0$. Also, the results in Section 4 indicate that there is a highly statistically significant
link between the price and variance jumps, i.e. $\int_{\mathbb{R}^n_0} h^2(x)k(x)\,G(dx) > 0$. However, note that even in this
case we might have $\int_{\mathbb{R}^n_0} h^2(x)\zeta(x)\,G(dx) > 0$ and still $\int_{\mathbb{R}^n_0} k(x)\zeta(x)\,G(dx) = 0$.³⁸ In such a scenario $K^c_\tau = 0$
and $K^j_\tau \ne 0$. Similarly we might have $K^c_\tau \ne 0$ with $K^j_\tau = 0$.
Before discussing how to use the data to estimate the moments in (59) and (60) I briefly elaborate
on the implications which different scenarios for the variance risk premium have on the values of these
covariances.
• Constant variance risk premium (D1 and J1). In this case $K^c_\tau = K^j_\tau = K^c_\phi = 0$. Note that this
scenario is observationally equivalent³⁹ to the case when φ1(t) and τ(t) are time-varying, but are both
(at least linearly) independent from σ²(t) and the squared price jumps.⁴⁰
• Affine-in-variance factors variance risk premium (D2 and J2). In this case ρτ = ρ1. Note that in
this case if the variance jump factor $V^j(t)$ has short memory, this will imply that the impact of the
jumps on the variance risk premium dies out quickly as well. Also, since $\int_{\mathbb{R}^n_0} h^2(x)k(x)\,G(dx) > 0$,
under J2 we can either have $K^c_\tau = K^j_\tau = 0$ or $K^c_\tau \ne 0$ and $K^j_\tau \ne 0$.
• Jump variance risk premium depends only on component of price jumps orthogonal to the variance
jumps. This is possible only under specification J3. The implication of this scenario is $K^c_\tau = 0$ and
$K^j_\tau \ne 0$.
• Jump variance risk premium depends only on component of variance jumps that is orthogonal to
price jumps. As in the previous scenario, this case is possible under the general specification for the
jump risk J3. Its implication is $K^j_\tau = 0$ and $K^c_\tau \ne 0$.
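The covariance patterns in (59) and (60) and the scenarios listed above can be written out as a short Python sketch. This is only an illustration: the function names are hypothetical, the inputs are the constants of (59) and (60) supplied by the user, and the last branch of the classifier (time-variation from the diffusive factor alone) is an added case that is not one of the scenarios listed in the text.

```python
import math


def cov_vr_past_iv(lag, k_phi_c, k_tau_c, kappa1, rho_tau):
    """Right-hand side of (59) at lag i: K_phi^c*exp(-kappa1*i) + K_tau^c*exp(rho_tau*i)."""
    return k_phi_c * math.exp(-kappa1 * lag) + k_tau_c * math.exp(rho_tau * lag)


def cov_vr_past_jumps(lag, k_tau_j, rho_tau):
    """Right-hand side of (60) at lag i: K_tau^j*exp(rho_tau*i)."""
    return k_tau_j * math.exp(rho_tau * lag)


def jump_scenario(k_phi_c, k_tau_c, k_tau_j, tol=1e-12):
    """Map the three constants to the scenarios discussed in the text."""
    c_zero = abs(k_tau_c) < tol
    j_zero = abs(k_tau_j) < tol
    if c_zero and j_zero:
        if abs(k_phi_c) < tol:
            return "constant premium (D1 and J1)"
        # Not listed in the text; included so that every input gets a label.
        return "time-variation from the diffusive variance factor only"
    if c_zero:
        return "price-jump component orthogonal to variance jumps (J3)"
    if j_zero:
        return "variance-jump component orthogonal to price jumps (J3)"
    return "jump component common to price and variance (J2 or J3)"


if __name__ == "__main__":
    # Illustrative constants only.
    for lag in (1, 10, 30):
        print(lag,
              cov_vr_past_iv(lag, 0.2, 0.0, 0.5, -0.02),
              cov_vr_past_jumps(lag, 1.0, -0.02))
    print(jump_scenario(0.2, 0.0, 1.0))
```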
Turning to the estimation of the covariances, I make use of the RP measure constructed in Section 5.
To underline its dependence on the parameters I denote it here as RPa (t, θ), where θ denotes the parameter
vector. Recall from Section 5 that we have⁴¹
$$\mathrm{Cov}\big(RP_a(t,\theta),\ TV_\delta(t-i)\big) = \mathrm{Cov}\big(VR_a(t),\ IV(t-i)\big),$$
$$\mathrm{Cov}\big(RP_a(t,\theta),\ JV_\delta(t-i)\big) = \mathrm{Cov}\left(VR_a(t),\ \int_{t-i-a}^{t-i}\!\!\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx)\right).$$
38
Example of such a case is the following. Suppose x = (x1, x2) > 0 (that is, all of the components are nonnegative with
at least one being strictly positive) and x1 and x2 are independent of each other, i.e. $\int_{\mathbb{R}^n_0} 1_{(x_1 x_2 \ne 0)}\,G(dx) = 0$. Then set
$h(x) = \sqrt{x_1 + x_2}$, $k(x) = x_1$ and $\zeta(x) = x_2$.
39
When the covariances in (59) and (60) are used for identification of the time-variation in the variance risk premium.
40
In the specifications D1 and D2 such a scenario is not possible. For the jump variance risk premium this could happen
only under specification J3. Example is the case where x = (x1, x2, x3) > 0 with $\int_{\mathbb{R}^n_0} 1_{(x_1 x_2 x_3 \ne 0)}\,G(dx) = 0$ and $h(x) = x_1$,
$k(x) = x_2$ and $\zeta(x) = x_3$.
41
As already mentioned in Section 5, in the calculation of the linear projection $P(IV_a(t)|G_t)$ given in Appendix A it is
assumed that an asymptotic approximation for RV and TV stated in this Appendix holds exactly. If this is not the case
we will not have the exact equalities above. However, under the conditions of Theorem 2, we know that asymptotically our
estimation results will be unchanged from substituting IV with TV and QV with RV.
Using this the estimation is done as follows. For each value of the parameters I calculate the linear
projection of IVa (t) on past values of TV and JV (the details on the calculation are given in Appendix A).
Using this linear projection and the data on the VIX index, for each value of the parameter vector I
construct RPa (t, θ) series. In the estimation I match the following set of moments
• Mean, Variance and Autocovariance of IV
• Mean and Variance of QV
• Mean of FV
• Mean of VR
• Covariance between VR and past IV
• Covariance between VR and past QV-IV
The autocovariances of IV used in the estimation are for lags 1, 3 and 6, together with the average
autocovariances for lags 11-20 and 21-30. In comparison with the estimation in Section 4, I dropped one of the
moment conditions. For the covariance between VR and past IV I use the averages over lags of TV 1-10,
11-20, 21-30 and 41-50. The same is done for the covariance between VR and past values of QV-IV.
Thus, overall, I match 19 moments. Following Theorem 2, I substitute the unobservable quantities IV, QV
and VR with TV, RV and RP respectively. All additional details of the estimation are as in Section 4.
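The lag-window averaging of covariances described above can be sketched as follows. This Python snippet is a hedged illustration, not the estimation code of the paper: the function names are hypothetical, the sample covariance uses a simple 1/m normalization, and the two input series stand for daily RP and TV (or RV minus TV) values supplied by the user. The lag windows are the ones named in the text.

```python
# Lag windows used for the covariances between VR and past IV (and past QV-IV).
LAG_WINDOWS = [(1, 10), (11, 20), (21, 30), (41, 50)]


def lagged_cov(x, y, lag):
    """Sample covariance between x[t] and y[t - lag] for lag >= 0."""
    n = len(x)
    if len(y) != n or lag < 0 or lag >= n - 1:
        raise ValueError("series must have equal length and lag must satisfy 0 <= lag < n - 1")
    xs = x[lag:]          # x[t] for t = lag, ..., n - 1
    ys = y[:n - lag]      # y[t - lag] for the same t
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / m


def block_avg_cov(x, y, first_lag, last_lag):
    """Average of lagged covariances over lags first_lag, ..., last_lag."""
    lags = range(first_lag, last_lag + 1)
    return sum(lagged_cov(x, y, i) for i in lags) / len(lags)


def window_moments(x, y, windows=LAG_WINDOWS):
    """One averaged covariance per lag window."""
    return [block_avg_cov(x, y, lo, hi) for lo, hi in windows]


if __name__ == "__main__":
    # Synthetic series for illustration only: x is y delayed by two periods.
    y = [0.0, 1.0, 0.0, -1.0] * 20
    x = [0.0, 0.0] + y[:-2]
    print(window_moments(x, y))
```

With four windows for each of the two covariance sets, these averages account for eight of the 19 moments listed above.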
The estimation results are reported in Table 6. The test for overidentifying restrictions shows that the
specification of the prices of risk, together with the stochastic volatility model, provides a relatively good
fit to the moments used in the estimation. Comparing the estimation results in Table 3 and Table 6, we
can see that the parameters determining the evolution of the futures price under the physical measure
do not change substantially when the additional moments identifying the time-variation in the variance
risk premium are included in the estimation. Also, note that K0 is very high and statistically significant,
confirming the observation made already in Section 5 of a non-trivial variance risk premium. I continue
with analysis of the parameters controlling the time-variation in the variance risk premium. I summarize
the findings in the following points:
1. The parameters determining the covariance between RP and past values of TV are not very accurately
estimated, as indicated by their relatively big standard errors.
2. The coefficient Kφc is negative (although statistically insignificant). This implies that an increase in
the diffusive variance factor affects inversely the variance risk premium. This is contrary to what we
would expect. In the model the negative premium for the diffusive variance factor can be explained
with offsetting a rather big premium for the jumps.
3. I test the hypothesis that the jumps in the price and/or variance do not determine the time-variation
in the variance risk premium. This is equivalent to testing $K^c_\tau = K^j_\tau = 0$. Note that ρτ is present
only under the alternative hypothesis and is not identified under the null hypothesis. I use a criterion
difference test, i.e. $J(\hat\theta_r) - J(\hat\theta)$, where $J(\theta) = T\,\hat m_T(\theta)'\,\hat W\,\hat m_T(\theta)$ is the GMM objective function⁴²
and $\hat\theta_r$ and $\hat\theta$ are the restricted and unrestricted estimates respectively. From the results in Andrews
(2001), it follows that the criterion difference test has a χ² distribution with 2 degrees of freedom
under the null hypothesis.⁴³ The value of the test is 16.00 with corresponding p-value of 0.0003.
This indicates that the jumps are an important state variable determining the time-variation in the
variance risk premium. In addition, the test also provides very strong evidence against a
“structure-preserving” type measure change for the jumps, i.e. specification J1.
42
Note that I use optimal weighting matrix for the GMM estimator, as this is crucial for the asymptotic distribution of the
test.
43
Assumption 3 in Andrews (2001) is easy to verify in our setting. This assumption is needed for proving the asymptotic
distribution of the criterion difference test. It concerns the convergence of the first derivative of the GMM objective function,
as a function of the nuisance parameter ρτ present only under the alternative (the convergence is on the space of continuous
functions of ρτ equipped with the uniform metric). The first derivative of the vector of moment conditions is a continuous
4. Given the strong evidence for the importance of the jumps in determining the time-variation in the
variance risk premium, it is interesting to investigate what component of the jumps determines this
time-variation. If this is a component in the jumps common for both price and variance we must have
$K^c_\tau \ne 0$ and $K^j_\tau \ne 0$. If this is a component contained only in the price jumps (i.e. orthogonal to the
variance jumps) we should have $K^c_\tau = 0$ and $K^j_\tau \ne 0$. Similarly, if it is a component present only in
the variance jumps we should have $K^c_\tau \ne 0$ and $K^j_\tau = 0$. The coefficient $K^j_\tau$ is statistically different
from zero, while $K^c_\tau$ is not. Therefore, this could be interpreted as evidence that the time-variation
in the variance risk premium is determined by a component in the price jumps, which is orthogonal to
(i.e. independent from) the variance jumps. However, this hypothesis can be true only if the diffusive
variance risk is also priced.⁴⁴ Thus, overall, I conclude that there are two possible scenarios. The
first is that the diffusive variance factor does not determine the time-variation in the variance risk
premium. In this case the time-variation in the variance risk premium is determined by components
of both price and variance jumps. The second possible scenario is that the diffusive variance factor
determines the variation in the variance risk premium and in addition a component only present in
the price jumps determines the variation in the variance risk premium (when the restriction $K^c_\tau = 0$
is imposed, $K^c_\phi$ becomes positive).
5. The autoregressive coefficients ρ1 and ρτ differ significantly. I calculate a Wald test for the hypothesis
ρ1 = ρτ. The value of the test is 36.39 with a corresponding p-value of 0.0000. In other words, the
difference between ρ1 and ρτ is highly statistically significant. Note that, under the specification J2 for the jump
risk, ρ1 and ρτ should be equal. Therefore, the estimation results provide strong evidence against
such a specification.
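The p-values quoted for these test statistics can be reproduced from chi-square tail probabilities using only the Python standard library. The sketch below is an illustration under stated assumptions: the criterion difference test has 2 degrees of freedom, as stated in point 3; the Wald test of ρ1 = ρτ is assumed to have 1 degree of freedom (a single restriction) and the Wald test of the joint hypothesis on two coefficients, with value 11.39, is assumed to have 2. The function names are hypothetical.

```python
import math


def chi2_sf_2df(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom: exp(-x/2)."""
    return math.exp(-x / 2.0)


def chi2_sf_1df(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom: erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    print("criterion difference test, value 16.00:", chi2_sf_2df(16.00))
    print("Wald test, value 11.39 (2 df assumed):", chi2_sf_2df(11.39))
    print("Wald test, value 36.39 (1 df assumed):", chi2_sf_1df(36.39))
```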
Overall, the empirical results uncover a non-trivial variance risk premium, whose time-variation depends
strongly on the price jumps and the stochastic variance. However, the results here are not conclusive
as to which component of σ²(t), the diffusive or the jump one, drives the variation in $VR_a(t)$. Further, we can
reconcile the data on the variance swap rate and the underlying stock market index, but only with the use
of a general (and quite flexible) specification for the price of jump risk.
The general specification of the price of jump risk is needed for two reasons. First, it allows the jumps
to behave quite differently under the two measures. On one hand, under the physical measure, the jumps
generate little time-variation. For the price jumps this follows from the fact that they are of Lévy type and
therefore time-homogeneous. For the variance jumps this holds since the estimated value of ρ1 indicates very
little persistence: the implied half-life is approximately half a day. Under the risk-neutral measure the behavior
of the jumps is very different. Their compensator depends on a state variable which is very persistent
(under P). The half-life implied by ρτ is over 30 business days. Thus, when a jump occurs (in the price and/or
the variance) its effect on the evolution of the futures price under the physical measure disappears rather
quickly. However, the effect of the jumps on the variance risk premium persists for a very long period
of time. The second flexibility which J3 offers is to allow different components of the price and variance
jumps to drive the time-variation in the variance risk premium.
Finally, the empirical findings regarding the jump risk premium suggest that investors do fear jumps
and that their fear of jumps increases after big market drops. This finding can be potentially explained
with a habit persistence type equilibrium model⁴⁵ in which habits are affected by jumps. In such a model
the representative investor will treat differently the diffusive and jump risk, as in Liu, Pan, and Wang
(2005) and Bates (2006). When a jump occurs, the investor’s willingness to protect her portfolio against future
big (negative) jumps increases. This could be done, for example, by buying deep out-of-the-money put
options, which pushes up their price. This, in turn, results in an increase in the variance risk premium.
function of ρτ and the moment vector does not depend on ρτ under the null hypothesis. Functional convergence of the first
derivative of the GMM objective function, as a function of ρτ, then follows by establishing finite-dimensional convergence
(which in turn follows essentially from a standard CLT for the moment vector) and C-tightness of the sequence,
which can be verified using Theorem 8.3 in Billingsley (1968).
44
A Wald test for the hypothesis $K^c_\phi = K^c_\tau = 0$ has a value of 11.39 with a corresponding p-value of 0.0034. Thus, this
hypothesis is strongly rejected.
7 Conclusion
This paper provides an arbitrage-free explanation of the dynamics of the variance risk premium, observed in
the data, in the framework of a general stochastic volatility model. The study underlines the importance of
general jump specifications for the modeling of financial time series. On the one hand, jumps are important
for explaining the observed high-frequency data on the stock market index. On the other hand, this paper
finds that a flexible jump risk premium is needed for explaining the dynamics of the variance risk premium
uncovered from the data.
I find that the stochastic volatility model needs both jumps in the price and the variance. Furthermore,
the estimation results indicate that the jumps in the price and the variance are dependent, but this
dependence cannot be generated by the most commonly used parametric stochastic volatility models.
The jumps in the model are Lévy and therefore have no time-variation. This assumption is reasonable
at least for the analysis of the quadratic variation of the price process (and its components), which is
conducted here. The reason is that the daily squared price jumps, estimated from the high-frequency
data, show statistically insignificant autocorrelation. Extensions to allow for time-dependence in the jump
process could be considered but they will not change qualitatively the findings.
When the measure is changed from the physical to the risk-neutral, the time-dependence in the jumps
changes significantly. In particular, the estimation results in the paper show that the jumps can no
longer be Lévy under the risk-neutral measure. The reason for this finding is that such a scenario will result
in a constant jump risk premium and this will fail to match the observed dependence of the variance risk
premium on past price jumps. Indeed, I find strong evidence that supports a jump risk premium depending
on a highly-persistent state variable. Moreover, in the model used here this state variable cannot be the
jump variance factor since the latter has a relatively short memory. This means that we need far more
general prices of jump risk, like the ones derived in the paper, in order to reconcile the joint dynamics of
the underlying asset and the variance swap rate.
The estimation results suggest that after a market crash the investors are willing to pay more in order
to hedge against future big (negative) jumps. This is also indicative of a special attitude towards extreme
negative events. Thus, preferences with an external habit influenced by jumps on the markets appear to
be a plausible explanation for this phenomenon.
Finally, there are several important directions in which the current work can be extended. First, the
analysis of the dynamics of the variance risk premium can be done in a completely parametric framework.
The advantage of this is that an efficient estimator can be used. The parametric modeling should take into
account the key findings of this paper. Mainly, the jumps in the price and the variance should be modelled
in a flexible way as dependent and the price of jump risk should be specified as a very persistent process
depending on the price jumps. A second extension of the current work is to generalize further the jump
risk premium to account for differential pricing of small versus big and positive versus negative jumps. An
appealing conjecture is that the pricing of the big negative jumps is much more persistent as these are
the jumps the investors fear most. Such asymmetric modeling of the price of jump risk is impossible to
identify in the setting here. The reason is that in this paper I use only the second moment of the
risk-neutral distribution, i.e. the variance swap rate. However, other moments of the risk-neutral distribution
could also be constructed, using the theoretical results of Bakshi and Madan (2000) and Carr and Madan
(2001). These moments can potentially allow estimating the more general asymmetric prices of jump risk
suggested above. A final extension of the current work is to incorporate the equity risk premium in the
analysis. Indeed, the price of jump risk affects both the equity and the variance risk premium. Therefore,
a joint study of equity and variance risk premium could help better identify the jump risk premium and
at the same time provide an additional test for the pricing of jump risk introduced in this paper.
45
As in Campbell and Cochrane (1999).
Appendices
A Calculation of P(IVa(t)|Gt) for the Jump-Diffusive Volatility SV Model (29)-(32)
I start with stating an asymptotic result about the joint behavior of Realized Variance and Realized
Tripower Variation in the presence of price jumps. I assume that Assumption 3 and Assumption 4
hold and denote
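$$\{f_\delta\}^{[2]}_t = \sum_{j=1}^{\lfloor t/\delta\rfloor}\big(f_{j\delta} - f_{(j-1)\delta}\big)^2,$$
$$\{f_\delta\}^{[2/3,2/3,2/3]}_t = \sum_{j=3}^{\lfloor t/\delta\rfloor}\big|f_{(j-2)\delta} - f_{(j-3)\delta}\big|^{2/3}\,\big|f_{(j-1)\delta} - f_{(j-2)\delta}\big|^{2/3}\,\big|f_{j\delta} - f_{(j-1)\delta}\big|^{2/3}.$$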
This notation is adopted from Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005). Then,
combining the results in Barndorff-Nielsen, Shephard, and Winkel (2006) and Jacod (2006a,b) we have
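$$\delta^{-1/2}\begin{pmatrix}\{f_\delta\}^{[2]}_t - \int_0^t\sigma^2(s)\,ds - \int_0^t\!\!\int_{\mathbb{R}^n_0} h^2(x)\,\mu(ds,dx)\\[4pt] \{f_\delta\}^{[2/3,2/3,2/3]}_t - \mu_{2/3}^3\int_0^t\sigma^2(s)\,ds\end{pmatrix} \xrightarrow{\ law\ } L_1(t) + L_2(t),$$
where
$$L_1(t) = \int_0^t \sigma^2(u)\,A\,dW(u), \qquad L_2(t) = \begin{pmatrix}\sum_{s\le t} 2\Delta f_s\big(\sqrt{\kappa_s}\,u_s\,\sigma_{s-} + \sqrt{1-\kappa_s}\,u'_s\,\sigma_s\big)\\[4pt] 0\end{pmatrix},$$
and W(u) is a Brownian motion, $\kappa_s \sim U[0,1]$, $u_s \sim N(0,1)$, $u'_s \sim N(0,1)$ and furthermore W and
the sequences $(\kappa_s)$, $(u_s)$, $(u'_s)$ are independent of each other, are defined on an extension of the original
probability space, and are independent from the process f. A is a matrix of constants such that
$$AA' = \begin{pmatrix} 2 & 3\mu_{2/3}^2\mu_{8/3} - 3\mu_{2/3}^3\\[4pt] 3\mu_{2/3}^2\mu_{8/3} - 3\mu_{2/3}^3 & \mu_{4/3}^3 - 5\mu_{2/3}^6 + 2\mu_{2/3}^2\mu_{4/3}^2 + 2\mu_{2/3}^4\mu_{4/3}\end{pmatrix}.$$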
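The two realized measures defined at the start of this Appendix are straightforward to compute from a grid of observations. The Python sketch below is a minimal illustration, assuming that `log_prices` holds the values $f_{j\delta}$ on an equally spaced grid and that $\mu_p$ denotes the p-th absolute moment of a standard normal variable, $\mu_p = 2^{p/2}\,\Gamma((p+1)/2)/\Gamma(1/2)$; the function names are hypothetical.

```python
import math


def abs_moment(p):
    """E|u|^p for u ~ N(0,1): 2^(p/2) * Gamma((p+1)/2) / Gamma(1/2)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.gamma(0.5)


def increments(log_prices):
    """Consecutive differences f_{j*delta} - f_{(j-1)*delta}."""
    return [b - a for a, b in zip(log_prices, log_prices[1:])]


def realized_variance(log_prices):
    """Sum of squared increments, i.e. {f_delta}^{[2]}."""
    return sum(r * r for r in increments(log_prices))


def tripower_sum(log_prices):
    """Sum of products of three adjacent absolute increments, each to the power 2/3."""
    r = [abs(v) ** (2.0 / 3.0) for v in increments(log_prices)]
    return sum(r[j - 2] * r[j - 1] * r[j] for j in range(2, len(r)))


def tripower_variation(log_prices):
    """Tripower sum scaled by mu_{2/3}^(-3), so that it targets the integrated variance."""
    return tripower_sum(log_prices) / abs_moment(2.0 / 3.0) ** 3


if __name__ == "__main__":
    # A path that is flat except for one jump: the jump enters the realized
    # variance, while every tripower product contains a zero increment.
    path = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
    print(realized_variance(path), tripower_variation(path))
```

The usage lines show why the difference between the two measures is informative about jumps: an isolated jump contributes its square to the realized variance but nothing to the tripower sum.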
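Using this asymptotic result we can write the following approximation for $TV_\delta(t)$ and $JV_\delta(t)$
$$TV_\delta(t) = \theta + (IV(t) - \theta) + \nu_{1t}, \qquad JV_\delta(t) = \int_{\mathbb{R}^n_0} h^2(x)\,G(dx) + \int_t^{t+1}\!\!\int_{\mathbb{R}^n_0} h^2(x)\,\tilde\mu(ds,dx) + \nu_{2t},$$
where $(\nu_{1t}, \nu_{2t})$ is a martingale difference sequence with
$$E\big(\nu_{1t}^2\big) = \frac{1}{M}\,E\left(\int_t^{t+1}\sigma^4(u)\,du\right)\left(\frac{\mu_{4/3}^3 - 5\mu_{2/3}^6 + 2\mu_{2/3}^2\mu_{4/3}^2 + 2\mu_{2/3}^4\mu_{4/3}}{\mu_{2/3}^6}\right),$$
$$E\big(\nu_{2t}^2\big) = \frac{1}{M}\,E\left(\int_t^{t+1}\sigma^4(u)\,du\right)\left(\frac{\mu_{4/3}^3 + 3\mu_{2/3}^6 + 2\mu_{2/3}^2\mu_{4/3}^2 + 2\mu_{2/3}^4\mu_{4/3} - 6\mu_{2/3}^5\mu_{8/3}}{\mu_{2/3}^6}\right) + \frac{1}{M}\left(4E(IV(t))\int_{\mathbb{R}^n_0} h^2(x)\,G(dx) + 2\int_{\mathbb{R}^n_0} h^2(x)k(x)\,G(dx)\right),$$
$$E\big(\nu_{1t}\nu_{2t}\big) = \frac{1}{M}\,E\left(\int_t^{t+1}\sigma^4(u)\,du\right)\left(\frac{3\mu_{2/3}^5\mu_{8/3} + 2\mu_{2/3}^6 - \mu_{4/3}^3 - 2\mu_{2/3}^2\mu_{4/3}^2 - 2\mu_{2/3}^4\mu_{4/3}}{\mu_{2/3}^6}\right).$$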
For the Jump-Diffusive Volatility SV Model, given in equations (29)-(32), we can decompose the demeaned
IV into two independent parts
$$IV(t) - \theta = \widetilde{IV}^c(t) + \widetilde{IV}^j(t),$$
$$\widetilde{IV}^c(t) = \int_t^{t+1} V_1^c(s)\,ds - E\big(V_1^c(s)\big) \quad\text{and}\quad \widetilde{IV}^j(t) = \int_t^{t+1} V^j(s)\,ds - E\big(V^j(s)\big),$$
with $V_1^c(t)$ and $V^j(t)$ specified in equations (30) and (31) respectively. Both $\widetilde{IV}^c(t)$ and $\widetilde{IV}^j(t)$ have
autocorrelations of an ARMA(1,1) process and are independent of each other. Also, $\int_t^{t+1}\!\int_{\mathbb{R}^n_0} h^2(x)\,\tilde\mu(ds,dx)$
is an i.i.d. sequence connected with $\widetilde{IV}^j(t)$. Therefore, $\big(\widetilde{IV}^c(t),\ \widetilde{IV}^j(t),\ \int_t^{t+1}\!\int_{\mathbb{R}^n_0} h^2(x)\,\tilde\mu(ds,dx)\big)$ have the
same autocorrelation structure as the following process $(y_t^c, y_t^j, y_t^h)$
$$\begin{aligned} y_t^c &= \phi_c\,y_{t-1}^c + e_t^c + \theta_c\,e_{t-1}^c,\\ y_t^j &= \phi_j\,y_{t-1}^j + \phi_h\,y_{t-1}^h + e_t^j + \theta_j\,e_{t-1}^j,\\ y_t^h &= e_t^h, \end{aligned} \qquad (A.1)$$
where $(e_t) = (e_t^c, e_t^j, e_t^h)$ is a discrete time white noise process, i.e. $E(e_t e_s') = 0$ for $t \ne s$.
Next, I derive the parameters of the above multivariate ARMA process as functions of the parameters of the underlying stochastic volatility model (29)-(32). First, it is easy to see that $\phi_c = e^{-\kappa_1}$ and $\phi_j = e^{\rho_1}$. To determine the moving average coefficient $\theta_c$ and the variance of the error term in the first equation, $\mathrm{Var}(e_t^c)$, I solve the following system of equations
$$\mathrm{Var}(y_t^c) - \phi_c\,\mathrm{Cov}(y_t^c, y_{t-1}^c) = \mathrm{Var}(e_t^c)\left(1 + \theta_c\phi_c + \theta_c^2\right),$$
$$\mathrm{Cov}(y_t^c, y_{t-1}^c) - \phi_c\,\mathrm{Var}(y_t^c) = \mathrm{Var}(e_t^c)\,\theta_c.$$
First, it is easy to derive
$$\mathrm{Var}(y_t^c) = \mathrm{Var}(IV^c(t)) = 2\sigma_1^2\,\frac{e^{-\kappa_1} + \kappa_1 - 1}{\kappa_1^2},$$
$$\mathrm{Cov}(y_t^c, y_{t-1}^c) = \mathrm{Cov}(IV^c(t), IV^c(t-1)) = \sigma_1^2\,\frac{(1 - e^{-\kappa_1})^2}{\kappa_1^2}.$$
If we set $\rho_c = \frac{\mathrm{Cov}(IV^c(t), IV^c(t-1))}{\mathrm{Var}(IV^c(t))}$, then it is easy to verify that
$$\rho_c > \phi_c \quad\text{and}\quad \frac{1 + \phi_c^2 - 2\phi_c\rho_c}{\rho_c - \phi_c} \ge 2.$$
Therefore, the invertible solution for the moving average coefficient (i.e. $|\theta_c| < 1$) is given by
$$\theta_c = \frac{1 + \phi_c^2 - 2\phi_c\rho_c - \sqrt{(1 + \phi_c^2 - 2\phi_c\rho_c)^2 - 4(\rho_c - \phi_c)^2}}{2(\rho_c - \phi_c)},$$
and from here the expression for $\mathrm{Var}(e_t^c)$ follows.
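The mapping from $(\kappa_1, \sigma_1)$ to the ARMA(1,1) parameters of the continuous component can be sketched as follows. The function name is hypothetical, and the inputs in the usage line are the point estimates of $\kappa_1$ and $\sigma_1$ reported in Table 3, used here only as an illustration.

```python
import math


def arma11_from_diffusive_factor(kappa1, sigma1):
    """Map (kappa1, sigma1) to (phi_c, theta_c, Var(e^c)) via the two moment equations."""
    phi = math.exp(-kappa1)
    var_iv = 2.0 * sigma1 ** 2 * (math.exp(-kappa1) + kappa1 - 1.0) / kappa1 ** 2
    cov_iv = sigma1 ** 2 * (1.0 - math.exp(-kappa1)) ** 2 / kappa1 ** 2
    rho = cov_iv / var_iv
    b = 1.0 + phi ** 2 - 2.0 * phi * rho
    d = rho - phi
    # Invertible root of theta^2 - (b / d) * theta + 1 = 0.
    theta = (b - math.sqrt(b * b - 4.0 * d * d)) / (2.0 * d)
    # Second moment equation: Cov - phi * Var = Var(e) * theta.
    var_e = (cov_iv - phi * var_iv) / theta
    return phi, theta, var_e, var_iv, cov_iv


phi_c, theta_c, var_ec, var_iv, cov_iv = arma11_from_diffusive_factor(0.0390, 0.8126)
print(phi_c, theta_c, var_ec)
```

For a slowly mean-reverting factor the moving average coefficient comes out close to $2 - \sqrt{3} \approx 0.268$, the value associated with time aggregation of a highly persistent process.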
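Because of the independence of $IV^c$ from the jumps it is easy to see that we must have $E(e_t^c e_t^j) = E(e_t^c e_t^h) = 0$. For the correlation between the error terms in the last two equations of the system (A.1) we have
$$\begin{aligned}
E(e_t^j e_t^h) &= \mathrm{Cov}\left(\widetilde{IV}^j(t), \int_t^{t+1}\int_{\mathbb{R}^n_0} h^2(x)\tilde\mu(ds,dx)\right) \\
&= \mathrm{Cov}\left(\int_{-\infty}^{t+1}\int_{\mathbb{R}^n_0} H_1(t,s)k(x)\tilde\mu(ds,dx), \int_t^{t+1}\int_{\mathbb{R}^n_0} h^2(x)\tilde\mu(ds,dx)\right) \\
&= \int_t^{t+1} H_1(t,s)\,ds \int_{\mathbb{R}^n_0} h^2(x)k(x)G(dx) \\
&= \frac{e^{\rho_1} - 1 - \rho_1}{\rho_1^2}\int_{\mathbb{R}^n_0} h^2(x)k(x)G(dx),
\end{aligned} \tag{A.2}$$
and for $\mathrm{Var}(e_t^h)$
$$\mathrm{Var}(e_t^h) = \mathrm{Var}\left(\int_t^{t+1}\int_{\mathbb{R}^n_0} h^2(x)\tilde\mu(ds,dx)\right) = \int_{\mathbb{R}^n_0} h^4(x)G(dx).$$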
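Finally, we need to determine $\phi_h$, $\theta_j$ and $\mathrm{Var}(e_t^j)$. I solve the following system of equations
$$\mathrm{Var}(y_t^j) - \phi_j\,\mathrm{Cov}(y_t^j, y_{t-1}^j) = \phi_h\,\mathrm{Cov}(y_t^j, y_{t-1}^h) + \mathrm{Var}(e_t^j)\left(1 + \theta_j\phi_j + \theta_j^2\right) + \phi_h\theta_j\,\mathrm{Cov}(e_t^h, e_t^j),$$
$$\mathrm{Cov}(y_t^j, y_{t-1}^j) - \phi_j\,\mathrm{Var}(y_t^j) = \phi_h\,\mathrm{Cov}(e_t^h, e_t^j) + \theta_j\,\mathrm{Var}(e_t^j),$$
$$\mathrm{Cov}(y_t^j, y_{t-1}^h) - \phi_j\,\mathrm{Cov}(e_t^h, e_t^j) = \phi_h\,\mathrm{Var}(e_t^h) + \theta_j\,\mathrm{Cov}(e_t^h, e_t^j),$$
where I use the following expressions for $\mathrm{Var}(y_t^j)$, $\mathrm{Cov}(y_t^j, y_{t-1}^j)$ and $\mathrm{Cov}(y_t^j, y_{t-1}^h)$, which are easy to derive
$$\mathrm{Var}(y_t^j) = \mathrm{Var}\left(\widetilde{IV}^j(t)\right) = \frac{1 - e^{\rho_1} + \rho_1}{\rho_1^3}\int_{\mathbb{R}^n_0} k^2(x)G(dx),$$
$$\mathrm{Cov}(y_t^j, y_{t-1}^j) = \mathrm{Cov}\left(\widetilde{IV}^j(t), \widetilde{IV}^j(t-1)\right) = -\frac{(1 - e^{\rho_1})^2}{2\rho_1^3}\int_{\mathbb{R}^n_0} k^2(x)G(dx),$$
$$\mathrm{Cov}(y_t^j, y_{t-1}^h) = \mathrm{Cov}\left(\widetilde{IV}^j(t), \int_{t-1}^{t}\int_{\mathbb{R}^n_0} h^2(x)\tilde\mu(ds,dx)\right) = \left(\frac{1 - e^{\rho_1}}{\rho_1}\right)^2\int_{\mathbb{R}^n_0} h^2(x)k(x)G(dx).$$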
Since $(\nu_{1t})$ is a martingale difference sequence, it is clear that the linear projection of IV on the past values of TV and JV coincides with the linear projection of TV on the past values of TV and JV. To calculate the last quantity I use the following state-space representation for TV
$$TV_\delta(t) = \theta + e_1'H'\xi_t + e_1'\nu_t, \tag{A.3}$$
$$\xi_{t+1} = F\xi_t + v_{t+1}, \tag{A.4}$$
where
$$H = \begin{pmatrix}
1 & 0 \\
\theta_c & 0 \\
1 & 0 \\
\theta_j & 0 \\
0 & 1 \\
\frac{\phi_h\theta_j}{\phi_j + \theta_j} & 0
\end{pmatrix}, \quad
F = \begin{pmatrix}
\phi_c & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \phi_j & 0 & \frac{\phi_h\phi_j}{\phi_j + \theta_j} & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}, \quad
v_t = \begin{pmatrix} e_t^c \\ 0 \\ e_t^j \\ 0 \\ e_t^h \\ 0 \end{pmatrix}, \quad
\nu_t = \begin{pmatrix} \nu_{1t} \\ \nu_{2t} \end{pmatrix}, \quad
e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$
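Since $IV(t) = IV^c(t) + IV^j(t)$ and $\left(\widetilde{IV}^c(t), \widetilde{IV}^j(t), \int_t^{t+1}\int_{\mathbb{R}^n_0} h^2(x)\tilde\mu(ds,dx)\right)$ have the same autocorrelation structure as the process $(y_t^c, y_t^j, y_t^h)$, specified in (A.1), to show that this is a valid state-space representation for TV it suffices to establish that $(y_{1t}, y_{2t}, y_{3t})$ has the same autocorrelation structure as $(y_t^c, y_t^j, y_t^h)$, where
$$y_{1t} := \xi_{1t} + \theta_c\xi_{2t}, \qquad y_{2t} := \xi_{3t} + \theta_j\xi_{4t} + \frac{\theta_j\phi_h}{\phi_j + \theta_j}\xi_{6t}, \qquad y_{3t} := \xi_{5t}.$$
That $y_{1t}$ follows a univariate ARMA (independent from $y_{2t}$ and $y_{3t}$) is easy to show. Next, from the equation for the evolution of the state vector we can write
$$\xi_{3t} = \phi_j\xi_{3t-1} + \frac{\phi_h\phi_j}{\phi_j + \theta_j}\xi_{5t-1} + e_t^j, \qquad \xi_{4t} = \xi_{3t-1}, \qquad \xi_{5t} = e_t^h, \qquad \xi_{6t} = \xi_{5t-1}.$$
Using this we can write
$$\begin{aligned}
y_{2t} &= (1 + \theta_j L)\,\xi_{3t} + \frac{\theta_j\phi_h}{\phi_j + \theta_j}Le_t^h \\
&= (1 + \theta_j L)\left(\frac{\phi_h\phi_j}{\phi_j + \theta_j}\frac{Le_t^h}{1 - \phi_j L} + \frac{e_t^j}{1 - \phi_j L}\right) + \frac{\theta_j\phi_h}{\phi_j + \theta_j}Le_t^h \\
&= \frac{\phi_h L}{1 - \phi_j L}e_t^h + \frac{1 + \theta_j L}{1 - \phi_j L}e_t^j,
\end{aligned}$$
where $L$ is the lag operator. Therefore
$$(1 - \phi_j L)\,y_{2t} = (1 + \theta_j L)\,e_t^j + \phi_h L e_t^h,$$
which verifies the claim.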
From here generating the linear projection P (IVa (t)|Gt ), given the parameter estimates in Table 3,
follows easily (see e.g. Hamilton (1994), Chapter 13).
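The lag-operator argument for $y_{2t}$ can also be checked by simulation. The sketch below runs the state recursion for $\xi_3, \ldots, \xi_6$ on simulated shocks and confirms that the implied $y_{2t}$ satisfies the ARMA recursion path by path. The parameter values and function names are hypothetical and chosen only for illustration, and the shocks are drawn as independent standard normals, which is not required by the identity.

```python
import random


def simulate_y2(phi_j, theta_j, phi_h, n, seed=0):
    """Run rows 3-6 of the state recursion and return (e_j, e_h, y2)."""
    rng = random.Random(seed)
    c = phi_h * phi_j / (phi_j + theta_j)   # loading of xi5 in the xi3 equation
    d = theta_j * phi_h / (phi_j + theta_j)  # loading of xi6 in y2
    xi3 = xi4 = xi5 = xi6 = 0.0
    e_j, e_h, y2 = [], [], []
    for _ in range(n):
        ej, eh = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Simultaneous update: the right-hand side uses last period's states.
        xi3, xi4, xi5, xi6 = phi_j * xi3 + c * xi5 + ej, xi3, eh, xi5
        e_j.append(ej)
        e_h.append(eh)
        y2.append(xi3 + theta_j * xi4 + d * xi6)
    return e_j, e_h, y2


def max_arma_residual(phi_j, theta_j, phi_h, e_j, e_h, y2):
    """Largest gap in (1 - phi_j L) y2 = (1 + theta_j L) e_j + phi_h L e_h."""
    worst = 0.0
    for t in range(1, len(y2)):
        lhs = y2[t] - phi_j * y2[t - 1]
        rhs = e_j[t] + theta_j * e_j[t - 1] + phi_h * e_h[t - 1]
        worst = max(worst, abs(lhs - rhs))
    return worst


params = (0.2, 0.3, 0.5)  # hypothetical (phi_j, theta_j, phi_h)
shocks_j, shocks_h, y2_path = simulate_y2(*params, n=500)
print(max_arma_residual(*params, shocks_j, shocks_h, y2_path))
```

The residual is zero up to floating-point rounding, since the two loadings on the lagged $e^h$ shocks sum to $\phi_h$ at the first lag and cancel at the second.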
B Equivalent Martingale Measures
I start with a general theorem, which gives the prices of the different risks in the stochastic volatility model
(51)-(54) and sufficient conditions under which they correspond to EMMs. The theorem is general in the
sense that no assumption is made about the information entering in the filtration (Ft )t∈R+ .
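Theorem 3 Consider the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with filtration $\mathbb{F} = (\mathcal{F}_t)_{t\in\mathbb{R}_+}$ on which the stochastic volatility model (51)-(54) is defined. Fix a terminal date $T$. Define a probability measure $\mathbb{Q}$ on $(\Omega, \mathcal{F}, \mathbb{F}_{0\le t\le T})$ which has a density with respect to the restriction of $\mathbb{P}$ to $(\Omega, \mathcal{F}, \mathbb{F}_{0\le t\le T})$ given by
$$\left.\frac{d\mathbb{Q}}{d\mathbb{P}}\right|_{\mathcal{F}_T} = Z(T) = \mathcal{E}\left(\int_0^T \psi_1(s)\,dB_1(s) + \int_0^T \psi_2(s)\,dB_2(s) + \int_0^T\int_{\mathbb{R}^n_0} (Y(\omega,s,x) - 1)\,\tilde\mu(ds,dx)\right) \tag{B.1}$$
where $\psi_1 = (\psi_1(t))$, $\psi_2 = (\psi_2(t))$ are predictable processes and $Y = Y(\omega,t,x)$ is a strictly positive and predictable function such that the following two conditions are satisfied
$$E^{\mathbb{P}}\left(\exp\left(\frac{1}{2}\int_0^T \psi_1^2(s)\,ds + \frac{1}{2}\int_0^T \psi_2^2(s)\,ds + \int_0^T ds\int_{\mathbb{R}^n_0} (Y\log(Y) - Y + 1)\,G(dx)\right)\right) < \infty, \tag{B.2}$$
$$\int_0^T\int_{\mathbb{R}^n_0} |(Y(\omega,t,x) - 1)\,h(x)|\,dt\,G(dx) < \infty, \quad \mathbb{P}\text{-a.s.} \tag{B.3}$$
Then the measure $\mathbb{Q}$ belongs to the set of equivalent martingale measures iff the following condition holds
$$\alpha(t) = -\frac{1}{2}\sigma^2(t) - \rho\sigma(t)\psi_1(t) - \sqrt{1-\rho^2}\,\sigma(t)\psi_2(t) - \int_{\mathbb{R}^n_0}\left(Y(\omega,t,x)(e^{h(x)} - 1) - h(x)\right)G(dx), \quad d\mathbb{P}\otimes dt\text{-a.s.} \tag{B.4}$$
If $\mathbb{Q}$ is an EMM, then under $\mathbb{Q}$ the logarithmic futures price $f(t)$ follows
$$df(t) = -\frac{1}{2}\sigma^2(t)\,dt + \int_{\mathbb{R}^n_0}\left(h(x) + 1 - e^{h(x)}\right)\nu^{\mathbb{Q}}(\omega,dt,dx) + \sigma(t-)\,dW^{\mathbb{Q}}(t) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu^{\mathbb{Q}}(dt,dx), \tag{B.5}$$
where
$$B_1^{\mathbb{Q}}(t) := B_1(t) - \int_0^t \psi_1(s)\,ds, \tag{B.6}$$
$$B_2^{\mathbb{Q}}(t) := B_2(t) - \int_0^t \psi_2(s)\,ds, \tag{B.7}$$
$$\nu^{\mathbb{Q}}(\omega,dt,dx) := Y(\omega,t,x)\,\nu(\omega,dt,dx) = Y(\omega,t,x)\,dt\,G(dx), \tag{B.8}$$
$$\tilde\mu^{\mathbb{Q}}(\omega,dt,dx) := \mu(\omega,dt,dx) - \nu^{\mathbb{Q}}(\omega,dt,dx). \tag{B.9}$$
Proof. From condition (B.2) it follows that
$$E^{\mathbb{P}}\left(\int_0^T \psi_1^2(s)\,ds\right) < \infty \quad\text{and}\quad E^{\mathbb{P}}\left(\int_0^T \psi_2^2(s)\,ds\right) < \infty,$$
therefore $\int_0^t \psi_1(s)\,dB_1(s)$ and $\int_0^t \psi_2(s)\,dB_2(s)$ are $\mathbb{P}$-local martingales for $t \le T$.
Further, using again condition (B.2) we have
$$E^{\mathbb{P}}\left(\int_0^T\int_{\mathbb{R}^n_0}\left(Y(\omega,t,x)\log(Y(\omega,t,x)) - Y(\omega,t,x) + 1\right)G(dx)\,dt\right) < \infty.$$
To continue further I make use of the following inequality
$$y\log(y) - y + 1 \ge \frac{1}{3}\left[|y-1|^2 \wedge |y-1|\right], \quad\text{for } y \ge 0,$$
and therefore
$$E^{\mathbb{P}}\left(\int_0^T\int_{\mathbb{R}^n_0}\left(|Y(\omega,t,x) - 1|^2 \wedge |Y(\omega,t,x) - 1|\right)G(dx)\,dt\right) < \infty.$$
As a consequence (see Jacod and Shiryaev (2003), Theorem II.1.33), $\int_0^t\int_{\mathbb{R}^n_0}(Y(\omega,s,x) - 1)\,\tilde\mu(ds,dx)$ for $t \le T$ is a $\mathbb{P}$-local martingale.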
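The elementary inequality used in this step, $y\log(y) - y + 1 \ge \frac{1}{3}\left[|y-1|^2 \wedge |y-1|\right]$ for $y \ge 0$, can be sanity-checked on a grid. This is only a numerical illustration on the hypothetical range $[0, 20]$, not a proof, and it uses the convention $0\log 0 = 0$.

```python
import math


def entropy_term(y):
    """y*log(y) - y + 1 with the convention 0*log(0) = 0."""
    if y == 0.0:
        return 1.0
    return y * math.log(y) - y + 1.0


def lower_bound(y):
    """(1/3) * min(|y - 1|^2, |y - 1|)."""
    d = abs(y - 1.0)
    return min(d * d, d) / 3.0


grid = [i / 1000.0 for i in range(0, 20001)]
smallest_gap = min(entropy_term(y) - lower_bound(y) for y in grid)
print(smallest_gap)
```

The gap is smallest at $y = 1$, where both sides vanish, which is why the bound controls the quadratic behavior of small deviations of $Y$ from one and the linear behavior of large ones.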
Combining everything we have that
$$N(t) = \int_0^t \psi_1(s)\,dB_1(s) + \int_0^t \psi_2(s)\,dB_2(s) + \int_0^t\int_{\mathbb{R}^n_0}(Y(\omega,s,x) - 1)\,\tilde\mu(ds,dx), \quad\text{for } t \le T$$
is a $\mathbb{P}$-local martingale starting from 0. Then, using the property of the Doléans-Dade exponential (see Jacod and Shiryaev (2003), Theorem I.4.61), we have that $Z = \mathcal{E}(N)$ is a $\mathbb{P}$-local martingale. Moreover
$$\Delta N(t) = Y(\omega,t,x) - 1 > -1.$$
Then, condition (B.2) is sufficient for $Z$ to be a uniformly integrable martingale for $t \le T$ (see Jacod (1979), Theorem 8.45) and we have $E^{\mathbb{P}}(Z(T)) = 1$. Therefore, the measure $\mathbb{Q}$, defined by $\mathbb{Q}(d\omega) = Z(\omega)\mathbb{P}(d\omega)$, is a probability measure and moreover we have $\mathbb{Q} \sim \mathbb{P}$.
Further, for the density process $Z$ in (B.1) application of Girsanov's theorem for local martingales (see Jacod and Shiryaev (2003), Theorem III.3.8) yields that the processes $B_1^{\mathbb{Q}}$ and $B_2^{\mathbb{Q}}$, given in equations (B.6) and (B.7), are independent standard Brownian motions under the measure $\mathbb{Q}$. Also, application of Girsanov's theorem for random measures (see Jacod and Shiryaev (2003), Theorem III.3.17) yields that $\nu^{\mathbb{Q}}(\omega,dt,dx)$ in equation (B.8) is the compensator for the random measure $\mu$ under the probability measure $\mathbb{Q}$. Finally, the condition in (B.3) guarantees that the quadratic covariation process $\left[\int_0^t\int_{\mathbb{R}^n_0} h(x)\tilde\mu(ds,dx), Z(t)\right]$ has locally integrable variation under the measure $\mathbb{P}$ and therefore the following is a well defined local martingale under the probability measure $\mathbb{Q}$
$$\int_0^t\int_{\mathbb{R}^n_0} h(x)\,\tilde\mu^{\mathbb{Q}}(ds,dx) = \int_0^t\int_{\mathbb{R}^n_0} h(x)\,\tilde\mu(ds,dx) - \int_0^t\int_{\mathbb{R}^n_0} h(x)(Y - 1)\,\nu(ds,dx), \quad\text{for } t \le T.$$
Based on that, $f(t)$ satisfies the following SDE under the measure $\mathbb{Q}$
$$\begin{aligned}
df(t) ={}& \left(\alpha(t) + \rho\psi_1(t)\sigma(t) + \sqrt{1-\rho^2}\,\psi_2(t)\sigma(t)\right)dt + \int_{\mathbb{R}^n_0}(Y(\omega,t,x) - 1)\,h(x)\,\nu(dt,dx) \\
&+ \sigma(t)\left(\rho\,dB_1^{\mathbb{Q}}(t) + \sqrt{1-\rho^2}\,dB_2^{\mathbb{Q}}(t)\right) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu^{\mathbb{Q}}(dt,dx).
\end{aligned} \tag{B.10}$$
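Using Ito's lemma we have for $F(t)$ under the measure $\mathbb{Q}$
$$\begin{aligned}
\frac{dF(t)}{F(t-)} ={}& \left(\alpha(t) + \rho\psi_1(t)\sigma(t) + \sqrt{1-\rho^2}\,\psi_2(t)\sigma(t) + \frac{1}{2}\sigma^2(t)\right)dt + \int_{\mathbb{R}^n_0}\left(Y(\omega,t,x)(e^{h(x)} - 1) - h(x)\right)\nu(dt,dx) \\
&+ \sigma(t)\left(\rho\,dB_1^{\mathbb{Q}}(t) + \sqrt{1-\rho^2}\,dB_2^{\mathbb{Q}}(t)\right) + \int_{\mathbb{R}^n_0}\left(e^{h(x)} - h(x) - 1\right)\tilde\mu^{\mathbb{Q}}(dt,dx).
\end{aligned} \tag{B.11}$$
Since $F(t)$ must be a local martingale under the measure $\mathbb{Q}$ we need to set the drift term in the SDE above to zero. From here we get the result in (B.4) and hence also (B.5). □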
In Theorem 3 I did not specify exactly what information is contained in the filtration $(\mathcal{F}_t)_{t\in\mathbb{R}_+}$. If we assume that the filtration is generated only by the Brownian motion $(B_1, B_2)$ and the Poisson random measure $\mu$,⁴⁶ then a representation theorem holds (see Jacod and Shiryaev (2003), chapter III). This means that in this case all equivalent martingale measures are of the form specified in the above Theorem (this will be formally derived in Theorem 4 below). This is a very strong result. On the other hand, if the filtration $(\mathcal{F}_t)_{t\in\mathbb{R}_+}$ contains additional information (besides that contained in $(B_1, B_2)$ and $\mu$), then in general we cannot make such a conclusion. In this case, the above theorem gives those changes of measure which price only the risks associated with $(B_1, B_2)$ and $\mu$. Thus, any other risk orthogonal to $(B_1, B_2)$ and $\mu$ will not be priced. This, however, does not restrict the analysis here since we are interested only in the risks coming from $(B_1, B_2)$ and $\mu$.
⁴⁶ Note that (53) has a strong solution.
The condition in equation (B.2) is a sufficient condition guaranteeing that the density process $Z$ is a uniformly integrable martingale, but in practice it is very hard to verify. In the case of no jumps it reduces to the familiar Novikov condition (for other conditions for uniform integrability of exponential local martingales see Kallsen and Shiryaev (2002) and references therein). Given the generality of the setting in Theorem 3, and mainly the fact that we do not know what kind of information enters the filtration $(\mathcal{F}_t)_{t\in\mathbb{R}_+}$, this condition is hard to relax further.
The condition in equation (B.3) can be removed. However, in this case the dynamics of $f(t)$ (under $\mathbb{Q}$) needs to be changed slightly. Condition (B.3) guarantees that $h(x)$ can be integrated with respect to the compensated jump measure under $\mathbb{Q}$ without the need to truncate the big jumps. In that regard this assumption is not restrictive and it is easy to verify.
In the next Theorem I make a stronger assumption on the filtration $(\mathcal{F}_t)_{t\in\mathbb{R}_+}$, which makes it possible to give much weaker conditions for the equivalence of two measures (as compared with condition (B.2)) that are easier to work with.
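Theorem 4 Consider the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with filtration $\mathbb{F} = (\mathcal{F}_t)_{t\in\mathbb{R}_+}$. Suppose that the filtration $\mathbb{F}$ is generated by a $d$-dimensional standard Brownian motion $W$ and an $n$-dimensional homogeneous Poisson measure $\mu$ with compensator $\nu^{\mathbb{P}}$ (under the measure $\mathbb{P}$). Let $\psi$ be a $d \times 1$ predictable process and $Y(\omega,t,x)$ a strictly positive and predictable function. Denote with $\mathbb{Q}$ a probability measure under which $W(t) - \int_0^t \psi(s)\,ds$ is a standard Brownian motion and the random measure $\mu$ has compensator $\nu^{\mathbb{Q}}(\omega,dt,dx) = Y(\omega,t,x)\,\nu^{\mathbb{P}}(\omega,dt,dx)$ (assuming that such a measure exists!). Assume that $\mathbb{P}_0 \sim \mathbb{Q}_0$ and in addition the following conditions are satisfied
$$\int_0^t \psi(s)\psi'(s)\,ds < \infty \quad d\mathbb{P}\otimes dt\text{-a.s.}, \tag{B.12}$$
$$\int_0^t\int_{\mathbb{R}^n_0}\left(\sqrt{Y} - 1\right)^2\nu(ds,dx) < \infty \quad d\mathbb{P}\otimes dt\text{-a.s.}, \tag{B.13}$$
$$\int_0^t \psi(s)\psi'(s)\,ds < \infty \quad d\mathbb{Q}\otimes dt\text{-a.s.}, \tag{B.14}$$
$$\int_0^t\int_{\mathbb{R}^n_0}\left(\sqrt{Y} - 1\right)^2\nu(ds,dx) < \infty \quad d\mathbb{Q}\otimes dt\text{-a.s.} \tag{B.15}$$
Then we have $\mathbb{P} \overset{loc}{\sim} \mathbb{Q}$, that is for every $t > 0$ we have $\mathbb{P}_t \sim \mathbb{Q}_t$.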
Proof. Under the measure $\mathbb{P}$ a representation theorem holds and therefore this measure is unique (note we are in the canonical setting because of the assumption for the filtration). Furthermore, since the characteristics of $(W, \mu)$ are constant under $\mathbb{P}$ we have even local uniqueness⁴⁷ for this probability measure (this follows from Theorem III.2.4 in Jacod and Shiryaev (2003)). Define $T = \inf(t : H(t) = \infty)$, where
$$H(t) = \int_0^t \psi(s)\psi'(s)\,ds + \int_0^t\int_{\mathbb{R}^n_0}\frac{\left(\sqrt{Y} - 1\right)^2}{Y}\,\nu^{\mathbb{Q}}(ds,dx).$$
⁴⁷ Local uniqueness of a probability measure on a filtered probability space is a stronger property than uniqueness. It requires that the measure is unique for the martingale problem associated with the stopped canonical process for every "strict" stopping time; see Jacod and Shiryaev (2003).
Consider the process
$$Z'(t) = \begin{cases}
\mathcal{E}\left(-\int_0^t \psi(s)\,dW^{\mathbb{Q}}(s) + \int_0^t\int_{\mathbb{R}^n_0}\left(\frac{1}{Y} - 1\right)\tilde\mu^{\mathbb{Q}}(ds,dx)\right) & \text{for } t < T \\
0 & \text{for } t \ge T.
\end{cases}$$
Taking into account the relationship $\nu^{\mathbb{Q}}(dt,dx) = Y(\omega,t,x)\,\nu(dt,dx)$ and using the conditions (B.12) and (B.13) we have
$$H(t) < \infty \quad d\mathbb{P}\otimes dt\text{-a.s.}$$
Combining everything we can apply Theorem III.5.34 in Jacod and Shiryaev (2003) (adapted to the case when the filtration is generated by a $d$-dimensional Brownian motion and an $n$-dimensional homogeneous Poisson measure) to conclude $\mathbb{P} \overset{loc}{\ll} \mathbb{Q}$ with density process $Z'$. Therefore, to prove the (local) equivalence of the two measures we need only to show that $\mathbb{Q}(Z'(t) = 0) = 0$. This is an easy consequence of the conditions (B.14) and (B.15). As a result we have $\mathbb{Q} \overset{loc}{\ll} \mathbb{P}$. Using Theorem III.5.19 (adapted to the case when the filtration is generated by a $d$-dimensional Brownian motion and an $n$-dimensional homogeneous Poisson measure) in Jacod and Shiryaev (2003) the density process of $\mathbb{Q}$ with respect to $\mathbb{P}$ is
$$Z(t) = \begin{cases}
\mathcal{E}\left(\int_0^t \psi(s)\,dW^{\mathbb{Q}}(s) + \int_0^t\int_{\mathbb{R}^n_0}(Y - 1)\,\tilde\mu(ds,dx)\right) & \text{for } t < T \\
0 & \text{for } t \ge T.
\end{cases}$$
□
Theorem 4 above is very general. In the setting of the stochastic volatility model we price only the
Brownian motions and the jumps entering the variance and the price of the futures. There might be other
sources of risk (modelled with Brownian motions and Poisson measure), which will not be priced. The
conditions (B.12)-(B.15) replace the condition (B.2) in Theorem 3 (for the particular specification of the
filtration of course) which, as argued above, is hard to verify.
B.1 Proof for the Diffusive Risk Price ψ1(t)
The specification D1 is a particular case of the specification D2. Therefore I show only that the price of
risk D2 specifies an equivalent change of measure (i.e. that conditions (B.12) and (B.14) are satisfied).
The process $V^c$ is a square-root process under both measures. The dynamics of $V^c(t)$ under the measure $\mathbb{Q}$ is given by
$$dV^c(t) = \left(\lambda_0 + \kappa_1\theta_1 + (\lambda_1 - \kappa_1)V^c(t)\right)dt + \sigma_{1v}\sqrt{V^c(t)}\,dB_1^{\mathbb{Q}}(t).$$
If $\kappa_1\theta_1 \ge 0$ and $\lambda_0 \ge 0$ the square-root process (under both measures) satisfies the Yamada-Watanabe condition and therefore has a unique non-explosive solution under both measures. This implies that for the equivalence of the measures $\mathbb{P}$ and $\mathbb{Q}$ we need only verify that the following holds
$$V^c(t) > 0 \quad d\mathbb{P}\otimes dt\text{-a.s. and } d\mathbb{Q}\otimes dt\text{-a.s.}$$
To check this condition we need to analyze the behavior of $V^c$ at the boundary 0. The necessary and sufficient conditions for non-attainment of the boundary under the measures $\mathbb{P}$ and $\mathbb{Q}$ respectively, starting from a strictly positive value, are (see Ikeda and Watanabe (1981) for example; these conditions guarantee that the boundary is entrance under both measures, i.e. starting from a positive value it is never reached in finite time and if the process starts from zero it always goes out)
$$\sigma_{1v}^2 \le 2\kappa_1\theta_1 \quad\text{and}\quad \sigma_{1v}^2 \le 2\kappa_1\theta_1 + 2\lambda_0.$$
Therefore, for those values of the parameters the conditions (B.12) and (B.14) are satisfied and hence we have equivalence of the two measures.
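The two boundary conditions are simple parameter restrictions and can be checked mechanically. The helper below is a minimal sketch with a hypothetical function name and hypothetical parameter values; it also enforces the sign restrictions $\kappa_1\theta_1 \ge 0$ and $\lambda_0 \ge 0$ used in the argument.

```python
def boundary_not_attained(kappa1, theta1, sigma1v, lambda0):
    """Return (ok_under_P, ok_under_Q) for non-attainment of zero by V^c."""
    if kappa1 * theta1 < 0.0 or lambda0 < 0.0:
        raise ValueError("requires kappa1 * theta1 >= 0 and lambda0 >= 0")
    ok_p = sigma1v ** 2 <= 2.0 * kappa1 * theta1
    ok_q = sigma1v ** 2 <= 2.0 * kappa1 * theta1 + 2.0 * lambda0
    return ok_p, ok_q


# Hypothetical parameter values, for illustration only.
print(boundary_not_attained(0.5, 1.0, 0.9, 0.0))
print(boundary_not_attained(0.5, 1.0, 1.2, 0.3))
```

Because $\lambda_0 \ge 0$, the condition under $\mathbb{P}$ implies the condition under $\mathbb{Q}$, so in this specification the binding restriction is the one under the physical measure.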
B.2 Proof for the Jump Price of Risk Y(ω, t, x)
For the change of measure in J1 conditions (B.13) and (B.15) coincide and are automatically satisfied.
The specification J2 is a particular case of J3. Therefore I prove only that the price of risk J3 specifies
an equivalent martingale measure.
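We have
$$\begin{aligned}
\int_0^t\int_{\mathbb{R}^n_0}\left(\sqrt{Y} - 1\right)^2\nu(ds,dx) &= \int_0^t\int_{\mathbb{R}^n_0}\left(\sqrt{Y} - 1\right)^2 ds\,G(dx) \\
&\le \int_0^t\int_{\mathbb{R}^n_0}(Y - 1)^2\,ds\,G(dx) \\
&\le 2t\int_{\mathbb{R}^n_0}(\vartheta_0(x) - 1)^2\,G(dx) + 2\int_0^t \tau(s)\,ds\int_{\mathbb{R}^n_0}\vartheta_1^2(x)\,G(dx),
\end{aligned}$$
therefore it is sufficient for conditions (B.13) and (B.15) to hold that the following is true
$$E^{\mathbb{P}}\left(\int_0^t \tau(s)\,ds\right) < \infty \quad\text{and}\quad E^{\mathbb{Q}}\left(\int_0^t \tau(s)\,ds\right) < \infty \quad\text{for } \forall t > 0.$$
The condition is trivially satisfied under the measure $\mathbb{P}$. I will show that it holds under the measure $\mathbb{Q}$ as well.
First note that
$$\begin{aligned}
\int_{\mathbb{R}^n_0}\zeta(x)\vartheta_0(x)\,G(dx) &= \int_{\mathbb{R}^n_0}\zeta(x)(\vartheta_0(x) - 1)\,G(dx) + \int_{\mathbb{R}^n_0}\zeta(x)\,G(dx) \\
&\le \sqrt{\int_{\mathbb{R}^n_0}\zeta^2(x)\,G(dx)\int_{\mathbb{R}^n_0}(\vartheta_0(x) - 1)^2\,G(dx)} + \int_{\mathbb{R}^n_0}\zeta(x)\,G(dx) < \infty,
\end{aligned}$$
$$\int_{\mathbb{R}^n_0}\zeta(x)\vartheta_1(x)\,G(dx) \le \sqrt{\int_{\mathbb{R}^n_0}\zeta^2(x)\,G(dx)\int_{\mathbb{R}^n_0}\vartheta_1^2(x)\,G(dx)} < \infty.$$
Similar inequalities hold true when $\zeta(x)$ is replaced with $\zeta^2(x)$. Therefore, the claim follows from the following general result.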
Lemma 1 (a) There exists a probability measure on the canonical probability space such that the canonical process $V$ is a semimartingale with initial distribution $\mathcal{L}(V(0)) = \eta$ (with positive support) and satisfies the following equation
$$V(t) = e^{\rho t}V(0) + \int_0^t\int_{\mathbb{R}^n_0} e^{\rho(t-s)}k(x)\,\mu(ds,dx), \tag{B.16}$$
where $k : \mathbb{R}^n_0 \to \mathbb{R}_+$, $\mu$ is an integer-valued measure on $\mathbb{R}_+ \times \mathbb{R}^n_0$ with compensator $\nu(ds,dx) = ds\left(m(V(s-))G_1(dx) + G_2(dx)\right)$, where $m : \mathbb{R}_+ \to \mathbb{R}_+$ is a continuous function, $G_1 : \mathbb{R}^n_0 \to \mathbb{R}_+$, $G_2 : \mathbb{R}^n_0 \to \mathbb{R}_+$ and
$$m(x) \le C \vee x, \quad\text{where } C \text{ is some constant},$$
$$K_1 := \int_{\mathbb{R}^n_0} k(x)\,G_1(dx) < \infty, \qquad K_2 := \int_{\mathbb{R}^n_0} k(x)\,G_2(dx) < \infty,$$
$$K_1' := \int_{\mathbb{R}^n_0} k^2(x)\,G_1(dx) < \infty, \qquad K_2' := \int_{\mathbb{R}^n_0} k^2(x)\,G_2(dx) < \infty.$$
(b) In addition to the conditions in part (a) assume that $-\rho > K_1 \ge 0$ and $m(x) = x$. Then $V(t)$ is asymptotically covariance stationary.
Proof.
Part (a).
It is convenient to rewrite equation (B.16) in a differential form (note that the measure $\mu$ and the process $V$ are defined jointly)
$$dV(t) = \rho V(t)\,dt + \int_{\mathbb{R}^n_0} k(x)\,\mu(dt,dx).$$
The characteristics of the process $V$ with truncation function $h(x) = x$ (that is without truncation) are given by⁴⁸
$$B(t) = \int_0^t\left(K_1 m(V(s-)) + \rho V(s-) + K_2\right)ds,$$
$$\tilde C(t) = \int_0^t\left(K_1' m(V(s-)) + K_2'\right)ds,$$
$$\nu(ds, A) = ds\int_{\mathbb{R}^n_0} 1_{[k(x)\in A]}\left(m(V(s-))G_1(dx) + G_2(dx)\right), \quad A \in \mathcal{R}_0.$$
I define a sequence of semimartingales $(\tilde V_K)$ with initial distribution $\eta$ and the following characteristic triplet (the truncation function is again $h(x) = x$)
$$B_K(t) = \int_0^t\left(K_1\left(m(\tilde V_K(s-)) \wedge K\right) + \rho\left(\tilde V_K(s-) \wedge K\right) + K_2\right)ds,$$
$$\tilde C_K(t) = \int_0^t\left(K_1'\left(m(\tilde V_K(s-)) \wedge K\right) + K_2'\right)ds,$$
$$\nu_K(ds, A) = ds\int_{\mathbb{R}^n_0} 1_{[k(x)\in A]}\left(\left(m(\tilde V_K(s-)) \wedge K\right)G_1(dx) + G_2(dx)\right), \quad A \in \mathcal{R}_0.$$
I will show that such processes exist.
First, for each $K > 0$ the characteristics of the semimartingale $\tilde V_K$ are majorized, i.e.
$$\sup_{\omega,t}\left|K_1\left(m(\tilde V_K(t-)) \wedge K\right) + \rho\left(\tilde V_K(t-) \wedge K\right) + K_2\right| < \infty,$$
$$\sup_{\omega,t}\left|K_1'\left(m(\tilde V_K(t-)) \wedge K\right) + K_2'\right| < \infty.$$
Also, $K_1'\left(m(\tilde V_K(s)) \wedge K\right) + K_2'$ and $K_1\left(m(\tilde V_K(s)) \wedge K\right) + \rho\left(\tilde V_K(s) \wedge K\right) + K_2$ are continuous in $\tilde V_K(s)$. This holds true also for $\left(m(\tilde V_K(s)) \wedge K\right)\int_{\mathbb{R}^n_0} g(x)G_1(dx) + \int_{\mathbb{R}^n_0} g(x)G_2(dx)$ for all continuous and bounded functions $g(x)$ vanishing around zero. Finally, since $K_1 < \infty$ and $K_2 < \infty$, we trivially have
$$\lim_{a\uparrow\infty}\sup_{\omega}\left(\left(m(\tilde V_K(t-)) \wedge K\right)\int_{\mathbb{R}^n_0} 1_{|k(x)|>a}\,G_1(dx) + \int_{\mathbb{R}^n_0} 1_{|k(x)|>a}\,G_2(dx)\right) = 0, \quad\text{for } \forall t \ge 0.$$
⁴⁸ See Jacod and Shiryaev (2003) for a definition of the characteristics of a general semimartingale. $\tilde C(t)$ stands for the second modified characteristic.
Therefore, the conditions of Theorem IX.2.31 in Jacod and Shiryaev (2003) are satisfied. This implies that there exists a probability measure (denoted hereafter with $\mathbb{P}_K$) supporting $\tilde V_K$ (the canonical process) as a semimartingale with characteristics $(B_K, \tilde C_K, \nu_K)$ and initial distribution $\eta$. I will now show that the sequence of processes $(\tilde V_K)$ converges weakly (upon taking a subsequence if necessary) to a limiting process and will identify the limit with the process $V$. To establish weak convergence I prove that the sequence $(\tilde V_K)$ is tight. For this I use Theorem VI.4.18 in Jacod and Shiryaev (2003). It is sufficient to show that
1. For all $T > 0$ and $\epsilon > 0$
$$\lim_{a\uparrow\infty}\limsup_{K}\,\mathbb{P}_K\left[\nu_K\left([0,T]\times\{x : |k(x)| > a\}\right) > \epsilon\right] = 0. \tag{B.17}$$
2. The following sequence of processes is C-tight (i.e. the sequence of processes is tight and all its limit points are continuous processes)
$$F_K(t) = \int_0^t\left(K_1\left(m(\tilde V_K(s-)) \wedge K\right) + |\rho|\left(\tilde V_K(s-) \wedge K\right) + K_2\right)ds + \int_0^t\left(K_1'\left(m(\tilde V_K(s-)) \wedge K\right) + K_2'\right)ds.$$
First I establish that $(F_K)$ is C-tight. For this I make use of Theorem VI.3.26 in Jacod and Shiryaev (2003). Note that $F_K(t)$ is absolutely continuous. Therefore for the C-tightness of $F_K(t)$ it suffices to show that the process $\tilde V_K(t)$ satisfies the following boundedness in probability condition
$$\lim_{a\uparrow\infty}\sup_{K}\,\mathbb{P}_K\left[\sup_{0\le s\le t}\tilde V_K(s) > a\right] = 0, \quad\text{for } \forall t \ge 0.$$
We have
$$\tilde V_K(t) = \tilde V_K(0) + \rho\int_0^t\left(\tilde V_K(s-) \wedge K\right)ds + \int_0^t\int_{\mathbb{R}^n_0} k(x)\,\mu(ds,dx),$$
therefore
$$\tilde V_K(t) \le \tilde V_K(0) + |\rho|\int_0^t\left(\tilde V_K(s-) \wedge K\right)ds + \int_0^t\int_{\mathbb{R}^n_0} k(x)\,\mu(ds,dx),$$
and since $\int_0^t\int_{\mathbb{R}^n_0} k(x)\,\mu(ds,dx) \ge 0$ as $k(x) > 0$, and $\tilde V_K(0) \ge 0$ as $\eta$ has a positive support, using Gronwall's inequality (see Revuz and Yor (1994) for example) we have
$$\tilde V_K(s) \le \left(\tilde V_K(0) + \int_0^t\int_{\mathbb{R}^n_0} k(x)\,\mu(ds,dx)\right)\exp(|\rho|t), \quad\text{for } 0 \le s \le t.$$
Therefore the C-tightness of $(F_K)$ will be established if we can show that
$$E^K\left(\tilde V_K(t)\right) < C,$$
where $C$ is a constant that does not depend on $K$. We have
$$E^K\left(\tilde V_K(t)\right) = E\left(\tilde V_K(0)\right) + \rho\int_0^t E^K\left(\tilde V_K(s-) \wedge K\right)ds + K_1\int_0^t E^K\left(m(\tilde V_K(s-)) \wedge K\right)ds + tK_2,$$
and since $m(x) \le x \vee C$ for some constant $C$, for $K > C$ we have
$$E^K\left(\tilde V_K(t)\right) \le \tilde V_K(0) + \rho\int_0^t E^K\left(\tilde V_K(s-) \wedge K\right)ds + K_1\int_0^t E^K\left(\tilde V_K(s-) \wedge K\right)ds + t(K_2 + C).$$
If $K_1 + \rho \le 0$ we have
$$E^K\left(\tilde V_K(t)\right) \le \tilde V_K(0) + t(K_2 + C),$$
and note that the right hand side of the above inequality does not depend on $K$. If $K_1 + \rho > 0$ we have
$$E^K\left(\tilde V_K(t)\right) \le \tilde V_K(0) + (K_1 + \rho)\int_0^t E^K\left(\tilde V_K(s)\right)ds + t(K_2 + C),$$
therefore using Gronwall's inequality we have
$$E^K\left(\tilde V_K(t)\right) \le \left(\tilde V_K(0) + T(K_2 + C)\right)\exp\left((K_1 + \rho)t\right), \quad 0 \le t \le T,$$
and note again that the right hand side of the above inequality does not depend on $K$.
This proves C-tightness of the sequence of processes $(F_K)$.
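To establish tightness of the sequence $(\tilde V_K)$ we need only verify that condition (B.17) holds. We have
$$\begin{aligned}
\mathbb{P}_K\left[\nu_K\left([0,T]\times\{x : |k(x)| > a\}\right) > \epsilon\right] &\le \frac{E^K\left(\nu_K\left([0,T]\times\{x : |k(x)| > a\}\right)\right)}{\epsilon} \\
&\le \frac{\int_0^T E^K\left(m(\tilde V_K(s)) \wedge K\right)ds\int_{\mathbb{R}^n_0} 1_{|k(x)|>a}\,G_1(dx) + \int_{\mathbb{R}^n_0} 1_{|k(x)|>a}\,G_2(dx)}{\epsilon} \\
&\le \frac{C\int_{\mathbb{R}^n_0} k(x)\,G_1(dx) + \int_{\mathbb{R}^n_0} k(x)\,G_2(dx)}{a\epsilon},
\end{aligned}$$
where $C$ is a constant that does not depend on $K$. For the last inequality I made use of the result derived above that $E^K(\tilde V_K(t))$ is bounded by a constant, which does not depend on $K$. This proves that the sequence $(\tilde V_K)$ is tight.
Now we are left with identifying the limiting process with the process $V$. For this I use Theorem IX.2.22 in Jacod and Shiryaev (2003). It suffices to establish the following
$$\int_0^t\left|\tilde V_K(s) \wedge K - \tilde V_K(s)\right|ds \overset{p}{\to} 0, \quad\text{for every } t > 0, \text{ as } K \uparrow \infty,$$
$$\int_0^t\left|m(\tilde V_K(s)) \wedge K - m(\tilde V_K(s))\right|ds \overset{p}{\to} 0, \quad\text{for every } t > 0, \text{ as } K \uparrow \infty.$$
The first result follows since for arbitrary $s > 0$ and $\epsilon > 0$ we have
$$\mathbb{P}_K\left(\left|\tilde V_K(s) \wedge K - \tilde V_K(s)\right| > \epsilon\right) \le \mathbb{P}_K\left(\tilde V_K(s) > K\right) \le \frac{E^K\left(\tilde V_K(s)\right)}{K},$$
and as shown above $E^K(\tilde V_K(s))$ can be bounded by a constant, which does not depend on $K$. The second result follows analogously since
$$\mathbb{P}_K\left(\left|m(\tilde V_K(s)) \wedge K - m(\tilde V_K(s))\right| > \epsilon\right) \le \mathbb{P}_K\left(m(\tilde V_K(s)) > K\right) \le \frac{E^K\left(m(\tilde V_K(s))\right)}{K} \le \frac{E^K\left(\tilde V_K(s)\right) + C}{K}.$$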
Part (b). We can write
$$E(V(t)|\mathcal{F}_s) = e^{\rho(t-s)}V(s) + K_1\int_s^t e^{\rho(t-u)}E(V(u)|\mathcal{F}_s)\,du + K_2\int_s^t e^{\rho(t-u)}\,du, \quad t \ge s.$$
If we denote $E(V(t)|\mathcal{F}_s) = x(t)$ for $t \ge s$, then $x(t)$ solves the following differential equation
$$dx(t) = (K_1 + \rho)\,x(t)\,dt + K_2\,dt, \quad t \ge s,$$
which implies
$$x(t) = e^{(K_1+\rho)(t-s)}x(s) + K_2\int_s^t e^{(K_1+\rho)(t-u)}\,du, \quad t \ge s.$$
Therefore, we have
$$\begin{aligned}
E(V(t)|\mathcal{F}_0) &= e^{(K_1+\rho)t}V(0) + K_2\int_0^t e^{(K_1+\rho)(t-u)}\,du \\
&= e^{(K_1+\rho)t}V(0) - \frac{K_2}{\rho + K_1}\left(1 - e^{(K_1+\rho)t}\right),
\end{aligned}$$
and since $-\rho - K_1 > 0$ we have
$$\lim_{t\to\infty}E(V(t)|\mathcal{F}_0) = -\frac{K_2}{\rho + K_1},$$
that is we have asymptotic stationarity in the mean.
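The closed form for the conditional mean can be compared with a direct numerical integration of $dx(t) = (K_1 + \rho)x(t)dt + K_2dt$. The sketch below uses hypothetical values satisfying $-\rho > K_1 \ge 0$ and a plain Euler scheme chosen for simplicity; the function names are illustrative.

```python
import math


def conditional_mean(t, v0, rho, k1, k2):
    """Closed form E(V(t) | F_0) from the text, valid for k1 + rho != 0."""
    a = k1 + rho
    return math.exp(a * t) * v0 - k2 / a * (1.0 - math.exp(a * t))


def euler_mean(t, v0, rho, k1, k2, steps=100000):
    """Euler integration of dx = ((k1 + rho) * x + k2) dt from x(0) = v0."""
    a = k1 + rho
    dt = t / steps
    x = v0
    for _ in range(steps):
        x += (a * x + k2) * dt
    return x


rho, k1, k2, v0 = -1.5, 0.5, 0.8, 2.0  # hypothetical values with -rho > k1 >= 0
print(conditional_mean(2.0, v0, rho, k1, k2), euler_mean(2.0, v0, rho, k1, k2))
print(conditional_mean(50.0, v0, rho, k1, k2), -k2 / (rho + k1))
```

For large $t$ the dependence on $V(0)$ dies out at rate $e^{(K_1+\rho)t}$, which is the sense in which the mean is asymptotically stationary.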
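For the second moment we have
$$E\left(V^2(t)|\mathcal{F}_0\right) = \left(E(V(t)|\mathcal{F}_0)\right)^2 + K_1'\int_0^t e^{2\rho(t-u)}E(V(u)|\mathcal{F}_0)\,du + K_2'\int_0^t e^{2\rho(t-u)}\,du,$$
and further
$$\lim_{t\to\infty}\left(E(V(t)|\mathcal{F}_0)\right)^2 = \left(\frac{K_2}{\rho + K_1}\right)^2,$$
$$\lim_{t\to\infty}\int_0^t e^{2\rho(t-u)}E(V(u)|\mathcal{F}_0)\,du = -\frac{K_2}{\rho + K_1}\lim_{t\to\infty}\int_0^t e^{2\rho(t-u)}\left(1 - e^{(K_1+\rho)u}\right)du = \frac{1}{2\rho}\frac{K_2}{\rho + K_1},$$
$$\lim_{t\to\infty}\int_0^t e^{2\rho(t-u)}\,du = -\frac{1}{2\rho}.$$
Combining everything we can write
$$\lim_{t\to\infty}E\left(V^2(t)|\mathcal{F}_0\right) = \left(\frac{K_2}{\rho + K_1}\right)^2 + \frac{K_1'}{2\rho}\frac{K_2}{\rho + K_1} - \frac{K_2'}{2\rho},$$
and therefore we have asymptotic stationarity in the second moment. □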
Table 1: Estimation results for the Diffusive Volatility SV model:
$$df(t) = \alpha(t)\,dt + \sigma(t)\,dW(t) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu(dt,dx),$$
$$\sigma^2(t) = V_1^c(t) + V_2^c(t),$$
$$dV_i^c(t) = \kappa_i(\theta_i - V_i^c(t))\,dt + \sigma_{iv}\sqrt{V_i^c(t)}\,dB_i(t), \quad i = 1, 2.$$

Parameter          one-factor           two-factor
θ                  0.5273  (0.0364)     0.6001  (0.0507)
κ1                 0.3935  (0.0569)     1.5945  (0.5136)
σ1                 0.1283  (0.0098)     0.3305  (0.2996)
κ2                                      0.0022  (0.0094)
σ2                                      0.2575  (0.2090)
∫ h²(x)G(dx)       0.0981  (0.0086)     0.1136  (0.0085)
∫ h⁴(x)G(dx)       0.00003 (0.0416)     0.0003  (0.0534)

GMM test of overidentifying restrictions
χ²                 91.3540              40.6320
d.o.f              (6)                  (4)
p-value            0.0000               0.0000

Note: In the estimation I set $\theta = \theta_1 + \theta_2$ and $\sigma_i = \sigma_{iv}\sqrt{\theta_i/(2\kappa_i)}$ for $i = 1, 2$ and imposed the stationarity
conditions σ1 + σ2 < θ and κi > 0 for i = 1, 2. The data used in the estimation spans the period from
January 1 1990 till November 29 2002 for a total of 3256 daily observations on the S&P 500 futures
contract. The daily realized multipower variation statistics were computed using 80 intraday five-minute
returns over each of the days in the sample. The model is estimated using GMM-type estimator with
moment conditions specified in Section 4. The asymptotic variance-covariance matrix, used for calculating
the optimal weighting matrix, is computed using Parzen weights with a lag length of 80. Standard errors
for the parameter estimates are reported in parentheses.
Table 2: Estimation results for the CARMA Jump-Driven SV model:
$$df(t) = \alpha(t)\,dt + \sigma(t)\,dW(t) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu(dt,dx),$$
$$\sigma^2(t) = \int_{-\infty}^t\int_{\mathbb{R}^n_0} g(t-s)\,k(x)\,\mu(ds,dx),$$
$$g(u) = \frac{b_0 + \rho_1}{\rho_1 - \rho_2}e^{\rho_1 u} + \frac{b_0 + \rho_2}{\rho_2 - \rho_1}e^{\rho_2 u}, \quad u \ge 0.$$

Parameter            CARMA(1,0)           CARMA(2,1)
θ                    0.7425  (0.0525)     0.7969  (0.0511)
∫ k²(x)G(dx)         0.1672  (0.0222)     2.3934  (0.4741)
b0                                        0.2313  (0.0455)
−ρ1                  0.0604  (0.0067)     0.0390  (0.0097)
−ρ2                                       1.5574  (0.2283)
∫ h²(x)G(dx)         0.1119  (0.0160)     0.1491  (0.0159)
∫ h⁴(x)G(dx)         0.1060  (0.1293)     0.2555  (0.1296)
∫ h²(x)k(x)G(dx)     0.1331  (0.0975)     0.5257  (0.1397)

GMM test of overidentifying restrictions
χ²                   75.2200              0.9879
d.o.f                (5)                  (3)
p-value              0.0000               0.8042

Note: In the estimation I set $\theta = \frac{b_0}{\rho_1\rho_2}\int_{\mathbb{R}^n_0} k(x)G(dx)$ and imposed the stationarity conditions $b_0 > -\max\{\rho_1, \rho_2\}$ and $\rho_i < 0$ for $i = 1, 2$. The data used in the estimation spans the period from January 1 1990 till November 29 2002 for a total of 3256 daily observations on the S&P 500 futures contract.
The daily realized multipower variation statistics were computed using 80 intraday five-minute returns
over each of the days in the sample. The model is estimated using GMM-type estimator with moment
conditions specified in Section 4. The asymptotic variance-covariance matrix, used for calculating the
optimal weighting matrix, is computed using Parzen weights with a lag length of 80. Standard errors for
the parameter estimates are reported in parentheses.
Table 3: Estimation results for the Jump-Diffusive Volatility SV model:
$$df(t) = \alpha(t)\,dt + \sigma(t)\,dW(t) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu(dt,dx),$$
$$\sigma^2(t) = V_1^c(t) + V^j(t),$$
$$dV_1^c(t) = \kappa_1(\theta_1 - V_1^c(t))\,dt + \sigma_{1v}\sqrt{V_1^c(t)}\,dB_1(t),$$
$$V^j(t) = \int_{-\infty}^t\int_{\mathbb{R}^n_0} e^{\rho_1(t-s)}k(x)\,\mu(ds,dx).$$

Parameter            Estimate    Standard Error
θ                    0.8153      0.0511
κ1                   0.0390      0.0098
σ1                   0.8126      0.0711
−ρ1                  1.5399      0.2245
∫ k²(x)G(dx)         2.3466      0.4692
∫ h²(x)G(dx)         0.1527      0.0159
∫ h⁴(x)G(dx)         0.2648      0.1293
∫ h²(x)k(x)G(dx)     0.5768      0.1483

GMM test of overidentifying restrictions
χ²                   1.0770
d.o.f                (3)
p-value              0.7826

Note: In the estimation I set $\theta = \theta_1 - \frac{1}{\rho_1}\int_{\mathbb{R}^n_0} k(x)G(dx)$ and $\sigma_1 = \sigma_{1v}\sqrt{\theta_1/(2\kappa_1)}$ and imposed the stationarity
conditions σ1 < θ, κ1 > 0 and ρ1 < 0. The data used in the estimation spans the period from January
1 1990 till November 29 2002 for a total of 3256 daily observations on the S&P 500 futures contract.
The daily realized multipower variation statistics were computed using 80 intraday five-minute returns
over each of the days in the sample. The model is estimated using GMM-type estimator with moment
conditions specified in Section 4. The asymptotic variance-covariance matrix, used for calculating the
optimal weighting matrix, is computed using Parzen weights with a lag length of 80.
Table 4: Moment Condition Tests for the Jump-Diffusive Volatility SV model

Moment condition                                  t-statistic
autocorrelation in IV for lag 1                   −0.2760
autocorrelation in IV for lag 3                   0.2252
autocorrelation in IV for lag 6                   0.0875
aver. autocorrelation in IV for lags 10 − 20      0.5844
aver. autocorrelation in IV for lags 20 − 30      0.6461
aver. autocorrelation in IV for lags 30 − 40      0.2418
E(IV(t))                                          −0.3190
E(IV²(t))                                         −0.0965
E(QV(t) − IV(t))                                  0.1062
E(QV²(t))                                         0.2371
E(FV²δ(t))                                        0.7163

Note: The table reports the diagnostic t-statistics for each of the moment conditions underlying the
estimation results for the Jump-Diffusive SV Model reported in Table 3.
Table 5: Wald tests for zero covariances between RP and lags of TV and RP and lags of JV

         Covariance between RP and lags of TV     Covariance between RP and lags of JV
Lags     Wald test      P-value                   Wald test      P-value
1        16.9828        0.0000                    9.1121         0.0025
5        19.2828        0.0017                    21.1619        0.0008
10       28.1547        0.0017                    27.0614        0.0025
15       51.4367        0.0000                    33.5223        0.0040
20       73.3441        0.0000                    49.3173        0.0003
25       83.2144        0.0000                    56.6326        0.0003
30       105.4132       0.0000                    73.8088        0.0000

Note: The Wald statistic tests the null hypothesis of zero covariances between RP and lags of TV (respectively JV) up to the corresponding lag. The data used in the estimation spans the period from January 1
1990 till November 29 2002 for a total of 3256 daily observations on the S&P 500 futures contract and daily
closing prices of the VIX index. The daily realized multipower variation statistics were computed using
80 intraday five-minute returns over each of the days in the sample. The RP measure was constructed,
using the Jump-Diffusive SV model with parameter values the estimated ones reported in Table 3. For the
calculation of the asymptotic variance of the covariances the error, from the estimation of the parameters
of the Jump-Diffusive SV model, was taken into account.
Table 6: Estimation results for the Time-Variation in the Variance Risk Premium:
$$VR_a(t) = \alpha_0 + \alpha_1\phi(t) + \alpha_2\tau(t),$$
$$df(t) = \alpha(t)\,dt + \sigma(t)\,dW(t) + \int_{\mathbb{R}^n_0} h(x)\,\tilde\mu(dt,dx),$$
$$\sigma^2(t) = V^c(t) + V^j(t),$$
$$dV^c(t) = \kappa_1(\theta_1 - V^c(t))\,dt + \sigma_{1v}(t)\sqrt{V^c(t)}\,dB_1(t),$$
$$V^j(t) = \int_{-\infty}^t\int_{\mathbb{R}^n_0} e^{\rho_1(t-s)}k(x)\,\mu(ds,dx).$$

Parameter            Estimate    Standard Error
θ                    0.7955      0.0807
κ1                   0.0312      0.0107
σ1                   0.7517      0.1193
−ρ1                  1.6203      0.2722
∫ k²(x)G(dx)         2.2623      0.6566
∫ h²(x)G(dx)         0.1522      0.0169
∫ h⁴(x)G(dx)         0.2355      0.1243
∫ h²(x)k(x)G(dx)     0.5419      0.1537
K0                   0.6948      0.0533
Kφc                  -1.0797     1.3404
Kτc                  1.4288      1.3151
Kτj                  0.1182      0.0252
−ρτ                  0.0189      0.0092

GMM test of overidentifying restrictions
χ²                   5.1952
d.o.f                (6)
p-value              0.5190

Note: In the estimation I set $\theta = \theta_1 - \frac{1}{\rho_1}\int_{\mathbb{R}^n_0} k(x)G(dx)$ and $\sigma_1 = \sigma_{1v}\sqrt{\theta_1/(2\kappa_1)}$ and imposed the stationarity
conditions σ1 < θ, κ1 > 0 and ρ1 < 0. The data used in the estimation spans the period from January
1 1990 till November 29 2002 for a total of 3256 daily observations on the S&P 500 futures contract and
daily closing prices of the VIX index. The daily realized multipower variation statistics were computed
using 80 intraday five-minute returns over each of the days in the sample. The model is estimated using
GMM-type estimator with moment conditions specified in Section 6. The asymptotic variance-covariance
matrix, used for calculating the optimal weighting matrix, is computed using Parzen weights with a lag
length of 80.
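The two computational ingredients named in the table and its note, Parzen-weighted long-run variance estimation and the chi-square p-value of the overidentification test, can be sketched as follows. This is a minimal illustration for a scalar series, not the estimator used in the paper: the function names are hypothetical, the bandwidth convention `j / (L + 1)` is an assumption (other conventions divide by `L`), and the closed-form tail probability applies to even degrees of freedom only.

```python
import math


def parzen_weight(x: float) -> float:
    """Parzen kernel k(x); it vanishes for |x| >= 1."""
    a = abs(x)
    if a <= 0.5:
        return 1.0 - 6.0 * a**2 + 6.0 * a**3
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0


def autocovariance(x, lag):
    """Sample autocovariance at a nonnegative lag, normalized by the sample size."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n


def long_run_variance(x, lag_length):
    """Parzen-weighted estimate: gamma_0 + 2 * sum_j k(j / (L + 1)) * gamma_j.

    The bandwidth convention j / (L + 1) is an assumption of this sketch.
    """
    total = autocovariance(x, 0)
    for j in range(1, lag_length + 1):
        total += 2.0 * parzen_weight(j / (lag_length + 1)) * autocovariance(x, j)
    return total


def chi2_sf_even_dof(stat, dof):
    """Upper-tail chi-square probability; this closed form holds for even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form requires a positive even dof")
    half = stat / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(dof // 2))


if __name__ == "__main__":
    print(long_run_variance([1.0, -1.0, 1.0, -1.0], lag_length=1))
    print(chi2_sf_even_dof(5.1952, 6))
```

Evaluating `chi2_sf_even_dof(5.1952, 6)` gives approximately 0.519, in line with the p-value reported in the table.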
[Figure 1 plot omitted. Panels, top to bottom: Jumps, α=0.0; Jumps, α=1.0; Jumps, α=1.5; Brownian motion. Horizontal axis: 0 to 1.]
Figure 1: Simulated trajectories of tempered stable jump processes and Brownian motion on the interval
[0, 1]. The parameters of the processes were chosen such that in all cases these processes have variance of
1 on the unit interval.
[Figure 2 plot omitted. Panels, top to bottom: Daily Returns; TV; JV. Horizontal axis: 1990 to 2002.]
Figure 2: S&P 500 daily measures. The top panel shows daily returns; the middle panel shows the daily
TV and the bottom panel shows the daily difference between RV and TV. The sample period is from
January 2 1990 till November 29 2002 and includes 3256 daily high-frequency observations on the S&P
500 futures contract. The daily realized multipower variation statistics were computed using 80 intraday
five-minute returns over each of the days in the sample using the formulas in equations (18) and (19).
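The daily measures in this figure can be illustrated with a short sketch. It assumes that TV denotes realized tripower variation in its standard form and applies no finite-sample rescaling; equations (18) and (19) may differ in such details, so this is an illustration rather than a reproduction of them. All function names are hypothetical, and the simulated returns are toy data.

```python
import math
import random


def abs_moment(p: float) -> float:
    """E|Z|^p for a standard normal Z."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.gamma(0.5)


def realized_variance(returns):
    """RV: sum of squared intraday returns."""
    return sum(r * r for r in returns)


def tripower_variation(returns):
    """Realized tripower variation (assumed form, no finite-sample rescaling)."""
    scale = abs_moment(2.0 / 3.0) ** -3
    total = 0.0
    for i in range(2, len(returns)):
        product = abs(returns[i]) * abs(returns[i - 1]) * abs(returns[i - 2])
        total += product ** (2.0 / 3.0)
    return scale * total


def jump_variation(returns):
    """JV = RV - TV; it can be slightly negative in finite samples."""
    return realized_variance(returns) - tripower_variation(returns)


if __name__ == "__main__":
    rng = random.Random(0)
    day = [rng.gauss(0.0, 0.1) for _ in range(80)]  # 80 toy five-minute returns
    day_with_jump = list(day)
    day_with_jump[40] += 3.0  # add one large price jump
    for label, r in (("no jump", day), ("with jump", day_with_jump)):
        print(label, realized_variance(r), tripower_variation(r), jump_variation(r))
```

In the toy example the jump raises RV by roughly its squared size, while the tripower measure moves far less, so most of the jump shows up in JV.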
[Figure 3 plot omitted. Panels, top to bottom: corr(TV_t, TV_{t−i}); corr(JV_t, JV_{t−i}). Horizontal axis: lags in days, 0 to 100.]
Figure 3: S&P 500 sample correlations. The top panel shows the sample autocorrelation in TV and the
second panel shows the autocorrelation in JV=RV-TV. The sample period used in the calculations is from
January 2 1990 till November 29 2002 and includes 3256 daily high-frequency observations on the S&P
500 futures contract. The daily realized multipower variation statistics were computed using 80 intraday
five-minute returns over each of the days in the sample using the formulas in equations (18) and (19).
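A sample autocorrelation function of the kind plotted here can be computed as in the sketch below. The normalization by the full sample size is an assumption of this sketch (one common convention), and the function name is hypothetical.

```python
def sample_autocorrelation(x, max_lag):
    """Return [corr(x_t, x_{t-i}) for i = 0..max_lag] using full-sample normalization."""
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sample size")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    if gamma0 == 0.0:
        raise ValueError("series has zero variance")
    acf = []
    for lag in range(max_lag + 1):
        # Autocovariance at this lag, divided by the lag-0 autocovariance.
        gamma = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        acf.append(gamma / gamma0)
    return acf


if __name__ == "__main__":
    toy_tv = [1.0, 1.4, 1.2, 2.5, 2.1, 1.8, 1.1, 0.9, 1.0, 1.3]  # toy data
    print(sample_autocorrelation(toy_tv, max_lag=3))
```

Applied to the daily TV or JV series with `max_lag=100`, this yields the kind of curve shown in the corresponding panel.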
[Figure 4 plot omitted. Vertical axis: corr(TV_t, TV_{t−i}). Horizontal axis: lags in days, 0 to 100.]
Figure 4: Empirical and fitted autocorrelation of TV. The empirical autocorrelation of TV is marked with +. The dashed lines are the 95% confidence interval for the autocorrelation,
computed with GMM-type standard errors. The solid line is the autocorrelation implied by the Jump-Diffusive
SV model given in (29)-(32), with parameters set at the estimates reported in Table 3.
[Figure 5 plot omitted. Panels, top to bottom: RP_t; TV_t; JV_t. Horizontal axis: year, 1990 to 2002.]
Figure 5: Estimated “Variance Risk Premium”. The top panel shows the RP measure calculated using
equation (41). The middle panel shows the daily TV and the bottom panel shows the daily difference
between RV and TV. The sample period is from January 2 1990 till November 29 2002 and includes 3256
daily observations on the VIX index as well as 3256 daily high-frequency observations on the S&P 500
futures contract. The variance swap rate was calculated using equation (50). The daily realized multipower
variation statistics were computed using 80 intraday five-minute returns over each of the days in the sample
using the formulas in equations (18) and (19).
[Figure 6 plot omitted. Panels, top to bottom: cov(RP_t, TV_{t−i}); cov(RP_t, JV_{t−i}). Horizontal axis: lags in days, 0 to 50.]
Figure 6: “Variance Risk Premium” sample covariances. The top panel shows the covariance between
RP and past values of TV; the bottom panel shows the covariance between RP and past values of JV.
On both panels the solid lines are the estimated covariances, while the dotted lines are the corresponding
95% confidence bounds. The RP measure was calculated using equation (41). The sample period is from
January 2 1990 till November 29 2002 and includes 3256 daily observations on the VIX index as well as
3256 daily high-frequency observations on the S&P 500 futures contract. The variance swap rate was
calculated using equation (50). The daily realized multipower variation statistics were computed using 80
intraday five-minute returns over each of the days in the sample using the formulas in equations (18) and
(19).
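The lagged covariances in this figure pair today's RP with earlier values of TV or JV. A minimal sketch of such a statistic follows; it demeans over the overlapping observations only, which is an assumption of this sketch, and it does not reproduce the confidence bounds, which require the asymptotic variance. The function name and the toy series are hypothetical.

```python
def lagged_covariance(y, x, lag):
    """Sample cov(y_t, x_{t-lag}) computed over the overlapping observations."""
    if len(y) != len(x):
        raise ValueError("series must have equal length")
    if lag < 0 or lag >= len(y):
        raise ValueError("lag must satisfy 0 <= lag < len(y)")
    # Pair each y_t with the value of x observed lag periods earlier.
    pairs = [(y[t], x[t - lag]) for t in range(lag, len(y))]
    m = len(pairs)
    mean_y = sum(a for a, _ in pairs) / m
    mean_x = sum(b for _, b in pairs) / m
    return sum((a - mean_y) * (b - mean_x) for a, b in pairs) / m


if __name__ == "__main__":
    toy_jv = [0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]  # toy jump measure
    toy_rp = [1.0, 1.0, 1.0, 2.5, 2.0, 1.5, 1.2, 2.2, 1.8, 1.4]  # toy premium
    for lag in range(4):
        print(lag, lagged_covariance(toy_rp, toy_jv, lag))
```

In the toy series the premium rises the day after a jump, so the covariance at lag 1 is positive while the contemporaneous one is not, which is the pattern a plot of this kind is designed to reveal.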
59
References
Ait-Sahalia, Y. (2004). Disentangling Diffusion from Jumps. Journal of Financial Economics 74, 487–528.
Ait-Sahalia, Y. and J. Jacod (2005). Volatility Estimators for Discretely Sampled Lévy Processes. Working
paper, Princeton University and Université de Paris-6.
Ait-Sahalia, Y., Y. Wang, and F. Yared (2001). Do Option Markets Correctly Price the Probabilities of
Movement of the Underlying Asset? Journal of Econometrics 102, 67–110.
Alizadeh, S., M. W. Brandt, and F. Diebold (2002). Range-Based Estimation of Stochastic Volatility
Models. Journal of Finance 57, 1047–1091.
Andersen, T., L. Benzoni, and J. Lund (2002). An Empirical Investigation of Continuous-Time Equity
Return Models. Journal of Finance 57, 1239–1284.
Andersen, T., T. Bollerslev, and F. Diebold (2005a). Parametric and Nonparametric Measurement of
Volatility. In Y. Ait-Sahalia and L. Hansen (Eds.), Handbook of Financial Econometrics. North-Holland.
Andersen, T., T. Bollerslev, and F. Diebold (2005b). Some Like it Smooth, and Some Like it Rough:
Disentangling Continuous and Jump Components in Measuring, Modeling and Forecasting Asset Return
Volatility. Working paper, Duke University.
Andersen, T., T. Bollerslev, and N. Meddahi (2005). Correcting the Errors: Volatility Forecast Evaluation
Using High-Frequency Data and Realized Volatilities. Econometrica 73, 279–296.
Andrews, D. (1999). Consistent Moment Selection Procedures for Generalized Method of Moments Estimation. Econometrica 67, 543–564.
Andrews, D. (2001). Testing when a Parameter is on the Boundary of the Maintained Hypotheis. Econometrica 69, 683–734.
Andrews, D. (2002). Generalized Method of Moments when the Parameter is on the Boundary. Journal
of Business and Economic Statistics 20, 530–544.
Andrews, D. and B. Lu (2001). Consistent Model and Moment Selection Procedures for GMM Estimation
with Application to Dynamic Panel Data Models. Journal of Econometrics 101, 123–164.
Bakshi, G. and N. Kapadia (2003). Delta-Hedged Gains and the Negative Market Volatility Risk Premium.
Review of Financial Studies 16, 527–566.
Bakshi, G. and D. Madan (2000). Spanning and Derivative-Security Valuation. Journal of Financial
Economics 55, 205–238.
Bakshi, G. and D. Madan (2006). A Theory of Volatility Spreads. Working paper, University of Maryland.
Barndorff-Nielsen, O. E., S. Graversen, J. Jacod, M. Podolskij, and N. Shephard (2005). A Central Limit
Theorem for Realised Power and Bipower Variations of Continuous Semimartingales. In Y. Kabanov and
R. Lipster (Eds.), From Stochastic Analysis to Mathematical Finance, Festschrift for Albert Shiryaev.
Springer.
Barndorff-Nielsen, O. E. and N. Shephard (2001). Non- Gaussian Ornstein-Uhlenbeck-based Models and
Some of Their Applicaions in Financial Economics. Journal of the Royal Statistical Society: Series B 63,
167–241.
Barndorff-Nielsen, O. E. and N. Shephard (2004). Power and Bipower Variation with Stochastic Volatility
and Jumps. Journal of Financial Econometrics 2, 1–37.
60
Barndorff-Nielsen, O. E. and N. Shephard (2006). Econometrics of Testing for Jumps in Financial Economics using Bipower Variation. Journal of Financial Econometrics 4, 1–30.
Barndorff-Nielsen, O. E., N. Shephard, and M. Winkel (2006). Limit Theorems for Multipower Variation
in the Presence of Jumps in Financial Econometrics. Stochastic Processes and Their Applications 116,
796–806.
Bates, D. (2000). Post-’87 Crash Fears in S&P 500 Future Options. Journal of Econometrics 94, 181–238.
Bates, D. (2006). The Market for Crash Risk. Working paper, University of Iowa.
Billingsley, P. (1968). Convergence of Probability Measures. New York: Wiley.
Black, F. and M. Scholes (1973). The Pricing of Options and Corporate Liabilities. Journal of Political
Economy 81, 637–654.
Blumenthal, R. and R. Getoor (1961). Sample Functions of Stochastic Processes with Independent Increments. Journal of Math. Mech. 10, 493–516.
Bollerslev, T., R. Engle, and D. Nelson (1994). ARCH Models. In R. Engle and D. McFadden (Eds.),
Handbook of Econometrics, Volume 4. Amsterdam: North-Holland.
Bollerslev, T., M. Gibson, and H. Zhou (2005). Dynamic Estimation of Volatility Risk Premia and Investor
Risk Aversion from Option-Implied and Realized Volatilities. Working paper, Duke University.
Bollerslev, T. and H. Zhou (2002). Estimating Stochastic Volatility Diffusion using Conditional Moments
of Integrated Volatility. Journal of Econometrics 109, 33–65.
Britten-Jones, M. and A. Neuberger (2000). Option Prices, Implied Price Processes, and Stochastic Volatility. Journal of Finance 55, 839–866.
Broadie, M., M. Chernov, and M. Johannes (2006). Specification and Risk Premiums: The Information in
S&P 500 Futures Options. Journal of Finance, forthcoming.
Brockwell, P. (2001a). Lévy -Driven CARMA Processes. Ann.Inst.Statist.Math 53, 113–124.
Brockwell, P. (2001b). Continuous-Time ARMA Processes. In D. Shanbhag and C. Rao (Eds.), Handbook
of Statistics, Volume 19. North-Holland.
Brockwell, P. and T. Marquardt (2005). Lévy-Driven and Fractionally Integrated ARMA Processes with
Continuous Time Parameter. Statistica Sinica 15, 477–494.
Campbell, J. and J. Cochrane (1999). By Force of Habit: A Consumption Based Explanation of Aggregate
Stock Market Behavior. Journal of Political Economy 107, 205–251.
Carr, P., H. Geman, D. Madan, and M. Yor (2003). Stochastic Volatility for Lévy Processes. Mathematical
Finance 13, 345–382.
Carr, P. and D. Madan (2001). Optimal Positioning in Derivative Securities. Quantitative Finance 1,
19–37.
Carr, P. and L. Wu (2004). Variance Risk Premia. Working paper, Bloomberg and Baruch College.
Cheridito, P., D. Filipović, and R. Kimmel (2005). Market Price of Risk Specifications for Affine Models:
Theory and Evidence. Journal of Financial Economics, forthcoming.
61
Cheridito, P., D. Filipović, and M. Yor (2005). Equivalent and Absolutely Continuous Measure Changes
for Jump-Diffusion Processes. The Annals of Applied Probability 15, 1713–1732.
Chernov, M., R. Gallant, E. Ghysels, and G. Tauchen (2003). Alternative Models for Stock Price Dynamics.
Journal of Econometrics 116, 225–257.
Chernov, M. and E. Ghysels (2000). A Study Towards a Unified Approach to the Joint Estimation
of Objective and Risk-Neutral Measures for the Purpose of Options Valuation. Journal of Financial
Economics 56, 407–458.
Chernozhukov, V. and H. Hong (2003). An MCMC Approach to Classical Estimation. Journal of Econometrics 115, 293–346.
Cont, R. and P. Tankov (2004). Financial Modelling With Jump Processes. London: Chapman & Hall.
Cont, R., P. Tankov, and E. Voltchkova (2005). Hedging with Otions in Presence of Jumps. Stochastic
analysis and applications: Abel Symposium 2005 in honor of Kiyosi Ito’s 90th birthday.
Delbaen, F. and W. Schachermayer (1994). A General Version of the Fundamental Theorem of Asset
Pricing. Mathematische Annalen 300, 520–563.
Delbaen, F. and W. Schachermayer (1998). The Fundamental Theorem of Asset Pricing for Unbounded
Stochastic Processes. Mathematische Annalen 312, 215–250.
Demeterfi, K., E. Derman, M. Kamal, and J. Zou (1999). A Guide to Volatility and Variance Swaps.
Journal of Derivatives 6, 9–32.
Duffie, D. (2001). Dynamic Asset Pricing Theory. Princeton: Princeton University Press.
Duffie, D., D. Filipović, and W. Schachermayer (2003). Affine Processes and Applications in Finance.
Annals of Applied Probability 13(3), 984–1053.
Duffie, D., J. Pan, and K. Singleton (2000). Transform Analysis and Asset Pricing for Affine JumpDiffusions. Econometrica 68, 1343–1376.
Eraker, B. (2004). Do Stock Prices and Volatility Jump? Reconciling Evidence from Spot and Option
Prices. Journal of Finance 59, 1367–1403.
Eraker, B., M. Johannes, and N. Polson (2003). The Impact of Jumps in Volatility and Returns. Journal
of Finance 58, 1269–1300.
Feller, W. (1951). Two Singular Diffusion Problems. Annals of Mathematics 54, 173–182.
Hamilton, J. (1994). Time Series Analysis. New Jersey: Princeton University Press.
Harrison, J. and D. Kreps (1979). Martingales and Arbitrage in Multiperiod Security Markets. Journal of
Economic Theory 20, 381–408.
Harrison, J. and S. Pliska (1981). Martingales and Stochastic Integrals in the Theory of Continuous
Trading. Stochastic Processes and their Applications 11, 215–260.
Hong, H., B. Preston, and M. Shum (2003). Generalized Empirical Likekihood-Based Model Selection
Criteria for Moment Condition Models. Econometric Theory 19, 923–943.
Huang, X. and G. Tauchen (2005). The Relative Contributions of Jumps to Total Variance. Journal of
Financial Econometrics 3, 456–499.
62
Ikeda, N. and S. Watanabe (1981). Stochastic Differential Equations and Diffusion Processes. Tokyo:
North-Holland.
Jacod, J. (1979). Calcul Stochastique et Problèmes de Martingales. Lecture notes in Mathemtatics 714.
Berlin Heidelberg New York: Springer-Verlag.
Jacod, J. (2006a). Asymptotic Properties of Power Variations and Associated Functionals of Semimartingales. Working paper, Université de Paris-6.
Jacod, J. (2006b). Asymptotic Properties of Power Variations of Lévy Processes. Working paper, Université
de Paris-6.
Jacod, J. and A. N. Shiryaev (2003). Limit Theorems For Stochastic Processes (2nd ed.). Berlin: SpringerVerlag.
Jiang, G. and R. Oomen (2006). Estimating Latent Variables and Jump Diffusion Models using High
Frequency Data. Working paper, University of Arizona and University of Warwick.
Kallsen, J. and A. Shiryaev (2002). The Cumulant Process and Esscher’s Change of Measure. Finance
and Stochastics 6, 397–428.
Klüppelberg, C., A. Lindner, and R. Maller (2004). A Continuous Time GARCH Process Driven by a
Lévy Process: Stationarity and Second Order Behavior. Journal of Applied Probability 41, 601–622.
Liu, J., J. Pan, and T. Wang (2005). An Equilibrium Model of Rare-Event Premia and Its Implications
for Option Smirks. Review of Financial Studies 18, 131–164.
Newey, W. and D. McFadden (1994). Large Sample Estimation and Hypothesis Testing. In R. Engle and
D. McFadden (Eds.), Handbook of Econometrics, Volume 4, pp. 2113–2241. Amsterdam: North-Holland.
Nicolato, E. and E. Venardos (2003). Option Pricing in Stochastic Volatility Models of the OrnsteinUhlenbeck Type. Mathematical Finance 13, 445–466.
Pan, J. (2002). The Jump-Risk Premia Implicit in Options: Evidence from an Integrated Time-Series
Study. Journal of Financial Economics 63, 3–50.
Pozdnyakov, V. and J. Steele (2004). On the Martingale Framework for Futures Prices. Stochastic Processes
and their Applications 109, 69–77.
Rajput, B. and J. Rosiński (1989). Spectral Representation of Infinitely Divisible Processes Vectors.
Probability Theory and Related Fields 82, 451–487.
Revuz, D. and M. Yor (1994). Continuous Martingales and Brownian Motion (2nd ed.). New York:
Springer-Verlag.
Rosenberg, J. and R. Engle (2002). Empirical Pricing Kernels. Journal of Financial Economics 64,
341–372.
Santa-Clara, P. and Y. Shu (2005). Crashes, Volatility, and the Equity Premium: Lessons from S&P 500
Options. Working paper, UCLA.
Tankov, P. (2003). Dependence Structure of Lévy Processes with Applications in Risk Management.
Raport Interne 502, CMAP,Ecole Polytechnique.
Tauchen, G. (1985). Diagnostic Testing and Evaluation of Maximum Likelihood Models. Journal of
Econometrics 30, 415–443.
63
Todorov, V. (2006a). Econometric Analysis of Jump-Driven Stochastic Volatility Models. Working paper,
Duke University.
Todorov, V. (2006b). Estimation of Coninuous-time Stochastic Volatility Models with Jumps. Working
paper, Duke University.
Todorov, V. and G. Tauchen (2006). Simulation Methods for Lévy -Driven CARMA Stochastic Volatility
Models. Journal of Business and Economic Statistics 24, 455–469.
Woerner, J. (2006). Power and Multipower Variation: Inference for High Frequency Data. In A. Shiryaev,
M. do Rosario Grossinho, P. Oliviera, and M. Esquivel (Eds.), Proceedings of the International Conference
on Stochastic Finance 2004. Berlin: Springer Verlag.
Wooldridge, J. (1994). Estimation and Inference for Dependent Processes. In R. Engle and D. McFadden
(Eds.), Handbook of Econometrics, Volume 4, pp. 2639–2738. Amsterdam: North-Holland.
Wu, L. (2005). Variance Dynamics: Joint Evidence from Options and High-Frequency Returns. Working
paper, Baruch College.
64
Download