Vrije Universiteit Amsterdam
School of Business and Economics
Financial Econometrics Case Study
Modeling and Forecasting Volatility
Supervisors:
Lennart Hoogerheide
Siem Jan Koopman
Authors:
Jorik van der Oord - 2646814
Jan Daenen - 2653703
Kwame Bonsu - 2576417
Stijn Donckers - 2680580
Bas Boekhout - 2655526
Abstract
This paper examines how 16 different volatility models, applied to high frequency data on stock prices
of BP (British Petroleum), perform in approximating the volatility of the returns. The forecasts of the
models are compared to each other using the Diebold-Mariano test based on the FMSE, the FMAE and
the likelihoods. The volatility models are tested on two test sets, set A and set B, where
set A starts on January 1st 2007 and ends on January 31st 2012 and set B starts on January 1st
2007 and ends on April 19th 2010. Set B is created to investigate how the different volatility models
forecast the Deepwater Horizon oil disaster that occurred on April 20th 2010. Results show that in
set A, the Realized GARCH-T and Realized GARCH-GED perform well in approximating the volatility
of the returns. For set B, the results show that, when using the FMAE, the Realized GARCH models
perform best. When using the FMSE, the GAS-GED outperforms all other models,
whereas there is no clear 'best' model when using the Diebold-Mariano test based on the likelihood.
January 31, 2020
Contents
1 Introduction
2 Data Cleaning
3 Descriptive Statistics
3.1 Full dataset
3.2 The year 2010
4 Realized Measures
4.1 Different Measures
4.2 Empirical Comparison
5 Models
5.1 GARCH
5.2 Robust GARCH
5.3 Robust GARCH with leverage
5.4 GJR-GARCH
5.5 NAGARCH
5.6 GAS
5.7 Realized GARCH Log-Linear
5.8 Residual error testing
6 Forecasting
6.1 Forecasting methodology
6.2 Forecasting accuracy
6.2.1 Forecast Mean Squared Error and Forecast Mean Absolute Error
6.2.2 Diebold-Mariano Test
7 Results
7.1 Forecasting results Set A
7.2 Forecasting results Set B
7.3 Visualisation of result of Set A and B
8 Conclusion
Appendices
A Data Cleaning
B Descriptive Statistics
B.1 Returns
B.2 2009-2010
C Kernel Models
D Realized Garch
E GAS
E.1 Introduction
E.2 General framework GAS student t
F Error distributions
F.1 Gaussian
F.2 Student-t
F.3 Generalized error distribution
F.4 Derivative
F.4.1 Student-t
F.4.2 GED
G Results
G.1 Set A
G.1.1 Estimates
G.1.2 DM-test results
G.1.3 Error tests
G.2 Set B
G.2.1 Estimates
G.3 DM-test results
G.3.1 Error tests
G.4 Plots
G.4.1 Set A
G.4.2 Set B
1 Introduction
In this paper, several volatility models are studied and applied to high frequency data on stock prices of
British Petroleum, or BP, traded on the NASDAQ exchange within the period 2007-2014. BP is a British
oil and gas company that delivers energy products and services to customers worldwide. It is one of the
six ‘Supermajor’ oil companies: large multinationals that are independent, hence not state-owned. This
makes BP an international player of importance. Although the oil and gas market could be considered
an oligopoly, influence of individual companies on prices is very limited.
It is useful to study volatility in the oil and gas market, as a worldwide transition towards new forms of
energy supply is currently under way. For investors, knowledge about the behaviour of prices and returns
is important during these more uncertain times. BP specifically is an interesting company,
since it was the protagonist in the infamous Deepwater Horizon accident in the Gulf of Mexico in 2010.
This led to an enormous oil spill, killing hundreds of animals in the years after and destroying biodiversity
in the whole region. It is known to be the largest marine oil spill in history, as Pallardy (2016), among
others, states. According to Lee et al. (2018), the total costs for BP are estimated at no less than $144.89
billion, while BP claims a figure of only $62.59 billion. In either case the costs were enormous, which likely
led to significant reactions in the market. Hence, this event is incorporated in the volatility analysis
performed in this paper. To this end, two different pairs of test and validation sets are used. For the
first pair, which covers the full dataset, the test set runs from 01-01-2007 to 31-12-2011 and the
validation set from 01-01-2012 to 31-12-2014. For the second pair, which targets forecasting the oil crisis
in 2010, the test set runs from 01-01-2007 to 18-04-2010 and the validation set from 19-04-2010
to 15-04-2011.
The structure of the paper is as follows. Section 2 explains the data-cleaning process. In
section 3 a general statistical analysis of the obtained data is performed. Section 4 compares realized
measures computed on the data. A range of existing volatility models is applied to the data and
assessed in section 5. In section 6 the forecasting methodology used to evaluate the predictive power
of the models is explained. This is followed by the obtained results in section 7 and a conclusion in section 8.
2 Data Cleaning
Before the econometric analysis could be conducted, the data from Wharton Research Data Services were
'cleaned'. This is important as one wants to make optimal use of the dataset to generate the best possible
volatility estimators. Hence, as much noise as possible should be removed. Data cleaning is delicate:
a small mistake could make the data unrepresentative. In this paper, the cleaning method
of Barndorff-Nielsen et al. (2009) has been followed. This section explains the steps taken in the data
cleaning process.
Entries with a bid, ask or transaction price of 0 were not present, so step (P2) from Barndorff-Nielsen
et al. (2009) was skipped. Apart from this, the following steps were taken to obtain the final dataset.
1. Filtering out entries that contain corrected trades. In the dataset these are the entries that have
CORR ≠ 0. (step T1 in Barndorff-Nielsen et al. (2009))
2. Deleting entries that have an abnormal Sale Condition, i.e. entries where COND has a letter code
that is not equal to E or F. These are transactions that deviate from what can be expected under
the market conditions at that moment. Trades with no sale condition were kept as well. (T2)
3. Entries from exchanges other than NASDAQ were removed. NASDAQ was preferred above other
exchanges as it contains the most datapoints. Therefore a subset was created from the dataset
that only contains NASDAQ data. While, in theory, different markets should have no significant
differences in prices, this cannot be assumed in practice. The estimates therefore are more reliable
when we apply the models to prices of one exchange only¹. (P3)
4. Entries outside of the regular trading day (9:30-16:00) were taken out so separate days can be
compared. (P1)
5. Lastly, every timepoint should appear at most one time in the dataset. Hence, the median price
was taken for trades occurring at the same second. (T3)
¹ Excluding exchanges (especially exchanges with different opening hours) does lead to some problems, however,
which will be addressed in Section 5.7.
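The cleaning steps listed above can be sketched in pandas as follows. This is a minimal illustration and not the authors' code: the column names `time`, `PRICE` and `EX`, as well as the exchange codes passed as `nasdaq_codes`, are assumptions (only `CORR` and `COND` are named in the text).

```python
import pandas as pd


def clean_trades(trades: pd.DataFrame, nasdaq_codes=("T", "Q")) -> pd.DataFrame:
    """Apply steps T1, T2, P3, P1 and T3 to a table of trades.

    Hypothetical columns: 'time' (datetime64), 'PRICE', 'CORR', 'COND', 'EX'.
    """
    df = trades.copy()
    # T1: drop corrected trades (CORR != 0).
    df = df[df["CORR"] == 0]
    # T2: keep trades without a sale condition, or with condition E or F.
    cond = df["COND"].astype("string").fillna("").str.strip()
    df = df[cond.isin(["", "E", "F"])]
    # P3: keep one exchange only (the exchange codes are an assumption).
    df = df[df["EX"].isin(list(nasdaq_codes))]
    # P1: keep the regular trading day, 9:30-16:00.
    seconds = (df["time"].dt.hour * 3600 + df["time"].dt.minute * 60
               + df["time"].dt.second)
    df = df[(seconds >= 9 * 3600 + 30 * 60) & (seconds <= 16 * 3600)]
    # T3: one observation per second, using the median price.
    df = df.assign(time=df["time"].dt.floor("s"))
    return df.groupby("time", as_index=False)["PRICE"].median()
```

The filters are applied in sequence, so the number of rows removed by each step can be recorded by comparing `len(df)` before and after each line.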
Step T4, which ensures the data are smooth, was not included. Barndorff-Nielsen et al. (2009) use
daily spreads for this step, which is not possible in our case as our dataset does not contain quote prices.
An alternative smoothing method in the absence of quote data could be to remove prices too far from the
median price calculated over their neighbourhood (using a rolling window of, for example, 50 data points).
However, this would be very time consuming and is not crucial for a good analysis of this type of data,
as Barndorff-Nielsen et al. (2009) show that the number of adjusted entries in step T4 is negligible.
For this reason, smoothing is not included in the data cleaning process. Ultimately, data cleaning reduced
the number of data points from about 70 million to somewhat more than 4 million. The exact numbers
of entries removed in each cleaning step are given in Table 8 in the Appendix. The cleaned data are
separated into a test sample and a validation sample. This is important for the remainder of the paper.
The test sample comprises the first five years and is used to estimate the parameter values of the
models. The last three years belong to the validation sample, which is used to test the performance of the
models that have been built on the test sample.
3 Descriptive Statistics
When performing an econometric analysis of any kind, one should start with a general statistical analysis
of the data. This gives insight into the properties of the data and helps in understanding how to model and
interpret the volatility estimates. In this section, some of these descriptive statistics are reported.
3.1 Full dataset
Looking at the plot of the price development between 2007 and 2015, two large price drops stand out.
The shock in 2008 can be explained by the worldwide decrease of oil prices that affected all oil companies.
The drop in 2010, however, is specific to BP and therefore reflects idiosyncratic risk. A straightforward
hypothesis is that this shock was caused by the disaster on the 20th of April of that year on the oil platform
'Deepwater Horizon', leased by BP. Within a couple of weeks after the accident, BP lost half of its
market value.
Figure 1: The price development of BP shares between 2007 and 2015 shows two large shocks.
The plot of returns over the same time period in Figure 5 of the Appendix also shows consecutive peaks
during the same periods. Hence, returns do not seem to consist entirely of random noise. This gives
reason to study models that allow for volatility clustering.
Table 1 reports descriptive statistics of the data. The results of the Jarque-Bera test can be found in
Table 11 in Appendix B.2. From these results, one clearly concludes that the data have a non-normal
distribution. The same holds for returns: the tails are fat and the distribution is skewed.
Statistic    Value
nr. Obs.     4,104,486
mean         48.4485
std. dev.    11.0029
var.         121.0631
min.         26.75
1st Q        41.15
med.         45.34
3rd Q        54.32
max.         79.77
kurt.        -0.1014
skew.        0.8013
Table 1: Summary of descriptive statistics of BP prices between 2007 and 2015.
3.2 The year 2010
Because the disaster had such an impact on prices and returns, it is worthwhile to look more closely at
the year 2010. Descriptive statistics are supplied in Table 10. Compared to other years, returns in 2010 had
higher kurtosis and were more negatively skewed. This indicates that 2010 was indeed an unstable year.
For this reason it is interesting to investigate how well the different models are able to forecast future
volatility. A striking fact is that it took a while before the market started to react to the oil spill.
In the first ten days after the accident, the price and returns remained fairly stable. This is shown
graphically in Figures 6a and 6b of Appendix B.2. This price behaviour is peculiar, as there does not seem
to have been a lack of information about the disaster; hence, it is an interesting topic for future research.
4 Realized Measures
4.1 Different Measures
With the rise of high frequency financial data, it became possible to accurately estimate the realized
volatility within a day using that day's trading prices. Algorithms built for this task are called Realized
Measures of volatility (RMs), and exist in many different forms. The simplest one is called Realized
Volatility (RV), while the Realized Kernel (RK) is arguably the most sophisticated RM.
Every RM is an approximation of the Integrated Variance (IV) of the return process. This IV can be seen
as the sum of infinitesimal time-varying variances over a given time period. Appendix C gives an overview
of the formulae each RM uses, as well as an overview of the biases that affect each RM. Since
the Two Scale Realized Volatility (TSRV) and the Realized Kernel correct the most for microstructure noise,
they likely produce less biased approximations of the IV.
4.2 Empirical Comparison
We apply each realized measure to our data, obtaining a path for the RV, the Bipower Variation (BPV),
the TSRV, and the RK. We compute the RV and BPV using 5 minute returns within each day. For the TSRV,
we choose the number of grids such that on average the returns calculated over the grid will be 5 minute
returns (K, the number of grids, is calculated as the number of transactions in the full dataset divided by
the number of 5 minute intervals in the dataset, rounded up to the nearest integer). Finally, the RK is
calculated using q = 25 in the equation for ω, using 20 minute returns in the RV_sparse equation, and using
the Parzen kernel. Since these are measures of the Integrated Variance, it is natural to compare them
graphically by plotting the square root of each RM against the open-to-close returns on BP shares. In the
figure below this is done for the year 2010. Figure 7 at the end of Appendix C shows the full sample period.
Figure 2: Realized Measure paths over 2009-2010.
From this figure one can observe that the RMs all decently approximate the stock’s volatility. The
measures all seem to be fairly close, with only BPV being slightly lower than the other three across this
time period. Looking at the full sample, however, the RV seems much more prone to price jumps than
the other measures, and hence shows some strong jumps in volatility as well.
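As a rough illustration of the two simplest measures compared in this section, the sketch below computes the RV and BPV for one day from a vector of 5 minute returns. It uses the standard textbook definitions (sum of squared returns, and π/2 times the sum of products of adjacent absolute returns); the exact formulae used in the paper are given in Appendix C and may differ in small-sample scaling, and the function names are hypothetical.

```python
import numpy as np


def realized_variance(returns):
    """Realized variance: sum of squared intraday returns."""
    r = np.asarray(returns, dtype=float)
    return float(np.sum(r ** 2))


def bipower_variation(returns):
    """Bipower variation: (pi / 2) * sum of |r_i| * |r_{i-1}|."""
    a = np.abs(np.asarray(returns, dtype=float))
    return float((np.pi / 2.0) * np.sum(a[1:] * a[:-1]))
```

Given a vector of prices sampled every 5 minutes within a day, the input could be obtained as `np.diff(np.log(prices))`. Because the BPV multiplies adjacent returns, a single large jump enters it only through products with its (typically small) neighbours, which is consistent with the RV reacting more strongly to price jumps.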
Besides this, we can compare these RMs based on their Mean Squared Difference (MSD), which is defined
by equation (1).
$$MSD_{i,j} = \sum_{t=0}^{T} (RM_{t,i} - RM_{t,j})^2 \tag{1}$$
Here, i and j denote the individual RMs, and t is an index for each day. Table 2 shows the MSD between
each pair of Realized Measures.
        RV      BPV     TSRV    RK
RV      -       0.5097  1.0779  0.9206
BPV     0.5097  -       1.1728  1.0201
TSRV    1.0779  1.1728  -       0.0370
RK      0.9206  1.0201  0.0370  -
Table 2: Mean Squared Difference between each pair of Realized Measures.
From Table 2 it can be seen that the MSD between the TSRV and RK is very small, indicating that these
RMs largely agree on the Integrated Variance (IV) realizations. Between the RV and BPV, the MSD can
also be considered small, albeit less convincingly so than for the TSRV and RK. This indicates that the two
least sophisticated measures also agree on IV to some extent. The MSD between a less sophisticated and a
more sophisticated measure lies around 1.0, indicating that the less sophisticated realized measures likely
produce biased estimates of IV in our dataset. Hence, given their corrections for microstructure noise,
the TSRV and RK likely produce the best approximations of IV, but it is difficult to say how accurate
these approximations are (as we do not observe the Integrated Variance).
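A pairwise comparison of this kind can be sketched as below. The function follows equation (1) literally, i.e. it sums the squared differences over days; the optional `normalize` flag, which divides by the number of days, is an assumption added for illustration, as are the function and argument names.

```python
import numpy as np


def msd_matrix(measures, normalize=False):
    """Pairwise squared differences between daily realized-measure paths.

    measures: dict mapping a name (e.g. "RV") to a sequence of daily values.
    normalize: if True, divide each sum by the number of days (assumption).
    """
    names = list(measures)
    out = np.zeros((len(names), len(names)))
    for a, name_i in enumerate(names):
        for b, name_j in enumerate(names):
            diff = (np.asarray(measures[name_i], dtype=float)
                    - np.asarray(measures[name_j], dtype=float))
            total = np.sum(diff ** 2)
            out[a, b] = total / diff.size if normalize else total
    return names, out
```

The resulting matrix is symmetric with zeros on the diagonal, matching the layout of Table 2.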
5 Models
According to Cox et al. (1981), there are two approaches to modeling time series with time-varying
parameters: parameter driven models (PDM) and observation driven models (ODM).
ODMs have become popular in the applied statistics and econometrics literature because the time-varying
parameters are perfectly predictable given past information. In this paper various ODMs will be
investigated. One of the advantages of ODMs compared to PDMs is that the likelihood function is available
in closed form. For PDMs this property does not necessarily hold. Here, due to the stochastic properties
of the models, one needs to apply simulation or some other procedure to properly capture them. This makes
empirical work with PDMs more difficult according to Creal et al. (2013).
Three families of models will be investigated: Generalized Autoregressive Conditional Heteroskedasticity
(GARCH) models, Generalized Autoregressive Score (GAS) models, and Realized GARCH models.
In order to calculate the log likelihood functions for these models, three different distributions are used.
These are the normal distribution, the Student-t distribution, and the Generalized Error Distribution
(GED). Details on these distributions and their derivations are provided in Appendix F.
In order to reduce the risk that the estimates converge to a local maximum instead of the global maximum,
the initial values were randomised before estimating. In order to ensure proper functioning of the maximum
likelihood estimation procedure, these initial values are subject to certain constraints, which differ for
each model. For randomising, a uniform distribution is chosen, whose bounds may differ per parameter
and from which the initial value is drawn. This procedure is repeated one hundred
times for each GARCH/GAS model estimate and only 10 times for the Realized GARCH models due to
longer computing times. For each model, the estimate with the best log likelihood value is
retained as the presumed global maximum. These estimates are used
for further forecasting. In this paper, the negative average log likelihood is minimized (since Python has
powerful minimization algorithms). Also, the likelihoods are scaled for easier optimization. In the
remainder of this section, a brief explanation of all the used models is given.
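The randomised-start procedure described above might look roughly as follows for a Gaussian GARCH(1,1). This is a sketch under stated assumptions, not the authors' implementation: the bounds, the choice of `scipy.optimize.minimize` with L-BFGS-B, the initialisation of the filter at the sample variance, and all function names are illustrative. The bounds on α and β are chosen so that α + β < 1 holds automatically.

```python
import numpy as np
from scipy.optimize import minimize


def neg_avg_loglik(params, y):
    """Negative average Gaussian log-likelihood of a GARCH(1,1) filter."""
    omega, alpha, beta = params
    sigma2 = np.var(y)  # initial variance: an assumption of this sketch
    total = 0.0
    for obs in y:
        total += -0.5 * (np.log(2.0 * np.pi) + np.log(sigma2) + obs ** 2 / sigma2)
        sigma2 = omega + alpha * obs ** 2 + beta * sigma2
    return -total / len(y)


def multistart_mle(y, n_starts=100, seed=0):
    """Minimize from uniformly drawn starting values and keep the best result."""
    rng = np.random.default_rng(seed)
    # Hypothetical bounds for (omega, alpha, beta); alpha + beta <= 0.99 < 1.
    bounds = [(1e-6, 1.0), (1e-6, 0.3), (1e-6, 0.69)]
    best = None
    for _ in range(n_starts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        res = minimize(neg_avg_loglik, start, args=(y,),
                       method="L-BFGS-B", bounds=bounds)
        if np.isfinite(res.fun) and (best is None or res.fun < best.fun):
            best = res
    return best
```

Keeping the run with the lowest objective value does not guarantee a global optimum; it only makes a poor local optimum less likely as the number of starts grows.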
5.1 GARCH
In this paper, application of the GARCH has been restricted to the GARCH(1,1) model. Models
with higher lag orders are not considered, both for simplicity and because empirically a GARCH(p,q)
model is not preferred, as it has many parameters that need to be estimated. GARCH models
are often used to capture clusters of volatility, which are a common occurrence in financial data, as
stated by Blasques (2019). This is an important feature in this study as the plots in section 3 clearly
show volatility clustering. The model is used to filter a time-varying volatility $\sigma_t$ from a sequence
of, in our case, daily returns. This model is an extension of the ARCH models of Engle (1982). A GARCH(1,1)
is equivalent to an ARCH(∞), as Bollerslev (1986) explains. Therefore, no further emphasis will be placed on
ARCH models in this paper. The observation equation of the GARCH is as follows:
$$y_t = \sqrt{\sigma_t^2}\,\epsilon_t, \quad \forall t \in \mathbb{Z} \tag{2}$$
Here $y_t = r_t - \mu$ and $\{\epsilon_t\}_{t \in \mathbb{Z}}$ is IID with either a Gaussian, a Student-t or a
GED distribution. $\sigma_t^2$ is the conditional time-varying variance at time t. For all of the GARCH
extensions below, the observation equation follows this same framework. The time-varying volatility is
filtered using the following updating equation of the GARCH(1,1):
$$\sigma_{t+1}^2 = \omega + \alpha y_t^2 + \beta \sigma_t^2, \tag{3}$$
where ω can be seen as the constant, α as the news parameter and β as the memory parameter. To
ensure that the process remains stationary, the restriction needs to be imposed that α + β < 1.
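The GARCH(1,1) updating equation can be turned into a filter with a few lines of code. The sketch below is illustrative only; the function name and the default initialisation at the sample variance are assumptions, since the paper does not state how the recursion is started.

```python
import numpy as np


def garch_filter(y, omega, alpha, beta, sigma2_init=None):
    """Filter conditional variances with sigma2[t+1] = omega + alpha*y[t]**2 + beta*sigma2[t].

    Returns an array of length len(y) + 1; the last entry is the
    one-step-ahead variance after the final observation.
    """
    y = np.asarray(y, dtype=float)
    sigma2 = np.empty(len(y) + 1)
    # Starting value: sample variance unless given (an assumption).
    sigma2[0] = np.var(y) if sigma2_init is None else sigma2_init
    for t in range(len(y)):
        sigma2[t + 1] = omega + alpha * y[t] ** 2 + beta * sigma2[t]
    return sigma2
```

For example, with `omega=0.1`, `alpha=0.2`, `beta=0.7`, a starting variance of 1 and demeaned returns `[1.0, 2.0]`, the filtered variances are 1.0, 1.0 and 1.6.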
5.2 Robust GARCH
When the innovations ($\epsilon_t$) of a model are fat-tailed, it is useful to bound the updating equation
of a GARCH model, which makes the model robust to extreme observations. In this paper, the
updating equation is bounded through the news term multiplied by α, as can be seen in the following
equation:
$$\sigma_{t+1}^2 = \omega + \alpha \frac{y_t^2}{1 + y_t^2} + \beta \sigma_t^2. \tag{4}$$
The news term lies between 0 and 1 and therefore cannot explode to extreme values when $y_t$ becomes
very large. As stated by Blasques (2019), recognizing that the innovations are fat-tailed may be important
for robustness, since the maximum likelihood estimator converges to a pseudo-true parameter that renders
the volatility filter more robust.
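The effect of the bounded news term can be seen in a small numerical sketch. The function name is hypothetical; the update itself follows equation (4).

```python
def robust_garch_update(sigma2_t, y_t, omega, alpha, beta):
    """One step of the robust GARCH update in equation (4)."""
    news = y_t ** 2 / (1.0 + y_t ** 2)  # bounded: always in [0, 1)
    return omega + alpha * news + beta * sigma2_t
```

Whatever the size of the return, the next variance can never exceed ω + α + βσ²: with `omega=0.1`, `alpha=0.2`, `beta=0.7` and a current variance of 1, a return of 1 gives 0.9 and a return of 1000 still gives a value just below 1.0, whereas the standard GARCH update in equation (3) would grow with the squared return.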
5.3 Robust GARCH with leverage
Several studies, such as that of Nelson (1991), have shown that negative returns have more impact on
volatility than positive returns. To account for this phenomenon, the robust GARCH with leverage
model (RGARCH-LEV) can be used. This model both robustifies the updating equation by means
of ρ and λ and accounts for the leverage effect through the δ parameter. This results in the
updating equation given in equation (5). The observation equation is given by $y_t = \sigma_t \epsilon_t$,
with $\{\epsilon_t\}_{t \in \mathbb{Z}} \sim \mathrm{TID}(\lambda)$.
$$\sigma_t^2 = \omega + \alpha \frac{y_{t-1}^2 + \delta y_{t-1}}{1 + (\rho/\lambda) y_{t-1}^2} + \beta \sigma_{t-1}^2 \tag{5}$$
5.4 GJR-GARCH
To deal with asymmetry, the GJR-GARCH introduces a dummy variable $\mathbf{1}_{\epsilon_t < 0}$ in the
updating equation. If $\epsilon_t < 0$, the dummy is equal to one; otherwise it is equal to zero. This dummy
is useful to deal with the fact that positive and negative innovations have different impacts on conditional
volatility, as Glosten et al. (1993) found. In this model the updating equation is:
$$\sigma_{t+1}^2 = \omega + \alpha \epsilon_t^2 + \gamma \epsilon_t^2 \cdot \mathbf{1}_{\epsilon_t < 0} + \beta \sigma_t^2 \tag{6}$$
For the GJR-GARCH model to be stationary, the constraint $\alpha + \beta + \frac{\gamma}{2} < 1$ needs to hold. The
observation equation is given by $y_t = \sigma_t \epsilon_t$. In this paper multiple different distributions
are investigated for $\{\epsilon_t\}_{t \in \mathbb{Z}}$.
5.5 NAGARCH
In the Nonlinear Asymmetric Generalized Autoregressive Conditional Heteroskedastic model (NAGARCH),
the response to extreme news is reduced according to Engle & Ng (1993). This is a useful property to
deal with outliers. The updating equation of this model is as follows:
$$\sigma_{t+1}^2 = \omega + \beta \sigma_t^2 + \alpha \left(\epsilon_t + \gamma \sqrt{\sigma_t^2}\right)^2, \quad \forall t \in \mathbb{Z} \tag{7}$$
The news impact curve of the model described above is symmetric and centered at
$\epsilon_t = -\gamma \sqrt{\sigma_t^2}$. For this model to be stationary, $\alpha(1 + \gamma^2) + \beta$ needs to
be smaller than 1. Once again, the observation equation remains the same as above and the model is
investigated using multiple error distributions.
5.6 GAS
The next model group investigated in this paper is the group of GAS models. Once again,
the observation equation remains the same as for the models described above. The updating equation is
given by:
$$\sigma_{t+1}^2 = \omega + B \sigma_t^2 + A s_t, \qquad s_t = S_t \Delta_t \tag{8}$$
Here $S_t$ is the scaling matrix and $\Delta_t$ is the score. A more detailed description of the workings of
the GAS model is given in Appendix E. For the GAS model only two error distributions are investigated
instead of three: the Student-t and the GED. The reason for this different approach is
that a GAS(1,1) in which $\epsilon_t$ is distributed as a Normal with mean zero and variance one reduces
to a GARCH(1,1), as is shown in Creal et al. (2013). For this reason no further attention is given to a
Gaussian distributed GAS.
5.7 Realized GARCH Log-Linear
The realized GARCH model of Hansen et al. (2012) is an extension of standard GARCH and/or GAS models,
which makes use of one or more Realized Measures as an extra variable. These models hence try to
model the joint density of returns and the Realized Measure, driven by the latent volatility process and
a set of parameters. The idea behind the use of RMs as additional data is that through these measures
information based on high frequency returns can be added to the GARCH model. This will likely
improve volatility forecasts.
As discussed before, the Realized Kernel is the most sophisticated RM, and hence we use this measure
in our Realized GARCH models. As we found, the TSRV was very close to the RK and could therefore
also have been used, whereas the other two RMs likely suffer from bias.
One problem with using an RM as an additional variable in modeling close-to-close returns is that the
RMs discussed in this paper are all measures of open-to-close volatility. Hence, we must apply a
correction term to our Realized Kernel: this term is given as
$(\sigma_{O2C}^2 + \sigma_{C2O}^2)/\sigma_{O2C}^2$.
Our correction term has a value of 2.274, which is relatively high for such a correction. An explanation
could be that BP's stock is traded worldwide, and that hence close-to-open returns (and their volatility)
can be substantial. This is a large drawback of using data from one exchange only (but using all available
data may present even more problems, such as exchange-specific noise).
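The correction term can be illustrated with a short sketch. How the two variances are estimated is not stated in this section; using the sample variances of the daily open-to-close and close-to-open (overnight) returns is an assumption, as is the function name.

```python
import numpy as np


def close_to_close_correction(open_to_close, close_to_open):
    """Scaling factor (var_O2C + var_C2O) / var_O2C for an open-to-close RM.

    Both inputs are daily return series; sample variances are an assumption.
    """
    var_o2c = np.var(np.asarray(open_to_close, dtype=float))
    var_c2o = np.var(np.asarray(close_to_open, dtype=float))
    return float((var_o2c + var_c2o) / var_o2c)
```

Multiplying each daily Realized Kernel value by this factor gives the corrected RK; a factor of 1 would mean that overnight returns carry no variance at all.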
The realized GARCH model admits the following representation:
$$\begin{aligned}
r_t &= \sqrt{h_t}\,\epsilon_t \\
\log h_t &= \omega + \beta \log h_{t-1} + \alpha \log x_{t-1} \\
\log x_t &= \xi + \phi \log h_t + \tau(\epsilon_t) + u_t
\end{aligned} \tag{9}$$
where $\epsilon_t$ and $u_t$ are white noise with some distribution (again, we consider Gaussian, Student-t,
and GED), $x_t$ is the RM for time t, and $\tau(\cdot)$ is the function
$\tau(\epsilon_t) = \tau_1 \epsilon_t + \tau_2 (\epsilon_t^2 - 1)$. A more thorough
description of the realized GARCH model can be found in Appendix D.
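The variance recursion of the log-linear realized GARCH can be sketched as follows. Only the first two lines of the representation (returns and the log-variance recursion) are implemented; the measurement equation for the RM is left out, and the function name and the starting value `h0` are assumptions of this sketch.

```python
import numpy as np


def realized_garch_filter(r, x, omega, beta, alpha, h0):
    """Filter h_t via log h_t = omega + beta*log h_{t-1} + alpha*log x_{t-1}.

    r: daily returns, x: daily realized measure (same length), h0: starting
    variance (an assumption). Returns the variances h and the standardized
    residuals eps = r / sqrt(h).
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    log_h = np.empty(len(r))
    log_h[0] = np.log(h0)
    for t in range(1, len(r)):
        log_h[t] = omega + beta * log_h[t - 1] + alpha * np.log(x[t - 1])
    h = np.exp(log_h)
    eps = r / np.sqrt(h)
    return h, eps
```

Working in logs keeps the filtered variance positive for any parameter values, which is one reason the log-linear form is convenient for unrestricted optimization.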
Since the realized GARCH models have a large number of parameters, in this paper optimization is
done over an unrestricted parameter set (θ), which is a transformation of the original parameter set.
The transformations and their inverses can be found in the Appendix. Unfortunately, this choice makes
it impossible to derive standard errors and p-values for the original parameters. Hence, we provide
estimates, SEs, and p-values for the θ-set, and estimates only for the original parameter set.
5.8 Residual error testing
Finally, after all the models are estimated, the residuals of these models are investigated further. For
the results to be valid, the residuals should be independent and identically distributed (IID).
The results of these tests on the error terms of Set A are reported in section G.1.3 of the Appendix. For
the results of the tests on Set B, see section G.3.1 of the Appendix. In addition,
the moments of the residuals are obtained: the mean, the standard deviation, and the shape parameter of the
residuals are given. The shape is only reported if it exists for the given distribution.
Two tests are performed on the residuals. Firstly, the Kolmogorov-Smirnov (KS) test is performed. This
is a nonparametric test of whether there is a difference between the cumulative distribution
function of the reference distribution (i.e. Gaussian, Student-t or GED in this paper) and the
empirical distribution function of the residuals. In the KS test the null hypothesis states that the
empirical distribution is equal to the reference distribution, Smirnov & Smirnov (1939).
Secondly, the error terms undergo a Ljung-Box (LB) test, which tests for the
absence of serial correlation at the first lag. Here, the null hypothesis is zero autocorrelation
at the first lag. The absence of serial correlation is a good
property for the residuals to have because it is consistent with independently distributed errors. This test
is performed on both the residuals and the squared residuals. For these tests a 5% significance level is
adopted.
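Both diagnostics can be written down compactly. The sketch below is illustrative rather than the authors' code: it implements the Ljung-Box statistic for a single lag, with the p-value taken from the chi-squared distribution with one degree of freedom, and the KS distance against a standard normal reference only. For Student-t or GED references the corresponding CDF would be needed (for instance via `scipy.stats.kstest`); the function names are hypothetical.

```python
import math
import numpy as np


def ljung_box_lag1(x):
    """Ljung-Box Q statistic and p-value for autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    rho1 = np.sum(xc[1:] * xc[:-1]) / np.sum(xc ** 2)
    q = n * (n + 2) * rho1 ** 2 / (n - 1)
    # Under the null, Q is asymptotically chi-squared with 1 degree of
    # freedom, whose survival function equals erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q / 2.0))
    return float(q), p_value


def ks_statistic_normal(x):
    """KS distance between the empirical CDF and the standard normal CDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x])
    upper = np.arange(1, n + 1) / n - cdf
    lower = cdf - np.arange(0, n) / n
    return float(max(upper.max(), lower.max()))
```

Applied to the standardized residuals and to their squares, a Ljung-Box p-value below 0.05 would lead to rejection of the null of no first-order autocorrelation at the 5% level used here.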
6 Forecasting
This section describes the forecasting method used in this paper.
In econometrics there are two distinct reasons to create a model. One can focus on the structure of
the model, which is often done in policy analysis; here the effect of certain parameters in the model can
be analysed. Another goal is to forecast with the obtained estimates. Forecasting models are not
necessarily good for structural analysis and the other way around, Blasques (2019). In this paper, the
focus will be on forecasting, not on structural analysis. This section also explains the different measures
used to compare the forecast accuracy of the different models.
6.1 Forecasting methodology
Maximum likelihood estimation is used to estimate the parameters of each model; the different error
distributions are explained in Appendix F. The data sample is split up into two sets, namely a test set
and a validation set. The parameters are estimated on the test set, and the parameter estimates are used
to forecast the volatility of the validation set. The parameter estimates can be found in Appendix G in
Table 12 and Table 34. The parameter estimates are used for the one-step-ahead forecasts, which are
compared to the realized kernel volatility values. The one-step-ahead forecast is as follows:
$$\hat{\sigma}_{t+h+1}^2 = E(\sigma_{t+h+1}^2 \mid \sigma_{t+h}^2, F_{t+h}) \tag{10}$$
where $F_{t+h}$ denotes all past available information at time t + h.
In this paper the models are not re-estimated for each one-step-ahead forecast, as this is computationally
very time-consuming. However, re-estimating would be more appropriate for comparing the accuracy
of the models, as this would also take into account the flexibility of the models with respect to
parameter changes in the underlying return process.
6.2 Forecasting accuracy
6.2.1 Forecast Mean Squared Error and Forecast Mean Absolute Error
The Mean Squared Error is a forecast accuracy method that tests the quality of the estimator σ̂t2 . The
forecast errors are used to calculate the FMSE. The formula is as follows:
T
1X 2
F\
M SE(σ̂t2 ) =
ê
T t=1 t
(11)
The Forecast Mean Absolute Error is another forecast accuracy method. It punishes large deviations less
than the FMSE. The formula for the FMAE is as follows:
$$\widehat{FMAE}(\hat{\sigma}_t^2) = \frac{1}{T}\sum_{t=1}^{T} |\hat{e}_t| \tag{12}$$
The forecasts with the lowest FMSE and FMAE are the ones that predict best with respect to some
observed variable. The problem encountered here is that volatility is a latent variable, so no error can be
calculated directly with respect to the unobserved time-varying volatility. Hence, we use the corrected
Realized Kernel (see Section 5.7) as an approximation of the close-to-close Integrated Variance across
each day, and calculate the FMSE and FMAE using the corrected RK as the ”observed” variance.
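Equations (11) and (12) can be illustrated with a short sketch. It assumes that the forecast error is the proxy (here standing in for the corrected RK) minus the forecast; the function names and the numbers are hypothetical.

```python
def fmse(forecasts, proxy):
    """Forecast Mean Squared Error: mean of squared forecast errors."""
    errors = [p - f for f, p in zip(forecasts, proxy)]
    return sum(e * e for e in errors) / len(errors)


def fmae(forecasts, proxy):
    """Forecast Mean Absolute Error: mean of absolute forecast errors."""
    errors = [p - f for f, p in zip(forecasts, proxy)]
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical variance forecasts and volatility proxy values.
example_forecasts = [1.0, 2.0, 4.0]
example_proxy = [1.5, 2.0, 2.0]

example_fmse = fmse(example_forecasts, example_proxy)  # (0.25 + 0 + 4) / 3
example_fmae = fmae(example_forecasts, example_proxy)  # (0.5 + 0 + 2) / 3
```

The single large error of 2 dominates the FMSE but contributes only linearly to the FMAE, which is the sense in which the FMAE punishes large deviations less.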
6.2.2 Diebold-Mariano Test
The forecasting performance of all competing models is compared using the method described
by Diebold & Mariano (2002), the Diebold-Mariano (DM) test. The null hypothesis of this test is that
there is no difference between the forecast accuracy of two models. The DM statistic is the standardized
mean difference between the losses of the two models under some loss function, such as those described
above. It is given in equation (13), where $d_t$ is the difference at time $t$ between the losses of the
two models (for example their squared or absolute forecast errors, see section 6.2.1).
$$DM = \sqrt{T}\,\frac{\bar{d}}{\hat{\sigma}_d}, \qquad \bar{d} = \frac{1}{T}\sum_{t=1}^{T} d_t \tag{13}$$
As stated in Blasques (2019), $\hat{\sigma}_d$ is a consistent estimator of the standard deviation of $d_t$.
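A minimal sketch of the DM statistic in equation (13) is given below. It assumes that the sample standard deviation of the loss differences is an adequate estimator (no correction for serial correlation) and that the statistic is compared to a standard normal distribution; the function name `dm_test` and the loss series are hypothetical.

```python
import math


def dm_test(loss_i, loss_j):
    """Diebold-Mariano statistic and two-sided normal p-value.

    loss_i and loss_j hold the per-period losses (e.g. squared or
    absolute forecast errors) of models i and j on the same days.
    """
    d = [a - b for a, b in zip(loss_i, loss_j)]
    T = len(d)
    d_bar = sum(d) / T
    # Sample variance of the loss differences (assumed serially uncorrelated).
    var_d = sum((x - d_bar) ** 2 for x in d) / (T - 1)
    dm = math.sqrt(T) * d_bar / math.sqrt(var_d)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|dm|)).
    p_value = math.erfc(abs(dm) / math.sqrt(2))
    return dm, p_value


# Hypothetical loss series for two models.
dm_stat, dm_p = dm_test([1.0, 2.0, 3.0, 4.0], [0.5, 2.5, 2.0, 3.0])
```

With these inputs a positive statistic means model i has the larger average loss; whether the difference is significant is read from the p-value.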
This version of the DM test uses the FMAE or FMSE, which treats the corrected Realized Kernel
as an accurate approximation of the true, latent volatility. Note, however, that we do not know how
good an approximation of open-to-close IV the RK produces, and that the correction term we used is
a rather crude (yet practical) way to obtain a close-to-close measure. It is thus necessary to also consider
forecasting accuracy measures other than the FMAE and FMSE.
Hence, our second version of the Diebold-Mariano test uses a log-likelihood based scoring rule:
$d_t = LL_{t,i} - LL_{t,j}$ (with $LL_{t,i}$ being the log-likelihood contribution of the data at time $t$ in model $i$). The
main advantage of this approach is that it uses the log-likelihood instead of an arbitrary target variable
like the corrected RK. Since maximizing the likelihood function is equivalent to minimizing the
Kullback-Leibler divergence between the model’s distribution and the true distribution of the data (Akaike (1998)),
the log-likelihood contributions can be seen as a measure of the distance between each data point and
the true distribution. Hence, comparing models based on a log-likelihood scoring rule is a natural way to
compare the accuracy of forecasted distributions of returns.
Note that we can only compare likelihoods that are based on the same data. Hence, to be able to compare
the GARCH and GAS models with the realized GARCH models, we have to use the partial log-likelihood
of the realized GARCH models, which gives us the log-likelihood of seeing a certain return (not taking the
likelihood of the Realized Measure into account).
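The log-likelihood scoring rule can be sketched as below. For simplicity this illustration assumes zero-mean Gaussian returns, whereas the paper also uses Student's-t and GED densities; the function names and numbers are hypothetical. Only the return density enters, which mirrors the idea of the partial log-likelihood.

```python
import math


def gaussian_loglik_contributions(returns, variance_forecasts):
    """Per-day log-likelihood of each return under N(0, forecast variance)."""
    return [
        -0.5 * (math.log(2.0 * math.pi * s2) + r * r / s2)
        for r, s2 in zip(returns, variance_forecasts)
    ]


def loglik_differences(ll_i, ll_j):
    """d_t = LL_{t,i} - LL_{t,j}, the input series for the DM test."""
    return [a - b for a, b in zip(ll_i, ll_j)]


# Hypothetical returns and variance forecasts of two models.
rets = [0.0, 1.0]
ll_model_i = gaussian_loglik_contributions(rets, [1.0, 1.0])
ll_model_j = gaussian_loglik_contributions(rets, [1.0, 4.0])
d_series = loglik_differences(ll_model_i, ll_model_j)
```

A positive `d_series` entry means model i assigned a higher density to that day's return than model j.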
The results from both of these methods will be discussed in the next section.
7 Results
This part of the paper describes the results of two different scenarios, each with a different split of the
full dataset into a test and a validation part.
In scenario A, the data during the oil spill disaster are part of the test sample. In fact, all data up to
and including 2011 are used as test sample, consisting of 1260 trading days. In this way, the parameter
estimates are based on many data points, which should improve the quality of the forecasts. The validation
sample in this scenario is all the data from 2012 up to and including 2014 and includes 754 trading days.
For scenario B, the test set is shortened. It includes all the data from 2007 up to and including
2010/04/19, the day before the oil spill happened. The entire test sample then includes 801 trading days.
The validation set in scenario B includes 252 trading days, which is one year of trading data. These 252
data points are highly volatile: they include the initial price shock caused by the Deep Water Horizon
disaster and the short-term (1 year) aftermath. These different test sets are used to estimate the
parameters of the different models examined in this paper. The estimated parameter values
can be found in Appendix G in Tables 12, 14, 34, and 36. These estimates are
then used to forecast the future volatility of the close-to-close returns. The forecast errors of the models
are discussed in the sections below.
7.1 Forecasting results Set A
As explained in section 6, this paper discusses three forecast accuracy measures, namely the Forecast
Mean Squared Error, the Forecast Mean Absolute Error and the log-likelihoods of the validation set.
Table 3b gives an overview of the FMSE, FMAE and likelihoods of the different models. In
Table 3b, likelihood values with an ∗ indicate that the value is a partial log-likelihood instead
of a regular log-likelihood; the same holds for the values with an ∗ in Table 5. Partial log-likelihood means
that for these values only the non-kernel part of the likelihood is used. Table 3b shows that, based on
the FMSE and the FMAE, all the Realized models
(i.e. Realized-GARCH, Realized-GARCH-T, and Realized-GARCH-GED) perform far better than all
the other models. Of these three models, the Realized-GARCH-GED has the smallest error
in terms of FMSE and FMAE. Furthermore, it can be observed that the RGARCH-LEV has the highest
FMSE and FMAE.
From Table 3a it can be observed that the Realized GARCH-T has the highest partial log-likelihood and
lowest AIC. Here, the R-GARCH-LEV model actually performs well, beating all other GARCH/GAS
models in terms of AIC. As the R-GARCH-LEV model performs very well in-sample but poorly
out-of-sample, this could indicate overfitting.
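For reference, the AIC combines the in-sample log-likelihood with a penalty for the number of parameters. The sketch below uses the standard definition, AIC = 2k − 2 log L; the parameter count k = 3 for the plain GARCH model is an assumption (it is not stated in this section), chosen because it reproduces the Table 3a value up to rounding.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# In-sample log-likelihood of the GARCH model for set A (Table 3a),
# with an assumed parameter count of 3.
garch_aic = aic(-2561.646, 3)
```

Because the penalty grows with k, a model with more parameters needs a sufficiently higher log-likelihood to obtain a lower AIC.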
Table 4 shows the Diebold-Mariano test scores. The Diebold-Mariano test score of a model is the
number of other models it outperforms, counting only comparisons in which the difference is significant
at the 5% level. Tables 15, 16 and 17 show the p-values of all Diebold-Mariano tests. Whenever the p-value
is smaller than the 5% significance level, the two models can be said to differ significantly from
each other. For the Diebold-Mariano test based on the FMAE, Table 4 shows that the Realized-GARCH-GED
model outperforms most other models. For the Diebold-Mariano test based on the FMSE, the
different Realized models (i.e. Realized GARCH, Realized GARCH-T and Realized
GARCH-GED) outperform most of the other models. Looking at Table ??, it can also be observed that
the Diebold-Mariano test mostly finds no significant difference between the different
Realized models. Therefore one can conclude more generally that the Realized
models outperform the other models. Finally, a Diebold-Mariano test based on the likelihood was also
conducted. Table 4 shows that the Realized GARCH-T outperforms all other models. Interestingly,
the Realized GARCH only outperforms the RGARCH model, indicating that a fat-tailed
distribution is necessary for modelling the returns of BP’s stock.
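The scoring described above can be sketched as a simple count over all pairwise tests. The sign convention is an assumption: here `dm_stats[i][j]` is the statistic for the loss of model i minus the loss of model j, so a negative value favours model i (for the likelihood-based test the sign would be reversed). The function name and the example matrices are hypothetical.

```python
def dm_scores(dm_stats, p_values, alpha=0.05):
    """Count, for each model, how many other models it significantly beats."""
    n = len(dm_stats)
    scores = [0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            significant = p_values[i][j] < alpha
            model_i_better = dm_stats[i][j] < 0  # lower loss for model i
            if significant and model_i_better:
                scores[i] += 1
    return scores


# Hypothetical pairwise results for three models.
stats = [[0.0, -2.5, -0.5],
         [2.5, 0.0, 3.0],
         [0.5, -3.0, 0.0]]
pvals = [[1.0, 0.012, 0.62],
         [0.012, 1.0, 0.003],
         [0.62, 0.003, 1.0]]
scores = dm_scores(stats, pvals)
```

With 16 models the maximum attainable score is 15, which is the value a model receives when it significantly beats every competitor.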
In section G.1.3 in Appendix G, one can find the results of the different tests run
on the residuals of each model for data set A. The Ljung-Box test on the
residuals shows that none of the models suffer from autocorrelation. For the squared residuals,
apart from the GJR-GARCH-GED, the GARCH-GED and the GARCH models,
none of the models show autocorrelation. The tables in section G.1.3 in Appendix G also
give the results of the Kolmogorov-Smirnov test (KS-test). The null hypothesis of the KS-test is rejected
for all models in this paper. This indicates that the forecasting residuals are not distributed as expected,
which suggests that the model forecasts might not have the distributions that are assumed in this
paper.
Model                 Likelihood    AIC
GARCH                 -2561.646     5129.291
GARCH-T               -2540.090     5088.181
GARCH-GED             -2544.254     5096.508
RGARCH                -2718.208     5440.416
RGARCH-LEV            -2532.571     5077.141
GJR-GARCH             -2553.087     5114.174
GJR-GARCH-T           -2534.381     5078.761
GJR-GARCH-GED         -2538.021     5086.043
NAGARCH               -2561.646     5131.292
NAGARCH-T             -2540.090     5090.181
NAGARCH-GED           -2544.254     5098.508
GAS-T                 -2563.700     5135.391
GAS-GED               -2546.030     5100.053
Realized GARCH        -2521.525     5059.050
Realized GARCH-T      -2508.555     5037.111
Realized GARCH-GED    -2512.642     5045.284
(a) In sample results, Likelihood, AIC

Model                 FMSE     FMAE     Likelihood
GARCH                 5.032    1.267    -1211.196
GARCH-T               4.772    1.245    -1178.383
GARCH-GED             4.871    1.252    -1181.057
RGARCH                9.130    2.323    -1289.287
RGARCH-LEV            5.089    1.364    -1186.480
GJR-GARCH             5.170    1.308    -1210.783
GJR-GARCH-T           4.950    1.292    -1180.560
GJR-GARCH-GED         5.044    1.299    -1182.678
NAGARCH               5.033    1.267    -1211.199
NAGARCH-T             4.772    1.245    -1178.424
NAGARCH-GED           4.871    1.252    -1181.055
GAS-T                 5.271    1.310    -1167.652
GAS-GED               5.049    1.238    -1176.430
Realized GARCH        4.527    1.030    -1182.124∗
Realized GARCH-T      4.495    1.025    -1141.504∗
Realized GARCH-GED    4.391    1.007    -1154.145∗
(b) Out of sample results, FMSE, FMAE, Likelihood.
* indicates partial log-likelihood
Table 3: Set A, Results.

Model                 FMAE    FMSE    Likelihood
GARCH                 6       3       2
GARCH-T               7       7       6
GARCH-GED             5       3       4
RGARCH                0       0       0
RGARCH-LEV            1       1       4
GJR-GARCH             2       1       1
GJR-GARCH-T           4       6       5
GJR-GARCH-GED         3       2       4
NAGARCH               5       2       1
NAGARCH-T             6       6       5
NAGARCH-GED           6       4       5
GAS-T                 12      10      12
GAS-GED               11      1       4
Realized GARCH        13      13      1
Realized GARCH-T      13      13      15
Realized GARCH-GED    14      13      12
Table 4: Diebold-Mariano test scores, set A.
7.2 Forecasting results Set B
We used the same three forecast accuracy measures for scenario B as for scenario A. Table 5b shows
the FMSE, the FMAE and the likelihood of encountering the returns of the validation set under our estimated
models. Table 6 gives the Diebold-Mariano scores. The Diebold-Mariano score
is the number of times a model outperforms another model, conditional on the forecast
errors or (partial) likelihoods being significantly different at the 5% level. Tables 37, 38, and 39
contain the Diebold-Mariano test p-values for set B.
In Table 6 it can be seen that, when the FMAE is used as the forecast error measure, the Realized
GARCH models outperform most of the other models, followed by the GAS-T and the GAS-GED models.
A closer look at Table 37 and Table 5b shows that the realized GARCH models
outperform all others, but that there is no significant difference in FMAE among the three realized
GARCH models.
If one looks at the Diebold-Mariano test score for the FMSE instead of the FMAE in Table 6, the
GAS-GED model is the clear winner. However, Table 39 shows
that the FMSE of the GAS-GED model is not significantly different from that of any of the GAS, GJR-GARCH
and Realized GARCH models. Furthermore, the FMSEs of most models are only
significantly different from the FMSEs of the RGARCH models. Hence, it might be concluded that the
GAS-GED model is the only model that, in terms of the FMSE, outperforms all the other models
and that the RGARCH models are worse than all other models.
In Table 6 it can also be seen that, when the log-likelihood is used as measure of the forecasting
performance, there is no model that is a clear winner. The models that have a GED or
a Student’s-t distribution perform slightly better. A closer look at Table 37 shows
that there is no significant difference between the log-likelihoods of any of the models with a GED
or a Student’s-t distribution. Furthermore, in Table 34, it can be seen that all the models with a GED
distribution have a kurtosis between 0 and 2. We can therefore conclude that models that allow for fat
tails perform the best in terms of the log-likelihood of forecasting the returns in the validation set.
In Table 5a the (partial) likelihood of seeing the returns of the test set under our estimated models, as
well as the AIC of our models, can be seen. Note that the Realized GARCH-T model outperforms all the
other models in terms of likelihood as well as AIC. However, interestingly, all other realized models are
outperformed by the other models. Hence, just as for set A, in general, the realized GARCH models do not
describe the test set returns correctly. Furthermore, as in set A, the RGARCH is performing well on the
in-sample data, but very poorly on the out-of-sample data. Again, this could indicate overfitting.
In section G.3.1 of the appendix, one can find the results of the tests on the residuals of each
model for data set B. We find that for all models, except for the RGARCH, the residuals are
uncorrelated. Furthermore, all models except the realized GARCH models reject the hypothesis that the
squared residuals are independent. More importantly, however, the KS test is rejected for all models.
That means the residuals of every model are not empirically distributed in the way that we expected
them to be. Therefore, our model forecasts might not have the distributions we suppose they have.
Model                 Likelihood     AIC
GARCH                 -1623.655      3253.309
GARCH-T               -1613.484      3234.967
GARCH-GED             -1615.421      3236.841
RGARCH                -1732.855      3469.710
RGARCH-LEV            -1607.950      3227.89
GJR-GARCH             -1619.776      3247.552
GJR-GARCH-T           -1610.807      3231.616
GJR-GARCH-GED         -1612.539      3235.077
NAGARCH               -1623.655      3255.309
NAGARCH-T             -1610.808      3231.616
NAGARCH-GED           -1612.539      3235.078
GAS-T                 -1633.386      3274.771
GAS-GED               -1616.081      3240.162
Realized GARCH        -1593.853∗     3203.705
Realized GARCH-T      -1589.544∗     3199.088
Realized GARCH-GED    -1591.419∗     3202.838
(a) In sample, Likelihood, AIC

Model                 FMSE       FMAE     Likelihood
GARCH                 104.739    4.435    -546.978
GARCH-T               105.315    4.457    -538.286
GARCH-GED             104.782    4.434    -539.656
RGARCH                170.160    5.447    -594.873
RGARCH-LEV            231.426    6.391    -539.471
GJR-GARCH             108.246    4.476    -545.003
GJR-GARCH-T           105.675    4.394    -537.278
GJR-GARCH-GED         106.410    4.415    -538.438
NA-GARCH              104.738    4.435    -546.978
NA-GARCH-T            105.317    4.457    -538.278
NA-GARCH-GED          104.784    4.435    -539.656
GAS-T                 100.248    3.823    -539.539
GAS-GED               99.460     4.245    -539.870
Real GARCH            104.626    3.787    -544.702∗
Real GARCH-T          104.587    3.796    -537.016∗
Real GARCH-GED        104.605    3.790    -539.410∗
(b) Out of sample, FMSE, FMAE, Likelihood
* indicates partial log-likelihood
Table 5: Set B, Results

Model                 FMSE    FMAE    Likelihood
GARCH                 2.0     2.0     1.0
GARCH-T               2.0     2.0     3.0
GARCH-GED             2.0     2.0     3.0
RGARCH                0.0     0.0     0.0
RGARCH-LEV            0.0     0.0     1.0
GJR-GARCH             1.0     2.0     1.0
GJR-GARCH-T           1.0     4.0     3.0
GJR-GARCH-GED         1.0     3.0     4.0
NAGARCH               2.0     3.0     1.0
NAGARCH-T             2.0     2.0     3.0
NAGARCH-GED           2.0     2.0     3.0
GAS-T                 2.0     8.0     1.0
GAS-GED               8.0     8.0     3.0
Realized GARCH        2.0     13.0    1.0
Realized GARCH-T      2.0     13.0    2.0
Realized GARCH-GED    2.0     13.0    2.0
Table 6: Diebold-Mariano test scores, set B.
7.3 Visualisation of results of Set A and B
Below in Figures 3 and 4, the filtered volatility of the best three models for Set A and B is depicted.
There are three best models per set because three different criteria are investigated,
namely the scores on the FMSE, the FMAE, and the Likelihood. Figure 3 shows the results of Set A,
where the best models were found to be the GAS-T, Realized GARCH-GED, and Realized GARCH-T, as
can be found in section 7.1. These models are coloured red, yellow, and green respectively.
For Set B, the filtered volatility of the GJR-GARCH-GED, the GAS-GED, and the Realized GARCH-T
models is shown, once again in red, yellow, and green respectively, in Figure 4. The filtered volatilities
are plotted against the kernel (Figures 3a and 4a) and the returns (Figures 3b and 4b). Larger depictions of
these Figures are given in Appendix G.4.
Figure 3a shows that the red line, belonging to the GAS-T, overshoots
the kernel (i.e. the assumed true volatility) quite often. The yellow and green lines (i.e. the Realized
GARCH-GED and Realized GARCH-T) more or less overlap. This is as expected, since these two
models are quite similar, in contrast to the GAS-T. They also follow the kernel much more closely, which
is a good property since it is related to forecasting precision. Table 4 shows that the GAS-T has lower
FMAE and FMSE scores than both of the Realized models, suggesting
that these Realized models are better for forecasting. In Figure 4a one can see that all three
models of Set B seem to follow the kernel quite well. As expected, the peaks are more flattened out
during extreme shocks. This can be observed very clearly around the beginning of the Deep Water
Horizon scandal.
To visualize the likelihood fit of Set A and B, one needs to turn to Figures 3b and 4b. Since the kernel is
a mere approximation of the true volatility, it is not possible to look at the true volatility directly.
Figures 3b and 4b are of interest because the returns are driven by the true volatility, and the likelihood
is also based on the true volatility. The fit to the returns is harder to judge visually and is an interesting
subject for further research.
(a) Results vs Kernel.
(b) Results vs Returns.
Figure 3: Set A, Visualisation of results against Kernel and Return.
(a) Results vs Kernel.
(b) Results vs Returns.
Figure 4: Set B, Visualisation of results against Kernel and Return.
8 Conclusion
When considering the results of the 16 different GARCH, GAS, and realized GARCH models on dataset A,
it is easy to conclude that the realized GARCH models are superior in forecasting the volatility of BP’s stock,
considering all three forecast accuracy measures. In Figure 3 we have visualized the filtered volatility of
the realized GARCH-T and GED, as well as the filtered volatility of the GAS-T model, versus
close-to-close returns and the corrected Realized Kernel (across our validation set). It is easy to see that both
realized GARCH models perform quite well in approximating the RK and in approximating the volatility of
the returns (compared to GAS-T, one of the best performing models of the GARCH/GAS family on
set A). Both realized GARCH-T and GED follow almost exactly the same filtered volatility path.
When dataset B is considered, however, a different conclusion should be drawn. Now the realized GARCH
models reign supreme only when considering the FMAE, while the GAS-GED outperforms them
massively when considering the FMSE. Based on the likelihood it is difficult to choose a “best” model,
since most of them do not show significant differences in daily log-likelihood contributions. This is quite
different from the results obtained using dataset A, where the realized GARCH models were without
doubt the best performing ones. There may be multiple explanations for this, but the most realistic are
that (1) in the ‘Deepwater Horizon’ period the distribution of returns was very complex, and this made it
very difficult for any model to accurately predict the volatility of returns after this event; (2) the Realized
Kernel is not a good approximation of volatility in the ‘Deepwater Horizon’ period, and hence the extra
information it provides in the realized GARCH models does not necessarily lead to a better distributional
fit of returns in this period.
There is at least some evidence for (1), since in the residual tests none of the models passed the Kolmogorov-Smirnov test (neither for set A nor for set B). Hence, we can conclude that there is still some room for improvement
by choosing a more flexible distribution, for example a skewed student-t. Also, it may be interesting to
investigate the possibility of using a time-varying parameter for kurtosis, as this varies heavily across years
(spiking in 2010, the year of the ‘Deepwater Horizon’ crisis; see Table 7). It is more difficult to determine
whether the RK is a good approximation of volatility; however, there are two ways in which our RK could
give a wrong estimate. The first one is the fact that we multiplied our open-to-close approximation with
a fixed number, to correct for close-to-open variance. However, the close-to-open variance seems to vary
over time as well, and hence this fixed number correction could lead to bias. A second possibility is the
presence of “gradual jumps” in the ‘Deepwater Horizon’ period. Barndorff-Nielsen et al. (2009) show that
the Realized Kernel cannot properly deal with this feature in the data, but it is nevertheless very difficult
to determine the presence of gradual jumps, let alone clean them from the data. It would therefore be
very interesting to look into methods for identifying the presence of gradual jumps, and investigate the
possibility of robustifying the Realized Kernel against them.
Year     2007     2008     2009     2010      2011     2012     2013     2014
Kurt.    0.724    5.222    1.310    11.585    0.775    1.415    3.120    3.902
Table 7: Overview of kurtosis across each year (C2C returns)
References
Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. , 199–213.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., & Shephard, N. (2009). Realized kernels in practice:
Trades and quotes. Oxford University Press Oxford, UK.
Barndorff-Nielsen, O. E., & Shephard, N. (2004). Power and bipower variation with stochastic volatility
and jumps. Journal of financial econometrics, 2 (1), 1–37.
Blasques, C. (2019). Advanced econometric measures. canvas, 31 (3), 62–64.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of econometrics,
31 (3), 307–327.
Casella, G., & Berger, R. (2002). Statistical inference. Thomson Learning. Retrieved from
https://books.google.nl/books?id=0x vAAAAMAAJ
Cox, D. R., Gudmundsson, G., Lindgren, G., Bondesson, L., Harsaae, E., Laake, P., . . . Lauritzen,
S. L. (1981). Statistical analysis of time series: Some recent developments [with discussion and reply].
Scandinavian Journal of Statistics, 93–115.
Creal, D., Koopman, S. J., & Lucas, A. (2013). Generalized autoregressive score models with applications.
Journal of Applied Econometrics, 28 (5), 777–795.
Czyżycki, R. (2013). Using ged (generalized error distribution) for modeling distribution of the rates of
return.
Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business & economic
statistics, 20 (1), 134–144.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of united
kingdom inflation. Econometrica: Journal of the Econometric Society, 987–1007.
Engle, R. F., & Ng, V. K. (1993). Measuring and testing the impact of news on volatility. The journal
of finance, 48 (5), 1749–1778.
Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the relation between the expected value
and the volatility of the nominal excess return on stocks. The journal of finance, 48 (5), 1779–1801.
Hansen, P. R., Huang, Z., & Shek, H. H. (2012). Realized garch: a joint model for returns and realized
measures of volatility. Journal of Applied Econometrics, 27 (6), 877–906.
Hoogerheide, L. (n.d.). Some distributions that are used in models for daily stock returns.
Lee, Y. G., Garza-Gomez, X., & Lee, R. M. (2018). Ultimate costs of the disaster: Seven years after the
deepwater horizon oil spill. The Journal of Corporate Accounting & Finance, 29 (1), 69–79.
Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica:
Journal of the Econometric Society, 347–370.
Pallardy, R. (2016). Deepwater horizon oil spill environmental disaster, gulf of mexico [2010]. Encyclopedia
Britannica.
Smirnov, N., & Smirnov, N. (1939). On the estimation of the discrepancy between empirical curves of
distribution for two independent samples.
Zhang, L., Mykland, P. A., & Aït-Sahalia, Y. (2005). A tale of two time scales: Determining integrated
volatility with noisy high-frequency data. Journal of the American Statistical Association, 100 (472),
1394–1411.
Appendices
A Data Cleaning
Table 8 summarizes the process of data cleaning. For each step in the data cleaning process from
Barndorff-Nielsen et al. (2009) it shows how many data points were removed. This is given for every year
in separate columns. Adding the values of all years for each of the 6 steps gives the number
of removed data points per step in the total data set. These totals can be found in the first column
of Table 8.
                                          Test Sample (Set A)                                                Validation Sample (Set A)
Descriptive                  Total         2007         2008         2009         2010          2011          2012         2013         2014
trading days                 2.014         251          253          252          252           252           250          252          252
observations                 72.419.933    3.305.145    7.411.499    8.290.510    22.838.089    10.940.914    7.125.196    5.813.265    6.695.315
outside window (P1)          299.503       1.263        12.537       16.108       188.055       33.927        18.819       15.363       13.426
Incorrect Prices (P2)        0             0            0            0            0             0             0            0            0
non-NASDAQ Exchanges (P3)    55.944.226    2.277.898    5.229.284    6.458.787    17.260.936    8.594.100     5.701.799    4.880.917    5.349.262
Corrected Trades (T1)        5.416         380          914          354          1.138         496           306          1.134        694
Abnormal Trades (T2)         1.121.824     25.709       42.298       50.077       407.081       115.028       62.657       80.882       338.092
Double Exchanges (T3)        10.944.478    453.184      1.372.950    1.131.670    4.110.637     1.587.533     985.161      606.265      697.078
Retained Observations        4.104.486     355.462      753.516      633.514      870.242       609.830       356.454      228.704      296.763
Table 8: Overview of Data cleaning and Filtration process 2007-2014.
Most data are lost during the execution of P3, where data from other exchanges are removed. This is
no big surprise, as BP is listed on several exchanges. Besides this, step T3, where trades that took
place during the same second are summarized by their median, accounts for a large reduction. This
shows that the data really are high-frequency. Very few corrected trades were found and there were
no incorrect prices at all.
B Descriptive Statistics
B.1 Returns
When one looks at the plotted ’Open to Close’ and ’Close to open’ returns, there are clearly periods where
the variance is larger, i.e. where the peaks are higher. The most remarkable periods of higher variance
correspond to the shocks in 2008 and 2010 that were reported earlier.
Figure 5: The development of returns on BP shares between 2007 and 2015 shows two large shocks.
Table 9 presents the descriptive statistics of the O2C and C2C returns. There are no large differences
between the two classes of returns.
Descriptive Statistic     O2C         C2C
number of observations    1987.0      1986.0
mean                      0.0279      -0.0247
standard deviation        1.3657      2.0319
variance                  1.8651      4.1285
minimum                   -15.1994    -17.1814
first quartile            -0.5724     -0.8721
median                    0.0436      0.0597
third quartile            0.6462      0.8941
maximum                   10.3397     14.7646
kurtosis                  14.7165     11.7923
skewness                  -0.6964     -0.4902
Table 9: Summary of descriptive statistics of BP O2C and C2C returns.
B.2 2009-2010
Figure 6 takes a closer look at the price and return development in the year 2010. It is remarkable that it
takes a long time before the prices and returns react to the big shock. Although the disaster happened in
April 2010, the serious decline in prices and the increased volatility in returns start in May 2010. After the
big shock, the price level never recovers to its value before the Deep Water Horizon accident, even years
afterwards, as is shown in Figure 1.
(a) Return development
(b) Price development
Figure 6: BP shares in 2010, large shock around the Deep Water Horizon spill (’10/04/20).
Focussing on the descriptive statistics of the prices during the years 2009 up to and including 2011, some
interesting differences can be seen. The number of trades executed during 2010 is very high. The
variance in 2010 is also much higher than in the other years. These descriptives confirm that 2010 was
a highly unstable year, as stated above.
Descriptive Statistic     2009       2010       2011
number of observations    633514     870242     609830
mean                      46.9046    42.1806    42.7359
standard deviation        6.5859     9.2636     3.4955
variance                  43.3743    85.8152    12.2185
minimum                   33.7       26.75      33.62
first quartile            41.24      35.49      39.79
median                    46.99      40.56      43.465
third quartile            51.88      49.26      45.44
maximum                   60.0       62.38      49.5
kurtosis                  -1.009     -0.8732    -0.7352
skewness                  0.1548     0.4396     -0.4312
Table 10: Summary of descriptive statistics of BP prices of 2009, 2010, 2011.
In order to test whether the data are normally distributed, a Jarque-Bera test was performed on the prices as
well as the returns for the total dataset and for the data in the years 2009 up to and including 2011. The results
of these JB tests are given in Table 11. The null hypothesis of this test is a normal distribution:
skewness and excess kurtosis are 0. The null hypothesis of a normal distribution is clearly rejected in all
cases. Returns in 2010 have a much higher JB-statistic than those in 2009 and 2011.
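The JB statistic can be sketched as follows, using the standard definition JB = n/6 · (S² + K²/4) with S the sample skewness and K the sample excess kurtosis. The function names are hypothetical; the p-value uses the fact that the survival function of a chi-squared distribution with 2 degrees of freedom is exp(−x/2).

```python
import math


def jarque_bera(x):
    """Jarque-Bera statistic from sample skewness and excess kurtosis."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


def jb_p_value(jb_stat):
    """Asymptotic p-value: P(chi-squared with 2 d.o.f. > jb_stat)."""
    return math.exp(-jb_stat / 2.0)


# Tiny hypothetical sample, for illustration only.
jb_example = jarque_bera([-1.0, 0.0, 1.0])
# JB statistic reported in Table 11 for the 2011 C2C returns.
p_2011 = jb_p_value(10.2)
```

Applied to the reported statistic of 10.2 for the 2011 returns, this p-value formula gives roughly 0.006, in line with Table 11.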
                                   Prices                                            Returns (C2C)
                                   Total        2009        2010        2011        Total       2009         2010      2011
JB Test Statistic                  441,015.5    29,404.7    55,678.5    32,628.8    11,586.7    23.1         1496.3    10.2
P-Value                            0.0          0.0         0.0         0.0         0.0         9.653e-06    0.0       0.006
Conclusion: Normal Distribution    No           No          No          No          No          No           No        No
Table 11: Results of the JB test for Prices and Returns of all years and 2009-2011. Significance
level: 0.01.
C Kernel Models
With Kernel models we can estimate the integrated volatility over a given time interval $[0,T]$. The
integrated volatility is defined as
$$IV_T = \int_0^T \sigma(s)^2 \, ds \tag{14}$$
and it is the population measure of the actual return variance over a certain time interval.
If we let $P_t$ denote the price process of an asset and we suppose that $X_t = \log(P_t)$ follows an Itô process,
so that we have
$$dX(t) = \mu_t \, dt + \sigma_t \, dB_t \tag{15}$$
where, according to Zhang et al. (2005), $B_t$ is a standard Brownian Motion, $\mu_t$ is the drift coefficient and
$\sigma_t$ is the instantaneous volatility of the returns process, then the quadratic variation is given by
$$QV_T = \operatorname{plim}_{N \to \infty} \sum_{i=1}^{N} \left(X(t_i) - X(t_{i-1})\right)^2 \tag{16}$$
where $0 = t_0 < t_1 < t_2 < \dots < t_N = T$ and $\sup_i (t_i - t_{i-1}) \to 0$ as $N \to \infty$. In Barndorff-Nielsen &
Shephard (2004) it is shown that the quadratic variation is equal to the integrated volatility. Hence, it is
shown that
$$\operatorname{plim}_{N \to \infty} \sum_{i=1}^{N} \left(X(t_i) - X(t_{i-1})\right)^2 = \int_0^T \sigma(s)^2 \, ds \tag{17}$$
This knowledge brings us to the first kernel model that we have implemented, namely the realized volatility.
Realized Volatility
The realized volatility is the sample analogue of the quadratic variation. For a given time interval $[0,T]$,
it is defined as
$$RV_T = \sum_{i=1}^{N} \left(X_{t_i} - X_{t_{i-1}}\right)^2 \tag{18}$$
where $0 = t_0 < t_1 < t_2 < \dots < t_N = T$. It can be shown that when $N \to \infty$, $RV_T \to QV_T$. One would
therefore expect that calculating the realized volatility over all the observed values in time interval $[0,T]$
will give the best estimate of the integrated volatility over that interval. Let us denote this measure by
$RV_T^{(all)}$. Zhang et al. (2005) tell us that $RV_T^{(all)}$ is actually not a good estimator. Due
to the microstructure of the market we do not actually observe the real price process $X_{t_i}$, but the process
$Y_{t_i} = X_{t_i} + \epsilon_{t_i}$ instead. We can interpret the microstructure noise $\epsilon_{t_i}$ as our observation error.
Zhang et al. (2005) show that, because of the market microstructure, the expectation of the realized
volatility $RV_T^{(all)}$ calculated on the observed data and conditioned on the true, latent, log price process
$X$, is given by
$$E\left([Y, Y]_T \mid X \text{ process}\right) = [X, X]_T + 2nE\epsilon^2 \tag{19}$$
and hence, that our estimator is biased. What’s more, as can be seen in (19), this bias increases linearly
with the sample frequency n. In Zhang et al. (2005) they therefore propose to sample less frequently.
(sparse)
The realized volatility obtained with this sparse sample is denoted by RVT
and has a lower bias
(all)
than RVT .
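The effect of the noise term in (19) can be illustrated with a small simulation. The sketch below is not the code used in this paper: the function names, the number of observations and the noise level are hypothetical, and the latent price is assumed to have constant volatility and i.i.d. Gaussian noise.

```python
import math
import random


def realized_variance(log_prices):
    """Sum of squared log-price increments, as in equation (18)."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


def simulate_noisy_prices(n, integrated_var, noise_sd, seed=0):
    """Return (latent, observed) log prices on n + 1 equally spaced points.

    Latent increments are i.i.d. N(0, integrated_var / n); the observed
    price adds i.i.d. N(0, noise_sd^2) microstructure noise (an assumption).
    """
    rng = random.Random(seed)
    step_sd = math.sqrt(integrated_var / n)
    latent = [0.0]
    for _ in range(n):
        latent.append(latent[-1] + rng.gauss(0.0, step_sd))
    observed = [x + rng.gauss(0.0, noise_sd) for x in latent]
    return latent, observed


n = 23400              # hypothetical: one observation per second for 6.5 hours
integrated_var = 1e-4  # hypothetical daily integrated variance
noise_sd = 5e-4        # hypothetical noise standard deviation

latent, observed = simulate_noisy_prices(n, integrated_var, noise_sd)
rv_all = realized_variance(observed)             # uses every observation
rv_sparse = realized_variance(observed[::300])   # every 300th observation
bias_all = 2 * n * noise_sd ** 2                 # 2 n E[eps^2] from (19)
```

With these illustrative settings the noise term `bias_all` is far larger than `integrated_var`, so `rv_all` mostly measures noise, while `rv_sparse` stays much closer to the integrated variance.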
Two Time Scales Realized Volatility
In Zhang et al. (2005) it is pointed out that $RV_T^{(sparse)}$, although it has a lower bias than $RV_T^{(all)}$, does have
a bigger error caused by discretization. Remember that (17) still holds and that by sampling sparsely
our estimate will be further removed from the limit. It is shown in Zhang et al. (2005) that this error
has an effect on the variance of the estimator: the bigger the discretization error, the
bigger the variance of the estimator. Hence, the part of the variance that is due to discretization is bigger
for $RV_T^{(sparse)}$ than for $RV_T^{(all)}$.
To alleviate this problem, Zhang et al. (2005) propose to take the average of estimators computed on
different subgrids. To do this, the full grid $\mathcal{G}$, i.e. all the observations included in a time interval [0,T],
is divided into K nonoverlapping subgrids. Hence, we have
\[ \mathcal{G} = \bigcup_{k=1}^{K} \mathcal{G}^{(k)} \tag{20} \]
where each subgrid $\mathcal{G}^{(k)}$ is constructed by starting at the sample point $t_{k-1}$ and then taking every Kth sample
point until T is reached. We then compute the realized volatility on each of these K subgrids. The realized
volatility obtained in this way on the kth subgrid is denoted by $RV_T^{(k)}$. The final estimator is
obtained by taking the average over the realized volatilities calculated on the K subgrids. Hence, we have
\[ RV_T^{(avg)} = \frac{1}{K} \sum_{k=1}^{K} RV_T^{(k)} \tag{21} \]
In this way, the benefit of not sampling too often is retained, while at the same time the variance of
the estimator is decreased. However, it is pointed out in Zhang et al. (2005) that this estimator is still
biased. The bias is equal to $2\bar{n}\,E[\epsilon^2]$, where $\bar{n}$ is the average size of the K subgrids.
To overcome this problem, Zhang et al. (2005) propose to correct the bias of $RV_T^{(avg)}$. They show that
$\frac{1}{2n} RV_T^{(all)}$ is a consistent estimator of the variance of the error term. For a fixed true
return process X,
\[ n^{1/2} \left( \frac{1}{2n} RV_T^{(all)} - E[\epsilon^2] \right) \to N\bigl(0, E[\epsilon^4]\bigr) \tag{22} \]
Here, n is the number of observations in the full grid. Then, by taking
\[ RV_T^{(avg)} - \frac{\bar{n}}{n} RV_T^{(all)} \tag{23} \]
we obtain an unbiased estimator. Let us denote this measure by $TSRV_T$. In Zhang et al. (2005) a
small-sample adjustment is given by
\[ \left( 1 - \frac{\bar{n}}{n} \right)^{-1} TSRV_T \]
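The construction in (20), (21) and (23) can be sketched in a few lines. This is a minimal illustration rather than the implementation used in this paper; the function names are hypothetical, and the average subgrid size is taken as $\bar{n} = (n - K + 1)/K$, which is the average number of returns per subgrid when the kth subgrid starts at the kth observation.

```python
def realized_variance(log_prices):
    """Sum of squared increments of a log-price path, equation (18)."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


def subsampled_rv(log_prices, K):
    """Average of the realized variances on the K subgrids, equation (21)."""
    return sum(realized_variance(log_prices[k::K]) for k in range(K)) / K


def tsrv(log_prices, K, small_sample=True):
    """Two time scales estimator, equation (23), with optional adjustment."""
    n = len(log_prices) - 1          # number of returns on the full grid
    n_bar = (n - K + 1) / K          # average number of returns per subgrid
    rv_all = realized_variance(log_prices)
    rv_avg = subsampled_rv(log_prices, K)
    estimate = rv_avg - (n_bar / n) * rv_all
    if small_sample:
        estimate /= 1.0 - n_bar / n  # small-sample adjustment
    return estimate


# Toy path with n = 6 returns and K = 2 subgrids.
prices = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
raw = tsrv(prices, K=2, small_sample=False)
adjusted = tsrv(prices, K=2)
```

On the toy path the two subgrids give realized variances 14 and 5, so the average is 9.5; the full-grid value is 20 and $\bar{n}/n = 2.5/6$, which yields 7/6 before and 2 after the small-sample adjustment.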
Bipower Variation Model
In Barndorff-Nielsen & Shephard (2004) they state that if the log-price of an asset, $X_t$, is a member of
the Brownian semimartingale plus jump (BSMJ) class, so that we have
\[ X_t = \int_0^t a_s\,ds + \int_0^t \sigma_s\,dW_s + Z_t, \qquad Z_t = \sum_{j=1}^{N_t} c_j \tag{24} \]
where $Z_t$ is a jump process, with N a counting process that is finite for every t and the $c_j$ nonzero random
variables, then the realized volatility converges in probability to the sum of the integrated volatility
and the quadratic variation of the jump process.
To still be able to estimate the integrated volatility, Barndorff-Nielsen & Shephard (2004) introduce the
Bipower Variation Model. The Bipower Variation process on interval [0,T] of order (1,1) is given by
\[ BV_T = \sum_{j=2}^{N} |r_{j,T}|\,|r_{j-1,T}| \tag{25} \]
where $r_{j,T} = X_{j,T} - X_{j-1,T}$. Hence, $r_{j,T}$ is the return over the jth period on the time interval [0,T].
Barndorff-Nielsen & Shephard (2004) show that, in probability,
\[ BV_T \to \mu_1^2 \int_0^T \sigma^2(s)\,ds \tag{26} \]
where $\mu_1 = E|u| = \sqrt{2}/\sqrt{\pi}$ and $u \sim N(0, 1)$. The following estimator of the integrated volatility is therefore
obtained:
\[ \mu_1^{-2} BV_T = \mu_1^{-2} \sum_{j=2}^{N} |r_{j,T}|\,|r_{j-1,T}| \tag{27} \]
\[ \to \int_0^T \sigma^2(s)\,ds \tag{28} \]
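The estimator in (27) is short enough to write out directly. The following is an illustrative sketch with hypothetical function names and made-up returns, showing why a single jump inflates the realized volatility much more than the bipower variation.

```python
import math

MU1 = math.sqrt(2.0 / math.pi)   # E|u| for u ~ N(0, 1)


def bipower_variation(returns):
    """Sum of products of adjacent absolute returns, equation (25)."""
    return sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))


def bipower_iv_estimate(returns):
    """Scaled bipower variation, the estimator in equation (27)."""
    return bipower_variation(returns) / MU1 ** 2


def realized_variance_from_returns(returns):
    """Realized volatility computed from returns instead of prices."""
    return sum(r * r for r in returns)


# Made-up intraday returns with one large jump in the middle.
returns = [0.01, -0.01, 0.01, 0.5, -0.01, 0.01]
rv = realized_variance_from_returns(returns)
bv = bipower_iv_estimate(returns)
```

The jump enters the realized volatility squared, but in the bipower variation it is only ever multiplied by a small neighbouring return, so `bv` is much smaller than `rv` here.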
Realized Kernel
In Barndorff-Nielsen et al. (2009) another kernel model is proposed to estimate the integrated variation,
namely the realized kernel. Barndorff-Nielsen et al. (2009) also assume that the price process, X,
follows a Brownian semimartingale plus jump process. Hence, we have
\[ X_t = \int_0^t a_u\,du + \int_0^t \sigma_u\,dW_u + J_t \tag{29} \]
where
\[ J_t = \sum_{i=1}^{N_t} C_i \tag{30} \]
is a finite activity jump process. Here $N_t$ is the number of jumps that occurred in the time interval [0,t]
and finite activity means that $N_t < \infty$ for any t.
Barndorff-Nielsen et al. (2009) also state that we do not observe the true price process, but that we
instead observe
\[ Y_{t_i} = X_{t_i} + \epsilon_{t_i} \tag{31} \]
where $\epsilon$ again represents the market microstructure noise.
In Barndorff-Nielsen et al. (2009) they are interested in the quadratic variation of X given by
\[ \int_0^T \sigma_u^2\,du + \sum_{i=1}^{N_T} C_i^2 \tag{32} \]
Here, $\int_0^T \sigma_u^2\,du$ is the integrated variance. In Barndorff-Nielsen et al. (2009) the following model is
proposed to estimate the quadratic variation
\[ K(X) = \sum_{h=-H}^{H} k\!\left( \frac{h}{H+1} \right) \gamma_h \tag{33} \]
where $\gamma_h = \sum_{j=|h|+1}^{n} x_j x_{j-|h|}$ and k(x) is the Parzen kernel, given by
\[ k(x) = \begin{cases} 1 - 6|x|^2 + 6|x|^3, & \text{if } 0 \le |x| \le \tfrac{1}{2} \\ 2(1 - |x|)^3, & \text{if } \tfrac{1}{2} \le |x| \le 1 \\ 0, & \text{if } |x| > 1 \end{cases} \]
Here $x_j$ is the jth high frequency return, calculated over the interval $[t_{j-1}, t_j]$. In Barndorff-Nielsen et al.
(2009) they state that
\[ H = c\,\xi^{4/5} n^{3/5} \tag{34} \]
where $\xi^2 = \omega^2 \big/ \sqrt{T \int_0^T \sigma_u^4\,du}$, with $\omega^2$ the variance of the microstructure noise, c = 3.5134 and n
the number of intra-day returns of a day. Furthermore, H is rounded up to the nearest integer.
In Barndorff-Nielsen et al. (2009) a non-trivial way for estimating ξ 2 is proposed. This method is implemented in this paper to estimate ξ 2 .
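Equations (33) and (34) translate almost literally into code. The sketch below uses hypothetical function names and takes $\xi^2$ as a given input; it does not reproduce the estimation procedure for $\xi^2$ from Barndorff-Nielsen et al. (2009).

```python
import math


def parzen(x):
    """Parzen kernel weight k(x)."""
    a = abs(x)
    if a <= 0.5:
        return 1.0 - 6.0 * a ** 2 + 6.0 * a ** 3
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0


def autocovariance(returns, h):
    """gamma_h: sum of x_j * x_{j-|h|} over all available j."""
    h = abs(h)
    return sum(returns[j] * returns[j - h] for j in range(h, len(returns)))


def realized_kernel(returns, H):
    """Realized kernel K(X) of equation (33) with bandwidth H."""
    return sum(parzen(h / (H + 1)) * autocovariance(returns, h)
               for h in range(-H, H + 1))


def bandwidth(xi_sq, n, c=3.5134):
    """Bandwidth H of equation (34), rounded up; xi_sq is xi^2."""
    return math.ceil(c * xi_sq ** 0.4 * n ** 0.6)


x = [1.0, -1.0, 2.0]                 # toy returns
rk_h0 = realized_kernel(x, H=0)      # only gamma_0: equals the RV
rk_h1 = realized_kernel(x, H=1)      # adds weighted first-order terms
```

With H = 0 only $\gamma_0$ enters, so the realized kernel collapses to the realized volatility; larger H adds down-weighted autocovariances that offset the noise-induced bias.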
Realized Measure Paths over the Full Data Set
Figure 7 plots the paths of the four realized measures that were computed from the returns in this study.
This is done over the whole period 2007-2015.
Figure 7: Realized Measure paths over 2007-2014.
Looking at the full sample, the RV seems much more sensitive to price jumps than the other measures, and
hence shows some strong jumps in volatility as well. The other RMs all seem to approximate
the stock’s volatility reasonably well.
D
Realized GARCH
In Hansen et al. (2012) a framework is introduced that combines a GARCH structure for returns with
an integrated model for realized measures of volatility. The models within this framework are called
Realized GARCH models. The general structure of the model is given by
\[ \begin{aligned} r_t &= \sqrt{h_t}\,z_t \\ h_t &= v(h_{t-1}, \dots, h_{t-p}, x_{t-1}, \dots, x_{t-q}) \\ x_t &= m(h_t, z_t, u_t) \end{aligned} \tag{35} \]
where $r_t$ is the return, $x_t$ is a realized measure of volatility, $z_t \sim$ i.i.d.(0, 1), $u_t \sim$ i.i.d.$(0, \sigma_u^2)$, and
$h_t = var(r_t | F_{t-1})$ with $F_t = \sigma(r_t, x_t, r_{t-1}, x_{t-1}, \dots)$. The first and second equation in (35) are called the
return equation and GARCH equation respectively. The third equation is the measurement equation and
relates the observed realized measure to the latent volatility. Hence, the Realized GARCH model fully
specifies the dynamic properties of both returns and the realized measure, see Hansen et al. (2012).
In our paper we implemented a Realized GARCH with the log-linear specification defined in Hansen et al.
(2012). In general the model is given by
\[ \begin{aligned} r_t &= \sqrt{h_t}\,z_t \\ \log(h_t) &= \omega + \sum_{i=1}^{p} \beta_i \log(h_{t-i}) + \sum_{j=1}^{q} \gamma_j \log(x_{t-j}) \\ \log(x_t) &= \xi + \phi \log(h_t) + \tau(z_t) + u_t \end{aligned} \tag{36} \]
where $z_t = r_t / \sqrt{h_t} \sim$ i.i.d.(0, 1), $u_t \sim$ i.i.d.$(0, \sigma_u^2)$ and $\tau(z_t)$ is the leverage function. In our paper both p and q
are equal to one. Furthermore our leverage function is, as suggested in Hansen et al. (2012), defined as
\[ \tau(z_t) = \tau_1 z_t + \tau_2 (z_t^2 - 1) \tag{37} \]
In this way it is, according to Hansen et al. (2012), possible to generate an asymmetric response in
volatility to return shocks.
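For p = q = 1 the GARCH equation in (36) is a simple recursion in logs. The sketch below only filters the conditional variance for given parameters and realized measures; the function names, the starting value `h0` and the inputs are hypothetical, and no estimation is performed.

```python
import math


def leverage(z, tau1, tau2):
    """Leverage function of equation (37)."""
    return tau1 * z + tau2 * (z * z - 1.0)


def realized_garch_filter(x, omega, beta, gamma, h0):
    """Conditional variances h_1..h_T from the log-linear GARCH equation.

    x holds the realized measures x_1..x_T; h0 is an assumed starting
    value for h_1. Each later h_t depends on h_{t-1} and x_{t-1}.
    """
    log_h = [math.log(h0)]
    for x_prev in x[:-1]:
        log_h.append(omega + beta * log_h[-1] + gamma * math.log(x_prev))
    return [math.exp(v) for v in log_h]


def measurement_residual(r_t, x_t, h_t, xi, phi, tau1, tau2):
    """u_t implied by the measurement equation in (36)."""
    z_t = r_t / math.sqrt(h_t)
    return math.log(x_t) - xi - phi * math.log(h_t) - leverage(z_t, tau1, tau2)


h = realized_garch_filter([math.e, math.e], omega=0.1, beta=0.5,
                          gamma=0.4, h0=1.0)
```

Because $E[z_t] = 0$ and $E[z_t^2] = 1$, the leverage function has mean zero, while a nonzero $\tau_1$ makes positive and negative shocks of the same size move the realized measure differently.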
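Furthermore, in estimating the Realized GARCH models we use the following (back)transformations between
the original parameters and the parameter set θ, in order to use unrestricted optimization.
\[ \begin{aligned}
\omega &= \theta_1 & \theta_1 &= \omega \\
\beta &= \frac{\exp\theta_3}{\exp\theta_2 + \exp\theta_3 + 1} & \theta_2 &= \ln\frac{\phi\gamma}{1 - \phi\gamma - \beta} \\
\gamma &= \exp(-\theta_5)\,\frac{\exp\theta_2}{\exp\theta_2 + \exp\theta_3 + 1} & \theta_3 &= \ln\frac{\beta}{1 - \phi\gamma - \beta} \\
\xi &= \theta_4 & \theta_4 &= \xi \\
\phi &= \exp\theta_5 & \theta_5 &= \ln\phi \\
\sigma_u &= \exp\theta_6 & \theta_6 &= \ln\sigma_u \\
\tau_1 &= \theta_7 & \theta_7 &= \tau_1 \\
\tau_2 &= \theta_8 & \theta_8 &= \tau_2
\end{aligned} \]
Realized GARCH-T:
\[ \begin{aligned}
\nu_1 &= \exp\theta_9 + 2 & \theta_9 &= \ln(\nu_1 - 2) \\
\nu_2 &= \exp\theta_{10} + 2 & \theta_{10} &= \ln(\nu_2 - 2)
\end{aligned} \]
Realized GARCH-GED:
\[ \begin{aligned}
\nu_1 &= \tanh\theta_9 + 1 & \theta_9 &= \operatorname{arctanh}(\nu_1 - 1) \\
\nu_2 &= \tanh\theta_{10} + 1 & \theta_{10} &= \operatorname{arctanh}(\nu_2 - 1)
\end{aligned} \]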
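The transformations above can be checked numerically. The following sketch uses hypothetical function names; the input values are the Set A θ estimates for the Realized GARCH model as reported in Table 13, and the comparison values are those of Table 14.

```python
import math


def theta_to_params(theta):
    """Map the unrestricted theta_1..theta_8 to the model parameters."""
    t1, t2, t3, t4, t5, t6, t7, t8 = theta
    denom = math.exp(t2) + math.exp(t3) + 1.0
    return {
        "omega": t1,
        "beta": math.exp(t3) / denom,
        "gamma": math.exp(-t5) * math.exp(t2) / denom,
        "xi": t4,
        "phi": math.exp(t5),
        "sigma_u": math.exp(t6),
        "tau1": t7,
        "tau2": t8,
    }


def params_to_theta(p):
    """Inverse mapping; requires phi * gamma + beta < 1."""
    slack = 1.0 - p["phi"] * p["gamma"] - p["beta"]
    return [
        p["omega"],
        math.log(p["phi"] * p["gamma"] / slack),
        math.log(p["beta"] / slack),
        p["xi"],
        math.log(p["phi"]),
        math.log(p["sigma_u"]),
        p["tau1"],
        p["tau2"],
    ]


def nu_student_t(theta9):
    """Degrees of freedom, restricted to be larger than 2."""
    return math.exp(theta9) + 2.0


def nu_ged(theta9):
    """GED shape parameter, restricted to the interval (0, 2)."""
    return math.tanh(theta9) + 1.0


# Set A theta estimates for the Realized GARCH model (Table 13).
theta_a = [0.055, 2.695, 2.811, -0.083, -0.022, -0.807, 0.048, 0.022]
params_a = theta_to_params(theta_a)
round_trip = params_to_theta(params_a)
```

By construction $\beta + \phi\gamma = (\exp\theta_2 + \exp\theta_3)/(\exp\theta_2 + \exp\theta_3 + 1) < 1$ for any real θ, so the optimizer can search without constraints while this quantity stays below one.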
E
GAS
E.1
Introduction
The generalized autoregressive score (GAS) model belongs to the class of observation driven models.
Creal et al. (2013) argue that GAS models have several advantages, such as straightforward likelihood
evaluation, since the likelihood is available in closed form, and straightforward extensions to asymmetric and
long memory dynamics. Creal et al. (2013) also state that since a GAS model is based on the score function, it exploits the complete
density structure rather than only means and higher moments. This sets the GAS model apart from other
observation driven models.
In the GAS model the updating equation is given by equation (38), where ω is a vector of constants,
$A_i$ and $B_j$ are coefficient matrices and $s_t$ is a function of past data, $s_t = s_t(y_t, \sigma_t^2, F_t; \theta)$. Here
the returns are given by $y_t$, the time-varying parameter by $\sigma_t^2$, the σ-field by $F_t$, and
the vector of static parameters by θ. In this paper the focus is on a GAS(1,1) model (i.e.
both p and q are set to 1). This reduces the model to equation (39).
\[ \sigma_{t+1}^2 = \omega + \sum_{i=1}^{p} A_i s_{t-i+1} + \sum_{j=1}^{q} B_j \sigma_{t-j+1}^2 \tag{38} \]
\[ \sigma_{t+1}^2 = \omega + A s_t + B \sigma_t^2 \tag{39} \]
Equation (40) gives further detail on $s_t$. Here S(·) is a matrix function, the scaling
matrix.
\[ s_t = S_t \cdot \nabla_t, \qquad \nabla_t = \frac{\partial \ln p(y_t | \sigma_t^2, F_t; \theta)}{\partial \sigma_t^2}, \qquad S_t = S(t, \sigma_t^2, F_t; \theta) \tag{40} \]
Creal et al. (2013) explain how the score defines a steepest ascent direction for improving the local fit in
terms of the likelihood or density at time t given the current position of the parameter $\sigma_t^2$. This
gives the natural direction for updating the parameter. The exploitation of the full density structure by
the GAS model introduces new transformations of the data that can be used to update the time-varying
parameter $\sigma_t^2$. Furthermore, Creal et al. (2013) stress that, via the choice of the scaling matrix $S_t$, the
GAS model allows for additional flexibility in how the score is used for updating $\sigma_t^2$. Therefore, different
choices for the scaling matrix $S_t$ result in different GAS models. There is no clear theory available on
how to scale the score, but the literature gives three main suggestions, namely
• Unit scaling: $S_t = I$
• Inverse Fisher information matrix scaling: $S_t = (E_{t-1}[\nabla_t \nabla_t'])^{-1}$
• Square root inverse Fisher information matrix scaling: $S_t = (E_{t-1}[\nabla_t \nabla_t'])^{-1/2}$
E.2
General framework GAS student t
When a Gaussian distribution is used to estimate a GAS(1,1) model with the
basic specification $y_t = \sigma_t \epsilon_t$, it can be shown that the model reduces to a GARCH(1,1). Proof of this can
be found in Creal et al. (2013). Therefore, this paper focuses on Student-t and GED
distributed $\epsilon_t$. For a GAS(1,1) model with Student-t distribution the updating equation is given in
equation (41).
\[ \sigma_{t+1}^2 = \omega + A_1 \cdot (1 + 3\nu^{-1}) \cdot \left( \frac{(1 + \nu^{-1})}{(1 + \nu^{-1} y_t^2 / \sigma_t^2)}\, y_t^2 - \sigma_t^2 \right) + B_1 \sigma_t^2 \tag{41} \]
One needs to keep in mind that when $\nu^{-1}$ is equal to zero the scaled score in this equation becomes
$y_t^2 - \sigma_t^2$, so that equation (41) reduces to the Gaussian case of equation (39), i.e. a GARCH(1,1), as stated by Creal et al. (2013). An important distinction between equation (41) and a GARCH(1,1)
with Student-t distribution is the fact that the denominator in the GAS(1,1) model ensures that large
realisations of $y_t$ have a smaller effect on the variance when ν is finite.
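The dampening effect of the denominator in (41) is easy to see numerically. The sketch below implements the update exactly as written in equation (41); the function name and the parameter values are hypothetical and chosen only for illustration.

```python
def gas_t_update(sigma2, y, omega, a1, b1, nu=None):
    """One step of the GAS(1,1) Student-t recursion in equation (41).

    nu=None is treated as 1/nu = 0, the Gaussian limit, in which the
    scaled score is simply y^2 - sigma2.
    """
    inv_nu = 0.0 if nu is None else 1.0 / nu
    weight = (1.0 + inv_nu) / (1.0 + inv_nu * y * y / sigma2)
    scaled_score = (1.0 + 3.0 * inv_nu) * (weight * y * y - sigma2)
    return omega + a1 * scaled_score + b1 * sigma2


# Illustrative parameter values, not estimates from this paper.
omega, a1, b1 = 0.1, 0.1, 0.8
small_gauss = gas_t_update(1.0, 2.0, omega, a1, b1)
large_gauss = gas_t_update(1.0, 10.0, omega, a1, b1)
large_t = gas_t_update(1.0, 10.0, omega, a1, b1, nu=5.0)
```

For a return of 10 with current variance 1, the Gaussian limit moves the variance to 10.8, whereas with ν = 5 the weight $(1 + \nu^{-1})/(1 + \nu^{-1} y_t^2/\sigma_t^2)$ shrinks to 1.2/21 and the update is only about 1.65.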
F
Error distributions
When estimating the models described in section 5, one needs to note that the errors $\epsilon_t$ can be
distributed in various ways. In this paper, the following three error distributions are
investigated: the Gaussian distribution, the Student-t distribution, and the generalized error distribution
(GED). The reasons for choosing these distributions are as follows. The Gaussian distribution is the easiest
and most standard choice for $\epsilon_t$, but it has some drawbacks: most financial data are not normally
distributed, and in most, if not all, cases one encounters fat tails. To deal with the fact that empirical
distributions are commonly more leptokurtic than the Gaussian, the Student-t distribution is investigated,
which handles such data much better. Finally, the GED is useful because, as stated by Nelson (1991),
it is able to capture both fat and skinny tails depending on the value of its shape parameter.
All three distributions are symmetric. Skewed distributions such as the skewed Student-t distribution, which
account for asymmetric effects, are not used in this paper since the close-to-close
and open-to-close data are not skewed. Kurtosis and skewness are of
major importance when one wants to estimate a model that captures all the properties
of financial data.
F.1
Gaussian
The Gaussian distribution is one of the most standard ways to model an error distribution. In this
approach, the probability density function (pdf) is given by equation (42), where $\mu_t$ is the conditional
mean of the observation equation. The log likelihood is given by equation (43).
\[ f(y_t | \mu_t, \sigma_t^2) = \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp\left( -\frac{(y_t - \mu_t)^2}{2\sigma_t^2} \right) \tag{42} \]
\[ L(Y_T, \theta) = \sum_{t=2}^{T} \left[ -\frac{1}{2} \log(2\pi\sigma_t^2) - \frac{1}{2\sigma_t^2} \bigl( y_t - E[y_t | y_{t-1}] \bigr)^2 \right] \tag{43} \]
One has to keep in mind that under this distribution the kurtosis is equal to 3 and the skewness is equal
to 0, as stated by Casella & Berger (2002).
F.2
Student-t
The Student-t distribution comes with a restriction: the degrees of freedom parameter, ν, must be larger
than 2 for the distribution to have a finite variance, see Casella & Berger (2002). The
pdf of a Student-t distributed error is given by equation (44). The log likelihood is given by equation
(45), see Hoogerheide (n.d.). A value ν < 2 implies extremely fat tails in the distribution of the data. When
2 < ν ≤ 3, the skewness does not exist. As ν → ∞ the Student-t distribution converges
to a Gaussian distribution. Most common for financial data is a ν between 3 and 5; however, this
depends on the stock, the period and the model for the variance.
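\[ f(y_t | \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)} \bigl( (\nu - 2)\pi\sigma_t^2 \bigr)^{-1/2} \left( 1 + \frac{(y_t - \mu_t)^2}{(\nu - 2)\sigma_t^2} \right)^{-\frac{\nu+1}{2}} \tag{44} \]
\[ \log p(r_t) = \log\Gamma\left(\frac{\nu+1}{2}\right) - \log\Gamma\left(\frac{\nu}{2}\right) - \frac{1}{2}\log\bigl( (\nu - 2)\pi\sigma_t^2 \bigr) - \frac{\nu+1}{2}\log\left( 1 + \frac{(y_t - \mu_t)^2}{(\nu - 2)\sigma_t^2} \right) \tag{45} \]
F.3
Generalized error distribution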
For the GED one needs to focus on the ν parameter. This ν is a tail-thickness parameter, see
Nelson (1991), and the properties of the distribution strongly depend on its value. If ν is
equal to 1, the GED is a Laplace distribution. When ν is equal to 2 the GED reduces to a
Gaussian distribution, see Czyżycki (2013). A ν between 0 and 2 results in a kurtosis larger than 3, whereas
a ν > 2 results in a kurtosis smaller than 3, see Hoogerheide (n.d.). The pdf of the GED is given by equation (46).
In this equation λ is a function of ν. The log likelihood is given by
equation (47), which uses the same λ(ν) as equation (46).
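The special cases mentioned above can be verified numerically from the log-density in equation (47). The sketch below is illustrative only: the function names are hypothetical, and the kurtosis expression $\Gamma(5/\nu)\Gamma(1/\nu)/\Gamma(3/\nu)^2$ is the standard result for the GED rather than a formula taken from this paper.

```python
import math


def ged_lambda(nu):
    """Scale constant lambda(nu) used in equations (46) and (47)."""
    return math.sqrt(math.gamma(1.0 / nu)
                     / (2.0 ** (2.0 / nu) * math.gamma(3.0 / nu)))


def ged_logpdf(r, mu, sigma2, nu):
    """GED log-density with mean mu and variance sigma2, equation (47)."""
    lam = ged_lambda(nu)
    z = abs(r - mu) / (lam * math.sqrt(sigma2))
    return (-math.log(2.0 ** (1.0 + 1.0 / nu) * math.gamma(1.0 / nu) * lam)
            - 0.5 * math.log(sigma2) + math.log(nu) - 0.5 * z ** nu)


def gaussian_logpdf(r, mu, sigma2):
    """Gaussian log-density, the log of equation (42)."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (r - mu) ** 2 / (2.0 * sigma2)


def ged_kurtosis(nu):
    """Kurtosis of the GED as a function of the shape parameter."""
    return (math.gamma(5.0 / nu) * math.gamma(1.0 / nu)
            / math.gamma(3.0 / nu) ** 2)


gap_at_nu_2 = ged_logpdf(0.3, 0.1, 2.0, 2.0) - gaussian_logpdf(0.3, 0.1, 2.0)
kurtosis_laplace = ged_kurtosis(1.0)
kurtosis_gauss = ged_kurtosis(2.0)
```

At ν = 2 the constant λ equals 1 and the GED log-density coincides with the Gaussian one, while ν = 1 gives the Laplace kurtosis of 6.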
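\[ p(r_t) = \left[ 2^{(1+(1/\nu))}\,\Gamma\left(\frac{1}{\nu}\right) \lambda \right]^{-1} (\sigma_t^2)^{-1/2}\, \nu \cdot \exp\left( -\frac{1}{2} \left| \frac{r_t - \mu_t}{\lambda (\sigma_t^2)^{1/2}} \right|^{\nu} \right), \qquad \lambda = \sqrt{\frac{\Gamma(1/\nu)}{2^{2/\nu}\,\Gamma(3/\nu)}} \tag{46} \]
\[ \log p(r_t) = -\log\left( 2^{(1+(1/\nu))}\,\Gamma\left(\frac{1}{\nu}\right) \lambda \right) - \frac{1}{2}\log(\sigma_t^2) + \log(\nu) - \frac{1}{2} \left| \frac{r_t - \mu_t}{\lambda (\sigma_t^2)^{1/2}} \right|^{\nu} \tag{47} \]
F.4
Derivative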
This part of the appendix contains the derivatives of the Student-t and generalized error distribution
log-densities with respect to $\sigma_t^2$. As already explained, these derivatives are used in the score of the GAS models. A full explanation
of the score can be found in Appendix E.
F.4.1
Student-t
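In this section the derivative of the Student-t log-density with respect to $\sigma_t^2$ is derived step by step, writing
$r = y_t - \mu_t$:
\begin{align}
\frac{\partial \log p(r_t)}{\partial \sigma_t^2}
&= \frac{\partial}{\partial \sigma_t^2} \left[ \log\Gamma\left(\frac{\nu+1}{2}\right) - \log\Gamma\left(\frac{\nu}{2}\right) - \frac{1}{2}\log\bigl((\nu-2)\pi\sigma_t^2\bigr) - \frac{\nu+1}{2}\log\left(1 + \frac{(y_t-\mu_t)^2}{(\nu-2)\sigma_t^2}\right) \right] \tag{48} \\
&= -\frac{1}{2} \cdot \frac{\partial}{\partial \sigma_t^2}\bigl[\ln(\pi(\nu-2)\sigma_t^2)\bigr] - \frac{\nu+1}{2} \cdot \frac{\partial}{\partial \sigma_t^2}\left[\ln\left(\frac{r^2}{(\nu-2)\sigma_t^2} + 1\right)\right] \tag{49} \\
&= -\frac{\frac{\partial}{\partial \sigma_t^2}\left[\frac{r^2}{(\nu-2)\sigma_t^2} + 1\right] \cdot (\nu+1)}{2\left(\frac{r^2}{(\nu-2)\sigma_t^2} + 1\right)} - \frac{\frac{\partial}{\partial \sigma_t^2}\bigl[\pi(\nu-2)\sigma_t^2\bigr]}{2\pi(\nu-2)\sigma_t^2} \tag{50} \\
&= -\frac{\pi(\nu-2) \cdot \frac{\partial}{\partial \sigma_t^2}[\sigma_t^2]}{2\pi(\nu-2)\sigma_t^2} - \frac{\left(\frac{r^2}{\nu-2} \cdot \frac{\partial}{\partial \sigma_t^2}\left[\frac{1}{\sigma_t^2}\right] + \frac{\partial}{\partial \sigma_t^2}[1]\right)(\nu+1)}{2\left(\frac{r^2}{(\nu-2)\sigma_t^2} + 1\right)} \tag{51} \\
&= -\frac{1}{2\sigma_t^2} - \frac{\left(-\frac{\frac{\partial}{\partial \sigma_t^2}[\sigma_t^2]}{\sigma_t^4} \cdot \frac{r^2}{\nu-2}\right)(\nu+1)}{2\left(\frac{r^2}{(\nu-2)\sigma_t^2} + 1\right)} \tag{52} \\
&= \frac{r^2(\nu+1)}{2(\nu-2)\left(\frac{r^2}{(\nu-2)\sigma_t^2} + 1\right)\sigma_t^4} - \frac{1}{2\sigma_t^2} \tag{53} \\
&= -\frac{(\nu-2)\sigma_t^2 - r^2\nu}{2\sigma_t^2\bigl((\nu-2)\sigma_t^2 + r^2\bigr)} \tag{54}
\end{align}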
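The closed form in (54) can be checked against a finite-difference derivative of the log-density in (45). The sketch below uses hypothetical function names and arbitrary evaluation points; it is a sanity check, not part of the estimation code of this paper.

```python
import math


def student_t_logpdf(y, mu, sigma2, nu):
    """Student-t log-density with variance sigma2, equation (45)."""
    r2 = (y - mu) ** 2
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log((nu - 2.0) * math.pi * sigma2)
            - (nu + 1.0) / 2.0 * math.log(1.0 + r2 / ((nu - 2.0) * sigma2)))


def student_t_score(y, mu, sigma2, nu):
    """Derivative of the log-density w.r.t. sigma2, equation (54)."""
    r2 = (y - mu) ** 2
    return (-((nu - 2.0) * sigma2 - nu * r2)
            / (2.0 * sigma2 * ((nu - 2.0) * sigma2 + r2)))


def numerical_score(y, mu, sigma2, nu, step=1e-5):
    """Central finite difference of the log-density in sigma2."""
    up = student_t_logpdf(y, mu, sigma2 + step, nu)
    down = student_t_logpdf(y, mu, sigma2 - step, nu)
    return (up - down) / (2.0 * step)


analytic = student_t_score(1.3, 0.1, 0.8, 7.4)
numeric = numerical_score(1.3, 0.1, 0.8, 7.4)
```

The score is zero when $r^2 = (\nu - 2)\sigma_t^2/\nu$, positive for larger squared returns and negative for smaller ones, which is what drives the variance up or down in the GAS update.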
F.4.2
GED
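Writing $r = r_t - \mu_t$, the derivative of the GED log-density with respect to $\sigma_t^2$ is:
\begin{align}
\frac{\partial \log p(r_t)}{\partial \sigma_t^2}
&= \frac{\partial}{\partial \sigma_t^2} \left[ -\log\left( 2^{(1+(1/\nu))}\,\Gamma\left(\frac{1}{\nu}\right) \lambda \right) - \frac{1}{2}\log(\sigma_t^2) + \log(\nu) - \frac{1}{2} \left| \frac{r_t - \mu_t}{\lambda (\sigma_t^2)^{1/2}} \right|^{\nu} \right] \tag{55} \\
&= -\frac{1}{2} \cdot \frac{\partial}{\partial \sigma_t^2}\bigl[\ln(\sigma_t^2)\bigr] - \frac{|r|^{\nu}}{2|\lambda|^{\nu}} \cdot \frac{\partial}{\partial \sigma_t^2}\left[ \left| \frac{1}{\sqrt{\sigma_t^2}} \right|^{\nu} \right] \tag{56} \\
&= -\frac{1}{2\sigma_t^2} - \frac{\nu \left| \frac{1}{\sqrt{\sigma_t^2}} \right|^{\nu-1} \cdot \frac{\partial}{\partial \sigma_t^2}\left[ \left| \frac{1}{\sqrt{\sigma_t^2}} \right| \right] \cdot |r|^{\nu}}{2|\lambda|^{\nu}} \tag{57} \\
&= -\frac{1}{2\sigma_t^2} - \frac{\frac{1/\sqrt{\sigma_t^2}}{\left| 1/\sqrt{\sigma_t^2} \right|} \cdot \frac{\partial}{\partial \sigma_t^2}\left[ \frac{1}{\sqrt{\sigma_t^2}} \right] \cdot |r|^{\nu}\, \nu \left| \frac{1}{\sqrt{\sigma_t^2}} \right|^{\nu-1}}{2|\lambda|^{\nu}} \tag{58} \\
&= -\frac{1}{2\sigma_t^2} - \frac{\left(-\frac{1}{2}\right)(\sigma_t^2)^{-\frac{1}{2}-1}\, |r|^{\nu}\, \nu \left| \frac{1}{\sqrt{\sigma_t^2}} \right|^{\nu-2}}{2|\lambda|^{\nu} \sqrt{\sigma_t^2}} \tag{59} \\
&= \frac{|r|^{\nu}\, \nu \left| \frac{1}{\sqrt{\sigma_t^2}} \right|^{\nu-2}}{4|\lambda|^{\nu} \sigma_t^4} - \frac{1}{2\sigma_t^2} \tag{60} \\
&= \frac{|r|^{\nu}\, \nu \left| \frac{1}{\sqrt{\sigma_t^2}} \right|^{\nu}}{4|\lambda|^{\nu} \sigma_t^2} - \frac{1}{2\sigma_t^2} \tag{61}
\end{align}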
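As for the Student-t case, the result in (61) can be compared with a finite-difference derivative of the log-density in (47). The function names and evaluation points below are hypothetical; `r` denotes the demeaned return.

```python
import math


def ged_lambda(nu):
    """Scale constant lambda(nu) from equation (46)."""
    return math.sqrt(math.gamma(1.0 / nu)
                     / (2.0 ** (2.0 / nu) * math.gamma(3.0 / nu)))


def ged_logpdf(r, sigma2, nu):
    """GED log-density of a demeaned return r, equation (47)."""
    lam = ged_lambda(nu)
    z = abs(r) / (lam * math.sqrt(sigma2))
    return (-math.log(2.0 ** (1.0 + 1.0 / nu) * math.gamma(1.0 / nu) * lam)
            - 0.5 * math.log(sigma2) + math.log(nu) - 0.5 * z ** nu)


def ged_score(r, sigma2, nu):
    """Derivative of the log-density w.r.t. sigma2, equation (61)."""
    lam = ged_lambda(nu)
    inv_sd_pow = sigma2 ** (-nu / 2.0)     # |1 / sqrt(sigma2)|^nu
    return (abs(r) ** nu * nu * inv_sd_pow / (4.0 * lam ** nu * sigma2)
            - 1.0 / (2.0 * sigma2))


def numerical_score(r, sigma2, nu, step=1e-5):
    """Central finite difference of the log-density in sigma2."""
    up = ged_logpdf(r, sigma2 + step, nu)
    down = ged_logpdf(r, sigma2 - step, nu)
    return (up - down) / (2.0 * step)


analytic = ged_score(0.7, 1.3, 1.45)
numeric = numerical_score(0.7, 1.3, 1.45)
gauss_case = ged_score(2.0, 1.0, 2.0)   # nu = 2 is the Gaussian case
```

For ν = 2 the expression reduces to the Gaussian score $(r^2 - \sigma_t^2)/(2\sigma_t^4)$, which gives 1.5 for r = 2 and $\sigma_t^2 = 1$.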
G
G.1
Results
Set A
In this part of the Appendix the parameter results of data Set A are reported. This data set uses the
values from 2007 up to and including 2011 as the validation sample. The test sample runs from 2012
till 2014.
G.1.1
Estimates
ω
α
β
ν
γ
GARCH
Coefficient
P-value
0.1102
0.000
0.1238
0.000
0.8568
0.000
GARCH-GED
Coefficient
P-value
0.1050
0.002
0.1047
0.000
0.8735
0.000
1.45438
0.000
GARCH-T
Coefficient
P-value
0.096
0.002
0.090
0.000
0.899
0.000
7.414
0.000
RGARCH
Coefficient
P-value
2.170
0.000
0.777
0.000
RGARCH-LEV
Coefficient
P-value
0.129
0.000
0.072
0.000
0.895
0.000
GJR-GARCH
Coefficient
P-value
0.108
0.000
0.040
0.023
0.879
0.000
GJR-GARCH-T
Coefficient
P-value
0.100
0.000
0.030
0.055
0.899
0.000
8.001
0.000
0.092
0.000
GJR-GARCH-GED
Coefficient
P-value
0.106
0.000
0.034
0.052
0.889
0.000
1.485
0.000
0.103
0.000
NAGARCH
Coefficient
P-value
0.110
0.000
0.124
0.000
0.857
0.000
NAGARCH-T
Coefficient
P-value
0.096
0.002
0.090
0.000
0.889
0.000
7.413
0.000
0.000
0.500
NAGARCH-GED
Coefficient
P-value
0.105
0.002
0105
0.000
0.874
0.000
1.454
0.000
0.000
0.500
GAS-T
Coefficient
P-value
0.029
0.000
0.234
0.000
0.979
0.000
9.244
0.000
GAS-GED
Coefficient
P-value
0.029
0.006
0.230
0.000
0.978
0.000
1.487
0.000
7.875
0.000
ρ
δ
-0.010
0.140
-2.120
0.001
0.113
0.000
0.000
0.500
Table 12: Set A: Estimates of parameters of the different models.
        Realized GARCH      Realized GARCH-T    Realized GARCH-GED
        coeff     pval      coeff     pval      coeff     pval
θ1       0.055    0.044      0.056    0.050      0.054    0.044
θ2       2.695    0.000      2.691    0.000      2.709    0.000
θ3       2.811    0.000      2.859    0.000      2.872    0.000
θ4      -0.083    0.425     -0.087    0.500     -0.086    0.418
θ5      -0.022    0.326     -0.013    0.382     -0.018    0.326
θ6      -0.807    0.000     -0.807    0.000     -0.807    0.000
θ7       0.048    0.463      0.044    0.500      0.045    0.455
θ8       0.022    0.471      0.023    0.500      0.022    0.478
θ9                           2.070    0.000      0.679    0.000
θ10                          2.386    0.000      0.768    0.000
Table 13: θ estimates for set A.
Coefficient   Realized GARCH   Realized GARCH-T   Realized GARCH-GED
ω              0.055            0.056              0.054
β              0.513            0.526              0.525
γ              0.466            0.450              0.454
ξ             -0.083           -0.087             -0.086
φ              0.979            0.987              0.982
σ              0.446            0.446              0.446
τ1             0.048            0.044              0.045
τ2             0.022            0.023              0.022
ν1                              9.924              1.591
ν2                             12.87               1.646
Table 14: Set A: Estimates of parameters of the different Realized GARCH models.
0,442
0,000
0,000
0,000
0,001
0,000
0,000
0,441
0,000
0,000
0,000
0,000
0,090
0,000
0,000
0,000
0,113
0,434
0,000
0,000
0,000
0,149
0,000
0,117
0,434
0,006
0,272
0,477
0,000
0,041
0,000
0,053
0,195
0,000
0,072
0,000
0,149
0,000
0,055
0,195
0,001
0,128
0,491
0,000
0,020
0,000
0,000
0,000
0,000
0,004
0,441
0,000
0,000
0,000
0,000
0,000
0,000
0,072
0,000
0,000
0,000
0,000
0,127
0,000
0,001
0,000
0,117
0,055
0,000
0,127
0,008
0,375
0,444
0,000
0,050
0,000
0,123
0,000
0,000
0,074
0,000
0,434
0,195
0,000
0,127
0,000
0,161
0,483
0,000
0,021
0,000
0,008
0,000
0,000
0,001
0,000
0,006
0,001
0,000
0,008
0,000
0,005
0,270
0,000
0,133
-G
ED
R
C
H
-T
ed
ed
G
A
R
C
H
H
G
A
A
R
C
G
ED
R
ea
liz
0,004
0,001
0,074
0,000
0,001
0,000
0,072
0,004
0,001
0,074
0,001
0,089
0,436
0,000
0,020
R
ea
liz
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
ea
liz
ed
0,000
0,123
0,000
0,074
0,000
0,434
0,195
0,000
0,127
0,000
0,000
0,161
0,483
0,000
0,021
R
0,000
0,123
0,000
0,001
0,000
0,113
0,053
0,000
0,000
0,123
0,008
0,377
0,443
0,000
0,050
SG
R
G
0,000
0,000
0,000
0,004
0,442
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,072
0,000
0,000
G
A
H
-L
G
EV
JR
-G
A
R
C
G
H
JR
-G
A
R
C
G
H
JR
-T
-G
A
R
C
N
H
A
-G
G
A
ED
R
C
H
N
A
G
A
R
C
H
N
-T
A
G
A
R
C
H
G
-G
A
ED
ST
A
R
C
R
G
A
R
C
H
G
A
R
C
H
-G
ED
-T
H
G
GARCH
GARCH-T
GARCH-GED
RGARCH
RGARCH-LEV
GJR-GARCH
GJR-GARCH-T
GJR-GARCH-GED
NAGARCH
NAGARCH-T
NAGARCH-GED
GAS-T
GAS-GED
Realized GARCH
Realized GARCH-T
Realized GARCH-GED
G
A
R
C
H
DM-test results
A
R
C
G.1.2
0,000
0,377
0,161
0,000
0,089
0,000
0,272
0,128
0,000
0,375
0,161
0,005
0,397
0,000
0,019
0,072
0,443
0,483
0,000
0,436
0,090
0,477
0,491
0,072
0,444
0,483
0,270
0,397
0,021
0,014
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,021
0,048
0,000
0,050
0,021
0,000
0,020
0,000
0,041
0,020
0,000
0,050
0,021
0,133
0,019
0,014
0,048
-
0.001
0.000
0.000
0.000
0.000
0.001
0.001
0.001
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.026
0.000
0.000
0.000
0.000
0.001
0.003
0.027
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.007
0.000
0.000
0.000
0.000
0.001
0.003
0.007
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.001
0.027
0.007
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.010
0.000
0.000
0.000
0.000
0.000
0.000
0.010
0.000
0.246
0.000
0.000
0.000
0.000
0.007
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.010
0.000
0.059
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
G
A
R
ea
C
liz
H
ed
G
A
R
R
ea
C
liz
H
ed
-T
G
A
R
C
H
-G
ED
0.000
0.257
0.059
0.000
0.000
0.000
0.000
0.000
0.000
0.246
0.059
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
R
G
ED
ea
liz
ed
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
R
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
S-
0.000
0.007
0.000
0.000
0.000
0.000
0.000
0.000
0.010
0.000
0.000
0.059
0.000
0.000
0.000
G
A
H
-L
EV
JR
-G
A
R
C
G
H
JR
-G
A
R
C
G
H
JR
-T
-G
A
R
C
N
H
A
-G
G
A
ED
R
C
H
N
A
G
A
R
C
H
N
-T
A
G
A
R
C
H
G
-G
A
ED
ST
0.000
0.007
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.007
0.000
0.257
0.000
0.000
0.000
G
R
G
A
R
C
H
R
C
H
R
G
A
A
R
C
G
R
C
0.000
0.000
0.000
0.000
0.001
0.026
0.007
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
A
G
GARCH
GARCH-T
GARCH-GED
RGARCH
RGARCH-LEV
GJR-GARCH
GJR-GARCH-T
GJR-GARCH-GED
NAGARCH
NAGARCH-T
NAGARCH-GED
GAS-T
GAS-GED
Realized GARCH
Realized GARCH-T
Realized GARCH-GED
G
A
R
C
H
H
-T
-G
ED
Table 15: Diebold-Mariano test with the Log likelihood , Set A
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
-
Table 16: Diebold-Mariano test with FMAE as forecast error, Set A.
G
A
R
ea
liz
0.416
0.117
0.245
0.000
0.414
0.005
0.086
0.415
0.120
0.245
0.002
0.284
0.000
0.000
0.000
0.021
0.311
0.437
0.000
0.284
0.263
0.301
0.475
0.020
0.317
0.438
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.143
0.179
0.024
0.415
0.191
0.000
0.005
0.004
0.009
0.024
0.408
0.191
0.010
0.301
0.000
0.000
0.000
0.083
0.356
0.419
0.000
0.086
0.002
0.009
0.082
0.361
0.420
0.005
0.475
0.000
0.000
0.000
0.000
0.002
0.001
0.000
0.415
0.308
0.024
0.082
0.003
0.001
0.000
0.020
0.000
0.000
0.000
0.003
0.000
0.014
0.000
0.120
0.160
0.408
0.361
0.003
0.014
0.000
0.317
0.000
0.000
0.000
0.001
0.012
0.000
0.000
0.245
0.277
0.191
0.420
0.001
0.014
0.000
0.438
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.002
0.002
0.010
0.005
0.000
0.000
0.000
0.000
0.000
0.000
0.000
ed
R
ea
liz
R
C
H
A
R
C
A
R
C
H
0.309
0.157
0.278
0.000
0.414
0.004
0.002
0.308
0.160
0.277
0.002
0.263
0.000
0.000
0.000
G
A
R
C
H
ed
G
A
R
R
ea
C
liz
H
ed
-T
G
A
R
C
H
-G
ED
R
G
A
R
C
H
-L
G
EV
JR
-G
A
R
C
G
H
JR
-G
A
R
C
G
H
JR
-T
-G
A
R
C
N
H
A
-G
G
A
ED
R
C
H
N
A
G
A
R
C
H
N
-T
A
G
A
R
C
H
G
-G
A
ED
ST
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
SG
ED
R
G
0.001
0.012
0.000
0.245
0.278
0.191
0.419
0.001
0.014
0.000
0.000
0.437
0.000
0.000
0.000
-G
ED
G
A
0.002
0.012
0.000
0.117
0.157
0.415
0.356
0.002
0.000
0.012
0.000
0.311
0.000
0.000
0.000
A
GARCH
GARCH-T
GARCH-GED
RGARCH
RGARCH-LEV
GJR-GARCH
GJR-GARCH-T
GJR-GARCH-GED
NAGARCH
NAGARCH-T
NAGARCH-GED
GAS-T
GAS-GED
Realized GARCH
Realized GARCH-T
Realized GARCH-GED
H
-T
G
0.002
0.001
0.000
0.416
0.309
0.024
0.083
0.000
0.003
0.001
0.000
0.021
0.000
0.000
0.000
R
C
H
G
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.143
0.094
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.179
0.094
-
Table 17: Diebold-Mariano test with FMSE as forecast error, Set A.
G.1.3
Error tests
Below the results of the tests on the residuals are reported for data Set A: the
Kolmogorov-Smirnov (KS) test and the Ljung-Box test on both the residuals (LB res) and the squared
residuals (LB res^2). The mean, standard deviation and shape parameter of the residuals are also reported in the
tables; the shape is only reported if it exists for the given distribution.
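For reference, the Ljung-Box statistic can be computed as in the sketch below. The function names and the toy residuals are hypothetical. The use of a single lag is an inference rather than something stated in the text: the p-values in the tables that follow appear consistent with a chi-square distribution with one degree of freedom (for example a statistic of 4.404 with p-value 0.036).

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    denom = sum(v * v for v in centred)
    num = sum(centred[t] * centred[t - lag] for t in range(lag, n))
    return num / denom


def ljung_box(x, lags=1):
    """Ljung-Box Q statistic using the first `lags` autocorrelations."""
    n = len(x)
    return n * (n + 2) * sum(autocorrelation(x, k) ** 2 / (n - k)
                             for k in range(1, lags + 1))


def chi2_sf_1dof(q):
    """Upper tail probability of a chi-square variable with 1 d.o.f."""
    return math.erfc(math.sqrt(q / 2.0))


# Toy standardized residuals; the tests in the tables use model residuals.
res = [1.0, -1.0, 1.0, -1.0]
q_res = ljung_box(res)                          # test on the residuals
q_res2 = ljung_box([v * v for v in [0.5, 1.5, 0.7, 2.0, 0.3]])  # squared
p_example = chi2_sf_1dof(4.404)
```

A small p-value for the squared residuals indicates remaining autocorrelation in the variance, i.e. volatility dynamics that the model has not captured.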
             KS test   LB test res   LB test res^2
Test Stat    1.000     0.002         4.404
P-val        0.0       0.965         0.036
Mu           0.010
Std          0.998
Table 18: Set A, error test, GARCH.

             KS test   LB test res   LB test res^2
Test Stat    0.999     0.033         1.937
P-val        0.000     0.856         0.164
Mu           0.033
Std          0.849
Shape        7.306
Table 19: Set A, error test, GARCH-T.

             KS test   LB test res   LB test res^2
Test Stat    1.000     0.005         3.017
P-val        0.000     0.943         0.082
Mu           0.042
Std          1.113
Shape        1.431
Table 20: Set A, error test, GARCH-GED.

             KS test   LB test res   LB test res^2
Test Stat    1.000     2.044         37.208
P-val        0.000     0.153         0.000
Mu           0.003
Std          0.998
Table 21: Set A, error test, RGARCH.
             KS test   LB test res   LB test res^2
Test Stat    0.999     0.005         2.293
P-val        0.000     0.946         0.130
Mu           0.032
Std          0.859
Shape        7.760
Table 22: Set A, error test, RGARCH-LEV.

             KS test   LB test res   LB test res^2
Test Stat    1.000     0.005         3.019
P-val        0.000     0.946         0.082
Mu           0.011
Std          0.998
Table 23: Set A, error test, GJR-GARCH.

             KS test   LB test res   LB test res^2
Test Stat    0.999     0.014         1.770
P-val        0.000     0.906         0.183
Mu           0.031
Std          0.861
Shape        7.885
Table 24: Set A, error test, GJR-GARCH-T.

             KS test   LB test res   LB test res^2
Test Stat    1.000     0.001         2.334
P-val        0.000     0.978         0.127
Mu           0.041
Std          1.135
Shape        1.462
Table 25: Set A, error test, GJR-GARCH-GED.

             KS test   LB test res   LB test res^2
Test Stat    1.000     0.002         4.405
P-val        0.000     0.965         0.036
Mu           0.010
Std          0.998
Table 26: Set A, error test, NA-GARCH.

             KS test   LB test res   LB test res^2
Test Stat    0.999     0.033         1.936
P-val        0.000     0.856         0.164
Mu           0.033
Std          0.849
Shape        7.306
Table 27: Set A, error test, NA-GARCH-T.

             KS test   LB test res   LB test res^2
Test Stat    1.000     0.005         3.016
P-val        0.000     0.943         0.082
Mu           0.042
Std          1.113
Shape        1.431
Table 28: Set A, error test, NA-GARCH-GED.

             KS test   LB test res   LB test res^2
Test Stat    0.999     0.000         3.519
P-val        0.000     0.982         0.061
Mu           0.034
Std          0.839
Shape        6.810
Table 29: Set A, error test, T-GAS.
             KS test   LB test res   LB test res^2
Test Stat    1.000     0.000         2.738
P-val        0.000     0.987         0.098
Mu           0.043
Std          1.110
Shape        1.425
Table 30: Set A, error test, GAS-GED.

             KS test   LB test res   LB test res^2
Test Stat    1.000     0.001         2.722
P-val        0.000     0.974         0.099
Mu           0.013
Std          0.990
Table 31: Set A, error test, Realized GARCH.

             KS test   LB test res   LB test res^2
Test Stat    0.997     0.001         2.722
P-val        0.000     0.974         0.099
Mu           0.033
Std          0.882
Shape        9.857
Table 32: Set A, error test, Realized GARCH-T.

             KS test   LB test res   LB test res^2
Test Stat    0.934     0.001         2.599
P-val        0.000     0.980         0.107
Mu           0.037
Std          1.198
Shape        1.573
Table 33: Set A, error test, Realized GARCH-GED.
G.2
Set B
In this part of the Appendix the parameter results of data Set B are reported. Here the validation sample
runs from 2007 up to and including 2010/04/19, one day before the Deepwater Horizon disaster.
The test sample of Set B is one year (i.e. 252 days) from 2010/04/20 onwards.
G.2.1
Estimates
ω
α
β
ν
γ
GARCH
Coefficient
P-value
0.0586
0.018
0.0901
0.000
0.8966
0.000
GARCH-GED
Coefficient
P-value
0.058
0.031
0.084
0.000
0.902
0.000
1.516
0.000
GARCH-T
Coefficient
P-value
0.057
0.031
0.084
0.000
0.906
0.000
1.516
0.000
RGARCH
Coefficient
P-value
1.678
0.000
0.817
0.000
RGARCH-LEV
Coefficient
P-value
0.104
0.003
0.063
0.000
0.905
0.000
GJR-GARCH
Coefficient
P-value
0.062
0.009
0.038
0.030
0.905
0.000
GJR-GARCH-T
Coefficient
P-value
0.067
0.017
0.039
0.034
0.905
0.000
9.161
0.000
0.076
0.000
GJR-GARCH-GED
Coefficient
P-value
0.065
0.017
0.038
0.041
0.905
0.000
1.540
0.010
0.079
0.000
NAGARCH
Coefficient
P-value
0.059
0.019
0.090
0.000
0.897
0.000
NAGARCH-T
Coefficient
P-value
0.057
0.033
0.080
0.000
0.906
0.000
8.569
0.000
0.000
0.500
NAGARCH-GED
Coefficient
P-value
0.058
0.032
0.084
0.000
0.902
0.000
1.516
0.000
0.000
0.500
GAS-T
Coefficient
P-value
0.016
0.071
0.199
0.000
0.986
0.000
10.561
0.000
GAS-GED
Coefficient
P-value
0.014
0.082
0.183
0.000
0.987
0.000
1.557
0.000
9.135
0.000
ρ
δ
-0.022
0.019
-2.271
0.004
0.082
0.003
0.000
0.500
Table 34: Estimates of parameters of the different models
        Realized GARCH      Realized GARCH-T    Realized GARCH-GED
        coeff     pval      coeff     pval      coeff     pval
θ1       0.037    0.101      0.025    0.263      0.030    0.203
θ2       2.956    0.000      3.025    0.000      3.036    0.000
θ3       3.108    0.000      3.190    0.000      3.219    0.000
θ4       0.070    0.500      0.025    0.500      0.026    0.469
θ5       0.005    0.456     -0.004    0.472      0.002    0.486
θ6      -0.850    0.000     -0.851    0.000     -0.850    0.000
θ7      -0.092    0.500     -0.026    0.500     -0.039    0.453
θ8       0.139    0.500      0.071    0.500      0.084    0.399
θ9                           2.459    0.000      0.888    0.000
θ10                          2.436    0.000      0.729    0.000
Table 35: θ estimates for set B.
Coefficient   Realized GARCH   Realized GARCH-T   Realized GARCH-GED
ω              0.037            0.025              0.030
β              0.525            0.529              0.534
γ              0.449            0.451              0.444
ξ              0.070            0.025              0.026
φ              1.005            0.996              1.002
σ              0.427            0.427              0.427
τ1            -0.092           -0.026             -0.039
τ2             0.139            0.071              0.084
ν1                             13.695              1.710
ν2                             13.432              1.622
Table 36: Set B: Estimates of parameters of the different Realized GARCH models.
G.3
DM-test results
Model                      (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)   (13)   (14)   (15)   (16)
(1) GARCH                    -  0.043  0.032  0.007  0.102  0.200  0.039  0.031  0.124  0.043  0.032  0.070  0.028  0.351  0.061  0.097
(2) GARCH-T              0.043      -  0.182  0.005  0.328  0.091  0.261  0.472  0.043  0.058  0.182  0.065  0.187  0.191  0.390  0.416
(3) GARCH-GED            0.032  0.182      -  0.005  0.476  0.091  0.143  0.233  0.032  0.181  0.199  0.428  0.338  0.228  0.283  0.480
(4) RGARCH               0.007  0.005  0.005      -  0.009  0.008  0.006  0.005  0.007  0.005  0.005  0.005  0.005  0.005  0.004  0.004
(5) RGARCH-L             0.102  0.328  0.476  0.009      -  0.129  0.070  0.303  0.102  0.327  0.476  0.445  0.450  0.261  0.326  0.496
(6) GJR-GARCH            0.200  0.091  0.091  0.008  0.129      -  0.054  0.038  0.200  0.091  0.091  0.155  0.083  0.481  0.103  0.173
(7) GJR-GARCH-T          0.039  0.261  0.143  0.006  0.070  0.054      -  0.211  0.039  0.263  0.143  0.122  0.134  0.164  0.477  0.348
(8) GJR-GARCH-GED        0.031  0.472  0.233  0.005  0.303  0.038  0.211      -  0.031  0.470  0.233  0.281  0.189  0.188  0.382  0.426
(9) NA-GARCH             0.124  0.043  0.032  0.007  0.102  0.200  0.039  0.031      -  0.043  0.032  0.070  0.028  0.351  0.061  0.097
(10) NA-GARCH-T          0.043  0.058  0.181  0.005  0.327  0.091  0.263  0.470  0.043      -  0.181  0.064  0.186  0.191  0.390  0.415
(11) NA-GARCH-GED        0.032  0.182  0.199  0.005  0.476  0.091  0.143  0.233  0.032  0.181      -  0.428  0.339  0.228  0.283  0.480
(12) GAS-T               0.070  0.065  0.428  0.005  0.445  0.155  0.122  0.281  0.070  0.064  0.428      -  0.481  0.247  0.252  0.457
(13) GAS-GED             0.028  0.187  0.338  0.005  0.450  0.083  0.134  0.189  0.028  0.186  0.339  0.481      -  0.234  0.270  0.463
(14) Realized GARCH      0.351  0.191  0.228  0.005  0.261  0.481  0.164  0.188  0.351  0.191  0.228  0.247  0.234      -  0.039  0.029
(15) Realized GARCH-T    0.061  0.390  0.283  0.004  0.326  0.103  0.477  0.382  0.061  0.390  0.283  0.252  0.270  0.039      -  0.081
(16) Realized GARCH-GED  0.097  0.416  0.480  0.004  0.496  0.173  0.348  0.426  0.097  0.415  0.480  0.457  0.463  0.029  0.081      -
Table 37: Diebold-Mariano test with the Log likelihood for set B.
Model                      (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)   (13)   (14)   (15)   (16)
(1) GARCH                    -  0.315  0.489  0.008  0.000  0.389  0.365  0.439  0.002  0.316  0.491  0.035  0.003  0.006  0.006  0.006
(2) GARCH-T              0.315      -  0.103  0.006  0.000  0.457  0.335  0.395  0.314  0.483  0.103  0.007  0.000  0.003  0.003  0.003
(3) GARCH-GED            0.489  0.103      -  0.006  0.000  0.399  0.383  0.448  0.490  0.106  0.077  0.018  0.001  0.004  0.004  0.004
(4) RGARCH               0.008  0.006  0.006      -  0.112  0.025  0.014  0.016  0.008  0.006  0.006  0.000  0.001  0.000  0.000  0.000
(5) RGARCH-L             0.000  0.000  0.000  0.112      -  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
(6) GJR-GARCH            0.389  0.457  0.399  0.025  0.000      -  0.004  0.001  0.389  0.457  0.399  0.117  0.110  0.019  0.019  0.018
(7) GJR-GARCH-T          0.365  0.335  0.383  0.014  0.000  0.004      -  0.029  0.365  0.335  0.383  0.161  0.178  0.026  0.026  0.026
(8) GJR-GARCH-GED        0.439  0.395  0.448  0.016  0.000  0.001  0.029      -  0.439  0.396  0.447  0.150  0.160  0.024  0.024  0.024
(9) NA-GARCH             0.002  0.314  0.490  0.008  0.000  0.389  0.365  0.439      -  0.315  0.491  0.035  0.003  0.006  0.006  0.006
(10) NA-GARCH-T          0.316  0.483  0.106  0.006  0.000  0.457  0.335  0.396  0.315      -  0.106  0.007  0.000  0.003  0.003  0.003
(11) NA-GARCH-GED        0.491  0.103  0.077  0.006  0.000  0.399  0.383  0.447  0.491  0.106      -  0.018  0.001  0.004  0.004  0.004
(12) GAS-T               0.035  0.007  0.018  0.000  0.000  0.117  0.161  0.150  0.035  0.007  0.018      -  0.180  0.026  0.030  0.027
(13) GAS-GED             0.003  0.000  0.001  0.001  0.000  0.110  0.178  0.160  0.003  0.000  0.001  0.180      -  0.018  0.019  0.018
(14) Realized GARCH      0.006  0.003  0.004  0.000  0.000  0.019  0.026  0.024  0.006  0.003  0.004  0.026  0.018      -  0.144  0.239
(15) Realized GARCH-T    0.006  0.003  0.004  0.000  0.000  0.019  0.026  0.024  0.006  0.003  0.004  0.030  0.019  0.144      -  0.184
(16) Realized GARCH-GED  0.006  0.003  0.004  0.000  0.000  0.018  0.026  0.024  0.006  0.003  0.004  0.027  0.018  0.239  0.184      -
Table 38: Diebold-Mariano test with FMAE as forecast error for Set B.
Model                      (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)   (13)   (14)   (15)   (16)
(1) GARCH                    -  0.411  0.489  0.033  0.003  0.324  0.438  0.401  0.054  0.412  0.489  0.448  0.046  0.496  0.494  0.495
(2) GARCH-T              0.411      -  0.292  0.026  0.004  0.385  0.483  0.451  0.411  0.447  0.292  0.393  0.004  0.468  0.465  0.466
(3) GARCH-GED            0.489  0.292      -  0.028  0.003  0.352  0.452  0.420  0.489  0.295  0.279  0.435  0.009  0.493  0.491  0.492
(4) RGARCH               0.033  0.026  0.028      -  0.179  0.068  0.055  0.059  0.033  0.026  0.028  0.009  0.018  0.009  0.009  0.009
(5) RGARCH-L             0.003  0.004  0.003  0.179      -  0.001  0.001  0.001  0.003  0.004  0.003  0.007  0.003  0.006  0.006  0.006
(6) GJR-GARCH            0.324  0.385  0.352  0.068  0.001      -  0.073  0.053  0.324  0.385  0.352  0.387  0.189  0.412  0.409  0.411
(7) GJR-GARCH-T          0.438  0.483  0.452  0.055  0.001  0.073      -  0.124  0.438  0.483  0.452  0.443  0.226  0.472  0.470  0.471
(8) GJR-GARCH-GED        0.401  0.451  0.420  0.059  0.001  0.053  0.124      -  0.400  0.451  0.420  0.426  0.217  0.454  0.452  0.453
(9) NA-GARCH             0.054  0.411  0.489  0.033  0.003  0.324  0.438  0.400      -  0.411  0.488  0.448  0.046  0.496  0.494  0.495
(10) NA-GARCH-T          0.412  0.447  0.295  0.026  0.004  0.385  0.483  0.451  0.411      -  0.295  0.392  0.004  0.467  0.464  0.466
(11) NA-GARCH-GED        0.489  0.292  0.279  0.028  0.003  0.352  0.452  0.420  0.488  0.295      -  0.435  0.009  0.493  0.491  0.492
(12) GAS-T               0.448  0.393  0.435  0.009  0.007  0.387  0.443  0.426  0.448  0.392  0.435      -  0.272  0.433  0.437  0.434
(13) GAS-GED             0.046  0.004  0.009  0.018  0.003  0.189  0.226  0.217  0.046  0.004  0.009  0.272      -  0.271  0.266  0.268
(14) Realized GARCH      0.496  0.468  0.493  0.009  0.006  0.412  0.472  0.454  0.496  0.467  0.493  0.433  0.271      -  0.473  0.465
(15) Realized GARCH-T    0.494  0.465  0.491  0.009  0.006  0.409  0.470  0.452  0.494  0.464  0.491  0.437  0.266  0.473      -  0.483
(16) Realized GARCH-GED  0.495  0.466  0.492  0.009  0.006  0.411  0.471  0.453  0.495  0.466  0.492  0.434  0.268  0.465  0.483      -
Table 39: Diebold-Mariano test with FMSE as forecast error, Set B.
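As a rough illustration of how a single pairwise entry in the Diebold-Mariano tables could be obtained, the sketch below computes the test statistic from two per-period loss series (for example squared or absolute forecast errors). It is not the authors' code. The function names and the toy numbers are hypothetical, the variance estimate is the simple one that suits one-step-ahead forecasts (no autocorrelation correction), and the one-sided p-value is an assumption, made only because no tabulated value exceeds 0.5.

```python
import math
from statistics import fmean


def normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def diebold_mariano(loss_a, loss_b):
    """Diebold-Mariano statistic for two per-period loss series.

    Returns (statistic, one_sided_p). A negative statistic means that
    model A has the lower average loss. The plain sample variance is
    used, which assumes one-step-ahead forecasts (an assumption here).
    """
    if len(loss_a) != len(loss_b) or len(loss_a) < 2:
        raise ValueError("loss series must have equal length of at least 2")
    diff = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(diff)
    d_bar = fmean(diff)
    var_d = sum((d - d_bar) ** 2 for d in diff) / (n - 1)
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    one_sided_p = normal_cdf(-abs(stat))
    return stat, one_sided_p


# Hypothetical toy data: a volatility proxy and two competing forecasts.
proxy = [1.0, 2.0, 1.5, 3.0, 2.5, 1.2]
forecast_a = [1.1, 1.9, 1.6, 2.8, 2.4, 1.3]
forecast_b = [1.5, 1.2, 2.2, 2.0, 3.3, 0.6]

# Squared-error losses; absolute errors would be used for an FMAE comparison.
se_a = [(f - y) ** 2 for f, y in zip(forecast_a, proxy)]
se_b = [(f - y) ** 2 for f, y in zip(forecast_b, proxy)]

stat, p_value = diebold_mariano(se_a, se_b)
print(f"DM statistic: {stat:.3f}, one-sided p-value: {p_value:.3f}")
```

For multi-step forecasts the variance of the loss differential would normally need a long-run (autocorrelation-robust) estimate, which this sketch deliberately leaves out.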
G.3.1
Error tests
The tables below report the residual diagnostics for data Set B: the results of the
Kolmogorov-Smirnov (KS) test and of the Ljung-Box test on both the residuals (LB res) and the squared
residuals (LB res²). Each table also lists the mean (Mu), the standard deviation (Std) and, where the
given distribution has a shape parameter, the shape of the residuals.
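To make the Ljung-Box columns easier to read, here is a minimal sketch of the statistic and its p-value. The p-values reported in the tables agree with a chi-squared distribution with one degree of freedom (a statistic of 0.910 maps to roughly 0.340), so the sketch uses a single lag by default; that is inferred from the numbers and is not stated by the authors. The function names and the residual series are hypothetical.

```python
import math
from statistics import fmean


def ljung_box_q(x, lags=1):
    """Ljung-Box Q statistic for the first `lags` autocorrelations of x."""
    n = len(x)
    if n <= lags + 1:
        raise ValueError("series is too short for the requested lags")
    mean = fmean(x)
    dev = [v - mean for v in x]
    denom = sum(v * v for v in dev)
    if denom == 0.0:
        raise ValueError("series has zero variance")
    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


def chi2_sf_one_df(q):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical standardized residuals, for illustration only.
residuals = [0.3, -1.1, 0.8, 0.2, -0.5, 1.4, -0.9, 0.1, -0.2, 0.6]

q_res = ljung_box_q(residuals)                       # "LB test res"
q_res2 = ljung_box_q([r * r for r in residuals])     # "LB test res²"
p_res = chi2_sf_one_df(q_res)
p_res2 = chi2_sf_one_df(q_res2)
print(f"LB res:  Q = {q_res:.3f}, p = {p_res:.3f}")
print(f"LB res²: Q = {q_res2:.3f}, p = {p_res2:.3f}")

# Cross-check against a reported pair (statistic 0.910, p-value 0.340).
print(f"p-value for Q = 0.910 with 1 df: {chi2_sf_one_df(0.910):.3f}")
```

The p-value helper only covers the single-lag case. With more lags, the chi-squared distribution with the matching number of degrees of freedom would be needed instead.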
            KS test   LB test res   LB test res²
Test Stat     1.000         0.910          4.361
P-val         0.000         0.340          0.037
Mu            0.018
Std           0.996
Table 40: Set B, error test, GARCH.
Table 41: Diebold Mariano tests Realized GARCH, Set B
            KS test   LB test res   LB test res²
Test Stat     0.999         1.043          3.925
P-val         0.000         0.307          0.048
Mu            0.040
Std           0.867
Shape         8.348
Table 42: Set B, error test, GARCH-T.
            KS test   LB test res   LB test res²
Test Stat     1.000         0.985          4.117
P-val         0.000         0.321          0.042
Mu            0.054
Std           1.144
Shape         1.478
Table 43: Set B, error test, GARCH-GED.
            KS test   LB test res   LB test res²
Test Stat     1.000         4.814         16.600
P-val         0.000         0.028          0.000
Mu            0.013
Std           0.997
Table 44: Set B, error test, RGARCH.
            KS test   LB test res   LB test res²
Test Stat     0.999         0.966          4.431
P-val         0.000         0.326          0.035
Mu            0.038
Std           0.876
Shape         8.887
Table 45: Set B, error test, RGARCH-LE
            KS test   LB test res   LB test res²
Test Stat     1.000         0.994          4.006
P-val         0.000         0.319          0.045
Mu            0.018
Std           0.996
Table 46: Set B, error test, GJR-GARCH.
            KS test   LB test res   LB test res²
Test Stat     0.999         1.044          3.961
P-val         0.000         0.307          0.047
Mu            0.038
Std           0.876
Shape         8.906
Table 47: Set B, error test, GJR-GARCH-T.
            KS test   LB test res   LB test res²
Test Stat     1.000         1.027          3.974
P-val         0.000         0.311          0.046
Mu            0.052
Std           1.160
Shape         1.502
Table 48: Set B, error test, GJR-GARCH-GED.
            KS test   LB test res   LB test res²
Test Stat     1.000         0.910          4.361
P-val         0.000         0.340          0.037
Mu            0.018
Std           0.996
Table 49: Set B, error test, NA-GARCH.
            KS test   LB test res   LB test res²
Test Stat     0.999         1.044          3.922
P-val         0.000         0.307          0.048
Mu            0.040
Std           0.867
Shape         8.347
Table 50: Set B, error test, NA-GARCH-T.
            KS test   LB test res   LB test res²
Test Stat     1.000         0.985          4.117
P-val         0.000         0.321          0.042
Mu            0.054
Std           1.144
Shape         1.477
Table 51: Set B, error test, NA-GARCH-GED.
            KS test   LB test res   LB test res²
Test Stat     0.999         0.931          5.102
P-val         0.000         0.335          0.024
Mu            0.044
Std           0.844
Shape         7.253
Table 52: Set B, error test, T-GAS.
            KS test   LB test res   LB test res²
Test Stat     1.000         0.918          3.946
P-val         0.000         0.338          0.047
Mu            0.054
Std           1.144
Shape         1.476
Table 53: Set B, error test, GAS-GED.
            KS test   LB test res   LB test res²
Test Stat     1.000         0.942          2.010
P-val         0.000         0.332          0.156
Mu            0.024
Std           0.982
Table 54: Set B, error test, Realized GARCH.
            KS test   LB test res   LB test res²
Test Stat     0.983         0.906          1.964
P-val         0.000         0.341          0.161
Mu            0.039
Std           0.910
Shape        13.483
Table 55: Set B, error test, Realized GARCH-T.
            KS test   LB test res   LB test res²
Test Stat     0.980         0.828          1.969
P-val         0.000         0.363          0.161
Mu            0.044
Std           1.199
Shape         1.656
Table 56: Set B, error test, Realized GARCH-GED.
G.4
Plots
G.4.1
Set A
Figure 8: Set A, Results vs Kernel.
Figure 9: Set A, Results vs Returns.
G.4.2
Set B
Figure 10: Set B, Results vs Kernel.
Figure 11: Set B, Results vs Returns.