Model Assisted Statistics and Applications 17 (2022) 285–297
DOI 10.3233/MAS-220410
IOS Press
Option price predictability, splines, and
expanded rationality
Huijian Dong a,b,∗ and Xiaomin Guo c
a School of Business, New Jersey City University, Jersey City, NJ, USA
b Teachers College, Columbia University, New York, NY, USA
c Kate Tiedemann School of Business and Finance, University of South Florida, FL, USA
Abstract. The current practice of option price forecast relies on the outputs of various option pricing models. The expected value
of the current option price is widely regarded as the best forecast for the future price, assuming the option prices evolve with
a Brownian motion. However, volatility clustering, transaction illiquidity, and demand-supply imbalance drive the future option
prices off the modeled price targets. Therefore, we suggest using the spline method to forecast option prices directly. The focus
is the accuracy of the forecasted asset price in the next period, rather than whether the pricing models correctly produce the current
price. We use fifteen years of daily SPY American option contract prices to examine the spline model forecast accuracy. Among
the 476,882 forecasts produced, the mean forecasting error is $3.66 × 10^−3, with a standard deviation of 1.33 and a median
error of $5.54 × 10^−17. The forecast accuracy is stable across contracts with different terms and moneyness. The spline forecast
model accounts for illiquidity and avoids the main pitfalls of the current leading option pricing techniques.
Keywords: Option pricing, forecast, spline model
1. Introduction
The Black-Scholes model inspired extensive research on pricing contingent assets, and much of this research in the past decades is introspective. It focuses on the limits of the Black-Scholes model specification and its incompatibility with the prices realized in the marketplace. According to Kou (2008), the efforts concentrate on a few critical violations of the Black-Scholes assumptions. Though the newer versions of the model attempt to mitigate some concerns, the five main areas below motivate various proposals for better option pricing models.
1.1. The limits of the Black-Scholes model
Asset returns do not follow normal distributions but present leptokurtic attributes. The returns of most underlying
assets are asymmetric with fat tails and abnormal values at higher moments. For example, the average daily return
of Apple, Inc. is 0.11% from December 15, 1980, to March 8, 2021, with a skewness value of −0.38 and a kurtosis value of 17.91. The returns based on its historical stock prices reject the null hypothesis of normality at the
significance level of 1%. Assets with smaller market capitalizations present more substantial deviations from a normal distribution. Also, even assets that are considered well-diversified and highly liquid suffer from the non-normal
pattern. For example, the Exchange-Traded Fund (ETF) SPY, which tracks the performance of the Standard and Poor's
500 Index, has a skewness of −0.05 and a kurtosis of 12.06 from January 29, 1993, to March 8, 2021.
∗ Corresponding author: Huijian Dong, School of Business, New Jersey City University, 200 Hudson St, Jersey City, NJ 07032, USA. Tel.: +1
201 200 3168; Fax: +1 201 200 2004; E-mail: hdong@njcu.edu.
ISSN 1574-1699/$35.00 © 2022 – IOS Press. All rights reserved.
H. Dong and X. Guo / Option price predictability, splines, and expanded rationality
These distribution parameters reject the null hypothesis of normality in the Jarque-Bera test at the significance level
of 1%.
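As a minimal sketch of how such a normality test can be checked by hand, the snippet below computes the Jarque-Bera statistic from a sample size, a skewness, and a kurtosis. The inputs are the values reported for the option ask price in Table 1 (476,882 observations, skewness 1.39, kurtosis 5.63); the function name and the hard-coded 1% critical value of the chi-square distribution with two degrees of freedom are illustrative choices, not part of the original study.

```python
def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness, and (non-excess) kurtosis."""
    excess = kurtosis - 3.0
    return n / 6.0 * (skewness ** 2 + excess ** 2 / 4.0)


# Chi-square critical value with 2 degrees of freedom at the 1% level.
CHI2_2DF_1PCT = 9.21

# Inputs taken from the option ask price row of Table 1.
jb_ask = jarque_bera(476_882, 1.39, 5.63)
print(f"JB = {jb_ask:.3e}, reject normality at 1%: {jb_ask > CHI2_2DF_1PCT}")
```

With the rounded inputs the statistic comes out close to the value of 2.92E+05 reported in Table 1; the small gap is consistent with rounding of the skewness and kurtosis.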
The financial market applies different risk measures. The two most widely used are the standard deviation and the value-at-risk (VaR). Neither is fully satisfactory: the former is prone to frequency conversion errors, and the latter violates the subadditivity axiom. For example, when monthly asset return standard deviations serve as the Black-Scholes model's risk input, the other model inputs must be converted to monthly variables to match the frequency. This conversion produces errors by omitting the other inputs' volatilities. By compounding the average daily risk-free interest rates to monthly rates, one implicitly assumes that the variable under conversion is a constant without variation. Besides, the frequently used VaR cannot serve as an input for the Black-Scholes model, even though it is quantile-based and free from the normality assumption.
The autocorrelations of asset prices are generally insignificant at lower sampling frequencies, and asset prices are therefore regarded as following a random walk. However, the volatility of returns carries positive autocorrelation. Such autocorrelation is referred to as the volatility clustering effect, which describes the phenomenon that higher volatilities follow higher volatilities. Volatility clustering implies that the autocorrelation of asset returns is not zero, particularly when observed with high-frequency data. The Black-Scholes model assumes that asset prices move with independent increments as a Lévy process and therefore does not accommodate the volatility clustering effect.
The theoretical volatility used as the Black-Scholes formula's input is constant across the call and put options of the same asset. However, the implied volatilities derived from the realized prices of call and put options with different strike prices and maturities on the same asset usually differ. The phenomenon that at-the-money options carry lower implied volatilities is referred to as the volatility smile or volatility smirk because the implied volatility is a convex or quasi-convex function of the strike price.
The microstructure of the option transaction does not have a role in the Black-Scholes model. Except for option
contracts created based on the highly liquid underlying assets, most option transactions are conducted in a less liquid
environment. In this case, the best ask and bid prices are moved by each contract’s demand and supply.
Drapeau, Wang and Wang (2021) confirm the risk premium from this perspective in the option prices. Given
that each contract carries specific desirable characteristics from an investor’s perspective, the ask and bid prices are
affected by the interactions at the market’s microstructure level. However, the Black-Scholes model does not include
a liquidity premium to reflect the significant impact of illiquidity.
Therefore, there is a need for an innovative view and method for option pricing. Such a method not only needs to avoid the issues associated with the Black-Scholes model described above but also needs to meet the following criteria:
The method should carry economic meaning that makes sense from an empirical perspective. The method's attributes need to be observable in the marketplace and consistent with the behavior of an investor with rational expectations and sentiment.
Besides, it should be robust and adaptive to empirical needs. The method should price options correctly without producing arbitrage opportunities when faced with extreme market conditions such as fat-tailed volatilities or thin liquidity.
Furthermore, it should be simple and easy to compute. The ideal method should yield closed-form solutions for standard options and path-dependent options, so that the model is solved analytically rather than numerically.
1.2. The stochastic volatility models
Studies in the option pricing field strive to fix the Black-Scholes model’s five issues described in the previous
section. The development of pricing models and techniques is summarized below.
Concerning the normality assumption problem, Barndorff-Nielsen and Shephard (2001) propose using the generalized hyperbolic models to fit the asset returns with other distributions to manage the heterogeneous skewness
and kurtosis. The five parameters of the superclass’s generalized form can be controlled to specify one of the five
probability density functions: the Student’s t-distribution, the Laplace distribution, the hyperbolic distribution, the
normal-inverse Gaussian distribution, and the variance-gamma distribution.
Concerning the volatility clustering problem, Mandelbrot (1963) sheds light on the asset return dependence issue by replacing the Brownian motion assumption of an asset with a fractional Brownian motion, which allows for dependent asset price updates. Rogers (1997), however, reminds us that this causes inconsistent asset pricing and hence arbitrage opportunities. Past studies also contribute a significant number of investigations on constant elasticity of variance models and stochastic volatility models to include the volatility clustering effect. Representative studies include Cox and Ross (1976), Engle (1982), Hull and White (1987), Heston (1993), and Davydov and Linetsky (2001). Recently, Marks and Simon (2017) further conclude that sector implied volatilities overreact to both positive and negative idiosyncratic returns and substantially reverse in the next period. The Heston model is a commonly used stochastic volatility model. This model specifies that the randomness of the variance process varies as the square root of variance. The differential equation for variance is:
$$d\nu_t = \theta(\omega - \nu_t)\,dt + \xi\sqrt{\nu_t}\,dB_t$$
In the definition, ω is the mean long-term variance, θ is the rate at which the variance reverts to the long-term mean, and ξ is the volatility of the variance process. The essence of the Heston model is that the variance reverts to a stable long-term level while its randomness is proportional to the square root of its current level.
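To make the mean-reverting behavior of this variance equation concrete, the following sketch simulates it with a simple Euler discretization. It is an illustration under stated assumptions rather than the procedure of any cited study: the function name, the parameter values, the step size, and the truncation of negative values at zero are all choices made here.

```python
import math
import random


def simulate_heston_variance(v0, omega, theta, xi, dt, n_steps, seed=0):
    """Euler scheme for dv = theta*(omega - v)*dt + xi*sqrt(v)*dB, floored at zero."""
    rng = random.Random(seed)
    v = v0
    path = [v]
    for _ in range(n_steps):
        d_b = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        v = v + theta * (omega - v) * dt + xi * math.sqrt(v) * d_b
        v = max(v, 0.0)  # truncate at zero so the square root stays defined
        path.append(v)
    return path


# Hypothetical parameters: variance starts at 0.09 and reverts toward omega = 0.04.
calm = simulate_heston_variance(0.09, 0.04, 2.0, 0.0, 1 / 252, 2520)
noisy = simulate_heston_variance(0.09, 0.04, 2.0, 0.3, 1 / 252, 2520, seed=1)
print(calm[-1], min(noisy), max(noisy))
```

Setting ξ to zero removes the random term, so the first path decays deterministically toward ω; the second path shows the same pull with noise whose size shrinks as the variance approaches zero.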
Another prevailing stochastic volatility model is the Constant Elasticity of Variance model. The relationship between volatility and price is:
$$dS_t = \mu S_t\,dt + \sigma S_t^{\gamma}\,dW_t$$
If γ > 1, the volatility increases as price increases, and if γ < 1, the volatility increases as price decreases. This model is sometimes only regarded as a local volatility model because it does not carry an independent stochastic process.
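The role of γ can be read off by dividing the diffusion term by the price: the instantaneous volatility of returns is σ·S^(γ−1). The short sketch below evaluates this expression at a few hypothetical price levels; the function name and the numbers are illustrative assumptions.

```python
def cev_return_volatility(price, sigma, gamma):
    """Instantaneous return volatility implied by dS = mu*S*dt + sigma*S**gamma*dW."""
    return sigma * price ** (gamma - 1.0)


# Hypothetical price levels and sigma; gamma = 1 recovers constant volatility.
for gamma in (0.5, 1.0, 1.5):
    vols = [cev_return_volatility(s, 0.2, gamma) for s in (50.0, 100.0, 200.0)]
    print(gamma, [round(v, 4) for v in vols])
```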
The Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model is frequently used to relax the constant variance assumption made by the traditional Black-Scholes model. The model differs from the Heston model in that it assumes the randomness of the variance changes with the variance. In contrast, the Heston model assumes the randomness of the variance changes with the square root of variance. Though the variance produced by GARCH improves on the Black-Scholes model assumption, it is not stochastic since the volatility is a deterministic function of the previously realized volatility rather than a perfectly random process.
$$d\nu_t = \theta(\omega - \nu_t)\,dt + \xi\nu_t\,dB_t$$
A similar attempt is the 3/2 model, though it assumes the randomness of the variance changes with $\nu_t^{3/2}$, and the differential equation for variance is:
$$d\nu_t = \nu_t(\omega - \theta\nu_t)\,dt + \xi\nu_t^{3/2}\,dB_t$$
Furthermore, the efforts to relax the Black-Scholes model assumptions go beyond the correction of the volatility setting. Both academia and financial industry practitioners devote more attention to addressing the volatility smile issue. This issue is mainly due to the mismatch between the option values produced by the pricing models and the realized prices quoted in the marketplace. When the observed asset prices experience a dramatic change, the pricing models lack the adaptivity to incorporate the volatility shocks at the deep-in-the-money and the deep-out-of-the-money ends. Allowing for random volatility is not sufficient for some often-seen price jumps. Models without volatility jumps may be fundamentally misspecified because they need implausibly large volatility shocks to generate large option price movements. For example, the Heston model, which has a square-root stochastic variance, requires an eight-standard-deviation shock in returns to generate a price gap. The gap is unattainable by adjusting the square-root specification. Jacquier et al. (2001) find that the same misspecification still exists in a log-variance model. There is a need to incorporate the price jumps in the model rather than relying on extremely fat tails produced solely by stochastic volatilities. Therefore, the recent literature mainly focuses on applying and advancing the jump-diffusion models in the option pricing process.
1.3. The jump-diffusion models
Concerning resolving multiple issues with the Black-Scholes model, the most recent development pursues the jump-diffusion models. Some representative studies are Kou (2002) and Eraker et al. (2003). These models carry features that improve the quality of option pricing: they reproduce the leptokurtic feature of the return distribution, incorporate the volatility smile and smirk phenomena, and derive closed-form solutions for path-dependent options. Given the constant volatility problem of the Black-Scholes model, the stochastic volatility models prevail. Stochastic volatility models incorporate the variance as a stochastic process to relax the assumption of fixed volatility of the underlying asset.
The building blocks of a jump-diffusion model are a Poisson process and a Brownian motion. The former fits the
jumps in asset price, and the latter fits the diffusion. This model mimics the asset price series that sometimes jumps
and has a continuous but random evolution between the jumps.
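The two building blocks can be combined in a few lines. The sketch below simulates a price path whose log-returns are the sum of a Brownian diffusion and a compound Poisson jump term. It assumes normally distributed log-jump sizes for simplicity, which differs from the double-exponential jumps of Kou (2002); the function names and all parameter values are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw a Poisson variate by multiplying uniforms (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_jump_diffusion(s0, mu, sigma, lam, jump_mean, jump_std,
                            dt, n_steps, seed=0):
    """Log-price = Brownian diffusion + compound Poisson jumps (normal jump sizes)."""
    rng = random.Random(seed)
    log_s = math.log(s0)
    path = [s0]
    for _ in range(n_steps):
        shock = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        diffusion = (mu - 0.5 * sigma ** 2) * dt + shock
        n_jumps = poisson_draw(rng, lam * dt)  # number of jumps in this step
        jumps = sum(rng.gauss(jump_mean, jump_std) for _ in range(n_jumps))
        log_s += diffusion + jumps
        path.append(math.exp(log_s))
    return path


# Hypothetical one-year daily paths: without and with randomness.
smooth = simulate_jump_diffusion(100.0, 0.05, 0.0, 0.0, 0.0, 0.0, 1 / 252, 252)
jumpy = simulate_jump_diffusion(100.0, 0.05, 0.2, 5.0, -0.02, 0.05, 1 / 252, 252)
print(smooth[-1], jumpy[-1])
```

With the volatility and the jump intensity set to zero the path grows deterministically at the drift rate, which gives a simple sanity check on the discretization.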
The stochastic volatility model is developed from the continuous-time form of the Black-Scholes formula:
$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0$$
$$dS_t = \mu S_t\,dt + \sigma S_t\,dz_t$$
The maximum likelihood estimator of σ is
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(t_i - t_{i-1})\left(\frac{\ln\dfrac{S_{t_i}}{S_{t_{i-1}}}}{t_i - t_{i-1}} - \frac{\ln\dfrac{S_{t_n}}{S_{t_0}}}{t_n - t_0}\right)^{2}$$
Allowing for time-dependent volatility,
$$dS_t = \mu S_t\,dt + \sqrt{\nu_t}\,S_t\,dz_t$$
$$d\nu_t = \alpha_{\nu,t}\,dt + \beta_{\nu,t}\,dB_t$$
In the definition of volatility, $\alpha_{\nu,t}$ and $\beta_{\nu,t}$ are functions of $\nu_t$, and $dB_t$ is a standard normally distributed variable. The form of $\nu_t$ depends on the specific setting of the stochastic models summarized in Section 1.2.
However, one key demand is missing from the jump-diffusion models: the volatility clustering effect. The stochastic volatility models resolve this concern. Duffie et al. (2000) propose combining jump-diffusions with stochastic volatilities, resulting in the affine jump-diffusion models. Filipović and Larsson (2020) refine the model to accommodate the non-normal distribution, the fit to the volatility smile, and the volatility clustering effect. However, the affine jump-diffusion model is impractical for the financial industry as its users find it difficult to solve, even numerically.
2. The spline model
Despite the efforts summarized above, two significant gaps remain in the field of forecasting option prices. Regarding the first gap, the latest developments in option pricing models, such as the affine jump-diffusion models, do not address all of the Black-Scholes model's remaining issues. The two obstacles left are the risk measures and the liquidity premium. As we specified in the previous section, the financial market applies different risk measures, namely the standard deviation and the value-at-risk (VaR). The former carries conversion errors, and the latter fails to serve as the input of a parametric model. In addition, none of the recently developed option pricing models sheds light on the liquidity premium of option prices in the real market environment. To sum up, the first gap is the lack of connection between the theoretical prices and the quality of execution in the actual marketplace.
Regarding the second gap, under the current state of knowledge, the option price forecast still relies on the option
pricing models’ results. The best forecast of the future price is the expected value of the current option price. From
this perspective, the current literature blends the two topics: forecasting future option prices and pricing current
option contracts. There is no forecast method proposed or applied independently from the option pricing models.
The need to forecast option prices is apparent and meaningful. The purpose of a high-quality forecast of option prices is not limited to pursuing excess return. It may also improve the budgeting process of portfolio risk management if options are involved as hedging tools.
However, relying on a perfect option pricing model to conduct the price forecast is unrealistic. This is not only because such a perfect model does not exist but also because most option contract transactions are illiquid. The thin liquidity causes option price jumps that leave gaps in the continuous time-series data. Hence the best forecast for the future price may not be the expected value of the current price.
Therefore, we explore an independent option price forecast model free from the option pricing models. The focus is the accuracy of the forecasted asset price at t + 1, rather than whether the price produced by pricing models fits the asset price at t. In this forecast model, we regard the realized option prices as exogenous and do not assume their distributions. We also take into consideration the market liquidity conditions. The forecast takes the form:
$$p_{t+1} = F^{\text{spline}}_{1,2,\ldots}(p_t, p_{t-1}, \ldots, p_1)$$
A good forecast model should carry additional features besides accuracy. These additional criteria include simplicity, minimal assumptions, and a degree of accuracy that is independent of market conditions. However, option prices are affected by factors from four perspectives: the fundamental value of the asset, the specifications of the option contract, the dynamic game among market makers, buyers, and sellers, and investor expectations. It is therefore challenging to develop a simple yet effective forecasting model with minimal assumptions.
We therefore change course and consider the option prices as exogenous. Rather than decomposing the option returns by factors, as the factor models in equity asset pricing do, we treat option prices in an ex post manner and as black boxes. We focus only on the autocorrelations of option prices and avoid explaining the reasons behind their current price levels.
Empirically, past studies have made numerous efforts to fit time series, for example, with the ARIMA model, curve fitting with exponential terms, Gaussian class models, and sigmoid shapes, to name a few. On the one hand, using a parametric model to fit the option prices has its downside: its fixed set of parameters cannot adjust for option prices with return jumps and volatility jumps. This can cause significant forecast errors and error dependence. On the other hand, using a nonparametric model to fit the prices has its downside, too: its forecasted value cannot be specified with certainty.
Among these alternatives, we propose using the spline method to forecast option prices. Studies in this field have attempted to apply spline functions to different asset markets. Representative ones include Vasicek and Fong (1982), who first use splines on term structures; Audrino and Bühlmann (2009), who use splines to model financial volatility; Tian (2015), who applies spline models to the implied binomial tree; and Rashidinia and Jamalzadeh (2017), who connect the spline models with option prices.
This method minimizes the assumptions about data pattern and distribution. It is also less prone to be affected by the specific types of option contracts regarding their parameters, as presented in Tables 2 and 3. In fact, the only input restriction on the data used to generate a prediction is that the discrete data need to be bounded values in the nonnegative real number set:
$$p_i \in \mathbb{R}_{\ge 0}, \qquad \exists M \in \mathbb{R}^{+},\ |p_i| \le M$$
Splines are piecewise polynomial functions. Using splines in curve fitting and observation interpolation yields better results than single polynomial regressions because they produce similar results without increasing the degree of the polynomials and avoid introducing Runge's phenomenon.
Spline functions for interpolation are the minimizers of suitable roughness measures subject to the observation and boundary constraints. The model regression minimizes a weighted combination of the average squared approximation error over the observed data and the roughness measure. We name the spline function S and define it as a cubic polynomial function on the $[x_{j-1}, x_{j+k+1}]$ domain, mapping to the real numbers:
$$S : [x_{j-1}, x_{j+k+1}] \to \mathbb{R}$$
As S is nonlinear, we define its k + 2 sub-domains in a piecewise manner with pairwise disjoint interiors:
$$[x_{j-1}, x_{j+k+1}] = [x_{j-1}, x_j] \cup [x_j, x_{j+1}] \cup [x_{j+1}, x_{j+2}] \cup \cdots \cup [x_{j+k-1}, x_{j+k}] \cup [x_{j+k}, x_{j+k+1}]$$
For each sub-domain, we define a curve, for example:
$$C_j : [x_j, x_{j+1}] \to \mathbb{R}$$
To make the curve a cubic polynomial:
$$C_{j-1}(x) = a_{0,j-1} + a_{1,j-1}x + a_{2,j-1}x^2 + a_{3,j-1}x^3$$
Therefore, the knots $x_{j-1}, \ldots, x_{j+k+1}$ divide the spline into k + 2 forms:
$$S(x) = \begin{cases} C_{j-1}(x), & \text{for } x \in [x_{j-1}, x_j] \\ C_j(x), & \text{for } x \in [x_j, x_{j+1}] \\ \quad\vdots \\ C_{j+k}(x), & \text{for } x \in [x_{j+k}, x_{j+k+1}] \end{cases}$$
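The piecewise definition translates directly into code: locate the sub-domain that contains x, then evaluate that piece's cubic. The sketch below is a hypothetical illustration with made-up knots and coefficients; the function name and the convention that a shared knot is assigned to the piece on its right are assumptions made here.

```python
from bisect import bisect_right


def evaluate_spline(x, knots, coefficients):
    """Evaluate a piecewise cubic at x.

    knots: increasing list of knot locations (the sub-domain boundaries)
    coefficients: one (a0, a1, a2, a3) tuple per sub-domain, len(knots) - 1 in total
    """
    if not knots[0] <= x <= knots[-1]:
        raise ValueError("x lies outside the spline domain")
    # Sub-domain index; the right end point of the domain belongs to the last piece.
    piece = min(bisect_right(knots, x) - 1, len(coefficients) - 1)
    a0, a1, a2, a3 = coefficients[piece]
    return a0 + a1 * x + a2 * x ** 2 + a3 * x ** 3


# Hypothetical two-piece spline: x**3 on [0, 1], then the constant 1 on [1, 2].
knots = [0.0, 1.0, 2.0]
coefficients = [(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 0.0)]
print(evaluate_spline(0.5, knots, coefficients), evaluate_spline(1.5, knots, coefficients))
```

The two pieces agree at the shared knot x = 1, so this example is continuous there even though its slope changes.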
The choice of spline degree affects the shape of the curve fit. If the degree is 0, the spline fit is a constant between every two knots. If the degree is 1, the spline fit is a linear function. If the degree is 3, the spline fit is a cubic function that combines concave and convex curves without introducing the boundary connection constraints. In addition, any cubic spline is compatible with linear and constant functions. Therefore, our study uses the cubic basis spline (Cubic B Spline) as the instrument to fit and forecast option prices, given the support of our large dataset and sufficient sample sizes to maintain satisfactory degrees of freedom.
We do not set the locations of knots in an ex ante manner. This avoids introducing subjectivity in the identification of a price jump. The threshold of a price jump changes over the life of an option contract, and there is no readily available algorithm to determine the significance of a price jump in a continuous series of price quotes. Instead, we allow one additional knot to be added at a free location for every 20 new price quotes recorded and included in the dataset. This method is based on the common practice of requiring 20 more observations for each added degree of freedom. When a new knot is added in the spline regression, its location and the previous knots' locations are recalculated to find the cubic functions that best fit the data with the newly defined cutoff range of the independent variable. This procedure is implemented by the constrained nonlinear multivariable function minimization algorithm.
The problem is specified by
$$\begin{aligned} \min_x\ & f(x) \\ \text{s.t.}\ & c(x) \le 0 \\ & ceq(x) = 0 \\ & A \cdot x \le b \\ & Aeq \cdot x = beq \\ & lb \le x \le ub \end{aligned}$$
where b and beq are vectors, A and Aeq are matrices, c(x) and ceq(x) are functions that return vectors, and f(x) is a function that returns a scalar. f(x), c(x), and ceq(x) can be nonlinear functions, and lb and ub are the lower and upper bounds of x.
The solver takes the arguments (f(x), x0, A, b, Aeq, beq, lb, ub, nonlinearcondition), starts from x0, and subjects the minimization to the nonlinear inequalities c(x) ≤ 0 and equalities ceq(x) = 0 defined in nonlinearcondition.
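The idea of letting the data choose the knot location can be illustrated without a general-purpose solver. The sketch below swaps the constrained nonlinear minimization for a plain grid search over a single knot: for each candidate location it fits a piecewise cubic by least squares and keeps the location with the smallest sum of squared errors. The helper names, the synthetic data, and the restriction to one knot are assumptions made for illustration and do not reproduce the authors' implementation.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def basis(x, knot):
    """A cubic on each side of the knot, continuous at the knot itself."""
    h = max(x - knot, 0.0)
    return [1.0, x, x ** 2, x ** 3, h, h ** 2, h ** 3]


def sse_for_knot(xs, ys, knot):
    """Sum of squared errors of the least-squares fit for one candidate knot."""
    rows = [basis(x, knot) for x in xs]
    p = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    coef = solve_linear(gram, rhs)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))


def best_knot(xs, ys, min_points=6):
    """Grid search: try interior observations as the knot and keep the best fit."""
    candidates = xs[min_points:-min_points]
    return min(candidates, key=lambda k: sse_for_knot(xs, ys, k))


# Hypothetical data: time scaled to [0, 1], with a kink at observation 20.
xs = [i / 39 for i in range(40)]
kink = xs[20]
ys = [10 + 20 * x if x <= kink else 10 + 20 * kink - 30 * (x - kink) for x in xs]
found = best_knot(xs, ys)
print(found == kink)
```

A grid search scales poorly once several knots move at the same time, which is one reason a constrained nonlinear solver is the more natural tool for the full problem.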
3. Forecasting option prices with spline model
We use the daily American style option prices of the SPY to conduct our forecast tests. The underlying asset
SPY is an Exchange-Traded Fund (ETF) that tracks the Standard and Poor’s 500 Index performance. There are four
reasons we select the options based on SPY: (1) it is the most actively traded asset in the United States financial market, with the highest Assets Under Management (AUM) value; (2) it attracts the most complete and uninterrupted option chains and types of contracts created by a broad range of market makers and investors; (3) the financial market and data providers keep the longest record for SPY compared to other assets; (4) most of the literature in the option pricing field uses SPY as the instrument for testing pricing precision. We collected more than 50 studies using SPY options as the representative assets, and the list is available upon request.
The option data and the related market data are from the Options Price Reporting Authority (OPRA). The data is referred to as the National Best Bid and Offer (NBBO). The OPRA is responsible for consolidating the prices from all the individual option exchanges and publishing the NBBO. The data is sampled at the end of each trading day, and the coverage is from February 8, 2005, to May 29, 2020. Due to the rich spectrum of contracts created based on SPY, each option variable includes 476,882 observations. Table 1 illustrates the data description.
Table 1
Summary statistics of the Greeks, pricing factors, and market transactions
Variable                            Mean        Median      Maximum     Minimum     Std. Dev.   Skewness    Kurtosis    Jarque-Bera
Option ask price                    17.87       14.00       1.88E+02    0.00        16.65       1.39        5.63        2.92E+05
Option bid price                    17.49       13.00       1.78E+02    0.00        16.36       1.39        5.51        2.78E+05
Forecast error of ask price         3.66E-03    5.54E-17    60.52       −46.17      1.33        0.23        57.80       5.97E+07
Option δ                            0.13        0.00        1.00        −1.00       0.69        −0.17       2.12        1.78E+04
Option γ                            8.39E-06    0.00        1.00        0.00        2.90E-03    3.45E+02    1.19E+05    2.82E+14
Option θ                            −5.32       −4.00       50.00       −9.56E+03   32.83       −1.64E+02   4.11E+04    3.35E+13
Option υ                            55.64       49.00       2.14E+02    0.00        40.52       0.61        2.61        3.21E+04
Implied volatility from ask price   0.02        0.00        14.00       0.00        0.20        15.46       4.49E+02    3.98E+09
Implied volatility from bid price   0.01        0.00        11.00       0.00        0.10        23.10       1.14E+03    2.59E+10
Implied volatility from mean price  0.01        0.00        12.00       0.00        0.10        24.26       1.54E+03    4.69E+10
Moneyness (K/St)                    0.99        0.97        2.35        0.45        0.16        1.02        6.20        2.85E+05
Strike prices (K)                   1.83E+02    1.75E+02    4.00E+02    60.00       60.65       0.41        2.20        2.57E+04
Underlying asset price (St)         1.88E+02    1.87E+02    3.38E+02    68.11       64.27       0.34        2.00        2.92E+04
Contract remaining life             4.01E+02    3.25E+02    1.07E+03    1.00        2.89E+02    0.60        2.29        3.91E+04
Open interest                       1.29E+04    3.02E+03    5.43E+05    0.00        2.87E+04    5.07        40.41       2.99E+07
Open interest T + 1                 1.29E+04    3.00E+03    5.43E+05    0.00        2.87E+04    5.08        40.49       3.00E+07
Option trade volume                 7.04E+02    2.00        1.18E+06    0.00        6.25E+03    64.55       8.79E+03    1.53E+12
Note: In Table 1, each variable includes 476,882 observations. The forecast errors are based on the projection from the spline model on the options of SPY from February 8, 2005, to May 29, 2020.
Options are assigned to one of three cycles at the moment of creation: JAJO (January, April, July, and October),
FMAN (February, May, August, and November), and MJSD (March, June, September, and December). The options
for SPY follow the MJSD cycle. Therefore, new contracts based on the SPY price are created to expire in March,
June, September, and December. In addition, option investors are provided the contracts for the first two front months
followed by the two remaining cycle months. After a month passes, the last two remaining months continue to follow
the cycle assigned initially. After two months pass, the contract with the nearest expiration is concluded, and a new
cycle month is about to start. This provides opportunities for contracts with various lengths to meet the different
needs of the investors. In recent years, market makers have created weekly options to accommodate heterogeneous
demands. However, our study focuses on the conventional option contracts, which remain mainstream, and excludes contracts whose entire life is shorter than 20 days.
For each option contract of SPY, we start to fit its first spline curve after accumulating 20 days of transactions. This leaves 16 degrees of freedom to shape the coefficients of the first cubic polynomial. As the transactions continue, one knot is added for every 20 additional days of the option's age, though the location of the knot is not pre-determined. Whereas most of the literature uses the average of the ask and bid prices as the option price, our study uses the four series of ask and bid prices for the call and put options directly quoted from the marketplace. We believe that this is a better benchmark of the original option prices free from the spread's impact.
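The bookkeeping behind this schedule is simple arithmetic and can be sketched as follows. The 16 degrees of freedom follow from 20 observations minus the four coefficients of a cubic. The knot-count rule used here, one knot per complete block of 20 observations, is an assumption: it is chosen because it reproduces the counts reported with Fig. 1 (5 knots at 100 observations and 6 at 130), and the function names are hypothetical.

```python
CUBIC_COEFFICIENTS = 4        # a0, a1, a2, a3 of one cubic polynomial
OBSERVATIONS_PER_KNOT = 20    # one knot per 20 price quotes


def first_fit_degrees_of_freedom(n_obs=20):
    """Residual degrees of freedom when a single cubic is fit to n_obs quotes."""
    return n_obs - CUBIC_COEFFICIENTS


def knot_count(n_obs):
    """Assumed schedule: one knot per complete block of 20 observations."""
    return n_obs // OBSERVATIONS_PER_KNOT


print(first_fit_degrees_of_freedom(), knot_count(100), knot_count(130))
```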
In Fig. 1, we use two spline regressions to illustrate the development of spline functions and the shift of knots as more samples are included. From the left to the right and from the top to the bottom splines, the number of observations increases from 100 to 130, and the number of knots increases from 5 to 6. The vertical dashed lines indicate the locations of the knots. Fig. 1 illustrates the process of growing the data sample and the number of knots without a deterministic setup of knot locations. As more option price quotes are realized over time, the knots are recalculated and relocated. Accordingly, the parameters of the cubic polynomial functions that fit the data adjust to minimize the error. A new knot is added after every 20 new observations are included in the sample. The splines are continuous, but they are not differentiable at the joints. The cubic polynomial functions present piecewise concavity and convexity, with at most one inflection point in each cubic polynomial function. The MATLAB® code for the spline modeling and knot identification in this paper is available upon request.
Fig. 1. Spline fit of a growing sample.
The right-most cubic polynomial between the last two knots can be fit in the form of the following equation. We define knot t as the right-most knot at the current moment and knot t − j as the second right-most knot located j trading days ago.
$$C(t) = \hat{a}_0^{\,\text{knot } t,\ \text{knot } t-j+1} + \hat{a}_1^{\,\text{knot } t,\ \text{knot } t-j+1}\,t + \hat{a}_2^{\,\text{knot } t,\ \text{knot } t-j+1}\,t^2 + \hat{a}_3^{\,\text{knot } t,\ \text{knot } t-j+1}\,t^3$$
We use this polynomial to generate the predicted price of a SPY option contract:
$$C(t+1) = \hat{a}_0^{\,\text{knot } t,\ \text{knot } t-j+1} + \hat{a}_1^{\,\text{knot } t,\ \text{knot } t-j+1}\,(t+1) + \hat{a}_2^{\,\text{knot } t,\ \text{knot } t-j+1}\,(t+1)^2 + \hat{a}_3^{\,\text{knot } t,\ \text{knot } t-j+1}\,(t+1)^3$$
We compute the forecast error as the deviation of the forecasted option price from the realized price:
$$\varepsilon_{t+1} = C(t+1) - p_{t+1}$$
$$\varepsilon_{t+1} = C(t+1) - F^{\text{spline}}_{1,2,\ldots}(p_t, p_{t-1}, \ldots, p_1)$$
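The forecasting step can be sketched in a few lines: fit a cubic to the observations of the right-most segment by least squares, evaluate it one period ahead, and subtract the realized price. The data below are synthetic and the helper names are hypothetical; time is rescaled to [0, 1] inside the fit purely for numerical stability, which is a choice made here and not stated in the original.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_cubic(us, ps):
    """Least-squares coefficients (a0, a1, a2, a3) of a cubic in the scaled time u."""
    rows = [[1.0, u, u ** 2, u ** 3] for u in us]
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    rhs = [sum(r[i] * p for r, p in zip(rows, ps)) for i in range(4)]
    return solve_linear(gram, rhs)


def forecast_next(ts, ps):
    """Fit a cubic to the last segment and extrapolate one period ahead."""
    t0, span = ts[0], ts[-1] - ts[0]
    us = [(t - t0) / span for t in ts]  # scale time to [0, 1]
    coef = fit_cubic(us, ps)
    u_next = (ts[-1] + 1 - t0) / span
    return sum(a * u_next ** k for k, a in enumerate(coef))


# Hypothetical last segment: 20 daily quotes between the last two knots.
ts = list(range(100, 120))
ps = [17.0 + 0.05 * (t - 100) - 0.001 * (t - 100) ** 2 for t in ts]
forecast = forecast_next(ts, ps)
realized = 17.0 + 0.05 * 20 - 0.001 * 20 ** 2  # hypothetical price at t + 1
error = forecast - realized
print(round(forecast, 6), round(error, 6))
```

Because the synthetic prices lie exactly on a polynomial of degree at most three, the error here is zero up to rounding; with market quotes the same subtraction yields the forecast error summarized in Table 1.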
The results, combining the ask and bid prices for both call and put options, show that the performance of the spline forecast makes it an ideal and practical tool for the market. The average forecast error is $3.66 × 10^−3. This is an insignificant level compared to the bid-ask spread and the transaction cost. This is also a trivial amount compared to the average option ask price ($17.87) and the average bid price ($17.49) of the options. In fact, the average forecast error is affected by a few outliers; the median forecast error is as low as $5.54 × 10^−17, with a standard deviation of 1.33.
As discussed in Section 2, a good forecasting model should produce not only accurate but also robust predictions.
We test the independence of the forecast error with different attributes of contracts. The attributes of options come
from two perspectives: the Greeks and the Black-Scholes factors.
We use the Newey-West heteroscedasticity-consistent and autocorrelation-consistent (HAC) estimators procedure
(Newey & West, 1987) in the cross-sectional ordinary least square (OLS) regressions. We present the results in
Tables 2 and 3.
The results of the OLS regressions without this procedure are presented
in Appendix A.1. We argue that pooling the Greeks and the transaction volumes of an option in the
regression of the forecast error without accounting for serial correlation and heteroscedasticity among the independent variables would produce false significance. The spline forecast errors are not affected by the Greeks, as
presented in Table 2, which reports the relationship between the ask price forecast error from the spline
model and the SPY option Greeks. Single, double, and triple asterisks (*, **, ***) indicate statistical significance at
the 10%, 5%, and 1% levels. The standard errors of the regression coefficients are reported in parentheses.
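To make the HAC correction concrete, the following is a minimal sketch of Newey-West standard errors for a regression with an intercept and one regressor. It assumes Bartlett kernel weights and no small-sample correction; the function name, the single-regressor setup, and the lag choice are illustrative assumptions and do not reproduce the specification behind Tables 2 and 3.

```python
from math import sqrt
from typing import Sequence, Tuple


def ols_newey_west(
    x: Sequence[float], y: Sequence[float], max_lags: int
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Fit y = b0 + b1 * x by OLS; return ((b0, b1), (se_b0, se_b1)) with HAC errors."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Score vectors g_t = e_t * (1, x_t).
    g = [(e, e * xi) for e, xi in zip(resid, x)]

    # "Meat" S = G_0 + sum_l w_l (G_l + G_l'), Bartlett weights w_l = 1 - l / (L + 1).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for lag in range(max_lags + 1):
        weight = 1.0 - lag / (max_lags + 1)
        for t in range(lag, n):
            for i in range(2):
                for j in range(2):
                    cross = g[t][i] * g[t - lag][j]
                    if lag > 0:
                        cross += g[t - lag][i] * g[t][j]
                    s[i][j] += weight * cross

    # "Bread" (X'X)^{-1} for the two-column design matrix [1, x].
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # Sandwich covariance: bread @ S @ bread.
    bs = [[sum(bread[i][k] * s[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    cov = [[sum(bs[i][k] * bread[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    se = [sqrt(max(cov[i][i], 0.0)) for i in range(2)]
    return (b0, b1), (se[0], se[1])
```

With `max_lags=0` the sketch reduces to heteroscedasticity-consistent (White) standard errors; larger values also down-weight and include the autocovariances of the scores.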
We continue the robustness check of the spline forecast error with respect to the Black-Scholes formula factors: days
to expiration, volatility, underlying asset price, moneyness, and the option price. The literature adopts different
criteria for categorizing the moneyness of options. Though the threshold
values differ, past studies generally use one of two methods: categorizing moneyness by the ratio of the
strike price to the current underlying asset price, or by the option’s delta.
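The two families of rules can be sketched as simple classifiers. The delta thresholds (0.375 and 0.625) and the K/S bands are the ones this section attributes to Bollen and Whaley (2004) and Alitab et al. (2019); applying the delta rule to the absolute value of a put's delta, the handling of band edges, and the function names are assumptions made for illustration.

```python
def moneyness_by_delta(delta: float, low: float = 0.375, high: float = 0.625) -> str:
    """Classify by delta; taking |delta| so puts can be handled is an assumption."""
    d = abs(delta)
    if d > high:
        return "ITM"
    if d < low:
        return "OTM"
    return "ATM"


def moneyness_by_strike_ratio(strike: float, spot: float, is_call: bool) -> str:
    """Classify by the ratio K/S using ATM, OTM, and DOTM bands."""
    ratio = strike / spot
    if 0.98 <= ratio <= 1.02:
        return "ATM"
    if is_call:
        if 1.02 < ratio <= 1.1:
            return "OTM"
        if 1.1 < ratio <= 1.3:
            return "DOTM"
    else:
        if 0.9 <= ratio < 0.98:
            return "OTM"
        if 0.7 <= ratio < 0.9:
            return "DOTM"
    # The bands quoted in the text do not cover in-the-money ratios.
    return "unclassified"
```

For instance, a call with strike 105 on an underlying at 100 has K/S = 1.05 and falls into the OTM band under the ratio rule, while a call with delta 0.5 is ATM under the delta rule.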
Table 2
Ask price forecast error and the Greeks of the SPY options with Newey-West procedure
Dependent variable: ask price error
Greeks: Intercept, Delta, Gamma, Theta, Vega
Number of observations: 476882

Table 3
Ask price forecast error and the pricing factors of the SPY options with Newey-West procedure
Dependent variable: ask price error
Pricing factors: Intercept, Days to expiration, Moneyness, Underlying price, Ask, IV ask
Number of observations: 476882

Cosma et al. (2020) follow the practice of Bollen and Whaley (2004) and use 0.375, 0.625 as the threshold values
of options’ delta to categorize ATM, ITM, and OTM. Feunou and Okou (2019) define OTM calls as the options with
delta lower than 0.5 and define OTM puts as the options with delta higher than 0.5. Christoffersen et al. (2014) also
use option delta to categorize options by their moneyness.
Barone-Adesi et al. (2020) regard the options whose K/S_t ratios fall between 0.85 and 1 (puts) or between 1 and 1.15 (calls)
as OTM, and less than 0.85 (puts) or greater than 1.15 (calls) as DOTM. Alitab et al. (2019) regard the options
whose K/S_t ratios are between 0.98 and 1.02 as ATM, 0.9 to 0.98 (puts) and 1.02 to 1.1 (calls) as OTM, and 0.7 to
0.9 (puts) and 1.1 to 1.3 (calls) as DOTM. Leippold and Vasiljević (2020) regard the options whose K/S_t ratios are
between 0.97 and 1.03 as ATM, and 0.94 to 0.97 (puts) and 1.03 to 1.06 (calls) as OTM. In contrast, Christoffersen et al.
(2018) use S_t/K between 0.975 and 1.025 as the standard for ATM.
In Table 3, we present the relationship between the ask price forecast error from the spline model and the SPY
option’s pricing factors. Single, double, and triple asterisks (*, **, ***) indicate statistical significance at the 10%,
5%, and 1% levels. The standard errors of the regression coefficients are reported in parentheses. As Table 3
shows, the spline forecast error is not affected by the implied volatility of the option, the price of the underlying
asset, or the option’s moneyness. However, two factors affect the forecast precision: the stage of the option's life and
the price of the option. When the remaining life of an option is longer, the forecast error is higher, so the forecast
precision increases as the option moves toward maturity. In addition, the forecast error is lower when the price of the
option is higher. Both effects are reasonable. We argue that the former is due to the limited number of observations,
whereas the latter is due to the skewness induced by the lower bound on the option price.
Regarding the stage of the option, when the spline function fits the first cubic polynomial
to the first 20 realized option prices, the goodness of fit is limited by the few
degrees of freedom and by the lack of knots to identify structural breaks. Therefore, the first few predictions
may carry large errors. The goodness of fit improves as more option prices become available to shape the cubic
polynomial and more knots are inserted to allow optimal segmentation of the price series. In other words, the
precision of the spline model forecast is less of a concern once an option has passed its initial stage.
Regarding the option price, we propose that the price boundary is an important reason. A large
portion of the option prices are zero, especially the bid prices of both call and put options that are close to
maturity or deep out of the money. The minimum price is zero, whereas the value predicted by the spline functions
may be negative. This amplifies the forecast error when option prices are low. The forecast
precision is not a concern for option prices that are not close to zero. For option prices that stay at zero for a long
time, the forecast precision is not a concern either, because the spline model will most likely fit a constant curve that
reflects the observed price.
Overall, we regard these two effects on the forecast error as minor concerns, in part because the
errors are not cumulative and are mean-reverting: the expected error between each pair of adjacent knots is
zero.

4. Expanded rationality
The forecast error is largely unaffected by the contract terms of an option, as the previous section
suggests. This should be distinguished from the fact that the error is related to some variables realized and
observed in the marketplace. From a theoretical perspective, these variables are affected by the option price and are
related to the forecast errors, but they are not among the variables that determine the option price when an option contract
is created. Therefore, these market-related variables are associated with the forecast errors but are not determinants
of them.
The motivation for investigating the relationship between these market-based variables and the forecast errors
comes primarily from investors’ momentum and mean-reversion expectations. Investors observe the updates of
the option prices and are usually able to form momentum expectations, mainly by
observing the price curve. We argue that this should not be identified as technical analysis. Investors may have
sound statistical evidence and experience drawn from realized market data to support their expectations.
Examples of such evidence are the volatility clustering effect and the local autocorrelation of asset returns.
Therefore, the spline fit curves can be partially regarded as the investors’ expectations of the option prices when
observing the price plots. Obviously, even the most proficient investor cannot depict and predict the spline price
curve because the asset price follows a random walk outside its local autocorrelation zone. However, this does not
imply that investors always fail to predict the option prices. Therefore, we use the spline curves
as a rough proxy for the investors' expectations of option prices based on their experience and observations of the market.
Accordingly, the forecast errors of the spline curves are investor expectation errors. We explore how the market-based
variables of an option relate to these errors. We include the trading volume, the open interest, and the T + 1 open
interest in the regressions.

Table 4
Ask price forecast error and the market transaction of the SPY options with Newey-West procedure
Dependent variable: ask price error

Market              (1)          (2)          (3)          (4)          (5)          (6)          (7)
Intercept           0.0036       0.0033       0.0041       0.0035       0.0030       0.0032       0.0034
                    (0.0058)     (0.0058)     (0.0052)     (0.0058)     (0.0058)     (0.0058)     (0.0058)
Open interest       3.50E-09                               −9.03E-07*                −8.32E-07**  6.49E-08
                    (9.82E-08)                             (4.89E-07)                (4.81E-07)   (9.99E-08)
Open interest T1                 3.02E-08                  9.20E-07*    9.41E-08     9.10E-07**
                                 (9.74E-08)                (4.83E-07)   (1.00E-07)   (4.82E-07)
Volume                                        −6.57E-07                 −7.66E-07*   −8.12E-07*   −7.58E-07**
                                              (4.28E-07)                (4.35E-07)   (4.41E-07)   (4.38E-07)
Number of observations: 476882
The interpretation of option trading volume is slightly different from that of equities. The option trading volume
is relative and needs to be compared to the average daily volume of the underlying asset. The open interest is the
number of active contracts, that is, contracts that have been traded but not yet liquidated by an offsetting trade,
exercise, or assignment. A buy to open (setting up a long position) or a sell to open (setting up a short position) adds to the open
interest. Conversely, a sell to close concludes a long position, and a buy to close concludes a short position. Both
of these offsetting operations reduce the open interest. The T + 1 open interest observed at T is the updated
open interest after some of the current open interest is traded and offset. The current open interest observed at the
start of trading on T + 1 combines the T + 1 open interest observed at T and the new positions opened at T + 1,
assuming no trading volume has been executed at T + 1:
$$\text{Open interest}(T+1)_t = \text{Open interest}(T)_t - \text{Volume}(T)_t$$
$$\text{Open interest}(T)_{t+1} = \text{Open interest}(T+1)_t + \text{new opens}(T)_{t+1}$$
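The two identities amount to simple bookkeeping, as the following sketch shows. The function and argument names are hypothetical, and the arithmetic follows the simplified identities stated here rather than any exchange's actual open interest calculation.

```python
def carried_open_interest(open_interest: int, volume: int) -> int:
    """T + 1 open interest observed at T: current open interest net of volume."""
    return open_interest - volume


def opening_open_interest(carried: int, new_opens: int) -> int:
    """Open interest at the start of trading on T + 1: carried interest plus new opens."""
    return carried + new_opens
```

For example, 1,000 open contracts with a volume of 150 leave 850 carried forward, and 40 new opens bring the opening figure on the next day to 890.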
In Table 4, we present the relationship between the ask price forecast error from the spline model and the SPY
option’s transaction interests. Single, double, and triple asterisks (*, **, ***) indicate statistical significance at the
10%, 5%, and 1% levels. The standard errors of the regression coefficients are reported in parentheses. Higher
current open interest and volume indicate more robust demand for the option and better liquidity. This not only
helps reduce the bid-ask spread but also indicates a lower degree of opinion divergence among investors. The results
in Table 4 show that higher current open interest and higher volume are related to lower investor expectation
error, whereas higher next-day open interest is related to higher investor expectation error. When investors find the
market difficult to predict and forecast errors are high, less of the current open interest is settled by trading
volume, so more open interest is carried forward. Under such circumstances, investors
leave their positions open, and trading volume and liquidity become thinner. This provides evidence that
investors expand their rational expectations. Investors act effectively not only in the investments they conduct but
also in the investments they terminate. A risk-averse investor hesitates to continue option trading and execution when
the option prices are more difficult to predict by observing the autocorrelations from the price curves.
5. Concluding remarks
This study proposes using spline modeling to predict option prices. Past studies forecast option prices by pursuing perfect asset pricing models and using the expected price produced by such models as the forecasted
option price. We focus on direct price forecasting instead of option pricing. We treat the option price as exogenous and
produce future prices directly from the historical prices, without decomposing the option price into its determinants.
This helps avoid the problems of the Black-Scholes model that recent option pricing
models have not fully resolved.
We use four daily price series of the American options on SPY to examine the quality of the forecast: the ask and
the bid prices of the calls and the puts. The spline model produces low option price forecast errors. The mean error
size is $3.66 × 10⁻³, with a standard deviation of 1.33 and a median error of $5.54 × 10⁻¹⁷. The spline model is
simple to use, and the forecasts are robust to different contract settings. We also provide evidence
that investors form rational expectations by observing the option prices and form a spline curve-like forecast in their
minds, which guides the investments they implement. In addition, investors expand their rationality by terminating
planned option investments when they believe that the option prices are more difficult to predict accurately.
However, using the spline model to forecast option prices is not without limitations. First, it is a forecast model with exogenous realized prices rather than an asset pricing model, and hence its specification does
not carry economic meaning. Second, though it can produce reliable forecasts with certainty compared to nonparametric models, it does not have a fixed specification known ex ante, as parametric models do. Third, the
forecast error is larger for newly created options and for low option prices.
Future studies can therefore focus on combining the spline forecast with option pricing models. This may
address both the limitations of the spline method noted above and those of the Black-Scholes model and its
extensions. Another research route is to apply and extend the spline method to other financial markets across
asset classes and geographical locations. A third route is to develop the methodology of
the spline method. Currently, the spline method remains at the pure function stage: the regression functions
between knots all take the same form. A possible innovation is to relax this constraint and allow
different function forms within one spline regression.

References
Alitab, D., Bormetti, G., Corsi, F., & Majewski, A.A. (2019). A realized volatility approach to option pricing with continuous and jump variance
components. Decisions in Economics and Finance, 42(2), 639-664. doi: 10.1007/s10203-019-00241-2.
Audrino, F., & Bühlmann, P. (2009). Splines for financial volatility. Journal of the Royal Statistical Society: Series B (Statistical Methodology),
71(3), 655-670. doi: 10.1111/j.1467-9868.2009.00696.x.
Barndorff-Nielsen, O.E., & Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics.
Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 167-241. doi: 10.1111/1467-9868.00282.
Barone-Adesi, G., Fusari, N., Mira, A., & Sala, C. (2020). Option market trading activity and the estimation of the pricing kernel: A Bayesian
approach. Journal of Econometrics, 216(2), 430-449. doi: 10.1016/j.jeconom.2019.11.001.
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637-654.
doi: 10.1086/260062.
Bollen, N.P.B., & Whaley, R.E. (2004). Does net buying pressure affect the shape of implied volatility functions? The Journal of Finance, 59(2),
711-753. doi: 10.1111/j.1540-6261.2004.00647.x.
Christoffersen, P., Feunou, B., Jacobs, K., & Meddahi, N. (2014). The economic value of realized volatility: Using high-frequency returns for
option valuation. Journal of Financial and Quantitative Analysis, 49(3), 663-697. doi: 10.1017/s0022109014000428.
Christoffersen, P., Fournier, M., & Jacobs, K. (2017). The factor structure in equity options. The Review of Financial Studies, 31(2), 595-637.
doi: 10.1093/rfs/hhx089.
Cosma, A., Galluccio, S., Pederzoli, P., & Scaillet, O. (2018). Early exercise decision in American options with dividends, stochastic volatility,
and jumps. Journal of Financial and Quantitative Analysis, 55(1), 331-356. doi: 10.1017/s0022109018001229.
Cox, J.C., & Ross, S.A. (1976). The valuation of options for alternative stochastic processes. Journal of Financial Economics, 3(1-2), 145-166.
doi: 10.1016/0304-405x(76)90023-4.
Davydov, D., & Linetsky, V. (2001). Pricing and hedging path-dependent options under the CEV process. Management Science, 47(7), 949-965.
doi: 10.1287/mnsc.47.7.949.9804.
Drapeau, S., Wang, T., & Wang, T. (2020). How rational are the option prices of Hong Kong dollar exchange rate? The Journal of Derivatives,
jod.2020.1.120. doi: 10.3905/jod.2020.1.120.
Duffie, D., Pan, J., & Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica, 68(6), 1343-1376.
doi: 10.1111/1468-0262.00164.
Engle, R.F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica,
50(4), 987. doi: 10.2307/1912773.
Eraker, B., Johannes, M., & Polson, N. (2003). The impact of jumps in volatility and returns. The Journal of Finance, 58(3), 1269-1300. doi:
10.1111/1540-6261.00566.
Feunou, B., & Okou, C. (2018). Good volatility, bad volatility, and option pricing. Journal of Financial and Quantitative Analysis, 54(2), 695-727.
doi: 10.1017/s0022109018000777.
Filipović, D., & Larsson, M. (2020). Polynomial jump-diffusion models. Stochastic Systems, 10(1), 71-97. doi: 10.1287/stsy.2019.0052.
Heston, S.L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of
Financial Studies, 6(2), 327-343. doi: 10.1093/rfs/6.2.327.
Hull, J., & White, A. (1987). The pricing of options on assets with stochastic volatilities. The Journal of Finance, 42(2), 281-300. doi:
10.1111/j.1540-6261.1987.tb02568.x.
Jacquier, E., Polson, N.G., & Rossi, P.E. (2004). Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. Journal of
Econometrics, 122(1), 185-212. doi: 10.1016/j.jeconom.2003.09.001.
Kou, S.G. (2002). A jump-diffusion model for option pricing. Management Science, 48(8), 1086-1101. doi: 10.1287/mnsc.48.8.1086.166.
Leippold, M., & Vasiljević, N. (2020). Option-implied intrahorizon value at risk. Management Science, 66(1), 397-414.
doi: 10.1287/mnsc.2018.3157.
Mandelbrot, B. (1963). The variation of certain speculative prices. The Journal of Business, 36(4), 394. doi: 10.1086/294632.
Marks, J.M., & Simon, D.P. (2017). Sector option implied volatility dynamics and predictability. The Journal of Derivatives, 25(2), 22-42. doi:
10.3905/jod.2017.25.2.022.
Newey, W.K., & West, K.D. (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55(3), 703. doi: 10.2307/1913610.
Rashidinia, J., & Jamalzadeh, S. (2017). Modified B-spline collocation approach for pricing American style Asian options. Mediterranean Journal
of Mathematics, 14(3). doi: 10.1007/s00009-017-0913-y.
Rogers, L.C.G. (1997). Arbitrage with fractional Brownian motion. Mathematical Finance, 7(1), 95-105. doi: 10.1111/1467-9965.00025.
Tian, Y.S. (2015). Implied binomial trees with cubic spline smoothing. The Journal of Derivatives, 22(3), 40-55. doi: 10.3905/jod.2015.22.3.040.
Vasicek, O.A., & Fong, H.G. (1982). Term structure modeling using exponential splines. The Journal of Finance, 37(2), 339-348.
doi: 10.1111/j.1540-6261.1982.tb03555.x.