Option Price Predictability with Splines: A Forecasting Approach

Model Assisted Statistics and Applications 17 (2022) 285–297
DOI 10.3233/MAS-220410
IOS Press

Option price predictability, splines, and expanded rationality

Huijian Dong a,b,∗ and Xiaomin Guo c
a School of Business, New Jersey City University, Jersey City, NJ, USA
b Teachers College, Columbia University, New York, NY, USA
c Kate Tiedemann School of Business and Finance, University of South Florida, FL, USA

Abstract. The current practice of option price forecasting relies on the outputs of various option pricing models. The expected value of the current option price is widely regarded as the best forecast for the future price, assuming the option prices evolve with a Brownian motion. However, volatility clustering, transaction illiquidity, and demand-supply imbalance drive future option prices off the modeled price targets. Therefore, we suggest using the spline method to forecast option prices directly. The focus is the accuracy of the forecasted asset price in the next period, rather than whether the pricing models correctly produce the current price. We use fifteen years of daily SPY American option contract prices to examine the forecast accuracy of the spline model. Among the 476,882 forecasts produced, the mean forecast error is $3.66 × 10⁻³, with a standard deviation of 1.33 and a median error of $5.54 × 10⁻¹⁷. The forecast accuracy is stable across contracts with different terms and moneyness. The spline forecast model accounts for illiquidity and avoids the major pitfalls of the current leading option pricing techniques.

Keywords: Option pricing, forecast, spline model

1. Introduction

The Black-Scholes model has inspired research on pricing contingent assets, and much of this research in the past decades is introspective. It focuses on the limits of the Black-Scholes model specification and its incompatibility with the prices realized at the marketplace.
The efforts concentrate on a few critical violations of the assumptions of the Black-Scholes model, according to Kou (2008). Though newer versions of the model attempt to mitigate some concerns, the five main areas below motivate various proposals for better option pricing models.

1.1. The limits of the Black-Scholes model

Asset returns do not follow normal distributions but present leptokurtic attributes. The returns of most underlying assets are asymmetric with fat tails and abnormal values at higher moments. For example, the average daily return of Apple, Inc. is 0.11% from December 15, 1980, to March 8, 2021, with a skewness of −0.38 and a kurtosis of 17.91. The returns based on its historical stock prices reject the null hypothesis of normality at the 1% significance level. Assets with smaller market capitalizations present more substantial deviations from a normal distribution. Even assets that are considered well-diversified and highly liquid show the non-normal pattern. For example, the Exchange-Traded Fund (ETF) SPY, which tracks the performance of Standard and Poor’s 500 Index, has a skewness of −0.05 and a kurtosis of 12.06 from January 29, 1993, to March 8, 2021. These distribution parameters reject the null hypothesis of normality in the Jarque-Bera test at the 1% significance level.

∗ Corresponding author: Huijian Dong, School of Business, New Jersey City University, 200 Hudson St, Jersey City, NJ 07032, USA. Tel.: +1 201 200 3168; Fax: +1 201 200 2004; E-mail: hdong@njcu.edu.
ISSN 1574-1699/$35.00 © 2022 – IOS Press. All rights reserved.

The financial market applies different risk measures. The two most widely used measures are the standard deviation and the value-at-risk (VaR).
The methods are not consistent: the former is subject to frequency conversion errors, and the latter violates the subadditivity axiom. For example, when the monthly standard deviation of asset returns is used as the risk input of the Black-Scholes model, the other model inputs are converted to monthly variables to match the frequency. This conversion produces errors by omitting the volatilities of the other inputs. By compounding the average daily risk-free interest rates to monthly rates, one implicitly assumes that the variable under conversion is a constant without variation. Besides, the frequently used risk measure VaR cannot serve as an input for the Black-Scholes model, even though it is quantile based and free from the restriction of the normality assumption.

The autocorrelations of asset prices are generally insignificant at lower sampling frequencies, and asset prices are therefore regarded as following a random walk. However, the volatility of returns carries positive autocorrelation. Such autocorrelation is referred to as the volatility clustering effect, which describes the phenomenon that high volatilities tend to follow high volatilities. Volatility clustering implies that the autocorrelation of asset returns is not zero, particularly when observed with high-frequency data. The Black-Scholes model assumes that asset prices move with independent increments as a Lévy process and therefore does not accommodate the volatility clustering effect.

The theoretical volatility used as the input of the Black-Scholes formula is the same across the various call and put options on the same asset. However, the implied volatilities derived from the realized prices of call and put options with different strike prices and maturities on the same asset are usually different. The phenomenon that at-the-money options carry lower implied volatilities is referred to as the volatility smile or volatility smirk because the implied volatility is a convex or quasi-convex function of the strike price.
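The notion of an implied volatility can be made concrete with a minimal sketch: it inverts the Black-Scholes formula for a European call by bisection. The function names (`norm_cdf`, `bs_call_price`, `implied_vol`) are hypothetical, and the sketch assumes a European call without dividends, whereas the SPY contracts studied in this paper are American style, so it illustrates the concept rather than the procedure used here.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot, strike, maturity, rate, sigma):
    """Black-Scholes price of a European call (assumption: no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def implied_vol(price, spot, strike, maturity, rate, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; valid because the call price increases with sigma."""
    p_lo = bs_call_price(spot, strike, maturity, rate, lo)
    p_hi = bs_call_price(spot, strike, maturity, rate, hi)
    if not p_lo <= price <= p_hi:
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, maturity, rate, mid) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to quoted prices across a range of strikes and plotting the results against the strike would trace out the smile or smirk described above; under the Black-Scholes assumptions the curve would be flat.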
The microstructure of the option transaction does not have a role in the Black-Scholes model. Except for option contracts created on highly liquid underlying assets, most option transactions are conducted in a less liquid environment. In this case, the best ask and bid prices are moved by the demand and supply of each contract. Drapeau, Wang and Wang (2021) confirm that option prices carry a risk premium from this perspective. Given that each contract carries specific desirable characteristics from an investor’s perspective, the ask and bid prices are affected by the interactions at the microstructure level of the market. However, the Black-Scholes model does not include a liquidity premium to reflect the significant impact of illiquidity.

Therefore, there is a need for an innovative view and method for option pricing. Such a method must not only avoid the issues associated with the Black-Scholes model described above but also meet the following criteria. The method should carry economic meaning that makes sense from an empirical perspective. The attributes of the method need to be observable in the marketplace and consistent with the behavior of an investor with rational expectations and sentiment. Besides, it should be robust and adaptive to empirical needs. The method should price options correctly without producing arbitrage opportunities when faced with extreme market conditions such as fat-tailed volatilities or thin liquidity. Furthermore, it should be simple and easy to compute. The ideal method should yield closed-form solutions for standard options and path-dependent options, so that the model is solved analytically rather than numerically.

1.2. The stochastic volatility models

Studies in the option pricing field strive to fix the five issues of the Black-Scholes model described in the previous section. The development of pricing models and techniques is summarized below.
Concerning the normality assumption problem, Barndorff-Nielsen and Shephard (2001) propose using the generalized hyperbolic models to fit the asset returns with other distributions to manage the heterogeneous skewness and kurtosis. The five parameters of the generalized form of the superclass can be controlled to specify one of five probability density functions: the Student’s t-distribution, the Laplace distribution, the hyperbolic distribution, the normal-inverse Gaussian distribution, and the variance-gamma distribution.

Concerning the volatility clustering problem, Mandelbrot (1963) sheds light on the asset return dependence issue by replacing the Brownian motion assumption of an asset with a fractional Brownian motion, which allows for dependent asset price updates. Rogers (1997), however, reminds us that this causes inconsistent asset pricing and hence arbitrage opportunities. Past studies also contribute a significant number of investigations on constant elasticity of variance models and stochastic volatility models to include the volatility clustering effect. Representative studies include Cox and Ross (1976), Engle (1982), Hull and White (1987), Heston (1993), and Davydov and Linetsky (2001). Recently, Marks and Simon (2017) further conclude that sector implied volatilities overreact to both positive and negative idiosyncratic returns and substantially reverse in the next period.

The Heston model is a commonly used stochastic volatility model. This model defines that the randomness of the variance process varies as the square root of variance. The differential equation for variance is:

\[ d\nu_t = \theta(\omega - \nu_t)\,dt + \xi\sqrt{\nu_t}\,dB_t \]

In the definition, \(\omega\) is the mean long-term variance, \(\theta\) is the rate at which the variance reverts to the long-term mean, and \(\xi\) is the volatility of the variance process.
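To see how the three parameters shape the variance path, the following is a minimal sketch that discretizes the Heston variance equation with an Euler scheme. The function name `simulate_heston_variance` and the "full truncation" treatment of negative values (taking max(ν, 0) inside the drift and the square root) are assumptions made for this illustration; they are one common discretization choice, not part of the model definition above.

```python
import math
import random


def simulate_heston_variance(v0, omega, theta, xi, dt, n_steps, seed=0):
    """Euler path of d(nu) = theta*(omega - nu)*dt + xi*sqrt(nu)*dB.

    Assumption: full truncation, i.e. max(nu, 0) is used in the drift and
    diffusion terms so the square root is always defined.
    """
    rng = random.Random(seed)
    v = v0
    path = [max(v0, 0.0)]
    for _ in range(n_steps):
        v_pos = max(v, 0.0)
        d_b = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        v = v + theta * (omega - v_pos) * dt + xi * math.sqrt(v_pos) * d_b
        path.append(max(v, 0.0))
    return path
```

For example, `simulate_heston_variance(0.09, 0.04, 2.0, 0.3, 1 / 252, 252)` produces one year of daily variance values that are pulled from 0.09 toward the long-term level 0.04; setting `xi=0` removes the randomness and leaves only the mean reversion.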
The essence of the Heston model is that the variance is stable in the long term and that the randomness of the variance depends on its level.

Another prevailing stochastic volatility model is the Constant Elasticity of Variance model. The relationship between volatility and price is:

\[ dS_t = \mu S_t\,dt + \sigma S_t^{\gamma}\,dW_t \]

If \(\gamma > 1\), the volatility increases as the price increases, and if \(\gamma < 1\), the volatility increases as the price decreases. This model is sometimes regarded as only a local volatility model because it does not carry an independent stochastic process.

The Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model is frequently used to relax the constant variance assumption made by the traditional Black-Scholes model. The model differs from the Heston model in that it assumes the randomness of the variance changes with the variance. In contrast, the Heston model assumes the randomness of the variance changes with the square root of variance. Though the variance produced by GARCH improves on the Black-Scholes model assumption, it is not stochastic since the volatility is a deterministic function of the previously realized volatility rather than a perfectly random process.

\[ d\nu_t = \theta(\omega - \nu_t)\,dt + \xi\nu_t\,dB_t \]

A similar attempt is the 3/2 model, though it assumes the randomness of the variance changes with \(\nu_t^{3/2}\), and the differential equation for variance is:

\[ d\nu_t = \nu_t(\omega - \theta\nu_t)\,dt + \xi\nu_t^{3/2}\,dB_t \]

Furthermore, the efforts to relax the Black-Scholes model assumptions go beyond the correction of the volatility setting. Both academia and financial industry practitioners devote more attention to addressing the volatility smile issue. This issue is mainly due to the mismatch between the option values produced by the pricing models and the realized prices quoted at the marketplace. When the observed asset prices experience a dramatic change, the pricing models lack the adaptivity to incorporate the volatility shocks at the deep-in-the-money and the deep-out-of-the-money ends.
Allowing for random volatility is not sufficient for some often-seen price jumps. Models without volatility jumps may be fundamentally misspecified because they need implausibly large volatility shocks to generate large option price movements. For example, the Heston model, which has a square-root variance stochastic volatility, requires an eight-standard-deviation shock in returns to generate a price gap. The gap is unattainable by adjusting the square-root specification. Jacquier et al. (2001) find that the same misspecification still exists in a log-variance model. There is a need to incorporate the price jumps in the model rather than relying on extremely fat tails produced solely by stochastic volatilities. Therefore, the efforts of the recent literature mainly focus on applying and advancing the jump-diffusion models in the option pricing process.

1.3. The jump-diffusion models

To resolve multiple issues with the Black-Scholes model, the most recent development pursues the jump-diffusion models. Some representative studies are Kou (2002) and Eraker et al. (2003). These models carry some features that improve the quality of option pricing: they reproduce the leptokurtic feature of the return distribution, incorporate the volatility smile and smirk phenomena, and derive closed-form solutions for path-dependent options.

Given the constant volatility problem of the Black-Scholes model, the stochastic volatility models prevail. Stochastic volatility models incorporate the variance as a stochastic process to relax the assumption of fixed volatility of the underlying asset.

The building blocks of a jump-diffusion model are a Poisson process and a Brownian motion. The former fits the jumps in asset price, and the latter fits the diffusion. This model mimics an asset price series that sometimes jumps and has a continuous but random evolution between the jumps.
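The two building blocks can be combined in a few lines. The sketch below simulates a price path whose log-price receives a Brownian increment at every step plus a Poisson-distributed number of jumps. The names `poisson_draw` and `simulate_jump_diffusion` are hypothetical, and the normally distributed jump sizes are an assumption chosen for brevity; Kou (2002), for instance, uses a different jump-size distribution.

```python
import math
import random


def poisson_draw(rng, lam):
    """Poisson sample by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_jump_diffusion(s0, mu, sigma, jump_intensity, jump_mean, jump_std,
                            dt, n_steps, seed=0):
    """Price path with Brownian diffusion plus Poisson-timed log-price jumps.

    Assumption: jump sizes in the log-price are normal(jump_mean, jump_std).
    """
    rng = random.Random(seed)
    log_s = math.log(s0)
    path = [s0]
    for _ in range(n_steps):
        # Continuous but random evolution between jumps.
        diffusion = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Number of jumps arriving in this step.
        n_jumps = poisson_draw(rng, jump_intensity * dt)
        jumps = sum(rng.gauss(jump_mean, jump_std) for _ in range(n_jumps))
        log_s += diffusion + jumps
        path.append(math.exp(log_s))
    return path
```

With `jump_intensity=0` the path reduces to a geometric Brownian motion, which makes it easy to compare the two specifications on the same seed.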
The stochastic volatility model is developed from the continuous-time form of the Black-Scholes formula:

\[ \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0 \]

\[ dS_t = \mu S_t\,dt + \sigma S_t\,dz_t \]

The maximum likelihood estimator of \(\sigma\) is

\[ \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (t_i - t_{i-1}) \left( \frac{\ln\left(S_{t_i}/S_{t_{i-1}}\right)}{t_i - t_{i-1}} - \frac{\ln\left(S_{t_n}/S_{t_0}\right)}{t_n - t_0} \right)^2 \]

Allowing for time-dependent volatility,

\[ dS_t = \mu S_t\,dt + \sqrt{\nu_t}\,S_t\,dz_t \]

\[ d\nu_t = \alpha_{\nu,t}\,dt + \beta_{\nu,t}\,dB_t \]

In the definition of volatility, \(\alpha_{\nu,t}\) and \(\beta_{\nu,t}\) are functions of \(\nu_t\), and \(dB_t\) is a standard normally distributed variable. The form of \(\nu_t\) depends on the specific setting of the stochastic model summarized below.

However, one key demand is missing from the jump-diffusion models: the volatility clustering effect. The stochastic volatility models resolve this concern. Duffie et al. (2000) propose combining jump-diffusions with stochastic volatilities, resulting in the affine jump-diffusion models. Filipović and Larsson (2020) refine the model to accommodate the non-normal distribution, the fitting of the volatility smile, and the volatility clustering effect. However, the affine jump-diffusion model is impractical for the financial industry as its users find it difficult to solve, even numerically.

2. The spline model

Despite the efforts summarized above, two significant gaps remain in the field of forecasting option prices. Regarding the first gap, the latest development in the option pricing models, such as the affine jump-diffusion models, does not address all the remaining issues of the Black-Scholes model. The two obstacles left are the risk measures and the liquidity premium. As we specified in the previous section, the financial market applies different risk measures, namely the standard deviation and the value-at-risk (VaR). The former carries conversion errors, and the latter fails to serve as the input of a parametric model.
In addition, none of the recently developed option pricing models sheds light on the liquidity premium of option prices in the real market environment. To sum up, the first gap is the lack of connection between the theoretical prices and the quality of execution in the actual marketplace.

Regarding the second gap, under the current state of knowledge, the option price forecast still relies on the results of the option pricing models. The best forecast of the future price is taken to be the expected value of the current option price. From this perspective, the current literature blends two topics: forecasting future option prices and pricing current option contracts. No forecast method has been proposed or applied independently from the option pricing models.

The need to forecast option prices is apparent and meaningful. The purpose of a high-quality forecast of option prices is not limited to pursuing excess return. It may also improve the budget process of portfolio risk management if options are involved as hedging tools. However, relying on a perfect option pricing model to conduct the price forecast is unrealistic. This is not only because such a perfect model does not exist but also because most option contract transactions are illiquid. The thin liquidity causes option price jumps that leave gaps in the continuous time-series data. Hence the best forecast for the future price may not be the expected value of the current price.

Therefore, we explore an independent option price forecast model free from the option pricing models:

\[ p_{t+1} = F^{\text{spline}}_{1,2,\ldots}(p_t, p_{t-1}, \ldots, p_1) \]

The focus is the accuracy of the forecasted asset price at \(t + 1\), rather than whether the price produced from pricing models fits the asset price at \(t\). In this forecast model, we regard the realized option prices as exogenous and do not assume their distributions.
We also take into consideration the market liquidity conditions.

A good forecast model should carry additional features besides accuracy. These additional criteria include simplicity, minimized assumptions, and a degree of accuracy that is independent of market conditions. In practice, however, option prices are affected by factors from four perspectives: the fundamental value of the asset, the specifications of the option contract, the dynamic game among market makers, buyers, and sellers, as well as the investor expectations. It is therefore challenging to develop a simple yet effective forecasting model with minimal assumptions.

We therefore change track and consider the option prices as exogenous. Rather than decomposing the option returns by factors, as the factor models in equity asset pricing do, we treat option prices in an ex post manner and as black boxes. We only focus on the autocorrelations of option prices and avoid explaining the reasons that support their current price level.

Empirically, past studies have made numerous efforts to fit a time series, for example, the ARIMA model, curve fitting with exponential terms, Gaussian class models, and sigmoid shapes, to name a few. On the one hand, using a parametric model to fit the option prices has its downside: its fixed set of parameters cannot adjust for option prices with return jumps and volatility jumps. This can cause significant forecast errors and error dependence. On the other hand, using a nonparametric model to fit the prices has its downside, too: its forecasted value cannot be specified with certainty.

Among these alternatives, we propose using the spline method to forecast option prices. Studies in this field have attempted to apply the spline functions to different asset markets.
Some representative ones include Vasicek and Fong (1982), who first use splines on term structures; Audrino and Bühlmann (2009), who use splines to model financial volatility; Tian (2015), who applies spline models to the implied binomial tree; and Rashidinia and Jamalzadeh (2017), who connect the spline models with option prices. This method minimizes the assumptions on data pattern and distribution. It is also less prone to be affected by the specific types of option contracts regarding their parameters, as presented in Tables 2 and 3. In fact, the only input restriction on the data used to generate a prediction is that the discrete data need to be bounded values in the nonnegative real number set:

\[ p_i \in \mathbb{R}_{\geq 0}, \quad \exists M \in \mathbb{R}^{+}: \ |p_i| \leq M \]

Splines are piecewise polynomial functions. Using splines in curve fitting and observation interpolation yields better results than single polynomial regressions because it produces similar results without increasing the degree of the polynomials and avoids Runge’s phenomenon. Spline functions for interpolation are the minimizers of suitable roughness measures subject to the observation and boundary constraints. The model regression minimizes a weighted combination of the average squared approximation error over observed data and the roughness measure.

We name the spline function \(S\) and define it as a cubic polynomial function on the domain \([x_{j-1}, x_{j+k+1}]\), mapping to the real numbers:

\[ S : [x_{j-1}, x_{j+k+1}] \to \mathbb{R} \]

As \(S\) is nonlinear, we define its \(k + 2\) sub-domains in a piecewise manner with pairwise disjoint interiors:

\[ [x_{j-1}, x_{j+k+1}] = [x_{j-1}, x_j] \cup [x_j, x_{j+1}] \cup [x_{j+1}, x_{j+2}] \cup \cdots \cup [x_{j+k-1}, x_{j+k}] \cup [x_{j+k}, x_{j+k+1}] \]

For each sub-domain, we define a curve, for example:

\[ C_j : [x_j, x_{j+1}] \to \mathbb{R} \]

To make the curve a cubic polynomial:

\[ C_{j-1}(x) = a_{0,j-1} + a_{1,j-1}x + a_{2,j-1}x^2 + a_{3,j-1}x^3 \]

Therefore, the knots \(x_{j-1}, \ldots,\)
\(x_{j+k+1}\) divide the spline into \(k + 2\) forms:

\[ S(x) = \begin{cases} C_{j-1}(x), & \text{for } x \in [x_{j-1}, x_j] \\ C_j(x), & \text{for } x \in [x_j, x_{j+1}] \\ \quad\vdots \\ C_{j+k}(x), & \text{for } x \in [x_{j+k}, x_{j+k+1}] \end{cases} \]

The choice of spline degree affects the shape of the curve fit. If the degree is 0, the spline fit is a constant between every two knots. If the degree is 1, the spline fit is a linear function. If the degree is 3, the spline fit is a cubic function that combines concave and convex curves without introducing the boundary connection constraints. In addition, any cubic spline is compatible with linear and constant functions. Therefore our study uses the cubic basis spline (Cubic B Spline) as the instrument to fit and forecast option prices, given the support of our large dataset and sample sizes sufficient to maintain satisfactory degrees of freedom.

We do not set up the locations of knots in an ex ante manner. This avoids introducing subjectivity in the identification of a price jump. The threshold of a price jump changes over the life of an option contract, and there is no readily available algorithm to determine the significance of a price jump in a continuous series of price quotes. Instead, we allow one additional knot to be added at a free location for every 20 new price quotes recorded and included in the dataset. This method is based on the common practice of requiring 20 more observations for each additional degree of freedom. When a new knot is added in the spline regression, its location and the locations of the previous knots are recalculated to find the cubic functions that fit the data best with the newly defined cutoff range of the independent variable. This procedure is implemented by the constrained nonlinear multivariable function minimization algorithm. The problem is specified by

\[ \min_x f(x) \quad \text{s.t.} \quad \begin{cases} c(x) \leq 0 \\ ceq(x) = 0 \\ A \cdot x \leq b \\ Aeq \cdot x = beq \\ x \in [\text{lower bound}, \text{upper bound}] \end{cases} \]

where \(b\) and \(beq\) are vectors, \(A\) and \(Aeq\) are matrices, \(c(x)\) and \(ceq(x)\) are functions that return vectors, and \(f(x)\) is a function that returns a scalar. \(f(x)\), \(c(x)\), and \(ceq(x)\) can be nonlinear functions. The minimization with the arguments \((f(x), x_0, A, b, Aeq, beq, lb, ub, \text{nonlinear condition})\) starts from \(x_0\) and is subject to the nonlinear inequalities \(c(x)\) or equalities \(ceq(x)\) defined in the nonlinear condition. \(x\) is optimized such that \(c(x) \leq 0\) and \(ceq(x) = 0\).

3. Forecasting option prices with spline model

We use the daily American style option prices of the SPY to conduct our forecast tests. The underlying asset SPY is an Exchange-Traded Fund (ETF) that tracks the Standard and Poor’s 500 Index performance. There are four reasons we select the options based on SPY: (1) it is the most active asset traded in the financial market in the United States with the highest Assets Under Management (AUM) value; (2) it attracts the most complete and uninterrupted option chains and types of contracts created by a broad range of market makers and investors; (3) the financial market and data providers keep the longest record for SPY compared to other assets; (4) most of the literature in the option pricing field uses SPY as the instrument of pricing precision. We collect more than 50 studies using SPY options as the representative assets, and the list is available upon request.

The option data and the related market data are from the Options Pricing and Reporting Authority (OPRA). The data is referred to as the National Best Bid Offer (NBBO). The OPRA is responsible for consolidating the prices from all the individual option exchanges and publishing the NBBO. The data is sampled at the end of each trading day, and the coverage is from February 8, 2005, to May 29, 2020. Due to the rich spectrum of contracts created based on SPY, each option variable includes 476,882 observations. Table 1 illustrates the data description.

Table 1
Summary statistics of the Greeks, pricing factors, and market transactions

| Variable | Mean | Median | Maximum | Minimum | Std. Dev. | Skewness | Kurtosis | Jarque-Bera |
| Option ask price | 17.87 | 14.00 | 1.88E+02 | 0.00 | 16.65 | 1.39 | 5.63 | 2.92E+05 |
| Option bid price | 17.49 | 13.00 | 1.78E+02 | 0.00 | 16.36 | 1.39 | 5.51 | 2.78E+05 |
| Forecast error of ask price | 3.66E-03 | 5.54E-17 | 60.52 | −46.17 | 1.33 | 0.23 | 57.80 | 5.97E+07 |
| Option δ | 0.13 | 0.00 | 1.00 | −1.00 | 0.69 | −0.17 | 2.12 | 1.78E+04 |
| Option γ | 8.39E-06 | 0.00 | 1.00 | 0.00 | 2.90E-03 | 3.45E+02 | 1.19E+05 | 2.82E+14 |
| Option θ | −5.32 | −4.00 | 50.00 | −9.56E+03 | 32.83 | −1.64E+02 | 4.11E+04 | 3.35E+13 |
| Option υ | 55.64 | 49.00 | 2.14E+02 | 0.00 | 40.52 | 0.61 | 2.61 | 3.21E+04 |
| Implied volatility from ask price | 0.02 | 0.00 | 14.00 | 0.00 | 0.20 | 15.46 | 4.49E+02 | 3.98E+09 |
| Implied volatility from bid price | 0.01 | 0.00 | 11.00 | 0.00 | 0.10 | 23.10 | 1.14E+03 | 2.59E+10 |
| Implied volatility from mean price | 0.01 | 0.00 | 12.00 | 0.00 | 0.10 | 24.26 | 1.54E+03 | 4.69E+10 |
| Moneyness (K/St) | 0.99 | 0.97 | 2.35 | 0.45 | 0.16 | 1.02 | 6.20 | 2.85E+05 |
| Strike prices (K) | 1.83E+02 | 1.75E+02 | 4.00E+02 | 60.00 | 60.65 | 0.41 | 2.20 | 2.57E+04 |
| Underlying asset price (St) | 1.88E+02 | 1.87E+02 | 3.38E+02 | 68.11 | 64.27 | 0.34 | 2.00 | 2.92E+04 |
| Contract remaining life | 4.01E+02 | 3.25E+02 | 1.07E+03 | 1.00 | 2.89E+02 | 0.60 | 2.29 | 3.91E+04 |
| Open interest | 1.29E+04 | 3.02E+03 | 5.43E+05 | 0.00 | 2.87E+04 | 5.07 | 40.41 | 2.99E+07 |
| Open interest T + 1 | 1.29E+04 | 3.00E+03 | 5.43E+05 | 0.00 | 2.87E+04 | 5.08 | 40.49 | 3.00E+07 |
| Option trade volume | 7.04E+02 | 2.00 | 1.18E+06 | 0.00 | 6.25E+03 | 64.55 | 8.79E+03 | 1.53E+12 |

In Table 1, each variable includes 476,882 observations.
The forecast errors are based on the projection from the spline model on the options of SPY from February 8, 2005, to May 29, 2020.

Options are assigned to one of three cycles at the moment of creation: JAJO (January, April, July, and October), FMAN (February, May, August, and November), and MJSD (March, June, September, and December). The options for SPY follow the MJSD cycle. Therefore, new contracts based on the SPY price are created to expire in March, June, September, and December. In addition, option investors are provided the contracts for the first two front months followed by the two remaining cycle months. After a month passes, the last two remaining months continue to follow the cycle assigned initially. After two months pass, the contract with the nearest expiration is concluded, and a new cycle month is about to start. This provides contracts of various lengths to meet the different needs of investors. In recent years, market makers have created weekly options to accommodate heterogeneous demands. However, our study focuses on the conventional option contracts that are still mainstream and does not include contracts whose entire life is shorter than 20 days.

For each option contract of SPY, we start to fit its first spline curve after accumulating 20 days of transactions. This leaves 16 degrees of freedom to shape the coefficients of the first cubic polynomial. As the transactions continue, one knot is added for every 20 additional days by which the age of the option grows, though the location of the knot is not pre-determined. Whereas most of the literature uses the average of the ask and bid prices as the option price, our study uses the four series of ask and bid prices for the call and put options directly quoted from the marketplace. We believe that this is a better benchmark of the original option prices, free from the impact of the spread. In Fig.
1, we use two spline regressions to illustrate the development of the spline functions and the shift of the knots as more samples are included. From left to right and from the top to the bottom splines, the number of observations increases from 100 to 130. The number of knots increases from 5 to 6. The vertical dashed lines indicate the location and cutoff of the knots.

Fig. 1. Spline fit of a growing sample.

Fig. 1 illustrates the process of growing the data sample and the number of knots without a deterministic setup of knot location. With more option price quotes being realized as time passes, the knots are recalculated and relocated. Accordingly, the parameters of the cubic polynomial functions that fit the data adjust to minimize the error. A new knot is added after every 20 new observations are included in the sample. The splines are continuous, but the joints are not differentiable. The cubic polynomial functions present piecewise concavity and convexity, with at most one inflection point in each cubic polynomial function. The MATLAB® codes of the spline modeling and knot identification involved in this paper are available upon request.

The right-most cubic polynomial between the last two knots can be fit in the form of the following equation. We define knot \(t\) as the right-most knot at the current moment and knot \(t - j\) as the second right-most knot located \(j\) trading days ago.

\[ C(t) = \hat{a}_0^{\,\text{knot } t,\ \text{knot } t-j+1} + \hat{a}_1^{\,\text{knot } t,\ \text{knot } t-j+1}\, t + \hat{a}_2^{\,\text{knot } t,\ \text{knot } t-j+1}\, t^2 + \hat{a}_3^{\,\text{knot } t,\ \text{knot } t-j+1}\, t^3 \]

We use this polynomial to generate the predicted price of a SPY option contract:

\[ C(t+1) = \hat{a}_0^{\,\text{knot } t,\ \text{knot } t-j+1} + \hat{a}_1^{\,\text{knot } t,\ \text{knot } t-j+1}\,(t+1) + \hat{a}_2^{\,\text{knot } t,\ \text{knot } t-j+1}\,(t+1)^2 + \hat{a}_3^{\,\text{knot } t,\ \text{knot } t-j+1}\,(t+1)^3 \]

We compute the forecast error as the deviation of the forecasted option price from the realized price:

\[ \varepsilon_{t+1} = C(t+1) - p_{t+1} \]

\[ \varepsilon_{t+1} = C(t+1) - F^{\text{spline}}_{1,2,\ldots}(p_t, p_{t-1}, \ldots, p_1) \]

The results, combining the ask and bid prices for both call and put options, show that the performance of the spline forecast makes it an ideal and practical tool for the market. The average forecast error is $3.66 × 10⁻³. This is an insignificant level compared to the bid-ask spread and the transaction cost. It is also a trivial amount compared to the average option ask price ($17.87) and the average bid price ($17.49) of the options. In fact, the average forecast error is affected by a few outliers; the median forecast error is as low as $5.54 × 10⁻¹⁷, with a standard deviation of 1.33.

As discussed in Section 2, a good forecasting model should produce not only accurate but also robust predictions. We test the independence of the forecast error from different attributes of contracts. The attributes of options come from two perspectives: the Greeks and the Black-Scholes factors. We use the Newey-West heteroscedasticity-consistent and autocorrelation-consistent (HAC) estimator procedure (Newey & West, 1987) in the cross-sectional ordinary least squares (OLS) regressions. We present the results in Tables 2 and 3. The results of an OLS regression without this procedure are presented in Appendix A.1. We argue that pooling the Greeks of an option and the transaction volumes of an option in the regression of forecast error without accounting for the serial correlation and heteroscedasticity problems among independent variables would cause false significance.

The spline forecast errors are not affected by the Greeks, as presented in Table 2. In Table 2, we present the relationship between the ask price forecast error from the spline model and the SPY option Greeks. Single, double, and triple asterisks (*, **, ***) indicate statistical significance at the 10%, 5%, and 1% levels. The standard errors of the regression coefficients are reported in the parentheses.
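The fit-then-extrapolate step behind the forecast \(C(t+1)\) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' MATLAB implementation: the function name `spline_forecast` is hypothetical, the interior knots are placed at equally spaced positions instead of being relocated by constrained nonlinear minimization as described in Section 2, and the truncated-power basis used here yields a spline with continuous first and second derivatives at the knots. The number of interior knots (one fewer than the number of complete 20-observation blocks) is one reading of the knot-addition rule.

```python
import numpy as np


def spline_forecast(prices, obs_per_knot=20):
    """One-step-ahead forecast from a least-squares cubic spline.

    Assumptions: equally spaced interior knots and a truncated-power basis;
    the paper instead lets the knot locations move freely.
    """
    p = np.asarray(prices, dtype=float)
    n = p.size
    if n < obs_per_knot:
        raise ValueError("need at least obs_per_knot observations")
    scale = float(n)                      # rescale time to (0, 1] for conditioning
    x = np.arange(1, n + 1, dtype=float) / scale
    n_knots = n // obs_per_knot - 1       # 0 knots for the first 20..39 observations
    knots = np.linspace(x[0], x[-1], n_knots + 2)[1:-1]

    def design(z):
        # Columns: 1, z, z^2, z^3, then one truncated cubic per interior knot.
        cols = [np.ones_like(z), z, z ** 2, z ** 3]
        cols += [np.clip(z - k, 0.0, None) ** 3 for k in knots]
        return np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(design(x), p, rcond=None)
    # Beyond the last knot the spline is the right-most cubic, so evaluating
    # the basis at t + 1 extends that cubic by one period.
    x_next = np.array([(n + 1) / scale])
    return float(design(x_next)[0] @ coef)
```

Given a realized price `p_next`, the forecast error in the sense of the equations above would be `spline_forecast(prices) - p_next`; calling the function again after each new quote mimics the growing-sample procedure, with a knot added for every 20 additional observations.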
We continue the robustness check of the spline forecast error with respect to the Black-Scholes formula factors: days to expiration, volatility, underlying asset price, moneyness, and the option price. Numerous studies adopt different criteria for categorizing the moneyness of options. Though the technical threshold values differ, these past studies generally use one of two methods: categorizing the moneyness by the ratio of the strike price to the current underlying asset price, or by the option's delta. Cosma et al. (2020) follow the practice of Bollen and Whaley (2004) and use 0.375 and 0.625 as the threshold values of the option's delta to categorize ATM, ITM, and OTM options. Feunou and Okou (2019) define OTM calls as the options with delta lower than 0.5 and OTM puts as the options with delta higher than 0.5. Christoffersen et al. (2014) also use option delta to categorize options by their moneyness. Barone-Adesi et al. (2020) regard the options whose K/S_t ratios fall between 0.85 and 1 (puts) or between 1 and 1.15 (calls) as OTM, and less than 0.85 (puts) or greater than 1.15 (calls) as DOTM. Alitab et al. (2019) regard the options whose K/S_t ratios are between 0.98 and 1.02 as ATM, 0.9 to 0.98 (puts) and 1.02 to 1.1 (calls) as OTM, and 0.7 to 0.9 (puts) and 1.1 to 1.3 (calls) as DOTM. Leippold and Vasiljević (2020) regard the options whose K/S_t ratios are between 0.97 and 1.03 as ATM, and 0.94 to 0.97 (puts) and 1.03 to 1.06 (calls) as OTM. However, Christoffersen et al. (2018) use S_t/K between 0.975 and 1.025 as the standard for ATM.

Table 2
Ask price forecast error and the Greeks of the SPY options with Newey-West procedure
Dependent variable: ask price error. Number of observations: 476882.
Greeks: Intercept, Delta, Gamma, Theta, Vega

Table 3
Ask price forecast error and the pricing factors of the SPY options with Newey-West procedure
Dependent variable: ask price error. Number of observations: 476882.
Pricing factors: Intercept, Days to expiration, IV ask, Ask, Underlying price, Moneyness

Coefficient estimates of Tables 2 and 3, with standard errors in parentheses:
0.0043 (0.0070) −1.56E-06 (1.51E-05) −0.0146 (0.0261) 0.0040 (0.0051) 0.0001 (1.85E-04) 0.0062 (0.0059) −0.0155 (0.0134) 7.93E-05 (1.27E-04) 0.0013 (0.0078) −0.0155 (0.0134) 0.1929 (0.3609) 0.0001 (1.86E-04) 0.0043 (0.0053) 0.1934 (0.3614) −0.0011 (0.0076) 8.48E-05 (1.28E-04) 0.0057 (0.0058) −0.0156 (0.0134) 0.1868 (0.3659) 0.0001 (1.85E-04) 8.10E-05 (1.28E-04) −0.0003 (0.0077) 0.0062 (0.0059) −0.0155 (0.0134) 0.1907 (0.3654) 0.0001 (1.85E-04) −7.18E-05 (9.91E-05) 0.0172 (0.0176) 0.0078 (0.0451) −0.0041 (0.0430) −0.0023*** (6.93E-04) 0.0001*** (8.41E-05) 0.0103 (0.0447) 0.0141 (0.0510) −7.57E-05 (1.03E-04) −0.0025 (0.0469) −0.0158 (0.0262) 0.0208 (0.0522) −0.0019 (0.0465) −0.0040 (0.0264) −0.0021*** (0.0007) 0.0439 (0.0509) −0.0021 (0.0266) −0.0022*** (0.0007) 9.53E-05 (9.49E-05) 0.0259 (0.0181) −7.23E-05 (1.02E-04) −0.0007 (0.0464) −0.0130 (0.0482) −0.0023*** (0.0007) 0.0440 (0.0511) 3.40E-05** (1.07E-05) 0.0181 (0.0152) −3.02E-07 (1.04E-05) −0.0024*** (0.0007) 9.58E-05 (9.45E-05) 0.0154 (0.0181) 3.25E-05** (1.59E-05) 0.0078 (0.0449) −0.0021 (0.0433) −3.97E-06 (1.43E-05) −0.0151 (0.0261) 7.93E-05 (1.27E-04) 0.0013 (0.0078) −0.0155 (0.0134) 0.1910 (0.3659)
8.48E-05 (1.28E-04) −0.0011 (0.0076) −0.0021*** (6.92E-04) 0.0420*** (0.0092) 0.0057 0.0037 0.0043 (0.0058) (0.0051) (0.0053) −0.0156 (0.0134) 0.1888 (0.3613) 0.0001 (1.86E-04) −7.32E-05 (9.92E-05) 0.0187 (0.0182) −2.07E-06 (1.49E-05) −0.0161 (0.0261) (0.0007) (1.60E-05) 0.0043 (0.0268) −0.0023*** 0.0313*** (0.0090) 3.30E-05** (0.0007) 9.59E-05 (8.41E-05) −0.0014 (0.0464) 0.0164 (0.0511) 3.34E-05* (1.71E-05) 0.0062 (0.0267) −0.0024*** 0.0020 −0.0003 0.0020 (0.0079) (0.0077) (0.0079) −0.0154 −0.0154 (0.0134) (0.0134) 0.1971 0.1946 (0.3610) (0.3655) 0.0001 0.0001 0.0001 (1.84E-04) (1.85E-04) (1.84E-04) 7.57E-05 8.10E-05 7.57E-05 (1.27E-04) (1.28E-04) (1.27E-04)

In Table 3, we present the relationship between the ask price forecast error from the spline model and the SPY option's pricing factors. Single, double, and triple asterisks (*, **, ***) indicate statistical significance at the 10%, 5%, and 1% levels. The standard errors of the regression coefficients are reported in parentheses.

As Table 3 shows, the spline forecast error is not affected by the implied volatility of the option, the price of the underlying asset, or the option's moneyness. However, two factors affect the forecast precision: the stage and the price of the option. When the remaining life of an option is longer, the forecast error is higher; the forecast precision increases as the option moves toward maturity. In addition, the forecast error is lower when the price of the option is higher. Both impacts are reasonable, and we argue that the former is due to the limited number of observations, whereas the latter is due to the skewness from the asset price bound.

Regarding the impact of the stage of the option, when the spline function fits the first cubic polynomial function based on the first 20 samples of realized option prices, the goodness of fit is relatively limited because of the fewer degrees of freedom and the lack of knots to identify structural breaks. Therefore, the first few predictions may carry large errors. The goodness of fit improves as more option prices are introduced to depict the cubic polynomial and more knots are inserted to allow optimal segmentation of the asset price series. In other words, the precision of the spline model forecast is less of a concern as an option walks out of its initial stage.

Regarding the impact of the option price, we propose that the price boundary can be a significant reason. A large portion of the option prices are zero, especially the bid prices of both call and put options when they are close to maturity or deep out of the money. The minimum price is zero, though the value predicted by the spline functions may be negative. This amplifies the forecast error when the option prices are lower. The spline model forecast precision is not a concern for option prices that are not close to zero. For option prices staying at zero for a long time, the forecast precision is not a concern either, because the spline model will most likely fit a constant curve to reflect the price fact. Overall, we believe that these impacts on the forecast errors are of limited concern. This is also because the errors are not cumulative and are mean-reverting, as the expected error size between each pair of adjacent knots is zero.

4. Expanded rationality

The forecast error is primarily not affected by the features carried by the contract terms of an option, as suggested by the previous section. This should be distinguished from the fact that it is related to some variables realized and observed in the marketplace. From a theoretical perspective, these variables are affected by the option price and are related to the forecast errors. These variables are not the ones that affect the option prices when an option contract is created. Therefore, these market-related variables are associated with the forecast errors but are not determinants of the forecast errors.

The motivation for investigating the relationship between these market-based variables and the forecast errors is primarily driven by the investors' momentum and mean reversion expectations. Investors observe the updates of the option prices and are usually able to form momentum expectations. Such expectations are drawn mainly by observing the price curve.
We argue that this should not be identified as technical analysis. Investors may have sound statistical evidence and experience drawn from the realized market data to support their expectations. Some examples of this evidence are the volatility clustering effect and the local autocorrelation of asset returns. Therefore, the spline fit curves can be partially regarded as the investors' expectations of the option prices when observing the price plots. Obviously, even the most proficient investor cannot depict and predict the spline price curve because the asset price follows a random walk outside its local autocorrelation zone. However, this does not imply that investors always fail to guess and predict the option prices. Therefore, we roughly use the spline curves as a proxy for the investor expectations of option prices based on their experience and observations of the market. Accordingly, the forecast errors of the spline curves are investor expectation errors.

Table 4
Ask price forecast error and the market transaction of the SPY options with Newey-West procedure
Dependent variable: ask price error. Number of observations: 476882.
Market: Intercept, Open interest, Open interest T1, Volume
Coefficient estimates, with standard errors in parentheses:
0.0036 0.0033 0.0041 (0.0058) (0.0058) (0.0052) 3.50E-09 (9.82E-08) 3.02E-08 (9.74E-08) −6.57E-07 (4.28E-07) 0.0035 (0.0058) −9.03E-07* (4.89E-07) 9.20E-07* (4.83E-07) 0.0030 0.0032 (0.0058) (0.0058) −8.32E-07** (4.81E-07) 9.41E-08 9.10E-07** (1.00E-07) (4.82E-07) −7.66E-07* −8.12E-07* −7.58E-07** (4.35E-07) (4.41E-07) (4.38E-07) 0.0034 (0.0058) 6.49E-08 (9.99E-08)

We explore how the market-based variables of an option affect these errors. We include the trading volume, the open interest, and the T + 1 open interest in the regressions. The interpretation of option trading volume is slightly different from that of equities.
The option trading volume is relative and needs to be compared to the average daily volume of the underlying assets. The open interest is the number of active contracts, which have been traded but not yet liquidated by an offsetting trade, exercise, or assignment. A buy to open (setting up a long position) or a sell to open (setting up a short position) contributes to the open interest. Conversely, a sell to close concludes the long position, and a buy to close concludes the short position. Both of these offsetting operations reduce the open interest. The T + 1 open interest observed at T indicates the updated open interest after some of the current open interest is traded and offset. The current open interest observed at the trading initiation of T + 1 combines the T + 1 open interest observed at T and the new positions opened at T + 1, assuming no trading volume has been executed at T + 1.

$$\text{Open interest}(T+1)_t = \text{Open interest}(T)_t - \text{Volume}(T)_t$$

$$\text{Open interest}(T)_{t+1} = \text{Open interest}(T+1)_t + \text{new opens}(T)_{t+1}$$

In Table 4, we present the relationship between the ask price forecast error from the spline model and the SPY option's transaction interests. Single, double, and triple asterisks (*, **, ***) indicate statistical significance at the 10%, 5%, and 1% levels. The standard errors of the regression coefficients are reported in parentheses.

Higher current open interest and volume indicate more robust demand for the option and better liquidity. This not only helps reduce the bid-ask spread but also indicates a lower degree of opinion divergence among investors. The results in Table 4 show that higher current open interest and higher volume are related to lower investor expectation error. In contrast, higher next-day open interest is related to higher investor expectation error.
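The two open-interest identities stated earlier in this section can be traced with a small numerical example. The sketch below follows the identities as written, so it treats the day's entire volume as offsetting trades; the function names and the contract counts are hypothetical and serve only as an illustration.

```python
def next_day_open_interest(open_interest_today, volume_today):
    """T + 1 open interest observed at T: current open interest net of the day's volume."""
    return open_interest_today - volume_today


def opening_open_interest(next_day_oi_observed_today, new_opens):
    """Open interest at the start of T + 1: carried-forward contracts plus new positions."""
    return next_day_oi_observed_today + new_opens


# Hypothetical contract counts for a single option series.
carried = next_day_open_interest(open_interest_today=1_000, volume_today=150)
opening = opening_open_interest(carried, new_opens=40)
print(carried, opening)
```

A thinner volume at T leaves a larger carried-forward figure, which is the quantity the Table 4 regressions label as the T + 1 open interest.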
With lower current open interest settled by trading volume, a larger amount of remaining open interest is carried forward when investors find the market difficult to predict and forecast errors are high. Under such circumstances, investors leave the positions open and realize thinner trading volume and liquidity. This provides evidence that investors expand their rational expectations. Investors are effective not only in the investments they conduct but also in the investments they terminate. A risk-averse investor hesitates to continue the option trading and execution when the option prices are more difficult to predict by observing the autocorrelations from the price curves.

5. Concluding remarks

This study proposes using spline modeling to predict option prices. Past studies implement option price forecasts by pursuing perfect asset pricing models and use the expected price produced by such models as the forecasted option price. We focus on direct price forecasting instead of option pricing. We consider the option price as exogenous and produce future prices from the historical prices directly, without decomposing the option price by its determinants. This helps avoid the problems of the Black-Scholes model that are not fully resolved by the recent developments of the new option pricing models.

We use four daily price series of the American options on SPY to examine the quality of the forecast: the ask and the bid prices of the calls and the puts. The spline model produces low option price forecast errors. The mean error size is $3.66 × 10⁻³, with a standard deviation of 1.33 and a median error of $5.54 × 10⁻¹⁷. The spline model is simple to use, and the forecasts are robust to different contract settings.

We also provide evidence that investors form rational expectations by observing the option prices and form a spline curve-like forecast in their minds. This guides their implemented investments. In addition, investors expand their rationality by terminating planned option investments when they believe that the option prices are more difficult to predict accurately.

However, using the spline model to forecast option prices is not without limitations. First, it is a forecast model with exogenous realized prices rather than an asset pricing model, and hence its model specifications do not carry economic meaning. Second, though it can produce reliable forecasts with certainty compared to nonparametric models, it does not have a fixed specification known ex ante compared to parametric models. Third, the forecast error size is larger for newly created options and low option prices.

Therefore, future studies can focus on combining the spline forecast with option pricing models. This can address both the limits of the spline method mentioned above and the limits of the Black-Scholes model and its developments. Another research route is to apply and expand the spline method to other financial markets across asset classes and geographical locations. The third route of future studies is to promote the methodological development of the spline method. Currently, the spline method remains at the pure function stage: the forms of the regression functions across knots are the same. A possible innovation is to relax the consistent-form constraint to allow for different function forms in one spline regression.

References

Alitab, D., Bormetti, G., Corsi, F., & Majewski, A.A. (2019). A realized volatility approach to option pricing with continuous and jump variance components. Decisions in Economics and Finance, 42(2), 639-664. doi: 10.1007/s10203-019-00241-2.
Audrino, F., & Bühlmann, P. (2009). Splines for financial volatility.
Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(3), 655-670. doi: 10.1111/j.1467-9868.2009.00696.x.
Barndorff-Nielsen, O.E., & Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 167-241. doi: 10.1111/1467-9868.00282.
Barone-Adesi, G., Fusari, N., Mira, A., & Sala, C. (2020). Option market trading activity and the estimation of the pricing kernel: A Bayesian approach. Journal of Econometrics, 216(2), 430-449. doi: 10.1016/j.jeconom.2019.11.001.
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637-654. doi: 10.1086/260062.
Bollen, N.P.B., & Whaley, R.E. (2004). Does net buying pressure affect the shape of implied volatility functions? The Journal of Finance, 59(2), 711-753. doi: 10.1111/j.1540-6261.2004.00647.x.
Christoffersen, P., Feunou, B., Jacobs, K., & Meddahi, N. (2014). The economic value of realized volatility: Using high-frequency returns for option valuation. Journal of Financial and Quantitative Analysis, 49(3), 663-697. doi: 10.1017/s0022109014000428.
Christoffersen, P., Fournier, M., & Jacobs, K. (2017). The factor structure in equity options. The Review of Financial Studies, 31(2), 595-637. doi: 10.1093/rfs/hhx089.
Cosma, A., Galluccio, S., Pederzoli, P., & Scaillet, O. (2018). Early exercise decision in American options with dividends, stochastic volatility, and jumps. Journal of Financial and Quantitative Analysis, 55(1), 331-356. doi: 10.1017/s0022109018001229.
Cox, J.C., & Ross, S.A. (1976). The valuation of options for alternative stochastic processes. Journal of Financial Economics, 3(1-2), 145-166. doi: 10.1016/0304-405x(76)90023-4.
Davydov, D., & Linetsky, V. (2001). Pricing and hedging path-dependent options under the CEV process. Management Science, 47(7), 949-965. doi: 10.1287/mnsc.47.7.949.9804.
Drapeau, S., Wang, T., & Wang, T. (2020). How rational are the option prices of Hong Kong dollar exchange rate? The Journal of Derivatives, jod.2020.1.120. doi: 10.3905/jod.2020.1.120.
Duffie, D., Pan, J., & Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica, 68(6), 1343-1376. doi: 10.1111/1468-0262.00164.
Engle, R.F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987. doi: 10.2307/1912773.
Eraker, B., Johannes, M., & Polson, N. (2003). The impact of jumps in volatility and returns. The Journal of Finance, 58(3), 1269-1300. doi: 10.1111/1540-6261.00566.
Feunou, B., & Okou, C. (2018). Good volatility, bad volatility, and option pricing. Journal of Financial and Quantitative Analysis, 54(2), 695-727. doi: 10.1017/s0022109018000777.
Filipović, D., & Larsson, M. (2020). Polynomial jump-diffusion models. Stochastic Systems, 10(1), 71-97. doi: 10.1287/stsy.2019.0052.
Heston, S.L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies, 6(2), 327-343. doi: 10.1093/rfs/6.2.327.
Hull, J., & White, A. (1987). The pricing of options on assets with stochastic volatilities. The Journal of Finance, 42(2), 281-300. doi: 10.1111/j.1540-6261.1987.tb02568.x.
Jacquier, E., Polson, N.G., & Rossi, P.E. (2004). Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. Journal of Econometrics, 122(1), 185-212. doi: 10.1016/j.jeconom.2003.09.001.
Kou, S.G. (2002). A jump-diffusion model for option pricing. Management Science, 48(8), 1086-1101. doi: 10.1287/mnsc.48.8.1086.166.
Leippold, M., & Vasiljević, N. (2020). Option-implied intrahorizon value at risk. Management Science, 66(1), 397-414. doi: 10.1287/mnsc.2018.3157.
Mandelbrot, B. (1963). The variation of certain speculative prices. The Journal of Business, 36(4), 394. doi: 10.1086/294632.
Marks, J.M., & Simon, D.P. (2017). Sector option implied volatility dynamics and predictability. The Journal of Derivatives, 25(2), 22-42. doi: 10.3905/jod.2017.25.2.022.
Newey, W.K., & West, K.D. (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55(3), 703. doi: 10.2307/1913610.
Rashidinia, J., & Jamalzadeh, S. (2017). Modified B-spline collocation approach for pricing American style Asian options. Mediterranean Journal of Mathematics, 14(3). doi: 10.1007/s00009-017-0913-y.
Rogers, L.C.G. (1997). Arbitrage with fractional Brownian motion. Mathematical Finance, 7(1), 95-105. doi: 10.1111/1467-9965.00025.
Tian, Y.S. (2015). Implied binomial trees with cubic spline smoothing. The Journal of Derivatives, 22(3), 40-55. doi: 10.3905/jod.2015.22.3.040.
Vasicek, O.A., & Fong, H.G. (1982). Term structure modeling using exponential splines. The Journal of Finance, 37(2), 339-348. doi: 10.1111/j.1540-6261.1982.tb03555.x.
