International Journal of Forecasting 22 (2006) 283–300
www.elsevier.com/locate/ijforecast

Using extreme value theory to measure value-at-risk for daily electricity spot prices☆

Kam Fong Chan a,*, Philip Gray b,1

a Department of Accounting and Finance, Faculty of Business and Economics, The University of Auckland, Private Bag 92019, Auckland, New Zealand
b UQ Business School, The University of Queensland, St. Lucia 4072, Australia

Abstract

The recent deregulation in electricity markets worldwide has heightened the importance of risk management in energy markets. Assessing Value-at-Risk (VaR) in electricity markets is arguably more difficult than in traditional financial markets because the distinctive features of the former result in a highly unusual distribution of returns: electricity returns are highly volatile, display seasonalities in both their mean and volatility, exhibit leverage effects and clustering in volatility, and feature extreme levels of skewness and kurtosis. With electricity applications in mind, this paper proposes a model that accommodates autoregression and weekly seasonals in both the conditional mean and conditional volatility of returns, as well as leverage effects via an EGARCH specification. In addition, extreme value theory (EVT) is adopted to explicitly model the tails of the return distribution. Compared to a number of other parametric models and simple historical simulation based approaches, the proposed EVT-based model performs well in forecasting out-of-sample VaR. In addition, statistical tests show that the proposed model provides appropriate interval coverage in both unconditional and, more importantly, conditional contexts. Overall, the results are encouraging in suggesting that the proposed EVT-based model is a useful technique in forecasting VaR in electricity markets.

© 2005 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
JEL classification: C14; C16; C53; G11
Keywords: Extreme value theory; Value-at-risk; Electricity; EGARCH; Conditional interval coverage
doi:10.1016/j.ijforecast.2005.10.002

☆ This paper is a revised version of Chapter Five of the first author's Ph.D. thesis at The University of Queensland, Australia.
* Corresponding author. Tel.: +64 9 373 7599x85172. E-mail addresses: k.chan@auckland.ac.nz (K.F. Chan), p.gray@business.uq.edu.au (P. Gray).
1 Tel.: +61 7 3365 6992.

1. Introduction

The recent worldwide deregulation of wholesale electricity markets has created opportunities and incentives for market participants to trade electricity spot prices and related derivatives. Trading in electricity markets is challenging because spot prices are highly volatile and exhibit occasional extreme price movements of magnitudes rarely seen in markets for traditional financial assets.2 As a result, energy industry participants often self-impose trading limits to prevent extreme price fluctuations from adversely affecting firm profitability and indeed the operation of the entire industry. Firms also require optimal trading limits to allocate capital to cover potential losses should the trading limits be violated. Obviously, over-capitalization implies idle capital, which compromises the firm's profitability. On the other hand, under-capitalization may cause financial distress should the firm be unable to honour its trading contracts.

One tool commonly used to establish optimal trading limits is Value-at-Risk (VaR). In general, VaR measures the amount a firm can lose with α% probability over a certain time horizon τ. If, for example, α = 5% and τ is one day, the VaR can be interpreted as the loss level that is expected to be exceeded on five days, on average, in each 100-day period.
An extensive discussion of VaR use in traditional financial markets can be found in Dowd (1998), Duffie and Pan (1997), Jorion (2000), Holton (2003) and Manganelli and Engle (2004), whilst energy VaR is detailed in Clewlow and Strickland (2000) and Eydeland and Wolyniec (2003).

2 The extreme movements are attributable to several distinctive features of electricity markets: (1) electricity cannot be stored effectively through time and space; and (2) electricity prices have inelastic demand curves and kinked supply curves (Cuaresma, Hlouskova, Kossmeier, & Obersteiner, 2004; Knittel & Roberts, 2001; Escribano, Pena, & Villaplana, 2002).

The conventional approaches to estimating VaR in practice can be broadly classed as parametric and non-parametric. Under the parametric approach, a specific distribution for asset returns must be postulated, with a Normal distribution being a common choice. In contrast, non-parametric approaches make no assumptions regarding the return distribution. As an example, the popular historical simulation method utilizes the empirical distribution of returns to proxy for the likely distribution of future returns. Both approaches are widely employed in financial markets, where prices seldom exhibit extreme movements. In electricity markets, however, the high volatility and occasional price spikes result in an empirical distribution of returns with a non-standard shape, making it difficult to specify a parametric form. As a result, parametric approaches may not generate accurate VaR measures in electricity markets. Similarly, the usefulness of non-parametric approaches in electricity markets is largely unknown.

One possible avenue for improving VaR estimates in energy markets lies in extreme value theory (EVT), which specifically models the extreme spot price changes (i.e., the tails of the return distribution).
Focusing on extreme returns rather than the entire distribution seems natural since, by definition, VaR measures the economic impact of rare events.

EVT has already found numerous applications for VaR estimation in financial markets.3 Longin (1996) examines extreme movements in U.S. stock prices and shows that the extreme returns obey a Fréchet fat-tailed distribution. Ho, Burridge, Cadle, and Theobald (2000) and Gençay and Selçuk (2004) apply EVT to emerging stock markets which have been affected by a recent financial crisis. They report that EVT dominates other parametric models in forecasting VaR, especially for more extreme tail quantiles. Gençay, Selçuk, and Ulugülyağci (2003) reach similar conclusions for the Istanbul Stock Exchange Index (ISE-100). Müller, Dacorogna, and Pictet (1998) and Pictet, Dacorogna, and Müller (1998) compare the EVT method with a time-varying GARCH model for foreign exchange rates. Bali (2003) adopts the EVT approach to derive VaR for U.S. Treasury yield changes.

3 Embrechts, Klüppelberg, and Mikosch (1997) and Reiss and Thomas (2001) provide a comprehensive overview of EVT as a risk management tool.

At present, applications of EVT to estimating VaR in energy markets are sparse. Andrews and Thomas (2002) combine historical simulation with a threshold-based EVT model to fit the tails of the empirical profit-and-loss distribution of electricity. They report that the model fits the empirical tails better than the Normal distribution. Rozario (2002) derives VaR for Victorian half-hourly electricity returns using a threshold-based EVT model. While the model performs well for moderate tails covering α = 5% to 1%, it struggles when α is below 1%, a fact Rozario attributes to the model's failure to account for clustering in the data.

It is important to note that EVT relies on an assumption of i.i.d. observations.
Clearly, this is not true for electricity return series, and arguably financial returns in general. One approach to this problem is provided by McNeil and Frey (2000). Using a two-stage approach, McNeil and Frey estimate a GARCH model in stage one with a view to filtering the return series to obtain (nearly) i.i.d. residuals. In stage two, the EVT framework is applied to the standardized residuals. The advantage of this GARCH-EVT combination lies in its ability to capture conditional heteroscedasticity in the data through the GARCH framework, while at the same time modelling the extreme tail behaviour through the EVT method. As such, the GARCH-EVT approach might be regarded as semi-parametric (Manganelli & Engle, 2004). Bali and Neftci (2003) apply the GARCH-EVT model to U.S. short-term interest rates and show that the model yields more accurate estimates of VaR than those obtained from a Student t-distributed GARCH model. Fernandez (2005) and Byström (2004) also find that the GARCH-EVT model performs better than the parametric models in forecasting VaR for various international stock markets. In an energy application, Byström (2005) applies a GARCH-EVT framework to NordPool hourly electricity returns. He finds that the extreme GARCH-filtered residuals obey a Fréchet distribution. Furthermore, the GARCH-EVT model produces more accurate estimates of extreme tails than a pure GARCH model.

The objective of the current paper is to further explore the usefulness of EVT in forecasting VaR in electricity markets. There are several contributions. First, the paper proposes a model that, when combined with EVT, has the potential to generate more accurate quantile estimates for electricity VaR. Based on daily electricity returns, the model accommodates autoregression and weekly seasonals in both the conditional mean and conditional volatility equations. Leverage effects in conditional volatility are also modelled using an Exponential GARCH (EGARCH) specification.
In forecasting VaR, EVT is applied to the standardized residuals from this model.

Clearly, the proposed EGARCH-EVT combination is a sophisticated approach to forecasting VaR. The second contribution, therefore, is to compare the accuracy of VaR forecasts under the proposed model with a number of conventional approaches (both parametric and non-parametric). Tail quantiles are estimated under each competing model and the frequency with which realized returns violate these estimates provides an initial measure of model success.

While the use of violation frequencies is common in assessing quantile estimators for VaR, the utility of such an approach may be limited in electricity applications where the true quantiles are likely to be time varying. For example, a naïve estimator constructed as the quantile of all historical returns will have a perfect violation proportion on average. If, however, the data series exhibit time-varying volatility (and consequently, a time-varying return distribution), the naïve quantile estimator may struggle to differentiate between periods of high volatility and periods of relative tranquility. As such, VaR violations from a naïve quantile estimator may well be clustered in time, possibly during periods of turmoil when VaR forecasts are most crucial. The third contribution of this paper, therefore, is to assess the VaR performance of a number of competing models using formal statistical inference designed to test both unconditional and conditional coverage of the quantile estimators. Based on tests developed by Christoffersen (1998), the findings shed new light on the appropriateness of simple non-parametric approaches to VaR estimation.

Finally, the paper examines five electricity markets, each with defining characteristics. Wolak (1997) notes that electricity price behaviour is affected by how the electricity is generated.
This paper considers markets such as Victoria, where electricity is primarily generated by fossil fuel, and the NordPool, which utilizes hydro generation. Indeed, the findings suggest that the optimal approach to estimating VaR is very likely to be a function of the characteristics of the underlying power market. At the very least, the international comparison allows an assessment of the generality of our findings.

The remainder of the paper is structured as follows. Section 2 describes the competing approaches used to forecast VaR in this paper. A number of common parametric and non-parametric models are included, along with an EVT-based approach designed specifically for electricity applications. Section 3 documents the data employed in the study while Section 4 presents the results. Model estimates are presented in Section 4.1, with particular emphasis given to the implementation of the EVT framework. Section 4.2 documents the relative VaR performance of competing approaches, measured using (unconditional) violation frequencies. Section 4.3 extends the assessment by conducting formal statistical tests of both unconditional and conditional coverage of the various quantile estimators. Section 5 concludes the study.

2. Methods for estimating value-at-risk

This section presents the various approaches to calculating VaR examined in this paper. Section 2.1 describes a simple non-parametric approach based on the historical distribution of returns. Section 2.2 outlines four parametric approaches based on an autoregressive model for returns. Our proposed model, termed AR-EGARCH-EVT, is detailed in Section 2.3.

2.1. Historical simulation approach

Arguably, the most popular method of estimating VaR is to utilise the empirical distribution of past returns on the asset of interest.
If, for example, one requires the VaR for one day with α = 5% (that is, a 95% confidence level), one takes the 95% quantile of the most recent T observed daily returns. VaR for longer horizons (for example, τ days) can be similarly obtained using the most recent sample of non-overlapping τ-day returns.4 Known as the Historical Simulation (HS) approach, this simple method is non-parametric in that it makes no arbitrary assumptions about the true distribution of returns. Of course, it does assume that the past distribution is representative of likely future returns. In this paper, the HS approach serves as a naïve benchmark against which more sophisticated approaches are judged.

4 Alternatively, τ daily returns can be bootstrapped from the empirical distribution and aggregated.

2.2. Parametric approaches

This paper considers four parametric approaches to measuring VaR. First, we consider an autoregressive (AR) model of returns with constant variance (hereafter denoted AR-ConVar). Since the data series are sampled daily, an AR(7) model is proposed to capture any weekly seasonality in electricity prices:

$$ r_t = \phi_0 + \sum_{j=1}^{7} \phi_j r_{t-j} + \varepsilon_t, \qquad (1) $$

where $r_t = (S_t - S_{t-1})/S_{t-1}$ is the simple electricity return and $S_t$ is the daily spot price. A distributional assumption is made for the error term in the AR-ConVar model; specifically, the errors $\varepsilon_t$ are assumed to be Normally distributed with zero mean and constant variance ($E(\varepsilon_t^2 \mid \Omega_{t-1}) = \sigma^2$). At any time t, the VaR estimate from the AR-ConVar model is:

$$ \mathrm{VaR}_{q,t} = \hat{\phi}_0 + \sum_{j=1}^{7} \hat{\phi}_j r_{t-j} + F^{-1}(q)\,\hat{\sigma}, \qquad (2) $$

where $\hat{\phi}_j$ ($j = 0, 1, \ldots, 7$) and $\hat{\sigma}$ are parameter estimates and $F^{-1}(q)$ is the q% quantile of the Normal distribution function at an α% tail (i.e., q = 1 − α).

The second parametric approach, a minor variation of the AR-ConVar model, combines the key features of the autoregressive model and the historical simulation approach.5 The conditional mean is again modelled using an AR(7) model.
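As a rough illustration of Section 2.1 and Eq. (2), the sketch below computes a historical-simulation VaR as an empirical quantile and an AR-ConVar VaR from given AR(7) coefficients. It is a minimal sketch rather than the authors' implementation: the function names, the linear-interpolation quantile rule and the argument layout are assumptions made here for illustration only.

```python
from statistics import NormalDist


def empirical_quantile(values, q):
    """Empirical q-quantile with linear interpolation between order statistics
    (an assumed convention; the paper does not specify one)."""
    data = sorted(values)
    pos = (len(data) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])


def hs_var(returns, q=0.95, window=None):
    """Historical simulation: q-quantile of the most recent `window` returns."""
    sample = returns if window is None else returns[-window:]
    return empirical_quantile(sample, q)


def ar_convar_var(returns, phi, sigma, q=0.95):
    """Eq. (2): AR(7) conditional mean plus the Normal q-quantile times sigma.

    returns : list of past returns, oldest first (at least 7 observations)
    phi     : [phi_0, phi_1, ..., phi_7]
    sigma   : constant standard deviation of the AR errors
    """
    lags = returns[-1:-8:-1]  # r_{t-1}, r_{t-2}, ..., r_{t-7}
    mean = phi[0] + sum(p * r for p, r in zip(phi[1:], lags))
    return mean + NormalDist().inv_cdf(q) * sigma
```

Both functions return the right-tail quantile, matching the paper's focus on extreme positive returns; the coefficients passed to `ar_convar_var` would come from a separate estimation step that is not shown here.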
However, rather than making a distributional assumption over $F^{-1}(q)$, the q% quantile for VaR is obtained by bootstrapping the empirical distribution of residuals from the fitted AR(7). Denoted AR-HS, this approach is motivated by the likelihood that a historical simulation from the empirical distribution of returns is inappropriate in electricity VaR applications. Unlike financial return series, which have near constant mean, electricity returns have significant autocorrelation in their conditional mean. The traditional HS approach cannot capture these intertemporal characteristics. In contrast, the proposed AR-HS method accommodates the time-series properties through the autoregressive mean, while retaining the distribution-free flavour by the use of bootstrapping. As such, the AR-HS approach represents a more sensible implementation of the historical-simulation concept for calculating VaR.6

5 While we label this approach 'parametric', it is arguably a hybrid semi-parametric approach.
6 We are grateful to an anonymous referee for suggesting this approach.

Our third parametric approach specifically models the serial correlation of both the conditional mean and conditional volatility of returns. The mean return is again modelled using an AR(7) but, rather than constant variance, the error term is assumed to follow an EGARCH process:

$$ E(\varepsilon_t^2 \mid \Omega_{t-1}) = h_t, \quad \text{and} $$

$$ \ln(h_t) = \beta_1 + \beta_2 \frac{\varepsilon_{t-1}}{\sqrt{h_{t-1}}} + \beta_3 \ln(h_{t-1}) + \beta_4 \left[ \frac{|\varepsilon_{t-1}|}{\sqrt{h_{t-1}}} - E\!\left( \frac{|\varepsilon_{t-1}|}{\sqrt{h_{t-1}}} \right) \right] + \beta_5 \frac{\varepsilon_{t-7}}{\sqrt{h_{t-7}}} + \beta_6 \ln(h_{t-7}). \qquad (3) $$

The adoption of an EGARCH formulation for volatility is motivated by Knittel and Roberts (2001), who argue that the convex nature of the marginal costs of electricity generation causes positive demand shocks to have a larger impact on price changes than negative shocks.
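The recursion in Eq. (3) can be sketched in a few lines. This is an illustrative sketch only, under several assumptions that are not taken from the paper: the β4 term is taken to enter through |z| − E|z| (the usual EGARCH magnitude term), E|z| defaults to its standard Normal value, the first seven variances are set to a start-up value `h0`, and the function name and argument layout are hypothetical.

```python
import math


def egarch_variance_path(eps, beta, h0, e_abs_z=math.sqrt(2.0 / math.pi)):
    """Conditional variances from an Eq. (3)-style EGARCH recursion.

    eps     : list of mean-equation residuals, oldest first
    beta    : (b1, b2, b3, b4, b5, b6)
    h0      : start-up variance used for the first seven periods (assumption)
    e_abs_z : E|z| of the standardized error; sqrt(2/pi) is the Normal value
    Returns a list whose last entry is the one-step-ahead variance forecast.
    """
    b1, b2, b3, b4, b5, b6 = beta
    h = [h0] * 7
    for t in range(7, len(eps) + 1):
        z1 = eps[t - 1] / math.sqrt(h[t - 1])  # standardized shock, daily lag
        z7 = eps[t - 7] / math.sqrt(h[t - 7])  # standardized shock, weekly lag
        log_h = (b1 + b2 * z1 + b3 * math.log(h[t - 1])
                 + b4 * (abs(z1) - e_abs_z)
                 + b5 * z7 + b6 * math.log(h[t - 7]))
        h.append(math.exp(log_h))  # exponentiating keeps h_t positive
    return h
```

Working in logs means no sign restrictions are needed on the coefficients to keep the variance positive. For a t-distributed error, `e_abs_z` would need to be replaced by the corresponding expectation for the fitted degrees of freedom.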
That is, positive price shocks are conjectured to increase volatility more than negative shocks, thus inducing a positive leverage effect.7 In addition to capturing asymmetries ($\beta_4$), Eq. (3) also accommodates weekly seasonality in conditional volatility ($\beta_5$ and $\beta_6$). The corresponding VaR measure is calculated in a similar fashion to Eq. (2), with the parameter estimate $\hat{\sigma}$ replaced by $\sqrt{\hat{h}_t}$ from Eq. (3).

Since a number of distributional assumptions are common in the GARCH literature, this study imposes two distributions over the $\varepsilon_t$ error term: the Normal distribution and the fat-tailed t-distribution with ν degrees of freedom. The former model is termed AR-EGARCH-N, where the tail quantile $F^{-1}(q)$ in its VaR model is taken from the Normal distribution; the latter is termed AR-EGARCH-t, where the $F^{-1}(q)$ quantile in its VaR model is taken from the $t_\nu$ distribution.

7 We are grateful to an anonymous referee for suggesting this motivation for employing an EGARCH model.

2.3. The AR-EGARCH-EVT method

Following Byström (2005), this study adopts the EVT approach of McNeil and Frey (2000) to measure VaR for electricity returns. McNeil and Frey recognize that most financial return series exhibit stochastic volatility and fat-tailed distributions. While the fat tails might be modelled directly with EVT, the lack of i.i.d. returns is problematic. McNeil and Frey's solution is to first model the conditional volatility using a GARCH approach. The GARCH model serves to filter the return series such that GARCH residuals are closer to i.i.d. than the raw return series. Even so, GARCH residuals have been shown to exhibit fat tails. In stage two, McNeil and Frey apply EVT to the GARCH residuals. As such, the GARCH-EVT combination accommodates both time-varying volatility and fat-tailed return distributions. We denote this approach by AR-EGARCH-EVT.

The AR-EGARCH-EVT approach is implemented as follows:

1. The AR-EGARCH model with a $t_\nu$-distribution governing the $\varepsilon_t$ term (as described in Section 2.2) is estimated from electricity returns. Maximum likelihood estimation is employed over an in-sample period (described shortly).
2. The residuals from the AR-EGARCH model are standardized:

$$ \hat{z}_t = \frac{r_t - \left( \hat{\phi}_0 + \sum_{j=1}^{7} \hat{\phi}_j r_{t-j} \right)}{\sqrt{\hat{h}_t}}, \qquad t = 1, \ldots, T, \qquad (4) $$

where T is the number of return observations during the in-sample estimation period.
3. EVT is applied to the standardized residuals $\hat{z}_t$ to model the tail quantile $F^{-1}(q)$ in deriving VaR.

In applying the EVT method, this paper adopts the Peak Over Threshold (POT) method.8 The POT method identifies extreme observations (that is, extreme standardized residuals) that exceed a high threshold u and specifically models these 'exceedances' separately from non-extreme observations.

8 For details on the POT EVT method, refer to McNeil and Frey (2000) and Embrechts et al. (1997).

Assume that the standardized residuals $z_t$ are a sequence of i.i.d. random variables from an unknown distribution function $F_z$. Let u denote a high threshold beyond which observations of z are considered exceedances (the choice of the threshold u is discussed shortly). The magnitude of the exceedance is given by $y_i = z_i - u$, for $i = 1, \ldots, N_y$, where $N_y$ is the total number of exceedances in the sample. The distribution of y, for a given threshold u, is given by:

$$ F_u(y) = \Pr(z - u \le y \mid z > u) = \frac{\Pr(z - u \le y,\ z > u)}{\Pr(z > u)} = \frac{F_z(y + u) - F_z(u)}{1 - F_z(u)}. \qquad (5) $$

That is, $F_u(y)$ is the probability that z exceeds the threshold u by an amount no greater than y, given that z exceeds u. Since z = y + u, re-arrange Eq. (5) to obtain:

$$ F_z(z) = [1 - F_z(u)]\,F_u(y) + F_z(u), \qquad z > u. \qquad (6) $$

Balkema and de Haan (1974) and Pickands (1975) show that, for a sufficiently high u, $F_u(y)$ can be approximated by the Generalized Pareto Distribution (GPD), which is defined as:

$$ G_{\xi,\nu}(y) = \begin{cases} 1 - \left( 1 + \dfrac{\xi y}{\nu} \right)^{-1/\xi} & \text{if } \xi \ne 0 \\ 1 - \exp(-y/\nu) & \text{if } \xi = 0, \end{cases} \qquad (7) $$

where ξ and ν > 0 are shape and scale parameters respectively. Note that the GPD subsumes various distributions. A value of ξ > 0 corresponds to heavy-tailed distributions such as Pareto, Student t, Cauchy, loggamma and Fréchet, whose tails decay like power functions. A value of ξ = 0 corresponds to thin-tailed distributions such as Gumbel, normal, exponential, gamma and lognormal, whose tails decay exponentially. A value of ξ < 0 corresponds to distributions with a finite endpoint, such as the uniform and beta distributions. In most financial applications, the data exhibit heavy tails, suggesting ξ > 0.

Given that $F_u(y)$ can be approximated by Eq. (7), and noting that $F_z(u)$ is determined by $(T - N_y)/T$, Eq. (6) simplifies to:

$$ F_z(z) = 1 - \frac{N_y}{T} \left( 1 + \frac{\xi (z - u)}{\nu} \right)^{-1/\xi}. \qquad (8) $$

This tail estimator can be inverted for the purpose of calculating VaR:

$$ F_z^{-1}(q) = u + \frac{\nu}{\xi} \left[ \left( \frac{T}{N_y}\,\alpha \right)^{-\xi} - 1 \right]. \qquad (9) $$

Eq. (9), together with the conditional mean (Eq. (1)) and conditional variance (Eq. (3)), is termed AR-EGARCH-EVT. The VaR measure is defined as:

$$ \mathrm{VaR}_{q,t} = \hat{\phi}_0 + \sum_{j=1}^{7} \hat{\phi}_j r_{t-j} + F_z^{-1}(q) \sqrt{\hat{h}_t}. \qquad (10) $$

Finally, a reasonable threshold u must be chosen to implement the POT method. Ideally, u should be set sufficiently high so that the POT asymptotic theory applies. However, if u is set too high, there will be too few exceedances from which to estimate the parameters of the GPD. This paper follows the approach of Gençay and Selçuk (2004), who determine a reasonable threshold u using a combination of two popular techniques: the mean excess function (MEF) and the Hill plots (Hill, 1975). Further details are provided in Section 4.1's discussion of results.

3. Data description

The data examined in this paper are daily aggregated electricity spot prices from five international power markets: Victoria (Australia), NordPool (Scandinavia), Alberta (Canada), Hayward (New Zealand) and PJM (US). Table 1 reports descriptive statistics for daily (simple) returns in all markets. Statistics are shown for the full sample, as well as the in-sample period used subsequently in model estimation. Full sample sizes range from 2093 days (PJM) to 2553 days (NordPool and Alberta). While the mean daily returns are quite large, the median returns are close to zero.9 The high volatility of electricity returns is evident in the standard deviation of daily returns. Similarly, the positive skewness and high kurtosis clearly illustrate the non-normality of the distribution. Ljung–Box Q and Q² statistics indicate the presence of serial correlation at up to 7 lags, as well as potential time-varying volatility. These findings lend credence to the adoption of the AR(7) and EGARCH models discussed in Section 2.

9 The non-zero mean return is directly attributable to the nature of electricity returns. Extreme positive returns (sometimes exceeding several hundred percent) occur semi-regularly. In contrast, the minimum return is bounded from below at −100%. As emphasized by Byström (2005), this feature results in severe positive skewness and non-zero mean returns, yet causes no major concerns as we study the right tail of the distribution.

Table 1
Descriptive statistics

                Victoria     NordPool     Alberta      Hayward      PJM
Full sample
Start date      4 Jan 99     5 Jan 98     5 Jan 98     5 Jan 98     5 Apr 98
End date        31 Dec 04    31 Dec 04    31 Dec 04    31 Dec 03    31 Dec 03
No. of obs      2189         2553         2553         2184         2093
Mean            0.091        0.008        0.098        0.076        0.068
Median          0.014        0.003        0.011        0.005        0.015
Std. dev.       1.100        0.141        0.575        0.775        0.560
Min             -0.959       -0.558       -0.898       -0.963       -0.926
Max             44.22        2.496        6.307        21.84        15.24
Skewness        30.52        3.742        4.007        17.01        12.91
Kurtosis        1191         50           31           398          301
Q(7)            196.4***     236.7***     130.4***     141.7***     329.2***
Q²(7)           46.64***     111.4***     67.16***     59.86***     58.92***

In-sample
Start date      4 Jan 99     5 Jan 98     5 Jan 98     5 Jan 98     5 Apr 98
End date        31 Dec 02    31 Dec 02    31 Dec 02    30 Apr 02    30 Apr 02
No. of obs      1459         1823         1823         1584         1493
Mean            0.104        0.012        0.094        0.062        0.052
Median          0.017        0.004        0.015        0.007        0.008
Std. dev.       1.301        0.161        0.574        0.439        0.360
Min             -0.959       -0.557       -0.897       -0.963       -0.925
Max             44.22        2.496        6.306        21.84        15.24
Skewness        27.42        3.358        4.379        15.02        13.54
Kurtosis        911          40           35           304          292
Q(7)            146.3***     165.9***     113.0***     99.22***     217.2***
Q²(7)           38.25***     74.66***     73.83***     36.45***     36.18***

The table reports summary statistics for the daily simple net returns ($r_t$) of five international power markets: Victoria, NordPool, Alberta, Hayward and PJM. The Ljung–Box Q(7) and Q²(7) statistics test for serial correlation up to 7 lags for $r_t$ and $r_t^2$, respectively. *** Indicates significance at the 1% level.

Fig. 1 graphs spot prices, returns and QQ plots for each power market. Together with Table 1, Fig. 1 demonstrates the defining characteristics of electricity markets: high volatility, occasional extreme movements, volatility clustering and fat-tailed distributions. These descriptive statistics and plots further motivate the exploration of the alternative approaches to measuring VaR described in Section 2.

Fig. 1. Spot prices, returns and QQ plots. The figure shows summary plots for daily electricity data from five international power markets: Victoria, NordPool, Alberta, Hayward and PJM. The left, middle, and right columns display electricity prices, returns and QQ plots for daily returns respectively.

4. Empirical results

This section presents the empirical findings of the study. Section 4.1 reports the in-sample parameter estimates for all models proposed in Section 2. In Section 4.2, an initial assessment is made of the accuracy with which each model forecasts VaR, with the observed violation frequencies compared to tail quantiles from each model. Section 4.3 examines VaR performance further by conducting statistical tests of both the unconditional and conditional interval coverage of each approach.

4.1. Model estimates

4.1.1. AR-ConVar model

Table 2 presents the ML estimates of the AR-ConVar model (Eq. (1)). For each data series, the period of estimation is the in-sample period indicated in Table 1. The findings are quite similar across the various power markets. Consider, for example, the PJM market. The time-series properties of the return series are evident, with statistically significant autocorrelations at all seven lags.10 The first six lags exhibit negative autocorrelation, while a day-of-the-week effect is confirmed by the positive estimate at lag 7. These results support the deployment of an AR(7) model for the return series. Taken together, the AR estimates imply a long-term mean return of $\hat{\phi}_0 / (1 - \sum_{j=1}^{7} \hat{\phi}_j) = 0.052$, which closely matches the unconditional mean for PJM reported in Table 1. Similarly, the volatility estimate relating to AR errors ($\hat{\sigma}$ = 0.332) approximates the unconditional in-sample standard deviation.

10 The vast majority of parameter estimates in Tables 2 and 3 are significant at the 1% level and their p-values are not shown. p-values are only explicitly shown when they are greater than 1%.

Table 2
Parameter estimates for the AR-ConVar model

        Victoria         NordPool         Alberta          Hayward          PJM
φ0      0.079            0.018            0.109            0.077            0.092
φ1      0.119            0.026 (0.278)    0.144            0.077            0.185
φ2      0.101            0.113            0.127            0.158            0.254
φ3      0.126            0.023 (0.341)    0.009 (0.715)    0.071            0.167
φ4      0.085            0.164            0.051 (0.036)    0.044 (0.109)    0.102
φ5      0.094            0.154            0.018 (0.050)    0.036 (0.180)    0.138
φ6      0.043 (0.015)    0.119            0.001 (0.970)    0.013 (0.611)    0.106
φ7      0.222            0.122            0.168            0.167            0.184
σ       0.422            0.153            0.555            0.424            0.332
R²      0.1015           0.0897           0.0639           0.0638           0.1526

The table reports maximum-likelihood estimates of the AR-ConVar model (Eq. (1)). For each data series, parameter estimates are based on the in-sample period documented in Table 1. The majority of parameter estimates are statistically significant at better than the 1% level and their p-value is not shown. p-values are shown in parentheses only when not significant at the 1% level.

4.1.2. AR-EGARCH model

Table 3 presents the ML estimates of the AR-EGARCH-t model.11 The mean and conditional volatility equations are given by Eqs. (1) and (3), respectively, with a t-distribution governing the errors. Estimates from the autoregressive mean equation have changed little from Table 2. The estimates from the conditional volatility equation are of particular interest.
The general tenor of the findings is as follows. There is strong evidence of a first-order GARCH effect ($\beta_2$ and $\beta_3$) in all markets except PJM. In addition, there appears to be an asymmetric leverage effect ($\beta_4$).12 Parameter estimates ($\beta_5$ and $\beta_6$) also suggest a weekly seasonal effect in the conditional variance. Finally, ML estimates of the parameter ν suggest that returns have a tail fatter than that implied by a Normal distribution. In summary, the findings support the use of the AR-EGARCH-t model as specified in Eqs. (1) and (3).

11 ML estimates for the AR-EGARCH-N model are qualitatively similar and, to preserve space, are not reported. However, the VaR performance of the AR-EGARCH-N model is reported in subsequent analysis.
12 Estimates of the leverage effect are comparable to those reported by Knittel and Roberts (2001) and Duffie, Gray, and Hoang (1998), where $\hat{\beta}_4$ is positive and statistically significant.

Table 3
Parameter estimates for the AR-EGARCH model

        Victoria         NordPool         Alberta          Hayward          PJM
Estimates from the AR(7) mean (Eq. (1))
φ0      0.001 (0.907)    0.002 (0.153)    0.011 (0.092)    0.007 (0.129)    0.043
φ1      0.156            0.000 (0.990)    0.201            0.077            0.211
φ2      0.140            0.087            0.195            0.063            0.236
φ3      0.140            0.100            0.113            0.067            0.191
φ4      0.087            0.143            0.093            0.044            0.147
φ5      0.114            0.163            0.093            0.046            0.189
φ6      0.080            0.049            0.040 (0.020)    0.001 (0.970)    0.151
φ7      0.170            0.231            0.064            0.098            0.122
R²      0.1414           0.1414           0.1492           0.0596           0.1718

Estimates from the EGARCH conditional variance (Eq. (3))
β1      0.225 (0.051)    0.046 (0.338)    0.560            0.747            0.375
β2      0.282            0.366            0.395            0.986            0.043 (0.416)
β3      0.057 (0.074)    0.377            0.835            0.879            0.039 (0.340)
β4      0.520            0.703            0.964            1.412            0.181
β5      0.223            0.208            0.110            0.023 (0.794)    0.232
β6      0.771            0.597            0.104            0.073 (0.047)    0.808
ν       2.500            3.689            2.069            2.119            4.167

The table reports maximum-likelihood estimates of the AR-EGARCH model (Eqs. (1) and (3)), with a $t_\nu$-distribution governing the error terms. For each data series, parameter estimates are based on the in-sample period documented in Table 1. The majority of parameter estimates are statistically significant at better than the 1% level and their p-value is not shown. p-values are shown in parentheses only when not significant at the 1% level.

Table 4
Summary statistics for AR-EGARCH residuals

             Victoria     NordPool     Alberta      Hayward      PJM
Panel A: raw AR-EGARCH residuals
Median       0.014        0.001        0.013        0.015        0.004
Mean         0.091        0.013        0.146        0.081        0.061
Std. dev.    0.423        0.155        0.563        0.427        0.333
Skewness     4.325        3.197        4.684        3.487        1.956
Kurtosis     28.29        46.20        36.92        21.67        10.18
Q(7)         7.189        114.6***     75.40***     21.79***     8.521
Q²(7)        27.29***     88.63***     103.0***     33.36***     33.66***

Panel B: standardized AR-EGARCH residuals
Median       0.052        0.011        0.048        0.065        0.009
Mean         0.354        0.078        0.444        0.212        0.138
Std. dev.    1.755        0.961        1.766        1.448        0.787
Skewness     4.880        4.281        4.095        3.230        2.348
Kurtosis     37.81        68.65        31.72        33.39        14.02
Q(7)         12.53*       34.00***     11.14        20.72***     8.860
Q²(7)        9.43         6.19         2.44         4.03         12.77*

The table reports summary statistics for the (in-sample) residuals from the AR-EGARCH model, with a $t_\nu$-distribution governing the error terms. Panels A and B report diagnostics for the 'raw' and standardized residuals respectively. The latter are the basis of the EVT estimation. * and *** indicate that the Ljung–Box Q and Q² statistics are significant at the 10% and 1% levels, respectively.

4.1.3. AR-EGARCH-EVT model

Since EVT relies on the assumption of i.i.d. observations, Section 2.3 described a two-stage process designed to achieve (near) i.i.d. time-series. First, the AR-EGARCH model is fitted and residuals are standardized in an attempt to satisfy the i.i.d. assumption. Second, EVT is applied to the standardized residuals. Table 4 presents diagnostics for the 'raw' and standardized AR-EGARCH residuals.
The Ljung–Box Q and Q² statistics provide an indication of whether any serial correlation or heteroscedasticity is present in the data series. Panel A strongly suggests that the raw AR-EGARCH residuals are not i.i.d. as required by EVT. In contrast, the standardized residuals in Panel B, whilst not perfectly i.i.d., are better behaved. To a large extent, the filtering procedure advocated by McNeil and Frey (2000) has been effective in producing (near) i.i.d. residuals on which EVT can be implemented.

Table 4 Panel B does, however, show that skewness and excess kurtosis remain in the standardized residuals. Similarly, QQ plots (not presented) document heavy right tails. These findings motivate the second stage of McNeil and Frey’s (2000) EVT implementation, where the fat tails of the standardized residuals are explicitly modelled.

To apply EVT, the threshold $u$ is selected using mean excess functions (MEF) and Hill plots.13 Table 5 reports the threshold chosen in each market. In each case, the resulting exceedences $N_u$ total approximately 10% of the sample, which is consistent with percentages reported by McNeil and Frey (2000). Table 5 also reports ML estimates of the shape ($\xi$) and scale ($\nu$) parameters, determined by fitting the GPD (Eq. (7)) to the standardized residuals. Recall that values of $\xi > 0$ reflect heavy-tailed distributions. In each power market, the $\xi$ estimate is positive and statistically significantly different from zero, suggesting that the right tail of the distribution of standardized residuals is characterized by the Fréchet distribution.14

Table 5 Panel B further documents the heavy tails of the distribution by comparing the EVT tail quantiles to those from a Normal distribution. EVT tail quantiles $F_z^{-1}(q)$ are obtained from Eq. (9) using the Panel A values of $T$, $u$, $N_u$, $\xi$ and $\nu$ at the specified tail probability $\alpha$%. In general, the tail quantiles from the AR-EGARCH-EVT model are higher than those under a Normal distribution. The fatness of the tail is readily apparent, especially as we move to more extreme quantiles (i.e., as $\alpha$ moves towards 0.5%). Indeed, Gençay and Selçuk (2004) warn that using quantile estimates from a Normal distribution when the data are in fact fat-tailed will cause VaR to be underestimated.

13 Our approach to selecting $u$ follows Gençay and Selçuk (2004) closely and we do not present the MEFs and Hill plots here.
14 In a study of NordPool hourly prices, Byström (2005) also finds that a Fréchet distribution applies to the tail of the distribution of standardized residuals.

Table 5
Parameter estimates for the AR-EGARCH-EVT model

Panel A                                     Victoria    NordPool    Alberta     Hayward     PJM
Total in-sample obs.          $T$           1459        1823        1823        1584        1493
EVT threshold                 $u$           1.5         1.0         2.0         1.5         1.0
Number of exceedences         $N_u$         151         185         167         187         149
% of exceedences in-sample    $N_u/T$       10.35       10.15       9.16        11.81       9.98
GPD shape parameter           $\xi$         0.552***    0.305***    0.332***    0.305***    0.344***
GPD scale parameter           $\nu$         1.208***    1.631***    1.034***    0.570***    0.569***

Panel B       Normal      Victoria    NordPool    Alberta     Hayward     PJM
q = 95%       1.645       2.582       1.459       3.289       2.375       1.444
q = 99%       2.326       7.265       2.935       7.498       5.194       2.996
q = 99.5%     2.576       10.97       3.832       10.05       6.956       3.978

Panel A reports in-sample ML estimates of the GPD distribution for the AR-EGARCH-EVT model. *** denotes significance at the 1% level. Panel B presents EVT tail quantiles $F_z^{-1}(q)$ for the standardized residuals, along with tail quantiles from a Normal distribution.

4.2. Relative VaR performance of competing models

The primary goal of this paper is to assess the relative ability of a number of alternative approaches to accurately measure VaR in electricity markets. To do this, the full data sample is divided into an in-sample period (on which Section 4.1’s model estimates are based) and an out-of-sample period over which VaR performance is measured. Measurement of VaR proceeds as follows.
On the first day of the out-of-sample period, the most-recent $T$ returns are used to estimate model parameters for each parametric approach. The magnitude of $T$ is set to be equal to the length of the in-sample period. That is, $T$ = 1459 in Victoria, $T$ = 1823 in NordPool, and so on. From the parameter estimates, the next-day VaR is estimated using each method described in Section 2. Should the realized next-day return exceed the estimated VaR, this is labelled a ‘violation’.15 Moving to time $t + 1$, the estimation procedure is rolled forward one day and repeated. Note that the size of the estimation window $T$ is kept constant and simply rolled forward one day at a time, thus ensuring that model estimates are not based on stale data.16 The procedure differs slightly for the non-parametric (HS) and semi-parametric (AR-HS) approaches.

15 Berkowitz (1999), Ho et al. (2000), Bali and Neftci (2003), Gençay and Selçuk (2004), Byström (2004, 2005) and Fernandez (2005) adopt a similar procedure.
16 Indeed, in relation to the EVT approach, plots (not reported) show that rolling estimates of $\xi$ and $\nu$ are clearly time varying. This reinforces the necessity for using a rolling estimation window.
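To make the EVT step of the VaR calculation concrete, the following Python sketch evaluates a GPD-based tail quantile from the in-sample estimates listed in Table 5 Panel A. Eq. (9) is not restated in this section, so the sketch assumes it takes the standard peaks-over-threshold form used by McNeil and Frey (2000): the threshold plus (scale/shape) times [((1 − q)T/N)^(−shape) − 1], where N is the number of exceedences. The function name `evt_tail_quantile` and its argument names are illustrative only. Under that assumption, the Victoria and PJM inputs give values close to the corresponding Panel B entries.

```python
import math


def evt_tail_quantile(q, T, u, n_exceed, shape, scale):
    """Tail quantile of standardized residuals under a GPD fitted above u.

    Assumed peaks-over-threshold form (not copied from the paper's Eq. (9)):
        u + (scale / shape) * (((1 - q) * T / n_exceed) ** (-shape) - 1)
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    tail_ratio = (1.0 - q) * T / n_exceed
    if shape == 0.0:
        # Limiting case of the GPD as the shape parameter tends to zero.
        return u - scale * math.log(tail_ratio)
    return u + (scale / shape) * (tail_ratio ** (-shape) - 1.0)


# In-sample estimates as listed in Table 5 Panel A.
victoria = dict(T=1459, u=1.5, n_exceed=151, shape=0.552, scale=1.208)
pjm = dict(T=1493, u=1.0, n_exceed=149, shape=0.344, scale=0.569)

for market, params in (("Victoria", victoria), ("PJM", pjm)):
    for q in (0.95, 0.99):
        print(market, q, round(evt_tail_quantile(q, **params), 3))
```

The quantile applies to the standardized residuals. Mapping it to a return-level VaR additionally requires the conditional mean and volatility forecasts from the AR-EGARCH stage, which are not reproduced here.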
Table 6
Out-of-sample VaR violations

                Victoria    NordPool    Alberta     Hayward     PJM
$\alpha$ = 5%
HS              4.25 (3)    4.25 (1)    4.93 (1)    4.00 (2)    6.50 (2)
AR-ConVar       0.68 (6)    0.41 (5)    4.38 (4)    0.33 (6)    3.62 (1)
AR-HS           3.84 (4)    0.41 (5)    5.89 (6)    1.50 (5)    7.77 (5)
AR-EGARCH-N     6.99 (5)    3.70 (2)    4.65 (3)    4.83 (1)    7.83 (6)
AR-EGARCH-t     4.93 (1)    1.00 (4)    4.79 (2)    2.50 (4)    3.50 (2)
AR-EGARCH-EVT   4.65 (2)    2.19 (3)    4.11 (5)    3.50 (3)    6.67 (4)
$\alpha$ = 1%
HS              0.82 (2)    0.96 (1)    0.82 (1)    0.33 (2)    2.00 (5)
AR-ConVar       0.27 (5)    0.27 (3)    2.60 (5)    0 (4)       1.33 (2)
AR-HS           1.10 (1)    0 (4)       0.55 (4)    0 (4)       1.17 (1)
AR-EGARCH-N     3.56 (6)    1.23 (2)    3.01 (6)    2.17 (6)    4.83 (6)
AR-EGARCH-t     1.51 (4)    0 (4)       0.68 (3)    0.30 (3)    0.67 (2)
AR-EGARCH-EVT   0.69 (3)    0 (4)       0.69 (2)    0.5 (1)     1.50 (4)
$\alpha$ = 0.5%
HS              0.41 (2)    0.69 (1)    0.55 (1)    0 (3)       1.00 (3)
AR-ConVar       0.27 (4)    0.27 (2)    2.60 (6)    0 (3)       1.17 (5)
AR-HS           0.14 (5)    0 (4)       0.27 (4)    0 (3)       0 (3)
AR-EGARCH-N     2.88 (6)    0.96 (3)    2.46 (5)    1.83 (6)    4.00 (6)
AR-EGARCH-t     0.68 (3)    0 (4)       0.14 (2)    0.17 (2)    0.17 (2)
AR-EGARCH-EVT   0.54 (1)    0 (4)       0.14 (2)    0.33 (1)    0.67 (1)

The table details the out-of-sample VaR violations for all competing models. A violation occurs if the realized empirical return exceeds the predicted VaR on a particular day. The numbers in parentheses denote the ranking among the competing models for each quantile at $\alpha$ = 5%, 1% and 0.5%. All actual and expected violations are in percentage terms.

Under the AR-HS approach, parameters of the autoregressive model are again estimated using a rolling window of the most recent $T$ observations (this ensures comparability with the plain-vanilla AR-ConVar approach). However, tail quantiles are constructed by bootstrapping from the 500 most-recent AR errors. Similarly, the naïve HS approach simply bootstraps from the 500 most-recent raw returns.17 Table 6 documents the out-of-sample violation ratios under each model for a range of quantiles.
For the 95% quantile ($\alpha$ = 5%), five violations are expected every 100 days. Each model is evaluated by comparing the actual and expected violation ratios and competing models are ranked accordingly (rankings are shown in parentheses). Consider, for example, the Victorian market with $\alpha$ = 5%. The AR-EGARCH-t and AR-EGARCH-EVT models forecast right-tail quantiles most accurately. The AR-EGARCH-N model (that is, the autoregressive model with Normally distributed errors) significantly underestimates the 95% quantile, resulting in an excessive number of violations; this is to be expected when the actual returns have heavier tails than assumed under a Normal distribution. Curiously, the AR-ConVar model (which also assumes Normal errors) overestimates the 95% quantile, while the naïve HS approach does surprisingly well. Moving to the 99% quantile ($\alpha$ = 1%), there is little consistency in model rankings. The AR-HS and HS approaches provide the most-accurate VaR forecasts, while the AR-EGARCH-t and AR-EGARCH-EVT models underestimate and overestimate the 99% quantile respectively. For the extreme quantile ($\alpha$ = 0.5%), rankings approximate those reported for $\alpha$ = 5%.

17 Manganelli and Engle (2004) note that it is common to utilize a rolling window of between 6 and 24 months (i.e., between 180 and 730 observations) for HS approaches.

Examining the other power markets, little consistency in model performance is evident. The HS approach has superior performance for NordPool and Alberta, irrespective of the quantile. The AR-EGARCH-EVT model performs very well for Hayward and PJM. While these inconsistent rankings are not particularly encouraging for risk managers interested in forecasting VaR, a careful examination of the violation ratios in conjunction with the Table 1 summary statistics is revealing. Consider the Victorian, Hayward and PJM markets, where the more sophisticated models like AR-EGARCH-EVT and AR-EGARCH-t perform well. The summary statistics for these markets reveal that electricity returns are characterized by extremely high levels of skewness and kurtosis, high variance and an extreme range. Under such conditions, a sophisticated model like EVT which explicitly models the tails of the return distribution is better equipped to produce accurate VaR forecasts. In contrast, the NordPool and Alberta summary statistics are notably different: the skewness and kurtosis statistics are an order of magnitude lower than the other markets and the range of returns is considerably narrower. The relative advantage of a more sophisticated VaR model is diminished in such conditions and simpler models may suffice.18

Fig. 2. Time-varying VaR forecasts and violations. The plot depicts the VaR forecasts from the HS (smooth, heavy red line) and AR-EGARCH-EVT model (dashed, green line) for the Alberta market during the out-of-sample period ($\alpha$ = 5%). Daily returns are shown with the thin grey line. Violations under the HS and AR-EGARCH-EVT models are displayed with triangles and circles respectively.

18 It is unclear why the distribution of electricity returns is so different in these two markets. Arguably, differences might be expected for NordPool where electricity is hydro-generated, yet Alberta features traditional coal-fired generation.

In summary, the results in Table 6 extend the findings of Byström (2005). Working with hourly NordPool returns, Byström (2005) reports that VaR performance under a GARCH-EVT framework is superior to a number of competing parametric approaches.
The current findings (based on daily returns) also show that the AR-EGARCH-EVT model performs well, especially in markets where the distributions of returns exhibit extreme moments. However, the naïve HS approach (not examined by Byström) is also shown to perform well, particularly in markets where the return distribution does not display extreme skewness and kurtosis.

While the HS approach performs surprisingly well in several energy markets, risk managers may nonetheless benefit from adopting the AR-EGARCH-EVT model. A parametric model that captures the time-series properties of both the mean and volatility of returns, as well as explicitly modelling the tails of the distribution, may offer advantages during periods of market turmoil. To illustrate, consider Fig. 2, which depicts the VaR performance of the HS and AR-EGARCH-EVT approaches during the out-of-sample period for the Alberta market ($\alpha$ = 5%). Although the HS model in Table 6 has a marginally better violation ratio (4.93%) than the AR-EGARCH-EVT model (4.11%), the latter results in time-varying VaR forecasts that adapt quickly to changing market conditions. During the middle of 2003 and towards the end of 2004, the AR-EGARCH-EVT model produces more accurate and robust VaR forecasts (dashed, green line). AR-EGARCH-EVT ‘violations’ (marked with circles) are relatively evenly spaced throughout the out-of-sample period. In contrast, VaR forecasts under the HS approach (smooth, heavy red line) are relatively constant and persistent. As a result, HS violations (marked by triangles) appear to be clustered during periods of turmoil. This finding has obvious implications: a firm that forecasts VaR using the HS model may experience a number of consecutive violations during turbulent periods when accurate VaR measures are needed most.
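The tendency of a slowly moving HS quantile to produce consecutive violations can be illustrated with a minimal Python sketch of the rolling procedure. It takes the empirical (1 − α) quantile of the most recent returns as the next-day VaR and flags a violation when the realized return exceeds it. This is a simplification rather than the procedure used in the paper: the HS approach described in Section 4.2 bootstraps from the 500 most-recent raw returns, whereas the sketch uses the plain empirical quantile of the window. The function names, the window length of 20 and the toy return series are all hypothetical.

```python
import math


def rolling_hs_violations(returns, window, alpha):
    """Flag out-of-sample days on which the return exceeds the rolling HS VaR.

    The VaR for day t is the empirical (1 - alpha) quantile of the `window`
    returns immediately before t (a simplification of bootstrapped HS).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if window < 1 or window >= len(returns):
        raise ValueError("window must be shorter than the return series")
    # Position of the (1 - alpha) order statistic in a sorted window; the
    # small tolerance guards against floating-point round-up.
    k = max(0, math.ceil((1.0 - alpha) * window - 1e-9) - 1)
    hits = []
    for t in range(window, len(returns)):
        recent = sorted(returns[t - window:t])  # window rolls forward one day
        var_forecast = recent[k]                # right-tail quantile
        hits.append(1 if returns[t] > var_forecast else 0)
    return hits


def violation_ratio(hits):
    """Violations as a percentage of out-of-sample days."""
    return 100.0 * sum(hits) / len(hits)


# Toy series (hypothetical): a calm regime, a short turbulent burst, calm again.
calm = [0.1 * ((i % 5) - 2) for i in range(50)]
burst = [1.0 + 0.1 * i for i in range(5)]
series = calm + burst + calm

hits = rolling_hs_violations(series, window=20, alpha=0.05)
print(f"violation ratio: {violation_ratio(hits):.2f}%")
print("violation days:", [i for i, h in enumerate(hits) if h])
```

In this toy series every violation falls on consecutive days inside the turbulent burst, because the window-based quantile only catches up after the large returns have entered the window.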
In light of the possibility that true quantiles are time varying, the following section conducts formal statistical tests to assess the conditional coverage of various approaches to quantile estimation.

4.3. Statistical analysis of model performance

The performance of competing approaches to VaR measurement in Section 4.2 is based on an assessment of the out-of-sample accuracy of estimated quantiles. Specifically, the out-of-sample violation proportions are compared to theoretical probabilities. Conducting formal statistical inference on this unconditional coverage is straightforward (see Berkowitz, 1999; Christoffersen, 1998; McNeil & Frey, 2000). Note, however, that evaluating quantile estimation performance using unconditional coverage may be of limited use if the true quantile is time varying. To illustrate, consider a naïve quantile estimator constructed as the quantile of all historical returns. On average, the naïve estimator will have a perfect violation proportion (that is, correct unconditional coverage). In any given period, however, the conditional coverage may be incorrect. This scenario is particularly relevant in financial time-series where volatility (and consequently, the return distribution) varies over time. A naïve quantile estimator may entirely fail to differentiate between periods of high volatility and periods of relative tranquility.19

19 We are grateful to an anonymous referee for articulating this issue and suggesting tests of both unconditional and conditional coverage.

Christoffersen (1998) clarifies the distinction between conditional and unconditional interval forecasts and proposes statistical tests for each.20 Let LRcc denote a likelihood ratio test statistic examining whether a quantile estimator has correct conditional coverage. Christoffersen (1998) shows that LRcc can be decomposed into a likelihood ratio test of correct unconditional coverage (LRuc) and a likelihood ratio test of independence (LRind).
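The decomposition can be made concrete with a short Python sketch. It is not the authors' code: the likelihood-ratio forms follow the standard statement of Christoffersen's (1998) tests (a binomial likelihood for unconditional coverage and a first-order Markov likelihood for independence), the Binomial statistic is taken to be the usual normal approximation, and all function names and the hit sequences are hypothetical. As an illustration, 31 violations over an assumed 730-day window (about 4.25%) give an absolute Binomial statistic near 0.93 and an LRuc near 0.92 at α = 5%; the window length is an assumption made for this example and is not taken from this section.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_z(hits, p):
    """Normal-approximation Binomial statistic for the violation count."""
    n, x = len(hits), sum(hits)
    return (x - n * p) / math.sqrt(n * p * (1.0 - p))


def lr_uc(hits, p):
    """Likelihood ratio statistic for correct unconditional coverage."""
    n, x = len(hits), sum(hits)
    pi_hat = x / n
    log_l0 = _xlogy(n - x, 1.0 - p) + _xlogy(x, p)
    log_l1 = _xlogy(n - x, 1.0 - pi_hat) + _xlogy(x, pi_hat)
    return -2.0 * (log_l0 - log_l1)


def lr_ind(hits):
    """Likelihood ratio statistic for first-order independence of hits."""
    n = [[0, 0], [0, 0]]  # n[i][j]: transitions from state i to state j
    for prev, curr in zip(hits[:-1], hits[1:]):
        n[prev][curr] += 1
    n00, n01, n10, n11 = n[0][0], n[0][1], n[1][0], n[1][1]
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    pi01 = n01 / (n00 + n01) if n00 + n01 else 0.0
    pi11 = n11 / (n10 + n11) if n10 + n11 else 0.0
    log_l0 = _xlogy(n00 + n10, 1.0 - pi) + _xlogy(n01 + n11, pi)
    log_l1 = (_xlogy(n00, 1.0 - pi01) + _xlogy(n01, pi01)
              + _xlogy(n10, 1.0 - pi11) + _xlogy(n11, pi11))
    return -2.0 * (log_l0 - log_l1)


def lr_cc(hits, p):
    """Conditional coverage statistic as the sum of the two components."""
    return lr_uc(hits, p) + lr_ind(hits)


def chi2_sf(stat, df):
    """Upper-tail chi-squared probability for 1 or 2 degrees of freedom."""
    stat = max(stat, 0.0)
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


# Two hypothetical hit sequences with the same violation count (31 in 730).
spread = [1 if (i + 1) % 23 == 0 and i < 713 else 0 for i in range(730)]
clustered = [0] * 699 + [1] * 31

for label, seq in (("spread", spread), ("clustered", clustered)):
    print(label, round(abs(binomial_z(seq, 0.05)), 2), round(lr_uc(seq, 0.05), 2),
          round(lr_ind(seq), 2), round(lr_cc(seq, 0.05), 2),
          round(chi2_sf(lr_cc(seq, 0.05), 2), 4))
```

Both sequences share the same unconditional statistics, since these depend only on the violation count, but the clustered sequence produces a far larger independence statistic and therefore fails conditional coverage.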
In brief, the test of independence is concerned with the order in which VaR violations occur: observed violations should be spread out over the sample rather than arriving in clusters.

Table 7 reports statistical tests for conditional coverage, unconditional coverage and independence. In addition, the popular Binomial test of unconditional coverage is also reported (see Fernandez, 2005; McNeil & Frey, 2000). As a quick reference guide, the absence of ‘asterisks’ in Table 7 indicates that the difference between theoretical and empirical violation ratios is not statistically significant. In addition, a quantile estimator should be viewed with scepticism if it passes the unconditional test but fails either or both of the conditional and independence tests. Almost immediately, we see examples of the issue raised above. For example, with $\alpha$ = 5%, the violation ratio for Alberta passes the unconditional tests (Binomial and LRuc), but fails the independence test, and consequently the conditional coverage test. For NordPool, the unconditional tests are passed, but the independence test is failed. In contrast, the AR-EGARCH-EVT approach (and arguably the AR-EGARCH-t approach) demonstrates consistency between conditional and unconditional tests. In general, the differences between theoretical and empirical violation ratios from these models are not statistically significant.

Considering the results of Tables 6 and 7 as a whole, the only market where the AR-EGARCH-EVT VaR forecast is not superior is the NordPool (at any level of $\alpha$). As noted in the previous section, NordPool features hydro-generation which seemingly results in a return distribution with notably different characteristics (as evidenced by the summary statistics in Table 1). Table 7 suggests that the naïve HS approach is adequate in this market only.

20 Briefly, the tests can be implemented in a convenient likelihood ratio framework and are distributed asymptotically chi-squared.
Readers are referred to Christoffersen (1998) for technical details on the test statistics.

Table 7
Statistical tests of conditional and unconditional coverage

                                Victoria    NordPool    Alberta     Hayward     PJM
$\alpha$ = 5%
HS              Binomial test   0.93        0.93        0.09        1.12        1.68*
                LRuc            0.92        0.92        0.01        1.35        2.61
                LRind           10.43***    2.75*       6.34**      0.95        5.42**
                LRcc            11.35***    3.67        6.35**      2.30        8.03**
AR-ConVar       Binomial test   5.35***     5.69***     0.76        5.24***     2.06*
                LRuc            44.54***    53.61***    0.61        46.52***    4.85**
                LRind           0.07        0.02        6.31**      0.01        1.24
                LRcc            44.61***    53.63***    6.92**      46.54***    6.09**
AR-HS           Binomial test   1.44        5.69***     1.10        3.93***     3.00***
                LRuc            2.26        53.60***    1.16        21.09***    7.78***
                LRind           12.75***    0.02        5.68**      0.27        7.64***
                LRcc            15.01***    53.62***    6.84**      21.36***    15.42***
AR-EGARCH-N     Binomial test   2.46***     1.61*       0.42        0.19        3.18***
                LRuc            5.43***     2.85*       0.18        0.04        8.71***
                LRind           5.28***     6.56**      6.21**      0.25        1.06
                LRcc            10.71***    9.41***     6.39**      0.29        9.77***
AR-EGARCH-t     Binomial test   0.08        6.03***     0.25        2.81***     1.69*
                LRuc            0.01        65.59***    0.07        9.59***     3.16*
                LRind           2.35        0.00        0.35        0.77        1.52
                LRcc            2.36        65.59***    0.42        10.36***    4.68*
AR-EGARCH-EVT   Binomial test   0.42        3.48***     1.10        1.68*       1.87*
                LRuc            0.18        15.21***    1.29        3.16*       3.19*
                LRind           2.96        0.72        0.05        0.09        1.52
                LRcc            3.14        15.93***    1.34        3.27        4.71*
$\alpha$ = 1%
HS              Binomial test   0.48        0.11        0.11        1.64*       2.46**
                LRuc            0.25        0.01        0.25        3.63*       4.69**
                LRind           4.44*       0.14        0.10        0.01        0.48
                LRcc            4.69*       0.15        0.35        3.64        5.17*
AR-ConVar       Binomial test   2.72***     1.97*       4.35***     2.46**      0.82
                LRuc            14.67***    5.45**      13.13***    12.06***    0.61
                LRind           0           0.01        0.42        0           0.22
                LRcc            14.67***    5.55*       13.55***    12.06***    0.83
AR-HS           Binomial test   0.26        2.72***     1.23        2.46**      0.41
                LRuc            0.07        14.67***    1.80        12.06***    0.16
                LRind           3.26**      0           0.04        0           0.16
                LRcc            3.33        14.67***    1.84        12.06***    0.32
AR-EGARCH-N     Binomial test   6.96***     0.63        5.47***     2.87***     9.44***
                LRuc            29.14***    0.37        19.44***    6.19**      46.28***
                LRind           0.01        0.22        1.37        1.18        0.14
                LRcc            29.15***    0.59        20.81***    7.37**      46.42***
AR-EGARCH-t     Binomial test   1.37        2.72***     0.86        1.64*       0.82
                LRuc            1.64        14.67***    0.82        3.63*       0.76
                LRind           0.34        0           0.07        0.01        0.05
                LRcc            1.98        14.67***    0.89        3.64        0.81
AR-EGARCH-EVT   Binomial test   0.86        2.72**      0.86        1.23        1.23
                LRuc            0.82        14.67***    0.82        1.86        1.31
                LRind           0.07        0           0.07        0.03        0.27
                LRcc            0.89        14.67***    0.89        1.89        1.58
$\alpha$ = 0.5%
HS              Binomial test   0.34        0.71        0.18        1.74*       1.74*
                LRuc            0.89        0.45        0.03        6.02**      2.33
                LRind           0.01        0.07        0.04        0           0.12
                LRcc            0.90        0.52        0.07        6.02**      2.45
AR-ConVar       Binomial test   0.87        0.87        8.06***     1.74*       2.32**
                LRuc            7.31***     0.89        32.32***    6.02**      3.89**
                LRind           0           0.01        0.43        0           0.16
                LRcc            7.31**      0.90        32.75***    6.02**      4.05
AR-HS           Binomial test   1.39        1.92*       0.87        1.74*       1.74*
                LRuc            2.72        7.32***     0.89        6.02**      6.02**
                LRind           0.00        0           0.01        0           0
                LRcc            2.72        7.32***     0.90        6.02**      6.02**
AR-EGARCH-N     Binomial test   9.10***     1.76**      7.53***     4.63***     12.16***
                LRuc            39.21***    2.43        29.03***    12.69***    58.56***
                LRind           0.23        0.14        0.91        1.73        2.00
                LRcc            39.44***    2.57        29.94***    14.42***    60.56***
AR-EGARCH-t     Binomial test   0.71        1.92*       1.39        1.16        1.16
                LRuc            0.45        7.32***     2.72*       1.81        1.81
                LRind           0.07        0           0.00        0.00        0.00
                LRcc            0.52        7.32***     2.72        1.81        1.81
AR-EGARCH-EVT   Binomial test   0.18        1.92*       1.39        0.58        0.58
                LRuc            0.03        7.32***     2.72*       0.38        0
                LRind           0.04        0           0.00        0.01        0.03
                LRcc            0.07        7.32**      2.72        0.39        0.03

The table presents statistical tests of both conditional and unconditional coverage of the interval forecasts under each competing approach. *, ** and *** denote significance at the 10%, 5% and 1% level, respectively.

The statistical evidence favouring the use of VaR forecasts based on the AR-EGARCH-EVT model should come as no surprise. The AR(7) mean equation accommodates the autoregression in returns. The EGARCH component captures conditional volatility clustering, asymmetric effects, and, in this case, seasonality in volatility. The EVT component explicitly models the heavy tails of the standardized residuals.
Taken together, the features ensure that quantile estimates from the AR-EGARCH-EVT model at any given time reflect the most recent and relevant information.

5. Conclusion

The recent deregulation in electricity markets worldwide has heightened the importance of risk management in energy markets. This paper examines a number of approaches to forecasting VaR for electricity markets. Arguably, assessing VaR in electricity markets is more difficult than in traditional financial markets because the distinctive features of the former result in a highly unusual distribution of returns: electricity returns are highly volatile, display seasonalities in both their mean and volatility, exhibit leverage effects and clustering in volatility, and feature extreme levels of skewness and kurtosis. Accordingly, approaches to VaR measurement that are common in financial markets may not necessarily be appropriate in electricity markets.

In addition to popular parametric and non-parametric approaches, this paper explores an approach to VaR forecasting that incorporates extreme value theory. The proposed model is specifically designed for electricity applications. Given daily data series, the model accommodates autoregression and weekly seasonals in both the conditional mean and conditional volatility equations. Leverage effects in conditional volatility are modelled with an EGARCH specification. Model residuals are standardized to produce (near) i.i.d. observations, and EVT is applied to the standardized residuals to forecast the tail quantiles required for VaR.

The results support the deployment of the proposed model. Autocorrelations exist at lags up to 7 days, and conditional volatility displays leverage effects.
The two-step procedure of McNeil and Frey (2000) produces standardized residuals that ‘behave’ significantly better than raw returns in terms of independence, and thus better facilitate the EVT implementation.

In terms of VaR performance, it is difficult to draw consistent conclusions across the various methods, quantile levels and energy markets. Of the parametric models, the proposed AR-EGARCH-EVT method arguably produces the most accurate forecasts of VaR. Somewhat surprisingly, the naïve quantile estimator based on historical simulation performs strongly in several markets. Further examination suggests that the distribution of returns in markets in which the HS approach dominates may be different: the distribution of returns in the NordPool and Alberta markets is notably less skewed, has lower kurtosis, and exhibits lower dispersion. In contrast, and as might be expected, the sophisticated AR-EGARCH-EVT approach dominates in markets where the distribution of returns is characterized by high skewness and kurtosis, and high volatility.

The paper also examines VaR performance by assessing the unconditional and conditional interval coverage of the various approaches to forecasting VaR. Assessing conditional coverage is important if true tail quantiles are time varying. In such cases, simple VaR approaches based on historical simulation are likely to result in ‘clustering’ of VaR violations, and this will occur during periods of turmoil when accurate VaR forecasts are needed most. The statistical tests of Christoffersen (1998) suggest that the naïve HS approach does indeed fail to provide adequate conditional coverage. In contrast, the proposed AR-EGARCH-EVT approach generates VaR forecasts that, by incorporating the most recent market events, provide appropriate conditional coverage. This finding is consistent across nearly all energy markets examined.
In summary, the results of the paper support the combination of the parametric AR-EGARCH model with EVT for the purpose of estimating tail quantiles and forecasting VaR.

Acknowledgements

We are grateful to two anonymous referees, Hans Byström and seminar participants at the 2005 AsianFA Conference for their helpful comments and suggestions. The first author thanks the Department of Education, Training and Youth Affairs (DETYA), Australia, the University of Queensland and the University of Auckland for the funding support. All errors are our own.

References

Andrews, N., & Thomas, M. (2002, April). At the end of the tail. EPRM, 75–77.
Bali, T. G. (2003). An extreme value approach to estimating volatility and value at risk. Journal of Business, 76, 83–107.
Bali, T. G., & Neftci, S. N. (2003). Disturbing extremal behaviour of spot price dynamics. Journal of Empirical Finance, 10, 455–477.
Balkema, A. A., & de Haan, L. (1974). Residual lifetime at great age. Annals of Probability, 2, 792–804.
Berkowitz, J. (1999). Evaluating the forecasts of risk models. Working paper, Federal Reserve Board.
Byström, H. (2004). Managing extreme risks in tranquil and volatile markets using conditional extreme value theory. International Review of Financial Analysis, 13, 133–152.
Byström, H. (2005). Extreme value theory and extremely large electricity price changes. International Review of Economics and Finance, 14, 41–55.
Christoffersen, P. F. (1998). Evaluating interval forecasts. International Economic Review, 39, 841–862.
Clewlow, L., & Strickland, C. (2000). Energy derivatives: Pricing and risk management. London: Lacima.
Cuaresma, J. C., Hlouskova, J., Kossmeier, S., & Obersteiner, M. (2004). Forecasting electricity spot-prices using linear univariate time-series models. Applied Energy, 77, 87–106.
Dowd, K. (1998). Beyond value-at-risk: The new science of risk management. Chichester: Wiley.
Duffie, D., Gray, S., & Hoang, P. (1998). Volatility in energy prices. In R. Jameson (Ed.), Managing energy price risk. London: Risk publication.
Duffie, D., & Pan, J. (1997). An overview of value-at-risk. Journal of Derivatives, 7, 7–49.
Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling extremal events. Berlin–Heidelberg: Springer.
Escribano, A., Pena, I., & Villaplana, P. (2002). Modelling electricity prices: International evidence. Working paper, Universidad Carlos III de Madrid.
Eydeland, A., & Wolyniec, K. (2003). Energy and power risk management: New developments in modelling, pricing and hedging. New Jersey: Wiley.
Fernandez, V. (2005). Risk management under extreme events. International Review of Financial Analysis, 14, 113–148.
Gençay, R., & Selçuk, F. (2004). Extreme value theory and value-at-risk: Relative performance in emerging markets. International Journal of Forecasting, 20, 287–303.
Gençay, R., Selçuk, F., & Ulugülyaĝci, A. (2003). High volatility, thick tails and extreme value theory in value-at-risk estimation. Insurance, Mathematics and Economics, 33, 337–356.
Hill, B. (1975). A simple general approach to inference about the tail of distribution. Annals of Statistics, 46, 1163–1173.
Ho, L., Burridge, P., Cadle, J., & Theobald, M. (2000). Value-at-risk: Applying the extreme value approach to Asian markets in the recent financial turmoil. Pacific-Basin Financial Journal, 8, 249–275.
Holton, G. A. (2003). Value-at-risk: Theory and practice. San Diego: Academic Press.
Jorion, P. (2000). Value-at-risk: The new benchmark for managing financial risk. New York: McGraw-Hill.
Knittel, C., & Roberts, M. (2001). An empirical examination of deregulated electricity prices. Working paper, University of California.
Longin, F. M. (1996). The asymptotic distribution of extreme stock market returns. Journal of Business, 69, 383–408.
Manganelli, S., & Engle, R. (2004). A comparison of value-at-risk models in finance. In G. Szegö (Ed.), Risk measures for the 21st century. Chichester: Wiley.
McNeil, A. J., & Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance, 7, 271–300.
Müller, U. A., Dacorogna, M. M., & Pictet, O. V. (1998). Heavy tails in high frequency financial data. In R. Adler et al. (Eds.), A practical guide to heavy tails: Statistical techniques and applications. Boston: Birkhäuser.
Pickands, J. (1975). Statistical inference using extreme order statistics. Annals of Statistics, 3, 119–131.
Pictet, O. V., Dacorogna, M. M., & Müller, U. A. (1998). Hill, bootstrap and jackknife estimators for heavy tails. In R. Adler et al. (Eds.), A practical guide to heavy tails: Statistical techniques and applications. Boston: Birkhäuser.
Reiss, R. D., & Thomas, M. (2001). Statistical analysis of extreme values. Berlin–Heidelberg: Springer.
Rozario, R. (2002). Estimating value at risk for the electricity market using a technique from extreme value theory. Working paper, University of New South Wales.
Wolak, F. (1997). Market design and price behaviour in restructured electricity markets: An international comparison. Working paper, Stanford University.

Kam Fong Chan is a PhD student at the UQ Business School, The University of Queensland. He is currently a lecturer in finance at The University of Auckland. His research interests include financial econometrics, testing asset pricing models and modelling jump-diffusion and volatility processes. He has published in the Multinational Finance Journal and Accounting & Finance.

Philip Gray is Associate Professor in Finance at the UQ Business School at the University of Queensland. He completed a PhD at the Australian Graduate School of Management in 2000. His research interests include assessing return predictability, non-parametric derivative pricing, and empirical testing of asset pricing models.
He has published in numerous scholarly journals including the Journal of Business, Finance & Accounting, Journal of Futures Markets, Journal of Finance, Economic Record, International Review of Finance, Finance Research Letters, Accounting & Finance and Journal of Banking and Finance.