Evaluating extreme quantile forecasts
Rob J Hyndman
Business & Economic Forecasting Unit

Outline
1 Examples
2 Forecast density evaluation
3 Forecast quantile evaluation
4 Electricity peak demand forecasting

Examples

Extreme quantile forecasting
[Figure: Daily change in Dow Jones Index, 2000 to 2010. x-axis: Year; y-axis: Change in DJI. Annotations: NASDAQ crash; After 11 Sep 2001; Global Financial Crisis.]
[Figure: image annotated "Black Saturday".]
[Figure: PoE (annual interpretation), 98/99 to 10/11. x-axis: Year; y-axis: PoE Demand. Series: 10%, 50%, 90%.]
[Figure: Annual POE levels, 98/99 to 20/21. x-axis: Year; y-axis: PoE Demand. Series: 1% POE, 5% POE, 10% POE, 50% POE, 90% POE, actual annual maximum.]

Forecast density evaluation

Density evaluation
$Q_t(p)$ = forecast quantile of $y_t$, to be exceeded with probability $1 - p$.
$G(p)$ = proportion of times $y_t$ is less than $Q_t(p)$ in the historical data.
If the $Q_t(p)$ come from an accurate forecast distribution, then $G(p) \approx p$.

[Figure: Density evaluation. x-axis: Quantile $Q(p)$; y-axis: $p$ = proportion less than $Q(p)$. Curves: $p$ and $G(p)$.]
Imagine there are multiple observations for each forecast distribution.
Figure annotations: KS: Kolmogorov-Smirnov statistic; Mean difference = Mean Absolute Excess Probability.
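A minimal sketch of how $G(p)$ might be computed, assuming each forecast distribution is Gaussian and available as a `statistics.NormalDist` object. The function name `empirical_coverage`, the simulated data and the "too narrow" forecasts are all hypothetical and only serve as an illustration.

```python
import random
from statistics import NormalDist

def empirical_coverage(y, forecasts, p):
    """G(p): proportion of observations y_t below the forecast quantile Q_t(p)."""
    hits = sum(obs < dist.inv_cdf(p) for obs, dist in zip(y, forecasts))
    return hits / len(y)

random.seed(0)
n = 2000
means = [random.gauss(0, 1) for _ in range(n)]   # hypothetical forecast means
y = [random.gauss(m, 1) for m in means]          # y_t drawn from N(mean_t, 1)

good = [NormalDist(m, 1.0) for m in means]       # correct forecast distributions
narrow = [NormalDist(m, 0.5) for m in means]     # forecast spread too small

p_grid = [i / 100 for i in range(1, 100)]
G_good = [empirical_coverage(y, good, p) for p in p_grid]
G_narrow = [empirical_coverage(y, narrow, p) for p in p_grid]
```

With the correct forecasts, `G_good` should track `p_grid` closely; with the narrow forecasts, `G_narrow` should drift away from $p$ in both tails.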
Excess probability
Excess probability: $E(p) = G(p) - p$
$E(p)$ does not depend on $t$.
$\text{KS} = \max_p |E(p)|$
$\text{MAEP} = \int_0^1 |E(p)| \, dp$
$\text{Cramer-von-Mises} = \int_0^1 E^2(p) \, dp$

[Figure: Density evaluation. x-axis: Probability $p$; y-axis: Excess probability. Annotations: KS; Area = MAEP: Mean Absolute Excess Probability.]
[Figure: Density evaluation. x-axis: Probability $p$; y-axis: Squared excess probability. Annotation: Area = Cramer-von-Mises statistic.]
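The three summary measures can be approximated numerically once $G(p)$ is known on a grid. The sketch below uses the trapezoidal rule for the two integrals. The function name `excess_probability_measures` and the coverage curve $G(p) = p^2$ are made up for illustration; they do not come from the slides.

```python
def excess_probability_measures(p_grid, G):
    """Approximate KS, MAEP and Cramer-von-Mises from G(p) given on a grid of p."""
    E = [g - p for g, p in zip(G, p_grid)]       # excess probability E(p) = G(p) - p

    def trapezoid(values):
        # Trapezoidal rule over the (possibly uneven) grid
        return sum(
            (p_grid[i + 1] - p_grid[i]) * (values[i] + values[i + 1]) / 2
            for i in range(len(values) - 1)
        )

    ks = max(abs(e) for e in E)                  # KS = max_p |E(p)|
    maep = trapezoid([abs(e) for e in E])        # MAEP = integral of |E(p)|
    cvm = trapezoid([e * e for e in E])          # Cramer-von-Mises = integral of E(p)^2
    return ks, maep, cvm

p_grid = [i / 1000 for i in range(1001)]
G = [p ** 2 for p in p_grid]                     # hypothetical coverage curve
ks, maep, cvm = excess_probability_measures(p_grid, G)
```

For this artificial curve the exact values are $\text{KS} = 1/4$, $\text{MAEP} = 1/6$ and Cramer-von-Mises $= 1/30$, which gives a quick check on the numerical approximation.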
Probability integral transform
$F_t(y) = \text{Prob}(y_t \le y)$ = distribution of $y_t$.
$F_t(Q_t(p)) = p$.
$Z_t = F_t(y_t)$ is the PIT.
If $F_t(y)$ is correct, then $Z_t$ will follow a $U(0, 1)$ distribution.

[Figure: Probability integral transform. x-axis: Quantile $Q(p)$; y-axis: $p$ = proportion less than $Q(p)$. Curves: $p$ and $G(p)$; an observation $Y_t$ is mapped to $Z_t$.]
[Figure: Probability integral transform. x-axis: $p$; y-axis: $Z_t$. Annotations: KS (same value as before); MAEP (same value as before).]
The PIT is not necessary, as $G(p)$ gives the same information and is more interpretable.

Distribution of MAEP
$Z_i = F_i(y_i)$, sorted in increasing order.
$$A_i = \begin{cases} \frac{1}{2}\left[\left(Z_i - \frac{i-1}{n}\right)^2 + \left(Z_i - \frac{i}{n}\right)^2\right] & \text{if } \frac{i-1}{n} < Z_i < \frac{i}{n}, \\ \frac{1}{n}\left|Z_i - \frac{i-0.5}{n}\right| & \text{otherwise.} \end{cases}$$
$\text{MAEP} = \sum_{i=1}^{n} A_i$
$E(\text{MAEP}) = \frac{1}{\sqrt{10n}}$
$V(\text{MAEP}) = \frac{1}{54n}$
Get p-values by simulation.

MAEP for density evaluation
MAEP is more sensitive and less variable than KS.
MAEP is more interpretable than the Cramer-von-Mises statistic.
Calculation and interpretation of MAEP does not require a PIT.
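The piecewise $A_i$ terms and the simulated p-value can be sketched as follows. The sketch assumes the $Z_i$ are sorted before the $A_i$ terms are summed, and it takes the p-value to be the share of simulated $U(0,1)$ samples whose MAEP is at least as large as the observed one. The function names, the five observations and the standard normal forecast are hypothetical.

```python
import random
from statistics import NormalDist

def maep_from_pit(z):
    """MAEP as the sum of the A_i terms, computed from sorted PIT values."""
    z = sorted(z)
    n = len(z)
    total = 0.0
    for i, zi in enumerate(z, start=1):
        lo, hi = (i - 1) / n, i / n
        if lo < zi < hi:
            total += 0.5 * ((zi - lo) ** 2 + (zi - hi) ** 2)
        else:
            total += abs(zi - (i - 0.5) / n) / n
    return total

def simulated_p_value(z, n_sim=2000, seed=1):
    """Share of simulated U(0,1) samples with MAEP at least the observed MAEP."""
    rng = random.Random(seed)
    observed = maep_from_pit(z)
    n = len(z)
    exceed = sum(
        maep_from_pit([rng.random() for _ in range(n)]) >= observed
        for _ in range(n_sim)
    )
    return exceed / n_sim

# Hypothetical example: five observations, each forecast as N(0, 1)
z_example = [NormalDist(0.0, 1.0).cdf(v) for v in (-1.2, -0.3, 0.1, 0.8, 1.5)]
maep_example = maep_from_pit(z_example)
p_example = simulated_p_value(z_example)
```

A small p-value would suggest that the observed PIT values are further from uniform than chance alone would explain.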
Forecast quantile evaluation

Quantile evaluation
Apply density evaluation measures to the tail of the distribution only.
$Q_t(p)$ = forecast quantile of $y_t$, to be exceeded with probability $1 - p$.
$G(p)$ = proportion of times $y_t$ is less than $Q_t(p)$ in the historical data.
$E(p) = G(p) - p$ = excess probability.
Quantile evaluation measures
$\text{KS}_q = \max_{p > q} |E(p)|$
$\text{MAEP}_q = \int_q^1 |E(p)| \, dp$

[Figure: Quantile evaluation measures. x-axis: Quantile $Q(p)$; y-axis: $p$ = proportion less than $Q(p)$. Curves: $p$ and $G(p)$. Annotations: $q = 0.9$, $Q(q)$, $\text{KS}_{0.9}$, $\text{MAEP}_{0.9}$.]
[Figure: Quantile evaluation measures on the PIT scale. x-axis: $p$; y-axis: $Z_t$. Annotations: $q = 0.9$, $\text{MAEP}_{0.9}$.]
The distribution of $\text{MAEP}_q$ can be obtained by simulation.
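One possible way to compute the tail measures and simulate the null distribution of $\text{MAEP}_q$ is sketched below. It assumes continuous forecast distributions, so that $G(p)$ equals the share of PIT values below $p$, and it approximates the integral on an evenly spaced grid over $[q, 1]$ (the grid includes the endpoint $p = q$). All function names are hypothetical.

```python
import random
from bisect import bisect_left

def tail_measures(z, q, grid_size=200):
    """Approximate KS_q and MAEP_q from PIT values, using only p in [q, 1]."""
    z = sorted(z)
    n = len(z)
    step = (1 - q) / grid_size
    p_grid = [q + k * step for k in range(grid_size + 1)]
    # G(p) = share of PIT values below p; |E(p)| = |G(p) - p|
    abs_e = [abs(bisect_left(z, p) / n - p) for p in p_grid]
    ks_q = max(abs_e)
    maep_q = sum(step * (abs_e[k] + abs_e[k + 1]) / 2 for k in range(grid_size))
    return ks_q, maep_q

def simulate_maep_q(n, q, n_sim=1000, seed=2):
    """Sorted MAEP_q values for n_sim samples of n independent U(0,1) PIT values."""
    rng = random.Random(seed)
    return sorted(
        tail_measures([rng.random() for _ in range(n)], q)[1] for _ in range(n_sim)
    )

def tail_p_value(z, q, n_sim=1000, seed=2):
    """Share of simulated MAEP_q values at least as large as the observed one."""
    observed = tail_measures(z, q)[1]
    sims = simulate_maep_q(len(z), q, n_sim, seed)
    return sum(s >= observed for s in sims) / n_sim
```

The same pattern would apply to $\text{KS}_q$ by taking element `[0]` of the returned tuple instead of element `[1]`.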
Excess probability
$q$ must be small enough for some observations to have occurred in the tail.
If the $y_t$ values are independent and there are $n$ forecast distributions, then the probability of $Q(q)$ being exceeded at least once is $1 - q^n$.
Let $X_q$ = number of observations $> Q(q)$. Then $X_q \sim \text{Binomial}(n, 1 - q)$.
Select $n$ to ensure that the probability of at least 5 tail observations is at least 0.95.
$q = 0.9 \Rightarrow n > 89$.
$q = 0.95 \Rightarrow n > 181$.
$q = 0.99 \Rightarrow n > 913$.

[Figure: Sample size needed. x-axis: $q$ (0.90 to 1.00); y-axis: $n$ (0 to 10000).]
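The sample-size rule based on $X_q \sim \text{Binomial}(n, 1 - q)$ can be explored with a direct search, sketched here under the assumption of independent observations. The function names are hypothetical. The results should land close to the thresholds quoted for $q = 0.9$, $0.95$ and $0.99$, although the exact integer depends on how the boundary case is treated.

```python
from math import comb

def prob_at_least(k, n, prob):
    """P(X >= k) for X ~ Binomial(n, prob)."""
    return 1.0 - sum(
        comb(n, j) * prob ** j * (1 - prob) ** (n - j) for j in range(k)
    )

def min_sample_size(q, k=5, level=0.95):
    """Smallest n with P(at least k observations above Q(q)) >= level."""
    n = k
    while prob_at_least(k, n, 1 - q) < level:
        n += 1
    return n

sizes = {q: min_sample_size(q) for q in (0.9, 0.95, 0.99)}
```

Setting `k=1` in `prob_at_least` reproduces the probability $1 - q^n$ that $Q(q)$ is exceeded at least once.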
Electricity peak demand forecasting

Peak demand forecasting
We need forecasts of half-hourly demand with $\alpha$ annual probability of exceedance.
There are insufficient data to look at annual maximums (less than 15 years).
Create approximately independent weekly maximum forecasts (21 weeks each summer).
For these weekly forecasts, $q = (1 - \alpha)^{1/21}$.
For 15 years of data, $n = 315$.
Therefore $q \le 0.971$ and $\alpha \ge 0.46$.

Model evaluation for electricity demand
            Ex ante   Ex post
q = 0.95     4.35%     3.79%
q = 0.90     5.59%     4.28%
q = 0.50     9.25%     5.24%
q = 0.10    10.73%     7.95%
q = 0.0     10.31%     8.24%
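The bounds on $q$ and $\alpha$ for $n = 315$ can be checked numerically. The sketch below assumes they come from the rule of at least 5 tail observations with probability at least 0.95, together with $q = (1 - \alpha)^{1/21}$, that is $\alpha = 1 - q^{21}$. The bisection helper `max_tail_start` and the other names are hypothetical.

```python
from math import comb

def prob_at_least(k, n, prob):
    """P(X >= k) for X ~ Binomial(n, prob)."""
    return 1.0 - sum(
        comb(n, j) * prob ** j * (1 - prob) ** (n - j) for j in range(k)
    )

def max_tail_start(n, k=5, level=0.95, tol=1e-6):
    """Largest q (by bisection) with P(at least k of n observations > Q(q)) >= level."""
    lo, hi = 0.0, 1.0                  # the condition holds at lo and fails at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_at_least(k, n, 1 - mid) >= level:
            lo = mid
        else:
            hi = mid
    return lo

weeks_per_summer = 21
n = 15 * weeks_per_summer              # 15 years of weekly maxima
q_max = max_tail_start(n)
alpha_min = 1 - q_max ** weeks_per_summer   # annual probability of exceedance
```

Under these assumptions, `q_max` and `alpha_min` should come out close to the values stated on the slide.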