New models to compute short-run forecasts of electricity prices: application to the Spanish market case

Carolina García Martos, Julio Rodríguez, María Jesús Sánchez 1

Laboratorio de Estadística. Escuela Técnica Superior de Ingenieros Industriales. Universidad Politécnica de Madrid.

1 The authors gratefully acknowledge the financial support of Project MTM2005-08897 (Ministerio de Educación y Ciencia, Spain).

Summary. Owing to recent changes in the electricity markets, not only producers but also consumers need accurate one-day-ahead forecasts of electricity prices. In this article, two alternatives to the models used by Contreras et al. (2003) are proposed to predict next-day electricity prices based on ARIMA methodology. Both of them model the 24 hourly time series instead of the complete time series of the prices, which makes it possible to take advantage of the homogeneity of these 24 series. A design of experiments has been carried out to reach the aim of this work: to select the model that leads to the smallest prediction errors, as well as the preferred length of the time series used to build the ARIMA models. A mixed model which combines the advantages of the two new models is proposed. Some of the numerical results obtained for the Spanish market (forecasts for every hourly price in the period 1998-2003) are shown.

Keywords: Time series analysis, electricity markets, forecasting, design of experiments, marginal price.

1. Introduction

Nowadays, in competitive markets, there are two ways to trade electricity: bilateral contracts and the pool. For bilateral contracts, the main interest lies in reducing the risk they entail. In the pool, both the generating companies and the consumers submit to the market operator their respective generation and consumption bids for each hour of the forthcoming day. The marginal price is defined as the bid submitted by the last generation unit needed to satisfy the whole demand.
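As a rough illustration of the marginal price definition above, the following minimal sketch accepts generation bids from cheapest to most expensive until demand is covered. It assumes a fixed (inelastic) hourly demand and ignores consumption bids and network constraints; the function name `marginal_price` and the bid values are hypothetical, not taken from the Spanish market rules.

```python
def marginal_price(generation_bids, demand_mwh):
    """Return the price of the last generation bid needed to cover demand.

    generation_bids: list of (price_eur_per_mwh, quantity_mwh) tuples.
    demand_mwh: total demand for the hour, treated as fixed (an assumption).
    """
    covered = 0.0
    # Accept the cheapest bids first (merit order).
    for price, quantity in sorted(generation_bids):
        covered += quantity
        if covered >= demand_mwh:
            return price
    raise ValueError("bids do not cover the demand")


# Made-up bids for a single hour.
bids = [(32.0, 400.0), (18.5, 500.0), (25.0, 300.0), (40.0, 200.0)]
clearing_price = marginal_price(bids, demand_mwh=900.0)
```

With these invented bids, the third cheapest unit is the last one needed to reach 900 MWh, so its bid sets the price for the hour.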
Accurate short-term forecasts let producers schedule the production of their units so as to maximize their profits. The availability of adequate models to predict electricity prices is therefore of great interest. Only a few years ago, in centralized markets, only demand was forecast.

The techniques applied to forecast electricity prices can be divided into two main groups: neural networks and time series models. Neural networks have been used by Ramsay and Wang (1998), Szkuta et al. (1999) and Rodríguez and Anders (2004). In addition, Nicolaisen et al. (2000) forecast electricity prices using neural networks, filtering the non-linearity of the prices.

Models based on time series have also been used to model electricity prices. Contreras et al. (2003) forecast electricity prices in the Spanish and Californian markets by applying ARIMA models. Troncoso et al. (2002) compare the kWNN technique with dynamic regression. Crespo-Cuaresma et al. (2004) propose a group of univariate models to predict electricity prices in the Leipzig market. Conejo et al. (2005) compare several methods, including wavelet approximation, ARIMA models and neural networks. Nogales and Conejo (2006) forecast the prices in the PJM Interconnection using transfer function models, showing that the inclusion of explanatory variables does not significantly reduce the prediction errors.

In this work, two new models to predict short-term electricity prices are proposed. Both of them build ARIMA models separately for the 24 hourly time series.

This paper is organized as follows. In section 2 the methodology and the computational implementation are described. In section 3 the design of experiments carried out is explained. Section 4 presents numerical results for the Spanish market. In section 5 some conclusions are provided.

2. Methodology and computational implementation

A brief descriptive analysis of the prices for the period 1998-2003 leads to one conclusion: both the level and the variability of the prices depend on the hour of the day.

Two new models are proposed. The first one, referred to from now on as Model 24, forecasts the electricity prices for each of the 24 hours of the next day using ARIMA models built for each of the 24 hourly time series. The second one computes the forecasts for working days using the 24 time series of those days, and the forecasts for weekends using the data corresponding to weekend days. This second model is hereafter referred to as Model 48, because of the number of series it works with (24 for working days and another 24 for weekends).

In order to determine which model leads to more accurate forecasts, a design of experiments has been carried out. It has also let us determine the preferred length of the time series used to build the forecasting models. Until now, a default length of between two and three months (10-12 weeks) had been used, but no rigorous study had been made to determine whether this or another length is appropriate.

The factors whose influence is of interest, and their respective levels, are:

1. Model: it takes into account whether different models have been built for working days and weekends (Model 48) or not (Model 24).
2. Length of the time series used to build the forecasting model. Levels of this factor: 8 weeks (Level 3), 12 weeks (Level 5), 16 weeks (Level 7), 20 weeks (Level 9), 24 weeks (Level 11).

The number of models identified and estimated is remarkable. The objective is ambitious: computing forecasts of the prices for the six years considered (1998-2003) with the ten possible combinations of the levels of the two factors under study.
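The two ways of arranging the data described in this section, 24 hourly series for Model 24 and 48 series for Model 48, can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_hourly_series` is hypothetical, weekends are taken to be Saturday and Sunday, public holidays are ignored, and the prices are made up. The ARIMA fitting step itself is not shown.

```python
from datetime import date, timedelta


def split_hourly_series(prices, start_day, by_day_type=False):
    """Group a flat hourly price list into per-hour daily series.

    prices: list with 24 values per day, starting at hour 1 of start_day.
    Returns a dict keyed by hour (Model 24 layout) or by
    (day_type, hour) when by_day_type is True (Model 48 layout).
    """
    if len(prices) % 24 != 0:
        raise ValueError("expected 24 prices per day")
    series = {}
    for idx, price in enumerate(prices):
        day = start_day + timedelta(days=idx // 24)
        hour = idx % 24 + 1
        if by_day_type:
            # Assumption: Saturday (5) and Sunday (6) are the weekend days.
            day_type = "weekend" if day.weekday() >= 5 else "working"
            key = (day_type, hour)
        else:
            key = hour
        series.setdefault(key, []).append(price)
    return series


# Two weeks of made-up prices, starting on Monday 3 January 2000.
prices = [20.0 + (i % 24) for i in range(14 * 24)]
model_24 = split_hourly_series(prices, date(2000, 1, 3))
model_48 = split_hourly_series(prices, date(2000, 1, 3), by_day_type=True)
```

Each resulting list is one daily time series for a given hour, which is the unit on which a separate ARIMA model would then be identified and estimated.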
More than 500,000 models have been identified, so the procedure had to be automated. All these models were identified using TRAMO, and they are selected using the BIC model selection criterion.

3. Design of experiments

The main objective of this work is to determine whether it is worthwhile to identify, estimate and forecast with different models for working days and weekends, as well as the number of observations in the time series used to build the models.

The day for which forecasts are computed is included as a block. This removes the effect of the day: if the price on a given day is rather unexpected, the forecast will not be accurate whatever combination of factor levels is considered. Including the day as a block also removes the possible correlation between forecasting errors.

The variable under study, $y_{ijt}$, is the logarithm of the average daily prediction error, chosen so that our results can be compared with previous ones (Contreras et al. (2003), Conejo et al. (2005)). The factors considered are the model (24 or 48) and the length of the time series (8, 12, 16, 20 and 24 weeks). The day is included as a block. The equation for the design of experiments is:

$$ y_{ijt} = \mu + \alpha_i + \beta_j + \gamma_t + (\alpha\beta)_{ij} + u_{ijt} \tag{3.1} $$

where $\mu$ is a global effect that accounts for the average level of the response (the logarithm of the prediction error), $\alpha_i$ is the main effect of the model, $\beta_j$ is the main effect of the length of the time series, and $\gamma_t$ accounts for the incremental effect of the block (the day, in this case). The term $(\alpha\beta)_{ij}$ measures the difference between the expected value of the response and the one computed using a model that does not include the interactions. A random effect, $u_{ijt}$, collects the effect of all the causes not covered by the other sources of variability in the experiment.
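To make the structure of model (3.1) concrete, the sketch below computes the sums of squares, degrees of freedom and F-ratios for a two-factor design with a block. It assumes a balanced design with exactly one observation per (model, length, day) cell, which need not hold for the paper's data; the function name `anova_with_block` and the data are invented for illustration, and this is not the software the authors used.

```python
def anova_with_block(y):
    """Sums of squares for y[i][j][t] = mu + a_i + b_j + g_t + (ab)_ij + u_ijt.

    y is a nested list indexed by model i, length j and day t (the block),
    with exactly one observation per cell (balanced design assumed).
    """
    I, J, T = len(y), len(y[0]), len(y[0][0])
    n = I * J * T
    cells = [(i, j, t) for i in range(I) for j in range(J) for t in range(T)]
    grand = sum(y[i][j][t] for i, j, t in cells) / n
    mean_i = [sum(y[i][j][t] for j in range(J) for t in range(T)) / (J * T)
              for i in range(I)]
    mean_j = [sum(y[i][j][t] for i in range(I) for t in range(T)) / (I * T)
              for j in range(J)]
    mean_t = [sum(y[i][j][t] for i in range(I) for j in range(J)) / (I * J)
              for t in range(T)]
    mean_ij = [[sum(y[i][j]) / T for j in range(J)] for i in range(I)]

    ss = {
        "model": J * T * sum((m - grand) ** 2 for m in mean_i),
        "length": I * T * sum((m - grand) ** 2 for m in mean_j),
        "day": I * J * sum((m - grand) ** 2 for m in mean_t),
        "interaction": T * sum(
            (mean_ij[i][j] - mean_i[i] - mean_j[j] + grand) ** 2
            for i in range(I) for j in range(J)
        ),
    }
    total = sum((y[i][j][t] - grand) ** 2 for i, j, t in cells)
    # In a balanced design the residual is what the other terms leave over.
    ss["residual"] = total - sum(ss.values())

    df = {"model": I - 1, "length": J - 1, "day": T - 1,
          "interaction": (I - 1) * (J - 1)}
    df["residual"] = (n - 1) - sum(df.values())

    ms_residual = ss["residual"] / df["residual"]
    f_ratio = {k: (ss[k] / df[k]) / ms_residual for k in df if k != "residual"}
    return ss, df, f_ratio


# Made-up balanced data: 2 models x 5 lengths x 4 days.
y = [[[1.0 + 0.5 * i + 0.1 * j + 0.3 * t + 0.01 * ((7 * i + 3 * j + 5 * t) % 4)
       for t in range(4)] for j in range(5)] for i in range(2)]
ss, df, f_ratio = anova_with_block(y)
```

Each F-ratio divides the mean square of a source by the residual mean square. With missing cells the closed-form sums above no longer apply, and a general linear model fit would be needed instead.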
The response $y_{ijt}$ is calculated as follows:

$$ y_{ijt} = \log\left( \frac{1}{24} \sum_{h=1}^{24} \frac{\left| p_t^h - \hat{p}_t^h \right|}{p_t^h} \right) \tag{3.2} $$

where the forecast $\hat{p}_t^h$ of the price for day $t$ and hour $h$ has been calculated using model $i$ and estimating the multiplicative ARIMA $(p,d,q) \times (P,D,Q)$ model from the preceding observations, with the series length given by level $j$.

The models identified and estimated for the hourly time series differ depending on the activity (and consequently the consumption) of the corresponding hour. The analysis of the residuals lets us check that the model assumptions hold. In some cases ARCH effects have been detected and volatility has been modelled (García-Martos, 2006).

The forecasts $\hat{p}_t^h$ have been calculated by minimizing the expression $E\left[ \left( \hat{p}_t^h - p_t^h \right)^2 \right]$.

Working days and weekends are studied separately. Table 1 shows the ANOVA table for the design of experiments for working days. Interactions are not significant, but main effects are.

SOURCE       Sum of squares   D. f.   Mean square   F-ratio   p-value
A: day              5026.73     903          5.56    206.76       0.0
B: length              3.89       4          0.97     36.14       0.0
C: model               9.98       1          9.98    370.95       0.0
BC                     0.03       4          0.01      0.29      0.88
Residual             216.54    8043          0.03
TOTAL               5285.03    8955

Table 1. ANOVA for working days

Figure 1 shows that Model 48 is preferable for working days. From Figure 2 the preferred length for the time series used to build the models is obtained: it is the one corresponding to level 9, which means using the previous 20 weeks to forecast one day ahead.

Fig. 1. Main effects (Model, working days)

Fig. 2. Main effects (Length, working days)

For weekends, the results of the analysis of variance are shown in Table 2.

SOURCE       Sum of squares   D. f.   Mean square   F-ratio   p-value
A: day               585.63     353          1.65     18.76       0.0
B: length              0.56       4          0.14      1.59       0.0
C: model             327.64       1        327.64   3705.73       0.0
BC                     0.27       4          0.06      0.77      0.74
Residual             279.30    3159          0.08
TOTAL               1189.66    3521

Table 2. ANOVA for weekends

Figures 3 and 4 show that, for weekends, the prediction error is significantly smaller with Model 24. Figure 4 also shows that it is preferable to build the models using the previous 16, 20 or 24 weeks, with no significant difference between these levels.

Fig. 3. Main effects (Model, weekends)

Fig. 4. Main effects (Length, weekends)

The main conclusions drawn from the design of experiments are that the 20 weeks preceding the day to be forecast should be used, and that a mixed model is proposed which forecasts working days with Model 48 and weekends with Model 24.

4. Numerical results

The average daily prediction error for the full period considered (1998-2003) is slightly above 13%. Bearing in mind that forecasts have been calculated for all the hourly prices over a long and representative period (6 years), the results, in terms of prediction error, reflect the accuracy of the new mixed model proposed.

Purely as an example (since forecasts have been calculated for all the weeks in the period under study), we show here the results for two weeks selected from the whole period. The first one has been chosen to illustrate the improvement, in terms of prediction errors, over the model proposed by Contreras et al. (2003). It corresponds to the last week of May 2000 (25th-31st). Figure 5 shows the real values and the forecasts computed. Daily mean errors for this week appear in Table 3, together with the prediction errors obtained with the model proposed by Contreras et al. (2003).

Fig. 5. Real prices and forecasts, 25th - 31st May 2000 (x-axis: hours in the week; y-axis: price, euro/MWh)

                    Day 1    Day 2    Day 3    Day 4    Day 5    Day 6    Day 7
Mixed model         4.6 %    2.8 %    6.0 %   14.4 %    4.3 %    5.4 %    2.9 %
Contreras (2003)   4.73 %   4.13 %   3.71 %   6.84 %   6.03 %   6.96 %   3.41 %

Table 3. Daily prediction errors (25th - 31st May 2000)

The second week shown is the third week of April 2002. Results are shown in Fig. 6 and Table 4.

Fig. 6. Prices and forecasts, 15th - 21st April 2002 (x-axis: hours in the week; y-axis: price, euro/MWh)

                        Day 1    Day 2    Day 3    Day 4    Day 5    Day 6    Day 7
Mixed model proposed   3.06 %   2.69 %   2.13 %   2.36 %   3.02 %  11.62 %   7.23 %

Table 4. Daily prediction errors (15th - 21st April 2002)

5. Conclusions

This paper proposes several methods to forecast one-day-ahead electricity prices, corresponding to the different combinations of the levels of the factors under study. A complete study has been carried out to determine whether to forecast with Model 24 or Model 48, as well as the length of the time series to use. Based on this exhaustive analysis, a mixed model to forecast next-day electricity prices is proposed. We recommend computing forecasts for working days with Model 48 and for weekends with Model 24. We have also determined the preferred length of the time series used to build the models: 20 weeks. The numerical results provided in section 4 let us check the performance of the mixed model proposed.

References

Conejo AJ, Contreras J, Espínola R, Plazas MA (2005). Forecasting electricity prices for a day-ahead pool-based electric energy market. International Journal of Forecasting.

Contreras J, Espínola R, Nogales FJ, Conejo AJ (2003). ARIMA Models to Predict Next-Day Electricity Prices. IEEE Transactions on Power Systems.

Crespo-Cuaresma J, Hlouskova, Kossmeier S, Obersteiner M (2004). Forecasting electricity spot-prices using linear univariate time-series models. Applied Energy.

García-Martos C (2006). Short-run forecasting of electricity prices and volatilities. Working paper.

Nicolaisen JD, Richter CW, Sheblé GB (2000). Price Signal analysis for competitive electric generation companies using artificial neural networks. Conf. Elect. Utility Deregulation and Restructuring and Power Technologies.

Nogales FJ, Conejo AJ (2006). Electricity Price Forecasting through Transfer Function Models. Journal of the Operational Research Society.

Ramsay B, Wang AJ (1998). A neural network based estimator for electricity spot-pricing with particular reference to weekend and public holidays.

Rodríguez CP, Anders GJ (2004). Energy Price Forecasting in the Ontario Competitive Power System Market. IEEE Transactions on Power Systems.

Szkuta BR, Sanabria LA, Dillon TS (1999). Electricity Price Short-Term Forecasting Using Artificial Neural Networks. IEEE Transactions on Power Systems.

Troncoso A, Riquelme J, Riquelme J, Gómez A, Martínez JL (2002). A Comparison of Two Techniques for Next-Day Electricity Price Forecasting. Lecture Notes In Computer Science.