Simulation and Forecasting of Hydrological Power Generation: An Alternative Approach
J. Andrew Howe, PhD
TransAtlantic Petroleum
Presented at the Palisade EMEA 2012 Risk Conference, London
April 2012

Presentation Outline
Introduction
    What is Hydrological Power?
    Our Model
Hydro Simulation Model
    Exploratory Data Analysis
    Model Development
Model Validation
Forecasting
Concluding Remarks

Introduction: What is Hydrological Power?
◮ Hydrological power (hydro) is electricity generated when a moving source of water is dammed.
◮ Specified volumes of water are allowed to flow through a turbine, generating electricity.
◮ In some areas of the world, hydro is a significant source of cheap energy.
◮ A few examples: Norway - 99%, Brazil - 83%, New Zealand - 65%, North Korea - 56% (source: http://www.nationmaster.com).
◮ Damming river systems can also provide other benefits: flood control, irrigation, and recreation.
◮ The power that a dam can generate is limited by the amount of water in the source.
◮ On the next slide, we see a simple diagram of how power is generated from a dam.

[Figure: How power is generated from a dam. Source: http://ga.water.usgs.gov/edu/wuhy.html]

◮ Generally, power from dams is classified as baseload power.
◮ Baseload power plants are used to supply much of a region’s continuous energy demand. Common examples include nuclear, coal-fired, and hydro plants.
◮ These types of plants generally take a significant amount of time to reach peak production; they are expected to generate power at a constant rate.
◮ Baseload power plants typically have high fixed costs but low variable costs, so it is most economical to run them continuously.
◮ The difference between baseload supply and peak power demand is generally met by combustion turbine (CT) / combined-cycle (CC) gas plants or out-of-system purchases.
◮ These sources are said to be dispatchable, because they can be dispatched to meet demand.
◮ CC and CT plants are very dispatchable, as they spin up to peak production quickly; however, they have lower fixed costs and higher variable costs.
◮ While nominally a baseload source, hydroelectric power can be dispatched to some extent by controlling how many sluice gates are open and how far they are opened.

[Figure: Example power demand curve for winter.]

◮ The red curve here indicates an average power demand curve during winter months, which could be met using
    i) blue line - baseload hydroelectric, coal, nuclear
    ii) black line - gas (CT / CC)
    iii) remainder - purchases
◮ Because of its role in cheaply supplying baseload power, it is important to be able to model and forecast hydro power.
◮ Sources of variability in hydrological power generation include:
    i) weather and precipitation characteristics
    ii) scheduling and operating procedures
    iii) economic drivers of supply and demand
◮ A typical approach to this kind of modeling would be to stochastically model the meteorological and economic drivers, and deterministically model the operating procedures.
◮ Scheduling and operating procedures would generally be coded into a seasonal model that balances constraining requirements such as:
    i) reservoir levels must accommodate seasonal recreational and flood control uses
    ii) spill rates must not cause ecological damage
    iii) planned maintenance
◮ Reservoir level variability is affected by weather and precipitation in several ways, which we can see on the familiar hydrological cycle.

[Figure: The hydrological cycle. Source: http://www.wpclipart.com/science/earth/water_cycle_USGS.png.html]

Introduction: Our Model
◮ In this case study, we present an alternative approach to stochastic modeling and forecasting of hydro power generation.
◮ Our proposed approach infers the seasonal distributional and variability characteristics of the system’s drivers directly from the modeled data.
◮ Our model breaks the calendar year into seven seasons of weeks based on similarities in the distributional shapes, variabilities, and trends.
◮ Each season is stochastically modeled independently, then stitched together.
◮ Descriptive statistics measured on 50 years of both modeled and simulated data match very closely, most notably average generation, variability, and first-order correlation.
◮ There is no visible difference between simulated hydro generation histories and the modeled data.
◮ The model makes it easy to layer in deterministic components, such as trends and levels, for scenario testing.
◮ We implement a novel “library” method to match actual power generation at a specific season and forecast future results.

Hydro Simulation Model: Exploratory Data Analysis
◮ The question underlying this case study is “How can we simulate a history of power generation from dams in a way that gives us the flexibility to model various scenarios, match reality, and provide forecasts?”
◮ The source data is composed of 50 years (1954-2003) of modeled weekly hydro generation from a large dammed river system in the United States.
◮ This data, which we’ll call “actual”, was generated by a system scheduling model based on actual weather / precipitation data and dam operation procedures.

[Figure: Actual data.]

◮ Visual inspection of the data suggests a seasonal cyclical nature, which is expected.
◮ There is no other consistent pattern visible - note the irregularity of the orange 200-week moving average line.
◮ We initially focused on the cyclical behavior of the data; on the next slide, we have the autocorrelation plot.

[Figure: Weekly autocorrelations.]

◮ In this context, it is much easier to directly model just the first lag than it is the others, so we focused on the 82% 1-week correlation.
◮ After modeling the first-order autoregressive relationship, the time-ordered residuals show no sign of a trend, and variability seems constant, as seen on the next slide.

[Figure: Residuals after first-order autoregressive model.]

◮ Regression analysis of various rolling periods in this data shows that, despite periodic higher / lower levels of power generation, it remains constant on average.
◮ Thus, we justify building our @Risk simulation model to the mean shape.

[Figure: Seasonal patterns seen in five selected years of data.]
◮ Here we have plotted five evenly-spaced years of weekly power so as to visualize the seasonal patterns.
◮ Each of these years shows a similar pattern:
    i) slightly positive trend with higher generation initially
    ii) downward trend as the season progresses
    iii) prolonged flat / whipsaw period of low generation
    iv) rapid up trend as summer ends

[Figure: Average seasonal pattern suggesting seven distinct periods.]

◮ With the previously mentioned autoregressive structure, it would generally be a straightforward task to build a model around these averages.
◮ However, this problem is much more complicated than that.
◮ Besides the seasonal trends and levels observed here, the shape and scale of the weekly distributions vary dramatically from season to season.

[Figure: Distributions for weeks 9, 26, 51.]

◮ While only three examples are shown here, the general observations apply to all 52 weeks, with gradual shifts in distribution shape:
    i) February / March: Lower limit of 95% interval is highest, distribution is skewed toward higher power generation
    ii) June: 95% interval is the narrowest, with distribution more strongly skewed toward lower generation
    iii) December: Slight tendency for higher power generation, but distribution is rather uniform
◮ Directly modeling an autoregressive process with this much distributional variation can be a difficult proposition.

Hydro Simulation Model: Model Development
◮ Rather than using a smooth autoregression to model the seasonality, we break the average into the seven disjoint seasons for modeling.
◮ In each season and for each week, we used the distribution fitting functionality of @Risk to estimate and score various distributions of weekly power generation.
◮ Each distribution fit was scored using the chi-squared, Kolmogorov-Smirnov, and Anderson-Darling statistics; the distribution that was selected by the majority of the tests was used.
◮ As a point of interest, in most seasons, the same distribution was selected for each week.
◮ For each season, we estimated the slope from the average curve already shown.
◮ The next slide shows details estimated for each season.

Model details for each season.

Season  Weeks  Distribution Fit    Slope  Standard Error
1       1-9    BetaGeneral         7,390           1,651
2       10-19  Triang            −26,542           1,151
3       20-23  Loglogistic        19,242           3,702
4       24-29  Invgauss          −10,870           1,073
5       30-31  Pearson5           32,629           8,013
6       32-43  Logistic           −3,203             714
7       44-52  Weibull            16,221           1,186

◮ In real life, seasons generally change gradually.
◮ If we simulate each season independently, then stitch them together, we could end up with a very disjoint and unrealistic time series.
◮ To solve this, we first computed the difference at each season boundary, then used @Risk to select a best-fitting distribution for each boundary.
◮ These distributions are used in the simulation algorithm to adjust seasons so the boundaries are all within the appropriate 98% intervals.

Seasonal boundary characteristics.

Boundary  Distribution                           98% Interval
1-2       Loglogistic(-905946, 907310, 21.798)   −171,090   214,290
2-3       Loglogistic(-334526, 358180, 6.4176)   −159,490   398,410
3-4       Loglogistic(-4206.6, 48689, 1.8717)    −202,080   164,370
4-5       Logistic(21061, 30445)                 −118,840   160,960
5-6       Logistic(-6908.6, 19485)                −96,444   167,540
6-7       Loglogistic(-315443, 325251, 11.622)    −96,412   167,540

◮ For each year simulated, the algorithm is thus:
    i) For each season, draw a random value from the appropriate distribution; this becomes the seed for the midpoint of the season.
    ii) For each season, draw a Gaussian random value with µ = the slope estimated for that season and σ = t × the associated standard error; this becomes the slope. t is a scaling factor drawn from a triangular distribution =RiskTriang(0.5, 1, 5.5). Because of the degree of right-skew, the scale factor will increase the variability about 90% of the time. This scaling factor was tuned to match both the variability and the first-order autocorrelation.
    iii) Extrapolate each seed backward and forward in time, using the slope.
    iv) If the boundary between any two seasons falls outside of the appropriate 98% interval, generate a random value from the appropriate distribution, and adjust the boundary week.
    v) Add a translation factor generated by =RiskNormal(-15000, 35000) to each observation; like t, this was tuned to match the source data.
    vi) Finally, if any week in the raw simulated data falls outside of the range of the actual data by more than 5%, it is truncated.

◮ This @Risk model is run 1,000 times for a burn-in period, then we save the following 50 simulations to create a smooth record of simulated hydro power generation.
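
The per-year algorithm in steps i) through vi) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the @Risk workbook itself. The slopes, standard errors, and 98% intervals are taken from the two tables. The seed distribution (`draw_seed`), the uniform redraw at an out-of-range boundary, the choice to adjust the first week of the following season, the independent translation draw per week, and the `actual_min` / `actual_max` arguments are all hypothetical stand-ins, because the slides give only the fitted family names and not the weekly parameters. Note that @Risk's RiskTriang takes (min, most likely, max), while Python's `random.triangular` takes (low, high, mode).

```python
import random

# (first week, last week, slope, standard error) from the season table.
SEASONS = [
    (1, 9, 7390, 1651),
    (10, 19, -26542, 1151),
    (20, 23, 19242, 3702),
    (24, 29, -10870, 1073),
    (30, 31, 32629, 8013),
    (32, 43, -3203, 714),
    (44, 52, 16221, 1186),
]

# 98% intervals for the jump across each season boundary (boundary table).
BOUNDARY_98 = [
    (-171090, 214290),
    (-159490, 398410),
    (-202080, 164370),
    (-118840, 160960),
    (-96444, 167540),
    (-96412, 167540),
]


def simulate_year(rng, draw_seed, actual_min, actual_max):
    """Return 52 simulated weekly values following steps i) to vi).

    draw_seed(season_index, rng) is a HYPOTHETICAL callable standing in
    for the fitted per-season distributions, whose parameters are not
    given in the slides.
    """
    weeks = [0.0] * 52
    for s, (first, last, slope_mu, slope_se) in enumerate(SEASONS):
        # i) Seed value for the midpoint of the season.
        seed = draw_seed(s, rng)
        # ii) Slope ~ Normal(slope_mu, t * slope_se), t ~ Triang(0.5, 1, 5.5).
        #     For this triangular distribution P(t > 1) = 0.9.
        t = rng.triangular(0.5, 5.5, 1.0)  # Python order: low, high, mode
        slope = rng.gauss(slope_mu, t * slope_se)
        # iii) Extrapolate the seed backward and forward along the slope.
        mid = (first + last) / 2
        for w in range(first, last + 1):
            weeks[w - 1] = seed + slope * (w - mid)
    # iv) Pull each season boundary back inside its 98% interval.
    for (_, last, _, _), (lo, hi) in zip(SEASONS[:-1], BOUNDARY_98):
        jump = weeks[last] - weeks[last - 1]  # next season's first week minus this season's last
        if not lo <= jump <= hi:
            # ASSUMPTION: uniform redraw inside the interval, replacing the
            # fitted Loglogistic / Logistic boundary distribution.
            weeks[last] = weeks[last - 1] + rng.uniform(lo, hi)
    # v) Translation factor, read here as one draw per weekly observation.
    weeks = [x + rng.gauss(-15000, 35000) for x in weeks]
    # vi) Truncate values more than 5% outside the range of the actual data.
    lo_cap, hi_cap = actual_min * 0.95, actual_max * 1.05
    return [min(max(x, lo_cap), hi_cap) for x in weeks]
```

A call might look like `simulate_year(random.Random(42), lambda s, r: r.gauss(338143, 141837), 50000, 800000)`, where the Gaussian seed reuses the overall mean and standard deviation of the actual data and the range limits are placeholders chosen purely for illustration.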
◮ Computing 100 such 50-year simulations, after a single burn-in period, requires slightly less than 2 minutes on a standard desktop computer.

Model Validation

[Figure: Example simulated year.]

◮ Visual inspection of the green time series, as compared to the earlier chart showing actual years of data, suggests this could very well be from the actual data. It also models the average shape very well.
◮ Since we estimated much of the structure of our simulation from averaged data, we had to iteratively tune certain parameters so as to better match the actual data.
◮ For validation, we simulated 100 50-year histories from our @Risk model.
◮ From each simulation, we computed the mean, median, 1st and 3rd quartiles, standard deviation, interquartile range, and first-order correlation, and averaged each statistic across the simulations.
◮ These averages were then compared to the same statistics estimated from the actual data.

Comparing average characteristics of simulated and actual data.

Statistic              Actual    Simulated  Deviation
mean                   338,143   340,278     0.63%
median                 329,744   323,462    −1.91%
1st quartile           239,619   235,296    −1.80%
3rd quartile           428,598   436,725     1.90%
standard deviation     141,837   147,238     3.81%
IQR                    188,979   201,429     6.59%
1st-order correlation  0.8229    0.8317      1.08%

◮ After simulating 50 years of data, we have the capability to add deterministic components - trends, levels, and collars.
◮ At user-definable periods within the 50 years, we can add a trend (e.g., increase 5,000 MW per week) and / or a level (e.g., decrease by 100,000 MW) adjustment.
◮ We also give the modeler the ability to enforce a minimum or maximum on the entire simulated power generation history.
◮ All together, this allows us to simulate hydrological power generation and make forecasts under a variety of scenarios.
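
The deterministic adjustments described above (trend, level, and a minimum / maximum collar) can be sketched as a post-processing step on a simulated weekly series. The function name `apply_scenario` and its arguments are hypothetical. The sketch also assumes that a trend accumulates week by week and applies only inside its period, which the slides do not specify.

```python
def apply_scenario(series, trends=(), levels=(), floor=None, cap=None):
    """Layer deterministic components onto a simulated weekly series.

    trends: iterable of (start, end, per_week); adds a cumulative ramp of
            per_week for each week index in [start, end).
    levels: iterable of (start, end, shift); adds a constant shift for
            each week index in [start, end).
    floor, cap: optional collar enforced on the whole history.
    All indices are 0-based positions in the series (an assumption).
    """
    out = list(series)
    for start, end, per_week in trends:
        for i in range(start, min(end, len(out))):
            out[i] += per_week * (i - start + 1)
    for start, end, shift in levels:
        for i in range(start, min(end, len(out))):
            out[i] += shift
    if floor is not None:
        out = [max(x, floor) for x in out]
    if cap is not None:
        out = [min(x, cap) for x in out]
    return out


# Toy example: ramp of +10 per week over weeks 1-2, level shift of -50
# over weeks 0-1, and a minimum of 60 on the whole series.
adjusted = apply_scenario([100, 100, 100, 100],
                          trends=[(1, 3, 10)],
                          levels=[(0, 2, -50)],
                          floor=60)
print(adjusted)  # [60, 60, 120, 100]
```

Applying the collar last means the minimum and maximum hold for the final history regardless of which trends and levels were layered in.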
◮ As an example, we modeled a slight uptrend from mid-1999, as well as depressed generation between 1985 - 1988 and 1999 - 2002 in an attempt to match the actual data.
◮ The next slide compares 30 years from one example simulation from this scenario with the actual data.

[Figure: Which is actual and which is simulated? ↑ Actual, ↓ Simulation]

Forecasting
◮ A primary reason for modeling power generation is forecasting.
◮ Since meteorological data is not an input, how can we forecast future power generation using this model?
◮ For this purpose, we developed a novel “library” method to generate forecasts and error bands. The algorithm is:
    i) Generate a large number of simulated 50-year hydro power generation histories.
    ii) Specify several weeks of real hydro generation, along with the week number(s), and the number of forecast weeks desired, called T.
    iii) Search all simulated histories, inspecting the specified weeks and computing the difference between each simulated and real value.
    iv) Simulation histories for which the differences are smaller than a specified threshold are taken as starting points for the forecasts. The next T weeks are taken as the forecast.
    v) The average and error bands are computed from all selected sets of T weeks.
◮ We started with actual hydro generation in weeks 25 - 27 of 2009.
◮ Using a cutoff of 5%, the system identified 34 portions of simulation histories that matched.
◮ In this plot, we see the first three black datapoints that were used to match the histories. After that, the black line and red dashed lines indicate the forecast and its error bands.

[Figure: Example 17-week forecast with error bands.]
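
The “library” search in steps i) through v) can be sketched as below. This is an illustrative reading, not the presenter's implementation. It assumes each simulated history is a flat list of weekly values with 52 weeks per year. It also assumes the 5% cutoff is a relative difference that every matched week must satisfy, and that the error bands are the mean plus or minus two standard deviations. The slides do not state how the difference or the bands were computed, and the function name `library_forecast` is hypothetical.

```python
import statistics


def library_forecast(histories, observed, start_week, horizon, cutoff=0.05):
    """Match observed weeks against simulated histories and forecast.

    histories:  list of simulated histories (flat lists, 52 weeks per year)
    observed:   real values for consecutive weeks starting at start_week
    start_week: week of year (1-52) of the first observed value
    horizon:    number of forecast weeks T
    Returns (mean, lower, upper, n_matches), or None if nothing matches.
    """
    n = len(observed)
    paths = []
    for hist in histories:
        for year in range(len(hist) // 52):
            i = year * 52 + (start_week - 1)
            j = i + n
            if j + horizon > len(hist):
                continue  # not enough simulated weeks left to forecast
            window = hist[i:j]
            # ASSUMPTION: every matched week must lie within the relative cutoff.
            if all(abs(s - o) <= cutoff * abs(o) for s, o in zip(window, observed)):
                paths.append(hist[j:j + horizon])
    if not paths:
        return None
    mean, lower, upper = [], [], []
    for k in range(horizon):
        vals = [p[k] for p in paths]
        m = statistics.fmean(vals)
        sd = statistics.stdev(vals) if len(vals) > 1 else 0.0
        mean.append(m)
        lower.append(m - 2 * sd)  # ASSUMPTION: bands are mean +/- 2 sd
        upper.append(m + 2 * sd)
    return mean, lower, upper, len(paths)
```

For the example on the slides, the call would take the three observed values for weeks 25 - 27 with `start_week=25` and `horizon=17`. A looser cutoff admits more matching histories but widens the bands.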

◮ In the 17-week forecast example, almost all the out-of-sample observations fell inside the error bands.

Concluding Remarks
◮ We have presented an alternative approach to stochastic modeling and forecasting of hydro power generation.
◮ Our proposed approach infers the seasonal distributional and variability characteristics of the system’s drivers directly from the modeled data.
◮ Descriptive statistics measured on 50 years of both actual and simulated data match very closely.
◮ Our model makes it easy to layer in deterministic components, such as trends and levels, for scenario testing.
◮ Finally, we implemented a novel “library” method to match actual power generation at a specific season and forecast future results.
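
As a closing illustration of the descriptive-statistics comparison cited in these remarks, the sketch below computes the same summary statistics for a weekly series and the percentage deviation between simulated and actual values. The function names are hypothetical, and the lag-1 estimator shown is one common definition; the slides do not say which estimator was used.

```python
import statistics


def lag1_autocorrelation(x):
    """First-order (lag-1) autocorrelation of a series."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def describe(x):
    """Summary statistics used in the validation comparison."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    return {
        "mean": statistics.fmean(x),
        "median": statistics.median(x),
        "q1": q1,
        "q3": q3,
        "sd": statistics.stdev(x),
        "iqr": q3 - q1,
        "lag1": lag1_autocorrelation(x),
    }


def pct_deviation(actual, simulated):
    """Deviation of the simulated statistic from the actual one, in percent."""
    return 100.0 * (simulated - actual) / actual


# Reproducing two rows of the validation table from their published values.
print(round(pct_deviation(338143, 340278), 2))  # mean:   0.63
print(round(pct_deviation(329744, 323462), 2))  # median: -1.91
```

In the validation described earlier, a summary like `describe` would be computed for each of the 100 simulated histories, averaged statistic by statistic, and then compared with the actual data through `pct_deviation`.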