Electric Power Systems Research 174 (2019) 105858
Journal homepage: www.elsevier.com/locate/epsr

Three probabilistic metrics for adequacy assessment of the Pacific Northwest power system

John Fazio*, Daniel Hua
Northwest Power and Conservation Council, 851 SW 6th Avenue, Suite 1100, Portland, OR 97204, USA

Keywords: Power system; Probabilistic adequacy metrics; Monte Carlo simulation

Abstract

All regional power system entities within the bulk assessment areas in North America calculate and report to the North American Electric Reliability Corporation (NERC) probabilistic adequacy metrics that measure projected shortfall duration and magnitude. While no maximum thresholds have been defined for either metric to be used as an adequacy standard, it is possible that such thresholds could be defined sometime in the future as planners gain a better understanding of the risks associated with each metric and how they interact. Therefore, it is prudent to know whether setting a threshold for one metric automatically leads to an equivalent and predictable threshold for the other. To answer this question, this paper examines mathematical relationships among these two metrics and a third metric (that measures the projected shortfall frequency) for the Pacific Northwest power supply. Results from a Monte Carlo simulation model show that although in special cases the threshold for one metric can be used to calculate unique thresholds for the other two, in general this is not the case. Hence, if these metrics are to be used to define an adequacy standard for the Northwest, threshold values for all three metrics should be defined independently.

1.
Introduction

In order to integrate large amounts of variable generation into the bulk power system, the North American Electric Reliability Corporation (NERC) recommended in 2011 that its regional entities calculate the probabilistic adequacy metrics loss of load hours (LOLH) and expected unserved energy (EUE) for comparison to the traditional adequacy standard of a loss of load expectation (LOLE) of 0.1 day/year [1]. More recently, NERC has also recommended calculating the loss of load events (LOLEV) and the normalized expected unserved energy (NEUE) metrics, among others, for probabilistic adequacy studies [2].

Definitions of LOLEV, LOLH, EUE and NEUE, and the equations to calculate them, can be found in [2]. A brief description of these metrics is as follows. LOLEV, a frequency metric, is the expected (or average) number of shortfall events per year, where a shortfall event is defined as a contiguous set of hours when load exceeds generating capacity. LOLH, a duration metric, is the expected number of hours per year when load exceeds generating capacity. Finally, EUE, a magnitude metric, is the expected amount of unserved energy (or the average sum of the positive differences between hourly load and generating capacity) per year, in units of megawatt-hours per year. Closely aligned with EUE is NEUE, a dimensionless magnitude metric in units of parts-per-million (ppm) per year, which is defined as the ratio of EUE to the annual load (in megawatt-hours) multiplied by one million. Like LOLEV and LOLH, NEUE can be compared directly across power systems serving vastly different loads.

Currently all assessment areas are required to calculate and report LOLH and EUE to NERC for publication [3]. However, along with LOLH and EUE, the duration and magnitude metrics, a more complete adequacy assessment should also include the frequency metric LOLEV, which allows for a different but also important measure of risk.
These three different statistical measures of power supply shortfall quantify three different types of risks and enable power system planners to design effective and economical solutions to mitigate only those risks that are deemed unacceptably severe even as other risks remain tolerable. Acceptable levels of shortfall risks, or adequacy standards, are set by planners in the form of thresholds for one or more metrics, for example, the traditional “one-day-in-ten-year” threshold for the loss of load expectation (LOLE) metric (often translated into 0.1 day/year). Although no assessment area currently uses an adequacy standard based on any of the three metrics, it is possible that in the future one or more could be used to define an adequacy standard, especially as the relationships among these metrics and the risks they quantify become more familiar from the annually published NERC reports. Therefore, it is prudent to know if setting a threshold for one metric automatically leads to equivalent and predictable thresholds for the other two, in other words, whether an adequacy standard can simply be defined by setting a limiting threshold for just one of these metrics.

* Corresponding author. E-mail address: jfazio@nwcouncil.org (J. Fazio).
https://doi.org/10.1016/j.epsr.2019.04.036
Received 13 November 2018; Received in revised form 17 April 2019; Accepted 26 April 2019; Available online 09 May 2019
0378-7796/ © 2019 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

On the demand side, in 2015 the maximum winter and summer hourly loads were 30,100 MW and 29,000 MW, respectively, and were calculated from the Federal Energy Regulatory Commission Form 714 data for various Balancing Authorities inside the PNW [9].
To determine whether a threshold on one metric fixes the thresholds for the other two, this paper calculates the values of these three metrics under varying resource and load conditions and analyzes mathematical relationships among them from a set of adequacy studies for the Pacific Northwest power system. The studies were performed using an adequacy model based on Monte Carlo chronological hourly simulations. Analyses in this paper show that under special conditions any pair of the three metrics can be fitted, with very good accuracy, by a single linear function. Thus, for these special scenarios, the threshold for one metric can be used to calculate unique thresholds for the other two. However, in general, this is not the case. A specified threshold for one metric will lead to different resulting thresholds for the other two metrics in separate scenarios, depending on how diverse the resource mixes and loads are between scenarios. Therefore, if these metrics are to be used to set an adequacy standard for the Pacific Northwest, the thresholds for all three metrics (frequency, duration and magnitude) should be assigned independently.

This paper is organized as follows. Section 2 presents an overview of the resource and load characteristics of the Pacific Northwest (PNW) power system. Section 3 has three subsections: Section 3.1 briefly describes the GENESYS adequacy model and the Monte Carlo hourly simulations used within it; Section 3.2 presents the equations for calculating the LOLEV, LOLH and NEUE metrics from GENESYS output; and Section 3.3 details the varying loads and resources of the twelve scenarios studied (from which the metrics are calculated). Section 4 presents the three metrics calculated for the twelve scenarios, with mathematical relationships between pairs of metrics analyzed and discussed in Sections 4.1–4.3. Section 5 provides a summary and conclusion.

3.
Methodology

Both analytical methods and Monte Carlo simulations can be used to calculate probabilistic adequacy metrics [2], with computation times for analytical methods being significantly shorter than those for Monte Carlo simulations. However, for power supplies with energy-limited resources, such as hydroelectric systems, chronological Monte Carlo simulations are more often used because they can directly account for the time dependency of hydroelectric operations. More specifically, the available capacity of hydroelectric resources in any time period depends on their operation in previous periods and on time-dependent constraints, such as required minimum off-peak generation to provide balancing reserves.

In spite of not being able to model time-dependent operations, analytical methods can still be used to assess the adequacy of power supplies with energy-limited resources, if the dispatch of those resources can be reasonably approximated. For example, hydroelectric systems that have a high ratio of storage volume relative to river flow (inflow) volume can be modeled such that average monthly hydroelectric energy remains fairly constant from year to year. For hydroelectric systems of this type, monthly hydroelectric energy is assumed to follow hourly load shapes, adjusted for time-dependent constraints. The resulting approximate hydroelectric dispatch is used to modify either the load duration curve [10] or the capacity availability curve [11], which are then convolved to create the surplus probability density function from which adequacy metrics can be calculated [2]. However, for hydroelectric systems with a low ratio of storage volume relative to inflow volume, such as that in the PNW (see Section 2), monthly energy can vary significantly from year to year, thus making analytical methods more difficult to apply.
For that reason, and because the PNW hydroelectric system accounts for over half of the region's nameplate capacity and has significant time-dependent constraints [8], a Monte Carlo simulation model is used to calculate adequacy metrics.

2. The Pacific Northwest Power System

The Pacific Northwest (PNW) power supply, as defined in the Pacific Northwest Electric Power Planning and Conservation Act [4], consists of generating resources committed to serving loads in the geographical area mainly in the states of Washington, Oregon, Idaho and the part of western Montana that lies within the Columbia River Basin. As of 2015, the Pacific Northwest power supply had a nameplate capacity (operating and under construction) of 62,300 MW, and was dominated by hydroelectric capacity at 53% (33,200 MW), followed by natural gas, wind and coal at 16% (9900 MW), 14% (9000 MW) and 12% (7300 MW), respectively. Nuclear and biomass generating capacities each made up 2% (1200 MW), while geothermal, solar and storage comprised the remaining 1% (500 MW) [5].

Although the hydroelectric system includes 60 major U.S. dams with a total of 33,200 MW of nameplate capacity, it can only provide about 26,000 MW of sustained-peaking capacity (based on the critical water-year 1937 conditions) [6] due to limited storage (U.S. reservoirs can only store about 16% of the annual average river-flow volume as measured at The Dalles [7]). Hydroelectric sustained-peaking capacity depends on a number of variables, including river-flow volume, non-power constraints and peak-load duration. The 26,000 MW of sustained-peaking capacity referenced here was estimated assuming a 10-h peak-load duration and the lowest river-flow volume on record. Hydroelectric generation simulated in the adequacy model includes 76 regulated projects and a further set of 71 independent (not regulated within GENESYS) projects, the vast majority of which are listed in [6].
In addition, the hydroelectric system operates to satisfy many constraints (many listed in [8]) which include providing decremental balancing reserves for wind resources by increasing minimum generation between 1400 MW and 1600 MW during off-peak hours. It should be noted that a substantial amount of wind capacity (about 3000 MW out of 9000 MW) is dedicated to serving out-of-region load and does not contribute to regional adequacy.

3.1. The GENESYS adequacy model

The Northwest Power and Conservation Council uses the GENESYS model [12] to assess resource adequacy. GENESYS is a Monte Carlo computer program that performs a chronological hourly simulation of the Pacific Northwest power supply for a single operating year (October through the following September). Thousands of simulations are run, with each simulation drawing a different combination of four random variables: (1) temperature-sensitive loads, (2) temperature-sensitive wind generation, (3) generator forced outages and (4) unregulated river flows, each of which is briefly described below.

Loads. Temperature-sensitive hourly loads for a specific future year are produced by the Council's econometric load forecasting model, which uses historical data to project future load growth and energy efficiency savings. The model creates 77 sets of 8760 temperature-sensitive hourly loads based on 77 years of historical daily average temperatures at the four major load centers in the region (Seattle, Portland, Spokane and Boise) [13]. At the beginning of each simulation, all of the year's hourly loads are fixed by drawing one from the set of 77 possibilities based on 77 years of historical observed temperatures. The probability density of the temperature-years is uniform since each year occurred once. Hourly loads are then adjusted for firm out-of-region contracts (e.g. exported energy is added and imported energy subtracted).
Wind generation (see below), which is modeled as a load-reduction resource, is also subtracted from the load.

Wind. Most of the wind generation in the Northwest (located in the Columbia River Gorge) has been shown to have some correlation to temperature [14]. More precisely, as temperatures go to extremes (very

Fig. 1. Ranked monthly regional temperature vs ranked monthly flows at TDA for 1929–2008.

beginning of each month, the GENESYS model estimates the amount of hydroelectric energy to be dispatched based on its assigned operating cost relative to the operating costs of other non-hydroelectric resources. The dispatched monthly hydroelectric energy is then proportionally allocated across each day of the month based on daily load shapes. The hourly hydroelectric generation is limited by total system maximum sustained-peaking capability and total system off-peak minimum generation limits. Additional hourly hydroelectric generation could be dispatched, above the allocated amount, if all other resources are fully dispatched and a shortfall still exists. At the end of each month, the final amount of hydroelectric energy dispatched is used to adjust initial reservoir elevations for the next month. This method results in 8760 h of hydroelectric generation.

The four random variables used in GENESYS encompass uncertainties in load forecasting, wind generation, thermal outages and hydroelectric generation that are consistent with many of those recommended by NERC to consider in adequacy studies [17]. Of the four random variables, past analyses have shown that low river flows (which affect hydroelectric generation) and extreme temperatures (which affect loads) are the most common causes of shortfall events. Thus, it is worthwhile to investigate possible correlations between historical temperatures and water conditions. Fig.
1 shows a scatter plot for ranked monthly average regional temperatures versus ranked monthly average flows at The Dalles (TDA), here chosen to represent water conditions for the entire Columbia River Basin, for the historical record from 1929 to 2008. It is clear from this figure that monthly flows at The Dalles, on average, are higher for regional temperatures above the 90th percentile than for those temperatures below the 10th percentile (the two deciles of extreme temperatures). However, the flow variations within each temperature percentile group are quite large: they range from 0.1th to 75th percentiles for temperatures below the 10th percentile, while for temperatures above the 90th percentile, they range from 30th to 90th percentiles. Therefore, due to the large variations shown in Fig. 1 and for simplicity, this paper assumes that river flows and temperatures are not correlated.

Typically, a Monte Carlo analysis would draw a different combination of random variables for each simulation and run a sufficient number of simulations to ensure convergence of the output parameters of interest, in this case the three adequacy metrics. However, because unregulated river flows and temperature-sensitive loads are the dominant uncertainties that determine power system adequacy and because their respective distributions are uniform, all combinations of these two random variables are drawn for each study. And because the total number of combinations of temperature-year (77) and water-year (80)

hot or very cold), wind generation tends to be low. Using historical temperatures and observed wind generation, bootstrap statistical methods [14] are used to create synthetic hourly capacity factors for all 77 temperature-year profiles. Furthermore, to better capture the effects of wind generation uncertainty, 20 possible sets of hourly capacity factors are created for each temperature-year profile.
Thus, at the beginning of each simulation, after the temperature-year profile is chosen, all 8760 hourly wind capacity factors are determined by drawing from one of the 20 possible sets for that temperature year. The probability density of the 20 possible sets of wind capacity factors (for each temperature year) is uniform. Hourly wind generation is obtained by multiplying the hourly capacity factor by the amount of installed wind nameplate capacity dedicated to serve regional load and, as mentioned above, is then subtracted from the hourly loads.

Forced outage. In contrast to load and wind generation, whose hourly values for the entire year are selected at the beginning of each simulation, thermal generator forced outages are accounted for dynamically during the simulation based on a two-step Markov process. During each simulation hour the state of each generator is assessed. A generator that is “on” is assigned a future time to fail by equating a random number draw to the cumulative probability function of mean-time-to-failure derived from a log-normal distribution. Similarly, a generator that is “off” is given a future time to return to service by equating a random number draw to the cumulative probability function of mean-time-to-repair, also derived from a log-normal distribution. The log-normal probability distribution is commonly used to model generator availability in the power industry. References [15,16] contain more details. This process dynamically determines thermal forced outages throughout the 8760 h of each simulation. In contrast, hydroelectric generator forced outages and maintenance are accounted for by applying fixed monthly availability factors.

Unregulated river flows. Finally, unregulated inflows at all hydroelectric projects in GENESYS are drawn from a set of 80 historical water-year profiles. Similar to the loads based on the 77 historical temperature-years, the probability density of the 80 historical water-years is also uniform.
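The forced-outage logic described above for thermal generators can be illustrated with a minimal sketch. This is not the GENESYS implementation: the function names, the shared shape parameter `sigma`, the rounding of durations to whole hours, and the mean-time-to-failure and mean-time-to-repair values are all hypothetical. The sketch samples log-normal durations with Python's `random.lognormvariate`, which is equivalent in distribution to equating a uniform random draw to the log-normal cumulative probability function.

```python
import math
import random

HOURS_PER_YEAR = 8760

def lognormal_mu(mean_hours, sigma):
    """Return mu such that a log-normal(mu, sigma) variable has the given mean."""
    return math.log(mean_hours) - 0.5 * sigma ** 2

def simulate_availability(mttf_hours, mttr_hours, sigma, rng, n_hours=HOURS_PER_YEAR):
    """Return a list of n_hours booleans, True when the generator is available.

    The unit alternates between an "on" state (duration drawn from the
    time-to-failure distribution) and an "off" state (duration drawn from
    the time-to-repair distribution). All parameter values are illustrative.
    """
    mu_fail = lognormal_mu(mttf_hours, sigma)
    mu_repair = lognormal_mu(mttr_hours, sigma)
    available = []
    is_on = True  # assumption: every unit starts the year in service
    while len(available) < n_hours:
        mu = mu_fail if is_on else mu_repair
        # Duration of the current state in whole hours, at least one hour.
        duration = max(1, round(rng.lognormvariate(mu, sigma)))
        available.extend([is_on] * duration)
        is_on = not is_on
    return available[:n_hours]

rng = random.Random(0)
trace = simulate_availability(mttf_hours=1000.0, mttr_hours=50.0, sigma=0.5, rng=rng)
forced_outage_hours = trace.count(False)
print(f"forced outage hours in the simulated year: {forced_outage_hours}")
```

Drawing the whole state duration at each transition, rather than testing for failure every hour, keeps the sketch short while producing the same kind of hourly availability trace.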
Also, just as the hourly loads and wind generation for the entire year are set at the beginning of each simulation by the temperature-year selection, unregulated monthly inflows at each hydroelectric project for the year are determined by the selected historical water-year profile. For each project, the selected inflow data also include monthly operating guidelines such as maximum (flood control) and minimum end-of-month elevation limits, ramping rates, maximum and minimum outflow limits and other operating constraints. From these data, monthly generation at each project is calculated and then summed to get the total hydroelectric system generation. At the

the loss of load probability (LOLP) where LOLP = N/Ntotal, to determine adequacy for which LOLP ≤ 0.05. The LOLP as defined has a shortcoming in that simulations with different shortfall events, durations and magnitudes all contribute equally.

variables is relatively small at 77 × 80 = 6160, it is still practical to perform all these simulations (years), totaling 54 million hours of dispatch, and requiring about 3 hours of run time on a typical business desktop personal computer.

The LOLEV, LOLH and NEUE metrics are calculated in Section 4 for multiple scenarios with differing loads and resources. For these scenario analyses, resource and load variations are selected to reflect forecast uncertainties, details of which are discussed in Section 3.3. Then in Sections 4.1–4.3, mathematical relationships among the three metrics for the scenarios are analyzed to determine if one of the three metrics can be used to calculate (i.e. predict) the other two. However, the equations for calculating the metrics from GENESYS output are presented first in the next section.

3.3. Scenarios

LOLEV, LOLH and NEUE can be calculated directly from GENESYS output using Eqs. (1)–(4), respectively.
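The exhaustive enumeration of temperature-year and water-year combinations described in Section 3.1 can be sketched as follows. The function name, the seed and the tuple layout are hypothetical choices made for illustration; only the counts (77 temperature years, 80 water years, 20 wind sets, 8760 hours) come from the text.

```python
import itertools
import random

N_TEMPERATURE_YEARS = 77
N_WATER_YEARS = 80
N_WIND_SETS = 20
HOURS_PER_YEAR = 8760

def enumerate_simulations(seed=0):
    """Return one (temperature_year, water_year, wind_set) tuple per simulation.

    Every temperature-year/water-year pair appears exactly once; the wind set
    is drawn uniformly from the 20 sets available for the temperature year.
    """
    rng = random.Random(seed)
    simulations = []
    for temp_year, water_year in itertools.product(
        range(N_TEMPERATURE_YEARS), range(N_WATER_YEARS)
    ):
        wind_set = rng.randrange(N_WIND_SETS)
        simulations.append((temp_year, water_year, wind_set))
    return simulations

simulations = enumerate_simulations()
n_total = len(simulations)
total_hours = n_total * HOURS_PER_YEAR
print(n_total, total_hours)
```

The product gives 6160 simulations and 53,961,600 simulated hours, which matches the roughly 54 million hours of dispatch quoted in the text.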
Mathematical relationships between pairs of the three metrics are investigated in Section 4 for 12 scenarios used for the Pacific Northwest Power Supply Adequacy Assessment for 2022, prepared by the Northwest Power and Conservation Council [19]. The scenarios form a set of sensitivity studies comprising combinations of three load levels and four resource levels to reflect forecast uncertainties, details of which are discussed below.

The three forecasted load levels are high (H), medium (M) and low (L). The M hourly loads are the forecasted PNW regional hourly loads for 2022, with the H loads being 3% higher and the L loads being 3% lower. More specifically, if {Mi} is the set of hourly loads for the M level, then {1.03Mi} is the set of hourly H-loads, while {0.97Mi} is the set of hourly L-loads. The three hourly loads {Hi}, {Mi} and {Li} can be expressed as α{Mi} where α = 1.03, 1.00 and 0.97, respectively. It should be noted that, as defined, all three load levels have the same hourly load shape. The 2022 forecasted M loads, which incorporate 77 years of historical temperature variations, have winter peak-hour loads (in MW) in the range (29,900, 43,200) and summer peak-hour loads in the range (27,500, 30,300) [19]. These loads include projected cumulative peak-hour energy efficiency savings of 2600 MW during winter and 1680 MW during summer since 2015. In comparison, the observed peak winter and summer loads for 2015 (from Section 2) were 30,100 MW and 29,000 MW, respectively.

On the other hand, the four forecasted resource levels A, B, C and D represent increasing amounts of available import capacity (modeled as a very high heat rate thermal resource) expressed as a percentage of L̄max = 35,000 MW, the average winter peak-hour load (calculated from the set of forecasted M winter peak-hour loads discussed in the previous paragraph).
The increases in import capacity relative to resource A, for resources A, B, C and D, are equal to 0.0%, 1.4%, 2.9% and 4.0% of the average peak-hour load L̄max, respectively. The hourly resource {Ai} contains the forecasted available generation for the year 2022 with 2000 MW of import capacity. The four hourly resources {Ai}, {Bi}, {Ci}, {Di} can then be expressed as {Ai} + R × L̄max where R = 0.000, 0.014, 0.029 and 0.040, respectively. For example, hourly resource {Bi} = {Ai} + (0.014)L̄max. As defined, it is clear that the four resource levels also have the same hourly shape.

Compared to resources in 2015 discussed previously in Section 2 and excluding the import capacity, the 2022 forecasted generating resources have a net reduction of approximately 500 MW of nameplate capacity. The net change is due to actual and planned retirement of approximately 2500 MW of generation, of which the majority, 2300 MW, is coal, and approximately 2000 MW of additional installed and planned generation, of which 49% is wind, 28% solar and 23% natural gas [20,21].

The three metrics in Eqs. (1)–(4) can then be symbolically expressed as outputs of a vector function G of hourly loads and resources represented in the GENESYS model,

3.2. Equations for LOLEV, LOLH and NEUE

For each scenario, GENESYS records all relevant data for each hour when generation fails to meet load. All three adequacy metrics can be calculated from a subset of this data, which lists rows of $(i, h_{ij}, c_{ij})$, where $i \in \{1, \ldots, 6160\}$ is the simulation index, $h_{ij}$ is the jth curtailment hour and $c_{ij}$ is the jth curtailment magnitude for simulation i. Let $\mathcal{I} = \{i_1, \ldots, i_N\}$ be the set of N simulation indices that contain at least one curtailment. For each simulation index $i \in \mathcal{I}$, let $\mathcal{H}_i = \{h_{i1}, \ldots, h_{ij_i}\}$ represent the curtailment hours and $\mathcal{C}_i = \{c_{i1}, \ldots, c_{ij_i}\}$ the corresponding curtailment magnitudes, where $j_i$ is the number of curtailment hours for simulation i.
$$(\mathrm{LOLEV}, \mathrm{LOLH}, \mathrm{NEUE}) = G\left(\alpha\{M_i\},\ \{A_i\} + R\,\bar{L}_{\max}\right) \tag{5}$$

where various other inputs of GENESYS that remain constant among the scenarios are not shown explicitly in G. The metrics are calculated for the twelve scenarios by varying over combinations of α and R. It should be noted that variations among the values of α are small, as are those of R, which as scenario inputs for G in Eq. (5) can determine the type of mathematical relationships that exist among the metrics, which

Calculations for the metrics LOLEV, LOLH and NEUE, previously defined in Section 1, are presented below. Before calculating LOLEV, it is necessary to obtain the total number of curtailment events for simulation i by grouping all the curtailment hours of simulation i (the set $\mathcal{H}_i$) into $k_i$ subsets $(\{h_{i1}, \ldots\}_1, \ldots, \{\ldots, h_{ij_i}\}_{k_i})$, where all hours in each subset are contiguous, and hours from different subsets are not contiguous. $k_i$ is then the number of curtailment events for simulation i. LOLEV is the expected (or average) number of curtailment events over all simulations,

$$\mathrm{LOLEV} = \sum_{i=i_1}^{i_N} k_i / N_{\mathrm{total}} \tag{1}$$

where $N_{\mathrm{total}} = 6160$ is the total number of simulations. Next, LOLH, the expected number of curtailment hours over all simulations, is calculated as

$$\mathrm{LOLH} = \sum_{i=i_1}^{i_N} j_i / N_{\mathrm{total}} \tag{2}$$

where $j_i$ is the number of curtailment hours for simulation i, defined previously. In contrast to the simple calculations for LOLEV and LOLH, two intermediate quantities are required for calculating NEUE. The first is the total unserved energy $E_i$ for simulation i, obtained by summing all the hourly curtailment magnitudes of simulation i (the set $\mathcal{C}_i$),

$$E_i = \sum_{j=1}^{j_i} c_{ij}$$

and the second is EUE, the expected unserved energy over all simulations,

$$\mathrm{EUE} = \sum_{i=i_1}^{i_N} E_i / N_{\mathrm{total}}. \tag{3}$$

Then NEUE is just EUE expressed as a fraction of annual load energy (in units of parts per million or ppm), and calculated as

$$\mathrm{NEUE} = \frac{\mathrm{EUE} \times 10^{6}}{\bar{L} \times 8760} \tag{4}$$

where $\bar{L}$ is the annual average load averaged over all 77 temperature years.
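Eqs. (1)–(4) can be illustrated with a short sketch that computes the metrics from rows of (simulation index, curtailment hour, curtailment magnitude). The function name, the record layout and the toy data are hypothetical, and the sketch assumes that each (simulation, hour) pair appears at most once and that magnitudes are in MW over one-hour steps, so that summing them gives MWh.

```python
from collections import defaultdict

HOURS_PER_YEAR = 8760

def adequacy_metrics(records, n_total, avg_load_mw):
    """Compute LOLEV, LOLH, EUE and NEUE from curtailment records.

    records: iterable of (sim_index, hour, curtailment_mw) rows.
    n_total: total number of simulations, including those with no curtailment.
    avg_load_mw: annual average load in MW.
    """
    hours_by_sim = defaultdict(list)
    energy_by_sim = defaultdict(float)
    for sim, hour, mw in records:
        hours_by_sim[sim].append(hour)
        energy_by_sim[sim] += mw  # E_i: one-hour steps, so MW sums to MWh

    total_events = 0
    total_hours = 0
    for hours in hours_by_sim.values():
        hours = sorted(hours)
        total_hours += len(hours)  # j_i
        # k_i: one event for the first hour plus one for every gap in the hours.
        total_events += 1 + sum(1 for a, b in zip(hours, hours[1:]) if b - a > 1)

    lolev = total_events / n_total                         # Eq. (1)
    lolh = total_hours / n_total                           # Eq. (2)
    eue = sum(energy_by_sim.values()) / n_total            # Eq. (3)
    neue = eue * 1e6 / (avg_load_mw * HOURS_PER_YEAR)      # Eq. (4)
    return {"LOLEV": lolev, "LOLH": lolh, "EUE": eue, "NEUE": neue}

# Toy data: simulation 1 has two events (hours 10-11 and hour 20),
# simulation 3 has one event; simulations 0 and 2 have no curtailment.
toy_records = [(1, 10, 100.0), (1, 11, 50.0), (1, 20, 30.0), (3, 5, 20.0)]
metrics = adequacy_metrics(toy_records, n_total=4, avg_load_mw=1000.0)
print(metrics)
```

In the toy data, three events and four curtailment hours over four simulations give LOLEV = 0.75 and LOLH = 1.0, and 200 MWh of unserved energy gives EUE = 50 MWh.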
Currently the PNW region uses a threshold on a single metric [18]:

are varied (e.g. family H contains scenarios HA, HB, HC and HD, all with load level H and varying resources A, B, C and D). Because of the small number of scenarios, using higher-order polynomials in least-squares fitting would result in a perfect fit, since all the RMSEs would be zero. However, as discussed in the previous section, the excellent (though not perfect) linear fit for each family is not unexpected because resource variations, expressed by values of R in Table 1, are relatively small.

It is worthwhile to discuss the linearity between LOLEV and NEUE further. By adding a large enough import capacity, for example, having R = 1.0, scenarios for all three families of loads (H, M, L) will converge to the completely adequate (but impractical) scenario O plotted at the origin in Fig. 2, where LOLEV = LOLH = NEUE = 0. It is not necessary to run GENESYS for scenario O to calculate these three metrics, since when one is zero the other two must also equal zero. It should be noted in Fig. 2 that none of the fitted lines, if extrapolated, would intersect scenario O at the origin, since none of their intercepts is close to zero. This leads to the conclusion that as resource variations become large, as between R = 1 and R ∈ {0.000, 0.014, 0.029, 0.040}, defined previously in Section 3.3, the relationship between these two metrics, for fixed load, can no longer be accurately approximated by a linear function.

In addition, the fitted line for each fixed-load family in Fig. 2 can be expressed as NEUE = aj × [LOLEV] + bj for j = (L, M, H), where parameters aj and bj are obtained from least-squares analyses of NEUE and LOLEV values in each family. Hence aj and bj are functions of the metrics, and by Eq. (5), also functions of loads and resources. However, Fig.
2 implies that aj and bj remain constant for each family of fixed loads and small variations in resources.

An interesting observation from Fig. 2 is that the metrics for the H family are much higher than those for the M and L families, even though the changes in load from M to H and from M to L are the same size at +3% and −3%, respectively. For NEUE, this behavior is due to the distribution of unserved energy magnitudes shown in Fig. 3 for scenarios HB, MB and LB: there are overwhelmingly more unserved energies of smaller magnitudes than of larger magnitudes (note the log scale for the frequency axis). From regression analysis, histograms for the frequency of unserved energy can be roughly represented by $f \propto e^{-\beta E}$ for fitting parameter β. Furthermore, since the vast majority of shortfalls occur in winter, the ±3% hourly load difference can be roughly approximated by applying the percentage to the average M winter peak-hour load of 35,000 MW, which results in ∼±1000 MW. Thus, in Fig. 3 as suggested by the arrow, the histogram in the (8000, 9000) MWh bin for HB (blue histogram) can be reasoned to shift leftward to become approximately the histogram in the (7000, 8000) MWh bin for MB (brown histogram). The same reasoning also results in all histograms shifting leftward by ∼1000 MW from HB to MB and from MB to LB. Therefore the three sets of histograms can be approximately represented by $f_H \propto e^{-\beta E}$, $f_M \propto e^{-\beta(E+\Delta)}$ and $f_L \propto e^{-\beta(E+2\Delta)}$, where Δ ≈ 1000 MW. Finally, the total unserved energy for each set of loads can be approximated by a sum-product of the histograms and the bins, $T = \sum_i [E_i \times f(E_i)]$. Since each shift multiplies f by $e^{-\beta\Delta}$, $T_M \approx e^{-\beta\Delta} T_H$ and $T_L \approx e^{-\beta\Delta} T_M$, which leads to $(T_H - T_M) \approx (T_M - T_L)\,e^{\beta\Delta} \gg (T_M - T_L)$. This is consistent with the observation that NEUE for the H scenarios is much higher than for the M and L scenarios in Fig. 2. Similarly, analysis of histograms of shortfall events per simulation also explains the large differences in LOLEV in Fig. 2.

Next, Fig. 4 shows the same twelve LOLEV and NEUE data points as plotted in Fig.
2 but, in contrast, linear least-squares fits are performed over families of fixed resource levels while loads are varied (e.g. family A contains scenarios HA, MA, LA, all with fixed resource level A and varying loads H, M and L). The four linear fits have NEUE RMSE values of 0.67 ppm, 0.33 ppm, 0.27 ppm and 0.07 ppm for the families of resource level A, B, C and D, respectively. The dotted linear fit lines in Fig. 4 show the good fit between LOLEV and NEUE for the four families. Similar to the discussion of Fig. 2, the excellent (though not perfect) linear fit for each family is due to load variations, expressed by α in

Table 1
Adequacy metrics for 12 scenarios as a combination of loads (L, M, H) and resources (A, B, C, D).

Scenario   α      R       LOLEV (events/year)   LOLH (h/year)   NEUE (ppm)
LA         0.97   0.000   0.07                  0.84            7.6
LB         0.97   0.014   0.05                  0.61            4.9
LC         0.97   0.029   0.04                  0.44            3.1
LD         0.97   0.040   0.03                  0.32            2.1
MA         1.00   0.000   0.20                  2.12            16.7
MB         1.00   0.014   0.16                  1.64            11.6
MC         1.00   0.029   0.13                  1.32            8.2
MD         1.00   0.040   0.12                  1.16            6.2
HA         1.03   0.000   0.93                  9.64            57.5
HB         1.03   0.014   0.80                  8.13            45.0
HC         1.03   0.029   0.73                  7.31            37.5
HD         1.03   0.040   0.70                  6.98            33.7

are explored in the next section.

4. Simulation results and analysis

The LOLEV, LOLH and NEUE metrics for all twelve scenarios are calculated using Eqs. (1)–(4) from GENESYS output data and summarized in Table 1. These scenarios cover a very wide range of adequacy levels, with LOLEV values ranging from 0.03 to 0.93 events per year (or equivalently 0.3 to 9.3 events per 10 years). Mathematical relationships between the pairs (LOLEV, NEUE), (LOLEV, LOLH) and (LOLH, NEUE) are explored in Sections 4.1–4.3, respectively, using least-squares analysis.

4.1. Relationship between NEUE and LOLEV

In this section, least-squares analyses are performed on the scenarios listed in Table 1 to explore the mathematical relationship between NEUE and LOLEV. Fig. 2 shows the correlation between NEUE and LOLEV, with each of the twelve data points labeled by load and resource level according to Table 1.
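The least-squares analysis of this section can be reproduced approximately from the rounded values in Table 1. The sketch below fits NEUE against LOLEV for each fixed-load family and reports the root-mean-square error. The function name is hypothetical, and the RMSE is assumed here to be the square root of the mean squared residual over the points in a family; the text does not state which normalization was used.

```python
import math

# (LOLEV, NEUE) pairs from Table 1, ordered A, B, C, D within each load family.
FAMILIES = {
    "L": [(0.07, 7.6), (0.05, 4.9), (0.04, 3.1), (0.03, 2.1)],
    "M": [(0.20, 16.7), (0.16, 11.6), (0.13, 8.2), (0.12, 6.2)],
    "H": [(0.93, 57.5), (0.80, 45.0), (0.73, 37.5), (0.70, 33.7)],
}

def linear_fit(points):
    """Ordinary least-squares line y = a*x + b; returns (a, b, rmse)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    # Assumed RMSE definition: sqrt of the mean squared residual over n points.
    rmse = math.sqrt(sum((y - (a * x + b)) ** 2 for x, y in points) / n)
    return a, b, rmse

fits = {name: linear_fit(points) for name, points in FAMILIES.items()}
for name, (a, b, rmse) in fits.items():
    print(f"family {name}: NEUE = {a:.1f} * LOLEV + {b:.2f}, RMSE = {rmse:.2f} ppm")
```

Because Table 1 lists rounded metric values, fits computed this way need not match figures derived from the unrounded GENESYS output exactly.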
It is easily seen that both metrics display the expected behavior: as load is lowered or as resources are added, both metrics decrease. For example, for fixed load level H, both metrics decrease as resources are added to the system, as seen from scenarios HA to HD. Similarly, for fixed resource level A, both metrics decrease as load is lowered, as is evident from scenarios HA to MA to LA. It also turns out that linear least-squares fits over subsets (families) of scenarios with fixed load (i.e. the three load levels H, M and L) all have small root-mean-square errors (RMSE) for NEUE, of 0.40 ppm, 0.28 ppm and 0.16 ppm, respectively. The dotted linear least-squares fit lines in Fig. 2 show the very accurate fit between LOLEV and NEUE for the three families of fixed load levels (H, M, L) as resources (A, B, C, D)

Fig. 2. NEUE vs. LOLEV. Dotted lines are linear least-squares fits for the three families of scenarios with fixed load (H, M, L) and varying resources (A, B, C, D).

Fig. 3. Histogram of unserved energy for scenarios HB (blue), MB (brown) and LB (light blue). As hourly loads decrease from HB to MB to LB, the respective unserved energy histograms shift leftward. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. NEUE vs. LOLEV. Dotted lines are linear least-squares fits for the four families of scenarios with fixed resources (A, B, C, D) and varying loads (H, M, L). Scenario A1 was found from iterative GENESYS runs with fixed resource level A and varying loads to obtain LOLEV = 0.51.

Fig. 5. NEUE vs. LOLEV for families L and M. Dotted lines are linear least-squares fits for the two families. Scenarios L1 and M1 are iterated from LA and MD, respectively, with appropriate resources to achieve LOLEV = 0.1 (see the R values in Table 2).

Table 1, being relatively small. Figs.
2 and 4 show that for a family of scenarios with either fixed loads or fixed resources, a linear least-squares fit produces a function of LOLEV from which reasonably accurate values of NEUE can be calculated using existing data from Table 1. For example, in Fig. 4, the blue dotted fit line for family A suggests that for LOLEV = 0.51, NEUE = 33.6 ppm. Indeed, iterative GENESYS runs found that the input parameters α = 1.02 and R = 0.000 (a member of family A) produce scenario A1 with metrics (LOLEV, NEUE) = (0.51, 34.4), which is plotted in Fig. 4 as the solid blue diamond. Relevant parameters for scenario A1 are listed in Table 2. The difference of 0.8 ppm between the simulated NEUE (34.4) and the fit-line interpolation (33.6) is comparable to the RMSE of 0.67 ppm for the family-A fit line. Even though only one example has been shown, this paper assumes that scenarios exist with LOLEV and NEUE close to interpolated values on the fit lines.

For adequacy assessments in general, scenarios with both varying loads and varying resources should be considered. An example is the combination of families M and L in Fig. 2, a magnified region of which is shown in Fig. 5. To explore mathematical relationships between LOLEV and NEUE for the combined families, a potential common region at LOLEV = 0.1 (represented by the dotted red line) is chosen by extrapolating the two dotted fit lines. The extrapolated blue dotted fit line for the M family suggests a scenario with metrics (LOLEV, NEUE) = (0.1, 4.0), while the extrapolated black dotted fit line for the L family suggests another scenario with metrics (0.1, 11.9). However, in contrast to interpolated values, it is less certain that scenarios exist with LOLEV or NEUE close to extrapolated values on the fit lines.
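The interpolation and extrapolation described above can be approximately reproduced from the rounded Table 1 entries. The sketch below is illustrative only: it uses an ordinary least-squares line per family and assumes that RMSE means the square root of the mean squared residual over the family's points, which may differ from the authors' exact procedure. The helper names are hypothetical, and because Table 1 values are rounded, the results can differ slightly from the figures quoted in the text.

```python
from math import sqrt

# (LOLEV, NEUE) pairs transcribed from Table 1; names are <load><resource>.
TABLE1 = {
    "LA": (0.07, 7.6), "LB": (0.05, 4.9), "LC": (0.04, 3.1), "LD": (0.03, 2.1),
    "MA": (0.20, 16.7), "MB": (0.16, 11.6), "MC": (0.13, 8.2), "MD": (0.12, 6.2),
    "HA": (0.93, 57.5), "HB": (0.80, 45.0), "HC": (0.73, 37.5), "HD": (0.70, 33.7),
}


def family(key, position):
    """Points whose scenario name has `key` at `position` (0 = load, 1 = resource)."""
    return [point for name, point in TABLE1.items() if name[position] == key]


def linear_fit(points):
    """Ordinary least-squares line y = intercept + slope * x, with its RMSE."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Assumed RMSE definition: root of the mean squared residual.
    rmse = sqrt(sum((y - intercept - slope * x) ** 2 for x, y in points) / n)
    return slope, intercept, rmse


def predict(fit, lolev):
    slope, intercept, _ = fit
    return intercept + slope * lolev


fit_A = linear_fit(family("A", 1))  # fixed resource level A, loads L/M/H
fit_L = linear_fit(family("L", 0))  # fixed load L, resources A-D
fit_M = linear_fit(family("M", 0))  # fixed load M, resources A-D

neue_A_051 = predict(fit_A, 0.51)   # interpolation within family A
neue_L_010 = predict(fit_L, 0.10)   # extrapolation of family L
neue_M_010 = predict(fit_M, 0.10)   # extrapolation of family M

print(f"family A at LOLEV=0.51: NEUE ~ {neue_A_051:.1f} ppm (RMSE {fit_A[2]:.2f})")
print(f"family L at LOLEV=0.10: NEUE ~ {neue_L_010:.1f} ppm")
print(f"family M at LOLEV=0.10: NEUE ~ {neue_M_010:.1f} ppm")
```

Evaluating both fixed-load fit lines at the same LOLEV gives two clearly different NEUE values, which is the numerical core of the non-uniqueness argument.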
Fortunately, the two extrapolations were supported by scenarios M1 and L1 (obtained from iterative GENESYS runs), for which LOLEV = 0.1 and NEUE = 3.5 and 11.0, respectively, which are somewhat close to the extrapolated values. Table 2 lists the input parameters for loads and resources, α and R, respectively, along with the three metrics for L1 and M1. The slightly larger deviations of NEUE for L1 and M1 from the extrapolated fit lines suggest that nonlinearity is beginning to show, due to the larger range in R values with the additional scenario for each family. Nevertheless, Fig. 5 shows that since scenarios L1 and M1 are members of families L and M, respectively, NEUE for each scenario can be reasonably calculated from LOLEV alone using the corresponding linear fit line.

However, Fig. 5 also shows that at LOLEV = 0.1, NEUE takes on two possible values from the two families of fixed loads, demonstrating that LOLEV by itself is not sufficient to calculate a unique NEUE value. Furthermore, if an adequacy standard only includes an LOLEV threshold of 0.1 events per year, then both scenarios L1 and M1 would be considered adequate, yet L1 has a NEUE value roughly three times that of M1, which might be too high to be considered adequate. On the other hand, for NEUE = 7 in Fig. 5, there are two (interpolated) values for LOLEV: 0.066 from the L-family fit line and 0.124 from the M-family fit line. Hence NEUE by itself is not sufficient to determine a unique value for LOLEV.

Table 2
Adequacy metrics for interpolated and extrapolated scenarios.

Scenario   α      R        LOLEV (events/yr)   LOLH (h/yr)   NEUE (ppm)
A1         1.02    0.000   0.510               5.21          34.4
L1         0.97   −0.012   0.100               1.13          11.0
M1         1.00    0.069   0.099               0.86           3.5
L2         0.97   −0.013   0.103               1.16          11.3

Finally, in Fig.
4 for the overlap region at LOLEV = 0.6 there are four interpolated values for NEUE, and conversely, for NEUE = 30 there are four interpolated values of LOLEV, from the four families of fixed resources. Hence neither NEUE nor LOLEV by itself is sufficient to determine a unique value for the other. In subsequent sections, analyses of other pairs of metrics for families with constant resources follow the same reasoning and lead to the same conclusion, and thus will not be presented again.

Analysis in this section has shown that for scenarios where both loads and resources vary, neither LOLEV nor NEUE by itself is sufficient to calculate a unique value for the other; thus a threshold for one does not automatically lead to a unique threshold for the other.

4.2. Relationship between LOLH and LOLEV

In this section, least-squares fits are performed on the same twelve scenarios listed in Table 1 to explore mathematical relationships between LOLH and LOLEV, very similar to the analysis done previously for NEUE and LOLEV. Hence, only a brief description and summary are presented. Fig. 6 shows very accurate linear least-squares fits between LOLEV and LOLH for the three families of scenarios with fixed load levels (H, M, L), as resources (A, B, C, D) are varied. Figs. 6 and 2 look similar and share the following properties discussed previously: the excellent linear fit for each family of fixed loads with small variations in resources; the disparity in metric values between family H and families M and L; and, if scenario O is added to each family, the failure of a linear function to provide a good fit.

The same 12 LOLEV and LOLH data points presented in Fig. 6 can alternatively be plotted to show the excellent linear least-squares fits for the four families of scenarios with fixed resource levels (A, B, C, D) and varying loads (H, M, L), similar to Fig. 4 for LOLEV and NEUE. However, the linear fit lines for this case are very densely packed together, and such a plot would not show enough distinguishing detail to be useful.

Finally, Fig. 7 is a magnified region of Fig. 6 for families L and M and shows that in the overlap region at LOLEV = 0.1, LOLH takes on two possible values, 0.86 for scenario M1 and 1.13 for scenario L1 (see Table 2). Therefore, an LOLEV value by itself is not sufficient to determine a unique value for LOLH. LOLH values for L1 and M1, however, do show larger deviations from the linear fit lines, due to the larger variations in resources (R values) for each family with the inclusion of the extrapolated scenario. Fig. 7 also shows that for LOLH = 1.0, there are two interpolated values of LOLEV from the two families of fixed loads. Hence LOLH by itself is not sufficient to determine a unique LOLEV.

It has been demonstrated here that for scenarios where both loads and resources vary, neither LOLEV nor LOLH by itself is sufficient to determine a unique value for the other, and thus setting an adequacy threshold for one does not automatically lead to a unique threshold for the other.

Fig. 7. LOLH vs. LOLEV for families L and M. Dotted lines are linear least-squares fits for the two families. Values for the load and resource input parameters, and metrics for scenarios L1 and M1, are listed in Table 2.

4.3. Relationship between NEUE and LOLH

In this section, some of the mathematical relationships between NEUE and LOLH can be deduced from results in the previous two sections without the need to explicitly perform least-squares analysis. Specifically, Section 4.1 showed that NEUE and LOLEV are well correlated and have an excellent linear fit for families of scenarios with fixed loads or fixed resources, and the same properties also exist between LOLEV and LOLH, as discussed in Section 4.2.
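The step from these two pairwise relationships to a relationship between NEUE and LOLH can be written out explicitly. As a sketch, suppose that within a single family the two fits take the linear forms below; the coefficients $p$, $q$, $r$ and $s$ are hypothetical symbols introduced here for illustration (with $s \neq 0$) and are not taken from the paper.

```latex
\begin{align}
\mathrm{NEUE} &\approx p + q\,\mathrm{LOLEV}, \\
\mathrm{LOLH} &\approx r + s\,\mathrm{LOLEV}
  \quad\Longrightarrow\quad
  \mathrm{LOLEV} \approx \frac{\mathrm{LOLH} - r}{s}, \\
\mathrm{NEUE} &\approx \left(p - \frac{q\,r}{s}\right) + \frac{q}{s}\,\mathrm{LOLH}.
\end{align}
```

Eliminating LOLEV leaves NEUE as a linear function of LOLH, with an intercept and slope that still depend on which family the coefficients were fitted to.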
It follows then that NEUE and LOLH also share the same excellent linear least-squares fits for families with either fixed loads or fixed resources, and that NEUE can be calculated from a linear function of LOLH alone under those circumstances. LOLH and NEUE values from Table 1 for load families L and M are plotted in Fig. 8 for varying resources (A, B, C, D), along with scenario L2, obtained by iteratively decreasing R in scenario LA in the GENESYS model to reach the same LOLH of 1.16 as scenario MD. Fig. 8 shows that for family L, extrapolation of the black dotted fit line to LOLH = 1.16 results in NEUE = 10.9, which is reasonably close to the simulation value NEUE = 11.3 for scenario L2 (see Table 2). Fig. 8 also shows that in the overlap region at LOLH = 1.16, NEUE takes on two possible values, 11.3 from L2 and 6.2 from MD. It follows that LOLH by itself is not sufficient to calculate a unique NEUE, very similar to the conclusions for the other two pairs of metrics discussed previously for Figs. 5 and 7. Conversely, Fig. 8 also shows that for NEUE = 8, there are two interpolated values of LOLH from families L and M. Therefore, NEUE by itself is not sufficient to determine a unique LOLH.

Thus, results in this section show that in general neither NEUE nor LOLH by itself can be used to calculate a unique value for the other, and setting a threshold for one does not automatically lead to a unique threshold for the other.

Fig. 6. LOLH vs. LOLEV. Dotted lines are linear least-squares fits for the three families of scenarios with fixed load (H, M, L) and varying resources (A, B, C, D).

Fig. 8. NEUE vs. LOLH for families L and M. Scenario L2 is iterated from LA with appropriate resources to achieve LOLH = 1.16 (see the R values in Table 2).

References

[1] Methods to Model and Calculate Capacity Contribution of Variable Generation for Resource Adequacy Planning, Tech.
Rep., North American Electric Reliability Corporation, Available from: URL: http://www.nerc.com/files/ivgtf1-2.pdf (March 2011).
[2] Probabilistic Adequacy and Measures, Technical Reference Report, Tech. Rep., North American Electric Reliability Corporation, Available from: URL: https://www.nerc.com/comm/PC/Documents/2.d_Probabilistic_Adequacy_and_Measures_Report_Final.pdf (April 2018).
[3] 2016 Long-Term Reliability Assessment, Tech. Rep., North American Electric Reliability Corporation, Available from: URL: https://www.nerc.com/pa/RAPA/ra/Reliability%20Assessments%20DL/2016%20Long-Term%20Reliability%20Assessment.pdf (December 2016).
[4] Pacific Northwest Electric Power Planning and Conservation Act, Public Law No. 96-501, S. 885, 1980, p. 83.
[5] Seventh Northwest Power Plan, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://www.nwcouncil.org/reports/seventh-powerplan (February 2016).
[6] 2017 Pacific Northwest Loads and Resources Study, Technical Appendix, Volume 2: Capacity Analysis, Tech. Rep., Bonneville Power Administration, Available from: URL: https://www.bpa.gov/p/Generation/White-Book/wb/2017-WBK-TechnicalAppendix-Volume-2-Capacity-Analysis-20171218.pdf (December 2017).
[7] The Columbia River System Inside Story: Federal Columbia River Power System, 2nd ed., Tech. Rep., Available from: URL: https://www.bpa.gov/news/pubs/generalpublications/edu-the-federal-columbia-river-power-system-inside-story.pdf (April 2001).
[8] Understanding Operational Flexibility in the Federal Columbia River Power System, Tech. Rep., Hydro Research Foundation, Available from: URL: https://www.hydrofoundation.org/uploads/3/7/6/1/37618667/studarus_final_findings.pdf (December 2014).
[9] Form No. 714, Annual Electric Balancing Authority Area and Planning Area Report, Tech. Rep., Federal Energy Regulatory Commission, Available from: URL: https://www.ferc.gov/docs-filing/forms/form-714/data.asp.
[10] R. Billinton, P.
Harrington, Reliability evaluation in energy limited generating capacity studies, IEEE Trans. Power Appar. Syst. 97 (6) (1978) 2076–2085, https://doi.org/10.1109/TPAS.1978.354711.
[11] D. Bagen, B. Huang, C. Singh, A new analytical technique for incorporating base loaded energy limited hydro units in reliability evaluation, Electr. Power Syst. Res. 131 (2016) 218–223, https://doi.org/10.1016/j.epsr.2015.10.019.
[12] The GENESYS Model, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://www.nwcouncil.org/energy/saac/GENESYS.
[13] Using Structural Time Series Models for Development of Demand Forecasting for Electricity with Application to Resource Adequacy Analysis, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://www.nwcouncil.org/media/7491506/dev-of-short-term-dem-forecast-model-use-for-adequacy2014.pdf.
[14] B. Kujala, A Temperature-Based Wind Power Model for System Reliability Analyses, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://www.nwcouncil.org/media/7491504/temp-based-wind-powermodel-for-sys-reliability-2012.pdf.
[15] R. Billinton, R. Allan, Reliability Evaluation of Power Systems, 2nd edition, Springer Science + Business Media, New York, NY, 1996, pp. 10013–11578.
[16] M. Rausand, A. Hoyland, System Reliability Theory, Models, Statistical Methods and Applications, 2nd edition, Wiley-Interscience, John Wiley & Sons, Inc., Hoboken, NJ, USA, 2004.
[17] Probabilistic Assessment, Technical Guideline Document, Tech. Rep., North American Electric Reliability Corporation, Available from: URL: https://www.nerc.com/comm/PC/PAITF/ProbA%20Technical%20Guideline%20Document%20-%20Final.pdf (August 2016).
[18] A New Resource Adequacy Standard for the Pacific Northwest, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://www.nwcouncil.org/sites/default/files/2011_14_1.pdf (December 2011).
[19] Pacific Northwest Power Supply Adequacy Assessment for 2022, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://www.nwcouncil.org/media/7491213/2017-5.pdf (July 2017).
[20] Seventh Northwest Power Plan – Mid-term Assessment, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://nwcouncil.box.com/s/nb97utmurtf8rbqi4j5isk2smuz6wg5o (November 2018).
[21] Summary of Anticipated Resource Needs, Tech. Rep., Northwest Power and Conservation Council, Available from: URL: https://nwcouncil.box.com/s/qlugectn7jed8j9kg5ix0olblsxkgm2t (May 2018).
[22] E. Ibanez, M. Milligan, Comparing resource adequacy metrics and their influence on capacity value, 2014 International Conference on Probabilistic Methods Applied to Power Systems (PMAPS) (2014), https://doi.org/10.1109/PMAPS.2014.6960610.

5. Conclusion

In summary, least-squares analyses show that for the Pacific Northwest power supply, each of the three adequacy metrics (LOLEV, NEUE and LOLH) can be used in a linear function to calculate the other two metrics with very good accuracy, but only for families of scenarios with fixed load or fixed resources (and small variations in the other input), as seen in Figs. 2, 4 and 6. However, in general, when both loads and resources vary, even by small amounts, the value of one metric does not automatically determine unique values for the other two, as shown in Figs. 5, 7 and 8. For these more general cases, additional load or resource data are needed to determine which family, if any, a scenario belongs to, and thus the unique fitting function that relates the pair of metrics.

To assess power supply adequacy, it is prudent to use metrics that measure different dimensions of risk, and to manage those risks by setting tolerable limits for each metric. LOLEV, LOLH and NEUE, which measure shortfall frequency, duration and magnitude, form a good set of risk measures for a power supply.
Moreover, as both LOLH and NEUE have been calculated by all regional power system entities and reported annually to NERC for publication for the past few years, many power system planners might be familiar with the risks associated with these two metrics. Analyses in previous sections suggest that if the three metrics were to be used to determine power supply adequacy, then thresholds for all three metrics should be set independently. On the other hand, if only a single metric is used to set the adequacy standard for the PNW, for example a threshold of 0.1 events per year for LOLEV, then associated values of LOLH and EUE that might once have been acceptable could become unacceptably large with future changes in resources and loads, even while LOLEV remains at or below 0.1 events per year. Finally, it should be noted that near-linear mathematical relationships among similar metrics were also discovered using numerical probabilistic convolution methods (in contrast to Monte Carlo simulations) [22].

Conflicts of interest

None.

Acknowledgment

The authors would like to thank Dr. B. Bagen at Manitoba Hydro for helpful discussions on the use of analytical methods in adequacy analysis.