Applied Hydrology
Regional Frequency Analysis (RFA)
Adapted from Hosking and Wallis (1997)
Professor Ke-Sheng Cheng
Dept. of Bioenvironmental Systems Engineering, National Taiwan University
RSLAB-NTU, Lab for Remote Sensing Hydrology and Spatial Modeling

Why is regional frequency analysis (RFA) needed?

Hydrological frequency analysis is generally conducted for sites with rainfall or flow measurements. For areas with short records, or without rainfall or flow measurements, the analysis must rely on data from sites with similar hydrological characteristics.

The Index-Flood Approach for RFA: Concept

The index-flood approach was proposed by Dalrymple (1960) for flood frequency analysis. Let $Q$ be the hydrological variable of interest, for example the annual maximum rainfall of a specific duration or the annual maximum flow. Suppose that observed data of $Q$ are available at $N$ sites, and let $n_i$ be the sample size at the $i$-th site ($i = 1, 2, \ldots, N$). Also, let $Q_i(F)$ be the quantile function of $Q$ at site $i$.

Observed data: $Q_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, N$.

Quantile function: $P[Q_i \le Q_i(F)] = F$, $0 < F < 1$.

Assume that the quantile functions of the hydrological variable at the different sites can be expressed as

$$Q_i(F) = \mu_i \, q(F), \quad i = 1, \ldots, N,$$

where $\mu_i$ is the index flood (Dalrymple, 1960) and $q(F)$, known as the regional growth curve, is a dimensionless quantile function common to every site. The index flood $\mu_i$ is often taken to be the mean of $Q$ at site $i$.

The regional growth curve $q(F)$ is the quantile function of the common distribution of the rescaled data $Q_{ij}/\mu_i$. It is usually assumed that the distribution type of the rescaled data, i.e. the regional frequency distribution $q(F; \theta_1, \ldots, \theta_p)$, is known.
Thus, the parameters of this common distribution must be estimated from the observed data available at the different sites.

The Index-Flood Approach for RFA: Estimation

Parameter estimation:

$$\hat{\mu}_i = \bar{Q}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} Q_{ij}$$

$$\hat{\theta}_k = \frac{\sum_{i=1}^{N} n_i \hat{\theta}_k^{(i)}}{\sum_{i=1}^{N} n_i}, \quad k = 1, \ldots, p$$

$$\hat{q}(F) = q(F; \hat{\theta}_k, k = 1, \ldots, p)$$

Here $\hat{\theta}_k^{(i)}$ is the estimate of the $k$-th parameter obtained from the rescaled data at site $i$, and $\hat{\theta}_k$ is the regional estimate, a record-length-weighted average of the at-site estimates.

Regional frequency analysis:

$$\hat{Q}_i(F) = \hat{\mu}_i \, \hat{q}(F) = \hat{\mu}_i \, q(F; \hat{\theta}_k, k = 1, \ldots, p)$$

The Index-Flood Approach for RFA: Implicit Assumptions

- Observations at any given site are identically distributed.
- Observations at any given site are serially independent.
- Observations at different sites are independent.
- The distributions of the rescaled variable at different sites are identical.
- The distribution type of the rescaled variable is correctly specified.

The assumption that the distributions of the rescaled variable at different sites are identical implies the existence of a homogeneous region. A homogeneous region is an area within which the rescaled variables at different sites have approximately the same probability distribution. A homogeneous region need not be geographically contiguous.

Implicit in the definition of a homogeneous region is the condition that all sites can be described by one common probability distribution after the site data are rescaled by their at-site mean. Thus, all sites within a homogeneous region have a common regional growth curve.

General procedures of regional frequency analysis

1. Data screening
   - Check the data for correctness.
   - Data should be stationary over time.
2. Identifying homogeneous regions
   - A set of characteristic variables should be chosen and used to delineate homogeneous regions.
   - Characteristic variables may include geographic and hydrological variables.
3. Choice of an appropriate regional frequency distribution
   - Apply a goodness-of-fit (GOF) test using rescaled samples from different sites within the same homogeneous region.
   - The chosen distribution should not only fit the data well but also yield quantile estimates that are robust to physically plausible deviations of the true frequency distribution from the chosen one.
4. Parameter estimation of the regional frequency distribution
   - Estimate the parameters of the site-specific frequency distributions.
   - Estimate the parameters of the regional frequency distribution as a weighted average of the at-site estimates.

Situations for application of RFA

General application to individual sites with observed data. Regionalization is valuable: even though a region may be moderately heterogeneous, regional frequency analysis will still yield much more accurate quantile estimates than at-site analysis.

Application to one site of special interest. Special care should be taken (by choosing appropriate characteristic variables) to make the site typical of the region to which it is assigned.

Application to one or more ungauged sites (PUB program, http://iahs.info/). An ungauged site can be assigned to a homogeneous region based on its characteristic variables. The regional growth curve at an ungauged site is then estimated using the characteristic variables.
The index flood (or index quantity, if the variable of interest is not flood flow) can be treated as a function of the characteristic variables, with the function calibrated using data from the gauged sites.

Data screening using a measure of discordance $D_i$

Suppose that there are $N$ sites in a region and we want to identify those sites that are grossly discordant with the group as a whole. Hosking and Wallis (1997) proposed a measure of discordance in terms of the L-moment ratios ($t$, $t_3$, and $t_4$) of the sites' data.

It can be shown that $D_i$ satisfies the algebraic bound $D_i \le (N - 1)/3$. Thus, the value of $D_i$ can exceed 3 only in regions having 11 or more sites.

The criterion for discordance should be an increasing function of the number of sites in the region, since regions with more sites are more likely to contain sites with large values of $D_i$.

Hosking and Wallis (1997) recommend that any site with $D_i > 3$ be regarded as discordant, as such sites have L-moment ratios that are markedly different from the average for the other sites in the region.

Defining homogeneous sub-regions

Homogeneous sub-regions (groupings of sites/gages) can be determined based on the similarity of the physical and/or meteorological characteristics of the sites. This can be done by cluster analysis.

L-moment statistics can then be used to estimate the variability and skewness of the pooled regional data and to test for heterogeneity, as a basis for accepting or rejecting the proposed sub-region formulation.

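
The discordance screening described above can be computed directly from the at-site L-moment ratios. Hosking and Wallis (1997) define the measure as follows: let $u_i = (t^{(i)}, t_3^{(i)}, t_4^{(i)})^T$, let $\bar{u}$ be the unweighted mean of the $u_i$, and let $A = \sum_{i=1}^{N} (u_i - \bar{u})(u_i - \bar{u})^T$; then $D_i = \frac{N}{3} (u_i - \bar{u})^T A^{-1} (u_i - \bar{u})$. The following is a minimal NumPy sketch of that definition, not code from the lecture; the function name `discordance` and the input layout (one row of L-moment ratios per site) are assumptions made for illustration.

```python
import numpy as np


def discordance(u):
    """Discordance measure D_i for each site in a region.

    u : array-like of shape (N, 3); row i holds the sample L-moment
        ratios (t, t3, t4) of site i. This layout is an assumption of
        the sketch. N must be large enough for A to be invertible.
    """
    u = np.asarray(u, dtype=float)
    n_sites = u.shape[0]
    dev = u - u.mean(axis=0)      # u_i - u_bar, unweighted group mean
    a = dev.T @ dev               # 3x3 matrix of sums of squares and cross-products
    a_inv = np.linalg.inv(a)
    # D_i = (N / 3) * (u_i - u_bar)^T A^{-1} (u_i - u_bar), one value per site
    return (n_sites / 3.0) * np.einsum("ij,jk,ik->i", dev, a_inv, dev)
```

By construction the $D_i$ values sum to $N$, so their average is 1; a site is flagged by comparing its $D_i$ with the chosen critical value (3 in the recommendation quoted above).
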
Candidate physical features include measures such as: site elevation; elevation averaged over some grid size; localized topographic slope; macro topographic slope averaged over some grid size; distance from the coast or source of moisture; distance to sheltering mountains or ridgelines; and latitude or longitude.

Candidate climatological characteristics include measures such as: mean annual precipitation; precipitation during a given season; seasonality of extreme storms; and seasonal temperature/dewpoint indices.

Example

A review of the topographic and climatological characteristics in the Oregon study area showed that only two measures, mean annual precipitation (MAP) and latitude, were needed to group sites/gages into homogeneous sub-regions within a given climatic region. Homogeneous sub-regions were therefore formed from gages/sites within small ranges of MAP and latitude.

The output from the cluster analysis need not, and usually should not, be final. Subjective adjustments can often be found to improve the physical coherence of the regions. Several kinds of adjustment may be useful:

- move a site or a few sites from one region to another;
- delete a site or a few sites from the data set;
- subdivide the region;
- break up the region by reassigning its sites to other regions;
- merge the region with another or others;
- merge two or more regions and redefine groups; and
- obtain more data and redefine groups.

Test of regional homogeneity

Once a set of physically plausible regions has been defined, it is desirable to assess whether the regions are meaningful.
This involves testing whether a proposed region may be accepted as homogeneous, and whether two or more homogeneous regions are sufficiently similar that they should be combined into a single region. The hypothesis of homogeneity is that the at-site frequency distributions are the same except for a site-specific scale factor.

Rationale of the test of regional homogeneity

The test compares the between-site dispersion of the sample L-moment ratios for the group of sites under consideration with the dispersion expected for a homogeneous region.

Test of regional homogeneity

The heterogeneity measure used here was proposed by Hosking and Wallis (1997). Suppose that the proposed region has $N$ sites, with site $i$ having record length $n_i$ and sample L-moment ratios $t^{(i)}$, $t_3^{(i)}$, and $t_4^{(i)}$. Let $t^R$, $t_3^R$, and $t_4^R$ denote the regional average L-CV, L-skewness, and L-kurtosis, weighted proportionally to the sites' record lengths; for example,

$$t^R = \frac{\sum_{i=1}^{N} n_i \, t^{(i)}}{\sum_{i=1}^{N} n_i}.$$

Calculate the weighted standard deviation of the at-site sample L-CVs,

$$V = \left\{ \frac{\sum_{i=1}^{N} n_i \left( t^{(i)} - t^R \right)^2}{\sum_{i=1}^{N} n_i} \right\}^{1/2}.$$

Fit a four-parameter kappa distribution to the regional average L-moment ratios $1$, $t^R$, $t_3^R$, and $t_4^R$.

Simulate a large number $N_{\mathrm{sim}}$ of realizations of a region with $N$ sites, each having this kappa distribution as its frequency distribution. The simulated regions are homogeneous and have no cross-correlation or serial correlation; sites have the same record lengths as their real-world counterparts.

For each simulated region, calculate $V$. From the simulations, determine the mean and standard deviation of the $N_{\mathrm{sim}}$ values of $V$. Call these $\mu_V$ and $\sigma_V$.

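
The steps up to this point can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the function names are invented for the example, the sample L-CV is computed from the usual unbiased probability-weighted moments ($l_1 = b_0$, $l_2 = 2 b_1 - b_0$), and `sampler` is a hypothetical user-supplied function standing in for the fitted kappa distribution, whose fitting by L-moments is not shown here.

```python
import numpy as np


def sample_l_cv(x):
    """Sample L-CV t = l2 / l1 from the unbiased PWM estimators b0 and b1."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    b0 = x.mean()
    # b1 = (1/n) * sum_{j=1..n} ((j - 1) / (n - 1)) * x_(j), with x sorted ascending
    b1 = np.sum(np.arange(n) * x) / (n * (n - 1))
    return (2.0 * b1 - b0) / b0


def weighted_sd_l_cv(samples):
    """V: record-length-weighted standard deviation of the at-site L-CVs."""
    n = np.array([len(s) for s in samples], dtype=float)
    t = np.array([sample_l_cv(s) for s in samples])
    t_regional = np.sum(n * t) / np.sum(n)
    return np.sqrt(np.sum(n * (t - t_regional) ** 2) / np.sum(n))


def simulate_v(record_lengths, sampler, n_sim=500, seed=0):
    """Mean and standard deviation of V over n_sim simulated homogeneous regions.

    sampler(n, rng) is a hypothetical callable that draws n independent
    values from the fitted regional distribution (the kappa distribution
    in the text); it is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    v_sim = np.array([
        weighted_sd_l_cv([sampler(n, rng) for n in record_lengths])
        for _ in range(n_sim)
    ])
    return v_sim.mean(), v_sim.std(ddof=1)
```

Passing the sampler as an argument keeps the sketch independent of how the kappa distribution is fitted; every site in a simulated region draws from the same distribution with its own record length, which is what makes the simulated regions homogeneous.
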
Calculate the heterogeneity measure

$$H = \frac{V - \mu_V}{\sigma_V}.$$

Choosing a distribution for frequency analysis

For regional frequency analysis, a single probability distribution is applied to all sites within a homogeneous region. Thus, a best-fit distribution must be chosen from a set of candidate distributions.

Assume that the region is acceptably close to homogeneous. The L-moment ratios of the sites in a homogeneous region are then well summarized by the regional average, and the scatter of the individual sites' L-moment ratios about the regional average represents no more than sampling variability.

The goodness of fit can be judged by how well the L-skewness and L-kurtosis of the fitted distribution match the regional average L-skewness and L-kurtosis of the observed data.

Assume for convenience that the candidate distribution is the generalized extreme-value (GEV) distribution, which has three parameters, and that the sample L-skewness and L-kurtosis are exactly unbiased.

The GEV distribution fitted by the method of L-moments has L-skewness equal to the regional average L-skewness.

Note: When a three-parameter candidate distribution is fitted to the L-moment ratios by the method of L-moments, only its L-skewness needs to be matched (set equal to the regional average L-skewness). The L-kurtosis need not be estimated, because for a three-parameter distribution it is completely determined by the L-skewness.

We thus judge the quality of fit by the difference between the L-kurtosis $\tau_4^{\mathrm{GEV}}$ of the fitted GEV distribution and the regional average L-kurtosis $t_4^R$.
Writing $\tau_4^{\mathrm{GEV}}$ for the L-kurtosis of the fitted GEV distribution, this difference is scaled by $\sigma_4$, the standard deviation of $t_4^R$:

$$Z^{\mathrm{GEV}} = \frac{\tau_4^{\mathrm{GEV}} - t_4^R}{\sigma_4}.$$

[Figure: point $(t_3^R, \tau_4^{\mathrm{DIST}})$.]

Values of $Z^{\mathrm{GEV}}$ close to zero indicate that the GEV distribution can be considered the true underlying distribution for the region.

Calculation of $\sigma_4$

In theory, a separate set of simulations must be made for each candidate distribution to obtain the appropriate $\sigma_4$ value. In practice, $\sigma_4$ can be obtained from the same simulated realizations of a kappa distribution for a homogeneous region that were used in the test of regional homogeneity.

Bias correction for L-kurtosis

Goodness-of-fit test

Given a set of candidate three-parameter distributions (Pearson type III, GEV, lognormal, generalized Pareto, etc.), first fit each distribution to the regional average L-moment ratios $1$, $t^R$, and $t_3^R$. Denote by $\tau_4^{\mathrm{DIST}}$ the L-kurtosis of the fitted distribution, where DIST identifies the candidate distribution.

Fit a kappa distribution to the regional average L-moment ratios $1$, $t^R$, $t_3^R$, and $t_4^R$.

Simulate a large number, $N_{\mathrm{sim}}$, of realizations of a region with $N$ sites, each having this kappa distribution as its frequency distribution. The simulated realizations are homogeneous and have no cross-correlation or serial correlation; sites have the same record lengths as their real-world counterparts.

The kappa fit and the simulated at-site realizations can be the same as those used for the test of regional homogeneity.

For the $m$-th simulated realization, the regional average L-skewness $t_3^{[m]}$ and L-kurtosis $t_4^{[m]}$ can be calculated.
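
From the simulated values $t_4^{[m]}$, Hosking and Wallis (1997) estimate the bias of the regional average L-kurtosis as $B_4 = N_{\mathrm{sim}}^{-1} \sum_{m} (t_4^{[m]} - t_4^R)$ and its standard deviation as $\sigma_4 = \left[ (N_{\mathrm{sim}} - 1)^{-1} \left\{ \sum_{m} (t_4^{[m]} - t_4^R)^2 - N_{\mathrm{sim}} B_4^2 \right\} \right]^{1/2}$, and form the goodness-of-fit measure $Z^{\mathrm{DIST}} = (\tau_4^{\mathrm{DIST}} - t_4^R + B_4) / \sigma_4$, declaring the fit adequate when $|Z^{\mathrm{DIST}}| \le 1.64$. The sketch below illustrates this calculation; the function name `z_statistic` and its argument names are assumptions made for the example, and the simulated $t_4^{[m]}$ values are taken as given.

```python
import numpy as np


def z_statistic(tau4_dist, t4_regional, t4_sim):
    """Goodness-of-fit measure Z^DIST with bias-corrected regional L-kurtosis.

    tau4_dist   : L-kurtosis of the fitted candidate distribution
    t4_regional : regional average L-kurtosis of the observed data
    t4_sim      : regional average L-kurtosis of each simulated region
    Returns (Z, B4, sigma4).
    """
    t4_sim = np.asarray(t4_sim, dtype=float)
    n_sim = t4_sim.size
    dev = t4_sim - t4_regional
    b4 = dev.mean()                                   # bias of t4^R
    sigma4 = np.sqrt((np.sum(dev ** 2) - n_sim * b4 ** 2) / (n_sim - 1))
    z = (tau4_dist - t4_regional + b4) / sigma4
    return z, b4, sigma4


# Illustrative numbers only (not data from the lecture).
z, b4, sigma4 = z_statistic(0.16, 0.15, [0.10, 0.12, 0.14, 0.16, 0.18])
adequate = abs(z) <= 1.64
```

The expression for $\sigma_4$ is algebraically the sample standard deviation of the simulated $t_4^{[m]}$ values, so the same simulations serve both the heterogeneity test and the goodness-of-fit test.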