Journal of Hydrology 406 (2011) 54–65

Journal homepage: www.elsevier.com/locate/jhydrol

Uncertainty estimates by Bayesian method with likelihood of AR (1) plus Normal model and AR (1) plus Multi-Normal model in different time-scales hydrological models

Lu Li a,⇑, Chong-Yu Xu a,b, Jun Xia c, Kolbjørn Engeland d, Paolo Reggiani e,f

a Department of Geosciences, University of Oslo, P.O. Box 1047, Blindern, 0316 Oslo, Norway
b Department of Earth Sciences, Uppsala University, Villavägen 16, SE-752 36 Uppsala, Sweden
c Key Laboratory of Water Cycle & Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
d SINTEF Energy Research, 7465 Trondheim, Norway
e Section of Hydraulic Structures and Probabilistic Design, Delft University of Technology, P.O. Box 5048, 2600GA Delft, The Netherlands
f Deltares, P.O. Box 177, 2600MH Delft, The Netherlands

Article info

Article history: Received 6 October 2010; received in revised form 1 May 2011; accepted 30 May 2011; available online 16 June 2011.
This manuscript was handled by Dr. A. Bardossy, Editor-in-Chief, with the assistance of Ezio Todini, Associate Editor.

Keywords: Uncertainty assessment; Hydrological models; Bayesian methods; Multi-Normal; Time scales

Summary

Bayesian revision is widely used in hydrological model uncertainty assessment. With respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches have been used in different models calibrated for either daily or monthly time step. None of these applications, however, includes a comparison of uncertainty analysis in hydrological models with respect to the time periods at which the models are operated.
This study pursues a comprehensive intercomparison and evaluation of uncertainty assessments by Bayesian revision using the Metropolis–Hastings (MH) algorithm with the hydrological model WASMOD at daily and monthly time steps. In the daily step model three likelihood functions are used in combination with Bayesian revision: (i) the AR (1) plus Normal time period independent model (Model 1), (ii) the AR (1) plus Multi-Normal model (Model 2), and (iii) the AR (1) plus Normal time period dependent model (Model 3). In addition, an index called the percentage of observations bracketed by the Unit Confidence Interval (PUCI) was used for uncertainty evaluation. The results reveal that it is more important to consider the autocorrelation in daily WASMOD than in monthly WASMOD. Firstly, the resulting goodness of fit of the daily model vs. observations, as measured by the Nash–Sutcliffe efficiency value, is comparable with that calculated by the optimization algorithm in monthly WASMOD. Secondly, the AR (1) model is not adequate to estimate the distribution of residuals in daily WASMOD, since PUCI shows that Model 2 outperforms Model 1. Furthermore, the maximum Nash–Sutcliffe efficiency value of Model 2 is the largest. Thirdly, Model 3 performs best over the entire flow range, while Model 2 outperforms Model 3 for high flows. This shows that additional statistical parameters reflect the statistical characteristics of the residuals more efficiently and accurately. Fourthly, by considering the difference in terms of application and computational efficiency it becomes evident that Model 3 performs best for daily WASMOD. Model 2, on the other hand, is superior for daily time step WASMOD if the auto-correlation of parameters is considered.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

The Bayesian method is widely used for uncertainty assessment in hydrologic modelling.
During the last two decades several applications to various models and catchments have been pursued in the literature (Bates and Campbell, 2001; Duan et al., 2007; Engeland and Gottschalk, 2002; Li et al., 2010b; Marshall et al., 2007; Kavetski et al., 2006a,b; Kuczera et al., 2006; Kuczera and Parent, 1998; Krzysztofowicz and Kelly, 2000; Liu et al., 2005; Thiemann et al., 2001; Vrugt et al., 2003). The Bayesian method requires a specification of a likelihood function. For it to provide unbiased and reliable results, several conditions need to be satisfied. The likelihood function provides a probability distribution of all simulation errors. It is therefore necessary to specify a correct distribution for the simulation errors and a possible mutual correlation structure. The most common approach is to transform the data so that they become normally and identically distributed (constant variance).

⇑ Corresponding author. E-mail address: lu.li@geo.uio.no (L. Li). doi:10.1016/j.jhydrol.2011.05.052

There are two commonly used transformation methods in uncertainty assessment for hydrological models. One is the Box–Cox method (Engeland et al., 2005; Jin et al., 2010; Wang et al., 2009; Yang et al., 2007b) and the other is the Normal Quantile Transform (NQT) (Krzysztofowicz, 1997, 1999; Todini, 2008; Van der Waerden, 1952, 1953a,b). Krzysztofowicz and Herr (2001) proposed the NQT method in the Hydrological Uncertainty Processor (HUP) to convert both observations and model output into the Gaussian space with the aim of estimating the predictive uncertainty.
This method is one of the most commonly used in hydrological uncertainty analysis for deriving the joint distribution and the predictive conditional distribution from a multivariate distribution (Krzysztofowicz and Kelly, 2000; Liu et al., 2005; Maranzano and Krzysztofowicz, 2004; Montanari and Brath, 2004; Reggiani and Weerts, 2008; Todini, 2008).

Because hydrological processes have a “memory effect”, the model residuals are often inter-correlated. In this context, discrete-time and continuous-time autoregressive error models are now widely used (Yang et al., 2007b), amongst which the first order auto-regressive model (AR (1) model) is the most common. The latter is employed in combination with the transformation method to make the residuals satisfy the statistical assumptions of identical and independent distribution (Engeland et al., 2010; Krzysztofowicz, 2002; Yang et al., 2007a). Mantovan and Todini (2006) proposed a Multi-Normal AR (1) model as likelihood function by considering the autocorrelation of errors.

All uncertainty assessment methods account for different sources of uncertainty. Some methods use a statistical model of the residuals to address all the uncertainty sources (Engeland and Gottschalk, 2002; Jin et al., 2010; Krzysztofowicz, 2002; Montanari and Grossi, 2008). Others consider all uncertainty sources lumped into parameter uncertainty, such as the GLUE method and the multi-objective method (Beven and Freer, 2001; Engeland et al., 2006; Yapo et al., 1998). Others consider the input error and/or model structure error jointly or separately (Chowdhury and Sharma, 2007; Kavetski et al., 2006a, 2006b; Kuczera et al., 2006).
Recently, some criteria have been proposed to evaluate the performance of uncertainty analysis methods: (1) indices for goodness of fit at the level of the posterior distribution (efficiency), based on the Nash–Sutcliffe (NS) coefficient or other objective functions (Jin et al., 2010; Yang et al., 2008); (2) indices used to compare the width of the model uncertainty intervals (resolution), mainly based on the 95% confidence interval of discharge (95CI), which include the width of the 95CI, the average distance between the upper and the lower limits of the 95% confidence intervals, and the 95PPU (d-factor) (Li et al., 2009; Xiong et al., 2009; Yang et al., 2008); and (3) indices to compare the distribution of the uncertainty estimates (reliability), which include the percentage of observations bracketed by the 95CI and the Continuous Ranked Probability Score (CRPS). Engeland et al. (2010) proposed a hierarchy of these criteria: reliability, resolution and efficiency.

According to Todini (2007, 2011), all these studies treat the problem of “uncertainty”, which can be divided into “Validation Uncertainty” and “Predictive Uncertainty”. The “Validation Uncertainty” is the probability that the simulated value (water level, discharge, water volume, etc.) will be less than or equal to a prescribed value given prior knowledge, the historical information and the observed value; it refers to running the model in historical mode based on observations. Numerous approaches for quantifying the “Validation Uncertainty” have been proposed, including the Generalized Likelihood Uncertainty Estimation (GLUE) (Beven and Freer, 2001; Blasone et al., 2008a,b; Vogel et al., 2008; Xiong and O’Connor, 2008) and formal Bayesian methods (Montanari and Brath, 2004; Vrugt et al., 2003; Engeland et al., 2005; Yang et al., 2008; Jin et al., 2010; Li et al., 2010b).
In these studies, the uncertainty has been estimated by the 95% confidence interval (95CI) (Engeland et al., 2005; Li et al., 2009, 2010b) or the 95% prediction uncertainty band (95PPU) (Abbaspour et al., 2007; Yang et al., 2008), calculated at the 2.5% and 97.5% levels of the cumulative distribution function of the model output variables.

Krzysztofowicz (1999, 2001) defines “Predictive Uncertainty” as expressing the uncertainty about what may happen at a future stage conditional on the model forecast, which is considered as a known value, based on all available information, such as input, parameters and model structure. It refers to running the model in forecast mode. Subsequently, Todini (2007) redefined the predictive uncertainty as being based on the posterior density of parameters and the conditional probability density of a predictand conditional on the covariates and a set of parameters. Accordingly, three approaches for the assessment of predictive uncertainty have been mentioned (Todini, 2011): the Hydrological Uncertainty Processor (HUP) (Krzysztofowicz, 1999), the Bayesian Model Averaging (BMA) (Raftery et al., 2003, 2005) and the Model Conditional Processor (MCP) (Todini, 2008; Coccia and Todini, 2010). This definition of predictive uncertainty is restricted to hydrologic forecasting. It might, however, be argued that the predictive uncertainty can be defined for other applications that include predictions, e.g. streamflow in an ungauged catchment (conditioned on streamflow in a nearby catchment), or streamflow for a period of missing data (conditioned on observations outside this period). For this reason the difference between the “Predictive Uncertainty” and “Validation Uncertainty” as defined by Todini might be more related to the problem addressed (forecasting/non-forecasting) than to a general Bayesian definition of predictive uncertainty.
In both types of uncertainty analyses, the Bayesian method requires a specification of a likelihood function. In order to provide unbiased and reliable results, several conditions need to be satisfied. The likelihood function provides a probability distribution of all simulation errors. It is therefore necessary to specify a correct distribution for the simulation errors and a possible mutual correlation structure. The most common approach is to transform the data so that they become normally and identically distributed (constant variance).

Hydrological models at monthly time step have been widely utilized for long-range streamflow forecasting, for assessing the long-term climatic change impact on water resources, and for regional water resources management (Xu and Vandewiele, 1995; Chen et al., 2007). Daily or shorter time step hydrological models, in contrast, are required for real-time flood forecasting, for generation of synthetic sequences of hydrologic data for facility design, for studying the potential impacts of changes in land use or climate, and for interdisciplinary research. Therefore, both monthly and daily hydrological modelling approaches are essential in water cycle and water resources research.

With respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches have been employed in hydrological applications. Some involve monthly water balance hydrological models (Benke et al., 2008; Engeland et al., 2005; Jin et al., 2010; Li et al., 2010b), others daily conceptual hydrological models or physically-based distributed hydrological models (Blasone et al., 2008a; Engeland and Gottschalk, 2002; Li et al., 2010a; Liu et al., 2005; Looper et al., 2009; Wang et al., 2009; Yang et al., 2008, 2007b). However, none of the above applications seems to include comparisons of uncertainty analyses in time-variant hydrological models.
It is common knowledge that the hydrological response variables depend on the time scale of the model simulation. However, the question remains how model time scales affect the performance of the uncertainty analysis methods, and how models of similar structure operated with different time steps affect the uncertainty assessment via Bayesian methods. The objective of this paper is to attempt to fill this gap.

This study performs a comprehensive comparison and evaluation of uncertainty estimates by Bayesian methods using the Metropolis–Hastings (MH) algorithm in two validated conceptual hydrological models (WASMOD) with monthly and daily time steps. It aims at explaining the differences between the Bayesian method applied to hydrological models of similar concept and structure but with different time steps. The Box–Cox transformation was used in this study to obtain variables which are closer to a normal distribution. In daily WASMOD, two likelihood functions were used to remove the time dependence of residuals in the Bayesian method: (i) the AR (1) plus Normal model and (ii) the AR (1) plus Multi-Normal model. We also tested models in which the statistical parameters were either dependent on or independent of time. The Average Relative Interval Length (ARIL) defined by Jin et al. (2010), the percentage of observations bracketed by the Confidence Interval (PCI), and a new index called the Percentage of observations bracketed by the Unit Confidence Interval (PUCI) were used in this study for evaluation of model performance. Furthermore, the 95% confidence intervals of daily discharge at low flow, medium flow and high flow were selected for comparison. Finally, the paper tries to give some advice to users on the application of the Bayesian method to validation uncertainty in time-variant hydrological model applications.

2. Study area and hydrological model

2.1.
Study area

The selected study area is the Jiuzhou basin (Fig. 1), a tributary of the Dongjiang River in southern China. The river flows from the north-east to the south-west estuary with a drainage area of 385 km² (upstream area of Jiuzhou gauge station). The landscape is characterized by hills and plains. The Jiuzhou basin is located in a monsoon region with a tropical and sub-tropical humid climate, a mean annual temperature of about 21 °C, and only occasional incidents of winter daily air temperature dropping below 0 °C in the mountainous areas of the upper part of the catchment. Precipitation is generated mainly by two types of storms: frontal and typhoon-type rainfalls. Eighty percent of the annual precipitation occurs during the wet season from April to September, and the mean annual precipitation for the period 1978–1988 is 1802 mm. The mean annual runoff is 993.4 mm, or 55% of the annual precipitation. Precipitation data from the Heduobu, Tangwei and Jiuzhou stations for the period 1978–1988 are used as the main input data. The potential evapotranspiration was calculated with the Penman–Monteith formula using meteorological data from the Heyuan meteorological station. All data have gone through quality control in earlier studies (Jin et al., 2010).

Fig. 1. Main river, meteorological stations and discharge station of Jiuzhou River Basin.

2.2. WASMOD

WASMOD is a well-tested simple conceptual water balance model with multiple time scales (Xu, 2002). The input data to the model are precipitation and potential evapotranspiration. Model output data are fast flow, slow flow, actual evapotranspiration and soil moisture. The model can be set up with different spatial and temporal scales varying from global to catchment and from daily to monthly (Gong et al., 2009; Jin et al., 2010; Widen-Nilsson et al., 2009, 2007; Xu, 2002). For this study a monthly and a daily version of the model at the catchment scale are used. The main model equations are shown in Table 1. Snowmelt is not considered in the simulation because snow rarely occurs in the Jiuzhou River basin in the south of China. There are three parameters in monthly WASMOD and four parameters in daily WASMOD, the latter having one additional routing parameter. Furthermore, parameters $a_2$ and $a_3$ have been multiplied by $10^3$ and $10^4$, respectively, so that they are scaled into [0, 1] in this paper.

Table 1
The main equations in WASMOD.

| Process | Equation | Parameter range |
|---|---|---|
| Actual evapotranspiration | $e_t = \min\{(sm_{t-1}/\Delta t + p_t)(1 - \exp(-a_1\, ep_t)),\ ep_t\}$ | $0 \le a_1 \le 1$ |
| Slow flow | $s_t = a_2\,(sm_{t-1})^2$ | $0 \le a_2 \le 1$ |
| Fast flow | $f_t = a_3\,(sm_{t-1})\,(p_t - ep_t(1 - \exp(-p_t/\max(ep_t, 1))))$ | $0 \le a_3 \le 1$ |
| Water balance | $sm_t = sm_{t-1} + (p_t - e_t - d_t)\,\Delta t$ | |
| Routing of fast flow | $SC_t = SC_{t-1} + f_t\,\Delta t$; $Q_t = a_4\,SC_t$; $SC_t = SC_t - Q_t\,\Delta t$ | $0 \le a_4 \le 1$ |
| Total runoff | $d_t = s_t + Q_t$ | |

Here $t$ denotes the $t$th month or day, $p_t$ is precipitation, $ep_t$ is potential evaporation, $sm_t$ is the soil moisture of the $t$th month or day, $SC_t$ is channel storage at time step $t$, and $\Delta t$ is the time step, which is either one month for the monthly model or one day for the daily model. The unit for all the fluxes is mm/month for the monthly model and mm/day for the daily model. The unit for the state variables soil moisture and channel storage is mm.

3. Methods

3.1. Residual analysis

The variable $\xi_t$ represents the difference between the observed random variable $\eta_t(y_t)$ and the simulated random variable $\eta_t^{M}(y_t^{M})$:

$$\xi_t = \eta_t(y_t) - \eta_t^{M}(y_t^{M}) \tag{1}$$

where $t$ is an index for time, $M$ indicates that the variable is simulated, $y$ is the variable in original space and $\eta$ is the transformed variable. In this study, the Box–Cox transformation method is used, which can be expressed as:

$$h(y;\lambda) = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln(y), & \lambda = 0 \end{cases} \tag{2}$$

where $\lambda$ is a transformation parameter. It is often fixed, e.g. at the value of 0.3 in Yang et al. (2007) or 0.2 (Engeland et al., 2010). The square-root transformation and log-transformation are special cases of the Box–Cox transformation (Engeland and Gottschalk, 2002; Engeland et al., 2005; Jin et al., 2010; Langsrud et al., 1998; Wang et al., 2009). The Lilliefors test (Lilliefors, 1967) was used to verify the normality of the distribution of the residuals of the Box–Cox transformed variable with $\lambda$ fixed at different values for the monthly and daily models. The test statistic is called LSTAT and it should be as small as possible. Calculations indicated that the Box–Cox transformation yields residuals closest to normal for $\lambda = 0.6$ in daily and $\lambda = 0.2$ in monthly WASMOD.

It was also necessary to verify whether the $\xi_t$ were stochastically independent. In case of stochastic dependence, an AR (1) model needs to be fitted to the residuals as follows:

$$\xi_t - b = a(\xi_{t-1} - b) + \varepsilon_t \tag{3}$$

There is no need to use the AR (1) model in monthly WASMOD, since the monthly residuals were not significantly correlated. In contrast, the daily residuals were more heteroscedastic and autocorrelated. Using the Box–Cox transformation to reduce the heteroscedasticity of the residuals increased their degree of autocorrelation. We therefore used the AR (1) model to make the Box–Cox transformed residuals independent. However, the AR (1) innovations were still auto-correlated. We therefore introduced the AR (1) plus Multi-Normal model to account for all autocorrelations. This method has been used before in hydrological applications (Mantovan and Todini, 2006; Romanowicz et al., 1994).

Fig. 2 shows that the variances of the AR (1) innovations from daily residuals have different values in different seasons. We therefore tested two statistical models: (i) in Model 1 the statistical parameters were constant and (ii) in Model 3 the parameters were allowed to depend on the season. Two simulation sub-periods were defined: a wet period from April to September and a dry period from October to March.

3.2. Statistical models for the prediction errors

3.2.1.
Statistical model for monthly WASMOD

The assumption that there is neither bias nor autocorrelation in monthly residuals results in the following likelihood function for monthly WASMOD:

$$L(\omega,\theta \mid \xi) = \prod_{t=1}^{T} \frac{1}{\sigma}\,\rho\!\left(\frac{\xi_t}{\sigma}\right) \tag{4}$$

where $\theta$ represents the hydrological model parameters, $\omega$ represents the statistical parameters, including in this case only the standard deviation of the residuals $\sigma$, $\xi = \{\xi_1, \ldots, \xi_T\}$ are the residuals, $t$ is the time index, and $\rho$ is the normal density operator.

Fig. 2. Variances of daily runoff residuals in Jiuzhou basin (1979–1988) after Box–Cox transformation and AR (1). (Axes: month, 1–12; residual.)

3.2.2. Statistical models for daily WASMOD

(a) The AR (1) plus Normal model in which the statistical parameters are independent of the periods (Model 1)

The likelihood function for the AR (1) plus Normal model for the prediction errors in daily WASMOD is:

$$L(\omega,\theta \mid \xi) = \frac{1}{\sigma}\,\rho\!\left(\frac{\xi_0}{\sigma}\right) \prod_{t=1}^{T} \frac{1}{\sigma}\,\rho\!\left(\frac{\xi_t - b - a(\xi_{t-1} - b)}{\sigma}\right) \tag{5}$$

where $\theta$ represents the hydrological model parameters, $\omega$ represents the statistical parameters, including an autoregressive term $a$, a bias term $b$ and a standard deviation of the residuals $\sigma$, $\xi_0$ is the initial residual value, $t$ is the time index, and $\rho$ is the normal density operator.

(b) The AR (1) plus Multi-Normal model (Model 2)

The likelihood function is built for time steps ranging from 1 to $t$ ($t = n$) given the AR (1) plus Multi-Normal model:

$$L(\omega,\theta \mid \xi) = \frac{\exp\!\left(-\tfrac{1}{2}(\xi - b)^{T}\,\Sigma^{-1}(\xi - b)\right)}{(2\pi)^{n/2}\,[\det(\Sigma)]^{1/2}} \tag{6}$$

where $\Sigma$ is the covariance matrix, which can be approximated with:

$$\Sigma = \sigma_{\xi}^{2} \begin{bmatrix} 1 & r_1 & r_1^{2} & \cdots & r_1^{n-2} & r_1^{n-1} \\ r_1 & 1 & r_1 & \cdots & r_1^{n-3} & r_1^{n-2} \\ r_1^{2} & r_1 & 1 & \cdots & r_1^{n-4} & r_1^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ r_1^{n-2} & r_1^{n-3} & r_1^{n-4} & \cdots & 1 & r_1 \\ r_1^{n-1} & r_1^{n-2} & r_1^{n-3} & \cdots & r_1 & 1 \end{bmatrix} \tag{7}$$

where

$$\sigma_{\xi}^{2} = \operatorname{var}[\xi_t] \tag{8}$$

$$r_1 = \frac{\sum_{t=1}^{n-1} (\xi_t - \bar{\xi})(\xi_{t+1} - \bar{\xi})}{\sum_{t=1}^{n} (\xi_t - \bar{\xi})^{2}} \tag{9}$$

(c) The AR (1) plus Normal model in which the statistical parameters depend on the periods (Model 3)

The likelihood function for daily WASMOD when the statistical parameters depend on the periods is:

$$L(\omega,\theta \mid \xi) = \frac{1}{\sigma_0}\,\rho\!\left(\frac{\xi_0}{\sigma_0}\right) \prod_{t=1}^{T} \frac{1}{\sigma_t}\,\rho\!\left(\frac{\xi_t - b_t - a_t(\xi_{t-1} - b_{t-1})}{\sigma_t}\right) \tag{10}$$

where all statistical parameters ($b_t$, $a_t$ and $\sigma_t$) are labelled according to two periods, the wet period (April to September) and the dry period (the remaining months):

$$\omega_t\{b_t, a_t, \sigma_t\} = \begin{cases} \omega_W\{b_W, a_W, \sigma_W\}, & t \in (\text{April to September}) \\ \omega_D\{b_D, a_D, \sigma_D\}, & \text{otherwise} \end{cases} \tag{11}$$

3.3. Prior density

There is no information about the distribution of the parameters. Parameter $\sigma$ is an unknown model standard deviation, and with Jeffreys’ uninformative prior its density is proportional to $\sigma^{-1}$ (Bernardo and Smith, 1994; Yang et al., 2007b). In addition, the prior probability densities of the other parameters are generally taken as non-informative multi-uniform distributions in hydrological applications (Engeland et al., 2005; Jin et al., 2010; Liu et al., 2005; Yang et al., 2008). The prior densities and intervals of all parameters in the two hydrological models are shown in Table 2. The posterior density $p(\varphi \mid \xi)$ is:

$$p(\varphi \mid \xi) = \frac{L(\varphi \mid \xi)\, f(\varphi)}{\int L(\varphi \mid \xi)\, f(\varphi)\, d\varphi} \tag{12}$$

where $f(\varphi)$ is the prior density, $\varphi = \{\omega, \theta\}$ and $\xi = \{\xi_1, \ldots, \xi_T\}$. The likelihood $L$ is given in Eqs. (4), (5), (6), and (10) above.

Table 2
The prior distributions of all parameters in the two hydrological models.

| Hydrological model | Parameter | Prior distribution (a) |
|---|---|---|
| Monthly WASMOD | $a_1$ | U[0, 1] |
| | $a_2$ | U[0, 1] |
| | $a_3$ | U[0, 1] |
| | $\sigma$ (b) | $\propto \sigma^{-1}$ |
| Daily WASMOD | $a_1$ | U[0, 1] |
| | $a_2$ | U[0, 1] |
| | $a_3$ | U[0, 1] |
| | $a_4$ | U[0, 1] |
| | $a$ | U[0, 1) |
| | $b$ | U[−10, 10] |
| | $\sigma$ (b) | $\propto \sigma^{-1}$ |

(a) U[a, b] means the prior distribution of the parameter is uniform over the interval [a, b].
(b) $\propto \sigma^{-1}$ means the prior density of the parameter at value $\sigma$ is proportional to $\sigma^{-1}$.

3.4. Metropolis–Hastings algorithm

The Markov chain Monte Carlo method known as the Metropolis–Hastings (MH) algorithm was used to sample from the posterior distributions. In general, the Markov chain is initiated from a random starting value, and a new sample is then derived from the proposal distribution. To avoid a long “burn in” time, the Markov chain was here initiated at an approximate maximum Nash–Sutcliffe value calculated with the help of an optimization algorithm named VA05A (Hopper, 1978). A new sample was suggested from the proposal distribution and accepted according to an acceptance probability; if it was not accepted, the chain kept the old sample. The necessary steps of the Metropolis–Hastings method have been explained in detail in the literature (Chib and Greenberg, 1995; Engeland and Gottschalk, 2002; Kuczera and Parent, 1998). Here the proposal distributions $d$ were assumed to be normal in the random walk M–H chain:

$$d(\omega) \sim N[\omega_L, S_\omega] \tag{13}$$

$$d(\theta) \sim N[\theta_L, S_\theta] \tag{14}$$

where $\omega = \{a, b, \sigma\}$ are the statistical parameters, $\theta$ are the hydrologic parameters, $L$ indexes the current state of the chain, and $S_\omega$ and $S_\theta$ are the covariance matrices of the proposal distributions. A normal proposal is not strictly appropriate for $\sigma$, which must satisfy $\sigma > 0$. However, in this study the standard deviation of $\sigma$ was much smaller than the mean of $\sigma$, which means that the normal proposal density lies mostly inside the feasible region. The other statistical parameters and the model parameters in this paper satisfied the same assumption.

The algorithm works best if the proposal density matches the target distribution, which is generally unknown. The tuning of the variance was performed by calculating the acceptance rate. The ideal acceptance rate is approximately 40–50% when one parameter is updated, decreasing to approximately 20–30% when several parameters are updated in one block (Chib and Greenberg, 1995; Engeland and Gottschalk, 2002). Furthermore, the Scale Reduction Score $\sqrt{R}$ (Gelman and Rubin, 1992) was used to check whether the MC chains converged.

Fig. 3. Histograms of parameters of monthly WASMOD derived by Metropolis–Hastings method. (Panels: A1, A2, A3; vertical axis: frequency.)

Table 3
The statistical characteristics of the parameters estimated by the Bayesian method for monthly WASMOD.

| Parameter | Min | Max | Average | Variance | Skewness | Estimated | 2.5% quantile | 5% quantile | 50% quantile | 95% quantile | 97.5% quantile | MNS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $a_1$ | 0.783 | 0.872 | 0.830 | 1.59E-04 | 0.134 | 0.829 | 0.804 | 0.809 | 0.830 | 0.850 | 0.854 | 0.856 |
| $a_2$ | 0.132 | 0.340 | 0.224 | 1.38E-03 | 0.166 | 0.207 | 0.158 | 0.166 | 0.223 | 0.286 | 0.300 | |
| $a_3$ | 0.148 | 0.405 | 0.258 | 1.54E-03 | 0.248 | 0.244 | 0.186 | 0.195 | 0.257 | 0.325 | 0.341 | |

MNS: Maximum Nash–Sutcliffe.

3.5. Uncertainty confidence intervals estimation

The uncertainty intervals for streamflow due to parameter uncertainty were calculated using all the MH samples of the hydrologic parameters. The samples were then sorted for every day in order to get credibility intervals. The empirical distribution $P_B(y < y_i)$ based on the rank $i$ was used to calculate the probability using the sorted sample $y_i$:

$$P_B(y < y_i) = i/n \tag{15}$$

where $n$ is the number of sample simulations and $P_B(y < y_i)$ is the quantile of discharge $y_i$.

The 95% confidence intervals for discharge due to parameter uncertainty and model uncertainty were calculated by adding the model residuals, in the form of a normally distributed random error with zero mean and variance $\sigma^2$, to each of the sample simulations that were available for each time step (Eq. (16)). For the daily model the bias and autoregressive terms were included as well (Eq. (17)).
The inverse of the Box–Cox transformation was then applied:

$$\eta_{i,t} = \eta^{M}_{i,t} + \operatorname{rnorm}(0, \sigma)_i \tag{16}$$

$$\eta_{i,t} = \eta^{M}_{i,t} + b + a\,(\eta_{i,t-1} - \eta^{M}_{i,t-1} - b) + \operatorname{rnorm}(0, \sigma)_i \tag{17}$$

$$y_{i,t} = h^{-1}(\eta_{i,t}) \tag{18}$$

where $i$ is the sample index, $t$ is the time index, rnorm is a random value drawn from a normal distribution with zero mean and variance $\sigma^2$, and $h^{-1}$ represents the inverse operation of the Box–Cox transformation. The 5% and 95% percentiles of discharge due to parameter uncertainty and model uncertainty are then derived by sorting the new discharge values $y_{i,t}$ at each time step (Eq. (18)), and Eq. (15) is used to obtain the empirical distributions.

3.6. Evaluation of uncertainty interval estimates

Three indices were used to evaluate the resolution, reliability and efficiency of the uncertainty interval estimates: (1) the Average Relative Interval Length (ARIL) by Jin et al. (2010), (2) the percentage of observations bracketed by the Confidence Interval (PCI) (Li et al., 2009) and (3) the maximum Nash–Sutcliffe value (MNS) (Nash and Sutcliffe, 1970). ARIL measures the resolution of the predictive distributions:

$$\mathrm{ARIL}_p = \frac{1}{n} \sum \frac{\mathrm{Limit}_{Upper,t,p} - \mathrm{Limit}_{Lower,t,p}}{R_{obs,t}} \tag{19}$$

$\mathrm{Limit}_{Upper,t,p}$ and $\mathrm{Limit}_{Lower,t,p}$ are the upper and lower boundary values of the $p$ confidence interval, $n$ is the number of time steps, and $R_{obs,t}$ is the observed discharge. ARIL should be as small as possible and was plotted as a function of $p$. The PCI measures the reliability of the predictive distributions and is based on the count of observed data falling within the simulated intervals, which is a function of $p$:

$$\mathrm{PCI}_p = \frac{NQ_{in,p}}{n} \tag{20}$$

$NQ_{in,p}$ is the number of observations which are contained within the $p$ confidence interval. PCI was plotted as a function of $p$ and should be close to the 45° diagonal line. The maximum Nash–Sutcliffe value (MNS) measures the efficiency of the model predictions:

$$\mathrm{MNS} = \max_{i = 1, \ldots, N} \{\mathrm{NS}_i\} \tag{21}$$

$$\mathrm{NS} = 1 - \frac{\sum_{t=1}^{T} (R_{obs,t} - R_{sim,t})^{2}}{\sum_{t=1}^{T} (R_{obs,t} - \bar{R}_{obs})^{2}} \tag{22}$$

in which $i$ is the acceptable sample index, $t$ is the time index, $R_{obs,t}$ is the observed discharge, $R_{sim,t}$ is the simulated discharge, and $\bar{R}_{obs}$ is the average value of $R_{obs,t}$. MNS should be as close to 1 as possible.

However, previous research (Li et al., 2010b) shows that it is not adequate to judge the uncertainty results by ARIL or PCI alone, because they vary simultaneously. To facilitate a more direct uncertainty assessment for the 95% confidence interval of discharge, this paper proposes a new index based on ARIL and PCI, called the Percentage of observations bracketed by the Unit Confidence Interval (PUCI):

$$\mathrm{PUCI} = \frac{1.0 - \left|\mathrm{PCI} - 0.95\right|}{\mathrm{ARIL}} \tag{23}$$

From Eq. (23) we can see that PUCI ranges from zero to infinity and that its upper boundary is not clear, which is a weak point of the index. It is only used for evaluation of the 95% confidence interval. The larger the PUCI, the lower the uncertainty of the 95% confidence interval of discharge.

Table 4
The statistical characteristics of the parameters estimated by the three methods for daily WASMOD.

| Model | Parameter | Min | Max | Mean | Variance | Skewness | 5% quantile | 50% quantile | 95% quantile | MNS |
|---|---|---|---|---|---|---|---|---|---|---|
| Model 1 | $a_1$ | 0.967 | 0.993 | 0.981 | 1.83E-05 | 0.164 | 0.974 | 0.981 | 0.988 | 0.805 |
| | $a_2$ | 0.112 | 0.177 | 0.143 | 1.26E-04 | 0.074 | 0.125 | 0.143 | 0.162 | |
| | $a_3$ | 0.033 | 0.062 | 0.045 | 1.84E-05 | 0.42 | 0.038 | 0.044 | 0.052 | |
| | $a_4$ | 0.34 | 0.443 | 0.393 | 2.18E-04 | 0.022 | 0.368 | 0.393 | 0.416 | |
| Model 2 | $a_1$ | 0.955 | 0.988 | 0.974 | 3.81E-05 | 0.156 | 0.963 | 0.974 | 0.984 | 0.831 |
| | $a_2$ | 0.129 | 0.204 | 0.17 | 1.90E-04 | 0.206 | 0.145 | 0.17 | 0.192 | |
| | $a_3$ | 0.053 | 0.088 | 0.07 | 4.21E-05 | 0.136 | 0.059 | 0.071 | 0.08 | |
| | $a_4$ | 0.364 | 0.459 | 0.411 | 2.77E-04 | 0.038 | 0.383 | 0.411 | 0.437 | |
| Model 3 | $a_1$ | 0.959 | 0.983 | 0.972 | 1.72E-05 | 0.186 | 0.965 | 0.972 | 0.979 | 0.821 |
| | $a_2$ | 0.159 | 0.217 | 0.184 | 1.18E-04 | 0.715 | 0.17 | 0.182 | 0.206 | |
| | $a_3$ | 0.057 | 0.088 | 0.073 | 2.55E-05 | 0.393 | 0.064 | 0.074 | 0.081 | |
| | $a_4$ | 0.329 | 0.462 | 0.394 | 3.91E-04 | 0.186 | 0.363 | 0.394 | 0.429 | |

MNS: Maximum Nash–Sutcliffe.

Table 5
The correlation between the parameters for monthly WASMOD estimated by the Bayesian method.

| Correlation | $a_1$ | $a_2$ | $a_3$ |
|---|---|---|---|
| $a_1$ | 1.000 | 0.003 | 0.029 |
| $a_2$ | 0.003 | 1.000 | 0.725 |
| $a_3$ | 0.029 | 0.725 | 1.000 |

Fig. 4. Histograms of parameters of daily WASMOD derived by Metropolis–Hastings method using: (a) Model 1; (b) Model 2 and (c) Model 3. (Panels per model: A1, A2, A3, A4; vertical axis: frequency.)

4. Results and discussion

4.1. Parameter estimates and parameter uncertainty

Five parallel simulation chains are used in the MH algorithm with 5000 iterations each. The acceptance rate ranges between 20% and 30%. The total number of samples of each parameter is 25,000. The Scale Reduction Score $\sqrt{R}$ for all the parameters is approximately 1.0, indicating convergence of the chains.

Fig. 3 shows the marginal posterior parameter densities calculated by the Bayesian method for monthly WASMOD. Sharp and peaky distributions indicate well identifiable parameters, while flat distributions indicate a larger parameter uncertainty. Fig. 3 shows that all parameters have well-defined posterior distributions, from which parameter estimates can be clearly inferred as modal values.
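The convergence check reported above can be illustrated with a minimal sketch of the scale reduction score for a single parameter. It uses the basic between-chain and within-chain variance form and omits the sampling-variability correction of Gelman and Rubin (1992), so it is an approximation and not necessarily the exact computation used in this study. The function name `scale_reduction` and the synthetic chains are assumptions introduced only for illustration.

```python
import numpy as np


def scale_reduction(chains):
    """Basic scale reduction score sqrt(R) for one parameter.

    chains: array of shape (m, n) holding m parallel chains with n samples each.
    Values close to 1 suggest that the chains sample the same distribution.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance B and mean within-chain variance W.
    between = n * chain_means.var(ddof=1)
    within = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return float(np.sqrt(pooled / within))


# Hypothetical stand-in for MH output: 5 chains of 5000 draws of one parameter.
rng = np.random.default_rng(0)
mixed = rng.normal(0.83, 0.0126, size=(5, 5000))
# The same chains, but each one shifted so that they disagree with each other.
shifted = mixed + 0.05 * np.arange(5)[:, None]

print("well mixed chains:", scale_reduction(mixed))
print("shifted chains:   ", scale_reduction(shifted))
```

In practice the rows of `chains` would be the post burn-in samples of one parameter from each parallel MH chain, and the score would be evaluated separately for every parameter.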
The marginal posterior distributions for all parameters are nearly symmetric, as indicated by the skewness coefficients in Table 3 and as visible in Fig. 3. In addition, the interval of parameter a1 is very narrow and parameter a3 has the most skewed distribution. Correlations between the parameters of monthly WASMOD estimated from the Bayesian samples are given in Table 5. The correlation coefficients range between 0.003 and 0.725. All of them are quite small except the one between parameters a2 and a3, which has a value of 0.725. This shows that the slow flow and fast flow are highly correlated, since both of them are related to soil moisture content.

For daily WASMOD, the marginal posterior parameter densities calculated by Models 1–3 are shown in Fig. 4. The following can be deduced from the figure: (1) comparing the posterior distributions of parameters derived by the three models, those estimated by Model 1 are slightly different; the variances and effective parameter space obtained from Model 3 are more similar to those from Model 1 than Model 2 and (2) the posterior distributions obtained from Model 1 are somewhat sharper and narrower than those obtained from Model 2 and Model 3, indicating that a slightly more optimal parameter set has been identified. This is confirmed by the variances of parameter samples given in Table 4, which reports posterior summary statistics for each parameter estimated by the three likelihood functions for daily WASMOD. It shows several interesting features: (1) the variance of parameter samples estimated by Model 1 is smaller than those from Model 2 and Model 3; (2) the interval length for parameter a4 is nearly the same for the three methods, while for the other three parameters, the samples by Model 2 and Model 3 have slightly larger intervals than those by Model 1; (3) the marginal posterior distributions for all parameters, except for a3 of Model 1 and a2 of Model 3, are nearly symmetric as demonstrated by the skewness coefficients and (4) the maximum Nash–Sutcliffe (MNS) values of Model 1, Model 2 and Model 3 are 0.805, 0.831 and 0.821, respectively. The MNS from Model 1 is the smallest, which indicates that Model 1 has low accuracy although it has the narrowest interval of posterior distribution for parameters. This means that the autocorrelation of residuals is more important in daily WASMOD and needs to be considered for uncertainty assessment purposes.

Table 6 shows the correlations between the parameter samples estimated from the three models. Compared to monthly WASMOD, the average correlation between the parameters has increased slightly, but is still in the same range. The largest correlation is now between a1 and a3, and not between a2 and a3 as in monthly WASMOD.

Table 6
The correlation between the parameters for daily WASMOD estimated by the Bayesian method.

Model 1   a1      a2      a3      a4
a1        1       0.468   0.617   0.283
a2        0.468   1       0.189   0.259
a3        0.617   0.189   1       0.327
a4        0.283   0.259   0.327   1

Model 2   a1      a2      a3      a4
a1        1       0.338   0.380   0.272
a2        0.338   1       0.259   0.148
a3        0.380   0.259   1       0.231
a4        0.272   0.148   0.231   1

Model 3   a1      a2      a3      a4
a1        1       0.341   0.488   0.363
a2        0.341   1       0.352   0.305
a3        0.488   0.352   1       0.282
a4        0.363   0.305   0.282   1

4.2. Confidence interval of discharge due to parameters and model uncertainty

For monthly WASMOD, the 95% confidence intervals of discharge from 1979 to 1988 due to parameter and model uncertainty are plotted in Fig. 5. Most of the observed data are covered by the 95% confidence intervals. This can be seen more clearly in Fig. 6, which plots the Average Relative Interval Length (ARIL) and the percentage of observations bracketed by the Confidence Interval (PCI) against the confidence interval value for monthly WASMOD. Both ARIL and PCI increase as the confidence interval increases. Furthermore, the relation between ARIL and PCI for different confidence interval values (Probability) is shown in Fig. 8.

Table 7 reports the uncertainty measures of the different likelihood models for the two time-scale hydrological models for the 95% confidence interval. For monthly WASMOD, the ARIL values derived by the Bayesian method are 0.250 due to parameter uncertainty and 1.516 due to combined parameter and model uncertainty, while the corresponding PCI values are 29.2% and 94.2%. This indicates that the uncertainty in the simulated discharges resulting from parameter uncertainty is less important than that resulting from total uncertainty.

The 95% confidence intervals of daily discharge for 1982 estimated by the three statistical models due to parameter and model uncertainty for daily WASMOD are plotted in Fig. 7 for illustrative purposes. Fig. 7a and b show no visible difference between the results obtained by Model 1 and Model 2, while the confidence intervals estimated by Model 3 differ between the two periods (Fig. 7c). The confidence intervals in dry periods derived by Model 3 are narrower than those derived from Model 2 and Model 1.

Fig. 5. The 95% confidence intervals of monthly discharge (1979/1–1988/12) due to parameter and model uncertainty (grey bands), simulated discharge at the maximum of the posterior distribution (solid) and observed discharge (dots). [Axes: Q (mm) against Month (Y-M).]

Fig. 6. The Average Relative Interval Length (ARIL) and the percentage of observations bracketed by the Confidence Interval (PCI) due to different confidence interval value (Probability) for 1979–1988 monthly runoff in Jiuzhou river basin. [Panels: (a) ARIL and (b) PCI against Confidence Interval.]

Fig. 7. The 95% confidence intervals of daily discharge (1982) due to parameter and model uncertainty (grey bands), simulated discharge at the maximum of the posterior distribution (solid) and observed discharge (dots) for (a) Model 1; (b) Model 2 and (c) Model 3. [Axes: Q (mm) against Day of year 1982.]

Fig. 8. The relation of ARIL and PCI due to different confidence interval value (Probability) for 1979–1988 monthly runoff in Jiuzhou river basin. [Axes: PCI against ARIL.]

Table 7
The comparison of uncertainty measures between two hydrological models for the 95% confidence interval.

Model                            Method        ARIL    PCI (%)
Monthly WASMOD (79–88)           Bayesian_P    0.250   29.2
Monthly WASMOD (79–88)           Bayesian_PM   1.516   94.2
Daily WASMOD (78–82), Model 1    Bayesian_P    0.451   28.2
Daily WASMOD (78–82), Model 1    Bayesian_PM   2.642   96.4
Daily WASMOD (78–82), Model 2    Bayesian_P    0.433   26.6
Daily WASMOD (78–82), Model 2    Bayesian_PM   2.512   96.2
Daily WASMOD (78–82), Model 3    Bayesian_P    0.290   11.5
Daily WASMOD (78–82), Model 3    Bayesian_PM   1.556   92.9

Bayesian_P: the 95% confidence interval is due to parameter uncertainty only. Bayesian_PM: the 95% confidence interval is due to parameter and model uncertainty.

Table 8
The PUCI based on low, medium, high and all flows due to 95% confidence interval for 1978–1982 daily runoffs in Jiuzhou river basin.

Model     Low flows   Medium flows   High flows   All flows
Model 1   0.255       0.428          0.805        0.373
Model 2   0.263       0.442          0.813        0.393
Model 3   0.610       0.607          0.675        0.629
The differences in the 95% confidence intervals between the three statistical models can be seen more clearly in Table 7, which shows that: (1) the ARIL of the 95% confidence interval due to parameter and model uncertainty derived from Model 2 is smaller than that from Model 1, while the PCI obtained by Model 2 is nearly the same as that of Model 1; (2) the ARIL of the 95% confidence interval due to parameter and model uncertainty derived from Model 3 is the smallest, with a value of 1.556, which is about 62% and 59% of the values of 2.512 and 2.642 from Model 2 and Model 1, respectively. Meanwhile, the PCI of Model 3 is 92.9%, which is quite close to 96.2% of Model 2 and 96.4% of Model 1. The explanation is that the statistical parameters of Model 3 depend on the simulation period, which makes the interval of dry-period discharge much narrower and the result thus more reliable and less uncertain; and (3) the ARIL of the 95% confidence interval due to parameter uncertainty is much smaller than that due to both parameter and model uncertainty for all statistical models in monthly and daily WASMOD. These results reveal that parameter uncertainty is not the most important factor in uncertainty estimates for either monthly or daily hydrological models.

However, from the results in Table 7 it is difficult to identify which likelihood function is the best for daily WASMOD when both ARIL and PCI are considered as indices for confidence interval evaluation. For instance, Model 1 has the largest PCI value while Model 3 has the smallest ARIL value. To settle this problem and make the comparison easier and clearer, the new index defined in Eq. (23) is used, and the results are shown in Table 8, in which the values of PUCI based on low, medium, high and all flows for the 95% confidence interval of 1978–1982 daily runoff in the Jiuzhou basin are compared for the three likelihood functions.
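As a worked check of Eq. (23), the following Python sketch computes ARIL, PCI and PUCI as defined in Eqs. (19), (20) and (23) and applies the PUCI formula to the Bayesian_PM values of Table 7. The function names and the list-based interface are illustrative assumptions, not part of the original study. PCI is passed as a fraction (0.964 rather than 96.4%). The three printed values agree with the "All Flows" entries of Table 8.

```python
def aril(lower, upper, observed):
    """Average Relative Interval Length, Eq. (19)."""
    n = len(observed)
    return sum((u - l) / o for l, u, o in zip(lower, upper, observed)) / n


def pci(lower, upper, observed):
    """Fraction of observations inside the interval, Eq. (20)."""
    inside = sum(1 for l, u, o in zip(lower, upper, observed) if l <= o <= u)
    return inside / len(observed)


def puci(pci_value, aril_value, target=0.95):
    """PUCI, Eq. (23); pci_value is a fraction, not a percentage."""
    return (1.0 - abs(pci_value - target)) / aril_value


# Bayesian_PM values from Table 7: (ARIL, PCI as a fraction)
table7 = {
    "Model 1": (2.642, 0.964),
    "Model 2": (2.512, 0.962),
    "Model 3": (1.556, 0.929),
}

for name, (aril_value, pci_value) in table7.items():
    print(name, f"{puci(pci_value, aril_value):.3f}")
# Model 1 0.373
# Model 2 0.393
# Model 3 0.629
```

The `aril` and `pci` helpers take the lower and upper interval bounds and the observed discharge for each time step, so the same three functions could be applied to subsets of time steps (for example low, medium or high flows) once those subsets are defined.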
Table 8 clearly shows that (1) Model 3 has the largest PUCI of 0.629 and Model 1 has the smallest PUCI of 0.373 for all flows. Consequently, Model 3 is the best for daily time steps, which highlights the importance of time-dependent statistical parameters; (2) the PUCI of Model 2 is the largest for high flows, implying that Model 2 is less uncertain and better in high flow estimation than the other two models and (3) an inter-comparison of Model 1 and Model 2, neither of which has time dependence, shows in terms of PUCI values that Model 2 is superior to Model 1 over the whole range of flows. This reflects that Model 2, which uses more auto-correlation steps over the time periods, outperforms Model 1.

4.3. Comparison of application and computational efficiency

Table 9 shows the comparison of application and computational efficiency of the statistical models in WASMOD with different time-scales.

Table 9
Comparison of application and computational efficiency (Computer with Intel(R) Core(TM) 2 Duo CPU T8300 @ 2.40 GHz and 2.39 GHz, 1.96 GB of RAM).

Model                                Difficulty of implementation   Number of runs × number of chains   Time
Monthly WASMOD (1979–1988)           Very easy                      5000 × 5                            1 min
Daily WASMOD (1978–1982), Model 1    Little complicated             5000 × 5                            141.6 min
Daily WASMOD (1978–1982), Model 3    More complicated               5000 × 5                            188.4 min
Daily WASMOD (1978–1982), Model 2    Very complicated               2000 × 5                            23 days

From the running time, we can see that Model 2 is computationally more demanding than Model 1 and Model 3. Implementation of Model 1 is very easy compared to Model 2. This is because Model 2 considers the autocorrelation of whole sample periods, which implies hundreds or thousands of matrix calculations for the more complex likelihood function and subsequent processing. For monthly WASMOD, the computational efficiency is much higher than that of daily WASMOD.
For daily WASMOD, although it is more efficient for Model 2 to find the correct posterior distribution of parameters than for Model 1 and Model 3, Model 2 is computationally more expensive than Model 1 and Model 3, which is certainly the main disadvantage of using this model in the Bayesian method for uncertainty assessment.

5. Conclusions

This study performed a comprehensive comparison and evaluation of uncertainty estimation through the Bayesian method by applying the Metropolis–Hastings (MH) algorithm to two well-tested conceptual hydrological models (WASMOD) operated with monthly and daily time steps. Several likelihood functions were constructed to supply the uncertainty estimates in the monthly and daily hydrological models. The Box–Cox transformation was applied to the observed and simulated flows in all likelihood functions. For monthly WASMOD, a statistical Gaussian error model was constructed for the residuals after the transformation. For daily WASMOD, the AR (1) plus Multi-Normal model was used after the transformation (Model 2). Furthermore, Model 1 and Model 3 compared the results derived by two likelihood functions whose statistical parameters are independent of time and dependent on time, respectively. A novel index called the Percentage of observations bracketed by the Unit Confidence Interval (PUCI) has been proposed for uncertainty evaluation of discharges. The following conclusions are drawn from the results of this study:

The autocorrelations of residuals in monthly WASMOD and daily WASMOD are different. In monthly WASMOD, it is not important to consider the autocorrelation, whereas in daily WASMOD it is necessary. In monthly WASMOD, the posterior distributions of parameters were sharp and peaky, which is associated with well identifiable model parameter sets. In addition, the goodness of model fit, as measured by the maximum Nash–Sutcliffe efficiency value, is comparable to the value calculated by the optimization algorithm.
The AR (1) model is not adequate to estimate the distribution of residuals in daily WASMOD. The values of PUCI show that Model 2 outperforms Model 1, although the posterior distributions of parameters derived by Model 1 are narrower than those obtained by Model 2. Furthermore, the maximum Nash–Sutcliffe efficiency value obtained by Model 2 is 0.831, which is higher than the 0.805 achieved by Model 1. We conclude that the estimated parameters derived from Model 2 are more accurate than those of Model 1.

For daily WASMOD, the values of PUCI show that Model 3 performs the best over the entire flow range, while Model 2 is the best for high flows. Moreover, the MNS derived from Model 3 is 0.821, which is slightly smaller than that of Model 2 (0.831) but larger than that of Model 1 (0.805). This is because Model 3 uses more statistical parameters, which reflect the statistical properties of the residuals more appropriately than Model 1. In fact, Model 3 performs better than Model 1 on the 95% confidence interval of discharge estimates.

Considering the difference in application and computational efficiency, a simple Gaussian error model is suitable for monthly WASMOD since there is no auto-correlation in the transformed monthly residuals, while the value of PUCI shows that Model 3 is the best for daily WASMOD. However, if the uncertainty of high flows and the accuracy of estimated parameters are considered, Model 2 is superior to Model 1 and Model 3 for daily WASMOD.

Although Model 2 performs best for high flows in daily WASMOD, its PUCI is smaller than that of Model 3 for all flows, which shows that Model 2 is also sub-optimal for daily WASMOD. It is better for statistical models to split the simulation period. Two periods (dry and wet) for the statistical parameters are probably not enough, which means that additional time sections need to be considered for Bayesian analyses in daily WASMOD.
Furthermore, some low flows and high flows do not fit the normal distribution after the Box–Cox transformation and AR (1) modelling in daily WASMOD. This implies that further efforts are required to improve the formulation of likelihood functions used in hydrological applications, since the assumption of normally distributed residuals in daily WASMOD over the entire flow range is inappropriate.

Finally, one of the major challenges is to find an appropriate likelihood function for the error distribution when extreme flows are outside the normal distribution. In view of the computational effort required by Model 2, another important issue is to find an improved way to account for error auto-correlation in daily hydrological models. Although the study has led to interesting findings, the generality of the results needs to be tested with additional hydrological models and study regions in future research.

Acknowledgments

This study was supported by the Key Project of the Natural Science Foundation of China (No. 40730632), the Norwegian Research Council (RCN) through project number 171783 (FRIMUF), and the Environment and Development Programme (FRIMUF) of the Research Council of Norway, project 190159/V10 (SoCoCA). The authors would like to thank Mario L.V. Martina for his advice and help in this study. We also appreciate the comments and suggestions from two reviewers and the associate editor, Prof. Ezio Todini. Their comments greatly helped to improve the quality of this paper.

References

Abbaspour, K.C. et al., 2007. Modelling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. Journal of Hydrology 333 (2–4), 413–430.
Bates, B.C., Campbell, E.P., 2001. A Markov chain Monte Carlo scheme for parameter estimation and inference in conceptual rainfall–runoff modeling. Water Resources Research 37 (4), 937–947.
Benke, K.K., Lowell, K.E., Hamilton, A.J., 2008. Parameter uncertainty, sensitivity analysis and prediction error in a water-balance hydrological model. Mathematical and Computer Modelling 47 (11–12), 1134–1149.
Bernardo, J.M., Smith, A.F., 1994. Bayesian Theory. Wiley, Chichester.
Beven, K.J., Freer, J., 2001. Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology. Journal of Hydrology 249 (1–4), 11–29.
Blasone, R.S., Madsen, H., Rosbjerg, D., 2008a. Uncertainty assessment of integrated distributed hydrological models using GLUE with Markov chain Monte Carlo sampling. Journal of Hydrology 353 (1–2), 18–32.
Blasone, R.S. et al., 2008b. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling. Advances in Water Resources 31 (4), 630–648.
Chen, X., Chen, Y.D., Xu, C.-Y., 2007. A distributed monthly hydrological model for integrating spatial variations of basin topography and rainfall. Hydrological Processes 21 (2), 242–252.
Chib, S., Greenberg, E., 1995. Understanding the Metropolis–Hastings algorithm. The American Statistician 49, 327–335.
Chowdhury, S., Sharma, A., 2007. Mitigating parameter bias in hydrological modelling due to uncertainty in covariates. Journal of Hydrology 340 (3–4), 197–204.
Coccia, G., Todini, E., 2010. Recent developments in predictive uncertainty assessment based on the model conditional processor approach. Hydrology and Earth System Sciences Discussions 7, 9219–9270.
Duan, Q., Ajami, N.K., Gao, X., Sorooshian, S., 2007. Multi-model ensemble hydrologic prediction using Bayesian model averaging. Advances in Water Resources 30 (5), 1371–1386.
Engeland, K., Braud, I., Gottschalk, L., Leblois, E., 2006. Multi-objective regional modelling. Journal of Hydrology 327 (3–4), 339–351.
Engeland, K., Gottschalk, L., 2002. Bayesian estimation of parameters in a regional hydrological model. Hydrology and Earth System Sciences 6 (5), 883–898.
Engeland, K., Renard, B., Steinsland, I., Kolberg, S., 2010. Evaluation of statistical models for forecast errors from the HBV model. Journal of Hydrology 384 (1–2), 142–155.
Engeland, K., Xu, C.Y., Gottschalk, L., 2005. Assessing uncertainties in a conceptual water balance model using Bayesian methodology. Hydrological Sciences Journal 50 (1), 45–63.
Gelman, A., Rubin, D.B., 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7, 457–472.
Gong, L., Widen-Nilsson, E., Halldin, S., Xu, C.Y., 2009. Large-scale runoff routing with an aggregated network-response function. Journal of Hydrology 368 (1–4), 237–250.
Hopper, M.J., 1978. A Catalogue of Subroutines. Harwell Subroutine Library, UK Atomic Energy Authority, Harwell.
Jin, X., Xu, C.-Y., Zhang, Q., Singh, V.P., 2010. Parameter and modeling uncertainty simulated by GLUE and a formal Bayesian method for a conceptual hydrological model. Journal of Hydrology 383 (3–4), 147–155.
Kavetski, D., Kuczera, G., Franks, S.W., 2006a. Bayesian analysis of input uncertainty in hydrological modeling. 1. Theory. Water Resources Research 42 (3), W03407.
Kavetski, D., Kuczera, G., Franks, S.W., 2006b. Bayesian analysis of input uncertainty in hydrological modeling. 2. Application. Water Resources Research 42 (3), W03408.
Krzysztofowicz, R., 1997. Transformation and normalization of variates with specified distributions. Journal of Hydrology 197 (1–4), 286–292.
Krzysztofowicz, R., 1999. Bayesian theory of probabilistic forecasting via deterministic hydrologic model. Water Resources Research 35 (9), 2739–2750.
Krzysztofowicz, R., 2002. Bayesian system for probabilistic river stage forecasting. Journal of Hydrology 268 (1–4), 16–40.
Krzysztofowicz, R., Herr, H.D., 2001. Hydrologic uncertainty processor for probabilistic river stage forecasting: precipitation-dependent model. Journal of Hydrology 249 (1–4), 46–68.
Krzysztofowicz, R., Kelly, K.S., 2000. Hydrologic uncertainty processor for probabilistic river stage forecasting. Water Resources Research 36 (11), 3265–3277.
Kuczera, G., Kavetski, D., Franks, S., Thyer, M., 2006. Towards a Bayesian total error analysis of conceptual rainfall–runoff models: characterising model error using storm-dependent parameters. Journal of Hydrology 331 (1–2), 161–177.
Kuczera, G., Parent, E., 1998. Monte Carlo assessment of parameter uncertainty in conceptual catchment models: the Metropolis algorithm. Journal of Hydrology 211 (1–4), 69–85.
Langsrud, Ø., Frigessi, A., Høst, G., 1998. Pure model error for the HBV-model. Norwegian Water Resources and Energy Directorate. Hydra Note 4/1998, Oslo, 28pp.
Li, L., Xia, J., Xu, C.-Y., Chu, J., Wang, R., 2009. Analyse the sources of equifinality in hydrological model using GLUE methodology. In: Hydroinformatics in Hydrology, Hydrogeology and Water Resources (Proc. of Symposium JS.4 at the Joint IAHS & IAH Convention, Hyderabad, India, September 2009) IAHS Publ., no. 331, pp. 130–138.
Li, Z., Shao, Q., Xu, Z., Cai, X., 2010a. Analysis of parameter uncertainty in semi-distributed hydrological models using bootstrap method: a case study of SWAT model applied to Yingluoxia watershed in northwest China. Journal of Hydrology 385 (1–4), 76–83.
Li, L., Xia, J., Xu, C.-Y., Singh, V.P., 2010b. Evaluation of the subjective factors of the GLUE method and comparison with the formal Bayesian method in uncertainty assessment of hydrological models. Journal of Hydrology 390 (3–4), 210–221.
Lilliefors, H.W., 1967. On the Kolmogorov–Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association 62 (318), 399–402.
Liu, Z., Martina, M.L.V., Todini, E., 2005. Flood forecasting using a fully distributed model: application of the TOPKAPI model to the Upper Xixian Catchment. Hydrology and Earth System Sciences 9 (4), 347–364.
Looper, J.P., Vieux, B.E., Moreno, M.A., 2009. Assessing the impacts of precipitation bias on distributed hydrologic model calibration and prediction accuracy. Journal of Hydrology, doi:10.1016/j.jhydrol.2009.09.048.
Mantovan, P., Todini, E., 2006. Hydrological forecasting uncertainty assessment: incoherence of the GLUE methodology. Journal of Hydrology 330 (1–2), 368–381.
Maranzano, C.J., Krzysztofowicz, R., 2004. Identification of likelihood and prior dependence structures for hydrologic uncertainty processor. Journal of Hydrology 290 (1–2), 1–21.
Marshall, L., Nott, D., Sharma, A., 2007. Towards dynamic catchment modelling: a Bayesian hierarchical mixtures of experts framework. Hydrological Processes 21 (7), 847–861.
Montanari, A., Brath, A., 2004. A stochastic approach for assessing the uncertainty of rainfall–runoff simulations. Water Resources Research 40, W01106.
Montanari, A., Grossi, G., 2008. Estimating the uncertainty of hydrological forecasts: a statistical approach. Water Resources Research 44, W00B08.
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models. Part 1. A discussion of principles. Journal of Hydrology 10 (3), 282–290.
Raftery, A.E., Balabdaoui, F., Gneiting, T., Polakowski, M., 2003. Using Bayesian Model Averaging to Calibrate Forecast Ensembles. Tech. Rep. 440, Department of Statistics, University of Washington, Seattle.
Raftery, A.E., Gneiting, T., Balabdaoui, F., Polakowski, M., 2005. Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review 133 (5), 1155–1174.
Reggiani, P., Weerts, A.H., 2008. A Bayesian approach to decision-making under uncertainty: an application to real-time forecasting in the river Rhine. Journal of Hydrology 356 (1–2), 56–69.
Romanowicz, R., Beven, K.J., Tawn, J., 1994. Evaluation of predictive uncertainty in nonlinear hydrological models using a Bayesian approach. Statistics for the Environment 2, 297–317.
Thiemann, M., Trosset, M., Gupta, H., Sorooshian, S., 2001. Bayesian recursive parameter estimation for hydrologic models. Water Resources Research 37 (10), 2521–2536.
Todini, E., 2007. Hydrological catchment modelling: past, present and future. Hydrology and Earth System Sciences 11 (1), 468–482.
Todini, E., 2008. A model conditional processor to assess predictive uncertainty in flood forecasting. International Journal of River Basin Management 6 (2), 123–138.
Todini, E., 2011. History and perspectives of hydrological catchment modeling. Hydrology Research 42 (2–3), 73–85.
Van der Waerden, B.L., 1952. Order tests for the two-sample problem and their power. Indagationes Mathematicae 14 (253), 453–458.
Van der Waerden, B.L., 1953a. Order tests for the two-sample problem and their power 2. Indagationes Mathematicae 15, 303–310.
Van der Waerden, B.L., 1953b. Order tests for the two-sample problem and their power 3. Indagationes Mathematicae 15, 311–316.
Vogel, R.M., Stedinger, J.R., Batchelder, R., Lee, S.U., 2008. Appraisal of the Generalized Likelihood Uncertainty Estimation (GLUE) Method. American Society of Civil Engineers, 1801 Alexander Bell Drive, Reston, VA, 20191-4400, USA.
Vrugt, J.A., Gupta, H.V., Bouten, W., Sorooshian, S., 2003. A Shuffled Complex Evolution Metropolis algorithm for optimization and uncertainty assessment of hydrologic model parameters. Water Resources Research 39 (8), 1201.
Wang, G., Xia, J., Chen, J., 2009. Quantification of effects of climate variations and human activities on runoff by a monthly water balance model: a case study of the Chaobai River basin in northern China. Water Resources Research 45 (7), W00A11.
Widen-Nilsson, E., Gong, L., Halldin, S., Xu, C.-Y., 2009. Model performance and parameter behavior for varying time aggregations and evaluation criteria in the WASMOD-M global water balance model. Water Resources Research 45 (5), W05418.
Widen-Nilsson, E., Halldin, S., Xu, C.-Y., 2007. Global water-balance modelling with WASMOD-M: Parameter estimation and regionalisation. Journal of Hydrology 340 (1–2), 105–118.
Xiong, L., O’Connor, K.M., 2008. An empirical method to improve the prediction limits of the GLUE methodology in rainfall–runoff modeling. Journal of Hydrology 349 (1–2), 115–124.
Xiong, L., Wan, M., Wei, X., O’Connor, K.M., 2009. Indices for assessing the prediction bounds of hydrological models and application by generalised likelihood uncertainty estimation. Hydrological Sciences Journal/Journal des Sciences Hydrologiques 54 (5), 852–871.
Xu, C.-Y., 2002. WASMOD – the water and snow balance modelling system. In: Singh, V.P., Frevert, D.K. (Eds.), Mathematical Models of Small Watershed Hydrology and Applications. Water Resources Publications, LLC, Chelsea, Michigan, USA, pp. 555–590.
Xu, C.Y., Vandewiele, G.L., 1995. Parsimonious monthly rainfall–runoff models for humid basins with different input requirements. Advances in Water Resources 18 (1), 39–48.
Yang, J., Reichert, P., Abbaspour, K.C., 2007a. Bayesian uncertainty analysis in distributed hydrologic modeling: a case study in the Thur River basin (Switzerland). Water Resources Research 43 (10), W10401.
Yang, J., Reichert, P., Abbaspour, K.C., Xia, J., Yang, H., 2008. Comparing uncertainty analysis techniques for a SWAT application to the Chaohe Basin in China. Journal of Hydrology 358 (1–2), 1–23.
Yang, J., Reichert, P., Abbaspour, K.C., Yang, H., 2007b. Hydrological modelling of the Chaohe Basin in China: statistical model formulation and Bayesian inference. Journal of Hydrology 340 (3–4), 167–182.
Yapo, P.O., Gupta, H.V., Sorooshian, S., 1998. Multi-objective global optimization for hydrologic models. Journal of Hydrology 204 (1–4), 83–97.