Uncorrected Proof

© IWA Publishing 2016 Hydrology Research | in press | 2016

Probabilistic prediction in ungauged basins (PUB) based on regional parameter estimation and Bayesian model averaging

Yanlai Zhou, Shenglian Guo, Chong-Yu Xu, Hua Chen, Jiali Guo and Kairong Lin

ABSTRACT

Predictions in ungauged basins (PUB) are widely considered to be one of the fundamental challenges in the hydrological sciences. This paper couples a regional parameter transfer module with a probabilistic prediction module in order to obtain probabilistic PUB. The proposed approach comprises four steps: (1) variable infiltration capacity-three layers (VIC-3L) model description; (2) three regional parameter transfer schemes for ungauged basins, i.e., regression analysis, spatial proximity, and physical similarity; (3) probabilistic PUB using Bayesian model averaging (BMA); and (4) performance evaluation for probabilistic PUB. The study is performed on 12 sub-basins in the Hanjiang River basin, China. The results demonstrate that, for the two ungauged sub-basins, the mean prediction of BMA is closer to the observed data than that of the best individual parameter transfer scheme (the physical similarity approach), and that the probabilistic predictions of BMA reduce the uncertainty in runoff PUB more effectively than any individual parameter transfer scheme.
Key words | Bayesian model averaging, physical similarity, predictions in ungauged basins, regional parameter estimation, regression analysis, spatial proximity

Yanlai Zhou (corresponding author)
Changjiang River Scientific Research Institute, Wuhan 430010, China
E-mail: zyl23bulls@whu.edu.cn

Yanlai Zhou, Shenglian Guo, Chong-Yu Xu, Hua Chen
State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China

Chong-Yu Xu
Department of Geosciences, University of Oslo, Norway

Jiali Guo
Three Gorges University, 443002, China

Kairong Lin
School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China

doi: 10.2166/nh.2016.058

INTRODUCTION

Predictions in ungauged basins (PUB) are widely considered an important and challenging research topic in the hydrological sciences (Sivapalan et al.; Hrachowitz et al.). One of the primary research objectives of the PUB initiative was to improve the ability of existing hydrological models to predict in ungauged basins with reduced uncertainty (Dong et al. a, b; Parkes et al.). Regionalization of parameters, regarded as the process of transferring parameter values from a donor gauged basin to the target ungauged basin, is widely used to simulate runoff in PUB (Xu 2003; Xie et al.; Kizza et al.; Hailegeorgis & Alfredsen). Four regionalization approaches have typically been used for choosing the donor gauged basin whose calibrated and optimized parameter values are transferred to simulate runoff for the target ungauged basin: regression analysis, spatial proximity, physical similarity, and a mixture of them.

Various studies have been performed to determine which of the spatial proximity and physical similarity approaches performs better. Post & Jakeman investigated the relationships between the model parameters of a lumped conceptual rainfall-runoff model and the basin landscape attributes of similarly sized basins. Burn & Boorman, Johansson, Sefton & Howarth and Kokkonen et al. derived the relationships between model parameters and physical catchment descriptor indices using a geographical information system. Croke et al. adopted a simple hydrologic approach to simulate the response of runoff to land-use changes in ungauged basins. Goswami et al. developed a pooling method of regional parameter estimation coupled with soil data and the SMAR model in order to simulate flow in ungauged basins in France. Zhang & Chiew evaluated the advantages and disadvantages of different regionalization methods using two rainfall-runoff models, Xinanjiang and SIMHYD, in 210 Australian basins. The study showed that it is hard to identify which of the spatial proximity and physical similarity approaches is better. Most studies suggest that the use of more information (such as remotely sensed (RS) vegetation data, soil data, climate, and land cover) or a mixture of approaches can improve the accuracy of runoff simulation in ungauged basins (Duan et al.; Götzinger & Bárdossy; Oudin et al.; Bulygina et al.; Li et al.; Kling & Gupta; Li et al.).

Previous studies of PUB mostly focused on comparing the predictions of individual parameter transfer schemes and their weighted averages. Our research aims to compare the probabilistic prediction generated by Bayesian model averaging (BMA) with that of each individual parameter transfer scheme, in order to see whether BMA can effectively reduce the uncertainty in runoff PUB and improve the prediction reliability.

The paper is organized as follows: a brief introduction to the study area; then a general description of the main steps and procedures, including (1) variable infiltration capacity-three layers (VIC-3L) model description, (2) regional parameter transfer schemes, (3) probabilistic prediction using BMA, and (4) performance evaluation; then a comparison between BMA and its three individual parameter transfer schemes; and finally the conclusions drawn from this study.

STUDY AREA AND DATA

The Hanjiang River is the largest tributary of the Yangtze River. It passes through the provinces of Shaanxi and Hubei in China and merges into the Yangtze River at Wuhan city. The river has a length of 1,570 km and a drainage area of 159,000 km2. The basin has a sub-tropical monsoon climate and, as a result, a dramatic diversity in its water resources. Annual precipitation varies from 700 to 1,100 mm, with 70–80% of the total amount occurring in the wet season from May to October. The Hanjiang basin plays a critical role in flood control and water supply in central China. The Danjiangkou reservoir, located in the middle reach of the Hanjiang River, is the source of water for the middle route of the South-to-North Water Diversion Project (SNWDP), and the Jianghan plain in the lower basin is one of the most important bases for commodity grain production.

The location of the 12 primary streamflow stations in the Ankang basin (upper reach of the Hanjiang basin) is shown in Figure 1. The Ankang basin contains 98 precipitation stations, 9 weather stations, and 12 sub-basin streamflow stations (10 sub-basins have observed data; the Mumahe and Renhe sub-basins within the Ankang basin are treated as ungauged). Streamflow records together with DEM, forcing, soil and vegetation data are required for VIC-3L model implementation and calibration. The data include: (1) daily streamflow and weather data (downloaded from http://www.escience.gov.cn/), of which 1980–1986 and 1987–1990 are used for calibration and verification, respectively; (2) DEM data (downloaded from http://www.gscloud.cn/) of 0.009 degree (around 1 × 1 km cell size) spatial resolution for the Ankang basin, used to delineate the sub-basin boundaries and stream network; (3) vegetation type data taken from the global land cover classification generated by the University of Maryland with a one-kilometre pixel resolution; (4) vegetation parameters based on the vegetation from the Land Data Assimilation System; (5) soil parameters derived from the soil classification information of the global 5 min data provided by the National Oceanic and Atmospheric Administration.

Figure 1 | Location of 12 main streamflow stations of the Ankang basin in China.

METHODOLOGY

Procedures

A procedure coupling a regional parameter transfer module with a probabilistic prediction module is developed to obtain probabilistic prediction in ungauged basins. The regional parameter transfer module is aimed at obtaining model parameter estimates from a limited number of calibrated basins and then regionalizing them to uncalibrated basins based on the spatial proximity approach, the physical similarity approach (similar characteristics of climate as well as physicality), and multiple regression analysis, which are described in detail in the section named 'Regional parameter transfer'. The probabilistic prediction module is designed to infer a prediction by weighted averaging over the different regional parameter transfer schemes based on the BMA method, which is described in detail in the section named 'Hydrological probability prediction'. The main steps and procedures include: (1) VIC-3L model description; (2) regional parameter transfer schemes; (3) probabilistic prediction using BMA; and (4) performance evaluation.

VIC-3L model description

The VIC-3L model has one kind of bare soil and different vegetation types in each grid cell (Liang & Xie; Xie et al.). It includes both the saturation and infiltration excess runoff processes in a grid cell with a consideration of the sub-grid scale soil heterogeneity, and the frozen soil processes for cold climate conditions. The one-dimensional Richards equation is used to describe the vertical soil moisture movement, and the moisture transfer between soil layers obeys Darcy's law. The ARNO method is used to describe base flow, which takes place only in the lowest layer. The routing model, represented by the unit hydrograph method for overland flow and the linear Saint-Venant method for channel flow, allows runoff to be predicted (Liang et al.). The VIC-3L model has ten hydrological parameters that need to be calibrated, as shown in Table 1. A similar description of the VIC-3L parameters was made by Xie et al.

Table 1 | Hydrological parameters in VIC-3L model

Number  Variable  Description                                                                  Units
1       b         The shape of the variable infiltration capacity curve                        /
2       Dm        The maximum base flow from the lowest soil layer                             mm/day
3       Ds        The fraction of Dm where non-linear base flow begins                         /
4       Ws        The fraction of the maximum soil moisture where non-linear base flow occurs  /
5       Dep1      The depth of top layer soil                                                  m
6       Dep2      The depth of middle layer soil                                               m
7       Dep3      The depth of lower layer soil                                                m
8       x         The regulation capacity of river channel for stream flow                     /
9       k         The propagation time of steady flow in river channel                         h
10      ckg       The regulation capacity of slope land for base flow                          /

Regional parameter transfer

In the regional parameter transfer study, the parameters are assumed to take the same values within an individual sub-basin and across sub-basins with similar climate characteristics and underlying surface. Three regional parameter transfer schemes, i.e., the spatial proximity approach, the physical similarity approach, and multiple regression analysis, are tested in the study.

1. Spatial proximity approach – The spatial proximity approach uses the parameter values from the geographically closest gauged catchment, hypothesizing that neighboring catchments should behave similarly.

2. Physical similarity approach – The physical similarity approach transfers the entire set of parameter values from a physically similar catchment whose attributes (climatic and physical) are similar to those of the target ungauged one. The use of more information, such as RS vegetation data and soil data, in the physical similarity approach can improve runoff estimates in ungauged basins.

3. Multiple regression analysis – The multiple regression analysis approach establishes a relationship between VIC parameter values calibrated on gauged catchments and catchment descriptors or attributes (such as climatic, vegetation, and soil data); the VIC parameter values for the ungauged catchments are then estimated from these attributes and the established relationships. Three regression analysis equations are used to relate the dependent variables (VIC parameters) to the independent variables (fifteen climatic and soil characteristic variables). They are the linear regression analysis equation,

$$Y_j = \beta_0 + \beta_1 X_{1j} + \beta_2 X_{2j} + \cdots + \beta_m X_{mj} + \varepsilon_j \tag{1}$$

the square-root regression analysis equation,

$$\sqrt{Y_j} = \beta_0 + \beta_1 \sqrt{X_{1j}} + \beta_2 \sqrt{X_{2j}} + \cdots + \beta_m \sqrt{X_{mj}} + \varepsilon_j \tag{2}$$

and the logarithmic regression analysis equation,

$$\log(Y_j) = \beta_0 + \beta_1 \log(X_{1j}) + \beta_2 \log(X_{2j}) + \cdots + \beta_m \log(X_{mj}) + \varepsilon_j \tag{3}$$

where $Y_j$ denotes the jth dependent variable, $X_{1j}, X_{2j}, \ldots, X_{mj}$ are independent variables, and $\varepsilon_j$ is the fitting error, assumed to follow $\varepsilon_j \sim N(0, \sigma^2)$. Fifteen independent variables, comprising six climatic characteristic variables and nine soil characteristic variables, are used, as shown in Table 2.

For a detailed description of the methods used for the weather forcing data, vegetation dataset, and soil dataset based on regionalization and grid in this paper, readers are referred to Xie et al.

Table 2 | Soil and climatic characteristic variables

Number  Type      Variable  Description                                                                                  Units
1       Soil      Sat_h     Saturated hydraulic conductivity                                                             cm/h
2       Soil      Vsat      Variability of saturated hydraulic conductivity                                              (cm/h)^2
3       Soil      Bub       Bubble pressure                                                                              Pa
4       Soil      Qua       Quartz content                                                                               %
5       Soil      Sat_m     Saturated moisture content                                                                   %
6       Soil      Per_c     Percentage of critical moisture content                                                      %
7       Soil      Per_w     Percentage of wilting moisture content                                                       %
8       Soil      Res       Residual moisture content                                                                    %
9       Soil      Per_v     Percentage of valid moisture content                                                         %
10      Climatic  T         Annual mean temperature                                                                      °C
11      Climatic  P         Annual mean precipitation                                                                    mm
12      Climatic  E         Annual mean evaporation from water surface                                                   mm
13      Climatic  Cv_T      Coefficient of variation for monthly temperature during one year                             /
14      Climatic  Cv_P      Coefficient of variation for monthly rainfall during one year                                /
15      Climatic  Cv_E      Coefficient of variation for monthly evaporation from water surface during one year         /

Hydrological probability prediction – Bayesian model averaging

BMA is a statistical technique designed to infer a prediction by weighted averaging over many different regional parameter transfer schemes. This method is not only a pathway for scheme combination but also a coherent approach for accounting for between-scheme and within-scheme uncertainty (Ajami et al.). Below is a brief description of the basic ideas of this method.

Let us consider a quantity Q to be predicted on the basis of input data D = [I, O] (I denotes the input forcing data, and O stands for the observational flow data). f = [$f_1, f_2, \ldots, f_K$] is the ensemble of the K-member predictions. The probabilistic prediction of BMA is given by

$$p(Q \mid D) = \sum_{k=1}^{K} p(f_k \mid D) \cdot p_k(Q \mid f_k, D) \tag{4}$$

where $p(f_k \mid D)$ is the posterior probability of the prediction $f_k$ given the input data D and reflects how well the scheme fits the observations. In fact, $p(f_k \mid D)$ is just the BMA weight $\omega_k$: better performing predictions receive higher weights than poorer performing ones, and all weights are positive and add up to 1. $p_k(Q \mid f_k, D)$ is the conditional probability density function (PDF) of the prediction Q conditional on $f_k$ and D. For computational convenience, $p_k(Q \mid f_k, D)$ is always assumed to be a normal PDF and is represented as $g(Q \mid f_k, \sigma_k^2) \sim N(f_k, \sigma_k^2)$, where $\sigma_k^2$ is the variance associated with scheme prediction $f_k$ and observations O. In order to make this assumption valid, techniques such as the Box-Cox transformation are needed to make the data approximately normally distributed and to narrow the data range (Poirier).

The BMA mean prediction is a weighted average of the individual schemes' predictions, with their posterior probabilities being the weights. In the case that the observations and individual scheme predictions are all normally distributed, the BMA mean prediction can be expressed as

$$E[Q \mid D] = \sum_{k=1}^{K} p(f_k \mid D) \cdot E\left[g(Q \mid f_k, \sigma_k^2)\right] = \sum_{k=1}^{K} \omega_k f_k \tag{5}$$

EM algorithm for BMA parameter estimation

To estimate the BMA weight $\omega_k$ and scheme prediction variance $\sigma_k^2$, the Expectation-Maximization (EM) algorithm is described in this section. It has proved to be an efficient technique for BMA calculation under the assumption that the K-member predictions are normally distributed (Duan et al.).

Firstly, if we denote the set of BMA parameters to be estimated by $\theta = \{\omega_k, \sigma_k^2,\ k = 1, 2, \ldots, K\}$, the log form of the likelihood function can be represented as

$$l(\theta) = \log\left(p(Q \mid D)\right) = \log\left(\sum_{k=1}^{K} \omega_k \cdot g(Q \mid f_k, \sigma_k^2)\right) \tag{6}$$

It is difficult to maximize the function (6) by analytical methods. The EM algorithm is an effective method for finding the maximum likelihood by alternating between two steps, the expectation step and the maximization step. The two steps are iterated until convergence, i.e., until there is no significant change between two consecutive log-likelihood estimates. In the EM algorithm, a latent variable (unobserved quantity) $z_k^t$ is used as an aid for estimating the BMA weight $\omega_k$. For a detailed description of the EM algorithm for a BMA scheme, readers are referred to Dong et al. (b).

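The two EM steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes the widely used update rules in which the latent variable $z_k^t$ is the normalized value of $\omega_k \, g(Q^t \mid f_k^t, \sigma_k^2)$, the new weight is the time average of $z_k^t$, and the new variance is the $z$-weighted mean squared error, with the log-likelihood of Equation (6) summed over all time steps. The function names (`normal_pdf`, `bma_em`, `bma_mean`), the tolerance, the numerical floors and the synthetic data are all hypothetical, and only the standard library is used so that the block stays self-contained.

```python
import math
import random


def normal_pdf(q, mean, var):
    """Density of N(mean, var) evaluated at q."""
    return math.exp(-0.5 * (q - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bma_em(obs, preds, tol=1e-6, max_iter=500):
    """EM estimate of BMA weights and variances (assumed standard updates).

    obs   -- T observed flows (assumed already transformed, e.g. Box-Cox)
    preds -- T rows, each holding the K scheme predictions for that time step
    Returns (weights, variances, log_likelihood of the last E step).
    """
    n_t, n_k = len(obs), len(preds[0])
    weights = [1.0 / n_k] * n_k
    mse = sum((o - f) ** 2 for o, row in zip(obs, preds) for f in row) / (n_t * n_k)
    variances = [max(mse, 1e-12)] * n_k
    loglik_old = -math.inf
    loglik = loglik_old
    for _ in range(max_iter):
        # E step: z[t][k] is the probability that member k explains obs[t].
        z, loglik = [], 0.0
        for o, row in zip(obs, preds):
            dens = [w * normal_pdf(o, f, v) for w, f, v in zip(weights, row, variances)]
            total = max(sum(dens), 1e-300)  # floor guards against underflow
            loglik += math.log(total)
            z.append([d / total for d in dens])
        # M step: update weights and variances from the latent memberships.
        for k in range(n_k):
            z_sum = max(sum(z_row[k] for z_row in z), 1e-300)
            weights[k] = z_sum / n_t
            sq = sum(z_row[k] * (o - row[k]) ** 2 for z_row, o, row in zip(z, obs, preds))
            variances[k] = max(sq / z_sum, 1e-12)
        # Stop when the log-likelihood no longer changes significantly.
        if abs(loglik - loglik_old) < tol:
            break
        loglik_old = loglik
    return weights, variances, loglik


def bma_mean(weights, row):
    """BMA mean prediction for one time step, Equation (5)."""
    return sum(w * f for w, f in zip(weights, row))


if __name__ == "__main__":
    # Synthetic, purely illustrative data: three hypothetical schemes of differing quality.
    random.seed(0)
    truth = [10.0 + 5.0 * math.sin(t / 10.0) for t in range(300)]
    obs = [q + random.gauss(0.0, 0.2) for q in truth]
    preds = [[q + random.gauss(0.0, 0.5),
              q + random.gauss(0.0, 3.0),
              q + 4.0 + random.gauss(0.0, 3.0)] for q in truth]
    w, var, ll = bma_em(obs, preds)
    print("weights:", [round(x, 3) for x in w])
    print("BMA mean at t=0:", round(bma_mean(w, preds[0]), 3))
```

In this sketch the scheme whose predictions track the observations most closely should end up with the largest weight, which mirrors the statement that better performing predictions receive higher weights.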
Estimation of probabilistic prediction

After estimating the BMA weight $\omega_k$ and prediction variance $\sigma_k^2$, we use the Monte Carlo method to generate the BMA probabilistic prediction for any time t (Hammersley & Handscomb). The procedure is as follows.

1. Generate an integer value of k from [1, 2, …, K] with probability [$\omega_1, \omega_2, \ldots, \omega_K$]. Specifically:
1(a). Set the cumulative weight $\omega'_0 = 0$ and compute $\omega'_k = \omega'_{k-1} + \omega_k$ for k = 1, 2, …, K.
1(b). Generate a random number u between 0 and 1.
1(c). If $\omega'_{k-1} \le u \le \omega'_k$, choose the kth member of the ensemble predictions.
2. Generate a value of $Q^t$ from the PDF $g(Q^t \mid f_k^t, \sigma_k^2)$, i.e., the normal distribution with mean $f_k^t$ and variance $\sigma_k^2$.
3. Repeat steps (1) and (2) M times, where M is the probabilistic ensemble size. In this paper, we set M = 100.

After generating the BMA probabilistic ensemble prediction, the results are sorted in ascending order. The 90% uncertainty interval is then bounded by the 5% and 95% quantiles.

For each individual scheme in the BMA model, the prediction uncertainty interval can also be constructed, with the Monte Carlo sampling method still being used to approximate the assumed PDF $g(Q^t \mid f_k^t, \sigma_k^2)$.

Performance evaluation indices

Performance evaluation indices for mean prediction

Three indices are used for evaluating the mean prediction (Dong et al. b).

1. The Nash-Sutcliffe coefficient of efficiency (NS) – NS is not only an objective function but also a widely used performance criterion. It ranges from minus infinity to 1.0 (100%), with higher values indicating better agreement. NS is defined as

$$NS = \left[1.0 - \frac{\sum_{t=1}^{T} \left(Q_{obs}^t - Q_{sim}^t\right)^2}{\sum_{t=1}^{T} \left(Q_{obs}^t - \bar{Q}_{obs}\right)^2}\right] \times 100\% \tag{7}$$

where $Q_{obs}^t$ and $Q_{sim}^t$ are the observed and simulated data at time t, $\bar{Q}_{obs}$ is the average of the observed data, and T is the length of the data series.

2. Daily root mean square error (DRMS):

$$DRMS = \sqrt{\frac{\sum_{t=1}^{T} \left(Q_{obs}^t - Q_{sim}^t\right)^2}{T}} \tag{8}$$

All notations have the same meaning as in Equation (7). Because NS frequently takes negative values, whereas DRMS responds directly to the differences between the observations and simulations, DRMS is also selected as a performance evaluation index. The smaller the DRMS value is, the better the prediction performance.

3. Relative error of total runoff (RE):

$$RE = \left(\frac{\sum_{t=1}^{T} Q_{obs}^t - \sum_{t=1}^{T} Q_{sim}^t}{\sum_{t=1}^{T} Q_{obs}^t}\right) \times 100\% \tag{9}$$

This reflects the relative bias in the simulation of the total runoff amount. A value of RE close to zero indicates better agreement of total surface runoff.

Performance evaluation indices for probabilistic prediction

Xiong et al. and Dong et al. (b) presented a set of indices for assessing the probabilistic prediction generated by uncertainty analysis methods. Three main indices are selected here to assess the probabilistic prediction produced by the BMA model as well as by each individual parameter transfer scheme.

1. Containing ratio (CR) – The containing ratio is used for assessing the goodness of the uncertainty interval. It is defined as the percentage of observed data points that are covered by the prediction bounds:

$$CR = \frac{\mathop{N}\limits_{t=1}^{T}\left(q_l^t \le Q_{obs}^t \le q_u^t\right)}{T} \times 100\% \tag{10}$$

where $q_u^t$ and $q_l^t$ denote the upper and lower prediction bounds at time t, and $\mathop{N}\limits_{t=1}^{T}(\cdot)$ is the number of observed data points that are covered by the prediction bounds.

2. Average band-width (B):

$$B = \frac{1}{T} \sum_{t=1}^{T} \left(q_u^t - q_l^t\right) \tag{11}$$

where the average band-width B is also an index for measuring the performance of the estimated uncertainty interval.

3. Average deviation amplitude (D) – The average deviation amplitude is an index to quantify the average deflection of the curve of the middle points of the prediction bounds from the observed streamflow hydrograph. It is defined as

$$D = \frac{1}{T} \sum_{t=1}^{T} \left|\frac{1}{2}\left(q_u^t + q_l^t\right) - Q_{obs}^t\right| \tag{12}$$

with notations as defined previously.

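To make the sampling procedure and the six indices concrete, the following Python sketch implements steps 1 to 3 of the Monte Carlo procedure together with Equations (7) to (12). It is an illustration under stated assumptions rather than the code used in the study: the quantile rule (linear interpolation between order statistics) is one common convention that the text does not specify, the cumulative-weight draw is rescaled by the total weight in case the weights do not sum exactly to 1, and all function names are hypothetical. Only the standard library is used.

```python
import math
import random


def sample_bma(weights, preds_t, variances, m=100, rng=random):
    """Steps 1-3: draw m values of Q_t from the BMA mixture, sorted ascending."""
    cum, running = [], 0.0
    for w in weights:  # step 1(a): cumulative weights
        running += w
        cum.append(running)
    draws = []
    for _ in range(m):
        u = rng.random() * cum[-1]  # step 1(b), rescaled by the total weight
        # step 1(c): first member whose cumulative weight exceeds u
        k = next((i for i, c in enumerate(cum) if u < c), len(cum) - 1)
        draws.append(rng.gauss(preds_t[k], math.sqrt(variances[k])))  # step 2
    return sorted(draws)


def quantile(sorted_vals, p):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    pos = p * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def interval_90(sorted_draws):
    """Lower and upper bounds of the 90% uncertainty interval."""
    return quantile(sorted_draws, 0.05), quantile(sorted_draws, 0.95)


def ns(obs, sim):
    """Nash-Sutcliffe efficiency in percent, Equation (7)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return (1.0 - num / den) * 100.0


def drms(obs, sim):
    """Daily root mean square error, Equation (8)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def re(obs, sim):
    """Relative error of total runoff in percent, Equation (9)."""
    return (sum(obs) - sum(sim)) / sum(obs) * 100.0


def containing_ratio(obs, lower, upper):
    """Share of observations inside the bounds in percent, Equation (10)."""
    inside = sum(1 for o, lo, up in zip(obs, lower, upper) if lo <= o <= up)
    return inside / len(obs) * 100.0


def band_width(lower, upper):
    """Average band-width, Equation (11)."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


def deviation_amplitude(obs, lower, upper):
    """Average deviation amplitude, Equation (12)."""
    return sum(abs(0.5 * (up + lo) - o) for o, lo, up in zip(obs, lower, upper)) / len(obs)
```

For a series of T time steps, one would call `sample_bma` once per time step with that step's scheme predictions, pass the sorted draws to `interval_90`, and collect the resulting bounds into the `lower` and `upper` lists expected by the three interval indices.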
RESULTS AND DISCUSSION

Calibration results

Daily streamflow and weather data from 1980 to 1986 are used for calibration. The gauged sub-basins are selected as the primary basins for VIC-3L model calibration, which is achieved by matching the total annual streamflow volume and the shape of the mean daily hydrograph to the corresponding observations in the Ankang basin. Two criteria, NS and RE, are used for model calibration. In the calibration study, the parameters of individual sub-basins with similar climate characteristics and underlying surface are assumed to have the same values.

Table 3 | Calibrated parameters for the 10 primary sub-basins

Gauge station  b      Ds     Dm (mm/day)  Ws     Dep1 (m)  Dep2 (m)  Dep3 (m)  x      k      ckg
Baohe          0.291  0.706  5.761        0.246  0.005     0.400     0.224     0.235  0.694  0.903
Hanzhong       0.250  0.590  0.518        0.420  0.074     0.204     0.018     0.258  1.878  0.542
Xushuihe       0.257  0.228  0.580        0.859  0.073     0.100     0.010     0.150  0.595  0.695
Youshuihe      0.257  0.428  2.580        0.559  0.053     0.250     0.010     0.050  0.555  0.695
Ziwuhe         0.257  0.428  3.540        0.559  0.053     0.270     0.008     0.258  0.750  0.750
Shiquan        0.257  0.828  5.540        0.359  0.083     0.150     0.010     0.258  0.700  0.950
Chihe          0.258  0.421  2.569        0.701  0.026     0.300     0.010     0.247  0.721  0.588
Zhehe          0.157  0.428  1.040        0.559  0.073     0.110     0.010     0.208  0.995  0.550
Lanhe          0.450  0.631  4.680        0.489  0.085     0.005     0.010     0.250  0.905  0.697
Ankang         0.257  0.828  4.040        0.159  0.140     0.240     0.100     0.100  0.660  0.950

Table 4 | Calibration statistics for the 10 primary sub-basins

Gauge station  NS (%)  DRMS (m3/s)  RE (%)
Baohe          90.79   30.67        14.00
Hanzhong       91.81   23.31        3.90
Xushuihe       85.11   29.70        5.08
Youshuihe      88.29   32.14        7.00
Ziwuhe         86.87   29.82        6.36
Shiquan        89.27   31.51        11.00
Chihe          84.38   40.17        2.71
Zhehe          86.02   31.90        13.00
Lanhe          85.65   32.89        4.83
Ankang         87.94   29.32        12.00

Table 5 | The transferred parameter values of three regionalization approaches

Sub-basin  Scheme (donor)  b      Ds     Dm (mm/day)  Ws     Dep1 (m)  Dep2 (m)  Dep3 (m)  x      k      ckg
Mumahe     A (Youshuihe)   0.257  0.428  2.580        0.559  0.053     0.250     0.010     0.050  0.555  0.695
Mumahe     B (Chihe)       0.258  0.421  2.569        0.701  0.026     0.300     0.010     0.247  0.721  0.588
Mumahe     C               0.258  0.121  1.569        0.801  0.056     0.150     0.010     0.247  0.921  0.588
Renhe      A (Zhehe)       0.157  0.428  1.040        0.559  0.073     0.110     0.010     0.208  0.995  0.550
Renhe      B (Lanhe)       0.450  0.631  4.680        0.489  0.085     0.005     0.010     0.250  0.905  0.697
Renhe      C               0.250  0.491  0.580        0.589  0.085     0.005     0.010     0.250  0.955  0.697

Note: Scheme A denotes spatial proximity approach, scheme B denotes physical similarity approach, and scheme C denotes multiple regression analysis.

Ten hydrological parameters in the VIC-3L model have been calibrated for the 10 primary sub-basins. Table 3 shows the calibrated parameter values in the 10 primary sub-basins in the Ankang basin. Their typical ranges and the effect of each parameter on the simulated streamflow are described below: (1) b typically ranges from 0 to 0.50. It describes the total of available infiltration capacity as a function of the relative saturated grid cell area and directly controls the quantity of runoff generation and the water balance. A lower value of b gives lower infiltration and yields higher surface runoff (the value of b in this paper is the inverse of the value of b in Xie et al.). The highest value of b among the sub-basins is only 0.450, for the Lanhe sub-basin, and the lowest value is 0.157, for the Zhehe sub-basin. The remaining values of b are very close to 0.257, because the Ankang basin is in a humid region; (2) Dm typically ranges from 0 to 6 mm/day; (3) Ds typically ranges from 0 to 1. With a higher value of Ds, the base flow will be higher at lower water content in the lowest soil layer; (4) Ws typically ranges from 0 to 1; (5) Dep1, Dep2, and Dep3 range from 0 to 0.40 m. In general, thicker soil depths slow down seasonal peak flows and increase the loss due to evapotranspiration; (6) x ranges from 0.05 to 0.30; (7) k typically ranges from 0.50 to 2.0; (8) ckg typically ranges from 0.5 to 1.0.

Table 4 lists the statistical results (NS, DRMS, and RE) for the 10 primary sub-basins using the calibrated hydrological parameters. In terms of NS, DRMS and RE, the calibrated model provides good simulation results for all sub-basins. In the next section, we will demonstrate that these calibrated hydrological parameters can be transferred to the ungauged sub-basins with reasonably good results.

Testing of parameter transfer

Parameter transfer schemes

Three regionalization approaches, i.e., the spatial proximity approach, the physical similarity approach, and multiple regression analysis, are used to select donors among the 10 gauged sub-basins, whose optimized parameter values are used to model daily runoff for the two ungauged sub-basins (Mumahe and Renhe). Table 5 lists the transferred parameter values of the three regionalization approaches.

The spatial proximity sub-basins for Mumahe and Renhe are Youshuihe and Zhehe, respectively, as shown in Figure 1. The physical similarity sub-basins for Mumahe and Renhe are Chihe and Lanhe, respectively. Figure 2 shows the distribution map of vegetation characteristics for the VIC-3L model with a 5 × 5 km grid in the Ankang basin. Figure 3 shows the distribution map of climatic factors with a 5 × 5 km grid in the Ankang basin. The vegetation similarity sub-basins for Mumahe and Renhe are Chihe and Lanhe, respectively (Figure 2). The annual mean temperatures (T) for Mumahe and Chihe are 14.6 °C and 14.8 °C, respectively, and those for Renhe and Lanhe are 15.2 °C and 15.5 °C, respectively. The annual mean precipitation (P) for Mumahe and Chihe is 1,070 mm and 997 mm, respectively, and that for Renhe and Lanhe is 1,021 mm and 1,068 mm, respectively. The annual mean evaporation from water surface (E) for Mumahe and Chihe is 564 mm and 514 mm, respectively, and that for Renhe and Lanhe is 246 mm and 268 mm, respectively. The climatic similarity sub-basins for Mumahe and Renhe are therefore Chihe and Lanhe, respectively (Figure 3).

Figure 2 | The distribution map of vegetation characteristics for the VIC-3L model with a 5 × 5 km grid in the Ankang basin. (The legend numbers denote different vegetation types).

Figure 3 | The distribution map of climatic factors with a 5 × 5 km grid in the Ankang basin.

Table 6 | Multiple regression analysis results for 10 donor gauged sub-basins

                                b            Ds           Dm           Ws      Dep1    Dep2
Number of regression variables  5            4            6            5       6       5
F statistics                    5.77         18.25        6.62         10.57   3.81    7.21
Regression equation             Square root  Square root  Square root  Linear  Linear  Square root
R2 (%)                          66.73        84.91        75.80        69.74   76.63   62.84

[Table 6 also lists the 15 variables of Table 2 by row, with a check mark (√) under each parameter whose regression uses that variable; the positions of the individual check marks are not recoverable here.]

Table 6 shows the multiple regression analysis results for the 10 donor gauged sub-basins. The regression analysis equation is tested by the $F_\alpha(m, n - m - 1)$ statistic (m is the number of regression variables, n is the number of sub-basins; Pope & Webster). The structures of the regression analysis equations for the six hydrological parameters b, Dm, Ds, Ws, Dep1, and Dep2 are square root, square root, square root, linear, linear and square root, respectively. However, the remaining four hydrological parameters, Dep3, x, k, and ckg, have no significant regression analysis equations because these hydrological parameters are affected by many model variables and catchment descriptors or attributes. In terms of goodness of fit (R2 in Table 6), the regression analysis equations provide reasonably good fitting results between the parameter values calibrated on gauged catchments and the climatic and soil characteristic variables of Table 2. This is illustrated by the fitting curves between the calibrated results and the regression analysis results of the hydrological parameters in the VIC-3L model, as shown in Figure 4.

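As a rough illustration of the regression step summarized in Table 6, the sketch below fits the three forms of Equations (1) to (3) by ordinary least squares and reports R2 together with the overall F statistic on (m, n − m − 1) degrees of freedom. Several points are assumptions and not taken from the study: the fit is done in the transformed space (so R2 refers to the transformed variable), the natural logarithm is used for the logarithmic form, the normal equations are solved with a small Gaussian elimination to keep the block free of external libraries, and the names `TRANSFORMS`, `solve`, `fit` and `predict` as well as the sample numbers are hypothetical. The sketch is not intended to reproduce the values in Table 6.

```python
import math

# Forward transform applied to Y and every X, and its inverse for back-transforming Y.
TRANSFORMS = {
    "linear": (lambda v: v, lambda v: v),
    "square_root": (math.sqrt, lambda v: v ** 2),
    "logarithmic": (math.log, math.exp),  # natural log is an assumption
}


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(y, xs, form):
    """Least-squares fit of one regression form.

    y    -- calibrated parameter values, one per gauged sub-basin
    xs   -- one row of m catchment descriptors per gauged sub-basin
    form -- key of TRANSFORMS
    Returns (beta, r2, f_stat), all computed in the transformed space.
    """
    fwd, _ = TRANSFORMS[form]
    ty = [fwd(v) for v in y]
    design = [[1.0] + [fwd(v) for v in row] for row in xs]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(design, ty)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_ty = sum(ty) / len(ty)
    ss_res = sum((t - f) ** 2 for t, f in zip(ty, fitted))
    ss_tot = sum((t - mean_ty) ** 2 for t in ty)
    r2 = 1.0 - ss_res / ss_tot
    n, m = len(y), p - 1
    # Overall F statistic with (m, n - m - 1) degrees of freedom.
    f_stat = math.inf if ss_res == 0 else ((ss_tot - ss_res) / m) / (ss_res / (n - m - 1))
    return beta, r2, f_stat


def predict(beta, x_row, form):
    """Estimate the parameter for an ungauged sub-basin from its descriptors."""
    fwd, inv = TRANSFORMS[form]
    return inv(beta[0] + sum(b * fwd(v) for b, v in zip(beta[1:], x_row)))


if __name__ == "__main__":
    # Hypothetical descriptors and parameter values, for illustration only.
    xs = [[1.0, 4.0], [4.0, 1.0], [9.0, 16.0], [16.0, 9.0], [25.0, 36.0], [36.0, 4.0]]
    y = [16.2, 30.0, 80.5, 111.0, 195.0, 197.5]
    for form in TRANSFORMS:
        beta, r2, f_stat = fit(y, xs, form)
        print(form, "R2 = %.3f" % r2, "F = %.2f" % f_stat)
```

Because R2 is computed on differently transformed variables, the values printed for the three forms are not strictly comparable; any rule for choosing among the forms would be a further assumption.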
| PUB based on RPE and BMA Hydrology Research | in press | 2016 sub-basins can achieve 91.96 and 88.06% in the calibration period as well as 81.72 and 78.31% in the validation period, which is better than the best associated individual parameter transfer scheme prediction (Scheme B, physical similarity approach). However, in terms of RE, the mean prediction of BMA (3) performs worse than its best individual parameter transfer scheme prediction. Three indices illustrated in the section named ‘Performance evaluation indices for probabilistic prediction’ are used for assessing the probabilistic prediction of both BMA (3) and individual parameter transfer schemes. The results of two ungauged sub-basins for the whole flow series are also presented in Figure 7. It is clear that probabilistic prediction of BMA (3) has the largest values of CR and B, and almost the smallest D, in both calibration and validation periods. In other words, probabilistic prediction of BMA (3) has better properties than probabilistic prediction of any individual parameter transfer schemes in terms of CR and D, but worse in terms of B. We then compared the differences between BMA (3) and its individual parameter transfer schemes Figure 4 | The fitting curves between calibrated results and regression analysis results of hydrological parameters in the VIC-3L model. in probabilistic prediction by graphs. For illustrative purposes, Figure 8 shows the mean prediction and 90% confidence interval of both BMA (3) and three individual VIC-3L model with a 5 × 5 km grid in the Ankang basin is parameter transfer schemes of maximum one month shown in Figure 5 (calibrated hydrological parameters for hydrograph for Baohe sub-basin in 1983 during the cali- 10 primary sub-basins, as well as regression analysis results bration period, respectively. The observations of 1983 of hydrological parameters for Mumahe and Renhe are presented by dots, and the mean predictions of ungauged sub-basins). 
BMA (3) and its individual parameter transfer schemes are shown as solid curve. It is shown that the probabilistic prediction of BMA (3) is much broader than that of Evaluation for mean prediction and probabilistic any of its individuals. It can be found from Figure 9 prediction that the results of validation are similar to that of the calibration period. In a word, the probabilistic Figure 6 displays the weight estimates of individual par- prediction of BMA (3) has better performance than ameter transfer schemes in BMA (3). We check the its individual parameter transfer schemes for the flow mean prediction of BMA (3) using three criteria illus- series. trated in the section named ‘Performance evaluation indices for mean prediction’. Results of BMA (3) and its three individual parameter transfer schemes in the mean CONCLUSIONS prediction of two ungauged sub-basins for the whole flow series are shown in Figure 7. In terms of NS, the In this paper, the BMA method is used to predict a new mean prediction of BMA (3) for Mumahe and Renhe measurement value associated with a combination of Uncorrected Proof 12 Figure 5 Y. Zhou et al. | | PUB based on RPE and BMA Hydrology Research | in press | 2016 The distribution map of six hydrological parameters in the VIC-3L model with a 5 × 5 km grid in the Ankang basin (calibrated hydrological parameters for 10 primary sub-basins, as well as regression analysis results of hydrological parameters for Mumahe and Renhe ungauged sub-basins). probabilistic prediction in ungauged basins based on three between BMA (3) and its three individual parameter individual parameter transfer schemes. The comparison transfer schemes is made in terms of both mean Uncorrected Proof 13 Y. Zhou et al. | PUB based on RPE and BMA Hydrology Research | in press | 2016 prediction and probabilistic prediction in this study. 
The main conclusions are summarized as follows:

(1) The mean prediction of BMA (3) is much closer to the observed data than that of its best individual parameter transfer scheme (physical similarity approach) for the two ungauged sub-basins.
(2) The probabilistic predictions of BMA (3) have a larger containing ratio, a larger average band-width, and a smaller average deviation amplitude than any of its individual parameter transfer schemes for the two ungauged sub-basins.

Further work will focus on parameter transfer approaches based on data mining and machine learning techniques, such as artificial neural networks and support vector machines, as well as on other hydrological models; it is anticipated that the advantages of BMA can then be generalized.

Figure 6 | Histogram of the weights of the individual schemes in BMA (3). Scheme A denotes the spatial proximity approach, Scheme B denotes the physical similarity approach, and Scheme C denotes multiple regression analysis.

Figure 7 | The simulation results with three parameter transfer schemes in Mumahe and Renhe ungauged sub-basins. Scheme A denotes the spatial proximity approach, Scheme B denotes the physical similarity approach, and Scheme C denotes multiple regression analysis.

Figure 8 | The mean prediction and 90% confidence interval of both BMA (3) and three individual parameter transfer schemes for Mumahe sub-basin in 1983 during the calibration period. Scheme A denotes the spatial proximity approach, Scheme B denotes the physical similarity approach, and Scheme C denotes multiple regression analysis.

Figure 9 | The mean prediction and 90% confidence interval of both BMA (3) and three individual parameter transfer schemes for Mumahe sub-basin in 1987 during the validation period.
Scheme A denotes the spatial proximity approach, Scheme B denotes the physical similarity approach, and Scheme C denotes multiple regression analysis.

ACKNOWLEDGEMENTS

This study is financially supported by the International Cooperation in Science and Technology Special Project of China (2014DFA71910), the National Natural Science Foundation of China (51509008, 51509141 and 51379223), the Natural Science Foundation of Hubei Province (2015CFB217) and the Open Foundation of State Key Laboratory of Water Resources and Hydropower Engineering Science in Wuhan University (2014SWG02).

First received 28 March 2015; accepted in revised form 20 November 2015. Available online 5 January 2016
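
As a minimal sketch of the evaluation discussed in the section ‘Evaluation for mean prediction and probabilistic prediction’, the Python snippet below computes NS and RE for a mean prediction, CR, B and D for a prediction interval, and a BMA mean prediction as a weighted average of scheme predictions. The formulas are assumptions: they follow definitions commonly used for these indices (Nash-Sutcliffe efficiency, relative error of total runoff volume, containing ratio, average band-width, average deviation amplitude) and may differ in detail from the equations of this paper. All function names, weights and numbers are hypothetical and are not results of the study.

```python
import numpy as np


def nash_sutcliffe(obs, sim):
    """NS efficiency as a fraction (multiply by 100 for per cent)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def relative_error(obs, sim):
    """RE of total runoff volume (assumed definition)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return (sim.sum() - obs.sum()) / obs.sum()


def containing_ratio(obs, lower, upper):
    """CR: fraction of observations lying inside the prediction bounds."""
    obs, lower, upper = (np.asarray(a, float) for a in (obs, lower, upper))
    return np.mean((obs >= lower) & (obs <= upper))


def average_band_width(lower, upper):
    """B: mean width of the prediction interval."""
    return np.mean(np.asarray(upper, float) - np.asarray(lower, float))


def average_deviation_amplitude(obs, lower, upper):
    """D: mean distance between the interval midpoint and the observation."""
    obs, lower, upper = (np.asarray(a, float) for a in (obs, lower, upper))
    return np.mean(np.abs(0.5 * (upper + lower) - obs))


def bma_mean(weights, scheme_predictions):
    """BMA mean: weighted average over schemes; rows are schemes, columns are time steps."""
    w = np.asarray(weights, float)
    w = w / w.sum()  # normalise so the weights sum to one
    return w @ np.asarray(scheme_predictions, float)


# Toy data (hypothetical, for illustration only)
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 37.0]
lower = [8.0, 15.0, 31.0, 30.0]
upper = [14.0, 25.0, 36.0, 45.0]

print("NS =", nash_sutcliffe(obs, sim))
print("RE =", relative_error(obs, sim))
print("CR =", containing_ratio(obs, lower, upper))
print("B  =", average_band_width(lower, upper))
print("D  =", average_deviation_amplitude(obs, lower, upper))
print("BMA mean =", bma_mean([0.2, 0.5, 0.3], [[10, 20], [20, 30], [30, 40]]))
```

Under these assumed definitions, a wider interval raises both CR and B, which is consistent with the trade-off reported above: BMA (3) has the largest CR and the largest B at the same time, so the three interval indices are best read together rather than one at a time.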