Estimation of Regional Parameters
03/09/01
Estimation of Regional Parameters in a Macro Scale
Hydrological Model
Kolbjørn Engeland, Lars Gottschalk and Lena Tallaksen
Department of Geophysics, University of Oslo, Norway
Abstract
Macro-scale hydrological modelling implies a repeated application of a model within an area using
regional parameters. These parameters are based on climate and landscape characteristics, and they
are used to calculate the water balance in ungauged areas. The regional parameters ought to be
robust and not too dependent on the catchment and time period used for calibration. The ECOMAG
model is applied to the NOPEX region as a macro-scale hydrological model distributed on a 2 x 2
km grid. Each model element is assigned parameters according to soil and vegetation classes. A
Bayesian methodology is followed. An objective function describing the fit between observed and
simulated values is used to describe the likelihood of the parameters. Using Bayes' theorem, these
likelihoods are used to update the probability distributions of the parameters using additional data,
be it either an additional year of streamflow or an additional streamflow station. Two sampling
methods are used, regular sampling and Metropolis-Hastings sampling. The results show that
regional parameters exist according to some predefined criteria. The probability distribution of the
parameters shows a decreasing variance as data from new catchments are used for updating. A few
parameters do, however, not exhibit this property, and they are therefore not suitable in a regional
context.
Key words: hydrological macro modelling, distributed models, regional parameters, GLUE,
NOPEX
Introduction
Regional hydrological modelling or hydrological macro modelling implies a repeated use of a
model everywhere within an area using regional parameters. Observations for calibration and
validation of the model are only available at a subset of sites where the model is applied. For all
sites without observations the model application needs to be based on the regional parameters. The
problem as such is a classical one in hydrology: to be able to calculate streamflow, or possibly
other hydrological variables like soil moisture or groundwater level, at ungauged sites. It has,
however, received renewed interest in climate impact studies where water balance elements are
estimated over large territories by linking the hydrological models more or less directly to General
Circulation Models (GCM).
Regionalisation methods aim to find a relationship between the parameters of the modelling units
and the physical characteristics of the corresponding landscape unit. Parameters of lumped
conceptual models operating at the catchment scale can be regionalised by relating them to
catchment characteristics using multiple regression (e.g. Abdulla and Lettenmaier, 1997). For a
distributed hydrological model, the approach is different. First, as the catchment unit used for calibration
can be composed of several modelling units, a regression analysis is difficult to perform. Secondly,
the modelling strategy adopted in this paper is to include physical characteristics, e.g. soil and
land-use classes, in the parameterisation. The regression method would disturb this strategy.
Klemes (1986) suggested the split sample and proxy basin tests for regionalisation of parameters.
The split sample test considers whether the model is transposable in time, and the proxy basin test
whether the model is geographically transposable within a region. For both tests the model
parameters are calibrated using a subset of data and then validated on independent data (data from
years or catchments not used in the calibration).
Neither the regression method nor the split sample and proxy basin tests consider the non-uniqueness of parameter sets giving good model results. This is especially important when model
parameters are correlated. In such cases two catchments giving approximately the same parameter
sets may not be hydrologically similar and vice versa. The split sample and proxy basin tests were
used when applying the ECOMAG model to the NOPEX region (Motovilov et al. 1999). A striking
result was the variation in performance criteria between different years and different catchments.
The final calibrated parameter set was therefore dependent on the data used for calibration.
Regional parameters ought to be robust and not too dependent on the catchment and time period
used for calibration.
A conclusion that can be drawn from the earlier quoted studies is that more formal procedures are
needed to be able to accept a model for regional application and in the search for regional
parameters. In the hydrological literature there are at least two approaches that can serve as
appropriate tools - the multi-objective method (Gupta et al., 1998) and the Bayesian method, in
hydrology referred to as Generalised Likelihood Uncertainty Estimation (GLUE) (Beven and
Binley, 1992). Both the multi-objective method and the Bayesian method consider the uncertainty
in the choice of parameter values instead of finding the one and only optimal parameter set. In a
multi-objective context, the parameter variability is due to the trade-off between one or more
objective functions for the different catchments, resulting in a set of Pareto optimal parameter sets.
The Bayesian method, on the other hand, gives the statistical uncertainty around the optimum of
one objective function for all the catchments.
In this study the Bayesian method is selected for analysing the performance of the ECOMAG
model in the NOPEX area. Streamflow data from several catchments are used to update the
probability distributions of the parameters. For a model to "perform satisfactorily" in a regional
context it might be expected that:
• The shape and the optima of the parameter distribution do not depend too much on the catchments
or years used to estimate it.
• The model result (objective criterion) is sensitive to the parameters.
• The variance of the parameter distribution decreases as new streamflow data (from a new year
or catchment) are used for updating. (Follows from the two points above.)
• A performance criterion (here the Nash-Sutcliffe coefficient Reff (Nash and Sutcliffe, 1970))
calculated for the optimal regional parameter set for each catchment should be higher than some
lowest acceptable value (here values of at least 0.75 are classified as good results and values between
0.36 and 0.75 as satisfactory results).
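The Reff criterion in the last point can be computed as in the following minimal Python sketch of the standard Nash-Sutcliffe efficiency. The function names `nash_sutcliffe` and `classify_reff` are illustrative and not part of ECOMAG; the thresholds 0.75 and 0.36 are those stated above, and the label "unsatisfactory" for values below 0.36 is an assumption.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations
    of the observations from their mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / ss_obs


def classify_reff(reff):
    """Classes used in the text; the lowest label is an assumption."""
    if reff >= 0.75:
        return "good"
    if reff >= 0.36:
        return "satisfactory"
    return "unsatisfactory"
```

A perfect simulation gives Reff = 1, and a simulation that is no better than the observed mean gives Reff = 0.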
An introductory part of the paper presents in brief the main features of the ECOMAG model and the
basic data sets used from the NOPEX region. It is followed by a description of the Bayesian method and
the applied sampling methods: regular sampling and Metropolis-Hastings (MH) sampling. The results of
applying these methods for construction of two dimensional parameter probability distributions for
regular sampling and a nine dimensional distribution for MH sampling are presented. The distributions
reveal how model structure and parameters behave in a regional context. Finally, conclusions are drawn
concerning both the applicability of the ECOMAG model as a macro-scale hydrological model and
the quality and quantity of the NOPEX data for use in regional hydrological modelling.
Catchment and Model Descriptions
The NOPEX Area
The NOPEX area (Halldin et al. 1995; 1999) is situated in southern Sweden northwest of Uppsala.
It is an area of low relief with altitude ranging from 5 to 145 m a.s.l. The area is crossed by some
north-south oriented eskers reaching a height of 20-50 m over the surrounding terrain. Also
outcrops of bedrock rise over the plain. Till is the most common soil type, particularly in the north.
The fine grained clay soils, together with sandy and silty materials, dominate in the south. Part of
the area is covered by peat land having the largest extent in the northern part (Seibert, 1994).
The NOPEX area has a heterogeneous surface cover, represented by coniferous and mixed forest
(57%), mires (2.6%), lakes (2.6%) and urbanised areas (2.0%). The remaining 35.8% is mainly
agricultural land (evaluated from digital maps of the National Land Survey of Sweden). The
portion of forest increases from south towards north. The forest is predominantly coniferous.
Annual precipitation in the NOPEX area fluctuates between 600 and 800 mm, with a minimum in
August and a maximum in February. 20 to 30 per cent of the total annual precipitation falls as
snow. A snow cover lasts from the middle of November for 100 to 110 days on average, but is
normally not continuous throughout the winter. The mean annual temperature for the period 1961-1990 at the station Uppsala is +6 °C, with a maximum in July (+17 °C) and minimum in February (-5 °C). The vegetation period lasts about 180 days (Seibert, 1994).
The Swedish Meteorological and Hydrological Institute (SMHI) has 25 precipitation stations, 7
temperature stations, 5 air humidity stations and 10 streamflow gauging stations in the NOPEX
area. The gauged catchments cover a large part of the area as illustrated in Fig. 1. Short catchment
descriptions are given in Table 1. All the data are available as daily values for the period 1981-1995
in the SINOP database (Halldin and Lundin, 1994) developed for the NOPEX project. Temperature
and vapour pressure deficit are interpolated to a regular 2 km grid by inverse distance weighting
whereas the precipitation is interpolated by kriging (Motovilov et al. 1999).
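The inverse distance weighting used for temperature and vapour pressure deficit can be sketched in Python as follows. The text does not state the distance exponent or any search radius, so the exponent `power=2.0`, the function name `idw` and the station tuple layout are assumptions made only for illustration.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse distance weighted estimate at point (x, y).

    stations: list of (x_station, y_station, value) tuples.
    power: distance exponent (assumed value, not given in the text).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xs, ys, value in stations:
        distance = math.hypot(x - xs, y - ys)
        if distance == 0.0:
            # The point coincides with a station: use its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Applied to every grid-cell centre of the 2 km grid, such a function would give one interpolated field per day.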
Fig. 1. NOPEX area and the ten gauged catchments
Table 1 - Gauged catchments in the NOPEX area.

Station          Catchment      Area (km2)  Lake (%)  Forest (%)  Open land (%)
Vattholma        Vattholmån         284.0       4.8       71.0        24.2
Ulva Kvarndamn   Fyrisån            950.0       3.0       61.0        36.0
Sörsätra         Sagån              612.0       1.1       61.0        37.9
Gränvad          Lillån             168.0       0.0       41.0        59.0
Härnevi          Örsundaån          305.0       1.0       55.0        44.0
Lurbo            Hågaån             124.0       0.3       77.7        27.0
Ransta           Sävaån             198.0       0.9       66.1        33.0
Sävja            Sävjaån            727.0       2.0       64.0        34.0
Stabby           Stabbybäcken         6.6       0.0       87.0        13.0
Tärnsjö          Stalbobäcken        14.0       1.5       84.5        14.0
The ECOMAG Model
The ECOMAG model (Motovilov et al. 1999) describes the main processes of the land surface
hydrological cycle: infiltration, evapotranspiration, heat and water regime of the soil, snowmelt and
formation of surface, subsurface, groundwater and river runoff on a daily time resolution.
The catchment is divided into grid cells (here 2x2 km), and the same model algorithms are applied
on each cell. The vertical structure of each grid cell is shown in Fig. 2. A threshold temperature
decides the phase of precipitation: snow or rain. Snowmelt is calculated using a degree-day factor.
The water reaching the ground, rain in summer or melt water in winter, infiltrates into horizon A,
partitioned between the capillary and the non-capillary zone. If horizon A is saturated or the infiltration
capacity is exceeded, surface runoff is formed, described by a kinematic wave. In the non-capillary
zone of horizon A, the water can flow horizontally to the river network following Darcy's law, or
infiltrate vertically into horizon B. From the capillary zone of horizon A water can only be removed
by evapotranspiration. In horizon B the water can penetrate to the groundwater zone. In the
groundwater zone the water flows horizontally to the river network following Darcy's law.
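The snow part of this description (threshold temperature for the precipitation phase and degree-day snowmelt) can be illustrated with a minimal Python sketch. It is a simplified assumption of how such a routine might look, not the ECOMAG code: snow compaction, water holding capacity and snow evaporation are ignored, and the function name `snow_step` is hypothetical. The default values are the CTP, CTM and open-land DDF values listed in Table 2.

```python
def snow_step(precip, temp, snow_store, ctp=0.69, ctm=0.0, ddf=0.42):
    """One daily step of a simplified degree-day snow routine.

    precip: daily precipitation (cm), temp: daily air temperature (deg C),
    snow_store: snow water equivalent before the step (cm).
    ctp: critical temperature snow/rain, ctm: critical temperature for
    start of snowmelt, ddf: degree-day factor (cm per day per deg C).
    Returns (water reaching the ground, updated snow store), both in cm.
    """
    # Phase of precipitation is decided by the threshold temperature
    # (treating temp == ctp as rain is an assumption).
    if temp < ctp:
        snowfall, rain = precip, 0.0
    else:
        snowfall, rain = 0.0, precip
    snow_store += snowfall
    # Degree-day melt, limited by the snow that is available.
    potential_melt = ddf * max(temp - ctm, 0.0)
    melt = min(potential_melt, snow_store)
    snow_store -= melt
    return rain + melt, snow_store
```

The returned water (rain plus melt) is what would infiltrate into horizon A in the description above.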
Each grid cell is assigned a soil class and a vegetation class. Some of the parameters are determined
from the soil or the vegetation class of the grid cell. The rest of the parameters are common for the
whole region. Three parameters are described by a distribution function to account for the
variability within a grid-cell: the field capacity of horizon A, the surface retention storage and the
vertical conductivity of horizon A. Table 2 lists the optimal parameter values found in Motovilov et
al. (1999). To reduce the number of parameters that need calibration, the parameters for soil and
vegetation classes are not calibrated for each individual class. Instead the standard parameter values
are multiplied by a common factor. This means that the relative differences between the parameter
values of each soil or vegetation class are determined prior to the calibration.
[Figure 2: schematic of the vertical structure of a grid cell. Storages shown: snow cover, surface water storage, horizon A (capillary and non-capillary zones, field capacity), horizon B (porosity, WP) and the groundwater zone. Fluxes shown: precipitation, evapotranspiration (E1-E5), melt water, infiltration, penetration, return flow, and surface water, subsurface and groundwater inflow and outflow to river flow. Layer labels: Z2-Z4, h1-h5.]
Fig. 2. Vertical structure of the ECOMAG model.
Application of ECOMAG to the NOPEX Region
In this application the thickness of horizon B is set to zero because, in a typical Nordic
catchment having mainly till deposits, the groundwater table is close to the surface.
The present study is based on the work by Motovilov et al. (1999), where a regional calibration and
validation of ECOMAG was done following the proxy basin scheme (Klemes, 1986) in addition to
internal validation. As a first step the model was calibrated on streamflow data for seven years for
three catchments. An additional adjustment of the soil parameters was performed using soil
moisture and groundwater data from five small experimental catchments. This was followed by
validation of the model against streamflow for 14 years from six other catchments and synoptic
streamflow and evapotranspiration measurements performed during two concentrated field efforts
in 1994 and 1995.
Table 2 - Parameters needed to be specified in the ECOMAG model and the parameter values
found by Motovilov et al. (1999a; 1999b). d.l. = dimensionless

Parameters for soil classes

Parameter                                             Peat    Clay    Sand    Till    Shallow bedrocks  Lakes
Volume density (g cm^-3)                              0.2     1.0     1.2     1.1     1.2               0.2
Porosity of horizon A (d.l.)                          0.90    0.65    0.45    0.60    0.45              0.90
Porosity of groundwater zone (d.l.)                   0.80    0.45    0.45    0.45    0.25              0.80
Field capacity of horizon A (d.l.)                    0.60    0.45    0.20    0.40    0.10              0.60
Field capacity of groundwater zone (d.l.)             0.60    0.43    0.40    0.43    0.20              0.60
Wilting point of horizon A (d.l.)                     0.30    0.27    0.10    0.16    0.02              0.10
Vertical conductivity of horizon A (cm day^-1), VCA   464.7   139.4   464.7   232.3   464.7             464.7
**Thickness of horizon A (cm), THA                    95.44   47.72   47.72   47.72   47.72             95.44
Thickness of horizon B (cm)                           0.0     0.0     0.0     0.0     0.0               0.0
Maximal field capacity (d.l.)                         0.70    0.57    0.28    0.55    0.20              0.70

*Horizontal conductivity of horizon A (cm day^-1), HCA, and *horizontal conductivity of groundwater zone (cm day^-1), HCG: the assignment of values to soil classes is not recoverable from the source layout. Values in source order: 1140, 81202, 11400, 1140, 1540, 1540, 114000, 1710, 0, 4620, 1540, 1540, 1540.

Parameters for vegetation classes

Parameter                                                         Open land  Forest  Lake    Swamp   Urban
Evaporation parameter (cm d^-1 mb^-1), EVAP                       0.072      0.072   0.080   0.080   0.072
Degree-day-factor (cm d^-1 °C^-1), DDF                            0.42       0.294   0.42    0.42    0.42
Density of new snow (g cm^-3)                                     0.15       0.12    0.15    0.15    0.15
Heat conductivities for thawed soil (cal cm^-1 °C^-1 d^-1)        120        96      96      96      120
Heat conductivities for frozen soil (cal cm^-1 °C^-1 d^-1)        240        192     192     192     240
**Maximal surface depression storage (cm), SDS                    4.0        4.0     7.0     7.0     4.0
Manning's roughness coefficient for slope                         11.5       11.5    11.5    11.5    11.5
Factor for thickness of horizon A (d.l.)                          1.0        1.3     1.0     1.0     1.0
Factor for horizontal conductivity of horizon A (d.l.)            1.261      0.247   1.0     1.0     1.0

Parameters for whole catchment

Parameter                                                         Value
Critical temperature snow/rain (°C), CTP                          0.69
Snow water holding capacity (volume/volume)                       0.045
Parameter of snow compaction (cm^2 g^-1 day^-1)                   0.15
Snow evaporation parameter (cm day^-1 mb^-1)                      0.01
Depth of unchanged ground temperature (cm)                        120
Temperature of ground water (°C)                                  2.0
Part of actual evaporation from horizon A (d.l.)                  0.94
  (the rest is evaporated from the groundwater zone)
Critical temperature for start of snow melt (°C), CTM             0.00

*The parameter is multiplied by the factor (mean slope of element/mean slope of NOPEX area)
**The parameter is divided by the factor (mean slope of element/mean slope of NOPEX area)
The Bayesian Method
The Bayesian method aims to establish a multi-dimensional probability distribution for the
parameters conditioned on hydrological observations. When we have observed the data vector Y, the
probability, p, of the parameter set $\theta_i$ is given by:
$$p(\theta_i \mid Y) \qquad (1)$$
Due to non-linearities in the model, the distribution may have an irregular surface containing
several local maxima. Therefore an empirical non-parametric distribution of Eq. (1) is established.
Estimation of Likelihood
First the likelihood for the parameter sets is calculated as (e.g. Freer et al., 1996):
$$L(\theta_i \mid Y) \propto \exp\left(-\frac{\sigma_i^2}{\sigma_{obs}^2}\right) \qquad (2)$$
where $L(\theta_i \mid Y)$ is the likelihood of the ith parameter set $\theta_i$ given the observations Y, $\sigma_i^2$ is the sum
of squared errors divided by the number of time steps in a period, and $\sigma_{obs}^2$ is the observed variance
over a period (here one year). The likelihood function is calculated for a period of one year, 1 June
to 31 May. The choice of the likelihood function can be based on two different arguments. The first
is to assume that the likelihood of a parameter set is proportional to a quality-of-fit measure (Beven
and Binley, 1992). The choice of likelihood function is then subjective. The function in Eq. (2)
was chosen because, when Bayes' theorem is used for updating, the error variance of each
period or catchment contributes linearly inside the exponent. The second is a statistical derivation,
which reveals the assumptions hidden inside this likelihood function. In this case it is
assumed that $Y_j \sim f(Y_j \mid \theta_i)$, where $f(\cdot \mid \cdot)$ is a density function indexed by a parameter vector $\theta_i$.
When the data $Y_j$ are given, the likelihood function $L(\theta_i \mid Y_j)$ is any function proportional to $f(Y_j \mid \theta_i)$.
It is assumed that the simulation errors at each time step j, $\varepsilon_{i,j}$, are identically normally distributed:
$$\varepsilon_{i,j} \sim N(0, \sigma_\varepsilon^2) \qquad (3)$$
where $\sigma_\varepsilon^2$ is the variance of the simulation errors (the difference between the observed and simulated
streamflow). Then the likelihood of the ith parameter set $\theta_i$, dependent on one observation $Y_j$, is:
$$L(\theta_i \mid Y_j) \propto \exp\left(-\frac{1}{2}\frac{\varepsilon_{i,j}^2}{\sigma_\varepsilon^2}\right) \qquad (4)$$
The likelihood of the parameter set $\theta_i$ when data Y from one year are given and assuming the
errors are independent, is:
$$L(\theta_i \mid Y) \propto \prod_{j=1}^{n} \exp\left(-\frac{1}{2}\frac{\varepsilon_{i,j}^2}{\sigma_\varepsilon^2}\right) = \exp\left(-\frac{\sum_{j=1}^{n}\varepsilon_{i,j}^2}{2\sigma_\varepsilon^2}\right) = \exp\left(-\frac{n\,\sigma_i^2}{2\sigma_\varepsilon^2}\right) \qquad (5)$$
In order to get Eq. (2), set n = 365; the variance of the simulation errors then has to be:
$$\sigma_\varepsilon^2 = \frac{n}{2}\,\sigma_{obs}^2 \qquad (6)$$
It should be noted that the variance in Eq. (6) corresponding to the likelihood in Eq. (2) is far too
large to be a reasonable estimate of the variance of the simulation errors. The over-estimation of the
variance in Eq. (2) results in an over-estimation of the variance of the model parameters. The
likelihood in Eq. (2) is not a statistical likelihood but a generalised (GLUE) likelihood. The
difference between Eq. (2) and Eq. (5) influences only the scale of the likelihood, and affects neither the
location nor the shape.
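The relation between Eqs. (2), (5) and (6) can be checked numerically with the following minimal Python sketch. The function names are illustrative, and computing the observed variance with divisor n (rather than n - 1) is an assumption, since the text does not specify it.

```python
import math


def mean_squared_error(observed, simulated):
    """sigma_i^2: sum of squared errors divided by the number of time steps."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def variance(values):
    """sigma_obs^2, computed with divisor n (assumption)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def glue_likelihood(observed, simulated):
    """Unnormalised likelihood of Eq. (2)."""
    return math.exp(-mean_squared_error(observed, simulated) / variance(observed))


def gaussian_likelihood(observed, simulated, sigma_eps2):
    """Unnormalised likelihood of Eq. (5) for independent normal errors."""
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.exp(-sse / (2.0 * sigma_eps2))


# Toy series (made-up numbers, for illustration only).
observed = [1.0, 2.0, 4.0, 3.0]
simulated = [1.5, 2.0, 3.0, 3.0]
n = len(observed)
sigma_eps2 = n / 2.0 * variance(observed)  # Eq. (6)
print(glue_likelihood(observed, simulated))
print(gaussian_likelihood(observed, simulated, sigma_eps2))
```

With the error variance set according to Eq. (6), both functions return the same value, which is what the derivation states.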
The statistical model chosen for the simulation errors (Eq. 3) does not describe all the properties of
the data. To find a more statistically correct likelihood function, the statistical properties of the
simulation errors have to be carefully investigated: the distribution of the simulation errors, whether
the variance is constant or dependent on streamflow and input variables (e.g. temperature,
precipitation) and whether the simulation errors are autocorrelated. Sorooshian (1991),
Romanowicz et al. (1994), Langsrud et al. (1998) and Kuczera (1983), among others, have
constructed statistical models for the simulation error that take into account one or more of these
three aspects. Finding a model that satisfactorily describes the simulation errors is difficult.
Gupta et al. (1998) suggested that there may not exist an objective, statistically
correct choice for the error function.
Even though the chosen likelihood function is not statistically correct, it is a useful starting point
for investigating the existence of regional parameters and approximately where in the parameter
space the best parameter values are located. The likelihood function (Eq. 2) is selected because it
is already commonly applied in hydrology. The aim is not to assess the model or parameter
uncertainty in detail. Further research into this topic is needed, but it is not within the scope of the
present work.
Definition of Prior Distribution of Parameters
The prior distribution of the parameters is chosen to be uniform. The upper and lower limits of the
uniform distributions have to be decided. For some parameters the choice is easy, e.g. threshold
temperature for snow/rain precipitation. For other parameters, e.g. horizontal conductivity of
horizon A, test runs were performed to ensure a sufficiently large variation. The borders span the
optimised parameters from Motovilov et al. (1999) (Table 2).
Estimation of Posterior Distribution of Parameters
The posterior distribution, $p(\theta_i \mid Y)$, is estimated using Bayes' theorem:
$$p(\theta_i \mid Y) = \frac{p(Y \mid \theta_i)\,p(\theta_i)}{p(Y)} \qquad (7)$$
When the data Y are given, $p(Y \mid \theta_i)$ can be regarded as a function of $\theta_i$, which is the likelihood
function of $\theta_i$ given Y and is written $L(\theta_i \mid Y)$. We then have:
$$p(\theta_i \mid Y) = \frac{L(\theta_i \mid Y)\,p(\theta_i)}{C} \qquad (8)$$
where $p(\theta_i)$ is the prior probability for the parameter set, $p(\theta_i \mid Y)$ the posterior probability given
observations Y, $L(\theta_i \mid Y)$ the likelihood function calculated from the set of observations Y, and C is
a scaling constant making the cumulative sum equal to 1.
Data from an additional period or data from a neighbouring catchment are used for updating the
distribution $p(\theta_i \mid Y)$. When Eq. (2) is used to calculate the likelihood, additional error variance
contributes linearly inside the exponent. Repeated use of Eqs. (2) and (8) gives the posterior
distribution conditioned on n sets of observations:
$$p(\theta_i \mid Y_1, \ldots, Y_n) = \frac{1}{C}\exp\left(-\frac{\sigma_{i,1}^2}{\sigma_{obs,1}^2} - \cdots - \frac{\sigma_{i,n}^2}{\sigma_{obs,n}^2}\right) \qquad (9)$$
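The updating in Eqs. (8) and (9) can be sketched in Python for a finite set of sampled parameter sets with a uniform prior. This is a minimal illustration under those assumptions; the names `posterior` and `update` and the toy error ratios are hypothetical.

```python
import math


def posterior(error_ratios):
    """Posterior probabilities of Eq. (9) under a uniform prior.

    error_ratios[i][k] is sigma_{i,k}^2 / sigma_{obs,k}^2 for parameter
    set i and data set k (one year of streamflow at one station).
    """
    exponents = [-sum(row) for row in error_ratios]
    shift = max(exponents)  # numerical safeguard; cancels when normalising
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)  # plays the role of the scaling constant C
    return [w / total for w in weights]


def update(prior, new_ratios):
    """One application of Eq. (8): multiply the prior by the Eq. (2)
    likelihood of one additional data set and rescale."""
    weights = [p * math.exp(-r) for p, r in zip(prior, new_ratios)]
    total = sum(weights)
    return [w / total for w in weights]


# Three parameter sets, two data sets (made-up ratios for illustration).
ratios = [[0.2, 0.3], [0.5, 0.6], [1.0, 1.2]]
batch = posterior(ratios)
sequential = update(update([1.0 / 3] * 3, [0.2, 0.5, 1.0]), [0.3, 0.6, 1.2])
```

Because the error ratios add inside the exponent, updating one data set at a time gives the same distribution as using all data sets at once.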
Value of Additional Data
The Shannon entropy measure H describes the variance of a multi-dimensional distribution and is
here used to measure the advantage of additional data.
$$H = -\frac{\sum_{i=1}^{M} p_i \log p_i}{\log M} \qquad (10)$$
where the probabilities $p_i$ are scaled to make their sum equal to one, and M is the total number of
simulations of different parameter sets. This function has 1 as a maximum value when all the
realisations have the same probability, and a minimum at zero when one realisation has a
probability of 1 and all others are zero.
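Eq. (10) translates directly into a few lines of Python. The sketch below is illustrative (the function name `normalised_entropy` is an assumption) and uses the usual convention that a term with zero probability contributes nothing to the sum.

```python
import math


def normalised_entropy(probabilities):
    """Shannon entropy of Eq. (10), scaled by log M to lie in [0, 1]."""
    m = len(probabilities)
    if m < 2:
        raise ValueError("at least two realisations are needed")
    total = sum(probabilities)  # rescale so the probabilities sum to one
    entropy = 0.0
    for p in probabilities:
        q = p / total
        if q > 0.0:  # convention: 0 * log 0 = 0
            entropy -= q * math.log(q)
    return entropy / math.log(m)
```

A decrease of this value after adding a data set indicates that the probability mass has concentrated on fewer parameter sets.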
Uncertainty Boundaries for Streamflow
The errors in the streamflow simulations have four main sources: measurement errors in the
observed data used for calibration, measurement errors in the input data, errors in the model
structure and errors in the
parameter values. To assess the uncertainty in calculated streamflow due to the uncertainty in
parameter values, samples from the estimated parameter distribution are put into the model to
calculate a sample of streamflows for each day. The 95% quantiles of the calculated streamflow can
then be plotted together with the observed streamflow. Such a plot is useful for illustrating how
important uncertainty in parameter values is for the simulation error. If the observed streamflow
falls outside the uncertainty boundaries, the other error sources are also important.
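The construction of such uncertainty boundaries can be sketched in Python as below. Reading the "95% quantiles" as the central 95% interval (2.5% and 97.5% quantiles) is an assumption, as are the function names and the linear-interpolation quantile rule.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] * (1.0 - fraction) + sorted_values[high] * fraction


def uncertainty_band(ensemble, lower=0.025, upper=0.975):
    """ensemble[k][t] is the simulated streamflow of parameter sample k on day t.
    Returns one (low, high) pair per day."""
    band = []
    for t in range(len(ensemble[0])):
        flows = sorted(run[t] for run in ensemble)
        band.append((quantile(flows, lower), quantile(flows, upper)))
    return band


def fraction_outside(observed, band):
    """Share of days on which the observation falls outside the band."""
    outside = sum(1 for o, (low, high) in zip(observed, band) if o < low or o > high)
    return outside / len(observed)
```

A large share of observations outside the band would point to the other error sources listed above.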
Sampling at regular points
The distributions are sampled at regular points in the parameter space. Following this strategy, the
distributions are calculated only once for each data set (here 90 data sets in total, from 10 years and
nine catchments), and afterwards Bayes' theorem makes it possible to combine the distributions.
However, this implementation requires huge calculation capacity when the parameter space has
many dimensions. If the model has 9 parameters, $10^9$ simulations are necessary to run the model at
a resolution of 1/10 of the initial parameter range. Even then it is possible that the area of
highest probability is not found. If the parameter range is chosen too wide, the sample in the
parameter space may be too sparse, and if it is too narrow, the most probable
part of the parameter space may fall outside the window. As 1000 iterations of the ECOMAG model
for the nine catchments on a 500 MHz Compaq Alpha workstation require 7.5 hours, the
computation time limits the dimension of the parameter space. This would also be the case for other
sampling strategies, e.g. importance sampling and stratified sampling. Therefore conditional
distributions are calculated.
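The cost argument can be made concrete with a short calculation. The Python sketch below only combines the figures quoted above (10 points per parameter axis and 7.5 hours per 1000 model runs); the function name `regular_grid_cost` is illustrative, and the assumption is that run time scales linearly with the number of runs.

```python
def regular_grid_cost(n_parameters, points_per_axis=10, hours_per_1000_runs=7.5):
    """Number of model runs and computing time for a full regular grid."""
    runs = points_per_axis ** n_parameters
    hours = runs * hours_per_1000_runs / 1000
    return runs, hours


runs_9d, hours_9d = regular_grid_cost(9)  # full nine-dimensional grid
runs_2d, hours_2d = regular_grid_cost(2)  # one two-dimensional conditional grid
years_9d = hours_9d / (24 * 365)
```

Under these assumptions the nine-dimensional grid needs $10^9$ runs, or 7.5 million hours (more than 800 years of computing), whereas a two-dimensional conditional grid needs 100 runs, or 0.75 hours.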
Metropolis Hastings Sampling
The Metropolis Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970), a Markov Chain
Monte Carlo (MCMC) method, is used to find the distribution in a nine dimensional parameter
space. This sampling method is chosen because it only requires knowledge about the likelihood
function, and because the number of required calculations is almost independent of the dimension
of the parameter space. Kuczera and Parent (1998) used the MH algorithm to assess parameter
uncertainty in a hydrological model, and they concluded that this algorithm produces reliable
inference with modest sampling. Here a random-walk MH algorithm is applied (Chib and
Greenberg, 1995) to calculate the chain $\{\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(n)}\}$, which gives the parameter vector
for iterations 1, 2, ..., n.
The first iterations, before the chain has converged (the burn-in), have to be removed. From the rest of
the chain, the mean, variance, correlation and histogram can be calculated. As a long chain is better
than several short chains (Geyer, 1992), one chain of length 11000 is calculated.
For the MH algorithm the number of iterations is almost independent of the dimension of the parameter
space, whereas for regular sampling the number of calculations
increases exponentially with the dimension. As the MH algorithm investigates only the most
probable part of the parameter space, the problem of too small sampling density is avoided. But the
MH algorithm has some drawbacks. The chain generated by the MH algorithm may not converge,
and the algorithm may have difficulties especially with multi-modal distributions. The MH routine
does not allow updating the distribution (Eq. 9) when additional data are available; a new chain
has to be calculated instead. In this study 90 data sets in total are available, but a chain is calculated only
once using all the data sets.
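A random-walk MH sampler of the kind described in this section can be sketched in Python as follows. This is a generic illustration and not the implementation used in the study: the Gaussian proposal, the step sizes, the function name `random_walk_metropolis` and the one-dimensional toy log-likelihood standing in for an ECOMAG run are all assumptions. Proposals outside the prior bounds are rejected, which corresponds to a uniform prior.

```python
import math
import random


def random_walk_metropolis(log_likelihood, start, lower, upper, step, n_iter, seed=0):
    """Random-walk Metropolis-Hastings chain with a uniform prior on
    [lower, upper] and a symmetric Gaussian proposal (assumed form)."""
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current)
    chain = []
    for _ in range(n_iter):
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, step)]
        inside = all(lo <= p <= hi for p, lo, hi in zip(proposal, lower, upper))
        if inside:
            proposal_ll = log_likelihood(proposal)
            diff = proposal_ll - current_ll
            # Symmetric proposal: accept with probability min(1, L_new / L_old).
            if diff >= 0.0 or rng.random() < math.exp(diff):
                current, current_ll = proposal, proposal_ll
        chain.append(list(current))
    return chain


def toy_log_likelihood(theta):
    """Stand-in for a model run: a single peak at 2.0 (hypothetical)."""
    return -0.5 * ((theta[0] - 2.0) / 0.5) ** 2


chain = random_walk_metropolis(toy_log_likelihood, [2.0], [0.0], [5.0], [0.5], 11000)
kept = chain[1000:]  # discard the first 1000 iterations as burn-in
posterior_mean = sum(sample[0] for sample in kept) / len(kept)
```

In an application to the model, each call of the log-likelihood would involve a full simulation for all data sets, so the chain length determines the computing time.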
Results
To investigate how the entropy develops as more data sets are used in the likelihood function,
regular sampling of conditional distributions in a two-dimensional parameter space is performed.
The MH algorithm is applied for calculating the distribution in the nine dimensional parameter
space when data from 10 years and 9 catchments are used.
Regular sampling
A regular sampling of conditional distributions is performed in two dimensional parameter spaces.
To have a common reference, the horizontal conductivity of horizon A is chosen as one dimension
in all the spaces. The other dimensions are in turn vertical conductivity of horizon A, horizontal
conductivity of the groundwater zone, thickness of horizon A, evaporation parameter, surface
depression storage, critical temperature snow/rain precipitation, degree-day-factor and threshold
temperature for start of snowmelt. Simulated streamflow proved to be most sensitive to these nine
parameters. The likelihood function is computed for a time period of one year, 1
June to 31 May, for each catchment for the years 1981-1990. Fig. 3 shows the entropy measure of
the conditional and marginal distributions as new streamflow data are used for updating (10 years
for each station) when regular sampling is used.
MH sampling
The MH algorithm is tuned, and 11000 iterations are performed for a nine dimensional parameter
space. The histograms for the marginal distributions are shown in Fig. 4. As the starting point of the
chain is assumed to be within the distribution, the first 1000 samples are removed as burn-in, and
10000 samples are left for the analysis. Several of the parameters are correlated (Table 3). The
highest correlations are found between the three parameters representing winter conditions: degree-day-factor, critical temperature for start of snowmelt and critical temperature for phase of
precipitation. The calculated chains seem to be stationary for all parameters. The autocorrelations
decrease slowly for the three correlated parameters and are insignificant only after 1000 iterations.
Test calculations show that the problem disappears if two of the three parameters are
excluded. The high autocorrelation implies that longer chains are necessary to get a better estimate
of the histograms and that the results from the two dimensional conditional distributions may not be
totally correct. The Nash-Sutcliffe coefficients Reff for the parameter values having the maximum
likelihood value in the MH chain are given in Table 4.
Fig. 3. The entropy measure for the two dimensional parameter spaces, as new data are used for
updating of the probability distributions.
Table 3 - Correlation matrix for nine ECOMAG parameters (see Table 2 for the abbreviations of
parameter names)

        HCA    EVAP   DDF    CTM    CTP    THA    SDS    HCG    VCA
HCA     1.00   0.34  -0.09   0.03  -0.02   0.47   0.43   0.43  -0.01
EVAP    0.34   1.00  -0.01   0.07  -0.13   0.16   0.08   0.32   0.02
DDF    -0.09  -0.01   1.00   0.81  -0.31   0.05   0.01  -0.10  -0.16
CTM     0.03   0.07   0.81   1.00  -0.65   0.05   0.15   0.02  -0.13
CTP    -0.02  -0.13  -0.31  -0.65   1.00   0.08  -0.06  -0.05   0.01
THA     0.47   0.18   0.05   0.05   0.08   1.00   0.37   0.45  -0.00
SDS     0.43   0.08   0.01   0.15  -0.06   0.37   1.00   0.41  -0.25
HCG     0.43   0.32  -0.10   0.02  -0.05   0.45   0.41   1.00  -0.04
VCA    -0.10   0.02  -0.16  -0.13   0.01  -0.00  -0.25  -0.04   1.00
[Figure 4: nine histogram panels, one per parameter: evaporation parameter, degree-day-factor, thickness of horizon A, surface depression storage, critical temperature for phase of precipitation, critical temperature for start of snowmelt, horizontal conductivity of horizon A, horizontal conductivity of groundwater zone, vertical conductivity of horizon A.]
Fig. 4. The posterior marginal probability distributions updated on data from all years and
catchments, calculated using the MH algorithm.
Table 4 - Model performance (Reff) for the gauged catchments in the NOPEX area, when the
optimal parameter set is used.

Year      Fyrisån  Sagån  Lillån  Örsundaån  Hågaån  Sävaån  Sävjaån  Stabbybäcken  Stalbobäcken
1981/82   0.76     0.65   0.82    0.86       0.75    0.82    0.75     0.17          0.58
1982/83   0.82     0.62   0.69    0.67       0.53    0.69    0.53     0.68          0.61
1983/84   0.75     0.58   0.72    0.66       0.69    0.76    0.43     0.86          0.66
1984/85   0.63     0.85   0.74    0.84       0.80    0.91    0.81     0.46          0.72
1985/86   0.72     0.53   0.68    0.77       0.81    0.83    0.80     0.21          0.49
1986/87   0.89     0.58   0.66    0.77       0.66    0.80    0.80     0.54          0.54
1987/88   0.89     0.50   0.60    0.70       0.68    0.78    0.73     0.80          0.60
1988/89   0.66     0.28   0.28    0.29       0.40    0.37    0.19     0.57          0.36
1989/90   0.91     0.71   0.63    0.78       0.75    0.82    0.83     0.84          0.70
1990/91   0.75     0.38   0.49    0.58       0.53    0.68    0.66     0.38          0.63
The MH sample of the parameter distribution is put into the ECOMAG model to calculate a sample
of streamflows for each day. In Fig. 5 the area between the 95% quantiles of the calculated
streamflow is shaded grey, and the observed values are solid lines. The results are shown for four
of the catchments for the years 1986/87 and 1988/89.
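The construction of such an uncertainty band can be sketched as follows. This is a minimal
illustration, not the authors' code: the simulated hydrographs are synthetic stand-ins for ECOMAG
output, the function names are hypothetical, and the band is assumed to be the central 95%
interval (2.5% and 97.5% percentiles), which the text does not state explicitly.

```python
import numpy as np

def streamflow_band(simulated, lower=2.5, upper=97.5):
    """Per-day lower and upper percentiles across parameter samples.

    simulated has shape (n_samples, n_days): one simulated hydrograph
    for each parameter set drawn from the MH chain.
    """
    simulated = np.asarray(simulated, dtype=float)
    lo = np.percentile(simulated, lower, axis=0)
    hi = np.percentile(simulated, upper, axis=0)
    return lo, hi

def fraction_outside(observed, lo, hi):
    """Share of days on which the observation falls outside the band."""
    observed = np.asarray(observed, dtype=float)
    outside = (observed < lo) | (observed > hi)
    return float(outside.mean())

# Synthetic example (assumption): 500 parameter samples, 365 days.
rng = np.random.default_rng(0)
days = np.arange(365)
base = 5.0 + 4.0 * np.sin(2.0 * np.pi * days / 365.0) ** 2
simulated = base * rng.lognormal(mean=0.0, sigma=0.1, size=(500, 365))
observed = 1.5 * base  # an "observed" series with a systematic offset

lo, hi = streamflow_band(simulated)
share = fraction_outside(observed, lo, hi)
print(f"Fraction of days outside the band: {share:.2f}")
```

A large share of days outside the band, as in this constructed case, points to errors that
parameter uncertainty alone cannot explain.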
Discussion
The entropy of the conditional distributions decreases as new data are used for updating them (Fig.
3), and, with two exceptions, fairly well-defined regions appear to contain good parameter values
for the NOPEX region (Fig. 4).
The two parameters that do not follow this trend are the vertical conductivity of horizon A and the
surface depression storage. The reasons for the high variances of these two distributions are
different. The model results are good once the vertical conductivity of horizon A is high enough;
above this limit they are rather insensitive to the parameter. This can be explained by the high
infiltration capacity of a typical Nordic catchment: all precipitation infiltrates in unsaturated
areas. The vertical conductivity of horizon A is therefore not important to include in the
parameter space, provided a sufficiently high value has been chosen, and it could be excluded to
save computing time. For surface depression storage the shape of the distributions differs between
the catchments, resulting in a regional distribution with high variance and without a clear modal
value (Fig. 4). These differences show that the parameterisation of the grid cells does not capture
the physical properties that are most important for this parameter. It is also possible that this
process description compensates for other physical processes, e.g. interception. A change in the
description and/or parameterisation of this process is necessary to find a regional parameter
value.
[Fig. 5 panels: Fyrisån, Lillån, Stalbobäcken and Sagån, each for 1986/87 and 1988/89; vertical
axis Q (m3/s), horizontal axis Days]
Fig. 5. The 95% quantiles (grey area) of simulated streamflow resulting from the parameter
uncertainty, and the observed streamflow (solid line) for 1986/87 and 1988/89.
Fig. 3 reveals that for three parameters (critical temperature for start of snowmelt,
degree-day-factor and critical temperature for phase of precipitation) the entropy stops decreasing
when the data sets from the seven catchments have been used for updating the distributions. This
suggests that a limit is reached for how accurately the parameters can be defined, and that new
data do not provide more information about the parameters. However, as these three parameters are
highly correlated, these conclusions may not be valid for the marginal distributions.
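The behaviour of the entropy under updating can be illustrated with a small sketch. It assumes
that entropy means the Shannon entropy of a distribution discretised on a regular grid (the
definition used in the paper is given in an earlier section and may differ in detail), and it uses
hypothetical Gaussian-shaped likelihoods rather than ECOMAG results.

```python
import numpy as np

def shannon_entropy(weights):
    """Shannon entropy (in nats) of a discretised distribution."""
    w = np.asarray(weights, dtype=float)
    p = w / w.sum()
    p = p[p > 0.0]  # 0 * log(0) is taken as 0
    return float(-(p * np.log(p)).sum())

def gaussian_weights(grid, centre, spread):
    """Hypothetical likelihood of one data set, evaluated on the grid."""
    return np.exp(-0.5 * ((grid - centre) / spread) ** 2)

grid = np.linspace(-2.0, 3.0, 101)   # regular sample of one parameter
prior = np.ones_like(grid)           # uniform prior

# Bayesian updating: multiply by the likelihood of each new data set.
after_one = prior * gaussian_weights(grid, 0.5, 1.0)
after_two = after_one * gaussian_weights(grid, 0.6, 1.0)
# A data set that carries no information about the parameter (flat likelihood).
after_three = after_two * np.ones_like(grid)

h_prior = shannon_entropy(prior)
h_one = shannon_entropy(after_one)
h_two = shannon_entropy(after_two)
h_three = shannon_entropy(after_three)
print(h_prior, h_one, h_two, h_three)
```

In this sketch the entropy falls with each informative data set and stays constant when the added
data are uninformative, which is the pattern described above for the three snow-related
parameters.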
The parameter values found by Motovilov et al. (1999) (Table 2) appear to be close to the modal
values found here. The Nash-Sutcliffe coefficients (Reff) for the parameter values giving the
maximum likelihood in the MH chain (Table 4) also indicate satisfactory or good model performance
for all catchments. For individual catchments and years, however, the criterion is not always
satisfactory. Stabbybäcken has the smallest Reff value, 0.17 for the year 1981/82, and for the year
1988/89 the model does not perform well for any catchment and performs satisfactorily for only
five. Stalbobäcken is a special catchment as it contains parts of an esker, and it may have been
better to exclude this catchment from the estimation of regional parameters.
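For reference, the Nash-Sutcliffe coefficient (Nash and Sutcliffe, 1970) compares the squared
simulation errors with the variance of the observations. A minimal sketch, with hypothetical
function and variable names and made-up numbers:

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    """Reff = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)

obs = np.array([1.0, 2.0, 4.0, 3.0, 2.0])  # illustrative streamflow values

perfect = nash_sutcliffe(obs, obs)                              # exact fit
mean_only = nash_sutcliffe(obs, np.full_like(obs, obs.mean()))  # mean as "model"
biased = nash_sutcliffe(obs, obs + 1.0)                         # constant offset
print(perfect, mean_only, biased)
```

A value of 1 corresponds to a perfect fit, and a value of 0 means the simulation is no better than
the mean of the observations; a constant bias alone can pull the coefficient close to zero.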
The observed streamflow is outside the uncertainty boundaries much of the time (Fig. 5), indicating
that error sources other than errors in the parameter values are important.
The observed streamflow is outside the uncertainty boundaries for all catchments during days
200-220 in the year 1988/89. ECOMAG does not manage to describe the general response resulting
from the special winter conditions of this period.
For Sagån the observed streamflow is almost always higher than the upper 95% quantile. The
observed specific streamflow in Sagån is higher than in the other eight catchments. A quality
control of the streamflow data and rating curve therefore ought to be performed before concluding
that ECOMAG fails to model this catchment properly. Streamflow is also difficult to simulate for
Stalbobäcken. This is a special catchment as it contains parts of an esker, giving a relatively
high streamflow in recession periods. This indicates that not all the important physical
characteristics of the grid elements are properly included in the parameterisation of ECOMAG.
It is important to be aware that the chosen likelihood function is not a statistical likelihood but
a generalised one. The variance of the simulation errors (Eq. 6) is far too large for a statistical
likelihood. If a more correct variance were used, the entropy in Fig. 3 would be smaller, the
variance of the distributions in Fig. 4 would be smaller, and the uncertainty boundaries in
Fig. 5 would be narrower. The main conclusions, however, would not change much. The
difference between Eq. (2) and Eq. (5) affects only the scale of the likelihood, neither the
location nor the shape. The posterior (Eq. 9), however, might vary as the contribution to the
posterior likelihood from the different data sets might change. But this difference is not expected
to influence the conclusions drawn on the basis of the four regionalisation criteria defined in the
introduction. The GLUE approach lets the uncertainty in the parameter values account for too much
of the simulation errors and does not recognise the three other sources of error: measurement
errors in the observed data used for calibration, measurement errors in the input data, and errors
in the model structure.
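The role of the error variance can be illustrated with a sketch. It assumes a likelihood of
Gaussian form, proportional to exp(-SSE / (2 * variance)), which matches the assumption of
independent, normally distributed errors with constant variance; Eq. (6) itself is not reproduced
here, and the sum of squared errors below is a hypothetical function of a single parameter.

```python
import numpy as np

def normalised_likelihood(sse, error_variance):
    """Gaussian-type likelihood on a parameter grid, normalised to sum to 1."""
    log_like = -0.5 * np.asarray(sse, dtype=float) / error_variance
    log_like = log_like - log_like.max()  # guard against underflow
    weights = np.exp(log_like)
    return weights / weights.sum()

def spread(theta, p):
    """Standard deviation of a discretised distribution."""
    mean = np.sum(theta * p)
    return float(np.sqrt(np.sum(p * (theta - mean) ** 2)))

theta = np.linspace(0.0, 2.0, 201)
sse = 50.0 * (theta - 1.2) ** 2 + 10.0  # hypothetical sum of squared errors

p_small = normalised_likelihood(sse, error_variance=0.5)
p_large = normalised_likelihood(sse, error_variance=5.0)

mode_small = theta[np.argmax(p_small)]
mode_large = theta[np.argmax(p_large)]
spread_small = spread(theta, p_small)
spread_large = spread(theta, p_large)
print(mode_small, mode_large, spread_small, spread_large)
```

Under these assumptions the inflated variance leaves the best parameter value where it is but
widens the distribution, consistent with the statement that a more correct variance would give
narrower distributions and uncertainty boundaries without changing the main conclusions.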
Conclusions
The aim of this work is to show that regional parameters exist for the macro-scale hydrological
model ECOMAG and that it is meaningful to define them. The Bayesian method is used to find a
distribution for the parameters, and data from different catchments and years are used for
updating the distribution.
The results herein suggest that for the NOPEX area regional parameters exist according to the
predefined criteria. The use of additional data implies a decrease in the variances of the
conditional distributions for most of the parameters, and a relatively narrow area in parameter
space appears to contain good parameter sets for simulation of streamflow in the nine studied
catchments. For three parameters (critical temperature for start of snowmelt, degree-day-factor and
critical temperature for phase of precipitation) a limit is reached for how accurately the
parameters can be defined. When this limit is reached, additional data do not give more information
about the parameters. It is shown that one parameter in the ECOMAG model, the surface depression
storage, is not suitable for a regional application: the location of the best parameter value
changes too much between years and catchments. The model results are found to be relatively
insensitive to the vertical conductivity of horizon A; they are good provided a sufficiently high
value is chosen for this parameter.
Even though the statistical assumptions for calculating the distribution of the parameters are
violated (the simulation errors are assumed to be independent and normally distributed with zero
mean and constant variance), the Bayesian method proves to be useful for estimating regional
parameters. As the simulation errors may depend on climatic conditions as well as on physical
properties of the catchments, the search for a more appropriate error model needs further
attention. To better identify the parameter uncertainty, it would also be necessary to identify the
other three main error sources: measurement errors in the observed data used for calibration,
measurement errors in the input data, and errors in the model structure. The GLUE approach lets the
parameter uncertainty account for too much of the simulation errors.
A practical problem is the required computing time. For regular sampling the dimension of the
parameter space and the density of the samples in parameter space cannot be as high as wanted,
whereas the MH algorithm requires recalculations when new data are available. The regular
sampling would gain from a reduction in the dimension of the parameter space, which could be
obtained by searching for relationships between parameters.
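The computational trade-off can be made concrete with a sketch of a random-walk Metropolis
sampler. It is a generic textbook version with a symmetric Gaussian proposal, so the Hastings
correction cancels; the proposal used in this study is not specified in this section. The
log-posterior is a hypothetical two-parameter stand-in for a function that would require one
ECOMAG run per evaluation.

```python
import numpy as np

def metropolis_hastings(log_post, start, step, n_iter, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    current = np.asarray(start, dtype=float)
    current_lp = log_post(current)
    chain = np.empty((n_iter, current.size))
    accepted = 0
    for i in range(n_iter):
        proposal = current + step * rng.standard_normal(current.size)
        proposal_lp = log_post(proposal)  # one model run per iteration
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_iter

def toy_log_post(theta):
    """Hypothetical log-posterior: independent Gaussians around (1.0, -0.5)."""
    target = np.array([1.0, -0.5])
    return float(-0.5 * np.sum(((theta - target) / 0.3) ** 2))

rng = np.random.default_rng(42)
chain, rate = metropolis_hastings(toy_log_post, [0.0, 0.0], 0.3, 5000, rng)
posterior_mean = chain[1000:].mean(axis=0)  # discard the first 1000 as burn-in
print(rate, posterior_mean)
```

Because the chain is tied to one particular log-posterior, adding a data set changes the target and
the chain has to be run again. A regular grid of model runs can instead be reweighted with each new
likelihood, but the number of grid points grows exponentially with the dimension of the parameter
space.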
References
Abdulla, F. A. and Lettenmaier, D. P. (1997) Application of regional parameter estimation schemes
to simulate the water balance of a large continental river, Journal of Hydrology Vol. 197, pp.
258-285.
Beven, K.J. and Binley, A.M. (1992) The future of distributed models: Model calibration and
uncertainty prediction, Hydrological Processes, Vol. 6, pp. 279-298.
Chib, S. and Greenberg, E. (1995) Understanding the Metropolis-Hastings Algorithm, The
American Statistician, Vol. 49 (4), pp. 327-335.
Freer, J., Beven, K. and Ambroise, B. (1996) Bayesian estimation of uncertainty in runoff
prediction and the value of data: An application of the GLUE approach, Water Resources
Research, Vol. 32 (7), pp. 2161-2173.
Geyer, C.J. (1992) Practical Markov Chain Monte Carlo, Statistical Science, Vol. 7, pp. 473-483.
Gupta, H. V., Sorooshian, S. and Yapo, P. O. (1998) Towards improved calibration of hydrologic
models: Multiple and noncommensurable measures of information, Water Resources
Research, Vol. 34 (4), pp. 751-763.
Halldin, S. and Lundin, L-C. (1994) SINOP-system for information in NOPEX, NOPEX Technical
report No. 1, Institute of Earth Sciences, Uppsala University.
Halldin, S., Gottschalk, L., Van de Griend, A. A., Gryning, S-E., Heikinheimo, M., Högstrom, U.,
Jochum, A. and Lundin, L-C. (1995) Science plan for NOPEX, NOPEX Technical report No.
12, Institute of Earth Sciences, Uppsala University.
Halldin, S., Gottschalk, L., Van de Griend, A.A., Gryning, S-E., Heikinheimo, M., Högstrom, U.,
Jochum, A. and Lundin, L-C. (1999) NOPEX - a northern hemisphere climate processes land
surface experiment. Accepted for publication in a BACH special issue of Journal of
hydrology.
Hastings, W.K. (1970) Monte Carlo sampling methods using Markov chains and their applications,
Biometrika, Vol. 57, pp. 97-109.
Klemes, V. (1986) Operational testing of hydrological simulation models, Hydrological Sciences
Journal, Vol. 31 (1), pp. 13-24.
Kuczera, G. (1983) Improved parameter inference in catchment models 1. Evaluating parameter
uncertainty, Water Resources Research, Vol. 19 (5), pp. 1151-1162.
Kuczera, G. and Parent, E. (1998) Monte Carlo assessment of parameter uncertainty in conceptual
catchment models: the Metropolis algorithm, Journal of Hydrology, Vol. 211, pp. 69-85.
Langsrud, Ø., Frigessi, A. and Høst, G. (1998) Pure Model Error of the HBV-model, Note 4/1998,
Norwegian Water Resources and Energy Directorate, Oslo.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A. H. and Teller, E. (1953) Equation
of state calculations by fast computing machines, J. Chem. Phys., Vol. 21, pp. 1087-1092.
Motovilov, Y.G., Gottschalk, L., Engeland, K. and Rodhe, A. (1999) Validation of a distributed
model against spatial observations, Nopex special issue of Journal of Agricultural and Forest
Meteorological Research, 98-99 pp. 257-277.
Nash, J. E. and Sutcliffe, J. V. (1970) River flow forecasting through conceptual models part 1 - A
discussion of principles, Journal of Hydrology Vol. 10, pp. 282-290.
Romanowicz, R., Beven, K. J. and Tawn, J. A. (1994) Evaluation of predictive uncertainty in
nonlinear hydrological models using a Bayesian approach, Statistics for the Environment II,
Water Related Issues, edited by Barnett, V. and Turkman, K.F., pp. 297-317, John Wiley, New
York.
Seibert, P. (1994) Hydrological characteristics of the NOPEX research area, NOPEX Technical
Report No. 3, Institute of Earth Sciences, Uppsala University.
Sorooshian, S. (1991) Parameter Estimation, Model Identification, and Model Validation:
Conceptual-Type Models, Recent Advances in The Modelling of Hydrologic Systems, edited
by Bowles, D. S. and O'Connell, P. E., pp. 443-467, NATO ASI Series, Vol. 345.
Tanner, M.A. (1996) Tools for Statistical Inference: Methods for Exploration of Posterior
Distributions and Likelihood Functions, Springer, New York
Acknowledgements
This work has partly been carried out within the framework of NOPEX - a NOrthern hemisphere
climate Processes land-surface EXperiment. The data used in this investigation come from SINOP -
the System for Information in NOPEX. The river streamflow and climate data were provided to
SINOP by the Swedish Meteorological and Hydrological Institute (SMHI).
The authors are grateful for the constructive and useful criticism raised by the reviewers.