PART III: Formats of the statistical methods in SWAT

Sensitivity, auto-calibration, uncertainty and model evaluation in SWAT2005
Ann van Griensven
a.vangriensven@unesco-ihe.org
I Theory
1 The LH-OAT sensitivity analysis
LH-OAT combines the OAT design and Latin Hypercube sampling by taking the Latin Hypercube samples as initial points for an OAT design (figure I.1).
1.1 Latin-Hypercube simulations
Latin-Hypercube sampling is a sophisticated way to perform random sampling, such as Monte Carlo sampling, that allows a robust analysis with a limited number of runs. Monte Carlo samples are in general robust, but may require a high number of simulations and consequently a large amount of computational resources (time and disk memory). The concept of the Latin-Hypercube Simulation (McKay et al., 1979; McKay, 1988) is based on the Monte Carlo Simulation but uses a stratified sampling approach that allows efficient estimation of the output statistics. It subdivides the distribution of each parameter into N ranges, each with a probability of occurrence equal to 1/N. Random values of the parameters are generated such that each range is sampled only once. The model is then run N times with the random combinations of the parameters. The model results are typically analysed with multi-variate linear regression or correlation statistics. Latin-Hypercube sampling is commonly applied in water quality modelling due to its efficiency and robustness (Weijers and Vanrolleghem, 1997; Vandenberghe et al., 2001). The main drawback is the assumption of linearity: if it is not fulfilled, biased results may be obtained.
1.2 One-factor-At-a-Time sampling
The OAT (One-factor-At-a-Time) design, as proposed by Morris (1991), integrates a local into a global sensitivity method. As in local methods, each run changes only one parameter, so the change in the output of each model run can be unambiguously attributed to the parameter that was changed. This approach has the advantage of not relying on predefined (tacit or explicit) assumptions of relatively few inputs having important effects, of monotonicity of outputs with respect to inputs, or of adequacy of low-order polynomial models as an approximation to the computational model (Morris, 1991). The output analysis is based on the study of the random sample of observed elementary effects, which are generated for each considered input. The change in model outcome M(x1, ..., xi + Δxi, ..., xn) can then be unambiguously attributed to such a modification by means of an elementary effect Si defined by equation 1:

   Si = [M(x1, ..., xi + Δxi, ..., xn) − M(x1, ..., xi, ..., xn)] / Δxi    (1)

M(x1, ..., xi + Δxi, ..., xn) is usually some lumped measure such as the total mass, SSQ or SAE.
Considering n parameters (i.e. i = 1, ..., n), this experiment involves performing n+1 model runs to obtain one partial effect for each parameter according to equation 1.
Ideally, the computational experiment should account for the whole set of input parameters {xi}. In this work a simple design was used in which the computational experiment varies each input parameter one by one, starting from an initial vector (x1, ..., xi, ..., xn).
The result is quantitative, elementary and exclusive for the parameter. However, the quantitativeness of this measure of sensitivity
is only relative: as the influence of xi may depend on the values chosen for the remaining parameters, this result is only a sample
of its sensitivity (i.e. a partial effect). Therefore, this experiment is repeated for several sets of input parameters. The final effect
will then be calculated as the average of a set of partial effects, and the variance of such a set will provide a measure of how
uniform the effects are (i.e. the presence or absence of nonlinearities or crossed effects with other parameters).
The elementary effects obtained using this procedure allow the user to screen the entire set of input parameters with a low computational requirement. In this way, local sensitivities are integrated into a global sensitivity measure.
The OAT design has proved to be a very useful method for SWAT modelling (Francos et al., 2002; van Griensven et al., 2001), as it is able to analyse the sensitivity of a high number of parameters.
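The elementary effect of equation 1 can be illustrated with a toy model (a minimal sketch assuming a simple forward-difference effect; the normalization used in the actual SWAT routine may differ, and all names are ours):

```python
def oat_effects(model, x, frac=0.05):
    """One-factor-At-a-Time partial effects around the point x.

    model: callable mapping a parameter list to a scalar output M.
    frac:  relative perturbation applied to one parameter at a time.
    Returns one elementary effect per parameter; n parameters cost
    n + 1 model runs (one base run plus one perturbed run each).
    """
    base = model(x)
    effects = []
    for i, xi in enumerate(x):
        dx = frac * xi if xi != 0 else frac
        xp = list(x)
        xp[i] = xi + dx
        # unambiguous attribution: only x_i differs from the base run
        effects.append((model(xp) - base) / dx)
    return effects

# toy model: M = 2*x1 + x2**2
effects = oat_effects(lambda p: 2 * p[0] + p[1] ** 2, [1.0, 2.0])
```

Because the influence of xi may depend on the values of the remaining parameters, each such effect is only a partial effect, which is why the experiment is repeated from several starting points.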
1.3 The LH-OAT sensitivity analysis
The LH-OAT sensitivity analysis method thus combines the robustness of Latin Hypercube sampling, which ensures that the full range of all parameters is sampled, with the precision of an OAT design, which assures that the change in the output of each model run can be unambiguously attributed to the input changed in that simulation. The result is a robust and efficient sensitivity analysis method: for m intervals in the LH method and p parameters, a total of m*(p+1) runs is required.
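Putting the two together, a minimal LH-OAT loop looks as follows (an illustrative sketch; the names and the perturbation rule are ours, and the actual SWAT2005 routine differs in details such as the normalization of the effects):

```python
import random

def lh_oat(model, bounds, m, frac=0.05, seed=0):
    """LH-OAT: m Latin Hypercube points, each the start of an OAT loop.

    Total cost is m*(p+1) model runs for p parameters; the final
    sensitivity of each parameter is the average of its m partial effects.
    """
    rng = random.Random(seed)
    p = len(bounds)
    # Latin Hypercube: each parameter range split into m intervals,
    # each interval sampled once, strata shuffled across parameters
    cols = []
    for low, high in bounds:
        w = (high - low) / m
        pts = [low + (k + rng.random()) * w for k in range(m)]
        rng.shuffle(pts)
        cols.append(pts)
    totals = [0.0] * p
    for j in range(m):
        x = [cols[i][j] for i in range(p)]
        base = model(x)                      # run 1 of p+1
        for i in range(p):                   # runs 2 .. p+1
            dx = frac * (bounds[i][1] - bounds[i][0])
            xp = list(x)
            xp[i] = x[i] + dx
            totals[i] += (model(xp) - base) / dx
    return [t / m for t in totals]           # final effect: mean partial effect
```

The variance of the m partial effects (not computed here) would additionally indicate nonlinearities or crossed effects, as discussed above.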
Figure I.1: Illustration of LH-OAT sampling of values for a two-parameter model (p1, p2), where X represents the Latin Hypercube (Monte Carlo) points and the dots the OAT points.
2 Parasol (Parameter Solutions method): optimization and uncertainty
analysis in a single run
2.1 Optimization method
2.1.1 The Shuffled complex evolution algorithm
This is a global search algorithm for the minimization of a single function for up to 16 parameters [Duan et al., 1992]. It combines the direct search of the simplex procedure of Nelder and Mead [1965] with the concepts of a controlled random search, a systematic evolution of points in the direction of global improvement, competitive evolution [Holland, 1995] and complex shuffling. In a first step (zero-loop), SCE-UA selects an initial 'population' by random sampling throughout the feasible parameter space for the p parameters to be optimized (delineated by the given parameter ranges). The population is partitioned into several 'complexes' that consist of 2p+1 points. Each complex evolves independently using the simplex algorithm. The complexes are periodically shuffled to form new complexes in order to share the gained information. The algorithm searches over the whole parameter space and, in reported tests, found the global optimum with a success rate of 100% [Sorooshian et al., 1993].
SCE-UA has been widely used in watershed model calibration and in other areas of hydrology such as soil erosion, subsurface hydrology, remote sensing and land surface modeling [Duan, 2003]. It was generally found to be robust, effective and efficient [Duan, 2003].
SCE-UA has also been applied with success to SWAT, both for hydrologic parameters [Eckhardt and Arnold, 2001] and for combined hydrologic and water quality parameters [van Griensven et al., 2002].
2.1.2 Objective functions
Sum of the squares of the residuals (SSQ): similar to the Mean Square Error (MSE), it aims at matching a simulated time series to a measured time series:

   SSQ = Σ(i=1..n) (x_i,measured − x_i,simulated)²    (1)

with n the number of pairs of measured (x_measured) and simulated (x_simulated) variables.
The sum of the squares of the difference of the measured and simulated values after ranking (SSQR): The SSQR method
aims at the fitting of the frequency distributions of the observed and the simulated series. As opposed to the SSQ method, the time
of occurrence of a given value of the variable is not accounted for in the SSQR method [van Griensven and Bauwens, 2001].
After independent ranking of the measured and the simulated values, new pairs are formed and the SSQR is calculated as

   SSQR = Σ(j=1..n) (x_j,measured − x_j,simulated)²    (2)

where j represents the rank.
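The difference between the two objective functions is easy to see in code (a sketch of equations (1) and (2) above; function names are ours):

```python
def ssq(measured, simulated):
    """Sum of squared residuals on time-paired values (equation 1)."""
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))

def ssqr(measured, simulated):
    """SSQR (equation 2): rank both series independently, then pair by
    rank, so the time of occurrence is ignored and only the frequency
    distributions of the two series are compared."""
    return ssq(sorted(measured), sorted(simulated))

obs = [3.0, 1.0, 2.0]
sim = [1.0, 2.0, 3.0]
# ssq(obs, sim) penalizes the timing mismatch (6.0), while
# ssqr(obs, sim) is 0.0 because the value distributions are identical
```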
2.1.3 Multi-objective optimization
The following is based on Bayes' theorem (1763), assuming a normal distribution of the residuals [Box and Tiao, 1973]. The residuals are assumed to follow a normal distribution N(0, σ²), whereby the variance is estimated from the residuals at the optimum, which are assumed to correspond to random errors:

   σ² = SSQMIN / nobs    (3)

with SSQMIN the sum of the squares at the optimum and nobs the number of observations. The probability of a residual can then be calculated as:

   p(θ | yt,obs) = 1/√(2πσ²) · exp(−(yt,sim − yt,obs)² / (2σ²))    (4)
or, up to a normalizing constant,

   p(θ | yt,obs) ∝ exp(−(yt,sim − yt,obs)² / (2σ²))    (5)
For a time series (t = 1, ..., T) this gives

   p(θ | Yobs) = 1/(2πσ²)^(T/2) · Π(t=1..T) exp(−(yt,sim − yt,obs)² / (2σ²))    (6)
or

   p(θ | Yobs) ∝ exp(−Σ(t=1..T) (yt,sim − yt,obs)² / (2σ²))    (7)
For a certain time series Yobs, the probability of the parameter set θ, p(θ | Yobs), is thus proportional to

   p(θ | Yobs) ∝ exp(−SSQ1 / (2σ1²))    (8)
where SSQ1 is the sum of the squares of the residuals with corresponding variance σ1² for a certain time series. For 2 objectives, a Bayesian multiplication gives:

   p(θ | Yobs) = C1 · exp(−SSQ1 / (2σ1²)) · exp(−SSQ2 / (2σ2²))    (9)
Applying equation (3), (9) can be written as:

   p(θ | Yobs) = C2 · exp(−SSQ1 · nobs1 / SSQ1,min) · exp(−SSQ2 · nobs2 / SSQ2,min)    (10)
In accordance with (10), it holds that:

   ln p(θ | Yobs) = C3 − SSQ1 · nobs1 / SSQ1,min − SSQ2 · nobs2 / SSQ2,min    (11)
We can thus maximize the probability in (11) by minimizing a Global Optimization Criterion (GOC) defined as:

   GOC = SSQ1 · nobs1 / SSQ1,min + SSQ2 · nobs2 / SSQ2,min    (12)

while according to equation (11) the probability is related to the GOC by:

   p(θ | Yobs) ∝ exp(−GOC)    (13)

The sums of the squares of the residuals are thus given weights equal to the number of observations divided by the minimum sum of squares. This formulation also allows for the uncertainty analysis described below.
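As a sketch in code, the GOC of equation (12) and its probability interpretation in (13) look as follows (illustrative only; function names are ours):

```python
import math

def goc(ssq1, ssq1_min, nobs1, ssq2, ssq2_min, nobs2):
    """Global Optimization Criterion (equation 12): each objective's sum
    of squares is weighted by its number of observations divided by the
    minimum sum of squares found so far, making the objectives
    commensurable."""
    return ssq1 * nobs1 / ssq1_min + ssq2 * nobs2 / ssq2_min

def probability(goc_value):
    """Unnormalized probability of a parameter set (equation 13)."""
    return math.exp(-goc_value)

g = goc(10.0, 5.0, 100, 4.0, 2.0, 50)  # 10*100/5 + 4*50/2 = 300.0
```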
2.1.4 Parameter change options
Parameters affecting hydrology or pollution can be changed either in a lumped way (over the entire catchment) or in a distributed way (for selected subbasins or HRUs). They can be modified by replacement, by addition of an absolute change or by multiplication by a relative change. A parameter is never allowed to go beyond its predefined range. A relative change allows for a lumped calibration of distributed parameters while they keep their relative physical meaning (the soil conductivity of sand will remain higher than the soil conductivity of clay).
2.2 Parameter change options for SWAT
In the ParaSol algorithm as implemented with SWAT2005 parameters affecting hydrology or pollution
can be changed either in a lumped way (over the entire catchment), or in a distributed way (for selected subbasins or HRU’s). They can be modified by replacement, by addition of an absolute change or by a
multiplication of a relative change. A relative change means that the parameters, or several distributed
parameters simultaneously, are changed by a certain percentage. However, a parameter is never allowed to
go beyond the predefined parameter ranges. For instance, all soil conductivities for all HRU’s can be
changed simultaneously over a range of -50 to +50 % of their initial values which are different for the
HRU’s according to their soil type. This mechanism allows for a lumped calibration of distributed
parameters while they keep their relative physical meaning (soil conductivity of sand will be higher than soil
conductivity of clay).
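The three variation methods and the range clamping can be sketched as follows (the 1/2/3 coding of the variation method is assumed here for illustration; the actual codes are those of table II.2):

```python
def change_parameter(value, change, imet, low, high):
    """Apply one of the three variation methods, then clamp to the
    predefined parameter range (a parameter may never leave it).

    imet 1: replacement          -> change is the new value
    imet 2: absolute change      -> value + change
    imet 3: relative change (%)  -> value * (1 + change / 100)
    """
    if imet == 1:
        new = change
    elif imet == 2:
        new = value + change
    elif imet == 3:
        new = value * (1.0 + change / 100.0)
    else:
        raise ValueError("imet must be 1, 2 or 3")
    return min(max(new, low), high)

# a -50% relative change applied to two HRUs with different initial soil
# conductivities halves both, so their ordering (sand > clay) is preserved
```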
2.3 Uncertainty analysis method
The uncertainty analysis divides the simulations performed by the SCE-UA optimization into 'good' and 'not good' simulations. The simulations gathered by SCE-UA are very valuable, as the algorithm samples over the entire parameter space with a focus on solutions near the optimum/optima.
There are two separation techniques, both based on a threshold value for the objective function (or global optimization criterion): all simulations giving an objective function value below this threshold are selected as 'good'. The threshold value can be defined either by χ²-statistics, where the selected simulations correspond to the confidence region (CR), or by Bayesian statistics, which are able to point out the high probability density (HPD) region for the parameters or the model outputs (figure I.2).
2.3.1 χ²-method
For a single-objective calibration on the SSQ, SCE-UA will find a parameter set θ*, consisting of the p free parameters (θ*1, θ*2, ..., θ*p), that corresponds to the minimum of the sum of squares SSQ. According to χ²-statistics, we can define a threshold c for 'good' parameter sets using:

   c = OF(θ*) · (1 + χ²(p, 0.95) / (n − p))    (14)

whereby χ²(p, 0.95) gets a higher value for more free parameters p.
For multi-objective calibration, the selections are made using the GOC of equation (12), which normalizes the sums of squares for n observations, equal to the sum of nobs1 and nobs2. A threshold for the GOC is then calculated by:

   c = GOC(θ*) · (1 + χ²(p, 0.95) / (nobs1 + nobs2 − p))    (15)
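Equations (14) and (15) share the same form and can be sketched as follows (the χ² quantile must be supplied externally, e.g. from statistical tables; the function name is ours):

```python
def chi2_threshold(of_opt, chi2_p_095, n, p):
    """Threshold c for 'good' parameter sets (equations 14-15).

    of_opt:      objective function (or GOC) value at the optimum
    chi2_p_095:  95% chi-square quantile for p degrees of freedom,
                 taken from tables (e.g. 3.841 for p = 1, 9.488 for p = 4)
    n:           number of observations (nobs1 + nobs2 for the GOC)
    p:           number of free parameters
    """
    return of_opt * (1.0 + chi2_p_095 / (n - p))

# with more free parameters the quantile grows, so the threshold widens
c = chi2_threshold(100.0, 9.488, 500, 4)
```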
2.3.2 Bayesian method
According to Bayes' theorem, the probability p(θ|Yobs) of a parameter set θ is proportional to equation (13). After normalizing the probabilities (to ensure that the integral over the entire parameter space is equal to 1), a cumulative distribution can be made and hence a 95% confidence region can be defined. As the parameter sets were not sampled randomly but were sampled more densely near the optimum during the SCE-UA optimization, it is necessary to avoid having the densely sampled regions dominate the results. This problem is prevented by determining a weight for each parameter set θi by the following calculations:

1. Divide each of the p parameter ranges into m intervals.
2. For each interval k of parameter j, calculate the sampling density nsamp(k,j) by counting the number of times that interval was sampled for parameter j.

The weight for a parameter set θi is then estimated by determining the interval k of each parameter θj,i, considering the number of samples nsamp(k,j) within that interval, and calculating:

   W(θi) = 1 / [Π(j=1..p) nsamp(k,j)]^(1/p)    (16)
The threshold c is determined by the following process:

a. Sort the parameter sets and GOC values according to decreasing probabilities.
b. Multiply the probabilities by the weights.
c. Normalize the weighted probabilities by dividing by PT, with

   PT = Σ(i=1..T) W(θi) · p(θi | Yobs)    (17)

d. Sum the normalized weighted probabilities starting from rank 1 until the sum exceeds the cumulative probability limit (95% or 97.5%). The GOC corresponding to the last added probability then defines the threshold c.
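The weighting and threshold selection can be sketched as follows (an illustrative sketch; all names are ours, equal-width intervals are assumed, and sorting is done on the weighted probabilities directly):

```python
def bayesian_threshold(samples, probs, goc_values, bounds, m=20, limit=0.95):
    """Sketch of the Bayesian 'c' threshold (equations 16-17).

    samples:    parameter sets theta_i (lists of length p)
    probs:      unnormalized probabilities p(theta_i | Yobs)
    goc_values: GOC value of each simulation
    The dense SCE-UA sampling near the optimum is corrected for by
    weighting each set with the inverse geometric mean of the sampling
    densities of its parameter intervals.
    """
    p = len(bounds)

    def interval(v, j):
        low, high = bounds[j]
        k = int((v - low) / (high - low) * m)
        return min(k, m - 1)

    # sampling density nsamp(k, j): how often interval k of parameter j was hit
    nsamp = [[0] * m for _ in range(p)]
    for s in samples:
        for j in range(p):
            nsamp[j][interval(s[j], j)] += 1
    weights = []
    for s in samples:
        prod = 1.0
        for j in range(p):
            prod *= nsamp[j][interval(s[j], j)]
        weights.append(prod ** (-1.0 / p))          # equation 16
    wp = [w * q for w, q in zip(weights, probs)]    # weighted probabilities
    total = sum(wp)                                 # normalization, equation 17
    # accumulate in decreasing order until the limit is passed
    order = sorted(range(len(samples)), key=lambda i: wp[i], reverse=True)
    acc = 0.0
    for i in order:
        acc += wp[i] / total
        if acc > limit:
            return goc_values[i]                    # the 'c' threshold
    return max(goc_values)
```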
Figure I.2: Confidence region (CR) for the χ²-statistics and high probability density (HPD) region for the Bayesian statistics for a 2-parameter test model (parameters Smax and k; the plot shows the SCE sampling, the χ² CR and the Bayesian HPD).
3 SUNGLASSES (Sources of UNcertainty GLobal Assessment using Split SamplES): Model evaluation
3.1 Introduction
Model uncertainty analysis aims to quantitatively assess the reliability of model outputs. Many water quality modeling
applications used to support policy and land management decisions lack this information and thereby lose credibility [Beck, 1987].
Several sources of modeling unknowns and uncertainties mean that model predictions are not a single certain value but should be represented by a confidence range of values [Gupta et al., 1998; Vrugt et al., 2003; Kuczera, 1983a; Kuczera, 1983b; Beven, 1993]. These sources of uncertainty are often categorized as input uncertainties (such as errors in rainfall or pollutant source inputs), model structure/model hypothesis uncertainties (uncertainties caused by the inappropriateness of the model to reflect reality or the inability to identify the model parameters) and uncertainties in the observations used to calibrate/validate the model outputs (Figure I.3).
Over the last decade model uncertainty analysis has been investigated by several research groups from a variety of perspectives. These methods have typically focused on model parametric uncertainty, but investigators have had a more difficult time assessing model structural and data errors and properly accounting for these sources of model prediction error (e.g. see commentaries [Beven and Young, 2003; Gupta et al., 2003]). The focus on parametric uncertainty in model calibration and uncertainty methodologies does not address overall model predictive uncertainty, which encompasses uncertainty introduced by data errors (in input and output observations), model structural errors and uncertainties introduced by the likelihood measure or objective function used to develop a model and its particular application to a single location [Gupta et al., 2003; Thiemann et al., 2001; Kuczera and Mroczkowski, 1998]. It is important to note that a complete assessment of model prediction uncertainty is somewhat of an unattainable goal, and that questions about the informativeness of data and about model structural error are typically best assessed in a comparative mode, such as concluding that one model structure is superior in a specific situation, as opposed to a wholesale accounting of the size of the model structural error (e.g. [Gupta et al., 1998]). The inability to quantitatively account for model structural error and for errors introduced during the model calibration process has been a continuing source of problems and has generally prohibited the use of robust statistical methods for assessing uncertainty, since these methods typically assume that the structural form of the model is correct and that only model parameters need to be adjusted to properly match a computational model to the observations [Beven and Young, 2003; Gupta et al., 2003]. It is well known that hydrologic models, particularly those of the rainfall-runoff process and even more so models of water quality, are not perfect, and thus the assumption that the model being used in the calibration process is correct does not hold (for examples see [Mroczkowski et al., 1997; Boyle et al., 2001; Meixner et al., 2002; Beven, 1993]).
The traditional way in which hydrologists assess how good their model is and whether the calibration process they went
through was valuable and meaningful, is to conduct an evaluation of the model via some methodology. Model calibration and
evaluation in hydrology has a long history. A fundamental necessity noted by many is that the model must be evaluated using data
not used for model calibration [Klemes, 1986]. This concept typically goes under the name of the split sample methodology. Typically this split sample approach is conducted using one half of a data set to calibrate the model and the second half of the time series to evaluate the calibrated model. This approach represents the minimum bar over which a model must pass to be considered
suitable for further application [Mroczkowski et al., 1997]. More robust methodologies exist for assessing the suitability of a
calibrated model including calibration before a change in land use and evaluation of the model after that change [Mroczkowski et
al., 1997], the use of so-called “soft” data that represent the in-depth knowledge of field hydrologists [Seibert and McDonnell,
2003], or the use of observations at the same time or different times that were not used during the model calibration process
[Mroczkowski et al., 1997; Meixner et al., 2003]. Still the split sample in time methodology remains the dominant form of
assessing model and model calibration performance due to its simplicity and the general lack of robust multi-flux data sets of a
long duration.
The split sample methodology is not without its flaws. It is well known that a model typically performs worse during an evaluation time period than during the calibration period; if a model performs almost as well during the evaluation period, it is generally accepted that the model is at least an acceptable representation of the natural system it represents (e.g. [Meixner et al., 2000]).
Singh [1988] discusses the problem of model calibration at length and notes that it has several fundamental attributes. First, model calibration starts with the problem that the data to which the model is being calibrated have some error associated with them. Next, Singh [1988] notes that model calibration typically over-compensates for the data error, so that the standard error of the estimate ends up smaller than it should be. When the calibrated model is then taken to another time period for evaluation, the standard error of prediction is generally larger than the original standard error of the data, since the model was overly tuned to the observations for the calibration period. Singh notes that, while the standard error of the data and of the estimate can be quantified using standard methods, the standard error of the prediction, which we are most interested in, has no formalized methodology for estimation. This problem remains to this day. These properties of the standard errors of the data, estimate and prediction extend to most of the uncertainty methods used in hydrology, since these methods share many similarities with the model calibration problem.
The framework established by Singh [1988] proves useful as we think about the problem of estimating model predictive
uncertainty. Since most methods estimate the standard error of the estimate they are stuck at the reduced uncertainty level
indicated by Singh [1988]. Given the fundamental interest in knowing the uncertainty of model predictions as opposed to
estimates during the calibration period it should prove useful to investigate methods that can assess the uncertainty of predictions.
The discussion above would indicate that using the split sample approach and an assessment of model performance during the
evaluation period would be useful for estimating overall model predictive uncertainty.
Many researchers have noted the problem that parameter uncertainty is much smaller than expected given the level of trust we should have in model predictions [Thiemann et al., 2001; Beven and Freer, 2001; Freer et al., 2003]. Here we develop a methodology that utilizes a split sample approach to estimate overall model predictive uncertainty, and we compare these results to those obtained with our previously developed parametric uncertainty method based on statistical approaches, ParaSol (Parameter Solutions). SUNGLASSES and ParaSol are then compared using the commonly used river basin water quality model, the Soil and Water Assessment Tool (SWAT).
Figure I.3: Scheme of sources of errors in distributed water quality modeling. The diagram traces real-world values (forcing inputs, topography, land use, soils, point and diffuse pollution sources) through the model to the residuals, and lists the associated error sources: recording errors and spatial/temporal discretization of forcing inputs; spatial discretization of and parameter errors for land use, soil and topography; model-scale discretization and model hypotheses (model spatial structure, simplified processes, uncertain parameters); errors in modeled point and diffuse pollution sources and in land use practices; and errors in the observed values used to compute the residuals.
3.2 Description of the method
ParaSol is an optimization and statistical uncertainty method that assesses model parameter uncertainty. SUNGLASSES builds on ParaSol and uses all its parameter sets and simulations. Additional sources of uncertainty are detected using an evaluation period in addition to the calibration period.
In order to obtain a stronger evaluation of model prediction power, the Sources of UNcertainty GLobal Assessment using Split SamplES (SUNGLASSES) is designed to assess predictive uncertainty that is not captured by parameter uncertainty. The method accounts for strong increases in model prediction errors when simulations are performed outside the calibration period by using a split sample strategy, whereby the evaluation period is used to define the model output uncertainties. The assessment during the evaluation period should depend on a criterion related to the sort of decision the model is being used for.
These uncertainty ranges depend on the GOC, representing the objective functions, on the one hand, used to calibrate the model and develop an initial estimate of model parameter sets, and on an evaluation criterion (to be used in decision making), on the other hand, used to estimate the uncertainty bounds. The GOC is used to assess the degree of error in the process dynamics, while the evaluation criterion defines a threshold on the GOC. This threshold should be as small as possible, but the uncertainty ranges on the criterion should include the 'true' value for both the calibration and the validation period. For example, when model bias is used as the criterion, the 'true' value is a model bias equal to zero; the threshold for the GOC is then increased until the uncertainty ranges on the total mass flux include zero bias. SUNGLASSES operates by ranking the GOCs (Figure I.4). Statistical methods can be used to define a threshold considering parameter uncertainty; here, ParaSol was used to define such a threshold. However, when we look at the predictions, it is possible that unbiased simulations fall outside the ParaSol uncertainty range, which accounts for parameter uncertainty only. This result means that additional uncertainties are acting on the model outputs (Figure I.5). Thus, a new, higher threshold is needed in order to have unbiased simulations included within the uncertainty bounds (Figures I.4 and I.5). This methodology is flexible in the sense that different combinations of objective functions can be used within the GOC. Alternatives to the bias as the criterion for the model evaluation period are also possible, depending on the model outputs to be used for decision making. Examples of alternative criteria are the percentage of time a certain output variable is higher or lower than a certain threshold (common for water quality policy), or the maximum value or the value of a certain model prediction percentile (often important for flood control).
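The core SUNGLASSES idea, raising the GOC threshold until the accepted simulations bracket zero bias on the evaluation period, can be sketched as follows (illustrative only; names and the search strategy are ours):

```python
def sunglasses_threshold(goc_values, biases, start_threshold):
    """Sketch of the SUNGLASSES threshold search.

    goc_values:      GOC of every SCE-UA simulation (calibration period)
    biases:          model bias of the same simulations, computed on the
                     evaluation period (the decision-relevant criterion)
    start_threshold: initial (e.g. ParaSol) threshold on the GOC
    Returns the smallest GOC threshold whose accepted simulations
    include both positive and negative (i.e. zero) bias.
    """
    for c in sorted(g for g in goc_values if g >= start_threshold):
        selected = [b for g, b in zip(goc_values, biases) if g <= c]
        # accept once the uncertainty range on the bias includes zero
        if selected and min(selected) <= 0.0 <= max(selected):
            return c
    return max(goc_values)  # even the full ensemble never brackets zero bias
```

With a different decision criterion (e.g. an exceedance percentage or a flood percentile), only the `biases` input and the bracketing test would change.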
Figure I.4: Selection of good parameter sets using a threshold imposed by ParaSol or by SUNGLASSES (ranked GOCs, on a log scale, for all SCE-UA simulations; the SUNGLASSES threshold lies above the ParaSol threshold).
Figure I.5: Confidence regions for the sediment load calculations according to ParaSol and SUNGLASSES (model bias in %, for the calibration period 1998-1999 and the evaluation period 2000-2001).
PART II: Step-by-step tutorial
1. Open the Yellow River project.
2. To use the sensitivity analysis, you have to activate an AVSWAT extension in the SWAT view. To
do this, you have to go to the Tools menu and select AVSWATX extensions. A new dialog box will
open (Figure II.1).
Figure II.1 AVSWATX extensions dialog box.
3. Double click the extension AVSWATX Sens-Auto-Unc and press OK. In the Tools menu there are
two new options: 1) Sensitivity analysis and 2) Auto-calibration and Uncertainty.
Figure II.2. Select Sensitivity Analysis Simulation dialog box.
Figure II.3 Sensitivity Analysis Manager dialog box.
4. Select Sensitivity analysis. A new dialog box will open (Figure II.2). This dialog box allows you to select the scenario and the simulation you want to use in the sensitivity analysis.
5. Select default for the scenario. Now you are offered the simulations that are available for this
particular scenario. Select sim1. If there are more simulations available, you can click on each sim#
to see a summary of the main properties of each simulation in the right panel of this dialog box.
6. Press OK. A new dialog box will open (Figure II.3). You are offered three options for the output
variables to be used in the sensitivity analysis: 1) only flow, 2) flow and sediments and 3) flow,
sediments, and water quality. You can also select whether you also want to perform the sensitivity
analysis on the objective function (e.g. instead of using the mean average flow only).
7. Select Flow and activate the Use Observed Data button. The measured flow data are provided in
the file observations9394.txt. You might want to check whether the sim# you selected in step 5 also
covers the period 1993-1994.
8. Press the Start button to start the sensitivity analysis.
9. You will be asked to provide the reference outlet that is to be analyzed in the sensitivity analysis.
Select 7 by double clicking. Subbasin 7 contains the catchment outlet.
10. The interface now warns you that the sensitivity analysis may take several minutes. Press Yes to
continue. A DOS window will open and SWAT2003 will start running. Although the interface
warned that the analysis might take several minutes, this might easily turn into hours or days in the
case of a large SWAT project.
Currently it is not possible to control the variables included in the sensitivity analysis through the interface.
Instead, the interface will perform the analysis for a predefined set of 27 variables with 10 intervals in the
LH sampling. This means that SWAT2003 will make 280 runs to complete the sensitivity analysis. The
results of the analysis are provided in the directory …\advanced\sensitivity on the SWAT summer school
CD. Therefore, you do not have to wait for SWAT2003 to finish now. You can quit SWAT2003 in the DOS
window by pressing CTRL-C.
The model parameters included in the sensitivity analysis can be controlled outside of the AVSWATX
interface. In the following, the text-based input files required for the sensitivity analysis will be discussed in
detail.
These input files are located in the directory …\AVSWATX\YellowRiver\Scenarios\Default\sim1\txtinout\sensitivity.
Input file 1: Sensin.dat
The first three lines of sensin.dat specify three control variables of the LH-OAT sensitivity analysis:
1. Number of intervals m within Latin Hypercube sampling
2. Fraction of parameter range defined by minimum and maximum bounds that is varied in the OAT
part of the sensitivity analysis (see figure I.1)
3. Random seed required for random sampling within the m intervals.
Figure II.4 Sensin.dat
Input file 2: changepar.dat
The changepar.dat file specifies the model parameters included in the sensitivity analysis. These lines consist
of five columns:
1. Lower bound of parameter value
2. Upper bound of parameter value
3. Number specifying the model parameter (see table II.1)
4. Variation method (imet, see table II.2)
5. Number of HRU
If you specify the number of HRU as larger than 2000, the parameter is changed for all HRU. If the number
of HRU is lower than 2000, the parameter is changed for a selected number of HRU. In this case, the HRU
numbers must be provided in the next line in sets of 50*5i. Please note that the number of HRU is only
required for the subbasin type variables (sub) given in table II.1 (see figure II.4).
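A minimal reader for records of this shape might look as follows (a sketch based only on the five-column description above; the real file layout, e.g. the grouping of the HRU numbers, may differ):

```python
def read_changepar(lines):
    """Parse changepar.dat-style records (a sketch, not the SWAT parser).

    Each record line holds: lower bound, upper bound, parameter code,
    variation method (imet) and HRU number. When the HRU field selects
    specific HRUs (< 2000), the HRU numbers are read from the next line.
    """
    records, it = [], iter(lines)
    for line in it:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip lines that are not 5-column records
        low, high = float(parts[0]), float(parts[1])
        par, imet, nhru = int(parts[2]), int(parts[3]), int(parts[4])
        hrus = None
        if nhru < 2000:  # specific HRUs are listed on the following line
            hrus = [int(tok) for tok in next(it).split()]
        records.append({"low": low, "high": high, "par": par,
                        "imet": imet, "hrus": hrus})
    return records

# code 1 (ALPHA_BF) varied over all HRUs; code 2 (GW_DELAY) over HRUs 4 and 7
recs = read_changepar(["0.0 1.0 1 1 3000", "0.0 150.0 2 2 2", "4 7"])
```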
Figure II.5. changepar.dat
Table II.1. Parameter codes for sensitivity analysis, automatic model calibration and uncertainty analysis.

Par  Name      Type  Description                                                    Location
1    ALPHA_BF  Sub   Baseflow alpha factor [days]                                   *.gw
2    GW_DELAY  Sub   Groundwater delay [days]                                       *.gw
3    GW_REVAP  Sub   Groundwater "revap" coefficient                                *.gw
4    RCHRG_DP  Sub   Deep aquifer percolation fraction                              *.gw
5    REVAPMN   Sub   Threshold water depth in the shallow aquifer for "revap" [mm]  *.gw
6    GWQMN     Sub   Threshold water depth in the shallow aquifer for flow [mm]     *.gw
7    CANMX     Sub   Maximum canopy storage [mm]                                    *.hru
8    GWNO3     Sub   Concentration of nitrate in groundwater contribution [mg N/l]  *.gw
10   CN2       Sub   Initial SCS CN II value                                        *.mgt
15   SOL_K     Sub   Saturated hydraulic conductivity [mm/hr]                       *.sol
16   SOL_Z     Sub   Soil depth [mm]                                                *.sol
17   SOL_AWC   Sub   Available water capacity [mm H2O/mm soil]                      *.sol
18   SOL_LABP  Sub   Initial labile P concentration [mg/kg]                         *.chm
19   SOL_ORGN  Sub   Initial organic N concentration [mg/kg]                        *.chm
20   SOL_ORGP  Sub   Initial organic P concentration [mg/kg]                        *.chm
21   SOL_NO3   Sub   Initial NO3 concentration [mg/kg]                              *.chm
22   SOL_ALB   Sub   Moist soil albedo                                              *.sol
23   SLOPE     Sub   Average slope steepness [m/m]                                  *.hru
24   SLSUBBSN  Sub   Average slope length [m]                                       *.hru
25   BIOMIX    Sub   Biological mixing efficiency                                   *.mgt
26   USLE_P    Sub   USLE support practice factor                                   *.mgt
27   ESCO      Sub   Soil evaporation compensation factor                           *.hru
28   EPCO      Sub   Plant uptake compensation factor                               *.hru
30   SPCON     Bas   Lin. re-entrainment parameter for channel sediment routing     *.bsn
31   SPEXP     Bas   Exp. re-entrainment parameter for channel sediment routing     *.bsn
33   SURLAG    Bas   Surface runoff lag time [days]                                 *.bsn
34   SMFMX     Bas   Melt factor for snow on June 21 [mm H2O/°C-day]                *.bsn
35   SMFMN     Bas   Melt factor for snow on December 21 [mm H2O/°C-day]            *.bsn
36   SFTMP     Bas   Snowfall temperature [°C]                                      *.bsn
37   SMTMP     Bas   Snow melt base temperature [°C]                                *.bsn
38   TIMP      Bas   Snow pack temperature lag factor                               *.bsn
41   NPERCO    Bas   Nitrogen percolation coefficient                               *.bsn
42   PPERCO    Bas   Phosphorus percolation coefficient                             *.bsn
43   PHOSKD    Bas   Phosphorus soil partitioning coefficient                       *.bsn
50   CH_EROD   Sub   Channel erodibility factor                                     *.rte
51   CH_N      Sub   Manning's n value for main channel                             *.rte
52   TLAPS     Sub   Temperature lapse rate [°C/km]                                 *.sub
53   CH_COV    Sub   Channel cover factor                                           *.rte
54   CH_K2     Sub   Channel effective hydraulic conductivity [mm/hr]               *.rte
60   USLE_C    Sub   Minimum USLE cover factor                                      crop.dat
61   BLAI      Sub   Maximum potential leaf area index                              crop.dat
The sensitivity analysis provides three methods for varying the parameters. The first method (option 1) replaces the value directly. For example, the parameter ALPHA_BF is varied between 0 and 1 and the randomly drawn value is substituted directly into all *.gw files. The second method (option 2) adds the drawn value to the initial value. The third method (option 3) multiplies the initial value by the drawn value. For example, the settings specified for the parameter CN2 in figure II.5 allow this parameter to vary between 0.5 and 1.5 (-50% to +50%) times the value currently specified in each *.mgt file.
Table II.2. Variation methods (imet) available in changepar.dat

imet  Description
1     Replacement of initial parameter by value
2     Adding value to initial parameter
3     Multiplying initial parameter by value
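The effect of the three variation methods on a parameter value can be summarized in a few lines (illustrative sketch; x stands for the value drawn by the sampler):

```python
# Apply one of the three variation methods (imet) to an initial value.
def vary(initial, x, imet):
    if imet == 1:
        return x                 # 1: replace the initial value by x
    if imet == 2:
        return initial + x       # 2: add x to the initial value
    if imet == 3:
        return initial * x       # 3: multiply the initial value by x
    raise ValueError("imet must be 1, 2 or 3")

print(vary(80.0, 1.25, 3))  # CN2 of 80 increased by 25% -> 100.0
```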
Input file 3: responsmet.dat
Figure II.6. Responsmet.dat.
This file contains the output variables and methods that will be used for the sensitivity analysis (figure II.6).
Each line represents an output variable (with a maximum of 100) and has five columns that indicate:
1. Output variable number (see table II.3)
2. Parameter that allows you to either use the average of the output variable specified in column 1
(setting 1) or the percentage of time that the output variable is below the threshold defined in column
5 (setting 2)
3. When the output variable is a solute, you can either perform the sensitivity analysis on the
concentrations (setting 0) or the loads (setting 1)
4. Code number for the autocal file in fig.fig. This is required when the sensitivity analysis is
performed for more than one subbasin.
5. Threshold value corresponding to column 2, setting 2.
In the case of the example file shown in figure II.6, the sensitivity analysis is performed on the average flow, the average sediment load, the average organic N load, the average organic P load and the average nitrate load.
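A responsmet.dat line could be written as follows. This is a sketch assuming the i4, i4, i4, i4, f8.3 column layout described later in table III.8; check the widths against a working file:

```python
# Hypothetical sketch of one responsmet.dat line: output variable number,
# method (1 = average, 2 = percent of time below threshold), the
# concentration/load flag, the autocal file number and the threshold.
def responsmet_line(var, method, load, filenr, threshold=0.0):
    return f"{var:4d}{method:4d}{load:4d}{filenr:4d}{threshold:8.3f}"

# average flow (variable 1), as a load, at autocal file 1:
print(responsmet_line(1, 1, 1, 1))
```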
Table II.3. Output variable numbers

Nr.  Variable
1    Flow [m3/s]
2    Sediment concentration [g/l]
3    Organic N concentration [mg N/l]
4    Organic P concentration [mg P/l]
5    Nitrate concentration [mg N/l]
6    Ammonia concentration [mg N/l]
7    Nitrite concentration [mg N/l]
8    CBOD concentration [mg/l]
9    Dissolved oxygen concentration [mg/l]
10   Mineral P concentration [mg P/l]
11   Chlorophyll-a concentration [g/l]
12   Soluble pesticide concentration [mg/l]
13   Sorbed pesticide concentration [mg/l]
14   Temperature [°C]
20   Kjeldahl nitrogen concentration [mg N/l]
21   Total nitrogen concentration [mg N/l]
22   Total phosphorus concentration [mg P/l]
Input file 4: objmet.dat
This file contains the output variables and methods that will be used when the sensitivity analysis is applied
to an error measure instead of an output variable (figure II.7). Each line stands for an output variable (with a
maximum of 100) and has five columns that indicate:
1. Output variable number (see table II.3)
2. Parameter that allows you to use different error measures (1=SSQ, 5=SSQR). The error measures are
discussed in more detail in the automatic calibration sections.
3. When the output variable is a solute, you can either perform the sensitivity analysis on the
concentrations (setting 0) or the loads (setting 1)
4. Code number for the autocal file in fig.fig. This is required when the sensitivity analysis is
performed for more than one subbasin.
5. Weight for the objective function in the case of a multi-objective calibration.
Figure II.7. Objmet.dat
These three files allow you to customize the sensitivity analysis to your needs. If you make changes to one
of these three input files, you cannot run the sensitivity analysis from the SWAT interface anymore. Instead,
you have to copy swat2003.exe from the …\avswatpr directory into the directory of the sensitivity analysis
and run this executable from a DOS interface.
Table II.4. Output files of the sensitivity analysis.

File name        Description
sensresult.out   List of parameter ranks
sensout.out      Detailed output with mean, variance and partial sensitivities
senspar.out      Parameter values of each run
sensobjf.out     Value of the objective function for each run
sensrespons.out  Model output values for each run
lathyppar.out    Normalized Latin-Hypercube sampling points
oatpar.out       Normalized OAT sampling points
After completing the sensitivity analysis, SWAT2003 will produce a set of output files. A short description of these output files is provided in table II.4. The files that you will use most are sensresult.out and sensout.out. Since sensresult.out only contains the final ranking of each parameter in the analysis, we will only discuss sensout.out here (see figure II.8). The first lines of sensout.out summarize the settings of the sensitivity analysis (not shown here). Then, the results of the sensitivity analysis are shown for each objective function and for each model output variable selected in either responsmet.dat or objmet.dat.
Figure II.9 shows part of the results for one model output variable. In the first lines shown, each column represents a model parameter. Each of the 10 lines represents the results of the OAT sensitivity analysis for one LH sampling point. In the following lines, the maximum, variance, mean and ranking for each model parameter (column) are provided. Finally, a summary with the mean and ranking for each model parameter is given.
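The final ranking follows from the mean of the absolute partial sensitivities over the LH points. Purely as an illustration of that last step (the parameter names and values here are hypothetical, not taken from an actual run):

```python
# Rank parameters by the mean of their absolute partial sensitivities,
# largest mean first, as in the summary block of sensout.out.
mean_sens = {"CN2": 0.85, "SOL_AWC": 0.40, "SURLAG": 0.10, "CANMX": 0.02}
ranking = sorted(mean_sens, key=mean_sens.get, reverse=True)
for rank, name in enumerate(ranking, start=1):
    print(rank, name, mean_sens[name])
```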
EXERCISE
3.1
The results of the standard sensitivity analysis are provided in the directory …\advanced\sensitivity of the summer school CD. We have performed a sensitivity analysis for 1993 and for the period 1993-1994. Study the sensitivity ranking of each model parameter for both time periods. Is the ranking stable or does it depend on the time period used in the analysis?
Figure II.9. Detail of sensout.out.
EXERCISE
3.2
You can also compare the ranking of the model parameters between the sensitivity analysis performed on the
mean daily flow and the sensitivity analysis performed on the sum of squared residuals between measured and
modeled flow. There is a large difference in ranking for the surface runoff lag time (SURLAG). Do you have
an explanation for this large difference?
EXERCISE
3.3
To practice setting up your own sensitivity analysis, create the appropriate input files to perform a sensitivity
analysis on the average flow for the model parameters: CN2, SOL_K, SOL_AWC and CANMX. The base
simulation for the sensitivity analysis should run from 1.1.1993 to 31.12.1993. Make relative changes from –10
to 10% for CN2, relative changes from –25 to 25% for SOL_AWC, relative changes from –50 to 50% for
SOL_K and use actual values between 0 and 5 for CANMX. Which of these four parameters is the most
sensitive? How do you rate the variances of the mean partial sensitivities in sensout.out?
3.4
The results of the sensitivity analysis also depend on the bounds selected for each parameter. To test this, widen the bounds for CN2 to –25% to +25%. Did the ranking of the model parameters change?
PART III: Formats of the statistical methods in SWAT
1 General input file formats (valid for all methods)
1.1
File.cio
ICLB, the flag for auto-calibration in the *.cod file, has to be activated:

0    !ICLB: auto-calibration option: 0=normal run, >1 statistical runs

The possible ICLB values and their meaning are:

0    No autocalibration
1    Sensitivity analysis
2    Optimization using ParaSol
3    Optimization and uncertainty using ParaSol
4    Rerun model with best parameter set
5    Rerun the model with good parameter sets (to obtain uncertainty bounds on output time series)
6    (Re)calculate uncertainty results (e.g. when the optimization was aborted manually)
8    Run SUNGLASSES

Table III.1: Options for ICLB in File.cio
Some other adaptations are always required, while others depend on the method indicated in File.cio:
1: sensitivity analysis
2: auto-calibration (ParaSol)
4: rerun best parameter set
5: rerun good parameter sets
8: auto-calibration (SUNGLASSES)
Figure III.1: File.cio
1.2
FIG file
In the *.fig file, a line has to be added that indicates the node where the calibration has to take place and the variable that will be optimised.

Autocal command (16)
This command allows the user to print SWAT output for one variable to an output file "autocal.out" in order to generate a calibration time series. This output file can then be read by the autocalibration. The variables required on the autocal command line are:

Variable name  Definition
COMMAND        The command code (=16 for autocalibration)
HYD_STOR       Number of one of the previous hydrograph storage location numbers to be used for autocalibration
INUM1S         Number of the autocalibration file (up to 10)
INUM2S         Time resolution: 0 = daily registration, 1 = hourly registration (only when ievent = 1), 3 = monthly registration
AUTO_IN        Name of the file with the measurements

Table III.2: inputs in the autocal command in the *.fig file
The format of the autocal command line is:

Variable name  Line #  Position     Format
COMMAND        1       space 11-16  6-digit integer
HYD_STOR       1       space 17-22  6-digit integer
INUM1S         1       space 23-28  6-digit integer
INUM2S         1       space 29-34  6-digit integer
AUTO_IN        2       space 11-23  13 characters

Table III.3: Format for the autocal command in the *.fig file
Figure III.2: Fig-file (the annotations indicate the observation filename, the file number (cfr. objmet/responsmet) and the time resolution: 0 = daily, 1 = hourly, 3 = monthly)
1.3 Data file with observations
This file contains the measurements of the variable that has to be optimized by the calibration. It is a list with the following columns:

Hourly observations:
year (1X5i) day (2x3i) [hour/zero (1x2i)] measured values (1X10F.3)

Daily observations:
year (1X5i) day (2x,3i), 3x, measured values (1X11F.3)
Measured values are in columns with:
1. FLOW m^3/s
2. SED mg/L
3. ORGN mg/L
4. ORGP mg/L
5. NO3 mg/L
6. NH3 mg/L
7. NO2 mg/L
8. MINP mg/L
9. CBOD mg/L
10. DISOX mg/L
11. CHLA ug/L
12. SOLPST mg/L
13. SORPST mg/L
14. BACTP ct/L
15. BACTLP ct/L
16. CMETAL1 mg/L
17. CMETAL2 mg/L
18. CMETAL3 mg/L
19. TEMP deg C
20. Kjeldahl N mg/L
21. Total N mg/L
22. Total P mg/L
Monthly observations:
year (1X5i) month (3x2i), 3x, measured values (1X11F.3)
1. FLOW m^3/s
2. SED metric tons/day
3. ORGN kg N/day
4. ORGP kg P/day
5. NO3 kg N/day
6. NH3 kg N/day
7. NO2 kg N/day
8. MINP kg P/day
9. CBOD kg/day
10. DISOX kg/day
11. CHLA kg/day
12. SOLPST mg pesticide/day
13. SORPST mg pesticide/day
14. Kjeldahl N kg N/day
15. Total N kg N/day
16. Total P kg P/day
All missing data should have a negative value; these will then be skipped in the calculation of the objective function.
Figure III.3: observation file
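A daily observation file might be produced as sketched below. The exact field widths are an assumption based on the Fortran-style format strings above, and the file name is illustrative; missing data get a negative value so that they are skipped:

```python
# Hypothetical sketch: write daily observations (year, julian day, flow),
# coding missing data as -99.0 so the objective function skips them.
records = [(1993, 1, 4.321), (1993, 2, None), (1993, 3, 5.678)]
with open("autocal1.obs", "w") as f:  # file name is illustrative
    for year, day, flow in records:
        value = flow if flow is not None else -99.0  # negative = missing
        f.write(f" {year:5d}  {day:3d}   {value:11.3f}\n")
```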
1.4
changepar
The changepar file lists the parameters to be changed. These can be GENERAL parameters (1 parameter representing the entire basin, such as the parameters listed in the basins.bsn file), HRU parameters (listed in the *.hru file) or subbasin parameters (listed in the *.rte file). The formats are different. General parameters have 1 line each; HRU/subbasin parameters have one line when they are changed in a LUMPED way (all distributed parameters get the same change) or 2 lines when they are changed in a DISTRIBUTED way. In the latter case, a second line lists the HRU/subbasin numbers to be changed. There are also a few routing parameters; here, the numbers of the reaches are to be listed.
General parameters

name                                         line  format      space
lower border                                 1     xxxx.xxxxx  1-10
upper border                                 1     xxxx.xxxxx  11-20
parameter code (see table)                   1     xxxxx       21-25
method code (see table)                      1     xxxxx       26-30
Number of HRU numbers listed below (=0)      1     "    0"     31-35

Table III.4: inputs for the changepar file

HRU/subbasin parameters

name                                         line  format            space
lower border                                 1     xxxx.xxxxx        1-10
upper border                                 1     xxxx.xxxxx        11-20
parameter code (see table)                   1     xxxxx             21-25
method code (see table)                      1     xxxxx             26-30
Number of HRU/subbasin numbers listed below  1     xxxxx             31-35
HRU numbers (or crop number for changes in   2     50 times (xxxxx)
the crop.dat file), in lines of 50 values;
only if nHRU/subbasins <= 2000

If nHRU/subbasins > 2000, all HRUs of the model are modified and no list of HRUs should be given.

Table III.5: formats for the changepar file
imet  Description
1     Replacement of parameter by value
2     Adding value to initial parameter
3     Multiplying initial parameter by value

Table III.6: Options for imet

The options for the parameters are listed in table II.1.
Figure III.4: Changepar.dat (the annotations indicate the parameter bounds, the parameter code and name, and the parameter change method: 1 = replace by value, 2 = addition of value, 3 = multiplication by value)
1.5
objmet.dat
This file defines the variables and methods that will be used for the optimization. Each line stands for an objective function (OF), with a maximum of 20 OFs. Each OF is described by the control parameters: OFMET1 determines which output variable will be used for the OF, OFMET2 determines which method will be used, OFMET3 indicates whether loads should be used instead of concentrations, and OFMET4 gives the number of the file (site location).
Table III.7: inputs for objmet.dat

OFMET1  i4    The code number of the variable to be saved for calibration:
              1: water (m3/s)
              2: sediment
              etc. (like watout)
              +
              20: Kjeldahl nitrogen
              21: total nitrogen
              22: total phosphorus
OFMET2  i4    Choose 1 or 5 according to the previously described methods for the calculation of the OF (1=SSQ and 5=SSQR)
OFMET3  i4    This option can only be used for the pollutants (for daily or hourly data). 0 indicates that the concentrations are calibrated, 1 the loads. Monthly is always based on the loads.
OFMET4  i4    Code number for the autocal file in *.fig (1 for autocal1.out, etc.)
CALW    f8.3  Weight given to the objective function
Figure III.5: Objmet.dat (the annotations indicate the variable code (1 = flow etc.), the objective function (1: SSQ, 5: ranked SSQ, 8: bias) and the file number (cfr. fig.fig))
1.6
responsmet.dat
This file defines the variables and methods that will be used for the sensitivity analysis. Each line stands for an output value, with a maximum of 100. The output value is described by the control parameters: RESPONSMET1 determines which output variable will be used, and RESPONSMET2 what value has to be calculated for the output variable (average concentration, total mass, …).
Table III.8: inputs for responsmet.dat

RESPONSMET1  i4    The code number of the variable of interest:
                   1: water (m3/s)
                   2: sediment (g/l)
                   etc. (like watout)
                   +
                   20: Kjeldahl nitrogen
                   21: total nitrogen
                   22: total phosphorus
RESPONSMET2  i4    Choose 1 or 2 according to interest:
                   1: average
                   2: percent of time below the 'sensw' threshold
RESPONSMET3  i4    This option can only be used for the pollutants. 0 indicates that the concentrations are used, 1 the loads.
RESPONSMET4  i4    Code number for the autocal file in basins.fig (1 for autocal1.out, etc.)
responsw     f8.3  Threshold value for the case responsmet2=2
Figure III.6: Responsmet.dat (the annotations indicate the variable code (1 = flow etc.), the response function (1: mean value, 2: percentage < threshold), the file number (cfr. fig.fig) and the threshold value)
III.2 RUNNING THE “MC-OAT” SENSITIVITY ANALYSIS
III.2.1 INPUT FILES
The sensitivity analysis needs the input files listed in table III.8.

Table III.8: Input files for MC-OAT

File.cio                   Iclb=1                                        Adapt file (see above)
Basins.fig                 Indication of the location of the output      Adapt file (see above)
                           within the model structure
Objmet.dat                 Definition of error functions                 Create file (see above)
Responsmet.dat (optional)  Definition of output criteria                 Create file (see above)
Sensin.dat                 Control parameters                            Create file (see above)
Changepar.dat              Indication of the parameters to be changed    Create file (see above)
III.2.1.1 SENSIN.dat (INPUT)
Sensin.dat lists the control parameters. Each control parameter uses one line with free format.

Table III.9: Inputs for sensin.dat

parameter  description                                 default
NINTVAL    Number of intervals in the Latin Hypercube  20
ISEED      Random seed number                          2003
OATVAR     Parameter change for OAT (fraction)         0.05
III.2.2 outputs
Table III.10 lists the output files.

Table III.10: Output files for LH-OAT

File name        Description
Sensobjf.out     Objective function values for each run
Sensrespons.out  Model output values for each run
lathyppar.out    Latin hypercube sampling points (normalized values)
OATpar.out       OAT sampling points (normalized values)
Senspar.out      Parameter values
sensout.out      Detailed output with mean, variance and partial sensitivities for each Latin hypercube cluster
III.2.2.1 Sensresult.out
This file lists the parameter ranks.
Figure III.7: Sensresult.out
III.3 RUNNING “PARASOL”
III.3.1 INPUT
ParaSol performs a combined optimisation and uncertainty analysis. It requires the input files listed in table III.11.

Table III.11: inputs needed for ParaSol

File.cio                   iclb=2                                        Adapt file (see above)
Basins.fig                 Indication of the location of the output      Adapt file (see above)
                           within the model structure
Objmet.dat                 Definition of error functions                 Create file (see above)
Responsmet.dat (optional)  Definition of output criteria                 Create file (see above)
parasolin.dat              Control parameters + indication of the        Create file (see above)
                           parameters to be changed
III.3.1.1 PARASOLIN.dat (INPUT)
Parasolin.dat lists the control parameters. Each control parameter uses one line with free format (Table III.12).

Table III.12: Format for parasolin.dat

MAXN     20000  Maximum number of trials allowed before the optimization is terminated
KSTOP    5      Number of shuffling loops in which the criterion value must change by PCENTO
PCENTO   0.01   Percentage by which the criterion value must change in KSTOP shuffling loops
NGS      10     Number of complexes in the initial population
ISEED    1677   Initial random seed
(empty line)
(empty line)
NSPL     5      Number of evolution steps allowed for each complex before complex shuffling
ISTAT    1      Statistical method (1=Chi-squared; 2=Bayesian)
IPROB    3      Probability level: iprob=1: 90%; iprob=2: 95%; iprob=3: 97.5%
IGOC     0      Number of objective functions to be included in the global optimization criterion (default=0, meaning all objective functions listed in objmet.dat)
NINTVAL  10     Number of intervals in the hypercube (for the Bayesian method only)
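As a sketch, a parasolin.dat with the default values of table III.12 could be written as follows (one free-format value per line, including the two empty lines):

```python
# Hypothetical sketch: write parasolin.dat with the default control values.
lines = [
    "20000",  # MAXN
    "5",      # KSTOP
    "0.01",   # PCENTO
    "10",     # NGS
    "1677",   # ISEED
    "",       # empty line
    "",       # empty line
    "5",      # NSPL
    "1",      # ISTAT
    "3",      # IPROB
    "0",      # IGOC
    "10",     # NINTVAL
]
with open("parasolin.dat", "w") as f:
    f.write("\n".join(lines) + "\n")
```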
III.3.2 outputs
Table III.10 lists the output files.

Table III.10: Output files for ParaSol

File name       Description
Sceobjf.out     Objective function values for each optimization run
scerespons.out  Model output values for all simulation runs
scepar.out      Parameter values of all simulation runs
sceparobj.out   Parameter values of all simulation runs and the global optimization criterion
Uncobjf.good    Objective function values for the good parameter sets in "goodpar.out"
Senspar.out     Parameter values
parasolout.out  Detailed output for each optimization loop and the uncertainty outputs
autocalxx.out   The simulated values that will be used for the calibration of point xx
Goodpar.out     See below
Bestpar.out     See below
III.3.2.1 ParaSolout.out
The main output file is ParaSolout.out. The first part consists of a report on the input files. The second part reports on every loop of the SCE algorithm, and the third part reports the results of the parameter uncertainty analysis.
Figure III.8: ParaSolout.out (the annotations indicate the minimum value of the GOC as printed in the last column of the file sceparobj.out; the minimum and maximum for all simulations done by SCE, as printed in the files sceobjf.out and scerespons.out; the threshold as in equation 15; and the minimum and maximum of the parameters in goodpar.out)
III.3.2.2 bestpar.out
This file lists the best parameter values.
III.3.2.3 goodpar.out
This file lists the good parameter values.
III.4 Running in batch (Running the good parameter sets)
This option makes it possible to run a batch of parameter sets. It is especially useful to rerun “goodpar.out” for another period or other scenarios, or to analyse certain objective functions or model outputs. During the runs, the minima and maxima of the indicated output variables are stored. These can then be used to plot confidence intervals for these output variables.
III.4.1 INPUT FILES
Table III.13 lists the input files.

Table III.13: input files for running in batch

File.cio                   Iclb=5                                        Adapt file (see above)
Basins.fig                 Indication of the location of the output      Adapt file (see above)
                           within the model structure
changepar.dat              Indication of the parameters to be changed    Create file (see above)
Objmet.dat                 Definition of the Objective Functions         Create file (see above)
Responsmet.dat (optional)  Definition of Response Functions              Create file (see above)
batchin.dat                Control parameters                            Create file (see above)
Goodpar.out                File with parameter values                    Create file (see above)
Objmet.dat or responsmet.dat could need to be adapted (as well as simulation period in File.cio).
III.4.1.2 BATCHIN.dat
This file has only the changepar section. The format is described above in the general input section.
III.4.1.3 Goodpar.out
This file is an output file of “parasol”, but it can also be made manually.
Format: 5 blanks, (e12.5) * number of parameters
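Following that format line, a row of a hand-made goodpar.out could be built as sketched below (the parameter values are illustrative; check the exponent formatting of your Fortran runtime against Python's e-format):

```python
# Hypothetical sketch of one goodpar.out row: five blanks followed by one
# 12-character e12.5 field per parameter value.
def goodpar_row(values):
    return " " * 5 + "".join(f"{v:12.5e}" for v in values)

print(goodpar_row([0.048, 31.0, 0.95]))
```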
III.4.2 OUTPUT FILES: MINVAL.out and MAXVAL.out
These files list the minima and maxima of the output values, following the order listed in objmet.dat.
III.5 Running SUNGLASSES
SUNGLASSES requires split observation files. This means that:
1. Out of 1 observation file, 2 observation files have to be created.
2. These observation files have to be indicated in the *.fig file.
3. The corresponding objective functions need to be added to the Objmet.dat file.
4. IGOC has to be indicated in sunglasses.in. IGOC should be equal to the number of objective functions for the 1st period. In most cases, this will be equal to the number of objective functions indicated in objmet.dat, divided by 2.
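Step 1 can be scripted. The sketch below splits parsed daily observation rows on a hypothetical split year (the row layout, the year 1994 and the output file names are assumptions for illustration):

```python
# Hypothetical sketch: split one observation series into two period files
# for SUNGLASSES; each row starts with the year field.
rows = [
    "  1993    1       4.321\n",
    "  1993    2       4.550\n",
    "  1994    1       5.010\n",
]
split_year = 1994
first  = [r for r in rows if int(r.split()[0]) <  split_year]
second = [r for r in rows if int(r.split()[0]) >= split_year]
with open("obs_period1.dat", "w") as f:
    f.writelines(first)
with open("obs_period2.dat", "w") as f:
    f.writelines(second)
```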
III.5.1 INPUTS

Table III.14: Inputs for SUNGLASSES

File.cio                   iclb=8                                        Adapt file (see above)
*.fig                      Indication of the location of the output      Adapt file (see above)
                           within the model structure
changepar.dat              Indication of the parameters to be changed    Create file (see above)
Objmet.dat                 Definition of Objective Functions             Create file (see above)
Responsmet.dat (optional)  Definition of Response Functions              Create file (see above)
sunglasses.in              Control parameters                            Create file (see above)

III.5.1.1 *.fig file
Figure III.9 shows how the fig file can be adapted for split observation files.
III.5.1.2 SUNGLASSES.in (INPUT)
The control parameters for SUNGLASSES are the same as in parasolin.dat:

MAXN     20000  Maximum number of trials allowed before the optimization is terminated
KSTOP    5      Number of shuffling loops in which the criterion value must change by PCENTO
PCENTO   0.01   Percentage by which the criterion value must change in KSTOP shuffling loops
NGS      10     Number of complexes in the initial population
ISEED    1677   Initial random seed
(empty line)
(empty line)
NSPL     5      Number of evolution steps allowed for each complex before complex shuffling
ISTAT    1      Statistical method (1=Chi-squared; 2=Bayesian)
IPROB    3      Probability level: iprob=1: 90%; iprob=2: 95%; iprob=3: 97.5%
IGOC     0      Number of objective functions to be included in the global optimization criterion (default=0, meaning all objective functions listed in objmet.dat)
NINTVAL  10     Number of intervals in the hypercube (for the Bayesian method only)
III.5.2 outputs
III.5.2.1 bestpar.out
This file lists the best parameter values.
III.5.2.2 goodpar.out
This file lists the good parameter values.
III.5.2.3 Other outputs
These files are mainly printed for internal calculations, but they may be analysed in case of problems or errors in the method.
File name       Description
Sceobjf.out     Objective function values for each optimization run
scerespons.out  Model output values for all simulation runs
scepar.out      Parameter values of all simulation runs
sceparobj.out   Parameter values of all simulation runs and the global optimization criterion
Uncobjf.good    Objective function values for the good parameter sets in "goodpar.out"
parasolout.out  Detailed output for each optimization loop and the uncertainty outputs
autocalxx.out   The simulated values that will be used for the calibration of point xx
2 References
Box, G.E.P., and G.C.Tiao, 1973. Bayesian Inference in Statistical Analysis, Addison-Wesley-Longman, Reading, Mass.
Duan, Q. (1991). A global optimization strategy for efficient and effective calibration of hydrological models, Ph.D. thesis, Dep. of Hydrol. and Water Resour., Univ. of Ariz., Tucson.
Duan, Q., V. K. Gupta, and S. Sorooshian (1992). Effective and efficient global minimization for conceptual rainfall-runoff models. Water Resources Research, 28, 1015-1031.
Duan, Q.D., 2003. Global Optimization for Watershed Model Calibration, in Calibration of Watershed Models , ed. Q. Duan, S.
Sorooshian, H. V. Gupta, A. N. Rousseau, and R. Turcotte, AGU Washington DC, DOI: 10.1029/006WS07.
Eckhardt K and J.G. Arnold. 2001. Automatic calibration of a distributed catchment model. Journal of Hydrology, 251: 103-109.
Francos A., Elorza, F.J., Bouraoui, F., Galbiati, L. and Bidoglio, G., 2003. Sensitivity analysis of distributed environmental simulation models: understanding the model behaviour in hydrological studies at the catchment scale. Reliability Engineering and System Safety, 79, 205-218.
Gan, T.Y., and G.F. Biftu (1996). Automated calibration of conceptual rainfall-runoff models: Optimization algorithms, catchment conditions, and model structure, Water Resources Research, 32, 3513-3524.
Holland, J.H. (1975). Adaptation in Natural and Artificial Systems, Univ. of Mich. Press, Ann Arbor, 1975.
McKay M.D., 1988. Sensitivity and Uncertainty Analysis Using a Statistical Sample of Input Values. In: Uncertainty Analysis, Y. Ronen, ed., CRC Press, Inc., Boca Raton, Florida, 145-186.
McKay M.D., Beckman, R.J. and Conover, W.J., 1979. A Comparison of Three Methods for Selecting Values of Input Variables
in the Analysis of Output from a Computer Code. Technometrics, 21(2), 239-245.
Morris, M.D., 1991. Factorial sampling plans for preliminary computational experiments. Technometrics, 33(2).
Nelder, J.A., and R. Mead (1965). A simplex method for function minimization. Comput. J., 7(4), 308-313.
Sorooshian, S., Q. Duan and V.K. Gupta (1993). Calibration of rainfall-runoff models: Application of global optimization to the
sacramento soil moisture accounting model, Water Resources Research, 29, 1185-1194.
Vandenberghe V., van Griensven A. and Bauwens W. (2001). Sensitivity analysis and calibration of the parameters of ESWAT:
Application to the river Dender. Water Science and Technol., 43(7), 295-301.
van Griensven A., Francos A. and Bauwens W. (2002). Sensitivity analysis and auto-calibration of an integral dynamic model for
river water quality, Water Science and Technology, 45(5), 321-328.
Weijers S.R. and Vanrolleghem P.A., 1997. A procedure for selecting parameters in calibrating the activated sludge model no.1
with full-scale plant data. Wat. Sci. Tech., 36(5), 69-79.
Arnold J. G., R. Srinivasan, R. S. Muttiah, and J. R. Williams. Large area hydrologic modeling and assessment part I: model
development. Journal of the American Water Resources Association, 34(1), 73-89, 1998.
Beck, M.B. Water quality modeling: a review of the analysis of uncertainty. Water Res. Research, 23(6), 1393-1441, 1987.
Beven, K., Prophecy, reality and uncertainty in distributed hydrological modeling, Advances in Water Resources, 16:41-51, 1993.
Beven, K. and A. Binley, The Future of Distributed Models: Model Calibration and Uncertainty Prediction, Hydrol. Processes,
6:279-298, 1992.
Beven, K. and J. Freer, Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex
environmental systems using the GLUE methodology, J. Hydrol., 249:11-29, 2001.
Beven, K. and P. Young, Comment on "Bayesian recursive parameter estimation for hydrologic models" by M. Thiemann, M. Trosset, H. Gupta, and S. Sorooshian, Water. Resourc. Res., 39(5):COM 1-1-COM 1-4, 2003.
Boyle, D. P., H. V. Gupta, S. Sorooshian, V. Koren, Z. Zhang, and M. Smith, Toward improved streamflow forecasts: Value of
semidistributed modeling, Water. Resourc. Res., 37:2749-2759, 2001.
Duan, Q., V. K. Gupta, and S. Sorooshian, Effective and efficient global optimization for conceptual rainfall-runoff models,
Water. Resourc. Res., 28:1015-1031, 1992.
Duan, Q., S. Sorooshian, H. V. Gupta, A. N. Rousseau, and R. Turcotte, Advances in Calibration of Watershed Models, AGU, Washington, DC, 2003.
Freer, J., K. Beven, and N. E. Peters, Multivariate seasonal period model rejection within the Generalised Likelihood Uncertainty Estimation procedure, in Calibration of Watershed Models, edited by Q. Duan, H. V. Gupta, S. Sorooshian, A. N. Rousseau, and R. Turcotte, pp. 69-87, AGU, Washington, DC, 2003.
Grayson, R. B. and G. Bloschl, Spatial Modelling of Catchment Dynamics, in Spatial Patterns in Catchment Hydrology:
Observations and Modelling, edited by R. B. Grayson and G. Bloschl, pp. 51-81, Cambridge University Press, Cambridge, 2001.
Gupta, H., M. Thiemann, M. Trosset, and S. Sorooshian, Reply to comment by K. Beven and P. Young on "Bayesian recursive parameter estimation for hydrologic models", Water. Resourc. Res., 39(5):COM 2-1-COM 2-5, 2003.
Gupta, H. V., S. Sorooshian, and P. O. Yapo, Toward improved calibration of hydrologic models: multiple and
noncommensurable measures of information, Water. Resourc. Res., 34:751-763, 1998.
Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T., Bayesian model averaging: A tutorial, Statist. Sci., 14(4), 382-417,
1999.
Klemes, V., Dilettantism in Hydrology: Transition or Destiny?, Water. Resourc. Res., 22:177S-188S, 1986.
Knisel, W.G. CREAMS, a field scale model for chemicals, runoff and erosion from agricultural management systems. USDA
Conservation Research Rept. No. 26., 1980.
Krysztofowicz, R. Bayesian system for probabilistic river stage forecasting. J. Hydrol., 268, 16-40, 2002.
Kuczera, G., Improved Parameter Inference in Catchment Models: 1. Evaluating Parameter Uncertainty, Water. Resourc. Res.,
19:1151-1162, 1983a.
Kuczera, G., Improved Parameter Inference in Catchment Models: 2. Combining Different Kinds of Hydrologic Data and Testing
Their Compatibility, Water. Resourc. Res., 19:1163-1172, 1983b.
Kuczera, G. and M. Mroczkowski, Assessment of hydrologic parameter uncertainty and the worth of multiresponse data, Water.
Resourc. Res., 34:1481-1489, 1998.
Leonard, R.A. and R.D. Wauchope, 1980. Chapter 5: The pesticide submodel. p. 99-112. In: Knisel, W.G. (ed.). CREAMS: A field-scale model for chemicals, runoff, and erosion from agricultural management systems. U.S. Department of Agriculture, Conservation Research Report No. 26.
Meixner, T., R. C. Bales, M. W. Williams, D. H. Campbell, and J. S. Baron, Stream chemistry modeling of two watersheds in the
Front Range, Colorado, Water. Resourc. Res., 36:77-87, 2000.
Meixner, T., L. A. Bastidas, H. V. Gupta, and R. C. Bales, Multi-criteria parameter estimation for models of stream chemical
composition, Water. Resourc. Res., 38(3):9-1-9-9, 2002.
Meixner, T., A. D. Brown, and R. C. Bales, Modeling of biogeochemistry of two alpine watersheds in the Sierra Nevada, California, EOS, Trans. AGU, 78:S173-S174, 1997.
Meixner, T., H. V. Gupta, L. A. Bastidas, and R. C. Bales, Estimating Parameters and Structure of a Hydrochemical Model Using
Multiple Criteria, in Calibration of Watershed Models, edited by Q. Duan, S. Sorooshian, H. V. Gupta, A. N. Rousseau, and R.
Turcotte, pp. 255-295, AGU, Washington, DC, 2003.
Montanari, A. and A. Brath, A stochastic approach for assessing the uncertainty of rainfall-runoff simulations, Water Resour. Res., 40:W01106, doi:10.1029/2003WR002540, 2004.
Mroczkowski, M., G. P. Raper, and G. Kuczera, The quest for more powerful validation of conceptual catchment models, Water Resour. Res., 33:2325-2335, 1997.
Osidele, O.O. and M.B. Beck, Identification of model structure for aquatic ecosystems using regionalized sensitivity analysis, Water Science and Technology, 43(7):271-278, 2001.
Seibert, J. and J. J. McDonnell, The quest for an improved dialog between modeler and experimentalist, in Calibration of
Watershed Models, edited by Q. Duan, H. V. Gupta, S. Sorooshian, A. N. Rousseau, and R. Turcotte, pp. 301-315, AGU,
Washington, DC, 2003.
Singh, V. P., Hydrologic Systems: Rainfall-Runoff Modelling, Prentice Hall, Englewood Cliffs, NJ, 1988.
Sorooshian, S., Parameter estimation of rainfall-runoff models with heteroscedastic streamflow errors: The noninformative data case, J. Hydrol., 52:127-138, 1980.
Sorooshian, S. and J. A. Dracup, Stochastic parameter estimation procedures for hydrologic rainfall-runoff models: Correlated and heteroscedastic error cases, Water Resour. Res., 16:430-442, 1980.
Sorooshian, S. and V. K. Gupta, Evaluation of maximum likelihood parameter estimation techniques for conceptual rainfall-runoff models: Influence of calibration data variability and length on model credibility, Water Resour. Res., 19(1):251-259, 1983.
Thiemann, M., M. Trosset, H. V. Gupta, and S. Sorooshian, Bayesian recursive parameter estimation for hydrological models, Water Resour. Res., 37:2521-2535, 2001.
USDA Soil Conservation Service, National Engineering Handbook, Section 4: Hydrology, Chapter 19, 1983.
van Griensven, A. and W. Bauwens, Integral modelling of catchments, Water Science and Technology, 43(7):321-328, 2001.
van Griensven A. and T. Meixner, 2003.
Vrugt, J. A., H. V. Gupta, W. Bouten, and S. Sorooshian, A Shuffled Complex Evolution Metropolis algorithm for estimating posterior distribution of watershed model parameters, in Calibration of Watershed Models, edited by Q. Duan, S. Sorooshian, H. V. Gupta, A. N. Rousseau, and R. Turcotte, AGU, Washington, DC, DOI: 10.1029/006WS07, 2003.
Willems, P. and J. Berlamont, Accounting for the spatial rainfall variability in urban modelling applications, Water Science and Technology, 45(2):105-112, 2002.
Williams, J.R., Flood routing with variable travel time or variable storage coefficients, Trans. ASAE, 12(1):100-103, 1969.
Williams, J.R., C.A. Jones, and P.T. Dyke, A modeling approach to determining the relationship between erosion and soil productivity, Trans. ASAE, 27(1):129-144, 1984.
Yapo, P. O., H. V. Gupta, and S. Sorooshian, Automatic calibration of conceptual rainfall-runoff models: sensitivity to
calibration data, J. Hydrol., 181:23-48, 1996.