NATIONAL HYDROLOGICAL MODEL TESTING, PART I: NEED AND TEST
PROCEDURES
Booker DJ¹, McMillan HK¹
¹ NIWA
Aims
Hydrological models that are able to simulate nationwide daily flow time-series using spatially
consistent methodologies and input data are a valuable tool to support development of national
policies for water management and national environmental reporting. Such models provide
nationwide time-series of flood risk, drought potential, water availability for irrigation or power
generation, and the effects of changes in climate on hydrology. However, when being used for
these purposes, it is important that both spatial and temporal aspects of uncertainty in simulated
time-series are quantified.
Previous testing of national hydrological models has concentrated on quantifying the ability to
model between-site patterns in time-averaged indices (e.g., mean flow, MALF). The aim of this
work is to present a comprehensive and consistent suite of test procedures that could be applied
to any nationwide hydrological model which calculates daily flow time-series. The first objective of
these procedures was to quantify performance when simulating patterns in: a) between-site
differences in time-averaged indices; b) between-site differences in daily time-series; and c)
between-year differences in annual time-series. The second objective was to identify
performance across hydrological signatures represented by various parts of the hydrograph. The
third objective was to identify spatial and temporal patterns in performance. The fourth objective
was to provide a consistent basis for comparison of performance between different models.
Methods
We collated daily flow time-series observed at 486 sites draining reasonably natural catchments
(free of major dams or diversions). These sites were distributed throughout New Zealand and
represented a range of catchment sizes and hydrological conditions. Although each time-series
was at least 5 years in length, not all time-series covered the same time period. Years with more
than 30 days of missing data were removed from the dataset. The location of each site on the
River Environment Classification national river network (Snelder and Biggs, 2002) was identified.
This allowed extraction of data describing site and catchment characteristics such as topographic
setting, geology and climate.
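The screening rules described above can be sketched as follows. This is a minimal illustration rather than the code used in this work: the function name `screen_site`, the dictionary layout, the use of calendar years (the start of the hydrological year is not stated here), and applying the 5-year minimum after screening are all assumptions.

```python
from collections import defaultdict
from datetime import date

MAX_MISSING_DAYS = 30  # years with more missing days than this are dropped
MIN_YEARS = 5          # minimum record length stated in the text


def days_in_year(year):
    """Number of calendar days in `year` (365 or 366)."""
    return (date(year + 1, 1, 1) - date(year, 1, 1)).days


def screen_site(daily_flows):
    """Return {year: [flows]} for the years that pass the missing-data rule.

    `daily_flows` maps datetime.date -> flow, with None marking a missing
    value. Days absent from the mapping also count as missing.
    An empty dict is returned if fewer than MIN_YEARS years survive
    (assumption: the 5-year minimum is checked after screening).
    """
    by_year = defaultdict(list)
    for day, flow in sorted(daily_flows.items()):
        if flow is not None:
            by_year[day.year].append(flow)

    kept = {
        year: flows
        for year, flows in by_year.items()
        if days_in_year(year) - len(flows) <= MAX_MISSING_DAYS
    }
    return kept if len(kept) >= MIN_YEARS else {}
```

Grouping by hydrological year instead would only require changing the key used in `by_year`.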
We then developed a suite of test procedures which, when applied together and over many sites,
quantify model performance through time and space. The test procedures allow hypotheses
which relate model performance to either model parameterisation, model structure, model
boundary conditions or geographical setting to be tested. They also provide a consistent
framework when comparing different modelled time-series. This is often a requirement when
changes in model structure, parameterisation procedures or input data have been made.
Alternatively, raw modelled values may be compared with post-processed values
obtained by applying statistical correction procedures.
Results
A suite of test procedures was designed in which sets of observed and modelled values were
compared (Table 1). These included: 1) mean daily values from the entire observed period for
comparisons between days within sites; 2) mean daily values from each year of the observed
period for comparisons between days within years within sites; 3) various indices calculated over
the entire observed period (e.g., mean annual low flow) for comparison between sites; and 4)
various indices calculated for each year of the observed period (e.g., annual low flows) for
comparisons between years within sites.
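To make the distinction between comparison sets 3) and 4) concrete, the sketch below builds both from one list of annual index values. The record layout, function names and flow values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def between_site_pairs(records):
    """One (observed, modelled) pair per site: the index averaged over years."""
    obs, mod = defaultdict(list), defaultdict(list)
    for site, _year, o, m in records:
        obs[site].append(o)
        mod[site].append(m)
    return {site: (mean(obs[site]), mean(mod[site])) for site in obs}


def between_year_pairs(records):
    """For each site, a list of (observed, modelled) pairs, one per year."""
    pairs = defaultdict(list)
    for site, _year, o, m in sorted(records):
        pairs[site].append((o, m))
    return dict(pairs)


# Hypothetical annual low flows: (site, year, observed, modelled)
records = [
    ("A", 2001, 2.0, 2.5), ("A", 2002, 1.0, 1.5),
    ("B", 2001, 10.0, 9.0), ("B", 2002, 14.0, 11.0),
]

site_set = between_site_pairs(records)   # compared across sites
year_sets = between_year_pairs(records)  # compared across years, site by site
```

Averaging annual low flows over years gives the mean annual low flow, so `site_set` corresponds to a between-site comparison of that index, while each entry of `year_sets` supports a within-site, between-year comparison.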
For each set of observed and modelled values, we applied three performance metrics, each
designed to quantify a different aspect of model performance: Nash-Sutcliffe efficiency (NSE2);
percent bias (pbias); and coefficient of determination (r²). See Moriasi et al. (2007) and
references therein for full details of these performance evaluation metrics. NSE is a
dimensionless metric that determines the relative magnitude of the residual variance (“noise”)
compared to the observed data variance (“information”). NSE is commonly used to evaluate
hydrological model performance; we used a scaled version, NSE2, that takes values between -1
(worst) and +1 (perfect model), with 0 representing a model that is no better than a constant
prediction at the mean flow value. Percent bias evaluates the average tendency of the simulated
data to be larger (negative pbias) or smaller (positive pbias) than their observed counterparts. For
mean flow values, percent bias allows us to test whether the water balance of the catchment is
correctly represented. Coefficient of determination evaluates the correlation between observed
and modelled values. This metric allows us to test whether the model correctly distinguishes
locations or years with low index values from those with high index values, even if there is
systematic bias in the values. These
metrics were applied to each set of observed and modelled values in their untransformed raw
units, and after having applied either a log or square root transformation in order to better
approximate normal distributions.
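The three metrics can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used in this work. In particular, the scaling formula for NSE2 is not given here; the form NSE / (2 - NSE) used below is an assumption, chosen because it maps NSE onto the stated range with 0 and 1 unchanged. The sign of pbias follows the description above (positive when modelled values are too small).

```python
import math


def nse(obs, mod):
    """Nash-Sutcliffe efficiency: 1 - residual variance / observed variance."""
    obs_mean = sum(obs) / len(obs)
    residual = sum((o - m) ** 2 for o, m in zip(obs, mod))
    spread = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - residual / spread


def nse2(obs, mod):
    """Bounded NSE. ASSUMPTION: NSE / (2 - NSE); formula not given in the text."""
    e = nse(obs, mod)
    return e / (2.0 - e)


def pbias(obs, mod):
    """Percent bias; positive when modelled values are smaller than observed."""
    return 100.0 * sum(o - m for o, m in zip(obs, mod)) / sum(obs)


def r_squared(obs, mod):
    """Squared Pearson correlation between observed and modelled values."""
    n = len(obs)
    mean_o, mean_m = sum(obs) / n, sum(mod) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_m = sum((m - mean_m) ** 2 for m in mod)
    return cov ** 2 / (var_o * var_m)


def evaluate(obs, mod, transform=None):
    """Apply all three metrics, optionally after transforming both series."""
    if transform is not None:
        obs = [transform(o) for o in obs]
        mod = [transform(m) for m in mod]
    return {"NSE2": nse2(obs, mod), "pbias": pbias(obs, mod),
            "r2": r_squared(obs, mod)}


# Hypothetical example: a model that doubles every observed value.
observed = [1.0, 2.0, 3.0, 4.0]
modelled = [2.0, 4.0, 6.0, 8.0]
raw_scores = evaluate(observed, modelled)
log_scores = evaluate(observed, modelled, transform=math.log)    # flows must be > 0
sqrt_scores = evaluate(observed, modelled, transform=math.sqrt)
```

In this example the raw r² is 1 while pbias is -100 and NSE2 is negative, which shows why the metrics are reported together: perfect correlation does not imply an unbiased model.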
Table 1. Hydrological indices used to test model performance.
Index type                  Index    Description                     Calculation
Daily flow series           DAILY    Daily flow series               Model simulation of entire daily flow series
Annual flow descriptor      MEAN     Mean annual flow                Mean flow in each hydrological year
Annual flow descriptor      MAX      Annual flood                    Maximum daily flow in each hydrological year
Annual flow descriptor      MIN      Annual low flow                 Minimum daily flow in each hydrological year
Annual flow descriptor      QFEB     Proportion of flow in February  Mean flow in February as a proportion of mean annual flow, in each hydrological year
Multi-year flow descriptor  QFLOOD5  5-year flood                    Maximum daily flow expected during a period of 5 years, using a Gumbel extreme value approximation
Multi-year flow descriptor  QLOW5    5-year low flow                 Minimum daily flow expected during a period of 5 years, using a normal distribution
Multi-year flow descriptor  QVAR     Interannual variation           Interannual variation in mean flow
Multi-year flow descriptor  QBAR     All-time mean flow              Mean flow over entire series
Multi-year flow descriptor  MALF     Mean annual low flow            Mean of the annual minimum flows
Multi-year flow descriptor  MAF      Mean annual flood               Mean of the annual maximum flows
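Most of the indices in Table 1 can be computed from daily flows grouped by hydrological year, as in the sketch below. It is illustrative only: the function names are hypothetical, and fitting the Gumbel and normal distributions by the method of moments to the annual maxima and minima is an assumption, since the fitting method is not stated here. QFEB (which needs dates) and QVAR (whose exact statistic is not specified) are omitted.

```python
import math
from statistics import NormalDist, mean, stdev

RETURN_PERIOD = 5  # years, as in QFLOOD5 and QLOW5
EULER_GAMMA = 0.5772156649


def gumbel_quantile(annual_maxima, return_period=RETURN_PERIOD):
    """Flood of the given return period from a Gumbel distribution.

    ASSUMPTION: parameters are fitted by the method of moments.
    """
    scale = math.sqrt(6.0) / math.pi * stdev(annual_maxima)
    location = mean(annual_maxima) - EULER_GAMMA * scale
    non_exceedance = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(non_exceedance))


def normal_low_quantile(annual_minima, return_period=RETURN_PERIOD):
    """Low flow with non-exceedance probability 1 / return_period,
    from a normal distribution fitted to the annual minima."""
    fitted = NormalDist(mean(annual_minima), stdev(annual_minima))
    return fitted.inv_cdf(1.0 / return_period)


def flow_indices(daily_by_year):
    """Compute Table 1 indices from {hydrological_year: [daily flows]}."""
    annual_mean = {y: mean(q) for y, q in daily_by_year.items()}
    annual_max = {y: max(q) for y, q in daily_by_year.items()}
    annual_min = {y: min(q) for y, q in daily_by_year.items()}
    all_days = [q for flows in daily_by_year.values() for q in flows]
    maxima, minima = list(annual_max.values()), list(annual_min.values())
    return {
        "MEAN": annual_mean, "MAX": annual_max, "MIN": annual_min,
        "QBAR": mean(all_days),
        "MALF": mean(minima),
        "MAF": mean(maxima),
        "QFLOOD5": gumbel_quantile(maxima),
        "QLOW5": normal_low_quantile(minima),
    }


# Hypothetical three-day "years", purely to exercise the functions.
example = {2001: [1.0, 5.0, 3.0], 2002: [2.0, 8.0, 2.0], 2003: [3.0, 11.0, 4.0]}
indices = flow_indices(example)
```

Note that QBAR is taken over all days in the record, so it can differ from the mean of the annual MEAN values when years have different numbers of valid days.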
Conclusion
We developed a suite of test procedures that could be applied to any nationwide hydrological
model which calculates daily flow time-series. The procedures are comprehensive because they
cover a broad range of hydrological signatures across time and space. The procedures are also
consistent because they have been defined in advance of their application. The suite of test
procedures has been applied to the National TopNet Model. See McMillan et al. (2015) for
further details of this example application.
References
Booker, D.J. & Woods, R.A., 2014. Comparing and combining physically-based and empirically-based
approaches for estimating the hydrology of ungauged catchments. Journal of Hydrology.
doi:10.1016/j.jhydrol.2013.11.007.
Snelder, T.H. & Biggs, B.J.F., 2002. Multi-scale river environment classification for water resources
management. Journal of the American Water Resources Association 38, 1225–1240.
Moriasi, D.N., Arnold, J.G., Van Liew, M.W., Bingner, R.L., Harmel, R.D. & Veith, T.L., 2007. Model
evaluation guidelines for systematic quantification of accuracy in watershed simulations. Transactions of the
American Society of Agricultural and Biological Engineers 50, 885–900.
McMillan, H.K., Booker, D.J., Cattoen-Gilbert, C. & Zammit, C., 2015. National hydrological model testing, Part I:
results and interpretation. Hydrological Society Conference presentation, Dec 2015.