Classical and Bayesian nonlinear regression applied to hydraulic rating curve inference.

Construction and uncertainty analysis of stage-discharge rating curves
Trond Reitan (Division of statistics and insurance mathematics, Department of Mathematics, University of Oslo)
Motivation for this work
• River hydrology
– Management of fresh water resources
• Decision-making concerning flood risk
• Decision-making concerning drought
• River hydrology => How much water is flowing through the rivers?
• Key definition: discharge, the amount of water passing through a cross-section of the river per unit of time.
Key problem
• Discharge is expensive to measure, but hydrologists want discharge time series!
• Solution: Find a relationship between discharge and
something that is inexpensive to measure.
• Usually, that something is water level.
• This job must be done over and over again: Need
solid tools for finding such relationships.
• Discharge measurements are uncertain => need
statistical tools
• Program must be easy for hydrologists to use => User
friendliness in statistics?
Water level definitions
• Stage: the height of the water level at a river site
[Figure: river cross-section with stage h, discharge Q, the level h0 and the datum (height = 0) marked.]
Stage-discharge relationship
[Figure: the rating curve $Q = C(h - h_0)^b$ relating stage h to discharge Q, with h0 and the datum (height = 0) marked.]
Stage-discharge relationship
• Simple physical attributes:
– $Q = 0$ for $h \le h_0$
– $Q(h_2) > Q(h_1)$ for $h_2 > h_1 > h_0$
• Parametric form suggested by hydraulics (Lambie (1978) and ISO 1100/2 (1998)): $Q = C(h - h_0)^b$
• Parameters may be fixed only within stage intervals => segmentation
[Figure: two panels; the surviving labels are stage h, discharge Q and width.]
Calibration data and statistical model
• n stage-discharge measurements.
• Discharge is error-prone.
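• Statistical inference on $C$, $b$, $h_0$ => nonlinear regression
• $Q_i = C (h_i - h_0)^b E_i$, where $E_i \sim \log N(0, \sigma^2)$ i.i.d. noise and $i \in \{1, \dots, n\}$
• $q_i = a + b \log(h_i - h_0) + \varepsilon_i$, where $q_i = \log Q_i$, $a = \log C$ and $\varepsilon_i = \log E_i \sim N(0, \sigma^2)$ i.i.d.
• Problem: Enable hydrological engineers to estimate $Q(h) = C(h - h_0)^b$ and evaluate the calibration uncertainty.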
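The data model on this slide can be sketched in a few lines of Python. This is only an illustration: the parameter values, the stage list and the function names are assumptions made for the example, not values from the talk.

```python
import math
import random

def rating_curve(h, C, b, h0):
    """Discharge Q = C * (h - h0)**b for h > h0, and 0 otherwise."""
    return C * (h - h0) ** b if h > h0 else 0.0

def simulate_measurements(stages, C, b, h0, sigma, seed=0):
    """Draw Q_i = C (h_i - h0)^b E_i with log E_i ~ N(0, sigma^2), i.i.d."""
    rng = random.Random(seed)
    return [rating_curve(h, C, b, h0) * math.exp(rng.gauss(0.0, sigma))
            for h in stages]

# Hypothetical parameter values and stages, chosen only for illustration.
C, b, h0, sigma = 10.0, 2.0, 0.5, 0.1
stages = [0.8, 1.0, 1.3, 1.7, 2.2, 2.8]
discharges = simulate_measurements(stages, C, b, h0, sigma)

# On the log scale the model is linear in a = log C and b once h0 is known:
# q_i = a + b * log(h_i - h0) + eps_i.
a = math.log(C)
residuals = [math.log(Q) - (a + b * math.log(h - h0))
             for h, Q in zip(stages, discharges)]
```

The residuals are the simulated noise terms on the log scale; only h0 enters the log-scale model nonlinearly.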
One segment fitting, the old way
• Guess or make an approximate measurement of $h_0$. Then linear regression on $q_i$ vs $\log(h_i - h_0)$.
• For each plausible value of $h_0$, do linear regression. Choose the $h_0$ that gives the least SSE for the regression.
– Same as doing max likelihood inference on the model: $q_i = a + b \log(h_i - h_0) + \varepsilon_i$, $i \in \{1, \dots, n\}$
– Means that uncertainty analysis becomes available (?).
– Studied by Venetis (1970).
– Also by Clarke (1999).
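The search over plausible values of h0 can be sketched as a grid search with an ordinary least-squares fit at each grid point. The function names, the grid and the noise-free example data below are assumptions for illustration; a grid always returns a finite value, so this sketch cannot by itself detect the cases where no finite estimate exists.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, sse

def profile_fit(stages, discharges, h0_grid):
    """For each candidate h0, regress log Q on log(h - h0); keep the least SSE."""
    q = [math.log(Q) for Q in discharges]
    best = None
    for h0 in h0_grid:
        if h0 >= min(stages):
            continue  # log(h - h0) is undefined for the lowest measurement
        x = [math.log(h - h0) for h in stages]
        a, b, sse = linear_fit(x, q)
        if best is None or sse < best[3]:
            best = (h0, a, b, sse)
    return best

# Hypothetical noise-free data from Q = 10 (h - 0.5)^2.
stages = [0.8, 1.0, 1.3, 1.7, 2.2, 2.8]
discharges = [10.0 * (h - 0.5) ** 2 for h in stages]
h0_grid = [i / 100 for i in range(0, 80)]  # 0.00, 0.01, ..., 0.79
h0_hat, a_hat, b_hat, sse = profile_fit(stages, discharges, h0_grid)
```

With these noise-free data the search recovers the generating values; with real measurements the SSE profile in h0 is worth plotting before trusting the minimum.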
Problems concerning classic one
segment curve fitting
• Sometimes exhibits heteroscedasticity.
• Sometimes there's no finite solution!
– Found a set of requirements that ensures finite estimates.
– In practice, broken requirements mean no finite estimates.
– The model can produce broken requirements for any set of stage measurements!
– Parameter estimators have infinite expectation => uncertainty inference becomes difficult!
– Explored in paper 1, Reitan and Petersen-Øverleir (2006), and in the appendix, Reitan and Petersen-Øverleir (2005).
Bayesian one segment fitting
• Based on the same data model, but with a prior distribution on the set of parameters. The Bayesian study of this resulted in paper 2, Reitan and Petersen-Øverleir (2008a).
• Bayesian analysis of other models done by Moyeed and Clarke (2005) and Árnason (2005).
Pros and cons
• Upsides:
– Encodes hydraulic knowledge.
– Can put softer ‘restrictions’ on the parameters.
– Finite estimates.
– Natural uncertainty measures.
• Downsides:
– Requires heavier numerical methods (MCMC).
– Coming up with a prior distribution can be hard.
– Also sometimes exhibits heteroscedasticity.
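To give a feel for the numerical side, here is a minimal random-walk Metropolis sketch for the one-segment model on the log scale. It is not the sampler used in the papers: the independent normal priors on (a, b, h0, log sigma), the step sizes, the starting point and the measurements are all assumptions made for this example.

```python
import math
import random

def log_normal_density(x, mean, sd):
    """Log of the N(mean, sd^2) density at x."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def log_posterior(theta, stages, q, priors):
    """Unnormalised log posterior for theta = (a, b, h0, log_sigma)."""
    a, b, h0, log_sigma = theta
    if h0 >= min(stages):
        return -math.inf  # log(h - h0) must be defined for every measurement
    sigma = math.exp(log_sigma)
    lp = sum(log_normal_density(t, m, s) for t, (m, s) in zip(theta, priors))
    for h, qi in zip(stages, q):
        lp += log_normal_density(qi, a + b * math.log(h - h0), sigma)
    return lp

def metropolis(stages, discharges, priors, start, steps, n_iter=5000, seed=1):
    """Random-walk Metropolis; returns (samples, acceptance rate)."""
    rng = random.Random(seed)
    q = [math.log(Q) for Q in discharges]
    theta, lp = list(start), log_posterior(start, stages, q, priors)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, steps)]
        lp_new = log_posterior(proposal, stages, q, priors)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_new - lp:
            theta, lp = proposal, lp_new
            accepted += 1
        samples.append(tuple(theta))
    return samples, accepted / n_iter

stages = [0.8, 1.0, 1.3, 1.7, 2.2, 2.8]
discharges = [0.95, 2.4, 6.6, 14.0, 29.5, 52.0]  # hypothetical measurements
priors = [(2.0, 2.0), (2.5, 1.0), (0.4, 0.5), (-2.0, 1.0)]  # assumed (mean, sd)
start = (math.log(10.0), 2.0, 0.5, -2.0)
samples, rate = metropolis(stages, discharges, priors, start,
                           steps=(0.05, 0.05, 0.02, 0.2))
```

Because the prior is proper and proposals with h0 at or above the lowest stage are rejected, every draw is finite, which is the "finite estimates" upside in practice. A real analysis would also need burn-in, tuning and convergence checks.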
Input - prior distribution
• Prior distribution form as simple as possible.
• Prior knowledge either local or regional.
• Regional knowledge can be extracted once and for
all.
• At-site prior knowledge can be set through asking for
credibility intervals.
Output – estimates and uncertainties
• Estimates: expectations or medians from the posterior distribution.
• Uncertainty: credibility intervals of the parameters and of the curve itself, $Q(h) = C(h - h_0)^b$.
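Given posterior draws of the parameters, the curve estimate and its credibility interval at a stage h can be read off as quantiles of Q(h) across the draws. The sketch below assumes draws of (a, b, h0) with a = log C are already available; the five draws listed are hypothetical stand-ins, and the helper names are made up for the example.

```python
import math

def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def curve_summary(draws, h, level=0.95):
    """Posterior median and credibility interval of Q(h) from draws of (a, b, h0)."""
    values = sorted(
        math.exp(a + b * math.log(h - h0)) if h > h0 else 0.0
        for a, b, h0 in draws
    )
    tail = (1.0 - level) / 2.0
    return quantile(values, 0.5), quantile(values, tail), quantile(values, 1.0 - tail)

# Hypothetical posterior draws of (a, b, h0); real ones would come from MCMC.
draws = [(2.30, 2.00, 0.50), (2.25, 2.10, 0.48), (2.35, 1.95, 0.52),
         (2.28, 2.05, 0.49), (2.32, 1.98, 0.51)]
median, lower, upper = curve_summary(draws, h=1.5)
```

Repeating the call over a range of stages gives a pointwise credibility band around the curve.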
Segmentation
• Original idea: Divide the stage-discharge measurements into sets and fit $Q(h) = C(h - h_0)^b$ separately for each segment. This can fit a wider range of measurement sets.
[Figure: two-segment rating curve in the (Q, h) plane, labelled Segment 1, Segment 2 and Intersection.]
Problems with manual segmentation
1) Uncertainty analysis of manual decisions not statistically available.
2) Curves fitted to two neighbouring sets may not intersect.
3) Two such curves may have intersections only inside the sets.
[Figure: two panels in the (Q, h) plane; one is labelled "Jump in the curve".]
Statistical model for segmentation
– the interpolation model
• Idea: Make a model with segments and let the data be
attached to that model.
• Model: for $k$ segments, introduce $k-1$ segment limit parameters, $h_{s,1}, \dots, h_{s,k-1}$. For a measurement with $h_{s,j-1} < h_i < h_{s,j}$, assume $q_i = a_j + b_j \log(h_i - h_{0,j}) + \varepsilon_i$.
• Make sure there’s continuity by sacrificing one of the parameters in the upper segments ($a_j$ for $j > 1$).
• Goal: Make inference on all parameters in this model.
Also, make inference on k.
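The continuity constraint can be made concrete: once a_1, the exponents b_j, the offsets h_{0,j} and the segment limits are given, each upper a_j follows from matching the two neighbouring curves at the limit between them. The sketch below assumes each h_{0,j} lies below the lower limit of its segment; the function names and numbers are illustrative only.

```python
import math
from bisect import bisect_right

def upper_intercepts(a1, b, h0, limits):
    """Derive a_j for j > 1 so neighbouring segments meet at each limit."""
    a = [a1]
    for j, hs in enumerate(limits):
        q_at_limit = a[j] + b[j] * math.log(hs - h0[j])
        a.append(q_at_limit - b[j + 1] * math.log(hs - h0[j + 1]))
    return a

def log_discharge(h, a1, b, h0, limits):
    """Log-discharge q(h) of the continuous segmented curve."""
    a = upper_intercepts(a1, b, h0, limits)
    j = bisect_right(limits, h)  # index of the segment containing h
    return a[j] + b[j] * math.log(h - h0[j])

# Hypothetical two-segment curve with one segment limit at h = 1.5.
a1, b, h0, limits = math.log(10.0), [2.0, 1.5], [0.5, 0.2], [1.5]
below = log_discharge(1.5 - 1e-9, a1, b, h0, limits)
above = log_discharge(1.5 + 1e-9, a1, b, h0, limits)
```

Here `below` and `above` agree up to the tiny step in h, so the curve has no jump at the segment limit.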
Frequentist inference on the
interpolation model
• Segmented model first formulated and treated by
using the maximum likelihood method in paper 3,
Petersen-Øverleir and Reitan (2005).
• Problems:
1) Possibility of infinite parameter estimates inherited from the one-segment model. (Much more likely than usual for upper segments.)
2) Multi-modality and discontinuous derivative of the marginal likelihood of the changepoint parameters, $\{h_{s,j}\}$.
3) Inference of $k$ through AIC or BIC?
Bayesian inference on the
interpolation model
• Need prior distribution of changepoints, $\{h_{s,j}\}$, and number of segments, $k$.
• MCMC for each sub-model characterized by k.
• Importance sampling for posterior sub-model
probability, Pr(k|D).
• Input: Data, prior probability of each sub-model and
prior for the parameter set of each sub-model. (Can be
regional or partially regional. Set by asking for credibility intervals.)
• Output: Pr(k|D) and the posterior distribution of all parameters for each k. (Summarised by estimates and credibility intervals.)
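Once a log marginal likelihood has been estimated for each sub-model (the slide uses importance sampling for that step, which is not reproduced here), the posterior sub-model probabilities follow from Bayes' rule, Pr(k|D) ∝ Pr(k) p(D|k). The sketch below shows only this final normalisation; the numbers are hypothetical.

```python
import math

def posterior_model_probabilities(log_marginal_likelihoods, prior_probabilities):
    """Pr(k|D) from log p(D|k) and Pr(k), normalised on the log scale."""
    log_weights = [lm + math.log(p)
                   for lm, p in zip(log_marginal_likelihoods, prior_probabilities)]
    top = max(log_weights)  # subtract the maximum to avoid underflow
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical estimates of log p(D|k) for k = 1, 2, 3 and a uniform prior on k.
log_ml = [-12.0, -9.5, -10.0]
prior = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
posterior = posterior_model_probabilities(log_ml, prior)
```

With a uniform prior the ranking of sub-models is the ranking of their marginal likelihoods, so k = 2 gets the largest posterior probability in this made-up example.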
Output example for interpolation
model inference
[Figure: example output of the inference; only the axis label Q survives.]
Problems with Bayesian treatment
of segmented models
• Difficult to make efficient inference algorithms (but a semi-efficient one has been made).
• Changepoints only inside the dataset (thus "the interpolation model"). Extrapolation uncertainty is underestimated because there can be changepoints outside the dataset.
• Solution(?): The process model, a new model where the segments appear through a process.
• Problems with the process model: Very inefficient algorithms. Difficult to implement all sorts of relevant prior knowledge.
• Middle ground? Use the changepoints of the most probable sub-model from the interpolation model as data in inference about the changepoint process. Process model used for extrapolation of the curve.
References
1) Árnason S (2005), Estimating nonlinear hydrological rating curves and discharge using the Bayesian approach. Masters Degree, Faculty of Engineering, University of Iceland
2) Clarke RT (1999), Uncertainty in the estimation of mean annual flood due to rating curve indefinition. J Hydrol, 222: 185-190
3) ISO 1100/2 (1998), Stage-discharge Relation, Geneva
4) Lambie JC (1978), Measurement of flow - velocity-area methods. Hydrometry: Principles and Practices, first edition, edited by R.W. Herschy, John Wiley & Sons, UK
5) Moyeed RA, Clarke RT (2005), The use of Bayesian methods for fitting rating curves, with case studies. Adv Water Res, 28:8: 807-818
6) Petersen-Øverleir A, Reitan T (2005), Objective segmentation in compound rating curves. J Hydrol, 311: 188-201
7) Reitan T, Petersen-Øverleir A (2005), Estimating the discharge rating curve by nonlinear regression - The frequentist approach. Statistical Research Report, University of Oslo, Preprint 2, 2005. Available at: http://www.math.uio.no/eprint/stat_report/2005/02-05.html
8) Reitan T, Petersen-Øverleir A (2006), Existence of the frequentistic regression estimate of a power-law with a location parameter, with applications for making discharge rating curves. Stoc Env Res Risk Asses, 20:6: 445-453
9) Reitan T, Petersen-Øverleir A (2008a), Bayesian power-law regression with a location parameter, with applications for construction of discharge rating curves. Stoc Env Res Risk Asses, 22: 351-365
10) Venetis C (1970), A note on the estimation of the parameters in logarithmic stage-discharge relationships with estimation of their error. Bull Inter Assoc Sci Hydrol, 15: 105-111