Rule - Computational Hydraulics Int.

RULES FOR
RESPONSIBLE
MODEL BUILDING
William James
University Professor Emeritus
President, CHI
Guelph, Canada
bill@computationalhydraulics.com
"All models are wrong, though some
may be said to be useful." (G.E.Box).
It is not enough to know when or how a model
may be said to be useful; it is more important
to know how reliable it is.
R:1
A model is a concept. Concepts are used in
thinking, scientific deduction, engineering design
and forensics. They are improved by
experience. We do not necessarily require
the model that most approaches perfection,
rather we seek the model that provides an
acceptably accurate explanation. Simple
models are often said to be “better” than
complex models.
Optimal model complexity depends on the
questions to be resolved and the resources
available.
Your model should meet your own
ethical standards – it should:
•accept the limits of the discipline of engineering;
•improve and restore the natural balances and biodiversity;
•correct the human behaviour that caused the
problem to the ecosystem;
•imitate the structure of the natural, native or
indigenous system;
•be good for all parts of the natural system;
•not enrich one individual or group to the distress or
impoverishment of another;
•be in harmony with good character, cultural value,
and moral law.
R:70
•the living world is the matrix for all design,
•design should follow the laws of life,
•biological equity must determine design,
•design must reflect bioregionality,
•projects should use renewable energy
systems,
•design should integrate living systems,
•projects should heal the planet, and
•design should follow a sacred ecology
R:69
- fundamental tenet
variance can be systematically
reduced by including (explaining)
more and more relevant processes,
at a higher time and spatial
resolution.
R:12
The implicit problem in critical thinking is
to find the most probable flaws in an
argument, to discern the best lines of
thought and to improve the argument.
The solution may be stated:
if we test the argument, perhaps over a long
time, which parts of the argument are less
likely to be valid, and how may the
experience be better explained otherwise?
The implicit problem in scientific method
is to find the optimum or sufficient
description of the dominant processes.
The solution may be stated:
if we test the current explanation of
dominant processes over a long period of
time, e.g. 75-years for an engineering
environmental problem, is the description
optimal in the sense that it is the most
parsimonious description that meets the
required, or imposed, uncertainty?
The implicit problem in engineering
design is to find the optimum cost-effective
array of best practices.
A solution may be stated:
if the 75-year rainfall time series that
occurred at the International Airport had in
fact occurred at Foxran Estates, then plan
126 would have been the most cost-effective
of the 329 plans examined, had they, of
course, all existed over this time.
The implicit problem in engineering
forensics is to find the most credible
explanation for an acute problem, and to
suggest a cost-effective solution which is
generally to replace the acute problem by a
chronic problem.
The price to be paid is vigilance.
Concerns include:
What array of models should be used?
What is the model applicability in the
context of the study objectives?
What accuracy is achievable?
What is the uncertainty of the model?
What investment of model effort is most
cost-efficient?
Is cost-efficiency appropriate for
optimizing an uncertain model?
Rule:
A model is used to help select the best
among competing proposals.
It is fundamentally irresponsible and
unethical for modelers not to interpret the
inherent uncertainty.
R:2
R5
Steps in model construction
1. review and re-state the problem
2. construct the as-is model input data set
3. select model performance evaluation criteria
4. select an objective function
5. calibrate and evaluate the model
6. satisfied? If no, go back to 1; if yes, go to 7
7. model several theoretical or to-be
situations
8. select the likely best alternative
9. report the best solution and its uncertainty.
Rule:
Computed and observed time series are
more ethically represented as smudges
than single-valued lines.
R:7
Rule:
Objectives must be simplified and related
to the computed output and objective
functions.
The model must include code that
adequately describes all significant
processes.
R:8
$$C = \sum_{m=1}^{N_m} \sum_{s=1}^{N_s} \sum_{pr=1}^{N_{pr}} pa_{pr,s,m}$$
where:
Nm = the number of modules active in the model,
Ns = number of sub-spaces modeled in each module,
Npr= number of processes modeled in each sub-space,
pa = the input parameters required for each process.
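As a minimal sketch of the complexity measure C above, the triple sum can be evaluated by walking a nested description of the model. The module, sub-space and process names and the parameter counts below are hypothetical, chosen only to illustrate the counting.

```python
# Hypothetical model layout: module -> sub-space -> process -> number of
# input parameters (pa) required by that process.
model = {
    "runoff": {
        "subcatchment_1": {"infiltration": 3, "overland_flow": 2},
        "subcatchment_2": {"infiltration": 3, "overland_flow": 2, "washoff": 2},
    },
    "transport": {
        "conduit_1": {"routing": 4},
    },
}


def complexity(model):
    """Sum the parameter counts over all modules, sub-spaces and processes."""
    return sum(
        n_params
        for subspaces in model.values()        # sum over modules m
        for processes in subspaces.values()    # sum over sub-spaces s
        for n_params in processes.values()     # sum over processes pr
    )


print(complexity(model))
```

Adding a process or subdividing a sub-space raises C directly, which is why the measure responds to both disaggregation and discretization.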
R:20
cost is taken to be a combination of:
1. engineering fees to design alternative solutions;
2. construction costs of the selected
alternative;
3. intangible costs;
4. costs due to uncertainty of the selected
option.
R:48
R62
[Figure: evaluation function F = em + ed plotted against complexity C on logarithmic axes (cost in $, 10^0 to 10^6; complexity C, 10^0 to 10^9). Curves: model error term em, design costs term ed, and their sum F, with Fmin marking the optimum complexity. Note: Bill's suggested relations and numbers.]
Rule:
In determining the best level of complexity,
test simple models first, proceeding to more
complex, until the required accuracy of the
computed response function is achieved.
Use the least number of processes,
discretized spaces, and the biggest time step
that delivers the required uncertainty.
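The rule above can be sketched as a loop that tries candidate models in order of increasing complexity and stops at the first one that meets the required uncertainty. This is only an illustration: the candidate names, complexity values and error figures are hypothetical, and `evaluate_error` stands in for whatever evaluation function a study actually uses.

```python
def select_simplest_adequate(candidates, evaluate_error, required_error):
    """Return (complexity, model, error) for the simplest adequate model.

    candidates: list of (complexity, model) pairs in any order.
    evaluate_error: callable returning the error of a model (assumed given).
    Returns None if no candidate meets the required error.
    """
    for complexity, model in sorted(candidates, key=lambda pair: pair[0]):
        error = evaluate_error(model)
        if error <= required_error:
            return complexity, model, error
    return None


# Hypothetical candidates and errors, for illustration only.
candidates = [(100, "detailed"), (5, "lumped"), (20, "semi-distributed")]
errors = {"lumped": 0.30, "semi-distributed": 0.12, "detailed": 0.10}

print(select_simplest_adequate(candidates, errors.get, required_error=0.15))
```

With these made-up numbers the detailed model is never evaluated, because the semi-distributed one already meets the requirement.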
R:51
Sensitivity analysis consists of
1. varying model coefficients one at a time,
with the amount varied being representative
of the uncertainty in the parameter being
analyzed,
2. dividing resulting dimensionless change in
computed response by the dimensionless
parameter variation, and then
3. ranking the resulting sensitivity gradient.
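The three steps above can be sketched as follows. The toy response function and the parameter names and values are hypothetical; a real study would run the calibrated model in place of `toy_peak_flow`, and would choose each relative step to reflect the uncertainty of that parameter.

```python
def sensitivity_gradients(model, params, rel_steps):
    """One-at-a-time dimensionless sensitivity gradients.

    model: callable mapping a parameter dict to a computed response.
    params: dict of base parameter values (assumed non-zero).
    rel_steps: dict of relative perturbations, e.g. 0.05 for +5 percent.
    """
    base = model(params)
    gradients = {}
    for name, value in params.items():
        perturbed = dict(params)
        perturbed[name] = value * (1.0 + rel_steps[name])      # step 1
        rel_change = (model(perturbed) - base) / base
        gradients[name] = rel_change / rel_steps[name]         # step 2
    return gradients


def rank_by_sensitivity(gradients):
    """Step 3: parameter names ordered from most to least sensitive."""
    return sorted(gradients, key=lambda name: abs(gradients[name]), reverse=True)


def toy_peak_flow(p):
    # Hypothetical response, not a SWMM relation.
    return p["area"] * p["slope"] ** 0.5 / p["roughness"]


params = {"area": 10.0, "slope": 0.02, "roughness": 0.015}
steps = {name: 0.05 for name in params}
grads = sensitivity_gradients(toy_peak_flow, params, steps)
print(rank_by_sensitivity(grads))
```

Because the gradients are dimensionless, parameters with different units can be ranked on one scale.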
R:129
[Figure: non-linear sensitivity gradients for peak flow; medium duration, medium intensity (0.3 in/hr for 1 hr), location 100. Horizontal axis: percent change in parameter (-7.5 to 7.5). Vertical axis: flow [cfs] (0.750 to 0.875). Curves: WW1, WAREA, WW3, WSLOPE, WW5, WW6, WW7, WW8, WW9, WW10, ww11.]
Wkbk:59
Rule:
Do not test a generalized program per se for
sensitivity, parameter optimization, or error,
because individual applications are likely to be
radically different. Values of parameters in the
input datafile determine which processes will
be dominant or dormant. Relative parameter
values change both the model sensitivity and
the model uncertainty. Each model application
must be separately tested over the relevant
range of model
R:3
Categorize input parameters in four groups:
1. can be measured with almost total certainty;
2. can be readily measured in the field or
laboratory;
3. cannot be easily measured in the field or
laboratory;
4. cannot be measured with any certainty at all.
[Flowchart (© W James '97) in two parts. 1. Calibration: Start; model parameters; datafile; calibration IFs; programs; user input; postprocessor; RFs, OFs and EFs; parameter optimization; sensitivity analysis; model error analysis; OK? (Yes/No). 2. Inferences: long-term IFs; continuous fuzzy RFs; inference; End. Stages named alongside the chart: model process calibration, parameter estimation, sensitivity analysis, event calibration, continuous model.]
R:105
PCSWMM
PCSWMM 2005
Utilities
Terminology
24 Response Functions
• Nodes
– Depth, head, volume, lateral inflow, total inflow,
flooding
• Links
– Flow, depth, velocity, capacity
• System
– Temp, rainfall, snow depth, losses, runoff, dry weather
inflow, ground water inflow, RDII inflow, direct inflow,
total inflow, flooding, outflow, storage
[Figure: typical cycle in a response or input function, RF(t) or IF(t), plotted against time, with the threshold RFcrit or IFcrit and the times t1,1, t1,2, t1,3, t1,4, t2,1 and t2,2 marked. The functions may be observed, synthetic or computed; RFcrit and IFcrit are arbitrary.]
R:117
OF      Definition          Meaning
OF1     (t2,1 - t1,1)       duration of wet event
OF2     (t2,2 - t1,3)       duration of dry event
OF3     RF(t1,3)            peak flow, flux, or concentration
OF4     RF(t1,1)            minimum flow, flux or concentration
OF5     *INT (t1,4-t1,1)    total wet event flow or flux
OF6     (t1,4 - t1,2)       duration of exceedance
OF7     (t2,2 - t1,4)       duration of deficit
OF8     n[RF>RFcrit]        number of exceedances
OF9     n[RF<RFcrit]        number of deficits
OF10    *INT (t2,2-t1,4)    volume of deficit
OF11    *INT (t1,4-t1,2)    volume of excess
OF12    OF5/OF1             wet event mean concentration
OF13    *INT (t2,1-t1,4)    total dry event flow or flux
OF14    OF13/OF2            dry event mean concentration
R:117
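$$OF_5 = \int_{t_{1,1}}^{t_{1,4}} RF(t)\,dt$$
$$OF_{10} = \int_{t_{1,4}}^{t_{2,2}} \left[ RF_{crit} - RF(t) \right] dt$$
$$OF_{11} = \int_{t_{1,2}}^{t_{1,4}} \left[ RF(t) - RF_{crit} \right] dt$$
$$OF_{13} = \int_{t_{1,4}}^{t_{2,1}} RF(t)\,dt$$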
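A few of these objective functions can be sketched for a sampled response function. The function names and the sample series are hypothetical. Two assumptions are made for illustration: the supplied window is taken to be the event window, and exceedances (OF8) are counted as upcrossings of RFcrit. The volume of excess is integrated on the clipped series, so it is only approximate between samples where RF crosses RFcrit.

```python
def trapezoid(t, y):
    """Integrate y(t) with the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i]) for i in range(len(t) - 1))


def peak(rf):
    """OF3: peak flow, flux, or concentration."""
    return max(rf)


def total_flux(t, rf):
    """OF5: integral of RF(t) over the supplied window."""
    return trapezoid(t, rf)


def volume_of_excess(t, rf, rf_crit):
    """OF11: integral of RF(t) - RFcrit where RF exceeds RFcrit."""
    excess = [max(value - rf_crit, 0.0) for value in rf]
    return trapezoid(t, excess)


def number_of_exceedances(rf, rf_crit):
    """OF8, counted here as the number of upcrossings of RFcrit."""
    return sum(1 for a, b in zip(rf, rf[1:]) if a <= rf_crit < b)


# Hypothetical sampled response function.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
rf = [0.0, 2.0, 4.0, 2.0, 0.0, 0.0]
rf_crit = 1.0

print(peak(rf), total_flux(t, rf), volume_of_excess(t, rf, rf_crit),
      number_of_exceedances(rf, rf_crit))
```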
R:118
Dominant process                           Objective function
Overland flow over impervious areas        OF3
Infiltration into the upper soil mantle    OF4
Pollutant washoff                          OF5
Erosion                                    OF1
Overland flow over pervious areas          OF3
Pollutant build-up                         OF5
Recovery of storages                       OF2
Recovery of loss (infiltration) rates      OF4
Recession of storages                      OF7
Evaporation                                *IF8
Snowmelt                                   *IF11
Snow accumulation                          *IF7
R:119
Rule:
Select the best objective function
thoughtfully, by relating it back to the original
design questions.
Use the minimum acceptable number of
objective functions.
R:119
1. observation error, related to field instrumentation,
comprising two components, one random and one systematic;
2. sampling error, associated with the timing and location
of the field equipment;
3. numerical error, identified with numerical math used in
the code;
4. structural error, related to disaggregation (the number &
resolution of the processes active);
5. structural error, related to discretization (the spatial
resolution);
6. structural error, related to poor formulation of one or
more of the component process relations and code; and
7. propagated error, related to erroneous parameters.
R:123
External description
Prior knowledge:
1. uncertainty due to natural variability, or unobserved input disturbances
2. measurement and sampling errors of observed input and output
Calibration process (identify as-is model):
3. start-up error
4. input TS datafile error
5. model error
Design process (inference to the to-be and as-was scenarios):
6. uncertainty of to-be parameters
7. user output-interpretation error
8. parameter propagation error
9. error analysis
Internal description
1. aggregation error
2. numerical error
3. structural error
4. discretization error
5. input environment datafile error
6. model structure and state-parameter error
7. parameter optimization error
R:124
Rule:
Sixteen sources of error are listed in the
framework for uncertainty analysis presented
here.
When interpreting the computed output from
your model, all sixteen sources should be
explicitly interpreted.
R:127
model users must be able to:
1. isolate the important empirical parameters that
require refining (calibration),
2. associate these parameters with their correct
processes (may be more than one),
3. isolate the conditions under which the
processes are active (again may be more than
one), and then
4. select state-variable events (SV sub-spaces) for
sensitivity (which may be hypothetical events), and
5. select state-variable events from the observed
record for calibration analyses.
R:136
[Figure: (OFi)c plotted against (OFi)o, with regions labelled A, B, C and D.]
A represents “small” events
B represents “medium” events
C represents “big” events
D represents fuzzy overlaps
R:137
Event type                        Code    Duration    Intensity
Short-duration-high-intensity     SDHI    20 min      3 in/h
Medium-duration-high-intensity    MDHI    60 min      1.0 in/h
Long-duration-high-intensity      LDHI    600 min     0.2 in/h
Short-duration-med-intensity      SDMI    20 min      0.4 in/h
Medium-duration-med-intensity     MDMI    60 min      0.3 in/h
Long-duration-med-intensity       LDMI    600 min     0.1 in/h
Short-duration-low-intensity      SDLI    20 min      0.1 in/h
Medium-duration-low-intensity     MDLI    60 min      0.1 in/h
Long-duration-low-intensity       LDLI    600 min     0.1 in/h
Evapo-transpiration:
Short-duration-high-intensity     SDHI    1 d         0.5 in/d
Long-duration-high-intensity      LDHI    10 d        0.3 in/d
Short-duration-low-intensity      SDLI    1 d         0.05 in/d
Long-duration-low-intensity       LDLI    10 d        0.05 in/d
R:139
Causative event             Processes
Rain:
  Light rate of rain        Overland flow over impervious areas
  Medium rate of rain       Infiltration into upper soil mantle; pollutant washoff
  Heavy rate of rain        Erosion; pollutant washoff; pervious area flow
  Long duration rain        Overland flow over pervious areas
No rain:
  Long duration drought     Pollutant build-up; groundwater depletion
  Short duration drought    Storage recessions
Temperature:
  High temperatures         Evapo-transpiration; snowmelt
  Low temperatures          Snow accumulation & ripening
Wind:
  High wind                 Snowmelt
R:140
Rule:
Associate parameters with processes, and
processes with causative events, and
causative events with limited state-variable
sub-spaces.
R:140
A total error statistic (EFt) may be used to quantify overall
goodness of fit:
$$EF_t = (1.0 - w) \left[ \frac{\sum_{i=1}^{n} (COF_i - OOF_i)^2}{n} \right]^{1/2} + w \left| OPF_p - CPF_p \right|$$
where:
EFt = total error statistic (m3/s);
w = weighting factor;
n = number of measured hourly flows;
OOF = measured flow (m3/s);
COF = computed flow (m3/s);
OPF = measured peak flow (m3/s); and
CPF = computed peak flow (m3/s).
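A minimal sketch of the total error statistic follows, combining the root-mean-square flow error with the absolute peak-flow error. The function name and sample flows are hypothetical, and the peak of each series is assumed here to be simply its maximum value.

```python
import math


def total_error_statistic(computed, observed, w):
    """EFt = (1 - w) * RMS flow error + w * |observed peak - computed peak|.

    computed, observed: equal-length sequences of flows (m3/s).
    w: weighting factor between 0 and 1.
    """
    if len(computed) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    rms = math.sqrt(sum((c - o) ** 2 for c, o in zip(computed, observed)) / n)
    # Assumption: a single peak per series, taken as the maximum value.
    peak_error = abs(max(observed) - max(computed))
    return (1.0 - w) * rms + w * peak_error


# Hypothetical hourly flows (m3/s).
observed = [1.0, 2.0, 3.0, 2.0]
computed = [1.0, 2.0, 4.0, 2.0]
print(total_error_statistic(computed, observed, w=0.5))
```

Setting w = 0 scores the fit on the hydrograph as a whole, while w = 1 scores it on the peak alone.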
R:142
Rule:
Use first-order error analysis to report the
estimated propagated error in your
recommended design solution.
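First-order error analysis can be sketched as below: the variance of the computed response is approximated by the sum of squared local gradients times the parameter variances. This sketch assumes independent parameter errors and a response that is roughly linear near the mean values; the toy response and the numbers are hypothetical.

```python
import math


def first_order_error(model, means, stds, rel_step=1e-4):
    """Return (response at the mean parameters, first-order standard deviation).

    model: callable mapping a parameter dict to a computed response.
    means, stds: dicts of parameter means and standard deviations.
    """
    base = model(means)
    variance = 0.0
    for name, mean in means.items():
        # Forward-difference estimate of the local gradient d(response)/d(parameter).
        h = abs(mean) * rel_step or rel_step
        shifted = dict(means)
        shifted[name] = mean + h
        gradient = (model(shifted) - base) / h
        variance += (gradient * stds[name]) ** 2
    return base, math.sqrt(variance)


def toy_response(p):
    # Hypothetical linear response, for illustration only.
    return 2.0 * p["a"] + 3.0 * p["b"]


means = {"a": 10.0, "b": 5.0}
stds = {"a": 1.0, "b": 2.0}
value, std = first_order_error(toy_response, means, stds)
print(value, round(std, 3))
```

The result can be reported as the recommended design value together with its estimated propagated error.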
R:156
[Figure: regions labelled A to H located by rate of rain, evapotranspiration rate (+ve, zero, -ve) and duration of rain; two regions are marked "not used".]
R:165
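[Figure: plot with a vertical scale from 0.0 to 1.0 showing curves labelled C,F,G; B,H; A,E; I; and D. Axis labels: evapo-transpiration (zero, medium), duration (short, med, long), rate-of-rain.]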
R:166
General form:
If X period is Y, analyze Z parameters,
where X, Y, Z have the following meanings:
      X       Y         Z
1.    rain    long      erosion
2.    rain    medium    pervious area flow
3.    rain    medium    pollutant washoff
4.    rain    short     impervious area flow
5.    rain    short     rain-out
6.    ET      exists    recovery of storages
7.    ET      exists    recovery of loss rates
8.    ET      exists    groundwater depletion
9.    ET      medium    pollutant build-up
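The nine rules above amount to a lookup from an event description to the processes whose parameters should be analyzed. A minimal sketch, with a hypothetical function name, is:

```python
# The nine (X, Y, Z) rules from the table: if X period is Y, analyze Z parameters.
RULES = [
    ("rain", "long", "erosion"),
    ("rain", "medium", "pervious area flow"),
    ("rain", "medium", "pollutant washoff"),
    ("rain", "short", "impervious area flow"),
    ("rain", "short", "rain-out"),
    ("ET", "exists", "recovery of storages"),
    ("ET", "exists", "recovery of loss rates"),
    ("ET", "exists", "groundwater depletion"),
    ("ET", "medium", "pollutant build-up"),
]


def processes_to_analyze(period, condition):
    """Return the processes (Z) whose parameters apply to an X period that is Y."""
    return [z for x, y, z in RULES if x == period and y == condition]


print(processes_to_analyze("rain", "short"))
```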
R:167
Rule:
Analyse only sensitive parameters, and
then only against relevant events.
R:167
Framework for continuous modeling:
At your desk:
1. Make a list of simplified design questions, and
postulate the relationship between your list and your
proposed objective functions.
R:169
2. Select the best objective functions and response
functions for your study problem. Minimize the
computed output and computer execution times.
Allocate storage space for computed time series
management.
R:169
3. Obtain or generate a credible, very-long-term time
series to drive your model for design inference.
R:169
4. Obtain a short but sufficient record of good,
observed events to calibrate your model.
R:169
Using the PCSWMM4 shell:
5. List all parameters that need to be
optimized, and their associated processes.
R:169
6. Associate all processes with the limited state-variable sub-spaces where they dominate.
R:169
7. Search the good observed record for
a sufficient number of appropriate
events.
R:169
8. Estimate:
1. the mean most likely value,
2. a higher most likely value, and
3. a lower most likely value for each of
all input parameters. Choose the
sensitivity test range, but keep it
small.
R:169
9. Carry out the sensitivity tests, and rank all
parameters, in terms of their dimensionless
sensitivity gradients.
R:169
10. Optimize the parameters to give the
smallest error.
R:169
11. Run the calibrated model for the long-term time
series for each array of BMPs.
R:169
12. Infer which is the best array. Rerun the model
for this array estimating the error in the computed
response functions.
R:169
13. Study all the input and output information again;
make certain that it is logical, and gain knowledge about
the performance of the drainage system. Interpret the
impact of the errors.
R:169
At your client's office:
14. Report your recommendations, and, provided
you follow the logic, become rich and famous.
R:169
The following 8 rules form a personal
catechism for honest, very-long-term,
continuous surface water quality modeling.
R:171
Rule 1: Do not calibrate all parameters
simultaneously against a long-term
continuous observed record,
notwithstanding any early advice to the
contrary in the literature.
R:171
Rule 2: Transpose or synthesize a long-term,
hydro-meteorologic input time-series from the
same hydrologic region, and use this for
inferring comparative performance of various
arrays of BMPs. Many records of 50 years
duration or longer are available.
R:171
Rule 3: Carefully choose the best objective
functions that represent the design questions
and the model variability. Get the advisory
committee to justify the selections in writing.
R:171
Rule 4: In order to control the amount of
computing, associate the input parameters with
processes, and processes with causative events,
and causative events with limited state-variable
sub-spaces. For this activity, sensitivity analysis
code in PCSWMM4 is helpful. Do not analyze
parameters outside these spaces.
R:171
Rule 5: Use three estimates of the most likely
parameter values. It is more meaningful to compare
the computed response from several reasonable
models, rather than responses computed using
extreme values.
R:171
Rule 6: Assume that the water quality model (WQM) is approximately
linear, for the purposes of optimizing
parameters, and estimating the propagated
error. Then analyze for sensitivity near the mean
expected values of all input parameters.
R:171
Rule 7: Calibrate only sensitive parameters, and then
only against relevant events for which you have
good, short-term observed data. And that must
include good rate-of-rain with adequate coverage
and spatial resolution.
R:171
Rule 8: Use first-order linear error
analysis, and report the estimated
propagated error in your
recommended design solution.
R:171
The end
see you on-line at:
•www.computationalhydraulics.com
•www.eos.uoguelph.ca/webfiles/james
•bill@computationalhydraulics.com
• wjames@uoguelph.ca