Numeral Unit Spread Assessment, Pedigree

UNCERTAINTY EXERCISE – NUSAP pedigree analysis
The NUSAP (Numeral Unit Spread Assessment, Pedigree) method aims to provide an
analysis and diagnosis of uncertainty. It captures both the quantitative and qualitative
dimensions of uncertainty. This exercise will focus on qualitative assessment: you
will determine the Pedigree scores for a number of parameters of a model or
calculation scheme.
Pedigree conveys an evaluative account of the production process of the information,
and indicates different aspects of the underpinning of the numbers and scientific status
of the knowledge used. Pedigree is expressed by means of a set of pedigree criteria to
assess these different aspects. Examples of such criteria are empirical basis or degree
of validation. These criteria are in fact yardsticks for strength. Many of these criteria
are hard to measure in an objective way. Assessment of pedigree involves qualitative
expert judgement. To minimise arbitrariness and subjectivity in measuring strength a
pedigree matrix is used to code qualitative expert judgements for each criterion into a
discrete numeral scale from 0 (weak) to 4 (strong) with linguistic descriptions
(modes) of each level on the scale (see the pedigree matrices in Tables 1 and 2). Note
that these linguistic descriptions are mainly meant to provide guidance in attributing
scores to each of the criteria for a given parameter. It is not possible to capture all
aspects that an expert may consider in scoring a pedigree in a single phrase. Therefore
a pedigree matrix should be applied with some flexibility and creativity.
The pedigree criteria you will use to evaluate the underpinning of a model as a whole
are:
 Supporting empirical evidence: proxy
 Supporting empirical evidence: quality and quantity
 Theoretical understanding (of the processes modelled)
 Representation of the understood underlying mechanisms
 Plausibility
 Colleague Consensus
You will also assess the pedigree of individual model parameters, using the following
pedigree criteria:
 Proxy
 Empirical basis
 Methodological rigour
 Validation
ASSIGNMENT:
For this assignment, you will work in groups of about 6 persons.
- Choose one of the models on which you performed a Monte Carlo analysis.
- Apply the "Pedigree matrix for evaluating the tenability of a conceptual model"
(Table 1) to the model.
- For the 5 most sensitive parameters, apply the pedigree matrix for parameters
(Table 2).
- Construct a diagnostic diagram in which you plot the 5 parameters.
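The diagram coordinates can be computed before plotting. A minimal sketch, assuming hypothetical parameter names, pedigree scores, and sensitivities (none of these come from the exercise itself), and assuming the common convention of plotting pedigree strength against the parameter's contribution to output variance from the Monte Carlo analysis; parameters in the high-sensitivity, low-strength corner deserve the most attention:

```python
# Sketch of diagnostic-diagram coordinates. All names and numbers below
# are hypothetical placeholders, not values from the exercise.

def pedigree_strength(scores, max_score=4):
    """Mean of the criterion scores, normalised to the 0-1 range."""
    return sum(scores) / (len(scores) * max_score)

# Hypothetical data: name -> ([proxy, empirical basis, rigour, validation],
#                             fraction of output variance from Monte Carlo)
parameters = {
    "parameter A": ([4, 3, 3, 2], 0.45),
    "parameter B": ([2, 1, 2, 1], 0.30),
    "parameter C": ([3, 3, 2, 2], 0.15),
}

for name, (scores, sensitivity) in parameters.items():
    # Horizontal axis: pedigree strength; vertical axis: sensitivity.
    print(f"{name}: strength={pedigree_strength(scores):.2f}, "
          f"sensitivity={sensitivity:.2f}")
```

These (strength, sensitivity) pairs can then be drawn by hand or with any plotting tool.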
Instructions:
 Do the Pedigree assessment in the role of an individual expert; we do not want a
group judgement.
 The group works on one parameter card at a time.
 Discussion on each card starts with a group discussion aimed at clarification of
concepts (you should all use the same definition/concept when assigning pedigree
scores to the parameters): what do you, as a group, understand by the parameter
on the card?
 Discuss each of the four pedigree criteria, one by one, in the group. As a group,
write down the arguments and possible disagreements on the answer sheet. Hand
these arguments in at the end of this exercise. Do not yet assign scores. After the
discussion, individually assign a score for each criterion and write them down on
the parameter-score card.
 If you feel you cannot judge one or more of the pedigree scores for a given
parameter, leave it blank.
 If you feel a certain criterion is not applicable for a given parameter, indicate so
with "n.a.".
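If, after the individual scoring, you want a quick summary of how the group's scores spread, the following sketch may help. It assumes blanks are recorded as None and non-applicable criteria as the string "n.a."; this encoding is an assumption, since the exercise itself uses paper score cards:

```python
# Sketch: summarising individual expert scores for one criterion.
# None stands for "left blank", "n.a." for not applicable (assumed encoding).

def aggregate(scores):
    """Median of the usable integer scores; None if no expert could judge."""
    usable = sorted(s for s in scores if isinstance(s, int))
    if not usable:
        return None
    mid = len(usable) // 2
    if len(usable) % 2:
        return usable[mid]
    return (usable[mid - 1] + usable[mid]) / 2

# Six experts scored "empirical basis" for one parameter;
# one left it blank and one marked it not applicable:
print(aggregate([3, 2, None, 3, "n.a.", 4]))  # median of {2, 3, 3, 4} -> 3.0
```

The median is used here rather than the mean so that a single outlying score does not dominate the summary; the choice of summary statistic is a design decision, not part of the NUSAP method itself.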
Table 1 Pedigree matrix for evaluating the tenability of a conceptual model

Score 4
 Supporting empirical evidence (proxy): exact measures of the modelled quantities
 Supporting empirical evidence (quality and quantity): controlled experiments and
large sample direct measurements
 Theoretical understanding: well established theory
 Representation of understood underlying mechanisms: model equations reflect high
mechanistic process detail
 Plausibility: highly plausible
 Colleague consensus: all but cranks

Score 3
 Supporting empirical evidence (proxy): good fits or measures of the modelled
quantities
 Supporting empirical evidence (quality and quantity): historical/field data,
uncontrolled experiments, small sample direct measurements
 Theoretical understanding: accepted theory with partial nature (in view of the
phenomenon it describes)
 Representation of understood underlying mechanisms: model equations reflect
acceptable mechanistic process detail
 Plausibility: reasonably plausible
 Colleague consensus: all but rebels

Score 2
 Supporting empirical evidence (proxy): well correlated but not measuring the same
thing
 Supporting empirical evidence (quality and quantity): modelled/derived data,
indirect measurements
 Theoretical understanding: accepted theory with partial nature and limited
consensus on reliability
 Representation of understood underlying mechanisms: aggregated parameterized
meta model
 Plausibility: somewhat plausible
 Colleague consensus: competing schools

Score 1
 Supporting empirical evidence (proxy): weak correlation but commonalities in
measure
 Supporting empirical evidence (quality and quantity): educated guesses, indirect
approximations, rule of thumb estimate
 Theoretical understanding: preliminary theory
 Representation of understood underlying mechanisms: grey box model
 Plausibility: not very plausible
 Colleague consensus: embryonic field

Score 0
 Supporting empirical evidence (proxy): not correlated and not clearly related
 Supporting empirical evidence (quality and quantity): crude speculation
 Theoretical understanding: crude speculation
 Representation of understood underlying mechanisms: black box model
 Plausibility: not at all plausible
 Colleague consensus: no opinion
Source: J.C. Refsgaard, J.P. van der Sluijs, J. Brown and P. van der Keur (2006), A Framework for Dealing with Uncertainty Due to Model Structure Error, Advances in Water Resources, 29 (11), 1586-1597.
Explanation of Pedigree criteria parameter pedigree:
Proxy
Sometimes it is not possible to represent directly the thing we are interested in by a
parameter so some form of proxy measure is used. Proxy refers to how good or close
a measure of the quantity that we model is to the actual quantity we represent. Think
of first order approximations, oversimplifications, idealisations, gaps in aggregation
levels, differences in definitions, non-representativeness, and incompleteness issues.
If the parameter were an exact measure of the quantity, it would score four on proxy.
If the parameter in the model is not clearly related to the phenomenon it represents,
the score would be zero.
Empirical basis
Empirical basis typically refers to the degree to which direct observations,
measurements and statistics are used to estimate the parameter. When the parameter is
based upon good quality observational data, the pedigree score will be high.
Sometimes directly observed data are not available and the parameter is estimated
based on partial measurements or calculated from other quantities. Parameters
determined by such indirect methods have a weaker empirical basis and will generally
score lower than those based on direct observations.
Methodological rigour
Some method will be used to collect, check, and revise the data used for making
parameter estimates. Methodological quality refers to the norms for methodological
rigour in this process applied by peers in the relevant disciplines. Well-established
and respected methods for measuring and processing the data would score high on
this metric, while untested or unreliable methods would tend to score lower.
Validation
This metric refers to the degree to which one has been able to cross-check the data
and assumptions used to produce the numeral of the parameter against independent
sources. When these have been compared with appropriate sets of independent data to
assess its reliability it will score high on this metric. In many cases, independent data
for the same parameter over the same time period are not available and other data sets
must be used for validation. This may require a compromise in the length or overlap
of the data sets, or may require use of a related, but different, proxy variable for
indirect validation, or perhaps use of data that has been aggregated on different scales.
The more indirect or incomplete the validation, the lower it will score on this metric.
Table 2 Pedigree matrix to assess parameter strength

Score 4
 Proxy: an exact measure of the desired quantity
 Empirical basis: controlled experiments and large sample direct measurements
 Methodological rigour: best available practice in well established discipline
 Validation: compared with independent measurements of the same variable over
long domain

Score 3
 Proxy: good fit or measure
 Empirical basis: historical/field data, uncontrolled experiments, small sample
direct measurements
 Methodological rigour: reliable method common within established discipline; best
available practice in immature discipline
 Validation: compared with independent measurements of closely related variable
over shorter period

Score 2
 Proxy: well correlated but not measuring the same thing
 Empirical basis: modelled/derived data, indirect measurements
 Methodological rigour: acceptable method but limited consensus on reliability
 Validation: measurements not independent; proxy variable; limited domain

Score 1
 Proxy: weak correlation but commonalities in measure
 Empirical basis: educated guesses, indirect approximations, rule of thumb estimate
 Methodological rigour: preliminary methods, unknown reliability
 Validation: weak and very indirect validation

Score 0
 Proxy: not correlated and not clearly related
 Empirical basis: crude speculation
 Methodological rigour: no discernible rigour
 Validation: no validation performed