UNCERTAINTY EXERCISE – NUSAP pedigree analysis

The NUSAP (Numeral, Unit, Spread, Assessment, Pedigree) method aims to provide an analysis and diagnosis of uncertainty. It captures both the quantitative and the qualitative dimensions of uncertainty. This exercise focuses on qualitative assessment: you will determine the pedigree scores for a number of parameters of a model or calculation scheme.

Pedigree conveys an evaluative account of the production process of the information, and indicates different aspects of the underpinning of the numbers and the scientific status of the knowledge used. Pedigree is expressed by means of a set of pedigree criteria that assess these different aspects. Examples of such criteria are empirical basis and degree of validation. These criteria are in effect yardsticks for strength.

Many of these criteria are hard to measure in an objective way, and assessment of pedigree involves qualitative expert judgement. To minimise arbitrariness and subjectivity in measuring strength, a pedigree matrix is used to code the qualitative expert judgement for each criterion into a discrete numeral scale from 0 (weak) to 4 (strong), with linguistic descriptions (modes) of each level on the scale (see the pedigree matrices in Tables 1 and 2). Note that these linguistic descriptions are mainly meant to provide guidance in attributing scores to each of the criteria for a given parameter. It is not possible to capture in a single phrase all aspects that an expert may consider when scoring a pedigree; a pedigree matrix should therefore be applied with some flexibility and creativity.
The pedigree criteria you will use to evaluate the underpinning of a model as a whole are:
- Supporting empirical evidence: proxy
- Supporting empirical evidence: quality and quantity
- Theoretical understanding (of the processes modelled)
- Representation of the understood underlying mechanisms
- Plausibility
- Colleague consensus

You will also assess the pedigree of individual model parameters, using the following pedigree criteria:
- Proxy
- Empirical basis
- Methodological rigour
- Validation

ASSIGNMENT: For this assignment, you will work in groups of about 6 persons.
- Choose one of the models on which you performed a Monte Carlo analysis.
- Apply the "Pedigree matrix for evaluating the tenability of a conceptual model" to the model.
- For the 5 most sensitive parameters, apply the pedigree matrix for parameters.
- Construct a diagnostic diagram in which you plot the 5 parameters.

Instructions: Do the pedigree assessment in the role of an individual expert; we do not want a group judgement. The group works on one parameter card at a time. Discussion of each card starts with a group discussion aimed at clarification of concepts (you should all use the same definition/concept when assigning pedigree scores to the parameters): what do you, as a group, understand by the parameter on the card? Discuss each of the four pedigree criteria, one by one, in the group. As a group, write down the arguments and possible disagreements on the answer sheet. Hand these arguments in at the end of this exercise. Do not yet assign scores. After the discussion, individually assign a score for each criterion and write the scores down on the parameter score card. If you feel you cannot judge one or more of the pedigree scores for a given parameter, leave it blank. If you feel a certain criterion is not applicable for a given parameter, indicate so with "n.a.".
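When you later summarise the score cards for the diagnostic diagram, a common way to express a parameter's strength is to average the available criterion scores and normalise the result to the interval [0, 1]. The sketch below illustrates this aggregation step; the function name and the data layout (blank scores recorded as None, non-applicable criteria as "n.a.") are illustrative assumptions, not part of the exercise handout.

```python
# Minimal sketch of aggregating one expert's pedigree score card for a
# parameter. Pedigree scores run from 0 (weak) to 4 (strong); blanks (None)
# and non-applicable criteria ("n.a.") are excluded from the average.

def parameter_strength(scores):
    """Mean of the available pedigree scores, mapped onto [0, 1]."""
    valid = [s for s in scores if isinstance(s, (int, float))]
    if not valid:
        return None  # no judgeable criteria for this parameter
    return sum(valid) / len(valid) / 4.0

# Hypothetical score card: Proxy = 3, Empirical basis = 2,
# Methodological rigour left blank, Validation marked "n.a."
card = [3, 2, None, "n.a."]
strength = parameter_strength(card)  # (3 + 2) / 2 / 4 = 0.625
```

Whether to average over blanks in this way, or to treat a blank as a warning flag in its own right, is a choice your group should discuss and note on the answer sheet.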
Table 1 Pedigree matrix for evaluating the tenability of a conceptual model

Score 4:
- Supporting empirical evidence (proxy): Exact measures of the modelled quantities
- Supporting empirical evidence (quality and quantity): Controlled experiments and large sample direct measurements
- Theoretical understanding: Well established theory
- Representation of understood underlying mechanisms: Model equations reflect high mechanistic process detail
- Plausibility: Highly plausible
- Colleague consensus: All but cranks

Score 3:
- Supporting empirical evidence (proxy): Good fits or measures of the modelled quantities
- Supporting empirical evidence (quality and quantity): Historical/field data, uncontrolled experiments, small sample direct measurements
- Theoretical understanding: Accepted theory with partial nature (in view of the phenomenon it describes)
- Representation of understood underlying mechanisms: Model equations reflect acceptable mechanistic process detail
- Plausibility: Reasonably plausible
- Colleague consensus: All but rebels

Score 2:
- Supporting empirical evidence (proxy): Well correlated but not measuring the same thing
- Supporting empirical evidence (quality and quantity): Modelled/derived data, indirect measurements
- Theoretical understanding: Accepted theory with partial nature and limited consensus on reliability
- Representation of understood underlying mechanisms: Aggregated parameterized meta model
- Plausibility: Somewhat plausible
- Colleague consensus: Competing schools

Score 1:
- Supporting empirical evidence (proxy): Weak correlation but commonalities in measure
- Supporting empirical evidence (quality and quantity): Educated guesses, indirect approximations, rule-of-thumb estimates
- Theoretical understanding: Preliminary theory
- Representation of understood underlying mechanisms: Grey box model
- Plausibility: Not very plausible
- Colleague consensus: Embryonic field

Score 0:
- Supporting empirical evidence (proxy): Not correlated and not clearly related
- Supporting empirical evidence (quality and quantity): Crude speculation
- Theoretical understanding: Crude speculation
- Representation of understood underlying mechanisms: Black box model
- Plausibility: Not at all plausible
- Colleague consensus: No opinion

Source: J.C. Refsgaard, J.P. van der Sluijs, J. Brown and P. van der Keur (2006), A Framework for Dealing with Uncertainty Due to Model Structure Error, Advances in Water Resources, 29 (11), 1586-1597.

Explanation of the parameter pedigree criteria:

Proxy
Sometimes it is not possible to represent directly the quantity we are interested in by a parameter, so some form of proxy measure is used. Proxy refers to how good or close the measure of the quantity that we model is to the actual quantity we represent. Think of first-order approximations, oversimplifications, idealisations, gaps in aggregation levels, differences in definitions, non-representativeness, and incompleteness issues.
If the parameter were an exact measure of the quantity, it would score four on proxy. If the parameter in the model is not clearly related to the phenomenon it represents, the score would be zero.

Empirical basis
Empirical basis typically refers to the degree to which direct observations, measurements and statistics are used to estimate the parameter. When the parameter is based upon good-quality observational data, the pedigree score will be high. Sometimes directly observed data are not available and the parameter is estimated based on partial measurements or calculated from other quantities. Parameters determined by such indirect methods have a weaker empirical basis and will generally score lower than those based on direct observations.

Methodological rigour
Some method will be used to collect, check, and revise the data used for making parameter estimates. Methodological rigour refers to the norms for methodological quality in this process applied by peers in the relevant disciplines. Well-established and respected methods for measuring and processing the data would score high on this criterion, while untested or unreliable methods would tend to score lower.

Validation
This criterion refers to the degree to which one has been able to cross-check the data and assumptions used to produce the numeral of the parameter against independent sources. When these have been compared with appropriate sets of independent data to assess their reliability, the parameter will score high on this criterion. In many cases, independent data for the same parameter over the same time period are not available and other data sets must be used for validation. This may require a compromise in the length or overlap of the data sets, or may require use of a related but different proxy variable for indirect validation, or perhaps use of data that have been aggregated on different scales. The more indirect or incomplete the validation, the lower the parameter will score on this criterion.
Table 2 Pedigree matrix to assess parameter strength

Score 4:
- Proxy: An exact measure of the desired quantity
- Empirical basis: Controlled experiments and large sample direct measurements
- Methodological rigour: Best available practice in well established discipline
- Validation: Compared with independent measurements of the same variable over long domain

Score 3:
- Proxy: Good fit or measure
- Empirical basis: Historical/field data, uncontrolled experiments, small sample direct measurements
- Methodological rigour: Reliable method common within established discipline; best available practice in immature discipline
- Validation: Compared with independent measurements of closely related variable over shorter period

Score 2:
- Proxy: Well correlated but not measuring the same thing
- Empirical basis: Modelled/derived data, indirect measurements
- Methodological rigour: Acceptable method but limited consensus on reliability
- Validation: Measurements not independent, proxy variable, limited domain

Score 1:
- Proxy: Weak correlation but commonalities in measure
- Empirical basis: Educated guesses, indirect approximations, rule-of-thumb estimates
- Methodological rigour: Preliminary methods, unknown reliability
- Validation: Weak and very indirect validation

Score 0:
- Proxy: Not correlated and not clearly related
- Empirical basis: Crude speculation
- Methodological rigour: No discernible rigour
- Validation: No validation performed
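In the diagnostic diagram, each parameter is plotted with its pedigree strength on one axis and its sensitivity (for example, its relative contribution to the output variance found in the Monte Carlo analysis) on the other: parameters that are both sensitive and weakly underpinned deserve the most attention. The sketch below classifies parameters into the diagram's quadrants; the function name and the cut-off values of 0.5 are illustrative assumptions, not prescribed by the method.

```python
# Sketch of placing parameters in a diagnostic diagram.
# strength: mean pedigree score normalised to [0, 1] (x-axis).
# sensitivity: relative contribution to output variance in [0, 1] (y-axis).
# The 0.5 cut-offs are illustrative; choose your own when drawing the diagram.

def diagram_quadrant(strength, sensitivity, s_cut=0.5, v_cut=0.5):
    """Classify one parameter into a quadrant of the diagnostic diagram."""
    if sensitivity >= v_cut and strength < s_cut:
        return "danger zone: sensitive and weakly underpinned"
    if sensitivity >= v_cut:
        return "sensitive but well underpinned"
    if strength < s_cut:
        return "weakly underpinned but insensitive"
    return "safe: insensitive and well underpinned"

# A parameter with low pedigree strength but high sensitivity:
print(diagram_quadrant(strength=0.3, sensitivity=0.7))
```

In your write-up, the quadrant labels matter less than the arguments behind the scores: record why each parameter lands where it does.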