Elicitation of Expert Opinion as a Prior Distribution

An Overview of Elicitation Methods
and Software
Paul Garthwaite
Open University, UK
(Joint work with Fadlalla Elfadaly)
Here, the focus is on elicitation methods that help
an expert quantify his/her opinions as a probability
distribution.
There is elicitation software for many other
purposes. For example:
• to aid collaboration via brainstorming;
• to develop a conceptual model via mind maps;
• to evaluate influence diagrams;
• to construct qualitative probabilistic networks
(qualitative forms of Bayesian belief networks);
• to aggregate assessments of individual experts
in one combined probability density function
[Excalibur: acronym for Expert CALIBRation,
(Cooke & Solomatine, 1992)].
A useful review of much elicitation software is given in:
Software to support expert elicitation
Devilee & Knol (2011).
http://www.rivm.nl/bibliotheek/rapporten/630003001.pdf
They consider:
- ways software programs may support elicitation;
- existing software programs that could support elicitation;
- the functionalities of these existing software projects.
Why use expert opinion? Can’t opinion
convey bias as well as knowledge?
• The expert may want to quantify his/her opinions
for his/her own use, e.g. an industrial chemist wanting
to design experiments.
• Available data may not be suitable as a statistical
input, e.g. sightings of a rare and endangered
species.
• Data may be absent, so that using expert knowledge is
essential.
Bayesian statistics provides a mechanism for
incorporating expert opinion into a statistical
analysis.
Basic strategy for quantifying opinion
• We have a model for the problem in hand.
• Examples: regression models and multinomial models.
• The model has parameters, such as a and b in the
regression model y = a + bx.
• We want to quantify opinion about a and b.
• We give a and b a distribution: typically we would assume
that opinion about them can be represented by a
multivariate normal distribution:
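\[
\begin{pmatrix} a \\ b \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \right).
\]
The expert answers questions that determine $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$ and $\sigma_{12}$.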
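Because experts are better questioned about the response than about the coefficients, it helps to see what a bivariate normal prior on (a, b) implies for the regression line at a chosen x. The sketch below is illustrative only: the function name and all numerical values are hypothetical, and `cov12` stands for the prior covariance between a and b.

```python
from statistics import NormalDist


def implied_response(mu1, mu2, var1, var2, cov12, x):
    """Prior distribution of the regression line a + b*x at a given x,
    assuming (a, b) is bivariate normal with the given hyperparameters."""
    mean = mu1 + mu2 * x
    # Var(a + b*x) = Var(a) + 2*x*Cov(a, b) + x^2 * Var(b)
    variance = var1 + 2.0 * x * cov12 + x * x * var2
    return NormalDist(mean, variance ** 0.5)


# Hypothetical hyperparameter values, chosen only for illustration.
line_at_2 = implied_response(mu1=1.0, mu2=0.5, var1=0.04, var2=0.01,
                             cov12=-0.01, x=2.0)

# Median and quartiles of the line at x = 2: quantities of the kind an
# expert could be asked to assess directly.
lower_q, median, upper_q = (line_at_2.inv_cdf(p) for p in (0.25, 0.5, 0.75))
print(lower_q, median, upper_q)
```

Running this in reverse, i.e. choosing the hyperparameters so that the implied medians and quartiles match the expert's assessments at several values of x, is the general idea; the actual fitting procedure is not shown here.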
The expert must be asked questions they can understand and answer
meaningfully.
Much psychological research has examined people's ability to
act as intuitive statisticians.
We are good at giving point estimates (especially medians and
modes; means tend to be biased when distributions are skewed).
We are not good at making direct assessments of variances and
correlations.
In general, experts should not be asked to assess parameter values
but should be asked about observable quantities. (e.g. in regression,
questions should be phrased in terms of the response (Y), rather than
in terms of regression coefficients.)
Visual assessment tasks may be easier for an expert to perform.
To quantify uncertainty, many elicitation
methods make extensive use of assessments of
lower and upper quartiles.
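• This asks the expert to specify his/her median
and lower and upper quartiles; they can be
assessed by the method of bisection: the median M
splits the range into two equally likely halves, and
the quartiles L and U split each half again. Suppose
opinion about a probability is needed.
[Figure: the interval from 0 to 1 is divided by L, M and U into four
segments, each with probability 25%; axis marks at 0, 0.3 and 1.0.]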
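As a rough illustration of how a median and two quartiles for a probability can be turned into a distribution, the sketch below matches a beta distribution by moments. This is a crude stand-in, not the fitting method used by any software mentioned in this talk: it treats the median as the mean and converts the interquartile range to a standard deviation using the normal-distribution factor 1.349. The function name and the assessed values are hypothetical.

```python
def beta_from_quartiles(lower_q, median, upper_q):
    """Approximate beta parameters (a, b) from an expert's quartiles.

    Assumption: the median is used as the mean, and the interquartile
    range is converted to a standard deviation as if the distribution
    were normal (IQR = 1.349 * sd). Reasonable only for roughly
    symmetric opinions.
    """
    if not 0.0 < lower_q < median < upper_q < 1.0:
        raise ValueError("need 0 < L < M < U < 1")
    mean = median
    sd = (upper_q - lower_q) / 1.349
    # For Beta(a, b): mean = a / (a + b), var = mean * (1 - mean) / (a + b + 1)
    concentration = mean * (1.0 - mean) / sd ** 2 - 1.0
    if concentration <= 0.0:
        raise ValueError("quartiles are too far apart for a beta distribution")
    return mean * concentration, (1.0 - mean) * concentration


# Hypothetical assessments: L = 0.2, M = 0.3, U = 0.4.
a, b = beta_from_quartiles(0.2, 0.3, 0.4)
print(a, b)
```

A more careful fit would choose a and b numerically so that the beta quartiles themselves match the assessments.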
Elicitation of opinion for a multinomial model
In the multinomial sampling model there are a
number of categories.
Each observation will be in exactly one category and
expert opinion must:
• provide an estimate of the probability of each
category
• quantify the accuracy of the estimates.
Model to represent expert’s opinion: the simplest
conjugate prior distribution is the Dirichlet
distribution.
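Conjugacy means that multinomial data update a Dirichlet prior by simple addition: each hyperparameter is increased by the count observed in its category. A minimal sketch, with hypothetical function names, prior values and counts:

```python
def dirichlet_update(prior_a, counts):
    """Posterior Dirichlet hyperparameters after multinomial counts."""
    if len(prior_a) != len(counts):
        raise ValueError("one count is needed per category")
    return [a + n for a, n in zip(prior_a, counts)]


def dirichlet_mean(a):
    """Expected category probabilities under a Dirichlet distribution."""
    total = sum(a)
    return [ai / total for ai in a]


# Hypothetical elicited prior for four categories, and hypothetical data.
prior = [6.0, 2.5, 1.0, 0.5]
counts = [50, 30, 15, 5]

posterior = dirichlet_update(prior, counts)
print(dirichlet_mean(prior))
print(dirichlet_mean(posterior))
```

The sum of the prior hyperparameters (10 here) acts like a prior sample size, so it controls how strongly the expert's opinion is weighted against the data.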
Example: Misclassification rates of BMI
• A person in Malta gives their height and weight in
a questionnaire and their calculated BMI is in the
normal range.
• Their true BMI is in one of the four categories:
normal; overweight; obese; underweight.
• We want to question an expert to assess the
probabilities that the person’s true BMI is in each
of these categories.
[Figure: Median assessment of p for the first category.]
[Figure: Assessing the probability of overweight, given that the
red bar is correct.]
[Figure: Assessing the probability for obese, conditional on the red
categories being correct. (The probability for the yellow category
follows automatically.)]
[Figure: The short blue lines are the expert’s lower and upper quartile
assessments for the first category. The inset shows the
probability density function for the first probability.]
[Figure: Blue lines are the expert’s quartile assessments for the
probability of overweight, conditional on 0.60 being the
probability of normal.]
[Figure: Quartile assessments for obese (also giving those for
underweight).]
Marginal distributions are shown to the expert as feedback.
The expert can make modifications if (s)he wishes.
When there are k categories, the Dirichlet prior distribution has
the form
\[
f(p_1, \ldots, p_k) = \frac{\Gamma(N)}{\Gamma(a_1) \cdots \Gamma(a_k)} \, p_1^{a_1 - 1} \cdots p_k^{a_k - 1}, \qquad N = a_1 + \cdots + a_k .
\]
Thus, opinion about k parameters $(p_1, \ldots, p_k)$ is
represented by just k hyperparameters ($a_1, \ldots, a_{k-1}$ and $N$).
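A more flexible model for the expert’s opinion is the Connor-Mosimann distribution:
\[
f(p_1, \ldots, p_k) = \left[ \prod_{i=1}^{k-1} \frac{\Gamma(a_i + b_i)}{\Gamma(a_i)\,\Gamma(b_i)} \, p_i^{a_i - 1} \Bigl( \sum_{j=i}^{k} p_j \Bigr)^{b_{i-1} - (a_i + b_i)} \right] p_k^{b_{k-1} - 1}.
\]
(For i = 1 the sum equals 1, so $b_0$ plays no role.)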
This has 2(k – 1) hyperparameters. It can be determined from
the same assessments that give the Dirichlet distribution.
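The Connor-Mosimann distribution can be built by "stick-breaking": the first probability is a beta variable, and each later one is a beta-distributed share of whatever probability remains. This matches the conditional style of the assessments, and the Dirichlet is the special case in which each b equals the sum of the later Dirichlet hyperparameters. The sketch below uses hypothetical function names and hyperparameter values.

```python
import random


def sample_connor_mosimann(a, b, rng):
    """Draw (p_1, ..., p_k) given k-1 pairs (a_i, b_i) by stick-breaking."""
    remaining = 1.0
    p = []
    for ai, bi in zip(a, b):
        z = rng.betavariate(ai, bi)   # share of the remaining probability
        p.append(z * remaining)
        remaining *= 1.0 - z
    p.append(remaining)               # the last category takes what is left
    return p


def dirichlet_as_connor_mosimann(alpha):
    """Connor-Mosimann hyperparameters that reproduce a Dirichlet(alpha)."""
    a = list(alpha[:-1])
    b = [sum(alpha[i + 1:]) for i in range(len(alpha) - 1)]
    return a, b


# Hypothetical Dirichlet hyperparameters for k = 4 categories.
a, b = dirichlet_as_connor_mosimann([6.0, 2.5, 1.0, 0.5])

rng = random.Random(0)
draw = sample_connor_mosimann(a, b, rng)
print(a, b, draw)
```

Choosing the b values freely, instead of tying them to the a values as above, gives the extra k - 2 hyperparameters that make the Connor-Mosimann distribution more flexible.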
SHELF (Sheffield Elicitation Framework)
O’Hagan and Oakley have software to carry out
elicitation of probability distributions (aimed particularly
at quantifying uncertainty from a group of experts).
The univariate distributions it uses to model opinion are:
Normal, Student t, scaled beta, gamma, log-normal and
log Student-t.
An extension quantifies opinion about a multinomial
distribution by first eliciting a marginal (beta) distribution
for the probability of each category, and then reconciling
them to form a Dirichlet distribution.
Offers a choice of assessment tasks for quantifying
probabilities: quartiles, tertiles, fixed interval, roulette.
Gaussian copulas
The Connor-Mosimann distribution is more flexible than the
Dirichlet, but greater flexibility can be obtained by eliciting
marginal beta distributions and then using a Gaussian copula to
tie the beta distributions together.
Elfadaly and I have developed a method for this and
implemented it in free software.
Adding Covariates to the Multinomial Model
Clearly desirable – the probability that an overweight person
says their weight is normal may depend on age and gender, for
example.
One way to do this is to take the multinomial logistic model as
the sampling model.
The multinomial logistic model gives the following
probability that a person with covariate values x falls in
category i.
\[
p_i(x) =
\begin{cases}
\dfrac{1}{1 + \sum_{j=2}^{k} \exp(\alpha_j + x'\beta_j)}, & i = 1, \\[2ex]
\dfrac{\exp(\alpha_i + x'\beta_i)}{1 + \sum_{j=2}^{k} \exp(\alpha_j + x'\beta_j)}, & i = 2, \ldots, k.
\end{cases}
\]
The prior distribution gives the α and β coefficients a
(singular) multivariate normal distribution. It is singular
because, for any given x, the expected values of the
probabilities must sum to 1.
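The multinomial logistic probabilities can be computed directly from the coefficients. In the sketch below the first category is the reference category, as in the formula for i = 1; the function name and the coefficient and covariate values are hypothetical.

```python
import math


def category_probabilities(x, alphas, betas):
    """Multinomial logistic probabilities for categories 1..k.

    alphas[j] and betas[j] hold the intercept and slope vector for
    categories 2..k; category 1 is the reference category.
    """
    linear = [alpha + sum(xi * bi for xi, bi in zip(x, beta))
              for alpha, beta in zip(alphas, betas)]
    denominator = 1.0 + sum(math.exp(v) for v in linear)
    return [1.0 / denominator] + [math.exp(v) / denominator for v in linear]


# Hypothetical covariates (age in decades, indicator for female) and
# hypothetical coefficients for k = 4 categories.
x = [3.0, 1.0]
alphas = [0.5, -0.2, -1.0]
betas = [[0.1, -0.3], [0.2, 0.1], [0.05, 0.4]]

probs = category_probabilities(x, alphas, betas)
print(probs)
```

Whatever the coefficients, the k probabilities are positive and sum to 1 for every x.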
Elicit opinion for one subpopulation at a time:

                        Underweight  Normal  Overweight  Obese
(i)   Man aged 30       p1(x)        p2(x)   p3(x)       p4(x)
(ii)  Woman aged 30     p1(x)        p2(x)   p3(x)       p4(x)
(iii) Man aged 60       p1(x)        p2(x)   p3(x)       p4(x)

Eliciting the prior for the separate blocks is tricky; combining
the information is quite easy.
Generalised Linear Models
We have developed methods for quantifying opinion about
generalised linear models (GLMs).
Can specify: ordinary linear regression, logistic regression,
Poisson regression;
Or the sampling distribution and link function may be
specified as:
Distribution: normal, Poisson, binomial, gamma, inverse-normal,
negative binomial, Bernoulli, geometric, exponential.
Link function: canonical, identity, logarithm, logit,
reciprocal, square-root, probit, log-log, complementary
log-log, power, log-ratio, user specified.
The prior model gives the regression coefficients a
multivariate normal distribution.
Opinion is modelled by piece-wise linear models to
add flexibility.
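A piecewise-linear relationship is fixed by its values at a set of knots, so one plausible way to picture the model is as straight-line interpolation between assessments made at chosen covariate values. The sketch below assumes the expert has given a median response at each knot; the function name, knots and medians are hypothetical, and this is not a description of the PEGS implementation.

```python
from bisect import bisect_right


def piecewise_linear(knots, medians):
    """Return a function that interpolates linearly between knots."""
    if len(knots) != len(medians) or len(knots) < 2:
        raise ValueError("need at least two knots, one median per knot")
    if any(k2 <= k1 for k1, k2 in zip(knots, knots[1:])):
        raise ValueError("knots must be strictly increasing")

    def predict(x):
        if not knots[0] <= x <= knots[-1]:
            raise ValueError("x lies outside the range of the knots")
        # Index of the right-hand knot of the segment containing x.
        i = min(bisect_right(knots, x), len(knots) - 1)
        x0, x1 = knots[i - 1], knots[i]
        y0, y1 = medians[i - 1], medians[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return predict


# Hypothetical knots (e.g. ages) and assessed medians of the response.
predict = piecewise_linear([20.0, 40.0, 60.0, 80.0], [10.0, 14.0, 15.0, 13.0])
print(predict(30.0), predict(40.0), predict(70.0))
```

The slopes of the segments play the role of regression coefficients, which is where the multivariate normal prior enters.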
[Figure: the type of graph used for assessing medians for a
continuous variable.]
[Figure: the type of bar-chart formed for a factor.]
[Figure: graph for eliciting conditional quartiles.]
PEGS (Probability Elicitation Graphical Software)
http://statistics.open.ac.uk/elicitation
Multinomial distribution.
Separate programs (and a single combined program) elicit:
• Dirichlet and Connor-Mosimann priors.
• Dirichlet and Gaussian copula priors.
• MVN prior for multinomial logistic model.
Piecewise-linear GLMs
Program that elicits an MVN prior also quantifies opinion about:
• The error variance in a normal linear model.
• The scale parameter in a gamma GLM.
These are also available in separate stand-alone programs.
SHELF (O’Hagan, Oakley et al., Sheffield). As mentioned earlier,
the SHELF software elicits univariate distributions and a Dirichlet prior.
The software also includes a web-based tool for eliciting
probability distributions: users can log in from different sites
and they can all see and interact with the same graphics.
The Elicitator (Cornford, Aston). Problem owners define an
elicitation problem and invite experts to participate; the experts
subsequently log in to the website and complete a list of
questions that make up the elicitation process.
Elicitator (James, Low Choy, Mengersen, Queensland Univ.
Tech). This tool quantifies expert opinion about regression
problems in ecology. Opinion at different geographical locations
can be elicited (as in Denham and Mengersen, 2007), so as to
define covariate values.
Reviews
Cooke, R.M. (1991). Experts in Uncertainty: Opinion and Subjective
Probability in Science. (Oxford University Press, New York).
Garthwaite, P.H., Kadane, J.B. and O’Hagan, A. (2005). Statistical
methods for eliciting probability distributions. J. Amer. Statist. Ass.,
100, 680–701.
Hogarth, R. M. (1975). Cognitive Processes and the Assessment of
Subjective Probability Distributions. J. Amer. Statist. Ass., 70, 271–294.
Kynn, M. (2008). The “heuristics and biases” bias in expert
elicitation. J. R. Statist. Soc. A, 171, 239–264.
O’Hagan, A., Buck, C., Daneshkhah, A., Eiser, J., Garthwaite, P.,
Jenkinson, D., Oakley, J. and Rakow, T. (2006). Uncertain Judgements:
Eliciting Experts' Probabilities (Wiley, Chichester).
Peterson, C. R., and Beach, L. R. (1967). Man as an Intuitive
Statistician. Psychological Bulletin, 68, 29–46.
Tversky, A. and Kahneman, D. (1974). Judgment under uncertainty:
heuristics and biases. Science, 185, 1124–1131.
PEGS References
Al-Awadhi, S A and Garthwaite, P H. (2006). Quantifying expert opinion for
modelling fauna habitat distributions. Computational Statist., 21, 121-140.
Garthwaite, P H, Chilcott, J B, Jenkinson, D J and Tappenden, P. (2008). Use
of expert knowledge in evaluating costs and benefits of alternative service
provision: A case study. Int. J. Technol. Assess. Health Care, 24, 350-357.
Garthwaite, P H, Alawadhi, A S, Elfadaly, F and Jenkinson, D J. (2013).
Quantifying subjective opinion about generalized linear and piecewise-linear
models. J. Applied Statist., 40, 59-75.
Elfadaly, F G and Garthwaite, P H. Eliciting Dirichlet and Connor-Mosimann
prior distributions for multinomial models. Test, in press.
Elfadaly, F G and Garthwaite, P H. Eliciting Dirichlet and Gaussian copula
prior distributions for multinomial models. Submitted.
Elfadaly, F G and Garthwaite, P H. Eliciting prior distributions for extra
parameters in some generalised linear models. Submitted.
Other References
Cooke, R. and Solomatine, D., "EXCALIBR – software package for expert
data evaluation and fusion and reliability assessment" report to the
Commission of the European Communities, Delft, 1990.
Devilee, J L A and Knol, A B. (2011). Software to support expert elicitation:
An exploratory study of existing software packages. RIVM Letter report
630003001 (National Institute for Public Health and the Environment,
Netherlands).