Prior Distribution Elicitation for
Generalized Linear and Piecewise-Linear
Models
Paul Garthwaite and
Fadlalla Elfadaly
Open University
Why piecewise-linear models?
• Initial motivation for this model came from the
need to model ecologists’ opinion about the
presence/absence of rare and endangered animals.
• (A good example of where expert opinion is useful:
the ecologists had sightings of rare species, but the
data were not drawn from a sampling frame and were hence
hard to incorporate in a statistical analysis.)
• For most variables there was an optimum value for
a species. E.g. too hot or too cold did not suit it;
nor too wet or too dry, etc.
Sampling Model
Logistic model:
y = ln( p/(1-p)) = β0 + β1 x1 + …+ βk xk .
GLM:
y = g(μ) = β0 + β1 x1 + …+ βk xk .
Strategy: Elicit quantiles of p or μ and transform the assessments to
quantiles of y.
Prior model: β ~ multivariate normal.
Three software implementations of the method:
Garthwaite (1998: Visual Basic)
Kynn (2004: Pascal. Elicitor)
Elfadaly, Jenkinson, Garthwaite and Laney (2007/9: Java)
(The programs of Garthwaite and Kynn only handle logistic
regression.)
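The strategy of eliciting quantiles of p and transforming them to quantiles of y can be sketched as follows. This is a minimal illustration for the logistic case only; the function names and the three assessed values are hypothetical and are not taken from any of the programs listed above. It relies on the fact that a monotone increasing link maps the q-th quantile of p to the q-th quantile of y.

```python
import math


def logit(p):
    """Logit link: the log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-y))


def quantiles_to_link_scale(p_quantiles, link=logit):
    """Map elicited quantiles of p to the matching quantiles of y = g(p).

    Valid for any monotone increasing link, because such a transformation
    preserves the ordering (and hence the quantiles) of the distribution.
    """
    return [link(p) for p in p_quantiles]


# Hypothetical assessments: lower quartile, median and upper quartile of p.
elicited_p = [0.2, 0.3, 0.45]
elicited_y = quantiles_to_link_scale(elicited_p)
```

For a GLM with a different link, the same function could be called with another monotone `link` argument.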
Assessments at reference point
Scene-setting questions determine the number of variables and
factors, their ranges and also a reference point.
The reference point is chosen by the expert and gives the origin of
variables and the reference level of factors.
For a continuous variable it is assumed that opinion about slopes on
one side of the reference point is independent of opinion about
slopes on the other side.
With the methods of Garthwaite and Elfadaly et al., median, lower
and upper quartiles of the response at the reference point are
assessed.
Lower and upper quartiles have the advantage
that they can be assessed by the method of
bisection.
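[Figure: a probability scale from 0 to 1.0, with 0.3 marked, divided by the
lower quartile (L), median (M) and upper quartile (U) into four intervals of
25% probability each.]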
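One way a median and quartiles at the reference point might be turned into a normal distribution on the y scale is sketched below. The matching rule (mean set to the median, standard deviation set from the interquartile range) is an assumption made for illustration; the slides do not state which rule the software uses, and the assessed values are hypothetical.

```python
import math
from statistics import NormalDist


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def normal_from_quartiles(lower, median, upper):
    """Return (mean, sd) of a normal with the given median and interquartile range.

    Assumed rule: mean = median and sd = IQR / (2 * z_0.75), where
    z_0.75 is the 75% point of the standard normal (about 0.6745).
    """
    z75 = NormalDist().inv_cdf(0.75)
    sd = (upper - lower) / (2.0 * z75)
    return median, sd


# Hypothetical quartiles of p at the reference point, moved to the logit scale.
lower_y, median_y, upper_y = (logit(p) for p in (0.2, 0.3, 0.45))
mean, sd = normal_from_quartiles(lower_y, median_y, upper_y)
fitted = NormalDist(mean, sd)
```

When the assessed quartiles are not symmetric about the median on the y scale, the fitted normal reproduces the interquartile range but not each quartile individually.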
Elicitor is much more flexible.
For assessing the median, some techniques that can be used
with logistic regression are available to the expert:
Visual aids such as a probability wheel can be used.
Probabilities can be given by first stating a (large) sample size
and then assessing the number in that sample with the
characteristic of interest.
Scales marked in odds or log-odds can also be used.
For credible intervals, intervals other than 50% intervals can
be specified and a form of fixed interval method is also
advocated.
Median Assessments
Medians are assessed for one covariate at a time.
The expert is asked to assume that all other covariates are at
their reference values and to consider how the response
varies with the covariate of current interest.
The expert clicks on a graph to draw a curve for covariates or
a bar chart for factors.
(This is a poor approach to designing experiments but has
clear benefits when eliciting expert opinion.)
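A curve drawn by clicking on a graph can be represented as a piecewise-linear function through the clicked points. The sketch below is an illustration only: the knot positions and median values are hypothetical, and the function is not taken from any of the elicitation programs.

```python
from bisect import bisect_right


def piecewise_linear(knots, values, x):
    """Linearly interpolate between (knot, value) pairs.

    knots must be strictly increasing and x must lie within their range.
    """
    if not knots[0] <= x <= knots[-1]:
        raise ValueError("x lies outside the range of the knots")
    # Index of the right-hand knot of the segment containing x.
    i = min(bisect_right(knots, x), len(knots) - 1)
    x0, x1 = knots[i - 1], knots[i]
    y0, y1 = values[i - 1], values[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical medians of y (log-odds) for a temperature covariate, with the
# other covariates held at their reference values. The curve rises to an
# optimum and then falls, as described for the ecological examples.
temperature_knots = [10.0, 20.0, 25.0, 35.0]
median_log_odds = [-3.0, -1.0, -0.85, -2.5]

y_at_15 = piecewise_linear(temperature_knots, median_log_odds, 15.0)
y_at_30 = piecewise_linear(temperature_knots, median_log_odds, 30.0)
```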
• The number of knots does not seem crucial.
• Elicitor gives the option of fitting a linear or quadratic
function to the medians.
• Garthwaite (1998) gave the option of superimposing graphs to
help improve the expert’s internal consistency across
covariates.
(In forming models we almost always adopt linear
relationships as the building blocks. Elicited piecewise
linear relationships could instead be used as the building
blocks.)
Feedback
• Feedback is generally beneficial.
• Useful to display the median estimate at other design
points, other than those points where all but one of the
covariates are at their reference values.
• Mason (2008) used Elicitor to question an expert about
non-random non-response in a longitudinal survey.
• The reference point was the setting with the best response rate. The
worst-case setting of the covariates gave a response rate of only 1%.
The expert revised his median assessments and the worst-case
response rate increased to 9%, which the expert still
thought was too low.
• The response-rate rapidly diminishes as probabilities are
multiplied.
• We intend to add this feedback option to the software.
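Feedback at a design point where several covariates are away from their reference values could be computed along the following lines. This is a sketch for the logistic case, assuming effects are additive on the y scale, as the linear predictor implies; the probabilities are hypothetical and are not Mason's figures.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-y))


def median_at_design_point(y_ref, single_covariate_medians):
    """Combine medians that were elicited one covariate at a time.

    single_covariate_medians[j] is the median of y when covariate j is at
    the chosen value and every other covariate is at its reference value.
    The deviations from y_ref are added, following the linear predictor.
    """
    return y_ref + sum(y_j - y_ref for y_j in single_covariate_medians)


# Hypothetical assessments: 0.8 at the reference point, and 0.5, 0.4 and 0.3
# when each of three covariates is moved, on its own, to its worst setting.
y_ref = logit(0.8)
worst_each = [logit(0.5), logit(0.4), logit(0.3)]
p_worst = inv_logit(median_at_design_point(y_ref, worst_each))
```

In this illustration the combined worst case is far below any of the single-covariate assessments, which is the kind of consequence that feedback is meant to bring to the expert's attention.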
Examples
• O’Leary et al. (2009a) give an example where
two experts assessed the probability of
presence/absence for the brush-tailed rock-wallaby using Elicitor.
• Only two covariates:
(i) Aspect (northerly vs other)
(ii) Slope (0° to 90°).
• O’Leary et al. (2009b) also give an example
where presence/absence for this wallaby is
assessed – this time by only one expert but
using four different methods, with aspect as
the only covariate.
• Data: presence at 41 sites and absence at 9
(rare species? pest?)
Assessments of the two experts (O’Leary et al., 2009a)
Classification rates of four methods (O’Leary et al., 2009b)

Method                Predicted   Observed present   Observed absent
Elicitor              present            41                  9
                      absent              0                  0
Map-method            present             0                  1
                      absent             41                  8
Questionnaire         present            41                  9
                      absent              0                  0
Classification tree   present            35                  1
                      absent              6                  8

Kynn (2004) gives five case studies conducted during the
development of Elicitor where ecologists used it to
quantify their opinions about an endangered species. Two
of the studies had sample data with which to evaluate
models.
Ground parrot
137 presences and 438 pseudo-absences.
80% of the data was used to fit models and 20% for testing.
Two continuous covariates, a factor with three levels and a
second factor with four levels.
Three models were considered:
(a) Assessed prior + data
(b) “Relaxed” prior + data
(relaxed: variances were multiplied by 10)
(c) Classical logistic stepwise regression.
Classification rates for ground parrot (Kynn, 2005)

Method                  Predicted   Observed present   Observed absent
Assessed prior + data   present            28                 16
                        absent              1                 48
Relaxed prior + data    present            22                 11
                        absent              7                 53
Frequentist stepwise    present            27                 10
                        absent              2                 54

Stepwise does best – presumably variable selection helps. It
used just the two continuous variables.
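The comparison between the three ground parrot models can be checked by computing sensitivity, specificity and overall accuracy from the classification counts. The sketch below assumes the counts are read as (predicted present and observed present, predicted present and observed absent, predicted absent and observed present, predicted absent and observed absent); the helper name is hypothetical.

```python
def rates(tp, fp, fn, tn):
    """Summary rates for a 2x2 classification table."""
    return {
        "sensitivity": tp / (tp + fn),  # observed presences predicted present
        "specificity": tn / (tn + fp),  # observed absences predicted absent
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Counts as (tp, fp, fn, tn), under the reading described above.
ground_parrot = {
    "Assessed prior + data": (28, 16, 1, 48),
    "Relaxed prior + data": (22, 11, 7, 53),
    "Frequentist stepwise": (27, 10, 2, 54),
}

summary = {method: rates(*counts) for method, counts in ground_parrot.items()}
best_by_accuracy = max(summary, key=lambda m: summary[m]["accuracy"])
```

On this reading the stepwise model classifies 81 of the 93 test sites correctly, against 76 and 75 for the two prior-based models.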
Criterion for threshold: minimise (1 - sensitivity)² + (1 - specificity)²
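A threshold criterion built from (1 - sensitivity) and (1 - specificity) can be applied by scanning candidate thresholds. The sketch below assumes the two squared terms are summed, which is the usual form of this criterion; the fitted probabilities, outcomes and candidate thresholds are hypothetical.

```python
def best_threshold(probs, labels, candidates):
    """Return (threshold, score) minimising (1 - sens)^2 + (1 - spec)^2.

    A case is classified as present when its probability is >= threshold.
    labels are 1 for observed presence and 0 for observed absence; both
    classes must occur in the data.
    """
    best = None
    for t in candidates:
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < t and y == 0)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        score = (1 - sensitivity) ** 2 + (1 - specificity) ** 2
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Hypothetical fitted probabilities and observed outcomes.
probs = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, score = best_threshold(probs, labels, [0.25, 0.5, 0.75])
```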
Stemmacantha (a thistle)
203 presences and 2741 absences.
Same three models; 80% of the data for fitting & 20% for testing.
[Figure panels: Stemmacantha; Ground parrot]
Classification rates for Stemmacantha (Kynn, 2005)

Method                  Predicted   Observed present   Observed absent
Assessed prior + data   present            33                 77
                        absent             11                457
Relaxed prior + data    present            34                 83
                        absent              6                461
Frequentist stepwise    present            32                 69
                        absent             15                468

The numbers are inconsistent (the observed totals differ between
methods), but there seems little to choose between the methods.
Garthwaite (1998) and Garthwaite & Al-Awadhi (2006) also quantify
the opinion of ecologists about rare species in Queensland.
Central Government wanted State Government to estimate habitat
distribution of rare and endangered species.
Some sample data were gathered.
The aim was to link the data, ecologists’ knowledge and a GIS
database to relate the probability of presence/absence to a large
number of covariates.
Preliminary meeting with about a dozen ecologists indicated that nonlinear relationships were needed to model their opinion (hence the
piecewise linear models).
Little bent-wing bat. (5 variables and
8 factors, giving 57 regression
coefficients. Data: 42 presences in
375 sites.)
Plumed frogmouth. (7 variables, 3 factors; 58 parameters).
Data: 31 presences in 324 sites.
Powerful owl. (1 variable, 5 factors; 24 parameters).
Data: 13 presences in 324 sites.
Greater glider. (7 variables, 4 factors; 60 parameters).
Data: 53 presences in 343 sites.
Common bent-wing bat. (4 variables, 7 factors; 59 parameters).
Data: 13 presences in 375 sites.
Various prior distributions were fitted to compensate for
systematic biases in the expert’s assessments.
1. (β0, β1, …, βk) ~ multivariate normal.
2. β0 diffuse, (β1, …, βk) ~ MVN(b, Σ).
3. θ, β0 diffuse, (β1, …, βk) ~ MVN(θb, θ²Σ).
4. γ, θ, β0 diffuse, (β1, …, βk) ~ MVN(θb, γΣ).
Cross-validation: Repeatedly using 80% of the data for fitting
and 20% for testing.
Squared error loss was used to measure performance.
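The way priors 2 to 4 rescale the elicited mean b and covariance Σ, and the loss used to compare methods, can be sketched as follows. This is an illustration only: θ and γ are treated here as fixed numbers, whereas in the talk they have diffuse priors and are learned from the data, and whether the squared errors were summed or averaged is not stated, so the sum is an assumption. The function names are hypothetical.

```python
def scale_prior(b, sigma, theta=1.0, gamma=None):
    """Mean and covariance of (beta_1, ..., beta_k) for given theta and gamma.

    Prior 2: theta = 1, gamma = None  -> MVN(b, Sigma)
    Prior 3: gamma = None             -> MVN(theta * b, theta**2 * Sigma)
    Prior 4: gamma given              -> MVN(theta * b, gamma * Sigma)
    """
    cov_scale = theta ** 2 if gamma is None else gamma
    mean = [theta * b_i for b_i in b]
    cov = [[cov_scale * s for s in row] for row in sigma]
    return mean, cov


def squared_error_loss(observed, predicted):
    """Sum of squared differences between 0/1 outcomes and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))


# Hypothetical elicited values for two slope coefficients.
b = [1.0, -2.0]
sigma = [[4.0, 1.0], [1.0, 9.0]]
mean3, cov3 = scale_prior(b, sigma, theta=0.5)             # prior 3
mean4, cov4 = scale_prior(b, sigma, theta=0.5, gamma=2.0)  # prior 4
loss = squared_error_loss([1, 0, 0], [0.5, 0.5, 0.0])
```

With θ below 1, prior 3 shrinks every elicited slope towards zero by the same factor, which is one way of compensating for an expert who systematically overstates effects.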
                      Little          Common    Plumed      Powerful   Greater
                      bent-wing bat   b-w bat   frogmouth   owl        glider
Prior 1                   36.74        12.75      28.76      13.61      43.90
Prior 2                   36.87        12.73      28.91      13.60      43.94
Prior 3                   36.11        12.42      25.99      13.17      42.35
Prior 4                   36.13        12.75      28.62      13.62      43.90
Stepwise logistic
regression                41.07        13.70      30.91      14.68      44.16
Prior: no data            41.12        13.67      29.54      15.07      48.81

Prior 3 (constant term given diffuse prior and all coefficients
multiplied by a constant) is the best for each animal –
noticeably better for the plumed frogmouth.
The prior with no data is comparable with stepwise regression
except for the greater glider.
The data are quite limited.
A second example: Air pollution in (Khaldiya) Kuwait City
• Khaldiya had a mobile laboratory station to monitor
pollution for one year.
• The focus is on the probability of pollutants exceeding a harmful
threshold level.
• There are two permanent fixed laboratory stations: 5 km
north-east and 5 km south-west of Khaldiya.
• Aim is to use the data and the opinion of two scientists to
relate Khaldiya pollution to the permanent laboratories.
• Pollutants: SO2, NO2 and n-CH4 (non-methane).
• Scientists quantified their opinions separately.
• Variables: pollution levels at the permanent labs,
temperature, wind speed, humidity, height of the inversion
line.
                      Expert A/   Expert A/   Expert B/   Expert A/   Expert B/
                      SO2         NO2         NO2         n-CH4       n-CH4
Prior 1                 16.94       16.44       17.97       46.25       48.75
Prior 2                 16.99       16.42       17.98       44.49       48.77
Prior 3                 17.32       16.43       17.97       46.23       48.74
Prior 4                 16.99       16.45       17.95       46.27       48.84
Stepwise logistic
regression              18.02       19.71       19.71       46.29       46.29
Prior: no data          17.87       24.31       27.52       96.71       78.31

Non-methane: the priors seem poor, as the priors with no data do much worse
than the other methods; stepwise logistic regression does better than using
expert B’s prior but not expert A’s, especially with Prior 2.
For SO2 and NO2, the priors seem better, and prior + data does better
than stepwise logistic regression. Prior 2 is perhaps the best.
[Image caption: (Not Kuwait City)]
A medical application
• The UK National Health Service (NHS) initiated a study to
estimate the benefits of current bowel cancer services in
England and examine costs and benefits of alternative
developments in service provision.
• ScHARR developed a treatment pathway model that gave
the possible sequences of presentation, diagnosis,
treatment and outcomes that could be followed by a patient
with suspected colorectal (bowel) cancer. Available
information supplied most of the required numbers but
expert opinion filled in gaps.
• The resulting report states, “Owing to a lack of empirical
evidence in a number of areas, several of the model
parameter and details of the model structure were elicited
from experts.”
• For two quantities there were covariates. For these, the
new version of the software was used to quantify
consultants’ opinions.
• Choice of diagnostic test had level of fitness as a covariate.
• Choice of adjuvant chemotherapy had five covariates
(mostly factors): age, tumor location, disease status,
perforation/obstruction, and fitness for cytotoxic therapy.
• Results were validated where possible. Commenting on
assessments about adjuvant chemotherapy, the YHEC-ScHARR
report notes that “The [pathways] model uses
expert 1’s responses as part of a generalised linear model
and is validated by expert 2’s responses.”
• The use of elicitation in the study is reported in
Garthwaite, Chilcott, Jenkinson & Tappenden (2008).
• Al-Awadhi & Garthwaite (2006). Computational Statistics, 21, 121-140.
• Garthwaite (1998). Quantifying expert opinion for modelling habitat
distributions. Sustainable Forest Management Tech. Report,
Queensland Depart. Natural Resources.
• Garthwaite & Al-Awadhi (2006). Tech. Report 06/07. Dept.
Statistics, Open University.
• Garthwaite, Chilcott, Jenkinson & Tappenden (2008). Int. J.
Technology assessment in Health Care, 24, 350-357.
• Kynn (2005). Eliciting expert knowledge for Bayesian logistic
regression in species habitat modelling in natural resources. PhD
thesis. Queensland University of Technology.
• Mason (2008). Methodological developments for combining data.
www.Bias-project.org.uk/Papers/CombineDataAJM.pdf.
• O’Leary, Choy, Kynn, Denham, Martin, Mengersen & Murray
(2009a). Environmetrics, 20, 379-398.
• O’Leary, Mengersen, Murray & Choy (2009b). Comparison of four
expert elicitation methods. 18th World IMACS/MODSIM Congress.