Prior Distribution Elicitation for Generalized Linear and Piecewise-Linear Models

Paul Garthwaite and Fadlalla Elfadaly
Open University

Why piecewise-linear models?
• Initial motivation for this model came from the need to model ecologists’ opinion about the presence/absence of rare and endangered animals.
• (A good example where expert opinion is useful – the ecologists had sightings of rare species, but the data were not from a sampling frame and hence were hard to incorporate in a statistical analysis.)
• For most variables there was an optimum value for a species. E.g. too hot or too cold did not suit it; nor did too wet or too dry, etc.

Sampling Model
Logistic model: y = ln(p/(1-p)) = β0 + β1 x1 + … + βk xk.
GLM: y = g(μ) = β0 + β1 x1 + … + βk xk.
Strategy: Elicit quantiles of p or μ and transform the assessments to quantiles of y.
Prior model: β ~ multivariate normal.
Three software implementations of the method:
• Garthwaite (1998: Visual Basic)
• Kynn (2004: Pascal, Elicitor)
• Elfadaly, Jenkinson, Garthwaite and Laney (2007/9: Java)
(The programs of Garthwaite and Kynn only handle logistic regression.)

Assessments at reference point
Scene-setting questions determine the number of variables and factors, their ranges and also a reference point. The reference point is chosen by the expert and gives the origin of variables and the reference level of factors.
For a continuous variable it is assumed that opinion about slopes on one side of the reference point is independent of opinion about slopes on the other side.
With the methods of Garthwaite and Elfadaly et al., the median and the lower and upper quartiles of the response at the reference point are assessed.

Lower and upper quartiles have the advantage that they can be assessed by the method of bisection.
[Figure: a scale from 0 to 1.0 with the lower quartile L, median M and upper quartile U marked; each of the four intervals carries 25% probability, and 0.3 is marked on the axis.]

Elicitor is much more flexible.
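As a rough illustration of the quantile-transformation strategy and the quartile assessments at the reference point, the sketch below converts assessed quartiles of p into a normal prior for the intercept β0 on the logit scale. It assumes that y = β0 at the reference point (all variables at their origin, all factors at their reference level) and that the normal distribution is matched to the assessed median and interquartile range. That matching rule, the function names and the assessed values (0.20, 0.30, 0.45) are hypothetical and are not taken from any of the three programs.

```python
import math
from statistics import NormalDist


def logit(p):
    """Logistic link: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Inverse of the logistic link."""
    return 1.0 / (1.0 + math.exp(-y))


def normal_prior_from_quartiles(lower, median, upper, link=logit):
    """Match a normal distribution for y = link(p) to assessed quartiles of p.

    A monotone link preserves quantiles, so the quartiles of y are
    link(lower), link(median) and link(upper). The mean is set to the
    median of y and the standard deviation to IQR / (2 * z_0.75).
    This matching rule is an assumption of the sketch.
    """
    y_lower, y_median, y_upper = link(lower), link(median), link(upper)
    z75 = NormalDist().inv_cdf(0.75)  # upper quartile of a standard normal
    sd = (y_upper - y_lower) / (2.0 * z75)
    return y_median, sd


# Hypothetical assessments of p at the reference point.
mean, sd = normal_prior_from_quartiles(0.20, 0.30, 0.45)
prior = NormalDist(mean, sd)

# Back-transform the fitted quartiles for feedback to the expert. When the
# assessments are asymmetric on the logit scale, only the median and the
# interquartile range are reproduced exactly.
fitted = [inv_logit(prior.inv_cdf(q)) for q in (0.25, 0.50, 0.75)]
print([round(p, 3) for p in fitted])
```

Passing a different `link` function (for example `math.log` for a log link) would give the corresponding sketch for another GLM.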
For assessing the median, several techniques that can be used with logistic regression are available to the expert:
• Visual aids such as a probability wheel can be used.
• Probabilities can be given by first stating a (large) sample size and then assessing the number in that sample with the characteristic of interest.
• Scales marked in odds or log-odds can also be used.
For credible intervals, intervals other than 50% intervals can be specified, and a form of fixed-interval method is also advocated.

Median Assessments
Medians are assessed for one covariate at a time. The expert is asked to assume that all other covariates are at their reference values and to consider how the response varies with the covariate of current interest. The expert clicks on a graph to draw a curve for covariates or a bar chart for factors.
(Varying one covariate at a time is a poor approach to designing experiments but has clear benefits when eliciting expert opinion.)

• The number of knots does not seem crucial.
• Elicitor gives the option of fitting a linear or quadratic function to the medians.
• Garthwaite (1998) gave the option of superimposing graphs to help improve the expert’s internal consistency across covariates.
(In forming models we almost always adopt linear relationships as the building blocks. Elicited piecewise-linear relationships could instead be used as the building blocks.)

Feedback
• Feedback is generally beneficial.
• It is useful to display the median estimate at design points other than those where all but one of the covariates are at their reference values.
• Mason (2008) used Elicitor to question an expert about non-random non-response in a longitudinal survey.
• The reference point was chosen to give the best response rate. The worst-case setting of the covariates gave a response rate of only 1%. The expert revised his median assessments and the worst-case response rate increased to 9%, which the expert still thought was too low.
• The response rate rapidly diminishes as probabilities are multiplied.
• We intend to add this feedback option to the software.

Examples
• O’Leary et al. (2009a) give an example where two experts assessed the probability of presence/absence for the brush-tailed rock-wallaby using Elicitor.
• Only two covariates: (i) Aspect (northerly vs other); (ii) Slope (0° - 90°).
• O’Leary et al. (2009b) also give an example where presence/absence for this wallaby is assessed, this time by only one expert but using four different methods, with aspect as the only covariate.
• Data: presence at 41 sites and absence at 9 (rare species? pest?)

Assessments of the two experts (O’Leary et al., 2009a)
[Figure]

Classification rates of four methods (O’Leary et al., 2009b)

Method                Predicted   Observed present   Observed absent
Elicitor              present            41                  9
                      absent              0                  0
Map-method            present             0                  1
                      absent             41                  8
Questionnaire         present            41                  9
                      absent              0                  0
Classification tree   present            35                  1
                      absent              6                  8

Kynn (2004) gives five case studies conducted during the development of Elicitor, in which ecologists used it to quantify their opinions about an endangered species. Two of the studies had sample data with which to evaluate models.

Ground parrot
137 presences and 438 pseudo-absences. 80% of the data was used to fit models and 20% for testing.
Two continuous covariates, a factor with three levels and a second factor with four levels.
Three models were considered:
(a) Assessed prior + data
(b) “Relaxed” prior + data (relaxed: variances were multiplied by 10)
(c) Classical logistic stepwise regression.

Classification rates for ground parrot (Kynn, 2005)

Method                  Predicted   Observed present   Observed absent
Assessed prior + data   present            28                 16
                        absent              1                 48
Relaxed prior + data    present            22                 11
                        absent              7                 53
Frequentist stepwise    present            27                 10
                        absent              2                 54

Stepwise does best – presumably variable selection helps. It used just the two continuous variables.

Criterion for threshold: minimise (1 - sensitivity)^2 + (1 - specificity)^2

Stemmacantha (a thistle)
203 presences and 2741 absences.
Same three models; 80% of the data for fitting & 20% for testing.
[Figure panels: Stemmacantha; Ground parrot]

Classification rates for Stemmacantha (Kynn, 2005)

Method                  Predicted   Observed present   Observed absent
Assessed prior + data   present            33                 77
                        absent             11                457
Relaxed prior + data    present            34                 83
                        absent              6                461
Frequentist stepwise    present            32                 69
                        absent             15                468

Numbers are inconsistent, but there seems little to choose between the methods.

Garthwaite (1998) and Garthwaite & Al-Awadhi (2006) also quantify the opinion of ecologists about rare species in Queensland.
Central Government wanted State Government to estimate habitat distribution of rare and endangered species. Some sample data were gathered. The aim was to link the data, ecologists’ knowledge and a GIS database to relate the probability of presence/absence to a large number of covariates.
Preliminary meeting with about a dozen ecologists indicated that nonlinear relationships were needed to model their opinion (hence the piecewise linear models).
• Little bent-wing bat. (5 variables and 8 factors, giving 57 regression coefficients. Data: 42 presences in 375 sites.)
• Plumed frogmouth. (7 variables, 3 factors; 58 parameters). Data: 31 presences in 324 sites.
• Powerful owl. (1 variable, 5 factors; 24 parameters). Data: 13 presences in 324 sites.
• Greater glider. (7 variables, 4 factors; 60 parameters). Data: 53 presences in 343 sites.
• Common bent-wing bat. (4 variables, 7 factors; 59 parameters). Data: 13 presences in 375 sites.

Various prior distributions were fitted to compensate for systematic biases in the expert’s assessments.
1. (β0, β1, …, βk) multivariate normal.
2. β0 diffuse, (β1, …, βk) ~ MVN(b, Σ).
3. θ, β0 diffuse, (β1, …, βk) ~ MVN(θb, θ²Σ).
4. γ, θ, β0 diffuse, (β1, …, βk) ~ MVN(θb, γΣ).
Cross-validation: Repeatedly using 80% of the data for fitting and 20% for testing. Squared error loss was used to measure performance.
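The cross-validation scheme can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' code. It assumes a logistic model, gives the intercept a flat prior and the slopes the prior MVN(b, Σ) (the form of Prior 2), and finds the posterior mode by Newton-Raphson. For the Prior 3 form it treats the scale θ as a fixed number rather than as a parameter with a diffuse prior, which is a simplification. All function names and numerical values are hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_map_logistic(X, y, b, Sigma, n_iter=50):
    """Posterior mode for logistic regression by Newton-Raphson.

    The intercept has a flat prior; the slopes have prior MVN(b, Sigma).
    Returns the vector (beta0, beta1, ..., betak).
    """
    n, k = X.shape
    A = np.hstack([np.ones((n, 1)), X])       # design matrix with intercept
    precision = np.zeros((k + 1, k + 1))      # zero precision for the intercept
    precision[1:, 1:] = np.linalg.inv(Sigma)
    prior_mean = np.concatenate([[0.0], b])   # intercept entry has no effect
    beta = prior_mean.copy()
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        grad = A.T @ (y - p) - precision @ (beta - prior_mean)
        hess = A.T @ (A * (p * (1.0 - p))[:, None]) + precision
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def cv_squared_error(X, y, b, Sigma, n_splits=20, test_frac=0.2, seed=0):
    """Average, over repeated random 80/20 splits, of the summed
    squared error between test outcomes and predicted probabilities."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    losses = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        beta = fit_map_logistic(X[train], y[train], b, Sigma)
        p_hat = sigmoid(beta[0] + X[test] @ beta[1:])
        losses.append(np.sum((y[test] - p_hat) ** 2))
    return float(np.mean(losses))


# Simulated presence/absence data (not data from any study cited here).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
true_beta = np.array([-1.0, 1.0, -0.5])
y = (rng.random(300) < sigmoid(true_beta[0] + X @ true_beta[1:])).astype(float)

b = np.array([2.0, -1.0])      # hypothetical elicited slope means
Sigma = 0.25 * np.eye(2)       # hypothetical elicited covariance
theta = 0.5                    # fixed scale, standing in for Prior 3's theta

loss_prior2_form = cv_squared_error(X, y, b, Sigma)
loss_prior3_form = cv_squared_error(X, y, theta * b, theta**2 * Sigma)
print(loss_prior2_form, loss_prior3_form)
```

Because the seed is fixed inside `cv_squared_error`, each candidate prior is evaluated on the same sequence of splits, so the losses are directly comparable.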
                               Little          Common          Plumed       Powerful   Greater
                               bent-wing bat   bent-wing bat   frogmouth    owl        glider
Prior 1                        36.74           12.75           28.76        13.61      43.90
Prior 2                        36.87           12.73           28.91        13.60      43.94
Prior 3                        36.11           12.42           25.99        13.17      42.35
Prior 4                        36.13           12.75           28.62        13.62      43.90
Stepwise logistic regression   41.07           13.70           30.91        14.68      44.16
Prior: no data                 41.12           13.67           29.54        15.07      48.81

Prior 3 (constant term given a diffuse prior and all coefficients multiplied by a constant) is the best for each animal, and noticeably better for the plumed frogmouth.
The prior with no data is comparable with stepwise regression except for the greater glider. The data are quite limited.

A second example: Air pollution in (Khaldiya) Kuwait City
• Khaldiya had a mobile laboratory station to monitor pollution for one year.
• The focus is on the probability of pollutants exceeding a harmful threshold level.
• There are two permanent fixed laboratory stations: 5 km north-east and 5 km south-west of Khaldiya.
• The aim is to use the data and the opinion of two scientists to relate Khaldiya pollution to the permanent laboratories.
• Pollutants: SO2, NO2 and n-CH4 (non-methane).
• Scientists quantified their opinions separately.
• Variables: pollution levels at the permanent labs, temperature, wind speed, humidity, height of the inversion line.

                               Expert A/   Expert A/   Expert B/   Expert A/   Expert B/
                               SO2         NO2         NO2         n-CH4       n-CH4
Prior 1                        16.94       16.44       17.97       46.25       48.75
Prior 2                        16.99       16.42       17.98       44.49       48.77
Prior 3                        17.32       16.43       17.97       46.23       48.74
Prior 4                        16.99       16.45       17.95       46.27       48.84
Stepwise logistic regression   18.02       19.71       19.71       46.29       46.29
Prior: no data                 17.87       24.31       27.52       96.71       78.31

Non-methane: the priors seem poor, as the priors with no data do much worse than the other methods; stepwise logistic regression does better than using expert B’s prior but not expert A’s, especially with Prior 2.
For SO2 and NO2, the priors seem better, and prior + data does better than stepwise logistic regression. Prior 2 is perhaps the best.
[Image] (Not Kuwait City)

A medical application
• The UK National Health Service (NHS) initiated a study to estimate the benefits of current bowel cancer services in England and examine costs and benefits of alternative developments in service provision.
• ScHARR developed a treatment pathway model that gave the possible sequences of presentation, diagnosis, treatment and outcomes that could be followed by a patient with suspected colorectal (bowel) cancer. Available information supplied most of the required numbers, but expert opinion filled in gaps.
• The resulting report states, “Owing to a lack of empirical evidence in a number of areas, several of the model parameters and details of the model structure were elicited from experts.”

• For two quantities there were covariates. For these, the new version of the software was used to quantify consultants’ opinions.
• Choice of diagnostic test had level of fitness as a covariate.
• Choice of adjuvant chemotherapy had five covariates (mostly factors): age, tumor location, disease status, perforation/obstruction, and fitness for cytotoxic therapy.
• Results were validated where possible. Commenting on assessments about adjuvant chemotherapy, the YHEC-ScHARR report notes that “The [pathways] model uses expert 1’s responses as part of a generalised linear model and is validated by expert 2’s responses.”
• The use of elicitation in the study is reported in Garthwaite, Chilcott, Jenkinson & Tappenden (2008).

• Al-Awadhi & Garthwaite (2006). Computational Statistics, 21, 121-140.
• Garthwaite (1998). Quantifying expert opinion for modelling habitat distributions. Sustainable Forest Management Tech. Report, Queensland Dept. Natural Resources.
• Garthwaite & Al-Awadhi (2006). Tech. Report 06/07. Dept. Statistics, Open University.
• Garthwaite, Chilcott, Jenkinson & Tappenden (2008). Int. J. Technology Assessment in Health Care, 24, 350-357.
• Kynn (2005). Eliciting expert knowledge for Bayesian logistic regression in species habitat modelling in natural resources. PhD thesis, Queensland University of Technology.
• Mason (2008). Methodological developments for combining data. www.Bias-project.org.uk/Papers/CombineDataAJM.pdf.
• O’Leary, Choy, Kynn, Denham, Martin, Mengersen & Murray (2009a). Environmetrics, 20, 379-398.
• O’Leary, Mengersen, Murray & Choy (2009b). Comparison of four expert elicitation methods. 18th World IMACS/MODSIM Congress.