Prior Distribution Elicitation for Generalized Linear and Piecewise-Linear Models

Paul Garthwaite and Fadlalla Elfadaly, Open University

Dec 20, 2015



Why piecewise-linear models?

• Initial motivation for this model came from the need to model ecologists’ opinion about the presence/absence of rare and endangered animals.

• (A good example of where expert opinion is useful: the ecologists had sightings of rare species, but the data were not from a sampling frame and hence were hard to incorporate in a statistical analysis.)

• For most variables there was an optimum value for a species. E.g. too hot or too cold did not suit it; nor too wet or too dry, etc.


Sampling Model

Logistic model: y = ln(p/(1 − p)) = β0 + β1 x1 + … + βk xk.

GLM: y = g(μ) = β0 + β1 x1 + … + βk xk.

Strategy: Elicit quantiles of p or μ and transform the assessments to quantiles of y.

Prior model: β ~ multivariate normal.
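A minimal sketch of this strategy for the logistic link. It assumes the expert gives a lower quartile, median and upper quartile for p, and that opinion about y is approximately normal, so the spread can be matched from the interquartile range. The function names and the example values are illustrative and are not taken from any of the programs discussed here.

```python
import math

Z75 = 0.6744897501960817  # 75th percentile of the standard normal distribution


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def normal_from_quartiles(lower, median, upper):
    """Map quartiles of p to quartiles of y = logit(p), then match a normal.

    Quantiles are preserved by a monotone transformation, so the quartiles
    of y are simply the logits of the quartiles of p.
    """
    y_lower, y_median, y_upper = logit(lower), logit(median), logit(upper)
    # A normal distribution has interquartile range 2 * Z75 * sd.
    sd = (y_upper - y_lower) / (2.0 * Z75)
    return y_median, sd


# Hypothetical assessments at the reference point.
mean_y, sd_y = normal_from_quartiles(0.20, 0.30, 0.45)
```

Matching on the interquartile range ignores any asymmetry in the transformed quartiles; that simplification is an assumption of this sketch.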

Three software implementations of the method:

• Garthwaite (1998: Visual Basic)

• Kynn (2004: Pascal; Elicitor)

• Elfadaly, Jenkinson, Garthwaite and Laney (2007/9: Java)

(The programs of Garthwaite and Kynn only handle logistic regression.)


Assessments at reference point

Scene-setting questions determine the number of variables and factors, their ranges and also a reference point.

The reference point is chosen by the expert and gives the origin of variables and the reference level of factors.

For a continuous variable it is assumed that opinion about slopes on one side of the reference point is independent of opinion about slopes on the other side.

With the methods of Garthwaite and Elfadaly et al., median, lower and upper quartiles of the response at the reference point are assessed.


Lower and upper quartiles have the advantage that they can be assessed by the method of bisection.

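[Figure: a probability scale marked 0, 0.3 and 1.0, with the lower quartile L, median M and upper quartile U dividing it into four intervals of 25% probability each.]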


Elicitor is much more flexible.

For assessing the median, some techniques that can be used with logistic regression are available to the expert:

• Visual aids such as a probability wheel can be used.

• Probabilities can be given by first stating a (large) sample size and then assessing the number in that sample with the characteristic of interest.

• Scales marked in odds or log-odds can also be used.

For credible intervals, intervals other than 50% intervals can be specified and a form of fixed interval method is also advocated.


Median Assessments

Medians are assessed for one covariate at a time.

The expert is asked to assume that all other covariates are at their reference values and to consider how the response varies with the covariate of current interest.

The expert clicks on a graph to draw a curve for covariates or a bar chart for factors.

(This is a poor approach to designing experiments but has clear benefits when eliciting expert opinion.)
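The curve drawn for a single covariate can be turned into piecewise-linear terms on the link scale. The sketch below assumes a logistic link and that each click gives a knot position and a median probability; the function names and the example knots are hypothetical, not the programs' own.

```python
import math
from bisect import bisect_right


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def segment_slopes(knots_x, medians_p):
    """Slope of the median log-odds on each segment between adjacent knots."""
    ys = [logit(p) for p in medians_p]
    return [
        (ys[i + 1] - ys[i]) / (knots_x[i + 1] - knots_x[i])
        for i in range(len(ys) - 1)
    ]


def median_logit(x, knots_x, medians_p):
    """Piecewise-linear interpolation of the median log-odds at x."""
    ys = [logit(p) for p in medians_p]
    slopes = segment_slopes(knots_x, medians_p)
    # Index of the segment containing x, clipped to the outermost segments.
    i = min(max(bisect_right(knots_x, x) - 1, 0), len(knots_x) - 2)
    return ys[i] + slopes[i] * (x - knots_x[i])


# Hypothetical clicks: an optimum at x = 20, with other covariates at reference.
knots = [10.0, 20.0, 30.0]
medians = [0.1, 0.5, 0.1]
slopes = segment_slopes(knots, medians)
```

With an optimum in the middle of the range, the slope is positive below the optimum and negative above it, which a single linear term cannot represent.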


• The number of knots does not seem crucial.

• Elicitor gives the option of fitting a linear or quadratic function to the medians.

• Garthwaite (1998) gave the option of superimposing graphs to help improve the expert’s internal consistency across covariates.

(In forming models we almost always adopt linear relationships as the building blocks. Elicited piecewise linear relationships could instead be used as the building blocks.)


Feedback

• Feedback is generally beneficial.

• It is useful to display the median estimate at design points other than those where all but one of the covariates are at their reference values.

• Mason (2008) used Elicitor to question an expert about non-random non-response in a longitudinal survey.

• Reference point was for best response-rate. Worst case setting of the covariates gave a response rate of only 1%. The expert revised his median assessments and the worst-case response-rate increased to 9%, which the expert still thought was too low.

• The response-rate rapidly diminishes as probabilities are multiplied.

• Intend adding this feedback option to the software.
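A sketch of how such feedback could be computed, assuming an additive logistic model with no interactions, so that each covariate's shift from the reference point adds on the log-odds scale. The function name and the numbers are hypothetical and are not Mason's assessments.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Probability corresponding to log-odds y."""
    return 1.0 / (1.0 + math.exp(-y))


def median_at_design_point(p_reference, p_one_at_a_time):
    """Median at a joint covariate setting from one-at-a-time assessments.

    p_one_at_a_time holds the median assessed for each covariate at its new
    value while all other covariates stay at their reference values.
    """
    y_ref = logit(p_reference)
    offsets = [logit(p) - y_ref for p in p_one_at_a_time]
    return inv_logit(y_ref + sum(offsets))


# Hypothetical: reference response rate 0.8; each of three covariates,
# moved alone to its worst setting, lowers the median to 0.3.
p_joint = median_at_design_point(0.8, [0.3, 0.3, 0.3])
```

Each assessment of 0.3 looks moderate on its own, but the joint median falls well below any of them, which is the kind of consequence the expert may only notice when it is displayed.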


Examples

• O’Leary et al. (2009a) give an example where two experts assessed the probability of presence/absence for the brush-tailed rock-wallaby using Elicitor.

• Only two covariates:

(i) Aspect (northerly vs other)

(ii) Slope (0° to 90°).

• O’Leary et al. (2009b) also give an example where presence/absence for this wallaby is assessed, this time by only one expert but using four different methods, with aspect as the only covariate.

• Data: presence at 41 sites and absence at 9 (rare species? pest?)


Assessments of the two experts (O’Leary et al., 2009a)


Classification rates of four methods (O’Leary et al., 2009b)

Method               Predicted   Observed present   Observed absent
Elicitor             present     41                 9
                     absent      0                  0
Map-method           present     0                  1
                     absent      41                 8
Questionnaire        present     41                 9
                     absent      0                  0
Classification tree  present     35                 1
                     absent      6                  8


Kynn (2004) gives five case studies conducted during the development of Elicitor where ecologists used it to quantify their opinions about an endangered species. Two of the studies had sample data with which to evaluate models.

Ground parrot

137 presences and 438 pseudo-absences.

80% of the data was used to fit models and 20% for testing.

Two continuous covariates, a factor with three levels and a second factor with four levels.

Three models were considered:

(a) Assessed prior + data

(b) “Relaxed” prior + data

(relaxed: variances were multiplied by 10)

(c) Classical logistic stepwise regression.


Classification rates for ground parrot (Kynn, 2005)

Stepwise does best – presumably variable selection helps. It used just the two continuous variables.

Method                 Predicted   Observed present   Observed absent
Assessed prior + data  present     28                 16
                       absent      1                  48
Relaxed prior + data   present     22                 11
                       absent      7                  53
Frequentist stepwise   present     27                 10
                       absent      2                  54
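The comparison can be made explicit by computing rates from the ground parrot counts. The sketch assumes the table is read with rows as predictions and columns as observations, and treats "present" as the positive class; the helper name is illustrative.

```python
# Counts as (true positive, false positive, false negative, true negative),
# read from the ground parrot table with "present" as the positive class.
tables = {
    "Assessed prior + data": (28, 16, 1, 48),
    "Relaxed prior + data": (22, 11, 7, 53),
    "Frequentist stepwise": (27, 10, 2, 54),
}


def rates(tp, fp, fn, tn):
    """Sensitivity, specificity and overall accuracy from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


summary = {name: rates(*cells) for name, cells in tables.items()}
best_by_accuracy = max(summary, key=lambda name: summary[name]["accuracy"])
```

On overall accuracy the stepwise model comes out ahead, while the assessed prior + data has the highest sensitivity, so the ranking depends on which rate matters most.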


Criterion for threshold: minimise (1 − sensitivity)² + (1 − specificity)²
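A sketch of threshold selection, reading the criterion as minimising (1 − sensitivity)² + (1 − specificity)², that is, the squared distance from perfect classification. It assumes the threshold is chosen from a grid of candidate cut-offs applied to fitted probabilities; the function names and the data are hypothetical.

```python
def sens_spec(probs, labels, threshold):
    """Sensitivity and specificity when predicting 'present' if prob >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(probs, labels, candidates):
    """Candidate cut-off with the smallest squared distance from (1, 1)."""
    def cost(t):
        sensitivity, specificity = sens_spec(probs, labels, t)
        return (1 - sensitivity) ** 2 + (1 - specificity) ** 2
    return min(candidates, key=cost)


# Hypothetical fitted probabilities and observed presence (1) / absence (0).
probs = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
chosen = best_threshold(probs, labels, [0.25, 0.5, 0.75])
```

The criterion weights a missed presence and a false alarm equally; a different weighting would need a different cost function.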


Stemmacantha (a thistle)

203 presences and 2741 absences.

Same three models; 80% of the data for fitting & 20% for testing.

[Figure panels: Stemmacantha; Ground Parrot]


Classification rates for Stemmacantha (Kynn, 2005)

Numbers are inconsistent, but there seems little to choose between the methods.

Method                 Predicted   Observed present   Observed absent
Assessed prior + data  present     33                 77
                       absent      11                 457
Relaxed prior + data   present     34                 83
                       absent      6                  461
Frequentist stepwise   present     32                 69
                       absent      15                 468


Garthwaite (1998) and Garthwaite & Al-Awadhi (2006) also quantify the opinion of ecologists about rare species in Queensland.

• Central Government wanted State Government to estimate the habitat distribution of rare and endangered species.

• Some sample data were gathered. The aim was to link the data, ecologists’ knowledge and a GIS database to relate the probability of presence/absence to a large number of covariates.

• A preliminary meeting with about a dozen ecologists indicated that non-linear relationships were needed to model their opinion (hence the piecewise-linear models).

• Little bent-wing bat (5 variables, 8 factors; 57 regression coefficients). Data: 42 presences in 375 sites.

• Plumed frogmouth (7 variables, 3 factors; 58 parameters). Data: 31 presences in 324 sites.

• Powerful owl (1 variable, 5 factors; 24 parameters). Data: 13 presences in 324 sites.

• Greater glider (7 variables, 4 factors; 60 parameters). Data: 53 presences in 343 sites.

• Common bent-wing bat (4 variables, 7 factors; 59 parameters). Data: 13 presences in 375 sites.


Various prior distributions were fitted to compensate for systematic biases in the expert’s assessments.

1. (β0, β1, …, βk) ~ multivariate normal.

2. β0 diffuse; (β1, …, βk) ~ MVN(b, Σ).

3. θ and β0 diffuse; (β1, …, βk) ~ MVN(θb, θ²Σ).

4. γ, θ and β0 diffuse; (β1, …, βk) ~ MVN(θb, γΣ).

Cross-validation: Repeatedly using 80% of the data for fitting and 20% for testing.

Squared error loss was used to measure performance.
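The evaluation scheme can be sketched as repeated 80/20 splits scored by squared error. How the loss was aggregated across splits is not stated here, so summing within a split and averaging across splits is an assumption. The fitting function is a deliberately trivial stand-in (it predicts the training base rate), not one of the Bayesian models above, and all names are illustrative.

```python
import random


def squared_error_loss(y_true, p_hat):
    """Sum of squared differences between 0/1 outcomes and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat))


def repeated_holdout(xs, ys, fit, n_repeats=20, test_fraction=0.2, seed=0):
    """Average test-set squared error over repeated random train/test splits."""
    rng = random.Random(seed)
    n = len(ys)
    n_test = max(1, round(test_fraction * n))
    losses = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        p_hat = [predict(xs[i]) for i in test]
        losses.append(squared_error_loss([ys[i] for i in test], p_hat))
    return sum(losses) / len(losses)


def fit_base_rate(train_x, train_y):
    """Stand-in model: ignore covariates and predict the training presence rate."""
    rate = sum(train_y) / len(train_y)
    return lambda x: rate


# Hypothetical data: 10 presences among 50 sites.
xs = list(range(50))
ys = [1] * 10 + [0] * 40
mean_loss = repeated_holdout(xs, ys, fit_base_rate)
```

Any model that returns a prediction function can be passed as `fit`, so the same splits can be reused to compare a prior-plus-data fit with stepwise regression.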


Method                        Little bent-wing bat   Common bent-wing bat   Plumed frogmouth   Powerful owl   Greater glider
Prior 1                       36.74                  12.75                  28.76              13.61          43.90
Prior 2                       36.87                  12.73                  28.91              13.60          43.94
Prior 3                       36.11                  12.42                  25.99              13.17          42.35
Prior 4                       36.13                  12.75                  28.62              13.62          43.90
Stepwise logistic regression  41.07                  13.70                  30.91              14.68          44.16
Prior: no data                41.12                  13.67                  29.54              15.07          48.81

Prior 3 (constant term given a diffuse prior and all coefficients multiplied by a constant) is the best for each animal, and noticeably better for the plumed frogmouth. The prior with no data is comparable with stepwise regression except for the greater glider. The data are quite limited.


A second example: Air pollution in (Khaldiya) Kuwait City

• Khaldiya had a mobile laboratory station to monitor pollution for one year.

• Focus is on the probability of pollutants exceeding a harmful threshold level.

• There are two permanent fixed laboratory stations: 5 km north-east and 5 km south-west of Khaldiya.

• Aim is to use the data and the opinion of two scientists to relate Khaldiya pollution to the permanent laboratories.

• Pollutants: SO2, NO2 and n-CH4 (non-methane).

• Scientists quantified their opinions separately.

• Variables: pollution levels at the permanent labs, temperature, wind speed, humidity, height of the inversion line.


Method                        Expert A / SO2   Expert A / NO2   Expert B / NO2   Expert A / n-CH4   Expert B / n-CH4
Prior 1                       16.94            16.44            17.97            46.25              48.75
Prior 2                       16.99            16.42            17.98            44.49              48.77
Prior 3                       17.32            16.43            17.97            46.23              48.74
Prior 4                       16.99            16.45            17.95            46.27              48.84
Stepwise logistic regression  18.02            19.71            19.71            46.29              46.29
Prior: no data                17.87            24.31            27.52            96.71              78.31

Non-methane: the priors seem poor, as the prior with no data does much worse than the other methods; stepwise logistic regression does better than using expert B’s prior but not expert A’s, especially with Prior 2.

For SO2 and NO2, the priors seem better and prior + data does better than stepwise logistic regression. Prior 2 is perhaps the best.


(Not Kuwait City)


A medical application

• The UK National Health Service (NHS) initiated a study to estimate the benefits of current bowel cancer services in England and examine costs and benefits of alternative developments in service provision.

• ScHARR developed a treatment pathway model that gave the possible sequences of presentation, diagnosis, treatment and outcomes that could be followed by a patient with suspected colorectal (bowel) cancer. Available information supplied most of the required numbers but expert opinion filled in gaps.

• The resulting report states, “Owing to a lack of empirical evidence in a number of areas, several of the model parameters and details of the model structure were elicited from experts.”


• For two quantities there were covariates. For these, the new version of the software was used to quantify consultants’ opinions.

• Choice of diagnostic test had level of fitness as a covariate.

• Choice of adjuvant chemotherapy had five covariates (mostly factors): age, tumor location, disease status, perforation/obstruction, and fitness for cytotoxic therapy.

• Results were validated where possible. Commenting on assessments about adjuvant chemotherapy the YHEC-ScHARR report notes that “The [pathways] model uses expert 1’s responses as part of a generalised linear model and is validated by expert 2’s responses.”

• The use of elicitation in the study is reported in Garthwaite, Chilcott, Jenkinson & Tappenden (2008).


• Al-Awadhi & Garthwaite (2006). Computational Statistics, 21, 121-140.

• Garthwaite (1998). Quantifying expert opinion for modelling habitat distributions. Sustainable Forest Management Tech. Report, Queensland Depart. Natural Resources.

• Garthwaite & Al-Awadhi (2006). Tech. Report 06/07. Dept. Statistics, Open University.

• Garthwaite, Chilcott, Jenkinson & Tappenden (2008). Int. J. Technology Assessment in Health Care, 24, 350-357.

• Kynn (2005). Eliciting expert knowledge for Bayesian logistic regression in species habitat modelling in natural resources. PhD thesis. Queensland University of Technology.

• Mason (2008). Methodological developments for combining data. www.Bias-project.org.uk/Papers/CombineDataAJM.pdf.

• O’Leary, Choy, Kynn, Denham, Martin, Mengersen & Murray (2009a). Environmetrics, 20, 379-398.

• O’Leary, Mengersen, Murray & Choy (2009b). Comparison of four expert elicitation methods. 18th World IMACS/MODSIM Congress.
