DTC module in statistical modelling and inference
MCMC practical – Fitting a ‘bespoke’ model
In this practical we are going to look at using MCMC to estimate posterior distributions of
parameters in a ‘bespoke’ probabilistic model. This will hopefully highlight how MCMC can be used
as a remarkably powerful tool to estimate parameters in complex models and to compare
alternative models.
The data we are going to use is the ‘memory’ test data from the introductory statistics module. We
actually need a bit more data than before, so we will collect it again. Working in pairs, one of you
will use Matlab’s randperm function to generate a permutation of the numbers 1-10. Read the
sequence out slowly (don’t let the other person see). The partner then has to repeat the sequence.
You record the number of correct answers before the first mistake (if the person remembers the
sequence correctly, record ‘10’). Do one practice, then repeat the test 3 times each. Assemble a
class set of results (i.e. of size n people by 3 trials).
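If you would rather simulate the experiment (or check your scoring), the protocol above is easy to express in code. A minimal Python sketch (the practical itself uses Matlab's randperm; `random.sample` is the Python analogue, and the `score` helper is our own):

```python
import random

# Python analogue of Matlab's randperm(10): a random permutation of 1-10
sequence = random.sample(range(1, 11), 10)

def score(sequence, response):
    """Number of correct answers before the first mistake (10 if perfect)."""
    for k, (s, r) in enumerate(zip(sequence, response)):
        if s != r:
            return k
    return len(sequence)
```
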
What we are interested in making inference about is whether there is a fixed probability of
‘remembering’ a number, or whether the probability of making an error increases with position in the
sequence, and if so, how it increases. We can also use MCMC to explore whether different
people have different capacities for remembering sequences of numbers.
1. The likelihood function
We are going to fit a model in which the probability, for individual j, of making a mistake at
position i, p_i, increases with position in the sequence:

$$p_i = \frac{\exp\{\beta_{0j} + \beta_{1j} i\}}{1 + \exp\{\beta_{0j} + \beta_{1j} i\}}.$$
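As a quick sanity check, this error probability can be evaluated directly (a minimal Python sketch; the function name `error_prob` is our own, and the betas are taken as shared across individuals for now):

```python
import math

def error_prob(i, beta0, beta1):
    """Logistic probability of a mistake at position i (i = 0, 1, 2, ...)."""
    z = beta0 + beta1 * i
    return math.exp(z) / (1.0 + math.exp(z))

# With beta0 = 0.2 and beta1 = 0.5 the error probability rises with position
probs = [error_prob(i, 0.2, 0.5) for i in range(4)]
```
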
This is just the logistic model that we have seen before. However, our recorded data is the number
of successes before the first failure, rather than success and failure at each question, so the likelihood
function needs to be adapted a bit. For example, the likelihood given that the first error occurs at
the 4th question is the product of the probabilities of success at the first three questions and failure
at the next:
$$L = \frac{1}{1+\exp\{\beta_0\}} \times \frac{1}{1+\exp\{\beta_0+\beta_1\}} \times \frac{1}{1+\exp\{\beta_0+2\beta_1\}} \times \frac{\exp\{\beta_0+3\beta_1\}}{1+\exp\{\beta_0+3\beta_1\}}.$$
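In code, the likelihood of one recorded score (the number of correct answers before the first mistake, with 10 meaning perfect recall) might look like the following Python sketch; the function name and the explicit n=10 cap are our own choices:

```python
import math

def likelihood_one(c, beta0, beta1, n=10):
    """Likelihood of observing c correct answers before the first mistake.

    Success at position i has probability 1 / (1 + exp(beta0 + beta1*i)).
    If c < n the sequence ends with a mistake at position c; c == n means
    the whole sequence was recalled, so no failure term is included."""
    L = 1.0
    for i in range(c):                      # successes at positions 0 .. c-1
        L *= 1.0 / (1.0 + math.exp(beta0 + beta1 * i))
    if c < n:                               # first mistake at position c
        z = beta0 + beta1 * c
        L *= math.exp(z) / (1.0 + math.exp(z))
    return L
```

A useful check is that the likelihoods of all possible scores 0 to 10 sum to one, since they exhaust the outcomes of a single test.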
The first thing we will do is to assume that the betas are identical for all individuals. We need to
choose decent priors for the betas. One suggestion is that β0 comes from a normal distribution with
mean 0 and variance 1, while β1 comes from an exponential distribution with rate 1. For β0 = 0.2 and
β1 = 0.5, calculate the prior and, for each sample, the likelihood. The figure below shows the
probability of an error at each step and the probability of the first error occurring at each step for
these parameter values.
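These prior and likelihood evaluations can be sketched as follows (Python, working on the log scale to avoid underflow; the `scores` list is placeholder data, not the class results):

```python
import math

def log_prior(beta0, beta1):
    """Log prior density: beta0 ~ Normal(0, 1), beta1 ~ Exponential(rate 1)."""
    if beta1 < 0:
        return -math.inf                    # exponential prior has no mass below 0
    return -0.5 * math.log(2 * math.pi) - 0.5 * beta0 ** 2 - beta1

def log_likelihood(data, beta0, beta1, n=10):
    """Sum of per-score log likelihoods (score = correct answers before
    the first mistake; a score of n means perfect recall)."""
    total = 0.0
    for c in data:
        for i in range(c):                  # successes at positions 0 .. c-1
            total -= math.log1p(math.exp(beta0 + beta1 * i))
        if c < n:                           # first mistake at position c
            z = beta0 + beta1 * c
            total += z - math.log1p(math.exp(z))
    return total

scores = [3, 5, 2, 10, 4, 6, 1, 7]          # placeholder; use the class data
lp = log_prior(0.2, 0.5) + log_likelihood(scores, 0.2, 0.5)
```
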
2. Setting up the MCMC
We need to construct a standard Metropolis-Hastings random walk to get the posterior distribution
for our model parameters. I suggest using normal distributions with mean 0 and some variance (up
to you) as the proposal distributions for each parameter. Using all the good MCMC practice that you
have learned, construct an MCMC that provides you with an estimate of the posterior for each
parameter. Answer the following.
• What are the mean posterior values for each parameter?
• What is the 95% credible interval for each (ETPI)?
• What does the joint posterior for the two parameters look like? What is the posterior
correlation between the parameters?
• For every sample from your chain, calculate the p_i values (i from 0 to 10). Plot 100 of these
lines. At every value (from 0 to 10) calculate the mean and 95% credible interval. What do
you infer about how hard it is to remember numbers as their position in the sequence
increases?
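One way to structure the sampler is sketched below (pure Python; the practical suggests Matlab, but the algorithm is identical). The placeholder data, starting point, step size, and crude burn-in are all our own choices and should be tuned using the MCMC diagnostics you have learned:

```python
import math
import random

def log1pexp(z):
    """Numerically safe log(1 + exp(z))."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))

def log_post(data, b0, b1, n=10):
    """Log posterior up to a constant: N(0,1) prior on b0, Exp(1) on b1."""
    if b1 < 0:
        return -math.inf
    lp = -0.5 * b0 ** 2 - b1
    for c in data:
        for i in range(c):
            lp -= log1pexp(b0 + b1 * i)     # success at position i
        if c < n:
            z = b0 + b1 * c
            lp += z - log1pexp(z)           # first mistake at position c
    return lp

def metropolis(data, n_iter=5000, step=0.3, seed=1):
    """Random-walk Metropolis-Hastings with independent normal proposals."""
    random.seed(seed)
    b0, b1 = 0.0, 0.5                       # arbitrary starting point
    cur = log_post(data, b0, b1)
    chain = []
    for _ in range(n_iter):
        p0 = b0 + random.gauss(0.0, step)   # symmetric proposal, so the
        p1 = b1 + random.gauss(0.0, step)   # acceptance ratio is just the
        new = log_post(data, p0, p1)        # posterior ratio
        if new >= cur or random.random() < math.exp(new - cur):
            b0, b1, cur = p0, p1, new
        chain.append((b0, b1))
    return chain

scores = [3, 5, 2, 10, 4, 6, 1, 7]          # placeholder; use the class data
chain = metropolis(scores)
burn = chain[len(chain) // 2:]              # crude burn-in: discard first half
mean_b0 = sum(b0 for b0, _ in burn) / len(burn)
mean_b1 = sum(b1 for _, b1 in burn) / len(burn)
```

Posterior means come from averaging the retained samples, and the 95% credible intervals from the 2.5% and 97.5% empirical quantiles of each parameter's samples.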
3. Extending the model: Hard
We have no reason to particularly believe the parametric form for our model to be true. So we
might be interested in comparing alternative models. One possibility is to include a quadratic term
in the logistic model
$$p_i = \frac{\exp\{\beta_{0j} + \beta_{1j} i + \beta_{2j} i^2\}}{1 + \exp\{\beta_{0j} + \beta_{1j} i + \beta_{2j} i^2\}}.$$
We can put a prior on 2 (say exponential with parameter 10) and perform MCMC as before.
Something else we can do, though is to introduce ‘indicator’ variables for whether the coefficients
for the linear and quadratic terms should be included in the model. I.e.
$$p_i = \frac{\exp\{\beta_{0j} + I_1 \beta_{1j} i + I_2 \beta_{2j} i^2\}}{1 + \exp\{\beta_{0j} + I_1 \beta_{1j} i + I_2 \beta_{2j} i^2\}},$$
where I1 and I2 are either 0 or 1. We can include these terms in the MCMC – say, put a prior
probability of 0.5 on each one being 1 – and perform inference about them. This is called ‘trans-dimensional MCMC’ or
‘reversible-jump MCMC’. This technique allows you to explore complex models of differing
dimensionality. Construct an MCMC that allows you to perform inference about the indicator
functions (note you need to propose moves from 0 to 1 and vice versa). What is the posterior
probability that I2 is 1? As before, plot the posterior mean (and credible intervals) for the
estimated p_i values. How much does this differ from the case without the quadratic term?
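A simplified sketch of such a sampler is given below, in a variable-selection style (often attributed to Kuo and Mallick) where all three coefficients stay in the state vector and the indicators merely switch terms on and off; a full reversible-jump sampler would add and delete parameters explicitly. All tuning choices and the placeholder data are our own:

```python
import math
import random

def log1pexp(z):
    """Numerically safe log(1 + exp(z))."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))

def log_post(data, b0, b1, b2, i1, i2, n=10):
    """Log posterior up to a constant. Priors: b0 ~ N(0,1), b1 ~ Exp(1),
    b2 ~ Exp(10); I1, I2 ~ Bernoulli(0.5) (a constant, so omitted)."""
    if b1 < 0 or b2 < 0:
        return -math.inf
    lp = -0.5 * b0 ** 2 - b1 - 10.0 * b2
    for c in data:
        for i in range(c):
            lp -= log1pexp(b0 + i1 * b1 * i + i2 * b2 * i * i)
        if c < n:
            z = b0 + i1 * b1 * c + i2 * b2 * c * c
            lp += z - log1pexp(z)
    return lp

def indicator_mcmc(data, n_iter=4000, step=0.3, seed=1):
    random.seed(seed)
    b0, b1, b2, i1, i2 = 0.0, 0.5, 0.05, 1, 1
    cur = log_post(data, b0, b1, b2, i1, i2)
    chain = []
    for _ in range(n_iter):
        # 1. random-walk update of the coefficients
        p0 = b0 + random.gauss(0.0, step)
        p1 = b1 + random.gauss(0.0, step)
        p2 = b2 + random.gauss(0.0, step / 10.0)
        new = log_post(data, p0, p1, p2, i1, i2)
        if new >= cur or random.random() < math.exp(new - cur):
            b0, b1, b2, cur = p0, p1, p2, new
        # 2. propose flipping one indicator (0 -> 1 or 1 -> 0)
        if random.random() < 0.5:
            j1, j2 = 1 - i1, i2
        else:
            j1, j2 = i1, 1 - i2
        new = log_post(data, b0, b1, b2, j1, j2)
        if new >= cur or random.random() < math.exp(new - cur):
            i1, i2, cur = j1, j2, new
        chain.append((b0, b1, b2, i1, i2))
    return chain

scores = [3, 5, 2, 10, 4, 6, 1, 7]          # placeholder; use the class data
chain = indicator_mcmc(scores)
burn = chain[len(chain) // 2:]
p_I2 = sum(i2 for *_, i2 in burn) / len(burn)   # posterior P(I2 = 1)
```

Note that when an indicator is 0, the corresponding coefficient is unconstrained by the data and simply wanders under its prior, which is what makes this fixed-dimension trick a valid stand-in for an explicit reversible-jump move here.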