DTC module in statistical modelling and inference
MCMC practical – Fitting a 'bespoke' model

In this practical we are going to look at using MCMC to estimate posterior distributions of parameters in a 'bespoke' probabilistic model. This will hopefully highlight how MCMC can be used as a remarkably powerful tool to estimate parameters in complex models and to compare alternative models.

The data we are going to use is the 'memory' test data from the introductory statistics module. We actually need a bit more data than before, so we will collect it again. Working in pairs, one of you will use Matlab's randperm function to generate a permutation of the numbers 1-10. Read the sequence out slowly (don't let the other person see). The partner then has to repeat the sequence. You record the number of correct answers before the first mistake (if the person remembers the whole sequence correctly, record '10'). Do one practice, then repeat the test 3 times each. Assemble a class set of results (i.e. of size n people by 3 trials).

What we are interested in making inference about is whether there is a fixed probability of 'remembering' a number, or whether the probability of making an error increases with position in the sequence, and if so, how it increases. We can also use MCMC to explore whether different people have different capacities for remembering sequences of numbers.

1. The likelihood function

We are going to fit a model in which the probability, for individual j, of making a mistake after i trials, p_ij, increases with position in the sequence:

    p_{ij} = \frac{\exp\{\beta_{0j} + \beta_{1j} i\}}{1 + \exp\{\beta_{0j} + \beta_{1j} i\}} .

This is just the logistic model that we have seen before. However, our recorded data is the number of successes up to the first failure, rather than success or failure at each question, so the likelihood function needs to be adapted a bit. For example, the likelihood given that the first error occurs at the 4th question is the product of the success probabilities for the first three questions and the failure probability at the next:

    L = \frac{1}{1+\exp\{\beta_0\}} \times \frac{1}{1+\exp\{\beta_0+\beta_1\}} \times \frac{1}{1+\exp\{\beta_0+2\beta_1\}} \times \frac{\exp\{\beta_0+3\beta_1\}}{1+\exp\{\beta_0+3\beta_1\}} .

(If no error is made at all, the likelihood is just the product of the ten success probabilities.)

The first thing we will do is to assume that the betas are identical for all individuals. We need to choose decent priors for the betas. One suggestion is that β0 comes from a normal distribution with mean 0 and variance 1, while β1 comes from an exponential distribution with rate 1. For β0 = 0.2 and β1 = 0.5, calculate the prior and, for each recorded result, the likelihood (see the first code sketch at the end of section 2). The figure below shows the probability of an error at each step and the probability of the first error occurring at each step for these parameter values.

2. Setting up the MCMC

We need to construct a standard Metropolis–Hastings random walk to get the posterior distribution for our model parameters. I suggest using normal distributions with mean 0 and some variance (up to you) as the proposal distributions for each parameter. Using all the good MCMC practice that you have learned, construct an MCMC that provides you with an estimate of the posterior for each parameter. Answer the following. What are the mean posterior values for each parameter? What is the 95% equal-tailed credible interval (ETPI) for each? What does the joint posterior for the two parameters look like? What is the posterior correlation between the parameters? For every sample from your chain, calculate the p_i values (for i from 0 to 10). Plot 100 of these lines. At every value of i (from 0 to 10) calculate the mean and 95% credible interval. What do you infer about how hard it is to remember numbers as their position in the sequence increases? Some sketches of how these steps might be coded are given below.
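To make the likelihood concrete, here is a minimal Matlab sketch for a single recorded result. The function name loglik_first_error and the convention of indexing positions as i = 0, 1, 2, ... (matching the worked example above) are my own choices, not part of the handout; save it as loglik_first_error.m.

function ll = loglik_first_error(k, b0, b1)
% Log-likelihood of one recorded result k = number of correct answers
% before the first mistake (k = 10 means the whole sequence was remembered).
% Success probability at position i is 1/(1 + exp(b0 + b1*i)), i = 0,...,9.
    i  = 0:(k-1);                           % positions answered correctly
    ll = -sum(log(1 + exp(b0 + b1*i)));     % log prob of the k successes
    if k < 10                               % a mistake was actually made
        eta = b0 + b1*k;
        ll  = ll + eta - log(1 + exp(eta)); % log prob of failure at position k
    end
end

The log-likelihood of the full class data set is then the sum of this over all n × 3 recorded results, and the log-posterior adds the log-priors.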
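A minimal sketch of the random-walk sampler itself, assuming the class results are stored in a vector data and the function above is on the path; the proposal standard deviations, starting values and chain length are arbitrary choices that you should tune.

% Log-posterior up to an additive constant: likelihood plus
% N(0,1) log-prior for beta0 and Exp(rate 1) log-prior for beta1.
logpost = @(b) sum(arrayfun(@(k) loglik_first_error(k, b(1), b(2)), data(:))) ...
               - 0.5*b(1)^2 - b(2);

nits  = 50000;
sds   = [0.2, 0.2];                    % proposal s.d.s -- tune for good mixing
beta  = [0.2, 0.5];                    % starting values
chain = zeros(nits, 2);
cur   = logpost(beta);
for t = 1:nits
    prop = beta + sds .* randn(1, 2);  % symmetric random-walk proposal
    if prop(2) >= 0                    % Exp(1) prior puts no mass below 0
        new = logpost(prop);
        if log(rand) < new - cur       % Metropolis acceptance step
            beta = prop;  cur = new;
        end
    end
    chain(t, :) = beta;
end

After discarding burn-in, mean(chain), quantile(chain, [0.025 0.975]) and corr(chain(:,1), chain(:,2)) give the posterior means, the 95% ETPIs and the posterior correlation (quantile and corr are in the Statistics Toolbox).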
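For the error-probability curves, one possible way to do the plotting, reusing the chain above (the elementwise expansion of chain(:,2).*i needs Matlab R2016b or later; use bsxfun on older versions):

% Error probability at positions i = 0,...,10 for every posterior sample.
i   = 0:10;
eta = chain(:,1) + chain(:,2) .* i;          % nits-by-11 (implicit expansion)
ps  = exp(eta) ./ (1 + exp(eta));

pick = round(linspace(1, size(ps,1), 100));  % 100 roughly evenly spaced samples
plot(i, ps(pick, :)', 'Color', [0.8 0.8 0.8]); hold on
plot(i, mean(ps), 'k', 'LineWidth', 2);          % pointwise posterior mean
plot(i, quantile(ps, [0.025 0.975]), 'k--');     % pointwise 95% interval
xlabel('position in sequence'); ylabel('P(error)');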
3. Extending the model: Hard

We have no particular reason to believe that the parametric form of our model is true. So we might be interested in comparing alternative models. One possibility is to include a quadratic term in the logistic model:

    p_{ij} = \frac{\exp\{\beta_{0j} + \beta_{1j} i + \beta_{2j} i^2\}}{1 + \exp\{\beta_{0j} + \beta_{1j} i + \beta_{2j} i^2\}} .

We can put a prior on β2 (say exponential with parameter 10) and perform MCMC as before. Something else we can do, though, is to introduce 'indicator' variables for whether the coefficients for the linear and quadratic terms should be included in the model, i.e.

    p_{ij} = \frac{\exp\{\beta_{0j} + I_1 \beta_{1j} i + I_2 \beta_{2j} i^2\}}{1 + \exp\{\beta_{0j} + I_1 \beta_{1j} i + I_2 \beta_{2j} i^2\}} ,

where I1 and I2 are either 0 or 1. We can include these terms in the MCMC – say put a prior probability of 0.5 on each one being 1 – and perform inference about them. This is called 'trans-dimensional MCMC' or 'reversible-jump MCMC'. This technique allows you to explore complex models of differing dimensionality.

Construct an MCMC that allows you to perform inference about the indicator variables (note you need to propose moves from 0 to 1 and vice versa); one possible sketch is given below. What is the posterior probability of I2 being 1? As before, plot the posterior mean (and credible intervals) for the estimated p_i's. How much does this differ from the case without the quadratic term?
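A minimal sketch of the indicator moves, reusing nits and data from section 2. It assumes a helper loglik_quad(k, b, I), written like loglik_first_error but with linear predictor b(1) + I(1)*b(2)*i + I(2)*b(3)*i.^2. Note that this sketch keeps all three betas in the state even when an indicator is 0, so the dimension never actually changes (an excluded coefficient just drifts under its prior); this sidesteps the reversible-jump bookkeeping while still giving the posterior on the indicators. The flip proposal is symmetric and the 0.5/0.5 prior on each indicator cancels in the acceptance ratio.

% Log-posterior: likelihood plus N(0,1), Exp(rate 1) and Exp(rate 10) log-priors.
logpost = @(b, I) sum(arrayfun(@(k) loglik_quad(k, b, I), data(:))) ...
                  - 0.5*b(1)^2 - b(2) - 10*b(3);

b = [0.2, 0.5, 0.05];  I = [1, 1];
cur   = logpost(b, I);
keepI = zeros(nits, 2);
for t = 1:nits
    prop = b + 0.1*randn(1, 3);            % random-walk update of the betas
    if prop(2) >= 0 && prop(3) >= 0        % exponential priors: positive only
        new = logpost(prop, I);
        if log(rand) < new - cur,  b = prop;  cur = new;  end
    end
    j = 1 + (rand < 0.5);                  % pick one of the two indicators
    Iprop = I;  Iprop(j) = 1 - Iprop(j);   % propose flipping it (0 <-> 1)
    new = logpost(b, Iprop);               % indicator prior cancels (0.5 vs 0.5)
    if log(rand) < new - cur,  I = Iprop;  cur = new;  end
    keepI(t, :) = I;
end
mean(keepI(:, 2))    % estimate of the posterior probability that I2 = 1

The estimated p_i curves can then be plotted exactly as in section 2, using the sampled betas and indicators together.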