emo004153056s1

advertisement
Supplemental Materials
Negative Emotional Stimuli Enhance Vestibular Processing
by N. Preuss et al., 2015, Emotion
http://dx.doi.org/10.1037/emo0000092
Introduction
In this document, we provide Jags/Stan code for all models used in our data analysis. For both
response accuracy and response time, we first identified the best-fitting models by comparing
DIC (Deviance Information Criterion) for Jags models, and WAIC (Watanabe-Akaike information
criterion) for Stan models. Models with smaller DIC should be preferred over models with
greater DIC. WAIC is an improvement over DIC for Bayesian models. It is computed using the log
pointwise posterior predictive density and a correction term (Gelman et al., 2013a,b); it can be
interpreted as an approximation to cross-validation and is numerically more stable than DIC.
For further analysis, we used the models with the lowest DIC/WAIC values.
Accuracy Models
We estimated both random coefficients (random intercepts & random slopes for each
participant) models and random intercept models. Deviance information criteria (DIC) were
smaller for the random intercept models than for the random coefficients models. Therefore, we
reported the results of the random intercept model. We drew 2000 samples from the posterior
distribution using 3 chains, with 5000 iteration each, with a burn-in period of 1000 samples for
each chain. We used every third sample in order to minimize auto-correlation of the samples.
Each chain was monitored for convergence using the
Values of
(Gelman and Rubin, 1992) statistic.
close to 1 indicate convergence; this was achieved for all parameters.
For further details on implementation please see Gelman & Hill (2007).
Parameter estimation for Accuracy Data
We fit the following logistic regression model to the response accuracy data. Each response was
modeled as being drawn from a Bernoulli distribution with rate parameter giving the
probability of a correct response. We predicted the logit of using the following linear model:
where
is the effect of the neutral image category, is an additive effect for image category ,
is the effect of motion intensity and
is an additive participant effect.
We placed non-informative priors on all image effects and on the effect of motion intensity.
Participant effect was given prior centred at 0, with a uniform prior between 0 and 100 on the
between-participant standard deviation.
Jags code
model{
for (i in 1:length(response)) {
response[i] ~ dbern(theta[i])
# linear model on rate parameter theta
logit(theta[i]) <- (b0 + u[subject[i], 1]) +
b.snake * snake[i] +
b.threat * threat[i] +
b.erotic * erotic[i] +
b.sports * sports[i] +
b.mutilation * mutilation[i] +
b.intensity * intensity[i]
}
# priors on fixed effects
b0 ~ dnorm (0, .001)
b.snake ~ dnorm (0, .001)
b.threat ~ dnorm (0, .001)
b.mutilation ~ dnorm (0, .001)
b.sports ~ dnorm (0, .001)
b.erotic ~ dnorm (0, .001)
b.intensity ~ dnorm (0, .001)
# priors on participant effects
for (j in 1:J) { # J is the number of participants
u[j, 1] ~ dnorm (0, tau1.subject)
}
tau1.subject <- pow(sigma1.subject, -2)
sigma1.subject ~ dunif (0, 100)
}
Bayes Factor for Accuracy Data
In order to calculate a Bayes factor using the Savage-Dickey density ratio (Lee & Wagenmakers,
2014), we fit the following model to the accuracy data, using a similar model the above, but with
the images aggregated into 2 categories, negative and positive/neutral.
In this model, the coefficient
is thus an indicator variable which takes the value
for mutilation, snake and threat images, and the value for objects, sports and erotic content.
is thus an additive effect of negative images, and we tested the order-restricted
hypothesis that effect
was zero, versus the alternative hypothesis that
was drawn from a Half-Cauchy distribution with mean
Jags code
model{
for (i in 1:length(response)) {
response[i] ~ dbern(theta[i]) # accuracy
and scale
:
# linear model on rate parameter theta
logit(theta[i]) <- (b0 + u[id2[i], 1]) +
b.negative * negative[i] +
b.intensity * intensity[i]
}
# additive participant effect
for (j in 1:J) { # J is the number of participants
u[j, 1] ~ dnorm (0, tau1.subject)
}
tau1.subject <- pow(sigma1.subject, -2)
sigma1.subject ~ dunif(0, 100)
# priors on fixed effects
b0 ~ dnorm (0, .001)
# coefficient for negative images is defined as effect size * standard
# deviation
b.negative <- delta*sigma
b.intensity ~ dnorm(0, .001)
# delta and sigma come from half-Cauchy distributions
# see (Lee & Wagenmakers, 2014) for details
delta ~ dnorm(0, lambdadelta)T(0,)
lambdadelta ~ dchisqr(1)
sigma <- abs(sigmatmp)
sigmatmp ~ dnorm(0, lambdasigma)
lambdasigma ~ dchisqr(1)
}
Response time models
We modeled response time using an Ex-Gaussian (exponentially modified Gaussian) distribution
in Stan. The Ex-Gaussian results from the convolution of a normal and an exponential
distribution, and its probability density function (pdf) is given by:
where erfc is the complementary error function:
The parameters of the Ex-Gaussian are and
, the mean and variance of the normal
component, and , the rate of the exponential component. In particular, the parameters which
are usually of greatest interest are , which shifts the distribution along the x-axis, and ,
which affects the tail behavior.
After selecting the model with the lowest WAIC value, we fit 2 separate models in order to
estimate the effect of the image category on and , using the same linear model as above. In
model 1, we predicted using the linear model, while estimating participant-specific and
parameters, which were in turn given group-level prior distributions. All priors used in the
model were mildly informed to match the scale of the data being modeled.
The fixed effects coefficients (image coefficients) were given Cauchy priors (Gelman et al.,
2013a):
and the coefficient of motion intensity was given a normal distribution:
In model 1 (linear model on ), we estimated individual and parameters (for participant
), which were drawn from truncated normal distributions with group level parameters.
In model 2, we predicted the logarithm of using the same linear model. Since must be
,
it is natural to place a linear model on the logarithm of , rather than on itself. We estimated
individual and
parameters (for participant ); since is constrained to be > 0, we
used a normal distribution for individual and group-level priors.
For all RT models, we drew a total of 1000 samples from the joint posterior distribution, after
discarding the initial samples from the burn-in period (4 chains * 250 samples). Each of the 4
chains was checked for convergence using the
(Gelman and Rubin, 1992) statistic.
Parameter Estimation With Linear Model on
Stan code
data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
// all parameters given lower bounds are constrained to be positive
real<lower = 0> sigma[n_participants];
real<lower = 0> lambda[n_participants];
vector[n_images] beta;
real bmotion;
real u[n_participants];
real<lower = 0> lambda_mu;
real<lower = 0> lambda_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
model {
real mu[n_trials];
// group-level priors
lambda_mu ~ normal(0, 5);
lambda_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
// priors on fixed effects (image categories)
beta[1] ~ cauchy(0, 2.5); // intercept: neutral images
for (i in 2:n_images){ //additive effects of other image categories
beta[i] ~ cauchy(0, 2.5);
}
// random participant effects
for (j in 1:n_participants){
lambda[j] ~ normal(lambda_mu, lambda_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
for (k in 1:n_trials){
// linear model on mu
mu[k] <- X[k] * beta + u[participant[k]] + bmotion * motion[k];
//response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[k],
sigma[participant[k]],
lambda[participant[k]]);
}
}
generated quantities {
vector[n_trials] log_lik;
real mu[n_trials];
for (k in 1:n_trials){
mu[k] <- X[k] * beta + u[participant[k]] + bmotion * motion[k];
// used to calculate WAIC
log_lik[k] <- exp_mod_normal_log(y[k],
mu[k],
sigma[participant[k]],
lambda[participant[k]]);
}
}
Parameter Estimation With Linear Model on
Stan code
data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
// all parameters given lower bounds are constrained to be positive
real mu[n_participants];
real<lower = 0> sigma[n_participants];
vector[n_images] beta;
real bmotion;
real u[n_participants];
real mu_mu;
real<lower
real<lower
real<lower
real<lower
=
=
=
=
0>
0>
0>
0>
mu_sd;
sigma_mu;
sigma_sd;
u_sd;
}
model {
real lambda[n_trials];
// group-level priors
mu_mu ~ normal(0, 5);
mu_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
// priors on fixed effects (image categories)
beta[1] ~ cauchy(0, 2.5); //intercept: neutral images
for (i in 2:n_images){ //additive effects of other image categories
beta[i] ~ cauchy(0, 2.5);
}
// random participant effects
for (j in 1:n_participants){
mu[j] ~ normal(mu_mu, mu_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
/* because the lambda parameter must be positive, it makes sense
to fit a linear model on the logarithm scale */
for (k in 1:n_trials){
// linear model on log(lambda)
lambda[k] <- exp(X[k] * beta + u[participant[k]] + bmotion * motion[k]);
//response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[participant[k]],
sigma[participant[k]],
lambda[k]);
}
}
generated quantities {
vector[n_trials] log_lik;
real lambda[n_trials];
for (k in 1:n_trials){
lambda[k] <- exp(X[k] * beta + u[participant[k]] + bmotion * motion[k]);
// used to calculate WAIC
log_lik[k] <- exp_mod_normal_log(y[k],
mu[participant[k]],
sigma[participant[k]],
lambda[k]);
}
}
Bayes Factor Models
In order to calculate a Bayes factor using the Savage-Dickey density ratio (Lee & Wagenmakers,
2014), we fit the following models to the RT data, using a similar model the above. The images
aggregated into 2 categories, negative and positive/neutral.
Bayes Factor Model for Effect of Negative Images on
Stan code
data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
real<lower = 0> sigma[n_participants];
real<lower = 0> lambda[n_participants];
real alpha;
// for an order-restricted comparison:
real<lower=0> delta;
// real delta; // for undirected comparison:
real<lower = 0> sigmab;
real bmotion;
real u[n_participants];
real<lower = 0> lambda_mu;
real<lower = 0> lambda_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
transformed parameters {
real beta;
// beta is the additive component for negative images
beta <- delta * sigmab;
}
model {
real mu[n_trials];
// priors
lambda_mu ~ normal(0, 5);
lambda_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
alpha ~ normal(0, 2.5);
// Half-Cauchy prior on effect size delta
delta ~ cauchy(0, 1);
sigmab ~ cauchy(0, 1);
for (j in 1:n_participants){
lambda[j] ~ normal(lambda_mu, lambda_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
for (k in 1:n_trials){
// linear model on mu
mu[k] <- X[k, 1] * alpha + X[k, 2] * beta +
u[participant[k]] + bmotion * motion[k];
// response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[k],
sigma[participant[k]],
lambda[participant[k]]);
}
}
Bayes Factor Model for Effect of Negative Images on
Stan code
data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
real mu[n_participants];
real<lower = 0> sigma[n_participants];
real alpha;
real delta;
real<lower = 0> sigmab;
real bmotion;
real u[n_participants];
real mu_mu;
real<lower = 0> mu_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
transformed parameters {
real beta;
// beta is the additive component for negative images
beta <- delta * sigmab;
}
model {
real lambda[n_trials];
// priors
mu_mu ~ normal(0, 5);
mu_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
alpha ~ normal(0, 2.5);
// Half-Cauchy prior on effect size delta
delta ~ cauchy(0, 1);
sigmab ~ cauchy(0, 1);
for (j in 1:n_participants){
mu[j] ~ normal(mu_mu, mu_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
for (k in 1:n_trials){
// linear model on log(lambda)
lambda[k] <- exp(X[k, 1] * alpha + X[k, 2] * beta +
u[participant[k]] + bmotion * motion[k]);
// response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[participant[k]],
sigma[participant[k]],
lambda[k]);
}
}
References
Gelman, A. and J. Hill (2006). Data analysis using regression and multilevel/hierarchical models,
Cambridge University Press
Gelman, A., and Rubin, D. B. (1992). Inference from iterative simulation using multiple
sequences. Stat. Sci. 457–472.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013a). Bayesian
Data Analysis, 3rd Edn., eds A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B.
Rubin (Boca Raton, FL: CRC press), 165–196.
Gelman, A., Hwang, J., and Vehtari, A. (2013b). Understanding predictive information criteria for
Bayesian models. Stat. Comput. 1–20.
Lee, M. D. and E.-J. Wagenmakers (2013). Bayesian Modeling or Cognitive Science: A Practical
Course, Cambridge University Press.
Download