Applied Bayesian Analysis for the Social Sciences


Philip Pendergast

Computing and Research Services

Department of Sociology

philip.pendergast@colorado.edu

Sponsored by Computing and Research Services and the Institute of Behavioral Science

Suspending Disbelief: Faith in Classical Statistics

What are some issues that we have with classical statistics? Think back to your introductory class…



• Conducting an infinite number of experiments/repeated sampling

• Assume that some parameter θ is unknown but has a fixed value

• p-value worship

• Null hypothesis testing

• Multiple comparisons

• Strict data assumptions, often unmet

• Confidence Interval interpretation

• Small samples are an issue


The Coin Flip

• Frequentist

– We can determine the bias of a coin (b) by repeatedly flipping it and counting heads. As long as we repeat the process enough times, we should be able to estimate the “true” bias of the coin.

– If p < .05 under the null hypothesis that b = 0.5, we reject the null hypothesis that the coin is unbiased.
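The frequentist decision rule above can be sketched as an exact two-sided binomial test. This is only an illustration: the function name and the data (61 heads in 100 flips) are hypothetical and do not come from the slides.

```python
from math import comb

def binomial_two_sided_p(heads, flips, b0=0.5):
    """Exact two-sided p-value for the null hypothesis that the bias equals b0.

    Sums the probability of every outcome that is no more likely
    under the null than the observed number of heads.
    """
    pmf = [comb(flips, k) * b0**k * (1 - b0)**(flips - k) for k in range(flips + 1)]
    observed = pmf[heads]
    # The small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

# Hypothetical data: 61 heads in 100 flips.
p_value = binomial_two_sided_p(61, 100)
reject_null = p_value < 0.05
print(f"p = {p_value:.4f}, reject the null: {reject_null}")
```

Note that the p-value is the probability of data at least this extreme if b really were 0.5. It is not the probability that b = 0.5.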


The Nail Flip

• Frequentist

– We determine the bias of a nail (b) by repeatedly flipping it and counting “heads” (landing on its flat base).

– If p < .05 under the null hypothesis that b = 0.5, we reject the null hypothesis that the nail is unbiased.


Does this seem reasonable? Don’t we know that the nail is biased?

Classical Statistics is Atheoretical

• Science is an iterative process; we should learn from past research.

• Theory should guide us in how we analyze data.

– Typically, beyond the literature review, theory informs:

• Variable selection

• Model building

• Choice of model (e.g. SEM, HLM)

• NOT the actual way parameters are estimated in the analysis

Bayesian Statistics and Theory

• Bayesian statistics considers θ to be unknown, possessing a probability distribution reflecting our degree of uncertainty about it.

• We take into consideration theory and uncertainty when estimating θ.

• The Posterior: A probability distribution for θ given our data on hand.

• The Data: Needs only meet the assumption of exchangeability.

• The Prior: A distribution based on knowledge about θ, and our certainty.

p(θ|y) ∝ p(y|θ) p(θ)
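To make the proportionality concrete, here is a minimal sketch that multiplies a likelihood by a prior on a grid of candidate values and normalizes the result. The coin data (7 heads in 10 flips), the flat prior, and the function names are assumptions chosen for illustration.

```python
def grid_posterior(heads, flips, prior, grid_size=1001):
    """Approximate p(theta | y) on an evenly spaced grid over [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior: likelihood p(y | theta) times prior p(theta).
    unnormalized = [t**heads * (1 - t)**(flips - heads) * prior(t) for t in thetas]
    total = sum(unnormalized)
    return thetas, [u / total for u in unnormalized]

def flat_prior(theta):
    """Hypothetical non-informative prior: every value equally plausible."""
    return 1.0

thetas, weights = grid_posterior(heads=7, flips=10, prior=flat_prior)
posterior_mean = sum(t * w for t, w in zip(thetas, weights))
print(f"posterior mean ≈ {posterior_mean:.3f}")
```

Swapping `flat_prior` for a function that puts more weight near some value shows how the prior pulls the posterior toward it.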

The Nail Flip

• Bayesian

– Prior Beliefs: We consult several nail experts, who are relatively certain that nails will land on their heads only 1 time in 50, or 2% of the time.

– Data on Hand: We flip the nail 100 times.

– Posterior: We sample from the posterior distribution (our prior beliefs updated by the data) to see whether the experts’ opinions are reasonable and/or whether our nail shares a similar bias to other nails.
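A rough sketch of this update, under two loudly stated assumptions: the experts' belief is encoded as a Beta(1, 49) prior (mean 1/50), and the 100 flips are imagined to produce 4 “heads”. Neither number is given in the slides beyond the 2% figure.

```python
def beta_binomial_update(alpha, beta, heads, flips):
    """Beta(alpha, beta) prior plus binomial data gives a Beta posterior."""
    return alpha + heads, beta + (flips - heads)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Assumption: Beta(1, 49) stands in for the experts' belief (prior mean 2%).
prior = (1.0, 49.0)
# Hypothetical outcome of the 100 flips: 4 land on the head.
heads, flips = 4, 100

posterior = beta_binomial_update(*prior, heads, flips)
print("prior mean:    ", beta_mean(*prior))
print("sample rate:   ", heads / flips)
print("posterior mean:", round(beta_mean(*posterior), 4))
```

The posterior mean falls between the experts' 2% and the observed rate, which is the compromise between theory and data that the Bayesian approach formalizes.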

Well, if we examine the anatomy of the nail…

Priors and Subjectivity

• “B-B-B-Bbbbut wait, aren’t these priors subjective? We are objective scientists!”

– Variable selection, model choice, research questions are all subjective decisions.

• By making these subjective decisions explicit, we open ourselves to critique and are forced to thoughtfully choose and defend our choice of priors. If we have no good theory, we must choose a prior that lets the data speak for itself.

Choosing Sensible Priors

• How much do we know? How accurate do we take this information to be?

– Informative priors: Historical data, expert opinion, past research findings, theoretical implications.

– Non-informative prior: Uniform distribution over a sensible range of values.

• If the prior has high precision (1/σ²) or N is small, it will heavily influence the posterior distribution. If it has low precision or N is large, the data influences the posterior more.
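The tug-of-war between prior precision and sample size can be sketched for the simplest case: a normal mean with known data standard deviation and a normal prior. All numbers below are hypothetical, and the function name is made up for this example.

```python
def normal_posterior(prior_mean, prior_sd, data_mean, data_sd, n):
    """Posterior mean and sd for a normal mean with known data_sd."""
    prior_precision = 1.0 / prior_sd**2   # 1 / sigma^2 of the prior
    data_precision = n / data_sd**2       # grows with the sample size N
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_precision * prior_mean
                 + data_precision * data_mean) / post_precision
    return post_mean, (1.0 / post_precision) ** 0.5

# Hypothetical setup: prior centered at 0, sample mean of 10, data sd of 5.
for n in (5, 500):
    for prior_sd in (0.5, 50.0):
        mean, _ = normal_posterior(0.0, prior_sd, 10.0, 5.0, n)
        print(f"N={n:>3}, prior sd={prior_sd:>4}: posterior mean={mean:.2f}")
```

With a tight prior and N = 5 the posterior mean stays close to the prior's 0; with a diffuse prior or N = 500 it moves toward the sample mean of 10.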

Conjugate Prior Distributions

• Conjugate priors have a distribution that yields a posterior distribution in the same family as the prior when combined with data.

Data Distribution    Conjugate Prior
Normal               Normal or Uniform
Poisson              Gamma
Binomial             Beta
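As a small illustration of why conjugacy is convenient, the Poisson/Gamma pairing reduces the whole update to two additions. The prior parameters and counts below are hypothetical, and the Gamma is written in its shape/rate form.

```python
def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior plus Poisson counts gives a Gamma posterior."""
    return shape + sum(counts), rate + len(counts)

prior = (2.0, 1.0)          # hypothetical prior, mean = shape / rate = 2
counts = [3, 5, 4, 6, 2]    # hypothetical Poisson-distributed observations
shape, rate = gamma_poisson_update(*prior, counts)
print("posterior:", (shape, rate), "posterior mean:", round(shape / rate, 3))
```

Because the posterior is again a Gamma, it can serve directly as the prior for the next batch of data.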

The Posterior Distribution and Monte Carlo Integration

• Recall that p(θ|y) is a probability distribution.

• It is computationally demanding to directly derive summary measures of p(θ|y).

• Instead, we repeatedly sample from p(θ|y) and summarize the distribution formed by these samples.

– This is called Monte Carlo Integration
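A minimal sketch of the idea, assuming a posterior we can sample from directly: here a hypothetical Beta(5, 145), chosen only because its exact mean is known and can be compared with the sampled estimate.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

alpha, beta = 5.0, 145.0  # hypothetical posterior parameters
draws = sorted(random.betavariate(alpha, beta) for _ in range(20000))

# Summaries of the samples stand in for integrals over p(theta | y).
mc_mean = sum(draws) / len(draws)
lower = draws[int(0.025 * len(draws))]   # equal-tailed 95% interval,
upper = draws[int(0.975 * len(draws))]   # which is not the same as an HDI
exact_mean = alpha / (alpha + beta)

print(f"sampled mean = {mc_mean:.4f}, exact mean = {exact_mean:.4f}")
print(f"95% interval ≈ ({lower:.4f}, {upper:.4f})")
```

The more draws we take, the closer the sampled summaries get to the true values, which is why the number of iterations matters.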

Markov Chain Monte Carlo, Explained


Markov Chains, Continued

• We specify the number of chains as well as the number of iterations made.

• They “dance” around the posterior from starting values, moving to areas of higher density.

• Chains stabilize around the high-density region of the posterior (near the posterior mean).

• Once the chains have stabilized, discard the early iterations (burn-in samples).

• Estimates of the posterior come from the post-burn-in period.
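The behavior described above can be sketched with a random-walk Metropolis sampler, one simple member of the MCMC family and not necessarily the algorithm any particular package uses. The target is a hypothetical posterior (a Beta(1, 49) prior with 4 heads in 100 flips); the step size, seed, starting values, and function names are all assumptions.

```python
import math
import random

def log_posterior(theta, heads=4, flips=100, a=1.0, b=49.0):
    """Unnormalized log posterior: Beta(a, b) prior times binomial likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((a - 1 + heads) * math.log(theta)
            + (b - 1 + flips - heads) * math.log(1.0 - theta))

def metropolis_chain(start, iterations, step, rng):
    """Random-walk Metropolis: returns one chain of iterations + 1 values."""
    chain = [start]
    current_lp = log_posterior(start)
    for _ in range(iterations):
        proposal = chain[-1] + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain

rng = random.Random(42)
iterations, burn_in = 5000, 1000
starts = [0.05, 0.5, 0.9]  # deliberately spread-out starting values
chains = [metropolis_chain(s, iterations, 0.02, rng) for s in starts]

# Keep only the post-burn-in draws from every chain.
kept = [x for chain in chains for x in chain[burn_in:]]
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean ≈ {posterior_mean:.3f}")
```

Plotting each chain against the iteration number (a trace plot) would show the three chains starting far apart and then overlapping once they reach the high-density region.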

Bayesian Analysis (Finally!)

• Decide on a model.

• Specify the # of Markov Chains, # of iterations, a burn-in period, and your prior beliefs.

• Run model diagnostics to check for convergence.

• Compare results of models with different specifications of priors, parameters, etc. to see which best “returns” the data in hand or obtains the highest model fit (e.g. BIC, Bayes Factor, Deviance).
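One common convergence check, offered here as an example and not as the diagnostic the slides have in mind, is the Gelman-Rubin statistic (R-hat), which compares the variation between chains with the variation within them. The chains below are synthetic stand-ins generated from normal distributions, not real sampler output.

```python
import random

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance, scaled by the chain length.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    # Average within-chain sample variance.
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(0)
# Synthetic "converged" chains: all exploring the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Synthetic "stuck" chains: each exploring a different region.
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 3.0, 6.0)]

print(f"R-hat, mixed chains: {gelman_rubin(mixed):.3f}")
print(f"R-hat, stuck chains: {gelman_rubin(stuck):.3f}")
```

Values close to 1 suggest the chains agree; a frequently quoted rule of thumb treats values above roughly 1.1 as a sign that more iterations or a longer burn-in are needed.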

Overcoming Classical Shortcomings

• Conducting an infinite number of experiments/repeated sampling

– Bayesian: Only use data on hand, no extrapolating to other potential(ly conflicting) data

• Assume that some parameter θ is unknown but has a fixed value

– Bayesian: Directly estimate our uncertainty of θ

• p-value worship

– Bayesian: Report HDIs, thoughtfully draw conclusions

• Null hypothesis testing

– Bayesian: More meaningful hypothesis testing (e.g. different priors)

• Multiple comparisons

– Bayesian: Not an issue

• Strict data assumptions, often unmet

– Bayesian: Minimal assumptions (exchangeability)

• Confidence interval interpretation

– Bayesian: HDI shows the believability (probability) of values

• Small samples are an issue

– Bayesian: If strong priors, still useful

References

Kruschke, J. K. (2011). Doing Bayesian Data Analysis: A Tutorial with R and BUGS. Oxford: Academic Press.

Kaplan, D. (2014). Bayesian Statistics for the Social Sciences. New York: Guilford Press.

R “MCMCpack” Tutorial

• Run simple models predicting job satisfaction as a function of income.

• One model uses an uninformative prior (specifically, the uniform distribution)

• The other uses an informed prior from earlier data

• Compare the Bayes Factors to see which “retrieves” the data better (i.e. is a better fit)

R “MCMCpack” Tutorial

• Open R

• Click “Packages”-->Set CRAN mirror-->Pick anything in the US.

• Open “Packages” again-->Install Packages-->Scroll down to MCMCpack.

• Say “yes” to a new library.

• Type “library(MCMCpack)” to load it, and also type “library(foreign)” to enable reading of the Stata file.
