Multilevel Analysis
By Zach Andersen, Jon Durrant, Jayson Talakai

OUTLINE
Jon – What is Multilevel Regression
Jayson – The Model
Zach – R code applications / examples

WHAT IS MULTILEVEL REGRESSION
Regression models at multiple levels, because of dependencies in nested data
Not two-stage: this occurs all at once

EXAMPLES
•Students in schools
•Individuals by area
•Employees in organizations
•Firms in various industries
•Repeated observations on a person
https://www.youtube.com/watch?v=wom6uPdI-P4

WHEN TO USE A MULTILEVEL MODEL?
•Individual units (often people), with group indicators (e.g. schools, areas)
•Dependent variable (level 1)
•More than one person per group
•Generally we need at least 5 groups, preferably more (ugly rule of thumb)
https://www.youtube.com/watch?v=wom6uPdI-P4

WHEN TO USE A MULTILEVEL MODEL? (continued)
Use a multilevel model whenever your data are grouped (or nested) into categories (or clusters)
Allows for the study of effects that vary by group
Regular regression ignores the average variation between groups and may lack the ability to generalize
http://www.princeton.edu/~otorres/Multilevel101.pdf

DATA STRUCTURE AND DEPENDENCE
•Independence makes sense sometimes and keeps statistical theory relatively simple.
  • E.g., standard error(sample average) = s/√n requires that the n observations are independent
•But data often have structure, and observations have things in common: same area, same school, repeated observations on the same person
•Observations usually cannot be regarded as independent
https://www.youtube.com/watch?v=wom6uPdI-P4

Multilevel Models
https://www.youtube.com/watch?v=wrTiCfgGdro

PROBLEMS CAUSED BY CORRELATION
•Imprecise parameter estimates
•Incorrect standard errors

A SIMPLE 2-LEVEL HIERARCHY
Level 2: School 1, School 2
Level 1: Student 1, Student 2, Student 3 (within each school)
https://www.youtube.com/watch?v=wom6uPdI-P4

PEOPLE ARE AT LEVEL 1??
The first level of a hierarchy is not necessarily a person
https://www.youtube.com/watch?v=wom6uPdI-P4

A SIMPLE 2-LEVEL HIERARCHY
Level 2: Industry 1, Industry 2
Level 1: Firm 1, Firm 2, Firm 3 (within each industry)
https://www.youtube.com/watch?v=wom6uPdI-P4

A SIMPLE 2-LEVEL HIERARCHY
Level 2: Person 1, Person 2
Level 1: Event 1, Event 2, Event 3 (within each person)
https://www.youtube.com/watch?v=wom6uPdI-P4

BRIEF HISTORY
•Problems of single-level analysis, cross-level inferences and ecological fallacy
https://www.youtube.com/watch?v=wom6uPdI-P4

DISCUSSION AS TO WHY A NORMAL REGRESSION CAN BE A POOR MODEL
•Because reality might not conform to the assumptions of linear regression (independence)
•Because in nature observations tend to cluster
  • A random person in Lubbock is more likely to be a student than a random person in another city (clustering of populations/not independent)
•Different clusters react differently
https://www.youtube.com/watch?v=wom6uPdI-P4

EXTENSIONS
•Focus was initially on hierarchical structures and especially students in schools
•Also longitudinal, geographical studies
•More recently moved to non-hierarchical situations such as cross-classified
models (a single level-1 unit is part of more than one group)

INTRACLASS CORRELATION
•Level 1 variance explained by the group (level 2)
•ICC is the proportion of group-level variance to the total variance
•Formula for ICC: ICC = group-level variance / overall variance
http://en.wikipedia.org/wiki/Intraclass_correlation

MULTILEVEL MODELING
•Random or Fixed Effects
  • What are random and fixed effects?
  • When should you use random and fixed effects?
  • Types of random effects models
•The Model
  • Assumptions of the model
  • Building a multilevel model

FIXED VS RANDOM EFFECTS
**Anytime that you see the word “population” substitute it with the word “processes.”
http://www2.sas.com/proceedings/forum2008/374-2008.pdf

INTRODUCING THE MODEL

Types of Models: Random Intercepts Model
•Intercepts are allowed to vary:
  • The scores on the dependent variable for each individual observation are predicted by the intercept that varies across groups.
http://en.wikipedia.org/wiki/Multilevel_model

Types of Models: Random Slopes Model
•Slopes are different across groups.
•This model assumes that intercepts are fixed (the same across different contexts).
http://en.wikipedia.org/wiki/Multilevel_model
http://www.strath.ac.uk/aer/materials/5furtherquantitativeresearchdesignandanalysis/unit4/randomslopemodelling/

Types of Models: Random intercepts and slopes model
•Includes both random intercepts and random slopes
•Is likely the most realistic type of model, although it is also the most complex.
http://en.wikipedia.org/wiki/Multilevel_model

Assumptions for Multilevel Models
Modification of assumptions
Linearity and normality assumptions are retained
Homoscedasticity and independence of observations need to be adjusted.
1. Observations within a group are more similar to each other than to observations in different groups.
2. Groups are independent from other groups, but observations within a group are not.
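
The ICC defined on the INTRACLASS CORRELATION slide can be estimated by hand from grouped data. Below is a minimal sketch in Python (the presentation's own examples are in R and are not reproduced here). It assumes balanced groups and uses the one-way ANOVA estimator of the between-group and within-group variances; the function name icc_oneway and the toy scores are hypothetical and do not come from the slides.

```python
from statistics import mean


def icc_oneway(groups):
    """Estimate the ICC from a one-way ANOVA decomposition.

    Assumption for this sketch: every group has the same number of
    observations (balanced design).
    """
    k = len(groups)        # number of groups (level 2 units)
    n = len(groups[0])     # observations per group (level 1 units)
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes balanced groups")

    grand = mean(x for g in groups for x in g)

    # Mean square between groups and mean square within groups
    msb = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    msw = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))

    # Variance components; a negative between-group estimate is set to 0
    var_between = max((msb - msw) / n, 0.0)
    var_within = msw

    # ICC = group-level variance / overall variance
    return var_between / (var_between + var_within)


# Hypothetical test scores for three schools with three students each
scores = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(round(icc_oneway(scores), 3))  # 0.897
```

An ICC near 1, as in this toy data, means most of the variation lies between groups, which is the situation where treating observations as independent is least defensible.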
http://en.wikipedia.org/wiki/Multilevel_model

Multilevel Model: Example
http://faculty.smu.edu/kyler/training/AERA_overheads.pdf

Multilevel Model: Level 1 Regression Equation
http://faculty.smu.edu/kyler/training/AERA_overheads.pdf

Multilevel Model continued:
http://faculty.smu.edu/kyler/training/AERA_overheads.pdf

Adding a Random Sample Component
http://faculty.smu.edu/kyler/training/AERA_overheads.pdf

EXAMPLES IN R
Example of group effects without Multilevel modeling
Example of the Covariance Theorem
Example of Random Intercept Model
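
The R code for these examples is not included in this text. As a stand-in, the following Python sketch illustrates the idea behind the first and last items: it simulates data from a random intercept model and shows how ignoring the grouping understates the standard error of the overall mean. It does not fit a multilevel model. The notation y_ij = gamma00 + u_j + e_ij, the function names, and all parameter values are assumptions chosen for illustration, not taken from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_random_intercepts(k=20, n=30, gamma00=50.0,
                               sd_u=2.0, sd_e=1.0, seed=1):
    """Simulate y_ij = gamma00 + u_j + e_ij for k groups of n observations."""
    rng = random.Random(seed)
    data = []
    for _ in range(k):
        u_j = rng.gauss(0.0, sd_u)  # group-level deviation (random intercept)
        data.append([gamma00 + u_j + rng.gauss(0.0, sd_e) for _ in range(n)])
    return data


def naive_se(data):
    """SE of the grand mean if all observations are treated as independent."""
    flat = [y for group in data for y in group]
    return stdev(flat) / sqrt(len(flat))


def cluster_se(data):
    """SE of the grand mean based on the group means (balanced groups)."""
    group_means = [mean(group) for group in data]
    return stdev(group_means) / sqrt(len(group_means))


data = simulate_random_intercepts()
print(f"naive SE:   {naive_se(data):.3f}")
print(f"cluster SE: {cluster_se(data):.3f}")
```

With these assumed parameters the population values are roughly 0.09 for the naive standard error and 0.45 for the cluster-based one, so the naive calculation should come out several times too small. This is the "incorrect standard errors" problem listed under PROBLEMS CAUSED BY CORRELATION.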