Methodological Workshop 3: Fixed Effects Models and Multi-Level Models
Yu Xie
University of Michigan

What’s Common?
• Both the fixed effects model and the multi-level model utilize clustered data.
• Both the fixed effects model and the multi-level model are designed to handle cross-context heterogeneity.

Different Objectives
The fixed effects model and the multi-level model are very different research designs:
• The fixed effects model controls for (or absorbs) pre-treatment heterogeneity (Type I heterogeneity).
• The multi-level model models both forms of heterogeneity across contexts.

Application of Different Principles
• The fixed effects model is essentially an application of the social grouping principle (with a group being a cluster).
• The multi-level model is essentially an application of the social context principle.

Using Different Assumptions
• The fixed effects model assumes no Type II heterogeneity bias (often a constant effects model), or additive effects of heterogeneity across contexts (i.e., clusters).
• The multi-level model relaxes the homogeneity assumption at the individual level but assumes that both forms of heterogeneity are at the context level and can be modeled adequately with contextual covariates.

A General Lesson: Tradeoff between Data and Assumption
“When observed data are thin, it takes strong assumptions to yield sharp results. There is no free information in statistics. Either you collect it, or you assume it.” (Xie 1996, AJS)

Fixed Effects Model
Sibling model as an example: family SES and environment are shared by siblings 1 and 2 in family i.
• $Y_{i1} = b_0 + b_1 X_{i1} + a_i + e_{i1}$
• $Y_{i2} = b_0 + b_1 X_{i2} + a_i + e_{i2}$
The family effect $a_i$ and $X$ may be correlated. Take the difference between the two equations:
• $Y_{i2} - Y_{i1} = b_1 (X_{i2} - X_{i1}) + (e_{i2} - e_{i1})$
• The shared family effect $a_i$ and the intercept $b_0$ cancel, resulting in a more robust equation.

Properties of the fixed effects approach:
• All fixed characteristics (those shared within a group) are controlled for.
• It consumes a lot of information, because only within-group variation is used.
• Unobserved heterogeneity (Type I) is controlled for at the group level (fixed effects).

Example: Critique of Zhou and Hou (1999): Positive Benefits of Send-Down?
“More interestingly, our findings also reveal some positive consequences of the send-down experience. For instance, when compared with urban youth, a noticeably higher proportion of the send-down youth attained a college education after 1977. Partly as a result of their educational attainment, these sent-down youth, especially those with shorter rural durations, were equally likely to enter favorable employment (type of occupation and work organizations) in the urban labor force, despite their relatively short urban labor force experience.” (Zhou and Hou 1999: 32)

Speculated Reason for the Beneficial Effects
The unusual hardship faced by sent-down youth forced them to be more adaptive and thus to acquire skills to survive.

In Our Recent Study (Xie, Jiang, and Greenman 2008)
• We analyze data from the survey of Family Life in Urban China, which we conducted in three large cities (Shanghai, Wuhan, and Xi’an) in 1999.
• We use some survey items designed specifically for this study.

Statistical Analyses
(1) We present the differences in six socioeconomic indicators between respondents who experienced send-down and those who did not.
(2) We present results from a fixed effects model capitalizing on the sibling structure in our data.
(3) We examine educational attainment closely as a time-varying covariate and its endogenous role in affecting early returns of sent-down youth.
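The sibling-difference logic behind analysis (2) can be illustrated with simulated data. The sketch below is not the authors' analysis: the sample size, the true slope of 0.5, and the 0.8 loading that ties X to the family effect are all made-up values chosen only to show why differencing helps.

```python
import random


def slope(x, y, intercept=True):
    """Least-squares slope of y on x, with or without an intercept."""
    if intercept:
        mx, my = sum(x) / len(x), sum(y) / len(y)
        x = [v - mx for v in x]
        y = [v - my for v in y]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def simulate_pairs(n_families=5000, b0=1.0, b1=0.5, seed=0):
    """Simulate (X, Y) for two siblings per family. All values are hypothetical."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        a_i = rng.gauss(0, 1)                    # unobserved family effect
        sibs = []
        for _ in range(2):
            x = 0.8 * a_i + rng.gauss(0, 1)      # X is correlated with a_i
            y = b0 + b1 * x + a_i + rng.gauss(0, 1)
            sibs.append((x, y))
        families.append(sibs)
    return families


families = simulate_pairs()

# Pooled regression ignores a_i, so its slope absorbs part of the family effect.
pooled_x = [x for sibs in families for x, _ in sibs]
pooled_y = [y for sibs in families for _, y in sibs]
pooled_b1 = slope(pooled_x, pooled_y)

# Sibling differences remove a_i and b0, so no intercept is fitted.
dx = [s[1][0] - s[0][0] for s in families]
dy = [s[1][1] - s[0][1] for s in families]
fe_b1 = slope(dx, dy, intercept=False)

print(f"pooled slope: {pooled_b1:.2f}, sibling-difference slope: {fe_b1:.2f}")
```

Under these assumed values the pooled slope is biased upward by about cov(X, a)/var(X) = 0.8/1.64, while the differenced slope should land near the true 0.5. The price is the one noted above: only within-family variation in X is used.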
Table 1: Descriptive Differences between Respondents with Send-Down Experience and Respondents without Send-Down Experience

                              Not Sent Down   Sent Down   Sent Down,     Sent Down,
                                                          Duration <6    Duration 6+
College Education (%)         10.9            11.9        15.2 *         3 ***
Years of Schooling            11              10.8        11.3 **        9.4 ***
Annual Salary (yuan)          5,318           4,983       4,567          6,083 ***
Total Annual Income (yuan)    8,468           8,680       7,976          10,542 ***
Cadre (%)                     5.3             6.3         6.6            5.3
SEI                           42.5            42          42.5           40.6
N                             651             481         349            132
Notes: *p<.1, **p<.05, ***p<.01

After We Control for Covariates (Table 2)
• There are no differences in salary or income.
• Short-term sent-down youth still have higher levels of education than the other two groups (non-sent-down and long-term sent-down).

Potential Sources of Bias
• Some sent-down youth did not return to cities or did not return to the same cities.
• There can be unobserved family-level characteristics associated with both send-down and outcomes.
We use a fixed effects model based on sibling pairs to address both problems.

Table 3: Unadjusted Differences by Send-Down Experience Using Sibling Pairs

                              Not Sent Down   Sent Down   d
College Education (%)         11.4            11.7        -0.3
Years of Schooling            10.9            10.8        0.1
Cadre (%)                     8.9             5.4         3.5
SEI                           43.7            44.5        -0.7
N                             344             344
Notes: *p<.1, **p<.05, ***p<.01. d is the Not Sent Down value minus the Sent Down value.

What’s Going On?
If there are no effects of send-down (from the fixed effects model), why do we observe differences in education between short-term sent-down youth and long-term sent-down youth? The answer largely lies in “pre-treatment” differences.

Table 4: Unadjusted Differences by Duration

                                            Duration <6   Duration 6+
HS Graduate at Send-Down (%)                53            13.6 ***
Years of Schooling at Send-Down             10.5          9.2 ***
Years of Schooling at Return                10.7          9.3 ***
College Enrollment in Year of Return (%)    13.2          1.5 ***
College Education (%)                       15.2          3 ***
  Truncated Sample                          11.9          2.3 ***
Current Years of Schooling                  11.3          9.5 ***
  Truncated Sample                          11.1          9.4 ***
N                                           349           132
Notes: *p<.1, **p<.05, ***p<.01

Conclusion
Did send-down experience benefit youth? No.
Our analyses of the new data show that the send-down experience did not benefit the youth who were affected. Differences in social outcomes between those who experienced send-down and those who did not are either non-existent or spurious due to other social processes.

Accounting for Heterogeneous Responses with the Social Context Principle
• Possible with nested data, assuming that patterns of relationships are homogeneous (or follow a distribution) within social contexts (by time or space).
• $d_k$ is allowed to vary across social contexts $k$ ($k = 1, \dots, K$) but is homogeneous within $k$, conditional on $X$.

Multi-level Model (MLM)
$$Y_{ik} = a_k + d_k D_{ik} + b' X_{ik} + e_{ik}$$
$$a_k = l + f z_k + m_k$$
$$d_k = g + s z_k + n_k$$
Other names: hierarchical linear models, random-coefficient models, growth-curve models, and mixed models.
Units of analysis at a lower level are nested within higher-level units of analysis. Examples:
• Students within schools
• Observations over time within persons (growth curve)

Problems without MLM
• If we ignore higher-level units of analysis, we cannot account for context (individualistic approach).
• If we ignore individual-level observations and rely on higher-level units of analysis, we may commit the ecological fallacy (aggregated data approach).
• Without explicit modeling, sampling errors at the second level may be large, which makes the slopes unreliable.
• The homoscedasticity and no-serial-correlation assumptions of OLS are violated (an efficiency problem).
• No distinction is made between parameter variability and sampling variability.

Advantages of MLM
• Cross-level comparisons
• Controls for differences across higher levels

Example: Xie and Hannum (1996)
$$\log Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_2^2 + b_4 X_4 + b_5 X_5 + b_6 X_1 X_5 + e \qquad (1)$$
where $Y$ = earnings, $X_1$ = years of schooling, $X_2$ = years of work experience, $X_4$ = a dummy variable denoting membership in the Communist Party of China (1 = party member), and $X_5$ = a dummy variable denoting gender (1 = female). Note two interactions.
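To make the role of the product terms in equation (1) concrete, the following sketch evaluates the systematic part of the equation and the slopes it implies. The coefficient values in `B` are hypothetical placeholders chosen for illustration, not estimates from Xie and Hannum (1996), and the function names are likewise invented here.

```python
# Hypothetical coefficients b0..b6 for equation (1); NOT estimates from the paper.
B = [6.0, 0.03, 0.04, -0.0007, 0.08, -0.30, 0.015]


def log_earnings(x1, x2, x4, x5, b=B):
    """Systematic part of equation (1): predicted log earnings, error term omitted."""
    return (b[0] + b[1] * x1 + b[2] * x2 + b[3] * x2 ** 2
            + b[4] * x4 + b[5] * x5 + b[6] * x1 * x5)


def schooling_return(x5, b=B):
    """Change in log earnings per extra year of schooling: b1 + b6 * X5."""
    return b[1] + b[6] * x5


def experience_slope(x2, b=B):
    """Marginal effect of work experience at X2 years: b2 + 2 * b3 * X2."""
    return b[2] + 2 * b[3] * x2


men = schooling_return(0)      # X5 = 0
women = schooling_return(1)    # X5 = 1
print("return to schooling, men vs. women:", men, women)
print("experience slope at 0 and 20 years:", experience_slope(0), experience_slope(20))
```

Because of the $X_1 X_5$ product, the return to schooling is $b_1$ for men and $b_1 + b_6$ for women. Because of the squared term, the experience slope changes with the level of experience rather than staying constant.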
Consider regional heterogeneity
For the $i$th person in the $k$th city:
$$\log(y_{ik}) = b_{0k} + b_{1k} x_{1ik} + b_{2k} x_{2ik} + b_3 x_{2ik}^2 + b_{4k} x_{4ik} + b_{5k} x_{5ik} + b_6 x_{1ik} x_{5ik} + e_{ik}$$
Instead of using fixed effects for the intercept $b_{0k}$ and full interactions for the slope parameters, Xie and Hannum modeled these parameters in a multilevel model. Let $z$ be a city-level covariate that measures the degree of economic reform. Let us assume that the individual-level parameters depend on $z$ in the following linear regressions.

Cross-City Model (“meta analysis”)
$$\begin{aligned}
b_{0k} &= a_0 + l_0 z_k + m_{0k} \\
b_{1k} &= a_1 + l_1 z_k + m_{1k} \\
b_{2k} &= a_2 + l_2 z_k + m_{2k} \\
b_3 &= a_3 \\
b_{4k} &= a_4 + l_4 z_k + m_{4k} \\
b_{5k} &= a_5 + l_5 z_k + m_{5k} \\
b_6 &= a_6
\end{aligned}$$

Combining the two levels:
$$\begin{aligned}
\log(y_{ik}) ={} & a_0 + a_1 x_{1ik} + a_2 x_{2ik} + a_3 x_{2ik}^2 + a_4 x_{4ik} + a_5 x_{5ik} + a_6 x_{1ik} x_{5ik} \\
& + l_0 z_k + l_1 x_{1ik} z_k + l_2 x_{2ik} z_k + l_4 x_{4ik} z_k + l_5 x_{5ik} z_k \\
& + (m_{0k} + m_{1k} x_{1ik} + m_{2k} x_{2ik} + m_{4k} x_{4ik} + m_{5k} x_{5ik} + e_{ik})
\end{aligned}$$
We can see that the city-level covariate $z$ interacts with most of the individual-level predictors (all except the squared experience term and the schooling-by-gender product).

Special Cases
• Special case 1: If all the coefficients of the city-level covariate ($z$) are zero, we have what is called the “random coefficient model”.
• Special case 2: If all the coefficients of the city-level covariate ($z$) are zero and none of the slope coefficients is random (only the intercept is), we have what is called the “variance component model”. [See Table 3.]

Summary: Four ways to conceptualize variability in parameters

Specification               Complete homogeneity   Random variation   Regression   Fixed
Degrees of Freedom          1                      2                  1 + Pk       K
Parsimony (DF for Model)    High  ......................................................  Low
Accuracy (like R2)          Low   ......................................................  High

where Pk is the number of predictors at the second level, and K is the number of units at the second level.

References
Xie, Yu. 1996. “Review of Identification Problems in the Social Sciences by Charles Manski.” American Journal of Sociology 101:1131-1133.
Xie, Yu and Emily Hannum. 1996. “Regional Variation in Earnings Inequality in Reform-Era Urban China.” American Journal of Sociology 101:950-992.
Xie, Yu, Yang Jiang, and Emily Greenman. 2008. “Did Send-Down Experience Benefit Youth? A Reevaluation of the Social Consequences of Forced Urban-Rural Migration during China’s Cultural Revolution.” Social Science Research 37:686-700.