Mediation Models

Laura Stapleton, UMBC
Tasha Beretvas, University of Texas at Austin

Session outline
- What is mediation?
- Basic single mediator model
- Short comment on causality
- Tests of the hypothesized mediation effect
- Mediation models for cluster randomized trials
- Brief mention of advanced issues

What is mediation?
- A mediator explains how or why two variables are related.
- In the context of interventions, a mediator explains how or why a Tx effect occurs.
- A mediator is thought of as the mechanism or process through which a Tx influences an outcome (Baron & Kenny, 1986).
- If X → M and M → Y, then M is a mediator.
- X causes the proximal variable, M, to vary, which itself causes the distal variable, Y, to vary.

What is mediation?
- Mediational process can be
  - Observed or latent
  - Internal or external
  - At the individual or cluster level
  - Based on multiple or sequential processes
- Who cares?! Mediation analyses can identify important processes/mechanisms underlying effective (or ineffective!) treatments, thereby providing important focal points for future interventions.

Mediation Examples
- Bacterial exposure → Disease
- Bacterial exposure → Infection → Disease
- Stimulus → Response
  - Might work for simple organisms (amoebae!); however, for more complex creatures:
- Stimulus → Organism → Response
- Stimulus → Expectancy → Response
  - Monkey and lettuce example
  - Maze-bright, maze-dull rats and maze performance example

Mediation Examples
- Intervention → Outcome
- Intervention → Receptivity → Outcome
- Intervention → Tx Fidelity → Outcome
- Intervention → Tch Confid → Outcome
- Intervention → Soc Comp → Achievement
- Intervention → Phon Aware → Reading
- Intervention → Peer Affil → Delinq Beh

Mediation ≠ Moderation
- A moderator explains when an effect occurs
- Relationship between X and Y changes for different values of M
- More in later session of workshop…

Basic (single-level) mediation model
- Total effect (path c): $Y_i = \beta_0 + \beta_1 T_i + e_i$, with $c = \beta_1$
- Mediator model (path a): $M_i = \beta_0 + \beta_1 T_i + e_i$, with $a = \beta_1$
- Outcome model (paths b and c’): $Y_i = \beta_0' + \beta_1' M_i + \beta_2' T_i + e_i'$, with $b = \beta_1'$ and $c' = \beta_2'$
- [Path diagrams: Treatment → Outcome (c); Treatment → Mediator (a), Mediator → Outcome (b), Treatment → Outcome (c’)]
- total effect = indirect effect + direct effect
- c = ab + c’

Causality concerns
- Just because you estimate the model X → M → Y does not mean that the relationships are causal
- Unless you manipulate M, causal inferences are limited.
- Mediation model differs from mediation design

Causality concerns – mediation model
- Remember, if the mediator is not typically manipulated, causal interpretations are limited
- [Path diagram: Treatment (T) → Mediator (M) via path a, marked "Ok!"; Mediator (M) → Outcome (Y) via path b, marked "Possible misspecification"; an additional variable Z is drawn near the mediator]
- For now, be sure to substantively justify the causal direction and “assume or hypothesize that M causes Y and assuming that, here’s the strength of that effect…”
- In future research, manipulate mediator

Tests of the hypothesized mediation effect
- [Path diagram: Treatment (T) → Mediator (M) via a; Mediator (M) → Outcome (Y) via b; Treatment (T) → Outcome (Y) via c’]
- The estimate of the indirect effect, ab, is based on the sample
- To infer that a non-zero αβ exists in the population, a test of the statistical significance of ab is needed
- Several approaches have been suggested and differ in their ability to “see” a true effect (power)

Tests of the hypothesized mediation effect
- Causal steps approach (Baron & Kenny)
- Test of joint significance
- z test of ab (with normal theory confidence interval)
- Asymmetric confidence interval (Empirical M or distribution of the product)
- Bootstrap resampling

Causal steps approach
- Step 1: test the effect of T on Y (path c)
- Step 2: test the effect of T on M (path a)

Causal steps approach
- Step 3: test the effect of M on Y, controlling for T (path b)
- Step 4: to decide on partial or complete mediation, test the effect of T on Y, controlling for M (path c’)

Causal steps approach: performance
- Step 1 may be non-significant when true mediation exists
- [Path diagram, "What if…": Treatment (T) → mediator FdF → Outcome (Dep), with path values +2 and +3 through the mediator and -6 on the direct path]
- [Path diagram, "or…": Treatment (T) → mediator FdF → Outcome (Dep), with path values +2 and +3, alongside a second mediator SS with path values +3 and -2]

Causal steps approach: performance
- Lacks power
  - Power is a function of the product of the power to test each of the three paths
  - Power discrepancy worsens for smaller n and smaller effects
- Lower Type I error rate than expected, i.e., too conservative

Test of joint significance
- Very similar to causal steps approach
- [Path diagram: Treatment → Mediator (a), Mediator → Outcome (b), Treatment → Outcome (c’)]
- 1st: test the effect of T on M (path a)
- 2nd: test the effect of M on Y, controlling for T (path b)
- If both significant, then infer significant mediation

Test of joint significance: performance
- Better power than causal steps approach
- Type I error rate slightly lower than expected
- Power nearly as good as newer methods in single-level models
- Power lower than other methods in multilevel models
- No confidence interval around the indirect effect is available

z test of ab product
- Calculate $z = \frac{ab}{se_{ab}}$, where Sobel’s $se_{ab} = \sqrt{a^2 se_b^2 + b^2 se_a^2}$
- Compare z test value to critical values from the standard normal distribution
- Can also calculate confidence interval around ab: $CI = ab \pm (z_{critical})(se_{ab})$

z test of ab product: performance
- One of the least powerful approaches
- Type I error rate much lower than the expected .05
- In single-level models, it approaches the power of other methods when sample sizes are 500 or greater, or effect sizes are large
- In multilevel models, its power gets closer to that of the other methods but never reaches it
- Problem is that the ab product is not normally distributed, so critical values are inappropriate

How is the ab product distributed?
- Sampled 1,000 values of a ~ N(0,1) and of b ~ N(0,1)
- [Figure: three histograms, x-axis from -4 to 4 and frequency axis from 0 to 200: distribution of path a, distribution of path b, distribution of the product a×b]

Empirical M-test (asymmetric CI)
- Determines empirical (more leptokurtic) distribution of z of the ab product (not assuming normality)
  - αβ = 0: dist’n is leptokurtic and symmetric
  - αβ > 0: dist’n is less leptokurtic and positively skewed
  - αβ < 0: dist’n is less leptokurtic and negatively skewed
- Due to asymmetry, different upper and lower critical values are needed to calculate asymmetric confidence intervals (CIs).
- Meeker derived tables for various combinations of $Z_a$ and $Z_b$ values (increments of 0.4) that could be used to calculate asymmetric CIs.

Empirical M-test (asymmetric CI)
- MacKinnon et al. created PRODCLIN that, given a, b, and their SEs, determines the distribution of ab and relevant critical values for calculating the asymmetric CI (MacKinnon & Fritz, 2007, 384-389).
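PRODCLIN's own algorithm is not reproduced here. As a rough stand-in, the asymmetric limits can be approximated by simulation, in the same spirit as the 1,000 sampled values of a and b described above: draw a and b from normal distributions centred on their estimates and read off percentiles of the simulated products. The sketch below assumes the two estimates are independent and normally distributed; the function name `monte_carlo_product_ci` and the input values are hypothetical, chosen only for illustration.

```python
import random

def monte_carlo_product_ci(a, se_a, b, se_b, draws=20000, level=0.95, seed=1):
    """Approximate CI for a*b, assuming a and b are independent normal estimates."""
    rng = random.Random(seed)
    # Simulate the product of one draw of a and one draw of b, many times.
    products = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(draws)
    )
    tail = (1 - level) / 2
    lower = products[round(tail * draws)]
    upper = products[round((1 - tail) * draws) - 1]
    return lower, upper

# Hypothetical estimates (not taken from the session's data).
a, se_a, b, se_b = 0.5, 0.2, 0.4, 0.15
lower, upper = monte_carlo_product_ci(a, se_a, b, se_b)
print(f"ab = {a * b:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The two limits are generally not equidistant from ab, which is the asymmetry that the normal-theory z test ignores.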
- Confidence interval limits (critical values carry their own sign):
  - lower limit: $ab + (CV_{lower})(se_{ab})$
  - upper limit: $ab + (CV_{upper})(se_{ab})$
- If CI does not include zero, then significant

Empirical M-test: performance
- Good balance of power while maintaining nominal Type I error rate
- Performed well in both single-level and multilevel tests of mediation
- Only bootstrap resampling methods had (very slightly) better power than this method
- PRODCLIN software is easy to use

Bootstrap resampling methods
- Determines empirical distribution of the ab product
- Several variations
  - Parametric percentile
  - Non-parametric percentile
  - Bias-corrected versions of both
- Can bootstrap cases or bootstrap residuals. It is typical in multilevel designs to bootstrap residuals.

Parametric percentile bootstrap
- With original sample, run the analysis and obtain estimates of variance(s) of residuals
- New residuals are then resampled from a distribution ~ $N(0, \sigma^2)$ (thus, the “parametric”).
- New values of M are created by using the fixed effects estimates from the original analysis, T, and the resampled residual(s).
- New values of Y are created using the fixed effects, the T and M values, and residual(s).
- Then, the analysis is run and the ab product is estimated

Parametric percentile bootstrap
- The process of resampling and estimating ab is repeated many times (commonly 1,000 times)
- The ab estimates are then ordered
- With 1,000 estimates, the 25th and the 975th are taken as the lower and upper limits of the 95% (empirically derived) CI.
- Note that the CI limits may not be symmetric around the original ab estimate
- If CI does not include zero, then significant mediation

Non-parametric percentile bootstrap
- The parametric bootstrap involves the assumption that the residuals are normally distributed
- Instead, residuals can be resampled with replacement from the empirical distribution of actual residuals (saved from the original sample’s analysis)
- The remaining process is the same as for the parametric version

Bias-corrected bootstrap
- With both the parametric and non-parametric bootstrap, the initial ab product may not be at the median of the bootstrap ab distribution
- Thus, the initial ab estimate is biased
- BC-bootstrap procedures “shift” the confidence interval to adjust for the difference between the initial estimate and the median

Bootstrap resampling methods: performance
- Resampling methods provide the most power and the most accurate Type I error rates of all methods
- Parametric has best confidence interval coverage
- BC-parametric had best power, especially with low effect sizes, with both normally and non-normally distributed residuals; Type I error rate was slightly high for multilevel analyses
- Non-parametric had the most accurate Type I error rates; good overall power
- BC non-parametric had good power
- But … complicated to program

Summary: tests of the hypothesized mediation effect
- Causal steps approach
- Test of joint significance: OK for single level…
- z test of ab
- Empirical M: Yes! Easy!
- Bootstrap resampling: Yes! Not quite as easy… but does have the most power

Example for today
- Social-emotional curriculum = Tx
- Child social competence = outcome
- Randomly selected classrooms (one per school)
- Why would Tx affect outcome?
  - Teacher attitude about importance?
  - Child understanding of others’ behaviors?
  - Puppet show down-time soothes child?
- Researcher should think in advance of possible mediators to measure

Mediation models for cluster randomized trials
- Extend basic model to situations when treatment is administered at cluster level
- Model depends on whether mediator is measured at cluster or individual level
- Definition (Krull & MacKinnon, 2001) depends on level at which each variable is measured: T → M → Y
  - Upper-level mediation [2→2→1]
  - Cross-level mediation [2→1→1]
  - Cross-level and upper-level mediation [2→(1 & 2)→1]

Measured variable partitioning
- First, consider that any variable may be partitioned into individual level components and cluster level components
- $Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
- [Diagram: $Y_{ij}$ split into a cluster component ($u_{0j}$) and an individual component ($r_{ij}$). Note: no intercepts depicted]

Mediation model possibilities
- [Diagram: Tx, M, and Y, each drawn with a cluster component and an individual component]

Data Example Context
- Cluster randomized trial (hierarchical design)
- 14 preschools: ½ treatment, ½ control
- 6 kids per school (/classroom)
- Socio-emotional curriculum
- Outcome is child social competence behavior
- Possible mediators: teacher attitude about importance of including this kind of training in classroom, child socio-emotional knowledge
- Sample data are on handout

Total effect of treatment
- Before we examine mediation, let’s examine the total effect of treatment on the outcome…
- $Y_{ij} = \beta_{0j} + r_{ij}$
- $\beta_{0j} = \gamma_{00} + \gamma_{01} T_j + u_{0j}$
- [Path diagram: cluster-level Tx → cluster component of Y via $\gamma_{01}$, with residuals $u_{0j}$ (cluster) and $r_{ij}$ (individual)]

Total effect of treatment: FE Results

Final estimation of fixed effects:
----------------------------------------------------------------------------
                                    Standard             Approx.
Fixed Effect          Coefficient   Error      T-ratio   d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
    INTRCPT2, G00       34.357143   1.029102    33.386     12     0.000
    T, G01               4.238095   1.455370     2.912     12     0.014
----------------------------------------------------------------------------
(T, G01 is path c)

Upper-level mediation model (2→2→1)
- $M_j = \gamma_{00} + \gamma_{01} T_j + u_{0j}$
- $Y_{ij} = \beta'_{0j} + r'_{ij}$
- $\beta'_{0j} = \gamma'_{00} + \gamma'_{01} M_j + \gamma'_{02} T_j + u'_{0j}$
- [Path diagram: cluster-level Tx → cluster-level M via $\gamma_{01}$; cluster-level M → cluster component of Y via $\gamma'_{01}$; cluster-level Tx → cluster component of Y via $\gamma'_{02}$]

Upper-level mediation model: Results
- To estimate the a path, I ran an OLS regression in SPSS using the Level 2 file: $M_j = \gamma_{00} + \gamma_{01} T_j + u_{0j}$

Coefficients(a)
                Unstandardized Coefficients   Standardized Coefficients
Model 1         B         Std. Error          Beta                        t        Sig.
(Constant)      9.429     .444                                            21.228   .000
T               .714      .628                .312                        1.137    .278
a. Dependent Variable: M1

- What is the estimate of a and its SE?

Upper-level mediation model: Results
- To estimate the b path, I ran a model in HLM

Final estimation of fixed effects:
----------------------------------------------------------------------------
                                    Standard             Approx.
Fixed Effect          Coefficient   Error      T-ratio   d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
    INTRCPT2, G00       34.640907   1.036530    33.420     11     0.000
    M1, G01              0.794540   0.656229     1.211     11     0.252
    T, G02               3.670567   1.502879     2.442     11     0.033
----------------------------------------------------------------------------

- What is the estimate of b and its SE?
- What is the estimate of c’ and its SE?

Upper-level mediation model: Results
- [Path diagram as above, with 3.671 on the direct Tx → Y path]
- Direct effect = 3.671
- Indirect effect = (.714)(.795) = .568
- Total effect = DE + IE = 3.671 + .568 = 4.239

Upper-level mediation model: Results

Method                       Mediation?   Details
Causal steps approach        No           Step 1 significant, but not Steps 2 and 3
Test of joint significance   No           Neither path a nor path b is significant
z test of ab product         No           se = .68, z = .83, p = .41; 95% CI = -.78 to 1.91
Empirical-M test             No           95% CI = -.47 to 2.26
BC parametric bootstrap      No           95% CI = -.42 to 3.68

Upper-level mediation model: Results
- PRODCLIN: http://www.public.asu.edu/~davidpm/ripl/Prodclin/

Cross-level mediation model (2→1→1)
- Model A: $M_{ij} = \beta_{0j} + r_{ij}$, $\beta_{0j} = \gamma_{00} + \gamma_{01} T_j + u_{0j}$
- Model B: $Y_{ij} = \beta'_{0j} + \beta'_{1j} M_{ij} + r'_{ij}$, $\beta'_{0j} = \gamma'_{00} + \gamma'_{01} T_j + u'_{0j}$, $\beta'_{1j} = \gamma'_{10}$
- [Path diagrams: Model A, cluster-level Treatment → cluster component of the Mediator via $\gamma_{01}$ (residual $u_{0j}$); Model B, cluster-level Treatment → cluster component of the Outcome via $\gamma'_{01}$ (residual $u'_{0j}$) and individual-level Mediator → individual component of the Outcome via $\gamma'_{10}$ (residual $r'_{ij}$)]

Cross-level mediation model: Results
- To estimate the a path:

Final estimation of fixed effects:
----------------------------------------------------------------------------
                                    Standard             Approx.
Fixed Effect          Coefficient   Error      T-ratio   d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
    INTRCPT2, G00       39.309524   0.845210    46.509     12     0.000
    T, G01               2.642857   1.195308     2.211     12     0.047
----------------------------------------------------------------------------

- What is a and its SE?

Cross-level mediation model: Results
- To estimate the b path:

Final estimation of fixed effects:
----------------------------------------------------------------------------
                                    Standard             Approx.
Fixed Effect          Coefficient   Error      T-ratio   d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
    INTRCPT2, G00       35.138955   0.941637    37.317     12     0.000
    T, G01               2.674528   1.358185     1.969     12     0.072
For M2_GRAND slope, B1
    INTRCPT2, G10        0.591620   0.142895     4.140     81     0.000
----------------------------------------------------------------------------

- What is b and its SE? And for c’?
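The a and b estimates in the two HLM outputs above can be combined by hand. As a worked illustration, the sketch below applies the z test of the ab product (with Sobel's standard error) to those estimates. The function name `sobel_test` is made up for this example, and the normal-theory interval it returns carries the limitations already noted for the z test.

```python
import math
from statistics import NormalDist

def sobel_test(a, se_a, b, se_b, level=0.95):
    """z test of the ab product with a normal-theory confidence interval."""
    ab = a * b
    se_ab = math.sqrt(a**2 * se_b**2 + b**2 * se_a**2)
    z = ab / se_ab
    p = 2 * (1 - NormalDist().cdf(abs(z)))          # two-tailed p-value
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ab, se_ab, z, p, (ab - z_crit * se_ab, ab + z_crit * se_ab)

# Path a: T, G01 in the first output; path b: M2_GRAND slope, G10 in the second.
ab, se_ab, z, p, (ci_low, ci_high) = sobel_test(
    a=2.642857, se_a=1.195308, b=0.591620, se_b=0.142895
)
print(f"ab = {ab:.3f}, se = {se_ab:.3f}, z = {z:.2f}, p = {p:.3f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With these inputs the standard error comes out at about .802, z at about 1.95, and p at about .051, so the normal-theory interval just includes zero.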

Cross-level mediation model: Results
- [Path diagrams for Model A and Model B as on the model slide, with 2.675 on the direct Treatment → Outcome path]
- Direct effect = 2.675
- Indirect effect = (2.643)(.592) = 1.564
- Total effect = 2.675 + 1.564 = 4.239

Cross-level mediation model: Results

Method                       Mediation?   Details
Causal steps approach        Yes          Steps 1, 2 and 3 significant
Test of joint significance   Yes          Paths a and b significant
z test of ab product         No           se = .802, z = 1.95, p = .051; 95% CI = -.01 to 3.13
Empirical-M test             Yes          95% CI = .19 to 3.32
BC parametric bootstrap      Yes          95% CI = .31 to 3.57

Cross-level and upper-level mediation model [2→(1 & 2)→1]
- Model A: $M_{ij} = \beta_{0j} + r_{ij}$, $\beta_{0j} = \gamma_{00} + \gamma_{01} T_j + u_{0j}$
- Model B: $Y_{ij} = \beta'_{0j} + \beta'_{1j} M_{ij} + r'_{ij}$, $\beta'_{0j} = \gamma'_{00} + \gamma'_{01} T_j + \gamma'_{02} AveM_j + u'_{0j}$, $\beta'_{1j} = \gamma'_{10}$
- [Path diagrams: Model A, cluster-level Treatment → cluster component of the Mediator via $\gamma_{01}$ (residual $u_{0j}$); Model B, cluster-level Treatment → cluster component of the Outcome via $\gamma'_{01}$, cluster average of M (Avg M) → cluster component of the Outcome, and individual-level Mediator → individual component of the Outcome (residuals $u'_{0j}$ and $r'_{ij}$)]

Cross-level and upper-level mediation model: Results
- Path a is the same as in the prior model.
- For the b and c’ paths:

Final estimation of fixed effects:
----------------------------------------------------------------------------
                                    Standard             Approx.
Fixed Effect          Coefficient   Error      T-ratio   d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
    INTRCPT2, G00       35.095622   1.047773    33.495     11     0.000
    T, G01               2.761188   1.602238     1.723     11     0.112
    M2_AVE, G02         -0.041278   0.363535    -0.114     11     0.912
For M2 slope, B1
    INTRCPT2, G10        0.600111   0.160566     3.737     80     0.001
----------------------------------------------------------------------------

Cross-level and upper-level mediation model [2→(1 & 2)→1]
- [Path diagrams for Model A and Model B as above, with 2.761 on the direct Treatment → Outcome path]
- $ab_{ind}$ = (2.643)(.600) = 1.586
- $ab_{cluster}$ = (2.643)(-.041) = -.109
- Total indirect effect = 1.586 - 0.109 = 1.477
- Total effect = 1.477 + 2.761 = 4.238

Cross-level and upper-level mediation model [2→(1 & 2)→1]: Group-mean centered M
- [Path diagrams for Model A and Model B as above, with 2.761 on the direct Treatment → Outcome path]
- If the level one predictor had been group-mean centered, then the L2 path would have been 0.559, not -0.041.
- This path would be interpreted as the sum of the average individual and contextual effects of M.
- Under grand-mean centering, the path represents the unique contextual effect.

Cross- and upper-level mediation model: Results at the individual level

Method                       Mediation?   Details
Causal steps approach        Yes          Steps 1, 2 and 3 significant
Test of joint significance   Yes          Paths a and b significant
z test of ab product         No           se = .886, z = 1.79, p = .073; 95% CI = -.15 to 3.32
Empirical-M test             Yes          95% CI = .19 to 3.44
BC parametric bootstrap      ?            Not yet programmed

Brief review of advanced issues
- Multisite / randomized blocks (1→1→1)
  - More complicated!
- Testing mediation in 3-level models
- Including multiple mediators
- Examining moderated mediation
- Dichotomous or polytomous outcomes
- Measurement error in mediation models

Notes on software
- HLM, SPSS
  - Plug results into PRODCLIN
- SAS (PROC MIXED)
  - See handout
  - Can use Stapleton’s macros for bootstrapping
- MLwiN, Mplus
  - Have limited bootstrapping capacity, but you still have to summarize results
- SEM software
  - Provide a test of ab, but using Sobel

tasha.beretvas@mail.utexas.edu