Addressing Alternative Explanations: Multiple Regression
17.871 Spring 2012

Did Clinton hurt Gore example

Did Clinton hurt Gore in the 2000 election?
Treatment is not liking Bill Clinton.

[Figure: bivariate regression of Gore thermometer on Clinton thermometer]

What alternative explanations would you need to address?
  Nonrandom selection into the treatment group (disliking Clinton) from many sources.
  Let's address one source: party identification.

How could we do this?
  Matching: compare Democrats who like or don't like Clinton; do the same for Republicans and independents.
  Multivariate regression: control for partisanship statistically.
    Also called multiple regression, Ordinary Least Squares (OLS).
    Presentation below is intuitive.

[Figure: Democratic picture; x-axis: Clinton thermometer]
[Figure: Independent picture; x-axis: Clinton thermometer]
[Figure: Republican picture; x-axis: Clinton thermometer]
[Figure: Combined data picture; x-axis: Clinton thermometer]
[Figure: Combined data picture with regression: bias!; x-axis: Clinton thermometer]
[Figure: Combined data picture with "true" regression lines overlaid; x-axis: Clinton thermometer]

Tempting yet wrong normalizations

[Figure: Subtract the Gore therm. from the avg. Gore therm. score; x-axis: Clinton thermometer]
[Figure: Subtract the Clinton therm. from the avg. Clinton therm. score; x-axis: Clinton thermometer]
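[Figure: 3D Relationship]
[Figure: 3D Linear Relationship]
[Figure: 3D Relationship: Clinton; axis ticks 0, 50, 100]
[Figure: 3D Relationship: party; axis ticks Rep, Ind, Dem]

The Linear Relationship between Three Variables

$$Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \varepsilon_i$$

Y is the Gore thermometer, X1 is the Clinton thermometer, and X2 is Party ID.

The method of least squares (again)

Pick $\hat\beta_0$, $\hat\beta_1$, and $\hat\beta_2$ to minimize

$$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \quad\text{or}\quad \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_{1,i} - \hat\beta_2 X_{2,i})^2$$

The Slope Coefficients

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (\bar{Y} - Y_i)(\bar{X}_1 - X_{1,i})}{\sum_{i=1}^{n} (\bar{X}_1 - X_{1,i})^2} - \hat\beta_2 \frac{\sum_{i=1}^{n} (\bar{X}_1 - X_{1,i})(\bar{X}_2 - X_{2,i})}{\sum_{i=1}^{n} (\bar{X}_1 - X_{1,i})^2}$$

and

$$\hat\beta_2 = \frac{\sum_{i=1}^{n} (\bar{Y} - Y_i)(\bar{X}_2 - X_{2,i})}{\sum_{i=1}^{n} (\bar{X}_2 - X_{2,i})^2} - \hat\beta_1 \frac{\sum_{i=1}^{n} (\bar{X}_1 - X_{1,i})(\bar{X}_2 - X_{2,i})}{\sum_{i=1}^{n} (\bar{X}_2 - X_{2,i})^2}$$

$$\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}_1 - \hat\beta_2 \bar{X}_2$$

X1 is Clinton thermometer, X2 is PID, and Y is Gore thermometer.

The Slope Coefficients More Simply

$$\hat\beta_1 = \frac{\mathrm{cov}(X_1, Y)}{\mathrm{var}(X_1)} - \hat\beta_2 \frac{\mathrm{cov}(X_1, X_2)}{\mathrm{var}(X_1)} \quad\text{and}\quad \hat\beta_2 = \frac{\mathrm{cov}(X_2, Y)}{\mathrm{var}(X_2)} - \hat\beta_1 \frac{\mathrm{cov}(X_1, X_2)}{\mathrm{var}(X_2)}$$

X1 is Clinton thermometer, X2 is PID, and Y is Gore thermometer.

The Matrix form

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & x_{1,1} & x_{2,1} & \cdots & x_{k,1} \\ 1 & x_{1,2} & x_{2,2} & \cdots & x_{k,2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1,n} & x_{2,n} & \cdots & x_{k,n} \end{pmatrix}, \qquad \hat\beta = (X'X)^{-1} X'y$$

Multivariate slope coefficients

Bivariate estimate, the Clinton effect (on Gore) in the bivariate (B) regression:

$$\hat\beta_1^{B} = \frac{\mathrm{cov}(X_1, Y)}{\mathrm{var}(X_1)}$$

vs. the multivariate estimate, the Clinton effect (on Gore) in the multivariate (M) regression:

$$\hat\beta_1^{M} = \frac{\mathrm{cov}(X_1, Y)}{\mathrm{var}(X_1)} - \hat\beta_2^{M} \frac{\mathrm{cov}(X_1, X_2)}{\mathrm{var}(X_1)}$$

The term $\hat\beta_2^{M}$ asks: are Gore and Party ID related?
The term $\mathrm{cov}(X_1, X_2)$ asks: are Clinton and Party ID related?

When does $\hat\beta_1^{B} = \hat\beta_1^{M}$? Obviously, when

$$\hat\beta_2^{M} \frac{\mathrm{cov}(X_1, X_2)}{\mathrm{var}(X_1)} = 0$$

X1 is Clinton thermometer, X2 is PID, and Y is Gore thermometer.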
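The comparison of the bivariate and multivariate slopes can be checked on made-up data. The sketch below is in Python rather than Stata so that it runs without the course data; the variable names, the simulated thermometer scores, and the coefficient values used to generate them are all assumptions for illustration, not the survey data used in the slides. It computes the bivariate slope, solves the two normal equations for the two-regressor slopes, and confirms that the gap between the two Clinton slopes equals the party slope times cov(X1, X2)/var(X1).

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def bivariate_slope(x, y):
    """Slope from regressing y on x alone: cov(x, y) / var(x)."""
    return cov(x, y) / cov(x, x)

def two_regressor_slopes(x1, x2, y):
    """Slopes on x1 and x2 from regressing y on both (normal equations)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Hypothetical data: party (0, 1, 2) drives both thermometers.
random.seed(17871)
n = 2000
party = [random.choice([0, 1, 2]) for _ in range(n)]
clinton = [30 + 20 * p + random.gauss(0, 20) for p in party]
gore = [25 + 0.5 * c + 6 * p + random.gauss(0, 15)
        for c, p in zip(clinton, party)]

b_biv = bivariate_slope(clinton, gore)
b1_m, b2_m = two_regressor_slopes(clinton, party, gore)
gap = b2_m * cov(clinton, party) / cov(clinton, clinton)

print("bivariate Clinton slope:   ", round(b_biv, 4))
print("multivariate Clinton slope:", round(b1_m, 4))
print("multivariate slope + gap:  ", round(b1_m + gap, 4))
```

Because party raises both simulated thermometers here, the bivariate slope comes out larger than the multivariate one; setting the party effect on either thermometer to zero in the simulation shrinks the gap toward zero.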
The Output

    . reg gore clinton party3

          Source |       SS       df       MS              Number of obs =    1745
    -------------+------------------------------           F(  2,  1742) = 1048.04
           Model |   629261.91     2  314630.955           Prob > F      =  0.0000
        Residual |  522964.934  1742  300.209492           R-squared     =  0.5461
    -------------+------------------------------           Adj R-squared =  0.5456
           Total |  1152226.84  1744   660.68053           Root MSE      =  17.327

    ------------------------------------------------------------------------------
            gore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         clinton |   .5122875   .0175952    29.12   0.000     .4777776    .5467975
          party3 |   5.770523   .5594846    10.31   0.000     4.673191    6.867856
           _cons |    28.6299   1.025472    27.92   0.000     26.61862    30.64119
    ------------------------------------------------------------------------------

Interpretation of clinton effect: Holding constant party identification, a one-point increase in the Clinton feeling thermometer is associated with a .51 increase in the Gore thermometer.

Separate regressions

                   (1)      (2)      (3)
    Intercept     23.1     55.9     28.6
    Clinton       0.62      --      0.51
    Party          --      15.7      5.8

$$\hat\beta_1 = \frac{\mathrm{cov}(X_1, Y)}{\mathrm{var}(X_1)} - \hat\beta_2 \frac{\mathrm{cov}(X_1, X_2)}{\mathrm{var}(X_1)} \quad\text{and}\quad \hat\beta_2 = \frac{\mathrm{cov}(X_2, Y)}{\mathrm{var}(X_2)} - \hat\beta_1 \frac{\mathrm{cov}(X_1, X_2)}{\mathrm{var}(X_2)}$$

Why did the Clinton coefficient change from 0.62 to 0.51?

    . corr gore clinton party, cov
    (obs=1745)

                 |     gore  clinton   party3
    -------------+---------------------------
            gore |  660.681
         clinton |  549.993  883.182
          party3 |  13.7008   16.905    .8735

The Calculations

$$\hat\beta_1^{B} = \frac{\mathrm{cov}(\text{gore}, \text{clinton})}{\mathrm{var}(\text{clinton})} = \frac{549.993}{883.182} = 0.6227$$

$$\hat\beta_1^{M} = \frac{\mathrm{cov}(\text{gore}, \text{clinton})}{\mathrm{var}(\text{clinton})} - \hat\beta_2^{M} \frac{\mathrm{cov}(\text{clinton}, \text{party})}{\mathrm{var}(\text{clinton})} = \frac{549.993}{883.182} - 5.7705 \times \frac{16.905}{883.182} = 0.6227 - 0.1105 = 0.5122$$
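The arithmetic in this calculation can be reproduced from the covariance table alone. The following Python sketch (Python rather than Stata so it runs without the survey file; the variable names are my own) plugs the rounded variances and covariances reported by `corr ..., cov` into the two-regressor formulas. Because the inputs are rounded, the results should agree with the `reg` output only to about three decimal places.

```python
# Rounded entries copied from the covariance table in the slides.
var_gore = 660.681
var_clinton = 883.182
var_party = 0.8735
cov_gore_clinton = 549.993
cov_gore_party = 13.7008
cov_clinton_party = 16.905

# Bivariate slope of gore on clinton.
b1_b = cov_gore_clinton / var_clinton

# Two-regressor slopes: solve the pair of "more simply" equations jointly.
det = var_clinton * var_party - cov_clinton_party ** 2
b1_m = (var_party * cov_gore_clinton
        - cov_clinton_party * cov_gore_party) / det
b2_m = (var_clinton * cov_gore_party
        - cov_clinton_party * cov_gore_clinton) / det

# The part of the bivariate slope that runs through party.
indirect = b2_m * cov_clinton_party / var_clinton

print("bivariate Clinton slope:   ", round(b1_b, 4))
print("multivariate Clinton slope:", round(b1_m, 4))
print("multivariate party slope:  ", round(b2_m, 4))
print("indirect part via party:   ", round(indirect, 4))
```

Solving for both slopes jointly avoids having to take the party coefficient from the regression output first, which is what the hand calculation does.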
Another way of thinking about this

Rewrite

$$\hat\beta_1^{M} = \frac{\mathrm{cov}(\text{gore}, \text{clinton})}{\mathrm{var}(\text{clinton})} - \hat\beta_2^{M} \frac{\mathrm{cov}(\text{clinton}, \text{party})}{\mathrm{var}(\text{clinton})}$$

as

$$\frac{\mathrm{cov}(\text{gore}, \text{clinton})}{\mathrm{var}(\text{clinton})} = \hat\beta_1^{M} + \hat\beta_2^{M} \frac{\mathrm{cov}(\text{clinton}, \text{party})}{\mathrm{var}(\text{clinton})}$$

Total effect = Direct effect + Indirect effect

With the numbers above, 0.6227 = 0.5122 + 0.1105. The total effect of the Clinton thermometer on the Gore thermometer (.62) can be broken down into a direct effect of .51 plus an indirect effect (through party) of .11.

Drinking and Greek Life Example

Why is there a correlation between living in a fraternity/sorority house and drinking?
  Greek organizations often emphasize social gatherings that have alcohol. The effect is being in the Greek organization itself, not the house.
  There's something about the house environment itself.

Dependent variable: Times Drinking in Past 30 Days

Source: Wechsler, Henry. College Alcohol Study. Harvard School of Public Health. Harvard School of Public Health. All rights reserved. This content is excluded from our Creative Commons license. For more information, see http://ocw.mit.edu/fairuse.

    . infix age 10-11 residence 16 greek 24 screen 102 timespast30 103 howmuchpast30 104 gpa 278-279 studying 281 timeshs 325 howmuchhs 326 socializing 283 stwgt_99 475-493 weight99 494-512 using da3818.dat, clear
    (14138 observations read)

    . recode timespast30 timeshs (1=0) (2=1.5) (3=4) (4=7.5) (5=14.5) (6=29.5) (7=45)
    (timespast30: 6571 changes made)
    (timeshs: 10272 changes made)

    . replace timespast30=0 if screen<=3
    (4631 real changes made)

    . tab timespast30
    timespast30 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |      4,652       33.37       33.37
            1.5 |      2,737       19.64       53.01
              4 |      2,653       19.03       72.04
            7.5 |      1,854       13.30       85.34
           14.5 |      1,648       11.82       97.17
           29.5 |        350        2.51       99.68
             45 |         45        0.32      100.00
    ------------+-----------------------------------
          Total |     13,939      100.00

Key explanatory variables

Live in fraternity/sorority house
  Indicator variable (dummy variable)
  Coded 1 if live in, 0 otherwise
Member of fraternity/sorority
  Indicator variable (dummy variable)
  Coded 1 if member, 0 otherwise

Three Regressions

Dependent variable: number of times drinking in past 30 days

                                 (1)        (2)        (3)
    Live in frat/sor house       4.44       ---        2.26
    (indicator variable)        (0.35)                (0.38)
    Member of frat/sor           ---        2.88       2.44
    (indicator variable)                   (0.16)     (0.18)
    Intercept                    4.54       4.27       4.27
                                (0.56)     (0.059)    (0.059)
    N                           13,876     13,876     13,876
    S.E.R.                       6.49       6.44       6.44
    R2                           .011       .023       .025

What is the substantive interpretation of the coefficients?

Note: Standard errors in parentheses. Corr. between living in frat/sor house and being a member of a Greek organization is .42.

The Picture

[Path diagram]
  Member of fraternity (X1) -> Drinks per 30 days (Y): $\hat\beta_1^{M} = 2.44$
  Living in frat house (X2) -> Drinks per 30 days (Y): $\hat\beta_2^{M} = 2.26$
  Member of fraternity (X1) -> Living in frat house (X2): $\delta_{21} = 0.19$

Accounting for the total effect

$$\hat\beta_1^{B} = \hat\beta_1^{M} + \hat\beta_2^{M} \delta_{21}$$

Total effect = Direct effect + Indirect effect

Here $\delta_{21}$ is the slope from regressing X2 on X1, that is, $\mathrm{cov}(X_1, X_2)/\mathrm{var}(X_1)$. The path diagram and its values are the same as in The Picture.

Accounting for the effects of frat house living and Greek membership on drinking

    Effect                     Total    Direct        Indirect
    Member of Greek org.       2.88     2.44 (85%)    0.44 (15%)
    Live in frat/sor. house    4.44     2.26 (51%)    2.18 (49%)

Total effects are from the bivariate regressions, direct effects are from the multiple regression, and indirect effects follow from the accounting identity T = D + I.

MIT OpenCourseWare
http://ocw.mit.edu

17.871 Political Science Laboratory
Spring 2012

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
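The accounting table for the Greek life example follows mechanically from the regression coefficients. As a rough check, the Python sketch below (the dictionary keys and variable names are hypothetical labels of my own) recomputes the indirect effects as total minus direct, the percentage shares, and the slope linking membership to house residence that the identity implies.

```python
# Coefficients copied from the three regressions in the slides.
total = {"member": 2.88, "house": 4.44}    # bivariate regressions
direct = {"member": 2.44, "house": 2.26}   # multiple regression

rows = {}
for name in total:
    indirect = total[name] - direct[name]  # accounting identity T = D + I
    rows[name] = {
        "indirect": round(indirect, 2),
        "direct_share": round(100 * direct[name] / total[name]),
        "indirect_share": round(100 * indirect / total[name]),
    }

# Slope of house residence on membership implied by
# total(member) = direct(member) + direct(house) * delta_21.
implied_delta_21 = (total["member"] - direct["member"]) / direct["house"]

for name, row in rows.items():
    print(name, row)
print("implied delta_21:", round(implied_delta_21, 2))
```

The implied slope is backed out from rounded coefficients, so it should be read as approximate.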