Addressing Alternative Explanations: g

advertisement
Addressing Alternative
Explanations:
Multiple Regression
g
17.871
Spring 2012
1
Did Cli
Clinton
t h
hurtt G
Gore example
l

Did Clinton hurt Gore in the 2000 election?
 Treatment
is not liking Bill Clinton
2
Bivariate regression of Gore thermometer on
on
Clinton thermometer
Clinton thermometer
3
Did Cli
Clinton
t h
hurtt G
Gore example
l




What alternative explanations would you need to
address?
Nonrandom selection into the treatment group (disliking
Clinton) from many sources
Let’s address one source: party identification
How could we do this?

Matching: compare Democrats who like or don’t like Clinton; do
the same for Republicans and independents
 Multivariate regression: control for partisanship statistically


Also called multiple regression, Ordinary Least Squares (OLS)
Presentation below is intuitive
4
Democrati
tic piicture
t
Clinton thermometer
5
I d
Independ
dentt piicture
t
Clinton thermometer
6
R
Republican
bli
piicture
t
Clinton thermometer
7
C bi d d
Combined
data
t picture
i t Clinton thermometer
8
Combined data picture with
regression: bias!
Clinton thermometer
9
Combined data picture with
“true” regression lines overlaid
Clinton thermometer
10
Tempting
p g yyet wrong
g
normalizations
Subtract the Gore
therm. from the
avg. Gore therm.
therm.
score
Clinton thermometer
Subtract the Clinton
Clinton
therm. from the
avg. Clinton therm.
score
Clinton thermometer
11
3D R
Relati
l tionship
hi
12
3D Linear Relationship
13
3D R
Rellationshi
ti
hip: Cli
Clinton
t
50
100
14
0
3D R
Relati
l tionship:
hi party
t
Rep
Ind
15
Dem
The Linear Relationship between Three
Three
Variables
Gore
thermometer
Clinton
thermometer
Party ID
Yi   0  1 X 1,i   2 X 2,i   i
16
The method of least squares
(again)
Pick  0 , 1 , and  2
to minimize
minimize
n
2
ˆ
(
Y

Y
)
or
 i i or
i1
n
2
(Y




X


X
)
 i 0 1 i 2 2
i1
17
Th Sl
The
Slope C
Coeffici
ffi ients
t
n
ˆ1
1 
 (Y  Y )( X
i
1
 X 1,i )
i1
n
2
(
X

X
)
 1 1,i
n
- ˆ2
(X
ˆ2 
 (Y  Y )(X
i
2
 X 1,i )
i1
n
2
(
X

X
)
 2 2,i
 X 1,i )( X 2  X 2,i
)
i1
n
2
(
X

X
)
 1 1,i
i1
n
1
and
i1
n
- ˆ1
(X
1
 X 1,i )( X 2  X 2,i
)
i1
n
2
(
X

X
)
 2 2,i
i1
i1
0  Y  1 X 1  2 X 2
X1 is Clinton thermometer, X2 is PID, and Y is Gore thermometer
18
The Slope Coefficients More
Simply
cov(X 1 ,Y ) ˆ cov(X 1 , X 2 )
ˆ
1 
and
- 2
var(X 1 )
var(X 1 )
cov(X 2 ,Y ) ˆ cov(X 1 , X 2 )
ˆ
2 
- 1
var(X 2 )
var(X 2 )
var(X
X1 is Clinton thermometer, X2 is PID, and Y is Gore thermometer
19
Th M
The
Matri
t ix fform
y1
y2
…
1
1
1
x1,1 x2,1 … xk,1
x1,2 x2,2 … xk,2
… … … …
yn
1
x1,n x2,n … xk,n
  ( X X ) X y
1
20
Multivariate slope coefficients
Clinton effect
(on Gore) in
bivariate (B)
regression
Bivariate estimate:
Multivariate estimate:
Are Gore and Party ID
related?
cov(( X 1 ,Y )
B
ˆ
1 
vs
vs.
var( X 1 )
cov( X 1 ,Y ) ˆ M cov( X 1 , X 2 )
M
ˆ
1 
- 2
var( X 1 )
var( X 1 )
Are Clinton and
Partyy ID
related?
Clinton effect
( Gore)
(on
G ) in
i
multivariate (M)
regression
When does
B
M
ˆ
ˆ
1  1 ?
Obviously, when
ˆ 2M
cov( X 1 , X 2 )
0
var( X 1 )
X1 is Clinton thermometer, X2 is PID, and Y is Gore thermometer
21
Th Output
The
O t t
. reg gore clinton party3
df
Source |
SS
MS
-------------+-----------------------------Model |
629261.91
22 314630.955
314630.955
2
Residual | 522964.934 1742
300.209492
-------------+-----------------------------660.68053
Total | 1152226.84 1744
Number of obs
F( 2,
1742)
Prob
Prob > FF
R-squared
Adj R-squared
Root MSE
=
1745
= 1048.04
= 0.0000
=0.0000
0.5461
= 0.5456
= 17.327
Prob
-----------------------------------------------------------------------------gore |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------0.000
.0175952
clinton |
.5122875
29.12
.4777776
.5467975
0.000
10.31
5.770523
4.673191
.5594846
6.867856
party3 |
26.61862
_cons |
28.6299
1.025472
27.92
0.000
30.64119
------------------------------------------------------------------------------
Interpretation of clinton effect: Holding constant party identification
identification, a oneonepoint increase in the Clinton feeling thermometer is associated with a .51 increase
in the Gore thermometer.
22
S
Separate
t regressions
i
Intercept
p
Clinton
Party
(1)
23.1
0.62
--
(2)
55.9
-15.7
(3)
28.6
0.51
5.8
cov(X 1 ,Y ) ˆ cov(X 1 , X 2 )
ˆ
1 
- 2
and
and
var(X 1 )
var(X 1 )
cov(X 2 ,Y
Y ) ˆ cov(X 1 , X 2 )
ˆ
2 
- 1
var(X 2 )
var(X 2 )
23
Why did the Clinton Coefficient
change from 0.62 to 0.51
. corr gore clinton party, cov
(obs=1745)
|
gore clinton
party3
-------------+
+--------------------------gore | 660.681
clinton | 549.993 883.182
party3
t 3 | 13.7008
13 7008
16
16.905
905
.8735
8735
24
The Callcullatitions
Th
cov(gore, clinton
cov(gore
clinton)) 549
549.993
993
B
ˆ
1 

 0.6227
var(clinton)
883.182
cov(gore, clinton) ˆ M cov(clinton, party)
M
ˆ
1 
 2
var(( cli
lintton))
var(( cli
lintton))
549.993
16.905

 5.7705
883.182
883.182
 0.6227  0.1105
. corr gore clinton party,cov
(obs=1745)
|
gore clinton
party3
-------------+--------------------------gore | 660.681
clinton | 549.993 883.182
party3 | 13.7008
16.905
.8735
 0.5122
25
Another way of thinking about
this
R it
Rewrite
cov(gore, clinton) ˆ M cov(clinton, party)
M
ˆ
1 
 2
var(clinton)
var(clinton)
as
cov(gore, clinton) ˆ M ˆ M cov(clinton, party)
 1   2
var(clinton)
var(clinton)
Total effect = Direct effect + indirect effect
The Total Effect of the Clinton thermometer on the Gore thermometer (.61) can be
Broken down into a direct effect of .51, plus an indirect effect (though party) of .11
26
D i ki and
Drinking
dG
Greek
k Life
Lif E
Example
l

Why is there a correlation between living
y/sororityy house and drinking?
in a fraternity
 Greek
organizations often emphasize social gathering
g
gs that have alcohol. The effect is being in the Greek organization itself, not the
house.
 There’s something about the House environment itself.
27
Dependent variable: Times
Times
Drinking in Past 30 Days
:HFKVOHU+HQU\&ROOHJH$OFRKRO6WXG\+DUYDUG6FKRRORI3XEOLF+HDOWK
‹+DUYDUG6FKRRORI3XEOLF+HDOWK$OOULJKWVUHVHUYHG7KLVFRQWHQWLVH[FOXGHGIURP
RXU&UHDWLYH&RPPRQVOLFHQVH)RUPRUHLQIRUPDWLRQVHHKWWSRFZPLWHGXIDLUXVH
28
. infix age 10-11 residence 16 greek 24 screen 102 timespast30 103 howmuchpast30 104 gpa 278-279 studying 281 timeshs
i
h 325 h
howmuchhs
hh 326 socializing
i li i
283 stwgt_99
99 475
475-493
493
weight99 494-512 using da3818.dat,clear
(14138 observations read)
. recode timespast30 timeshs (1=0) (2=1.5) (3=4) (4=7.5) (5=14.5) (6=29.5) (7=45)
(timespast30: 6571 changes made)
(timeshs 10272 changes made)
(timeshs:
. replace timespast30=0 if screen<=3
(4631 real changes made)
29
. tab timespast30
timespast30 |
Freq.
Cum.
Percent
------------+
+----------------------------------0 |
4,652
33.37
33.37
19.64
1.5 |
2,737
53.01
19.03
4 |
2,653
72.04
13.30
7.5 |
1,854
85.34
11.82
14.5 |
1,648
97.17
29.5 |
350
2.51
99.68
45 |
45
0 32
0.32
100.00
100 00
------------+----------------------------------Total |
13,939
100.00
30
K explanatory
Key
l
t
variiables
bl

Live in fraternity/sorority house
 Indicator
variable (dummy variable)
 Coded 1 if live in, 0 otherwise

Member of fraternity/sorority
 Indicator
variable (dummy variable)
 Coded 1 if member
member, 0 otherwise
31
Th
Three
Regressi
R
ions
Dependent variable: number of times drinking in past 30 days
Live in frat/sor house
4.44
(0.35)
---
2.26
(0.38)
---
2.88
2
88
(0.16)
2.44
2
44
(0.18)
4.54
(0.56)
4.27
(0.059)
4.27
(0.059)
S.E.R.
6.49
6.44
6.44
R2
.011
011
.023
023
.025
025
(indicator variable)
M b off frat/sor
Member
f t/
(indicator
variable)
Intercept
N
13,876
13,876
13,876
What is the substantive interpretation of the coefficients?
Note:
N
t Standard
St d d errors in
i parentheses.
th
Corr.
C
Between
B t
living
li i in
i
frat/sor house and being a member of a Greek organization is .42
32
Th Pi
The
Picture
t
Living in frat
house X2
ˆ 2M =2.26
Drinks
i k per
30 days Y
 21 =0.19
Member of
fraternityy X1
̂1M =2.44
33
A
Accounting
ti ffor th
the ttotal
t l effect
ff t
B
M
M
ˆ
ˆ
ˆ
1  1   2 
21
Total effect = Direct effect + indirect effect
Living in frat
house X2
ˆ 2M =2.26
Drinks per
30 days Y
 21 =0.19
Member of
fraternity X1
ˆ1M =2.44
34
Accounting
g for the effects of frat
house living and Greek membership on drinking
From
From multiple
regressions
From
bi i t
bivariate
regressions
Effect
Member of
Greek org.
Total
2.88
Live in frat/
sor. house
4.44
Direct
2.44
(85%)
2.26
(51%)
35
accounting
identity: T=D+I
Indirect
0.44
(15%)
2.18
(49%)
MIT OpenCourseWare
http://ocw.mit.edu
17.871 Political Science Laboratory
Spring 2012
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Download