Matching Methods

Why match? The Rubin causal model (1973, 1979) reminds us that we are
always trying to get to the counterfactual: how would the world have turned out
differently, if one factor changed but everything else remained the same. This is the
same as asking what Y values would we have seen under the same social system X,
with one treatment applied. It points our attention to making causal inferences not
from widely disparate cases that vary on many factors, but from comparable cases
that are alike in every important regard except that some receive the treatment and
some the control. If we have otherwise comparable treatment and control groups,
then our observational studies begin to look a bit more like randomized
experiments.
What is matching? Matching algorithms (such as nearest neighbor, genetic,
etc.) find control group cases that look like the treatment group cases (except, of
course, that they didn’t get treated). You can then either downweight the
“incomparable” control cases (for instance, weighting by their propensity score) or
throw them out of your analysis entirely (“pre-processing” your dataset). Either
way, you will “improve the balance” between your treatment and control groups,
making them look more similar on other variables and thus approaching what your
dataset would look like in a randomized experiment.
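The matching step described above can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated, the variable names (x1, x2, treat, pscore) are hypothetical, and the greedy one-to-one rule shown here is just one simple form of nearest-neighbor matching, not the algorithm any particular package uses.

```r
# Minimal sketch: 1-to-1 nearest-neighbor matching on an estimated propensity
# score, without replacement. All data are simulated for illustration.
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
# Treatment is more likely when x1 is high, so the raw groups are imbalanced.
treat <- rbinom(n, 1, plogis(-1.5 + 1.2 * x1))
dat   <- data.frame(treat, x1, x2)

# Step 1: estimate each unit's propensity score with a logit model.
ps_model   <- glm(treat ~ x1 + x2, data = dat, family = binomial)
dat$pscore <- fitted(ps_model)

# Step 2: for each treated unit, take the closest control not yet used.
treated_idx      <- which(dat$treat == 1)
control_idx      <- which(dat$treat == 0)
matched_controls <- integer(0)
for (i in treated_idx) {
  available <- setdiff(control_idx, matched_controls)
  if (length(available) == 0) break
  gaps <- abs(dat$pscore[available] - dat$pscore[i])
  matched_controls <- c(matched_controls, available[which.min(gaps)])
}

# Step 3: keep treated units plus their matched controls ("pre-processing").
kept_treated <- treated_idx[seq_along(matched_controls)]
matched      <- dat[c(kept_treated, matched_controls), ]

# Compare the treated-minus-control gap in x1 before and after matching.
diff_before <- mean(dat$x1[dat$treat == 1]) - mean(dat$x1[dat$treat == 0])
diff_after  <- mean(matched$x1[matched$treat == 1]) -
               mean(matched$x1[matched$treat == 0])
```

Comparing diff_before with diff_after is the simplest version of checking whether matching "improved the balance" on a covariate.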
Advantages of Matching:

Removes outlying control cases that fall outside the range of
variation within your treatment group, and are thus not
relevant counterfactual cases. This problem is known as a “lack of common
support.”

Reduces model dependence, because your results are now
much less likely to vary by the assumptions that you make in
order to put control variables into your multivariate model.
Even if you enter these variables into a later model, matching
has broken the link between these variables and your
treatment, which means that your estimated effect of the
treatment is much more robust to (less likely to be biased by)
the parametric choices that you make regarding these other
variables.

Easy to interpret. Because researchers often simply match
and then compare mean values of the DV in the treatment
and control groups, this looks like an experiment and can be
presented to a lay audience quite simply.
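
The first advantage, removing control cases that lack common support, can be illustrated with a few lines of base R. The propensity scores below are made-up numbers, and using the treated group's observed range as the support region is one simple convention assumed here for illustration.

```r
# Hypothetical propensity scores, for illustration only.
pscore <- c(0.05, 0.10, 0.30, 0.45, 0.60, 0.35, 0.50, 0.70, 0.90)
treat  <- c(0,    0,    0,    0,    0,    1,    1,    1,    1)

# Range of scores observed among the treated cases.
treated_range <- range(pscore[treat == 1])

# A control case is "on support" if its score falls inside that range.
on_support <- treat == 1 |
  (pscore >= treated_range[1] & pscore <= treated_range[2])

# Row numbers of the control cases that would be pruned.
pruned <- which(!on_support)
```

Here the first three controls score below every treated case, so they are flagged as irrelevant counterfactuals, while the controls scoring 0.45 and 0.60 are kept.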
Limits of Matching:

A match is only as good as the covariates you match upon. It
doesn’t solve the threat of omitted variables.
How to Match in Practice:
The easiest way now is to use R, call the Zelig package, and then use
the MatchIt package (Gary King’s Shop, gking.harvard.edu/matchit) or
genetic matching (Jas Sekhon’s program,
http://sekhon.berkeley.edu/matching/).
> library(foreign)
> setwd("C:/Users/Thad/Documents/San Diego Files/Absentee Voters/San Diego 2008 Matching")
> Matchtrd800ls <- read.dta("m300ls_trd800ls.dta")
> dim(Matchtrd800ls)
[1] 937  18
> install.packages("MatchIt")
> library(MatchIt)
> names(Matchtrd800ls)
 [1] "fid.precin"    "consnum"       "consname"      "rv.totals"
 [5] "pvbm"          "pop2000"       "pcturban"      "pctblack"
 [9] "pcthisp"       "pctasian"      "pct18under"    "pct65pl"
[13] "pctownerocchh" "mail"          "adjac"         "near.fid"
[17] "near.dist"     "stratum"
> m.out <- matchit(mail ~ rv.totals + pcturban + pctblack + pcthisp + pctasian +
pct18under + pct65pl + pctownerocchh, data = Matchtrd800ls, method = "genetic", replace =
FALSE)
> summary(m.out)

Call:
matchit(formula = mail ~ rv.totals + pcturban + pctblack + pcthisp +
    pctasian + pct18under + pct65pl + pctownerocchh, data = Matchtrd800ls,
    method = "genetic", replace = FALSE)

Summary of balance for all data:
              Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance              0.889         0.014      0.081     0.875   0.988    0.871   0.999
rv.totals           159.214       603.254    139.474  -444.041 481.000  442.592 526.000
pcturban             43.424        89.523     27.552   -46.099  48.225   45.726 100.000
pctblack              3.765         4.032      6.329    -0.267   0.847    1.222  23.008
pcthisp              20.561        19.760     18.800     0.801   3.281    3.315  14.859
pctasian              4.909         7.581      8.779    -2.672   1.686    2.837  16.393
pct18under           25.029        23.868      9.389     1.161   1.760    2.583  42.637
pct65pl              13.744        13.534     11.160     0.211   1.635    2.116  34.746
pctownerocchh        74.809        65.261     27.087     9.547   5.722   10.003  28.874

Summary of balance for matched data:
              Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance              0.889         0.101      0.205     0.788   0.903    0.788   0.993
rv.totals           159.214       453.505    163.975  -294.291 284.000  294.291 497.000
pcturban             43.424        44.833     45.507    -1.409   0.000    1.631  18.395
pctblack              3.765         3.777      6.885    -0.012   0.195    0.441   7.002
pcthisp              20.561        18.288     17.393     2.273   2.610    3.036  20.665
pctasian              4.909         4.381      6.333     0.528   0.325    0.770   8.840
pct18under           25.029        25.662      7.271    -0.633   0.409    0.799   4.096
pct65pl              13.744        13.656      9.376     0.089   0.859    1.033   3.796
pctownerocchh        74.809        75.196     19.982    -0.388   0.878    1.546  13.915

Percent Balance Improvement:
              Mean Diff. eQQ Med eQQ Mean eQQ Max
distance           10.03   8.608    9.555   0.654
rv.totals          33.72  40.956   33.507   5.513
pcturban           96.94 100.000   96.434  81.605
pctblack           95.58  76.973   63.901  69.566
pcthisp          -183.61  20.476    8.405 -39.076
pctasian           80.22  80.741   72.860  46.076
pct18under         45.43  76.738   69.070  90.394
pct65pl            57.99  47.451   51.168  89.074
pctownerocchh      95.94  84.652   84.547  51.808

Sample sizes:
          Control Treated
All           834     103
Matched       103     103
Unmatched     731       0
Discarded       0       0

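To read the "Percent Balance Improvement" figures, it may help to recompute a few by hand. The sketch below assumes the improvement is 100 times the drop in the absolute mean difference, divided by the absolute mean difference before matching. The function name pct_improvement is hypothetical, and the inputs are the rounded mean differences printed in the two balance summaries for this run.

```r
# Percent balance improvement = 100 * (|before| - |after|) / |before|.
pct_improvement <- function(before, after) {
  100 * (abs(before) - abs(after)) / abs(before)
}

# Mean differences copied from the printed balance summaries (rounded values).
mean_diff_all     <- c(pcturban = -46.099, rv.totals = -444.041, pcthisp = 0.801)
mean_diff_matched <- c(pcturban = -1.409,  rv.totals = -294.291, pcthisp = 2.273)

improvement <- round(pct_improvement(mean_diff_all, mean_diff_matched), 2)
improvement
```

For pcturban and rv.totals this reproduces the printed 96.94 and 33.72. For pcthisp the result is negative, as in the printed table, because the treated and control groups ended up further apart on that variable after matching; small discrepancies from the printed figure reflect rounding in the inputs.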
Here is how we save the matched data, and then turn it into
a Stata file (with unmatched cases thrown out):
> output_trd800ls <- match.data(m.out, subclass="subclass")
> write.dta(output_trd800ls, file="C:/Users/Thad/Documents/San Diego Files/Absentee Voters/San Diego 2008 Matching/output_trd800ls.dta")
[Figure: paired bar charts on a 0 to 100 scale, comparing group means before and after matching.
Before Matching: VBM Precincts (mean for 103 precincts) vs. Traditional Precincts (mean for 495 precincts).
After Matching: VBM Precincts (mean for 101 precincts) vs. Traditional Precincts (mean for 101 precincts).]
Call:
matchit(formula = mailblt ~ vap.whi.pct + vap.blk.pct + vap.asi.pct +
vap.his.pct + vap.muloth.pct + reg.dem.pct + reg.rep.pct +
reg.oth.pct + pct1824 + pct2534 + pct3544 + pct4554 + pct5564 +
pct65pl + reg.male.pct + cdmargin + cdspend + admargin +
adspend, data = data2000, method = "nearest", ratio = 3)
matchit(formula = mailblt ~ vap.whi.pct + vap.blk.pct + vap.asi.pct +
vap.his.pct + vap.muloth.pct + reg.dem.pct + reg.rep.pct +
reg.oth.pct + pct1824 + pct2534 + pct3544 + pct4554 + pct5564 +
pct65pl + reg.male.pct + cdmargin + cdspend + admargin +
adspend, data = data2000, method = "nearest", ratio = 1)
What to Do After You Match:
Option #1: In the most commonly used approach, you compare levels
of the DV in your treatment and control group either through an Average
Treatment Effect (ATE) or an Average Treatment Effect on the Treated
(ATT). See works by Imbens, the definition in Ho et al., and the application in
Kousser and Mullin. Guido Imbens has the “match” package for Stata to
calculate these
(http://www.economics.harvard.edu/faculty/imbens/software_imbens).
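As a rough illustration of Option #1, the simplest ATT estimate on a matched dataset is a difference in mean outcomes between the treated cases and their matched controls. The data frame below is invented for illustration (the column names treat and turnout are hypothetical), and a real analysis would use the matched file produced by the matching step.

```r
# Hypothetical matched data: three treated precincts and three matched controls.
matched <- data.frame(
  treat   = c(1, 0, 1, 0, 1, 0),
  turnout = c(48, 52, 55, 56, 40, 45)
)

# ATT estimate: mean outcome among the treated minus mean among matched controls.
att <- mean(matched$turnout[matched$treat == 1]) -
       mean(matched$turnout[matched$treat == 0])

# A two-sample t-test gives a rough sense of the uncertainty around that gap.
tt <- t.test(turnout ~ treat, data = matched)
```

With these made-up numbers the treated mean is 143/3 and the control mean is 51, so att is about -3.33 percentage points.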
Option #2: In the approach recommended by Ho et al., you take your
pre-processed dataset (the one that has thrown out all of the incomparable,
unmatched cases) and then run multivariate models like you always would
have done. If you don’t have perfect balance between the treatment and
control groups (because you didn’t want to throw out too many cases, or
because the world isn’t perfect), then estimating a model with controls is
helpful. It is potentially less harmful than if you hadn’t matched, because the
specific parametric assumptions you make about the control variables will
not dramatically affect your estimated treatment effect.
regress turnpct mailblt
vp_blk_p vp_s_pct vp_hs_pc vp_mlt_p rg_dm_pc
rg_rp_pc rg_th_pc rg_ml_pc pct2534 pct3544 pct4554 pct5564 pct65pl cdmargin
cdspend admargin adspend [weight=totreg]
(analytic weights assumed)
(sum of wgt is   2.9035e+06)

      Source |       SS       df       MS              Number of obs =    4588
-------------+------------------------------           F( 18,  4569) =  554.94
       Model |  315209.548    18  17511.6415           Prob > F      =  0.0000
    Residual |  144179.132  4569  31.5559491           R-squared     =  0.6862
-------------+------------------------------           Adj R-squared =  0.6849
       Total |  459388.679  4587  100.150137           Root MSE      =  5.6175

------------------------------------------------------------------------------
     turnpct |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     mailblt |  -2.725629   .3794405    -7.18   0.000    -3.469515   -1.981742
    vp_blk_p |  -.2819971   .0122151   -23.09   0.000    -.3059447   -.2580496
    vp_s_pct |  -.0406101   .0104169    -3.90   0.000    -.0610323   -.0201879
    vp_hs_pc |  -.2364223   .0059486   -39.74   0.000    -.2480844   -.2247602
    vp_mlt_p |  -.9394277   .2440667    -3.85   0.000    -1.417916    -.460939
    rg_dm_pc |   .5404006   .0293499    18.41   0.000     .4828606    .5979406
    rg_rp_pc |   .4728693   .0289845    16.31   0.000     .4160456     .529693
    rg_th_pc |  -.5682599   .0582184    -9.76   0.000     -.682396   -.4541237
    rg_ml_pc |  -.1080187   .0163326    -6.61   0.000    -.1400384    -.075999
     pct2534 |  -.1708281    .022612    -7.55   0.000    -.2151585   -.1264977
     pct3544 |   .1431211   .0218624     6.55   0.000     .1002601     .185982
     pct4554 |   .3058852   .0235917    12.97   0.000      .259634    .3521364
     pct5564 |   .1913343   .0269946     7.09   0.000     .1384118    .2442568
     pct65pl |   .0522553   .0139891     3.74   0.000     .0248298    .0796808
    cdmargin |   .0293757    .006545     4.49   0.000     .0165443    .0422071
     cdspend |  -.0407063   .0463789    -0.88   0.380    -.1316314    .0502189
    admargin |   .0240662   .0073524     3.27   0.001      .009652    .0384803
     adspend |  -.0852189   .1133447    -0.75   0.452    -.3074293    .1369915
       _cons |   35.49562   2.913364    12.18   0.000     29.78402    41.20722
------------------------------------------------------------------------------
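
For readers who stay in R, a roughly analogous step to the weighted Stata regression above is lm() with a weights argument, run on the matched data. The sketch below uses simulated data and hypothetical variable names (turnpct, treat, pctblack, totreg); the true treatment effect is set to -3 in the simulation so the estimate can be checked against it. It is not a replication of the Stata model.

```r
# Simulated stand-in for a matched dataset (all names are hypothetical).
set.seed(1)
n        <- 400
treat    <- rep(c(1, 0), each = n / 2)
pctblack <- runif(n, 0, 30)
totreg   <- round(runif(n, 200, 2000))  # registered voters, used as weights
# The true treatment effect is fixed at -3 for this simulation.
turnpct  <- 60 - 3 * treat - 0.3 * pctblack + rnorm(n, sd = 2)
matched  <- data.frame(turnpct, treat, pctblack, totreg)

# Weighted least squares with a control variable, run on the matched data only.
fit <- lm(turnpct ~ treat + pctblack, data = matched, weights = totreg)
est <- coef(fit)[["treat"]]
summary(fit)
```

The coefficient on treat plays the role that the mailblt coefficient plays in the Stata output.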
Tips:
Ho et al. usefully note that “matching” should really be called “pruning,”
because you aren’t basing any of your analyses on how one observation in the
treatment group is paired with one or more specific observations in the control
group.
Jonathan Wand has a good reading list with links:
http://www.stanford.edu/class/polisci353/2004winter/reading.html
Elizabeth Stuart has a great website with links to various matching programs
in R and Stata,
http://www.biostat.jhsph.edu/~estuart/propensityscoresoftware.html
“Synthetic controls” are matching meets case studies. Suppose you’ve got
only one treated case, and no perfectly comparable control case. You can use this
approach to apply different weights to a number of control cases and then
synthesize a control case that looks just like the treatment case on covariates.
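
The weighting idea behind synthetic controls can be sketched in base R. This is a toy illustration under stated assumptions: the covariate values are invented, the names (urban, owner, to_weights, loss) are hypothetical, and a general-purpose optimizer stands in for the specialized routines that dedicated synthetic control software uses.

```r
# Hypothetical covariates: one treated case and four candidate controls.
treated  <- c(urban = 50, owner = 70)
controls <- rbind(
  c(urban = 90, owner = 60),
  c(urban = 40, owner = 75),
  c(urban = 20, owner = 80),
  c(urban = 60, owner = 65)
)

# Map unconstrained parameters to weights that are non-negative and sum to 1.
to_weights <- function(theta) {
  e <- exp(theta - max(theta))
  e / sum(e)
}

# Loss: squared distance between the treated case and the weighted controls.
loss <- function(theta) {
  synthetic <- colSums(controls * to_weights(theta))
  sum((synthetic - treated)^2)
}

# Start from equal weights and search for the best-fitting combination.
fit       <- optim(rep(0, nrow(controls)), loss, method = "BFGS")
w         <- to_weights(fit$par)
synthetic <- colSums(controls * w)
```

The vector w gives each control case's share in the synthetic control, and synthetic shows how closely the weighted combination reproduces the treated case on the covariates.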