Alexander Spermann, University of Freiburg, Summer Term 2009

Tutorial: Matching and Difference in Difference Estimation

This tutorial is based on the data set matchingdata.dta. Before using the data set, expand the memory with the command

    set memory 200m

To solve the exercise you will need the Stata package psmatch2, which you can install as follows:

Type net search psmatch2 in Stata
Click on: psmatch2 from http://fmwww.bc.edu/RePEc/bocode/p
Click on: 'click here to install'
Restart Stata

For a detailed description of psmatch2, type

    help psmatch2

In this exercise we again use the (experimental) data set for the evaluation of labor market reforms in the US (in slightly modified form). The matching approaches discussed in the course are applied to this already familiar data set, which is described again below.

As a reminder: Matching combines ("matches") a group of participants in a treatment with a group of non-participants who have the same characteristics. The control group is then used to estimate the unobservable (counterfactual) outcome. There are different approaches to finding good "matches". These are presented and applied in this exercise.

Comments on the data set:

treated     Dummy variable for treatment in a training program (1=treatment)
age         Age of the individual in 1977
educ        Years of education
black       Dummy for race (1=black)
hisp        =1, if "hispanic"
married     Dummy for family status (1=married)
nodegree    Dummy for type of formal education (1=no high school degree)
dwincl      =1, if used in the example of Dehejia and Wahba (Variable is used to define sample later on.)
re74        Real income 1974
re75        Real income 1975
re78        Real income 1978
early_ra    =1, if the individual from the experimental sample was randomly assigned in the first four months of random assignment (Variable is used to define sample later on.)
sample      Indicator for the sample the individual is in: 1=NSW; 2=CPS; 3=PSID

Problems:

1a) In the following exercises you will estimate the "Average Treatment Effect on the Treated" (ATT) by "Propensity Score Matching". As propensity score matching requires a comparison with another sample, the treated individuals of the experimental sample (sample==1) are complemented by the individuals of two other samples (sample==2 and sample==3). Replace the non-participants from the experimental sample by all the individuals from the two other samples.

    drop if treated == 0 & sample == 1
    replace treated=0 if sample==2 | sample==3

To check this, tabulate the variable "treated":

    tab treated

     indicator: |
           1 if |
     treated, 0 |
         if not |
        treated |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |     18,482       99.01       99.01
              1 |        185        0.99      100.00
    ------------+-----------------------------------
          Total |     18,667      100.00

1b) Generate the following variables, which will be used in the exercises. They are polynomial terms, indicator variables and interaction terms. Their use and meaning is the same as in the exercise on selection problems:

    gen age2=age*age
    gen age3=age2*age
    gen educ2=educ*educ
    gen re74_2=re74*re74
    gen re75_2=re75*re75
    gen zero_earn_74=re74==0
    gen zero_earn_75=re75==0
    gen int_educ_re74=educ*re74
    gen int_zero74_hisp=zero_earn_74*hisp

The following variable is a difference that will be used later on to combine "Propensity Score Matching" with the "Difference in Difference Estimator":

    gen d_earn=re78-re75

1c) First run a probit estimation to find out how the variables "age", "age2", "age3", "educ", "educ2", "black", "hisp", "married", "nodegree", "re74", "re75", "re74_2", "re75_2", "zero_earn_75", "int_educ_re74" and "int_zero74_hisp" influence the participation probability (treated=1 or treated=0) in the new overall sample.
Then predict the "propensity score".

    probit treated age age2 age3 educ educ2 black hisp married nodegree re74 int_zero74_hisp re75 re74_2 re75_2 zero_earn_75 int_educ_re74

    Iteration 0:   log likelihood = -1037.6992
    Iteration 1:   log likelihood = -630.39842
    Iteration 2:   log likelihood = -522.77856
    Iteration 3:   log likelihood = -481.07983
    Iteration 4:   log likelihood = -465.67343
    Iteration 5:   log likelihood = -462.91251
    Iteration 6:   log likelihood = -462.75867
    Iteration 7:   log likelihood = -462.75712
    Iteration 8:   log likelihood = -462.75712

    Probit estimates                                  Number of obs   =      18667
                                                      LR chi2(16)     =    1149.88
                                                      Prob > chi2     =     0.0000
    Log likelihood = -462.75712                       Pseudo R2       =     0.5541

    ------------------------------------------------------------------------------
         treated |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .9113437    .164721     5.53   0.000     .5884965    1.234191
            age2 |  -.0240836   .0052904    -4.55   0.000    -.0344525   -.0137147
            age3 |   .0001904   .0000538     3.54   0.000      .000085    .0002958
            educ |   .4455625   .1163902     3.83   0.000     .2174419    .6736832
           educ2 |  -.0264906   .0062148    -4.26   0.000    -.0386715   -.0143098
           black |   1.655884   .1095013    15.12   0.000     1.441265    1.870503
            hisp |   .7390261    .208835     3.54   0.000     .3297171    1.148335
         married |  -.7758958    .110946    -6.99   0.000    -.9933459   -.5584456
        nodegree |   .3827289   .1477954     2.59   0.010     .0930552    .6724027
            re74 |  -.0001589    .000047    -3.38   0.001     -.000251   -.0000669
            re75 |   -.000075   .0000182    -4.11   0.000    -.0001107   -.0000392
          re74_2 |   8.57e-10   3.71e-10     2.31   0.021     1.30e-10    1.58e-09
          re75_2 |   1.63e-10   3.23e-10     0.50   0.615    -4.71e-10    7.96e-10
    zero_earn_75 |   .3240265   .1167357     2.78   0.006     .0952287    .5528244
    int_educ_~74 |   9.05e-06   4.28e-06     2.12   0.034     6.66e-07    .0000174
    int_zero74~p |  -.0019342   .2992769    -0.01   0.995    -.5885061    .5846376
           _cons |  -14.23208   1.756289    -8.10   0.000    -17.67434   -10.78981
    ------------------------------------------------------------------------------
    note: 2723 failures and 0 successes completely determined.

The estimated coefficients cannot be interpreted directly! They are not the marginal effects of the explanatory variables on the dependent variable. Those would have to be calculated separately.

    predict double ps

The keyword double sets the storage type of the new variable (double precision). You can use

    sum ps

to check the "propensity score":

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
              ps |     18667    .0099275    .0545496   8.07e-43   .7932666

As the propensity score is a probability, it has to lie in the interval [0;1]. The average estimated probability of participating in the treatment across all individuals is 0.99%.

1d) Now estimate the ATT using "Propensity Score Matching". Look at the effect on the outcome variables "re74", "re75" and "re78". First interpret the results with respect to the outcome variable "re78", i.e. the real income in 1978. Use the command psmatch2.

Hints: The outcome variables are listed in parentheses after the option outcome. We use a two-step matching procedure: instead of listing the explanatory variables after treated in the psmatch2 command, we pass the "propensity score" estimated by the probit. This is especially useful for models with many variables, as it keeps the program concise.

    psmatch2 treated, outcome(re78 re74 re75) pscore(ps)

    There are observations with identical propensity score values.
    The sort order of the data could affect your results.
    Make sure that the sort order is random before calling psmatch2.
    Matching Method = neighbor  Metric = pscore
    ------------------------------------------------------------------
            Variable     Sample |    Treated     Controls   Difference
    ----------------------------+-------------------------------------
                re78  Unmatched |  6349.1435      15750.3  -9401.15645
                            ATT |  6349.1435   5074.05777   1275.08574
    ----------------------------+-------------------------------------
                re74  Unmatched | 2095.57369   14745.9287   -12650.355
                            ATT | 2095.57369   1895.30997   200.263719
    ----------------------------+-------------------------------------
                re75  Unmatched | 1532.05531   14380.0105  -12847.9552
                            ATT | 1532.05531    1100.9613   431.094014
    ----------------------------+-------------------------------------

- Only nearest-neighbour matching is applied here (the psmatch2 default).
- The estimated effect is given by the post-treatment variable "re78". For the individuals in the treatment group, the treatment raised real income by $1,275 on average.

1e) Interpret the results of the matching with respect to the real income in 1975.

The result for the pre-treatment variable "re75" is a so-called pre-program test. It checks whether matching balances the level of income before the treatment. The difference of $431.09 that remains after matching is due to unobserved factors. But at least it is much smaller in absolute value than the difference of -$12,847.96 before matching. The interpretation of "re74" is similar.

1f) Using the command pstest, check the success of the matching for the explanatory variables "age", "educ", "black", "married", "hisp", "nodegree", "re74" and "re75".
    pstest age educ black married hisp nodegree re74 re75

    ---------------------------------------------------------------------------
                            |       Mean                %reduct |    t-test
    Variable         Sample | Treated  Control    %bias  |bias| |      t  p>|t|
    ------------------------+-----------------------------------+--------------
    age           Unmatched |  25.816   33.444    -82.3         |  -9.43  0.067
                    Matched |  25.816   25.876     -0.6    99.2 |  -0.07  0.956
                            |                                   |
    educ          Unmatched |  10.346    12.04    -67.9         |  -7.92  0.080
                    Matched |  10.346   10.611    -10.6    84.4 |  -1.23  0.434
                            |                                   |
    black         Unmatched |  .84324   .09739    224.5         |  33.96  0.019
                    Matched |  .84324   .85405     -3.3    98.6 |  -0.26  0.837
                            |                                   |
    married       Unmatched |  .18919   .73255   -129.9         | -16.63  0.038
                    Matched |  .18919   .14595     10.3    92.0 |   1.01  0.498
                            |                                   |
    hisp          Unmatched |  .05946   .06671     -3.0         |  -0.39  0.761
                    Matched |  .05946   .03243     11.1  -272.6 |   1.12  0.463
                            |                                   |
    nodegree      Unmatched |  .70811    .2971     90.0         |  12.17  0.052
                    Matched |  .70811   .66486      9.5    89.5 |   0.81  0.566
                            |                                   |
    re74          Unmatched |  2095.6    14746   -156.5         | -16.63  0.038
                    Matched |  2095.6   1895.3      2.5    98.4 |   0.41  0.751
                            |                                   |
    re75          Unmatched |  1532.1    14380   -170.9         | -17.24  0.037
                    Matched |  1532.1     1101      5.7    96.6 |   1.34  0.408
                            |                                   |
    ---------------------------------------------------------------------------

This is a t-test of the hypothesis that the mean of each variable is the same in the treatment group and the non-treatment group. It is carried out before and after matching. If p>0.1, the null hypothesis cannot be rejected at the 10% significance level.

Furthermore, a bias before and after matching is calculated for each variable, and the change in this bias is reported. This "bias" is defined as the difference between the means of the treatment group and the (unmatched / matched) non-treatment group, divided by the square root of the average sample variance in the treatment group and the unmatched non-treatment group, expressed in percent.
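
The bias definition can be reproduced by hand. The following is a minimal sketch in Stata, not the internal code of pstest. It assumes the data set is in the state reached after part 1b) and uses "black" as an example; the scalar names m1, v1, m0 and v0 are chosen here for illustration only.

```stata
* Sketch: standardized bias (in percent) of "black" before matching.
* Assumption: treated==0 now marks the full (unmatched) comparison group.
quietly summarize black if treated == 1
scalar m1 = r(mean)
scalar v1 = r(Var)
quietly summarize black if treated == 0
scalar m0 = r(mean)
scalar v0 = r(Var)
* Difference in means over the square root of the average sample variance
display 100 * (m1 - m0) / sqrt((v1 + v0) / 2)
```

With the means shown in the table (.84324 and .09739), the result should come out close to the 224.5 that pstest reports for "black" in the unmatched sample.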
In the table one can see the differences in the explanatory variables between the two groups before matching. E.g., 84.3% of the treatment group are black, but only 9.7% of the control group. These factors have a significant influence on the treatment probability (see part 1c)). Matching reduces the differences between the treatment group and the non-treatment group considerably. An exception is the dummy hisp: for this variable the difference between the two groups is not reduced; the absolute bias rises from 3.0% to 11.1%. However, the "bias" was already rather small before matching. The null hypothesis that the means of the two groups do not differ after matching cannot be rejected for any variable.

Conclusion: Through the "propensity-score nearest-neighbour matching" it was possible to generate a control group which is similar enough to the treatment group to be used for the ATT estimation.

1g) Check graphically whether the assumption of "common support" holds in the example. If the assumption holds, the "propensity scores" of the participants and non-participants must overlap. Use the command psgraph.

    psgraph, bin(10)

[Figure: psgraph histogram of the propensity score by group (Treated, Untreated); horizontal axis: propensity score, from 0 to about .75]

Due to the scale, it is difficult to discern in this graph that each class of the "propensity score" also contains a certain number of untreated individuals. So we can assume that common support is given.

Additional problems:

1h) * Estimate the ATT as in part 1d), but use the kernel matching approach this time. Briefly interpret the results with respect to the outcome variables "re78", "re75" and "re74", comparing them to the results of parts 1d) and 1e).
    psmatch2 treated, kernel outcome(re78 re74 re75) pscore(ps)

    Matching Method = kernel  Metric = pscore
    ------------------------------------------------------------------
            Variable     Sample |    Treated     Controls   Difference
    ----------------------------+-------------------------------------
                re78  Unmatched |  6349.1435      15750.3  -9401.15645
                            ATT |  6349.1435    6433.4888  -84.3452968
    ----------------------------+-------------------------------------
                re74  Unmatched | 2095.57369   14745.9287   -12650.355
                            ATT | 2095.57369   3934.69386  -1839.12017
    ----------------------------+-------------------------------------
                re75  Unmatched | 1532.05531   14380.0105  -12847.9552
                            ATT | 1532.05531    3160.4338  -1628.37849
    ----------------------------+-------------------------------------

- The results for "re74" and "re75" are worse now: the pre-treatment differences that remain after matching (-1839.12 and -1628.38) are much larger in absolute value than with nearest-neighbour matching (200.26 and 431.09).
- The estimated ATT for "re78" is -84.35, i.e. slightly negative, compared with 1275.09 in part 1d).

1i) * Determine the "Average Treatment Effect" (ATE) of the training program on the basis of the "Propensity Score Matching" in part 1d). What does the result tell you with respect to the outcome variable "re78"?

Hint: The ATE is calculated analogously to the above matching procedure, by adding the option ate to the Stata command.

    psmatch2 treated, outcome(re78 re74 re75) pscore(ps) ate

    There are observations with identical propensity score values.
    The sort order of the data could affect your results.
    Make sure that the sort order is random before calling psmatch2.
    Matching Method = neighbor  Metric = pscore
    ------------------------------------------------------------------
            Variable     Sample |    Treated     Controls   Difference
    ----------------------------+-------------------------------------
                re78  Unmatched |  6349.1435      15750.3  -9401.15645
                            ATT |  6349.1435   5074.05777   1275.08574
                            ATU |    15750.3    1793.1172  -13957.1828
                            ATE |                          -13806.2228
    ----------------------------+-------------------------------------
                re74  Unmatched | 2095.57369   14745.9287   -12650.355
                            ATT | 2095.57369   1895.30997   200.263719
                            ATU | 14745.9287   8701.14418  -6044.78454
                            ATE |                          -5982.89275
    ----------------------------+-------------------------------------
                re75  Unmatched | 1532.05531   14380.0105  -12847.9552
                            ATT | 1532.05531    1100.9613   431.094014
                            ATU | 14380.0105   2101.52323  -12278.4873
                            ATE |                          -12152.5285
    ----------------------------+-------------------------------------

     psmatch2: |  psmatch2:
     Treatment |   Common
    assignment |   support
               | On support |     Total
    -----------+------------+----------
     Untreated |     18,482 |    18,482
       Treated |        185 |       185
    -----------+------------+----------
         Total |     18,667 |    18,667

The ATE, i.e. the average effect of the treatment for an individual drawn at random from the overall population, is -$13,806.22. So the real income of a randomly drawn person would be $13,806.22 lower because of participation in the labor market program. This results because a negative effect is estimated for the non-participants (ATU), who are much more numerous than the participants (see second table). So the ATE does not have a direct interpretation for the evaluation of the program.

The ATU is estimated by matching a similar participant to each non-participant. Because of the small number of participants, one would have to check whether balancing is also achieved for this control group. Otherwise the ATU might be biased.

ATE, ATT and ATU are linked as follows:

    ATE = N1/N * ATT + N0/N * ATU

where N1 is the number of participants, N0 is the number of non-participants and N = N1 + N0.
In the example we have:

    ATE = 185/18,667 * 1,275 + 18,482/18,667 * (-13,957) = -13,806

2) "Propensity Score Matching" with Difference in Difference

2a) Discuss whether the conditions for combining the two methods are met in this example.

Conditions:

- Panel data: We do not have real panel data, but at least for real income we have observations over time (before and after the program).
- Time-constant and additive selection bias in the outcome equation: We assume that unobserved factors have a constant influence on the outcome.

2b) Carry out a "Propensity Score Matching" in combination with a Difference in Difference approach. To do this, calculate the ATT on the outcome variable "d_earn", which was generated in part 1b) as the difference between the real incomes of 1978 and 1975. Interpret the results.

    psmatch2 treated, outcome(d_earn re78 re74 re75) pscore(ps)

    There are observations with identical propensity score values.
    The sort order of the data could affect your results.
    Make sure that the sort order is random before calling psmatch2.
    Matching Method = neighbor  Metric = pscore
    ------------------------------------------------------------------
            Variable     Sample |    Treated     Controls   Difference
    ----------------------------+-------------------------------------
              d_earn  Unmatched | 4817.08818    1370.2894   3446.79878
                            ATT | 4817.08818   3973.09646   843.991721
    ----------------------------+-------------------------------------
                re78  Unmatched |  6349.1435      15750.3  -9401.15645
                            ATT |  6349.1435   5074.05777   1275.08574
    ----------------------------+-------------------------------------
                re74  Unmatched | 2095.57369   14745.9287   -12650.355
                            ATT | 2095.57369   1895.30997   200.263719
    ----------------------------+-------------------------------------
                re75  Unmatched | 1532.05531   14380.0105  -12847.9552
                            ATT | 1532.05531    1100.9613   431.094014
    ----------------------------+-------------------------------------

With this approach we find an ATT of $843.99. So the real income of participants is raised by $843.99 through the program. The ATT is calculated as:

    ATT = (after - before)_treated - (after - before)_control = 4,817 - 3,973 = 844

By combining matching and DiD we have eliminated the time-constant unobserved effects. This can be seen by calculating the ATT as the difference between ATT_re78 ($1,275.09) and ATT_re75 ($431.09). ATT_re78 would be the ATT if the selection bias from time-constant unobserved effects is ignored; ATT_re75 is that selection bias.
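
Both routes to the DiD estimate can be checked with plain arithmetic. The lines below are only a sketch: they type in the figures printed in the psmatch2 table above and do not re-run the matching.

```stata
* Sketch: verify the DiD arithmetic from the printed table values.
* Route 1: matched difference in d_earn (treated minus controls)
display 4817.08818 - 3973.09646
* Route 2: ATT on re78 minus ATT on re75
display 1275.08574 - 431.094014
```

Both lines should print a value of about 843.99, which agrees with the ATT reported for "d_earn".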