Propensity Score Matching

Definition

“The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates.” (Rosenbaum and Rubin 1983)

When to Use Propensity Score Matching

PSM is often used to estimate the impact of a policy or program by comparing people subject to the policy (the treated group) with people who are not subject to the policy but are as similar to the treated group as possible (the untreated group).

Why Use PSM?

PSM corrects for observable differences between the treated and untreated groups, such as those that arise from selection into the program. One of the main problems with estimating a program’s impact by simply comparing participants and nonparticipants is that the two groups may differ in their outcomes even without the intervention.

How We Reduce Bias

We need to construct a comparison group that is very similar to the treatment group, so that it can stand in for the counterfactual: what would have happened to the treated group without the program. The comparison group is matched to the treatment group on the basis of a set of observed characteristics, or using the predicted probability of participation given observed characteristics (the “propensity score”).

We estimate the average causal effect of a treatment in a population under the assumption that, conditional on the observed covariates, treatment assignment is independent of the potential outcomes. In that case, the mean outcome of the treated cases minus the mean outcome of comparable untreated cases (those with the same propensity score) provides an unbiased estimate of E(Y1 - Y0), the population average causal effect.

Steps in Propensity Score Matching

Step 1: Select a set of covariates from which to estimate the propensity score. The selection should be based on empirical evidence about relationships between the variables of interest.

Step 2: Pool the treated and untreated groups and estimate the propensity score for each subject.
The most common method of determining the propensity score is to run a logit regression of treatment status on the set of explanatory variables and take the predicted probability of being treated. Think of the propensity score as a prediction, for each individual, of the probability that the individual would have been included in the treatment group.

Step 3: Match each subject in the treated group to a subject in the untreated group based on the propensity score. There are five methods to do this [1]:

Nearest neighbor: “The most straightforward matching estimator is nearest neighbor (NN) matching. The individual from the comparison group is chosen as a matching partner for a treated individual that is closest in terms of propensity score. Several variants of NN matching are proposed, e.g. NN matching ‘with replacement’ and ‘without replacement’. In the former case, an untreated individual can be used more than once as a match, whereas in the latter case it is considered only once. Matching with replacement involves a trade-off between bias and variance. If we allow replacement, the average quality of matching will increase and the bias will decrease.”

Caliper and Radius: “NN matching faces the risk of bad matches, if the closest neighbor is far away. This can be avoided by imposing a tolerance level on the maximum propensity score distance (caliper). Imposing a caliper works in the same direction as allowing for replacement. Bad matches are avoided and hence the matching quality rises. However, if fewer matches can be performed, the variance of the estimates increases.”

Stratification: “The idea of stratification matching is to partition the common support of the propensity score into a set of intervals (strata) and to calculate the impact within each interval by taking the mean difference in outcomes between treated and control observations.
This method is also known as interval matching, blocking and subclassification.”

Kernel: “Kernel matching (KM) and local linear matching (LLM) are nonparametric matching estimators that use weighted averages of all individuals in the control group to construct the counterfactual outcome. Thus, one major advantage of these approaches is the lower variance which is achieved because more information is used. A drawback of these methods is that possibly observations are used that are bad matches.”

Weighting: “Imbens (2004) notes that propensity scores can also be used as weights to obtain a balanced sample of treated and untreated individuals. If the propensity score is known, the estimator can directly be implemented as the difference between a weighted average of the outcomes for the treated and untreated individuals. Unless in experimental settings, the propensity score has to be estimated.”

Step 4: Assess the matching quality. Use a t-test or F-test to check that, after matching, there are no statistically significant differences in the covariates between the treated and untreated groups.

Step 5: Estimate the effect. The estimate will generally be the average treatment effect on the treated (ATT) rather than the average treatment effect (ATE), which is defined over the whole population.

[1] Caliendo and Kopeinig. “Some practical guidance for the implementation of propensity score matching.”

How to Use Propensity Score Matching in Stata

Stata 13 and later include a built-in command, teffects psmatch; earlier versions do not have a built-in command for propensity score matching. There are also several user-written modules for this method. You can find these modules using the net command as follows:

. net search psmatch2
. net search pscore
. net search nnmatch

You can install these modules using the ssc or net command, for example:

. ssc install psmatch2, replace

After installation, read the help files to find the correct usage, for example:

. help psmatch2
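
Outside Stata, Steps 2 to 5 can be illustrated with a minimal Python sketch that uses only the standard library. Everything in it is an assumption made for illustration and is not part of the handout's method: the data are simulated with a single covariate x and an assumed true treatment effect of 2.0, the function names (simulate, fit_logit, match_nearest, run_psm) are hypothetical, the logit is fitted by plain gradient ascent rather than a statistics package, matching is nearest neighbor with replacement inside a caliper of 0.05, and the balance check is a simple difference in covariate means rather than a formal t-test.

```python
import math
import random


def simulate(n=400, seed=0):
    """Hypothetical data: covariate x drives both treatment d and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p_true = 1.0 / (1.0 + math.exp(-(0.8 * x - 0.3)))
        d = 1 if rng.random() < p_true else 0
        y = 2.0 * d + 1.5 * x + rng.gauss(0.0, 1.0)  # assumed true effect: 2.0
        rows.append((x, d, y))
    return rows


def fit_logit(xs, ds, steps=500, lr=0.5):
    """Step 2: logit of treatment status on x, fitted by plain gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, d in zip(xs, ds):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += d - p
            g1 += (d - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def match_nearest(scores, ds, caliper=0.05):
    """Step 3: nearest-neighbor matching with replacement, inside a caliper."""
    controls = [i for i, d in enumerate(ds) if d == 0]
    pairs = []
    for i, d in enumerate(ds):
        if d == 1:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            if abs(scores[j] - scores[i]) <= caliper:
                pairs.append((i, j))  # (treated index, matched control index)
    return pairs


def mean(values):
    return sum(values) / len(values)


def run_psm(rows, caliper=0.05):
    xs, ds, ys = zip(*rows)
    b0, b1 = fit_logit(xs, ds)
    # Predicted probability of treatment for every subject (the propensity score)
    scores = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
    pairs = match_nearest(scores, ds, caliper)
    treated = [i for i, d in enumerate(ds) if d == 1]
    controls = [i for i, d in enumerate(ds) if d == 0]
    return {
        "pairs": pairs,
        "scores": scores,
        # Step 4: covariate gap before and after matching (a crude balance check)
        "gap_before": mean([xs[i] for i in treated]) - mean([xs[j] for j in controls]),
        "gap_after": mean([xs[i] - xs[j] for i, j in pairs]),
        # Step 5: naive difference in means versus the matched ATT estimate
        "naive": mean([ys[i] for i in treated]) - mean([ys[j] for j in controls]),
        "att": mean([ys[i] - ys[j] for i, j in pairs]),
    }


result = run_psm(simulate())
print("matched pairs:", len(result["pairs"]))
print("covariate gap before/after:", result["gap_before"], result["gap_after"])
print("naive difference:", result["naive"], "matched ATT:", result["att"])
```

Treated subjects with no untreated subject inside the caliper are dropped, so the ATT here refers only to the matched treated subjects. Tightening the caliper should improve balance at the cost of fewer matches, which is the bias and variance trade-off described under Step 3.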