Stat 921 Notes 14
I. Propensity Score Caliper Matching
Matching on the propensity score focuses entirely on balance
and not on obtaining close matches.
A compromise between obtaining close matches and good balance
is propensity score caliper matching.
Reference: Rosenbaum, P.R. and Rubin, D.B. (1985),
“Constructing a control group using multivariate matched sampling
methods that incorporate the propensity score,” The American
Statistician, 39, 33-38.
With a caliper of width w, if two individuals, say k and l, have
propensity scores that differ by more than w, then the distance
between these individuals is set to $\infty$; whereas if the propensity
scores differ by w or less, the distance is a measure of the
proximity of $x_k$ and $x_l$.
A caliper of 20% of the standard deviation of the propensity
score is a common choice. A reasonable strategy is to start with
this width and to narrow the caliper if needed to obtain balance on
the propensity score.
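A minimal R sketch of this choice (the vector phat of estimated propensity scores is a hypothetical placeholder; for the welder example, the fitted values are computed later in these notes):
# phat is a hypothetical vector of estimated propensity scores
phat=runif(47,.2,.8); # placeholder values for illustration only
w=0.2*sd(phat); # caliper width = 20% of the SD of the propensity score
w
# If balance on the propensity score is poor, try a narrower caliper, e.g. w=0.1*sd(phat)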
Within the caliper, a good measure of distance between $x_k$ and
$x_l$ is the Mahalanobis distance. If $\hat{\Sigma}$ is the sample covariance
matrix of $x$, then the estimated Mahalanobis distance between
$x_k$ and $x_l$ is
$(x_k - x_l)^T \hat{\Sigma}^{-1} (x_k - x_l)$.
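For concreteness, here is a minimal R sketch of this calculation; the simulated matrix X and the two rows compared are hypothetical placeholders, not the welder data.
# Hypothetical sketch: X is an n x p covariate matrix; compare rows 1 and 2
set.seed(1);
X=matrix(rnorm(100*3),100,3);
Sigmahat=cov(X); # sample covariance matrix, the Sigma-hat in the formula
xk=X[1,]; xl=X[2,];
t(xk-xl)%*%solve(Sigmahat)%*%(xk-xl); # estimated Mahalanobis distance
mahalanobis(xl,center=xk,cov=Sigmahat) # built-in function gives the same value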
Speaking very informally, in the Mahalanobis distance, a
difference of one standard deviation counts the same for each
covariate in x . Even as an informal description, this is not quite
correct. The Mahalanobis distance takes account of the
correlations among variables. If one covariate in x were weight
in pounds rounded to the nearest pound and another were weight
in kilograms rounded to the nearest kilogram, then the
Mahalanobis distance would come very close to counting those
two covariates as a single covariate because of their high
correlation.
The Mahalanobis distance was originally developed for use with
multivariate normal data, and for data of this type it works fine.
When the data are not normal, the Mahalanobis distance can
exhibit some odd behavior. If one covariate contains extreme
outliers or has a long-tailed distribution, its standard deviation
will be inflated, and the Mahalanobis distance will tend to
ignore that covariate in matching. With binary indicators, the
variance is largest for events that occur about half the time, and
it is smallest for events with probability near zero or one. In
consequence, the Mahalanobis distance gives greater weight to
binary variables with probabilities near zero or one than to
binary variables with probabilities closer to one half. If there
were binary indicators for the states of the US, then the
Mahalanobis distance would regard matching for Wyoming as
vastly more important than matching for California, simply
because fewer people live in Wyoming. In many contexts, rare
binary covariates are not of overriding importance, and outliers
do not make a covariate unimportant, so the Mahalanobis
distance may not be appropriate with covariates of this kind.
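As a quick numerical check of this point, the variance of a binary indicator with prevalence p is p(1-p), which is largest at p = 1/2; a minimal R sketch:
# Variance of a binary indicator as a function of its prevalence p
p=c(0.01,0.10,0.50);
p*(1-p) # 0.0099, 0.090, 0.250: rare indicators have much smaller variance,
# so an inverse-covariance distance weights differences on them more heavily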
A simple alternative to the Mahalanobis distance (i) replaces
each of the covariates, one at a time, by its ranks, with average
ranks for ties; (ii) pre-multiplies and post-multiplies the
covariance matrix of the ranks by a diagonal matrix whose
diagonal elements are the ratios of the standard deviation of the
untied ranks $1, 2, \ldots, L$ (where $L$ is the number of individuals) to
the standard deviations of the tied ranks of the covariates; and
(iii) computes the Mahalanobis distance using the ranks and this
adjusted covariance matrix. This is called the rank-based
Mahalanobis distance; the R function smahal in the example
below implements these steps. Step (i) limits the influence of
outliers. After step (ii) is complete, the adjusted covariance
matrix has a constant diagonal. Step (ii) prevents heavily tied
covariates, such as rare binary variables, from having increased
influence due to their reduced variance.
Penalty functions: There may be no pair matching in which the
caliper on the propensity score is respected for all 21 matched
pairs. For this reason, instead of using an infinite distance when
the propensity scores are further apart than the caliper, we use a
“penalty function” which exacts a large but finite penalty for
violations of the constraint, e.g.,
$1000 \times \max(0, |\hat{e}(x_k) - \hat{e}(x_l)| - w)$,
so if the propensity scores for units k and l are within w of each
other, no penalty is exacted, but if the propensity scores are
further apart than w, then the penalty is
$1000 \times (|\hat{e}(x_k) - \hat{e}(x_l)| - w)$.
This penalty is added to the rank-based Mahalanobis distance for
the corresponding pair. Optimal matching will try to avoid the
penalties by respecting the caliper, but when that is not possible,
it will prefer to match so that the caliper is only slightly violated
for a few matched pairs.
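A small worked example with made-up numbers (hypothetical, not from the welder data): suppose the caliper width is w = 0.05.
# Hypothetical numbers for illustration only
w=0.05;
ehat.k=0.40; ehat.l=0.52; # propensity scores 0.12 apart, violating the caliper
1000*max(0,abs(ehat.k-ehat.l)-w) # penalty = 1000*(0.12-0.05) = 70
ehat.k=0.40; ehat.l=0.43; # within the caliper
1000*max(0,abs(ehat.k-ehat.l)-w) # penalty = 0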
Example: Welder data from Notes 13.
# Data
treatment=c(rep(1,21),rep(0,26));
age=c(38,44,39,33,35,39,27,43,39,43,41,36,35,37,39,34,35,53,38,37,38,48,63,44,
  40,50,52,56,47,38,34,42,36,41,41,31,56,51,36,44,35,34,39,45,42,30,35);
african.american=c(0,0,0,1,rep(0,5),1,rep(0,11),1,rep(0,12),rep(1,4),rep(0,9));
smoker=c(rep(0,2),rep(1,4),0,1,1,0,1,0,0,0,1,0,1,0,1,0,1,0,0,1,rep(0,8),1,0,1,1,1,0,1,
  0,0,1,1,0,0,0,1);
Xmat=cbind(age,african.american,smoker);
# Outcome: dpc = DNA-protein cross-links in percent in white blood cells
dpc=c(1.77,1.02,1.44,.65,2.08,.61,2.86,4.19,4.88,1.08,2.03,2.81,.94,1.43,1.25,2.97,
  1.01,2.07,1.15,1.07,1.63,1.08,1.09,1.1,1.1,.93,1.11,.98,2.2,.88,1.55,.55,1.04,1.66,
  1.49,1.36,1.02,.99,.65,.42,2.33,.97,.62,1.02,1.78,.95,1.59);
# The propensity score model building and balance checking process leads us to a
# propensity score model that includes all variables, interactions and squares.
agesq=age^2;
age.race=age*african.american;
age.smoker=age*smoker;
race.smoker=african.american*smoker;
# Propensity score estimate
model3=glm(treatment~age+agesq+african.american+smoker+age.race+
  age.smoker+race.smoker,family=binomial);
propscore.model3=predict(model3,type="response");
# Function for computing
# rank based Mahalanobis distance. Prevents an outlier from
# inflating the variance for a variable, thereby decreasing its importance.
# Also, the variances are not permitted to decrease as ties
# become more common, so that, for example, it is not more important
# to match on a rare binary variable than on a common binary variable
# z is a vector, length(z)=n, with z=1 for treated, z=0 for control
# X is a matrix with n rows containing variables in the distance
smahal=function(z,X){
  X<-as.matrix(X)
  n<-dim(X)[1]
  rownames(X)<-1:n
  k<-dim(X)[2]
  m<-sum(z)
  # Replace each covariate by its ranks, with average ranks for ties
  for (j in 1:k) X[,j]<-rank(X[,j])
  cv<-cov(X)
  # Rescale the covariance matrix of the ranks so that ties (e.g., rare binary
  # variables) do not inflate a covariate's influence through reduced variance
  vuntied<-var(1:n)
  rat<-sqrt(vuntied/diag(cv))
  cv<-diag(rat)%*%cv%*%diag(rat)
  out<-matrix(NA,m,n-m)
  Xc<-X[z==0,]
  Xt<-X[z==1,]
  rownames(out)<-rownames(X)[z==1]
  colnames(out)<-rownames(X)[z==0]
  library(MASS)
  # Use a generalized inverse in case the adjusted covariance matrix is singular
  icov<-ginv(cv)
  for (i in 1:m) out[i,]<-mahalanobis(Xc,Xt[i,],icov,inverted=T)
  out
}
# Rank based Mahalanobis distance
distmat1=smahal(treatment,Xmat);
# Function for adding propensity score caliper
# caliper*standard deviation of the propensity score p is the width of the caliper
addcaliper=function(dmat,z,p,caliper=0.2,penalty=1000){
  # Add a penalty to dmat for violations of the caliper on p
  sdp<-sd(p)
  # Absolute difference in p between each treated unit (rows) and control (columns)
  adif<-abs(outer(p[z==1],p[z==0],"-"))
  # Amount by which each difference exceeds the caliper (zero if within the caliper)
  adif<-(adif-(caliper*sdp))*(adif>(caliper*sdp))
  dmat<-dmat+adif*penalty
  dmat
}
# Add propensity score caliper
distmat2=addcaliper(distmat1,treatment,propscore.model3);
# Optimal pair match
library(optmatch)
pairmatchvec=pairmatch(distmat2);
# Create a vector saying which control unit each treated unit is matched to
pairs.short=substr(pairmatchvec,start=3,stop=10);
pairsnumeric=as.numeric(pairs.short);
notreated=sum(treatment);
pairsvec=rep(0,notreated);
for(i in 1:notreated){
  # Sum the row indices of the two units in matched pair i and subtract i
  # (the treated unit's row) to obtain the row index of the matched control
  temp=(pairsnumeric==i)*seq(1,length(pairsnumeric),1);
  pairsvec[i]=sum(temp,na.rm=TRUE)-i;
}
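As an optional check not in the original code, one can count how many matched pairs violate the propensity score caliper of 20% of the standard deviation of the propensity score:
# Check propensity score distances within matched pairs against the caliper width
caliperwidth=0.2*sd(propscore.model3);
pair.dist=abs(propscore.model3[1:notreated]-propscore.model3[pairsvec]);
sum(pair.dist>caliperwidth) # number of matched pairs violating the caliper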
# Assessment of balance
# Calculate standardized differences
notreated=sum(treatment);
Xmat=cbind(age,african.american,smoker,age.race,age.smoker,race.smoker);
treatedmat=Xmat[1:notreated,];
# Standardized differences before matching
controlmat.before=Xmat[(notreated+1):nrow(Xmat),];
controlmean.before=apply(controlmat.before,2,mean);
treatmean=apply(treatedmat,2,mean);
treatvar=apply(treatedmat,2,var);
controlvar=apply(controlmat.before,2,var);
stand.diff.before=(treatmean-controlmean.before)/sqrt((treatvar+controlvar)/2);
# Standardized differences after matching
controlmat.after=Xmat[pairsvec,];
controlmean.after=apply(controlmat.after,2,mean);
stand.diff.after=(treatmean-controlmean.after)/sqrt((treatvar+controlvar)/2);
stand.diff.before
             age african.american           smoker         age.race
      -0.6449557       -0.2734547        0.3562784       -0.3295512
      age.smoker      race.smoker
       0.3286498       -0.2443904
stand.diff.after
             age african.american           smoker         age.race
    -0.275803443      0.000000000      0.381989217     -0.009209958
      age.smoker      race.smoker
     0.400160703      0.000000000
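An optional addition, not in the original code: the before and after standardized differences can be printed side by side for easier comparison.
# Side-by-side comparison of standardized differences
round(cbind(stand.diff.before,stand.diff.after),3)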
Balance is not great on some variables; we will examine full
matching later.
For illustrative purposes, let’s consider making inferences
assuming balance is fine.
welder.dpc=dpc[1:notreated];
control.dpc=dpc[pairsvec];
boxplot(welder.dpc,control.dpc,names=c("Welder","Control"))
# Inference Under Additive Treatment Effect Model
wilcox.test(welder.dpc,control.dpc,paired=TRUE,conf.int=TRUE);
Wilcoxon signed rank test
data: welder.dpc and control.dpc
V = 180, p-value = 0.02385
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
0.095 1.195
sample estimates:
(pseudo)median
0.595
If there is no hidden bias, there is strong evidence that being a
welder increases inappropriate DNA-protein cross-links.