Notes 16 - Wharton Statistics Department

Stat 921 Notes 16
Reading: Chapter 4.3-4.5
I. Sensitivity Analysis for the Signed Rank Test
Morton et al. (1982, American Journal of Epidemiology)
studied lead in the blood of children whose parents worked
in a factory where lead was used in making batteries. They
were concerned that children were exposed to lead
inadvertently brought home by their parents. Their study
included 33 such children from different families – they are
the exposed or treated children. The outcome Y was the
level of lead found in a child’s blood in μg/dl of whole
blood. The covariate x was two-dimensional, recording
age and neighborhood of residence. They matched each
exposed child to one control child of the same age and
neighborhood whose parents were employed in other
industries not using lead.
If this study were free of hidden bias, we would be justified
in analyzing the data using methods for a randomized
experiment with 33 matched pairs:
library(exactRankTests);  # provides wilcox.exact
leaddata.exposed=c(38,23,41,18,37,36,23,62,31,34,24,14,21,17,16,20,15,10,
                   45,39,22,35,49,48,44,35,43,39,34,13,73,25,27);
leaddata.control=c(16,18,18,24,19,11,10,15,16,18,18,13,19,10,16,16,24,13,9,
                   14,21,19,7,18,19,12,11,22,25,16,13,11,13);
wilcox.exact(leaddata.exposed,leaddata.control,paired=TRUE,conf.int=TRUE);
Exact Wilcoxon signed rank test
data: leaddata.exposed and leaddata.control
V = 499, p-value = 8.037e-07
alternative hypothesis: true mu is not equal to 0
95 percent confidence interval:
9.5 21.5
sample estimates:
(pseudo)median
15.25
There is strong evidence that lead exposure of the parents
affects the lead in the blood of children, assuming the study
is free of hidden bias; a 95% confidence interval for the
effect of lead exposure of the parents on lead in the blood
of children is (9.5, 21.5) μg/dl.
When there is the potential of hidden bias, we can consider
a sensitivity analysis based on the formula (1.4) from Notes
15, assuming a large sample normal approximation.
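#### Sensitivity Analysis for Signed Rank Statistic
#### Assumes that a one-sided test of whether the treatment has
#### a positive effect is being done.
#### diff is the vector of treated-minus-control differences
#### Gamma=exp(gamma) is the sensitivity parameter
sens.analysis.signedrank=function(diff,Gamma){
  rk=rank(abs(diff));   # ranks of |diff|, average ranks for ties
  s1=1*(diff>0);        # indicator of a positive difference
  s2=1*(diff<0);        # indicator of a negative difference
  W=sum(s1*rk);         # observed signed rank statistic
  # Mean and variance of the bounding statistics; zero differences
  # (s1+s2=0) contribute nothing
  Eplus=sum((s1+s2)*rk*Gamma)/(1+Gamma);
  Eminus=sum((s1+s2)*rk)/(1+Gamma);
  V=sum((s1+s2)*rk*rk*Gamma)/((1+Gamma)^2);
  Dplus=(W-Eplus)/sqrt(V);    # deviate giving the largest p-value
  Dminus=(W-Eminus)/sqrt(V);  # deviate giving the smallest p-value
  list(lowerbound=1-pnorm(Dminus),upperbound=1-pnorm(Dplus));
}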
diff=leaddata.exposed-leaddata.control;
Sensitivity Analysis for Lead in the Blood of Children: Range
of Significance Levels for the Signed Rank Statistic

Gamma    Minimum    Maximum
1        <0.0001    <0.0001
2        <0.0001    0.0018
3        <0.0001    0.0136
4        <0.0001    0.0388
4.25     <0.0001    0.0468
5        <0.0001    0.0740
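As a check on the Maximum column, the quantities computed inside sens.analysis.signedrank can be written out and evaluated by hand for one row. This is a sketch read off the function's code, not a restatement of formula (1.4) from Notes 15, which is not reproduced here. The symbol $q_i$ is notation introduced only for this sketch: it is the rank of $|d_i|$, where $d_i$ is the treated-minus-control difference in pair $i$, and all sums run over pairs with $d_i \neq 0$.

```latex
\begin{align*}
E(T^{+}) &= \frac{\Gamma}{1+\Gamma}\sum_{i:\,d_i\neq 0} q_i, \qquad
E(T^{-}) = \frac{1}{1+\Gamma}\sum_{i:\,d_i\neq 0} q_i, \\
\operatorname{Var}(T^{+}) &= \operatorname{Var}(T^{-})
  = \frac{\Gamma}{(1+\Gamma)^{2}}\sum_{i:\,d_i\neq 0} q_i^{2}, \\
\text{Maximum} &= 1-\Phi\!\left(\frac{W-E(T^{+})}{\sqrt{\operatorname{Var}(T^{+})}}\right), \qquad
\text{Minimum} = 1-\Phi\!\left(\frac{W-E(T^{-})}{\sqrt{\operatorname{Var}(T^{-})}}\right).
\end{align*}
% Lead data (one zero difference, average ranks for ties):
% W = 527, \sum q_i = 560, \sum q_i^2 = 12522.5.
% Row Gamma = 2:
\[
E(T^{+}) = \tfrac{2}{3}(560) \approx 373.3, \quad
\operatorname{Var}(T^{+}) = \tfrac{2}{9}(12522.5) \approx 2782.8, \quad
\frac{527-373.3}{\sqrt{2782.8}} \approx 2.91, \quad
1-\Phi(2.91) \approx 0.0018 .
\]
```

The same arithmetic with $\Gamma = 5$ gives a deviate of about 1.45, matching the 0.0740 in the last row.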
The table shows that to explain away the observed
association between parental exposure to lead and child’s
lead level, a hidden bias or unobserved covariate would
need to increase the odds of exposure by more than a factor
of $\Gamma = 4.25$. The association cannot be attributed to small
hidden biases, but it is somewhat more sensitive to bias
than Hammond’s study of heavy smokers.
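The range of significance levels can be regenerated for all six values of Gamma in one pass. The following is a minimal, self-contained sketch: it restates the lead data and the calculation in sens.analysis.signedrank so that it runs on its own. The names `signedrank_bounds` and `sens_table` are hypothetical and introduced only for this illustration.

```r
# Lead data from the notes (33 matched pairs).
exposed <- c(38,23,41,18,37,36,23,62,31,34,24,14,21,17,16,20,15,10,
             45,39,22,35,49,48,44,35,43,39,34,13,73,25,27)
control <- c(16,18,18,24,19,11,10,15,16,18,18,13,19,10,16,16,24,13,9,
             14,21,19,7,18,19,12,11,22,25,16,13,11,13)

# Same calculation as sens.analysis.signedrank, restated here
# (hypothetical helper name) so the block is self-contained.
signedrank_bounds <- function(d, Gamma) {
  rk <- rank(abs(d))          # average ranks for ties
  nonzero <- d != 0           # zero differences carry no sign information
  W <- sum(rk[d > 0])         # observed signed rank statistic
  Eplus  <- sum(rk[nonzero]) * Gamma / (1 + Gamma)
  Eminus <- sum(rk[nonzero]) / (1 + Gamma)
  V <- sum(rk[nonzero]^2) * Gamma / (1 + Gamma)^2
  c(lowerbound = 1 - pnorm((W - Eminus) / sqrt(V)),
    upperbound = 1 - pnorm((W - Eplus) / sqrt(V)))
}

d <- exposed - control
Gammas <- c(1, 2, 3, 4, 4.25, 5)   # values used in the table
bounds <- t(sapply(Gammas, function(G) signedrank_bounds(d, G)))
sens_table <- data.frame(Gamma = Gammas, round(bounds, 4))
print(sens_table)
```

The `upperbound` column corresponds to the Maximum column of the table: the first value of Gamma at which it exceeds 0.05 marks the amount of hidden bias needed to explain away the association.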
II. Sensitivity Intervals
Consider the additive treatment effect model $r_{Ti} = r_{Ci} + \beta$.
A 95% confidence interval for $\beta$ in a randomized
experiment or study free of hidden bias is a random interval
that in at least 95% of such studies will contain the true $\beta$.
A 95% sensitivity interval for $\beta$ with sensitivity parameter
$\Gamma$ is a random interval that in at least 95% of studies will
contain the true $\beta$, assuming that the true $\gamma$, call it $\gamma_0$,
satisfies $\exp(\gamma_0) \le \Gamma$.
A sensitivity interval is an analogue of a confidence
interval when there might be hidden bias, but the
hidden bias is assumed to have some known
maximum magnitude. Notice that the sensitivity interval
does not make any assumptions about the unmeasured
covariate $u$.
Finding a $(1 - \alpha)$ sensitivity interval:
Recall that the way we test $H_0: \beta = \beta_0$ is to do a Wilcoxon
signed rank test on the adjusted responses $R - Z\beta_0$. For
fixed $\Gamma$ and $\beta$, let $T_\beta^{+}$ and $T_\beta^{-}$ be the bounding random variables
when the response is $R - Z\beta$ and let $T_\beta$ be the Wilcoxon
signed rank test statistic applied to the adjusted responses
$R - Z\beta$. Suppose $\alpha/2 \ge P(T_\beta^{+} \ge a_1)$ and
$\alpha/2 \ge P(T_\beta^{-} \le a_2)$. Then Proposition 1 from Notes 15
(Proposition 13 in the textbook) implies
$\alpha/2 \ge P(T_\beta \ge a_1)$ for all $u \in U = [0,1]^N$ and
$\alpha/2 \ge P(T_\beta \le a_2)$ for all $u \in U$. If $T_\beta \ge a_1$ or $T_\beta \le a_2$, then
$\beta$ is rejected at level $\alpha$ for all $u \in U$, so $\beta$ is excluded
from the sensitivity interval. If $a_2 < T_\beta < a_1$, then $\beta$
is included in the sensitivity interval. When $\gamma = 0$, the
sensitivity interval is just the confidence interval from a
study free of hidden bias, but as $\gamma$ increases, the sensitivity
interval becomes larger, reflecting uncertainty about the
impact of $u$.
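When using the normal approximation to the distribution of
$T_\beta^{+}$ and $T_\beta^{-}$, the endpoints of the 95% sensitivity interval
are:
$$\inf\left\{\beta : \frac{T_{\beta,\mathrm{obs}} - E(T_\beta^{+})}{\sqrt{\operatorname{Var}(T_\beta^{+})}} \le 1.96\right\} \quad\text{and}\quad \sup\left\{\beta : \frac{T_{\beta,\mathrm{obs}} - E(T_\beta^{-})}{\sqrt{\operatorname{Var}(T_\beta^{-})}} \ge -1.96\right\}.$$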
#### Function for computing the deviates
#### [T_beta - E(T_{beta}^+)]/SD(T_{beta}^+) and
#### [T_beta - E(T_{beta}^-)]/SD(T_{beta}^-)
#### that are involved in finding the sensitivity interval
sens.interval.deviates=function(beta,treated,control,Gamma){
  diff=(treated-beta)-control;  # differences after removing effect beta
  rk=rank(abs(diff));
  s1=1*(diff>0);
  s2=1*(diff<0);
  W=sum(s1*rk);                 # signed rank statistic T_beta
  Eplus=sum((s1+s2)*rk*Gamma)/(1+Gamma);
  Eminus=sum((s1+s2)*rk)/(1+Gamma);
  V=sum((s1+s2)*rk*rk*Gamma)/((1+Gamma)^2);
  Dplus=(W-Eplus)/sqrt(V);
  Dminus=(W-Eminus)/sqrt(V);
  list(Dplus=Dplus,Dminus=Dminus);
}
# Find sensitivity interval for lead exposure data
Gamma=5;
betagrid=seq(-60,60,.05);
Dplus.betagrid=rep(0,length(betagrid));
Dminus.betagrid=rep(0,length(betagrid));
for(i in 1:length(betagrid)){
  beta=betagrid[i];
  deviates=sens.interval.deviates(beta,leaddata.exposed,leaddata.control,Gamma);
  Dplus.betagrid[i]=deviates$Dplus;
  Dminus.betagrid[i]=deviates$Dminus;
}
lower.si=min(betagrid[Dplus.betagrid<=1.96]);
upper.si=max(betagrid[Dminus.betagrid>=-1.96]);
Sensitivity Intervals

Gamma    95% Sensitivity Interval
1        (9.55, 20.5)
2        (4.5, 27.5)
3        (1.05, 32.5)
4        (-1.0, 37.0)
5        (-3.0, 41.95)
If the study were free of hidden bias, the 95% sensitivity
interval (SI) would be (9.55, 20.5). (This is slightly different
from the confidence interval we got from wilcox.exact for a
study free of hidden bias because we are using a large sample
approximation here.) If $\Gamma = 2$, matched
children might differ by a factor of two in their odds of
exposure to lead due to differences in the unobserved
covariate. In this case, the 95% SI is longer, (4.5, 27.5),
though the smallest plausible effect 4.5 is still relatively
large: it is 28% of the median lead level 16 among
controls. For $\Gamma = 4$, slightly negative effects become just
plausible, though large positive effects are also plausible.
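The grid search can be wrapped in a function and repeated over several values of Gamma to tabulate the sensitivity intervals in one pass. The sketch below is self-contained and follows the same grid-search logic as the script in the notes. The names `si_deviates`, `sens_interval` and `si_table` are hypothetical, and the grid range and step are the ones used in the notes, not a general recommendation.

```r
# Lead data from the notes (33 matched pairs).
exposed <- c(38,23,41,18,37,36,23,62,31,34,24,14,21,17,16,20,15,10,
             45,39,22,35,49,48,44,35,43,39,34,13,73,25,27)
control <- c(16,18,18,24,19,11,10,15,16,18,18,13,19,10,16,16,24,13,9,
             14,21,19,7,18,19,12,11,22,25,16,13,11,13)

# Deviates of the signed rank statistic on the adjusted differences,
# standardized by the upper (Dplus) and lower (Dminus) bounding statistics.
si_deviates <- function(beta, treated, control, Gamma) {
  d <- (treated - beta) - control
  rk <- rank(abs(d))
  nonzero <- d != 0
  W <- sum(rk[d > 0])
  S <- sum(rk[nonzero])
  V <- sum(rk[nonzero]^2) * Gamma / (1 + Gamma)^2
  c(Dplus  = (W - S * Gamma / (1 + Gamma)) / sqrt(V),
    Dminus = (W - S / (1 + Gamma)) / sqrt(V))
}

# Grid search for the endpoints: smallest beta not rejected from above,
# largest beta not rejected from below.
sens_interval <- function(treated, control, Gamma,
                          betagrid = seq(-60, 60, 0.05), z = 1.96) {
  dev <- sapply(betagrid, si_deviates, treated = treated,
                control = control, Gamma = Gamma)
  c(lower = min(betagrid[dev["Dplus", ] <= z]),
    upper = max(betagrid[dev["Dminus", ] >= -z]))
}

Gammas <- 1:5
si_table <- t(sapply(Gammas, function(G) sens_interval(exposed, control, G)))
si_table <- data.frame(Gamma = Gammas, si_table)
print(si_table)
```

Because the endpoints are read off a grid with step 0.05, they are only accurate to that resolution. A root finder could refine them, but the deviates are step functions of beta, so a grid is the simpler choice here.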