Stat 921 Notes 16

Reading: Chapter 4.3-4.5

I. Sensitivity Analysis for the Signed Rank Test

Morton et al. (1982, American Journal of Epidemiology) studied lead in the blood of children whose parents worked in a factory where lead was used in making batteries. They were concerned that children were exposed to lead inadvertently brought home by their parents. Their study included 33 such children from different families; these are the exposed or treated children. The outcome Y was the level of lead found in a child's blood in μg/dl of whole blood. The covariate x was two-dimensional, recording age and neighborhood of residence. They matched each exposed child to one control child of the same age and neighborhood whose parents were employed in other industries not using lead.

If this study were free of hidden bias, we would be justified in analyzing the data using methods for a randomized experiment with 33 matched pairs:

    library(exactRankTests);  # provides wilcox.exact
    leaddata.exposed=c(38,23,41,18,37,36,23,62,31,34,24,14,21,17,16,20,15,10,
                       45,39,22,35,49,48,44,35,43,39,34,13,73,25,27)
    leaddata.control=c(16,18,18,24,19,11,10,15,16,18,18,13,19,10,16,16,24,13,9,
                       14,21,19,7,18,19,12,11,22,25,16,13,11,13)
    wilcox.exact(leaddata.exposed,leaddata.control,paired=TRUE,conf.int=TRUE);

            Exact Wilcoxon signed rank test

    data:  leaddata.exposed and leaddata.control
    V = 499, p-value = 8.037e-07
    alternative hypothesis: true mu is not equal to 0
    95 percent confidence interval:
      9.5 21.5
    sample estimates:
    (pseudo)median
             15.25

Assuming the study is free of hidden bias, there is strong evidence that lead exposure of the parents affects the lead in the blood of their children; a 95% confidence interval for the effect of parental lead exposure on the child's blood lead level is (9.5, 21.5).

When there is the potential for hidden bias, we can consider a sensitivity analysis based on formula (1.4) from Notes 15, assuming a large sample normal approximation.
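For reference, the quantities computed by the function that follows can be written out as below. This is a sketch read off from the code itself rather than quoted from Notes 15, so whether it matches formula (1.4) term for term is an assumption. The symbol $d_i$ (the rank of $|Y_{\text{exposed},i} - Y_{\text{control},i}|$ for pair $i$, with zero differences contributing nothing) and the symbol $T$ for the observed signed rank statistic are notation introduced here for illustration.

```latex
\begin{align*}
E(T^+) &= \frac{\Gamma}{1+\Gamma}\sum_i d_i, &
E(T^-) &= \frac{1}{1+\Gamma}\sum_i d_i, \\
\operatorname{Var}(T^+) &= \operatorname{Var}(T^-)
        = \frac{\Gamma}{(1+\Gamma)^2}\sum_i d_i^2, \\
1-\Phi\!\left(\frac{T-E(T^-)}{\sqrt{\operatorname{Var}(T^-)}}\right)
  &\le \text{one-sided } p\text{-value} \le
1-\Phi\!\left(\frac{T-E(T^+)}{\sqrt{\operatorname{Var}(T^+)}}\right).
\end{align*}
```

At $\Gamma = 1$ the two expectations coincide and the bounds collapse to the usual large-sample signed rank p-value.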
    #### Sensitivity Analysis for Signed Rank Statistic
    #### Assumes that a one-sided test of whether the treatment has
    #### a positive effect is being done.
    #### diff is the difference between the treated and control units
    #### Gamma=exp(gamma) is the sensitivity parameter
    sens.analysis.signedrank=function(diff,Gamma){
      rk=rank(abs(diff));      # ranks of the absolute differences
      s1=1*(diff>0);           # indicator of a positive difference
      s2=1*(diff<0);           # indicator of a negative difference
      W=sum(s1*rk);            # observed signed rank statistic
      # Expectations and common variance of the bounding variables T+ and T-
      # (pairs with a zero difference have s1+s2=0 and drop out)
      Eplus=sum((s1+s2)*rk*Gamma)/(1+Gamma);
      Eminus=sum((s1+s2)*rk)/(1+Gamma);
      V=sum((s1+s2)*rk*rk*Gamma)/((1+Gamma)^2);
      Dplus=(W-Eplus)/sqrt(V);
      Dminus=(W-Eminus)/sqrt(V);
      list(lowerbound=1-pnorm(Dminus),upperbound=1-pnorm(Dplus));
    }

    diff=leaddata.exposed-leaddata.control;
    sens.analysis.signedrank(diff,2);  # e.g., the Gamma = 2 row of the table

Sensitivity Analysis for Lead in the Blood of Children: Range of Significance Levels for the Signed Rank Statistic

    Γ       Minimum    Maximum
    1       <0.0001    <0.0001
    2       <0.0001    0.0018
    3       <0.0001    0.0136
    4       <0.0001    0.0388
    4.25    <0.0001    0.0468
    5       <0.0001    0.0740

The table shows that to explain away the observed association between parental exposure to lead and the child's lead level, a hidden bias or unobserved covariate would need to increase the odds of exposure by more than a factor of $\Gamma = 4.25$: up to that value, the maximum one-sided significance level stays below 0.05. The association cannot be attributed to small hidden biases, but it is somewhat more sensitive to bias than Hammond's study of heavy smokers.

II. Sensitivity Intervals

Consider the additive treatment effect model $r_{Ti} = r_{Ci} + \beta$.

A 95% confidence interval for $\beta$ in a randomized experiment or study free of hidden bias is a random interval that in at least 95% of such studies will contain the true $\beta$.

A 95% sensitivity interval for $\beta$ with sensitivity parameter $\Gamma$ is a random interval that in at least 95% of studies will contain the true $\beta$, assuming that the true $\gamma$, call it $\gamma_0$, satisfies $\exp(\gamma_0) \le \Gamma$.

A sensitivity interval is an analogue of a confidence interval when there might be hidden bias, but the magnitude of hidden bias is assumed to have some known maximum. Notice that the sensitivity interval does not make any assumptions about the unmeasured covariate $u$.
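Returning briefly to the table of significance levels in Section I: the rows can be tabulated in one pass by looping over the values of $\Gamma$. The following is a minimal sketch, not the notes' own script. It restates the same normal-approximation computation so that it runs on its own, and the names `sens.pvalue.range` and `sens.table` are hypothetical helpers introduced here.

```r
# Lead data as given in the notes (33 matched pairs)
leaddata.exposed <- c(38,23,41,18,37,36,23,62,31,34,24,14,21,17,16,20,15,10,
                      45,39,22,35,49,48,44,35,43,39,34,13,73,25,27)
leaddata.control <- c(16,18,18,24,19,11,10,15,16,18,18,13,19,10,16,16,24,13,9,
                      14,21,19,7,18,19,12,11,22,25,16,13,11,13)

# Hypothetical helper: bounds on the one-sided p-value for a given Gamma,
# using the same large-sample formulas as sens.analysis.signedrank.
sens.pvalue.range <- function(d, Gamma) {
  rk <- rank(abs(d))            # ranks of absolute differences (zeros included)
  nz <- d != 0                  # zero differences drop out of the moments
  W  <- sum(rk[d > 0])          # observed signed rank statistic
  Eplus  <- Gamma * sum(rk[nz]) / (1 + Gamma)
  Eminus <- sum(rk[nz]) / (1 + Gamma)
  V      <- Gamma * sum(rk[nz]^2) / (1 + Gamma)^2
  c(minimum = 1 - pnorm((W - Eminus) / sqrt(V)),
    maximum = 1 - pnorm((W - Eplus) / sqrt(V)))
}

d <- leaddata.exposed - leaddata.control
Gammas <- c(1, 2, 3, 4, 4.25, 5)

# One row per Gamma: columns Gamma, minimum, maximum
sens.table <- t(sapply(Gammas, function(G) sens.pvalue.range(d, G)))
sens.table <- data.frame(Gamma = Gammas, sens.table)
print(round(sens.table, 4))
```

Scanning the `maximum` column for the first value above 0.05 gives the $\Gamma$ at which the conclusion becomes sensitive to hidden bias.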
Finding a $(1-\alpha)$ sensitivity interval: Recall that the way we test $H_0: \beta = \beta_0$ is to do a Wilcoxon signed rank test on the adjusted responses $R - \beta_0 Z$. For a fixed $\beta$, let $T_\beta^+$ and $T_\beta^-$ be the bounding random variables when the response is $R - \beta Z$, and let $T_\beta$ be the Wilcoxon signed rank test statistic applied to the adjusted responses $R - \beta Z$.

Suppose $\alpha/2 = P(T_\beta^+ \ge a_1)$ and $\alpha/2 = P(T_\beta^- \le a_2)$. Then Proposition 1 from Notes 15 (Proposition 13 in the textbook) implies $\alpha/2 \ge P(T_\beta \ge a_1)$ for all $u \in U = [0,1]^N$ and $\alpha/2 \ge P(T_\beta \le a_2)$ for all $u \in U$. If $T_\beta \ge a_1$ or $T_\beta \le a_2$, then $\beta$ is rejected at level $\alpha$ for all $u \in U$, so $\beta$ is excluded from the sensitivity interval. If $a_2 < T_\beta < a_1$, then $\beta$ is included in the sensitivity interval.

When $\gamma = 0$ (that is, $\Gamma = 1$), the sensitivity interval is just the confidence interval from a study free of hidden bias, but as $\Gamma$ increases, the sensitivity interval becomes larger, reflecting uncertainty about the impact of $u$.

When using the normal approximation to the distribution of $T_\beta^+$ and $T_\beta^-$, the endpoints of the 95% sensitivity interval are

$$\inf\left\{\beta : \frac{T_{\beta,\mathrm{obs}} - E(T_\beta^+)}{\sqrt{\mathrm{Var}(T_\beta^+)}} \le 1.96\right\} \quad\text{and}\quad \sup\left\{\beta : \frac{T_{\beta,\mathrm{obs}} - E(T_\beta^-)}{\sqrt{\mathrm{Var}(T_\beta^-)}} \ge -1.96\right\}.$$

    #### Function for computing the deviates
    #### [T_beta - E(T_{beta}^+)]/SD(T_{beta}^+) and
    #### [T_beta - E(T_{beta}^-)]/SD(T_{beta}^-)
    #### that are involved in finding the sensitivity interval
    sens.interval.deviates=function(beta,treated,control,Gamma){
      diff=(treated-beta)-control;   # adjusted pair differences under effect beta
      rk=rank(abs(diff));
      s1=1*(diff>0);
      s2=1*(diff<0);
      W=sum(s1*rk);                  # signed rank statistic on adjusted responses
      Eplus=sum((s1+s2)*rk*Gamma)/(1+Gamma);
      Eminus=sum((s1+s2)*rk)/(1+Gamma);
      V=sum((s1+s2)*rk*rk*Gamma)/((1+Gamma)^2);
      Dplus=(W-Eplus)/sqrt(V);
      Dminus=(W-Eminus)/sqrt(V);
      list(Dplus=Dplus,Dminus=Dminus);
    }

    # Find sensitivity interval for lead exposure data
    Gamma=5;
    betagrid=seq(-60,60,.05);        # candidate values of beta
    Dplus.betagrid=rep(0,length(betagrid));
    Dminus.betagrid=rep(0,length(betagrid));
    for(i in 1:length(betagrid)){
      beta=betagrid[i];
      deviates=sens.interval.deviates(beta,leaddata.exposed,leaddata.control,Gamma);
      Dplus.betagrid[i]=deviates$Dplus;
      Dminus.betagrid[i]=deviates$Dminus;
    }
    # Smallest and largest grid values of beta that are not rejected
    lower.si=min(betagrid[Dplus.betagrid<=1.96]);
    upper.si=max(betagrid[Dminus.betagrid>=-1.96]);

Sensitivity Intervals

    Γ    95% Sensitivity Interval
    1    (9.55, 20.5)
    2    (4.5, 27.5)
    3    (1.05, 32.5)
    4    (-1.0, 37.0)
    5    (-3.0, 41.95)

If the study were free of hidden bias, the 95% sensitivity interval (SI) would be (9.55, 20.5). (This is slightly different from the confidence interval we got from wilcox.exact for a study free of hidden bias because we are using a large sample approximation here.) If $\Gamma = 2$, matched children might differ by a factor of two in their odds of exposure to lead due to differences in the unobserved covariate. In this case, the 95% SI is longer, (4.5, 27.5), though the smallest plausible effect, 4.5, is still relatively large: it is 28% of the median lead level, 16, among controls. For $\Gamma = 4$, slightly negative effects become just plausible, though large positive effects are also plausible.
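The grid search can be wrapped in a function so that the whole table of sensitivity intervals is produced in one pass over $\Gamma$. The following is a minimal sketch under the same normal approximation and the same grid of candidate values, not the notes' own script. The names `sens.interval` and `si.table`, and the arguments `betagrid` and `z`, are hypothetical and introduced here for illustration. Endpoints are only resolved to the grid spacing (0.05), so they can differ from the tabulated values in the last digit.

```r
# Lead data as given in the notes (33 matched pairs)
leaddata.exposed <- c(38,23,41,18,37,36,23,62,31,34,24,14,21,17,16,20,15,10,
                      45,39,22,35,49,48,44,35,43,39,34,13,73,25,27)
leaddata.control <- c(16,18,18,24,19,11,10,15,16,18,18,13,19,10,16,16,24,13,9,
                      14,21,19,7,18,19,12,11,22,25,16,13,11,13)

# Hypothetical wrapper: grid search for the sensitivity interval at one Gamma.
sens.interval <- function(treated, control, Gamma,
                          betagrid = seq(-60, 60, 0.05), z = 1.96) {
  # For each candidate beta, compute the two standardized deviates
  dev <- sapply(betagrid, function(beta) {
    d  <- (treated - beta) - control   # adjusted pair differences
    rk <- rank(abs(d))
    nz <- d != 0                       # zero differences drop out
    W  <- sum(rk[d > 0])
    S  <- sum(rk[nz])
    S2 <- sum(rk[nz]^2)
    sdT <- sqrt(Gamma * S2) / (1 + Gamma)   # SD shared by T+ and T-
    c(Dplus  = (W - Gamma * S / (1 + Gamma)) / sdT,
      Dminus = (W - S / (1 + Gamma)) / sdT)
  })
  # Smallest and largest grid values of beta that are not rejected
  c(lower = min(betagrid[dev["Dplus", ] <= z]),
    upper = max(betagrid[dev["Dminus", ] >= -z]))
}

Gammas <- 1:5
si.table <- t(sapply(Gammas, function(G)
  sens.interval(leaddata.exposed, leaddata.control, G)))
si.table <- data.frame(Gamma = Gammas, si.table)
print(si.table)
```

A root finder could replace the grid, but the deviates are step functions of beta, so a grid scan as in the notes is the simpler and more transparent choice.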