Stat 921 Notes 17

Reading: Chapters 5.1-5.3

I. Further Notes on Sensitivity Analysis

Gastwirth, Krieger and Rosenbaum (2000), “Asymptotic separability in sensitivity analysis,” Journal of the Royal Statistical Society, Series B, 62, 545-555, develop sensitivity analysis for a full matching. This would be a good topic for a final project.

II. Dilated Treatment Effect Model

Effects that Vary from Person to Person: The effect of a treatment may vary from one person to the next. One person may benefit or suffer greatly from treatment while another person may experience little or no effect. In other words, the effect of the treatment on the $i$th person in stratum $s$, $r_{Tsi} - r_{Csi}$, may not be constant but may change with $i$ and $s$.

Example: Thun (1993) reported data from an observational study concerning 20 highly exposed workers at a cadmium production plant in Denver and 29 workers at a nearby hospital where cadmium exposure was unlikely. The outcome is a measure of kidney dysfunction, namely the level of a protein, $\beta_2$-microglobulin, found in urine. We first analyze this observational study under the assumption that it is free of hidden bias, i.e., that it can be analyzed as a randomized experiment, and then consider sensitivity analysis in the next notes.

hospital=c(116.5,209.6,83.2,134.1,564.6,81.4,120,173.1,110.4,135.46,199.1,
113.7,305,256.8,250,159.3,311.4,255.7,225.5,177.5,253.8,95.8,213.3,375.9,
142,246.6,337.5,242.2,221.8);
cadmium=c(200.8,2803,891.7,10208,2302,122,97.5,328.1,700,488,67632,24288,
211,512.5,1144,389.2,172.8,18836,33679,107143);
boxplot(hospital,cadmium,names=c("Hospital","Cadmium"),ylab="Beta-2 microglobulin");

The additive treatment effect model does not appear reasonable: the cadmium outcomes are both larger and more dispersed than the hospital outcomes.

Multiplicative Treatment Effect Model: $r_{Ti} = \delta\, r_{Ci}$. For $\delta > 1$ and positive outcomes, treated outcomes will be larger and more dispersed than control outcomes.
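As a small illustration of why a multiplicative effect makes treated outcomes both larger and more dispersed, the sketch below multiplies a handful of control outcomes by a constant. The values in `r_C` and the multiplier `delta` are hypothetical and are not taken from the study.

```r
# Hypothetical control outcomes and multiplier (illustrative values, not study data)
r_C <- c(80, 120, 150, 210, 300, 560)
delta <- 3
r_T <- delta * r_C   # multiplicative effect: each treated outcome is delta times the control outcome

# Compare location (median) and spread (interquartile range) under treatment and control
location_ratio <- median(r_T) / median(r_C)
spread_ratio <- IQR(r_T) / IQR(r_C)
print(c(location_ratio = location_ratio, spread_ratio = spread_ratio))
```

Both ratios equal the multiplier, so the center and the spread of the outcomes are scaled by the same factor.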
The multiplicative treatment effect model is an additive treatment effect model on the log scale: $\log(r_{Ti}) = \log(r_{Ci}) + \log\delta$, where $\delta$ is the multiplicative effect.

boxplot(log(hospital),log(cadmium),names=c("Hospital","Cadmium"),ylab="Log(Beta-2 microglobulin)");

The logs of the cadmium outcomes are still larger and more dispersed than the logs of the hospital outcomes, so a multiplicative treatment effect model does not appear to hold.

Dilated Treatment Effect Model (Chapter 5.3): A model for treated outcomes that are both higher and more dispersed than control outcomes. The treatment has a dilated effect if
$$r_{Ti} = r_{Ci} + \Delta(r_{Ci}), \quad i = 1, \ldots, N,$$
for some nonnegative, nondecreasing function $\Delta(\cdot)$. With a dilated effect, the effect of the treatment is nonnegative and is larger, or at least no smaller, when the response that would have been observed under control, $r_{Ci}$, is higher.

Examples of dilated effects:
1. No effect ($\tau = 0$ in the additive treatment effect model) and positive effects in the additive treatment effect model ($\tau > 0$). Here, $\Delta(r_{Ci}) = \tau$.
2. Multiplicative treatment effect $r_{Ti} = \delta\, r_{Ci}$ with $\delta \ge 1$ and nonnegative outcomes. Here, $\Delta(r_{Ci}) = (\delta - 1)\, r_{Ci}$.
3. Linear effect $r_{Ti} = \tau + \delta\, r_{Ci}$ with $\tau \ge 0$, $\delta \ge 1$ and nonnegative outcomes. Here, $\Delta(r_{Ci}) = \tau + (\delta - 1)\, r_{Ci}$.
4. Suppose $\Delta(r_{Ci}) = 0$ for $r_{Ci} \le r$ and $\Delta(r_{Ci}) > 0$ for $r_{Ci} > r$, for some threshold $r$, with $\Delta(\cdot)$ nondecreasing; then individuals who would exhibit low responses under the control, $r_{Ci} \le r$, are not susceptible to the treatment, but the treatment does affect other individuals with $r_{Ci} > r$.

Dispersion under dilated treatment effects: When the treatment effect is dilated, the potential outcomes under treatment are not only higher than the potential outcomes under control, but also more dispersed. A common way to measure dispersion is by the difference between two order statistics, such as the range, which is the difference between the maximum and the minimum, or the interquartile range, which is the difference between the upper and lower quartiles. Let $r_{T(m)}$ and $r_{C(m)}$ denote the $m$th order statistics of the potential outcomes under treatment and under control, respectively.
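The four examples of dilated effects and the order-statistic measures of dispersion can be tried out numerically. The sketch below is only an illustration: the control potential outcomes in `r_C`, the parameter values, and the threshold function `Delta_thresh` are hypothetical choices, not quantities from the cadmium study. For each dilation function it checks that treated and control potential outcomes are ordered the same way and that no gap between adjacent order statistics shrinks under treatment.

```r
# Hypothetical control potential outcomes (illustrative values only)
r_C <- c(1, 3, 4, 8, 15, 40)

# Hypothetical dilation functions, one per example; all are nonnegative and
# nondecreasing on these nonnegative outcomes
Delta_additive <- function(r, tau = 2) rep(tau, length(r))          # example 1
Delta_mult     <- function(r, delta = 3) (delta - 1) * r            # example 2
Delta_linear   <- function(r, tau = 2, delta = 3) tau + (delta - 1) * r  # example 3
Delta_thresh   <- function(r, cutoff = 5) pmax(r - cutoff, 0)       # example 4

check_dilated <- function(Delta, r_C) {
  r_T <- r_C + Delta(r_C)   # treated potential outcomes under a dilated effect
  c(same_order      = as.numeric(identical(order(r_T), order(r_C))),
    gaps_no_smaller = as.numeric(all(diff(sort(r_T)) >= diff(sort(r_C)))),
    range_C         = diff(range(r_C)),
    range_T         = diff(range(r_T)))
}

results <- sapply(list(additive = Delta_additive, multiplicative = Delta_mult,
                       linear = Delta_linear, threshold = Delta_thresh),
                  check_dilated, r_C = r_C)
print(results)
```

In the additive case the range is unchanged, which is the boundary case of "at least as dispersed"; the other three functions widen it.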
Note that because larger control potential outcomes, $r_{Ci}$, entail larger treated potential outcomes, $r_{Ti} = r_{Ci} + \Delta(r_{Ci})$, when the effect is dilated, it follows that the $r_{Ci}$'s and $r_{Ti}$'s are ordered in the same way, so that $r_{T(m)} = r_{C(m)} + \Delta(r_{C(m)})$. When the effect is dilated, the order statistics of the potential outcomes under treatment are farther apart than the order statistics of the potential outcomes under control; that is, for every $k > j$, we have
$$r_{T(k)} - r_{T(j)} = [r_{C(k)} + \Delta(r_{C(k)})] - [r_{C(j)} + \Delta(r_{C(j)})] \ge r_{C(k)} - r_{C(j)}$$
because $\Delta(r_{C(k)}) \ge \Delta(r_{C(j)})$.

Fix a $k$ and let $\rho = r_{C(k)}$ be the $k/N$ quantile of the potential outcomes under control, and consider drawing inferences about the effect of the treatment at this quantile, $\Delta(\rho)$.

Inferences about dilated effects: The logic of inference about an additive effect requires some changes before it can be applied to a dilated effect. Write $R_i$ for the observed outcome and $Z_i$ for the treatment indicator of subject $i$. Under the additive effect model, $r_{Ti} = r_{Ci} + \tau$, and under the null hypothesis $H_0: \tau = \tau_0$, the adjusted outcome $R_i - Z_i \tau_0$ equals the potential outcome under control, $r_{Ci} = R_i - Z_i \tau_0$, and also $r_{Ti} = r_{Ci} + \tau_0$; we can use these values of $r_{Ci}$, $r_{Ti}$ to find the null hypothesis distribution of any test statistic. However, under the dilated effect model, the adjusted outcome $R_i - Z_i \Delta(\rho)$ does not equal $r_{Ci}$ and continues to depend on the treatment assignment through
$$R_i - Z_i \Delta(\rho) = r_{Ci} + Z_i \{\Delta(r_{Ci}) - \Delta(\rho)\}.$$
Although the magnitudes of the adjusted outcomes are not equal to the magnitudes of the outcomes under control, there is a sense in which they have the correct sign. More precisely, the adjusted outcome $R_i - Z_i \Delta(\rho)$ is above $\rho$ just when $r_{Ci}$ is above $\rho$. This provides a basis for exact randomization inference for $\Delta(\rho)$. Write $\mathrm{sign}(a) = 1$ if $a > 0$, $\mathrm{sign}(a) = 0$ if $a = 0$ and $\mathrm{sign}(a) = -1$ if $a < 0$.

Proposition 1: If the treatment has a dilated effect, then for $i = 1, \ldots, N$,
$$\mathrm{sign}(R_i - Z_i \Delta(\rho) - \rho) = \mathrm{sign}(r_{Ci} - \rho).$$

Proof: If $Z_i = 0$, then $R_i - Z_i \Delta(\rho) = r_{Ci}$ and there is nothing to prove, so suppose $Z_i = 1$, in which case $R_i = r_{Ti}$. Recall that $\Delta(\cdot)$ is nonnegative and nondecreasing. It follows that if $r_{Ci} > \rho$, then $r_{Ti} = r_{Ci} + \Delta(r_{Ci}) > \rho + \Delta(\rho)$, so that $R_i - Z_i \Delta(\rho) > \rho$.
Similarly, if $r_{Ci} < \rho$, then $r_{Ti} = r_{Ci} + \Delta(r_{Ci}) < \rho + \Delta(\rho)$, so that $R_i - Z_i \Delta(\rho) < \rho$. Finally, if $r_{Ci} = \rho$, then $r_{Ti} = r_{Ci} + \Delta(r_{Ci}) = \rho + \Delta(\rho)$, so that $R_i - Z_i \Delta(\rho) = \rho$.

Testing hypotheses about $\Delta(\rho)$: Under the model of a dilated effect, for fixed $k$, consider testing the null hypothesis $H_0: \Delta(\rho) = \Delta_0$. Calculate the adjusted outcomes $A_i = R_i - Z_i \Delta_0$ and let $A_{(N)} \ge A_{(N-1)} \ge \cdots \ge A_{(1)}$ be their order statistics. If the null hypothesis is true, then Proposition 1 implies $A_{(k)} = \rho$. Let $q_i = 1$ if $A_i \ge A_{(k)}$ and $q_i = 0$ if $A_i < A_{(k)}$. Again by Proposition 1, if the null hypothesis is true, $q_i = 1$ if $r_{Ci} \ge \rho$ and $q_i = 0$ if $r_{Ci} < \rho$. Let $q_+ = \sum_{i=1}^{N} q_i$.

Consider the test statistic $T(\Delta_0) = \sum_{i=1}^{N} Z_i q_i$. Under the null hypothesis $\Delta(\rho) = \Delta_0$, the test statistic $T(\Delta_0)$ is the number of treated subjects whose outcomes under control, $r_{Ci}$, would have equalled or exceeded $r_{C(k)}$. Under the null hypothesis, $T(\Delta_0)$ has a hypergeometric distribution,
$$P\{T(\Delta_0) = j\} = \frac{\binom{q_+}{j}\binom{N - q_+}{m - j}}{\binom{N}{m}}, \quad j = 0, \ldots, m, \qquad (1.1)$$
where $m$ is the number of treated subjects.

For the one-sided test of $H_0: \Delta(\rho) = \Delta_0$ vs. $H_a: \Delta(\rho) > \Delta_0$, we reject for large values of $T(\Delta_0)$, and for the one-sided test of $H_0: \Delta(\rho) = \Delta_0$ vs. $H_a: \Delta(\rho) < \Delta_0$, we reject for small values of $T(\Delta_0)$.

The test statistic $T(\Delta_0)$ is a monotone decreasing function of $\Delta_0$. A confidence interval for $\Delta(\rho)$ can be formed by inverting the hypothesis test. A point estimate for $\Delta(\rho)$ can be found using the Hodges-Lehmann estimate. From (1.1), the expectation of $T(\Delta_0)$ when the null hypothesis holds is
$$E_0 = E\{T(\Delta_0) \mid \Delta_0\} = \frac{m\, q_+}{N}.$$
Since $T(\Delta_0)$ is a monotone decreasing function of $\Delta_0$, the Hodges-Lehmann estimate is
$$\hat{\Delta}(\rho)_{HL} = \frac{\inf\{\Delta_0 : E_0 > T(\Delta_0)\} + \sup\{\Delta_0 : E_0 < T(\Delta_0)\}}{2}.$$

R code:

# Test of dilated treatment effect
# See Notes 5
dilated.treateffect.test.func=function(Delta0,treated,control,k,alternative="higher",returntype="pval"){
  # Create vectors for Ri and Zi, and find total number in experiment and
  # number of treated subjects
  Ri=c(treated,control);
  Zi=c(rep(1,length(treated)),rep(0,length(control)));
  N=length(Ri);
  m=length(treated);
  # Calculate adjusted responses and rho=r_{C(k)}
  A=Ri-Zi*Delta0;
  sorted.A=sort(A);
  rho=sorted.A[k];
  # q=1 if adjusted response>=rho, 0 otherwise
  q=(A>=rho);
  qpos=sum(q);
  # Test statistic = # of units assigned to treatment with q=1
  teststat.obs=sum(q*Zi);
  # For returning the p-value,
  # p-value computed using hypergeometric distribution, see Notes 5
  if(returntype=="pval" && alternative=="lower"){
    returnval=phyper(teststat.obs,qpos,N-qpos,m);
  }
  if(returntype=="pval" && alternative=="higher"){
    returnval=1-phyper(teststat.obs-1,qpos,N-qpos,m);
  }
  # For returning the test statistic minus its expected value
  if(returntype=="teststat.minusev"){
    returnval=teststat.obs-m*qpos/N;
  }
  returnval;
}

# Search for endpoints of lower and upper .025 confidence intervals;
# the search interval c(-10000,10000) suits k=25 and must be widened for k=38,
# where the limits reported below exceed 10000
k=25;
pval.Delta0.func=function(Delta0,treated,control,k,alternative){
  dilated.treateffect.test.func(Delta0,treated,control,k,alternative)-.025;
}
upper.ci.limit=uniroot(pval.Delta0.func,c(-10000,10000),treated=cadmium,control=hospital,k=k,alternative="lower")$root;
lower.ci.limit=uniroot(pval.Delta0.func,c(-10000,10000),treated=cadmium,control=hospital,k=k,alternative="higher")$root;

# Find Hodges-Lehmann estimate: the value of Delta0 at which the test statistic
# minus its null expectation changes sign
hlest=uniroot(dilated.treateffect.test.func,c(-10000,10000),treated=cadmium,control=hospital,k=k,returntype="teststat.minusev")$root;

Inferences for dilated treatment effect in study of effects of cadmium exposure

k                                12 (lower quartile)   25 (median)       38 (upper quartile)
Hodges-Lehmann estimate          91.0                  490.4             18582.1
95% CI for $\Delta(\rho)$        [-20.0, 377.6]        [85.9, 2643.7]    [832.6, 67389.8]

The treatment effect of cadmium exposure for subjects at the upper quartile of the distribution of potential outcomes under control is estimated to be much larger than for subjects at the median, about 38 times larger (18582.1 vs. 490.4). The treatment effect for subjects at the lower quartile is estimated to be about five times smaller than at the median (91.0 vs. 490.4), and since the 95% confidence interval at the lower quartile, [-20.0, 377.6], includes zero, it is even plausible that there is no effect for subjects at the lower quartile.
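The reasoning behind these inferences can be checked on simulated data in which the control potential outcomes are known, which is never the case in a real study. The sketch below is an illustration under stated assumptions: the outcomes in `r_C`, the dilation function `Delta`, and the seed are all hypothetical, with only the sample sizes borrowed from the cadmium example. It verifies the sign identity of Proposition 1 at the true $\Delta(\rho)$, confirms that formula (1.1) agrees with R's built-in hypergeometric functions, and recovers the null expectation $m q_+ / N$.

```r
set.seed(921)                      # arbitrary seed, for reproducibility only
N <- 49; m <- 20; k <- 25          # sizes borrowed from the cadmium example

# Hypothetical control potential outcomes: distinct integers, so arithmetic is exact
r_C <- sample(50:600, N)
# Hypothetical dilation function (nonnegative and nondecreasing on these values)
Delta <- function(r) 2 * r

Z <- sample(rep(c(1, 0), c(m, N - m)))   # randomly assign m of the N units to treatment
R <- r_C + Z * Delta(r_C)                # observed outcomes under the dilated effect

rho <- sort(r_C)[k]                      # rho = k-th order statistic of control outcomes
A <- R - Z * Delta(rho)                  # adjusted outcomes at the true Delta(rho)

# Proposition 1: adjusted outcomes fall on the same side of rho as r_C does
prop1_holds <- all(sign(A - rho) == sign(r_C - rho))
# Consequence used by the test: the k-th order statistic of A equals rho
ak_equals_rho <- sort(A)[k] == rho

# Test statistic: number of treated units with adjusted outcome at or above A_(k)
q <- as.numeric(A >= sort(A)[k])
q_plus <- sum(q)
T_obs <- sum(Z * q)

# Null distribution (1.1) written out with binomial coefficients, and via dhyper
j <- 0:m
pmf_formula <- choose(q_plus, j) * choose(N - q_plus, m - j) / choose(N, m)
pmf_dhyper <- dhyper(j, q_plus, N - q_plus, m)

# Upper-tail p-value computed two ways, and the null expectation of T
p_upper <- sum(pmf_formula[j >= T_obs])
p_upper_phyper <- 1 - phyper(T_obs - 1, q_plus, N - q_plus, m)
E0 <- sum(j * pmf_formula)

print(c(prop1 = prop1_holds, T_obs = T_obs, p_upper = p_upper, E0 = E0))
```

Integer-valued outcomes are used so that the equality case of Proposition 1 is not disturbed by floating-point rounding; with real-valued data a tolerance would be needed for that case.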