# Notes 4 - Wharton Statistics Department ```Stat 921 Notes 4
Observational Studies, Chapters 2.5-2.7
I. Testing General Hypotheses (Section 2.6.1)
So far, we have considered testing the null hypothesis of no
treatment effect. This extends to testing any hypothesized
treatment effect. Let $\tau = r_T - r_C$ be the effect of treatment.
Suppose we want to test $H_0: \tau = \tau_0$ (no treatment effect
corresponds to $\tau_0 = 0$). Under $H_0$, $r_{Csi} = R_{si} - \tau_0 Z_{si}$.
Thus, we can compute $r_C$ under $H_0$ from the observed responses $R$
and the observed treatment assignment $Z$; call this value of $r_C$
under $H_0$, $r_C(\tau_0) = R - \tau_0 Z$. Our test statistic is
$t(Z, r_C(\tau_0))$ and we reject for large values of the test statistic.
The null distribution of the test statistic can be computed because,
under $H_0$, $r_C = r_C(\tau_0)$.
Example:
Suppose we want to test the additive treatment effect hypothesis
$H_0: r_T - r_C = 1$.

Unit   Group   Response   Adjusted response   Rank
i      Z_i     R_i        R_i - 1*Z_i         q_i
1      1       9          8                   7
2      0       1          1                   1
3      0       3          3                   2
4      0       4          4                   3
5      1       7          6                   5
6      1       11         10                  8
7      1       8          7                   6
8      0       5          5                   4

Let $t(Z, r_C(\tau_0))$ be the Wilcoxon rank sum statistic, i.e., the
sum of the ranks of the adjusted responses in the treated group.
Then, the observed test statistic is $T = 7 + 5 + 8 + 6 = 26$. This is
the largest possible rank sum for $N = 8$ and the p-value is
$1 / \binom{8}{4} = 1/70 \approx 0.014$.
For the creativity study, suppose we want to test the hypothesis
that the intrinsic motivation increases scores by 2.
intrinsic=c(12,12,12.9,13.6,16.6,17.2,17.5,18.2,19.1,19.3,19.8,20.3,20.5,20.6,21.3,
21.6,22.1,22.2,22.6,23.1,24,24.3,26.7,29.7);
extrinsic=c(5,5.4,6.1,10.9,11.8,12,12.3,14.8,15,16.8,17.2,17.2,17.4,17.5,18.5,18.7,
18.7,19.2,19.5,20.7,21.2,22.1,24);
wilcox.test(intrinsic-2,extrinsic,exact=TRUE,alternative="greater");
Wilcoxon rank sum test with continuity correction
data: intrinsic - 2 and extrinsic
W = 330, p-value = 0.1274
alternative hypothesis: true location shift is greater than 0
Warning message:
In wilcox.test.default(intrinsic - 2, extrinsic, exact = TRUE, alternative = "greater") :
cannot compute exact p-value with ties
There is no strong evidence against the hypothesis that the
intrinsic treatment has an additive effect of 2.
For a two-sided test,
wilcox.test(intrinsic-2,extrinsic,exact=TRUE);
Wilcoxon rank sum test with continuity correction
data: intrinsic - 2 and extrinsic
W = 330, p-value = 0.2548
alternative hypothesis: true location shift is not equal to 0
Note that when there are ties, wilcox.test does not compute exact
p-values and instead uses the normal approximation.
To obtain exact p-values, we can install and load the exactRankTests
package.
library(exactRankTests);
wilcox.exact(intrinsic-2,extrinsic,alternative="greater");
Exact Wilcoxon rank sum test
data: intrinsic - 2 and extrinsic
W = 330, p-value = 0.1276
alternative hypothesis: true mu is greater than 0
II. Confidence Intervals by Inverting a Test
Under the model of an additive treatment effect, $r_{Tsi} = r_{Csi} + \tau$, a
$1 - \alpha$ confidence set for $\tau$ is obtained by testing each value of $\tau$
at level $\alpha$ and collecting all values not rejected into a set $A$.
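
The definition can be applied literally on a grid of candidate effects. This is a rough sketch under stated assumptions rather than the notes' method: the grid `tau.grid` and the helper `p.value.at` are illustrative names, and the p-values come from base R's normal approximation (`exact = FALSE`), which is what `wilcox.test` falls back to for these tied data anyway.

```r
# Creativity scores, as given in Section I
intrinsic <- c(12, 12, 12.9, 13.6, 16.6, 17.2, 17.5, 18.2, 19.1, 19.3, 19.8, 20.3,
               20.5, 20.6, 21.3, 21.6, 22.1, 22.2, 22.6, 23.1, 24, 24.3, 26.7, 29.7)
extrinsic <- c(5, 5.4, 6.1, 10.9, 11.8, 12, 12.3, 14.8, 15, 16.8, 17.2, 17.2,
               17.4, 17.5, 18.5, 18.7, 18.7, 19.2, 19.5, 20.7, 21.2, 22.1, 24)

# One-sided p-value for H0: tau = tau0 (normal approximation)
p.value.at <- function(tau0) {
  wilcox.test(intrinsic - tau0, extrinsic,
              alternative = "greater", exact = FALSE)$p.value
}

tau.grid <- seq(-2, 10, by = 0.1)     # hypothetical grid of candidate effects
p.values <- sapply(tau.grid, p.value.at)
alpha <- 0.05
confidence.set <- tau.grid[p.values >= alpha]   # all tau0 that are not rejected
range(confidence.set)                 # upper end is just the edge of the grid
```

Because the test is one-sided, the set is unbounded above; only its lower end is informative, and its resolution is limited by the grid spacing.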
For an effect increasing statistic, which includes all the tests
from Chapter 2.4.3 of the book (e.g., Wilcoxon rank sum,
Wilcoxon signed rank, difference in means, etc.), the test
statistic $t(Z, r_C(\tau))$ is a decreasing (more precisely, nonincreasing)
function of $\tau$. The argument is the following:
Let $\tau < \tau^*$. Let $r_C(\tau), r_C(\tau^*)$ denote the potential responses
under control under $\tau, \tau^*$ respectively, based on the observed
responses $R$, i.e., $r_C(\tau) = R - \tau Z$, $r_C(\tau^*) = R - \tau^* Z$.
Write $r_{si}$ and $r^*_{si}$ for their components. Then, at the observed
assignment $z = Z$, $(r_{si} - r^*_{si})(2 z_{si} - 1) = (\tau^* - \tau) Z_{si} \geq 0$
for all $s, i$. Then, for an effect increasing statistic,
$t(Z, r_C(\tau)) \geq t(Z, r_C(\tau^*))$.
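
The monotonicity can be seen numerically on the eight-unit example from Section I. A small sketch in base R; the helper `rank.sum.at` and the grid are illustrative choices, not part of the notes.

```r
# Eight-unit example from Section I
R <- c(9, 1, 3, 4, 7, 11, 8, 5)
Z <- c(1, 0, 0, 0, 1, 1, 1, 0)

# Wilcoxon rank sum of the treated group after removing a hypothesized effect tau0
rank.sum.at <- function(tau0) {
  adjusted <- R - tau0 * Z
  sum(rank(adjusted)[Z == 1])
}

tau.grid <- seq(-5, 15, by = 0.25)
statistics <- sapply(tau.grid, rank.sum.at)
all(diff(statistics) <= 0)   # TRUE: the statistic never increases with tau0
range(statistics)            # 10 26: from all treated ranked lowest to all ranked highest
```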
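Since the test statistic is a decreasing function of $\tau$, the one-sided
p-value is monotone in $\tau$, and we can find the endpoint of
the confidence interval by the bisection method (see Chapter 9
of Numerical Recipes in C,
http://www.nrbook.com/a/bookcpdf.php). The function uniroot
in R finds a zero of a one-dimensional function on an interval whose
endpoint values have opposite signs; it uses Brent's method, which
combines bisection with interpolation steps.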
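
For readers who have not seen it, bisection for a decreasing function takes only a few lines. The sketch below is illustrative: `bisect.decreasing` is a hypothetical helper, and the toy function `f` stands in for the p-value-minus-alpha function used in these notes.

```r
# Hypothetical helper: bisection for a decreasing function f with
# f(lower) > 0 > f(upper). Halves the bracketing interval until it is
# shorter than tol, then returns its midpoint.
bisect.decreasing <- function(f, lower, upper, tol = 1e-6) {
  stopifnot(f(lower) > 0, f(upper) < 0)
  while (upper - lower > tol) {
    mid <- (lower + upper) / 2
    if (f(mid) > 0) lower <- mid else upper <- mid   # keep the sign change bracketed
  }
  (lower + upper) / 2
}

f <- function(x) 1 - x^3          # toy decreasing function with its zero at x = 1
root.bisect <- bisect.decreasing(f, -10, 10)
root.uniroot <- uniroot(f, c(-10, 10))$root
c(root.bisect, root.uniroot)      # both close to 1, to within their tolerances
```

When applied to a p-value, which is a step function of tau0, either routine returns a point close to the jump where the sign changes rather than an exact zero.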
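# Find one-sided 95% lower confidence limit for tau.
# Requires library(exactRankTests) and the intrinsic/extrinsic vectors from above.
# The function returns the one-sided p-value for H0: tau = tau0 minus alpha;
# the value of tau0 at which it changes sign is the lower confidence limit.
pval.minus.alpha.func=function(tau0,ytreated,ycontrol,alpha=.05){
pval=wilcox.exact(ytreated-tau0,ycontrol,alternative="greater")$p.value;
pval-alpha;
}
lower.ci.limit=uniroot(pval.minus.alpha.func,c(-10,10),ytreated=intrinsic,ycontrol=extrinsic)$root;
> lower.ci.limit
[1] 1.400071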
A one-sided 95% confidence interval for $\tau$ is approximately
$(1.40, \infty)$.
Two Sided Confidence Interval
A two-sided 95% confidence interval can be found by taking the
intersection of two 97.5% one-sided confidence intervals (this is
the shortest interval containing all $\tau$ that are rejected by neither
of the two one-sided, 0.025 level tests).
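# Find two-sided 95% confidence interval
# (requires library(exactRankTests) and pval.minus.alpha.func from above)
lower.twosided.ci.limit=uniroot(pval.minus.alpha.func,c(-10,10),ytreated=intrinsic,ycontrol=extrinsic,alpha=.025)$root;
# Upper limit: same construction with the one-sided test in the "less" direction.
# pval.minus.alpha.func has no side argument, so that test is written inline.
upper.twosided.ci.limit=uniroot(function(tau0) wilcox.exact(intrinsic-tau0,extrinsic,alternative="less")$p.value-.025,c(-10,10))$root;
> lower.twosided.ci.limit
[1] 1.000062
> upper.twosided.ci.limit
[1] 6.599908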
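
As a rough cross-check that does not need the exactRankTests package, base R's `wilcox.test` can return a confidence interval for the location shift when called with `conf.int = TRUE`. This is a sketch, not the method used in these notes: with `exact = FALSE` the interval rests on the normal approximation, so its endpoints should be close to, but need not equal, the exact limits.

```r
# Creativity scores, as given in Section I
intrinsic <- c(12, 12, 12.9, 13.6, 16.6, 17.2, 17.5, 18.2, 19.1, 19.3, 19.8, 20.3,
               20.5, 20.6, 21.3, 21.6, 22.1, 22.2, 22.6, 23.1, 24, 24.3, 26.7, 29.7)
extrinsic <- c(5, 5.4, 6.1, 10.9, 11.8, 12, 12.3, 14.8, 15, 16.8, 17.2, 17.2,
               17.4, 17.5, 18.5, 18.7, 18.7, 19.2, 19.5, 20.7, 21.2, 22.1, 24)

# Two-sided 95% interval for the shift, normal approximation
check <- wilcox.test(intrinsic, extrinsic, conf.int = TRUE,
                     conf.level = 0.95, exact = FALSE)
ci <- check$conf.int
ci               # approximate lower and upper limits
check$estimate   # point estimate of the location shift reported by wilcox.test
```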
III. Point Estimates: Unbiased Estimates of the Average Effect
(Section 2.7.1)
Randomized experiments enable us to obtain unbiased estimates
of the average treatment effect.
Suppose there are $N$ subjects and $m$ of the subjects are
randomly assigned to treatment, the rest to control.
Consider estimating the average causal effect of the treatment
among the population of these $N$ subjects,
$$ACE = \frac{1}{N} \sum_{i=1}^{N} (r_{Tsi} - r_{Csi}),$$
by the difference between the sample
mean of the outcomes in the treated group and the sample mean
of the outcomes in the control group,
$$\widehat{ACE} = \frac{1}{m} \sum_{i=1}^{N} Z_i R_i - \frac{1}{N-m} \sum_{i=1}^{N} (1 - Z_i) R_i.$$
Proposition: $\widehat{ACE}$ is an unbiased estimator of $ACE$.
Proof: Taking the expectation over the distribution of
$\binom{N}{m}$ equally likely random assignments, we have
$$
\begin{aligned}
E(\widehat{ACE}) &= E\left[ \frac{1}{m} \sum_{i=1}^{N} Z_i R_i - \frac{1}{N-m} \sum_{i=1}^{N} (1 - Z_i) R_i \right] \\
&= E\left[ \frac{1}{m} \sum_{i=1}^{N} Z_i r_{Tsi} - \frac{1}{N-m} \sum_{i=1}^{N} (1 - Z_i) r_{Csi} \right] \\
&= \frac{1}{m} \sum_{i=1}^{N} \frac{m}{N} r_{Tsi} - \frac{1}{N-m} \sum_{i=1}^{N} \frac{N-m}{N} r_{Csi} \\
&= \frac{1}{N} \sum_{i=1}^{N} (r_{Tsi} - r_{Csi}).
\end{aligned}
$$
■
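
The proposition can be verified by enumeration on a small made-up example. In the sketch below the potential outcomes `rT` and `rC` are hypothetical numbers chosen for illustration (with effects that differ across units), and both potential outcomes are treated as known, which is never the case in a real experiment.

```r
# Hypothetical potential outcomes for N = 8 units
rC <- c(8, 1, 3, 4, 6, 10, 7, 5)
rT <- rC + c(1, 0, 2, 3, 1, 4, 0, 1)
N <- length(rC)
m <- 4
ace <- mean(rT - rC)                  # average causal effect, here 1.5

# Difference in means for every one of the choose(N, m) treated groups
treated.sets <- combn(N, m)
estimates <- apply(treated.sets, 2, function(idx) {
  Z <- as.integer(seq_len(N) %in% idx)
  R <- Z * rT + (1 - Z) * rC          # observed responses under this assignment
  sum(Z * R) / m - sum((1 - Z) * R) / (N - m)
})

mean(estimates)                        # average over all equally likely assignments
all.equal(mean(estimates), ace)        # TRUE: the estimator is unbiased for ACE
```

Individual estimates vary from assignment to assignment; only their average over the randomization distribution equals the ACE.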
Comment: The proposition says that a randomized experiment
provides an unbiased estimate of the mean treatment effect
among the subjects in the study, $\frac{1}{N} \sum_{i=1}^{N} (r_{Tsi} - r_{Csi})$.
It follows that
if the units are randomly sampled from an infinite population, a
randomized experiment provides an unbiased estimate of the
mean treatment effect in the population over repeated
experiments (where each experiment consists of randomly
sampling the units and then randomly assigning the sampled
units to the treatments).
An unbiased estimate of the median treatment effect in the
population cannot be obtained. To see this, consider two
populations of units, one in which
$$P(r_{Ti} = 6, r_{Ci} = 4) = 1/3, \quad P(r_{Ti} = 8, r_{Ci} = 6) = 1/3, \quad P(r_{Ti} = 10, r_{Ci} = 8) = 1/3,$$
and another in which
$$P(r_{Ti} = 10, r_{Ci} = 4) = 1/3, \quad P(r_{Ti} = 8, r_{Ci} = 8) = 1/3, \quad P(r_{Ti} = 6, r_{Ci} = 6) = 1/3.$$
In the first population, the median treatment effect is 2 while in
the second population, the median treatment effect is 0. But the
marginal distributions of $r_{Ci}$ and $r_{Ti}$ are the same for the two
populations, so the distribution of the treated and control subject
outcomes will be the same in repeated experiments in which the
units are randomly drawn from an infinite population. Hence no
estimator based on those outcomes can be unbiased for the median
effect in both populations.
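
The two populations can be written down directly to confirm the claim. A short sketch in base R; the data frame names `pop1` and `pop2` are illustrative, with one row per equally likely type of unit.

```r
# One row per type of unit, each with probability 1/3
pop1 <- data.frame(rT = c(6, 8, 10), rC = c(4, 6, 8))
pop2 <- data.frame(rT = c(10, 8, 6), rC = c(4, 8, 6))

median(pop1$rT - pop1$rC)   # 2
median(pop2$rT - pop2$rC)   # 0

# The marginal distributions of rT and of rC agree across the populations
identical(sort(pop1$rT), sort(pop2$rT))   # TRUE
identical(sort(pop1$rC), sort(pop2$rC))   # TRUE

# The mean effect depends only on the marginals, so it agrees as well
c(mean(pop1$rT - pop1$rC), mean(pop2$rT - pop2$rC))   # 2 2
```

The contrast with the mean is the point: the mean effect is a difference of marginal means, while the median effect depends on how $r_{Ti}$ and $r_{Ci}$ are paired within units.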
```