  Hypothesis tests for two samples

F test on ratio of two variances
Null hypothesis: The ratio of the variances of the two populations is $\sigma_1^2 / \sigma_2^2$.
Test statistic: $F = \dfrac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$, where $s_1^2 > s_2^2$.
Distribution: $F_{n_1 - 1,\; n_2 - 1}$

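As a rough illustration of the F statistic above, the following Python sketch computes it from two samples using only the standard library. The function name `f_statistic` and the parameter `variance_ratio` (the hypothesised value of $\sigma_1^2/\sigma_2^2$) are illustrative assumptions, not part of the original table, and the result would still need to be compared with F tables.

```python
from statistics import variance


def f_statistic(sample_1, sample_2, variance_ratio=1.0):
    """Return (F, numerator df, denominator df).

    variance_ratio is the hypothesised sigma_1^2 / sigma_2^2 (hypothetical name).
    """
    s1_sq = variance(sample_1)  # unbiased sample variance, divisor n1 - 1
    s2_sq = variance(sample_2)  # unbiased sample variance, divisor n2 - 1
    f = (s1_sq / s2_sq) / variance_ratio
    return f, len(sample_1) - 1, len(sample_2) - 1


# Made-up data, chosen so that sample_1 has the larger sample variance.
f, df1, df2 = f_statistic([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
print(f, df1, df2)
```
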
Kolmogorov-Smirnov 2-sample test
Null hypothesis: The two samples are drawn from the same underlying population.
Test statistic: $D^* = n_1 n_2 D$, where $D$ is the largest difference between the cumulative probabilities based on the two samples and $n_1$, $n_2$ are the sample sizes.
Distribution: As in statistical tables.

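The quantity $D$ can be found by comparing the two empirical cumulative distributions at every observed value. The sketch below is one simple way to do that in Python; the function name `ks_two_sample` is an assumption made for illustration, and the critical value still has to come from statistical tables.

```python
def ks_two_sample(sample_1, sample_2):
    """Return (D, D*) where D* = n1 * n2 * D."""
    n1, n2 = len(sample_1), len(sample_2)
    d = 0.0
    # The largest gap between the two step functions occurs at a data value.
    for x in sorted(set(sample_1) | set(sample_2)):
        cdf_1 = sum(v <= x for v in sample_1) / n1
        cdf_2 = sum(v <= x for v in sample_2) / n2
        d = max(d, abs(cdf_1 - cdf_2))
    return d, n1 * n2 * d


# Made-up data for illustration only.
d, d_star = ks_two_sample([1, 2, 3, 4], [3, 4, 5, 6])
print(d, d_star)
```
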
Mann-Whitney U test
Null hypothesis: The two samples are drawn from the same underlying population. [H1: the two samples come from populations with different medians]
Test statistic: Rank all the data values from both samples together; for sample sizes $m$, $n$ with $m \le n$, calculate $T_1 = R - \tfrac{1}{2}m(m+1)$ and $T_2 = S - \tfrac{1}{2}n(n+1)$, where $R$ is the sum of the ranks of the sample of size $m$ and $S$ is the sum of the ranks of the sample of size $n$. The test statistic is the smaller of $T_1$, $T_2$.
The test statistic can also be found by counting: for each value in sample 1, count the number of values in sample 2 which exceed it, and add up these counts; repeat with the roles of the two samples swapped and take the smaller total. There are online calculators such as: http://socr.stat.ucla.edu/Applets.dir/U_Test.html
Distribution: As in statistical tables. Approximately Normal for large samples, with mean $\tfrac{1}{2}mn$ and variance $\tfrac{1}{12}mn(m+n+1)$.

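The ranking and counting routes to the Mann-Whitney statistic can be compared directly in a few lines of Python. This is a minimal sketch: the helper names `average_ranks` and `mann_whitney` are assumptions, and giving tied values the mean of their ranks is a common convention that the table itself does not specify.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def mann_whitney(sample_m, sample_n):
    """Return (test statistic, T1, T2) for samples of sizes m <= n."""
    m, n = len(sample_m), len(sample_n)
    ranks = average_ranks(list(sample_m) + list(sample_n))
    r = sum(ranks[:m])  # R: rank sum of the sample of size m
    s = sum(ranks[m:])  # S: rank sum of the sample of size n
    t1 = r - m * (m + 1) / 2
    t2 = s - n * (n + 1) / 2
    return min(t1, t2), t1, t2


# Made-up data for illustration only.
sample_m = [1.1, 2.3, 4.0]
sample_n = [3.2, 5.5, 6.1, 7.4]
u, t1, t2 = mann_whitney(sample_m, sample_n)

# Counting check: pairs in which a value from sample_m exceeds one from sample_n.
pairs = sum(x > y for x in sample_m for y in sample_n)
print(u, t1, t2, pairs)
```

When there are no ties, $T_1 + T_2 = mn$, which gives a quick arithmetic check on the two rank sums.
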
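Normal test for paired samples with known variance
Null hypothesis: The difference in the population means has value $k$.
Test statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - k}{\sigma/\sqrt{n}} = \dfrac{\bar{d} - k}{\sigma/\sqrt{n}}$
Distribution: N(0, 1)

Normal test for paired samples with unknown variance
Null hypothesis: The difference in the population means has value $k$.
Test statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - k}{s/\sqrt{n}} = \dfrac{\bar{d} - k}{s/\sqrt{n}}$
Distribution: N(0, 1) for large samples

Normal test for unpaired samples with common known variance
Null hypothesis: The difference in the means of the two populations is $\mu_1 - \mu_2$.
Test statistic: $z = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
Distribution: N(0, 1)

© MEI 2008

Normal test for unpaired samples with common unknown variance
Null hypothesis: The difference in the means of the two populations is $\mu_1 - \mu_2$.
Test statistic: $z = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Distribution: N(0, 1) for large samples

Normal test for unpaired samples with different known variances
Null hypothesis: The difference in the means of the two populations is $\mu_1 - \mu_2$.
Test statistic: $z = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Distribution: N(0, 1)

Normal test for unpaired samples with different unknown variances
Null hypothesis: The difference in the means of the two populations is $\mu_1 - \mu_2$.
Test statistic: $z = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Distribution: N(0, 1) for large samples
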
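For the Normal tests on unpaired samples, the calculation has the same shape in every case: a difference in sample means, minus the hypothesised difference, divided by a standard error. The sketch below shows the version with different known variances; the function name `z_unpaired` and its parameter names are illustrative assumptions, and the two-sided p-value is just one possible way of using the N(0, 1) distribution.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_unpaired(sample_x, sample_y, var_x, var_y, mu_diff=0.0):
    """Return (z, two-sided p-value) for known population variances.

    mu_diff is the hypothesised value of mu_1 - mu_2 (hypothetical name).
    """
    n1, n2 = len(sample_x), len(sample_y)
    standard_error = sqrt(var_x / n1 + var_y / n2)
    z = ((mean(sample_x) - mean(sample_y)) - mu_diff) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Made-up data; both population variances are assumed known and equal to 4.
z, p = z_unpaired([10.0, 12.0, 14.0, 12.0], [9.0, 11.0, 10.0, 10.0], 4.0, 4.0)
print(z, p)
```

Replacing `var_x` and `var_y` with sample variances gives the large-sample version for unknown variances.
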
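Sign test
Null hypothesis: Population of differences has median = 0.
Test statistic: $r$ = number of values of $d_i > 0$, where $d_i = x_i - y_i$.
Distribution: $B(n, \tfrac{1}{2})$, where $n$ = number of values of $d_i \neq 0$.

t test for paired samples
Null hypothesis: The difference in the population means has value $k$.
Test statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - k}{s/\sqrt{n}} = \dfrac{\bar{d} - k}{s/\sqrt{n}}$
Distribution: $t_{n-1}$
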
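Because the sign test uses the binomial distribution B(n, ½), its tail probabilities can be computed exactly without tables. The following sketch assumes a two-sided test obtained by doubling the smaller tail; the function name `sign_test` and that choice of p-value are assumptions rather than something stated in the table.

```python
from math import comb


def sign_test(xs, ys):
    """Return (r, n, two-sided p-value) for paired data."""
    diffs = [x - y for x, y in zip(xs, ys)]
    nonzero = [d for d in diffs if d != 0]  # zero differences are excluded
    n = len(nonzero)
    r = sum(d > 0 for d in nonzero)  # number of positive differences
    # Tail probabilities under B(n, 1/2).
    lower = sum(comb(n, k) for k in range(0, r + 1)) / 2 ** n
    upper = sum(comb(n, k) for k in range(r, n + 1)) / 2 ** n
    return r, n, min(1.0, 2 * min(lower, upper))


# Made-up paired data for illustration only.
r, n, p = sign_test([5, 7, 6, 9, 8, 7], [4, 7, 5, 6, 9, 5])
print(r, n, p)
```
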
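t test for unpaired samples with common unknown variance
Null hypothesis: The difference in the means of the two populations is $\mu_1 - \mu_2$.
Test statistic: $t = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Distribution: $t_{n_1 + n_2 - 2}$
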
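The pooled variance estimate is the step most easily mistyped by hand, so a short Python sketch of the unpaired t statistic may help. The function name `pooled_t` and the parameter `mu_diff` (the hypothesised $\mu_1 - \mu_2$) are illustrative assumptions; the statistic is then compared with t tables on the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_x, sample_y, mu_diff=0.0):
    """Return (t, degrees of freedom) using the pooled variance estimate."""
    n1, n2 = len(sample_x), len(sample_y)
    df = n1 + n2 - 2
    # Pooled estimate s^2 of the common variance.
    s_sq = ((n1 - 1) * variance(sample_x) + (n2 - 1) * variance(sample_y)) / df
    standard_error = sqrt(s_sq) * sqrt(1 / n1 + 1 / n2)
    t = ((mean(sample_x) - mean(sample_y)) - mu_diff) / standard_error
    return t, df


# Made-up data for illustration only.
t, df = pooled_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(t, df)
```
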
Wilcoxon paired sample test
Null hypothesis: Population of differences has median = $M$.
Test statistic: $T = \min[P, Q]$, where $P$, $Q$ are the sums of the ranks corresponding to positive and negative deviations $(d_i - M)$, and $d_i = x_i - y_i$.
Distribution: As in statistical tables.

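A small Python sketch of the Wilcoxon paired sample statistic follows. It ranks the absolute deviations, then sums the ranks separately for positive and negative deviations. The names `average_ranks`, `wilcoxon_paired` and `hypothesised_median` are assumptions, as are the conventions of dropping zero deviations and averaging tied ranks, which the table does not spell out.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def wilcoxon_paired(xs, ys, hypothesised_median=0.0):
    """Return (T, P, Q) where T = min(P, Q)."""
    devs = [(x - y) - hypothesised_median for x, y in zip(xs, ys)]
    devs = [d for d in devs if d != 0]  # assumption: zero deviations are dropped
    ranks = average_ranks([abs(d) for d in devs])
    p = sum(rank for rank, d in zip(ranks, devs) if d > 0)  # positive deviations
    q = sum(rank for rank, d in zip(ranks, devs) if d < 0)  # negative deviations
    return min(p, q), p, q


# Made-up paired data for illustration only.
t_stat, p_sum, q_sum = wilcoxon_paired(
    [12.0, 15.0, 9.0, 14.0, 11.0], [10.0, 11.0, 10.0, 9.0, 8.0]
)
print(t_stat, p_sum, q_sum)
```
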
Wilcoxon rank sum 2-sample test
Null hypothesis: The two samples are drawn from the same underlying population. [H1: the two samples come from populations with different medians]
Test statistic: Rank all the data values from both samples together; $W$ = sum of the ranks of the sample of size $m$, where the sample sizes are $m$, $n$ with $m \le n$.
Distribution: Statistical tables give critical values for the lower tail; the critical value for the upper tail is $m(m + n + 1) - W_C$, where $W_C$ is the critical value from tables. Approximately Normal for large samples, with mean $\tfrac{1}{2}mn + \tfrac{1}{2}m(m+1)$ and variance $\tfrac{1}{12}mn(m+n+1)$.

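The rank sum $W$, its Normal approximation and the upper-tail critical value can all be computed with a few lines of Python. This is only a sketch: the names `average_ranks`, `rank_sum` and `upper_critical_value` are assumptions, the data are made up, and samples this small would be referred to tables, not to the Normal approximation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def rank_sum(sample_m, sample_n):
    """Return (W, mean of W, variance of W, z) with sample_m the smaller sample."""
    m, n = len(sample_m), len(sample_n)
    if m > n:
        raise ValueError("pass the smaller sample first (m <= n)")
    ranks = average_ranks(list(sample_m) + list(sample_n))
    w = sum(ranks[:m])  # sum of ranks of the sample of size m
    mean_w = m * n / 2 + m * (m + 1) / 2
    var_w = m * n * (m + n + 1) / 12
    z = (w - mean_w) / sqrt(var_w)  # large-sample Normal approximation
    return w, mean_w, var_w, z


def upper_critical_value(w_c, m, n):
    """Convert a tabulated lower-tail critical value into the upper-tail one."""
    return m * (m + n + 1) - w_c


w, mean_w, var_w, z = rank_sum([1.1, 2.3, 4.0], [3.2, 5.5, 6.1, 7.4])
# w_c = 6 is a made-up value used only to show the arithmetic.
upper = upper_critical_value(6, 3, 4)
print(w, mean_w, var_w, z, upper)
```
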