U3.1-TwoPopulations

Two Independent Populations
(Chapter 6)
• Develop a confidence interval for the difference in means between two independent normal populations.
• Develop a statistical test for the difference in means between two independent normal populations, assuming:
  • equal population variances;
  • unequal population variances.
• Develop a nonparametric statistical test for the difference in means between two independent populations, assuming equal population variances.
• Develop parametric and nonparametric statistical tests for the difference in means between two dependent populations.
Two-Sample-Means-1
Situation
Need to determine if the concentration of a contaminant at an old industrial site is greater than background levels from areas surrounding the site.
Take soil samples at random locations within the site and at random locations in areas outside the site.
Assume values at areas outside the site are unaffected by the site activities that lead to contamination.
Two-Sample-Means-2
Hypotheses of Interest
1. Background level of contaminant is greater than the no-detect
level (-5.0) [on a natural log scale].
2. Site level of contaminant is greater than the no-detect level (-5.0).
3. Site level is different from Background level.
[Figure: two possible configurations of the background (μ1) and site (μ2) distributions relative to the no-detect level of −5.0 — is the situation this, or this?]
Two-Sample-Means-3
One-Sample Hypothesis Tests
1. Background level of contaminant is greater than the no-detect level (−5.0).
2. Site level of contaminant is greater than the no-detect level (−5.0).

H0: μi ≤ −5.0
HA: μi > −5.0

One-sided t-tests:
T.S.: ti = (ȳi − μ0) / (si/√ni),  μ0 = −5.0
R.R.: Reject H0 if ti ≥ t_α,ni−1

i = 1 => Background areas (B)
i = 2 => Contamination site (S)

Critical Values
 n    t_.05,n−1
 7    1.943
 8    1.895
 9    1.860
10    1.833
Two-Sample-Means-4
DATA and One-Sided T-tests

Y (Background): −2.96, −1.09, −3.13, −2.12, −2.59, −4.31, −1.20

n1 = 7, ȳ1 = −2.48, s1 = 1.13

H0: μB ≤ μ0 = −5.0
HA: μB > μ0 = −5.0
Pr(Type I error) = α = 0.05

T.S.: t = (ȳ1 − μ0)/(s1/√n1) = (−2.48 − (−5.0))/(1.13/√7) = 5.90

R.R.: Reject H0 if t > t_α,n−1 = t_0.05,6 = 1.943

Conclusion: Since 5.90 > 1.943 we reject H0 and conclude that the true average background level is above −5.0.
Same test could be performed for contaminated site data.
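As a quick numerical check, here is a minimal sketch of this one-sided test in Python using scipy (the data values are those listed above; results agree with the hand calculation up to rounding):

```python
import numpy as np
from scipy import stats

# Background soil measurements (natural-log scale), from the slide
background = np.array([-2.96, -1.09, -3.13, -2.12, -2.59, -4.31, -1.20])

# One-sided one-sample t-test of H0: mu_B <= -5.0 vs HA: mu_B > -5.0
t_stat, p_value = stats.ttest_1samp(background, popmean=-5.0, alternative='greater')
crit = stats.t.ppf(0.95, df=len(background) - 1)   # t_0.05,6 = 1.943

print(f"ybar = {background.mean():.2f}, s = {background.std(ddof=1):.2f}")
print(f"t = {t_stat:.2f}, critical value = {crit:.3f}, p-value = {p_value:.4f}")
# t is about 5.9 > 1.943, so H0 is rejected, as on the slide.
```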
Two-Sample-Means-5
Comparison Hypothesis
H0: Site level is not different from Background level (μB = μS).
HA: Site level is different from Background level (μB ≠ μS).

This requires comparing sample means from two “independent” samples, one from each population.

T.S. (the obvious test statistic):
t = (ȳ1 − ȳ2) / s_(ȳ1−ȳ2)

where s_(ȳ1−ȳ2) is the standard error of the difference of the two means.
Two-Sample-Means-6
Standard Error of the Difference of Two Means
If the true variances of the two populations are known, we use the property of independent random variables that says Var(X − Y) = Var(X) + Var(Y):

σ²_(ȳ1−ȳ2) = Var(ȳ1) + Var(ȳ2) = σ1²/n1 + σ2²/n2

(σ1²/n1 comes from the sampling distribution of ȳ1, and σ2²/n2 from the sampling distribution of ȳ2.)

T.S.: z = (ȳ1 − ȳ2) / σ_(ȳ1−ȳ2)
Two-Sample-Means-7
Assume the two populations have the same, or nearly the same, true variance σp².

True standard error of the difference of two means:
σ_(ȳ1−ȳ2) = σp √(1/n1 + 1/n2)

Estimate of the standard error of the difference of the two means:
s_(ȳ1−ȳ2) = sp √(1/n1 + 1/n2)

Estimate of the common (pooled) standard deviation:
sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]
Two-Sample-Means-8
Confidence Interval for difference
Assume a confidence level of (1 − α)100%:

(ȳ1 − ȳ2) ± t_α/2, n1+n2−2 · sp √(1/n1 + 1/n2)

where sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]
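As an illustration, a minimal sketch of this pooled-variance confidence interval computed from summary statistics in Python; the function and argument names are mine, not from the text:

```python
import math
from scipy import stats

def pooled_ci(n1, ybar1, s1, n2, ybar2, s2, conf=0.95):
    """(1 - alpha)100% CI for mu1 - mu2, assuming equal population variances."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled SD
    se = sp * math.sqrt(1 / n1 + 1 / n2)                        # SE of ybar1 - ybar2
    tcrit = stats.t.ppf(1 - (1 - conf) / 2, df)
    diff = ybar1 - ybar2
    return diff - tcrit * se, diff + tcrit * se

# Summary values from the contamination example that follows
print(pooled_ci(7, -2.48, 1.13, 7, -4.83, 0.89))   # roughly (1.17, 3.53)
```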
Two-Sample-Means-9
Statistical test for
difference of means
Pooled Variances T-test
Pr(Type I Error) = α

H0: μ1 − μ2 = D0
HA: 1. μ1 − μ2 > D0, reject H0 if t > t_α,df
    2. μ1 − μ2 < D0, reject H0 if t < −t_α,df
    3. μ1 − μ2 ≠ D0, reject H0 if |t| > t_α/2,df

Test Statistic:
t = (ȳ1 − ȳ2 − D0) / (sp √(1/n1 + 1/n2)),   df = n1 + n2 − 2

sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]
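For concreteness, a small sketch of this pooled-variances test computed from summary statistics (the helper name is mine and not part of the text):

```python
import math
from scipy import stats

def pooled_t_test(n1, ybar1, s1, n2, ybar2, s2, D0=0.0):
    """Return (t, df, two-sided p-value) for H0: mu1 - mu2 = D0, equal variances."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled SD
    t = (ybar1 - ybar2 - D0) / (sp * math.sqrt(1 / n1 + 1 / n2))
    p = 2 * stats.t.sf(abs(t), df)   # two-sided; halve for a one-sided test
    return t, df, p

# Summary values from the contamination example that follows: t is about 4.3 on 12 df
print(pooled_t_test(7, -2.48, 1.13, 7, -4.83, 0.89))
```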
Two-Sample-Means-10
Contamination example (continued)

H0: μB − μS = 0
HA: μB − μS ≠ 0  (average site level is significantly different from background)

   YB       YS
 −2.96    −3.81
 −1.09    −5.83
 −3.13    −5.70
 −2.12    −4.11
 −2.59    −3.83
 −4.31    −5.01
 −1.20    −5.49

Background: n1 = 7, ȳ1 = −2.48, s1 = 1.13
Site:       n2 = 7, ȳ2 = −4.83, s2 = 0.89

sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]
   = √[ ((6)1.13² + (6)0.89²) / (7 + 7 − 2) ] = 1.017
Two-Sample-Means-11
Pr(Type I Error) = α = 0.05

sp = 1.017
s_(ȳ1−ȳ2) = sp √(1/n1 + 1/n2) = 1.017 √(1/7 + 1/7) = 0.5436

T.S.: t = (ȳ1 − ȳ2 − D0) / (sp √(1/n1 + 1/n2)) = (2.34 − 0)/0.5436 = 4.30
(ȳ1 − ȳ2 = 2.34 from the data)

R.R.: Reject if |t| > t_α/2,df = t_0.025,12 = 2.179
Conclusion: Since 4.30 > 2.179 we reject H0 and conclude that site concentration levels are significantly different from background.
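The same test can be run directly on the raw measurements; a sketch using scipy.stats.ttest_ind (small rounding differences from the hand calculation are expected):

```python
import numpy as np
from scipy import stats

background = np.array([-2.96, -1.09, -3.13, -2.12, -2.59, -4.31, -1.20])
site = np.array([-3.81, -5.83, -5.70, -4.11, -3.83, -5.01, -5.49])

# Pooled-variances (equal-variance) two-sample t-test of H0: mu_B - mu_S = 0
t_stat, p_value = stats.ttest_ind(background, site, equal_var=True)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")   # t around 4.3, p well below 0.05
```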
Two-Sample-Means-12
What if the two populations do
not have the same variance?
Separate Variances
CI and T-test
• New estimate of the standard error of the difference.
• The test and CI are no longer exact; they use Satterthwaite’s approximate df value (df′).

C.I.: (ȳ1 − ȳ2) ± t_α/2,df′ √(s1²/n1 + s2²/n2)

T.S.: t′ = (ȳ1 − ȳ2 − D0) / √(s1²/n1 + s2²/n2)

df′ = (n1 − 1)(n2 − 1) / [ (n2 − 1)c² + (n1 − 1)(1 − c)² ],
where c = (s1²/n1) / (s1²/n1 + s2²/n2).

Round df′ down to the nearest integer.
Two-Sample-Means-13
For the site contamination example, assume σ1 ≠ σ2 and redo the test.

c = (s1²/n1) / (s1²/n1 + s2²/n2) = (1.276/7) / (1.276/7 + 0.7921/7) = 0.6171

df′ = (n1 − 1)(n2 − 1) / [ (n2 − 1)c² + (n1 − 1)(1 − c)² ]
    = (6)(6) / [ (6)(0.6171)² + (6)(1 − 0.6171)² ] = 11.4, rounded down to 11

T.S.: t′ = (ȳ1 − ȳ2 − D0) / √(s1²/n1 + s2²/n2) = (2.34 − 0) / √(1.13²/7 + 0.89²/7) = 2.34/0.5437 = 4.30

R.R.: Reject if |t′| > t_α/2,df′ = t_0.025,11 = 2.201
Conclusion: Since 4.30 > 2.201 we reject H0 and conclude that site concentration levels are significantly different from background.
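A sketch of the separate-variances (Satterthwaite/Welch) version in scipy; note that scipy uses the unrounded df′, so its p-value differs slightly from a table lookup at df′ = 11:

```python
import numpy as np
from scipy import stats

background = np.array([-2.96, -1.09, -3.13, -2.12, -2.59, -4.31, -1.20])
site = np.array([-3.81, -5.83, -5.70, -4.11, -3.83, -5.01, -5.49])

# Separate-variances t-test (Satterthwaite approximate df)
t_stat, p_value = stats.ttest_ind(background, site, equal_var=False)

# Satterthwaite df' computed directly, as on the slide
v1 = background.var(ddof=1) / len(background)
v2 = site.var(ddof=1) / len(site)
df_prime = (v1 + v2)**2 / (v1**2 / (len(background) - 1) + v2**2 / (len(site) - 1))
print(f"t' = {t_stat:.2f}, df' = {df_prime:.1f}, p = {p_value:.4f}")   # df' about 11.4
```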
Two-Sample-Means-14
Sample Size Determination
(equal variances)
(1 − α)100% CI for μ1 – μ2:
n = 2σ² z²_α/2 / E²,   E = desired half-width (margin of error) of the interval,   n1 = n2 = n

α = Pr(Type I Error)
β = Pr(Type II Error)

One-sided test:
n = 2σ² (z_α + z_β)² / Δ²,   Δ = |μ1 − μ2|,   n1 = n2 = n

Two-sided test:
n = 2σ² (z_α/2 + z_β)² / Δ²,   Δ = |μ1 − μ2|,   n1 = n2 = n
Two-Sample-Means-15
Example of Sample Size Determination
In our sample, a Δ of 2.34 was observed. What if we had wanted to be sure that a Δ of, say, 1 unit would be declared significant with:
α = Pr(Type I Error) = 0.05
β = Pr(Type II Error) = 0.10
Assume a common population variance of σ² = 1.
One-sided test:
n = 2σ²(z_α + z_β)² / Δ² = 2(1)²(z_0.05 + z_0.10)² / 1² = 2(1.645 + 1.280)² = 17.11, so use n = 18.

Two-sided test:
n = 2σ²(z_α/2 + z_β)² / Δ² = 2(1)²(z_0.025 + z_0.10)² / 1² = 2(1.960 + 1.280)² ≈ 21.

(n = n1 = n2)
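A sketch reproducing these sample-size calculations with scipy's normal quantiles (variable names are mine; exact quantiles give 21.0 for the two-sided case, which rounds up to 22, while the slide's rounded z-values give 20.99 and hence 21):

```python
import math
from scipy.stats import norm

sigma2, delta = 1.0, 1.0        # assumed common variance and detectable difference
alpha, beta = 0.05, 0.10        # Type I and Type II error rates

n_one_sided = 2 * sigma2 * (norm.ppf(1 - alpha) + norm.ppf(1 - beta))**2 / delta**2
n_two_sided = 2 * sigma2 * (norm.ppf(1 - alpha / 2) + norm.ppf(1 - beta))**2 / delta**2

print(math.ceil(n_one_sided), math.ceil(n_two_sided))   # 18 and 22 (21 with z rounded)
```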
Two-Sample-Means-16
Summary
• These two-sample inferences require the assumption of
independent, normal population distributions.
• If we have reason to believe the two population
variances are equal, then we should use the pooled
variances method. This results in more powerful
inferences.
• We need not worry about the normality assumption when both sample sizes are large; all results are then still approximately correct (CLT).
• What to do when independence does not hold?
Advanced! Partial solution next lecture (paired samples).
• What to do when we have small samples and we don’t believe the data are normal? Next slide…
Two-Sample-Means-17
The Wilcoxon Rank Sum Test
A class of nonparametric tests. These:
• Do not require data to have normal distributions.
• Seek to make inferences about the median, a more appropriate representation of the center of the population for highly skewed and/or very heavy-tailed distributions.
In the Wilcoxon Rank Sum Test:
• Population variances are assumed to be equal.
• Measurements (observations) are assumed to be independent and from continuous distributions.
• Interest is in whether the centers of the two population distributions are the same or not.
• Also known as the Mann-Whitney U Test.
• 1-sample equivalent: Wilcoxon Signed Rank Test.
Two-Sample-Means-18
Idea behind the Wilcoxon Test
H0: Populations are identical
If H0 is true, then when we put the data from the two samples together and sort them from lowest to highest, i.e. we rank them (the lowest observation gets rank 1, the 2nd lowest gets rank 2, etc., and tied observations get the average of their ranks), the ranks of the observations from the two samples should be fairly well intermingled.
Thus, the sum of the ranks of the population 1 observations should be approximately equal to the sum of the ranks of the population 2 observations.
HA: Population 1 is shifted to the right of population 2.
If HA is true, then when we put the data from the two samples together and sort them lowest to highest, the sum of the ranks of the population 1 observations should be greater than the sum of the ranks of the observations from population 2.
Two-Sample-Means-19
Definition of Population 1
NOTE: Population 1 should be taken to be the one corresponding to the smaller sample size (n1 ≤ n2). If the sample sizes are equal (n1 = n2), either one can be Population 1.
This is so that the tables in Ott will give correct critical values. (Ott inadvertently does not mention this!)
Two-Sample-Means-20
Wilcoxon Test Statistic
H0: Populations are identical.
HA: 1. Population 1 is shifted to the right of population 2.
    2. Population 1 is shifted to the left of population 2.
    3. Populations 1 and 2 have different location parameters.

T.S.: Let T denote the sum of the ranks of the population 1 observations.

Situation #1: If n1 ≤ 10 and n2 ≤ 10, use T as the test statistic and Table 5 in Ott & Longnecker for critical values.

R.R.: 1. Reject H0 if T > TU
      2. Reject H0 if T < TL
      3. Reject H0 if T > TU or T < TL
Two-Sample-Means-21
Situation #2: If n1 > 10 and n2 > 10, we use a normal approximation
to the distribution of the sum of the ranks.
Let T denote the sum of the ranks of population 1 observations.
T.S.: z = (T − μT) / σT

where
μT = n1(n1 + n2 + 1) / 2

σT² = (n1 n2 / 12) [ (n1 + n2 + 1) − Σ_{j=1..k} tj(tj − 1)(tj + 1) / ((n1 + n2)(n1 + n2 − 1)) ]
and tj denotes the number of tied ranks in the jth group of ties, j=1,…,k.
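As an aside, a minimal sketch of this large-sample statistic in Python (the helper name and the tie-group argument are mine, not from the text); the rejection regions follow below:

```python
import math

def rank_sum_z(T, n1, n2, tie_sizes=()):
    """Normal approximation for the rank sum T of sample 1 (with optional tie correction)."""
    N = n1 + n2
    mu_T = n1 * (N + 1) / 2
    tie_term = sum(t * (t - 1) * (t + 1) for t in tie_sizes) / (N * (N - 1))
    var_T = n1 * n2 / 12 * ((N + 1) - tie_term)
    return (T - mu_T) / math.sqrt(var_T)

# Illustration only (the sample sizes in the upcoming example are well below 10):
print(rank_sum_z(74, 7, 7, tie_sizes=[2]))   # about 2.75
```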
R.R.: 1. Reject H0 if z > z_α
      2. Reject H0 if z < −z_α
      3. Reject H0 if |z| > z_α/2
α = Pr(Type I Error)
Two-Sample-Means-22
Situation #1: n1 ≤ 10 and n2 ≤ 10
Group   Value    Rank    Pop 1 ranks
  1     −1.09     14        14
  1     −1.20     13        13
  1     −2.12     12        12
  1     −2.59     11        11
  1     −2.96     10        10
  1     −3.13      9         9
  2     −3.81      8
  2     −3.83      7
  2     −4.10      6
  1     −4.31      5         5
  2     −5.01      4
  2     −5.41      2.5
  2     −5.41      2.5
  2     −5.83      1
                          SUM = 74 = T
Two-sided alternative hypothesis.
R.R.: Reject H0 if T > TU or T < TL.
Table 5 with n1 = n2 = 7: TL = 37, TU = 68.
Conclusion: Since T = 74 > TU = 68, reject H0.
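A sketch checking the rank sum with scipy. The site values here are the ones from the earlier data slide (they differ slightly from the table above in the last decimals, but give the same rank sum T = 74); scipy.stats.mannwhitneyu reports the equivalent Mann-Whitney U statistic:

```python
import numpy as np
from scipy import stats

background = np.array([-2.96, -1.09, -3.13, -2.12, -2.59, -4.31, -1.20])  # population 1
site = np.array([-3.81, -5.83, -5.70, -4.11, -3.83, -5.01, -5.49])        # population 2

ranks = stats.rankdata(np.concatenate([background, site]))  # rank 1 = smallest value
T = ranks[:len(background)].sum()
print(f"T = {T:.0f}")                                       # 74, as in the table

# In recent scipy versions the statistic is U for the first sample, U = T - n1(n1+1)/2 = 46
U, p = stats.mannwhitneyu(background, site, alternative='two-sided')
print(f"U = {U:.0f}, p = {p:.4f}")                          # p well below 0.05: reject H0
```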
Two-Sample-Means-23
Table 5: Wilcoxon Critical Values (critical-value table not reproduced here; see Ott & Longnecker).
Two-Sample-Means-24
Confidence Interval for Δ = μ1 – μ2
Can be constructed in a way analogous to the nonparametric CI for μ based on the Sign Test.
(See §6.3 in Ott.)
Two-Sample-Means-25
Paired Data Situation (§6.4-6.5)
In this set of slides we will:
• Develop confidence intervals and tests for the difference between two means that have been measured on the same or highly related experimental units.
• The underlying population of differences is assumed to
be normally distributed.
• A nonparametric alternative that does not rely on
normality will be discussed (§6.5).
Two-Sample-Means-26
Situation
Two analysts, supposedly of identical abilities, each measure the parts
per million of a certain type of chemical impurity in drinking water. It is
claimed that analyst 1 tends to give higher readings than analyst 2. To
test this theory, each of six water samples is divided and then analyzed
by both analysts separately. The data are as follows:
Water Sample    Analyst 1   Analyst 2   Row mean   Difference
     1             31.4        28.1        29.8        3.3
     2             37.0        37.1        37.1       −0.1
     3             44.0        40.6        42.3        3.4
     4             28.8        27.3        28.1        1.5
     5             59.9        58.4        59.2        1.5
     6             37.6        38.9        38.3       −1.3
Column mean        39.8        38.4                    1.4
Standard dev.     11.19       11.28                    1.85

Data are paired, hence observations are not independent.
Observations in the same row are more likely to be close to each other than are observations between rows.
Two-Sample-Means-27
Paired Data Considerations
Because of the dependence within rows we don’t use the difference
of two means but instead use the mean of the individual differences.
H0: μ1 − μ2 = D0   written as   H0: μd = D0

Let yij represent the ith observation for the jth sample, i = 1,…,n, j = 1,2.
Compute di = yi1 − yi2, the difference in responses for the ith observation, and then proceed as in the one-sample t-test situation.

d̄ = (1/n) Σ_{i=1..n} di   — the sample mean of the differences.

sd = √[ Σ_{i=1..n} (di − d̄)² / (n − 1) ]   — the sample standard deviation of the differences.
Two-Sample-Means-28
Importance of Independence/Dependence on Variance of Difference
If the two populations were independent, the variance of the
difference would be computed using the probability rule:
Var( X - Y ) = Var(X) + Var(Y)
But here the two populations are dependent and the above
rule does not hold.
i.e. we don’t use a pooled variance estimator; we use the variance of the differences.
Hence the standard error of d̄ is sd/√n, not √((s1² + s2²)/n).

In the above example:
sd/√n = 1.85/√6 = 0.76,   whereas   √((1/6)(11.19² + 11.28²)) = 6.49.
Very different!
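A quick numeric check of this comparison (a sketch; the numbers are from the analyst example):

```python
import math

s1, s2, s_d, n = 11.19, 11.28, 1.85, 6

se_paired = s_d / math.sqrt(n)                       # uses the variance of the differences
se_if_independent = math.sqrt((s1**2 + s2**2) / n)   # what we would (wrongly) use if independent

print(round(se_paired, 2), round(se_if_independent, 2))   # 0.76 vs 6.49 -- very different
```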
Two-Sample-Means-29
The paired t-test

H0: μd = D0
HA: 1. μd > D0
    2. μd < D0
    3. μd ≠ D0

Test Statistic:
t = (d̄ − D0) / (sd/√n)

Rejection Region:
1. Reject H0 if t > t_α,n−1
2. Reject H0 if t < −t_α,n−1
3. Reject H0 if |t| > t_α/2,n−1

(1 − α)-level confidence interval:
d̄ ± t_α/2,n−1 · sd/√n
For the Example, t = 1.83 and t_0.05,5 = 2.015, hence we do not reject H0: μd = 0 in testing situation 1, and conclude that there are no significant differences between the analysts.
95% CI: 1.4 ± 1.94, or (−0.54, 3.34).
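A sketch of the paired t-test and confidence interval on the analyst data with scipy (its exact arithmetic gives t ≈ 1.83 and a CI close to the rounded hand calculation):

```python
import numpy as np
from scipy import stats

analyst1 = np.array([31.4, 37.0, 44.0, 28.8, 59.9, 37.6])
analyst2 = np.array([28.1, 37.1, 40.6, 27.3, 58.4, 38.9])

# One-sided paired t-test of H0: mu_d = 0 vs HA: mu_d > 0 (analyst 1 reads higher)
t_stat, p_value = stats.ttest_rel(analyst1, analyst2, alternative='greater')
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")   # t about 1.83, p about 0.06: do not reject

# 95% CI for mu_d from the differences
d = analyst1 - analyst2
half = stats.t.ppf(0.975, len(d) - 1) * d.std(ddof=1) / np.sqrt(len(d))
print(f"95% CI: ({d.mean() - half:.2f}, {d.mean() + half:.2f})")   # roughly (-0.56, 3.33)
```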
Two-Sample-Means-30
Wilcoxon Signed-Rank Test for Paired Data (§6.5)
A nonparametric alternative to the paired t-test when the population distribution of the differences is not normal.
(Requires symmetry about the population median of the differences, M.)
Test Construction:
1. Compute the differences in the pairs of observations.
2. Let D0 be the hypothesized value of M, and subtract D0 from all
the differences.
3. Delete all zero values, and let n be the resulting number of
nonzero values.
4. List the absolute values in increasing order, and assign them
ranks 1,…,n (or average of ranks for ties).
T+ = sum of the positive ranks (T+ = 0 if no positive ranks).
T- = sum of the negative ranks (T- = 0 if no negative ranks).
Two-Sample-Means-31
Situation 1: Small Sample (n ≤ 50)

H0: M = D0 (usually 0).
HA: 1. M > D0
    2. M < D0
    3. M ≠ D0

T.S.: 1. T = T−
      2. T = T+
      3. T = smaller of (T−, T+)

R.R.: 1. Reject if T ≤ T_α,n
      2. Reject if T ≤ T_α,n
      3. Reject if T ≤ T_α/2,n

Critical values T_α,n from Table 6.
Two-Sample-Means-32
Situation 2: Large Sample (n > 50)

H0: M = D0 (usually 0).
HA: 1. M > D0
    2. M < D0
    3. M ≠ D0

T.S.: z = (T+ − μT+) / σT+

where
μT+ = n(n + 1) / 4

σT+² = (1/24) [ n(n + 1)(2n + 1) − (1/2) Σ_{j=1..g} tj(tj − 1)(tj + 1) ]

g = number of distinct ranks assigned to the differences. (If no ties, g = n.)
tj = number of tied ranks in the jth group. (If no ties, tj = 1 for j = 1,…,g, and the summation term is zero.)

R.R.: 1. Reject H0 if z > z_α
      2. Reject H0 if z < −z_α
      3. Reject H0 if |z| > z_α/2

Critical values z_α from Table 1.
Two-Sample-Means-33
Back to the problem.

Difference     −0.1    −1.3     1.5      1.5     3.3    3.4
Rank (sign)   1 (−)   2 (−)   3.5 (+)  3.5 (+)  5 (+)  6 (+)

n = 6
T+ = 3.5 + 3.5 + 5 + 6 = 18
T− = 1 + 2 = 3
T = min(T+, T−) = 3
For HA: M > 0 (1st case):
• the test statistic is T- = 3;
• the critical value is T0.05,6 = 2.
Since T− = 3 is not ≤ 2, we do not reject H0 and conclude that analyst 1 does not tend to give higher readings than analyst 2.
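A sketch computing T+ and T− with scipy.stats.rankdata, so the tied ranks match the table above (scipy.stats.wilcoxon could also be used, but the hand ranking is shown here to mirror the slide):

```python
import numpy as np
from scipy import stats

d = np.array([3.3, -0.1, 3.4, 1.5, 1.5, -1.3])   # analyst 1 minus analyst 2

ranks = stats.rankdata(np.abs(d))   # ranks of |d|, with average ranks for ties
T_plus = ranks[d > 0].sum()
T_minus = ranks[d < 0].sum()
print(T_plus, T_minus)              # 18.0 and 3.0, as in the table

# For HA: M > 0 the test statistic is T-; with the slide's critical value
# T_0.05,6 = 2, T- = 3 is not <= 2, so H0 is not rejected.
```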
Two-Sample-Means-34
Illustration of the calculations, supposing the problem were of a large-sample nature (not actually the case here, since n = 6).

Difference     −0.1    −1.3     1.5      1.5     3.3    3.4
Rank (sign)   1 (−)   2 (−)   3.5 (+)  3.5 (+)  5 (+)  6 (+)
μT+ = n(n + 1)/4 = 6(7)/4 = 10.5

σT+² = (1/24) [ 6(7)(2·6 + 1) − (1/2)(2)(2 − 1)(2 + 1) ] = (1/24)(546 − 3) = 543/24 = 22.6,
so σT+ = 4.8.
For HA: M > 0, test statistic is
Z = (T+ – μT) / σT = (18 – 10.5) / 4.8 = 1.6
Since z = 1.6 < z0.05 = 1.645, we do not reject H0.
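A sketch of the same normal-approximation arithmetic (again only an illustration, since n = 6 is far below the large-sample range):

```python
import math

n, T_plus = 6, 18.0
tie_groups = [2]                     # one group of two tied ranks (the two 1.5 differences)

mu_T = n * (n + 1) / 4
var_T = (n * (n + 1) * (2 * n + 1)
         - 0.5 * sum(t * (t - 1) * (t + 1) for t in tie_groups)) / 24
z = (T_plus - mu_T) / math.sqrt(var_T)
print(round(mu_T, 1), round(var_T, 1), round(z, 2))   # 10.5, 22.6, 1.58
```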
Two-Sample-Means-35
Concluding Comments
The paired measurements (samples) case can be extended to more
than two measurements; called repeated measurements.
When repeated measurements are taken on an individual, we are in
the same situation as with two paired samples, that is, the repeated
measurements on an individual are expected to be more correlated
than measurements among individuals.
Solutions to the repeated measurements case cannot follow the simple solution for two dependent samples. The final solution involves specifying not only what we expect to happen to the means across the sampling “times” but also the structure of the correlations between sampling times.
It is advantageous to design a paired-data experiment rather than an independent-samples one. This helps to eliminate the confounding effect (masking of treatment differences) that sources of variation other than the treatments have on the experimental units.
Two-Sample-Means-36