Independent Samples: Comparing Means

Two Independent Random Samples
Population 1: take a simple random sample of size $n_1$.
Population 2: take a simple random sample of size $n_2$.
Population 1 mean = $\mu_1$
Population 2 mean = $\mu_2$
Population 1 standard deviation = $\sigma_1$
Population 2 standard deviation = $\sigma_2$
Assumptions for a Two Independent Samples Design
We have a simple random sample of $n_1$ observations from a $N(\mu_1, \sigma_1)$ population. We have a simple random sample of $n_2$ observations from a $N(\mu_2, \sigma_2)$ population. The two random samples are independent of each other.
Notation in Two Independent Samples Design
$n_1$ = sample size for first sample (number of observations from Population 1).
$n_2$ = sample size for second sample (number of observations from Population 2).
$\bar{x}_1$ = observed sample mean for the first sample.
$\bar{x}_2$ = observed sample mean for the second sample.
$s_1$ = observed sample standard deviation for the first sample.
$s_2$ = observed sample standard deviation for the second sample.
Testing the Difference Between Two Means of Independent Samples Design
There are actually two different options for the use of t tests. One
option is used when the variances of the populations are not equal, and
the other option is used when the variances are equal. To determine
whether the two population variances are equal, the researcher
can use an F test.
Note, however, that not all statisticians are in agreement about using
the F test before using the t test. Some believe that conducting the F
and t tests at the same level of significance will change the overall level
of significance of the t test. Their reasons are beyond the scope of this
course.
Assumptions:
- Both populations are normally distributed
- The samples are obtained independently

If the assumptions are not satisfied, nonparametric methods are used.

If the assumptions are satisfied, test whether the variances of the two normally distributed populations are different, using an F-test:
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$

Did not reject $H_0$: Pooled t-test
To test
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
use the test statistic
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Reject $H_0$: Non-pooled t-test
To test
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
use the test statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
with approximate d.f.
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
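The pooled and non-pooled test statistics, and the approximate degrees of freedom for the non-pooled version, can be sketched in Python as follows. This is a minimal illustration, not part of the course materials: the function names and the summary statistics at the bottom are hypothetical, and `d0` is the hypothesized value of the difference between the two population means.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled estimate s_p of a common population standard deviation."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t(n1, xbar1, s1, n2, xbar2, s2, d0=0.0):
    """Pooled t statistic; compare with a t(n1 + n2 - 2) distribution."""
    sp = pooled_sd(n1, s1, n2, s2)
    return (xbar1 - xbar2 - d0) / (sp * math.sqrt(1 / n1 + 1 / n2))

def nonpooled_t(n1, xbar1, s1, n2, xbar2, s2, d0=0.0):
    """Non-pooled t statistic together with its approximate d.f."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # estimated variances of the two sample means
    t = (xbar1 - xbar2 - d0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Hypothetical summary statistics, chosen only to illustrate the calls.
t_pooled = pooled_t(15, 50.0, 4.0, 12, 47.0, 9.0)
t_np, df_np = nonpooled_t(15, 50.0, 4.0, 12, 47.0, 9.0)
print(f"pooled t = {t_pooled:.3f} with {15 + 12 - 2} d.f.")
print(f"non-pooled t = {t_np:.3f} with about {df_np:.1f} d.f.")
```

When the two sample sizes are equal, the two statistics coincide, and when the two sample standard deviations are also equal the approximate d.f. reduces to $n_1 + n_2 - 2$.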
Let’s Do It! 1
Which Version of a Two Independent Samples Test to Use?
Each scenario presents a picture of the distributions of the two
populations being compared. Based on these distributions, determine
which version of the two-independent samples test to use.
Version of Test:
(select one)
Pooled t-test
Nonpooled t-test
Nonparametric test
Explain:
Version of Test:
(select one)
Pooled t-test
Nonpooled t-test
Nonparametric test
Explain:
Version of Test:
(select one)
Pooled t-test
Nonpooled t-test
Nonparametric test
Explain:
Two Independent Samples Pooled t-Test
We are interested in comparing the population means $\mu_1$ and $\mu_2$, so the
parameter of interest is the difference $\mu_1 - \mu_2$.
Distribution of the Standardized $\bar{X}_1 - \bar{X}_2$ for the Two Independent
Samples Scenario when $\sigma_1 = \sigma_2$
The quantity
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}},$$
has a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
Two Independent Samples Pooled t-Test
Assumptions:
The first sample is a random sample from a normal
population with mean $\mu_1$. The second sample is a random sample from
a normal population with mean $\mu_2$. The two populations have a common
standard deviation $\sigma$, and the two samples are independent.
Normality is less crucial if the sample sizes $n_1$ and $n_2$ are large.
Hypotheses: $H_0: \mu_1 - \mu_2 = d_0$ versus $H_1: \mu_1 - \mu_2 \neq d_0$, or
$H_0: \mu_1 - \mu_2 = d_0$ versus $H_1: \mu_1 - \mu_2 > d_0$, or
$H_0: \mu_1 - \mu_2 = d_0$ versus $H_1: \mu_1 - \mu_2 < d_0$.
The significance level $\alpha$ to be used is determined.
Data:
The two sets of data from which the two sample means $\bar{x}_1$
and $\bar{x}_2$, and the two sample standard deviations $s_1$ and $s_2$, can
be computed.
Observed Test Statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2 - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
and the t-distribution used has d.f. = $n_1 + n_2 - 2$.
p-value:
We find the p-value for the test using the t(n1+ n2 - 2)
distribution. The direction of extreme will depend on how the
alternative hypothesis is expressed.
Decision:
A p-value less than $\alpha$ leads to rejection of $H_0$.
Confidence Interval:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
and $t^*$ is an appropriate percentile of
the $t(n_1 + n_2 - 2)$ distribution.
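The confidence interval can be computed in a few lines once $t^*$ has been looked up. The sketch below is a hedged illustration only: the function name `pooled_ci` and the summary statistics are hypothetical, and the value $t^* = 2.101$ is the 97.5th percentile of the t(18) distribution (for a 95% interval), taken from a t-table rather than computed.

```python
import math

def pooled_ci(n1, xbar1, s1, n2, xbar2, s2, t_star):
    """Pooled confidence interval for mu1 - mu2, given the percentile t*."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    margin = t_star * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Hypothetical data: two samples of size 10, so d.f. = 18 and t* = 2.101.
lo, hi = pooled_ci(10, 30.0, 4.0, 10, 27.0, 4.0, t_star=2.101)
print(f"95% CI for mu1 - mu2: ({lo:.2f}, {hi:.2f})")
```

In this made-up case the interval contains 0, which matches a two-sided test at the 5% level failing to reject $H_0: \mu_1 - \mu_2 = 0$.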
EXAMPLE
Comparing Two Headache Treatments
Medical researchers are comparing two treatments for
migraine headaches. They wish to perform a double-blind experiment to assess if Treatment 2 (the new
treatment) is significantly better than Treatment 1 (the standard
treatment) using a 5% significance level.
The data
$n_1 = 10$, $\bar{x}_1 = 22.6$, $s_1 = 5.2$
$n_2 = 10$, $\bar{x}_2 = 19.4$, $s_2 = 4.9$
(a)
State the appropriate hypotheses to be tested. Keep in mind that smaller
responses imply a better treatment and Treatment 2 is the new treatment.
$H_0: \mu_1 - \mu_2 = 0$ vs $H_1: \mu_1 - \mu_2 > 0$.
(b)
State the conditions required for performing a two independent samples
pooled t-test.
The first sample is a random sample from a normal population with mean
$\mu_1$ and standard deviation $\sigma$. The second sample is a random sample
from a normal population with mean $\mu_2$ but the same standard deviation $\sigma$.
The two samples are independent.
(c)
The mean time to relief for the Treatment 1 subjects was 22.6 minutes,
with a standard deviation of 5.2 minutes. The mean time to relief for the
Treatment 2 group was 19.4 minutes, with a standard deviation of 4.9
minutes. Recall that one of the assumptions for performing this test is
equal population standard deviations. However, 5.2 is not equal to 4.9.
Does this imply that the pooled test will not be valid?
Even though the sample standard deviations of 5.2 and 4.9 are not equal,
this does not mean the equal population standard deviations assumption
has been violated. Examining the relative magnitude of the two sample
standard deviations is a quick check for this assumption.
(d)
Give an estimate of the common population standard deviation.
An estimate of the equal population standard deviation is
$$s_p = \sqrt{\frac{(10 - 1)\,5.2^2 + (10 - 1)\,4.9^2}{10 + 10 - 2}} \approx 5.05$$
(e)
Compute the pooled t-test statistic.
The observed pooled t-test statistic is
$$t = \frac{22.6 - 19.4}{5.05 \sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} \approx 1.416$$
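The arithmetic in parts (d) and (e) can be checked with a short Python sketch. It uses only the summary statistics given in the example; the variable names are illustrative.

```python
import math

# Summary statistics from the headache example.
n1, xbar1, s1 = 10, 22.6, 5.2   # Treatment 1 (standard)
n2, xbar2, s2 = 10, 19.4, 4.9   # Treatment 2 (new)

# Pooled estimate of the common population standard deviation.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Standard error of the difference in sample means, then the t statistic (d0 = 0).
se = sp * math.sqrt(1 / n1 + 1 / n2)
t = (xbar1 - xbar2) / se

print(f"s_p = {sp:.2f}")   # about 5.05
print(f"t   = {t:.3f}")    # about 1.416
```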
The value of 1.416 means that we observed two sample means that are
about 1.4 standard errors apart. Is this a large enough difference to reject the
null hypothesis at a 5% significance level?
(f)
Find the corresponding p-value.
The p-value is the probability of observing a test statistic as large as or
larger than the observed value of 1.416, computed under the null
distribution, which is the t-distribution with 10 + 10 - 2 = 18 degrees of
freedom, t(18).
Using the TI:
1. Using the tcdf( function.
Using the tcdf( function on the TI we have:
p-value = $P(T \geq 1.416)$ = tcdf(1.416, E99, 18) = 0.0869.
[Figure: t(18) density curve; the shaded area to the right of 1.416 is the p-value.]
2. Using the 2-SampTTest function under STAT TESTS.
In the TESTS menu located under the STAT button, we select the 4:2-SampTTest option. With the sample means of 22.6 and 19.4, the sample
standard deviations of 5.2 and 4.9, and the sample sizes of 10 and 10, we can
use the Stats option of this test. The steps and corresponding input and output
screens are shown. Notice that you must specify Yes under the Pooled option.
The No Pooled option is discussed at the end of this section as another version
of our test.
p-value = $P(T \geq 1.416)$ = 0.08688.
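The calculator result can be cross-checked without a TI. The following is a minimal sketch using only the Python standard library: it integrates the t density numerically with Simpson's rule instead of calling a statistics package, and the helper names `t_pdf` and `t_upper_tail` are illustrative, not from the course materials.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t), by Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # approximates P(0 <= T <= |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

p_value = t_upper_tail(1.416, 18)
print(f"p-value = {p_value:.4f}")
```

The printed value should agree with the tcdf( result of about 0.0869 to roughly three decimal places.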
(g)
State the decision and conclusion using a 5% significance level.
At the 5% significance level we cannot reject the null hypothesis. The
claim that Treatment 1 is as effective as Treatment 2, in terms of the mean
response, cannot be rejected. Based on the data, it appears the two
treatments are equally effective. This does not mean that we are not going
to use the new treatment. It might be that the new treatment is less
expensive or has fewer side effects for patients, in which case, since both
treatments are equivalent in terms of time to relief, it may be reasonable
to use the new treatment.
Let’s Do It!
                             Drug 1    Drug 2
Sample Size                    12        14
Sample Mean                    5.6       5.0
Sample Standard Deviation      1.3       1.8
(a)
Assume that the equal population variances assumption and the
independent samples assumption are satisfied. Suppose we can assume each sample is
representative of the larger population of potential drug users. One more
assumption is required regarding the populations. What is that
assumption?
(d)
Is the difference between the mean cholesterol reduction for Drug 1 and
the mean cholesterol reduction for Drug 2 statistically significant at the
5% level?
Homework, page 339: 11, 12, 13, 29, 30, 40, 47 (assume
variances are equal for all problems)