Chapter 18 Two-sample Problems

So far we have looked at problems that involve one sample. For these we
developed techniques to draw inference about the (unknown) mean of the population
from which the sample is drawn.
Issue: In real life, often we have two samples from two independent populations and
we want to compare the two population means using the two samples. Two typical
situations are:
1) We want to compare responses to two treatments.
For example: Compare cholesterol levels between a placebo group of subjects and a
group receiving a new cholesterol-lowering drug.
2) We want to compare characteristics of two populations.
For example: Compare scores of males and females on a positive attitude test.
The data (two samples) for the above situations can arise from comparative
experiments.
Goal: Develop techniques to draw inference for comparing the two (unknown)
population means.
Notation:
μ1 – population mean of the first population
μ2 – population mean of the second population
σ1 – population standard deviation of the first population
σ2 – population standard deviation of the second population
n1 – size of the SRS drawn from the first population
n2 – size of the SRS drawn from the second population
Assumptions for comparing two population means:
1) The two samples are SRSs from two distinct populations.
2) Samples are independent!!!
- the responses of subjects in one sample have no influence on the responses of the
subjects in the other sample.
3) Both populations are normally distributed, i.e., population 1 is N(μ1, σ1) and
population 2 is N(μ2, σ2).
- μ1, μ2, σ1 and σ2 are unknown (realistic situation!)
The parameter of interest is μ1 - μ2.
We want to develop a CI for μ1 - μ2.
An obvious estimate of μ1 - μ2 is x̄1 - x̄2, the difference between the two sample means.
Idea: Large differences between the sample means suggest the population means are
likely to be different.
However, large differences can arise just by chance if the observations vary a great
deal.
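So, the variability needs to be accounted for. It is given by the standard deviation
of x̄1 - x̄2, which is √(σ1²/n1 + σ2²/n2).
Since the population standard deviations σ1, σ2 are unknown, we replace them with
the sample standard deviations s1, s2. This gives the standard error of x̄1 - x̄2:
SE = √(s1²/n1 + s2²/n2)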
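Recall from Chapter 17 the form of the CI for μ (when σ is unknown):
x̄ ± t* s/√n
In general terms it has the form:
estimate ± t* × (standard error of the estimate)
Analogously, using the general form of the t-CI, the CI for μ1 - μ2 is given by
(x̄1 - x̄2) ± t* √(s1²/n1 + s2²/n2)
where x̄1, x̄2 are the sample means and s1, s2 are the sample standard deviations.
Here t* is the upper α/2 critical value for the tk distribution, where k = min(n1–1, n2–1)
is the degrees of freedom. This CI has level at least (1-α).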
Ex: Influence of gene aP2
A geneticist at a Medical Center is studying the influence of gene aP2 on diabetes. She
compares the level of insulin in a random sample of 11 normal mice with another
random sample of 10 mice whose gene aP2 is removed. The following results (in
ng/ml) are obtained:
Group          Mean    Std. dev.
Normal         5.90    2.850
aP2 removed    0.75    0.632
(a) Compute a 95% CI for the difference in the mean insulin levels of the normal mice
and aP2 removed mice.
(b) Compute a 90% CI for the difference in the mean insulin levels of the normal mice
and aP2 removed mice.
(c) Suppose the geneticist wants to test whether there is a significant difference in
the mean insulin levels of the normal mice and aP2 removed mice at the 5% level.
Perform the test by using confidence intervals with the four-step process.
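One way to carry out the arithmetic for parts (a) to (c) is sketched below. It is a minimal sketch under stated assumptions, not an official solution: the helper name two_sample_t_ci is hypothetical, and the critical values t* = 2.262 (95%) and t* = 1.833 (90%) are taken from a standard t table for k = min(11–1, 10–1) = 9 degrees of freedom rather than computed.

```python
import math

def two_sample_t_ci(mean1, s1, n1, mean2, s2, n2, t_star):
    """Conservative two-sample t CI for mu1 - mu2 (hypothetical helper).

    t_star must be looked up by the caller for k = min(n1 - 1, n2 - 1)
    degrees of freedom; this sketch does not compute it.
    """
    diff = mean1 - mean2                       # estimate of mu1 - mu2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)    # standard error of the difference
    margin = t_star * se
    return diff - margin, diff + margin

# Summary statistics (mean, std. dev., n) from the table, in ng/ml.
normal = (5.90, 2.850, 11)
removed = (0.75, 0.632, 10)

k = min(normal[2] - 1, removed[2] - 1)         # conservative degrees of freedom

# Assumption: t* values read from a t table for k = 9.
ci95 = two_sample_t_ci(*normal, *removed, t_star=2.262)   # part (a)
ci90 = two_sample_t_ci(*normal, *removed, t_star=1.833)   # part (b)

# Part (c): H0: mu1 - mu2 = 0 versus Ha: mu1 - mu2 != 0 at the 5% level.
# Reject H0 when 0 falls outside the 95% CI.
reject_h0 = not (ci95[0] <= 0 <= ci95[1])

print(f"k = {k}")
print(f"95% CI: ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"90% CI: ({ci90[0]:.2f}, {ci90[1]:.2f})")
print("Reject H0 at 5% level:", reject_h0)
```

With these inputs the intervals come out to roughly (3.15, 7.15) and (3.53, 6.77) ng/ml. Since 0 lies outside the 95% interval, the sketch reports a significant difference at the 5% level.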
Key points to check in solving problems that involve drawing inference about
population mean(s):
- Check if the question is about one sample or two samples.
- If it is about one sample, check if the population standard deviation σ is given.
  o If σ is given, use the z CI.
  o If σ is not given, use the one-sample t-CI.
- If it is about two samples, check if the samples are independent or matched pairs
  (not independent).
  o If the samples are matched pairs, apply one-sample t procedures to the
    differences of observed responses.
  o If the samples are independent, apply the two-sample t CI.
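The checklist above can be read as a small decision tree. The sketch below encodes it as a function; the name choose_procedure and its arguments are hypothetical and serve only to illustrate the order of the checks.

```python
def choose_procedure(num_samples, sigma_known=False, matched_pairs=False):
    """Return the CI procedure suggested by the checklist (illustrative only)."""
    if num_samples == 1:
        # One sample: the choice depends on whether sigma is given.
        return "z CI" if sigma_known else "one-sample t CI"
    if num_samples == 2:
        # Two samples: matched pairs reduce to a one-sample problem.
        if matched_pairs:
            return "one-sample t CI on the differences"
        return "two-sample t CI"
    raise ValueError("the checklist covers only one or two samples")

print(choose_procedure(1, sigma_known=True))
print(choose_procedure(1))
print(choose_procedure(2, matched_pairs=True))
print(choose_procedure(2))
```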
Ex: The diameter of Jupiter is measured 100 times independently by a new unbiased
process. Using these 100 measurements, a 99% CI for the true diameter is computed
to be (88,707 miles, 88,733 miles). Is there evidence at the 1% level that the true
diameter of Jupiter is not 88,720 miles? Use the four-step process.
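The link between a confidence interval and a two-sided test can be sketched as a single membership check: a two-sided test at level α rejects H0 exactly when the hypothesized value falls outside the level (1-α) CI. The function name below is hypothetical; the interval and the hypothesized diameter are the ones given in the exercise.

```python
def rejects_null(ci, null_value):
    """Two-sided test via a CI: reject H0 when null_value is outside the interval."""
    low, high = ci
    return not (low <= null_value <= high)

ci99 = (88_707, 88_733)   # 99% CI for the true diameter, in miles
mu0 = 88_720              # H0: mu = 88,720 versus Ha: mu != 88,720

reject_h0 = rejects_null(ci99, mu0)
print("Reject H0 at 1% level:", reject_h0)
```

Here 88,720 lies inside the 99% interval, so the check returns False: the data give no evidence at the 1% level against a true diameter of 88,720 miles.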
Ex. Suppose a manufacturer of printers for personal computers wishes to estimate the
mean number of characters printed before the printhead fails. The printer
manufacturer tests 15 printheads and records the number of characters printed until
failure for each. These 15 measurements (in millions of characters) are listed below.
1.13  1.55  1.43  0.92  1.25
1.36  1.32  0.85  1.07  1.48
1.20  1.33  1.18  1.22  1.29
(a) Construct a 99% confidence interval for the mean number of characters
printed before the printhead fails.
(b) The store manager is interested in knowing whether the mean number of characters
printed before the printhead fails is one million or not. State the appropriate
hypotheses. Using the above CI, what do you conclude? Use the four-step
process.
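Because σ is not given here, the one-sample t-CI from Chapter 17 applies. The sketch below computes it from the 15 measurements using only the Python standard library. It assumes t* = 2.977, the t-table value for 99% confidence with n-1 = 14 degrees of freedom, and it treats the hypotheses in (b) as H0: μ = 1 versus Ha: μ ≠ 1 (in millions of characters).

```python
import math
import statistics

# Characters printed before failure, in millions (the 15 measurements above).
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]

n = len(data)
xbar = statistics.mean(data)      # sample mean
s = statistics.stdev(data)        # sample standard deviation (divides by n - 1)

t_star = 2.977                    # assumption: t-table value, df = 14, 99% confidence
margin = t_star * s / math.sqrt(n)
ci99 = (xbar - margin, xbar + margin)

# Part (b): reject H0: mu = 1 at the 1% level when 1 is outside the 99% CI.
reject_h0 = not (ci99[0] <= 1.0 <= ci99[1])

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}")
print(f"99% CI: ({ci99[0]:.2f}, {ci99[1]:.2f})")
print("Reject H0 at 1% level:", reject_h0)
```

The interval is roughly (1.09, 1.39) million characters. It does not contain 1, so under the stated hypotheses the sketch rejects H0 at the 1% level.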
Ex. The Chapin Social Insight Test is a psychological test designed to measure how
accurately a person appraises other people. The possible scores on the test range from
0 to 41. During the development of the Chapin test, it was given to several different
groups of people. Here are the results for male and female college students majoring
in the liberal arts:
Group   Sex      n     x̄       s
1       Male     133   25.34   5.05
2       Female   162   24.94   5.44
Do these data support the contention that female and male students differ in average
social insight at significance level α = 0.1? Use the four-step process.
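A two-sided test at α = 0.1 corresponds to a 90% two-sample t-CI for μ1 - μ2. The sketch below computes that interval from the summary table. One assumption needs flagging: the conservative degrees of freedom are k = min(133–1, 162–1) = 132, which most printed t tables do not list, so the sketch uses t* = 1.660, the table value for df = 100 (the nearest smaller tabulated df, which makes the interval slightly wider).

```python
import math

# Summary statistics from the Chapin Social Insight Test table.
n1, xbar1, s1 = 133, 25.34, 5.05   # group 1: male
n2, xbar2, s2 = 162, 24.94, 5.44   # group 2: female

k = min(n1 - 1, n2 - 1)            # conservative degrees of freedom
t_star = 1.660                     # assumption: t-table value for df = 100, 90% confidence

diff = xbar1 - xbar2                          # estimate of mu1 - mu2
se = math.sqrt(s1**2 / n1 + s2**2 / n2)       # standard error of the difference
ci90 = (diff - t_star * se, diff + t_star * se)

# H0: mu1 = mu2 versus Ha: mu1 != mu2; reject when 0 is outside the 90% CI.
reject_h0 = not (ci90[0] <= 0 <= ci90[1])

print(f"k = {k}, SE = {se:.3f}")
print(f"90% CI: ({ci90[0]:.2f}, {ci90[1]:.2f})")
print("Reject H0 at alpha = 0.1:", reject_h0)
```

The interval is roughly (-0.62, 1.42). It contains 0, so under these assumptions the data do not show a difference in average social insight at α = 0.1.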