Chapter 11 Notes

Chapter 11: Comparing Two Populations or Treatments
11.1: Inferences Concerning the Difference Between Two Population or Treatment Means Using Independent Samples
Suppose we have a population of adult men with a mean height of 71 inches and standard deviation
of 2.5 inches. We also have a population of adult women with a mean height of 65 inches and
standard deviation of 2.3 inches. Assume heights are normally distributed.
Suppose we take a random sample of 30 men and a random sample of 25 women from their
respective populations and calculate the difference in their sample mean heights (men’s mean height – women’s mean height).
If we did this many times, what would the distribution of differences be like?
The sampling distribution is normally distributed with:
What is the probability that the difference in mean heights of a random sample of 30 men and a
random sample of 25 women is less than 5 inches?
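One way to check the height example numerically is sketched below. It assumes the usual result for independent samples (the mean of the difference is the difference of the population means, and the variances of the two sample means add). The variable names are illustrative and do not come from the notes.

```python
from math import sqrt
from statistics import NormalDist

# Population parameters and sample sizes stated in the notes
mu_men, sigma_men, n_men = 71.0, 2.5, 30
mu_women, sigma_women, n_women = 65.0, 2.3, 25

# Mean and standard deviation of the sampling distribution of
# (sample mean for men) - (sample mean for women)
mean_diff = mu_men - mu_women
sd_diff = sqrt(sigma_men**2 / n_men + sigma_women**2 / n_women)

# P(difference in sample means < 5 inches) under the normal model
prob_less_than_5 = NormalDist(mean_diff, sd_diff).cdf(5.0)

print(f"mean = {mean_diff:.1f}, sd = {sd_diff:.3f}, P(diff < 5) = {prob_less_than_5:.4f}")
```

The standard deviation comes out near 0.65 inches, so a difference below 5 inches is about 1.5 standard deviations below the mean of 6.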
Properties of the Sampling Distribution of 𝒙̅𝟏 − 𝒙̅𝟐
If the random samples on which 𝑥̅1 and 𝑥̅2 are based are selected independently of one another,
then:
1.
2.
3. If n1 and n2 are both large or the population distributions are (at least approximately) normal, 𝑥̅1
and 𝑥̅2 each have (at least approximately) normal distributions. This implies that the sampling
distribution of 𝑥̅1 − 𝑥̅2 is also (approximately) normal.
These properties imply that 𝑥̅1 − 𝑥̅2 can be standardized:
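For reference, the standard results for independently selected samples can be written as follows. This is a sketch of the usual textbook form, which the blank items above presumably ask for; the notes themselves leave these lines to be filled in.

```latex
\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2,
\qquad
\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},
\qquad
z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}
```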
Two-Sample t Test for Comparing Two Populations
Null Hypothesis:
Test Statistic:
The appropriate df for the two-sample t test is:
Alternative Hypothesis:                    P-value:
Ha: 𝜇1 − 𝜇2 > hypothesized value          Area under the appropriate t curve to the right of the computed t
Ha: 𝜇1 − 𝜇2 < hypothesized value          Area under the appropriate t curve to the left of the computed t
Ha: 𝜇1 − 𝜇2 ≠ hypothesized value          2(area to the right of the computed t) if t is positive, or
                                           2(area to the left of the computed t) if t is negative
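The blank lines in this summary are usually completed with the unpooled two-sample t statistic and its approximate degrees of freedom, shown below. This is the standard form and is offered as an assumption about what the handout intends; the symbols V1 and V2 are shorthand introduced here.

```latex
H_0:\ \mu_1 - \mu_2 = \text{hypothesized value}
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - \text{hypothesized value}}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\qquad
df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}},
\quad V_1 = \frac{s_1^2}{n_1},\ V_2 = \frac{s_2^2}{n_2}
```

When a t table is used, the computed df is commonly rounded down to an integer.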
Another Way to Write Hypothesis Statements:
𝐻0 : 𝜇1 − 𝜇2 = 0
𝐻𝑎 : 𝜇1 − 𝜇2 < 0
𝐻𝑎 : 𝜇1 − 𝜇2 > 0
𝐻𝑎 : 𝜇1 − 𝜇2 ≠ 0
Assumptions:
1.
2.
When comparing two treatment groups, use the following assumptions:
1.
2.
Are women still paid less than men for comparable work? A study was carried out in which salary
data was collected from a random sample of men and from a random sample of women who worked
as purchasing managers and who were subscribers to Purchasing magazine. Annual salaries (in
thousands of dollars) appear below (the actual sample sizes were much larger). Use α = 0.05 to
determine if there is convincing evidence that the mean annual salary for male purchasing
managers is greater than the mean annual salary for female purchasing managers.
Men:    81  69  81  76  76  74  69  76  79  65
Women:  78  60  67  61  62  73  71  58  68  48
Hypotheses:
Assumptions:
Test Statistic and P-Value
Conclusion
Use your calculator: Stat, Tests, 2-SampTTest
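As a cross-check on the calculator, the test statistic and degrees of freedom for the salary data can be computed by hand. The sketch below uses only the Python standard library and assumes the unpooled two-sample t procedure. The helper name `two_sample_t` is hypothetical. The P-value is left to a t table or the calculator because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Annual salaries (thousands of dollars) from the notes
men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]

def two_sample_t(sample1, sample2, hypothesized_diff=0.0):
    """Return (t, df) for the unpooled two-sample t test."""
    v1 = variance(sample1) / len(sample1)  # s1^2 / n1
    v2 = variance(sample2) / len(sample2)  # s2^2 / n2
    t = (mean(sample1) - mean(sample2) - hypothesized_diff) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (len(sample1) - 1) + v2**2 / (len(sample2) - 1))
    return t, df

t_stat, df = two_sample_t(men, women)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

Because Ha is 𝜇1 − 𝜇2 > 0 here, the P-value is the area to the right of the computed t under the t curve with the df rounded down.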
The Two-Sample t Confidence Interval for the Difference Between Two Population or Treatment Means
The general formula for a confidence interval for 𝜇1 − 𝜇2 when:
1) The two samples are independently selected random samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or the population distributions are (at least
approximately) normal.
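Under these conditions the interval usually takes the form below. This is the standard unpooled formula, shown as a sketch. The t critical value is based on the same approximate df used for the two-sample t test, rounded down to an integer.

```latex
(\bar{x}_1 - \bar{x}_2) \;\pm\; (t \text{ critical value})
\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
```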
In a study on food intake after sleep deprivation, men were randomly assigned to one of two
treatment groups. The experimental group were required to sleep only 4 hours on each of two
nights, while the control group were required to sleep 8 hours on each of two nights. The amount of
food intake (Kcal) on the day following the two nights of sleep was measured. Compute a 95%
confidence interval for the true difference in the mean food intake for the two sleeping conditions.
4-hour sleep:  3585  4470  3068  5338  2221  4791  4435  3099
               3187  3901  3868  3869  4878  3632  4518
8-hour sleep:  4965  3918  1987  4993  5220  3653  3510  3338
               4100  5792  4547  3319  3336  4304  4057
Find the mean and standard deviation for each treatment:
Verify Assumptions:
Interval:
Interpret:
Use your calculator: Stat, Tests, 2-SampTInt
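The interval can also be worked out step by step, as sketched below. Two assumptions are made. First, the 30 values split into 15 per treatment as listed in the code. Second, the critical value 2.052 is taken from a t table for 95% confidence with 27 df, because the Python standard library has no t distribution.

```python
from math import floor, sqrt
from statistics import mean, stdev

# Food intake (kcal); grouping into treatments is an assumption about the table
four_hour = [3585, 4470, 3068, 5338, 2221, 4791, 4435, 3099,
             3187, 3901, 3868, 3869, 4878, 3632, 4518]
eight_hour = [4965, 3918, 1987, 4993, 5220, 3653, 3510, 3338,
              4100, 5792, 4547, 3319, 3336, 4304, 4057]

n1, n2 = len(four_hour), len(eight_hour)
v1 = stdev(four_hour) ** 2 / n1   # s1^2 / n1
v2 = stdev(eight_hour) ** 2 / n2  # s2^2 / n2

diff = mean(four_hour) - mean(eight_hour)
std_error = sqrt(v1 + v2)

# Approximate df for the unpooled procedure, rounded down
df = floor((v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))

T_CRITICAL_95_DF27 = 2.052  # table value; valid only if df works out to 27
lower = diff - T_CRITICAL_95_DF27 * std_error
upper = diff + T_CRITICAL_95_DF27 * std_error

print(f"df = {df}, difference = {diff:.2f}, interval = ({lower:.1f}, {upper:.1f})")
```

An interval that contains 0 would mean the data do not rule out equal mean food intake under the two sleeping conditions.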
Pooled t Test
•
• Combines information from both samples to create a “pooled” estimate of the common
variance, which is used in place of the two sample standard deviations
• Is not widely used due to its sensitivity to any departure from the equal variance assumption
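For comparison with the unpooled test, the pooled estimate and the resulting statistic are usually written as below. This is the standard form, shown as a sketch; it assumes the two populations share a common variance.

```latex
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - \text{hypothesized value}}
         {s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
df = n_1 + n_2 - 2
```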
1. For each situation below, determine if the samples are independent or dependent.
a. A sample of 5 CDs is treated with coating A, and a second sample of 7 CDs is treated with
coating B. Each CD is then tested for strength.
b. Does environment affect intelligence? Researchers identified 25 sets of identical twins who
were raised by different parents. Each twin was given an IQ test.
c. A sample of 10 people who suffer from severe headaches is given drug A for their pain for one
month and drug B another month.
d. A total of 20 people enter a study to determine the benefits of a new drug designed to reduce
cholesterol. The new drug is given to 10 people, while a placebo is given to the other 10. After
a period of three months, the reduction in cholesterol is measured for each person.
2. The early detection of danger is important for the survival of animals. In a
field experiment in Costa Rica, investigators located and directly
approached black iguanas; that is, they walked straight towards them. Two
treatments were randomly assigned to the individual iguanas. In one
treatment the investigator gazed at the iguana while approaching,
"maintaining eye contact." In a second treatment, the investigator did not
gaze at the iguana while approaching. The outcome measure was the
distance of the investigator from the iguana when it decided to run away.
The researchers believe that eye contact is noticed by the iguana, leading to
a longer approach distance. Data from this experiment are shown in the
table below. An initial analysis of the data established the plausibility
that the distributions of approach distances are approximately normal. Do
these data provide evidence of a difference in mean approach distances for
black iguanas when eye contact is maintained and when it is not? Provide appropriate
statistical justification for your conclusion.
Eye Contact    No Eye Contact
3.19           2.09
2.34           1.96
2.45           1.85
2.71           2.45
1.90           2.77
2.12           2.55
2.56           2.44
3.41           2.80
2.41           3.27
2.66           2.01
2.86           3.49
2.44           2.75
11.1 Homework: 1, 4, 6, 7, 10, 12, 13, 15, 18
11.2: Inferences Concerning the Difference Between Two Population or Treatment Means Using Paired Samples
Summary of the Paired t-test for Comparing Two Population or Treatment Means
Null Hypothesis:
Test Statistic:
Alternative Hypothesis:             P-value:
Ha: 𝜇𝑑 > hypothesized value        Area to the right of the calculated t
Ha: 𝜇𝑑 < hypothesized value        Area to the left of the calculated t
Ha: 𝜇𝑑 ≠ hypothesized value        2(area to the right of the calculated t) if t is positive, or
                                    2(area to the left of the calculated t) if t is negative
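The blank lines in this summary are usually completed with the one-sample t statistic applied to the n sample differences, as below. This is the standard form, shown as a sketch of what the handout presumably intends.

```latex
H_0:\ \mu_d = \text{hypothesized value}
\qquad
t = \frac{\bar{x}_d - \text{hypothesized value}}{s_d / \sqrt{n}}
\qquad
df = n - 1
```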
Assumptions
1.
2.
3.
Is this an example of paired samples?
An engineering association wants to see if there is a difference in the mean annual salary for
electrical engineers and chemical engineers. A random sample of electrical engineers is surveyed
about their annual income. Another random sample of chemical engineers is surveyed about their
annual income.
A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to
volunteers, company researchers weigh each person. After a month of using the drug, each person’s
weight is measured again.
Can playing chess improve your memory? In a study, students who had not previously played chess
participated in a program in which they took chess lessons and played chess daily for 9 months.
Each student took a memory test before starting the chess program and again at the end of the 9-month period.
Student      1    2    3    4    5    6    7    8    9    10   11   12
Pre-test     510  610  640  675  600  550  610  625  450  720  575  675
Post-test    850  790  850  775  700  775  700  850  690  775  540  680
Difference
State the Hypotheses:
Verify Assumptions:
Test Statistic and P-Value:
Conclusion:
Calculator: T-Test
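The paired t statistic for the chess data can be verified with a few lines of code. The sketch below assumes each difference is taken as post-test minus pre-test, a direction the notes leave open, and tests a hypothesized mean difference of 0. The P-value is left to a t table or the calculator.

```python
from math import sqrt
from statistics import mean, stdev

# Memory test scores for the 12 students in the notes
pre = [510, 610, 640, 675, 600, 550, 610, 625, 450, 720, 575, 675]
post = [850, 790, 850, 775, 700, 775, 700, 850, 690, 775, 540, 680]

# Assumption: difference = post-test score minus pre-test score
differences = [after - before for before, after in zip(pre, post)]

n = len(differences)
mean_diff = mean(differences)
sd_diff = stdev(differences)

# Paired t statistic for H0: mu_d = 0, with n - 1 degrees of freedom
t_stat = (mean_diff - 0) / (sd_diff / sqrt(n))
df = n - 1

print(f"mean = {mean_diff:.2f}, sd = {sd_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```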
Paired t Confidence Interval for 𝝁𝒅
When
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences.
3. The number of sample differences is large (generally at least 30) or the population
distribution of differences is (at least approximately) normal.
Compute a 90% confidence interval for the mean difference in memory scores before chess training
and the memory scores after chess training.
Conclusion:
Calculator: TInterval
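The 90% interval can be reproduced from the same differences, as sketched below. Differences are again assumed to be post-test minus pre-test. The critical value 1.796 is taken from a t table for 90% confidence with 11 df, since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

pre = [510, 610, 640, 675, 600, 550, 610, 625, 450, 720, 575, 675]
post = [850, 790, 850, 775, 700, 775, 700, 850, 690, 775, 540, 680]

# Assumption: difference = post-test score minus pre-test score
differences = [after - before for before, after in zip(pre, post)]
n = len(differences)

T_CRITICAL_90_DF11 = 1.796  # table value for 90% confidence, df = n - 1 = 11
margin = T_CRITICAL_90_DF11 * stdev(differences) / sqrt(n)

lower = mean(differences) - margin
upper = mean(differences) + margin

print(f"90% CI for the mean difference: ({lower:.1f}, {upper:.1f})")
```

Both endpoints are positive under this sign convention, which is consistent with higher memory scores after the chess program.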
3. The home range of an animal is the average area an animal occupies
while foraging for food and defending its territory. It is thought that home
ranges of animals usually do not change much, except when an area is
under ecological stress. As part of a study of white-tailed deer in Florida,
deer were radiocollared and their movements followed over the course of a
year. The home range data are shown below. The area is reported in
hectares. (1 hectare = 2.471 acres.) The investigators are interested in
determining whether the home ranges of white-tailed deer change over the
course of as little time as a year.
a) Using graphical display(s) of your choice show that the assumptions
necessary for determining a change in the mean home ranges are
plausible.
1992 HR    1991 HR    Diff HR
80         175        95
268        206        -62
113        103        -10
83         93         10
24         9          -15
111        115        4
100        135        35
103        14         -89
293        104        -189
95         104        9
152        319        167
133        59         -74
293        125        -168
32         112        80
80         206        126
61         115        54
271        49         -222
111        150        39
b) Construct a 95% confidence interval for the difference in means of the home ranges from 1991 to
1992.
c) Do the data provide evidence of a change in the size of the home ranges between 1991 and 1992?
Provide statistical justification for your response.
11.2 Homework: 24-27, 31, 32, 35
Helium Filled Footballs Activity. Page 687.
Trial  Air  Helium    Trial  Air  Helium    Trial  Air  Helium    Trial  Air  Helium
1      25   25        10     28   26        20     29   24        30     20   11
2      23   16        11     25   12        21     31   31        31     27   26
3      18   25        12     19   28        22     27   34        32     26   32
4      16   14        13     27   28        23     22   39        33     28   30
5      35   23        14     25   31        24     29   32        34     32   29
6      15   29        15     34   22        25     28   14        35     28   30
7      26   25        16     26   29        26     29   28        36     25   29
8      24   26        17     20   23        27     22   30        37     31   29
9      24   22        18     22   26        28     31   27        38     28   30
                      19     33   35        29     25   33        39     28   26
11.3: Large-Sample Inferences Concerning the Difference Between Two Population or Treatment Proportions
Investigators at Madigan Army Medical Center tested using duct tape to remove warts versus the
more traditional freezing treatment. Suppose that the duct tape treatment will successfully remove
50% of warts and that the traditional freezing treatment will successfully remove 60% of warts.
Let’s investigate the sampling distribution of pfreeze - ptape
Properties of the Sampling Distribution of 𝒑̂𝟏 − 𝒑̂𝟐
If two random samples are selected independently of one another, the following properties hold:
1.
2.
3.
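For reference, the three properties are usually stated as below. This is the standard textbook form, shown as a sketch of what the blank items presumably ask for. The third property is the large-sample condition under which the sampling distribution is approximately normal.

```latex
\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2,
\qquad
\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}},
\qquad
n_1 p_1,\ n_1(1 - p_1),\ n_2 p_2,\ n_2(1 - p_2) \ \ge 10
```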
Summary of Large-Sample z Test for 𝒑𝟏 − 𝒑𝟐 = 𝟎
Null Hypothesis:
Test Statistic:
Alternative Hypothesis
P-Value
Another Way to Write Hypothesis Statements:
𝐻0 : 𝑝1 − 𝑝2 = 0
𝐻𝑎 : 𝑝1 − 𝑝2 > 0
𝐻𝑎 : 𝑝1 − 𝑝2 < 0
𝐻𝑎 : 𝑝1 − 𝑝2 ≠ 0
Assumptions:
1.
2.
Investigators at Madigan Army Medical Center tested using duct tape to remove warts. Patients
with warts were randomly assigned to either the duct tape treatment or to the more traditional
freezing treatment. Those in the duct tape group wore duct tape over the wart for 6 days, then
removed the tape, soaked the area in water, and used an emery board to scrape the area. This
process was repeated for a maximum of 2 months or until the wart was gone. The data follow:
Treatment                   n      Number with wart successfully removed
Liquid nitrogen freezing    100    60
Duct tape                   104    88
Do these data suggest that freezing is less successful than duct tape in removing warts? Test with
an alpha level of .01.
Hypotheses:
Assumptions:
𝑝̂𝑐 =
𝑧=
P-value:
Conclusion:
Calculator: 2-PropZTest
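The combined proportion, z statistic, and P-value for the wart data can be checked with the sketch below. It assumes the usual large-sample z test with a combined (pooled) estimate of the common proportion under H0, with freezing as sample 1 and duct tape as sample 2. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the table in the notes
n_freeze, removed_freeze = 100, 60
n_tape, removed_tape = 104, 88

p_hat_freeze = removed_freeze / n_freeze
p_hat_tape = removed_tape / n_tape

# Combined estimate of the common proportion under H0: p1 - p2 = 0
p_hat_c = (removed_freeze + removed_tape) / (n_freeze + n_tape)

std_error = sqrt(p_hat_c * (1 - p_hat_c) * (1 / n_freeze + 1 / n_tape))
z = (p_hat_freeze - p_hat_tape) / std_error

# Lower-tailed P-value for Ha: p_freeze - p_tape < 0
p_value = NormalDist().cdf(z)

print(f"p_c = {p_hat_c:.4f}, z = {z:.2f}, P-value = {p_value:.5f}")
```

The P-value is then compared with the stated significance level of .01.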
A Large-Sample Confidence Interval for 𝒑𝟏 − 𝒑𝟐
When
1) The samples are independently chosen random samples or treatments were assigned at random
to individuals or objects
2) Both sample sizes are large: n1p̂1 ≥ 10, n1(1 – p̂1) ≥ 10, n2p̂2 ≥ 10, n2(1 – p̂2) ≥ 10
a large-sample confidence interval for p1 – p2 is
The article “Freedom of What?” (Associated Press, February 1, 2005) described a study in which
high school students and high school teachers were asked whether they agreed with the following
statement: “Students should be allowed to report controversial issues in their student newspapers
without the approval of school authorities.” It was reported that 58% of students surveyed and 39%
of teachers surveyed agreed with the statement. The two samples – 10,000 high school students
and 8000 high school teachers – were selected from schools across the country.
Compute a 90% confidence interval for the difference in proportion of students who agreed with the
statement and the proportion of teachers who agreed with the statement.
𝑝̂1 =
𝑝̂2 =
Assumptions:
Confidence Interval:
Conclusion:
Based on this confidence interval, does there appear to be a significant difference in proportion of
students who agreed with the statement and the proportion of teachers who agreed with the
statement? Explain.
Calculator: 2-PropZInt
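The interval can be reproduced by hand as sketched below. It assumes the usual large-sample formula, (p̂1 − p̂2) ± z* · sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2), with students as sample 1 and teachers as sample 2. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample proportions and sizes reported in the notes
p_hat_students, n_students = 0.58, 10_000
p_hat_teachers, n_teachers = 0.39, 8_000

confidence = 0.90
# z critical value leaving (1 - confidence) / 2 in each tail
z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

std_error = sqrt(p_hat_students * (1 - p_hat_students) / n_students
                 + p_hat_teachers * (1 - p_hat_teachers) / n_teachers)

diff = p_hat_students - p_hat_teachers
lower = diff - z_critical * std_error
upper = diff + z_critical * std_error

print(f"90% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Whether 0 falls inside the interval is what the follow-up question about a significant difference turns on.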
11.3 Homework: 37-39, 41, 47-49, 55, 56
4. Olfactory development in amphibians is very important; an individual’s survival depends on
identification of chemical cues in their watery environment. It is not clear how this capability
develops. One theory suggests that amphibians learn about odors by exposure as embryos. To
test this theory, investigators assigned 200 frog embryos to one of two treatments (experimental
and control) at random. The investigators injected fresh orange natural extract into the
experimental eggs. After the eggs hatched the tadpoles were placed in the middle of an
aquarium with an orange at one end. The tadpoles’ preference for swimming in the “orange
half” of the tank was observed for 5 minutes. Data from this experiment is shown below:
Treatment          Preferred orange side of tank    Preferred other side of tank
Orange injected    72                               28
Control            39                               61
Does there appear to be sufficient evidence that the orange extract injected frogs have a greater
preference for the orange side of the aquarium? Test the relevant hypotheses at the .05 level of
significance.