Worksheet: Summary with examples from Chapter 24

Mr. Bastien
AP Statistics
Chapter 24 "Paired Samples and Blocks"
When two samples are independent, the selection of the individuals or objects that make up one of the
samples has no bearing on the selection of those in the other sample. But what happens when an
experiment with independent samples is not the best way to obtain information concerning a possible
difference between the populations?
Suppose a pair of students chose to study the effect of regular jogging on blood pressure.
Would you select two independent samples: a random sample of people who jog regularly and a
second random sample of people who do not exercise regularly? Would you then use a two-sample
t-test to conclude that a significant difference exists between the average blood pressure of joggers
and nonjoggers? Is it reasonable to think that the difference in mean blood pressure is attributable to
jogging? Explain.
The difficulty is that joggers and nonjoggers may also differ in other ways that affect blood pressure,
such as body weight. One way to avoid this difficulty would be to match subjects by weight: find pairs of
subjects so that the jogger and nonjogger in each pair are similar in weight (although weights for
different pairs might vary widely). Weight could then be ruled out as a possible explanation for an
observed difference in average blood pressure between the two groups. Matching the subjects by weight
results in two samples for which each observation in the first sample is coupled in a meaningful way
with a particular observation in the second sample.
Such samples are said to be paired.
Experiments can be designed to yield paired data in a number of different ways. Some studies involve
using the same group of individuals with measurements recorded both before and after some
intervening treatment. Others use naturally occurring pairs, such as twins or husbands and wives, and
some construct pairs by matching on factors with effects that might otherwise obscure differences (or
the lack of them) between the two populations of interest (as with the weight factor in the jogging example).
It has been hypothesized that strenuous physical activity affects hormone levels. The article "Growth
Hormone Increase During Sleep After Daytime Exercise" (J. of Endocrinology (1974): 473-478) reported
the results of an experiment involving six healthy male subjects. For each participant, blood samples
were taken during sleep on two different nights. The first blood sample (the control) was drawn after a
day that included no strenuous activities, and the second was drawn after a day when the subject had
engaged in strenuous exercise. The samples are paired rather than independent, because both samples
are composed of measurements on the same men. The growth hormone levels (in mg/ml) were:
Subject          1      2      3      4      5      6
Postexercise   13.6   14.4   42.8   20.0   19.2   17.3
Control         8.5   12.6   21.6   19.4   14.7   13.6
Let u1 = the mean nocturnal growth hormone level for the population of all healthy males who
participated in strenuous activity on the previous day.
Let u2 = the mean nocturnal growth hormone level for the population consisting of all healthy males
whose activities on the previous day did not include any strenuous physical exercise.
H0: u1-u2 = 0
versus Ha: u1-u2 ≠ 0
If we ignored the paired nature of the samples, we would lose information, because a two-sample
analysis mixes the large subject-to-subject variation into the standard error.
If we were to use a two-sample t-test, the resulting t test statistic would be 1.28; thus, we would not be
able to reject the hypothesis u1-u2=0, even at a significance level of 10%!
This suggests that the inference methods developed for independent samples (Chapter 23) are not
adequate for dealing with paired samples.
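The comparison described above can be sketched in Python using only the standard library. This is a minimal illustration, not the method used in the article: the variable names are made up for this sketch, the two-sample statistic is the usual unpooled version, and the paired statistic divides the mean of the within-subject differences by its standard error.

```python
import math
import statistics

# Growth hormone levels from the table above (subjects 1-6).
postexercise = [13.6, 14.4, 42.8, 20.0, 19.2, 17.3]
control = [8.5, 12.6, 21.6, 19.4, 14.7, 13.6]
n = len(control)

# Two-sample t statistic: treats the two rows as independent samples.
mean_diff = statistics.mean(postexercise) - statistics.mean(control)
se_two_sample = math.sqrt(
    statistics.variance(postexercise) / n + statistics.variance(control) / n
)
t_two_sample = mean_diff / se_two_sample

# Paired t statistic: works with the within-subject differences instead.
differences = [p - c for p, c in zip(postexercise, control)]
se_paired = statistics.stdev(differences) / math.sqrt(n)
t_paired = statistics.mean(differences) / se_paired

print(f"two-sample t = {t_two_sample:.2f}")
print(f"paired t     = {t_paired:.2f}")
```

With these data the two-sample statistic comes out near 1.27 (close to the 1.28 quoted above; the small gap is presumably rounding), while the paired statistic is close to 2. The mean difference is the same in both cases; only the standard error changes.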
When sample observations from the first population are paired in some meaningful way with sample
observations from the second population, inferences can be based on the differences between the two
observations within each pair. The n sample differences can then be regarded as having been selected
from a large population of differences.
ud= mean value of difference population.
σd= standard deviation of the difference population.
The relationship between ud and the two individual population means is
ud= u1-u2
Therefore, when the samples are paired, inferences about u1-u2 are equivalent to inferences about ud.
So instead of the original two-sample t-test, we do a one-sample t-test on the differences, keeping in
mind that ud is the parameter used for the paired t-test:
H0: ud= 0
versus Ha: ud≠0
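t(n-1) = (xd - hypothesized value)/SEd     (the hypothesized value doesn't have to be zero)
where xd = the mean of the n sample differences, sd = their standard deviation, SEd = sd/√n,
and the t statistic has n-1 degrees of freedom.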
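As a rough sketch, the paired t statistic above could be wrapped in a small Python function. The function name `paired_t_statistic` and its parameters are hypothetical, chosen for this illustration. The data are the growth hormone levels from earlier in this worksheet, and the non-zero hypothesized value of 2.0 in the second call is an arbitrary choice that only shows how the parameter is used.

```python
import math
import statistics


def paired_t_statistic(first, second, hypothesized_value=0.0):
    """Return (t, degrees of freedom) for paired samples.

    `first` and `second` must list the two measurements pair by pair,
    in the same order.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_d = statistics.mean(differences)                # xd
    se_d = statistics.stdev(differences) / math.sqrt(n)  # SEd = sd / sqrt(n)
    return (mean_d - hypothesized_value) / se_d, n - 1


# Growth hormone data from the worksheet (subjects 1-6).
postexercise = [13.6, 14.4, 42.8, 20.0, 19.2, 17.3]
control = [8.5, 12.6, 21.6, 19.4, 14.7, 13.6]

t_zero, df = paired_t_statistic(postexercise, control)
# Arbitrary non-zero null value, for illustration only.
t_shifted, _ = paired_t_statistic(postexercise, control, hypothesized_value=2.0)

print(f"H0: ud = 0 -> t = {t_zero:.2f} with {df} df")
print(f"H0: ud = 2 -> t = {t_shifted:.2f} with {df} df")
```

Raising the hypothesized value moves it closer to the observed mean difference, so the t statistic shrinks while the standard error and degrees of freedom stay the same.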
Children with Down syndrome generally show a pattern of delayed mental development, although
some achieve higher intellectual levels than others. The intellectual achievements of children with
Down syndrome have been studied by numerous investigators, and several different types of chromosomal
abnormalities associated with the syndrome have been identified. Two such abnormalities are
trisomy 21 and mosaicism.
Thirty children with mosaic Down syndrome who were at the USC Medical Center were selected to
participate in a research project comparing trisomy 21 and mosaicism. The investigators then chose
30 children with trisomy 21 Down syndrome from among 350 who had been seen at the medical center.
A computer was used to help pair the mosaic and trisomy 21 groups. The result was 30 matched pairs of
children, and IQ levels for all children were determined. The goal was to test the theory that
children with mosaic Down syndrome generally achieve higher intellectual levels.
Pair   Mosaic   Trisomy 21   Difference
 1       73        71             2
 2       43        53           -10
 3       69        58            11
 4       89        71            18
 5       53        50             3
 6       81        70            11
 7       59        55             4
 8       71        18            53
 9       65        31            34
10       53        57            -4
11       58        63            -5
12       71        47            24
13       92        44            48
14       57        28            29
15       76        75             1
16       55        60            -5
17       61        48            13
18       63        55             8
19       87        59            28
20       67        78           -14
21       63        55             8
22       58        51             7
23       50        55            -5
24       59        53             6
25       75        47            28
26       61        50            11
27       91        63            28
28       43        46            -3
29       55        54             1
30       88        48            40
Let:
u1 = mean IQ for children with mosaic Down syndrome
u2 = mean IQ for children with trisomy 21 Down syndrome
ud = u1-u2 = mean IQ difference between mosaic and trisomy 21 Down syndrome children
H0: ud = 0
versus Ha: ud > 0
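One possible way to carry out this one-sided paired t-test is sketched below in Python. Several things here are assumptions rather than part of the worksheet: the significance level of 0.05 (none is stated), the use of a t-table critical value of 1.699 for 29 degrees of freedom, and the variable names. The sketch uses the Difference column exactly as printed; note that the printed difference for pair 20 (-14) does not equal 67 - 78 = -11, so one of those three entries appears to be a misprint.

```python
import math
import statistics

# Difference column (mosaic - trisomy 21) as printed in the table, pairs 1-30.
differences = [
    2, -10, 11, 18, 3, 11, 4, 53, 34, -4, -5, 24, 48, 29, 1,
    -5, 13, 8, 28, -14, 8, 7, -5, 6, 28, 11, 28, -3, 1, 40,
]

n = len(differences)
mean_d = statistics.mean(differences)
se_d = statistics.stdev(differences) / math.sqrt(n)

# Test statistic for H0: ud = 0 against Ha: ud > 0.
t = (mean_d - 0) / se_d

# Assumption: alpha = 0.05 (the worksheet does not state a level).
# 1.699 is the upper-tail t-table critical value for n - 1 = 29 df.
T_CRITICAL = 1.699
reject_h0 = t > T_CRITICAL

print(f"n = {n}, mean difference = {mean_d:.2f}, SE = {se_d:.2f}")
print(f"t = {t:.2f} with {n - 1} df; reject H0 at the 5% level: {reject_h0}")
```

Under these assumptions the t statistic is well above the critical value, which would support the theory that children with mosaic Down syndrome tend to have higher IQ levels. Changing the pair 20 difference to -11 shifts the numbers only slightly.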
Now, let's look at an example that involves a confidence interval.
The effect of exercise on the amount of lactic acid in the blood was examined in the article "A
Descriptive Analysis of Elite-Level Racquetball" (Research Quarterly for Exercise and Sport (1991):
109-114). Eight males were selected at random from those attending a week-long training camp. Blood
lactate levels were measured before and after playing three games of racquetball, as shown in the table.
We will use these data to estimate the mean change in blood lactate level using a 95% confidence interval.
Player   Before   After   Difference
1          13       18        -5
2          20       37       -17
3          17       40       -23
4          13       35       -22
5          13       30       -17
6          16       20        -4
7          15       33       -18
8          16       19        -3
The eight men were selected at random from training camp participants. Create a boxplot and interpret.
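As a starting point for the boxplot, the five-number summary of the differences can be computed with a short Python sketch. The function name `five_number_summary` is hypothetical, and the quartile rule used here (medians of the lower and upper halves of the sorted data) is an assumption; other software may use a slightly different quartile convention.

```python
import statistics

# Differences (before - after) from the table above, players 1-8.
differences = [-5, -17, -23, -22, -17, -4, -18, -3]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Quartiles are the medians of the lower and upper halves of the sorted
    data; the middle value is left out of both halves when the count is odd.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (
        ordered[0],
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
        ordered[-1],
    )


summary = five_number_summary(differences)
print("min, Q1, median, Q3, max =", summary)
```

Every difference is negative, so the whole box and both whiskers sit below zero: each player's lactate level was higher after playing than before.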
Now, construct a 95% confidence interval. Note: the critical value comes from the t distribution with
n-1 degrees of freedom, in this case 7 df.
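A minimal Python sketch of the interval calculation follows, assuming the critical value is read from a t table (2.365 for 95% confidence with 7 df). The variable names are illustrative only.

```python
import math
import statistics

# Differences (before - after) from the table above, players 1-8.
differences = [-5, -17, -23, -22, -17, -4, -18, -3]

n = len(differences)
mean_d = statistics.mean(differences)
se_d = statistics.stdev(differences) / math.sqrt(n)

# t-table critical value for 95% confidence with n - 1 = 7 df.
T_CRITICAL = 2.365
margin = T_CRITICAL * se_d
lower, upper = mean_d - margin, mean_d + margin

print(f"mean difference = {mean_d:.3f}, SE = {se_d:.3f}")
print(f"95% CI for ud: ({lower:.2f}, {upper:.2f})")
```

The interval runs from roughly -20.5 to -6.7. Because the differences are before minus after, an interval entirely below zero is consistent with an increase in mean blood lactate level after three games of racquetball.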