Math 311, Spring 2008, Lab 5
Due May 15th at 3:00 p.m.
The Tools
In this section you’ll learn the mechanics of 1-sample, matched pair, and 2-sample t-tests.
Inferences With Minitab:
Have Minitab compute two columns of 10,000 rows of data. Store this data in columns C1 and C2.
Choose the first (C1) from a normally distributed population (use N(0,1)) and the second (C2) from a
uniformly distributed population, distributed between 0 and 1. Recall one does this by selecting
Calc>Random Data> …. We will use these data in our tests below.
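For readers who want to see what these two columns look like outside Minitab, here is a minimal Python sketch of the same setup. It is not part of the Minitab procedure; the seed value and the variable names c1 and c2 are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(311)  # arbitrary seed (an assumption), used only so the run is repeatable

N = 10_000
# Stand-in for column C1: draws from a normal population, N(0, 1)
c1 = [random.gauss(0.0, 1.0) for _ in range(N)]
# Stand-in for column C2: draws from a uniform population on [0, 1)
c2 = [random.random() for _ in range(N)]

# Sample mean and standard deviation of each column
print("C1:", statistics.mean(c1), statistics.stdev(c1))
print("C2:", statistics.mean(c2), statistics.stdev(c2))
```

The sample means should land near 0 and 0.5, the true means of the two populations, though the exact values depend on the random draws.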
1-sample t-test:
View the data in column C1 as a sample of size 10,000. We may use
Minitab to compute both a confidence interval for the (true) mean of the population and perform a
hypothesis test for the mean of the population at one time!
a. Select Stat>Basic Statistics>1-Sample t…
b. Enter C1 for Samples in columns box
c. Click Perform hypothesis test.
d. Enter the value of the population mean from the Null Hypothesis as the Hypothesized
mean: (this is for the hypothesis test; the Alternative Hypothesis is entered below). This
time, use 0 (this is, in fact, the true mean; recall that the data came from a
N(0,1) distribution).
e. To set the confidence level select Options (this is for the confidence interval). 95% is
the default setting.
f. Notice that while you’re in the Options menu, you can also select equal to, greater than,
or less than for your Alternative Hypothesis. Since this is just practice, pick whichever
floats your boat.
g. Finally, select OK (twice) and get something like this:
One-Sample T: C1
Test of mu = 0 vs not = 0
Variable      N      Mean     StDev   SE Mean                 95% CI     T      P
C1        10000  0.000375  1.003442  0.010034  (-0.019294, 0.020044)  0.04  0.970
h. Note: the hypotheses tested (mu = 0 vs not = 0) are listed at the top of the output
(I’ve highlighted these for emphasis.)
i. Note also: T = one-sample t-statistic and P = p-value. The p-value is large, which is
consistent with our null hypothesis being correct: mu really is 0. (I’ve highlighted these for emphasis too.)
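To see where the numbers in this output come from, the following Python sketch recomputes SE Mean, T, P, and the confidence interval from the reported N, Mean, and StDev. One assumption to flag: it uses the standard normal distribution in place of the t distribution, which is a reasonable approximation only because n = 10,000 is so large. Minitab itself uses the t distribution, so the last digits of the interval can differ slightly.

```python
import math
from statistics import NormalDist

# Summary statistics copied from the Minitab output above
n = 10_000
mean = 0.000375
stdev = 1.003442
mu0 = 0.0  # hypothesized mean from the null hypothesis

se = stdev / math.sqrt(n)      # standard error of the mean ("SE Mean")
t_stat = (mean - mu0) / se     # one-sample t statistic ("T")

# Normal approximation to the t distribution (an assumption, fine for large n)
z = NormalDist()
p_value = 2 * (1 - z.cdf(abs(t_stat)))   # two-sided p-value ("P")
crit = z.inv_cdf(0.975)                  # critical value for a 95% interval
ci = (mean - crit * se, mean + crit * se)

print(f"SE Mean = {se:.6f}, T = {t_stat:.2f}, P = {p_value:.3f}")
print(f"approx. 95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
```

The key relationships are SE Mean = StDev / sqrt(N) and T = (Mean - hypothesized mean) / SE Mean; the confidence interval is Mean plus or minus a critical value times SE Mean.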
Matched pair t-test:
We can also do a matched pair t-test by viewing the data in
columns C1 and C2 as two data points collected from 10,000 individuals and by following the
directions below:
a. Select Stat>Basic Statistics>Paired t…
b. Select Samples in Columns
c. Enter C1 in First sample: and C2 in Second sample:
d. Select Options to set the confidence level and type of hypothesis test and the value of the
mean in the null hypothesis.
e. Finally, select OK (twice) and get something like this:
Paired T-Test and CI: C1, C2
Paired T for C1 - C2
                N       Mean     StDev   SE Mean
C1          10000   0.000375  1.003442  0.010034
C2          10000   0.501160  0.290175  0.002902
Difference  10000  -0.500785  1.044458  0.010445
95% CI for mean difference: (-0.521258, -0.480311)
T-Test of mean difference = 0 (vs not = 0): T-Value = -47.95  P-Value = 0.000
f. Notice that the top shaded region (I added the shading) tells us what the mean difference
was (this is x̄₁ − x̄₂) – this number makes sense as the normally distributed data in C1 has
a mean around 0 and the uniformly distributed data in C2 has a mean around 0.5.
g. Meanwhile the lower shaded region tells the reader both the null hypothesis (difference =
0) and the alternative hypothesis (not = 0).
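A matched pair t-test is simply a 1-sample t-test applied to the within-pair differences. The Python sketch below illustrates this. The function name paired_t and the five pairs of numbers are hypothetical, made up for illustration; only the final check uses the Difference row from the Minitab output.

```python
import math
import statistics

def paired_t(first, second, mu0=0.0):
    """Return (mean difference, SE of mean difference, t statistic)."""
    if len(first) != len(second):
        raise ValueError("matched pairs require equal-length samples")
    # One difference per individual: first minus second
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, se, (mean_diff - mu0) / se

# Hypothetical data: two measurements on each of five individuals
first = [5.1, 4.8, 6.0, 5.5, 5.9]
second = [4.6, 4.9, 5.2, 5.0, 5.1]
mean_diff, se, t_small = paired_t(first, second)
print(mean_diff, se, t_small)

# Check against the Difference row of the Minitab output:
# T = mean difference / (StDev of differences / sqrt(N))
t_lab = -0.500785 / (1.044458 / math.sqrt(10_000))
print(round(t_lab, 2))
```

Note that the standard deviation of the differences has to be computed from the differences themselves; it cannot be read off from the separate standard deviations of C1 and C2.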
2-sample t-tests:
We can perform 2-sample inferences by viewing the data in columns
C1 and C2 as samples from independent populations and by following the directions below:
a. Select Stat>Basic Statistics>2-Sample t…
b. Select Samples in Different Columns (in this example, we’re going to compare the
samples in C1 and C2).
c. Enter C1 in First: and C2 in Second:
d. Select Options to set the confidence level and type of hypothesis test (1-sided vs.
2-sided) and the value of the mean in the null hypothesis.
e. Finally, select OK (twice) and get something like this:
Two-Sample T-Test and CI: C1, C2
Two-sample T for C1 vs C2
        N   Mean  StDev  SE Mean
C1  10000   0.00   1.00    0.010
C2  10000  0.501  0.290   0.0029
Difference = mu (C1) - mu (C2)
Estimate for difference: -0.500785
95% CI for difference: (-0.521260, -0.480310)
T-Test of difference = 0 (vs not =): T-Value = -47.94  P-Value = 0.000  DF = 11659
f. Note: Estimate for difference = difference of sample means = x̄₁ − x̄₂. DF =
degrees of freedom (recall that Minitab uses a much more complicated, albeit more
correct, formula for computing degrees of freedom).
g. Important note: the samples from the two populations do not need to be in separate
columns. For example, suppose we were comparing the pollution levels in streams on the
east coast vs. the west coast. If the pollution data was in C1 and the location (east or west
coast) was in C2, then we could compare the mean pollution levels by selecting the
option Samples in one column and then entering C1 in the Samples: box and C2 in the
Subscripts: box. You’ll need to try this in one of the following questions.
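The T-Value and DF in the 2-sample output can be reproduced from the summary statistics. The Python sketch below assumes the "more complicated" degrees-of-freedom formula is the usual unpooled (Welch-Satterthwaite) one, with the result rounded down to a whole number. The function name welch_t is hypothetical, and the means and standard deviations are taken from the more precise values in the paired output above.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2   # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)            # SE of the difference of means
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics for C1 and C2 from the Minitab output
t_stat, df = welch_t(0.000375, 1.003442, 10_000,
                     0.501160, 0.290175, 10_000)

print(f"T-Value = {t_stat:.2f}")
print(f"DF = {math.floor(df)}")   # df rounded down to a whole number
```

Under these assumptions the sketch gives the same T-Value (-47.94) and DF (11659) as the output above. Unlike the matched pair test, this calculation uses only each sample's own mean and standard deviation, because the two samples are treated as independent.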
The Questions
In this section you’ll apply the techniques learned above.
!!!BE CERTAIN TO READ EACH SITUATION CAREFULLY – THEY CONTAIN USEFUL CLUES!!!
Twins (or Nature vs. Nurture):
How much of our personality (or lack of personality), our likes, our individuality is predetermined by
our genes? The classical method of studying this phenomenon is the study of identical twins
separated early in life and reared apart. Identical twins, because they share an identical genotype (set
of genes), make ideal subjects for investigation of the degree to which various environmental
conditions may instigate change (because we have two expressions of one identical genotype).
Over the past twenty years, several studies of identical twins have been conducted. The most
publicized study was begun in 1979 by University of Minnesota psychologist Thomas Bouchard and
continues today. Bouchard and his colleagues at the Minnesota Center for Twin and Adoption
Research have published over 129 scientific papers on the subject. In this lab, we consider a similar
study carried out by psychologist Susan Farber and published in her book, Identical Twins Reared
Apart (Basic Books, 1981).
Farber chronicles and analyzes data for 95 pairs of identical twins reared apart. Much of her
discussion focuses on a comparison of IQ scores. The question of concern is, “Are there significant
differences between the IQ scores of identical twins, where one member of the pair is reared by the
natural parents and the other member is not?” IQ scores for 32 of Farber’s 95 twins are stored in
Twins.mtw. (Get the file now from the course webpage.) One member (A) from each of the sets
was reared by a natural parent, whereas the other member (B) was reared by a relative or some other
person.
1. Which test would be the most appropriate way to compare the IQs of these twins: a
one-sample t-test or a matched pair test? Why?
2. Construct a Null Hypothesis and an Alternative Hypothesis. Be certain to define your
parameter.
3. Perform the appropriate test and report the p-value you found (you can do this by just cutting
and pasting the Minitab analysis into Word).
4. State your conclusions.
Note: we are now well past the halfway point of the course. Thus, I expect a complete and
correctly worded explanation and summary of your results. If it helps, imagine that you are
Dr. Farber and write accordingly.
Nutrition – platewaste in the 4th & 5th grades:
A few years ago colleagues from the Family and Consumer Sciences Department and I studied
various factors that affect the amount of food elementary school children eat during lunch. Four
schools were studied – each for ten days. The platewaste of each child was compared to the amount
served that day and was used to compute the nutrients each child consumed that day. The results for
each child in 4th and 5th grade are recorded in the file nutri.mtw. Get this file now.
Column C4 contains the data concerning the placement of recess with respect to lunch (before and
after).
Column C11 contains the data concerning the number of calories consumed by the child that day.
Some days the children were served more and some days, less. To adjust for this, I computed the
percent of the calories served that were consumed. These data are recorded in C12. Thus if a child
ate half the calories served that day, .50 would appear in C12.
Similar analysis was done for other nutrients (scroll right to see the results). Let’s investigate these
data….
5. Does the placement of recess affect the number of calories consumed? State your
hypotheses, perform the appropriate test, and report the P-value you found. State your
conclusions.
Note: in this worksheet, the data is recorded in one column (C11) while the subscripts
are contained in another column (C4).
6. The difference in calories consumed by each sample is not very large. Find another food
item in your house that contains the same number of calories as this difference.
7. Does gender affect the number of calories consumed? State your hypotheses, perform the
appropriate test, and report the P-value you found. State your conclusions.