Analysis of Variance (ANOVA) - Solutions
One-way analysis of variance is a method for comparing several population means, when the
data are from independent samples. It can be thought of as a tool for examining the relationship
between a quantitative response variable and a categorical explanatory variable.
1 In each part, explain whether one-way analysis of variance can be used to analyze the
relationship.
a. An educational researcher compares three different teaching methods. Each method is used by
100 students (300 in all). Scores on a final exam will be used for the comparison.
Yes, the response (exam score) is quantitative and the explanatory variable (teaching method) is categorical
b. A sociologist examines whether there is a relationship between racial group (Caucasian, African American, etc.) and opinion about the death penalty (favor or oppose).
No, response is categorical
c. Restaurant servers draw a happy face on some bills, write a thank you message on other bills
and write nothing on a third batch of bills. Tip percents are compared for the three different
“conditions.”
Yes, response is quantitative and explanatory is categorical
d. A medical researcher examines the relationship between age and blood pressure.
No, both variables are quantitative
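The rule applied in parts a through d can be written as a small check. This is only a sketch: the function name and the "quantitative"/"categorical" labels are illustrative choices, not part of the exercise.

```python
def one_way_anova_applies(response_type, explanatory_type):
    """True when the response is quantitative and the explanatory variable is categorical."""
    return response_type == "quantitative" and explanatory_type == "categorical"


# (response type, explanatory type) as identified in the answers to parts a-d.
scenarios = {
    "a. exam score by teaching method": ("quantitative", "categorical"),
    "b. death penalty opinion by racial group": ("categorical", "categorical"),
    "c. tip percent by bill condition": ("quantitative", "categorical"),
    "d. blood pressure by age": ("quantitative", "quantitative"),
}

for label, (response, explanatory) in scenarios.items():
    verdict = "Yes" if one_way_anova_applies(response, explanatory) else "No"
    print(f"{label}: {verdict}")
```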
2 The data are from the 2002 General Social Survey, a federally funded national survey done
every other year by the University of Chicago. In the Datasets folder of the agenda page, click
the link for the “GSS Dataset” to open Minitab with the data in place.

This activity examines the relationship between number of children ever had (children)
and answer to a question about how often premarital sex is wrong (premarsx). There are
four categories for how often premarital sex is wrong: Always, Almost always,
Sometimes, and Never.
a. Use Stat>Basic Statistics>Display Descriptive Statistics. Enter the variable children in the
Variables box and enter premarsx in the “By Variables” box. Give the sample mean number of
children ever had for each category for how often premarital sex is wrong.
Means are: Always 2.167, Almost always 1.789, Sometimes 1.637, Never 1.4286
b. Write two or three sentences that describe how the sample means differ. For instance, which
group has the greatest number of children (on average), etc.
People who think premarital sex is always wrong had the highest mean number of children (2.167).
People who think premarital sex is never wrong had the lowest mean (1.4286). In general, the mean
decreased as the view of premarital sex became more permissive.
c. In words, write a null hypothesis about the mean number of children for the four response
categories of the premarital sex question.
Null: The population means are the same for the four categories.
d. Write the null hypothesis given in part c using appropriate statistical notation.
H0: μ1 = μ2 = μ3 = μ4
e. Use Stat>ANOVA>One-way to do a one-way analysis of variance F-test to compare mean
number of children for the four categories of the premarital sex question. The “Response” is
children and the “Factor” is premarsx.
Locate the p-value in the output. What is the p-value?
p-value = 0.000 (Minitab rounds to three decimal places, so the p-value is less than 0.0005)
State a conclusion about this situation.
We reject the null hypothesis and conclude that the population means are not all the same.
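The F statistic that the One-way command reports is the ratio of the between-group mean square to the within-group (error) mean square. The sketch below computes it in plain Python. The GSS data are not reproduced in this document, so `toy_groups` is made-up data used only to show the arithmetic, and `one_way_f` is an illustrative name rather than a Minitab function.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group (error) sum of squares: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up data for illustration only (not the GSS responses).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_f(toy_groups)
print(f"F = {f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```

Turning F into a p-value requires the F distribution, which the Python standard library does not provide; a statistics package would be needed for that step.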
f. The output gives a graphical display of confidence intervals for the population means in the
four categories of how often premarital sex is wrong. Using that display, describe how mean
number of children differs for these categories.
The mean number of children generally decreases as the view of premarital sex moves from “Always”
wrong to “Never” wrong. The intervals for “Always” and “Never” do not overlap, so it is reasonable
to say those two population means differ.
3 In past class surveys students were asked to rate how much they like Rap music on a scale of 1
to 6, with 1 = hate it and 6 = like it a lot. Students were also asked whether they are from a big
city, rural area, small town or suburban area. Following are analysis of variance results
comparing mean ratings for the four categories of hometown.
Source      DF       SS     MS     F      P
Hometown     3    31.70  10.57  4.63  0.003
Error     2107  4810.77   2.28
Total     2110  4842.46
                                Individual 95% CIs For Mean Based on
                                Pooled StDev
Level          N   Mean  StDev  ----+---------+---------+---------+----
Big_city     220  4.527  1.393                   (-------*-------)
Rural        314  4.070  1.585  (------*-----)
Small_town   533  4.128  1.535      (----*----)
Suburban    1044  4.180  1.500          (--*---)
                                ----+---------+---------+---------+----
                                   4.00      4.25      4.50      4.75
a. Using appropriate statistical notation, write a null hypothesis for this situation.
H0: μ1 = μ2 = μ3 = μ4
b. What conclusion can be made about the null hypothesis? Justify your answer.
Reject the null hypothesis. The p-value of 0.003 is less than 0.05, so there is statistical evidence
that the population mean ratings are not all equal across the four hometown categories.
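The F statistic behind this p-value can be re-derived from the sums of squares and degrees of freedom in the ANOVA table: each mean square is SS divided by DF, and F is the ratio of the two mean squares. The variable names below are illustrative; the numbers are copied from the table.

```python
# Values copied from the ANOVA table for the Rap ratings.
ss_hometown, df_hometown = 31.70, 3
ss_error, df_error = 4810.77, 2107

ms_hometown = ss_hometown / df_hometown   # mean square between hometown groups
ms_error = ss_error / df_error            # mean square error (pooled variance)
f_stat = ms_hometown / ms_error

print(f"MS Hometown = {ms_hometown:.2f}")
print(f"MS Error    = {ms_error:.2f}")
print(f"F           = {f_stat:.2f}")
```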
c. Use the display of confidence intervals (and accompanying sample means) to describe any
differences (and similarities) in ratings of Rap for the categories of hometown.
Since the intervals for Rural, Small Town and Suburban overlap, the mean ratings of Rap appear
similar for these three categories. The Big City interval lies entirely above the other three,
which indicates that students from big cities rate Rap music higher on average.
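The individual intervals in the display can be approximated from the summary table as mean ± t* × (pooled StDev) / √N, where the pooled StDev is the square root of MS Error. The sketch below makes one assumption that is not in the output: it uses 1.96 as the multiplier, which is the normal approximation to the t value with 2107 degrees of freedom.

```python
import math

ms_error = 2.28   # MS for Error from the ANOVA table
t_star = 1.96     # ASSUMPTION: normal approximation to t with 2107 df
pooled_sd = math.sqrt(ms_error)

# (N, sample mean) for each hometown category, copied from the output.
groups = {
    "Big_city": (220, 4.527),
    "Rural": (314, 4.070),
    "Small_town": (533, 4.128),
    "Suburban": (1044, 4.180),
}

intervals = {}
for name, (n, mean) in groups.items():
    margin = t_star * pooled_sd / math.sqrt(n)
    intervals[name] = (mean - margin, mean + margin)
    print(f"{name:<10} {intervals[name][0]:.3f} to {intervals[name][1]:.3f}")


def overlaps(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Larger groups get narrower intervals, which is why the Suburban interval (N = 1044) is the shortest and the Big_city interval (N = 220) is the longest.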
d. A multiple comparisons analysis (not shown) includes these two confidence intervals for the
difference in mean ratings:
Big City – Rural: 95% CI for difference in means is 0.20 to 0.72
Small Town – Rural: 95% CI for difference in means is −0.15 to +0.27
(i) Based on the confidence interval explain whether it is reasonable to conclude that (population)
mean ratings differ for students from big cities and rural areas.
Yes, since the interval does not contain 0. Both endpoints are positive, so the big-city mean appears to be higher than the rural mean.
(ii) Based on the confidence interval explain whether it is reasonable to conclude that
(population) mean ratings differ for students from small towns and rural areas.
No, since the interval contains 0.
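The rule used in parts (i) and (ii) is mechanical: a difference in population means is supported only when the confidence interval for the difference excludes 0. A minimal sketch, with an illustrative function name:

```python
def interval_contains_zero(lower, upper):
    """True when 0 lies inside the confidence interval (endpoints included)."""
    return lower <= 0 <= upper


# Confidence intervals for the difference in mean ratings, from part d.
comparisons = {
    "Big City - Rural": (0.20, 0.72),
    "Small Town - Rural": (-0.15, 0.27),
}

for label, (lower, upper) in comparisons.items():
    if interval_contains_zero(lower, upper):
        print(f"{label}: interval contains 0, no evidence the means differ")
    else:
        print(f"{label}: interval excludes 0, reasonable to conclude the means differ")
```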