Module 12: One-Way ANOVA
Assignment: 12.1, 12.4
January 30, 2008
1 One-way ANOVA
Today we learn about ANOVA, which stands for analysis of variance. In ANOVA, we are
interested in testing whether the means of two or more groups are equal. One-way ANOVA
tests whether a single independent variable (the factor) in a linear model is significant,
meaning that it has some influence on the dependent variable.
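As a brief sketch of what is being tested, the hypotheses and the F statistic can be written as follows. The notation is an assumption made here for illustration and is not taken from the assignment: there are k groups, group i has n_i observations y_ij with mean \bar{y}_i, N is the total number of observations, and \bar{y} is the overall mean.

```latex
H_0:\ \mu_1 = \mu_2 = \dots = \mu_k
\qquad
H_a:\ \text{at least two of the } \mu_i \text{ differ}

F = \frac{\mathrm{SSB}/(k-1)}{\mathrm{SSW}/(N-k)},
\qquad
\mathrm{SSB} = \sum_{i=1}^{k} n_i\,(\bar{y}_i - \bar{y})^2,
\qquad
\mathrm{SSW} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
```

A large value of F (equivalently, a small p-value) is evidence against the null hypothesis of equal means.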
1.1 Background
A statistician is interested in collecting data and finding relationships within the data
he or she has collected. For instance, suppose a statistician collects data on temperature
and pressure in different areas of the world and wants to see how pressure affects
temperature. The first thing the statistician needs to do is test whether or not the area
in which the data were collected has an effect on the pressure.
1.2 PROC GLM and PROC ANOVA
We will learn two new PROC steps today: PROC GLM and PROC ANOVA. For our purposes, the
main difference between them is that PROC ANOVA is designed for balanced data (the same
number of observations in each group), while PROC GLM can also be used when the number of
observations differs from group to group.
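One way to decide between the two procedures is to count the observations in each group first. The sketch below borrows the data set one and the variables vehtype and resptime from Example 12.1 below; whether that data set is actually balanced is not stated in these notes, so treat the PROC ANOVA step as an illustration only.

```sas
/* Count the observations per vehicle type to see whether the design is balanced. */
proc freq data=one;
    tables vehtype;
run;

/* If every vehicle type has the same count, PROC ANOVA can be used. */
/* Otherwise, keep using PROC GLM as in Example 12.1.                */
proc anova data=one;
    class vehtype;
    model resptime = vehtype;
run;
quit;
```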
1.3 Example 12.1
Researchers wanted to test whether the average braking time of drivers following different
types of trucks equipped with a center high-mounted stop lamp (CHMSL) is the same or not.
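filename datain 'Taillite.dat';

* Read the data and keep only the CHMSL-equipped vehicles (group = 1);
data one;
  infile datain;
  input id vehtype group positn speedzn resptime follotme folltmec;
  if group = 1;
run;

* One-way ANOVA of response time on vehicle type, with Tukey comparisons;
proc glm data=one;
  class vehtype;
  model resptime = vehtype;
  means vehtype / tukey lines;
run;
quit;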
Let's take a look at what the question is asking.
We want to test whether the average braking time of drivers following different types
of trucks equipped with CHMSL is the same or not.
So, we want to see if braking time is a function of vehicle type. In the PROC step,
after the MODEL keyword, we write resptime = vehtype. This tells SAS that we are
testing whether vehicle type has a significant effect on response time.
Also, notice there is an IF statement in the code. This is because the problem says that
we only want to look at trucks equipped with CHMSL. If you look in the back of the book
under the information for the data set taillite, the variable group divides the vehicles
into those that have the CHMSL equipped (denoted by a 1) and those that do not (denoted
by a 2). By writing an IF statement that says group = 1, we are telling SAS that we only
want to keep the observations where the CHMSL is present.
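The same restriction can also be applied in the PROC step with a WHERE statement, which leaves the full data set available for other analyses. The following is only a sketch of that alternative: the data set name all is hypothetical, and it reuses the datain fileref defined in the example code.

```sas
/* Read every record, without the subsetting IF.   */
/* The data set name "all" is hypothetical.        */
data all;
    infile datain;
    input id vehtype group positn speedzn resptime follotme folltmec;
run;

/* WHERE restricts this analysis to CHMSL-equipped vehicles (group = 1). */
proc glm data=all;
    where group = 1;
    class vehtype;
    model resptime = vehtype;
run;
quit;
```

Both approaches analyze the same observations. The subsetting IF drops the other rows permanently from data set one, while WHERE filters them only for the procedure in which it appears.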
1.4 Conclusion
Once we run the code we can interpret the output. Page 2 of the output shows a p-value of
0.0109. At the usual 0.05 significance level, this tells us to reject the null hypothesis
that the mean response times for the different vehicle types are equal. This means that
the average response time differs for at least one vehicle type.
The last thing we requested is a Tukey comparison. Vehicle types that share a letter (A or
B) in the Tukey grouping have means that are not significantly different. Vehicle types
that do not share a letter have significantly different means.
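If the letter display is hard to read, the Tukey results can instead be requested as confidence intervals for each pairwise difference. This is a sketch of that alternative, not part of the assignment; it uses the CLDIFF option of the MEANS statement in place of LINES.

```sas
/* Same model as in Example 12.1, but Tukey results are shown as        */
/* confidence intervals for each pairwise difference in mean resptime.  */
proc glm data=one;
    class vehtype;
    model resptime = vehtype;
    means vehtype / tukey cldiff;
run;
quit;
```

A pair of vehicle types whose interval does not contain zero is significantly different, which corresponds to the two types not sharing a letter in the LINES display.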