Lab8

STAT 460
Lab 8 Turn in Sheet
11/8/2004
To receive credit for this lab, turn this sheet in before leaving the lab.
Name: _____________________________________________
Lab Section: ____
1. Put your 2x3 table here:
2. Which null hypotheses can be rejected in the full factorial (with interactions) model?
3. Which null hypotheses can be rejected in the no-interaction model?
4. List one or two things that are still unclear to you.
STAT 460
Lab 8 Instructions
11/8/2004
Goals: In this lab you will get practice and additional insight for 2-way ANOVA.
Summary: 2-way ANOVA is used to analyze experiments (or observational studies)
with a quantitative outcome and two categorical explanatory variables (“factors”).
a) Both factors may be of primary interest, or one may be primary and the other is
“blocks”.
b) Each factor may have 2 or more levels.
c) This week we restrict ourselves to “full factorial” models (every combination of
the levels of the two factors is represented by at least one subject).
d) At each combination of levels of the explanatory variables, the outcome is
assumed to be normally distributed with the same variance (usually denoted σ²).
e) There may be no interaction between the factors (additive model) or there may be
interaction (saturated model).
The purpose of 2-way ANOVA is to model the effects of two (or more, for multi-way
ANOVA) categorical explanatory variables on a continuous outcome. For each combination of
levels from the 2 explanatory variables we assume that the outcome is normally distributed with
the same variance as at any other combination of levels.
The two explanatory variables may be either two separate dimensions of “treatment”, e.g.
water and fertilizer for plant growth, or they may be a factor of primary interest and “blocks”. In
the latter case, we assume that there is a “block effect” but are not interested in it; rather the
blocks are used to provide a wider range of subjects and/or environments to improve
generalizability without sacrificing power through too much within-group variability.
The model may be “additive”, in which case the effect (on outcome) of “moving” between
specific levels of either variable when holding the other variable constant is the same for all levels
of the second variable. Or the model may include the “interaction” between the two explanatory
variables, in which case the effect of moving between specific levels of one variable does indeed
depend on the level of the second variable. A test of the null hypothesis of “no interaction”
decides between these two models.
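In symbols, the two models can be written as below. The notation is a common textbook convention and is an assumption here, not something taken from the SAS output: μ_ij is the mean outcome at level i of factor A and level j of factor B, α_i and β_j are the main effects, and (αβ)_ij is the interaction term.

```latex
\begin{align*}
\text{Additive model:}    \quad & \mu_{ij} = \mu + \alpha_i + \beta_j \\
\text{Interaction model:} \quad & \mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} \\
H_0 \text{ (no interaction):} \quad & (\alpha\beta)_{ij} = 0 \quad \text{for all } i, j
\end{align*}
```

In both models each observation in cell (i, j) is assumed to be normally distributed around μ_ij with the common variance σ².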
Tests of “main effects” of the two explanatory variables individually do not make sense if
there is an interaction. If there is no interaction, we are interested in testing the main effect null
hypotheses of no effect of factor A and no effect of factor B (unless that effect is a block effect).
Reading of the ANOVA table is similar to what we have seen before, but somewhat more
complicated. Pay special attention to which hypothesis goes with which p-value.
Although the use of indicator variables and interaction variables is critical, we will be using an
SAS analysis procedure (univariate general linear model) that does this for us automatically.
The profile or interaction plot shows predicted mean outcomes on the y-axis. The x-axis
shows levels of one factor (in some arbitrary order). A separate line is drawn for each level of the
second factor. Note that choice of “first” vs. “second” factor is arbitrary. If we fit a model
without an interaction, the lines will be parallel and the predicted means will not perfectly match
the observed cell means.
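To see why a no-interaction fit gives parallel lines, here is a minimal Python sketch. The cell means are made up for illustration (they are NOT the lab data), and the shortcut formula used for the additive fit (row mean + column mean - grand mean) assumes a balanced design.

```python
# Hypothetical cell means (NOT the lab data): outer key = training, inner key = dose.
cell_means = {
    "placebo":  {"placebo": 10.0, "120mg": 12.0, "250mg": 20.0},
    "mnemonic": {"placebo": 14.0, "120mg": 15.0, "250mg": 31.0},
}

def additive_fit(means):
    """Predicted cell means under the no-interaction model (balanced design assumed)."""
    rows = list(means)
    cols = list(means[rows[0]])
    grand = sum(means[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
    row_mean = {r: sum(means[r][c] for c in cols) / len(cols) for r in rows}
    col_mean = {c: sum(means[r][c] for r in rows) / len(rows) for c in cols}
    return {r: {c: row_mean[r] + col_mean[c] - grand for c in cols} for r in rows}

fitted = additive_fit(cell_means)

# Vertical gap between the two training lines at each dose.
observed_gaps = [cell_means["mnemonic"][d] - cell_means["placebo"][d]
                 for d in cell_means["placebo"]]
fitted_gaps = [fitted["mnemonic"][d] - fitted["placebo"][d]
               for d in fitted["placebo"]]

print("observed gaps:", observed_gaps)  # these vary from dose to dose
print("fitted gaps:  ", fitted_gaps)    # these are constant, so the lines are parallel
```

The fitted gaps are identical at every dose, which is exactly what "parallel lines" means on the profile plot, while the fitted values no longer equal the observed cell means.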
Example: Ginkgo for Memory
An herb called Ginkgo biloba has played a role in Chinese herbal medicine for thousands of
years, but recently it has become a hot seller on the dietary supplement circuit. Proponents claim
that daily doses of Ginkgo biloba improve memory and increase concentration. They have been
encouraged by a recent study in the Journal of the American Medical Association, which
suggests that Ginkgo biloba “improved” quality of life for patients with mild dementia caused by
Alzheimer's disease. No rigorous studies have yet been run on healthy patients, but millions of
dollars are being made on the claims surrounding Ginkgo biloba. Let's consider hypothetical data
from a study of its memory effects in healthy patients.
The design is arranged in a “two-way layout” where two distinct factors, A and B, are varied
simultaneously. In this case, factor A is the dosage of Ginkgo biloba given to the subjects daily
for two months, and factor B is the level of “mnemonic training” given to subjects during that
same period. By considering both factors simultaneously, we can determine if the effect of the
herb facilitates training. Three levels of daily dosage of Ginkgo biloba are given in the
experiment: Placebo, 120mg, and 250mg. Two levels of training are given in the experiment: an
engaging but otherwise useless video-tape that serves as a placebo and specific training with a
variety of well-studied mnemonic techniques (e.g., building associations, etc). Time on task is
equalized in both levels of training.
All subjects were given a memory test before the study and again at the end of the two-month
period. The response variable for the experiment is the difference (after - before) in the memory
test scores for each subject. The test has been calibrated so that subjects are not near ceiling in
their initial test scores.
There were 18 subjects randomly assigned to each combination of levels of factor A and B (6
combinations in all). A specific combination of the two factors can be called a “condition” or a
“cell” (in the two-way layout), interchangeably. The factors in this experiment are called “crossed”
because every level of factor A is combined with every level of factor B.
Task 1: The basic 2-way ANOVA procedure.
a. Load the tab-delimited data ginkgo.txt. The outcome is the “after-before” score difference
in a memory test. Note that the explanatory variables Ginkgo Dose and Training Type
are “nominal” and should be treated as such. If a nominal variable with more than 2 levels
is used as a quantitative variable, we will get an incorrect analysis. Note that there is no
need to create dummy or interaction variables because we will use a SAS procedure that
does that for us.
b. How many levels are there for the Dose variable and how many for the Training Type variable?
c. Here is a way to perform EDA for 2-way ANOVA. Create a new variable that combines
the information from the two explanatory variables into a single variable. Make sure you
are in Edit mode. Use Data/Transform/Compute to create a new column “dosetrain” based
on the formula “10*dose+training”. This creates a 2-digit code with dose in the tens place
and training in the ones place. You can also create more meaningful labels for plotting by
recoding “dosetrain” into a new column “Dose/Train”. Go to Data/Transform/Recode
Values and, for example, code 11 as “placebo/placebo” and 12 as “placebo/mnemonic”,
etc., and choose ‘character’ not ‘numeric’. Then perform Statistics/Descriptive/Summary
Statistics/ for SCOREDIFF with class DOSETRAIN. Add Medians and for Plots choose
both Boxplot and Histogram. Examine the pattern of medians and look for problems with
outliers, gross non-normality, and unequal spread. Any outliers? Under Graphs choose
Boxplot but this time for class choose the recoded variable.
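The combined-code idea can be sketched outside SAS as follows. This is only an illustration in Python: the function names are hypothetical, and the label for dose=2 (120mg) is inferred from the order of the doses rather than stated in the data file.

```python
# Level labels. dose=1 (placebo) and dose=3 (250mg) follow the handout;
# dose=2 -> "120mg" is an assumption based on the dose ordering.
DOSE_LABELS = {1: "placebo", 2: "120mg", 3: "250mg"}
TRAINING_LABELS = {1: "placebo", 2: "mnemonic"}

def dosetrain_code(dose, training):
    """Two-digit code: dose in the tens place, training in the ones place."""
    return 10 * dose + training

def dosetrain_label(code):
    """Recode the two-digit code into a readable 'dose/training' label."""
    dose, training = divmod(code, 10)
    return f"{DOSE_LABELS[dose]}/{TRAINING_LABELS[training]}"

# List all six cells of the 3 x 2 layout with their codes and labels.
for dose in DOSE_LABELS:
    for training in TRAINING_LABELS:
        code = dosetrain_code(dose, training)
        print(code, dosetrain_label(code))
```

Because each factor has fewer than 10 levels, every cell gets a unique code, and the code can always be split back into its two parts.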
d. Before performing the correct analysis, use Statistics/ANOVA/OneWayANOVA to look
at the difference in the memory score based on the Dose/Training factor.
e. Here is how to perform the analysis, including some additional EDA.
1. From the menu choose Statistics/ANOVA/Factorial ANOVA or
Statistics/ANOVA/LinearModel. Enter the SCOREDIFF as the Dependent
Variable and enter DOSE and TRAINING in Class. Entering something in the
Class box is what automatically creates indicator variables (with the last level of
each factor as the baseline level).
2. Click Model/Standard Models and include 2-way interaction. You should see now
in the Effects in Model also DOSE*TRAINING. Click OK.
3. To create the profile (interaction) plot, click Plots. Under Means tab, choose both
types of Plots and Predicted Means. Under Predicted tab, choose both plots. Under
Residual tab choose appropriate plots as we did in all the other ANOVAs so far.
4. Click OK to perform the analysis.
f. Look at the Descriptive Statistics tables. By hand, make a 2 by 3 table that has row labels
“Placebo” and “Mnemonic” and column labels for the 3 doses. Inside the table put the
number of subjects for each combination of factor levels. In the margins put the number
of subjects added up for that row or column. (♠1) Is this a “balanced-design”? This is a
nice summary of the experimental design. How do these cells correspond to the boxplots?
See if the six cell variances are similar. Now, choose Reports/Tables from the menu.
Select Column Classes/Row Classes type of table. In Column Class enter DOSE and for
the Row Class enter TRAINING. Click OK. Does this match the table you got by hand?
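The hand-made table of counts with margins can be mimicked with a few lines of Python. The records below are made up (2 subjects per cell) purely to show the bookkeeping; they are NOT the lab data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical (training, dose) records, one per subject.
records = [
    (training, dose)
    for training in ("Placebo", "Mnemonic")
    for dose in ("Placebo", "120mg", "250mg")
    for _ in range(2)  # made-up cell size
]

def count_table(records):
    """Return cell counts, row totals, column totals, and the grand total."""
    cells = Counter(records)
    row_totals = Counter(training for training, _ in records)
    col_totals = Counter(dose for _, dose in records)
    return cells, row_totals, col_totals, len(records)

def is_balanced(cells):
    """A design is balanced when every cell has the same number of subjects."""
    return len(set(cells.values())) == 1

cells, row_totals, col_totals, total = count_table(records)
print("cells:", dict(cells))
print("row totals:", dict(row_totals))
print("column totals:", dict(col_totals))
print("grand total:", total, "balanced:", is_balanced(cells))
```

The row and column totals are the "margins" of the table, and the balance check simply asks whether all six cell counts agree.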
g. Look at the ANOVA table. Notice that, as usual, each MS is the ratio of the SS and df for
that line.
h. Note that “dose” has 2 df because there are 3 levels, and that “training” has 1 df because
there are 2 levels. The interaction of dose and training has 2 df. Here is one explanation
why: If there is no interaction, then the “additive” pattern can be described as the pattern
for the 3 doses for the placebo training plus the “shift” (up or down) of the same pattern
for mnemonic training. But if there is an interaction, we also need to describe how
120mg/mnemonic differs from its additive position and how 250mg/mnemonic differs
from its additive position, so we are free to specify 2 additional pieces of information
when we have the interaction model.
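The degrees-of-freedom bookkeeping above can be written as a small Python sketch. The function name is hypothetical, and the total of 108 subjects assumes the 18 subjects per cell described in the example (6 cells).

```python
def two_way_df(levels_a, levels_b, n_total):
    """Degrees of freedom for a full factorial 2-way ANOVA with interaction."""
    df_a = levels_a - 1            # main effect of factor A
    df_b = levels_b - 1            # main effect of factor B
    df_ab = df_a * df_b            # interaction
    df_model = df_a + df_b + df_ab # equals (number of cells) - 1
    df_error = n_total - levels_a * levels_b
    return {
        "A": df_a,
        "B": df_b,
        "A*B": df_ab,
        "model": df_model,
        "error": df_error,
        "total": n_total - 1,
    }

# 3 dose levels, 2 training levels, 18 subjects in each of the 6 cells (assumed).
print(two_way_df(levels_a=3, levels_b=2, n_total=6 * 18))
```

Note that the model df and error df always add up to the total df, which is a quick check on any ANOVA table.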
i. Note that each of the five F values is a specific MS divided by the Error MS. (In the
future we may see situations where there is a different denominator.) Check for DOSE
that F_dose = MS_dose / MS_error. For each F value, the p-value is the area under the
appropriate null sampling distribution of F to the right of the F value given in the table.
What is the “appropriate null sampling distribution of F” for each row of the table?
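The arithmetic behind one row of the table can be sketched in Python as follows. The SS values are made up for illustration. The right-tail formula is a closed form that holds only when the numerator df is exactly 2 (as for DOSE and DOSE*TRAINING in this design); for other numerator df you would rely on SAS or a statistics library.

```python
def mean_square(ss, df):
    """MS is the sum of squares divided by its degrees of freedom."""
    return ss / df

def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F statistic: effect MS divided by error MS."""
    return mean_square(ss_effect, df_effect) / mean_square(ss_error, df_error)

def f_right_tail_df1_is_2(f_value, df_error):
    """P(F > f_value) for an F distribution with 2 and df_error degrees of freedom.

    Valid ONLY for numerator df = 2, where the tail area reduces to
    (1 + 2*f/df_error) ** (-df_error / 2).
    """
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)

# Hypothetical sums of squares (NOT the lab results).
f_value = f_ratio(ss_effect=2000.0, df_effect=2, ss_error=102000.0, df_error=102)
p_value = f_right_tail_df1_is_2(f_value, df_error=102)
print("F =", f_value, "p =", p_value)
```

The p-value shrinks as F grows, because less of the null distribution lies to the right of a larger F.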
j. Here are the null hypotheses for the five p-values. Which can we reject? (♠2)
1. Model: the mean outcome is the same for all cells (no effect of either dose or
training)
2. Error: the mean outcome for the baseline levels (last level in SAS) of both factors
is zero
3. DOSE: the mean outcome is the same for all levels of dose, ignoring training (no
effect of dose). This is the test for the dose main effect.
4. TRAINING: the mean outcome is the same for all levels of training, ignoring dose
(no effect of training). This is the test for the training main effect.
5. DOSE*TRAINING: there is no interaction of dose and training (the effect profile
of one factor is parallel at all levels of the other factor)
As mentioned above, it does not make sense to look at the main effects if interactions
are present. For example, if the interaction plot shows a U shape for the mean
outcomes at different levels of one factor at level 1 of the second factor, and an
inverted U shape for the corresponding outcome means at level 2 of the second factor,
then both main effects may be non-significant, but this does not indicate that changing
levels of either factor has no effect on outcome.
Therefore, it makes sense to first look at the interaction p-value, and then look at the
main effect p-values only if we do not reject the “no interaction” null hypothesis.
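The U versus inverted-U situation can be made concrete with a small Python sketch. The cell means are made up so that both sets of marginal means are exactly flat while the interaction is large.

```python
# Hypothetical cell means: rows are levels of the second factor,
# columns are the three levels of the first factor.
cell_means = [
    [8.0, 2.0, 8.0],   # level 1 of the second factor: U shape
    [4.0, 10.0, 4.0],  # level 2 of the second factor: inverted U shape
]

n_rows, n_cols = len(cell_means), len(cell_means[0])
grand = sum(sum(row) for row in cell_means) / (n_rows * n_cols)

# Marginal means: what each main-effect test compares.
row_means = [sum(row) / n_cols for row in cell_means]
col_means = [sum(cell_means[i][j] for i in range(n_rows)) / n_rows
             for j in range(n_cols)]

# Interaction effects: the part of each cell mean not explained additively.
interaction = [
    [cell_means[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
    for i in range(n_rows)
]

print("row means:", row_means)      # identical, so no main effect of the second factor
print("column means:", col_means)   # identical, so no main effect of the first factor
print("interaction:", interaction)  # far from zero
```

Both main effects vanish even though changing the level of either factor clearly changes the outcome, which is why the interaction test is examined first.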
k. Look at the residual plots. The most useful subplot is the first one, which is a residual vs.
predicted plot. There are 6 columns of points, one for each cell in your table. We can see
that the spreads of the 6 sets are about equal, and the means of the 6 sets are all reasonably
close to the reference line in the center of the graph. You may prefer to create a nicer plot
by using the Store option to store predicted and residual values in your data spreadsheet,
and then separately create the scatter plots and add your own embellishments. This is
something you can explore on your own for homework.
l. Finally look at the two Profile Plots. Verify that the plots show the exact same
information, which is the six cell means. The non-significant interaction p-value of 0.341
tells us that we cannot consider the lines on either plot to be non-parallel once we consider
the uncertainty in those cell means. Note: generally we put the factor with more levels on
the x-axis, but if the other direction is more informative you should go with that.
Task 2: Eliminating a case
In the EDA, you should have noticed an “outlier” for 250mg/mnemonic, which has a memory
score difference of about –85 (case 104). Remember that the definition of outlier for the purpose
of defining boxplots is any point more than 1.5 IQR below Q1 (or more than 1.5 IQR above Q3).
This point is between 1.5 and 2.0 IQR below Q1. This does not mean that the observation is
necessarily bad in some way! But at least we should see if it was recorded or entered in error. If
we find no entry error, we may consider reporting results with and without that observation. Here
is how to analyze the data without that observation.
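The boxplot outlier rule can be sketched in Python as below. The quartiles used here are made up (the real Q1 and Q3 come from your Summary Statistics output), and the function names are hypothetical.

```python
def iqr_fences(q1, q3, k=1.5):
    """Lower and upper fences: points outside them are flagged as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_boxplot_outlier(value, q1, q3, k=1.5):
    """True if value lies more than k IQRs below Q1 or above Q3."""
    low, high = iqr_fences(q1, q3, k)
    return value < low or value > high

def iqrs_below_q1(value, q1, q3):
    """How many IQRs the value lies below Q1 (negative if it is above Q1)."""
    return (q1 - value) / (q3 - q1)

# Hypothetical quartiles for one cell; -85 is the score difference in question.
q1, q3 = -20.0, 20.0
value = -85.0
print("fences:", iqr_fences(q1, q3))
print("outlier?", is_boxplot_outlier(value, q1, q3))
print("IQRs below Q1:", iqrs_below_q1(value, q1, q3))
```

With these made-up quartiles the point is flagged because it falls below the lower fence, but it is still less than 2 IQRs below Q1, so it is not extreme.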
a. Identify the point in the Data Editor.
b. Go to Data/Filter/SubsetData. Choose SCOREDIFF and NE (for not equal), and in the LOOK
up value window choose -85. Hit OK. In your Data Editor you will see that case 104 is
missing.
c. Do Descriptive/Summary Statistics and look at the boxplot. The point should not be there
any more.
d. Perform the 2-way ANOVA. You should be able to just select ANOVA/Linear Models or
ANOVA/Factorial ANOVA (depending what you did at the beginning) and just click OK.
All the options you selected the last time should be still there.
e. Go to Data/Filter and select “None” to return to normal.
f. What are your interpretations now? To be fair and honest, you must report both sets of
analyses if you eliminate a case without clear cause.
Task 3: Additional options for SAS 2-way ANOVA
a. Reopen the ANOVA/LinearModel dialog box. To analyze the data with a model that only
has the main effects and does not include an interaction, choose the Model box, and select
DOSE*TRAINING in Effects to be Removed. Or, if you are using the Linear Model option
for the first time, choose just the Main Effects under Standard Models. Now also click on
the Predictions tab and choose List Predictions based on the original data. Under the Means tab
choose a comparison method and test the means for both effects. Under Statistics tab,
choose Parameter Estimates. Click OK, to get the no-interaction model. What row is
missing from the ANOVA table? Note that most of the SS values change when the model
changes. Which null hypotheses are rejected for the no-interaction model? (♠3)
b. It is generally insufficient to state that a certain null hypothesis is rejected. When we do
not reject the “no-interaction” null hypothesis, we also want to know which level of each
factor is associated with the better outcome. For 2 level factors that is all we need to
know. For factors with more than 2 levels, we want to do additional “contrast testing”
(saved for another week). Now your results will have a section labeled Least Squares
Means. We can see that the estimated mean for mnemonic is about 21 units higher than
placebo training. If there were a significant main effect for training, we would state that
mnemonic training improves memory more than placebo training. For now we just note
that for dose placebo and 120 mg have similar mean memory score differences, and 250
mg has a higher mean, though we have not learned a way to test the significance of these
differences yet.
c. Effect estimates (parameter estimates or regression coefficients) are also listed. Note that
the highest level of both factors has estimate 0 with no standard error, t-value or p-value,
and that they have the note “this parameter is set to zero because it is redundant”. These
are the “baseline” levels and all other comparisons are to the baseline level, as in our study
of ANCOVA. For example, assuming that there is no interaction, at either training level
the placebo (dose=1) has a predicted mean of 33 lower than the 250mg dose (dose=3).
With a standard error of 14.861, this gives us a 95% confidence interval of 62.5 to 3.6
units lower. To test the null hypothesis that placebo has the same effect on outcome as
250 mg of ginkgo, we calculate t=-33.056/14.861=-2.22. Based on the t-distribution with
104 df (the error df for this model), values smaller than -2.22 or bigger than +2.22 occur only 2.8% of the time when that
null hypothesis is true. So we reject that null hypothesis and conclude that it is likely that
placebo really does produce a lower memory score difference than 250 mg of ginkgo.
Make a corresponding interpretation for the comparison of 120 mg with 250 mg.
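The t statistic and confidence interval quoted above can be reproduced with a few lines of Python. The estimate and standard error are the values given in the text. The critical value 1.983 is an assumption: it is the approximate 97.5th percentile of a t-distribution with 104 error df (108 subjects minus 4 fitted parameters in the no-interaction model).

```python
estimate = -33.056   # placebo minus 250 mg, from the parameter estimates
std_error = 14.861   # standard error of that estimate
t_crit = 1.983       # ASSUMED: approx. 97.5th percentile of t with 104 df

# Test statistic for the null hypothesis that the true difference is zero.
t_stat = estimate / std_error

# 95% confidence interval: estimate plus or minus t_crit standard errors.
margin = t_crit * std_error
ci_low, ci_high = estimate - margin, estimate + margin

print("t =", round(t_stat, 2))
print("95% CI:", round(ci_low, 1), "to", round(ci_high, 1))
```

The same three lines of arithmetic apply to the 120 mg versus 250 mg comparison once you substitute its estimate and standard error from your own output.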
Task 4: Try an ANCOVA in SAS with DOSE and TRAINING as main effects.
Compare these results to the no-interaction 2-way ANOVA results. What is different?