Exercise 7 Comments

SOME REMARKS ABOUT THE EXERCISE ON THE ANALYSIS OF COVARIANCE
To begin, the collective performance on this exercise was not good, which suggests to me
that I didn’t cover the ideas and procedures sufficiently well in class or that I confused you with
the assignment. When every member of a good class performs below expectation, any good
instructor looks inward. So I’ll keep this problem in mind as I evaluate what the grades on the
assignment are telling me.
To continue, let’s think about the method per se. In any problem of this sort, one is
trying to dissect the effects of the covariate (the continuous, regression-style predictor) and the
effects of the classification variable (the categorical predictor, population in this case) on the
response variable. Now, the classic “analysis of covariance” was designed to test the null
hypothesis that there is no effect of the categorical predictor after we adjust for the covariate’s
effect on the response. This is the test of “adjusted means” but it presumes that the slope of the
response variable on the covariate is the same for all levels of the categorical predictor (i.e., each
population shows the same slope of response on covariate) and that the strength of the regression
within each category is roughly the same.
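In symbols (a standard parameterization; the notation here is mine, not from the exercise), the classic model is

$$
y_{ij} = \mu + \tau_i + \beta\,(x_{ij} - \bar{x}) + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0,\sigma^2),
$$

where $\tau_i$ is the effect of category $i$ and $\beta$ is a single slope shared by all categories. The adjusted-means test is of $H_0\colon \tau_1 = \cdots = \tau_k = 0$; the shared $\beta$ embodies the equal-slopes presumption, and the common $\sigma^2$ embodies the presumption that the regression is about equally strong within each group.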
Remember the logic of these presumptions. If the regression is lousy or even nonexistent in some groups, the analysis of adjusted means is rather a silly thing to do. If the slopes
are different in different groups, the test of the null hypothesis is ambiguous because the outcome
can depend upon where along the covariate axis one performs the test. These are issues we
discussed in class.
All of which combines to dictate the sequence in which you proceed with an analysis of
covariance. First, you do regressions within each category. This verifies for you that the
covariate works within each category and gives you the information needed to make the first test,
which is, formally, a test of the null hypothesis that the residual variances within the groups are
the same. We discussed using the F-max test to do this, and textbooks advocate various tests on
the equality of variances that work as well but with more tedium.
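If you want to see the mechanics in software, here is a minimal sketch in Python with statsmodels and scipy. The data frame, column names, and numbers are fabricated stand-ins for the exercise data, not the data themselves:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Fabricated stand-in data: three populations, length as the covariate
rng = np.random.default_rng(1)
pops = np.repeat(["A", "B", "C"], 30)
length = rng.uniform(20, 60, size=90) + np.repeat([0.0, 10.0, 20.0], 30)
mass = (0.01 * length) ** 3 * rng.lognormal(0.0, 0.1, size=90)
df = pd.DataFrame({"population": pops, "length": length, "mass": mass})
df["resp"] = np.cbrt(df["mass"])  # cube root of mass (see the scaling remarks below)

# First step: regression of the response on the covariate within each group
resid_vars, resids = {}, []
for pop, grp in df.groupby("population"):
    fit = smf.ols("resp ~ length", data=grp).fit()
    print(pop, "slope:", fit.params["length"], "R^2:", fit.rsquared)
    resid_vars[pop] = fit.mse_resid  # residual variance within this group
    resids.append(fit.resid)

# Hartley's F-max: largest within-group residual variance over the smallest,
# compared against an F-max table for k groups and the within-group df
print("F-max:", max(resid_vars.values()) / min(resid_vars.values()))

# Levene's test is one of the textbook alternatives, applied to the residuals
print(stats.levene(*resids))
```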
Second, presuming that you failed to reject that first null hypothesis, you then test the null
hypothesis that the slopes of the regression of the response on the covariate are equal across
groups. This is done with most software packages by running a general linear model with terms
for the covariate, the categorical variable, and the interaction between the two. It is the
interaction between the two that tests the “slope” hypothesis.
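Continuing the sketch above (same hypothetical data frame), the slope test looks like this; it is the interaction row of the ANOVA table that carries the test:

```python
import statsmodels.api as sm

# Full model: covariate, group, and their interaction (slope heterogeneity)
full = smf.ols("resp ~ length * C(population)", data=df).fit()
print(sm.stats.anova_lm(full, typ=2))
# The length:C(population) row tests H0: slopes equal across populations
```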
Third, if justified, you proceed to test the effects of the adjusted means. In other words,
you test the null hypothesis that there is no effect of the categorical predictor after we adjust
for the covariate’s effect on the response. This is done in most software by re-running your
general linear model call and now deleting the interaction from the terms to be considered.
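And the adjusted-means test, again continuing the same hypothetical sketch:

```python
# Reduced model: drop the interaction, keep covariate plus group
reduced = smf.ols("resp ~ length + C(population)", data=df).fit()
print(sm.stats.anova_lm(reduced, typ=2))
# The C(population) row tests for a population effect after adjusting for length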
Now, I focused the class time on showing you where the sums of squares for all of these
tests come from and how one can do some of these tests without running a general linear model
program. This is where I don’t appear to have been completely clear in my discussion; at one
level you can always stumble through by running the “GLM” modules but it is nice to know
where this stuff originates and what the software is really doing. The exercise was focused in part
on reinforcing the lessons in class and in part on helping you see that, when the groups differ in
their distributions of the covariate, the sums of squares for the effect of the covariate and the
group will change, and sometimes change dramatically, when you exclude or include the other
factor. That is, the sum of squares for the effect of population changed substantially depending
on whether we considered population by itself or in the presence of the covariate. This is the same idea as we
discussed for the two-way analysis of variance with unequal sample sizes: the effects are
confounded and we get different answers with Type I SS (calculated for one factor by ignoring
the presence of the other, or summing over it) and Type III SS (calculated for one factor by
acknowledging the presence of the other and calculating the effect of the factor in question
effectively “within levels” of that other factor).
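You can watch that confounding directly by comparing the two types of sums of squares on the reduced model from the sketch above; when the groups have unequal covariate distributions, the population SS will differ between them (Type III in statsmodels is only sensible with sum-to-zero contrasts):

```python
# Type I (sequential) SS: each term adjusted only for terms entered before it
print(sm.stats.anova_lm(reduced, typ=1))

# Type III SS: each term adjusted for all others; use sum-to-zero contrasts
m3 = smf.ols("resp ~ length + C(population, Sum)", data=df).fit()
print(sm.stats.anova_lm(m3, typ=3))
```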
To conclude, these data have some idiosyncratic challenges and I think most folks, in their
focus on the method, glossed over those challenges. For one, there was a scaling issue. Mass
should scale with the cube of length, so any analysis of a mass variable would best be done on the
same scale as the covariate, length, meaning the first thing to do is either take the cube root of
mass or create the cube of length. If you do either one, you’ll see the pattern in the data looks
much cleaner at the outset. Failing to transform in some such fashion creates a problem: the
apparent strength of the regression differs among populations because the populations have
different distributions of length, and the scaling effect makes the regressions with the raw data
look more heterogeneous than they really are.
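Either transformation is a one-liner; with the hypothetical data frame from the first sketch:

```python
df["mass_cbrt"] = np.cbrt(df["mass"])  # put mass on the scale of length
df["length3"] = df["length"] ** 3      # or put length on the scale of mass
```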
There are some legitimate ambiguities in how one might use the data to address the
question. If by “condition” we mean some measure of energy reserves, then the response might
be the difference between the lean dry mass and total dry mass (which is presumably the mass of
the extractable storage lipids). If we mean overall robustness, we might mean total mass. If we
mean some more inchoate measure of muscle/bone density, we might mean the lean mass. The
traditional measure of “condition” just places total mass over length and doesn’t usually address
these nuances. There is similar nuance in “potential fecundity”: do we mean available offspring
(fertilized ova, however far each has developed), capacity for producing offspring (perhaps total
mass of reproductive tissue), or proportionate mass of tissue (mass of reproductive tissue
divided by total dry mass)?
But why do analysis of covariance on such problems rather than just take ratios? For one
reason, a ratio of two random variables can hide information. Here one recalls work by your
graduate student colleague Brian Storz, who showed that the ratio of orbitohyoideus muscle mass to
total tadpole mass is NOT necessarily a good indication of the development of the cannibalistic
morph. Yes, cannibals have high ratios, but you can also get high ratios in stunted omnivores in
which the muscle is developing normally but the overall somatic growth is slow. For another
reason, a ratio of two random variables has a variance and a distribution that is a complex
function of the distribution of each individual random variable. If the two variables covary, the
result can be a mess and information can be lost. For a third, consider that a ratio of something
like reproductive tissue mass to body length cubed is a good index only if reproductive tissue
mass varies isometrically with length cubed - that is, the ratio in a big individual is the same as
that in a small individual. If there is any allometry, then the ratio is different in individuals of
different size just because of the allometry. Now if populations differ in body length distributions
and the response variable is an allometric function of the covariate, then a simple analysis of
variance on the ratio could lead one to think that there are differences among populations when,
in fact, they really have no difference other than the simple difference in body size and its
concomitant allometric relationships. Bad scene, isn’t it? That is why analysis of covariance is
so helpful in this sort of context.
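For concreteness, the first-order (delta-method) approximation to the variance of a ratio shows how the covariance term tangles things:

$$
\operatorname{Var}\!\left(\frac{X}{Y}\right) \approx \left(\frac{\mu_X}{\mu_Y}\right)^{2}\left[\frac{\operatorname{Var}(X)}{\mu_X^{2}} + \frac{\operatorname{Var}(Y)}{\mu_Y^{2}} - \frac{2\,\operatorname{Cov}(X,Y)}{\mu_X\,\mu_Y}\right].
$$

And the allometry point in symbols: if reproductive mass scales as $M = aL^{b}$, then $M/L^{3} = aL^{b-3}$, which is constant across body sizes only when $b = 3$ (isometry). Any $b \neq 3$ builds a size dependence into the ratio itself.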