Chapter 5-10 Linear Regression Robustness to Assumptions
In ANOVA and linear regression, the following assumptions are made (van Belle et al., 2004,
p.397):
1. Homogeneity of variance (In one-way ANOVA, each group has the same variance on the
outcome variable. In linear regression with a single continuous predictor, the variance around
the regression line is the same at every point along the X axis.)
2. Normality of the residual error (the distribution of differences between the actual and
predicted values has a normal distribution).
3. Statistical independence of the residual errors (the residuals have no discernable pattern).
4. Linearity of the model (the form Y = a + bX + error is a good representation of the data,
so the variability in Y can be partitioned into these separate terms).
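Since these checks concern the residuals, a quick way to see them in practice is to fit the model and inspect the residuals directly. The sketch below is in Python with made-up data, purely for illustration (this manual's analyses are done in Stata); it is a minimal sketch of assumption checks 1 and 2, not a full diagnostic workflow.

```python
import numpy as np

# Hypothetical data generated to satisfy Y = a + bX + error.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 200)

# Least-squares fit via a design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

# Normality of the residual error (assumption 2): skewness near 0.
skew = ((residuals - residuals.mean()) ** 3).mean() / residuals.std() ** 3

# Homogeneity of variance (assumption 1): residual spread should be
# similar in the lower and upper halves of the X axis.
lo = residuals[x < np.median(x)].std()
hi = residuals[x >= np.median(x)].std()
print(round(coef[1], 2), round(skew, 2), round(lo / hi, 2))
```

A plot of residuals against fitted values and a normal quantile plot of the residuals are the usual graphical versions of these same checks.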
If either or both of the assumptions of homogeneity of variance or normality of the residual error
are violated, transformations of the data, such as taking logarithms, are frequently advocated. If
the right transformation is selected, either or both assumptions are usually met on the
transformed scale.
Statisticians frequently make the comment that t tests, analysis of variance (ANOVA), and linear
regression are robust to the assumptions of homogeneity of variance (equal variance) and
normality. These two assumptions are the focus of this chapter, which provides authoritative
references to back up the robustness claim.
These robustness claims originally came from statistical papers on ANOVA. However, ANOVA
is simply a special case of linear regression, as we saw in Chapter 5-4. Further, an independent
groups t test is just a one-way ANOVA comparing two means. So, the robustness described in
this chapter applies to the t test, ANOVA, and linear regression.
_________________
Source: Stoddard GJ. Biostatistics and Epidemiology Using Stata: A Course Manual [unpublished manuscript] University of Utah
School of Medicine, 2010.
Chapter 5-10 (revision 16 May 2010)
p. 1
Homogeneity of Variance
Lorenzen and Anderson (1993, p.35) state,
“Historically, the assumption thought to be the most critical was the homogeneity of
variance. However, Box (1954) demonstrated that the F test in ANOVA was most robust
for α while working with a fixed model having equal sample sizes. He showed that for
relatively large (one variance up to nine times larger than another) departures from
homogeneity, the α level may only change from .05 to about .06. This is not considered
to be of any practical importance. (It should be pointed out that the only time an α level
increased dramatically was when the sample size was negatively correlated with the size
of the variance.)”
To put this in simpler terms, statisticians concern themselves with ensuring that significance tests
do not increase the type I error rate. Violations of assumptions, then, are of concern if they lead
to rejection of a true null hypothesis more frequently than α, almost always set at 0.05. Box
showed that you could have very large departures from homogeneity of variance without
affecting the alpha level in any appreciable way.
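Box's result is easy to reproduce by simulation. The sketch below (Python; the group sizes, variances, and number of replications are illustrative choices, not taken from Box's paper) runs a one-way ANOVA F test on three equal-sized groups whose true means are all zero but whose variances differ nine-fold. The rejection rate stays near the nominal 0.05.

```python
import numpy as np
from scipy import stats

# Simulation sketch of Box's (1954) finding: with equal group sizes, the
# one-way ANOVA F test holds its type I error rate near alpha = 0.05 even
# when one group's variance is nine times another's.
rng = np.random.default_rng(42)
n, reps, alpha = 30, 4000, 0.05
rejections = 0
for _ in range(reps):
    # Null hypothesis true: all three groups have mean 0; variances 1, 4, 9.
    g1 = rng.normal(0, 1, n)
    g2 = rng.normal(0, 2, n)
    g3 = rng.normal(0, 3, n)
    _, p = stats.f_oneway(g1, g2, g3)
    rejections += p < alpha
print(round(rejections / reps, 3))  # lands near .05 to .06, as Box reported
```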
When the homogeneity of variance assumption is found to be violated, using a test such as Levene’s
test of homogeneity, authors of statistics textbooks frequently advise transforming the outcome
variable. Lorenzen and Anderson (1993, p.35) offer this advice,
“When there are large departures from homogeneity, it is felt that the data should be
transformed to produce more meaningful results. However, one must take care in the
interpretation of the results after transforming since transforming also changes the form
of the mathematical model. To our knowledge, no one has come up with an α level on
homogeneity tests that protects against too much heterogeneity. A set of working rules
that seems to be effective for the practitioner is as follows:
1. If the homogeneity test is accepted at α = .01, do not transform.
2. If the homogeneity test is rejected at α = .001, transform.
3. If the result of the homogeneity test is between α = .01 and α = .001, try very hard to
find out the theoretical distribution from the investigator. If there is a practical reason
to transform and the transformed variable makes sense, go ahead and transform.
Otherwise, we recommend not transforming.”
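These working rules can be written down as a small decision function. The Python sketch below pairs them with Levene's test; the thresholds come from Lorenzen and Anderson, while the function name and the sample data are made up for illustration.

```python
import numpy as np
from scipy import stats

def transform_decision(p_levene):
    """Apply the Lorenzen-Anderson working rules to a Levene's test p-value."""
    if p_levene >= 0.01:        # homogeneity accepted at alpha = .01
        return "do not transform"
    if p_levene < 0.001:        # homogeneity rejected at alpha = .001
        return "transform"
    # Between .001 and .01: seek the theoretical distribution first.
    return "consult the investigator about the theoretical distribution"

# Illustrative data: two groups with the same variance.
rng = np.random.default_rng(1)
group_a = rng.normal(0, 1, 40)
group_b = rng.normal(0, 1, 40)
_, p = stats.levene(group_a, group_b)
print(transform_decision(p))
```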
In discussing the assumptions of classical hypothesis tests (t test, ANOVA, linear regression) van
Belle (2002, p.10) states,
“The second condition for the validity of tests of hypotheses is that of homogeneity of
variance. Box (1953) already showed that hypothesis tests are reasonably robust against
heterogeneity of variance. For a two-sample test a three-fold difference in variances does
not affect the probability of a Type I error. Tests of equality of variances are very
sensitive to departures from the assumptions and usually don’t provide a good basis for
proceeding with hypothesis tests. Box (1953) observed that,
...to make the preliminary test on variances is rather like putting to sea in a rowing
boat to find out whether conditions are sufficiently calm for an ocean liner to
leave port!”
Normality
Lorenzen and Anderson (1993, p.41) state,
“Generally speaking, the F ratio used in the analysis of variance has been shown to be
very robust to departures from normality, Eisenhart (1947).”
In discussing commonly used tests of means, Box (1953) states,
“...thanks to the work of Pearson (1931), Bartlett (1935), Geary (1947), Gayen (1950 a,
b), David & Johnson (1951 a, b) there is abundant evidence that these comparative tests
on means are remarkably insensitive to general* non-normality of the parent population.
____
*By ‘general’ parent non-normality is meant that the departure from normality, in
particular skewness, is the same in the different groups, as could usually be assumed
when the data were from an experiment in which the groups corresponded with different
applied treatments to be compared. In tests in which sample means are compared,
general skewness tends to be cancelled out; larger effects are found, however, if the
skewness is in different directions in the different groups.
References
Pearson, E.S. (1931). Biometrika, 23, 114.
Bartlett, M.S. (1935). Proc. Camb. Phil. Soc. 31, 223.
Geary, R.C. (1947). Biometrika, 34, 209.
Gayen, A.K. (1950a). Biometrika, 37, 236.
Gayen, A.K. (1950b). Biometrika, 37, 399.”
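Box's footnote, that skewness in the same direction tends to cancel in comparisons of means, can also be checked by simulation. The Python sketch below (illustrative sample size and distribution, not from Box's paper) draws both groups from the same right-skewed exponential distribution under a true null; the t test still rejects at close to the nominal 5% rate.

```python
import numpy as np
from scipy import stats

# Sketch of the skewness-cancellation point: both groups are skewed in the
# same direction, the null hypothesis is true, and the two-sample t test
# keeps its type I error rate near alpha = 0.05.
rng = np.random.default_rng(9)
n, reps = 30, 3000
rej_same = 0
for _ in range(reps):
    a = rng.exponential(1.0, n)   # right-skewed
    b = rng.exponential(1.0, n)   # same shape and mean: null is true
    _, p = stats.ttest_ind(a, b)
    rej_same += p < 0.05
print(round(rej_same / reps, 3))  # stays close to the nominal 0.05
```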
In discussing and contrasting independence of observations, homogeneity of variance, and
normality in t tests, ANOVA, and linear regression, van Belle (2002, p. 10) states,
“Normality is the least important in tests of hypotheses. It should be noted that the
assumption of normality deals with the error term of the model, not the original data.
This is frequently forgotten by researchers who plot histograms of the raw data rather
than the residuals from the model....”
Note: van Belle points out that the assumption of normality is actually that the residuals are
normally distributed, not the variables themselves. However, if you use indicator variables to
model the groups, the regression line goes directly through the group means, and so the residuals
will have the same distributional shape as the outcome variable when you examine the outcome
variable separately for each group.
Just how far from normality can one go without creating a problem? It is not so simple to
quantify a departure from normality, but a contrast to the homogeneity of variance assumption
can be made. We saw above that for the homogeneity of variance assumption, the variance of
one group could be three times the variance of another group without changing alpha (the type I
error) at all, and could be nine times the variance and only change alpha from 0.05 to 0.06, a
change of no importance. What might seem like appalling violations, then, are of no
consequence due to the robustness of the hypothesis tests. In contrast, the hypothesis tests are
even more robust to the assumption of normality. van Belle (2002, p.8) explains,
“Many investigators have studied the issue of the relative importance of the assumptions
underlying hypothesis tests (see, Cochran, 1947; Gastwirth and Rubin, 1971; Glass et al.,
1972; Lissitz and Chardoes, 1975; Millard et al., 1985; Pettitt and Siskind, 1981; Praetz,
1981; Scheffé, 1959; and others). All studied the effects of correlated errors on classical
parametric and nonparametric tests. In all of these studies, positive correlation resulted in
an inflated Type I error level, and negative correlation resulted in a deflated Type I error
level. The effects of correlation were more important than differences in variances
between groups, and differences in variances were more important than the assumption of
a normal distribution.”
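The damage done by positively correlated errors is also easy to see by simulation. The Python sketch below (the AR(1) structure, correlation of 0.5, and sample size are illustrative choices, not taken from the studies cited) applies a one-sample t test at a nominal alpha of 0.05 to autocorrelated data under a true null; the type I error rate is inflated well above 5%, unlike the mild effects of unequal variances or non-normality seen earlier.

```python
import numpy as np
from scipy import stats

# Simulation sketch: positively autocorrelated observations inflate the
# type I error of a nominal 5% one-sample t test of mean = 0.
rng = np.random.default_rng(3)
n, reps, rho = 40, 3000, 0.5
rejections = 0
for _ in range(reps):
    e = rng.normal(0, 1, n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = rho * x[t - 1] + e[t]   # AR(1): positively correlated, mean 0
    _, p = stats.ttest_1samp(x, 0.0)
    rejections += p < 0.05
print(round(rejections / reps, 3))  # well above the nominal 0.05
```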
What To Do With This Knowledge
Your model is probably fine in most cases, if your data are somewhat symmetrical and skewness
does not occur in opposite directions for your study groups.
There is no need to rush into transformations, which can be avoided in most cases.
Still, many journal reviewers are not aware of the robustness of linear regression to violations of
normality and homogeneity of variance, since this is not taught in introductory statistics courses,
so sometimes you get a reviewer response to your manuscript asking if you tested the
assumptions of normality and equal variances.
As a final note, even when aware that linear regression is robust to the equal variance and
normality assumptions, statisticians will sometimes still go ahead and try a transformation when
the assumptions are violated. This is because an increase in precision is frequently gained by the
transformation, making it easier to achieve statistical significance. We will see an example of
this in the “modeling costs” chapter, where a log transformation of the hospitalization cost
variable shrinks the variance considerably.
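The precision gain from a log transformation is easy to illustrate. The Python sketch below generates made-up, right-skewed, cost-like data from a lognormal distribution (the real example is in the modeling-costs chapter) and compares the relative spread on the raw and log scales.

```python
import numpy as np

# Illustration: for right-skewed, cost-like data, the log transform greatly
# reduces the spread relative to the mean, which is where the precision gain
# in the hypothesis test comes from.
rng = np.random.default_rng(5)
costs = rng.lognormal(mean=8.0, sigma=1.2, size=500)   # skewed dollar amounts

cv_raw = costs.std() / costs.mean()                # coefficient of variation
cv_log = np.log(costs).std() / np.log(costs).mean()
print(round(cv_raw, 2), round(cv_log, 2))          # log scale is far tighter
```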
References
Box, GEP. (1953). Non-normality and tests on variances. Biometrika 40:318-335.
Box, GEP. (1954). Some theorems on quadratic forms applied in the study of analysis of variance
problems, I. Effect of inequality of variance in the one-way classification. Annals of
Mathematical Statistics, 25: 290-302.
Cochran WG. (1947). Some consequences when the assumptions for the analysis of variance are
not satisfied. Biometrics 3:22-38.
Eisenhart C. (1947). The assumptions underlying the analysis of variance. Biometrics 3:1-21.
Gastwirth JL, Rubin H. (1971). Effect of dependence on the level of some one-sample tests.
Journal of the American Statistical Association. 66:816-820.
Glass GV, Peckham PD, Sanders JR. (1972). Consequences of failure to meet the assumptions
underlying the fixed effects analysis of variance and covariance. Review of Educational
Research. 42:237-288.
Lissitz RW, Chardos S. (1975). A study of the effect of the violation of the assumption of
independent sampling upon the type I error rate of the two group t-test. Educational and
Psychological Measurement. 35:353-359.
Lorenzen TJ, Anderson VL. (1993). Design of Experiments: a No-Name Approach. New York,
Marcel Dekker.
Millard SP, Yearsley JR, Lettenmaier DP. (1985). Space-time correlation and its effect on
methods for detecting aquatic ecological change. Canadian Journal of Fisheries and
Aquatic Science. 42:1391-1400. Correction: (1986) 43:1680.
Nawata K, Sohmiya M, Kawaguchi M, et al. (2004). Increased resting metabolic rate in patients
with type 2 diabetes mellitus accompanied by advanced diabetic nephropathy.
Metabolism 53(11) Nov: 1395-1398.
Pettitt AN, Siskind V. (1981). Effect of within-sample dependence on the Mann-Whitney-
Wilcoxon statistic. Biometrika 68:437-441.
Praetz P. (1981). A note on the effect of autocorrelation on multiple regression statistics.
Australian Journal of Statistics. 23:309-313.
Scheffé H. (1959). The Analysis of Variance. New York, John Wiley and Sons.
van Belle G. (2002). Statistical Rules of Thumb. New York, John Wiley & Sons.
van Belle G, Fisher LD, Heagerty PJ, Lumley T. (2004). Biostatistics: A Methodology for the
Health Sciences, 2nd ed. Hoboken, NJ, John Wiley & Sons.