Chapter 5-10 Linear Regression Robustness to Assumptions
In ANOVA and linear regression, the following assumptions are made (van Belle et al, 2004,
1. Homogeneity of variance (In one-way ANOVA, each group has the same variance on the
outcome variable. In linear regression with a single continuous predictor, the variance around
the regression line is the same at every point along the X axis.)
2. Normality of the residual error (the distribution of differences between the actual and
predicted values has a normal distribution).
3. Statistical independence of the residual errors (the residuals have no discernable pattern).
4. Linearity of the model (the form Y = a + bX + error is a good representation of the data,
so the variability in Y can be partitioned into these separate terms).
If either or both of the assumptions of homogeneity of variance or normality of the residual error
are violated, transformations of the data, such as taking logarithms, are frequently advocated. If
the right transformation is selected, either or both assumptions are usually met on the
transformed scale.
Statisticians frequently make the comment that t tests, analysis of variance (ANOVA), and linear
regression are robust to the assumptions of homogeneity of variance (equal variance) and
normality. These two assumptions are the focus of this chapter, which provides authoritative
references to back up the robustness claim.
These robustness claims originally came from statistical papers on ANOVA. However, ANOVA
is simply a special case of linear regression, as we saw in Chapter 5-4. Further, an independent
groups t test is just a one-way ANOVA comparing two means. So, the robustness described in
this chapter applies to the t test, ANOVA, and linear regression.
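The equivalence just described can be checked numerically. The following is a minimal sketch in Python, assuming NumPy and SciPy are available. The simulated data, group sizes, and variable names are illustrative assumptions, not part of the original chapter. It runs an independent groups t test, a one-way ANOVA, and a linear regression on a 0/1 group indicator, all on the same two groups.

```python
import numpy as np
from scipy import stats

# Simulated outcome for two groups (illustrative values only)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)
group_b = rng.normal(loc=11.0, scale=2.0, size=30)

# Independent groups t test, assuming equal variances
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=True)

# One-way ANOVA on the same two groups
f_stat, f_p = stats.f_oneway(group_a, group_b)

# Linear regression of the outcome on a 0/1 group indicator
y = np.concatenate([group_a, group_b])
x = np.concatenate([np.zeros(group_a.size), np.ones(group_b.size)])
reg = stats.linregress(x, y)

print("t squared:", t_stat**2, " F:", f_stat)
print("p values (t test, ANOVA, regression slope):", t_p, f_p, reg.pvalue)
print("slope:", reg.slope, " difference in means:", group_b.mean() - group_a.mean())
```

The squared t statistic should match the F statistic, the three p values should agree, and the regression slope should equal the difference between the two group means.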
Homogeneity of Variance
Lorenzen and Anderson (1993, p.35) state,
“Historically, the assumption thought to be the most critical was the homogeneity of
variance. However, Box (1954) demonstrated that the F test in ANOVA was most robust
for  while working with a fixed model having equal sample sizes. He showed that for a
relatively large (one variance up to nine times larger than another) departures from
homogeneity, the  level may only change from .05 to about .06. This is not considered
to be of any practical importance. (It should be pointed out that the only time an  level
increased dramatically was when the sample size was negatively correlated with the size
of the variance.)”
To put this in simpler terms, statisticians concern themselves with ensuring that significance tests
do not increase the type I error rate. Violations of assumptions, then, are of concern if they lead
to a rejection of a true null hypothesis more frequently than α, almost always set at 0.05. Box
showed that you could have very large departures from homogeneity of variance without
affecting the alpha level in any appreciable way.
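A small simulation can make this concrete. The sketch below is not Box's derivation. It is a rough Monte Carlo check under assumptions chosen for illustration: two normal groups with equal means, a nine-fold difference in variance, and the pooled-variance t test (which is the two-group ANOVA). The function name `type1_rate`, the sample sizes, and the number of simulated data sets are all hypothetical choices.

```python
import numpy as np
from scipy import stats

def type1_rate(n1, sd1, n2, sd2, n_sim=20000, alpha=0.05, seed=1):
    """Estimate how often a true null hypothesis is rejected."""
    rng = np.random.default_rng(seed)
    # Both groups have mean 0, so every rejection is a type I error
    a = rng.normal(0.0, sd1, size=(n_sim, n1))
    b = rng.normal(0.0, sd2, size=(n_sim, n2))
    # Pooled-variance t test applied to each simulated data set (row)
    _, p = stats.ttest_ind(a, b, axis=1, equal_var=True)
    return float(np.mean(p < alpha))

# Equal sample sizes, one variance nine times the other (sd ratio of 3)
equal_n = type1_rate(20, 1.0, 20, 3.0)

# Smaller group has the larger variance (sample size negatively
# correlated with variance), the case flagged in the quotation
small_group_noisy = type1_rate(10, 3.0, 30, 1.0)

print("equal n, 9-fold variance ratio:", equal_n)
print("small group has large variance:", small_group_noisy)
```

With equal sample sizes the estimated rejection rate should stay close to the nominal 0.05, while the second configuration should show a clearly inflated rate.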
When a test such as Levene’s test of homogeneity indicates that the homogeneity of variance
assumption is violated, authors of statistics textbooks frequently advise transforming the outcome
variable. Lorenzen and Anderson (1993, p.35) offer this advice,
“When there are large departures from homogeneity, it is felt that the data should be
transformed to produce more meaningful results. However, one must take care in the
interpretation of the results after transforming since transforming also changes the form
of the mathematical model. To our knowledge, no one has come up with an α level on
homogeneity tests that protects against too much heterogeneity. A set of working rules
that seems to be effective for the practitioner is as follows:
1. If the homogeneity test is accepted at α = .01, do not transform.
2. If the homogeneity test is rejected at α = .001, transform.
3. If the result of the homogeneity test is between α = .01 and α = .001, try very hard to
find out the theoretical distribution from the investigator. If there is a practical reason
to transform and the transformed variable makes sense, go ahead and transform.
Otherwise, we recommend not transforming.”
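The three working rules can be written as a small decision function. This is only a sketch of one reading of the rules: the function name `transform_advice` and its return strings are hypothetical, and treating "accepted at α = .01" as p ≥ .01 and "rejected at α = .001" as p < .001 is an assumption about the boundaries. The example applies it to Levene’s test from SciPy on simulated groups.

```python
import numpy as np
from scipy import stats

def transform_advice(p_value):
    """Map a homogeneity-test p value to the quoted working rules."""
    if p_value >= 0.01:
        return "do not transform"          # rule 1
    if p_value < 0.001:
        return "transform"                 # rule 2
    # Rule 3: in between, the decision rests on subject-matter
    # knowledge, which code cannot supply
    return "consult investigator"

# Simulated outcome for three groups (illustrative values only)
rng = np.random.default_rng(2)
groups = [rng.normal(50.0, sd, size=25) for sd in (4.0, 5.0, 6.0)]

# center="mean" gives the original Levene test; SciPy's default,
# center="median", is the Brown-Forsythe variant
levene_stat, levene_p = stats.levene(*groups, center="mean")
print("Levene p value:", levene_p, "->", transform_advice(levene_p))
```

The third rule is deliberately left as a prompt to talk to the investigator rather than an automatic choice, since the quotation makes the decision depend on whether the transformed variable makes sense.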
In discussing the assumptions of classical hypothesis tests (t test, ANOVA, linear regression) van
Belle (2002, p.10) states,
“The second condition for the validity of tests of hypotheses is that of homogeneity of
variance. Box (1953) already showed that hypothesis tests are reasonably robust against
heterogeneity of variance. For a two-sample test a three-fold difference in variances does
not affect the probability of a Type I error. Tests of equality of variances are very
sensitive to departures from the assumptions and usually don’t provide a good basis for
proceeding with hypothesis tests. Box (1953) observed that, ‘To make the preliminary test
on variances is rather like putting to sea in a rowing boat to find out whether conditions
are sufficiently calm for an ocean liner to leave port!’”
Normality
Lorenzen and Anderson (1993, p.41) state,
“Generally speaking, the F ratio used in the analysis of variance has been shown to be
very robust to departures from normality, Eisenhart (1947).”
In discussing commonly used tests of means, Box (1953) states,
“...thanks to the work of Pearson (1931), Bartlett (1935), Geary (1947), Gayen (1950 a,
b), David & Johnson (1951 a, b) there is abundant evidence that these comparative tests
on means are remarkably insensitive to general* non-normality of the parent population.
*By ‘general’ parent non-normality is meant that the departure from normality, in
particular skewness, is the same in the different groups, as could usually be assumed
when the data were from an experiment in which the groups corresponded with different
applied treatments to be compared. In tests in which sample means are compared,
general skewness tends to be cancelled out; larger effects are found, however, if the
skewness is in different directions in the different groups.
Pearson, E.S. (1931). Biometrika, 23, 114.
Bartlett, M.A. (1935). Proc. Camb. Phil. Soc. 31, 223.
Geary, R.C. (1947). Biometrika, 34, 209.
Gayen, A.K. (1950a). Biometrika, 37, 236.
Gayen, A.K. (1950b). Biometrika, 37, 399.”
In discussing and contrasting independence of observations, homogeneity of variance, and
normality in t tests, ANOVA, and linear regression, van Belle (2002, p. 10) states,
“Normality is the least important in tests of hypotheses. It should be noted that the
assumption of normality deals with the error term of the model, not the original data.
This is frequently forgotten by researchers who plot histograms of the raw data rather
than the residuals from the model....”
Note: van Belle points out that the assumption of normality is actually that the residuals are
normally distributed, not the variables themselves. However, if you use indicator variables to
model the groups, the regression line goes directly through the group means, and so the residuals
will have the same distributional shape as the outcome variable when you examine the outcome
variable separately for each group.
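The point in this note can be verified directly. The sketch below, using simulated skewed data and hypothetical group labels A, B, and C, fits a regression with indicator variables by ordinary least squares and compares its residuals with the outcome centered on each group's own mean.

```python
import numpy as np

# Simulated skewed outcome for three groups (illustrative values only)
rng = np.random.default_rng(3)
labels = np.repeat(["A", "B", "C"], 20)
y = np.concatenate([rng.exponential(scale=s, size=20) for s in (1.0, 2.0, 3.0)])

# Design matrix: intercept plus indicators for B and C (A is the reference)
X = np.column_stack([np.ones(y.size), labels == "B", labels == "C"]).astype(float)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

# Outcome minus its own group mean, computed group by group
group_means = {g: y[labels == g].mean() for g in ("A", "B", "C")}
centered = y - np.array([group_means[g] for g in labels])

print("intercept vs mean of A:", coef[0], group_means["A"])
print("residuals equal group-centered outcome:", np.allclose(residuals, centered))
```

Because the fitted values are the group means, a histogram of the residuals within a group has the same shape as a histogram of the outcome within that group, only shifted to be centered at zero.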
Just how far from normality can one go without creating a problem? It is not so simple to
quantify a departure from normality, but a contrast to the homogeneity of variance assumption
can be made. We saw above that for the homogeneity of variance assumption, the variance of
one group could be three times the variance of another group without changing alpha (the type I
error) at all, and could be nine times the variance and only change alpha from 0.05 to 0.06, a
change of no importance. What might seem like appalling violations, then, are of no
consequence due to the robustness of the hypothesis tests. In contrast, the hypothesis tests are
even more robust to the assumption of normality. van Belle (2002, p.8) explains,
“Many investigators have studied the issue of the relative importance of the assumptions
underlying hypothesis tests (see, Cochran, 1947; Gastwirth and Rubin, 1971; Glass et al.,
1972; Lissitz and Chardoes, 1975; Millard et al., 1985; Pettitt and Siskind, 1981; Praetz,
1981; Scheffé, 1959; and others). All studied the effects of correlated errors on classical
parametric and nonparametric tests. In all of these studies, positive correlation resulted in
an inflated Type I error level, and negative correlation resulted in a deflated Type I error
level. The effects of correlation were more important than differences in variances
between groups, and differences in variances were more important than the assumption of
a normal distribution.”
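The direction of the correlation effect can be seen in a rough simulation. The sketch below is not taken from any of the cited studies. It assumes, for illustration only, a one-sample t test of a zero mean applied to errors that follow a first-order autoregressive process with correlation `phi` between neighboring observations. The function name `ar1_type1_rate` and all settings are hypothetical.

```python
import numpy as np
from scipy import stats

def ar1_type1_rate(phi, n=30, n_sim=20000, alpha=0.05, seed=4):
    """Type I error of a one-sample t test when errors are AR(1)."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=(n_sim, n))
    errors = np.empty((n_sim, n))
    # Start each series from the stationary distribution
    errors[:, 0] = shocks[:, 0] / np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        errors[:, t] = phi * errors[:, t - 1] + shocks[:, t]
    # The true mean is 0, so every rejection is a type I error
    _, p = stats.ttest_1samp(errors, 0.0, axis=1)
    return float(np.mean(p < alpha))

# Negative, zero, and positive correlation between neighboring errors
rates = {phi: ar1_type1_rate(phi) for phi in (-0.5, 0.0, 0.5)}
for phi, rate in rates.items():
    print("phi =", phi, " estimated type I error:", rate)
```

Under these assumptions the independent case should sit near 0.05, the positively correlated case well above it, and the negatively correlated case well below it, which is the pattern the quotation describes.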
What To Do With This Knowledge
Your model is probably fine in most cases, if your data are somewhat symmetrical and skewness
does not occur in opposite directions for your study groups.
There is no need to rush into transformations, which can be avoided in most cases.
Still, many journal reviewers are not aware of the robustness of linear regression to violations of
normality and homogeneity of variance, since this is not taught in introductory statistics courses,
so sometimes you get a reviewer response to your manuscript asking if you tested the
assumptions of normality and equal variances.
As a final note, even when aware that linear regression is robust to the equal variance and
normality assumptions, statisticians will sometimes still go ahead and try a transformation when
the assumptions are violated. This is because an increase in precision is frequently gained by the
transformation, making it easier to achieve statistical significance. We will see an example of
this in the “modeling costs” chapter, where a log transformation of the hospitalization cost
variable shrinks the variance considerably.
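To give a feel for what a log transformation does to a right-skewed cost variable, here is a minimal sketch on simulated data. The costs are drawn from lognormal distributions with made-up parameters, so they are not the data from the “modeling costs” chapter, and the helper `variance_ratio` is a hypothetical name.

```python
import numpy as np

# Simulated right-skewed costs for two groups (made-up parameters)
rng = np.random.default_rng(5)
cost_a = rng.lognormal(mean=8.0, sigma=0.5, size=500)
cost_b = rng.lognormal(mean=9.0, sigma=0.5, size=500)

def variance_ratio(a, b):
    """Larger sample variance divided by the smaller one."""
    va, vb = np.var(a, ddof=1), np.var(b, ddof=1)
    return max(va, vb) / min(va, vb)

raw_ratio = variance_ratio(cost_a, cost_b)
log_ratio = variance_ratio(np.log(cost_a), np.log(cost_b))

print("variance of raw costs, group A:", np.var(cost_a, ddof=1))
print("variance of log costs, group A:", np.var(np.log(cost_a), ddof=1))
print("variance ratio between groups, raw scale:", raw_ratio)
print("variance ratio between groups, log scale:", log_ratio)
```

In this simulated setting the variance on the log scale is far smaller than on the raw scale, and the group variances are much closer to equal after the transformation.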
