Redundancy and Suppression in Trivariate Regression Analysis
Redundancy
In the behavioral sciences, one’s nonexperimental data often result in the two predictor
variables being redundant with one another with respect to their covariance with the dependent
variable. Look at the ballantine above, where we are predicting verbal SAT from family income and
parental education. Area b represents the redundancy. For each X, sri and βi will be smaller than ryi,
and the sum of the squared semipartial r’s will be less than the multiple R². Because of the
redundancy between the two predictors, the sum of the squared semipartial correlation coefficients
(areas a and c, the unique contributions of each predictor) is less than the squared multiple
correlation coefficient, R²y.12 (area a + b + c).
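To make the redundancy pattern concrete, here is a minimal numerical sketch in Python. The correlations are hypothetical (the SAT example above reports none); the formulas are the standard trivariate expressions for the betas, the semipartials, and R² used throughout this handout.

    # Hypothetical correlations for a redundancy pattern: both predictors relate
    # positively to Y and substantially to each other (values are made up).
    ry1, ry2, r12 = 0.50, 0.45, 0.60

    denom = 1 - r12 ** 2
    beta1 = (ry1 - ry2 * r12) / denom                          # about .36, smaller than ry1
    beta2 = (ry2 - ry1 * r12) / denom                          # about .23, smaller than ry2
    R2 = (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / denom   # about .285
    sr1_sq = R2 - ry2 ** 2    # unique contribution of X1 (area a), about .083
    sr2_sq = R2 - ry1 ** 2    # unique contribution of X2 (area c), about .035

    print(sr1_sq + sr2_sq < R2)   # True: the sum omits the shared area b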
Classical Suppression
[Ballantine: circles for Y, X1, and X2]
Look at the ballantine above. Suppose that Y is score on a statistics achievement test, X1 is
score on a speeded quiz in statistics (slow readers don’t have enough time), and X2 is a test of
reading speed. The ry1 = .38, ry2 = 0, r12 = .45.

R²y.12 = .38² / (1 − .45²) = .181, and Ry.12 = .4255, greater than ry1.

β1 = (.38 − 0(.45)) / (1 − .45²) = .476, greater than ry1.  β2 = (0 − .38(.45)) / (1 − .45²) = −.214.

Adding X2 to X1 increased R² by .181 − .38² = .0366. That is, sr2² = .0366 and sr2 = −.19. The sum of the
two squared semipartial r’s, .181 + .0366 = .218, is greater than R²y.12.
Notice that the sign of the β and sr for the classical suppressor variable may be opposite that
of its zero-order r12. Notice also that for both predictor variables the absolute value of β exceeds that
of the predictor’s r with Y.
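The arithmetic above can be verified with a few lines of Python; this is only a sketch of the standard trivariate formulas, plugging in the correlations from the example (ry1 = .38, ry2 = 0, r12 = .45).

    ry1, ry2, r12 = 0.38, 0.00, 0.45

    denom = 1 - r12 ** 2
    beta1 = (ry1 - ry2 * r12) / denom          # about 0.476, larger than ry1
    beta2 = (ry2 - ry1 * r12) / denom          # about -0.214, opposite in sign to r12
    R2 = (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / denom   # about 0.181
    sr2 = (ry2 - ry1 * r12) / denom ** 0.5     # about -0.19
    sr1_sq, sr2_sq = R2 - ry2 ** 2, R2 - ry1 ** 2              # about .181 and .0366

    print(round(beta1, 3), round(beta2, 3), round(R2, 3), round(sr2, 2))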

How can we understand the fact that adding a predictor that is uncorrelated with Y (for
practical purposes, one whose r with Y is close to zero) can increase our ability to predict Y? Look at
the ballantine. X2 suppresses the variance in X1 that is irrelevant to Y (area d). Mathematically,
R²y.12 = r²y2 + r²y(1.2) = 0 + r²y(1.2)

r²y(1.2), the squared semipartial for predicting Y from X1 (sr1²), is the r² between Y and the residual
(X1 − X̂1.2). It is increased (relative to r²y1) by removing from X1 the irrelevant variance due to X2 --
what variance is left in (X1 − X̂1.2) is more correlated with Y than is unpartialled X1.
r²y1 = b / (b + c + d) = b = .38² = .144, which is < r²y(1.2) = b / (b + c) = b / (1 − d) = b / (1 − r12²)
= .144 / (1 − .45²) = .181 = R²y.12
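A small simulation makes the same point empirically. The snippet below is a sketch (it assumes only numpy and draws multivariate normal data with the example’s correlation structure, not real test data); it residualizes X1 on X2 and shows that the residual correlates more strongly with Y than raw X1 does.

    import numpy as np

    rng = np.random.default_rng(0)
    # Population correlations for (Y, X1, X2) in the classical suppression example
    R = np.array([[1.00, 0.38, 0.00],
                  [0.38, 1.00, 0.45],
                  [0.00, 0.45, 1.00]])
    y, x1, x2 = rng.multivariate_normal(np.zeros(3), R, size=100_000).T

    def r(a, b):
        return np.corrcoef(a, b)[0, 1]

    # Remove from X1 the variance that X2 can explain (X1 minus X1-hat predicted from X2)
    slope = np.cov(x1, x2)[0, 1] / np.var(x2, ddof=1)
    x1_resid = x1 - slope * x2

    print(f"r(Y, X1)          = {r(y, x1):.3f}")        # about .38
    print(f"r(Y, X1 residual) = {r(y, x1_resid):.3f}")  # about .43, the semipartial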
Velicer (see Smith et al., 1992) wrote that suppression exists when a predictor’s “usefulness” is
greater than its squared zero-order correlation with the criterion variable. “Usefulness” is the squared
semipartial correlation for the predictor.
Our X1 has r²y1 = .144 -- all by itself it explains 14.4% of the variance in Y. When added to a
model that already contains X2, X1 increases the R² by .181 -- that is, sr1² = .181. X1 is more useful in
the multiple regression than all by itself. Likewise, sr2² = .0366 > r²y2 = 0. That is, X2 is more useful in
the multiple regression than all by itself.
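In code, Velicer’s criterion amounts to comparing the increment in R² (the squared semipartial) with the squared zero-order correlation; a brief sketch with the classical-suppression values:

    ry1, ry2, r12 = 0.38, 0.00, 0.45
    R2_full = (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)   # about .181

    usefulness_x1 = R2_full - ry2 ** 2   # sr1^2: gain in R2 from adding X1 to X2 alone
    usefulness_x2 = R2_full - ry1 ** 2   # sr2^2: gain in R2 from adding X2 to X1 alone

    print(usefulness_x1 > ry1 ** 2)   # True -> suppression by Velicer's definition
    print(usefulness_x2 > ry2 ** 2)   # True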
Net Suppression
[Ballantine: circles for Y, X1, and X2]
Look at the ballantine above. Suppose Y is the amount of damage done to a building by a fire.
X1 is the severity of the fire. X2 is the number of fire fighters sent to extinguish the fire. The ry1 = .65,
ry2 = .25, and r12 = .70.
1 
.65  .25(.70)
 .93 > ry 1.
1  .702
2 
.25  .65(.70)
 .40.
1  .702
Note that β2 has a sign opposite that of ry2. It is always the X which has the smaller ryi which
ends up with a β of opposite sign. Each β falls outside of the range from 0 to ryi, which is always true
with any sort of suppression.
Again, the sum of the two squared semipartials is greater than is the squared multiple
correlation coefficient:
R²y.12 = .505, sr1² = .505 − r²y2 = .4425, sr2² = .505 − r²y1 = .0825, and .4425 + .0825 = .525 > .505.
Again, each predictor is more useful in the context of the other predictor than all by itself:
sr1² = .4425 > r²y1 = .4225 and sr2² = .0825 > r²y2 = .0625.
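The same formulas reproduce the net-suppression numbers; a quick Python sketch with the fire-example correlations:

    ry1, ry2, r12 = 0.65, 0.25, 0.70

    denom = 1 - r12 ** 2
    beta1 = (ry1 - ry2 * r12) / denom     # about 0.93, larger than ry1
    beta2 = (ry2 - ry1 * r12) / denom     # about -0.40, opposite in sign to ry2
    R2 = (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / denom   # about 0.505
    sr1_sq, sr2_sq = R2 - ry2 ** 2, R2 - ry1 ** 2              # about .4425 and .0825

    print(sr1_sq + sr2_sq > R2)   # True: about .525 > .505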
For our example, number of fire fighters, although slightly positively correlated with amount of
damage, functions in the multiple regression primarily as a suppressor of variance in X1 that is
irrelevant to Y (the shaded area in the Venn diagram). Removing that irrelevant variance increases
the β for X1.
Looking at it another way, treat severity of fire as the covariate: when we control for severity
of fire, the more fire fighters we send, the less the amount of damage suffered in the fire. That is, for
the conditional distributions where severity of fire is held constant at some set value, sending more
fire fighters reduces the amount of damage. Please note that this is an example of a reversal
paradox, where the sign of the correlation between two variables in aggregated data (ignoring a third
variable) is opposite the sign of that correlation in segregated data (within each level of the third
variable). I suggest that you (re)read the article on this phenomenon by Messick and van de Geer
[Psychological Bulletin, 90, 582-593].
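A short simulation illustrates the reversal. This is a sketch assuming numpy; the data are drawn to match the example’s correlations, not real fire data. The zero-order correlation between damage and number of fire fighters is positive, but it turns negative once severity of fire is held constant.

    import numpy as np

    rng = np.random.default_rng(1)
    # Correlations for (damage Y, severity X1, fire fighters X2) from the example
    R = np.array([[1.00, 0.65, 0.25],
                  [0.65, 1.00, 0.70],
                  [0.25, 0.70, 1.00]])
    y, x1, x2 = rng.multivariate_normal(np.zeros(3), R, size=200_000).T

    def residualize(v, on):
        slope = np.cov(v, on)[0, 1] / np.var(on, ddof=1)
        return v - slope * on

    r_aggregated = np.corrcoef(y, x2)[0, 1]                                 # about +.25
    r_partial = np.corrcoef(residualize(y, x1), residualize(x2, x1))[0, 1]  # about -.38

    print(f"ignoring severity:    {r_aggregated:+.2f}")
    print(f"controlling severity: {r_partial:+.2f}")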
Cooperative Suppression
R² will be maximally enhanced when the two X’s correlate negatively with one another but
positively with Y (or positively with one another and negatively with Y), so that when each X is
partialled from the other its remaining variance is more correlated with Y: both predictors’ β, pr, and
sr increase in absolute magnitude (and retain the same sign as ryi). Each predictor suppresses
variance in the other that is irrelevant to Y.
Consider this contrived example: We seek variables that predict how much the students in an
introductory psychology class will learn (Y). Our teachers are all graduate students. X1 is a measure
of the graduate student’s level of mastery of general psychology. X2 is a measure of how strongly the
students agree with statements such as “This instructor presents simple, easy to understand
explanations of the material,” “This instructor uses language that I comprehend with little difficulty,”
etc. Suppose that ry1 = .30, ry2 = .25, and r12 = −.35.
1 
sr1 
.30  .25(.35)
 .442.
1  .35 2
.30  .25(.35)
1  .35
2
 .414.
2 
sr2 
.25  .30(.35)
 .405.
1  .35 2
.25  .30(.35)
1  .35 2
 .379.
R²y.12 = Σ ryi βi = .3(.442) + .25(.405) = .234. Note that the sum of the squared semipartials, .414² +
.379² = .171 + .144 = .315, exceeds the squared multiple correlation coefficient, .234.
Again, each predictor is more useful in the context of the other predictor than all by itself:
sr1² = .171 > r²y1 = .09 and sr2² = .144 > r²y2 = .0625.
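Again, a Python sketch with the contrived example’s correlations (ry1 = .30, ry2 = .25, r12 = −.35) reproduces the cooperative-suppression numbers:

    ry1, ry2, r12 = 0.30, 0.25, -0.35

    denom = 1 - r12 ** 2
    beta1 = (ry1 - ry2 * r12) / denom           # about 0.442
    beta2 = (ry2 - ry1 * r12) / denom           # about 0.405
    sr1 = (ry1 - ry2 * r12) / denom ** 0.5      # about 0.414
    sr2 = (ry2 - ry1 * r12) / denom ** 0.5      # about 0.379
    R2 = ry1 * beta1 + ry2 * beta2              # about 0.234

    print(sr1 ** 2 + sr2 ** 2 > R2)             # True: about .315 > .234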
Summary
When βi falls outside the range from 0 to ryi, suppression is taking place. This is Cohen &
Cohen’s definition of suppression. As noted above, Velicer defined suppression in terms of a
predictor having a squared semipartial correlation coefficient that is greater than its squared zero-order correlation with the criterion variable.
If one ryi is zero or close to zero, it is classical suppression, and the sign of the β for the X with
a nearly zero ryi may be opposite the sign of r12.
When neither X has ryi close to zero but one has a β opposite in sign from its ryi and the other a
β greater in absolute magnitude but of the same sign as its ryi, net suppression is taking place.
If both X’s have absolute βi > ryi, but of the same sign as ryi, then cooperative suppression is
taking place.
Although it is unusual, beta weights can even exceed one when cooperative suppression is
present.
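The decision rules above can be collected into a small Python helper. This is only a sketch; the function name and the tolerance for “close to zero” are my own choices, not from the handout. It classifies a trivariate correlation pattern from the standardized regression weights.

    def classify_suppression(ry1, ry2, r12, near_zero=0.01):
        """Classify a trivariate pattern per the definitions summarized above."""
        denom = 1 - r12 ** 2
        b1 = (ry1 - ry2 * r12) / denom
        b2 = (ry2 - ry1 * r12) / denom

        def outside(beta, ry):          # beta falls outside the range from 0 to ry
            lo, hi = sorted((0.0, ry))
            return beta < lo or beta > hi

        if not (outside(b1, ry1) or outside(b2, ry2)):
            return "no suppression (redundancy)"
        if abs(ry1) <= near_zero or abs(ry2) <= near_zero:
            return "classical suppression"
        if b1 * ry1 > 0 and b2 * ry2 > 0:
            return "cooperative suppression"
        return "net suppression"

    # The three worked examples from this handout:
    print(classify_suppression(0.38, 0.00, 0.45))    # classical suppression
    print(classify_suppression(0.65, 0.25, 0.70))    # net suppression
    print(classify_suppression(0.30, 0.25, -0.35))   # cooperative suppression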
References

Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation for the behavioral
sciences. New York, NY: Wiley. [This handout has drawn heavily from Cohen & Cohen.]

Smith, R. L., Ager, J. W., Jr., & Williams, D. L. (1992). Suppressor variables in multiple
regression/correlation. Educational and Psychological Measurement, 52, 17-29.
doi:10.1177/001316449205200102

Wuensch, K. L. (2008). Beta weights greater than one!
http://core.ecu.edu/psyc/wuenschk/MV/multReg/Suppress-BetaGT1.doc

Example of Net Suppression and Reversal Paradox – with data and SAS output.
Copyright 2014, Karl L. Wuensch - All rights reserved.