Using Matrix Algebra to do Multiple Regression*

Before we had computers to assist us, we relied on matrix algebra to solve
multiple regressions. You have some appreciation of how much arithmetic is involved in
matrix algebra, so you can imagine how tedious the solution is. We shall use SAS to do
that arithmetic for us. Consider the research I have done involving the relationship
between a person’s attitudes about animals, idealism, and misanthropy. I also have, for
the same respondents, relativism scores and gender. Below is a correlation matrix and,
in the last two rows, a table of simple descriptive statistics for these variables.
Persons who score high on the idealism dimension believe that ethical behavior
will always lead only to good consequences, never to bad consequences, and never to
a mixture of good and bad consequences. Persons who score high on the relativism
dimension reject the notion of universal moral principles, preferring personal and
situational analysis of behavior. Persons who score high on the misanthropy dimension
dislike humans. Gender was coded ‘1’ for female, ‘2’ for male. High scores on the
attitude variable indicate that the respondent supports animal rights and does not
support research on animals. There were 153 respondents.

               idealism  relativism  misanthropy    gender  attitude
idealism         1.0000     -0.0870      -0.1395   -0.1011    0.0501
relativism      -0.0870      1.0000       0.0525    0.0731    0.1581
misanthropy     -0.1395      0.0525       1.0000    0.1504    0.2259
gender          -0.1011      0.0731       0.1504    1.0000   -0.1158
attitude         0.0501      0.1581       0.2259   -0.1158    1.0000
mean            3.64926     3.35810      2.32157   1.18954   2.37276
standard dev.   0.53439     0.57596      0.67560   0.39323   0.52979

1. The first step is to obtain $R_{iy}$, the column vector of correlations between
each $X_i$ and Y.

               ar
ideal      0.0501
relat      0.1581
misanth    0.2259
gender    -0.1158

2. Next we obtain $R_{ii}$, the matrix of correlations among the X’s.

             ideal    relat  misanth   gender
ideal       1.0000  -0.0870  -0.1395  -0.1011
relat      -0.0870   1.0000   0.0525   0.0731
misanth    -0.1395   0.0525   1.0000   0.1504
gender     -0.1011   0.0731   0.1504   1.0000

3. Now we invert $R_{ii}$. You don’t really want to do this by hand, do you?

Copyright 2014, Karl L. Wuensch - All rights reserved.
MultReg-Matrix.docx
4. Now R ii1R iy  Bi -- that is, we post multiply the inverted X correlation matrix by
the XY correlation vector to get the partial regression coefficients in standardized form
[]. Since we don’t want to do that by hand either, we employ IML in SAS. Copy this
little program into the SAS editor, submit it, and see the resulting matrices:
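proc iml;
reset print;   /* print each matrix as it is created */
/* correlations among the predictors: ideal, relat, misanth, gender */
ii = {
 1.0000 -0.0870 -0.1395 -0.1011,
-0.0870  1.0000  0.0525  0.0731,
-0.1395  0.0525  1.0000  0.1504,
-0.1011  0.0731  0.1504  1.0000};
/* correlations of each predictor with attitude */
iy = {0.0501, 0.1581, 0.2259, -0.1158};
/* standardized partial regression coefficients */
betas = inv(ii)*iy;
quit;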
In the output window you will see the matrices in the order the program creates them:
first $R_{ii}$, then $R_{iy}$, and finally the column vector of beta weights. The first
beta is for idealism, the second relativism, the third misanthropy, and the last gender.
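For readers without access to SAS, the same arithmetic can be sketched in plain Python. This is only an illustration, not part of the original handout: the helper name `invert` and the Gauss-Jordan approach are assumptions chosen to keep the example free of external libraries. It inverts the predictor correlation matrix and postmultiplies it by the predictor-criterion correlations.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    # The right half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


# Correlations among the predictors: ideal, relat, misanth, gender.
r_ii = [
    [1.0000, -0.0870, -0.1395, -0.1011],
    [-0.0870, 1.0000, 0.0525, 0.0731],
    [-0.1395, 0.0525, 1.0000, 0.1504],
    [-0.1011, 0.0731, 0.1504, 1.0000],
]
# Correlations of each predictor with attitude.
r_iy = [0.0501, 0.1581, 0.2259, -0.1158]

r_ii_inv = invert(r_ii)
# Postmultiply the inverse by the column vector to get the beta weights.
betas = [sum(a * b for a, b in zip(row, r_iy)) for row in r_ii_inv]

for name, beta in zip(("ideal", "relat", "misanth", "gender"), betas):
    print(f"{name:8s} {beta:7.4f}")
```

The printed weights should agree, within rounding, with the beta column that IML reports.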
5. If you want unstandardized coefficients, you need to use the following
formulae:
$b_i = \beta_i \frac{s_y}{s_i}$
$a = \bar{Y} - \sum b_i \bar{X}_i$
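As a rough illustration of step 5, the sketch below applies those two formulae to the means and standard deviations from the descriptive statistics table. The beta weights are the rounded values that appear in the arithmetic of step 6, and the variable names are my own, so treat the results as approximate.

```python
# Rounded beta weights (as used in the R-squared arithmetic of step 6).
betas = {"ideal": 0.0837, "relat": 0.1636, "misanth": 0.2526, "gender": -0.1573}

# Predictor means and standard deviations from the descriptive statistics.
means = {"ideal": 3.64926, "relat": 3.35810, "misanth": 2.32157, "gender": 1.18954}
sds = {"ideal": 0.53439, "relat": 0.57596, "misanth": 0.67560, "gender": 0.39323}

# Mean and standard deviation of the criterion variable (attitude).
y_mean = 2.37276
s_y = 0.52979

# Slope for each predictor: beta rescaled by the ratio of standard deviations.
b = {name: betas[name] * s_y / sds[name] for name in betas}

# Intercept: criterion mean minus the sum of slope times predictor mean.
a = y_mean - sum(b[name] * means[name] for name in b)

for name, slope in b.items():
    print(f"b_{name:8s} {slope:8.4f}")
print(f"intercept  {a:8.4f}")
```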
6. To obtain the squared multiple correlation coefficient, $R^2 = \sum r_{iy} \beta_i$. For our
data, that is 0.0501(.0837) + 0.1581(.1636) + 0.2259(.2526) + (-0.1158)(-.1573) =
.1053.
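The sum of products in step 6 is easy to check with a few lines of Python. This is just a sketch with variable names of my own choosing, using the rounded correlations and beta weights quoted above.

```python
# Correlations of each predictor with attitude (ideal, relat, misanth, gender).
r_iy = [0.0501, 0.1581, 0.2259, -0.1158]
# Matching beta weights, rounded to four decimals.
betas = [0.0837, 0.1636, 0.2526, -0.1573]

# R-squared is the sum of each correlation times its beta weight.
r_squared = sum(r * beta for r, beta in zip(r_iy, betas))
print(round(r_squared, 4))  # 0.1053
```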
7. Test the significance of the $R^2$.
For our data, $s_y = 0.52979$ and n = 153, so $SS_Y = 152(0.52979)^2 = 42.663$.
The regression sum of squares, $SS_{regr} = R^2 SS_Y = .1053(42.663) = 4.492$.
The error sum of squares, $SS_{error} = SS_Y - SS_{regr} = 42.663 - 4.492 = 38.171$.

Source          SS      df     MS       F
Regression    4.492      4   1.123   4.353
Residual     38.171    148   0.258
Total        42.663    152

This F is significant, with p of about .002. We could go on to test the significance of
the partials and obtain partial or semipartial correlation coefficients, but frankly, that is
just more arithmetic than I can stand. Let us stop at this point. The main objective of
this handout is to help you appreciate how matrices and matrix algebra are essential
when computing multiple regressions, and I hope that I have already made that point
adequately.
At http://www.danielsoper.com/statcalc3/calc.aspx?id=15 is a web app that will
calculate F and p given the $R^2$ and N. Rounding error will produce a small discrepancy
between the values it obtains and those shown here.
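The source table and its F can also be reproduced with a short Python sketch. The function `f_upper_tail` is a hypothetical helper of mine, not something from SAS or the web app: it relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom are even (as they are here, with 4 predictors). A statistics library would be the usual choice for anything more general.

```python
def f_upper_tail(f, df1, df2):
    """P(F > f) for an F(df1, df2) variable; this closed form needs even df1."""
    if df1 % 2 != 0:
        raise ValueError("this closed form needs an even numerator df")
    # P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f).
    x = df2 / (df2 + df1 * f)
    a = df2 / 2.0
    # For integer df1/2 the incomplete beta ratio is a finite sum.
    term, total = 1.0, 1.0
    for j in range(1, df1 // 2):
        term *= (a + j - 1) / j * (1.0 - x)
        total += term
    return x ** a * total


n = 153            # respondents
k = 4              # predictors
s_y = 0.52979      # standard deviation of attitude
r_squared = 0.1053

ss_y = (n - 1) * s_y ** 2          # total sum of squares
ss_regr = r_squared * ss_y         # regression sum of squares
ss_error = ss_y - ss_regr          # error sum of squares
ms_regr = ss_regr / k
ms_error = ss_error / (n - k - 1)
f = ms_regr / ms_error
p = f_upper_tail(f, k, n - k - 1)

print(f"SS_Y = {ss_y:.3f}, SS_regr = {ss_regr:.3f}, SS_error = {ss_error:.3f}")
print(f"F({k}, {n - k - 1}) = {f:.3f}, p = {p:.4f}")
```

Because the mean squares are not rounded before dividing, the F printed here can differ in the third decimal from the tabled value.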