Revised Chapter 10 in Specifying and Diagnostically Testing Econometric Models (Edition 3)
© by Houston H. Stokes 29 November 2011. All rights reserved. Preliminary Draft
Chapter 10
Special Topics in OLS Estimation ............................................................................................... 2
10.0 Introduction ....................................................................................................................... 2
10.1 The QR Approach ............................................................................................................. 2
10.2 The Principal-Component Regression Model ............................................................... 10
10.3 Ridge, Lasso and Elastic Net Models ............................................................................ 15
Figure 10.1 OLS vs GLM Yhat Values for Out-of-Sample Observations ............................. 20
Figure 10.2 Out-of-Sample Residuals for OLS and GLM Models ........................................ 20
Figure 10.3 Effect of changes in α on the Out-of-Sample RSS ................................................ 21
Figure 10.4 RSS for Out-of-Sample Models vs Number of Vectors in the Model ................ 22
Table 10.1 Effect on the Out-of-Sample RSS of Values of α and the Number of Coefficients .. 23
10.4 Partial Least Squares and Continuum Regression Models ......................................... 23
Table 10.2 Matlab Code to Obtain PLS Model Solution suggested by de Jong (1993) ......... 27
Table 10.3 B34S Implementation of SIMPLS Calculation .................................................... 28
Table 10.4 PLS1: the de Jong-Wise-Ricker (2001) PLS-CRM Estimation Approach ............. 30
Table 10.5 B34S Implementation of PLS1 including CRMTEST ......................................... 31
Table 10.6 Effect on the RSS of PCM, PLS and CRM Models of Varying Degrees ............ 35
Figure 10.6 Residual Sum of Squares for various CR Models ............................................ 44
Figure 10.7 Residual Sum of Squares Surface .................................................................... 45
Table 10.7 Setup for Analysis of Octane Data ........................................................................ 45
Figure 10.8 Octane vs NIR Data ............................................................................................. 47
Figure 10.9 Sensitivity of Octane Model to # of Vectors and γ Setting ................................ 48
Figure 10.10 Octane Model Matrix T ..................................................................................... 49
Figure 10.11 Mapping of X data to the PLS Vectors .............................................................. 50
Table 10.8 Tests to illustrate PLS Model intermediate calculations. ..................................... 54
10.5 Boosting ............................................................................................................................ 62
10.6 Extended Examples ......................................................................................................... 63
Table 10.9 Example file for Shrinkage Models ..................................................................... 63
Table 10.10 Ridge and Lasso Routines .................................................................................. 64
Table 10.11 LTS and LTS_REC Routines for Resistant Estimation ..................................... 70
Table 10.12 Estimation of LTS Based Models ...................................................................... 72
Table 10.13 Boosting Routine ................................................................................................ 75
Table 10.14 Modified Forward Stagewise model boosting ................................................... 76
Table 10.15 Boosting Test Case ............................................................................................. 77
Figure 10.12 OLS Boosting Example .................................................................................... 78
Table 10.16 Modifications to OLS Boosting to Facilitate Forecasting .................................. 80
Table 10.17 Forecasting an OLS Boosting Model ................................................................. 82
Table 10.18 Wampler Test Problem ...................................................................................... 86
Table 10.19 Longley Test Data .............................................................................................. 89
Table 10.20 Results from the Longley Total Equation .......................................................... 92
Table 10.21 Correlation between the Error term and Right-Hand-Side................................. 93
Table 10.22 Effect of Data Precision on Accuracy using the Grunfeld Data ......................... 94
Table 10.23 Effect of Data Precision on Accuracy using Gas Data ....................................... 99
Table 10.24 Matlab Commands to Replicate Accuracy Results Obtained with B34S ........ 105
Table 10.25. Correlation of the Residual and RHS Variables using Matlab 2006b............. 106
Table 10.26. Correlation of the Residual and RHS Variables using Matlab 2007a ............. 106
10.7 Conclusion ...................................................................................................................... 106
Special Topics in OLS Estimation
10.0 Introduction
The qr command in B34S allows calculation of a high-accuracy solution to the OLS
regression problem and is discussed in section 10.1. Using this command, the principal component
regression can be calculated to give further insight into possible rank problems; it is discussed in
section 10.2, where the singular value decomposition is introduced. The matrix command provides
substantially more capability and can be used to provide additional insight. After an initial survey of
the theory, a number of examples are presented. Further examples on this important topic are
presented in section 16.7, which involves the matrix command and illustrates accuracy gains due to
data precision and alternative methods of calculation. Section 10.3 discusses the ridge, lasso, least
trimmed squares and elastic net models, which can be used to shrink the variables on the right-hand
side and/or remove outliers. These procedures are related to the principal component regression
model, which can also be used to shrink the model. Section 10.4 is devoted to the partial least squares
(PLS) procedure, which is shown to be a compromise, in terms of shrinkage, along the continuum of
models between OLS and principal component regression (PCR). The continuum power regression
model is shown to be a more general setup involving OLS, PLS and PCR as special cases. Section
10.5 is devoted to boosting, while extended examples for many of the procedures discussed are given
in section 10.6.
10.1 The QR Approach
Interest in the QR approach to estimation was stimulated by Longley's (1967) seminal paper
on computer accuracy.1 Equation (4.1-6) indicated how the OLS calculation of $\hat\beta$, which is usually
1 The literature on the QR approach is vast. No attempt will be made to provide a summary in the
limited space available in this book. Good references include Longley (1984), Dongarra et al (1979)
and Strang (1976). Chapter 4 of this book provides a discussion of this approach applied to systems
estimation. This chapter provides some examples and a brief intuitive discussion based in part on
Dongarra et al (1979), which documents the LINPACK routines DQRDC and DQRSL. These
routines are used for all QR calculations.
calculated as ( X ' X )1 X ' y , could be more accurately calculated as R 1Q1' y , where the T by K
matrix X was initially factored as a product of the T by T orthogonal matrix Q and the upper
triangular K by K Cholesky matrix R:
R
Q' X   
0 
(10.1-1)
Matrix Q is usually partitioned as Q  Q1 Q2  so that
X  Q1R
(10.1-2)
where Q1 is a T by K matrix that is usually kept in factored form as
Q  H1H 2 ,
(10.1-3)
, Hk
where H i is the Householder transformation (Strang 1976, 279). Q2 is a T by (T  K ) matrix.
In terms of the QR factorization the parameter values and the fitted values are,
ˆ  ( X ' X )1 X ' y  ( R ' Q1'Q1R)1 R ' Q1' y  ( R ' R)1 R ' Q1' y  R 1Q1' y
(10.1-4)
and
X ˆ  Q1RR1Q1' y  Q1Q1' y .
(10.1-5)
Davidson and MacKinnon (1993, 30) write X ˆ  Q1Rˆ  Q1ˆ . In view of (10-1-5) Q1ˆ  Q1Q1' y and
ˆ  Q1' y
(10.1-6)
from which the fitted values (10.1-5) can be calculated. Assuming T is large relative to K, then Q1 is
T by K and Q2 is T by (T-K) matrix, which is substantially larger. Usually the "economy" QR is
made where only Q1 is calculated. In many cases even Q1 is not needed, since only R is needed to
( X ' X ) 1  ( R ' Q 'Q R ) 1  R 1 ( R 1 )' from which it is possible to get the standard errors and ˆ using
1
1
a more accurate estimate of R than obtained from the Cholesky factorization of X ' X . An example
of the gain will be shown below. Dongarra et al (1979, sec. 9.1) show that given X is of rank K, then
the matrix
10-4
Chapter 10
PX  Q1Q1'
(10.1-7)
is the orthogonal projection onto the column space of X and
PX  Q2 Q2'
(10.1-8)
is the projection onto the orthogonal complement of X in view of equation (10.1-1). The residuals of
an OLS model are constrained to be orthogonal or have zero correlation with the right-hand sides of
the equation. Using a regression the mapping of X to Q1 will be illustrated below. Given that the
residuals are orthogonal to the left hand side, if follows from (10.1-2)
eˆ  PX y  Q2 Q2' y
(10.1-9)
The code listed next illustrates these relationships using the Theil (1971) textile dataset studied in
Chapter 2. First the X matrix is built and the economy QR factorization performed. The columns of
Q1 are shown to have Euclidian length unity and be orthogonal (Q1'Q1  I ) . Column 1 of Q1 is
shown to be a scalar transformation of X (-0.1205241588547691) . OLS is used to show the second
and third columns of Q1 are linear transformations of col 1 and to 2 and 1-3 of X respectively.
b34sexec options ginclude('b34sdata.mac') member(theil);
b34srun;
b34sexec matrix;
call loaddata;
call echooff;
x=mfam(catcol(log10ri log10rpt,constant));
r=qr(x,q);
call print(x,q,r);
i=nocols(q);
test=array(i:);
do k=1,nocols(q);
test(k)=sumsq(q(,k));
enddo;
call print(test);
col_1_q=q(,1);
col_2_q=q(,2);
col_3_q=q(,3);
col_1_x=x(,1);
col_2_x=x(,2);
col_3_x=x(,3);
s= afam(col_1_q)/afam(col_1_x);
call print('Scale factor for x(,1) => q(,1) ',S:);
call print('Second col of q linear transform of x Orthog. to Col 1':);
call olsq(col_2_q, col_1_x, col_2_x :noint :print);
call olsq(col_1_q, col_2_q :noint :print);
call olsq(col_3_q, col_1_x, col_2_x col_3_x :noint :print);
call print('Test of orthogonal Condition',transpose(q)*q);
b34srun;
Edited output:

B34S 8.10Z  (D:M:Y) 21/ 8/06  (H:M:S) 14: 3:40    DATA STEP    PAGE 1

Variable  #  Label                               # Cases  Mean      Std. Dev.     Variance      Maximum   Minimum
TIME      1  YEAR                                   17    1931.00   5.04975       25.5000       1939.00   1923.00
CT        2  CONSUMPTION OF TEXTILES                17    134.506   23.5773       555.891       168.000   99.0000
RI        3  REAL INCOME                            17    102.982   5.30097       28.1003       112.300   95.4000
RPT       4  RELATIVE PRICE OF TEXTILES             17    76.3118   16.8662       284.470       101.000   52.6000
LOG10CT   5  LOG10(CONSUMPTION OF TEXTILES)         17    2.12214   0.791131E-01  0.625889E-02  2.22531   1.99564
LOG10RI   6  LOG10(REAL INCOME)                     17    2.01222   0.222587E-01  0.495451E-03  2.05038   1.97955
LOG10RPT  7  LOG10(RELATIVE PRICE OF TEXTILES)      17    1.87258   0.961571E-01  0.924619E-02  2.00432   1.72099
CONSTANT  8                                         17    1.00000   0.00000       0.00000       1.00000   1.00000

Number of observations in data file      17
Current missing variable code            1.000000000000000E+31
Data begins on (D:M:Y) 1: 1:1923 ends 1: 1:1939. Frequency is 1.

B34S(r) Matrix Command. d/m/y 21/ 8/06. h:m:s 14: 3:40.

=>  CALL LOADDATA$
=>  CALL ECHOOFF$

X = Matrix of 17 by 3 elements

          1          2          3
  1   1.98543    2.00432    1.00000
  2   1.99167    2.00043    1.00000
  3   2.00000    2.00000    1.00000
  4   2.02078    1.95713    1.00000
  5   2.02078    1.93702    1.00000
  6   2.03941    1.95279    1.00000
  7   2.04454    1.95713    1.00000
  8   2.05038    1.91803    1.00000
  9   2.03862    1.84572    1.00000
 10   2.02243    1.81558    1.00000
 11   2.00732    1.78746    1.00000
 12   1.97955    1.79588    1.00000
 13   1.98408    1.80346    1.00000
 14   1.98945    1.72099    1.00000
 15   2.01030    1.77597    1.00000
 16   2.00689    1.77452    1.00000
 17   2.01620    1.78746    1.00000

Q = Matrix of 17 by 3 elements

          1             2               3
  1  -0.239292    -0.417759       -0.306449
  2  -0.240044    -0.391904       -0.236160
  3  -0.241048    -0.370073       -0.142443
  4  -0.243552    -0.204205        0.920310E-01
  5  -0.243552    -0.150576        0.923998E-01
  6  -0.245799    -0.146393        0.301762
  7  -0.246416    -0.145235        0.359337
  8  -0.247120    -0.264908E-01    0.425744
  9  -0.245703     0.137147        0.294795
 10  -0.243751     0.177336        0.113219
 11  -0.241931     0.214822       -0.561974E-01
 12  -0.238583     0.123456       -0.368744
 13  -0.239129     0.114489       -0.317944
 14  -0.239777     0.347728       -0.255997
 15  -0.242290     0.252841       -0.224782E-01
 16  -0.241879     0.248275       -0.607658E-01
 17  -0.243000     0.236848        0.436465E-01

R = Matrix of 3 by 3 elements

          1           2              3
  1  -8.29709    -7.72132       -4.12287
  2   0.00000    -0.375029       0.304283E-03
  3   0.00000     0.00000       -0.442434E-01

TEST = Array of 3 elements

1.00000   1.00000   1.00000

Scale factor for x(,1) => q(,1)   -0.1205241588547691
Second col of q linear transform of x Orthog. to Col 1
Note the perfect fit of columns 1 and 2 of X mapping to column 2 of Q1.
Ordinary Least Squares Estimation
Dependent variable                      COL_2_Q
Centered R**2                           1.000000000000000
Adjusted R**2                           1.000000000000000
Residual Sum of Squares                 2.033498177717717E-26
Residual Variance                       1.355665451811812E-27
Standard Error                          3.681936245797599E-14
Total Sum of Squares                    0.9999999945536433
Log Likelihood                          502.7987247346952
Mean of the Dependent Variable          1.789899162686381E-05
Std. Error of Dependent Variable        0.2499999993192054
Sum Absolute Residuals                  5.317413176442187E-13
1/Condition XPX                         5.668199248898641E-04
Maximum Absolute Residual               5.750955267558311E-14
Number of Observations                  17

Variable  Lag  Coefficient     SE               t
COL_1_X   0     2.4814260      0.91472228E-13    0.27127643E+14
COL_2_X   0    -2.6664627      0.98177458E-13   -0.27159623E+14
Here column 1 of Q1 is shown to be orthogonal to column 2 of Q1.
Ordinary Least Squares Estimation
Dependent variable                      COL_1_Q
Centered R**2                          -8683.232715995069
Adjusted R**2                          -8683.232715995069
Residual Sum of Squares                 0.9999999999999993
Residual Variance                       6.249999999999996E-02
Standard Error                          0.2499999999999999
Total Sum of Squares                    1.151512209199723E-04
Log Likelihood                         -3.964164000159326E-02
Mean of the Dependent Variable         -0.2425216604976431
Std. Error of Dependent Variable        2.682713422544098E-03
Sum Absolute Residuals                  4.122868228459933
1/Condition XPX                         0.9999999999999999
Maximum Absolute Residual               0.2471202954562588
Number of Observations                  17

Variable  Lag  Coefficient       SE           t
COL_2_Q   0    -0.27061686E-15   0.25000000   -0.10824674E-14
Column 3 of Q1 is shown to be a linear combination of the columns of X.
Ordinary Least Squares Estimation
Dependent variable                      COL_3_Q
Centered R**2                           1.000000000000000
Adjusted R**2                           1.000000000000000
Residual Sum of Squares                 1.675885395409665E-23
Residual Variance                       1.197060996721189E-24
Standard Error                          1.094102827307008E-12
Total Sum of Squares                    0.9998848542254362
Log Likelihood                          445.7268402712293
Mean of the Dependent Variable         -2.602552757714062E-03
Std. Error of Dependent Variable        0.2499856063638260
Sum Absolute Residuals                  1.409913852334910E-11
1/Condition XPX                         9.474231625417810E-06
Maximum Absolute Residual               1.713684749660160E-12
Number of Observations                  17

Variable  Lag  Coefficient       SE               t
COL_1_X   0     11.248239        0.12603327E-10    0.89248168E+12
COL_2_X   0    -0.18338531E-01   0.29174534E-11   -0.62858008E+10
COL_3_X   0    -22.602243        0.24729178E-10   -0.91399087E+12
Test of orthogonal Condition

Matrix of 3 by 3 elements

          1                2                3
  1   1.00000        -0.270617E-15    -0.213371E-15
  2  -0.270617E-15    1.00000         -0.100614E-15
  3  -0.213371E-15   -0.100614E-15     1.00000
B34S Matrix Command Ending. Last Command reached.
Space available in allocator      8856736, peak space used      3330
Number variables used                  63, peak number used       63
Number temp variables used             51, # user temp clean       0
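For readers who wish to check these identities outside of B34S, the following minimal Matlab sketch verifies (10.1-4), (10.1-5) and the orthogonality conditions. It is a sketch only: simulated data stand in for the Theil textile series, which are assumed loaded in the B34S run above.

% Minimal sketch of the QR identities (10.1-4), (10.1-5) and (10.1-7)-(10.1-9).
% Simulated data stand in for the Theil textile series used above.
T = 17; K = 3;
X = [randn(T,K-1) ones(T,1)];          % last column is the constant
y = X*[2.5; -1.3; 1.4] + 0.05*randn(T,1);

[Q1,R] = qr(X,0);                      % "economy" QR: Q1 is T x K, R is K x K
betaQR  = R\(Q1'*y);                   % beta-hat = inv(R)*Q1'*y, eq. (10.1-4)
betaOLS = (X'*X)\(X'*y);               % normal-equation (Cholesky-style) solution

yhat = Q1*(Q1'*y);                     % fitted values, eq. (10.1-5)
e    = y - yhat;                       % residuals

disp(max(abs(betaQR - betaOLS)))       % the two solutions agree
disp(max(max(abs(Q1'*Q1 - eye(K)))))   % columns of Q1 are orthonormal
disp(max(abs(X'*e)))                   % residuals orthogonal to the columns of X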
The next example illustrates the gains of the QR approach. The gas data example studied in Chapters
7 and 8 is used to show accuracy gains obtained by use of the QR factorization on a simple problem.
Later in this chapter a more difficult problem will be shown.
/$ Illustrates OLS Capability under Matrix Command
b34sexec options ginclude('b34sdata.mac') member(gas); b34srun;
b34sexec matrix;
call load(qr_small :staging);
call echooff;
call loaddata;
nlag=6;
call olsq(gasout gasin{0 to nlag} gasout{1 to nlag} :print :savex);
call print(ccf(%y,   %res));
call print(ccf(%yhat,%res));
/;
/; Large QR used for illustration. Q2 is large!!!
/; error2 equation uses q1 => uses the economy qr. This is the
/; best way to proceed
/;
r=qr(%x,q:);
call qr_small(%x,q,r,q1,q2,r_small);
/; call print(q,q1,q2);
yhat  =q1*transpose(q1)*%y;
error =q2*transpose(q2)*%y;
error2=%y - yhat;
beta=inv(r_small)*transpose(q1)*%y;
call print('Beta from QR ',beta);
call print(ccf(%y,error));
call print(ccf(yhat,error));
call print(ccf(yhat,error2));
/; call tabulate(%y,%yhat,yhat,%res,error,error2);
call print(' ':);
call print('Study Error Buildup using Cholesky':);
call print(' ':);
/; excessive problem
maxlag=40;
chol_r=vector(maxlag:);
qr_r  =vector(maxlag:);
r_cond=vector(maxlag:);
do i=1,maxlag;
/;
/; :qr call uses linpack to get OLS. This is close to LAPACK QR()
/;
call olsq(gasout gasin{0 to i} gasout{1 to i} :savex);
chol_r(i)=ccf(%yhat,%res);
r_cond(i)=%rcond;
/; call olsq(gasout gasin{0 to i} gasout{1 to i} :qr :savex);
/; chol_r(i)=ccf(%yhat,%res);
/;
/; Use economy size qr to save space!!
/;
r=qr(%x,q);
qr_yhat  =q*transpose(q)*%y;
qr_error =%y - qr_yhat;
qr_r(i)  =ccf(qr_yhat,qr_error);
enddo;
call tabulate(chol_r,qr_r r_cond
   :title 'As maxlag increases accuracy declines');
b34srun;
When lags of 1-6 are used, as suggested by Tiao-Box (1981), the reciprocal of the matrix condition
was found to be 2.34596E-08. The correlation between the Cholesky ê and ŷ was -0.1333E-10
while for the QR this was -0.4148E-14, which is substantially smaller. When an excessive number
of lags (40) was used, the condition fell to 0.5827E-10 and the correlations were -0.1387E-10 and
-0.3607E-13 respectively. Output documenting these findings is shown below.
=>  CALL LOAD(QR_SMALL :STAGING)$
=>  CALL ECHOOFF$
Ordinary Least Squares Estimation
Dependent variable                      GASOUT
Centered R**2                           0.9946789205951879
Adjusted R**2                           0.9944282900435120
Residual Sum of Squares                 16.09378007198891
Residual Variance                       5.831079736227866E-02
Standard Error                          0.2414762873705794
Total Sum of Squares                    3024.532965517241
Log Likelihood                          7.767793572930756
Mean of the Dependent Variable          53.50965517241379
Std. Error of Dependent Variable        3.235044356946151
Sum Absolute Residuals                  48.28024672308078
F(13,  276)                             3968.705786042084
F Significance                          1.000000000000000
1/Condition XPX                         2.345968875600428E-08
Maximum Absolute Residual               1.407426755603240
Number of Observations                  290

Variable  Lag  Coefficient       SE               t
GASIN     0    -0.67324068E-01   0.76805415E-01   -0.87655367
GASIN     1     0.19318519       0.16668178        1.1590061
GASIN     2    -0.21454694       0.18914123       -1.1343214
GASIN     3    -0.42981100       0.18922276       -2.2714551
GASIN     4     0.14122227       0.19069299        0.74057401
GASIN     5    -0.94767767E-01   0.18185376       -0.52112076
GASIN     6     0.23492127       0.11100544        2.1163041
GASOUT    1     1.5418090        0.59960417E-01    25.713781
GASOUT    2    -0.58620686       0.11056171       -5.3020786
GASOUT    3    -0.17641567       0.11539164       -1.5288427
GASOUT    4     0.13419248       0.11472181        1.1697207
GASOUT    5     0.54079963E-01   0.10092430        0.53584681
GASOUT    6    -0.40030303E-01   0.42973854E-01   -0.93150370
CONSTANT  0     3.8759484        0.85787179        4.5180975
The correlation of $y$ and $\hat{y}$ with the Cholesky error is shown:

0.72945729E-01
-0.18895630E-10

Next $\hat\beta$ is calculated using the QR method and the same correlations are performed, with one difference.
When $\hat{e} = y - \hat{y}$ the correlation is 0.4827531E-14, while when $\hat{e}$ is calculated using $Q_2$ as in (10.1-8)
the correlation is slightly smaller in absolute value, -0.350777E-14.2
Beta from QR

BETA = Vector of 14 elements

-0.673241E-01    0.193185      -0.214547     -0.429811      0.141222      -0.947678E-01   0.234921
 1.54181        -0.586207      -0.176416      0.134192      0.540800E-01  -0.400303E-01   3.87595

0.72945729E-01
2 Later in this chapter the effect of data precision on the accuracy of QR vs Cholesky calculation
is studied using two different datasets. In those examples the estimated correlation between the
residual and the right hand side variables of the model are tested.
-0.35077737E-14
0.48275313E-14
Study Error Buildup using Cholesky

As maxlag increases accuracy declines

Obs   CHOL_R        QR_R          R_COND
  1  -0.3512E-11   -0.7724E-15    0.9180E-06
  2  -0.1841E-10    0.1020E-13    0.2720E-06
  3  -0.2981E-10   -0.8277E-15    0.8986E-07
  4  -0.2070E-10   -0.5568E-14    0.5034E-07
  5  -0.1927E-10    0.1700E-13    0.3355E-07
  6  -0.1890E-10    0.4828E-14    0.2346E-07
  7  -0.1908E-10   -0.9341E-14    0.1800E-07
  8  -0.1333E-10   -0.4148E-14    0.1364E-07
  9  -0.1823E-10    0.1090E-13    0.1009E-07
 10  -0.1827E-10   -0.2608E-13    0.6999E-08
 11  -0.1523E-10    0.1335E-13    0.5429E-08
 12  -0.1454E-10   -0.1528E-13    0.4008E-08
 13  -0.1479E-10    0.2589E-13    0.3039E-08
 14  -0.2019E-10    0.2707E-14    0.2460E-08
 15  -0.1711E-10   -0.2973E-13    0.1896E-08
 16  -0.1432E-10    0.1527E-14    0.1574E-08
 17  -0.1526E-10   -0.2731E-13    0.1299E-08
 18  -0.1733E-10   -0.1140E-13    0.1075E-08
 19  -0.1958E-10   -0.9592E-14    0.9116E-09
 20  -0.1701E-10   -0.2558E-13    0.5930E-09
 21  -0.1990E-10    0.3035E-13    0.4891E-09
 22  -0.1744E-10   -0.3318E-14    0.4295E-09
 23  -0.1978E-10   -0.7585E-14    0.3684E-09
 24  -0.1676E-10   -0.1069E-13    0.3191E-09
 25  -0.1614E-10    0.3729E-13    0.2692E-09
 26  -0.1279E-10   -0.6407E-14    0.2405E-09
 27  -0.1606E-10   -0.3884E-13    0.2157E-09
 28  -0.1862E-10    0.2220E-13    0.2008E-09
 29  -0.1539E-10   -0.5976E-14    0.1900E-09
 30  -0.1945E-10    0.1394E-13    0.1783E-09
 31  -0.2438E-10   -0.5711E-14    0.1660E-09
 32  -0.1603E-10   -0.1808E-13    0.1540E-09
 33  -0.1894E-10    0.3219E-13    0.1394E-09
 34  -0.1650E-10    0.2197E-14    0.1561E-09
 35  -0.8106E-11    0.3452E-13    0.1393E-09
 36  -0.1313E-10   -0.4354E-14    0.1041E-09
 37  -0.2144E-10    0.2191E-13    0.1067E-09
 38  -0.1345E-10    0.1394E-13    0.8881E-10
 39  -0.1293E-10   -0.6742E-13    0.7338E-10
 40  -0.1387E-10   -0.3607E-13    0.5827E-10
B34S Matrix Command Ending. Last Command reached.
The qr command has a number of options that facilitate its use in cases in which rank may be a
problem. The EPS option allows the user to specify a nonnegative number such that the condition
number of $X'X$ must be < (1/EPS). If EPS is not supplied, it defaults to V * 1.0E-16, where V is
the largest sum of absolute values in any row of $X$. The IFREZ option allows the user to require that
certain variables be placed in any final model when all variables cannot be entered in the regression
due to rank problems.3

3 The Cholesky decomposition requires $K^3 + NK^2/2$ operations while the QR decomposition
requires $NK^2$ operations. If $K > (N/2)$ the QR is both faster and more accurate than the
Cholesky. When $K = (N/2)$ the speeds are the same, while for smaller rank problems the
Cholesky is faster. The QR decomposition requires more memory than the Cholesky.
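To put rough magnitudes on the operation counts in footnote 3: for the gas-data model estimated above, $N = 290$ and $K = 14$, so $K < N/2$; the Cholesky route then costs on the order of $14^3 + 290(14^2)/2 \approx 31{,}000$ operations against roughly $290(14^2) \approx 57{,}000$ for the QR, which is why the Cholesky remains the faster choice when the problem is well enough conditioned that its lower accuracy can be tolerated.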
The B34S qr command provides easy access to the singular value decomposition procedure,
although much more power is contained in the matrix procedure capability of the same name.4
Use of the capability is explained in the next section. Assume a model $y = f(x_1, x_2, x_3)$. The
following command will estimate both the QR approach to the OLS model and the associated
principal component regression.

b34sexec qr ipcc=pcreglist$ model y = x1 x2 x3$ b34seend$

Values printed include $\hat\beta_j$, $t_j$, $U$, $V$, $\hat{Y}$, $\hat\alpha_j$, and the corresponding $t_j$. If IPCC=PCREG, $U$, $V$, $\hat{Y}$, and $\hat{e}$ are not
printed. The principal component regression model is discussed next.
10.2 The Principal-Component Regression Model
The principal-component regression model can easily be calculated from the data and
provides a useful transformation, especially in cases in which there is collinearity.5 Assume $X$ is a $T \times K$
matrix. The singular value decomposition of $X$ is

$X = U\Lambda V'$    (10.2-1)

where $U$ is $T \times r$, $\Lambda$ is $r \times r$, $V$ is $K \times r$, $T \ge K$ and $r \le K$. $U$ and $V$ have the property that

$U'U = I$    (10.2-2)

and

$V'V = I.$    (10.2-3)

4 The matrix command r=qr(x,q); or r=qr(x,q,q1,q2); will perform a QR factorization of X. The
QR approach to OLS is implemented by the :qr option on the call olsq( ); command. Outside of
the matrix command, the qr command will do QR and PC estimation.

5 Mandel (1982) provides a good summary of the uses of the principal-component regression, and
the discussion here has benefited from that treatment. The LINPACK FORTRAN routine
DSVDC has been used to calculate the singular value decomposition. This routine has been
found to be substantially more accurate than the singular value decomposition routines in
versions 8 and 9 of the IMSL Library and substantially faster than the corresponding NAG
routine to calculate the singular value decomposition. Stokes (1983, 1984) implemented
LINPACK in SPEAKEASY 7. Subsequent testing of the SPEAKEASY LINKULE SVDCM,
which uses DSVDC, against the LINKULE SVD, which uses the NAG routine, documented the
speed advantage.

$\lambda_{ij} = 0$ for $i \ne j$, and the elements $\lambda_{ii}$ are the square roots of the nonzero eigenvalues of $X'X$
or $XX'$. This result builds on the fact that while there are $T$ eigenvalues of $XX'$, only $K$ are
nonzero, and these are in fact the eigenvalues of $X'X$. The rows of $V'$ are the eigenvectors of $X'X$, while
the columns of $U$ are the eigenvectors of $XX'$. This can easily be seen if we note that using (10.2-1),
$X'X = V\Lambda U'U\Lambda V' = V\Lambda^2V'$ and $XX' = U\Lambda V'V\Lambda U' = U\Lambda^2U'$. If $K = r$, then $X$ is of full rank. If we
replace $X$ in equation (2.1-7) by its singular value decomposition, we obtain

$Y = X\beta + e = U\Lambda V'\beta + e$    (10.2-4)

which can be written

$Y = U\alpha + e$    (10.2-5)

if we define

$\alpha \equiv \Lambda V'\beta.$    (10.2-6)

An estimate $\hat\alpha$ can be calculated from equation (2.1-8) as

$\hat\alpha = (U'U)^{-1}U'y = U'y$    (10.2-7)

$\hat\beta = (V')^{-1}\Lambda^{-1}\hat\alpha = V\Lambda^{-1}\hat\alpha$    (10.2-8)

where $\hat\beta = [\hat\beta_1, \ldots, \hat\beta_K]$ and $\hat\alpha = [\hat\alpha_1, \ldots, \hat\alpha_K]$. Mandel (1982) points out that the variance of each coefficient is

$\mathrm{var}(\hat\beta_j) = \sum_{m=1}^{K} [v_{jm}^2 / \lambda_{mm}^2]\, \hat\sigma^2$    (10.2-9)

where $\hat\sigma^2$ is defined in equation (2.1-10), $v_{ij}$ is an element of matrix $V$, and

$\mathrm{var}(\hat\alpha_j) = \hat\sigma^2.$    (10.2-10)

Equation (10.2-9) shows that it is possible to determine just which vectors in $X$ are causing the
increase in the variance, since the elements of $V$ are in the range 0 through 1. As the smallest $\lambda_{mm}$
approaches 0, $\mathrm{var}(\hat\beta_j)$ approaches $\infty$. The $t$ test for each coefficient $\hat\alpha_j$ is

$t_j = \hat\alpha_j / \hat\sigma$    (10.2-11)
in view of equation (10.2-10). The singular value decomposition can be used to illustrate the
problems of near collinearity. Following Mandel (1982), from equation (10.2-1) we write

$U\Lambda = XV$    (10.2-12)

or

$U \begin{bmatrix} \Lambda_a & 0 \\ 0 & \Lambda_b \end{bmatrix} = X[V_a \; V_b]$    (10.2-13)

where $\Lambda_a$ and $\Lambda_b$ are square diagonal matrices that contain the $k_a$ larger and the $k_b$ very small singular values
$(k_a + k_b = K)$ along their diagonals, respectively. $V_a$ and $V_b$ are $K \times k_a$ and $K \times k_b$, respectively. In
terms of our prior notation $r = k_a$. If $X$ is close to not being of full rank, then

$XV_b \approx 0$    (10.2-14)

since $U[0 \; \Lambda_b]' \rightarrow 0$ as $\Lambda_b$ approaches zero. Equation (10.2-14) is very important in understanding
why predictions of OLS models with rank problems in $X$ have high variance in cases of near
collinearity and why there is no unique solution for $\hat\beta_i$ when there is exact collinearity $(r < K)$. The
perfect collinearity case will be discussed first. Assume the vector $\hat\beta$ is partitioned into $\hat\beta_a$
containing $k_a$ elements and $\hat\beta_b$ containing $k_b$ elements. Given a new vector $x_{T+j}$ consisting of $K$
elements, a forecast of $y$ in period $T+j$ can be calculated from

$\hat{y}_{T+j} = x_{T+j}(\hat\beta_a \; \hat\beta_b)' = (x_{a,T+j} \; x_{b,T+j})(\hat\beta_a \; \hat\beta_b)'$    (10.2-15)

where $x_{T+j}$ is partitioned into $x_{a,T+j}$ and $x_{b,T+j}$. Following Mandel (1982), if $Z = \Lambda V'$, where $Z$ is
$r \times K$, equation (10.2-8) becomes

$Z\hat\beta = (Z_a \; Z_b)(\hat\beta_a \; \hat\beta_b)' = \hat\alpha$    (10.2-16)

where $Z$ has been partitioned into $Z_a$, which is $k_a \times k_a$, and $Z_b$, which is $k_a \times k_b$. From equation
(10.2-16), $\hat\beta_a$ is written as

$\hat\beta_a = -Z_a^{-1}Z_b\hat\beta_b + Z_a^{-1}\hat\alpha$    (10.2-17)

Equation (10.2-17) shows that if a value for $\hat\beta_b$ is arbitrarily determined, $\hat\beta_a$ is uniquely determined.
Using equation (10.2-17), we substitute for $\hat\beta_a$ in equation (10.2-15):

$\hat{y}_{T+j} = x_{a,T+j}(Z_a^{-1}\hat\alpha) + (x_{b,T+j} - x_{a,T+j}Z_a^{-1}Z_b)\hat\beta_b$    (10.2-18)

Equation (10.2-18) suggests that the only way we can make $\hat{y}_{T+j}$ independent of any arbitrary value
of $\hat\beta_b$ is to impose the constraint

$x_{b,T+j} = x_{a,T+j}Z_a^{-1}Z_b$    (10.2-19)

which implies that

$\hat{y}_{T+j} = x_{a,T+j}(Z_a^{-1}\hat\alpha).$    (10.2-20)

We note that

$Z_a^{-1}Z_b = (V_a')^{-1}V_b'$    (10.2-21)

where $V'$ has been partitioned as we did with $Z$. The above discussion repeats Mandel's (1982)
important proof that if there is collinearity such that $r < K$, there is no unique solution for $\hat{y}_{T+j}$,
given a vector $x_{T+j}$, except when the new $x$ vector $(x_{T+j})$ fulfills equation (10.2-19). Next, near
collinearity will be discussed.

Consider a new vector $x_{T+j}$ from which we want to obtain a prediction $\hat{y}_{T+j}$. If $x_{T+j}$ satisfies
the near collinearity condition of the $X$ matrix expressed in equation (10.2-14), then from equation
(10.2-1) we can write

$x_{T+j} = u_{T+j}\Lambda V'$    (10.2-22)

which, since $\Lambda^{-1}$ exists, can be written

$u_{T+j} = x_{T+j}V\Lambda^{-1}$    (10.2-23)

where $x_{T+j}$ and $u_{T+j}$ are $K$-element vectors. From equation (10.2-7)

$\hat{y}_{T+j} = u_{T+j}\hat\alpha$    (10.2-24)

and in view of equation (10.2-10),

$\mathrm{var}(\hat{y}_{T+j}) = \hat\sigma^2 \sum_{i=1}^{K} u_i^2.$    (10.2-25)

From equation (10.2-13)

$u_{T+j} = (x_{T+j}V_a \;\; x_{T+j}V_b) \begin{bmatrix} \Lambda_a^{-1} & 0 \\ 0 & \Lambda_b^{-1} \end{bmatrix}$    (10.2-26)

If the new vector $x_{T+j}$ satisfies the near collinearity condition of the $X$ matrix expressed in
equation (10.2-14), $x_{T+j}V_b \approx 0$ and the variance of $\hat{y}_{T+j}$ will be small. However, in the case
where $x_{T+j}V_b \ne 0$, the value $(x_{T+j}V_b)\Lambda_b^{-1}$ will be very large, because of having to invert $\Lambda_b$, which
has small values along its diagonal. These small values will imply large values in $u_{T+j}$ from
equation (10.2-26) and a large variance of $\hat{y}_{T+j}$ from equation (10.2-25). This rather long discussion,
which has benefited from Mandel's (1982) excellent paper, has stressed the problems of using an
OLS regression model for prediction purposes when the original $X$ matrix has collinearity problems.
The singular value decomposition approach to OLS estimation has been shown to highlight the effect
of collinearity, which potentially impacts OLS, ARIMA, VAR and VARMA models. The ridge
regression model and the Lasso procedure are two ways to deal with this problem in a structured
manner.6
6 The stepwise and best regression approaches, discussed in Chapter 2, are other alternatives that
have their advantages and disadvantages.
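A minimal Matlab sketch of the algebra in (10.2-1) through (10.2-8), using simulated near-collinear data rather than any dataset from the text, may help fix ideas; dropping the components with tiny singular values is what turns the OLS solution into the principal-component regression:

% Principal-component regression via the SVD, eqs. (10.2-1)-(10.2-8).
T = 100; K = 4;
X = randn(T,K);
X(:,4) = X(:,3) + 1e-6*randn(T,1);   % build in a nearly collinear column
y = X*[1; -2; 0.5; 0.5] + 0.1*randn(T,1);

[U,S,V] = svd(X,0);                  % X = U*Lambda*V', eq. (10.2-1)
lam    = diag(S);                    % singular values lambda_mm
alphah = U'*y;                       % alpha-hat = U'y, eq. (10.2-7)
betah  = V*(alphah./lam);            % beta-hat = V*inv(Lambda)*alpha-hat, (10.2-8)

% Deleting components with very small lambda_mm gives the PC regression
keep   = lam > 1e-8*lam(1);
betaPC = V(:,keep)*(alphah(keep)./lam(keep));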
10.3 Ridge, Lasso and Elastic Net Models
The Ridge regression model of Hoerl-Kennard (1970), as discussed in Hastie-Tibshirani-Friedman
(2001, 59; 2009, 61), is a shrinkage method of estimation that involves calculation of

$\hat\beta^{ridge} = \arg\min_\beta \left\{ \sum_{i=1}^{T} \left(y_i - \beta_0 - \sum_{j=1}^{k-1} x_{ij}\beta_j\right)^2 + \lambda \sum_{j=1}^{k-1} \beta_j^2 \right\}$    (10.3-1)

The ridge coefficients minimize a penalized residual sum of squares, where $\lambda$ controls the amount
of shrinkage. $\lambda = 0.0$ implies an OLS model. If the inputs are centered, $\hat\beta_0 = \sum_{i=1}^{T} y_i / T$, and $X^*$ has
the vector of 1's removed, the rest of the ridge coefficients can be estimated as

$(X^{*\prime}X^* + \lambda I)^{-1}X^{*\prime}y.$    (10.3-2)

Note

$X\hat\beta^{ridge} = X^*(X^{*\prime}X^* + \lambda I)^{-1}X^{*\prime}y = U\Lambda(\Lambda^2 + \lambda I)^{-1}\Lambda U'y = \sum_{j=1}^{k-1} u_j \frac{\lambda_j^2}{\lambda_j^2 + \lambda} u_j'y$    (10.3-4)

where $u_j$ are the columns of $U$ from (10.2-1), $\lambda_j$ is the $j$th diagonal element of $\Lambda$, and $X^*$ was
used in the place of $X$. Equation (10.3-4) shows that for $\lambda > 0$ the ridge estimates are smaller than
their OLS counterparts, since each shrinkage factor $\lambda_j^2/(\lambda_j^2 + \lambda)$ is less than one. The columns of $X^*$ with the least variance will be associated with the
smallest $\lambda_j$ and hence are shrunk the most. An example is shown below.
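As a quick check on this algebra, the following minimal, self-contained Matlab sketch (simulated data, not from the text) computes the ridge fit both from the normal equations (10.3-2) and from the SVD shrinkage form (10.3-4):

% Ridge regression two ways: normal equations (10.3-2) vs. SVD shrinkage (10.3-4).
T = 100; K = 4;
X = randn(T,K); y = X*[1; -2; 0.5; 0.25] + 0.1*randn(T,1);
Xc = X - ones(T,1)*mean(X);          % center X ...
yc = y - mean(y);                    % ... and y, so no intercept is needed

lambda = 0.5;                        % illustrative penalty value
beta_ridge = (Xc'*Xc + lambda*eye(K))\(Xc'*yc);   % eq. (10.3-2)

[U,S,V] = svd(Xc,0);
lam2 = diag(S).^2;                   % squared singular values
f = lam2./(lam2 + lambda);           % per-component shrinkage factors, each < 1
yhat_svd = U*(f.*(U'*yc));           % eq. (10.3-4)
disp(max(abs(yhat_svd - Xc*beta_ridge)))   % the two routes agree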
The Lasso is another shrinkage procedure, which, following Hastie-Tibshirani-Friedman (2001, 72; 2009, 68),
can be estimated as

$\hat\beta^{lasso} = \arg\min_\beta \left\{ \sum_{i=1}^{T} \left[y_i - \beta_0 - \sum_{j=1}^{k-1} (x_{ij} - \bar{x}_j)\beta_j\right]^2 + \lambda \sum_{j=1}^{k-1} |\beta_j|^d \right\}$    (10.3-5)

with $d = 1$. Setting $d = 2$ corresponds to the ridge regression (10.3-1), while $d = 0$ corresponds to
variable subset selection. Examples of the ridge and lasso models are presented in section 10.6.
A related approach is the least trimmed squares (LTS) model discussed in
Faraway (2005, 101-103), where one minimizes

$\sum_{i=1}^{q} \hat{u}_{(i)}^2$

where $q < T$ and $\hat{u}_{(i)}^2$ is the $i$th smallest squared residual. The smaller $q$ is, the more outliers are
removed from the dataset. Compared to the full sample, inspection of the LTS coefficients will
indicate how sensitive the results are to possibly rogue observations that come from a different
distribution or have something wrong or strange in the data. The LTS estimates are an example of a
resistant regression method and are illustrated in section 10.6.
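Since LTS has no closed-form solution and is normally computed by subset search, the minimal Matlab sketch below shows only the objective being evaluated for a candidate coefficient vector; the name lts_obj is illustrative and not part of any library:

% Least trimmed squares objective: sum the q smallest squared residuals.
% beta is a candidate coefficient vector; q < T sets the degree of trimming.
function s = lts_obj(beta, X, y, q)
u2 = sort((y - X*beta).^2);   % squared residuals, sorted ascending: u2(1) <= ...
s  = sum(u2(1:q));            % keep only the q smallest, dropping likely outliers
end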
The Elastic Net model, discussed next, combines the Ridge and Lasso modeling approaches
and provides a major step forward in view of the progress made in easily computing the solutions of
lasso and ridge models over a range of $\lambda$ values.7 The basic Elastic Net model assumes both $\lambda$ and
$\alpha$ and minimizes

$\min_{(\beta_0,\beta)} \left[ \frac{1}{2T}\sum_{i=1}^{T}(y_i - \beta_0 - x_i'\beta)^2 + \lambda P_\alpha(\beta) \right]$, where $P_\alpha(\beta) = \sum_{j=1}^{k}\left[ .5(1-\alpha)\beta_j^2 + \alpha|\beta_j| \right]$    (10.3-6)

Equation (10.3-6) is a compromise between a ridge regression penalty $(\alpha = 0)$ and a lasso penalty
$(\alpha = 1)$.8 The code listed below tests a number of the capabilities of the GLM command. The problem
is to test the extent to which out-of-sample performance is impacted by shrinking the model. The
exact number of lags used by Tiao-Box (1981), six, is used, and a holdout sample of 100 observations is assumed.
The GLM switch ne=9 restricts the model to nine coefficients and measures the effect on the R**2
and $e'e$.

7 Zou-Hastie (2005) first proposed the Elastic Net model. Friedman-Hastie-Tibshirani (2009)
provided details on a fast way to compute the general linear model using coordinate descent.
Further details on this approach are given in Hastie-Tibshirani-Friedman (2009, 662).

8 The elastic net model is implemented in the GLM command, which uses GPL code developed
by Friedman-Hastie-Tibshirani in a 2009 working paper.
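Before turning to that example, a minimal Matlab sketch of the objective in (10.3-6) may be useful; enet_obj is an illustrative name only and is not part of the B34S glm command or the Friedman-Hastie-Tibshirani code:

% Elastic net objective of eq. (10.3-6) for a candidate (b0, beta).
% alpha = 0 gives the ridge penalty; alpha = 1 gives the lasso penalty.
function f = enet_obj(b0, beta, X, y, lambda, alpha)
T   = size(X,1);
rss = sum((y - b0 - X*beta).^2)/(2*T);               % (1/2T) * residual sum of squares
pen = sum(0.5*(1-alpha)*beta.^2 + alpha*abs(beta));  % P_alpha(beta)
f   = rss + lambda*pen;
end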
/; Illustrates effect of reduction in model on the out of sample
/; performance
/;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(glm_info :staging);
call echooff;
/; logic works for holdout > 0
/; set max VAR lag and holdout # of obs
/;
/; k=max lag, parm(0.0)=> ridge (1.0) lasso, nlam = # models tried
k=6;
holdout=100;
maxlag=k;
nlam=100;
lam_min=.0001;
thr    =.1e-6;
parm   = .5;
ne     = 9;
/; At issue is how are yhat values calculated within the sample treated?
/; For purposes of this analysis they are removed
call olsq(gasout gasout{1 to k} gasin{1 to k} :print
          :savex :holdout holdout);
%yfuture=gasout(integers(norows(%x)+maxlag,norows(gasout)));
olsf=mfam(%xfuture)*vfam(%coef);
olsfss=sumsq(afam(%yfuture)-afam(olsf));
call print(' ':);
call print('ols forecast error sum sq ',olsfss:);
call print(' ':);
call glm(gasout gasout{1 to k} gasin{1 to k} :print :savex
         :lamdamin lam_min :nlam nlam :holdout holdout
         :thr thr :parm parm :ne ne);
call print(' ':);
%yfuture=gasout(integers(norows(%x)+maxlag,norows(gasout)));
call glm_info(%yfuture,%xfuture,%coef,%a0,%alm,glmf,fss,mod,1);
call print(' ':);
call print('glm forecast sum of squares ',fss:);
call print(' ':);
res_ols=vfam(%yfuture)-olsf;
res_glm=vfam(%yfuture)-glmf;
call graph(%yfuture,olsf,glmf :nolabel :nocontact
     :pgborder :file 'ols_glm_yhat_oos.wmf'
     :heading 'OLS vs GLM yhat out-of-sample');
call graph( res_ols res_glm :nolabel :nocontact
     :pgborder :file 'ols_glm_res_oos.wmf'
     :heading 'OLS vs GLM residual out-of-sample');
call tabulate(%yfuture,olsf,glmf res_ols res_glm);
/; tests of reduction loss as a function of restriction
tparm=grid(.1,.9,.1);
fsstest=array(norows(tparm):);
nein=ne;
do jj=1,norows(tparm);
call glm(gasout gasout{1 to k} gasin{1 to k} :savex
         :lamdamin lam_min :nlam nlam :holdout holdout
         :thr thr :parm tparm(jj) :ne ne);
call glm_info(%yfuture,%xfuture,%coef,%a0,%alm,glmf,fss,mod,1);
fsstest(jj)=fss;
enddo;
call tabulate(tparm fsstest);
call graph(tparm fsstest :plottype xyplot :grid
     :nolabel :nocontact
     :pgborder :file 'PARM_vs_OSS.wmf'
     :heading 'Parm vs GLM residual SS out of sample');
rss_glm=array(2*k:);
ne_used=array(2*k:);
icount=1;
%yfuture=gasout(integers(norows(%x)+maxlag,norows(gasout)));
do ne=1,2*k;
ii=ne;
call glm(gasout gasout{1 to k} gasin{1 to k} :savex
         :lamdamin lam_min :nlam nlam :holdout holdout
         :thr thr :parm parm :ne ii
/;       :print
         );
/; call print('+++++++++++++++++++++++++':);
call glm_info(%yfuture,%xfuture,%coef,%a0,%alm,glmf,fss,mod,0);
rss_glm(icount)=sfam(fss);
ne_used(icount)=sfam(dfloat(ii));
icount=icount+1;
enddo;
rss_glm =dropfirst(rss_glm,1);
ne_used =dropfirst(ne_used,1);
call tabulate(rss_glm,ne_used);
call graph(ne_used rss_glm :grid
     :heading 'RSS out-of-sample as a function of model reduction'
     :plottype xyplot :nocontact :nolabel :pgborder
     :file 'rss_loss.wmf');
b34srun;
Edited and annotated output follows:
Ordinary Least Squares Estimation
Dependent variable                      GASOUT
Centered R**2                           0.9979718278979760
Adjusted R**2                           0.9978343247046184
Residual Sum of Squares                 3.682663208127010
Residual Variance                       2.080600682557632E-02
Standard Error                          0.1442428744360578
Total Sum of Squares                    1815.754789473684
Log Likelihood                          105.0235276580087
Mean of the Dependent Variable          52.50894736842105
Std. Error of Dependent Variable        3.099543224133753
Sum Absolute Residuals                  20.92257736335549
F(12,  177)                             7257.808371787274
F Significance                          1.000000000000000
1/Condition XPX                         3.637698112128508E-09
Maximum Absolute Residual               0.4505041686010856
Number of Observations                  190
Holdout option reduced sample by        100

Variable  Lag  Coefficient       SE               t
GASOUT    1     1.2285222        0.74956376E-01    16.389829
GASOUT    2    -0.29063287       0.11828994       -2.4569534
GASOUT    3    -0.24342211       0.11861680       -2.0521723
GASOUT    4     0.20531898       0.11606267        1.7690355
GASOUT    5    -0.89608071E-01   0.10219641       -0.87682210
GASOUT    6     0.31553886E-01   0.40767543E-01    0.77399529
GASIN     1     0.12786487       0.52721231E-01    2.4253013
GASIN     2    -0.32396180       0.11198859       -2.8928107
GASIN     3    -0.49884894       0.12726385       -3.9198008
GASIN     4     0.25235636       0.13183588        1.9141706
GASIN     5    -0.19926975       0.12969996       -1.5363902
GASIN     6     0.12738336       0.99759679E-01    1.2769022
CONSTANT  0     8.4137071        1.3930658         6.0397055

ols forecast error sum sq    52.36509239330390
The OLS out-of-sample residual sum of squares was found to be 52.365. A GLM model is shown
next where the maximum number of variables in the model is restricted to nine. In equation (10.3-6)
it is assumed that $\alpha = .5$; the smallest $\lambda$ reached was 0.142E-01. Figures 10.1 and 10.2 show the out-of-sample $\hat{y}$ values and
residuals for the OLS and GLM models.
Generalized Linear Model via Coordinated Descent
Version 5/17/2008 converted to real*8 10/28/2009

Number of Observations                            190
Holdout option reduced sample by                  100
Number of right hand side variables                12
Maximum number of variables in model (:ne)          9
Number of lamda values considered    (:nlam)      100
Minimum lamda                        (:flmin)     1.000000000000000E-04
Covariance updating algorithm selected (:cua)
Percent lasso (:parm)                             0.5000000000000000
Analysis on standardized predictor variables
Converge threshold for Lamda solution (:thr)      1.000000000000000E-07
Left Hand Side Variable                           GASOUT

Series    Mean    Max     Min
GASOUT    52.51   60.20   45.60

Right Hand Side Variables

 #  Series   Lag   Mean     Max     Min      Penalty
 1  GASOUT    1    52.52    60.20   45.60    1.000
 2  GASOUT    2    52.53    60.20   45.60    1.000
 3  GASOUT    3    52.53    60.20   45.60    1.000
 4  GASOUT    4    52.53    60.20   45.60    1.000
 5  GASOUT    5    52.53    60.20   45.60    1.000
 6  GASOUT    6    52.53    60.20   45.60    1.000
 7  GASIN     1    0.2094   2.834   -2.716   1.000
 8  GASIN     2    0.2091   2.834   -2.716   1.000
 9  GASIN     3    0.2064   2.834   -2.716   1.000
10  GASIN     4    0.2025   2.834   -2.716   1.000
11  GASIN     5    0.1980   2.834   -2.716   1.000
12  GASIN     6    0.1933   2.834   -2.716   1.000

Total Number of passes over the data          14807
Maximum R**2                                  0.9966397334158390
Last Lamda Value Considered                   1.422720271397891E-02
Residual Sum of Squares for last model        6.101420144071417
Sum of Absolute Residuals for last model      26.03246518892973
Largest Absolute residual for last model      0.6581948928894832
[Figure: plot of %YFUTURE, OLSF and GLMF against observation number (1-100), heading 'OLS vs GLM yhat out-of-sample'.]
Figure 10.1 OLS vs GLM Yhat Values for Out-of-Sample Observations
[Figure: plot of RES_OLS and RES_GLM against observation number (1-100), heading 'OLS vs GLM residual out-of-sample'.]
Figure 10.2 Out-of-Sample Residuals for OLS and GLM Models
Figure 10.3 shows the effect of changing the value of $\alpha$ (the :parm setting) on the out-of-sample residual sum of
squares. The residual sum of squares is relatively insensitive to $\alpha$ values in the range of .3 to 1.0.
Figure 10.4 shows the sum of squares is relatively insensitive to models with 8 or more coefficients.
More detail on these results is presented in Table 10.1.
[Figure: plot of FSSTEST against TPARM (.10 to .90), heading 'Parm vs GLM residual SS out of sample'.]

Figure 10.3 Effect of changes in α on the Out-of-Sample RSS
[Figure: plot of RSS_GLM against NE_USED (2 to 12), heading 'RSS out-of-sample as a function of model reduction'.]
Figure 10.4 RSS for Out-of-Sample Models vs Number of Vectors in the Model
Assuming   .5 , the out-of-sample sum of squares varied between 1338.78 for   5.48 to 85.43
for   85.43. Table 10.1 shows added detail as  is changed from .1 to .9..
Table 10.1 Effect on the Out-of-Sample RSS of Values of α and the Number of Coefficients
____________________________________________________________________________

Obs   TPARM    FSSTEST             Obs   RSS_GLM   NE_USED
 1    0.1000   691.0                1    1190.      2.000
 2    0.2000   247.3                2    1055.      3.000
 3    0.3000   74.50                3    933.6      4.000
 4    0.4000   69.11                4    821.6      5.000
 5    0.5000   85.43                5    271.1      6.000
 6    0.6000   84.92                6    194.8      7.000
 7    0.7000   84.35                7    85.43      8.000
 8    0.8000   83.34                8    85.43      9.000
 9    0.9000   82.23                9    85.43     10.00
                                   10    85.43     11.00
                                   11    85.43     12.00
____________________________________________________________________________
10.4 Partial Least Squares and Continuum Regression Models
Define $X$ as the data matrix with means removed, and $y$ as the left-hand-side variable with mean
removed. While the principal component regression model (PCR), discussed in section 10.2,
shrinks the $X$ matrix by using orthogonal projections that key on high variance in the $X$ matrix
only, the partial least squares procedure (PLS), first suggested by Wold (1975), keys on both
high variance and correlation with the left-hand-side variable $y$. As such it can usually explain
more variance than the PCR approach, given that the number of projections is less than the
number of columns in $X$. The performance of PLS vs PCR was studied by Frank-Friedman
(1993) and is summarized in Hastie-Tibshirani-Friedman (2009, 61-99). Stone-Brooks (1990)
suggested the Continuum Regression Model (CRM), of which OLS, PLS and PCR are special
cases. After first discussing PLS and CRM, a number of examples will be developed.
Important research by de Jong-Wise-Ricker (2001) provides both an excellent discussion
of the theory behind PLS and a computationally efficient procedure to obtain both PLS and
CRM results.9 A first approach to estimating a CRM, suggested by Iglarsh-Cheng (1980), formed
a weighted continuum regression estimator (WCR) as

$\hat\beta_{WCR} = (1 - \gamma)\hat\beta_{OLS} + \gamma\hat\beta_{PCR}$    (10.4-1)

which could be estimated with a nonlinear search procedure. The strength of this procedure is
its simplicity. Its developers argued that the WCR estimator has a smaller mean square error than
OLS in the presence of multicollinearity. Note that OLS implies $\gamma = 0$ while PCR implies that
$\gamma = 1$. The PLS estimator, which will be defined next, is not included in this specification, which
will not be discussed further.

Define $\gamma = \alpha/(1-\alpha)$ and note that given $(X'X) = ULU'$, then $(X'X)^\gamma = UL^\gamma U'$. This is
equivalent to modifying $X$ into its "powered" form $X^{(\gamma)} = UL^{\gamma/2}V'$. $\gamma = 0, 1, \infty$ (that is,
$\alpha = 0, \tfrac{1}{2}, 1$) corresponds to the effect on $X$ of OLS, PLS and PCR respectively.10 For $0 < \gamma < 1$
multicollinearity has been artificially taken out of the $X$ matrix, while for $1 < \gamma < \infty$
multicollinearity has been artificially put into the $X$ matrix. Define $T$ as the matrix of PLS components,
or orthogonal score vectors, and $gs(\cdot)$ as the Gram-Schmidt orthogonalization11 of $(\cdot)$, where

$T = gs(K)$    (10.4-2)

given

$K = [ULU'y, \; UL^2U'y, \; \ldots, \; UL^aU'y]$    (10.4-3)

and $a$ is the number of columns of the PLS model. Given that

$p = U'y / |y|$    (10.4-4)

is the vector of correlations of $y$ with the non-zero principal components, the canonical
form of $K$ is

$K = [Lp, \; L^2p, \; \ldots, \; L^ap]$    (10.4-5)

9 This reference will be used to develop the math in the following discussion.

10 The SVD decomposition shows $X = USV'$. $X'X = VSU'USV' = VS^2V'$. For a positive
definite matrix such as $X'X$, then $U = V$ and $S^2 = L$.

11 The Gram-Schmidt orthogonalization of a matrix is usually $U$ from the SVD of the matrix.

from which we can define the orthonormal basis of $K$ as

$\tilde{T} = gs(K) = U'T$    (10.4-6)

$T = U\tilde{T} = XR$    (10.4-7)

where $R$ contains the weights given by

$R = VL^{-.5}U'U\tilde{T} = VL^{-.5}\tilde{T}.$    (10.4-8)

The importance of $R$ is that it shows how each orthogonal column in $T$ is related to the original
data in $X$. The loadings $c$ of $T$ with respect to $y$ are defined as

$c = y'T(T'T)^{-1} = y'U\tilde{T}.$    (10.4-9)

The loadings of $T$ with respect to $X$ are

$P = X'T(T'T)^{-1} = X'T = VL^{.5}U'U\tilde{T} = VL^{.5}\tilde{T}.$    (10.4-10)

The fraction of the variance of $y$ explained by the PLS model is

$R_{yT}^2 = y'T(T'T)^{-1}T'y / y'y = cc' / |y|^2 = p'\tilde{T}\tilde{T}'p$    (10.4-11)

where $p = U'y/|y|$. The PLS fitted $y$ and PLS coefficient vector $\beta_{PLS}$ are

$\hat{y}_{PLS} = T(T'T)^{-1}T'y = Tc' = XRc'$    (10.4-12)

and

$\hat\beta_{PLS} = Rc'$    (10.4-13)

respectively. From $R$ we can obtain the implied vector $\hat\beta_{OLS}$ as

$\hat\beta_{OLS} = R\hat\beta_{PLS}$    (10.4-14)

since the PLS coefficients are related to their implied OLS coefficients. Either can be used to
obtain $\hat{y}$:

$T\hat\beta_{PLS} = X\hat\beta_{OLS} = \hat{y}.$    (10.4-15)

The importance of (10.4-14) is that inspection of the implied OLS coefficients $\hat\beta_{OLS}$ shows how
the latent vectors in $T$ are related to the underlying data vectors in $X$. If the number of latent
vectors is less than the number of original vectors in $X$, the resulting residual sum of squares will
be larger than would occur if all PLS coefficients were used.

Table 10.2 contains the Matlab code to estimate the PLS model using the de Jong (1993)
method of analysis. Table 10.3 shows the B34S implementation of this approach. Although this
approach is outdated, it is still in use in SAS proc pls and in the Matlab plsregress
command.
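Before turning to that code, a minimal Matlab sketch of the "powered" transformation $X^{(\gamma)} = UL^{\gamma/2}V'$ that generates the continuum may be helpful. It is illustrative only (simulated, centered data), and $\alpha = 1$ (PCR), which gives $\gamma = \infty$, must be treated as a limiting case:

% Continuum power regression: raise or lower the multicollinearity of X
% by powering its eigenvalues L = diag(S).^2.
X = randn(50,4); X = X - ones(50,1)*mean(X);   % illustrative centered data
alpha = 0.5;                     % alpha = 0 -> OLS, .5 -> PLS, -> 1 -> PCR
gam   = alpha/(1-alpha);         % continuum power exponent gamma
[U,S,V] = svd(X,0);              % X = U*S*V', so eigenvalues of X'X are S.^2
L     = diag(S).^2;
Xgam  = U*diag(L.^(gam/2))*V';   % X^(gamma) = U*L^(gamma/2)*V'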
Table 10.2 Matlab Code to Obtain PLS Model Solution suggested by de Jong (1993)
function [Xloadings,Yloadings,Xscores,Yscores,Weights] = simpls(X0,Y0,ncomp)
[n,dx] = size(X0);
dy = size(Y0,2);
% Preallocate outputs
outClass = superiorfloat(X0,Y0);
Xloadings = zeros(dx,ncomp,outClass);
Yloadings = zeros(dy,ncomp,outClass);
if nargout > 2
    Xscores = zeros(n,ncomp,outClass);
    Yscores = zeros(n,ncomp,outClass);
    if nargout > 4
        Weights = zeros(dx,ncomp,outClass);
    end
end
% Each new basis vector can be removed from Cov separately.
V = zeros(dx,ncomp);
Cov = X0'*Y0;
for i = 1:ncomp
    % Find unit length ti=X0*ri and ui=Y0*ci whose covariance, ri'*X0'*Y0*ci,
    % is jointly maximized, subject to ti'*tj=0 for j=1:(i-1).
    [ri,si,ci] = svd(Cov,'econ'); ri = ri(:,1); ci = ci(:,1); si = si(1);
    ti = X0*ri;
    normti = norm(ti); ti = ti ./ normti; % ti'*ti == 1
    Xloadings(:,i) = X0'*ti;
    qi = si*ci/normti; % = Y0'*ti
    Yloadings(:,i) = qi;
    if nargout > 2
        Xscores(:,i) = ti;
        Yscores(:,i) = Y0*qi; % = Y0*(Y0'*ti), and proportional to Y0*ci
        if nargout > 4
            Weights(:,i) = ri ./ normti; % rescaled weights
        end
    end
    % Update the orthonormal basis
    vi = Xloadings(:,i);
    for repeat = 1:2
        for j = 1:i-1
            vj = V(:,j);
            vi = vi - (vj'*vi)*vj;
        end
    end
    vi = vi ./ norm(vi);
    V(:,i) = vi;
    Cov = Cov - vi*(vi'*Cov);
    Vi = V(:,1:i);
    Cov = Cov - Vi*(Vi'*Cov);
end
if nargout > 2
    for i = 1:ncomp
        ui = Yscores(:,i);
        for repeat = 1:2
            for j = 1:i-1
                tj = Xscores(:,j);
                ui = ui - (tj'*ui)*tj;
            end
        end
        Yscores(:,i) = ui;
    end
end
Table 10.3 B34S Implementation of SIMPLS Calculation
subroutine pls_reg(y,x,pls_coef,xload,yload,xscores,
           yscores,weights,yhat,pls_res,rss,ncomp,iprint);
/; Partial Least Squares. See Wold (1975)
/; pls_reg is designed to 100% track the Matlab simpls routine
/; from which this discussion of PLS and PC regression has been
/; developed.
/;
/; The Matlab code came from
/; de Jong, S. "SIMPLS: An Alternative Approach to Partial Least Squares
/; Regression." Chemometrics and Intelligent Laboratory Systems. Vol.
/; 18, 1993, pp. 251-263.
/;
/; A newer reference is:
/;
/; y        => left hand variable. Usually %y from olsq with :savex
/;             Can include more than 1 col!
/; x        => right hand variables. Usually %x from olsq with :savex
/;             n by k
/; pls_coef => The pls_beta. Calculated as:
/;             pls_beta=weights*transpose(yload) for each.
/;             pls_coef is a (k,ncomp) matrix of coefficients. If
/;             ncomp is set = k, then pls_coef(,ncomp) is the same
/;             as the ols coefficients
/; xload    => X loadings
/; yload    => Y loadings
/; xscores  => X scores
/; yscores  => Y scores
/; weights  => weights
/; yhat     => Predicted y value
/; pls_res  => Residual for last PLS regression
/; rss      => Residual Sum of Squares for the 1,...,ncomp models
/; ncomp    => # of cols in pls_beta
/; iprint   => =0 print nothing, =1 print results
/;
/; Built 28 April 2011 by Houston H. Stokes
/;
k=nocols(x);
kk=nocols(y);
n=norows(x);
if(ncomp.gt.k)then;
call epprint('ERROR: ncomp must be 0 < ncomp le # cols of x. was',ncomp:);
call epprint('       # of columns of x was ',k:);
go to finish;
endif;
if(norows(y).ne.norows(x))then;
call epprint('ERROR: # of obs in y and x not the same':);
go to finish;
endif;
if(kk.gt.1)then;
call epprint(
'ERROR: This release of pls_reg limited to one left hand variable':);
go to finish;
endif;
meany=mean(y);
meanx=array(k:);
y0=y-meany;
x0=x;
do i=1,k;
meanx(i)=mean(x(,i));
x0(,i)=mfam(afam(x(,i))-sfam(afam(meanx(i))));
enddo;
vbig    =matrix(k,ncomp:);
vi      =matrix(k,1:);
vibig   =matrix(k,1:);
xload   =matrix(k,ncomp:);
yload   =matrix(kk,ncomp:);
xscores =matrix(n,ncomp:);
yscores =matrix(n,ncomp:);
weights =matrix(k,ncomp:);
rss     =vector(ncomp:);
pls_beta=matrix(k,ncomp:);
cov=matrix(nocols(x0),1:transpose(x0)*y0);
do i=1,ncomp;
s=svd(cov,ibad,21,uu,vv);
if(ibad.ne.0)then;
call epprint('ERROR: SVD of cov failed':);
go to finish;
endif;
ri=uu(,1);
ci=vv(,1);
si=s(1);
ti=x0*ri;
normti=sqrt(sumsq(ti));
if(normti.le.0.0)then;
call epprint('ERROR: Norm of ti le 0.0':);
go to finish;
endif;
ti=vfam(afam(ti)/normti);
xload(,i)=transpose(x0)*ti;
qi=si*ci/normti;
yload(,i)=qi;
xscores(,i)=ti;
yscores(,i)=y0*qi;
weights(,i)=vfam(afam(ri)/sfam(normti));
vi(,1)=xload(,i);
do repeat=1,2;
if(i.gt.1)then;
do j=1,(i-1);
vj(,1)=vbig(,j);
vi=mfam(afam(vi)-(sfam(transpose(vj)*vi)*afam(vj)));
enddo;
endif;
enddo;
normvi=sqrt(sumsq(vi));
if(normvi.le.0.0)then;
call epprint('ERROR: Norm of vi le 0.0':);
go to finish;
endif;
xjunk=afam(afam(vi)/normvi);
vi=matrix(norows(xload),1:xjunk);
vbig(,i)=vi(,1);
cov=cov - (vi*(transpose(vi)*cov));
vibig=submatrix(vbig,1,norows(vbig),1,i);
cov=cov - (vibig*(transpose(vibig)*cov));
do iii=1,ncomp;
ui=yscores(,iii);
do repeat=1,2;
if(iii.gt.1)then;
do j=1,(iii-1);
tj=xscores(,j);
xwork=sfam(tj*ui);
ui=mfam(afam(ui)-(xwork*afam(tj)));
enddo;
endif;
enddo;
yscores(,iii)=ui;
enddo;
pls_beta=weights*transpose(yload);
adj=array(nocols(x):);
do jj=1,nocols(x);
adj(jj)=meanx(jj)*sfam(pls_beta(jj,1));
enddo;
scale=meany-sum(adj);
jj=norows(pls_beta);
pls_beta(jj,1)=scale;
yhat=x*pls_beta;
pls_res=vfam(afam(y)-afam(yhat));
rss(i)=sumsq(pls_res);
pls_coef(,i)=pls_beta(,1);
enddo;
if(iprint.ne.0)then;
call print(' ':);
iix=nocols(x);
call print('Partial Least Squares - 26 April 2011 Version' :);
call print('Number Columns in original data          ',iix :);
call print('Number Columns in PLS Coefficient Vector ',ncomp:);
call print('PLS sum of squared errors                ',rss(ncomp):);
endif;
/;
go to done;
finish continue;
yhat=missing();
pls_beta=missing();
done continue;
return;
end;
Table 10.4 PLS1: the de Jong-Wise-Ricker (2001) PLS-CRM Estimation Approach

function [c,R,P,B_CPR,R2X,R2y,T,B_CPRmh] = pls1(x,y,A,alpha,yORG,xORG)
%
% Code suggested by de Jong-Wise-Ricker, J. Chemometrics 2001, 15: 85-100
%
% Inputs:
% x      x matrix
% y      y matrix
% A      dimensionality of PLS model
% alpha  =0, .5, 1. for OLS, PLS and PC
%
% Outputs:
% T      orthonormal PLS component scores where T = XR
% R      weights
% P      loadings of T with respect to X
% B_CPR  betas - PLS regression vector
% c      loadings of T with respect to y
% +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
% Code from de Jong, Wise, Ricker (2001)
% 29 April 2011 version. Additions by Michael Hunstad
% +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[U,L,V]=svd(x,0);                % SVD
L=diag(L.^2);                    % eigenvalues
r=sum(L>(L(1)/1e14));            % rank of x
L=L(1:r);                        % non-zero eigenvalues
U=U(:,1:r);                      % unit-length PC scores
V=V(:,1:r);                      % PC weights
A=min(A,r);                      % dimensionality
gam=alpha/(1-alpha);             % continuum power exponent
Lgam=L.^gam;                     % 'powered' eigenvalues
rho=(y'*U)';                     % |y| * corr(y,PCs)
T=zeros(r,A);                    % initialize T
for a=1:A
  t=Lgam.*rho;                                   % can. version of (XX')^gam*y
  t=t-T(:,1:max(1,a-1))*(T(:,1:max(1,a-1))'*t);  % orthogonalize
  t=t/sqrt(t'*t);                                % normalize t
  T(:,a)=t;                                      % store in T
  rho=rho-t*(t'*rho);                            % residual of rho w.r.t. t
end
c=rho'*T;                        % loadings c of T w.r.t. y
cmh=y'*U*T;                      % this is added
R=V*(diag(1./sqrt(L))*T);
P=V*(diag(sqrt(L))*T);
B_CPR=R*triu(c'*ones(size(c)));  % ones(size()) has been fixed
meanY=mean(yORG);                % mean of original data - i.e., not de-meaned
xORG(:,end+1)=ones(size(xORG,1),1);
meanX=mean(xORG);                % mean of original data - i.e., not de-meaned
B_CPRmh=R*cmh';
B_CPRmh(end)=[meanY - meanX*B_CPRmh];
R2X=100*L'*cumsum(T'.^2)'/sum(L);   % R-squared on X, Eq. 18
R2y=100*cumsum(c.^2)/(y'*y);        % R-squared on y, Eq. 17
T=U*T;                              % Eq. 11
% note that throughout the algorithm T is actually Ttilde until the last step
The algorithm logic in Tables 10.2 and 10.3 shows that initially the covariance matrix of
$X$ and $y$ is formed. The SVD is then repeatedly calculated as the information contained in each $T$
vector is used to update the covariance matrix. A number of loops are required. In contrast, the de
Jong-Wise-Ricker (2001) algorithm achieves the same result with only one loop and one SVD
calculation. The Matlab logic is shown in Table 10.4 and its B34S implementation in Table 10.5.
Comments have been added to show additions to the Matlab code. The crmtest program was
developed to graphically display the effect of changing $\gamma$ on the residual sum of squares.
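As a usage sketch (illustrative only, not from the text), the pls1 function of Table 10.4 can be called at the three named settings of $\alpha$. Here x and y are assumed to be the de-meaned data, yORG and xORG the raw data, and $\alpha$ is kept strictly below 1 to avoid dividing by zero in $\gamma = \alpha/(1-\alpha)$:

% Illustrative calls of pls1() from Table 10.4 with A = 3 latent vectors.
y0 = y - mean(y);
x0 = x - ones(size(x,1),1)*mean(x);
[c,R,P,B_ols] = pls1(x0,y0,3,0.0,  y,x);   % alpha = 0      : OLS weighting
[c,R,P,B_pls] = pls1(x0,y0,3,0.5,  y,x);   % alpha = .5     : PLS
[c,R,P,B_pcr] = pls1(x0,y0,3,0.999,y,x);   % alpha near 1   : approaches PCR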
Table 10.5 B34S Implementation of PLS1 including CRMTEST
/;
/; Also includes crmtest to graphically study CRM Model
/;
subroutine pls1_reg(y,x,y0,x0,r,pls_beta,
           u,v,s,pls_coef,yhat,pls_res,rss,ncomp1,gamma,iprint);
/;
/; Partial Least Squares. See Wold (1975). This subroutine allows the
/; user to artificially decrease (gamma < 1) or increase (gamma > 1)
/; the degree of multicolinearity in the X data. Only one SVD is used,
/; in contrast to the simpls approach coded in pls_reg, which should
/; increase performance. This is called canonical PLS.
/;
/; pls_reg and pls1_reg take the response variable into account, and
/; therefore often lead to models that are able to fit the response
/; variable with fewer components.
/;
/; pls1_reg is designed to implement de Jong-Wise-Ricker Matlab code
/; from which this discussion of PLS and PC regression has been
/; developed.
/;
/; Alternative Matlab code came from
/; de Jong, S. "SIMPLS: An Alternative Approach to Partial Least Squares
/; Regression." Chemometrics and Intelligent Laboratory Systems. Vol.
/; 18, 1993, pp. 251-263 and was implemented in B34S as pls_reg. This
/; approach is substantially slower and has less capability.
/;
/; The complete reference for this code is:
/;
/; de Jong, Sijmen, Barry Wise, N. Lawrence Ricker "Canonical Partial
/; Least Squares and Continuum Power Regression," Journal of
/; Chemometrics Vol. 15, 2001, pp 85-100
/;
/; pc_reg creates components to explain the observed variability in the
/; predictor variables, without considering the response variable at all.
/;
/; y        => left hand variable. Usually %y from olsq with :savex
/; x        => right hand variables. Usually %x from olsq with :savex
/;             n by k
/; u        => from svd(x)   x=u*diagmat(s)*transpose(v)
/; v        => from svd(x)
/; s        => singular values
/; y0       => y with mean removed
/; x0       => X with means subtracted except for last col
/; r        => weight matrix. Note bigt=x0*r from equation (16)
/; pls_beta => pls_beta such that x0*r*pls_beta + mean(y) maps to yhat
/;             t*pls_beta + mean(y) since t = x0*r
/; pls_coef => If ncomp is set = k, then pls_coef is the same as the
/;             ols coefficients. This might change in future releases
/;             to be similar to pls_reg.
/;             Note: (t*pls_beta)+mean(y) = x*pls_coef = yhat
/; yhat     => Predicted y value for last regression
/; pls_res  => Residual for last PLS regression
/; rss      => Residual Sum of Squares for all ncomp models
/; ncomp    => # of cols in pls_beta
/; gamma    => =0 for OLS, =1. for PLS, => infinity for PC
/;             Use caution changing gamma:
/;             gamma = 0 => OLS
/;             gamma = 1 => PLS
/;             gamma > 1 => increasing multicolinearity in dataset,
/;                          set .le. 20
/;             gamma < 1 => decrease multicolinearity in dataset
/; iprint   => =0 print nothing, =1 print results,
/;             =2 suppress coef list
/;
/; Note: bigt= x0*r;
/;       bigt= u * t_tilda => t_tilda= transpose(u) * bigt
/;       loadings of bigt with respect to y0
/;       cc = y0*u*t_tilda = y0*bigt
/;       fitted y0= bigt*cc
/;
/; Built 11 May 2011 by Houston H. Stokes
/; Contributions made by Michael Hunstad to Matlab version.
/; First equation number refers to de Jong-Wise-Ricker (2001)
/;
kin=nocols(x);
kk=nocols(y);
n=norows(x);
ncomp=ncomp1;
if(ncomp.gt.kin)then;
call epprint('ERROR: ncomp must be 0 < ncomp le # cols of x. was',ncomp:);
call epprint('       # of columns of x was ',kin:);
go to finish;
endif;
if(norows(y).ne.norows(x))then;
call epprint('ERROR: # of obs in y and x not the same':);
go to finish;
endif;
if(kk.gt.1)then;
call epprint(
'ERROR: This release of pls1_reg limited to one left hand variable':);
go to finish;
endif;
meany=mean(y);
meanx=array(kin:);
y0=y-meany;
x0=x;
do i=1,(kin-1);
meanx(i)=mean(x(,i));
x0(,i)=mfam(afam(x(,i))-sfam(afam(meanx(i))));
enddo;
s=svd(x0,ibad,21,u,v);
if(ibad.ne.0)then;
call epprint('ERROR: SVD x0 failed':);
go to finish;
endif;
L=afam(s)*afam(s);
k=idint(sum(L.gt.(sfam(L(1))/1.e+14)));
L=L(integers(1,k));
ncomp=min1(ncomp,k);
u=submatrix(u,1,norows(u),1,k);
v=submatrix(v,1,norows(v),1,k);
Lgam=vfam(afam(L)**gamma);
rho=(y0*u);
bigt=matrix(k,ncomp:);
rss=vector(ncomp:);
do i=1,ncomp;
maxcol=max1(1,i-1);
t=afam(Lgam)*afam(rho);
t=t- afam(submatrix(bigt,1,k,1,maxcol)*
   (transpose(submatrix(bigt,1,k,1,maxcol))*vfam(t)));
t=vfam(afam(t)/sqrt(sumsq((t))));
bigt(,i)=vfam(t);
rho=rho-(vfam(t)*(vfam(t)*vfam(rho)));
/; Equation (16)   (10.4-13)
pls_beta=y0*u*bigt;
yhat=(u*bigt*pls_beta) + meany;
rss(i)=sumsq((afam(y)-afam(yhat)));
enddo;
/; Equation (14)   (10.4-8)
r=v*(diagmat((1./dsqrt(afam(L))))*bigt);
/; Equation (22)   (10.4-13)
pls_coef=r*pls_beta;
i=norows(pls_coef);
pls_coef(i)=sfam(meany)-sfam(vfam(meanx)*pls_coef);
/; Equation (11)   (10.4-7)
bigt=u*bigt;
yhat=x*pls_coef;
pls_res=vfam(afam(y)-afam(yhat));
tss=variance(y)*dfloat(norows(y)-1);
rsq=1.0-afam(rss(ncomp)/tss);
if(iprint.ne.0)then;
call print(' ':);
iix=nocols(x);
call print('Partial Least Squares PLS1 - 9 May 2011 Version.' :);
call print('Logic from de Jong, Wise, Ricker (2001) Matlab Code':);
call print('Number of rows in original data          ',norows(x):);
call print('Number Columns in original data          ',iix:);
call print('Number Columns in PLS Coefficient Vector ',ncomp:);
if(ncomp.lt.ncomp1)
call print('Note: PLS coefficient vector reduced due to rank of X':);
call print('Note: PLS coefficient vector reduced due to rank of X':);
call print('Gamma
',
gamma:);
call print('Mean of left hand variable
',
meany:);
call print('PLS sum of squared errors
',
rss(ncomp):);
call print('Total sum of squares
',tss:);
call print('PLS R^2
',rsq:);
if(iprint.ne.2)then;
call tabulate(pls_beta,pls_coef
:title '(T*pls_beta)+mean(y) = x*pls_coef');
endif;
endif;
/;
go to done;
finish continue;
yhat=missing();
phs_coef=missing();
done continue;
return;
end;
subroutine crmtest(y,x,ncomp,gammag,rsstest,rote,iprint,noshow);
/;
/; Investigates the effect of changes in gamma on the RSS
/; Various gamma => alternate Continuum Regression Models
/;
/; y
=> left hand variable. Must be vector
/; x
=> right hand variable matrix with constant included
/; ncomp => # of PLS/CR vectors
/; gammg => Vector of gamma values
/; rsstest=> Matrix of RSS values for 1-ncomp vectors and gammag
/; rote
=> Sets rotation
/; iprint => =0 do not give pls1_reg output
/;
=1 give pls1_reg output
/;
=2 do not give pls1_coef list
10-33
10-34
Chapter 10
/; noshow => =1 Just produce graph in crm_test.wmf
/;
=0 show graph and save graph
/;
/;
/;
/;
0 < gamma < 1 => Multicollinearity taken from X matrix
/;
1 < gamma < 15.=> Multicollinearity added to X matrix.
/;
gamma = 1 => PLS Model. a large value of gamma
/;
approachs PC
/;
/; Subroutine crmtest built 23 May 2010 by Houston H. Stokes
/; +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/;
ii=ncomp;
jj=norows(gammag);
rsstest=matrix(ii,jj:);
do j=1,jj;
gamma=gammag(j);
call pls1_reg(y,x,y0,x0,r,c,,u,v,s,pls2coef,yhat,
pls_res,pls_rss,
ncomp,gamma,iprint);
rsstest(,j)=pls_rss;
enddo;
scaleadj=array(4: dfloat(1), gammag(1),
dfloat(ii),gammag(norows(gammag)));
if(noshow.eq.1)then;
call graph(rsstest :plottype meshc :d3axis :d3border :grid :noshow
:rotation rote :pgborder :file 'crm_test.wmf'
:xlabel '# Vectors' :ylabelleft 'Gamma'
:pgunits scaleadj
:heading 'RSS vs # vectors and gamma' );
endif;
if(noshow.ne.1)then;
call graph(rsstest :plottype meshc :d3axis :d3border :grid
:rotation rote :pgborder :file 'crm_test.wmf'
:xlabel '# Vectors' :ylabelleft 'Gamma'
:pgunits scaleadj
:heading 'RSS vs # vectors and gamma' );
endif;
return;
end;
Table 10.6 is the driving program to analyse the gas data using the PLS and CRM
methods of analysis.
Table 10.6 Effect on the RSS of PCM, PLS and CRM Models of Varying Degrees
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(pls_reg);
call load(pls1_reg);
call load(pc_reg);
call echooff;
nn=6;
call olsq(gasout gasin{1 to nn} gasout{1 to nn} :print :savex);
ols_coef=%coef;
ols_rss =%rss;
iprint=1;
%xhold=%x;
%yhold=%y;
/; Test model - As setup will stop
testmod=1;
if(testmod.ne.0)then;
iprint=2;
noshow=0;
ncomp=4;
gammag=grid(.05,1.,.01);
call print(gammag);
rote=270.;
call crmtest(%y,%x, ncomp,gammag,rsstest,rote,iprint,noshow);
call print(rsstest);
call stop;
endif;
call pc_reg(%y,%x,ols_coef,ols_rss,tss,pc_coef,
 pcrss,pc_size,u,iprint);
jj=integers(norows(pcrss),1,-1);
pc_rss=pcrss(jj);
/;
/; Note: Use caution changing gamma for pls1_reg
/;       gamma = 0 => OLS
/;       gamma = 1 => PLS
/;       gamma > 1 => increasing multicolinearity in dataset
/;       gamma < 1 => decrease multicolinearity in dataset
/;
ncompmax=9;
/;
/; simpls code: Left in for testing purposes only.
/; pls1_reg much faster!
/;
/; call pls_reg(%y,%x,pls_coef,xload,yload,xscores,
/;      yscores,weights,yhat,error_pls,pls_rss,ncompmax,iprint);
/;
/; call tabulate(pc_rss,pls_rss);
/;
gamma= 1.;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,pls_rss,ncompmax,gamma,iprint);
k1=norows(pls_rss);
k2=norows(pc_rss);
if(k1.lt.k2)call deleterow(pc_rss,k1+1,(k2-k1));
%x=%xhold;
%y=%yhold;
rss_1p=pls_rss;
ncompmax=6;
gamma= 1.;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,pls_rss,ncompmax,gamma,iprint);
jj=integers(norows(pcrss),1,-1);
pc_rss=pcrss(jj);
gamma= .1;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_p1 ,ncompmax,gamma,iprint);
gamma= .2;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_p2 ,ncompmax,gamma,iprint);
gamma= .4;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_p4 ,ncompmax,gamma,iprint);
gamma= 1.5;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_1p5 ,ncompmax,gamma,iprint);
gamma= 10.;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_10p,ncompmax,gamma,iprint);
gamma= 15.;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_15p,ncompmax,gamma,iprint);
k1=norows(rss_p1);
k2=norows(pc_rss);
if(k1.lt.k2)call deleterow(pc_rss,k1+1,(k2-k1));
call graph(pc_rss rss_p1 rss_p2, rss_p4,pls_rss,
 rss_1p5,rss_10p,rss_15p
 :nocontact :pgborder :grid :nolabel
 :file 'pc_pls_rss.wmf'
 :heading 'Res Sum sq vs # pls_.5, pls,% pls_1p5 components');
call tabulate(pc_rss rss_p1 rss_p2, rss_p4,pls_rss,
 rss_1p5,rss_10p,rss_15p
 :title 'As Gamma increases => PC Regression rss');
gamma= .1;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,rss_p1 ,3,gamma,iprint);
t=(x0*r);
call print(' ':);
call print('Using Only Three vectors of Reduced Model we Try OLS':);
call olsq(%y t :print);
call print(' ':);
call print('We now see what Stepwise will do in terms of RSS':);
call stepwise(%y %xhold :nstep 3
 :noint :print :printsteps);
b34srun;
Edited results from running the commands in Table 10.6 are listed next.
B34SI 8.11F  (D:M:Y) 19/ 5/11 (H:M:S) 20:45:35  DATA STEP  PAGE 1

Variable  Label                            # Cases  Mean           Std. Dev.  Variance  Maximum  Minimum
TIME                                         296    148.500        85.5921    7326.00   296.000   1.00000
GASIN     Input gas rate in cu. ft / min     296   -0.568345E-01   1.07277    1.15083   2.83400  -2.71600
GASOUT    Percent CO2 in outlet gas          296    53.5091        3.20212    10.2536   60.5000   45.6000
CONSTANT                                     296    1.00000        0.00000    0.00000   1.00000   1.00000

Number of observations in data file    296
Current missing variable code          1.000000000000000E+031

B34SI Matrix Command. d/m/y 19/ 5/11. h:m:s 20:45:35.

=> CALL LOADDATA$
=> CALL LOAD(PLS1_REG)$
=> CALL LOAD(PC_REG)$
=> CALL ECHOOFF$

The OLS results listed next are the "base case."

Ordinary Least Squares Estimation
Dependent variable                 GASOUT
Centered R**2                      0.994664107436370
Adjusted R**2                      0.994432949635779
Residual Sum of Squares            16.1385829591582
Residual Variance                  5.826203234353124E-002
Standard Error                     0.241375293564878
Total Sum of Squares               3024.53296551724
Log Likelihood                     7.36469419004290
Mean of the Dependent Variable     53.5096551724138
Std. Error of Dependent Variable   3.23504435694615
Sum Absolute Residuals             48.1338529453902
F(12, 277)                         4302.96578742115
F Significance                     1.00000000000000
1/Condition XPX                    1.929993696611666E-008
Maximum Absolute Residual          1.43081466326252
Number of Observations             290

Variable   Lag   Coefficient        SE                t
GASIN       1    0.63160860E-01     0.75989856E-01    0.83117489
GASIN       2   -0.13345763         0.16490508       -0.80929968
GASIN       3   -0.44123536         0.18869442       -2.3383593
GASIN       4    0.15200749         0.19021604        0.79913078
GASIN       5   -0.12036440         0.17941884       -0.67085705
GASIN       6    0.24930584         0.10973982        2.2717902
GASOUT      1    1.5452265          0.59808504E-01    25.836234
GASOUT      2   -0.59293307         0.11024897       -5.3781279
GASOUT      3   -0.17105674         0.11518138       -1.4851076
GASOUT      4    0.13238479         0.11465530        1.1546329
GASOUT      5    0.56869923E-01     0.10083191        0.56400722
GASOUT      6   -0.42085617E-01     0.42891891E-01   -0.98120217
CONSTANT    0    3.8241094          0.85547296        4.4701698
The principal component results indicate the increase in e'e that occurs when fewer than 13 principal component vectors are used.
Principle Component Regression Model
Total Sum of Squares           3024.53296551724
Number of observations in X    290
Number of columns in X         13

PC and OLS Coefficients
Obs   PC_COEF     OLS_COEF
  1   -912.2       0.6316E-01
  2   -27.16      -0.1335
  3    5.125      -0.4412
  4   -18.93       0.1520
  5    5.327      -0.1204
  6    3.503       0.2493
  7   -2.414       1.545
  8   -1.023      -0.5929
  9   -1.104      -0.1711
 10   -0.3703      0.1324
 11   -0.3912      0.5687E-01
 12    0.3701     -0.4209E-01
 13   -1.058       3.824

Shrinkage Accuracy loss
Obs   PC_SIZE   PC_RSS   RSQ
  1     13       16.14   0.9947
  2     12       17.26   0.9943
  3     11       17.39   0.9942
  4     10       17.55   0.9942
  5      9       17.68   0.9942
  6      8       18.90   0.9937
  7      7       19.95   0.9934
  8      6       25.78   0.9915
  9      5       38.05   0.9874
 10      4       66.42   0.9780
 11      3       424.9   0.8595
 12      2       451.1   0.8508
 13      1       1189.   0.6070
Note that for 9 partial least squares vectors e'e = 16.1597, while for the PCR model with 9 vectors e'e = 17.68.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    9
Gamma                                       1.00000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   16.1597525278728
Total sum of squares                        3024.53296551724
PLS R^2                                     0.994657108151205

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    47.70        0.6239E-01
  2    25.33       -0.1572
  3    6.602       -0.3602
  4    5.115        0.4605E-01
  5    3.910       -0.5736E-01
  6    2.053        0.2355
  7    1.325        1.550
  8    0.8342      -0.6111
  9    0.2304      -0.1489
 10    NA           0.1277
 11    NA           0.4777E-01
 12    NA          -0.3690E-01
 13    NA           3.820
Here only 6 latent vectors are calculated, but the first 6 values of PLS_BETA are the same as when 9 were calculated in the prior estimation. The implied OLS coefficients are, however, all different. For this problem e'e = 18.665, while for a PCR model with 6 latent vectors e'e = 25.78, which is substantially larger.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       1.00000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   18.6653423691790
Total sum of squares                        3024.53296551724
PLS R^2                                     0.993828686087412

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    47.70        0.1171
  2    25.33       -0.1598
  3    6.602       -0.3709
  4    5.115       -0.2662
  5    3.910        0.6345E-01
  6    2.053        0.4162
  7    NA           1.222
  8    NA          -0.5941E-01
  9    NA          -0.3070
 10    NA          -0.1807E-01
 11    NA           0.1571
 12    NA          -0.5878E-01
 13    NA           3.439
Here a CRM model is estimated where γ = .1. Note e'e = 16.138, which shows the gain over the prior PLS model. Models with γ = .2 and γ = .4 were tried with only marginal changes in e'e.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       0.100000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   16.1385839842000
Total sum of squares                        3024.53296551724
PLS R^2                                     0.994664107097461

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA       PLS_COEF
  1    54.53          0.6322E-01
  2    5.776         -0.1337
  3    1.081         -0.4408
  4    0.1641         0.1515
  5    0.3024E-01    -0.1199
  6    0.5818E-02     0.2490
  7    NA             1.545
  8    NA            -0.5930
  9    NA            -0.1711
 10    NA             0.1324
 11    NA             0.5687E-01
 12    NA            -0.4209E-01
 13    NA             3.826
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       0.200000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   16.1391507928914
Total sum of squares                        3024.53296551724
PLS R^2                                     0.994663919693753

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA       PLS_COEF
  1    53.82          0.6393E-01
  2    10.14         -0.1367
  3    2.845         -0.4325
  4    0.8182         0.1397
  5    0.2597        -0.1107
  6    0.9250E-01     0.2436
  7    NA             1.545
  8    NA            -0.5923
  9    NA            -0.1745
 10    NA             0.1336
 11    NA             0.5915E-01
 12    NA            -0.4358E-01
 13    NA             3.860
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       0.400000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   16.2201208262738
Total sum of squares                        3024.53296551724
PLS R^2                                     0.994637148607339

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    52.00        0.6955E-01
  2    16.39       -0.1453
  3    5.139       -0.3834
  4    2.643        0.2769E-01
  5    1.390       -0.4567E-01
  6    0.6000       0.2386
  7    NA           1.514
  8    NA          -0.5271
  9    NA          -0.2266
 10    NA           0.1303
 11    NA           0.1027
 12    NA          -0.6803E-01
 13    NA           4.005
The next three models show CRM models where γ was set to 1.5, 10.0 and 15.0 respectively. The result was a poorer fit.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       1.50000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   20.8390179186053
Total sum of squares                        3024.53296551724
PLS R^2                                     0.993110004699505

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    46.13        0.1431
  2    27.75       -0.1815
  3    6.983       -0.4012
  4    5.662       -0.3103
  5    4.138        0.7465E-01
  6    2.882        0.4672
  7    NA           1.094
  8    NA           0.9743E-01
  9    NA          -0.2608
 10    NA          -0.9247E-01
 11    NA           0.9881E-01
 12    NA          -0.2192E-02
 13    NA           3.501
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       10.0000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   43.2030846923479
Total sum of squares                        3024.53296551724
PLS R^2                                     0.985715783168870

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    45.08       -0.1727
  2    29.07       -0.1519
  3    7.601       -0.1474
  4    6.030       -0.1569
  5    3.252       -0.1369
  6    1.041       -0.4914E-01
  7    NA           0.7445
  8    NA           0.3305
  9    NA          -0.1016
 10    NA          -0.3090
 11    NA          -0.1595
 12    NA           0.2667
 13    NA           12.19
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    6
Gamma                                       15.0000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   200.245716277202
Total sum of squares                        3024.53296551724
PLS R^2                                     0.933792847173363

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    45.08        0.4093
  2    29.07        0.7494E-01
  3    7.601       -0.2962
  4    6.059       -0.5521
  5    6.059       -0.5721
  6    6.059       -0.3516
  7    NA           0.8249
  8    NA           0.1877
  9    NA          -0.3572
 10    NA          -0.4844
 11    NA          -0.1153
 12    NA           0.5611
 13    NA           20.45
As Gamma increases => PC Regression rss

This table summarizes the effect on the residual sum of squares of γ settings of .1, .2, .4, 1.0, 1.5, 10.0 and 15.0, each estimated with 6 PLS vectors, compared to the PC_RSS vector.
Obs   PC_RSS   RSS_P1   RSS_P2   RSS_P4   PLS_RSS   RSS_1P5   RSS_10P   RSS_15P
  1   1189.    50.69    127.8    320.5    749.6     896.9     992.0     992.0
  2   451.1    17.34    24.98    51.91    107.9     127.1     146.8     146.8
  3   424.9    16.17    16.88    25.50    64.33     78.33     89.06     89.06
  4   66.42    16.14    16.22    18.51    38.17     46.27     52.70     52.70
  5   38.05    16.14    16.15    16.58    22.88     29.15     42.12     89.76
  6   25.78    16.14    16.14    16.22    18.67     20.84     43.20     200.2
Using only three latent vectors with γ = .1, the residual sum of squares is 16.166. This is substantially smaller than the case with the PC approach, where e'e = 424.9.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    3
Gamma                                       0.100000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   16.1664707952968
Total sum of squares                        3024.53296551724
PLS R^2                                     0.994654886893411

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA     PLS_COEF
  1    54.53        0.6363E-01
  2    5.776       -0.1371
  3    1.081       -0.4245
  4    NA           0.1119
  5    NA          -0.1005
  6    NA           0.2502
  7    NA           1.521
  8    NA          -0.5501
  9    NA          -0.1862
 10    NA           0.1164
 11    NA           0.6327E-01
 12    NA          -0.3639E-01
 13    NA           3.867
Using Only Three vectors of Reduced Model we Try OLS

Ordinary Least Squares Estimation
Dependent variable                 %Y
Centered R**2                      0.994654886893411
Adjusted R**2                      0.994598819273412
Residual Sum of Squares            16.1664707952967
Residual Variance                  5.652612166187671E-002
Standard Error                     0.237752227459338
Total Sum of Squares               3024.53296551724
Log Likelihood                     7.11434715299218
Mean of the Dependent Variable     53.5096551724138
Std. Error of Dependent Variable   3.23504435694615
Sum Absolute Residuals             48.2875380708319
F( 3, 286)                         17740.2730293859
F Significance                     1.00000000000000
1/Condition XPX                    3.452225714523516E-003
Maximum Absolute Residual          1.41956375257301
Number of Observations             290

Variable   Lag   Coefficient   SE                t
Col____1    0    54.532922     0.23775223        229.36871
Col____2    0    5.7756087     0.23775223        24.292554
Col____3    0    1.0813185     0.23775223        4.5480900
CONSTANT    0    53.509655     0.13961292E-01    3832.7153
We now see what Stepwise will do in terms of RSS

Stepwise Option called.
Y Variable                %Y
Y Variable Mean           53.5096551724138
Y Variable Variance       10.4655119914091
Y Variable Number         13
Number of observations    290
PIN set as                5.000000000000000E-002
POUT set as               0.100000000000000
TOL set as                2.220446049250313E-014

Constant estimated in all models - Listed for Final Model

X Var. #   Name       Lag   Mean              Var
   1       Col____1    0   -0.59800000E-01    1.1731711
   2       Col____2    0   -0.57886207E-01    1.1737639
   3       Col____3    0   -0.56775862E-01    1.1742883
   4       Col____4    0   -0.56613793E-01    1.1743570
   5       Col____5    0   -0.57286207E-01    1.1741485
   6       Col____6    0   -0.58534483E-01    1.1738231
   7       Col____7    0    53.496207         10.423757
   8       Col____8    0    53.482759         10.373543
   9       Col____9    0    53.467931         10.308830
  10       Col___10    0    53.451379         10.227766
  11       Col___11    0    53.434483         10.139360
  12       Col___12    0    53.417931         10.047221
FORWARD STEPWISE SELECTION

STEP 0: No variables entered.

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    1       -1.788     0.1410     -12.682       0.0000     1
    2       -2.165     0.1212     -17.873       0.0000     1
    3       -2.516     0.0947     -26.586       0.0000     1
    4       -2.761     0.0669     -41.239       0.0000     1
    5       -2.838     0.0546     -51.977       0.0000     1
    6       -2.732     0.0710     -38.483       0.0000     1
    7        0.975     0.0137      71.218       0.0000     1
    8        0.904     0.0258      35.057       0.0000     1
    9        0.805     0.0357      22.520       0.0000     1
   10        0.696     0.0433      16.085       0.0000     1
   11        0.593     0.0486      12.201       0.0000     1
   12        0.506     0.0522       9.681       0.0000     1
STEP 1 : Variable 7 entered.

Dependent    R-squared    Adjusted     Est. Std. Dev.
Variable     (percent)    R-squared    of Model Error
    13        94.627       94.608        0.7512

        * * * Analysis of Variance * * *
                    Sum of     Mean
Source        DF    Squares    Square    Overall F   Prob. of Larger F
Regression     1    2862.0     2862.0    5072.046    0.0000
Error        288     162.5        0.6
Total        289    3024.5

        * * * Inference on Coefficients * * *
        (Conditional on the Selected Model)
            Coef.      Standard                 Prob. of   Variance
Variable    Estimate   Error      t-statistic   Larger t   Inflation
    7       0.975      0.0137     71.218        0.0000     1.00

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    1       -0.495     0.0365     -13.584       0.0000      1.31
    2       -0.662     0.0328     -20.179       0.0000      1.56
    3       -0.860     0.0310     -27.718       0.0000      2.12
    4       -1.072     0.0429     -24.958       0.0000      3.50
    5       -1.033     0.0908     -11.381       0.0000      7.16
    6        0.421     0.1336       3.150       0.0018     11.06
    8       -0.857     0.0306     -27.994       0.0000     18.53
    9       -0.405     0.0207     -19.580       0.0000      5.25
   10       -0.245     0.0178     -13.746       0.0000      2.75
   11       -0.161     0.0166      -9.717       0.0000      1.89
   12       -0.108     0.0159      -6.781       0.0000      1.51
STEP 2 : Variable 8 entered.

Dependent    R-squared    Adjusted     Est. Std. Dev.
Variable     (percent)    R-squared    of Model Error
    13        98.560       98.550        0.3896

        * * * Analysis of Variance * * *
                    Sum of     Mean
Source        DF    Squares    Square    Overall F   Prob. of Larger F
Regression     2    2981.0     1490.5    9819.495    0.0000
Error        287      43.6        0.2
Total        289    3024.5

        * * * Inference on Coefficients * * *
        (Conditional on the Selected Model)
            Coef.      Standard                 Prob. of   Variance
Variable    Estimate   Error      t-statistic   Larger t   Inflation
    7        1.807     0.03056     59.126       0.0000     18.53
    8       -0.857     0.03063    -27.994       0.0000     18.53

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    1       -0.279     0.02036    -13.687       0.0000      1.53
    2       -0.379     0.02154    -17.580       0.0000      2.15
    3       -0.514     0.02792    -18.403       0.0000      3.79
    4       -0.551     0.05009    -11.003       0.0000      7.96
    5       -0.082     0.07212     -1.131       0.2589     11.64
    6        0.323     0.06792      4.749       0.0000     11.08
    9        0.465     0.05263      8.835       0.0000     68.97
   10        0.213     0.02172      9.788       0.0000     12.22
   11        0.121     0.01358      8.936       0.0000      4.54
   12        0.080     0.01042      7.695       0.0000      2.50
STEP 3 : Variable 3 entered.

Dependent    R-squared    Adjusted     Est. Std. Dev.
Variable     (percent)    R-squared    of Model Error
    13        99.341       99.334        0.2641

        * * * Analysis of Variance * * *
                    Sum of     Mean
Source        DF    Squares    Square    Overall F    Prob. of Larger F
Regression     3    3004.6     1001.5    14361.258    0.0000
Error        286      19.9        0.1
Total        289    3024.5

        * * * Inference on Coefficients * * *
        (Conditional on the Selected Model)
            Coef.      Standard                 Prob. of   Variance
Variable    Estimate   Error      t-statistic   Larger t   Inflation
    3       -0.514     0.02792    -18.403       0.0000      3.79
    7        1.352     0.03225     41.926       0.0000     44.91
    8       -0.518     0.02777    -18.648       0.0000     33.16

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    1       -0.052     0.02918     -1.775       0.0770      4.17
    2       -0.139     0.05694     -2.442       0.0152     16.05
    4        0.324     0.07612      4.258       0.0000     29.88
    5        0.282     0.04985      5.663       0.0000     13.41
    6        0.300     0.04440      6.762       0.0000     11.09
    9        0.200     0.04233      4.730       0.0000     82.25
   10        0.103     0.01758      5.852       0.0000     14.62
   11        0.060     0.01060      5.647       0.0000      5.23
   12        0.039     0.00784      5.009       0.0000      2.77

Intercept     8.85996699737638
Std. error    0.423056555466042
t Stat.       20.9427483935716
Figure 10.5 shows that the explained sum of squares starts to level out if there are more than 4 vectors in the T matrix. Figure 10.6 shows the pattern of the residual sum of squares for alternative γ values and numbers of columns in the T matrix. In general the PC model had a larger residual sum of squares for every number of latent vectors, except for the γ = 10 and γ = 15 models with more than 4 vectors. For example, e'e was 42.12 and 43.20 for the γ = 10 model with 5 and 6 vectors respectively. For the γ = 15 model the values were 89.76 and 200.2, while the corresponding PC model values were 38.05 and 25.78 respectively. Clearly γ was too large.
[Figure: Explained sum of squares as a function of # of PC terms; vertical axis .60 to 1.00; horizontal axis PC_SIZE 2 to 12.]
Figure 10.5 PCR Model Explained Sum of Squares vs # of Latent Vectors
[Figure: Res Sum sq vs # of components for PC_RSS, RSS_P1, RSS_P2, RSS_P4, PLS_RSS, RSS_1P5, RSS_10P and RSS_15P; vertical axis 0 to 1200; horizontal axis Obs 1 to 6.]
Figure 10.6 Residual Sum of Squares for various CR Models
Figure 10.7 shows the same information as figure 10.6, except for more cases and in a 3-D context. A grid γ = (.05, .06, ..., .99, 1.0) was used for up to 4 CRM vectors (4 × 96 = 384 models) where e'e was calculated and displayed on the vertical axis. The resulting 3-D graph illustrates the gain of CRM models over PLS models.
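A reader who wants to reproduce this surface outside of B34S can script the same sweep directly from the one-SVD algorithm of Table 10.4. The Matlab fragment below is an illustrative sketch, not the crmtest routine itself; it assumes a centered data matrix x0 and centered vector y0 are in the workspace and uses the same grid of 96 γ values and up to 4 vectors.

[U,S,V]=svd(x0,0);                        % one SVD serves all models
L=diag(S).^2;  r=sum(L>L(1)/1e14);        % eigenvalues and numerical rank
U=U(:,1:r);    L=L(1:r);
gammas=.05:.01:1.;  A=4;                  % gamma grid and maximum # of vectors
rss=zeros(A,numel(gammas));
for g=1:numel(gammas)
  rho=U'*y0;  T=zeros(r,A);
  for a=1:A
    t=(L.^gammas(g)).*rho;                % powered-eigenvalue direction
    t=t-T(:,1:max(1,a-1))*(T(:,1:max(1,a-1))'*t);   % orthogonalize
    t=t/norm(t);  T(:,a)=t;
    rho=rho-t*(t'*rho);                   % deflate
    fit=U*(T(:,1:a)*(T(:,1:a)'*(U'*y0))); % fitted y0 from the first a vectors
    rss(a,g)=sum((y0-fit).^2);
  end
end
mesh(gammas,1:A,rss)                      % RSS vs gamma and # of vectors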
[Figure: RSS vs # vectors and gamma; Z axis 100 to 700; # Vectors 1 to 4; Gamma .10 to 1.00.]
Figure 10.7 Residual Sum of Squares Surface
The next dataset, used by Matlab to illustrate the PLS model, has more columns in the X matrix than observations. The Matlab spectra dataset consists of 60 samples of gasoline for which 401 near infrared (NIR) spectral intensities were measured. The left hand side variable is the octane rating of the gasoline. Table 10.7 lists the setup to study this dataset. Figure 10.8 plots the X matrix. It can be seen that the spectral intensities are quite different depending on frequency but remarkably stable across the 60 samples. Figure 10.9 plots e'e for the values γ = (.1, .2, ..., 2.4, 2.5) with from one to five vectors, for a total of 125 models. A value of γ = 1 and 6 vectors was selected as a reasonable model. Figure 10.10 shows a plot of the T matrix of latent values, while figure 10.11 shows how the data in the original X matrix map to the six latent vectors.
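For readers working directly in Matlab, the analysis of this dataset can be approximated with the Statistics Toolbox, which ships the same spectra data. The sketch below is one possible cross-check, not the B34S run of Table 10.7; plsregress uses the SIMPLS algorithm, so its six-component fit should be comparable to, though not identical in intermediate quantities with, the pls1_reg results reported later.

load spectra                        % Statistics Toolbox file: NIR (60 x 401), octane (60 x 1)
X=NIR;  y=octane;  ncomp=6;         % six latent vectors, as selected in the text
[XL,YL,XS,YS,beta,pctvar]=plsregress(X,y,ncomp);
yhat=[ones(size(X,1),1) X]*beta;    % fitted values; beta includes the intercept
rss=sum((y-yhat).^2)                % residual sum of squares for comparison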
Table 10.7 Setup for Analysis of Octane Data
/;
/; Data obtained from Matlab
/;
b34sexec matrix;
call echooff;
call load(pls1_reg);
call load(pc_reg);
testmod=0;
/; Data on near infrared (NIR) spectral Intensities at 401 wavelengths
/; 60 samples. Octane rating is in col 1. Matrix is 60,402
/;
call getmatlab(dd :file 'c:\b34slm\mfiles\spectra.dat');
y=dd(,1);
x=submatrix(dd,1,norows(dd),2,nocols(dd));
call graph(x :plottype meshc :d3axis :d3border :grid
 :rotation 180. :pgborder :file 'Spectra_data.wmf'
 :xlabel 'Sample #' :ylabelleft 'NIF Frequency'
 :heading 'Spectra NIR vs Octane' );
/; Test model
if(testmod.ne.0)then;
iprint=2;
noshow=0;
ncomp=5;
gammag=grid(.1, 2.5,.1);
call print(gammag);
rote=180.;
call crmtest(y,x, ncomp,gammag,rsstest,rote,iprint,noshow);
call print(rsstest);
call stop;
endif;
/; Model test using PC Model
iprint=2;
call pc_reg(y,x,ols_coef,ols_rss,tss,pc_coef,
 pcrss,pc_size,u,iprint);
/; PLS Model
ncomp=6;
gamma=1.0;
call pls1_reg(y,x,y0,x0,r,c,u,v,s,pls2coef,yhat,
 pls_res,pls_rss,
 ncomp,gamma,iprint);
T=x*r;
call print(r);
call print(t);
call graph(T :plottype meshc :d3axis :d3border :grid
 :pgborder :xlabel 'Sample'
 :ylabelleft 'PLS Latent Vector'
 :file 'pls_latent_v.wmf'
 :heading 'PLS Latent Vector matrix T');
/; 2D graphs of latent vectors
/; t1=t(,1);
/; t2=t(,2);
/; t9=t(,9);
/; call graph(t1,t2,t9);
scale=array(4: 1.,1.,dfloat(nocols(x)),dfloat(ncomp));
call graph(r :plottype meshc :d3axis :d3border :grid
 :pgborder :xlabel 'X Data Mapping'
 :pgunits scale
/; :rotation 90.
 :ylabelleft 'PLS Latent Vector'
 :file 'pls_loading.wmf'
 :heading 'PLS Loading Matrix R');
/; r1=r(,1);
/; r2=r(,2);
/; r9=r(,9);
/; call graph(r1 r2 r9);
call olsq(y t :print );
b34srun;
[Figure: Spectra NIR vs Octane; Z axis 0 to 1.2; NIF Frequency 100 to 400; Sample # 10 to 60.]

Figure 10.8 Octane vs NIR Data
[Figure: RSS vs # vectors and gamma; Z axis 20 to 120; Gamma .5 to 2.5; # Vectors 1 to 5.]

Figure 10.9 Sensitivity of Octane Model to # of Vectors and γ setting
[Figure: PLS Latent Vector matrix T; Z axis -4 to 4; PLS Latent Vector 1 to 6; Sample 10 to 60.]

Figure 10.10 Octane Model Matrix T
[Figure: PLS Loading Matrix R; Z axis -1 to 2.5; X Data Mapping 100 to 400; PLS Latent Vector 1 to 6.]

Figure 10.11 Mapping of X data to the PLS Vectors
Edited and annotated output from running the commands in Table 10.7 is listed next. First a PC model calculates 60 PC vectors and 401 implied OLS coefficients.
B34SI Matrix Command. d/m/y 26/ 5/11. h:m:s 13: 1:46.

=> CALL ECHOOFF$

Data File built by Matlab 21-May-2011 11:38:12

Principle Component Regression Model
Total Sum of Squares           138.127125000000
Number of observations in X    60
Number of columns in X         401
Number of PC elements          60
PC and OLS Coefficients
Obs   PC_COEF        OLS_COEF
  1   -675.2         -23.01
  2   -2.159         -10.65
  3   -6.105         -26.52
  4   -0.4348        -21.86
  5   -10.88         -21.90
  6    0.9242E-01     5.019
  7   -2.296          6.909
  8    0.6961        -7.443
  9    0.4730        -14.68
 10   -1.858         -38.76
 11    1.198          6.106
 12   -1.570          14.45
 13   -0.3460         3.334
 14   -0.5206         7.614
 15    0.1264         9.576
 16   -0.2591        -1.239
 17    0.3299         1.752
 18    0.1848        -1.778
 19    0.8297        -2.746
 20    0.8646E-01     5.230
 21    0.8050E-02     9.994
 22    0.1637         6.767
 23    0.3583        -10.93
 24   -0.9069E-01     4.438
 25   -0.3398         0.1093
 26   -0.1047E-01     5.693
 27   -0.5152        -10.47
 28    0.2129E-01    -3.665
 29   -0.1544         1.370
 30    0.3143         0.9660
 31   -0.4799E-01    -11.18
 32    0.2204E-01     9.342
 33    0.2801         12.99
 34   -0.1853         6.821
 35    0.1522         4.310
 36   -0.3894E-01    -1.419
 37   -0.1909        -0.8463
 38    0.3460        -13.15
 39    0.1246         12.81
 40    0.2984         7.896
 41   -0.2201E-01    -4.004
 42    0.2400         1.009
 43    0.5249         9.465
 44   -0.1540        -2.972
 45   -0.1975E-01    -1.656
 46   -0.1968         5.203
 47   -0.4844E-01     3.471
 48   -0.1857E-01    -7.817
 49    0.5029E-01    -13.18
 50    0.2989E-01     13.92
 51   -0.2077        -2.524
 52    0.1306         2.368
 53    0.2120        -7.109
 54   -0.8663E-01    -11.43
 55   -0.2303         8.175
 56   -0.2034         2.233
 57    0.8019E-01     9.659
 58   -0.1181         5.830
 59   -0.9125E-01    -11.84
 60    0.2023        -10.18
  .       .              .
398    NA            -1.375
399    NA            -1.992
400    NA             0.2682
401    NA             8.139
The accuracy loss as fewer and fewer PC vectors are used is listed next. If, for example, there were only 6 PC vectors, e'e = 16.39.
Shrinkage Accuracy loss
Obs   PC_SIZE   PC_RSS        RSQ
  1     60      0.2777E-24    1.000
  2     59      0.4094E-01    0.9997
  3     58      0.4927E-01    0.9996
  4     57      0.6322E-01    0.9995
  5     56      0.6965E-01    0.9995
  6     55      0.1110        0.9992
  7     54      0.1640        0.9988
  8     53      0.1716        0.9988
  9     52      0.2165        0.9984
 10     51      0.2336        0.9983
 11     50      0.2767        0.9980
 12     49      0.2776        0.9980
 13     48      0.2801        0.9980
 14     47      0.2805        0.9980
 15     46      0.2828        0.9980
  .      .        .             .
 48     13      2.941         0.9787
 49     12      3.060         0.9778
 50     11      5.526         0.9600
 51     10      6.961         0.9496
 52      9      10.41         0.9246
 53      8      10.64         0.9230
 54      7      11.12         0.9195
 55      6      16.39         0.8813
 56      5      16.40         0.8813
 57      4      134.7         0.2457E-01
 58      3      134.9         0.2320E-01
 59      2      172.2        -0.2466
 60      1      176.9        -0.2803
Note that for the 6-vector PLS model e'e = 1.769, far lower than the 6-vector PC model.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             60
Number Columns in original data             401
Number Columns in PLS Coefficient Vector    6
Gamma                                       1.00000000000000
Mean of left hand variable                  87.1775000000000
PLS sum of squared errors                   1.76946660567481
Total sum of squares                        138.127125000000
PLS R^2                                     0.987189579123761
The 401 by 6 R matrix, of which only the first 12 rows are shown, maps from X to T using equation (10.4-7). The (i,j) element measures the mapping of the ith column of the X matrix to the jth latent vector. The latent vector matrix T is shown in figure 10.10 while the R matrix is shown in figure 10.11. The full T matrix is shown below.
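This mapping is easy to verify numerically before wading through the listings. A two-line Matlab check (a sketch assuming the matrices X, R and T of this discussion are in the workspace) is:

Tcheck=X*R;                 % map the raw data through the weight matrix R
max(max(abs(Tcheck-T)))     % should be near machine zero, confirming T = X*R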
R     = Matrix of  401 by  6 elements (only the first 12 rows are shown)

        1              2              3              4              5              6
  1  -0.312977E-02  -0.710386E-04   0.243791E-01   0.864504E-01   0.840030E-01  -0.578909E-01
  2  -0.198509E-02   0.280989E-03   0.300102E-01   0.846470E-01   0.744934E-01  -0.131944
  3  -0.187377E-02   0.384907E-03   0.336652E-01   0.841480E-01   0.755793E-01  -0.122159
  4  -0.128795E-02   0.714660E-03   0.432287E-01   0.847414E-01   0.762021E-01  -0.147239
  5  -0.121976E-02   0.793598E-03   0.461349E-01   0.861428E-01   0.737757E-01  -0.111015
  6  -0.122001E-02   0.890373E-03   0.504599E-01   0.867424E-01   0.822037E-01  -0.285924E-01
  7   0.971001E-04   0.130553E-02   0.573708E-01   0.992914E-01   0.841788E-01   0.203989E-01
  8  -0.160181E-02   0.765827E-03   0.482682E-01   0.894234E-01   0.944800E-01   0.102505
  9  -0.335944E-02   0.659687E-04   0.324927E-01   0.908605E-01   0.879563E-01   0.520659E-01
 10  -0.631175E-02  -0.104119E-02   0.909237E-02   0.789190E-01   0.883645E-01   0.778642E-01
 11  -0.875870E-02  -0.191428E-02  -0.833871E-02   0.679014E-01   0.732018E-01  -0.492166E-01
 12  -0.101636E-01  -0.242430E-02  -0.187356E-01   0.692084E-01   0.753006E-01  -0.332195E-01

T     = Matrix of   60 by  6 elements

        1              2              3           4           5           6
  1   0.988901       0.297868E-01  -1.37788    -5.25711    4.57921     0.224458
  2   0.693743      -0.220051E-01  -1.20226    -5.24118    4.53260     0.346293
  3   0.793356       0.269856E-02  -0.927989   -5.20228    4.48372     0.185458
  4   0.767886      -0.123758E-01  -1.38612    -5.26845    4.53431     0.119033
  5   1.14628        0.675831E-01  -1.16447    -5.20102    4.62051    -0.309322
  6   0.941463       0.230492E-01  -1.32746    -5.26045    4.62389     0.483005E-01
  7   1.11180        0.624769E-01  -1.10155    -5.15796    4.55432     0.113346E-01
  8   0.928202       0.299545E-01  -0.998211   -5.22046    4.44818     0.827684E-01
  9   1.06688        0.547265E-01  -1.07367    -5.14851    4.51542    -0.463346E-01
 10   1.11623        0.634153E-01  -1.13888    -5.14688    4.54266     0.912766E-01
 11   1.22368        0.816043E-01  -1.24993    -5.15973    4.60185     0.259372
 12   1.11970        0.591827E-01  -1.21057    -5.19612    4.62723     0.335436
 13   0.927796       0.232402E-01  -1.10421    -5.27282    4.49948     0.333305
 14   0.825877       0.100105E-01  -0.977263   -5.13307    4.40562     0.130752
 15   0.695175      -0.123719E-01  -0.808208   -5.17704    4.40392     0.270361
 16   1.01141        0.345590E-01  -1.34252    -5.35978    4.59336     0.135832
 17   1.08833        0.558072E-01  -1.16320    -5.15035    4.49485     0.180443
 18   1.07239        0.556562E-01  -1.10078    -5.18537    4.40185     0.100493
 19   0.907406       0.199947E-01  -1.26097    -5.31193    4.32437     0.954622E-01
 20   0.946685       0.324142E-01  -1.01369    -5.26007    4.34062     0.136154
 21   1.04033        0.445206E-01  -1.23375    -5.28707    4.32031     0.107698
 22   0.990318       0.355594E-01  -1.16217    -5.17825    4.60146    -0.560219E-01
 23   0.984146       0.347272E-01  -1.17629    -5.26010    4.31059     0.103885
 24   1.01701        0.421427E-01  -1.14606    -5.27858    4.31616     0.749457E-01
 25   1.00449        0.393541E-01  -1.19053    -5.30752    4.32373     0.548220E-01
 26   1.08251        0.571684E-01  -1.08735    -5.25844    4.43951     0.743532E-01
 27   1.00992        0.406514E-01  -1.25093    -5.26287    4.37527     0.147988
 28   1.03627        0.433007E-01  -1.30249    -5.33735    4.37207     0.647937E-01
 29   1.00691        0.397300E-01  -1.24469    -5.32440    4.32866    -0.150272E-01
 30   1.08173        0.516227E-01  -1.27910    -5.35182    4.37673     0.531925E-02
 31   1.04786        0.458973E-01  -1.26462    -5.34313    4.35792    -0.114005E-01
 32   0.888500       0.129255E-01  -1.36045    -5.37922    4.32504     0.108952
 33   0.913688       0.189848E-01  -1.36032    -5.37230    4.24544     0.115398
 34   0.882807       0.114013E-01  -1.36708    -5.35219    4.33537     0.103431
 35   0.903452       0.136410E-01  -1.37623    -5.39813    4.35643     0.105236
 36   1.08614        0.554221E-01  -1.15058    -5.23086    4.38856     0.308389
 37   1.01224        0.347526E-01  -1.36990    -5.37636    4.47357     0.840115E-01
 38   1.10485        0.545122E-01  -1.13586    -5.35801    4.52024     0.281195
 39   0.970562       0.310651E-01  -1.02342    -5.37516    4.42959     0.300350
 40   1.07900        0.516013E-01  -1.09833    -5.38084    4.46691     0.329084
 41   1.29637        0.928419E-01  -1.24613    -5.20289    4.56651     0.202289
 42   1.17331        0.727464E-01  -1.16533    -5.17351    4.50500    -0.139271E-02
 43   1.09387        0.556964E-01  -1.14417    -5.28656    4.63859     0.177489
 44   0.939720       0.195051E-01  -1.31289    -5.40741    4.48914     0.180210
 45   1.18000        0.745302E-01  -1.16350    -5.26702    4.42641     0.863997E-01
 46   1.17921        0.710570E-01  -1.12185    -5.45193    4.24065     0.265480
 47   1.17253        0.679095E-01  -1.13039    -5.53415    4.34307     0.293311
 48   1.08209        0.566267E-01  -1.00997    -5.43525    4.19551    -0.626014E-01
 49   1.19069        0.723700E-01  -1.13766    -5.50693    4.31464     0.242892
 50   1.22376        0.807788E-01  -1.13632    -5.47720    4.22620     0.238114
 51   1.08351        0.580789E-01  -1.19487    -5.00232    4.30132     0.111759
 52   1.16980        0.710416E-01  -1.31213    -5.13979    4.44597     0.334206
 53   1.14637        0.721835E-01  -1.21964    -4.98038    4.35272     0.361026
 54   0.883618       0.173970E-01  -1.37663    -4.99856    4.17836     0.240871
 55   0.990513       0.360076E-01  -1.43152    -5.00132    4.37452     0.127398
 56   0.968454       0.274825E-01  -1.44889    -5.25708    4.52559     0.383110
 57   1.00950        0.517277E-01  -1.19443    -4.98649    4.07003     0.645175E-01
 58   1.09814        0.604937E-01  -1.31325    -5.09163    4.28462     0.205630
 59   1.22335        0.882957E-01  -1.15502    -5.05872    4.24757     0.312002
 60   1.07687        0.557995E-01  -1.27155    -5.09131    4.19806     0.103613
As a test, the latent vector matrix T is used on the right hand side of an OLS model predicting y.
As expected the residual sum of squares was 1.76940.
Ordinary Least Squares Estimation
Dependent variable                 Y
Centered R**2                      0.987190014829113
Adjusted R**2                      0.985739827828635
Residual Sum of Squares            1.76940642294723
Residual Variance                  3.338502684806099E-002
Standard Error                     0.182715699511730
Total Sum of Squares               138.127125000000
Log Likelihood                     20.5747007921381
Mean of the Dependent Variable     87.1775000000000
Std. Error of Dependent Variable   1.53007768164378
Sum Absolute Residuals             8.45940033463316
F( 6, 53)                          680.732908586196
F Significance                     1.00000000000000
1/Condition XPX                    1.604776721098139E-008
Maximum Absolute Residual          0.447398270645863
Number of Observations             60

Variable   Lag   Coefficient   SE            t
Col____1    0    6.6986671     4.2958494     1.5593347
Col____2    0    0.52497431    22.074553     0.23781877E-01
Col____3    0    9.3217709     0.43351639    21.502696
Col____4    0    2.0139747     0.45293403    4.4465077
Col____5    0    0.47607837    0.29499321    1.6138621
Col____6    0    0.97167176    0.19412013    5.0055177
CONSTANT    0    99.783430     1.3070037     76.345179
B34S Matrix Command Ending. Last Command reached.
Space available in allocator   24856574, peak space used   583756
Number variables used          74, peak number used        74
Number temp variables used     5192, # user temp clean     0
Table 10.8 Tests to illustrate PLS Model intermediate calculations.
/;
/; Illustrates PLS Calculations
/;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(pls1_reg);
call echooff;
/; Base Case calculated with OLS and PLS. e'e will be the same
nn=6;
call olsq(gasout gasin{1 to nn} gasout{1 to nn} :print :savex);
ols_coef=%coef;
ols_rss =%rss;
yhat_ols=%yhat;
xhold=%x;
/; (xhold keeps a copy of %x for the validation regressions below)
iprint=1;
ncompmax=nocols(%x);
gamma= 1.;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,pls_rss,ncompmax,gamma,iprint);
call echoon;
call print('are ols_rss = pls_rss?',ols_rss,pls_rss);
/;
/; testing equation 10.4-12. Note we use the x with means removed.
/;
yhat_pls=x0*r*pls_beta+mean(%y);
/;
/; PLS_coef = r*pls_beta. pls_coef gets us to yhat another way
/;
yhat2=%x * pls_coef;
call tabulate(yhat_ols,yhat,yhat_pls,yhat2);
/; Reduced Model - How does e'e increase if only 4 vectors in T?
ncompmax=4;
gamma= 1.;
call echooff;
call pls1_reg(%y,%x,y0,x0,r,pls_beta,u,v,s,pls_coef,yhat,
 pls_res,pls_rss,ncompmax,gamma,iprint);
call echoon;
yhat_pls=x0*r*pls_beta+mean(%y);
yhat2=%x * pls_coef;
call tabulate(yhat_ols,yhat,yhat_pls,yhat2
 :title 'Last three columns must be the same');
T=%x*r;
T_tilda=transpose(u)*T;
call print(' ':);
call print('Looking at loading':);
call print('Error sum of squares decreases as columns of T enter':);
call print('        ':);
call stepwise(%y T :nstep 3 :print :printsteps);
call graph(r :plottype meshc :d3axis :grid :rotation 0.
 :pgborder :heading 'Mapping of X to PLS vectors in T'
 :file 'mapping.wmf');
/;
call print(' ':);
call print('Validate the r**2 using equation (10.4-10) & (10.4-11)':);
/;
c=y0*U*T_tilda;
r_sq_y=sumsq(c)/sumsq(y0);
call print(r_sq_y);
call print('        ':);
call print('Check with an OLS calculation of y = f(T) ':);
call olsq(%y t :print);
/; Validate the R matrix
call print('r matrix',r);
call olsq(t(,1) xhold :noint :print);
call olsq(t(,2) xhold :noint :print);
call olsq(t(,3) xhold :noint :print);
call olsq(t(,4) xhold :noint :print);
b34srun;
Edited results of running the code in Table 10.8 follow.
B34SI Matrix Command. d/m/y 27/ 5/11. h:m:s 9:10:20.

=> CALL LOADDATA$
=> CALL LOAD(PLS1_REG)$
=> CALL ECHOOFF$
This is the base case: when the number of PLS vectors equals the number of right hand side variables in an OLS model, e'e = 16.13858 in both cases.
Ordinary Least Squares Estimation
Dependent variable                 GASOUT
Centered R**2                      0.994664107436370
Adjusted R**2                      0.994432949635779
Residual Sum of Squares            16.1385829591582
Residual Variance                  5.826203234353124E-002
Standard Error                     0.241375293564878
Total Sum of Squares               3024.53296551724
Log Likelihood                     7.36469419004290
Mean of the Dependent Variable     53.5096551724138
Std. Error of Dependent Variable   3.23504435694615
Sum Absolute Residuals             48.1338529453902
F(12, 277)                         4302.96578742115
F Significance                     1.00000000000000
1/Condition XPX                    1.929993696611666E-008
Maximum Absolute Residual          1.43081466326252
Number of Observations             290

Variable   Lag   Coefficient        SE                t
GASIN       1    0.63160860E-01     0.75989856E-01    0.83117489
GASIN       2   -0.13345763         0.16490508       -0.80929968
GASIN       3   -0.44123536         0.18869442       -2.3383593
GASIN       4    0.15200749         0.19021604        0.79913078
GASIN       5   -0.12036440         0.17941884       -0.67085705
GASIN       6    0.24930584         0.10973982        2.2717902
GASOUT      1    1.5452265          0.59808504E-01    25.836234
GASOUT      2   -0.59293307         0.11024897       -5.3781279
GASOUT      3   -0.17105674         0.11518138       -1.4851076
GASOUT      4    0.13238479         0.11465530        1.1546329
GASOUT      5    0.56869923E-01     0.10083191        0.56400722
GASOUT      6   -0.42085617E-01     0.42891891E-01   -0.98120217
CONSTANT    0    3.8241094          0.85547296        4.4701698
Note that the implied PLS_COEF vector is in fact the OLS coefficient vector. Equation (10.4-15) shows two ways to obtain ŷ, from β̂_PLS or from β̂_OLS.
Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    13
Gamma                                       1.00000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   16.1385829591581
Total sum of squares                        3024.53296551724
PLS R^2                                     0.994664107436370

(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA       PLS_COEF
  1    47.70          0.6316E-01
  2    25.33         -0.1335
  3    6.602         -0.4412
  4    5.115          0.1520
  5    3.910         -0.1204
  6    2.053          0.2493
  7    1.325          1.545
  8    0.8342        -0.5929
  9    0.2304        -0.1711
 10    0.1323         0.1324
 11    0.5716E-01     0.5687E-01
 12    0.4442E-02    -0.4209E-01
 13    0.1914E-01     3.824
This section illustrates the increase in e'e as fewer and fewer PLS vectors are used.
=> CALL PRINT('are ols_rss = pls_rss?',OLS_RSS,PLS_RSS)$

are ols_rss = pls_rss?

OLS_RSS = 16.138583

PLS_RSS = Vector of 13 elements

749.612   107.920   64.3347   38.1673   22.8806   18.6653   16.9087
16.2128   16.1598   16.1422   16.1390   16.1389   16.1386
Here ŷ from the olsq command (yhat_ols) is the same as ŷ from the pls1_reg
command (yhat). The vectors yhat_pls and yhat2 are from equation (10.4-15) and are the
same.
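The identity can be checked in a couple of lines; the following Matlab-style sketch assumes the quantities x0, r, pls_beta, the raw data matrix x (with its constant column), pls_coef and y are available, mirroring the tabulate output that follows.

yhat_pls=x0*r*pls_beta+mean(y);   % fitted values via the latent-variable route
yhat2=x*pls_coef;                 % fitted values via the implied OLS-space route
max(abs(yhat_pls-yhat2))          % should be near machine zero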
=> YHAT_PLS=X0*R*PLS_BETA+MEAN(%Y)$
=> YHAT2=%X * PLS_COEF$
=> CALL TABULATE(YHAT_OLS,YHAT,YHAT_PLS,YHAT2)$

Obs   YHAT_OLS   YHAT    YHAT_PLS   YHAT2
  1    52.76     52.76    52.76     52.76
  2    52.34     52.34    52.34     52.34
  3    52.15     52.15    52.15     52.15
  4    52.08     52.08    52.08     52.08
  5    51.94     51.94    51.94     51.94
  6    52.15     52.15    52.15     52.15
  7    52.92     52.92    52.92     52.92
  8    53.76     53.76    53.76     53.76
  9    55.04     55.04    55.04     55.04
 10    55.89     55.89    55.89     55.89
  .      .         .        .         .
280    53.00     53.00    53.00     53.00
281    53.69     53.69    53.69     53.69
282    55.40     55.40    55.40     55.40
283    57.05     57.05    57.05     57.05
284    57.26     57.26    57.26     57.26
285    57.96     57.96    57.96     57.96
286    58.33     58.33    58.33     58.33
287    57.78     57.78    57.78     57.78
288    57.61     57.61    57.61     57.61
289    57.14     57.14    57.14     57.14
290    56.74     56.74    56.74     56.74
The reduced PLS model now has only 4 vectors. Note that, as expected, e'e = 38.167. There are only 4 PLS coefficients (PLS_BETA), but they map to 13 coefficients that show how the 4 PLS vectors map to the original X matrix.
=> NCOMPMAX=4$
=> GAMMA= 1.$
=> CALL ECHOOFF$

Partial Least Squares PLS1 - 9 May 2011 Version.
Logic from de Jong, Wise, Ricker (2001) Matlab Code
Number of rows in original data             290
Number Columns in original data             13
Number Columns in PLS Coefficient Vector    4
Gamma                                       1.00000000000000
Mean of left hand variable                  53.5096551724138
PLS sum of squared errors                   38.1673392441907
Total sum of squares                        3024.53296551724
PLS R^2                                     0.987380749464682
(T*pls_beta)+mean(y) = x*pls_coef
Obs   PLS_BETA   PLS_COEF
  1    47.70      0.3014E-02
  2    25.33     -0.1409
  3    6.602     -0.2707
  4    5.115     -0.3042
  5    NA        -0.2166
  6    NA        -0.4860E-01
  7    NA         0.7309
  8    NA         0.2073
  9    NA        -0.1318
 10    NA        -0.2034
 11    NA        -0.6247E-01
 12    NA         0.1701
 13    NA         15.44
=> YHAT_PLS=X0*R*PLS_BETA+MEAN(%Y)$
=> YHAT2=%X * PLS_COEF$
=> CALL TABULATE(YHAT_OLS,YHAT,YHAT_PLS,YHAT2
     :TITLE 'Last three columns must be the same')$
The last three columns have to be the same. The first column is what would have been the case had all 13 PLS vectors been used. Note that the first 4 PLS vector coefficients are the same as when there were 13 PLS coefficients, but that the implied OLS coefficients have, of course, changed. This suggests that users can experiment with subsets of a PLS model by using fewer T vectors.
Last three columns must be the same
Obs   YHAT_OLS   YHAT    YHAT_PLS   YHAT2
  1    52.76     53.00    53.00     53.00
  2    52.34     52.52    52.52     52.52
  3    52.15     52.18    52.18     52.18
  4    52.08     52.07    52.07     52.07
  5    51.94     52.04    52.04     52.04
  6    52.15     52.19    52.19     52.19
  7    52.92     52.73    52.73     52.73
  8    53.76     53.57    53.57     53.57
  9    55.04     54.73    54.73     54.73
 10    55.89     55.76    55.76     55.76
  .      .         .        .         .
280    53.00     53.05    53.05     53.05
281    53.69     53.67    53.67     53.67
282    55.40     54.91    54.91     54.91
283    57.05     56.35    56.35     56.35
284    57.26     57.04    57.04     57.04
285    57.96     57.27    57.27     57.27
286    58.33     57.19    57.19     57.19
287    57.78     56.81    56.81     56.81
288    57.61     56.46    56.46     56.46
289    57.14     56.02    56.02     56.02
290    56.74     55.76    55.76     55.76
Here we obtain T and T_tilda using (10.4-6) and (10.4-7) for future use. Using the stepwise command and restricting the model to 3 variables, the first 3 T vectors enter with e'e values of 749.6, 107.9 and 64.33 respectively. As a final check, all 4 PLS vectors are used to estimate the model using the olsq command. As anticipated e'e = 38.17. Of more interest is that the PLS vectors now can be ranked in terms of their t tests, which were respectively 130.33, 69.22, 18.04 and 13.98, with the t for the constant 29.057.
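The reason the ranking is so clean is that the latent score vectors are orthogonal, so each vector's contribution to the fit does not depend on which other vectors are in the model. A quick Matlab check of this property (a sketch assuming T is in the workspace) is:

G=T'*T;                           % Gram matrix of the latent score vectors
max(max(abs(G-diag(diag(G)))))    % off-diagonal mass should be near zero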
=> T=%X*R$
=> T_TILDA=TRANSPOSE(U)*T$
=> CALL PRINT(' ':)$
=> CALL PRINT('Looking at loading':)$

Looking at loading

=> CALL PRINT('Error sum of squares decreases as columns of T enter':)$

Error sum of squares decreases as columns of T enter

=> CALL PRINT('        ':)$
=> CALL STEPWISE(%Y T :NSTEP 3 :PRINT :PRINTSTEPS)$

Stepwise Option called.
Y Variable                %Y
Y Variable Mean           53.5096551724138
Y Variable Variance       10.4655119914091
Y Variable Number         4
Number of observations    290
PIN set as                5.000000000000000E-002
POUT set as               0.100000000000000
TOL set as                2.220446049250313E-014

Constant estimated in all models - Listed for Final Model

X Var. #   Name       Lag   Mean           Var
   1       Col____1    0    0.95836819     0.34602076E-02
   2       Col____2    0   -0.44771392     0.34602076E-02
   3       Col____3    0   -0.19376726     0.34602076E-02
FORWARD STEPWISE SELECTION

STEP 0: No variables entered.

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    1       47.70      1.613      29.564        0.0000     1
    2       25.33      2.876       8.807        0.0000     1
    3        6.60      3.217       2.052        0.0411     1
STEP 1 : Variable 1 entered.

Dependent    R-squared    Adjusted     Est. Std. Dev.
Variable     (percent)    R-squared    of Model Error
     4        75.216       75.130        1.613

        * * * Analysis of Variance * * *
                    Sum of     Mean
Source        DF    Squares    Square    Overall F   Prob. of Larger F
Regression     1    2274.9     2274.9    874.022     0.0000
Error        288     749.6        2.6
Total        289    3024.5

        * * * Inference on Coefficients * * *
        (Conditional on the Selected Model)
            Coef.      Standard                 Prob. of   Variance
Variable    Estimate   Error      t-statistic   Larger t   Inflation
    1       47.70      1.613      29.564        0.0000     1

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    2       25.33      0.613      41.310        0.0000     1
    3        6.60      1.568       4.209        0.0000     1
STEP 2 : Variable 2 entered.

Dependent    R-squared    Adjusted     Est. Std. Dev.
Variable     (percent)    R-squared    of Model Error
     4        96.432       96.407        0.6132

        * * * Analysis of Variance * * *
                    Sum of     Mean
Source        DF    Squares    Square    Overall F   Prob. of Larger F
Regression     2    2916.6     1458.3    3878.204    0.0000
Error        287     107.9        0.4
Total        289    3024.5

        * * * Inference on Coefficients * * *
        (Conditional on the Selected Model)
            Coef.      Standard                 Prob. of   Variance
Variable    Estimate   Error      t-statistic   Larger t   Inflation
    1       47.70      0.6132     77.781        0.0000     1
    2       25.33      0.6132     41.310        0.0000     1

        * * * Statistics for Variables Not in the Model * * *
            Coef.      Standard   t-statistic   Prob. of   Variance
Variable    Estimate   Error      to enter      Larger t   Inflation
    3        6.60      0.4743     13.920        0.0000     1
STEP 3 : Variable 3 entered.

Dependent    R-squared    Adjusted     Est. Std. Dev.
Variable     (percent)    R-squared    of Model Error
     4        97.873       97.851        0.4743

        * * * Analysis of Variance * * *
                    Sum of     Mean
Source        DF    Squares    Square    Overall F   Prob. of Larger F
Regression     3    2960.2      986.7    4386.523    0.0000
Error        286      64.3        0.2
Total        289    3024.5

        * * * Inference on Coefficients * * *
        (Conditional on the Selected Model)
            Coef.      Standard                 Prob. of   Variance
Variable    Estimate   Error      t-statistic   Larger t   Inflation
    1       47.70      0.4743     100.564       0.0000     1
    2       25.33      0.4743      53.410       0.0000     1
    3        6.60      0.4743      13.920       0.0000     1

Intercept     20.4197570610335
Std. error    0.510801287876884
t Stat.       39.9759310433750
=> CALL GRAPH(R :PLOTTYPE MESHC :D3AXIS :GRID :ROTATION 0.
     :PGBORDER :HEADING 'Mapping of X to PLS vectors in T'
     :FILE 'mapping.wmf')$
=> CALL PRINT(' ':)$
=> CALL PRINT('Validate the r**2 using equation (10.4-10) & (10.4-11)':)$

Validate the r**2 using equation (10.4-10) & (10.4-11)

=> C=Y0*U*T_TILDA$
=> R_SQ_Y=SUMSQ(C)/SUMSQ(Y0)$
=> CALL PRINT(R_SQ_Y)$

R_SQ_Y = 0.98738075

=> CALL PRINT('        ':)$
=> CALL PRINT('Check with an OLS calculation of y = f(T) ':)$

Check with an OLS calculation of y = f(T)

=> CALL OLSQ(%Y T :PRINT)$
Ordinary Least Squares Estimation
Dependent variable                 %Y
Centered R**2                      0.987380749464682
Adjusted R**2                      0.987203637176467
Residual Sum of Squares            38.1673392441908
Residual Variance                  0.133920488576108
Standard Error                     0.365951483910242
Total Sum of Squares               3024.53296551724
Log Likelihood                     -117.446563459150
Mean of the Dependent Variable     53.5096551724138
Std. Error of Dependent Variable   3.23504435694615
Sum Absolute Residuals             70.3905891845412
F( 4, 285)                         5574.88562434546
F Significance                     1.00000000000000
1/Condition XPX                    3.532683390771366E-004
Maximum Absolute Residual          1.74714855404004
Number of Observations             290

Variable   Lag   Coefficient   SE            t
Col____1    0    47.696134     0.36595148    130.33458
Col____2    0    25.331646     0.36595148    69.221322
Col____3    0    6.6018826     0.36595148    18.040322
Col____4    0    5.1154030     0.36595148    13.978364
CONSTANT    0    15.438848     0.53132559    29.057227
=> CALL PRINT('r matrix',R)$

The final sequence of tests validates the calculation of the R matrix by running each PLS vector in the T matrix on all the X data. The ith regression listed below replicates the ith column in R. Note that, as expected, the fit is perfect.
r matrix

R     = Matrix of  13 by  4 elements

         1              2              3              4
  1  -0.816216E-03  -0.400790E-02  -0.834093E-02   0.388115E-01
  2  -0.989053E-03  -0.504122E-02  -0.119730E-01   0.221005E-01
  3  -0.114989E-02  -0.571785E-02  -0.134877E-01   0.351568E-02
  4  -0.126158E-02  -0.568495E-02  -0.107002E-01  -0.575148E-02
  5  -0.129674E-02  -0.477588E-02  -0.376867E-02  -0.173543E-02
  6  -0.124796E-02  -0.309993E-02   0.471397E-02   0.114023E-01
  7   0.395372E-02   0.980122E-02   0.850522E-02   0.465133E-01
  8   0.364950E-02   0.443410E-02  -0.105492E-01  -0.184729E-02
  9   0.322800E-02  -0.907315E-03  -0.178896E-01  -0.282878E-01
 10   0.276954E-02  -0.519005E-02  -0.114058E-01  -0.251736E-01
 11   0.233997E-02  -0.782500E-02   0.466072E-02  -0.129510E-02
 12   0.197721E-02  -0.873809E-02   0.230272E-01   0.283711E-01
 13  -0.270289E-18   0.322290E-17   0.458080E-16   0.234304E-15
=> CALL OLSQ(T(,1) XHOLD :NOINT :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                 ##1209
Centered R**2                      1.00000000000000
Adjusted R**2                      1.00000000000000
Residual Sum of Squares            1.705044853683151E-022
Residual Variance                  6.155396583693686E-025
Standard Error                     7.845633552297537E-013
Total Sum of Squares               1.00000000000000
Log Likelihood                     7678.51625031125
Mean of the Dependent Variable     0.958368190986431
Std. Error of Dependent Variable   5.882352941176474E-002
Sum Absolute Residuals             1.596971443973416E-010
1/Condition XPX                    1.929993696611666E-008
Maximum Absolute Residual          3.731459585765151E-012
Number of Observations             290

Variable   Lag   Coefficient         SE                t
Col____1    0   -0.81621613E-03      0.24699651E-12   -0.33045654E+10
Col____2    0   -0.98905278E-03      0.53600550E-12   -0.18452288E+10
Col____3    0   -0.11498880E-02      0.61333008E-12   -0.18748273E+10
Col____4    0   -0.12615798E-02      0.61827593E-12   -0.20404803E+10
Col____5    0   -0.12967366E-02      0.58318085E-12   -0.22235582E+10
Col____6    0   -0.12479594E-02      0.35669698E-12   -0.34986543E+10
Col____7    0    0.39537165E-02      0.19440084E-12    0.20337960E+11
Col____8    0    0.36494975E-02      0.35835193E-12    0.10184116E+11
Col____9    0    0.32280031E-02      0.37438418E-12    0.86221674E+10
Col___10    0    0.27695401E-02      0.37267422E-12    0.74315313E+10
Col___11    0    0.23399745E-02      0.32774282E-12    0.71396667E+10
Col___12    0    0.19772077E-02      0.13941529E-12    0.14182144E+11
Col___13    0   -0.21420440E-11      0.27806190E-11   -0.77034789

=> CALL OLSQ(T(,2) XHOLD :NOINT :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                 ##1216
Centered R**2                      1.00000000000000
Adjusted R**2                      1.00000000000000
Residual Sum of Squares            1.749298702340186E-023
Residual Variance                  6.315157770181178E-026
Standard Error                     2.512997765653837E-013
Total Sum of Squares               1.00000000000000
Log Likelihood                     8008.67567425387
Mean of the Dependent Variable     -0.447713919937468
Std. Error of Dependent Variable   5.882352941176472E-002
Sum Absolute Residuals             5.085987186959073E-011
1/Condition XPX                    1.929993696611666E-008
Maximum Absolute Residual          1.206201805104001E-012
Number of Observations             290

Variable   Lag   Coefficient         SE                t
Col____1    0   -0.40079028E-02      0.79114285E-13   -0.50659660E+11
Col____2    0   -0.50412166E-02      0.17168539E-12   -0.29363108E+11
Col____3    0   -0.57178505E-02      0.19645286E-12   -0.29105458E+11
Col____4    0   -0.56849453E-02      0.19803704E-12   -0.28706475E+11
Col____5    0   -0.47758761E-02      0.18679590E-12   -0.25567350E+11
Col____6    0   -0.30999253E-02      0.11425192E-12   -0.27132369E+11
Col____7    0    0.98012207E-02      0.62267614E-13    0.15740479E+12
Col____8    0    0.44341045E-02      0.11478201E-12    0.38630657E+11
Col____9    0   -0.90731500E-03      0.11991723E-12   -0.75661773E+10
Col___10    0   -0.51900517E-02      0.11936952E-12   -0.43478870E+11
Col___11    0   -0.78250020E-02      0.10497775E-12   -0.74539623E+11
Col___12    0   -0.87380883E-02      0.44655451E-13   -0.19567797E+12
Col___13    0   -0.49981027E-11      0.89064693E-12   -5.6117666

=> CALL OLSQ(T(,3) XHOLD :NOINT :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                 ##1223
Centered R**2                      1.00000000000000
Adjusted R**2                      1.00000000000000
Residual Sum of Squares            1.387238706486448E-022
Residual Variance                  5.008081972875263E-025
Standard Error                     7.076780322205333E-013
Total Sum of Squares               1.00000000000000
Log Likelihood                     7708.42629761190
Mean of the Dependent Variable     -0.193767263817997
Std. Error of Dependent Variable   5.882352941176477E-002
Sum Absolute Residuals             1.497757265433997E-010
1/Condition XPX                    1.929993696611666E-008
Maximum Absolute Residual          3.005623527840839E-012
Number of Observations             290

Variable   Lag   Coefficient         SE                t
Col____1    0   -0.83409338E-02      0.22279145E-12   -0.37438303E+11
Col____2    0   -0.11972963E-01      0.48347825E-12   -0.24764223E+11
Col____3    0   -0.13487669E-01      0.55322521E-12   -0.24380068E+11
Col____4    0   -0.10700250E-01      0.55768637E-12   -0.19186859E+11
Col____5    0   -0.37686661E-02      0.52603053E-12   -0.71643486E+10
Col____6    0    0.47139743E-02      0.32174153E-12    0.14651432E+11
Col____7    0    0.85052182E-02      0.17535003E-12    0.48504231E+11
Col____8    0   -0.10549245E-01      0.32323430E-12   -0.32636526E+11
Col____9    0   -0.17889554E-01      0.33769543E-12   -0.52975410E+11
Col___10    0   -0.11405813E-01      0.33615304E-12   -0.33930419E+11
Col___11    0    0.46607156E-02      0.29562481E-12    0.15765644E+11
Col___12    0    0.23027214E-01      0.12575293E-12    0.18311474E+12
Col___13    0   -0.62476284E-11      0.25081251E-11   -2.4909557

=> CALL OLSQ(T(,4) XHOLD :NOINT :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                 ##1230
Centered R**2                      1.00000000000000
Adjusted R**2                      1.00000000000000
Residual Sum of Squares            8.851832887877189E-022
Residual Variance                  3.195607540749888E-024
Standard Error                     1.787626230717677E-012
Total Sum of Squares               0.999999999999996
Log Likelihood                     7439.69644703890
Mean of the Dependent Variable     0.973707980304693
Std. Error of Dependent Variable   5.882352941176460E-002
Sum Absolute Residuals             3.691369432345937E-010
1/Condition XPX                    1.929993696611666E-008
Maximum Absolute Residual          7.552181102710165E-012
Number of Observations             290

Variable   Lag   Coefficient         SE                t
Col____1    0    0.38811538E-01      0.56278113E-12    0.68963823E+11
Col____2    0    0.22100525E-01      0.12212876E-11    0.18096086E+11
Col____3    0    0.35156778E-02      0.13974715E-11    0.25157420E+10
Col____4    0   -0.57514845E-02      0.14087406E-11   -0.40827136E+10
Col____5    0   -0.17354253E-02      0.13287765E-11   -0.13060325E+10
Col____6    0    0.11402341E-01      0.81273344E-12    0.14029620E+11
Col____7    0    0.46513253E-01      0.44294198E-12    0.10500981E+12
Col____8    0   -0.18472915E-02      0.81650425E-12   -0.22624395E+10
Col____9    0   -0.28287817E-01      0.85303370E-12   -0.33161430E+11
Col___10    0   -0.25173608E-01      0.84913756E-12   -0.29646089E+11
Col___11    0   -0.12951020E-02      0.74676143E-12   -0.17342915E+10
Col___12    0    0.28371074E-01      0.31765749E-12    0.89313413E+11
Col___13    0   -0.12852264E-10      0.63356356E-11   -2.0285674
10.5 Boosting

Hastie-Tibshirani-Friedman (2001, 299) have noted "boosting is one of the most powerful learning ideas introduced in the last ten years….The motivation for boosting was a procedure that combines the outputs of many 'weak' classifiers12 to produce a powerful 'committee.'" Efron-Hastie-Johnstone-Tibshirani (2004, 445) noted ".. in some sense least squares boosting may be carrying out a Lasso fit on the infinite set of tree predictors." In this section two forms of the boosting algorithm are outlined. Their use is illustrated later.

In many data mining applications the number of potential right-hand-side variables is so large that it is not feasible to place them all in an OLS model. Boosting provides an iterative procedure by which the information in the vast set of potential right-hand-side variables can be extracted one variable at a time. Boosting first centers all x variables to have zero mean and unit length and removes the mean from the left-hand-side variable. Assume a small positive adjustment constant $0 < \epsilon \le 1$. Define $\hat{e}_{jt}$ as the $t$th observation of the residual at the $j$th iteration of a model to predict $y_t$, and write $\hat{e}_{j\cdot} = \{\hat{e}_{jt},\ t = 1, \dots, N\}$. Start the process at $j = 1$, where by assumption nothing is known, so $\hat{y}_{1t} = 0$, which implies $\hat{e}_{1t} = y_t$. Next select $x_{k\cdot}$ as the vector having the highest absolute correlation with $\hat{e}_{j\cdot}$. An OLS regression $\hat{e}_{j\cdot} = f(x_{k\cdot})$ produces a fitted value $\hat{z}_{j\cdot}$, which is used to update

$\hat{y}_{j+1,\cdot} = \hat{y}_{j\cdot} + \epsilon \hat{z}_{j\cdot}, \qquad \hat{e}_{j+1,\cdot} = y_{\cdot} - \hat{y}_{j+1,\cdot} .$

The process is repeated for a given number of steps. It can be shown that if enough steps are taken, the correlation of $y$ and $\hat{y}$ will approximate what can be obtained by an OLS fit with all the x variables in the model. This is illustrated by the test case in Table 10.15. A variant of boosting is modified stagewise boosting, shown in Table 10.14, which involves finding the best x vector at each step but in addition taking a small step in all vectors already in the model. Stokes suggests a further modification that estimates a constant at each stage and does not center the x variables or remove the mean from Y. This modification, shown in Tables 10.16 and 10.17, facilitates out-of-sample forecasting. A minimal sketch of the basic iteration is given below.

12 By classifier they mean OLS, L1, MINIMAX, MARS, GAM etc.
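The sketch assumes the boost routine developed below in Table 10.13 has been loaded with call load(boost :staging) and that, as the routine requires, the x matrix has been centered and the mean removed from y:

e=.5;
ntry=100;
do i=1,ntry;
/; itype=0 selects the OLS weak learner; iprint=0 suppresses step output
call boost(y,yhat,res,x,in,e,i,0,0);
enddo;
call print('Correlation of y and yhat ',ccf(y,yhat):);

Table 10.15 wraps exactly this loop in a complete job and adds graphics.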
10.6 Extended Examples

The code listed in Table 10.9 illustrates shrinkage estimation. A RATS implementation is shown to provide validation of the shrinkage calculation. The matrix command subroutines ridge and lasso, listed in Table 10.10, provide implementation detail. A related routine (not shown) shows how d in (10.2-30) can be set and includes the usual lasso as a special case. The GLM routine, discussed in section 10.3, provides a substantially more efficient approach to solving elastic net problems, which use the lasso and ridge penalties in varying proportions.
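As a sketch in the usual notation (the exact weighting convention used by the GLM routine may differ), the elastic net estimator solves

$\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \left[ \alpha \sum_{j=1}^{k} |\beta_j| + (1-\alpha) \sum_{j=1}^{k} \beta_j^2 \right], \quad 0 \le \alpha \le 1,$

which collapses to the lasso when $\alpha = 1$ and to ridge regression when $\alpha = 0$.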
Table 10.9 Example file for Shrinkage Models
___________________________________________________________________
/;
/; Ridge regression also shown
/;
%b34slet dorats =0;
%b34slet doridge=1;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(ridge :staging);
call load(lasso :staging);
call echooff;
k=6;
call olsq(gasout gasout{1 to k} gasin{1 to k} :print :savex);
%b34sif(&doridge.ne.0)%then;
lamda=2.;
call ridge(%y,%x,lamda,%coef,%names,%lag,ridge_c,0);
call ridge(%y,%x,lamda,%coef,%names,%lag,ridge_c,1);
%b34sendif;
lamda= 10.*sum(%coef)/2.0;
call echooff;
call lasso(%y,%x,%coef,%lcoef1,%l_t1,lamda,lresid1,1);
call lasso(%y,%x,%coef,%lcoef2,%l_t2,lamda,lresid2,3);
call tabulate(%names,%lag,%coef,%se,%t,%lcoef1,%l_t1,%lcoef2,%l_t2);
b34srun;
%b34sif(&dorats.ne.0)then;
b34sexec options open('rats.dat') unit(28) disp=unknown$ b34srun$
b34sexec options open('rats.in') unit(29) disp=unknown$ b34srun$
b34sexec options clean(28)$ b34srun$
b34sexec options clean(29)$ b34srun$
/; Uses logic from RATS User's Guide Version 6.1 Page 192
b34sexec pgmcall$
rats passasts
10-64
Chapter 10
pcomments('* ',
          '* Data passed from B34S(r) system to RATS',
          '* ',
          "display @1 %dateandtime() @33 ' Rats Version ' %ratsversion()"
          '* ') $
PGMCARDS$
*
* Non centered results
*
cmoment
# constant gasout{1 to 6} gasin{1 to 6} gasout
linreg(cmoment) gasout
# constant gasout{1 to 6} gasin{1 to 6}
do row=1,13
compute %cmom(row,row)=%cmom(row,row)+2
end do row
linreg(cmoment) gasout
# constant gasout{1 to 6} gasin{1 to 6}
b34sreturn$
b34srun $
b34sexec options close(28)$ b34srun$
b34sexec options close(29)$ b34srun$
b34sexec options
/$ dodos(' rats386 rats.in rats.out ')
dodos('start /w /r rats32s rats.in /run')
dounix('rats rats.in rats.out')$ B34SRUN$
b34sexec options npageout
WRITEOUT('Output from RATS',' ',' ')
COPYFOUT('rats.out')
dodos('ERASE rats.in','ERASE rats.out','ERASE rats.dat')
dounix('rm rats.in','rm rats.out','rm rats.dat')
$
B34SRUN$
%b34sendif;
The matrix subroutines ridge and lasso that are called are listed in Table 10.10:
Table 10.10 Ridge and Lasso Routines
_____________________________________________________________
subroutine ridge(y,x,lamda,ols_c,_name,_lag,ridge_c,iscale);
/;
/; Ridge Regression. See Hastie-Tibshirani-Friedman (2001) page 60
/;
/; y       => left hand side
/; x       => right hand side - constant at end
/; lamda   => Lamda for Ridge Regression
/; OLS_c   => OLS Coef
/; _name   => Usually %name
/; _lag    => Usually %lag
/; ridge_c => Ridge Coef
/; iscale  => =0 do not center X matrix
/;            =1 center X matrix
/;
/; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/; Routine built 24 March 2006
/; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
newx=mfam(x);
if(iscale.ne.0)then;
call deletecol(newx);
b0=mean(y);
do i=1,nocols(newx);
newx(,i)=newx(,i)-mean(newx(,i));
enddo;
endif;
d=diagmat(vector(nocols(newx):))+lamda;
b_ridge=inv((transpose(newx)*newx) + d)*transpose(newx)*vfam(y);
if(iscale.ne.0)then;
ridge_c=vector(:b_ridge,b0);
call print(' ':);
call print('Ridge Regression: X matrix has been centered ':);
endif;
if(iscale.eq.0)then;
ridge_c=vector(:b_ridge);
call print(' ':);
call print('Ridge Regression: X matrix has not been centered ':);
endif;
call tabulate(_name,_lag,OLS_c,ridge_c);
return;
end;

subroutine lasso(y,x2,olscoef,lcoef,l_t,lamda,lresid,iprint);
/;
/; Implements the LASSO shrinkage Method
/; Reference: Hastie-Tibshirani-Friedman (2001) Page 64, 72 and 77
/;
/; y       => left hand side
/; x2      => Right hand side. Usually %x from :savex
/; olscoef => %coef from call olsq. Used for starting values
/; lcoef   => Lasso Coef
/; l_t     => Lasso Coef t
/; lamda   => Lamda for Lasso Model. Larger Lamda => more shrinkage
/; lresid  => Residual from Lasso Model
/; iprint  => =0 No print, use cmaxf2
/;            =1 Print,    use cmaxf2
/;            =2 No print, use maxf2
/;            =3 Print,    use maxf2
/;
/; Note: The constant is not restricted!!
/;       Page 77 illustrates a centered example
/; +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rvec=array(:olscoef);
ll= array(norows(rvec):)-1.d+32;
uu= array(norows(rvec):)+1.d+32;
x=mfam(x2);
lcoef=vfam(olscoef);
i=integers(norows(olscoef)-1);
if(iprint.eq.0)
call cmaxf2(func :name lasso_1 :parms lcoef :ivalue rvec
            :maxit 2000 :lower ll :upper uu);
if(iprint.eq.1)
call cmaxf2(func :name lasso_1 :parms lcoef :ivalue rvec
            :maxit 2000 :lower ll :upper uu :print);
if(iprint.eq.2)
call maxf2(func :name lasso_1 :parms lcoef :ivalue rvec
           :maxit 2000);
if(iprint.eq.3)
call maxf2(func :name lasso_1 :parms lcoef :ivalue rvec
           :maxit 2000 :print);
lresid =sumsq(y-x*lcoef);
l_t=%t;
lse=%se;
if(iprint.eq.1.or.iprint.eq.3)then;
call print('Lamda for Lasso model                    ',lamda:);
call print('Sum of squared Residuals for Lasso Model ',lresid);
endif;
return;
end;

program lasso_1;
func=(-1.0)*(sumsq(y-x*lcoef) + lamda*(sum(abs(lcoef(i)))) );
call outstring(3,3,'Function');
call outdouble(36,3,func);
return;
end;
Edited output is shown next. First the gas data is used in a VAR model with 6 lags. The condition of X'X was found to be 2.341548639194536E-08. A problem with the ridge approach is the need to set $\lambda$, which was set to 2.0 for purposes of illustration.
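For reference, the ridge coefficients reported below are computed as in the ridge subroutine of Table 10.10,

$\hat{\beta}_{ridge} = (X'X + \lambda I)^{-1} X'y ,$

so that as $\lambda \rightarrow 0$ the OLS solution is recovered, while increasing $\lambda$ shrinks the coefficients and improves the conditioning of the matrix being inverted.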
Ordinary Least Squares Estimation
Dependent variable                    GASOUT
Centered R**2                         0.9946641074363697
Adjusted R**2                         0.9944329496357792
Residual Sum of Squares               16.13858295915809
Residual Variance                     5.826203234353101E-02
Standard Error                        0.2413752935648779
Total Sum of Squares                  3024.532965517241
Log Likelihood                        7.364694190043447
Mean of the Dependent Variable        53.50965517241379
Std. Error of Dependent Variable      3.235044356946151
Sum Absolute Residuals                48.13385294520057
F(12,  277)                           4302.965787421152
F Significance                        1.000000000000000
1/Condition XPX                       2.341548639194536E-08
Maximum Absolute Residual             1.430814663262694
Number of Observations                290

Variable   Lag  Coefficient       SE               t
GASOUT     1     1.5452265        0.59808504E-01    25.836234
GASOUT     2    -0.59293307       0.11024897        -5.3781279
GASOUT     3    -0.17105674       0.11518138        -1.4851076
GASOUT     4     0.13238479       0.11465530         1.1546329
GASOUT     5     0.56869923E-01   0.10083191         0.56400722
GASOUT     6    -0.42085617E-01   0.42891891E-01    -0.98120217
GASIN      1     0.63160860E-01   0.75989856E-01     0.83117489
GASIN      2    -0.13345763       0.16490508        -0.80929968
GASIN      3    -0.44123536       0.18869442        -2.3383593
GASIN      4     0.15200749       0.19021604         0.79913078
GASIN      5    -0.12036440       0.17941884        -0.67085705
GASIN      6     0.24930584       0.10973982         2.2717902
CONSTANT   0     3.8241094        0.85547296         4.4701698

Ridge Regression: X matrix has not been centered

Obs  _NAME      _LAG  OLS_C        RIDGE_C
  1  GASOUT      1     1.545        1.462
  2  GASOUT      2    -0.5929      -0.3270
  3  GASOUT      3    -0.1711      -0.2444
  4  GASOUT      4     0.1324       0.5392E-01
  5  GASOUT      5     0.5687E-01   0.7752E-01
  6  GASOUT      6    -0.4209E-01  -0.2530E-01
  7  GASIN       1     0.6316E-01   0.6540E-01
  8  GASIN       2    -0.1335      -0.1593
  9  GASIN       3    -0.4412      -0.3127
 10  GASIN       4     0.1520      -0.7397E-01
 11  GASIN       5    -0.1204       0.2525E-01
 12  GASIN       6     0.2493       0.4256
 13  CONSTANT    0     3.824        0.2010

Ridge Regression: X matrix has been centered

Obs  _NAME      _LAG  OLS_C        RIDGE_C
  1  GASOUT      1     1.545        1.355
  2  GASOUT      2    -0.5929      -0.3117
  3  GASOUT      3    -0.1711      -0.2472
  4  GASOUT      4     0.1324       0.5269E-01
  5  GASOUT      5     0.5687E-01   0.9835E-01
  6  GASOUT      6    -0.4209E-01  -0.3854E-01
  7  GASIN       1     0.6316E-01   0.4999E-01
  8  GASIN       2    -0.1335      -0.1568
  9  GASIN       3    -0.4412      -0.3262
 10  GASIN       4     0.1520      -0.1011
 11  GASIN       5    -0.1204       0.4107E-02
 12  GASIN       6     0.2493       0.2302
 13  CONSTANT    0     3.824       53.51
The ridge calculation is validated for the non-centered case by RATS as shown below. For example, note that both B34S and RATS find the coefficient for gasout{1} = 1.46164.
*
* Data passed from B34S(r) system to RATS
*
display @1 %dateandtime() @33 ' Rats Version ' %ratsversion()
08/21/2006 12:05        Rats Version    6.10000
*
CALENDAR(IRREGULAR)
ALLOCATE 296
OPEN DATA rats.dat
DATA(FORMAT=FREE,ORG=OBS, $
MISSING= 0.1000000000000000E+32 ) / $
TIME $
GASIN $
GASOUT $
CONSTANT
SET TREND = T
TABLE
Series    Obs   Mean         Std Error   Minimum     Maximum
TIME      296   148.500000   85.592056    1.000000   296.000000
GASIN     296    -0.056834    1.072766   -2.716000     2.834000
GASOUT    296    53.509122    3.202121   45.600000    60.500000
TREND     296   148.500000   85.592056    1.000000   296.000000
*
* Non centered results
*
cmoment
# constant gasout{1 to 6} gasin{1 to 6} gasout
linreg(cmoment) gasout
# constant gasout{1 to 6} gasin{1 to 6}

Linear Regression - Estimation by Least Squares
Dependent Variable GASOUT
Usable Observations             290
Degrees of Freedom              277
Centered R**2              0.994664
R Bar **2                  0.994433
Uncentered R**2            0.999981
T x R**2                    289.994
Mean of Dependent Variable         53.509655172
Std Error of Dependent Variable     3.235044357
Standard Error of Estimate          0.241375294
Sum of Squared Residuals           16.138582959
Regression F(12,277)                  4302.9658
Significance Level of F              0.00000000
Log Likelihood                          7.36469
Durbin-Watson Statistic                1.990879

Variable         Coeff          Std Error      T-Stat     Signif
*******************************************************************************
 1. Constant      3.824109410   0.855472960    4.47017    0.00001141
 2. GASOUT{1}     1.545226521   0.059808504   25.83623    0.00000000
 3. GASOUT{2}    -0.592933069   0.110248971   -5.37813    0.00000016
 4. GASOUT{3}    -0.171056741   0.115181382   -1.48511    0.13865246
 5. GASOUT{4}     0.132384790   0.114655303    1.15463    0.24923589
 6. GASOUT{5}     0.056869923   0.100831906    0.56401    0.57320557
 7. GASOUT{6}    -0.042085617   0.042891891   -0.98120    0.32734926
 8. GASIN{1}      0.063160860   0.075989856    0.83117    0.40659072
 9. GASIN{2}     -0.133457630   0.164905082   -0.80930    0.41903740
10. GASIN{3}     -0.441235358   0.188694423   -2.33836    0.02008022
11. GASIN{4}      0.152007492   0.190216039    0.79913    0.42489927
12. GASIN{5}     -0.120364395   0.179418842   -0.67086    0.50287058
13. GASIN{6}      0.249305837   0.109739815    2.27179    0.02386617

do row=1,13
compute %cmom(row,row)=%cmom(row,row)+2
end do row
linreg(cmoment) gasout
# constant gasout{1 to 6} gasin{1 to 6}

Linear Regression - Estimation by Least Squares
Dependent Variable GASOUT
Usable Observations             290
Degrees of Freedom              277
Centered R**2              0.994093
R Bar **2                  0.993838
Uncentered R**2            0.999979
T x R**2                    289.994
Mean of Dependent Variable         53.509655172
Std Error of Dependent Variable     3.235044357
Standard Error of Estimate          0.253955001
Sum of Squared Residuals           17.864600467
Log Likelihood                         -7.36850
Durbin-Watson Statistic                1.714901

Variable         Coeff          Std Error      T-Stat     Signif
*******************************************************************************
 1. Constant      0.201004680   0.175858742    1.14299    0.25402989
 2. GASOUT{1}     1.461646296   0.048558960   30.10045    0.00000000
 3. GASOUT{2}    -0.327011979   0.088129225   -3.71060    0.00024987
 4. GASOUT{3}    -0.244439653   0.089247268   -2.73890    0.00656424
 5. GASOUT{4}     0.053923597   0.088583807    0.60873    0.54320247
 6. GASOUT{5}     0.077517182   0.082098090    0.94420    0.34588925
 7. GASOUT{6}    -0.025297741   0.037997848   -0.66577    0.50611360
 8. GASIN{1}      0.065403718   0.058181817    1.12413    0.26193261
 9. GASIN{2}     -0.159258039   0.106196160   -1.49966    0.13484158
10. GASIN{3}     -0.312688564   0.109538260   -2.85461    0.00463463
11. GASIN{4}     -0.073966322   0.111719455   -0.66207    0.50847551
12. GASIN{5}      0.025246375   0.111576001    0.22627    0.82115779
13. GASIN{6}      0.425619705   0.074577938    5.70705    0.00000003
Greene (2003, 58) remarks on the ridge regression: "this biased estimator has a covariance matrix unambiguously smaller … The tradeoff of some bias for smaller variance may be worth making … but, nevertheless, economists are generally averse to biased estimators, so this approach has little practical use." Greene's traditional view may be a bit harsh in view of the recent interest in data mining. A major disadvantage of the ridge procedure is the need to set $\lambda$. A possible strategy is to estimate the model over a range of $\lambda$ values, as sketched below. Accuracy problems associated with estimation of large models might be resolved by use of higher-accuracy methods, such as the QR approach, and higher data precision. These issues are discussed in Chapter 16.
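A minimal sketch of such a grid search, assuming %y, %x, %coef, %names and %lag have been saved by a prior call olsq( ) with :savex and using the ridge routine of Table 10.10, is:

do i=1,10;
lamda=dfloat(i)/2.;
/; tabulates OLS vs ridge coefficients for lamda = .5, 1., ..., 5.
call ridge(%y,%x,lamda,%coef,%names,%lag,ridge_c,1);
enddo;

Each pass tabulates the OLS coefficients against the ridge coefficients, making the sensitivity of the shrinkage to the choice of $\lambda$ directly visible.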
An advantage of the lasso technique lies in its ability to shrink the least significant coefficients toward 0.0. The CMAXF2 and MAXF2 commands are used to solve (10.3-5). Since the constraints were not binding, the observed differences between the two sets of results lie in the way the Hessian was calculated.13

13 The SE is the square root of the diagonal elements of the inverse of the Hessian. The CMAXF2 and MAXF2 commands use IMSL routines based on NLPQL and ZXLSF respectively that use a Quasi-Newton (BFGS) method. The Hessian can differ depending on the gradient at the point of the solution.
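In terms of the lasso_1 program listed in Table 10.10, both commands maximize the negative of the penalized residual sum of squares, i.e. they solve

$\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \sum_{j \neq const} |\beta_j| ,$

where the constant is deliberately excluded from the penalty.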
Note: Scaled step tolerance satisfied. May be a local
      solution. Or progress too slow. Adjust STEPTL.

Constrained Maximum Likelihood Estimation using CMAXF2 Command
Final Functional Value          -66.33990842435293
# of parameters                  13
# of good digits in function     15
# of iterations                  107
# of function evaluations        366
# of gradiant evaluations        108
Scaled Gradient Tolerance        6.055454452393343E-06
Scaled Step Tolerance            3.666852862501036E-11
Relative Function Tolerance      3.666852862501036E-11
False Convergence Tolerance      2.220446049250313E-14
Maximum allowable step size      4210.1335437160988
Size of Initial Trust region    -1.000000000000000
1 / Cond. of Hessian Matrix      2.703839538993241E-12

 #   Name       Coefficient        Standard Error     T Value
 1   BETA___1    1.0244446         0.27374792E-02     374.22918
 2   BETA___2   -0.25655831E-09    0.27427500E-05    -0.93540535E-04
 3   BETA___3   -0.23134710        0.72168633E-02    -32.056462
 4   BETA___4   -0.25678470E-07    0.11949062E-03    -0.21489946E-03
 5   BETA___5    0.93661382E-11    0.31952100E-02     0.29313060E-08
 6   BETA___6    0.24113886E-01    0.15537732E-02     15.519566
 7   BETA___7   -0.10680745E-10    0.78749958E-06    -0.13562859E-04
 8   BETA___8   -0.41266048E-01    0.52025460E-02    -7.9318949
 9   BETA___9   -0.53869926        0.10659899E-01    -50.535117
10   BETA__10   -0.45525531E-07    0.20988395E-03    -0.21690811E-03
11   BETA__11   -0.23292376E-08    0.84625424E-04    -0.27524088E-04
12   BETA__12    0.16193677E-08    0.59690117E-04     0.27129577E-04
13   BETA__13    9.7543182         0.45478764         21.448072

Gradiant Vector
-0.162525E-01  -0.274755E-02  -8.32499       -6.88956      -0.179800E-01
-20.6699       -11.5486        9.27715        14.5865      -0.209805E-03
-0.151524E-01  -4.30439        0.160521E-02

Lower vector
-0.100000E+33 (all 13 elements)

Upper vector
 0.100000E+33 (all 13 elements)

Lamda for Lasso model                     22.60966011992043
Sum of squared Residuals for Lasso Model
LRESID  =      24.288858

Note: Last global step failed to locate a lower point than
      the current x value.

Maximum Likelihood Estimation using MAXF2 Command
Finite-difference Gradiant
Final Functional Value          -66.33989973667484
# of parameters                  13
# of good digits in function     15
# of iterations                  105
# of function evaluations        294
# of gradiant evaluations        129
Scaled Gradient Tolerance        6.055454452393343E-06
Scaled Step Tolerance            3.666852862501036E-11
Relative Function Tolerance      3.666852862501036E-11
False Convergence Tolerance      2.220446049250313E-14
Maximum allowable step size      4210.1335437160988
1 / Cond. of Hessian Matrix      9.887930572777443E-07

 #   Name       Coefficient        Standard Error     T Value
 1   BETA___1    1.0246898         0.21438009E-01     47.797807
 2   BETA___2   -0.23461755E-08    0.21492272E-02    -0.10916368E-05
 3   BETA___3   -0.23158361        0.25456171E-01    -9.0973466
 4   BETA___4   -0.78332610E-09    0.66603466E-03    -0.11761041E-05
 5   BETA___5    0.39740723E-10    0.23122965E-02     0.17186690E-07
 6   BETA___6    0.24191727E-01    0.33097359E-01     0.73092619
 7   BETA___7   -0.56468635E-08    0.11352848E-02    -0.49739622E-05
 8   BETA___8   -0.41580631E-01    0.16030380E-01    -2.5938644
 9   BETA___9   -0.53813947        0.63823539E-02    -84.316771
10   BETA__10   -0.34905921E-09    0.86480428E-03    -0.40362799E-06
11   BETA__11   -0.15004931E-08    0.48923809E-03    -0.30669997E-05
12   BETA__12    0.15162698E-08    0.24349930E-03     0.62269984E-05
13   BETA__13    9.7497007         0.93149088         10.466770

SE calculated as sqrt |diagonal(inv(%hessian))|

Gradiant Vector
-0.231362E-02   0.197955E-03   12.6483        15.4705      -0.121604E-02
 0.864598       10.5148       -12.2471        -8.01297     -0.692538E-04
 0.144096E-02   14.2601        0.106954E-03

Lamda for Lasso model                     22.60966011992043
Sum of squared Residuals for Lasso Model
LRESID  =      24.281743

Obs  %NAMES    %LAG  %COEF        %SE         %T       %LCOEF1      %L_T1        %LCOEF2      %L_T2
  1  GASOUT     1     1.545       0.5981E-01   25.84    1.024        374.2        1.025        47.80
  2  GASOUT     2    -0.5929      0.1102      -5.378   -0.2566E-09  -0.9354E-04  -0.2346E-08  -0.1092E-05
  3  GASOUT     3    -0.1711      0.1152      -1.485   -0.2313      -32.06       -0.2316      -9.097
  4  GASOUT     4     0.1324      0.1147       1.155   -0.2568E-07  -0.2149E-03  -0.7833E-09  -0.1176E-05
  5  GASOUT     5     0.5687E-01  0.1008       0.5640   0.9366E-11   0.2931E-08   0.3975E-10   0.1719E-07
  6  GASOUT     6    -0.4209E-01  0.4289E-01  -0.9812   0.2411E-01   15.52        0.2419E-01   0.7309
  7  GASIN      1     0.6316E-01  0.7599E-01   0.8312  -0.1068E-10  -0.1356E-04  -0.5647E-08  -0.4974E-05
  8  GASIN      2    -0.1335      0.1649      -0.8093  -0.4127E-01  -7.932       -0.4158E-01  -2.594
  9  GASIN      3    -0.4412      0.1887      -2.338   -0.5387      -50.54       -0.5381      -84.32
 10  GASIN      4     0.1520      0.1902       0.7991  -0.4553E-07  -0.2169E-03  -0.3491E-09  -0.4036E-06
 11  GASIN      5    -0.1204      0.1794      -0.6709  -0.2329E-08  -0.2752E-04  -0.1500E-08  -0.3067E-05
 12  GASIN      6     0.2493      0.1097       2.272    0.1619E-08   0.2713E-04   0.1516E-08   0.6227E-05
 13  CONSTANT   0     3.824       0.8555       4.470    9.754        21.45        9.750        10.47
At a cost of $e'e$ rising from 16.13858 to 24.28886, insignificant parameters such as GASIN{1} were reduced substantially, i.e., from 0.6316E-01 to -0.1068E-10. We are left with a substantially reduced model in which GASOUT{1}, GASOUT{3} and GASIN{3} were shown to be driving the model.

The GAS model is next viewed from the perspective of the LTS approach. Here outliers are trimmed from the dataset and the results inspected to see how much difference this makes.14 The analysis uses the lts and lts_rec subroutines listed in Table 10.11.

14 Since the larger residuals have been removed, the t statistics usually increase as the number of observations dropped is increased.
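Formally, LTS chooses $\beta$ to minimize $\sum_{i=1}^{h} r_{(i)}^2(\beta)$, the sum of the $h$ smallest squared residuals, where $h \approx p \cdot n$ for retention fraction $p$. The lts routine below approximates this by ranking the squared residuals of an initial OLS fit and re-estimating by OLS on the $h$ observations with the smallest residuals, rather than solving the combinatorial problem exactly.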
Table 10.11 LTS and LTS_REC Routines for Resistant Estimation
__________________________________________________________
subroutine lts(y,x,resid,names,lags,oldcoef,oldse,oldt,p,sort_y,
               sort_x,ihold,iprint);
/;
/; Least Trimmed Squares
/;
/; Reference: "Linear Models with R" By Julian Faraway, Page 101
/;
/; y       => left hand side.                      Usually %y
/; x       => right hand side.                     Usually %x
/; resid   => Residual from original regression.   Usually %res
/; names   => Names from regression.               Usually %names
/; lags    => lags from original regression.       Usually %lag
/; oldcoef => Coefficient for full model.          Usually %coef
/; oldse   => SE from original Model.              Usually %se
/; oldt    => t from original model.               Usually %t
/; p       => % trimmed
/; sort_y  => Y after sorting and truncation
/; sort_x  => X after sorting and truncation
/; ihold   => # of obs held out
/; iprint  => If > 0 => print.
/;
/; Help on how to recover coef outside of routine. Assume:
/;
/; n=6;
/; call olsq(gasout gasin{1 to n} gasout{1 to n} :print :savex);
/; call lts(%y,%x,%res,%names,%lag,%coef,%se,%t,p,newy,newx,
/;          ihold,iprint);
/; option # 1
/; call olsq(newy newx :noint :print :holdout ihold);
/; option # 2
/; n=norows(newy);
/; call deleterow(newy,n-ihold+1,ihold);
/; call deleterow(newx,n-ihold+1,ihold);
/; call print(inv(transpose(newx)*newx)*transpose(newx)*newy);
/;
/; -----------------------------------------------------------
/; LTS is an example of a resistant regression method.
/; The objective is to see how sensitive the results are
/; to outliers.
/;
/; Built 21 June 2007 by Houston H. Stokes
/; Mods 23 June 2007
/;
n=norows(x);
if(p.lt.0. .or. p .gt.1.)then;
call epprint('p not in range 0 lt p le 1. Was ',p:);
go to done;
endif;
r=afam(resid)*afam(resid);
j=ranker(r);
sort_x=x(j,);
sort_y=y(j);
ihold=idint((1.0-p)*dfloat(n));
if(iprint.eq.0)
call olsq(sort_y sort_x :noint :qr :holdout ihold);
if(iprint.ne.0)then;
call print(' ':);
call print('Least Trimmed Squares with holdout   ',ihold:);
call olsq(sort_y sort_x :noint :print :qr :holdout ihold);
call print(' ':);
call print('Least Trimmed Squares with holdout % ',(1.0-p):);
call tabulate(names lags oldcoef oldse oldt %coef %se %t :cname);
endif;
done continue;
return;
end;

program lts_rec;
/;
/; Does Recursive lts
/;
/; Needs the following set:
/;
/;    oldcoef =1;
/;    n_recur =6;
/;
/;    oldcoef = 0 => use Base Coefficients for table
/;    oldcoef = 1 => use prior LTS coef for table
/;    n_recur =    => Sets number of recursive LTS estimates
/;
/; example of use:
/;
/; b34sexec matrix;
/; call loaddata;
/; call load(lts     :staging);
/; call load(lts_rec :staging);
/; call echooff;
/; n=6;
/; p=.8;
/; iprint=1;
/; call olsq(gasout gasin{1 to n} gasout{1 to n} :print :savex);
/; call lts(%y,%x,%res,%names,%lag,%coef,%se,%t,p,newy,newx,
/;          ihold,iprint);
/; oldcoef=1;
/; n_recur=6;
/; call lts_rec;
/; b34srun;
/;
/; Built 23 June 2007
/; -----------------------------------------------------------
holdname=%names;
holdlag =%lag;
holdcoef=%coef;
holdse  =%se;
holdt   =%t;
do i=1,n_recur;
/; logic: rerun to get %coef etc and proceed
call olsq(newy,newx,:noint :savex :holdout ihold);
yy=newy;
xx=newx;
n_y=norows(yy);
call deleterow(yy,n_y-ihold+1,ihold);
call deleterow(xx,n_y-ihold+1,ihold);
if(oldcoef.eq.1)then;
%coef=holdcoef;
%se  =holdse;
%t   =holdt;
endif;
call print(' ':);
if(oldcoef.eq.0)call print('Oldcoef is Prior LTS Coef':);
if(oldcoef.eq.1)call print('Oldcoef is OLS Coef':);
call print('Recursive Trimmed Squares pass ',i:);
call lts(yy,xx,%res,%names,%lag,%coef,%se,%t,p,newy,
         newx,ihold,iprint);
enddo;
return;
end;
A sample job that does both LTS and recursive LTS estimation is shown in Table 10.12.
Table 10.12 Estimation of LTS Based Models
____________________________________
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(lts     :staging);
call load(lts_rec :staging);
call echooff;
n=6;
p=.8;
iprint=1;
call olsq(gasout gasin{1 to n} gasout{1 to n} :print :savex);
call lts(%y,%x,%res,%names,%lag,%coef,%se,%t,p,newy,newx,ihold,iprint);
/;
/; Recursive
/;
oldcoef=1;
n_recur=2;
call lts_rec;
b34srun;
and when run produces edited output:

Least Trimmed Squares with holdout        57

Ordinary Least Squares Estimation using QR Method
Dependent variable                    SORT_Y
Centered R**2                         0.9986167752837815
Adjusted R**2                         0.9985413266628969
Residual Sum of Squares               3.228594427396152
Residual Variance                     1.467542921543706E-02
Standard Error                        0.1211421859446042
Total Sum of Squares                  2334.106952789699
Log Likelihood                        167.8898399989807
Mean of the Dependent Variable        53.32103004291845
Std. Error of Dependent Variable      3.171877335426148
Sum Absolute Residuals                22.80572349773055
F(12,  220)                           13235.71940182378
F Significance                        1.000000000000000
QR Rank Check variable (eps) set as   2.220446049250313E-16
Maximum Absolute Residual             0.3122580017218866
Number of Observations                233

Variable   Lag  Coefficient        SE                t
Col____1   0     0.25577822E-01   0.45056830E-01     0.56767914
Col____2   0    -0.16805808E-01   0.10722801        -0.15672965
Col____3   0    -0.61852464       0.12362575        -5.0032025
Col____4   0     0.26140688       0.12062224         2.1671532
Col____5   0    -0.45601485E-01   0.11847044        -0.38491867
Col____6   0     0.17696140       0.65535734E-01     2.7002277
Col____7   0     1.5118237        0.43583183E-01    34.688234
Col____8   0    -0.46466148       0.79569077E-01    -5.8397244
Col____9   0    -0.30331998       0.74393105E-01    -4.0772594
Col___10   0     0.20943964       0.75542966E-01     2.7724572
Col___11   0    -0.88668066E-04   0.67338641E-01    -0.13167487E-02
Col___12   0    -0.18285169E-01   0.27449928E-01    -0.66612815
Col___13   0     3.4737529        0.57869901         6.0026936

Least Trimmed Squares with holdout %  0.2000000000000000

Obs  NAMES      LAGS  OLDCOEF      OLDSE       OLDT     %COEF        %SE         %T
  1  GASIN       1     0.6316E-01  0.7599E-01   0.8312   0.2558E-01   0.4506E-01   0.5677
  2  GASIN       2    -0.1335      0.1649      -0.8093  -0.1681E-01   0.1072      -0.1567
  3  GASIN       3    -0.4412      0.1887      -2.338   -0.6185       0.1236      -5.003
  4  GASIN       4     0.1520      0.1902       0.7991   0.2614       0.1206       2.167
  5  GASIN       5    -0.1204      0.1794      -0.6709  -0.4560E-01   0.1185      -0.3849
  6  GASIN       6     0.2493      0.1097       2.272    0.1770       0.6554E-01   2.700
  7  GASOUT      1     1.545       0.5981E-01  25.84     1.512        0.4358E-01  34.69
  8  GASOUT      2    -0.5929      0.1102      -5.378   -0.4647       0.7957E-01  -5.840
  9  GASOUT      3    -0.1711      0.1152      -1.485   -0.3033       0.7439E-01  -4.077
 10  GASOUT      4     0.1324      0.1147       1.155    0.2094       0.7554E-01   2.772
 11  GASOUT      5     0.5687E-01  0.1008       0.5640  -0.8867E-04   0.6734E-01  -0.1317E-02
 12  GASOUT      6    -0.4209E-01  0.4289E-01  -0.9812  -0.1829E-01   0.2745E-01  -0.6661
 13  CONSTANT    0     3.824       0.8555       4.470    3.474        0.5787       6.003
Note how the significance of most variables increased, as expected. If in fact true outliers were removed by the LTS algorithm, the LTS estimated coefficients might be used in place of the OLS coefficients for any out-of-sample forecasting. Coefficients that derived their significance from the removed outliers will, in many cases, be seen to lose significance in the LTS model. There is no reason why the LTS algorithm cannot be applied again. As an example, if LTS is applied two more times the results are:
Oldcoef is OLS Coef
Recursive Trimmed Squares pass        1
Least Trimmed Squares with holdout   46

Ordinary Least Squares Estimation using QR Method
Dependent variable                    SORT_Y
Centered R**2                         0.9993441473606075
Adjusted R**2                         0.9992989161440977
Residual Sum of Squares               1.276018688275978
Residual Variance                     7.333440737218262E-03
Standard Error                        8.563551095905403E-02
Total Sum of Squares                  1945.587486631016
Log Likelihood                        200.9770082868623
Mean of the Dependent Variable        53.34973262032085
Std. Error of Dependent Variable      3.234215171813110
Sum Absolute Residuals                13.09864156581015
F(12,  174)                           22094.12490914283
F Significance                        1.000000000000000
QR Rank Check variable (eps) set as   2.220446049250313E-16
Maximum Absolute Residual             0.1621619933634477
Number of Observations                187

Variable   Lag  Coefficient        SE                t
Col____1   0     0.26812228E-01   0.38419042E-01     0.69788903
Col____2   0     0.26604491E-01   0.89371617E-01     0.29768389
Col____3   0    -0.69665370       0.98456422E-01    -7.0757568
Col____4   0     0.30968284       0.91216416E-01     3.3950341
Col____5   0    -0.52287859E-01   0.87851941E-01    -0.59518160
Col____6   0     0.16948075       0.50876822E-01     3.3311977
Col____7   0     1.4767408        0.36399384E-01    40.570487
Col____8   0    -0.37286811       0.65275731E-01    -5.7122013
Col____9   0    -0.39747469       0.64883283E-01    -6.1259954
Col___10   0     0.26441683       0.63975923E-01     4.1330679
Col___11   0    -0.22112572E-01   0.54621589E-01    -0.40483208
Col___12   0    -0.14860658E-01   0.22135590E-01    -0.67134681
Col___13   0     3.5218606        0.49989539         7.0451953

Least Trimmed Squares with holdout %  0.2000000000000000

Obs  NAMES      LAGS  OLDCOEF      OLDSE       OLDT     %COEF        %SE         %T
  1  Col____1    0     0.6316E-01  0.7599E-01   0.8312   0.2681E-01   0.3842E-01   0.6979
  2  Col____2    0    -0.1335      0.1649      -0.8093   0.2660E-01   0.8937E-01   0.2977
  3  Col____3    0    -0.4412      0.1887      -2.338   -0.6967       0.9846E-01  -7.076
  4  Col____4    0     0.1520      0.1902       0.7991   0.3097       0.9122E-01   3.395
  5  Col____5    0    -0.1204      0.1794      -0.6709  -0.5229E-01   0.8785E-01  -0.5952
  6  Col____6    0     0.2493      0.1097       2.272    0.1695       0.5088E-01   3.331
  7  Col____7    0     1.545       0.5981E-01  25.84     1.477        0.3640E-01  40.57
  8  Col____8    0    -0.5929      0.1102      -5.378   -0.3729       0.6528E-01  -5.712
  9  Col____9    0    -0.1711      0.1152      -1.485   -0.3975       0.6488E-01  -6.126
 10  Col___10    0     0.1324      0.1147       1.155    0.2644       0.6398E-01   4.133
 11  Col___11    0     0.5687E-01  0.1008       0.5640  -0.2211E-01   0.5462E-01  -0.4048
 12  Col___12    0    -0.4209E-01  0.4289E-01  -0.9812  -0.1486E-01   0.2214E-01  -0.6713
 13  Col___13    0     3.824       0.8555       4.470    3.522        0.4999       7.045

Oldcoef is OLS Coef
Recursive Trimmed Squares pass        2
Least Trimmed Squares with holdout   37

Ordinary Least Squares Estimation using QR Method
Dependent variable                    SORT_Y
Centered R**2                         0.9996155075367260
Adjusted R**2                         0.9995818293647604
Residual Sum of Squares               0.5880635102914726
Residual Variance                     4.292434381689581E-03
Standard Error                        6.551667254744842E-02
Total Sum of Squares                  1529.453933333333
Log Likelihood                        202.7758915427529
Mean of the Dependent Variable        53.14733333333334
Std. Error of Dependent Variable      3.203871329950913
Sum Absolute Residuals                7.986427256057823
F(12,  137)                           29681.40635892062
F Significance                        1.000000000000000
QR Rank Check variable (eps) set as   2.220446049250313E-16
Maximum Absolute Residual             0.1219604495012589
Number of Observations                150

Variable   Lag  Coefficient        SE                t
Col____1   0     0.14701514E-01   0.32385411E-01     0.45395483
Col____2   0     0.78455632E-01   0.75255516E-01     1.0425233
Col____3   0    -0.75794272       0.81201400E-01    -9.3341090
Col____4   0     0.32497672       0.73417337E-01     4.4264302
Col____5   0    -0.25320706E-01   0.71378642E-01    -0.35473785
Col____6   0     0.15306172       0.41052916E-01     3.7284008
Col____7   0     1.4458257        0.32552851E-01    44.414719
Col____8   0    -0.28187578       0.59770228E-01    -4.7159898
Col____9   0    -0.49898873       0.57220844E-01    -8.7204014
Col___10   0     0.32026947       0.54902520E-01     5.8334202
Col___11   0    -0.35984380E-01   0.47938922E-01    -0.75062973
Col___12   0    -0.15499016E-01   0.19306494E-01    -0.80278768
Col___13   0     3.5229776        0.40461601         8.7069654

Least Trimmed Squares with holdout %  0.2000000000000000

Obs  NAMES      LAGS  OLDCOEF      OLDSE       OLDT     %COEF        %SE         %T
  1  Col____1    0     0.6316E-01  0.7599E-01   0.8312   0.1470E-01   0.3239E-01   0.4540
  2  Col____2    0    -0.1335      0.1649      -0.8093   0.7846E-01   0.7526E-01   1.043
  3  Col____3    0    -0.4412      0.1887      -2.338   -0.7579       0.8120E-01  -9.334
  4  Col____4    0     0.1520      0.1902       0.7991   0.3250       0.7342E-01   4.426
  5  Col____5    0    -0.1204      0.1794      -0.6709  -0.2532E-01   0.7138E-01  -0.3547
  6  Col____6    0     0.2493      0.1097       2.272    0.1531       0.4105E-01   3.728
  7  Col____7    0     1.545       0.5981E-01  25.84     1.446        0.3255E-01  44.41
  8  Col____8    0    -0.5929      0.1102      -5.378   -0.2819       0.5977E-01  -4.716
  9  Col____9    0    -0.1711      0.1152      -1.485   -0.4990       0.5722E-01  -8.720
 10  Col___10    0     0.1324      0.1147       1.155    0.3203       0.5490E-01   5.833
 11  Col___11    0     0.5687E-01  0.1008       0.5640  -0.3598E-01   0.4794E-01  -0.7506
 12  Col___12    0    -0.4209E-01  0.4289E-01  -0.9812  -0.1550E-01   0.1931E-01  -0.8028
 13  Col___13    0     3.824       0.8555       4.470    3.523        0.4046       8.707
where the original OLS results are listed on the left. Note how the t scores are moving upward.

Tables 10.13 and 10.14 list subroutines that implement boosting and modified stagewise boosting. Table 10.15 contains the test case code that performs 100 iterations with $\epsilon = .5$. Figure 10.12 shows how the correlation of $y$ and $\hat{y}_{j\cdot}$ increases as the iterations proceed. The "best" correlation obtained for OLS was .719547.
Table 10.13 Boosting Routine
___________________________________________________________
subroutine boost(y,yhat,res,x,in,e,ipass,itype,iprint);
/;
/; Implements OLS boosting as described in "Least Angle Regression"
/; by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
/; The Annals of Statistics Vol 32 No. 2 (April 2004) pp 407-451
/;
/; y      => left hand variable
/; yhat   => Forecasted y
/; res    => error
/; x      => n by k matrix of right hand side variables
/; in     => work vector telling what vectors in X used
/;           in = 0 implies that vector in
/; e      => Step size - use a small number
/; ipass  => Number of times called. Be sure called enough times;
/;           one way to test this is to monitor progress in the
/;           correlation improvement
/; itype  => =0 OLS, =1 MARS, =2 GAM, =3 L1, =4 minimax
/; iprint => =0 no printing, =1 print step data, =2 print correlations
/;
/; Note: X must be centered. Y must have mean removed.
/;
cc=array(nocols(x):);
if(ipass.eq.1)then;
in=array(nocols(x):)+1.;
yhat=y*0.0;
res=y;
endif;
do i=1,nocols(x);
cc(i)=ccf(res,x(,i));
enddo;
ij=imax(abs(cc));
if(iprint.eq.2)then;
call print('Largest Correlation vector was ',ij:);
call tabulate(in,cc);
endif;
if(iprint.eq.0)then;
if(itype.eq.0)call olsq(     res,x(,ij) :noint);
if(itype.eq.1)call marspline(res,x(,ij));
if(itype.eq.2)call gamfit(   res,x(,ij)[predictor,3] :noint);
if(itype.eq.3)call olsq(     res,x(,ij) :noint :l1);
if(itype.eq.4)call olsq(     res,x(,ij) :noint :minimax);
endif;
if(iprint.ne.0)then;
if(itype.eq.0)call olsq(     res,x(,ij) :noint :print);
if(itype.eq.1)call marspline(res,x(,ij) :print);
if(itype.eq.2)call gamfit(   res,x(,ij)[predictor,3] :noint :print);
if(itype.eq.3)call olsq(     res,x(,ij) :noint :print :l1);
if(itype.eq.4)call olsq(     res,x(,ij) :noint :print :minimax);
endif;
if(itype.eq.3)%yhat=%l1yhat;
if(itype.eq.4)%yhat=%mmyhat;
yhat=yhat+ afam(e)*afam(%yhat);
res=y-yhat;
return;
end;
____________________________________________________________
Table 10.14 Modified Forward Stagewise model boosting
___________________________________________________________
subroutine boost2(y,yhat,res,x,xbuild,in,e,ipass,itype,iprint);
/;
/; Modified OLS boosting as described in "Least Angle Regression"
/; by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
/; The Annals of Statistics Vol 32 No. 2 (April 2004) pp 407-451
/;
/; y      => left hand variable
/; yhat   => Forecasted y
/; res    => error
/; x      => n by k matrix of right hand side variables
/; xbuild => accumulates the selected columns of x
/; in     => work vector telling what vectors in X used; when used set = 0
/; e      => Step size
/; ipass  => Number of times called
/; itype  => =0 OLS, =1 MARS, =2 GAM, =3 L1, =4 Minimax
/; iprint => =0 no printing, =1 print step data, =2 print correlations
/;
/; Note: X must be centered. Y must have mean removed.
/;
cc=array(nocols(x):);
if(ipass.eq.1)then;
in=array(nocols(x):)+1.;
yhat=y*0.0;
res=y;
endif;
do i=1,nocols(x);
cc(i)=ccf(res,x(,i));
enddo;
ij=imax(abs(cc));
if(iprint.eq.2)then;
call print('Largest Correlation vector was ',ij:);
call tabulate(in,cc);
endif;
if(iprint.ne.0)call print(ipass,cc,ij);
if(ipass.eq.1)xbuild=array(norows(x),1:x(,ij));
if(ipass.gt.1.and.in(ij).ne.0.0)xbuild=catcol(xbuild,x(,ij));
in(ij)=0.0;
do j=1,nocols(xbuild);
if(iprint.eq.0)then;
if(itype.eq.0)call olsq(     res,xbuild(,j) :noint);
if(itype.eq.1)call marspline(res,xbuild(,j));
if(itype.eq.2)call gamfit(   res,xbuild(,j)[predictor,3] :noint);
if(itype.eq.3)call olsq(     res,xbuild(,j) :noint :l1);
if(itype.eq.4)call olsq(     res,xbuild(,j) :noint :minimax);
endif;
if(iprint.ne.0)then;
if(itype.eq.0)call olsq(     res,xbuild(,j) :noint :print);
if(itype.eq.1)call marspline(res,xbuild(,j) :print);
if(itype.eq.2)call gamfit(   res,xbuild(,j)[predictor,3] :noint :print);
if(itype.eq.3)call olsq(     res,xbuild(,j) :noint :print :l1);
if(itype.eq.4)call olsq(     res,xbuild(,j) :noint :print :minimax);
endif;
if(itype.eq.3)%yhat=%l1yhat;
if(itype.eq.4)%yhat=%mmyhat;
yhat=yhat+ afam(e)*afam(%yhat);
res=y-yhat;
enddo;
return;
end;
_____________________________________________________________________________
Table 10.15 Boosting Test Case
_________________________________________________________________
b34sexec options ginclude('b34sdata.mac') member(efron_1); b34srun;
/; b34sexec options ginclude('b34sdata.mac') member(efron_2); b34srun;
/;
/; Implements OLS boosting as described in "Least Angle Regression"
/; by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
/; The Annals of Statistics Vol 32 No. 2 (April 2004) pp 407-451
/;
/; Modified boosting also available if iboost set = 1
/;
b34sexec matrix;
call loaddata;
call load(center  :staging);
call load(center2 :staging);
call load(boost   :staging);
call echooff;
/; iboost=0 => ols boost
/; iboost=1 => modified OLS boost
iboost=0;
/; iboost=1;
e=.5;
ntry=100;
/; itype=0 => ols
/; itype=1 => mars
/; itype=2 => gam
/; itype=3 => l1
/; itype=4 => minimax
itype=0;
iprint=0;
x=center2(catcol(age sex bmi bp s1 s2 s3 s4 s5 s6));
y=y-mean(y);
/; Set up the base case.
if(itype.eq.0)call olsq(y,x :noint :print);
if(itype.eq.1)call marspline(y x :print :nk 20);
if(itype.eq.2)call gamfit(y x[predictor,3] :print);
if(itype.eq.3)call olsq(y,x :noint :print :l1);
if(itype.eq.4)call olsq(y,x :noint :print :minimax);
call print(ccf(%y,%yhat));
base1=ccf(%y,%yhat);
fit=array(ntry:);
do i=1,ntry;
if(iboost.eq.0)call boost( y,yhat,res,x,       in,e,i,itype,iprint);
if(iboost.eq.1)call boost2(y,yhat,res,x,xbuild,in,e,i,itype,iprint);
/; if(iprint.ne.0)call graph(y,yhat);
fit(i)=ccf(y,yhat);
call outstring(1,2,'Iteration');
call outinteger(40,2,i);
call outstring(1,3,'Correlation');
call outdouble(40,3,fit(i));
enddo;
base=array(norows(fit):)+base1;
if(itype.eq.0)call char1(cc1,'OLS'    );
if(itype.eq.1)call char1(cc1,'MARS'   );
if(itype.eq.2)call char1(cc1,'GAM'    );
if(itype.eq.3)call char1(cc1,'L1'     );
if(itype.eq.4)call char1(cc1,'MINIMAX');
cc=' Boosting based Correlation of y and yhat given eps =';
if(iboost.eq.1)
cc=' Modified Boosting based Correlation of y and yhat given eps =';
call ir8tostr(e,cc2,'(f8.4)');
cc=catrow(cc1,cc,cc2);
call graph(base,fit :heading cc :file 'boost.wmf'
           :nocontact :nolabel :pgborder);
call print(fit);
b34srun;
OLS Boosting based Correlation of y and yhat given eps = 0.5000

Figure 10.12 OLS Boosting Example. The graph plots FIT, the correlation of y and yhat at each boosting iteration (Obs = 0, ..., 100), against the horizontal reference line BASE, the full-model OLS correlation; the vertical axis runs from .58 to .72.
Using Efron's test data, and setting $\epsilon = .5$, after 100 iterations the correlation for boosting was .717720 as shown below:
Ordinary Least Squares Estimation
Dependent variable                    Y
Centered R**2                         0.5177484222203508
Adjusted R**2                         0.5077015143499414
Residual Sum of Squares               1263985.785633341
Residual Variance                     2925.893022299400
Standard Error                        54.09152449598182
Total Sum of Squares                  2621009.124434389
Log Likelihood                        -2385.992862123519
Mean of the Dependent Variable        -1.196026686437816E-14
Std. Error of Dependent Variable      77.09300453299109
Sum Absolute Residuals                19128.63379518925
F( 9,  432)                           51.53311137103671
F Significance                        1.000000000000000
1/Condition XPX                       1.708266908084447E-03
Maximum Absolute Residual             155.8267661215523
Number of Observations                442

Variable   Lag  Coefficient    SE           t
Col____1   0    -10.009866     59.680052    -0.16772549
Col____2   0    -239.81564     61.151444    -3.9216677
Col____3   0     519.84592     66.456394     7.8223613
Col____4   0     324.38465     65.346228     4.9640913
Col____5   0    -792.17564     416.19732    -1.9033655
Col____6   0     476.73902     338.63787     1.4078137
Col____7   0     101.04327     212.28533     0.47597857
Col____8   0     177.06324     161.28879     1.0978025
Col____9   0     751.27370     171.70091     4.3754789
Col___10   0     67.626692     65.907867     1.0260792

0.71954737
FIT = Array of 100 elements

0.586450  0.676525  0.691952  0.700391  0.700302  0.706626  0.705759  0.706539
0.707305  0.710165  0.712198  0.712596  0.713552  0.713761  0.713940  0.714881
0.715390  0.715387  0.715911  0.715876  0.716153  0.716215  0.716437  0.716488
0.716644  0.716694  0.716811  0.716909  0.716942  0.716963  0.717037  0.717096
0.717120  0.717162  0.717174  0.717203  0.717220  0.717249  0.717265  0.717294
0.717309  0.717326  0.717342  0.717356  0.717374  0.717381  0.717390  0.717397
0.717412  0.717418  0.717429  0.717434  0.717444  0.717454  0.717459  0.717469
0.717472  0.717482  0.717487  0.717497  0.717500  0.717509  0.717517  0.717522
0.717527  0.717531  0.717538  0.717543  0.717548  0.717555  0.717559  0.717567
0.717570  0.717577  0.717581  0.717588  0.717593  0.717600  0.717607  0.717611
0.717617  0.717620  0.717627  0.717632  0.717638  0.717645  0.717648  0.717652
0.717660  0.717665  0.717671  0.717675  0.717682  0.717688  0.717693  0.717696
0.717703  0.717708  0.717713  0.717720
If the iterations were increased to 1900 the correlation increased to .719537, just a shade below the
OLS value of .719547.
Table 10.16 Modifications to OLS Boosting to Facilitate Forecasting
________________________________________________________________________
subroutine boost3(y,yhat,res,x,in,beta1,beta2,e,ipass,itype,iprint);
/;
/; Implements a mod to OLS boosting as described in
/; "Least Angle Regression" by Bradley Efron, Trevor Hastie,
/; Iain Johnstone and Robert Tibshirani,
/; The Annals of Statistics Vol 32 No. 2 (April 2004) pp 407-451
/;
/; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/; HHS Mods involve use of constant and no mean adjustment.
/; Goal is to facilitate forecasting using boost4 routine.
/; in, beta1, beta2 used as input into boost4
/; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/;
/; y      => left hand variable
/; yhat   => Forecasted y
/; res    => error
/; x      => n by k matrix of right hand side variables
/; in     => work vector telling vector in X used
/; beta1  => saves the beta
/; beta2  => saves the constant
/; e      => Step size - use a small number
/; ipass  => Number of times called. Be sure called enough times;
/;           one way to test this is to monitor progress in the
/;           correlation improvement
/; itype  => =0 OLS, =3 L1, =4 minimax (=1 MARS, =2 GAM not allowed)
/; iprint => =0 no printing, =1 print step data, =2 print correlations
/;
/; +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/;
cc=array(nocols(x):);
if(itype.eq.1.or.itype.eq.2)then;
call print('ERROR: ITYPE in boost3 cannot be 1 or 2':);
call stop;
endif;
if(ipass.eq.1)then;
yhat=y*0.0;
res=y;
endif;
do i=1,nocols(x);
cc(i)=ccf(res,x(,i));
enddo;
ij=imax(abs(cc));
in(ipass)=ij;
if(iprint.eq.2)then;
call print('Largest Correlation vector was ',ij:);
call tabulate(in,cc);
endif;
if(iprint.eq.0)then;
if(itype.eq.0)call olsq(res,x(,ij) );
/; if(itype.eq.1)call marspline(res,x(,ij) );
/; if(itype.eq.2)call gamfit(   res,x(,ij)[predictor,3] :noint);
if(itype.eq.3)call olsq(res,x(,ij) :l1);
if(itype.eq.4)call olsq(res,x(,ij) :minimax);
endif;
if(iprint.ne.0)then;
if(itype.eq.0)call olsq(res,x(,ij) :print);
/; if(itype.eq.1)call marspline(res,x(,ij) :print );
/; if(itype.eq.2)call gamfit(   res,x(,ij)[predictor,3] :noint :print);
if(itype.eq.3)call olsq(res,x(,ij) :print :l1);
if(itype.eq.4)call olsq(res,x(,ij) :print :minimax);
endif;
if(itype.eq.0)then;
beta1(ipass)=%coef(1);
beta2(ipass)=%coef(2);
endif;
if(itype.eq.3)then;
%yhat=%l1yhat;
beta1(ipass)=%l1coef(1);
beta2(ipass)=%l1coef(2);
endif;
if(itype.eq.4)then;
%yhat=%mmyhat;
beta1(ipass)=%mmcoef(1);
beta2(ipass)=%mmcoef(2);
endif;
yhat=yhat+ afam(e)*afam(%yhat);
res=y-yhat;
return;
end;

subroutine boost4(yhat,x,in,beta1,beta2,e);
/;
/; HHS boosting Forecasting for model estimated with boost3
/; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/; HHS Mods involve use of constant and no mean adjustment
/; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/;
/; yhat   => Forecasted y
/; x      => n by k matrix of right hand side variables
/; in     => work vector telling what vectors in X used
/; beta1  => Beta Coef
/; beta2  => Constant coef
/; e      => Step size used
/;
/; +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/;
yhat=array(norows(x):);
do i=1,norows(beta1);
add=(beta1(i)*x(,in(i)))+beta2(i);
yhat=yhat+ afam(e)*afam(add);
enddo;
return;
end;
_______________________________________________________________________
Table 10.17 Forecasting an OLS Boosting Model
________________________________________________________________________
b34sexec options ginclude('b34sdata.mac') member(efron_1); b34srun;
/; b34sexec options ginclude('b34sdata.mac') member(efron_2); b34srun;
/;
/; Implements OLS boosting as described in "Least Angle Regression"
/; by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
/; The Annals of Statistics Vol 32 No. 2 (April 2004) pp 407-451
/;
/; Modified boosting also available if iboost set = 1
/;
b34sexec matrix;
call loaddata;
call load(center  :staging);
call load(center2 :staging);
call load(boost   :staging);
call echooff;
/; itype=0 => ols
/; itype=1 => mars not ready
/; itype=2 => gam not ready
/; itype=3 => L1
/; itype=4 => MINIMAX
itype=0;
iprint=0;
e=.5;
ntry=1000;
x=catcol(age sex bmi bp s1 s2 s3 s4 s5 s6);
/; Gets the base
if(itype.eq.0)call olsq(y,x :print);
if(itype.eq.1)call marspline(y x :print :nk 20);
if(itype.eq.2)call gamfit(y x[predictor,3] :print);
if(itype.eq.3)call olsq(y,x :print :l1);
if(itype.eq.4)call olsq(y,x :print :minimax);
call print(ccf(%y,%yhat));
base1=ccf(%y,%yhat);
fit   =array(ntry:);
beta1 =array(ntry:);
beta2 =array(ntry:);
in    =idint(array(ntry:));
do i=1,ntry;
call boost3(y,yhat,res,x,in,beta1,beta2,e,i,itype,iprint);
fit(i)=ccf(y,yhat);
call outstring(1,2,'Iteration');
call outinteger(20,2,i);
call outstring(1,3,'Correlation');
call outdouble(20,3,fit(i));
enddo;
call tabulate(in,fit,beta1,beta2);
/; Testing if can "forecast" using saved Model
call boost4(yhat2,x,in,beta1,beta2,e);
call tabulate(yhat,yhat2);
base=array(norows(fit):)+base1;
if(itype.eq.0)call char1(cc1,'OLS'    );
if(itype.eq.1)call char1(cc1,'MARS'   );
if(itype.eq.2)call char1(cc1,'GAM'    );
if(itype.eq.3)call char1(cc1,'L1'     );
if(itype.eq.4)call char1(cc1,'MINIMAX');
cc=' HHS Boosting based Correlation of y and yhat given eps =';
call ir8tostr(e,cc2,'(f8.4)');
cc=catrow(cc1,cc,cc2);
call graph(base,fit :heading cc :file 'boost.wmf'
           :nocontact :nolabel :pgborder);
call print(fit);
b34srun;
Edited results from running the code in Table 10.17 are given next.
B34S 8.11D  (D:M:Y) 23/ 3/08 (H:M:S) 15:18: 1   Efron Diabeties Data   DATA STEP   PAGE 1

Variable   #   Cases  Mean          Std Deviation  Variance       Maximum       Minimum
AGE         1  442    48.51809955   13.10902782    171.8466104    79.00000000   19.00000000
SEX         2  442    1.468325792   0.4995611704   0.2495613630   2.000000000   1.000000000
BMI         3  442    26.37579186   4.418121561    19.51979812    42.20000000   18.00000000
BP          4  442    94.64701357   13.83128342    191.3044010    133.0000000   62.00000000
S1          5  442    189.1402715   34.60805168    1197.717241    301.0000000   97.00000000
S2          6  442    115.4391403   30.41308097    924.9554940    242.4000000   41.60000000
S3          7  442    49.78846154   12.93420215    167.2935854    99.00000000   22.00000000
S4          8  442    4.070248869   1.290449897    1.665260936    9.090000000   2.000000000
S5          9  442    4.641410860   0.5223905611   0.2728918983   6.107000000   3.258100000
S6         10  442    91.26018100   11.49633474    132.1657124    124.0000000   58.00000000
Y          11  442    152.1334842   77.09300453    5943.331348    346.0000000   25.00000000
CONSTANT   12  442    1.000000000   0.000000000    0.000000000    1.000000000   1.000000000

Number of observations in data file    442
Current missing variable code          1.000000000000000E+31

B34S(r) Matrix Command. d/m/y 23/ 3/08. h:m:s 15:18: 1.

=> CALL LOADDATA$
=> CALL LOAD(CENTER  :STAGING)$
=> CALL LOAD(CENTER2 :STAGING)$
=> CALL LOAD(BOOST   :STAGING)$
=> SUBROUTINE BOOST3(Y,YHAT,RES,X,IN,BETA1,BETA2,E,IPASS,ITYPE,IPRINT)$
=> SUBROUTINE BOOST4(YHAT,X,IN,BETA1,BETA2,E)$
=> CALL ECHOOFF$

Ordinary Least Squares Estimation
Dependent variable                    Y
Centered R**2                         0.5177484222203494
Adjusted R**2                         0.5065592904853228
Residual Sum of Squares               1263985.785633344
Residual Variance                     2932.681637200335
Standard Error                        54.15423932805570
Total Sum of Squares                  2621009.124434389
Log Likelihood                        -2385.992862123519
Mean of the Dependent Variable        152.1334841628959
Std. Error of Dependent Variable      77.09300453299109
Sum Absolute Residuals                19128.63379518937
F(10,  431)                           46.27243958524313
F Significance                        1.000000000000000
1/Condition XPX                       1.170314752960385E-08
Maximum Absolute Residual             155.8267661215634
Number of Observations                442

Variable   Lag  Coefficient        SE           t
Col____1   0    -0.36361224E-01   0.21704144   -0.16753126
Col____2   0    -22.859648        5.8358213    -3.9171261
Col____3   0     5.6029621        0.71710550    7.8133023
Col____4   0     1.1168080        0.22523817    4.9583425
Col____5   0    -1.0899963        0.57333186   -1.9011613
Col____6   0     0.74645046       0.53083439    1.4061833
Col____7   0     0.37200472       0.78246385    0.47542735
Col____8   0     6.5338319        5.9586378     1.0965311
Col____9   0     68.483125        15.669719     4.3704117
Col___10   0     0.28011699       0.27331395    1.0248909
CONSTANT   0    -334.56714        67.454621    -4.9598846

0.71954737
This illustrates the fit as more and more iterations are performed. Fit approaches the OLS fit of
.71954737.
Obs    IN    FIT       BETA1          BETA2
   1    3    0.5865     10.23         -117.8
   2    9    0.6765     64.20         -221.9
   3    4    0.6920      1.337         -88.55
   4    7    0.7004     -1.067          72.16
   5    9    0.7003     19.87          -82.71
   6    2    0.7066    -13.63           24.77
   7    3    0.7058      1.566         -38.93
   8    4    0.7065      0.3923        -35.94
   9    7    0.7073     -0.3380         17.42
  10    2    0.7102    -10.40           15.56
  11    5    0.7122     -0.1071         20.40
  12   10    0.7126      0.3346        -30.46
  13    2    0.7136     -5.868           8.653
  14    7    0.7138     -0.2400         11.97
  15    4    0.7139      0.1968        -18.62
  16    6    0.7149     -0.9124E-01     10.54
  17    2    0.7154     -4.372           6.422
  18    9    0.7154      4.479         -20.79
  19    5    0.7159     -0.5773E-01     10.142
  20    9    0.7159      3.225         -14.97

(iterations 21 - 990 omitted)

 991    6    0.7194      0.2122E-02     -0.2450
 992    5    0.7194     -0.2006E-02      0.3795
 993    6    0.7194      0.2085E-02     -0.2407
 994    7    0.7194      0.4979E-02     -0.2479
 995    5    0.7194     -0.1873E-02      0.3542
 996    9    0.7194      0.1560         -0.7241
 997    1    0.7194     -0.4288E-02      0.2080
 998    6    0.7194      0.1982E-02     -0.2288
 999    5    0.7194     -0.2113E-02      0.3996
1000    6    0.7194      0.2069E-02     -0.2388
This tests BOOST4 since YHAT = YHAT2:

Obs   YHAT     YHAT2      Obs   YHAT     YHAT2
  1   205.9    205.9       21   119.7    119.7
  2   68.40    68.40       22   87.21    87.21
  3   176.6    176.6       23   114.5    114.5
  4   166.0    166.0       24   257.9    257.9
  5   128.2    128.2       25   165.0    165.0
  6   106.1    106.1       26   147.1    147.1
  7   75.28    75.28       27   96.66    96.66
  8   119.9    119.9       28   178.9    178.9
  9   159.5    159.5       29   129.1    129.1
 10   213.9    213.9       30   184.4    184.4
 11   97.98    97.98       31   159.1    159.1
 12   97.49    97.49       32   69.51    69.51
 13   114.9    114.9       33   259.6    259.6
 14   164.0    164.0       34   110.6    110.6
 15   102.6    102.6       35   79.00    79.00
 16   175.8    175.8       36   86.36    86.36
 17   210.14   210.14      37   207.6    207.6
 18   182.8    182.8       38   157.0    157.0
 19   147.6    147.6       39   241.2    241.2
 20   123.6    123.6       40   137.0    137.0
The next set of examples uses the QR approach to estimate datasets that are known to be difficult to estimate and for which test answers are available. Table 10.18 lists the commands to load the data from Wampler (1970). Five problems are run, each with a different left-hand side. The object is to show the accuracy differences that result.

$y_1 = 1 + x + x^2 + x^3 + x^4 + x^5$                                      (10.5-1)

$y_2 = 1 + .1x + .01x^2 + .001x^3 + .0001x^4 + .00001x^5$                  (10.5-2)

$y_3 = y_1 + \delta$                                                        (10.5-3)

$y_4 = y_1 + 100\delta$                                                     (10.5-4)

$y_5 = y_1 + 10000\delta$                                                   (10.5-5)

For all problems except (10.5-2) the right-hand side is a constant plus $x, x^2, \dots, x^5$. Problem (10.5-2) isolates the effect of multicollinearity from the effect of larger and larger values on the right-hand side: the variables are scaled as $(.1)x, (.1)^2 x^2, \dots, (.1)^5 x^5$ and the left-hand side is adjusted so that for all 5 models the estimated coefficients should be 1. It is hypothesized that the estimated coefficients from estimating (10.5-2) will be closer to 1. than those from (10.5-1), an effect due entirely to scaling.
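A back-of-the-envelope calculation shows why the unscaled problem is hard: with $x \in \{0, 1, \dots, 20\}$ the cross-product matrix $X'X$ contains elements ranging from $n = 21$ up to $\sum_x x^{10} \approx 2.4 \times 10^{13}$, so the columns of $X$ differ enormously in scale. The AX scaling in (10.5-2) compresses this range by powers of ten and thereby improves the conditioning of the factorization.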
Table 10.18 Wampler Test Problem
==WAMPLER
b34sexec data heading('Data from Wampler JASA June 1970')
filef=@@$
* Data from Roy Wampler JASA June 1970 Vol 65, No. 330, pp. 549-564$
input x1 delta$
build y1 y2 y3 y4 y5 x2 x3 x4 x5 ax1 ax2 ax3 ax4 ax5$
* y1 = 1 + x1 + x1**2 + x1**3 + x1**4 + x1**5 $
* y2 = 1 + .1*x1 +.01*x1**2 +.001*x1**3 + .0001*x1**4 + .00001*x1**5$
* y3 = y1 + delta $
* y4 = y1 + 100*delta $
* y5 = y1 + 10000*delta $
gen x2= x1*x1 $ gen x3 = x2*x1$ gen x4 = x3*x1$ gen x5 = x4*x1$
gen ax1=.1*x1 $ gen ax2= .01*x2$ gen ax3=.001*x3$ gen ax4=.0001*x4$
gen ax5=.00001*x5 $
gen y1 = 1 + x1 + x1**2 + x1**3 + x1**4 + x1**5$
gen y2 = 1 + .1*x1 +.01*x1**2 +.001*x1**3 + .0001*x1**4 + .00001*x1**5$
gen y3 = y1 + delta$
gen y4 = y1 + 100. *delta $
gen y5 = y1 + 10000.*delta $
datacards$
 0.   759.   1. -2048.   2.  2048.   3. -2048.   4.  2523.   5. -2048.
 6.  2048.   7. -2048.   8.  1838.   9. -2048.  10.  2048.  11. -2048.
12.  1838.  13. -2048.  14.  2048.  15. -2048.  16.  2523.  17. -2048.
18.  2048.  19. -2048.  20.   759.
b34sreturn$
b34seend$
b34sexec list$ var x1 delta$ b34seend$
b34sexec qr $ model y1= x1 x2 x3 x4 x5 $ b34seend$
b34sexec qr $ model y2=ax1 ax2 ax3 ax4 ax5 $ b34seend$
b34sexec qr $ model y3= x1 x2 x3 x4 x5 $ b34seend$
b34sexec qr $ model y4= x1 x2 x3 x4 x5 $ b34seend$
b34sexec qr $ model y5= x1 x2 x3 x4 x5 $ b34seend$
==
Equations (10.5-1) and (10.5-2) are perfect fits and all estimated coefficients should be 1.0 since the
parameters {.1, .01, .001, .0001, .00001} have been built into the variables {AX1,...,AX5}.
Equations (10.5-3) - (10.5-5) are not perfect fits and have increasing amounts of noise. X is defined
as {0,1,...,20}. Edited output from running this test problem on a Gateway P5-90 is given below. On
the IBM 390 slightly different answers will be found as shown in Stokes (1991, 246).
OLS using QR decomposition for Y = Y1   X- 3
Number of observations     =  21
Adjusted R square          =  1.00000000000000
Standard Error of Estimate =  0.763648257769776D-010
Sum of Squared Residuals   =  0.874737992392221D-019

Variable        Coefficient             Standard Error             t value
X5       X-11   1.00000000000001350     0.363438173143536890E-14   275149963293776.470
X4       X-10   0.999999999999290570    0.182671077017781300E-12   5474320381337.38090
X3       X- 9   1.00000000001383780     0.328334037806728010E-11   304567874440.871150
X2       X- 8   0.999999999878483870    0.252164302854411900E-10   39656683700.2237400
CONSTANT X-17   0.999999999453480280    0.696406437713641010E-10   14359430718.8853910
X1       X- 1   1.00000000045903460     0.764750532765802950E-10   13076159579.0254170

OLS using QR decomposition for Y = Y2   X- 4
Number of observations     =  21
Adjusted R square          =  1.00000000000000
Standard Error of Estimate =  0.160405034435521D-014
Sum of Squared Residuals   =  0.385946626083912D-028

Variable        Coefficient             Standard Error             t value
AX5      X-16   0.999999999999994780    0.763405299300070910E-14   130992017073610.310
CONSTANT X-17   0.999999999999995890    0.146280826919992380E-14   683616589443359.750
AX3      X-14   0.999999999999980570    0.689668733017904870E-13   14499714893904.2730
AX1      X-12   1.00000000000001440     0.160636568334661690E-13   62252325878667.1480
AX4      X-15   1.00000000000002040     0.383702314531362890E-13   26061870416950.0860
AX2      X-13   0.999999999999993560    0.529673488693598360E-13   18879555449649.2110

OLS using QR decomposition for Y = Y3   X- 5
Number of observations     =  21
Adjusted R square          =  0.999994078701093
Standard Error of Estimate =  2360.14502379267
Sum of Squared Residuals   =  83554267.9999999

Variable        Coefficient             Standard Error             t value
X5       X-11   1.00000000000001620     0.112324854679314690       8.90274910975845120
X4       X-10   0.999999999999141240    5.64566512170766810        0.177127048530407170
X3       X- 9   1.00000000001652990     101.475507550352620        0.985459471114569390E-02
X2       X- 8   0.999999999858289690    779.343524331607910        0.128313121061206540E-02
CONSTANT X-17   0.999999999425811750    2152.32624678173580        0.464613578411293800E-03
X1       X- 1   1.00000000051438680     2363.55173469688360        0.423092071916349630E-03

OLS using QR decomposition for Y = Y4   X- 6
Number of observations     =  21
Adjusted R square          =  0.943304587767550
Standard Error of Estimate =  236014.502379268
Sum of Squared Residuals   =  835542680000.000

Variable        Coefficient             Standard Error             t value
X5       X-11   1.00000000000030930     11.2324854679314790        0.890274910976105170E-01
X4       X-10   0.999999999984220960    564.566512170767280        0.177127048527764240E-02
X3       X- 9   1.00000000028554030     10147.5507550352700        0.985459471379667420E-04
X2       X- 8   0.999999997837600070    77934.3524331608500        0.128313120801925440E-04
CONSTANT X-17   0.999999996673739290    215232.624678173770        0.464613577132643180E-05
X1       X- 1   1.00000000606312800     236355.173469688570        0.423092074263977630E-05

OLS using QR decomposition for Y = Y5   X- 7
Number of observations     =  21
Adjusted R square          = -0.330337747712334
Standard Error of Estimate =  23601450.2379268
Sum of Squared Residuals   =  0.835542680000000D+016

Variable        Coefficient             Standard Error             t value
X5       X-11   1.00000000002976020     1123.24854679314760        0.890274911002324830E-03
X4       X-10   0.999999998484693230    56456.6512170767240        0.177127048262157320E-04
X3       X- 9   1.00000002728896820     1014755.07550352710        0.985459497990451150E-06
X2       X- 8   0.999999795397397810    7793435.24331608510        0.128313094826191270E-06
CONSTANT X-17   0.999999721282435080    21523262.4678173740        0.464613449182103880E-07
X1       X- 1   1.00000056024505920     23635517.3469688560        0.423092308733959070E-07
Since they are exact fits, the models for Y1 and Y2 show 1.0 as the adjusted R². The model
for Y1 shows approximately 11 to 12 correct digits, while the model for Y2 shows approximately 14
correct digits. This is because in the model for Y2 the higher-power X terms are scaled, making
them smaller and thus making it easier to factor X. This problem, as noted, has been designed to
show the effects of matrix scaling when the answers are known. The models for Y3, Y4 and Y5 are
not exact. As the error is gradually increased, the adjusted R² falls from .99 to .94 to -.33 and the
round-off error increases with the same X matrix. It is a common error to assume that only rank
problems in X can cause problems. These examples show the effect of changes in the left-hand-side
vector on the accuracy of the solution; the example illustrates a signal-to-noise-ratio effect. Note
that the SEs of the last three models have the same digits but appear to have been scaled. How could
this be so? From equation (2.1-11) we note that the SEs are the square roots of the diagonal elements
of $\hat\sigma^2 (X'X)^{-1}$. In all three problems $(X'X)^{-1}$ is unchanged, although increasingly more noise is
added in the form of $\delta$, $100\delta$ and $10000\delta$. The added noise causes $\hat\sigma$ to increase by factors of 100
and 10,000, respectively.
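Written out, this is just (2.1-11) again, with the digits taken from the Y3-Y5 output above:

$SE(\hat\beta_i) = \hat\sigma \sqrt{[(X'X)^{-1}]_{ii}}$

Replacing $\delta$ with $100\delta$ leaves $[(X'X)^{-1}]_{ii}$ unchanged and multiplies $\hat\sigma$, and hence every $SE(\hat\beta_i)$, by 100. The X5 standard error accordingly moves from .112325 for Y3 to 11.2325 for Y4 to 1123.25 for Y5 with its digits preserved.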
The second problem to be studied is the Longley (1967) dataset, which is given in Table
10.19. Three types of problems are run: the qr command, which obtains $\hat\beta_i$ from
equation (4.1-6); the usual regression command, in which the constant is a vector of 1's in the
data set and $\hat\beta_i$ is obtained from equation (2.1-3); and the deviations-from-the-means approach,
which obtains $\hat\beta_i$ from equation (2.1-5) for all variables except the constant and from equation (2.1-6)
for the constant. The deviations-from-the-means estimation strategy is implemented by rereading
the data back into B34S from unit 8 (with the UNIT=8 option) and specifying the noconstant
option on the second data command. In the subsequent regression command the noint option is
used because the deviations-from-the-means approach obtains an estimate of the constant from
equation (2.1-6). The deviations-from-the-means option is rarely used, since the usual approach is
quite accurate; its use here is to study the effect of multicollinearity on estimation. Although
Table 10.19 lists all the Longley data, only results for Y1 are given. The reader is encouraged to
run the other problems.
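In code the deviations-from-the-means strategy amounts to only a few lines. The Matlab sketch below is an illustration (the names X and y are assumed, with X holding the regressors without a column of 1's); the last line restates the standard constant-recovery result referenced above as equation (2.1-6).

n  = size(X,1);
Xd = X - repmat(mean(X),n,1);   % deviations from the column means
yd = y - mean(y);
bd = Xd\yd;                     % slope estimates; no intercept is fit
b0 = mean(y) - mean(X)*bd;      % constant recovered from the sample means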
Table 10.19 Longley Test Data
______________________________________________________________
==LONGLEY
/$longley
/$ this problem is discussed in chapter 10
b34sexec data noob=16 nohead heading=('longley data')$
input y1 x1 x2 x3 x4 x5 x6 y2 y3 y4 y5 y6 y7 y8 $
* Data from longley jasa september 1967 $
* All equations contain variables x1-x7 $
datacards$
60323. 83.0 234289. 2356. 1590. 107608. 1947.
8256. 6045. 427. 1714. 38407. 1892. 3582.
61122. 88.5 259426. 2325. 1456. 108632. 1948.
7960. 6139. 401. 1731. 39241. 1863. 3787.
60171. 88.2 258054. 3682. 1616. 109773. 1949.
8017. 6208. 396. 1772. 37922. 1908. 3948.
61187. 89.5 284599. 3351. 1650. 110929. 1950.
7497. 6069. 404. 1995. 39196. 1928. 4098.
63221. 96.2 328975. 2099. 3099. 112075. 1951.
7048. 5869. 400. 2055. 41460. 2302. 4087.
63639. 98.1 346999. 1932. 3594. 113270. 1952.
6792. 5670. 431. 1922. 42216. 2420. 4188.
64989. 99.0 365385. 1870. 3547. 115094. 1953.
6555. 5794. 423. 1985. 43587. 2305. 4340.
63761 100.0 363112. 3578. 3350. 116219. 1954.
6495. 5880. 445. 1919. 42271. 2188. 4563.
66019 101.2 397469. 2904. 3048. 117388. 1955.
6718. 5886. 524. 2216. 43761. 2187. 4727.
67857 104.6 419180. 2822. 2857. 118734. 1956.
6572. 5936. 581. 2359. 45131. 2209. 5069.
68169 108.4 442769. 2936. 2798. 120445. 1957.
6222. 6089. 626. 2328. 45278. 2217. 5409.
66513 110.8 444546. 4681. 2637. 121950. 1958.
5844. 6185. 605. 2456. 43530. 2191. 5702.
68655 112.6 482704. 3813. 2552. 123366. 1959.
5836. 6298. 597. 2520. 45214. 2233. 5957.
69564 114.2 502601. 3931. 2514. 125368. 1960.
5723. 6367. 615. 2489. 45850. 2270. 6250.
69331 115.7 518173. 4806. 2572. 127852. 1961.
5463. 6388. 662. 2594. 45397. 2279. 6548.
70551 116.9 554894. 4007. 2827. 130081. 1962.
5190. 6271. 623. 2626. 46652. 2340. 6849.
b34sreturn$
b34seend$
b34sexec qr ipcc=pcreg $ model y1 = x1 x2 x3 x4 x5 x6 $
title=('Regression on Total')$ b34seend$
b34sexec regression manydigits toll=.1e-08$
comment=('Regression on Total')$
model y1=x1 x2 x3 x4 x5 x6 $
b34seend$
b34sexec data noob=16 nohead noconstant unit=08 filef=dp
heading=('no constant on input')$
input y1 x1 x2 x3 x4 x5 x6 y2 y3 y4 y5 y6 y7 y8 $
b34seend$
b34sexec regression manydigits toll=.1e-08 noint$
comment=('Regression on Total')$
model y1=x1 x2 x3 x4 x5 x6 $
b34seend$
==
_____________________________________________________________
Edited output from running the problem in Table 10.19 on a Gateway 2000 Intel P5-90 is given
next. The manydigits option has been specified on the regression command to write the
coefficients with 16 digits of accuracy. In addition, the usual coefficient output is given. The
models run are QR, regression on levels and regression on deviations from the means.
QR option version 1 July 1996
Comments                 regression on total
Of 35000 Real*8 space, 144 is being used.

OLS using QR decomposition for Y = Y1
Number of observations     =  16
Adjusted R square          =  0.992465007628829
Standard Error of Estimate =  304.854073561898
Sum of Squared Residuals   =  836424.055505548

Variable   Coefficient                Standard Error             t value
X2         -0.358191792926104010E-01  0.334910077722366880E-01   -1.06951631722183400
X5         -0.511041056535210610E-01  0.226073200069318770       -0.226051144663991450
X3         -2.02022980381710490       0.488399681651600530       -4.13642735594212410
X4         -1.03322686717371810       0.214274163161636330       -4.82198531044692570
X6          1829.15146461373640       455.478499142097410         4.01588981271120100
X1          15.0618722714605970       84.9149257747525470         0.177376028231056780
CONSTANT   -3482258.63459618620       890420.383607150170        -3.91080291815572870

Singular values of X
  1663668.227889470    83899.57794622086    3407.197376095865    1582.643681003795
  41.69360109707301    3.648093794807085    0.3423709062101756E-03

 PC Regression Coef.     t val. PC Reg. Coef.
 -46093.11125516977      -151.1972948782361
   1192.224208619154        3.910802944796743
   2880.890760850634        9.450064836564154
   1604.488608619055        5.263136522573899
   1776.626531885898        5.827793314774812
-257497.7072024466       -844.6589025164000
   -210.6907153931316      -0.6911198952713113

Problem Number           1
Subproblem Number        1
F to enter               0.99999998E-02
F to remove              0.49999999E-02
Tolerance                0.99999997E-09
Maximum no of steps      7
Dependent variable X( 1).  Variable Name Y1
Standard Error of Y =    3511.9683 for 15. degrees of freedom
.............
Step Number 7            Variable Entering  3
Multiple R               0.997737
Std Error of Y.X         304.854
R Square                 0.995479
Adjusted R Square        0.992465
-2 * ln(Maximum of Likelihood Function)    219.234869071137
Akaike Information Criterion (AIC)         235.234869071137
Scwartz Information Criterion (SIC)        241.415578849055
Residual Variance        92936.0029969030

Analysis of Variance for reduction in SS due to variable entering
Source           DF   SS            MS            F        F Sig.
Due Regression    6   0.18417E+09   0.30695E+08   330.29   1.000000
Dev. from Reg.    9   0.83642E+06   92936.
Total            15   0.18501E+09   0.12334E+08

Multiple Regression Equation    Y1 =
Variable   Coefficient       Std. Error       T Val.    T Sig.    P. Cor.   Elasticity
X1         15.06187          84.91492          0.1774   0.13686    0.0590    0.2345E-01
X2         -0.3581918E-01    0.3349101E-01    -1.070    0.68732   -0.3358   -0.2126
X3         -2.020230         0.4883997        -4.136    0.99746   -0.8095   -0.9877E-01
X4         -1.033227         0.2142742        -4.822    0.99906   -0.8491   -0.4123E-01
X5         -0.5110410E-01    0.2260732        -0.2261   0.17379   -0.0751   -0.9187E-01
X6         1829.152          455.4785          4.016    0.99696    0.8011   54.73
CONSTANT   -3482259.         890420.4         -3.911    0.99644

Extended precision
Variable   Coefficient                SE
X1         15.06187367747587          84.91492341168758
X2         -0.3581918143189604E-01    0.3349100704795319E-01
X3         -2.020229835731279         0.4883996712397694
X4         -1.033226876342277         0.2142741577415063
X5         -0.5110409828822657E-01    0.2260731940689374
X6         1829.151500000934          455.4784907402389
CONSTANT   -3482258.703813081         890420.3671920912

Order of entrance (or deletion) of the variables =   6  4  2  7 15  5  3
Estimate of computational error in coefficients =
  1  0.4144E-01    2 -323.9    3 -0.3364    4 -0.6822E-01    7 -0.1038
***************
Problem Number           2
Subproblem Number        1
F to enter               0.99999998E-02
F to remove              0.49999999E-02
Tolerance                0.99999997E-09
Maximum no of steps      6
Dependent variable X( 1).  Variable Name Y1
Standard Error of Y =    3511.9683 for 15. degrees of freedom
.............
Step Number 6            Variable Entering  2
Multiple R               0.997737
Std Error of Y.X         304.854
R Square                 0.995479
Adjusted R Square        0.992465
-2 * ln(Maximum of Likelihood Function)    219.234869616679
Akaike Information Criterion (AIC)         235.234869616679
Scwartz Information Criterion (SIC)        241.415579394597
Residual Variance        92936.0061656852

Analysis of Variance for reduction in SS due to variable entering
Source           DF   SS            MS            F        F Sig.
Due Regression    6   0.18417E+09   0.30695E+08   330.29   1.000000
Dev. from Reg.    9   0.83642E+06   92936.
Total            15   0.18501E+09   0.12334E+08

Multiple Regression Equation    Y1 =
Variable   Coefficient       Std. Error       T Val.    T Sig.    P. Cor.   Elasticity
X1         15.06187          84.91492          0.1774   0.13686    0.0590    0.2345E-01
X2         -0.3581918E-01    0.3349101E-01    -1.070    0.68732   -0.3358   -0.2126
X3         -2.020230         0.4883997        -4.136    0.99746   -0.8095   -0.9877E-01
X4         -1.033227         0.2142742        -4.822    0.99906   -0.8491   -0.4123E-01
X5         -0.5110411E-01    0.2260732        -0.2261   0.17379   -0.0751   -0.9187E-01
X6         1829.151          455.4785          4.016    0.99696    0.8011   54.73
CONST.     -3482259.         890420.4         -3.911    0.99644

Extended precision for constant and SE
Variable   Coefficient                SE
X1         15.06187230978256          84.91492337552916
X2         -0.3581917930289365E-01    0.3349100682722105E-01
X3         -2.020229803946594         0.4883996678667511
X4         -1.033226867209468         0.2142741571071722
X5         -0.5110410558489326E-01    0.2260731936835627
X6         1829.151464660804          455.4784862720230
CONSTANT   -3482258.634718623         890420.3882199302

Order of entrance (or deletion) of the variables =   3  4  5  7  6  2
Estimate of computational error in coefficients =
  1 -0.2957E-09   2  0.1304E-04   3  0.1019E-07   4  0.1905E-08   5 -0.4573E-06   6 -0.2079E-09
Table 10.20 lists the estimated coefficients for the three approaches and also lists results
obtained by Beaton, Rubin, and Barone (1976), using DORTHO. Careful readers will note that there
are slight differences between the reported coefficients in columns QR, LEVELS and DMEANS,
which were run with B34S on IBM/MVS, and the corresponding values obtained in the output given
above. The differences are due to the fact that the Intel P5-90 provides full IEEE accuracy, while
the IBM provides less accuracy. DORTHO uses the classical Gram-Schmidt approach, which
Beaton, Rubin and Barone assure us "agree with Longley's hand-calculated solution in every
published place."15 The B34S QR results on raw, unscaled data are very close to these results. The
deviations-from-the-means output is next best in accuracy, while the levels output is least accurate.

15 Since most economic data only have a few significant digits, reporting regression coefficients to
16 places makes no sense. However, the study of all 16 digits does make sense when testing program
calculation accuracy. When performing these tests, an implicit assumption is that we have at least 16
significant digits of accuracy for the input data. If we have n significant digits, where n < 16,
reporting 16 digits of accuracy in the answer implicitly assumes that the last 16-n digits in the raw
data were 0. As will be shown in section 16.7, this does not mean that calculation should not be
carried out with a high degree of accuracy. Experiments with VPA math will show that the precision
selected when the initial data are read into a floating-point number may "doom" any subsequent
calculations even if the accuracy is boosted for calculation. The implication is that data read into
real*4 in memory may cause problems with subsequent calculations, even if the data have been moved
to real*8, because of the accuracy loss of the original data read.
Unless the TOLL parameter is set to a very low number, the B34S regression command will not
attempt the problem. The regression command's measure of accuracy, based on the Faddeeva (1959)
algorithm, on the IBM/MVS shows the absolute values of the tests ranging from .04144 to 1619 for
output using the level data. The corresponding range for the P5-90 is .04144 to 323.9, which is
substantially lower. These errors are orders of magnitude larger than what is usually found for
problems such as those reported in Chapter 2. Nevertheless, about eight digits are correct and, if the
usual output is used, there would be no way to tell any difference from the DORTHO results. The
regression results from the deviations-from-the-means approach are more accurate: the error estimate
values range from .5206E-09 to .8714E-05 in absolute value on the IBM, with comparable values on
the P5-90. The PC regression output shows the singular values of X. These range from .1664E+07 to
.34237D-03, which is a large spread. Equation (10.2-9) shows how this spread magnifies the
variance of the coefficients.
Section 16.7 introduces the topic of accuracy from the point of view of an exhaustive
investigation of the gains from calculation in real*16, how data is read (as characters or as real numbers)
and the advantages of variable precision arithmetic (VPA), whereby ~1750 digits of accuracy are
possible in the current implementation of B34S. Since these approaches involve extensive use of the
matrix command, their discussion has been put off until later. Table 2.9 in Chapter 2 illustrates use
of the QR procedure to measure leverage as suggested by Davidson and MacKinnon (1993, 37-39).
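As an aside, leverage falls out of the QR factorization almost for free. The minimal Matlab sketch below is an illustration, not the code of Table 2.9: since the hat matrix $X(X'X)^{-1}X'$ equals $QQ'$ when $X = QR$, the leverage of observation i is the squared norm of row i of Q.

[Q,R] = qr(X,0);      % economy-size QR of the data matrix
h     = sum(Q.^2,2);  % h(i) = i-th diagonal of the hat matrix QQ'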
Table 10.20 Results from the Longley Total Equation
───────────────────────────────────────────────────────────────────────────────────────
Variable  DORTHO             QR                    LEVELS              DMEANS
───────────────────────────────────────────────────────────────────────────────────────
Constant  -3482258.634597    -3482258.63459606981  -3482258.656344591  -3482258.634727585
X1        15.061872271761    15.0618722714175008   15.06187281288034   15.06187235777704
X2        -.035819179293     -.035819179292592743  -.0358191799858490  -.035819179315730
X3        -2.020229803818    -2.02022980381681849  -2.020229814090662  -2.02022980410821
X4        -1.033226867174    -1.03322686717366818  -1.033226870110064  -1.03322686725428
X5        -0.051104105653    -0.05110410565364853  -.0511041031599990  -.051104105499356
X6        1829.151464614112  1829.15146461368289   1829.151475721324   1829.15146471942
───────────────────────────────────────────────────────────────────────────────────────
DORTHO reports results from Beaton, Rubin, and Barone (1976). QR reports results from the
B34S QR command. LEVELS reports the regression command where data are levels.
DMEANS reports results from the regression command where the data are deviations from
the means. Columns QR, LEVELS and DMEANS were run with B34S on IBM/MVS.
One of the key assumptions of OLS is that the correlation between the error term and the
right-hand sides of the equation is forced to be zero or near zero. At issue is how sensitive this
calculation is to the problem at hand, the data precision and the method of calculation. The next two
examples investigate the relationship between the precision of the data (real*4, real*8, real*16 and
VPA), the method of solution (Cholesky vs QR) and the difficulty of the problem. More detail on
this issue is contained in Stokes (2005). The first problem involves a very simple test case based on
the Grunfeld data that does not involve right-hand-side variables of different sizes. The second
example uses a three-lag model of the gas data. A summary of the findings is given in Table 10.21.
Using the simple dataset and real*8 data, the correlation for the Cholesky vs QR approach is between
xE-14 and xE-15, where x is some number. The corresponding numbers for real*16 are xE-33 and
xE-34. Variable precision math (VPA) is discussed in some detail in Chapter 16; using default
settings, the correlation was xM-62. Using real*4, the corresponding correlations were xE-05 to xE-06.
For the somewhat more complex dataset a different pattern emerges. For real*8 the
correlations increase for all cases, with the QR showing substantially less correlation. What is
surprising is the relatively poor real*4 showing, with correlations in the area of xE-01. Use of the QR
reduces these to xE-05. The implication of this study is that any calculation involving real*4 data is
very prone to error and should be avoided. However, if real*4 data are used, it is highly
recommended that the QR method be used. Even with real*16 data, gains from using the QR are
shown. The jobs that made these calculations, and the estimated coefficients, are shown next. While
the estimated coefficients are not markedly different, except for the real*4 cases, and thus can mask
calculation problems, when the error is correlated with the right-hand side the accuracy of the
calculation is more rigorously tested.
Table 10.21 Correlation between the Error Term and the Right-Hand Side
______________________________________________________________________________________________________
Grunfeld Example
               GEF             GEC
Real*8  Chol   -0.356987E-14   -0.260707E-14
Real*8  QR      0.271906E-15    0.333125E-15
Real*16 Chol   -0.852072E-33   -0.188439E-33
Real*16 QR     -0.912934E-34   -0.263815E-33
Real*4  LU      0.171420E-05    0.126598E-06
Real*4  QR      0.216786E-06   -0.236300E-06
VPA            -.332843M-62    -.179417M-62
______________________________________________________________________________________________________
Gas Data of Lag Order 3
               GASOUT(t-1)     GASOUT(t-2)     GASOUT(t-3)     GASIN(t-1)      GASIN(t-2)      GASIN(t-3)
Real*8  Chol   -0.119284E-10    0.868642E-11   -0.430643E-11   -0.434085E-13    0.455703E-13   -0.123613E-13
Real*8  QR      0.131642E-13    0.120757E-13    0.106601E-13   -0.596223E-14   -0.858368E-14   -0.108972E-13
Real*16 Chol   -0.132839E-28    0.747619E-29   -0.214679E-29    0.515527E-32    0.797312E-32    0.441218E-32
Real*16 QR     -0.137433E-31   -0.126724E-31   -0.112671E-31    0.584987E-32    0.889735E-32    0.117243E-31
Real*4  LU     -0.895673E-01   -0.918947E-01   -0.891362E-01    0.369165E-01    0.452364E-01    0.557397E-01
Real*4  QR     -0.144533E-05   -0.151427E-05   -0.153826E-05    0.124983E-06    0.420500E-06    0.724758E-06
VPA             .957024M-59     .796435M-59     .628011M-59    -.691398M-59    -.839960M-59    -.963458M-59
______________________________________________________________________________________________________
For data descriptions see text. All calculations made with the C10H and C10I programs. The Cholesky
calculations were made with various precisions of the LINPACK Cholesky routines. The QR results were
run with the LINPACK QR code, except for the real*4 function call to qr( ), which used LAPACK. The
VPA results were done with LINPACK routines that were converted to run VPA math. The real*8 LU and
real*4 LU calculations were made with the LINPACK routines DGECO/DGEDI and SGECO/SGEDI,
respectively. Runs were made with b34slf95.exe. Real*4 results differ for b34sia32.exe.
This first test case, whose commands are shown in Table 10.22, runs the simple Grunfeld GE
equation.
Table 10.22 Effect of Data Precision on Accuracy using the Grunfeld Data
==CH10F
Effect of Data Precision on Accuracy
/;
/; Charles Renfro Test Regression
/;
b34sexec options ginclude('b34sdata.mac') member(grunfeld_4);
b34srun;
b34sexec matrix;
call loaddata;
call echooff;
/;
/; Set parameters for tests
/;
/;
call print(' ':);
call print('GE Equation':);
call print('-----------':);
call print(' ':);
call print(' ':);
call load(cov :staging);
call load(cor :staging);
call echooff;
/;
dovpa=1;
call olsq(gei gef gec :print :savex);
call print('Tests on Real*8 Cholesky':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
call olsq(gei gef gec :print :savex :qr);
call print('Tests on Real*8 QR':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
%yhold=%y;
%xhold=%x;
%ytest=r8tor16(%yhold);
%xtest=r8tor16(%xhold);
call olsq(%ytest %xtest :print :noint :savex);
call print('Tests on Real*16 Cholesky':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
call olsq(%ytest %xtest :print :noint :qr :savex);
call print('Tests on Real*16 QR':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
/;
/; This example shows how important the method is when the data is not
/; saved with enough precision
call print('Real*4 Tests ':);
call print('_________ ':);
%ytest=r8tor4(%yhold);
%xtest=r8tor4(%xhold);
beta = inv(transpose(%xtest)*%xtest)*transpose(%xtest)*%ytest;
call print('beta from Real*4 using inverse ':);
call print(beta);
i=nocols(%xtest);
%res=%xtest*beta-%ytest;
test=%xtest;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
/; QR approach
r4_r=qr(%xtest,r4_q);
/; r8_r=qr(r4tor8(%xtest),r8_q);
/; call print('R looked at from real 4 to real 8 and from real 4',
/;            r8_r,r4_r);
r4_yhat=r4_q*transpose(r4_q)*%ytest;
r4_res =%ytest-r4_yhat;
/;
/; Beta not needed to get yhat from QR !!
/;
r4_beta=inv(r4_r)*transpose(r4_q)*%ytest;
call print('beta from Real*4 using QR ':);
call print(r4_beta);
i=nocols(%xtest);
test=%xtest;
test(1,i)=r4_res;
call print('Last Column is the residual':);
call print(cor(test));
if(dovpa.ne.0)then;
call print('VPA Tests ':);
call print('_________ ':);
%ytest=vpa(%yhold);
%xtest=vpa(%xhold);
beta = inv(transpose(%xtest)*%xtest)*transpose(%xtest)*%ytest;
call print('beta from VPA ':);
call print(beta);
i=nocols(%xtest);
%res=%xtest*beta-%ytest;
test=%xtest;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
endif;
b34srun;
==
Edited output produces:
B34S 8.11C   (D:M:Y) 19/ 3/07 (H:M:S) 8: 8:20   DATA STEP   Grunfeld Investment Study   PAGE 1

Variable   # Cases   Mean      Std. Dev.  Variance   Maximum    Minimum    Label
YEAR         20      1944.50   5.91608    35.0000    1954.00    1935.00    Year
GMI          20      608.020   309.575    95836.5    1486.70    257.700    General Motors gross investment
GMF          20      4333.85   904.305    817767.    6241.70    2792.20    General Motors value of firm
GMC          20      648.435   630.164    397107.    2226.30    2.80000    General Motors stock of plant and equip.
CI           20      86.1235   42.7256    1825.47    174.930    40.2900    Chrysler gross investment
CF           20      693.210   160.599    25792.1    1001.50    410.140    Chrysler value of firm
CC           20      121.245   111.328    12393.8    414.900    10.2000    Chrysler stock of plant and equip.
USI          20      405.460   129.352    16731.9    645.200    209.900    US Steel gross investment
USF          20      1971.83   301.088    90653.9    2676.30    1362.40    US Steel value of firm
USC          20      299.855   153.022    23415.8    669.700    50.5000    US Steel stock of plant and equip.
GEI          20      102.290   48.5845    2360.45    189.600    33.1000    General Electric gross investment
GEF          20      1941.33   413.843    171266.    2803.30    1170.60    General Electric value of firm
GEC          20      400.160   250.619    62809.8    888.900    97.8000    General Electric stock of plant & equip.
WI           20      42.8915   19.1102    365.199    90.0800    12.9300    Westinghouse gross investment
WF           20      670.910   222.392    49458.2    1193.50    191.500    Westinghouse value of firm
WC           20      85.6400   62.2649    3876.92    213.500    0.800000   Westinghouse stock of plant and equip.
CONSTANT     20      1.00000   0.00000    0.00000    1.00000    1.00000

Number of observations in data file   20
Current missing variable code         1.000000000000000E+31
Data begins on (D:M:Y) 1: 1:1935 ends 1: 1:1954.

B34S(r) Matrix Command. d/m/y 19/ 3/07. h:m:s 8: 8:20.
=>   CALL LOADDATA$
=>   CALL ECHOOFF$

GE Equation
-----------

Ordinary Least Squares Estimation
Dependent variable                  GEI
Centered R**2                       0.7053066881516170
Adjusted R**2                       0.6706368867576896
Residual Sum of Squares             13216.58777024301
Residual Variance                   777.4463394260592
Standard Error                      27.88272474895628
Total Sum of Squares                44848.61800000000
Log Likelihood                      -93.31372761767831
Mean of the Dependent Variable      102.2900000000000
Std. Error of Dependent Variable    48.58449936911327
Sum Absolute Residuals              398.8665105091626
F( 2,   17)                         20.34354567358886
F Significance                      0.9999691221270504
1/Condition XPX                     8.363317695562057E-09
Maximum Absolute Residual           58.73723628346180
Number of Observations              20

Variable   Lag   Coefficient       SE                t
GEF        0     0.26551189E-01    0.15566104E-01    1.7057055
GEC        0     0.15169387        0.25704083E-01    5.9015476
CONSTANT   0     -9.9563065        31.374249         -0.31734007

Tests on Real*8 Cholesky
Last Column is the residual
Array of 3 by 3 elements
      1                2                3
1     1.00000          0.118243        -0.356987E-14
2     0.118243         1.00000         -0.260707E-14
3    -0.356987E-14    -0.260707E-14     1.00000

Ordinary Least Squares Estimation using QR Method
Dependent variable                  GEI
Centered R**2                       0.7053066881516169
Residual Sum of Squares             13216.58777024301
QR Rank Check variable (eps) set as 2.220446049250313E-16

Tests on Real*8 QR
Last Column is the residual
      1                2                3
1     1.00000          0.118243         0.271906E-15
2     0.118243         1.00000          0.333125E-15
3     0.271906E-15     0.333125E-15     1.00000

Ordinary Least Squares Estimation - Real*16
Dependent variable                  %YTEST
Centered R**2                       0.70530668815161693372209221797820645
Residual Sum of Squares             13216.587770243005958442337265798097
1/Condition XPX                     8.3633176955620372924913485687834572E-0009

Tests on Real*16 Cholesky
Last Column is the residual (real*16)
      1                2                3
1     1.00000          0.118243        -0.852072E-33
2     0.118243         1.00000         -0.188439E-33
3    -0.852072E-33    -0.188439E-33     1.00000

Ordinary Least Squares Estimation - Real*16 & QR
Residual Sum of Squares             13216.587770243005958442337265798095
QR Rank Check variable (eps) set as 1.9259299443872358530559779425849273E-0034

Tests on Real*16 QR
Last Column is the residual (real*16)
      1                2                3
1     1.00000          0.118243        -0.912934E-34
2     0.118243         1.00000         -0.263815E-33
3    -0.912934E-34    -0.263815E-33     1.00000

Real*4 Tests
_________
beta from Real*4 using inverse
BETA    = Vector of 3 elements (real*4)
   0.265513E-01    0.151694   -9.95650

Last Column is the residual (real*4)
      1                2                3
1     1.00000          0.118243         0.171420E-05
2     0.118243         1.00000          0.126598E-06
3     0.171420E-05     0.126598E-06     1.00000

beta from Real*4 using QR
R4_BETA = Vector of 3 elements (real*4)
   0.265512E-01    0.151694   -9.95630

Last Column is the residual (real*4)
      1                2                3
1     1.00000          0.118243         0.216786E-06
2     0.118243         1.00000         -0.236300E-06
3     0.216786E-06    -0.236300E-06     1.00000

VPA Tests
_________
beta from VPA
BETA    = Vector of 3 elements VPA - FM
   .265512M-1    .151694M+0   -.995631M+1

Last Column is the residual (VPA - FM)
      1                2                3
1     .100000M+1       .118243M+0      -.332843M-62
2     .118243M+0       .100000M+1      -.179417M-62
3    -.332843M-62     -.179417M-62      .100000M+1

B34S Matrix Command Ending. Last Command reached.
Space available in allocator   11856250, peak space used   289943
Number variables used          110, peak number used       120
Number temp variables used     1294, # user temp clean     0
The next setup shown in Table 10.23 runs the slightly more demanding gas data model.
Table 10.23 Effect of Data Precision on Accuracy using Gas Data
==CH10G
Effect of precision on OLS Assumptions
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(cov :staging);
call load(cor :staging);
call echooff;
lag=3;
/;
/; This options takes space
/;
dovpa=1;
doreal4=1;
call olsq(gasout gasout{1 to lag} gasin{1 to lag} :print :savex :diag);
/; Save data for matlab
call makematlab(%y :file 'ydata2.m');
call makematlab(%x :file 'xdata2.m');
call print('Tests on Real*8 Cholesky':);
call print('++++++++++++++++++++++++':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
call olsq(gasout gasout{1 to lag} gasin{1 to lag} :print :savex :qr :diag);
call print('Tests on Real*8 QR':);
call print('++++++++++++++++++':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
%yhold=%y;
%xhold=%x;
%ytest=r8tor16(%yhold);
%xtest=r8tor16(%xhold);
call olsq(%ytest %xtest :print :diag :noint :savex);
call print('Tests on Real*16 Cholesky':);
call print('+++++++++++++++++++++++++':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
call olsq(%ytest %xtest :print :diag :noint :qr :savex);
call print('Tests on Real*16 QR':);
i=nocols(%x);
test=%x;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
if(dovpa.ne.0)then;
call print('VPA Tests ':);
call print('_________ ':);
%ytest=vpa(%yhold);
%xtest=vpa(%xhold);
beta = inv(transpose(%xtest)*%xtest)*transpose(%xtest)*%ytest;
call print('beta from VPA ':);
call print(beta);
i=nocols(%xtest);
%res=%xtest*beta-%ytest;
test=%xtest;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
endif;
/; This example shows how important the method is when the data is not
/; saved with enough precision
if(doreal4.ne.0)then;
call print('Real*4 Tests ':);
call print('_________ ':);
%ytest=r8tor4(%yhold);
%xtest=r8tor4(%xhold);
beta = inv(transpose(%xtest)*%xtest)*transpose(%xtest)*%ytest;
call print('beta from Real*4 using inverse ':);
call print(beta);
i=nocols(%xtest);
%res=%xtest*beta-%ytest;
test=%xtest;
test(1,i)=%res;
call print('Last Column is the residual':);
call print(cor(test));
/; QR approach
r4_r=qr(%xtest,r4_q);
/; r8_r=qr(r4tor8(%xtest),r8_q);
/; call print('R looked at from real 4 to real 8 and from real 4',
/;            r8_r,r4_r);
r4_yhat=r4_q*transpose(r4_q)*%ytest;
r4_res =%ytest-r4_yhat;
/;
/; Beta not needed to get yhat from QR !!
/;
r4_beta=inv(r4_r)*transpose(r4_q)*%ytest;
call print('beta from Real*4 using QR ':);
call print(r4_beta);
i=nocols(%xtest);
test=%xtest;
test(1,i)=r4_res;
call print('Last Column is the residual':);
call print(cor(test));
endif;
b34srun;
==
When the commands in Table 10.23 are run, the edited output is:
B34S 8.11C   (D:M:Y) 19/ 3/07 (H:M:S) 8: 8:22   DATA STEP   PAGE 1

Variable   # Cases   Mean            Std. Dev.  Variance   Maximum   Minimum   Label
TIME         296     148.500         85.5921    7326.00    296.000   1.00000
GASIN        296     -0.568345E-01   1.07277    1.15083    2.83400   -2.71600  Input gas rate in cu. ft / min
GASOUT       296     53.5091         3.20212    10.2536    60.5000   45.6000   Percent CO2 in outlet gas
CONSTANT     296     1.00000         0.00000    0.00000    1.00000   1.00000

Number of observations in data file   296
Current missing variable code         1.000000000000000E+31

B34S(r) Matrix Command. d/m/y 19/ 3/07. h:m:s 8: 8:22.
=>   CALL LOADDATA$
=>   CALL LOAD(COV :STAGING)$
=>   CALL LOAD(COR :STAGING)$
=>   CALL ECHOOFF$

Ordinary Least Squares Estimation
Dependent variable                  GASOUT
Centered R**2                       0.9940295933829316
Adjusted R**2                       0.9939043400972588
Residual Sum of Squares             18.05876021349723
Residual Variance                   6.314251822901129E-02
Standard Error                      0.2512817506883683
Total Sum of Squares                3024.711945392491
Log Likelihood                      -7.520629481734408
Mean of the Dependent Variable      53.50784982935154
Std. Error of Dependent Variable    3.218478297691941
Sum Absolute Residuals              51.43868016689360
F( 6,  286)                         7936.155830513969
F Significance                      1.000000000000000
1/Condition XPX                     7.928195070756484E-08
Maximum Absolute Residual           1.523989618188139
Number of Observations              293

-2 * ln(Maximum of Likelihood Function)    15.04125896346881
Akaike Information Criterion (AIC)         31.04125896346881
Scwartz Information Criterion (SIC)        60.48263983560535
Akaike (1970) Finite Prediction Error      6.465104255530167E-02
Generalized Cross Validation               6.468796447937171E-02
Hannan & Quinn (1979) HQ                   6.696762383517987E-02
Shibata (1981)                             6.457896289465981E-02
Rice (1984)                                6.472673911647753E-02

Variable   Lag   Coefficient      SE                t
GASOUT     1     1.6122699        0.57281193E-01    28.146583
GASOUT     2     -0.93427375      0.87318874E-01    -10.699562
GASOUT     3     0.19438193       0.41791902E-01    4.6511865
GASIN      1     0.10562265       0.77592475E-01    1.3612487
GASIN      2     -0.32153622      0.15220401        -2.1125345
GASIN      3     -0.19498353      0.10682485        -1.8252639
CONSTANT   0     6.8096728        0.56650687        12.020459

Tests on Real*8 Cholesky
++++++++++++++++++++++++
Last Column is the residual
Array of 7 by 7 elements
      1          2          3          4          5          6          7
1   1.00000    0.972640   0.899640  -0.484755  -0.599240  -0.726835  -0.119284E-10
2   0.972640   1.00000    0.972527  -0.394493  -0.485453  -0.600971   0.868642E-11
3   0.899640   0.972527   1.00000   -0.330178  -0.395169  -0.487296  -0.430643E-11
4  -0.484755  -0.394493  -0.330178   1.00000    0.952560   0.834273  -0.434085E-13
5  -0.599240  -0.485453  -0.395169   0.952560   1.00000    0.952600   0.455703E-13
6  -0.726835  -0.600971  -0.487296   0.834273   0.952600   1.00000   -0.123613E-13
7  (row 7 repeats the residual column shown above, with a unit diagonal)

Ordinary Least Squares Estimation using QR Method
Dependent variable                  GASOUT
Centered R**2                       0.9940295933829316
Residual Sum of Squares             18.05876021349716
Sum Absolute Residuals              51.43868016594674
QR Rank Check variable (eps) set as 2.220446049250313E-16
Maximum Absolute Residual           1.523989618133797

Tests on Real*8 QR
++++++++++++++++++
Last Column is the residual; its correlations with columns 1-6:
   0.131642E-13   0.120757E-13   0.106601E-13  -0.596223E-14  -0.858368E-14  -0.108972E-13

Ordinary Least Squares Estimation - Real*16
Dependent variable                  %YTEST
Centered R**2                       0.99402959338293156924425522698084867
Residual Sum of Squares             18.058760213497255366748012081384418
1/Condition XPX                     7.9281950718440669605441111957152000E-0008

Tests on Real*16 Cholesky
+++++++++++++++++++++++++
Last Column is the residual; its correlations with columns 1-6 (real*16):
  -0.132839E-28   0.747619E-29  -0.214679E-29   0.515527E-32   0.797312E-32   0.441218E-32

Ordinary Least Squares Estimation - Real*16 & QR
Residual Sum of Squares             18.058760213497255366748012081384479
QR Rank Check variable (eps) set as 1.9259299443872358530559779425849273E-0034

Tests on Real*16 QR
Last Column is the residual; its correlations with columns 1-6 (real*16):
  -0.137433E-31  -0.126724E-31  -0.112671E-31   0.584987E-32   0.889735E-32   0.117243E-31

VPA Tests
_________
beta from VPA
BETA    = Vector of 7 elements VPA - FM
  .161227M+1  -.934274M+0   .194382M+0   .105623M+0  -.321536M+0  -.194984M+0   .680967M+1
Last Column is the residual; its correlations with columns 1-6 (VPA - FM):
  .957024M-59   .796435M-59   .628011M-59  -.691398M-59  -.839960M-59  -.963458M-59

Real*4 Tests
_________
beta from Real*4 using inverse
BETA    = Vector of 7 elements (real*4)
  1.61238  -0.942064   0.194891   0.105573  -0.320821  -0.195644   6.79072
Last Column is the residual; its correlations with columns 1-6 (real*4):
 -0.895673E-01  -0.918947E-01  -0.891362E-01   0.369165E-01   0.452364E-01   0.557397E-01

beta from Real*4 using QR
R4_BETA = Vector of 7 elements (real*4)
  1.61227  -0.934273   0.194382   0.105623  -0.321537  -0.194985   6.80968
Last Column is the residual; its correlations with columns 1-6 (real*4):
 -0.144533E-05  -0.151427E-05  -0.153826E-05   0.124983E-06   0.420500E-06   0.724758E-06

B34S Matrix Command Ending. Last Command reached.
Space available in allocator   11856223, peak space used   9183819
Number variables used          90, peak number used        97
Number temp variables used     5108, # user temp clean     0
The edited output above validates the results reported in Table 10.21.
At issue is whether the accuracy patterns obtained above are unique to B34S or can be
confirmed with other software. For that we turn to Matlab, where the commands in Table 10.24 were
run using two versions of Matlab. The pattern of the B34S calculations reported in Table 10.21 is
replicated in Table 10.25 for real*8 and real*4 data using Matlab 2006b, and Table 10.26 attempts the
same problem using Matlab 2007a. Note that there are differences, indicating that in these two releases
of Matlab there were hidden and unannounced changes that marginally impacted accuracy.16 However,
the patterns exhibited in both tables are similar and support the results obtained with B34S reported
in Table 10.21. Using Matlab 2006b, real*4 gas data and the LU method produced correlations
between -.0264 and .0035, which are far larger than machine epsilon for this data type. Since most
regression packages use LU or Cholesky to solve OLS, this finding shows the danger: the
calculation of the model has clearly broken down. Both versions of Matlab produced warning messages
to alert the user. For Matlab 2006b the gas data message was "Warning: Matrix is close to singular or
badly scaled. Results may be inaccurate. RCOND = 5.224432e-008", which should alert users to
problems. For the GE problem the message was "Warning: Matrix is close to singular or badly
scaled. Results may be inaccurate. RCOND = 8.358012e-009." The exact Matlab commands used for
the gas data problem are listed in Table 10.24. The command getb34s brings in the x and y matrices
with the appropriate lags for the two problems studied.
16 The Matlab 2006b results were obtained on a Windows XP Professional Dell 650 workstation; the
Matlab 2007a results were obtained on a Dell Latitude Windows XP Professional machine.
Table 10.24 Matlab Commands to Replicate Accuracy Results Obtained with B34S
x=getb34s('xdata2.m');
y=getb34s('ydata2.m');
beta8=inv(x'*x)*x'*y;
res8=y-x*beta8;
yhat_8=x*beta8;
newx8=x;
newx8(:,7)=res8;
disp('Matlab LU real*8 results')
c8=corr(newx8)
beta4=inv(single(x)'*single(x))*single(x)'*single(y);
disp('++++++++++++++++ beta4-beta8 +++++++++++++++++++++++')
beta4-beta8
yhat_4=single(x)*beta4;
res4 =single(y)-single(x)*beta4;
newx4=single(x);
newx4(:,7)=res4;
disp('Matlab LU real*4 results')
c4=corr(newx4)
betatest=[beta8 double(beta4) ]
% qr
disp('Matlab QR real*8 and QR real*4')
[q8,r8]=qr(x,0);
[q4,r4]=qr(single(x),0);
yhat_q8=q8*q8'*y;
yhat_q4=q4*q4'*single(y);
% plot(yhat_q8-yhat_q4)
% [yhat_q4 yhat_q8 yhat_8 yhat_4]
res_q8=y-yhat_q8;
res_q4=single(y)-yhat_q4;
% Get residual another way via beta to test
beta_q4=inv(r4)*q4'*single(y);
yhat_q4_alt=single(x)*beta_q4;
res_q4_alt=single(y)-yhat_q4_alt;
disp('beta_q4-beta8')
beta_q4-beta8
newx8_q=x;
newx4_q=single(x);
newx8_q(:,7)=res_q8;
newx4_q(:,7)=res_q4;
c8_q=corr(newx8_q);
c4_q=corr(newx4_q);
disp('alt c4_q via beta')
newx4_q_alt=single(x);
newx4_q_alt(:,7)=res_q4_alt;
c4_q_alt=corr(newx4_q_alt);
testres=[res8 res_q8 res4 res_q4 res_q4_alt];
testaccr =[c8(7,:)' c8_q(7,:)' c4(7,:)' c4_q(7,:)' c4_q_alt(7,:)']';
testaccr
disp('++++++++++++++++ Matlab Ending +++++++++++++++')
quit
Table 10.25 Correlation of the Residual and RHS Variables using Matlab 2006b
__________________________________________________________________________________________________
Grunfeld Example
              GEF            GEC
Real*8 LU     -1.2559e-015   4.3021e-016
Real*8 QR      8.3267e-017   2.0817e-017
Real*4 LU     -4.4651e-007   1.4417e-007
Real*4 QR     -9.9298e-008   2.4814e-007
Real*4 QR_2   -4.7489e-009   1.0605e-007
__________________________________________________________________________________________________
Gas Data of Lag Order 3
Model Estimated: gasout = f(gasout(t-1),...,gasout(t-3),gasin(t-1),...,gasin(t-3), constant)
Real*8 LU      2.4454e-010   2.4313e-010    2.233e-010    -1.1906e-010  -1.4591e-010  -1.7591e-010
Real*8 QR     -1.4745e-015  -1.0686e-015   -1.027e-015    -8.9304e-015  -4.5935e-015   2.9837e-016
Real*4 LU     -0.022544     -0.023669      -0.026415       0.0035028     0.0054714     0.0087814
Real*4 QR      2.6992e-006   4.2817e-006    5.239e-006     2.8188e-006   1.8091e-006   9.9824e-007
Real*4 QR_2    8.7203e-005   8.8714e-005    8.4771e-005   -2.5406e-005  -3.4171e-005  -4.6354e-005
__________________________________________________________________________________________________
All calculations done with Matlab Version 2006b. Correlation is done with the Matlab-supplied
command corr. The LU factorization and QR analysis are done in Matlab with the LAPACK software
(1992). Real*4 QR calculates the residual as $e = y - QQ'y$, while Real*4 QR_2 uses
$e = y - X\hat{\beta} = y - XR^{-1}Q'y$. Real*8 data was converted to real*4 in Matlab using the
built-in function single( ).
Table 10.26 Correlation of the Residual and RHS Variables using Matlab 2007a
__________________________________________________________________________________________________
Grunfeld Example
              GEF            GEC
Real*8 LU      2.0123e-016  -4.4409e-016
Real*8 QR      4.9613e-016   2.6368e-016
Real*4 LU     -1.8515e-006  -3.2783e-007
Real*4 QR     -2.9802e-008   2.9802e-007
Real*4 QR_2   -7.0781e-008   4.4703e-008
__________________________________________________________________________________________________
Gas Data of Lag Order 3
Model Estimated: gasout = f(gasout(t-1),...,gasout(t-3),gasin(t-1),...,gasin(t-3), constant)
Real*8 LU      1.257e-010    1.3066e-010    1.2848e-010   -5.5661e-011  -6.6412e-011  -7.9434e-011
Real*8 QR     -1.2436e-014  -1.1942e-014   -1.2615e-014    2.2346e-014   2.2511e-014   2.1032e-014
Real*4 LU     -0.033102     -0.033387      -0.032592       0.015702      0.019042      0.022815
Real*4 QR      2.0284e-006   1.682e-006     1.3784e-006   -2.6099e-006  -2.5911e-006  -2.6249e-006
Real*4 QR_2   -1.2402e-005  -1.1405e-005   -1.027e-005     1.1327e-005   1.2809e-005   1.3571e-005
__________________________________________________________________________________________________
For a discussion of what is calculated, see Table 10.25. Matlab documentation indicates that the
LAPACK routines used were DLANGE, DGETRF, DGECON, DGETRI and SLANGE, SGETRF,
SGECON, SGETRI for real*8 and real*4 respectively for the inv( ) command, and DGEQRF,
DORGQR and SGEQRF, SORGQR for the Matlab qr( ) command.
10.7 Conclusion
This chapter has illustrated the QR approach to regression analysis and the associated PC
regression. A number of examples were used to illustrate various problems. The Wampler data set
shows the effects on accuracy of rank problems in the X matrix, given the y vector, and of changes in
the y vector, given the X matrix. While most researchers realize that problems can occur when X is
close to not being of full rank, only a few realize that the y vector can also cause problems. The Longley
data set was used to show the effects on accuracy of estimating the coefficients with
$(X'X)^{-1}X'y$, with the deviations-from-the-means approach, or with the QR approach. The PC
regression was shown to provide important information about the structure of the OLS problem,
especially in the case of collinearity or near collinearity. The ridge, lasso and elastic net approaches
were illustrated to show alternative approaches to data shrinkage. The LTS model, a resistant
estimation method, illustrated how outliers can be dropped systematically and the changes in the
estimated coefficients and t tests observed. The final example used two relatively easy problems to
show the relationship between data precision and calculation method in imposing the OLS
assumption that the error of the model should be uncorrelated with the right-hand-side variables. The
main finding of this research is that real*4 calculation of the OLS model is subject to serious loss of
accuracy, especially when the QR method is not employed.