Revised Chapter 16 in Specifying and Diagnostically Testing Econometric Models (Edition
3) © by Houston H. Stokes 26 January 2010. All rights reserved. Preliminary Draft
Chapter 16.
Programming using the Matrix Command ................................................................................ 1
16.0 Introduction ....................................................................................................................... 1
16.1 Brief Introduction to the B34S Matrix language ........................................................... 2
16.2 Overview of Nonlinear Capability................................................................................. 18
16.3 Rules of the Matrix Language ....................................................................................... 24
16.4 Linear Algebra using the Matrix Language ................................................................. 51
16.5 Extended Eigenvalue Analysis ...................................................................................... 81
16.6 A Preliminary Investigation of Inversion Speed Differences ...................................... 90
Table 16.1 Relative Cost of Equalization & Refinement of a General Matrix ................... 105
16.7 Variable Precision Math............................................................................................... 108
Table 16.2 LRE for Various Approaches to an OLS Model of the Filippelli Data ............ 111
Table 16.4 Coefficients estimated with QR using Real*16 Filippelli Data ....................... 113
Table 16.5 Residual Sum of Squares on a VAR model of Order 6 – Gas Furnace Data.... 114
Table 16.6 LRE for Various Estimates of Coef, SE and RSS of Pontius Data ................ 116
Table 16.7 LRE for QR, Cholesky, SVD linpack and lapack for Eberhardt Data ............ 116
Table 16.8 VPA Alternative Estimates of Filippelli Data set ........................................... 124
Table 16.9 Lessons to be learned from VPA and Other Accuracy Experiments ................ 125
16.8 Conclusion ..................................................................................................................... 126
Programming using the Matrix Command
16.0 Introduction
The B34S matrix command is a full-featured 4th generation programming language that
allows users to program custom calculations that are not possible with the built-in B34S procedures.
This chapter provides an introduction to the matrix command language, with special
emphasis on both linear algebra applications and the general design of the language. This chapter
should be thought of as an introduction to and overview of the programming capability,
with an emphasis on applications.1 At many junctures the reader is pointed to other chapters for
further discussion of the theory underlying the application. The role of this chapter is to provide a
1
In the 1960's, with the advent of Fortran compilers and limited-capability mainframe computers,
econometric researchers programmed and developed software that used column-dependent input, had limited
capability and was difficult to extend, as outlined in Stokes (2004b). In the 1970's and early 1980's econometric
software improved, but was still expensive to develop due to high CPU costs on mainframe computers. The PC
revolution, together with the development of programming languages such as GAUSS® and MATLAB®, stimulated
researchers to develop their own procedures without waiting for software developers to "hard wire" this capability
in their commercially distributed systems. The matrix command allows a user to develop custom calculations using
a programming language with many econometric commands already available. While many of the cpu-intensive
commands are "hard wired" into the language, many others are just subroutines or functions written in the
matrix language and available to the user to modify as needed. The goal of this chapter is to discuss how this might
be done. Many code examples are provided as illustrations of what is available in the language.
reference to matrix command basics.2 An additional and no less important goal is to discuss a
number of tests of linpack and lapack eigenvalue, Cholesky and LU routines for speed and
accuracy.
Section 16.1 provides a brief introduction to the matrix command language. All
commands in the language are listed by type but, because of space limitations, are not illustrated
in any detail. Since the matrix command has a running example for every command, the user is
encouraged to experiment with commands of special interest by first reading the help file and
next running one or more of the supplied examples. To illustrate the power of the system, a
program to perform a fast Fourier transform example with real*8 and complex*16 data is
shown. A user subroutine filter illustrates how the language can be extended.3 The help file for
the Schur factorization, which is widely used in rational expectations models, is provided as an
example both to show capability and to illustrate what a representative command help file contains.
Section 16.2 provides an overview of some of the nonlinear capability built into the language and
motivates why knowledge of this language, or a similar one such as MATLAB or SPEAKEASY®, is
important. The solutions to a variety of problems are illustrated but not discussed in any detail.
Section 16.3 discusses many of the rules of the matrix language, while section 16.4 illustrates
matrix algebra applications. Sections 16.5 and 16.6 illustrate refinements to eigenvalue analysis and
inversion speed issues. Section 16.7 shows the gains obtained by real*16 and VPA
math calculations.
16.1 Brief Introduction to the B34S Matrix language
The matrix language is a full-featured 4th generation language that can be used to
program custom calculations. Analysis is supported for real*8, real*16, complex*16 and
complex*32 data. A number of character manipulation routines are supported. High resolution
graphics are available on all platforms4 and both batch and interactive operation are available. The
matrix facility supports user programs, which use the local address space, and subroutines and
functions, which have their own address space.5 This design means that variable names inside
these routines will not conflict with variables known at the global level, which is set to 100.
Variables in the language are built with object-oriented analytic
statements such as:
y = r*2.;
2
Nonlinear modeling examples are discussed in Chapter 11, while many other examples of applications are given in
other chapters such as 2 and 14. A subsequent book under development, Stokes (200x), will discuss a large
number of applications, particularly in the area of time series analysis.
3
This facility is somewhat similar to the MATLAB m file, which contains help commands in its first lines. In
contrast to MATLAB, the b34s libraries allow placement of a large number of subroutines and help files in one file.
The b34s design dates from the IBM MVS PDS facility, except that it is portable. SCA has a similar design as
regards macro files.
4
See chapter 1 for the platforms that are supported.
5
Examples will be supplied later.
where r is a variable that could be a matrix, 2D array, 1D array, vector or a scalar. The class of
the variable determines the calculation performed. If x were an n by k matrix of data values and y
were an n element vector of left-hand-side data points, the OLS solution using the textbook
formula could be calculated as
beta=inv(transpose(x)*x)*transpose(x)*y;
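For readers more familiar with NumPy, a rough equivalent of the textbook formula above can be sketched as follows. This is an illustrative assumption, not B34S code; the synthetic X, y and beta_true are invented for the example, and inv(X'X)X'y is shown only for exposition since a QR- or SVD-based solver such as np.linalg.lstsq is numerically safer, a point the accuracy experiments later in this chapter make in detail.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 100 observations, 3 regressors.
rng = np.random.default_rng(42)
X = rng.standard_normal((100, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.01 * rng.standard_normal(100)

# The textbook OLS formula beta = inv(X'X) X'y, as in the matrix statement above.
beta = np.linalg.inv(X.T @ X) @ X.T @ y
```

With well-conditioned data the explicit inverse reproduces the true coefficients closely; with near-collinear data it can lose many digits, which motivates the QR, Cholesky and SVD comparisons of sections 16.6 and 16.7.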
By use of vectors that are created by statements such as integers(1,3) it is possible to
subset a matrix without the use of do loops and other programming constructs. This capability is
illustrated by
a=rn(matrix(10,5:));
newa= a(integers(1,3),integers(2,4));
call print('Pull off rows 1-3, cols 2-4',a,newa);
which produces output
=>  A=RN(MATRIX(10,5:))$
=>  NEWA=A(INTEGERS(1,3),INTEGERS(2,4))$
=>  CALL PRINT('Pull off rows 1-3, cols 2-4',A,NEWA)$

Pull off rows 1-3, cols 2-4

A        = Matrix of 10 by 5 elements

              1              2              3              4              5
  1      1.82793       -2.14489        0.166069      -0.532415E-01   0.466859
  2     -0.641156       0.219954       1.27446        0.477187       0.555387
  3      0.726593      -0.282409E-01  -0.555147       0.410387      -0.373611E-01
  4      0.174686      -0.957929       1.27209        0.940303      -1.63219
  5      1.01451       -0.795788      -0.752745      -0.973475E-01   0.719606E-01
  6     -1.70319       -0.853220      -1.68396        0.888468       0.204583
  7      2.23174       -1.34378        0.551978       0.411578       0.604946
  8      0.256844      -0.266375       1.14227       -0.956681      -0.559318
  9      1.26117       -0.396155      -1.84390        0.628140E-02   1.68875
 10     -0.303238      -1.07086        1.21187        0.295704      -1.71790

NEWA     = Matrix of 3 by 3 elements

              1              2              3
  1     -2.14489        0.166069      -0.532415E-01
  2      0.219954       1.27446        0.477187
  3     -0.282409E-01  -0.555147       0.410387
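The same loop-free subsetting with index vectors can be sketched in NumPy. This is an illustrative analogue, not B34S code; np.ix_ plays the role of indexing with two integers() vectors, with the 1-based B34S ranges shifted to NumPy's 0-based convention.

```python
import numpy as np

# A NumPy analogue of a(integers(1,3),integers(2,4)): pull rows 1-3 and
# columns 2-4 (1-based) of a 10 by 5 matrix without a do loop.
rng = np.random.default_rng(0)
a = rng.standard_normal((10, 5))

rows = np.arange(1, 4) - 1     # integers(1,3), shifted to 0-based
cols = np.arange(2, 5) - 1     # integers(2,4), shifted to 0-based
newa = a[np.ix_(rows, cols)]   # 3 by 3 submatrix
```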
In addition to analytic statements that might contain function calls, call statements which
provide a branch to a subroutine are supported. Examples are:
call olsq(y x{0 to 10} y{1 to 10} :print);
which estimates an OLS model y_t = f(y_{t-1}, ..., y_{t-10}, x_t, ..., x_{t-10}), and
call tabulate(x,y,z);
which produces a table of x, y and z. Both functions and subroutines can be built into the
executable, and thus hidden from the user, or themselves written in the matrix language. The
formula and solve commands allow recursive solution of an analytic statement over a range of
index values. By vectorizing the calculation, at the loss of some generality, these features speed
up calculations that would otherwise have had to use do loops, which have substantial overhead. A number
of examples that illustrate these features are shown later in this document, where all commands and
programming statements are listed. Inspection of the language will show that the matrix facility
has been influenced closely by SPEAKEASY®, which was developed by Stan Cohen. The
programming languages of the two systems are very similar and share the same save-file
structure. However, there are a number of important differences that will be discussed further
below. The matrix facility is not designed to run interactively, although commands can be given
interactively in the Manual Mode. Output is written to the b34s.out file and error messages
are displayed in both the b34s.out and b34s.log files. The objective of the matrix facility is
to give the user access to a powerful object-oriented programming language so that custom
calculations can be made.
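The overhead of interpreted do loops, which the formula and solve commands are designed to avoid, can be sketched in Python. This is an illustrative assumption, not B34S syntax: both versions compute the same transformation, but the second replaces an element-by-element loop with a single vectorized statement.

```python
import numpy as np

x = np.arange(100_000, dtype=float)

# Do-loop version: one interpreted statement per element.
y_loop = np.empty_like(x)
for i in range(x.size):
    y_loop[i] = 2.0 * x[i] + 1.0

# Vectorized version: a single statement over the whole array,
# the kind of rewriting the formula/solve commands perform.
y_vec = 2.0 * x + 1.0
```

In interpreted environments the vectorized form is typically orders of magnitude faster, at the loss of some generality when the recursion at index t depends on results at index t-1.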
A particular strength of the facility is the ability to estimate complex nonlinear least squares and
maximum likelihood models. Such models, which are specified in matrix command programs,
can be solved either with subroutines or with the nonlinear commands nleq, nlpmin1,
nlpmin2, nlpmin3, nllsq, maxf1, maxf2, maxf3, cmaxf1, cmaxf2 and cmaxf3 discussed in
Chapter 11. While the use of B34S subroutines for the complete calculation would give the user
total control of the estimation process, speed would be given up. The above nonlinear commands
give the user complete control of the form of the estimated model, which is specified in a matrix
command program. Since these programs are called by compiled solvers, there is a substantial
speed advantage over a design that writes the solver in a subroutine written in that program's
language.6 By design the nonlinear solvers call matrix command programs,
not matrix command subroutines, although a link to a subroutine can be made.7
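The design point here, a fixed compiled solver that repeatedly calls back into user-written model code, can be sketched in miniature. The generic Newton solver and the cubic-root problem below are illustrative assumptions, not B34S internals; the pattern is what matters: the solver is written once, and only the user-supplied functions change from model to model.

```python
def newton(f, df, x0, tol=1e-12, maxit=100):
    """Generic Newton solver; f and df are supplied by the user,
    analogous to a B34S nonlinear command calling a user program."""
    x = x0
    for _ in range(maxit):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

# User-specified "program": solve x**3 - 2 = 0 starting from x = 1.
root = newton(lambda x: x**3 - 2.0, lambda x: 3.0 * x**2, 1.0)
```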
The example listed below illustrates the programming language and shows part of the
real*8 and complex*16 fft decomposition of data generated with the dcos function. This example
uses the commands dcos, fft and ifft. The code is completely vectorized, with no loops. The
inverse fft is used to recover the series (times n). Real*8 and complex*16 problems are shown.
* Example from IMSL (10) Math Page 707-709;
n=7.;
ifft=grid(1.,n,1.);
xfft=dcos((ifft-1.)*2.*pi()/n);
rfft=fft(xfft);
bfft=fft(rfft:back);
call tabulate(xfft,rfft,bfft);
* Complex Case See IMSL(10) Math Page 715-717;
cfft=complex(0.0,1.);
hfft=(complex(2.*pi())*cfft/complex(n))*complex(3.0);
xfft=dexp(complex(ifft-1.)*hfft);
cfft=fft(xfft);
bfft=fft(cfft:back);
call tabulate(xfft,cfft,bfft);
6
Subroutines DUD, DUD2 and NARQ, written in the matrix command language itself, are supplied in file
matrix2.mac to illustrate a fully programmed nonlinear least squares solver using the Marquardt (1963) method
that mimics the SAS nonlin command.
7
The technical reason for this is that for a function or subroutine call a duplicate copy of all arguments is made to
named storage at the current level plus 1. This way the arguments are "local" in the subroutine, using a possibly
different name. The disadvantage is that this takes more space and slows execution. Use of a program allows all
variables to be accessed without explicitly being passed.
The grid command creates a vector from 1. to 7. in increments of 1. The matrix language
supports integer*4 and real*8 data, so the command was n=7., not n=7, which would have created
an integer. If data types are mixed, the program will generate a mixed-mode error since the
parser does not know the data type in which to save the result. The complex command creates a
complex*16 datatype from one or two real*8 arguments. In the above example a series is generated,
the fft is calculated and the series is recovered (times 7.).
Output is:
=>  * EXAMPLE FROM IMSL (10) MATH PAGE 707-709$
=>  N=7.$
=>  IFFT=GRID(1.,N,1.)$
=>  XFFT=DCOS((IFFT-1.)*2.*PI()/N)$
=>  RFFT=FFT(XFFT)$
=>  BFFT=FFT(RFFT:BACK)$
=>  CALL TABULATE(XFFT,RFFT,BFFT)$

 Obs        XFFT           RFFT           BFFT
   1       1.000       -0.2220E-15      7.000
   2      0.6235        3.500           4.364
   3     -0.2225       -0.4653E-15     -1.558
   4     -0.9010       -0.5773E-14     -6.307
   5     -0.9010       -0.2129E-16     -6.307
   6     -0.2225        0.6328E-14     -1.558
   7      0.6235       -0.9279E-17      4.364

=>  * COMPLEX CASE    SEE IMSL(10) MATH PAGE 715-717$
=>  CFFT=COMPLEX(0.0,1.)$
=>  HFFT=(COMPLEX(2.*PI())*CFFT/COMPLEX(N))*COMPLEX(3.0)$
=>  XFFT=DEXP(COMPLEX(IFFT-1.)*HFFT)$
=>  CFFT=FFT(XFFT)$
=>  BFFT=FFT(CFFT:BACK)$
=>  CALL TABULATE(XFFT,CFFT,BFFT)$

 Obs          XFFT                        CFFT                          BFFT
   1      1.000     0.000       -0.2220E-15   0.4441E-15      7.000     -0.4733E-29
   2    -0.9010    0.4339       -0.2720E-14   0.2818E-15     -6.307      3.037
   3     0.6235   -0.7818        0.7938E-14   0.3890E-15      4.364     -5.473
   4    -0.2225    0.9749        7.000       -0.2209E-14     -1.558      6.824
   5    -0.2225   -0.9749        0.1110E-13   0.3556E-15     -1.558     -6.824
   6     0.6235    0.7818       -0.4496E-14   0.3917E-15      4.364      5.473
   7    -0.9010   -0.4339        0.2165E-14   0.3464E-15     -6.307     -3.037
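The same round trip can be checked in NumPy. This is an illustrative analogue, not B34S code; note one convention difference, which the comments flag: the B34S inverse fft returns the series times n, while np.fft.ifft divides by n, so the result is rescaled to match.

```python
import numpy as np

n = 7
k = np.arange(1, n + 1)                    # grid(1., 7., 1.)
xfft = np.cos((k - 1) * 2.0 * np.pi / n)   # dcos((ifft-1.)*2.*pi()/n)

rfft = np.fft.fft(xfft)                    # forward transform
bfft = np.fft.ifft(rfft) * n               # rescale: B34S's back transform
                                           # returns the series times n
```

For a pure cosine of one cycle in seven points, the complex FFT puts a coefficient of 3.5 in bins 1 and 6 (0-based), the unpacked counterpart of the single 3.500 entry in the RFFT column above, and bfft recovers the series times 7.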
The matrix command subroutine filter, listed next, shows some other aspects of the language.
Within the filter subroutine all variables are local. The filter subroutine can be called with
commands such as
call filter(xold,xnew,10.,14.);
subroutine filter(xold,xnew,nlow,nhigh);
/$
/$ Depending on nlow and nhigh filter can be a low pass
/$ or a high pass filter
/$
/$ Real FFT is done for a series. FFT values are zeroed
/$ out if outside range nlow - nhigh. xnew recovered
/$ by inverse FFT
/$
/$ FILTERC subroutine uses Complex FFT
/$
/$ Use of FILTER in place of FILTERC may result in
/$ Phase and Gain loss
/$
/$ xold  - input series
/$ xnew  - filtered series
/$ nlow  - lower filter bound
/$ nhigh - upper filter bound
/$
/$ Routine built 2 April 1999
/$
n=norows(xold);
if(n.le.0)then;
   call print('Filter finds # LE 0');
   go to done;
endif;
if(nlow.le.0.or.nlow.gt.n)then;
   call print('Filter finds nlow not set correctly');
   go to done;
endif;
if(nhigh.le.nlow.or.nhigh.gt.n)then;
   call print('Filter finds nhigh not set correctly');
   go to done;
endif;
fftold    = fft(xold);
fftnew    = array(n:);
i=integers(nlow,nhigh);
fftnew(i) = fftold(i);
xnew      = afam(fft(fftnew :back))*(1./dfloat(n));
done continue;
return;
end;
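A rough NumPy analogue of the filter subroutine can be sketched as follows. This is an illustrative assumption rather than a translation; because it uses the complex FFT it is closer to the FILTERC variant mentioned in the comments, and like the real*8 version it simply zeroes coefficients outside the 1-based bin range nlow..nhigh before inverting.

```python
import numpy as np

def fft_filter(xold, nlow, nhigh):
    """Zero FFT coefficients outside bins nlow..nhigh (1-based), then invert.
    Argument checks mirror the B34S subroutine above."""
    n = len(xold)
    if n <= 0:
        raise ValueError("filter finds # LE 0")
    if nlow <= 0 or nlow > n:
        raise ValueError("filter finds nlow not set correctly")
    if nhigh <= nlow or nhigh > n:
        raise ValueError("filter finds nhigh not set correctly")
    f = np.fft.fft(xold)
    keep = np.zeros(n, dtype=complex)
    keep[nlow - 1:nhigh] = f[nlow - 1:nhigh]   # bins nlow..nhigh survive
    return np.fft.ifft(keep).real              # ifft already divides by n
```

Keeping every bin (nlow=1, nhigh=n) returns the original series; narrower ranges pass only the selected band.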
The complete matrix command vocabulary of over 400 words is listed by subroutine, function
and keyword:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
List of Built-In Matrix Command Subroutines
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chapter 16
ACEFIT
ACF_PLOT
ADDCOL
ADDROW
AGGDATA
ALIGN
ARMA
AUTOBJ
BACKSPACE
BDS
BESTREG
B_G_TEST
BGARCH
BLUS
BPFILTER
BREAK
BUILDLAG
CCFTEST
CHAR1
CHARACTER
CHECKPOINT
CLEARALL
CLEARDAT
CLOSE
CLS
CMAXF1
CMAXF2
CMAXF3
COMPRESS
CONSTRAIN
CONTRACT
COPY
COPYLOG
COPYOUT
COPYTIME
COPYF
CSPECTRAL
CSUB
CSV
DATA_ACF
DATA2ACF
DATAFREQ
DATAVIEW
DELETECOL
DELETEROW
DES
DESCRIBE
DF
DISPLAYB
DIST_TAB
DODOS
DO_SPEC
DO2SPEC
DOUNIX
DQDAG
DQDNG
DQDAGI
-
DQDAGP
DQDAGS
DQAND
DTWODQ
ESACF
ECHOOFF
ECHOON
EPPRINT
EPRINT
ERASE
EXPAND
FORMS
FORPLOT
-
Alternating Conditional Expectation Model Estimation
Simple ACF Plot
Add a column to a 2d array or matrix.
Add a row to a 2d array or matrix.
Aggregate Data under control of an ID Vector.
Align Series with Missing Data
ARMA estimation using ML and MOM.
Automatic Estimation of Box-Jenkins Model
Backspace a unit
BDS Nonlinearity test.
Best OLS REGRESSION
Breusch-Godfrey (1978) Residual Test
Calculate function for a BGARCH model.
BLUS Residual Analysis
Baxter-King Filter.
Set User Program Break Point.
Builds NEWY and NEWX for VAR Modeling
Display CCF Function of Prewhitened data
Place a string is a character*1 array.
Place a string in a character*1 array.
Save workspace in portable file.
Clears all objects from workspace.
Clears data from workspace.
Close a logical unit.
Clear screen.
Constrained maximization of function using zxmwd.
Constrained maximization of function using dbconf/g.
Constrained maximization of function using db2pol.
Compress workspace.
Subset data based on range of values.
Contract a character array.
Copy an object to another object
Copy file to log file.
Copy file to output file.
Copy time info from series 1 to series 2
Copy a file from one unit to another.
Do cross spectral analysis.
Call Subroutine
Read and Write a CSV file
Calculate ACF and PACF Plots
Calculate ACF and PACF Plots added argument
Data Frequency
View a Series Under Menu Control
Delete a column from a matrix or array.
Delete a row from a matrix or array.
Code / decode.
Calculate Moment 1-4 and 6 of a series
Calculate Dickey-Fuller Unit Root Test.
Displays a Buffer contents
Distribution Table
Execute a command string if under dos/windows.
Display Periodogram and Spectrum
Display Periodogram and Spectrum added argument
Execute a command string if under unix.
Integrate a function using Gauss-Kronrod rules
Integrate a smooth function using a nonadaptive rule.
Integrate a function over infinite/semi-infinite
interval.
Integrate a function with singularity points given
Integrate a function with end point singularities
Multiple integration of a function
Two Dimensional Iterated Integral
Extended Sample Autocorrelation Function
Turn off listing of execution.
Turn on listing of execution.
Print to log and output file.
Print to log file.
Erase file(s).
Expand a character array
Build Control Forms
Forecast Plot
7
8
Matrix Command Language
FREE
FPLOT
FPRINT
GAMFIT
GARCH
GARCHEST
GET
GETDMF
GETKEY
GETMATLAB
GET_FILE
GET_NAME
GETRATS
GETSCA
GMFAC
GMINV
GMSOLV
GRAPH
GRAPHP
GRCHARSET
GRREPLAY
GTEST
GWRITE
GWRITE2
HEADER
HEXTOCH
HINICH82
HINICH96
HPFILTER
ISEXTRACT
IALEN
IBFCLOSE
IBFOPEN
IBFREADC
IBFREADR
IBFSEEK
IBFWRITER
IBFWRITEC
IB34S11
IFILESIZE
IFILLSTR
IGETICHAR
IGETCHARI
IJUSTSTR
ILCOPY
ILOCATESTR
ILOWER
INEXTR8
INEXTR4
INEXTSTR
-
INEXTI4
INTTOSTR
IRF
IR8TOSTR
ISTRTOR8
ISTRTOINT
IUPPER
I_DRNSES
I_DRNGES
I_DRNUN
I_DRNNOR
I_DRNBET
I_DRNCHI
I_DRNCHY
I_DRNEXP
I_DRNEXT
-
I_DRNGAM
I_DRNGCT
I_DRNGDA
-
Free a variable.
Plot a Function
Formatted print facility.
Generalized Additive Model Estimation
Calculate function for a ARCH/GARCH model.
Estimate ARCH/GARCH model.
Gets a variable from b34s.
Gets a data from a b34s DFM file.
Gets a key
Gets data from matlab.
Gets a File name
Get Name of a Matrix Variable
Reads RATS Portable file.
Reads SCA FSAVE and MAD portable files.
LU factorization of n by m matrix
Inverse of General Matrix using LAPACK
Solve Linear Equations system using LAPACK
High Resolution graph.
Multi-Pass Graphics Programing Capability
Set Character Set for Graphics.
Graph replay and reformat command.
Tests output of a ARCH/GARCH Model
Save Objects in GAUSS Format using one file
Save objects in GAUSS format using two files
Turn on header
Concert hex to a character representation.
Hinich 1982 Nonlinearity Test.
Hinich 1996 Nonlinearity Test.
Hodrick-Prescott Filter.
Place data in a structure.
Get actual length of a buffer of character data
Close a file that was used for Binary I/O
Open a File for Binary I/O
Reads from a binary file into Character*1 array
Reads from a binary file into Real*8 array
Position Binary read/write pointer
Write noncharacter buffer on a binary file
Write character buffer on a binary file
Parse a token using B34S11 parser
Determine number of bites in a file
Fill a string with a character
Obtain ichar info on a character buffer
Get character from ichar value
Left/Right/center a string
Move bites from one location to another
Locate a substring in a string - 200 length max
Lower case a string
- 200 length max
Convert next value in string to real*8 variable
Convert next value in string to real*4 variable
Extract next blank deliminated sub-string from a
string
Convert next value in a string to integer.
Convert integer to string using format
Impulse Response Functions of VAR Model
Convert real*8 value to string using format
Convert string to real*8
Convert string to integer
Upper case a string
- 200 length max
Initializes the table used by shuffled generators.
Get the table used in the shuffled generators.
Uniform (0,1) Generator
Random Normal Distribution
Random numbers from beta distribution
Random numbers from Chi-squared distribution
Random numbers from Cauchy distribution
Random numbers from standard exponential
Random numbers from mixture of two exponential
distributions
Random numbers from standard gamma distribution
Random numbers from general continuous distribution
Random integers from discrete distribution alias
Chapter 16
I_DRNGDT
I_DRNLNL
I_DRNMVN
I_DRNNOA
I_DRNNOR
I_DRNSTA
I_DRNTRI
I_DRNSPH
I_DRNVMS
I_DRNWIB
I_RNBIN
I_RNGET
I_RNOPG
I_RNOPT
I_RNSET
I_RNGEO
I_RNHYP
I_RNMTN
I_RNNBN
I_RNPER
I_RNSRI
KEENAN
KSWTEST
KSWTESTM
LAGMATRIX
LAGTEST
LAGTEST2
LAPACK
LM
LOAD
LOADDATA
LPMAX
LPMIN
LRE
MAKEDATA
MAKEFAIR
MAKEGLOBAL
MAKELOCAL
MAKEMATLAB
MAKEMAD
MAKERATS
MAKESCA
MANUAL
MARS
MARSPLINE
MARS_VAR
MAXF1
MAXF2
MAXF3
MELD
MENU
MESSAGE
MINIMAX
MISSPLOT
MQSTAT
MVNLTEST
NAMES
NLEQ
NLLSQ
NL2SOL
NLPMIN1
NLPMIN2
NLPMIN3
NLSTART
NOHEADER
OLSQ
OLSPLOT
OPEN
OUTDOUBLE
OUTINTEGER
-
approach
Random integers from discrete using table lookup
Random numbers from lognormal distribution
Random numbers from multivariate normal
Random normal numbers using acceptance/rejection
Random normal numbers using CDF method
Random numbers from stable distribution
Random numbers from triangular distribution
Random numbers on the unit circle
Random numbers from Von Mises distribution
Random numbers from Weibull distribution
Random integers from binomial distribution
Gets seed used in IMSL Random Number generators.
Gets the type of generator currently in use.
Selects the type of uniform (0,1) generator.
Sets seed used in IMSL Random Number generators.
Random integers from Geometric distribution
Random integers from Hypergeometric distribution.
Random numbers from multinomial distribution
Negative binomial distribution
Random perturbation of integers
Index of random sample without replacement
Keenan Nonlinearity test
K Period Stock Watson Test
Moving Period Stock Watson Test
Builds Lag Matrix.
3-D Graph to display RSS for OLS Lags
3-D Graph to display RSS for MARS Lags
Sets Key LAPACK parameters
Engle Lagrange Multiplier ARCH test.
Load a Subroutine from a library.
Load Data from b34s into MATRIX command.
Solve Linear Programming maximization problem.
Solve Linear Programming minimization problem.
McCullough Log Relative Error
Place data in a b34s data loading structure.
Make Fair-Parke Data Loading File
Make a variable global (seen at all levels).
Make a variable seen at only local level.
Place data in a file to be loaded into Matlab.
Makes SCA *.MAD datafile from vectors
Make RATS portable file.
Make SCA FSV portable file.
Place MATRIX command in manual mode.
Multivariate Autoregressive Spline Models
Updated MARS Command using Hastie-Tibshirani code
Joint Estination of VAR Model using MARS Approach
Maximize a function using IMSL ZXMIN.
Maximize a function using IMSL DUMINF/DUMING.
Maximize a function using simplex method (DU2POL).
Form all possible combinations of vectors.
Put up user Menu for Input
Put up user message and allow a decision.
Estimate MINIMAX with MAXF2
Plot of a series with Missing Data
Multivariate Q Statistic
Multivariate Third Order Hinich Test
List names in storage.
Jointly solve a number of nonlinear equations.
Nonlinear Least Squares Estimation.
Alternative Nonlinear Least Squares Estimation.
Nonlinear Programming fin. diff.
grad. DN2CONF.
Nonlinear Programming user supplied grad. DN2CONG.
Nonlinear Programming user supplied grad. DN0ONF.
Generate starting values for NL routines.
Turn off header.
Estimate OLS, MINIMAX and L1 models.
Plot of Fitted and Actual Data & Res
Open a file and attach to a unit.
Display a Real*8 value at a x, y on screen.
Display an Integer*4 value at a x, y on screen.
9
10
Matrix Command Language
OUTSTRING
PCOPY
PERMUTE
PISPLINE
PLOT
POLYFIT
POLYVAL
POLYMCONV
POLYMDISP
POLYMINV
POLYMMULT
PP
PRINT
PRINTALL
PRINTOFF
PRINTON
PRINTVASV
PRINTVASCMAT
PRINTVASRMAT
PROBIT
PVALUE_1
PVALUE_2
PVALUE_3
QPMIN
QUANTILE
READ
REAL16INFO
REAL16OFF
REAL16ON
REAL32OFF
REAL32ON
REAL32_VPA
RESET
RESET77
RESTORE
-
RTEST
RTEST2
REVERSE
-
REWIND
ROTHMAN
RMATLAB
RRPLOTS
RRPLOTS2
RUN
SAVE
SCHUR
SCREENCLOSE
SCREENOPEN
SCREENOUTOFF
SCREENOUTON
SET
SETCOL
SETLABEL
SETLEVEL
SETNDIMV
SETROW
SETTIME
SETWINDOW
SIGD
SIMULATE
SMOOTH
SOLVEFREE
SORT
SPECTRAL
STEPWISE
STOP
SUBRENAME
SUSPEND
SWARTEST
-
Display a string value at a x, y point on screen.
Copy an object from one pointer address to another
Reorder Square Matrix
Pi Spline Nonlinear Model Building
Line-Printer Graphics
Fit an nth degree polynomial
Evaluate an nth degree polynomial
Convert storage of a polynomial matrix
Display/Extract a polynomial matrix
Invert a Polynomial Matrix
Multiply a Polynomial Matrix
Calculate Phillips Peron Unit Root test
Print text and data objects.
Lists all variables in storage.
Turn off Printing
Turn on Printing (This is the default)
Resets so that vectors/arrays print as vectors/arrays
Vectors/Arrays print as Column Matrix/Array
Vectors/Arrays print as Row Matrix/Array
Estimate Probit (0-1) Model.
Present value of $1 recieved at end of n years
Present value of an Annuity of $1
Present value of $1 recieved throughout year
Quadratic Programming.
Calculate interquartile range.
Read data directly into MATRIX workspace from a file.
Obtain Real16 info
Turn off Real16 add
Turn on extended accuracy
Turn off Real32 add
Turn on extended accuracy for real*16
Turn on extended accuracy for real*16 using vpa
Calculate Ramsey (1969) regression specification test.
Thursby - Schmidt Regression Specification Test
Load data back in MATRIX facility from external save
file.
Test Residuals of Model
Test Residuals of Model - No RES and Y Plots
Test a real*8 vector for reversibility in Freq.
Domain
Rewind logical unit.
Test a real*8 vector for reversibility in Time Domain
Runs Matlab
Plots Recursive Residual Data
Plots Recursive Residual Coef
Terminates the matrix command being in "manual" mode.
Save current workspace in portable file format.
Performs Schur decomposition
Turn off Display Manager
Turn on Display Manager
Turn screen output off.
Turn screen output on.
Set all elements of an object to a value.
Set column of an object to a value.
Set the label of an object.
Set level.
Sets an element in an n dimensional object.
Set row of an object to a value.
Sets the time info in an existing series
Set window to main(1), help(2) or error(3).
Set print digits. Default g16.8
Dynamically Simulate OLS Model
Do exponential smoothing.
Set frequency of freeing temp variables.
Sort a real vector.
Spectral analysis of a vector or 1d array.
Stepwise OLS Regression
Stop execution of a matrix command.
Internally rename a subroutine.
Suspend loading and Execuiting a program
Stock-Watson VAR Test
Chapter 16
SYSTEM
TABULATE
TESTARG
TIMER
TRIPLES
TSAY
TSLINEUP
TSD
VAREST
VPASET
VOCAB
WRITE
-
Issue a system command.
List vectors in a table.
Lists what is passed to a subroutine or function.
Gets CPU time.
Calculate Triples Reversability Test
Calculate Tsay nonlinearity test.
Line up Time Series Data
Interface to TSD Data set
VAR Modeling
Set Variable Precision Math Options
List built-in subroutine vocabulary.
Write an object to an external file.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Matrix Command Built-In Function Vocabulary
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ACF
AFAM
ARGUMENT
ARRAY
BETAPROB
BINDF
BINPR
BOOTI
BOOTV
BOXCOX
BSNAK
BSOPK
BSINT
BSINT2
BSINT3
BSDER
BSDER2
BSDER3
BSITG
BSITG2
BSITG3
C1ARRAY
C8ARRAY
CATCOL
CATROW
CCF
-
CHAR
CHARDATE
CHARDATEMY
CHARTIME
CHISQPROB
CHTOR
CHTOHEX
CFUNC
COMB
COMPLEX
CSPLINEFIT
CSPLINE
CSPLINEVAL
CSPLINEDER
CSPLINEITG
CUSUM
CUSUMSQ
CWEEK
DABS
DARCOS
DARSIN
DATAN
DATAN2
DATENOW
DBLE
-
Calculate autocorrelation function of a 1d object.
Change a matrix or vector to an array class object.
Unpack character argument at run-time
Define a 1d or 2d array.
Calculate a beta probability.
Evaluate Binomial Distribution Function
Evaluate Binomial Probability Function
Calculate integers to be used with bootstrap.
Bootstraps a vector with replacement.
Box-Cox Transformation of a series given lamda.
Compute Not a Knot Sequence
Compute optimal spline knot sequence
Compute 1-D spline interpolant given knots
Compute 2-D spline interpolant given knots
Compute 3-D spline interpolant given knots
Compute 1-D spline values/derivatives given knots
Compute 2-D spline values/derivatives given knots
Compute 3-D spline values/derivatives given knots
Compute 1-D spline integral given knots
Compute 2-D spline integral given knots
Compute 3-D spline integral given knots
Create a Character*1 array
Create a Character*8 array
Concatenates an object by columns.
Concatenates an object by rows.
Calculate the cross correlation function on two
objects.
Convect an integer in range 0-127 to character.
Convert julian variable into character date dd\mm\yy.
Convert julian variable into character data mm\yyyy.
Converts julian variable into character date hh:mm:ss
Calculate chi-square probability.
Convert a character variable to a real variable.
Convert a character to its hex representation.
Call Function
Combination of N objects taken M at a time.
Build a complex variable from two real*8 variables.
Fit a 1 D Cubic Spline using alternative models
Calculate a cubic spline for 1 D data
Calculate spline value given spline
Calculate spline derivative given spline value
Calculate integral of a cubic spline
Cumulative sum.
Cumulative sum squared.
Name of the day in character.
Absolute value of a real*8 variable.
Arc cosine of a real*8 variable.
Arc sine of a real*8 variable.
Arc tan of a real*8 variable.
Arc tan of x / y. Signs inspected.
Date now in form dd/mm/yy
Convert real*4 to real*8.
11
12
Matrix Command Language
DCONJ - Conjugate of complex argument.
DCOS - Cosine of real*8 argument.
DCOSH - Hyperbolic cosine of real*8 argument.
DDOT - Inner product of two vectors.
DERF - Error function of real*8/real*16 argument.
DERFC - Complementary error function.
DERIVATIVE - Analytic derivative of a vector.
DET - Determinant of a matrix.
DEXP - Exponential of a real*8 argument.
DFLOAT - Convert integer*4 to real*8.
DGAMMA - Gamma function of real*8 argument.
DIAG - Place diagonal of a matrix in an array.
DIAGMAT - Create diagonal matrix.
DIF - Difference a series.
DINT - Extract integer part of real*8 number.
DNINT - Extract nearest integer part of real*8 number.
DIVIDE - Divide with an alternative return.
DLGAMMA - Natural log of gamma function.
DLOG - Natural log.
DLOG10 - Base 10 log.
DMAX - Largest element in an array.
DMAX1 - Largest element between two arrays.
DMIN - Smallest element in an array.
DMIN1 - Smallest element between two arrays.
DMOD - Remainder.
DROPFIRST - Drops observations on top of an array.
DROPLAST - Drops observations on bottom of an array.
DSIN - Calculates sine.
DSINH - Hyperbolic sine.
DSQRT - Square root of real*8 or complex*16 variable.
DTAN - Tangent.
DTANH - Hyperbolic tangent.
EIGENVAL - Eigenvalue of matrix. Alias EIG.
EPSILON - Positive value such that 1.+x ne 1.
EVAL - Evaluate a character argument.
EXP - Exponential of real*8 or complex*16 variable.
EXTRACT - Extract elements of a character*1 variable.
FACT - Factorial.
FDAYHMS - Gets fraction of a day.
FFT - Fast fourier transform.
FIND - Finds location of a character string.
FLOAT - Converts integer*4 to real*4.
FPROB - Probability of F distribution.
FREQ - Gets frequency of a time series.
FRACDIF - Fractional differencing.
FYEAR - Gets fraction of a year from julian date.
GENARMA - Generate an ARMA series given parameters.
GETDAY - Obtain day of year from julian series.
GETHOUR - Obtains hour of the day from julian date.
GETNDIMV - Obtain value from an n dimensional object.
GETMINUTE - Obtains minute of the day from julian date.
GETMONTH - Obtains month from julian date.
GETQT - Obtains quarter of year from julian date.
GETSECOND - Obtains second from julian date.
GETYEAR - Obtains year.
GOODCOL - Deletes all columns where there is missing data.
GOODROW - Deletes all rows where there is missing data.
GRID - Defines a real*8 array with a given increment.
HUGE - Largest number of type.
HYPDF - Evaluate Hypergeometric Distribution Function.
HYPPR - Evaluate Hypergeometric Probability Function.
INTEGER8 - Load an Integer*8 object from a string.
I4TOI8 - Move an object from integer*4 to integer*8.
I8TOI4 - Move an object from integer*8 to integer*4.
ICHAR - Convert a character to integer in range 0-127.
ICOLOR - Sets Color numbers. Used with Graphp.
IDINT - Converts from real*8 to integer*4.
IDNINT - Converts from real*8 to integer*4 with rounding.
INFOGRAPH - Obtain Interacter Graphics INFO.
IMAG - Copy imaginary part of complex*16 number into real*8.
IAMAX - Largest abs element in 1 or 2D object.
Chapter 16
IAMIN - Smallest abs element in 1 or 2D object.
IMAX - Largest element in 1 or 2D object.
IMIN - Smallest element in 1 or 2D object.
INDEX - Define integer index vector, address n dimensional object.
INLINE - Inline creation of a program.
INT - Copy real*4 to integer*4.
INTEGERS - Generate an integer vector with given interval.
INV - Inverse of a real*8 or complex*16 matrix.
INVBETA - Inverse beta distribution.
INVCHISQ - Inverse Chi-square distribution.
INVFDIS - Inverse F distribution.
INVTDIS - Inverse t distribution.
IQINT - Converts from real*16 to integer*4.
IQNINT - Converts from real*16 to integer*4 with rounding.
ISMISSING - Sets to 1.0 if variable is missing.
IWEEK - Sets 1. for Monday etc.
JULDAYDMY - Given day, month, year gets julian value.
JULDAYQY - Given quarter and year gets julian value.
JULDAYY - Given year gets julian value.
KEEPFIRST - Given k, keeps first k observations.
KEEPLAST - Given k, keeps last k observations.
KIND - Returns kind of an object in integer.
KINDAS - Sets kind of second argument to kind of first argument.
KLASS - Returns klass of an object in integer.
KPROD - Kronecker Product of two matrices.
LABEL - Returns label of a variable.
LAG - Lags variable. Missing values propagated.
LEVEL - Returns current level.
LOWERT - Lower triangle of matrix.
MCOV - Consistent Covariance Matrix.
MAKEJUL - Make a Julian date from a time series.
MASKADD - Add if mask is set.
MASKSUB - Subtract if mask is set.
MATRIX - Define a matrix.
MEAN - Average of a 1d object.
MEDIAN - Median of a real*8 object.
MFAM - Set 1d or 2d array to vector or matrix.
MISSING - Returns missing value.
MLSUM - Sums log of elements of a 1d object.
MOVELEFT - Moves elements of character variable left.
MOVERIGHT - Moves elements of character variable right.
NAMELIST - Creates a namelist.
NEAREST - Nearest distinct number of a given type.
NCCHISQ - Non central chi-square probability.
NOCOLS - Gets number of columns of an object.
NOELS - Gets number of elements in an object.
NORMDEN - Normal density.
NORMDIST - 1-norm, 2-norm and i-norm distance.
NOROWS - Gets number of rows of an object.
NOTFIND - Location where a character is not found.
OBJECT - Put together character objects.
PDFAC - Cholesky factorization of PD matrix.
PDFACDD - Downdate Cholesky factorization.
PDFACUD - Update Cholesky factorization.
PDINV - Inverse of a PD matrix.
PDSOLV - Solution of a PD matrix given right hand side.
PI - Pi value.
PINV - Generalized inverse.
PLACE - Places characters inside a character array.
POIDF - Evaluate Poisson Distribution Function.
POIPR - Evaluate Poisson Probability Function.
POINTER - Machine address of a variable.
POLYDV - Division of polynomials.
POLYMULT - Multiply two polynomials.
POLYROOT - Solution of a polynomial.
PROBIT - Inverse normal distribution.
PROBNORM - Probability of normal distribution.
PROBNORM2 - Bivariate probability of normal distribution.
PROD - Product of elements of a vector.
Q1 - Q1 of a real*8 object.
Q3 - Q3 of a real*8 object.
QCOMPLEX - Build complex*32 variable from real*16 inputs.
QINT - Extract integer part of real*16 number.
QNINT - Extract nearest integer part of real*16 number.
QREAL - Obtain real*16 part of a complex*32 number.
QR - Obtain Cholesky R via QR method using LAPACK.
QRFAC - Obtain Cholesky R via QR method.
QRSOLVE - Solve OLS using QR.
RANKER - Index array that ranks a vector.
RCOND - 1 / Condition of a Matrix.
REAL - Obtain real*8 part of a complex*16 number.
R8TOR16 - Convert Real*8 to Real*16.
R16TOR8 - Convert Real*16 to Real*8.
REAL16 - Input a Real*16 Variable.
REC - Rectangular random number.
RECODE - Recode a real*8 or character*8 variable.
RN - Normally distributed random number.
ROLLDOWN - Moves rows of a 2d object down.
ROLLLEFT - Moves cols of a 2d object left.
ROLLRIGHT - Moves cols of a 2d object right.
ROLLUP - Moves rows of a 2d object up.
RTOCH - Copies a real*8 variable into character*8.
SEIGENVAL - Eigenvalues of a symmetric matrix. Alias SEIG.
SEXTRACT - Takes data out of a field.
SFAM - Creates a scalar object.
SNGL - Converts real*8 to real*4.
SPACING - Absolute spacing near a given number.
SPECTRUM - Returns spectrum of a 1d object.
SUBSET - Subset 1d, 2d array, vector or matrix under a mask.
SUBMATRIX - Define a Submatrix.
SUM - Sum of elements.
SUMCOLS - Sum of columns of an object.
SUMROWS - Sum of rows of an object.
SUMSQ - Sum of squared elements of an object.
SVD - Singular value decomposition of an object.
TIMEBASE - Obtains time base of an object.
TIMENOW - Time now in form hh:mm:ss.
TIMESTART - Obtains time start of an object.
TINY - Smallest number of type.
TDEN - t distribution density.
TO_RMATRIX - Convert Object to Row-Matrix.
TO_CMATRIX - Convert Object to Col-Matrix.
TO_RARRAY - Convert Object to Row-Array.
TO_CARRAY - Convert Object to Col-Array.
TO_VECTOR - Convert Object to Vector.
TO_ARRAY - Convert Object to Array.
TPROB - t distribution probability.
TRACE - Trace of a matrix.
TRANSPOSE - Transpose of a matrix.
UPPERT - Upper Triangle of matrix.
VARIANCE - Variance of an object.
VECTOR - Create a vector.
VFAM - Convert a 1d array to a vector.
VOCAB - List built in functions.
VPA - Variable Precision Math calculation.
ZDOTC - Conjugate product of two complex*16 objects.
ZDOTU - Product of two complex*16 objects.
ZEROL - Zero lower triangle.
ZEROU - Zero upper triangle.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Matrix Programming Language key words
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CALL - Call a subroutine.
CONTINUE - Target of a go to statement.
DO - Starts a do loop.
DOWHILE - Starts a dowhile loop.
NEXT i - End of a do loop.
ENDDO - End of a do loop.
ENDDOWHILE - End of a dowhile loop.
END - End of a program, function or subroutine.
EXITDO - Exit a DO loop.
EXITIF - Exit an IF statement.
FOR - Start a do loop.
FORMULA - Define a recursive formula.
GO TO - Transfer statement.
FUNCTION - Beginning of a function.
IF( ) - Beginning of an IF structure.
ENDIF - End of an IF( )THEN structure.
PROGRAM - Beginning of a program.
RETURN - Next to last statement before end.
RETURN( ) - Returns the result of a function.
SOLVE - Solve a recursive system.
SUBROUTINE - Beginning of subroutine.
WHERE( ) - Starts a where structure.
Within the B34S Display Manager, individual help is available on each command.
Usually the help document shows an example. In addition, for each command an example that
can be run from the Tasks menu is provided in the file matrix.mac. Users are encouraged to
cut and paste the commands from these help documents and example files to create their custom
programs. Full documentation for the matrix command can be obtained from the display
manager or by running the command

b34sexec help=matrix; b34srun;

Since subroutine libraries and help libraries are text files, users can easily add examples and
help entries from their own applications or build libraries of custom procedures. The help file for the
schur command, which is shown next, provides an example of the online documentation which
is available for all matrix command keywords:
SCHUR
-
Performs Schur decomposition
call schur(a,s,u);
factors real*8 matrix A such that
A=U*S*transpose(U)
and S is upper triangular.
For complex*16 the equation is
A=U*S*transpose(dconj(U))
U is an orthogonal matrix such that
for real*8
u*transpose(u) = I
Eigenvalues of A are along diagonal of
S.
An optional calling sequence for real*8 is
call schur(a,s,z,wr,wi);
where wr and wi are the real and
imaginary parts, respectively, of
the computed eigenvalues in the same order
that they appear on the diagonal of the
output Schur form s. Complex conjugate pairs
of eigenvalues will appear consecutively with
the eigenvalue having the positive imaginary
part first.
Optional calling sequence for complex*16 is
call schur(a,s,z,w);
where w contains the complex eigenvalues.
The Schur decomposition can be performed on
many real*8 and complex*16 matrices for which
eigenvalues cannot be found. For detail see
MATLAB manual page 4-36.
The schur command uses the lapack version 3 routines
dgees and zgees.
Example:
b34sexec matrix;
* Example from MATLAB - General Matrix;
a=matrix(3,3: 6., 12., 19.,
             -9., -20., -33.,
              4.,  9., 15.);
call schur(a,s,u);
call print(a,s,u);
is_ident=u*transpose(u);
is_a    =u*s*transpose(u);
* Positive Def. case ;
aa=transpose(a)*a;
call schur(aa,ss,uu);
ee=eigenval(aa);
call print(aa,ss,uu,ee);
* Expanded calls;
call schur(a,s,u,wr,wi);
call print('Real and Imag eigenvalues');
call tabulate(wr,wi);
* Testing Properties;
call print(is_a,is_ident);
* Random Problem ;
n=10;
a=rn(matrix(n,n:));
call schur(a,s,u);
call print(a,s,u);
is_ident=u*transpose(u);
is_a    =u*s*transpose(u);
call schur(a,s,u,wr,wi);
call print('Real and Imag eigenvalues');
call tabulate(wr,wi);
call print(is_a,is_ident);
* Complex case ;
a=matrix(3,3: 6., 12., 19.,
             -9., -20., -33.,
              4.,  9., 15.);
ca=complex(a,2.*a);
call schur(ca,cs,cu,cw);
call print(ca,cs,cu,'Eigenvalues two ways',
cw,eigenval(ca));
is_ca=cu*cs*transpose(dconj(cu));
call print(is_ca);
b34srun;
When run this example produces edited output:
B34S(r) Matrix Command. d/m/y 29/ 6/07. h:m:s 9:15:23.

=>  * EXAMPLE FROM MATLAB - GENERAL MATRIX$
=>  A=MATRIX(3,3: 6., 12., 19., -9., -20., -33., 4., 9., 15.)$
=>  CALL SCHUR(A,S,U)$
=>  CALL PRINT(A,S,U)$

A        = Matrix of 3 by 3 elements

                1              2              3
  1       6.00000        12.0000        19.0000
  2      -9.00000       -20.0000       -33.0000
  3       4.00000        9.00000        15.0000

S        = Matrix of 3 by 3 elements

                1              2              3
  1      -1.00000        20.7846       -44.6948
  2       0.00000        1.00000      -0.609557
  3       0.00000        0.00000        1.00000

U        = Matrix of 3 by 3 elements

                1              2              3
  1     -0.474100       0.664753       0.577350
  2      0.812743   0.782061E-01       0.577350
  3     -0.338643      -0.742959       0.577350

=>  IS_IDENT=U*TRANSPOSE(U)$
=>  IS_A    =U*S*TRANSPOSE(U)$
=>  * POSITIVE DEF. CASE $
=>  AA=TRANSPOSE(A)*A$
=>  CALL SCHUR(AA,SS,UU)$
=>  EE=EIGENVAL(AA)$
=>  CALL PRINT(AA,SS,UU,EE)$

AA       = Matrix of 3 by 3 elements

                1              2              3
  1       133.000        288.000        471.000
  2       288.000        625.000        1023.00
  3       471.000        1023.00        1675.00

SS       = Matrix of 3 by 3 elements

                1              2              3
  1       2432.40   0.333414E-12  -0.852649E-13
  2       0.00000       0.599956  -0.810500E-13
  3       0.00000        0.00000   0.685245E-03

UU       = Matrix of 3 by 3 elements

                1              2              3
  1     -0.233460      -0.842147       0.486091
  2     -0.506875      -0.321212      -0.799938
  3     -0.829804       0.433141       0.351873

EE       = Complex Vector of 3 elements

(  2432.    ,   0.000     )  (  0.6000    ,   0.000     )  (  0.6852E-03,   0.000     )
Note that the diagonal of SS contains the eigenvalues shown in EE
=>  * EXPANDED CALLS$
=>  CALL SCHUR(A,S,U,WR,WI)$
=>  CALL PRINT('Real and Imag eigenvalues')$

Real and Imag eigenvalues

=>  CALL TABULATE(WR,WI)$

 Obs          WR          WI
   1      -1.000       0.000
   2       1.000       0.000
   3       1.000       0.000

=>  * TESTING PROPERTIES$
=>  CALL PRINT(IS_A,IS_IDENT)$
A is recovered from the Schur factorization and U'U = I. To save space the random problem is
not shown.
IS_A     = Matrix of 3 by 3 elements

                1              2              3
  1       6.00000        12.0000        19.0000
  2      -9.00000       -20.0000       -33.0000
  3       4.00000        9.00000        15.0000

IS_IDENT = Matrix of 3 by 3 elements

                1              2              3
  1       1.00000   0.555112E-16   0.555112E-16
  2  0.555112E-16        1.00000   0.555112E-16
  3  0.555112E-16   0.555112E-16        1.00000
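The properties stated in the schur help file can be cross-checked outside B34S. The sketch below is Python, assuming NumPy and SciPy are available (it is not B34S code); scipy.linalg.schur wraps the same LAPACK dgees routine that the schur command uses.

```python
# Hedged cross-check of the schur example, in Python rather than B34S.
import numpy as np
from scipy.linalg import schur  # wraps LAPACK dgees, as the schur command does

A = np.array([[ 6.0,  12.0,  19.0],
              [-9.0, -20.0, -33.0],
              [ 4.0,   9.0,  15.0]])

# Real Schur form: S is (quasi-)upper triangular, U is orthogonal.
S, U = schur(A)

assert np.allclose(U @ S @ U.T, A)        # A = U*S*transpose(U)
assert np.allclose(U @ U.T, np.eye(3))    # u*transpose(u) = I
# The eigenvalues of A (-1, 1, 1) lie on the diagonal of S.
assert np.allclose(sorted(np.diag(S)), [-1.0, 1.0, 1.0])
```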
16.2 Overview of Nonlinear Capability
The B34S matrix command contains a number of nonlinear commands that allow the
user to specify a model in a 4th generation language while performing the calculation using
compiled code. Chapter 11 discussed nonlinear least squares and a number of
maximization/minimization examples. In many cases a matrix command uses routines from the
commercially available IMSL subroutine library, LINPACK, LAPACK, EISPACK or
FFTPACK. For nonlinear modeling applications users of the stand-alone IMSL product would
have to license a Fortran compiler, write the model and main program in Fortran, build routines
to display the results and compile all code each time a model needed to be estimated. In contrast,
the B34S implementation allows the user to specify the model in a 4th generation language and
further process the results from within a general programming language. Optionally it is possible
to view the solution progress from a GUI. Grouping the nonlinear capability of the matrix
command by function (with some overlap) and showing the underlying routines used, there are:
Constrained maximization commands:
CMAXF1 - Constrained maximization of function using zxmwd.
CMAXF2 - Constrained maximization of function using dbconf/g.
CMAXF3 - Constrained maximization of function using db2pol.
Unconstrained maximization commands:
MAXF1 - Maximize a function using IMSL ZXMIN.
MAXF2 - Maximize a function using IMSL DUMINF/DUMING.
MAXF3 - Maximize a function using simplex method (DU2POL).
Linear and non-linear programming commands:
LPMAX - Solve Linear Programming maximization problem.
LPMIN - Solve Linear Programming minimization problem.
NLEQ - Jointly solve a number of nonlinear equations.
NLPMIN1 - Nonlinear Programming fin. diff. grad. DN2CONF.
NLPMIN2 - Nonlinear Programming user supplied grad. DN2CONG.
NLPMIN3 - Nonlinear Programming user supplied grad. DN0ONF.
Nonlinear least squares and utility commands:
BGARCH - Calculate function for a BGARCH model.
GARCH - Calculate function for a ARCH/GARCH model.
GARCHEST - Estimate a ARCH/GARCH Model.
NLEQ - Jointly solve a number of nonlinear equations.
NLLSQ - Nonlinear Least Squares Estimation.
NL2SO - Alternative Nonlinear Least Squares Estimation.
NLSTART - Generate starting values for NL routines.
QPMIN - Quadratic Programming.
SOLVEFREE - Set frequency of freeing temp variables.
Integration of a user function Commands:
DQDAG - Integrate a function using Gauss-Kronrod rules.
DQDNG - Integrate a smooth function using a nonadaptive rule.
DQDAGI - Integrate a function over infinite/semi-infinite interval.
DQDAGP - Integrate a function with singularity points given.
DQDAGS - Integrate a function with end point singularities.
DQAND - Multiple integration of a function.
Spline and Related Commands:
ACEFIT - Alternating Conditional Expectation Model Estimation.
BSNAK - Compute Not a Knot Sequence.
BSOPK - Compute optimal spline knot sequence.
BSINT - Compute 1-D spline interpolant given knots.
BSINT2 - Compute 2-D spline interpolant given knots.
BSINT3 - Compute 3-D spline interpolant given knots.
BSDER - Compute 1-D spline values/derivatives given knots.
BSDER2 - Compute 2-D spline values/derivatives given knots.
BSDER3 - Compute 3-D spline values/derivatives given knots.
BSITG - Compute 1-D spline integral given knots.
BSITG2 - Compute 2-D spline integral given knots.
BSITG3 - Compute 3-D spline integral given knots.
CSPLINEFIT - Fit a 1 D Cubic Spline using alternative models.
CSPLINE - Calculate a cubic spline for 1 D data.
CSPLINEVAL - Calculate spline value given spline.
CSPLINEDER - Calculate spline derivative given spline value.
CSPLINEITG - Calculate integral of a cubic spline.
GAMFIT - Generalized Additive Model Estimation.
MARS - Multivariate Autoregressive Spline Models.
PISPLINE - Pi Spline Nonlinear Model Building.
While space limits a full discussion of each command, examples from within each group
will be discussed briefly and illustrated using supplied problems in this Chapter and in Chapter
11 where the optimization and NLLS capability was discussed in some detail. The strength of the
nonlinear capability is that the user has great flexibility to specify the model in a B34S matrix
program. Once the model has been coded, the solution proceeds using the built-in command
which consists of compiled code.8 The user can optionally display on the screen the solution
process. Some of the applications in this area are shown next.
Integration is an important topic and a number of commands are available. For example
the command dqand provides solution of up to 20 integrals where the user specifies the model in
a matrix language program. Consider the problem
$\int_a^b \int_a^b \int_a^b e^{-(x_1^2 + x_2^2 + x_3^2)} \, dx_1 \, dx_2 \, dx_3$     (16.2-1)
where the ranges of integration are successively widened. The above problem can be solved with
b34sexec matrix;
* This is a big problem. Note maxsub 100000 ;
program test;
f=dexp(-1.*(x(1)*x(1)+x(2)*x(2)+x(3)*x(3)));
return;
end;
/$ We solve 6 problems – each with wider bounds.
/$ As constant => inf and => pi()**1.5
lowerv=array(3:);
upperv=array(3:);
x     =array(3:);
call print(test);
call echooff;
j=integers(3);
do i=1,6;
cc=dfloat(i)/2.0;
lowerv(j)=(-1.)*cc;
upperv(j)=      cc;
call dqand(f x :name test
           :lower lowerv
           :upper upperv
           :errabs .0001
           :errrel .001
           :maxsub 100000
           :print);
call print('lower set as ',cc:);
call print('results      ',%result:);
call print('error        ',%error:);
enddo;
call print('Limit answer ',pi()**1.5 :);
b34srun;

8 In contrast to the B34S design where the nonlinear commands are built into the language, the MATLAB
minimization commands fmin and fminbnd are totally written in the MATLAB 4th generation language. While the
user can see what is being calculated, the cost is that as the model is solved the MATLAB parser must crack each
statement in the command. This design substantially slows execution.
to produce answers for the range (–3., 3.) of:
Integration using DQAND

For Integral              1
Lower Integration value  -3.000000000000000
Upper Integration value   3.000000000000000
For Integral              2
Lower Integration value  -3.000000000000000
Upper Integration value   3.000000000000000
For Integral              3
Lower Integration value  -3.000000000000000
Upper Integration value   3.000000000000000

ERRABS set as             1.000000000000000E-04
ERRREL set as             1.000000000000000E-03
MAXSUB set as             100000

Result of Integration     5.567958983584796
Error estimate            3.054134012359100E-08
Limit answer              5.568327996831708
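A closed-form cross-check is possible here because the integrand in (16.2-1) factors into three identical one-dimensional Gaussian integrals, each equal to $\sqrt{\pi}\,\mathrm{erf}(3)$ over (-3., 3.). The following Python sketch (not B34S code) reproduces both numbers above from the standard library:

```python
# Cross-check of the DQAND answer for the range (-3., 3.): the triple
# integral of exp(-(x1^2+x2^2+x3^2)) is the cube of the one-dimensional
# integral of exp(-x^2), which math.erf gives in closed form.
import math

one_d = math.sqrt(math.pi) * math.erf(3.0)   # integral of e**(-x*x) on [-3, 3]
triple = one_d ** 3

assert abs(triple - 5.567958983584796) < 1e-6            # the DQAND result
assert abs(math.pi ** 1.5 - 5.568327996831708) < 1e-12   # the limit answer
```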
Spline models can be used to fit a model to data where the underlying function is not known.
Assume the integral

$\int_0^{0.5} \int_{0.5}^{1} \int_0^{1} (x^3 + xyz) \, dx \, dy \, dz$     (16.2-2)

which is very hard to evaluate. The setup
b34sexec matrix;
* Test Example from IMSL(10) ;
call echooff;
nxdata=21;
nydata=6;
nzdata=8;
kx=5;
ky=2;
kz=3;
i=integers(nxdata);
j=integers(nydata);
k=integers(nzdata);
xdata=dfloat(i-11)/10.;
ydata=dfloat(j-1)/5.;
zdata=dfloat(k-1)/dfloat(nzdata-1);
iimax=index(nxdata,nydata,nzdata:);
f=array(iimax:);
do ii=1,nxdata;
do jj=1,nydata;
do kk=1,nzdata;
ii3=index(nxdata,nydata,nzdata:ii,jj,kk);
f(ii3)=(xdata(ii)**3.) + (xdata(ii)*ydata(jj)*zdata(kk));
enddo;
enddo;
enddo;
xknot=bsnak(xdata,kx);
yknot=bsnak(ydata,ky);
zknot=bsnak(zdata,kz);
bscoef3=bsint3(xdata,ydata,zdata,f,xknot,yknot,zknot);
a=0.0;
b=1.0;
c=.5;
d=1.0;
e=0.0;
ff=.5;
val=bsitg3(a,b,c,d,e,ff,xknot,yknot,zknot,bscoef3);
g =.5*(b**4.-a**4.);
h =(b-a)*(b+a);
ri=g*(d-c);
rj=.5*h*(d-c)*(d+c);
exact=.5*(ri*(ff-e)+.5*rj*(ff-e)*(ff+e));
error=val-exact;
call print('Test of bsitg3 ***********************':);
call print('Lower 1  = ',a:);
call print('Upper 1  = ',b:);
call print('Lower 2  = ',c:);
call print('Upper 2  = ',d:);
call print('Lower 3  = ',e:);
call print('Upper 3  = ',ff:);
call print('Integral = ',val:);
call print('Exact    = ',exact:);
call print('Error    = ',error:);
b34srun;
allows solution of the above three dimensional problem without explicit knowledge of the
function. Note that for this test problem we generate the data, but as far as the bsitg3 command is
concerned, the function is not known. What is happening in the solution is that splines are used to
approximate the function and from these splines the integral can be calculated. If the exact
answer is known, the results can be tested. For this problem the answers were:
Test of bsitg3 ***********************
Lower 1  =   0.000000000000000E+00
Upper 1  =   1.000000000000000
Lower 2  =   0.5000000000000000
Upper 2  =   1.000000000000000
Lower 3  =   0.000000000000000E+00
Upper 3  =   0.5000000000000000
Integral =   8.593750000000001E-02
Exact    =   8.593750000000000E-02
Error    =   1.387778780781446E-17
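The exact value 8.59375E-02 that the program computes can be confirmed by hand: the integrand of (16.2-2) splits into an x**3 term and an x*y*z term, each a product of one-dimensional integrals. A Python sketch of that arithmetic (hypothetical, not B34S code):

```python
# Arithmetic behind the exact value: int x^3 = (b^4-a^4)/4 and
# int x = (b^2-a^2)/2, combined over the ranges (0,1), (.5,1), (0,.5).
a, b = 0.0, 1.0    # x range
c, d = 0.5, 1.0    # y range
e, f = 0.0, 0.5    # z range

term1 = (b**4 - a**4) / 4.0 * (d - c) * (f - e)      # x**3 contribution
term2 = ((b**2 - a**2) / 2.0) * ((d**2 - c**2) / 2.0) * ((f**2 - e**2) / 2.0)
exact = term1 + term2

assert abs(exact - 0.0859375) < 1e-15   # 11/128, matching the B34S output
```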
More extensive problems involving higher dimensions where the function is not explicitly
known can be solved with the mars, gamfit and pispline models, which are documented in
Chapter 14 and which can be run as procedures or as part of the matrix language.9 A simpler
problem that can easily be seen is $\int_0^1 x^3 \, dx = .25$. Its solution is found by:

9 Another option is the acefit command which will estimate an ACE model. The gamfit command estimates GAM
models. For further detail see Chapter 14.
b34sexec matrix;
* Test Example from IMSL(10) ;
call echooff;
ndata=21;
korder=5;
i     =integers(ndata);
xdata =dfloat(i-11)/10.;
f     =xdata**3.;
xknot =bsnak(xdata,korder);
bscoef=bsint(xdata,f,xknot);
a     =0.0;
b     =1.0;
val   =bsitg(a,b,xknot,bscoef);
* fi(x)= x**4./4.;
exact =(b**4./4.)-(a**4./4.);
error=exact-val;
call print('Test of bsitg ***********************':);
call print('Lower    = ',a:);
call print('Upper    = ',b:);
call print('Integral = ',val:);
call print('Exact    = ',exact:);
call print('Error    = ',error:);
b34srun;
Edited output is:
Test of bsitg ***********************
Lower    =   0.000000000000000E+00
Upper    =   1.000000000000000
Integral =   0.2500000000000001
Exact    =   0.2500000000000000
Error    =  -1.110223024625157E-16
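As a further cross-check on the bsitg result, $\int_0^1 x^3 dx = 1/4$ can also be recovered numerically; composite Simpson's rule on a 21-point grid over [0,1] is exact for cubics. A Python sketch (hypothetical, not B34S code):

```python
# Numerical cross-check of int_0^1 x**3 dx = 1/4 with composite Simpson's
# rule on 21 points; Simpson's rule is exact for polynomials up to degree 3.
n = 20                       # even number of panels -> 21 points
h = 1.0 / n
fs = [(i * h) ** 3 for i in range(n + 1)]
simpson = h / 3.0 * (fs[0] + fs[-1]
                     + 4.0 * sum(fs[1:-1:2])    # odd interior points
                     + 2.0 * sum(fs[2:-2:2]))   # even interior points

assert abs(simpson - 0.25) < 1e-12
```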
In both of the above examples the data was simulated by evaluation of a function. In most cases,
this is not possible. The power of the spline capability is that nonlinear models can be fit using
few data points. Since a spline model cannot forecast outside the range of the x variable, it would
appear that such models are of limited use. However if a spline model is fit, then values can be
interpolated and more observations can be generated. These observations can be used to fit a
nonlinear model. Since Chapter 11 contains extensive examples for nonlinear least squares and
maximization problems, these features will not be discussed further here. In the next sections we
discuss the matrix command language.
16.3 Rules of the Matrix Language
While the B34S help facility is the place to go for detailed instructions, the basic
structure of the matrix command can be illustrated by a number of examples and simple rules
shown next.
1. Command Form. The matrix command begins with the statement
b34sexec matrix;
and ends with the statement
b34srun;
All commands are between these two statements, unless the matrix command is running in
interactive manual mode under the Display Manager. This "manual mode" allows only one-line
commands to be specified.
2. Sentence Terminator. All matrix statements must end in $ or ;. For example:
x=dsin(q);
There is no continuation character needed and sentences can extend over many lines.
3. Assignment Issues. Mixed mode math is not allowed. For example assuming x is real*8
x=x*2;
is not allowed because x is real*8 and 2 is an integer*4. The reason mixed mode is not allowed is
that the processor would not know what to do with the result. This design is in contrast to many
languages that automatically create real*8 values. The correct form for the above statement is:
x=x*2.;
if real*8 results are desired or
x=idint(x)*2;
if you want an integer*4 result and x was real*8 before the command. The form
x=dint(x*2.);
truncates x*2. and places it in the real*8 variable x.
4. Structured Objects. Calculated structured objects can only be used on the right of an
expression or in a subroutine call as input. For example if x is a 2-D object
mm=mean(x(,3));
Chapter 16
25
calculates the mean of column 3 while
nn=mean(x(3,));
calculates the mean of row 3. The code
i=integers(2,30);
y(i)=x(i-1);
copies x elements from locations 1 through 29 into y locations 2 to 30. What is not allowed is
i=integers(1,29);
y(i+1)=x(i);
since it involves a calculated subscript on the left of the equals sign.
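The shifted copy above can be mirrored in Python (a hypothetical sketch, not B34S; Python offsets are zero-based where B34S indices are one-based):

```python
# Python mirror of the structured-index copy y(i)=x(i-1) for i=2..30:
# B34S indices 2..30 and 1..29 become zero-based offsets 1..29 and 0..28.
x = [float(k) for k in range(1, 31)]   # 30 sample values
y = [0.0] * 30

y[1:30] = x[0:29]                      # copy with a one-position shift

assert y[1] == x[0] and y[29] == x[28]
assert y[0] == 0.0                     # position 1 was never assigned
```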
5. Data storage issues. Since structured objects repackage the data, they cannot be used for
output from a subroutine or function. For example assume x is a 3 by 5 matrix. If we wanted to
take the log of a row or column, the correct code is
x(2,)=dlog(x(2,));
x(,3)=dlog(x(,3));
The code
subroutine dd(x);
x=log(x);
return;
end;
call dd(x(2,));
call dd(x(,3));
will not work since rows and columns are repackaged into vectors which do not line up with the
original storage of the matrix. If a user function is to be used, then logic such as
b34sexec matrix;
x=matrix(3,3:1 2 3 4 5 6 7 8 9);
call print(x);
function dd(x);
yy=dlog(x);
return(yy);
end;
x(2,)=dd(x(2,));
x(,3)=dd(x(,3));
call print(x);
b34srun;
should be used. However the above code has a "hidden" bug that impacts the x(2,3) element.
The reader should study what is happening. As a hint: the original x(2,3) term was 6 and the log of
6 = 1.7918. The value found in the x(2,3) position is .583179, or log(log(6)), because of the
first replacement, which might not be what is intended.
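The hint can be verified with a short Python sketch (hypothetical, not B34S) that mimics the two replacements on a 3 by 3 array:

```python
# Python mirror of the hidden bug: x(2,3) sits in both row 2 and column 3,
# so the two replacements apply log() to that element twice.
import math

x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

x[1] = [math.log(v) for v in x[1]]        # x(2,)=dd(x(2,));
for r in range(3):                        # x(,3)=dd(x(,3));
    x[r][2] = math.log(x[r][2])

assert abs(x[1][0] - math.log(4.0)) < 1e-12              # touched once
assert abs(x[1][2] - math.log(math.log(6.0))) < 1e-12    # touched twice
```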
6. Automatic Expansion. Structured objects can be used on the left of an assignment statement
to load data. To add another element to x use
x=3.0;
x(2)=4.0;
while to place 0.0 in column 2 use
x=rn(matrix(4,4:));
x(,2)=0.;
To place 99. in row 3
x(3,)=99.;
while to set element 3,2 of x to .77 use
x(3,2)=.77;
The following code shows advanced structured index processing. This code is available in
matrix.mac in overview_2.
/$ Illustrates Structural Index Processing
b34sexec matrix;
x =rn(matrix(6,6:));
y =matrix(6,6:);
yy =matrix(6,6:);
z =matrix(6,6:);
zz =matrix(6,6:);
i=integers(4,6);
j=integers(1,3);
xhold=x;
hold=x(,i);
call print('cols 4-6 x go to hold',x,hold);
y(i, )=xhold(j,);
call print('Rows 1-3 xhold in rows 4-6 y ',xhold,y);
y=y*0.0;
j2=xhold(j,);
y(i, )=j2;
call print('Rows 1-3 xhold in rows 4-6 y ',xhold,y);
z(,i)=xhold(,j);
call print('cols 1-3 xhold in cols 4-6 z ',xhold,z);
j55 =xhold(,j);
z=z*0.0;
z(,i)=j55;
call print('cols 1-3 xhold in cols 4-6 z ',xhold,z);
Chapter 16
yy=yy*0.0;
yy(i,)=xhold;
call print('rows 1-3 xhold in rows 4-6 yy',xhold,yy);
zz=zz*0.0;
do ii=1,3;
jj=ii+3;
zz(,jj)=xhold(ii,);
enddo;
/;
/;
i=integers(4,6);
j=integers(1,3);
call print('Note that zz(,i)=xhold(j,) will not work':);
call print('Testing zzalt(,i)= transpose(xhold(j,))':);
/; Use of Transpose speeds things up over do loop
zzalt=zz*0.0;
zzalt(,i)= transpose(xhold(j,)) ;
call print('rows 1-3 xhold in cols 4-6 zz',xhold,zz,zzalt);
zz=zz*0.0;
zzalt=zz;
do ii=1,3;
jj=ii+3;
zz(jj,)=xhold(,ii);
enddo;
call print('Note that zz(i,)=xhold(,j) will not work':);
call print('Testing zzalt(i,)= transpose(xhold(,j))':);
zzalt(i,)=transpose(xhold(,j));
call print('cols 1-3 xhold in rows 4-6 zz',xhold,zz,zzalt);
oldx=rn(matrix(20,6:));
newx=matrix(20,5:);
i=integers(4);
newx(,i)=oldx(,i);
call print('Col 1-4 in oldx goes to newx',oldx,newx);
oldx=rn(matrix(20,6:));
newx=matrix(20,5:);
i=integers(4);
newx(1,i)=oldx(1,i);
call print('This puts the first element in col ',oldx,newx);
newx=newx*0.0;
newx(i,1)=oldx(i,1);
call print('This puts the first element in row ',oldx,newx);
newx=newx*0.0;
newx( ,i)=oldx( ,i);
call print('Whole col copied here',oldx,newx);
oldx=rn(matrix(10,5:));
newx=matrix(20,5:);
i=integers(4);
newx(i,1)=oldx(i,1);
call print('This puts the first element in row ',oldx,newx);
newx=newx*0.0;
newx(i,)=oldx(i,);
call print('Whole row copied',oldx,newx);
* We subset a matrix here ;
a=rn(matrix(10,5:));
call print('Pull off rows 1-3, cols 2-4',
a,a(integers(1,3),integers(2,4)));
b34srun;
The reader is invited to run this sample program and inspect the results. Structured index
programming is compact and fast and should be used wherever possible. The do command is
provided for the cases, hopefully few and far between, when structured index processing is not
possible. In this example it is demonstrated that by use of the transpose after the structured
extract, a do loop is not required. Structured index programming takes care but can achieve
great gains by lowering the parser overhead implicit in a do loop.
7. Restrictions on the left hand side of an expression. Functions or math expressions are not
allowed on the left hand side of an equation. Assume the user wants to load another row. The
command
x(norows(x)+1,)=v;
in the sequence
x=matrix(3,3:1 2 3 4 5 6 7 8 9);
v=vector(3:22 33 44);
x(norows(x)+1,)=v;
will not work. The correct way to proceed is:
x=matrix(3,3:1 2 3 4 5 6 7 8 9);
v=vector(3:22 33 44);
n=norows(x)+1;
x(n,)=v;
to produce
 1.
 4.

 7.

 22.
2. 3.
5. 6. 
8. 9. 

33. 44. 
Note that the matrix, array and vector commands automatically convert integers to real*8 in
the one exception to rule 3 about mixed mode operations above. The command
x(i+1)=value;
will not work since there is a calculation implicit on the left. The correct code is:
j=i+1;
x(j)=value;
Advanced code includes:
b34sexec matrix display=col80medium;
x=matrix(3,3:1 2 3 4 5 6 7 8 9);
v=vector(:1 2 3 4 5 6 7 8 9);
xx=matrix(3,3:v);
/; Note that xx is saved by columns hence the elements
/; in xx2 repack into a 9 by 1 vector of the columns of x
/; xx3 is transpose(x)
xx2=matrix(9,1:xx);
xx3=matrix(3,3:xx2);
call print(x,v,xx,xx2,xx3);
b34srun;
X        = Matrix of 3 by 3 elements

                1              2              3
  1       1.00000        2.00000        3.00000
  2       4.00000        5.00000        6.00000
  3       7.00000        8.00000        9.00000

V        = Vector of 9 elements

  1.00000   2.00000   3.00000   4.00000   5.00000
  6.00000   7.00000   8.00000   9.00000

XX       = Matrix of 3 by 3 elements

                1              2              3
  1       1.00000        2.00000        3.00000
  2       4.00000        5.00000        6.00000
  3       7.00000        8.00000        9.00000

XX2      = Matrix of 9 by 1 elements

                1
  1       1.00000
  2       4.00000
  3       7.00000
  4       2.00000
  5       5.00000
  6       8.00000
  7       3.00000
  8       6.00000
  9       9.00000

XX3      = Matrix of 3 by 3 elements

                1              2              3
  1       1.00000        4.00000        7.00000
  2       2.00000        5.00000        8.00000
  3       3.00000        6.00000        9.00000
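The column-ordered repack can be mirrored in Python (a hypothetical sketch, not B34S; the fill order is inferred from the printed example above):

```python
# Python mirror of the repack: reading the 3 by 3 matrix column by column
# gives the 9 by 1 object, and refilling a 3 by 3 matrix from that list
# yields the transpose of the original, as the comments in the code note.
xx = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0],
      [7.0, 8.0, 9.0]]

# column-major read, as with matrix(9,1:xx)
xx2 = [xx[r][c] for c in range(3) for r in range(3)]
assert xx2 == [1.0, 4.0, 7.0, 2.0, 5.0, 8.0, 3.0, 6.0, 9.0]

# refill a 3 by 3 matrix from xx2, as with matrix(3,3:xx2)
xx3 = [[xx2[r * 3 + c] for c in range(3)] for r in range(3)]
assert xx3 == [[1.0, 4.0, 7.0], [2.0, 5.0, 8.0], [3.0, 6.0, 9.0]]
```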
1-D and 2-D objects can be concatenated using the catcol and catrow commands. If objects are of
unequal length, missing data will be supplied. Example files for catcol and catrow should be
run for further detail.
8. Matrix/Vector Math vs. Array Math. Matrix and array math is supported. If x is a 3 by 3
matrix, the command
ax=afam(x);
will create a 3 by 3 array ax. If x is a 3 by 1 array, the command
mx=vfam(x);
will create a 3 by 1 matrix mx containing x. To convert x to a vector, column by column use
vvnew=vector(:x);
Array math is element by element math, while matrix math uses linear algebra rules. If v is a
vector of 6 elements the command
newv=afam(v)*afam(v);
squares all elements while
p=v*v;
is the inner product or the sum of the elements squared. An important issue is how to handle
matrix/vector addition. If A and B are both n by m matrices, the command
c=a+b;
creates the n by m matrix C where C(i,j) = A(i,j) + B(i,j). As Greene (2000, 11) notes, "matrices cannot
be added unless they have the same dimensions, in which case they are said to be conformable
for addition." If A and B were vectors of length n, then C(i) = A(i) + B(i). If A is an n by n matrix, the
statement
c=a+2.;
creates C where C(i,i) = A(i,i) + 2. and C(i,j) = A(i,j) for i not equal to j. If A were a 2-D array, then
C(i,j) = A(i,j) + 2. If A were a 1-D object, then element-by-element math would be used. This
convention is similar to SPEAKEASY and in contrast to MATLAB, which for addition and
subtraction of scalars handles things as if the objects were arrays. In B34S, if a scalar is added to or
subtracted from an m by n matrix where m is not equal to n, an error message is given. For vectors we have
=>   VX=VECTOR(5:1 1 1 1 1)$

=>   CALL PRINT((VX+1.))$

         Vector of        5 elements

 2.00000      2.00000      2.00000      2.00000      2.00000

element by element operations. This is similar to MATLAB operations on n by 1 and 1 by n
objects, which are treated as if they were vectors.
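The array-math versus matrix-math distinction can be sketched in NumPy (an illustrative analogue, not B34S syntax): elementwise multiplication plays the role of afam math, while the inner product plays the role of v*v under matrix rules.

```python
import numpy as np

v = np.ones(6) * 2.0   # a vector of 6 elements, all 2.0

# Array math: element by element, the analogue of afam(v)*afam(v)
newv = v * v           # each element squared -> all 4.0

# Matrix math: inner product, the analogue of the B34S p=v*v
p = v @ v              # sum of the squared elements -> 24.0

print(newv[0], p)
```

The same data gives very different results under the two rules, which is why B34S forces the user to say which family (afam/vfam) an object belongs to.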
9. Keywords as Variable Names. Keywords should not be used as variable names. If they are
used, the command with that name is "turned off." This can cause unpredictable results with user
programs, subroutines and functions. Keyword names cannot conflict with user program,
subroutine or function names, since the user's code is not loaded unless a statement of the form
call load(name);
is given.
10. Passing Arguments. Subroutines and functions allow passing arguments which can be
changed. Structured index arrays cannot be changed (see rule 5 above). For example:
call myprog(x,y);
y=myfunc(x,y);
A complete example:
b34sexec matrix;
subroutine test(a);
call print('In routine test A= ',a);
* Reset a;
call character(a,'This is a very long string');
return;
end;
/$ pass in character*8
call test('junk');
call character(jj,'some junk going in');
call print(jj);
/$ pass in a character*1 array
call test(jj);
call print(jj);
b34srun;
Special characters such as : and | are not allowed in user subroutines or function calls because of
the difficulty of parsing these characters in the user routine. This restriction may change in future
versions of the matrix command if there is demand.
11. Coding assumptions. Statements such as:
y = x-z;
are allowed. Statements such as
y = -x+z;
will not work as intended. The error message will be "Cannot classify sentence Y ...". The
command should be given as
y = -1.*x + z;
or better still
y = (-1.*x) + z;
A statement
y = x**2;
where x is real*8 will get a mixed mode message and should be given as
y = x**2.;
Complex statements such as
yhat = b1*dexp(-b2*x)+ b3*dexp(-(x-b4)**2./b5**2.)
+ b6*dexp(-(x-b7)**2./b8**2.);
will not work as intended; it should have ( ) around the power expressions and an explicit -1.* :
yhat = b1*dexp(-1.0*b2*x)+ b3*dexp(-1.0*((x-b4)**2.)/(b5**2.))
+ b6*dexp(-1.0*((x-b7)**2.)/(b8**2.));
It is a good plan to use ( ) to make sure what is calculated was what is intended.
Examples of matrix language statements:
The statement
y=dsin(x);
is an analytic statement that creates the structured object y by taking the sine of the structured
object x. The variable x can be a scalar, a 1-D object (array, vector) or a 2-D object (matrix, array).
The following code copies elements 5-10 of y to x(2),...,x(7)
i=integers(5,10);
j=i-3;
x(j)=y(i);
and is much faster than the scalar implementation
do j=2,7;
x(j)=y(j+3);
enddo;
which has high parse overhead.
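The indexed-copy idiom maps naturally onto array slicing; a NumPy sketch (an analogue, with 0-based indexing, so B34S elements 5-10 become the slice 4:10):

```python
import numpy as np

y = np.arange(1.0, 13.0)   # y = 1., 2., ..., 12.
x = np.zeros(12)

# Copy y elements 5-10 (1-based) into x elements 2-7, as in
# i=integers(5,10); j=i-3; x(j)=y(i);
# One vectorized operation, no per-element parse overhead.
x[1:7] = y[4:10]

print(x[1], x[6])   # 5.0 10.0
```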
12. Automatic Expansion of Variables – Some Cautionary notes. The following code
illustrates automatic expansion issues.
x(1)=10.;
x(2)=20.;
The array x contains elements 10. and 20. Warning! The commands
x(1)=10.;
x(2)=20;
produces an array of 0 20 since the statement
x(2)=20;
redefines the x array to be integer! This is an easy mistake to make since computers do what we
tell them to do very quickly! Statements such as
x(0) = 20.;
x(-1)= 20.;
x(1) = 20.;
all set element 1 of x to 20. The x(0) and x(-1) statements will generate a message warning the
user.
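Silent type surprises of this kind are not unique to B34S. A NumPy sketch (an analogue, not B34S) shows the mirror-image hazard: assigning a floating value into an integer array silently truncates rather than retyping the array.

```python
import numpy as np

x = np.array([10, 20])    # integer array (no decimal points)
x[0] = 10.9               # silently truncated to 10 -- the dtype wins

y = np.array([10., 20.])  # real array
y[0] = 10.9               # stored as 10.9

print(x[0], y[0])
```

In both systems the lesson is the same: be explicit about whether a literal is integer or real.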
13. Memory Management. Automatic expansion of arrays can cause the program to
"waste" memory, since newer copies of the expanded variable will not fit into the old location.
The matrix command then has to allocate new space, which leaves a "hole" in memory.
The command
call compress;
cannot be used to compress the workspace if it is given while in a user subroutine, function or
program. In addition to space requirements, prior allocation will substantially speed up
execution. If memory problems are encountered, the command
call names(all);
can be used to see how the variables are saved in memory and whether as the calculation
proceeds more space is used. For example compare the following code;
n=10;
x=array(n:);
call names(all);
do i=1,n;
x(i)=dfloat(i);
call names(all);
enddo;
with
n=10;
call names(all);
do i=1,n;
x(i)=dfloat(i);
call names(all);
enddo;
The first job will run faster and not use up memory. This job can be found in the file matrix.mac
under MEMORY and should be run by users wanting to write efficient subroutines. An
alternative is to use the solvefree command as
do i=1,2000;
call solvefree(:alttemp);
* many commands here ;
call solvefree(:cleantemp);
enddo;
The first call with :alttemp sets %%____ style temp variables in place of the default ##____
style. The command :cleantemp resets the temp style to ##____ and cleans all %%____ temps,
leaving the ##____ style temps in place. If this capability is used carefully, substantial
speed gains can be made, and the maximum number of temps will not be reached. Use of this
feature slows down processing, however, and is usually not needed. The command
call solvefree(:cleantemp2);
cleans user temps at or above the current level. This can be useful within a nested call to clean
work space. Many systems like SPEAKEASY do automatic compression which substantially
slows execution since the location of all variables must be constantly checked on the chance that
they have moved. The matrix command releases temp variables after each line of code but does
not do a compress unless told to do so. New temps are slotted into unused locations. The latter is
not possible if objects are getting bigger during execution of a job.
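The cost of letting objects grow during execution, versus preallocating them, can be demonstrated in any language; a Python sketch using NumPy (illustrative timings, machine dependent):

```python
import numpy as np
import time

n = 2000

# Growing path: each append reallocates and copies, the analogue of
# automatic expansion leaving "holes" in the workspace.
t0 = time.perf_counter()
grown = np.array([])
for i in range(n):
    grown = np.append(grown, float(i))
grow_time = time.perf_counter() - t0

# Preallocated path: one allocation, elements filled in place.
t0 = time.perf_counter()
pre = np.empty(n)
for i in range(n):
    pre[i] = float(i)
pre_time = time.perf_counter() - t0

print(grow_time, pre_time)   # the preallocated loop is typically much faster
```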
The dowhile loop usually is cycled many times and needs active memory management. An
example is:
b34sexec matrix;
sum=0.0;
add=1.;
ccount=1.;
count=1.;
tol=.1e-6;
/$ outer dowhile does things 3 times
call outstring(2,2,'We sum until we can add nothing!!');
call outstring(2,4,'Tol set as ');
call outdouble(20,4,tol);
call echooff;
call solvefree(:alttemp);
dowhile(ccount.ge.1..and.ccount.le.3.);
sum=0.0;
add=1.;
count=1.;
dowhile(add.gt.tol);
oldsum=sum;
sum=oldsum+((1./count)**3.);
count=count+1.;
call outdouble(2,6,add);
add=sum-oldsum;
/$ This section cleans temps
if(dmod(count,10.).eq.0.)then;
call solvefree(:cleantemp);
call solvefree(:alttemp);
endif;
enddowhile;
ccount=ccount+1.;
call print('Sum was ',sum:);
call print('Count was ',count);
enddowhile;
b34srun;
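The inner dowhile loop above sums 1/count**3 until an added term falls below tol. A minimal Python sketch of that inner loop:

```python
# Sum 1/count**3 until the increment falls below tol, mirroring the
# inner dowhile loop of the B34S example.
tol = 1.0e-7
total = 0.0
add = 1.0
count = 1.0
while add > tol:
    oldsum = total
    total = oldsum + (1.0 / count) ** 3
    count += 1.0
    add = total - oldsum

print(total)   # approaches zeta(3) = 1.2020569...
```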
14. Missing Data. Missing data often causes problems. Assume the following code:
b34sexec matrix;
x=rn(array(10:));
lagx=lag(x,1);
y=x-(10.*lagx);
goody=goodrow(y);
call tabulate(x,lagx,y,goody);
b34srun;
Y will contain missing data in row 1. The variable goody will contain the 9 nonmissing
observations.
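The same lag-and-drop pattern can be sketched in Python with NumPy (an analogue, where NaN plays the role of the B34S missing-value code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10)

# lag(x,1): the first observation has no predecessor, so it is missing
lagx = np.concatenate(([np.nan], x[:-1]))
y = x - 10.0 * lagx            # row 1 of y is NaN

# goodrow(y): keep only the nonmissing rows
goody = y[~np.isnan(y)]

print(np.isnan(y[0]), goody.size)
```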
15. Recursive solutions. In many cases the solution to a problem requires recursive evaluation
of an expression. While the use of recursive function calls is possible, it is not desirable since
there is great overhead in calling the function or subroutine over and over again. The do loop,
while still slow, is approximately 100 times faster than a recursive function call. The test
problem RECURSIVE in c:\b34slm\matrix.mac documents how slow the recursive function call
and do loop are for large problems. Another reason that a recursive function call is not
recommended is that the stack must be saved. The best way to handle a recursive call is to use
the solve statement to define the expression that has to be evaluated one observation at a time. If
the expression contains multiple expressions that are the same, a formula can be defined and
used in the solve statement. The formula and solve statements evaluate an expression over a
range, one observation at a time. This is in contrast to the usual analytic expression which is
evaluated completely on the right before the copy is made. Unlike an expression, a formula or
solve statement can refer to itself on the right. The block keyword determines the order in which
the formulas are evaluated. If the expression in the solve statement does not have duplicate code,
it is faster not to define a formula. Examples of both approaches are given next. The first
problem is a simple expression not requiring a formula. The code
b34sexec matrix;
test=array(10:);
test(1)=.1;
b=.9;
solve(test=b*test(t-1)
:range 2 norows(test)
:block test);
call print(test);
b34srun;
works but
test = b*lag(test,1);
will not get the "correct" answer since the right-hand side is built before the copy is done.
The formula statement requires use of the subscript t unless the variable is a scalar. The use of
the formula and solve statements is illustrated below:
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix ;
call loaddata;
double=array(norows(gasout):);
formula double = dlog(gasout(t)*2.);
call names;
call print(double);
test2=array(norows(gasout):);
solve(test2=test2(t-1)+double(t)+double(t-1)
:range 2, norows(gasout)
:block double);
call print(mean(test2));
b34srun;
The following two statements are the same but execute at different speeds.
do i=1,n;
x(i)=y(i)/2.;
enddo;
solve(x=y(t)/2. :range 1 n);
The formula and solve statements can be used to generate an AR(1) model. This is example
solve7 in c:\b34slm\matrix.mac
b34sexec matrix ;
* Generate ar(1) model;
g(1)=1.;
theta=.97;
vv   =10.;
formula jj=g(t-1)*theta+vv*rn(1.);
solve(g=jj(t) :range 2 300 :block jj);
call graph(g :heading 'AR(1) process');
call print(g);
call autobj(g :print :nac 24 :npac 24 :nodif
:autobuild );
b34srun;
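The same AR(1) recursion can be sketched in Python (an analogue of the solve statement); the point is that each observation depends on the previous one, so the series must be built one element at a time rather than by a single vectorized expression:

```python
import numpy as np

theta = 0.97
vv = 10.0
n = 300

rng = np.random.default_rng(123)
g = np.empty(n)
g[0] = 1.0

# g(t) = theta*g(t-1) + vv*e(t).  A vectorized g = theta*lag(g) would
# use the old contents of g, just as test=b*lag(test,1) fails above.
for t in range(1, n):
    g[t] = theta * g[t - 1] + vv * rng.standard_normal()

print(g.shape)
```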
Solve and formula statements cannot contain user functions. More detail on the solve and
formula statements is given below. In RATS, unless do loops are used, recursive models are
only allowed in the maximize command, and then only in a very restrictive form.10 The B34S
implementation, while slower, allows more choices.
16. User defined data structures. The B34S matrix command allows users to build custom
data types. The below example shows the structure PEOPLE consisting of a name field
(PNAME), a SSN field (SSN), an age field (AGE), a race field (RACE) and an income field
(INCOME). The built-in function sextract( ) is used to take out a field and the built-in
subroutine isextract is used to place data in a structure. Both sextract and isextract allow a
third argument that operates on an element. The name sextract is "structure extract" while
isextract is "inverse structure extract." Use of these commands is illustrated by:
b34sexec matrix;
people=namelist(pname,ssn,age,race,income);
pname =namelist(sue,joan,bob);
ssn   =array(:20,100,122);
age   =idint(array(:35,45,58));
race  =namelist(hisp,white,black);
income=array(:40000,35000,50000);
call tabulate(pname,ssn,age,race,income);
call print('This prints the age vector',sextract(people(3)));
call print('Second person',sextract(people(1),2),
sextract(people(3),2));
* make everyone a year older ;
nage=age+1;
call isextract(people(3),nage);
call print(age);
* make first person 77 years old;
call isextract(people(3),77,1);
call print(age);
b34srun;
Data structures are very powerful and, in the hands of an expert programmer, can be made to
bring order to complex problems.
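For comparison, the PEOPLE structure and the sextract/isextract operations have a rough Python analogue in a class with named fields (the field names below mirror the example; the class itself is hypothetical, not part of B34S):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class People:              # rough analogue of the PEOPLE namelist
    pname: List[str]
    ssn: List[int]
    age: List[int]
    race: List[str]
    income: List[float]

people = People(pname=['sue', 'joan', 'bob'],
                ssn=[20, 100, 122],
                age=[35, 45, 58],
                race=['hisp', 'white', 'black'],
                income=[40000., 35000., 50000.])

age = people.age                    # sextract(people(3))
second = people.age[1]              # sextract(people(3),2)
people.age = [a + 1 for a in age]   # isextract(people(3),nage)
people.age[0] = 77                  # isextract(people(3),77,1)

print(second, people.age)
```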
17. Advanced Programming Concepts and Techniques for Large Problems.
Programs such as SPEAKEASY and MATLAB, which are meant to be used interactively,
have automatic workspace compression. As a result a SPEAKEASY LINKULE programmer has
to check for movement of defined objects anytime an object is created or freed. In a design
decision to increase speed, B34S does not move the variables inside named storage unless told
to do so. If a do loop terminates and the user is not in a SUBROUTINE, temp variables are freed.
If a new temp variable is needed, B34S will try to place this variable in a free slot. If a variable
is growing, this may not be possible. Hence it is good programming practice to create arrays
and not rely on automatic variable expansion. In a subroutine call, a variable passed in is first
copied to another location and set to the current level + 1. Thus there are named storage
implications of a subroutine call. The command
call compress;
will manually clean out all temp variables and compress the workspace. While this command
takes time, in a large job it may be required to save time and space.

10 In RATS version 6.30 this restriction seems to have been somewhat lifted.

Temp variables are named ##1 ...... ##999999. If the peak number of temp variables gets > 999999, then B34S has to reuse
old names and as a result slows down checking to see if a name is currently being used. A call to
compress will reset the temp variable counter as well as free up space. If compress is called
from a place where it cannot run (say, in a do loop or in a subroutine, program or function), it
will not run and no message will be given. The matrix command termination message gives space
used, peak space used and peak and current temp # usage. Users can monitor their programs with
these measures to optimize performance. In the opinion of the developer, the B34S matrix
command do loop is too slow. The problem is that the do loop will start to run without knowing
the ending point because it is supported at the lowest programming level. In contrast,
SPEAKEASY requires that the user have a do loop only in a subroutine, program or function
where the loop end is known in theory. Ways to increase do loop speed are high on the "to do"
list. Faster CPU's, better compilers or better chip design may be the answer. The Lahey LF95
compiler appears to make faster do loops than the older Lahey LF90 compiler. This suggests
that the compiler management of the cache may be part of what is slowing the do loop down.
The test problem solve6 in c:\b34slm\matrix.mac illustrates some of these issues. Times and
gains from the solve statement vary based on the compiler used to build the B34S. Columns 1
and 2 were run on the same machine (400 MHz) with the same source code. The Lahey LF95
compiler was a major upgrade over the older LF90. Column 3 shows the same problem, with the
addition of a solvefree call in the do loop, run on a 1000 MHz machine using LF95 5.6g.
In this example the source code was improved. The speed-up exceeds the chip gain of 2.5
(1000/400) and can be attributed to a combination of compiler, source code and chip design
improvements.
              SOLVE time     DO time    Gain of SOLVE
LF90 4.50i      9.718         41.69        4.3897
LF95 5.5b       9.22          13.73        1.49
LF95 5.6g       1.3018         1.422       1.0932
In summary, LF90 appears to make a very slow do loop while LF95 is faster. In simple equations
the formula and solve commands are useful. With large, complex sequences of commands, the
do loop cost may have to be "eaten" by the user, since it is relatively low in comparison to the cost
of parsing the expression. Speed can be increased by using variables for constants, because at
parse time all scalars are made temps. Creating temps outside the loop speeds things up. The
following four examples show code of increasing speed:
* slow code;
do i=1,1000;
x(i)=x(i)*2.;
enddo;
* better code;
two=2.0;
do i=1,1000;
x(i)=x(i)*two;
enddo;
* vectorized code;
i=integers(1,1000);
x=x(i)*2.;
* Compact vectorized code;
x=x(integers(3,1000))*2.;
If all elements need to be changed the fastest code is
x=x*2.;
In the vectorized examples parse time is the same no matter whether there are 10 elements in x or
10,000,000. For speed gains from the use of masks, see # 20 below.
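The loop-versus-vectorized tradeoff is easy to measure in any interpreted setting; a Python/NumPy sketch of the slow and fast variants above (timings are illustrative and machine dependent):

```python
import numpy as np
import time

n = 1000
x = np.ones(n)

# slow path: interpreted loop, one dispatch per element
t0 = time.perf_counter()
for i in range(n):
    x[i] = x[i] * 2.0
loop_time = time.perf_counter() - t0

# fast path: one vectorized operation whose fixed cost does not
# depend on the number of elements
t0 = time.perf_counter()
x = x * 2.0
vec_time = time.perf_counter() - t0

print(loop_time, vec_time, x[0])
```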
Since B34S can create, compile and execute Fortran programs, for complex calculations a
branch to Fortran is always an option. The larger the dataset the less the overhead costs. The
fortran example in matrix.mac illustrates dynamically building, compiling and calling a Fortran
program from the matrix command.
b34sexec matrix;
call open(70,'_test.f');
call rewind(70);
/$                   1234567890
call character(test,
"      write(6,*)'This is a test # 2'"
"      n=1000                        "
"      write(6,*)n                   "
"      do i=1,n                      "
"      write(6,*) sin(float(i))      "
"      enddo                         "
"      stop                          "
"      end                           ");
call write(test,70);
call close(70);
/$ lf95   is Lahey Compiler
/$ g77    is Linux Compiler
/$ fortcl is script to run Lahey LF95 on Unix to link libs
call dodos('lf95 _test.f');
* call dounix('g77  _test.f -o_test');
call dounix('lf95 _test.f -o_test');
* call dounix('fortcl _test.f -o_test');
call dodos('_test > testout':);
call dounix('./_test > testout':);
call open(71,'testout');
call character(test2,'                                        ');
call read(test2,71);
call print(test2);
testd=0.0;
n=0;
call read(n,71);
testd=array(n:);
call read(testd,71);
call print(testd);
call close(71);
call dodos('erase testout');
call dodos('erase _test.f');
call dounix('rm testout');
call dounix('rm _test.f');
b34srun;
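The round trip (write a program to disk, run it in a separate process, read its output back) can be sketched in Python; a Python child process stands in for the compiled Fortran program so that no compiler needs to be assumed:

```python
import os
import subprocess
import sys
import tempfile

# Generate a small program, as the B34S job writes _test.f
src = ("import math\n"
       "n = 5\n"
       "print(n)\n"
       "for i in range(1, n + 1):\n"
       "    print(math.sin(float(i)))\n")

with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
    f.write(src)
    path = f.name

# Run it and capture stdout, as '_test > testout' does
result = subprocess.run([sys.executable, path],
                        capture_output=True, text=True, check=True)
os.unlink(path)

# Read back n and then the n values, as the call read(...) steps do
tokens = result.stdout.split()
n = int(tokens[0])
testd = [float(v) for v in tokens[1:]]
print(n, len(testd))
```

As in the B34S example, all communication with the child program goes through plain files or pipes, so any language that can be launched from the shell would serve.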
A substantially more complex example using a GARCH model is shown next. At issue is how to
treat the first unobserved second-moment observation. Three problems are run. The garchest
command does not set this value to 0.0. The Fortran implementation does set the value to 0.0
and matches RATS output 100%. The Fortran implementation is orders of magnitude slower but
shows the user's ability to have total control of what is being maximized.
/$ Tests RATS vs GARCHEST vs FORTRAN
/$ In the FORTRAN SETUP see line arch(1)=0.0
/$ If line is commented out     => GARCHEST = FORTRAN
/$ If line is not commented out => FORTRAN  = RATS
/$ This illustrates the effect of starting values!!!!!!
/$ Also illustrates Fortran as a viable alternative when there
/$ are very special models to be run that are recursive in
/$ nature
b34sexec options ginclude('b34sdata.mac') member(lee4); b34srun;
%b34slet dorats=1;
/$ Using garchest
%b34slet dob34s1=1;
/$ Using Fortran
%b34slet dob34s2=1;
/$ **********************************************************
%b34sif(&dob34s1.ne.0)%then;
b34sexec matrix;
call loaddata;
* The data has been generated by GAUSS by following settings $
*   a1   = GMA = 0.09                                        $
*   b1_n = GAR = 0.5 ( When Negative)                        $
*   b1   = GAR = 0.01                                        $
call echooff;
maxlag=0;
y=doo1;
y=y-mean(y);
v=variance(y);
arch=array(norows(y):) + dsqrt(v);
* GARCH on a TGARCH Model ;
call garchest(res,arch,y,func,maxlag,n
   :ngar     1
   :garparms array(:.1)
   :ngma     1
   :gmaparms array(:.1)
   :maxit    2000
   :maxfun   2000
   :maxg     2000
   :steptol  .1d-14
   :cparms   array(2:.1,.1)
   :print );
b34srun;
%b34sendif;
/$ Fortran
%b34sif(&dob34s2.ne.0)%then;
b34sexec matrix;
call loaddata;
* The data has been generated by GAUSS by following settings $
*   a1   = GMA = 0.09                                        $
*   b1_n = GAR = 0.5 ( When Negative)                        $
*   b1   = GAR = 0.01                                        $
* call echooff;
/$ Setup fortran
call open(70,'_test.f');
call rewind(70);
/$ We now save the Fortran Program in a Character object
/$ Will get overflows
call character(fortran,
/$23456789012345678901234567890
"      implicit real*8(a-h,o-z)        "
"      parameter(nn=10000)             "
"      dimension data1(nn)             "
"      dimension res1(nn)              "
"      dimension res2(nn)              "
"      dimension parm(100)             "
"      call dcopy(nn,0.0d+00,0,data1,1)"
"      call dcopy(nn,0.0d+00,0,res2 ,1)"
"      open(unit=8,file='data.dat')    "
"      open(unit=9,file='tdata.dat')   "
"      read(8,*)nob                    "
"      read(8,*)(data1(ii),ii=1,nob)   "
"      read(9,*)npar                   "
"      read(9,*)(parm(ii),ii=1,npar)   "
"      read(9,*) res2(1)               "
"      close(unit=9)                   "
"                                      "
"      do i=1,nob                      "
"      res1(i)=data1(i)-parm(3)        "
"      enddo                           "
"                                      "
"      func=0.0d+00                    "
"      do i=2,nob                      "
"      res2(i) =parm(1)+(parm(2)* res2(i-1)    ) +"
"     *         (parm(4)*(res1(i-1)**2)        )  "
"      if(dabs(res2(i)).le.dmach(3))then"
"      func= 1.d+40                    "
"      go to 100                       "
"      endif                           "
"      func=func+(dlog(dabs(res2(i))))+"
"     * ((res1(i)**2)/res2(i))         "
"      enddo                           "
"      func=-.5d+00*func               "
"  100 continue                        "
"      close(unit=8)                   "
"      open(unit=8,file='testout')     "
"      write(8,fmt='(e25.16)')func     "
"      close(unit=8)                   "
"      stop                            "
"      end                             ");
/$ Fortran Object written here
call write(fortran,70);
call close(70);
maxlag=0;
y=doo1;
y=y-mean(y);
* compile fortran and save data;
/$ lf95   is Lahey Compiler
/$ g77    is Linux Compiler
/$ fortcl is script to run Lahey LF95 on Unix to link libs
call dodos('lf95 _test.f');
* call dounix('g77  _test.f -o_test');
* call dounix('lf95 _test.f -o_test');
call dounix('fortcl _test.f -o_test');
call open(72,'data.dat');
call rewind(72);
call write(norows(y),72);
call write(y,72,'(3e25.16)');
call close(72);
v=variance(y);
arch=array(norows(y):) + dsqrt(v);
i=2;
j=norows(y);
count=0.0;
call echooff;
program test;
call open(72,'tdata.dat');
call rewind(72);
npar=4;
call write(npar,72);
call write(parm,72,'(e25.16)');
/$
/$ If below line is commented out     => GARCHEST = FORTRAN
/$ If below line is not commented out => FORTRAN  = RATS
/$
arch(1)=0.0d+00 ;
call write(arch(1),72,'(e25.16)');
call close(72);
call dodos('_test');
call dounix('./_test ');
call open(71,'testout');
func=0.0;
call read(func,71);
call close(71);
count=count+1.0;
call outdouble(10,5 ,func);
call outdouble(10,6 ,count);
call outdouble(10,7, parm(1));
call outdouble(10,8, parm(2));
call outdouble(10,9, parm(3));
call outdouble(10,10,parm(4));
return;
end;
ll=array(4: -.1e+10, .1e-10,.1e-10,.1e-10);
uu=array(4:  .1e+10, .1e+10,.1e+10,.1e+10);
* parm=array(:.0001 .0001 .0001 .0001);
* parm(1)=v;
* parm(3)=mean(y);
rvec=array(4: .1, .1, .1, .1);
parm=rvec;
* call names(all);
call cmaxf2(func :name   test
                 :parms  parm
                 :ivalue rvec
                 :maxit  2000
                 :maxfun 2000
                 :maxg   2000
                 :lower  ll
                 :upper  uu
                 :print);
*call dodos('erase testout');
 call dodos('erase _test.exe');
*call dounix('rm testout');
 call dounix('rm _test');
b34srun;
%b34sendif;
%b34sif(&dorats.ne.0)%then;
b34sexec options open('rats.dat') unit(28) disp=unknown$ b34srun$
b34sexec options open('rats.in')  unit(29) disp=unknown$ b34srun$
b34sexec options clean(28)$ b34srun$
b34sexec options clean(29)$ b34srun$
b34sexec pgmcall$
rats passasts
pcomments('* ',
'* Data passed from B34S(r) system to RATS',
'* ') $
pgmcards$
* The data has been generated by GAUSS by following settings
*   a1   = GMA = 0.09
*   b1_n = GAR = 0.5 ( When Negative)
*   b1   = GAR = 0.01
compute gstart=2,gend=1000
declare series u        ;* Residuals
declare series h        ;* Variances
declare series s        ;* SD
*
set rt = doo1
set h  = 0.0
nonlin(parmset=base) p0 a0 a1 b1
nonlin(parmset=constraint) a1>=0.0 b1>=0.0
* GARCH ************ Not correct model
frml at   = rt(t)-p0
frml g1   = a0 + a1*at(t-1)**2 + b1*h(t-1)
frml logl = -.5*log(h(t)=g1(t))-.5*at(t)**2/h(t)
smpl 2 1000
compute p0 = 0.1
compute a0 = 0.1, a1 = 0.1, b1 =0.1
*
maximize(parmset=base+constraint,method=simplex, $
   recursive,iterations=100) logl
maximize(parmset=base+constraint,method=bhhh, $
   recursive,iterations=10000) logl
b34sreturn;
b34srun;
b34sexec options close(28)$ b34srun$
b34sexec options close(29)$ b34srun$
b34sexec options
/$ dodos('start /w /r rats386 rats.in rats.out')
   dodos('start /w /r rats32s rats.in /run')
   dounix('rats rats.in rats.out')$ b34srun$
b34sexec options npageout
   writeout('output from rats',' ',' ')
   copyfout('rats.out')
   dodos('erase rats.in','erase rats.out','erase rats.dat')
   dounix('rm rats.in','rm rats.out','rm rats.dat') $
b34srun$
%b34sendif;
Edited output is shown next:
B34S 8.11C  (D:M:Y) 1/ 7/07 (H:M:S) 8: 3:34   DATA STEP   TGARCH GMA .09 GAR2 .5 GAR .01   PAGE  1

Variable    # Cases   Mean               Std Deviation   Variance        Maximum        Minimum
DOO1      1   1000   -0.8631441630E-02   0.3606170368    0.1300446473    1.217023800   -1.143099300
DOO2      2   1000    0.1027493796E-01   0.3416024362    0.1166922244    1.167096100   -1.045661400
DOO3      3   1000    0.1702775142E-02   0.3401124651    0.1156764889    1.246665600   -1.053053800
DOO4      4   1000   -0.1313232100E-01   0.3612532834    0.1305039348    1.152290800   -1.303597500
DOO5      5   1000    0.2912316775E-01   0.3418287186    0.1168468729    1.128081900   -1.161346400
DOO6      6   1000   -0.5975311198E-02   0.3665673422    0.1343716164    1.521247000   -1.480897300
DOO7      7   1000    0.3427857886E-02   0.3380034576    0.1142463373    1.096066300   -0.9579734600
DOO8      8   1000   -0.1170972201E-01   0.3630108981    0.1317769122    1.593915500   -1.135067800
DOO9      9   1000    0.1179714228E-01   0.3544133951    0.1256088546    1.172364000   -1.100395100
DO10     10   1000    0.8936612970E-02   0.3380290307    0.1142636256    1.225441200   -1.297209500
CONSTANT 11   1000    1.000000000        0.000000000     0.000000000     1.000000000    1.000000000

Number of observations in data file      1000
Current missing variable code            1.000000000000000E+31

B34S(r) Matrix Command. d/m/y 1/ 7/07. h:m:s 8: 3:34.

=>   CALL LOADDATA$
=>   * THE DATA HAS BEEN GENERATED BY GAUSS BY FOLLOWING SETTINGS $
=>   *   A1   = GMA = 0.09                  $
=>   *   B1_N = GAR = 0.5 ( WHEN NEGATIVE)  $
=>   *   B1   = GAR = 0.01                  $
=>   CALL ECHOOFF$

GARCH/ARCH Model Estimated using DB2ONF Routine
Constrained Maximum Likelihood Estimation using CMAXF2 Command
Finite-difference Gradient
Model Estimated
res1(t)=y(t)-cons(1)-ar(1)*y(t-1)     -... -ma(1)*res1(t-1)     -...
res2(t)=     cons(2)+gar(1)*res2(t-1) +... +gma(1)*(res1(t-1)**2)+...
where: gar(i) and gma(i) ge 0.0
LF =-.5*sum((ln(res2(t))+res1(t)**2/res2(t)))

Final Functional Value               530.5635346303904
# of parameters                      4
# of good digits in function         15
# of iterations                      11
# of function evaluations            23
# of gradient evaluations            13
Scaled Gradient Tolerance            6.055454452393343E-06
Scaled Step Tolerance                1.000000000000000E-15
Relative Function Tolerance          3.666852862501036E-11
False Convergence Tolerance          2.220446049250313E-14
Maximum allowable step size          2000.000000000000
Size of Initial Trust region        -1.000000000000000
# of terms dropped in ML             0
1/ Condition of Hessian Matrix       5.340454918178504E-04

# Name     order  Parm. Est.        SE                t-stat
1 GAR        1    0.21526580        0.15889283        1.3547861
2 GMA        1    0.15494322        0.43788030E-01    3.5384834
3 CONS_1     0    0.11895044E-01    0.10240513E-01    1.1615672
4 CONS_2     0    0.81870430E-01    0.19816063E-01    4.1315184

SE calculated as sqrt |diagonal(inv(%hessian))|
Order of Parms: AR, MA, GAR, GMA, MU, vd, Const 1&2

Hessian Matrix
          1          2          3          4
1      947.739    677.515   -31.3699    7092.35
2      711.220    1045.35   -698.983    4863.30
3      29.0281   -669.541    10146.9    111.143
4      7063.23    5153.42    547.320    55246.5

Gradient Vector
 0.296595E-03   0.370295E-03  -0.188742E-03   0.202584E-02
Lower vector
 0.100000E-16   0.100000E-16  -0.100000E+33   0.100000E-16
Upper vector
 0.100000E+33   0.100000E+33   0.100000E+33   0.100000E+33

B34S Matrix Command Ending. Last Command reached.
Space available in allocator    11869781, peak space used     29977
Number variables used                 68, peak number used       69
Number temp variables used            29, # user temp clean       0

B34S(r) Matrix Command. d/m/y 1/ 7/07. h:m:s 8: 3:35.

=>   CALL LOADDATA$
=>   * CALL ECHOOFF
=>   CALL OPEN(70,'_test.f')$
=>   CALL REWIND(70)$
=>   CALL CHARACTER(FORTRAN, ... )$       [echo of the Fortran source omitted]
=>   CALL WRITE(FORTRAN,70)$
=>   CALL CLOSE(70)$
=>   MAXLAG=0$
=>   Y=DOO1$
=>   Y=Y-MEAN(Y)$
=>   * COMPILE FORTRAN AND SAVE DATA$
=>   CALL DODOS('lf95 _test.f')$
=>   CALL DOUNIX('fortcl _test.f -o_test')$
=>   CALL OPEN(72,'data.dat')$
=>   CALL REWIND(72)$
=>   CALL WRITE(NOROWS(Y),72)$
=>   CALL WRITE(Y,72,'(3e25.16)')$
=>   CALL CLOSE(72)$
=>   V=VARIANCE(Y)$
=>   ARCH=ARRAY(NOROWS(Y):) + DSQRT(V)$
=>   I=2$
=>   J=NOROWS(Y)$
=>   COUNT=0.0$
=>   CALL ECHOOFF$

Constrained Maximum Likelihood Estimation using CMAXF2 Command
Final Functional Value               530.5828775439368
# of parameters                      4
# of good digits in function         15
# of iterations                      28
# of function evaluations            38
# of gradient evaluations            30
Scaled Gradient Tolerance            6.055454452393343E-06
Scaled Step Tolerance                3.666852862501036E-11
Relative Function Tolerance          3.666852862501036E-11
False Convergence Tolerance          2.220446049250313E-14
Maximum allowable step size          2000.000000000000
Size of Initial Trust region        -1.000000000000000
1 / Cond. of Hessian Matrix          4.118214520668572E-04

#  Name       Coefficient        Standard Error     T Value
1  BETA___1   0.74116830E-01     0.20221217E-01     3.6653001
2  BETA___2   0.28143495         0.17082386         1.6475154
3  BETA___3   0.11717038E-01     0.11319858E-01     1.0350870
4  BETA___4   0.14761371         0.46375692E-01     3.1829975

SE calculated as sqrt |diagonal(inv(%hessian))|

Hessian Matrix
          1          2          3          4
1      65640.6    8265.13    1262.64    6204.13
2      8277.36    1086.36    115.630    858.414
3      1376.24    147.313    8253.82   -388.923
4      6230.55    865.144   -369.582    1216.54

Gradient Vector
 0.179395E-02   0.248061E-03  -0.844847E-04   0.213908E-03
Lower vector
-0.100000E+10   0.100000E-10   0.100000E-10   0.100000E-10
Upper vector
 0.100000E+10   0.100000E+10   0.100000E+10   0.100000E+10

B34S Matrix Command Ending. Last Command reached.
Space available in allocator    11868953, peak space used     28116
Number variables used                 58, peak number used       60
Number temp variables used         10326, # user temp clean       0

B34S 8.11C  (D:M:Y) 1/ 7/07 (H:M:S) 8: 3:56   PGMCALL STEP   TGARCH GMA .09 GAR2 .5 GAR .01   PAGE  2

output from rats

*
* Data passed from B34S(r) system to RATS
*
CALENDAR(IRREGULAR)
ALLOCATE  1000
OPEN DATA rats.dat
DATA(FORMAT=FREE,ORG=OBS, $
MISSING=  0.1000000000000000E+32 ) / $
DOO1 $
DOO2 $
DOO3 $
DOO4 $
DOO5 $
DOO6 $
DOO7 $
DOO8 $
DOO9 $
DO10 $
CONSTANT
SET TREND = T
TABLE
Series     Obs      Mean        Std Error     Minimum      Maximum
DOO1       1000   -0.008631     0.360617    -1.143099     1.217024
DOO2       1000    0.010275     0.341602    -1.045661     1.167096
DOO3       1000    0.001703     0.340112    -1.053054     1.246666
DOO4       1000   -0.013132     0.361253    -1.303597     1.152291
DOO5       1000    0.029123     0.341829    -1.161346     1.128082
DOO6       1000   -0.005975     0.366567    -1.480897     1.521247
DOO7       1000    0.003428     0.338003    -0.957973     1.096066
DOO8       1000   -0.011710     0.363011    -1.135068     1.593916
DOO9       1000    0.011797     0.354413    -1.100395     1.172364
DO10       1000    0.008937     0.338029    -1.297209     1.225441
TREND      1000  500.500000   288.819436     1.000000  1000.000000

* The data has been generated by GAUSS by following settings
*   a1   = GMA = 0.09
*   b1_n = GAR = 0.5 ( When Negative)
*   b1   = GAR = 0.01
compute gstart=2,gend=1000
declare series u        ;* Residuals
declare series h        ;* Variances
declare series s        ;* SD
*
set rt = doo1
set h  = 0.0
nonlin(parmset=base) p0 a0 a1 b1
nonlin(parmset=constraint) a1>=0.0 b1>=0.0
* GARCH ************ Not correct model
frml at   = rt(t)-p0
frml g1   = a0 + a1*at(t-1)**2 + b1*h(t-1)
frml logl = -.5*log(h(t)=g1(t))-.5*at(t)**2/h(t)
smpl 2 1000
compute p0 = 0.1
compute a0 = 0.1, a1 = 0.1, b1 =0.1
*
maximize(parmset=base+constraint,method=simplex, $
   recursive,iterations=100) logl
maximize(parmset=base+constraint,method=bhhh, $
   recursive,iterations=10000) logl

MAXIMIZE - Estimation by BHHH
Convergence in    11 Iterations. Final criterion was  0.0000017 <=  0.0000100
Usable Observations    999
Function Value    530.58287754

Variable    Coeff          Std Error      T-Stat     Signif
*******************************************************************************
1.  P0      0.0030855387   0.0113778017   0.27119    0.78624540
2.  A0      0.0741165760   0.0202703454   3.65640    0.00025578
3.  A1      0.1476142210   0.0517016977   2.85511    0.00430214
4.  B1      0.2814353360   0.1738216127   1.61910    0.10542480
Note that for the RATS and Fortran implementations the function value was 530.582877. The
garchest value was 530.5635346303904. What is most surprising is the effect on the parameters,
which are shown below for garchest. (Note p0 = CONS_1, a0 = CONS_2, a1 = GMA and
b1 = GAR.) This example shows that one observation out of 1000 can make an important
difference. It also shows the ability of the user to have 100% control of the function being
maximized.
 #  Name    order  Parm. Est.       SE               t-stat
 1  GAR       1    0.21526580       0.15889283       1.3547861
 2  GMA       1    0.15494322       0.43788030E-01   3.5384834
 3  CONS_1    0    0.11895044E-01   0.10240513E-01   1.1615672
 4  CONS_2    0    0.81870430E-01   0.19816063E-01   4.1315184
18. Termination Issues. Do loop and if statement terminators must be hit. If this is not done, the
max if statement limit or do statement limit can be exceeded, depending on program logic. This
"limitation" comes from having if and do loops outside programs. In Fortran, the complete do
loop or if statement is known to the compiler when the executable is built. In an interpreted
language such as the matrix command, the command parser does not know about a statement it
has not read yet. Within a built-in command such as olsq, the possible logical paths are
completely known at compile time.
Remark: A human is a curious mixture of compiled and interpreted code. If one drinks too much, what will
happen in terms of loss of coordination can be predicted. In this sense the body knows what will
occur, given an input. Free will, in contrast to predestination, implies an interpretive structure where,
when one hits a branch (say, becoming an economist), one will never know what would have occurred had
one taken the other path.
As an example of interpreted code where the end is never seen consider
loop continue;
if(dabs(z1-z2).gt.1.d-13)then;
z2=z1;
z1=dlog(z1)+c;
go to loop;
endif;
which will never hit endif; The B34S parser will not know the position of this statement and
the max if statement limit could be hit if the if structure was executed many times. A better
approach is not to use an if structure in this situation. Better code is:
loop continue;
if(dabs(z1-z2).le.1.d-13)go to nextstep;
z2=z1;
z1=dlog(z1)+c;
go to loop;
nextstep continue;
19. Mask Issues. Assume an array x where for x < 0.0 we want y=x**2, while for x >= 0.0 we want
y=2*x. A slow but very clear way to do this would be:
do i=1,norows(x);
if(x(i) .lt. 0.0)y(i)=x(i)**2.;
if(x(i) .ge. 0.0)y(i)=x(i)*2. ;
enddo;
since the larger the x array, the more interpretation is required because the do loop cycles more times.
A vectorized way to do the same calculation is to define two masks. Mask1 = 0.0 if the logical
expression is false, = 1.0 if it is true. Faster code would be
mask1= x .lt. 0.0 ;
mask2= x .ge. 0.0 ;
y= mask1*(x**2.0) + mask2*(x*2.0);
Compact fast code would be
y= (x .lt. 0.0)*(x**2.0) + (x .ge. 0.0 )*(x*2.0);
Complete problem:
b34sexec matrix;
call print('If X GE 0.0 y=20*x. Else y=2*x':);
x=rn(array(20:));
mask1= x .lt. 0.0 ;
mask2= x .ge. 0.0 ;
y= mask1*(x*2.0) + mask2*(x*20.);
call tabulate(x,y,mask1,mask2);
b34srun;
Compact code (placing the logical calculation in the calculation expression) is:
b34sexec matrix;
call print('If X GE 0.0 y=20*x. Else y=2*x':);
x=rn(array(20:));
y= (x.lt.0.0)*(x*2.0) + (x.ge.0.0)*(x*20.);
call tabulate(x,y);
b34srun;
Logical mask expressions should be used in function and subroutine calls to speed calculation.
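The same mask trick carries over to any language in which logical values coerce to 0 and 1. As a minimal sketch (plain Python, not B34S; the function name masked_transform is hypothetical), the compact form above can be written as:

```python
# Python booleans act as 0/1 in arithmetic, so (x < 0) and (x >= 0)
# play the roles of mask1 and mask2 from the text.
def masked_transform(xs):
    # y = x**2 where x < 0, y = 2*x where x >= 0, computed without branching
    return [(x < 0) * (x ** 2) + (x >= 0) * (2 * x) for x in xs]

print(masked_transform([-3.0, -1.0, 0.0, 2.0]))  # [9.0, 1.0, 0.0, 4.0]
```

Exactly one of the two masks is 1 for each element, so each term contributes only where its logical condition holds.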
20. N Dimensional Objects. While the matrix command saves only 1 and 2 dimensional
objects, it is possible to save and address n dimensional objects in 1-D arrays. B34S saves n
dimensional objects by column. The command index(2 3 5) creates an integer array with
elements 2 3 5, index(2 3 5:) determines the number of elements in a 3 dimensional array
with dimensions 2, 3, 5 and index(a,b,c:i,j,k) determines the position in a one
dimensional vector of the i, j, k element of a three dimensional array with maximum dimensions a, b
and c. The commands:
nn=index(i,j,k:);
x=array(nn);
call setndimv(index(i,j,k),index(1,2,3),x,value);
will make a 3 dimensional (i, j, k) object x and place value in the 1, 2, 3 position. The function
call
yy=getndimv(index(i,j,k),index(1,2,3),x);
or
yy=x(index(i,j,k:1,2,3));
can be used to pull a value out. For example, to define the 4 dimensional object x with
dimensions 2 3 4 5:
nn=index(2,3,4,5:);
x=array(nn:);
To fill this array with values 1.,...,norows(x)
x=dfloat(integers(norows(x)));
or to set the 1, 2, 3, 1 value to 100.
call setndimv(index(2,3,4,5),index(1,2,3,1),x,100.);
Examples of this facility:
b34sexec matrix;
x=rn(array(index(4,4,4:):));
call print(x,getndimv(index(4,4,4),index(1,2,1),x));
do k=1,4;
do i=1,4;
do j=1,4;
test=getndimv(index(4,4,4),index(i,j,k),x);
call print(i,j,k,test);
enddo;
enddo;
enddo;
b34srun;
b34sexec matrix;
xx=index(1,2,3,4,5,4,3);
call names(all);
call print(xx);
call print('Integer*4 Array ',index(1 2 3 4 5 4 3));
call print('# elements in 1 2 3 4 is 24',index(2 3 4:));
call print('Position of 1 2 in a 4 by 4 is 5',index(4 4:1 2):);
call print('Integer*4 Array ',index(1,2,3,4,5 4 3));
call print('# elements in 1 2 3 5 is 30',index(2,3,5:));
call print('Position of 1 3 in a 4 by 4 is 9',index(4,4:1,3):);
b34srun;
b34sexec matrix;
mm=index(4,5,6:);
xx=rn(array(mm:));
idim =index(4,5,6);
idim2=index(2,2,2);
call setndimv(idim,idim2,xx,10.);
vv= getndimv(idim,idim2 ,xx);
call print(xx,vv);
b34srun;
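The column-major addressing that index(a,b,c:i,j,k) performs can be sketched in a few lines of ordinary code. The following is an illustrative Python version (the function name flat_position is an assumption, not a B34S routine); it reproduces the positions printed by the B34S examples above, e.g. position of 1 2 in a 4 by 4 is 5 and position of 1 3 in a 4 by 4 is 9.

```python
# 1-based flat position of element (i1,...,in) in an array with
# dimensions (d1,...,dn) stored by column (first subscript varies fastest).
def flat_position(dims, subs):
    pos, stride = 1, 1
    for d, s in zip(dims, subs):
        pos += (s - 1) * stride
        stride *= d
    return pos

print(flat_position((4, 4), (1, 2)))  # 5
print(flat_position((4, 4), (1, 3)))  # 9
```

The running stride multiplies up the earlier dimensions, which is exactly why the total storage needed is the product of all dimensions.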
21. Complex Math Issues. The statements
x=complex(1.5,1.5);
y=complex(1.0,0.0);
a=x*y;
produces a=(1.5,1.5). To zero out the imaginary part of a use
a=complex(real(x*y),0.0);
In summary, the B34S matrix facility provides a 4th generation programming language
that is tailored to applied econometrics and time series applications. The next section discusses
basic linear algebra using the matrix facility.
16.4 Linear Algebra using the Matrix Language
Basic rules of linear algebra as discussed in Greene (2000) are illustrated using the
matrix command. Although the complex domain is supported, due to space limitations, this
material was removed. Interested readers can look at the extensive example files for each
individual matrix command. Assume A is an m by n matrix, B is n by k and C is m by k, then
C = AB
C' = B'A'                                                (16.4-1)
A(B + C) = AB + AC
The following code, which is part of ch16_13 in bookruns.mac, illustrates these calculations:
a=matrix(2,3:1 3 2 4 5,-1);
b=matrix(3,2:2 4 1 6 0 5);
c=a*b;
call print(a,b,c,'AB',a*b,'BA',b*a);
n=3;
a=rn(matrix(n,n:));
b=rn(a);
c=a*b;
call print(a,b,c,'AB',a*b,' '
'BA',b*a,' '
'a*(b+c) = a*b+a*c',
a*(b+c), a*b+a*c );
call print(' ',
'We show that transpose(a*b) = transpose(b)*transpose(a)',
transpose(a*b),
transpose(b)*transpose(a));
Edited output is:

=>  A=MATRIX(2,3:1 3 2 4 5,-1)$
=>  B=MATRIX(3,2:2 4 1 6 0 5)$
=>  C=A*B$
=>  CALL PRINT(A,B,C,'AB',A*B,'BA',B*A)$

A       = Matrix of 2 by 3 elements
            1              2              3
 1     1.00000        3.00000        2.00000
 2     4.00000        5.00000       -1.00000

B       = Matrix of 3 by 2 elements
            1              2
 1     2.00000        4.00000
 2     1.00000        6.00000
 3     0.00000        5.00000

C       = Matrix of 2 by 2 elements
            1              2
 1     5.00000        32.0000
 2     13.0000        41.0000

AB
 Matrix of 2 by 2 elements
            1              2
 1     5.00000        32.0000
 2     13.0000        41.0000

BA
 Matrix of 3 by 3 elements
            1              2              3
 1     18.0000        26.0000        0.00000
 2     25.0000        33.0000       -4.00000
 3     20.0000        25.0000       -5.00000

=>  N=3$
=>  A=RN(MATRIX(N,N:))$
=>  B=RN(A)$
=>  C=A*B$
=>  CALL PRINT(A,B,C,'AB',A*B,' '
=>  'BA',B*A,' '
=>  'a*(b+c) = a*b+a*c',
=>  A*(B+C), A*B+A*C )$

A       = Matrix of 3 by 3 elements
            1              2              3
 1     2.05157        1.27773       -1.32010
 2     1.08325       -1.22596       -1.52445
 3     0.825589E-01   0.338525      -0.459242

B       = Matrix of 3 by 3 elements
            1              2              3
 1    -0.605638       1.49779        1.26792
 2     0.307389      -0.168215       0.741401
 3    -1.54789        0.498469      -0.187157

C       = Matrix of 3 by 3 elements
            1              2              3
 1     1.19362        2.19986        3.79559
 2     1.32678        1.06882        0.749854
 3     0.764914      -0.162207       0.441611

AB
 Matrix of 3 by 3 elements
            1              2              3
 1     1.19362        2.19986        3.79559
 2     1.32678        1.06882        0.749854
 3     0.764914      -0.162207       0.441611

BA
 Matrix of 3 by 3 elements
            1              2              3
 1     0.484652      -2.18085       -2.06609
 2     0.509621       0.849968      -0.489832
 3    -2.65109       -2.65224        1.36943

a*(b+c) = a*b+a*c
 Matrix of 3 by 3 elements
            1              2              3
 1     4.32792        8.29281        11.9577
 2    -0.172882       2.38876        3.26893
 3     0.961325       0.455725       0.806010

 Matrix of 3 by 3 elements
            1              2              3
 1     4.32792        8.29281        11.9577
 2    -0.172882       2.38876        3.26893
 3     0.961325       0.455725       0.806010

=>  CALL PRINT(' ',
=>  'We show that transpose(a*b) = transpose(b)*transpose(a)',
=>  TRANSPOSE(A*B),
=>  TRANSPOSE(B)*TRANSPOSE(A))$

We show that transpose(a*b) = transpose(b)*transpose(a)

 Matrix of 3 by 3 elements
            1              2              3
 1     1.19362        1.32678        0.764914
 2     2.19986        1.06882       -0.162207
 3     3.79559        0.749854       0.441611

 Matrix of 3 by 3 elements
            1              2              3
 1     1.19362        1.32678        0.764914
 2     2.19986        1.06882       -0.162207
 3     3.79559        0.749854       0.441611
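The identities in (16.4-1) can also be checked in a few lines of a general purpose language. The following is a hedged pure-Python sketch (plain nested lists; the helper names matmul, transpose and madd are assumptions, not B34S functions), reusing the 2 by 3 and 3 by 2 matrices from the text:

```python
# Minimal dense matrix helpers on nested lists.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1.0, 3.0, 2.0], [4.0, 5.0, -1.0]]   # the 2 by 3 matrix from the text
B = [[2.0, 4.0], [1.0, 6.0], [0.0, 5.0]]  # the 3 by 2 matrix from the text
C = matmul(A, B)
print(C)  # [[5.0, 32.0], [13.0, 41.0]]
# C' = B'A'
print(transpose(C) == matmul(transpose(B), transpose(A)))  # True
# A(B + D) = AB + AD needs a second matrix conformable with B.
D = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(matmul(A, madd(B, D)) == madd(matmul(A, B), matmul(A, D)))  # True
```

With these integer-valued inputs the arithmetic is exact, so the equality tests hold without a tolerance.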
If we define i as an n by 1 matrix of 1's, then a vector of the means of series x can be calculated
as i i'x/n where x is a vector and n is the number of observations in the vector. This is shown by
b34sexec matrix;
call print(' '
'Define i as a n by 1 matrix of ones':)
n=10;
i=matrix(n,1:vector(n:)+1.);
seriesx=rn(vector(n:));
mm=mean(seriesx);
call print(mm);
meanmm=i*transpose(i)*seriesx/dfloat(norows(seriesx));
call print(meanmm);
b34srun;
which produces values of the mean two different ways:

=>  CALL PRINT(' '
=>  'Define i as a n by 1 matrix of ones':)
=>  N=10$
=>  I=MATRIX(N,1:VECTOR(N:)+1.)$
=>  SERIESX=RN(VECTOR(N:))$
=>  MM=MEAN(SERIESX)$
=>  CALL PRINT(MM)$

MM      =   0.19517131

=>  MEANMM=I*TRANSPOSE(I)*SERIESX/DFLOAT(NOROWS(SERIESX))$
=>  CALL PRINT(MEANMM)$

MEANMM  = Vector of 3 elements
 0.195171       0.195171       0.195171
An idempotent matrix M has the property that MM = M, while if M is symmetric then, in
addition, M'M = M. Greene (2000, 16) discusses a matrix M0 = [I - (1/n) i i'] with this property. If
Z = [x, y] where x and y are vectors, then Z'M0 Z calculates the sums of squares and cross products
of the deviations from the means. Note that the diagonal elements of M0 are 1 - (1/n) while the
off-diagonal elements are -(1/n). This is shown next where we calculate the mean of series x
two ways, using mean to get mm and i i'x/n to get meanmm. The covariance matrix is calculated
using the cov function and as Z'M0 Z/(n-1) where Z is a matrix whose columns are the vectors
of data. Finally M0 i is tested and found to be close to 0.0 as expected. In terms of the program,
m0 = M0.
b34sexec matrix;
call load(cov :staging);
call echooff;
* Examples from Greene(2000);
/;
/; Use of i
/;
n=3;
call print('Define i as a n by 1 matrix of ones':);
i=matrix(n,1:vector(n:)+1.);
seriesx=rn(vector(n:));
mm=mean(seriesx);
meanmm=i*transpose(i)*seriesx/dfloat(norows(seriesx));
call print(mm,meanmm);
/$ Get Variance Covariance
call print('Define Idempotent matrix M'
'Diagonal = 1-(1/n). Off Diag -(1/n)':);
bigi=matrix(n,n:)+1.;
littlei=(1./dfloat(n))*(i*transpose(i));
m0=(bigi-littlei);
call print('m0 m0*m0 transpose(m0)*m0',m0,m0*m0,transpose(m0)*m0);
seriesy=rn(seriesx);
con=vector(n:)+1.;
z=catcol(seriesy,seriesx,con);
call print(z);
vcov=transpose(z)*m0*z;
call print(variance(seriesy));
call print(variance(seriesx));
call print('Sums and Cross Products ',vcov,m0*z);
call print('cov(z) ', cov(z));
call print('(1.dfloat(n-1))*vcov', (1./dfloat(n-1))*vcov);
call print('mo * i . Is this 0?',m0*i);
b34srun;
Output is:

B34S(r) Matrix Command. d/m/y 1/ 7/07. h:m:s 15:41: 8.

=>  CALL LOAD(COV :STAGING)$
=>  CALL ECHOOFF$

Define i as a n by 1 matrix of ones

MM      =   1.0724604

MEANMM  = Vector of 3 elements
 1.07246        1.07246        1.07246

Define Idempotent matrix M
Diagonal = 1-(1/n). Off Diag -(1/n)

m0 m0*m0 transpose(m0)*m0

M0      = Matrix of 3 by 3 elements
            1              2              3
 1     0.666667      -0.333333      -0.333333
 2    -0.333333       0.666667      -0.333333
 3    -0.333333      -0.333333       0.666667

 Matrix of 3 by 3 elements
            1              2              3
 1     0.666667      -0.333333      -0.333333
 2    -0.333333       0.666667      -0.333333
 3    -0.333333      -0.333333       0.666667

 Matrix of 3 by 3 elements
            1              2              3
 1     0.666667      -0.333333      -0.333333
 2    -0.333333       0.666667      -0.333333
 3    -0.333333      -0.333333       0.666667

Z       = Matrix of 3 by 3 elements
            1              2              3
 1     1.27773        2.05157        1.00000
 2    -1.22596        1.08325        1.00000
 3     0.338525       0.825589E-01   1.00000

 1.5996959

 0.96933890

Sums and Cross Products

VCOV    = Matrix of 3 by 3 elements
            1              2              3
 1     3.19939        0.902698       0.555112E-16
 2     0.902698       1.93868        0.444089E-15
 3     0.433308E-16   0.357201E-15   0.333067E-15

 Matrix of 3 by 3 elements
            1              2              3
 1     1.14763        0.979110       0.111022E-15
 2    -1.35606        0.107914E-01   0.111022E-15
 3     0.208429      -0.989901       0.111022E-15

cov(z)
 Array of 3 by 3 elements
            1              2              3
 1     1.59970        0.451349       0.00000
 2     0.451349       0.969339       0.00000
 3     0.00000        0.00000        0.00000

(1.dfloat(n-1))*vcov
 Matrix of 3 by 3 elements
            1              2              3
 1     1.59970        0.451349       0.277556E-16
 2     0.451349       0.969339       0.222045E-15
 3     0.216654E-16   0.178601E-15   0.166533E-15

mo * i . Is this 0?
 Matrix of 3 by 1 elements
            1
 1     0.111022E-15
 2     0.111022E-15
 3     0.111022E-15
Since the variance-covariance matrix can be obtained two ways, the question is which to
use. The traditional method is slower but more accurate and takes less space. The idempotent matrix
method is faster because it avoids do loops but, as will be shown, it is not as accurate, especially in the case
of real*4 calculations using data that is not scaled. This is demonstrated next by running the
following program:
/;
/; This illustrates two ways to get the variance-covariance Matrix
/; using Greene's Idempotent Matrix. With real*4, if the data is not
/; scaled, there can be problems of accuracy that are detected using
/; real*16 results as the benchmark
/;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(cov :staging);
call load(cov2 :staging);
call load(cor :staging);
call echooff;
scale=1000.;
h=catcol(gasin,scale*gasout);
/; call print(h);
call print('Covariance using function cov':);
call print(cov(h));
call print('Covariance using function cov2':);
call print(cov2(h));
call print('Difference of two methods using real*8':);
call print(cov(h)-cov2(h));
call print('Correlation using function cor':);
call print(cor(h));
call print('Real*16 results':);
h16=r8tor16(h);
call print('Covariance using function cov':);
call print(cov(h16));
call print('Covariance using function cov2':);
call print(cov2(h16));
call print('Difference of the two methods using real*16 ':);
call print(cov(h)-cov2(h));
/; Testing which is closer
real8_1=afam(cov(h));
real8_2=afam(cov2(h));
call print('Difference against real16 for real8_1 & real8_2 ':);
call print('Where Traditional Method = real8_1 ':);
call print('Where M0          Method = real8_2 ':);
call print(r8tor16(real8_1)-afam(cov(h16)));
call print(r8tor16(real8_2)-afam(cov(h16)));
call print('Correlation using function cor':);
call print(cor(h16));
/; real*4 results
h4=r8tor4(h);
call print(' ':);
call print('Where Traditional Method = real4_1 ':);
call print('Where M0          Method = real4_2 ':);
real4_1=afam(cov(h4));
real4_2=afam(cov2(h4));
call print(real4_1,real4_2);
call print('Difference against real16 for real4_1 & real4_2 ':);
call print(r8tor16(r4tor8(real4_1))-afam(cov(h16)));
call print(r8tor16(r4tor8(real4_2))-afam(cov(h16)));
call print('Correlation using function cor':);
call print(cor(h4));
b34srun;
that calls functions cov and cov2 which are listed next.
function cov(x);
/;
/; Use matrix language to calculate cov of a matrix. For series use
/; mm=catcol(series1,series2)
/; Can use real*4, real*8, real*16 and VPA.
/;
test=afam(x);
i=nocols(x);
d=kindas(x,dfloat(norows(x)-1));
do j=1,i;
test(,j)=test(,j)-mean(test(,j));
enddo;
ccov=afam(transpose(mfam(test))*mfam(test))/d;
if(klass(x).eq.2.or.klass(x).eq.1)ccov=mfam(ccov);
return(ccov);
end;
function cov2(x);
/;
/; Use matrix language to calculate cov of a matrix. For series use
/; mm=catcol(series1,series2)
/; Can use real*4, real*8, real*16 and VPA.
/;
/; Uses Greene(2000,16) idempotent M0 matrix
/; function cov( ) is a more traditional approach that uses
/; far less space at the cost of a speed loss due to do loops
/; cov( ) is more accurate than cov2( ) if there are scaling
/; differences
/;
/; At issue is that m0 is n by n !!!!!
/;
/; Use of i which is a vector of 1's
/; z=catcol(x1,x2,...,xk)
/; ccov=transpose(z)*m0*z/(n-1);
/; where m0 diagonal = 1-(1/n). Off Diag = -1/n
/;
n=norows(x);
real_n=kindas(x,dfloat(n));
real_one=kindas(x,1.0);
/; Define i as a n by 1 matrix of ones
i=matrix(n,1:kindas(x,(vector(n:)+1.)));
/; Get Variance Covariance
/; Define Idempotent matrix M Diagonal = 1-(1/n). Off Diag -(1/n)
bigi=kindas(x,matrix(n,n:)) + real_one;
littlei=(real_one/real_n)*(i*transpose(i));
m0=(bigi-littlei);
ccov2=transpose(mfam(x))*m0*mfam(x)/(real_n-real_one);
if(klass(x).eq.6.or.klass(x).eq.5)ccov2=afam(ccov2);
return(ccov2);
end;
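The two algorithms inside cov and cov2 can be sketched outside B34S as well. The following hedged Python version (plain lists; the names cov_demean and cov_m0 are assumptions for illustration) mirrors the two approaches: demeaning each column and forming the cross product, versus applying Greene's idempotent M0 = I - ii'/n. It confirms that, in exact enough arithmetic, the two agree.

```python
# Traditional method: subtract column means, then form D'D/(n-1).
def cov_demean(X):
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    D = [[row[j] - means[j] for j in range(k)] for row in X]
    return [[sum(D[t][i] * D[t][j] for t in range(n)) / (n - 1)
             for j in range(k)] for i in range(k)]

# Idempotent-matrix method: X'M0X/(n-1) with M0 = I - ii'/n (M0 is n by n).
def cov_m0(X):
    n, k = len(X), len(X[0])
    M0 = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
    MX = [[sum(M0[i][t] * X[t][j] for t in range(n)) for j in range(k)] for i in range(n)]
    return [[sum(X[t][i] * MX[t][j] for t in range(n)) / (n - 1)
             for j in range(k)] for i in range(k)]

X = [[1.0, 2.0], [2.0, 4.0], [3.0, 7.0], [4.0, 8.0]]
c1, c2 = cov_demean(X), cov_m0(X)
# The two methods agree to floating point rounding on well scaled data.
print(all(abs(c1[i][j] - c2[i][j]) < 1e-12 for i in range(2) for j in range(2)))
```

Note how cov_m0 materializes an n by n matrix while cov_demean only needs the n by k data, which is the space argument made in the comments of cov2.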
The variance-covariance matrix for the scaled gas data is calculated using real*8, real*4 and
real*16. Assuming the real*16 results are the correct answers, the results obtained for the two
methods are compared. The gasout series was multiplied by 1000 to cause a scale problem that
will be detected using real*4 calculations. Annotated output is shown below:
Variable  Label                            # Cases      Mean        Std. Dev.   Variance    Maximum     Minimum
TIME      1                                    296   148.500         85.5921    7326.00     296.000     1.00000
GASIN     2 Input gas rate in cu. ft / min     296  -0.568345E-01    1.07277    1.15083     2.83400    -2.71600
GASOUT    3 Percent CO2 in outlet gas          296   53.5091         3.20212    10.2536     60.5000     45.6000
CONSTANT  4                                    296   1.00000         0.00000    0.00000     1.00000     1.00000

Number of observations in data file    296
Current missing variable code          1.000000000000000E+31

B34S(r) Matrix Command. d/m/y 2/ 7/07. h:m:s 10:51: 4.

=>  CALL LOADDATA$
=>  CALL LOAD(COV  :STAGING)$
=>  CALL LOAD(COV2 :STAGING)$
=>  CALL LOAD(COR  :STAGING)$
=>  CALL ECHOOFF$
Using real*8 the two methods appear to be producing the same answers with a small difference
in the cov(2,2) position of .625849e-6.
Covariance using function cov

 Array of 2 by 2 elements
            1              2
 1     1.15083       -1664.15
 2    -1664.15        0.102536E+08

Covariance using function cov2

 Array of 2 by 2 elements
            1              2
 1     1.15083       -1664.15
 2    -1664.15        0.102536E+08

Difference of two methods using real*8

 Array of 2 by 2 elements
            1              2
 1    -0.666134E-15  -0.341061E-11
 2     0.682121E-12   0.625849E-06

Correlation using function cor

 Array of 2 by 2 elements
            1              2
 1     1.00000       -0.484451
 2    -0.484451       1.00000

Real*16 results

Covariance using function cov

 Array of 2 by 2 elements (real*16)
            1              2
 1     1.15083       -1664.15
 2    -1664.15        0.102536E+08

Covariance using function cov2

 Array of 2 by 2 elements (real*16)
            1              2
 1     1.15083       -1664.15
 2    -1664.15        0.102536E+08

Difference of the two methods using real*16

 Array of 2 by 2 elements
            1              2
 1    -0.666134E-15  -0.341061E-11
 2     0.682121E-12   0.625849E-06
When testing against the real*16 results, the traditional method difference for the 2,2 position is
.133107E-09, which is smaller than the -.625716E-06 result for the M0 method. This suggests
accuracy gains even with real*8.
Difference against real16 for real8_1 & real8_2
Where Traditional Method = real8_1
Where M0          Method = real8_2

 Array of 2 by 2 elements (real*16)
            1              2
 1    -0.353299E-15   0.102890E-11
 2     0.102890E-11   0.133107E-09

 Array of 2 by 2 elements (real*16)
            1              2
 1     0.312834E-15   0.443950E-11
 2     0.346778E-12  -0.625716E-06

Correlation using function cor

 Array of 2 by 2 elements (real*16)
            1              2
 1     1.00000       -0.484451
 2    -0.484451       1.00000
Using real*4, but comparing in real*16 against the real*16 results, produces the result that for the
traditional method the difference in the 2,2 position is .469079, while for the M0 method the
difference is -104.531. These findings show the accuracy loss when real*4 calculations are made
both with an appropriate method and with a poorer method.
Where Traditional Method = real4_1
Where M0          Method = real4_2

REAL4_1 = Array of 2 by 2 elements (real*4)
            1              2
 1     1.15083       -1664.15
 2    -1664.15        0.102536E+08

REAL4_2 = Array of 2 by 2 elements (real*4)
            1              2
 1     1.15083       -1664.15
 2    -1664.15        0.102535E+08

Difference against real16 for real4_1 & real4_2

 Array of 2 by 2 elements (real*16)
            1              2
 1     0.313754E-07  -0.478797E-04
 2    -0.478797E-04   0.469079

 Array of 2 by 2 elements (real*16)
            1              2
 1     0.313754E-07  -0.478797E-04
 2     0.741906E-04  -104.531

Correlation using function cor

 Array of 2 by 2 elements (real*4)
            1              2
 1     1.00000       -0.484451
 2    -0.484451       1.00000
B34S Matrix Command Ending. Last Command reached.

Space available in allocator       7869592, peak space used   803279
Number variables used                   37, peak number used       48
Number temp variables used             918, # user temp clean       0
In Chapter 2 equation (2.9-6) we showed the relationship between the population error e
and the sample error u. Recall that u = (I - X(X'X)^-1 X')e = Me. The sum of squared sample
residuals was related to the population residuals by u'u = e'MMe = e'Me. Since M is not full
rank, it is not possible to estimate the population residual as e = M^-1 u and hence there are only
N-K BLUS residuals. Theil shows that u'u = y'My. Sample job CH16_LUS in bookruns.mac
illustrates these calculations. For more detail see Chapter 2.
b34sexec matrix;
* See Chapter 2 equation 2.9-4) - (2.9-7) ;
* In this example all coefficients are 1.0;
n=20;
k=5;
beta=vector(k:)+1.;
x=rn(matrix(n,k:));
x(,1)=1.0;
y=x*beta+rn(vector(norows(x):));
bigi=matrix(n,n:) + 1.;
m=(bigi-x*inv(transpose(x)*x)*transpose(x));
mm=m*m;
test=sum(dabs(mm-m));
call print('Test ',test:);
call print('Theil (1971) page shows sumsq error = y*m*y');
testss=y*m*y;
betahat=inv(transpose(x)*x)*transpose(x)*y;
call olsq(y,x :noint :print);
call print(testss,betahat);
u=y-x*betahat;
u_alt=m*y;
sse=sumsq(u);
sse2=sumsq(u_alt);
call print('Two ways to get sum of squares',sse,sse2);
call print('Show M not full rank',det(m));
b34srun;
Which when run produces:

=>  N=20$
=>  K=5$
=>  BETA=VECTOR(K:)+1.$
=>  X=RN(MATRIX(N,K:))$
=>  X(,1)=1.0$
=>  Y=X*BETA+RN(VECTOR(NOROWS(X):))$
=>  BIGI=MATRIX(N,N:) + 1.$
=>  M=(BIGI-X*INV(TRANSPOSE(X)*X)*TRANSPOSE(X))$
=>  MM=M*M$
=>  TEST=SUM(DABS(MM-M))$
=>  CALL PRINT('Test ',TEST:)$

Test     =   9.350213745623615E-15

=>  CALL PRINT('Theil (1971) page shows sumsq error = y*m*y')$

Theil (1971) page shows sumsq error = y*m*y

=>  TESTSS=Y*M*Y$
=>  BETAHAT=INV(TRANSPOSE(X)*X)*TRANSPOSE(X)*Y$
=>  CALL OLSQ(Y,X :NOINT :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                 Y
Centered R**2                      0.8293118234985847
Residual Sum of Squares            14.14785861384947
Residual Variance                  0.9431905742566311
Sum Absolute Residuals             12.12518257459587
1/Condition XPX                    0.2405358242295580
Maximum Absolute Residual          1.875788654723293
Number of Observations             20

Variable    Lag   Coefficient    SE            t
Col____1     0    1.5104320      0.24171911    6.2487072
Col____2     0    0.87494443     0.27288711    3.2062505
Col____3     0    0.63416941     0.22752179    2.7872909
Col____4     0    1.2184971      0.22690316    5.3701196
Col____5     0    1.0098815      0.29505957    3.4226360

=>  CALL PRINT(TESTSS,BETAHAT)$

TESTSS  =   14.147859

BETAHAT = Vector of 5 elements
 1.51043       0.874944      0.634169      1.21850       1.00988

=>  U=Y-X*BETAHAT$
=>  U_ALT=M*Y$
=>  SSE=SUMSQ(U)$
=>  SSE2=SUMSQ(U_ALT)$
=>  CALL PRINT('Two ways to get sum of squares',SSE,SSE2)$

Two ways to get sum of squares
SSE     =   14.147859
SSE2    =   14.147859

=>  CALL PRINT('Show M not full rank',DET(M))$

Show M not full rank
 0.17959633E-79
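The idempotency of M and the identity u'u = y'My can be verified outside B34S as well. The sketch below (assumption: plain Python, restricted to the simple one-regressor case where X is a column of ones, so that M = I - X(X'X)^-1 X' reduces to I - ii'/n) checks both properties numerically:

```python
n = 5
y = [2.0, 4.0, 6.0, 8.0, 10.0]
ybar = sum(y) / n
# M = I - ii'/n for the constant-only regression
M = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]

# M is idempotent: M*M = M (to rounding error)
MM = [[sum(M[i][t] * M[t][j] for t in range(n)) for j in range(n)] for i in range(n)]
print(all(abs(MM[i][j] - M[i][j]) < 1e-12 for i in range(n) for j in range(n)))

# u = My are the residuals from regressing y on a constant, so u'u equals
# both the sum of squared deviations from the mean and y'My.
u = [sum(M[i][t] * y[t] for t in range(n)) for i in range(n)]
print(abs(sum(ui * ui for ui in u) - sum((yi - ybar) ** 2 for yi in y)) < 1e-9)
yMy = sum(y[i] * u[i] for i in range(n))
print(abs(yMy - sum(ui * ui for ui in u)) < 1e-9)
```

The same check with a full X matrix only requires replacing M with I - X(X'X)^-1 X', as the B34S job above does.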
The BLUE property of OLS discussed in chapter 2 requires that there be no correlation between
the estimated left hand side vector ŷ = Xβ̂ and the residual vector ê. Given that X is a matrix of
explanatory variables, and ŷ = Xβ̂, then:

ŷ'ê = 0
(Xβ̂)'ê = 0
(Xβ̂)'(y - ŷ) = 0                                         (16.4-2)
β̂'X'y - β̂'X'Xβ̂ = 0
β̂'[X'y - X'Xβ̂] = 0

which implies the restriction X'y = X'Xβ̂ from which the OLS solution
equation β̂ = (X'X)^-1 X'y quickly follows. This is illustrated assuming β = [1., 2., 3., 4., 5.]
n=30;
x=rn(matrix(n,5:));
x(,1)=1.0;
beta=vector(5:1. 2. 3. 4. 5.);
y=x*beta +rn(vector(n:));
xpx=transpose(x)*x;
betahat=inv(xpx)*transpose(x)*y;
call olsq(y x :print :noint);
resid=y-x*betahat;
/$ Test if orthogonal
call print(beta,betahat,'Is residual Orthogonal with yhat?',
ddot(resid,x*betahat));
The results are verified with the olsq command and the orthogonality restriction ŷ'ê = 0 is tested.
The calculated ddot value 0.21405100E-12 suggests that the restriction is met by the solution
vector.
=>  N=30$
=>  X=RN(MATRIX(N,5:))$
=>  X(,1)=1.0$
=>  BETA=VECTOR(5:1. 2. 3. 4. 5.)$
=>  Y=X*BETA +RN(VECTOR(N:))$
=>  XPX=TRANSPOSE(X)*X$
=>  BETAHAT=INV(XPX)*TRANSPOSE(X)*Y$
=>  CALL OLSQ(Y X :PRINT :NOINT)$

Ordinary Least Squares Estimation
Dependent variable                 Y
Centered R**2                      0.9804554404605988
Residual Sum of Squares            23.50833421529196
Residual Variance                  0.9403333686116784
Sum Absolute Residuals             21.45426309533863
1/Condition XPX                    0.2398115678243269
Maximum Absolute Residual          2.059994965106420
Number of Observations             30

Variable    Lag   Coefficient    SE            t
Col____1     0    1.1096979      0.18104778    6.1293098
Col____2     0    1.8668577      0.18101248    10.313420
Col____3     0    2.8053762      0.21003376    13.356787
Col____4     0    3.9369505      0.17176593    22.920439
Col____5     0    4.8105246      0.18925387    25.418369

=>  RESID=Y-X*BETAHAT$
=>  CALL PRINT(BETA,BETAHAT,'Is residual Orthogonal with yhat?',
=>  DDOT(RESID,X*BETAHAT))$

BETA    = Vector of 5 elements
 1.00000       2.00000       3.00000       4.00000       5.00000

BETAHAT = Vector of 5 elements
 1.10970       1.86686       2.80538       3.93695       4.81052

Is residual Orthogonal with yhat?
 0.21405100E-12
Given A and B are square matrices, linear algebra rules for inverse and determinant are:
|A^-1| = 1/|A|
A^-1 A = I
(AB)^-1 = B^-1 A^-1                                      (16.4-3)
(A^-1)' = (A')^-1
(ABC)^-1 = C^-1 B^-1 A^-1
This is illustrated by:
/$ Rules of Inverses
call print(xpx, inv(xpx),xpx*inv(xpx));
call print(' ' 'We perform tests involving inverses '
'1/det(x) = det(inv(x))',1./det(xpx),det(inv(xpx)),
'Test if: transpose(inv(x)) = inv(transpose(x))',
transpose(inv(xpx)),
inv(transpose(xpx)));
x1=xpx;
x2=rn(xpx);
x3=rn(x2);
call print('Test if: inv(x1*x2*x3) = inv(x3)*inv(x2)*inv(x1)',
inv(x1*x2*x3),inv(x3)*inv(x2)*inv(x1));
which when run produces:
=>  CALL PRINT(XPX, INV(XPX),XPX*INV(XPX))$

XPX     = Matrix of 5 by 5 elements
            1              2              3              4              5
 1     30.0000        4.88927        10.3848       -1.45371       -0.536573
 2     4.88927        22.1316        0.241093      -6.20091       -2.52910
 3     10.3848        0.241093       30.4069        3.57937       -7.08262
 4    -1.45371       -6.20091        3.57937        27.9920       -4.84565
 5    -0.536573      -2.52910       -7.08262       -4.84565       39.5877

 Matrix of 5 by 5 elements
            1              2              3              4              5
 1     0.396630E-01  -0.842611E-02  -0.142149E-01   0.160454E-02  -0.234751E-02
 2    -0.842611E-02   0.507700E-01   0.228504E-02   0.113704E-01   0.492987E-02
 3    -0.142149E-01   0.228504E-02   0.397422E-01  -0.417971E-02   0.655197E-02
 4     0.160454E-02   0.113704E-01  -0.417971E-02   0.397024E-01   0.486005E-02
 5    -0.234751E-02   0.492987E-02   0.655197E-02   0.486005E-02   0.273106E-01

 Matrix of 5 by 5 elements
            1              2              3              4              5
 1     1.00000        0.763278E-16   0.398986E-16   0.303577E-17   0.00000
 2    -0.754605E-16   1.00000        0.208167E-16  -0.520417E-17   0.416334E-16
 3    -0.104083E-16   0.277556E-16   1.00000        0.277556E-16  -0.555112E-16
 4     0.867362E-17   0.589806E-16   0.346945E-16   1.00000        0.00000
 5     0.00000        0.277556E-16   0.555112E-16   0.277556E-16   1.00000

=>  CALL PRINT(' ' 'We perform tests involving inverses '
=>  '1/det(x) = det(inv(x))',1./DET(XPX),DET(INV(XPX)),
=>  'Test if: transpose(inv(x)) = inv(transpose(x))',
=>  TRANSPOSE(INV(XPX)),
=>  INV(TRANSPOSE(XPX)))$

We perform tests involving inverses
1/det(x) = det(inv(x))
 0.62036286E-07
 0.62036286E-07

Test if: transpose(inv(x)) = inv(transpose(x))

 Matrix of 5 by 5 elements
            1              2              3              4              5
 1     0.396630E-01  -0.842611E-02  -0.142149E-01   0.160454E-02  -0.234751E-02
 2    -0.842611E-02   0.507700E-01   0.228504E-02   0.113704E-01   0.492987E-02
 3    -0.142149E-01   0.228504E-02   0.397422E-01  -0.417971E-02   0.655197E-02
 4     0.160454E-02   0.113704E-01  -0.417971E-02   0.397024E-01   0.486005E-02
 5    -0.234751E-02   0.492987E-02   0.655197E-02   0.486005E-02   0.273106E-01

 Matrix of 5 by 5 elements
            1              2              3              4              5
 1     0.396630E-01  -0.842611E-02  -0.142149E-01   0.160454E-02  -0.234751E-02
 2    -0.842611E-02   0.507700E-01   0.228504E-02   0.113704E-01   0.492987E-02
 3    -0.142149E-01   0.228504E-02   0.397422E-01  -0.417971E-02   0.655197E-02
 4     0.160454E-02   0.113704E-01  -0.417971E-02   0.397024E-01   0.486005E-02
 5    -0.234751E-02   0.492987E-02   0.655197E-02   0.486005E-02   0.273106E-01

=>  X1=XPX$
=>  X2=RN(XPX)$
=>  X3=RN(X2)$
=>  CALL PRINT('Test if: inv(x1*x2*x3) = inv(x3)*inv(x2)*inv(x1)',
=>  INV(X1*X2*X3),INV(X3)*INV(X2)*INV(X1))$

Test if: inv(x1*x2*x3) = inv(x3)*inv(x2)*inv(x1)

 Matrix of 5 by 5 elements
            1              2              3              4              5
 1    -0.110146E-01  -0.295302E-01   0.305773E-01  -0.208158E-01   0.230482E-01
 2    -0.897845E-02  -0.246023E-01   0.475944E-01  -0.141358E-01   0.218481E-01
 3    -0.528828E-02   0.373354E-01  -0.612578E-02   0.128556E-01  -0.678314E-02
 4     0.799993E-02  -0.780149E-02  -0.239615E-02   0.111239E-01  -0.453775E-02
 5    -0.559507E-04  -0.513571E-02   0.296128E-01  -0.399576E-02   0.199970E-01

 Matrix of 5 by 5 elements
            1              2              3              4              5
 1    -0.110146E-01  -0.295302E-01   0.305773E-01  -0.208158E-01   0.230482E-01
 2    -0.897845E-02  -0.246023E-01   0.475944E-01  -0.141358E-01   0.218481E-01
 3    -0.528828E-02   0.373354E-01  -0.612578E-02   0.128556E-01  -0.678314E-02
 4     0.799993E-02  -0.780149E-02  -0.239615E-02   0.111239E-01  -0.453775E-02
 5    -0.559507E-04  -0.513571E-02   0.296128E-01  -0.399576E-02   0.199970E-01
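The inverse rules in (16.4-3) can also be checked numerically with a small hand-rolled inverse. The sketch below is a hedged pure-Python illustration (the Gauss-Jordan routine with partial pivoting is an assumption for small dense matrices, not the algorithm B34S uses internally):

```python
def inv(A):
    n = len(A)
    # Augment with the identity and run Gauss-Jordan elimination.
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivot
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[4.0, 1.0], [2.0, 3.0]]
B = [[1.0, 2.0], [0.0, 1.0]]
# Check (AB)^-1 = B^-1 A^-1 to rounding error.
lhs = inv(matmul(A, B))
rhs = matmul(inv(B), inv(A))
print(all(abs(lhs[i][j] - rhs[i][j]) < 1e-9 for i in range(2) for j in range(2)))
```

The other rules, such as (A^-1)' = (A')^-1, follow from the same two helpers by transposing before or after inverting.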
The Kronecker product of a matrix A which is k by j and B which is m by n produces C
which is k*m by j*n. In words, every element of A multiplies the entire B matrix. Using the
Greene (2000) example data with:
a=matrix(2,2:3 0 5 2); b=matrix(2,2:1 4 4 7);
call print(a,b,kprod(a,b),
a(1,1)*b,
a(1,2)*b,
a(2,1)*b,
a(2,2)*b);
we print A, B, the Kronecker product A ⊗ B, and each element block.
=>  A=MATRIX(2,2:3 0 5 2)$
=>  B=MATRIX(2,2:1 4 4 7)$
=>  CALL PRINT(A,B,KPROD(A,B),
=>  A(1,1)*B,
=>  A(1,2)*B,
=>  A(2,1)*B,
=>  A(2,2)*B)$

A       = Matrix of 2 by 2 elements
            1          2
 1     3.00000    0.00000
 2     5.00000    2.00000

B       = Matrix of 2 by 2 elements
            1          2
 1     1.00000    4.00000
 2     4.00000    7.00000

 Matrix of 4 by 4 elements
            1          2          3          4
 1     3.00000    12.0000    0.00000    0.00000
 2     12.0000    21.0000    0.00000    0.00000
 3     5.00000    20.0000    2.00000    8.00000
 4     20.0000    35.0000    8.00000    14.0000

 Matrix of 2 by 2 elements
            1          2
 1     3.00000    12.0000
 2     12.0000    21.0000

 Matrix of 2 by 2 elements
            1          2
 1     0.00000    0.00000
 2     0.00000    0.00000

 Matrix of 2 by 2 elements
            1          2
 1     5.00000    20.0000
 2     20.0000    35.0000

 Matrix of 2 by 2 elements
            1          2
 1     2.00000    8.00000
 2     8.00000    14.0000
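The block structure of the Kronecker product makes it a two-line function in most languages. As a hedged illustration (plain Python nested lists, not the B34S kprod implementation), the following reproduces the 4 by 4 result printed above from Greene's example matrices:

```python
# kron(A, B)[i*m+p][j*n+q] = A[i][j] * B[p][q]: each a in a row of A
# scales a full row of B, and row blocks follow the rows of A.
def kron(A, B):
    return [[a * b for a in rowA for b in rowB]
            for rowA in A for rowB in B]

A = [[3.0, 0.0], [5.0, 2.0]]   # Greene's example from the text
B = [[1.0, 4.0], [4.0, 7.0]]
for row in kron(A, B):
    print(row)
```

The zero block in rows 1-2, columns 3-4 of the result comes from A(1,2) = 0, matching the A(1,2)*B block in the B34S output.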
There are a number of very important factorizations in linear algebra. Assume A is a general
matrix and B is a positive definite matrix, both of size n by n. The LU factorization of a general
matrix writes A in terms of a lower triangular matrix L and an upper triangular matrix U.
A = LU
A^-1 = U^-1 L^-1                                         (16.4-4)

The Cholesky decomposition writes B in terms of an upper triangular matrix R as

B = R'R                                                  (16.4-5)
The following code illustrates these decompositions.
/$ LU and Cholesky factorization
n=5; x=rn(matrix(n,n:));
xpx=transpose(x)*x;
call gmfac(xpx,l,u);
r=pdfac(xpx);
call print('Inverse from L U = inv(u)*inv(l)'
'inv(xpx)',
inv(xpx),
'inv(u)',
inv(u),
'inv(l)',
inv(l),
'Test inverse from looking at u and l',
'inv(u)*inv(l)',
inv(u)*inv(l));
call print(xpx,l,u,'l*u',l*u,
'Cholesky Factorization of pd matrix',r,
'transpose(r)*r',
transpose(r)*r);
Edited output is:
=>       N=5$
=>       X=RN(MATRIX(N,N:))$
=>       XPX=TRANSPOSE(X)*X$
=>       CALL GMFAC(XPX,L,U)$
=>       R=PDFAC(XPX)$
=>       CALL PRINT('Inverse from L U = inv(u)*inv(l)'
         'inv(xpx)', INV(XPX), 'inv(u)', INV(U), 'inv(l)', INV(L),
         'Test inverse from looking at u and l',
         'inv(u)*inv(l)', INV(U)*INV(L))$
=>       CALL PRINT(XPX,L,U,'l*u',L*U,
         'Cholesky Factorization of pd matrix',R,
         'transpose(r)*r', TRANSPOSE(R)*R)$

[5 by 5 matrix listings omitted. The run verifies that inv(u)*inv(l)
reproduces inv(xpx), that l*u reproduces xpx, and that for the Cholesky
factor r of the positive definite matrix xpx, transpose(r)*r reproduces
xpx.]
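For readers who want to verify the same identities outside B34S, here is a minimal numpy/scipy sketch of the LU and Cholesky checks. Note one difference from the gmfac call above: scipy's lu pivots, returning A = P L U rather than A = L U, so the permutation P enters the inverse.

```python
import numpy as np
from scipy.linalg import lu, cholesky

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
xpx = x.T @ x                      # positive definite, like XPX above

# LU factorization of a general matrix: scipy returns A = P L U
p, l, u = lu(xpx)
assert np.allclose(p @ l @ u, xpx)

# The inverse can be recovered from the triangular factors:
# inv(A) = inv(U) inv(L) inv(P), and inv(P) = P' for a permutation
a_inv = np.linalg.inv(u) @ np.linalg.inv(l) @ p.T
assert np.allclose(a_inv, np.linalg.inv(xpx))

# Cholesky factorization of a PD matrix: B = R'R with R upper triangular
r = cholesky(xpx, lower=False)
assert np.allclose(r.T @ r, xpx)
```

In practice one solves triangular systems with the factors rather than forming explicit inverses; the inverses appear here only to mirror the B34S demonstration.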
The usual eigenvalue decomposition of A writes it in terms of a "left handed" eigenvector matrix
V and a diagonal matrix D with the eigenvalues λᵢ along the diagonal as

     AV = VD
     A = VDV⁻¹                                                           (16.4-6)

The trace of a matrix is Σⱼ λⱼ, the sum of the diagonal elements of D, while the determinant is
Πⱼ λⱼ. If A is symmetric we have V' = V⁻¹, which implies that all the columns in V are
orthogonal, or that for this case V'V = I. The Schur decomposition writes

     A = USU'                                                            (16.4-7)

where U is an orthogonal matrix and S is block upper triangular with the eigenvalues on the
diagonal. Here U'U = I for all matrices, unlike the eigenvalue decomposition, where V'V = I
only when the factored matrix is symmetric. Code to illustrate these types of calculations for
both general and symmetric matrices:
/$ Eigenvalues & Schur
/$ we write a*v = v*lamda
a=matrix(2,2:5,1 2,4);
lamda=eig(a,v);
det1=det(a);
trace1=trace(a);
det2=prod(lamda);
trace2=sum(lamda);
lamda=diagmat(lamda);
call schur(a,s,u);
call print('We have defined a general matrix',a,
lamda,
v,
'Is sum of eigenvalues trace?'
'Is product of eigenvalues det?'
det1,det2,trace1,trace2,
'With Eigenvalues a = v*lamda*inv(v)',
v*lamda*inv(v),
' ',
'With Schur a = u*s*transpose(u)
'
'Schur => s upper triangular',
s,u,
'a = u*s*transpose(u)'
'u*transpose(u)=I'
u*transpose(u),
'This is a from the schur, is it a?',
u*s*transpose(u));
/$ PD Matrix Case
call print('Positive Def case s = lamda'
'transpose(v)*v = I'
'sum(lamda)=trace'
'prod(lamda)=det');
a=xpx;
/$ Note we use the symmetric eigen call here
lamda=seig(a,v);
d       =diagmat(lamda);
call schur(a,s,u);
call print(a,lamda,d,v,
'With Eigenvalues a = v*d*inv(v)',
v*d*inv(v),
' ',
'inv(u)=transpose(u)',
inv(u),transpose(u),
'Is v*transpose(v)= I ?',
v*transpose(v),
'Is transpose(v)*v= I ?',
transpose(v)*v,
'A = v*d*inv(v)',
v*d*inv(v),
'With Schur a = u*s*transpose(u)',
'Schur => s upper triangular',
s,u,
'a = u*s*transpose(u)',
'u*transpose(u)=I'
u*transpose(u),
'This is a matrix from the schur',
u*s*transpose(u),
'sum(lamda)=trace'
'prod(lamda)=det',
sum(lamda),trace(a),
prod(lamda),det(a));
produces detailed but instructive results:
=>       A=MATRIX(2,2:5,1 2,4)$
=>       LAMDA=EIG(A,V)$
=>       DET1=DET(A)$
=>       TRACE1=TRACE(A)$
=>       DET2=PROD(LAMDA)$
=>       TRACE2=SUM(LAMDA)$
=>       LAMDA=DIAGMAT(LAMDA)$
=>       CALL SCHUR(A,S,U)$

We have defined a general matrix

A       = Matrix of 2 by 2 elements
               1          2
   1     5.00000    1.00000
   2     2.00000    4.00000

LAMDA   = Complex Matrix of 2 by 2 elements
   1  (  6.000 ,  0.000 )  (  0.000 ,  0.000 )
   2  (  0.000 ,  0.000 )  (  3.000 ,  0.000 )

V       = Complex Matrix of 2 by 2 elements
   1  (  0.7071,  0.000 )  ( -0.4714,  0.000 )
   2  (  0.7071,  0.000 )  (  0.9428,  0.000 )

Is sum of eigenvalues trace?
Is product of eigenvalues det?

DET1    =   18.000000
DET2    =  (18.00000000000000,0.000000000000000E+00)
TRACE1  =   9.0000000
TRACE2  =  (9.000000000000000,0.000000000000000E+00)

With Eigenvalues a = v*lamda*inv(v) reproduces A.

With Schur a = u*s*transpose(u)
Schur => s upper triangular

S       = Matrix of 2 by 2 elements
   1     6.00000   -1.00000
   2     0.00000    3.00000

U       = Matrix of 2 by 2 elements
   1    0.707107  -0.707107
   2    0.707107   0.707107

u*transpose(u) = I and u*s*transpose(u) reproduces A.

Positive Def case s = lamda
transpose(v)*v = I
sum(lamda)=trace
prod(lamda)=det

[5 by 5 listings for A = XPX omitted. The symmetric eigenvalue routine
returns LAMDA = (0.639475, 1.59842, 3.38122, 8.20610, 11.7777); with
D = diagmat(LAMDA), v*d*inv(v) reproduces A; inv(u) equals transpose(u);
both v*transpose(v) and transpose(v)*v equal I to machine precision; the
Schur matrix S is diagonal, with the eigenvalues in descending order, and
u*s*transpose(u) reproduces A. Finally, sum(lamda) = trace(a) = 25.602863
and prod(lamda) = det(a) = 334.02779.]
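The same checks can be reproduced outside B34S; the numpy/scipy sketch below uses the 2 by 2 matrix from the run above and verifies the eigenvalue and Schur identities, including the trace and determinant relations.

```python
import numpy as np
from scipy.linalg import schur

a = np.array([[5., 1.], [2., 4.]])           # the general matrix from the run

# Eigenvalue decomposition: A V = V D, so A = V D inv(V)
lam, v = np.linalg.eig(a)                    # eigenvalues 6 and 3
d = np.diag(lam)
assert np.allclose(v @ d @ np.linalg.inv(v), a)

# trace = sum of eigenvalues, det = product of eigenvalues
assert np.isclose(lam.sum(), np.trace(a))            # 9
assert np.isclose(lam.prod(), np.linalg.det(a))      # 18

# Schur decomposition: A = U S U' with U orthogonal, S upper triangular
s, u = schur(a)
assert np.allclose(u @ s @ u.T, a)
assert np.allclose(u @ u.T, np.eye(2))       # U'U = I holds for any A
```

For a symmetric A the eigenvector matrix itself becomes orthogonal (numpy's eigh, like seig above, exploits this), and the Schur matrix S collapses to the diagonal of eigenvalues.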
Note that only with the symmetric matrix are the eigenvectors orthogonal. The SVD
decomposition, discussed in Chapter 10, writes

     A = UΣV'                                                            (16.4-8)

where both U and V are orthogonal (U'U = I and V'V = I) whether A is symmetric or not. Σ is
a diagonal matrix and, if A is symmetric, contains the eigenvalues. Define X as the n by k matrix
of explanatory variables and factor it as X = UΣV'. The OLS coefficients are β̂ = VΣ⁻¹U'y = VΣ⁻¹α̂,
where α̂ = U'y is the principal component coefficient vector. As discussed in Chapter 10,
the QR factorization operates on X directly and writes it as

     X = QR                                                              (16.4-9)

where R is the upper triangular Cholesky matrix and Q'Q = I. Using the QR approach the OLS
coefficient vector can be calculated as β̂ = R⁻¹Q₁'y, where Q₁ is the truncated QR factorization
Q.11

The eigenvalue and SVD decompositions are shown next.
/$ SVD case
n=4;
noob=20;
x=rn(matrix(noob,n:));
s=svd(x,b,11,u,v);
call print('X',x,'Singular values',s,'Left Singular vectors',U,
           'Right Singular vectors',v);
call print('Test of Factorization. Is S along diagonal?',
           'Transpose(u)*x*v',transpose(u)*x*v,
           'Is U orthagonal?','transpose(U)*U',
           transpose(U)*U,
           'Is V orthagonal?','transpose(V)*V',
           transpose(V)*V,
           '
           ');
/$ OLS with SVD
n=30;
k=5;
x       =rn(matrix(n,k:));
x(,1)   =1.0;
beta    =vector(5:1. 2. 3. 4. 5.);
y       =x*beta +rn(vector(n:));
xpx     =transpose(x)*x;
* Solve reduced problem;
s       =svd(x,bad,21,u,v);
sigma   =diagmat(s);
betahat1=inv(xpx)*transpose(x)*y;
betahat2=v*inv(sigma)*transpose(u)*y;
call print('OLS from two approaches',betahat1,betahat2);
/$ Show that SVD of PD matrix produces eigenvalues
x=rn(matrix(5,5:));
xpx=Transpose(x)*x;
e=eig(xpx);
ee=seig(xpx);
s=svd(xpx);
call print(e,ee,s);
b34srun;

11 See Chapter 10 and especially equation (10.1-4) for a further discussion of the SVD and QR
approaches to OLS estimation.
Edited output produces:
=>       N=4$
=>       NOOB=20$
=>       X=RN(MATRIX(NOOB,N:))$
=>       S=SVD(X,B,11,U,V)$
=>       CALL PRINT('X',X,'Singular values',S,'Left Singular vectors',U,
         'Right Singular vectors',V)$

[The 20 by 4 matrix X and the 20 by 20 left singular vector matrix U are
omitted.]

Singular values

S       = Vector of 4 elements
         6.23289    5.81154    4.30460    3.01437

Right Singular vectors

V       = Matrix of 4 by 4 elements
               1               2               3               4
   1     0.606725       0.545782      -0.525263      -0.241052
   2     0.598306      -0.263772E-01   0.767939      -0.227166
   3    -0.338768       0.833411       0.363935       0.241276
   4    -0.398937       0.827817E-01   0.438230E-01  -0.912182

Test of Factorization. Is S along diagonal?

[transpose(u)*x*v is a 20 by 4 matrix with the singular values along the
diagonal and off-diagonal elements of order 1E-15; transpose(U)*U and
transpose(V)*V both equal the identity to machine precision.]

=>       N=30$
=>       K=5$
=>       X       =RN(MATRIX(N,K:))$
=>       X(,1)   =1.0$
=>       BETA    =VECTOR(5:1. 2. 3. 4. 5.)$
=>       Y       =X*BETA +RN(VECTOR(N:))$
=>       XPX     =TRANSPOSE(X)*X$
=>       S       =SVD(X,BAD,21,U,V)$
=>       SIGMA   =DIAGMAT(S)$
=>       BETAHAT1=INV(XPX)*TRANSPOSE(X)*Y$
=>       BETAHAT2=V*INV(SIGMA)*TRANSPOSE(U)*Y$
=>       CALL PRINT('OLS from two approaches',BETAHAT1,BETAHAT2)$

OLS from two approaches

[BETAHAT1 and BETAHAT2 are identical 5 element vectors.]

=>       X=RN(MATRIX(5,5:))$
=>       XPX=TRANSPOSE(X)*X$
=>       E=EIG(XPX)$
=>       EE=SEIG(XPX)$
=>       S=SVD(XPX)$
=>       CALL PRINT(E,EE,S)$

[E, EE and S all contain the eigenvalues 0.360474, 1.47185, 2.25274,
5.46686 and 13.4516: E as a complex vector with zero imaginary parts, EE
as a real matrix in ascending order, and S as the vector of singular
values in descending order. For a positive definite matrix the singular
values equal the eigenvalues.]

B34S Matrix Command Ending. Last Command reached.
Space available in allocator   2881535, peak space used   6234
Number variables used               88, peak number used     97
Number temp variables used         407, # user temp clean     0
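The equivalence of the normal-equation, SVD and QR routes to the OLS coefficient vector can also be checked outside B34S. The numpy sketch below mirrors the experiment above; the seed and simulated data are arbitrary, not the values from the B34S run.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 30, 5
x = rng.standard_normal((n, k))
x[:, 0] = 1.0                                  # constant term in column 1
beta = np.array([1., 2., 3., 4., 5.])
y = x @ beta + rng.standard_normal(n)

# Normal equations: b = inv(X'X) X'y
b_ols = np.linalg.inv(x.T @ x) @ x.T @ y

# SVD route (16.4-8): X = U S V', so b = V inv(S) U'y
u, s, vt = np.linalg.svd(x, full_matrices=False)
b_svd = vt.T @ np.diag(1.0 / s) @ u.T @ y

# QR route (16.4-9): X = Q1 R, so b = inv(R) Q1'y,
# solved as a triangular system rather than by explicit inversion
q1, r = np.linalg.qr(x)
b_qr = np.linalg.solve(r, q1.T @ y)

assert np.allclose(b_ols, b_svd)
assert np.allclose(b_ols, b_qr)
```

All three routes agree here; the accuracy experiments later in the chapter show why the SVD and QR routes are preferred when X'X is ill conditioned.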
16.5 Extended Eigenvalue Analysis

Recall from (16.4) that given a general matrix A, an eigenvalue decomposition writes
AV = VD or A = VDV⁻¹, where V is the usual "left hand" eigenvector matrix and D is a
diagonal matrix with the eigenvalues λᵢ along the diagonal. It should be noted that the
matrix V is not unique and can be scaled. The matrix command e=eig(a,v) will use the
eispack routines rg and cg to produce a non-scaled eigenvector matrix, while the commands
e=eig(a,v :LAPACK) or e=eig(a,v :LAPACK2) will produce eigenvalues using the
lapack routines dgeevx/zgeevx or dgeev/zgeev, where V is scaled so that each column has a
norm of 1. The dgeevx/zgeevx call does not balance the matrix, while dgeev/zgeev does. In
addition to the usual eigenvectors, it is possible to define "right handed" eigenvectors V* where:

     (V*)ᴴA = D(V*)ᴴ
     A = ((V*)ᴴ)⁻¹D(V*)ᴴ                                                 (16.5-1)

The below listed code illustrates these refinements for real*8 and complex*16 matrices. We first
estimate and test the non-scaled eigenvectors evec, which were estimated using eispack. Next
the lapack code is used to estimate and test the right and left handed eigenvectors.
b34sexec matrix;
* Exercises Eigenvalue calculations ;
* IMSL test case ;
A = matrix(3,3: 8.0, -1.0,-5.0,
-4.0, 4.0,-2.0,
18.0, -5.0,-7.0);
e =eig(a,evec);
call print('Test Eispack',a,evec*diagmat(e)*inv(evec));
e2 =eig(a,evecr,evecl :lapack);
call print('test eispack vs lapack':);
call print(a,e,evec,e2,evecr,evecl);
call print('test right'
evecr*diagmat(e2)*inv(evecr)
'test left'
inv(transpose(dconj(evecl)))*diagmat(e2)*transpose(dconj(evecl)));
ca=complex(a,a*a);
e =eig(ca,evec);
call print('Test Eispack factorization',
ca,evec*diagmat(e)*inv(evec));
e2 =eig(ca,evecr,evecl :lapack);
call print('test eispack vs lapack':);
call print(ca,e,evec,e2,evecr,evecl);
call print('test right'
evecr*diagmat(e2)*inv(evecr)
'test left'
82
Matrix Command Language
inv(transpose(dconj(evecl)))*diagmat(e2)*transpose(dconj(evecl)));
b34srun;
Edited and condensed output is listed next. Imaginary parts of order 1E-14
or smaller in the reconstructed matrices are rounding noise.

B34S(r) Matrix Command. Version February 2004.
Date of Run d/m/y 26/ 2/04. Time of Run h:m:s 14:55:16.

=>   * EXERCISES EIGENVALUE CALCULATIONS $
=>   * IMSL TEST CASE $
=>   A = MATRIX(3,3: 8.0, -1.0,-5.0,
                -4.0,  4.0,-2.0,
                18.0, -5.0,-7.0)$
=>   E =EIG(A,EVEC)$
=>   CALL PRINT('Test Eispack',A,EVEC*DIAGMAT(E)*INV(EVEC))$

Test Eispack

A        = Matrix of 3 by 3 elements

             1           2           3
 1       8.00000    -1.00000    -5.00000
 2      -4.00000     4.00000    -2.00000
 3       18.0000    -5.00000    -7.00000

EVEC*DIAGMAT(E)*INV(EVEC) reproduces A as a complex matrix whose imaginary
parts are of order 1E-14 or smaller.

=>   E2 =EIG(A,EVECR,EVECL :LAPACK)$
=>   CALL PRINT('test eispack vs lapack':)$
=>   CALL PRINT(A,E,EVEC,E2,EVECR,EVECL)$

E        = Complex Vector of 3 elements
         (  2.000,  4.000) (  2.000, -4.000) (  1.000,  0.000)

E2 agrees with E. The eispack eigenvectors EVEC and the lapack right
eigenvectors EVECR agree up to scaling; the lapack call also returns the
left eigenvectors EVECL.

=>   CALL PRINT('test right' EVECR*DIAGMAT(E2)*INV(EVECR)
            'test left'
            INV(TRANSPOSE(DCONJ(EVECL)))*DIAGMAT(E2)*TRANSPOSE(DCONJ(EVECL)))$

Both the right and the left eigenvector reconstructions reproduce A with
errors of order 1E-14.

=>   CA=COMPLEX(A,A*A)$
=>   E =EIG(CA,EVEC)$
=>   CALL PRINT('Test Eispack factorization',CA,EVEC*DIAGMAT(E)*INV(EVEC))$

Test Eispack factorization

CA       = Complex Matrix of 3 by 3 elements

              1                   2                   3
 1  (  8.000, -22.00)  ( -1.000,  13.00)  ( -5.000, -3.000)
 2  ( -4.000, -84.00)  (  4.000,  30.00)  ( -2.000,  26.00)
 3  (  18.00,  38.00)  ( -5.000, -3.000)  ( -7.000, -31.00)

EVEC*DIAGMAT(E)*INV(EVEC) reproduces CA.

=>   E2 =EIG(CA,EVECR,EVECL :LAPACK)$

E        = Complex Vector of 3 elements
         ( -14.00, -8.000) (  18.00, -16.00) (  1.000,  1.000)

Again E2 agrees with E, and both the right and the left eigenvector
reconstructions reproduce CA.

B34S Matrix Command Ending. Last Command reached.

Space available in allocator      2874697, peak space used      1518
Number variables used                  28, peak number used        40
Number temp variables used             86, # user temp clean        0
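The identity being exercised, A = EVEC*DIAGMAT(E)*INV(EVEC), implies that each reported eigenvalue must be a root of det(A - lambda*I). As an independent cross-check, the following Python sketch (an illustration only; B34S does not use Python) verifies the reported values 2+4i, 2-4i and 1 for the IMSL test matrix.

```python
# Verify that 2+4i, 2-4i and 1 are the eigenvalues of the IMSL test
# matrix by checking that det(A - lambda*I) = 0 for each candidate.
A = [[ 8.0, -1.0, -5.0],
     [-4.0,  4.0, -2.0],
     [18.0, -5.0, -7.0]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def char_poly(lam):
    """det(A - lam*I); zero when lam is an eigenvalue of A."""
    m = [[A[i][j] - (lam if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    return det3(m)

for lam in (2 + 4j, 2 - 4j, 1 + 0j):
    assert abs(char_poly(lam)) < 1e-9, lam
print("2+4i, 2-4i and 1 are the eigenvalues of A")
```

Because all entries of A are small integers, the characteristic polynomial evaluates to essentially exact zero in double precision at these roots, which is why the imaginary parts reported by eispack and lapack above are pure rounding noise.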
The reason the matrix command has both types of eigenvalue routines is that while above order 200 the lapack code is faster, below order 200 the eispack code is faster. The next job tests this and in addition uses the eispack symmetric matrix codes tred1/imtql1 and tred2/imtql2 for eigenvalues and eigenvalues/eigenvectors, respectively. In the following series of tests, where appropriate, two computers were used, both running Windows XP Professional. The Dell Latitude Model C810 had an Intel Family 6 Model 11 CPU with a speed of 1,122 MHz as measured by MS Word version 2003 and 512 MB of memory and represents the "low end". The Dell Model 650 workstation had two dual core Xeon CPUs, each 3,056 MHz as measured by MS Word, and represents "high end" computing capability.
b34sexec matrix;
* ispeed1 on pd matrix ;
* ispeed2 on general matrix;
* ispeed3 on complex general matrix;
* up 625 has been run ;
igraph=0;
ispeed1=1;
ispeed2=1;
ispeed3=1;
upper=450;
mesh=50;
/$ PD Results
if(ispeed1.ne.0)then;
call echooff;
icount=0;
n=0;
top continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done;
x=rec(matrix(n,n:));
x=transpose(x)*x;
* x=complex(x,dsqrt(dabs(x)));
call compress;
call timer(base10);
e=seig(x);
call timer(base20);
call compress;
call timer(base110);
e=seig(x,evec);
call timer(base220);
call compress;
call timer(base11);
e=eig(x);
call timer(base22);
call compress;
call timer(base111);
e=eig(x:lapack2);
call timer(base222);
call compress;
call timer(base1);
e=eig(x,evec);
call timer(base2);
call compress;
call timer(base3);
e=eig(x,evec,evec2 :lapack2);
call timer(base4);
call compress;
call timer(base5);
e=eig(x,evec:lapack2);
call timer(base6);
size(icount)    =dfloat(n);
sm1(icount)     =base20-base10;
sm2(icount)     =base220-base110;
eispack1(icount)=(base22-base11);
lapack1(icount) =(base222-base111);
eispack2(icount)=(base2-base1);
lapack2a(icount)=(base4-base3);
lapack2b(icount)=(base6-base5);
call free(x,xinv1,ii);
go to top;
done continue;
call print('EISPACK vs LAPACK on PD Matrix ':);
call print('lapack2a gets both right and left eigenvectors':);
call tabulate(size,sm1 sm2,eispack1,lapack1,eispack2,lapack2a,lapack2b);
if(igraph.eq.1)
call graph(size sm1,sm2,eispack1,lapack1,eispack2,lapack2a,lapack2b
:plottype xyplot :nokey :file 'pd_matrix.wmf'
:heading 'Real*8 PD Matrix Results');
endif;
if(ispeed2.ne.0)then;
call echooff;
icount=0;
n=0;
top2 continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done2;
x=rec(matrix(n,n:));
* x=transpose(x)*x;
* x=complex(x,dsqrt(dabs(x)));
call compress;
call timer(base11);
e=eig(x);
call timer(base22);
call compress;
call timer(base111);
e=eig(x:lapack2);
call timer(base222);
call compress;
call timer(base1);
e=eig(x,evec);
call timer(base2);
call compress;
call timer(base3);
e=eig(x,evec,evec2 :lapack2);
call timer(base4);
call compress;
call timer(base5);
e=eig(x,evec:lapack2);
call timer(base6);
size(icount)    =dfloat(n);
eispack1(icount)=(base22-base11);
lapack1(icount) =(base222-base111);
eispack2(icount)=(base2-base1);
lapack2a(icount)=(base4-base3);
lapack2b(icount)=(base6-base5);
call free(x,xinv1,ii);
go to top2;
done2 continue;
call print('EISPACK vs LAPACK on General Matrix ':);
call print('lapack2a gets both right and left eigenvectors':);
call tabulate(size,eispack1,lapack1,eispack2,lapack2a,lapack2b);
if(igraph.eq.1)
call graph(size ,eispack1,lapack1,eispack2,lapack2a,lapack2b
:plottype xyplot :nokey :file 'real_8.wmf'
:heading 'Real*8 General Matrix Results');
endif;
if(ispeed3.ne.0)then;
call echooff;
icount=0;
n=0;
top3 continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done3;
x=rec(matrix(n,n:));
x=complex(x,dsqrt(dabs(x)));
call compress;
call timer(base11);
e=eig(x);
call timer(base22);
call compress;
call timer(base111);
e=eig(x:lapack2);
call timer(base222);
call compress;
call timer(base1);
e=eig(x,evec);
call timer(base2);
call compress;
call timer(base3);
e=eig(x,evec,evec2 :lapack2);
call timer(base4);
call compress;
call timer(base5);
e=eig(x,evec:lapack2);
call timer(base6);
size(icount)    =dfloat(n);
eispack1(icount)=(base22-base11);
lapack1(icount) =(base222-base111);
eispack2(icount)=(base2-base1);
lapack2a(icount)=(base4-base3);
lapack2b(icount)=(base6-base5);
call free(x,xinv1,ii);
go to top3;
done3 continue;
call print('EISPACK vs LAPACK on a Complex General Matrix ':);
call print('lapack2a gets both right and left eigenvectors':);
call tabulate(size,eispack1,lapack1,eispack2,lapack2a,lapack2b);
if(igraph.eq.1)
call graph(size ,eispack1,lapack1,eispack2,lapack2a,lapack2b
:plottype xyplot :nokey :file 'complex_16.wmf'
:heading 'Complex*16 Results');
endif;
b34srun;
Results from running this script on the Dell 650 workstation are shown next. Note that:
lapack2a gets both right and left eigenvectors

SM1      => Eispack with eigenvalue only        tred1/imtql1
SM2      => Eispack eigenvalue and Vec.         tred2/imtql2
Eispack1 => Eispack with eigenvalue only        rg
Lapack1  => Lapack with eigenvalue only         DGEEVX
Eispack2 => Eispack eigenvalue and eigenvector  rg
Lapack2a => Both right and left eigenvectors    dgeevx
Lapack2b => Lapack eigenvalue and Vec.          dgeevx
EISPACK vs LAPACK on PD Matrix
lapack2a gets both right and left eigenvectors

Obs   SIZE    SM1         SM2         EISPACK1    LAPACK1     EISPACK2    LAPACK2A    LAPACK2B
  1   50.00   0.000       0.000       0.000       0.000       0.1562E-01  0.000       0.000
  2   100.0   0.000       0.1562E-01  0.1562E-01  0.1562E-01  0.1562E-01  0.3125E-01  0.4688E-01
  3   150.0   0.1562E-01  0.1562E-01  0.3125E-01  0.4688E-01  0.6250E-01  0.1250      0.1250
  4   200.0   0.1562E-01  0.6250E-01  0.6250E-01  0.1250      0.1719      0.3125      0.2969
  5   250.0   0.3125E-01  0.1250      0.1406      0.2500      0.3438      0.6875      0.6406
  6   300.0   0.4688E-01  0.2188      0.2656      0.4375      0.6875      1.156       1.062
  7   350.0   0.7812E-01  0.4375      0.4531      0.7188      1.188       1.828       1.688
  8   400.0   0.1406      0.6875      0.7656      1.094       1.938       2.766       2.609
EISPACK vs LAPACK on General Matrix
lapack2a gets both right and left eigenvectors

Obs   SIZE    EISPACK1    LAPACK1     EISPACK2    LAPACK2A    LAPACK2B
  1   50.00   0.000       0.000       0.1562E-01  0.000       0.1562E-01
  2   100.0   0.1562E-01  0.1562E-01  0.3125E-01  0.3125E-01  0.4688E-01
  3   150.0   0.4688E-01  0.4688E-01  0.9375E-01  0.1094      0.1094
  4   200.0   0.9375E-01  0.1094      0.2344      0.2656      0.2500
  5   250.0   0.1875      0.2031      0.4844      0.5781      0.5312
  6   300.0   0.3438      0.3594      0.9688      0.9844      0.9219
  7   350.0   0.5781      0.5938      1.609       1.594       1.453
  8   400.0   0.9844      0.8906      2.578       2.422       2.234
EISPACK vs LAPACK on a Complex General Matrix
lapack2a gets both right and left eigenvectors

Obs   SIZE    EISPACK1    LAPACK1     EISPACK2    LAPACK2A    LAPACK2B
  1   50.00   0.1562E-01  0.1562E-01  0.3125E-01  0.1562E-01  0.1562E-01
  2   100.0   0.4688E-01  0.4688E-01  0.1875      0.9375E-01  0.7812E-01
  3   150.0   0.1719      0.1094      0.6875      0.2812      0.2656
  4   200.0   0.3906      0.2656      1.609       0.6875      0.6406
  5   250.0   0.8594      0.5625      3.344       1.438       1.328
  6   300.0   1.672       0.9062      6.031       2.375       2.188
  7   350.0   3.000       1.438       9.922       3.812       3.531
  8   400.0   4.359       2.141       14.47       5.688       5.312
By using the appropriate routine (TRED1/IMTQL1) to calculate only the eigenvalues of the PD matrix, the cost at order 400 is 18.37% (.1406/.7656) of the real*8 general matrix eispack routine RG, or 12.85% (.1406/1.094) of the more expensive lapack routine DGEEVX. What is surprising is that even at a size of 400 by 400, eispack dominates lapack in terms of time (.7656 vs 1.094) if only eigenvalues are requested. This gain appears to carry through to calculations involving eigenvectors. Since the column LAPACK2A involves calculation of both right and left eigenvectors, the correct column to compare is LAPACK2B. Here eispack time is 74.28% (1.938/2.609) of lapack.
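The gain from tred1/imtql1 comes from exploiting symmetry: a real symmetric matrix has only n(n-1)/2 distinct off-diagonal entries and guaranteed real eigenvalues. The Python sketch below illustrates the idea with a classical Jacobi iteration; it is not the EISPACK algorithm (which uses Householder tridiagonalization followed by implicit QL), just a minimal demonstration, under those stated assumptions, that symmetric structure can be used directly.

```python
import math

def jacobi_eigenvalues(a, max_rot=100, tol=1e-12):
    """Eigenvalues of a small symmetric matrix via classical Jacobi
    rotations.  Each similarity rotation G*A*G' zeroes the largest
    off-diagonal pair; symmetry halves the work relative to a general
    matrix, the kind of saving symmetric-specific routines exploit."""
    a = [row[:] for row in a]          # work on a copy
    n = len(a)
    for _ in range(max_rot):
        # locate the largest off-diagonal element
        p, q = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                   key=lambda ij: abs(a[ij[0]][ij[1]]))
        if abs(a[p][q]) < tol:
            break
        # rotation angle that annihilates a[p][q]
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):             # rows p and q: left-multiply by G
            apk, aqk = a[p][k], a[q][k]
            a[p][k] = c * apk - s * aqk
            a[q][k] = s * apk + c * aqk
        for k in range(n):             # columns p and q: right-multiply by G'
            akp, akq = a[k][p], a[k][q]
            a[k][p] = c * akp - s * akq
            a[k][q] = s * akp + c * akq
    return sorted(a[i][i] for i in range(n))

print(jacobi_eigenvalues([[2.0, 1.0], [1.0, 2.0]]))   # close to [1.0, 3.0]
```

Because every rotation is a similarity transformation, the trace (the eigenvalue sum) is preserved throughout, which gives a quick sanity check on the result.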
For a real*8 general matrix the results are somewhat different. For just eigenvalues, lapack never dominates eispack. However, if eigenvectors are requested, lapack begins to dominate in the range 300 to 350. At 200 and 250 the eispack/lapack times were close to the same (.2344/.2500 and .4844/.5312), while at 300 these became .9688/.9219 as lapack became slightly faster. At 400 the times were 2.578/2.234 and lapack was emerging as the clear winner. The lesson to be drawn is that the bigger the system, the more appropriate it is to use lapack. Note that the option :lapack2, which turns off balancing, has been used. Thus the reported tests are biased in favor of lapack, since eispack balances. It has been found, and noted by others, that balancing can be dangerous, especially for complex matrices. The job speed1.b34 in c:\b34slm\b34stest\ can be used to benchmark these results on different machines.
For a complex*16 matrix the crossover appears between 100 and 150 if only eigenvalues are requested. At 150 lapack is the clear winner with a time of .1094 vs eispack's .1719. By an order 400 matrix, the times were 4.359 and 2.141 respectively, making lapack over 2.0 times faster. If both eigenvalues and eigenvectors are requested for an order 400 matrix, the relative times were 14.47 and 5.312, where lapack is 2.724 times faster. Since most problems are relatively small, these results suggest that for the time being eispack should be the default eigen routine for both real*8 and complex*16 problems. An additional advantage of eispack is that it runs with substantially less work memory. The lapack code was developed to run large matrices and makes use of a block design. This seems to work especially well with complex*16 matrices. For real*16 and complex*32 matrices, lapack is not available and specially modified versions of the eispack routines are used. The option :lapack, which calls DGEEV/ZGEEV, runs into problems on large complex*16 matrices due to balancing. The effect of permutations and scaling can be investigated using the options :lapackp, :lapacks and :lapackb, which turn on and off various options in DGEEVX/ZGEEVX. The above test problem appears to fail for large complex matrices generated using the sample code when these options are turned on. Section 16.6 discusses the ilaenv routine whereby alternative blocksizes can be investigated. It is to be noted that the above times can be influenced by the size of the workspace, which appears to influence memory access speed. The default B34S setting is to let lapack obtain the optimum work space.
16.6 A Preliminary Investigation of Inversion Speed Differences
There are substantial speed differences between various approaches to matrix inversion. The program listed below shows the relative speed of inverting a positive definite matrix of sizes from 50 to 600 using lapack, linpack, an SVD (for the pseudo inverse) and a Cholesky decomposition using linpack.
b34sexec matrix;
* Tests speed of Linpack vs LAPACK vs svd (pinv) vs ;
* Requires a large size ;
call echooff;
icount=0;
n=0;
upper=600;
mesh=50;
top continue;
icount=icount+1;
n=n+mesh;
if(n .gt. upper)go to done;
call print('Doing size ',n:);
x=rn(matrix(n,n:));
x=transpose(x)*x;
ii=matrix(n,n:)+1.;
/$ Use LAPACK LU
call compress;
call timer(base1);
xinv1=inv(x:gmat);
call timer(base2);
error1(icount)=sum(dabs(ii-(xinv1*x)));
/$ Use LINPACK 'Default' LU
call compress;
call timer(base3);
xinv1=inv(x);
call timer(base4);
error2(icount)=sum(dabs(ii-(xinv1*x)));
/$ Use IMSL pinv code
call compress;
call timer(base5);
xinv1=pinv(x);
call timer(base6);
error3(icount)=sum(dabs(ii-(xinv1*x)));
/$ Use Linpack DPOCO / DPODI Code
call compress;
call timer(base7);
xinv1=inv(x :pdmat);
call timer(base8);
error4(icount)=sum(dabs(ii-(xinv1*x)));
size(icount)   =dfloat(n);
lapack(icount) =(base2-base1);
linpack(icount)=(base4-base3);
svdt(icount)   =(base6-base5);
chol(icount)   =(base8-base7);
call free(x,xinv1,ii);
go to top;
done continue;
call tabulate(size,lapack,linpack,svdt,chol,
error1,error2,error3,error4);
call graph(size lapack,linpack
:heading 'Lapack Vs Linpack'
:plottype xyplot);
call graph(size lapack,linpack svdt
:heading 'LAPACK vs Linpack vs SVD'
:plottype xyplot);
b34srun;
Edited output from running the above program on a Dell 650 workstation machine using the
built-in Fortran CPU timer produces:
Obs   SIZE    LAPACK      LINPACK     SVDT        CHOL        ERROR1      ERROR2      ERROR3      ERROR4
  1   50.00   0.000       0.000       0.1562E-01  0.000       0.2129E-08  0.2129E-08  0.4866E-08  0.2845E-08
  2   100.0   0.000       0.1562E-01  0.1562E-01  0.1562E-01  0.1133E-05  0.1109E-05  0.2193E-05  0.1418E-05
  3   150.0   0.1562E-01  0.1562E-01  0.6250E-01  0.000       0.1352E-08  0.1315E-08  0.3145E-08  0.1685E-08
  4   200.0   0.1562E-01  0.1562E-01  0.1406      0.1562E-01  0.1432E-05  0.1425E-05  0.3864E-05  0.1920E-05
  5   250.0   0.3125E-01  0.3125E-01  0.3438      0.1562E-01  0.1982E-08  0.1957E-08  0.4886E-08  0.2428E-08
  6   300.0   0.7812E-01  0.7812E-01  0.7812      0.3125E-01  0.1292E-05  0.1297E-05  0.3265E-05  0.1692E-05
  7   350.0   0.1406      0.1562      1.281       0.7812E-01  0.2907E-07  0.2874E-07  0.6778E-07  0.3556E-07
  8   400.0   0.2031      0.2656      1.906       0.1250      0.5379E-06  0.5386E-06  0.1298E-05  0.6448E-06
  9   450.0   0.2812      0.3750      2.766       0.2031      0.4115E-06  0.4115E-06  0.1069E-05  0.5292E-06
 10   500.0   0.3906      0.8281      5.328       0.4531      0.1325E-06  0.1315E-06  0.3427E-06  0.1671E-06
 11   550.0   0.5469      0.8594      6.562       0.5312      0.1731E-06  0.1731E-06  0.4113E-06  0.2253E-06
 12   600.0   0.8906      1.141       6.797       0.6719      0.6622E-06  0.6546E-06  0.1540E-05  0.7743E-06
The lapack code uses dgetrf/dgecon/dgetri, linpack uses dgeco/dgefa/dgedi and the SVD approach uses the IMSL routine dslgrr, which internally calls the linpack SVD routine dsvdc. The blocksize of the lapack workspace was optimized by a preliminary call to dgetrf, which in turn calls the lapack ilaenv routine. All routines were compiled with the Lahey Fortran LF95 version 7.1 release, except for the IMSL code which was compiled with an earlier release of Lahey. Note that there is a call to compress before each test. If this is removed and placed right before go to top; there will be a noticeable speed difference, especially for the large systems. This is due to the fact that there is unused temporary space in memory and new temp variables are allocated quite a distance away from the x matrix. This "thrashing of memory" slows things down. The matrix command inv( ) defaults to the linpack LU solver since, as this example shows, for matrix sizes of 300 and smaller linpack runs at a speed near that of lapack. For larger systems, lapack runs faster.12 For example, for order 600 the gain is ~ 21.95% ((1.141-.8906)/1.141). If the system were a positive definite matrix, the gain would be ~ 41.11% ((1.141-.6719)/1.141) over the linpack LU solver. The SVD approach is ~ 10.12 times more costly (6.797/.6719) than the Cholesky and ~ 5.96 times (6.797/1.141) more expensive than the linpack LU. For the problems run, the error, calculated as sum |I - XX^-1|, is comparable.

12. The tests shown here use as the default the optimum blocksize for lapack. A later example investigates the effect of blocksize changes on speed. At issue is the fact that the lapack default workspace is quite large. Users with large problems may run out of memory and have to reduce the blocksize of the calculation.
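The ERROR columns in the table are exactly this metric: the summed absolute deviation of inv(X)*X from the identity. The Python sketch below reproduces the metric on a small matrix, using a plain Gauss-Jordan inverse as a stand-in for the linpack/lapack LU inverters (an illustration only; the actual routines are the Fortran libraries named above).

```python
def inverse(a):
    """Gauss-Jordan elimination with partial pivoting on [A | I].
    No rank check: a singular input would hit a zero pivot."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def inv_error(x):
    """sum of |I - inv(X)*X| entries -- the ERROR measure in the table."""
    n = len(x)
    xi = inverse(x)
    prod = [[sum(xi[i][k] * x[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
    return sum(abs((1.0 if i == j else 0.0) - prod[i][j])
               for i in range(n) for j in range(n))

x = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
print(inv_error(x))    # a tiny number, comparable to the table's ERROR columns
```

For a well-conditioned matrix this error sits near machine epsilon regardless of which factorization produced the inverse, which is the point the table makes.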
The inv command uses linpack as the default inverter. Users with large problems involving general matrix systems should use inv(x:gmat) to get the lapack routines. The subroutine gminv uses lapack by default and optionally returns an indicator of whether the matrix is not of full rank. Prior to a call to a lapack routine, the routine ilaenv is called to determine the optimum blocksize, which is then used to set the work space. The B34S user is given the option of altering these defaults, although most users may not have sufficient knowledge to make this choice.
Since many of the matrices in econometrics are positive definite, it is wasteful to calculate inverses with the general matrix LU approach. In addition, since both linpack and lapack can detect whether the matrix is positive definite, by using the Cholesky approach one obtains speed and in addition has a built-in error check against possible problems. While the linpack routines dpoco/dpodi have stood the test of time since the 1970's, the lapack Cholesky routines dpotrf/dpocon/dpotri appear to have speed advantages for larger systems that are even greater than those found for the LU inverters. The following code illustrates what was found:
b34sexec matrix;
* Tests speed of Linpack vs LAPACK ;
* Uses PD matrix;
call echooff;
icount=0;
n=0;
upper=700;
mesh=25;
top continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done;
x=rn(matrix(n,n:));
x=transpose(x)*x;
ii=matrix(n,n:)+1.;
/$ LINPACK PDF
call compress;
call timer(base11);
xinv1=inv(x :pdmat);
call timer(base22);
error(icount)=sumsq(ii-(xinv1*x));
/$ LAPACK PDF
call compress;
call timer(base111);
xinv1=inv(x :pdmat2);
call timer(base222);
error0(icount)=sumsq(ii-(xinv1*x));
/$ LAPACK LU
call compress;
call timer(base1);
xinv1=inv(x:gmat);
call timer(base2);
error1(icount)=sumsq(ii-(xinv1*x));
/$ LINPACK LU
call compress;
call timer(base3);
xinv1=inv(x);
call timer(base4);
error2(icount)=sumsq(ii-(xinv1*x));
size(icount)   = dfloat(n);
pdmat(icount) =(base22-base11);
pdmat2(icount) =(base222-base111);
lapack(icount) =(base2-base1);
linpack(icount)=(base4-base3);
call free(x,xinv1,ii);
go to top;
done continue;
call print('LINPACK Cholesky vs LAPACK Cholesky':);
call tabulate(size,pdmat,pdmat2, error, error0,
lapack,linpack,error1,error2);
call graph(size pdmat,pdmat2,lapack,linpack :nokey
:plottype xyplot);
b34srun;
The above program was run on the Dell 650 machine with dual 3,056 MHz Xeon processors, each with two cores. The operating system was XP Professional and the memory was 4,016 MB.
B34S(r) Matrix Command. Date of Run d/m/y 3/ 7/07. Time of Run h:m:s 12:34: 0.

=>   * TESTS SPEED OF LINPACK VS LAPACK $
=>   * USES PD MATRIX$
=>   CALL ECHOOFF$
LINPACK Cholesky vs LAPACK Cholesky
Obs   SIZE    PDMAT       PDMAT2      ERROR       ERROR0      LAPACK      LINPACK     ERROR1      ERROR2
  1   25.00   0.000       0.000       0.7573E-25  0.3868E-25  0.000       0.000       0.2938E-25  0.2647E-25
  2   50.00   0.000       0.000       0.1331E-23  0.1380E-23  0.000       0.000       0.8757E-24  0.9335E-24
  3   75.00   0.000       0.000       0.3375E-22  0.4386E-22  0.000       0.000       0.2817E-22  0.2816E-22
  4   100.0   0.000       0.000       0.4514E-21  0.9178E-21  0.000       0.000       0.4007E-21  0.4350E-21
  5   125.0   0.000       0.000       0.8331E-15  0.6825E-15  0.000       0.000       0.4070E-15  0.3993E-15
  6   150.0   0.000       0.1562E-01  0.7372E-19  0.9395E-19  0.1562E-01  0.1562E-01  0.7390E-19  0.7555E-19
  7   175.0   0.1562E-01  0.1562E-01  0.2189E-19  0.3052E-19  0.1562E-01  0.000       0.2251E-19  0.2182E-19
  8   200.0   0.1562E-01  0.1562E-01  0.2318E-20  0.2453E-20  0.1562E-01  0.1562E-01  0.1688E-20  0.1794E-20
  9   225.0   0.1562E-01  0.1562E-01  0.3924E-21  0.3721E-21  0.3125E-01  0.1562E-01  0.2295E-21  0.2246E-21
 10   250.0   0.3125E-01  0.1562E-01  0.2808E-19  0.3311E-19  0.4688E-01  0.3125E-01  0.1474E-19  0.1403E-19
 11   275.0   0.3125E-01  0.3125E-01  0.3262E-20  0.3096E-20  0.4688E-01  0.4688E-01  0.1694E-20  0.1689E-20
 12   300.0   0.3125E-01  0.4688E-01  0.1200E-19  0.1277E-19  0.6250E-01  0.9375E-01  0.9001E-20  0.8945E-20
 13   325.0   0.4688E-01  0.6250E-01  0.5676E-19  0.7629E-19  0.9375E-01  0.1250      0.5762E-19  0.5736E-19
 14   350.0   0.7812E-01  0.6250E-01  0.1672E-18  0.1691E-18  0.1406      0.1719      0.1120E-18  0.1095E-18
 15   375.0   0.9375E-01  0.7812E-01  0.1625E-17  0.1808E-17  0.1562      0.2188      0.9533E-18  0.9562E-18
 16   400.0   0.1250      0.9375E-01  0.1240E-18  0.1074E-18  0.2031      0.2812      0.8809E-19  0.8789E-19
 17   425.0   0.1562      0.1250      0.8648E-19  0.9579E-19  0.2344      0.3281      0.5594E-19  0.5497E-19
 18   450.0   0.2031      0.1406      0.3484E-17  0.3320E-17  0.2812      0.4062      0.2044E-17  0.2065E-17
 19   475.0   0.2656      0.1719      0.5813E-17  0.5600E-17  0.3281      0.5312      0.3526E-17  0.3493E-17
 20   500.0   0.3438      0.1875      0.6721E-18  0.6349E-18  0.3906      0.6562      0.4566E-18  0.4522E-18
 21   525.0   0.4375      0.2500      0.2468E-16  0.2307E-16  0.4531      0.7969      0.1327E-16  0.1301E-16
 22   550.0   0.5469      0.2812      0.1045E-17  0.9424E-18  0.5469      0.9531      0.7014E-18  0.6929E-18
 23   575.0   0.6406      0.3438      0.2694E-19  0.2545E-19  0.6094      1.125       0.2018E-19  0.1994E-19
 24   600.0   0.7656      0.3750      0.3614E-14  0.3663E-14  0.7188      1.297       0.2943E-14  0.2858E-14
 25   625.0   0.8750      0.4375      0.3936E-16  0.4080E-16  0.7969      1.516       0.2895E-16  0.2929E-16
 26   650.0   1.016       0.4688      0.3966E-16  0.4584E-16  0.9062      1.672       0.3556E-16  0.3567E-16
 27   675.0   1.141       0.5625      0.1106E-16  0.1359E-16  1.000       1.891       0.9189E-17  0.9295E-17
For matrices above order 325 there are increasing speed gains from using the lapack Cholesky inverter. At 675 the gain was on the order of a 50.70% reduction in cost ((1.141-.5625)/1.141) if lapack was used. An uninformed user who went with a general matrix inverter and selected the linpack LU inverter would have found that costs went up 1.66 times (1.891/1.141) over the linpack Cholesky solver and 3.363 times (1.891/.5625) over what could be obtained with the lapack Cholesky routine.
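The Cholesky factorization behind dpoco/dpodi and dpotrf/dpotri does roughly half the arithmetic of an LU factorization and fails loudly on a matrix that is not positive definite. A minimal Python sketch of both points (an illustration of the algorithm, not the linpack/lapack code):

```python
import math

def cholesky(x):
    """Lower-triangular L with L*L' = x for symmetric positive definite x.
    A non-positive pivot reveals a matrix that is not positive definite --
    the built-in error check mentioned above."""
    n = len(x)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = x[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(pivot)
            else:
                L[i][j] = (x[i][j] - s) / L[j][j]
    return L

print(cholesky([[4.0, 2.0], [2.0, 3.0]]))   # [[2.0, 0.0], [1.0, 1.414...]]

try:                                         # an indefinite matrix is rejected
    cholesky([[1.0, 2.0], [2.0, 1.0]])
except ValueError as e:
    print(e)
```

Once L is available, the inverse (or a solve) costs only two triangular substitutions, which is where the PDMAT/PDMAT2 columns get their advantage over the LU columns.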
A number of researchers have stayed with linpack due to possible accuracy issues involving lapack. These appear to be related to not using the lapack subroutines correctly. For example, the lapack LU factoring routine dgetrf provides a return code INFO that, in their words, is set > 0 if "U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations." Experience tells us that it is dangerous to proceed in near singular cases that are not trapped by INFO and that a call to dgecon to get the condition number is in order. As an example, consider attempting to invert

x=matrix(3,3:1 2 3 4 5 6 7 8 9);

if the condition is not checked or is ignored. The linpack and lapack rcond values of this matrix are 0.20559686E-17 and 1.541976423090495E-18, respectively, and are not different from zero when tested in Fortran as

if((rcond+1.0d+00).eq.1.0d+00)write(6,*)'Matrix is near Singular'

Letting the inverse proceed is very dangerous, as is illustrated with the following MATLAB code:
>> x=[1 2 3;4 5 6; 7 8 9]

x =
     1     2     3
     4     5     6
     7     8     9

>> ix=inv(x)
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 1.541976e-018.

ix =
  -4.5036e+015   9.0072e+015  -4.5036e+015
   9.0072e+015  -1.8014e+016   9.0072e+015
  -4.5036e+015   9.0072e+015  -4.5036e+015

>> ix*x

ans =
     4     0     0
     0     8     0
     4     0     0

where XX^-1 is clearly not equal to I.
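The Fortran test quoted above works because adding a number smaller than machine epsilon to 1.0 leaves 1.0 unchanged in double precision. Both it and the exact singularity of the example matrix can be reproduced in a few lines of Python (an illustration; rcond below is the lapack value reported above, not a value computed here):

```python
# The (rcond + 1.0) == 1.0 near-singularity test from the Fortran snippet.
rcond = 1.541976423090495e-18          # lapack rcond reported for the matrix
near_singular = (rcond + 1.0) == 1.0
print(near_singular)                   # True: the matrix should not be inverted

# The matrix itself is exactly singular: row3 - row2 equals row2 - row1,
# so the rows are linearly dependent and the determinant is zero.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
det = (x[0][0]*(x[1][1]*x[2][2] - x[1][2]*x[2][1])
     - x[0][1]*(x[1][0]*x[2][2] - x[1][2]*x[2][0])
     + x[0][2]*(x[1][0]*x[2][1] - x[1][1]*x[2][0]))
print(det)                             # 0.0
```

The huge entries of order 1E+15 in the MATLAB inverse are roughly 1/rcond times the scale of x, which is exactly what dividing by a pivot of rounding-error size produces.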
If there is concern over the accuracy of an OLS model, the QR approach can be used. Another approach is to use real*16 math, which is possible with the matrix command: the command r8tor16 will convert a matrix prior to a call to inv. Possible accuracy improvements can also be obtained in a matrix inversion using refinement, or refinement plus equalization. These are done using the lapack routines dgesvx/zgesvx at the cost of a reduction in speed. The sample job gminv_4 in matrix.mac addresses these accuracy questions.
b34sexec matrix;
call echooff;
n=100;
* test1 and test3 use LAPACK ;
x=rn(matrix(n,n:));
* to show effect of balancing uncomment next statement;
x(1,)=x(1,)*100000.;
call gminv(x,xinv1,info);
xinv2=inv(x);
xinv3=inv(x:gmat);
j=inv(x,rcond:gmat);
j=inv(x,rcond2);
xinv4=inv(x,rcond3 :refine);
xinv5=inv(x,rcond4 :refinee);
dtest=matrix(n,n:)+1.0;
test1=x*xinv1;
test2=x*xinv2;
test3=x*xinv3;
test4=x*xinv4;
test5=x*xinv5;
if(n.le.5)call print(x ,xinv1 ,xinv2,xinv3 ,test1,test2,test3);
call print('Matrix is of order ',n:);
call print('LAPACK 3 => refine':);
call print('LAPACK 4 => refinee':);
call print('Max   Error for LAPACK 1', dmax(dabs(dtest-test1)):);
call print('Max   Error for LAPACK 2', dmax(dabs(dtest-test3)):);
call print('Max   Error for LAPACK 3', dmax(dabs(dtest-test4)):);
call print('Max   Error for LAPACK 4', dmax(dabs(dtest-test5)):);
call print('Max   Error for LINPACK ', dmax(dabs(dtest-test2)):);
call print('Sum   Error for LAPACK 1', sum(dabs(dtest-test1)):);
call print('Sum   Error for LAPACK 2', sum(dabs(dtest-test3)):);
call print('Sum   Error for LAPACK 3', sum(dabs(dtest-test4)):);
call print('Sum   Error for LAPACK 4', sum(dabs(dtest-test5)):);
call print('Sum   Error for LINPACK ', sum(dabs(dtest-test2)):);
call print('Sumsq Error for LAPACK 1',sumsq(dtest-test1):);
call print('Sumsq Error for LAPACK 2',sumsq(dtest-test3):);
call print('Sumsq Error for LAPACK 3',sumsq(dtest-test4):);
call print('Sumsq Error for LAPACK 4',sumsq(dtest-test5):);
call print('Sumsq Error for LINPACK ',sumsq(dtest-test2):);
call print('rcond rcond2 rcond3,rcond4',rcond,rcond2,rcond3,rcond4);
cx=complex(x,dsqrt(dabs(x)));
call gminv(cx,cxinv1,info);
cxinv2=inv(cx);
cxinv3=inv(cx:gmat);
cxinv4=inv(cx,rcond3 :refine);
cxinv5=inv(cx,rcond4 :refinee);
dc=complex(dtest,0.0);
test1=cx*cxinv1;
test2=cx*cxinv2;
test3=cx*cxinv3;
test4=cx*cxinv4;
test5=cx*cxinv5;
j=inv(x,rcond:gmat);
j=inv(x,rcond2);
if(n.le.5)call print(cx,cxinv1,cxinv2,cxinv3,test1,test2,test3);
call print('Matrix is of order ',n:);
call print('Max   Error for LAPACK 1 real', dmax(dabs(real(dc-test1))):);
call print('Max   Error for LAPACK 2 real', dmax(dabs(real(dc-test3))):);
call print('Max   Error for LAPACK 3 real', dmax(dabs(real(dc-test4))):);
call print('Max   Error for LAPACK 4 real', dmax(dabs(real(dc-test5))):);
call print('Max   Error for LINPACK  real', dmax(dabs(real(dc-test2))):);
call print('Max   Error for LAPACK 1 imag', dmax(dabs(imag(dc-test1))):);
call print('Max   Error for LAPACK 2 imag', dmax(dabs(imag(dc-test3))):);
call print('Max   Error for LAPACK 3 imag', dmax(dabs(imag(dc-test4))):);
call print('Max   Error for LAPACK 4 imag', dmax(dabs(imag(dc-test5))):);
call print('Max   Error for LINPACK  imag', dmax(dabs(imag(dc-test2))):);
call print('Sum   Error for LAPACK 1 real',sum(dabs(real(dc-test1))):);
call print('Sum   Error for LAPACK 2 real',sum(dabs(real(dc-test3))):);
call print('Sum   Error for LAPACK 3 real',sum(dabs(real(dc-test4))):);
call print('Sum   Error for LAPACK 4 real',sum(dabs(real(dc-test5))):);
call print('Sum   Error for LINPACK  real',sum(dabs(real(dc-test2))):);
call print('Sum   Error for LAPACK 1 imag',sum(dabs(imag(dc-test1))):);
call print('Sum   Error for LAPACK 2 imag',sum(dabs(imag(dc-test3))):);
call print('Sum   Error for LAPACK 3 imag',sum(dabs(imag(dc-test4))):);
call print('Sum   Error for LAPACK 4 imag',sum(dabs(imag(dc-test5))):);
call print('Sum   Error for LINPACK  imag',sum(dabs(imag(dc-test2))):);
call print('Sumsq Error for LAPACK 1 real',sumsq(real(dc-test1)):);
call print('Sumsq Error for LAPACK 2 real',sumsq(real(dc-test3)):);
call print('Sumsq Error for LAPACK 3 real',sumsq(real(dc-test4)):);
call print('Sumsq Error for LAPACK 4 real',sumsq(real(dc-test5)):);
call print('Sumsq Error for LINPACK  real',sumsq(real(dc-test2)):);
call print('Sumsq Error for LAPACK 1 imag',sumsq(imag(dc-test1)):);
call print('Sumsq Error for LAPACK 2 imag',sumsq(imag(dc-test3)):);
call print('Sumsq Error for LAPACK 3 imag',sumsq(imag(dc-test4)):);
call print('Sumsq Error for LAPACK 4 imag',sumsq(imag(dc-test5)):);
call print('Sumsq Error for LINPACK  imag',sumsq(imag(dc-test2)):);
call print('rcond rcond2 rcond3,rcond4',rcond,rcond2,rcond3,rcond4);
b34srun;
The test job forms a 100 by 100 matrix of random normal numbers. The job was run with and without multiplying the first row by 100000 to induce possible accuracy problems. The matrices xinv1 and xinv3 were calculated by the lapack LU inverters using the matrix commands gminv and inv respectively. These should be 100% the same and in the output are LAPACK 1 and LAPACK 2. The matrix XINV2 was calculated with the linpack default LU inverter and is shown in the tables as LINPACK, while XINV4 and XINV5 were calculated using refinement and equalization/refinement and are shown as LAPACK 3 and LAPACK 4 respectively. The goal is to see how much of an improvement refinement and equalization/refinement make. The maintained hypothesis is that for a poorly conditioned matrix these added steps should make a difference in accuracy and pay for their added cost.
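The idea behind the :refine option can be sketched in a few lines: solve A x = b, form the residual r = b - A x, solve A d = r with the same factorization, and correct x by d. The Python below illustrates that one refinement step (the concept used by dgesvx, which additionally equilibrates and estimates condition numbers); the badly scaled first row mimics the test job.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting -- a stand-in LU solve."""
    n = len(a)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def refine(a, b, x):
    """One step of iterative refinement: correct x by the residual solve."""
    n = len(a)
    r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    d = solve(a, r)
    return [x[i] + d[i] for i in range(n)]

a = [[1.0e5, 2.0], [3.0, 4.0]]   # badly scaled first row, as in the test job
b = [1.0e5 + 2.0, 7.0]           # chosen so the exact solution is x = (1, 1)
x = refine(a, b, solve(a, b))
print(x)                          # close to [1.0, 1.0]
```

In exact arithmetic the correction d would be zero; in floating point it removes most of the rounding error left by the first solve, which is why the LAPACK 3 and LAPACK 4 error columns below are one to two orders of magnitude smaller than the plain LU columns.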
Real Matrix – Row 1 Adjusted

Matrix Command. Version January 2002.

=>   CALL ECHOOFF$

Matrix is of order        100
LAPACK 3 => refine
LAPACK 4 => refinee
Max   Error for LAPACK 1    7.334165275096893E-09
Max   Error for LAPACK 2    7.334165275096893E-09
Max   Error for LAPACK 3    8.585629984736443E-10
Max   Error for LAPACK 4    7.275957614183426E-10
Max   Error for LINPACK     5.918991519138217E-09
Sum   Error for LAPACK 1    2.031745830718839E-07
Sum   Error for LAPACK 2    2.031745830718839E-07
Sum   Error for LAPACK 3    1.340332846354633E-08
Sum   Error for LAPACK 4    1.186949409438610E-08
Sum   Error for LINPACK     1.520938054071593E-07
Sumsq Error for LAPACK 1    6.346678588451170E-16
Sumsq Error for LAPACK 2    6.346678588451170E-16
Sumsq Error for LAPACK 3    4.170551230676075E-18
Sumsq Error for LAPACK 4    3.291420393045907E-18
Sumsq Error for LINPACK     3.502669224237807E-16
rcond rcond2 rcond3,rcond4
RCOND    =   0.15657795E-07
RCOND2   =   0.32449440E-07
RCOND3   =   0.15657795E-07
RCOND4   =   0.36041204E-04
Complex Case – Row 1 adjusted

Matrix is of order        100
Max   Error for LAPACK 1 real    1.089574652723968E-09
Max   Error for LAPACK 2 real    1.089574652723968E-09
Max   Error for LAPACK 3 real    7.730704965069890E-11
Max   Error for LAPACK 4 real    9.436007530894130E-11
Max   Error for LINPACK  real    8.449205779470503E-10
Max   Error for LAPACK 1 imag    1.143234840128571E-09
Max   Error for LAPACK 2 imag    1.143234840128571E-09
Max   Error for LAPACK 3 imag    1.132320903707296E-10
Max   Error for LAPACK 4 imag    1.371063262922689E-10
Max   Error for LINPACK  imag    9.201812645187601E-10
Sum   Error for LAPACK 1 real    2.892584898682689E-08
Sum   Error for LAPACK 2 real    2.892584898682689E-08
Sum   Error for LAPACK 3 real    2.080174401783049E-09
Sum   Error for LAPACK 4 real    2.380964692715299E-09
Sum   Error for LINPACK  real    1.709247001571822E-08
Sum   Error for LAPACK 1 imag    2.400662145467662E-08
Sum   Error for LAPACK 2 imag    2.400662145467662E-08
Sum   Error for LAPACK 3 imag    2.333172253018985E-09
Sum   Error for LAPACK 4 imag    3.145330114537025E-09
Sum   Error for LINPACK  imag    1.685032463933840E-08
Sumsq Error for LAPACK 1 real    1.305137954525119E-17
Sumsq Error for LAPACK 2 real    1.305137954525119E-17
Sumsq Error for LAPACK 3 real    8.000884036556901E-20
Sumsq Error for LAPACK 4 real    1.005147393906282E-19
Sumsq Error for LINPACK  real    4.711921249005595E-18
Sumsq Error for LAPACK 1 imag    9.729510489806058E-18
Sumsq Error for LAPACK 2 imag    9.729510489806058E-18
Sumsq Error for LAPACK 3 imag    1.030133637684152E-19
Sumsq Error for LAPACK 4 imag    2.061903353378799E-19
Sumsq Error for LINPACK  imag    5.314085939146504E-18
rcond rcond2 rcond3,rcond4
RCOND    =   0.15657795E-07
RCOND2   =   0.32449440E-07
RCOND3   =   0.19119694E-06
RCOND4   =   0.36654208E-03

B34S Matrix Command Ending. Last Command reached.

Space available in allocator      2882597, peak space used    583540
Number variables used                  38, peak number used        38
Number temp variables used            226, # user temp clean        0
The next part of the job does not have the first row multiplied by 100,000. The first section is
for a real*8 matrix.
Matrix Command. Version January 2002.

=>  CALL ECHOOFF$

Matrix is of order        100
LAPACK 3 => refine
LAPACK 4 => refinee

Max   Error for LAPACK 1     1.365574320288943E-13
Max   Error for LAPACK 2     1.365574320288943E-13
Max   Error for LAPACK 3     3.086420008457935E-14
Max   Error for LAPACK 4     3.086420008457935E-14
Max   Error for LINPACK      1.767475055203249E-13
Sum   Error for LAPACK 1     1.473775453617650E-10
Sum   Error for LAPACK 2     1.473775453617650E-10
Sum   Error for LAPACK 3     1.983027804464133E-11
Sum   Error for LAPACK 4     1.983027804464133E-11
Sum   Error for LINPACK      1.427848022651258E-10
Sumsq Error for LAPACK 1     4.135364007296756E-24
Sumsq Error for LAPACK 2     4.135364007296756E-24
Sumsq Error for LAPACK 3     1.074041726812546E-25
Sumsq Error for LAPACK 4     1.074041726812546E-25
Sumsq Error for LINPACK      3.829022654580863E-24

rcond rcond2 rcond3,rcond4
RCOND    =    0.41205738E-04
RCOND2   =    0.84677306E-04
RCOND3   =    0.41205738E-04
RCOND4   =    0.41205738E-04
Complex*16 matrix. Row # 1 not adjusted.

Matrix is of order        100

Max   Error for LAPACK 1 real     1.976196983832779E-14
Max   Error for LAPACK 2 real     1.976196983832779E-14
Max   Error for LAPACK 3 real     3.736594367254042E-15
Max   Error for LAPACK 4 real     3.736594367254042E-15
Max   Error for LINPACK  real     2.207956040223280E-14
Max   Error for LAPACK 1 imag     2.116362640691705E-14
Max   Error for LAPACK 2 imag     2.116362640691705E-14
Max   Error for LAPACK 3 imag     3.580469254416130E-15
Max   Error for LAPACK 4 imag     3.580469254416130E-15
Max   Error for LINPACK  imag     2.059463710679665E-14
Sum   Error for LAPACK 1 real     3.078720134733204E-11
Sum   Error for LAPACK 2 real     3.078720134733204E-11
Sum   Error for LAPACK 3 real     3.024266750114961E-12
Sum   Error for LAPACK 4 real     3.024266750114961E-12
Sum   Error for LINPACK  real     2.831859478359157E-11
Sum   Error for LAPACK 1 imag     3.166186637957452E-11
Sum   Error for LAPACK 2 imag     3.166186637957452E-11
Sum   Error for LAPACK 3 imag     2.939662402390297E-12
Sum   Error for LAPACK 4 imag     2.939662402390297E-12
Sum   Error for LINPACK  imag     2.766293287102470E-11
Sumsq Error for LAPACK 1 real     1.652403344027760E-25
Sumsq Error for LAPACK 2 real     1.652403344027760E-25
Sumsq Error for LAPACK 3 real     1.862322216364066E-27
Sumsq Error for LAPACK 4 real     1.862322216364066E-27
Sumsq Error for LINPACK  real     1.447918924517758E-25
Sumsq Error for LAPACK 1 imag     1.770763319588756E-25
Sumsq Error for LAPACK 2 imag     1.770763319588756E-25
Sumsq Error for LAPACK 3 imag     1.727395868198397E-27
Sumsq Error for LAPACK 4 imag     1.727395868198397E-27
Sumsq Error for LINPACK  imag     1.354019631664373E-25

rcond rcond2 rcond3,rcond4
RCOND    =    0.41205738E-04
RCOND2   =    0.84677306E-04
RCOND3   =    0.34681902E-03
RCOND4   =    0.34681902E-03

B34S Matrix Command Ending. Last Command reached.
Space available in allocator      2882597, peak space used     593517
Number variables used                  38, peak number used         39
Number temp variables used            221, # user temp clean         0
In the real case with an adjustment to the first row, the linpack max error and sum of squared
error of 5.9189E-09 and 3.503E-16 were slightly less than the comparable LAPACK 1 and
LAPACK 2 values of 7.3341E-09 and 6.347E-16 respectively. For the refinement cases
(LAPACK 3 and LAPACK 4) the max errors were 8.586E-10 and 7.276E-10 respectively. The
sums of squared errors for refinement and equalization/refinement were 4.171E-18 and 3.291E-18
respectively and should be compared to the lapack and linpack sum of squared values of
6.347E-16 and 3.503E-16, indicating that these adjustments make a real difference.

Looking at the matrix where row 1 was not adjusted, we see a lapack max error of
1.366E-13, a linpack max error of 1.767E-13 and the two refinement cases getting the same
3.086E-14 since equalization of the matrix did not have to be done. Sums of squared errors for
lapack, linpack and the refinement cases were 4.135E-24, 3.829E-24 and 1.0740E-25
respectively. Here refinement makes a small but detectable difference, and accuracy is better
than in the first, more difficult case. Users are invited to experiment with this job and try the
real*16 add and real*16 multiply adjustments to the blas routines ddot and dsum, which provide
a way to increase accuracy without moving all storage and calculation to real*16.
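The gain from carrying the blas accumulations in extended precision is easy to demonstrate. The sketch below is illustrative only; it uses Python's compensated summation (math.fsum) as a stand-in for a real*16 accumulator inside a ddot-style dot product:

```python
import math

def ddot_naive(x, y):
    """Dot product accumulated in working precision, like a plain real*8 ddot."""
    s = 0.0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

def ddot_accurate(x, y):
    """Same dot product, but the accumulation is carried in effectively
    higher precision (math.fsum tracks all partial-sum round-off)."""
    return math.fsum(xi * yi for xi, yi in zip(x, y))

# Terms of very different size, as in a poorly scaled matrix row
x = [1.0e16, 1.0, -1.0e16]
y = [1.0, 1.0, 1.0]

print(ddot_naive(x, y))     # round-off swallows the middle term: 0.0
print(ddot_accurate(x, y))  # recovers the exact answer: 1.0
```

The naive accumulation loses the small term entirely because it falls below the rounding unit of the large partial sum, exactly the mechanism that hurts accuracy when matrix rows differ greatly in scale.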
Results for the complex*16 case are listed and follow the same pattern in showing
a gain for refinement, especially on the real side using the sum of squared error criterion. For the
adjusted case the linpack routines slightly outperform the lapack code using the max error
criterion. For example, the max error for the real part of the matrix when inverted by lapack was
1.089574E-9, which is larger than the linpack value of 8.4492057E-10. For the un-adjusted case
the pattern reverses, with the corresponding values being 2.207956E-14 for linpack and
1.9761968E-14 for lapack. The same pattern is observed for the imag part of the matrix.

The natural question to ask concerns the relative cost of refinement and/or equalization,
which were shown to improve the inverse calculation, especially in problem matrix cases. The
relative speed of various inversion strategies is investigated in the test job INVSPEED in
matrix.mac, which was run with matrices of order 200 to 600 in increments of 100 with the
times recorded. The code run is:
b34sexec matrix;
* By setting n to different values we test and compare inverse speed;
call echooff;
do n=200,600,100;
x=rec(matrix(n,n:));pdx=transpose(x)*x;
dd= matrix(n,n:)+1.;
cdd=complex(dd,0.0);
nn=namelist(math,inv,gmat,smat,pdmat,pdmat2,refine,refinee);
cpdx=complex(x,mfam(dsqrt(x)));
scpdx=transpose(cpdx)*cpdx;
cpdx=dconj(transpose(cpdx))*cpdx;
if(n.le.5)call print(pdx,cpdx,scpdx,eig(pdx),eig(cpdx),eig(scpdx));
call compress;
/; call print('Using LINPACK DGECO/DGEDI - ZGECO/ZGEDI':);
call timer(base1);
xinv=(1.0/pdx);
call timer(base2);
/; call print('Inverse using (1.0/pdx) took',(base2-base1):);
realm(1)=base2-base1;
error1(1)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=(complex(1.0,0.)/cpdx);
call timer(base2);
/; call print('Inverse using (1.0/cpdx) took',(base2-base1):);
complexm(1)=base2-base1;
error2a(1)=sumsq(real((cpdx*cinv)-cdd));
error2b(1)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
call timer(base1);
xinv=inv(pdx);
call timer(base2);
/; call print('Inverse using inv(pdx) took',(base2-base1):);
realm(2)=base2-base1;
error1(2)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx);
call timer(base2);
/; call print('Inverse using inv(cpdx) took',(base2-base1):);
complexm(2)=base2-base1;
error2a(2)=sumsq(real((cpdx*cinv)-cdd));
error2b(2)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
/; call print('Using LAPACK ':);
call timer(base1);
xinv=inv(pdx:GMAT);
call timer(base2);
/; call print('Inverse using inv(pdx:GMAT) took',(base2-base1):);
realm(3)=base2-base1;
error1(3)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:GMAT);
call timer(base2);
/; call print('Inverse using inv(cpdx:GMAT) took',(base2-base1):);
complexm(3)=base2-base1;
error2a(3)=sumsq(real((cpdx*cinv)-cdd));
error2b(3)=sumsq(imag((cpdx*cinv)-cdd));
Chapter 16
call compress;
/; call print('Using LINPACK':);
call timer(base1);
xinv=inv(pdx:SMAT);
call timer(base2);
/; call print('Inverse using inv(pdx:SMAT) took',(base2-base1):);
realm(4)=base2-base1;
error1(4)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(scpdx:SMAT);
call timer(base2);
/; call print('Inverse using inv(scpdx:SMAT) took',(base2-base1):);
complexm(4)=base2-base1;
error2a(4)=sumsq(real((scpdx*cinv)-cdd));
error2b(4)=sumsq(imag((scpdx*cinv)-cdd));
call compress;
/; call print('Using LINPACK':);
call timer(base1);
xinv=inv(pdx:PDMAT);
call timer(base2);
/; call print('Inverse using inv(pdx:PDMAT) took',(base2-base1):);
realm(5)=base2-base1;
error1(5)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:PDMAT);
call timer(base2);
/; call print('Inverse using inv(cpdx:PDMAT) took',(base2-base1):);
complexm(5)=base2-base1;
error2a(5)=sumsq(real((cpdx*cinv)-cdd));
error2b(5)=sumsq(imag((cpdx*cinv)-cdd));
/; call compress;
/; call print('Using LAPACK':);
call timer(base1);
xinv=inv(pdx:PDMAT2);
call timer(base2);
/; call print('Inverse using inv(pdx:PDMAT2) took',(base2-base1):);
realm(6)=base2-base1;
error1(6)=sumsq((pdx*xinv)-dd);
/; call compress;
call timer(base1);
cinv=inv(cpdx:PDMAT2);
call timer(base2);
/; call print('Inverse using inv(cpdx:PDMAT2) took',(base2-base1):);
complexm(6)=base2-base1;
error2a(6)=sumsq(real((cpdx*cinv)-cdd));
error2b(6)=sumsq(imag((cpdx*cinv)-cdd));
/; call print('Using LAPACK':);
call timer(base1);
xinv=inv(pdx:REFINE);
call timer(base2);
/; call print('Inverse using inv(pdx:REFINE) took',(base2-base1):);
realm(7)=base2-base1;
error1(7)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:REFINE);
call timer(base2);
/; call print('Inverse using inv(cpdx:REFINE) took',(base2-base1):);
101
102
Matrix Command Language
complexm(7)=base2-base1;
error2a(7)=sumsq(real((cpdx*cinv)-cdd));
error2b(7)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
/; call print('Using LAPACK':);
call timer(base1);
xinv=inv(pdx:REFINEE);
call timer(base2);
/; call print('Inverse using inv(pdx:REFINEE) took',(base2-base1):);
realm(8)=base2-base1;
error1(8)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:REFINEE);
call timer(base2);
/; call print('Inverse using inv(cpdx:REFINEE) took',(base2-base1):);
complexm(8)=base2-base1;
error2a(8)=sumsq(real((cpdx*cinv)-cdd));
error2b(8)=sumsq(imag((cpdx*cinv)-cdd));
/; call print('Error2a and error2b = real and imag Complex*16 error':);
call print(' ':);
call print('Matrix Order',n:);
call tabulate(nn,realm,error1,complexm,error2a,error2b);
call compress;
enddo;
The columns REALM and COMPLEXM refer to the times for real and complex matrices. For
real matrices, ERROR1 is the sum of the squared elements of (X*X^-1 - I). For complex
matrices ERROR2A and ERROR2B refer to the corresponding sums for the real and imaginary
parts of the complex X matrix. The general solvers, symmetric matrix solvers and positive
definite solvers were all applied to the same positive definite matrix. SMAT and PDMAT refer
to the linpack symmetric and Cholesky inverters, while PDMAT2 refers to the lapack Cholesky
inverters. The results for the Dell Latitude computer were:
B34S(r) Matrix Command. d/m/y  4/ 7/07. h:m:s 14:12: 6.

=>  * BY SETTING N TO DIFFERENT VALUES WE TEST AND COMPARE INVERSE SPEED$
=>  CALL ECHOOFF$
Matrix Order    200

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.4006E-01  0.2575E-17  0.1803      0.2944E-17  0.1246E-17
  2  INV       0.4006E-01  0.2575E-17  0.1702      0.2944E-17  0.1246E-17
  3  GMAT      0.6009E-01  0.3765E-17  0.1702      0.1686E-17  0.1233E-17
  4  SMAT      0.3004E-01  0.6412E-18  0.9013E-01  0.1409E-18  0.9842E-19
  5  PDMAT     0.4006E-01  0.1755E-17  0.8011E-01  0.1063E-17  0.7556E-18
  6  PDMAT2    0.4006E-01  0.9240E-18  0.1001      0.1542E-17  0.1124E-17
  7  REFINE    0.5107      0.5322E-18  1.652       0.4179E-18  0.2918E-18
  8  REFINEE   0.5207      0.5322E-18  1.642       0.4179E-18  0.2918E-18

Matrix Order    300

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.2103      0.4420E-17  0.9514      0.2733E-16  0.1806E-16
  2  INV       0.2203      0.4420E-17  0.9013      0.2733E-16  0.1806E-16
  3  GMAT      0.2003      0.6948E-17  0.6509      0.2860E-16  0.1147E-16
  4  SMAT      0.9013E-01  0.1317E-17  0.3205      0.1013E-17  0.8467E-18
  5  PDMAT     0.8012E-01  0.4051E-17  0.3405      0.1892E-16  0.1285E-16
  6  PDMAT2    0.1102      0.1367E-16  0.3305      0.9589E-17  0.8387E-17
  7  REFINE    2.444       0.1249E-17  5.858       0.4138E-17  0.4494E-17
  8  REFINEE   2.343       0.1249E-17  5.868       0.4138E-17  0.4494E-17

Matrix Order    400

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.8412      0.8782E-16  2.614       0.7911E-16  0.7384E-16
  2  INV       0.9313      0.8782E-16  2.534       0.7911E-16  0.7384E-16
  3  GMAT      0.5808      0.1007E-15  1.562       0.1187E-15  0.5403E-16
  4  SMAT      0.2804      0.8933E-17  1.001       0.2147E-17  0.2602E-17
  5  PDMAT     0.2704      0.8864E-16  1.102       0.4047E-16  0.3070E-16
  6  PDMAT2    0.2403      0.5656E-16  0.7511      0.4240E-16  0.2621E-16
  7  REFINE    6.179       0.7450E-17  14.08       0.1410E-16  0.1249E-16
  8  REFINEE   5.999       0.7450E-17  14.41       0.1410E-16  0.1249E-16

Matrix Order    500

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      1.933       0.9835E-12  5.939       0.3184E-14  0.2082E-14
  2  INV       1.963       0.9835E-12  5.528       0.3184E-14  0.2082E-14
  3  GMAT      1.182       0.1467E-11  3.315       0.4225E-14  0.4437E-14
  4  SMAT      0.6910      0.1207E-12  2.153       0.2204E-16  0.2372E-16
  5  PDMAT     0.6810      0.3561E-12  2.874       0.4918E-14  0.3120E-14
  6  PDMAT2    0.5308      0.8006E-12  1.602       0.2881E-14  0.3356E-14
  7  REFINE    10.53       0.1149E-12  27.72       0.6522E-15  0.3650E-15
  8  REFINEE   10.85       0.1149E-12  27.45       0.6522E-15  0.3650E-15

Matrix Order    600

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      4.016       0.1054E-14  10.76       0.1935E-14  0.1553E-14
  2  INV       4.126       0.1054E-14  10.52       0.1935E-14  0.1553E-14
  3  GMAT      2.143       0.1307E-14  5.688       0.2383E-14  0.1676E-14
  4  SMAT      1.422       0.1689E-15  3.946       0.5304E-16  0.4960E-16
  5  PDMAT     1.913       0.5407E-15  5.718       0.1455E-14  0.1471E-14
  6  PDMAT2    0.9814      0.5722E-15  2.904       0.1774E-14  0.1299E-14
  7  REFINE    25.91       0.1457E-15  48.27       0.2620E-15  0.2328E-15
  8  REFINEE   26.41       0.1457E-15  48.26       0.2620E-15  0.2328E-15
For the Dell Workstation 650 the following was obtained:
B34S(r) Matrix Command. d/m/y  5/ 7/07. h:m:s 11:14: 1.

=>  * BY SETTING N TO DIFFERENT VALUES WE TEST AND COMPARE INVERSE SPEED$
=>  CALL ECHOOFF$

Matrix Order    200

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.1562E-01  0.2575E-17  0.7812E-01  0.2944E-17  0.1246E-17
  2  INV       0.1562E-01  0.2575E-17  0.6250E-01  0.2944E-17  0.1246E-17
  3  GMAT      0.3125E-01  0.3765E-17  0.7812E-01  0.1686E-17  0.1233E-17
  4  SMAT      0.1562E-01  0.6412E-18  0.3125E-01  0.1409E-18  0.9842E-19
  5  PDMAT     0.000       0.1755E-17  0.3125E-01  0.1063E-17  0.7556E-18
  6  PDMAT2    0.1562E-01  0.9240E-18  0.3125E-01  0.1542E-17  0.1124E-17
  7  REFINE    0.1875      0.5322E-18  0.5469      0.4179E-18  0.2918E-18
  8  REFINEE   0.1875      0.5322E-18  0.5312      0.4179E-18  0.2918E-18

Matrix Order    300

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.9375E-01  0.4420E-17  0.2656      0.2733E-16  0.1806E-16
  2  INV       0.9375E-01  0.4420E-17  0.2812      0.2733E-16  0.1806E-16
  3  GMAT      0.7812E-01  0.6948E-17  0.1875      0.2860E-16  0.1147E-16
  4  SMAT      0.3125E-01  0.1317E-17  0.1250      0.1013E-17  0.8467E-18
  5  PDMAT     0.4688E-01  0.4051E-17  0.1250      0.1892E-16  0.1285E-16
  6  PDMAT2    0.3125E-01  0.1367E-16  0.1094      0.9589E-17  0.8387E-17
  7  REFINE    0.7969      0.1249E-17  1.875       0.4138E-17  0.4494E-17
  8  REFINEE   0.7969      0.1249E-17  1.859       0.4138E-17  0.4494E-17

Matrix Order    400

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.2656      0.8782E-16  0.7344      0.7911E-16  0.7384E-16
  2  INV       0.2344      0.8782E-16  0.7031      0.7911E-16  0.7384E-16
  3  GMAT      0.1719      0.1007E-15  0.4844      0.1187E-15  0.5403E-16
  4  SMAT      0.1094      0.8933E-17  0.3125      0.2147E-17  0.2602E-17
  5  PDMAT     0.1250      0.8864E-16  0.4219      0.4047E-16  0.3070E-16
  6  PDMAT2    0.7812E-01  0.5656E-16  0.2656      0.4240E-16  0.2621E-16
  7  REFINE    2.047       0.7450E-17  4.375       0.1410E-16  0.1249E-16
  8  REFINEE   2.016       0.7450E-17  4.359       0.1410E-16  0.1249E-16

Matrix Order    500

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      0.5625      0.9835E-12  1.484       0.3184E-14  0.2082E-14
  2  INV       0.5781      0.9835E-12  1.438       0.3184E-14  0.2082E-14
  3  GMAT      0.3750      0.1467E-11  0.9531      0.4225E-14  0.4437E-14
  4  SMAT      0.2656      0.1207E-12  0.6406      0.2204E-16  0.2372E-16
  5  PDMAT     0.3281      0.3561E-12  0.8438      0.4918E-14  0.3120E-14
  6  PDMAT2    0.2344      0.8006E-12  0.5781      0.2881E-14  0.3356E-14
  7  REFINE    3.484       0.1149E-12  8.562       0.6522E-15  0.3650E-15
  8  REFINEE   3.531       0.1149E-12  8.406       0.6522E-15  0.3650E-15

Matrix Order    600

Obs  NN        REALM       ERROR1      COMPLEXM    ERROR2A     ERROR2B
  1  MATH      1.141       0.1054E-14  2.547       0.1935E-14  0.1553E-14
  2  INV       1.109       0.1054E-14  2.516       0.1935E-14  0.1553E-14
  3  GMAT      0.6719      0.1307E-14  1.688       0.2383E-14  0.1676E-14
  4  SMAT      0.4844      0.1689E-15  1.109       0.5304E-16  0.4960E-16
  5  PDMAT     0.6875      0.5407E-15  1.500       0.1455E-14  0.1471E-14
  6  PDMAT2    0.3906      0.5722E-15  0.9531      0.1774E-14  0.1299E-14
  7  REFINE    8.453       0.1457E-15  14.56       0.2620E-15  0.2328E-15
  8  REFINEE   8.516       0.1457E-15  15.14       0.2620E-15  0.2328E-15
MATH refers to using the form invx=1./x; while INV uses invx=inv(x);. Since
both call the same linpack routines, they should and do run the same. GMAT uses lapack, while
SMAT and PDMAT use the linpack routines for symmetric (DSICO, DSIFA and DSIDI) and
positive definite (DPOCO, DPOFA and DPODI) matrices respectively. The errors are sums of
squared errors, where ERROR2A and ERROR2B refer to the real and imaginary parts of the
complex matrix. Here, as expected, the refine and refinee options are superior, although at
great cost, as shown in Table 16.1.
Table 16.1 Relative Cost of Equalization & Refinement of a General Matrix

                        Dell Latitude                  Dell 650 Workstation
Order             LU   Equal_Refine    Cost        LU   Equal_Refine    Cost

Real Matrix
 200         0.06009         0.5207   7.665   0.03125         0.1875   5.000
 300         0.203           2.343   10.542   0.07812         0.7969   9.201
 400         0.5808          5.999    9.329   0.1719          2.016   10.728
 500         1.182          10.85     8.179   0.375           3.531    8.416
 600         2.143          26.41    11.324   0.6719          8.516   11.675
Mean Cost                             9.408                            9.004

Complex Matrix
 200         0.1702          1.642    8.647   0.1702          0.642    2.772
 300         0.6509          5.868    8.015   0.6509          5.868    8.015
 400         1.562          14.41     8.225   1.562          14.41     8.225
 500         3.315          27.45     7.281   3.315          27.45     7.281
 600         5.688          48.26     7.485   5.688          48.26     7.485
Mean Cost                             7.931                            6.756
The cost of equalization and refinement is around 9 times more for real matrices and 7-8
times more for complex matrices. The calculations in Table 16.1 were made with positive definite
matrices since in many econometric applications the matrix involved is positive definite. While
the Dell Latitude was substantially slower, the relative cost of equalization and refinement was
relatively stable across machines. For matrices of order 300 or larger on the Dell Workstation, lapack
was faster. For the Dell Latitude the crossover point was at matrices of order 400 or larger. For
the Cholesky inverters at order 600, the gain from lapack over linpack was around 2 (1.949 = 1.913/.9814)
for the Dell Latitude and 1.8 (1.76 = .6875/.3906) for the Dell Workstation.
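The Cost column in Table 16.1 is consistent with the extra-time multiple (Equal_Refine/LU) - 1; that reading is an inference from the reported numbers, sketched here for the Dell Latitude real-matrix rows:

```python
# Reproduces the Cost column of Table 16.1 (Dell Latitude, real matrices),
# assuming Cost = Equal_Refine/LU - 1, i.e. the extra time relative to a
# plain LU inverse. Timings are the REALM values from the INVSPEED output.
rows = {200: (0.06009, 0.5207),
        300: (0.203,   2.343),
        400: (0.5808,  5.999),
        500: (1.182,  10.85),
        600: (2.143,  26.41)}

costs = {n: refine / lu - 1.0 for n, (lu, refine) in rows.items()}
for n, c in sorted(costs.items()):
    print(n, round(c, 3))

mean_cost = sum(costs.values()) / len(costs)
print('mean', round(mean_cost, 3))  # matches the 9.408 reported in Table 16.1
```

The per-order values and the mean agree with the table to the printed three decimals, which supports the interpretation of Cost as the added-time multiple.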
The above tests were performed using the lapack default blocksize as calculated by
the routine ilaenv. The job listed below (LAPACK_2 in matrix.mac) investigates the gains from
alternate blocksizes using a Dell 650 running a 3.04 GHz chip and a Dell Latitude running a 1.0 GHz
chip.
/$ Blocksize tests
b34sexec matrix;
call echooff;
isize=12;
Mat_ord =array(isize:);
linpack =array(isize:);
lapack1 =array(isize:);
lapack4 =array(isize:);
lapack7 =array(isize:);
lapack10 =array(isize:);
lapack13 =array(isize:);
lapack16 =array(isize:);
lapack19 =array(isize:);
lapackd =array(isize:);
j=0;
do i=1,19,3;
n=64;
top continue;
j=j+1;
if(n.gt.768)go to endit;
/; call print('Order of Matrix ',n:);
mat_ord(j)=n;
x=rec(matrix(n,n:));
/; set blocksize for lapack
/; LINPACK need only to be run one time
call lapack(1,i);
if(i.eq.1)then;
call timer(t1);
xx=inv(x);
call timer(t2);
/; call print('LINPACK time ',t2-t1:);
linpack(j)=t2-t1;
call compress;
endif;
call timer(t1);
xx=inv(x:gmat);
call timer(t2);
/; call print('LAPACK  time ',t2-t1:);
if(i.eq.1)lapack1(j)=t2-t1;
if(i.eq.4)lapack4(j)=t2-t1;
if(i.eq.7)lapack7(j)=t2-t1;
if(i.eq.10)lapack10(j)=t2-t1;
if(i.eq.13)lapack13(j)=t2-t1;
if(i.eq.16)lapack16(j)=t2-t1;
if(i.eq.19)lapack19(j)=t2-t1;
call compress;
if(i.eq.1)then;
call lapack(:reset);
call timer(t1);
xx=inv(x:gmat);
call timer(t2);
/; call print('LAPACK  Defaults ',t2-t1:);
lapackd(j)=t2-t1;
call compress;
endif;
n=n+64;
go to top;
endit continue;
j=0;
enddo;
call print(' ':);
call print('Effects on Relative Speed of LAPACK blocksize':);
call tabulate(mat_ord,linpack,lapack1,lapack4,lapack7,
lapack10,lapack13,lapack16,lapack19,
lapackd);
b34srun;
Matrices of order 64, 128, ..., 768 were generated using the rectangular IMSL random number
generator. Since these numbers are in the range 0.0-1.0, the large values possible from a
random normal generator are not observed and there is less likelihood of a matrix being a
problem to invert. The job was run with 12,000,000 real*8 words of workspace. The variables
LAPACKi refer to a lapack inversion where the blocksize was set to i. LAPACKD uses the default
recommended blocksize as calculated by the lapack routine ilaenv. Edited output from this job
is listed next for both the Dell Latitude and Dell Workstation to control for possible chip-related
differences in addition to chip speed.
B34S(r) Matrix Command. d/m/y  4/ 7/07. h:m:s 16:17:50.
Run with Dell Latitude

=>  CALL ECHOOFF$

Effects on Relative Speed of LAPACK blocksize

Obs MAT_ORD  LINPACK     LAPACK1     LAPACK4     LAPACK7     LAPACK10    LAPACK13    LAPACK16    LAPACK19    LAPACKD
  1      64  0.000       0.000       0.000       0.1002E-01  0.000       0.1001E-01  0.000       0.000       0.000
  2     128  0.2003E-01  0.2003E-01  0.2003E-01  0.2003E-01  0.2003E-01  0.2003E-01  0.2003E-01  0.2002E-01  0.1001E-01
  3     192  0.3004E-01  0.5007E-01  0.5007E-01  0.6009E-01  0.5007E-01  0.6009E-01  0.6009E-01  0.6009E-01  0.5007E-01
  4     256  0.9013E-01  0.1302      0.1202      0.1402      0.1402      0.1302      0.1302      0.1302      0.1402
  5     320  0.3205      0.3505      0.3104      0.2904      0.3004      0.3004      0.3104      0.2904      0.2804
  6     384  0.7110      0.8212      0.6209      0.6009      0.5708      0.5708      0.6209      0.5508      0.5107
  7     448  1.312       1.472       1.062       0.9914      0.9614      0.9614      0.9714      0.9113      0.8813
  8     512  2.854       2.604       1.722       1.612       1.552       1.482       1.622       1.472       1.392
  9     576  3.435       3.725       2.363       2.223       2.313       2.163       2.273       2.063       2.063
 10     640  5.147       4.977       3.355       3.104       3.205       2.914       2.874       2.914       2.664
 11     704  7.160       6.710       4.737       4.116       4.006       3.936       3.805       3.805       3.605
 12     768  10.30       8.883       5.838       5.348       5.217       5.067       5.047       5.207       4.677
Run on Dell Precision Workstation 650

B34S(r) Matrix Command. d/m/y  5/ 7/07. h:m:s 11:30:52.

=>  CALL ECHOOFF$

Effects on Relative Speed of LAPACK blocksize

Obs MAT_ORD  LINPACK     LAPACK1     LAPACK4     LAPACK7     LAPACK10    LAPACK13    LAPACK16    LAPACK19    LAPACKD
  1      64  0.000       0.000       0.1562E-01  0.000       0.000       0.000       0.1562E-01  0.000       0.000
  2     128  0.1562E-01  0.000       0.000       0.1562E-01  0.000       0.1562E-01  0.000       0.1562E-01  0.1562E-01
  3     192  0.1562E-01  0.1562E-01  0.1562E-01  0.1562E-01  0.3125E-01  0.1562E-01  0.1562E-01  0.1562E-01  0.3125E-01
  4     256  0.3125E-01  0.4688E-01  0.3125E-01  0.4688E-01  0.4688E-01  0.4688E-01  0.4688E-01  0.4688E-01  0.4688E-01
  5     320  0.1094      0.1250      0.1094      0.1094      0.1094      0.9375E-01  0.1094      0.9375E-01  0.9375E-01
  6     384  0.2344      0.2656      0.2188      0.2031      0.1875      0.2031      0.1875      0.1875      0.1875
  7     448  0.3750      0.4375      0.3281      0.3281      0.3125      0.2969      0.2969      0.3125      0.2812
  8     512  0.7500      0.7344      0.5469      0.5156      0.5000      0.5156      0.5000      0.4844      0.4531
  9     576  0.9844      0.9844      0.7500      0.7031      0.7031      0.7031      0.6719      0.6562      0.6250
 10     640  1.422       1.375       1.031       0.9688      0.9688      0.9219      0.9375      0.9062      0.8906
 11     704  1.891       1.828       1.359       1.297       1.266       1.234       1.250       1.219       1.156
 12     768  2.516       2.328       1.781       1.656       1.656       1.594       1.609       1.578       1.516
The findings indicate that with the default blocksize lapack is faster than linpack for matrices of
order greater than 320. The blocksize=1 lapack never beats linpack until size 512, when the times
were .7344 vs .7500 and 2.604 vs 2.854 on the workstation and Latitude respectively. On the
workstation at size 768, linpack was running at 2.516, while lapack was running at 2.328 with
blocksize 1 or 1.516 with the default blocksize. For a general matrix the lapack defaults suggest a
blocksize of 64, with a minimum blocksize of 2. The crossover point is set at 128. The above data
suggest that the linpack/lapack crossover for the workstation using the default blocksize is between
256 and 320. Note that at 256 the ratio was 3.125E-2/4.6875E-2, while at 320 that ratio tipped in
lapack's favor to .09375/.1094. To investigate whether a blocksize > 1 but far less than 64 would be
of benefit, in the above job we set i = 4, 7, 10, 13, 16 and 19 and repeated the calculations. For a
matrix of size 768 and i=4 the lapack time fell to 1.781 from 2.328. For i=7 the time was
marginally better, falling to 1.656, which is close to the default setting's 1.516. The above
experiment outlines the gains from blocksize adjustment but also suggests that if changes are
made to increase the blocksize to only a modest 4 and thereby save space, there still are substantial
gains. For matrices in the usual range (under order 256) the space-saving linpack code will
always beat lapack. These findings were not altered when the test job was run on the Dell
Latitude computer and suggest that although for many applications the linpack code is a better
choice, the B34S user is given the capability to modify this choice. Readers are invited to rerun
these examples on their own systems since the results may be chip sensitive. The brief speed and
accuracy tests reported above highlight the fact that selecting just the right inverter can make a
substantial difference. For most users running lapack it is best to use the default settings,
although the usual "one size fits all" approach may carry with it substantial hidden costs.

The discussion of the refinement capability has the hidden assumption that the calculation
remains in real*8. Another assumption is that the blas code is not modified to increase accuracy.
In the next section these assumptions are relaxed as calculations are made in real*16 and VPA,
and, when real*8 routines are used, the blas routines are enhanced to give more accuracy.
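The payoff from a well-chosen blocksize comes from doing more arithmetic per memory access. The sketch below is illustrative only and is not the lapack code: a blocked matrix multiply visits cache-sized sub-blocks yet produces exactly the answer of the naive triple loop.

```python
# Illustrative only -- not the lapack algorithm itself. Blocked algorithms
# reorganize the same arithmetic over nb-by-nb sub-blocks so operands stay
# in cache; the result is identical to the naive triple loop.
def matmul_naive(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matmul_blocked(a, b, nb):
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, nb):          # block rows of a
        for kk in range(0, n, nb):      # inner block dimension
            for jj in range(0, n, nb):  # block columns of b
                for i in range(ii, min(ii + nb, n)):
                    for k in range(kk, min(kk + nb, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + nb, n)):
                            c[i][j] += aik * b[k][j]
    return c

# Same answer for any blocksize; only the memory-access pattern changes.
a = [[float(i * 7 + j) for j in range(6)] for i in range(6)]
b = [[float(i - j) for j in range(6)] for i in range(6)]
print(matmul_blocked(a, b, 2) == matmul_naive(a, b))
```

Because the inner summation order over k is unchanged, the blocked version is bit-for-bit identical to the naive one here; on real hardware the blocked access pattern is what a larger lapack blocksize exploits.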
16.7 Variable Precision Math
Many software systems allow real*4 data storage but move the data to real*8 to make a
calculation. In many cases the resulting accuracy is not the same as what would be obtained with
a direct read into real*8. The B34S matrix command supports real*4, real*8, real*16,
complex*16, complex*32, integer*4 and integer*8 data types to facilitate research into the
impact of data storage accuracy on the calculation. In addition, the variable precision subroutine
library developed by Smith (1991) has been implemented to give accuracy to better than 1700
digits. The use of this code is discussed next. It is not enough just to calculate in real*8; the
precision at which the data were initially read makes a major difference, even in simple problems.
A simple example from Stokes (2005) involving 2.00 / 4.11 will illustrate the problems of
precision.
str=>vpa      .4866180048661800486618004866180048661800486618004866M+0
r*8=>vpa      .48661800486618001080454992664677M+0
r*8=>r*16     .48661800486618001080454992664677E+00
str=>r*16     .48661800486618004866180048661800E+00
r*8=>r*8      .48661800486618000000000000000000E+00
r*4=>r*4      .48661798200000000000000000000000E+00
The line str=>vpa lists the exact answer obtained when the data (2.00 and 4.11) are read from a
string into a variable precision arithmetic (VPA) routine, while the line r*8=>vpa shows what
happens to accuracy when the data are first read into real*8, or double precision, and then moved to a
VPA datatype. The line r*8=>r*16 shows what occurs when the data are first read into real*8,
then converted to real*16 before making the calculation. In this case the results are the same as
what is obtained with r*8=>vpa but are inferior to the line str=>r*16, where the data are read
directly into real*16. The lines r*8=>r*8 and r*4=>r*4 show what can be expected using the
usual double precision and single precision math, respectively. The importance of this simple
example is that it can be used to disentangle the effect of data storage precision and data
calculation precision in a very simple problem where each can be isolated. When there are many
calculations needed to solve a problem (inverting a 100 by 100 matrix by elimination involves a
third of a million operations), round-off error can mount, especially when numbers differ in size.
Strang (1976, page 32) notes "if floating-point numbers are added, and their exponents c differ
say by two, then the last two digits in the smaller number will be more or less lost..." Real*4, or
single precision, on IEEE machines has a range of 1.18*10^-38 to 3.40*10^38. This gives a precision
of 7-8 digits at best. Real*8, or double precision, has a range of 2.23*10^-308 to 1.79*10^308 and at
best gives a precision of 15-16 digits. Real*16 has a range of 10^-4931 to 10^4932 and gives up to 32
digits of precision. VPA, or variable precision arithmetic, allows variable precision calculations.
To measure the effects of data precision and calculation method on accuracy requires a
number of different test data sets. The first problem attempted was the StRD (Rogers-Filliben-
Gill-Guthrie-Lagergren-Vangel 1998) Filippelli data set, which contains 82 observations on a
polynomial model of the form

     y = b0 + sum(i=1,10) b(i)*x^i + e

where x ranges from -3.13200249 to 8.781464495 and x^10 ranges from 90,828.258 to
2,726,901,792.451598. Answers to 15 digits are supplied by StRD. Table 16.2 reports 15
experiments involving various ways to estimate the model. The linpack Cholesky routines and
general matrix routines detect rank problems and will not solve the problem if the data are not
converted to real*16. The QR approach obtains an average LRE13 of 7.306, 7.415 and 8.368 on
the coefficients, SE and residual sum of squares. The exact numbers obtained are listed in Table
16.3. If the suggested accuracy improvements for the BLAS routines are enabled, these LRE
numbers jump to 8.118, 8.098 and 9.803, respectively. Note that both accuracy improvements
result in the same gain. Experiments # 4 and # 5 first copy the data that have been read into
real*8 into a real*16 variable and attempt estimation with a Cholesky and a QR approach. The
LRE's are the same for both approaches (7.925, 8.708, 8.167). This experiment shows the effect
of calculation precision and at first would lead one to believe that there is little gain obtained
using real*16 calculation except for the fact that the Cholesky condition is not seen as 0.0.
However, this interpretation would be premature without checking for data storage precision
effects (i.e., at what precision the data were initially read), which we do below.
Experiments 6-12 test various combinations of calculation precision and routine selection. In Experiment # 6 we use the linpack SVD routines on real*8 data. The results are poor (LRE numbers of 2.195, 2.132 and 4.039).¹⁴ When the accuracy improvements are enabled (experiments 7 and 8), there is a slight loss of accuracy on the coefficients to 1.901 but a slight gain on the SE to 2.431. However, when the real*8 data are copied to real*16 in experiment 9, the SVD LRE numbers jump to 7.924, 8.708 and 8.167, respectively, which are similar to what was found in experiments 4 and 5 and clearly show the effect of calculation precision conditional on reading the data into real*8 before moving them to real*16. These results are similar to those in the real*16 Cholesky experiment 4 and the real*16 QR experiment 5.
Experiments 10-12 study the effect of using lapack's SVD routine in place of linpack's. For experiment 10, the coefficient LRE jumps to 7.490, which is quite good and in fact beats the QR LRE reported for experiment 1. This value is far better than the linpack LRE of 2.195.¹⁵

¹³ LRE is the Log Relative Error as discussed by McCullough (1999). Assume x is the value obtained and c is the correct value. Then LRE = -log₁₀|(x - c)/c|.

¹⁴ The author has used the linpack code since 1979. These results were not expected and seem to be related to the extreme values in the X matrix in the Filippelli data. When real*16 is used, accuracy of the linpack SVD routine improves.
However, the LRE of the SE is poor at 1.910, which is less than the linpack value of 2.132. The LRE of e'e of 1.606 is also less than the linpack LRE of 3.258. Since the SE requires knowledge of (X'X)⁻¹, calculated as (X'X)⁻¹ = VΣ⁻²V', extreme values along the diagonal of Σ may be causing errors when forming Σ⁻². However, this possibility does not explain the poor performance of the residual sum of squares LRE of 1.606.¹⁶ The reason may be related to the fact that the data set has such high x¹⁰ values that minor coefficient differences will result in substantial changes in the relative residual sum of squares.
Experiments 13-15 first load the data in real*16 and proceed to the same routines as used for experiments 4-6. Here we see LRE numbers of 14.68, 14.99 and 15.00 for the Cholesky experiment and 14.79, 14.96 and 15.00 for the QR experiment, which is the same as the SVD calculated with linpack. These are close to perfect answers. Table 16.3 lists the coefficients obtained for experiment 1, which used real*8 data, while Table 16.4 lists the exact coefficients obtained for the QR using data read directly into real*16. Experiments 13-15 show the gain from reading the Filippelli data set in real*16. Since all three experiments produced similar LRE values, it suggests that if the data are read with enough precision, the results are less sensitive to the estimation method. This finding has important implications for data base design. The next task is to study less extreme (stiff) data sets and observe the results.
¹⁵ McCullough (1999, 2000) used lapack QR and SVD routines to estimate the coefficients of the Filippelli data, finding that "QR generally returns more accurate digits than SVD." The LRE values found were 7.4 and 6.3, respectively. For S-PLUS he found 8.4 and 5.8, respectively, where the underlying routines were not known.

¹⁶ The sum of squares was tested against the published value of 0.795851382172941E-03. The lapack SVD routine obtained 0.8155689538070673E-03.
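The data-base precision point can be illustrated with Python's decimal module standing in for real*16 (an analogy only): promoting a value that was first stored in double precision does not recover digits that were never captured when the data were read.

```python
from decimal import Decimal

promoted = Decimal(0.1)    # promoted from a binary double: carries its rounding error
direct   = Decimal("0.1")  # "read" directly at extended precision
print(promoted == direct)  # False: promotion cannot restore lost digits
```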
Table 16.2 LRE for Various Approaches to an OLS Model of the Filippelli Data
_____________________________________________________________
Various options of real*8 data

Experiment  TYPE        COEF    SE      RSS LRE
 1          QR          7.306   7.415   8.368
 2          ACC_1       8.118   8.098   9.803
 3          ACC_2       8.118   8.098   9.803
 4          R16_CHOL    7.924   8.708   8.167
 5          R16_QR      7.924   8.708   8.167
 6          SVD         2.195   2.132   4.039
 7          SVD_ACC1    1.901   2.431   3.258
 8          SVD_ACC2    1.901   2.431   3.258
 9          SVD_R16     7.924   8.708   8.167
10          SVD_LAPK    7.490   1.910   1.606
11          SVD2ACC1    7.490   1.910   1.606
12          SVD2ACC2    7.490   1.910   1.606

Various Options using Data read directly in real*16

13          R16_CHOL   14.68   14.99   15.00
14          R16_QR     14.79   14.96   15.00
15          R16_SVD    14.79   14.96   15.00
___________________________________________________________
Experiments 4, 5 and 9 involve reading data first into real*8 and then converting the data to real*16. Experiments 1-3, 6-8 and 10-12 involve real*8 data. Experiments 13-15 use data read directly into real*16. Experiments 2, 7 and 11 enhance the BLAS routines by accumulating real*8 data using the IMSL routines DQADD and DQMULT. Experiments 3, 8 and 12 accumulate real*8 BLAS calculations using real*16 data as outlined in Stokes (2005). See Chapter 10 for a detailed discussion of the methods used. The coefficients obtained for experiments # 1 and 14 are listed in Tables 16.3 and 16.4.
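The accumulator enhancements used in experiments 2-3, 7-8 and 11-12 keep intermediate sums at higher precision before rounding the final result. Python's math.fsum, which maintains exact partial sums, shows the flavor of the idea (an analogy, not the IMSL DQADD/DQMULT mechanism):

```python
import math

vals = [0.1] * 10
print(sum(vals) == 1.0)        # False: naive double-precision accumulation drifts
print(math.fsum(vals) == 1.0)  # True: exact accumulation recovers the sum
```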
Table 16.3 Coefficients and SE Estimated Using QR Models of the Real*8 Filippelli Data
__________________________________________________________________
          Test Value                Value Obtained             LRE
Coef  1   -2772.179591933420        -2772.179723094652         7.33
Coef  2   -2316.371081608930        -2316.371192269638         7.32
Coef  3   -1127.973940983720        -1127.973995395338         7.32
Coef  4   -354.4782337033490        -354.4782509735776         7.31
Coef  5   -75.12420173937571        -75.12420543777237         7.31
Coef  6   -10.87531803553430        -10.87531857690271         7.30
Coef  7   -1.062214985889470        -1.062215039398714         7.30
Coef  8   -0.6701911545934081E-01   -0.6701911887876555E-01    7.29
Coef  9   -0.2467810782754790E-02   -0.2467810910390330E-02    7.29
Coef 10   -0.4029625250804040E-04   -0.4029625462234867E-04    7.28
Coef 11   -1467.489614229800        -1467.489683023960         7.33

LRE of coefficients:  Mean     7.306448565286121
                      Variance 2.587670394878226E-04
                      Minimum  7.280096349919187
                      Maximum  7.329023461850447

SE    1   559.7798867059487         559.7798654749500          7.42
SE    2   466.4775900975754         466.4775721277960          7.41
SE    3   227.2042833290517         227.2042744777510          7.41
SE    4   71.64786889794284         71.64786608759270          7.41
SE    5   15.28971847592676         15.28971787474000          7.41
SE    6   2.236911685945726         2.236911598160330          7.41
SE    7   0.2216243305780890        0.2216243219342270         7.41
SE    8   0.1423637686503493E-01    0.1423637631547240E-01     7.41
SE    9   0.5356174292732132E-03    0.5356174088898210E-03     7.42
SE   10   0.8966328708850490E-05    0.8966328373738681E-05     7.43
SE   11   298.0845420801842         298.0845309955370          7.43

LRE of SE:  Mean     7.414701487211084
            Variance 7.386168559949404E-05
            Minimum  7.405390067654106
            Maximum  7.429617565744895

Residual sum of squares:
RSS       0.7958513821729410E-03    0.7958513787598208E-03     8.37
________________________________________________________________
Test values are reported on the left-hand side. LRE = log relative error. The coefficients report experiment # 1 from Table 16.2. The same linpack QR routine was modified to run for real*16 data; results for that experiment are shown in Table 16.4.
Table 16.4 Coefficients estimated with QR using Real*16 Filippelli Data

Coefficients Using QR on Data Loaded into Real*16                       LRE
 1.  -2772.1795919334239280284475535721                                 14.85
 2.  -2316.3710816089307588219679140978                                 15.00
 3.  -1127.9739409837156985716700141998                                 14.42
 4.  -354.47823370334877161073848496470                                 15.00
 5.  -75.124201739375713890522075522684                                 15.00
 6.  -10.875318035534251085281081177145                                 14.35
 7.  -1.0622149858894676645966112202356                                 14.66
 8.  -0.67019115459340837592673412281191E-01                            15.00
 9.  -0.24678107827547865084085445245647E-02                            14.85
10.  -0.40296252508040367129713154870917E-04                            15.00
11.  -1467.4896142297958822878485135961                                 14.55

LRE Mean     14.788490320266543980835382276091684
LRE Variance 6.3569618908829012635712782954099325E-0002
LRE Minimum  14.347002403969724322813759016211991
LRE Maximum  15.000000000000000000000000000000000

SE Using QR on Data Loaded into Real*16                                 LRE
 1.  559.77986547494987457477254797527                                  15.00
 2.  466.47757212779645269310982974610                                  15.00
 3.  227.20427447775131062939817526228                                  14.86
 4.  71.647866087592737261665720850718                                  15.00
 5.  15.289717874740006503075678978592                                  15.00
 6.  2.2369115981603327555186234039771                                  14.91
 7.  0.22162432193422740206612983379340                                 14.74
 8.  0.14236376315472394891823309147959E-01                             15.00
 9.  0.53561740888982093625865193118466E-03                             15.00
10.  0.89663283737386822210041526987951E-05                             15.00
11.  298.08453099553698520055234224439                                  15.00

LRE Mean     14.955903576675283545444986642213045
LRE Variance 7.2174779096858864768608287934814669E-0003
LRE Minimum  14.741319930323011772906976043000329
LRE Maximum  15.000000000000000000000000000000000

Residual sum of squares
0.79585138217294058848463068814293E-03                                  15.00

LRE = log relative error. This is experiment # 14 from Table 16.2 but uses the linpack QR routine modified by Stokes (2005) to run with real*16 data. For this experiment the data were read directly into real*16.
The Box-Jenkins (1976) Gas Furnace data have been widely studied and modeled and are close in difficulty to what is found in many applied time series models. While "correct" 15-digit agreed-upon answers are not available, it is possible to study the effect on the residual sum of squares of the 11 approaches reported in Table 16.5.¹⁷ Since OLS minimizes the sum of squared errors, a "better" answer is one with a smaller e'e. Using this criterion, the linpack general matrix solver DGECO, Experiment 3, is "best," followed closely by the lapack general matrix solver, Experiment 4, and the linpack SVD routine, Experiment 10. Experiments 5 and 6 use the lapack general matrix solver that allows refinement and, in the case of Experiment 6, refinement and equilibration. These approaches did not do as well in determining a minimum e'e and were substantially more expensive in terms of computer time. Of interest is why Experiment 1 and Experiment 8 did not produce the same answer, since both used the linpack Cholesky routines. The answer relates to the way the coefficients are calculated. In the former case the Cholesky R is used to obtain the coefficients without explicitly forming (X'X)⁻¹, using the linpack routine DPOSL, while in the latter case (X'X)⁻¹ is formed from R using DPODI. In general, the answers are very close for this exercise.

¹⁷ Since this data set does not have the rank problems found with the Filippelli data, it is possible to attempt a number of alternative procedures. Not all these procedures should be used.
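The two solution paths just contrasted — DPOSL-style solving from the Cholesky factor R versus DPODI-style forming (X'X)⁻¹ from R — can be sketched in pure Python on a hypothetical 2 by 2 system (this shows the algebra only, not the linpack code):

```python
def cholesky2(a):
    # Upper-triangular R with a = R'R for a 2 by 2 SPD matrix
    r11 = a[0][0] ** 0.5
    r12 = a[0][1] / r11
    r22 = (a[1][1] - r12 * r12) ** 0.5
    return [[r11, r12], [0.0, r22]]

a = [[4.0, 2.0], [2.0, 3.0]]   # plays the role of X'X
c = [10.0, 8.0]                # plays the role of X'y
r = cholesky2(a)

# (a) DPOSL-style: forward solve R'z = c, then back solve R b = z
z1 = c[0] / r[0][0]
z2 = (c[1] - r[0][1] * z1) / r[1][1]
b2 = z2 / r[1][1]
b1 = (z1 - r[0][1] * b2) / r[0][0]

# (b) DPODI-style: form inv(a) explicitly (2 by 2 formula), then multiply
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
ib1 = ( a[1][1] * c[0] - a[0][1] * c[1]) / det
ib2 = (-a[1][0] * c[0] + a[0][0] * c[1]) / det

print(round(b1, 12), round(b2, 12))    # both routes give b = (1.75, 1.5)
print(round(ib1, 12), round(ib2, 12))
```

Both routes agree to rounding here; the text's point is that they can differ in the last digits on harder problems.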
Table 16.5 Residual Sum of Squares on a VAR model of Order 6 – Gas Furnace Data
________________________________________________________________
Residual Sum of Squares for various methods

 1. OLSQ using LINPACK Cholesky – solving from R      16.13858295915815
 2. OLSQ using LINPACK QR                             16.13858295915821
 3. OLSQ using LINPACK DGECO                          16.13858295915803
 4. OLSQ using LAPACK DGETRF-DGECON-DGETRI            16.13858295915806
 5. OLSQ using LAPACK DGESVX                          16.13858295935751
 6. OLSQ using LAPACK DGESVX with equilibration       16.13858295963500
 7. OLSQ using LAPACK DPOTRF-DPOCON-DPOTRI            16.13858295915812
 8. OLSQ using LINPACK DPOCO-DPODI                    16.13858295915811
 9. OLSQ using LINPACK DSICO-DSIDI                    16.13858295915814
10. OLSQ using SVD LINPACK                            16.13858295915808
11. OLSQ using SVD LAPACK                             16.13858295915810
_________________________________________________________________
Model estimated was gasout=f(gasout{1 to 6}, gasin{1 to 6}). Data from Box-Jenkins [3]. Data studied in Stokes [32]. Experiment 1 solves for β using the Cholesky R directly. Experiments 3-9 form (X'X)⁻¹.
The StRD Pontius data are classified as of a lower level of difficulty, although more challenging than the gas furnace data studied in the prior section. The Pontius data consist of 40 observations of a model of the form y = β₀ + β₁x + β₂x², which is almost a perfect fit. The eigenvalues of (X'X), as calculated by the eispack routine RG, were 0.8109E+13, 0.7317E+27 and 3.613, giving a condition estimate that tripped the condition tolerance in the linpack LU and Cholesky routines for both real*8 and real*4 data. Calculations were "forced" by ignoring this check.¹⁸ Results are reported for a number of experiments in Table 16.6 that vary precision, method of calculation and degree of Fortran optimization for real*4 data. The base method was the QR for real*8 data, which gives a LRE of 13.54 for β. When enhanced accuracy was enabled, the LRE for the SE and e'e increased slightly from 12.39 to 12.51 and 12.09 to 12.21, respectively, in experiments 1 and 2. The linpack SVD produced LRE values of 13.92, 13.92 and 13.53 for the coefficients, the SE and e'e, respectively, while for lapack these were 13.48, 12.74 and 12.93, respectively, in Experiments 3 and 4. Here, using accuracy as a criterion, linpack edged lapack. Since in the Filippelli data set the reverse was found, there appears to be no "best" SVD routine for all cases. In addition to accuracy, there are other aspects of the selection process that include relative speed of execution (tested in Table 16.7 and found to be a function of the size of the problem and computer chip) and memory requirements, which are not tested here since they are published.¹⁹

¹⁸ The same data were estimated in Windows RATS (Doan (1992)) version 6.0. While the reported coefficients agreed with the benchmark for 11, 11 and 14 digits, respectively, RATS unexpectedly produced a SE of 0.0 and a t of 0.0 for the β₂ term. The "certified" coefficients and standard errors are:

β₀ =  0.673565789473684E-03    SE = 0.107938612033077E-03
β₁ =  0.732059160401003E-06    SE = 0.157817399981659E-09
β₂ = -0.316081871345029E-14    SE = 0.486652849992036E-16

which produce a t for β₂ of -64.95, not zero.
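The RATS anomaly reported in footnote 18 is easy to check: the certified β₂ and its certified standard error imply a t-statistic far from zero.

```python
b2  = -0.316081871345029e-14   # certified Pontius coefficient for the x^2 term
se2 =  0.486652849992036e-16   # certified standard error
print(round(b2 / se2, 2))      # -64.95, not zero
```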
Experiments 5-8 show forced linpack LU and Cholesky models for real*8 data. In Experiments 7-8, added accuracy in the accumulators was enabled. Slight accuracy gains were observed, especially in the RSS calculation, where the LRE jumped from 12.77 and 12.73 to 13.23 and 13.39, respectively. What is interesting is that in this case, even though the condition of (X'X) was large, the LU and Cholesky approaches were able to get reasonable answers. The linpack condition check appears to be conservative, since in the usual case the software would not attempt the solution of this problem.
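The size of the condition estimate behind these forced calculations follows from the eigenvalues of (X'X) quoted above, taking the ratio of the largest to the smallest:

```python
eigs = [0.8109e13, 0.7317e27, 3.613]   # eigenvalues of X'X from the eispack routine RG
cond = max(eigs) / min(eigs)
print(f"{cond:.3e}")                   # about 2e+26, far beyond real*8 resolution
```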
Experiments 9-14 concern real*4 data.²⁰ Again, the QR was found to be most accurate, with scores of 5.36, 6.01 and 5.65 for the coefficients, SE's and RSS, respectively. These runs were made with code compiled by Lahey Fortran version 7.10 running opt=1. When accuracy enhancement was enabled, the LRE for the SE fell from 6.01 to 4.37. This difference was traced to the fact that the BLAS real*4 routine SDOT is optimized to hold data in registers while the higher accuracy routine SDSDOT did not optimize to the same extent. This is shown when the same calculation was done with opt=0: the QR SE accuracy was 4.21 and 4.37 for non-accuracy and accuracy-enabled code, respectively. Higher accuracy was observed for the opt=1 LU-forced run of 5.27 vs 4.80 for opt=0 calculations. Why the forced Cholesky experiment seems to run more accurately at opt=0 than opt=1 (see Experiment 12) is not clear. What seems to be the case is that the level of optimization and its resulting changes in register use makes a detectable difference only with real*4 precision data. A strong case can be made not to use this precision for this or any other econometric problem. When real*8 calculations are used, these knife-edge type differences are not observed.
¹⁹ For lapack, memory was set to the suggested amount from the first call to the routine. Experimentation with alternative lapack memory, possible with the B34S system implementation of lapack, was not attempted for this problem since it was discussed earlier.

²⁰ Data were first read in real*8. Then the B34S routine RND( ) first checked the maximum and minimum allowable real*4 size, using the Fortran functions HUGE( ) and TINY( ). Next, the real*8 data were written to a buffer, using the format g25.16, and re-read into real*4 using the same format. This approach gives a close approximation to having read the data directly into real*4. Use of the Fortran function sngl( ) can be dangerous in that, among other things, range checking is not performed.
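The round-trip described in footnote 20 can be mimicked in Python with struct, which packs to IEEE single precision; the range constants below are the approximate real*4 TINY/HUGE limits, not the B34S code:

```python
import struct

def to_real4(x):
    # Range check analogous to testing against TINY( ) and HUGE( ) for real*4
    if x != 0.0 and not (1.18e-38 <= abs(x) <= 3.40e38):
        raise OverflowError("value out of real*4 range")
    # Round-trip through IEEE single precision, as the buffer re-read does
    return struct.unpack('f', struct.pack('f', x))[0]

print(to_real4(0.1))   # not exactly 0.1: real*4 keeps only about 7 digits
```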
Table 16.6 LRE for Various Estimates of Coef, SE and RSS of Pontius Data
_______________________________________________________________________
Real*8 Data
  #   Method             COEF     SE     RSS
  1.  QR                13.54   12.39   12.09
  2.  QR_AC             13.52   12.51   12.21
  3.  SVD-LINPACK       13.92   13.92   13.53
  4.  SVD-LAPACK        13.48   12.74   12.93
  5.  LU-Forced         12.61   13.02   12.77
  6.  Chol-Forced       12.11   13.00   12.73
  7.  LU-Forced_AC      12.77   13.61   13.23
  8.  Chol-Forced_AC    12.17   13.63   13.39

Real*4 Data Optimization = 1
  9.  QR                 5.36    6.01    5.65
 10.  QR_AC              5.36    4.37    4.06
 11.  LU-Forced          3.93    5.27    5.36
 12.  Chol-Forced        3.97    3.36    3.06
 13.  LU-Forced_AC       3.95    5.30    4.78
 14.  Chol-Forced_AC     4.01    3.32    3.02

Real*4 Data Optimization = 0
  9.  QR                 5.36    4.21    3.91
 10.  QR_AC              5.36    4.37    4.06
 11.  LU-Forced          4.31    4.80    4.45
 12.  Chol-Forced        4.48    4.51    4.26
 13.  LU-Forced_AC       3.95    5.30    4.78
 14.  Chol-Forced_AC     4.16    3.79    3.48
_______________________________________________________________________
All data were initially read in real*8. For real*4 results the data were then converted to real*4. Forced means that the LINPACK condition check has been bypassed for testing purposes. All reported LRE values are for the means. All real*4 tests have been done with LINPACK routines. Real*4 accumulators have not been enabled in cases where _AC is not added to the method name.
The Eberhardt data consist of 11 observations of a one-input model y = β₁x. The level of difficulty is rated as average. Results are shown in Table 16.7. Here the Cholesky, the linpack SVD and the lapack SVD all produce identical LRE values of 14.72, 15.00 and 14.91, respectively. For the QR, the coefficient LRE was 14.72 while the SE and residual LRE's were marginally less at 14.40 and 14.05. Here again the methods being considered run very close together.
Table 16.7 LRE for QR, Cholesky, SVD linpack and lapack for Eberhardt Data
_______________________________________________________________________
Method          COEF     SE      RSS
QR              14.72   14.40   14.05
Chol            14.72   15.00   14.91
SVD-LINPACK     14.72   15.00   14.91
SVD-LAPACK      14.72   15.00   14.91
_______________________________________________________________________
All data read in real*8.
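For a one-input model with no intercept, such as the Eberhardt y = β₁x, the OLS estimator collapses to a single ratio. A minimal Python sketch with made-up data (not the Eberhardt values):

```python
# No-intercept OLS: b1 = sum(x*y) / sum(x*x)
x = [1.0, 2.0, 3.0, 4.0]       # hypothetical inputs
y = [2.1, 3.9, 6.2, 7.8]       # hypothetical responses
b1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
rss = sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))
print(round(b1, 4), round(rss, 4))
```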
The above results suggest that in certain problems that have a high degree of multicollinearity, the results are sensitive to the level of precision of the calculation as well as to the method of calculation. A challenging example was the Filippelli polynomial data set discussed earlier. However, that discussion was not complete because the real*16 QR results were only compared to the 15-digit "official" benchmark, and not to a benchmark with more digits. Since real*16 will give more than 15 digits of accuracy, an important final task for the next section is to extend the Filippelli benchmark, using variable precision arithmetic, to benchmark the accuracy of the real*16 results obtained.
The variable precision library developed by Smith [30] was implemented in B34S to extend the Filippelli benchmark and thus fully test the true accuracy of the reported real*16 results. The linpack LU inversion routines DGECO, DGEFA and DGEDI were rewritten to allow variable precision calculations. What was formerly a real*8 variable became a 328-element real*8 vector. Simple statements, such as A=A+B*C, had to be individually coded, using a customized pointer routine, IVPAADD( ), that would address the correct element to pass to a lower-level routine to make the calculation.²¹ A simple example gives some insight into how this is done:
c
c     if (z(k) .ne. 0.0d0) ek = dsign(ek,-z(k))
c
      if(vpa_logic(kindr,
     *   z(ivpaadd(kindr,k,1,k,1)),'ne',
     *   vpa_work(i_zero)) )then
      call vpa_mul(kindr,vpa_work(i_mone),z(ivpaadd(kindr,k,1,k,1)),
     *     vpa_work(iwork(4)))
      call vpa_func_2(kindr,'sign',vpa_work(i_ek),
     *     vpa_work(iwork(4)),
     *     vpa_work(iwork(5)) )
      call vpa_equal(kindr,vpa_work(iwork(5)),vpa_work(i_ek))
      endif
vpa_work( ) is a 328 by 20 work array. The expression z(ivpaadd(kindr,k,1,k,1)) addresses the kth element of Z, which is 328 by k, and compares it to a constant 0.0 saved in vpa_work(i_zero). If these two variables are not equal, then the three calls are executed to solve ek = dsign(ek,-z(k)). The first call forms -z(k) and places it in vpa_work(iwork(4)). The variable vpa_work(i_mone) contains -1.0. Next, the SIGN function is called and the result placed in vpa_work(iwork(5)). Finally a copy is performed. This simple example shows what is involved to "convert" a real*8 program to do
VPA math. The results can be spectacular.²² The test job vpainv is shown next:

/;
/; Shows gains in accuracy of the inverse with vpa
/;
b34sexec matrix;
call echooff;
n=6;
x=rn(matrix(n,n:));
ix=inv(x,rcond8);
r16x=r8tor16(x);
ir16x=inv(r16x,rcond16);
call print('Real*4 tests',sngl(x),inv(sngl(x)),sngl(x)*inv(sngl(x)));
call print('Real*8 tests',x,
           ix,
           x*ix);
call print('Real*16 tests',r16x,ir16x,r16x*ir16x);
vpax=vpa(x);
ivpax=inv(vpax,rcondvpa);
detvpa=%det;
call print(rcond8,rcond16,rcondvpa,det(x),det(r16x),detvpa);
call print('Default accuracy');
call print('VPA Inverse ',vpax,ivpax,vpax*ivpax);
/; call vpaset(:info);
do i=100,1850,100;
call vpaset(:ndigits i);
call vpaset(:jform2 10);
call print('*************************************************':);
vpax=mfam(dsqrt(dabs(vpa(x))));
call vpaset(:jform2 i);
call print('Looking at vpax(2,1) given ndigits was set as ',i:);
call print(vpax(2,1));
ivpax=inv(vpax);
call print('VPAX and Inverse VPAX at high accuracy ',
           vpax,ivpax,vpax*ivpax);
call print('*************************************************':);
enddo;
b34srun;

²¹ Stokes (2005) provides added detail on how this was accomplished.

²² The job vpainv, in paper_86.mac, which is distributed with B34S, illustrates the gains in accuracy for alternative precision settings. Assuming a matrix X, X*inv(X) produces off-diagonal elements in the order of |.1e-1728|, which is far superior to what can be obtained with real*4, real*8 or real*16 calculations, which are also shown in the test problem. The B34S VPA implementation allows these high-accuracy calculations to be mixed with lower precision commands, using real*4, real*8 and real*16, since data can be moved from one precision to another. This allows experimentation concerning how sensitive the results are to accuracy settings.
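As a rough stand-in for the precision sweep in the vpainv job, Python's decimal module can invert a small matrix at increasing working precision; the off-diagonal residual of x·inv(x) shrinks to roughly the chosen number of digits (this mimics the idea only — B34S uses Smith's FM library — and the 2 by 2 entries below are illustrative):

```python
from decimal import Decimal, getcontext

def inv2(a):
    # Closed-form inverse of a 2 by 2 matrix in the current decimal context
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[ a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det,  a[0][0] / det]]

x = [[Decimal("2.05157"), Decimal("-1.32010")],
     [Decimal("1.08325"), Decimal("-1.52445")]]

for prec in (28, 100):
    getcontext().prec = prec
    ix = inv2(x)
    # (1,2) entry of x*inv(x); exactly 0 in exact arithmetic
    off = x[0][0] * ix[0][1] + x[0][1] * ix[1][1]
    print(prec, abs(off) < Decimal(10) ** (5 - prec))
```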
Edited output from running this script is shown next. First the real*4 matrix x is displayed, then its inverse, and then x·x⁻¹. Errors are in the range of |1.e-6| and smaller.
Real*4 tests

X, inv(X) and X*inv(X) are printed as 6 by 6 real*4 matrices. The first
column of X is (2.05157, 1.08325, 0.825589E-01, 1.27773, -1.22596,
0.338526)'. In X*inv(X) the diagonal elements are 1.00000 while the
off-diagonal elements, for example -0.105908E-06 and 0.255151E-07, are
of order |1.e-7| or smaller.
Next the experiment is repeated for real*8 and real*16 versions of the same matrix. Here errors
are in the area of |.1e-15| and smaller and |.1e-33| and smaller. Using the default VPA setting and
the same matrix, these errors become |.1e-62| and smaller.
Real*8 tests

X, IX = inv(X) and X*IX are printed as 6 by 6 real*8 matrices. X holds
the same elements as the real*4 case at full double precision; its
first column is (2.05157, 1.08325, 0.825589E-01, 1.27773, -1.22596,
0.338525)'. In X*IX the diagonal elements are 1.00000 while the
off-diagonal elements, for example 0.693889E-17 and -0.242861E-15, are
of order |.1e-15| or smaller.
Real*16 tests

R16X, IR16X = inv(R16X) and R16X*IR16X are printed as 6 by 6 real*16
matrices with the same elements as the real*8 case. In R16X*IR16X the
diagonal elements are 1.00000 while the off-diagonal elements, for
example 0.255788E-33 and -0.564237E-35, are of order |.1e-33| or
smaller.

RCOND8   = 0.50111667E-01
RCOND16  = 0.5011166670247408E-01
RCONDVPA = 5.01116667024740759246941521228642361326495469435182039839368M-2
det(x)   = 15.503129
det(r16x)= 15.50312907174408
DETVPA   = 1.55031290717440844136019448415020291808052552694282172989314M+1
Default accuracy
VPA Inverse

VPAX, IVPAX = inv(VPAX) and VPAX*IVPAX are printed as 6 by 6 VPA - FM
matrices. VPAX holds the same elements as X in the VPA format (for
example VPAX(1,1) = .205157M+1). In VPAX*IVPAX the diagonal elements
are .100000M+1 while the off-diagonal elements, for example
.281368M-62 and .505466M-63, are of order |.1e-62| or smaller.
Next the VPA number of digits is increased from 100 to 1700+ in steps of 100. Edited results for 100 and 1775 digits show accuracy in the range of |.1e-104| and an astounding |.1e-1784|, which illustrates what is possible with VPA math. A typical element, vpax(2,1), is also shown.
*************************************************
Looking at vpax(2,1) given ndigits was set as      100
1.0407938475538507976540614667166763813116409216491186532751876440441549
90671195410174838727421096493M+0

VPAX and Inverse VPAX at high accuracy

VPAX, IVPAX and VPAX*IVPAX are printed as 6 by 6 VPA - FM matrices,
where VPAX is now mfam(dsqrt(dabs(vpa(x)))); its first column is
(.143233M+1, .104079M+1, .287331M+0, .113037M+1, .110723M+1,
.581829M+0)'. In VPAX*IVPAX the diagonal elements are .100000M+1 while
the off-diagonal elements, for example .679132M-105 and -.229392M-104,
are of order |.1e-104| or smaller.
*************************************************
*************************************************
Note: Precision out of range when calling FMSET.  NPREC = 1800
Nearest valid precision used given ndig = 256
Upper limit on :ndigits = 1775
*************************************************
Looking at vpax(2,1) given ndigits was set as      1800
1.0407938475538507976540614667166763813116409216491186532751876440441549
9067119541017483872742109649284847786731954780618396337243557760905134253
6563520156336803061100714463959556366263662269873821308469464832965854167
0138171860068377342247552386420701260553904819498240561168336065413626401
0250669906640460786770367950409741912099989603137336731422032053417523721
3966297964166071008246385873024394829403802520042514409728651300143976957
2812397779120024875167135613154575066947601426816271130247154879147668923
9394356296009388425802127641937860496756504613699970799418421555804093055
3237120995762235218535766576766661537848740795670861988624303476444175233
7226176225798507544402708599420593094629811345970349094729351139588955728
5119428379300506359255141263770626457625458403838601615539373960461597816
0073349099782610400889436729448523898284270595056658203704057321081365781
8654413565539069157902511584029231833064933298214721741970329932207991290
6527078671610757572078982983114671804609743149945456219122385716278877328
1968071093789651792050131594215163153195866826910747687397553138316978173
9321736744421971655768821520112748477863475786714301867009421997175231617
0363238010614721981137580098809073513520578006575598464157845090062991068
3355234688196773978064825447224414805727326031381881663445151323266575394
3649227970422690469305135088104621928676013036474412367065826577456525169
9851349360204073235768035720102081408651568383935674777433519021016931172
0962506684179407159824884668004581146345625430570260083442728751522255963
6613251768712917596895179532314049464823036176041089662244758723460968285
4191733256205711195695710155099750814377036616922522753789328720253554539
2344474454352662945653575100394159947087765297030779022580375458745250914
46728867302226218905058287716250997200000000000000M+0
VPAX and Inverse VPAX at high accuracy

VPAX  = Matrix of 6 by 6 elements VPA - FM
IVPAX = Matrix of 6 by 6 elements VPA - FM

(The column-by-column listings of the VPAX and IVPAX elements are garbled in this printout and are not reproduced here.)

The product VPAX * IVPAX recovers the identity to working precision, with off-diagonal elements on the order of 10**-1784:

          1               2               3               4               5               6
 1   .100000M+1     -.168696M-1785   .000000M 0     -.200000M-1784   .196614M-1784  -.200000M-1784
 2   .266236M-1784   .100000M+1      .143811M-1784  -.500000M-1784   .297167M-1784  -.177795M-1784
 3   .105965M-1784  -.129815M-1785   .100000M+1     -.103406M-1784  -.856883M-1785  -.529863M-1785
 4  -.489586M-1786  -.602335M-1785  -.793529M-1785   .100000M+1      .113342M-1784  -.942817M-1785
 5  -.866758M-1785  -.612497M-1785  -.136012M-1784   .000000M 0      .100000M+1     -.160420M-1787
 6   .000000M 0     -.503337M-1785   .100000M-1784  -.100000M-1784   .110410M-1784   .100000M+1
*************************************************
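The exact-identity check printed above can be mimicked in miniature outside of B34S. The following Python sketch (an illustration written for this chapter discussion, not B34S code) inverts a small matrix with the standard library fractions module; because rational arithmetic is exact, the product of the matrix and its inverse is exactly the identity, with no 10**-1784 residue at all.

```python
from fractions import Fraction

def invert(a):
    """Gauss-Jordan inverse of a square matrix of Fractions (exact)."""
    n = len(a)
    # Augment with the identity matrix.
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting, as a production routine would do.
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[piv][col] == 0:
            raise ValueError("singular matrix")
        aug[col], aug[piv] = aug[piv], aug[col]
        scale = Fraction(1) / aug[col][col]
        aug[col] = [x * scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a = [[Fraction(4), Fraction(7)], [Fraction(2), Fraction(6)]]
ia = invert(a)
identity = [[Fraction(int(i == j)) for j in range(2)] for i in range(2)]
prod = matmul(a, ia)
assert prod == identity   # exactly the identity, no residue at all
```

Rational arithmetic, like VPA, trades CPU time for precision; for fixed-size problems such as this inverse check the cost is negligible.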
Table 16.8 shows the Filippelli Data set benchmark, an extended printout of the QR
real*16 results and the expanded Filippelli benchmark calculated with VPA data to 40 digits. A
ruler listed at the top of the table is designed to assist the reader in determining at which digit there is a
difference. Consider coefficient #1. The VPA beta agrees with the real*16 QR beta up to the
28th digit, which is far beyond the 15th digit, which was all that was listed for the "benchmark"
shown again in Table 16.8. The VPA experiment documents that the real*16 calculation
is in fact substantially more accurate than the best real*8 QR, which produced, on average, 7
digits, as reported in Table 16.3. Recall that the "converted" real*16 results (data converted from
real*8 to real*16), reported in Table 16.2 Experiment 5, had an only marginally better LRE of
7.924 than the real*8 QR results, for which the LRE was 7.31. Although in Tables 16.2 & 16.3 it
was reported that the "true" real*16 QR results (data loaded directly into real*16) had an LRE
value of 14.79, once we had the VPA benchmark for 40 digits it was apparent that the LRE
was substantially larger. Even calculations of the 10th and 11th coefficients, when compared with
the VPA data, produced 27 digits of accuracy. It should be remembered that these impressive
results for real*16 are due to both the accuracy of the calculation and the fact that the data was
directly read into real*16, not converted from real*8 to real*16. As we have shown, the data
base precision makes a real difference in addition to the precision of the calculation. The
important implication is that the inherent precision of the calculation method will be of no help, and
in fact may give misleadingly "accurate" results, unless the data is read with sufficient
precision.23
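This point can be demonstrated in a few lines of Python (a sketch; the coefficient value is illustrative, not one of the benchmark coefficients): round-tripping a "Filippelli-sized" value through IEEE single precision (real*4) destroys all but roughly 7 digits, and no later promotion to double precision can bring them back.

```python
import struct
from math import log10

# A Filippelli-sized coefficient; the value is illustrative only.
beta = -2772.179591933423928

# Round-trip through IEEE single precision (real*4), as happens when a
# data bank stores real*4 and the value is later promoted to real*8.
beta_r4 = struct.unpack('f', struct.pack('f', beta))[0]

# Log relative error (LRE) of the promoted value against the original:
# roughly 7-8 digits survive the real*4 store; the rest are gone for good.
lre = -log10(abs(beta_r4 - beta) / abs(beta))
print(round(lre, 1))
```

The LRE of about 7 mirrors the roughly 24-bit (7 decimal digit) mantissa of real*4; the same round trip in real*8 would preserve about 15-16 digits.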
Some of the key lessons of this paper are listed in Table 16.9. The main finding is the
accuracy tradeoff between the precision of the data and the calculation method used. In all cases,
it is important to check for rank problems before proceeding with a calculation. The lower the
precision of the data, the more appropriate it is to consider higher accuracy solution methods
such as the QR and the SVD approach.24 Saving data in real*4 precision, and more importantly
calculating in real*4 precision, was found to be associated with substantial loss of accuracy, even
with less stiff data sets.
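A digit-by-digit comparison of the kind the ruler in Table 16.8 supports can also be automated. The short Python helper below (written for this illustration, not part of B34S) counts the leading mantissa digits on which two printed numbers agree; applied to the coefficient #1 beta values from Table 16.8, it reports 27 matching digits, the two values first differing at digit 28.

```python
def matching_digits(a: str, b: str) -> int:
    """Count how many leading mantissa digits two printed numbers share,
    the comparison the digit 'ruler' in Table 16.8 is meant to ease."""
    def mantissa(s):
        digits = [c for c in s if c.isdigit()]
        while digits and digits[0] == '0':   # drop the '0' of a '-0.' prefix
            digits.pop(0)
        return digits
    n = 0
    for x, y in zip(mantissa(a), mantissa(b)):
        if x != y:
            break
        n += 1
    return n

vpa = "-.2772179591933423928028447556649596044434M+4"
r16 = "-0.2772179591933423928028447553572108500000E+04"
print(matching_digits(vpa, r16))   # 27: the values first differ at digit 28
```

The helper ignores signs and exponent markers, which is adequate for comparing two printouts of the same quantity but not a general-purpose comparison.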
23
To completely isolate the VPA results from data reading issues, the loading of data into the VPA array
proceeded as follows. The real*16 data was printed to a character*1 array using an e50.32 format. Next, the VPA string input
routine was used to convert this character*1 array into a VPA variable. This way both the real*16 and the VPA results
were using the same data. Experiments were also conducted by reading the data in character form directly into the
VPA routines. For this problem both methods of data input into VPA made no difference since there were relatively
few digits. In results not reported but available in paper_86.mac, the Filippelli problem was "extended" by
adding x^11, ..., x^20 to the right-hand side to make the problem more difficult (stiff). Both the VPA and the native
real*16 experiments were run and both successfully solved the problem, suggesting "reserve" capability to handle a
substantially stiffer problem.
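The character-based loading described in this footnote can be imitated with Python's standard decimal module (an analogy only; B34S uses its own VPA string-input routine): constructing a high-precision value from a character string is exact, while constructing it from a binary double inherits the double's representation error.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40   # mimic a 40-digit VPA setting for later arithmetic

# Constructing from a string is exact, as with character input to VPA:
via_string = Decimal("0.1")

# Constructing from a double first bakes in binary representation error:
via_real8 = Decimal(0.1)   # the exact value of the nearest real*8 to 0.1

print(via_string)  # 0.1
print(via_real8)   # 0.1000000000000000055511151231257827021181583404541015625
assert via_string != via_real8
```

The long printout of via_real8 is exactly the error the footnote's e50.32 character route avoids: once 0.1 has been stored as a real*8, no amount of added precision recovers the decimal value.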
24
While the main thrust of the paper has been to show the effect of various factors on the number of "correct" digits
of a calculation, in applied econometric work an important consideration is how many digits to report. If the
government data is known only to k digits, many researchers argue that only k digits of accuracy should be reported.
In many situations this is appropriate, although such a practice makes it difficult to assess the underlying accuracy
of the calculation routines used in the software system. Clearly, if variables such as ŷ or ê are to be calculated, all
estimated digits should be used to insure that ∑ ê = 0 etc.
Table 16.8 VPA Alternative Estimates of Filippelli Data set
_______________________________________________________________________
                            10        20        30        40        50
                    12345678901234567890123456789012345678901234567890
                    --------------------------------------------------
VPA BETA 1          -.2772179591933423928028447556649596044434M+4
Real*16 QR beta     -0.2772179591933423928028447553572108500000E+04
Answer for coef 1   -0.2772179591933420E+04
VPA SE 1             .5597798654749498745747725508021651489727M+3
Real*16 QR SE        0.5597798654749498745747725479752748700000E+03
Answer for SE        0.5597798654749500E+03

VPA BETA 2          -.2316371081608930758821967916501044936138M+4
Real*16 QR beta     -0.2316371081608930758821967914097820200000E+04
Answer for coef 2   -0.2316371081608930E+04
VPA SE 2             .4664775721277964526931098320484471124838M+3
Real*16 QR SE        0.4664775721277964526931098297461005100000E+03
Answer for SE        0.4664775721277960E+03

VPA BETA 3          -.1127973940983715698571670015266249731414M+4
Real*16 QR beta     -0.1127973940983715698571670014199826100000E+04
Answer for coef 3   -0.1127973940983720E+04
VPA SE 3             .2272042744777513106293981763510244738352M+3
Real*16 QR SE        0.2272042744777513106293981752622826700000E+03
Answer for SE        0.2272042744777510E+03

VPA BETA 4          -.3544782337033487716107384852595281875294M+3
Real*16 QR beta     -0.3544782337033487716107384849646966900000E+03
Answer for coef 4   -0.3544782337033490E+03
VPA SE 4             .7164786608759273726166572118158443735326M+2
Real*16 QR SE        0.7164786608759273726166572085071780100000E+02
Answer for SE        0.7164786608759270E+02

VPA BETA 5          -.7512420173937571389052207557481187222874M+2
Real*16 QR beta     -0.7512420173937571389052207552268365400000E+02
Answer for coef 5   -0.7512420173937570E+02
VPA SE 5             .1528971787474000650307567904607140782062M+2
Real*16 QR SE        0.1528971787474000650307567897859220700000E+02
Answer for SE        0.1528971787474000E+02

VPA BETA 6          -.1087531803553425108528108118290083531722M+2
Real*16 QR beta     -0.1087531803553425108528108117714492600000E+02
Answer for coef 6   -0.1087531803553430E+02
VPA SE 6             .2236911598160332755518623413323850745016M+1
Real*16 QR SE        0.2236911598160332755518623403977080500000E+01
Answer for SE        0.2236911598160330E+01

VPA BETA 7          -.1062214985889467664596611220591597363944M+1
Real*16 QR beta     -0.1062214985889467664596611220235596600000E+01
Answer for coef 7   -0.1062214985889470E+01
VPA SE 7             .2216243219342274020661298346608897939687M+0
Real*16 QR SE        0.2216243219342274020661298337934033000000E+00
Answer for SE        0.2216243219342270E+00

VPA BETA 8          -.6701911545934083759267341228848844976973M-1
Real*16 QR beta     -0.6701911545934083759267341228119136200000E-01
Answer for coef 8   -0.6701911545934080E-01
VPA SE 8             .1423637631547239489182330919953278852498M-1
Real*16 QR SE        0.1423637631547239489182330914795936200000E-01
Answer for SE        0.1423637631547240E-01

VPA BETA 9          -.2467810782754786508408544524189188555839M-2
Real*16 QR beta     -0.2467810782754786508408544524564670500000E-02
Answer for coef 9   -0.2467810782754790E-02
VPA SE 9             .5356174088898209362586519329555783802279M-3
Real*16 QR SE        0.5356174088898209362586519311846583900000E-03
Answer for SE        0.5356174088898210E-03

VPA BETA 10         -.4029625250804036712971315485276426445821M-4
Real*16 QR beta     -0.4029625250804036712971315487091695800000E-04
Answer for coef 10  -0.4029625250804040E-04
VPA SE 10            .8966328373738682221004152725410272047808M-5
Real*16 QR SE        0.8966328373738682221004152698795102200000E-05
Answer for SE        0.8966328373738680E-05

VPA BETA 11         -.1467489614229795882287848515307287127546M+4
Real*16 QR beta     -0.1467489614229795882287848513596070800000E+04
Answer for coef 11  -0.1467489614229800E+04
VPA SE 11            .2980845309955369852005523437755166954313M+3
Real*16 QR SE        0.2980845309955369852005523422443903600000E+03
Answer for SE        0.2980845309955370E+03
____________________________________________________
Table 16.9 Lessons to be learned from VPA and Other Accuracy Experiments
_______________________________________________________________________
1. The QR method of solving an OLS regression model can provide 1-2 more digits of accuracy and in
fact may be the only way to successfully solve a "stiff" or multicollinear model.
2. The precision in which data are initially loaded into memory (for example, single precision) impacts
accuracy, even in cases when it is later moved to a higher precision (for example double precision) for the
calculation. This suggests that data should be read into the precision in which the calculation is made to
avoid numeric representation accuracy issues that occur when the precision of the data is increased. This
means that the current practice of a number of software systems to save data in real*4, but move this data
into real*8 for calculations, is a dangerous practice that unnecessarily induces accuracy issues.
3. In many cases, accuracy gains can be made by boosting the precision of accumulators such as the
BLAS routines for sum, absolute sum and dot product. Such routines should be used throughout software
systems and will increase the accuracy of the variance and other calculations. It is desirable to be able to
switch on and off such accuracy improvements to test the sensitivity of the given problem to these
changes. Accuracy improvements to these routines have a CPU cost.
4. Data base design should take into account the needs of the users who may want to read data into
higher-than-usual precision. For data that is not transformed in a data bank, the user should be able to get
all reported digits of precision without rounding (due to numeric representation) loss. This means
allowing a character representation to be accessible as well as a real*4 or real*8 representation. This
suggestion has far-reaching implications since most if not all databanks save data in some kind of real
format. If character saving of all data is not possible, a partial solution would be to save all data in at least
real*8.
5. The new 64-bit computers will make higher-precision calculations more viable and may prove useful
for the estimation of problems requiring high precision for their successful solution. Real*16 and
complex*32 will not have to be emulated in software by the compilers. These technological changes on
the hardware side suggest that software designers may want to offer greater than double precision math in
future releases of their products.
6. The lower the precision of the data, the more imperative it is to check for rank problems, use high-quality numeric routines (lapack/linpack etc.) and utilize inherently higher-accuracy solution methods,
such as the QR. For many problems, however, if data are read with sufficient accuracy, this may not be
needed.
7. If data are not initially read with sufficient precision, high-accuracy methods of calculation, such as the
QR, can provide misleadingly "accurate" results that are in fact tainted by numeric representation issues
inherent in the initial data read. This initial data "corruption" cannot be "cured" by any subsequent
increase in data precision. The more "stiff" the problem, the more this becomes an important
consideration.
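Lesson 3 on higher-precision accumulators can be illustrated without special hardware. The Python sketch below (illustrative only; the BLAS accumulators themselves are Fortran routines) uses compensated (Kahan) summation, in which a running correction term plays the role of a higher-precision accumulator and typically reproduces the correctly rounded sum.

```python
import math
import random

def kahan_sum(xs):
    """Compensated (Kahan) summation: a running correction term plays
    the role of a higher-precision accumulator."""
    s = 0.0
    c = 0.0
    for x in xs:
        y = x - c
        t = s + y
        c = (t - s) - y   # the low-order bits lost in forming s + y
        s = t
    return s

random.seed(1)
xs = [random.uniform(-1.0, 1.0) for _ in range(100000)]

ref = math.fsum(xs)       # correctly rounded reference sum
naive = sum(xs)           # plain left-to-right accumulation
kahan = kahan_sum(xs)

# The compensated sum is at least as close to the reference as the naive one.
assert abs(kahan - ref) <= abs(naive - ref)
```

As the table notes, such accuracy improvements carry a CPU cost: each element here requires four floating-point operations instead of one.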
16.8 Conclusion
In the 1960's, when econometric software was not generally available, users passed
around Fortran programs, usually with a crude column-dependent command structure. In that era,
an applied econometrician needed to know the theory, the econometrics and, in addition, be able
to program in Fortran. In the 70's this practice gave way to commercially available procedure-driven software that could perform the usual analysis. In parallel, a number of 4th generation
languages were developed. These included APL, the SAS® Matrix Language, Speakeasy® and
later MATLAB®, Gauss® and Mathematica®. The B34S matrix command, while patterned on
Speakeasy, is targeted for econometric analysis of time series, nonlinear detection and modeling
and Monte Carlo analysis. The learning curve for its use is substantially shorter than that of Fortran, and
many built-in commands allow the user to program custom calculations relatively quickly. For
routine analysis the built-in procedures are usually sufficient.
The examples in this chapter illustrate the wide range of problems that can be solved
using data saved in a number of precisions. Inspection of matrix language programs, as well as
programs written in other 4th generation languages, both documents the calculation being made
and facilitates replication exercises with other software. Such systems make it easy to experiment,
something substantially more difficult when Fortran and/or C programs had to be custom built
for each research step.25 In many of the chapters there is more discussion of specific problems
that were studied with the help of the matrix command.
25
Stokes (2004b) extensively discussed this aspect of modern econometric software languages.