Revised Chapter 16 in Specifying and Diagnostically Testing Econometric Models (Edition 3) © by Houston H. Stokes 26 January 2010. All rights reserved. Preliminary Draft

Chapter 16. Programming using the Matrix Command
 16.0 Introduction
 16.1 Brief Introduction to the B34S Matrix language
 16.2 Overview of Nonlinear Capability
 16.3 Rules of the Matrix Language
 16.4 Linear Algebra using the Matrix Language
 16.5 Extended Eigenvalue Analysis
 16.6 A Preliminary Investigation of Inversion Speed Differences
   Table 16.1 Relative Cost of Equalization & Refinement of a General Matrix
 16.7 Variable Precision Math
   Table 16.2 LRE for Various Approaches to an OLS Model of the Filippelli Data
   Table 16.4 Coefficients estimated with QR using Real*16 Filippelli Data
   Table 16.5 Residual Sum of Squares on a VAR model of Order 6 – Gas Furnace Data
   Table 16.6 LRE for Various Estimates of Coef, SE and RSS of Pontius Data
   Table 16.7 LRE for QR, Cholesky, SVD linpack and lapack for Eberhardt Data
   Table 16.8 VPA Alternative Estimates of Filippelli Data set
   Table 16.9 Lessons to be learned from VPA and Other Accuracy Experiments
 16.8 Conclusion

Programming using the Matrix Command

16.0 Introduction

The B34S matrix command is a full-featured 4th generation programming language that allows users to customize calculations that are not possible with the built-in B34S procedures. This chapter provides an introduction to the matrix command language, with special emphasis both on linear algebra applications and on the general design of the language. It should be thought of as an introduction to, and overview of, the programming capability of the system, with an emphasis on applications.1 At many junctures the reader is pointed to other chapters for further discussion of the theory underlying an application. The role of this chapter is to provide a reference to matrix command basics.2 An additional and no less important goal is to discuss a number of tests of linpack and lapack eigenvalue, Cholesky and LU routines for speed and accuracy.

1 In the 1960's, with the advent of Fortran compilers and limited-capability mainframe computers, econometric researchers developed software that used column-dependent input, had limited capability and was difficult to extend, as outlined in Stokes (2004b). In the 1970's and early 1980's econometric software improved, but was still expensive to develop due to high CPU costs on mainframe computers. The PC revolution, together with the development of programming languages such as GAUSS® and MATLAB®, stimulated researchers to develop their own procedures without waiting for software developers to "hard wire" this capability into their commercially distributed systems. The matrix command allows a user to develop custom calculations using a programming language with many econometric commands already available. While many of the CPU-intensive commands are "hard wired" into the language, many others are themselves just subroutines or functions written in the matrix language and available to the user to modify as needed. The goal of this chapter is to discuss how this might be done. Many code examples are provided as an illustration of what is available in the language.

Section 16.1 provides a brief introduction to the matrix command language. All commands in the language are listed by type but, because of space limitations, are not illustrated in any detail. Since the matrix command has a running example for every command, the user is encouraged to experiment with commands of special interest by first reading the help file and next running one or more of the supplied examples. To illustrate the power of the system, a program to perform a fast Fourier transform on real*8 and complex*16 data is shown. A user subroutine filter illustrates how the language can be extended.3 The help file for the schur factorization, which is widely used in rational expectations models, is provided both to show capability and to illustrate what a representative command help file contains. Section 16.2 provides an overview of some of the nonlinear capability built into the language and motivates why knowledge of this language, or of a similar one such as MATLAB or SPEAKEASY®, is important. The solutions to a variety of problems are illustrated but not discussed in any detail. Section 16.3 discusses many of the rules of the matrix language, while section 16.4 illustrates matrix algebra applications. In sections 16.5 and 16.6 refinements to eigenvalue analysis and inversion speed issues are illustrated. Section 16.7 shows the gains obtained by real*16 and VPA math calculations.

16.1 Brief Introduction to the B34S Matrix language

The matrix language is a full-featured 4th generation language that can be used to program custom calculations.
Analysis is supported for real*8, real*16, complex*16 and complex*32 data. A number of character manipulation routines are supported. High-resolution graphics is available on all platforms4, and batch and interactive operation is available. The matrix facility supports user programs, which use the local address space, and subroutines and functions, which have their own address space.5 This design means that variable names inside these routines will not conflict with variables known at the global level, which is set to 100. Variables in the language are built in an object-oriented fashion using analytic statements such as:

y = r*2.;

where r is a variable that could be a matrix, 2D array, 1D array, vector or a scalar. The class of the variables determines the calculation performed. If x were an n by k matrix of data values and y were an n-element vector of left-hand-side data points, the OLS solution using the textbook formula could be calculated as

beta=inv(transpose(x)*x)*transpose(x)*y;

By use of index vectors that are created by statements such as integers(1,3), it is possible to subset a matrix without the use of do loops and other programming constructs.

2 Nonlinear modeling examples are discussed in Chapter 11, while many other examples of applications are given in other chapters, such as 2 and 14. A subsequent book, Stokes (200x), is under development that will discuss a large number of applications, particularly in the area of time series analysis.

3 This facility is somewhat similar to the MATLAB m file, which contains help comments in its first lines. In contrast to MATLAB, the b34s libraries allow placement of a large number of subroutines and help files in one file. The b34s design dates from the IBM MVS PDS facility, except that it is portable. SCA has a similar design with regard to macro files.

4 See chapter 1 for the platforms that are supported.

5 Examples will be supplied later.
This capability is illustrated by

a=rn(matrix(10,5:));
newa= a(integers(1,3),integers(2,4));
call print('Pull off rows 1-3, cols 2-4',a,newa);

which produces output

=> A=RN(MATRIX(10,5:))$
=> NEWA=A(INTEGERS(1,3),INTEGERS(2,4))$
=> CALL PRINT('Pull off rows 1-3, cols 2-4',A,NEWA)$

Pull off rows 1-3, cols 2-4

A        = Matrix of 10 by 5 elements

            1              2              3              4              5
  1    1.82793       -2.14489        0.166069      -0.532415E-01   0.466859
  2   -0.641156       0.219954       1.27446        0.477187       0.555387
  3    0.726593      -0.282409E-01  -0.555147       0.410387      -0.373611E-01
  4    0.174686      -0.957929       1.27209        0.940303      -1.63219
  5    1.01451       -0.795788      -0.752745      -0.973475E-01   0.719606E-01
  6   -1.70319       -0.853220      -1.68396        0.888468       0.204583
  7    2.23174       -1.34378        0.551978       0.411578       0.604946
  8    0.256844      -0.266375       1.14227       -0.956681      -0.559318
  9    1.26117       -0.396155      -1.84390        0.628140E-02   1.68875
 10   -0.303238      -1.07086        1.21187        0.295704      -1.71790

NEWA     = Matrix of 3 by 3 elements

            1              2              3
  1   -2.14489        0.166069      -0.532415E-01
  2    0.219954       1.27446        0.477187
  3   -0.282409E-01  -0.555147       0.410387

In addition to analytic statements that might contain function calls, call statements, which provide a branch to a subroutine, are supported. Examples are:

call olsq(y x{0 to 10} y{1 to 10} :print);

which estimates the OLS model y(t) = f( y(t-1), ..., y(t-10), x(t), ..., x(t-10) ) and

call tabulate(x,y,z);

which produces a table of x, y and z. Both functions and subroutines can be built into the executable, and thus hidden from the user, or themselves written in the matrix language. The formula and solve commands allow recursive solution of an analytic statement over a range of index values. By vectorizing the calculation, at the loss of some generality, these features speed up calculations that would otherwise have had to use do loops, which have substantial overhead. A number of examples that illustrate these features are shown later in this document, where all commands and programming statements are shown.
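The textbook OLS formula and the integer-vector subsetting shown above can be sketched outside B34S as well. The following minimal parallel illustration is in plain Python, not the matrix language; the helpers transpose, matmul, inv2 and integers are defined here only for the example (integers mimics the 1-based, inclusive B34S builtin):

```python
def transpose(m):
    # rows become columns
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # textbook matrix product
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inv2(m):
    # closed-form inverse of a 2 by 2 matrix, enough for this sketch
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def integers(lo, hi):
    # mimics the B34S integers(lo,hi) index vector (1-based, inclusive)
    return list(range(lo, hi + 1))

# OLS on exact data y = 1 + 2*x, so beta should be about (1, 2)
x = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [[3.0], [5.0], [7.0], [9.0]]
xt = transpose(x)
beta = matmul(matmul(inv2(matmul(xt, x)), xt), y)

# Subsetting: pull rows 1-3 and columns 2-4 of a 10 by 5 matrix,
# as newa = a(integers(1,3),integers(2,4)) does in the matrix language
a = [[10 * i + j for j in range(1, 6)] for i in range(1, 11)]
newa = [[a[i - 1][j - 1] for j in integers(2, 4)] for i in integers(1, 3)]
```

In the matrix language the class of each operand selects the operation automatically; the sketch simply makes the same normal-equations arithmetic explicit.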
Inspection of the language will show that the matrix facility has been closely influenced by SPEAKEASY®, which was developed by Stan Cohen. The programming languages of the two systems are very similar and share the same save-file structure. However, there are a number of important differences that will be discussed further below. The matrix facility is not designed to run interactively, although commands can be given interactively in the Manual Mode. Output is written to the b34s.out file, and error messages are displayed in both the b34s.out and b34s.log files. The objective of the matrix facility is to give the user access to a powerful object-oriented programming language so that custom calculations can be made. A particular strength of the facility is the estimation of complex nonlinear least squares and maximum likelihood models. Such models, which are specified in matrix command programs, can be solved either with subroutines or with the nonlinear commands nleq, nlpmin1, nlpmin2, nlpmin3, nllsq, maxf1, maxf2, maxf3, cmaxf1, cmaxf2 and cmaxf3, which are discussed in Chapter 11. While the use of B34S subroutines for the complete calculation would give the user total control of the estimation process, speed would be given up. The above nonlinear commands give the user complete control of the form of the estimated model, which is specified in a matrix command program. Since these programs are called by compiled solvers, there is a substantial speed advantage over a design that writes the solver itself in the matrix language.6 The nonlinear solvers were designed to call matrix command programs, not matrix command subroutines, although a link to a subroutine can be made.7 The example listed below illustrates the programming language and shows part of the real*8 and complex*16 fft decomposition of data generated with the dcos function. This example uses the commands dcos, fft and ifft. The code is completely vectorized with no loops.
The inverse fft is used to recover the series (times n). Real*8 and complex*16 problems are shown.

* Example from IMSL (10) Math Page 707-709;
n=7.;
ifft=grid(1.,n,1.);
xfft=dcos((ifft-1.)*2.*pi()/n);
rfft=fft(xfft);
bfft=fft(rfft:back);
call tabulate(xfft,rfft,bfft);

* Complex Case See IMSL(10) Math Page 715-717;
cfft=complex(0.0,1.);
hfft=(complex(2.*pi())*cfft/complex(n))*complex(3.0);
xfft=dexp(complex(ifft-1.)*hfft);
cfft=fft(xfft);
bfft=fft(cfft:back);
call tabulate(xfft,cfft,bfft);

6 Subroutines DUD, DUD2 and NARQ, written in the matrix command language itself, are supplied in file matrix2.mac to illustrate a fully programmed nonlinear least squares solver using the Marquardt (1963) method that mimics the SAS nonlin command.

7 The technical reason for this is that for a function or subroutine call a duplicate copy of all arguments is made to named storage at the current level plus 1. This way the arguments are "local" in the subroutine, under a possibly different name. The disadvantage is that this takes more space and slows execution. Use of a program allows all variables to be accessed without explicitly being passed.

The grid command creates a vector running from 1. to 7. in increments of 1. The matrix language supports integer*4 and real*8 data, so the command was n=7., not n=7, which would have created an integer. If data types are mixed, the program will generate a mixed-mode error, since the parser does not know in which data type to save the result. The complex command creates a complex*16 datatype from one or two real*8 arguments. In the above example a series is generated, the fft is calculated and the series is recovered (times 7.).
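The fft / fft(:back) round trip, and the band-filter idea used by the filter subroutine shown later, can also be sketched in plain Python. This is an illustrative naive DFT, not the B34S implementation; like the B34S convention, the unnormalized backward transform returns the series times n:

```python
import cmath
import math

def dft(x, back=False):
    # naive discrete Fourier transform; with back=True the sum is the
    # unnormalized inverse, so dft(dft(x), back=True) recovers x times n
    n = len(x)
    sign = 1.0 if back else -1.0
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for k in range(n)) for j in range(n)]

n = 7
xfft = [math.cos(k * 2.0 * math.pi / n) for k in range(n)]  # dcos((ifft-1.)*2.*pi()/n)
rfft = dft(xfft)                 # coefficients near zero except 3.5 at obs 2 and 7
bfft = dft(rfft, back=True)      # recovers xfft scaled by n = 7

# band filter: keep only coefficients nlow..nhigh (1-based), zero the rest,
# then invert and rescale by 1/n, as the filter subroutine below does
nlow, nhigh = 2, 3
fftnew = [rfft[j] if nlow <= j + 1 <= nhigh else 0j for j in range(n)]
xnew = [v / n for v in dft(fftnew, back=True)]
```

The sketch makes the scaling convention concrete: recovering the original series requires dividing the backward transform by n, which is exactly what the filter subroutine's multiplication by 1./dfloat(n) does.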
Output is:

=> * EXAMPLE FROM IMSL (10) MATH PAGE 707-709$
=> N=7.$
=> IFFT=GRID(1.,N,1.)$
=> XFFT=DCOS((IFFT-1.)*2.*PI()/N)$
=> RFFT=FFT(XFFT)$
=> BFFT=FFT(RFFT:BACK)$
=> CALL TABULATE(XFFT,RFFT,BFFT)$

 Obs      XFFT          RFFT           BFFT
   1     1.000      -0.2220E-15      7.000
   2    0.6235       3.500           4.364
   3   -0.2225      -0.4653E-15     -1.558
   4   -0.9010      -0.5773E-14     -6.307
   5   -0.9010      -0.2129E-16     -6.307
   6   -0.2225       0.6328E-14     -1.558
   7    0.6235      -0.9279E-17      4.364

=> * COMPLEX CASE SEE IMSL(10) MATH PAGE 715-717$
=> CFFT=COMPLEX(0.0,1.)$
=> HFFT=(COMPLEX(2.*PI())*CFFT/COMPLEX(N))*COMPLEX(3.0)$
=> XFFT=DEXP(COMPLEX(IFFT-1.)*HFFT)$
=> CFFT=FFT(XFFT)$
=> BFFT=FFT(CFFT:BACK)$
=> CALL TABULATE(XFFT,CFFT,BFFT)$

 Obs        XFFT                     CFFT                         BFFT
   1  ( 1.000 ,  0.000 )   (-0.2220E-15,  0.4441E-15)   ( 7.000 , -0.4733E-29)
   2  (-0.9010,  0.4339)   (-0.2720E-14,  0.2818E-15)   (-6.307 ,  3.037 )
   3  ( 0.6235, -0.7818)   ( 0.7938E-14,  0.3890E-15)   ( 4.364 , -5.473 )
   4  (-0.2225,  0.9749)   ( 7.000     , -0.2209E-14)   (-1.558 ,  6.824 )
   5  (-0.2225, -0.9749)   ( 0.1110E-13,  0.3556E-15)   (-1.558 , -6.824 )
   6  ( 0.6235,  0.7818)   (-0.4496E-14,  0.3917E-15)   ( 4.364 ,  5.473 )
   7  (-0.9010, -0.4339)   ( 0.2165E-14,  0.3464E-15)   (-6.307 , -3.037 )

The matrix command subroutine filter, listed next, shows some other aspects of the language. Within the filter subroutine all variables are local. The filter subroutine can be called with commands such as

call filter(xold,xnew,10.,14.);

subroutine filter(xold,xnew,nlow,nhigh);
/$
/$ Depending on nlow and nhigh filter can be a low pass
/$ or a high pass filter
/$
/$ Real FFT is done for a series. FFT values are zeroed
/$ out if outside range nlow - nhigh.
/$ xnew recovered
/$ by inverse FFT
/$
/$ FILTERC subroutine uses Complex FFT
/$
/$ Use of FILTER in place of FILTERC may result in
/$ Phase and Gain loss
/$
/$ xold  - input series
/$ xnew  - filtered series
/$ nlow  - lower filter bound
/$ nhigh - upper filter bound
/$
/$ Routine built 2 April 1999
/$
n=norows(xold);
if(n.le.0)then;
   call print('Filter finds # LE 0');
   go to done;
endif;
if(nlow.le.0.or.nlow.gt.n)then;
   call print('Filter finds nlow not set correctly');
   go to done;
endif;
if(nhigh.le.nlow.or.nhigh.gt.n)then;
   call print('Filter finds nhigh not set correctly');
   go to done;
endif;
fftold = fft(xold);
fftnew = array(n:);
i=integers(nlow,nhigh);
fftnew(i) = fftold(i);
xnew = afam(fft(fftnew :back))*(1./dfloat(n));
done continue;
return;
end;

The complete matrix command vocabulary of over 400 words is listed by subroutine, function and keyword:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
List of Built-In Matrix Command Subroutines
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

ACEFIT ACF_PLOT ADDCOL ADDROW AGGDATA ALIGN ARMA AUTOBJ BACKSPACE BDS BESTREG B_G_TEST BGARCH BLUS BPFILTER BREAK BUILDLAG CCFTEST CHAR1 CHARACTER CHECKPOINT CLEARALL CLEARDAT CLOSE CLS CMAXF1 CMAXF2 CMAXF3 COMPRESS CONSTRAIN CONTRACT COPY COPYLOG COPYOUT COPYTIME COPYF CSPECTRAL CSUB CSV DATA_ACF DATA2ACF DATAFREQ DATAVIEW DELETECOL DELETEROW DES DESCRIBE DF DISPLAYB DIST_TAB DODOS DO_SPEC DO2SPEC DOUNIX DQDAG DQDNG DQDAGI - DQDAGP DQDAGS DQAND DTWODQ ESACF ECHOOFF ECHOON EPPRINT EPRINT ERASE EXPAND FORMS FORPLOT - Alternating Conditional Expectation Model Estimation Simple ACF Plot Add a column to a 2d array or matrix. Add a row to a 2d array or matrix. Aggregate Data under control of an ID Vector. Align Series with Missing Data ARMA estimation using ML and MOM. Automatic Estimation of Box-Jenkins Model Backspace a unit BDS Nonlinearity test.
Best OLS Regression Breusch-Godfrey (1978) Residual Test Calculate function for a BGARCH model. BLUS Residual Analysis Baxter-King Filter. Set User Program Break Point. Builds NEWY and NEWX for VAR Modeling Display CCF Function of Prewhitened data Place a string in a character*1 array. Place a string in a character*1 array. Save workspace in portable file. Clears all objects from workspace. Clears data from workspace. Close a logical unit. Clear screen. Constrained maximization of function using zxmwd. Constrained maximization of function using dbconf/g. Constrained maximization of function using db2pol. Compress workspace. Subset data based on range of values. Contract a character array. Copy an object to another object Copy file to log file. Copy file to output file. Copy time info from series 1 to series 2 Copy a file from one unit to another. Do cross spectral analysis. Call Subroutine Read and Write a CSV file Calculate ACF and PACF Plots Calculate ACF and PACF Plots added argument Data Frequency View a Series Under Menu Control Delete a column from a matrix or array. Delete a row from a matrix or array. Code / decode. Calculate Moment 1-4 and 6 of a series Calculate Dickey-Fuller Unit Root Test. Displays a Buffer contents Distribution Table Execute a command string if under dos/windows. Display Periodogram and Spectrum Display Periodogram and Spectrum added argument Execute a command string if under unix. Integrate a function using Gauss-Kronrod rules Integrate a smooth function using a nonadaptive rule. Integrate a function over infinite/semi-infinite interval. Integrate a function with singularity points given Integrate a function with end point singularities Multiple integration of a function Two Dimensional Iterated Integral Extended Sample Autocorrelation Function Turn off listing of execution. Turn on listing of execution. Print to log and output file. Print to log file. Erase file(s).
Expand a character array Build Control Forms Forecast Plot FREE FPLOT FPRINT GAMFIT GARCH GARCHEST GET GETDMF GETKEY GETMATLAB GET_FILE GET_NAME GETRATS GETSCA GMFAC GMINV GMSOLV GRAPH GRAPHP GRCHARSET GRREPLAY GTEST GWRITE GWRITE2 HEADER HEXTOCH HINICH82 HINICH96 HPFILTER ISEXTRACT IALEN IBFCLOSE IBFOPEN IBFREADC IBFREADR IBFSEEK IBFWRITER IBFWRITEC IB34S11 IFILESIZE IFILLSTR IGETICHAR IGETCHARI IJUSTSTR ILCOPY ILOCATESTR ILOWER INEXTR8 INEXTR4 INEXTSTR - INEXTI4 INTTOSTR IRF IR8TOSTR ISTRTOR8 ISTRTOINT IUPPER I_DRNSES I_DRNGES I_DRNUN I_DRNNOR I_DRNBET I_DRNCHI I_DRNCHY I_DRNEXP I_DRNEXT - I_DRNGAM I_DRNGCT I_DRNGDA - Free a variable. Plot a Function Formatted print facility. Generalized Additive Model Estimation Calculate function for an ARCH/GARCH model. Estimate ARCH/GARCH model. Gets a variable from b34s. Gets data from a b34s DFM file. Gets a key Gets data from matlab. Gets a File name Get Name of a Matrix Variable Reads RATS Portable file. Reads SCA FSAVE and MAD portable files. LU factorization of n by m matrix Inverse of General Matrix using LAPACK Solve Linear Equations system using LAPACK High Resolution graph. Multi-Pass Graphics Programming Capability Set Character Set for Graphics. Graph replay and reformat command. Tests output of an ARCH/GARCH Model Save Objects in GAUSS Format using one file Save objects in GAUSS format using two files Turn on header Convert hex to a character representation. Hinich 1982 Nonlinearity Test. Hinich 1996 Nonlinearity Test. Hodrick-Prescott Filter. Place data in a structure.
Get actual length of a buffer of character data Close a file that was used for Binary I/O Open a File for Binary I/O Reads from a binary file into Character*1 array Reads from a binary file into Real*8 array Position Binary read/write pointer Write noncharacter buffer on a binary file Write character buffer on a binary file Parse a token using B34S11 parser Determine number of bytes in a file Fill a string with a character Obtain ichar info on a character buffer Get character from ichar value Left/Right/center a string Move bytes from one location to another Locate a substring in a string - 200 length max Lower case a string - 200 length max Convert next value in string to real*8 variable Convert next value in string to real*4 variable Extract next blank delimited sub-string from a string Convert next value in a string to integer. Convert integer to string using format Impulse Response Functions of VAR Model Convert real*8 value to string using format Convert string to real*8 Convert string to integer Upper case a string - 200 length max Initializes the table used by shuffled generators. Get the table used in the shuffled generators.
Uniform (0,1) Generator Random Normal Distribution Random numbers from beta distribution Random numbers from Chi-squared distribution Random numbers from Cauchy distribution Random numbers from standard exponential Random numbers from mixture of two exponential distributions Random numbers from standard gamma distribution Random numbers from general continuous distribution Random integers from discrete distribution alias I_DRNGDT I_DRNLNL I_DRNMVN I_DRNNOA I_DRNNOR I_DRNSTA I_DRNTRI I_DRNSPH I_DRNVMS I_DRNWIB I_RNBIN I_RNGET I_RNOPG I_RNOPT I_RNSET I_RNGEO I_RNHYP I_RNMTN I_RNNBN I_RNPER I_RNSRI KEENAN KSWTEST KSWTESTM LAGMATRIX LAGTEST LAGTEST2 LAPACK LM LOAD LOADDATA LPMAX LPMIN LRE MAKEDATA MAKEFAIR MAKEGLOBAL MAKELOCAL MAKEMATLAB MAKEMAD MAKERATS MAKESCA MANUAL MARS MARSPLINE MARS_VAR MAXF1 MAXF2 MAXF3 MELD MENU MESSAGE MINIMAX MISSPLOT MQSTAT MVNLTEST NAMES NLEQ NLLSQ NL2SOL NLPMIN1 NLPMIN2 NLPMIN3 NLSTART NOHEADER OLSQ OLSPLOT OPEN OUTDOUBLE OUTINTEGER - approach Random integers from discrete using table lookup Random numbers from lognormal distribution Random numbers from multivariate normal Random normal numbers using acceptance/rejection Random normal numbers using CDF method Random numbers from stable distribution Random numbers from triangular distribution Random numbers on the unit circle Random numbers from Von Mises distribution Random numbers from Weibull distribution Random integers from binomial distribution Gets seed used in IMSL Random Number generators. Gets the type of generator currently in use. Selects the type of uniform (0,1) generator. Sets seed used in IMSL Random Number generators. Random integers from Geometric distribution Random integers from Hypergeometric distribution. Random numbers from multinomial distribution Negative binomial distribution Random perturbation of integers Index of random sample without replacement Keenan Nonlinearity test K Period Stock-Watson Test Moving Period Stock-Watson Test Builds Lag Matrix.
3-D Graph to display RSS for OLS Lags 3-D Graph to display RSS for MARS Lags Sets Key LAPACK parameters Engle Lagrange Multiplier ARCH test. Load a Subroutine from a library. Load Data from b34s into MATRIX command. Solve Linear Programming maximization problem. Solve Linear Programming minimization problem. McCullough Log Relative Error Place data in a b34s data loading structure. Make Fair-Parke Data Loading File Make a variable global (seen at all levels). Make a variable seen at only local level. Place data in a file to be loaded into Matlab. Makes SCA *.MAD datafile from vectors Make RATS portable file. Make SCA FSV portable file. Place MATRIX command in manual mode. Multivariate Autoregressive Spline Models Updated MARS Command using Hastie-Tibshirani code Joint Estimation of VAR Model using MARS Approach Maximize a function using IMSL ZXMIN. Maximize a function using IMSL DUMINF/DUMING. Maximize a function using simplex method (DU2POL). Form all possible combinations of vectors. Put up user Menu for Input Put up user message and allow a decision. Estimate MINIMAX with MAXF2 Plot of a series with Missing Data Multivariate Q Statistic Multivariate Third Order Hinich Test List names in storage. Jointly solve a number of nonlinear equations. Nonlinear Least Squares Estimation. Alternative Nonlinear Least Squares Estimation. Nonlinear Programming fin. diff. grad. DN2CONF. Nonlinear Programming user supplied grad. DN2CONG. Nonlinear Programming user supplied grad. DN0ONF. Generate starting values for NL routines. Turn off header. Estimate OLS, MINIMAX and L1 models. Plot of Fitted and Actual Data & Res Open a file and attach to a unit. Display a Real*8 value at an x, y on screen. Display an Integer*4 value at an x, y on screen.
OUTSTRING PCOPY PERMUTE PISPLINE PLOT POLYFIT POLYVAL POLYMCONV POLYMDISP POLYMINV POLYMMULT PP PRINT PRINTALL PRINTOFF PRINTON PRINTVASV PRINTVASCMAT PRINTVASRMAT PROBIT PVALUE_1 PVALUE_2 PVALUE_3 QPMIN QUANTILE READ REAL16INFO REAL16OFF REAL16ON REAL32OFF REAL32ON REAL32_VPA RESET RESET77 RESTORE - RTEST RTEST2 REVERSE - REWIND ROTHMAN RMATLAB RRPLOTS RRPLOTS2 RUN SAVE SCHUR SCREENCLOSE SCREENOPEN SCREENOUTOFF SCREENOUTON SET SETCOL SETLABEL SETLEVEL SETNDIMV SETROW SETTIME SETWINDOW SIGD SIMULATE SMOOTH SOLVEFREE SORT SPECTRAL STEPWISE STOP SUBRENAME SUSPEND SWARTEST - Display a string value at an x, y point on screen. Copy an object from one pointer address to another Reorder Square Matrix Pi Spline Nonlinear Model Building Line-Printer Graphics Fit an nth degree polynomial Evaluate an nth degree polynomial Convert storage of a polynomial matrix Display/Extract a polynomial matrix Invert a Polynomial Matrix Multiply a Polynomial Matrix Calculate Phillips-Perron Unit Root test Print text and data objects. Lists all variables in storage. Turn off Printing Turn on Printing (This is the default) Resets so that vectors/arrays print as vectors/arrays Vectors/Arrays print as Column Matrix/Array Vectors/Arrays print as Row Matrix/Array Estimate Probit (0-1) Model. Present value of $1 received at end of n years Present value of an Annuity of $1 Present value of $1 received throughout year Quadratic Programming. Calculate interquartile range. Read data directly into MATRIX workspace from a file. Obtain Real16 info Turn off Real16 add Turn on extended accuracy Turn off Real32 add Turn on extended accuracy for real*16 Turn on extended accuracy for real*16 using vpa Calculate Ramsey (1969) regression specification test. Thursby - Schmidt Regression Specification Test Load data back in MATRIX facility from external save file. Test Residuals of Model Test Residuals of Model - No RES and Y Plots Test a real*8 vector for reversibility in Freq.
Domain Rewind logical unit. Test a real*8 vector for reversibility in Time Domain Runs Matlab Plots Recursive Residual Data Plots Recursive Residual Coef Terminates the matrix command being in "manual" mode. Save current workspace in portable file format. Performs Schur decomposition Turn off Display Manager Turn on Display Manager Turn screen output off. Turn screen output on. Set all elements of an object to a value. Set column of an object to a value. Set the label of an object. Set level. Sets an element in an n dimensional object. Set row of an object to a value. Sets the time info in an existing series Set window to main(1), help(2) or error(3). Set print digits. Default g16.8 Dynamically Simulate OLS Model Do exponential smoothing. Set frequency of freeing temp variables. Sort a real vector. Spectral analysis of a vector or 1d array. Stepwise OLS Regression Stop execution of a matrix command. Internally rename a subroutine. Suspend loading and Executing a program Stock-Watson VAR Test SYSTEM TABULATE TESTARG TIMER TRIPLES TSAY TSLINEUP TSD VAREST VPASET VOCAB WRITE - Issue a system command. List vectors in a table. Lists what is passed to a subroutine or function. Gets CPU time. Calculate Triples Reversibility Test Calculate Tsay nonlinearity test. Line up Time Series Data Interface to TSD Data set VAR Modeling Set Variable Precision Math Options List built-in subroutine vocabulary. Write an object to an external file.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Matrix Command Built-In Function Vocabulary
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

ACF AFAM ARGUMENT ARRAY BETAPROB BINDF BINPR BOOTI BOOTV BOXCOX BSNAK BSOPK BSINT BSINT2 BSINT3 BSDER BSDER2 BSDER3 BSITG BSITG2 BSITG3 C1ARRAY C8ARRAY CATCOL CATROW CCF - CHAR CHARDATE CHARDATEMY CHARTIME CHISQPROB CHTOR CHTOHEX CFUNC COMB COMPLEX CSPLINEFIT CSPLINE CSPLINEVAL CSPLINEDER CSPLINEITG CUSUM CUSUMSQ CWEEK DABS DARCOS DARSIN DATAN DATAN2 DATENOW DBLE - Calculate autocorrelation function of a 1d object. Change a matrix or vector to an array class object. Unpack character argument at run-time Define a 1d or 2d array. Calculate a beta probability. Evaluate Binomial Distribution Function Evaluate Binomial Probability Function Calculate integers to be used with bootstrap. Bootstraps a vector with replacement. Box-Cox Transformation of a series given lambda. Compute Not a Knot Sequence Compute optimal spline knot sequence Compute 1-D spline interpolant given knots Compute 2-D spline interpolant given knots Compute 3-D spline interpolant given knots Compute 1-D spline values/derivatives given knots Compute 2-D spline values/derivatives given knots Compute 3-D spline values/derivatives given knots Compute 1-D spline integral given knots Compute 2-D spline integral given knots Compute 3-D spline integral given knots Create a Character*1 array Create a Character*8 array Concatenates an object by columns. Concatenates an object by rows. Calculate the cross correlation function on two objects. Convert an integer in range 0-127 to character. Convert julian variable into character date dd\mm\yy. Convert julian variable into character date mm\yyyy. Converts julian variable into character date hh:mm:ss Calculate chi-square probability. Convert a character variable to a real variable. Convert a character to its hex representation. Call Function Combination of N objects taken M at a time.
Build a complex variable from two real*8 variables. Fit a 1 D Cubic Spline using alternative models Calculate a cubic spline for 1 D data Calculate spline value given spline Calculate spline derivative given spline value Calculate integral of a cubic spline Cumulative sum. Cumulative sum squared. Name of the day in character. Absolute value of a real*8 variable. Arc cosine of a real*8 variable. Arc sine of a real*8 variable. Arc tan of a real*8 variable. Arc tan of x / y. Signs inspected. Date now in form dd/mm/yy Convert real*4 to real*8. DCONJ DCOS DCOSH DDOT DERF DERFC DERIVATIVE DET DEXP DFLOAT DGAMMA DIAG DIAGMAT DIF DINT DNINT DIVIDE DLGAMMA DLOG DLOG10 DMAX DMAX1 DMIN DMIN1 DMOD DROPFIRST DROPLAST DSIN DSINH DSQRT DTAN DTANH EIGENVAL EPSILON EVAL EXP EXTRACT FACT FDAYHMS FFT FIND FLOAT FPROB FREQ FRACDIF FYEAR GENARMA GETDAY GETHOUR GETNDIMV GETMINUTE GETMONTH GETQT GETSECOND GETYEAR GOODCOL GOODROW GRID HUGE HYPDF HYPPR INTEGER8 I4TOI8 I8TOI4 ICHAR ICOLOR IDINT IDNINT INFOGRAPH IMAG IAMAX - Conjugate of complex argument. Cosine of real*8 argument. Hyperbolic cosine of real*8 argument. Inner product of two vectors. Error function of real*8/real*16 argument. Complementary error function. Analytic derivative of a vector. Determinant of a matrix. Exponential of a real*8 argument. Convert integer*4 to real*8. Gamma function of real*8 argument. Place diagonal of a matrix in an array. Create diagonal matrix. Difference a series. Extract integer part of real*8 number Extract nearest integer part of real*8 number Divide with an alternative return. Natural log of gamma function. Natural log. Base 10 log. Largest element in an array. Largest element between two arrays. Smallest element in an array. Smallest element between two arrays. Remainder. Drops observations on top of an array. Drops observations on bottom of an array. Calculates sine. Hyperbolic sine. Square root of real*8 or complex*16 variable. Tangent. Hyperbolic tangent.
Eigenvalue of matrix. Alias EIG. Positive value such that 1.+x ne 1. Evaluate a character argument Exponential of real*8 or complex*16 variable. Extract elements of a character*1 variable. Factorial Gets fraction of a day. Fast fourier transform. Finds location of a character string. Converts integer*4 to real*4. Probability of F distribution. Gets frequency of a time series. Fractional Differencing Gets fraction of a year from julian date. Generate an ARMA series given parameters. Obtain day of year from julian series. Obtains hour of the day from julian date. Obtain value from an n dimensional object. Obtains minute of the day from julian date. Obtains month from julian date. Obtains quarter of year from julian date. Obtains second from julian date. Obtains year. Deletes all columns where there is missing data. Deletes all rows where there is missing data. Defines a real*8 array with a given increment. Largest number of type Evaluate Hypergeometric Distribution Function Evaluate Hypergeometric Probability Function Load an Integer*8 object from a string Move an object from integer*4 to integer*8 Move an object from integer*8 to integer*4 Convect a character to integer in range 0-127. Sets Color numbers. Used with Graphp. Converts from real*8 to integer*4. Converts from real*8 to integer*4 with rounding. Obtain Interacter Graphics INFO Copy imaginary part of complex*16 number into real*8. 
Largest abs element in 1 or 2D object Chapter 16 IAMIN IMAX IMIN INDEX - INLINE INT INTEGERS INV INVBETA INVCHISQ INVFDIS INVTDIS IQINT IQNINT ISMISSING IWEEK JULDAYDMY JULDAYQY JULDAYY KEEPFIRST KEEPLAST KIND KINDAS KLASS KPROD LABEL LAG LEVEL LOWERT MCOV MAKEJUL MASKADD MASKSUB MATRIX MEAN MEDIAN MFAM MISSING MLSUM MOVELEFT MOVERIGHT NAMELIST NEAREST NCCHISQ NOCOLS NOELS NORMDEN NORMDIST NOROWS NOTFIND OBJECT PDFAC PDFACDD PDFACUD PDINV PDSOLV PI PINV PLACE POIDF POIPR POINTER POLYDV POLYMULT POLYROOT PROBIT PROBNORM PROBNORM2 PROD Q1 - Smallest abs element in 1 or 2D object Largest element in 1 or 2D object Smallest element in 1 or 2D object Define integer index vector, address n dimensional object. Inline creation of a program Copy real*4 to integer*4. Generate an integer vector with given interval. Inverse of a real*8 or complex*16 matrix. Inverse beta distribution. Inverse Chi-square distribution. Inverse F distribution. Inverse t distribution. Converts from real*16 to integer*4. Converts from real*16 to integer*4 with rounding. Sets to 1.0 if variable is missing Sets 1. for monday etc. Given day, month, year gets julian value. Given quarter and year gets julian value. Given year gets julian value. Given k, keeps first k observations. Given k, keeps last k observations. Returns kind of an object in integer. Sets kind of second argument to kind first arg. Returns klass of an object in integer. Kronecker Product of two matrices. Returns label of a variable. Lags variable. Missing values propagated. Returns current level. Lower triangle of matrix. Consistent Covariance Matrix Make a Julian date from a time series Add if mask is set. Subtract if mask is set. Define a matrix. Average of a 1d object. Median of a real*8 object. Set 1d or 2d array to vector or matrix. Returns missing value. Sums log of elements of a 1d object. Moves elements of character variable left. Move elements of character variable right. Creates a namelist. 
Nearest distinct number of a given type Non central chi-square probability. Gets number of columns of an object. Gets number of elements in an object. Normal density. 1-norm, 2-norm and i-norm distance. Gets number of rows of an object. Location where a character is not found. Put together character objects. Cholesky factorization of PD matrix. Downdate Cholesky factorization. Update Cholesky factorization. Inverse of a PD matrix. Solution of a PD matrix given right hand side. Pi value. Generalized inverse. Places characters inside a character array. Evaluate Poisson Distribution Function Evaluate Poisson Probability Function Machine address of a variable. Division of polynomials. Multiply two polynomials Solution of a polynomial. Inverse normal distribution. Probability of normal distribution. Bivariate probability of Nornal distribution. Product of elements of a vector. Q1 of a real*8 object. 13 14 Matrix Command Language Q3 QCOMPLEX QINT QNINT QREAL QR QRFAC QRSOLVE RANKER RCOND REAL R8TOR16 R16TOR8 REAL16 REC RECODE RN ROLLDOWN ROLLLEFT ROLLRIGHT ROLLUP RTOCH SEIGENVAL SEXTRACT SFAM SNGL SPACING SPECTRUM SUBSET SUBMATRIX SUM SUMCOLS SUMROWS SUMSQ SVD TIMEBASE TIMENOW TIMESTART TINY TDEN TO_RMATRIX TO_CMATRIX TO_RARRAY TO_CARRAY TO_VECTOR TO_ARRAY TPROB TRACE TRANSPOSE UPPERT VARIANCE VECTOR VFAM VOCAB VPA ZDOTC ZDOTU ZEROL ZEROU - Q3 of a real*8 object. Build complex*32 variable from real*16 inputs. Extract integer part of real*16 number Extract nearest integer part of real*16 number Obtain real*16 part of a complex*326 number. Obtain Cholesky R via QR method using LAPACK. Obtain Cholesky R via QR method. Solve OLS using QR. Index array that ranks a vector. 1 / Condition of a Matrix Obtain real*8 part of a complex*16 number. Convert Real*8 to Real*16 Convert Real*16 to Real*8 Input a Real*16 Variable Rectangular random number. Recode a real*8 or character*8 variable Normally distributed random number. Moves rows of a 2d object down. 
Moves cols of a 2d object left. Moves cols of a 2d object right. Moves rows of a 2d object up. Copies a real*8 variable into character*8. Eigenvalues of a symmetric matrix. Alias SEIG. Takes data out of a field. Creates a scalar object. Converts real*8 to real*4. Absolute spacing near a given number Returns spectrum of a 1d object. Subset 1d, 2d array, vector or matrix under a mask. Define a Submatrix Sum of elements. Sum of columns of an object. Sum of rows of an object. Sum of squared elements of an object. Singular value decomposition of an object. Obtains time base of an object. Time now in form hh:mm:ss Obtains time start of an object. Smallest number of type t distribution density. Convert Object to Row-Matrix Convert Object to Col-Matrix Convert Object to Row-Array Convert Object to Col-Array Convert Object to Vector Convert Object to Array t distribution probability. Trace of a matrix. Transpose of a matrix. Upper Triangle of matrix. Variance of an object. Create a vector. Convert a 1d array to a vector. List built in functions. Variable Precision Math calculation Conjugate product of two complex*16 objects. Product of two complex*16 objects. Zero lower triangle. Zero upper triangle.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Matrix Programming Language key words
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

CALL       - Call a subroutine
CONTINUE   - go to statement
DO         - Starts a do loop
DOWHILE    - Starts a dowhile loop
NEXT i     - End of a do loop
ENDDO      - End of a do loop
ENDDOWHILE - End of a dowhile loop
END        - End of a program, function or subroutine.
EXITDO     - Exit a DO loop
EXITIF     - Exit an IF statement
FOR        - Start a do loop
FORMULA    - Define a recursive formula.
GO TO      - Transfer statement
FUNCTION   - Beginning of a function.
IF( )      - Beginning of an IF structure
ENDIF      - End of an IF( )THEN structure
PROGRAM    - Beginning of a program
RETURN     - Next to last statement before end.
RETURN( )  - Returns the result of a function.
SOLVE      - Solve a recursive system.
SUBROUTINE - Beginning of subroutine.
WHERE( )   - Starts a where structure.

Within the B34S Display Manager, individual help is available on each command. Usually the help document shows an example. In addition, for each command an example that can be run from the Tasks menu is provided in the file matrix.mac. Users are encouraged to cut and paste the commands from these help documents and example files to create their custom programs. Full documentation for the matrix command can be obtained from the display manager or by running the command

b34sexec help=matrix; b34srun;

Since subroutine libraries and help libraries are text files, users can easily add examples and help entries from their own applications or build libraries of custom procedures. The help file for the schur command, which is shown next, provides an example of the online documentation that is available for all matrix command keywords:

SCHUR - Performs Schur decomposition

call schur(a,s,u); factors a real*8 matrix A such that A=U*S*transpose(U) and S is upper triangular. For complex*16 the equation is A=U*S*transpose(dconj(U)). U is an orthogonal matrix such that for real*8 u*transpose(u) = I. Eigenvalues of A are along the diagonal of S.

An optional calling sequence for real*8 is call schur(a,s,z,wr,wi); where wr and wi are the real and imaginary parts, respectively, of the computed eigenvalues in the same order that they appear on the diagonal of the output Schur form s. Complex conjugate pairs of eigenvalues will appear consecutively with the eigenvalue having the positive imaginary part first.

The optional calling sequence for complex*16 is call schur(a,s,z,w); where w contains the complex eigenvalues. The Schur decomposition can be performed on many real*8 and complex*16 matrices for which eigenvalues cannot be found. For detail see MATLAB manual page 4-36. The schur command uses the lapack version 3 routines dgees and zgees.
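The identities in the SCHUR help entry can be checked by hand for the real symmetric case, where the Schur form is diagonal with the eigenvalues on its diagonal. The following Python sketch is a hypothetical illustration (it is not B34S code and does not use the lapack routines the command actually calls); it verifies A = U*S*transpose(U) and U*transpose(U) = I for a 2 by 2 symmetric matrix:

```python
import math

# Hypothetical illustration (not B34S code): for a real symmetric matrix
# the Schur form S is diagonal with the eigenvalues of A on its diagonal,
# so A = U*S*U' and U*U' = I can be checked by hand in the 2 by 2 case.
A = [[2.0, 1.0], [1.0, 2.0]]                    # symmetric test matrix
s1, s2 = A[0][0] + A[0][1], A[0][0] - A[0][1]   # eigenvalues of [[a,b],[b,a]]
r = 1.0 / math.sqrt(2.0)
U = [[r, r], [r, -r]]                # orthonormal eigenvectors as columns
S = [[s1, 0.0], [0.0, s2]]           # diagonal Schur form

def matmul(X, Y):
    # 2 by 2 matrix product
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

back  = matmul(matmul(U, S), transpose(U))   # should recover A
ident = matmul(U, transpose(U))              # should be the identity
```

For nonsymmetric matrices S is merely upper triangular, which is why the schur command succeeds on problems where a full eigenvector decomposition is not available.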
Example:

b34sexec matrix;
* Example from MATLAB - General Matrix;
a=matrix(3,3: 6., 12., 19.,
         -9., -20., -33.,
          4.,  9.,  15.);
call schur(a,s,u);
call print(a,s,u);
is_ident=u*transpose(u);
is_a    =u*s*transpose(u);
* Positive Def. case ;
aa=transpose(a)*a;
call schur(aa,ss,uu);
ee=eigenval(aa);
call print(aa,ss,uu,ee);
* Expanded calls;
call schur(a,s,u,wr,wi);
call print('Real and Imag eigenvalues');
call tabulate(wr,wi);
* Testing Properties;
call print(is_a,is_ident);
* Random Problem ;
n=10;
a=rn(matrix(n,n:));
call schur(a,s,u);
call print(a,s,u);
is_ident=u*transpose(u);
is_a    =u*s*transpose(u);
call schur(a,s,u,wr,wi);
call print('Real and Imag eigenvalues');
call tabulate(wr,wi);
call print(is_a,is_ident);
* Complex case ;
a=matrix(3,3: 6., 12., 19.,
         -9., -20., -33.,
          4.,  9.,  15.);
ca=complex(a,2.*a);
call schur(ca,cs,cu,cw);
call print(ca,cs,cu,'Eigenvalues two ways', cw,eigenval(ca));
is_ca=cu*cs*transpose(dconj(cu));
call print(is_ca);
b34srun;

When run this example produces edited output:

B34S(r) Matrix Command. d/m/y 29/ 6/07. h:m:s 9:15:23.

=> * EXAMPLE FROM MATLAB - GENERAL MATRIX$
=> A=MATRIX(3,3: 6., 12., 19., -9., -20., -33., 4., 9., 15.)$
=> CALL SCHUR(A,S,U)$
=> CALL PRINT(A,S,U)$

A       = Matrix of 3 by 3 elements
         1           2           3
 1    6.00000     12.0000     19.0000
 2   -9.00000    -20.0000    -33.0000
 3    4.00000     9.00000     15.0000

S       = Matrix of 3 by 3 elements
         1           2           3
 1   -1.00000     20.7846    -44.6948
 2    0.00000     1.00000   -0.609557
 3    0.00000     0.00000     1.00000

U       = Matrix of 3 by 3 elements
         1              2              3
 1   -0.474100       0.664753       0.577350
 2    0.812743       0.782061E-01   0.577350
 3   -0.338643      -0.742959       0.577350

=> IS_IDENT=U*TRANSPOSE(U)$
=> IS_A    =U*S*TRANSPOSE(U)$
=> * POSITIVE DEF. CASE $
=> AA=TRANSPOSE(A)*A$
=> CALL SCHUR(AA,SS,UU)$
=> EE=EIGENVAL(AA)$
=> CALL PRINT(AA,SS,UU,EE)$

AA      = Matrix of 3 by 3 elements
         1           2           3
 1    133.000     288.000     471.000
 2    288.000     625.000     1023.00
 3    471.000     1023.00     1675.00

SS      = Matrix of 3 by 3 elements
         1              2              3
 1    2432.40       0.333414E-12  -0.852649E-13
 2    0.00000       0.599956      -0.810500E-13
 3    0.00000       0.00000        0.685245E-03

UU      = Matrix of 3 by 3 elements
         1              2              3
 1   -0.233460      -0.842147       0.486091
 2   -0.506875      -0.321212      -0.799938
 3   -0.829804       0.433141       0.351873

EE      = Complex Vector of 3 elements
 (  2432.    ,  0.000     ) ( 0.6000    ,  0.000     ) ( 0.6852E-03,  0.000     )

Note that the diagonal of SS contains the eigenvalues shown in EE

=> * EXPANDED CALLS$
=> CALL SCHUR(A,S,U,WR,WI)$
=> CALL PRINT('Real and Imag eigenvalues')$

Real and Imag eigenvalues

=> CALL TABULATE(WR,WI)$

 Obs      WR       WI
   1   -1.000    0.000
   2    1.000    0.000
   3    1.000    0.000

=> * TESTING PROPERTIES$
=> CALL PRINT(IS_A,IS_IDENT)$

A is recovered from the Schur factorization and U'U = I. To save space the random problem is not shown.

IS_A    = Matrix of 3 by 3 elements
         1           2           3
 1    6.00000     12.0000     19.0000
 2   -9.00000    -20.0000    -33.0000
 3    4.00000     9.00000     15.0000

IS_IDENT= Matrix of 3 by 3 elements
         1              2              3
 1    1.00000       0.555112E-16   0.555112E-16
 2   0.555112E-16    1.00000       0.555112E-16
 3   0.555112E-16   0.555112E-16    1.00000

16.2 Overview of Nonlinear Capability

The B34S matrix command contains a number of nonlinear commands that allow the user to specify a model in a 4th generation language while performing the calculation using compiled code. Chapter 11 discussed nonlinear least squares and a number of maximization/minimization examples.
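To fix ideas, the kind of problem the maximization/minimization commands address can be sketched compactly. The Python fragment below is a hypothetical stand-in (a golden-section search in one variable, not the IMSL algorithms the B34S commands actually call) for minimizing a user-supplied function; to maximize, negate the function:

```python
import math

# Hypothetical sketch (not B34S code, and not the IMSL zxmwd/duminf/du2pol
# routines the text names): golden-section search for the minimum of a
# user-supplied one-dimensional function on [a, b].
def golden_min(f, a, b, tol=1e-8):
    g = (math.sqrt(5.0) - 1.0) / 2.0        # inverse golden ratio
    c, d = b - g * (b - a), a + g * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):                     # minimum lies in [a, d]
            b, d = d, c
            c = b - g * (b - a)
        else:                               # minimum lies in [c, b]
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

# user-supplied "model": f(x) = (x - 2)^2 + 1, minimized at x = 2
xmin = golden_min(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
```

The B34S commands solve far more general constrained and multivariate versions of this problem, but the division of labor is the same: the user supplies the function, the compiled routine drives the search.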
In many cases a matrix command uses routines from the commercially available IMSL subroutine library, LINPACK, LAPACK, EISPACK or FFTPACK. For nonlinear modeling applications users of the stand-alone IMSL product would have to license a Fortran compiler, write the model and main program in Fortran, build routines to display the results and compile all code each time a model needed to be estimated. In contrast, the B34S implementation allows the user to specify the model in a 4th generation language and further process the results from within a general programming language. Optionally it is possible to view the solution progress from a GUI. Grouping the nonlinear capability of the matrix command by function (with some overlap) and showing the underlying routines used, there are:

Constrained maximization commands:

CMAXF1 - Constrained maximization of function using zxmwd.
CMAXF2 - Constrained maximization of function using dbconf/g.
CMAXF3 - Constrained maximization of function using db2pol.

Unconstrained maximization commands:

MAXF1 - Maximize a function using IMSL ZXMIN.
MAXF2 - Maximize a function using IMSL DUMINF/DUMING.
MAXF3 - Maximize a function using simplex method (DU2POL).

Linear and non-linear programming commands:

LPMAX   - Solve Linear Programming maximization problem.
LPMIN   - Solve Linear Programming minimization problem.
NLEQ    - Jointly solve a number of nonlinear equations.
NLPMIN1 - Nonlinear Programming fin. diff. grad. DN2CONF.
NLPMIN2 - Nonlinear Programming user supplied grad. DN2CONG.
NLPMIN3 - Nonlinear Programming user supplied grad. DN0ONF.

Nonlinear least squares and utility commands:

BGARCH    - Calculate function for a BGARCH model.
GARCH     - Calculate function for an ARCH/GARCH model.
GARCHEST  - Estimate an ARCH/GARCH Model
NLEQ      - Jointly solve a number of nonlinear equations.
NLLSQ     - Nonlinear Least Squares Estimation.
NL2SO     - Alternative Nonlinear Least Squares Estimation.
NLSTART   - Generate starting values for NL routines.
QPMIN     - Quadratic Programming.
SOLVEFREE -
Set frequency of freeing temp variables.

Integration of a user function commands:

DQDAG  - Integrate a function using Gauss-Kronrod rules
DQDNG  - Integrate a smooth function using a nonadaptive rule.
DQDAGI - Integrate a function over infinite/semi-infinite interval.
DQDAGP - Integrate a function with singularity points given
DQDAGS - Integrate a function with end point singularities
DQAND  - Multiple integration of a function

Spline and Related Commands:

ACEFIT     - Alternating Conditional Expectation Model Estimation
BSNAK      - Compute Not a Knot Sequence
BSOPK      - Compute optimal spline knot sequence
BSINT      - Compute 1-D spline interpolant given knots
BSINT2     - Compute 2-D spline interpolant given knots
BSINT3     - Compute 3-D spline interpolant given knots
BSDER      - Compute 1-D spline values/derivatives given knots
BSDER2     - Compute 2-D spline values/derivatives given knots
BSDER3     - Compute 3-D spline values/derivatives given knots
BSITG      - Compute 1-D spline integral given knots
BSITG2     - Compute 2-D spline integral given knots
BSITG3     - Compute 3-D spline integral given knots
CSPLINEFIT - Fit a 1 D Cubic Spline using alternative models
CSPLINE    - Calculate a cubic spline for 1 D data
CSPLINEVAL - Calculate spline value given spline
CSPLINEDER - Calculate spline derivative given spline value
CSPLINEITG - Calculate integral of a cubic spline
GAMFIT     - Generalized Additive Model Estimation
MARS       - Multivariate Adaptive Regression Spline Models
PISPLINE   - Pi Spline Nonlinear Model Building

While space limits a full discussion of each command, examples from within each group will be discussed briefly and illustrated using supplied problems in this chapter and in Chapter 11, where the optimization and NLLS capability was discussed in some detail. The strength of the nonlinear capability is that the user has great flexibility to specify the model in a B34S matrix program.
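As a cross-check on the integration commands just listed, the one-dimensional building block can be reproduced in a few lines. The sketch below is hypothetical Python (composite Simpson's rule, not the Gauss-Kronrod rules the dqdag family employs); because exp(-(x1**2 + x2**2 + x3**2)) factors over a cube, cubing the one-dimensional result reproduces the triple integral solved with dqand below:

```python
import math

# Hypothetical Python cross-check (composite Simpson's rule, not the
# Gauss-Kronrod rules used by the B34S integration commands).  The
# integrand exp(-(x1**2 + x2**2 + x3**2)) over a cube factors into three
# identical one-dimensional integrals, so the triple integral is a cube.
def simpson(f, a, b, n=2000):               # n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3.0

one_d  = simpson(lambda x: math.exp(-x * x), -3.0, 3.0)
triple = one_d ** 3          # approx 5.5679590 for the range (-3., 3.)
limit  = math.pi ** 1.5      # approx 5.5683280, the infinite-range limit
```

The multidimensional B34S commands do not require the integrand to factor; this sketch only exploits the special structure of this particular test problem.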
Once the model has been coded, the solution proceeds using the built-in command, which consists of compiled code.8 The user can optionally display the solution process on the screen. Some of the applications in this area are shown next.

Integration is an important topic and a number of commands are available. For example the command dqand provides solution of up to 20 integrals where the user specifies the model in a matrix language program. Consider the problem

   ∫[a,b] ∫[a,b] ∫[a,b] exp(-(x1**2 + x2**2 + x3**2)) dx1 dx2 dx3        (16.2-1)

where the ranges of integration are successively widened. The above problem can be solved with

b34sexec matrix;
* This is a big problem. Note maxsub 100000 ;
program test;
f=dexp(-1.*(x(1)*x(1)+x(2)*x(2)+x(3)*x(3)));
return;
end;
/$ We solve 6 problems – each with wider bounds.
/$ As constant => inf and => pi()**1.5
lowerv=array(3:);
upperv=array(3:);
x     =array(3:);
call print(test);
call echooff;
j=integers(3);
do i=1,6;
cc=dfloat(i)/2.0;
lowerv(j)=(-1.)*cc;
upperv(j)= cc;
call dqand(f x :name test
           :lower lowerv :upper upperv
           :errabs .0001 :errrel .001
           :maxsub 100000 :print);
call print('lower set as ',cc:);
call print('results      ',%result:);
call print('error        ',%error:);
enddo;
call print('Limit answer ',pi()**1.5 :);
b34srun;

8 In contrast to the B34S design where the nonlinear commands are built into the language, the MATLAB minimization commands fmin and fminbnd are totally written in the MATLAB 4th generation language. While the user can see what is being calculated, the cost is that as the model is solved the MATLAB parser must crack each statement in the command. This design substantially slows execution.

to produce answers for the range (-3., 3.) of:

Integration using DQAND
For Integral                1
Lower Integration value    -3.000000000000000
Upper Integration value     3.000000000000000
For Integral                2
Lower Integration value    -3.000000000000000
Upper Integration value     3.000000000000000
For Integral                3
Lower Integration value    -3.000000000000000
Upper Integration value     3.000000000000000
ERRABS set as               1.000000000000000E-04
ERRREL set as               1.000000000000000E-03
MAXSUB set as               100000
Result of Integration       5.567958983584796
Error estimate              3.054134012359100E-08
Limit answer                5.568327996831708

Since the integrand factors into three identical one-dimensional integrals, the exact answer for the range (-c, c) is (sqrt(pi)*erf(c))**3, which for c = 3 is 5.5679590, in agreement with the reported result; as c grows the value converges to pi**1.5.

Spline models can be used to fit a model to data where the underlying function is not known. Assume the function integral

   ∫[0,1] ∫[.5,1] ∫[0,.5] (x**3 + x*y*z) dx dy dz        (16.2-2)

which is very hard to evaluate. The setup

b34sexec matrix;
* Test Example from IMSL(10) ;
call echooff;
nxdata=21; nydata=6; nzdata=8;
kx=5; ky=2; kz=3;
i=integers(nxdata);
j=integers(nydata);
k=integers(nzdata);
xdata=dfloat(i-11)/10.;
ydata=dfloat(j-1)/5.;
zdata=dfloat(k-1)/dfloat(nzdata-1);
iimax=index(nxdata,nydata,nzdata:);
f=array(iimax:);
do ii=1,nxdata;
do jj=1,nydata;
do kk=1,nzdata;
ii3=index(nxdata,nydata,nzdata:ii,jj,kk);
f(ii3)=(xdata(ii)**3.) + (xdata(ii)*ydata(jj)*zdata(kk));
enddo;
enddo;
enddo;
xknot=bsnak(xdata,kx);
yknot=bsnak(ydata,ky);
zknot=bsnak(zdata,kz);
bscoef3=bsint3(xdata,ydata,zdata,f,xknot,yknot,zknot);
a=0.0; b=1.0; c=.5; d=1.0; e=0.0; ff=.5;
val=bsitg3(a,b,c,d,e,ff,xknot,yknot,zknot,bscoef3);
g =.5*(b**4.-a**4.);
h =(b-a)*(b+a);
ri=g*(d-c);
rj=.5*h*(d-c)*(d+c);
exact=.5*(ri*(ff-e)+.5*rj*(ff-e)*(ff+e));
error=val-exact;
call print('Test of bsitg3 ***********************':);
call print('Lower 1  = ',a:);
call print('Upper 1  = ',b:);
call print('Lower 2  = ',c:);
call print('Upper 2  = ',d:);
call print('Lower 3  = ',e:);
call print('Upper 3  = ',ff:);
call print('Integral = ',val:);
call print('Exact    = ',exact:);
call print('Error    = ',error:);
b34srun;

allows solution of the above three dimensional problem without explicit knowledge of the function. Note that for this test problem we generate the data, but as far as the bsitg3 command is concerned, the function is not known. What is happening in the solution is that splines are used to approximate the function, and from these splines the integral can be calculated. If the exact answer is known, the results can be tested. For this problem the answers were:

Test of bsitg3 ***********************
Lower 1  =  0.000000000000000E+00
Upper 1  =  1.000000000000000
Lower 2  =  0.5000000000000000
Upper 2  =  1.000000000000000
Lower 3  =  0.000000000000000E+00
Upper 3  =  0.5000000000000000
Integral =  8.593750000000001E-02
Exact    =  8.593750000000000E-02
Error    =  1.387778780781446E-17

More extensive problems involving higher dimensions where the function is not explicitly known can be solved with the mars, gamfit and pispline models, which are documented in Chapter 14 and can be run as procedures or as part of the matrix language.9 A simpler problem that can easily be seen is ∫[0,1] x**3 dx = .25. Its solution is found by:

9 Another option is the acefit command which will estimate an ACE model. The gamfit command estimates GAM models.
For further detail see chapter 14.

b34sexec matrix;
* Test Example from IMSL(10) ;
call echooff;
ndata=21;
korder=5;
i     =integers(ndata);
xdata =dfloat(i-11)/10.;
f     =xdata**3.;
xknot =bsnak(xdata,korder);
bscoef=bsint(xdata,f,xknot);
a     =0.0;
b     =1.0;
val   =bsitg(a,b,xknot,bscoef);
* fi(x)= x**4./4.;
exact =(b**4./4.)-(a**4./4.);
error=exact-val;
call print('Test of bsitg ***********************':);
call print('Lower    = ',a:);
call print('Upper    = ',b:);
call print('Integral = ',val:);
call print('Exact    = ',exact:);
call print('Error    = ',error:);
b34srun;

Edited output is:

Test of bsitg ***********************
Lower    =  0.000000000000000E+00
Upper    =  1.000000000000000
Integral =  0.2500000000000001
Exact    =  0.2500000000000000
Error    = -1.110223024625157E-16

In both of the above examples the data was simulated by evaluation of a function. In most cases, this is not possible. The power of the spline capability is that nonlinear models can be fit using few data points. Since a spline model cannot forecast outside the range of the x variable, it would appear that such models are of limited use. However, if a spline model is fit, then values can be interpolated and more observations can be generated. These observations can be used to fit a nonlinear model. Since Chapter 11 contains extensive examples for nonlinear least squares and maximization problems, these features will not be discussed further here. In the next sections we discuss the matrix command language.

16.3 Rules of the Matrix Language

While the B34S help facility is the place to go for detailed instructions, the basic structure of the matrix command can be illustrated by a number of examples and simple rules shown next.

1. Command Form. The matrix command begins with the statement

b34sexec matrix;

and ends with the statement

b34srun;

All commands are between these two statements, unless the matrix command is running in interactive manual mode under the Display Manager.
This "manual mode" allows only one-line commands to be specified.

2. Sentence Terminator. All matrix statements must end in $ or ;. For example: x=dsin(q); There is no continuation character needed and sentences can extend over many lines.

3. Assignment Issues. Mixed mode math is not allowed. For example, assuming x is real*8,

x=x*2;

is not allowed because x is real*8 and 2 is an integer*4. The reason mixed mode is not allowed is that the processor would not know what to do with the result. This design is in contrast to many languages that automatically create real*8 values. The correct form for the above statement is:

x=x*2.;

if real*8 results are desired or

x=idint(x)*2;

if you want an integer*4 result and x was real*8 before the command. The form

x=dint(x*2.);

truncates x*2. and places it in the real*8 variable x.

4. Structured Objects. Calculated structured objects can only be used on the right of an expression or in a subroutine call as input. For example if x is a 2-D object

mm=mean(x(,3));

calculates the mean of column 3 while

nn=mean(x(3,));

calculates the mean of row 3.

i=integers(2,30);
y(i)=x(i-1);

copies x elements from locations 1 through 29 into y locations 2 to 30. What is not allowed is

i=integers(1,29);
y(i+1)=x(i);

since it involves a calculated subscript on the left of the equals sign.

5. Data storage issues. Since structured objects repackage the data, they cannot be used for output from a subroutine or function. For example assume x is a 3 by 5 matrix. If we wanted to take the log of a row or column, the correct code is

x(2,)=dlog(x(2,));
x(,3)=dlog(x(,3));

The code

subroutine dd(x);
x=log(x);
return;
end;
call dd(x(2,));
call dd(x(,3));

will not work since rows and columns are repackaged into vectors which do not line up with the original storage of the matrix.
If a user function is to be used, then logic such as

b34sexec matrix;
x=matrix(3,3:1 2 3 4 5 6 7 8 9);
call print(x);
function dd(x);
yy=dlog(x);
return(yy);
end;
x(2,)=dd(x(2,));
x(,3)=dd(x(,3));
call print(x);
b34srun;

should be used. However the above code has a "hidden" bug that impacts the x(2,3) element. The reader should study what is happening. As a hint: the original x(2,3) term was 6, and the log of 6 = 1.7918. The value found in the x(2,3) position is .583179, or log(log(6)), because of the first replacement, which might not be what is intended.

6. Automatic Expansion. Structured objects can be used on the left of an assignment statement to load data. To add another element to x use

x=3.0;
x(2)=4.0;

while to place 0.0 in column 2 use

x=rn(matrix(4,4:));
x(,2)=0.;

To place 99. in row 3

x(3,)=99.;

while to set element 3,2 of x to .77 use

x(3,2)=.77;

The following code shows advanced structured index processing. This code is available in matrix.mac in overview_2

/$ Illustrates Structural Index Processing
b34sexec matrix;
x  =rn(matrix(6,6:));
y  =matrix(6,6:);
yy =matrix(6,6:);
z  =matrix(6,6:);
zz =matrix(6,6:);
i=integers(4,6);
j=integers(1,3);
xhold=x;
hold=x(,i);
call print('cols 4-6 x go to hold',x,hold);
y(i, )=xhold(j,);
call print('Rows 1-3 xhold in rows 4-6 y ',xhold,y);
y=y*0.0;
j2 =xhold(j,);
y(i, )=j2 ;
call print('Rows 1-3 xhold in rows 4-6 y ',xhold,y);
z(,i)=xhold(,j);
call print('cols 1-3 xhold in cols 4-6 z ',xhold,z);
j55 =xhold(,j);
z=z*0.0;
z(,i)=j55;
call print('cols 1-3 xhold in cols 4-6 z ',xhold,z);
yy=yy*0.0;
yy(i,)=xhold;
call print('rows 1-3 xhold in rows 4-6 yy',xhold,yy);
zz=zz*0.0;
do ii=1,3;
jj=ii+3;
zz(,jj)=xhold(ii,);
enddo;
/;
/; i=integers(4,6); j=integers(1,3);
call print('Note that zz(,i)= xhold(j,) will not work':);
call print('Testing zzalt(,i)= transpose(xhold(j,))':);
/; Use of Transpose speeds things up over do loop
zzalt=zz*0.0;
zzalt(,i)= transpose(xhold(j,)) ;
call
print('rows 1-3 xhold in cols 4-6 zz',xhold,zz,zzalt);
zz=zz*0.0;
zzalt=zz;
do ii=1,3;
jj=ii+3;
zz(jj,)=xhold(,ii);
enddo;
call print('Note that zz(i,)=xhold(,j) will not work':);
call print('Testing zzalt(i,)= transpose(xhold(,j))':);
zzalt(i,)=transpose(xhold(,j));
call print('cols 1-3 xhold in rows 4-6 zz',xhold,zz,zzalt);
oldx=rn(matrix(20,6:));
newx=   matrix(20,5:);
i=integers(4);
newx(,i)=oldx(,i);
call print('Col 1-4 in oldx goes to newx',oldx,newx);
oldx=rn(matrix(20,6:));
newx=   matrix(20,5:);
i=integers(4);
newx(1,i)=oldx(1,i);
call print('This puts the first element in col ',oldx,newx);
newx=newx*0.0;
newx(i,1)=oldx(i,1);
call print('This puts the first element in row ',oldx,newx);
newx=newx*0.0;
newx( ,i)=oldx( ,i);
call print('Whole col copied here',oldx,newx);
oldx=rn(matrix(10,5:));
newx=   matrix(20,5:);
i=integers(4);
newx(i,1)=oldx(i,1);
call print('This puts the first element in row ',oldx,newx);
newx=newx*0.0;
newx(i,)=oldx(i,);
call print('Whole row copied',oldx,newx);
* We subset a matrix here ;
a=rn(matrix(10,5:));
call print('Pull off rows 1-3, cols 2-4', a,a(integers(1,3),integers(2,4)));
b34srun;

The reader is invited to run this sample program and inspect the results. Structured index programming is compact and fast and should be used wherever possible. The do command is provided for the cases, hopefully few and far between, when structured index processing is not possible. In this example it is demonstrated that by use of the transpose after the structured extract, a do loop is not required. Structured index programming takes care but can achieve great gains by lowering the parser overhead implicit in a do loop.

7. Restrictions on the left hand side of an expression. Functions or math expressions are not allowed on the left hand side of an equation. Assume the user wants to load another row.
The command x(norows(x)+1,)=v; in the sequence x=matrix(3,3:1 2 3 4 5 6 7 8 9); v=vector(3:22 33 44); x(norows(x)+1,)=v; will not work. The correct way to proceed is: x=matrix(3,3:1 2 3 4 5 6 7 8 9); v=vector(3:22 33 44); n=norows(x)+1; x(n,)=v; to produce 1. 4. 7. 22. 2. 3. 5. 6. 8. 9. 33. 44. Chapter 16 29 Note that the matrix, array and vector commands automatically convert integers to real*8 in the one exception to rule 3 about mixed mode operations above. The command x(i+1)=value; will not work since there is a calculation implicit on the left. The correct code is: j=i+1; x(j)=value; Advanced code includes: b34sexec matrix display=col80medium; x=matrix(3,3:1 2 3 4 5 6 7 8 9); v=vector(:1 2 3 4 5 6 7 8 9); xx=matrix(3,3:v); /; Note that xx is saved by columns hence the elements /; in xx2 repack into a 9 by 1 vector of the columns of x /; xx3 is transpose(x) xx2=matrix(9,1:xx); xx3=matrix(3,3:xx2); call print(x,v,xx,xx2,xx3); b34srun; X = Matrix of 1 2 3 V 1 1.00000 4.00000 7.00000 3 XX = Matrix of 1 2 3 XX2 1 2 3 4 5 6 7 8 9 1 1.00000 4.00000 7.00000 = Matrix of 1 1.00000 4.00000 7.00000 2.00000 5.00000 8.00000 3.00000 6.00000 9.00000 3 2 2.00000 5.00000 8.00000 = Vector of 1.00000 6.00000 by elements 9 3 3.00000 6.00000 9.00000 elements 2.00000 7.00000 3.00000 8.00000 3 by 2 2.00000 5.00000 8.00000 9 3 4.00000 9.00000 elements 3 3.00000 6.00000 9.00000 by 1 elements 5.00000 30 Matrix Command Language XX3 = Matrix of 1 1.00000 2.00000 3.00000 1 2 3 3 by 2 4.00000 5.00000 6.00000 3 elements 3 7.00000 8.00000 9.00000 1-D and 2-D objects can be concatenated using catcol and catrow commands. If objects are of unequal length, missing data will be supplied. Examples files for catcol and catrow should be run for further detail. 8. Matrix/Vector Math vs. Array Math. Matrix and array math is supported. If x is a 3 by 3 matrix, the command ax=afam(x); will create a 3 by 3 array ax. If x is a 3 by 1 array. 
The command mx=vfam(x); will create a 3 by 1 matrix mx containing x. To convert x to a vector, column by column use vvnew=vector(:x); Array math is element by element math, while matrix math uses linear algebra rules. If v is a vector of 6 elements the command newv=afam(v)*afam(v); squares all elements while p=v*v; is the inner product or the sum of the elements squared. An important issue is how to handle matrix/vector addition. If A and B are both n by m matrices, the command c=a+b; creates the n by m matrix C where Ci , j Ai , j Bi , j . As Greene(2000,11) notes "matrices cannot be added unless they have the same dimensions, in which case they are said to be conformable for addition." If A and B were vectors of length n, then Ci Ai Bi . If A is a n by n matrix, the statement c=a+2.; creates C where Ci ,i Ai ,i 2. and Ci , j Ai , j for i j . If A were an 2-D array, then Ci , j Ai , j 2. If A was a 1-D object, then element by element math would be used. This convention is similar to SPEAKEASY and in contrast to MATLAB which for addition and subtraction of scalars handles things as if the objects were arrays. In B34S if a scalar is added or subtracted from a m by n matrix where m n , an error message is given. For vectors we have => VX=VECTOR(5:1 1 1 1 1)$ Chapter 16 => 31 CALL PRINT((VX+1.))$ Vector of 2.00000 5 2.00000 elements 2.00000 2.00000 2.00000 element by element operations. This is similar to MATLAB operations on n by 1 and 1 by n objects which are treated as if they were vectors. 9. Keywords as variable Names. Keywords should not be used as variable names. If they are user, the command with this name is "turned off." This can cause unpredictable results with user programs, subroutines and functions. User's keywords cannot conflict with user program, subroutine or function names since the users code is not loaded unless a statement of the form call load(name); is given. 10. Passing Arguments. 
Subroutines and functions allow passing arguments, which can be changed. Structured index arrays cannot be changed (see rule 5 above). For example:

call myprog(x,y);
y=myfunc(x,y);

A complete example:

b34sexec matrix;
subroutine test(a);
call print('In routine test A= ',a);
* Reset a;
call character(a,'This is a very long string');
return;
end;
/$ pass in character*8
call test('junk');
call character(jj,'some junk going in');
call print(jj);
/$ pass in a character*1 array
call test(jj);
call print(jj);
b34srun;

Special characters such as : and | are not allowed in user subroutine or function calls because of the difficulty of parsing these characters in the user routine. This restriction may change in future versions of the matrix command if there is demand.

11. Coding Assumptions. Statements such as:

y = x-z;

are allowed. Statements such as

y = -x+z;

will not work as intended. The error message will be "Cannot classify sentence Y ...". The command should be given as

y = -1.*x + z;

or better still

y = (-1.*x) + z;

A statement y = x**2; where x is real*8 will get a mixed-mode message and should be given as y = x**2.; Complex statements such as

yhat = b1*dexp(-b2*x) + b3*dexp(-(x-b4)**2./b5**2.) + b6*dexp(-(x-b7)**2./b8**2.);

will not work as intended and should have ( ) around the power expressions and -1.* in place of the leading minus signs:

yhat = b1*dexp(-1.0*b2*x) + b3*dexp(-1.0*((x-b4)**2.)/(b5**2.)) + b6*dexp(-1.0*((x-b7)**2.)/(b8**2.));

It is a good plan to use ( ) to make sure that what is calculated is what was intended.

Examples of matrix language statements: The statement y=dsin(x); is an analytic statement that creates the structured object y by taking the sin of the structured object x. The variable x can be a scalar, a 1-D object (array, vector) or a 2-D object (matrix, array).
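The advice in rule 11 to parenthesize is not specific to B34S. In Python, for example, unary minus also binds more loosely than the power operator, so the same class of surprise arises (a small Python illustration, not B34S code):

```python
# Why explicit parentheses pay off: in Python, unary minus binds more
# loosely than **, so -x**2 is parsed as -(x**2), not (-x)**2.
x = 3.0
a = -x ** 2.0          # parsed as -(x ** 2.0)  -> -9.0
b = (-x) ** 2.0        # -> 9.0
c = (-1.0 * x) ** 2.0  # spelling the sign out, as the rule recommends -> 9.0
```

Whatever the language, writing (-1.0*x) or (x**2.) leaves no room for the parser to choose an unintended grouping.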
The following code copies elements 5-10 of y to x(2),...,x(7):

i=integers(5,10);
j=i-3;
x(j)=y(i);

and is much faster than the scalar implementation

do j=2,7;
x(j)=y(j+3);
enddo;

which has high parse overhead.

12. Automatic Expansion of Variables – Some Cautionary Notes. The following code illustrates automatic expansion issues.

x(1)=10.;
x(2)=20.;

The array x contains elements 10. and 20. Warning! The commands

x(1)=10.;
x(2)=20;

produce an array of 0 20, since the statement x(2)=20; redefines the x array to be integer! This is an easy mistake to make, since computers do what we tell them to do very quickly! Statements such as

x(0) = 20.;
x(-1)= 20.;
x(1) = 20.;

all set element 1 of x to 20. The x(0) and x(-1) statements will generate a message warning the user.

13. Memory Management. Automatic expansion of arrays can cause the program to "waste" memory, since newer copies of the expanded variable will not fit into the old location. The matrix command will have to allocate a new space, which will leave a "hole" in memory. The command call compress; cannot be used to compress the workspace if it is given while in a user subroutine, function or program. In addition to reducing space requirements, prior allocation will substantially speed up execution. If memory problems are encountered, the command call names(all); can be used to see how the variables are saved in memory and whether more space is used as the calculation proceeds. For example, compare the following code

n=10;
x=array(n:);
call names(all);
do i=1,n;
x(i)=dfloat(i);
call names(all);
enddo;

with

n=10;
call names(all);
do i=1,n;
x(i)=dfloat(i);
call names(all);
enddo;

The first job will run faster and not use up memory. This job can be found in the file matrix.mac under MEMORY and should be run by users wanting to write efficient subroutines.
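Both habits recommended above — copying a block of elements in one statement rather than in a scalar loop, and creating the full array before filling it — have rough Python analogues (a sketch only, not B34S):

```python
# Vectorized copy, Python analogue: elements 5-10 of y (1-based, as in
# the text) go to x(2)..x(7); with 0-based slices that is x[1:7] = y[4:10].
y = [float(k) for k in range(1, 13)]   # y = 1., 2., ..., 12.
x = [0.0] * 12                         # preallocated, as the rule advises
x[1:7] = y[4:10]
# x[1:7] is now [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]; x[0] is still 0.0
```

The single slice assignment plays the role of x(j)=y(i); with integers(5,10), and preallocating x=[0.0]*12 avoids repeated reallocation, which is the Python counterpart of the automatic-expansion cost described under rule 13.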
An alternative is to use the solvefree command, as in

do i=1,2000;
call solvefree(:alttemp);
* many commands here ;
call solvefree(:cleantemp);
enddo;

The first call with :alttemp sets %%____-style temp variables in place of the default ##____ style. The :cleantemp command resets the temp style to ##____ and cleans all %%____ temps, leaving the ##____-style temps in place. If this capability is used carefully, substantial speed gains can be made, and in addition the maximum number of temps will not be reached. Use of this feature slows down processing and is usually not needed. The command call solvefree(:cleantemp2); cleans user temps at or above the current level. This can be useful within a nested call to clean work space. Many systems such as SPEAKEASY do automatic compression, which substantially slows execution, since the location of all variables must constantly be checked on the chance that they have moved. The matrix command releases temp variables after each line of code but does not do a compress unless told to do so. New temps are slotted into unused locations. The latter is not possible if objects are getting bigger during execution of a job. The dowhile loop usually is cycled many times and needs active memory management. An example is:

b34sexec matrix;
sum=0.0;
add=1.;
ccount=1.;
count=1.;
tol=.1e-6;
/$ outer dowhile does things 2 times
call outstring(2,2,'We sum until we can add nothing!!');
call outstring(2,4,'Tol set as ');
call outdouble(20,4,tol);
call echooff;
call solvefree(:alttemp);
dowhile(ccount.ge.1..and.ccount.le.3.);
sum=0.0;
add=1.;
count=1.;
dowhile(add.gt.tol);
oldsum=sum;
sum=oldsum+((1./count)**3.);
count=count+1.;
call outdouble(2,6,add);
add=sum-oldsum;
/$ This section cleans temps
if(dmod(count,10.).eq.0.)then;
call solvefree(:cleantemp);
call solvefree(:alttemp);
endif;
enddowhile;
ccount=ccount+1.;
call print('Sum was ',sum:);
call print('Count was ',count);
enddowhile;
b34srun;

14. Missing Data.
Missing data often cause problems. Consider the following code:

b34sexec matrix;
x=rn(array(10:));
lagx=lag(x,1);
y=x-(10.*lagx);
goody=goodrow(y);
call tabulate(x,lagx,y,goody);
b34srun;

Y will contain missing data in row 1. The variable goody will contain the 9 nonmissing observations.

15. Recursive Solutions. In many cases the solution to a problem requires recursive evaluation of an expression. While the use of recursive function calls is possible, it is not desirable, since there is great overhead in calling the function or subroutine over and over again. The do loop, while still slow, is approximately 100 times faster than a recursive function call. The test problem RECURSIVE in c:\b34slm\matrix.mac documents how slow the recursive function call and the do loop are for large problems. Another reason that a recursive function call is not recommended is that the stack must be saved. The best way to handle a recursive calculation is to use the solve statement to define the expression that has to be evaluated one observation at a time. If the expression contains multiple subexpressions that are the same, a formula can be defined and used in the solve statement. The formula and solve statements evaluate an expression over a range, one observation at a time. This is in contrast to the usual analytic expression, for which the right-hand side is evaluated completely before the copy is made. Unlike an expression, a formula or solve statement can refer to itself on the right. The block keyword determines the order in which the formulas are evaluated. If the expression in the solve statement does not contain duplicate code, it is faster not to define a formula. Examples of both approaches are given next. The first problem is a simple expression not requiring a formula.
The code

b34sexec matrix;
test=array(10:);
test(1)=.1;
b=.9;
solve(test=b*test(t-1) :range 2 norows(test) :block test);
call print(test);
b34srun;

works, but

test = b*lag(test,1);

will not get the "correct" answer, since the right-hand side is built before the copy is done. The formula statement requires use of the subscript t unless the variable is a scalar. The use of the formula and solve statements is illustrated below:

b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix ;
call loaddata;
double=array(norows(gasout):);
formula double = dlog(gasout(t)*2.);
call names;
call print(double);
test2=array(norows(gasout):);
solve(test2=test2(t-1)+double(t)+double(t-1) :range 2, norows(gasout) :block double);
call print(mean(test2));
b34srun;

The following two statements do the same thing but execute at different speeds:

do i=1,n;
x(i)=y(i)/2.;
enddo;

solve(x=y(t)/2. :range 1 n);

The formula and solve statements can be used to generate an AR(1) model. This is example solve7 in c:\b34slm\matrix.mac:

b34sexec matrix ;
* Generate ar(1) model;
g(1)=1.;
theta= .97;
vv = 10. ;
formula jj=g(t-1)*theta+vv*rn(1.);
solve(g=jj(t) :range 2 300 :block jj);
call graph(g :heading 'AR(1) process');
call print(g);
call autobj(g :print :nac 24 :npac 24 :nodif :autobuild );
b34srun;

Solve and formula statements cannot contain user functions. More detail on the solve and formula statements is given below. In RATS, unless do loops are used, recursive models are only allowed in the maximize command, and then only in a very restrictive form.10 The B34S implementation, while slower, allows more choices.

16. User-Defined Data Structures. The B34S matrix command allows users to build custom data types. The example below shows the structure PEOPLE consisting of a name field (PNAME), a SSN field (SSN), an age field (AGE), a race field (RACE) and an income field (INCOME).
The built-in function sextract( ) is used to take a field out of a structure, and the built-in subroutine isextract is used to place data in a structure. Both sextract and isextract allow a third argument that operates on a single element. The name sextract stands for "structure extract," while isextract is "inverse structure extract." Use of these commands is illustrated by:

b34sexec matrix;
people=namelist(pname,ssn,age,race,income);
pname =namelist(sue,joan,bob);
ssn   =array(:20,100,122);
age   =idint(array(:35,45,58));
race  =namelist(hisp,white,black);
income=array(:40000,35000,50000);
call tabulate(pname,ssn,age,race,income);
call print('This prints the age vector',sextract(people(3)));
call print('Second person',sextract(people(1),2), sextract(people(3),2));
* make everyone a year older ;
nage=age+1;
call isextract(people(3),nage);
call print(age);
* make first person 77 years old;
call isextract(people(3),77,1);
call print(age);
b34srun;

Data structures are very powerful and, in the hands of an expert programmer, can be made to bring order to complex problems.

17. Advanced Programming Concepts and Techniques for Large Problems. Programs such as SPEAKEASY and MATLAB, which are meant to be used interactively, have automatic workspace compression. As a result, a SPEAKEASY LINKULE programmer has to check for movement of defined objects any time an object is created or freed. In a design decision made to increase speed, B34S does not move the variables inside named storage unless told to do so. If a do loop terminates and the user is not in a SUBROUTINE, temp variables are freed. If a new temp variable is needed, B34S will try to place this variable in a free slot. If a variable is growing, this may not be possible. Hence it is good programming practice to create arrays explicitly and not rely on automatic variable expansion. In a subroutine call, a variable passed in is first copied to another location and set to the current level + 1. Thus a subroutine call has named-storage implications.
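For readers more familiar with general-purpose languages, the PEOPLE structure of rule 16 corresponds roughly to a dictionary of parallel lists in Python (an analogue only, not B34S; the field names simply follow the B34S example). The sextract and isextract operations become ordinary get and set operations:

```python
# Rough Python analogue of the PEOPLE structure: a dict of parallel lists.
people = {
    "pname":  ["sue", "joan", "bob"],
    "ssn":    [20, 100, 122],
    "age":    [35, 45, 58],
    "race":   ["hisp", "white", "black"],
    "income": [40000.0, 35000.0, 50000.0],
}

ages = people["age"]                       # like sextract(people(3))
people["age"] = [a + 1 for a in ages]      # make everyone a year older
people["age"][0] = 77                      # make the first person 77
# people["age"] is now [77, 46, 59]
```

The third argument of sextract/isextract, which addresses one element of a field, corresponds to the second subscript (people["age"][0]) here.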
The command call compress; will manually clean out all temp variables and compress the workspace. While this command takes time, in a large job it may be required to save time and space. Temp variables are named ##1 ... ##999999. If the peak number of temp variables exceeds 999999, then B34S has to reuse old names and as a result slows down, checking to see whether a name is currently in use. A call to compress will reset the temp variable counter as well as free up space. If compress is called from a place where it cannot run, say in a do loop or in a subroutine, program or function, then it will not run, and no message will be given. The matrix command termination message gives space used, peak space used, and peak and current temp-number usage. Users can monitor their programs with these measures to optimize performance. In the opinion of the developer, the B34S matrix command do loop is too slow. The problem is that the do loop starts to run without knowing its ending point, because it is supported at the lowest programming level. In contrast, SPEAKEASY requires that the user place a do loop only in a subroutine, program or function, where the loop end is known in theory. Ways to increase do loop speed are high on the "to do" list. Faster CPUs, better compilers or better chip design may be the answer. The Lahey LF95 compiler appears to make faster do loops than the older Lahey LF90 compiler. This suggests that the compiler's management of the cache may be part of what is slowing the do loop down. The test problem solve6 in c:\b34slm\matrix.mac illustrates some of these issues. Times and gains from the solve statement vary with the compiler used to build B34S. Columns 1 and 2 were run on the same machine (400 MHz) with the same source code. The Lahey LF95 compiler was a major upgrade over the older LF90.

10 In RATS version 6.30 this restriction seems to have been somewhat lifted.
Column 3 shows the same problem, with a solvefree call added to the do loop, run on a 1 GHz machine running LF95 5.6g. In this run the source code was also improved. The speed-up exceeds the chip gain of 2.5 (1000/400) and can be attributed to a combination of compiler improvements, source code improvements and chip design improvements.

Compiler        SOLVE time    DO time    Gain of SOLVE
LF90 4.50i        9.718        41.69         4.3897
LF95 5.5b         9.22         13.73         1.49
LF95 5.6g         1.3018        1.422        1.0932

In summary, LF90 appears to make a very slow do loop, while LF95 is faster. In simple equations the formula and solve commands are useful. With large, complex sequences of commands, the do loop cost may have to be "eaten" by the user, since it is relatively low in comparison to the cost of parsing the expression. Speed can be increased by using variables in place of constants, because at parse time all scalars are made temps. Creating temps outside the loop speeds things up. The following four examples show code of various speeds:

* slow code;
do i=1,1000;
x(i)=x(i)*2.;
enddo;

* better code;
two=2.0;
do i=1,1000;
x(i)=x(i)*two;
enddo;

* vectorized code;
i=integers(1,1000);
x=x(i)*2.;

* compact vectorized code;
x=x(integers(1,1000))*2.;

If all elements need to be changed, the fastest code is x=x*2.; In the vectorized examples the parse time is the same whether there are 10 elements in x or 10,000,000. For speed gains from the use of masks, see # 19 below. Since B34S can create, compile and execute Fortran programs, for complex calculations a branch to Fortran is always an option. The larger the dataset, the smaller the relative overhead cost. The fortran example in matrix.mac illustrates dynamically building, compiling and calling a Fortran program from the matrix command.
b34sexec matrix;
call open(70,'_test.f');
call rewind(70);
/$          1234567890
call character(test,
"      write(6,*)'This is a test # 2'"
"      n=1000                        "
"      write(6,*)n                   "
"      do i=1,n                      "
"      write(6,*) sin(float(i))      "
"      enddo                         "
"      stop                          "
"      end                           ");
call write(test,70);
call close(70);
/$ lf95   is Lahey Compiler
/$ g77    is Linux Compiler
/$ fortcl is script to run Lahey LF95 on Unix to link libs
call dodos('lf95 _test.f');
* call dounix('g77 _test.f -o_test');
call dounix('lf95 _test.f -o_test');
* call dounix('fortcl _test.f -o_test');
call dodos('_test > testout':);
call dounix('./_test > testout':);
call open(71,'testout');
call character(test2,'                              ');
call read(test2,71);
call print(test2);
testd=0.0;
n=0;
call read(n,71);
testd=array(n:);
call read(testd,71);
call print(testd);
call close(71);
call dodos('erase testout');
call dodos('erase _test.f');
call dounix('rm testout');
call dounix('rm _test.f');
b34srun;

A substantially more complex example, using a GARCH model, is shown next. At issue is how to treat the first unobserved second-moment observation. Three problems are run. The garchest command does not set this value to 0.0. The Fortran implementation does set the value to 0.0 and matches RATS output 100%. The Fortran implementation is orders of magnitude slower, but it shows the user's ability to take total control of what is being maximized.

/$ Tests RATS vs GARCHEST vs FORTRAN
/$ In the FORTRAN SETUP see line arch(1)=0.0
/$ If line is commented out     => GARCHEST = FORTRAN
/$ If line is not commented out    FORTRAN  = RATS
/$ This illustrates the effect of starting values!!!!!!
/$ Also illustrates Fortran as a viable alternative when there
/$ are very special models to be run that are recursive in
/$ nature
b34sexec options ginclude('b34sdata.mac') member(lee4); b34srun;
%b34slet dorats=1;
%b34slet dob34s1=1;   /$ Using garchest
%b34slet dob34s2=1;   /$ Using Fortran
/$ **********************************************************
%b34sif(&dob34s1.ne.0)%then;
b34sexec matrix ;
call loaddata ;
* The data has been generated by GAUSS by following settings $
* a1   = GMA = 0.09                                          $
* b1_n = GAR = 0.5 ( When Negative)                          $
* b1   = GAR = 0.01                                          $
call echooff ;
maxlag=0    ;
y=doo1      ;
y=y-mean(y) ;
v=variance(y) ;
arch=array(norows(y):) + dsqrt(v);
* GARCH on a TGARCH Model ;
call garchest(res,arch,y,func,maxlag,n
     :ngar 1 :garparms array(:.1)
     :ngma 1 :gmaparms array(:.1)
     :maxit 2000 :maxfun 2000 :maxg 2000
     :steptol .1d-14
     :cparms array(2:.1,.1)
     :print );
b34srun;
%b34sendif;
/$ Fortran
%b34sif(&dob34s2.ne.0)%then;
b34sexec matrix ;
call loaddata ;
* The data has been generated by GAUSS by following settings $
* a1   = GMA = 0.09                                          $
* b1_n = GAR = 0.5 ( When Negative)                          $
* b1   = GAR = 0.01                                          $
* call echooff ;
/$ Setup fortran
call open(70,'_test.f');
call rewind(70);
/$ We now save the Fortran Program in a Character object
/$ Will get overflows
call character(fortran,
/$23456789012345678901234567890
"      implicit real*8(a-h,o-z)        "
"      parameter(nn=10000)             "
"      dimension data1(nn)             "
"      dimension res1(nn)              "
"      dimension res2(nn)              "
"      dimension parm(100)             "
"      call dcopy(nn,0.0d+00,0,data1,1)"
"      call dcopy(nn,0.0d+00,0,res2 ,1)"
"      open(unit=8,file='data.dat')    "
"      open(unit=9,file='tdata.dat')   "
"      read(8,*)nob                    "
"      read(8,*)(data1(ii),ii=1,nob)   "
"      read(9,*)npar                   "
"      read(9,*)(parm(ii),ii=1,npar)   "
"      read(9,*) res2(1)               "
"      close(unit=9)                   "
"                                      "
"      do i=1,nob                      "
"      res1(i)=data1(i)-parm(3)        "
"      enddo                           "
"                                      "
"      func=0.0d+00                    "
"      do i=2,nob                      "
"      res2(i) =parm(1)+(parm(2)* res2(i-1) ) +"
"     *         (parm(4)*(res1(i-1)**2) )     "
"      if(dabs(res2(i)).le.dmach(3))then      "
"      func= 1.d+40                           "
"      go to 100                              "
"      endif                                  "
" " " " " " " " " " func=func+(dlog(dabs(res2(i))))+ * ((res1(i)**2)/res2(i)) enddo " func=-.5d+00*func " 100 continue " close(unit=8) " open(unit=8,file='testout') " write(8,fmt='(e25.16)')func " close(unit=8) " stop " end "); /$ Fortran Object written here call write(fortran,70); call close(70); maxlag=0 ; " " " " " " 42 Matrix Command Language y=doo1 y=y-mean(y) ; ; * compile fortran and save data; /$ lf95 is Lahey Compiler /$ g77 is Linux Compiler /$ fortcl is script to run Lahey LF95 on Unix to link libs call dodos('lf95 _test.f'); * call dounix('g77 _test.f -o_test'); * call dounix('lf95 _test.f -o_test'); call dounix('fortcl _test.f -o_test'); call open(72,'data.dat'); call rewind(72); call write(norows(y),72); call write(y,72,'(3e25.16)'); call close(72); v=variance(y) ; arch=array(norows(y):) + dsqrt(v); i=2; j=norows(y); count=0.0; call echooff; program test; call open(72,'tdata.dat'); call rewind(72); npar=4; call write(npar,72); call write(parm,72,'(e25.16)'); /$ /$ If below line is commented out => GARCHEST = FORTRAN /$ If below line is not commented out FORTRAN = RATS /$ arch(1)=0.0d+00 ; call write(arch(1),72,'(e25.16)'); call close(72); call dodos('_test'); call dounix('./_test '); call open(71,'testout'); func=0.0; call read(func,71); call close(71); count=count+1.0; call outdouble(10,5 ,func); call outdouble(10,6 ,count); call outdouble(10,7, parm(1)); call outdouble(10,8, parm(2)); call outdouble(10,9, parm(3)); call outdouble(10,10,parm(4)); return; end; ll uu =array(4: =array(4: -.1e+10, .1e-10,.1e-10,.1e-10); .1e+10, .1e+10,.1e+10,.1e+10); * parm=array(:.0001 .0001 .0001 .0001); * parm(1)=v; * parm(3)=mean(y); rvec=array(4: .1 .1, .1, .1); Chapter 16 parm=rvec; * call names(all); call cmaxf2(func :name test :parms parm :ivalue rvec :maxit 2000 :maxfun 2000 :maxg 2000 :lower ll :upper uu :print); *call dodos('erase testout'); call dodos('erase _test.exe'); *call dounix('rm testout'); call dounix('rm _test'); b34srun; %b34sendif; 
%b34sif(&dorats.ne.0)%then;
b34sexec options open('rats.dat') unit(28) disp=unknown$ b34srun$
b34sexec options open('rats.in')  unit(29) disp=unknown$ b34srun$
b34sexec options clean(28)$ b34srun$
b34sexec options clean(29)$ b34srun$
b34sexec pgmcall$
rats passasts
pcomments('* ',
'* Data passed from B34S(r) system to RATS',
'* ') $
pgmcards$
* The data has been generated by GAUSS by following settings
* a1   = GMA = 0.09
* b1_n = GAR = 0.5 ( When Negative)
* b1   = GAR = 0.01
compute gstart=2,gend=1000
declare series u  ;* Residuals
declare series h  ;* Variances
declare series s  ;* SD
*
set rt = doo1
set h  = 0.0
nonlin(parmset=base) p0 a0 a1 b1
nonlin(parmset=constraint) a1>=0.0 b1>=0.0
* GARCH ************ Not correct model
frml at   = rt(t)-p0
frml g1   = a0 + a1*at(t-1)**2 + b1*h(t-1)
frml logl = -.5*log(h(t)=g1(t))-.5*at(t)**2/h(t)
smpl 2 1000
compute p0 = 0.1
compute a0 = 0.1, a1 = 0.1, b1 =0.1
*
* maximize(parmset=base+constraint,method=simplex, $
*    recursive,iterations=100) logl
maximize(parmset=base+constraint,method=bhhh, $
   recursive,iterations=10000) logl
b34sreturn;
b34srun;
b34sexec options close(28)$ b34srun$
b34sexec options close(29)$ b34srun$
b34sexec options
/$ dodos('start /w /r rats386 rats.in rats.out')
dodos('start /w /r rats32s rats.in /run')
dounix('rats rats.in rats.out')$ b34srun$
b34sexec options npageout
writeout('output from rats',' ',' ')
copyfout('rats.out')
dodos('erase rats.in','erase rats.out','erase rats.dat')
dounix('rm rats.in','rm rats.out','rm rats.dat')
$ b34srun
%b34sendif;

Edited output is shown next (reformatted for readability):

B34S 8.11C  (D:M:Y) 1/ 7/07  (H:M:S) 8: 3:34   DATA STEP   TGARCH GMA .09 GAR2 .5 GAR .01   PAGE 1

Variable    #   Cases   Mean               Std Deviation   Variance        Maximum         Minimum
DOO1        1   1000   -0.8631441630E-02   0.3606170368    0.1300446473    1.217023800    -1.143099300
DOO2        2   1000    0.1027493796E-01   0.3416024362    0.1166922244    1.167096100    -1.045661400
DOO3        3   1000    0.1702775142E-02   0.3401124651    0.1156764889    1.246665600    -1.053053800
DOO4        4   1000   -0.1313232100E-01   0.3612532834    0.1305039348    1.152290800    -1.303597500
DOO5        5   1000    0.2912316775E-01   0.3418287186    0.1168468729    1.128081900    -1.161346400
DOO6        6   1000   -0.5975311198E-02   0.3665673422    0.1343716164    1.521247000    -1.480897300
DOO7        7   1000    0.3427857886E-02   0.3380034576    0.1142463373    1.096066300    -0.9579734600
DOO8        8   1000   -0.1170972201E-01   0.3630108981    0.1317769122    1.593915500    -1.135067800
DOO9        9   1000    0.1179714228E-01   0.3544133951    0.1256088546    1.172364000    -1.100395100
DO10       10   1000    0.8936612970E-02   0.3380290307    0.1142636256    1.225441200    -1.297209500
CONSTANT   11   1000    1.000000000        0.000000000     0.000000000     1.000000000     1.000000000

Number of observations in data file   1000
Current missing variable code         1.000000000000000E+31

B34S(r) Matrix Command. d/m/y 1/ 7/07. h:m:s 8: 3:34.

=> CALL LOADDATA $
=> * THE DATA HAS BEEN GENERATED BY GAUSS BY FOLLOWING SETTINGS $
=> * A1   = GMA = 0.09 $
=> * B1_N = GAR = 0.5 ( WHEN NEGATIVE) $
=> * B1   = GAR = 0.01 $
=> CALL ECHOOFF $

GARCH/ARCH Model Estimated using DB2ONF Routine
Constrained Maximum Likelihood Estimation using CMAXF2 Command
Finite-difference Gradiant

Model Estimated
res1(t)=y(t)-cons(1)-ar(1)*y(t-1) -... -ma(1)*res1(t-1) -...
res2(t)=     cons(2)+gar(1)*res2(t-1) +... +gma(1)*(res1(t-1)**2)+...
where: gar(i) and gma(i) ge 0.0
LF =-.5*sum((ln(res2(t))+res1(t)**2/res2(t)))

Final Functional Value             530.5635346303904
# of parameters                    4
# of good digits in function       15
# of iterations                    11
# of function evaluations          23
# of gradiant evaluations          13
Scaled Gradient Tolerance          6.055454452393343E-06
Scaled Step Tolerance              1.000000000000000E-15
Relative Function Tolerance        3.666852862501036E-11
False Convergence Tolerance        2.220446049250313E-14
Maximum allowable step size        2000.000000000000
Size of Initial Trust region      -1.000000000000000
# of terms dropped in ML           0
1 / Condition of Hessian Matrix    5.340454918178504E-04

#  Name     order  Parm. Est.       SE               t-stat
1  GAR        1    0.21526580       0.15889283       1.3547861
2  GMA        1    0.15494322       0.43788030E-01   3.5384834
3  CONS_1     0    0.11895044E-01   0.10240513E-01   1.1615672
4  CONS_2     0    0.81870430E-01   0.19816063E-01   4.1315184

SE calculated as sqrt |diagonal(inv(%hessian))|
Order of Parms: AR, MA, GAR, GMA, MU, vd, Const 1&2

Hessian Matrix
        1          2          3          4
1    947.739    677.515   -31.3699    7092.35
2    711.220    1045.35   -698.983    4863.30
3    29.0281   -669.541    10146.9    111.143
4    7063.23    5153.42    547.320    55246.5

Gradiant Vector
 0.296595E-03  0.370295E-03 -0.188742E-03  0.202584E-02

Lower vector
 0.100000E-16  0.100000E-16 -0.100000E+33  0.100000E-16
Upper vector
 0.100000E+33  0.100000E+33  0.100000E+33  0.100000E+33

B34S Matrix Command Ending. Last Command reached.

Space available in allocator  11869781, peak space used   29977
Number variables used               68, peak number used      69
Number temp variables used          29, # user temp clean       0

B34S(r) Matrix Command. d/m/y 1/ 7/07. h:m:s 8: 3:35.
=> CALL LOADDATA $
=> * THE DATA HAS BEEN GENERATED BY GAUSS BY FOLLOWING SETTINGS $
=> * A1   = GMA = 0.09 $
=> * B1_N = GAR = 0.5 ( WHEN NEGATIVE) $
=> * B1   = GAR = 0.01 $
=> * CALL ECHOOFF $
=> CALL OPEN(70,'_test.f')$
=> CALL REWIND(70)$
=> CALL CHARACTER(FORTRAN,
=> "      implicit real*8(a-h,o-z)        "
=> "      parameter(nn=10000)             "
=> "      dimension data1(nn)             "
=> "      dimension res1(nn)              "
=> "      dimension res2(nn)              "
=> "      dimension parm(100)             "
=> "      call dcopy(nn,0.0d+00,0,data1,1)"
=> "      call dcopy(nn,0.0d+00,0,res2 ,1)"
=> "      open(unit=8,file='data.dat')    "
=> "      open(unit=9,file='tdata.dat')   "
=> "      read(8,*)nob                    "
=> "      read(8,*)(data1(ii),ii=1,nob)   "
=> "      read(9,*)npar                   "
=> "      read(9,*)(parm(ii),ii=1,npar)   "
=> "      read(9,*) res2(1)               "
=> "      close(unit=9)                   "
=> "                                      "
=> "      do i=1,nob                      "
=> "      res1(i)=data1(i)-parm(3)        "
=> "      enddo                           "
=> "                                      "
=> "      func=0.0d+00                    "
=> "      do i=2,nob                      "
=> "      res2(i) =parm(1)+(parm(2)* res2(i-1) ) +"
=> "     *         (parm(4)*(res1(i-1)**2) )     "
=> "      if(dabs(res2(i)).le.dmach(3))then      "
=> "      func= 1.d+40                           "
=> "      go to 100                              "
=> "      endif                                  "
=> "      func=func+(dlog(dabs(res2(i))))+       "
=> "     *     ((res1(i)**2)/res2(i))            "
=> "      enddo                                  "
=> "      func=-.5d+00*func                      "
=> "  100 continue                               "
=> "      close(unit=8)                          "
=> "      open(unit=8,file='testout')            "
=> "      write(8,fmt='(e25.16)')func            "
=> "      close(unit=8)                          "
=> "      stop                                   "
=> "      end                                    ")$
=> CALL WRITE(FORTRAN,70)$
=> CALL CLOSE(70)$
=> MAXLAG=0 $
=> Y=DOO1 $
=> Y=Y-MEAN(Y) $
=> * COMPILE FORTRAN AND SAVE DATA$
=> CALL DODOS('lf95 _test.f')$
=> * CALL DOUNIX('G77 _TEST.F -O_TEST')$
=> * CALL DOUNIX('LF95 _TEST.F -O_TEST')$
=> CALL DOUNIX('fortcl _test.f -o_test')$
=> CALL OPEN(72,'data.dat')$
=> CALL REWIND(72)$
=> CALL WRITE(NOROWS(Y),72)$
=> CALL WRITE(Y,72,'(3e25.16)')$
=> CALL CLOSE(72)$
=> V=VARIANCE(Y) $
=> ARCH=ARRAY(NOROWS(Y):) + DSQRT(V)$
=> I=2$
=> J=NOROWS(Y)$
=> COUNT=0.0$
=> CALL ECHOOFF$

Constrained Maximum Likelihood Estimation using CMAXF2 Command

Final Functional Value             530.5828775439368
# of parameters                    4
# of good digits in function       15
# of iterations                    28
# of function evaluations          38
# of gradiant evaluations          30
Scaled Gradient Tolerance          6.055454452393343E-06
Scaled Step Tolerance              3.666852862501036E-11
Relative Function Tolerance        3.666852862501036E-11
False Convergence Tolerance        2.220446049250313E-14
Maximum allowable step size        2000.000000000000
Size of Initial Trust region      -1.000000000000000
1 / Cond. of Hessian Matrix        4.118214520668572E-04

#  Name       Coefficient       Standard Error    T Value
1  BETA___1   0.74116830E-01    0.20221217E-01    3.6653001
2  BETA___2   0.28143495        0.17082386        1.6475154
3  BETA___3   0.11717038E-01    0.11319858E-01    1.0350870
4  BETA___4   0.14761371        0.46375692E-01    3.1829975

SE calculated as sqrt |diagonal(inv(%hessian))|

Hessian Matrix
        1          2          3          4
1    65640.6    8265.13    1262.64    6204.13
2    8277.36    1086.36    115.630    858.414
3    1376.24    147.313    8253.82   -388.923
4    6230.55    865.144   -369.582    1216.54

Gradiant Vector
 0.179395E-02  0.248061E-03 -0.844847E-04  0.213908E-03

Lower vector
-0.100000E+10  0.100000E-10  0.100000E-10  0.100000E-10
Upper vector
 0.100000E+10  0.100000E+10  0.100000E+10  0.100000E+10

B34S Matrix Command Ending. Last Command reached.
Space available in allocator  11868953, peak space used   28116
Number variables used               58, peak number used      60
Number temp variables used       10326, # user temp clean       0

B34S 8.11C  (D:M:Y) 1/ 7/07  (H:M:S) 8: 3:56   PGMCALL STEP   TGARCH GMA .09 GAR2 .5 GAR .01   PAGE 2

output from rats

*
* Data passed from B34S(r) system to RATS
*
CALENDAR(IRREGULAR)
ALLOCATE 1000
OPEN DATA rats.dat
DATA(FORMAT=FREE,ORG=OBS, $
 MISSING= 0.1000000000000000E+32 ) / $
 DOO1 $
 DOO2 $
 DOO3 $
 DOO4 $
 DOO5 $
 DOO6 $
 DOO7 $
 DOO8 $
 DOO9 $
 DO10 $
 CONSTANT
SET TREND = T
TABLE

Series   Obs     Mean         Std Error    Minimum      Maximum
DOO1     1000    -0.008631    0.360617    -1.143099     1.217024
DOO2     1000     0.010275    0.341602    -1.045661     1.167096
DOO3     1000     0.001703    0.340112    -1.053054     1.246666
DOO4     1000    -0.013132    0.361253    -1.303597     1.152291
DOO5     1000     0.029123    0.341829    -1.161346     1.128082
DOO6     1000    -0.005975    0.366567    -1.480897     1.521247
DOO7     1000     0.003428    0.338003    -0.957973     1.096066
DOO8     1000    -0.011710    0.363011    -1.135068     1.593916
DOO9     1000     0.011797    0.354413    -1.100395     1.172364
DO10     1000     0.008937    0.338029    -1.297209     1.225441
TREND    1000   500.500000  288.819436     1.000000  1000.000000

* The data has been generated by GAUSS by following settings
* a1   = GMA = 0.09
* b1_n = GAR = 0.5 ( When Negative)
* b1   = GAR = 0.01
compute gstart=2,gend=1000
declare series u  ;* Residuals
declare series h  ;* Variances
declare series s  ;* SD
*
set rt = doo1
set h  = 0.0
nonlin(parmset=base) p0 a0 a1 b1
nonlin(parmset=constraint) a1>=0.0 b1>=0.0
* GARCH ************ Not correct model
frml at   = rt(t)-p0
frml g1   = a0 + a1*at(t-1)**2 + b1*h(t-1)
frml logl = -.5*log(h(t)=g1(t))-.5*at(t)**2/h(t)
smpl 2 1000
compute p0 = 0.1
compute a0 = 0.1, a1 = 0.1, b1 =0.1
*
* maximize(parmset=base+constraint,method=simplex, $
*    recursive,iterations=100) logl
maximize(parmset=base+constraint,method=bhhh, $
   recursive,iterations=10000) logl

MAXIMIZE - Estimation by BHHH
Convergence in 11 Iterations.
Final criterion was  0.0000017 <= 0.0000100
Usable Observations      999
Function Value           530.58287754

   Variable     Coeff           Std Error       T-Stat      Signif
*******************************************************************************
1. P0           0.0030855387    0.0113778017    0.27119     0.78624540
2. A0           0.0741165760    0.0202703454    3.65640     0.00025578
3. A1           0.1476142210    0.0517016977    2.85511     0.00430214
4. B1           0.2814353360    0.1738216127    1.61910     0.10542480

Note that for the RATS and Fortran implementations the function value was 530.582877, while the garchest value was 530.5635346303904. What is most surprising is the effect on the parameters, which are shown below for garchest. (Note that p0 = CONS_1, a0 = CONS_2, a1 = GMA and b1 = GAR.) This example shows that one observation out of 1000 can make an important difference. It also shows the ability of the user to take 100% control of the function being maximized.

#  Name     order  Parm. Est.       SE               t-stat
1  GAR        1    0.21526580       0.15889283       1.3547861
2  GMA        1    0.15494322       0.43788030E-01   3.5384834
3  CONS_1     0    0.11895044E-01   0.10240513E-01   1.1615672
4  CONS_2     0    0.81870430E-01   0.19816063E-01   4.1315184

18. Termination Issues. Do loop and if statement terminations must be hit. If this is not done, the maximum if-statement limit or do-statement limit can be exceeded, depending on program logic. This "limitation" comes from allowing if statements and do loops outside programs. In Fortran, the complete do loop or if statement is known to the compiler when the executable is built. In an interpreted language such as the matrix command, the command parser does not know about a statement it has not read yet. Within a built-in command such as olsq, the possible logical paths are completely known at compile time. Remark: A human is a curious mixture of compiled and interpreted code. If one drinks too much, it can be predicted what will happen in terms of loss of coordination, etc. In this sense the body knows what will occur, given an input.
Free will, in contrast to predestination, implies an interpretative structure, where when one hits a branch (become an economist) one will never know what would have occurred had one taken another path. As an example of interpreted code where the end is never seen, consider

loop continue;
if(dabs(z1-z2).gt.1.d-13)then;
   z2=z1;
   z1=dlog(z1)+c;
   go to loop;
   endif;

which never reaches the endif;. The B34S parser will not know the position of this statement, and the maximum if statement limit could be hit if the if structure is executed many times. A better approach is not to use an if structure in this situation. Better code is:

loop continue;
if(dabs(z1-z2).le.1.d-13)go to nextstep;
z2=z1;
z1=dlog(z1)+c;
go to loop;
nextstep continue;

19. Mask Issues. Assume an array x where for x < 0 we want y=x**2, while for x >= 0.0 we want y=2*x. A slow but very clear way to do this would be:

do i=1,norows(x);
if(x(i) .lt. 0.0)y(i)=x(i)**2.;
if(x(i) .ge. 0.0)y(i)=x(i)*2. ;
enddo;

The larger the x array, the more parsing is required, because the do loop cycles more times. A vectorized way to do the same calculation is to define two masks, where a mask = 0.0 if the logical expression is false and = 1.0 if it is true. Faster code would be

mask1= x .lt. 0.0 ;
mask2= x .ge. 0.0 ;
y= mask1*(x**2.0) + mask2*(x*2.0);

Compact fast code would be

y= (x .lt. 0.0)*(x**2.0) + (x .ge. 0.0 )*(x*2.0);

Complete problem:

b34sexec matrix;
call print('If X GE 0.0 y=20*x. Else y=2*x':);
x=rn(array(20:));
mask1= x .lt. 0.0 ;
mask2= x .ge. 0.0 ;
y= mask1*(x*2.0) + mask2*(x*20.);
call tabulate(x,y,mask1,mask2);
b34srun;

Compact code (placing the logical calculation in the calculation expression) is:

b34sexec matrix;
call print('If X GE 0.0 y=20*x. Else y=2*x':);
x=rn(array(20:));
y= (x.lt.0.0)*(x*2.0) + (x.ge.0.0)*(x*20.);
call tabulate(x,y);
b34srun;

Logical mask expressions should be used in function and subroutine calls to speed calculation.

20. N Dimensional Objects.
While the matrix command saves only one- and two-dimensional objects, it is possible to save and address n-dimensional objects in 1-D arrays. B34S saves n-dimensional objects by column. The command index(2 3 5) creates an integer array with elements 2, 3, 5; index(2 3 5:) determines the number of elements in a 3-dimensional array with dimensions 2, 3, 5; and index(a,b,c:i,j,k) determines the position in a one-dimensional vector of the i, j, k element of a three-dimensional array with maximum dimensions a, b and c. The commands:

nn=index(i,j,k:);
x=array(nn:);
call setndimv(index(i,j,k),index(1,2,3),x,value);

will make a 3-dimensional (i, j, k) object x and place value in the 1, 2, 3 position. The function call

yy=getndimv(index(i,j,k),index(1,2,3),x);

or

yy=x(index(i,j,k:1,2,3));

can be used to pull a value out. For example, to define the 4-dimensional object x with dimensions 2, 3, 4, 5:

nn=index(2,3,4,5:);
x=array(nn:);

To fill this array with values 1.,...,norows(x)

x=dfloat(integers(norows(x)));

or to set the 1, 2, 3, 1 value to 100.
call setndimv(index(2,3,4,5),index(1,2,3,1),x,100.);

Examples of this facility:

b34sexec matrix;
x=rn(array(index(4,4,4:):));
call print(x,getndimv(index(4,4,4),index(1,2,1),x));
do k=1,4;
do i=1,4;
do j=1,4;
test=getndimv(index(4,4,4),index(i,j,k),x);
call print(i,j,k,test);
enddo;
enddo;
enddo;
b34srun;

b34sexec matrix;
xx=index(1,2,3,4,5,4,3);
call names(all);
call print(xx);
call print('Integer*4 Array                  ',index(1 2 3 4 5 4 3));
call print('# elements in 1 2 3 4 is 24     ',index(2 3 4:));
call print('Position of 1 2 in a 4 by 4 is 5',index(4 4:1 2):);
call print('Integer*4 Array                  ',index(1,2,3,4,5 4 3));
call print('# elements in 1 2 3 5 is 30     ',index(2,3,5:));
call print('Position of 1 3 in a 4 by 4 is 9',index(4,4:1,3):);
b34srun;

b34sexec matrix;
mm=index(4,5,6:);
xx=rn(array(mm:));
idim =index(4,5,6);
idim2=index(2,2,2);
call setndimv(idim,idim2,xx,10.);
vv= getndimv(idim,idim2 ,xx);
call print(xx,vv);
b34srun;

21. Complex Math Issues. The statements

x=complex(1.5,1.5);
y=complex(1.0,0.0);
a=x*y;

produce a=(1.5,1.5). To zero out the imaginary part of a use

a=complex(real(x*y),0.0);

In summary, the B34S matrix facility provides a 4th generation programming language that is tailored to applied econometrics and time series applications. The next section discusses basic linear algebra using the matrix facility.

16.4 Linear Algebra using the Matrix Language

Basic rules of linear algebra as discussed in Greene (2000) are illustrated using the matrix command. Although the complex domain is supported, due to space limitations this material was removed. Interested readers can look at the extensive example files for each individual matrix command.
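The column-major addressing convention of item 20 above can be sketched in a few lines of Python. This is an illustrative analogue only, not the B34S implementation; the helper names nelem and position are hypothetical, and 1-based subscripts are used to match the index() examples above.

```python
# Illustrative Python analogue (hypothetical helpers) of the B34S
# column-major addressing used by index(a,b,...:) and index(a,b:i,j).

def nelem(dims):
    """Number of elements in an n-dimensional array, e.g. (2,3,4) -> 24."""
    n = 1
    for d in dims:
        n *= d
    return n

def position(dims, sub):
    """1-based position of a 1-based subscript tuple in a column-major
    1-D layout.  For a 4 by 4 array, (1,2) -> 5 and (1,3) -> 9,
    matching the positions quoted in the text."""
    pos, stride = 1, 1
    for d, s in zip(dims, sub):
        pos += (s - 1) * stride
        stride *= d
    return pos

print(nelem((2, 3, 4)))          # 24
print(position((4, 4), (1, 2)))  # 5
print(position((4, 4), (1, 3)))  # 9
```

Because storage is by column, the first subscript varies fastest; each later dimension contributes its offset times the product of all earlier dimensions.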
Assume A is an m by n matrix, B is n by k and C is m by k. Then

   C = AB   implies   C' = B'A'
                                                             (16.4-1)
   A(B + C) = AB + AC

The following code, which is part of ch16_13 in bookruns.mac, illustrates these calculations:

a=matrix(2,3:1 3 2 4 5,-1);
b=matrix(3,2:2 4 1 6 0 5);
c=a*b;
call print(a,b,c,'AB',a*b,'BA',b*a);
n=3;
a=rn(matrix(n,n:));
b=rn(a);
c=a*b;
call print(a,b,c,'AB',a*b,' '
'BA',b*a,' '
'a*(b+c) = a*b+a*c',
 a*(b+c), a*b+a*c );
call print(' ',
'We show that transpose(a*b) = transpose(b)*transpose(a)',
 transpose(a*b), transpose(b)*transpose(a));

Edited output is:

=> A=MATRIX(2,3:1 3 2 4 5,-1)$
=> B=MATRIX(3,2:2 4 1 6 0 5)$
=> C=A*B$
=> CALL PRINT(A,B,C,'AB',A*B,'BA',B*A)$

A = Matrix of 2 by 3 elements
   1.00000   3.00000   2.00000
   4.00000   5.00000  -1.00000

B = Matrix of 3 by 2 elements
   2.00000   4.00000
   1.00000   6.00000
   0.00000   5.00000

C = Matrix of 2 by 2 elements
   5.00000   32.0000
   13.0000   41.0000

AB Matrix of 2 by 2 elements
   5.00000   32.0000
   13.0000   41.0000

BA Matrix of 3 by 3 elements
   18.0000   26.0000   0.00000
   25.0000   33.0000  -4.00000
   20.0000   25.0000  -5.00000

=> N=3$
=> A=RN(MATRIX(N,N:))$
=> B=RN(A)$
=> C=A*B$
=> CALL PRINT(A,B,C,'AB',A*B,' '
=> 'BA',B*A,' '
=> 'a*(b+c) = a*b+a*c',
=>  A*(B+C), A*B+A*C )$

A = Matrix of 3 by 3 elements
   2.05157        1.27773       -1.32010
   1.08325       -1.22596       -1.52445
   0.825589E-01   0.338525      -0.459242

B = Matrix of 3 by 3 elements
  -0.605638       1.49779        1.26792
   0.307389      -0.168215       0.741401
  -1.54789        0.498469      -0.187157

C = Matrix of 3 by 3 elements
   1.19362        2.19986        3.79559
   1.32678        1.06882        0.749854
   0.764914      -0.162207       0.441611

AB Matrix of 3 by 3 elements
   1.19362        2.19986        3.79559
   1.32678        1.06882        0.749854
   0.764914      -0.162207       0.441611

BA Matrix of 3 by 3 elements
   0.484652      -2.18085       -2.06609
   0.509621       0.849968      -0.489832
  -2.65109       -2.65224        1.36943

a*(b+c) = a*b+a*c

Matrix of 3 by 3 elements
   4.32792        8.29281        11.9577
  -0.172882       2.38876        3.26893
   0.961325       0.455725       0.806010

Matrix of 3 by 3 elements
   4.32792        8.29281        11.9577
  -0.172882       2.38876        3.26893
   0.961325       0.455725       0.806010

=> CALL PRINT(' ',
=> 'We show that transpose(a*b) = transpose(b)*transpose(a)',
=> TRANSPOSE(A*B), TRANSPOSE(B)*TRANSPOSE(A))$

We show that transpose(a*b) = transpose(b)*transpose(a)

Matrix of 3 by 3 elements
   1.19362        1.32678        0.764914
   2.19986        1.06882       -0.162207
   3.79559        0.749854       0.441611

Matrix of 3 by 3 elements
   1.19362        1.32678        0.764914
   2.19986        1.06882       -0.162207
   3.79559        0.749854       0.441611

If we define i as an n by 1 matrix of 1's, then a vector whose elements all equal the mean of a series x can be calculated as i(i'x)/n, where x is a vector and n is the number of observations in the vector. This is shown by

b34sexec matrix;
call print(' '
'Define i as a n by 1 matrix of ones':)
n=10;
i=matrix(n,1:vector(n:)+1.);
seriesx=rn(vector(n:));
mm=mean(seriesx);
call print(mm);
meanmm=i*transpose(i)*seriesx/dfloat(norows(seriesx));
call print(meanmm);
b34srun;

which produces values of the mean two different ways:

=> CALL PRINT(' '
=> 'Define i as a n by 1 matrix of ones':)
=> N=10$
=> I=MATRIX(N,1:VECTOR(N:)+1.)$
=> SERIESX=RN(VECTOR(N:))$
=> MM=MEAN(SERIESX)$
=> CALL PRINT(MM)$

MM = 0.19517131

=> MEANMM=I*TRANSPOSE(I)*SERIESX/DFLOAT(NOROWS(SERIESX))$
=> CALL PRINT(MEANMM)$

MEANMM = Vector of 3 elements
   0.195171   0.195171   0.195171

An idempotent matrix M has the property that M M = M, while if M is symmetric then, in addition, M'M = M. Greene (2000, 16) discusses a matrix M0 = [I - (1/n) i i'] with this property. If Z = [x, y], where x and y are vectors, then Z'M0 Z calculates the sums of squares and cross products of deviations from the means; dividing by (n-1) yields the variance-covariance matrix. Note that the diagonal elements of M0 are M0(i,i) = 1 - (1/n), while M0(i,j) = -(1/n) for i not equal to j. This is shown next, where we calculate the mean of series x two ways, using mean to get mm and i(i'x)/n to get meanmm.
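The two ways of computing the mean, and the idempotency of M0, can be mimicked in a short pure-Python sketch. This is an illustrative analogue under a made-up data vector, not B34S code:

```python
# Sketch: a constant mean vector via i(i'x)/n, and the idempotent
# matrix M0 = I - (1/n) i i' with diagonal 1-(1/n) and off-diagonal -(1/n).

n = 3
x = [2.0, 4.0, 9.0]      # made-up data for illustration
i = [1.0] * n

# i (i'x) / n reproduces the mean in every position
itx = sum(x)                              # i'x
mean_vec = [itx / n for _ in range(n)]    # each element is the mean

# M0: diagonal 1 - 1/n, off-diagonal -1/n
M0 = [[(1.0 if r == c else 0.0) - 1.0 / n for c in range(n)]
      for r in range(n)]

def matmul(A, B):
    """Naive matrix product for small nested-list matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

M0M0 = matmul(M0, M0)                     # idempotency: M0*M0 = M0
M0i = [sum(M0[r][c] * i[c] for c in range(n)) for r in range(n)]  # ~ 0

print(mean_vec)
```

Pre-multiplying a data vector by M0 subtracts the mean from every element, which is why M0 i is the zero vector up to rounding.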
The covariance matrix is calculated using the cov function and as Z'M0 Z/(n-1), where Z is a matrix whose columns are the vectors of data. Finally M0 i is tested and found to be close to 0.0, as expected. In terms of the program, m0 denotes M0.

b34sexec matrix;
call load(cov :staging);
call echooff;
* Examples from Greene(2000);
/;
/; Use of i
/;
n=3;
call print('Define i as a n by 1 matrix of ones':);
i=matrix(n,1:vector(n:)+1.);
seriesx=rn(vector(n:));
mm=mean(seriesx);
meanmm=i*transpose(i)*seriesx/dfloat(norows(seriesx));
call print(mm,meanmm);
/$ Get Variance Covariance
call print('Define Idempotent matrix M'
'Diagonal = 1-(1/n). Off Diag -(1/n)':);
bigi=matrix(n,n:)+1.;
littlei=(1./dfloat(n))*(i*transpose(i));
m0=(bigi-littlei);
call print('m0 m0*m0 transpose(m0)*m0',m0,m0*m0,transpose(m0)*m0);
seriesy=rn(seriesx);
con=vector(n:)+1.;
z=catcol(seriesy,seriesx,con);
call print(z);
vcov=transpose(z)*m0*z;
call print(variance(seriesy));
call print(variance(seriesx));
call print('Sums and Cross Products ',vcov,m0*z);
call print('cov(z) ', cov(z));
call print('(1./dfloat(n-1))*vcov', (1./dfloat(n-1))*vcov);
call print('m0 * i . Is this 0?',m0*i);
b34srun;

Output is:

B34S(r) Matrix Command. d/m/y 1/ 7/07. h:m:s 15:41: 8.

=> CALL LOAD(COV :STAGING)$
=> CALL ECHOOFF$

Define i as a n by 1 matrix of ones

MM = 1.0724604

MEANMM = Vector of 3 elements
   1.07246   1.07246   1.07246

Define Idempotent matrix M
Diagonal = 1-(1/n). Off Diag -(1/n)

m0 m0*m0 transpose(m0)*m0

M0 = Matrix of 3 by 3 elements
   0.666667  -0.333333  -0.333333
  -0.333333   0.666667  -0.333333
  -0.333333  -0.333333   0.666667

Matrix of 3 by 3 elements
   0.666667  -0.333333  -0.333333
  -0.333333   0.666667  -0.333333
  -0.333333  -0.333333   0.666667

Matrix of 3 by 3 elements
   0.666667  -0.333333  -0.333333
  -0.333333   0.666667  -0.333333
  -0.333333  -0.333333   0.666667

Z = Matrix of 3 by 3 elements
   1.27773    2.05157        1.00000
  -1.22596    1.08325        1.00000
   0.338525   0.825589E-01   1.00000

1.5996959

0.96933890

Sums and Cross Products

VCOV = Matrix of 3 by 3 elements
   3.19939        0.902698       0.555112E-16
   0.902698       1.93868        0.444089E-15
   0.433308E-16   0.357201E-15   0.333067E-15

Matrix of 3 by 3 elements
   1.14763        0.979110       0.111022E-15
  -1.35606        0.107914E-01   0.111022E-15
   0.208429      -0.989901       0.111022E-15

cov(z)

Array of 3 by 3 elements
   1.59970        0.451349       0.00000
   0.451349       0.969339       0.00000
   0.00000        0.00000        0.00000

(1./dfloat(n-1))*vcov

Matrix of 3 by 3 elements
   1.59970        0.451349       0.277556E-16
   0.451349       0.969339       0.222045E-15
   0.216654E-16   0.178601E-15   0.166533E-15

m0 * i . Is this 0?

Matrix of 3 by 1 elements
   0.111022E-15
   0.111022E-15
   0.111022E-15

Since the variance-covariance matrix can be obtained two ways, of interest is which to use. The traditional method is slower but more accurate and takes less space. The idempotent matrix method is faster, since it has no do loops, but as will be shown is not as accurate, especially in the case of real*4 calculations using data that is not scaled. This will be demonstrated next by running the following program:

/;
/; This illustrates two ways to get the variance-covariance matrix.
/; Using Greene's idempotent matrix and real*4, if the data is not
/; scaled there can be problems of accuracy that are detected using
/; real*16 results as the benchmark
/;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(cov :staging);
call load(cov2 :staging);
call load(cor :staging);
call echooff;
scale=1000.;
h=catcol(gasin,scale*gasout);
/; call print(h);
call print('Covariance using function cov':);
call print(cov(h));
call print('Covariance using function cov2':);
call print(cov2(h));
call print('Difference of two methods using real*8':);
call print(cov(h)-cov2(h));
call print('Correlation using function cor':);
call print(cor(h));
call print('Real*16 results':);
h16=r8tor16(h);
call print('Covariance using function cov':);
call print(cov(h16));
call print('Covariance using function cov2':);
call print(cov2(h16));
call print('Difference of the two methods using real*16 ':);
call print(cov(h)-cov2(h));
/; Testing which is closer
real8_1=afam(cov(h));
real8_2=afam(cov2(h));
call print('Difference against real16 for real8_1 & real8_2 ':);
call print('Where Traditional Method = real8_1 ':);
call print('Where M0 Method          = real8_2 ':);
call print(r8tor16(real8_1)-afam(cov(h16)));
call print(r8tor16(real8_2)-afam(cov(h16)));
call print('Correlation using function cor':);
call print(cor(h16));
/; real*4 results
h4=r8tor4(h);
call print(' ':);
call print('Where Traditional Method = real4_1 ':);
call print('Where M0 Method          = real4_2 ':);
real4_1=afam(cov(h4));
real4_2=afam(cov2(h4));
call print(real4_1,real4_2);
call print('Difference against real16 for real4_1 & real4_2 ':);
call print(r8tor16(r4tor8(real4_1))-afam(cov(h16)));
call print(r8tor16(r4tor8(real4_2))-afam(cov(h16)));
call print('Correlation using function cor':);
call print(cor(h4));
b34srun;

that calls
functions cov and cov2, which are listed next.

function cov(x);
/;
/; Use matrix language to calculate cov of a matrix. For series use
/; mm=catcol(series1,series2)
/; Can use real*4, real*8, real*16 and VPA.
/;
test=afam(x);
i=nocols(x);
d=kindas(x,dfloat(norows(x)-1));
do j=1,i;
test(,j)=test(,j)-mean(test(,j));
enddo;
ccov=afam(transpose(mfam(test))*mfam(test))/d;
if(klass(x).eq.2.or.klass(x).eq.1)ccov=mfam(ccov);
return(ccov);
end;

function cov2(x);
/;
/; Use matrix language to calculate cov of a matrix. For series use
/; mm=catcol(series1,series2)
/; Can use real*4, real*8, real*16 and VPA.
/;
/; Uses Greene(2000,16) idempotent M0 matrix
/; function cov( ) is a more traditional approach that uses
/; far less space at the cost of a speed loss due to do loops
/; cov( ) is more accurate than cov2( ) if there are scaling
/; differences
/;
/; At issue is that m0 is n by n !!!!!
/;
/; Use of i which is a vector of 1's
/; z=catcol(x1,x2,...,xk)
/; ccov=transpose(z)*m0*z/(n-1);
/; where m0 diagonal = 1-(1/n). Off Diag = -1/n
/;
n=norows(x);
real_n=kindas(x,dfloat(n));
real_one=kindas(x,1.0);
/; Define i as a n by 1 matrix of ones
i=matrix(n,1:kindas(x,(vector(n:)+1.)));
/; Get Variance Covariance
/; Define Idempotent matrix M  Diagonal = 1-(1/n). Off Diag -(1/n)
bigi=kindas(x,matrix(n,n:)) + real_one;
littlei=(real_one/real_n)*(i*transpose(i));
m0=(bigi-littlei);
ccov2=transpose(mfam(x))*m0*mfam(x)/(real_n-real_one);
if(klass(x).eq.6.or.klass(x).eq.5)ccov2=afam(ccov2);
return(ccov2);
end;

The variance-covariance matrix for the scaled gas data is calculated using real*8, real*4 and real*16. Assuming the real*16 results are the correct answers, the results obtained for the two methods are compared. The gasout series was multiplied by 1000 to cause a scale problem that will be detected using real*4 calculations. Annotated output is shown below:

Variable TIME GASIN GASOUT CONSTANT Label # Cases 1 2 Input gas rate in cu.
ft / min 3 Percent CO2 in outlet gas 4 Number of observations in data file Current missing variable code B34S(r) Matrix Command. d/m/y => CALL LOADDATA$ => CALL LOAD(COV => CALL LOAD(COV2 :STAGING)$ => CALL LOAD(COR => CALL ECHOOFF$ Mean 296 148.500 296 -0.568345E-01 296 53.5091 296 1.00000 Std. Dev. 85.5921 1.07277 3.20212 0.00000 Variance Maximum Minimum 7326.00 1.15083 10.2536 0.00000 296.000 2.83400 60.5000 1.00000 1.00000 -2.71600 45.6000 1.00000 296 1.000000000000000E+31 2/ 7/07. h:m:s 10:51: 4. :STAGING)$ :STAGING)$ Using real*8 the two methods appear to be producing the same answers with a small difference in the cov(2,2) position of .625849e-6. Covariance using function cov Array 1 2 1 1.15083 -1664.15 of 2 by 2 -1664.15 0.102536E+08 Covariance using function cov2 2 elements 60 Matrix Command Language Array 1 2 of 1 1.15083 -1664.15 2 by 2 elements 2 elements 2 elements 2 elements (real*16) 2 elements (real*16) 2 -1664.15 0.102536E+08 Difference of two methods using real*8 Array 1 2 of 1 -0.666134E-15 0.682121E-12 2 by 2 -0.341061E-11 0.625849E-06 Correlation using function cor Array 1 2 of 1 1.00000 -0.484451 2 by 2 -0.484451 1.00000 Real*16 results Covariance using function cov Array 1 2 of 1 1.15083 -1664.15 2 by 2 -1664.15 0.102536E+08 Covariance using function cov2 Array 1 2 of 1 1.15083 -1664.15 2 by 2 -1664.15 0.102536E+08 Difference of the two methods using real*16 Array 1 2 of 1 -0.666134E-15 0.682121E-12 2 by 2 elements 2 -0.341061E-11 0.625849E-06 When testing against the real*16 results the traditional method difference for the 2,2 position is .133107E-09 which is smaller than the -.625716E-06 result for the MO method. This suggests accuracy gains even with real*8. 
Difference against real16 for real8_1 & real8_2
Where Traditional Method = real8_1
Where M0 Method          = real8_2

Array of 2 by 2 elements (real*16)
  -0.353299E-15   0.102890E-11
   0.102890E-11   0.133107E-09

Array of 2 by 2 elements (real*16)
   0.312834E-15   0.443950E-11
   0.346778E-12  -0.625716E-06

Correlation using function cor

Array of 2 by 2 elements (real*16)
   1.00000       -0.484451
  -0.484451       1.00000

Using real*4, but comparing against the real*16 results, the difference for the traditional method in the 2,2 position is .469079. However, for the M0 method the difference is -104.531. These findings show the accuracy loss when real*4 calculations are made, both with an appropriate method and with a poorer method.

Where Traditional Method = real4_1
Where M0 Method          = real4_2

REAL4_1 = Array of 2 by 2 elements (real*4)
   1.15083       -1664.15
  -1664.15        0.102536E+08

REAL4_2 = Array of 2 by 2 elements (real*4)
   1.15083       -1664.15
  -1664.15        0.102535E+08

Difference against real16 for real4_1 & real4_2

Array of 2 by 2 elements (real*16)
   0.313754E-07  -0.478797E-04
  -0.478797E-04   0.469079

Array of 2 by 2 elements (real*16)
   0.313754E-07  -0.478797E-04
   0.741906E-04  -104.531

Correlation using function cor

Array of 2 by 2 elements (real*4)
   1.00000       -0.484451
  -0.484451       1.00000

B34S Matrix Command Ending. Last Command reached.

Space available in allocator   7869592, peak space used  803279
Number variables used               37, peak number used      48
Number temp variables used         918, # user temp clean       0

In Chapter 2 equation (2.9-6) we showed the relationship between the population error e and the sample error u. Recall that

   u = (I - X(X'X)^-1 X')e = Me.

The sum of squared sample residuals was related to the population residuals by u'u = e'MMe = e'Me. Since M is not full rank, it is not possible to estimate the population residual as e = M^-1 u, and hence there are only N-K BLUS residuals. Theil shows that u'u = y'My.
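The identity u = My, and the equality of the two sums of squares, can be checked numerically with a small pure-Python sketch. This is an illustrative analogue with made-up data and a two-column X (chosen so the 2 by 2 inverse of X'X can be written in closed form); it is not the B34S job discussed below.

```python
# Check that u = M y with M = I - X(X'X)^-1 X' reproduces the OLS
# residuals y - X*betahat, and that both give the same sum of squares,
# which also equals y'My.

X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]   # constant + trend
y = [2.0, 3.0, 5.0, 8.0]                               # made-up data
n, k = len(X), 2

# X'X and its closed-form 2x2 inverse
a = sum(r[0] * r[0] for r in X)
b = sum(r[0] * r[1] for r in X)
d = sum(r[1] * r[1] for r in X)
det = a * d - b * b
XtXinv = [[d / det, -b / det], [-b / det, a / det]]

# betahat = (X'X)^-1 X'y
Xty = [sum(X[t][j] * y[t] for t in range(n)) for j in range(k)]
beta = [sum(XtXinv[j][q] * Xty[q] for q in range(k)) for j in range(k)]

# residuals directly: u1 = y - X betahat
u1 = [y[t] - sum(X[t][j] * beta[j] for j in range(k)) for t in range(n)]

# residuals via the projection: u2 = (I - X(X'X)^-1 X') y = M y
H = [[sum(X[r][p] * XtXinv[p][q] * X[s][q]
          for p in range(k) for q in range(k))
      for s in range(n)] for r in range(n)]          # hat matrix X(X'X)^-1X'
u2 = [y[r] - sum(H[r][s] * y[s] for s in range(n)) for r in range(n)]

sse1 = sum(e * e for e in u1)
sse2 = sum(e * e for e in u2)
ymy = sum(y[r] * u2[r] for r in range(n))            # y'My = u'u
```

Because M is symmetric and idempotent, y'My equals (My)'(My), which is why the three quantities agree to rounding error.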
Sample job CH16_LUS in bookruns.mac illustrates these calculations. For more detail see Chapter 2.

b34sexec matrix;
* See Chapter 2 equations (2.9-4) - (2.9-7) ;
* In this example all coefficients are 1.0;
n=20;
k=5;
beta=vector(k:)+1.;
x=rn(matrix(n,k:));
x(,1)=1.0;
y=x*beta+rn(vector(norows(x):));
bigi=matrix(n,n:) + 1.;
m=(bigi-x*inv(transpose(x)*x)*transpose(x));
mm=m*m;
test=sum(dabs(mm-m));
call print('Test ',test:);
call print('Theil (1971) page shows sumsq error = y*m*y');
testss=y*m*y;
betahat=inv(transpose(x)*x)*transpose(x)*y;
call olsq(y,x :noint :print);
call print(testss,betahat);
u=y-x*betahat;
u_alt=m*y;
sse=sumsq(u);
sse2=sumsq(u_alt);
call print('Two ways to get sum of squares',sse,sse2);
call print('Show M not full rank',det(m));
b34srun;

Which when run produces:

=> N=20$
=> K=5$
=> BETA=VECTOR(K:)+1.$
=> X=RN(MATRIX(N,K:))$
=> X(,1)=1.0$
=> Y=X*BETA+RN(VECTOR(NOROWS(X):))$
=> BIGI=MATRIX(N,N:) + 1.$
=> M=(BIGI-X*INV(TRANSPOSE(X)*X)*TRANSPOSE(X))$
=> MM=M*M$
=> TEST=SUM(DABS(MM-M))$
=> CALL PRINT('Test ',TEST:)$

Test  9.350213745623615E-15

=> CALL PRINT('Theil (1971) page shows sumsq error = y*m*y')$

Theil (1971) page shows sumsq error = y*m*y

=> TESTSS=Y*M*Y$
=> BETAHAT=INV(TRANSPOSE(X)*X)*TRANSPOSE(X)*Y$
=> CALL OLSQ(Y,X :NOINT :PRINT)$

Ordinary Least Squares Estimation
Dependent variable              Y
Centered R**2                   0.8293118234985847
Residual Sum of Squares         14.14785861384947
Residual Variance               0.9431905742566311
Sum Absolute Residuals          12.12518257459587
1/Condition XPX                 0.2405358242295580
Maximum Absolute Residual       1.875788654723293
Number of Observations          20

Variable     Lag   Coefficient    SE            t
Col____1      0    1.5104320      0.24171911    6.2487072
Col____2      0    0.87494443     0.27288711    3.2062505
Col____3      0    0.63416941     0.22752179    2.7872909
Col____4      0    1.2184971      0.22690316    5.3701196
Col____5      0    1.0098815      0.29505957    3.4226360

=> CALL PRINT(TESTSS,BETAHAT)$

TESTSS = 14.147859

BETAHAT = Vector of 5 elements
   1.51043   0.874944   0.634169   1.21850   1.00988

=>
U=Y-X*BETAHAT$
=> U_ALT=M*Y$
=> SSE=SUMSQ(U)$
=> SSE2=SUMSQ(U_ALT)$
=> CALL PRINT('Two ways to get sum of squares',SSE,SSE2)$

Two ways to get sum of squares

SSE  = 14.147859
SSE2 = 14.147859

=> CALL PRINT('Show M not full rank',DET(M))$

Show M not full rank  0.17959633E-79

The BLUE property of OLS discussed in Chapter 2 requires that there be no correlation between the estimated left-hand-side vector ŷ = Xβ̂ and the residual vector ê. Given that X is a matrix of explanatory variables, and ŷ = Xβ̂, then:

   ŷ'ê = 0
   (Xβ̂)'ê = 0
   (Xβ̂)'(y - Xβ̂) = 0                                        (16.4-2)
   β̂'X'y - β̂'X'Xβ̂ = 0
   β̂'[X'y - X'Xβ̂] = 0

which implies the restriction X'y = X'Xβ̂, from which the OLS solution equation β̂ = (X'X)^-1 X'y quickly follows. This is illustrated assuming β = [1., 2., 3., 4., 5.]:

n=30;
x=rn(matrix(n,5:));
x(,1)=1.0;
beta=vector(5:1. 2. 3. 4. 5.);
y=x*beta +rn(vector(n:));
xpx=transpose(x)*x;
betahat=inv(xpx)*transpose(x)*y;
call olsq(y x :print :noint);
resid=y-x*betahat;
/$ Test if orthogonal
call print(beta,betahat,'Is residual Orthogonal with yhat?',
           ddot(resid,x*betahat));

The results are verified with the olsq command and the orthogonality restriction ŷ'ê = 0 is tested. The calculated ddot value 0.21405100E-12 suggests that the restriction is met by the solution vector.

=> N=30$
=> X=RN(MATRIX(N,5:))$
=> X(,1)=1.0$
=> BETA=VECTOR(5:1. 2. 3. 4. 5.)$
=> Y=X*BETA +RN(VECTOR(N:))$
=> XPX=TRANSPOSE(X)*X$
=> BETAHAT=INV(XPX)*TRANSPOSE(X)*Y$
=> CALL OLSQ(Y X :PRINT :NOINT)$

Ordinary Least Squares Estimation
Dependent variable              Y
Centered R**2                   0.9804554404605988
Residual Sum of Squares         23.50833421529196
Residual Variance               0.9403333686116784
Sum Absolute Residuals          21.45426309533863
1/Condition XPX                 0.2398115678243269
Maximum Absolute Residual       2.059994965106420
Number of Observations          30

Variable     Lag   Coefficient    SE            t
Col____1      0    1.1096979      0.18104778    6.1293098
Col____2      0    1.8668577      0.18101248    10.313420
Col____3      0    2.8053762      0.21003376    13.356787
Col____4      0    3.9369505      0.17176593    22.920439
Col____5      0    4.8105246      0.18925387    25.418369

=> RESID=Y-X*BETAHAT$
=> CALL PRINT(BETA,BETAHAT,'Is residual Orthogonal with yhat?',
=> DDOT(RESID,X*BETAHAT))$

BETA = Vector of 5 elements
   1.00000   2.00000   3.00000   4.00000   5.00000

BETAHAT = Vector of 5 elements
   1.10970   1.86686   2.80538   3.93695   4.81052

Is residual Orthogonal with yhat?  0.21405100E-12

Given A and B are square matrices, linear algebra rules for inverses and determinants are:

   |A^-1| = 1/|A|            A A^-1 = I
   (AB)^-1 = B^-1 A^-1       (A^-1)' = (A')^-1               (16.4-3)
   (ABC)^-1 = C^-1 B^-1 A^-1

This is illustrated by:

/$ Rules of Inverses
call print(xpx, inv(xpx),xpx*inv(xpx));
call print(' '
'We perform tests involving inverses '
'1/det(x) = det(inv(x))',1./det(xpx),det(inv(xpx)),
'Test if: transpose(inv(x)) = inv(transpose(x))',
 transpose(inv(xpx)), inv(transpose(xpx)));
x1=xpx;
x2=rn(xpx);
x3=rn(x2);
call print('Test if: inv(x1*x2*x3) = inv(x3)*inv(x2)*inv(x1)',
 inv(x1*x2*x3),inv(x3)*inv(x2)*inv(x1));

which when run produces:

=> CALL PRINT(XPX, INV(XPX),XPX*INV(XPX))$

XPX = Matrix of 5 by 5 elements
   30.0000    4.88927    10.3848   -1.45371   -0.536573
   4.88927    22.1316    0.241093  -6.20091   -2.52910
   10.3848    0.241093   30.4069    3.57937   -7.08262
  -1.45371   -6.20091    3.57937    27.9920   -4.84565
  -0.536573  -2.52910   -7.08262   -4.84565    39.5877

Matrix of 5 by 5 elements
   0.396630E-01  -0.842611E-02  -0.142149E-01   0.160454E-02  -0.234751E-02
  -0.842611E-02   0.507700E-01   0.228504E-02   0.113704E-01   0.492987E-02
  -0.142149E-01   0.228504E-02   0.397422E-01  -0.417971E-02   0.655197E-02
   0.160454E-02   0.113704E-01  -0.417971E-02   0.397024E-01   0.486005E-02
  -0.234751E-02   0.492987E-02   0.655197E-02   0.486005E-02   0.273106E-01

Matrix of 5 by 5 elements
   1.00000        0.763278E-16   0.398986E-16   0.303577E-17   0.00000
  -0.754605E-16   1.00000        0.208167E-16  -0.520417E-17   0.416334E-16
  -0.104083E-16   0.277556E-16   1.00000        0.277556E-16  -0.555112E-16
   0.867362E-17   0.589806E-16   0.346945E-16   1.00000        0.00000
   0.00000        0.277556E-16   0.555112E-16   0.277556E-16   1.00000

=> CALL PRINT(' '
=> 'We perform tests involving inverses '
=> '1/det(x) = det(inv(x))',1./DET(XPX),DET(INV(XPX)),
=> 'Test if: transpose(inv(x)) = inv(transpose(x))',
=> TRANSPOSE(INV(XPX)), INV(TRANSPOSE(XPX)))$

We perform tests involving inverses

1/det(x) = det(inv(x))   0.62036286E-07   0.62036286E-07

Test if: transpose(inv(x)) = inv(transpose(x))

Matrix of 5 by 5 elements
   0.396630E-01  -0.842611E-02  -0.142149E-01   0.160454E-02  -0.234751E-02
  -0.842611E-02   0.507700E-01   0.228504E-02   0.113704E-01   0.492987E-02
  -0.142149E-01   0.228504E-02   0.397422E-01  -0.417971E-02   0.655197E-02
   0.160454E-02   0.113704E-01  -0.417971E-02   0.397024E-01   0.486005E-02
  -0.234751E-02   0.492987E-02   0.655197E-02   0.486005E-02   0.273106E-01

Matrix of 5 by 5 elements
   (identical to the matrix shown above)

=> X1=XPX$
=> X2=RN(XPX)$
=> X3=RN(X2)$
=> CALL PRINT('Test if: inv(x1*x2*x3) = inv(x3)*inv(x2)*inv(x1)',
=> INV(X1*X2*X3),INV(X3)*INV(X2)*INV(X1))$

Test if: inv(x1*x2*x3) = inv(x3)*inv(x2)*inv(x1)

Matrix of 5 by 5 elements
  -0.110146E-01  -0.295302E-01   0.305773E-01  -0.208158E-01   0.230482E-01
  -0.897845E-02  -0.246023E-01   0.475944E-01  -0.141358E-01   0.218481E-01
  -0.528828E-02   0.373354E-01  -0.612578E-02   0.128556E-01  -0.678314E-02
   0.799993E-02  -0.780149E-02  -0.239615E-02   0.111239E-01  -0.453775E-02
  -0.559507E-04  -0.513571E-02   0.296128E-01  -0.399576E-02   0.199970E-01

Matrix of 5 by 5 elements
   (identical to the matrix shown above)

The Kronecker product between matrix A, which is k by j, and B, which is m by n, produces C = A ⊗ B, which is k*m by j*n. In words, every element of A is multiplied by the B matrix. Using the Greene (2000) example data with:

a=matrix(2,2:3 0 5 2);
b=matrix(2,2:1 4 4 7);
call print(a,b,kprod(a,b),
           a(1,1)*b, a(1,2)*b,
           a(2,1)*b, a(2,2)*b);

we print A, B, the Kronecker product A ⊗ B and each element.

=> A=MATRIX(2,2:3 0 5 2)$
=> B=MATRIX(2,2:1 4 4 7)$
=> CALL PRINT(A,B,KPROD(A,B),
=> A(1,1)*B, A(1,2)*B,
=> A(2,1)*B, A(2,2)*B)$

A = Matrix of 2 by 2 elements
   3.00000   0.00000
   5.00000   2.00000

B = Matrix of 2 by 2 elements
   1.00000   4.00000
   4.00000   7.00000

Matrix of 4 by 4 elements
   3.00000   12.0000   0.00000   0.00000
   12.0000   21.0000   0.00000   0.00000
   5.00000   20.0000   2.00000   8.00000
   20.0000   35.0000   8.00000   14.0000

Matrix of 2 by 2 elements
   3.00000   12.0000
   12.0000   21.0000

Matrix of 2 by 2 elements
   0.00000   0.00000
   0.00000   0.00000

Matrix of 2 by 2 elements
   5.00000   20.0000
   20.0000   35.0000

Matrix of 2 by 2 elements
   2.00000   8.00000
   8.00000   14.0000

There are a number of very important factorizations in linear algebra.
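Before turning to the factorizations, the Kronecker example above can be reproduced in a few lines of Python. This is an illustrative sketch only; kprod here is a pure-Python stand-in for the B34S routine of the same name.

```python
# Kronecker product: every element of A multiplies the whole matrix B,
# producing blocks A[i][j]*B arranged in the pattern of A.

def kprod(A, B):
    ra, ca = len(A), len(A[0])
    rb, cb = len(B), len(B[0])
    return [[A[i // rb][j // cb] * B[i % rb][j % cb]
             for j in range(ca * cb)] for i in range(ra * rb)]

A = [[3.0, 0.0], [5.0, 2.0]]     # Greene (2000) example data
B = [[1.0, 4.0], [4.0, 7.0]]
K = kprod(A, B)
for row in K:
    print(row)
```

The first 2 by 2 block of the result is 3*B and the last is 2*B, matching the element-by-element listing above.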
Assume A is a general matrix and B is a positive definite matrix, both of size n by n. The LU factorization of a general matrix writes A in terms of a lower triangular matrix L and an upper triangular matrix U:

   A = LU
                                                             (16.4-4)
   A^-1 = U^-1 L^-1

The Cholesky decomposition writes B in terms of an upper triangular matrix R as

   B = R'R                                                   (16.4-5)

The following code illustrates these decompositions.

/$ LU and Cholesky factorization
n=5;
x=rn(matrix(n,n:));
xpx=transpose(x)*x;
call gmfac(xpx,l,u);
r=pdfac(xpx);
call print('Inverse from L U = inv(u)*inv(l)'
'inv(xpx)', inv(xpx),
'inv(u)',   inv(u),
'inv(l)',   inv(l),
'Test inverse from looking at u and l',
'inv(u)*inv(l)', inv(u)*inv(l));
call print(xpx,l,u,'l*u',l*u,
'Cholesky Factorization of pd matrix',r,
'transpose(r)*r', transpose(r)*r);

Edited output is:

=> N=5$
=> X=RN(MATRIX(N,N:))$
=> XPX=TRANSPOSE(X)*X$
=> CALL GMFAC(XPX,L,U)$
=> R=PDFAC(XPX)$

Inverse from L U = inv(u)*inv(l)

inv(xpx) Matrix of 5 by 5 elements
   0.672320       0.565469      -0.197200       0.289017      -0.201206
   0.565469       0.930016      -0.224394       0.213563       0.540409E-01
  -0.197200      -0.224394       0.347142       0.120672E-01  -0.307224E-01
   0.289017       0.213563       0.120672E-01   0.300834      -0.170462
  -0.201206       0.540409E-01  -0.307224E-01  -0.170462       0.441608

inv(u) Matrix of 5 by 5 elements
   0.152498   0.243980  -0.211385   0.211351      -0.201206
   0.00000    0.548226  -0.220842   0.234423       0.540409E-01
   0.00000    0.00000    0.345005   0.208219E-03  -0.307224E-01
   0.00000    0.00000    0.00000    0.235035      -0.170462
   0.00000    0.00000    0.00000    0.00000        0.441608

inv(l) Matrix of 5 by 5 elements
   1.00000    0.00000    0.00000        0.00000    0.00000
   0.445036   1.00000    0.00000        0.00000    0.00000
  -0.612703  -0.640112   1.00000        0.00000    0.00000
   0.899230   0.997398   0.885904E-03   1.00000    0.00000
  -0.455621   0.122373  -0.695695E-01  -0.386004   1.00000

Test inverse from looking at u and l

inv(u)*inv(l) Matrix of 5 by 5 elements
   (identical to inv(xpx) shown above)

=> CALL PRINT(XPX,L,U,'l*u',L*U,
=> 'Cholesky Factorization of pd matrix',R,
=> 'transpose(r)*r', TRANSPOSE(R)*R)$

XPX = Matrix of 5 by 5 elements
   6.55746   -2.91830    2.14973   -2.98786    2.34107
  -2.91830    3.12282    0.210901  -0.490650  -1.88651
   2.14973    0.210901   4.35066   -2.14731    0.427456
  -2.98786   -0.490650  -2.14731    7.43273    1.41839
   2.34107   -1.88651    0.427456   1.41839    4.13919

L = Matrix of 5 by 5 elements
   1.00000    0.00000    0.00000        0.00000    0.00000
  -0.445036   1.00000    0.00000        0.00000    0.00000
   0.327830   0.640112   1.00000        0.00000    0.00000
  -0.455643  -0.997965  -0.885904E-03   1.00000    0.00000
   0.357008  -0.463060   0.692275E-01   0.386004   1.00000

U = Matrix of 5 by 5 elements
   6.55746   -2.91830   2.14973   -2.98786        2.34107
   0.00000    1.82407   1.16761   -1.82035       -0.844652
   0.00000    0.00000   2.89851   -0.256781E-02   0.200657
   0.00000    0.00000   0.00000    4.25468        1.64233
   0.00000    0.00000   0.00000    0.00000        2.26445

l*u

Matrix of 5 by 5 elements
   (identical to XPX shown above)

Cholesky Factorization of pd matrix

R = Matrix of 5 by 5 elements
   2.56075   -1.13963   0.839491   -1.16679        0.914210
   0.00000    1.35058   0.864523   -1.34783       -0.625399
   0.00000    0.00000   1.70250    -0.150825E-02   0.117860
   0.00000    0.00000   0.00000     2.06269        0.796207
   0.00000    0.00000   0.00000     0.00000        1.50481

transpose(r)*r

Matrix of 5 by 5 elements
   (identical to XPX shown above)

The usual eigenvalue decomposition of A writes it in terms of a "left handed" eigenvector matrix V and a diagonal matrix D with the eigenvalues λ_i along the diagonal:

   AV = VD
                                                             (16.4-6)
   A = V D V^-1

The trace of a matrix is λ_1 + λ_2 + ... + λ_n, or the sum of the diagonal elements of D, while the determinant is the product λ_1 λ_2 ... λ_n. If A is symmetric we have V' = V^-1, which implies that all the columns in V are orthogonal, or that for this case V'V = I. The Schur decomposition writes

   A = U S U'                                                (16.4-7)

where U is an orthogonal matrix and S is block upper triangular with the eigenvalues on the diagonal. Here U'U = I for all matrices, unlike the eigenvalue decomposition, where V'V = I only when the factored matrix is symmetric. Code to illustrate these types of calculations for both general and symmetric matrices:

/$ Eigenvalues & Schur  we write a*v = v*lamda
a=matrix(2,2:5,1 2,4);
lamda=eig(a,v);
det1=det(a);
trace1=trace(a);
det2=prod(lamda);
trace2=sum(lamda);
lamda=diagmat(lamda);
call schur(a,s,u);
call print('We have defined a general matrix',a,
 lamda, v,
'Is sum of eigenvalues trace?'
'Is product of eigenvalues det?'
 det1,det2,trace1,trace2,
'With Eigenvalues a = v*lamda*inv(v)',
 v*lamda*inv(v), ' ',
'With Schur a = u*s*transpose(u) '
'Schur => s upper triangular',
 s,u,
'a = u*s*transpose(u)'
'u*transpose(u)=I'
 u*transpose(u),
'This is a from the schur, is it a?',
 u*s*transpose(u));

/$ PD Matrix Case
call print('Positive Def case s = lamda'
'transpose(v)*v = I'
'sum(lamda)=trace'
'prod(lamda)=det');
a=xpx;
/$ Note we use the symmetric eigen call here
lamda=seig(a,v);
d =diagmat(lamda);
call schur(a,s,u);
call print(a,lamda,d,v,
'With Eigenvalues a = v*d*inv(v)',
 v*d*inv(v), ' ',
'inv(u)=transpose(u)',
 inv(u),transpose(u),
'Is v*transpose(v)= I ?', v*transpose(v),
'Is transpose(v)*v= I ?', transpose(v)*v,
'A = v*d*inv(v)', v*d*inv(v),
'With Schur a = u*s*transpose(u)',
'Schur => s upper triangular',
 s,u,
'a = u*s*transpose(u)',
'u*transpose(u)=I'
 u*transpose(u),
'This is a matrix from the schur',
 u*s*transpose(u),
'sum(lamda)=trace'
'prod(lamda)=det',
 sum(lamda),trace(a),
 prod(lamda),det(a));

produces detailed but instructive results:

=> A=MATRIX(2,2:5,1 2,4)$
=> LAMDA=EIG(A,V)$
=> DET1=DET(A)$
=> TRACE1=TRACE(A)$
=> DET2=PROD(LAMDA)$
=> TRACE2=SUM(LAMDA)$
=> LAMDA=DIAGMAT(LAMDA)$
=> CALL SCHUR(A,S,U)$
=> CALL PRINT('We have defined a general matrix',A,
=> LAMDA, V,
=> 'Is sum of eigenvalues trace?'
=> 'Is product of eigenvalues det?'
DET1,DET2,TRACE1,TRACE2, 'With Eigenvalues a = v*lamda*inv(v)', V*LAMDA*INV(V), ' ', 'With Schur a = u*s*transpose(u) ' 'Schur => s upper triangular', S,U, 'a = u*s*transpose(u)' 'u*transpose(u)=I' U*TRANSPOSE(U), 'This is a from the schur, is it a?', U*S*TRANSPOSE(U))$ We have defined a general matrix A = Matrix of 1 2 LAMDA 1 ( 2 ( V 2 1 5.00000 2.00000 by 1 , , 2 by 0.000 0.000 ) ) = Complex Matrix of 1 ( 2 ( 0.7071 0.7071 elements 2 1.00000 4.00000 = Complex Matrix of 6.000 0.000 2 1 , , ( ( 2 elements 0.000 3.000 2 by 0.000 0.000 ) ) 2 , , 0.000 0.000 ) ) 2 elements ( -0.4714 ( 0.9428 2 , , 0.000 0.000 ) ) Is sum of eigenvalues trace? Is product of eigenvalues det? DET1 = DET2 = TRACE1 = TRACE2 = 18.000000 (18.00000000000000,0.000000000000000E+00) 9.0000000 (9.000000000000000,0.000000000000000E+00) With Eigenvalues a = v*lamda*inv(v) Complex Matrix of 1 ( 2 ( 5.000 2.000 1 , , 2 by 0.000 0.000 ) ) ( ( 2 elements 1.000 4.000 2 , , 0.000 0.000 With Schur a = u*s*transpose(u) Schur => s upper triangular S = Matrix of 1 2 U 1 6.00000 0.00000 = Matrix of 2 by 2 elements 2 elements 2 -1.00000 3.00000 2 by ) ) Chapter 16 1 2 1 0.707107 0.707107 2 -0.707107 0.707107 a = u*s*transpose(u) u*transpose(u)=I Matrix of 1 2 1 1.00000 0.00000 2 by 2 elements 2 elements 2 0.00000 1.00000 This is a from the schur, is it a? 
Matrix of 1 2 => => => => 1 5.00000 2.00000 2 by 2 1.00000 4.00000 CALL PRINT('Positive Def case s = lamda' 'transpose(v)*v = I' 'sum(lamda)=trace' 'prod(lamda)=det')$ Positive Def case s = lamda transpose(v)*v = I sum(lamda)=trace prod(lamda)=det => A=XPX$ => LAMDA=SEIG(A,V)$ => D => CALL SCHUR(A,S,U)$ => => => => => => => => => => => => => => => => => => => => => => => => => CALL PRINT(A,LAMDA,D,V, 'With Eigenvalues a = v*d*inv(v)', V*D*INV(V), ' ', 'inv(u)=transpose(u)', INV(U),TRANSPOSE(U), 'Is v*transpose(v)= I ?', V*TRANSPOSE(V), 'Is transpose(v)*v= I ?', TRANSPOSE(V)*V, =DIAGMAT(LAMDA)$ 'A = v*d*inv(v)', V*D*INV(V), 'With Schur a = u*s*transpose(u)', 'Schur => s upper triangular', S,U, 'a = u*s*transpose(u)', 'u*transpose(u)=I' U*TRANSPOSE(U), 'This is a matrix from the schur', U*S*TRANSPOSE(U), 'sum(lamda)=trace' 'prod(lamda)=det', SUM(LAMDA),TRACE(A), PROD(LAMDA),DET(A))$ 73 74 Matrix Command Language A = Matrix of 1 2 3 4 5 LAMDA 1 6.55746 -2.91830 2.14973 -2.98786 2.34107 = Matrix of 5 by 2 -2.91830 3.12282 0.210901 -0.490650 -1.88651 5 elements 3 2.14973 0.210901 4.35066 -2.14731 0.427456 5 by 1 elements 5 by 5 elements 4 -2.98786 -0.490650 -2.14731 7.43273 1.41839 5 2.34107 -1.88651 0.427456 1.41839 4.13919 4 0.00000 0.00000 0.00000 8.20610 0.00000 5 0.00000 0.00000 0.00000 0.00000 11.7777 4 -0.286747 0.441200 0.143791 -0.615213 -0.569172 5 0.681563 -0.228428 0.364217 -0.563809 0.180992 4 -2.98786 -0.490650 -2.14731 7.43273 1.41839 5 2.34107 -1.88651 0.427456 1.41839 4.13919 4 -0.563809 -0.615213 -0.306957 0.270891 -0.368819 5 0.180992 -0.569172 -0.209667 -0.110225 0.766274 4 -0.563809 -0.615213 -0.306957 0.270891 -0.368819 5 0.180992 -0.569172 -0.209667 -0.110225 0.766274 1 1 2 3 4 5 D 0.639475 1.59842 3.38122 8.20610 11.7777 = Matrix of 1 2 3 4 5 V 1 0.639475 0.00000 0.00000 0.00000 0.00000 = Matrix of 1 2 3 4 5 1 -0.608222 -0.703468 0.222860 -0.270891 0.110225 2 0.00000 1.59842 0.00000 0.00000 0.00000 5 3 0.00000 0.00000 3.38122 0.00000 0.00000 by 2 
0.244514 -0.395537 0.246090 0.368819 -0.766274 5 elements 3 -0.153385 0.319134 0.858163 0.306957 0.209667 With Eigenvalues a = v*d*inv(v) Matrix of 1 2 3 4 5 1 6.55746 -2.91830 2.14973 -2.98786 2.34107 5 by 2 -2.91830 3.12282 0.210901 -0.490650 -1.88651 5 elements 3 2.14973 0.210901 4.35066 -2.14731 0.427456 inv(u)=transpose(u) Matrix of 1 2 3 4 5 1 0.681563 -0.286747 0.153385 0.608222 -0.244514 Matrix of 1 2 3 4 5 1 0.681563 -0.286747 0.153385 0.608222 -0.244514 Is v*transpose(v)= I ? 5 by 2 -0.228428 0.441200 -0.319134 0.703468 0.395537 5 by 2 -0.228428 0.441200 -0.319134 0.703468 0.395537 5 elements 3 0.364217 0.143791 -0.858163 -0.222860 -0.246090 5 elements 3 0.364217 0.143791 -0.858163 -0.222860 -0.246090 Chapter 16 Matrix of 1 2 3 4 5 1 1.00000 0.471845E-15 -0.249800E-15 0.111022E-15 -0.222045E-15 5 by 2 0.471845E-15 1.00000 -0.319189E-15 0.111022E-15 -0.138778E-16 5 75 elements 3 -0.249800E-15 -0.319189E-15 1.00000 0.277556E-16 -0.152656E-15 4 0.111022E-15 0.111022E-15 0.277556E-16 1.00000 -0.693889E-16 5 -0.222045E-15 -0.138778E-16 -0.152656E-15 -0.693889E-16 1.00000 4 -0.138778E-15 0.388578E-15 0.277556E-16 1.00000 -0.971445E-16 5 -0.693889E-17 0.555112E-16 -0.902056E-16 -0.971445E-16 1.00000 Is transpose(v)*v= I ? 
Matrix of 1 2 3 4 5 1 1.00000 0.00000 0.211636E-15 -0.138778E-15 -0.693889E-17 5 by 2 0.00000 1.00000 -0.832667E-16 0.388578E-15 0.555112E-16 5 elements 3 0.211636E-15 -0.832667E-16 1.00000 0.277556E-16 -0.902056E-16 A = v*d*inv(v) Matrix of 1 2 3 4 5 1 6.55746 -2.91830 2.14973 -2.98786 2.34107 5 by 2 -2.91830 3.12282 0.210901 -0.490650 -1.88651 5 elements 3 2.14973 0.210901 4.35066 -2.14731 0.427456 4 -2.98786 -0.490650 -2.14731 7.43273 1.41839 5 2.34107 -1.88651 0.427456 1.41839 4.13919 With Schur a = u*s*transpose(u) Schur => s upper triangular S = Matrix of 1 2 3 4 5 U 1 11.7777 0.00000 0.00000 0.00000 0.00000 = Matrix of 1 2 3 4 5 1 0.681563 -0.228428 0.364217 -0.563809 0.180992 5 by 2 -0.330465E-15 8.20610 0.00000 0.00000 0.00000 5 by 2 -0.286747 0.441200 0.143791 -0.615213 -0.569172 5 elements 3 0.221534E-16 0.212782E-15 3.38122 0.00000 0.00000 5 4 -0.503934E-14 0.273997E-15 -0.666146E-15 0.639475 0.00000 5 0.874203E-15 0.165088E-15 0.189045E-15 -0.183493E-15 1.59842 4 0.608222 0.703468 -0.222860 0.270891 -0.110225 5 -0.244514 0.395537 -0.246090 -0.368819 0.766274 4 0.346945E-15 0.527356E-15 -0.277556E-16 1.00000 -0.277556E-15 5 -0.138778E-15 0.555112E-16 -0.277556E-16 -0.277556E-15 1.00000 elements 3 0.153385 -0.319134 -0.858163 -0.306957 -0.209667 a = u*s*transpose(u) u*transpose(u)=I Matrix of 1 2 3 4 5 1 1.00000 0.111022E-15 -0.305311E-15 0.346945E-15 -0.138778E-15 5 by 2 0.111022E-15 1.00000 -0.693889E-16 0.527356E-15 0.555112E-16 5 elements 3 -0.305311E-15 -0.693889E-16 1.00000 -0.277556E-16 -0.277556E-16 This is a matrix from the schur Matrix of 1 1 6.55746 5 by 2 -2.91830 5 elements 3 2.14973 4 -2.98786 5 2.34107 76 Matrix Command Language 2 3 4 5 -2.91830 2.14973 -2.98786 2.34107 3.12282 0.210901 -0.490650 -1.88651 0.210901 4.35066 -2.14731 0.427456 -0.490650 -2.14731 7.43273 1.41839 -1.88651 0.427456 1.41839 4.13919 sum(lamda)=trace prod(lamda)=det 25.602863 25.602863 334.02779 334.02779 Note that only with the symmetric matrix are the eigenvectors 
orthogonal. The SVD decomposition, discussed in Chapter 10, writes A U V ' (16.4-8) where both U and V are orthogonal (U 'U I and V 'V I) whether A is symmetric or not. is a diagonal matrix and if is symmetric contains the eigenvalues. Define X as the n by k matrix of explanatory variables and factor as X U V ' . The OLS coefficients ˆ V 1U ' y V 1ˆ where ˆ U ' y and is the principle component coefficient vector. As discussed in chapter 10, the QR factorization operates on X directly and writes it as X QR (16.4-9) where R is the upper triangular Cholesky matrix and Q ' Q I . Using the QR approach the OLS coefficient vector can be calculated as ˆ R1Q' y where Q is the truncated QR factorization 1 Q. 11 1 The eigenvalue and SVD decompositions are shown next. /$ SVD case n=4; noob=20; x=rn(matrix(noob,n:)); s=svd(x,b,11,u,v); call print('X',x,'Singular values',s,'Left Singular vectors',U, 'Right Singular vectors',v); call print('Test of Factorization. Is S along diagonal?', 'Transpose(u)*x*v',transpose(u)*x*v, 'Is U orthagonal?','transpose(U)*U', transpose(U)*U, 'Is V orthagonal?','transpose(V)*V', transpose(V)*V, ' '); /$ OLS with SVD n=30; 11 See Chapter 10 and especially equation (10.1-4) for a further discussion of the SVD and QR approaches to OLS estimation. Chapter 16 77 k=5; x =rn(matrix(n,k:)); x(,1) =1.0; beta =vector(5:1. 2. 3. 4. 
5.);
y       =x*beta +rn(vector(n:));
xpx     =transpose(x)*x;
* Solve reduced problem;
s       =svd(x,bad,21,u,v);
sigma   =diagmat(s);
betahat1=inv(xpx)*transpose(x)*y;
betahat2=v*inv(sigma)*transpose(u)*y;
call print('OLS from two approaches',betahat1,betahat2);

/$ Show that SVD of PD matrix produces eigenvalues
x=rn(matrix(5,5:));
xpx=Transpose(x)*x;
e=eig(xpx);
ee=seig(xpx);
s=svd(xpx);
call print(e,ee,s);
b34srun;

Edited output produces:

=> N=4$
=> NOOB=20$
=> X=RN(MATRIX(NOOB,N:))$
=> S=SVD(X,B,11,U,V)$
=> CALL PRINT('X',X,'Singular values',S,'Left Singular vectors',U,
=>            'Right Singular vectors',V)$

The 20 by 4 random matrix X prints first, followed by the singular values,
the 20 by 20 left singular vector matrix U and the 4 by 4 right singular
vector matrix V:

Singular values

S    = Vector of    4 elements

     6.23289       5.81154       4.30460       3.01437

Right Singular vectors

V    = Matrix of    4 by    4 elements

          1             2             3             4
 1   0.606725      0.545782     -0.525263     -0.241052
 2   0.598306     -0.263772E-01  0.767939     -0.227166
 3  -0.338768      0.833411      0.363935      0.241276
 4  -0.398937      0.827817E-01  0.438230E-01 -0.912182

=> CALL PRINT('Test of Factorization. Is S along diagonal?',
=>            'Transpose(u)*x*v',TRANSPOSE(U)*X*V,
=>            'Is U orthagonal?','transpose(U)*U', TRANSPOSE(U)*U,
=>            'Is V orthagonal?','transpose(V)*V', TRANSPOSE(V)*V, ' ')$

The factorization tests show that Transpose(u)*x*v has the singular values
along the diagonal with off-diagonal elements of order 1E-15, and that both
transpose(U)*U and transpose(V)*V equal the identity to the same accuracy.

=> N=30$
=> K=5$
=> X       =RN(MATRIX(N,K:))$
=> X(,1)   =1.0$
=> BETA    =VECTOR(5:1. 2. 3. 4. 5.)$
=> Y       =X*BETA +RN(VECTOR(N:))$
=> XPX     =TRANSPOSE(X)*X$
=> *        SOLVE REDUCED PROBLEM$
=> S       =SVD(X,BAD,21,U,V)$
=> SIGMA   =DIAGMAT(S)$
=> BETAHAT1=INV(XPX)*TRANSPOSE(X)*Y$
=> BETAHAT2=V*INV(SIGMA)*TRANSPOSE(U)*Y$
=> CALL PRINT('OLS from two approaches',BETAHAT1,BETAHAT2)$

OLS from two approaches

BETAHAT1 and BETAHAT2 print as identical 5 element vectors.

=> X=RN(MATRIX(5,5:))$
=> XPX=TRANSPOSE(X)*X$
=> E=EIG(XPX)$
=> EE=SEIG(XPX)$
=> S=SVD(XPX)$
=> CALL PRINT(E,EE,S)$

E    = Complex Vector of    5 elements

     ( 13.45  , 0.000 )  ( 1.472  , 0.000 )  ( 5.467  , 0.000 )
     ( 0.3605 , 0.000 )  ( 2.253  , 0.000 )

EE   = Vector of    5 elements

     0.360474      1.47185       2.25274       5.46686       13.4516

S    = Vector of    5 elements

     13.4516       5.46686       2.25274       1.47185       0.360474

B34S Matrix Command Ending. Last Command reached.

Space available in allocator    2881535, peak space used      6234
Number variables used                88, peak number used        97
Number temp variables used          407, # user temp clean        0

16.5 Extended Eigenvalue Analysis

Recall from (16.4-6) that given a general matrix A, an eigenvalue
decomposition writes AV = VD or A = VDV⁻¹, where V is the usual "left hand"
eigenvector matrix and D is a diagonal matrix with the eigenvalues λᵢ along
the diagonal. It should be noted that the matrix V is not unique and can be
scaled.
The matrix command e=eig(a,V) will use the eispack routines rg and cg to produce a non-scaled eigenvector matrix, while the commands e=eig(a,v :LAPACK) or e=eig(a,v :LAPACK2)will produce eigenvalues using lapack routines dgeevx/zgeevx or dgeev/zgeev where V is scaled so that each column has a norm of 1. The dgeevx/zgeevx call does not balance the matrix while dgeev/zgeev does. In addition to the usual eigenvectors, it is possible to define "right handed" eigenvectors V * where: (V * ) H A D(V * ) H A ((V * ) H )1 D(V * ) H (16.5-1) The below listed code illustrates these refinements for real*8 and complex*16 matrices. We first estimate and test the non-scaled eigenvectors evec which were estimated using eispack. Next the lapack code is used to estimate and test the right and left handed eigenvectors. b34sexec matrix; * Exercises Eigenvalue calculations ; * IMSL test case ; A = matrix(3,3: 8.0, -1.0,-5.0, -4.0, 4.0,-2.0, 18.0, -5.0,-7.0); e =eig(a,evec); call print('Test Eispack',a,evec*diagmat(e)*inv(evec)); e2 =eig(a,evecr,evecl :lapack); call print('test eispack vs lapack':); call print(a,e,evec,e2,evecr,evecl); call print('test right' evecr*diagmat(e2)*inv(evecr) 'test left' inv(transpose(dconj(evecl)))*diagmat(e2)*transpose(dconj(evecl))); ca=complex(a,a*a); e =eig(ca,evec); call print('Test Eispack factorization', ca,evec*diagmat(e)*inv(evec)); e2 =eig(ca,evecr,evecl :lapack); call print('test eispack vs lapack':); call print(ca,e,evec,e2,evecr,evecl); call print('test right' evecr*diagmat(e2)*inv(evecr) 'test left' 82 Matrix Command Language inv(transpose(dconj(evecl)))*diagmat(e2)*transpose(dconj(evecl))); b34srun; Edited output is listed next. B34S(r) Matrix Command. Date of Run d/m/y Version February 2004. 26/ 2/04. Time of Run h:m:s 14:55:16. 
=> * EXERCISES EIGENVALUE CALCULATIONS $
=> * IMSL TEST CASE $
=> A = MATRIX(3,3: 8.0, -1.0,-5.0, -4.0, 4.0,-2.0, 18.0, -5.0,-7.0)$
=> E =EIG(A,EVEC)$
=> CALL PRINT('Test Eispack',A,EVEC*DIAGMAT(E)*INV(EVEC))$

Test Eispack

A    = Matrix of    3 by    3 elements

          1             2             3
 1   8.00000      -1.00000      -5.00000
 2  -4.00000       4.00000      -2.00000
 3   18.0000      -5.00000      -7.00000

EVEC*DIAGMAT(E)*INV(EVEC) reproduces A as a complex matrix with imaginary
parts of order 1E-14. The eigenvalues are

E    = Complex Vector of    3 elements

     ( 2.000 , 4.000 )  ( 2.000 , -4.000 )  ( 1.000 , 0.000 )

The LAPACK call E2=EIG(A,EVECR,EVECL :LAPACK) returns the same eigenvalues
together with the scaled right handed eigenvectors EVECR and left handed
eigenvectors EVECL, each column having norm 1. Both the test

     evecr*diagmat(e2)*inv(evecr)

and the test

     inv(transpose(dconj(evecl)))*diagmat(e2)*transpose(dconj(evecl))

reproduce A with errors of order 1E-15. For the complex matrix
CA=COMPLEX(A,A*A) the EISPACK and LAPACK routines again agree: the
eigenvalues are

E    = Complex Vector of    3 elements

     ( -14.00 , -8.000 )  ( 1.000 , 1.000 )  ( 18.00 , -16.00 )

and both the right and left handed factorizations recover CA.

B34S Matrix Command Ending. Last Command reached.

Space available in allocator    2874697, peak space used      1518
Number variables used                28, peak number used        40
Number temp variables used           86, # user temp clean        0

The reason the matrix command has both types of eigenvalue routines is
that, while above order 200 the LAPACK code is faster, below order 200 the
EISPACK code is faster. The next job tests this and in addition uses the
EISPACK symmetric matrix codes tred1/imtql1 and tred2/imtql2 for
eigenvalues alone and for eigenvalues plus eigenvectors respectively. In
the following series of tests, where appropriate, two computers were used,
both running Windows XP Professional. The Dell Latitude Model C810 had an
Intel Family 6 Model 11 CPU with a speed of 1,122 MHz as measured by MS
Word version 2003 and 512 MB of memory, and represents the "low end". The
Dell Model 650 workstation has two dual core Xeon CPUs, each 3,056 MHz as
measured by MS Word, and represents "high end" computing capability.
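The benchmark job that follows uses a simple pattern: grow the matrix order by a fixed mesh, time each routine at each order with paired timer calls, and tabulate order against elapsed time. A minimal Python sketch of that timing loop follows; the workload (a naive triple-loop matrix multiply) is only a stand-in for the eigenvalue routines, and the mesh/upper values are illustrative, not the ones used in the B34S job:

```python
import random
import time

def workload(n):
    # stand-in for an O(n^3) routine such as an eigenvalue solver
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [[random.random() for _ in range(n)] for _ in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def time_by_size(mesh=20, upper=100):
    # mirror the B34S loop: n grows by mesh until it reaches upper
    sizes, seconds = [], []
    n = 0
    while True:
        n += mesh
        if n == upper:
            break
        base1 = time.perf_counter()   # call timer(base1)
        workload(n)
        base2 = time.perf_counter()   # call timer(base2)
        sizes.append(n)
        seconds.append(base2 - base1)
    return sizes, seconds

sizes, seconds = time_by_size()
for n, s in zip(sizes, seconds):
    print(n, round(s, 4))
```

As in the B34S job, timing each call in isolation (and, there, calling compress between routines to control the workspace) is what makes the per-routine comparison across orders meaningful.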
b34sexec matrix;
* ispeed1 on pd matrix ;
* ispeed2 on general matrix;
* ispeed3 on complex general matrix;
* up 625 has been run ;
igraph=0;
ispeed1=1; ispeed2=1; ispeed3=1;
upper=450; mesh=50;
/$ PD Results
if(ispeed1.ne.0)then;
call echooff;
icount=0; n=0;
top continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done;
x=rec(matrix(n,n:));
x=transpose(x)*x;
* x=complex(x,dsqrt(dabs(x)));
call compress;
call timer(base10);
e=seig(x);
call timer(base20);
call compress;
call timer(base110);
e=seig(x,evec);
call timer(base220);
call compress;
call timer(base11);
e=eig(x);
call timer(base22);
call compress;
call timer(base111);
e=eig(x:lapack2);
call timer(base222);
call compress;
call timer(base1);
e=eig(x,evec);
call timer(base2);
call compress;
call timer(base3);
e=eig(x,evec,evec2 :lapack2);
call timer(base4);
call compress;
call timer(base5);
e=eig(x,evec:lapack2);
call timer(base6);
size(icount)    =dfloat(n);
sm1(icount)     =base20-base10;
sm2(icount)     =base220-base110;
eispack1(icount)=(base22-base11);
lapack1(icount) =(base222-base111);
eispack2(icount)=(base2-base1);
lapack2a(icount)=(base4-base3);
lapack2b(icount)=(base6-base5);
call free(x,xinv1,ii);
go to top;
done continue;
call print('EISPACK vs LAPACK on PD Matrix ':);
call print('lapack2a gets both right and left eigenvectors':);
call tabulate(size,sm1,sm2,eispack1,lapack1,eispack2,lapack2a,lapack2b);
if(igraph.eq.1)
call graph(size sm1,sm2,eispack1,lapack1,eispack2,lapack2a,lapack2b
  :plottype xyplot :nokey :file 'pd_matrix.wmf'
  :heading 'Real*8 PD Matrix Results');
endif;
if(ispeed2.ne.0)then;
call echooff;
icount=0; n=0;
top2 continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done2;
x=rec(matrix(n,n:));
* x=transpose(x)*x;
* x=complex(x,dsqrt(dabs(x)));
call compress;
call timer(base11);
e=eig(x);
call timer(base22);
call compress;
call timer(base111);
e=eig(x:lapack2);
call timer(base222);
call compress;
call timer(base1);
e=eig(x,evec);
call timer(base2);
call compress;
call timer(base3);
e=eig(x,evec,evec2 :lapack2);
call timer(base4);
call compress;
call timer(base5);
e=eig(x,evec:lapack2);
call timer(base6);
size(icount)    =dfloat(n);
eispack1(icount)=(base22-base11);
lapack1(icount) =(base222-base111);
eispack2(icount)=(base2-base1);
lapack2a(icount)=(base4-base3);
lapack2b(icount)=(base6-base5);
call free(x,xinv1,ii);
go to top2;
done2 continue;
call print('EISPACK vs LAPACK on General Matrix ':);
call print('lapack2a gets both right and left eigenvectors':);
call tabulate(size,eispack1,lapack1,eispack2,lapack2a,lapack2b);
if(igraph.eq.1)
call graph(size,eispack1,lapack1,eispack2,lapack2a,lapack2b
  :plottype xyplot :nokey :file 'real_8.wmf'
  :heading 'Real*8 General Matrix Results');
endif;
if(ispeed3.ne.0)then;
call echooff;
icount=0; n=0;
top3 continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done3;
x=rec(matrix(n,n:));
x=complex(x,dsqrt(dabs(x)));
call compress;
call timer(base11);
e=eig(x);
call timer(base22);
call compress;
call timer(base111);
e=eig(x:lapack2);
call timer(base222);
call compress;
call timer(base1);
e=eig(x,evec);
call timer(base2);
call compress;
call timer(base3);
e=eig(x,evec,evec2 :lapack2);
call timer(base4);
call compress;
call timer(base5);
e=eig(x,evec:lapack2);
call timer(base6);
size(icount)    =dfloat(n);
eispack1(icount)=(base22-base11);
lapack1(icount) =(base222-base111);
eispack2(icount)=(base2-base1);
lapack2a(icount)=(base4-base3);
lapack2b(icount)=(base6-base5);
call free(x,xinv1,ii);
go to top3;
done3 continue;
call print('EISPACK vs LAPACK on a Complex General Matrix ':);
call print('lapack2a gets both right and left eigenvectors':);
call tabulate(size,eispack1,lapack1,eispack2,lapack2a,lapack2b);
if(igraph.eq.1)
call graph(size,eispack1,lapack1,eispack2,lapack2a,lapack2b
  :plottype xyplot :nokey :file 'complex_16.wmf'
  :heading 'Complex*16 Results');
endif;
b34srun;

Results from running this script on the Dell 650 workstation are shown next. Note that:

SM1      => Eispack, eigenvalues only          tred1/imtql1
SM2      => Eispack, eigenvalues and vectors   tred2/imtql2
Eispack1 => Eispack, eigenvalues only          rg
Lapack1  => Lapack, eigenvalues only           DGEEVX
Eispack2 => Eispack, eigenvalues and vectors   rg
Lapack2a => Lapack, both right and left eigenvectors   DGEEVX
Lapack2b => Lapack, eigenvalues and vectors    DGEEVX
EISPACK vs LAPACK on PD Matrix
lapack2a gets both right and left eigenvectors

Obs  SIZE   SM1        SM2        EISPACK1   LAPACK1    EISPACK2   LAPACK2A   LAPACK2B
 1   50.00  0.000      0.000      0.000      0.000      0.1562E-01 0.000      0.000
 2   100.0  0.000      0.1562E-01 0.1562E-01 0.1562E-01 0.1562E-01 0.3125E-01 0.4688E-01
 3   150.0  0.1562E-01 0.1562E-01 0.3125E-01 0.4688E-01 0.6250E-01 0.1250     0.1250
 4   200.0  0.1562E-01 0.6250E-01 0.6250E-01 0.1250     0.1719     0.3125     0.2969
 5   250.0  0.3125E-01 0.1250     0.1406     0.2500     0.3438     0.6875     0.6406
 6   300.0  0.4688E-01 0.2188     0.2656     0.4375     0.6875     1.156      1.062
 7   350.0  0.7812E-01 0.4375     0.4531     0.7188     1.188      1.828      1.688
 8   400.0  0.1406     0.6875     0.7656     1.094      1.938      2.766      2.609

EISPACK vs LAPACK on General Matrix
lapack2a gets both right and left eigenvectors

Obs  SIZE   EISPACK1   LAPACK1    EISPACK2   LAPACK2A   LAPACK2B
 1   50.00  0.000      0.000      0.1562E-01 0.000      0.1562E-01
 2   100.0  0.1562E-01 0.1562E-01 0.3125E-01 0.3125E-01 0.4688E-01
 3   150.0  0.4688E-01 0.4688E-01 0.9375E-01 0.1094     0.1094
 4   200.0  0.9375E-01 0.1094     0.2344     0.2656     0.2500
 5   250.0  0.1875     0.2031     0.4844     0.5781     0.5312
 6   300.0  0.3438     0.3594     0.9688     0.9844     0.9219
 7   350.0  0.5781     0.5938     1.609      1.594      1.453
 8   400.0  0.9844     0.8906     2.578      2.422      2.234

EISPACK vs LAPACK on a Complex General Matrix
lapack2a gets both right and left eigenvectors

Obs  SIZE   EISPACK1   LAPACK1    EISPACK2   LAPACK2A   LAPACK2B
 1   50.00  0.1562E-01 0.1562E-01 0.3125E-01 0.1562E-01 0.1562E-01
 2   100.0  0.4688E-01 0.4688E-01 0.1875     0.9375E-01 0.7812E-01
 3   150.0  0.1719     0.1094     0.6875     0.2812     0.2656
 4   200.0  0.3906     0.2656     1.609      0.6875     0.6406
 5   250.0  0.8594     0.5625     3.344      1.438      1.328
 6   300.0  1.672      0.9062     6.031      2.375      2.188
 7   350.0  3.000      1.438      9.922      3.812      3.531
 8   400.0  4.359      2.141      14.47      5.688      5.312

By using the appropriate routine to calculate the eigenvalues (TRED1/IMTQL1) of the PD matrix, for order 400 the cost is 18.37% (.1406/.7656) of the real*8 general-matrix eispack routine RG, or 12.85% (.1406/1.094) of the more expensive lapack routine DGEEVX.
What is surprising is that even at a size of 400 by 400, eispack dominates lapack in terms of time (.7656 vs 1.094) if only eigenvalues are requested. This gain appears to carry through to calculations involving eigenvectors. Since the column LAPACK2A involves calculation of both right- and left-hand-side eigenvectors, the correct column to compare is LAPACK2B. Here the eispack time is 74.28% (1.938/2.609) of the lapack time.

For a real*8 general matrix the results are somewhat different. For just eigenvalues, lapack does not overtake eispack until about order 400 (.8906 vs .9844). If eigenvectors are requested, lapack begins to dominate in the range 300 to 350. At 200 and 250 the eispack/lapack times were close to the same (.2344/.2500 and .4844/.5312), while at 300 these became (.9688/.9219) as lapack became slightly faster. At 400 the times were (2.578/2.234) and lapack was emerging as the clear winner. The lesson to be drawn is that the bigger the system, the more appropriate it is to use lapack. Note that the option :lapack2 has been used, which turns off balancing. Thus the reported tests are biased in favor of lapack, since eispack balances. It has been found, and noted by others, that balancing can be dangerous, especially for complex matrices. The job speed1.b34 in c:\b34slm\b34stest\ can be used to benchmark these results on different machines.

For a complex*16 matrix the crossover appears between 100 and 150 if only eigenvalues are requested. At 150 lapack is the clear winner with a time of .1094 vs eispack's .1719. By an order 400 matrix the times were 4.359 and 2.141 respectively, making lapack over 2.0 times faster. If both eigenvalues and eigenvectors are requested for an order 400 matrix, the relative times were 14.47 and 5.312, where lapack is 2.724 times faster. Since most problems are relatively small, these results suggest that for the time being eispack should be the default eigen routine for both real*8 and complex*16 problems.
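The pattern that eigenvalues-only calls are much cheaper than eigenvalue-plus-eigenvector calls is not specific to B34S; it can be reproduced in any LAPACK-backed environment. A minimal NumPy sketch of the benchmark loop above (illustrative only; the routine names are NumPy's, not B34S's):

```python
import numpy as np
import time

# Time eigenvalues-only versus eigenvalues-plus-eigenvectors for a
# symmetric positive definite matrix, as in the PD test above.
rng = np.random.default_rng(0)
for n in (100, 200):
    x = rng.standard_normal((n, n))
    x = x.T @ x                     # positive definite, as in the text

    t0 = time.perf_counter()
    w1 = np.linalg.eigvalsh(x)      # eigenvalues only
    t1 = time.perf_counter()

    t2 = time.perf_counter()
    w2, v = np.linalg.eigh(x)       # eigenvalues and eigenvectors
    t3 = time.perf_counter()

    # Same eigenvalues either way; the vectors cost extra time.
    assert np.allclose(w1, w2)
    print(f"n={n}: values only {t1-t0:.4f}s, values+vectors {t3-t2:.4f}s")
```

As in the B34S runs, the relative gap between the two calls grows with the order of the matrix.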
An additional advantage of eispack is that it runs with substantially less work memory. The lapack code was developed to run large matrices and makes use of a block design. This seems to work especially well with complex*16 matrices. For real*16 and complex*32 matrices, lapack is not available and specially modified versions of the eispack routines are used.

The option :lapack, which calls DGEEV/ZGEEV, runs into problems on large complex*16 matrices due to balancing. The effect of permutations and scaling can be investigated using the options :lapackp, :lapacks and :lapackb, which turn on and off various options in DGEEVX/ZGEEVX. The above test problem appears to fail for large complex matrices generated using the sample code when these options are turned on. Section 16.6 discusses the ilaenv routine whereby alternative blocksizes can be investigated. It is to be noted that the above times can be influenced by the size of the workspace, which appears to influence memory access speed. The default B34S setting is to let lapack obtain the optimum work space.

16.6 A Preliminary Investigation of Inversion Speed Differences

There are substantial speed differences between various approaches to matrix inversion. The program listed below shows the relative speed of inverting a positive definite matrix of size 50 to 600 using lapack, linpack, a SVD (for the pseudo inverse) and a Cholesky decomposition using linpack.

b34sexec matrix;
* Tests speed of Linpack vs LAPACK vs svd (pinv) vs Cholesky ;
* Requires a large size ;
call echooff;
icount=0; n=0;
upper=600; mesh=50;
top continue;
icount=icount+1;
n=n+mesh;
if(n .gt. upper)go to done;
call print('Doing size ',n:);
x=rn(matrix(n,n:));
x=transpose(x)*x;
ii=matrix(n,n:)+1.;
/$ Use LAPACK LU
call compress;
call timer(base1);
xinv1=inv(x:gmat);
call timer(base2);
error1(icount)=sum(dabs(ii-(xinv1*x)));
/$ Use LINPACK 'Default' LU
call compress;
call timer(base3);
xinv1=inv(x);
call timer(base4);
error2(icount)=sum(dabs(ii-(xinv1*x)));
/$ Use IMSL pinv code
call compress;
call timer(base5);
xinv1=pinv(x);
call timer(base6);
error3(icount)=sum(dabs(ii-(xinv1*x)));
/$ Use Linpack DPOCO / DPODI Code
call compress;
call timer(base7);
xinv1=inv(x :pdmat);
call timer(base8);
error4(icount)=sum(dabs(ii-(xinv1*x)));
size(icount)   =dfloat(n);
lapack(icount) =(base2-base1);
linpack(icount)=(base4-base3);
svdt(icount)   =(base6-base5);
chol(icount)   =(base8-base7);
call free(x,xinv1,ii);
go to top;
done continue;
call tabulate(size,lapack,linpack,svdt,chol,
              error1,error2,error3,error4);
call graph(size lapack,linpack :heading 'Lapack Vs Linpack'
  :plottype xyplot);
call graph(size lapack,linpack svdt :heading 'LAPACK vs Linpack vs SVD'
  :plottype xyplot);
b34srun;

Edited output from running the above program on a Dell 650 workstation using the built-in Fortran CPU timer produces:

Obs  SIZE   LAPACK     LINPACK    SVDT       CHOL       ERROR1     ERROR2     ERROR3     ERROR4
 1   50.00  0.000      0.000      0.1562E-01 0.000      0.2129E-08 0.2129E-08 0.4866E-08 0.2845E-08
 2   100.0  0.000      0.1562E-01 0.1562E-01 0.1562E-01 0.1133E-05 0.1109E-05 0.2193E-05 0.1418E-05
 3   150.0  0.1562E-01 0.1562E-01 0.6250E-01 0.000      0.1352E-08 0.1315E-08 0.3145E-08 0.1685E-08
 4   200.0  0.1562E-01 0.1562E-01 0.1406     0.1562E-01 0.1432E-05 0.1425E-05 0.3864E-05 0.1920E-05
 5   250.0  0.3125E-01 0.3125E-01 0.3438     0.1562E-01 0.1982E-08 0.1957E-08 0.4886E-08 0.2428E-08
 6   300.0  0.7812E-01 0.7812E-01 0.7812     0.3125E-01 0.1292E-05 0.1297E-05 0.3265E-05 0.1692E-05
 7   350.0  0.1406     0.1562     1.281      0.7812E-01 0.2907E-07 0.2874E-07 0.6778E-07 0.3556E-07
 8   400.0  0.2031     0.2656     1.906      0.1250     0.5379E-06 0.5386E-06 0.1298E-05 0.6448E-06
 9   450.0  0.2812     0.3750     2.766      0.2031     0.4115E-06 0.4115E-06 0.1069E-05 0.5292E-06
10   500.0  0.3906     0.8281     5.328      0.4531     0.1325E-06 0.1315E-06 0.3427E-06 0.1671E-06
11   550.0  0.5469     0.8594     6.562      0.5312     0.1731E-06 0.1731E-06 0.4113E-06 0.2253E-06
12   600.0  0.8906     1.141      6.797      0.6719     0.6622E-06 0.6546E-06 0.1540E-05 0.7743E-06

The lapack code uses dgetrf/dgecon/dgetri, linpack uses dgeco/dgefa/dgedi and SVD uses the IMSL routine dslgrr, which internally calls the linpack SVD routine dsvdc. The blocksize of the lapack workspace was optimized by a preliminary call to dgetrf, which in turn calls the lapack ilaenv routine. All routines were compiled with Lahey Fortran LF95 version 7.1 except for the IMSL code, which was compiled with an earlier release of Lahey. Note that there is a call to compress before each test. If this is removed and placed right before go to top; there will be a noticeable speed difference, especially for the large systems. This is due to the fact that there is unused temporary space in memory and new temp variables are allocated quite a distance away from the x matrix. This "thrashing of memory" slows things down.

The matrix command inv( ) defaults to the linpack LU solver since, as this example shows, for matrix sizes of 300 and smaller linpack runs at a speed near that of lapack. For larger systems, lapack runs faster.12 For example, for order 600 the gain is ~21.95% ((1.141-.8906)/1.141). Exploiting the fact that the test matrix is positive definite, the Cholesky inverter gives a gain of ~41.11% ((1.141-.6719)/1.141) over the linpack LU solver. The SVD approach is ~10.12 times more costly (6.797/.6719) than the Cholesky and ~5.96 times (6.797/1.141) more expensive than the linpack LU. For the problems run, the error, calculated as the sum of the absolute elements of I minus the computed inverse times X, is comparable across methods. The inv command uses linpack as the default inverter. Users with large problems involving general matrix systems should use inv(x:gmat) to get the LAPACK routines.
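The structure of the benchmark above, inverting by several routes and accumulating the sum of the absolute elements of I minus the inverse times X, can be sketched in a few lines with SciPy standing in for the linpack, lapack and IMSL routines (an illustrative analogue, not the B34S code):

```python
import numpy as np
from scipy import linalg

# Invert a positive definite matrix three ways and check each inverse
# with the error metric used in the text: sum(|I - Xinv*X|).
rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal((n, n))
x = x.T @ x                                  # positive definite
eye = np.eye(n)

inv_lu   = linalg.inv(x)                     # LU-based general inverse
inv_pinv = linalg.pinv(x)                    # SVD-based pseudo inverse
c, low   = linalg.cho_factor(x)
inv_chol = linalg.cho_solve((c, low), eye)   # Cholesky-based inverse

for xinv in (inv_lu, inv_pinv, inv_chol):
    err = np.abs(eye - xinv @ x).sum()       # error metric from the text
    assert err < 1e-3
```

All three inverses satisfy the check; as in the B34S results, the differences among the routes are in speed rather than accuracy for a well-conditioned matrix.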
The subroutine gminv uses lapack by default and optionally returns an indicator of whether the matrix is not of full rank. Prior to a call to a lapack routine, the routine ilaenv is called to determine the optimum blocksize, which is then used to set the work space. By default these settings are used; the B34S user is given the option of overriding them, although most users may not have sufficient knowledge to make this choice.

Since many of the matrices in econometrics are positive definite, it is wasteful to calculate inverses with the general matrix LU approach. In addition, since both linpack and lapack can detect whether the matrix is positive definite, by using the Cholesky approach one obtains speed and in addition a built-in error check against possible problems. While the linpack routines dpoco/dpodi have stood the test of time since the 1970's, the lapack Cholesky routines dpotrf/dpocon/dpotri appear to have speed advantages for larger systems that are even greater than those found for the LU inverters. The following code illustrates what was found:

b34sexec matrix;
* Tests speed of Linpack vs LAPACK ;
* Uses PD matrix;
call echooff;
icount=0; n=0;
upper=700; mesh=25;
top continue;
icount=icount+1;
n=n+mesh;
if(n .eq. upper)go to done;
x=rn(matrix(n,n:));
x=transpose(x)*x;
ii=matrix(n,n:)+1.;
/$ LINPACK PDF
call compress;
call timer(base11);
xinv1=inv(x :pdmat);
call timer(base22);
error(icount)=sumsq(ii-(xinv1*x));
/$ LAPACK PDF
call compress;
call timer(base111);
xinv1=inv(x :pdmat2);
call timer(base222);
error0(icount)=sumsq(ii-(xinv1*x));
/$ LAPACK LU
call compress;
call timer(base1);
xinv1=inv(x:gmat);
call timer(base2);
error1(icount)=sumsq(ii-(xinv1*x));
/$ LINPACK LU
call compress;
call timer(base3);
xinv1=inv(x);
call timer(base4);
error2(icount)=sumsq(ii-(xinv1*x));
size(icount)   =dfloat(n);
pdmat(icount)  =(base22-base11);
pdmat2(icount) =(base222-base111);
lapack(icount) =(base2-base1);
linpack(icount)=(base4-base3);
call free(x,xinv1,ii);
go to top;
done continue;
call print('LINPACK Cholesky vs LAPACK Cholesky':);
call tabulate(size,pdmat,pdmat2,error,error0,
              lapack,linpack,error1,error2);
call graph(size pdmat,pdmat2,lapack,linpack :nokey :plottype xyplot);
b34srun;

12 The tests shown here use as the default the optimum blocksize for lapack. A later example investigates the effect of blocksize changes on speed. At issue is the fact that the lapack default workspace is quite large. Users with large problems may run out of memory and have to reduce the blocksize of the calculation.

The above program was run on the Dell 650 machine with dual 3,056 MHz Xeon processors, each with two cores. The operating system was XP Professional and the memory was 4,016 MB.

B34S(r) Matrix Command. d/m/y 3/ 7/07. h:m:s 12:34: 0.
=> * TESTS SPEED OF LINPACK VS LAPACK $

=> * USES PD MATRIX$

=> CALL ECHOOFF$

LINPACK Cholesky vs LAPACK Cholesky

Obs  SIZE   PDMAT      PDMAT2     ERROR      ERROR0     LAPACK     LINPACK    ERROR1     ERROR2
 1   25.00  0.000      0.000      0.7573E-25 0.3868E-25 0.000      0.000      0.2938E-25 0.2647E-25
 2   50.00  0.000      0.000      0.1331E-23 0.1380E-23 0.000      0.000      0.8757E-24 0.9335E-24
 3   75.00  0.000      0.000      0.3375E-22 0.4386E-22 0.000      0.000      0.2817E-22 0.2816E-22
 4   100.0  0.000      0.000      0.4514E-21 0.9178E-21 0.000      0.000      0.4007E-21 0.4350E-21
 5   125.0  0.000      0.000      0.8331E-15 0.6825E-15 0.000      0.000      0.4070E-15 0.3993E-15
 6   150.0  0.000      0.1562E-01 0.7372E-19 0.9395E-19 0.1562E-01 0.1562E-01 0.7390E-19 0.7555E-19
 7   175.0  0.1562E-01 0.1562E-01 0.2189E-19 0.3052E-19 0.1562E-01 0.000      0.2251E-19 0.2182E-19
 8   200.0  0.1562E-01 0.1562E-01 0.2318E-20 0.2453E-20 0.1562E-01 0.1562E-01 0.1688E-20 0.1794E-20
 9   225.0  0.1562E-01 0.1562E-01 0.3924E-21 0.3721E-21 0.3125E-01 0.1562E-01 0.2295E-21 0.2246E-21
10   250.0  0.3125E-01 0.1562E-01 0.2808E-19 0.3311E-19 0.4688E-01 0.3125E-01 0.1474E-19 0.1403E-19
11   275.0  0.3125E-01 0.3125E-01 0.3262E-20 0.3096E-20 0.4688E-01 0.4688E-01 0.1694E-20 0.1689E-20
12   300.0  0.3125E-01 0.4688E-01 0.1200E-19 0.1277E-19 0.6250E-01 0.9375E-01 0.9001E-20 0.8945E-20
13   325.0  0.4688E-01 0.6250E-01 0.5676E-19 0.7629E-19 0.9375E-01 0.1250     0.5762E-19 0.5736E-19
14   350.0  0.7812E-01 0.6250E-01 0.1672E-18 0.1691E-18 0.1406     0.1719     0.1120E-18 0.1095E-18
15   375.0  0.9375E-01 0.7812E-01 0.1625E-17 0.1808E-17 0.1562     0.2188     0.9533E-18 0.9562E-18
16   400.0  0.1250     0.9375E-01 0.1240E-18 0.1074E-18 0.2031     0.2812     0.8809E-19 0.8789E-19
17   425.0  0.1562     0.1250     0.8648E-19 0.9579E-19 0.2344     0.3281     0.5594E-19 0.5497E-19
18   450.0  0.2031     0.1406     0.3484E-17 0.3320E-17 0.2812     0.4062     0.2044E-17 0.2065E-17
19   475.0  0.2656     0.1719     0.5813E-17 0.5600E-17 0.3281     0.5312     0.3526E-17 0.3493E-17
20   500.0  0.3438     0.1875     0.6721E-18 0.6349E-18 0.3906     0.6562     0.4566E-18 0.4522E-18
21   525.0  0.4375     0.2500     0.2468E-16 0.2307E-16 0.4531     0.7969     0.1327E-16 0.1301E-16
22   550.0  0.5469     0.2812     0.1045E-17 0.9424E-18 0.5469     0.9531     0.7014E-18 0.6929E-18
23   575.0  0.6406     0.3438     0.2694E-19 0.2545E-19 0.6094     1.125      0.2018E-19 0.1994E-19
24   600.0  0.7656     0.3750     0.3614E-14 0.3663E-14 0.7188     1.297      0.2943E-14 0.2858E-14
25   625.0  0.8750     0.4375     0.3936E-16 0.4080E-16 0.7969     1.516      0.2895E-16 0.2929E-16
26   650.0  1.016      0.4688     0.3966E-16 0.4584E-16 0.9062     1.672      0.3556E-16 0.3567E-16
27   675.0  1.141      0.5625     0.1106E-16 0.1359E-16 1.000      1.891      0.9189E-17 0.9295E-17

For matrices above order 325 there are increasing speed gains from using the lapack Cholesky inverter. At 675 the gain was on the order of a 50.70% reduction in cost ((1.141-.5625)/1.141) if lapack was used. An uninformed user who went with a general matrix inverter and selected the linpack LU inverter would have found that costs went up 1.66 times (1.891/1.141) over the linpack Cholesky solver and 3.363 times (1.891/.5625) over what could be obtained with the lapack Cholesky routine.

A number of researchers have stayed with linpack due to possible accuracy issues involving lapack. These appear to be related to not using the lapack subroutines correctly. For example, the lapack LU factoring routine dgetrf provides a return code INFO that, in their words, is set > 0 if "U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations." Experience tells us that it is dangerous to proceed in near-singular cases that are not trapped by INFO and that a call to dgecon to get the condition number is in order. As an example, consider attempting to invert

x=matrix(3,3:1 2 3 4 5 6 7 8 9);

if the condition is not checked or is ignored. The linpack and lapack rcond values of this matrix, 0.20559686E-17 and 1.541976423090495E-18 respectively, are not different from zero when tested as

if((rcond+1.0d+00).eq.1.0d+00)write(6,*)'Matrix is near Singular'

in Fortran.
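The same guard can be written in any environment that exposes a condition estimate. A NumPy sketch of the check described above (illustrative only; NumPy's estimate here is SVD-based rather than the 1-norm estimator used by dgecon, so the numerical value differs while the conclusion does not):

```python
import numpy as np

# The exactly singular matrix from the text.
x = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

# Reciprocal condition number from the singular values; for a singular
# matrix the smallest singular value is on the order of machine epsilon
# times the largest, so rcond is effectively zero.
s = np.linalg.svd(x, compute_uv=False)
rcond = s[-1] / s[0]

# Analogue of the Fortran test if((rcond+1.0d0).eq.1.0d0): rcond is
# indistinguishable from zero at working precision.
assert rcond < 100 * np.finfo(float).eps
```

When this test fires, the inverse should not be trusted, no matter what a general-purpose inverter returns.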
Letting the inverse proceed is very dangerous, as is illustrated with the following MATLAB code:

>> x=[1 2 3;4 5 6; 7 8 9]

x =
     1     2     3
     4     5     6
     7     8     9

>> ix=inv(x)
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 1.541976e-018.

ix =
  -4.5036e+015   9.0072e+015  -4.5036e+015
   9.0072e+015  -1.8014e+016   9.0072e+015
  -4.5036e+015   9.0072e+015  -4.5036e+015

>> ix*x

ans =
     4     0     0
     0     8     0
     4     0     0

where the computed inverse clearly fails the check ix*x = I. If there is concern over the accuracy of an OLS model, the QR approach can be used. Another approach would be to use real*16 math, which is possible with the matrix command: the command r8tor16 will convert a matrix prior to a call to inv. Possible accuracy improvements can be obtained in a matrix inversion using refinement, or equalization plus refinement. These are done using the lapack routines dgesvx/zgesvx at the cost of a reduction in speed. The sample job gminv_4 in matrix.mac addresses these accuracy questions.

b34sexec matrix;
call echooff;
n=100;
* test1 and test3 use LAPACK ;
x=rn(matrix(n,n:));
* to show effect of balancing uncomment next statement;
x(1,)=x(1,)*100000.;
call gminv(x,xinv1,info);
xinv2=inv(x);
xinv3=inv(x:gmat);
j=inv(x,rcond:gmat);
j=inv(x,rcond2);
xinv4=inv(x,rcond3 :refine);
xinv5=inv(x,rcond4 :refinee);
dtest=matrix(n,n:)+1.0;
test1=x*xinv1; test2=x*xinv2; test3=x*xinv3;
test4=x*xinv4; test5=x*xinv5;
if(n.le.5)call print(x,xinv1,xinv2,xinv3,test1,test2,test3);
call print('Matrix is of order ',n:);
call print('LAPACK 3 => refine':);
call print('LAPACK 4 => refinee':);
call print('Max Error for LAPACK 1', dmax(dabs(dtest-test1)):);
call print('Max Error for LAPACK 2', dmax(dabs(dtest-test3)):);
call print('Max Error for LAPACK 3', dmax(dabs(dtest-test4)):);
call print('Max Error for LAPACK 4', dmax(dabs(dtest-test5)):);
call print('Max Error for LINPACK ', dmax(dabs(dtest-test2)):);
call print('Sum Error for LAPACK 1', sum(dabs(dtest-test1)):);
call print('Sum Error for LAPACK 2', sum(dabs(dtest-test3)):);
call print('Sum Error for LAPACK 3', sum(dabs(dtest-test4)):);
call print('Sum Error for LAPACK 4', sum(dabs(dtest-test5)):);
call print('Sum Error for LINPACK ', sum(dabs(dtest-test2)):);
call print('Sumsq Error for LAPACK 1',sumsq(dtest-test1):);
call print('Sumsq Error for LAPACK 2',sumsq(dtest-test3):);
call print('Sumsq Error for LAPACK 3',sumsq(dtest-test4):);
call print('Sumsq Error for LAPACK 4',sumsq(dtest-test5):);
call print('Sumsq Error for LINPACK ',sumsq(dtest-test2):);
call print('rcond rcond2 rcond3,rcond4',rcond,rcond2,rcond3,rcond4);
cx=complex(x,dsqrt(dabs(x)));
call gminv(cx,cxinv1,info);
cxinv2=inv(cx);
cxinv3=inv(cx:gmat);
cxinv4=inv(cx,rcond3 :refine);
cxinv5=inv(cx,rcond4 :refinee);
dc=complex(dtest,0.0);
test1=cx*cxinv1; test2=cx*cxinv2; test3=cx*cxinv3;
test4=cx*cxinv4; test5=cx*cxinv5;
j=inv(x,rcond:gmat);
j=inv(x,rcond2);
if(n.le.5)call print(cx,cxinv1,cxinv2,cxinv3,test1,test2,test3);
call print('Matrix is of order ',n:);
call print('Max Error for LAPACK 1 real', dmax(dabs(real(dc-test1))):);
call print('Max Error for LAPACK 2 real', dmax(dabs(real(dc-test3))):);
call print('Max Error for LAPACK 3 real', dmax(dabs(real(dc-test4))):);
call print('Max Error for LAPACK 4 real', dmax(dabs(real(dc-test5))):);
call print('Max Error for LINPACK  real', dmax(dabs(real(dc-test2))):);
call print('Max Error for LAPACK 1 imag', dmax(dabs(imag(dc-test1))):);
call print('Max Error for LAPACK 2 imag', dmax(dabs(imag(dc-test3))):);
call print('Max Error for LAPACK 3 imag', dmax(dabs(imag(dc-test4))):);
call print('Max Error for LAPACK 4 imag', dmax(dabs(imag(dc-test5))):);
call print('Max Error for LINPACK  imag', dmax(dabs(imag(dc-test2))):);
call print('Sum Error for LAPACK 1 real',sum(dabs(real(dc-test1))):);
call print('Sum Error for LAPACK 2 real',sum(dabs(real(dc-test3))):);
call print('Sum Error for LAPACK 3 real',sum(dabs(real(dc-test4))):);
call print('Sum Error for LAPACK 4 real',sum(dabs(real(dc-test5))):);
call print('Sum Error for LINPACK  real',sum(dabs(real(dc-test2))):);
call print('Sum Error for LAPACK 1 imag',sum(dabs(imag(dc-test1))):);
call print('Sum Error for LAPACK 2 imag',sum(dabs(imag(dc-test3))):);
call print('Sum Error for LAPACK 3 imag',sum(dabs(imag(dc-test4))):);
call print('Sum Error for LAPACK 4 imag',sum(dabs(imag(dc-test5))):);
call print('Sum Error for LINPACK  imag',sum(dabs(imag(dc-test2))):);
call print('Sumsq Error for LAPACK 1 real',sumsq(real(dc-test1)):);
call print('Sumsq Error for LAPACK 2 real',sumsq(real(dc-test3)):);
call print('Sumsq Error for LAPACK 3 real',sumsq(real(dc-test4)):);
call print('Sumsq Error for LAPACK 4 real',sumsq(real(dc-test5)):);
call print('Sumsq Error for LINPACK  real',sumsq(real(dc-test2)):);
call print('Sumsq Error for LAPACK 1 imag',sumsq(imag(dc-test1)):);
call print('Sumsq Error for LAPACK 2 imag',sumsq(imag(dc-test3)):);
call print('Sumsq Error for LAPACK 3 imag',sumsq(imag(dc-test4)):);
call print('Sumsq Error for LAPACK 4 imag',sumsq(imag(dc-test5)):);
call print('Sumsq Error for LINPACK  imag',sumsq(imag(dc-test2)):);
call print('rcond rcond2 rcond3,rcond4',rcond,rcond2,rcond3,rcond4);
b34srun;

The test job forms a 100 by 100 matrix of random normal numbers. The job was run with and without multiplying the first row by 100000 to induce possible accuracy problems. The matrices xinv1 and xinv3 were calculated by the lapack LU inverters using the matrix commands gminv and inv respectively. These should run 100% the same and in the output are LAPACK 1 and LAPACK 2. The matrix XINV2 was calculated with the linpack default LU inverter and is shown in the tables as LINPACK, while XINV4 and XINV5 were calculated using refinement and equalization/refinement and are shown as LAPACK 3 and LAPACK 4 respectively. The goal is to see how much of an improvement refinement and equalization/refinement make.
The maintained hypothesis is that for a poorly conditioned matrix these added steps should make a difference in accuracy and pay for their added cost.

Real Matrix – Row 1 Adjusted

Matrix Command. Version January 2002.

=> CALL ECHOOFF$

Matrix is of order         100
LAPACK 3 => refine
LAPACK 4 => refinee
Max   Error for LAPACK 1   7.334165275096893E-09
Max   Error for LAPACK 2   7.334165275096893E-09
Max   Error for LAPACK 3   8.585629984736443E-10
Max   Error for LAPACK 4   7.275957614183426E-10
Max   Error for LINPACK    5.918991519138217E-09
Sum   Error for LAPACK 1   2.031745830718839E-07
Sum   Error for LAPACK 2   2.031745830718839E-07
Sum   Error for LAPACK 3   1.340332846354633E-08
Sum   Error for LAPACK 4   1.186949409438610E-08
Sum   Error for LINPACK    1.520938054071593E-07
Sumsq Error for LAPACK 1   6.346678588451170E-16
Sumsq Error for LAPACK 2   6.346678588451170E-16
Sumsq Error for LAPACK 3   4.170551230676075E-18
Sumsq Error for LAPACK 4   3.291420393045907E-18
Sumsq Error for LINPACK    3.502669224237807E-16
rcond rcond2 rcond3,rcond4
RCOND  = 0.15657795E-07
RCOND2 = 0.32449440E-07
RCOND3 = 0.15657795E-07
RCOND4 = 0.36041204E-04

Complex Case – Row 1 adjusted

Matrix is of order         100
Max   Error for LAPACK 1 real  1.089574652723968E-09
Max   Error for LAPACK 2 real  1.089574652723968E-09
Max   Error for LAPACK 3 real  7.730704965069890E-11
Max   Error for LAPACK 4 real  9.436007530894130E-11
Max   Error for LINPACK  real  8.449205779470503E-10
Max   Error for LAPACK 1 imag  1.143234840128571E-09
Max   Error for LAPACK 2 imag  1.143234840128571E-09
Max   Error for LAPACK 3 imag  1.132320903707296E-10
Max   Error for LAPACK 4 imag  1.371063262922689E-10
Max   Error for LINPACK  imag  9.201812645187601E-10
Sum   Error for LAPACK 1 real  2.892584898682689E-08
Sum   Error for LAPACK 2 real  2.892584898682689E-08
Sum   Error for LAPACK 3 real  2.080174401783049E-09
Sum   Error for LAPACK 4 real  2.380964692715299E-09
Sum   Error for LINPACK  real  1.709247001571822E-08
Sum   Error for LAPACK 1 imag  2.400662145467662E-08
Sum   Error for LAPACK 2 imag  2.400662145467662E-08
Sum   Error for LAPACK 3 imag  2.333172253018985E-09
Sum   Error for LAPACK 4 imag  3.145330114537025E-09
Sum   Error for LINPACK  imag  1.685032463933840E-08
Sumsq Error for LAPACK 1 real  1.305137954525119E-17
Sumsq Error for LAPACK 2 real  1.305137954525119E-17
Sumsq Error for LAPACK 3 real  8.000884036556901E-20
Sumsq Error for LAPACK 4 real  1.005147393906282E-19
Sumsq Error for LINPACK  real  4.711921249005595E-18
Sumsq Error for LAPACK 1 imag  9.729510489806058E-18
Sumsq Error for LAPACK 2 imag  9.729510489806058E-18
Sumsq Error for LAPACK 3 imag  1.030133637684152E-19
Sumsq Error for LAPACK 4 imag  2.061903353378799E-19
Sumsq Error for LINPACK  imag  5.314085939146504E-18
rcond rcond2 rcond3,rcond4
RCOND  = 0.15657795E-07
RCOND2 = 0.32449440E-07
RCOND3 = 0.19119694E-06
RCOND4 = 0.36654208E-03

b34s Matrix Command Ending. Last Command reached.

Space available in allocator      2882597, peak space used   583540
Number variables used                  38, peak number used      38
Number temp variables used            226, # user temp clean      0

The next part of the job does not have the first row multiplied by 100000. The first section is for a real*8 matrix.

Matrix Command. Version January 2002.

=> CALL ECHOOFF$

Matrix is of order         100
LAPACK 3 => refine
LAPACK 4 => refinee
Max   Error for LAPACK 1   1.365574320288943E-13
Max   Error for LAPACK 2   1.365574320288943E-13
Max   Error for LAPACK 3   3.086420008457935E-14
Max   Error for LAPACK 4   3.086420008457935E-14
Max   Error for LINPACK    1.767475055203249E-13
Sum   Error for LAPACK 1   1.473775453617650E-10
Sum   Error for LAPACK 2   1.473775453617650E-10
Sum   Error for LAPACK 3   1.983027804464133E-11
Sum   Error for LAPACK 4   1.983027804464133E-11
Sum   Error for LINPACK    1.427848022651258E-10
Sumsq Error for LAPACK 1   4.135364007296756E-24
Sumsq Error for LAPACK 2   4.135364007296756E-24
Sumsq Error for LAPACK 3   1.074041726812546E-25
Sumsq Error for LAPACK 4   1.074041726812546E-25
Sumsq Error for LINPACK    3.829022654580863E-24
rcond rcond2 rcond3,rcond4
RCOND  = 0.41205738E-04
RCOND2 = 0.84677306E-04
RCOND3 = 0.41205738E-04
RCOND4 = 0.41205738E-04

Complex*16 matrix. Row # 1 not adjusted.

Matrix is of order         100
Max   Error for LAPACK 1 real  1.976196983832779E-14
Max   Error for LAPACK 2 real  1.976196983832779E-14
Max   Error for LAPACK 3 real  3.736594367254042E-15
Max   Error for LAPACK 4 real  3.736594367254042E-15
Max   Error for LINPACK  real  2.207956040223280E-14
Max   Error for LAPACK 1 imag  2.116362640691705E-14
Max   Error for LAPACK 2 imag  2.116362640691705E-14
Max   Error for LAPACK 3 imag  3.580469254416130E-15
Max   Error for LAPACK 4 imag  3.580469254416130E-15
Max   Error for LINPACK  imag  2.059463710679665E-14
Sum   Error for LAPACK 1 real  3.078720134733204E-11
Sum   Error for LAPACK 2 real  3.078720134733204E-11
Sum   Error for LAPACK 3 real  3.024266750114961E-12
Sum   Error for LAPACK 4 real  3.024266750114961E-12
Sum   Error for LINPACK  real  2.831859478359157E-11
Sum   Error for LAPACK 1 imag  3.166186637957452E-11
Sum   Error for LAPACK 2 imag  3.166186637957452E-11
Sum   Error for LAPACK 3 imag  2.939662402390297E-12
Sum   Error for LAPACK 4 imag  2.939662402390297E-12
Sum   Error for LINPACK  imag  2.766293287102470E-11
Sumsq Error for LAPACK 1 real  1.652403344027760E-25
Sumsq Error for LAPACK 2 real  1.652403344027760E-25
Sumsq Error for LAPACK 3 real  1.862322216364066E-27
Sumsq Error for LAPACK 4 real  1.862322216364066E-27
Sumsq Error for LINPACK  real  1.447918924517758E-25
Sumsq Error for LAPACK 1 imag  1.770763319588756E-25
Sumsq Error for LAPACK 2 imag  1.770763319588756E-25
Sumsq Error for LAPACK 3 imag  1.727395868198397E-27
Sumsq Error for LAPACK 4 imag  1.727395868198397E-27
Sumsq Error for LINPACK  imag  1.354019631664373E-25
rcond rcond2 rcond3,rcond4
RCOND  = 0.41205738E-04
RCOND2 = 0.84677306E-04
RCOND3 = 0.34681902E-03
RCOND4 = 0.34681902E-03

B34S Matrix Command Ending. Last Command reached.

Space available in allocator      2882597, peak space used   593517
Number variables used                  38, peak number used      39
Number temp variables used            221, # user temp clean      0

In the real case with an adjustment to the first row, the linpack max error and sum of squared error of 5.9190E-09 and 3.503E-16 were slightly less than the comparable LAPACK 1 and LAPACK 2 values of 7.3342E-09 and 6.347E-16 respectively. For the refinement cases (LAPACK 3 and LAPACK 4) the max errors were 8.586E-10 and 7.276E-10 respectively. The sums of squares of the refinement and equalization/refinement errors were 4.171E-18 and 3.291E-18 respectively and should be compared to the lapack and linpack sum of squared values of 6.347E-16 and 3.503E-16, indicating that these adjustments make a real difference. Looking at the matrix where row 1 was not adjusted, we see the lapack max error of 1.366E-13, the linpack max error of 1.767E-13 and the two refinement cases getting the same 3.086E-14, since equalization of the matrix did not have to be done. Sums of squared errors for lapack, linpack and the refinement cases were 4.135E-24, 3.829E-24 and 1.074E-25 respectively. Refinement makes a small but detectable difference, and the accuracy is better than in the first, more difficult case.
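The idea behind the :refine option, namely solve, compute the residual, and solve again for a correction reusing the factorization, can be sketched as follows (a NumPy/SciPy illustration of one refinement step; this is not the dgesvx code that B34S actually calls, and the scaling of the first row mimics the test job above):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# One step of iterative refinement for a matrix inverse.
rng = np.random.default_rng(2)
n = 100
a = rng.standard_normal((n, n))
a[0, :] *= 100000.0                 # poorly scaled first row, as in the text
b = np.eye(n)

lu, piv = lu_factor(a)
x = lu_solve((lu, piv), b)          # initial inverse from the LU factors
r = b - a @ x                       # residual of the computed inverse
x_ref = x + lu_solve((lu, piv), r)  # corrected (refined) inverse

err0 = np.abs(np.eye(n) - a @ x).max()
err1 = np.abs(np.eye(n) - a @ x_ref).max()
assert err1 < 1e-6
assert err1 <= 2.0 * err0           # refinement does not degrade the result
```

For this badly scaled matrix the refined max error is typically an order of magnitude smaller than the unrefined one, mirroring the LAPACK 3 versus LAPACK 1 results above.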
Users are invited to experiment with this job and to try the real*16 add and real*16 multiply adjustments to the blas routines ddot and dsum, which provide a way to increase accuracy without moving the whole calculation to real*16 math. Results for the complex*16 case are listed and follow the same pattern in showing a gain for refinement, especially on the real side using the sum of squared error criterion. For the adjusted case the linpack routines slightly outperform the lapack code using the max error criterion. For example, the max error for the real part of the matrix when inverted by lapack was 1.089574E-9, which is larger than the linpack value of 8.4492057E-10. For the unadjusted case the pattern reverses, with the corresponding values being almost the same at 2.207956E-14 and 1.9761968E-14, respectively. The same pattern is observed for the imaginary part of the matrix.

The natural question concerns the relative cost of refinement and/or equalization, which were shown to improve the inverse calculation, especially for problem matrices. The relative speeds of various inversion strategies are investigated in the test job INVSPEED in matrix.mac, which was run with matrices of order 200 to 600 in increments of 100, with the times recorded.
The code run is:

b34sexec matrix;
* By setting n to different values we test and compare inverse speed;
call echooff;
do n=200,600,100;
x=rec(matrix(n,n:)); pdx=transpose(x)*x;
dd= matrix(n,n:)+1.;
cdd=complex(dd,0.0);
nn=namelist(math,inv,gmat,smat,pdmat,pdmat2,refine,refinee);
cpdx=complex(x,mfam(dsqrt(x)));
scpdx=transpose(cpdx)*cpdx;
cpdx=dconj(transpose(cpdx))*cpdx;
if(n.le.5)call print(pdx,cpdx,scpdx,eig(pdx),eig(cpdx),eig(scpdx));
call compress;
/; call print('Using LINPACK DGECO/DGRDI - ZGECO/ZGEDI':);
call timer(base1);
xinv=(1.0/pdx);
call timer(base2);
/; call print('Inverse using (1.0/pdx) took',(base2-base1):);
realm(1)=base2-base1;
error1(1)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=(complex(1.0,0.)/cpdx);
call timer(base2);
/; call print('Inverse using (1.0/cpdx) took',(base2-base1):);
complexm(1)=base2-base1;
error2a(1)=sumsq(real((cpdx*cinv)-cdd));
error2b(1)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
call timer(base1);
xinv=inv(pdx);
call timer(base2);
/; call print('Inverse using inv(pdx) took',(base2-base1):);
realm(2)=base2-base1;
error1(2)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx);
call timer(base2);
/; call print('Inverse using inv(cpdx) took',(base2-base1):);
complexm(2)=base2-base1;
error2a(2)=sumsq(real((cpdx*cinv)-cdd));
error2b(2)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
/; call print('Using LAPACK ':);
call timer(base1);
xinv=inv(pdx:GMAT);
call timer(base2);
/; call print('Inverse using inv(pdx:GMAT) took',(base2-base1):);
realm(3)=base2-base1;
error1(3)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:GMAT);
call timer(base2);
/; call print('Inverse using inv(cpdx:GMAT) took',(base2-base1):);
complexm(3)=base2-base1;
error2a(3)=sumsq(real((cpdx*cinv)-cdd));
error2b(3)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
/; call print('Using LINPACK':);
call timer(base1);
xinv=inv(pdx:SMAT);
call timer(base2);
/; call print('Inverse using inv(pdx:SMAT) took',(base2-base1):);
realm(4)=base2-base1;
error1(4)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(scpdx:SMAT);
call timer(base2);
/; call print('Inverse using inv(scpdx:SMAT) took',(base2-base1):);
complexm(4)=base2-base1;
error2a(4)=sumsq(real((scpdx*cinv)-cdd));
error2b(4)=sumsq(imag((scpdx*cinv)-cdd));
call compress;
/; call print('Using LINPACK':);
call timer(base1);
xinv=inv(pdx:PDMAT);
call timer(base2);
/; call print('Inverse using inv(pdx:PDMAT) took',(base2-base1):);
realm(5)=base2-base1;
error1(5)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:PDMAT);
call timer(base2);
/; call print('Inverse using inv(cpdx:PDMAT) took',(base2-base1):);
complexm(5)=base2-base1;
error2a(5)=sumsq(real((cpdx*cinv)-cdd));
error2b(5)=sumsq(imag((cpdx*cinv)-cdd));
/; call compress;
/; call print('Using LAPACK':);
call timer(base1);
xinv=inv(pdx:PDMAT2);
call timer(base2);
/; call print('Inverse using inv(pdx:PDMAT2) took',(base2-base1):);
realm(6)=base2-base1;
error1(6)=sumsq((pdx*xinv)-dd);
/; call compress;
call timer(base1);
cinv=inv(cpdx:PDMAT2);
call timer(base2);
/; call print('Inverse using inv(cpdx:PDMAT2) took',(base2-base1):);
complexm(6)=base2-base1;
error2a(6)=sumsq(real((cpdx*cinv)-cdd));
error2b(6)=sumsq(imag((cpdx*cinv)-cdd));
/; call print('Using LAPACK':);
call timer(base1);
xinv=inv(pdx:REFINE);
call timer(base2);
/; call print('Inverse using inv(pdx:REFINE) took',(base2-base1):);
realm(7)=base2-base1;
error1(7)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:REFINE);
call timer(base2);
/; call print('Inverse using inv(cpdx:REFINE) took',(base2-base1):);
complexm(7)=base2-base1;
error2a(7)=sumsq(real((cpdx*cinv)-cdd));
error2b(7)=sumsq(imag((cpdx*cinv)-cdd));
call compress;
/; call print('Using LAPACK':);
call timer(base1);
xinv=inv(pdx:REFINEE);
call timer(base2);
/; call print('Inverse using inv(pdx:REFINEE) took',(base2-base1):);
realm(8)=base2-base1;
error1(8)=sumsq((pdx*xinv)-dd);
call compress;
call timer(base1);
cinv=inv(cpdx:REFINEE);
call timer(base2);
/; call print('Inverse using inv(cpdx:REFINEE) took',(base2-base1):);
complexm(8)=base2-base1;
error2a(8)=sumsq(real((cpdx*cinv)-cdd));
error2b(8)=sumsq(imag((cpdx*cinv)-cdd));
/; call print('Error2a and error2b = real and imag Complex*16 error':);
call print(' ':);
call print('Matrix Order',n:);
call tabulate(nn,realm,error1,complexm,error2a,error2b);
call compress;
enddo;

The columns REALM and COMPLEXM refer to the times for the real and complex matrices. For real matrices, ERROR1 is the sum of the squared elements of X*X**(-1) - I. For complex matrices, ERROR2A and ERROR2B refer to the corresponding sums for the real and imaginary parts of the complex X matrix. The general solvers, symmetric matrix solvers and positive definite solvers were all applied to the same positive definite matrix. SMAT and PDMAT refer to the linpack symmetric and Cholesky inverters, while PDMAT2 refers to the lapack Cholesky inverter. The results for the Dell Latitude computer were:

B34S(r) Matrix Command. d/m/y  4/ 7/07. h:m:s 14:12: 6.
=> * BY SETTING N TO DIFFERENT VALUES WE TEST AND COMPARE INVERSE SPEED$
=> CALL ECHOOFF$

Matrix Order          200
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.4006E-01   0.2575E-17   0.1803       0.2944E-17   0.1246E-17
  2  INV       0.4006E-01   0.2575E-17   0.1702       0.2944E-17   0.1246E-17
  3  GMAT      0.6009E-01   0.3765E-17   0.1702       0.1686E-17   0.1233E-17
  4  SMAT      0.3004E-01   0.6412E-18   0.9013E-01   0.1409E-18   0.9842E-19
  5  PDMAT     0.4006E-01   0.1755E-17   0.8011E-01   0.1063E-17   0.7556E-18
  6  PDMAT2    0.4006E-01   0.9240E-18   0.1001       0.1542E-17   0.1124E-17
  7  REFINE    0.5107       0.5322E-18   1.652        0.4179E-18   0.2918E-18
  8  REFINEE   0.5207       0.5322E-18   1.642        0.4179E-18   0.2918E-18

Matrix Order          300
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.2103       0.4420E-17   0.9514       0.2733E-16   0.1806E-16
  2  INV       0.2203       0.4420E-17   0.9013       0.2733E-16   0.1806E-16
  3  GMAT      0.2003       0.6948E-17   0.6509       0.2860E-16   0.1147E-16
  4  SMAT      0.9013E-01   0.1317E-17   0.3205       0.1013E-17   0.8467E-18
  5  PDMAT     0.8012E-01   0.4051E-17   0.3405       0.1892E-16   0.1285E-16
  6  PDMAT2    0.1102       0.1367E-16   0.3305       0.9589E-17   0.8387E-17
  7  REFINE    2.444        0.1249E-17   5.858        0.4138E-17   0.4494E-17
  8  REFINEE   2.343        0.1249E-17   5.868        0.4138E-17   0.4494E-17

Matrix Order          400
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.8412       0.8782E-16   2.614        0.7911E-16   0.7384E-16
  2  INV       0.9313       0.8782E-16   2.534        0.7911E-16   0.7384E-16
  3  GMAT      0.5808       0.1007E-15   1.562        0.1187E-15   0.5403E-16
  4  SMAT      0.2804       0.8933E-17   1.001        0.2147E-17   0.2602E-17
  5  PDMAT     0.2704       0.8864E-16   1.102        0.4047E-16   0.3070E-16
  6  PDMAT2    0.2403       0.5656E-16   0.7511       0.4240E-16   0.2621E-16
  7  REFINE    6.179        0.7450E-17   14.08        0.1410E-16   0.1249E-16
  8  REFINEE   5.999        0.7450E-17   14.41        0.1410E-16   0.1249E-16

Matrix Order          500
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      1.933        0.9835E-12   5.939        0.3184E-14   0.2082E-14
  2  INV       1.963        0.9835E-12   5.528        0.3184E-14   0.2082E-14
  3  GMAT      1.182        0.1467E-11   3.315        0.4225E-14   0.4437E-14
  4  SMAT      0.6910       0.1207E-12   2.153        0.2204E-16   0.2372E-16
  5  PDMAT     0.6810       0.3561E-12   2.874        0.4918E-14   0.3120E-14
  6  PDMAT2    0.5308       0.8006E-12   1.602        0.2881E-14   0.3356E-14
  7  REFINE    10.53        0.1149E-12   27.72        0.6522E-15   0.3650E-15
  8  REFINEE   10.85        0.1149E-12   27.45        0.6522E-15   0.3650E-15

Matrix Order          600
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      4.016        0.1054E-14   10.76        0.1935E-14   0.1553E-14
  2  INV       4.126        0.1054E-14   10.52        0.1935E-14   0.1553E-14
  3  GMAT      2.143        0.1307E-14   5.688        0.2383E-14   0.1676E-14
  4  SMAT      1.422        0.1689E-15   3.946        0.5304E-16   0.4960E-16
  5  PDMAT     1.913        0.5407E-15   5.718        0.1455E-14   0.1471E-14
  6  PDMAT2    0.9814       0.5722E-15   2.904        0.1774E-14   0.1299E-14
  7  REFINE    25.91        0.1457E-15   48.27        0.2620E-15   0.2328E-15
  8  REFINEE   26.41        0.1457E-15   48.26        0.2620E-15   0.2328E-15

For the Dell Workstation 650 the following was obtained:

B34S(r) Matrix Command. d/m/y  5/ 7/07. h:m:s 11:14: 1.

=> * BY SETTING N TO DIFFERENT VALUES WE TEST AND COMPARE INVERSE SPEED$
=> CALL ECHOOFF$

Matrix Order          200
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.1562E-01   0.2575E-17   0.7812E-01   0.2944E-17   0.1246E-17
  2  INV       0.1562E-01   0.2575E-17   0.6250E-01   0.2944E-17   0.1246E-17
  3  GMAT      0.3125E-01   0.3765E-17   0.7812E-01   0.1686E-17   0.1233E-17
  4  SMAT      0.1562E-01   0.6412E-18   0.3125E-01   0.1409E-18   0.9842E-19
  5  PDMAT     0.000        0.1755E-17   0.3125E-01   0.1063E-17   0.7556E-18
  6  PDMAT2    0.1562E-01   0.9240E-18   0.3125E-01   0.1542E-17   0.1124E-17
  7  REFINE    0.1875       0.5322E-18   0.5469       0.4179E-18   0.2918E-18
  8  REFINEE   0.1875       0.5322E-18   0.5312       0.4179E-18   0.2918E-18

Matrix Order          300
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.9375E-01   0.4420E-17   0.2656       0.2733E-16   0.1806E-16
  2  INV       0.9375E-01   0.4420E-17   0.2812       0.2733E-16   0.1806E-16
  3  GMAT      0.7812E-01   0.6948E-17   0.1875       0.2860E-16   0.1147E-16
  4  SMAT      0.3125E-01   0.1317E-17   0.1250       0.1013E-17   0.8467E-18
  5  PDMAT     0.4688E-01   0.4051E-17   0.1250       0.1892E-16   0.1285E-16
  6  PDMAT2    0.3125E-01   0.1367E-16   0.1094       0.9589E-17   0.8387E-17
  7  REFINE    0.7969       0.1249E-17   1.875        0.4138E-17   0.4494E-17
  8  REFINEE   0.7969       0.1249E-17   1.859        0.4138E-17   0.4494E-17

Matrix Order          400
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.2656       0.8782E-16   0.7344       0.7911E-16   0.7384E-16
  2  INV       0.2344       0.8782E-16   0.7031       0.7911E-16   0.7384E-16
  3  GMAT      0.1719       0.1007E-15   0.4844       0.1187E-15   0.5403E-16
  4  SMAT      0.1094       0.8933E-17   0.3125       0.2147E-17   0.2602E-17
  5  PDMAT     0.1250       0.8864E-16   0.4219       0.4047E-16   0.3070E-16
  6  PDMAT2    0.7812E-01   0.5656E-16   0.2656       0.4240E-16   0.2621E-16
  7  REFINE    2.047        0.7450E-17   4.375        0.1410E-16   0.1249E-16
  8  REFINEE   2.016        0.7450E-17   4.359        0.1410E-16   0.1249E-16

Matrix Order          500
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      0.5625       0.9835E-12   1.484        0.3184E-14   0.2082E-14
  2  INV       0.5781       0.9835E-12   1.438        0.3184E-14   0.2082E-14
  3  GMAT      0.3750       0.1467E-11   0.9531       0.4225E-14   0.4437E-14
  4  SMAT      0.2656       0.1207E-12   0.6406       0.2204E-16   0.2372E-16
  5  PDMAT     0.3281       0.3561E-12   0.8438       0.4918E-14   0.3120E-14
  6  PDMAT2    0.2344       0.8006E-12   0.5781       0.2881E-14   0.3356E-14
  7  REFINE    3.484        0.1149E-12   8.562        0.6522E-15   0.3650E-15
  8  REFINEE   3.531        0.1149E-12   8.406        0.6522E-15   0.3650E-15

Matrix Order          600
Obs  NN        REALM        ERROR1       COMPLEXM     ERROR2A      ERROR2B
  1  MATH      1.141        0.1054E-14   2.547        0.1935E-14   0.1553E-14
  2  INV       1.109        0.1054E-14   2.516        0.1935E-14   0.1553E-14
  3  GMAT      0.6719       0.1307E-14   1.688        0.2383E-14   0.1676E-14
  4  SMAT      0.4844       0.1689E-15   1.109        0.5304E-16   0.4960E-16
  5  PDMAT     0.6875       0.5407E-15   1.500        0.1455E-14   0.1471E-14
  6  PDMAT2    0.3906       0.5722E-15   0.9531       0.1774E-14   0.1299E-14
  7  REFINE    8.453        0.1457E-15   14.56        0.2620E-15   0.2328E-15
  8  REFINEE   8.516        0.1457E-15   15.14        0.2620E-15   0.2328E-15

MATH refers to using the form invx=1./x; while INV uses invx=inv(x);. Since both call the same linpack routines, they should and do run the same. GMAT uses lapack, while SMAT and PDMAT use the linpack routines for symmetric (DSICO, DSIFA and DSIDI) and positive definite (DPOCO, DPOFA and DPODI) matrices, respectively. The errors are sums of squared errors, where ERROR2A and ERROR2B refer to the real and imaginary parts of the complex matrix.
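The sum-of-squared-error measure in these tables can be reproduced with any linear algebra library. The sketch below uses numpy and scipy (variable names chosen to mirror the job above; the ERROR1-style target here is the identity matrix, and the small diagonal shift is added only to keep the test matrix safely positive definite):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def sumsq_inverse_error(x, xinv):
    """ERROR1-style measure: sum of squared elements of x @ xinv - I."""
    return np.sum((x @ xinv - np.eye(x.shape[0])) ** 2)

rng = np.random.default_rng(1)
x = rng.random((100, 100))
pdx = x.T @ x + 1e-3 * np.eye(100)    # positive definite, as in the job above

inv_general = np.linalg.inv(pdx)      # general (LU-based) inverter
c, low = cho_factor(pdx)
inv_chol = cho_solve((c, low), np.eye(100))   # inverse from a Cholesky factor

err_general = sumsq_inverse_error(pdx, inv_general)
err_chol = sumsq_inverse_error(pdx, inv_chol)
```

Both errors are tiny for a well-behaved matrix; the differences between routines only become visible, as in the tables above, when the matrix is badly scaled or near singular.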
Here, as expected, the refine and refinee options are superior, although at great cost, as shown in Table 16.1.

Table 16.1 Relative Cost of Equalization & Refinement of a General Matrix
_________________________________________________________________________
                  Dell Latitude                Dell 650 Workstation
Order         LU   Equal_Refine   Cost       LU   Equal_Refine   Cost

Real Matrix
 200      0.06009     0.5207     7.665   0.03125     0.1875     5.000
 300      0.203       2.343     10.542   0.07812     0.7969     9.201
 400      0.5808      5.999      9.329   0.1719      2.016     10.728
 500      1.182      10.85       8.179   0.375       3.531      8.416
 600      2.143      26.41      11.324   0.6719      8.516     11.675
 Mean                            9.408                          9.004

Complex Matrix
 200      0.1702      1.642      8.647   0.1702      0.642      2.772
 300      0.6509      5.868      8.015   0.6509      5.868      8.015
 400      1.562      14.41       8.225   1.562      14.41       8.225
 500      3.315      27.45       7.281   3.315      27.45       7.281
 600      5.688      48.26       7.485   5.688      48.26       7.485
 Mean                            7.931                          6.756
_________________________________________________________________________

The cost of equalization and refinement is around 9 times greater for real matrices and 7-8 times greater for complex matrices. The calculations in Table 16.1 were made with positive definite matrices, since in many econometric applications the matrix involved is positive definite. While the Dell Latitude was substantially slower, the relative cost of equalization and refinement was relatively stable across machines. At matrices of order 300 or larger on the Dell Workstation, lapack was faster; for the Dell Latitude the crossover point was at matrices of order 400 or larger. For the Cholesky inverters at order 600, the gain from lapack over linpack was around 2 (1.949 = 1.913/.9814) for the Dell Latitude and 1.8 (1.76 = .6875/.3906) for the Dell Workstation.

The above tests were performed using the lapack default blocksize as calculated by the routine ilaenv. The job listed below (LAPACK_2 in matrix.mac) investigates the gains from alternate blocksizes using a Dell 650 running a 3.04 GHz chip and a Dell Latitude running a 1.0 GHz chip.
/$ Blocksize tests
b34sexec matrix;
call echooff;
isize=12;
mat_ord  =array(isize:);
linpack  =array(isize:);
lapack1  =array(isize:);
lapack4  =array(isize:);
lapack7  =array(isize:);
lapack10 =array(isize:);
lapack13 =array(isize:);
lapack16 =array(isize:);
lapack19 =array(isize:);
lapackd  =array(isize:);
j=0;
do i=1,19,3;
n=64;
top continue;
j=j+1;
if(n.gt.768)go to endit;
/; call print('Order of Matrix ',n:);
mat_ord(j)=n;
x=rec(matrix(n,n:));
/; set blocksize for lapack
call lapack(1,i);
/; LINPACK need only to be run one time
if(i.eq.1)then;
call timer(t1);
xx=inv(x);
call timer(t2);
/; call print('LINPACK time ',t2-t1:);
linpack(j)=t2-t1;
call compress;
endif;
call timer(t1);
xx=inv(x:gmat);
call timer(t2);
/; call print('LAPACK time ',t2-t1:);
if(i.eq.1) lapack1(j)=t2-t1;
if(i.eq.4) lapack4(j)=t2-t1;
if(i.eq.7) lapack7(j)=t2-t1;
if(i.eq.10)lapack10(j)=t2-t1;
if(i.eq.13)lapack13(j)=t2-t1;
if(i.eq.16)lapack16(j)=t2-t1;
if(i.eq.19)lapack19(j)=t2-t1;
call compress;
if(i.eq.1)then;
call lapack(:reset);
call timer(t1);
xx=inv(x:gmat);
call timer(t2);
/; call print('LAPACK Defaults ',t2-t1:);
lapackd(j)=t2-t1;
call compress;
endif;
n=n+64;
go to top;
endit continue;
j=0;
enddo;
call print(' ':);
call print('Effects on Relative Speed of LAPACK blocksize':);
call tabulate(mat_ord,linpack,lapack1,lapack4,lapack7,
              lapack10,lapack13,lapack16,lapack19,lapackd);
b34srun;

Matrices of order 64, 128, ..., 768 were generated using the rectangular IMSL random number generator. Since these numbers are in the range 0.0 - 1.0, the large values possible from a random normal generator are not observed and there is less likelihood of any one matrix being difficult to invert. The job was run with 12,000,000 real*8 words of workspace. The variables LAPACKi refer to a lapack inversion where the blocksize was set to i. LAPACKD is the default recommended blocksize as calculated by the lapack routine ilaenv.
Edited output from this job is listed next for both the Dell Latitude and the Dell Workstation, to control for possible chip-related differences in addition to chip speed.

B34S(r) Matrix Command. d/m/y  4/ 7/07. h:m:s 16:17:50.

Run with Dell Latitude

=> CALL ECHOOFF$

Effects on Relative Speed of LAPACK blocksize

Obs MAT_ORD LINPACK    LAPACK1    LAPACK4    LAPACK7    LAPACK10   LAPACK13   LAPACK16   LAPACK19   LAPACKD
  1    64   0.000      0.000      0.000      0.1002E-01 0.000      0.1001E-01 0.000      0.000      0.000
  2   128   0.2003E-01 0.2003E-01 0.2003E-01 0.2003E-01 0.2003E-01 0.2003E-01 0.2003E-01 0.2002E-01 0.1001E-01
  3   192   0.3004E-01 0.5007E-01 0.5007E-01 0.6009E-01 0.5007E-01 0.6009E-01 0.6009E-01 0.6009E-01 0.5007E-01
  4   256   0.9013E-01 0.1302     0.1202     0.1402     0.1402     0.1302     0.1302     0.1302     0.1402
  5   320   0.3205     0.3505     0.3104     0.2904     0.3004     0.3004     0.3104     0.2904     0.2804
  6   384   0.7110     0.8212     0.6209     0.6009     0.5708     0.5708     0.6209     0.5508     0.5107
  7   448   1.312      1.472      1.062      0.9914     0.9614     0.9614     0.9714     0.9113     0.8813
  8   512   2.854      2.604      1.722      1.612      1.552      1.482      1.622      1.472      1.392
  9   576   3.435      3.725      2.363      2.223      2.313      2.163      2.273      2.063      2.063
 10   640   5.147      4.977      3.355      3.104      3.205      2.914      2.874      2.914      2.664
 11   704   7.160      6.710      4.737      4.116      4.006      3.936      3.805      3.805      3.605
 12   768   10.30      8.883      5.838      5.348      5.217      5.067      5.047      5.207      4.677

Run on Dell Precision Workstation 650

B34S(r) Matrix Command. d/m/y  5/ 7/07. h:m:s 11:30:52.

=> CALL ECHOOFF$

Effects on Relative Speed of LAPACK blocksize

Obs MAT_ORD LINPACK    LAPACK1    LAPACK4    LAPACK7    LAPACK10   LAPACK13   LAPACK16   LAPACK19   LAPACKD
  1    64   0.000      0.000      0.1562E-01 0.000      0.000      0.000      0.1562E-01 0.000      0.000
  2   128   0.1562E-01 0.000      0.000      0.1562E-01 0.000      0.1562E-01 0.000      0.1562E-01 0.1562E-01
  3   192   0.1562E-01 0.1562E-01 0.1562E-01 0.1562E-01 0.3125E-01 0.1562E-01 0.1562E-01 0.1562E-01 0.3125E-01
  4   256   0.3125E-01 0.4688E-01 0.3125E-01 0.4688E-01 0.4688E-01 0.4688E-01 0.4688E-01 0.4688E-01 0.4688E-01
  5   320   0.1094     0.1250     0.1094     0.1094     0.1094     0.9375E-01 0.1094     0.9375E-01 0.9375E-01
  6   384   0.2344     0.2656     0.2188     0.2031     0.1875     0.2031     0.1875     0.1875     0.1875
  7   448   0.3750     0.4375     0.3281     0.3281     0.3125     0.2969     0.2969     0.3125     0.2812
  8   512   0.7500     0.7344     0.5469     0.5156     0.5000     0.5156     0.5000     0.4844     0.4531
  9   576   0.9844     0.9844     0.7500     0.7031     0.7031     0.7031     0.6719     0.6562     0.6250
 10   640   1.422      1.375      1.031      0.9688     0.9688     0.9219     0.9375     0.9062     0.8906
 11   704   1.891      1.828      1.359      1.297      1.266      1.234      1.250      1.219      1.156
 12   768   2.516      2.328      1.781      1.656      1.656      1.594      1.609      1.578      1.516

The findings indicate that with the default blocksize, lapack is faster than linpack for matrices of size greater than 320. With blocksize = 1, lapack never beats linpack until size 512, when the times were .7344 vs .7500 and 2.604 vs 2.854 on the Workstation and Latitude, respectively. On the Workstation at size 768, linpack was running at 2.516, while lapack was running at 2.328 with blocksize 1, or 1.516 with the default blocksize. For a general matrix the lapack defaults suggest a blocksize of 64, with a minimum blocksize of 2; the crossover point is set as 128. The above data suggest that the linpack/lapack crossover for the Workstation using the default blocksize is between 256 and 320: at 256 linpack's 3.125E-2 beat the lapack default's 4.6875E-2, while at 320 the comparison tipped in lapack's favor at .09375 against linpack's .1094.
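The workspace/blocksize dependence is visible in any direct use of LAPACK. In scipy's raw wrappers, for example, the workspace passed to dgetri bounds the block size it can use (lwork is roughly n times the blocksize), so lwork = n forces the unblocked code path. A sketch under that assumption; the accuracy is the same either way, only the speed differs:

```python
import numpy as np
from scipy.linalg import lapack

rng = np.random.default_rng(2)
n = 256
a = rng.random((n, n)) + n * np.eye(n)   # comfortably well conditioned

# Factor once with DGETRF, then build the inverse with DGETRI.
lu, piv, info = lapack.dgetrf(a)

# lwork = n forces DGETRI's unblocked algorithm; a larger workspace
# (roughly n * blocksize) allows the blocked algorithm.
inv_small, info_small = lapack.dgetri(lu, piv, lwork=n)
inv_large, info_large = lapack.dgetri(lu, piv, lwork=64 * n)
```

Timing the two calls on large matrices shows the same pattern as the tables above; on small matrices the difference disappears in overhead.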
To investigate whether a blocksize greater than 1 but far less than 64 would be of benefit, in the above job we set i = 4, 7, 10, 13, 16 and 19 and repeated the calculations. For a matrix of size 768 and i = 4, the lapack time fell to 1.781 from 2.328. For i = 7 the time was marginally better, falling to 1.656, which is close to the default setting's 1.516. The above experiment outlines the gains from blocksize adjustments, but also suggests that if the blocksize is increased to only a modest 4, thereby saving space, there are still substantial gains. For matrices in the usual range (under order 256), the space-saving linpack code will always beat lapack. These findings were not altered when the test job was run on the Dell Latitude computer, and they suggest that although for many applications the linpack code is a better choice, the B34S user is given the capability to modify this choice. Readers are invited to rerun these examples on their own systems, since the results may be chip sensitive.

The brief speed and accuracy tests reported above highlight the fact that selecting just the right inverter can make a substantial difference. For most users running lapack it is best to use the default settings, although the usual "one size fits all" approach may carry substantial hidden costs. The discussion of refinement capability has the hidden assumption that the calculation remains in real*8. Another assumption is that the blas code is not modified to increase accuracy. In the next section these assumptions are relaxed: calculations are made in real*16 and VPA, and, when real*8 routines are used, the blas routines are enhanced to give more accuracy.

16.7 Variable Precision Math

Many software systems allow real*4 data storage but move the data to real*8 to make a calculation. In many cases the resulting accuracy is not the same as what would be obtained with a direct read into real*8.
The B34S matrix command supports real*4, real*8, real*16, complex*16, complex*32, integer*4 and integer*8 data types to facilitate research into the impact of data storage accuracy on the calculation. In addition, the variable precision subroutine library developed by Smith (1991) has been implemented to give accuracy to better than 1700 digits. The use of this code is discussed next.

It is not enough just to calculate in real*8; the precision with which the data were initially read makes a major difference, even in simple problems. A simple example from Stokes (2005) involving 2.00 / 4.11 illustrates the problems of precision.

str=>vpa     .4866180048661800486618004866180048661800486618004866M+0
r*8=>vpa     .48661800486618001080454992664677M+0
r*8=>r*16    .48661800486618001080454992664677E+00
str=>r*16    .48661800486618004866180048661800E+00
r*8=>r*8     .48661800486618000000000000000000E+00
r*4=>r*4     .48661798200000000000000000000000E+00

The line str=>vpa lists the exact answer obtained when the data (2.0 and 4.11) are read from a string into a variable precision arithmetic (VPA) routine, while the line r*8=>vpa shows what happens to accuracy when the data are first read into real*8, or double precision, and then moved to a VPA datatype. The line r*8=>r*16 shows what occurs when the data are first read into real*8, then converted to real*16 before making the calculation. In this case the results are the same as what is obtained with r*8=>vpa but are inferior to the line str=>r*16, where the data are read directly into real*16. The lines r*8=>r*8 and r*4=>r*4 show what can be expected using the usual double precision and single precision math, respectively. The importance of this simple example is that it can be used to disentangle the effect of data storage precision and data calculation precision in a very simple problem where each can be isolated.
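The same storage-precision effect can be reproduced in any language with more than one floating type and a decimal type. A Python sketch follows, with numpy's float32/float64 standing in for real*4/real*8 and the standard decimal module standing in for VPA:

```python
import numpy as np
from decimal import Decimal, getcontext

# VPA analogue: read the operands as strings and divide at 50 digits.
getcontext().prec = 50
vpa = Decimal('2.00') / Decimal('4.11')

# r*8 => r*8: read and divide in double precision.
r8 = np.float64(2.00) / np.float64(4.11)

# r*4 storage: reading 4.11 into float32 loses digits before the
# division is ever performed, even though the division is in float64.
r4 = np.float64(np.float32(2.00)) / np.float64(np.float32(4.11))
```

Comparing r4 and r8 shows agreement to only about 7-8 significant digits: the damage was done at read time, not at calculation time, which is exactly the point of the listing above.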
When many calculations are needed to solve a problem (inverting a 100 by 100 matrix by elimination involves roughly a third of a million operations), round-off error can mount, especially when numbers differ in size. Strang (1976, page 32) notes "if floating-point numbers are added, and their exponents differ say by two, then the last two digits in the smaller number will be more or less lost..." Real*4 or single precision on IEEE machines has a range of 1.18*10^-38 to 3.40*10^38. This gives a precision of 7-8 digits at best. Real*8 or double precision has a range of 2.23*10^-308 to 1.79*10^308 and at best gives a precision of 15-16 digits. Real*16 has a range of 10^-4931 to 10^4932 and gives up to 32 digits of precision. VPA or variable precision arithmetic allows variable precision calculations.

Measuring the effects of data precision and calculation method on accuracy requires a number of different test data sets. The first problem attempted was the StRD (Rogers-Filliben-Gill-Guthrie-Lagergren-Vangel 1998) Filippelli data set, which contains 82 observations on a degree-10 polynomial model of the form y = b(0) + sum(i=1,...,10) b(i)*x^i + e, where x ranges from -3.13200249 to -8.781464495 and x^10 ranges from 90,828.258 to 2,726,901,792.451598. Answers to 15 digits are supplied by StRD. Table 16.2 reports 15 experiments involving various ways to estimate the model. The linpack Cholesky routines and general matrix routines detect rank problems and will not solve the problem if the data are not converted to real*16. The QR approach obtains average LRE values (see footnote 13) of 7.306, 7.415 and 8.368 on the coefficients, SE and residual sum of squares. The exact numbers obtained are listed in Table 16.3. If the accuracy improvements suggested for the BLAS routines are enabled, these LRE numbers jump to 8.118, 8.098 and 9.803, respectively. Note that both accuracy improvements result in the same gain.
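The BLAS accuracy improvements referred to above work by accumulating the inner products in wider precision before the final rounding. Python's math.fsum, which accumulates exactly, gives the flavor of the idea; the sketch below illustrates the technique, not the modified ddot code itself:

```python
import math

def dot_naive(x, y):
    """Ordinary left-to-right accumulation: each += rounds to double."""
    s = 0.0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

def dot_accurate(x, y):
    """fsum keeps exact partial sums, so rounding happens only once,
    mimicking accumulation of the dot product in wider precision."""
    return math.fsum(xi * yi for xi, yi in zip(x, y))

# Cancellation example: the small term is wiped out by naive accumulation.
x = [1e16, 1.0, -1e16]
y = [1.0, 1.0, 1.0]
naive = dot_naive(x, y)       # the 1.0 is lost when added to 1e16
exact = dot_accurate(x, y)    # the correct inner product, 1.0
```

The naive sum returns 0.0 while the accumulated sum returns 1.0: exactly the kind of digit loss that extended-precision accumulation inside ddot is designed to avoid.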
Experiments 4 and 5 first copy the data, which have been read into real*8, into a real*16 variable and attempt estimation with a Cholesky and a QR approach. The LRE's are the same for both approaches (7.925, 8.708, 8.167). This experiment shows the effect of calculation precision and at first would lead one to believe that there is little gain from real*16 calculation, except for the fact that the Cholesky condition is not seen as 0.0. However, this interpretation would be premature without checking for data base precision effects (i.e., at what precision the data were initially read), which we do below.

Experiments 6-12 test various combinations of calculation precision and routine selection. In Experiment 6 we use the linpack SVD routines on real*8 data. The results are poor (LRE numbers of 2.195, 2.132 and 4.039).14 When the accuracy improvements are enabled (experiments 7 and 8), there is a slight loss of accuracy on the coefficients, to 1.901, but a slight gain on the SE, to 2.431. However, when the real*8 data are copied to real*16 in experiment 9, the SVD LRE numbers jump to 7.924, 8.708 and 8.167, respectively, which are similar to what was found in experiments 4 and 5 and clearly show the effect of calculation precision conditional on the data having been read into real*8 before being moved to real*16. These results are similar to those in the real*16 Cholesky experiment 4 and the real*16 QR experiment 5. Experiments 10-12 study the effect of using lapack's SVD routine in place of linpack's. For experiment 10, the coefficient LRE jumps to 7.490, which is quite good and in fact beats the

13 LRE is the Log Relative Error as discussed by McCullough (1999). Assume x is the value obtained and c is the correct value. Then LRE = -log10( |(x - c)/c| ).

14 The author has used the linpack code since 1979. These results were not expected and seem to be related to the extreme values in the X matrix in the Filippelli data.
When real*16 is used, accuracy of the linpack SVD routine improves.

QR LRE reported for experiment 1. This value is far better than the linpack LRE of 2.195.15 However, the LRE of the SE is poor at 1.910, which is less than that found with linpack, 2.132. The LRE of e'e of 1.606 is also less than the linpack LRE of 3.258. Since the SE requires knowledge of (X'X)^-1, calculated from the SVD as (X'X)^-1 = V Σ^-2 V', extreme values along the diagonal of Σ may be causing errors when forming Σ^-2. However, this possibility does not explain the poor performance of the residual sum of squares LRE of 1.606.16 The reason may be related to the fact that the data set has such high x^10 values that minor coefficient differences will result in substantial changes in the relative residual sum of squares.

Experiments 13-15 first load the data in real*16 and proceed to the same routines as used for experiments 4-6. Here we see LRE numbers of 14.68, 14.99 and 15.00 for the Cholesky experiment and 14.79, 14.96 and 15.00 for the QR experiment, which is the same as SVD calculated with linpack. These are close to perfect answers. Table 16.3 lists the coefficients obtained for experiment 1, which used real*8 data, while Table 16.4 lists the exact coefficients obtained for the QR using data read directly into real*16. Experiments 13-15 show the gain from reading the Filippelli data set in real*16. Since all three experiments produced similar LRE values, it appears that if the data are read with enough precision, the results are less sensitive to the estimation method. This finding has important implications for data base design. The next task is to study less extreme (stiff) data sets and observe the results.

15 McCullough (1999, 2000) used lapack QR and SVD routines to estimate the coefficients of the Filippelli data, finding that "QR generally returns more accurate digits than SVD." The LRE values found were 7.4 and 6.3, respectively.
For S-PLUS he found 8.4 and 5.8, respectively, where the underlying routines were not known.

16 The sum of squares was tested against the published value of 0.795851382172941E-03. The lapack SVD routine obtained 0.8155689538070673E-03.

Table 16.2 LRE for Various Approaches to an OLS Model of the Filippelli Data
_____________________________________________________________
Various options using real*8 data

Experiment   TYPE        COEF     SE      RSS_LE
  1          QR          7.306    7.415    8.368
  2          ACC_1       8.118    8.098    9.803
  3          ACC_2       8.118    8.098    9.803
  4          R16_CHOL    7.924    8.708    8.167
  5          R16_QR      7.924    8.708    8.167
  6          SVD         2.195    2.132    4.039
  7          SVD_ACC1    1.901    2.431    3.258
  8          SVD_ACC2    1.901    2.431    3.258
  9          SVD_R16     7.924    8.708    8.167
 10          SVD_LAPK    7.490    1.910    1.606
 11          SVD2ACC1    7.490    1.910    1.606
 12          SVD2ACC2    7.490    1.910    1.606

Various options using data read directly into real*16

 13          R16_CHOL   14.68    14.99    15.00
 14          R16_QR     14.79    14.96    15.00
 15          R16_SVD    14.79    14.96    15.00
___________________________________________________________
Experiments 4, 5 and 9 involve reading data first into real*8 and then converting the data to real*16. Experiments 1-3, 6-8 and 10-12 involve real*8 data. Experiments 13-15 use data read directly into real*16. Experiments 2, 7 and 11 enhance the blas routines by accumulating real*8 data using the IMSL routines DQADD and DQMULT. Experiments 3, 8 and 12 accumulate real*8 blas calculations using real*16 data as outlined in Stokes (2005). See Chapter 10 for a detailed discussion of the methods used. The coefficients obtained for experiments 1 and 14 are listed in Tables 16.3 and 16.4.
Table 16.3 Coefficients and SE Estimated Using QR Models of the Real*8 Filippelli Data
__________________________________________________________________
           Test Value                  Value Obtained              LRE

Coef  1    -2772.179591933420          -2772.179723094652          7.33
Coef  2    -2316.371081608930          -2316.371192269638          7.32
Coef  3    -1127.973940983720          -1127.973995395338          7.32
Coef  4    -354.4782337033490          -354.4782509735776          7.31
Coef  5    -75.12420173937571          -75.12420543777237          7.31
Coef  6    -10.87531803553430          -10.87531857690271          7.30
Coef  7    -1.062214985889470          -1.062215039398714          7.30
Coef  8    -0.6701911545934081E-01     -0.6701911887876555E-01     7.29
Coef  9    -0.2467810782754790E-02     -0.2467810910390330E-02     7.29
Coef 10    -0.4029625250804040E-04     -0.4029625462234867E-04     7.28
Coef 11    -1467.489614229800          -1467.489683023960          7.33

Coefficient LRE:  Mean     7.306448565286121
                  Variance 2.587670394878226E-04
                  Minimum  7.280096349919187
                  Maximum  7.329023461850447

SE  1       559.7798654749500           559.7798867059487          7.42
SE  2       466.4775721277960           466.4775900975754          7.41
SE  3       227.2042744777510           227.2042833290517          7.41
SE  4       71.64786608759270           71.64786889794284          7.41
SE  5       15.28971787474000           15.28971847592676          7.41
SE  6       2.236911598160330           2.236911685945726          7.41
SE  7       0.2216243219342270          0.2216243305780890         7.41
SE  8       0.1423637631547240E-01      0.1423637686503493E-01     7.41
SE  9       0.5356174088898210E-03      0.5356174292732132E-03     7.42
SE 10       0.8966328373738681E-05      0.8966328708850490E-05     7.43
SE 11       298.0845309955370           298.0845420801842          7.43

SE LRE:           Mean     7.414701487211084
                  Variance 7.386168559949404E-05
                  Minimum  7.405390067654106
                  Maximum  7.429617565744895

Residual sum of squares:
RSS         0.7958513821729410E-03      0.7958513787598208E-03     8.37
________________________________________________________________
Test values are reported on the left-hand side. LRE = log relative error. The coefficients report experiment 1 from Table 16.2.

The same linpack QR routine was modified to run with real*16 data. Results for this experiment are shown in Table 16.4.
Table 16.4 Coefficients estimated with QR using Real*16 Filippelli Data

Coefficients Using QR on Data Loaded into Real*16                 LRE
 1.  -2772.1795919334239280284475535721                          14.85
 2.  -2316.3710816089307588219679140978                          15.00
 3.  -1127.9739409837156985716700141998                          14.42
 4.  -354.47823370334877161073848496470                          15.00
 5.  -75.124201739375713890522075522684                          15.00
 6.  -10.875318035534251085281081177145                          14.35
 7.  -1.0622149858894676645966112202356                          14.66
 8.  -0.67019115459340837592673412281191E-01                     15.00
 9.  -0.24678107827547865084085445245647E-02                     14.85
10.  -0.40296252508040367129713154870917E-04                     15.00
11.  -1467.4896142297958822878485135961                          14.55

LRE:  Mean     14.788490320266543980835382276091684
      Variance 6.3569618908829012635712782954099325E-0002
      Minimum  14.347002403969724322813759016211991
      Maximum  15.000000000000000000000000000000000

SE Using QR on Data Loaded into Real*16                           LRE
 1.  559.77986547494987457477254797527                           15.00
 2.  466.47757212779645269310982974610                           15.00
 3.  227.20427447775131062939817526228                           14.86
 4.  71.647866087592737261665720850718                           15.00
 5.  15.289717874740006503075678978592                           15.00
 6.  2.2369115981603327555186234039771                           14.91
 7.  0.22162432193422740206612983379340                          14.74
 8.  0.14236376315472394891823309147959E-01                      15.00
 9.  0.53561740888982093625865193118466E-03                      15.00
10.  0.89663283737386822210041526987951E-05                      15.00
11.  298.08453099553698520055234224439                           15.00

LRE:  Mean     14.955903576675283545444986642213045
      Variance 7.2174779096858864768608287934814669E-0003
      Minimum  14.741319930323011772906976043000329
      Maximum  15.000000000000000000000000000000000

Residual sum of squares  0.79585138217294058848463068814293E-03  15.00

LRE = log relative error. This is experiment # 14 from Table 16.2 but uses the linpack QR routine modified by Stokes (2005) to run with real*16 data. For this experiment the data was read directly into real*16.
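The advantage of orthogonal-factorization methods over the normal equations, which underlies the gap between the QR and forced LU/Cholesky results in these tables, can be sketched in a few lines. The design below is a synthetic ill-conditioned monomial basis, not the actual Filippelli data, and numpy's lstsq stands in for the linpack QR:

```python
import numpy as np

# Synthetic stiff design: monomials x^0..x^12 on [0,1] give a
# Hilbert-like X'X; squaring the condition number is what destroys
# normal-equation accuracy.
x = np.linspace(0.0, 1.0, 50)
X = np.vander(x, 13, increasing=True)
beta_true = np.ones(13)
y = X @ beta_true

# Orthogonal factorization works on X itself
beta_qr, *_ = np.linalg.lstsq(X, y, rcond=None)

# Normal equations: cond(X'X) is roughly cond(X)**2
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

err_qr = float(np.max(np.abs(beta_qr - beta_true)))
err_ne = float(np.max(np.abs(beta_ne - beta_true)))
```

On this design the factorization of X recovers the unit coefficients to several digits while the normal-equation solve loses most of them, the same qualitative pattern as the QR versus forced-Cholesky rows of Table 16.6.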
The Box-Jenkins (1976) Gas Furnace data have been widely studied and modeled, and are close in difficulty to what is found in many applied time series models. While "correct" 15-digit agreed-upon answers are not available, it is possible to study the effect on the residual sum of squares of the 11 approaches reported in Table 16.5.17 Since OLS minimizes the sum of squared errors, a "better" answer is one with a smaller e'e. Using this criterion, the linpack general matrix solver DGECO, Experiment 3, is "best," followed closely by the lapack general matrix solver, Experiment 4, and the linpack SVD routine, Experiment 10. Experiments 5 and 6 use the lapack general matrix solver that allows refinement and, in the case of Experiment 6, refinement and equilibration. These approaches did not do as well in determining a minimum e'e and were substantially more expensive in terms of computer time. Of interest is why Experiment 1 and Experiment 8 did not produce the same answer, since both used the linpack Cholesky routines. The answer relates to the way the coefficients are calculated. In the former case the Cholesky R is used to obtain the coefficients without explicitly forming (X'X)^-1, using the linpack routine DPOSL, while in the latter case (X'X)^-1 is formed from R using DPODI. In general, the answers are very close for this exercise.

17 Since this data set does not have the rank problems found with the Filippelli data, it is possible to attempt a number of alternative procedures. Not all these procedures should be used.

Table 16.5 Residual Sum of Squares on a VAR model of Order 6 – Gas Furnace Data
________________________________________________________________
Residual Sum of Squares for various methods

 1. OLSQ using Linpack Cholesky – solving from R     16.13858295915815
 2. OLSQ using LINPACK QR                            16.13858295915821
 3. OLSQ using LINPACK DGECO                         16.13858295915803
 4. OLSQ using LAPACK DGETRF-DGECON-DGETRI           16.13858295915806
 5. OLSQ using LAPACK DGESVX                         16.13858295935751
 6. OLSQ using LAPACK DGESVX with equilibration      16.13858295963500
 7. OLSQ using LAPACK DPOTRF-DPOCON-DPOTRI           16.13858295915812
 8. OLSQ using LINPACK DPOCO-DPODI                   16.13858295915811
 9. OLSQ using LINPACK DSICO-DSIDI                   16.13858295915814
10. OLSQ using SVD Linpack                           16.13858295915808
11. OLSQ using SVD Lapack                            16.13858295915810
_________________________________________________________________
Model estimated was gasout=f(gasout{1 to 6}, gasin{1 to 6}). Data from Box-Jenkins [3]. Data studied in Stokes [32]. Experiment 1 solves for the coefficients using the Cholesky R directly. Experiments 3-9 form (X'X)^-1.

The StRD Pontius data are classified as of a lower level of difficulty, although more challenging than the gas furnace data studied in the prior section. The Pontius data consist of 40 observations of a model of the form y = β0 + β1 x + β2 x^2 which is almost a perfect fit. The eigenvalues of (X'X), as calculated by the eispack routine RG, were 0.8109E+13, 0.7317E+27 and 3.613, giving a condition estimate that tripped the condition tolerance in the linpack LU and Cholesky routines for both real*8 and real*4 data. Calculations were "forced" by ignoring this check.18 Results are reported for a number of experiments in Table 16.6 that vary precision, method of calculation and degree of Fortran optimization for real*4 data. The base method was the QR for real*8 data, which gives a LRE = 13.54 for the coefficients. When accuracy was enabled, the LRE for the SE and e'e increased slightly from 12.39 to 12.51 and from 12.09 to 12.21, respectively, in Experiments 1 and 2. The linpack SVD produced a LRE of 13.92, 13.92 and 13.53 for the coefficient, the SE and e'e, respectively, while for lapack these were 13.48, 12.74 and 12.93, respectively, in Experiments 3 and 4.
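The distinction drawn above for the gas furnace experiments, solving from the Cholesky factor R (Experiment 1, the DPOSL approach) rather than explicitly forming (X'X)^-1 (Experiment 8, via DPODI), can be sketched as follows. This is a pure-Python illustration of the idea with names of my own choosing, not the linpack code:

```python
import math

def cholesky(A):
    """Lower-triangular factor L with A = L L' for a small SPD matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_from_factor(L, b):
    """Solve (L L') x = b by forward then back substitution; the
    inverse matrix is never formed (the DPOSL-style solve)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Tiny X'X and X'y: the solution of [[4,2],[2,3]] x = [1,1] is [0.125, 0.25]
x = solve_from_factor(cholesky([[4.0, 2.0], [2.0, 3.0]]), [1.0, 1.0])
```

Avoiding the explicit inverse saves both arithmetic and a little accuracy, which is why the two Cholesky experiments, though mathematically equivalent, differ in their last digits.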
Here, using accuracy as a criterion, linpack edged lapack. Since in the Filippelli data set the reverse was found, there appears to be no "best" SVD routine for all cases. In addition to accuracy, there are other aspects of the selection process, including relative speed of execution (tested in Table 16.7 and found to be a function of the size of the problem and the computer chip) and memory requirements, which are not tested here since they are published.19

18 The same data was estimated in Windows RATS, Doan (1992), version 6.0. While the reported coefficients agreed with the benchmark for 11, 11 and 14 digits, respectively, RATS unexpectedly produced a SE of 0.0 and a t of 0.0 for the β2 term. The "certified" coefficients and standard errors are:

   β0    0.673565789473684E-03    0.107938612033077E-03
   β1    0.732059160401003E-06    0.157817399981659E-09
   β2   -0.316081871345029E-14    0.486652849992036E-16

which produce a t for β2 of -64.95, not zero.

Experiments 5-8 show forced linpack LU and Cholesky models for real*8 data. In Experiments 7-8, added accuracy in the accumulators was enabled. Slight accuracy gains were observed, especially in the RSS calculation, where the LRE jumped from 12.77 and 12.73 to 13.23 and 13.39, respectively. What is interesting is that in this case, even though the condition of (X'X) was large, the LU and Cholesky approaches were able to get reasonable answers. The linpack condition check appears to be conservative, since in the usual case the software would not attempt the solution of this problem. Experiments 9-14 concern real*4 data.20 Again, the QR was found to be most accurate, with scores of 5.36, 6.01 and 5.65 for the coefficients, SEs and RSS, respectively. These runs were made with code compiled by Lahey Fortran version 7.10 running opt = 1. When accuracy enhancement was enabled, the LRE for the SE fell from 6.01 to 4.37.
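Footnote 18's point that the β2 t-statistic should be -64.95, not the 0.0 that RATS printed, follows directly from the certified values; a one-line check (values copied from the footnote):

```python
# Certified Pontius values for the beta_2 term (coefficient and SE)
beta2 = -0.316081871345029e-14
se2 = 0.486652849992036e-16

t = beta2 / se2  # a correctly computed t-statistic near -64.95
```

A tiny coefficient divided by a comparably tiny standard error is a perfectly ordinary ratio; the zero appears only when one of the two is rounded away before the division.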
This difference was traced to the fact that the BLAS real*4 routine SDOT is optimized to hold data in registers, while the higher-accuracy routine SDSDOT is not optimized to the same extent. This is shown when the same calculation was done with opt=0: the QR SE accuracy was 4.21 and 4.37 for non-accuracy and accuracy-enabled code, respectively. Higher accuracy was observed for the opt=1 LU-forced SE of 5.27 vs 4.80 for the opt=0 calculation. Why the forced Cholesky experiment seems to run more accurately at opt=0 than opt=1 (see Experiment 12) is not clear. What seems to be the case is that the level of optimization, and its resulting changes in register use, makes a detectable difference only with real*4 precision data. A strong case can be made not to use this precision for this or any other econometric problem. When real*8 calculations are used, these knife-edge type differences are not observed.

19 For lapack, memory was set to the suggested amount from the first call to the routine. Experimentation with alternative lapack memory settings, possible with the B34S system implementation of lapack, was not attempted for this problem since it was discussed earlier.

20 Data was first read in real*8. The B34S routine RND( ) first checked for maximum and minimum allowable real*4 size, using the Fortran functions HUGE( ) and TINY( ). Next, the real*8 data was written to a buffer, using g25.16, and re-read into real*4, using the format g25.16. This approach gives a close approximation to having read the data directly into real*4. Use of the Fortran function sngl( ) can be dangerous in that, among other things, range checking is not performed.

Table 16.6 LRE for Various Estimates of Coef, SE and RSS of Pontius Data

Real*8 Data
 #   Method            COEF    SE     RSS
 1.  QR               13.54  12.39  12.09
 2.  QR_AC            13.52  12.51  12.21
 3.  SVD-LINPACK      13.92  13.92  13.53
 4.  SVD-LAPACK       13.48  12.74  12.93
 5.  LU-Forced        12.61  13.02  12.77
 6.  Chol-Forced      12.11  13.00  12.73
 7.  LU-Forced_AC     12.77  13.61  13.23
 8.  Chol-Forced_AC   12.17  13.63  13.39

Real*4 Data, Optimization = 1
 9.  QR                5.36   6.01   5.65
10.  QR_AC             5.36   4.37   4.06
11.  LU-Forced         3.93   5.27   5.36
12.  Chol-Forced       3.97   3.36   3.06
13.  LU-Forced_AC      3.95   5.30   4.78
14.  Chol-Forced_AC    4.01   3.32   3.02

Real*4 Data, Optimization = 0
 9.  QR                5.36   4.21   3.91
10.  QR_AC             5.36   4.37   4.06
11.  LU-Forced         4.31   4.80   4.45
12.  Chol-Forced       4.48   4.51   4.26
13.  LU-Forced_AC      3.95   5.30   4.78
14.  Chol-Forced_AC    4.16   3.79   3.48
_______________________________________________________________________
All data were initially read in real*8. For real*4 results data were then converted to real*4. Forced means that the LINPACK condition check has been bypassed for testing purposes. All reported LRE values are for the means. All real*4 tests were done with LINPACK routines. Real*4 accumulators have not been enabled in cases where _AC is not added to the method name.

The Eberhardt data consist of 11 observations of a one-input model y = β1 x. The level of difficulty is rated as average. Results are shown in Table 16.7. Here the Cholesky, the linpack SVD and the lapack SVD all produce identical LRE values of 14.72, 15.00 and 14.91 for the coefficient, SE and RSS, respectively. For the QR, the coefficient LRE was 14.72 while the SE and residual LREs were marginally less at 14.40 and 14.05. Here again the methods being considered run very close together.

Table 16.7 LRE for QR, Cholesky, SVD linpack and lapack for Eberhardt Data
_______________________________________________________________________
Method         COEF    SE     RSS
QR            14.72  14.40  14.05
Chol          14.72  15.00  14.91
SVD-LINPACK   14.72  15.00  14.91
SVD-LAPACK    14.72  15.00  14.91
_______________________________________________________________________
All data read in real*8.
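The collapse of the real*4 LREs to the 3-6 range in Table 16.6 reflects the fact that single precision carries only about 7 significant digits before any computation is done. A quick stdlib sketch of the real*8-to-real*4 rounding step (using struct, rather than the B34S RND( ) conversion described in footnote 20):

```python
import struct

def to_real4(x: float) -> float:
    """Round a real*8 value to IEEE real*4 and back, mimicking the
    write-to-buffer-and-reread conversion described in footnote 20."""
    return struct.unpack('f', struct.pack('f', x))[0]

pi8 = 3.141592653589793
pi4 = to_real4(pi8)
rel_err = abs(pi4 - pi8) / pi8  # on the order of 1e-8: about 7-8 good digits
```

Any subsequent loss from the solution method is then amplified on top of this representation error, which is why no real*4 experiment can approach the real*8 LREs.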
The above results suggest that in certain problems that have a high degree of multicollinearity, the results are sensitive to the level of precision of the calculation as well as the method of calculation. A challenging example was the Filippelli polynomial data set, which was discussed earlier. However, that discussion was not complete because the real*16 QR results were only compared to the 15-digit "official" benchmark, not to a benchmark with more digits. Since real*16 will give more than 15 digits of accuracy, an important final task for this section is to extend the Filippelli benchmark, using variable precision arithmetic, to benchmark the accuracy of the real*16 results obtained.

The variable precision library developed by Smith [30] was implemented in the B34S to extend the Filippelli benchmark and thus fully test the true accuracy of the reported real*16 results. The linpack LU inversion routines DGECO, DGEFA and DGEDI were rewritten to allow variable precision calculations. What was formerly a real*8 variable became a 328-element real*8 vector. Simple statements, such as A=A+B*C, had to be individually coded, using a customized pointer routine, IVPAADD( ), that would address the correct element to pass to a lower level routine to make the calculation.21 A simple example gives some insight into how this is done:

c
c     if (z(k) .ne. 0.0d0) ek = dsign(ek,-z(k))
c
      if(vpa_logic(kindr,
     *   z(ivpaadd(kindr,k,1,k,1)),'ne',vpa_work(i_zero)) )then
         call vpa_mul(kindr,vpa_work(i_mone),z(ivpaadd(kindr,k,1,k,1)),
     *        vpa_work(iwork(4)))
         call vpa_func_2(kindr,'sign',vpa_work(i_ek),
     *        vpa_work(iwork(4)),
     *        vpa_work(iwork(5)) )
         call vpa_equal(kindr,vpa_work(iwork(5)),vpa_work(i_ek))
      endif

vpa_work( ) is a 328 by 20 work array. The term z(ivpaadd(kindr,k,1,k,1)) addresses the kth element of Z, which is 328 by k, and compares it to a constant 0.0 saved in vpa_work(i_zero).
If these two variables are not equal, then the three calls are executed to solve ek = dsign(ek,-z(k)). The first call forms -z(k) and places it in vpa_work(iwork(4)); the variable vpa_work(i_mone) contains -1.0. Next, the SIGN function is called and the result is placed in vpa_work(iwork(5)). Finally, a copy is performed. This simple example shows what is involved to "convert" a real*8 program to do VPA math. The results can be spectacular.22 The test job vpainv is shown next:

/;
/; Shows gains in accuracy of the inverse with vpa
/;
b34sexec matrix;
call echooff;
n=6;
x=rn(matrix(n,n:));
ix=inv(x,rcond8);
r16x=r8tor16(x);
ir16x=inv(r16x,rcond16);
call print('Real*4 tests',sngl(x),inv(sngl(x)),sngl(x)*inv(sngl(x)));
call print('Real*8 tests',x, ix, x*ix);
call print('Real*16 tests',r16x,ir16x,r16x*ir16x);
vpax=vpa(x);
ivpax=inv(vpax,rcondvpa);
detvpa=%det;
call print(rcond8,rcond16,rcondvpa,det(x),det(r16x),detvpa);
call print('Default accuracy');
call print('VPA Inverse ',vpax,ivpax,vpax*ivpax);
/; call vpaset(:info);
do i=100,1850,100;
call vpaset(:ndigits i);
call vpaset(:jform2 10);
call print('*************************************************':);
vpax=mfam(dsqrt(dabs(vpa(x))));
call vpaset(:jform2 i);
call print('Looking at vpax(2,1) given ndigits was set as ',i:);
call print(vpax(2,1));
ivpax=inv(vpax);
call print('VPAX and Inverse VPAX at high accuracy ',
     vpax,ivpax,vpax*ivpax);
call print('*************************************************':);
enddo;
b34srun;

21 Stokes (2005) provides added detail on how this was accomplished.

22 The job vpainv, in paper_86.mac, which is distributed with B34S, illustrates the gains in accuracy for alternative precision settings. Assuming a matrix X, X*inv(X) produces off-diagonal elements on the order of |.1e-1728|, which is far superior to what can be obtained with real*4, real*8 or real*16, results for which are also shown in the test problem. The B34S VPA implementation allows these high-accuracy calculations to be mixed with lower precision commands, using real*4, real*8 and real*16, since data can be moved from one precision to another. This allows experimentation concerning how sensitive the results are to accuracy settings.

Edited output from running this script is shown next. First the real*4 matrix x is displayed, then its inverse, and then x*inv(x). Errors are in the range of |1.e-6| and smaller.
Real*4 tests

[The 6 by 6 real*4 matrix x, its inverse, and x*inv(x) are listed here; the off-diagonal elements of x*inv(x) are of order |1.e-6| or smaller.]

Next the experiment is repeated for real*8 and real*16 versions of the same matrix. Here errors are in the area of |.1e-15| and smaller and |.1e-33| and smaller, respectively. Using the default VPA setting and the same matrix, these errors become |.1e-62| and smaller.

Real*8 tests

[The listings of x, ix and x*ix are omitted; the off-diagonal elements of x*ix are of order |.1e-15| or smaller.]

Real*16 tests

[The listings of r16x, ir16x and r16x*ir16x are omitted; the off-diagonal elements of the product are of order |.1e-33| or smaller.]

RCOND8   =   0.50111667E-01
RCOND16  =   0.5011166670247408E-01
RCONDVPA =   5.01116667024740759246941521228642361326495469435182039839368M-2
DET(X)   =   15.503129
DET(R16X)=   15.50312907174408
DETVPA   =   1.55031290717440844136019448415020291808052552694282172989314M+1

Default accuracy
VPA Inverse

[The listings of vpax, ivpax and vpax*ivpax are omitted; the off-diagonal elements of vpax*ivpax are of order |.1e-62| or smaller.]

Next the VPA degree of accuracy is increased from 100 to 1700+ in steps of 100. Edited results for 100 and 1775 digits show accuracy in the range of |.1e-104| and an astounding |.1e-1784|, which illustrates what is possible with VPA math. A typical element, vpax(2,1), is also shown.

*************************************************
Looking at vpax(2,1) given ndigits was set as 100
1.0407938475538507976540614667166763813116409216491186532751876440441549
90671195410174838727421096493M+0

VPAX and Inverse VPAX at high accuracy

[The listings of vpax, ivpax and vpax*ivpax are omitted; the off-diagonal elements of vpax*ivpax are of order |.1e-104| or smaller.]
*************************************************
Note: Precision out of range when calling FMSET. NPREC = 1800
Nearest valid precision used given ndig= 256
Upperlimit on :ndigits = 1775
*************************************************
Looking at vpax(2,1) given ndigits was set as 1800

[vpax(2,1) is printed to 1775 digits, beginning
1.0407938475538507976540614667166763813116409216491186532751876440441549
9067119541017483872742109649284847786731954780618396337243557760905134253
with the remaining digits omitted here.]

VPAX and Inverse VPAX at high accuracy

[The listings of vpax, ivpax and vpax*ivpax are omitted; the off-diagonal elements of vpax*ivpax are of order |.1e-1784| or smaller.]
*************************************************

Table 16.8 shows the Filippelli data set benchmark, an extended printout of the QR real*16 results, and the expanded Filippelli benchmark calculated with VPA data to 40 digits. A ruler listed at the top of the table is designed to assist the reader in determining at which digit there is a difference. Consider coefficient # 1. The VPA beta agrees with the real*16 QR beta up to the 28th digit, far beyond the 15 digits that were all that was listed for the "benchmark," which is shown again in Table 16.8. The VPA experiment documents that the real*16 calculation is in fact substantially more accurate than the best real*8 QR, which produced on average 7 digits, as reported in Table 16.3. Recall that the "converted" real*16 results (data converted from real*8 to real*16), reported in Table 16.2 Experiment 5, had an LRE of 7.924, only marginally better than the real*8 QR results, whose LRE was 7.31. Although Tables 16.2 and 16.3 reported that the "true" real*16 QR results (data loaded directly into real*16) had a LRE value of 14.79, once the VPA benchmark was available to 40 digits it was apparent that the LRE was substantially larger. Even the calculations of the 10th and 11th coefficients, when compared with the VPA data, produced 27 digits of accuracy. It should be remembered that these impressive results for real*16 are due both to the accuracy of the calculation and to the fact that the data was read directly into real*16, not converted from real*8 to real*16. As we have shown, the precision of the stored data makes a real difference in addition to the precision of the calculation.
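The precision-scaling behavior seen in the vpainv output can be mimicked in miniature with Python's stdlib decimal module standing in for the FM variable-precision library. The sketch below inverts a 2 by 2 matrix (its entries are the top-left corner of the printed x; the function names are mine, not B34S's) and watches the off-diagonal residual of A*inv(A) shrink as the working precision is raised:

```python
from decimal import Decimal, getcontext

def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] carried out in the current
    Decimal working precision."""
    det = a * d - b * c
    return d / det, -b / det, -c / det, a / det

# Top-left 2 by 2 of the example matrix x
a, b = Decimal("2.05157"), Decimal("-1.32010")
c, d = Decimal("1.08325"), Decimal("-1.52445")

residual = {}
for ndigits in (16, 1000):
    getcontext().prec = ndigits
    ia, ib, ic, id_ = inv2(a, b, c, d)
    # off-diagonal element of A*inv(A); exact arithmetic gives 0
    residual[ndigits] = abs(a * ib + b * id_)
```

At 16 digits the residual is roughly at real*8 rounding level, while at 1000 digits it drops below 1e-990, the same pattern the B34S :ndigits setting produces on the full 6 by 6 problem.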
The important implication is that the inherent precision of the calculation method will be no help, and in fact may give misleadingly "accurate" results, unless the data is read with sufficient precision.23 Some of the key lessons of this chapter are listed in Table 16.9. The main finding is the accuracy tradeoff between the precision of the data and the calculation method used. In all cases, it is important to check for rank problems before proceeding with a calculation. The lower the precision of the data, the more appropriate it is to consider higher accuracy solution methods such as the QR and the SVD approach.24 Real*4 data-storage precision, and more importantly real*4 calculation precision, was found to be associated with a substantial loss of accuracy, even with less stiff data sets.

23 In order to fully isolate the VPA results from data-reading issues, the loading of data into the VPA array proceeded as follows. The real*16 data was printed to a character*1 array using e50.32. Next, the VPA string input routine was used to convert this character*1 array into a VPA variable. This way both the real*16 and the VPA results were using the same data. Experiments were also conducted by reading the data in character form directly into the VPA routines. For this problem both methods of data input into VPA made no difference since there were relatively few digits. In results not reported but available in paper_86.mac, the Filippelli problem was "extended" by adding x^11, ..., x^20 to the right-hand side to make the problem more difficult (stiff). Both the VPA and the native real*16 experiments were run and both successfully solved the problem, suggesting "reserve" capability to handle a substantially more stiff problem.

24 While the main thrust of this chapter has been to show the effect of various factors on the number of "correct" digits of a calculation, in applied econometric work an important consideration is how many digits to report.
If the government data is known only to k digits, many researchers argue that only k digits of accuracy should be reported. In many situations this is appropriate, although such a practice makes it difficult to assess the underlying accuracy of the calculation routines used in the software system. Clearly if variables such as ŷ or ê are to be calculated, all estimated digits should be used to insure that, for example, the residuals sum to zero (Σê = 0).

124 Matrix Command Language

Table 16.8 VPA Alternative Estimates of Filippelli Data set
________________________________________________________________________

For each coefficient the table lists the VPA estimate, the real*16 QR estimate and the reference answer, for both the coefficient (beta) and its standard error (SE).

Coefficient 1
  VPA beta          -.2772179591933423928028447556649596044434M+4
  Real*16 QR beta   -0.2772179591933423928028447553572108500000E+04
  Answer for beta   -0.2772179591933420E+04
  VPA SE             .5597798654749498745747725508021651489727M+3
  Real*16 QR SE      0.5597798654749498745747725479752748700000E+03
  Answer for SE      0.5597798654749500E+03

Coefficient 2
  VPA beta          -.2316371081608930758821967916501044936138M+4
  Real*16 QR beta   -0.2316371081608930758821967914097820200000E+04
  Answer for beta   -0.2316371081608930E+04
  VPA SE             .4664775721277964526931098320484471124838M+3
  Real*16 QR SE      0.4664775721277964526931098297461005100000E+03
  Answer for SE      0.4664775721277960E+03

Coefficient 3
  VPA beta          -.1127973940983715698571670015266249731414M+4
  Real*16 QR beta   -0.1127973940983715698571670014199826100000E+04
  Answer for beta   -0.1127973940983720E+04
  VPA SE             .2272042744777513106293981763510244738352M+3
  Real*16 QR SE      0.2272042744777513106293981752622826700000E+03
  Answer for SE      0.2272042744777510E+03

Coefficient 4
  VPA beta          -.3544782337033487716107384852595281875294M+3
  Real*16 QR beta   -0.3544782337033487716107384849646966900000E+03
  Answer for beta   -0.3544782337033490E+03
  VPA SE             .7164786608759273726166572118158443735326M+2
  Real*16 QR SE      0.7164786608759273726166572085071780100000E+02
  Answer for SE      0.7164786608759270E+02

Coefficient 5
  VPA beta          -.7512420173937571389052207557481187222874M+2
  Real*16 QR beta   -0.7512420173937571389052207552268365400000E+02
  Answer for beta   -0.7512420173937570E+02
  VPA SE             .1528971787474000650307567904607140782062M+2
  Real*16 QR SE      0.1528971787474000650307567897859220700000E+02
  Answer for SE      0.1528971787474000E+02

Coefficient 6
  VPA beta          -.1087531803553425108528108118290083531722M+2
  Real*16 QR beta   -0.1087531803553425108528108117714492600000E+02
  Answer for beta   -0.1087531803553430E+02
  VPA SE             .2236911598160332755518623413323850745016M+1
  Real*16 QR SE      0.2236911598160332755518623403977080500000E+01
  Answer for SE      0.2236911598160330E+01

Coefficient 7
  VPA beta          -.1062214985889467664596611220591597363944M+1
  Real*16 QR beta   -0.1062214985889467664596611220235596600000E+01
  Answer for beta   -0.1062214985889470E+01
  VPA SE             .2216243219342274020661298346608897939687M+0
  Real*16 QR SE      0.2216243219342274020661298337934033000000E+00
  Answer for SE      0.2216243219342270E+00

Coefficient 8
  VPA beta          -.6701911545934083759267341228848844976973M-1
  Real*16 QR beta   -0.6701911545934083759267341228119136200000E-01
  Answer for beta   -0.6701911545934080E-01
  VPA SE             .1423637631547239489182330919953278852498M-1
  Real*16 QR SE      0.1423637631547239489182330914795936200000E-01
  Answer for SE      0.1423637631547240E-01

Coefficient 9
  VPA beta          -.2467810782754786508408544524189188555839M-2
  Real*16 QR beta   -0.2467810782754786508408544524564670500000E-02
  Answer for beta   -0.2467810782754790E-02
  VPA SE             .5356174088898209362586519329555783802279M-3
  Real*16 QR SE      0.5356174088898209362586519311846583900000E-03
  Answer for SE      0.5356174088898210E-03

Coefficient 10
  VPA beta          -.4029625250804036712971315485276426445821M-4
  Real*16 QR beta   -0.4029625250804036712971315487091695800000E-04
  Answer for beta   -0.4029625250804040E-04
  VPA SE             .8966328373738682221004152725410272047808M-5
  Real*16 QR SE      0.8966328373738682221004152698795102200000E-05
  Answer for SE      0.8966328373738680E-05

Coefficient 11
  VPA beta          -.1467489614229795882287848515307287127546M+4
  Real*16 QR beta   -0.1467489614229795882287848513596070800000E+04
  Answer for beta   -0.1467489614229800E+04
  VPA SE             .2980845309955369852005523437755166954313M+3
  Real*16 QR SE      0.2980845309955369852005523422443903600000E+03
  Answer for SE      0.2980845309955370E+03
________________________________________________________________________

Table 16.9 Lessons to be learned from VPA and Other Accuracy Experiments
________________________________________________________________________

1. The QR method of solving an OLS regression model can provide 1-2 more digits of accuracy and in fact may be the only way to successfully solve a "stiff" or multicollinear model.

2. The precision in which data are initially loaded into memory (for example, single precision) impacts accuracy, even in cases where the data are later moved to a higher precision (for example, double precision) for the calculation. This suggests that data should be read into the precision in which the calculation is made, to avoid the numeric representation accuracy issues that occur when the precision of the data is increased. This means that the current practice of a number of software systems of saving data in real*4, but moving this data into real*8 for calculations, is a dangerous practice that unnecessarily induces accuracy issues.

3. In many cases, accuracy gains can be made by boosting the precision of accumulators such as the BLAS routines for sum, absolute sum and dot product. Such routines should be used throughout software systems and will increase the accuracy of the variance and other calculations. It is desirable to be able to switch such accuracy improvements on and off to test the sensitivity of a given problem to these changes. Accuracy improvements to these routines have a CPU cost.

4. Data base design should take into account the needs of users who may want to read data into higher-than-usual precision. For data that are not transformed in a data bank, the user should be able to get all reported digits of precision without rounding (due to numeric representation) loss.
This means allowing a character representation to be accessible as well as a real*4 or real*8 representation. This suggestion has far-reaching implications since most if not all databanks save data in some kind of real format. If character saving of all data is not possible, a partial solution would be to save all data in at least real*8.

5. The new 64-bit computers will make higher-precision calculations more viable and may prove useful for the estimation of problems requiring high precision for their successful solution. Real*16 and complex*32 will not have to be emulated in software by the compilers. These technological changes on the hardware side suggest that software designers may want to offer greater than double precision math in future releases of their products.

6. The lower the precision of the data, the more imperative it is to check for rank problems, use high-quality numeric routines (lapack/linpack etc.) and utilize inherently higher accuracy solution methods, such as the QR. For many problems, however, if data are read with sufficient accuracy, this may not be needed.

7. If data are not initially read with sufficient precision, high-accuracy methods of calculation, such as the QR, can provide misleadingly "accurate" results that are in fact tainted by numeric representation issues inherent in the initial data read. This initial data "corruption" cannot be "cured" by any subsequent increase in data precision. The more "stiff" the problem, the more this becomes an important consideration.

16.8 Conclusion

In the 1960's, when econometric software was not generally available, users passed around Fortran programs, usually with a crude column-dependent command structure. In that era, an applied econometrician needed to know the theory, the econometrics and, in addition, be able to program in Fortran. In the 70's this practice gave way to commercially available procedure-driven software that could perform the usual analysis.
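Lessons 2 and 3 of Table 16.9 can be illustrated outside of B34S. The following short sketch, written by this editor in Python rather than in the matrix language, shows that a value saved in real*4 retains its single-precision representation error even after promotion to real*8, and that a higher-accuracy accumulator (here Python's math.fsum, standing in for an extended-precision BLAS-style sum) recovers digits a naive running sum loses:

```python
import math
import struct

# Lesson 2: a value stored in real*4 and later promoted to real*8
# keeps its single-precision representation error.  Here 0.1 is
# written to a 4-byte float and read back into a real*8 value.
promoted = struct.unpack("f", struct.pack("f", 0.1))[0]
direct = 0.1                       # the same value read directly in real*8
rep_error = abs(promoted - direct) # about 1.5e-9, far above real*8 epsilon
assert rep_error > 1e-10           # the promotion did not "cure" the error

# Lesson 3: a compensated (higher-accuracy) accumulator recovers
# digits that a naive running sum loses to cancellation.
data = [1.0e16, 1.0, -1.0e16]
naive = sum(data)                  # the contribution of 1.0 is lost
accurate = math.fsum(data)         # exact rounding of the true sum
assert accurate == 1.0 and naive != 1.0
```

The same experiment can be run in any language; the point is only that neither a later precision increase nor a high-accuracy solver can restore digits discarded when the data were first stored.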
In parallel, a number of 4th generation languages were developed. These included APL, the SAS® Matrix Language, Speakeasy® and later MATLAB®, GAUSS® and Mathematica®. The B34S matrix command, while patterned on Speakeasy, is targeted at econometric analysis of time series, nonlinear detection and modeling, and Monte Carlo analysis. The learning curve for its use is substantially less than that of Fortran, and the many built-in commands allow the user to program custom calculations relatively quickly. For routine analysis the built-in procedures are usually sufficient. The examples in this chapter illustrate the wide range of problems that can be solved using data saved in a number of precisions. Inspection of matrix language programs, as well as of programs written in other 4th generation languages, both documents the calculation being made and facilitates replication exercises with other software. Such systems make it easy to experiment, something substantially more difficult when Fortran and/or C programs had to be custom built for each research step.25 In many of the chapters there is further discussion of specific problems that were studied with the help of the matrix command.

25 Stokes (2004b) extensively discussed this aspect of modern econometric software languages.
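The accuracy tables in this chapter report results as LRE (log relative error), the approximate number of correct significant digits in an estimate relative to a certified value. As a reference for readers wishing to score their own software, the measure can be computed as in the following sketch (an editor's illustration in Python, not B34S code; the test value is a hypothetical estimate, not an entry from the tables above):

```python
import math

def lre(estimate, certified, max_digits=15):
    """Log relative error: approximate number of correct significant
    digits in `estimate` relative to the `certified` value."""
    if estimate == certified:
        return float(max_digits)           # exact agreement
    if certified == 0.0:
        rel = abs(estimate)                # fall back to absolute error
    else:
        rel = abs(estimate - certified) / abs(certified)
    # clamp to [0, max_digits] as is conventional for LRE reporting
    return min(float(max_digits), max(0.0, -math.log10(rel)))

# A hypothetical estimate of Filippelli beta 1 agreeing to roughly
# 12 digits with the reference value scores an LRE near 12.
digits = lre(-2772.17959193, -2772.179591933418)
```

An LRE near the digit count of the certified value indicates the routine reproduced essentially all reportable digits; an LRE near zero indicates a failed calculation.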