Revised Chapter 1 in Specifying and Diagnostically Testing Econometric Models (Edition 3)
© by Houston H. Stokes 11 February 2009. All rights reserved. Preliminary Draft
Chapter 1
Applied Econometric Modeling
1.0 Introduction
1.1 Outline of the Book
1.2 B34S Overview
Table 1.1 B34S Commands
Table 1.2 Original Origin of Source Code for Various B34S Commands
Table 1.3 B34S Run Stand Alone
Table 1.4 B34S Run Under SAS Using the cb34sm Macro
Table 1.5 B34S Platforms
1.3 B34S Display Manager
Table 1.6 B34S MAKEMENU Commands to Generate the rr Command
1.4 Conclusion
Applied Econometric Modeling
1.0 Introduction
This book illustrates the use of model specification and diagnostic tests applied to a variety of
econometric modeling techniques. The techniques discussed include simple, one-equation OLS
models with continuous variables on the left-hand side. These models can be tested for the
appropriate specification and for changes in the parameters over time and for different levels of the
right-hand-side variables using recursive residuals (RR) and best linear unbiased scalar (BLUS)
residual tests. Extensions of the simple, one-equation model include models in which the left-hand
side is a 0-1 variable (probit and logit models) or models in which the left-hand-side variable is
bounded (tobit models).
If we relax the assumption of exogenous variables on the right-hand side of a model, the
appropriate estimation technique is either two-stage least squares or limited information maximum
likelihood. Sets of equations should be estimated with three-stage least squares if there is covariance
among the error terms. If the data set consists of pooled data (time-series and cross-section),
error-components models are appropriate. In more limited cases where market share analysis is desired,
Markov probability models are a viable alternative. Forecasting extensions of the simple OLS model
include modeling only the error (ARIMA analysis) or specifying the dynamics of the mapping of the
exogenous variables on the endogenous variables and modeling the error (transfer function
modeling). The vector autoregressive (VAR) and vector autoregressive moving average (VARMA)
models are shown to be a time series generalization of three-stage least squares and full information
maximum likelihood models. Transfer function and ARIMA models are special cases of these more
general VARMA forms. VAR models can be viewed in the frequency domain for added insight.
More specialized techniques include orderly searches for the appropriate equation specification,
using the MINIMAX and L1 models; optimal control analysis; nonlinear analysis; and the QR
approach to computation. The purpose of this monograph is to illustrate the above techniques, using
actual research data. To facilitate the calculations, the B34S Data Analysis program was developed
and its application will be illustrated. The B34S matrix command was developed to provide a 4th
generation programming language that was especially suitable for econometric and time series
analysis. Many problems are illustrated using this programming language. Sample output for all
procedures discussed in the text has been provided so that the availability of the B34S program is
not required to benefit from this book.1
1.1 Outline of the Book
Chapter 2 discusses options involving regression analysis and specification tests. These
options are accessed from the regression, reg and robust commands and include ordinary least
squares, weighted least squares, generalized least squares, L1 estimation, MINIMAX estimation, and
heteroskedasticity, normality, and serial correlation tests. Additional features include BLUS
residual analysis, BAYES analysis options, and other residual analysis options (ra option).
Chapter 3 is devoted to a discussion of logit, probit and tobit models, all of which involve
restrictions on the range of the dependent variable. The basic code for the logit routines, which are
accessed with the loglin command, was obtained from Nerlove and Press (1973, 1976). The tobit
and probit code was obtained from Mathematica Policy Research Corp2 and is accessed by the
probit and tobit commands. The multinomial logistic code, which is accessed by the mloglin
command, was initially obtained from Kawasaki (1978, 1979). A revision of this program, based on
the prior B34S version, was obtained from Klein and Klein (1988) and Klein (1988). The multinomial
probit procedure, which operates on ordered probit data and which is accessed with the mprobit
command, was developed from code originally written by McKelvey and Zavoina (1971, 1975).
Chapter 4 discusses the use of routines built by Les Jennings (1980) that calculate ordinary
least squares, limited-information maximum likelihood, two-stage least squares, three-stage least
squares, iterative three-stage least squares and full-information maximum likelihood estimation for
systems of equations. This code is accessed by the simeq command. Advantages of the Jennings
code are the speed and accuracy of the algorithms used (QR approach) and the option of obtaining
the constrained reduced form of a system of simultaneous equations.
Chapter 5 is devoted to problems that arise in the distribution of the error term when pooled
1 Programs are listed in the text using upper case Courier Font. Commands inside a program are listed in the text as
bold lower case Times Roman. Techniques such as ordinary least squares are listed in the text in upper case. For
example, a MARS model is estimated by the mars command in B34S.
2 The tobit command Fortran code was very old and possibly is the original Tobin program. The probit code
most likely was originally developed by John Cragg, but has been changed by a number of others. All three
programs (loglin, probit and tobit) were converted to double precision and extensively improved by the addition of
the LINPACK matrix subroutines (Dongarra, Bunch, Moler, and Stewart 1979). The original developers are
absolved from any responsibility for any possible errors that might have been inadvertently added.
time-series and cross-section data are used in a regression. The error-components procedure
avoids both the assumption that the constant is the same across the cross section and through time
and the loss of degrees of freedom that occurs if this assumption is relaxed and multiple
dummy variables are entered into the equation for each time period or for each cross-section
observation. The code used in this section is an extension of the Freiden (1973) program by Houston
H. Stokes, following suggestions by Henry and McDonald. The basic reference is Henry, McDonald, and
Stokes (1976). Balanced error-component models are accessed with the ecomp command, while
dynamic unbalanced panel datasets can be analyzed with the reg command.
Chapter 6 discusses an extension of the basic Lee, Judge, and Zellner (1970) Markov
probability model, following suggestions contained in Theil (1972, Chap. 5). The basic Markov
code was first extended to allow more states, and many of the linear algebra routines were replaced
with LINPACK matrix routines. The code was next extended to include decomposition of the
transition probability matrix into the fundamental matrix, the exchange matrix, the mean
first-passage matrix, etc., and placed in the transprob command. These extensions were used in a number
of articles by Kosobud and Stokes (1978, 1979, 1980) modeling OPEC behavior and Neuburger and
Stokes (1979a) modeling economic history.
Chapters 7 and 8 are devoted to time series analysis. Chapter 7 discusses the use of the
autocorrelation and cross correlation function in building OLS models, autoregressive integrated
moving-average models (ARIMA) and transfer-function (TF) models. The commands bjiden and
bjest are used to identify and estimate these models, respectively. The basic code used in these
commands was originally built by David Pack (1977), following suggestions made by Box and
Jenkins (1976) and Box and Tiao (1975). Suggestions of Neuburger and Stokes (1979b), Stokes and
Neuburger (1979) and Stokes (1990) for additional diagnostic specification tests have been
incorporated. A simplified treatment of the ARIMA modeling process is contained in Nelson
(1973). Recent developments are outlined in Enders (1995, 2004) and Tsay (2002, 2005). The Pack
code has been extensively modified to improve accuracy and to include many features not found in the
original, such as spectral analysis and further diagnostic tests. In addition, an automatic
model estimation command, autobj, has been developed to run as part of matrix.
Chapter 8 discusses vector autoregressive moving-average model building (VARMA). The
btiden and btest commands identify and estimate a VARMA model. These commands were based
on a heavily modified version of the Wisconsin WMTS-1 program, which was developed by Tiao and
Box (Tiao, Box, Grupe, Hudak, Bell, Chang 1979). Enhancements to the code include tests on the
residuals suggested by Hinich (1982), Hinich and Patterson (1985, 1986), Hinich and Wolinsky
(1988), and Stokes and Hinich (1989) and a decomposition of the covariance matrix to study
instantaneous causality suggested by Granger and Newbold (1977, 223). Melvin Hinich most
generously supplied his code, which formed the basis of the bispec sentence, which is callable from a
number of procedures, and of the mvnltest command. The important book by Patterson - Ansley
(2000) provides further information on detecting nonlinearity.
Chapter 9 discusses how to use the recursive-residual (RR) analysis technique to test an
equation for parameter stability. The rr command code was written by Houston H. Stokes,
following the suggestions in the seminal article by Brown, Durbin, and Evans (1975) and Dufour
(1979, 1982). Since the recursive residual technique involves repeated calculation of regressions as
new observations are added, a great deal of effort has been devoted to making the code execute
quickly and accurately. Modifications of the recursive residual technique have been made to allow it
to be used with cross-section samples to test for both variation of the coefficients for different levels
of the explanatory variables and for interaction effects. The code has been improved by the inclusion
of LINPACK routines, particularly in updating and downdating the Cholesky decomposition, and by
the ability to display the results using high resolution graphics on the PC versions of B34S.
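The recursive residual calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the B34S rr implementation: it handles only a constant plus one regressor, it refits the regression from scratch at each step instead of updating a Cholesky factor, and the function names recursive_residuals and ols_rss are hypothetical. The data are the five observations listed in Table 1.3.

```python
import math

def recursive_residuals(x, y):
    """Recursive residuals for y = a + b*x (constant plus one regressor, k = 2).

    For t = k+1, ..., T the coefficients are estimated from the first t-1
    observations and the one-step prediction error for observation t is
    scaled by sqrt(1 + x_t'(X'X)^{-1} x_t), following Brown, Durbin, and Evans.
    """
    k = 2
    out = []
    for t in range(k, len(x)):
        xs, ys = x[:t], y[:t]
        s0, s1 = float(t), sum(xs)
        s2 = sum(v * v for v in xs)
        sy = sum(ys)
        sxy = sum(u * v for u, v in zip(xs, ys))
        det = s0 * s2 - s1 * s1  # zero if the first t x-values are all equal
        # Elements of the inverse of the 2 by 2 matrix X'X.
        i00, i01, i11 = s2 / det, -s1 / det, s0 / det
        a = i00 * sy + i01 * sxy
        b = i01 * sy + i11 * sxy
        err = y[t] - (a + b * x[t])
        scale = 1.0 + i00 + 2.0 * i01 * x[t] + i11 * x[t] * x[t]
        out.append(err / math.sqrt(scale))
    return out

def ols_rss(x, y):
    """Residual sum of squares from OLS of y on a constant and x, full sample."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((v - xbar) ** 2 for v in x)
    sxy = sum((u - xbar) * (v - ybar) for u, v in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return sum((v - a - b * u) ** 2 for u, v in zip(x, y))

x = [11.0, 33.0, 55.0, 99.0, 77.0]
y = [22.0, 44.0, 66.0, 77.0, 88.0]
w = recursive_residuals(x, y)
print("recursive residuals:", w)
print("sum of squares:", sum(v * v for v in w), "full-sample RSS:", ols_rss(x, y))
```

A useful check on any implementation is that the sum of squared recursive residuals equals the residual sum of squares of the full-sample OLS regression. Refitting at every step, as done here, is simple but slow, which is why updating methods are preferred in production code.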
Chapter 10 discusses the QR approach to OLS estimation, a technique particularly suitable in
cases of multicollinearity. While ridge and lasso estimation procedures attempt to deal with the
multicollinearity problem by investigating the effect on the OLS coefficients of a perturbation of the
X'X matrix, the QR procedure factors the N by K X matrix directly such that X=QR, where the K
vectors in the N by K matrix Q are orthonormal and the K by K matrix R is upper triangular. Both
ridge approaches and lasso approaches can be thought of as data reduction techniques. Although a
Cholesky decomposition of X'X into R'R is an alternative approach to get R, the disadvantage of the
latter procedure is that the condition (ratio of the largest to smallest eigenvalue) of X'X is the square
of the condition of R. When X'X is close to singularity, problems will arise that would have been
avoided if one could get R directly (via the QR method) without forming the more rank-deficient
matrix X'X. 3 The QR factorization code in B34S is taken directly from LINPACK and, by use of a
pivoting option, allows the user to detect dependencies among the columns of X. The qr command
performs the above procedures and can optionally calculate principal-component regressions.
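The factorization X=QR and its use for OLS can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: it uses modified Gram-Schmidt without pivoting, not the LINPACK routine used by the qr command, so it cannot detect dependent columns, and the names qr_mgs and ols_via_qr are hypothetical. The coefficients are obtained by back-substitution in Rb = Q'y, so X'X is never formed.

```python
import math

def qr_mgs(X):
    """Modified Gram-Schmidt. X is a list of n rows of length k.

    Returns Q (n by k, orthonormal columns) and R (k by k, upper triangular)
    with X = QR. No pivoting: a dependent column gives a zero diagonal in R
    and a division by zero.
    """
    n, k = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(k)]  # work column by column
    q_cols = []
    R = [[0.0] * k for _ in range(k)]
    for j in range(k):
        v = cols[j]
        R[j][j] = math.sqrt(sum(a * a for a in v))
        q = [a / R[j][j] for a in v]
        q_cols.append(q)
        # Remove the component along q from every remaining column.
        for m in range(j + 1, k):
            R[j][m] = sum(a * b for a, b in zip(q, cols[m]))
            cols[m] = [b - R[j][m] * a for a, b in zip(q, cols[m])]
    Q = [[q_cols[j][i] for j in range(k)] for i in range(n)]
    return Q, R

def ols_via_qr(X, y):
    """OLS coefficients from back-substitution in R b = Q'y."""
    Q, R = qr_mgs(X)
    k = len(R)
    qty = [sum(Q[i][j] * y[i] for i in range(len(y))) for j in range(k)]
    beta = [0.0] * k
    for j in range(k - 1, -1, -1):
        s = qty[j] - sum(R[j][m] * beta[m] for m in range(j + 1, k))
        beta[j] = s / R[j][j]
    return beta

# Constant plus the x series from Table 1.3; y is the dependent variable.
X = [[1.0, 11.0], [1.0, 33.0], [1.0, 55.0], [1.0, 99.0], [1.0, 77.0]]
y = [22.0, 44.0, 66.0, 77.0, 88.0]
print("coefficients (constant, slope):", ols_via_qr(X, y))
```

Because R is obtained directly from X, its condition is the square root of the condition of X'X, which is the numerical advantage discussed above.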
Chapter 11 concerns nonlinear estimation, which formerly could be accessed only through the largely
superseded nonlin command. At present a number of nonlinear commands are available
under the matrix command. Unlike the older approach that required the user to code the model in a
Fortran subroutine, the matrix command approach allows the user to code the model in a 4th
generation language while using a compiled code to actually solve the system. This hybrid approach
differs from MATLAB® and other systems where both the solver and the model are coded in a 4th
generation language. The basic nonlinear least squares code was originally written by Meeter
(1964a, 1964b) to implement the Marquardt (1963) algorithm.4 Initially the code was improved via
the addition of LINPACK routines and the use of a dynamic calling option, which allowed the user to
code his/her own models and create a library of compiled Fortran subroutines that B34S could
branch to during execution. This approach, while fast, required knowledge of Fortran and user
compilers. In the matrix command implementation of the same program, while some speed is lost
3 Strang (1976) contains an excellent discussion of this approach.
4 The gaushaus routine, developed by Meeter (1964a, 1964b), has passed the test of time and three variants are
used: in the Box-Jenkins ARIMA and transfer function model-building section, in the Box-Tiao VARMA
model-building section and in the matrix command.
over the older Fortran implementation, the ease of use has been vastly improved. On the PC, the use
of screen writes allows visual monitoring of the solution progress. Chapter 11 discusses both
nonlinear least squares and constrained and unconstrained optimization. It should be read in
conjunction with Chapter 16, which discusses other features of the matrix command.
Chapter 12 contains a discussion of the varfreq and kfilter commands, which allow
decomposition of a VAR model into the frequency domain, following methods suggested by Geweke
(1982b, 1982c), and state space estimation, following suggestions by Aoki (1987). The MTSM
program developed by Geweke (1982a) was modified by Stokes (1985, 1986b) and forms the basis
for the varfreq command. The kfilter command uses the code developed by Aoki (1987) and is only
discussed briefly. The importance of testing for unit roots is discussed and a number of tests are
illustrated. The B34S polysolv command is used to test models for unit roots. The dangers of unit
roots are illustrated with sample data on which various tests are performed using the pgmcall
procedure, which provides an interface to the RATS software and the bispec sentence containing unit
root and ARCH test options.
Chapter 13 contains a brief treatment of the optcontrol command, which implements the
Chow (1975, 1981) optimal control code. Since the use of this approach is extensively discussed and
documented in the seminal references by Chow (1975, 1981), only a brief discussion of the program
is given here. The initial implementation of the Chow program used the DYNCAL procedure to
allow researchers to build a library of compiled subroutines containing their models to which B34S
dynamically branched at the time of execution. Supporting the dynamic link proved too difficult.
In recent years the approach was modified to allow B34S to branch to a stand alone user compiled
Fortran program and communicate via files. While there is a speed loss over a dynamic link
implementation, functionality is maintained and substantial models can be studied. An example with
the Klein-Goldberger model is provided.
Chapter 14 deals with a number of approaches to model nonlinear data when the explicit
model is not known and the techniques in Chapter 11 that require explicit specification of the
nonlinear model cannot therefore be used. The focus is on the marspline, mars_var, gamfit, acefit,
and pispline commands, which estimate models using the MARS, GAM, ACE and PI spline methods.
These methods are discussed and illustrated on data that is potentially nonlinear. The no longer
distributed B34S mars command, which used heavily modified code obtained from Friedman (1991b),
has been replaced by the marspline and mars_var commands, which use GPL code originally
redeveloped for R by Hastie and Tibshirani (1990), who also developed the routines that form the
basis for the gamfit and acefit commands. The pispline command is based on code from Breiman
(1991a).5 After a brief discussion of these techniques, a number of data sets, including the Gas
5 Friedman, who originally developed the MARS program, later registered it as a trademark. In this book the word
MARS refers to the MARS method or approach unless explicitly referring to the Friedman 3.5 version software
program. The Friedman program is no longer distributed in the commercial B34S. The marspline command was
developed using the GPL R code developed by Hastie and Tibshirani to replace the old Friedman MARS™ code.
The ACE and GAM capability is based on extensions of the R GPL code developed by Hastie and Tibshirani (1990)
Furnace data that was found to be nonlinear in Chapter 8, are studied. Both MARS and PI spline
models were found to reduce the nonlinearity in the data. The McManus dataset, discussed in
Chapter 3, and the Sinai-Stokes (1972) production function dataset, discussed in Chapters 2, 9 and
11, are also analyzed using a number of methods. Finally, various sample datasets were developed
and models were applied and the residuals tested. In addition to MARS and PI spline, the
generalized additive model GAM developed by Hastie and Tibshirani (1990), and Alternating
Conditional Expectation (ACE) approaches are illustrated in a number of cases. It is argued that
there is no one technique that can be used in all situations. In many cases the nonlinear diagnostic
tests discussed earlier and the GAM procedure can be utilized to point to problems. It has been found to be
most helpful in model building if a number of diagnostic tests are first performed on the residuals of
the preliminary model. If evidence of nonlinearity is found, the next step is to utilize the GAM
approach to attempt to point to just what right hand side variables are nonlinear before proceeding
further with other specifications.
Chapter 15 illustrates the capabilities of the B34S spectral command, which is similar to
the SAS command proc spectra. Sample data sets are developed and their spectral representations
studied. The gas furnace data is used to illustrate the capability of the procedure.
Chapter 16 illustrates the capabilities of the B34S matrix command, which allows users to
program the econometric calculations. A major emphasis of the chapter is to discuss the software
design issues that went into the implementation of this facility. Many examples are provided and
the whole issue of efficient programming is discussed and illustrated. The matrix command is
actually a major program in its own right. It can read and write data files for use in the older
“procedure” part of B34S or it can be run stand alone.
Chapter 17 is concerned with model building using nonlinear nonparametric methods.
Specific methods discussed include Recursive Covering, Regularized Discriminant Analysis (a
compromise between Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis),
Projection Pursuit Estimation, Exploratory Projection Pursuit and Random Forest Modeling. The
Random Forest approach generalizes the CART (Classification and Regression Trees)
approach by implementing "bagging." Bagging is based on the bootstrap method of analysis. With
bagging, observations are drawn randomly with replacement, so that roughly two thirds of the
distinct observations in the sample are selected (bagged). Once the
model is estimated using this bagged subsample, it is tested on the "out of bag" sample. This is
repeated many times and the resulting classification model selected is based on "voting." These
techniques were implemented in the matrix command as the rcover, rda, ppreg, ppexp and
ranforest commands. A main objective of this chapter is to both outline these methods and compare
and contrast their performance on a wide range of problems.
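The bagging step described above can be sketched as follows. This is a minimal illustration, not the code behind the ranforest command: it shows only the bootstrap draw, the resulting "out of bag" set, and a simple majority vote, and the names bootstrap_split and majority_vote are hypothetical. No tree model is estimated here.

```python
import random

def bootstrap_split(n, rng):
    """Draw n observation indices with replacement (the bag).

    Returns the in-bag indices, with repeats, and the out-of-bag indices,
    which are the observations never drawn.
    """
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag

def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

rng = random.Random(2009)  # fixed seed so the draw can be repeated
n = 10000
in_bag, out_of_bag = bootstrap_split(n, rng)
print("share of distinct observations in the bag:", len(set(in_bag)) / n)
print("share out of bag:", len(out_of_bag) / n)
print("vote:", majority_vote(["a", "b", "a"]))
```

For a large sample the share of distinct observations in the bag approaches 1 - 1/e, or about 0.632, which is the "roughly two thirds" referred to above. The remaining third is available for testing each bagged model.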
It is the firm conviction of the author that one learns econometrics by application of
techniques. Only by systematically testing the specification of a model can one truly begin to
and others. The developer of B34S appreciates having the availability of these routines.
appreciate how sensitive results may be to the initial specification. While many programs require the
user to "run blind," because they do not provide adequate diagnostic tests, B34S allows the user to
subject his/her model to a battery of test procedures, which will demonstrate how sensitive the
results are to alternative specifications of the functional form. The remaining chapters discuss some
of these procedures and their use in greater detail. Most chapters contain B34S control setups to run
sample programs using the procedures discussed. Readers are encouraged to run these sample
programs in their entirety.6 Edited versions of the output of these control programs are contained in
the text and are briefly discussed. Before discussion of the regression specification tests, a brief
description of the B34S software is provided to give the reader an overview of the model
specification and diagnostic testing options available.
1.2 B34S Overview
The B34S Data Analysis Program is a collection of econometric procedures that are useful
in the analysis of both cross-section and time-series models. The program consists of two parts: a
number of command driven procedures and a programming language. The command driven
procedures allow data loading and analysis of econometric models using regression and other
methods. The programming language, available under the matrix command, allows the user to
actually develop procedures to perform the analysis using an object oriented programming language.
Within the programming language, user programs, subroutines and functions can be built and used to
"customize" the calculations. B34S has two-way links with other systems such as SAS®,
SCA®, SPEAKEASY®, MATLAB® and RATS®. It is assumed that the reader has a good
background in econometrics, obtainable from study of such books as those by Johnston (1963, 1972,
1984), Theil (1971), Pindyck and Rubinfeld (1981), Chow (1983), Greene (2000), Enders (1995),
and Kmenta (1971, 1986). Many of the more advanced statistical procedures discussed involve
techniques that are just beginning to appear in econometric textbooks.7 To aid the reader, the
treatment of all techniques discussed in this book, while brief, is meant to be self-contained.
B34S can be run under SAS, SPEAKEASY, MATLAB, or as a stand-alone program. When
running as a stand-alone program, B34S will read data in a number of ways, the most common being
from a sequential file that can be either in E format, F format, Z format, A8 format or
double-precision unformatted.8 B34S can create and read an SCA FSAVE data file library (Liu and Hudak
1986a, 1986b) or an SCA MAD file and can create and read its own data step. These options
6 The B34S code can be typed in or obtained from the libraries that are distributed with the software. Since most
sample setups are B34S macros, they can be selectively run with the options command on the PC under the Display
Manager.
7 Epstein (1987) contains a good discussion of how current econometric practice has evolved over time. Stokes (2003b)
discussed software development issues.
8 For a discussion of these formats, see any basic Fortran textbook or IBM Inc. (1972, 1988a, 1988b).
facilitate data interchange between B34S and SCA, which are complementary programs. If PC
users have RATS, the citibase command provides access to CITIBASE data files, which can be
changed in frequency and loaded for further processing. B34S can also read and write a RATS
portable data file. While B34S procedures, with the exception of the matrix command, have a hard limit of no
more than 98 series in one file, the dmf command (Data Management Facility) can be used to save
and manage data files substantially larger. The current limit of the matrix command is 10,000
objects, each of which can be a matrix. This facility, which can be thought of as a program within a
program, has essentially no limit except that imposed by hardware.
B34S is usually run in batch mode by submitting command files that produce output and log
files. In addition to batch mode, the B34S Display Manager provides a graphical interface to edit and
submit command files, to inspect output and log files, and to view and create high-resolution
graphics on the four main platforms: Windows 98/NT/2000/XP, Linux, Sun and RS/6000. The
Display Manager is designed to run in graphics mode
but automatically will run in text mode on a Unix dial up line. To speed up the learning curve, B34S
users of the Display Manager can select their own editor, although KEDIT or the Microsoft
NOTEPAD / WORDPAD editors are recommended. The makemenu command provides a facility by
which users can construct menus using the B34S control language to provide an automatic program
writing feature. Under the Display Manager a number of default menu files are provided under the
menu and graphics commands. Since these are just B34S command files using the makemenu
command, they can be modified or customized by the user. The makemenu command can be run in
batch mode to provide custom, menu-driven B34S applications that can be developed and
maintained by users. The makemenu command can be used to generate command files for other
programs in addition to B34S. Further information concerning these features is contained in Stokes
(1996b), which documents B34S commands and is available on-line, and in Stokes (1996a),
which documents the "native" B34S command language. As an alternative to loading data into
B34S directly, use of the SAS MACRO cb34sm allows B34S to be seen as a SAS procedure. In this
mode of operation the B34S learning curve is reduced to only the desired B34S paragraph since the
cb34sm macro handles all data loading. Use of this command is illustrated below.
The command structure of B34S involves the specification of multiple paragraphs or
commands, each containing sentences. The first sentence in each paragraph begins with the keyword
b34sexec, which starts the parser. The last sentence in each paragraph, which turns off the parser,
must be either b34seend$ (B34S EXEC END) or b34srun$. Each sentence must end with the
delimiter $ or the delimiter ;. The b34seend$ sentence is used in batch operation. The b34srun$
sentence is used in an interactive environment to force B34S to execute the command. Its use is
similar to that of the SAS run; command. If the b34seend$ sentence is used, B34S will parse all
commands prior to attempting to execute. In this book, the terms "B34S command" and "B34S
paragraph" will be used interchangeably except when referring to commands inside the matrix
command. The B34S matrix command starts a programming language, which allows customized
programming. This section of B34S, although fully integrated into the procedure-driven
section, is logically distinct. The matrix command programming language can load data from the
rest of B34S, can read SPEAKEASY, RATS, and SCA save files or it can directly read and write
data. More detail on the B34S command structure is contained in Stokes (2000). A list of currently
supported B34S commands or paragraphs is given in Table 1.1. Table 1.2 lists the origin of the
basic code in those commands that were developed from source provided by others.
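The paragraph structure just described (a b34sexec keyword, a command name, and a closing b34seend$ or b34srun$ sentence) can be made concrete with a small Python sketch that splits a control file into paragraphs. This is purely illustrative and is not the B34S parser: it assumes that tokens are separated by white space and that the closing keyword is written with its delimiter attached, and the name split_paragraphs is hypothetical.

```python
# Closing sentences recognized by this sketch, with either delimiter attached.
TERMINATORS = {"b34seend$", "b34seend;", "b34srun$", "b34srun;"}

def split_paragraphs(text):
    """Split control text into (command name, remaining tokens) pairs.

    Raises ValueError if text appears outside a paragraph, if a paragraph
    is opened inside another, or if a paragraph is never closed.
    """
    paragraphs = []
    current = None  # tokens of the paragraph being read, or None between paragraphs
    for tok in text.split():
        low = tok.lower()
        if current is None:
            if low != "b34sexec":
                raise ValueError("text outside a paragraph: %r" % tok)
            current = []
        elif low in TERMINATORS:
            if not current:
                raise ValueError("paragraph has no command name")
            name = current[0].rstrip("$;").lower()
            paragraphs.append((name, current[1:]))
            current = None
        elif low == "b34sexec":
            raise ValueError("b34sexec found before the previous paragraph was closed")
        else:
            current.append(tok)
    if current is not None:
        raise ValueError("last paragraph is not closed")
    return paragraphs

job = """
b34sexec data $
input x y $
datacards$
11 22
33 44
b34sreturn$
b34seend$
b34sexec regression$ model y = x$ b34seend$
"""

for name, body in split_paragraphs(job):
    print(name, len(body))
```

Applied to the short job above, the sketch reports two paragraphs, data and regression, in the order in which they would be parsed.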
The B34S control language is completely free format and is in many ways similar to that of
SAS. The SAS macro cb34sm, distributed with B34S, allows B34S to be called by a SAS user and
is intended to run on all SAS platforms. This is in contrast with the SAS procedure cb34s, which
only ran on SAS version 5.xx on MVS (Stokes 1986a). An example of this mode of operation is
illustrated in Table 1.4. Note that the B34S sentence termination character $ is used in place of ; so as
not to confuse SAS. It is recommended to run B34S under SAS in applications that need to utilize the
powerful SAS data step to load and process the data prior to the call to the more specialized B34S
procedure. A SAS/B34S job is illustrated in Table 1.4, while a B34S stand-alone job is shown in
Table 1.3. In stand-alone operation, the B34S data paragraph is required, while in the SAS/B34S
job, the B34S data paragraph does not have to be explicitly supplied.
The complete B34S command reference manuals (Stokes 2006a, 2006b) are available online. While these manuals may be used as the sole references for B34S, they contain little
documentation of the statistics calculated and no output from examples of their use although a
number of sample programs are shown. The file c:\b34slm\example.mac, which contains working
examples of all procedures, and this book can be thought of as a major extension of the original
B34T manual written by Hodson Thornber (1966, 1967, 1968). Sections of Thornber's original
manual have been included in Chapter 2 of this book, with the author's permission. Chapter 16
contains an overview of the commands available in the matrix command.
Table 1.1 B34S Commands
Command      Description
HELP         Provide help, generate on-line manual.
OPTIONS      Set B34S run-time options.
REGRESSION   OLS and GLS estimation. BLUS and RA analysis.
LIST         Display data in B34S data file.
PLOT         Plot and graph series in B34S data file.
PROBIT       Probit analysis on (0-1) dependent variables.
TOBIT        Tobit analysis on truncated dependent variables.
LOGLIN       Logit analysis on up to four equations at once.
ECOMP        Error-components analysis.
AUTOC        Autocorrelation and cross correlation analysis.
RR           Recursive-residual analysis.
QR           QR factorization & principal component analysis.
DATA         Load data into B34S without SAS/B34S interface.
MPROBIT      Multinomial probit analysis.
MLOGLIN      Multinomial logit analysis.
SIMEQ        2SLS, LIML, 3SLS, I3SLS, FIML, SUR estimation.
TRANSPROB    Estimate Markov probability model.
BJIDEN       Box-Jenkins identification. Spectral analysis.
BJEST        Box-Jenkins ARIMA, transfer-function estimation.
BTIDEN       Identification of VAR and VARMA models.
BTEST        Estimation of VAR, VARMA and VMA models.
VARFREQ      Spectral decomposition of VAR models.
PGMCALL      Branch to SAS, SPEAKEASY, SPSS, SCA, TSP and LIMDEP.
POLYSOLV     Solution of polynomials.
DTASSM       Data-manipulation utilities.
OPTCONTROL   Optimal control analysis.
GAMFIT       Estimate a GAM Model.
KFILTER      Estimate state-space model.
SOURCE       B34S FORTRAN source manager.
SCAINPUT     B34S/SCA/RATS/MATLAB input/output option.
FORECAST     Automatic VAR Forecasting Model Development.
MARS         Multivariate Adaptive Regression Splines (No longer available).
PISPLINE     PI Method of Fitting an underlying smooth function.
GENMOD       Generate Data sets with given covariance structure.
HRGRAPHICS   High Resolution Graphics.
SORT         Sort data.
SPECTRAL     Spectral Analysis.
MAKEMENU     User Menu facility.
DMF          Data Management Facility.
MVNLTEST     Multivariate tests for nonlinearity.
CITIBASE     Load Citibase data into B34S using RATS.
READVBYV     Read Data Variable by Variable
DESCRIBE     Calculation of Various Summary Measures
REG          Panel Time Series Analysis
ROBUST       L1, MINIMAX and OLS Models
TRANSPOSE    Calculate Transpose of Data Matrix
FREQ         Frequency Plots and Cross Tabulation
LPMAX        Linear Programming
EXPAND       Expand Weighted Dataset
MATRIX       General Programming Language with many commands
Table 1.2 Original Origin of Source Code for Various B34S Commands
Command       Description
REGRESSION    Thornber (1966)
LOGLIN        Nerlove-Press (1973)
ECOMP         Freiden (1973), Henry-McDonald-Stokes (1976)
MPROBIT       McKelvey-Zavoina (1971, 1975)
MLOGLIN       Kawasaki (1978, 1979)
SIMEQ         Jennings (1980)
TRANSPROB     Lee-Judge-Zellner (1970)
BJIDEN        Tiao-Box (1981), Tiao-Grupe-Hudak-Bell-Chang (1979)
BJEST         Tiao-Box (1981), Tiao-Grupe-Hudak-Bell-Chang (1979)
BTIDEN        Tiao-Box (1981), Tiao-Grupe-Hudak-Bell-Chang (1979)
BTEST         Tiao-Box (1981), Tiao-Grupe-Hudak-Bell-Chang (1979)
VARFREQ       Geweke (1982a, 1982b)
OPTCONTROL    Chow (1975, 1981)
GAMFIT        Hastie and Tibshirani (1990)
KFILTER       Aoki (1987)
MARS          Friedman (1991b) (No longer commercially available)
PISPLINE      Breiman (1991)
HRGRAPHICS    Interacter (1995a, 1995b)
LINPACK (Dongarra-Bunch-Moler-Stewart, 1979), EISPACK, LAPACK (Anderson-Bai-Bischof-Demmel-Dongarra-Du Croz-Greenbaum-Hammarling-McKenney-Ostrouchov-Sorenson, 1992) and FFTPACK are used throughout the
program. Nonlinearity tests based on code supplied by Hinich (1982) are callable from a number of places in the system.
All other code, with the exception of the IMSL and Interacter routines, was developed by Houston H. Stokes.
Table 1.3 B34S Run Stand Alone
b34sexec data $
input x y
datacards$
11 22
33 44
55 66
99 77
77 88
b34sreturn$
b34seend$
b34sexec list$ var x$ b34seend$
b34sexec regression$ model y = x$ b34seend$
b34sexec robust$ model y=x$ b34seend$
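As a cross-check on what the regression paragraph of Table 1.3 is asked to estimate, the following Python sketch fits a straight line to the five observations in the datacards section. It is only an illustration and not B34S output: the function name ols_line is hypothetical, and the sketch assumes that the model includes a constant term.

```python
# Observations from the datacards section of Table 1.3.
x = [11.0, 33.0, 55.0, 99.0, 77.0]
y = [22.0, 44.0, 66.0, 77.0, 88.0]


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x.

    Hypothetical helper for illustration; assumes a constant is in the model.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = ols_line(x, y)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

For these data the sums are Sxx = 4840 and Sxy = 3388, so the fitted line is y = 20.9 + 0.7x.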
Table 1.4 B34S Run Under SAS Using the cb34sm Macro
* This job uses the SAS MACRO CB34SM;
%include 'c:\b34slm\cb34sm.sas';
data junk;
input x y;
cards;
11 22
33 44
55 66
99 77
77 88
;
proc means;
* Clean files **********************************
options noxwait;
run;
data _null_;
command ='erase myjob.b34';
call system(command);
* End of clean step ****************************
*
;
* Place B34S commands next after %readpgm ;
%readpgm
cards;
b34sexec list$ var x$
b34seend$
b34sexec regression$
model y = x$
b34seend$
b34sexec rr$ model y=x$ b34seend$
b34sexec describe$ b34seend$
b34sexec reg$ model y=x$ b34seend$
b34sexec options dispmoff$ b34srun$
;
run;
%cb34sm(data=junk, var=x y, u8='myjob.b34',
u3='myjob.b34',
options=nohead)
options noxwait;
run;
* This step calls b34s and copies files
;
data _null_;
command ='b34s myjob';
call system(command);
run;
endsas;
;
;
With the goal of facilitating the import of code, B34S was designed with multi-level parsing (Stokes,
1987). At the calculation or lowest stage, the procedures run their own control language, which is
usually column-dependent. The B34S command language is not parsed by any procedure with the
exception of the matrix and options commands. This design facilitates getting code up fast and
allows the saving of partially "compiled" code, since a column-dependent command language is very
fast to execute.
One level up from the bottom, the B34S language (see Table 1.3) looks very much like SAS
but with important differences. The B34S parser looks at paragraphs. Each paragraph begins with
the key word b34sexec and ends with the keyword b34seend$ (B34S exec end) or b34srun$.
Outside the paragraph, the B34S parser passes the command stream to the next level, taking out only
comments. This allows two levels of command language to be mixed in the same file. The B34S
parser first scans for any B34S macro commands, which, if found, are expanded first.9 In the next
parse pass, once the B34S parser detects the key word b34sexec, it reads the complete paragraph and
writes the command language of the next level down. Hence the B34S parser stands outside the
program in the sense that it is a program generator. Its function is to provide a user command
interface and write lower level commands. The B34S program produces two files: *.log containing a
listing of the commands parsed together with any errors found and a *.out file, which contains the
output of the program.
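The paragraph scan described above can be illustrated with a short Python sketch. It is not the B34S parser: the function name split_paragraphs and the case-insensitive matching are assumptions made here, and comment stripping and macro expansion are left out. The sketch only separates text that lies inside b34sexec ... b34seend$ (or b34srun$) paragraphs from text that would be passed to the next level.

```python
import re

# Paragraph delimiters as described in the text.
# Case-insensitive matching is an assumption of this sketch.
START = re.compile(r"\bb34sexec\b", re.IGNORECASE)
END = re.compile(r"\bb34s(?:eend|run)\s*\$", re.IGNORECASE)


def split_paragraphs(stream):
    """Return (paragraphs, passthrough) for a command stream.

    paragraphs  : text running from b34sexec to b34seend$ or b34srun$
    passthrough : text found outside any paragraph
    """
    paragraphs, passthrough = [], []
    pos = 0
    while True:
        start = START.search(stream, pos)
        if start is None:
            break
        end = END.search(stream, start.end())
        if end is None:
            raise ValueError("paragraph opened with b34sexec is never closed")
        outside = stream[pos:start.start()].strip()
        if outside:
            passthrough.append(outside)
        paragraphs.append(stream[start.start():end.end()])
        pos = end.end()
    tail = stream[pos:].strip()
    if tail:
        passthrough.append(tail)
    return paragraphs, passthrough


job = """b34sexec list$ var x$ b34seend$
some lower-level text
b34sexec regression$ model y = x$ b34seend$
b34sexec options dispmoff$ b34srun$"""

paragraphs, other = split_paragraphs(job)
for p in paragraphs:
    print(p)
```

Note that b34sreturn$, which closes a datacards section, is not treated as a paragraph end, so a data paragraph such as the one in Table 1.3 is kept whole.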
B34S currently runs on the platforms listed in Table 1.5. In the early 1970s B34S ran only
under MVS. Later, a port was made to CMS as compilers on IBM progressed from the G compiler,
through the H compiler, to the H extended compiler and, finally, to a succession of IBM Fortran VS
compilers. At every stage a conscious effort was made to run under full optimization, using the most
modern routines. Stokes (2003b) lists some of this history.
A major design goal of B34S is to provide one-way, and in many cases two-way, links with
other software systems. This is especially true with the various PC versions of B34S. There are
currently two-way links with SAS, SPEAKEASY, SCA, MATLAB and RATS. The term "one-way link" means that B34S can make a data-loading step for the other program and pass commands
to the program. If the other program can be loaded under B34S with the Lahey call system(' ')
subroutine, the other program's output will be seen in the B34S output window as if the other
program were a part of B34S. The term "two-way link" means that B34S can read data files from the
other program or pass data and commands to the other program. The term "be called by" means that
the other program can call B34S, pass data and obtain results as if B34S were a subroutine of the
other program.
9 There is the potential for confusion between the terms MACRO file and macro command. A MACRO file
implements what used to be called an IBM PDS file where the partitions are divided by
==NAME
b34sexec commands here;
==
==NAME2
b34sexec command here
==
Such macros are called by
b34sexec option include('file.mac') member(name1); b34srun;
Macro commands, on the other hand, are a programming language that stands outside the normal B34S commands
and allows code generation. Examples are given below.
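The MACRO-file layout described in footnote 9 can be illustrated with a small Python sketch that splits such a file into its members. This is not how B34S reads the file. The function name read_members is hypothetical, and the sketch assumes that a line of the form ==NAME opens a member, that a bare == closes it, and that lines outside any member are ignored.

```python
def read_members(text):
    """Map member name -> list of body lines for a '==NAME ... ==' layout.

    Hypothetical helper; assumes members are not nested.
    """
    members = {}
    current = None  # name of the member being read, if any
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("=="):
            name = stripped[2:].strip()
            if name:
                # '==NAME' opens a member.
                current = name
                members[current] = []
            else:
                # A bare '==' closes the current member.
                current = None
        elif current is not None:
            members[current].append(line)
    return members


library = """==NAME
b34sexec commands here;
==
==NAME2
b34sexec command here
==
"""

members = read_members(library)
for name, body in members.items():
    print(name, body)
```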
In 1991, B34S was ported to run under the Lahey Fortran compiler F77L-EM/32. Due to the
excellent design of this compiler, only two basic changes were needed. These included making sure
that the CHARACTER data type was not passed as an address to a routine that thought it was
REAL*8, which was allowed under IBM. The other change involved replacing a BAL memory
allocation routine with the Lahey Fortran 90 ALLOCATE command. The only capability in B34S
that did not port was the dynamic link to a user-compiled Fortran subroutine. The developer of B34S
considered implementing this facility with a DLL, although the nonlinear capability in the matrix
command that allows links to user programs written in the matrix language makes this increasingly
unlikely. An argument for not implementing a DLL link is that the user would require a Fortran or C
compiler and models would not be portable across different platforms. In the development of B34S,
every effort has been made to make the program independent of any Microsoft® conventions. With
the availability of then state-of-the-art 486/DX33 machines and the Lahey compiler, the PC became a
viable research platform. Today, with the substantially faster Intel chips, PC performance, especially
under Linux, is comparable to or better than workstation performance at a substantially reduced cost.
In the early 90s it became apparent that mainframe capability in the PC was not enough. An
interface was needed, although batch capability had to be maintained. B34S was enhanced with a
GUI based on the Spindrift and Graphoria libraries and the developer of B34S started working with
Don Gable, the developer of Spindrift, to test enhancements to the Spindrift library. A major addition
to the new library was the doscreen subroutine, which facilitated user menus. This subroutine
became the basis of the B34S makemenu command. The Spindrift library was ported to the Lahey
LF90 compiler but bugs remained. The Graphoria library never worked properly under LF90, despite
many releases. At present these two libraries are being used only with the 6.xx version of B34S
under F77L-EM/32, which works well on smaller machines, but has been frozen. In late 1995 the
more powerful Interacter Library was integrated with B34S. All features of the Spindrift and
Graphoria libraries were implemented, enhancements to the interface were made, and substantial
graphic capability was added. In 1999 the same source was made to run on Windows 98/NT/2000
with LF95, Linux with LF95, RS/6000 and Sun. A major advantage to building B34S around IMSL
and Interacter is the portability that these systems provide. The Windows B34S does not make any
direct API calls itself. These are handled by Interacter, which uses the Windows API on Windows
and the UNIX X Window System on all other platforms. Table 1.5 lists all the past and current
versions of B34S.
Table 1.5 B34S Platforms
Hardware      Operating System    Compiler         Name       Version
Frozen Versions
3090          MVS                 VS version 2.4   B34S       21Nov86
3090          CMS                 VS version 2.6   B34S       6.23
386/486/586   DOS                 F77L-EM/32       B34S       6.23a
386/486/586   DOS                 F77L-EM/32       B34SI      7.11c
386/486/586   Windows 95/98/NT    LF90             B34SW      7.11c
Supported Versions
RS/6000       AIX                 AIX              B34SX      8.10z
SUN           Solaris             F77&F90          B34SX      8.10z
Intel         Windows NT/2K/XP    LF95             B34SLF95   8.10z
Intel         Linux               LF95             B34S       8.10z
_____
Notes: All currently supported versions were built using the Interacter Subroutine Library and link
in routines from the IMSL Fortran library. B34S version 6.23a on the PC used the Spindrift and
Graphoria libraries. The LF95 versions of B34S use code targeted for the Pentium® II and III
chips.
1.3 B34S Display Manager
The B34S Display Manager provides a front end into the program as well as a means by
which to write lower level code. The design of this facility is sufficiently unique to warrant
discussion in its own right. This facility would not be possible without the Interacter Library. The
Display Manager:
- Manages the B34S *.log file.
- Manages the B34S *.out file.
- Allows the user to edit and submit jobs.
- Provides access to a user-modifiable help facility.
- Provides access to quick graphics.
- Provides access to a user-modifiable menu generator.
- Provides access to all help files, example jobs and shell files.
- Allows calls to be made to other supported systems.
Once a job with B34S commands is submitted from the Display Manager, it is parsed in the usual
manner and run by the base B34S system. Upon completion B34S returns to the Display Manager.
Apart from the multi-level nature of B34S, such an organization is increasingly common. Another
approach would be to have the front end directly control the operation of the program, rather than run
through a lower-level control language.
What is relatively unique about the B34S Display Manager is that the menus of the GUI that set
up specific commands are themselves generated by programs written in the B34S language. There
are several advantages to this design. These include the following:
- The ability of users to customize the menus.
- The removal of menu code from the b34s load module.
- The ability to "fix" menus without the necessity to recompile.
- The ability to manage menu systems from the user level.
- The built-in ability of the menu language itself to write program statements in the b34s
language and in the language of other programs.
In the author’s experience, the first product to provide such an extensive capability was the
SAS/AF® facility. After B34S implemented this facility in the early 1990s, a few other systems such
as MATLAB and later RATS designed user-extendable menu facilities. The advantage of the
elegant MATLAB implementation is that it is written in Java. The advantage of the B34S approach is
that it runs equally well in text or graphics mode. The text-mode implementation is useful on UNIX
systems. On Microsoft systems Visual Basic is an attempt to provide part of this capability across a
limited number of platforms from outside the system. The end result of this approach is inferior to
having the GUI menu generator built right into the application. The way in which the B34S Display
Manager menu generator is implemented will now be discussed.
Table 1.6 lists the B34S makemenu control language that generates a B34S menu, using
the Interacter Software Library, to input regression commands. The example uses the B34S macro
language. The first field type=info sentence places the text "OLS Model Building" at row=2 and
column=2 in the menu. The next field sentence sets a number of B34S macro variables. The third
field type=input sentence displays "Beginning obs.:" at column=2 in row=4 and asks for input.
Automatic help in the form of "Blank defaults to 1" is displayed at the bottom of the screen when
this line executes. After stepping through the menu and entering data, the user executes the menu by
pressing the Enter key. An important advantage of the makemenu facility is that users have full control of
menus and can modify them as well.
At execution the special comments of the form /$# after the pgmcards$ sentence become
commands. B34S macro variable structures such as
/$# %b34sif(&in1.ne.0)%then $
/$# ibegin=%b34seval(&in1)
/$# %b34sendif $
resolve to be B34S parameters when B34S macro variables such as &in1 are found to be NE 0.
More information on the B34S makemenu command and the B34S command language is contained
in the on-line B34S help documents.
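To make the resolution step concrete, the following Python sketch imitates a very small subset of the behavior described above. It is not the B34S macro processor. The function name resolve is hypothetical, only tests of the form &name.ne.0 and substitutions of the form %b34seval(&name) are handled, and nesting and all other macro features are ignored.

```python
import re

IF_RE = re.compile(r"%b34sif\(&(\w+)\.ne\.0\)%then", re.IGNORECASE)
EVAL_RE = re.compile(r"%b34seval\(&(\w+)\)", re.IGNORECASE)
PREFIX = "/$#"


def resolve(lines, variables):
    """Expand a tiny subset of the macro constructs shown above.

    Hypothetical helper: handles only '&name.ne.0' tests and
    '%b34seval(&name)' substitutions, with no nesting.
    """
    values = {k.lower(): v for k, v in variables.items()}
    out = []
    keep = True  # False while inside a %b34sif block whose test failed
    for raw in lines:
        if not raw.startswith(PREFIX):
            continue  # only the special comments become commands
        text = raw[len(PREFIX):].strip()
        macro = text.rstrip("$").strip()  # macro lines may end with $
        test = IF_RE.fullmatch(macro)
        if test:
            keep = values.get(test.group(1).lower(), 0) != 0
        elif macro.lower() == "%b34sendif":
            keep = True
        elif keep and text:
            out.append(
                EVAL_RE.sub(lambda m: str(values[m.group(1).lower()]), text)
            )
    return out


template = [
    "/$# b34sexec rr",
    "/$# %b34sif(&in1.ne.0)%then $",
    "/$# ibegin=%b34seval(&in1)",
    "/$# %b34sendif $",
]

print(resolve(template, {"in1": 5}))
print(resolve(template, {"in1": 0}))
```

With in1 set to 5 the sketch emits the ibegin parameter. With in1 set to 0 the parameter is dropped, which mirrors the "Blank defaults to 1" behavior of the menu.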
Table 1.6 B34S MAKEMENU Commands to Generate the rr Command
B34SEXEC MAKEMENU COMMANDN('OLS Model Building')
COMMANDH('Controls setting up a simple OLS Model. Optionally model'
'diagnostic tests and nonlinearity tests can be requested')$
FIELD TYPE=INFO PAGE=1 ROW=2 COL1=2 TEXTCOLOR=YELLOWCHR
TEXT('          OLS Model Building')
TEXTID(' ')$
FIELD TYPE=HIDDEN PAGE=1 ROW=1 COL1=1
PRELINE('%A34SLET IN1        = 0^        '
        '%A34SLET IN2        = 0^        '
        '%A34SLET IIN2       = 0^        '
        '%A34SLET var3       = "_NULL_"^'
        '%A34SLET white      = 0 ^       '
        '%A34SLET DFTEST     = 0 ^       '
        '%A34SLET PPTEST     = 0 ^       '
        '%A34SLET LMTEST     = 0 ^       '
        '%A34SLET ACFVARSQ   = 0 ^       '
        '%A34SLET PACFVARSQ  = 0 ^       '
        '%A34SLET HINICH     = 0 ^       '
        )$
FIELD TYPE=INPUT PAGE=1 ROW=4 COL1=2 LETNAME(IN1) FIELDTYPE=INTEGER
DEFAULT='        '
INTRANGE(0,999999999)
COL2=24 TEXTID='Blank defaults to 1'
TEXT('Beginning obs.:')
$
FIELD TYPE=INPUT PAGE=1 ROW=5 COL1=2 LETNAME(IN2) FIELDTYPE=INTEGER
DEFAULT='        '
COL2=24 TEXTID='Blank defaults to last observation'
INTRANGE(0,999999999)
TEXT('Ending Obs.:')
OPTIONAL
$
field type=input fieldtype=chartest page=1 row=6 col1=2
preline('%A34SLET white=1^') col2=24
fieldhelp('Use White or Robust SE in place of usual formula')
textid('Enter YES use White SE')
text('White (1980) SE:')
default=('no ') POSTSTRING('YES')
$
field type=input fieldtype=chartest page=1 row=7 col1=2
preline('%A34SLET HINICH=1^') col2=24
fieldhelp(' Turns on Hinich residual testing options')
textid('Enter YES to perform Hinich non linearity tests')
text('Hinich t.:')
default=('no ') POSTSTRING('YES')
$
FIELD TYPE=INPUT PAGE=1 ROW=8 COL1=2 LETNAME(DFTEST)
FIELDTYPE=INTEGER DEFAULT='        '
COL2=24 TEXTID='Set Order of Dickey-Fuller Test'
INTRANGE(0,999999999)
TEXT('D-F Test:')
OPTIONAL
$
FIELD TYPE=INPUT PAGE=1 ROW=9 COL1=2 LETNAME(PPTEST)
FIELDTYPE=INTEGER DEFAULT='        '
COL2=24 TEXTID='Set order of Phillips-Perron test'
INTRANGE(0,999999999)
TEXT('P-P Test:')
OPTIONAL
$
FIELD TYPE=INPUT PAGE=1 ROW=10 COL1=2 LETNAME(LMTEST)
FIELDTYPE=INTEGER DEFAULT='        ' COL2=24
TEXTID='Set order of Engle Lagrangian Multiplier test'
INTRANGE(0,999999999)
TEXT('L-M Test:')
OPTIONAL
$
FIELD TYPE=INPUT PAGE=1 ROW=11 COL1=2 LETNAME(ACFVARSQ)
FIELDTYPE=INTEGER DEFAULT='        ' COL2=24
TEXTID='Set order of ACF for ARCH Test on Squared Residuals'
INTRANGE(0,999)
TEXT('ACF ARCH Test:')
OPTIONAL
$
FIELD TYPE=INPUT PAGE=1 ROW=11 COL1=42 LETNAME(PACFVARSQ)
FIELDTYPE=INTEGER DEFAULT='        ' COL2=64
TEXTID='Set order of PACF for ARCH test on Squared Residuals'
INTRANGE(0,999)
TEXT('PACF ARCH Test:')
OPTIONAL
$
FIELD TYPE=INPUT PAGE=1 ROW=18 COL1=2 LETNAME(var1) FIELDTYPE=QVARLIST
DEFAULT='        '
COL2=24 TEXTID='Specify left hand variable names here'
TEXT('Left Hand Var:')
REQUIRED
$
FIELD TYPE=INPUT PAGE=1 ROW=19 COL1=2 LETNAME(var2) FIELDTYPE=QVARLIST
DEFAULT='        '
COL2=24 TEXTID='Specify right hand variables'
TEXT('Right Hand Var:')
required
$
FIELD TYPE=INPUT PAGE=1 ROW=20 COL1=2 LETNAME(var3) FIELDTYPE=QVARLIST
DEFAULT='        '
preline('%A34SLET IIN2=1^')
COL2=24 TEXTID='Specify right hand variables'
TEXT('Right Hand Var:')
optional
$
PGMCARDS$
/$# B34SEXEC RR
/$# %B34SIF(&IN1.NE.0)%THEN $
/$# IBEGIN=%B34SEVAL(&IN1)
/$# %B34SENDIF $
/$# %B34SIF(&IN2.NE.0)%THEN $
/$# IEND =%B34SEVAL(&IN2)
/$# %B34SENDIF $
/$# %B34SIF(&WHITE.NE.0)%THEN $
/$# WHITE
/$# %B34SENDIF $
/$# $
/$# bispec
/$# %b34sif(&hinich.ne.0)%then $
/$# iturno iauto vhtest
/$# %b34sendif $
/$# %b34sif(&DFTEST.ne.0)%then $
/$# DF ADF(%b34seval(&DFTEST)) ADFT(%b34seval(&DFTEST))
/$# %b34sendif $
/$# %b34sif(&PPTEST.ne.0)%then$
/$# PP APP(%b34seval(&PPTEST)) APPT(%b34seval(&PPTEST))
/$# %b34sendif $
/$# %b34sif(&LMTEST.ne.0)%then $
/$# LM(%b34seval(&LMTEST))
/$# %b34sendif $
/$# %b34sif(&ACFVARSQ.ne.0)%then $
/$# ACFVARSQ(%b34seval(&ACFVARSQ))
/$# %b34sendif $
/$# %b34sif(&PACFVARSQ.ne.0)%then $
/$# PACFVARSQ(%b34seval(&PACFVARSQ))
/$# %b34sendif $
/$# $
/$# MODEL %b34seval(&var1) =
/$# %b34seval(&var2)
/$# %b34sif(&iin2.ne.0)%then $
/$# %b34seval(&var3)
/$# %b34sendif $
/$# $
/$# B34SEEND $
B34SRETURN$
B34SEEND$
The above example provides only a taste of what is possible.10
10 Bill Lattyak’s WORKBENCH program, which is actually a powerful scripting program, provides a very user-friendly and powerful way to run B34S that seems to substantially lower the learning costs. The modern group of
users, having grown up in the GUI world, are not naturally drawn to scripting languages and thus are particularly
helped by such aids. Scripts built by WORKBENCH can be saved and further edited by the user.
1.4 Conclusion
Depending on the specific econometric problem, the chapters in this book do not necessarily
have to be read in sequential order. Chapter 2 should probably be read to get some idea of the
assumptions of the basic OLS model. If only the matrix command is needed, the reader can skip to
Chapter 16, which provides an introduction to this facility. Since the other chapters show matrix
command applications, as these are encountered, the reader may have to consult Chapter 16 to obtain
a better idea of the structure of the programming language.