Preliminary draft dated 11 February 2009
Specifying and Diagnostically Testing Econometric Models
Third Edition
Houston H. Stokes
University of Illinois at Chicago
For any questions concerning this manuscript or the B34S program see the author or call 312-643-4383 or 312-996-0971.
Copyright
Houston H. Stokes
1982, 1988, 1989, 1990, 1991, 1996, 2006
CONTENTS
Tables
Figures
PREFACE
1. Applied Econometric Modeling
2. Regression Analysis With Appropriate Specification Tests
3. Logit, Tobit, Probit
4. Simultaneous Equations Systems
5. Error-Components Analysis
6. Markov Probability Analysis
7. Time Series Analysis Part I: Identification of ARIMA and Transfer Function Models
8. Time Series Analysis Part II: VAR, VARMA and VMA Models
9. Testing the Specification of OLS Equations With Recursive Residuals
10. Special Topics in OLS Estimation
11. Nonlinear Estimation Options in B34S
12. Special Topics in Time Series Analysis
13. Optimal Control Analysis
14. MARS and Pi_Spline Model Building
15. Spectral Analysis of Time Series
16. Programming using the Matrix Command
17. Model Building Using Nonlinear Nonparametric Methods
BIBLIOGRAPHY
INDEX
PREFACE
In the late 1960s, I became aware of the enormous gap between the appropriate statistical
procedures suggested by econometric theory and the options then available in statistical
packages. My research interests comprised estimating equations using generalized least squares
(GLS) with more than first-order serial correlation in the error (Sinai and Stokes 1972). To my
amazement, I discovered that few statistical packages were able to perform GLS estimation, and
the ones that could were restricted to first-order GLS. An exception was the b34t program,
which was developed by Hodson Thornber (1966) at the University of Chicago in the
mid-1960s to estimate up to ninth-order GLS. The program b34t consisted of 3,000 Fortran II
statements for an IBM 7094 and represented an enhancement of the UCLA BIMED34 regression
program.
In the 1960s, applied econometricians were hampered by researchers developing
single-purpose statistical packages, each of which required data in a different form. Many of the
statistical routines in these packages were unable to identify matrices that were almost rank
deficient and thus unexpectedly gave poor results.1 The research reported in this book originated
from the perceived need to implement on the computer a number of statistical methods and
specification tests for econometric models.2 This book documents a variety of econometric
1
In a now classic study of statistical program accuracy, Longley (1967) found that the percentage error in the price
coefficient of a multicollinear data set ranged from 0.03% to 375%. Four of the nine programs tested did not agree
even in the first digit, two were accurate to 1 digit, two to 3 digits and one to 4 digits. Further detail on this paper is
contained in Chapter 10, which discusses the QR approach to OLS estimation. Modern computer programs such as
sas® and speakeasy® have built-in accuracy checks. A series of papers by McCullough (1998a, 1998b, 1999a,
1999b) suggests that accuracy still is a problem. McCullough makes the point that it is not an error for a software
system to fail to solve a highly multicollinear model if a warning is given. What is not acceptable is to have software
produce wrong answers with no warning. The Rogers, Filliben, Gill, Guthrie, Lagergren and Vangel (1998)
Statistical Reference Datasets produced by NIST now provide additional benchmarks developers can use to test
their software. B34S is distributed with test programs that run all applicable datasets. Altman, Gill and McDonald
(2004) is a good reference regarding desired modern statistical computing practices. In a series of papers, Stokes
(2004a, 2004b, 2005) provides a discussion of software design issues impacting accuracy including a discussion of
the role of variable precision math.
2
The basic b34s code now consists of over 360,000 Fortran 95 statements and runs on three UNIX platforms (IBM
RS6000 AIX, SUN and Linux) and Windows 95/98/NT/2000/XP systems. Versions for IBM-MVS, IBM-CMS and
Microsoft DOS have been frozen due to no demand. The full b34s is available for lease from the author
(773-643-4383) or from Scientific Computing Associates, 1410 N. Harlem Ave., Suite F, River Forest, IL 60305
(708-771-4567). For further information see Houston H. Stokes's web page under College of Business, Department of
Economics, UIC (http://www.uic.edu/~hhstokes). Program updates can be downloaded from the b34s web page if
the user has a valid license.
diagnostic and specification tools and provides illustrations of their use with actual econometric
examples in a number of fields using the b34s® Data Analysis program.3
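The point above about routines that fail on almost rank-deficient matrices can be illustrated with a minimal sketch. It is written in Python with NumPy rather than in B34S, and the function names and the synthetic polynomial design are assumptions made for this illustration; it is not the Longley (1967) data set and not the B34S implementation. It compares OLS coefficients obtained from the normal equations with those obtained from a QR factorization when the regressors are nearly collinear.

```python
import numpy as np


def ols_normal_equations(X, y):
    """OLS by solving (X'X) b = X'y directly; conditioning is cond(X)**2."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def ols_qr(X, y):
    """OLS via the thin QR factorization X = QR, then solving R b = Q'y."""
    Q, R = np.linalg.qr(X)
    return np.linalg.solve(R, Q.T @ y)


def make_collinear_problem(n=50, degree=8):
    """Hypothetical design: powers of t on [0, 1], whose columns are nearly collinear."""
    t = np.linspace(0.0, 1.0, n)
    X = np.vander(t, degree + 1, increasing=True)
    beta = np.ones(degree + 1)   # known coefficients
    y = X @ beta                 # noise-free, so any error is purely numerical
    return X, y, beta


X, y, beta = make_collinear_problem()
print("cond(X)   =", np.linalg.cond(X))
print("cond(X'X) =", np.linalg.cond(X.T @ X))
print("max error, normal equations:", np.max(np.abs(ols_normal_equations(X, y) - beta)))
print("max error, QR:              ", np.max(np.abs(ols_qr(X, y) - beta)))
```

Because forming X'X squares the condition number of X, the normal-equations solution would be expected to lose roughly twice as many digits as the QR solution on a design of this kind. Neither function warns the user, which is the failure mode the accuracy literature cited in the footnote objects to; a production routine would report the conditioning.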
The reader does not need to have access to the b34s program to use this book
effectively. All results are completely documented in the text and illustrated with computer
output. Readers desiring to apply the indicated techniques could use b34s, or program the
techniques in a higher-level programming language such as the speakeasy® system,
matlab® or the sas/iml® system. The techniques illustrated have been used in economic
analysis, in financial modeling, in health economics, in energy modeling, in environmental
economics, in sociology, in political science and in industrial research. Many of the problems in
these areas have been used as illustrations in this book.
Each chapter in this monograph will indicate briefly the statistical problem, what specific
calculations are available, the routines to be used to make these calculations and, wherever
possible, provide an example. When more common procedures are being discussed (such as
two-stage least squares), the technical discussion will be reduced and the reader will be referred
to appropriate textbooks. When procedures use code developed by others, the reader will be
directed to the original source for additional detail. In each case, major attention will be given
to the mechanics of the calculation. All problems illustrated are distributed test problems from
the B34S web page so that interested readers can try to replicate and/or modify all calculations.
Due to the response from readers of the first and second editions, this edition has
been enlarged with material on the detection and modeling of nonlinear models using MARS
(Multivariate Adaptive Regression Splines) and PISPLINE (π spline) models as well as many
examples that are now possible with the new matrix programming capability in b34s. Most
other chapters have been substantially expanded to incorporate these new facilities.
A project the size of this book incurs numerous debts. My father, W. E. D. Stokes, Jr., first introduced me to
signal filtering as applied to economic problems and stimulated my interest in graduate work in
economics in the 50s. I am deeply indebted to Henri Theil and Arnold Zellner who introduced
me to econometrics in the late 60s at the University of Chicago and provided encouragement for
this project at many stages. Their classes led me to question whether the assumptions of the usual
OLS model were met by the data for the problem at hand. They stressed the importance of model
specification and diagnostic checking of the results.
Next, I would like to thank the numerous reviewers of my scientific papers and the first
edition of this book who have corrected my analysis and suggested many improvements. While
any remaining errors or shortcomings of the b34s system are the sole responsibility of the
author, certain individuals deserve special mention for their help with the software development aspects of
this project. In the 70s, Ron Golland, at the University of Illinois at Chicago, was especially
3
Programs referenced in this monograph are usually shown in courier type. Procedures inside programs are shown
in bold, unless part of command file listings. This allows readers to distinguish between, for example, the mars
command and the MARS statistical procedure.
helpful in pointing me to the finest available utility routines (LINPACK, EISPACK) and in
developing other useful utility routines that I have incorporated into B34S. The University of
Illinois at Chicago Computer Center has generously provided computer time for the project and
George Yanos, the former Associate Director, has been most helpful when serious design
problems had to be solved in the 70s. In the 80s Paul Setze and Jim O'Leary made contributions.
Stan Cohen and his team of developers of the SPEAKEASY® System have been most helpful
over the years and have worked with me on the B34S to SPEAKEASY and SPEAKEASY to
B34S interface. On the PC, SPEAKEASY is seen as a command of B34S which is accessible
with a hot-key from the Display Manager. The SPEAKEASY software adds a powerful
interactive matrix language capability to the tools already in B34S. In the late 90s the matrix
command was built to implement a SPEAKEASY-like programming language in B34S that was
tailored to econometrics and time series analysis. The B34S matrix command and SPEAKEASY
share the same data save format. Professor Lon-Mu Liu, the developer of SCA, and Bill Lattyak,
the developer of WORKBENCH, have made many suggestions and have worked with me on the
design of the B34S/SCA data interface. The program WORKBENCH has made the use of B34S
substantially easier for many new users. Professor Barry Chiswick has made suggestions
involving changes to make B34S easier to use, such as the development of the new syntax and
the implementation of the SAS/B34S interface.4 My research colleagues, Georgios Karras,
Richard Kosobud, Evelyn Lehrer, John McDonald, Hugh Neuburger, Jin-Man Lee and Allen
Sinai all played major roles in pointing out econometric problems whose solutions required the
development of procedures that later found their way into B34S. Since 1964 Hugh Neuburger
has made many suggestions for improvements in the time series capabilities of the program,
which he has extensively used in financial model building. His help, friendship and the extensive
testing he has done have substantially improved the final product. The pioneering research of
Melvin Hinich and Doug Patterson into detecting nonlinearity in time series changed my
research focus. Their generosity in providing me code, advice and friendship is much
appreciated. As a consequence of their influence I developed the bispec and polyspec sentences and
the mvnltest paragraph, which are involved with testing for nonlinearity. The mars and pispline
commands represent efforts to deal with the estimation of nonlinear models.
Ali Akarca assisted me with the testing of the program in its early stages, particularly
in the area of time series analysis. I have received many helpful suggestions from former
students, such as Terry Elder, Linda Manning, Dimitri Andrianacos, John Sfondouris and Ron
Usauskas in the 70s and 80s. For the 1997 edition, Marcos and Maria Lemos made a number of
helpful suggestions. Jin-Man Lee has helped me in many ways to make B34S run well on
modern computer systems. His contributions for the present edition have been substantial and go
beyond making suggestions and finding errors. In addition to providing material for a number of
chapters, his help in testing and extending the matrix command has been especially valuable.
My son William A. Stokes made many contributions and suggestions on the Linux port, helped
design and program the web page, and made design suggestions for the B34S Display Manager.
His perspective as a modern computer science major opened my eyes to new approaches to old
problems. Stokes (2003b) provides an in-depth look at software design issues involved in the
4
In the 90s the B34S/MATLAB and MATLAB/B34S interface was developed. MATLAB is callable from the
Display Manager.
design of statistical software. For the preparation of edition three I have received expert help and
support from my excellent graduate students. Marek Kolodziej found many slips, and
Yuliya Yurova and Narsid Golic encouraged me to develop applications in many interesting
directions. In the period 2005-2009, Xin Fang, Kathleen Odell and Shaoying Chang proved
helpful, especially in the development of the nonlinear and non-parametric estimation capability.
I am grateful to Hodson Thornber who has given me permission to adapt material from
his manual for B34T and whose program was the basic building block for B34S, which started
many long years ago in 1972, and to the Review of Economics and Statistics, published by North
Holland, from which I adapted material from my papers. Individual code contributions are
acknowledged throughout the book. Most important, I owe a large debt of gratitude to my wife,
Diana, who not only gave me encouragement and support while I was building the code and
fixing elusive "bugs," but, in addition, provided valuable editorial help on the manuscript in the
form of detailed editing for the three editions. Individual acknowledgement to others who have
contributed to B34S is given in the specific chapters where procedures are discussed and in the
two on-line help manuals.5 Any remaining errors or design limitations are, of course, my
responsibility.
Houston H. Stokes
Department of Economics
University of Illinois at Chicago
hhstokes@uic.edu
http://www.uic.edu/~hhstokes
13 February 2016
5
B34S contains two on-line help manuals (Stokes 2003a, 2003b) which are continually being updated. The B34S
command
b34sexec help=manual newpage bottompn makeindex$ b34seend$
places the complete command reference manual in the log file. The complete manual is available on-line and
sections can be downloaded. In addition extensive test output can be downloaded from the B34S web page. If the
keyword oldmanual is substituted for manual, the B34S "native" command manual is printed. This is rarely
needed. Since complete help is available in these manuals and on line, no attempt is made in this book to
provide complete command references. The purpose of this book is to document the calculations and illustrate their
use with actual econometric research. This book is not a computer manual, nor is it meant to be a text. The B34S
program is supplied with a comprehensive help manual and detailed examples on all procedures and matrix
commands as well as over 600,000 lines of sample datasets. Users are encouraged to use these datasets to learn
econometrics. It has been my experience that only through analysis of actual data is it possible to fully understand
econometrics.