Statistical and Econometric Issues
Associated with Community Economic Modeling
Steven Deller
Department of Agricultural
and Applied Economics
University of Wisconsin-Madison
In an idealized world we hope for
Theoretical Rigor
Simulation Ease /
Reasonableness
In the real world we get
Theoretical Rigor
Simulation Ease /
Reasonableness
The ultimate purpose for building and using the
model will directly affect your choice of
econometric approaches.
If theoretical insights and rigor are paramount,
then one will tend towards the construction and
estimation of a system of simultaneous
equations.
If simulation ease and prediction accuracy are
paramount, then one will tend towards the
construction and estimation of a set of reduced
form equations.
There is no “right” answer.
The art versus the science of community
economic modeling.
Overview of a BLUE Model
Basic Assumptions:
1. The model is correctly specified.
2. The matrix of exogenous variables (X) is non-stochastic:
$E(X'e) = 0$, or equivalently $\mathrm{cov}(X, e) = 0$
3. Normality: $E(e_i) = 0$
4. Independence: $E(e_i e_j) = 0$ for all $i \neq j$, with $i, j = 1, \dots, n$
5. Constant Variance: $E(e_i e_i) = \sigma^2$, so that $E(ee') = \sigma^2 I_n$
Key Assumptions for Our Problem
The model is correctly specified
Specification Error
Functional Form
Multi-parameter Testing
Incorporating Information
The matrix of exogenous variables (X) is non-stochastic:
$E(X'e) = 0$, or equivalently $\mathrm{cov}(X, e) = 0$
Endogenous Variables
Errors in Variables
System of Structural Equations
The model is correctly specified
Relevant for both structural equations and
reduced form approach
Inclusion of irrelevant variables: OLS yields
unbiased but inefficient estimates
Omission of relevant variables: OLS yields biased
and weakly inefficient estimates
The trade-off between too many variables and too
few is one between efficiency (minimum variance)
and bias in the parameter estimates
If unbiased parameters are the goal, err on
the side of too many variables
If hypothesis testing is the goal, err on the
side of keeping the model simple
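The bias from omitting a relevant variable can be illustrated with a small sketch. Everything here is hypothetical: the data, the coefficients `beta1` and `beta2`, and the helper `ols_simple` are made up for illustration, and the error term is left out so that the algebra is exact. When `x2` is dropped, the slope on `x1` picks up `beta2` times the slope from regressing `x2` on `x1`.

```python
def ols_simple(x, y):
    """Least-squares fit of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: x2 is correlated with x1, and y depends on both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
beta1, beta2 = 2.0, 3.0  # assumed "true" coefficients
y = [1.0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

_, short_slope = ols_simple(x1, y)  # regression that omits x2
_, delta = ols_simple(x1, x2)       # auxiliary regression of x2 on x1

bias = short_slope - beta1
print(f"slope with x2 omitted: {short_slope:.4f}")
print(f"bias = {bias:.4f}, beta2 * delta = {beta2 * delta:.4f}")
```

The bias vanishes only if `beta2` is zero (the variable was irrelevant) or `delta` is zero (the omitted variable is uncorrelated with the included one).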
The model is correctly specified
Specification Rules
Overfit the Model: Let the sample evidence
drive the specification of the model (e.g.,
maximize $R^2$)
Encompassing Principle: The chosen model
should account for results of competing models
and explain something new itself
Fragility Analysis: Choose the model that is
least sensitive to minor changes in specification
Appeal to Economic Theory: Let theory dictate
specification and imposition of additional
information
Ad Hoc Statistical Criterion:
1. Change in $R^2$ (stepwise regression)
- Oversimplification of the problem
- $\Delta R^2 \to 0$ (when do you stop?)
- Very easy to manipulate the $R^2$
The model is correctly specified
Statistical Criterion
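2. $C_p$ Criterion (mean squared prediction error)
Full Model: $y = X\beta + e$
Subset Model: $y = X_1\beta_1 + e_1$
$C_p = \frac{(n - k_1)\hat{\sigma}_1^2}{\hat{\sigma}^2} + (2k_1 - n) \sim F_{(n-k_1),(2k_1-n)}$
where $\hat{\sigma}_1^2$ and $\hat{\sigma}^2$ are the error variance estimates from the subset and full models
Rule: Pick $X_1 \subset X$ such that $C_p < F_{crit}$
3. Amemiya Prediction Criterion (PC)
$PC = \hat{\sigma}_1^2 (1 + k_1/n)$
Rule: Pick $X_1 \subset X$ such that PC is minimized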
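A minimal sketch of the two criteria above, computed from sums of squared errors. The function names and the numbers in the example are hypothetical, and the variance estimates are assumed to be SSE divided by the residual degrees of freedom, which the slides do not state explicitly.

```python
def sigma2_hat(sse, n, k):
    """Error variance estimate, assumed here to be SSE / (n - k)."""
    return sse / (n - k)


def mallows_cp(sse_subset, k_subset, sse_full, k_full, n):
    """C_p = (n - k1) * s1^2 / s^2 + (2 * k1 - n)."""
    s2_subset = sigma2_hat(sse_subset, n, k_subset)
    s2_full = sigma2_hat(sse_full, n, k_full)
    return (n - k_subset) * s2_subset / s2_full + (2 * k_subset - n)


def amemiya_pc(sse_subset, k_subset, n):
    """PC = s1^2 * (1 + k1 / n)."""
    return sigma2_hat(sse_subset, n, k_subset) * (1 + k_subset / n)


# Hypothetical example: full model with 5 regressors, subset with 3.
n = 50
cp_full = mallows_cp(sse_subset=90.0, k_subset=5, sse_full=90.0, k_full=5, n=n)
cp_subset = mallows_cp(sse_subset=100.0, k_subset=3, sse_full=90.0, k_full=5, n=n)
pc_subset = amemiya_pc(sse_subset=100.0, k_subset=3, n=n)
print(cp_full, cp_subset, pc_subset)
```

A useful sanity check is that the full model evaluated against itself always returns a C_p equal to its own number of regressors.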
The model is correctly specified
Statistical Criterion
4. Akaike Final Prediction Error (FPE)
$FPE = \hat{\sigma}_1^2 (n + k_1)/(n - k_1)$
Rule: Pick $X_1 \subset X$ such that FPE is minimized
5. Akaike Information Criterion (AIC)
$AIC = \ln \hat{\sigma}_1^2 + 2k_1/n$
Rule: Pick $X_1 \subset X$ such that AIC is minimized
6. Standard F-test (test of linear restrictions)
$F = \frac{(SSE_{restricted} - SSE_{unrestricted})/J}{SSE_{unrestricted}/(n - k)}$
where $X_{restricted} \subset X$ and $J$ is the number of restrictions
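The remaining criteria can be sketched the same way. The helper names and input numbers below are hypothetical. The variance estimate is assumed to be SSE / (n - k1) throughout, following the notation of the slides; some texts use SSE / n in the AIC, which changes the values but usually not the ranking by much.

```python
import math


def akaike_fpe(sse_subset, k_subset, n):
    """FPE = s1^2 * (n + k1) / (n - k1), with s1^2 assumed to be SSE / (n - k1)."""
    s2 = sse_subset / (n - k_subset)
    return s2 * (n + k_subset) / (n - k_subset)


def akaike_aic(sse_subset, k_subset, n):
    """AIC = ln(s1^2) + 2 * k1 / n, with the same variance estimate as above."""
    s2 = sse_subset / (n - k_subset)
    return math.log(s2) + 2 * k_subset / n


def f_statistic(sse_restricted, sse_unrestricted, n_restrictions, n, k):
    """F = [(SSE_r - SSE_u) / J] / [SSE_u / (n - k)]."""
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    denominator = sse_unrestricted / (n - k)
    return numerator / denominator


# Hypothetical example: dropping J = 2 of k = 5 regressors raises SSE from 90 to 100.
f_stat = f_statistic(sse_restricted=100.0, sse_unrestricted=90.0,
                     n_restrictions=2, n=50, k=5)
print(f"F = {f_stat}")  # compare with the F(J, n - k) critical value
print(akaike_fpe(100.0, 3, 50), akaike_aic(100.0, 3, 50))
```

FPE and AIC are compared across candidate subsets, while the F statistic is compared with a tabulated critical value for (J, n - k) degrees of freedom.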
Is the functional form correct
Theory seldom lends insight into what the
correct functional form of the statistical
relationship should be.
Option #1: PUNT
use linear / Cobb-Douglas
Option #2: Flexible Functional Forms
Quadratic:
$y = \alpha + \sum_i \beta_i x_i + \sum_i \sum_j \gamma_{ij} x_i x_j$
Translog:
$\ln y = \alpha + \sum_i \beta_i \ln x_i + \sum_i \sum_j \gamma_{ij} (\ln x_i)(\ln x_j)$
(see Griffin, Montgomery and Rister, WJAE, Dec 1987)
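As a rough illustration of what these flexible forms mean for estimation, the sketch below builds one row of regressors for either form. The function `flexible_terms` is hypothetical. It keeps only cross products with i <= j, on the assumption that the coefficients are symmetric, because including both x_i x_j and x_j x_i would make the regressors perfectly collinear.

```python
import math


def flexible_terms(x, transform=lambda v: v):
    """Regressor row: constant, transformed x_i, and cross products for i <= j.

    With the identity transform this is the quadratic form; with math.log
    it is the right-hand side of the translog form.
    """
    z = [transform(v) for v in x]
    terms = [1.0] + z
    for i in range(len(z)):
        for j in range(i, len(z)):
            terms.append(z[i] * z[j])
    return terms


quadratic_row = flexible_terms([2.0, 3.0])
translog_row = flexible_terms([2.0, 3.0], transform=math.log)
print(quadratic_row)
print(translog_row)
```

With k inputs each row has 1 + k + k(k + 1)/2 terms, which shows how quickly flexible forms use up degrees of freedom.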
Is the functional form correct
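Option #3: Box-Cox (or Box-Tidwell)
$\frac{y^\lambda - 1}{\lambda} = \alpha + \beta \frac{x^\lambda - 1}{\lambda}$
As $\lambda \to 0$: $(y^\lambda - 1)/\lambda \to \ln y$ and $(x^\lambda - 1)/\lambda \to \ln x$, which gives the Cobb-Douglas (C-D) form
As $\lambda \to 1$: $(y^\lambda - 1)/\lambda \to y - 1$ and $(x^\lambda - 1)/\lambda \to x - 1$, which gives the linear form (the constants are absorbed into the intercept)
Box-Tidwell allows $\lambda$ to vary for each variable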
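The two limiting cases can be checked numerically with a short sketch. The function name `box_cox` and the tolerance used to switch to the logarithm are assumptions made for illustration.

```python
import math


def box_cox(value, lam, tol=1e-12):
    """Box-Cox transform (value**lam - 1) / lam, using ln(value) when lam is ~0."""
    if value <= 0:
        raise ValueError("the Box-Cox transform requires a positive value")
    if abs(lam) < tol:
        return math.log(value)
    return (value ** lam - 1.0) / lam


v = 2.0
near_zero = box_cox(v, 1e-6)  # close to ln(2): the log-log (Cobb-Douglas) case
at_one = box_cox(v, 1.0)      # equals v - 1: the linear case
print(near_zero, math.log(v), at_one)
```

In practice lambda is estimated along with the other parameters, so the data choose a point between the linear and log-log forms rather than the analyst.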
Endogenous Variables
Or/ Errors in Variables
If OLS is used in this case, the parameter
estimates are biased and inconsistent.
Hausman Test
An explicit test of whether $\mathrm{plim}\, \frac{1}{n} X'e = 0$, where $H_0$ is that X
and e are orthogonal in large samples
Let $q = \hat{\beta}_{IV} - \hat{\beta}_{OLS}$
Let $B = \hat{\sigma}_{IV}^2 \left[ (\hat{X}'\hat{X})^{-1} - (X'X)^{-1} \right]$
where $\hat{X} = W(W'W)^{-1}W'X$,
$(W'W)$ is the design matrix for the IV
model, and $(X'X)$ is the design matrix for the OLS
model
$TS \equiv q'B^{-1}q \sim \chi^2$
If $TS > \chi^2_{crit}$, a problem exists;
that is, $(\hat{\beta}_{IV} - \hat{\beta}_{OLS})$ is statistically "big"
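The test statistic can be sketched for the simplest case of one regressor and one instrument, where every matrix reduces to a scalar. The data-generating process, the function `hausman_scalar`, and the no-intercept model are all assumptions made for illustration; a real application would use the matrix formulas with a full set of regressors.

```python
import random


def hausman_scalar(y, x, w):
    """Hausman statistic for y = beta * x + e with a single instrument w."""
    n = len(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    sww = sum(a * a for a in w)
    swx = sum(a * b for a, b in zip(w, x))
    swy = sum(a * b for a, b in zip(w, y))
    b_ols = sxy / sxx
    b_iv = swy / swx
    # X_hat = W (W'W)^-1 W'X, so X_hat'X_hat = (W'X)^2 / (W'W) for scalars.
    xhat_xhat = swx ** 2 / sww
    resid_iv = [yi - b_iv * xi for yi, xi in zip(y, x)]
    s2_iv = sum(r * r for r in resid_iv) / (n - 1)
    q = b_iv - b_ols
    b_var = s2_iv * (1.0 / xhat_xhat - 1.0 / sxx)
    return b_ols, b_iv, q * q / b_var


# Hypothetical data: x is endogenous because it shares the shock u with e.
rng = random.Random(0)
n = 2000
w = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
x = [wi + ui for wi, ui in zip(w, u)]
e = [0.8 * ui + 0.6 * ni for ui, ni in zip(u, noise)]
y = [1.0 * xi + ei for xi, ei in zip(x, e)]  # assumed true beta = 1

b_ols, b_iv, ts = hausman_scalar(y, x, w)
print(f"OLS = {b_ols:.3f}, IV = {b_iv:.3f}, TS = {ts:.1f}")
# With one restriction, TS is compared with the chi-square(1) 5% value of 3.84.
```

Because the regressor is built to be correlated with the error, the OLS estimate drifts away from 1 while the IV estimate does not, and the statistic lands far above the critical value.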
Endogenous Variables
Or/ Errors in Variables
Potential Solutions
Option #1: Direct use of Instrumental
Variables, via proxy or 2SLS
Option #2: Estimation of a Simultaneous
Set of Equations, via 2SLS or
3SLS
Issues to consider:
Is the system Identified?
Order Condition (necessary)
“The number of exogenous variables excluded
from the equation must be greater than or equal
to the number of endogenous variables included
in the equation.”
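The order condition quoted above is a counting rule, so it is easy to sketch. The function, the variable names, and the demand and supply example are hypothetical. One assumption matters: the count of endogenous variables here covers only those on the right-hand side of the equation, which is the common textbook reading.

```python
def order_condition(system_exog, equation_exog, equation_rhs_endog):
    """Classify an equation by the order condition (necessary, not sufficient)."""
    excluded = set(system_exog) - set(equation_exog)
    surplus = len(excluded) - len(set(equation_rhs_endog))
    if surplus < 0:
        return "under"
    return "just" if surplus == 0 else "over"


# Hypothetical two-equation market model with exogenous income and weather.
system_exog = ["income", "weather"]

# Demand: quantity on price (endogenous) and income; weather is excluded.
demand = order_condition(system_exog, ["income"], ["price"])

# An equation that includes every exogenous variable excludes nothing.
crowded = order_condition(system_exog, ["income", "weather"], ["price"])

print(demand, crowded)
```

Passing this check does not establish identification, since the rank condition must also hold.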
Endogenous Variables
Or/ Errors in Variables
Rank Condition (sufficient)
“The rank of the matrix of rhs endogenous
variables must be equal to M-1 where M is the
number of equations.”
Some general rules:
An equation that contains one endogenous variable and all
exogenous variables in the system is just identified.
An equation that contains all of the variables in the system is
not identified.
If none of the excluded variables of the ith equation appears
in the jth equation, the ith equation is not identified.
If two equations contain the same set of variables, neither
equation is identified.
If the same excluded variables of the ith equation are also
excluded from the jth equation, the ith equation is not
identified.
Endogenous Variables
Or/ Errors in Variables
Issues to Consider
Simulation of a set of simultaneous equations,
or simulation of the reduced form equations
derived from the simultaneous equations?
Structural:
$Y\Gamma + XB = E$
Reduced:
$Y = X\Pi + V$
Simultaneous estimation of each module
independently or as a group?
Modules: labor, housing, demographic, fiscal, retail
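The link between the structural and reduced forms can be made concrete with a small sketch. Solving the structural system for Y gives a reduced-form coefficient matrix equal to minus B times the inverse of Gamma, assuming Gamma is invertible. The two-equation system and all coefficient values below are hypothetical, and the helpers are written for the 2 by 2 case only.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(m):
    """Inverse of a 2 by 2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("Gamma must be nonsingular to solve for the reduced form")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical structural coefficients: each column is one equation.
gamma = [[1.0, 1.0], [-1.0, 1.0]]   # coefficients on the endogenous variables
b_mat = [[-2.0, 0.0], [0.0, -4.0]]  # coefficients on the exogenous variables

neg_b = [[-v for v in row] for row in b_mat]
pi = matmul(neg_b, inverse_2x2(gamma))  # reduced-form coefficients

# Simulate the endogenous variables for one setting of the exogenous ones.
x_row = [[1.0, 1.0]]
y_row = matmul(x_row, pi)
print(pi, y_row)
```

Simulating from the reduced form only needs this one matrix product per scenario, which is the practical appeal noted above; the structural form keeps the behavioral interpretation of each equation.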
Concluding Thoughts
The art of model building versus the science of
economics.
Intent of undertaking?
Start simple, future generations can add
theoretical and empirical rigor.
Discourage a community from investing millions
of dollars because your model predicts one
number to be bigger or smaller than another
number.