Lecture 10 – Estimating Nonlinear Regression Models
References:
Greene, Econometric Analysis, Chapter 10
Consider the following regression model:
yt = f(xt, β) + εt,   t = 1,…,T
xt is k×1 for each t, β is an r×1 constant vector, εt is
an unobservable error process, and f is a (“sufficiently
well-behaved”) function, f: R^k × R^r → R. So, each yt is
a (fixed) function of xt and β plus an additive error
term, εt.
2
Example: yt  1 xt   t
The estimation problem: given f, y1,…,yT, and
x1,…,xT, estimate β.
The solution: estimate β by LS (NLS), ML, or GMM.
The “problem”? In contrast to the linear regression
case, the FOCs are nonlinear and so, in general,
numerical methods must be applied to obtain
(consistent) point estimates. Also, the asymptotic
variance (avar) matrix of β-hat will have a slightly
more complicated form.
Nonlinear models are commonly encountered in
applied economics – largely because advances in
computational mathematics and desktop/laptop
computer technology have made solving nonlinear
optimization problems more feasible and more
reliable.
Nonlinear Least Squares (NLS)
Choose β-hat to minimize the SSR –
SSR(β̂) = Σ_{t=1}^T [yt − f(xt, β̂)]²
FOCs –
g(β̂) = −Σ_{t=1}^T [∂f(xt, β̂)/∂β̂][yt − f(xt, β̂)] = 0
which form a set of r nonlinear equations in the r
unknowns, β̂1, …, β̂r. [In the case where f is linear in
the β’s, the derivative vector df/dβ = [ x1t…xrt], r =
k.]
2
y


x
Example: t
1 t  t
g(β̂) = −Σ_{t=1}^T (yt − β̂1 xt^β̂2)[xt^β̂2   β̂1 xt^β̂2 ln(xt)]′ = 0
Computing the NLS Estimator –
In general, these FOCs must be solved numerically
to find the NLS estimator of β, β̂_NLS. (E.g., the Gauss-Newton procedure described in Greene, 10.2.3.)
Some issues –
- choice of algorithm
- selecting an initial value for β-hat
- convergence criteria
- local vs. global min
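To make the Gauss-Newton procedure and the issues just listed concrete, here is a minimal sketch (not from the lecture) for the example model yt = β1 xt^β2 + εt; the simulated data, starting values, and convergence tolerance are all illustrative assumptions.

```python
import numpy as np

def gauss_newton_nls(y, x, beta0, tol=1e-8, max_iter=100):
    """Gauss-Newton iterations for y_t = b1 * x_t**b2 + e_t (illustrative sketch)."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        b1, b2 = beta
        fitted = b1 * x**b2                                  # f(x_t, beta)
        resid = y - fitted                                   # current residuals
        # Jacobian of f w.r.t. (b1, b2): columns df/db1 and df/db2
        J = np.column_stack([x**b2, b1 * x**b2 * np.log(x)])
        # Gauss-Newton step: regress the residuals on the Jacobian
        step, *_ = np.linalg.lstsq(J, resid, rcond=None)
        beta = beta + step
        if np.max(np.abs(step)) < tol:                       # convergence criterion
            break
    return beta

# Made-up data just to exercise the routine
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=200)
y = 2.0 * x**0.5 + 0.1 * rng.standard_normal(200)
beta_hat = gauss_newton_nls(y, x, beta0=[1.0, 1.0])          # should land near (2.0, 0.5)
```

The arguments beta0, tol, and max_iter correspond directly to the issues listed above; in particular, a poor starting value can leave the algorithm at a local rather than a global minimum.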
Asymptotic Properties of the NLS Estimator –
If
- the x’s are weakly exogenous
- the errors are serially uncorrelated and homoskedastic
- the function f is sufficiently smooth
- the {xt, εt} process is sufficiently well-behaved
then
T 1 / 2 ( ˆT , NLS   )  N (0, 2 Q 1 )
D
where
σ2 = var(εt)
Q  p lim( 1/ T )QT ,
T
QT   [f ( xt , ˆT ) /  ][f ( xt , ˆT ) /  ' ]
t 1
The NLS estimator is (under appropriate conditions)
consistent, asymptotically normal and asymptotically
efficient.
Inference: For large samples, act as though
β̂NLS ~ N(β, σ̂²QT⁻¹),   where σ̂² = (1/T) Σ_{t=1}^T ε̂t²
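A sketch of how this covariance σ̂²QT⁻¹ could be computed for the running example (the variable names y, x, beta_hat continue the hypothetical Gauss-Newton sketch above and are assumptions, not lecture code):

```python
import numpy as np

def nls_covariance(y, x, beta_hat):
    """Estimated covariance of beta_hat: sigma2_hat * QT^{-1} (homoskedastic case, sketch)."""
    b1, b2 = beta_hat
    resid = y - b1 * x**b2                                    # NLS residuals
    sigma2_hat = np.mean(resid**2)                            # (1/T) * sum of squared residuals
    J = np.column_stack([x**b2, b1 * x**b2 * np.log(x)])      # df/dbeta evaluated at beta_hat
    Q_T = J.T @ J                                             # sum over t of gradient outer products
    return sigma2_hat * np.linalg.inv(Q_T)

# Continuing the simulated example: standard errors for (b1_hat, b2_hat)
V = nls_covariance(y, x, beta_hat)
se = np.sqrt(np.diag(V))
```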
If the disturbances are heteroskedastic and/or serially
correlated, the NLS estimator will be consistent but
not asymptotically efficient. Also, the correct form of
the asymptotic variance matrix of the NLS estimator
requires a heteroskedasticity and/or autocorrelation
correction. Heteroskedasticity-robust and HAC estimators of
the variance-covariance matrix of ε can be used if the
exact forms of the heteroskedasticity and
autocorrelation are not known.
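One common way to operationalize this correction is a Newey-West style HAC “sandwich” covariance for β̂, sketched below for the running example; the Bartlett weights and lag length are illustrative choices, not the lecture’s.

```python
import numpy as np

def nls_hac_covariance(y, x, beta_hat, lags=4):
    """Newey-West style HAC 'sandwich' covariance for the NLS estimator (sketch)."""
    b1, b2 = beta_hat
    resid = y - b1 * x**b2
    J = np.column_stack([x**b2, b1 * x**b2 * np.log(x)])      # gradients df/dbeta
    G = J * resid[:, None]                                    # moment contributions e_t * g_t
    S = G.T @ G                                               # lag-0 term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)                            # Bartlett kernel weight
        Gamma = G[l:].T @ G[:-l]                              # lag-l autocovariance of e_t * g_t
        S += w * (Gamma + Gamma.T)
    bread = np.linalg.inv(J.T @ J)
    return bread @ S @ bread                                  # (J'J)^{-1} S (J'J)^{-1}
```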
If the form of the heteroskedasticity and/or serial
correlation is known up to a small number of
parameters (e.g., εt is known to be an AR(1) process
with unknown ρ), then the nonlinear GLS or (quasi-)
maximum likelihood estimators will be asymptotically
efficient.
Example – GNLS
Suppose E(εε’) = Σ. Then the GNLS estimator of β is
the value of β̂ that minimizes the weighted SSR:
[y − f(x, β̂)]′ Σ⁻¹ [y − f(x, β̂)]
If Σ is unknown, it can be replaced with a consistent
estimator to obtain the FGNLS estimator. (What
consistent estimator of Σ?)
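A minimal GNLS sketch for the running example, assuming Σ is known and diagonal (the particular weighting, var(εt) proportional to xt, is purely an illustrative assumption); replacing Σ⁻¹ with a consistent estimate would give the feasible (FGNLS) version.

```python
import numpy as np
from scipy.optimize import minimize

def gnls_objective(beta, y, x, Sigma_inv):
    """Weighted SSR [y - f(x, beta)]' Sigma^{-1} [y - f(x, beta)] for f = b1 * x**b2."""
    resid = y - beta[0] * x**beta[1]
    return resid @ Sigma_inv @ resid

# Continuing the simulated example; the weighting var(e_t) proportional to x_t is made up.
Sigma_inv = np.diag(1.0 / x)
beta_gnls = minimize(gnls_objective, x0=[1.0, 1.0], args=(y, x, Sigma_inv)).x
```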
If the regressors are correlated with the errors, none
of these estimators is consistent (even if the errors are
homoskedastic and serially uncorrelated). A
consistent, semi-parametrically efficient estimator
that does not rely on knowledge of the form/existence
of heteroskedasticity/autocorrelation and allows for
endogenous regressors: Nonlinear GMM.
In addition, GMM provides a semi-parametric
alternative to MLE for nonlinear models that do not
fit the nonlinear regression format.
GMM in the nonlinear regression model –
Consider the population moment conditions:
E[wt(yt – f(xt,β))] = 0 for all t
where wt is an instrument vector.
The GMM estimator: choose β̂ to make the
corresponding sample moments
(1/T) Σ_{t=1}^T wt(yt − f(xt, β̂))
close to zero. As in the linear case, this will
involve minimizing an optimally weighted
quadratic form in these moments.
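A sketch of two-step nonlinear GMM for these moment conditions, reusing the example model f = β1 xt^β2; the instrument matrix Z (a constant and xt) and the starting values are illustrative assumptions, not part of the lecture.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_objective(beta, y, x, Z, W):
    """Quadratic form g_bar' W g_bar, with g_bar = (1/T) sum_t z_t (y_t - f(x_t, beta))."""
    resid = y - beta[0] * x**beta[1]              # f(x_t, beta) = b1 * x_t**b2 (running example)
    g_bar = Z.T @ resid / len(y)                  # sample moments
    return g_bar @ W @ g_bar

# Hypothetical instruments: a constant and x_t itself (exogenous-regressor case)
Z = np.column_stack([np.ones_like(x), x])
# Step 1: identity weighting matrix
beta_step1 = minimize(gmm_objective, x0=[1.0, 1.0], args=(y, x, Z, np.eye(Z.shape[1]))).x
# Step 2: optimal weighting W = S^{-1}, with S = (1/T) sum_t m_t m_t'
m = Z * (y - beta_step1[0] * x**beta_step1[1])[:, None]
W_opt = np.linalg.inv(m.T @ m / len(y))
beta_gmm = minimize(gmm_objective, x0=beta_step1, args=(y, x, Z, W_opt)).x
```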
GMM in a more general nonlinear setting –
Hansen and Singleton’s (Econometrica, 1982)
Consumption-Based Asset Pricing Model
At the start of each time period t, a representative
agent chooses consumption and saving to maximize
expected discounted utility:
E[Σ_{i=0}^∞ δ^i U(ct+i) | Ωt],   U(ct) = (ct^γ − 1)/γ,   0 < γ < 1
At the start of t, the agent can allocate income to
purchase the consumption good or N assets with
maturities 1,2,…,N according to the sequence of
budget constraints
ct + Σ_{j=1}^N pj,t qj,t ≤ Σ_{j=1}^N rj,t qj,t−j + wt
where
pj,t = price of a unit of asset j (i.e., an asset that
matures in t+j) in period t
qj,t = units of asset j purchased in period t
rj,t = payoff in period t of asset j purchased in t-j
wt = labor income in period t
Unknown parameters in this model – δ, γ
The optimal consumption path must satisfy the
sequence of Euler equations:
E[δ^j (rj,t+j/pj,t)(ct+j/ct)^(γ−1) − 1 | Ωt] = 0
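A one-step sketch of where this Euler equation comes from (added for clarity, using the notation of the notes): buying one unit of asset j at t costs pj,t units of consumption today and pays rj,t+j at t+j, so optimality requires marginal cost today to equal the expected discounted marginal benefit.

```latex
% Marginal cost today = expected discounted marginal benefit at t+j, with U'(c) = c^{\gamma-1}
\[
p_{j,t}\,c_t^{\gamma-1} \;=\; E\!\left[\delta^{\,j} r_{j,t+j}\, c_{t+j}^{\gamma-1} \,\middle|\, \Omega_t\right];
\]
% dividing both sides by p_{j,t} c_t^{\gamma-1} gives the moment condition in the notes:
\[
E\!\left[\delta^{\,j}\,\frac{r_{j,t+j}}{p_{j,t}}\left(\frac{c_{t+j}}{c_t}\right)^{\gamma-1} - 1 \,\middle|\, \Omega_t\right] \;=\; 0 .
\]
```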
Let zt be any vector in Ωt. Then the Euler equations
imply the following set of moment conditions which
form the basis for estimating δ and γ by GMM –
E[{δ^j (rj,t+j/pj,t)(ct+j/ct)^(γ−1) − 1} zt] = 0
for all t and j = 1,…,N
GMM: Choose δ and γ to make the N sample
moments
(1/T) Σ_{t=1}^T [δ^j (rj,t+j/pj,t)(ct+j/ct)^(γ−1) − 1] zt,   j = 1,…,N
close to zero.
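A sketch of how these sample moments and the GMM objective might be coded; the data layout (matrices R and cg holding the payoff ratios and consumption-growth ratios, an instrument matrix Z) and the simulated numbers are assumptions made purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def euler_moments(params, R, cg, Z):
    """Stacked sample moments (1/T) sum_t [delta**j * R[t,j] * cg[t,j]**(gamma-1) - 1] * z_t.

    Assumed data layout (an illustration, not the paper's): R[t, j-1] = r_{j,t+j} / p_{j,t},
    cg[t, j-1] = c_{t+j} / c_t, and Z[t, :] = instruments z_t drawn from Omega_t.
    """
    delta, gamma = params
    T, N = R.shape
    g = []
    for j in range(1, N + 1):
        u = delta**j * R[:, j - 1] * cg[:, j - 1]**(gamma - 1.0) - 1.0   # Euler-equation errors
        g.append(Z.T @ u / T)                                            # moments for asset j
    return np.concatenate(g)

def gmm_objective(params, R, cg, Z, W):
    g = euler_moments(params, R, cg, Z)
    return g @ W @ g

# Made-up data just to exercise the routine (T = 300 periods, N = 2 assets, constant instrument)
rng = np.random.default_rng(0)
R = 1.06 + 0.02 * rng.standard_normal((300, 2))
cg = 1.02 + 0.01 * rng.standard_normal((300, 2))
Z = np.ones((300, 1))
W = np.eye(R.shape[1] * Z.shape[1])
delta_hat, gamma_hat = minimize(gmm_objective, x0=[0.95, 0.5], args=(R, cg, Z, W)).x
```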
The alternative – MLE: specify the joint distribution
for {(rj,t+j/pj,t, ct+j/ct), j = 1,2,…,N}, then maximize the
corresponding likelihood function subject to the
Euler equations. (See, e.g., Hansen and Singleton,
JPE, 1983).