Nonlinear Models
Nonlinear regression
- In linear models, we assume that Y_i = X_i^T β + ε_i for i = 1, ..., n. This implies that E(Y_i | X_i) = X_i^T β, which is a linear function of β.
- A generalization of the linear model is the nonlinear model, which assumes that
    E(Y_i | X_i) = f(X_i; β),
  where f(x; β) is a smooth function of β. For example, f(x; β) = (β^T x)² or f(x; β) = exp(β^T x).
Nonlinear regression
- Assume the following nonlinear regression model:
    Y_i = f(X_i; β) + ε_i,  i = 1, ..., n,
  where the ε_i are random errors and f(x; β) is a known function of x up to a p-dimensional unknown parameter vector β.
- The ε_i are independent random errors with mean 0 and variance σ².
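As a minimal sketch of what data from such a model look like, the snippet below simulates from the mean function f(x; β) = exp(β^T x) mentioned on the previous slide; the sample size, β, and σ are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 100, np.array([0.5, -1.0])

# Design: n observations of a 2-dimensional covariate X_i
X = rng.normal(size=(n, 2))

# Nonlinear mean function f(x; beta) = exp(beta^T x)
mean = np.exp(X @ beta)

# Y_i = f(X_i; beta) + eps_i with iid mean-0, variance sigma^2 errors
sigma = 0.1
Y = mean + rng.normal(scale=sigma, size=n)
```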
Ordinary least squares in the nonlinear model
Similar to linear regression, the least squares estimator of β can be obtained by minimizing the following objective function:
    g_n(β) = Σ_{i=1}^n {Y_i − f(X_i; β)}².
That is,
    β̂ = argmin_β g_n(β).
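As a sketch, g_n(β) can be coded directly and handed to a generic optimizer; scipy.optimize.minimize is used here for illustration, reusing X and Y from the simulation sketch above and the same illustrative mean function.

```python
import numpy as np
from scipy.optimize import minimize

def f(X, beta):
    """Mean function f(x; beta) = exp(beta^T x) (illustrative choice)."""
    return np.exp(X @ beta)

def g_n(beta, X, Y):
    """Least squares objective g_n(beta) = sum_i {Y_i - f(X_i; beta)}^2."""
    resid = Y - f(X, beta)
    return resid @ resid

# beta_hat = argmin_beta g_n(beta), found numerically from a starting value
res = minimize(g_n, x0=np.zeros(2), args=(X, Y))
beta_hat = res.x
```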
OLS estimator
- Typically, we do not have an explicit formula for the least squares estimator of β, so a numerical algorithm is needed to find the OLS estimator in nonlinear regression.
- A necessary condition for β̂ to be a minimizer of g_n(β) is that
    g′_{n,j}(β̂) = ∂g_n(β)/∂β_j |_{β = β̂} = 0
  for j = 1, ..., p.
Estimating equations
- The LS estimator of β must be a solution of the following estimating equations:
    D_β^T {Y − f(X; β)} = 0,
  where Y = (Y_1, ..., Y_n)^T, D_β = (∂f(X_i; β)/∂β_j)_{ij} is an n × p matrix, and f(X; β) = (f(X_1; β), ..., f(X_n; β))^T is an n × 1 vector.
- In particular, if f(X_i; β) = X_i^T β (a linear model), then D_β = X and the above estimating equations become X^T (Y − Xβ) = 0, which are the normal equations.
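A quick numerical check of the estimating equations: at the minimizer, D_β^T {Y − f(X; β̂)} should be close to zero. The sketch below forms D_β by central finite differences (a hedged stand-in for analytic derivatives) and reuses f, X, Y, and beta_hat from the sketches above.

```python
import numpy as np

def jacobian(f, X, beta, h=1e-6):
    """Finite-difference approximation of D_beta = (df(X_i; beta)/dbeta_j)_ij."""
    p = beta.size
    D = np.empty((X.shape[0], p))
    for j in range(p):
        e = np.zeros(p)
        e[j] = h
        D[:, j] = (f(X, beta + e) - f(X, beta - e)) / (2 * h)
    return D

D = jacobian(f, X, beta_hat)
score = D.T @ (Y - f(X, beta_hat))   # should be numerically close to 0 at beta_hat
print(score)
```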
Gauss-Newton algorithm
Let β̂^(r) be the current estimate of β at the r-th iteration (r = 0, 1, 2, ...) and define the corresponding Jacobian as
    D_β^(r) = (∂f(X_i; β)/∂β_j |_{β = β̂^(r)})_{ij}.
The first-order Taylor expansion of f(X; β) around β̂^(r) is
    f(X; β) ≈ f(X; β̂^(r)) + D_β^(r) (β − β̂^(r)).
Then the nonlinear regression model may be represented approximately as the linear model
    Y = f(X; β̂^(r)) + D_β^(r) (β − β̂^(r)) + ε.
Applying ordinary least squares to this linear model in the increment β − β̂^(r) yields
    β̂^(r+1) − β̂^(r) = ((D_β^(r))^T D_β^(r))^{−1} (D_β^(r))^T {Y − f(X; β̂^(r))}.
Gauss-Newton algorithm
Step 1: Choose an initial value β̂^(0).
Step 2: Update the current value β̂^(r) by
    β̂^(r+1) = β̂^(r) + ((D_β^(r))^T D_β^(r))^{−1} (D_β^(r))^T {Y − f(X; β̂^(r))}.
Step 3: Repeat Step 2 until ‖β̂^(r+1) − β̂^(r)‖ < δ for a small tolerance δ.
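A minimal implementation of Steps 1-3, assuming the user supplies the mean function f(X, beta) and its Jacobian jac(X, beta) (the finite-difference jacobian sketched earlier would also do). A least squares solve replaces the explicit inverse for numerical stability; the update it computes is the same.

```python
import numpy as np

def gauss_newton(f, jac, X, Y, beta0, delta=1e-8, max_iter=100):
    """Gauss-Newton: beta^(r+1) = beta^(r) + (D^T D)^{-1} D^T {Y - f(X; beta^(r))}."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        D = jac(X, beta)                                  # n x p Jacobian at current iterate
        resid = Y - f(X, beta)                            # current residuals
        step = np.linalg.lstsq(D, resid, rcond=None)[0]   # solves D @ step ~ resid
        beta = beta + step
        if np.linalg.norm(step) < delta:                  # Step 3: stop when update is small
            break
    return beta
```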
Example 1
Consider the following nonlinear regression model:
    Y_i = exp(β_1 + β_2 X_i²) + ε_i.
The least squares estimator of β = (β_1, β_2)^T is the solution of the following estimating equations:
    D_β^T {Y − exp(β_1 + β_2 X²)} = 0,
where D_β = (exp(β_1 + β_2 X²), X² exp(β_1 + β_2 X²)) is an n × 2 matrix and all functions of X = (X_1, ..., X_n)^T are applied elementwise.
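Example 1 has an analytic Jacobian, so the gauss_newton routine sketched above can be applied directly; the simulated data and starting value below are illustrative.

```python
import numpy as np

def f1(x, beta):
    """f(x; beta) = exp(beta_1 + beta_2 x^2), applied elementwise."""
    return np.exp(beta[0] + beta[1] * x**2)

def jac1(x, beta):
    """D_beta = (exp(b1 + b2 x^2), x^2 exp(b1 + b2 x^2)), an n x 2 matrix."""
    m = f1(x, beta)
    return np.column_stack([m, x**2 * m])

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=200)
beta_true = np.array([0.3, -0.8])
Y1 = f1(x, beta_true) + rng.normal(scale=0.05, size=x.size)

beta_hat1 = gauss_newton(f1, jac1, x, Y1, beta0=np.zeros(2))
```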
Example 2
Consider the following nonlinear regression model:
    Y_i = sin{2π(β_1 + β_2 X_i²)} + ε_i.
Note that β_1 is not identifiable without any constraints: since sin(2π · ) has period 1 in its argument, β_1 and β_1 + 1 produce exactly the same mean function. We have to put some range restriction on β_1, e.g. β_1 ∈ [0, 1).
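A one-line numerical check of the identifiability problem, with illustrative parameter values: shifting β_1 by an integer leaves the mean function unchanged.

```python
import numpy as np

x = np.linspace(0, 1, 5)
mean = lambda b1, b2: np.sin(2 * np.pi * (b1 + b2 * x**2))

# beta_1 = 0.2 and beta_1 = 1.2 give identical mean functions -> not identifiable
print(np.allclose(mean(0.2, 0.7), mean(1.2, 0.7)))  # True
```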
Remarks
- A nonlinear regression model should be parametrized so that all of its parameters are identifiable.
- If the objective function is non-convex, finding the global minimizer is a challenging problem; a common remedy is to optimize from multiple starting values, as sketched below.
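A hedged sketch of that multi-start workaround, with illustrative names: run a local fitter (e.g. the gauss_newton routine above) from several random starting values and keep the solution with the smallest objective value.

```python
import numpy as np

def multi_start(fit, objective, p, n_starts=20, scale=1.0, seed=0):
    """Run a local fitter from several random starts; keep the best local solution."""
    rng = np.random.default_rng(seed)
    best, best_val = None, np.inf
    for _ in range(n_starts):
        beta0 = rng.normal(scale=scale, size=p)   # random starting value
        try:
            beta = fit(beta0)                     # e.g. lambda b0: gauss_newton(f1, jac1, x, Y1, b0)
        except np.linalg.LinAlgError:             # skip starts where the solver fails
            continue
        val = objective(beta)                     # e.g. g_n at the local solution
        if val < best_val:
            best, best_val = beta, val
    return best
```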
Large sample inference based on likelihood
Under some regularity conditions, we have the following.
- Asymptotic normality of β̂:
    {σ² (D_β^T D_β)^{−1}}^{−1/2} (β̂ − β) →_d N_p(0, I_p).
- The MSE of the nonlinear regression is defined as
    MSE = (1/(n − p)) Σ_{i=1}^n {Y_i − f(X_i; β̂)}².
  It can be shown that MSE →_p σ².
- Note that (D_β^T D_β)^{−1} can be estimated by (D_β̂^T D_β̂)^{−1}.
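Translating this slide into code, using the Example 1 fit above (f1, jac1, x, Y1, beta_hat1): the MSE estimates σ², and the estimated covariance σ̂² (D_β̂^T D_β̂)^{−1} yields standard errors for β̂.

```python
import numpy as np

n, p = Y1.size, 2
D_hat = jac1(x, beta_hat1)                      # D_beta evaluated at beta_hat
resid = Y1 - f1(x, beta_hat1)

mse = resid @ resid / (n - p)                   # consistent estimate of sigma^2
cov_hat = mse * np.linalg.inv(D_hat.T @ D_hat)  # estimated covariance of beta_hat
se = np.sqrt(np.diag(cov_hat))                  # standard errors of beta_hat
```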
Large sample inference based on likelihood
Under some regularity conditions, we have the following.
- For any smooth (differentiable) function h(b): R^p → R^q, the delta method gives, approximately,
    h(β̂) ∼ N_q(h(β), σ² G (D_β^T D_β)^{−1} G^T),
  where G = (∂h_i(b)/∂b_j |_{b=β})_{ij} is a q × p matrix, which can be estimated by Ĝ = (∂h_i(b)/∂b_j |_{b=β̂})_{ij}.
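A short sketch of this delta-method variance for a scalar function, continuing the Example 1 fit; the choice h(β) = exp(β_1) is illustrative, with Ĝ its 1 × p gradient evaluated at β̂, and cov_hat taken from the previous sketch.

```python
import numpy as np

# Illustrative h: R^2 -> R, h(beta) = exp(beta_1)
h = lambda b: np.exp(b[0])
G_hat = np.array([[np.exp(beta_hat1[0]), 0.0]])  # (dh/db_1, dh/db_2) at beta_hat

var_h = (G_hat @ cov_hat @ G_hat.T).item()       # estimate of sigma^2 G (D^T D)^{-1} G^T
se_h = np.sqrt(var_h)                            # standard error of h(beta_hat)
```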
Example 1 continued
Consider the following nonlinear regression model:
    Y_i = exp(β_1 + β_2 X_i²) + ε_i.
The least squares estimator β̂ is asymptotically normal,
    β̂ − β ∼ N(0, σ² (D_β^T D_β)^{−1}) approximately,
where D_β = (exp(β_1 + β_2 X²), X² exp(β_1 + β_2 X²)) is an n × 2 matrix, as in Example 1.
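Putting the pieces together for Example 1: approximate 95% Wald confidence intervals for β_1 and β_2, using beta_hat1 and the standard errors se computed above (1.96 is the usual normal quantile, justified by the asymptotic normality of β̂).

```python
import numpy as np

z = 1.96                                   # approximate 97.5% normal quantile
lower = beta_hat1 - z * se
upper = beta_hat1 + z * se
for j, (lo, hi) in enumerate(zip(lower, upper), start=1):
    print(f"beta_{j}: [{lo:.3f}, {hi:.3f}]")
```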