WORKING PAPER
ALFRED P. SLOAN SCHOOL OF MANAGEMENT

NONLINEAR THREE STAGE LEAST SQUARES POOLING OF
CROSS SECTION AND AVERAGE TIME SERIES DATA

Dale W. Jorgenson*
and
Thomas M. Stoker**

April 1982
(Revised: August 1983)

Sloan School of Management Working Paper #1293-82

MASSACHUSETTS
INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139
*Department of Economics, Harvard University, Cambridge, Massachusetts 02138
**Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
NONLINEAR THREE STAGE LEAST SQUARES POOLING
OF CROSS SECTION AND TIME SERIES OBSERVATIONS
by
Dale W. Jorgenson and Thomas M. Stoker

1. Introduction. The purpose of this paper is to discuss the pooling of cross section and average time series data by the method of nonlinear three stage least squares introduced by Jorgenson and Laffont (1974).¹ We consider applications of this method to exact aggregation models, where there is a unique correspondence between individual and aggregate behavior. This correspondence makes exact aggregation models appropriate for the analysis of average data, individual data, or both in combination.²

We consider observations on K individuals, indexed by k = 1, 2, ..., K, for T time periods, indexed by t = 1, 2, ..., T. We can represent the structural form of an exact aggregation model for the kth individual in the tth time period by:

$$y_{nkt} = x_{kt}'\,\beta_n(p_t, \theta), \qquad (n = 1, 2, \ldots, N).$$

The observations $y_{nkt}$ and $x_{kt}$ vary over both individuals and time periods, while the vector of observations $p_t$ varies over time periods, but is the same for all individuals in a given time period. The coefficients $\beta_n(p_t, \theta)$ are functions of the observations $p_t$ and the vector of L structural parameters $\theta' = (\theta_1, \theta_2, \ldots, \theta_L)$. Restrictions on the parameters are embodied in the forms of these functions.

We can write the exact aggregation model for the kth individual in vector form

$$y_{kt} = (I_N \otimes x_{kt}')\,\beta(p_t, \theta), \tag{1}$$

where $y_{kt}$ is a vector of N observations, $\beta(p_t, \theta)$ is a vector of N coefficients, and $I_N$ is the identity matrix of order N. By averaging the model (1) over all individuals for each time period, we obtain the structural form of the exact aggregation model for averaged data,

$$\bar{y}_t = (I_N \otimes \bar{x}_t')\,\beta(p_t, \theta), \tag{2}$$

where $\bar{y}_t$ and $\bar{x}_t$ are vectors of observations on averages of $y_{kt}$ and $x_{kt}$ over all individuals.

The models for individual cross section and average time series observations contain the same parameter vector $\theta$ and the same coefficient vector $\beta(p_t, \theta)$. This reflects the correspondence between individual and aggregate behavior that characterizes exact aggregation models. The forms of the individual and aggregate models (1) and (2) are necessary and sufficient for exact aggregation, provided that the population distribution of $x_{kt}$ is unrestricted.³
As an example of exact aggregation models we first consider the linear model that underlies previous discussions of pooling cross section and time series data:⁴

$$y_{nkt} = p_t'\alpha_{1n} + z_{kt}'\alpha_{2n}, \qquad (n = 1, 2, \ldots, N),$$

where $\alpha_{1n}$ and $\alpha_{2n}$ are vectors of parameters. In this example the vector of parameters $\theta$ of models (1) and (2) includes the elements of $\alpha_{1n}$ and $\alpha_{2n}$ $(n = 1, 2, \ldots, N)$. The vector of coefficients $\beta_n(p_t, \theta)'$ is $(p_t'\alpha_{1n}, \alpha_{2n}')$ and the vector of observations $x_{kt}'$ is $(1, z_{kt}')$.
Demand analysis provides many examples of nonlinear exact aggregation models. In each of these examples the theory of consumer behavior implies constraints on the parameters of the model that are incorporated through the form of the coefficients $\beta_n(p_t, \theta)$ $(n = 1, 2, \ldots, N)$. Demand systems generated by the Gorman polar form of the indirect utility function are nonlinear exact aggregation models. Specific examples include the linear expenditure system introduced by Klein and Rubin (1947-1948) and implemented by Stone (1954), the S-branch utility tree of Brown and Heien (1972), and the generalization of the S-branch utility tree of Blackorby, Boyce, and Russell (1978).

As an illustration, the linear expenditure system can be written in exact aggregation form as follows:

$$y_{nkt} = p_{nt} b_n + c_n \Big(M_{kt} - \sum_{j} p_{jt} b_j\Big), \qquad (n = 1, 2, \ldots, N),$$

where $y_{nkt}$ denotes expenditure on the nth commodity by the kth individual in period t, $p_{nt}$ is the price of this commodity $(n = 1, 2, \ldots, N)$, and $M_{kt}$ is total expenditure on all commodities. The vector of parameters $\theta$ includes the parameters $b_n$ and $c_n$ $(n = 1, 2, \ldots, N)$, the vector of coefficients $\beta_n(p_t, \theta)'$ is $(p_{nt} b_n - c_n \sum_j p_{jt} b_j,\; c_n)$, and the vector of observations $x_{kt}'$ is $(1, M_{kt})$.
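The averaging step that takes model (1) to model (2) can be checked numerically for the linear expenditure system. The following is a minimal sketch, assuming made-up parameter values, prices, and individual budgets (none of these numbers come from the paper); the function names are illustrative only. It compares the average of individual expenditures with the model evaluated once at average total expenditure.

```python
# Hypothetical LES parameters for N = 3 commodities (illustrative values only).
b = [2.0, 1.0, 0.5]        # parameters b_n
c = [0.5, 0.3, 0.2]        # parameters c_n, chosen here to sum to one
p = [1.0, 2.0, 4.0]        # prices p_nt, common to all individuals in period t


def les_coefficients(p, b, c):
    """Return beta_n(p_t, theta)' = (p_n b_n - c_n * sum_j p_j b_j, c_n) for each n."""
    committed = sum(pj * bj for pj, bj in zip(p, b))
    return [(pn * bn - cn * committed, cn) for pn, bn, cn in zip(p, b, c)]


def les_expenditures(M, p, b, c):
    """Expenditures y_n = x' beta_n with x' = (1, M)."""
    return [intercept + slope * M for intercept, slope in les_coefficients(p, b, c)]


# Total expenditure M_kt for K = 4 hypothetical individuals.
M_individual = [10.0, 20.0, 35.0, 55.0]
K = len(M_individual)

# Average of the individual expenditures, commodity by commodity.
individual = [les_expenditures(M, p, b, c) for M in M_individual]
average_of_individuals = [sum(y[n] for y in individual) / K for n in range(len(p))]

# The same coefficients applied once to average total expenditure.
M_bar = sum(M_individual) / K
model_at_average = les_expenditures(M_bar, p, b, c)
```

Because the model is linear in $x_{kt}' = (1, M_{kt})$ with coefficients that depend only on prices and parameters, the two lists agree up to rounding, which is the property that lets the same $\theta$ appear in both the individual and the averaged model.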
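More complex nonlinear exact aggregation models have recently been introduced by Deaton and Muellbauer (1980a, 1980b) and by Jorgenson, Lau, and Stoker (1980, 1981, 1982). The AIDS models of Deaton and Muellbauer can be written:

$$y_{nkt} = \Big(a_n + \sum_j c_{nj} \ln p_{jt} - b_n \ln P_t\Big) M_{kt} + b_n M_{kt} \ln M_{kt}, \qquad (n = 1, 2, \ldots, N),$$

where $y_{nkt}$, $M_{kt}$ and $p_t$ are defined as in the linear expenditure system and:

$$\ln P_t = a_0 + \sum_j a_j \ln p_{jt} + \frac{1}{2} \sum_i \sum_j c_{ij} \ln p_{it} \ln p_{jt}$$

is a price index. The vector of parameters $\theta$ includes the parameters $a_n$, $b_n$, $c_{nj}$ $(n, j = 1, 2, \ldots, N)$, the vector of coefficients $\beta_n(p_t, \theta)'$ is $(a_n + \sum_j c_{nj} \ln p_{jt} - b_n \ln P_t,\; b_n)$, and the vector of observations $x_{kt}'$ is $(M_{kt}, M_{kt} \ln M_{kt})$.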
The translog model of Jorgenson, Lau and Stoker can be represented in the form

$$y_{nkt} = \frac{1}{D(p_t)} \Big[\Big(a_n + \sum_j b_{nj} \ln p_{jt}\Big) M_{kt} - b_{nM} M_{kt} \ln M_{kt} + \sum_s b_{ns} A_{skt} M_{kt}\Big], \qquad (n = 1, 2, \ldots, N),$$

where $y_{nkt}$, $M_{kt}$ and $p_t$ are defined as above, $A_{skt}$ $(s = 1, 2, \ldots, S)$ represents demographic characteristics such as family size, age of head of household, and so on, and:

$$D(p_t) = -1 + \sum_j b_{Mj} \ln p_{jt},$$

with $b_{nM} = \sum_j b_{nj}$ and $b_{Mj} = \sum_n b_{nj}$. In this example the vector $\theta$ consists of the parameters $a_n$, $b_{nj}$, $b_{ns}$ $(n, j = 1, 2, \ldots, N;\; s = 1, 2, \ldots, S)$, the vector of coefficients $\beta_n(p_t, \theta)'$ is

$$\frac{1}{D(p_t)} \Big(a_n + \sum_j b_{nj} \ln p_{jt},\; -b_{nM},\; b_{n1},\; \ldots,\; b_{nS}\Big),$$

and the vector of observations $x_{kt}'$ is $(M_{kt},\; M_{kt} \ln M_{kt},\; M_{kt} A_{1kt},\; \ldots,\; M_{kt} A_{Skt})$.

In this paper we focus on the implications of nonlinearity for the pooling of cross section and average time series data. In Section 2 we consider the stochastic specification of exact aggregation models (1) and (2). In Section 3 we present and characterize the nonlinear three stage least squares estimator for pooled time series and cross section observations. In Section 4 we discuss hypothesis testing and in Section 5 we consider estimation subject to inequality constraints. We close with a brief summary of the results and a discussion of applications.

2. Stochastic Specification. We begin by considering average observations for T time periods and a single cross section of K individual observations. We assume that the observations are generated by exact aggregation models (1) and (2) with additive disturbance terms. Given the stochastic specification of the disturbance terms, the observations must be transformed to obtain disturbances that are homoscedastic and uncorrelated across observations.
For pooling of cross section and average time series data the transformation of observations to obtain homoscedastic and uncorrelated disturbances can be divided into two steps. The first step separates the data sets by transforming the average data so that time series disturbances are uncorrelated with cross section disturbances. The second step transforms the resulting data sets to a form where disturbances in each data set are homoscedastic and uncorrelated. We present the transformation for the first step explicitly, indicating the features of this transformation that result in increased efficiency. The second step involves standard techniques for transformation, which we illustrate by example.
We assume that individual observations are generated by the exact aggregation model (1) with an additive random component, say $\epsilon_{kt}$:

$$y_{kt} = (I_N \otimes x_{kt}')\,\beta(p_t, \theta) + \epsilon_{kt}. \tag{1'}$$

We assume that the disturbance term $\epsilon_{kt}$ is distributed with mean zero and is uncorrelated across individuals, so that:

$$E(\epsilon_{kt}\,\epsilon_{k't'}') = 0, \qquad k \neq k'.$$

Any systematic correlation among individuals is assumed to be captured by selection of the variables $x_{kt}$. The disturbance term $\epsilon_{kt}$ is assumed to have variance $\Omega_\epsilon$ and time series covariance structure $E(\epsilon_{kt}\,\epsilon_{kt'}') = C_{tt'}\,\Omega_\epsilon$, $t \neq t'$. A wide variety of alternative time series structures for $\epsilon_{kt}$ can be represented by choosing an appropriate form for the matrix C.

We could obtain a stochastic version of the exact aggregation model (2) by averaging the individual observations in (1') for each time period. This would be the appropriate procedure if the average data were constructed by averaging the individual observations. However, we must allow for alternative methods for constructing the aggregate data. In demand analysis, for example, data on aggregate personal consumption expenditures are obtained from production accounts for the economy as a whole rather than by direct observation of quantities consumed by the entire population of individual households.
To allow for differences in methods of construction of the individual and aggregate data we introduce an additive random component $\nu_t$ into the exact aggregation model (2) for each time period. The model relating the averaged data $\bar{y}_t$ and $\bar{x}_t$ is then:

$$\bar{y}_t = (I_N \otimes \bar{x}_t')\,\beta(p_t, \theta) + u_t, \tag{2'}$$

where $u_t = \nu_t + \sum_k \epsilon_{kt}/K$ is a vector of N averaged disturbances $\epsilon_{kt}$ and $\nu_t$. The stochastic term $\nu_t$ is assumed to be distributed independently of $\epsilon_{kt}$ with mean zero, variance $\Omega_{\nu tt}$ and time series covariance structure $E(\nu_t \nu_{t'}') = \Omega_{\nu tt'}$ for $t \neq t'$. To accommodate a variety of time series covariance structures for $u_t$ we have:

$$E(u_t u_{t'}') = \Omega_{\nu tt'} + \frac{1}{K}\, C_{tt'}\,\Omega_\epsilon.$$

In order to present methods for pooling cross section and time series data we consider a sample of K' individual observations for time period $t_0$. We can "stack" the equations (1') to obtain:

$$Y_0 = (I_N \otimes X_0)\,\beta(p_{t_0}, \theta) + \epsilon_0, \tag{3}$$

where $Y_0$ is the vector of observations $\{y_{nkt_0}\}$, $X_0$ is the matrix with rows $\{x_{kt_0}'\}$, and $\epsilon_0$ is the vector of disturbances with mean zero and covariance matrix $\Omega_\epsilon \otimes I_{K'}$. Similarly, we can represent the equations (2') in the form:

$$\bar{Y} = f(\theta) + u, \tag{4}$$

where $\bar{Y}$ is the vector of averaged observations $\{\bar{y}_{nt}\}$, $f(\theta)$ is the vector with elements $\{\bar{x}_t'\,\beta_n(p_t, \theta)\}$, and u is the vector of disturbances.
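The first step in the transformation of observations eliminates the correlation between $\epsilon_0$ and u, of the form:

$$E(u_t\,\epsilon_{kt_0}') = \frac{1}{K}\, C_{tt_0}\,\Omega_\epsilon, \qquad (k = 1, 2, \ldots, K';\; t = 1, 2, \ldots, T). \tag{5}$$

This correlation is removed by a nonsingular transformation of (3) and (4), which is equivalent to replacing $\bar{y}_t$ and $\bar{x}_t$ in (2') by:

$$\bar{y}_t^{\,o} = \bar{y}_t - \frac{K'}{K}\, C_{tt_0}\, \bar{y}_{cs}, \qquad \bar{x}_t^{\,o} = \bar{x}_t - \frac{K'}{K}\, C_{tt_0}\, \bar{x}_{cs}, \qquad u_t^{o} = u_t - \frac{K'}{K}\, C_{tt_0}\, \bar{\epsilon}_{cs}, \tag{6}$$

where $\bar{y}_{cs}$, $\bar{x}_{cs}$ and $\bar{\epsilon}_{cs}$ denote the cross section averages of $y_{kt_0}$, $x_{kt_0}$ and $\epsilon_{kt_0}$. The resulting disturbances $u_t^{o}$ are now uncorrelated with $\epsilon_{kt_0}$ $(k = 1, 2, \ldots, K')$, but have a more complicated time series structure than the original disturbances:

$$E(u_t^{o}\, u_{t'}^{o\prime}) = \Omega_{\nu tt'} + \frac{1}{K}\Big[C_{tt'} - \frac{K'}{K}\, C_{tt_0}\, C_{t't_0}\Big]\,\Omega_\epsilon. \tag{7}$$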
The second step in the transformation of observations is to apply a nonsingular transform to the average data in (4) to obtain disturbances that are homoscedastic and uncorrelated. We illustrate this transformation below by example. We assume that the transformation has been performed, altering the model (4) to:

$$Y^* = f^*(\theta) + u^*, \tag{8}$$

where $u^*$ is distributed with mean zero and variance $\Omega_{u^*} \otimes I_T$. For estimation, we stack the equation systems (3) and (8):

$$Y = \phi(\theta) + U, \tag{9}$$

where $U' = (\epsilon_0',\; u^{*\prime})$, which is distributed with mean zero and variance:

$$\Sigma = \begin{bmatrix} \Omega_\epsilon \otimes I_{K'} & 0 \\ 0 & \Omega_{u^*} \otimes I_T \end{bmatrix}. \tag{10}$$
The implementation of the transformations described above requires consistent estimates of the variances and covariances $\Omega_\epsilon$, $C_{tt'}$, $\Omega_{\nu tt'}$ $(t, t' = 1, 2, \ldots, T)$. In general, these estimates require specific models of the processes generating the disturbances.

The purpose of the transformations is to assure efficiency in estimation. Equation (2') shows that the contribution of the individual errors $\epsilon_{kt}$ to the covariance structure of $u_t$ is likely to be negligible unless the matrices $\Omega_{\nu tt'}$ are the same order of magnitude as $\frac{1}{K} C_{tt'} \Omega_\epsilon$, where K is population size. The benefits of performing the transformation (6) depend on the size of the cross section relative to the population. In many applications, K'/K will be extremely small so that the transformation (6) leaves the observations unaffected. Typical numbers for an analysis of U.S. household demand behavior are K' = 10,000 and K = 70 million. Consequently, only when the cross section sample size is of the same order of magnitude as the size of the population will the correction yield significant benefits; otherwise it can be ignored.
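To give a sense of scale for the claim that the first step correction is usually negligible, the sketch below evaluates the weight K'/K and the scalar multiplying the individual-error variance in the transformed covariance structure, using the household numbers quoted in the text. It assumes the covariance weights equal one in the variance case t = t' = t0, which the variance definition implies; the function names are illustrative and not from the paper.

```python
def first_step_weight(k_sample, k_population, c_t_t0=1.0):
    """Weight (K'/K) * C_{t t0} applied to the cross section averages."""
    return (k_sample / k_population) * c_t_t0


def epsilon_scale(k_sample, k_population, c_tt, c_t_t0, c_tprime_t0):
    """Scalar on the individual-error variance after the first step:
    (1/K) * [C_tt' - (K'/K) * C_tt0 * C_t't0]."""
    ratio = k_sample / k_population
    return (c_tt - ratio * c_t_t0 * c_tprime_t0) / k_population


K_SAMPLE = 10_000          # K', cross section size quoted in the text
K_POPULATION = 70_000_000  # K, population size quoted in the text

weight = first_step_weight(K_SAMPLE, K_POPULATION)

# Variance case t = t' = t0, assuming all covariance weights equal one.
with_correction = epsilon_scale(K_SAMPLE, K_POPULATION, 1.0, 1.0, 1.0)
without_correction = 1.0 / K_POPULATION
relative_change = 1.0 - with_correction / without_correction
```

With these numbers the weight is 1/7000, and the correction changes the individual-error contribution to the aggregate variance by the same proportion, which is already scaled down by 1/K.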
The following examples illustrate different error structures, where we assume K'/K is very small. We take $C_{tt'} = 0$, $t \neq t'$, for simplicity, deferring further discussion of this time series structure until we have presented the examples.

Example 1 (Random Individual Errors): Suppose that $\nu_t$ arises because of an additional random component at the individual level, $\nu_{kt}$, which is distributed with mean 0 and variance $\Omega_\nu$, so that $\nu_t = \sum_k \nu_{kt}/K$. Then $u_t = \sum_k (\nu_{kt} + \epsilon_{kt})/K$, with $E(u_t u_{t'}') = 0$ for $t \neq t'$. The second stage transformation is just a grouping correction, with (8) given as $u_t^* = \sqrt{K}\, u_t$ with $\Omega_{u^*} = \Omega_\nu + \Omega_\epsilon$.

Example 2 (Common Time Effect): Suppose that $\nu_t$ represents a common disturbance in the aggregate data with $\Omega_{\nu tt} = \Omega_\nu$ for all t and $\Omega_{\nu tt'} = 0$ for $t \neq t'$, so that $u_t = \nu_t + \sum_k \epsilon_{kt}/K$. In practice one will usually encounter $K\,\Omega_\nu \gg \Omega_\epsilon$, so that $u_t = \nu_t$ for purposes of estimation. Here no second stage correction is necessary, with $\Omega_{u^*} = \Omega_\nu$.

Example 3 (Autocorrelated Common Time Effect): Suppose that Example 2 is modified to $\nu_t = \gamma\, \nu_{t-1} + w_t$, where $w_t$ is distributed with mean zero, variance $\Omega_w$ and uncorrelated over time, with $K\,\Omega_\nu \gg \Omega_\epsilon$ so that $u_t = \nu_t$. As above, the contribution of $\sum_k \epsilon_{kt}/K$ to $u_t$ is negligible. The second stage correction is now quasi-first differencing, replacing $\bar{y}_t$ and $\bar{x}_t$ by $\bar{y}_t - \gamma\, \bar{y}_{t-1}$ and $\bar{x}_t - \gamma\, \bar{x}_{t-1}$ (with the standard adjustment $\sqrt{1 - \gamma^2}$ to the first observation). Of course, $\Omega_{u^*} = \Omega_w$ in this case.

Now suppose that $C_{tt'} \neq 0$, so that we have a nontrivial time series correlation structure for $\epsilon_{kt}$. In Examples 2 and 3 above, the effect of $C_{tt'}$ would be negligible, due to the unimportance of $\sum_k \epsilon_{kt}/K$ in $u_t$. In Example 1, however, the time series structure is potentially important, since $u_t$ will have the same time series covariance structure as $\epsilon_{kt}$, and would require consideration in the second stage of the transformation of observations.

Example 3 illustrates the cost of pooling with very general error structures. In particular, in Example 3, the parameter $\gamma$ is best relabeled as a component of $\theta$, with the transformed error covariance structure now determined by $\Omega_\epsilon$ and $\Omega_{u^*} = \Omega_w$. The treatment of autocorrelation will involve augmenting the list of parameters to be estimated, with the remaining error structure characterized by $\Omega_\epsilon$ and $\Omega_{u^*}$. This modeling approach is standard practice in time series analysis. Consequently, in Section 3 we discuss only the consistent estimation of the parameters $\Omega_\epsilon$ and $\Omega_{u^*}$, which we will regard as positive definite but otherwise unrestricted.
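The quasi-first differencing used as the second stage correction in Example 3 can be sketched as follows. This is a minimal illustration, assuming a known autoregressive parameter and a made-up averaged series; the name `quasi_difference` is hypothetical, and the rescaling of the first observation by the square root of one minus the squared parameter is the usual Prais-Winsten adjustment rather than anything specific to this paper.

```python
import math


def quasi_difference(series, gamma):
    """Quasi-first difference a time series with autoregressive parameter gamma.

    The first observation is rescaled by sqrt(1 - gamma**2); each later
    observation becomes s_t - gamma * s_{t-1}.
    """
    if not -1.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between -1 and 1")
    if not series:
        return []
    first = math.sqrt(1.0 - gamma ** 2) * series[0]
    rest = [series[t] - gamma * series[t - 1] for t in range(1, len(series))]
    return [first] + rest


# Hypothetical averaged series for one equation (illustrative values only).
y_bar = [1.0, 1.5, 1.75, 1.875]
y_star = quasi_difference(y_bar, gamma=0.5)
```

In practice the same transformation would be applied to every element of the averaged data, and the autoregressive parameter would be estimated jointly with the structural parameters rather than fixed in advance.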
Before discussing the additional assumptions required for estimation of the complete model, we introduce instrumental variables. It is often appropriate to treat the variables $x_{kt}$ and $p_t$ as endogenous for the individual observations, the aggregate observations, or both. This can occur when the model is a simultaneous equations model in exact aggregation form or part of a larger system of simultaneous equations. For example, in demand analysis observations on prices can reflect both supply and demand influences, requiring aggregate instruments. Alternatively, in a study of savings, errors in variables may necessitate instruments for the individual data, while in the average data such errors may be negligible.

We assume that there are vectors of observations on instrumental variables, say $z_{kt_0}$ and $\bar{z}_t$, for the individual and average data respectively. Denote as $Z_0$ and $\bar{Z}$ the matrices with rows $z_{kt_0}'$ and $\bar{z}_t'$, respectively, and as Z the matrix:

$$Z = \begin{bmatrix} Z_0 & 0 \\ 0 & \bar{Z} \end{bmatrix}.$$
Finally, we must introduce regularity assumptions in order to characterize the NL3SLS estimator. We include these in the Appendix. The assumptions are that the coefficient functions $\beta(p_t, \theta)$ are twice continuously differentiable in the components of $\theta$, that the moment matrices defining the NL3SLS objective function converge to stable, well behaved limits, and that the parameter vector $\theta$ is identified. We collect all components of $\theta$ identified in the cross section in a set $\theta_0$, all parameters identified in the time series in a set $\bar{\theta}$, and all the remaining parameters in a third set.

3. The Nonlinear Three Stage Least Squares Estimator. The NL3SLS estimator $\hat{\theta}$ of $\theta^*$ is found as the value of $\theta$ which minimizes:

$$S(\theta) = (Y - \phi(\theta))'\,[\hat{\Sigma}^{-1} \otimes Z(Z'Z)^{-1}Z']\,(Y - \phi(\theta)), \tag{11}$$

where $\hat{\Sigma}$ is a consistent estimator of $\Sigma$ as $K', T \to \infty$. The objective function $S(\theta)$ can be written more explicitly as:

$$S(\theta) = S_0(\theta) + \bar{S}(\theta), \tag{12}$$

with:

$$S_0(\theta) = (Y_0 - (I_N \otimes X_0)\beta(p_{t_0}, \theta))'\,[\hat{\Omega}_\epsilon^{-1} \otimes Z_0(Z_0'Z_0)^{-1}Z_0']\,(Y_0 - (I_N \otimes X_0)\beta(p_{t_0}, \theta)),$$

$$\bar{S}(\theta) = (Y^* - f^*(\theta))'\,[\hat{\Omega}_{u^*}^{-1} \otimes \bar{Z}(\bar{Z}'\bar{Z})^{-1}\bar{Z}']\,(Y^* - f^*(\theta)),$$

where $S_0(\theta)$ and $\bar{S}(\theta)$ are NL3SLS objective functions for the cross section and average models individually. Obviously, the function $S_0(\theta)$ could be minimized to estimate the elements of $\theta_0$ for fixed values of the remaining parameters; similarly, $\bar{S}(\theta)$ could be minimized to estimate the elements of $\bar{\theta}$. If $\theta_0 = \bar{\theta} = \theta$, then all parameters could be estimated from either data set. Minimizing (11) constrains the estimated values from cross section and time series data sets to be equal, which results in efficiency gains.

Note that the function $S_0(\theta)$ can be evaluated using only $\hat{\Omega}_\epsilon$ and the moment matrices $Z_0'X_0$, $Z_0'Z_0$, and $(I_N \otimes Z_0)'Y_0$. Thus for estimating $\theta$ or other more restricted parameterizations of $\beta(p_{t_0}, \theta)$, only one pass through the cross section data is required to construct these moments. This computational simplification results from exact aggregation.
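The one-pass property can be illustrated in the simplest possible case. The sketch below assumes a single equation, one regressor, one instrument, and a scalar disturbance variance, so that the projection on the instrument reduces to scalar arithmetic; the data and the names `cross_section_moments`, `s0_from_moments`, and `s0_direct` are hypothetical. It compares the cross section objective computed from three stored moments with the same objective computed from the raw data.

```python
def cross_section_moments(y, x, z):
    """One pass through the cross section: return (z'y, z'x, z'z)."""
    zy = zx = zz = 0.0
    for yk, xk, zk in zip(y, x, z):
        zy += zk * yk
        zx += zk * xk
        zz += zk * zk
    return zy, zx, zz


def s0_from_moments(beta, moments, omega):
    """Cross section objective evaluated from the stored moments only."""
    zy, zx, zz = moments
    projected_residual = zy - zx * beta          # z'(y - x * beta)
    return projected_residual ** 2 / (omega * zz)


def s0_direct(beta, y, x, z, omega):
    """Same objective from raw data: e' P_z e / omega with P_z = z z' / z'z."""
    residuals = [yk - xk * beta for yk, xk in zip(y, x)]
    zz = sum(zk * zk for zk in z)
    ze = sum(zk * ek for zk, ek in zip(z, residuals))
    fitted = [zk * ze / zz for zk in z]          # projection of e on z
    return sum(ek * fk for ek, fk in zip(residuals, fitted)) / omega


# Hypothetical cross section (illustrative values only).
y0 = [1.0, 2.0, 2.5, 4.0]
x0 = [1.0, 2.0, 3.0, 4.0]
z0 = [1.0, 1.0, 2.0, 2.0]
omega_hat = 0.5

moments = cross_section_moments(y0, x0, z0)
trial_betas = [0.0, 0.5, 0.9, 1.3]
from_moments = [s0_from_moments(bv, moments, omega_hat) for bv in trial_betas]
from_data = [s0_direct(bv, y0, x0, z0, omega_hat) for bv in trial_betas]
```

Once the moments are stored, each trial value of the coefficient (and hence of the underlying parameters) costs a few arithmetic operations regardless of the cross section size, which is the source of the computational saving described above.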
The estimation procedure consists of three steps: first, find consistent estimators of $\Omega_\epsilon$ and $\Omega_{u^*}$; second, minimize (11) to obtain $\hat{\theta}$; third, calculate the asymptotic covariance matrix of $\hat{\theta}$. If the set of remaining parameters is not empty, then we cannot improve upon previous suggestions in the literature for finding consistent estimators of $\Omega_\epsilon$ and $\Omega_{u^*}$; for example, Gallant (1977) suggests estimating each equation of the model by NL2SLS. This involves pooling both data sources on a single equation basis, forming $\hat{\Omega}_\epsilon$ as the estimated residual covariance from the cross section data, and forming $\hat{\Omega}_{u^*}$ as the estimated residual covariance from the transformed average time series data.

The more usual situation is that the set of remaining parameters is empty, which suggests a simpler procedure. First, obtain consistent estimates of $\beta(p_{t_0}, \theta)$ by linear 2SLS estimation of each equation using the cross section data. The estimated residual covariance matrix $\hat{\Omega}_\epsilon$ provides a consistent estimator of $\Omega_\epsilon$, even if the set of remaining parameters is not empty. Using the consistent estimators of $\beta(p_{t_0}, \theta)$, solve for consistent estimates of the elements of $\theta_0$, say $\hat{\theta}_0^{o}$. Holding these parameters fixed at $\hat{\theta}_0^{o}$, estimate the remaining parameters of $\theta$ by applying NL2SLS to each equation of the model or NL3SLS to the system as a whole, using only the time series data. The estimated covariance matrix of the NL2SLS residuals, $\hat{\Omega}_{u^*}$, provides a consistent estimator of $\Omega_{u^*}$. In addition, this procedure usually produces good starting values for $\theta$ to use in minimizing (11).
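The objective function (11) can be minimized using a variety of well known computational methods. A convenient method that illustrates pooling cross section and time series data is the Gauss-Newton process.⁹ To discuss this method we require the following notation: Let $B_0(\theta)$ and $\bar{\Phi}(\theta)$ denote the matrices

$$B_0(\theta) = \begin{bmatrix} B_{01}(\theta) \\ B_{02}(\theta) \\ \vdots \\ B_{0N}(\theta) \end{bmatrix}, \qquad \bar{\Phi}(\theta) = \begin{bmatrix} \bar{\Phi}_1(\theta) \\ \bar{\Phi}_2(\theta) \\ \vdots \\ \bar{\Phi}_N(\theta) \end{bmatrix},$$

where $B_{0n}(\theta) = \partial \beta_n(p_{t_0}, \theta)/\partial \theta'$ and $\bar{\Phi}_n(\theta)$ is the matrix with elements $\{\partial\, \bar{x}_t'\beta_n(p_t, \theta)/\partial \theta_l\}$.¹⁰

The Gauss-Newton process is an iterative procedure for finding $\hat{\theta}$ from an initial value. At the ith iteration, the current value $\hat{\theta}_i$ is updated to $\hat{\theta}_{i+1} = \hat{\theta}_i + \Delta\hat{\theta}_i$ by first linearizing the system (9) with respect to $\theta$ as:

$$Y_0 - (I_N \otimes X_0)\,\beta(p_{t_0}, \hat{\theta}_i) = (I_N \otimes X_0)\,B_0(\hat{\theta}_i)\,\Delta\theta_i + \epsilon_0,$$
$$Y^* - f^*(\hat{\theta}_i) = \bar{\Phi}(\hat{\theta}_i)\,\Delta\theta_i + u^*. \tag{13}$$

We then apply Zellner and Theil's (1962) linear three stage least squares method to the model (13), obtaining:

$$\Delta\hat{\theta}_i = (M_{x0} + \bar{M}_x)^{-1}\,(M_{\epsilon 0} + \bar{M}_u), \tag{14}$$

where:

$$M_{x0} = B_0(\hat{\theta}_i)'\,\big(\hat{\Omega}_\epsilon^{-1} \otimes X_0'Z_0(Z_0'Z_0)^{-1}Z_0'X_0\big)\,B_0(\hat{\theta}_i),$$
$$\bar{M}_x = \bar{\Phi}(\hat{\theta}_i)'\,\big(\hat{\Omega}_{u^*}^{-1} \otimes \bar{Z}(\bar{Z}'\bar{Z})^{-1}\bar{Z}'\big)\,\bar{\Phi}(\hat{\theta}_i),$$
$$M_{\epsilon 0} = B_0(\hat{\theta}_i)'\,\big(\hat{\Omega}_\epsilon^{-1} \otimes X_0'Z_0(Z_0'Z_0)^{-1}Z_0'\big)\,\big(Y_0 - (I_N \otimes X_0)\,\beta(p_{t_0}, \hat{\theta}_i)\big),$$
$$\bar{M}_u = \bar{\Phi}(\hat{\theta}_i)'\,\big(\hat{\Omega}_{u^*}^{-1} \otimes \bar{Z}(\bar{Z}'\bar{Z})^{-1}\bar{Z}'\big)\,\big(Y^* - f^*(\hat{\theta}_i)\big).$$

Convergence to $\hat{\theta}$ is achieved when $\Delta\hat{\theta}_i$ becomes sufficiently small. Following Hartley (1961) we check whether $S(\hat{\theta}_i + \Delta\hat{\theta}_i) < S(\hat{\theta}_i)$; if not, we shrink $\Delta\hat{\theta}_i$ by forming $\hat{\theta}_{i+1} = \hat{\theta}_i + \Delta\hat{\theta}_i/2$. We continue until improvement in S is found, where a new iteration is performed; alternatively, if the current increment falls below a convergence criterion, we have found the minimum.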
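The iteration with step halving can be sketched on a deliberately small problem. The example below is not the pooled estimator: it assumes a single-parameter nonlinear least squares model, y = exp(theta * x), with made-up noise-free data, and the function names are hypothetical. It shows only the control flow described above: compute the Gauss-Newton increment, halve it until the objective improves, and stop when the increment falls below a tolerance.

```python
import math


def sum_of_squares(theta, x, y):
    """Objective for the illustrative model y = exp(theta * x)."""
    return sum((yk - math.exp(theta * xk)) ** 2 for xk, yk in zip(x, y))


def gauss_newton(theta, x, y, tol=1e-10, max_iter=100):
    """Gauss-Newton iteration with Hartley-style step halving."""
    for _ in range(max_iter):
        residuals = [yk - math.exp(theta * xk) for xk, yk in zip(x, y)]
        jacobian = [xk * math.exp(theta * xk) for xk in x]
        # Increment from the linearized least squares problem.
        step = (sum(j * r for j, r in zip(jacobian, residuals))
                / sum(j * j for j in jacobian))
        current = sum_of_squares(theta, x, y)
        # Halve the increment until the objective improves or the step is negligible.
        while abs(step) > tol and sum_of_squares(theta + step, x, y) >= current:
            step /= 2.0
        if abs(step) <= tol:
            return theta
        theta += step
    return theta


# Hypothetical noise-free data generated with theta = 0.7.
x_data = [0.0, 0.5, 1.0, 1.5, 2.0]
y_data = [math.exp(0.7 * xk) for xk in x_data]
theta_hat = gauss_newton(0.0, x_data, y_data)
```

From the starting value of zero the first full increment overshoots and is halved once before being accepted, after which the iteration settles near 0.7. In the pooled problem the scalar sums are replaced by the moment matrices in (14).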
Under our assumptions the NL3SLS estimator $\hat{\theta}$ is consistent for $\theta^*$ as $K', T \to \infty$, and asymptotically normal with asymptotic covariance matrix:

$$\mathrm{AVAR}(\hat{\theta}) = (M_{x0}^* + \bar{M}_x^*)^{-1}, \tag{15}$$

where the moment matrices are evaluated at the true values $\Omega_\epsilon$ and $\Omega_{u^*}$. The precise form of the limiting normal distribution depends on the way that K' and T approach infinity; however, a consistent estimator of $\mathrm{AVAR}(\hat{\theta})$ is given by $(M_{x0} + \bar{M}_x)^{-1}$ in any case.

Closer inspection of (14) indicates the relationship of NL3SLS to linear pooling estimators; for example, if $\theta_0 = \bar{\theta} = \theta$, then:

$$\Delta\hat{\theta}_i = (M_{x0} + \bar{M}_x)^{-1}\,\big(M_{x0}\,\Delta\hat{\theta}_{0i} + \bar{M}_x\,\Delta\hat{\bar{\theta}}_i\big),$$

where $\Delta\hat{\theta}_{0i}$ and $\Delta\hat{\bar{\theta}}_i$ are the Gauss-Newton increments from minimizing $S_0(\theta)$ and $\bar{S}(\theta)$ respectively. Thus, $\Delta\hat{\theta}_i$ is just a matrix weighted average of the individual increments.¹¹ Second, if the cross section data are exogenous, then $Z_0 = X_0$, and both $M_{x0}$ and $M_{\epsilon 0}$ can be evaluated using the moments $X_0'X_0$ and $(I_N \otimes X_0)'Y_0$. In this case one can obtain $\hat{\Omega}_\epsilon$ from the cross section residuals of each equation estimated by OLS.

Third, additional cross section data sets can be incorporated in a straightforward manner. If an additional cross section is available for time $t_1$ (or $t_0$, for that matter) with data $Y_1$, $X_1$, and $Z_1$, then $\hat{\Omega}_{\epsilon 1}$, $B_1(\theta)$, $M_{x1}$ and $M_{\epsilon 1}$ are formed as above, and $M_{x1}$ and $M_{\epsilon 1}$ enter additively into the first and second terms of (14). The proper correction (7) must be applied to the aggregate data series.¹² In this way all of the available cross section information can be used in estimating the vector of parameters $\theta$.
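The weighted-average interpretation of the pooled increment has a familiar linear analogue that is easy to verify. The sketch below assumes a one-parameter regression through the origin estimated by least squares on two made-up data sets with equal disturbance variances, so the matrix weights collapse to scalars; the data and function names are hypothetical.

```python
def slope_and_weight(x, y):
    """Least squares slope through the origin and its weight sum(x^2)."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx, sxx


def pooled_slope(x_a, y_a, x_b, y_b):
    """Least squares slope from stacking both data sets."""
    x_all, y_all = x_a + x_b, y_a + y_b
    sxx = sum(xi * xi for xi in x_all)
    sxy = sum(xi * yi for xi, yi in zip(x_all, y_all))
    return sxy / sxx


# Hypothetical "cross section" and "time series" samples (illustrative values).
x_cs, y_cs = [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.9]
x_ts, y_ts = [2.0, 5.0], [2.4, 5.5]

theta_cs, m_cs = slope_and_weight(x_cs, y_cs)
theta_ts, m_ts = slope_and_weight(x_ts, y_ts)

# Weighted average of the separate estimates, with the moments as weights.
weighted = (m_cs * theta_cs + m_ts * theta_ts) / (m_cs + m_ts)
pooled = pooled_slope(x_cs, y_cs, x_ts, y_ts)
```

The pooled estimate coincides with the moment-weighted average of the two separate estimates and therefore lies between them; in the pooled NL3SLS iteration the scalar weights are replaced by the cross section and time series moment matrices.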
4. Parametric Hypothesis Tests. Statistical hypotheses take the form $\theta = g(\rho)$, where $\rho$ is a vector with dimensionality R less than that of $\theta$, say L. Our interest is in testing the hypothesis that $\theta = g(\rho)$ against the alternative $\theta \neq g(\rho)$. For this task, we require two additional assumptions, listed in the Appendix, which indicate that $\rho$ is identified and that the disturbances $\epsilon_{kt}$ and $u_t$ are normally distributed.

The test statistic of interest is found as follows: Let $S_r(\rho)$ denote the objective function:

$$S_r(\rho) = (Y - \phi(g(\rho)))'\,[\hat{\Sigma}^{-1} \otimes Z(Z'Z)^{-1}Z']\,(Y - \phi(g(\rho))). \tag{16}$$

Denote by $\hat{\rho}$ the value of $\rho$ which minimizes $S_r(\rho)$. Under our assumptions Gallant and Jorgenson (1979) have shown that the statistic,

$$\tau = S_r(\hat{\rho}) - S(\hat{\theta}), \tag{17}$$

is asymptotically distributed as chi-square with L - R degrees of freedom under the null hypothesis. The appropriate test statistic is provided by $\tau$.

The minimization of $S_r(\rho)$ to find $\hat{\rho}$ is analogous to the procedure for finding $\hat{\theta}$, and requires only moment matrices from the cross section. Although any consistent estimator of $\Sigma$ can be used in evaluating (16), the monotonicity condition $S_r(\hat{\rho}) - S(\hat{\theta}) \geq 0$ will be guaranteed only if the same $\hat{\Sigma}$ is used to evaluate both $S_r$ and S. Thus, the original consistent estimates $\hat{\Omega}_\epsilon$ and $\hat{\Omega}_{u^*}$ used in estimating $\hat{\theta}$ should be used in finding estimators for restricted versions of the model.
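The mechanics of the test can be sketched in a few lines. The example below assumes the two minimized objective values are already available (the numbers used are invented), takes the degrees of freedom as the number of unrestricted parameters minus the number of free parameters under the restriction, and compares the statistic with tabulated upper 5 percent points of the chi-square distribution. The function name and the choice of a 5 percent level are illustrative assumptions.

```python
# Upper 5 percent points of the chi-square distribution, 1 to 5 degrees of freedom.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def restriction_test(s_restricted, s_unrestricted, n_params, n_free):
    """Return (statistic, degrees of freedom, reject at the 5 percent level)."""
    dof = n_params - n_free
    if dof not in CHI2_CRITICAL_5PCT:
        raise ValueError("critical value not tabulated for this many restrictions")
    statistic = s_restricted - s_unrestricted
    if statistic < 0.0:
        # Can happen only if the two fits used different weighting matrices.
        raise ValueError("negative statistic: use the same weighting matrix for both fits")
    return statistic, dof, statistic > CHI2_CRITICAL_5PCT[dof]


# Hypothetical minimized objective values (not results from the paper).
statistic, dof, reject = restriction_test(112.4, 104.1, n_params=10, n_free=7)
```

The guard against a negative statistic reflects the point made above: the difference is guaranteed to be nonnegative only when the restricted and unrestricted objectives share the same estimated covariance matrices.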
5. Estimation Subject to Inequality Restrictions. The final topic we consider is the estimation of the parameter $\theta$ subject to inequality restrictions. For example, an integrable demand system must obey the condition that the Slutsky matrix of compensated price derivatives is negative semi-definite. The unconstrained estimator $\hat{\theta}$ need not obey these restrictions for finite samples; thus, it may be desirable to impose them. We represent such restrictions formally as:

$$\varphi_m(\theta) \geq 0, \qquad (m = 1, 2, \ldots, M'), \tag{18}$$

where we assume $\varphi_m$ to be twice continuously differentiable in each component of $\theta$.

The inequality constrained estimator $\tilde{\theta}$ minimizes $S(\theta)$ subject to the constraints (18). This estimator corresponds to a saddlepoint of the Lagrangian function

$$L = S(\theta) + \lambda'\varphi, \tag{19}$$

where $\lambda$ is a vector of M' Lagrange multipliers and $\varphi$ is the vector of constraint functions. The Kuhn-Tucker (1951) conditions for a saddlepoint of this Lagrangian are:

$$\nabla_\theta L = \nabla_\theta S(\theta) + \lambda'\,\Phi(\theta) = 0,$$

and the complementary slackness condition:

$$\lambda'\varphi = 0, \qquad \lambda \geq 0,$$

where $\Phi(\theta)$ is the matrix with elements $\{\partial \varphi_m/\partial \theta_j\}$.

To obtain the estimator $\tilde{\theta}$ we begin by linearizing the model as in (13). Next, we linearize the constraints as:

$$\varphi(\tilde{\theta}_{i+1}) = \Phi(\tilde{\theta}_i)\,\Delta\theta_i + \varphi(\tilde{\theta}_i),$$

where $\tilde{\theta}_i$ is the current iteration value of the unknown parameters. We then apply Liew's (1976) inequality constrained linear three stage least squares method to the linear model, obtaining:

$$\Delta\tilde{\theta}_i = \Delta\hat{\theta}_i + (M_{x0} + \bar{M}_x)^{-1}\,\Phi(\tilde{\theta}_i)'\,\lambda^*,$$

where $\Delta\hat{\theta}_i$ is given by (14) and $\lambda^*$ is the solution of the linear complementarity problem:

$$\Phi(\tilde{\theta}_i)\,(M_{x0} + \bar{M}_x)^{-1}\,\Phi(\tilde{\theta}_i)'\,\lambda + \big[\Phi(\tilde{\theta}_i)\,\Delta\hat{\theta}_i + \varphi(\tilde{\theta}_i)\big] \geq 0,$$
$$\big[\Phi(\tilde{\theta}_i)\,(M_{x0} + \bar{M}_x)^{-1}\,\Phi(\tilde{\theta}_i)'\,\lambda + \Phi(\tilde{\theta}_i)\,\Delta\hat{\theta}_i + \varphi(\tilde{\theta}_i)\big]'\lambda = 0, \qquad \lambda \geq 0.$$

Given $\tilde{\theta}_i$ that satisfies the constraints (18), we update to $\tilde{\theta}_{i+1} = \tilde{\theta}_i + \Delta\tilde{\theta}_i$ and check that both $S(\tilde{\theta}_{i+1}) < S(\tilde{\theta}_i)$ and that $\varphi_m(\tilde{\theta}_{i+1}) \geq 0$, m = 1, 2, ..., M'. If not, we shrink the increment vector as before, until either improvement is found or the increment values fall in absolute value below a convergence criterion. This concludes our discussion of the NL3SLS estimator.
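With one parameter and one constraint the linear complementarity problem has a closed-form solution, which makes the constrained update easy to trace. The sketch below is an illustration under that assumption only; the function name, argument names, and numbers are hypothetical, and the scalar `m` stands in for the sum of the cross section and time series moment matrices.

```python
def constrained_increment(delta, m, phi_grad, phi_value):
    """One-parameter, one-constraint version of the constrained update.

    delta     : unconstrained Gauss-Newton increment
    m         : positive curvature term (scalar stand-in for the moment matrices)
    phi_grad  : derivative of the constraint function at the current value
    phi_value : constraint function at the current value (feasible if >= 0)
    Returns (constrained increment, multiplier).
    """
    slack = phi_grad * delta + phi_value      # linearized constraint after the full step
    if slack >= 0.0:
        return delta, 0.0                     # constraint not binding, multiplier is zero
    if phi_grad == 0.0:
        raise ValueError("linearized constraint cannot be satisfied")
    multiplier = -slack * m / (phi_grad ** 2)  # makes the linearized constraint bind
    return delta + phi_grad * multiplier / m, multiplier


# Hypothetical case: constraint theta >= 0, current value 0.2, unconstrained step -0.5.
theta_current = 0.2
increment, multiplier = constrained_increment(delta=-0.5, m=4.0,
                                              phi_grad=1.0, phi_value=theta_current)
theta_next = theta_current + increment
```

Here the unconstrained step would carry the parameter to -0.3, so the multiplier is positive and the step is shortened until the linearized constraint holds with equality. With several constraints the multiplier vector has to be found by a complementarity solver rather than by this closed form.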
6. Conclusion and Applications. In this paper we have discussed the nonlinear three stage least squares method of pooling average time series and cross section data. There are two major advantages of this technique. The first is the identification of parameters and the gains in efficiency in estimation. For example, by pooling average time series and cross section data, models can be estimated that account for a large number of specific demographic effects in consumer behavior in both microeconomic and macroeconomic settings. Such effects are difficult to identify or estimate precisely using aggregate time series data alone. Alternatively, the effects of time varying factors such as price levels that are constant across consumers in each time period may be impossible to identify using only data from a single cross section survey. Both effects can be estimated when cross section observations are pooled with average time series observations.

The second major advantage of the nonlinear three stage least squares technique is ease of computation. While exact aggregation models can allow for substantial nonlinearities in variables representing common influences on behavior as well as in parameters, cross section data are employed in pooled estimation through moment matrices. These matrices can be constructed utilizing only one pass through each cross section data source. This feature substantially reduces the time and expense of performing iterations to estimate a nonlinear model and the cost of estimating several restricted versions of the same model for hypothesis testing.

We have applied the techniques described in this paper to models of aggregate consumer behavior for the United States. Models describing consumer budget allocation among broad commodity classes are presented by Jorgenson, Lau and Stoker (1980, 1981, 1982). These models are estimated from annual time series data from 1958-1974, together with cross section data from 1972. Inequality constrained estimation is required to assure that the resulting Slutsky matrices are negative semi-definite.

A model describing the allocation of total energy expenditures to specific energy types is presented by Jorgenson and Stoker (1983). This model is estimated using annual time series average data from 1958-1978, together with five cross section data bases. Parametric hypothesis tests are performed for separability of preferences and for the possibility of structural change. Finally, Jorgenson, Slesnick, and Stoker (1983) have presented models of two stage budgeting. At the first stage the consumer budget is allocated between energy and nonenergy commodities. At the second stage the energy budget is allocated among types of energy.
APPENDIX: TECHNICAL ASSUMPTIONS

Below we list the assumptions required to establish consistency and asymptotic normality for the NL3SLS estimator. Assumptions 1-5 follow Gallant (1977) and assumptions 6-7 follow Gallant and Jorgenson (1979).

Assumption 1: The parameter space of $\theta$, say $\Theta$, is compact, with the true value $\theta^*$ an interior point.

Assumption 2: The components of $\beta_n(p_t, \theta)$ $(n = 1, 2, \ldots, N)$ are twice continuously differentiable in $\theta$.

For the next two assumptions, we use the notation $\beta_n^{i}(p_t, \theta)$ and $\beta_n^{ij}(p_t, \theta)$ to refer to the vectors:

$$\beta_n^{i}(p_t, \theta) = \frac{\partial \beta_n(p_t, \theta)}{\partial \theta_i}, \qquad \beta_n^{ij}(p_t, \theta) = \frac{\partial^2 \beta_n(p_t, \theta)}{\partial \theta_i\, \partial \theta_j}.$$

Assumption 3A (Cross Section). The matrix $\frac{1}{K'} Z_0'Z_0$ converges to a positive definite matrix as $K' \to \infty$. The Cesaro sums,

$$\frac{1}{K'} \sum_k \big(y_{nkt_0} - x_{kt_0}'\beta_n(p_{t_0}, \theta)\big)\big(y_{jkt_0} - x_{kt_0}'\beta_j(p_{t_0}, \theta)\big),$$
$$\frac{1}{K'} \sum_k z_{kt_0}\big(y_{nkt_0} - x_{kt_0}'\beta_n(p_{t_0}, \theta)\big),$$
$$\frac{1}{K'} \sum_k z_{kt_0}\big(x_{kt_0}'\beta_n^{i}(p_{t_0}, \theta)\big),$$

converge almost surely uniformly in $\theta$ $(n, j = 1, 2, \ldots, N)$. The sums:

$$\frac{1}{K'} \sum_k \sup_\Theta \big|z_{skt_0}\big(x_{kt_0}'\beta_n^{i}(p_{t_0}, \theta)\big)\big|, \qquad \frac{1}{K'} \sum_k \sup_\Theta \big|z_{skt_0}\big(x_{kt_0}'\beta_n^{ij}(p_{t_0}, \theta)\big)\big|$$

are bounded almost surely for all $(n = 1, 2, \ldots, N;\; s = 1, 2, \ldots, S)$, where $z_{skt_0}$ is the sth component of $z_{kt_0}$.

Assumption 3B (Time Series). The matrix $\frac{1}{T} \bar{Z}'\bar{Z}$ converges to a positive definite matrix as $T \to \infty$. The Cesaro sums:

$$\frac{1}{T} \sum_t \big(\bar{y}_{nt} - \bar{x}_t'\beta_n(p_t, \theta)\big)\big(\bar{y}_{jt} - \bar{x}_t'\beta_j(p_t, \theta)\big),$$
$$\frac{1}{T} \sum_t \bar{z}_t\big(\bar{y}_{nt} - \bar{x}_t'\beta_n(p_t, \theta)\big),$$
$$\frac{1}{T} \sum_t \bar{z}_t\big(\bar{x}_t'\beta_n^{i}(p_t, \theta)\big),$$

converge almost surely in $\theta$ $(n, j = 1, 2, \ldots, N)$. The sums,

$$\frac{1}{T} \sum_t \sup_\Theta \big|\bar{z}_{st}\big(\bar{x}_t'\beta_n^{i}(p_t, \theta)\big)\big|, \qquad \frac{1}{T} \sum_t \sup_\Theta \big|\bar{z}_{st}\big(\bar{x}_t'\beta_n^{ij}(p_t, \theta)\big)\big|$$

are bounded almost surely $(n, j = 1, 2, \ldots, N)$, where $\bar{z}_{st}$ is the sth component of $\bar{z}_t$.

Assumption 4: The matrix:

$$\lim_{K', T \to \infty} \frac{1}{K' + T}\,(M_{x0} + \bar{M}_x)$$

is nonsingular, where $M_{x0}$ and $\bar{M}_x$ are defined in equations (14) and (15).

Assumption 5: (Identification). The true value $\theta^*$ of $\theta$ is identified by the instrumental variables $z_{kt_0}$ and $\bar{z}_t$; that is, the only solution of the almost sure limits

$$\lim_{K' \to \infty} \frac{1}{K'} \sum_k z_{kt_0}\big(y_{nkt_0} - x_{kt_0}'\beta_n(p_{t_0}, \theta)\big) = 0, \qquad (n = 1, 2, \ldots, N), \tag{AP1}$$
$$\lim_{T \to \infty} \frac{1}{T} \sum_t \bar{z}_t\big(\bar{y}_{nt} - \bar{x}_t'\beta_n(p_t, \theta)\big) = 0, \qquad (n = 1, 2, \ldots, N), \tag{AP2}$$

is $\theta = \theta^*$.

Assumption 6: (Parameter Restriction). The function $g(\rho)$ is a twice continuously differentiable mapping of a compact set P into the parameter space $\Theta$. There is only one point $\rho^*$ in P which satisfies $g(\rho) = \theta^*$, and $\rho^*$ is an interior point of P. The L x R matrix $G(\rho^*)$ has rank R, where the n, jth element of G is $\partial g_n/\partial \rho_j$, where $g_n$ is the nth component of $g(\rho)$ and $\rho_j$ is the jth component of $\rho$.

Assumption 7: (Normality). The disturbances $\epsilon_{kt}$ and $\nu_t$ are normally distributed for all k and t.
Footnotes

1. For detailed discussion of nonlinear three stage least squares estimators, see Amemiya (1977), Gallant (1977), and Gallant and Jorgenson (1979).

2. The correspondence between individual and aggregate behavior is discussed by Lau (1977, 1982) and Stoker (1982b).

3. An alternative approach to aggregation is based on restrictions on the distribution of the variables $x_{kt}$. See, for example, Stoker (1982a).

4. See, for example, Balestra and Nerlove (1966), Kmenta (1978), and Mundlak (1978). Much of the discussion of the linear model focuses on the stochastic specification rather than the structural model; see, for example, Amemiya (1978). The literature on pooling cross section and average time series data in a linear model has been surveyed by Dielman (1983).

5. This stochastic specification is used in an exact aggregation model by Jorgenson, Lau, and Stoker (1980, 1981, 1982).

6. The exclusion of $\nu_{t_0}$ from the cross section disturbances in Examples 2 and 3 may appear to be somewhat arbitrary. Suppose instead that $\nu_{t_0} + \varepsilon_{kt_0}$ represent the cross section disturbances. The $\nu_{t_0}$ can be estimated as the difference between the estimate of the cross section constant term and the constant term applicable to the time series. Correlation between resulting cross section and time series disturbances is then due only to the $\varepsilon$ terms, so that the effect of the transformation separating the two data sets is negligible.

7. This excludes the possibility that $x_{kt}$ is subject to measurement error; aggregate instruments would be required to deal with errors of measurement.

8. We assume that the variance of the disturbance, conditional on the instrumental variables, is constant for both cross section and average time series models. If this assumption is relaxed, efficiency gains are possible by adjusting the weighting matrix of equations (11) and (12). See White (1980a, 1980b, 1982) and Hansen (1982) for details.

9. The Gauss-Newton method for systems of nonlinear regression equations is discussed by Malinvaud (1980).

10. If the observations are transformed, the transformed data should be used here.

11. Matrix weighted averages are discussed in Chamberlain and Leamer (1976) and Mundlak (1978), among others.

12. This assumes that disturbances in different cross sections are uncorrelated, which requires transformations of the average data only. Overlapping cross sections require panel data techniques that are beyond the scope of this article.
References

Amemiya, T. (1977), "The Maximum Likelihood and Nonlinear Three-Stage Least Squares Estimator in the General Nonlinear Simultaneous Equations Model," Econometrica, Vol. 45, No. 4, May, pp. 955-968.

Amemiya, T. (1978), "A Note on a Random Coefficients Model," International Economic Review, Vol. 19, No. 3, October, pp. 793-796.

Balestra, P. and M. Nerlove (1966), "Pooling Cross Section and Time Series Data in the Estimation of a Dynamic Model: The Demand for Natural Gas," Econometrica, Vol. 34, No. 3, July, pp. 585-612.

Blackorby, C., R. Boyce, and R.R. Russell (1978), "Estimation of Demand Systems Generated by the Gorman Polar Form: A Generalization of the S-Branch Utility Tree," Econometrica, Vol. 46, No. 2, March, pp. 345-364.

Brown, M. and D.M. Heien (1972), "The S-Branch Utility Tree: A Generalization of the Linear Expenditure System," Econometrica, Vol. 40, No. 4, July, pp. 737-747.

Chamberlain, G. and E. Leamer (1976), "Matrix Weighted Averages and Posterior Bounds," Journal of the Royal Statistical Society, B, 38, pp. 73-84.

Deaton, A. and J. Muellbauer (1980a), "An Almost Ideal Demand System," American Economic Review, Vol. 70, No. 3, June, pp. 312-326.

Deaton, A. and J. Muellbauer (1980b), Economics and Consumer Behavior, Cambridge, Cambridge University Press.

Dielman, Terry E. (1983), "Pooled Cross-Sectional and Time Series Data: A Survey of Current Statistical Methodology," American Statistician, Vol. 37, No. 2, May, pp. 111-122.

Gallant, A.R. (1977), "Three-Stage Least-Squares Estimation for a System of Simultaneous, Nonlinear, Implicit Equations," Journal of Econometrics, Vol. 5, No. 1, January, pp. 71-88.

Gallant, A.R. and D.W. Jorgenson (1979), "Statistical Inference for a System of Simultaneous, Nonlinear, Implicit Equations in the Context of Instrumental Variable Estimation," Journal of Econometrics, Vol. 11, No. 2/3, October/December, pp. 275-302.

Hansen, L.P. (1982), "Large Sample Properties of Generalized Methods of Moments Estimators," Econometrica, Vol. 50, No. 4, July, pp. 1029-1054.

Hartley, H.O. (1961), "The Modified Gauss-Newton Method for the Fitting of Non-Linear Regression Functions by Least Squares," Technometrics, 3, pp. 269-280.

Jorgenson, D.W. and J.J. Laffont (1974), "Efficient Estimation of Nonlinear Simultaneous Equations with Additive Disturbances," Annals of Economic and Social Measurement, Vol. 3, No. 4, October, pp. 615-640.

Jorgenson, D.W., L.J. Lau, and T.M. Stoker (1980), "Welfare Comparison under Exact Aggregation," American Economic Review, Vol. 70, No. 2, May, pp. 268-272.

Jorgenson, D.W., L.J. Lau, and T.M. Stoker (1981), "Aggregate Consumer Behavior and Individual Welfare," in D. Currie, R. Nobay and D. Peel (eds.), Macroeconomic Analysis, London, Croom-Helm, pp. 35-61.

Jorgenson, D.W., L.J. Lau, and T.M. Stoker (1982), "The Transcendental Logarithmic Model of Aggregate Consumer Behavior," in R.L. Basmann and G. Rhodes (eds.), Advances in Econometrics, Vol. 1, Greenwich, JAI Press, pp. 97-238.

Jorgenson, D.W., D.T. Slesnick, and T.M. Stoker (1983), "Exact Aggregation over Individuals and Commodities," Discussion Paper 1005, Cambridge, Harvard Institute of Economic Research, August.

Jorgenson, D.W. and T.M. Stoker (1983), "Aggregate Consumer Expenditures on Energy," in J.R. Moroney (ed.), Advances in the Economics of Energy and Resources, Vol. 4, Greenwich, JAI Press, forthcoming.

Klein, L.R. and H. Rubin (1947-1948), "A Constant-Utility Index of the Cost of Living," Review of Economic Studies, Vol. 15(2), No. 38, pp. 84-87.

Kmenta, J. (1978), "Some Problems of Inference from Economic Survey Data," in N.K. Namboodiri (ed.), Survey Sampling and Measurement, New York, Academic Press, pp. 107-120.

Kuhn, H.W. and A.W. Tucker (1951), "Nonlinear Programming," in J. Neyman (ed.), Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, pp. 481-492.

Lau, L.J. (1977), "Existence Conditions for Aggregate Demand Functions," Technical Report No. 248, Institute for Mathematical Studies in the Social Sciences, Stanford University, Stanford, California (revised February 1980).

Lau, L.J. (1982), "A Note on the Fundamental Theorem of Exact Aggregation," Economics Letters, Vol. 9, No. 2, pp. 119-126.

Liew, C.K. (1976), "A Two-Stage Least Squares Estimator with Inequality Restrictions on Parameters," Review of Economics and Statistics, Vol. 58, No. 2, May, pp. 234-238.

Malinvaud, E. (1980), Statistical Methods of Econometrics, 3rd ed., Amsterdam, North-Holland.

Mundlak, Y. (1978), "On the Pooling of Time Series and Cross Section Data," Econometrica, Vol. 46, No. 1, January, pp. 69-86.

Stoker, T.M. (1982a), "The Use of Cross Section Data to Characterize Macro Functions," Journal of the American Statistical Association, June, pp. 369-380.

Stoker, T.M. (1982b), "Completeness, Distribution Restrictions and the Form of Aggregate Functions," M.I.T. Sloan School of Management Working Paper No. 1345-82, August.

Stone, R. (1954), "Linear Expenditure Systems and Demand Analysis: An Application to the Pattern of British Demand," Economic Journal, Vol. 64, No. 255, September, pp. 511-527.

White, H. (1980a), "Nonlinear Regression on Cross-Section Data," Econometrica, Vol. 48, No. 3, April, pp. 721-746.

White, H. (1980b), "A Heteroscedasticity-Consistent Covariance Matrix Estimator with a Direct Test for Heteroscedasticity," Econometrica, Vol. 48, No. 4, May, pp. 817-838.

White, H. (1982), "Instrumental Variables Regression with Independent Observations," Econometrica, Vol. 50, pp. 483-500.

Zellner, A. and H. Theil (1962), "Three-Stage Least Squares: Simultaneous Estimation of Simultaneous Equations," Econometrica, Vol. 30, No. 1, January, pp. 54-78.