Modeling mixed spatial processes and spatio-temporal dynamics in information-theoretic frameworks

advertisement
Modeling mixed spatial processes and
spatio-temporal dynamics in
information-theoretic frameworks
Rosa Bernardini Papalia
Dipartimento di Scienze Statistiche, Università di Bologna,Via Belle Arti, 41 40100 Bologna , Italy
Summary. This work aims at developing an entropy-based estimation strategy of
a dynamic spatial heterogeneous panel data model where separate processes for each
unit are considered. As regards traditional estimation techniques, the formulation
of the constrained maximization problem in the maximum entropy view does not
require: (i) the use of restrictive parametric assumptions on the model; (ii) the
formulation of hypotheses regarding the form of the distribution of the objective
variables. Restrictions expressed in terms of inequality can be introduced and it
is possible to calibrate the precision in the estimation. Good results are produced
in the case of small-sized samples, in the presence of high numbers of explanatory
parameters and variables (highly correlated).
Key words: Dynamic panel data, generalized maximum entropy estimation, spatial
models
1 Introduction
In recent years, there has been a growing interest in the estimation of econometric
relationships based on panel data (Hsiao 1989). In this work we focus on dynamic
models for spatial panels, a family of models for which there is an increasing interest
in estimation problems (Elhorst 2003, Hadinger, Muller, Tondl 2002).
A dynamic spatial panel data model takes the form of a linear equation extended
with a variable intercept, a serially lagged dependent variable and either a spatially
lagged dependent variable (known as spatial lag model) or a spatially autoregressive
process incorporated in the error term (known as spatial error model). With respect
to a standard space-time model (Space-Time Autoregressive /Integrated Moving
Average model, STARMA, STARIMA (Hepple 1978), and the spatial autoregression
space-time forecasting model (Griffith 1996)) which assumes that the spatial units
are completely homogeneous, a panel data approach would presume that spatial
heterogeneity is a feature of data and attempt to model that heterogeneity.
The need to account for spatial heterogeneity is that spatial units are likely to
differ in their background variables, which are usually space-specific time-invariant
1484
Rosa Bernardini Papalia
variables that affect the dependent variable, but are difficult to measure or hard to
obtain. Omission of these variables leads to bias in the resulting estimates. To overcome these problems, one possibility is to introduce a variable intercept representing
the effect of the omitted variables that are peculiar to each spatial unit considered.
More specifically, in the fixed effects model, a dummy variable is introduced for
each spatial unit as a measure of the variable intercept, while, in the random effects
model, the variable intercept is treated as a random variable that is independently
and identically distributed with zero mean. The homogeneity assumptions that are
often imposed on the coefficients of the lagged dependent variables can lead to serious biases when in practice the dynamics are heterogeneous across the cross section
units.
This work aims at developing an entropy- based estimation strategy of an heterogeneous dynamic spatial panel data model. The starting point is a general model
specification which account for both temporal and spatial lagged effects in a panel
data context.
The entropy-based estimation procedure shares some of the characteristics of
Stein Rule estimators and Bayesian approaches to estimation (Zellner, 1996, 1997,
1998). There is now a considerable body of work, which has given an application of
the entropy criterion to a wide class of models (Marsh et al. 1988, 2003; Golan et al.
1996; Kullback 1959). As regards traditional estimation techniques, the formulation
of the constrained maximization problem in the maximum entropy view does not
require: (i) the use of restrictive parametric assumptions on the model; (ii) the
formulation of hypotheses regarding the form of the distribution of the objective
variables. Restrictions expressed in terms of inequality can be introduced and it
is possible to calibrate the precision in the estimation. Good results are produced
in the case of small-sized samples, in the presence of high numbers of explanatory
parameters and variables (highly correlated).
The paper is organized as follows. Section 2 introduces the basic model introducing the typical assumptions of an information theoretic framework. More specifically,
section 2.2. develops the maximum entropy based formulation for alternative spatial
error specifications. Finally, section 3 concludes and lists some potential advantages
and investigations of the proposed approach.
2 The information-theoretic framework
2.1 The basic problem
Following the heterogeneous panel approach proposed by Pesaran and Smith (1995),
we take into account the possibility of cross-sectional correlation by treating individual relationships as a system of seemingly unrelated regression equations.
In the context of this work, the general specification of the model for cross section unit i (i=1,..,N) and t=1,..,T time periods which involves specifying a different
intercept coefficient for each cross-sectional unit (equation) is:
Yit = ρi w1 Yit−1 + Xit βi + εit
(1)
We start by considering the basic model in vector form for a cross-section unit i:
Yi = ρi w1 Yt−1 + Xi βi + εi
Modeling mixed spatial processes and spatio-temporal dynamics
εi = (I − λi w2 )−1 ui
1485
(2)
where Yi is of dimension (Tx1), Xi is (TxKi ) and βi is (Ki x1).
The resulting complete model, expressed as a system of seemingly unrelated
system of equations, can be written as:
−1
Y = ρW1 Y +
2 Xβ3+ (I −2λW2 ) U
y1
X1 0 ..
6 ... 7
6 0 X2 ..
6
7
6
X≡4
where Y ≡ 4
..
..
... 5
0 0 ..
yN
2
β1
3
2
ρ1
3
2
u1
3
0
0 7
7
.. 5
XN
3
2
λ1
3
(3)
6 ... 7
6 ... 7
6 ... 7
6 ... 7
7
6
7
6
7
6
7
β≡6
4 ... 5 ρ ≡ 4 ... 5 U ≡ 4 ... 5 λ ≡ 4 ... 5
βN
ρN
uN
λN
where Y is a vector of dimension (NTx1), X is a block diagonal matrix of dimension
(NTxK) with K=ΣKi , and β=(β1 , β2 ,.. βN )’, λ=( λ1 , λ2 ,.., λN )’ ρ=( ρ1 , ρ2 ,..,
ρN )’ are unknown vectors of dimension (NKx1) (Nx1) and (Nx1), respectively. This
model specification assumes that spatial effects are not identical across spatial units;
by varying spatial residual correlation coefficients across units it takes into account
jointly structural instability and differentiated spatial effects within and between
spatial units.
W1 is a (TNxTN) block diagonal matrix which expresses for each observation
(row) those units (columns) that belong to its neighbourhood set as non-zero elements, that is: for pairs of units (i,j), wij 6=0 for ‘neighbours’ and wij =0 for others.
It is common practice to derive spatial weights from the location and spatial arrangements of observation by means of a geographic information system. In this
case, units are defined ‘neighbours’ when they are within a given distance of each
other, ie wij =1 for dij 6 δ and i6=j, where dij is the great circle distance chosen,
and δ is the critical cut-off value. More specifically, a spatial weights matrix W* is
defined as follow:
∗
wij
=
8
< 0 if
i=j
1 if dij 6 δ, i 6= j .
:
0 if dij > δ, i 6= j
(4)
and the elements of the row-standardized spatial weights matrix W (with elements
of a row sum to one) result:
wij =
∗
wij
N
P
j=1
, i, j = 1, .., N.
(5)
∗
wij
W2 is the (TNxTN) block diagonal matrix with T copies of the (NxN) spatial weight
matrix, and U is a (TNx1) vector of iid errors with variance σu2 .
The specification in the form of a nonlinear simultaneous equation model result:
Y = (ρW1 + λW2 − (λW2 )(ρW1 )) Y + (I − λW2 ) Xβ + u
(6)
Following Theil (1971) and Zellner (1998), from the joint spatial model (6) and from
substitution of the reduced form equation:
1486
Rosa Bernardini Papalia
Y = Sπ + τ
(7)
for Y, the resulting joint spatial model is derived:
Y = (ρw1 + λw2 − (λw2 )(ρw1 )) (Sπ) + (I − λw2 ) Xβ + u∗
(8)
where u* is a (NTx1) vector of appropriately transformed residuals, S is a (NxL)
matrix of instrumental variables, π is a (Lx1) vector of unknown parameters, and τ
is a (Nx1) vector of reduced form residuals.
The basic generalized maximum entropy (GME) formulation we develop refer to
a model specification where a constant spatial structure for error variance in each
equation is assumed so that the disturbance vector is assumed to have a zero mean
and a non-diagonal covariance matrix:
Φ = Σ ⊗ IT
(9)
where Σ is an (NxN) positive definite symmetric matrix,⊗is the Kronecker product
operator and IT is an identity matrix of dimension T.
The overall variance-covariance matrix takes the form:
Ψ = σu2 A′ A
−1
(10)
,
where:
ε = A−1 u, A = (IN − λW2 ) .
(11)
Assuming different error variances for error variance in each equation we have:
′
E uu
2
3
σu2 1 I1 ..
0
.. 5 .
= Ω = 4 .. ..
0 .. σu2 N IN
(12)
Under the GME framework we recover simultaneously the unknown parameters,the
unknown errors by defining an inverse problem, which is based only on indirect, partial or incomplete information. The objective is to recover probability distributions
for unknown parameters and errors.
Each parameter is treated as a discrete random variable with a compact support
Z and M possible outcomes, 2 6M 6 ∞. The uncertainty about the outcome of the
error process is represented by treating each error as a finite and discrete random
variable with Jpossible outcomes, 26J6 ∞. To this end, we start by choosing a set
of discrete points, the support space V =[v1 , v2 ,. . .,v J ]′ of dimension J>2, that are
at uniform intervals and symmetric around zero. Each error term has corresponding unknown P
weightsri =[ri1 , ri2 . . .,r iJ ]′ that have the properties of probabilities
06r ij 61 and j rij = 1.
Re-parameterizing the set of equations (7) and (8), so that β=Zp β and π=Zp π ,
ρ=Zp ρ ,λ =Zp λ , u*=Vr u∗ and τ =Vr τ yields:
λ λ
λ λ
ρ ρ
π π
Y = (Z ρ pρ ) w1+ Z
p w2 −
Z p w2 (Z p ) w1 (S (Z p ))
+ I − Z λ pλ w2 X Z β pβ + (Z u∗ pu∗ )
(13)
Y = S (Z π pπ ) + (Z τ pτ )
(14)
β π ρ λ
u∗ τ
where p=vec(p p p p ) and r=vec(p p ) are (KM+LM+2M)x1 and (2NJ)x1 vectors of unknown parameters and errors, respectively.
Modeling mixed spatial processes and spatio-temporal dynamics
1487
2.2 Estimation issues
Given the data consistency (13) and (14), and the covariance’s relationship (9) the
GME objective function relative to our formulation problem may be formulated as:
max H(p, r) = −p′ ln p − r ′ ln r
(15)
pi ,ri
subject to:
i) data consistency conditions:
λ λ
λ λ
w2 (Z ρ pρ ) w1 (S (Z π pπ ))
Y = (Z ρ pρ ) w1+ Z
2 − Z p
p w
λ λ
β β
u∗ u∗
+ I − Z p w2 X Z p + (Z p )
(16)
Y = S (Z π pπ ) + (Z τ pτ )
(17)
ii) adding-up constraints:
1′ pβk = 1 ∀k; 1′ pπk = 1 ∀k;
1′ pρ = 1 ; 1′ pλ = 1;
′ τ
1′ pu∗
i = 1 ∀i; 1 pi = 1 ∀i;
(18)
where p=vec(p β pπ pρ pλ ) and r=vec(p u∗ pτ ) are (KM+LM+2M)x1 and (2NJ)x1 vectors of unknown parameters and errors, respectively.
If the correlation between time periods varies with time, the error covariance
that results is more general than the block diagonal structure we have considered.
In such a cases it is possible to introduce different covariances for each period by
adding a new set of constraints in the optimization problem.
It is important to point out some advantages of our GME model specification.
First, by assuming heterogeneity of parameters across units it is possible to analyze the spatial patterns in the estimated coefficients. Second, by using the GME
procedure it is possible to derive an estimator even if the number of time periods involved, T, is less than the number of observations N and the corresponding
variance-covariance matrix for errors is singular.
3 Remarks and conclusions
In this paper we derived a GME estimation procedure for heterogeneous dynamic
spatial panel data model. We generalized the estimator proposed by Marsh, Mittelhammer and Cardell (1988) to include first spatial autoregressive processes in either
the dependent variable and residuals by taking into account jointly structural instability and differentiated spatial effects between and within spatial cross sectional
units. Our model specification allows for much more flexibility of the coefficients
across units than do traditional models and also contributes: (i) to avoid the simultaneity biases that arise from constraining the coefficients on the lagged dependent
variable to be constant across units, and (ii) to provide a diagnostic tool to investigate the presence of some type of heterogeneity in panel data sets. Monte Carlo
experiments were conducted for small and medium sized samples and estimators
compared using the mean squared error loss of the regression coefficients.
1488
Rosa Bernardini Papalia
Several advantages of the GME estimation procedure can be pointed out. The
maximum entropy-based estimator is most efficient relative to traditional estimators in particular when data constraints for each observation are included in the
maximum entropy-based problem formulations. This procedure is able to produce
consistent estimates in models where the number of parameters exceeds the number of data points and in models characterized by a non-scalar identity covariance
matrix. Prior information can be introduced by adding suitable constraints in the
formulation without imposing strong distributional assumptions. Further investigation of GME estimators for dynamic spatial panel data models could yield useful
results (i) to applied analyses of socio and economic phenomena, and (ii) to spacetime problems under complex and nonrandom sample designs.
References
[Cha84]
Chamberlain G (1984) Panel Data. In: Griliches S, Intriligator MD (eds)
Handbook of Econometrics. North-Holland, Amsterdam, pp 1247-1318
[Elh01]
Elhorst JP (2001) Dynamic Models in Space and Time. Geographical
Analysis 33: 119-140
[GJM96] Golan A, Judge G, Miller D (1996) Maximum entropy econometrics: robust estimation with limited data. Wiley New York
[Gri96]
Griffith DA (1996) Computational simplifications for Space-Time Forecasting within GIS: The Neighbourhood Spatial Forecasting Models. In:
Longley P, Batty M (eds) Spatial Analysis: Modelling in a GIS Environment. Cambridge
[Hep78] Hepple LW (1978) The Econometric Specification and Estimation of
Spatio-Temporal Models. In: Carlstein T, Parkes D, Carlstein NT, Parkes
D (eds) Thrift Time and Regional Dynamics. London, Edward Arnold
[Hsi89]
Hsiao C (1989) Modeling Ontario Regional Electricity System Demand
Using a Mixed Fixed and Random Coefficient Approach. Regional Science
and Urban Economics 19
[KR92]
Keane M, Runkle D (1992) On the Estimation of Panel Data with Serial Correlation when Instruments are not Strictly Exogenous. Journal of
Business and Economic Statistics 10(1): 1-9
[Kul59]
Kullback (1959) Information theory and statistics. Wiley New York.
[LZ86]
Liang KY, Zeger SL (1986) Longitudinal Data Analysis using Generalised
Linear Models, Biometrika 73(1): 13-22
[MM03] Marsh L, Mittlehammer RC (2003) Generalized maximum entropy estimation of spatial autoregressive models. Working paper, Kansan State
University
[MMC88] Marsh L, Mittlehammer RC, Cardell S (1988) A Structural-Equation
GME Estimator. Selected Paper for the AAEA Annual Meeting, 1998,
Salt Lake City
[Mun78] Mundlak Y (1978) On the Pooling of Time Series and Cross-Section Data.
Econometrica 36: 69-85
[PS95]
Pesaran MH, Smith R (1995) Estimating long-run relationships from dynamic heterogeneous panels. Journal of Econometrics 68: 79-113
[The71] Theil H (1971) Principles of Econometrics. John Wiley &Sons New York.
Modeling mixed spatial processes and spatio-temporal dynamics
[Zel96]
[Zel97]
[Zel98]
1489
Zellner A (1996) Models, prior information, and Bayesian Analysis. Journal of Econometrics 75: 51-68
Zellner A (1997) A Bayesian Method of Moments (BMOM): Theory and
Applications. In: Formby TB, Hill RC (eds) Advances in Econometrics.
Applying Maximum Entropy to Econometric Problems. JAI Press, Grenwich, pp 85-105
Zellner A (1998) The finite sample properties of simultaneous equations’estimates and estimators Bayesian and Non-Bayesian Approaches.
Journal of Econometrics 83: 185-212
Download