Bayesian analysis of autoregressive panel data model: a simulation study
Fabyano Fonseca e Silva¹, Thelma Sáfadi², Joel Augusto Muniz², Luis Henrique de Aquino²
¹ Professor Adjunto, Dep. Informática, Setor de Estatística, Universidade Federal de Viçosa. fabyano@dpi.ufv.br
² Professor(a), Departamento de Ciências Exatas, Universidade Federal de Lavras.
INTRODUCTION
Panel data typically refer to data containing time series observations of a number of individuals. Observations in panel data therefore involve at least two dimensions: a cross-sectional dimension, indicated by subscript i, and a time series dimension, indicated by subscript t. These data consist of several time series generated by the same type of model, for example autoregressive (AR), moving average (MA), autoregressive integrated moving average (ARIMA), and other more complex models. The key advantage of simultaneously modeling several series is the possibility of pooling information from all series: more accurate predictions for individual outcomes can be generated by pooling the data than by using only the data on the individual in question. The pooling takes place because the parameters of the time series model are assumed to arise from the same distribution (Liu, 1980).
The specification of this distribution supports the claim that the Bayesian procedure has a theoretical advantage over the classical procedure, independent of convenience: the classical perspective focuses on the sampling distribution of an estimator, while the Bayesian procedure provides exact information about the distribution of the parameters.
In Bayesian analyses of AR(p) models, usually only an approximate likelihood function is attempted, because the unconditional (exact) function does not provide conditional distributions with closed form, leading to a more complex estimation process. In a panel data study, however, conditioning on the initial observations, defined by the order p of each series, represents a larger loss of information. Therefore, even though it increases the complexity of the Bayesian analysis, it is important to take this characteristic into account.
The key element of Bayesian analysis is the choice of prior. A commonly used informative prior for the parameters of autoregressive models is the multivariate normal distribution (Ni and Sun, 2003), but others can be used, for example the multivariate Student's t (Barreto and Andrade, 2004) and independent rescaled beta distributions (Liu, 1980). The choice of prior distribution is thus crucial, and specific methods, for example the Bayes factor (Gelfand, 1996), have been used to compare candidate priors.
In the present study we propose a full Bayesian analysis of an autoregressive, AR(p), panel data model. The methodology considers the exact likelihood function, a comparative analysis of prior distributions, and predictive distributions of future observations.
METHODOLOGY
The autoregressive model describing this situation is presented by Liu (1980):

$y_{it} = \phi_{i1} y_{i(t-1)} + \phi_{i2} y_{i(t-2)} + \cdots + \phi_{ip} y_{i(t-p)} + e_{it},$

where $y_{it}$ is the current value of the stochastic process, whose past values are given by $y_{i(t-1)}, y_{i(t-2)}, \dots, y_{i(t-p)}$; $\phi_{i1}, \phi_{i2}, \dots, \phi_{ip}$ are the autoregressive coefficients for each individual, and $e_{it}$ is an error term, $e_{it} \overset{iid}{\sim} N(0, \sigma_e^2)$.
The likelihood function, presented in matrix form, is:

$L(Y \mid \Phi, \sigma_e^2) \propto \pi(\Phi, \sigma_e^2 \mid Y_p)\,(\sigma_e^2)^{-\frac{m(n-p)}{2}} \exp\left\{-\frac{1}{2\sigma_e^2}(Y_1 - X\Phi)'(Y_1 - X\Phi)\right\},$ where:

$\pi(\Phi, \sigma_e^2 \mid Y_p) \propto (\sigma_e^2)^{-\frac{mp}{2}}\,|V_p|^{\frac{1}{2}} \exp\left\{-\frac{1}{2\sigma_e^2} Y_p' V_p Y_p\right\},$

$Y_p = [y_{11}, y_{12}, \dots, y_{1p}, y_{21}, y_{22}, \dots, y_{2p}, \dots, y_{m1}, y_{m2}, \dots, y_{mp}]',$

$Y_1 = [y_{1,p+1}, y_{1,p+2}, \dots, y_{1n}, y_{2,p+1}, y_{2,p+2}, \dots, y_{2n}, \dots, y_{m,p+1}, y_{m,p+2}, \dots, y_{mn}]'_{m(n-p) \times 1},$

$X = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_m \end{bmatrix}_{m(n-p) \times mp}, \quad X_i = \begin{bmatrix} y_{ip} & \cdots & y_{i1} \\ y_{i,p+1} & \cdots & y_{i2} \\ \vdots & & \vdots \\ y_{i,n-1} & \cdots & y_{i,n-p} \end{bmatrix}_{(n-p) \times p},$ and

$\Phi = [\phi_{11}, \phi_{12}, \dots, \phi_{1p}, \phi_{21}, \phi_{22}, \dots, \phi_{2p}, \dots, \phi_{m1}, \phi_{m2}, \dots, \phi_{mp}]'_{mp \times 1} \in \mathbb{R}^{mp}.$
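For concreteness, the construction of these stacked vectors and the block diagonal design matrix can be sketched in R as follows. This is only an illustrative sketch, not the original implementation; the names (build_design, y) are assumptions, with y an m x n matrix whose row i holds the series of individual i.

    # Build Y_p, Y_1 and the block diagonal X for an AR(p) panel model.
    build_design <- function(y, p) {
      m <- nrow(y); n <- ncol(y)
      Yp <- as.vector(t(y[, 1:p, drop = FALSE]))        # first p observations of each series, stacked
      Y1 <- as.vector(t(y[, (p + 1):n, drop = FALSE]))  # remaining observations, stacked
      X  <- matrix(0, m * (n - p), m * p)               # block diagonal design matrix
      for (i in 1:m) {
        # column j of X_i holds the lag-j values y_{i,p+1-j}, ..., y_{i,n-j}
        Xi <- sapply(1:p, function(j) y[i, (p + 1 - j):(n - j)])
        rows <- ((i - 1) * (n - p) + 1):(i * (n - p))
        cols <- ((i - 1) * p + 1):(i * p)
        X[rows, cols] <- Xi
      }
      list(Yp = Yp, Y1 = Y1, X = X)
    }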
The matrix $V_p$ is obtained from the Yule-Walker equations (Box, Jenkins and Reinsel, 1994). In the present work we generalized this matrix for panel data using a block diagonal structure, illustrated here for AR(2) models:
$V_p = \begin{bmatrix} 1-\phi_{12}^2 & -\phi_{11}(1+\phi_{12}) & 0 & 0 & \cdots & 0 & 0 \\ -\phi_{11}(1+\phi_{12}) & 1-\phi_{12}^2 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1-\phi_{22}^2 & -\phi_{21}(1+\phi_{22}) & \cdots & 0 & 0 \\ 0 & 0 & -\phi_{21}(1+\phi_{22}) & 1-\phi_{22}^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1-\phi_{m2}^2 & -\phi_{m1}(1+\phi_{m2}) \\ 0 & 0 & 0 & 0 & \cdots & -\phi_{m1}(1+\phi_{m2}) & 1-\phi_{m2}^2 \end{bmatrix}.$
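In R, this block diagonal matrix can be assembled with a short loop. Below is a minimal sketch under the same illustrative naming, where phi is an m x 2 matrix with row i equal to ($\phi_{i1}$, $\phi_{i2}$):

    # Block diagonal V_p for m AR(2) series; phi[i, ] = c(phi_i1, phi_i2).
    build_Vp <- function(phi) {
      m <- nrow(phi)
      Vp <- matrix(0, 2 * m, 2 * m)
      for (i in 1:m) {
        block <- matrix(c(1 - phi[i, 2]^2,              -phi[i, 1] * (1 + phi[i, 2]),
                          -phi[i, 1] * (1 + phi[i, 2]),  1 - phi[i, 2]^2),
                        nrow = 2, byrow = TRUE)
        idx <- (2 * i - 1):(2 * i)
        Vp[idx, idx] <- block
      }
      Vp
    }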
In this study, the hierarchical multivariate Normal – Inverse Gamma prior (model 1) and the independent multivariate Student's t – Inverse Gamma prior (model 2) were compared.
The conditional posterior distributions for model 1 are given by:

$\Phi \mid \sigma_e^2, Y \sim \pi(\Phi, \sigma_e^2 \mid Y_p) \times \text{multivariate Normal}\left(\hat{\Phi}, \sigma_e^2 (X'X)^{-1}\right), \quad \hat{\Phi} = (X'X)^{-1}(X'Y_1),$

$\sigma_e^2 \mid \Phi, Y \sim \text{Inverse Gamma}\left(\frac{mp + mn + 2\alpha}{2},\; \frac{1}{2} Y_p' V_p Y_p + D + \frac{1}{2}(\Phi - \hat{\Phi})'(\Phi - \hat{\Phi})\right).$
The conditional posterior distributions for model 2 are given by:

$\Phi \mid \sigma_e^2, Y \sim \pi(\Phi, \sigma_e^2 \mid Y_p) \times \text{mult. Normal}\left(\hat{\Phi}, \sigma_e^2 (X'X)^{-1}\right) \times \text{mult. t-Student}(\mu, P^{-1}),$

$\sigma_e^2 \mid \Phi, Y \sim \text{Inverse Gamma}\left(\frac{mn + 2\alpha}{2},\; \frac{1}{2} Y_p' V_p Y_p + \beta + \frac{1}{2}\left[(\Phi - \hat{\Phi})'(X'X)(\Phi - \hat{\Phi}) + (Y_1 - \hat{Y}_1)'(Y_1 - \hat{Y}_1)\right]\right),$

$\hat{Y}_1 = X\hat{\Phi} = X(X'X)^{-1}(X'Y_1).$
For each prior considered, one chain with starting values obtained from maximum likelihood estimation was run. After several trials, the length of each chain was set to 50,000 iterations. The burn-in period was 20,000 iterations, higher than the minimum burn-in required according to the method of Raftery & Lewis (1992), and convergence was tested using the Gelman & Rubin (1992) criterion. The Gibbs sampler and Metropolis-Hastings algorithms were implemented in the free software R (R Development Core Team, 2006) using matrix language.
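The overall structure of the sampler can be sketched as below. This is a simplified skeleton with illustrative names (run_mcmc, log_post, shape_fun, scale_fun), not the original code: because the exact-likelihood factor $\pi(\Phi, \sigma_e^2 \mid Y_p)$ removes the closed form of the conditional of $\Phi$, that block is updated with a Metropolis-Hastings step, while $\sigma_e^2$ retains a closed-form inverse gamma conditional and is updated by a Gibbs step.

    # Skeleton of the MCMC scheme: MH update for Phi, Gibbs update for sigma2.
    # log_post(Phi, sigma2) must return the log of the joint posterior kernel.
    run_mcmc <- function(Phi0, sigma20, log_post, shape_fun, scale_fun,
                         Q = 50000, step = 0.05) {
      Phi <- Phi0; sigma2 <- sigma20
      draws <- vector("list", Q)
      for (q in 1:Q) {
        # Metropolis-Hastings step: random walk normal proposal for Phi
        Phi_prop <- Phi + rnorm(length(Phi), 0, step)
        if (log(runif(1)) < log_post(Phi_prop, sigma2) - log_post(Phi, sigma2))
          Phi <- Phi_prop
        # Gibbs step: inverse gamma draw for sigma2 via 1 / rgamma
        sigma2 <- 1 / rgamma(1, shape = shape_fun(Phi), rate = scale_fun(Phi))
        draws[[q]] <- list(Phi = Phi, sigma2 = sigma2)
      }
      draws[-(1:20000)]   # discard the burn-in iterations
    }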
In the present panel data situation, for a specific individual i, the predictive distribution for one future observation is given by:

$P(Y_{(n+1)} \mid Y) = \int\!\!\int (\sigma_e^2)^{-\frac{m}{2}} \exp\left\{-\frac{1}{2\sigma_e^2}\left[Y_{(n+1)} - X\Phi\right]'\left[Y_{(n+1)} - X\Phi\right]\right\} P(\Phi, \sigma_e^2 \mid Y)\, d\Phi\, d\sigma_e^2.$
This integral has no analytical solution but, in agreement with Heckman & Leamer (2001), an approximation can be obtained via the MCMC algorithm through the distribution $Y_{(n+1)}^{(q)} \mid Y \sim N\left(X\Phi^{(q)}, \sigma_e^{2(q)} I\right)$, where I is the mp x mp identity matrix. The set of values coming from each MCMC iteration q constitutes a sample from the posterior predictive distribution of the future observation. The point estimate of these values, given by the mean of the samples, is $\hat{P}(Y_{(n+1)} \mid Y)$.
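Continuing the illustrative skeleton above, the posterior predictive sample and its summaries could be computed as follows, where Xstar is an assumed design block built from the last p observations of each series:

    # Posterior predictive draws for one future observation per individual.
    predict_next <- function(Xstar, draws) {
      pred <- sapply(draws, function(d) {
        mu <- as.vector(Xstar %*% d$Phi)
        mu + rnorm(length(mu), 0, sqrt(d$sigma2))  # Y(n+1)^(q) ~ N(X Phi^(q), sigma2^(q) I)
      })
      list(mean = rowMeans(pred),                            # point estimates
           ci = apply(pred, 1, quantile, c(0.025, 0.975)))   # 95% credibility intervals
    }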
To compare the priors, the Bayes factor (BF) was used under the approach presented by Barreto & Andrade (2004), which uses the MCMC sample to obtain the normalizing factor $P(Y \mid M_p)$ for a specific prior p. The Bayes factor expression is given by:

$BF_{12} = \frac{\hat{P}(Y \mid M_1)}{\hat{P}(Y \mid M_2)} = \frac{\frac{1}{Q}\sum_{q=1}^{Q} L(Y \mid \theta^{(q)}, M_1)}{\frac{1}{Q}\sum_{q=1}^{Q} L(Y \mid \theta^{(q)}, M_2)},$

where $\theta^{(q)}$ denotes the values generated in the qth iteration (q = 1, 2, ..., Q) for each compared prior. Index 1 refers to model 1 and index 2 to model 2. The term $L(Y \mid \theta^{(q)}, M_p)$ corresponds to the likelihood function values obtained by substituting the MCMC estimates for the parameters.
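A minimal sketch of this estimator, assuming the likelihood values at the MCMC draws of each model are stored on the log scale (loglik1 and loglik2 are illustrative names); the log-sum-exp trick is used only to avoid numerical underflow when averaging:

    # Average likelihoods over MCMC draws on the log scale.
    log_mean_exp <- function(loglik) {
      mx <- max(loglik)
      mx + log(mean(exp(loglik - mx)))
    }
    # BF12: ratio of the averaged likelihood values of models 1 and 2.
    bayes_factor_12 <- function(loglik1, loglik2) {
      exp(log_mean_exp(loglik1) - log_mean_exp(loglik2))
    }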
A simulation study was conducted to evaluate the proposed methodology. The AR(2) model was used because it is the simplest multiparameter autoregressive approach. It is given by:

$Y_{it} = \phi_{i1} Y_{i(t-1)} + \phi_{i2} Y_{i(t-2)} + e_{it}; \quad i = 1, 2, \dots, 10; \; t = 1, 2, \dots, 12; \; \Phi_i = [\phi_{i1}, \phi_{i2}]' \in \mathbb{R}^2,$

subject to the stationarity conditions $\phi_{i1} + \phi_{i2} < 1$, $\phi_{i2} - \phi_{i1} < 1$ and $-1 < \phi_{i2} < 1$.
The parameter values, $\phi_{i1}$ and $\phi_{i2}$, were generated from multivariate normal (model 1) and multivariate Student's t (model 2) distributions:

$\Phi \sim N\left(\begin{bmatrix} 0.5 \\ -0.5 \end{bmatrix}, \begin{bmatrix} 0.025 & 0 \\ 0 & 0.010 \end{bmatrix}\right)$ and $\Phi \sim \text{t-Student}\left(\begin{bmatrix} 0.5 \\ -0.5 \end{bmatrix}, \begin{bmatrix} 0.025 & 0 \\ 0 & 0.010 \end{bmatrix}, gl = m(n-1)\right).$

The residual distribution was normal, $e_{it} \sim N(0, \sigma_e^2)$, where $\sigma_e^2$ is an inverse gamma random number, $\sigma_e^2 \sim IG(3, 2)$.
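The data generating step can be sketched in R as below (a sketch, not the original code; it uses MASS::mvrnorm for the multivariate normal draws, and coefficient pairs outside the stationarity region are simply redrawn). For model 2, the multivariate Student's t draw could be obtained analogously, for example with mvtnorm::rmvt.

    library(MASS)   # for mvrnorm

    # Simulate m = 10 AR(2) series of length n = 12 (model 1 generation scheme).
    set.seed(1)
    m <- 10; n <- 12; p <- 2
    mu     <- c(0.5, -0.5)
    Sigma  <- diag(c(0.025, 0.010))
    sigma2 <- 1 / rgamma(1, shape = 3, rate = 2)   # sigma_e^2 ~ IG(3, 2)

    stationary <- function(phi)                    # AR(2) stationarity triangle
      sum(phi) < 1 && phi[2] - phi[1] < 1 && abs(phi[2]) < 1

    y   <- matrix(0, m, n)                         # first p values fixed at 0 for simplicity
    phi <- matrix(0, m, p)
    for (i in 1:m) {
      repeat {                                     # redraw until stationary
        phi[i, ] <- mvrnorm(1, mu, Sigma)
        if (stationary(phi[i, ])) break
      }
      for (t in (p + 1):n)
        y[i, t] <- phi[i, 1] * y[i, t - 1] + phi[i, 2] * y[i, t - 2] +
                   rnorm(1, 0, sqrt(sigma2))
    }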
This simulation study also provides a way of evaluating predictive capacity, which is verified by excluding the last observation ($Y_{i,12}$). The predicted values ($\hat{Y}_{i,12}$) can then be compared with the true values.
RESULTS AND DISCUSSION
Table 1. Models considered in the simulation and comparison criterion given by the Bayes factor (BF).

Models                                                    Criteria
Hierarchical multivariate Normal –
Inverse Gamma prior (Model 1)                             BF12 = 1,336,129 / 5,133,331 = 0.2602
Independent multivariate Student's t –
Inverse Gamma prior (Model 2)                             BF21 = 2,973,740.179 / 474,798 = 6.263167
Table 2. Last observation true values (y12), posterior mean estimates (ŷ12) and 95% credibility intervals (LL and UL)*.

                   Model 1                           Model 2
Series    y12    ŷ12     LL      UL         y12     ŷ12     LL      UL
1         0.69   0.43    0.12    0.74       0.60    0.46    0.24    0.68
2         0.28   0.12   -0.13    0.46      -1.58   -1.17   -1.63   -0.91
3         0.42   0.25    0.06    0.44      -0.56    0.02   -0.44    0.48
4         0.66   1.04    0.69    1.39       1.49    1.92    1.35    2.49
5         1.18   1.36    1.13    1.59      -0.70   -0.92   -1.24   -0.60
6         1.18   1.41    0.89    1.93       0.03    0.14   -0.19    0.47
7         1.43   1.02    0.75    1.29      -0.54    0.06   -0.46    0.51
8        -0.55  -0.08   -0.42    0.26      -0.94   -0.69   -1.00   -0.38
9         1.27   0.97    0.65    1.30      -0.04    0.13   -0.06    0.33
10        0.63   1.25    0.95    1.55      -0.96   -1.19   -1.54   -0.84
* Lower Limit (LL) and Upper Limit (UL)
Table 1 shows the superiority of model 2, even when the data were generated using model 1. In general, the literature has reported the good quality of the Student's t prior for the parameters of autoregressive time series models; among these reports, Barreto & Andrade (2004) can be cited, who showed its greater robustness.
Table 2 makes it possible to compare the predictive capacity of the models, because the true value of the last observation of each time series, which was deleted in the analysis process, is known. For model 1, the credibility intervals contain the true values in 60% of the cases, while for model 2 this quantity is 80%. Thus, the joint evaluation of the two models produces an efficiency of 70%. This efficiency is similar to that of other studies that used the same form of evaluating model predictive capacity (de Alba, 1993; Hay & Pettitt, 2001).
REFERENCES
BARRETO, G.; ANDRADE, M. G. Robust Bayesian approach for AR(p) models applied to streamflow forecasting. Journal of Applied Statistical Science, New York, v.12, n.3, p.269-292, Mar. 2004.
BOX, G. E. P.; JENKINS, G. M.; REINSEL, G. C. Time Series Analysis: Forecasting and Control. 3.ed. San Francisco, USA: Holden-Day, 1994. 500p.
de ALBA, E. Constrained forecasting in autoregressive time series models: a Bayesian analysis. International Journal of Forecasting, New York, v.9, n.1, p.95-108, Apr. 1993.
RAFTERY, A. E.; LEWIS, S. M. How many iterations in the Gibbs sampler? In: BERNARDO, J. M. et al. (Eds.). Bayesian Statistics 4. Oxford, USA: Oxford University Press, 1992. p.763-773.