GTVD : A RATS procedure to estimate Smooth
Transition Regressions with Panel Data
G. Colletaz
Laboratoire d’Economie d’Orléans, Université d’Orléans, France
email: gilbert.colletaz@univ-orleans.fr
Thanks to Christophe Hurlin for numerous discussions. Interested Matlab users
may ask him for a practically equivalent program which runs under this software.
Abstract
GTVD.SRC can be used with the RATS software to estimate Smooth Transition
Regressions with Panel Data following the recommendations given by González,
Teräsvirta and van Dijk (2004). This paper presents the model, the estimation
and testing schemes, and documents the various options currently available.
Chapter 1
The Panel Smooth Transition Regression model
1.1 The estimated structure
The basic PSTR model is of the form:
$$y_{it} = \mu_i + \beta_0' x_{it} + \beta_1' x_{it}\, g(q_{it}; \gamma; c) + u_{it} \tag{1.1}$$
for $i = 1, \dots, N$ and $t = 1, \dots, T$. Here $y_{it}$ is a scalar, $\mu_i$ is an unobservable time-invariant regressor, $x_{it}$ is a $k$-dimensional vector of time-varying exogenous variables, $q_{it}$ is an observable transition variable and $u_{it}$ are the residuals.
$g(\cdot)$ is a continuous bounded function of $q_{it}$ defined by:
$$g(q_{it}; \gamma; c) = \Big(1 + \exp\Big(-\gamma \prod_{j=1}^{m} (q_{it} - c_j)\Big)\Big)^{-1}, \quad \gamma > 0,\ c_1 \le \dots \le c_m \tag{1.2}$$
where $c = (c_1, \dots, c_m)'$ is an $m$-dimensional vector of location parameters, $\gamma$ is
the slope of the transition function, and $\gamma > 0$, $c_1 \le \dots \le c_m$ are identification
restrictions.
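As an illustration only, the transition function (1.2) can be sketched in Python as follows. The function name `transition` and its argument layout are assumptions made for this example; they are not part of the GTVD procedure.

```python
import math

def transition(q, gamma, c):
    """Logistic transition function g(q; gamma; c) of equation (1.2).

    q     : value of the transition variable
    gamma : slope, must be strictly positive
    c     : list of m location parameters with c1 <= ... <= cm
    """
    if gamma <= 0:
        raise ValueError("gamma must be strictly positive")
    if any(a > b for a, b in zip(c, c[1:])):
        raise ValueError("location parameters must satisfy c1 <= ... <= cm")
    # z = gamma * prod_j (q - c_j)
    z = gamma
    for cj in c:
        z *= (q - cj)
    # g = 1 / (1 + exp(-z)), written so that exp never overflows
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

With m = 1 the function moves monotonically from 0 to 1 as q crosses c1. With m = 2 it equals 1/2 at q = c1 and at q = c2 and reaches its minimum halfway between them.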
The version that can be estimated via the proc PSTR is a generalization
of this basic structure. It allows for r transition functions based on possibly
different transition variables, each with its own vector of location parameters. Moreover, it is also possible to consider exogenous variables with constant
coefficients $\alpha_0$. This additive generalization has the following form:
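$$y_{it} = \mu_i + \beta_0' x_{it} + \sum_{j=1}^{r} \beta_j' x_{it}\, g_j(q_{it}^{(j)}; \gamma_j; c_j) + \alpha_0' z_{it} + u_{it} \tag{1.3}$$
and,
$$g_j(q_{it}^{(j)}; \gamma_j; c_j) = \Big(1 + \exp\Big(-\gamma_j \prod_{k=1}^{m_j} (q_{it}^{(j)} - c_{j,k})\Big)\Big)^{-1} \tag{1.4}$$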
1.2 Estimation of parameters
Details can be found in the paper by González, Teräsvirta and van Dijk. Here
we keep the description of the estimation phase to the minimum that is sufficient
to run the proc. However, reading the cited paper is necessary to
understand exactly what you are doing (and in particular to know the
maintained hypotheses under which this estimation is supposed to be
valid). Estimation is conducted in 2 steps: elimination of the individual effects,
then successive applications of OLS and NLLS to the transformed variables.
1. removing the fixed effects
The $\mu_i$ are removed in the usual way, by centering the variables on their
individual means.
2. estimation of remaining parameters with NLLS
The slopes of the transition functions, $\gamma_j$, and the location parameters,
$c_j$, enter the model nonlinearly, but this is not the case for the coefficients
$\beta$ and $\alpha$. Thus, for given values of $\gamma_j$ and $c_j$, an OLS regression gives estimates of $\beta$ and $\alpha$. With these estimates we can then search for the values
of $\gamma_j$ and $c_j$ that minimize the residual sum of squares using an NLLS algorithm (here we use the BFGS algorithm available with the RATS instruction
find). The process is repeated until convergence for the whole vector of
parameters. However, convergence depends heavily on the
chosen starting values of $\beta$ and $\alpha$.
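The two steps can be sketched in Python for the simplest case of one regressor (k = 1) and one transition function with a single location parameter (m = 1). This is a minimal illustration under these assumptions, not the code of the proc: the names `demean`, `concentrated_ols` and the toy data are hypothetical, and the outer minimization over $(\gamma, c)$ is not shown.

```python
import math

def logistic(q, gamma, c):
    """Transition function (1.2) with a single location parameter (m = 1)."""
    return 1.0 / (1.0 + math.exp(-gamma * (q - c)))

def demean(panel):
    """Step 1: subtract from each individual's series its own time mean."""
    return [[v - sum(s) / len(s) for v in s] for s in panel]

def flatten(panel):
    return [v for s in panel for v in s]

def concentrated_ols(y, x, q, gamma, c):
    """Step 2, inner part: OLS of y on (x, x*g) for fixed (gamma, c).

    y, x, q are lists holding one list of T observations per individual.
    Returns (ssr, beta0, beta1).
    """
    # x*g depends on (gamma, c): it is rebuilt and demeaned at every call.
    xg = [[xi * logistic(qi, gamma, c) for xi, qi in zip(xs, qs)]
          for xs, qs in zip(x, q)]
    yt, x0, x1 = flatten(demean(y)), flatten(demean(x)), flatten(demean(xg))
    # 2x2 normal equations solved in closed form
    a = sum(u * u for u in x0)
    b = sum(u * v for u, v in zip(x0, x1))
    d = sum(v * v for v in x1)
    r0 = sum(u * w for u, w in zip(x0, yt))
    r1 = sum(v * w for v, w in zip(x1, yt))
    det = a * d - b * b
    beta0 = (d * r0 - b * r1) / det
    beta1 = (a * r1 - b * r0) / det
    ssr = sum((w - beta0 * u - beta1 * v) ** 2 for w, u, v in zip(yt, x0, x1))
    return ssr, beta0, beta1

# Toy noise-free panel (N = 3, T = 4) with beta0 = 1, beta1 = 2, gamma = 3, c = 0.5
x = [[1.0, 2.0, 0.5, 1.5], [0.3, 1.2, 2.2, 0.9], [1.1, 0.4, 1.8, 2.5]]
q = [[0.1, 0.9, 0.4, 0.7], [0.6, 0.2, 0.8, 0.5], [0.3, 1.0, 0.0, 0.55]]
mu = [5.0, -2.0, 0.7]
y = [[m + xi + 2.0 * xi * logistic(qi, 3.0, 0.5) for xi, qi in zip(xs, qs)]
     for m, xs, qs in zip(mu, x, q)]
ssr_true, b0, b1 = concentrated_ols(y, x, q, 3.0, 0.5)
ssr_wrong, _, _ = concentrated_ols(y, x, q, 1.0, 0.2)
```

An NLLS routine would then minimize the returned residual sum of squares over $(\gamma, c)$, calling `concentrated_ols` at each trial value.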
1.3 Choosing the starting values
This is normally done by means of a grid search, i.e. a selection of initial values
for the slopes $\gamma_j$ and the location parameters $c_j$, $j = 1, \dots, r$.
1. initial values for $\gamma_j$
The user simply provides a list of possible positive values.
2. initial values for $c_j$
The user selects a number of values to fill the vector $c_j$ such that $\min(c_{j,k}) > \min(q_{it}^{(j)})$ and $\max(c_{j,k}) < \max(q_{it}^{(j)})$ for $k = 1, \dots, m_j$ and $j = 1, \dots, r$.
To ensure that a sufficient number of observations of the transition variable
are available for estimation on each side of a given location parameter, the
values of $\min(c_{j,k})$ and $\max(c_{j,k})$ are constrained by a trimming of the
observations of $q_{it}^{(j)}$, i.e. the range for the initial $c_{j,k}$ leaves aside a certain
number of the smallest and largest observations of the transition variable.
Given these grids, OLS regressions are performed for all combinations of the
initial values to estimate the corresponding $\beta$ and $\alpha$. The vector for which the
residual sum of squares is minimal is then passed as the starting value for the second step of the estimation process described in the preceding
section.
In case of difficulties it may be a good idea to increase the number of grid
values. In any case it is certainly a good idea to try different starting values
by choosing different specifications for the construction of these grids. Note also
that if convergence is not obtained at the end of this process, then the proc
PSTR switches automatically to a simplex optimization in order to obtain new
starting values for the NLLS algorithm.
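The grid search itself amounts to an exhaustive loop. The sketch below assumes a single transition function with one location parameter, and takes as input a hypothetical callable `ssr(gamma, c)` that returns the residual sum of squares of the OLS regression for the given pair; the quadratic `toy_ssr` merely stands in for that regression.

```python
import itertools

def grid_search(gammas, locations, ssr):
    """Return (ssr, gamma, c) for the grid point with the smallest ssr."""
    best = None
    for gamma, c in itertools.product(gammas, locations):
        value = ssr(gamma, c)
        if best is None or value < best[0]:
            best = (value, gamma, c)
    return best

# Stand-in for the OLS residual sum of squares (assumption for this example)
def toy_ssr(gamma, c):
    return (gamma - 2.5) ** 2 + (c - 0.4) ** 2

start = grid_search([0.2, 0.5, 0.9, 1.4, 1.9, 2.5, 3.0],
                    [0.0, 0.2, 0.4, 0.6], toy_ssr)
```

The pair stored in `start` would then be handed to the NLLS step as its starting point.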
1.4 Tests of linearity against PSTR
1. Before estimation of any nonlinear form
If we start with the basic structure of the PSTR model given by (1.1) then
a linear model is obtained by imposing $\gamma = 0$ or $\beta_1 = 0$. Whichever
restriction is chosen, the test is nonstandard because of the presence of
unidentified parameters under the null. One solution, adopted by González,
Teräsvirta and van Dijk (2004), is to replace $g(\cdot)$ by a first-order Taylor
expansion around $\gamma = 0$. This leads to an auxiliary regression:
$$y_{it} = \mu_i + \beta_0'^{*} x_{it} + \beta_1'^{*} x_{it} q_{it} + \dots + \beta_m'^{*} x_{it} q_{it}^{m} + u_{it}^{*} \tag{1.5}$$
Testing $\gamma = 0$ in (1.1) is then equivalent to testing $\beta_1^{*} = \dots = \beta_m^{*} = 0$ in (1.5). Let $SSR_0$ be the residual sum of squares under the null, i.e. of the linear model estimated by OLS (after the individual-specific means have been removed),
and $SSR_1$ the residual sum of squares of the auxiliary regression (1.5). Then
two tests can be computed: an LM test and its F version:
$$LM = TN\,(SSR_0 - SSR_1)/SSR_0 \tag{1.6}$$
$$F = \frac{(SSR_0 - SSR_1)/mk}{SSR_1/(TN - N - mk)} \tag{1.7}$$
which under the null have respectively $\chi^2(mk)$ and $F(mk, TN - N - mk)$ distributions. In addition to these LM and F tests, the proc PSTR computes
a pseudo-LRT given by:
$$\text{pseudo-LRT} = -2\log(SSR_1/SSR_0) \tag{1.8}$$
having a $\chi^2(mk)$ distribution under the null hypothesis.
2. After estimation of at least one transition function
The question is then whether some nonlinearity remains. We follow
GTvD and illustrate the construction of the test with an example. Suppose
that you estimate a model with one transition function (r = 1 in equation
(1.3)):
$$y_{it} = \mu_i + \beta_0' x_{it} + \beta_1' x_{it}\, g_1(q_{it}^{(1)}; \gamma_1; c_1) + \alpha_0' z_{it} + u_{it} \tag{1.9}$$
and that you now consider adding a second one. The model
to be considered is then:
$$y_{it} = \mu_i + \beta_0' x_{it} + \beta_1' x_{it}\, g_1(q_{it}^{(1)}; \gamma_1; c_1) + \beta_2' x_{it}\, g_2(q_{it}^{(2)}; \gamma_2; c_2) + \alpha_0' z_{it} + u_{it} \tag{1.10}$$
where possibly $q_{it}^{(1)} = q_{it}^{(2)}$. As before, the test concerns the slope of the
second transition function and the null hypothesis is then $H_0: \gamma_2 = 0$.
Again a first-order Taylor expansion around $\gamma_2 = 0$ leads to an auxiliary
regression:
$$y_{it} = \mu_i + \beta_0'^{*} x_{it} + \beta_1' x_{it}\, g_1(q_{it}^{(1)}; \gamma_1; c_1) + \beta_{21}'^{*} x_{it} q_{it}^{(2)} + \dots + \beta_{2m}'^{*} x_{it} \big(q_{it}^{(2)}\big)^{m} + \alpha_0' z_{it} + u_{it}^{*} \tag{1.11}$$
With this approximation the null hypothesis becomes $H_0: \beta_{21}^{*} = \dots = \beta_{2m}^{*} = 0$.
Based on the residual sums of squares given by the estimations of (1.11) and
(1.9) one can construct LM and F tests as explained before. Note however
that in the paper by González, Teräsvirta and van Dijk the estimation of
(1.11) is made conditionally upon the values of $\gamma_1$ and $c_1$ given by the
fit of (1.9). The procedure PSTR gives the results so obtained,
but it also gives those obtained after an unconstrained re-estimation of
(1.11), allowing in particular new optimal estimates for $\gamma_1$ and $c_1$. It also
prints a pseudo-LRT statistic.
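Once the two residual sums of squares are available, the statistics (1.6) to (1.8) are simple arithmetic. The following Python sketch is only illustrative; the function name `linearity_tests` and the dictionary it returns are assumptions, and p-values are left out because they would require the chi-square and F distribution functions.

```python
import math

def linearity_tests(ssr0, ssr1, T, N, m, k):
    """LM (1.6), F (1.7) and pseudo-LRT (1.8) statistics.

    ssr0 : residual sum of squares under the null
    ssr1 : residual sum of squares of the auxiliary regression
    T, N : number of periods and of individuals
    m    : order of the Taylor expansion, k : number of x variables
    """
    mk = m * k
    lm = T * N * (ssr0 - ssr1) / ssr0
    f = ((ssr0 - ssr1) / mk) / (ssr1 / (T * N - N - mk))
    lrt = -2.0 * math.log(ssr1 / ssr0)
    return {"LM": lm, "F": f, "pseudo-LRT": lrt, "df": mk}

stats = linearity_tests(ssr0=100.0, ssr1=80.0, T=10, N=5, m=1, k=2)
```

Under the null, LM and the pseudo-LRT would be compared with a chi-square distribution with `df` degrees of freedom, and F with an F(mk, TN - N - mk) distribution.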
Chapter 2
Instructions
This part lists the various options that can be specified and the syntax of the
call to the proc. It also gives some calling examples.
2.1 Calling the procedure
Data must be described using the instructions allocate and calendar according to
the panel data structure of RATS. The procedure is then called with a
standard syntax:
@gtvd(options) depvar
#x1 x2 · · ·
#q1 q2 · · ·
#z1 z2 · · ·
where depvar is the name of the dependent variable, x1 x2 are the names of
the exogenous variables upon which the transition functions are active, q1 q2
are the names of the transition variables and z1 z2 are exogenous variables with
constant coefficients.
2.2 Specifying the transitions
The number of transition functions, r, is equal to the number of transition
variables. Hence #q1 q1 defines two functions, each having q1 as
transition variable.
• m =
m is a vector having r components giving the number of restrictions to
be put on the transition functions, i.e. the number of location parameters $m_j$ of each function in equation (1.4). For example m=||3, 2|| applies to two
transition functions, the first one having 3 restrictions and the second
having 2 restrictions. In this case we must have two transition variables
in the call to the proc.
Note that this vector must always be specified.
• rc =
rc specifies the number of transition functions that must be incorporated
in a single estimated model.
If rc is not specified then the proc proceeds sequentially. For
example #q1 q2 asks for the estimation of a model having q1 as transition variable and then for the estimation of a model having q1 and q2 as
transition variables. In this case the number of transition variables must
be equal to the dimension of the vector m.
If rc is specified then only the model with rc transition variables is estimated and the dimension of the vector m must be at least rc. If the number
of transition variables specified in the call is greater than rc then linearity tests versus PSTR are carried out within the estimated model,
successively for each additional variable in the list (these tests are
carried out only if the option mlin requests them, see below). If
rc is used then its value must be strictly positive.
2.3 Specific exogenous variables
• exoz =
If exogenous variables of type z are present, i.e. if some independent
variables having constant coefficients are used then the user must indicate
exoz = 1. In this case their names are read on the third supplementary
card. If exoz = 0 or if exoz is not specified then this third card must not
be present in the call to the proc.
2.4 Estimation: initial values of parameters and trimming
Given the slopes of the transition functions and the location parameters, the other
coefficients are estimated by minimizing the sum of squared residuals. The
values that minimize the overall sum of squared residuals are used as starting values of the estimation algorithm based on BFGS. The following options
specify the arrays of numbers for slopes and location parameters over which the
search for these starting values is carried out.
• methodini = simplex / [genetic] / grid
This option specifies the way the proc finds optimal initial values of the various
parameters before switching to a BFGS algorithm. With methodini=grid,
initial values for the slopes and the restriction parameters are taken from
grids defined by the options grilleg and nbvalinic. In this case all combinations of these values are tried in order to find those associated with the
minimal residual sum of squares. Even though the proc takes into account the
ordered structure of the restriction parameters, this can lead to an enormous number of initial regressions and can be very time consuming.
When methodini=simplex or methodini=genetic, only one value of the
slope parameter is considered for each of the transition functions, the values
of the restriction parameters still being given by the option nbvalinic.
Given the slopes of the functions, the proc runs initial regressions in order
to find optimal initial restriction parameters, but the number of initial regressions is greatly reduced. It then switches to a simplex or a genetic
algorithm to find the initial values of the parameters passed to the BFGS algorithm.
• grilleg =
Vector specifying an array of positive values for the slopes of the transition
functions. The use of these values depends on the method specified for finding optimal initial values before switching to the BFGS optimization, i.e.
on the choice made for the option methodini. When methodini=grid the
same vector is used for each of these functions and all its components are
tried in turn. Default values are, without any particular justification:
grilleg = ||.2|.5|.9|1.4|1.9|2.5|3.0|3.5|4.0|5.0||
The number of grid values is determined by the dimension of the vector and thus can be freely chosen by the user. For example, grilleg
= ||.1|.4|.9|| searches for a starting value in a list of only three candidate
numbers before going to a BFGS optimization. When methodini=simplex
or methodini=genetic the dimension of grilleg is simply equal to the number of transition functions and its ith component is the initial value of
the slope parameter for the ith transition function. By default, grilleg is
taken in this case, again without any particular justification, as a random
draw of pseudo-uniform numbers over [0.2, 3.5].
• nbvalinic =
Integer that gives the number of initial values to be considered in the grid
for each restriction parameter. The values themselves are taken at regular
intervals among the observations of the corresponding transition variable.
The observations corresponding to trimc and (1 − trimc) are always selected, so
nbvalinic must be greater than or equal to 2. The default value is
nbvalinic = 10.
• trimc =
Gives the proportion of observations used for the trimming applied to
the transition variables. The default value is trimc=.15. For example,
with this default value the initial values for the restriction parameter range from
the 15th percentile (included) to the 85th percentile (included) of the corresponding
transition variable, and in this range the proc takes nbvalinic − 2 other
values with equal spacing.
• piters =
This integer indicates the number of iterations to be performed with the
simplex or the genetic algorithm chosen according to the option methodini.
By default, piters = 6. If the simplex or the genetic method converges
then its solutions are taken as initial values for a BFGS optimization.
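To make the roles of nbvalinic and trimc concrete, the construction of the candidate location values can be sketched as follows. This is one possible reading only: it assumes that "equal spacing" refers to equally spaced ranks among the sorted observations, and the function name `location_candidates` and the rounding rule are choices made for this example, not necessarily those of the proc.

```python
def location_candidates(q, nbvalinic=10, trimc=0.15):
    """Candidate initial values for one location parameter.

    q is the list of observations of the transition variable. The first and
    last candidates are the observations at the trimc and (1 - trimc)
    quantiles; the remaining nbvalinic - 2 are taken at equally spaced
    ranks in between (assumption, see the text above).
    """
    if nbvalinic < 2:
        raise ValueError("nbvalinic must be greater than or equal to 2")
    ordered = sorted(q)
    n = len(ordered)
    lo = round(trimc * (n - 1))          # rank of the lower trimming point
    hi = round((1.0 - trimc) * (n - 1))  # rank of the upper trimming point
    step = (hi - lo) / (nbvalinic - 1)
    return [ordered[round(lo + i * step)] for i in range(nbvalinic)]

# 101 observations 0, 1, ..., 100, so that ranks and values coincide
grid = location_candidates(list(range(101)), nbvalinic=10, trimc=0.15)
```

With these inputs the grid starts at the 15th percentile and ends at the 85th, which leaves at least 15% of the observations on each side of any candidate.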
2.5 Linearity tests and tests of parameter constancy
• mlin =
Integer which gives the maximum power to be considered in the auxiliary
regression derived from a Taylor expansion of the initial representation.
mlin is the value of the parameter m in equations (1.5) or (1.11). This
Taylor expansion is taken for the last transition variable of the fitted
equation. For example, if rc is not specified, then at the first step it
corresponds to the first variable in the list q1 q2 · · ·. If mlin = 0 (which is
the default value) then the test of linearity against PSTR is not carried out.
• mh =
Integer which gives the maximum power to be considered in the auxiliary
regression used to test parameter constancy against the hypothesis of
a smooth change from one regime to another (corresponding to the so-called TV-PSTR model). If mh = 0 (which is the default value) then this
test is not done.
• graphd = Yes / [No]
If graphd = Yes, the proc evaluates the derivatives with respect to the variables
x1 x2 and plots the results. At present the formulas assume that none of these
variables is used as a transition variable, i.e. $x_i \neq q_j\ \forall i, j$. Results are given
for each individual at every point in time. Moreover the proc prints, for each
individual, the mean of these derivatives over all time periods.
• name =
Specifies the name of every individual in the construction of the preceding
graphs. The order of these names must match the order of the entries in
the data file. If not specified, individuals are named by their order of
appearance in this file, i.e. (1,2,3,...).
2.6 Examples
• @gtvd(m = ||2||) y
#x
#q
requests estimation of the model:
$$y_{it} = \mu_i + \beta_0 x_{it} + \beta_1 x_{it} \Big(1 + \exp\Big(-\gamma \prod_{j=1}^{2} (q_{it} - c_j)\Big)\Big)^{-1} + u_{it} \tag{2.1}$$
All options take their default values.
• @gtvd(m = ||1, 1||, nbvalinic = 20) y
#x
#q q
requests successive estimations of the models:
$$y_{it} = \mu_i + \beta_0 x_{it} + \beta_1 x_{it} \big(1 + \exp(-\gamma (q_{it} - c_1))\big)^{-1} + u_{it} \tag{2.2}$$
and,
$$y_{it} = \mu_i + \beta_0 x_{it} + \beta_1 x_{it} \big(1 + \exp(-\gamma_1 (q_{it} - c_1))\big)^{-1} + \beta_2 x_{it} \big(1 + \exp(-\gamma_2 (q_{it} - c_2))\big)^{-1} + u_{it} \tag{2.3}$$
Linearity tests with m = 1, 2 are carried out before estimation of the first
model (here redundant results are given as q is repeated) and after. Note
that for the second model the grid search requires 40,000 OLS regressions
in order to find initial values.
• @gtvd(m = ||1||, nbvalinic = 20, mh = 0, mlin = 4, exoz = 1, rc = 1) y
#x
#q
#z
requests estimation of the model:
$$y_{it} = \mu_i + \beta_0 x_{it} + \beta_1 x_{it} \big(1 + \exp(-\gamma (q_{it} - c_1))\big)^{-1} + \alpha_0 z_{it} + u_{it} \tag{2.4}$$
Before estimation it runs tests of linearity successively for q and x. After
estimation it runs tests of remaining nonlinearity for x. These tests are carried out for m = 1, 2, 3, 4 in equations (1.5) or (1.11). Tests of parameter
constancy are not made (the result is the same if the option mh=0 is omitted).
The grid search for $c_1$ is made over 20 values.
• @gtvd(m = ||1, 1||, nbvalinic = 5, methodini = genetic, grilleg = ||.8|.3||, rc = 2) y
#x
#q q
requests estimation of the model:
$$y_{it} = \mu_i + \beta_0 x_{it} + \beta_1 x_{it} \big(1 + \exp(-\gamma_1 (q_{it} - c_1))\big)^{-1} + \beta_2 x_{it} \big(1 + \exp(-\gamma_2 (q_{it} - c_2))\big)^{-1} + u_{it} \tag{2.5}$$
Before doing a BFGS optimization, this calls for a genetic optimization for
which the initial values of the slope parameters of the two transition functions
are respectively 0.8 and 0.3, and the initial values of the restriction parameters
are found among five numbers selected by means of OLS regressions.
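The figure of 40,000 initial regressions quoted for the second example can be checked with a short count. The sketch assumes methodini=grid with the ten default slope values of grilleg. For functions with more than one restriction parameter it further assumes that only strictly increasing combinations of candidate values are tried, which is one possible reading of the "ordered structure" mentioned under methodini. The function name `grid_regressions` is hypothetical.

```python
import math

def grid_regressions(n_slopes, nbvalinic, m):
    """Number of OLS regressions needed by an exhaustive grid search.

    n_slopes  : number of slope values in grilleg
    nbvalinic : number of candidate values per restriction parameter
    m         : list giving the number of restriction parameters per function
    """
    total = 1
    for mj in m:
        # slopes x ordered choices of mj location values (assumption for mj > 1)
        total *= n_slopes * math.comb(nbvalinic, mj)
    return total

count = grid_regressions(n_slopes=10, nbvalinic=20, m=[1, 1])
```

For m = ||1, 1|| each function contributes 10 × 20 = 200 candidates, hence 200 × 200 = 40,000 combinations, which matches the number given above.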