Issues on Spurious
Behaviors
Universidad Complutense de Madrid
March 18, 2014
Christos Agiakloglou
University of Piraeus
Three topics on spurious behaviors
 Spurious Correlations for stationary AR(1) processes
with Apostolos Tsimbanos
 The Balance between Size and Power
with Charalambos Agiropoulos
 Spurious Regressions for non-linear or time varying
coefficient processes
with Anil Bera
Spurious Behavior
 Granger and Newbold (1974) set the cat among the pigeons with
their simulation results, showing that when two independent
drift-free random walks are used in a simple linear regression, one
ends up with a significant t-statistic 76% of the time.
 This phenomenon was introduced by Yule (1926) as a spurious
correlation.
 Agiakloglou and Tsimpanos (2012) examined the spurious
correlation phenomenon for stationary processes and found no
evidence of spurious behavior when using the true variance of the
sample correlation coefficient of the two independent AR(1) processes.
 However, it was left to Phillips (1986) to prove the
Granger-Newbold simulation results mathematically, showing that the
usual t-statistic does not have a limiting distribution.
 Entorf (1997) considered random walks with drift.
 Granger, Hyung and Jeon (2001) found spurious results even for
stationary independent AR(1) processes.
The story continues
 Marmol (1995), who generalized the work of Phillips (1986) to
high-order integrated processes, also showed that the Durbin-Watson
statistic converges in probability to zero, so low values of this
statistic are expected in the presence of spurious regressions, a
finding that was also indicated by Granger and Newbold (1974).
 Agiakloglou (2009) showed that evidence of serially correlated errors
also appears in the case of two independent stationary AR(1) processes,
not only in first moments but also in second moments, defined by an
ARCH(1) error structure.
 Recall that the presence of serially correlated errors in the context of
spurious regression had also been investigated by Newbold and Davies
(1978) for variables generated from non-stationary moving
average processes.
 Tsay (1999) also examined spurious regression for I(1) processes with
infinite variance errors.
Two Examples
 Hendry (1980) “Econometrics – Alchemy or science?” studied the
relationship between rainfall and the inflation rate in the U.K.
 Ferson, Sarkissian and Simin (2003) “Spurious regression in financial
Economics” found that many predictive stock return regressions in the
literature, based on individual predicting variables, may be spurious.
Spurious Correlations
for Stationary AR(1) Processes
An alternative approach for testing for
linear association for two independent
stationary AR(1) processes
Story Number One
Spurious Correlations vs.
Spurious Regressions
 Two similar if not identical terms referring to the same phenomenon
of obtaining false evidence about the existence of a linear
relationship between two variables.
 In the case of spurious correlations the analyst has no indication
about the existence of such behavior, unless he or she has some a
priori information about the relationship between these two variables,
such that a high absolute value of the sample correlation coefficient
will be considered very suspicious.
 In the case of spurious regressions the analyst will have an
indication such as low value of the Durbin-Watson as Granger and
Newbold (1974) have pointed out, see also Marmol (1995) and
Agiakloglou (2009).
 Spurious correlations are therefore more difficult to detect.
Testing for Linear Association
 The test of zero correlation in the population is based on the
following hypotheses:
H0: ρ = 0 against H1: ρ ≠ 0
and it is implemented by the usual t statistic, i.e.,
$$ t = \frac{r}{\sqrt{(1 - r^2)/(T - 2)}} $$
where r is the sample correlation coefficient and T is the
sample size.
 Under the null hypothesis the t statistic follows a t distribution
with (T - 2) degrees of freedom, and the null hypothesis is not
rejected if its absolute value is less than the critical value.
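As a minimal sketch of the test just described, the function below computes the sample correlation coefficient r and the classical t statistic from two equal-length series. The function name and the small data set are illustrative assumptions, not part of the original slides.

```python
import math

def zero_correlation_t(x, y):
    """Return (r, t) for testing H0: rho = 0 with the classical t statistic."""
    T = len(x)
    if len(y) != T or T < 3:
        raise ValueError("x and y must have the same length, at least 3")
    mean_x = sum(x) / T
    mean_y = sum(y) / T
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # t = r / sqrt((1 - r^2) / (T - 2))
    t = r / math.sqrt((1.0 - r * r) / (T - 2))
    return r, t

if __name__ == "__main__":
    # Hypothetical data, used only to show the call.
    r, t = zero_correlation_t([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
    print(round(r, 3), round(t, 3))
```

The value of |t| is then compared with the critical value of the t distribution with T - 2 degrees of freedom (1.96 is used throughout the slides as the large-sample 5% value).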
Frequency Distribution of r
 Yule (1926) studied the properties of the sample
correlation coefficient of two random variables and
noticed that the major factor which determines this
spurious behavior is the shape of the frequency
distribution of the correlation coefficient of the two series.
 More precisely, if the distribution has a U shape it is
certain that spurious correlations will arise.
 Banerjee, A., Dolado, J., Galbraith, J. W. and Hendry, D.
F. (1993) using Monte Carlo analysis, examined the
frequency distribution of the correlation coefficient for
various orders of integrated independent time series
verifying Yule’s (1926) initial results.
 They concluded:
I) Frequency Distribution of r
 A) if the two series are stationary white noise processes, the
frequency distribution of the correlation coefficient will be
symmetric around zero and it will look like normal distribution.
[Figure: frequency distribution for the correlation coefficient between two independent white noise processes (T = 100, 10,000 replications)]
II) Frequency Distribution of r
 B) if the two processes are non-stationary I(1) processes, the
frequency distribution of the correlation coefficient will be
semi-ellipse.
[Figure: frequency distribution for the correlation coefficient between two independent I(1) processes (T = 100, 10,000 replications)]
III) Frequency Distribution of r
 C) If the two processes are non-stationary I(2) processes, the
frequency distribution has a U shape with values of -1 and +1
to be more likely to occur.
[Figure: frequency distribution for the correlation coefficient between two independent I(2) processes (T = 100, 10,000 replications)]
Frequency Distribution of t
 Consider:
 a) two independent white noise processes,
 b) two independent I(1) random walk processes,
i.e., Yt = Yt-1 + ut,
 c) two independent non-stationary I(2) processes,
i.e., Yt = 2Yt-1 – Yt-2 + ut,
for sample size of 100 observations using 10,000 replications.
 Besides the percentage of rejections of the null hypothesis,
the standard deviation of the t statistic deviates strongly
from one in the two non-stationary cases, as opposed
to the white noise case, where it remains equal to one
regardless of the sample size.
 Note that the scale of the graphical presentations is not the same.
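The experiment described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: zero starting values, standard normal errors, the standard-library `random` generator with an arbitrary seed, and far fewer replications than the 10,000 used in the slides. The helper names (`corr_t`, `integrate`, `rejection_rate`) are hypothetical.

```python
import math
import random

def corr_t(x, y):
    """Classical t statistic for H0: rho = 0."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r / math.sqrt((1.0 - r * r) / (T - 2))

def integrate(u, order):
    """Cumulate a series `order` times: 0 = white noise, 1 = I(1), 2 = I(2)."""
    for _ in range(order):
        total, out = 0.0, []
        for v in u:
            total += v
            out.append(total)
        u = out
    return u

def rejection_rate(order, T=100, reps=1000, seed=0):
    """Share of replications with |t| > 1.96 for two independent series."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = integrate([rng.gauss(0.0, 1.0) for _ in range(T)], order)
        y = integrate([rng.gauss(0.0, 1.0) for _ in range(T)], order)
        if abs(corr_t(x, y)) > 1.96:
            rejections += 1
    return rejections / reps

if __name__ == "__main__":
    for order, label in [(0, "white noise"), (1, "I(1)"), (2, "I(2)")]:
        print(label, rejection_rate(order))
```

Cumulating white noise once gives Yt = Yt-1 + ut, and cumulating twice gives Yt = 2Yt-1 – Yt-2 + ut, so `order` maps directly onto cases a), b) and c).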
Simulation results for spurious correlations for white noise and
non-stationary I(1) & I(2) processes based on 10,000 replications

T        % rejections of H0   Mean of r   Std. dev. of r   Mean of t   Std. dev. of t
         (|t| > 1.96)
White Noise processes
100            5.03              0.00          0.10           0.00          1.00
500            5.24              0.00          0.04           0.00          1.00
1,000          5.10              0.00          0.03           0.01          1.00
I(1) processes
100           76.95              0.00          0.49           0.03          7.38
500           89.32              0.00          0.49           0.01         16.37
1,000         92.76              0.00          0.49           0.07         23.37
I(2) processes
100           94.82              0.00          0.84           0.03         49.01
500           97.51              0.00          0.82           0.28         98.55
1,000         98.18             -0.01          0.82          -0.40        139.39
I) Frequency Distribution of t
[Figure: frequency distribution for the t statistic between two independent white noise processes (T = 100, 10,000 replications)]
II) Frequency Distribution of t
[Figure: frequency distribution for the t statistic between two independent I(1) processes (T = 100, 10,000 replications)]
III) Frequency Distribution of t
[Figure: frequency distribution for the t statistic between two independent I(2) processes (T = 100, 10,000 replications)]
Spurious correlations for AR(1)
 Consider two independent AR(1) processes Xt and Yt
generated from the following DGP:
$$ X_t = \varphi_x X_{t-1} + \varepsilon_{xt} \qquad (2) $$
and
$$ Y_t = \varphi_y Y_{t-1} + \varepsilon_{yt} \qquad (3) $$
 where the errors εxt and εyt are each white noise N(0, 1)
processes independent of each other and the
autoregressive parameters are allowed to take values of
0.0, 0.2, 0.5, 0.8 and 0.9.
 Note that if φx = φy = 1, both processes are non-stationary
random walk processes without drift, whereas
if φx = φy = 0, both processes are white noise processes.
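The DGP above can be simulated as follows. This is only a sketch: the slides do not say how the series were initialized, so the zero start value and the burn-in of 200 observations are assumptions, as are the function names and the reduced number of replications.

```python
import math
import random

def ar1(phi, T, rng, burn=200):
    """Simulate X_t = phi * X_{t-1} + e_t with e_t ~ N(0, 1); burn-in is an assumption."""
    x, out = 0.0, []
    for i in range(T + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if i >= burn:
            out.append(x)
    return out

def corr_t(x, y):
    """Classical t statistic for H0: rho = 0."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r / math.sqrt((1.0 - r * r) / (T - 2))

def ar1_rejection_rate(phi_x, phi_y, T=100, reps=1000, seed=0):
    """Share of replications with |t| > 1.96 for two independent AR(1) series."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = ar1(phi_x, T, rng)
        y = ar1(phi_y, T, rng)
        if abs(corr_t(x, y)) > 1.96:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    for phi in (0.0, 0.5, 0.9):
        print(phi, ar1_rejection_rate(phi, phi))
```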
Simulation Results
 Unlike the two non-stationary cases, previously
discussed and especially the I(1) case, the percentage
of rejections of the null hypothesis of zero correlation
remains unchanged regardless of the sample size.
 It is only affected by the magnitude of the
autoregressive parameters.
 We get more spurious results as the value of the
autoregressive parameter increases.
 For example, for φx = φy = 0.5, the null hypothesis is
rejected approximately 13% of the time at the 5% nominal level,
whereas for φx = φy = 0.9 this figure rises to
approximately 52%.
Percentage of rejections of the null hypothesis of zero correlation
at the 5% nominal level (|t| > 1.96) for two independent stationary
AR(1) processes based on 10,000 replications

T        φx      φy = 0.0     0.2      0.5      0.8      0.9
100      0.0        5.03
         0.2        5.18      6.20
         0.5        5.38      7.77    12.85
         0.8        5.04      9.86    19.53    35.23
         0.9        4.81     10.44    22.42    41.12    50.27
500      0.0        5.24
         0.2        5.13      5.77
         0.5        5.31      7.92    13.15
         0.8        4.90      9.44    19.72    34.78
         0.9        5.42     10.46    22.67    42.56    52.22
1,000    0.0        5.10
         0.2        5.25      6.00
         0.5        4.73      7.38    12.87
         0.8        5.30      9.94    19.92    36.05
         0.9        4.83     10.53    22.55    43.21    52.01
I) Further Simulation Results
 Frequency distribution for the correlation coefficient
 Clearly, if the decision, as to whether or not spurious
behavior exists, was based on the shape of the
frequency distribution of the correlation coefficient, the
analyst will have no indication in this case.
 The frequency distribution for the correlation coefficient
of two independent stationary AR(1) processes is
symmetric around mean zero and it looks very similar to
the white noise case previously presented.
Frequency Distribution of r
[Figure: frequency distribution for the correlation coefficient between two independent AR(1) processes for φx = φy = 0.5 (T = 100, 10,000 replications)]
Frequency Distribution of r
[Figure: frequency distribution for the correlation coefficient between two independent AR(1) processes for φx = φy = 0.9 (T = 100, 10,000 replications)]
II) Further Simulation Results
 Frequency distribution for the t statistic
 As in the case of non-stationary processes, the problem
of spurious correlations appears because the value of
the standard deviation of the t statistic for testing the null
hypothesis of zero correlation is not one for all values of
the autoregressive parameters.
 Thus, although the frequency distribution for the t
statistic is symmetric around mean zero, it becomes
flatter than the standard normal distribution as the value
of the autoregressive parameters increases.
 The standard deviation of the t statistic is affected only
by the values of the autoregressive parameters and not
by the sample size.
Standard deviation of the t statistic for testing the null hypothesis
of zero correlation for two independent stationary AR(1)
processes based on 10,000 replications

T        φx      φy = 0.0     0.2      0.5      0.8      0.9
100      0.0        1.00
         0.2        1.02      1.05
         0.5        1.01      1.11     1.30
         0.8        1.00      1.17     1.52     2.12
         0.9        1.02      1.21     1.62     2.45     3.00
500      0.0        1.00
         0.2        1.01      1.04
         0.5        1.00      1.11     1.29
         0.8        1.02      1.18     1.51     2.10
         0.9        1.00      1.21     1.62     2.48     3.10
1,000    0.0        1.00
         0.2        1.00      1.05
         0.5        1.00      1.11     1.28
         0.8        1.01      1.17     1.52     2.15
         0.9        1.00      1.20     1.61     2.47     3.07
Frequency Distribution of t
[Figure: frequency distribution for the t statistic between two independent AR(1) processes for φx = φy = 0.5 (T = 100, 10,000 replications)]
Frequency Distribution of t
[Figure: frequency distribution for the t statistic between two independent AR(1) processes for φx = φy = 0.9 (T = 100, 10,000 replications)]
Variance of r of two independent
AR(1) processes
 For two independent stationary AR(1) processes Xt and Yt
generated by equations (2) and (3) with autocorrelation coefficients
ρx and ρy respectively we have:
$$ E\Big[\frac{1}{T^2}\Big(\sum_t X_t Y_t\Big)^2\Big] = \frac{1}{T^2} E\Big[\sum_t X_t^2 Y_t^2 + 2\sum_{t<s} X_t Y_t X_s Y_s\Big] $$
and since
$$ \sum_{t<s} X_t Y_t X_s Y_s = \sum_t X_t X_{t+1} Y_t Y_{t+1} + \sum_t X_t X_{t+2} Y_t Y_{t+2} + \dots $$
we have
$$ E\Big[\frac{1}{T^2}\Big(\sum_t X_t Y_t\Big)^2\Big] = \frac{1}{T}\sigma_x^2\sigma_y^2 + \frac{2}{T^2}\sigma_x^2\sigma_y^2\big[(T-1)\rho_x\rho_y + (T-2)\rho_x^2\rho_y^2 + \dots\big] $$
which is approximately equal to:
$$ E\Big[\frac{1}{T^2}\Big(\sum_t X_t Y_t\Big)^2\Big] \approx \frac{1}{T}\sigma_x^2\sigma_y^2\Big(1 + \frac{2\rho_x\rho_y}{1-\rho_x\rho_y}\Big) $$
Variance of r of two independent
AR(1) processes
 Hence, the variance of the sample correlation coefficient between the
two independent stationary AR(1) series is approximately:
$$ Var(r) \approx \frac{1}{T}\Big(\frac{1+\rho_x\rho_y}{1-\rho_x\rho_y}\Big) $$
or equivalently:
$$ Var(r) \approx \frac{1}{T}\Big(\frac{1+\varphi_x\varphi_y}{1-\varphi_x\varphi_y}\Big) $$
 since ρx = φx and ρy = φy for AR(1) processes.
 For the derivation of this variance see Bartlett (1935).
 McGregor (1962) also confirms this variance by
determining the approximate null distribution of the sample correlation
coefficient of two stationary Markov chain processes using the steepest
descents method proposed by Daniels (1954 and 1956).
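A quick numerical check connects this variance to the earlier simulation results. The sketch below assumes that, in large samples, the classical t statistic behaves like r√T and is roughly normal, so its standard deviation is about sqrt((1 + φxφy)/(1 − φxφy)). Under that assumption the implied share of |t| > 1.96 can be computed directly. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def var_r_approx(phi_x, phi_y, T):
    """Approximate Var(r) for two independent stationary AR(1) processes."""
    return (1.0 + phi_x * phi_y) / (T * (1.0 - phi_x * phi_y))

def implied_sd_and_rejection(phi_x, phi_y, crit=1.96):
    """Large-sample sd of the classical t and the implied rejection rate.

    Assumes t is approximately r * sqrt(T) and normally distributed.
    """
    sd_t = math.sqrt((1.0 + phi_x * phi_y) / (1.0 - phi_x * phi_y))
    rate = 2.0 * (1.0 - NormalDist().cdf(crit / sd_t))
    return sd_t, rate

if __name__ == "__main__":
    for phi in (0.5, 0.8, 0.9):
        sd_t, rate = implied_sd_and_rejection(phi, phi)
        print(phi, round(sd_t, 2), round(100 * rate, 1))
```

For φx = φy = 0.5 this gives a standard deviation near 1.29 and a rejection rate near 13%, and for φx = φy = 0.9 a standard deviation near 3.09 and a rejection rate near 52%, in line with the simulated tables.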
Variance of r of two independent
AR(1) processes
 The degree of accuracy of this variance depends on three
things:
 a) on the sign of the autoregressive parameters
 b) on the absolute magnitude of the two autoregressive
coefficients and
 c) on the sample size
 One should expect less accuracy:
 a) if φx and φy are both positive (or both negative),
 b) if their absolute magnitude is close to one and
 c) if the sample size is small.
 Therefore, it is interesting to investigate the accuracy of this
variance in the context of spurious correlations for all positive
values of the autoregressive parameters and for various
sample sizes.
Simulation Results using the Var(r) of
two independent AR(1) processes
 Series of two independent AR(1) processes Xt and Yt
previously defined are generated for values of the
autoregressive parameter of 0.0, 0.2, 0.5, 0.8 and 0.9
and for sample sizes of 100, 500 and 1,000
observations.
 Based on the sample correlation coefficient of these two
series, the test for zero correlation is conducted by
replacing the denominator of the usual t statistic by the
square root of the variance previously defined.
 The simulation results support no evidence of spurious
correlations.
 Empirical levels are close to nominal levels for
moderate and large sample sizes.
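The corrected test can be sketched as follows: the sample correlation is divided by the square root of the approximate variance, evaluated at the true autoregressive parameters, and compared with 1.96. The initialization (zero start, burn-in of 200), the seed, the helper names and the small number of replications are assumptions made for illustration.

```python
import math
import random

def ar1(phi, T, rng, burn=200):
    """Simulate a stationary AR(1) series with N(0, 1) errors (burn-in assumed)."""
    x, out = 0.0, []
    for i in range(T + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if i >= burn:
            out.append(x)
    return out

def sample_corr(x, y):
    """Sample correlation coefficient r."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rejection_rates(phi, T=200, reps=500, seed=0):
    """Return (classical, corrected) rejection rates at |t| > 1.96."""
    rng = random.Random(seed)
    # Approximate Var(r) evaluated at the true parameters phi_x = phi_y = phi.
    var_r = (1.0 + phi * phi) / (T * (1.0 - phi * phi))
    classical = corrected = 0
    for _ in range(reps):
        r = sample_corr(ar1(phi, T, rng), ar1(phi, T, rng))
        t_classical = r / math.sqrt((1.0 - r * r) / (T - 2))
        t_corrected = r / math.sqrt(var_r)
        classical += abs(t_classical) > 1.96
        corrected += abs(t_corrected) > 1.96
    return classical / reps, corrected / reps

if __name__ == "__main__":
    print(rejection_rates(0.8))
```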
Percentage of rejections of the null hypothesis of zero correlation
at the 5% nominal level (|t| > 1.96) for two independent stationary
AR(1) processes using the approximate variance of their sample
correlation coefficient based on 10,000 replications

T        φx      φy = 0.0     0.2      0.5      0.8      0.9
100      0.0        4.76
         0.2        5.22      5.13
         0.5        5.18      5.08     4.56
         0.8        4.96      4.72     4.59     3.41
         0.9        5.08      4.73     4.46     2.75     1.59
500      0.0        5.21
         0.2        5.09      5.03
         0.5        5.07      5.03     5.25
         0.8        4.91      4.77     5.07     4.81
         0.9        5.09      4.88     5.11     4.61     4.56
1,000    0.0        4.98
         0.2        5.21      4.81
         0.5        4.90      5.13     4.90
         0.8        4.80      5.08     5.09     4.77
         0.9        5.09      4.88     4.92     4.79     5.16
Further Simulation Results using the
Var(r) of two indep. AR(1) processes
 Frequency distribution for the corrected t statistic
 The frequency distribution of the corrected t statistic for
testing the null hypothesis of zero correlation using the
variance of two independent AR(1) processes is very
close to the standard normal distribution since the
standard deviation of the t statistic is one for almost all
cases.
Standard deviation of the t statistic for testing the null hypothesis
of zero correlation for two independent stationary AR(1)
processes using the approximate variance of their sample
correlation coefficient based on 10,000 replications

T        φx      φy = 0.0     0.2      0.5      0.8      0.9
100      0.0        1.01
         0.2        1.01      1.00
         0.5        1.00      1.00     0.98
         0.8        1.01      1.00     0.99     0.95
         0.9        1.01      1.00     0.98     0.92     0.88
500      0.0        1.00
         0.2        1.00      1.00
         0.5        1.01      1.01     1.01
         0.8        1.00      0.99     0.99     0.99
         0.9        1.01      1.00     1.00     0.99     0.99
1,000    0.0        1.00
         0.2        1.00      1.00
         0.5        1.00      1.01     1.00
         0.8        0.99      1.01     1.01     0.99
         0.9        1.01      1.00     1.00     1.00     1.00
Concluding Remarks
 Using the approximate variance of the sample
correlation coefficient of two independent stationary
AR(1) processes, this study shows that the spurious
behavior can be eliminated for large and moderate
sample sizes, even for large values of the
autoregressive parameter.
The Balance between Size and Power
in testing for linear association
for two stationary AR(1) processes
Story Number Two




Fisher (1915) established some of the properties of the
sample correlation coefficient, r, indicating that it is a
biased estimator of the population correlation
coefficient, ρ, for normal populations, and proving that
$$ E[r] = \rho - \frac{\rho(1 - \rho^2)}{2N} $$
See also Kenney and Keeping (1951) and Sawkins (1944).
Clearly, the bias is not large, taking into
account that the correlation coefficient takes values
from -1 to 1.
However, if one is concerned with the accuracy of the t
test for the null hypothesis of zero correlation,
especially when the absolute value of the sample
correlation coefficient is small, this bias may affect the
variance and, therefore, the test.
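To get a feel for the size of this bias, the expression can be evaluated directly. The function name and the example values of ρ and N below are illustrative assumptions.

```python
def fisher_expected_r(rho, n):
    """Expected value of r from the expression E[r] = rho - rho(1 - rho^2) / (2N)."""
    return rho - rho * (1.0 - rho ** 2) / (2.0 * n)

if __name__ == "__main__":
    # Hypothetical values: the bias shrinks as N grows and vanishes at rho = 0.
    for n in (20, 100, 1000):
        print(n, fisher_expected_r(0.5, n) - 0.5)
```

With ρ = 0.5 and N = 20 the expected value is 0.490625, a bias of less than 0.01.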
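Consider H0: ρ = 0 and H1: ρ ≠ 0 using the following
three t statistics:
$$ t = \frac{r}{\sqrt{(1 - r^2)/(T - 2)}} $$
$$ t' = \frac{r}{\sqrt{\frac{1}{T}\,\frac{1 + \varphi_x\varphi_y}{1 - \varphi_x\varphi_y}}} $$
$$ t'' = \frac{r}{\sqrt{\frac{1}{T}\,\frac{1 + \hat{\varphi}_x\hat{\varphi}_y}{1 - \hat{\varphi}_x\hat{\varphi}_y}}} $$
where t' uses the true values of the autoregressive parameters and
t'' uses their estimated values.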

Consider two independent AR(1) stationary
processes generated by the following DGP:
𝑋𝑡 = 𝜑𝑥 𝑋𝑡−1 + 𝜀𝑥𝑡


and
𝑌𝑡 = 𝜑𝑦 𝑌𝑡−1 + 𝜀𝑦𝑡
where the errors 𝜀𝑥𝑡 and 𝜀𝑦𝑡 are white noise N(0,1)
processes, independent of each other and the
autoregressive parameters are allowed to take
values of 0.0, 0.2, 0.5, 0.8 and 0.9.
Sample sizes are 50, 100, 500 and 1,000
observations.
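A version of the statistic based on estimated autoregressive parameters can be sketched as below. The slides do not state which estimator was used, so the lag-1 sample autocorrelation here is an assumption, chosen because it is always strictly between -1 and 1 and therefore keeps the variance expression positive. The initialization, seed and function names are also assumptions.

```python
import math
import random

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation, used here as the AR(1) estimate (assumption)."""
    T = len(x)
    m = sum(x) / T
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, T))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def sample_corr(x, y):
    """Sample correlation coefficient r."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_estimated(x, y):
    """r divided by the square root of Var(r) evaluated at estimated parameters."""
    T = len(x)
    px, py = lag1_autocorr(x), lag1_autocorr(y)
    var_r = (1.0 + px * py) / (T * (1.0 - px * py))
    return sample_corr(x, y) / math.sqrt(var_r)

def ar1(phi, T, rng, burn=200):
    """Simulate a stationary AR(1) series with N(0, 1) errors (burn-in assumed)."""
    x, out = 0.0, []
    for i in range(T + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if i >= burn:
            out.append(x)
    return out

def size_estimated(phi, T=100, reps=500, seed=0):
    """Share of |t| > 1.96 for two independent AR(1) series."""
    rng = random.Random(seed)
    hits = sum(abs(t_estimated(ar1(phi, T, rng), ar1(phi, T, rng))) > 1.96
               for _ in range(reps))
    return hits / reps

if __name__ == "__main__":
    print(size_estimated(0.8))
```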
φx = φy = φ

T        t-statistic   φ = 0.0     0.2      0.5      0.8      0.9
50       t                5.89     6.63    13.20    33.46    48.01
         t'               5.45     5.16     4.53     1.89     0.07
         t''              5.35     5.51     5.44     4.11     2.56
100      t                5.58     5.88    12.75    35.10    50.43
         t'               5.40     4.77     4.42     3.34     1.60
         t''              5.38     4.87     4.81     4.54     3.79
500      t                4.74     6.02    12.39    35.41    52.19
         t'               4.70     5.19     4.63     4.59     4.31
         t''              4.67     5.10     4.82     4.87     4.84
1000     t                5.22     6.12    12.55    35.89    52.61
         t'               5.21     5.22     4.55     4.64     4.73
         t''              5.20     5.23     4.63     4.93     5.13

Using the estimated values of the autoregressive
parameters we obtain better size for large values
of the autoregressive parameter.

This is probably because the estimates
were not so close to the true large values of the
autoregressive parameters for small and moderate
sample sizes.

For example, for φ = 0.9 and T = 100, the mean
values of the estimates were 0.8622 and 0.8614.
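The downward bias of the estimated autoregressive parameter in small samples is easy to reproduce. The sketch below averages the lag-1 sample autocorrelation over simulated AR(1) series. The estimator, the initialization and the number of replications are assumptions, so the means need not match the reported figures exactly.

```python
import random

def ar1(phi, T, rng, burn=200):
    """Simulate a stationary AR(1) series with N(0, 1) errors (burn-in assumed)."""
    x, out = 0.0, []
    for i in range(T + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if i >= burn:
            out.append(x)
    return out

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation, used here as the AR(1) estimate (assumption)."""
    T = len(x)
    m = sum(x) / T
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, T))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def mean_estimate(phi, T, reps=1000, seed=0):
    """Average estimate of phi over `reps` simulated series of length T."""
    rng = random.Random(seed)
    return sum(lag1_autocorr(ar1(phi, T, rng)) for _ in range(reps)) / reps

if __name__ == "__main__":
    for T in (50, 100, 500):
        print(T, round(mean_estimate(0.9, T), 4))
```

The average estimate lies below the true value and moves toward it as T grows, which is the pattern the slides point to.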
φx = φy = φ

T        Mean of estimate   φ = 0.0      0.2       0.5       0.8       0.9
50       φx                 -0.0196    0.1674    0.4474    0.7291    0.8202
         φy                 -0.0206    0.1660    0.4484    0.7298    0.8216
100      φx                 -0.0096    0.1842    0.4754    0.7651    0.8622
         φy                 -0.0099    0.1833    0.4747    0.7651    0.8614
500      φx                 -0.0016    0.1968    0.4951    0.7930    0.8927
         φy                 -0.0021    0.1975    0.4954    0.7935    0.8925
1000     φx                 -0.0005    0.1981    0.4972    0.7969    0.8964
         φy                 -0.0009    0.1987    0.4980    0.7964    0.8965

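Consider two linearly dependent AR(1) processes Xt and Yt for
t = 1, 2, …, T, such that:
$$ corr(X_t, Y_t) = \rho $$
where ρ is their correlation coefficient.

Using matrix notation, we may write:
$$ \begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \begin{pmatrix} \varphi_x & 0 \\ 0 & \varphi_y \end{pmatrix} \begin{pmatrix} X_{t-1} \\ Y_{t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_{xt} \\ \varepsilon_{yt} \end{pmatrix} $$
where the errors εxt and εyt are white noise N(0, 1) processes,
but not independent of each other.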



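Equivalently, the former equation can be
expressed as a VAR(1) model:
$$ \mathbf{Z}_t = \mathbf{F}\mathbf{Z}_{t-1} + \boldsymbol{\varepsilon}_t $$
where Zt and εt are vectors and F is a matrix.
It can be shown that:
$$ \boldsymbol{\Sigma} = \mathbf{F}\boldsymbol{\Sigma}\mathbf{F}' + \mathbf{Q} $$
where Σ and Q are the covariance matrices of Zt and εt,
respectively, defined as:
$$ \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} \quad \text{and} \quad \mathbf{Q} = \begin{pmatrix} 1 & q_{12} \\ q_{21} & 1 \end{pmatrix} $$
where σ11 and σ22 are the variances of Xt and Yt, respectively,
defined by an AR(1) process, σ12 or σ21 is their covariance and
q12 or q21 is the covariance of the error terms, which have unit
variances.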

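Using vectorization, the above equation becomes:
$$ vec(\boldsymbol{\Sigma}) = (\mathbf{I} - \mathbf{F} \otimes \mathbf{F}')^{-1}\, vec(\mathbf{Q}) $$
where vec stands for vectorization, ⊗ is the Kronecker product and I
is the identity matrix.
It is easy to show that the equation can be written explicitly as:
$$ \begin{pmatrix} \sigma_{11} \\ \sigma_{21} \\ \sigma_{12} \\ \sigma_{22} \end{pmatrix} = \begin{pmatrix} 1 - \varphi_x^2 & 0 & 0 & 0 \\ 0 & 1 - \varphi_x\varphi_y & 0 & 0 \\ 0 & 0 & 1 - \varphi_x\varphi_y & 0 \\ 0 & 0 & 0 & 1 - \varphi_y^2 \end{pmatrix}^{-1} \begin{pmatrix} 1 \\ q_{21} \\ q_{12} \\ 1 \end{pmatrix} $$
from which the desired correlation ρ between the two series is
determined by the following equation:
$$ \rho = \frac{\sigma_{21}}{\sqrt{\sigma_{11}\sigma_{22}}} = q_{12}\,\frac{\sqrt{(1 - \varphi_x^2)(1 - \varphi_y^2)}}{1 - \varphi_x\varphi_y} $$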

Hence, to generate two dependent AR(1) processes
with desired correlation ρ, we need to generate
random errors εxt and εyt with unit variances and
correlation given by:
$$ corr(\varepsilon_{xt}, \varepsilon_{yt}) = \rho\,\frac{1 - \varphi_x\varphi_y}{\sqrt{(1 - \varphi_x^2)(1 - \varphi_y^2)}} $$
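The recipe above can be sketched as follows: compute the required error correlation, draw correlated unit-variance normal errors, and feed them through the two AR(1) recursions. How the correlated errors are drawn (a linear combination of two independent normals), the burn-in, and all function names are assumptions made for illustration.

```python
import math
import random

def error_correlation(rho, phi_x, phi_y):
    """Error correlation needed so that corr(X_t, Y_t) = rho."""
    q = rho * (1.0 - phi_x * phi_y) / math.sqrt((1.0 - phi_x ** 2) * (1.0 - phi_y ** 2))
    if abs(q) > 1.0:
        raise ValueError("requested rho is not attainable for these parameters")
    return q

def correlated_ar1(rho, phi_x, phi_y, T, rng, burn=200):
    """Simulate two dependent stationary AR(1) series with corr(X_t, Y_t) = rho."""
    q = error_correlation(rho, phi_x, phi_y)
    x = y = 0.0
    xs, ys = [], []
    for i in range(T + burn):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        ex = e1
        ey = q * e1 + math.sqrt(1.0 - q * q) * e2  # unit variance, corr(ex, ey) = q
        x = phi_x * x + ex
        y = phi_y * y + ey
        if i >= burn:
            xs.append(x)
            ys.append(y)
    return xs, ys

def sample_corr(x, y):
    """Sample correlation coefficient r."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    xs, ys = correlated_ar1(0.6, 0.5, 0.5, 20000, random.Random(0))
    print(round(sample_corr(xs, ys), 3))
```

When φx = φy the factor multiplying ρ equals one, so the errors simply carry the target correlation. When the two parameters differ, the error correlation must exceed ρ in absolute value, and some targets are not attainable.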
φx = φy = φ, T = 50

ρ        t-statistic   φ = 0.0     0.2      0.5      0.8      0.9
0.2      t                29.8     30.5     34.5     45.1     53.3
         t'               28.5     26.3     17.6      5.4      0.2
         t''              28.3     26.9     20.0      9.9      4.9
0.4      t                84.3     83.6     78.9     71.1     70.0
         t'               83.3     80.5     61.5     19.9      1.4
         t''              83.3     81.3     65.4     30.8     13.4
0.6      t                99.7     99.6     98.6     92.8     88.2
         t'               99.7     99.5     95.8     54.1      7.4
         t''              99.7     99.5     96.8     69.3     36.8
0.8      t               100.0    100.0    100.0     99.8     98.8
         t'              100.0    100.0    100.0     93.7     37.3
         t''             100.0    100.0    100.0     97.7     74.6
φx = φy = φ, T = 100

ρ        t-statistic   φ = 0.0     0.2      0.5      0.8      0.9
0.2      t                53.4     53.1     52.3     54.7     60.5
         t'               52.6     49.1     34.1     13.3      4.9
         t''              52.5     49.8     35.9     16.4      9.1
0.4      t                98.7     98.5     96.4     86.9     81.0
         t'               98.6     98.2     90.7     47.9     19.2
         t''              98.7     98.2     91.5     55.4     30.4
0.6      t               100.0    100.0    100.0     99.2     96.3
         t'              100.0    100.0     99.9     89.1     52.7
         t''             100.0    100.0    100.0     93.2     68.5
0.8      t               100.0    100.0    100.0    100.0     99.9
         t'              100.0    100.0    100.0     99.9     93.1
         t''             100.0    100.0    100.0    100.0     97.9
φx = φy = φ, T = 500

ρ        t-statistic   φ = 0.0     0.2      0.5      0.8      0.9
0.2      t                99.3     99.2     97.6     89.2     82.0
         t'               99.3     99.1     94.2     56.2     30.4
         t''              99.3     99.0     94.4     57.5     32.6
0.4      t               100.0    100.0    100.0    100.0     99.4
         t'              100.0    100.0    100.0     99.4     86.1
         t''             100.0    100.0    100.0     99.5     88.7
0.6      t               100.0    100.0    100.0    100.0    100.0
         t'              100.0    100.0    100.0    100.0     99.9
         t''             100.0    100.0    100.0    100.0    100.0
0.8      t               100.0    100.0    100.0    100.0    100.0
         t'              100.0    100.0    100.0    100.0    100.0
         t''             100.0    100.0    100.0    100.0    100.0






The test has low power for small and moderate sample sizes, especially
for low values of the correlation coefficient, a result that has also been
indicated by Zimmerman et al. (2003) for normal populations.
However, as the sample size increases the power of the test also
increases, regardless of the values of ρ and φ.
In general the classical t test has larger power than the other two tests.
The difference is most pronounced for large values of the autoregressive
parameter, low values of the correlation coefficient and small sample
sizes.
For example, for T = 100, the null hypothesis is rejected 60.5% of the time for
ρ = 0.2 and φ = 0.9 using the classical t test, as opposed to 4.9% and
9.1% using the t' and the t'' statistics respectively.
For small values of the autoregressive parameters the power of the test
is very similar for all three statistics, regardless of the value of the
correlation coefficient.
On the other hand, the test has larger power using the t'' rather than
the t' statistic in all cases.
Spurious Regressions for non-linear
or time varying coefficient processes
Effects of ARCH on Spurious
Regressions
Story Number Three
Spurious with nonlinear series







Spurious results are known as a linear phenomenon.
Lee, Kim and Newbold (2005) investigated the possibility of spurious
nonlinear relationships between two independent random walks.
However, to the best of our knowledge, there is no study on the
spurious regression phenomenon for two independent nonlinear
processes.
Nonlinear models are discussed in Granger and Terasvirta (1993).
Chaos?
A special case of nonlinear models, AR(1) with ARCH(1)
effects, is discussed in Bera, Higgins and Lee (1996) and in
(1992).
Randomness of the autoregressive parameter apparently increases
the unconditional variability of both series, damping the degree of
linear relationship.
AR(1) Process

Consider a stationary AR(1) process with mean zero:
$$ X_t = \varphi X_{t-1} + \varepsilon_t $$
where
$$ \varepsilon_t \sim iid\,N(0, \sigma_\varepsilon^2) $$
and the unconditional variance of X is:
$$ Var(X_t) = \frac{\sigma_\varepsilon^2}{1 - \varphi^2} $$
for all absolute values of the autoregressive parameter
less than one.
Processes with Time Varying Parameters








ARMA models are typically considered to have constant
autoregressive and moving average parameters.
When taste, technology and policy change, it is difficult to assume
constant parameters.
For example, consider
$$ X_t = \varphi_t X_{t-1} + \varepsilon_t $$
where
$$ \varphi_t = m + a\varphi_{t-1} + u_t $$
This particular model structure is called a time-varying parameter
autoregressive model, i.e., VPAR(1).
φt is unobserved, but it can be estimated iteratively using the Kalman
filter.
In the case where a = 0, the above model gives a “random
coefficient model”, which is discussed in detail in Nicholls and
Quinn (1982).
There are several ways of defining φt.
Non linear AR Processes


Granger (2008) said that any linear AR model can be
approximated as a non-linear model.
For example, an AR(2) model with mean zero:
$$ X_t = \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \varepsilon_t $$
can be expressed as an AR(1) process with a random or time
varying coefficient, i.e.:
$$ X_t = \varphi_t X_{t-1} + \varepsilon_t $$
where
$$ \varphi_t = \varphi_1 + \varphi_2\,\frac{X_{t-2}}{X_{t-1}} $$
AR(1) Process with Random Coefficient
Consider:
$$ X_t = \varphi_t X_{t-1} + \varepsilon_t $$
where φt ~ iid N(0, α1) and εt ~ iid N(0, α0), for
$$ 0 \le \alpha_1 < 1 \quad \text{and} \quad \sigma^2 = \alpha_0 > 0 $$
The conditional mean and variance of Xt are:
$$ E(X_t \mid X_{t-1}) = X_{t-1} E(\varphi_t) + E(\varepsilon_t) = 0 $$
$$ V(X_t \mid X_{t-1}) = X_{t-1}^2 V(\varphi_t) + V(\varepsilon_t) = \alpha_0 + \alpha_1 X_{t-1}^2 $$
where the unconditional variance is defined as:
$$ V(X_t) = \frac{\alpha_0}{1 - \alpha_1} = \frac{\sigma^2}{1 - \alpha_1} $$
The process behaves like an ARCH(1).
AR(1) Process with Random Coefficient II
Consider:
$$ X_t = \varphi_t X_{t-1} + \varepsilon_t $$
where φt ~ iid N(φ, α1) and εt ~ iid N(0, α0), for
$$ |\varphi| < 1, \quad 0 \le \alpha_1 < 1 \quad \text{and} \quad \sigma^2 = \alpha_0 > 0 $$
The conditional mean and variance of Xt are:
$$ E(X_t \mid X_{t-1}) = X_{t-1} E(\varphi_t) + E(\varepsilon_t) = \varphi X_{t-1} $$
$$ V(X_t \mid X_{t-1}) = X_{t-1}^2 V(\varphi_t) + V(\varepsilon_t) = \alpha_0 + \alpha_1 X_{t-1}^2 $$
The process behaves like an AR(1) and ARCH(1).
This process is the general case.
So what is the unconditional variance of Xt?
Unconditional Variance for
AR(1) + ARCH(1) Process

The unconditional variance is defined as:
$$ V(X_t) = E[V(X_t \mid X_{t-1})] + V[E(X_t \mid X_{t-1})] $$
where
$$ E(X_t \mid X_{t-1}) = X_{t-1} E(\varphi_t) + E(\varepsilon_t) = \varphi X_{t-1} $$
$$ V(X_t \mid X_{t-1}) = X_{t-1}^2 V(\varphi_t) + V(\varepsilon_t) = \alpha_0 + \alpha_1 X_{t-1}^2 $$
Hence
$$ V(X_t) = E[\alpha_0 + \alpha_1 X_{t-1}^2] + V[\varphi X_{t-1}] $$
which is
$$ V(X_t) = \sigma^2 + \alpha_1 V(X_t) + \varphi^2 V(X_t) $$
or
$$ V(X_t) = \frac{\sigma^2}{1 - \varphi^2 - \alpha_1} $$
So the unconditional variance for an AR(1) + ARCH(1) depends on?
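The unconditional variance can be checked numerically. The sketch below simulates X_t = φX_{t-1} + u_t sqrt(α0 + α1 X²_{t-1}) with u_t ~ N(0, 1) and compares the sample variance of a long series with α0 / (1 − φ² − α1). The parameter values, the burn-in and the function names are assumptions. The check is reliable only when fourth moments are finite, which holds for the modest values used here.

```python
import math
import random

def unconditional_variance(phi, a0, a1):
    """V(X_t) = a0 / (1 - phi^2 - a1), defined only when the denominator is positive."""
    denom = 1.0 - phi ** 2 - a1
    if denom <= 0.0:
        raise ValueError("variance is not finite for these parameters")
    return a0 / denom

def simulate_ar1_arch1(phi, a0, a1, T, rng, burn=500):
    """Simulate an AR(1) process with conditional variance a0 + a1 * X_{t-1}^2."""
    x, out = 0.0, []
    for i in range(T + burn):
        # The right-hand side uses the lagged value of x in both the mean and variance.
        x = phi * x + rng.gauss(0.0, 1.0) * math.sqrt(a0 + a1 * x * x)
        if i >= burn:
            out.append(x)
    return out

def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

if __name__ == "__main__":
    # Hypothetical parameter values chosen for illustration.
    phi, a0, a1 = 0.5, 1.0, 0.2
    xs = simulate_ar1_arch1(phi, a0, a1, 100000, random.Random(0))
    print(round(sample_variance(xs), 3), round(unconditional_variance(phi, a0, a1), 3))
```

Setting α1 = 0 recovers the usual AR(1) variance σ²/(1 − φ²), and setting φ = 0 recovers the pure random coefficient case α0/(1 − α1).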
AR(1) + ARCH(1) Process
Consider:
$$ X_t = \varphi_t X_{t-1} + \varepsilon_t $$
where φt ~ iid N(φ, α1) and εt ~ iid N(0, α0), for
$$ |\varphi| < 1, \quad 0 \le \alpha_1 < 1 \quad \text{and} \quad \sigma^2 = \alpha_0 > 0 $$
Another way of defining this process is as an AR(1) process with
ARCH(1) errors in the variable, i.e.,
$$ X_t = \varphi X_{t-1} + \varepsilon_t $$
where
$$ \varepsilon_t = u_t \sqrt{\alpha_0 + \alpha_1 X_{t-1}^2} $$
and ut ~ iidN(0, 1).
Under this set up Xt is defined as:
$$ X_t \mid X_{t-1} \sim N(\varphi X_{t-1},\ \alpha_0 + \alpha_1 X_{t-1}^2) $$
Simulation Process

Consider two independent series Yt and Xt for t = 1, 2, …, T
generated by the AR(1) + ARCH(1) specification, previously defined,
under the following set up:
α0 = 1.0
α1 = 0.0, 0.2, 0.5, 0.8 and 0.9
φ = 0.0, 0.2, 0.5, 0.8, 0.9 and 1.0.
To examine the presence of a linear relationship between these two
variables the following model:
$$ Y_t = \alpha + \beta X_t + e_t $$
is estimated by simple ordinary least squares regression for
sample sizes of 50, 100 and 500 observations.
The objective is to determine how often the null hypothesis
β = 0 is rejected.
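The simulation design above can be sketched as follows. This is an illustration under assumptions, not a reproduction of the original experiment: the 1.96 critical value, the zero start value, the burn-in of 100 observations, the seed and the function names are all choices made here.

```python
import math
import random

def ar1_arch1(phi, a0, a1, T, rng, burn=100):
    """Simulate X_t = phi * X_{t-1} + u_t * sqrt(a0 + a1 * X_{t-1}^2), u_t ~ N(0, 1)."""
    x, out = 0.0, []
    for i in range(T + burn):
        x = phi * x + rng.gauss(0.0, 1.0) * math.sqrt(a0 + a1 * x * x)
        if i >= burn:
            out.append(x)
    return out

def slope_t(y, x):
    """OLS t statistic for beta in Y_t = alpha + beta * X_t + e_t."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    rss = syy - beta * sxy               # residual sum of squares
    se = math.sqrt(rss / (T - 2) / sxx)  # standard error of beta
    return beta / se

def rejection_rate(phi, a1, T, reps=1000, a0=1.0, seed=0):
    """Proportion of replications with |t| > 1.96 for two independent series."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = ar1_arch1(phi, a0, a1, T, rng)
        x = ar1_arch1(phi, a0, a1, T, rng)
        if abs(slope_t(y, x)) > 1.96:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    for phi, a1 in [(1.0, 0.0), (0.9, 0.0), (0.9, 0.9)]:
        print(phi, a1, rejection_rate(phi, a1, T=100))
```

With α1 = 0 the generator reduces to a plain AR(1) (or a random walk when φ = 1), so the linear cases are covered by the same code.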
Simulation Results

CASE I: Linear Processes:
φ = 1.0 and α1 = 0.0. Yt and Xt are independent random walks,
which corresponds to the Granger and Newbold (1974) set up.
The rejection proportion increases as the sample size increases;
for T = 500 it is almost 90%.
φ < 1.0 and α1 = 0.0. Yt and Xt are independent AR(1)
processes, which corresponds to the Granger et al. (2001) set up.
The rejection proportion increases only as the value of the
autoregressive parameter increases and not as the sample size
increases.
Linear processes can be seen as a special case of non-linear processes.
Simulation Results

CASE II: Non-Linear Processes:
φ = 1.0 and different values of α1. As we increase the degree
of non-linearity the rejection proportion decreases, i.e., for α1 =
0.9 and T = 500 it becomes 0.074, close to the nominal level,
as opposed to 0.896 in the absence of ARCH.
This implies that the performance of unit root tests and
cointegration tests might be strongly affected by conditional
heteroscedasticity in time series variables.
φ < 1.0 and different values of α1. The spurious results found
by Granger et al. (2001) almost disappear.
For example, for φ = 0.9 and α1 = 0.9 at the 5% nominal level the proportion
of rejections becomes 0.089 for T = 500, as opposed to 0.516 in
the absence of ARCH.
Spurious regression essentially looks for the presence of linear
relationships between two independent series.
Randomness of the autoregressive parameter apparently increases
the unconditional variability of both series, damping the degree of
linear relationship.
Proportions of rejections of the null hypothesis
that β = 0 at the 5% nominal level based on 1,000
replications

α1       T      φ = 1.0     0.9      0.8      0.5      0.2      0.0
0.0      50       0.682    0.462    0.351    0.130    0.068    0.055
         100      0.762    0.519    0.349    0.127    0.061    0.047
         500      0.896    0.516    0.347    0.145    0.055    0.048
0.2      50       0.445    0.361    0.276    0.128    0.063    0.044
         100      0.474    0.385    0.292    0.121    0.068    0.049
         500      0.401    0.377    0.281    0.127    0.067    0.040
0.5      50       0.257    0.216    0.211    0.089    0.053    0.064
         100      0.217    0.200    0.213    0.100    0.052    0.060
         500      0.129    0.185    0.195    0.123    0.050    0.037
0.8      50       0.161    0.153    0.128    0.095    0.062    0.047
         100      0.151    0.125    0.137    0.078    0.055    0.060
         500      0.090    0.091    0.098    0.090    0.060    0.052
0.9      50       0.154    0.149    0.154    0.077    0.067    0.058
         100      0.142    0.133    0.097    0.085    0.049    0.058
         500      0.074    0.089    0.084    0.073    0.056    0.053
Note: For each value of α1 the three rows correspond to sample sizes of 50, 100 and 500 observations.
General Comments
 It is very difficult to analyze time series data.
 To avoid spurious behaviors, do not use time
series data.
 Spurioucity does exist in Greece.