Semiparametric Estimation of Partially Linear Varying Coefficient
Models with Time Trend and Nonstationary Regressors
Yichen Gao∗
Zheng Li†
Zhongjian Lin‡
August 18, 2014
Abstract
This paper extends the partially linear varying coefficient model to contain time trend
and nonstationary variables as regressors. We use the profile likelihood method to estimate both the time trend coefficient in the linear component and the functional coefficients in the nonlinear component, and we establish their asymptotic distributions. Monte Carlo simulations are conducted to investigate the finite sample performance of the proposed estimators.
Keywords: Partially linear varying coefficient model; time trend; integrated time series.
JEL: C14; C22; C51.
∗ Corresponding Author, International School of Economics and Management, Capital University of Economics
and Business, Beijing 100070, China, yichengao@gmail.com
† Department of Economics, Texas A&M University, College Station, TX 77843, lzrain@tamu.edu
‡ Department of Economics, Emory University, Atlanta, GA 30322, zhongjian.lin@emory.edu
1 Introduction
Recently, there has been increasing attention among econometricians and statisticians to nonparametric regression models with non-stationary covariates. Karlsen, Myklebust and Tjøstheim (2007), Wang and Phillips (2009a,b), Kasparis and Phillips (2012) and Liang, Lin and Hsiao (2013) consider the nonparametric cointegration model of the following form:

yt = g(xt) + ut,   t = 1, · · · , n   (1)
where xt follows a drift-less unit root I(1) process, the functional form of g(·) is not specified,
and ut is a zero mean stationary I(0) error term. Model (1) extends the standard nonparametric regression model with independent or weakly dependent data to the strongly dependent, nonstationary data case. One restrictive feature of model (1) is that the dimension of xt cannot be greater than two: when the dimension of xt exceeds two, the unit root process becomes non-recurrent, and hence, within a shrinking interval near x, say [x − h, x + h] with h = hn → 0 as n → ∞, the number of data points that fall inside [x − h, x + h] does not increase to ∞ as n → ∞. Consequently, the nonparametric kernel method cannot deliver a consistent estimator of g(x) if the dimension of x is greater than two.
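The recurrence argument above is easy to illustrate numerically. The sketch below is an illustration we add here, not part of the original analysis; the window half-width, sample sizes, and Gaussian increments are our own choices. It counts how many observations of a driftless random walk fall in a fixed window around the origin: in one dimension the local sample size keeps growing with n, while in three dimensions the walk is non-recurrent and the count essentially stops growing.

```python
import numpy as np

def local_visits(dim, n, h, seed=0):
    """Number of time points t <= n at which a dim-dimensional driftless
    random walk lies inside the cube [-h, h]^dim around the origin."""
    rng = np.random.default_rng(seed)
    walk = np.cumsum(rng.standard_normal((n, dim)), axis=0)
    return int(np.all(np.abs(walk) <= h, axis=1).sum())

# Recurrent case (dim = 1) versus transient case (dim = 3): for the same
# seed, compare the local sample size as n grows tenfold.
for dim in (1, 3):
    print(dim, [local_visits(dim, n, h=1.0) for n in (10_000, 100_000)])
```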
To avoid the above problem and also allow for flexible functional forms in a nonparametric
regression model with I(1) covariates, Cai, Li and Park (2009) and Xiao (2009) suggest using a
varying coefficient framework to model the relationship among non-stationary variables:

yt = xt′ β(zt) + ut,   t = 1, · · · , n   (2)
where xt is a d × 1 vector of drift-less I(1) processes, zt and ut are scalar weakly dependent stationary I(0) variables, and the functional form of β(·) is unspecified. Sun and Li (2011) study the asymptotic behavior of the data-driven least squares cross-validation method for selecting the bandwidth h. Sun, Cai and Li (2013) further extend model (2) to the case that both xt and zt are drift-less I(1) variables.
Li et al. (2013) consider a partially linear varying coefficient model:

yt = x1t′ γ + x2t′ β(zt) + ut,   t = 1, · · · , n   (3)

where x1t and x2t are d1 × 1 and d2 × 1 I(1) non-stationary variables, γ is a d1 × 1 vector of constant parameters, β(zt) is a d2 × 1 vector function of zt, and zt and ut are scalar stationary variables.
Li et al. (2013) suggest using a qth order local polynomial method to estimate γ and β(z). They show that γ can be estimated at the parametric rate of Op(n^{−1}), while the estimator of β(z) has the nonparametric rate of convergence Op(h^{q+1} + (n√h)^{−1}).
Juhl and Xiao (2005) consider the following partially linear model with an integrated covariate:

yt = γyt−1 + g(xt) + ut,   t = 1, · · · , n   (4)

where γ = 1 or is very close to one, and xt and ut are stationary I(0) variables.
One common feature of the above works is that all the non-stationary cointegrated variables (i.e., the I(1) covariates) are driftless unit root processes. Hence, this type of model cannot capture the trending behavior often observed in macroeconomic and financial variables.
Juhl and Xiao (2009) consider a nonparametric time-trend model, but they normalize the
time trend variable as τ = τn = t/n for t = 1, · · · , n, so the regression model has the form

yt = g(τn) + ut = g(t/n) + ut,   t = 1, · · · , n   (5)

where g(·) is a smooth, unspecified function. Since τ lies inside the unit interval [0, 1] and g(·) is smooth, supτ∈[0,1] |g(τ)| ≤ C, where C is a finite positive constant. Hence, model (5) cannot be used to model upward trending behavior.
Liang and Li (2012) consider a time trend varying coefficient model of the form

yt = tγ + xt′ β(zt) + ut,   t = 1, · · · , n   (6)

where γ is an unknown constant parameter so that yt is a unit root or near unit root process, β(z) is a smooth but otherwise unspecified function of z, and xt, zt and ut are weakly dependent stationary variables. Although Liang and Li (2012) explicitly allow for a time trend variable in their semiparametric varying coefficient model, their model does not contain any non-stationary I(1) covariates.
Li and Li (2013) consider a time-trend varying coefficient model of the following form:

yt = tγ(zt) + xt′ β(zt) + ut,   t = 1, · · · , n   (7)

where γ(z) and β(z) are both smooth, unspecified functions. Li and Li (2013) derive rate of convergence results. Let γ̂(z) and β̂(z) denote the local constant kernel estimators of γ(z) and β(z), respectively. Li and Li (2013) establish that β̂(z) − β(z) = Op(1/√(nh²) + √h) and γ̂(z) − γ(z) = Op(1/√(nh²) + √h), but they do not provide asymptotic distributions of γ̂(z) and β̂(z).
In this paper we consider the problem of estimating a time-trend partially linear varying coefficient model of the form given in (6) but, unlike model (6), we allow xt to be a non-stationary I(1) regressor. Specifically, we extend the partially linear varying coefficient model to contain a time trend as the linear component and nonstationary variables with varying coefficient functions. We show that the estimator of the parametric component has the same rate of convergence as in the case when the function β(z) is known, and we derive its asymptotic distribution. Once the fast rate of convergence of the parametric component estimator is established, the asymptotic behavior of the estimator of the nonparametric function β(z) follows from the existing work of Cai, Li and Park (2009) and Xiao (2009).
The remainder of the paper is organized as follows. In Section 2, we consider the problem of estimating a semiparametric time trend varying coefficient model. We use the profile likelihood method to estimate both the time trend coefficient and the functional coefficients of the nonparametric component, and we establish their asymptotic distributions. Section 3 reports Monte Carlo simulation results investigating the finite sample performance of our proposed estimators. We conclude in Section 4. Mathematical proofs are relegated to the Appendix.
2 The Model, Estimators and Their Asymptotic Distributions
We consider the following partially linear varying coefficient model, in which a time trend component enters the model linearly while some I(1) regressors enter the model semiparametrically, in the sense that their coefficients are smooth functions of a stationary variable:

yt = tγ + xt′ β(zt) + ut,   t = 1, · · · , n   (8)

where t is the time trend, γ is an unknown constant coefficient associated with the time trend variable, and xt is a p × 1 vector of I(1) regressors; that is, xt = xt−1 + vt with vt a p × 1 vector of zero mean weakly dependent I(0) variables. The prime denotes the transpose of a matrix. β(·) is a p-dimensional unspecified varying coefficient function, zt is a scalar I(0) variable,¹ and ut is a stationary error term satisfying E(ut | xt, zt) = 0.
We propose estimating the unknown parameter γ and the unknown function β(·) by the profile least squares method. First, treating γ as if it were known, we can re-write (8) as

yt − tγ = xt′ β(zt) + ut.   (9)

We then estimate β(zt) by a local polynomial (of order q) kernel estimation method. Let e ≡ (Ip, Np×pq) and Gh ≡ diag(1, h, · · · , h^q) ⊗ Ip, where h is the smoothing parameter, Ip is a p × p identity matrix, Np×pq is a p × pq null matrix (a matrix with all elements equal to zero), and ⊗ denotes the Kronecker product. Define
zst,h ≡ (zs − zt)/h,   Kst ≡ h^{−1} K(zst,h),   Qst ≡ [1, (zs − zt), · · · , (zs − zt)^q]′

where K(·) is a kernel function. Then the estimator of β(zt) is given by

β̃(zt) = e [ ∑_{s=1}^n Kst (Qst Qst′ ⊗ xs xs′) ]^{−1} ∑_{s=1}^n Kst (Qst ⊗ xs)(ys − sγ)
      = e Gh^{−1} [ ∑_{s=1}^n Kst Gh^{−1} (Qst Qst′ ⊗ xs xs′) Gh^{−1} ]^{−1} ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs)(ys − sγ)
      = e [ (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst Qst′ ⊗ xs xs′) Gh^{−1} ]^{−1} (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs)(ys − sγ)
      = A1t − A2t γ   (10)

where we use e Gh^{−1} = e, and

A1t = e St^{−1} [ (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) ys ],
A2t = e St^{−1} [ (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) s ],
St = (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst Qst′ ⊗ xs xs′) Gh^{−1}.
Note that β̃(zt) in (10) is not a feasible estimator of β(zt) because it depends on the unknown parameter γ. In order to obtain feasible estimators for γ and β(·), we add and subtract xt′ β̃(zt) in (8); rearranging terms gives

yt − xt′ A1t = (t − xt′ A2t)γ + ǫt   (11)

¹ It is straightforward to extend the scalar zt to the multivariate zt case. For notational simplicity, we only consider the scalar zt case in this paper.
where ǫt = ut + xt′ [β(zt) − β̃(zt)]. The ordinary least squares (OLS) estimator of γ based on (11) is given by

γ̂ = [ ∑_{t=1}^n (t − xt′ A2t)² ]^{−1} ∑_{t=1}^n (t − xt′ A2t)(yt − xt′ A1t)   (12)
Note that γ̂ defined in (12) is a feasible estimator of γ because A1t and A2t depend only on observed variables and are therefore computable. Substituting (12) into (8), one can obtain a feasible local polynomial estimator of β(zt).
The use of local polynomial estimation (with q ≥ 1) is needed to ensure that γ̂ − γ = Op(n^{−3/2}) has a parametric rate of convergence. When estimating β(z), there is no need to use a higher-order local polynomial method; local linear or even local constant methods can be used to estimate β(z) consistently. We suggest using the local linear method because, as shown in Sun and Li (2011), for a varying coefficient model with xt an I(1) variable, the local linear estimator of β(z) has a faster convergence rate than the local constant estimator. A local linear estimator of β(z), along with an estimator of the derivative β(1)(z) = dβ(z)/dz, is given by

( β̂(z)′, β̂(1)(z)′ )′ = [ ∑_{s=1}^n Kh,sz ( xs xs′ , (zs − z) xs xs′ ; (zs − z) xs xs′ , (zs − z)² xs xs′ ) ]^{−1} ∑_{s=1}^n Kh,sz ( xs′ , (zs − z) xs′ )′ (ys − sγ̂)   (13)

where Kh,sz = h^{−1} K((zs − z)/h) and (A, B; C, D) denotes the 2 × 2 block matrix with rows (A, B) and (C, D). Before establishing the asymptotic distributions of γ̂ and β̂(z), we need to make some regularity assumptions. Recall that xt = xt−1 + vt, where vt is a stationary and weakly dependent random vector process to be specified below. Also, let ‖·‖ denote the Euclidean norm.
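Given γ̂, the second-stage local linear estimator in (13) admits an equally short sketch (again for scalar xt, with a Gaussian kernel standing in for the compactly supported K of Assumption 5; the function name is ours):

```python
import numpy as np

def beta_hat_ll(y, x, z, t, gamma_hat, z0, h):
    """Local linear estimate of (beta(z0), beta'(z0)) from equation (13),
    for scalar x_t, using y_t - t*gamma_hat as the working response."""
    dz = z - z0
    w = np.exp(-0.5 * (dz / h) ** 2) / h       # kernel weights K_{h,sz}
    W = np.column_stack([x, dz * x])           # [x_s, (z_s - z0) x_s]
    Ww = W * w[:, None]
    b = np.linalg.solve(Ww.T @ W, Ww.T @ (y - t * gamma_hat))
    return b[0], b[1]                          # beta(z0) and its derivative
```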
Assumption 1. Let νt = (vt′, ut, zt). {νt} is a strictly stationary α-mixing process with mixing coefficients α(m) = O(ρ^m) for some 0 < ρ < 1, and E[‖νt‖⁴] < ∞. Also, let ξt = (vt′, ut)′. Then ξt has zero mean, Var(ξt) = Σ, and n^{−1/2} ∑_{t=1}^{[nr]} ξt ⇒ Bv,u(r) = (Bv(r), Bu(r)), where [a] denotes the integer part of a, ⇒ denotes weak convergence, and Bv,u(r) is a Brownian motion with covariance matrix Σ.
Assumption 2. Let (ut, Fnt, 1 ≤ t ≤ n) be a martingale difference sequence with E(ut | Fn,t−1) = 0 a.s. and E(ut² | Fn,t−1) = σu² a.s., where Fnt = σ{s1, zs1, us2 : 1 ≤ s1 ≤ n, 1 ≤ s2 ≤ t}.
Assumption 3. The function β(z) has (q + 1)-th order continuous derivatives for z ∈ Sz , where Sz is
the compact support of zt .
Assumption 4. The density function f(z) of zt is positive and bounded away from zero and infinity on Sz, and has a second-order continuous derivative on Sz. Furthermore, the joint density function of (zt, zs) is bounded for all s > t.
Assumption 5. K(·) is a bounded, continuous probability density function with compact support.
Assumption 6. nh^{q+1} → 0 and nh/ln n → ∞ as n → ∞.
We present the asymptotic distributions of γ̂ and β̂(z) below and relegate the proofs to the Appendix.
Theorem 1. Under Assumptions 1 to 6, we have

n^{3/2} (γ̂ − γ) →d W1^{−1} W2

where

W1 = 1/3 − ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr

and

W2 = ∫₀¹ r dBu(r) − ∫₀¹ Bx(r)′ dBu(r) [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr.
Theorem 1 shows that γ̂ − γ = Op(n^{−3/2}) has the same fast convergence rate as in a parametric time trend model (i.e., when β(zt) is known). This result is expected: it is well known that for semiparametric models, the parametric components can usually be estimated with the same rate of convergence as in the counterpart parametric models (where the nonparametric functions are known).
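Because W1 and W2 are functionals of the Brownian motions Bx and Bu, draws from the limit law W1^{−1}W2 can be approximated by discretizing the two processes on a fine grid. The sketch below is our own illustration, assuming independent Bx and Bu with unit variances and a grid of m steps:

```python
import numpy as np

def draw_limit(p=1, m=2000, seed=0):
    """One approximate draw from W1^{-1} W2, the limit of n^{3/2}(gamma_hat - gamma),
    with the integrals in Theorem 1 replaced by Riemann / Ito sums on m grid points."""
    rng = np.random.default_rng(seed)
    r = np.arange(1, m + 1) / m
    dBu = rng.standard_normal(m) / np.sqrt(m)                  # increments of B_u
    Bx = np.cumsum(rng.standard_normal((m, p)) / np.sqrt(m), axis=0)
    rb = (r[:, None] * Bx).mean(axis=0)                        # int r B_x dr
    Q = np.linalg.inv(Bx.T @ Bx / m)                           # (int B_x B_x' dr)^{-1}
    W1 = 1.0 / 3.0 - rb @ Q @ rb
    W2 = r @ dBu - (Bx * dBu[:, None]).sum(axis=0) @ Q @ rb    # int r dB_u - ...
    return W2 / W1
```

Repeated calls with different seeds give a Monte Carlo approximation to the non-standard limit distribution.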
With the result of Theorem 1, it is easy to establish the asymptotic distribution of β̂(z). We first need to introduce some notation. Denote β(2)(z) = d²β(z)/dz², μ2 = ∫ v² K(v) dv and ν0 = ∫ K²(v) dv. Also, let S = ∫₀¹ Bv(r) Bv(r)′ dr. The following theorem gives the asymptotic distribution of (β̂(z)′, β̂(1)(z)′)′.
Theorem 2. Under Assumptions 1 to 6, we have

n√h [ β̂(z) − β(z) − h² μ2 β(2)(z)/2 ] →d MN(Σβ(z))

where MN(Σβ(z)) is a mixed normal distribution with mean zero and conditional covariance matrix Σβ(z) = σu² ν0 S^{−1} / f(z).

We report simulation results in the next section.
3 Monte Carlo Simulations
In this section we use Monte Carlo simulations to investigate the finite sample performance of
the proposed estimators γ̂ and β̂(zt). We consider the following data generating process:

yt = tγ + xt′ β(zt) + ut,

where γ = 1, xt = xt−1 + vt with vt i.i.d. N(0, 1), and ut i.i.d. N(0, 1). For β(zt) we consider several choices:

DGP1: β(z) = 1 + z
DGP2: β(z) = 1 + 0.5 sin(πz)
DGP3: β(z) = e^z − z²
DGP4: β(z) = Φ(z)

where Φ(·) is the standard normal cumulative distribution function. Liang and Li (2012) used DGP1 and DGP2 in their
simulations, while DGP3 is used by Li and Li (2013). zt follows an AR(1) process, zt = 0.5zt−1 + εt, with εt i.i.d. uniform on [−1, 1]. {vt}, {εt} and {ut} are mutually independent. The smoothing parameter is h = c0 zsd n^{−1/5}, where c0 = 0.75, 1, 1.25 and 1.5, and zsd is the sample standard deviation of {zt}_{t=1}^n. We consider sample sizes n = 100, 200, 400 and 800; the number of replications is M = 2000. The Mean Squared Errors (MSE) are calculated as follows:
MSEγ̂ = (1/M) ∑_{j=1}^M (γ̂j − γ)²

MSEβ̂ = (1/(Mn)) ∑_{j=1}^M ∑_{t=1}^n [ β̂j(zt) − β(zt) ]²
where j refers to the jth simulation replication.
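The simulation design above (random walk xt, AR(1) zt with uniform innovations, N(0, 1) errors, and h = c0 zsd n^{−1/5}) can be sketched as follows for the scalar-x case; the function name, seeding, and the returned tuple are our own choices:

```python
import numpy as np

def simulate(n, beta_fn, c0=1.0, gamma=1.0, seed=0):
    """One sample from the Monte Carlo design of Section 3 (scalar x_t):
    random walk x_t, AR(1) z_t with uniform innovations, N(0,1) errors,
    and bandwidth h = c0 * sd(z) * n^(-1/5)."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1, dtype=float)
    x = np.cumsum(rng.standard_normal(n))        # x_t = x_{t-1} + v_t
    eps = rng.uniform(-1.0, 1.0, n)
    z = np.empty(n)
    z[0] = eps[0]
    for i in range(1, n):
        z[i] = 0.5 * z[i - 1] + eps[i]           # z_t = 0.5 z_{t-1} + eps_t
    u = rng.standard_normal(n)
    y = t * gamma + x * beta_fn(z) + u
    h = c0 * z.std() * n ** (-1 / 5)
    return y, x, z, t, h

y, x, z, t, h = simulate(400, beta_fn=lambda z: 1 + 0.5 * np.sin(np.pi * z))  # DGP2
```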
When estimating γ by γ̂, we use the local linear (q = 1) kernel method. With γ̂ in hand, we estimate β̂(·) using both local constant (LC) and local linear (LL) methods, because both give consistent estimates of β(·). However, as shown in Sun and Li (2011), MSEβ̂LL = O((n²h)^{−1} + h⁴) = O(n^{−8/5}) if h ∼ n^{−2/5}, and MSEβ̂LC = O((n²h)^{−1} + h/n) = O(n^{−3/2}) if h ∼ n^{−1/2}. Hence, β̂LL(·) has a faster rate of convergence than β̂LC(·).
Tables 1 and 2 report the Monte Carlo simulation results for the four DGPs. We use n³MSEγ̂ and n^{8/5}MSEβ̂LL to verify the rates of convergence of γ̂ and β̂LL derived in Theorems 1 and 2 of Section 2. The results for n^{3/2}MSEβ̂LC in the last column of both tables are also reported to compare the different convergence rates of the local linear and local constant estimators. For convenience of comparing MSEβ̂LL and MSEβ̂LC, we also report n^{3/2}MSEβ̂LL.
From the third and fourth columns of Tables 1 and 2, we find that the scaled simulation results tend to stabilize around constant values when n is large. This numerically verifies the stated rates of convergence of MSEγ̂ and MSEβ̂LL: specifically, MSEγ̂ = O(n^{−3}) and MSEβ̂LL = O(n^{−8/5}). Also, the numbers in the sixth column support that the local constant estimator has an MSE rate of convergence of n^{−3/2}. Finally, from the fifth and the sixth columns we can compare the MSEs of the local linear and the local constant estimators. The results are mixed: for some cases the local linear estimator has a smaller estimated MSE, while for others the local constant method has a smaller estimated MSE. Even though the local linear estimator has a faster rate of convergence than the local constant estimator, in finite samples it is possible for the local constant estimator to have a smaller estimated MSE. For sufficiently large sample sizes, the local linear estimator will dominate the local constant estimator, as the asymptotic theory predicts.
DGP1 is a linear regression model; hence, as expected, the local linear method's performance improves as c0 increases. DGP2 exhibits the most nonlinearity among the four DGPs, and the c0 values considered in the simulations seem to be too large for the local constant estimator for this DGP: the local linear estimator has a smaller MSE than the local constant estimator for DGP2. For DGP3, the local constant estimator has a smaller (larger) MSE than the local linear estimator for small (large) values of c0 (or h). The normal cumulative distribution function in DGP4 is monotone; for this DGP, the local constant estimator has a smaller MSE than the local linear estimator for the sample sizes we considered.
Table 1: Monte Carlo Simulation Results

DGP1
c0     n     n³MSEγ̂   n^{8/5}MSEβ̂LL   n^{3/2}MSEβ̂LL   n^{3/2}MSEβ̂LC
0.75   100    83.38     96.00            60.57            20.15
0.75   200    79.53     88.46            52.07            15.13
0.75   400    88.49     48.30            26.53            13.66
0.75   800    94.15     56.51            28.96            15.44
1      100    81.16     38.13            24.06            14.21
1      200    83.98     33.91            19.96            13.20
1      400    81.97     26.08            14.33            13.97
1      800    83.08     18.53             9.49            17.32
1.25   100    93.42     28.67            18.09            15.21
1.25   200    89.58     21.48            12.65            15.67
1.25   400    98.11     17.13             9.41            20.14
1.25   800    93.52     12.76             6.54            27.70
1.5    100    79.76     23.82            15.03            16.84
1.5    200    81.77     18.00            10.60            20.37
1.5    400    85.64     14.52             7.97            28.07
1.5    800    87.04     11.06             5.67            46.33

DGP2
c0     n     n³MSEγ̂   n^{8/5}MSEβ̂LL   n^{3/2}MSEβ̂LL   n^{3/2}MSEβ̂LC
0.75   100    99.35     99.24            62.74            49.73
0.75   200    89.91     74.51            43.87            75.82
0.75   400    84.38     65.46            35.96            99.30
0.75   800    96.53     68.96            35.34           113.27
1      100   111.07     45.17            28.50            81.15
1      200   115.24     37.62            22.15           121.36
1      400   102.11     34.85            19.14           172.53
1      800   122.78     31.55            16.17           269.37
1.25   100   155.24     35.80            22.59           133.32
1.25   200   161.38     32.62            19.20           230.89
1.25   400   161.06     33.14            18.20           411.30
1.25   800   158.11     41.57            21.30           630.59
1.5    100   213.44     35.81            22.60           213.39
1.5    200   267.24     37.31            21.97           379.91
1.5    400   229.86     43.77            24.04           491.42
1.5    800   272.19     63.75            32.67           862.94
4 Conclusion
In this paper we consider the problem of estimating a semiparametric time trend model in which the time trend component enters the model linearly, while the non-stationary covariate xt and the stationary covariate zt enter the model via a varying coefficient form, xt′ β(zt). We derive the asymptotic distributions of the finite dimensional parameter estimator γ̂ and the infinite dimensional function estimator β̂(·). The results of this paper can be generalized in several directions. First, we can extend model (8) to the case that both xt and zt are non-stationary I(1) variables; the results of Sun, Cai and Li (2013) should be useful in deriving the asymptotic distributions of the estimators of this extended model. Second, we can develop model specification tests of whether β(z) has a known parametric functional form. Various nonpara-
Table 2: Monte Carlo Simulation Results

DGP3
c0     n     n³MSEγ̂   n^{8/5}MSEβ̂LL   n^{3/2}MSEβ̂LL   n^{3/2}MSEβ̂LC
0.75   100    83.38     96.00            60.57            20.15
0.75   200    79.53     88.46            52.07            15.13
0.75   400    88.49     48.30            26.53            13.66
0.75   800    94.15     56.51            28.96            15.44
1      100    81.16     38.13            24.06            14.21
1      200    83.98     33.91            19.96            13.20
1      400    81.97     26.08            14.33            13.97
1      800    83.08     18.53             9.49            17.32
1.25   100    93.42     28.67            18.09            15.21
1.25   200    89.58     21.48            12.65            15.67
1.25   400    98.11     17.13             9.41            20.14
1.25   800    93.52     12.76             6.54            27.70
1.5    100    79.76     23.82            15.03            16.84
1.5    200    81.77     18.00            10.60            20.37
1.5    400    85.64     14.52             7.97            28.07
1.5    800    87.04     11.06             5.67            46.33

DGP4
c0     n     n³MSEγ̂   n^{8/5}MSEβ̂LL   n^{3/2}MSEβ̂LL   n^{3/2}MSEβ̂LC
0.75   100    99.35     99.24            62.74            49.73
0.75   200    89.91     74.51            43.87            75.82
0.75   400    84.38     65.46            35.96            99.30
0.75   800    96.53     68.96            35.34           113.27
1      100   111.07     45.17            28.50            81.15
1      200   115.24     37.62            22.15           121.36
1      400   102.11     34.85            19.14           172.53
1      800   122.78     31.55            16.17           269.37
1.25   100   155.24     35.80            22.59           133.32
1.25   200   161.38     32.62            19.20           230.89
1.25   400   161.06     33.14            18.20           411.30
1.25   800   158.11     41.57            21.30           630.59
1.5    100   213.44     35.81            22.60           213.39
1.5    200   267.24     37.31            21.97           379.91
1.5    400   229.86     43.77            24.04           491.42
1.5    800   272.19     63.75            32.67           862.94
metric/semiparametric specification tests have been developed for nonparametric/semiparametric regression models with non-stationary data. For example, testing the null hypothesis that g(x) has a parametric functional form, say that g(x) is linear in x, has been considered by Gao et al. (2009) and Wang and Phillips (2012). Gao et al. (2009) consider the problem of testing a linear cointegration model, yt = β0 + xt′ β1 + ut, against a nonlinear cointegration model, yt = g(xt) + ut, where {xt}_{t=1}^n is a random walk process independent of {ut}. Wang and Phillips (2012) consider a similar testing problem but relax many of the restrictive assumptions to allow for more general nonstationary processes for {xt}; for example, Wang and Phillips (2012) do not require {xt}_{t=1}^n to be independent of {ut}_{t=1}^n. Sun, Cai and Li (2013) consider the problem of testing whether β(z) in (2) equals a vector of constant parameters, or whether it has a known parametric functional form. The results of Gao et al. (2009) and Sun, Cai and Li (2013) will be useful in developing model specification tests for the partially linear time trend varying coefficient model considered in this paper.
References
Cai, Z. (2007): Trending time varying coefficient time series models with serially correlated
errors, Journal of Econometrics 136, 163-188.
Cai, Z., J. Fan, and Q. Yao (2000): Functional coefficient regression models for nonlinear time
series, Journal of the American Statistical Association 95, 941-956.
Cai, Z., Q. Li, and J.Y. Park (2009): Functional-coefficient models for nonstationary time series
data, Journal of Econometrics 148, 101-113.
Gao, J., M. King, Z. Liu, and D. Tjøstheim (2009): Nonparametric specification testing for nonlinear time series with nonstationarity, Econometric Theory 25, 1869-1892.
Hansen, B.E. (1992): Convergence to stochastic integrals for dependent heterogeneous processes, Econometric Theory 8, 489-500.
Hansen, B.E. (2008): Uniform convergence rates for kernel estimation with dependent data,
Econometric Theory 24, 1-23.
Juhl, T., and Z. Xiao (2005): Partially linear models with unit roots, Econometric Theory 21, 877-906.
Juhl, T., and Z. Xiao (2009): Tests for changing mean with monotonic power, Journal of Econometrics 148, 14-24.
Karlsen H.A., T. Myklebust, and D. Tjøstheim (2007): Nonparametric estimation in a nonlinear
cointegration type model, Annals of Statistics 35, 252-299.
Kasparis, I., and P.C.B. Phillips (2012): Dynamic misspecification in nonparametric cointegration, Journal of Econometrics 168, 270-284.
Li, K., D. Li, Z. Liang, and C. Hsiao (2013): Estimation of semi-varying coefficient models with nonstationary regressors, Working Paper.
Li, Q., C.J. Huang, D. Li, and T. Fu (2002): Semiparametric smooth coefficient models, Journal of Business and Economic Statistics 20, 412-422.
Li, K., and W. Li (2013): Estimation of varying coefficient models with time trend and integrated regressors, Economics Letters 119, 89-93.
Liang, Z., and Q. Li (2012): Functional coefficient regression models with time trend, Journal of Econometrics 170, 15-31.
Liang, Z., Z. Lin, and C. Hsiao (2013): Local linear estimation of nonparametric cointegration models, forthcoming in Econometric Reviews.
Masry, E. (1996): Multivariate local polynomial regression for time series: Uniform strong consistency and rates, Journal of Time Series Analysis 17, 571-599.
Phillips, P.C.B. (2009): Local limit theory and spurious nonparametric regression, Econometric
Theory 25, 1466-1497.
Phillips, P.C.B. and P. Perron (1988): Testing for unit roots in time series regression, Biometrika
75, 335-346.
Revuz, D., and M. Yor (2005): Continuous Martingales and Brownian Motion, 3rd edition.
Fundamental Principles of Mathematical Sciences 293, New York: Springer-Verlag.
Sun, Y., Z. Cai, and Q. Li (2013): Semiparametric functional coefficient models with integrated covariates, Econometric Theory 29, 659-672.
Sun, Y., and Q. Li (2011): Data-driven bandwidth selection for nonstationary semiparametric models, Journal of Business and Economic Statistics 29, 541-551.
Wang, Q., and P.C.B. Phillips (2009a): Asymptotic theory for local time density estimation and
nonparametric cointegrating regression, Econometric Theory 25, 710-738.
Wang, Q., and P.C.B. Phillips (2009b): Structural nonparametric cointegrating regression,
Econometrica 77, 1901-1948.
Wang, Q., and P.C.B. Phillips (2012): A specification test for nonlinear nonstationary models,
Annals of Statistics 40, 727-758.
Xiao, Z. (2009): Functional coefficient co-integration models, Journal of Econometrics 152, 81-92.
Yoshihara, K. (1976): Limiting behavior of U-statistics for stationary, absolutely regular processes, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 35, 237-252.
Appendices
A Proof of Theorems
A.1 Proof of Theorem 1
Following the proof strategy of Li et al. (2013), we decompose γ̂ − γ into several terms. Define

ũt = e St^{−1} [ (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) us ],
β̄(zt) = e St^{−1} [ (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) xs′ β(zs) ],

where e is defined in Section 2 and we use e Gh^{−1} = e and e(Qst ⊗ xs) = xs. Then, noting that yt − tγ = xt′ β(zt) + ut and A1t − A2t γ = β̄(zt) + ũt, we have

γ̂ − γ = [ ∑_{t=1}^n (t − xt′ A2t)² ]^{−1} ∑_{t=1}^n (t − xt′ A2t)(yt − xt′ A1t) − γ
       = [ ∑_{t=1}^n (t − xt′ A2t)² ]^{−1} ∑_{t=1}^n (t − xt′ A2t)(yt − xt′ A1t − tγ + xt′ A2t γ)
       = [ ∑_{t=1}^n (t − xt′ A2t)² ]^{−1} ∑_{t=1}^n (t − xt′ A2t)[ xt′ β(zt) + ut − xt′ (A1t − A2t γ) ]
       = [ ∑_{t=1}^n (t − xt′ A2t)² ]^{−1} { ∑_{t=1}^n (t − xt′ A2t) xt′ [β(zt) − β̄(zt)] − ∑_{t=1}^n (t − xt′ A2t) xt′ ũt + ∑_{t=1}^n (t − xt′ A2t) ut }
       = n^{−3/2} B1n^{−1} [ B2n − B3n + B4n ]

where

B1n = (1/n³) ∑_{t=1}^n (t − xt′ A2t)²,
B2n = (1/n^{3/2}) ∑_{t=1}^n (t − xt′ A2t) xt′ [β(zt) − β̄(zt)],
B3n = (1/n^{3/2}) ∑_{t=1}^n (t − xt′ A2t) xt′ ũt,
B4n = (1/n^{3/2}) ∑_{t=1}^n (t − xt′ A2t) ut.
In Lemma 1 we show that

B1n →d 1/3 − ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr ≡ W1.

In Lemmas 2 and 3, we prove that B2n = op(1) and B3n = op(1). Finally, in Lemma 4, we establish that

B4n →d ∫₀¹ r dBu(r) − ∫₀¹ Bx(r)′ dBu(r) [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr ≡ W2.

Therefore,

n^{3/2} (γ̂ − γ) →d W1^{−1} W2.

This completes the proof of Theorem 1.
A.2 Proof of Theorem 2

Proof. Using γ̂ − γ = Op(n^{−3/2}), we show that replacing γ by γ̂ does not affect the asymptotic distribution of β̂(z). Write

( β̂(z)′, β̂(1)(z)′ )′ = [ ∑_{s=1}^n Kh,sz ( xs xs′ , (zs − z) xs xs′ ; (zs − z) xs xs′ , (zs − z)² xs xs′ ) ]^{−1} ∑_{s=1}^n Kh,sz ( xs′ , (zs − z) xs′ )′ (ys − sγ̂)
                       = D1n + D2n,

where (A, B; C, D) denotes the 2 × 2 block matrix with rows (A, B) and (C, D),

D1n = [ ∑_{s=1}^n Kh,sz ( xs xs′ , (zs − z) xs xs′ ; (zs − z) xs xs′ , (zs − z)² xs xs′ ) ]^{−1} ∑_{s=1}^n Kh,sz ( xs′ , (zs − z) xs′ )′ (ys − sγ) ≡ ( D1n,1′ , D1n,2′ )′,

and

D2n = [ ∑_{s=1}^n Kh,sz ( xs xs′ , (zs − z) xs xs′ ; (zs − z) xs xs′ , (zs − z)² xs xs′ ) ]^{−1} ∑_{s=1}^n Kh,sz ( xs′ , (zs − z) xs′ )′ s (γ − γ̂) ≡ ( D2n,1′ , D2n,2′ )′.

It is straightforward to show that D2n,1 = Op(1/n), and from the proof of Theorem 2.1 in Cai, Li and Park (2009) we know that

D1n,1 = β(z) + Op( h² + 1/(n√h) ).

Hence, combining the above results, we obtain

β̂(z) = D1n,1 + D2n,1 = D1n,1 + Op(1/n) = D1n,1 + op( 1/(n√h) ).   (14)

Equation (14) implies that the asymptotic distribution of β̂(z) is the same when γ is replaced by γ̂. Hence, Theorem 2 follows from Theorem 2.1 of Cai, Li and Park (2009).
B Lemmas

Lemma 1. Under Assumptions 1 to 6, we have

B1n →d 1/3 − ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr ≡ W1.

Proof. Recall that

B1n = (1/n³) ∑_{t=1}^n (t − xt′ A2t)² = (1/n³) ∑_{t=1}^n [ t² − 2t xt′ A2t + (xt′ A2t)² ] = B1n,1 − 2B1n,2 + B1n,3.

It is easy to see that

B1n,1 = (1/n³) ∑_{t=1}^n t² → 1/3.   (15)

For B1n,2, using Lemma 5, we have

B1n,2 = (1/n³) ∑_{t=1}^n t xt′ A2t
      = (1/n³) ∑_{t=1}^n t xt′ [ A2 + op(1) ]
      = (1/n³) ∑_{t=1}^n t xt′ A2 + op(1)
      = (1/n³) ∑_{t=1}^n t xt′ [ (1/n²) ∑_{s=1}^n xs xs′ ]^{−1} (1/n²) ∑_{s=1}^n s xs + op(1)
      = (1/n) ∑_{t=1}^n (t/n)(xt′/√n) [ (1/n) ∑_{s=1}^n (xs/√n)(xs′/√n) ]^{−1} (1/n) ∑_{s=1}^n (s/n)(xs/√n) + op(1)
      →d ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr,

where the second equality follows from Lemma 5. Similarly, we have

B1n,3 = (1/n³) ∑_{t=1}^n (xt′ A2t)²
      = (1/n³) ∑_{t=1}^n (xt′ A2)² + op(1)
      →d ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} [ ∫₀¹ Bx(r) Bx(r)′ dr ] [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr
      = ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr.

Therefore, combining the above results, we obtain

B1n →d 1/3 − 2 ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr + ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr
    = 1/3 − ∫₀¹ r Bx(r)′ dr [ ∫₀¹ Bx(r) Bx(r)′ dr ]^{−1} ∫₀¹ r Bx(r) dr.
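The first limit, B1n,1 = n^{−3} ∑ t² → 1/3, is easy to check numerically, since n^{−3} ∑_{t=1}^n t² = (1 + 1/n)(2 + 1/n)/6:

```python
# n^{-3} * sum_{t=1}^{n} t^2 = (1 + 1/n)(2 + 1/n)/6 -> 1/3 as n grows
for n in (10, 1000, 100000):
    print(n, sum(t * t for t in range(1, n + 1)) / n**3)
# n = 10 gives 0.385; the values decrease toward 1/3 = 0.3333...
```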
Lemma 2. Under Assumptions 1 to 6, we have B2n = op(1).
Proof. Define

Mβ(zt) = [ β(zt)′, β(1)(zt)′, β(2)(zt)′/2!, · · · , β(q)(zt)′/q! ]′,
M̄β(zt) = St^{−1} [ (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) xs′ β(zs) ].

Then we have

‖M̄β(zt) − Mβ(zt)‖ = ‖ St^{−1} (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) xs′ [ β(zs) − Qst′ Mβ(zt) ] ‖
                   = ‖ St^{−1} (1/n²) ∑_{s=1}^n Kst Gh^{−1} (Qst ⊗ xs) xs′ [ β(zs) − ∑_{i=0}^q β(i)(zt)(zs − zt)^i / i! ] ‖
                   ≤ ‖St^{−1}‖ · (1/n²) ∑_{s=1}^n Kst ‖Gh^{−1} (Qst ⊗ xs)‖ · ‖xs‖ · ‖ β(zs) − ∑_{i=0}^q β(i)(zt)(zs − zt)^i / i! ‖
                   = Op(h^{q+1}),

where the last equality follows from ‖St^{−1}‖ = Op(1) and

E[ ‖ β(zs) − ∑_{i=0}^q β(i)(zt)(zs − zt)^i / i! ‖ Kst ] = O(h^{q+1}).

Hence, we obtain that, uniformly in t = 1, · · · , n,

M̄β(zt) − Mβ(zt) = Op(h^{q+1}).   (16)

Using (16), it is easy to show that

sup_{1≤t≤n} ‖β̄(zt) − β(zt)‖ = sup_{1≤t≤n} ‖e M̄β(zt) − e Mβ(zt)‖ = Op(h^{q+1}).

Thus, by the definition of B2n, we have

|B2n| = | (1/n^{3/2}) ∑_{t=1}^n (t − xt′ A2t) xt′ [β(zt) − β̄(zt)] |
      ≤ (1/n^{3/2}) sup_{1≤t≤n} ‖β̄(zt) − β(zt)‖ ∑_{t=1}^n |t − xt′ A2t| · ‖xt‖
      = (1/n^{3/2}) Op(h^{q+1}) · n · Op(n) · Op(√n) = Op(n h^{q+1}) = op(1),

where the second-to-last equality follows from |t − xt′ A2t| = Op(n) and ‖xt‖ = Op(√n), and the last equality is implied by Assumption 6. This completes the proof.
Lemma 3. Under assumptions 1 to 6, we have B3n = o p (1).
Proof. Note that
1
B3n =
n3/2
n
∑ (t − xt′ A2t ) xt′ ũt
t =1
n
i
h
n
−1
′
′
−1 1
K
G
(
Q
⊗
x
)
u
(
t
−
x
A
)
x
eS
st h
st
s s
t 2t t
t
n2 s∑
n3/2 t∑
=1
=1
1 n
1 n
′
′
−1
−1
(t − xt A2t ) xt (eSt Kst Gh Qst ⊗ xs ) us
5/2 ∑
n s∑
t =1
=1 n
1
=
=
Similar to the proof of Lemma 1, we can show that uniformly in s = 1, · · · , n,
n
xs i
eSt−1 Kst Gh−1 Qst ⊗ √
n
t =1
h
n
1
xs i
−1
−1
′
′
√
eS
K
G
(
t
−
x
A
)
x
Q
⊗
+ o p ( 1)
st
2
st
∑
t
t
t
h
n5/2 t=1
n
1
(t − xt′ A2t ) xt′
∑
5/2
n
=
h
Define
Vns ≡
Θns ≡
h
i−1 h
1 n
xs i
′
Γ (κ ) ⊗ √
xt xt
e ∆(κ ) ⊗
∑ (t −
n2 t∑
n5/2 t=1
n
=1
h
i
n
1
xs
(t − xt′ A2 ) xt′ eSt−1 Kst Gh−1 Qst ⊗ √
− Vns
∑
5/2
n
n
t =1
1
n
xt′ A2 ) xt′
where
κj =
Z
v j K (v)dv, j = 1, · · · , 2q
Γ(κ ) = (1, κ1 , · · · , κq )′

1
κ1
κ2


 κ1 κ2
κ3


..
∆(κ ) = 
.
 κ2 κ3
 .
.
.
 .
..
..
 .

κ q κ q +1 κ q +2
19
···
···
···
..
.
···
κq



κ q +1 


κ q +2 

.. 

. 

κ2q
√
n), and
Then we have

\[
B_{3n} = \frac{1}{\sqrt n}\sum_{s=1}^{n}V_{ns}u_s + \frac{1}{\sqrt n}\sum_{s=1}^{n}\Theta_{ns}u_s,
\]

where $\Theta_{ns} = o_p(1)$ uniformly in $s = 1,\cdots,n$.
For any $\epsilon_1 > 0$ and $\epsilon_2 > 0$, we have

\[
\Pr\Big\{\Big\|\frac{1}{\sqrt n}\sum_{s=1}^{n}\Theta_{ns}u_s\Big\| > \epsilon_1\Big\}
= \Pr\Big\{\Big\|\frac{1}{\sqrt n}\sum_{s=1}^{n}\Theta_{ns}u_s\Big\| > \epsilon_1,\ \max_s\|\Theta_{ns}\| > \epsilon_2\Big\}
+ \Pr\Big\{\Big\|\frac{1}{\sqrt n}\sum_{s=1}^{n}\Theta_{ns}u_s\Big\| > \epsilon_1,\ \max_s\|\Theta_{ns}\| \le \epsilon_2\Big\}
\le \frac{E\big[\big\|\sum_{s=1}^{n}\Theta_{ns}u_s\big\|^2\mathbf{1}\{\max_s\|\Theta_{ns}\| \le \epsilon_2\}\big]}{n\epsilon_1^2} + o(1),
\]

where we apply Chebyshev's inequality. By assumptions 1 and 2, we can prove that

\[
\frac{1}{n}E\Big[\Big\|\sum_{s=1}^{n}\Theta_{ns}u_s\Big\|^2\mathbf{1}\{\max_s\|\Theta_{ns}\| \le \epsilon_2\}\Big] = o(1).
\]
Further, since $x_t'A_2$ is a scalar, we can show that

\[
V_{ns} = \frac{1}{n^{5/2}}\sum_{t=1}^{n}(t-A_2'x_t)\,x_t'\cdot e\Big[\Delta(\kappa)\otimes\frac{1}{n^2}\sum_{t=1}^{n}x_tx_t'\Big]^{-1}\Big[\Gamma(\kappa)\otimes\frac{x_s}{\sqrt n}\Big]
= \frac{1}{n^{5/2}}\sum_{t=1}^{n}(t-A_2'x_t)\,x_t'\Big(\frac{1}{n^2}\sum_{r=1}^{n}x_rx_r'\Big)^{-1}\frac{x_s}{\sqrt n}
\]
\[
= \Big[\frac{1}{n^{5/2}}\sum_{t=1}^{n}t\,x_t' - \frac{1}{n^2}\sum_{r=1}^{n}r\,x_r'\Big(\frac{1}{n^2}\sum_{r=1}^{n}x_rx_r'\Big)^{-1}\frac{1}{n^{5/2}}\sum_{t=1}^{n}x_tx_t'\Big]\Big(\frac{1}{n^2}\sum_{r=1}^{n}x_rx_r'\Big)^{-1}\frac{x_s}{\sqrt n}
= 0.
\]

Therefore we have $B_{3n} = o_p(1)$.
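Note that $V_{ns} = 0$ is an exact algebraic identity in finite samples, not only asymptotically, because $\sum_t (x_t'A_2)x_t' = \sum_t t\,x_t'$ by the definition of $A_2$. A hedged numerical sketch (the data-generating choices are arbitrary placeholders, not the paper's design):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 2
x = np.cumsum(rng.standard_normal((n, d)), axis=0)   # simulated I(1) regressors x_t
t = np.arange(1, n + 1, dtype=float)

M = x.T @ x / n**2                                   # (1/n^2) sum_t x_t x_t'
A2 = np.linalg.solve(M, (t[:, None] * x).sum(axis=0) / n**2)

# (1/n^{5/2}) sum_t (t - x_t'A2) x_t': a row vector that vanishes identically,
# since sum_t (x_t'A2) x_t' = sum_t t x_t' by the definition of A2.
lead = ((t - x @ A2)[:, None] * x).sum(axis=0) / n**2.5
print(np.abs(lead).max())  # numerically zero
```

Hence $V_{ns} = 0$ for every $s$, whatever $x_s/\sqrt n$ and the kernel-moment matrices multiply it on the right.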
Lemma 4. Under assumptions 1 to 6, we have

\[
B_{4n} \xrightarrow{d} \int_0^1 r\,dB_u(r) - \int_0^1 B_x(r)'\,dB_u(r)\Big[\int_0^1 B_x(r)B_x(r)'\,dr\Big]^{-1}\int_0^1 rB_x(r)\,dr.
\]
Proof. Note that

\[
B_{4n} = \frac{1}{n^{3/2}}\sum_{t=1}^{n}(t-x_t'A_{2t})u_t
= \frac{1}{n}\sum_{t=1}^{n}\Big(\frac{t}{\sqrt n}-\frac{x_t'}{\sqrt n}A_2\Big)u_t + \frac{1}{n}\sum_{t=1}^{n}\frac{x_t'}{\sqrt n}(A_2-A_{2t})u_t
\equiv \frac{1}{n}\sum_{t=1}^{n}U_{nt}^*u_t + \frac{1}{n}\sum_{t=1}^{n}\Phi_{nt}^*u_t
\equiv B_{4n,1} + B_{4n,2}.
\]
By Lemma 5, we have $\|\Phi_{nt}^*\| = o_p(1)$ uniformly in $t = 1,\cdots,n$. Similarly to the proof of Lemma 3, we have $B_{4n,2} = \frac{1}{n}\sum_{t=1}^{n}\Phi_{nt}^*u_t = o_p(1)$. And

\[
B_{4n,1} = \frac{1}{n}\sum_{t=1}^{n}\Big(\frac{t}{\sqrt n}-\frac{x_t'}{\sqrt n}A_2\Big)u_t
= \frac{1}{\sqrt n}\sum_{t=1}^{n}\frac{t}{n}u_t - \frac{1}{n}\sum_{t=1}^{n}\frac{x_t'u_t}{\sqrt n}\cdot\Big[\frac{1}{n}\sum_{s=1}^{n}\frac{x_sx_s'}{n}\Big]^{-1}\frac{1}{n}\sum_{s=1}^{n}\frac{s}{n}\frac{x_s}{\sqrt n}
\xrightarrow{d} \int_0^1 r\,dB_u(r) - \int_0^1 B_x(r)'\,dB_u(r)\cdot\Big[\int_0^1 B_x(r)B_x(r)'\,dr\Big]^{-1}\int_0^1 rB_x(r)\,dr.
\]
This completes the proof of Lemma 4.
Lemma 5. Under assumptions 1 to 6, we have

\[
A_{2t} = A_2 + o_p(1)\quad\text{uniformly in } t = 1,\cdots,n,
\]

where

\[
A_2 = \Big(\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big)^{-1}\frac{1}{n^2}\sum_{s=1}^{n}s\,x_s.
\]
Proof. We follow the proof strategy of Proposition A.1 in Li et al. (2013). We first consider the first term $S_t$ in $A_{2t}$. Let $Q_{sz} = [1,(z_s-z),\cdots,(z_s-z)^q]'$, $\tilde K_{h,sz} = K_{h,sz}Q_{sz}Q_{sz}'$, and $\eta_{h,sz} = \tilde K_{h,sz} - E[\tilde K_{h,sz}]$. Then we have
\[
C_n(z) = G_h^{-1}\Big[\frac{1}{n^2}\sum_{s=1}^{n}\tilde K_{h,sz}\otimes x_sx_s'\Big]G_h^{-1}
= G_h^{-1}\Big[\frac{1}{n^2}\sum_{s=1}^{n}E[\tilde K_{h,sz}]\otimes x_sx_s'\Big]G_h^{-1}
+ G_h^{-1}\Big[\frac{1}{n^2}\sum_{s=1}^{n}\big(\tilde K_{h,sz}-E[\tilde K_{h,sz}]\big)\otimes x_sx_s'\Big]G_h^{-1}
\equiv C_{1n}(z) + C_{2n}(z).
\]
Under assumptions 4 and 5, we obtain that

\[
\sup_{z\in S_z}\big\|E[\tilde K_{h,sz}] - f(z)\Delta(\kappa)\big\| = o(1),
\]

where $S_z$ is the bounded support of $z_t$.

Noting that $f(z)$ is bounded away from zero and infinity for $z\in S_z$, we then have, uniformly for $z\in S_z$,

\[
\Big\|\frac{1}{f(z)}C_{1n}(z) - \Delta(\kappa)\otimes\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big\| = o_p(1). \tag{17}
\]
Next we will show that $C_{2n}(z) = o_p(1)$ uniformly for $z\in S_z$. Let $Q^*_{sz} = [1,(z_s-z)/h,\cdots,(z_s-z)^q/h^q]'$ and let $\eta^*_{h,sz}$ be defined as $\eta_{h,sz}$ with $Q_{sz}$ replaced by $Q^*_{sz}$. Then

\[
C_{2n}(z) = \frac{1}{n^2}\sum_{s=1}^{n}\eta^*_{h,sz}\otimes x_sx_s'.
\]
By a similar argument to that in Theorem 1 of Masry (1996), we can prove that

\[
\sup_{z\in S_z}\sup_{l\ge 0}\mathrm{Var}\Big(\sum_{s=l+1}^{l+m}\eta^*_{h,sz}\Big) = O(m/h)
\]

for all $m\ge 1$. For some $0 < \delta < 1$, set $N = [1/\delta]$, $s_k = [kn/N]+1$, $s^*_k = s_{k+1}-1$, and $s^{**}_k = \min\{s^*_k,n\}$. Let $U_{n,s} = x_sx_s'/n$ for any $1\le s\le n$ and $U_n(r) = U_{n,[nr]}$ for any $r\in[0,1]$. Following the proof of Theorem 3.3 of Hansen (1992), we have
\[
\sup_{z\in S_z}\|C_{2n}(z)\| = \sup_{z\in S_z}\frac{1}{n}\Big\|\sum_{s=1}^{n}\eta^*_{sz}\otimes U_{n,s}\Big\|
= \sup_{z\in S_z}\frac{1}{n}\Big\|\sum_{k=0}^{N-1}\sum_{s=s_k}^{s^{**}_k}\eta^*_{sz}\otimes U_{n,s}\Big\|
\]
\[
\le \sup_{z\in S_z}\frac{1}{n}\Big\|\sum_{k=0}^{N-1}\sum_{s=s_k}^{s^{**}_k}\eta^*_{sz}\otimes U_{n,s_k}\Big\|
+ \sup_{z\in S_z}\frac{1}{n}\Big\|\sum_{k=0}^{N-1}\sum_{s=s_k}^{s^{**}_k}\eta^*_{sz}\otimes(U_{n,s}-U_{n,s_k})\Big\|
\]
\[
\le \sup_{0\le r\le 1}\|U_n(r)\|\cdot\frac{1}{n}\sum_{k=0}^{N-1}\sup_{z\in S_z}\Big\|\sum_{s=s_k}^{s^{**}_k}\eta^*_{sz}\Big\|
+ \sup_{|r_1-r_2|\le\delta}\|U_n(r_1)-U_n(r_2)\|\cdot\sup_{z\in S_z}\frac{1}{n}\sum_{k=0}^{N-1}\sum_{s=s_k}^{s^{**}_k}\|\eta^*_{sz}\|
\equiv C_{2n,1} + C_{2n,2}.
\]
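As a small sanity check on the blocking scheme (illustrative only; the values of $n$ and $\delta$ below are arbitrary), the blocks $[s_k, s^{**}_k]$, $k = 0,\cdots,N-1$, with $s_k = [kn/N]+1$ and $s^{**}_k = \min\{s_{k+1}-1, n\}$, cover $\{1,\cdots,n\}$ without gaps or overlap:

```python
def blocks(n, delta):
    """Return the Hansen (1992)-style blocks [s_k, s_k**] for N = [1/delta]."""
    N = int(1 / delta)
    s = [k * n // N + 1 for k in range(N + 1)]        # s_k = [k n / N] + 1
    return [(s[k], min(s[k + 1] - 1, n)) for k in range(N)]

n, delta = 100, 0.3                                   # N = 3
covered = [i for lo, hi in blocks(n, delta) for i in range(lo, hi + 1)]
print(covered == list(range(1, n + 1)))  # True
```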
Note that $\sup_{0\le r\le 1}\|U_n(r)\| = O_p(1)$ due to $U_n(r)\Rightarrow B_x(r)B_x(r)'$ by assumption 1. Besides, by a similar argument to that in the proof of Theorem 1 of Masry (1996), we obtain

\[
\frac{1}{n}\sum_{k=0}^{N-1}\sup_{z\in S_z}\Big\|\sum_{s=s_k}^{s^{**}_k}\eta^*_{sz}\Big\|
\le \frac{N}{n}\sup_{0\le k\le N-1}\sup_{z\in S_z}\sum_{s=s_k}^{s^{**}_k}\|\eta^*_{sz}\|
\le \sup_{1\le s\le n}\sup_{z\in S_z}\frac{1}{\delta n}\sum_{i=s}^{s+\delta n}\|\eta^*_{iz}\| = o_p(1),
\]

which implies that

\[
C_{2n,1} = \sup_{0\le r\le 1}\|U_n(r)\|\cdot o_p(1) = o_p(1).
\]
Since it is easy to show that, uniformly for $z\in S_z$,

\[
\frac{1}{n}\sum_{k=0}^{N-1}\sum_{s=s_k}^{s^{**}_k}\|\eta^*_{sz}\| = O_p(1),
\]

we have

\[
C_{2n,2} = \sup_{|r_1-r_2|\le\delta}\|U_n(r_1)-U_n(r_2)\|\cdot O_p(1) = o_p(1)
\]

by letting $\delta\to 0$. Therefore $C_{2n}(z) = o_p(1)$ uniformly for $z\in S_z$. Combining this with equation (17), we have, uniformly for $z\in S_z$,

\[
\Big\|\frac{1}{hf(z)}C_n(z) - \Delta(\kappa)\otimes\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big\| = o_p(1). \tag{18}
\]
Using a similar argument, we can also prove that, for the second term in $A_{2t}$, uniformly for $z\in S_z$,

\[
\frac{1}{n^2hf(z)}\sum_{s=1}^{n}K_{st}G_h^{-1}(Q_{st}\otimes x_s)\,s = \Gamma(\kappa)\otimes\frac{1}{n^2}\sum_{s=1}^{n}s\,x_s + o_p(1). \tag{19}
\]
With equations (18) and (19), we show that

\[
A_{2t} = A_2 + o_p(1),\quad\text{uniformly in } t = 1,\cdots,n,
\]

where

\[
A_2 = e\Big[\Delta(\kappa)\otimes\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big]^{-1}\Big[\Gamma(\kappa)\otimes\frac{1}{n^2}\sum_{s=1}^{n}s\,x_s\Big]
= e\Big[\Delta(\kappa)^{-1}\otimes\Big(\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big)^{-1}\Big]\Big[\Gamma(\kappa)\otimes\frac{1}{n^2}\sum_{s=1}^{n}s\,x_s\Big]
\]
\[
= e\Big[\Delta(\kappa)^{-1}\Gamma(\kappa)\otimes\Big(\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big)^{-1}\frac{1}{n^2}\sum_{s=1}^{n}s\,x_s\Big]
= \Big(\frac{1}{n^2}\sum_{s=1}^{n}x_sx_s'\Big)^{-1}\frac{1}{n^2}\sum_{s=1}^{n}s\,x_s,
\]

where the (1,1)-th element of $\Delta(\kappa)^{-1}\Gamma(\kappa)$ is 1 and we deploy the structure of $e$.