Journal of Statistical and Econometric Methods, vol.2, no.3, 2013, 153-174
ISSN: 1792-6602 (print), 1792-6939 (online)
Scienpress Ltd, 2013
The Superiorities of Minimum Bayes Risk
Linear Unbiased Estimator in
Two Seemingly Unrelated Regressions¹
Radhey S. Singh², Lichun Wang³ and Huiming Song⁴
Abstract
In the system of two seemingly unrelated regressions, the minimum
Bayes risk linear unbiased (MBRLU) estimators of regression parameters are derived. The superiorities of the MBRLU estimators over the
classical estimators are investigated, respectively, in terms of the mean
square error matrix (MSEM) criterion, the predictive Pitman closeness
(PRPC) criterion and the posterior Pitman closeness (PPC) criterion.
Mathematics Subject Classification: 62C10; 62J05
Keywords: Seemingly unrelated regressions; generalized least squares estimator; MBRLU estimator; MSEM criterion; PRPC criterion; PPC criterion
¹ Sponsored by NSERC of Canada Grant (400045) and Scientific Foundation of BJTU (2012JBM105).
² Department of Statistics and Actuarial Sciences, University of Waterloo, Canada. Corresponding author.
³ Department of Mathematics, Beijing Jiaotong University, Beijing 100044, China.
⁴ Department of Statistics and Finance, USTC, Hefei 230026, China.
Article Info: Received: March 21, 2013. Revised: May 2, 2013. Published online: September 1, 2013.
1 Introduction
Consider a system consisting of two seemingly unrelated regressions (SUR)
$$\begin{cases} Y_1 = X_1\beta_1 + \varepsilon_1, \\ Y_2 = X_2\beta_2 + \varepsilon_2, \end{cases} \qquad (1.1)$$
where $Y_i$ ($i=1,2$) are $n\times 1$ vectors of observations, $X_i$ ($i=1,2$) are $n\times p_i$ matrices with full column rank, $\beta_i$ ($i=1,2$) are $p_i\times 1$ vectors of unknown regression parameters, $\varepsilon_i$ ($i=1,2$) are $n\times 1$ vectors of error variables, and
$$E(\varepsilon_i)=0, \qquad \mathrm{Cov}(\varepsilon_i,\varepsilon_j)=\sigma_{ij}I_n, \qquad i,j=1,2,$$
where $\Sigma_* = (\sigma_{ij})$ is a $2\times 2$ non-diagonal positive definite matrix. This kind of system has been widely applied in many fields, such as econometrics and the social and biological sciences. It was first introduced to statistics in Zellner's pioneering work (Zellner (1962, 1963)) and later developed by Kementa and Gilbert (1968), Revankar (1974), Mehta and Swamy (1976), Schmidt (1977), Wang (1989) and Lin (1991), among others.
Denote $Y = (Y_1', Y_2')'$, $X = \mathrm{diag}(X_1, X_2)$, $\beta = (\beta_1', \beta_2')'$ and $\varepsilon = (\varepsilon_1', \varepsilon_2')'$. One can represent the system (1.1) as the regression model
$$Y = X\beta + \varepsilon, \qquad \varepsilon \sim (0, \Sigma_* \otimes I_n), \qquad (1.2)$$
where $\otimes$ denotes the Kronecker product operator.
Following Wang et al. (2011), if $\Sigma_*$ is known, then the generalized least squares (GLS) estimators of $\beta_i$ ($i=1,2$) are
$$\bar\beta_{1,GLS} = (X_1'X_1)^{-1}X_1'\Big[I_n - \rho^2\sum_{i=0}^{\infty}(\rho^2 P_2P_1)^i P_2N_1\Big]\Big(Y_1 - \frac{\sigma_{12}}{\sigma_{22}}N_2Y_2\Big) \qquad (1.3)$$
and
$$\bar\beta_{2,GLS} = (X_2'X_2)^{-1}X_2'\Big[I_n - \rho^2\sum_{i=0}^{\infty}(\rho^2 P_1P_2)^i P_1N_2\Big]\Big(Y_2 - \frac{\sigma_{21}}{\sigma_{11}}N_1Y_1\Big), \qquad (1.4)$$
where $P_i = X_i(X_i'X_i)^{-1}X_i'$, $N_i = I_n - P_i$ and $\rho^2 = \sigma_{12}^2/(\sigma_{11}\sigma_{22})$.
Further, by Theorem 3.1 of Wang et al. (2011), under the condition $X_1'X_2 = 0$ (see Zellner (1963)), $X_1 = (X_2, L)$ (see Revankar (1974)), or $P_1P_2P_1N_2 = 0$ (see Liu (2002)), $\bar\beta_{1,GLS}$ and $\bar\beta_{2,GLS}$ each take a unique, simpler form. Thus, in what follows we assume $X_1'X_2 = 0$, which implies $X_i'P_j = 0$ ($i\neq j$) and simplifies $\bar\beta_{1,GLS}$ and $\bar\beta_{2,GLS}$ to
$$\hat\beta_1^* = \hat\beta_1 - \frac{\sigma_{12}}{\sigma_{22}}(X_1'X_1)^{-1}X_1'Y_2, \qquad (1.5)$$
$$\hat\beta_2^* = \hat\beta_2 - \frac{\sigma_{12}}{\sigma_{11}}(X_2'X_2)^{-1}X_2'Y_1, \qquad (1.6)$$
where $\hat\beta_1 = (X_1'X_1)^{-1}X_1'Y_1$ and $\hat\beta_2 = (X_2'X_2)^{-1}X_2'Y_2$.
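As a quick numerical illustration (not part of the original derivation), the sketch below evaluates (1.5) and (1.6) on synthetic data; the sample size, the choice of $\Sigma_*$ and the construction of $X_1$, $X_2$ with $X_1'X_2 = 0$ are illustrative assumptions.

```python
# Minimal sketch of the simplified GLS estimators (1.5)-(1.6) under X1'X2 = 0.
# All data below are synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 50, 3, 2

# Orthonormal columns split into X1 and X2, so that X1'X2 = 0 holds exactly.
Q, _ = np.linalg.qr(rng.standard_normal((n, p1 + p2)))
X1, X2 = Q[:, :p1], Q[:, p1:]

beta1, beta2 = np.ones(p1), -np.ones(p2)
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.5]])              # Sigma_* = (sigma_ij), non-diagonal
E = rng.multivariate_normal([0.0, 0.0], Sigma, size=n)
Y1 = X1 @ beta1 + E[:, 0]
Y2 = X2 @ beta2 + E[:, 1]

# Ordinary LS estimators and the simplified GLS estimators (1.5)-(1.6).
b1 = np.linalg.solve(X1.T @ X1, X1.T @ Y1)
b2 = np.linalg.solve(X2.T @ X2, X2.T @ Y2)
b1_star = b1 - (Sigma[0, 1] / Sigma[1, 1]) * np.linalg.solve(X1.T @ X1, X1.T @ Y2)
b2_star = b2 - (Sigma[0, 1] / Sigma[0, 0]) * np.linalg.solve(X2.T @ X2, X2.T @ Y1)
print(b1_star, b2_star)
```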
In Section 2 we derive the minimum Bayes risk linear unbiased (MBRLU) estimator of $\beta$ and, accordingly, the MBRLU estimators of $\beta_i$ ($i=1,2$). In Section 3 the superiorities of the MBRLU estimators of $\beta_i$ ($i=1,2$) are established under the mean square error matrix (MSEM) criterion. In Section 4 we discuss the superiorities of the MBRLU estimators in terms of the predictive Pitman closeness (PRPC) criterion and the posterior Pitman closeness (PPC) criterion, respectively. In Section 5, for the case where the design matrices are not of full rank, we investigate the superiorities of MBRLU estimators of some estimable functions. Finally, brief concluding remarks are made in Section 6.
2 The MBRLU Estimators of Regression Parameters
There are two common approaches to Bayes estimation in the linear model. The first assumes a normal prior for the regression parameter, which implies that under the normal linear model the posterior is also normal; under quadratic loss the Bayes estimator of the regression parameter is then the posterior mean (see Box and Tiao (1973), Berger (1985) and Wang and Chow (1994), among others). Recently, Wang and Veraverbeke (2008) employed this approach to exhibit the superiorities of Bayes and empirical Bayes estimators in two generalized SURs. The second approach, proposed by Rao (1973), yields the minimum Bayes risk linear (MBRL) estimator of the regression parameter by minimizing the Bayes risk under the assumption that some moment conditions of the prior are given. Rao (1976) further pointed out that the admissible linear estimators of the regression parameter are either MBRL estimators or limits of MBRL estimators. Gruber (1990) proposed an MBRL estimator for an estimable function of the regression parameter and obtained an alternative form of the MBRL estimator. Related results can be found in Trenkler and Wei (1996), Zhang (2005) and others. In this paper, we use the second approach to derive the MBRLU estimators of the regression parameters and discuss their superiorities in terms of the MSEM, PRPC and PPC criteria, respectively.
Denote the prior of $\beta$ by $\pi(\beta)$. It is assumed that the prior $\pi(\beta)$ satisfies
$$E(\beta) = \begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix} \hat{=}\ \mu, \qquad \mathrm{Cov}(\beta) = \begin{pmatrix}\tau_1^2 I_{p_1} & 0\\ 0 & \tau_2^2 I_{p_2}\end{pmatrix} \hat{=}\ V, \qquad (2.1)$$
where $\mu_i$ and $\tau_i$ ($i=1,2$) are known.
Let the loss function be defined by
$$L(\delta,\beta) = (\delta-\beta)'(\delta-\beta), \qquad (2.2)$$
and the class of linear estimators of $\beta$ be
$$\mathcal{F} = \big\{\tilde\beta = AY + b :\ A \text{ is a } (p_1+p_2)\times 2n \text{ matrix},\ b \text{ is a } (p_1+p_2)\times 1 \text{ vector}\big\}. \qquad (2.3)$$
Then the MBRLU estimator $\hat\beta_B$ is defined to minimize the Bayes risk
$$R(\hat\beta_B,\beta) = \min_{A,b} R(\tilde\beta,\beta) = \min_{A,b} E\big[(\tilde\beta-\beta)'(\tilde\beta-\beta)\big] \qquad (2.4)$$
subject to the constraint $E(\tilde\beta-\beta)=0$, where $E$ denotes the expectation with respect to (w.r.t.) the joint distribution of $(Y,\beta)$.
From the constraint, we have $b = (I-AX)\mu$. Note that
$$\begin{aligned}
R(\tilde\beta,\beta) &= E\big\{[AY+(I-AX)\mu-\beta]'[AY+(I-AX)\mu-\beta]\big\}\\
&= E\big\{[A(Y-X\mu)-(\beta-\mu)]'[A(Y-X\mu)-(\beta-\mu)]\big\}\\
&= E\,\mathrm{tr}\big\{[A(Y-X\mu)-(\beta-\mu)][A(Y-X\mu)-(\beta-\mu)]'\big\}\\
&= \mathrm{tr}\big\{A(XVX'+\Phi)A' + V - AXV - VX'A'\big\},
\end{aligned}$$
where $\Phi = \Sigma_*\otimes I_n = \mathrm{Cov}(\varepsilon)$. Solving $\partial R(\tilde\beta,\beta)/\partial A = 0$, we obtain
$$A = VX'(XVX'+\Phi)^{-1}. \qquad (2.5)$$
By the fact that
$$(P+BCB')^{-1} = P^{-1} - P^{-1}B(C^{-1}+B'P^{-1}B)^{-1}B'P^{-1}, \qquad (2.6)$$
we obtain
$$A = VX'(XVX'+\Phi)^{-1} = (X'\Phi^{-1}X+V^{-1})^{-1}X'\Phi^{-1}, \qquad (2.7)$$
and
$$I - AX = I - (X'\Phi^{-1}X+V^{-1})^{-1}X'\Phi^{-1}X = (X'\Phi^{-1}X+V^{-1})^{-1}V^{-1}. \qquad (2.8)$$
Hence, we have
$$\begin{aligned}
\hat\beta_B &= AY + b = AY + (I-AX)\mu\\
&= (X'\Phi^{-1}X+V^{-1})^{-1}\big(X'\Phi^{-1}X\hat\beta_{LS} + V^{-1}\mu\big)\\
&= \hat\beta_{LS} - (X'\Phi^{-1}X+V^{-1})^{-1}V^{-1}(\hat\beta_{LS}-\mu)\\
&= \begin{pmatrix}\hat\beta_1^* - (\tau_1^2\sigma_{11.2}^{-1}X_1'X_1 + I_{p_1})^{-1}(\hat\beta_1^*-\mu_1)\\[2pt] \hat\beta_2^* - (\tau_2^2\sigma_{22.1}^{-1}X_2'X_2 + I_{p_2})^{-1}(\hat\beta_2^*-\mu_2)\end{pmatrix}, \qquad (2.9)
\end{aligned}$$
where $\hat\beta_{LS} = (X'\Phi^{-1}X)^{-1}X'\Phi^{-1}Y = (\hat\beta_1^{*\prime},\hat\beta_2^{*\prime})'$ under the assumption $X_1'X_2 = 0$.
Thus the MBRLU estimators of $\beta_i$ ($i = 1, 2$) are
$$\hat\beta_{1B} = \hat\beta_1^* - (\tau_1^2\sigma_{11.2}^{-1}X_1'X_1 + I_{p_1})^{-1}(\hat\beta_1^*-\mu_1), \qquad (2.10)$$
$$\hat\beta_{2B} = \hat\beta_2^* - (\tau_2^2\sigma_{22.1}^{-1}X_2'X_2 + I_{p_2})^{-1}(\hat\beta_2^*-\mu_2), \qquad (2.11)$$
where $\sigma_{11.2} = \sigma_{11} - \sigma_{12}^2\sigma_{22}^{-1}$ and $\sigma_{22.1} = \sigma_{22} - \sigma_{12}^2\sigma_{11}^{-1}$.
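A minimal sketch of how (2.10) and (2.11) could be evaluated, continuing the synthetic setup of the previous sketch; the prior mean, prior variance and the helper name `mbrlu` are illustrative assumptions, not quantities from the paper.

```python
# Sketch: shrink a simplified GLS estimator toward the prior mean as in (2.10)-(2.11).
import numpy as np

def mbrlu(b_star, XtX, mu, tau2, sigma_cond):
    """MBRLU estimator: b_star - B (b_star - mu), with
    B = (tau2/sigma_cond * X'X + I)^{-1} and sigma_cond = sigma_{11.2} or sigma_{22.1}."""
    p = XtX.shape[0]
    B = np.linalg.inv(tau2 / sigma_cond * XtX + np.eye(p))
    return b_star - B @ (b_star - mu)

# Example call, reusing X1, Sigma and b1_star from the previous sketch:
# sigma_11_2 = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]
# b1_B = mbrlu(b1_star, X1.T @ X1, mu=np.zeros(3), tau2=0.5, sigma_cond=sigma_11_2)
```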
3 The Superiorities of the MBRLU Estimators Under the MSEM Criterion
We state the following MSEM superiorities of $\hat\beta_{iB}$ ($i = 1, 2$).
Theorem 3.1 Let the GLS estimators and MBRLU estimators of $\beta_i$ be given by (1.5), (1.6) and (2.10), (2.11), respectively. Then
$$M(\hat\beta_i^*) - M(\hat\beta_{iB}) > 0, \qquad i = 1, 2.$$
Proof: We only prove the conclusion for $i = 1$. Denote $B_1 = (\tau_1^2\sigma_{11.2}^{-1}X_1'X_1 + I_{p_1})^{-1}$. We have
$$\begin{aligned}
M(\hat\beta_{1B}) &= E\big[(\hat\beta_{1B}-\beta_1)(\hat\beta_{1B}-\beta_1)'\big]\\
&= E\Big\{\big[(\hat\beta_1^*-\beta_1) - B_1(\hat\beta_1^*-\mu_1)\big]\big[(\hat\beta_1^*-\beta_1) - B_1(\hat\beta_1^*-\mu_1)\big]'\Big\}\\
&= M(\hat\beta_1^*) - E\big[(\hat\beta_1^*-\beta_1)(\hat\beta_1^*-\mu_1)'\big]B_1 - B_1E\big[(\hat\beta_1^*-\mu_1)(\hat\beta_1^*-\beta_1)'\big] + B_1E\big[(\hat\beta_1^*-\mu_1)(\hat\beta_1^*-\mu_1)'\big]B_1\\
&= M(\hat\beta_1^*) - J_1B_1 - B_1J_1' + B_1J_2B_1, \qquad (3.1)
\end{aligned}$$
where
$$J_2 = \mathrm{Cov}(\hat\beta_1^*) = E\big\{\mathrm{Cov}\big(\hat\beta_1^*\,|\,\beta\big)\big\} + \mathrm{Cov}\big\{E\big(\hat\beta_1^*\,|\,\beta\big)\big\} = E\big\{\mathrm{Cov}\big(\hat\beta_1^*\,|\,\beta\big)\big\} + \tau_1^2 I, \qquad (3.2)$$
and
$$\begin{aligned}
E\big\{\mathrm{Cov}\big(\hat\beta_1^*\,|\,\beta\big)\big\} &= E\Big\{E\big[(\hat\beta_1^*-\beta_1)(\hat\beta_1^*-\beta_1)'\,|\,\beta\big]\Big\}\\
&= E\Big\{\mathrm{Cov}(\hat\beta_1\,|\,\beta) + \mathrm{Cov}\big(\sigma_{12}\sigma_{22}^{-1}(X_1'X_1)^{-1}X_1'Y_2\,|\,\beta\big) - 2\,\mathrm{Cov}\big(\hat\beta_1,\ \sigma_{12}\sigma_{22}^{-1}(X_1'X_1)^{-1}X_1'Y_2\,|\,\beta\big)\Big\}\\
&= \sigma_{11}(X_1'X_1)^{-1} + \sigma_{12}^2\sigma_{22}^{-1}(X_1'X_1)^{-1} - 2\sigma_{12}^2\sigma_{22}^{-1}(X_1'X_1)^{-1}\\
&= \big(\sigma_{11} - \sigma_{12}^2\sigma_{22}^{-1}\big)(X_1'X_1)^{-1} = \sigma_{11.2}(X_1'X_1)^{-1}. \qquad (3.3)
\end{aligned}$$
Putting (3.3) into (3.2) we have
$$J_2 = E\big\{\mathrm{Cov}\big(\hat\beta_1^*\,|\,\beta\big)\big\} + \tau_1^2 I = \sigma_{11.2}(X_1'X_1)^{-1} + \tau_1^2 I. \qquad (3.4)$$
Then $J_1$ can be expressed as
$$J_1 = E\big[(\hat\beta_1^*-\beta_1)(\hat\beta_1^*-\mu_1)'\big] = J_2 - \mathrm{Cov}(\beta_1) = \sigma_{11.2}(X_1'X_1)^{-1}. \qquad (3.5)$$
Combining (3.4) and (3.5) with (3.1), we obtain
$$\begin{aligned}
M(\hat\beta_1^*) - M(\hat\beta_{1B}) &= J_1B_1 + B_1J_1' - B_1J_2B_1\\
&= B_1\big[B_1^{-1}\sigma_{11.2}(X_1'X_1)^{-1} + \sigma_{11.2}(X_1'X_1)^{-1}B_1^{-1} - \tau_1^2I - \sigma_{11.2}(X_1'X_1)^{-1}\big]B_1\\
&= B_1\big[\tau_1^2I + \sigma_{11.2}(X_1'X_1)^{-1}\big]B_1 > 0. \qquad (3.6)
\end{aligned}$$
This proves Theorem 3.1.
Obviously, the MSEM criterion is stronger than the MSE criterion: a point estimator may be MSE superior to another without being superior in the MSEM sense. Hence Theorem 3.1 also implies
$$\mathrm{MSE}(\hat\beta_i^*) - \mathrm{MSE}(\hat\beta_{iB}) > 0, \qquad i = 1, 2.$$
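A Monte Carlo sketch consistent with Theorem 3.1 and the MSE inequality above: drawing $\beta_1$ from a prior with mean $\mu_1$ and covariance $\tau_1^2I_{p_1}$ and averaging the quadratic losses, the MBRLU estimator should show no larger average loss than $\hat\beta_1^*$ (up to simulation error). All settings below are illustrative.

```python
# Monte Carlo comparison of the Bayes MSE of beta1_star (GLS) and beta1_B (MBRLU).
import numpy as np

rng = np.random.default_rng(1)
n, p1, p2 = 50, 3, 2
Q, _ = np.linalg.qr(rng.standard_normal((n, p1 + p2)))
X1, X2 = Q[:, :p1], Q[:, p1:]             # X1'X2 = 0 by construction
Sigma = np.array([[1.0, 0.6], [0.6, 1.5]])
s112 = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]
mu1, tau1sq = np.zeros(p1), 0.5
B1 = np.linalg.inv(tau1sq / s112 * (X1.T @ X1) + np.eye(p1))   # matrix in (2.10)

loss_gls = loss_mbrlu = 0.0
n_rep = 2000
for _ in range(n_rep):
    beta1 = mu1 + np.sqrt(tau1sq) * rng.standard_normal(p1)    # draw from the prior
    beta2 = rng.standard_normal(p2)
    E = rng.multivariate_normal([0.0, 0.0], Sigma, size=n)
    Y1, Y2 = X1 @ beta1 + E[:, 0], X2 @ beta2 + E[:, 1]
    b1 = np.linalg.solve(X1.T @ X1, X1.T @ Y1)
    b1_star = b1 - Sigma[0, 1] / Sigma[1, 1] * np.linalg.solve(X1.T @ X1, X1.T @ Y2)
    b1_B = b1_star - B1 @ (b1_star - mu1)
    loss_gls += np.sum((b1_star - beta1) ** 2)
    loss_mbrlu += np.sum((b1_B - beta1) ** 2)

print(loss_gls / n_rep, loss_mbrlu / n_rep)   # the second average should not exceed the first
```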
4 Superiorities of MBRLU Estimators Under the PRPC and PPC Criteria
The criterion of Pitman closeness (PC), originally introduced by Pitman (1937), is based on the probabilities of the relative closeness of competing estimators to an unknown parameter or parameter vector. After a long fallow period, renewed interest in this topic has been sparked in the last twenty years. Rao (1981), Keating and Mason (1985) and Rao et al. (1986) helped to resurrect the criterion as an alternative to traditional comparison criteria such as the MSE criterion and the mean absolute error (MAE) criterion. Mason et al. (1990) and Fountain and Keating (1994) proposed general methods for comparing linear estimators under the PC criterion. Many important contributions in this direction are described by Keating et al. (1993) and others.
Definition 4.1 Let $\hat\theta_1$ and $\hat\theta_2$ be two different estimators of $\theta$ and let $L(\hat\theta,\theta)$ be the loss function. If
$$P\big(L(\hat\theta_1,\theta) \le L(\hat\theta_2,\theta)\big) \ge 0.5 \quad \text{for all } \theta\in\Theta,$$
with strict inequality for some $\theta\in\Theta$, where $\Theta$ is the parameter space, then $\hat\theta_1$ is said to be Pitman closer than $\hat\theta_2$, or superior to $\hat\theta_2$ under the PC criterion.
Ghosh and Sen (1991) introduced two alternative notions of PC motivated from a Bayesian viewpoint, called the predictive Pitman closeness (PRPC) and the posterior Pitman closeness (PPC) criteria. They are defined as follows.
Definition 4.2 Let $\Gamma$ be a class of prior distributions of $\theta$ and let $\hat\theta_1$ and $\hat\theta_2$ be two different estimators of $\theta$. Then $\hat\theta_1$ is said to be superior to $\hat\theta_2$ under the PRPC criterion if
$$P_\pi\big(L(\hat\theta_1,\theta) \le L(\hat\theta_2,\theta)\big) \ge 0.5 \quad \text{for every } \pi\in\Gamma,$$
where $P_\pi$ is computed under the joint distribution of $Y$ and $\theta$ for every $\pi\in\Gamma$.
Definition 4.3 Suppose $\pi$ is a prior distribution of $\theta$ and $\hat\theta_1$ and $\hat\theta_2$ are two different estimators of $\theta$. Then $\hat\theta_1$ is said to be superior to $\hat\theta_2$ under the PPC criterion if
$$P_\pi\big(L(\hat\theta_1,\theta) \le L(\hat\theta_2,\theta)\,\big|\,y\big) \ge 0.5 \quad \text{for all } y\in\mathcal{Y},$$
with strict inequality for some $y\in\mathcal{Y}$, where $\mathcal{Y}$ is the sample space.
Obviously, if an estimator $\hat\theta_1$ of $\theta$ is superior to $\hat\theta_2$ for every $\pi\in\Gamma$ under the PPC criterion, then it is also superior to $\hat\theta_2$ under the PRPC criterion; the converse is not necessarily true. Ghosh and Sen (1991) presented an example showing that the classical James-Stein estimator is superior to the sample mean under the PRPC criterion for all priors, but that this does not hold under the PPC criterion.
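For intuition only, a Pitman closeness probability of the form in Definition 4.1 can be estimated by simulation. The two competing rules below (a simple shrinkage estimator versus the unbiased estimator of a normal mean) are generic illustrations and are not the estimators studied in this paper.

```python
# Sketch: estimate P(L(theta1_hat, theta) <= L(theta2_hat, theta)) by simulation.
import numpy as np

rng = np.random.default_rng(2)
p, n_rep = 5, 10000
theta = np.full(p, 1.0)                    # a fixed, illustrative parameter value

wins = 0
for _ in range(n_rep):
    x = theta + rng.standard_normal(p)     # one observation from N(theta, I_p)
    theta1_hat = 0.5 * x                   # shrinkage-type rule
    theta2_hat = x                         # unbiased rule
    wins += np.sum((theta1_hat - theta) ** 2) <= np.sum((theta2_hat - theta) ** 2)

print(wins / n_rep)   # > 0.5 indicates theta1_hat is Pitman closer at this theta
```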
Let the loss function be defined by (2.2), and in this section assume
$$\varepsilon\,|\,\beta \sim N(0, \Sigma_*\otimes I_n). \qquad (4.1)$$
For the MBRLU estimator $\hat\beta_{1B}$, we have the following result.
Theorem 4.1 Let the GLS estimator and MBRLU estimator of $\beta_1$ be given by (1.5) and (2.10). If
$$\frac{\tau_1^2}{\sigma_{11.2}} \le \frac{\lambda_{p_1}(p_1-2)}{2p_1\lambda_1^2}, \qquad (4.2)$$
then
$$P_\pi\big(L(\hat\beta_{1B},\beta_1) \le L(\hat\beta_1^*,\beta_1)\big) \ge 0.5 \quad \text{for every } \pi\in\Gamma(\beta_1),$$
where $\lambda_1$ and $\lambda_{p_1}$ are the maximum and minimum eigenvalues of $X_1'X_1$ and $\Gamma(\beta_1) = \{\pi(\beta_1): E(\beta_1)=\mu_1,\ \mathrm{Cov}(\beta_1)=\tau_1^2I_{p_1}\}$.
Proof: Denote $B_1 = (\tau_1^2\sigma_{11.2}^{-1}X_1'X_1 + I_{p_1})^{-1}$ and $W(\hat\beta_{1B},\hat\beta_1^*;\beta_1) = L(\hat\beta_{1B},\beta_1) - L(\hat\beta_1^*,\beta_1)$. By (2.10) we have
$$\begin{aligned}
L(\hat\beta_{1B},\beta_1) &= \big[(\hat\beta_1^*-\beta_1) - B_1(\hat\beta_1^*-\mu_1)\big]'\big[(\hat\beta_1^*-\beta_1) - B_1(\hat\beta_1^*-\mu_1)\big]\\
&= L(\hat\beta_1^*,\beta_1) - 2(\hat\beta_1^*-\mu_1)'B_1(\hat\beta_1^*-\beta_1) + (\hat\beta_1^*-\mu_1)'B_1^2(\hat\beta_1^*-\mu_1). \qquad (4.3)
\end{aligned}$$
Hence $W \le 0$ is equivalent to
$$(\hat\beta_1^*-\mu_1)'B_1^2(\hat\beta_1^*-\mu_1) - 2(\hat\beta_1^*-\mu_1)'B_1(\hat\beta_1^*-\beta_1) \le 0. \qquad (4.4)$$
Since $B_1^2 \le B_1$, we know that (4.4) is implied by
$$(\hat\beta_1^*-\mu_1)'B_1(\hat\beta_1^*-\mu_1) - 2(\hat\beta_1^*-\mu_1)'B_1(\hat\beta_1^*-\beta_1) \le 0. \qquad (4.5)$$
Substituting $\hat\beta_1^*-\mu_1 = (\hat\beta_1^*-\beta_1) - (\mu_1-\beta_1)$ into (4.5), it is equivalent to
$$(\beta_1-\mu_1)'B_1(\beta_1-\mu_1) \le (\hat\beta_1^*-\beta_1)'B_1(\hat\beta_1^*-\beta_1). \qquad (4.6)$$
Since
$$\Big(\frac{\tau_1^2}{\sigma_{11.2}}\lambda_1 + 1\Big)^{-1}I_{p_1} \le B_1 = \Big(\frac{\tau_1^2}{\sigma_{11.2}}X_1'X_1 + I_{p_1}\Big)^{-1} \le \Big(\frac{\tau_1^2}{\sigma_{11.2}}\lambda_{p_1} + 1\Big)^{-1}I_{p_1},$$
it is easy to see that (4.6) is implied by
$$\Big(\frac{\tau_1^2}{\sigma_{11.2}}\lambda_{p_1} + 1\Big)^{-1}(\beta_1-\mu_1)'(\beta_1-\mu_1) \le \Big(\frac{\tau_1^2}{\sigma_{11.2}}\lambda_1 + 1\Big)^{-1}(\hat\beta_1^*-\beta_1)'(\hat\beta_1^*-\beta_1). \qquad (4.7)$$
Since
$$0 < \frac{\tau_1^2\sigma_{11.2}^{-1}\lambda_1 + 1}{\tau_1^2\sigma_{11.2}^{-1}\lambda_{p_1} + 1} \le \frac{\lambda_1}{\lambda_{p_1}},$$
(4.7) is implied by
$$\frac{\lambda_1}{\lambda_{p_1}}(\beta_1-\mu_1)'(\beta_1-\mu_1) \le (\hat\beta_1^*-\beta_1)'(\hat\beta_1^*-\beta_1). \qquad (4.8)$$
Note that $(\hat\beta_1^*-\beta_1)\,|\,\beta \sim N\big(0,\ \sigma_{11.2}(X_1'X_1)^{-1}\big)$. Let $Z = \sigma_{11.2}^{-1/2}(X_1'X_1)^{1/2}(\hat\beta_1^*-\beta_1)$; then $Z\,|\,\beta \sim N(0, I_{p_1})$, and inequality (4.8) is equivalent to
$$\frac{\lambda_1}{\lambda_{p_1}}(\beta_1-\mu_1)'(\beta_1-\mu_1) \le \sigma_{11.2}\,Z'(X_1'X_1)^{-1}Z. \qquad (4.9)$$
Noticing that $(X_1'X_1)^{-1} \ge \lambda_1^{-1}I_{p_1}$, (4.9) is implied by
$$\frac{\lambda_1^2}{\lambda_{p_1}\sigma_{11.2}}\,\|\beta_1-\mu_1\|^2 \le Z'Z. \qquad (4.10)$$
Since $Z'Z\,|\,\beta \sim \chi^2_{p_1}$, by (4.4)-(4.10) and the Markov inequality we have
$$\begin{aligned}
P_\pi\big(W(\hat\beta_{1B},\hat\beta_1^*;\beta_1) \le 0\big) &\ge P_\pi\Big(Z'Z \ge \frac{\lambda_1^2}{\lambda_{p_1}\sigma_{11.2}}\|\beta_1-\mu_1\|^2\Big)\\
&= 1 - P_\pi\Big(\|\beta_1-\mu_1\|^2 \ge \frac{\lambda_{p_1}\sigma_{11.2}}{\lambda_1^2}Z'Z\Big)\\
&= 1 - E\Big[P_\pi\Big(\|\beta_1-\mu_1\|^2 \ge \frac{\lambda_{p_1}\sigma_{11.2}}{\lambda_1^2}Z'Z\ \Big|\ Z'Z\Big)\Big]\\
&\ge 1 - E\Big(\frac{\lambda_1^2\,E\|\beta_1-\mu_1\|^2}{\lambda_{p_1}\sigma_{11.2}\,Z'Z}\Big) = 1 - \frac{\lambda_1^2\,\mathrm{tr}[\mathrm{Cov}(\beta_1)]}{\sigma_{11.2}\lambda_{p_1}}E\Big(\frac{1}{Z'Z}\Big)\\
&= 1 - \frac{\lambda_1^2\,p_1\tau_1^2}{\sigma_{11.2}\lambda_{p_1}(p_1-2)} \ge \frac{1}{2}.
\end{aligned}$$
The proof of Theorem 4.1 is finished.
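A small sketch of how the sufficient condition (4.2) might be checked numerically for a given design and prior variance; `X1`, `tau1sq` and `s112` below are illustrative inputs.

```python
# Check condition (4.2): tau1^2 / sigma_{11.2} <= lambda_min (p1 - 2) / (2 p1 lambda_max^2).
import numpy as np

rng = np.random.default_rng(3)
X1 = rng.standard_normal((50, 5))
tau1sq, s112 = 0.001, 1.0

eig = np.linalg.eigvalsh(X1.T @ X1)        # ascending eigenvalues of X1'X1
lam_max, lam_min = eig[-1], eig[0]
p1 = X1.shape[1]
bound = lam_min * (p1 - 2) / (2 * p1 * lam_max ** 2)
print(tau1sq / s112 <= bound)              # True means Theorem 4.1's condition holds
```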
For the estimator $\hat\beta_{2B}$ we have a similar result.
Theorem 4.2 Let the GLS estimator and MBRLU estimator of $\beta_2$ be given by (1.6) and (2.11). If
$$\frac{\tau_2^2}{\sigma_{22.1}} \le \frac{\lambda^*_{p_2}(p_2-2)}{2p_2\lambda_1^{*2}}, \qquad (4.11)$$
then
$$P_\pi\big(L(\hat\beta_{2B},\beta_2) \le L(\hat\beta_2^*,\beta_2)\big) \ge 0.5 \quad \text{for every } \pi\in\Gamma(\beta_2),$$
where $\lambda^*_1$ and $\lambda^*_{p_2}$ are the maximum and minimum eigenvalues of $X_2'X_2$ and $\Gamma(\beta_2) = \{\pi(\beta_2): E(\beta_2)=\mu_2,\ \mathrm{Cov}(\beta_2)=\tau_2^2I_{p_2}\}$.
To discuss the PPC properties of the MBRLU estimators, we further assume that the prior $\pi(\beta)$ is normal, i.e.,
$$\beta \sim N(\mu, V). \qquad (4.12)$$
Then we have the following result.
Theorem 4.3 Under assumptions (4.1) and (4.12),
$$P_\pi\big(L(\hat\beta_{iB},\beta_i) \le L(\hat\beta_i^*,\beta_i)\,\big|\,Y = y\big) \ge 0.5 \quad \text{for any } y\in\mathcal{Y},$$
i.e., $\hat\beta_{iB}$ is superior to $\hat\beta_i^*$ under the PPC criterion, where $\mathcal{Y}$ is the sample space.
Proof: We only prove the case $i = 1$; the proof for $\hat\beta_{2B}$ is similar. Note that
$$\begin{aligned}
W(\hat\beta_{1B},\hat\beta_1^*;\beta_1) &= \|\hat\beta_{1B}-\beta_1\|^2 - \|\hat\beta_1^*-\beta_1\|^2\\
&= \big[(\hat\beta_{1B}-\hat\beta_1^*) + (\hat\beta_1^*-\beta_1)\big]'\big[(\hat\beta_{1B}-\hat\beta_1^*) + (\hat\beta_1^*-\beta_1)\big] - \|\hat\beta_1^*-\beta_1\|^2\\
&= \|\hat\beta_1^*-\hat\beta_{1B}\|^2 + 2(\hat\beta_1^*-\hat\beta_{1B})'\big[(\beta_1-\hat\beta_{1B}) - (\hat\beta_1^*-\hat\beta_{1B})\big]\\
&= 2(\hat\beta_1^*-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B}) - \|\hat\beta_1^*-\hat\beta_{1B}\|^2. \qquad (4.13)
\end{aligned}$$
Obviously,
$$W \le 0 \iff 2(\hat\beta_1^*-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B}) \le \|\hat\beta_1^*-\hat\beta_{1B}\|^2.$$
From (4.12) we know that the prior of $\beta_1$ is normal; hence the posterior distribution of $\beta_1$ given $Y = y$ is also normal. Under the quadratic loss function the Bayes estimator of $\beta_1$ is $E(\beta_1|y)$, which has the same expression as (2.10); therefore the posterior distribution of $\beta_1-\hat\beta_{1B} = \beta_1 - E(\beta_1|y)$ given $Y = y$ is $p_1$-dimensional normal with zero mean. Thus the posterior distribution of $2(\hat\beta_1^*-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B})$ given $Y = y$ is one-dimensional normal with zero mean. Hence we have
$$\begin{aligned}
P_\pi\big(L(\hat\beta_{1B},\beta_1) \le L(\hat\beta_1^*,\beta_1)\,\big|\,y\big) &= P_\pi\big(W(\hat\beta_{1B},\hat\beta_1^*;\beta_1) \le 0\,\big|\,y\big)\\
&= P_\pi\big(2(\hat\beta_1^*-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B}) \le \|\hat\beta_1^*-\hat\beta_{1B}\|^2\,\big|\,y\big)\\
&> P_\pi\big(2(\hat\beta_1^*-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B}) \le 0\,\big|\,y\big) = 0.5,
\end{aligned}$$
where the last inequality holds because $\|\hat\beta_1^*-\hat\beta_{1B}\|^2 > 0$ with probability one. Theorem 4.3 has been proved.
5 The MBRLU Estimators of Estimable Functions and Their Superiorities
In this section we consider the case where, in (1.2), $X$ is not of full column rank, i.e., $R(X) < p_1 + p_2$. In this case $\beta$ is not estimable, so we consider the estimable function $\eta = X\beta$. Since any estimable function of $\beta$ can be expressed as a linear function of $\eta$, it suffices to study estimators of $\eta$. The GLS estimator of $\eta = X\beta$ is
$$\hat\eta_{GLS} = X(X'\Phi^{-1}X)^{-}X'\Phi^{-1}Y,$$
where the expression of $\hat\eta_{GLS}$ does not depend on the choice of the g-inverse $(X'\Phi^{-1}X)^{-}$, so it can be replaced by the Moore-Penrose inverse $(X'\Phi^{-1}X)^{+}$. Then we have
$$\hat\eta_{GLS} = X(X'\Phi^{-1}X)^{+}X'\Phi^{-1}Y = \begin{pmatrix}\hat\eta_1^*\\[2pt] \hat\eta_2^*\end{pmatrix}, \qquad (5.1)$$
where $\hat\eta_i^*$ is the GLS estimator of $\eta_i = X_i\beta_i$ ($i=1,2$), and
$$\hat\eta_1^* = X_1(X_1'X_1)^{+}X_1'Y_1 - \frac{\sigma_{12}}{\sigma_{22}}X_1(X_1'X_1)^{+}X_1'Y_2 = \hat\eta_1 - \frac{\sigma_{12}}{\sigma_{22}}X_1(X_1'X_1)^{+}X_1'Y_2, \qquad (5.2)$$
$$\hat\eta_2^* = X_2(X_2'X_2)^{+}X_2'Y_2 - \frac{\sigma_{12}}{\sigma_{11}}X_2(X_2'X_2)^{+}X_2'Y_1 = \hat\eta_2 - \frac{\sigma_{12}}{\sigma_{11}}X_2(X_2'X_2)^{+}X_2'Y_1, \qquad (5.3)$$
with $\hat\eta_1 = X_1(X_1'X_1)^{+}X_1'Y_1$ and $\hat\eta_2 = X_2(X_2'X_2)^{+}X_2'Y_2$.
Similar to the derivation in Section 2, we may obtain the MBRLU estimator of $\eta = X\beta$. Let $\eta_0 = X\mu$. Then we have
$$\hat\eta_B = XVX'(XVX'+\Phi)^{-1}Y + \big[I - XVX'(XVX'+\Phi)^{-1}\big]\eta_0. \qquad (5.4)$$
By (2.6) and the fact that $X'\Phi^{-1} = X'\Phi^{-1}X(X'\Phi^{-1}X)^{+}X'\Phi^{-1}$, we know that
$$\begin{aligned}
XVX'(XVX'+\Phi)^{-1} &= XVX'\big[\Phi^{-1} - \Phi^{-1}X(V^{-1}+X'\Phi^{-1}X)^{-1}X'\Phi^{-1}\big]\\
&= XV\big[I - X'\Phi^{-1}X(V^{-1}+X'\Phi^{-1}X)^{-1}\big]X'\Phi^{-1}\\
&= X(V^{-1}+X'\Phi^{-1}X)^{-1}X'\Phi^{-1} \;\hat{=}\; H\\
&= X(V^{-1}+X'\Phi^{-1}X)^{-1}X'\Phi^{-1}X(X'\Phi^{-1}X)^{+}X'\Phi^{-1}\\
&= HX(X'\Phi^{-1}X)^{+}X'\Phi^{-1}. \qquad (5.5)
\end{aligned}$$
Therefore, using (5.4), (5.5) and the fact that $X(X'X)^{+}X'X = X$, we have
$$\begin{aligned}
\hat\eta_B &= HX(X'\Phi^{-1}X)^{+}X'\Phi^{-1}Y + (I-H)\eta_0\\
&= H\hat\eta_{GLS} + (I-H)\eta_0 = \hat\eta_{GLS} - (I-H)(\hat\eta_{GLS}-\eta_0)\\
&= \hat\eta_{GLS} - \big[X - HX(X'X)^{+}X'X\big](\hat\beta_{GLS}-\mu)\\
&= \hat\eta_{GLS} - \big[I - HX(X'X)^{+}X'\big](\hat\eta_{GLS}-\eta_0). \qquad (5.6)
\end{aligned}$$
By the assumption $X_1'X_2 = 0$ we have
$$\begin{aligned}
X'\Phi^{-1}X(X'X)^{+}X' &= \begin{pmatrix}X_1' & 0\\ 0 & X_2'\end{pmatrix}\begin{pmatrix}\sigma_{11.2}^{-1}I_n & -\rho_{12}I_n\\ -\rho_{12}I_n & \sigma_{22.1}^{-1}I_n\end{pmatrix}\begin{pmatrix}X_1(X_1'X_1)^{+}X_1' & 0\\ 0 & X_2(X_2'X_2)^{+}X_2'\end{pmatrix}\\
&= \begin{pmatrix}\sigma_{11.2}^{-1}X_1'X_1(X_1'X_1)^{+}X_1' & 0\\ 0 & \sigma_{22.1}^{-1}X_2'X_2(X_2'X_2)^{+}X_2'\end{pmatrix} = \begin{pmatrix}\sigma_{11.2}^{-1}X_1' & 0\\ 0 & \sigma_{22.1}^{-1}X_2'\end{pmatrix}, \qquad (5.7)
\end{aligned}$$
where $\rho_{12} = \sigma_{12}/(\sigma_{11}\sigma_{22}-\sigma_{12}^2)$,
and
$$\begin{aligned}
HX(X'X)^{+}X' &= X\big[V^{-1}+X'\Phi^{-1}X\big]^{-1}X'\Phi^{-1}X(X'X)^{+}X'\\
&= \begin{pmatrix}X_1 & 0\\ 0 & X_2\end{pmatrix}\begin{pmatrix}\frac{1}{\sigma_{11.2}}X_1'X_1 + \frac{1}{\tau_1^2}I_{p_1} & 0\\ 0 & \frac{1}{\sigma_{22.1}}X_2'X_2 + \frac{1}{\tau_2^2}I_{p_2}\end{pmatrix}^{-1}\begin{pmatrix}\sigma_{11.2}^{-1}X_1' & 0\\ 0 & \sigma_{22.1}^{-1}X_2'\end{pmatrix}\\
&= \begin{pmatrix}X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1' & 0\\ 0 & X_2(X_2'X_2+\delta_2I_{p_2})^{-1}X_2'\end{pmatrix}, \qquad (5.8)
\end{aligned}$$
with $\delta_1 = \sigma_{11.2}\tau_1^{-2}$ and $\delta_2 = \sigma_{22.1}\tau_2^{-2}$.
Substituting (5.8) into (5.6) we obtain
$$\hat\eta_B = \hat\eta_{GLS} - \big[I - HX(X'X)^{+}X'\big](\hat\eta_{GLS}-\eta_0) = \begin{pmatrix}\hat\eta_1^* - \big[I_n - X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'\big](\hat\eta_1^*-\eta_{01})\\[2pt] \hat\eta_2^* - \big[I_n - X_2(X_2'X_2+\delta_2I_{p_2})^{-1}X_2'\big](\hat\eta_2^*-\eta_{02})\end{pmatrix}, \qquad (5.9)$$
where $\eta_0' = (\eta_{01}', \eta_{02}')$ and $\eta' = (\eta_1', \eta_2')$. So the MBRLU estimators of $\eta_i$ ($i = 1, 2$) are
$$\hat\eta_{1B} = \hat\eta_1^* - \big[I_n - X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'\big](\hat\eta_1^*-\eta_{01}), \qquad (5.10)$$
$$\hat\eta_{2B} = \hat\eta_2^* - \big[I_n - X_2(X_2'X_2+\delta_2I_{p_2})^{-1}X_2'\big](\hat\eta_2^*-\eta_{02}). \qquad (5.11)$$
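A sketch of evaluating (5.10) when the design matrix may be rank deficient, using the Moore-Penrose inverse; the design, the prior value `eta0` and the ratio `delta` are illustrative, and only the leading term $\hat\eta_1$ of (5.2) is computed for brevity.

```python
# Sketch of the MBRLU estimator (5.10) of eta_1 = X1 beta_1 with a rank-deficient X1.
import numpy as np

def eta_mbrlu(X, eta_star, eta0, delta):
    """Shrink the GLS fit eta_star toward eta0 as in (5.10):
    eta_star - [I - X (X'X + delta I)^{-1} X'] (eta_star - eta0)."""
    n, p = X.shape
    G = np.eye(n) - X @ np.linalg.inv(X.T @ X + delta * np.eye(p)) @ X.T
    return eta_star - G @ (eta_star - eta0)

rng = np.random.default_rng(4)
X1 = rng.standard_normal((30, 4))
X1[:, 3] = X1[:, 0]                                      # make X1 rank deficient
Y1 = X1 @ np.array([1.0, 2.0, -1.0, 0.0]) + rng.standard_normal(30)

eta1_hat = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T @ Y1    # eta_hat_1, leading term of (5.2)
eta1_B = eta_mbrlu(X1, eta1_hat, eta0=np.zeros(30), delta=2.0)
```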
Theorem 5.1 Let the GLS estimators and MBRLU estimators of $\eta_i = X_i\beta_i$ be given by (5.2), (5.3) and (5.10), (5.11), respectively. Then
$$M(\hat\eta_i^*) - M(\hat\eta_{iB}) \ge 0, \qquad i = 1, 2.$$
Proof: We only prove the case $i = 1$; the case $i = 2$ can be proved in a similar way. Let $G_1 = I_n - X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'$. Then
$$\begin{aligned}
M(\hat\eta_{1B}) &= E\big\{[\hat\eta_1^* - G_1(\hat\eta_1^*-\eta_{01}) - \eta_1][\hat\eta_1^* - G_1(\hat\eta_1^*-\eta_{01}) - \eta_1]'\big\}\\
&= E\big\{[(\hat\eta_1^*-\eta_1) - G_1(\hat\eta_1^*-\eta_{01})][(\hat\eta_1^*-\eta_1) - G_1(\hat\eta_1^*-\eta_{01})]'\big\}\\
&= E\big\{(\hat\eta_1^*-\eta_1)(\hat\eta_1^*-\eta_1)' - (\hat\eta_1^*-\eta_1)(\hat\eta_1^*-\eta_{01})'G_1' - G_1(\hat\eta_1^*-\eta_{01})(\hat\eta_1^*-\eta_1)' + G_1(\hat\eta_1^*-\eta_{01})(\hat\eta_1^*-\eta_{01})'G_1'\big\}\\
&= M(\hat\eta_1^*) - K_1G_1' - G_1K_1' + G_1K_2G_1', \qquad (5.12)
\end{aligned}$$
where
$$\begin{aligned}
K_2 &= E\big[(\hat\eta_1^*-\eta_{01})(\hat\eta_1^*-\eta_{01})'\big] = \mathrm{Cov}(\hat\eta_1^*) = \mathrm{Cov}\big[E(\hat\eta_1^*|\eta)\big] + E\big[\mathrm{Cov}(\hat\eta_1^*|\eta)\big]\\
&= \tau_1^2X_1X_1' + \sigma_{11.2}X_1(X_1'X_1)^{+}X_1', \qquad (5.13)
\end{aligned}$$
$$K_1 = E\big[(\hat\eta_1^*-\eta_1)(\hat\eta_1^*-\eta_{01})'\big] = \mathrm{Cov}(\hat\eta_1^*) - \mathrm{Cov}(\eta_1) = \sigma_{11.2}X_1(X_1'X_1)^{+}X_1'. \qquad (5.14)$$
Putting (5.13) and (5.14) into (5.12) and using the fact that $G_1 = I_n - X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1' = (I_n + \delta_1^{-1}X_1X_1')^{-1}$, we have
$$\begin{aligned}
M(\hat\eta_1^*) - M(\hat\eta_{1B}) &= K_1G_1' + G_1K_1' - G_1K_2G_1'\\
&= G_1\big(\tau_1^2X_1X_1' + \sigma_{11.2}X_1(X_1'X_1)^{+}X_1'\big)G_1' \ge 0.
\end{aligned}$$
Theorem 5.1 has been proved.
For any general estimable function $\gamma_i = P_i\beta_i$, where $P_i$ is a $k_i\times p_i$ real matrix, there exists a $k_i\times n$ matrix $C_i$ such that $P_i = C_iX_i$; therefore $\gamma_i = C_iX_i\beta_i = C_i\eta_i$ ($i = 1, 2$). It is easy to see that the GLS estimators and MBRLU estimators of $\gamma_i$ are
$$\hat\gamma_i^* = C_i\hat\eta_i^*, \qquad \hat\gamma_{iB} = C_i\hat\eta_{iB}, \qquad i = 1, 2, \qquad (5.15)$$
and we have the following corollary.
Corollary 5.1 Let the GLS estimators and MBRLU estimators of the estimable functions $\gamma_i$ ($i = 1, 2$) be given by (5.15). Then
$$M(\hat\gamma_i^*) - M(\hat\gamma_{iB}) \ge 0, \qquad i = 1, 2.$$
Proof: The conclusion holds since
$$M(\hat\gamma_i^*) - M(\hat\gamma_{iB}) = C_i\big[M(\hat\eta_i^*) - M(\hat\eta_{iB})\big]C_i' \ge 0.$$
We now discuss the superiority of the MBRLU estimator of $\eta_1$ under the PRPC criterion; the superiority of the MBRLU estimator of $\eta_2$ can be discussed similarly. Under the loss function (2.2) we have the following result.
Theorem 5.2 Let the GLS estimator and MBRLU estimator of $\eta_1 = X_1\beta_1$ be given by (5.2) and (5.10). Under condition (4.1), suppose $R(X_1) = t_1 \le p_1$. If
$$\frac{\tau_1^2}{\sigma_{11.2}} \le \frac{\tilde\lambda_{t_1}(t_1-2)}{2t_1\tilde\lambda_1^2}, \qquad (5.16)$$
then we have
$$P_\pi\big(L(\hat\eta_{1B},\eta_1) \le L(\hat\eta_1^*,\eta_1)\big) \ge 0.5 \quad \text{for every } \pi\in\Gamma(\beta_1),$$
where $\tilde\lambda_1$ and $\tilde\lambda_{t_1}$ are the maximum and minimum positive eigenvalues of $X_1'X_1$, and $\Gamma(\beta_1)$ is given in Theorem 4.1.
Remark 5.1 Condition (5.16) indicates that the prior variance should not be too large relative to the sampling variance; it imposes a requirement on the precision of the prior distribution.
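A small sketch of how condition (5.16) might be checked when $X_1$ is rank deficient: only the positive eigenvalues of $X_1'X_1$ enter the bound. The inputs `X1`, `tau1sq` and `s112` are illustrative.

```python
# Check condition (5.16) using the positive eigenvalues of X1'X1 only.
import numpy as np

rng = np.random.default_rng(5)
X1 = rng.standard_normal((30, 4))
X1[:, 3] = X1[:, 0]                         # rank(X1) = t1 = 3 < p1 = 4
tau1sq, s112 = 0.001, 1.0

eig = np.linalg.eigvalsh(X1.T @ X1)
pos = eig[eig > 1e-8 * eig.max()]           # positive eigenvalues of X1'X1
t1 = pos.size
bound = pos.min() * (t1 - 2) / (2 * t1 * pos.max() ** 2)
print(t1, tau1sq / s112 <= bound)           # True means Theorem 5.2's condition holds
```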
Proof: Let $G_1 = I_n - X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1' = (I_n + \delta_1^{-1}X_1X_1')^{-1}$ and $W(\hat\eta_{1B},\hat\eta_1^*;\eta_1) = L(\hat\eta_{1B},\eta_1) - L(\hat\eta_1^*,\eta_1)$, where $\delta_1$ is given in (5.8). By (5.10) we have
$$\begin{aligned}
L(\hat\eta_{1B},\eta_1) &= \big[(\hat\eta_1^*-\eta_1) - G_1(\hat\eta_1^*-\eta_{01})\big]'\big[(\hat\eta_1^*-\eta_1) - G_1(\hat\eta_1^*-\eta_{01})\big]\\
&= L(\hat\eta_1^*,\eta_1) - 2(\hat\eta_1^*-\eta_{01})'G_1(\hat\eta_1^*-\eta_1) + (\hat\eta_1^*-\eta_{01})'G_1^2(\hat\eta_1^*-\eta_{01}). \qquad (5.17)
\end{aligned}$$
From (5.17), it is easy to see that $W \le 0$ is equivalent to
$$(\hat\eta_1^*-\eta_{01})'G_1^2(\hat\eta_1^*-\eta_{01}) \le 2(\hat\eta_1^*-\eta_{01})'G_1(\hat\eta_1^*-\eta_1). \qquad (5.18)$$
Since $G_1^2 \le G_1$, (5.18) is implied by
$$(\hat\eta_1^*-\eta_{01})'G_1(\hat\eta_1^*-\eta_{01}) \le 2(\hat\eta_1^*-\eta_{01})'G_1(\hat\eta_1^*-\eta_1). \qquad (5.19)$$
Substituting $\hat\eta_1^*-\eta_{01} = (\hat\eta_1^*-\eta_1) - (\eta_{01}-\eta_1)$ into (5.19), we have
$$(\eta_1-\eta_{01})'G_1(\eta_1-\eta_{01}) \le (\hat\eta_1^*-\eta_1)'G_1(\hat\eta_1^*-\eta_1). \qquad (5.20)$$
Since $\tilde\lambda_1$ and $\tilde\lambda_{t_1}$ are the maximum and minimum non-zero eigenvalues of $X_1'X_1$, respectively,
$$\Big(\frac{\tau_1^2}{\sigma_{11.2}}\tilde\lambda_1 + 1\Big)^{-1}I_n \le G_1 = \Big(\frac{\tau_1^2}{\sigma_{11.2}}X_1X_1' + I_n\Big)^{-1} \le \Big(\frac{\tau_1^2}{\sigma_{11.2}}\tilde\lambda_{t_1} + 1\Big)^{-1}I_n,$$
where the upper bound holds on the column space of $X_1$ (which contains both $\eta_1-\eta_{01}$ and $\hat\eta_1^*-\eta_1$). Hence (5.20) is implied by
$$\big(\delta_1^{-1}\tilde\lambda_{t_1} + 1\big)^{-1}(\eta_1-\eta_{01})'(\eta_1-\eta_{01}) \le \big(\delta_1^{-1}\tilde\lambda_1 + 1\big)^{-1}(\hat\eta_1^*-\eta_1)'(\hat\eta_1^*-\eta_1). \qquad (5.21)$$
Noting that
$$0 < \frac{\delta_1^{-1}\tilde\lambda_1 + 1}{\delta_1^{-1}\tilde\lambda_{t_1} + 1} \le \frac{\tilde\lambda_1}{\tilde\lambda_{t_1}},$$
(5.21) is implied by
$$\frac{\tilde\lambda_1}{\tilde\lambda_{t_1}}(\eta_1-\eta_{01})'(\eta_1-\eta_{01}) \le (\hat\eta_1^*-\eta_1)'(\hat\eta_1^*-\eta_1). \qquad (5.22)$$
From (4.1) we know that
$$(\hat\eta_1^*-\eta_1)\,\big|\,\eta \sim N\big(0,\ \sigma_{11.2}X_1(X_1'X_1)^{+}X_1'\big).$$
Since $Q = X_1(X_1'X_1)^{+}X_1'$ is an idempotent matrix, there exists an orthogonal matrix $P$ such that
$$P'QP = \begin{pmatrix}I_{t_1} & 0\\ 0 & 0\end{pmatrix}.$$
Let $\tilde Z = \sigma_{11.2}^{-1/2}P'(\hat\eta_1^*-\eta_1)$; then
$$\tilde Z\,\big|\,\eta \sim N\Big(0,\ \begin{pmatrix}I_{t_1} & 0\\ 0 & 0\end{pmatrix}\Big),$$
which implies $\tilde Z'\tilde Z \sim \chi^2_{t_1}$. Thus (5.22) is equivalent to
$$\frac{\tilde\lambda_1}{\tilde\lambda_{t_1}\sigma_{11.2}}\,\|\eta_1-\eta_{01}\|^2 \le \tilde Z'\tilde Z. \qquad (5.23)$$
By (5.18)-(5.23) and the Markov inequality we obtain
$$\begin{aligned}
P_\pi\big(W(\hat\eta_{1B},\hat\eta_1^*;\eta_1) \le 0\big) &\ge P_\pi\Big(\tilde Z'\tilde Z \ge \frac{\tilde\lambda_1}{\sigma_{11.2}\tilde\lambda_{t_1}}\|\eta_1-\eta_{01}\|^2\Big)\\
&= 1 - P_\pi\Big(\|\eta_1-\eta_{01}\|^2 \ge \frac{\sigma_{11.2}\tilde\lambda_{t_1}}{\tilde\lambda_1}\tilde Z'\tilde Z\Big)\\
&= 1 - E\Big[P_\pi\Big(\|\eta_1-\eta_{01}\|^2 \ge \frac{\sigma_{11.2}\tilde\lambda_{t_1}}{\tilde\lambda_1}\tilde Z'\tilde Z\ \Big|\ \tilde Z'\tilde Z\Big)\Big]\\
&\ge 1 - \frac{\tilde\lambda_1\,E\|\eta_1-\eta_{01}\|^2}{\sigma_{11.2}\tilde\lambda_{t_1}}E\Big(\frac{1}{\tilde Z'\tilde Z}\Big) = 1 - \frac{\tilde\lambda_1\,\mathrm{tr}(\mathrm{Cov}(\eta_1))}{\sigma_{11.2}\tilde\lambda_{t_1}(t_1-2)}\\
&= 1 - \frac{\tilde\lambda_1\,\tau_1^2\,\mathrm{tr}(X_1'X_1)}{\sigma_{11.2}\tilde\lambda_{t_1}(t_1-2)} \ge 1 - \frac{t_1\tilde\lambda_1^2\,\tau_1^2}{\sigma_{11.2}\tilde\lambda_{t_1}(t_1-2)} \ge 0.5.
\end{aligned}$$
The proof of Theorem 5.2 is completed.
Similar to Theorem 5.2, we have the following theorem.
Theorem 5.3 Let the GLS estimator and MBRLU estimator of $\eta_2 = X_2\beta_2$ be given by (5.3) and (5.11). Under condition (4.1), suppose $R(X_2) = t_2 \le p_2$. If
$$\frac{\tau_2^2}{\sigma_{22.1}} \le \frac{\bar\lambda_{t_2}(t_2-2)}{2t_2\bar\lambda_1^2}, \qquad (5.24)$$
then we have
$$P_\pi\big(L(\hat\eta_{2B},\eta_2) \le L(\hat\eta_2^*,\eta_2)\big) \ge 0.5 \quad \text{for every } \pi\in\Gamma(\beta_2),$$
where $\bar\lambda_1$ and $\bar\lambda_{t_2}$ are the maximum and minimum positive eigenvalues of $X_2'X_2$, and $\Gamma(\beta_2)$ is the same as in Theorem 4.2.
Similar to the proof of Theorem 4.3, we have the following result.
Theorem 5.4 Let the GLS estimators and MBRLU estimators of $\eta_i = X_i\beta_i$ be given by (5.2), (5.3) and (5.10), (5.11), respectively. Under assumptions (4.1) and (4.12), for $i = 1, 2$ we have
$$P_\pi\big(L(\hat\eta_{iB},\eta_i) \le L(\hat\eta_i^*,\eta_i)\,\big|\,y\big) \ge 0.5 \quad \text{for any } y\in\mathcal{Y},$$
where $\mathcal{Y}$ is the sample space.
6 Concluding remarks
In summary, we have investigated the Bayesian estimation of the regression parameters in a system of two seemingly unrelated regressions. We derived the minimum Bayes risk linear unbiased (MBRLU) estimators of the regression parameters and established their superiorities under the mean square error matrix (MSEM) criterion. We also exhibited the superiorities of the MBRLU estimators in terms of the predictive Pitman closeness (PRPC) criterion and the posterior Pitman closeness (PPC) criterion, respectively. For the case where the design matrices are not of full rank, the superiorities of the MBRLU estimators of some estimable functions were investigated.
References
[1] J.O. Berger, Statistical Decision Theory and Bayesian Analysis, Berlin, Springer, 1985.
[2] G.E.P. Box and G. Tiao, Bayesian Inference in Statistical Analysis, Reading, Addison-Wesley, 1973.
[3] R.L. Fountain and J.P. Keating, The Pitman comparison of unbiased linear estimators, Statistics and Probability Letters, 19, (1994), 131-136.
[4] M. Ghosh and P.K. Sen, Bayesian Pitman closeness, Comm. Statist. Theory and Methods, 20, (1991), 3659-3678.
[5] M.H.J. Gruber, Regression Estimators: A Comparative Study, Boston, Academic Press, 1990.
[6] J.P. Keating and R.L. Mason, Practical relevance of an alternative criterion in estimation, Amer. Stat., 39, (1985), 203-205.
[7] J.P. Keating, R.L. Mason and P.K. Sen, Pitman's Measure of Closeness: A Comparison of Statistical Estimators, Philadelphia, Society for Industrial and Applied Mathematics, 1993.
[8] J. Kementa and R.F. Gilbert, Small sample properties of alternative estimators of seemingly unrelated regressions, J. Amer. Statist. Assoc., 63, (1968), 1180-1200.
[9] C.T. Lin, The efficiency of least squares estimator of a seemingly unrelated regression model, Comm. Statist. Simulation, 20, (1991), 919-925.
[10] A.Y. Liu, Efficient estimation of two seemingly unrelated regression equations, Journal of Multivariate Analysis, 82, (2002), 445-456.
[11] R.L. Mason, J.P. Keating, P.K. Sen and N.W. Blaylock, Comparison of linear estimators using Pitman's measure of closeness, J. Amer. Statist. Assoc., 81, (1990), 579-581.
[12] J.S. Mehta and P.A.V.B. Swamy, Further evidence on the relative efficiencies of Zellner's seemingly unrelated regressions estimator, J. Amer. Statist. Assoc., 71, (1976), 634-639.
[13] E.J.G. Pitman, The closest estimates of statistical parameters, Proc. Camb. Phil. Soc., 33, (1937), 212-222.
[14] C.R. Rao, Linear Statistical Inference and Its Applications, Second Edition, New York, Wiley, 1973.
[15] C.R. Rao, Estimation of parameters in a linear model, The 1975 Wald Memorial Lectures, Ann. Statist., 4, (1976), 1023-1037.
[16] C.R. Rao, Some comments on the minimum mean square error as a criterion of estimation, in Statistics and Related Topics, North Holland, Amsterdam, pp. 123-143, 1981.
[17] C.R. Rao, J.P. Keating and R.L. Mason, The Pitman nearness criterion and its determination, Comm. Statist. Theor. Math., 15, (1986), 3173-3191.
[18] N.S. Revankar, Some finite sample results in the context of two seemingly unrelated regression equations, J. Amer. Statist. Assoc., 69, (1974), 187-190.
[19] P. Schmidt, Estimation of seemingly unrelated regressions with unequal numbers of observations, Journal of Econometrics, 5, (1977), 365-377.
[20] G. Trenkler and L.S. Wei, The Bayes estimators in a misspecified linear regression model, Test, 5, (1996), 113-123.
[21] L.C. Wang and N. Veraverbeke, MSE superiority of Bayes and empirical Bayes estimators in two generalized seemingly unrelated regressions, Statistics & Probability Letters, 78, (2008), 109-117.
[22] L.C. Wang, H. Lian and R.S. Singh, On efficient estimators of two seemingly unrelated regressions, Statistics & Probability Letters, 81, (2011), 563-570.
[23] S.G. Wang, A new estimate of regression coefficients in seemingly unrelated regression system, Science in China (Ser. A), 7, (1989), 808-816.
[24] S.G. Wang and S.C. Chow, Advanced Linear Models: Theory and Applications, New York, Marcel Dekker, 1994.
[25] W.P. Zhang, Research of Problems of Bayes Analysis in Linear Model, Ph.D. Thesis, USTC, 2005-JX-C-1773, 2005.
[26] A. Zellner, An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias, J. Amer. Statist. Assoc., 57, (1962), 348-368.
[27] A. Zellner, Estimators of seemingly unrelated regressions: some exact finite sample results, J. Amer. Statist. Assoc., 58, (1963), 977-992.