Journal of Statistical and Econometric Methods, vol. 2, no. 3, 2013, 153-174
ISSN: 1792-6602 (print), 1792-6939 (online)
Scienpress Ltd, 2013

The Superiorities of Minimum Bayes Risk Linear Unbiased Estimator in Two Seemingly Unrelated Regressions^1

Radhey S. Singh^2, Lichun Wang^3 and Huiming Song^4

Abstract

In the system of two seemingly unrelated regressions, the minimum Bayes risk linear unbiased (MBRLU) estimators of the regression parameters are derived. The superiorities of the MBRLU estimators over the classical estimators are investigated in terms of the mean square error matrix (MSEM) criterion, the predictive Pitman closeness (PRPC) criterion and the posterior Pitman closeness (PPC) criterion, respectively.

Mathematics Subject Classification: 62C10; 62J05

Keywords: Seemingly unrelated regressions; generalized least squares estimator; MBRLU estimator; MSEM criterion; PRPC criterion; PPC criterion

1 Sponsored by NSERC of Canada Grant (400045) and the Scientific Foundation of BJTU (2012JBM105).
2 Department of Statistics and Actuarial Sciences, University of Waterloo, Canada. Corresponding author.
3 Department of Mathematics, Beijing Jiaotong University, Beijing 100044, China.
4 Department of Statistics and Finance, USTC, Hefei 230026, China.

Article Info: Received: March 21, 2013. Revised: May 2, 2013. Published online: September 1, 2013.

1 Introduction

Consider a system consisting of two seemingly unrelated regressions (SUR)
\[
Y_1 = X_1\beta_1 + \varepsilon_1, \qquad Y_2 = X_2\beta_2 + \varepsilon_2, \tag{1.1}
\]
where $Y_i$ ($i=1,2$) are $n\times 1$ vectors of observations, $X_i$ ($i=1,2$) are $n\times p_i$ matrices with full column rank, $\beta_i$ ($i=1,2$) are $p_i\times 1$ vectors of unknown regression parameters, and $\varepsilon_i$ ($i=1,2$) are $n\times 1$ vectors of error variables with
\[
E(\varepsilon_i)=0, \qquad \mathrm{Cov}(\varepsilon_i,\varepsilon_j)=\sigma_{ij}I_n, \qquad i,j=1,2,
\]
where $\Sigma_*=(\sigma_{ij})$ is a $2\times 2$ non-diagonal positive definite matrix. This kind of system has been widely applied in many fields such as econometrics and the social and biological sciences. It was first introduced into statistics by Zellner's pioneering work (Zellner, 1962, 1963) and later developed by Kmenta and Gilbert (1968), Revankar (1974), Mehta and Swamy (1976), Schmidt (1977), Wang (1989), Lin (1991), among others.

Denote $Y=(Y_1',Y_2')'$, $X=\mathrm{diag}(X_1,X_2)$, $\beta=(\beta_1',\beta_2')'$ and $\varepsilon=(\varepsilon_1',\varepsilon_2')'$. One can represent the system (1.1) as the regression model
\[
Y = X\beta+\varepsilon, \qquad \varepsilon\sim(0,\Sigma_*\otimes I_n), \tag{1.2}
\]
where $\otimes$ denotes the Kronecker product. Following Wang et al. (2011), if $\Sigma_*$ is known then the generalized least squares (GLS) estimators of $\beta_i$ ($i=1,2$) are
\[
\bar\beta_{1,GLS}=(X_1'X_1)^{-1}X_1'\Big[I_n-\rho^2\sum_{i=0}^{\infty}(\rho^2P_2P_1)^iP_2N_1\Big]\Big(Y_1-\frac{\sigma_{12}}{\sigma_{22}}N_2Y_2\Big) \tag{1.3}
\]
and
\[
\bar\beta_{2,GLS}=(X_2'X_2)^{-1}X_2'\Big[I_n-\rho^2\sum_{i=0}^{\infty}(\rho^2P_1P_2)^iP_1N_2\Big]\Big(Y_2-\frac{\sigma_{21}}{\sigma_{11}}N_1Y_1\Big), \tag{1.4}
\]
where $P_i=X_i(X_i'X_i)^{-1}X_i'$, $N_i=I_n-P_i$ and $\rho^2=\sigma_{12}^2/(\sigma_{11}\sigma_{22})$. Further, by Theorem 3.1 of Wang et al. (2011), under the condition $X_1'X_2=0$ (see Zellner (1963)), or $X_1=(X_2,L)$ (see Revankar (1974)), or $P_1P_2P_1N_2=0$ (see Liu (2002)), the estimators $\bar\beta_{1,GLS}$ and $\bar\beta_{2,GLS}$ each take a unique simpler form. Thus, in what follows we assume $X_1'X_2=0$, which implies $X_i'N_j=0$ ($i\neq j$) and simplifies $\bar\beta_{1,GLS}$ and $\bar\beta_{2,GLS}$ to
\[
\hat\beta_1^{*}=\hat\beta_1-\frac{\sigma_{12}}{\sigma_{22}}(X_1'X_1)^{-1}X_1'Y_2, \tag{1.5}
\]
\[
\hat\beta_2^{*}=\hat\beta_2-\frac{\sigma_{12}}{\sigma_{11}}(X_2'X_2)^{-1}X_2'Y_1, \tag{1.6}
\]
where $\hat\beta_1=(X_1'X_1)^{-1}X_1'Y_1$ and $\hat\beta_2=(X_2'X_2)^{-1}X_2'Y_2$.
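To make (1.5) and (1.6) concrete, the following minimal numpy sketch (ours, not part of the paper) simulates a small SUR system satisfying $X_1'X_2=0$ with a known $\Sigma_*$ and evaluates the simplified GLS estimators; the sample size, dimensions and the particular $\Sigma_*$ are illustrative assumptions. Building $X_1$ and $X_2$ from disjoint columns of an orthonormal matrix enforces the orthogonality condition exactly.

```python
# Minimal sketch (not from the paper): the simplified GLS estimators (1.5)-(1.6)
# under the orthogonality condition X1'X2 = 0, assuming Sigma* is known.
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 50, 3, 2                       # illustrative sizes (assumption)
Sigma = np.array([[2.0, 0.8],              # Sigma* = (sigma_ij), positive definite
                  [0.8, 1.0]])

# Build X1, X2 with orthogonal column spaces so that X1'X2 = 0.
Q, _ = np.linalg.qr(rng.standard_normal((n, p1 + p2)))
X1, X2 = Q[:, :p1], Q[:, p1:p1 + p2]

beta1, beta2 = rng.standard_normal(p1), rng.standard_normal(p2)

# Errors with Cov(eps_i, eps_j) = sigma_ij * I_n: draw the rows iid N(0, Sigma*).
E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
Y1 = X1 @ beta1 + E[:, 0]
Y2 = X2 @ beta2 + E[:, 1]

# Ordinary LS estimators beta_hat_i and the simplified GLS estimators (1.5)-(1.6).
b1 = np.linalg.solve(X1.T @ X1, X1.T @ Y1)
b2 = np.linalg.solve(X2.T @ X2, X2.T @ Y2)
b1_star = b1 - (Sigma[0, 1] / Sigma[1, 1]) * np.linalg.solve(X1.T @ X1, X1.T @ Y2)
b2_star = b2 - (Sigma[0, 1] / Sigma[0, 0]) * np.linalg.solve(X2.T @ X2, X2.T @ Y1)
print(b1_star, b2_star)
```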
The rest of this paper is organized as follows. In Section 2 we derive the minimum Bayes risk linear unbiased (MBRLU) estimator of $\beta$ and, accordingly, the MBRLU estimators of $\beta_i$ ($i=1,2$). In Section 3 the superiorities of the MBRLU estimators of $\beta_i$ ($i=1,2$) are established under the mean square error matrix (MSEM) criterion. In Section 4 we discuss the superiorities of the MBRLU estimators in terms of the predictive Pitman closeness (PRPC) criterion and the posterior Pitman closeness (PPC) criterion, respectively. In Section 5, for the case where the design matrices are not of full column rank, we investigate the superiorities of the MBRLU estimators of some estimable functions. Finally, brief concluding remarks are made in Section 6.

2 The MBRLU Estimators of Regression Parameters

There are two main approaches to Bayes estimation in the linear model. The first supposes that the prior of the regression parameter is normal, which, under the normal linear model, implies that the posterior is still normal; under quadratic loss the Bayes estimator of the regression parameter is then the posterior mean (see Box and Tiao (1973), Berger (1985) and Wang and Chow (1994), among others). Recently, Wang and Veraverbeke (2008) employed this approach to exhibit the superiorities of Bayes and empirical Bayes estimators in two generalized SURs. The second approach, proposed by Rao (1973), yields the minimum Bayes risk linear (MBRL) estimator of the regression parameter by minimizing the Bayes risk under the assumption that certain moments of the prior are given. Rao (1976) further pointed out that the admissible linear estimators of the regression parameter are either MBRL estimators or limits of MBRL estimators. Gruber (1990) proposed an MBRL estimator for an estimable function of the regression parameter and obtained an alternative form of the MBRL estimator. Related results can be found in Trenkler and Wei (1996), Zhang (2005) and others. In this paper, we use the second approach to derive the MBRL estimator of the regression parameter and discuss its superiorities in terms of the MSEM, PRPC and PPC criteria, respectively.

Denote the prior of $\beta$ by $\pi(\beta)$. It is assumed that $\pi(\beta)$ satisfies
\[
E(\beta)=\begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix}=:\mu, \qquad \mathrm{Cov}(\beta)=\begin{pmatrix}\tau_1^2I_{p_1}&0\\ 0&\tau_2^2I_{p_2}\end{pmatrix}=:V, \tag{2.1}
\]
where $\mu_i$ and $\tau_i$ ($i=1,2$) are known. Let the loss function be
\[
L(\delta,\beta)=(\delta-\beta)'(\delta-\beta), \tag{2.2}
\]
and let the class of linear estimators of $\beta$ be
\[
F=\big\{\tilde\beta=AY+b:\ A\ \text{is a}\ (p_1+p_2)\times 2n\ \text{matrix},\ b\ \text{is a}\ (p_1+p_2)\times 1\ \text{vector}\big\}. \tag{2.3}
\]
Then the MBRLU estimator $\hat\beta_B$ is defined to minimize the Bayes risk
\[
R(\hat\beta_B,\beta)=\min_{A,b}R(\tilde\beta,\beta)=\min_{A,b}E[(\tilde\beta-\beta)'(\tilde\beta-\beta)] \tag{2.4}
\]
subject to the constraint $E(\tilde\beta-\beta)=0$, where $E$ denotes expectation with respect to (w.r.t.) the joint distribution of $(Y,\beta)$. From the constraint, we have $b=(I-AX)\mu$. Writing $\Phi=\Sigma_*\otimes I_n$ for the covariance matrix of $\varepsilon$ in (1.2), note that
\[
\begin{aligned}
R(\tilde\beta,\beta)&=E\big\{[AY+(I-AX)\mu-\beta]'[AY+(I-AX)\mu-\beta]\big\}\\
&=E\big\{[A(Y-X\mu)-(\beta-\mu)]'[A(Y-X\mu)-(\beta-\mu)]\big\}\\
&=E\,\mathrm{tr}\big\{[A(Y-X\mu)-(\beta-\mu)][A(Y-X\mu)-(\beta-\mu)]'\big\}\\
&=\mathrm{tr}\big\{A(XVX'+\Phi)A'+V-AXV-VX'A'\big\}.
\end{aligned}
\]
Solving $\partial R(\tilde\beta,\beta)/\partial A=0$, we obtain
\[
A=VX'(XVX'+\Phi)^{-1}. \tag{2.5}
\]
By the identity
\[
(P+BCB')^{-1}=P^{-1}-P^{-1}B(C^{-1}+B'P^{-1}B)^{-1}B'P^{-1}, \tag{2.6}
\]
we obtain
\[
A=VX'(XVX'+\Phi)^{-1}=(X'\Phi^{-1}X+V^{-1})^{-1}X'\Phi^{-1}, \tag{2.7}
\]
and
\[
I-AX=I-(X'\Phi^{-1}X+V^{-1})^{-1}X'\Phi^{-1}X=(X'\Phi^{-1}X+V^{-1})^{-1}V^{-1}. \tag{2.8}
\]
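As a quick sanity check of the step from (2.5) to (2.7), the sketch below (illustrative; the dimensions, $\Sigma_*$ and $\tau_i$ are our own assumptions) evaluates both expressions for $A$ on simulated matrices and confirms they agree numerically; this is just the inversion identity (2.6) at work.

```python
# Minimal sketch (illustrative, not from the paper): checking numerically that
#   V X' (X V X' + Phi)^{-1}  ==  (X' Phi^{-1} X + V^{-1})^{-1} X' Phi^{-1},
# i.e. that (2.5) and (2.7) define the same matrix A.
import numpy as np

rng = np.random.default_rng(1)
n, p1, p2 = 20, 3, 2                                 # illustrative sizes (assumption)
X1 = rng.standard_normal((n, p1))
X2 = rng.standard_normal((n, p2))
X = np.block([[X1, np.zeros((n, p2))],
              [np.zeros((n, p1)), X2]])              # X = diag(X1, X2), size 2n x (p1+p2)

Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
Phi = np.kron(Sigma, np.eye(n))                      # Phi = Sigma* (Kronecker) I_n
tau1, tau2 = 0.5, 0.7
V = np.diag([tau1**2] * p1 + [tau2**2] * p2)         # prior covariance from (2.1)

A_direct = V @ X.T @ np.linalg.inv(X @ V @ X.T + Phi)                    # (2.5)
A_wood = np.linalg.solve(X.T @ np.linalg.solve(Phi, X) + np.linalg.inv(V),
                         X.T @ np.linalg.inv(Phi))                       # (2.7)
print(np.allclose(A_direct, A_wood))                 # expect True
```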
Hence, writing $\hat\beta_{LS}=(X'\Phi^{-1}X)^{-1}X'\Phi^{-1}Y$ for the GLS estimator of $\beta$ (whose blocks, under $X_1'X_2=0$, are exactly $\hat\beta_1^{*}$ and $\hat\beta_2^{*}$), we have
\[
\begin{aligned}
\hat\beta_B&=AY+b=AY+(I-AX)\mu\\
&=(X'\Phi^{-1}X+V^{-1})^{-1}(X'\Phi^{-1}X\hat\beta_{LS}+V^{-1}\mu)\\
&=\hat\beta_{LS}-(X'\Phi^{-1}X+V^{-1})^{-1}V^{-1}(\hat\beta_{LS}-\mu)\\
&=\begin{pmatrix}\hat\beta_1^{*}-(\tau_1^2\sigma_{11.2}^{-1}X_1'X_1+I_{p_1})^{-1}(\hat\beta_1^{*}-\mu_1)\\ \hat\beta_2^{*}-(\tau_2^2\sigma_{22.1}^{-1}X_2'X_2+I_{p_2})^{-1}(\hat\beta_2^{*}-\mu_2)\end{pmatrix}.
\end{aligned}\tag{2.9}
\]
Thus the MBRLU estimators of $\beta_i$ ($i=1,2$) are
\[
\hat\beta_{1B}=\hat\beta_1^{*}-(\tau_1^2\sigma_{11.2}^{-1}X_1'X_1+I_{p_1})^{-1}(\hat\beta_1^{*}-\mu_1), \tag{2.10}
\]
\[
\hat\beta_{2B}=\hat\beta_2^{*}-(\tau_2^2\sigma_{22.1}^{-1}X_2'X_2+I_{p_2})^{-1}(\hat\beta_2^{*}-\mu_2), \tag{2.11}
\]
where $\sigma_{11.2}=\sigma_{11}-\sigma_{12}^2\sigma_{22}^{-1}$ and $\sigma_{22.1}=\sigma_{22}-\sigma_{12}^2\sigma_{11}^{-1}$.

3 The Superiorities of MBRLU Estimator Under MSEM Criterion

We state the following MSEM superiorities of $\hat\beta_{iB}$ ($i=1,2$).

Theorem 3.1 Let the GLS estimators and MBRLU estimators of $\beta_i$ be given by (1.5), (1.6) and (2.10), (2.11), respectively. Then
\[
M(\hat\beta_i^{*})-M(\hat\beta_{iB})>0, \qquad i=1,2.
\]

Proof: We only prove the conclusion for the case $i=1$. Denote $B_1=(\tau_1^2\sigma_{11.2}^{-1}X_1'X_1+I_{p_1})^{-1}$. We have
\[
\begin{aligned}
M(\hat\beta_{1B})&=E[(\hat\beta_{1B}-\beta_1)(\hat\beta_{1B}-\beta_1)']\\
&=E\big\{[(\hat\beta_1^{*}-\beta_1)-B_1(\hat\beta_1^{*}-\mu_1)][(\hat\beta_1^{*}-\beta_1)-B_1(\hat\beta_1^{*}-\mu_1)]'\big\}\\
&=M(\hat\beta_1^{*})-E[(\hat\beta_1^{*}-\beta_1)(\hat\beta_1^{*}-\mu_1)']B_1-B_1E[(\hat\beta_1^{*}-\mu_1)(\hat\beta_1^{*}-\beta_1)']\\
&\quad\;+B_1E[(\hat\beta_1^{*}-\mu_1)(\hat\beta_1^{*}-\mu_1)']B_1\\
&=M(\hat\beta_1^{*})-J_1B_1-B_1J_1+B_1J_2B_1,
\end{aligned}\tag{3.1}
\]
where
\[
J_2=\mathrm{Cov}(\hat\beta_1^{*})=E\{\mathrm{Cov}(\hat\beta_1^{*}\mid\beta)\}+\mathrm{Cov}\{E(\hat\beta_1^{*}\mid\beta)\}=E\{\mathrm{Cov}(\hat\beta_1^{*}\mid\beta)\}+\tau_1^2I, \tag{3.2}
\]
and
\[
\begin{aligned}
E\{\mathrm{Cov}(\hat\beta_1^{*}\mid\beta_1)\}&=E\big\{E[(\hat\beta_1^{*}-\beta_1)(\hat\beta_1^{*}-\beta_1)'\mid\beta_1]\big\}\\
&=E\big\{\mathrm{Cov}(\hat\beta_1\mid\beta_1)+\mathrm{Cov}\big(\sigma_{12}\sigma_{22}^{-1}(X_1'X_1)^{-1}X_1'Y_2\mid\beta_1\big)\\
&\qquad\;-2\,\mathrm{Cov}\big(\hat\beta_1,\sigma_{12}\sigma_{22}^{-1}(X_1'X_1)^{-1}X_1'Y_2\mid\beta_1\big)\big\}\\
&=\sigma_{11}(X_1'X_1)^{-1}+\sigma_{12}^2\sigma_{22}^{-1}(X_1'X_1)^{-1}-2\sigma_{12}^2\sigma_{22}^{-1}(X_1'X_1)^{-1}\\
&=(\sigma_{11}-\sigma_{12}^2\sigma_{22}^{-1})(X_1'X_1)^{-1}=\sigma_{11.2}(X_1'X_1)^{-1}.
\end{aligned}\tag{3.3}
\]
Putting (3.3) into (3.2), we have
\[
J_2=E\{\mathrm{Cov}(\hat\beta_1^{*}\mid\beta_1)\}+\tau_1^2I=\sigma_{11.2}(X_1'X_1)^{-1}+\tau_1^2I. \tag{3.4}
\]
Then $J_1$ can be expressed as
\[
J_1=E[(\hat\beta_1^{*}-\beta_1)(\hat\beta_1^{*}-\mu_1)']=J_2-\mathrm{Cov}(\beta_1)=\sigma_{11.2}(X_1'X_1)^{-1}. \tag{3.5}
\]
Combining (3.4) and (3.5) with (3.1), we obtain
\[
\begin{aligned}
M(\hat\beta_1^{*})-M(\hat\beta_{1B})&=J_1B_1+B_1J_1-B_1J_2B_1\\
&=B_1\big[B_1^{-1}\sigma_{11.2}(X_1'X_1)^{-1}+\sigma_{11.2}(X_1'X_1)^{-1}B_1^{-1}-\tau_1^2I-\sigma_{11.2}(X_1'X_1)^{-1}\big]B_1\\
&=B_1\big[\tau_1^2I+\sigma_{11.2}(X_1'X_1)^{-1}\big]B_1>0.
\end{aligned}\tag{3.6}
\]
Theorem 3.1 is proved.

The MSEM criterion is stronger than the MSE criterion: an estimator may be MSE superior to another without being MSEM superior, whereas MSEM superiority implies MSE superiority. Hence, we also have
\[
MSE(\hat\beta_i^{*})-MSE(\hat\beta_{iB})>0, \qquad i=1,2.
\]
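Theorem 3.1, through its MSE consequence, can be illustrated by a small Monte Carlo experiment: draw $\beta$ from a prior with the moments (2.1), draw normal errors with covariance $\Sigma_*\otimes I_n$, and compare the average squared errors of $\hat\beta_1^{*}$ and $\hat\beta_{1B}$. The sketch below does this under illustrative settings of our own choosing (they are not taken from the paper).

```python
# Monte Carlo sketch (illustrative assumptions, not from the paper): empirical MSE
# comparison of the GLS estimator (1.5) and the MBRLU estimator (2.10); Theorem 3.1
# implies the first average should exceed the second.
import numpy as np

rng = np.random.default_rng(2)
n, p1, p2, reps = 40, 3, 2, 2000
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
s11, s22, s12 = Sigma[0, 0], Sigma[1, 1], Sigma[0, 1]
s11_2 = s11 - s12**2 / s22                       # sigma_{11.2}
tau1, tau2 = 0.6, 0.6
mu1, mu2 = np.zeros(p1), np.zeros(p2)

Q, _ = np.linalg.qr(rng.standard_normal((n, p1 + p2)))   # enforces X1'X2 = 0
X1, X2 = Q[:, :p1], Q[:, p1:]
B1 = np.linalg.inv(tau1**2 / s11_2 * X1.T @ X1 + np.eye(p1))

se_gls, se_mbrlu = 0.0, 0.0
for _ in range(reps):
    beta1 = mu1 + tau1 * rng.standard_normal(p1)          # draw from the prior
    beta2 = mu2 + tau2 * rng.standard_normal(p2)
    E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
    Y1, Y2 = X1 @ beta1 + E[:, 0], X2 @ beta2 + E[:, 1]
    b1s = np.linalg.solve(X1.T @ X1, X1.T @ (Y1 - s12 / s22 * Y2))   # (1.5)
    b1B = b1s - B1 @ (b1s - mu1)                                     # (2.10)
    se_gls += np.sum((b1s - beta1)**2)
    se_mbrlu += np.sum((b1B - beta1)**2)
print(se_gls / reps, se_mbrlu / reps)    # expect the first value to be larger
```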
4 Superiorities of MBRLU Estimators Under PRPC and PPC Criterion

The criterion of Pitman closeness (PC), originally introduced by Pitman (1937), is based on the probabilities of the relative closeness of competing estimators to an unknown parameter or parameter vector. After a long fallow period, interest in this topic has revived over the last twenty years. Rao (1981), Keating and Mason (1985) and Rao et al. (1986) helped to resurrect the criterion as an alternative to traditional criteria such as the MSE criterion and the mean absolute error (MAE) criterion. Mason et al. (1990) and Fountain and Keating (1994) proposed some general methods for comparing linear estimators under the PC criterion. Many important contributions in this direction are described by Keating et al. (1993) and others.

Definition 4.1 Let $\hat\theta_1$ and $\hat\theta_2$ be two different estimators of $\theta$ and let $L(\hat\theta,\theta)$ be the loss function. If
\[
P\big(L(\hat\theta_1,\theta)\le L(\hat\theta_2,\theta)\big)\ge 0.5 \quad\text{for all }\theta\in\Theta,
\]
with strict inequality for some $\theta\in\Theta$, where $\Theta$ is the parameter space, then $\hat\theta_1$ is said to be Pitman closer than $\hat\theta_2$, or $\hat\theta_1$ is said to be superior to $\hat\theta_2$ under the PC criterion.

Ghosh and Sen (1991) introduced two alternative notions of PC motivated from the Bayesian viewpoint, called the predictive Pitman closeness (PRPC) and posterior Pitman closeness (PPC) criteria. They are defined as follows.

Definition 4.2 Let $\Gamma$ be a class of prior distributions of $\theta$, and let $\hat\theta_1$ and $\hat\theta_2$ be two different estimators of $\theta$. Then $\hat\theta_1$ is said to be superior to $\hat\theta_2$ under the PRPC criterion if
\[
P_\pi\big(L(\hat\theta_1,\theta)\le L(\hat\theta_2,\theta)\big)\ge 0.5 \quad\text{for all }\pi\in\Gamma,
\]
where $P_\pi$ is computed under the joint distribution of $Y$ and $\theta$ for every $\pi\in\Gamma$.

Definition 4.3 Suppose $\pi$ is a prior distribution of $\theta$, and $\hat\theta_1$ and $\hat\theta_2$ are two different estimators of $\theta$. Then $\hat\theta_1$ is said to be superior to $\hat\theta_2$ under the PPC criterion if
\[
P_\pi\big(L(\hat\theta_1,\theta)\le L(\hat\theta_2,\theta)\mid y\big)\ge 0.5 \quad\text{for all }y\in\mathcal{Y},
\]
with strict inequality for some $y\in\mathcal{Y}$, where $\mathcal{Y}$ is the sample space.

Obviously, if an estimator $\hat\theta_1$ of $\theta$ is superior to $\hat\theta_2$ under the PPC criterion for every $\pi\in\Gamma$, then it is also superior to $\hat\theta_2$ under the PRPC criterion; the converse is not necessarily true. Ghosh and Sen presented an example showing that the classical James-Stein estimator is superior to the sample mean under the PRPC criterion for all priors, but this does not hold under the PPC criterion.

Let the loss function be defined by (2.2), and assume throughout this section that
\[
\varepsilon\mid\beta\sim N(0,\Sigma_*\otimes I_n). \tag{4.1}
\]
For the MBRLU estimator $\hat\beta_{1B}$, we have the following result.

Theorem 4.1 Let the GLS estimator and MBRLU estimator of $\beta_1$ be given by (1.5) and (2.10). If
\[
\frac{\tau_1^2}{\sigma_{11.2}}\le\frac{\lambda_{p_1}(p_1-2)}{2p_1\lambda_1^2}, \tag{4.2}
\]
then
\[
P_\pi\big(L(\hat\beta_{1B},\beta_1)\le L(\hat\beta_1^{*},\beta_1)\big)\ge 0.5 \quad\text{for every }\pi\in\Gamma(\beta_1),
\]
where $\lambda_1$ and $\lambda_{p_1}$ are the maximum and minimum eigenvalues of $X_1'X_1$ and $\Gamma(\beta_1)=\{\pi(\beta_1):E(\beta_1)=\mu_1,\ \mathrm{Cov}(\beta_1)=\tau_1^2I_{p_1}\}$.

Proof: Denote $B_1=(\tau_1^2\sigma_{11.2}^{-1}X_1'X_1+I_{p_1})^{-1}$ and $W(\hat\beta_{1B},\hat\beta_1^{*};\beta_1)=L(\hat\beta_{1B},\beta_1)-L(\hat\beta_1^{*},\beta_1)$. By (2.10) we have
\[
\begin{aligned}
L(\hat\beta_{1B},\beta_1)&=[(\hat\beta_1^{*}-\beta_1)-B_1(\hat\beta_1^{*}-\mu_1)]'[(\hat\beta_1^{*}-\beta_1)-B_1(\hat\beta_1^{*}-\mu_1)]\\
&=L(\hat\beta_1^{*},\beta_1)-2(\hat\beta_1^{*}-\mu_1)'B_1(\hat\beta_1^{*}-\beta_1)+(\hat\beta_1^{*}-\mu_1)'B_1^2(\hat\beta_1^{*}-\mu_1).
\end{aligned}\tag{4.3}
\]
Hence $W\le 0$ is equivalent to
\[
(\hat\beta_1^{*}-\mu_1)'B_1^2(\hat\beta_1^{*}-\mu_1)-2(\hat\beta_1^{*}-\mu_1)'B_1(\hat\beta_1^{*}-\beta_1)\le 0. \tag{4.4}
\]
Since $B_1^2\le B_1$, (4.4) is implied by
\[
(\hat\beta_1^{*}-\mu_1)'B_1(\hat\beta_1^{*}-\mu_1)-2(\hat\beta_1^{*}-\mu_1)'B_1(\hat\beta_1^{*}-\beta_1)\le 0. \tag{4.5}
\]
Substituting $\hat\beta_1^{*}-\mu_1=(\hat\beta_1^{*}-\beta_1)-(\mu_1-\beta_1)$ into (4.5), the latter is equivalent to
\[
(\beta_1-\mu_1)'B_1(\beta_1-\mu_1)\le(\hat\beta_1^{*}-\beta_1)'B_1(\hat\beta_1^{*}-\beta_1). \tag{4.6}
\]
Since
\[
\Big(\frac{\tau_1^2\lambda_1}{\sigma_{11.2}}+1\Big)^{-1}I_{p_1}\le B_1=\Big(\frac{\tau_1^2}{\sigma_{11.2}}X_1'X_1+I_{p_1}\Big)^{-1}\le\Big(\frac{\tau_1^2\lambda_{p_1}}{\sigma_{11.2}}+1\Big)^{-1}I_{p_1},
\]
inequality (4.6) is implied by
\[
\Big(\frac{\tau_1^2\lambda_{p_1}}{\sigma_{11.2}}+1\Big)^{-1}(\beta_1-\mu_1)'(\beta_1-\mu_1)\le\Big(\frac{\tau_1^2\lambda_1}{\sigma_{11.2}}+1\Big)^{-1}(\hat\beta_1^{*}-\beta_1)'(\hat\beta_1^{*}-\beta_1). \tag{4.7}
\]
Since
\[
0<\frac{\tau_1^2\sigma_{11.2}^{-1}\lambda_1+1}{\tau_1^2\sigma_{11.2}^{-1}\lambda_{p_1}+1}\le\frac{\lambda_1}{\lambda_{p_1}},
\]
(4.7) is implied by
\[
\frac{\lambda_1}{\lambda_{p_1}}(\beta_1-\mu_1)'(\beta_1-\mu_1)\le(\hat\beta_1^{*}-\beta_1)'(\hat\beta_1^{*}-\beta_1). \tag{4.8}
\]
Note that $(\hat\beta_1^{*}-\beta_1)\mid\beta\sim N\big(0,\sigma_{11.2}(X_1'X_1)^{-1}\big)$. Let $Z=\sigma_{11.2}^{-1/2}(X_1'X_1)^{1/2}(\hat\beta_1^{*}-\beta_1)$; then $Z\mid\beta\sim N(0,I_{p_1})$. So inequality (4.8) is equivalent to
\[
\frac{\lambda_1}{\lambda_{p_1}}(\beta_1-\mu_1)'(\beta_1-\mu_1)\le\sigma_{11.2}\,Z'(X_1'X_1)^{-1}Z. \tag{4.9}
\]
Noticing that $(X_1'X_1)^{-1}\ge\lambda_1^{-1}I_{p_1}$, (4.9) is implied by
\[
\frac{\lambda_1^2}{\lambda_{p_1}\sigma_{11.2}}\|\beta_1-\mu_1\|^2\le Z'Z. \tag{4.10}
\]
Since $Z'Z\mid\beta\sim\chi^2_{p_1}$, by (4.4)-(4.10) and the Markov inequality we have
\[
\begin{aligned}
P_\pi\big(W(\hat\beta_{1B},\hat\beta_1^{*};\beta_1)\le 0\big)&\ge P_\pi\Big(Z'Z\ge\frac{\lambda_1^2}{\lambda_{p_1}\sigma_{11.2}}\|\beta_1-\mu_1\|^2\Big)\\
&=1-P_\pi\Big(\|\beta_1-\mu_1\|^2\ge\frac{\lambda_{p_1}\sigma_{11.2}}{\lambda_1^2}Z'Z\Big)\\
&=1-E\Big[P_\pi\Big(\|\beta_1-\mu_1\|^2\ge\frac{\lambda_{p_1}\sigma_{11.2}}{\lambda_1^2}Z'Z\,\Big|\,Z'Z\Big)\Big]\\
&\ge 1-E\Big(\frac{\lambda_1^2\,\mathrm{tr}[\mathrm{Cov}(\beta_1)]}{\lambda_{p_1}\sigma_{11.2}\,Z'Z}\Big)
=1-\frac{\lambda_1^2\,E\|\beta_1-\mu_1\|^2}{\sigma_{11.2}\lambda_{p_1}}E\Big(\frac{1}{Z'Z}\Big)\\
&=1-\frac{\lambda_1^2\,p_1\tau_1^2}{\sigma_{11.2}\lambda_{p_1}(p_1-2)}\ge\frac{1}{2}.
\end{aligned}
\]
The proof of Theorem 4.1 is finished.
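The predictive probability in Theorem 4.1 can likewise be approximated by simulation: draw $(\beta,Y)$ jointly and count how often $\hat\beta_{1B}$ is at least as close to $\beta_1$ as $\hat\beta_1^{*}$. In the sketch below (ours, not from the paper), $\tau_1$ is chosen, as an assumption, so that condition (4.2) holds for the simulated $X_1$.

```python
# Monte Carlo sketch (illustrative, not from the paper): estimating the predictive
# probability P_pi( L(beta1B, beta1) <= L(beta1*, beta1) ) of Theorem 4.1.
import numpy as np

rng = np.random.default_rng(3)
n, p1, p2, reps = 60, 4, 3, 5000
Sigma = np.array([[1.5, 0.6], [0.6, 1.0]])
s11, s22, s12 = Sigma[0, 0], Sigma[1, 1], Sigma[0, 1]
s11_2 = s11 - s12**2 / s22

Q, _ = np.linalg.qr(rng.standard_normal((n, p1 + p2)))   # enforces X1'X2 = 0
X1, X2 = Q[:, :p1], Q[:, p1:]
lam = np.linalg.eigvalsh(X1.T @ X1)                      # ascending eigenvalues
tau1 = 0.9 * np.sqrt(s11_2 * lam[0] * (p1 - 2) / (2 * p1 * lam[-1]**2))  # satisfies (4.2)
mu1, mu2 = np.ones(p1), np.zeros(p2)
B1 = np.linalg.inv(tau1**2 / s11_2 * X1.T @ X1 + np.eye(p1))

wins = 0
for _ in range(reps):
    beta1 = mu1 + tau1 * rng.standard_normal(p1)   # one prior with the moments of Gamma(beta1)
    beta2 = mu2 + rng.standard_normal(p2)
    E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
    Y1, Y2 = X1 @ beta1 + E[:, 0], X2 @ beta2 + E[:, 1]
    b1s = np.linalg.solve(X1.T @ X1, X1.T @ (Y1 - s12 / s22 * Y2))   # (1.5)
    b1B = b1s - B1 @ (b1s - mu1)                                     # (2.10)
    wins += np.sum((b1B - beta1)**2) <= np.sum((b1s - beta1)**2)
print(wins / reps)    # expect a value of at least about 0.5
```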
For the estimator $\hat\beta_{2B}$, we have a similar result.

Theorem 4.2 Let the GLS estimator and MBRLU estimator of $\beta_2$ be given by (1.6) and (2.11). If
\[
\frac{\tau_2^2}{\sigma_{22.1}}\le\frac{\lambda^{*}_{p_2}(p_2-2)}{2p_2\lambda_1^{*2}}, \tag{4.11}
\]
then
\[
P_\pi\big(L(\hat\beta_{2B},\beta_2)\le L(\hat\beta_2^{*},\beta_2)\big)\ge 0.5 \quad\text{for every }\pi\in\Gamma(\beta_2),
\]
where $\lambda_1^{*}$ and $\lambda^{*}_{p_2}$ are the maximum and minimum eigenvalues of $X_2'X_2$ and $\Gamma(\beta_2)=\{\pi(\beta_2):E(\beta_2)=\mu_2,\ \mathrm{Cov}(\beta_2)=\tau_2^2I_{p_2}\}$.

To discuss the PPC properties of the MBRLU estimators, we further assume that the prior $\pi(\beta)$ is normal, i.e.,
\[
\beta\sim N(\mu,V). \tag{4.12}
\]
Then we have the following result.

Theorem 4.3 Under assumptions (4.1) and (4.12),
\[
P_\pi\big(L(\hat\beta_{iB},\beta_i)\le L(\hat\beta_i^{*},\beta_i)\mid Y=y\big)\ge 0.5 \quad\text{for any }y\in\mathcal{Y},
\]
i.e., $\hat\beta_{iB}$ is superior to $\hat\beta_i^{*}$ under the PPC criterion, where $\mathcal{Y}$ is the sample space.

Proof: We only prove the case $i=1$; the proof for $\hat\beta_{2B}$ is similar. Note that
\[
\begin{aligned}
W(\hat\beta_{1B},\hat\beta_1^{*};\beta_1)&=\|\hat\beta_{1B}-\beta_1\|^2-\|\hat\beta_1^{*}-\beta_1\|^2\\
&=[(\hat\beta_{1B}-\hat\beta_1^{*})+(\hat\beta_1^{*}-\beta_1)]'[(\hat\beta_{1B}-\hat\beta_1^{*})+(\hat\beta_1^{*}-\beta_1)]-\|\hat\beta_1^{*}-\beta_1\|^2\\
&=\|\hat\beta_1^{*}-\hat\beta_{1B}\|^2+2(\hat\beta_1^{*}-\hat\beta_{1B})'\big[(\beta_1-\hat\beta_{1B})-(\hat\beta_1^{*}-\hat\beta_{1B})\big]\\
&=2(\hat\beta_1^{*}-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B})-\|\hat\beta_1^{*}-\hat\beta_{1B}\|^2.
\end{aligned}\tag{4.13}
\]
Obviously,
\[
W\le 0\iff 2(\hat\beta_1^{*}-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B})\le\|\hat\beta_1^{*}-\hat\beta_{1B}\|^2.
\]
From (4.12) the prior of $\beta_1$ is normal, hence the posterior distribution of $\beta_1$ given $Y=y$ is also normal. Under the quadratic loss the Bayes estimator of $\beta_1$ is $E(\beta_1\mid y)$, which has the same expression as (2.10); therefore, given $Y=y$, $\beta_1-\hat\beta_{1B}=\beta_1-E(\beta_1\mid y)$ follows a $p_1$-dimensional normal distribution with zero mean. Thus, given $Y=y$, $2(\hat\beta_1^{*}-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B})$ follows a one-dimensional normal distribution with zero mean. Hence
\[
\begin{aligned}
P_\pi\big(L(\hat\beta_{1B},\beta_1)\le L(\hat\beta_1^{*},\beta_1)\mid y\big)&=P_\pi\big(W(\hat\beta_{1B},\hat\beta_1^{*};\beta_1)\le 0\mid y\big)\\
&=P_\pi\big(2(\hat\beta_1^{*}-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B})\le\|\hat\beta_1^{*}-\hat\beta_{1B}\|^2\mid y\big)\\
&>P_\pi\big(2(\hat\beta_1^{*}-\hat\beta_{1B})'(\beta_1-\hat\beta_{1B})\le 0\mid y\big)=0.5,
\end{aligned}
\]
where the last inequality holds because $\|\hat\beta_1^{*}-\hat\beta_{1B}\|^2>0$ with probability one. Theorem 4.3 is proved.

5 The MBRLU Estimators of Estimable Functions and Their Superiorities

In this section we consider the case where the design matrix $X$ in (1.2) is not of full column rank, i.e., $R(X)<p_1+p_2$. In this case $\beta$ is not estimable, so we consider the estimable function $\eta=X\beta$. Since any estimable function of $\beta$ can be expressed as a linear function of $\eta$, it suffices to study the estimation of $\eta$. The GLS estimator of $\eta=X\beta$ is
\[
\hat\eta_{GLS}=X(X'\Phi^{-1}X)^{-}X'\Phi^{-1}Y,
\]
whose expression does not depend on the choice of the generalized inverse $(X'\Phi^{-1}X)^{-}$, so the g-inverse can be replaced by the Moore-Penrose inverse $(X'\Phi^{-1}X)^{+}$.
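Before stating the component form of $\hat\eta_{GLS}$, here is a quick numerical check (ours, not from the paper) that the estimator is indeed invariant to the choice of generalized inverse; the check uses the auxiliary fact that $A^{+}+(I-A^{+}A)W$ is also a g-inverse of $A$ for any conformable $W$, which is a construction introduced purely for this illustration.

```python
# Sketch (illustrative, not from the paper): eta_hat_GLS does not depend on which
# generalized inverse of X'Phi^{-1}X is used. We build a second g-inverse as
# G = A^+ + (I - A^+ A) W, which satisfies A G A = A for any W.
import numpy as np

rng = np.random.default_rng(4)
n, p1, p2 = 15, 4, 3
X1 = rng.standard_normal((n, p1 - 1))
X1 = np.hstack([X1, X1[:, [0]] + X1[:, [1]]])        # make X1 rank deficient
X2 = rng.standard_normal((n, p2))
X = np.block([[X1, np.zeros((n, p2))], [np.zeros((n, p1)), X2]])

Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
Phi = np.kron(Sigma, np.eye(n))
Y = rng.standard_normal(2 * n)                        # any data vector will do here

A = X.T @ np.linalg.solve(Phi, X)                     # X' Phi^{-1} X (singular)
A_pinv = np.linalg.pinv(A)
W = rng.standard_normal(A.shape)
A_ginv = A_pinv + (np.eye(A.shape[0]) - A_pinv @ A) @ W   # another g-inverse
assert np.allclose(A @ A_ginv @ A, A)

rhs = X.T @ np.linalg.solve(Phi, Y)                   # X' Phi^{-1} Y
eta1 = X @ (A_pinv @ rhs)
eta2 = X @ (A_ginv @ rhs)
print(np.allclose(eta1, eta2))                        # expect True
```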
Then we have
\[
\hat\eta_{GLS}=X(X'\Phi^{-1}X)^{+}X'\Phi^{-1}Y=\begin{pmatrix}\hat\eta_1^{*}\\ \hat\eta_2^{*}\end{pmatrix}, \tag{5.1}
\]
where $\hat\eta_i^{*}$ is the GLS estimator of $\eta_i=X_i\beta_i$ ($i=1,2$), and
\[
\hat\eta_1^{*}=X_1(X_1'X_1)^{+}X_1'Y_1-\frac{\sigma_{12}}{\sigma_{22}}X_1(X_1'X_1)^{+}X_1'Y_2=\hat\eta_1-\frac{\sigma_{12}}{\sigma_{22}}X_1(X_1'X_1)^{+}X_1'Y_2, \tag{5.2}
\]
\[
\hat\eta_2^{*}=X_2(X_2'X_2)^{+}X_2'Y_2-\frac{\sigma_{12}}{\sigma_{11}}X_2(X_2'X_2)^{+}X_2'Y_1=\hat\eta_2-\frac{\sigma_{12}}{\sigma_{11}}X_2(X_2'X_2)^{+}X_2'Y_1, \tag{5.3}
\]
with $\hat\eta_1=X_1(X_1'X_1)^{+}X_1'Y_1$ and $\hat\eta_2=X_2(X_2'X_2)^{+}X_2'Y_2$.

In the same way as in Section 2, we may obtain the MBRLU estimator of $\eta=X\beta$. Let $\eta_0=X\mu$. Then we have
\[
\hat\eta_B=XVX'(XVX'+\Phi)^{-1}Y+[I-XVX'(XVX'+\Phi)^{-1}]\eta_0. \tag{5.4}
\]
By (2.6) and the fact that $X'\Phi^{-1}=X'\Phi^{-1}X(X'\Phi^{-1}X)^{+}X'\Phi^{-1}$, we know that
\[
\begin{aligned}
XVX'(XVX'+\Phi)^{-1}&=XVX'[\Phi^{-1}-\Phi^{-1}X(V^{-1}+X'\Phi^{-1}X)^{-1}X'\Phi^{-1}]\\
&=XV[I-X'\Phi^{-1}X(V^{-1}+X'\Phi^{-1}X)^{-1}]X'\Phi^{-1}\\
&=X(V^{-1}+X'\Phi^{-1}X)^{-1}X'\Phi^{-1}=:H\\
&=X(V^{-1}+X'\Phi^{-1}X)^{-1}X'\Phi^{-1}X(X'\Phi^{-1}X)^{+}X'\Phi^{-1}\\
&=HX(X'\Phi^{-1}X)^{+}X'\Phi^{-1}.
\end{aligned}\tag{5.5}
\]
Therefore, using (5.4), (5.5) and the fact that $X(X'X)^{+}X'X=X$, we have
\[
\begin{aligned}
\hat\eta_B&=HX(X'\Phi^{-1}X)^{+}X'\Phi^{-1}Y+(I-H)\eta_0\\
&=H\hat\eta_{GLS}+(I-H)\eta_0\\
&=\hat\eta_{GLS}-(I-H)(\hat\eta_{GLS}-\eta_0)\\
&=\hat\eta_{GLS}-[X-HX(X'X)^{+}X'X](\hat\beta_{GLS}-\mu)\\
&=\hat\eta_{GLS}-[I-HX(X'X)^{+}X'](\hat\eta_{GLS}-\eta_0),
\end{aligned}\tag{5.6}
\]
where $\hat\beta_{GLS}=(X'\Phi^{-1}X)^{+}X'\Phi^{-1}Y$, so that $\hat\eta_{GLS}=X\hat\beta_{GLS}$ and $\eta_0=X\mu$. By the assumption $X_1'X_2=0$ we have
\[
\begin{aligned}
X'\Phi^{-1}X(X'X)^{+}X'
&=\begin{pmatrix}X_1'&0\\ 0&X_2'\end{pmatrix}
\begin{pmatrix}\sigma_{11.2}^{-1}I_n&-\rho_{12}I_n\\ -\rho_{12}I_n&\sigma_{22.1}^{-1}I_n\end{pmatrix}
\begin{pmatrix}X_1(X_1'X_1)^{+}X_1'&0\\ 0&X_2(X_2'X_2)^{+}X_2'\end{pmatrix}\\
&=\begin{pmatrix}\sigma_{11.2}^{-1}X_1'X_1(X_1'X_1)^{+}X_1'&0\\ 0&\sigma_{22.1}^{-1}X_2'X_2(X_2'X_2)^{+}X_2'\end{pmatrix}
=\begin{pmatrix}\sigma_{11.2}^{-1}X_1'&0\\ 0&\sigma_{22.1}^{-1}X_2'\end{pmatrix},
\end{aligned}\tag{5.7}
\]
and
\[
\begin{aligned}
HX(X'X)^{+}X'&=X[V^{-1}+X'\Phi^{-1}X]^{-1}X'\Phi^{-1}X(X'X)^{+}X'\\
&=\begin{pmatrix}X_1&0\\ 0&X_2\end{pmatrix}
\begin{pmatrix}\frac{1}{\sigma_{11.2}}X_1'X_1+\frac{1}{\tau_1^2}I_{p_1}&0\\ 0&\frac{1}{\sigma_{22.1}}X_2'X_2+\frac{1}{\tau_2^2}I_{p_2}\end{pmatrix}^{-1}
\begin{pmatrix}\frac{1}{\sigma_{11.2}}X_1'&0\\ 0&\frac{1}{\sigma_{22.1}}X_2'\end{pmatrix}\\
&=\begin{pmatrix}X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'&0\\ 0&X_2(X_2'X_2+\delta_2I_{p_2})^{-1}X_2'\end{pmatrix}
\end{aligned}\tag{5.8}
\]
with $\delta_1=\sigma_{11.2}\tau_1^{-2}$ and $\delta_2=\sigma_{22.1}\tau_2^{-2}$. Here $-\rho_{12}=-\sigma_{12}/(\sigma_{11}\sigma_{22}-\sigma_{12}^2)$ is the off-diagonal element of $\Sigma_*^{-1}$; its exact value is immaterial, since the cross terms vanish under $X_1'X_2=0$. Substituting (5.8) into (5.6), we obtain
\[
\hat\eta_B=\hat\eta_{GLS}-[I-HX(X'X)^{+}X'](\hat\eta_{GLS}-\eta_0)
=\begin{pmatrix}\hat\eta_1^{*}-[I_n-X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'](\hat\eta_1^{*}-\eta_{01})\\ \hat\eta_2^{*}-[I_n-X_2(X_2'X_2+\delta_2I_{p_2})^{-1}X_2'](\hat\eta_2^{*}-\eta_{02})\end{pmatrix}, \tag{5.9}
\]
where $\eta_0'=(\eta_{01}',\eta_{02}')$ and $\eta'=(\eta_1',\eta_2')$. So the MBRLU estimators of $\eta_i$ ($i=1,2$) are
\[
\hat\eta_{1B}=\hat\eta_1^{*}-[I_n-X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'](\hat\eta_1^{*}-\eta_{01}), \tag{5.10}
\]
\[
\hat\eta_{2B}=\hat\eta_2^{*}-[I_n-X_2(X_2'X_2+\delta_2I_{p_2})^{-1}X_2'](\hat\eta_2^{*}-\eta_{02}). \tag{5.11}
\]

Theorem 5.1 Let the GLS estimators and MBRLU estimators of $\eta_i=X_i\beta_i$ be given by (5.2), (5.3) and (5.10), (5.11), respectively. Then
\[
M(\hat\eta_i^{*})-M(\hat\eta_{iB})\ge 0, \qquad i=1,2.
\]

Proof: We only prove the case $i=1$; the case $i=2$ can be proved similarly. Let $G_1=I_n-X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'$. Then
\[
\begin{aligned}
M(\hat\eta_{1B})&=E\big\{[\hat\eta_1^{*}-G_1(\hat\eta_1^{*}-\eta_{01})-\eta_1][\hat\eta_1^{*}-G_1(\hat\eta_1^{*}-\eta_{01})-\eta_1]'\big\}\\
&=E\big\{[(\hat\eta_1^{*}-\eta_1)-G_1(\hat\eta_1^{*}-\eta_{01})][(\hat\eta_1^{*}-\eta_1)-G_1(\hat\eta_1^{*}-\eta_{01})]'\big\}\\
&=E\big\{(\hat\eta_1^{*}-\eta_1)(\hat\eta_1^{*}-\eta_1)'-(\hat\eta_1^{*}-\eta_1)(\hat\eta_1^{*}-\eta_{01})'G_1\\
&\qquad\;-G_1(\hat\eta_1^{*}-\eta_{01})(\hat\eta_1^{*}-\eta_1)'+G_1(\hat\eta_1^{*}-\eta_{01})(\hat\eta_1^{*}-\eta_{01})'G_1\big\}\\
&=M(\hat\eta_1^{*})-K_1G_1-G_1K_1+G_1K_2G_1,
\end{aligned}\tag{5.12}
\]
where
\[
K_2=E[(\hat\eta_1^{*}-\eta_{01})(\hat\eta_1^{*}-\eta_{01})']=\mathrm{Cov}(\hat\eta_1^{*})
=\mathrm{Cov}[E(\hat\eta_1^{*}\mid\eta)]+E[\mathrm{Cov}(\hat\eta_1^{*}\mid\eta)]
=\tau_1^2X_1X_1'+\sigma_{11.2}X_1(X_1'X_1)^{+}X_1', \tag{5.13}
\]
\[
K_1=E[(\hat\eta_1^{*}-\eta_1)(\hat\eta_1^{*}-\eta_{01})']=\mathrm{Cov}(\hat\eta_1^{*})-\mathrm{Cov}(\eta_1)=\sigma_{11.2}X_1(X_1'X_1)^{+}X_1'. \tag{5.14}
\]
Putting (5.13) and (5.14) into (5.12) and using the fact that $G_1=I_n-X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'=(I_n+\delta_1^{-1}X_1X_1')^{-1}$, we have
\[
M(\hat\eta_1^{*})-M(\hat\eta_{1B})=K_1G_1+G_1K_1-G_1K_2G_1
=G_1\big(\tau_1^2X_1X_1'+\sigma_{11.2}X_1(X_1'X_1)^{+}X_1'\big)G_1\ge 0.
\]
Theorem 5.1 has been proved.
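The sketch below (with illustrative, assumed settings of our own) evaluates the MBRLU estimator (5.10) for a rank-deficient $X_1$ and verifies numerically that the closed-form MSEM gap $G_1(\tau_1^2X_1X_1'+\sigma_{11.2}X_1(X_1'X_1)^{+}X_1')G_1$ from the proof of Theorem 5.1 is positive semi-definite.

```python
# Sketch (illustrative assumptions, not from the paper): the MBRLU estimator (5.10)
# for a rank-deficient X1, plus a numerical PSD check of the MSEM gap in Theorem 5.1.
import numpy as np

rng = np.random.default_rng(5)
n, p1 = 20, 4
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
s11, s22, s12 = Sigma[0, 0], Sigma[1, 1], Sigma[0, 1]
sigma11_2 = s11 - s12**2 / s22
tau1 = 0.7
delta1 = sigma11_2 / tau1**2

X1 = rng.standard_normal((n, p1 - 1))
X1 = np.hstack([X1, X1[:, [0]] - X1[:, [2]]])       # rank(X1) = 3 < p1
beta1 = rng.standard_normal(p1)
mu1 = np.zeros(p1)
eta01 = X1 @ mu1

# Simulate Y1 and a correlated Y2 (the mean term X2*beta2 of Y2 is omitted here;
# under X1'X2 = 0 it would not change eta1_star anyway).
E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
Y1, Y2 = X1 @ beta1 + E[:, 0], E[:, 1]

P1 = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T          # projector onto col(X1)
eta1_star = P1 @ (Y1 - s12 / s22 * Y2)              # GLS estimator (5.2)
G1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1 + delta1 * np.eye(p1), X1.T)
eta1_B = eta1_star - G1 @ (eta1_star - eta01)       # MBRLU estimator (5.10)

# MSEM gap from the proof of Theorem 5.1; all eigenvalues should be >= 0.
gap = G1 @ (tau1**2 * X1 @ X1.T + sigma11_2 * P1) @ G1
print(eta1_B[:3], np.linalg.eigvalsh(gap).min() >= -1e-10)
```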
For any general estimable function $\gamma_i=P_i\beta_i$, where $P_i$ is a $k_i\times p_i$ real matrix, there exists a $k_i\times n$ matrix $C_i$ such that $P_i=C_iX_i$; therefore $\gamma_i=C_iX_i\beta_i=C_i\eta_i$ ($i=1,2$). It follows that the GLS estimators and MBRLU estimators of $\gamma_i$ are
\[
\hat\gamma_i^{*}=C_i\hat\eta_i^{*}, \qquad \hat\gamma_{iB}=C_i\hat\eta_{iB}, \qquad i=1,2. \tag{5.15}
\]
Then we have the following corollary.

Corollary 5.1 Let the GLS estimators and MBRLU estimators of the estimable functions $\gamma_i$ ($i=1,2$) be given by (5.15). Then
\[
M(\hat\gamma_i^{*})-M(\hat\gamma_{iB})\ge 0, \qquad i=1,2.
\]

Proof: The conclusion holds since
\[
M(\hat\gamma_i^{*})-M(\hat\gamma_{iB})=C_i[M(\hat\eta_i^{*})-M(\hat\eta_{iB})]C_i'\ge 0.
\]

Now we discuss the superiority of the MBRLU estimator of $\eta_1$ under the PRPC criterion; the superiority of the MBRLU estimator of $\eta_2$ can be discussed similarly. Under the loss function (2.2) we have the following result.

Theorem 5.2 Let the GLS estimator and MBRLU estimator of $\eta_1=X_1\beta_1$ be given by (5.2) and (5.10). Assume (4.1) and suppose $R(X_1)=t_1\le p_1$. If
\[
\frac{\tau_1^2}{\sigma_{11.2}}\le\frac{\tilde\lambda_{t_1}(t_1-2)}{2t_1\tilde\lambda_1^2}, \tag{5.16}
\]
then
\[
P_\pi\big(L(\hat\eta_{1B},\eta_1)\le L(\hat\eta_1^{*},\eta_1)\big)\ge 0.5 \quad\text{for every }\pi\in\Gamma(\beta_1),
\]
where $\tilde\lambda_1$ and $\tilde\lambda_{t_1}$ are the maximum and minimum positive eigenvalues of $X_1'X_1$, and $\Gamma(\beta_1)$ is given in Theorem 4.1.

Remark 5.1 Condition (5.16) indicates that the prior variance should not be too large relative to the sampling variance; in other words, it imposes a requirement on the precision of the prior distribution.

Proof: Let $G_1=I-X_1(X_1'X_1+\delta_1I_{p_1})^{-1}X_1'=(I+\delta_1^{-1}X_1X_1')^{-1}$ and $W(\hat\eta_{1B},\hat\eta_1^{*};\eta_1)=L(\hat\eta_{1B},\eta_1)-L(\hat\eta_1^{*},\eta_1)$, where $\delta_1$ is given in (5.8). By (5.10) we have
\[
\begin{aligned}
L(\hat\eta_{1B},\eta_1)&=[(\hat\eta_1^{*}-\eta_1)-G_1(\hat\eta_1^{*}-\eta_{01})]'[(\hat\eta_1^{*}-\eta_1)-G_1(\hat\eta_1^{*}-\eta_{01})]\\
&=L(\hat\eta_1^{*},\eta_1)-2(\hat\eta_1^{*}-\eta_{01})'G_1(\hat\eta_1^{*}-\eta_1)+(\hat\eta_1^{*}-\eta_{01})'G_1^2(\hat\eta_1^{*}-\eta_{01}).
\end{aligned}\tag{5.17}
\]
From (5.17), $W\le 0$ is equivalent to
\[
(\hat\eta_1^{*}-\eta_{01})'G_1^2(\hat\eta_1^{*}-\eta_{01})\le 2(\hat\eta_1^{*}-\eta_{01})'G_1(\hat\eta_1^{*}-\eta_1). \tag{5.18}
\]
Since $G_1^2\le G_1$, (5.18) is implied by
\[
(\hat\eta_1^{*}-\eta_{01})'G_1(\hat\eta_1^{*}-\eta_{01})\le 2(\hat\eta_1^{*}-\eta_{01})'G_1(\hat\eta_1^{*}-\eta_1). \tag{5.19}
\]
Substituting $\hat\eta_1^{*}-\eta_{01}=(\hat\eta_1^{*}-\eta_1)-(\eta_{01}-\eta_1)$ into (5.19), we obtain the equivalent condition
\[
(\eta_1-\eta_{01})'G_1(\eta_1-\eta_{01})\le(\hat\eta_1^{*}-\eta_1)'G_1(\hat\eta_1^{*}-\eta_1). \tag{5.20}
\]
Since $\tilde\lambda_1$ and $\tilde\lambda_{t_1}$ are the maximum and minimum non-zero eigenvalues of $X_1'X_1$ (and hence of $X_1X_1'$), and both $\eta_1-\eta_{01}=X_1(\beta_1-\mu_1)$ and $\hat\eta_1^{*}-\eta_1$ lie in the column space of $X_1$, on this subspace we have
\[
\Big(\frac{\tau_1^2\tilde\lambda_1}{\sigma_{11.2}}+1\Big)^{-1}I_n\le G_1=\Big(\frac{\tau_1^2}{\sigma_{11.2}}X_1X_1'+I_n\Big)^{-1}\le\Big(\frac{\tau_1^2\tilde\lambda_{t_1}}{\sigma_{11.2}}+1\Big)^{-1}I_n.
\]
Hence (5.20) is implied by
\[
(\delta_1^{-1}\tilde\lambda_{t_1}+1)^{-1}(\eta_1-\eta_{01})'(\eta_1-\eta_{01})\le(\delta_1^{-1}\tilde\lambda_1+1)^{-1}(\hat\eta_1^{*}-\eta_1)'(\hat\eta_1^{*}-\eta_1). \tag{5.21}
\]
Noting that
\[
0<\frac{\delta_1^{-1}\tilde\lambda_1+1}{\delta_1^{-1}\tilde\lambda_{t_1}+1}\le\frac{\tilde\lambda_1}{\tilde\lambda_{t_1}},
\]
(5.21) is implied by
\[
\frac{\tilde\lambda_1}{\tilde\lambda_{t_1}}(\eta_1-\eta_{01})'(\eta_1-\eta_{01})\le(\hat\eta_1^{*}-\eta_1)'(\hat\eta_1^{*}-\eta_1). \tag{5.22}
\]
From (4.1) we know that $(\hat\eta_1^{*}-\eta_1)\mid\eta\sim N\big(0,\sigma_{11.2}X_1(X_1'X_1)^{+}X_1'\big)$. Since $Q=X_1(X_1'X_1)^{+}X_1'$ is an idempotent matrix, there exists an orthogonal matrix $P$ such that
\[
P'QP=\begin{pmatrix}I_{t_1}&0\\ 0&0\end{pmatrix}.
\]
Let $\tilde Z=\sigma_{11.2}^{-1/2}P'(\hat\eta_1^{*}-\eta_1)$; then
\[
\tilde Z\mid\eta\sim N\left(0,\begin{pmatrix}I_{t_1}&0\\ 0&0\end{pmatrix}\right),
\]
which implies $\tilde Z'\tilde Z\sim\chi^2_{t_1}$. Thus (5.22) is equivalent to
\[
\frac{\tilde\lambda_1}{\tilde\lambda_{t_1}\sigma_{11.2}}\|\eta_1-\eta_{01}\|^2\le\tilde Z'\tilde Z. \tag{5.23}
\]
By (5.18)-(5.23) and the Markov inequality we obtain
\[
\begin{aligned}
P_\pi\big(W(\hat\eta_{1B},\hat\eta_1^{*};\eta_1)\le 0\big)&\ge P_\pi\Big(\tilde Z'\tilde Z\ge\frac{\tilde\lambda_1}{\sigma_{11.2}\tilde\lambda_{t_1}}\|\eta_1-\eta_{01}\|^2\Big)\\
&=1-P_\pi\Big(\|\eta_1-\eta_{01}\|^2\ge\frac{\sigma_{11.2}\tilde\lambda_{t_1}}{\tilde\lambda_1}\tilde Z'\tilde Z\Big)\\
&=1-E\Big[P_\pi\Big(\|\eta_1-\eta_{01}\|^2\ge\frac{\sigma_{11.2}\tilde\lambda_{t_1}}{\tilde\lambda_1}\tilde Z'\tilde Z\,\Big|\,\tilde Z'\tilde Z\Big)\Big]\\
&\ge 1-\frac{\tilde\lambda_1\,E\|\eta_1-\eta_{01}\|^2}{\sigma_{11.2}\tilde\lambda_{t_1}}E\Big(\frac{1}{\tilde Z'\tilde Z}\Big)
=1-\frac{\tilde\lambda_1\,\mathrm{tr}(\mathrm{Cov}(\eta_1))}{\sigma_{11.2}\tilde\lambda_{t_1}(t_1-2)}\\
&=1-\frac{\tilde\lambda_1\tau_1^2\,\mathrm{tr}(X_1'X_1)}{\sigma_{11.2}\tilde\lambda_{t_1}(t_1-2)}
\ge 1-\frac{t_1\tilde\lambda_1^2\tau_1^2}{\sigma_{11.2}\tilde\lambda_{t_1}(t_1-2)}\ge 0.5.
\end{aligned}
\]
The proof of Theorem 5.2 is completed.

Similar to Theorem 5.2, we have the following theorem.

Theorem 5.3 Let the GLS estimator and MBRLU estimator of $\eta_2=X_2\beta_2$ be given by (5.3) and (5.11). Assume (4.1) and suppose $R(X_2)=t_2\le p_2$. If
\[
\frac{\tau_2^2}{\sigma_{22.1}}\le\frac{\bar\lambda_{t_2}(t_2-2)}{2t_2\bar\lambda_1^2}, \tag{5.24}
\]
then
\[
P_\pi\big(L(\hat\eta_{2B},\eta_2)\le L(\hat\eta_2^{*},\eta_2)\big)\ge 0.5 \quad\text{for every }\pi\in\Gamma(\beta_2),
\]
where $\bar\lambda_1$ and $\bar\lambda_{t_2}$ are the maximum and minimum positive eigenvalues of $X_2'X_2$, and $\Gamma(\beta_2)$ is the same as in Theorem 4.2.

Similar to the proof of Theorem 4.3, we have the following result.

Theorem 5.4 Let the GLS estimators and MBRLU estimators of $\eta_i=X_i\beta_i$ be given by (5.2), (5.3) and (5.10), (5.11), respectively. Under assumptions (4.1) and (4.12), for $i=1,2$ we have
\[
P_\pi\big(L(\hat\eta_{iB},\eta_i)\le L(\hat\eta_i^{*},\eta_i)\mid y\big)\ge 0.5 \quad\text{for any }y\in\mathcal{Y},
\]
where $\mathcal{Y}$ is the sample space.

6 Concluding Remarks

In summary, we have investigated the Bayesian estimation problem of the regression parameters in the system of two seemingly unrelated regressions. We derive the minimum Bayes risk linear unbiased (MBRLU) estimators of the regression parameters and establish their superiorities under the mean square error matrix (MSEM) criterion. We also exhibit the superiorities of the MBRLU estimators in terms of the predictive Pitman closeness (PRPC) criterion and the posterior Pitman closeness (PPC) criterion, respectively. For the case where the design matrices are not of full column rank, the superiorities of the MBRLU estimators of some estimable functions are investigated.

References

[1] J.O. Berger, Statistical Decision Theory and Bayesian Analysis, Berlin, Springer, 1985.
[2] G.E.P. Box and G. Tiao, Bayesian Inference in Statistical Analysis, Reading, Addison-Wesley, 1973.
[3] R.L. Fountain and J.P. Keating, The Pitman comparison of unbiased linear estimators, Statistics and Probability Letters, 19, (1994), 131-136.
[4] M. Ghosh and P.K. Sen, Bayesian Pitman closeness, Comm. Statist. Theory Methods, 20, (1991), 3659-3678.
[5] M.H.J. Gruber, Regression Estimators: A Comparative Study, Boston, Academic Press, 1990.
[6] J.P. Keating and R.L. Mason, Practical relevance of an alternative criterion in estimation, Amer. Statist., 39, (1985), 203-205.
[7] J.P. Keating, R.L. Mason and P.K. Sen, Pitman's Measure of Closeness: A Comparison of Statistical Estimators, Philadelphia, Society for Industrial and Applied Mathematics, 1993.
[8] J. Kmenta and R.F. Gilbert, Small sample properties of alternative estimators of seemingly unrelated regressions, J. Amer. Statist. Assoc., 63, (1968), 1180-1200.
[9] C.T. Lin, The efficiency of least squares estimator of a seemingly unrelated regression model, Comm. Statist. Simulation, 20, (1991), 919-925.
[10] A.Y. Liu, Efficient estimation of two seemingly unrelated regression equations, Journal of Multivariate Analysis, 82, (2002), 445-456.
[11] R.L. Mason, J.P. Keating, P.K. Sen and N.W. Blaylock, Comparison of linear estimators using Pitman's measure of closeness, J. Amer. Statist. Assoc., 81, (1990), 579-581.
[12] J.S. Mehta and P.A.V.B. Swamy, Further evidence on the relative efficiencies of Zellner's seemingly unrelated regressions estimator, J. Amer. Statist. Assoc., 71, (1976), 634-639.
[13] E.J.G. Pitman, The closest estimates of statistical parameters, Proc. Camb. Phil. Soc., 33, (1937), 212-222.
[14] C.R. Rao, Linear Statistical Inference and Its Applications, Second Edition, New York, Wiley, 1973.
[15] C.R. Rao, Estimation of parameters in a linear model, The 1975 Wald Memorial Lectures, Ann. Statist., 4, (1976), 1023-1037.
[16] C.R. Rao, Some comments on the minimum mean square error as a criterion of estimation, in Statistics and Related Topics, Amsterdam, North-Holland, pp. 123-143, 1981.
[17] C.R. Rao, J.P. Keating and R.L. Mason, The Pitman nearness criterion and its determination, Comm. Statist. Theory Methods, 15, (1986), 3173-3191.
[18] N.S. Revankar, Some finite sample results in the context of two seemingly unrelated regression equations, J. Amer. Statist. Assoc., 69, (1974), 187-190.
[19] P. Schmidt, Estimation of seemingly unrelated regressions with unequal numbers of observations, Journal of Econometrics, 5, (1977), 365-377.
[20] G. Trenkler and L.S. Wei, The Bayes estimators in a misspecified linear regression model, Test, 5, (1996), 113-123.
[21] L.C. Wang and N. Veraverbeke, MSE superiority of Bayes and empirical Bayes estimators in two generalized seemingly unrelated regressions, Statistics and Probability Letters, 78, (2008), 109-117.
[22] L.C. Wang, H. Lian and R.S. Singh, On efficient estimators of two seemingly unrelated regressions, Statistics and Probability Letters, 81, (2011), 563-570.
[23] S.G. Wang, A new estimate of regression coefficients in seemingly unrelated regression system, Science in China (Ser. A), 7, (1989), 808-816.
[24] S.G. Wang and S.C. Chow, Advanced Linear Models: Theory and Applications, New York, Marcel Dekker, 1994.
[25] W.P. Zhang, Research of Problems of Bayes Analysis in Linear Model, Ph.D. Thesis, USTC, 2005-JX-C-1773, 2005.
[26] A. Zellner, An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias, J. Amer. Statist. Assoc., 57, (1962), 348-368.
[27] A. Zellner, Estimators of seemingly unrelated regressions: some exact finite sample results, J. Amer. Statist. Assoc., 58, (1963), 977-992.