17.2.1 Estimable Functions
In this section we discuss briefly the theory of estimable functions (Bose (1944)), which is
important for a deeper understanding of certain topics that arise in the analysis of variance.
Definition 17.2.1 A parametric function $\psi(\beta_1, \ldots, \beta_m)$ is said to be estimable if there exists some function $f(Y_1, \ldots, Y_n)$ of the random variables $Y_1, \ldots, Y_n$ such that

$$E(f) = \psi \qquad (17.2.8)$$
Definition 17.2.2 A linear function $t'\beta$ of the parameters $\beta_1, \ldots, \beta_m$ is said to be linearly estimable if there exists a linear function $a'Y$ of the random vector $Y$ such that for all $\beta_1, \ldots, \beta_m$

$$E(a'Y) = t'\beta \qquad (17.2.9)$$

that is, $a'Y$ is an unbiased estimator of $t'\beta$.
We now discuss various important theorems concerning estimable functions. The proofs of these theorems and their corollaries are optional.
Theorem 17.2.1 Suppose we are dealing with the model (17.2.2). Then a necessary and sufficient condition for a linear parametric function $t'\beta$ to be linearly estimable is that

$$\operatorname{rank}(X') = \operatorname{rank}(X' : t)$$

where $(X' : t)$ is the matrix obtained from $X'$ by adjoining the column vector $t$.
Proof: Let $a' = (a_1, \ldots, a_n)$ be any $1 \times n$ row vector. Then, under the model $E(Y) = X\beta$, the expectation of the linear function $a'Y$ can be written as

$$E(a'Y) = a'X\beta \quad \text{for all } \beta \qquad (17.2.10)$$

Now the linear function $a'Y$ is an unbiased estimator of the linear function $t'\beta$ of the parameters if

$$E(a'Y) = t'\beta \quad \text{for all } \beta \qquad (17.2.11)$$

Obviously, (17.2.10) and (17.2.11) hold for all $\beta$ if and only if

$$t' = a'X, \quad \text{or} \quad X'a = t \qquad (17.2.12)$$

that is, if and only if there exists a solution for the unknown vector $a$, which is true if and only if

$$\operatorname{rank}(X') = \operatorname{rank}(X' : t)$$

This completes the proof of Theorem 17.2.1.
Corollary 1 The linear parametric function $t'\beta$ is linearly estimable if and only if there exists a solution for $\rho$ in $X'X\rho = t$.

Proof: The result follows immediately by using the fact that

$$\operatorname{rank}(X') = \operatorname{rank}(X' : t) \iff \operatorname{rank}(X'X) = \operatorname{rank}(X'X : t).$$
Corollary 2 If the matrix $X$ ($n \times m$), where $m \le n$, is of full rank $m$, then every linear parametric function $t'\beta$ is linearly estimable.

Proof: $\operatorname{rank}(X' : t) \ge \operatorname{rank}(X') = m$. But $\operatorname{rank}(X' : t)$ cannot exceed $m$, since $(X' : t)$ is an $m \times (n+1)$ matrix with $m \le n$. Hence $\operatorname{rank}(X' : t) = \operatorname{rank}(X')$.
Example 17.2.2 (Example 17.2.1 Revisited) Consider the experiment in Example 17.2.1 in which four observations are made, two on each of the two treatments. Our model for this experiment is

$$y_{ij} = \mu + \tau_j + e_{ij}, \quad i = 1, 2; \; j = 1, 2$$

We have $\beta' = (\mu, \tau_1, \tau_2)$ and the transpose of the design matrix $X$ is

$$X' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$

which is of rank 2. We show that the linear parametric function $\tau_1 - \tau_2$ is linearly estimable. The function $\tau_1 - \tau_2$ can be written as $t'\beta$ where $t' = (0, 1, -1)$. Now, the rank of

$$(X' : t) = \begin{pmatrix} 1 & 1 & 1 & 1 & | & 0 \\ 1 & 1 & 0 & 0 & | & 1 \\ 0 & 0 & 1 & 1 & | & -1 \end{pmatrix}$$

is again 2, since the sum of the last two rows equals the first row. This proves that $\tau_1 - \tau_2$ is a linearly estimable function (see Theorem 17.2.1).
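The rank condition of Theorem 17.2.1 is easy to check numerically. The following is a minimal Python/numpy sketch (ours, not from the text); the function name `is_estimable` and the example vectors are illustrative assumptions.

```python
import numpy as np

def is_estimable(X, t):
    """Check whether t'beta is linearly estimable under E(Y) = X beta,
    i.e., whether rank(X') == rank(X' : t)  (Theorem 17.2.1)."""
    Xt = X.T                                  # X' is m x n
    augmented = np.column_stack([Xt, t])      # (X' : t), m x (n + 1)
    return np.linalg.matrix_rank(Xt) == np.linalg.matrix_rank(augmented)

# Design matrix of Example 17.2.2: columns (mu, tau1, tau2), four runs
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]])

print(is_estimable(X, np.array([0, 1, -1])))  # tau1 - tau2 -> True
print(is_estimable(X, np.array([0, 1,  0])))  # tau1 alone  -> False
```

Note that the second check fails: $\tau_1$ by itself is not estimable in this design, only contrasts such as $\tau_1 - \tau_2$ (and the cell means $\mu + \tau_j$) are.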
Theorem 17.2.2 Any linear combination of estimable functions is estimable.

Proof: Let $t_1'\beta, \ldots, t_r'\beta$ be estimable functions, where each $t_j$ is $m \times 1$ and $\beta$ is $m \times 1$, and let $\lambda_1, \ldots, \lambda_r$ be some constants. Then we must show that $t'\beta$ is an estimable function, where

$$t = \sum_{j=1}^{r} \lambda_j t_j$$

Since the functions $t_1'\beta, \ldots, t_r'\beta$ are estimable, there exist $(n \times 1)$ vectors $a_1, \ldots, a_r$ such that

$$X'a_i = t_i, \quad i = 1, 2, \ldots, r \qquad (17.2.13)$$

Whenever (17.2.13) holds, we have

$$X'a = t$$

where $a = \sum_i \lambda_i a_i$ and $t = \sum_i \lambda_i t_i$. This completes the proof of the theorem.
Now it is quite interesting to investigate the number of linearly independent estimable functions.
In order to see this, we need the following definition.
Definition 17.2.3 The estimable functions $t_1'\beta, \ldots, t_r'\beta$ are said to be linearly independent if there exist $a_1, \ldots, a_r$ such that $X'a_i = t_i$, $i = 1, \ldots, r$, and if the vectors $t_1, \ldots, t_r$ are linearly independent.
Now, using Definition 17.2.3, one can easily show that if $\operatorname{rank}(X) = m_0$, then there are exactly $m_0$ linearly independent estimable functions. Thus, we have the following result.
Theorem 17.2.3 The maximum number of linearly independent estimable functions is exactly
equal to the rank of the design matrix X.
For instance, in Example 17.2.2, since the rank of the design matrix $X$ is 2, there are exactly two linearly independent estimable functions. One such set of two linearly independent estimable functions is $\mu + \tau_1$ and $\mu + \tau_2$.
Further, if we use Theorem 17.2.2 and the fact that $\mu + \tau_1$ and $\mu + \tau_2$ are estimable functions, it follows immediately that the function $\tau_1 - \tau_2 = (\mu + \tau_1) - (\mu + \tau_2)$ is also estimable, the result of Example 17.2.2.
We come now to an important historical theorem, which will be used in subsequent sections.

Theorem 17.2.4 (Gauss–Markoff Theorem) Suppose in model (17.2.2) $t'\beta$ is an estimable function. Then the best linear unbiased estimator (BLUE) of $t'\beta$ is $t'\beta^*$, where $\beta^*$ is any solution of the least squares normal equations $X'X\beta^* = X'Y$.

Proof: Since $t'\beta$ is an estimable function, it follows from Corollary 1 of Theorem 17.2.1 that

$$t = X'X\rho \text{ for some } \rho, \text{ so that } t'\beta = \rho'X'X\beta \qquad (17.2.14)$$

Then

$$t'\beta^* = \rho'X'X\beta^* = \rho'X'Y \qquad (17.2.14a)$$

since $X'X\beta^* = X'Y$. Thus, $t'\beta^*$ is a linear function of $Y$. Further, we have from (17.2.14) and (17.2.14a) that

$$E(t'\beta^*) = E(\rho'X'Y) = \rho'X'E(Y) = \rho'X'X\beta = t'\beta.$$

Hence, $t'\beta^*$ is an unbiased estimator of $t'\beta$.

Now, to complete the proof of the theorem, we show that, among all the linear unbiased estimators of $t'\beta$, the estimator $t'\beta^*$ is best in the sense that it has minimum variance. Let $b'Y$ be an arbitrary linear function of $Y$ such that

$$E(b'Y) = t'\beta$$

We then have that

$$E(b'Y) = b'E(Y) = b'X\beta$$

that is,

$$b'X\beta = t'\beta \text{ for all } \beta \implies b'X = t'. \qquad (17.2.15)$$

Thus, $\mathrm{Var}(b'Y) = \mathrm{Var}(b'Y - \rho'X'Y + \rho'X'Y)$, or

$$\mathrm{Var}(b'Y) = \mathrm{Var}(b'Y - \rho'X'Y) + \mathrm{Var}(\rho'X'Y) + 2\,\mathrm{Cov}[(b'Y - \rho'X'Y), \rho'X'Y] \qquad (17.2.16)$$

We wish now to evaluate the last term of (17.2.16). We first notice that

$$E(b'Y - \rho'X'Y) = (b' - \rho'X')E(Y) = (b' - \rho'X')X\beta,$$

so that $(b'Y - \rho'X'Y) - E(b'Y - \rho'X'Y)$ can be written as

$$(b' - \rho'X')(Y - X\beta) \qquad (17.2.17)$$

Similarly, we have that

$$\rho'X'Y - E(\rho'X'Y) = \rho'X'(Y - X\beta) \qquad (17.2.17a)$$

Now the last term of (17.2.16), by the definition of covariance, can be written as

$$2\,\mathrm{Cov}[(b'Y - \rho'X'Y), \rho'X'Y] = 2E[(b' - \rho'X')(Y - X\beta)(Y - X\beta)'X\rho]$$
$$= 2(b' - \rho'X')E[(Y - X\beta)(Y - X\beta)']X\rho = 2(b' - \rho'X')\sigma^2 I_n X\rho \qquad (17.2.18)$$

since we are assuming that model (17.2.2) holds, which states that the $\varepsilon_i$'s are uncorrelated with constant variance $\sigma^2$, so that the variance-covariance matrix of the $Y_i$'s is $\sigma^2 I_n$ ($I_n$ is the $n \times n$ identity matrix). Hence, from (17.2.18), we have

$$2\,\mathrm{Cov}[(b'Y - \rho'X'Y), \rho'X'Y] = 2\sigma^2(b'X - \rho'X'X)\rho$$

and from (17.2.14) and (17.2.15) we have that

$$2\,\mathrm{Cov}[(b'Y - \rho'X'Y), \rho'X'Y] = 2\sigma^2(t' - t')\rho = 0 \qquad (17.2.19)$$

We may now write (17.2.16) as

$$\mathrm{Var}(b'Y) = \mathrm{Var}(b'Y - \rho'X'Y) + \mathrm{Var}(t'\beta^*),$$

since (17.2.14a) and (17.2.19) hold. Thus

$$\mathrm{Var}(b'Y) \ge \mathrm{Var}(t'\beta^*)$$

which completes the proof of this theorem.
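As a quick numerical illustration (again our own sketch, not from the text), the normal equations can be solved with a generalized inverse even when $X'X$ is singular, and for an estimable $t'\beta$ every solution $\beta^*$ yields the same value of $t'\beta^*$; the name `blue_of` is an assumption of this sketch.

```python
import numpy as np

def blue_of(X, y, t):
    """BLUE of an estimable t'beta: t'beta*, where beta* is any
    solution of the normal equations X'X beta* = X'y."""
    XtX, Xty = X.T @ X, X.T @ y
    beta_star = np.linalg.pinv(XtX) @ Xty   # one particular solution
    # For estimable t'beta, adding any v with X'X v = 0 to beta_star
    # leaves t @ beta_star unchanged.
    return t @ beta_star

rng = np.random.default_rng(7)
# Example 17.2.2 design: two treatments, two runs each
X = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]], float)
beta_true = np.array([10.0, 2.0, -1.0])     # mu, tau1, tau2
y = X @ beta_true + rng.normal(0, 1, size=4)

# BLUE of the estimable contrast tau1 - tau2 (true value 3)
print(blue_of(X, y, np.array([0.0, 1.0, -1.0])))
```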
17.5.3 Blocking in Two-Way Experimental Layouts
In Section 17.4, we discussed the use of a randomized block design to eliminate the effect of a nuisance variable in a one-way experimental layout. Quite often we confront a similar situation when using a two-way experimental design: again, we need to eliminate the effect of a nuisance variable.
For example, suppose that we use a two-way experimental design for a two-factor factorial experiment with $r$ replications, where the total number of observations to be generated is $a \times b \times r$. But on a given day we may only be able to complete one replication, and the experiment may be affected by weather conditions that vary daily. Thus, in this example days constitute a nuisance variable and are therefore treated as blocks.
Thus, we wish to conduct an experiment using the $a \times b$ treatment combinations of $a$ levels of factor A and $b$ levels of factor B in $r$ blocks, with each block containing the $a \times b$ combinations of the $A_i$'s with the $B_j$'s. We may tabulate the observations $y_{ijk}$ obtained using the $i$th level of A and the $j$th level of B in the $k$th block, as in Table 17.5.9.
TABLE 17.5.9 Observations $y_{ijk}$ obtained from $r$ blocks.

Block 1:

          B_1     ...   B_j     ...   B_b     Total
A_1       y_111   ...   y_1j1   ...   y_1b1   T_1.1
 ⋮          ⋮             ⋮             ⋮       ⋮
A_i       y_i11   ...   y_ij1   ...   y_ib1   T_i.1
 ⋮          ⋮             ⋮             ⋮       ⋮
A_a       y_a11   ...   y_aj1   ...   y_ab1   T_a.1
Total     T_.11   ...   T_.j1   ...   T_.b1   T_..1

(Blocks 2 through r - 1 are laid out in the same way.)

Block r:

          B_1     ...   B_j     ...   B_b     Total
A_1       y_11r   ...   y_1jr   ...   y_1br   T_1.r
 ⋮          ⋮             ⋮             ⋮       ⋮
A_i       y_i1r   ...   y_ijr   ...   y_ibr   T_i.r
 ⋮          ⋮             ⋮             ⋮       ⋮
A_a       y_a1r   ...   y_ajr   ...   y_abr   T_a.r
Total     T_.1r   ...   T_.jr   ...   T_.br   T_..r
A subsidiary totals table is also constructed, as in Table 17.5.10, where $T_{ij.} = \sum_{k=1}^{r} y_{ijk}$.
TABLE 17.5.10 Totals table for the data in Table 17.5.9.

          B_1     ...   B_j     ...   B_b     Total
A_1       T_11.   ...   T_1j.   ...   T_1b.   T_1..
 ⋮          ⋮             ⋮             ⋮       ⋮
A_i       T_i1.   ...   T_ij.   ...   T_ib.   T_i..
 ⋮          ⋮             ⋮             ⋮       ⋮
A_a       T_a1.   ...   T_aj.   ...   T_ab.   T_a..
Total     T_.1.   ...   T_.j.   ...   T_.b.   T_... = G
The reader is again cautioned to take Table 17.5.10 only as an aid in analysis and not to forget that we are talking about an experiment that involves $a \times b$ treatments run in each of $r$ blocks. This is not to be confused with an experiment with $a \times b \times r$ treatments without any blocking.
Initially, in our analysis, we may remove the source of variation due to blocks (see Table 17.5.11); then, using Table 17.5.10, extract the sum of squares due to treatments; and after computing the total sum of squares, find the error sum of squares by subtraction. Now let
$$T_{..k} = \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ijk} = \text{block total of the observations in the } k\text{th block},$$
$$T_{i.k} = \sum_{j=1}^{b} y_{ijk} = \text{sum of the observations taken using } A_i \text{ in the } k\text{th block, etc.} \qquad (17.5.21)$$
TABLE 17.5.11 Preliminary ANOVA table for the data in Table 17.5.9.

Source       Degrees of freedom     Sum of squares
Blocks       $r-1$                  $SS_{bl} = \sum_{k=1}^{r} T_{..k}^2/(ab) - G^2/(abr)$
Treatments   $ab-1$                 $SS_{treat} = \sum_{i=1}^{a}\sum_{j=1}^{b} T_{ij.}^2/r - G^2/(abr)$
Error        $(ab-1)(r-1)$          $SS_E = SS_{total} - (SS_{bl} + SS_{treat})$
Total        $abr-1$                $SS_{total} = \sum_{i}\sum_{j}\sum_{k} y_{ijk}^2 - G^2/(abr)$
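These totals-based formulas vectorize directly. The following is a minimal numpy sketch of the computations in Table 17.5.11 (our own illustration; the array layout `y[i, j, k]` and the function name are assumptions):

```python
import numpy as np

def blocked_two_way_prelim(y):
    """Preliminary ANOVA for y[i, j, k]: a x b treatments in r blocks
    (Table 17.5.11)."""
    a, b, r = y.shape
    G = y.sum()
    correction = G**2 / (a * b * r)
    T_block = y.sum(axis=(0, 1))            # T..k, one total per block
    T_treat = y.sum(axis=2)                 # Tij., one total per cell
    ss_bl = (T_block**2).sum() / (a * b) - correction
    ss_treat = (T_treat**2).sum() / r - correction
    ss_total = (y**2).sum() - correction
    ss_error = ss_total - (ss_bl + ss_treat)
    return ss_bl, ss_treat, ss_error, ss_total

rng = np.random.default_rng(1)
y = rng.normal(size=(3, 4, 2))              # a = 3, b = 4, r = 2 blocks
print(blocked_two_way_prelim(y))
```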
Now it is quite easy to partition the "Treatments" line in the usual way; there are $(a-1)$ degrees of freedom for $SS_A$, $(b-1)$ degrees of freedom for $SS_B$, and $(a-1)(b-1)$ degrees of freedom for the interaction sum of squares $SS_{AB}$ (see Table 17.5.12). In practice, we compute $SS_{AB}$ by subtraction (see Equation 17.5.13a).
TABLE 17.5.12 Partitioning of the treatment sum of squares.

Source       Degrees of freedom     Sum of squares
A            $a-1$                  $SS_A = \sum_{i=1}^{a} T_{i..}^2/(br) - G^2/(abr)$
B            $b-1$                  $SS_B = \sum_{j=1}^{b} T_{.j.}^2/(ar) - G^2/(abr)$
A × B        $(a-1)(b-1)$           $SS_{AB} = SS_{treat} - (SS_A + SS_B)$
Treatments   $ab-1$                 $SS_{treat} = \sum_{i=1}^{a}\sum_{j=1}^{b} T_{ij.}^2/r - G^2/(abr)$
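Continuing the sketch above, the treatment sum of squares can be partitioned as in Table 17.5.12 (again a hypothetical helper of our own):

```python
def partition_treatments(y):
    """Split SS_treat of y[i, j, k] into SS_A, SS_B, and SS_AB."""
    a, b, r = y.shape
    correction = y.sum()**2 / (a * b * r)
    ss_treat = (y.sum(axis=2)**2).sum() / r - correction
    ss_a = (y.sum(axis=(1, 2))**2).sum() / (b * r) - correction   # from Ti..
    ss_b = (y.sum(axis=(0, 2))**2).sum() / (a * r) - correction   # from T.j.
    ss_ab = ss_treat - (ss_a + ss_b)                              # by subtraction
    return ss_a, ss_b, ss_ab
```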
As is usual in these situations, the blocking is assumed not to interact with the factors A and B, so that we use the model (in the usual notation)

$$y_{ijk} = \mu + \alpha_i + \beta_j + \delta_k + (\alpha\beta)_{ij} + \varepsilon_{ijk} \qquad (17.5.22)$$

with

$$\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{k=1}^{r} \delta_k = \sum_{i} (\alpha\beta)_{ij} = \sum_{j} (\alpha\beta)_{ij} = 0 \qquad (17.5.22a)$$

and

$$\varepsilon_{ijk} \sim N(0, \sigma^2) \qquad (17.5.22b)$$
with all $\varepsilon_{ijk}$'s independent. We proceed first to test the interaction in the usual way; a numerical sketch follows below. If we do not reject the hypothesis that there is no interaction, we then proceed to test for main effects. Note that we have eliminated a source of variation, namely that due to blocks, by running a complete replication of the $ab$ treatments within each block. Examples of blocking situations that may be encountered are given in the settings of the relevant problems at the end of this chapter. The reader is encouraged to work out some of the problems of this nature.
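To carry out the interaction test numerically, the two sketches above can be combined; `scipy.stats.f` supplies the F tail probability. As before, this is our own illustrative code, not the text's.

```python
import numpy as np
from scipy.stats import f as f_dist

def test_interaction(y):
    """F test of H0: no A x B interaction in the blocked two-way
    layout (model 17.5.22), reusing the helpers sketched above."""
    a, b, r = y.shape
    _, _, ss_e, _ = blocked_two_way_prelim(y)   # error SS, Table 17.5.11
    _, _, ss_ab = partition_treatments(y)       # interaction SS, Table 17.5.12
    df_ab, df_e = (a - 1) * (b - 1), (a * b - 1) * (r - 1)
    F = (ss_ab / df_ab) / (ss_e / df_e)
    return F, f_dist.sf(F, df_ab, df_e)         # statistic and p-value
```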
17.5.4 Extending Two-Way Experimental Designs to n-Way Experimental Layouts
So far we have discussed experiments involving one or two factors. Now we consider more
general experiments, that is, experiments involving three or more factors. For example, in a
bread-making process, we may consider factors such as the type of flour, amount of yeast, type
of oil, amount of calcium propionate (a preservative), oven temperature, etc. As another
example, consider patients, their medication, duration of treatment, dosage, age, gender, and
other factors. Here we discuss briefly the analysis of a three-way experimental design. Extension
to n-way experimental designs (n > 3) can be done in a similar fashion.
The model for a three-way experimental design with r replications (factor A at a levels, factor B
at b levels, and factor C at c levels) is given by
$$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl} \qquad (17.5.23)$$

$i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, c$; $l = 1, 2, \ldots, r$, with side conditions

$$\sum_i \alpha_i = 0, \quad \sum_j \beta_j = 0, \quad \sum_k \gamma_k = 0, \quad \sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0,$$
$$\sum_i (\alpha\gamma)_{ik} = \sum_k (\alpha\gamma)_{ik} = 0, \quad \sum_j (\beta\gamma)_{jk} = \sum_k (\beta\gamma)_{jk} = 0,$$
$$\sum_i (\alpha\beta\gamma)_{ijk} = \sum_j (\alpha\beta\gamma)_{ijk} = \sum_k (\alpha\beta\gamma)_{ijk} = 0 \qquad (17.5.23a)$$

As usual, we assume that the $\varepsilon_{ijkl}$ are independent $N(0, \sigma^2)$.
Suppose we conduct a "three-way layout design of an experiment" with $r$ replications, each factor having two levels. Then the data obtained from such an experiment can be displayed as in Table 17.5.13. Note that each treatment $A_i B_j C_k$, $i = 1, 2$, $j = 1, 2$, $k = 1, 2$, is used (eight treatments in all). We say that A, B, and C are completely crossed in this experiment.
TABLE 17.5.13 Data from a three-way experimental design.

                    A_1                            A_2
          B_1            B_2             B_1            B_2
       C_1     C_2    C_1     C_2     C_1     C_2    C_1     C_2
       y_1111  y_1121 y_1211  y_1221  y_2111  y_2121 y_2211  y_2221
       y_1112  y_1122 y_1212  y_1222  y_2112  y_2122 y_2212  y_2222
         ⋮       ⋮      ⋮       ⋮       ⋮       ⋮      ⋮       ⋮
       y_111r  y_112r y_121r  y_122r  y_211r  y_212r y_221r  y_222r
Now, by minimizing the error sum of squares, that is, minimizing

$$Q = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{r} \left( y_{ijkl} - \mu - \alpha_i - \beta_j - \gamma_k - (\alpha\beta)_{ij} - (\alpha\gamma)_{ik} - (\beta\gamma)_{jk} - (\alpha\beta\gamma)_{ijk} \right)^2 \qquad (17.5.24)$$

subject to the conditions (17.5.23a) and solving the least squares normal equations, we obtain

$$SS_E = \min Q = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{r} (y_{ijkl} - \bar{y}_{ijk.})^2 \qquad (17.5.25)$$
and an unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{r} (Y_{ijkl} - \bar{Y}_{ijk.})^2}{abc(r-1)} = \frac{SS_E}{abc(r-1)} = MS_E \qquad (17.5.26)$$
Various estimators of the model parameters are given by (with obvious definitions of $\bar{y}_{i...}$, $\bar{y}_{.j..}$, ...)

$$\hat{\mu} = \bar{y}_{....}$$
$$\hat{\alpha}_i = \bar{y}_{i...} - \bar{y}_{....}, \quad \hat{\beta}_j = \bar{y}_{.j..} - \bar{y}_{....}, \quad \hat{\gamma}_k = \bar{y}_{..k.} - \bar{y}_{....}$$
$$\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....}, \quad \widehat{(\alpha\gamma)}_{ik} = \bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....}, \quad \widehat{(\beta\gamma)}_{jk} = \bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....}$$
$$\widehat{(\alpha\beta\gamma)}_{ijk} = \bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....} \qquad (17.5.27)$$
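The estimators in (17.5.27) are just marginal means. A short numpy sketch (our own, assuming the data are stored as `y[i, j, k, l]`); the remaining two-factor and three-factor terms follow the same pattern:

```python
import numpy as np

def three_way_effects(y):
    """Least squares estimates (17.5.27) for y[i, j, k, l]."""
    grand = y.mean()                          # ybar....
    ybar_i = y.mean(axis=(1, 2, 3))           # ybar_i...
    ybar_j = y.mean(axis=(0, 2, 3))           # ybar_.j..
    ybar_k = y.mean(axis=(0, 1, 3))           # ybar_..k.
    ybar_ij = y.mean(axis=(2, 3))             # ybar_ij..
    alpha = ybar_i - grand
    beta = ybar_j - grand
    gamma = ybar_k - grand
    ab = ybar_ij - ybar_i[:, None] - ybar_j[None, :] + grand
    return grand, alpha, beta, gamma, ab
```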
Now, proceeding as in the two-way experimental design with equal numbers of observations per cell, the ANOVA table for a three-way experimental design, when the $y_{ijkl}$'s are generated by the model (17.5.23) - (17.5.23a), is given in Table 17.5.14.
TABLE 17.5.14 ANOVA table for a three-way experimental design with $r$ (> 1) observations per cell.

Source   Degrees of freedom    Sum of squares                                                                 MS
A        $a-1$                 $SS_A = bcr\sum_i(\bar y_{i...}-\bar y_{....})^2$                               $SS_A/(a-1)$
B        $b-1$                 $SS_B = acr\sum_j(\bar y_{.j..}-\bar y_{....})^2$                               $SS_B/(b-1)$
C        $c-1$                 $SS_C = abr\sum_k(\bar y_{..k.}-\bar y_{....})^2$                               $SS_C/(c-1)$
AB       $(a-1)(b-1)$          $SS_{AB} = cr\sum_{ij}(\bar y_{ij..}-\bar y_{i...}-\bar y_{.j..}+\bar y_{....})^2$   $SS_{AB}/[(a-1)(b-1)]$
AC       $(a-1)(c-1)$          $SS_{AC} = br\sum_{ik}(\bar y_{i.k.}-\bar y_{i...}-\bar y_{..k.}+\bar y_{....})^2$   $SS_{AC}/[(a-1)(c-1)]$
BC       $(b-1)(c-1)$          $SS_{BC} = ar\sum_{jk}(\bar y_{.jk.}-\bar y_{.j..}-\bar y_{..k.}+\bar y_{....})^2$   $SS_{BC}/[(b-1)(c-1)]$
ABC      $(a-1)(b-1)(c-1)$     $SS_{ABC} = r\sum_{ijk}(\bar y_{ijk.}-\bar y_{ij..}-\bar y_{i.k.}-\bar y_{.jk.}+\bar y_{i...}+\bar y_{.j..}+\bar y_{..k.}-\bar y_{....})^2$   $SS_{ABC}/[(a-1)(b-1)(c-1)]$
Error    $abc(r-1)$            $SS_E = \sum_{ijkl}(y_{ijkl}-\bar y_{ijk.})^2$                                  $SS_E/[abc(r-1)]$
Total    $abcr-1$              $SS_{total} = \sum_{ijkl}(y_{ijkl}-\bar y_{....})^2$
If there is only one observation per cell, then we simply assume that the three-factor interaction is zero, use the corresponding degrees of freedom for the error sum of squares, and then estimate the error variance $\sigma^2$. The various sums of squares for $r > 1$ observations per cell in the ANOVA table are obtained as follows (where $T_{....} = G = \sum_i \sum_j \sum_k \sum_l y_{ijkl}$):
$$SS_A = \frac{1}{bcr} \sum_i T_{i...}^2 - \frac{T_{....}^2}{abcr}$$
$$SS_B = \frac{1}{acr} \sum_j T_{.j..}^2 - \frac{T_{....}^2}{abcr}$$
$$SS_C = \frac{1}{abr} \sum_k T_{..k.}^2 - \frac{T_{....}^2}{abcr}$$
$$SS_{AB} = \frac{1}{cr} \sum_i \sum_j T_{ij..}^2 - \frac{1}{bcr} \sum_i T_{i...}^2 - \frac{1}{acr} \sum_j T_{.j..}^2 + \frac{T_{....}^2}{abcr} \qquad (17.5.28)$$
$$SS_{AC} = \frac{1}{br} \sum_i \sum_k T_{i.k.}^2 - \frac{1}{bcr} \sum_i T_{i...}^2 - \frac{1}{abr} \sum_k T_{..k.}^2 + \frac{T_{....}^2}{abcr}$$
$$SS_{BC} = \frac{1}{ar} \sum_j \sum_k T_{.jk.}^2 - \frac{1}{acr} \sum_j T_{.j..}^2 - \frac{1}{abr} \sum_k T_{..k.}^2 + \frac{T_{....}^2}{abcr}$$
$$SS_{ABC} = \frac{1}{r} \sum_i \sum_j \sum_k T_{ijk.}^2 - \frac{1}{cr} \sum_i \sum_j T_{ij..}^2 - \frac{1}{br} \sum_i \sum_k T_{i.k.}^2 - \frac{1}{ar} \sum_j \sum_k T_{.jk.}^2 + \frac{1}{bcr} \sum_i T_{i...}^2 + \frac{1}{acr} \sum_j T_{.j..}^2 + \frac{1}{abr} \sum_k T_{..k.}^2 - \frac{T_{....}^2}{abcr}$$
$$SS_T = \sum_i \sum_j \sum_k \sum_l y_{ijkl}^2 - \frac{T_{....}^2}{abcr}$$
$$SS_E = SS_T - SS_A - SS_B - SS_C - SS_{AB} - SS_{AC} - SS_{BC} - SS_{ABC}, \quad \text{or} \quad SS_E = \sum_i \sum_j \sum_k \sum_l y_{ijkl}^2 - \frac{1}{r} \sum_i \sum_j \sum_k T_{ijk.}^2$$
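The totals formulas in (17.5.28) are also easy to vectorize. Here is a minimal sketch (ours, with the same assumed `y[i, j, k, l]` layout) that obtains $SS_{ABC}$ by subtraction and uses the direct formula for $SS_E$:

```python
import numpy as np

def three_way_anova_ss(y):
    """Sums of squares (17.5.28) for y[i, j, k, l]."""
    a, b, c, r = y.shape
    C = y.sum()**2 / (a * b * c * r)                       # T....^2 / abcr
    s = lambda axes, n: (y.sum(axis=axes)**2).sum() / n    # sum of squared totals / cell size
    ss_a = s((1, 2, 3), b * c * r) - C
    ss_b = s((0, 2, 3), a * c * r) - C
    ss_c = s((0, 1, 3), a * b * r) - C
    ss_ab = s((2, 3), c * r) - s((1, 2, 3), b * c * r) - s((0, 2, 3), a * c * r) + C
    ss_ac = s((1, 3), b * r) - s((1, 2, 3), b * c * r) - s((0, 1, 3), a * b * r) + C
    ss_bc = s((0, 3), a * r) - s((0, 2, 3), a * c * r) - s((0, 1, 3), a * b * r) + C
    ss_t = (y**2).sum() - C
    ss_e = (y**2).sum() - s((3,), r)                       # direct formula for SS_E
    ss_abc = ss_t - ss_a - ss_b - ss_c - ss_ab - ss_ac - ss_bc - ss_e
    return dict(A=ss_a, B=ss_b, C=ss_c, AB=ss_ab, AC=ss_ac, BC=ss_bc,
                ABC=ss_abc, E=ss_e, T=ss_t)

y = np.random.default_rng(0).normal(size=(2, 2, 2, 3))     # a = b = c = 2, r = 3
print(three_way_anova_ss(y))
```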