Document

advertisement
XII. Justifying the ANOVA-
based hypothesis test
XII.A The sources for an ANOVA
XII.B The sums of squares for an ANOVA
XII.C Degrees of freedom of the sums of
squares for an ANOVA
XII.D Expected mean squares for an
ANOVA
XII.E The distribution of the F statistics for
an ANOVA
XII.F Application of theory for ANOVAbased hypothesis test to an example
Statistical Modelling Chapter XI
1
Introduction
• In chapter VI we gave a procedure for
determining an ANOVA table.
• In this chapter we look at the theory that justifies
this procedure.
• In particular, examine basis of the SSqs, dfs,
E[MSq]s and the distribution of the F statistic.
Statistical Modelling Chapter XI
2
XII.A The sources for an ANOVA
• Various methods for determining the
sources to include in an ANOVA table.
• Perhaps most popular is to try to identify
the situation that is closest to yours in a
textbook.
• Big disadvantage is not exact match and
use wrong analysis.
• Our approach based on dividing the factors
into unrandomized and randomized factors
Statistical Modelling Chapter XI
3
Our approach
1. identify the unrandomized and randomized factors;
2. determine the structure formulae that describe the
crossing and nesting relations
a) amongst the unrandomized factors and
b) amongst the randomized factors (and in some cases between the
randomized and some unrandomized factors);
3. expand the structure formulae to obtain two sets of
sources;
4. form the lines in the ANOVA table by
a) listing the unrandomized sources in the table,
b) entering the randomized sources indented under the appropriate
unrandomized sources, and
c) adding Residuals for unrandomized sources where there are df
left over.
• This extracts the key features of the experiment arising
from the randomization and captures these in the structure
formulae for use in determining the terms in the analysis.
• The ANOVA table produced using this method displays
the confounding arising from the randomization,
something that traditional tables do not do.
4
Statistical Modelling Chapter XI
XII.B The SSqs for an ANOVA
• Derivation of SSqs, given the sources in the ANOVA, is based on the
S matrices.
• For a structure formula, one must identify the generalized factors
corresponding to the sources derived from that formula.
• For each generalized factor there is an S that specifies the units that
are related in having the same level of that generalized factor.
• For example, SBlocks has a one for every pair of units that occur in the
same block.
• Generally have two sets of relationship (S) or, equivalently, meanoperator matrices (M = (n/f)-1S = g-1S): one for the unrandomized
factors and one for the randomized factors.
(relationship matrices also called summation matrices)
• Now a set of such matrices forms the basis for an algebra that is
called a relationship algebra.
• Associated with a set of Ss is a set of Qs, again one Q for each
generalized factor, that are the idempotents of the relationship
algebra and provide the quadratic-form matrices for the analysis.
• Under certain conditions, met by all our examples, a set of Qs
derived from a structure formula forms a complete set of mutuallyorthogonal idempotents (CSMOI — see defn XI.10).
• So we form SSqs of the projections of the data vector into the
subspaces projected onto by the Qs.
Statistical Modelling Chapter XI
5
Obtaining expressions for Qs and Ms
• Rule VI.6 used to obtain expressions for Qs in terms of Ms.
• Following rule used to obtain expressions for the S (& Ms) as the
direct products of I and J matrices. (generalizes Rule XI.1)
Rule XII.1: The S for a generalized factor, formed as the subset of a set
of s equally-replicated factors that uniquely index the units,
– is the direct product of s matrices, provided the s factors are arranged in
standard order.
• Taking the s factors in the sequence specified by the standard order,
– for each factor in the generalized factor include an I matrix in the direct
product and
– J matrices for those that are not.
• The order of a matrix in the direct product is equal to the number of
levels of the corresponding factor.
• The M matrix is the same direct product of I and J matrices, except
that each J is multiplied by the reciprocal of its order.
• Rule applies to factors from the unrandomized structure.
Statistical Modelling Chapter XI
6
Ms are symmetric and idempotent
• Prove this useful result for those that can be expressed as
the direct product of I and J matrices in the next lemma.
Lemma XII.1: Let M be the direct product of I and J matrices,
the latter multiplied by the reciprocal of their order, as
prescribed in rule XII.1.
Then M is symmetric and idempotent.
Proof: Since (A  B)' = A'  B' and I and J are both
symmetric, M must also be symmetric.
• Since (A  B)(C  D) = (AC  BD)
and in particular (A  B)(A  B) = (A2  B2),
M2 must be the direct product of I and the square of J
matrices, each J multiplied by the reciprocal of its order.
But q1 Jq q1 Jq = 12 Jq Jq = 12 qJq = q1 Jq .
q
q
• Consequently, M2 and M will be same direct product of I
and J matrices, latter multiplied by the reciprocal of their
order.
Statistical Modelling Chapter XI
7
XII.C Degrees of freedom of the
SSqs for an ANOVA
• Definition XII.1: The degrees of freedom of a sum of
squares is the rank of the idempotent of its quadratic form.
That is the degrees of freedom of Y'QY is given by
rank(Q).
• The following definition and lemma establish that the trace
of an idempotent is the same as rank and hence its df.
• Definition XII.2: The trace of a square matrix is the sum of
its diagonal elements.
• Lemma XII.2: For B idempotent, trace(B) = rank(B).
• Proof: The proof is based on the following facts:
– the trace of a matrix is equal to the sum of its eigenvalues;
– the rank of a matrix is equal to the number of nonzero eigenvalues;
and
– the eigenvalues of an idempotent can be proved to be either 1 or
zero and so sum = number.
Statistical Modelling Chapter XI
8
Results on traces of matrices
• Clearly, the dfs can be established by
determining the trace of a matrix so some results
on the traces.
• Lemma XII.3: Let c be a scalar and A, B and C
be matrices. Then, when the appropriate
operations are defined, we have
–
–
–
–
–
–
–
trace(A') = trace(A)
trace(cA) = ctrace(A)
trace(A + B) = trace(A) + trace(B)
trace(AB) = trace(BA)
trace(ABC) = trace(CAB) = trace(BCA)
trace(A  B) = trace(A)  trace(B)
trace(A'A) = 0 if and only if A = 0.
Statistical Modelling Chapter XI
9
Trace of a Q
• Each Q matrix is a linear combination of M matrices so that
– the trace(Q) will be the same linear combination of the traces of the
Ms.
• In the next lemma we prove that the trace of an M is equal
to the no. of levels in its generalized factor,
– when corresponding summation matrix is direct product of I and Js.
Lemma XII.4: Let MF be a mean operator matrix for a
generalized factor F that has f levels each replicated n/f = g
times and SF be the corresponding summation matrix that
is direct product of I and Js. Then trace(MF) = f.
Proof:
• Now trace(MF) = g-1 trace(SF).
• But SF is the direct product of I and Js.
• However, trace(A  B) = trace(A)  trace(B)
and the traces of I and J matrices are equal to their orders.
• As the product of the orders of the I and Js for any S must
be n, trace(SF) = n
and so trace(MF) = g-1 trace(SF) = n/g = f.
Statistical Modelling Chapter XI
10
XII.D Expected mean squares for
an ANOVA
• As has been previously stated, the E[MSq]s are
just the average or mean value of the MSqs
under sampling from a population whose Y
variables behave as described by the linear
model.
– i.e. the true mean value of a Mean Square
– It depends on the model parameters
• To derive the expected values, we note that the
general form of an MSq is a quadratic form
divided by a df, Y'QY/n.
• So we first establish an expression for the
expectation of any quadratic form.
Statistical Modelling Chapter XI
11
Expectation of a quadratic form
• Theorem XII.1: Let Y be an n  1 vector of
random variables with
E[Y] = y and var[Y] = V,
where y is a n  1 vector of expected
values and V is an n  n matrix.
• Let A be an n  n matrix of real numbers.
• Then
E[Y'AY] = trace(AV) + y'Ay.
Statistical Modelling Chapter XI
12
Proof of theorem XII.1
• Firstly note that, as Y'AY is a scalar, Y'AY = trace(Y'AY)
• Also, recall that trace of matrix products are cyclically
commutative.
• Thus, E[Y'AY] = E[trace(Y'AY)] = E[trace(AYY')].
• Now, for an n  n matrix Z
n
n


E trace  Z  = E i =1 zii = i =1E  zii  = trace  E  Z 


• Also, lemma XI.5 (E[Y] results) states that E[AY] = AE[Y]
and this can be extended to E[AYY'] = AE[YY'].
• Hence, E[trace(AYY')] = trace(E[AYY']) = trace(AE[YY']).
• Now we require an expression for E[YY'] which we obtain
by considering definition I.7 of V = E[(Y - y) (Y - y)'].
 lemma I.1

• From this
V = E  Y - ψ  Y - ψ   

—
transposes


= E  YY - Yψ - ψY  ψψ
Statistical Modelling Chapter XI
= E  YY - E  Yψ - E ψY  E ψψ.
13
Proof of theorem XII.1 (continued)
• Now, E[y] = y as the elements of y are population
quantities and are constants with respect to expectation.
• Thus
E[Yy'] = E[Y]y' = yy' = E[yY].
• Hence, V = E[YY'] - E[Yy'] - E[yY] + yy' = E[YY'] - yy'
so that E[YY'] = V + yy'.
• This leads to
E  YAY  = trace  AE  YY
= trace  A  V  ψψ  
= trace  AV   trace  Aψψ 
= trace  AV   trace  ψAψ 
= trace  AV   ψAψ.
Statistical Modelling Chapter XI
14
Theorem for E[MSq]
• Theorem XII.2: Let Y be an n  1 vector of random
variables with
E[Y] = y and var[Y] = V,
where y is a n  1 vector of expected values and V is an
n  n matrix.
• Let Y'QY/n be the mean square where Q is an n  n
symmetric, idempotent matrix and n is the degrees of
freedom of the sums of squares.
• Then
E[Y'QY/n] = (trace(QV) + y'Qy/n.
• Proof: Since n is a constant, E[Y'QY/n] = E[Y'QY]/n and
the result follows straightforwardly using theorem XII.1.
• So to derive E[MSq] for a particular source under a
specific model, one substitutes
– the Q matrix for the source and
– the y and V for the model
into the expression E[Y'QY/n] = (trace(QV) + y'Qy/n
given by theorem XII.2.
Statistical Modelling Chapter XI
15
XII.E The distribution of the F
statistics for an ANOVA
•
•
•
Each F statistic in an ANOVA is used to assess whether
or not a null hypothesis is likely to be true by determining
if the value of our test statistic is unlikely when H0 is true.
To do this we need to establish the sampling distribution
of the test statistic when the null hypothesis is true.
Test statistic is of the following general form
Fn1, n 2
•
•
s12 YQ1Y n 1
= 2 =
s2 YQ 2Y n 2
It is the ratio of two MSqs each of which is a quadratic
form divided by its df.
We will in the following three theorems:
1. Give the distribution of a quadratic form.
2. Establish the relationship between several quadratic forms.
3. Obtain the distribution of the ratio of two independent quadratic
forms.
Statistical Modelling Chapter XI
16
Distribution of a quadratic form
• Theorem XII.3: Let A be an n  n symmetric matrix of
rank n and Y be an n  1 normally distributed random
vector with E[AY] = 0, var[Y] = V and E[Y'AY/n] = l.
• Then (1/ l)Y'AY follows a chi-squared distribution with n
degrees of freedom if and only if A is idempotent.
• Proof: not given
• The chi-square probability distribution function for the
random variable U is:

 n - 2  2 -u 2
1
2
n  u  = 
u
e
,
0u
n 2
   n / 2 2 
Probability
distribution
function
2

for 6
0
Statistical Modelling Chapter XI
5
10
15
U
20
25
30
17
Relationships between idempotents
•
•
•
Theorem XII.4: Let Y be an n  1 vector of random
variables with E[Y] = Xq and var[Y] = V.
Let Y'A1Y, Y'A2Y, …, Y'AhY be a collection of h quadratic
forms where, for each i = 1, 2, …, h,
– Ai is symmetric, of rank ni,
– E[AiY] = 0 and
– E[Y'AiY/ni] = li.
If any two of the following three statements are true (any
two implies the other),
1. All Ai are idempotent
h
2. i =1 Ai is idempotent
3. AiAj = 0, i  j
then not only does (1/li)Y'AiY, for each i, follows a chisquared distribution with ni degrees of freedom as
Theorem XII.3 establishes, but
•
– Y'AiY are independent for i  j and
h
– i =1n i = n where n denotes the rank of

h
A
i =1 i .
Proof: not given
Statistical Modelling Chapter XI
18
Distribution of the ratio of two
independent quadratic forms
• Theorem XII.5: Let U1 and U2 be two random variables
distributed as chi-squares with n1 and n2 degrees of
freedom.
• Then, provided U1 and U2, are independent, the random
variable
U n
W =
1
1
U2 n 2
is distributed as Snedecor's F with n1 and n2 degrees of
freedom.
• Proof: not given
Statistical Modelling Chapter XI
19
F distribution for W



Fn1,n 2 w  = 



r 2
1
 n 1  n 2   r1 

 
2

  r2 
n  n 
  1  2 
2  2


- n 1 n 2  2
 n1 - 2 2   n 1  
,
1   w 
w
n2  





0  w  .
Probability distribution function for F3,46
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
W
Statistical Modelling Chapter XI
20
XII.F Application of theory for
ANOVA-based hypothesis test
to an example
a) The sources for an ANOVA
b) The sums of squares for an ANOVA
c) Degrees of freedom of the sums of squares for
an ANOVA
d) Expected mean squares for an ANOVA
e) The distribution of the F statistics for an
ANOVA
Statistical Modelling Chapter XI
21
Example XII.1 Randomized Complete Block Design
a) The sources for an ANOVA
1. For the RCBD, the unrandomized factors are Blocks and Units
and the randomized factors are Treatments.
2. The unrandomized structure formula is b Blocks/t Units and the
randomized structure formula t Treatments.
3. The unrandomized sources are Blocks and Units[Blocks] and
the randomized source is Treatments.
4. The sources for the ANOVA table for this experiment are:
Source
Blocks
Units[Blocks]
Treatments
Residual
Total
Statistical Modelling Chapter XI
22
Example XII.1 Randomized Complete Block
Design (continued)
b) The sums of squares for an ANOVA
• Generalized factors
– unrandomized: (G), Blocks and BlocksUnits and
– randomized: (G) and Treatments.
• So Qs are
– unrandomized: QG, QB and QBU and
– randomized: QG and QT.
• Since each of these sets is derived from a structure
formula consisting of only nesting and crossing operators,
they each form a CSMOI.
Statistical Modelling Chapter XI
23
Example XII.1 Randomized Complete Block
Design (continued)
b) The sums of squares for an ANOVA (continued)
• The Hasse diagrams of generalized-factor marginalities,
that includes expressions for the Qs in terms of the Ms,
are as follows:
Unrandomized factors
Randomized factors


MG MG
MG MG
Blocks
MB
BlocksUnits
MBU
Statistical Modelling Chapter XI
B
MB-MG
Treatments
MT
T
MT-MG
U[B]
MBU-MB
24
Example XII.1 Randomized Complete Block
Design (continued)
b) The sums of squares for an ANOVA (continued)
• If the data is ordered in standard order for Blocks and
Treatments and if Units are given the same numbering as
Treatments (this doesn't affect the analysis), rule XII.1 (S,
M as direct products) yields:
– MBU = Ib  It, MB = Ib  1t Jt and MG = b1 Jb  1t Jt ;
– MT = b1 Jb  It and MG = b1 Jb  1t Jt .
• Latter requires inclusion of a factor for replicates, here
Blocks; this factor never occurs in its generalized factors.
• So the
Source
SSq
ANOVA
Blocks
YQBY = Y MB - MG  Y
table with
Units[Blocks] YQBUY = Y MBU - MB  Y
SSqs
added is:
Treatments YQTY = Y MT - MG  Y
Residual
YQBU Y = Y  QBU - Q T  Y
= Y  MBU - MB - MT  MG  Y
Res
Total
Statistical Modelling Chapter XI
YQUY = Y MBU - MG  Y
25
Example XII.1 Randomized Complete Block
Design (continued)
c) Degrees of freedom of the SSqs for an ANOVA
• The following theorem establishes that the degrees of
freedom of the sums of squares are as given in the
analysis of variance table.
• Theorem XII.6: Let quadratic forms for the SSqs be
YQUY = Y  MBU - MG  Y
YQBY = Y  MB - MG  Y
YQBUY = Y  MBU - MB  Y
YQ T Y = Y  MT - MG  Y
YQBURes Y = Y  QBU - Q T  Y = Y MBU - MB - MT  MG  Y
– where MBU = Ib  It, MB = Ib  1t Jt , MT = b1 Jb  It and MG = b1 Jb  1t Jt
• Their degrees of freedom are n-1, b-1, b(t-1), t-1 and
(b-1)(t-1), respectively,
– where n is the no. of observations, b is the no. of blocks and t is
the no. of treatments.
Statistical Modelling Chapter XI
26
Proof of theorem XII.6 (continued)
• First we establish that all Q matrices are
symmetric and idempotent so that we can utilise
lemma XII.2 to conclude that the ranks of the Q
matrices are equal to their traces.
• Now QB, QBU and QT are symmetric and
idempotent as they are in the complete sets of
mutually-orthogonal idempotents.
• It remains to demonstrate that QU and QBU
Res
are symmetric and idempotent.
Statistical Modelling Chapter XI
27
Proof of theorem XII.6 (continued)
• Firstly, from lemma I.1 (transpose properties), (cA + dB)'
= cA' + dB' and we have from lemma XII.1 that the Ms
are symmetric and idempotent so that
QU = MBU - MG  = MBU - MG = MBU - MG = QU
QUQU = MBU - MG MBU - MG 
= MBUMBU - MBUMG - MGMBU  MGMG.
• Now MBU = In so that
QUQU = MBUMBU - MBUMG - MGMBU  MGMG
= MBU - MG - MG  MG
= MBU - MG
= QU.
• That is QU is symmetric and idempotent.
Statistical Modelling Chapter XI
28
Proof of theorem XII.6 (continued)
• Secondly,
QBURes =  MBU - MB - MT  MG  = MBU - MB - MT  MG
as each M is symmetric.
• Further,
2
QBU
=  MBU - MB - MT  MG  MBU - MB - MT  MG 
Res
=  MBUMBU - MBUMB - MBUMT  MBUMG 
-  MBMBU - MBMB - MBMT  MBMG 
-  MTMBU - MTMB - MTMT  MTMG 
  MGMBU - MGMB - MGMT  MGMG 
=  MBU - MB - MT  MG 
-  MB - MB - MBMT  MBMG 
-  MT - MTMB - MT  MTMG 
  MG - MGMB - MGMT  MG  .
Statistical Modelling Chapter XI
29
Proof of theorem XII.6 (continued)
• Now
MBMT =  Ib  1t Jt  b1 Jb  It  = b1 Jb  1t Jt = MG = MTMB
MBMG =  Ib  1t Jt  b1 Jb  1t Jt  = b1 Jb  1t Jt = MG = MGMB
MTMG =  b1 Jb  It  b1 Jb  1t Jt  = b1 Jb  1t Jt = MG = MGMT
• and so
2
QBU
=  MBU - MB - MT  MG 
Res
-  MB - MB - MBMT  MBMG 
-  MT - MTMB - MT  MTMG 
  MG - MGMB - MGMT  MG 
=  MBU - MB - MT  MG 
-  MB - MB - MG  MG 
-  MT - MG - MT  MG 
  MG - MG - MG  MG 
= MBU - MB - MT  MG
Statistical Modelling Chapter XI
= QBURes .
30
Proof of theorem XII.6 (continued)
• Consequently lemma XII.2 (rank = trace) applies to the
Qs so that the ranks of all the Q matrices are equal to
their trace.
• We also note that using lemma XII.4 (trace(MF) = f) we
have that
trace MG  = 1 trace MB  = b
trace MBU  = bt = n.
trace MT  = t
• Since, from lemma XII.3 (traces)
trace cA  dB = c  trace  A   d  trace B
• we have that trace  QU  = trace  MBU - MG  = n - 1
trace  QB  = trace  MB - MG  = b - 1
trace  QBU  = trace  MBU - MB  = bt - b = b  t - 1
trace  Q T  = trace  MT - MG  = t - 1


trace QBURes = trace  MBU - MB - MT  MG 
Statistical Modelling Chapter XI
= bt - b - t  1 =  b - 1 t - 1 .
31
ANOVA table with degrees of
freedom added
Source
df
SSq
Blocks
b-1
YQB Y = Y MB - MG  Y
Units[Blocks]
b(t-1)
YQBUY = Y  MBU - MB  Y
Treatments
t-1
YQ T Y = Y  MT - MG  Y
Residual
(b-1)(t-1)
YQBURes Y = Y  QBU - Q T  Y
= Y  MBU - MB - MT  MG  Y
Total
Statistical Modelling Chapter XI
n-1
Y  QU  Y = Y MBU - MG  Y
32
d) Expected mean squares for an
ANOVA
• We now prove the E[MSq]s for the case of Blocks
random.
• But first a useful lemma.
• Lemma XII.5: Let
y = E  Y  = X T  = 1b  It  
MBU = Ib  It = In , MB = Ib  1t Jt , MT = b1 Jb  It , MG = b1 Jb  1t Jt
QB = MB - MG, Q T = MT - MG, QBURes = MBU - MB - MT  MG
• Then
QGX Τ = 1t 1b  Jt 
QB X Τ = 0
QBURes X Τ = 0.
and XΤQGX Τ = bt Jt
Statistical Modelling Chapter XI
33
Proof of lemma XII.5
• First we have
 Jb  It 1b  It 
= 1b  It 
MT X Τ =
 Jb  Jt 1b  It 
= 1t 1b  Jt 
MGX Τ =
1
bt
1
b
MBUXΤ = Ib  It 1b  It 
MBX Τ = 1t  Ib  Jt 1b  It 
= 1b  It 
= 1t 1b  Jt 
• so that
QGX Τ = MGX Τ
QBURes X Τ =  MBU - MB - MT  MG  X Τ
= 1t 1b  Jt 
QB X Τ =  MB - MG  X Τ
= 1t 1b  Jt  - 1t 1b  Jt 
=0
Statistical Modelling Chapter XI
= 1b  It  - 1t 1b  Jt  - 1b  It   1t 1b  Jt 
= 0.
Finally,
XΤQGX Τ = 1t 1b  It  1b  Jt  = bt Jt .
34
Theorem XII.7
• Let
y = E  Y  = X T  = 1b  It  
2
V =  BU
In   B2  Ib  Jt 
SSB = YQB Y = Y  MB - MG  Y
SST = YQ T Y = Y  MT - MG  Y
SSBURes = YQBURes Y
= Y  QBU - Q T  Y = Y  MBU - MB - MT  MG  Y.
• Then, the expected mean squares are
2
E SSB  b - 1  =  BU
 t B2
 
2
E SST  t - 1  =  BU
 qT y
2
 b - 1 t - 1 =  BU
2
t
t
where qT  y  =  j =1b  j -  .   t - 1 ,  . =  j =1 j
E SSBURes
•
t , j is the jth
element of the t-vector , b is the number of blocks and t is the
number of treatments.
Statistical Modelling Chapter XI
35
Proof of theorem XII.7
2
• First note that we can write V =  BU
MBU  t B2MB
• For E[SSB/(b- 1)], we use theorem XII.2 (E[Y'QY/n] =
(trace(QV) + y'Qy/n) and that it can be shown that
QBMBU = QB and QBMB = QB to obtain the following
expressions:
 SSB 
E
 = E  YQBY   b - 1
  b - 1 
trace Q  2 M  t 2M

B
BU
BU
B
B


=



  X T   QB  X T   
2
 BU
trace  QBMBU   t B2trace  QBMB  
=


  X T   QB  X T   

 
b - 1
b - 1
2
 BU
trace  QB   t B2trace  QB  
=



  X TQBX T  

b - 1
2
  BU

 t B2 trace  QB 
=

 XTQBX T  

b - 1.

Statistical Modelling Chapter XI


36
Proof of theorem XII.7 (continued)
• Now from theorem XII.6 (RCDD dfs) we have
that trace(QB) = (b- 1) and from lemma XII.5
QBXΤ = 0
• Hence, the expected mean square is




2
  BU

 t B2 trace  QB 
E SSB  b - 1  = 

 XTQBX T  

b - 1
 b - 1
2
=  BU
 t B2  b - 1  0
2
=  BU
 t B2.
 
2
• The proof that E SST  t - 1 =  BU
 qT y is
left as an exercise for you.
Statistical Modelling Chapter XI
37
Proof of theorem XII.7 (continued)
• Now, for E SSBURes  b - 1 t - 1, again we use
theorem XII.2 (E[MSq]s) and also that it can be shown
that
M Q
=Q
M =Q
BU
BURes
BURes
BU
BURes
MBQBURes = QBURes MB = 0
• so that we have
 b - 1 t - 1
= E  YQBU Y   b - 1 t - 1
E SSBURes

=
Res


2
trace QBURes  BU
MBU  t B2MB

=

 b - 1 t - 1




  X T   QBURes  X T  
.
2
 BU
trace QBURes  0  XTQBURes X T 
Statistical Modelling Chapter XI
 b - 1 t - 1
38
Proof of theorem XII.7 (continued)
• Now from theorem XII.6 (RCBD dfs) we have that
trace QBURes =  b - 1 t - 1


and from lemma XII.5 QBURes X Τ = 0 so that
E SSBURes

=
 b - 1 t - 1



2
 BU
trace QBURes  XTQBURes X T 

=
 b - 1 t - 1

2
 BU
 b - 1 t - 1  0
 b - 1 t - 1
2
=  BU
as claimed.
Statistical Modelling Chapter XI
39
ANOVA table with E[MSq]s added
df
Source
SSq
E[MSq]
Blocks
b-1
YQB Y = Y MB - MG  Y
Units[Blocks]
b(t-1)
YQBUY = Y  MBU - MB  Y
Treatments
t-1
YQ T Y = Y  MT - MG  Y
2
 qT  y 
BU
Residual
(b-1)(t-1)
YQBURes Y = Y  QBU - Q T  Y
2
 BU
2
 t B2
 BU
= Y  MBU - MB - MT  MG  Y
Total
n-1
Statistical Modelling Chapter XI
Y  QU  Y = Y MBU - MG  Y
40
e) The distribution of the F
statistics for an ANOVA
• We next derive the sampling distribution of the Fstatistics for testing treatment and block
differences.
• Note that the model
– under the null hypothesis of no treatment effects is
2
E[Y] = XG and V =  BU
In   B2  Ib  Jt 
– while under the null hypothesis of no block variation is
2
E[Y] = XT and V =  BU
In
• The following theorems involve establishing the
sampling distribution of the F-statistics under
these models.
Statistical Modelling Chapter XI
41
Theorem XII.8
• Let
ψ = E  Y  = X G  = 1b  1t 
2
V =  BU
In   B2  Ib  Jt 
sT2 = YQ T Y  t - 1
2
sBU
= YQBURes Y
Res
where
 b - 1 t - 1
Q T = MT - MG, QBURes = MBU - MB - MT  MG
MBU = Ib  It = In , MB = Ib  1t Jt , MT = b1 Jb  It , MG = b1 Jb  1t Jt
• Then, the ratio of these two mean squares, given by
F t -1,  b -1 t -1 =
sT2
2
sBU
Res
is distributed as a Snedecor's F with (t-1) and (b-1)(t-1)
degrees of freedom.
Statistical Modelling Chapter XI
42
Proof of theorem XII.8
•
We have to show that
1. E QTY  = E QBU Y  = 0 and
Res
E YQTY  t - 1 = E YQBURes Y
2
 b - 1t - 1 = BU
under the expectation and variation models above, so that
theorem XII.3 (distn Y'AY) can be invoked to conclude that
2
2
1  BU
YQBURes Y and 1  BU
YQTY
follow chi-square distributions;
2. the quadratic forms YQT Y and YQBURes Y are independent as in
theorem XII.4 (several Y'AiYs);
3. then theorem XII.5 (F distribution) can be invoked to obtain the
distribution of the F test statistic.

Statistical Modelling Chapter XI



43
Proof of theorem XII.8 (continued)
• Firstly,
E  Q T Y  = Q TE  Y 
= Q T XG
=  MT - MG  X G 
=  b1 Jb  It - b1 Jb  1t Jt  1b  1t 
= 1b  1t - 1b  1t  
=0
E QBURes Y  = QBURes E  Y 
= QBURes X G 
=  MBU - MB - MT  MG  X G 
=  Ib  It - Ib  1t Jt - b1 Jb  It - b1 Jb  1t Jt  1b  1t 
= 1b  1t - 1b  1t - 1b  1t  1b  1t  
Statistical Modelling Chapter XI
= 0.
44
Proof of theorem XII.8 (continued)
• Also, using theorem XII.2 (E[MSq]s) and
– as QTMBU = QTIn = QT
– QTMB = (MT - MG)MB = MG - MG = 0
(MBMT = MTMB = MBMG = MGMB = MGMT = MTMG = MG from the
proof of theorem XII.6 {RCBD dfs})
– QTXG = 0 (see above in this proof), and
– trace(QT) = t - 1 (from theorem XII.6 {RCBD dfs}),
• we have
 

trace Q  2 M  t 2M

T
BU
BU
B
B


E  YQ T Y  t - 1  = 



  X G   Q T  X G   

2
=  BU
 t - 1  0 t - 1
2
=  BU
trace  Q T   0   XGQ T X G 
t - 1
 t - 1
2
=  BU
.
Statistical Modelling Chapter XI
45
Proof
of
theorem
XII.8
(continued)
• Now, using theorem XII.2 (E[MSq]s), and as
– QBURes MBU = QBURes In = QBURes
– QBU MB = MBU - MB - MT  MG  MB
Res
= MB - MB - MG  MG
=0
(MBMT = MTMB = MBMG = MGMB = MGMT = MTMG = MG from the
proof of theorem XII.6 {RCBD dfs})
– QBU X G = 0 (see above in this proof), and
Res
– trace QBU
=  b - 1 t - 1 (from theorem XII.6 {RCBD dfs}),
Res
• we have E  YQBURes Y  b - 1 t - 1
2
2
trace Q


M

t

M
BU
BU
BU
B
B
Res


=
  b - 1 t - 1


  X G   QBURes  X G   









2
=  BU
trace QBURes  0   XGQBURes X G 

  b - 1t - 1
  b - 1t - 1
2
=  BU
 b - 1 t - 1  0
Statistical Modelling Chapter XI
2
=  BU
.
46
Proof of theorem XII.8 (continued)
• Secondly, to show that YQT Y and YQBURes Y are
independent quadratic forms we have to show that QT
and QBURes meet two of the three conditions outlined in
theorem XII.4 (several Y'AiYs).
• As outlined in the proof of theorem XII.6 (RCBD dfs), QT
and QBURes are idempotent so that condition 1 is met.
• For condition 3, we require that Q TQBURes = 0.
• Now,
Q TQBURes =  MT - MG MBU - MB - MT  MG 
=  MTMBU - MTMB - MTMT  MTMG 
-  MGMBU - MGMB - MGMT  MGMG 
=  MT - MG - MT  MG  - MG - MG - MG  MG 
= 0.
Statistical Modelling Chapter XI
47
Proof of theorem XII.8 (continued)
• Thirdly, theorem XII.5 (F distribution) means that, as




2
2
1  BU
YQTY and 1  BU
YQBURes Y
are distributed as independent chi-squares with degrees
of freedom (t - 1) and (b - 1)(t - 1), respectively, then
F=



2
1  BU
YQ T Y  t - 1
2
1  BU

YQ T Y  t - 1
=
YQBURes Y  b - 1 t - 1 YQBURes Y  b - 1 t - 1
follows an F distribution with (t - 1) and (b - 1)(t - 1)
degrees of freedom.
Statistical Modelling Chapter XI
48
Theorem XII.9
• Let
ψ = E  Y  = X T  = 1b  It  
2
V =  BU
In
sB2 = YQB Y  b - 1
2
sBU
= YQBURes Y
Res
• where
 b - 1 t - 1
QB = MB - MG, QBURes = MBU - MB - MT  MG
MBU = Ib  It = In , MB = Ib  1t Jt , MT = b1 Jb  It , MG = b1 Jb  1t Jt
• Then, the ratio of these two mean squares, given by
F b -1,  b -1 t -1 =
sB2
2
sBU
Res
is distributed as a Snedecor's F with (b-1) and (b-1)(t-1)
degrees of freedom.
• Proof: parallels that for the treatment differences.
Statistical Modelling Chapter XI
49
XII.G Exercises
• Ex. 12-1 requires the proofs of some properties
of Q and M matrices for the RCBD.
• Ex. 12-2 asks you to derive expressions for Q
and M matrices for a CRD and to prove that the
Residual operator is symmetric and derive its df.
• Ex. 12-3 involves the proof of the E[MSq] for the
RCBD.
• Ex. 12-4 asks you to prove that the ratio of the
Rows and Residual Msq for a Latin square is
distributed as Snedecor's F under the null
hypothesis.
Statistical Modelling Chapter XI
50
Download