17.2.1 Estimable Functions

In this section we discuss briefly the theory of estimable functions (Bose (1944)), which is important for a deeper understanding of certain topics that arise in the analysis of variance.

Definition 17.2.1 A parametric function $\psi(\beta_1, \ldots, \beta_m)$ is said to be estimable if there exists some function $\varphi(Y_1, \ldots, Y_n)$ of the random variables $Y_1, \ldots, Y_n$ such that

$$E(\varphi) = \psi \qquad (17.2.8)$$

Definition 17.2.2 A linear function $\mathbf{t}'\boldsymbol{\beta}$ of the parameters $\beta_1, \ldots, \beta_m$ is said to be linearly estimable if there exists a linear function $\mathbf{a}'\mathbf{Y}$ of the random vector $\mathbf{Y}$ such that, for all $\beta_1, \ldots, \beta_m$,

$$E(\mathbf{a}'\mathbf{Y}) = \mathbf{t}'\boldsymbol{\beta} \qquad (17.2.9)$$

that is, $\mathbf{a}'\mathbf{Y}$ is an unbiased estimator of $\mathbf{t}'\boldsymbol{\beta}$.

We now discuss various important theorems concerning estimable functions. The proofs of these theorems and their corollaries are optional.

Theorem 17.2.1 Suppose we are dealing with the model (17.2.2). Then a necessary and sufficient condition for a linear parametric function $\mathbf{t}'\boldsymbol{\beta}$ to be linearly estimable is that

$$\operatorname{rank}(\mathbf{X}') = \operatorname{rank}(\mathbf{X}' : \mathbf{t})$$

where $(\mathbf{X}' : \mathbf{t})$ is the matrix obtained from $\mathbf{X}'$ by adjoining the column vector $\mathbf{t}$.

Proof: Let $\mathbf{a}' = (a_1, \ldots, a_n)$ be any $(1 \times n)$ row vector. Then, considering the underlying model $E(\mathbf{Y}) = \mathbf{X}\boldsymbol{\beta}$, the expectation of the linear function $\mathbf{a}'\mathbf{Y}$ can be written as

$$E(\mathbf{a}'\mathbf{Y}) = \mathbf{a}'\mathbf{X}\boldsymbol{\beta} \quad \text{for all } \boldsymbol{\beta} \qquad (17.2.10)$$

Now the linear function $\mathbf{a}'\mathbf{Y}$ is an unbiased estimator of the linear function $\mathbf{t}'\boldsymbol{\beta}$ of the parameters if

$$E(\mathbf{a}'\mathbf{Y}) = \mathbf{t}'\boldsymbol{\beta} \qquad (17.2.11)$$

Obviously (17.2.10) and (17.2.11) are true for all $\boldsymbol{\beta}$ if and only if

$$\mathbf{t}' = \mathbf{a}'\mathbf{X}, \quad \text{or equivalently} \quad \mathbf{X}'\mathbf{a} = \mathbf{t} \qquad (17.2.12)$$

that is, if and only if there exists a solution for the unknown vector $\mathbf{a}$, which is true if and only if

$$\operatorname{rank}(\mathbf{X}') = \operatorname{rank}(\mathbf{X}' : \mathbf{t})$$

This completes the proof of Theorem 17.2.1.

Corollary 1 The linear parametric function $\mathbf{t}'\boldsymbol{\beta}$ is linearly estimable if and only if there exists a solution for $\boldsymbol{\rho}$ in

$$\mathbf{X}'\mathbf{X}\boldsymbol{\rho} = \mathbf{t}$$

Proof: The result follows immediately by using the fact that $\operatorname{rank}(\mathbf{X}') = \operatorname{rank}(\mathbf{X}' : \mathbf{t})$ if and only if $\operatorname{rank}(\mathbf{X}'\mathbf{X}) = \operatorname{rank}(\mathbf{X}'\mathbf{X} : \mathbf{t})$.

Corollary 2 If the matrix $\mathbf{X}$ $(n \times m)$, where $m \le n$, is of full rank $m$, then every linear parametric function $\mathbf{t}'\boldsymbol{\beta}$ is linearly estimable.

Proof: $\operatorname{rank}(\mathbf{X}' : \mathbf{t}) \ge \operatorname{rank}(\mathbf{X}') = m$. But $\operatorname{rank}(\mathbf{X}' : \mathbf{t})$ cannot exceed $m$, since $(\mathbf{X}' : \mathbf{t})$ is an $m \times (n+1)$ matrix with $m \le n$. Hence $\operatorname{rank}(\mathbf{X}' : \mathbf{t}) = \operatorname{rank}(\mathbf{X}')$.

Example 17.2.2 (Example 17.2.1 Revisited) Consider the experiment in Example 17.2.1 in which four observations are made, two on each of the two treatments. Our model for this experiment is

$$y_{ij} = \mu + \tau_j + e_{ij}, \qquad i = 1, 2; \; j = 1, 2$$

We have $\boldsymbol{\beta}' = (\mu, \tau_1, \tau_2)$, and the transpose of the design matrix $\mathbf{X}$ is

$$\mathbf{X}' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$

which is of rank 2. We show that the linear parametric function $\tau_1 - \tau_2$ is linearly estimable. The function $\tau_1 - \tau_2$ can be written as $\mathbf{t}'\boldsymbol{\beta}$, where $\mathbf{t}' = (0, 1, -1)$. Now, the rank of

$$(\mathbf{X}' : \mathbf{t}) = \left(\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & -1 \end{array}\right)$$

is again 2, since the sum of the last two rows equals the first row. This proves that $\tau_1 - \tau_2$ is a linearly estimable function (see Theorem 17.2.1).
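The rank condition of Theorem 17.2.1 is easy to check numerically. A minimal sketch for Example 17.2.2, assuming Python with numpy (neither is part of the text), is:

```python
import numpy as np

# Design matrix X (4 x 3) of Example 17.2.2; columns correspond to mu, tau1, tau2.
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]])
t = np.array([0, 1, -1])            # coefficient vector of tau1 - tau2

Xt = X.T                            # X' is 3 x 4
aug = np.column_stack([Xt, t])      # (X' : t), X' with t adjoined as an extra column

print(np.linalg.matrix_rank(Xt))    # 2
print(np.linalg.matrix_rank(aug))   # 2 -> ranks agree, so tau1 - tau2 is estimable
```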
Theorem 17.2.2 Any linear combination of estimable functions is estimable.

Proof: Let $\mathbf{t}_1'\boldsymbol{\beta}, \ldots, \mathbf{t}_r'\boldsymbol{\beta}$ be estimable functions, where each $\mathbf{t}_j$ is $m \times 1$ and $\boldsymbol{\beta}$ is $m \times 1$, and let $\lambda_1, \ldots, \lambda_r$ be some constants. Then we must show that $\mathbf{t}'\boldsymbol{\beta}$ is an estimable function, where

$$\mathbf{t} = \sum_{j=1}^{r} \lambda_j \mathbf{t}_j$$

Since the functions $\mathbf{t}_1'\boldsymbol{\beta}, \ldots, \mathbf{t}_r'\boldsymbol{\beta}$ are estimable, there exist $(n \times 1)$ vectors $\mathbf{a}_1, \ldots, \mathbf{a}_r$ such that

$$\mathbf{X}'\mathbf{a}_i = \mathbf{t}_i, \qquad i = 1, 2, \ldots, r \qquad (17.2.13)$$

Whenever (17.2.13) holds, we have $\mathbf{X}'\mathbf{a} = \mathbf{t}$, where $\mathbf{a} = \sum_i \lambda_i \mathbf{a}_i$ and $\mathbf{t} = \sum_i \lambda_i \mathbf{t}_i$. This completes the proof of the theorem.

Now it is quite interesting to investigate the number of linearly independent estimable functions. In order to see this, we need the following definition.

Definition 17.2.3 The estimable functions $\mathbf{t}_1'\boldsymbol{\beta}, \ldots, \mathbf{t}_r'\boldsymbol{\beta}$ are said to be linearly independent if there exist $\mathbf{a}_1, \ldots, \mathbf{a}_r$ such that $\mathbf{X}'\mathbf{a}_i = \mathbf{t}_i$, $i = 1, \ldots, r$, and if the vectors $\mathbf{t}_1, \ldots, \mathbf{t}_r$ are linearly independent.

Now, using Definition 17.2.3, one can easily show that if $\operatorname{rank}(\mathbf{X}) = m_0$, then there are exactly $m_0$ linearly independent estimable functions. Thus, we have the following result.

Theorem 17.2.3 The maximum number of linearly independent estimable functions is exactly equal to the rank of the design matrix $\mathbf{X}$.

For instance, in Example 17.2.2, since the rank of the design matrix $\mathbf{X}$ is 2, there are exactly two linearly independent estimable functions. One such set of two linearly independent estimable functions is $\mu + \tau_1$ and $\mu + \tau_2$. Further, if we use Theorem 17.2.2 and the fact that $\mu + \tau_1$ and $\mu + \tau_2$ are estimable functions, it follows immediately that the function $\tau_1 - \tau_2 = (\mu + \tau_1) - (\mu + \tau_2)$ is also estimable, which is the result of Example 17.2.2.

We come now to an important historic theorem, which will be used in subsequent sections.

Theorem 17.2.4 (Gauss–Markov Theorem) Suppose in model (17.2.2) $\mathbf{t}'\boldsymbol{\beta}$ is an estimable function. Then the best linear unbiased estimator (BLUE) of $\mathbf{t}'\boldsymbol{\beta}$ is $\mathbf{t}'\boldsymbol{\beta}^*$, where $\boldsymbol{\beta}^*$ is any solution of the least squares normal equations $\mathbf{X}'\mathbf{X}\boldsymbol{\beta}^* = \mathbf{X}'\mathbf{Y}$.

Proof: Since $\mathbf{t}'\boldsymbol{\beta}$ is an estimable function, it follows from Corollary 1 of Theorem 17.2.1 that $\mathbf{t} = \mathbf{X}'\mathbf{X}\boldsymbol{\rho}$ for some $\boldsymbol{\rho}$, so that

$$\mathbf{t}' = \boldsymbol{\rho}'\mathbf{X}'\mathbf{X} \qquad (17.2.14)$$

and

$$\mathbf{t}'\boldsymbol{\beta}^* = \boldsymbol{\rho}'\mathbf{X}'\mathbf{X}\boldsymbol{\beta}^* = \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y} \qquad (17.2.14a)$$

since $\mathbf{X}'\mathbf{X}\boldsymbol{\beta}^* = \mathbf{X}'\mathbf{Y}$. Thus, $\mathbf{t}'\boldsymbol{\beta}^*$ is a linear function of $\mathbf{Y}$. Further, we have from (17.2.14) and (17.2.14a) that

$$E(\mathbf{t}'\boldsymbol{\beta}^*) = E(\boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) = \boldsymbol{\rho}'\mathbf{X}'E(\mathbf{Y}) = \boldsymbol{\rho}'\mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \mathbf{t}'\boldsymbol{\beta}$$

Hence, $\mathbf{t}'\boldsymbol{\beta}^*$ is an unbiased estimator of $\mathbf{t}'\boldsymbol{\beta}$. Now, to complete the proof of the theorem, we show that, among all the linear unbiased estimators of $\mathbf{t}'\boldsymbol{\beta}$, $\mathbf{t}'\boldsymbol{\beta}^*$ is the best in the sense that it has minimum variance. Let $\mathbf{b}'\mathbf{Y}$ be an arbitrary linear function of $\mathbf{Y}$ such that $E(\mathbf{b}'\mathbf{Y}) = \mathbf{t}'\boldsymbol{\beta}$. We then have

$$E(\mathbf{b}'\mathbf{Y}) = \mathbf{b}'E(\mathbf{Y}) = \mathbf{b}'\mathbf{X}\boldsymbol{\beta}$$

that is, $\mathbf{b}'\mathbf{X}\boldsymbol{\beta} = \mathbf{t}'\boldsymbol{\beta}$ for all $\boldsymbol{\beta}$, so that

$$\mathbf{b}'\mathbf{X} = \mathbf{t}' \qquad (17.2.15)$$

Thus, $\operatorname{Var}(\mathbf{b}'\mathbf{Y}) = \operatorname{Var}(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y} + \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y})$, or

$$\operatorname{Var}(\mathbf{b}'\mathbf{Y}) = \operatorname{Var}(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) + \operatorname{Var}(\boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) + 2\operatorname{Cov}\bigl[(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}), \; \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}\bigr] \qquad (17.2.16)$$

We wish now to evaluate the last term of (17.2.16). We first notice that

$$E(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) = (\mathbf{b}' - \boldsymbol{\rho}'\mathbf{X}')E(\mathbf{Y}) = (\mathbf{b}' - \boldsymbol{\rho}'\mathbf{X}')\mathbf{X}\boldsymbol{\beta}$$

so that $(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) - E(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y})$ can be written as

$$(\mathbf{b}' - \boldsymbol{\rho}'\mathbf{X}')(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta}) \qquad (17.2.17)$$

Similarly, we have that

$$\boldsymbol{\rho}'\mathbf{X}'\mathbf{Y} - E(\boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) = \boldsymbol{\rho}'\mathbf{X}'(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta}) \qquad (17.2.17a)$$

Now the last term of (17.2.16), by the definition of covariance, can be written as

$$2\operatorname{Cov}\bigl[(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}), \; \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}\bigr] = 2E\bigl[(\mathbf{b}' - \boldsymbol{\rho}'\mathbf{X}')(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})'\mathbf{X}\boldsymbol{\rho}\bigr]$$
$$= 2(\mathbf{b}' - \boldsymbol{\rho}'\mathbf{X}')E\bigl[(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})'\bigr]\mathbf{X}\boldsymbol{\rho} = 2(\mathbf{b}' - \boldsymbol{\rho}'\mathbf{X}')\bigl[\sigma^2 \mathbf{I}_n\bigr]\mathbf{X}\boldsymbol{\rho} \qquad (17.2.18)$$

since we are assuming that model (17.2.2) holds, which states that the $\varepsilon_i$'s are uncorrelated with constant variance $\sigma^2$, so that the variance-covariance matrix of the $Y_i$'s is $\sigma^2 \mathbf{I}_n$ ($\mathbf{I}_n$ is the $(n \times n)$ identity matrix). Hence, from (17.2.18), we have

$$2\operatorname{Cov}\bigl[(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}), \; \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}\bigr] = 2\sigma^2(\mathbf{b}'\mathbf{X}\boldsymbol{\rho} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{X}\boldsymbol{\rho})$$

and from (17.2.14) and (17.2.15) we have that

$$2\operatorname{Cov}\bigl[(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}), \; \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}\bigr] = 2\sigma^2(\mathbf{t}'\boldsymbol{\rho} - \mathbf{t}'\boldsymbol{\rho}) = 0 \qquad (17.2.19)$$

We may now write (17.2.16) as

$$\operatorname{Var}(\mathbf{b}'\mathbf{Y}) = \operatorname{Var}(\mathbf{b}'\mathbf{Y} - \boldsymbol{\rho}'\mathbf{X}'\mathbf{Y}) + \operatorname{Var}(\mathbf{t}'\boldsymbol{\beta}^*)$$

since (17.2.14a) and (17.2.19) hold. Thus

$$\operatorname{Var}(\mathbf{b}'\mathbf{Y}) \ge \operatorname{Var}(\mathbf{t}'\boldsymbol{\beta}^*)$$

which completes the proof of the theorem.
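As a numerical illustration of the Gauss–Markov theorem, the following sketch computes $\mathbf{t}'\boldsymbol{\beta}^*$ for the design of Example 17.2.2 from one particular solution of the normal equations. Python and numpy are assumed, and the response values are made up purely for illustration; for an estimable $\mathbf{t}'\boldsymbol{\beta}$, the value of $\mathbf{t}'\boldsymbol{\beta}^*$ does not depend on which solution is used.

```python
import numpy as np

X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)
y = np.array([10.0, 12.0, 15.0, 17.0])   # illustrative responses (not from the text)
t = np.array([0.0, 1.0, -1.0])           # estimable function tau1 - tau2

# X'X is singular here, so the normal equations X'X beta = X'y have many
# solutions; the Moore-Penrose pseudoinverse picks one of them.
beta_star = np.linalg.pinv(X.T @ X) @ (X.T @ y)

# t'beta* is the BLUE of tau1 - tau2: the difference of the two treatment means.
print(t @ beta_star)                     # about -5.0 = (10 + 12)/2 - (15 + 17)/2
```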
17.5.3 Blocking in Two-Way Experimental Layouts

In Section 17.4, we discussed the use of a randomized block design to eliminate the effect of a nuisance variable in a one-way experimental design. Quite often we confront a similar situation when using a two-way experimental design: again, we need to eliminate the effect of a nuisance variable. For example, suppose that we use a two-way experimental design for a two-factor factorial experiment with $r$ replications, where the total number of observations to be generated is $a \times b \times r$. But on a given day we may only be able to complete one replication, and the experiment may be affected by weather conditions that vary daily. Thus, in this example days constitute a nuisance variable and are therefore treated as blocks.

Thus, we wish to conduct an experiment using the $a \times b$ treatment combinations of $a$ levels of factor A and $b$ levels of factor B in $r$ blocks, with each block containing the $ab$ combinations of the $A_i$'s with the $B_j$'s. We may tabulate the observations $y_{ijk}$ obtained using the $i$th level of A and the $j$th level of B in the $k$th block, as in Table 17.5.9.

TABLE 17.5.9 Observations $y_{ijk}$ obtained from $r$ blocks (one such two-way table for each block $k = 1, \ldots, r$).

    Block $k$    $B_1$       ...  $B_j$       ...  $B_b$       Total
    $A_1$        $y_{11k}$   ...  $y_{1jk}$   ...  $y_{1bk}$   $T_{1.k}$
      :
    $A_i$        $y_{i1k}$   ...  $y_{ijk}$   ...  $y_{ibk}$   $T_{i.k}$
      :
    $A_a$        $y_{a1k}$   ...  $y_{ajk}$   ...  $y_{abk}$   $T_{a.k}$
    Total        $T_{.1k}$   ...  $T_{.jk}$   ...  $T_{.bk}$   $T_{..k}$

A subsidiary totals table is also constructed, as in Table 17.5.10, where $T_{ij.} = \sum_{k=1}^{r} y_{ijk}$.

TABLE 17.5.10 Totals table for the data in Table 17.5.9.

                 $B_1$       ...  $B_j$       ...  $B_b$       Total
    $A_1$        $T_{11.}$   ...  $T_{1j.}$   ...  $T_{1b.}$   $T_{1..}$
      :
    $A_i$        $T_{i1.}$   ...  $T_{ij.}$   ...  $T_{ib.}$   $T_{i..}$
      :
    $A_a$        $T_{a1.}$   ...  $T_{aj.}$   ...  $T_{ab.}$   $T_{a..}$
    Total        $T_{.1.}$   ...  $T_{.j.}$   ...  $T_{.b.}$   $T_{...} = G$

The reader is again cautioned to take Table 17.5.10 only as an aid in analysis and not to forget that we are talking about an experiment that involves $ab$ treatments run in each of $r$ blocks. This is not to be confused with an experiment with $abr$ treatments without any blocking. Initially, in our analysis, we may remove the source due to blocks (see Table 17.5.11); then, using Table 17.5.10, extract the sum of squares due to treatments; and, after computing the total sum of squares, find the error sum of squares by subtraction. Now let

$$T_{..k} = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ijk} = \text{block total of the observations in the $k$th block,}$$
$$T_{i.k} = \sum_{j=1}^{b} y_{ijk} = \text{sum of the observations taken using $A_i$ in the $k$th block, etc.} \qquad (17.5.21)$$

TABLE 17.5.11 Preliminary ANOVA table for the data in Table 17.5.9.

    Source        Degrees of freedom    Sum of squares
    Blocks        $r-1$                 $SS_{bl} = \dfrac{1}{ab}\sum_{k=1}^{r} T_{..k}^2 - \dfrac{G^2}{abr}$
    Treatments    $ab-1$                $SS_{treat} = \dfrac{1}{r}\sum_{i=1}^{a}\sum_{j=1}^{b} T_{ij.}^2 - \dfrac{G^2}{abr}$
    Error         $(ab-1)(r-1)$         $SS_E = SS_{total} - (SS_{bl} + SS_{treat})$
    Total         $abr-1$               $SS_{total} = \sum_{i,j,k} y_{ijk}^2 - G^2/abr$

Now it is quite easy to partition the "treatments" line in the usual way; there are $(a-1)$ degrees of freedom for $SS_A$, $(b-1)$ degrees of freedom for $SS_B$, and $(a-1)(b-1)$ degrees of freedom for the interaction sum of squares $SS_{AB}$ (see Table 17.5.12). In practice, we compute $SS_{AB}$ by subtraction (see Equation 17.5.13a).

TABLE 17.5.12 Partitioning of the treatment sum of squares.

    Source          Degrees of freedom    Sum of squares
    A               $a-1$                 $SS_A = \dfrac{1}{br}\sum_{i=1}^{a} T_{i..}^2 - \dfrac{G^2}{abr}$
    B               $b-1$                 $SS_B = \dfrac{1}{ar}\sum_{j=1}^{b} T_{.j.}^2 - \dfrac{G^2}{abr}$
    A $\times$ B    $(a-1)(b-1)$          $SS_{AB} = SS_{treat} - (SS_A + SS_B)$
    Treatments      $ab-1$                $SS_{treat} = \dfrac{1}{r}\sum_{i=1}^{a}\sum_{j=1}^{b} T_{ij.}^2 - \dfrac{G^2}{abr}$

As is usual in these situations, the blocking is assumed not to interact with the factors A and B, so that we use the model (in the usual notation)

$$y_{ijk} = \mu + \alpha_i + \beta_j + \rho_k + (\alpha\beta)_{ij} + \varepsilon_{ijk} \qquad (17.5.22)$$

with

$$\sum_{i=1}^{a}\alpha_i = \sum_{j=1}^{b}\beta_j = \sum_{k=1}^{r}\rho_k = \sum_{i}(\alpha\beta)_{ij} = \sum_{j}(\alpha\beta)_{ij} = 0 \qquad (17.5.22a)$$

and

$$\varepsilon_{ijk} \sim N(0, \sigma^2) \qquad (17.5.22b)$$

with all the $\varepsilon_{ijk}$'s independent. We proceed first to test the interaction in the usual way. If we do not reject the hypothesis that there is no interaction, we then proceed to test for the main effects. Note that we have eliminated a source of variation, namely that due to blocks, by running a complete replication of the $ab$ treatments within each block. Examples of blocking situations that may be encountered are given in the settings of the relevant problems at the end of this chapter. The reader is encouraged to work out some of the problems of this nature.
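For readers who want to carry out the computations of Tables 17.5.11 and 17.5.12 by machine, here is a minimal sketch, assuming Python with numpy and data stored as an $a \times b \times r$ array; the function name two_way_block_anova is ours, not the text's.

```python
import numpy as np

def two_way_block_anova(y):
    """Sums of squares of Tables 17.5.11-17.5.12; y[i, j, k] = y_{ijk}."""
    a, b, r = y.shape
    G = y.sum()
    CF = G**2 / (a * b * r)                     # correction factor G^2 / (abr)

    SS_total = (y**2).sum() - CF
    SS_blocks = (y.sum(axis=(0, 1))**2).sum() / (a * b) - CF
    T_ij = y.sum(axis=2)                        # treatment totals T_{ij.}
    SS_treat = (T_ij**2).sum() / r - CF
    SS_A = (y.sum(axis=(1, 2))**2).sum() / (b * r) - CF
    SS_B = (y.sum(axis=(0, 2))**2).sum() / (a * r) - CF
    SS_AB = SS_treat - SS_A - SS_B              # interaction by subtraction
    SS_E = SS_total - SS_blocks - SS_treat      # error by subtraction
    return dict(blocks=SS_blocks, A=SS_A, B=SS_B, AB=SS_AB,
                error=SS_E, total=SS_total)
```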
17.5.4 Extending Two-Way Experimental Designs to n-Way Experimental Layouts

So far we have discussed experiments involving one or two factors. Now we consider more general experiments, that is, experiments involving three or more factors. For example, in a bread-making process, we may consider factors such as the type of flour, the amount of yeast, the type of oil, the amount of calcium propionate (a preservative), the oven temperature, and so on. As another example, consider patients, their medication, duration of treatment, dosage, age, gender, and other factors. Here we discuss briefly the analysis of a three-way experimental design. Extension to n-way experimental designs (n > 3) can be done in a similar fashion.

The model for a three-way experimental design with $r$ replications (factor A at $a$ levels, factor B at $b$ levels, and factor C at $c$ levels) is given by

$$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$$
$$i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, b; \quad k = 1, 2, \ldots, c; \quad l = 1, 2, \ldots, r \qquad (17.5.23)$$

with the side conditions

$$\sum_{i}\alpha_i = \sum_{j}\beta_j = \sum_{k}\gamma_k = 0, \quad \sum_{i}(\alpha\beta)_{ij} = \sum_{j}(\alpha\beta)_{ij} = 0, \quad \sum_{i}(\alpha\gamma)_{ik} = \sum_{k}(\alpha\gamma)_{ik} = 0,$$
$$\sum_{j}(\beta\gamma)_{jk} = \sum_{k}(\beta\gamma)_{jk} = 0, \quad \sum_{i}(\alpha\beta\gamma)_{ijk} = \sum_{j}(\alpha\beta\gamma)_{ijk} = \sum_{k}(\alpha\beta\gamma)_{ijk} = 0 \qquad (17.5.23a)$$

As usual, we assume that the $\varepsilon_{ijkl}$ are independent $N(0, \sigma^2)$.

Suppose we conduct a three-way experimental design with $r$ replications, each factor having two levels. Then the data obtained from such an experiment can be displayed as in Table 17.5.13. Note that each treatment combination $A_i B_j C_k$, $i = 1, 2$, $j = 1, 2$, $k = 1, 2$, is used (eight treatments). We say that A, B, and C are completely crossed in this experiment.

TABLE 17.5.13 Data from a three-way experimental design.

                      $A_1$                                  $A_2$
              $B_1$            $B_2$                 $B_1$            $B_2$
           $C_1$   $C_2$    $C_1$   $C_2$         $C_1$   $C_2$    $C_1$   $C_2$
           y1111   y1121    y1211   y1221         y2111   y2121    y2211   y2221
           y1112   y1122    y1212   y1222         y2112   y2122    y2212   y2222
             :       :        :       :             :       :        :       :
           y111r   y112r    y121r   y122r         y211r   y212r    y221r   y222r

Now, by minimizing the error sum of squares, that is, minimizing

$$Q = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{l=1}^{r}\bigl(y_{ijkl} - \mu - \alpha_i - \beta_j - \gamma_k - (\alpha\beta)_{ij} - (\alpha\gamma)_{ik} - (\beta\gamma)_{jk} - (\alpha\beta\gamma)_{ijk}\bigr)^2 \qquad (17.5.24)$$

subject to the conditions (17.5.23a) and solving the least squares normal equations, we obtain

$$SS_E = \min Q = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{l=1}^{r}(y_{ijkl} - \bar{y}_{ijk.})^2 \qquad (17.5.25)$$

and an unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{l=1}^{r}(Y_{ijkl} - \bar{Y}_{ijk.})^2}{abc(r-1)} = \frac{SS_E}{abc(r-1)} = MS_E \qquad (17.5.26)$$

The various estimators of the model parameters are given by (with the obvious definitions of $\bar{y}_{i...}$, $\bar{y}_{.j..}$, …)

$$\hat{\mu} = \bar{y}_{....}, \quad \hat{\alpha}_i = \bar{y}_{i...} - \bar{y}_{....}, \quad \hat{\beta}_j = \bar{y}_{.j..} - \bar{y}_{....}, \quad \hat{\gamma}_k = \bar{y}_{..k.} - \bar{y}_{....}$$
$$\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....}, \quad \widehat{(\alpha\gamma)}_{ik} = \bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....}, \quad \widehat{(\beta\gamma)}_{jk} = \bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....}$$
$$\widehat{(\alpha\beta\gamma)}_{ijk} = \bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....} \qquad (17.5.27)$$

Now, proceeding as in the two-way experimental design with equal numbers of observations per cell, the ANOVA table for a three-way experimental design, when the $y_{ijkl}$'s are generated as in the model (17.5.23)-(17.5.23a), is given in Table 17.5.14.

TABLE 17.5.14 ANOVA table for a three-way experimental design with $r$ ($> 1$) observations per cell.

    Source                   DF                  SS                                                                                      MS
    A                        $a-1$               $SS_A = bcr\sum_{i}(\bar{y}_{i...} - \bar{y}_{....})^2$                                  $SS_A/(a-1)$
    B                        $b-1$               $SS_B = acr\sum_{j}(\bar{y}_{.j..} - \bar{y}_{....})^2$                                  $SS_B/(b-1)$
    C                        $c-1$               $SS_C = abr\sum_{k}(\bar{y}_{..k.} - \bar{y}_{....})^2$                                  $SS_C/(c-1)$
    A $\times$ B             $(a-1)(b-1)$        $SS_{AB} = cr\sum_{i,j}(\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....})^2$    $SS_{AB}/[(a-1)(b-1)]$
    A $\times$ C             $(a-1)(c-1)$        $SS_{AC} = br\sum_{i,k}(\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....})^2$    $SS_{AC}/[(a-1)(c-1)]$
    B $\times$ C             $(b-1)(c-1)$        $SS_{BC} = ar\sum_{j,k}(\bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....})^2$    $SS_{BC}/[(b-1)(c-1)]$
    A $\times$ B $\times$ C  $(a-1)(b-1)(c-1)$   $SS_{ABC} = r\sum_{i,j,k}(\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....})^2$    $SS_{ABC}/[(a-1)(b-1)(c-1)]$
    Error                    $abc(r-1)$          $SS_E = \sum_{i,j,k,l}(y_{ijkl} - \bar{y}_{ijk.})^2$                                     $SS_E/[abc(r-1)]$
    Total                    $abcr-1$            $SS_{total} = \sum_{i,j,k,l}(y_{ijkl} - \bar{y}_{....})^2$
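The estimators in (17.5.27) are simple functions of the cell and marginal means, so they are easy to compute directly. A hedged sketch, again assuming Python with numpy and data stored as an $a \times b \times c \times r$ array (the function name three_way_effects is ours):

```python
import numpy as np

def three_way_effects(y):
    """Point estimates (17.5.27) from y[i, j, k, l] = y_{ijkl}."""
    m = y.mean()                                  # grand mean y_....
    yi = y.mean(axis=(1, 2, 3))                   # y_{i...}
    yj = y.mean(axis=(0, 2, 3))                   # y_{.j..}
    yk = y.mean(axis=(0, 1, 3))                   # y_{..k.}
    yij = y.mean(axis=(2, 3))                     # y_{ij..}
    yik = y.mean(axis=(1, 3))                     # y_{i.k.}
    yjk = y.mean(axis=(0, 3))                     # y_{.jk.}
    yijk = y.mean(axis=3)                         # cell means y_{ijk.}

    alpha = yi - m                                # main effects
    beta = yj - m
    gamma = yk - m
    ab = yij - yi[:, None] - yj[None, :] + m      # two-factor interactions
    ac = yik - yi[:, None] - yk[None, :] + m
    bc = yjk - yj[None, :].T - yk[None, :] + m if False else yjk - yj[:, None] - yk[None, :] + m
    abc = (yijk - yij[:, :, None] - yik[:, None, :] - yjk[None, :, :]
           + yi[:, None, None] + yj[None, :, None] + yk[None, None, :] - m)
    return m, alpha, beta, gamma, ab, ac, bc, abc
```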
If there is only one observation per cell, then we simply assume that the three-factor interaction is zero, use the corresponding degrees of freedom for the error sum of squares, and then estimate the error variance $\sigma^2$. The various sums of squares for $r > 1$ observations per cell in the ANOVA table are obtained as follows, where $T_{....} = G = \sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}$:

$$SS_A = \frac{1}{bcr}\sum_{i} T_{i...}^2 - \frac{1}{abcr}T_{....}^2$$

$$SS_B = \frac{1}{acr}\sum_{j} T_{.j..}^2 - \frac{1}{abcr}T_{....}^2$$

$$SS_C = \frac{1}{abr}\sum_{k} T_{..k.}^2 - \frac{1}{abcr}T_{....}^2$$

$$SS_{AB} = \frac{1}{cr}\sum_{i}\sum_{j} T_{ij..}^2 - \frac{1}{bcr}\sum_{i} T_{i...}^2 - \frac{1}{acr}\sum_{j} T_{.j..}^2 + \frac{1}{abcr}T_{....}^2$$

$$SS_{AC} = \frac{1}{br}\sum_{i}\sum_{k} T_{i.k.}^2 - \frac{1}{bcr}\sum_{i} T_{i...}^2 - \frac{1}{abr}\sum_{k} T_{..k.}^2 + \frac{1}{abcr}T_{....}^2$$

$$SS_{BC} = \frac{1}{ar}\sum_{j}\sum_{k} T_{.jk.}^2 - \frac{1}{acr}\sum_{j} T_{.j..}^2 - \frac{1}{abr}\sum_{k} T_{..k.}^2 + \frac{1}{abcr}T_{....}^2$$

$$SS_{ABC} = \frac{1}{r}\sum_{i}\sum_{j}\sum_{k} T_{ijk.}^2 - \frac{1}{cr}\sum_{i}\sum_{j} T_{ij..}^2 - \frac{1}{br}\sum_{i}\sum_{k} T_{i.k.}^2 - \frac{1}{ar}\sum_{j}\sum_{k} T_{.jk.}^2 + \frac{1}{bcr}\sum_{i} T_{i...}^2 + \frac{1}{acr}\sum_{j} T_{.j..}^2 + \frac{1}{abr}\sum_{k} T_{..k.}^2 - \frac{1}{abcr}T_{....}^2$$

$$SS_T = \sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^2 - \frac{1}{abcr}T_{....}^2$$

$$SS_E = SS_T - SS_A - SS_B - SS_C - SS_{AB} - SS_{AC} - SS_{BC} - SS_{ABC}, \quad \text{or} \quad SS_E = \sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^2 - \frac{1}{r}\sum_{i}\sum_{j}\sum_{k} T_{ijk.}^2 \qquad (17.5.28)$$
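The computing formulas in (17.5.28) translate directly into array operations on the totals. A minimal sketch, assuming Python with numpy and data stored as an $a \times b \times c \times r$ array (the function name three_way_anova_ss is ours, not the text's):

```python
import numpy as np

def three_way_anova_ss(y):
    """Sums of squares (17.5.28) from y[i, j, k, l] = y_{ijkl}."""
    a, b, c, r = y.shape
    G = y.sum()                                   # grand total T_....
    CF = G**2 / (a * b * c * r)                   # correction factor

    SS_A = (y.sum(axis=(1, 2, 3))**2).sum() / (b * c * r) - CF
    SS_B = (y.sum(axis=(0, 2, 3))**2).sum() / (a * c * r) - CF
    SS_C = (y.sum(axis=(0, 1, 3))**2).sum() / (a * b * r) - CF
    SS_AB = (y.sum(axis=(2, 3))**2).sum() / (c * r) - CF - SS_A - SS_B
    SS_AC = (y.sum(axis=(1, 3))**2).sum() / (b * r) - CF - SS_A - SS_C
    SS_BC = (y.sum(axis=(0, 3))**2).sum() / (a * r) - CF - SS_B - SS_C
    SS_cells = (y.sum(axis=3)**2).sum() / r - CF  # between-cell sum of squares
    SS_ABC = SS_cells - SS_A - SS_B - SS_C - SS_AB - SS_AC - SS_BC
    SS_T = (y**2).sum() - CF
    SS_E = SS_T - SS_cells                        # equals SS_T minus all the terms above
    return dict(A=SS_A, B=SS_B, C=SS_C, AB=SS_AB, AC=SS_AC, BC=SS_BC,
                ABC=SS_ABC, error=SS_E, total=SS_T)
```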