A Latin square is an n by n array filled with n distinct symbols in such a manner that each symbol appears exactly once in each row and exactly once in each column. For example, a Latin square of order 4 might be represented as follows:

A B C D
B A D C
C D A B
D C B A

The particular Latin square above has its symbols in alphabetical order in the first row and the first column. This arrangement makes it a standard, or reduced, Latin square. Shuffling the rows or columns cannot destroy the Latin property of the square, but the result will generally no longer be reduced. This ability to permute rows and/or columns is an important facet of Latin squares.

To see how Latin squares provide a good structure for experimental design, it helps to review a specific application. Suppose a tire manufacturer wants to evaluate four different treatments for a new tire tread. The manufacturer arranges for four different cars and four different drivers to test the treatments, which we conveniently represent with the letters A, B, C, & D. We can assign the cars and drivers to the rows and columns (respectively) of the array above and get, without loss of generality, the following arrangement¹:

          Ed   Hal  Joe  Ken
Ferrari   A    B    C    D
GTO       B    A    D    C
Isuzu     C    D    A    B
LeMans    D    C    B    A

This arrangement spreads the nuisance factors (the different drivers and the different cars) across the treatments, so that the variability those factors might cause is systematically blocked. To put some numbers with this design for the purpose of demonstrating the calculations, we'll use the following set of numbers.

        Ferrari  GTO    Isuzu  LeMans  Total
Ed      A=12     B=18   C=49   D=61    140
Hal     B=24     A=12   D=63   C=57    156
Joe     C=37     D=53   A=20   B=42    152
Ken     D=67     C=49   B=40   A=36    192
Total   140      132    172    196     640

The statistical model for this arrangement is given by the equation
$$Y_{ijk} = \mu + R_i + C_j + T_k + \varepsilon_{ijk}$$

with the T term measuring the treatment effect, the R and C terms measuring the row and column effects respectively, and the $\mu$ and $\varepsilon$ terms representing the grand mean and the residuals. This model leads to the following ANOVA structure.

$$SS_T = SS_{treatments} + SS_{rows} + SS_{columns} + SS_{error}$$

Here i indexes rows, j columns and k treatments, and a dot in a subscript denotes summation over that index.

Source       SS                                                                     df               MS
Treatments   $\frac{1}{n}\sum_{k=1}^{n} y_{..k}^{2} - \frac{y_{...}^{2}}{n^{2}}$     n - 1            SS_treatments/(n - 1)
Rows         $\frac{1}{n}\sum_{i=1}^{n} y_{i..}^{2} - \frac{y_{...}^{2}}{n^{2}}$     n - 1            SS_rows/(n - 1)
Columns      $\frac{1}{n}\sum_{j=1}^{n} y_{.j.}^{2} - \frac{y_{...}^{2}}{n^{2}}$     n - 1            SS_columns/(n - 1)
Error        SS_e (by subtraction)                                                  (n - 2)(n - 1)   SS_e/((n - 2)(n - 1))
Total        $\sum_{i}\sum_{j}\sum_{k} y_{ijk}^{2} - \frac{y_{...}^{2}}{n^{2}}$      n^2 - 1

The analysis of variance of the n^2 observations gives (n - 1) degrees of freedom to each of the Row, Column and Treatment factors. These sum to 3(n - 1), which, when subtracted from the total of n^2 - 1 = (n + 1)(n - 1), leaves (n + 1 - 3)(n - 1) = (n - 2)(n - 1) degrees of freedom for the error term. For the sake of good order, note that if the 3(n - 1) is written as (n - 1)(n - 1), then (n + 1 - (n - 1))(n - 1) becomes 2(n - 1), which agrees with (n - 2)(n - 1) here because n = 4.

We should observe that the row and column Latin restrictions compromise the randomization of the design, and therefore one must apply some mechanism to counter this. The standard randomization procedure for Latin square designs was given by Yates in 1933. Roughly, it says that for squares of order 3, 4 or 5 you must start with a reduced Latin square, then permute either all rows except the first and all columns, or all columns except the first and all rows, and then assign treatments at random to the marker symbols. For squares of order 6 or higher, it is satisfactory to permute all rows, columns and treatments, without the need to start with a reduced Latin square.

Additionally, it should be pointed out that the model doesn't allow for interactions and affords the experimenter information only on the main effects.
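The randomization procedure described above for small squares can be sketched in a few lines of Python. This is only an illustration of the rough description given here, not Yates's original statement of the procedure; the function name `randomize_small_square`, the treatment labels and the fixed seed are all assumptions made for the example.

```python
import random

def randomize_small_square(reduced, treatments, rng):
    """Randomize a reduced Latin square of order 3, 4 or 5 (rough sketch)."""
    n = len(reduced)
    # Permute all rows except the first, and all columns.
    row_order = [0] + rng.sample(range(1, n), n - 1)
    col_order = rng.sample(range(n), n)
    # Assign the treatments at random to the marker symbols.
    symbols = sorted(set(reduced[0]))
    assignment = dict(zip(symbols, rng.sample(treatments, n)))
    return [[assignment[reduced[i][j]] for j in col_order] for i in row_order]

# The reduced square of order 4 used in the text; A-D act as marker symbols.
reduced = ["ABCD", "BADC", "CDAB", "DCBA"]
# Hypothetical treatment labels, chosen only for this illustration.
treatments = ["tread 1", "tread 2", "tread 3", "tread 4"]

design = randomize_small_square(reduced, treatments, random.Random(2024))
for row in design:
    print(row)
```

Whatever permutations are drawn, the result is still a Latin square: every treatment appears once in each row and once in each column. The alternative form of the procedure (all columns except the first, and all rows) would swap the roles of `row_order` and `col_order`.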
For each of the specific designs (differentiated by the superscript footnote numbers) for which a model is articulated, there will be a complete ANOVA using a single set of data. These calculations are gathered at the end of the paper.

So let's say that we set this design up and are about to run it when a technician points out that we have the cars and drivers rented for four days, so we could run the experiments on all four days and thus have replicates. Just as the cheering dies down from the realization that we can obtain a better estimate of the error variance, someone points out that there are choices to be made. Specifically, we could maintain the tread, driver, and car trinity throughout the four days², or we could rotate either the cars³ or the drivers⁴, or even both⁵. In each of the cases the basic model is

$$Y_{ijkl} = \mu + R_i + C_j + T_k + P_l + \varepsilon_{ijkl},$$

where the P term accounts for the replication. However, the specific differences between the designs call for different ANOVA calculations, and the subtleties are best understood when the designs are viewed together. This paper outlines the setup for each of the replication cases, but again, the computations are reserved for the end of the paper. In order to maintain some degree of brevity, we will employ only a pair of replicates.

Case 1 (LSD2): No variation in the tread/car/driver arrangement through the p (in this case 2) replicates. Here the same reduced and randomized Latin square is employed in every replicate (we'll assume, without loss of generality, that the random permutations left the reduced design intact).
        Ferrari    GTO        Isuzu      LeMans     Total
Ed      A=12,16    B=18,23    C=49,51    D=61,44    140,134
Hal     B=24,18    A=12,49    D=63,16    C=57,33    156,116
Joe     C=37,41    D=53,33    A=20,59    B=42,82    152,215
Ken     D=67,52    C=49,28    B=40,37    A=36,29    192,146
Total   140,127    132,133    172,163    196,188    640,611

The first number in each entry of the array corresponds to the first replicate, and the second number corresponds to the second replicate. In the tables below, i indexes rows, j columns, k treatments and l replicates, and a dot in a subscript denotes summation over that index.

Source       SS                                                                              df                      MS
Treatments   $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$          n - 1                   SS_treatments/(n - 1)
Rows         $\frac{1}{np}\sum_{i=1}^{n} y_{i...}^{2} - \frac{y_{....}^{2}}{n^{2}p}$          n - 1                   SS_rows/(n - 1)
Columns      $\frac{1}{np}\sum_{j=1}^{n} y_{.j..}^{2} - \frac{y_{....}^{2}}{n^{2}p}$          n - 1                   SS_columns/(n - 1)
Replicates   $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$       p - 1                   SS_replicates/(p - 1)
Error        SS_e (by subtraction)                                                           (n - 1)(p(n + 1) - 3)   SS_e/((n - 1)(p(n + 1) - 3))
Total        $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$    pn^2 - 1

At times the following substitutions may be made: E = Ed, H = Hal, J = Joe, K = Ken, F = Ferrari, G = GTO, I = Isuzu, and L = LeMans.

Case 2 (LSD3): The drivers are randomly rearranged in the second replicate, but not the cars. The first set of numbers is assigned to the first (reduced) Latin square, and the second set is assigned to the following permuted version of the original.

Replicate 1:
     E      H      J      K
F    A=12   B=24   C=37   D=67
G    B=18   A=12   D=53   C=49
I    C=49   D=63   A=20   B=40
L    D=61   C=57   B=42   A=36

Replicate 2:
     K      J      E      H
F    A=16   B=18   C=41   D=52
G    B=23   A=49   D=33   C=28
I    C=51   D=16   A=59   B=37
L    D=44   C=33   B=82   A=29

Source       SS                                                                                          df                 MS
Treatments   $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                      n - 1              SS_treatments/(n - 1)
Rows         $\frac{1}{n}\sum_{l=1}^{p}\sum_{i=1}^{n} y_{i..l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$   p(n - 1)           SS_rows/(p(n - 1))
Columns      $\frac{1}{np}\sum_{j=1}^{n} y_{.j..}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                      n - 1              SS_columns/(n - 1)
Replicates   $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                   p - 1              SS_replicates/(p - 1)
Error        SS_e (by subtraction)                                                                       (n - 1)(np - 2)    SS_e/((n - 1)(np - 2))
Total        $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                pn^2 - 1

The error degrees of freedom follow by subtraction: (pn^2 - 1) - (n - 1) - p(n - 1) - (n - 1) - (p - 1) = (n - 1)(np - 2), which is 18 for n = 4 and p = 2.

Case 3 (LSD4): The cars are randomly rearranged in the second replicate, but not the drivers.
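Replicate 1:
     E      H      J      K
F    A=12   B=24   C=37   D=67
G    B=18   A=12   D=53   C=49
I    C=49   D=63   A=20   B=40
L    D=61   C=57   B=42   A=36

Replicate 2:
     E      H      J      K
I    A=16   B=18   C=41   D=52
F    B=23   A=49   D=33   C=28
L    C=51   D=16   A=59   B=37
G    D=44   C=33   B=82   A=29

In the tables below, i indexes rows, j columns, k treatments and l replicates, and a dot in a subscript denotes summation over that index.

Source       SS                                                                                          df                 MS
Treatments   $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                      n - 1              SS_treatments/(n - 1)
Rows         $\frac{1}{np}\sum_{i=1}^{n} y_{i...}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                      n - 1              SS_rows/(n - 1)
Columns      $\frac{1}{n}\sum_{l=1}^{p}\sum_{j=1}^{n} y_{.j.l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$   p(n - 1)           SS_columns/(p(n - 1))
Replicates   $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                   p - 1              SS_replicates/(p - 1)
Error        SS_e (by subtraction)                                                                       (n - 1)(np - 2)    SS_e/((n - 1)(np - 2))
Total        $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                pn^2 - 1

As in the previous case, the error degrees of freedom follow by subtraction: (pn^2 - 1) - (n - 1) - (n - 1) - p(n - 1) - (p - 1) = (n - 1)(np - 2).

Case 4 (LSD5): Both the cars and drivers are randomly rearranged in the second replicate.

Replicate 1:
     E      H      J      K
F    A=12   B=24   C=37   D=67
G    B=18   A=12   D=53   C=49
I    C=49   D=63   A=20   B=40
L    D=61   C=57   B=42   A=36

Replicate 2:
     K      J      E      H
I    A=16   B=18   C=41   D=52
F    B=23   A=49   D=33   C=28
L    C=51   D=16   A=59   B=37
G    D=44   C=33   B=82   A=29

Source       SS                                                                                          df                       MS
Treatments   $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                      n - 1                    SS_treatments/(n - 1)
Rows         $\frac{1}{n}\sum_{l=1}^{p}\sum_{i=1}^{n} y_{i..l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$   p(n - 1)                 SS_rows/(p(n - 1))
Columns      $\frac{1}{n}\sum_{l=1}^{p}\sum_{j=1}^{n} y_{.j.l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$   p(n - 1)                 SS_columns/(p(n - 1))
Replicates   $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                   p - 1                    SS_replicates/(p - 1)
Error        SS_e (by subtraction)                                                                       (n - 1)(p(n - 1) - 1)    SS_e/((n - 1)(p(n - 1) - 1))
Total        $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                pn^2 - 1

Here the subtraction gives (pn^2 - 1) - (n - 1) - 2p(n - 1) - (p - 1) = (n - 1)(p(n - 1) - 1), which is 15 for n = 4 and p = 2.

Finally, just as decisions are about to be made, the issue is complicated by the realization that only a single set of tires can be measured at the end of a day and, further, that the weather forecast calls for materially different weather on each of the four days. While the first bit of news removes the possibility of replicates, the second introduces another factor. Essentially we want to simultaneously consider the three Latin square designs listed below.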
          Clear  Hot  Sleet  Wind
Ferrari   A      B    C      D
GTO       B      A    D      C
Isuzu     C      D    A      B
LeMans    D      C    B      A

        Ed   Hal  Joe  Ken
Clear   A    B    C    D
Hot     B    A    D    C
Sleet   C    D    A    B
Wind    D    C    B    A

          Ed   Hal  Joe  Ken
Ferrari   A    B    C    D
GTO       B    A    D    C
Isuzu     C    D    A    B
LeMans    D    C    B    A

It is through the use of orthogonal Latin squares that this can easily be achieved. A pair of Latin squares is said to be orthogonal if the n^2 ordered pairs formed by superimposing the two squares, entry by entry, are all distinct. For example, look at the two Latin squares below and then their "union".

A B C D      A B C D      AA BB CC DD
B A D C      C D A B      BC AD DA CB
C D A B      D C B A      CD DC AB BA
D C B A      B A D C      DB CA BD AC

The two squares are commonly written with different symbols, Latin and Greek, thus avoiding any visual confusion between similar symbols set in different fonts. This also gives rise to the term Graeco-Latin square. To summarize this union and its specific arrangement here, we need to detail the symbols and demonstrate how the factors are blocked. A, B, C, & D (first symbol) are the treads, and A, B, C, & D (second symbol) represent Ed, Hal, Joe and Ken respectively. Thus the Graeco-Latin square above can be used to represent the following design.

       Clear       Hot         Sleet       Wind
Ed     A Ferrari   B GTO       C Isuzu     D LeMans
Hal    B Isuzu     A LeMans    D Ferrari   C GTO
Joe    C LeMans    D Isuzu     A GTO       B Ferrari
Ken    D GTO       C Ferrari   B LeMans    A Isuzu

For future needs, we'll establish the set of a, b, c & d as symbols for Ferrari, GTO, Isuzu & LeMans.

This idea of "uniting squares" can be extended if we add another Latin square that is pairwise orthogonal to each of the Latin squares above; its "union" with them is shown alongside it.

a b c d      AAa BBb CCc DDd
d c b a      BCd ADc DAb CBa
b a d c      CDb DCa ABd BAc
c d a b      DBc CAd BDa ACb

Think tread/driver/car. This is a set of three mutually orthogonal Latin squares of order 4, and in fact, this set is complete. Modulo a change of symbols or a permutation of the rows or columns, all complete sets of order 4 are identical to the one given above.
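The orthogonality claims for the three squares above can be checked mechanically. The following is a minimal Python sketch, taking the superimposed pairs as ordered (so that BC and CB count as different pairs); the helper names `is_latin` and `are_orthogonal` and the variable names are assumptions introduced for this illustration.

```python
from itertools import combinations

def is_latin(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return len(symbols) == n and rows_ok and cols_ok

def are_orthogonal(first, second):
    """True if superimposing the squares yields n^2 distinct ordered pairs."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

# The three squares of order 4 shown above (tread / driver / car).
tread = ["ABCD", "BADC", "CDAB", "DCBA"]
driver = ["ABCD", "CDAB", "DCBA", "BADC"]
car = ["abcd", "dcba", "badc", "cdab"]
squares = [tread, driver, car]

all_latin = all(is_latin(s) for s in squares)
mutually_orthogonal = all(are_orthogonal(a, b) for a, b in combinations(squares, 2))
print(all_latin, mutually_orthogonal)
```

Both checks succeed for these three squares, whereas a square paired with itself fails the orthogonality test because only n distinct pairs occur.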
It should be obvious that the size of a complete set of mutually orthogonal Latin squares (MOLS) of order n is n - 1. What isn't obvious is that complete sets are only proven to exist when n is a prime or a prime power. Every other order admits at least 2 MOLS, except n = 6, for which there is no orthogonal pair. This means that additional factors can more easily be accommodated as the number of treatments climbs.

Let us combine our developing tire experiment with the complete set of MOLS above for a theoretical experiment. In this latest version⁶ we are not only adding in the blocking of the weather factor, but also adding yet another factor for, say, different locations for the four tracks.

          Clear           Hot             Sleet           Wind
Track 1   A Ed Ferrari    B Hal GTO       C Joe Isuzu     D Ken LeMans
Track 2   B Ken Isuzu     A Joe LeMans    D Hal Ferrari   C Ed GTO
Track 3   C Hal LeMans    D Ed Isuzu      A Ken GTO       B Joe Ferrari
Track 4   D Joe GTO       C Ken Ferrari   B Ed LeMans     A Hal Isuzu

Here we have the four tire tread types (A, B, C, & D) arranged with the four four-level factors (location, weather, car & driver). A Latin square design can accommodate one more blocking factor than the cardinality of the complete set of MOLS for a given n.

While one may defend the stipulation that there are no interactions, it is difficult to ignore the constraint that the various factors must all have the same number of levels. In the case of continuous parameters over which we have control, we can consider that we are merely adding center points, and we would realize that some factors can easily be forced to fit the model's needs. For instance, if one of the factors, say a temperature, had only two levels, while the other factor and the treatment each had four, we could balance things out by interjecting two interpolated temperature points to achieve the requisite symmetry of the Latin square design.
While it is reassuring to know that such an option may exist, there are frequent occasions where an imbalance in the parameters does preclude the use of a Latin square. For example, suppose a baker wants to experiment with seven different bread recipes at a variety of temperatures in three different ovens. It is easy for him to justify seven different temperature levels (and cooking times), but there is no flexibility in the three distinct ovens. Clearly a Latin square cannot be used, but there exist fundamental results in design theory which allow for non-symmetric designs. These are called Balanced Incomplete Block Designs (BIBD), and they are defined below.

A BIBD with parameters (v, b, r, k, λ) is a pair (X, A) that satisfies the following properties:

1. X is a set of v elements (called points).
2. A is a family of b subsets of X (called blocks), each with cardinality k.
3. Each point occurs in exactly r blocks.
4. Every pair of distinct points occurs in exactly λ blocks.

In the language of experiments:

v = number of treatments.
b = number of blocks.
r = number of blocks in which a treatment occurs.
k = size of the blocks.
λ = number of times that two treatments occur together in a block in the overall design.

A little bit of inspection reveals the following relationships between the parameters:

$$vr = bk \quad \text{and} \quad \lambda(v - 1) = r(k - 1)$$

This allows a BIBD to be described as a (v, k, λ)-BIBD, which gives us all our information. In this case, the experiment can easily be set up as follows:

Temperature   250      275      300      325      350      375      400
Time          60 min   50 min   45 min   42 min   40 min   38 min   37 min
Oven 1        A        B        C        D        E        F        G
Oven 2        B        C        D        E        F        G        A
Oven 3        D        E        F        G        A        B        C

Read with the seven temperature blocks as its rows and the three ovens as its columns, this BIBD, sometimes called a Youden square, does not retain the full two-directional balance of a Latin square: the columns are orthogonal to the rows and to the treatments, but the rows are not orthogonal to the treatments, since not every treatment occurs in every row.
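The parameters of the baker's design can be read off from the layout and checked against the two relationships above. The sketch below is a rough illustration in Python; the function name `bibd_parameters` is an assumption, and the blocks are taken to be the seven temperature settings, each holding the three recipes baked at that setting.

```python
from itertools import combinations

def bibd_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, otherwise None."""
    points = sorted(set().union(*blocks))
    v, b = len(points), len(blocks)
    sizes = {len(block) for block in blocks}
    # Number of blocks containing each point.
    reps = {sum(p in block for block in blocks) for p in points}
    # Number of blocks containing each pair of distinct points.
    lams = {
        sum(p in block and q in block for block in blocks)
        for p, q in combinations(points, 2)
    }
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        return None
    return v, b, reps.pop(), sizes.pop(), lams.pop()

# Recipes assigned to each oven, one entry per temperature setting.
ovens = ["ABCDEFG", "BCDEFGA", "DEFGABC"]
# Each temperature setting (a column of the layout) is one block.
blocks = [set(column) for column in zip(*ovens)]

params = bibd_parameters(blocks)
v, b, r, k, lam = params
print(params)
print(v * r == b * k, lam * (v - 1) == r * (k - 1))
```

For this layout the sketch reports v = 7, b = 7, r = 3, k = 3 and λ = 1, and both relationships hold. The check covers only the balance of the blocks; it does not make the design orthogonal in both directions, since not every treatment occurs in every block.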
This means that the estimates of the treatment effects have to be adjusted for row effects, i.e., treatment means can no longer be used to estimate treatment effects, and least squares means must be obtained instead.

A Room square of side n (on a set of n + 1 symbols) is an n by n array which satisfies the following properties:

1. Every cell is either empty or filled with an unordered pair of symbols.
2. Every symbol occurs in each row and column exactly once.
3. Every unordered pair occurs in exactly one cell.

For example:

07  -   -   15  -   46  23
34  17  -   -   26  -   50
61  45  27  -   -   30  -
-   02  56  37  -   -   41
52  -   13  60  47  -   -
-   63  -   24  01  57  -
-   -   04  -   35  12  67

A common application for Room squares is in the construction of round-robin tournaments. Think of the rows as the rounds, the columns as the locations and the n + 1 symbols as the teams. Notice that the following properties result:

1. Every team plays every other team exactly once.
2. Every team plays exactly once in every round.
3. Every team plays at every location exactly once.

The application here would be for examining interactions while blocking for other factors. My question was whether this design, in conjunction with an analysis using a Latin square design to give some measure of the main effects, would be a reasonable platform for comparison. To articulate more clearly how this experiment might look, I'll extend and modify the tire tread analysis. We'll assume that there are eight treatments, now A, B, C, D, E, F, G, & H, which we substitute for 0 to 7 according to their rank. After running the n = 8 Latin square design, we might look at the following Room square design.
          Quint  Rod  Sam  Tim  Unk  Vic  Will
Isuzu     AH     -    -    BF   -    EG   CD
Jeep      DE     BH   -    -    CG   -    AF
KIA       BG     EF   CH   -    -    AD   -
LeMans    -      AC   FG   DH   -    -    BE
Nova      CF     -    BD   AG   EH   -    -
Olds      -      DG   -    CE   AB   FH   -
Pontiac   -      -    AE   -    DF   BC   GH

What I cannot show you very well is that this design can also be extended to n dimensions through some clever constructions, the main one being the construction of pairwise orthogonal, symmetric Latin squares of order n. In fact, the existence of each of the following is equivalent:

1. A Room d-cube of side n (d = dimension).
2. d pairwise orthogonal-symmetric Latin squares of order n.
3. d pairwise orthogonal one-factorizations of K_{n+1}.
4. v(n) greater than or equal to d.

I look forward to continuing my enquiry and I am hopeful that I will better understand the utility of some combinatorial structures for the purpose of designing experiments.

ANOVA tables.

For all the different versions, we will use the following set of measures for the degree of tire erosion. The numbers in parentheses are for the second replicate (LSD2 - LSD5).

        1         2         3         4         total
1       12(16)    18(23)    49(51)    61(44)    140(134)
2       24(18)    12(49)    63(16)    57(33)    156(116)
3       37(41)    53(33)    20(59)    42(82)    152(215)
4       67(52)    49(28)    40(37)    36(29)    192(146)
total   140(127)  132(133)  172(163)  196(188)  640(611)

LSD1, LSD3 and LSD5 (each table carries the same entries):

Source      SS     df   MS   F value
Treatment   3944   3
Row         656    3
Column      376    3
Replicate   na     0
Weather     na     0
Track       na     0
Model       4976   9
Error       80     6
Total       5056   15

LSD2, LSD4 and LSD6:

Source      SS     df   MS   F value
(no entries)
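The LSD1 sums of squares can be reproduced from the unreplicated data, and the mean squares and F ratios follow from them. The Python sketch below assumes the layout of the original design (rows = cars, columns = drivers, with the tread letters of the reduced square); the function and variable names are illustrative only.

```python
# Sketch only: the layout (rows = cars, columns = drivers) and all names
# are assumptions made for this illustration.
TREADS = ["ABCD", "BADC", "CDAB", "DCBA"]
WEAR = [
    [12, 24, 37, 67],  # Ferrari: Ed, Hal, Joe, Ken
    [18, 12, 53, 49],  # GTO
    [49, 63, 20, 40],  # Isuzu
    [61, 57, 42, 36],  # LeMans
]

def latin_square_anova(treads, y):
    """Return {source: (SS, df, MS, F)} for an unreplicated Latin square."""
    n = len(y)
    correction = sum(map(sum, y)) ** 2 / n ** 2  # y...^2 / n^2
    row_totals = [0] * n
    col_totals = [0] * n
    trt_totals = {}
    for i in range(n):
        for j in range(n):
            row_totals[i] += y[i][j]
            col_totals[j] += y[i][j]
            trt_totals[treads[i][j]] = trt_totals.get(treads[i][j], 0) + y[i][j]
    ss = {
        "Treatment": sum(t * t for t in trt_totals.values()) / n - correction,
        "Row": sum(t * t for t in row_totals) / n - correction,
        "Column": sum(t * t for t in col_totals) / n - correction,
    }
    ss_total = sum(v * v for row in y for v in row) - correction
    ss_error = ss_total - sum(ss.values())  # error by subtraction
    df_error = (n - 2) * (n - 1)
    ms_error = ss_error / df_error
    table = {}
    for source, value in ss.items():
        ms = value / (n - 1)
        table[source] = (value, n - 1, ms, ms / ms_error)
    table["Error"] = (ss_error, df_error, ms_error, None)
    table["Total"] = (ss_total, n * n - 1, None, None)
    return table

ANOVA = latin_square_anova(TREADS, WEAR)
for source, entry in ANOVA.items():
    print(source, entry)
```

With these data the sketch gives sums of squares of 3944 (treatments), 656 (rows), 376 (columns), 80 (error) and 5056 (total), and F ratios of roughly 98.6, 16.4 and 9.4 for treatments, rows and columns, each on 3 and 6 degrees of freedom.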