Fixed vs. Random Effects

Fixed effect
– we are interested in the effects of the treatments (or blocks) per se
– if the experiment were repeated, the levels would be the same
– conclusions apply to the treatment (or block) levels that were tested
– treatment (or block) effects sum to zero: $\sum_i \tau_i = 0$

Random effect
– represents a sample from a larger reference population
– the specific levels used are not of particular interest
– conclusions apply to the reference population
  • inference space may be broad (all possible random effects) or narrow (just the random effects in the experiment)
– goal is generally to estimate the variance among treatments (or other groups), $\sigma_T^2$

Need to know which effects are fixed or random to determine appropriate F tests in ANOVA

Fixed or Random?

• lambs born from common parents (same ram and ewe) are given different formulations of a vitamin supplement
• comparison of new herbicides for potential licensing
• comparison of herbicides used in different decades (1980's, 1990's, 2000's)
• nitrogen fertilizer treatments at rates of 0, 50, 100, and 150 kg N/ha
• years of evaluation of new canola varieties (2008, 2009, 2010)
• location of a crop rotation experiment that is conducted on three farmers' fields in the Willamette Valley (Junction City, Albany, Woodburn)
• species of trees in an old growth forest

Fixed and random models for the CRD

$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

Variance among fixed treatment effects: $\theta_T^2 = \sum_i \tau_i^2 / (t-1)$

Fixed Model (Model I)
Source       df       Expected Mean Square
Treatment    t-1      $\sigma_e^2 + r\theta_T^2$
Error        tr-t     $\sigma_e^2$

Random Model (Model II)
Source       df       Expected Mean Square
Treatment    t-1      $\sigma_e^2 + r\sigma_T^2$
Error        tr-t     $\sigma_e^2$

Models for the RBD

$Y_{ij} = \mu + \beta_i + \tau_j + \varepsilon_{ij}$

Fixed Model
Source       df            Expected Mean Square
Block        r-1           $\sigma_e^2 + t\theta_B^2$
Treatment    t-1           $\sigma_e^2 + r\theta_T^2$
Error        (r-1)(t-1)    $\sigma_e^2$

Random Model
Source       df            Expected Mean Square
Block        r-1           $\sigma_e^2 + t\sigma_B^2$
Treatment    t-1           $\sigma_e^2 + r\sigma_T^2$
Error        (r-1)(t-1)    $\sigma_e^2$

Mixed Model (random blocks, fixed treatments)
Source       df            Expected Mean Square
Block        r-1           $\sigma_e^2 + t\sigma_B^2$
Treatment    t-1           $\sigma_e^2 + r\theta_T^2$
Error        (r-1)(t-1)    $\sigma_e^2$

For fixed effects:
Block: $\theta_B^2 = \sum_i \beta_i^2 / (r-1)$
Treatment: $\theta_T^2 = \sum_j \tau_j^2 / (t-1)$

Nested (Hierarchical) Designs

Levels of one factor (B) occur within the levels of another factor (A)
Levels of B are unique to each level of A
Factor B is nested within A
– Factor A = the pigs (sows)
– Factor B = the piglets
Nested factors are usually random effects

Nested vs. Cross-Classified Factors

Nested
A1: B1, B2
A2: B3, B4
A3: B5, B6
Each unit of B is unique to each unit of A

Cross-classified
        B1    B2
A1      X     X
A2      X     X
A3      X     X
All possible combinations of A and B

General form for degrees of freedom
B nested in A    a(b-1)
A*B              (a-1)(b-1)

Sub-Sampling

It may be necessary or convenient to measure a treatment response on subsamples of a plot
– several soil cores within a plot
– duplicate laboratory analyses to estimate grain protein

Introduces a complication into the analysis that can be handled in one of two ways:
– compute the average for each plot and analyze normally
– subject the subsamples themselves to an analysis

The second choice gives an additional source of variation in the ANOVA
– often called the sampling error

Use Sampling to Gain Precision

When making lab measurements, you will have better results if you analyze several samples to get a truer estimate of the mean.
It is often useful to determine the number of samples that would be required for your chosen level of precision.
Sampling will reduce the variability within a treatment across replications.

Stein's Sample Estimate

$n = \dfrac{t_1^2 s^2}{d^2}$

Where
– $t_1$ is the tabular t value for the desired confidence level and the degrees of freedom of the initial sample
– $d$ is the half-width of the desired confidence interval
– $s$ is the standard deviation of the initial sample

For Example

• We are measuring grain protein content and want to increase the precision for each replicate of a treatment.
• We collect and run five samples from the same block and same treatment.
• We decide that an alpha level of 5% is acceptable and we would like to be able to get within 0.5 units of the true mean.
• We continue to apply the formula until we get a stable estimate of n.
• To obtain the desired level of precision, we would need to run at least 10 samples per block per treatment.

Subsample values: 6.2, 7.4, 5.8, 7.0, 6.1

mean              6.50
variance          0.45
t (0.05, 4 df)    2.78
d                 0.50
n                 13.88

For n = 5:   $n = t_1^2 s^2 / d^2 = 2.78^2 \times 0.45 / 0.5^2 = 13.88$
For n = 14:  $n = t_1^2 s^2 / d^2 = 2.16^2 \times 0.45 / 0.5^2 = 8.40$
For n = 9:   $n = t_1^2 s^2 / d^2 = 2.31^2 \times 0.45 / 0.5^2 = 9.57$
For n = 10:  $n = t_1^2 s^2 / d^2 = 2.26^2 \times 0.45 / 0.5^2 = 9.21$

(Each t value has n-1 df and is shown rounded to two decimals; the results are computed from the unrounded tabular values.)

Linear model with sub-sampling

For a CRD
$Y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \delta_{ijk}$
– $\mu$ = mean effect
– $\tau_i$ = ith treatment effect
– $\varepsilon_{ij}$ = random error
– $\delta_{ijk}$ = sampling error

For an RBD
$Y_{ijk} = \mu + \beta_i + \tau_j + \varepsilon_{ij} + \delta_{ijk}$
– $\mu$ = mean effect
– $\beta_i$ = ith block effect
– $\tau_j$ = jth treatment effect
– $\varepsilon_{ij}$ = treatment x block interaction, treated as error
– $\delta_{ijk}$ = sampling error

Expected Mean Squares – RBD with subsampling

Source            df            Expected Mean Square
Block             r-1           $\sigma_s^2 + n\sigma_e^2 + tn\sigma_b^2$
Treatment         t-1           $\sigma_s^2 + n\sigma_e^2 + rn\theta_t^2$
Error             (r-1)(t-1)    $\sigma_s^2 + n\sigma_e^2$
Sampling Error    rt(n-1)       $\sigma_s^2$

In this example, treatments are fixed and blocks are random effects
This is a mixed model because it includes both fixed and random effects
Appropriate F tests can be determined from the Expected Mean Squares

The RBD ANOVA with Subsampling

Source            df            SS                                                                            MS                 F
Total             rtn-1         SSTot = $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2$
Block             r-1           SSB = $tn \sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2$                            SSB/(r-1)
Trtmt             t-1           SST = $rn \sum_j (\bar{Y}_{.j.} - \bar{Y}_{...})^2$                            SST/(t-1)          FT = MST/MSE
Error             (r-1)(t-1)    SSE = $n \sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{...})^2$ - SSB - SST          SSE/(r-1)(t-1)     FE = MSE/MSS
Sampling Error    rt(n-1)       SSS = SSTot - SSB - SST - SSE                                                 SSS/rt(n-1)

Significance Tests

MSS estimates
– the variation among samples

MSE estimates
– the variation among samples plus
– the variation among plots treated alike

MST estimates
– the variation among samples plus
– the variation among plots treated alike plus
– the variation among treatment means

Therefore:
– FE tests the significance of the variation among plots treated alike
– FT tests the significance of the differences among the treatment means

Means and Standard Errors

Variance of a treatment mean
$s_{\bar{Y}}^2 = \dfrac{MSE}{rn}$, which estimates $\dfrac{\sigma_s^2 + n\sigma_e^2}{rn} = \dfrac{\sigma_s^2}{rn} + \dfrac{\sigma_e^2}{r}$

Standard Error of a treatment mean
$s_{\bar{Y}} = \sqrt{MSE / rn}$

Confidence interval estimate
$L_i = \bar{Y}_i \pm t \sqrt{MSE / rn}$

Standard Error of a difference
$s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{2MSE / rn}$

Confidence interval estimate
$L_{1-2} = (\bar{Y}_1 - \bar{Y}_2) \pm t \sqrt{2MSE / rn}$

t to test difference between two means
$t = \dfrac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{2MSE / rn}}$

In each case the tabular t has the error degrees of freedom, (r-1)(t-1).

Allocating resources – reps vs samples

Cost function
$C = c_1 r + c_2 r n$
– $c_1$ = cost of an experimental unit
– $c_2$ = cost of a sampling unit

Number of samples per experimental unit
$n = \sqrt{\dfrac{c_1 \sigma_s^2}{c_2 \sigma_e^2}}$

If your goal is to minimize variance for a fixed cost, use the estimate of n to solve for r in the cost function

If your goal is to minimize cost for a fixed variance, use the estimate of n to solve for r using the formula for the variance of a treatment mean

$\sigma_{\bar{y}}^2 = \dfrac{\sigma_s^2}{rn} + \dfrac{\sigma_e^2}{r}$

See Kuehl pg 163 for an example
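
The allocation rules above can be worked through numerically. The following is a minimal Python sketch, not taken from the slides or from Kuehl: the function names and all input values (costs, variance components, budget, target variance) are hypothetical and chosen only so the arithmetic is easy to follow. It computes n from the cost ratio and the variance components, then solves for r under a fixed budget and under a fixed target variance of a treatment mean.

```python
import math


def optimal_subsamples(c1, c2, var_s, var_e):
    """Samples per experimental unit: n = sqrt(c1 * var_s / (c2 * var_e))."""
    return math.sqrt((c1 * var_s) / (c2 * var_e))


def reps_for_fixed_cost(total_cost, c1, c2, n):
    """Solve the cost function C = c1*r + c2*r*n for r."""
    return total_cost / (c1 + c2 * n)


def reps_for_fixed_variance(target_var, var_s, var_e, n):
    """Solve var_mean = var_s/(r*n) + var_e/r for r."""
    return (var_s / n + var_e) / target_var


def variance_of_mean(r, n, var_s, var_e):
    """Variance of a treatment mean with r reps and n samples per unit."""
    return var_s / (r * n) + var_e / r


# Hypothetical inputs (assumptions for illustration only).
c1, c2 = 8.0, 2.0          # cost per experimental unit, cost per sample
var_s, var_e = 4.0, 1.0    # sampling variance, experimental error variance

n = optimal_subsamples(c1, c2, var_s, var_e)

# Goal 1: minimize variance for a fixed budget of 160 cost units.
r_cost = reps_for_fixed_cost(160.0, c1, c2, n)

# Goal 2: minimize cost for a target variance of 0.2 for a treatment mean.
r_var = reps_for_fixed_variance(0.2, var_s, var_e, n)

print("samples per unit:", n)
print("reps for fixed cost:", r_cost)
print("variance at that design:", variance_of_mean(r_cost, n, var_s, var_e))
print("reps for fixed variance:", r_var)
```

In practice r and n must be whole numbers, so the computed values would be rounded (usually up) and the cost and variance rechecked at the rounded design.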