Single-Factor Studies

KNNL – Chapter 16

Single-Factor Models
• The independent variable can be qualitative or quantitative
• If quantitative, we typically assume a linear, polynomial, or no "structural" relation
• If qualitative, we typically have no "structural" relation
• Balanced designs have equal numbers of replicates at each level of the independent variable
• When no structure is assumed, we refer to the models as "Analysis of Variance" models, and use indicator variables for treatments in the regression model

Single-Factor ANOVA Model
• Model Assumptions for Model Testing
  - All probability distributions are normal
  - All probability distributions have equal variance
  - Responses are random samples from their probability distributions, and are independent
• Analysis Procedure
  - Test for differences among factor level means
  - Follow-up (post-hoc) comparisons among pairs or groups of factor level means

Cell Means Model
$r$ = number of levels of the study factor
$n_i$ = number of replicates (cases, units) for the $i$th level of the study factor
$n_T = n_1 + \cdots + n_r = \sum_{i=1}^{r} n_i$ = overall sample size (number of cases)

$$Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1,\ldots,r; \quad j = 1,\ldots,n_i$$

$Y_{ij}$ = response for the $j$th case within the $i$th level of the study factor
$\mu_i$ = population mean for the $i$th level of the study factor
$\varepsilon_{ij} \sim NID(0, \sigma^2)$, where NID = Normally and Independently Distributed
$\Rightarrow E\{Y_{ij}\} = \mu_i$, $\sigma^2\{Y_{ij}\} = \sigma^2$, and the $Y_{ij}$ are independent $N(\mu_i, \sigma^2)$

Cell Means Model – Regression Form
Suppose $r = 3$ and $n_1 = n_2 = n_3 = 2$:

$$\mathbf{Y} = \begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix}, \quad \mathbf{X} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad \boldsymbol{\beta} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix}, \quad \boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{31} \\ \varepsilon_{32} \end{pmatrix}$$

$$E\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta} = (\mu_1, \mu_1, \mu_2, \mu_2, \mu_3, \mu_3)', \qquad \sigma^2\{\boldsymbol{\varepsilon}\} = \sigma^2 \mathbf{I}_{6 \times 6}$$

$$\mathbf{X'X} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad \mathbf{X'Y} = \begin{pmatrix} Y_{11} + Y_{12} \\ Y_{21} + Y_{22} \\ Y_{31} + Y_{32} \end{pmatrix}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X'X})^{-1}\mathbf{X'Y} = \begin{pmatrix} 0.5 & 0 & 0 \\ 0 & 0.5 & 0 \\ 0 & 0 & 0.5 \end{pmatrix} \begin{pmatrix} Y_{11} + Y_{12} \\ Y_{21} + Y_{22} \\ Y_{31} + Y_{32} \end{pmatrix} = \begin{pmatrix} \bar{Y}_{1\cdot} \\ \bar{Y}_{2\cdot} \\ \bar{Y}_{3\cdot} \end{pmatrix} = \begin{pmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \hat{\mu}_3 \end{pmatrix}$$

Model Interpretations
• Factor Level Means
  - Observational Studies – The $\mu_i$ represent the population means among units from the populations of factor levels
  - Experimental Studies – The $\mu_i$ represent the means of the various factor levels, had they been assigned to a population of experimental units
• Fixed and Random Factors
  - Fixed Factors – All levels of interest are observed in the study
  - Random Factors – Factor levels included in the study represent a sample from a population of factor levels

Fitting ANOVA Models
Notation:
$Y_{i\cdot} = \sum_{j=1}^{n_i} Y_{ij}$, $\quad \bar{Y}_{i\cdot} = \dfrac{Y_{i\cdot}}{n_i}$
$Y_{\cdot\cdot} = \sum_{i=1}^{r}\sum_{j=1}^{n_i} Y_{ij}$, $\quad \bar{Y}_{\cdot\cdot} = \dfrac{Y_{\cdot\cdot}}{n_T} = \dfrac{\sum_{i=1}^{r} n_i \bar{Y}_{i\cdot}}{n_T}$
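The notation above can be made concrete with a small numeric sketch. The data values and the level names "A", "B", "C" are hypothetical, chosen only to give an unbalanced layout; nothing here comes from the text.

```python
# Hypothetical unbalanced data for r = 3 factor levels (values are illustrative only).
data = {
    "A": [11.0, 17.0, 16.0, 14.0],
    "B": [12.0, 10.0, 15.0],
    "C": [23.0, 20.0, 18.0, 17.0, 22.0],
}

n = {level: len(ys) for level, ys in data.items()}                      # n_i
level_total = {level: sum(ys) for level, ys in data.items()}            # Y_i.
level_mean = {level: level_total[level] / n[level] for level in data}   # Ybar_i.

n_T = sum(n.values())                      # overall sample size n_T
grand_total = sum(level_total.values())    # Y..
grand_mean = grand_total / n_T             # Ybar..

# Ybar.. is also the n_i-weighted average of the level means.
weighted_mean = sum(n[level] * level_mean[level] for level in data) / n_T

print(n_T, grand_total, grand_mean)
```

With unequal $n_i$, the overall mean is the weighted (not the simple) average of the level means, which is why the last line of the notation carries the $n_i$ weights.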
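Least Squares and Maximum Likelihood Estimation

Error Sum of Squares: $Q = \sum_{i=1}^{r}\sum_{j=1}^{n_i} \varepsilon_{ij}^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \mu_i)^2$

$\dfrac{\partial Q}{\partial \mu_k} = -2\sum_{j=1}^{n_k} (Y_{kj} - \mu_k)$

Setting $\dfrac{\partial Q}{\partial \mu_k} = 0 \Rightarrow \sum_{j=1}^{n_k} Y_{kj} = n_k \hat{\mu}_k \Rightarrow \hat{\mu}_k = \bar{Y}_{k\cdot}, \quad k = 1,\ldots,r$

Likelihood: $L(\mu_1,\ldots,\mu_r,\sigma^2 \mid Y_{11},\ldots,Y_{r n_r}) = \left(\dfrac{1}{2\pi\sigma^2}\right)^{n_T/2} \exp\left[-\dfrac{1}{2\sigma^2}\sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \mu_i)^2\right]$

Maximizing the likelihood wrt $\mu_1,\ldots,\mu_r$ $\Leftrightarrow$ minimizing $\sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \mu_i)^2 \Rightarrow \hat{\mu}_k = \bar{Y}_{k\cdot}, \quad k = 1,\ldots,r$

Fitted values: $\hat{Y}_{ij} = \bar{Y}_{i\cdot}$
Residuals: $e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \bar{Y}_{i\cdot}$

Analysis of Variance

$$\underbrace{Y_{ij} - \bar{Y}_{\cdot\cdot}}_{\text{Total deviation}} = \underbrace{(Y_{ij} - \bar{Y}_{i\cdot})}_{\text{Deviation from trt mean (residual)}} + \underbrace{(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})}_{\text{Deviation of trt mean from overall mean}}$$

$$\sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot})^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i}(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$$

The cross-product term drops out because $\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot}) = 0$ for every $i$.

Total (Corrected) Sum of Squares: $SSTO = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{\cdot\cdot})^2$, $\quad df_{TO} = n_T - 1$
Treatment Sum of Squares: $SSTR = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r} n_i(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$, $\quad df_{TR} = r - 1$
Error Sum of Squares: $SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot})^2$, $\quad df_E = n_T - r$
Note: $SSTO = SSTR + SSE$ and $df_{TO} = df_{TR} + df_E$

Useful result: $s_i^2 = \dfrac{\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot})^2}{n_i - 1} \Rightarrow (n_i - 1)s_i^2 = \sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot})^2 \Rightarrow SSE = \sum_{i=1}^{r}(n_i - 1)s_i^2$, with $df_E = n_T - r = \sum_{i=1}^{r}(n_i - 1)$

Mean Squares: $MSTR = \dfrac{SSTR}{r - 1}$, $\quad MSE = \dfrac{SSE}{n_T - r}$

ANOVA Table

Source | df | SS | MS | E{MS}
Treatments | $r - 1$ | $SSTR = \sum_{i} n_i(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$ | $MSTR = SSTR/(r-1)$ | $\sigma^2 + \dfrac{\sum_{i} n_i(\mu_i - \mu_\cdot)^2}{r - 1}$
Error | $n_T - r$ | $SSE = \sum_{i}\sum_{j}(Y_{ij} - \bar{Y}_{i\cdot})^2$ | $MSE = SSE/(n_T - r)$ | $\sigma^2$
Total | $n_T - 1$ | $SSTO = \sum_{i}\sum_{j}(Y_{ij} - \bar{Y}_{\cdot\cdot})^2$ | |

Note: $SSTR = \sum_{i=1}^{r} n_i\bar{Y}_{i\cdot}^2 - n_T\bar{Y}_{\cdot\cdot}^2$ and $SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i} Y_{ij}^2 - \sum_{i=1}^{r} n_i\bar{Y}_{i\cdot}^2$

Expected mean squares follow from $E\{W^2\} = \sigma^2\{W\} + (E\{W\})^2$:
$E\{Y_{ij}\} = \mu_i$, $\sigma^2\{Y_{ij}\} = \sigma^2 \Rightarrow E\{Y_{ij}^2\} = \sigma^2 + \mu_i^2 \Rightarrow E\left\{\sum_{i}\sum_{j} Y_{ij}^2\right\} = n_T\sigma^2 + \sum_{i} n_i\mu_i^2$
$E\{\bar{Y}_{i\cdot}\} = \mu_i$, $\sigma^2\{\bar{Y}_{i\cdot}\} = \dfrac{\sigma^2}{n_i} \Rightarrow E\{\bar{Y}_{i\cdot}^2\} = \dfrac{\sigma^2}{n_i} + \mu_i^2 \Rightarrow E\left\{\sum_{i} n_i\bar{Y}_{i\cdot}^2\right\} = r\sigma^2 + \sum_{i} n_i\mu_i^2$
$E\{\bar{Y}_{\cdot\cdot}\} = \dfrac{\sum_{i} n_i\mu_i}{n_T} = \mu_\cdot$, $\sigma^2\{\bar{Y}_{\cdot\cdot}\} = \dfrac{\sigma^2}{n_T} \Rightarrow E\{\bar{Y}_{\cdot\cdot}^2\} = \dfrac{\sigma^2}{n_T} + \mu_\cdot^2 \Rightarrow E\{n_T\bar{Y}_{\cdot\cdot}^2\} = \sigma^2 + n_T\mu_\cdot^2$
Hence $E\{SSE\} = (n_T - r)\sigma^2$ and $E\{SSTR\} = (r - 1)\sigma^2 + \sum_{i} n_i\mu_i^2 - n_T\mu_\cdot^2 = (r - 1)\sigma^2 + \sum_{i} n_i(\mu_i - \mu_\cdot)^2$.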
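The sums of squares, degrees of freedom, and mean squares in the ANOVA table can be checked numerically. The following is a minimal sketch in plain Python; the function name `anova_table` and the data are hypothetical and serve only to illustrate the formulas.

```python
def anova_table(groups):
    """Sums of squares, degrees of freedom, and mean squares for a one-way layout.

    `groups` is a list of lists, one inner list of responses per factor level.
    """
    r = len(groups)
    n_i = [len(g) for g in groups]
    n_T = sum(n_i)
    means = [sum(g) / len(g) for g in groups]             # Ybar_i.
    grand = sum(sum(g) for g in groups) / n_T             # Ybar..

    sstr = sum(n * (m - grand) ** 2 for n, m in zip(n_i, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ssto = sum((y - grand) ** 2 for g in groups for y in g)

    return {
        "SSTR": sstr, "SSE": sse, "SSTO": ssto,
        "df_TR": r - 1, "df_E": n_T - r, "df_TO": n_T - 1,
        "MSTR": sstr / (r - 1), "MSE": sse / (n_T - r),
    }


# Hypothetical data, illustrative only.
groups = [
    [11.0, 17.0, 16.0, 14.0],
    [12.0, 10.0, 15.0],
    [23.0, 20.0, 18.0, 17.0, 22.0],
]
table = anova_table(groups)
for name, value in table.items():
    print(name, round(value, 4))
```

SSTO is computed directly from its definition rather than as SSTR + SSE, so the printed values give an independent check of the decomposition.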
F-Test for H0: $\mu_1 = \cdots = \mu_r$

$H_0: \mu_1 = \cdots = \mu_r$ $\qquad H_A$: Not all $\mu_i$ are equal

Test Statistic: $F^* = \dfrac{MSTR}{MSE}$

Under the null hypothesis (and independence and normality of errors):
$\dfrac{SSTR}{\sigma^2} \sim \chi^2_{r-1}$, $\quad \dfrac{SSE}{\sigma^2} \sim \chi^2_{n_T - r}$, and they are independent (independent even if $H_0$ is false)

$$\Rightarrow F^* = \frac{\left(SSTR/\sigma^2\right)/(r - 1)}{\left(SSE/\sigma^2\right)/(n_T - r)} = \frac{MSTR}{MSE} \sim F(r - 1, n_T - r)$$

Decision Rule: Reject $H_0$ if $F^* = \dfrac{MSTR}{MSE} > F(1 - \alpha; r - 1, n_T - r)$

General Linear Test of Equal Means

$H_0: \mu_1 = \cdots = \mu_r = \mu_c$, where $\mu_c$ = common mean (Reduced Model)
$H_A$: Not all $\mu_i$ are equal (Complete Model)

Reduced Model: $\hat{\mu}_c = \bar{Y}_{\cdot\cdot} = \hat{Y}_{ij} \Rightarrow SSE(R) = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \hat{Y}_{ij})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = SSTO$, $\quad df_R = n_T - 1$

Complete (Full) Model: $\hat{\mu}_i = \bar{Y}_{i\cdot} = \hat{Y}_{ij} \Rightarrow SSE(F) = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \hat{Y}_{ij})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot})^2 = SSE$, $\quad df_F = n_T - r$

Test Statistic:
$$F^* = \frac{\left[SSE(R) - SSE(F)\right]/(df_R - df_F)}{SSE(F)/df_F} = \frac{(SSTO - SSE)/\left[(n_T - 1) - (n_T - r)\right]}{SSE/(n_T - r)} = \frac{SSTR/(r - 1)}{SSE/(n_T - r)} = \frac{MSTR}{MSE}$$

Factor Effects Model

Alternative form of the model (necessary for interactions in multi-factor models):
$\mu_i = \mu_\cdot + (\mu_i - \mu_\cdot) = \mu_\cdot + \tau_i$
$Y_{ij} = \mu_\cdot + \tau_i + \varepsilon_{ij}$, where $\tau_i = \mu_i - \mu_\cdot$ = "effect" of the $i$th factor level and $\varepsilon_{ij} \sim NID(0, \sigma^2)$

Defining $\mu_\cdot$:
Unweighted Mean: $\mu_\cdot = \dfrac{\sum_{i=1}^{r}\mu_i}{r} \Rightarrow \sum_{i=1}^{r}\tau_i = 0$
Weighted Mean: $\mu_\cdot = \sum_{i=1}^{r} w_i\mu_i$ s.t. $\sum_{i=1}^{r} w_i = 1 \Rightarrow \sum_{i=1}^{r} w_i\tau_i = 0$
Weights may represent the population sizes in observational studies

Note: $\mu_1 = \cdots = \mu_r \Rightarrow \tau_1 = \cdots = \tau_r = 0$
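The two definitions of $\mu_\cdot$ lead to different factor effects for the same set of means. A short sketch follows, assuming hypothetical means and weights; `factor_effects` is an illustrative helper name, not something defined in the text.

```python
def factor_effects(mu, weights=None):
    """Return (mu_dot, tau, w) for level means `mu`.

    With weights=None the unweighted mean is used (w_i = 1/r).
    Otherwise the weights are rescaled to sum to 1.
    """
    r = len(mu)
    if weights is None:
        weights = [1.0] * r
    total = sum(weights)
    w = [x / total for x in weights]                      # enforce sum(w_i) = 1
    mu_dot = sum(wi * mi for wi, mi in zip(w, mu))        # overall mean mu.
    tau = [mi - mu_dot for mi in mu]                      # tau_i = mu_i - mu.
    return mu_dot, tau, w


mu = [10.0, 14.0, 21.0]                                   # hypothetical level means

mu_u, tau_u, w_u = factor_effects(mu)                     # unweighted mean
mu_w, tau_w, w_w = factor_effects(mu, weights=[4, 3, 5])  # weights proportional to sizes

print(mu_u, tau_u, sum(tau_u))
print(mu_w, tau_w, sum(wi * ti for wi, ti in zip(w_w, tau_w)))
```

In the unweighted case the effects sum to zero; in the weighted case it is the weighted sum $\sum w_i\tau_i$ that is zero, while the plain sum generally is not.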
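Regression Approach – Factor Effects Model

Suppose $r = 3$ and $n_1 = n_2 = n_3 = 2$, with the Unweighted Mean Model: $\tau_1 + \tau_2 + \tau_3 = 0 \Rightarrow \tau_3 = -\tau_1 - \tau_2$

$$\mathbf{Y} = \begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix}, \quad \mathbf{X} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & -1 \end{pmatrix}, \quad \boldsymbol{\beta} = \begin{pmatrix} \mu_\cdot \\ \tau_1 \\ \tau_2 \end{pmatrix}, \quad \boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{31} \\ \varepsilon_{32} \end{pmatrix}$$

$$E\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta} = \begin{pmatrix} \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot - \tau_1 - \tau_2 \\ \mu_\cdot - \tau_1 - \tau_2 \end{pmatrix} = \begin{pmatrix} \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot + \tau_3 \\ \mu_\cdot + \tau_3 \end{pmatrix}$$

$$\mathbf{X'X} = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 4 & 2 \\ 0 & 2 & 4 \end{pmatrix}, \qquad \mathbf{X'Y} = \begin{pmatrix} Y_{11} + Y_{12} + Y_{21} + Y_{22} + Y_{31} + Y_{32} \\ (Y_{11} + Y_{12}) - (Y_{31} + Y_{32}) \\ (Y_{21} + Y_{22}) - (Y_{31} + Y_{32}) \end{pmatrix}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X'X})^{-1}\mathbf{X'Y} = \begin{pmatrix} 1/6 & 0 & 0 \\ 0 & 1/3 & -1/6 \\ 0 & -1/6 & 1/3 \end{pmatrix}\mathbf{X'Y} = \begin{pmatrix} \bar{Y}_{\cdot\cdot} \\ \bar{Y}_{1\cdot} - \bar{Y}_{\cdot\cdot} \\ \bar{Y}_{2\cdot} - \bar{Y}_{\cdot\cdot} \end{pmatrix} = \begin{pmatrix} \hat{\mu}_\cdot \\ \hat{\tau}_1 \\ \hat{\tau}_2 \end{pmatrix}$$

Factor Effects Model with Weighted Mean

Weights are relative sample sizes: $w_i = \dfrac{n_i}{n_T}$

$\sum_{i=1}^{r} w_i\tau_i = 0 \Rightarrow \sum_{i=1}^{r}\dfrac{n_i}{n_T}\tau_i = 0 \Rightarrow \sum_{i=1}^{r} n_i\tau_i = 0 \Rightarrow n_r\tau_r = -\sum_{i=1}^{r-1} n_i\tau_i \Rightarrow \tau_r = -\sum_{i=1}^{r-1}\dfrac{n_i}{n_r}\tau_i$

$Y_{ij} = \mu_\cdot + \tau_1 X_{ij1} + \cdots + \tau_{r-1}X_{ij,r-1} + \varepsilon_{ij}$

$X_{ij1} = 1$ if $i = 1$; $\; -\dfrac{n_1}{n_r}$ if $i = r$; $\; 0$ otherwise
...
$X_{ij,r-1} = 1$ if $i = r - 1$; $\; -\dfrac{n_{r-1}}{n_r}$ if $i = r$; $\; 0$ otherwise

Regression for Cell Means Model

$Y_{ij} = \mu_i + \varepsilon_{ij} = \mu_1 X_{ij1} + \cdots + \mu_r X_{ijr} + \varepsilon_{ij}$

$X_1 = 1$ if $i = 1$, $0$ if $i \neq 1$; $\;\ldots\;$; $X_r = 1$ if $i = r$, $0$ if $i \neq r$

$\boldsymbol{\beta} = (\mu_1, \ldots, \mu_r)'$, $\quad \hat{\boldsymbol{\beta}} = (\bar{Y}_{1\cdot}, \ldots, \bar{Y}_{r\cdot})'$

When fitting with a regression package, no intercept is used.

Under $H_0: \mu_1 = \cdots = \mu_r = \mu_c$: $\mathbf{X} = \mathbf{1}$ (a single column of ones), $\boldsymbol{\beta} = [\mu_c]$, $\hat{\boldsymbol{\beta}} = \bar{Y}_{\cdot\cdot}$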
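The regression codings above (cell means, unweighted factor effects, and the reduced model under $H_0$) can be compared by solving the normal equations $\mathbf{X'X}\hat{\boldsymbol{\beta}} = \mathbf{X'Y}$ directly. This is a minimal pure-Python sketch for the $r = 3$, $n_i = 2$ case; the response values and the helper names `solve` and `least_squares` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    p = len(a)
    aug = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for k in range(p):
            if k != c:
                factor = aug[k][c] / aug[c][c]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[c])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def least_squares(X, y):
    """Return the solution of the normal equations (X'X) b = X'y."""
    cols = list(zip(*X))                                   # columns of X
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * yi for u, yi in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


# Hypothetical responses: Y11, Y12, Y21, Y22, Y31, Y32.
y = [3.0, 5.0, 8.0, 10.0, 13.0, 15.0]

X_cell = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
X_effects = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1], [1, -1, -1], [1, -1, -1]]
X_reduced = [[1]] * 6                                      # model under H0

b_cell = least_squares(X_cell, y)        # estimates of mu_1, mu_2, mu_3
b_effects = least_squares(X_effects, y)  # estimates of mu., tau_1, tau_2
b_reduced = least_squares(X_reduced, y)  # estimate of mu_c

print(b_cell, b_effects, b_reduced)
```

The cell means fit returns the three level means, the factor effects fit returns the overall mean and the first two deviations from it, and the reduced fit returns the overall mean alone.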
Randomization (aka Permutation) Tests
• Treats the units in the study as a finite population of units, each with a fixed error term $\varepsilon_{ij}$
• When the randomization procedure assigns the unit to treatment $i$, we observe $Y_{ij} = \mu_\cdot + \tau_i + \varepsilon_{ij}$
• When there are no treatment effects (all $\tau_i = 0$), $Y_{ij} = \mu_\cdot + \varepsilon_{ij}$
• We can compute a test statistic, such as $F^*$, under all (or in practice, many) potential treatment arrangements of the observed units (responses)
• The p-value is the proportion of these test statistics that are as or more extreme than the original
• Total number of potential permutations = $n_T!/(n_1! \cdots n_r!)$

Power Approach to Sample Size Choice – Tables

When the means are not all equal, the F-statistic is non-central F:
$F^* \sim F(r - 1, n_T - r, \phi)$ where $\phi = \dfrac{1}{\sigma}\sqrt{\dfrac{\sum_{i=1}^{r} n_i(\mu_i - \mu_\cdot)^2}{r}}$ and $\mu_\cdot = \dfrac{\sum_{i=1}^{r} n_i\mu_i}{n_T}$

When all sample sizes are equal: $\phi = \dfrac{1}{\sigma}\sqrt{\dfrac{n\sum_{i=1}^{r}(\mu_i - \mu_\cdot)^2}{r}}$ where $\mu_\cdot = \dfrac{\sum_{i=1}^{r}\mu_i}{r}$

The power of the test, when conducted at the $\alpha$ significance level:
Power $= \Pr\{F^* > F(1 - \alpha; r - 1, n_T - r) \mid \phi\}$ (see Table B.11)

Choose sample sizes so that the power is sufficiently high for specific $(\mu_1, \ldots, \mu_r)$ or effect levels of interest $(\tau_1, \ldots, \tau_r)$.

Table B.12 is simple to use for equal sample sizes and mean levels of interest summarized by $\Delta/\sigma$, where $\Delta = \max(\mu_i) - \min(\mu_i)$.

Power Approach to Sample Size Choice – R Code

When the means are not all equal, the F-statistic is non-central F:
$F^* \sim F(r - 1, n_T - r, \lambda)$ where $\lambda = \dfrac{\sum_{i=1}^{r} n_i(\mu_i - \mu_\cdot)^2}{\sigma^2}$ and $\mu_\cdot = \dfrac{\sum_{i=1}^{r} n_i\mu_i}{n_T}$

When all sample sizes are equal: $\lambda = \dfrac{n\sum_{i=1}^{r}(\mu_i - \mu_\cdot)^2}{\sigma^2}$ where $\mu_\cdot = \dfrac{\sum_{i=1}^{r}\mu_i}{r}$

The power of the test, when conducted at the $\alpha$ significance level:
Power $= \Pr\{F^* > F(1 - \alpha; r - 1, n_T - r) \mid F^* \sim F(r - 1, n_T - r, \lambda)\}$

In R: $F(1 - \alpha; r - 1, n_T - r)$ = qf(1 − α, r − 1, nT − r)
Power = 1 − β = 1 − pf(qf(1 − α, r − 1, nT − r), r − 1, nT − r, λ)
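As a rough cross-check on the power calculation, the sketch below computes $\lambda$ and $\phi$ from their definitions and then approximates the power by Monte Carlo simulation in plain Python. This is a simulation substitute, not the table lookup or the R `pf`/`qf` calculation: the critical value is itself estimated from simulated null data, so the result carries simulation error. The means, sample sizes, $\sigma$, seed, and all function names are hypothetical.

```python
import math
import random


def noncentrality(mu, n, sigma):
    """Return (lambda, phi) for level means `mu`, sample sizes `n`, error SD `sigma`."""
    n_T = sum(n)
    mu_dot = sum(ni * mi for ni, mi in zip(n, mu)) / n_T
    ss = sum(ni * (mi - mu_dot) ** 2 for ni, mi in zip(n, mu))
    return ss / sigma ** 2, math.sqrt(ss / len(mu)) / sigma


def f_stat(groups):
    """F* = MSTR / MSE for a one-way layout."""
    r = len(groups)
    n_T = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_T
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (sstr / (r - 1)) / (sse / (n_T - r))


def simulate_f(mu, n, sigma, reps, rng):
    """Simulate `reps` F* values with normal errors."""
    return [
        f_stat([[rng.gauss(m, sigma) for _ in range(ni)] for m, ni in zip(mu, n)])
        for _ in range(reps)
    ]


def mc_power(mu, n, sigma, alpha=0.05, reps=2000, seed=1):
    """Monte Carlo power estimate using a simulated (not exact) critical value."""
    rng = random.Random(seed)
    null = sorted(simulate_f([0.0] * len(mu), n, sigma, reps, rng))
    crit = null[math.ceil((1 - alpha) * reps) - 1]   # empirical 1 - alpha quantile
    alt = simulate_f(mu, n, sigma, reps, rng)
    return sum(f > crit for f in alt) / reps


mu, n, sigma = [10.0, 12.0, 14.0], [5, 5, 5], 2.0     # hypothetical design
lam, phi = noncentrality(mu, n, sigma)
power = mc_power(mu, n, sigma)
print(lam, phi, power)
```

The two noncentrality measures are related by $\lambda = r\phi^2$, which the code makes easy to confirm for any design.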
Power Approach to Finding "Best" Treatment

Goal: Determining the best treatment (the one with the highest or lowest mean)
$1 - \alpha$ = probability that the treatment with the highest (lowest) sample mean has the highest (lowest) population mean
$\lambda$ = difference between the highest (lowest) mean and the 2nd highest (lowest) mean
$r$ = number of treatments
Table B.13 gives $\lambda\sqrt{n}/\sigma$ for various $r$ and $1 - \alpha$; solve for $n$ for given $\lambda$ and $\sigma$.
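The selection probability described above can also be approximated by simulation, which gives a sanity check on a sample size read from the table. The sketch below is a Monte Carlo approximation under a labeled assumption: the best mean exceeds every other mean by exactly $\lambda$ and the remaining means are equal. It does not reproduce Table B.13, and the function names, seed, and inputs are hypothetical.

```python
import random


def prob_correct_selection(r, n, lam, sigma, reps=4000, seed=1):
    """Estimate P(treatment with highest sample mean has highest population mean).

    Assumption (not from the text): one treatment has mean `lam`, the other
    r - 1 treatments have mean 0, and each sample mean is N(mean, sigma^2 / n).
    """
    rng = random.Random(seed)
    se = sigma / n ** 0.5                      # standard error of a sample mean
    hits = 0
    for _ in range(reps):
        best = rng.gauss(lam, se)
        others = [rng.gauss(0.0, se) for _ in range(r - 1)]
        if best > max(others):
            hits += 1
    return hits / reps


def smallest_n(r, lam, sigma, target, n_max=200):
    """Smallest per-treatment n whose estimated selection probability reaches `target`."""
    for n in range(2, n_max + 1):
        if prob_correct_selection(r, n, lam, sigma) >= target:
            return n
    return None                                # target not reached within n_max


p_small = prob_correct_selection(r=3, n=2, lam=1.0, sigma=2.0)
p_large = prob_correct_selection(r=3, n=30, lam=1.0, sigma=2.0)
n_needed = smallest_n(r=3, lam=1.0, sigma=2.0, target=0.90)
print(p_small, p_large, n_needed)
```

Because the estimate is noisy, `smallest_n` should be read as an approximate value; increasing `reps` tightens it at the cost of run time.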
