Discussion 3
1/20/2014

Outline
• How to fill out the table in the appendix in HW3
• What does the Model statement do in SAS Proc GLM (please download lab 2 for reference)
  – What is a statistical model in layman's terms
  – What are the residuals and predicted values
• How to calculate Variance Components
• Power questions

Questions?

How to fill out the table in the appendix in HW3
• The treatments are the herbicides used; there are 6 different herbicides (treatments).
• The pots are the objects to which the treatments were applied; therefore they are your experimental units.
• Each pot contained 4 plants, so 4 measurements were made per pot (subsamples).

For the table itself:
• Specify what design was used
• Specify what was measured
• Specify what the treatment was applied to (the experimental unit)
• Specify what your treatments were and how many levels (i.e. different treatments) there were
• Specify whether sub-samples were measured and what was measured as a sub-sample

What does the Model statement do in SAS Proc GLM
• What is a statistical model in layman's terms
  – A mathematical equation constructed to describe the makeup of an observation or a group of observations
  – Example: One-way ANOVA model: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
The one-way ANOVA model describes an observation (Y) as a deviation from the overall mean (µ) of a group of observations due to a treatment effect (τ), plus random error (ε).
• The overall mean (Ȳ..) is the mean of all the observations Yij
  – Yij is read as the observation from the ith treatment and the jth replication.
• The treatment effect is the deviation of the treatment mean (Ȳi.) from the overall mean (Ȳ..)
• The random error is the deviation of a given replication of a treatment (Yij) from that treatment's mean (Ȳi.)
• What are the predicted values?
  – The theoretical values obtained from the statistical model (the error is excluded from the model): Predicted $\hat{Y}_{ij} = \mu + \tau_i$, estimated by the treatment mean Ȳi.
• What are the residuals?
  – The deviations of the observed values from the predicted values: $e_{ij} = Y_{ij} - \hat{Y}_{ij}$
• To do ANOVA in SAS we use Proc GLM and specify our model:
  – Example 3.2 in lab 2:
      Proc GLM;
        Class Culture;
        Model Nlevel = Culture;
        Means Culture;
        Output Out = Residual R = Res1 P = Pred1;
      Run;
The Class statement tells SAS that our data are grouped by the variable Culture. In the Model statement we tell SAS that we want to explain Nlevel by the variable Culture; therefore the model is $Nlevel_{ij} = \mu + Culture_i + \varepsilon_{ij}$. SAS calculates the overall mean and the residuals, but Proc GLM also calculates the sums of squares, mean squares, F-values, and p-values for each variable we specify in the model, as described next.
• How does SAS calculate the sums of squares (SS)?
  $\sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 = \sum_i \sum_j (\bar{Y}_{i.} - \bar{Y}_{..})^2 + \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2$
  Which is equivalent to:
  $\text{Total SS} = r \sum_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 + \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2$
  Where: r = number of replications
When you divide each SS by its respective degrees of freedom, the mean squares are obtained (equivalent to the variances, s²).
• In Nested Designs:
  – Example 3.4 in lab 2:
      Proc GLM;
        Class Trtmt Pot;
        * We want SAS to calculate the variance between pots because that will be the error for our ANOVA;
        Model Growth = Trtmt Pot(Trtmt); * Pot is not a treatment. Pot is only an ID variable;
        Random Pot(Trtmt); * Must specify Pot as random because we are not interested in detecting differences between pots;
        Test H = Trtmt E = Pot(Trtmt); * Here we request a customized F test;
      Run;
  Total SS = Treatment SS + Pot SS (e.u.) + sub-sample SS:
  $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2 = rs \sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2 + s \sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{i..})^2 + \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij.})^2$
  Where i is the treatment ID, j is the replication ID, k is the subsample ID, r = number of replications, and s = number of sub-samples

How to Calculate Variance Components
• We analyze nested designs to estimate the variance components, which can be used to estimate the optimal sub-sample size (section 3.5.2.3 in the lecture topic 3 reading)
• The variance components are the estimates of variance for a particular variable (e.g. treatment, experimental unit, and subsample)
  – The variance components can be calculated using Proc VarComp in SAS
• Section 3.5.2.2 of the lecture topic 3 reading describes an experiment in which mint plants are exposed to different treatments of temperature and daylight, and stem length was measured
• There are a total of 6 treatments, 3 pots (replications) per treatment, and 4 plants measured per pot (subsamples)
• The sums of squares were calculated for treatments, pots, and subsamples:
  Total SS = Treatment SS + Pot SS (e.u.) + sub-sample SS
• Then the variances (equivalent to mean squares) were calculated:
  MS Treatment = Treatment SS / (t - 1)
  MS Pot = Pot SS / [t(r - 1)]
  MS sub-sample = sub-sample SS / [rt(s - 1)]
  Where t = number of treatments, r = number of replications, and s = number of sub-samples
• The variance due to the sub-sample is the variance due to error: MS sub-sample = $\sigma_\delta^2$
• To estimate the variance component of the subsample we just calculate the MS of the subsample: MS sub-sample = sub-sample SS / [rt(s - 1)] = 0.93
• The variance of pots contains the variance of the sub-samples (NOTE: this is not the variance component of pots): MS Pots = $\sigma_\delta^2 + 4\sigma_\varepsilon^2$ = 2.15, where 4 = s, the number of sub-samples per pot
• To estimate the variance component of pots we have to solve for $\sigma_\varepsilon^2$.
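The variance component of pots is calculated below:
  $\sigma_\varepsilon^2 = (\text{MS Pots} - \sigma_\delta^2) / 4$
  $\sigma_\varepsilon^2 = (2.15 - 0.93) / 4 = 0.305$ (≈ 0.30)
• The variance of treatments includes the variance of pots and subsamples: MS Treatments = $\sigma_\delta^2 + 4\sigma_\varepsilon^2 + 12\Sigma\tau^2/5$ = 35.92, where 12 = rs and 5 = t - 1
• To estimate the variance component of treatments only we have to solve for $\Sigma\tau^2/5$:
  $\Sigma\tau^2/5 = (\text{MS Treatments} - \sigma_\delta^2 - 4\sigma_\varepsilon^2) / 12$
  $\Sigma\tau^2/5 = (35.92 - 0.93 - 4 \times 0.305) / 12 = 2.81$
  (The unrounded 0.305 is used here; plugging in the rounded 0.30 would give 2.82. Equivalently, since $\sigma_\delta^2 + 4\sigma_\varepsilon^2$ = MS Pots, this is (35.92 - 2.15) / 12.)

Power
• How to use Power Charts for ANOVA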
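
As a check on the variance-component arithmetic for the mint experiment, the calculation can be sketched in a few lines of Python. This is only an illustration using the mean squares quoted on the slides (0.93, 2.15, 35.92) and the design sizes t = 6, r = 3, s = 4. The function and variable names are made up for this sketch and are not part of lab 2 or of any SAS output. The treatment term is computed as (MS Treatments - MS Pots) / (rs), which should be the same quantity because MS Pots already equals the sub-sample variance plus s times the pot variance component.

```python
# Mean squares quoted on the slides for the mint experiment.
MS_SUBSAMPLE = 0.93
MS_POT = 2.15
MS_TREATMENT = 35.92

# Design sizes: treatments, pots (replications) per treatment, plants per pot.
T, R, S = 6, 3, 4


def variance_components(ms_trt, ms_pot, ms_sub, t, r, s):
    """Solve the expected-mean-square equations of the nested design.

    Hypothetical helper for illustration; names are assumptions, not lab code.
    """
    var_sub = ms_sub                         # sub-sample variance (sigma_delta^2)
    var_pot = (ms_pot - ms_sub) / s          # pot variance component (sigma_epsilon^2)
    trt_term = (ms_trt - ms_pot) / (r * s)   # treatment term, Sum(tau^2)/(t - 1)
    df = {
        "treatment": t - 1,
        "pot": t * (r - 1),
        "subsample": r * t * (s - 1),
    }
    return var_sub, var_pot, trt_term, df


var_sub, var_pot, trt_term, df = variance_components(
    MS_TREATMENT, MS_POT, MS_SUBSAMPLE, T, R, S
)

print(f"sub-sample variance:    {var_sub:.3f}")
print(f"pot variance component: {var_pot:.3f}")
print(f"treatment term:         {trt_term:.3f}")
print(f"degrees of freedom:     {df}")
```

Keeping the unrounded pot component inside the function avoids the small discrepancy that appears when a rounded intermediate value is carried into the treatment calculation by hand.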