Topic 30: Random Effects

Outline
• One-way random effects model
  – Data
  – Model
  – Inference

Data for one-way random effects model
• $Y$, the response variable
• Factor with levels $i = 1$ to $r$
• $Y_{ij}$ is the $j$th observation in cell $i$, $j = 1$ to $n_i$
• The model structure is almost identical to the earlier one-way ANOVA
• The difference is in the level of inference

Level of Inference
• In one-way ANOVA, interest was in comparing the factor level means
• In the random effects scenario, interest is in the population of factor level means, not just the means of the $r$ levels in the study
• Need to make assumptions about the population distribution
• The levels used in the study are treated as a random draw from the population of factor levels

KNNL Example
• KNNL p 1036
• $Y$ is the rating of a job applicant
• Factor A represents five different personnel interviewers (officers), $r = 5$
• $n = 4$ different applicants were interviewed by each officer
• The interviewers were selected at random and the applicants were randomly assigned to interviewers

Read and check the data
    /* read one rating and one officer id per record */
    data a1;
      infile 'c:\...\CH25TA01.DAT';
      input rating officer;
    run;
    proc print data=a1;
    run;

The data
    Obs       1   2   3   4   5   6   7   8   9  10
    rating   76  65  85  74  59  75  81  67  49  63
    officer   1   1   1   1   2   2   2   2   3   3

The data (continued)
    Obs      11  12  13  14  15  16  17  18  19  20
    rating   61  46  74  71  85  89  66  84  80  79
    officer   3   3   4   4   4   4   5   5   5   5

Plot the data
    title1 'Plot of the data';
    symbol1 v=circle i=none c=black;
    proc gplot data=a1;
      plot rating*officer/frame;
    run;

Find and plot the means
    /* BY processing needs a1 sorted by officer; the file already is */
    proc means data=a1;
      output out=a2 mean=avrate;
      var rating;
      by officer;
    run;
    title1 'Plot of the means';
    symbol1 v=circle i=join c=black;
    proc gplot data=a2;
      plot avrate*officer/frame;
    run;

Random effects model
• $Y_{ij} = \mu_i + \varepsilon_{ij}$
  – the $\mu_i$ are iid $N(\mu, \sigma_\mu^2)$ (the key difference from the fixed effects model)
  – the $\varepsilon_{ij}$ are iid $N(0, \sigma^2)$
  – $\mu_i$ and $\varepsilon_{ij}$ are independent
• $Y_{ij} \sim N(\mu, \sigma_\mu^2 + \sigma^2)$
• Two sources of variation
• Observations with the same $i$ are not independent; their covariance is $\sigma_\mu^2$

Random effects model
• This model is also called
  – Model II ANOVA
  – A variance components model
• The components of variance are $\sigma_\mu^2$ and $\sigma^2$
• The models that we previously studied are called fixed effects models

Random factor effects model
• $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, where $\tau_i = \mu_i - \mu$
• $\tau_i \sim N(0, \sigma_\mu^2)$
• $\varepsilon_{ij} \sim N(0, \sigma^2)$
• $Y_{ij} \sim N(\mu, \sigma_\mu^2 + \sigma^2)$

Parameters
• There are three parameters in these models
  – $\mu$
  – $\sigma_\mu^2$
  – $\sigma^2$
• The cell means (factor level means) are random variables, not parameters
• Inference focuses on these variances

Primary Hypothesis
• Want to know whether $H_0: \sigma_\mu^2 = 0$ holds
• This implies that all $\mu_i$ in the model are equal, and also that all $\mu_i$ not selected for the analysis are equal
• Thus the scope is broader than in the fixed effects case
• Need the factor levels of the study to be “representative” of the population

Alternative Hypothesis
• We are sometimes interested in estimating $\sigma_\mu^2 / (\sigma_\mu^2 + \sigma^2)$
• This is the same as $\sigma_\mu^2 / \sigma_Y^2$
• In some applications it is called the intraclass correlation coefficient
• It is the correlation between two observations with the same $i$
• It is also the proportion of the total variation of $Y$ that is due to the factor

ANOVA table
• The terms and layout of the ANOVA table are the same as for the fixed effects model
• The expected mean squares (EMS) are different because of the additional random effects, but the F test statistics are the same
• Be wary: the hypotheses being tested are different

EMS and parameter estimates
• $E(MSA) = \sigma^2 + n\sigma_\mu^2$
• $E(MSE) = \sigma^2$
• We use MSE to estimate $\sigma^2$
• Can use $(MSA - MSE)/n$ to estimate $\sigma_\mu^2$
• Question: Why might we want an alternative estimate for $\sigma_\mu^2$?
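The EMS-based estimates can be checked by hand. The following is a minimal sketch in Python rather than SAS, assuming a balanced design (every level has the same n) and using the 20 ratings listed in the data slides. The function name `variance_components` and the dictionary keys are illustrative choices, not part of topic30.sas.

```python
# Ratings from the KNNL interviewer example, grouped by officer (r = 5, n = 4).
ratings = {
    1: [76, 65, 85, 74],
    2: [59, 75, 81, 67],
    3: [49, 63, 61, 46],
    4: [74, 71, 85, 89],
    5: [66, 84, 80, 79],
}


def variance_components(groups):
    """ANOVA (method-of-moments) estimates for a balanced one-way random effects design."""
    r = len(groups)
    n = len(next(iter(groups.values())))
    if any(len(ys) != n for ys in groups.values()):
        raise ValueError("this sketch assumes a balanced design")

    grand_mean = sum(sum(ys) for ys in groups.values()) / (r * n)
    level_means = {i: sum(ys) / n for i, ys in groups.items()}

    # Between-level and within-level sums of squares.
    ssa = n * sum((m - grand_mean) ** 2 for m in level_means.values())
    sse = sum((y - level_means[i]) ** 2 for i, ys in groups.items() for y in ys)

    msa = ssa / (r - 1)        # estimates sigma^2 + n * sigma_mu^2
    mse = sse / (r * (n - 1))  # estimates sigma^2
    sigma2_mu_hat = (msa - mse) / n  # negative whenever MSA < MSE
    icc = sigma2_mu_hat / (sigma2_mu_hat + mse)

    return {
        "msa": msa,
        "mse": mse,
        "f": msa / mse,
        "sigma2_mu_hat": sigma2_mu_hat,
        "icc": icc,
    }


est = variance_components(ratings)
for name, value in est.items():
    print(f"{name}: {value:.4f}")
```

For these data the sketch gives MSE ≈ 73.2833, an estimate of about 80.4104 for $\sigma_\mu^2$, and a ratio of about .5232, in line with the proc varcomp and proc mixed output. The comment on `sigma2_mu_hat` is relevant to the question posed on the slide: nothing in the formula prevents a negative value.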
Main Hypotheses
• $H_0: \sigma_\mu^2 = 0$
• $H_1: \sigma_\mu^2 \neq 0$
• Test statistic is $F = MSA/MSE$ with $r-1$ and $r(n-1)$ degrees of freedom; reject when F is large, report the P-value

Run proc glm
    proc glm data=a1;
      class officer;
      model rating=officer;
      /* declare officer random; test uses the expected mean squares */
      random officer/test;
    run;

Model and error output
    Source   DF    MS     F      P
    Model     4   394   5.39   0.0068
    Error    15    73
    Total    19

Random statement output
    Source    Type III Expected MS
    officer   Var(Error) + 4 Var(officer)

Proc varcomp
    proc varcomp data=a1;
      class officer;
      model rating=officer;
    run;

Output
    MIVQUE(0) Estimates
    Variance Component     rating
    Var(officer)         80.41042
    Var(Error)           73.28333

Other estimation methods are available; MIVQUE(0) is the default.

Proc mixed
    proc mixed data=a1 cl;
      class officer;
      model rating=;
      /* vcorr prints the estimated correlation matrix of the observations */
      random officer/vcorr;
    run;

Output
    Covariance Parameter Estimates
    Cov Parm    Est    Lower   Upper
    officer     80.4   24.4    1498
    Residual    73.2   39.9     175

Estimated intraclass correlation: 80.4104/(80.4104+73.2833) = .5232

Output from vcorr
    Row    Col1     Col2     Col3     Col4
    1      1.0000   0.5232   0.5232   0.5232
    2      0.5232   1.0000   0.5232   0.5232
    3      0.5232   0.5232   1.0000   0.5232
    4      0.5232   0.5232   0.5232   1.0000

Other topics
• Estimate and CI for $\mu$, p1038
  – Standard error involves a combination of two variances
  – Use MSA instead of MSE → $r-1$ df
• Estimate and CI for $\sigma_\mu^2 / (\sigma_\mu^2 + \sigma^2)$, p1040
• CIs for $\sigma_\mu^2$ and $\sigma^2$, p1041-1047
  – Available using Proc Mixed

Applications
• In the KNNL example we would like $\sigma_\mu^2 / (\sigma_\mu^2 + \sigma^2)$ to be small, indicating that the variance due to interviewer is small relative to the variance due to applicants
• In many other examples we would like this quantity to be large
  – e.g., Are partners more likely to be similar in sociability?

Last slide
• Start reading KNNL Chapter 25
• We used program topic30.sas to generate the output for today
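Returning to the estimate of $\mu$ listed under Other topics: the reason MSA, not MSE, appears in the standard error can be worked out from the model. What follows is a sketch for the balanced case, assuming all $n_i = n$. The grand mean 71.45 is computed from the 20 ratings, and MSA ≈ 394.925 is recovered from the proc varcomp estimates as 73.28333 + 4(80.41042). The 95% level is chosen only for illustration.

```latex
\[
\begin{aligned}
\bar{Y}_{\cdot\cdot} &= \bar{\mu}_{\cdot} + \bar{\varepsilon}_{\cdot\cdot}
  \quad\text{(average of $r$ level means plus average of $rn$ errors)} \\
\operatorname{Var}(\bar{Y}_{\cdot\cdot})
  &= \frac{\sigma_\mu^2}{r} + \frac{\sigma^2}{rn}
   = \frac{\sigma^2 + n\sigma_\mu^2}{rn}
   = \frac{E(MSA)}{rn} \\
s^2\{\bar{Y}_{\cdot\cdot}\} &= \frac{MSA}{rn}
   \approx \frac{394.925}{20} \approx 19.746,
  \qquad s\{\bar{Y}_{\cdot\cdot}\} \approx 4.444 \\
\bar{Y}_{\cdot\cdot} \pm t(0.975;\, r-1)\, s\{\bar{Y}_{\cdot\cdot}\}
  &\approx 71.45 \pm 2.776 \times 4.444
   \approx (59.11,\ 83.79)
\end{aligned}
\]
```

Because the variance of the grand mean is a multiple of E(MSA), the t multiplier carries the $r-1 = 4$ degrees of freedom of MSA, not the $r(n-1) = 15$ of MSE.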