Topic 5: Power

Outline
• Review estimation and inference for simple linear regression
• Power / Sample Size Estimation
  – Slope
  – Intercept

Simple Linear Normal Error Regression Model
• Yi = β0 + β1Xi + ei
• ei is a Normally distributed random variable with mean 0 and variance σ²
• ei and ej are uncorrelated → independent

Parameter Estimators
• β1: $b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$
• β0: $b_0 = \bar{Y} - b_1 \bar{X}$
• σ²: $s^2 = \dfrac{\sum (Y_i - b_0 - b_1 X_i)^2}{n - 2}$

95% Confidence Intervals for β0 and β1
• b0 ± tc s(b0) and b1 ± tc s(b1)
• where tc = t(.975, n-2), the 97.5th percentile of the t distribution with n-2 degrees of freedom

Significance tests for β0 and β1
• H0: β0 = 0, Ha: β0 ≠ 0; t* = b0/s(b0)
• H0: β1 = 0, Ha: β1 ≠ 0; t* = b1/s(b1)
• Reject H0 if the P-value is small (<.05)

Power
• The power of a significance test is the probability that the null hypothesis is rejected when, in fact, it is false.
• This is 1 - P(Type II error)
• This probability depends on the particular value of the parameter in Ha.

Power for β1
• H0: β1 = 0, Ha: β1 ≠ 0; t* = b1/s(b1)
• When H0 is true, t* ~ t(n-2)
• We reject H0 when |t*| ≥ t(1-α/2, n-2)
• To compute power, we need to find P(|t*| ≥ t(1-α/2, n-2)) for arbitrary values of β1
• Note: When β1 = 0, the calculation gives α
• When H0 is false, t* ~ t(n-2, δ).
• This refers to the noncentral t distribution
• δ = β1/σ(b1) is the noncentrality parameter
• Need to assume values to get σ(b1)
• Often use prior info or pilot study data

Power for β1
• $\sigma^2(b_1) = \dfrac{\sigma^2}{\sum (X_i - \bar{X})^2}$
• Need to assume values for σ², n, and $\sum_{i=1}^{n} (X_i - \bar{X})^2$
• KNNL use tables, see pg 51
• We will use SAS

Example of Power for β1
• From KNNL pg 51
• They assume σ² = 2500, n = 25, and $\sum (X_i - \bar{X})^2 = 19800$, based on s = 48.82 and other results from pg 20
• Results in $\sigma^2(b_1) = \sigma^2 / \sum (X_i - \bar{X})^2 = 2500/19800 = 0.1263$
• Suppose β1 were 1.5
• We can calculate δ = β1/σ(b1) and use the distribution t* ~ t(n-2, δ) to find P(|t*| ≥ t(1-α/2, n-2))
• We will use a function to calculate this probability

SAS CODE
    data a1;
      n=25; sig2=2500; ssx=19800; alpha=.05; beta1=1.5;
      sig2b1=sig2/ssx;              * variance of b1;
      df=n-2;
      delta=beta1/sqrt(sig2b1);     * noncentrality parameter;
      t_c=tinv(1-alpha/2,df);       * critical value;
      * power = P(t* > t_c) + P(t* < -t_c) under noncentral t;
      power=1-probt(t_c,df,delta)+probt(-t_c,df,delta);
      output;
    run;
    proc print data=a1;
    run;

SAS OUTPUT
    Obs   n   sig2    ssx   alpha  beta1   sig2b1   df    delta      t_c    power
      1  25   2500  19800    0.05    1.5  0.12626   23  4.22137  2.06866  0.98121

SAS CODE
    *Computes power for range of beta1;
    data a2;
      n=25; sig2=2500; ssx=19800; alpha=.05;
      sig2b1=sig2/ssx; df=n-2;
      t_c=tinv(1-alpha/2,df);
      do beta1=-2.0 to 2.0 by .05;
        delta=beta1/sqrt(sig2b1);
        power=1-probt(t_c,df,delta)+probt(-t_c,df,delta);
        output;
      end;
    run;

SAS CODE
    title1 'Power for the slope in Simple linear regression';
    symbol1 v=none i=join;
    proc gplot data=a2;
      plot power*beta1;
    proc print data=a2;
    run;

Background Reading
• File knnl051.sas contains the SAS code used in this Topic (addresses example on page 51)
• Chapter 2
  – 2.4 : Estimation of E(Yh)
  – 2.5 : Prediction of new observation
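
As a hand check on the KNNL pg 51 example, the steps in the first SAS data step can be written out with the assumed values (σ² = 2500, Σ(Xi − X̄)² = 19800, n = 25, β1 = 1.5, α = .05). This is only a worked restatement of that calculation. The symbol F is an assumption introduced here for the noncentral t cumulative distribution function (what the code evaluates with probt), and the critical value and final power are taken from the SAS output rather than computed by hand.

```latex
\begin{align*}
\sigma^2(b_1) &= \frac{\sigma^2}{\sum (X_i - \bar{X})^2}
               = \frac{2500}{19800} \approx 0.12626 \\
\sigma(b_1)   &= \sqrt{0.12626} \approx 0.35533 \\
\delta        &= \frac{\beta_1}{\sigma(b_1)}
               = \frac{1.5}{0.35533} \approx 4.221 \\
\text{Power}  &= P\bigl(|t^*| \ge t(0.975;\, 23)\bigr) \\
              &= 1 - F(2.06866;\, 23, \delta) + F(-2.06866;\, 23, \delta)
               \approx 0.981
\end{align*}
```

With δ this far from 0, almost all of the power comes from the upper tail; the lower-tail term F(−2.06866; 23, δ) is negligible.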