Course: I0174 – Regression Analysis
Year: Odd semester 2007/2008

Tests of Linearity and Significance of Regression
Session 02

Tests of Linearity and Significance of Regression
• ANOVA in simple regression
• Confidence intervals for regression parameters
• Test of independence between variables

Bina Nusantara

Measures of Variation: The Sum of Squares
$SST = SSR + SSE$
Total Sample Variability = Explained Variability + Unexplained Variability

Measures of Variation: The Sum of Squares (continued)
• SST = Total Sum of Squares
  – Measures the variation of the $Y_i$ values around their mean, $\bar{Y}$
• SSR = Regression Sum of Squares
  – Explained variation attributable to the relationship between X and Y
• SSE = Error Sum of Squares
  – Variation attributable to factors other than the relationship between X and Y

Measures of Variation: The Sum of Squares (continued)
$SST = \sum (Y_i - \bar{Y})^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
[Figure: Y plotted against X with the fitted line, showing the three deviations at a point $X_i$ relative to the mean $\bar{Y}$]

Venn Diagrams and Explanatory Power of Regression
[Figure: Venn diagram of two overlapping circles, Sales and Sizes]
• Sizes outside the overlap: variations in store Sizes not used in explaining variation in Sales
• Sales outside the overlap (SSE): variations in Sales explained by the error term, that is, unexplained by Sizes
• Overlap (SSR): variations in Sales explained by Sizes, or equivalently variations in Sizes used in explaining variation in Sales

The ANOVA Table in Excel

ANOVA        df       SS    MS                   F         Significance F
Regression   k        SSR   MSR = SSR/k          MSR/MSE   P-value of the F test
Residuals    n-k-1    SSE   MSE = SSE/(n-k-1)
Total        n-1      SST

Measures of Variation, The Sum of Squares: Example
Excel Output for Produce Stores

ANOVA        df   SS             MS           F          Significance F
Regression   1    30380456.12    30380456     81.17909   0.000281201
Residual     5    1871199.595    374239.92
Total        6    32251655.71

• df column: regression (explained) df, error (residual) df, total df
• SS column: SSR, SSE, SST

The Coefficient of Determination
• $r^2 = \frac{SSR}{SST} = \frac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}}$
• Measures the proportion of variation in Y that is explained by the independent variable X in the regression model

Venn Diagrams and Explanatory Power of Regression
$r^2 = \frac{SSR}{SSR + SSE}$
[Figure: Venn diagram of Sales and Sizes, with the overlap labeled SSR]

Coefficients of Determination ($r^2$) and Correlation ($r$)
[Figure: four scatter plots of Y against X, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$]
• $r^2 = 1$, $r = +1$
• $r^2 = 1$, $r = -1$
• $r^2 = 0.81$, $r = +0.9$
• $r^2 = 0$, $r = 0$

Standard Error of Estimate
• $S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$
• Measures the standard deviation (variation) of the Y values around the regression equation

Measures of Variation: Produce Store Example
Excel Output for Produce Stores

Regression Statistics
Multiple R           0.9705572
R Square             0.94198129
Adjusted R Square    0.93037754
Standard Error       611.751517     ($S_{YX}$)
Observations         7              ($n$)

$r^2 = 0.94$: 94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage.

Linear Regression Assumptions
• Normality
  – Y values are normally distributed for each X
  – Probability distribution of error is normal
• Homoscedasticity (Constant Variance)
• Independence of Errors

Consequences of Violation of the Assumptions
• Violation of the Assumptions
  – Non-normality (error not normally distributed)
  – Heteroscedasticity (variance not constant)
    • Usually happens in cross-sectional data
  – Autocorrelation (errors are not independent)
    • Usually happens in time-series data
• Consequences of Any Violation of the Assumptions
  – Predictions and estimations obtained from the sample regression line will not be accurate
  – Hypothesis testing results will not be reliable
• It is Important to Verify the Assumptions

Variation of Errors Around the Regression Line
• Y values are normally distributed around the regression line (error density $f(e)$).
• For each X value, the “spread” or variance around the regression line is the same.
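As a check on the produce-store Excel output, the following minimal Python sketch recomputes the mean squares, the F statistic, $r^2$, adjusted $r^2$, and the standard error of estimate from the reported SSR and SSE, with n = 7 observations and k = 1 independent variable. The function name `anova_summary` and its dictionary keys are illustrative choices, not part of the original slides. The adjusted $r^2$ formula $1 - (1 - r^2)(n-1)/(n-k-1)$ is the usual textbook one and is assumed here to be what Excel reports.

```python
import math


def anova_summary(ssr, sse, n, k=1):
    """Recompute regression ANOVA quantities from the sums of squares.

    ssr: regression (explained) sum of squares
    sse: error (residual) sum of squares
    n:   number of observations
    k:   number of independent variables (1 in simple regression)
    """
    sst = ssr + sse                  # SST = SSR + SSE
    df_regression = k
    df_error = n - k - 1
    msr = ssr / df_regression        # MSR = SSR / k
    mse = sse / df_error             # MSE = SSE / (n - k - 1)
    r2 = ssr / sst                   # coefficient of determination
    return {
        "SST": sst,
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
        "r2": r2,
        "r": math.sqrt(r2),          # magnitude only; the sign follows the slope b1
        "adj_r2": 1 - (1 - r2) * (n - 1) / df_error,  # assumed adjusted r^2 formula
        "S_YX": math.sqrt(mse),      # standard error of estimate
    }


# SSR, SSE, n and k are taken from the produce-store ANOVA output.
summary = anova_summary(ssr=30380456.12, sse=1871199.595, n=7, k=1)
for name, value in summary.items():
    print(f"{name:7s} {value:.6f}")
```

Under these assumptions the results should agree with the Excel output up to rounding: F is about 81.18, $r^2$ about 0.942, and $S_{YX}$ about 611.75.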
[Figure: sample regression line of Y against X, with the error distribution drawn at X1 and X2]

Residual Analysis
• Purposes
  – Examine linearity
  – Evaluate violations of assumptions
• Graphical Analysis of Residuals
  – Plot residuals vs. X and time
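To make the residual analysis concrete, here is a minimal Python sketch that fits a simple least-squares line and lists each residual $e_i = Y_i - \hat{Y}_i$ against its X value, which is the table behind a residuals-vs-X plot. The helper names `fit_line` and `residuals` are hypothetical, and the five data points are made-up toy values for illustration only; they are not the produce-store data.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y):
    """Return the residuals e_i = Y_i - (b0 + b1 * X_i) of the fitted line."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical toy data, NOT taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

for xi, ei in zip(x, residuals(x, y)):
    print(f"X = {xi:4.1f}   residual = {ei:+.3f}")
```

If the linear model is adequate, the residuals should scatter around zero with no visible pattern in X; a curve suggests non-linearity and a fan shape suggests non-constant variance. For time-ordered data the same residuals can be listed or plotted in time order to look for autocorrelation.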