Course: I0284 - Statistics
Year: 2008
Version: Revision

Sessions 23 and 24
Multiple Regression and Correlation

Learning Outcomes
By the end of this session, students are expected to be able to:
• Calculate the multiple regression coefficients, the multiple correlation coefficient, and the multiple coefficient of determination.

Outline
• Multiple regression model
• Normal equations for multiple regression
• Estimated regression equation
• Multiple correlation and determination coefficients

Multiple Regression
• Multiple Regression Model
• Least Squares Method
• Multiple Coefficient of Determination
• Model Assumptions
• Testing for Significance
• Using the Estimated Regression Equation for Estimation and Prediction
• Qualitative Independent Variables
• Residual Analysis

The Multiple Regression Model
• The Multiple Regression Model
  $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
• The Multiple Regression Equation
  $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$
• The Estimated Multiple Regression Equation
  $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$

The Least Squares Method
• Least Squares Criterion
  $\min \sum (y_i - \hat{y}_i)^2$
• Computation of Coefficients' Values
  The formulas for the regression coefficients $b_0, b_1, b_2, \dots, b_p$ involve the use of matrix algebra. We will rely on computer software packages to perform the calculations.
• A Note on Interpretation of Coefficients
  $b_i$ represents an estimate of the change in $y$ corresponding to a one-unit change in $x_i$ when all other independent variables are held constant.

The Multiple Coefficient of Determination
• Relationship Among SST, SSR, SSE
  SST = SSR + SSE
  $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
• Multiple Coefficient of Determination
  $R^2 = \mathrm{SSR}/\mathrm{SST}$
• Adjusted Multiple Coefficient of Determination
  $R_a^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - p - 1}$

Model Assumptions
• Assumptions About the Error Term $\varepsilon$
  – The error $\varepsilon$ is a random variable with mean of zero.
  – The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variables.
  – The values of $\varepsilon$ are independent.
  – The error $\varepsilon$ is a normally distributed random variable reflecting the deviation between the $y$ value and the expected value of $y$ given by $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$.

Testing for Significance: F Test
• Hypotheses
  $H_0$: $\beta_1 = \beta_2 = \dots = \beta_p = 0$
  $H_a$: One or more of the parameters is not equal to zero.
• Test Statistic
  $F = \mathrm{MSR}/\mathrm{MSE}$
• Rejection Rule
  Reject $H_0$ if $F > F_\alpha$, where $F_\alpha$ is based on an F distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.

Testing for Significance: t Test
• Hypotheses
  $H_0$: $\beta_i = 0$
  $H_a$: $\beta_i \neq 0$
• Test Statistic
  $t = \dfrac{b_i}{s_{b_i}}$
• Rejection Rule
  Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with $n - p - 1$ degrees of freedom.

Testing for Significance: Multicollinearity
• The term multicollinearity refers to the correlation among the independent variables.
• When the independent variables are highly correlated (say, $|r| > 0.7$), it is not possible to determine the separate effect of any particular independent variable on the dependent variable.
• If the estimated regression equation is to be used only for predictive purposes, multicollinearity is usually not a serious problem.
• Every attempt should be made to avoid including independent variables that are highly correlated.

• Happy studying, and good luck.
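
The quantities defined in these slides can be checked numerically. The following is a minimal sketch, assuming NumPy is available. The function name `fit_multiple_regression`, the dictionary keys, and the small data set are made up for illustration and are not part of the course material. It computes the least squares coefficients, SST, SSR, SSE, $R^2$, adjusted $R^2$, the F statistic, and the t statistics using the formulas above.

```python
import numpy as np


def fit_multiple_regression(X, y):
    """Least squares fit of y on the columns of X; an intercept is added."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape                                # n observations, p independent variables

    A = np.column_stack([np.ones(n), X])          # design matrix with intercept column
    b, *_ = np.linalg.lstsq(A, y, rcond=None)     # b0, b1, ..., bp
    y_hat = A @ b                                 # estimated regression equation

    sst = np.sum((y - y.mean()) ** 2)             # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)         # sum of squares due to regression
    sse = np.sum((y - y_hat) ** 2)                # sum of squares due to error

    r2 = ssr / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)

    msr = ssr / p
    mse = sse / (n - p - 1)
    f_stat = msr / mse                            # compare with F(p, n - p - 1)

    # Standard errors of the coefficients: sqrt of diag(MSE * (A'A)^-1)
    s_b = np.sqrt(np.diag(mse * np.linalg.inv(A.T @ A)))
    t_stats = b / s_b                             # compare with t(n - p - 1)

    return {"b": b, "sst": sst, "ssr": ssr, "sse": sse, "r2": r2,
            "r2_adj": r2_adj, "f": f_stat, "t": t_stats}


# Hypothetical data: two independent variables, eight observations.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]
y = [7.0, 6.1, 10.9, 9.8, 14.2, 19.5, 14.8, 19.9]

result = fit_multiple_regression(np.column_stack([x1, x2]), y)

# Sample correlation between the independent variables (multicollinearity check).
r_x1_x2 = np.corrcoef(x1, x2)[0, 1]

for name, value in result.items():
    print(name, value)
print("corr(x1, x2)", r_x1_x2)
```

The critical values $F_\alpha$ and $t_{\alpha/2}$ are not computed here; they would come from statistical tables or a statistics package.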