Session 23: Multiple Regression – Statistical Methods

Course  : I0134 – Statistical Methods
Year    : 2005
Version : Revised
Session 23
Multiple Regression
Learning Outcomes
By the end of this session, students are expected to be able to:
• State the normal equations for multiple regression and compute the
multiple regression coefficients and the coefficient of determination.
Outline
• Observed data
• Normal equations
• Regression equation
• Coefficient of determination
Multiple Regression
• Multiple Regression Model
• Least Squares Method
• Multiple Coefficient of Determination
• Model Assumptions
• Testing for Significance
• Using the Estimated Regression Equation for Estimation and Prediction
• Qualitative Independent Variables
• Residual Analysis
The Multiple Regression Model
• The Multiple Regression Model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
• The Multiple Regression Equation
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$
• The Estimated Multiple Regression Equation
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
The Least Squares Method
• Least Squares Criterion
$\min \sum (y_i - \hat{y}_i)^2$
• Computation of Coefficients’ Values
The formulas for the regression coefficients b0,
b1, b2, . . . bp involve the use of matrix algebra. We
will rely on computer software packages to perform
the calculations.
• A Note on Interpretation of Coefficients
bi represents an estimate of the change in y
corresponding to a one-unit change in xi when all
other independent variables are held constant.
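The learning outcomes for this session refer to the normal equations, which the slides do not write out. As a sketch for the two-variable case (p = 2): setting the partial derivatives of the least squares criterion with respect to b0, b1 and b2 to zero gives the system below. The matrix form on the right, with X the data matrix carrying a leading column of ones, is standard notation assumed here rather than taken from the deck.

```latex
\begin{aligned}
\sum y_i        &= n\,b_0 + b_1 \sum x_{1i} + b_2 \sum x_{2i} \\
\sum x_{1i} y_i &= b_0 \sum x_{1i} + b_1 \sum x_{1i}^2 + b_2 \sum x_{1i} x_{2i} \\
\sum x_{2i} y_i &= b_0 \sum x_{2i} + b_1 \sum x_{1i} x_{2i} + b_2 \sum x_{2i}^2
\end{aligned}
\qquad\Longleftrightarrow\qquad
X^{\top} X \,\mathbf{b} = X^{\top} \mathbf{y}
```

Solving these three linear equations for b0, b1 and b2 is the matrix algebra that the software package carries out.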
The Multiple Coefficient of Determination
• Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
• Multiple Coefficient of Determination
$R^2 = \mathrm{SSR}/\mathrm{SST}$
• Adjusted Multiple Coefficient of Determination
$R_a^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - p - 1}$
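As a quick check of the two formulas, the sketch below plugs in the sums of squares reported in the Minitab analysis-of-variance output for the Programmer Salary Survey example in this deck (SSR = 500.33, SST = 599.79, n = 20, p = 2). The function names are illustrative only.

```python
def r_squared(ssr: float, sst: float) -> float:
    """Multiple coefficient of determination, R^2 = SSR / SST."""
    return ssr / sst


def adjusted_r_squared(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Values taken from the Minitab analysis-of-variance output in this deck.
ssr, sst = 500.33, 599.79
n, p = 20, 2  # 20 observations, 2 independent variables

r2 = r_squared(ssr, sst)
r2_adj = adjusted_r_squared(r2, n, p)
print(f"R-sq = {r2:.1%}, R-sq(adj) = {r2_adj:.1%}")
```

The adjustment penalizes R-sq for the number of independent variables, so the adjusted value is never larger than the unadjusted one.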
Model Assumptions
• Assumptions About the Error Term $\varepsilon$
– The error $\varepsilon$ is a random variable with mean of zero.
– The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all
values of the independent variables.
– The values of $\varepsilon$ are independent.
– The error $\varepsilon$ is a normally distributed random variable
reflecting the deviation between the y value and the expected value of y
given by $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$.
Testing for Significance: F Test
• Hypotheses
H0: $\beta_1 = \beta_2 = \dots = \beta_p = 0$
Ha: One or more of the parameters is not equal to zero.
• Test Statistic
F = MSR/MSE
• Rejection Rule
Reject H0 if $F > F_\alpha$,
where $F_\alpha$ is based on an F distribution with p d.f. in the
numerator and n - p - 1 d.f. in the denominator.
Testing for Significance: t Test
• Hypotheses
H0: $\beta_i = 0$
Ha: $\beta_i \neq 0$
• Test Statistic
$t = b_i / s_{b_i}$
• Rejection Rule
Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$,
where $t_{\alpha/2}$ is based on a t distribution with
n - p - 1 degrees of freedom.
Using the Estimated Regression Equation for Estimation and Prediction
• The procedures for estimating the mean value of y and
predicting an individual value of y in multiple regression
are similar to those in simple regression.
• We substitute the given values of x1, x2, . . . , xp into the
estimated regression equation and use the
corresponding value of $\hat{y}$ as the point estimate.
• The formulas required to develop interval estimates for
the mean value of y and for an individual value of y are
beyond the scope of the text.
• Software packages for multiple regression will often
provide these interval estimates.
Example: Programmer Salary Survey
Exper. (years)   Score   Salary ($000)
       4           78        24
       7          100        43
       1           86        23.7
       5           82        34.3
       8           86        35.8
      10           84        38
       0           75        22.2
       1           80        23.1
       6           83        30
       6           91        33
       9           88        38
       2           73        26.6
      10           75        36.2
       5           81        31.6
       6           74        29
       8           87        34
       4           79        30.1
       6           94        33.9
       3           70        28.2
       3           89        30
Example: Programmer Salary Survey
• Multiple Regression Model
Suppose we believe that salary (y) is related to the
years of experience (x1) and the score on the
programmer aptitude test (x2) by the following
regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = annual salary ($000)
x1 = years of experience
x2 = score on programmer aptitude test
Example: Programmer Salary Survey
• Multiple Regression Equation
Using the assumption $E(\varepsilon) = 0$, we obtain
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
• Estimated Regression Equation
b0, b1, b2 are the least squares estimates of
$\beta_0, \beta_1, \beta_2$. Thus
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
Example: Programmer Salary Survey
• Solving for the Estimates of $\beta_0, \beta_1, \beta_2$
[Diagram: Input Data → Computer Package for Solving Multiple Regression Problems → Least Squares Output]
Input Data (x1, x2, y): 4 78 24; 7 100 43; . . . ; 3 89 30
Least Squares Output: b0, b1, b2, $R^2$, etc.
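To make the "computer package" step less of a black box, here is a minimal sketch that builds the normal equations X'X b = X'y from the 20 survey observations in this deck and solves them with plain Gaussian elimination. The helper names `normal_equations` and `solve` are hypothetical, and this is not how Minitab itself computes the estimates; production software typically uses numerically safer decompositions.

```python
# Programmer Salary Survey data from this deck: (experience, score, salary in $000).
DATA = [
    (4, 78, 24.0), (7, 100, 43.0), (1, 86, 23.7), (5, 82, 34.3), (8, 86, 35.8),
    (10, 84, 38.0), (0, 75, 22.2), (1, 80, 23.1), (6, 83, 30.0), (6, 91, 33.0),
    (9, 88, 38.0), (2, 73, 26.6), (10, 75, 36.2), (5, 81, 31.6), (6, 74, 29.0),
    (8, 87, 34.0), (4, 79, 30.1), (6, 94, 33.9), (3, 70, 28.2), (3, 89, 30.0),
]


def normal_equations(data):
    """Build X'X and X'y for the model y = b0 + b1*x1 + b2*x2."""
    rows = [(1.0, x1, x2) for x1, x2, _ in data]  # leading 1 for the intercept
    ys = [y for _, _, y in data]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return xtx, xty


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


xtx, xty = normal_equations(DATA)
b0, b1, b2 = solve(xtx, xty)
print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, b2 = {b2:.5f}")
```

The first row of `xtx` holds n, the sum of x1 and the sum of x2 (20, 104, 1655), and the printed estimates should agree, up to rounding, with the coefficients in the Minitab output for this example.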
Example: Programmer Salary Survey
• Minitab Computer Output
The regression is
Salary = 3.17 + 1.40 Exper + 0.251 Score

Predictor   Coef      Stdev     t-ratio   p
Constant    3.174     6.156      .52      .613
Exper       1.4039    .1986     7.07      .000
Score       .25089    .07735    3.24      .005

s = 2.419   R-sq = 83.4%   R-sq(adj) = 81.5%
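The fitted equation can be used for a point estimate by substituting values of x1 and x2, as described earlier in the deck. The sketch below does this for the first survey observation (4 years of experience, score 78, observed salary 24). The function name `predict_salary` is illustrative only.

```python
# Coefficients as reported in the Minitab output of this deck.
B0, B1, B2 = 3.174, 1.4039, 0.25089


def predict_salary(experience: float, score: float) -> float:
    """Point estimate of annual salary ($000) from the estimated equation."""
    return B0 + B1 * experience + B2 * score


# First observation in the survey data: 4 years, score 78, salary 24 ($000).
y_hat = predict_salary(4, 78)
residual = 24 - y_hat  # observed minus predicted
print(f"y_hat = {y_hat:.2f}, residual = {residual:.2f}")
```

This gives only the point estimate; the interval estimates mentioned in the deck need the standard errors that the software reports.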
Example: Programmer Salary Survey
• Minitab Computer Output (continued)
Analysis of Variance
SOURCE       DF   SS       MS       F       P
Regression    2   500.33   250.16   42.76   0.000
Error        17    99.46     5.85
Total        19   599.79
Example: Programmer Salary Survey
• F Test
– Hypotheses
H0: $\beta_1 = \beta_2 = 0$
Ha: One or both of the parameters is not equal to zero.
– Rejection Rule
For $\alpha = .05$ and d.f. = 2, 17: $F_{.05} = 3.59$
Reject H0 if F > 3.59.
– Test Statistic
F = MSR/MSE = 250.16/5.85 = 42.76
– Conclusion
Since F = 42.76 > 3.59, we can reject H0.
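The arithmetic behind this F test can be sketched directly from the analysis-of-variance figures in this deck. The critical value 3.59 is copied from the slide rather than computed, since the Python standard library has no F distribution.

```python
# Sums of squares and degrees of freedom from the ANOVA table in this deck.
ssr, sse = 500.33, 99.46
df_regression, df_error = 2, 17  # p and n - p - 1

msr = ssr / df_regression  # mean square due to regression
mse = sse / df_error       # mean square error
f_stat = msr / mse

F_CRITICAL = 3.59  # F.05 with (2, 17) d.f., taken from the slide, not computed here
reject_h0 = f_stat > F_CRITICAL
print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```

Because the test statistic is far above the critical value, the conclusion does not depend on small rounding differences in MSR and MSE.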
Example: Programmer Salary Survey
• t Test for Significance of Individual Parameters
– Hypotheses
H0: $\beta_i = 0$
Ha: $\beta_i \neq 0$
– Rejection Rule
For $\alpha = .05$ and d.f. = 17, $t_{.025} = 2.11$
Reject H0 if t < -2.11 or t > 2.11.
– Test Statistics
$b_1 / s_{b_1} = 1.4039 / .1986 = 7.07$
$b_2 / s_{b_2} = .25089 / .07735 = 3.24$
– Conclusions
Reject H0: $\beta_1 = 0$
Reject H0: $\beta_2 = 0$
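The two t ratios can be reproduced from the coefficients and standard deviations in the Minitab output of this deck. As a sketch, the critical value 2.11 is again taken from the slide rather than computed.

```python
# Coefficients and standard errors from the Minitab output in this deck.
estimates = {
    "Exper": (1.4039, 0.1986),
    "Score": (0.25089, 0.07735),
}
T_CRITICAL = 2.11  # t.025 with 17 d.f., taken from the slide

results = {}
for name, (coef, std_err) in estimates.items():
    t_stat = coef / std_err
    # Two-tailed rule: reject H0 when |t| exceeds the critical value.
    results[name] = (t_stat, abs(t_stat) > T_CRITICAL)
    print(f"{name}: t = {t_stat:.2f}, reject H0: {results[name][1]}")
```

Using the absolute value implements the two-tailed rule, so a large negative t ratio would also lead to rejection.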
• Happy studying, and good luck.