Linearity and Significance Tests for Regression, Session 02

Course: I0174 – Regression Analysis
Year: Odd Semester 2007/2008
Linearity and Significance Tests for Regression
Session 02
Linearity and Significance Tests for Regression
• ANOVA in simple regression
• Confidence intervals for regression parameters
• Test of independence between variables
Bina Nusantara
Measures of Variation:
The Sum of Squares
SST = SSR + SSE
Total Sample Variability = Explained Variability + Unexplained Variability
Measures of Variation:
The Sum of Squares
(continued)
• SST = Total Sum of Squares
– Measures the variation of the $Y_i$ values around their mean, $\bar{Y}$
• SSR = Regression Sum of Squares
– Explained variation attributable to the relationship between X and Y
• SSE = Error Sum of Squares
– Variation attributable to factors other than the relationship between X and Y
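Measures of Variation:
The Sum of Squares
(continued)

$SST = \sum (Y_i - \bar{Y})^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
[Figure: plot of Y against X marking $X_i$ and $\bar{Y}$, with the SST, SSR, and SSE deviations labeled.]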
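As a rough illustration of the decomposition SST = SSR + SSE, the sketch below fits a least-squares line and computes the three sums of squares. The data values are hypothetical and chosen only for easy arithmetic; they do not come from the slides.

```python
# Hypothetical illustration data (NOT from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope b1 and intercept b0 for the line Y_hat = b0 + b1 * X.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares.
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)             # explained variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # unexplained variation

print(f"SST={sst:.4f}  SSR={ssr:.4f}  SSE={sse:.4f}  SSR+SSE={ssr + sse:.4f}")
```

The identity holds exactly only when the line is the least-squares fit with an intercept; for any other line, SSR + SSE generally differs from SST.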
Venn Diagrams and Explanatory Power of Regression
[Figure: Venn diagram of two overlapping circles, Sales and Sizes.]
• Sizes outside the overlap: variations in store sizes not used in explaining variation in sales
• Sales outside the overlap: variations in sales explained by the error term, or unexplained by sizes (SSE)
• Overlap: variations in sales explained by sizes, or variations in sizes used in explaining variation in sales (SSR)
The ANOVA Table in Excel
ANOVA
             df       SS    MS                   F         Significance F
Regression   k        SSR   MSR = SSR/k          MSR/MSE   P-value of the F test
Residuals    n-k-1    SSE   MSE = SSE/(n-k-1)
Total        n-1      SST
Measures of Variation
The Sum of Squares: Example
Excel Output for Produce Stores
ANOVA
             df   SS             MS           F          Significance F
Regression   1    30380456.12    30380456     81.17909   0.000281201
Residual     5    1871199.595    374239.92
Total        6    32251655.71

Degrees of freedom: regression (explained) df = 1, error (residual) df = 5, total df = 6.
Sums of squares: SSR = 30380456.12, SSE = 1871199.595, SST = 32251655.71.
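The arithmetic behind the ANOVA output can be checked with a short sketch. SSR, SSE, n, and k are taken from the produce-store output; the helper functions are illustrative names, not part of Excel or any library. The p-value routine is an assumption-laden shortcut: it relies on the fact that an F statistic with 1 numerator degree of freedom is the square of a t statistic, and it integrates the t density numerically with Simpson's rule, so it applies only to simple regression (k = 1).

```python
import math

# Values copied from the produce-store Excel output.
ssr = 30380456.12
sse = 1871199.595
n, k = 7, 1          # 7 observations, 1 independent variable

df_reg, df_res = k, n - k - 1
msr = ssr / df_reg   # mean square regression
mse = sse / df_res   # mean square error
f_stat = msr / mse


def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_test_p_value_k1(f, df_res, steps=20000):
    """P(F(1, df_res) > f), using F(1, v) = t(v)^2 and Simpson's rule.

    Hypothetical helper; valid only when the regression df is 1.
    `steps` must be even.
    """
    upper = math.sqrt(f)
    h = upper / steps
    total = t_density(0.0, df_res) + t_density(upper, df_res)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_res)
    area = total * h / 3      # P(0 < t < sqrt(f))
    return 1 - 2 * area       # two-sided tail probability


p_value = f_test_p_value_k1(f_stat, df_res)
print(f"MSR={msr:.2f}  MSE={mse:.2f}  F={f_stat:.5f}  p={p_value:.9f}")
```

The computed MSE, F, and p-value should agree with the MS, F, and Significance F columns of the table up to rounding of the inputs.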
The Coefficient of Determination
• $r^2 = \dfrac{SSR}{SST} = \dfrac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}}$
• Measures the proportion of variation in Y that is explained by the
independent variable X in the regression model
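A minimal sketch of this ratio, using SSR and SST from the produce-store ANOVA output. Taking the positive square root for r is an assumption: r carries the sign of the slope, and the slides do not state the slope here.

```python
import math

# SSR and SST from the produce-store ANOVA output.
ssr = 30380456.12
sst = 32251655.71

r_squared = ssr / sst        # proportion of variation in Y explained by X
r = math.sqrt(r_squared)     # ASSUMPTION: positive slope, so r > 0

print(f"r^2 = {r_squared:.8f}, r = {r:.7f}")
```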
Venn Diagrams and
Explanatory Power of Regression
$r^2 = \dfrac{SSR}{SSR + SSE}$
[Figure: Venn diagram of the overlapping Sales and Sizes circles.]
Coefficients of Determination ($r^2$) and Correlation ($r$)
[Figure: four scatter plots of Y against X, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$. Panels: $r^2 = 1$, $r = +1$; $r^2 = 1$, $r = -1$; $r^2 = .81$, $r = +0.9$; $r^2 = 0$, $r = 0$.]
Standard Error of Estimate
• $S_{YX} = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$
• Measures the standard deviation (variation) of the Y values around
the regression equation
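As a quick check of the formula, the sketch below plugs in SSE and n from the produce-store ANOVA output. The variable names are illustrative only.

```python
import math

# SSE and n from the produce-store ANOVA output.
sse = 1871199.595
n = 7

# Standard error of estimate for simple regression (n - 2 degrees of freedom).
s_yx = math.sqrt(sse / (n - 2))

print(f"S_YX = {s_yx:.4f}")
```

The result is in the same units as Y, so here it reads as a typical deviation of annual sales from the fitted line.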
Measures of Variation:
Produce Store Example
Excel Output for Produce Stores
Regression Statistics
Multiple R           0.9705572
R Square             0.94198129
Adjusted R Square    0.93037754
Standard Error       611.751517
Observations         7

In this output, R Square is $r^2 = .94$, Standard Error is $S_{YX}$, and Observations is $n$.
94% of the variation in annual sales can be
explained by the variability in the size of the
store as measured by square footage.
Linear Regression Assumptions
• Normality
– Y values are normally distributed for each X
– Probability distribution of error is normal
• Homoscedasticity (Constant Variance)
• Independence of Errors
Consequences of Violation
of the Assumptions
• Violation of the Assumptions
– Non-normality (error not normally distributed)
– Heteroscedasticity (variance not constant)
• Usually happens in cross-sectional data
– Autocorrelation (errors are not independent)
• Usually happens in time-series data
• Consequences of Any Violation of the Assumptions
– Predictions and estimations obtained from the sample regression line
will not be accurate
– Hypothesis testing results will not be reliable
• It is Important to Verify the Assumptions
Variation of Errors Around
the Regression Line
• Y values are normally distributed
around the regression line.
• For each X value, the “spread” or
variance around the regression line is
the same.
[Figure: error densities f(e) drawn around the sample regression line at X1 and X2; axes X and Y.]
Residual Analysis
• Purposes
– Examine linearity
– Evaluate violations of assumptions
• Graphical Analysis of Residuals
– Plot residuals vs. X and time
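A minimal sketch of the first step of a graphical residual analysis: computing the residuals that would be plotted against X. The data and the function names `fit_line` and `residuals` are hypothetical, introduced only for illustration.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y_hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y):
    """Return the residuals e_i = Y_i - Y_hat_i of the least-squares fit."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical illustration data (NOT from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

e = residuals(x, y)
for xi, ei in zip(x, e):
    print(f"X = {xi:4.1f}   residual = {ei:+.3f}")
```

The (X, residual) pairs can be passed to any plotting tool. A curved pattern would suggest non-linearity, and a spread that widens or narrows with X would suggest non-constant variance. For time-ordered data, the same residuals can be plotted against time to look for dependence between successive errors.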