Decomposition of Sum of Squares
• The total sum of squares (SS) in the response variable is
$$SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$
• The total SS can be decomposed into two main sources: error SS and regression SS…
• The error SS is $SSE = \sum_{i=1}^{n} e_i^2$.
• The regression SS is
$$SSR = b_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
It is the amount of variation in the $Y$'s that is explained by the linear relationship of $Y$ with $X$.
STA302/1001 - week 4 1
Claims
• First, SSTO = SSR + SSE, that is
$$SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = b_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2 + \sum_{i=1}^{n} e_i^2$$
• Proof:….
• An alternative decomposition is
$$SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
• Proof: Exercises.
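Both decompositions can be checked numerically before attempting the proofs. The sketch below uses a small made-up data set (the values of `x` and `y` are assumptions for illustration, not course data) and plain Python, and compares SSTO with SSR + SSE as well as the two forms of SSR.

```python
# Synthetic data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates: b1 = S_XY / S_XX and b0 = y_bar - b1 * x_bar.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

ssto = sum((yi - y_bar) ** 2 for yi in y)          # total SS
sse = sum(e ** 2 for e in residuals)               # error SS
ssr = b1 ** 2 * s_xx                               # regression SS, first form
ssr_alt = sum((fi - y_bar) ** 2 for fi in fitted)  # regression SS, second form

print("SSTO      =", ssto)
print("SSR + SSE =", ssr + sse)
print("SSR forms =", ssr, ssr_alt)
```

The two printed totals should agree up to floating-point rounding, and so should the two forms of SSR.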
Analysis of Variance Table
• The decomposition of SS discussed above is usually summarized in an analysis of variance (ANOVA) table as follows:
• Note that the MSE is $s^2$, our estimate of $\sigma^2$.
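As a rough illustration, the quantities in an ANOVA table for simple linear regression can be computed as in the sketch below. It assumes the usual layout (rows for regression, error and total, with 1, n − 2 and n − 1 degrees of freedom); the helper `anova_table` and the data are hypothetical and are not taken from the slides.

```python
def anova_table(x, y):
    """Return rows (source, df, SS, MS) for a simple linear regression fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    ssto = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = ssto - sse

    msr = ssr / 1        # regression mean square
    mse = sse / (n - 2)  # error mean square, i.e. s^2
    return [
        ("Regression", 1, ssr, msr),
        ("Error", n - 2, sse, mse),
        ("Total", n - 1, ssto, None),
    ]


# Synthetic data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

rows = anova_table(x, y)
print(f"{'Source':<12}{'df':>4}{'SS':>12}{'MS':>12}")
for source, df, ss, ms in rows:
    ms_text = f"{ms:>12.4f}" if ms is not None else ""
    print(f"{source:<12}{df:>4}{ss:>12.4f}{ms_text}")
```

The MS entry in the error row is the value referred to as $s^2$.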
Coefficient of Determination
• The coefficient of determination is
$$R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}$$
• It must satisfy $0 \le R^2 \le 1$.
• $R^2$ gives the percentage of variation in the $Y$'s that is explained by the regression line.
Claim
• $R^2 = r^2$, that is, the coefficient of determination is the square of the sample correlation coefficient.
• Proof:…
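One possible line of argument is sketched below; it is not necessarily the proof given in lecture. It assumes the standard results $b_1 = S_{XY}/S_{XX}$ and $r = S_{XY}/\sqrt{S_{XX} S_{YY}}$, where $S_{XY} = \sum (X_i - \bar{X})(Y_i - \bar{Y})$ and $S_{YY} = \sum (Y_i - \bar{Y})^2$ are notation introduced here for the sketch.

```latex
\begin{align*}
R^2 &= \frac{SSR}{SSTO}
     = \frac{b_1^2 \, S_{XX}}{S_{YY}}
     = \frac{\left( S_{XY} / S_{XX} \right)^2 S_{XX}}{S_{YY}} \\
    &= \frac{S_{XY}^2}{S_{XX} \, S_{YY}}
     = \left( \frac{S_{XY}}{\sqrt{S_{XX} \, S_{YY}}} \right)^2
     = r^2 .
\end{align*}
```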
Important Comments about $R^2$
• It is a useful measure but…
• There is no absolute rule about how big it should be.
• It is not resistant to outliers.
• It is not meaningful for models without an intercept.
• It is not useful for comparing models unless they have the same Y and one set of predictors is a subset of the other.
ANOVA F Test
• The ANOVA table gives us another test of $H_0: \beta_1 = 0$.
• The test statistic is
$$F_{stat} = \frac{MSR}{MSE}$$
• Derivations …
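A small numerical sketch of the F statistic follows. It also compares $F_{stat}$ with the square of the usual t statistic for the slope, $t = b_1 / (s / \sqrt{S_{XX}})$; in simple linear regression the two are equal. The data are made up, and the t statistic is assumed to be the one from the slope test, which these slides do not restate.

```python
import math

# Synthetic data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ssr = b1 ** 2 * s_xx

msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

# t statistic for H0: beta_1 = 0, using se(b1) = s / sqrt(S_XX).
s = math.sqrt(mse)
t_stat = b1 / (s / math.sqrt(s_xx))

print("F_stat =", f_stat)
print("t^2    =", t_stat ** 2)
```

Under $H_0$ the statistic is compared with an F distribution on 1 and n − 2 degrees of freedom; looking up that critical value is left out of the sketch.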
Prediction of Mean Response
• Very often, we want to use the estimated regression line to make predictions about the mean of the response for a particular X value (assumed to be fixed).
• We know that the least squares line $\hat{Y} = b_0 + b_1 X$ is an estimate of $E(Y \mid X) = \beta_0 + \beta_1 X$.
• Now, we can pick a point, $X = x^*$ (in the range of the X values used in the regression); then $\hat{Y}^* = b_0 + b_1 x^*$ is an estimate of $E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$.
• Claim:
$$Var(\hat{Y}^* \mid X = x^*) = \sigma^2 \left[ \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}} \right]$$
• Proof:
• This is the variance of the estimate of $E(Y \mid X = x^*)$.
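One way the claim might be derived is sketched below; it is a sketch under stated assumptions rather than the lecture's own proof. It assumes uncorrelated errors with constant variance $\sigma^2$ and uses the standard results $Var(\bar{Y}) = \sigma^2 / n$, $Var(b_1) = \sigma^2 / S_{XX}$ and $Cov(\bar{Y}, b_1) = 0$.

```latex
\begin{align*}
\hat{Y}^* &= b_0 + b_1 x^* = \bar{Y} + b_1 (x^* - \bar{x})
  \quad \text{since } b_0 = \bar{Y} - b_1 \bar{x}, \\
Var(\hat{Y}^* \mid X = x^*)
  &= Var(\bar{Y}) + (x^* - \bar{x})^2 \, Var(b_1)
     + 2 (x^* - \bar{x}) \, Cov(\bar{Y}, b_1) \\
  &= \frac{\sigma^2}{n} + (x^* - \bar{x})^2 \frac{\sigma^2}{S_{XX}} + 0
   = \sigma^2 \left[ \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}} \right].
\end{align*}
```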
Confidence Interval for E(Y | X = x*)
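• For a given value of $x$, say $x^*$, a $100(1-\alpha)\%$ CI for the mean value of $Y$ is
$$\hat{Y}^* \pm t_{n-2;\,\alpha/2} \; s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}, \quad \text{where } s = \sqrt{MSE}.$$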
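The interval formula can be turned into a short function, as in the sketch below. The helper `mean_response_ci` is hypothetical, the data are made up, and the critical value $t_{n-2;\alpha/2}$ is supplied by the caller (here 2.776, the tabulated value for 4 degrees of freedom and α = 0.05) rather than computed.

```python
import math


def mean_response_ci(x, y, x_star, t_crit):
    """CI for E(Y | X = x_star); t_crit is t_{n-2; alpha/2}, supplied by the caller."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # s = sqrt(MSE)

    y_hat_star = b0 + b1 * x_star
    half_width = t_crit * s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
    return y_hat_star - half_width, y_hat_star + half_width


# Synthetic data, made up for illustration only (n = 6, so n - 2 = 4 df).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

lower, upper = mean_response_ci(x, y, x_star=3.5, t_crit=2.776)
lower_far, upper_far = mean_response_ci(x, y, x_star=6.0, t_crit=2.776)

print("95% CI at x* = 3.5:", (lower, upper))
print("95% CI at x* = 6.0:", (lower_far, upper_far))
```

Because of the $(x^* - \bar{x})^2$ term, the interval is narrowest at $x^* = \bar{x}$ and widens as $x^*$ moves away from the centre of the data.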
Example
• Consider the smoking and cancer data.
• Suppose we wish to predict the mean mortality index when the smoking index is 101, that is, when x* = 101….
Prediction of New Observation
• Suppose we want to predict a particular value of $Y^*$ when $X = x^*$.
• The predicted value of a new point measured when $X = x^*$ is
$$\hat{Y}^* = b_0 + b_1 x^*$$
• Note, the above predicted value is the same as the estimate of $E(Y \mid X = x^*)$.
• The prediction error has two sources of variation. The first one is due to the regression line being estimated by $b_0 + b_1 X$. The second one is due to $\varepsilon^*$, i.e., points don't fall exactly on the line.
• To calculate the variance of the prediction error we look at the difference $Y^* - \hat{Y}^*$ ....
Prediction Interval for New Observation
• A $100(1-\alpha)\%$ prediction interval for $Y^*$ when $X = x^*$ is
$$\hat{Y}^* \pm t_{n-2;\,\alpha/2} \; s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}$$
• This is not a confidence interval; CI’s are for parameters and we are estimating a value of a random variable.
• The prediction interval is wider than the CI for $E(Y \mid X = x^*)$, because of the extra 1 under the square root.
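The comparison between the two intervals can be seen numerically. The sketch below computes both half-widths at the same $x^*$; the helper `interval_half_widths` and the data are hypothetical, and the critical value is again passed in by the caller (2.776 for 4 degrees of freedom and α = 0.05).

```python
import math


def interval_half_widths(x, y, x_star, t_crit):
    """Return (CI half-width, PI half-width) at x_star for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # s = sqrt(MSE)

    leverage_term = 1 / n + (x_star - x_bar) ** 2 / s_xx
    ci_half = t_crit * s * math.sqrt(leverage_term)      # mean response
    pi_half = t_crit * s * math.sqrt(1 + leverage_term)  # new observation
    return ci_half, pi_half


# Synthetic data, made up for illustration only (n = 6, so n - 2 = 4 df).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

ci_half, pi_half = interval_half_widths(x, y, x_star=4.5, t_crit=2.776)
print("CI half-width:", ci_half)
print("PI half-width:", pi_half)
```

The prediction half-width is always the larger of the two, whatever $x^*$ is chosen.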
Dummy Variable Regression
• A dummy (or indicator) variable takes two values: 0 or 1.
• It indicates which category an observation is in.
• Example…
• Interpretation of regression coefficient in a dummy variable regression…
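As a hedged illustration of one common interpretation (not necessarily the example used in lecture): when the only predictor is a 0/1 dummy, the least squares intercept equals the mean of Y in the group coded 0, and the slope equals the difference between the two group means. The data below are made up.

```python
# Synthetic two-group data, made up for illustration only.
# d = 0 marks the first category, d = 1 the second.
d = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 11.0, 15.0, 17.0, 16.0]

n = len(d)
d_bar = sum(d) / n
y_bar = sum(y) / n

# Ordinary least squares fit of y on the dummy variable.
s_dd = sum((di - d_bar) ** 2 for di in d)
s_dy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
b1 = s_dy / s_dd
b0 = y_bar - b1 * d_bar

# Group means, for comparison with the fitted coefficients.
group0 = [yi for di, yi in zip(d, y) if di == 0]
group1 = [yi for di, yi in zip(d, y) if di == 1]
mean0 = sum(group0) / len(group0)
mean1 = sum(group1) / len(group1)

print("b0 =", b0, " mean of group 0 =", mean0)
print("b1 =", b1, " difference in means =", mean1 - mean0)
```

With these data the group means are 11 and 16, so the fit gives an intercept of 11 and a slope of 5, up to floating-point rounding.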