Slides by John Loucks, St. Edward’s University
© 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Slide 1
Chapter 14, Part B
Simple Linear Regression
• Using the Estimated Regression Equation for Estimation and Prediction
• Computer Solution
• Residual Analysis: Validating Model Assumptions
• Residual Analysis: Outliers and Influential Observations
Slide 2
Using Excel’s Regression Tool
• Up to this point, you have seen how Excel can be used for various parts of a regression analysis.
• Excel also has a comprehensive tool in its Data Analysis package called Regression.
• The Regression tool can be used to perform a complete regression analysis.
Slide 3
Estimated Regression Equation
• Excel Worksheet (showing data, columns A-C)

Week    TV Ads    Cars Sold
1       1         14
2       3         24
3       2         18
4       1         17
5       3         27
Slide 4
Using Excel’s Regression Tool
• Performing the Regression Analysis
Step 1 Select the Tools menu
Step 2 Choose the Data Analysis option
Step 3 Choose Regression from the list of Analysis Tools
Slide 5
Using Excel’s Regression Tool
• Excel Regression Dialog Box
[Figure: screenshot of the Excel Regression dialog box]
Slide 6
Using Excel’s Regression Tool
• Excel Value Worksheet (worksheet rows 1-26, columns A-I)

Data
Week    TV Ads    Cars Sold
1       1         14
2       3         24
3       2         18
4       1         17
5       3         27

SUMMARY OUTPUT

Regression Statistics Output
Regression Statistics
Multiple R           0.936585812
R Square             0.877192982
Adjusted R Square    0.83625731
Standard Error       2.160246899
Observations         5

ANOVA Output
ANOVA
              df    SS     MS          F           Significance F
Regression    1     100    100         21.42857    0.018986231
Residual      3     14     4.666667
Total         4     114

Estimated Regression Equation Output
             Coefficients    Standard Error    t Stat      P-value     Lower 95%      Upper 95%      Lower 95.0%    Upper 95.0%
Intercept    10              2.366431913       4.225771    0.024236    2.468950436    17.53104956    2.468950436    17.53104956
TV Ads       5               1.08012345        4.6291      0.018986    1.562561893    8.437438107    1.562561893    8.437438107
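The coefficient rows of the output above can be checked by hand. The following is a minimal sketch in plain Python, not a description of how Excel computes its output. It assumes the usual least squares formulas for simple linear regression, and it takes the two-tailed 95% t value for 3 degrees of freedom (3.1824) from the interval slides later in this deck instead of computing it. The function name `fit_simple_regression` is made up for illustration. P-values are not computed because they need a t distribution routine outside the standard library.

```python
import math

# Worksheet data: TV ads run (x) and cars sold (y) for the five weeks.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

# Assumption: two-tailed 95% t value for n - 2 = 3 degrees of freedom,
# copied from the interval slides rather than computed here.
T_CRIT = 3.1824


def fit_simple_regression(x, y, t_crit=T_CRIT):
    """Least squares fit of y = b0 + b1*x with standard errors, t statistics,
    and confidence limits for each coefficient."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept

    # Standard error of the estimate: sqrt(SSE / (n - 2)).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # Standard errors of the intercept and slope estimates.
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    se_b1 = s / math.sqrt(sxx)

    rows = {}
    for name, coef, se in (("Intercept", b0, se_b0), ("TV Ads", b1, se_b1)):
        rows[name] = {
            "coef": coef,
            "std_err": se,
            "t_stat": coef / se,
            "lower": coef - t_crit * se,
            "upper": coef + t_crit * se,
        }
    return s, rows


s, rows = fit_simple_regression(tv_ads, cars_sold)
print(f"Standard Error: {s:.6f}")
for name, r in rows.items():
    print(f"{name:<10}{r['coef']:8.3f}{r['std_err']:10.5f}{r['t_stat']:9.4f}"
          f"{r['lower']:10.4f}{r['upper']:10.4f}")
```

Because the t value is rounded to four decimals, the confidence limits agree with the Excel output only to about three decimal places.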
Slide 7
Using Excel’s Regression Tool
• Excel Value Worksheet (bottom-left portion: rows 22-26, columns A-E)

Row    A            B (Coeffic.)    C (Std. Err.)    D (t Stat)    E (P-value)
24     Intercept    10              2.36643          4.2258        0.02424
25     TV Ads       5               1.08012          4.6291        0.01899

Note: Columns F-I are not shown.
Slide 8
Using Excel’s Regression Tool
• Excel Value Worksheet (bottom-right portion: rows 22-26, columns A-B and F-I)

Row    A            B (Coeffic.)    F (Low. 95%)    G (Up. 95%)    H (Low. 95.0%)    I (Up. 95.0%)
24     Intercept    10              2.46895         17.53105       2.46895044        17.5310496
25     TV Ads       5               1.562562        8.437438       1.56256189        8.43743811

Note: Columns C-E are hidden.
Slide 9
Using Excel’s Regression Tool
• Excel Value Worksheet (middle portion: rows 16-22, columns A-F)

ANOVA
              df    SS     MS         F          Significance F
Regression    1     100    100        21.4286    0.018986231
Residual      3     14     4.66667
Total         4     114
Slide 10
Using Excel’s Regression Tool
• Excel Value Worksheet (top portion: rows 9-16, columns A-C)

Regression Statistics
Multiple R           0.936585812
R Square             0.877192982
Adjusted R Square    0.83625731
Standard Error       2.160246899
Observations         5
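The ANOVA and Regression Statistics entries shown on these worksheet portions follow from three sums of squares. The sketch below is an illustration in plain Python, not Excel's own procedure. It takes the coefficients b0 = 10 and b1 = 5 from the output as given, and the function name `anova_summary` is hypothetical. Significance F is left out because it needs an F distribution routine outside the standard library.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
B0, B1 = 10.0, 5.0  # coefficients reported in the regression output


def anova_summary(x, y, b0, b1):
    """Sums of squares, mean squares, F, and the regression statistics
    for a simple linear regression with one independent variable."""
    n = len(y)
    y_bar = sum(y) / n
    y_hat = [b0 + b1 * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression

    msr = ssr / 1            # regression df = 1
    mse = sse / (n - 2)      # residual df = n - 2
    r_square = ssr / sst
    return {
        "SSR": ssr, "SSE": sse, "SST": sst,
        "MSR": msr, "MSE": mse, "F": msr / mse,
        "R Square": r_square,
        "Multiple R": math.sqrt(r_square),
        "Adjusted R Square": 1 - (1 - r_square) * (n - 1) / (n - 2),
        "Standard Error": math.sqrt(mse),
    }


summary = anova_summary(tv_ads, cars_sold, B0, B1)
for label, value in summary.items():
    print(f"{label:<18}{value:.6f}")
```

Computing SSR directly from the fitted values, instead of as SST minus SSE, gives a quick consistency check: the two agree only when the coefficients are the least squares estimates.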
Slide 11
Using the Estimated Regression Equation
for Estimation and Prediction
• A confidence interval is an interval estimate of the mean value of y for a given value of x.
• A prediction interval is used whenever we want to predict an individual value of y for a new observation corresponding to a given value of x.
• The margin of error is larger for a prediction interval, because an individual value of y varies about its mean in addition to the uncertainty in estimating that mean.
Slide 12
Using the Estimated Regression Equation
for Estimation and Prediction
• Confidence Interval Estimate of E(y*)
  $\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$
• Prediction Interval Estimate of y*
  $\hat{y}^* \pm t_{\alpha/2}\, s_{\text{pred}}$
where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with $n - 2$ degrees of freedom.
Slide 13
Point Estimation
If 3 TV ads are run prior to a sale, we expect
the mean number of cars sold to be:
$\hat{y} = 10 + 5(3) = 25$ cars
Slide 14
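Confidence Interval for E(y*)
• Estimate of the Standard Deviation of $\hat{y}^*$
  $s_{\hat{y}^*} = s \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
  $s_{\hat{y}^*} = 2.16025 \sqrt{\dfrac{1}{5} + \dfrac{(3 - 2)^2}{(1 - 2)^2 + (3 - 2)^2 + (2 - 2)^2 + (1 - 2)^2 + (3 - 2)^2}}$
  $s_{\hat{y}^*} = 2.16025 \sqrt{\dfrac{1}{5} + \dfrac{1}{4}} = 1.4491$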
Slide 15
Confidence Interval for E(y*)
The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
$\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$
$25 \pm 3.1824(1.4491)$
$25 \pm 4.61$
20.39 to 29.61 cars
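The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the coefficients (10 and 5), the standard error s = 2.16025, and the t value 3.1824 are copied from the slides, not computed, and the function name `confidence_interval_mean` is invented for the example.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
B0, B1 = 10.0, 5.0   # estimated regression equation from the slides
S = 2.16025          # standard error of the estimate from the Excel output
T_CRIT = 3.1824      # assumption: t value used on the slide (alpha/2 = .025, 3 df)


def confidence_interval_mean(x_star, x, b0=B0, b1=B1, s=S, t_crit=T_CRIT):
    """Confidence interval for the mean value of y at x = x_star."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    y_hat = b0 + b1 * x_star
    # Standard deviation of y_hat at x_star.
    s_yhat = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)
    margin = t_crit * s_yhat
    return y_hat, s_yhat, y_hat - margin, y_hat + margin


y_hat, s_yhat, lower, upper = confidence_interval_mean(3, tv_ads)
print(f"point estimate {y_hat:.0f}, s_yhat {s_yhat:.4f}, "
      f"interval {lower:.2f} to {upper:.2f}")
```

Calling the function with a value of x farther from the sample mean of 2 widens the interval, since the $(x^* - \bar{x})^2$ term grows.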
Slide 16
Prediction Interval for y*
• Estimate of the Standard Deviation of an Individual Value of y*
  $s_{\text{pred}} = s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
  $s_{\text{pred}} = 2.16025 \sqrt{1 + \dfrac{1}{5} + \dfrac{1}{4}}$
  $s_{\text{pred}} = 2.16025(1.20416) = 2.6013$
Slide 17
Prediction Interval for y*
The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
$\hat{y}^* \pm t_{\alpha/2}\, s_{\text{pred}}$
$25 \pm 3.1824(2.6013)$
$25 \pm 8.28$
16.72 to 33.28 cars
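The prediction interval can be checked the same way. In this sketch the coefficients, s = 2.16025, and the t value 3.1824 are again taken from the slides as given, and `prediction_interval` is a hypothetical helper name. The only change from the mean-value calculation is the extra 1 under the square root.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
B0, B1 = 10.0, 5.0   # estimated regression equation from the slides
S = 2.16025          # standard error of the estimate from the Excel output
T_CRIT = 3.1824      # assumption: t value used on the slide (alpha/2 = .025, 3 df)


def prediction_interval(x_star, x, b0=B0, b1=B1, s=S, t_crit=T_CRIT):
    """Prediction interval for an individual value of y at x = x_star."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    y_hat = b0 + b1 * x_star
    # The leading 1 accounts for the variation of an individual y about its mean.
    s_pred = s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)
    margin = t_crit * s_pred
    return y_hat, s_pred, y_hat - margin, y_hat + margin


y_hat, s_pred, lower, upper = prediction_interval(3, tv_ads)
print(f"point estimate {y_hat:.0f}, s_pred {s_pred:.4f}, "
      f"interval {lower:.2f} to {upper:.2f}")
```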
Slide 18
Residual Analysis
• If the assumptions about the error term ε appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid.
• The residuals provide the best information about ε.
• Residual for Observation i
  $y_i - \hat{y}_i$
• Much of the residual analysis is based on an examination of graphical plots.
Slide 19
Residual Plot Against x
• If the assumption that the variance of ε is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then the residual plot should give an overall impression of a horizontal band of points.
Slide 20
Residual Plot Against x
[Figure: residual $y - \hat{y}$ (vertical axis, centered at 0) against x (horizontal axis); panel title: Good Pattern]
Slide 21
Residual Plot Against x
[Figure: residual $y - \hat{y}$ (vertical axis, centered at 0) against x (horizontal axis); panel title: Nonconstant Variance]
Slide 22
Residual Plot Against x
[Figure: residual $y - \hat{y}$ (vertical axis, centered at 0) against x (horizontal axis); panel title: Model Form Not Adequate]
Slide 23
Residual Plot Against x

Residuals
Slide 24
Residual Plot Against x
• Using Excel to Produce a Residual Plot
  • The steps outlined earlier to obtain the regression output are performed with one change.
  • When the Regression dialog box appears, we must also select the Residual Plot option.
  • The output will include two new items:
    • A plot of the residuals against the independent variable, and
    • A list of predicted values of y and the corresponding residual values.
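The list of predicted values and residuals that Excel adds to the output can be reproduced directly from the estimated regression equation. The sketch below is only an illustration in plain Python, using the coefficients 10 and 5 from the earlier output. The function name `residual_table` is made up, and the plot is omitted to keep the example free of plotting libraries.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
B0, B1 = 10, 5  # estimated regression equation: y_hat = 10 + 5x


def residual_table(x, y, b0=B0, b1=B1):
    """Return (observation, predicted y, residual) for each data point."""
    table = []
    for i, (xi, yi) in enumerate(zip(x, y), start=1):
        predicted = b0 + b1 * xi
        table.append((i, predicted, yi - predicted))  # residual = y - y_hat
    return table


table = residual_table(tv_ads, cars_sold)
print("Observation  Predicted y  Residual")
for obs, predicted, residual in table:
    print(f"{obs:<13}{predicted:<13}{residual}")
```

Pairing each residual with its TV Ads value gives the points for the residual plot. With an intercept in the model, the least squares residuals sum to zero, which is a quick check on the table.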
Slide 25
Residual Plot Against x
[Figure: TV Ads Residual Plot; Residuals (vertical axis, -3 to 3) against TV Ads (horizontal axis, 0 to 4)]
Slide 26
Standardized Residuals
• Standardized Residual for Observation i
  $\dfrac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$
where:
  $s_{y_i - \hat{y}_i} = s \sqrt{1 - h_i}$
  $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$
Slide 27
Standardized Residual Plot
• The standardized residual plot can provide insight about the assumption that the error term ε has a normal distribution.
• If this assumption is satisfied, the distribution of the standardized residuals should appear to come from a standard normal probability distribution.
Slide 28
Standardized Residual Plot
• Standardized Residuals

Observation    Predicted y    Residual    Standardized Residual
1              15             -1          -0.5345
2              25             -1          -0.5345
3              20             -2          -1.0690
4              15             2           1.0690
5              25             2           1.0690
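A quick calculation shows where the tabulated values come from. Applying the formula from the Standardized Residuals slide, with $s\sqrt{1 - h_i}$ in the denominator, gives somewhat different numbers from those in the table. The tabulated values match each residual divided by $\sqrt{SSE/(n-1)}$ instead. That this is the scaling Excel's Regression tool applies is an inference from the arithmetic, not something the slides state. The sketch below computes both versions; the function name and the labels `by_formula` and `by_table` are hypothetical.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
B0, B1 = 10.0, 5.0  # estimated regression equation from the slides


def standardized_residuals(x, y, b0=B0, b1=B1):
    """Leverage values plus two scalings of the residuals."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    s = math.sqrt(sse / (n - 2))

    # Leverage h_i of each observation.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # Formula on the slide: residual / (s * sqrt(1 - h_i)).
    by_formula = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

    # Assumption: scaling that reproduces the tabulated "Standard Residuals",
    # residual / sqrt(SSE / (n - 1)).
    by_table = [e / math.sqrt(sse / (n - 1)) for e in residuals]
    return leverage, by_formula, by_table


leverage, by_formula, by_table = standardized_residuals(tv_ads, cars_sold)
for i, (h, r, t) in enumerate(zip(leverage, by_formula, by_table), start=1):
    print(f"obs {i}: h = {h:.2f}, formula = {r:.4f}, table scaling = {t:.4f}")
```

Under either scaling every value lies between -1.5 and +1.5 for this data set.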
Slide 29
Standardized Residual Plot
• Standardized Residual Plot

RESIDUAL OUTPUT (worksheet rows 28-37, columns A-D)
Observation    Predicted Y    Residuals    Standard Residuals
1              15             -1           -0.534522
2              25             -1           -0.534522
3              20             -2           -1.069045
4              15             2            1.069045
5              25             2            1.069045

[Figure: Standard Residuals (vertical axis, -1.5 to 1.5) against Cars Sold (horizontal axis, 0 to 30)]
Slide 30
Standardized Residual Plot
• All of the standardized residuals are between -1.5 and +1.5, so there is no reason to question the assumption that ε has a normal distribution.
Slide 31
Outliers and Influential Observations

Detecting Outliers
• An outlier is an observation that is unusual in
comparison with the other data.
• Minitab classifies an observation as an outlier if its
standardized residual value is < -2 or > +2.
• This standardized residual rule sometimes fails to
identify an unusually large observation as being
an outlier.
• This rule’s shortcoming can be circumvented by
using studentized deleted residuals.
• When observation i is an outlier, the absolute value of the ith studentized deleted residual will be larger than the absolute value of the ith standardized residual.
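The outlier rule and the studentized deleted residuals can be illustrated on the TV ads data. The slides give no formula for the studentized deleted residual, so the sketch below assumes the standard definition: the residual divided by $s_{(i)}\sqrt{1 - h_i}$, where $s_{(i)}$ is the standard error of the estimate with observation i left out. It uses the algebraic shortcut $s_{(i)}^2 = (SSE - e_i^2/(1 - h_i))/(n - 3)$, which avoids refitting the model n times. The function name `residual_diagnostics` is invented, and this is not a description of Minitab's implementation.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]


def residual_diagnostics(x, y, cutoff=2.0):
    """Standardized and studentized deleted residuals, with the +/-2 outlier flag."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    s = math.sqrt(sse / (n - 2))

    rows = []
    for i, (xi, e) in enumerate(zip(x, residuals), start=1):
        h = 1 / n + (xi - x_bar) ** 2 / sxx
        standardized = e / (s * math.sqrt(1 - h))
        # Assumption: leave-one-out standard error via the usual shortcut,
        # with n - 3 degrees of freedom after deleting one observation.
        s_deleted = math.sqrt((sse - e ** 2 / (1 - h)) / (n - 3))
        deleted = e / (s_deleted * math.sqrt(1 - h))
        rows.append((i, standardized, deleted, abs(standardized) > cutoff))
    return rows


rows = residual_diagnostics(tv_ads, cars_sold)
for obs, standardized, deleted, is_outlier in rows:
    print(f"obs {obs}: standardized = {standardized:.4f}, "
          f"studentized deleted = {deleted:.4f}, outlier = {is_outlier}")
```

On this data set no observation is flagged. The deleted residual exceeds the standardized residual in absolute value only for the observations whose standardized residual is already larger than 1 in absolute value, which suggests the comparison matters mainly for observations that fit poorly.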
Slide 32
End of Chapter 14, Part B
Slide 33