Chapter 12 Simple Linear Regression

advertisement
Chapter 12
Simple Linear Regression





Simple Linear Regression Model
Least Squares Method
Coefficient of Determination
Model Assumptions
Testing for Significance
Simple Linear Regression Model
 The equation that describes how y is related to x and
an error term is called the regression model.
 The simple linear regression model is:
y = b0 + b1x +e
where:
b0 and b1 are called parameters of the model,
e is a random variable called the error term.
Simple Linear Regression Equation

The simple linear regression equation is:
E(y) = b0 + b1x
• Graph of the regression equation is a straight line.
• b0 is the y intercept of the regression line.
• b1 is the slope of the regression line.
• E(y) is the expected value of y for a given x value.
Simple Linear Regression Equation

Positive Linear Relationship
E(y)
Regression line
Intercept
b0
Slope b1
is positive
x
Simple Linear Regression Equation

Negative Linear Relationship
E(y)
Intercept
b0
Regression line
Slope b1
is negative
x
Simple Linear Regression Equation

No Relationship
E(y)
Regression line
Intercept
b0
Slope b1
is 0
x
Estimated Simple Linear Regression Equation

The estimated simple linear regression equation
ŷ  b0  b1 x
• The graph is called the estimated regression line.
• b0 is the y intercept of the line.
• b1 is the slope of the line.
• ŷ is the estimated value of y for a given x value.
Estimation Process
Regression Model
y = b0 + b1x +e
Regression Equation
E(y) = b0 + b1x
Unknown Parameters
b0, b1
b0 and b1
provide estimates of
b0 and b1
Sample Data:
x
y
x1
y1
.
.
.
.
xn yn
Estimated
Regression Equation
ŷ  b0  b1 x
Sample Statistics
b0, b1
Least Squares Method

Least Squares Criterion
min  (y i  y i ) 2
where:
yi = observed value of the dependent variable
for the ith observation
y^i = estimated value of the dependent variable
for the ith observation
Least Squares Method

Slope for the Estimated
Regression Equation

x y 

x y 
i
b1 
i
i
n
2

xi 

2
 xi  n
i
Least Squares Method

y-Intercept for the Estimated Regression Equation
b0  y  b1 x
where:
xi = value of independent variable for ith
observation
yi = value of dependent variable for ith
_ observation
x = mean value for independent variable
_
y = mean value for dependent variable
n = total number of observations
Simple Linear Regression

Example: Reed Auto Sales
Reed Auto periodically has
a special week-long sale.
As part of the advertising
campaign Reed runs one or
more television commercials
during the weekend preceding the sale. Data from a
sample of 5 previous sales are shown on the next slide.
Simple Linear Regression

Example: Reed Auto Sales
Number of
TV Ads
1
3
2
1
3
Number of
Cars Sold
14
24
18
17
27
Estimated Regression Equation

Slope for the Estimated Regression Equation
b1
( x  x )( y  y ) 20



5
4
 (x  x )
i
i
2
i

y-Intercept for the Estimated Regression Equation
b0  y  b1 x  20  5(2)  10

Estimated Regression Equation
yˆ  10  5x
Scatter Diagram and Trend Line
30
Cars Sold
25
20
y = 5x + 10
15
10
5
0
0
1
2
TV Ads
3
4
Coefficient of Determination

Relationship Among SST, SSR, SSE
SST
=
SSR
+
SSE
2
2
2
ˆ
ˆ
(
y

y
)

(
y

y
)

(
y

y
)
 i
 i
 i i
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
Coefficient of Determination

The coefficient of determination is:
r2 = SSR/SST
where:
SSR = sum of squares due to regression
SST = total sum of squares
Coefficient of Determination
r2 = SSR/SST = 100/114 = .8772
The regression relationship is very strong; 88%
of the variability in the number of cars sold can be
explained by the linear relationship between the
number of TV ads and the number of cars sold.
Sample Correlation Coefficient
rxy  (sign of b1 ) Coefficien t of Determinat ion
rxy  (sign of b1 ) r 2
where:
b1 = the slope of the estimated regression
equation yˆ  b0  b1 x
Sample Correlation Coefficient
rxy  (sign of b1 ) r 2
The sign of b1 in the equation yˆ  10  5 x is “+”.
rxy = + .8772
rxy = +.9366
Assumptions About the Error Term e
1. The error e is a random variable with mean of zero.
2. The variance of e , denoted by  2, is the same for
all values of the independent variable.
3. The values of e are independent.
4. The error e is a normally distributed random
variable.
Testing for Significance
To test for a significant regression relationship, we
must conduct a hypothesis test to determine whether
the value of b1 is zero.
Two tests are commonly used:
t Test
and
F Test
Both the t test and F test require an estimate of  2,
the variance of e in the regression model.
Testing for Significance

An Estimate of 
The mean square error (MSE) provides the estimate
of  2, and the notation s2 is also used.
s 2 = MSE = SSE/(n  2)
where:
SSE   (yi  yˆi ) 2   ( yi  b0  b1 xi ) 2
Testing for Significance

An Estimate of 
• To estimate  we take the square root of  2.
• The resulting s is called the standard error of
the estimate.
SSE
s  MSE 
n2
Download