Simple Linear Regression Deterministic Model Probabilistic Model

advertisement
Simple Linear Regression
14
Slope-intercept equation for a line
y  mx  b
y
8
10
12
Regression equation—an equation that describes the
average relationship between a response (dependent)
and an explanatory (independent) variable.
(4,6)
6
slope 
y  6  3  3
(2,3)
4
y 3
  1.5
x 2
2
x  4  2  2
2
4
6
8
10
x
1
2
120
Deterministic Model
9
F  C  32 y-intercept
5
96
A model that defines an exact relationship
between variables.
slope
Fahrenheit
48
72
Example: y  1.5 x
There is no allowance for error in the prediction of y
for a given x.
0
24
32
-10
0
10
20
30
40
50
Celsius
3
4
Includes both a deterministic component and a random
error component.
y  1.5 x  
deterministic
component
random error
}random error
y
y  1.5 x  random error
15
A model that accounts for random error.
10
Probabilistic Model
5
Also termed
a residual
This model hypothesizes a probabilistic relationship
between y and x.
0
 ~ N  0,1
0
2
4
6
8
10
x
5
6
1
Probabilistic Model—General Form
First-Order (Straight Line) Probabilistic Model
y   0  1 x  
y = Deterministic component + Random
component
where y = Dependent variable
x = Independent variable
 0 = population y-intercept of the line—the point at which the
line intersects or cuts
through the y-axis
1 = population slope of the line—the amount of increase (or
decrease) in the deterministic
component of y for every
1-unit increase (or decrease)
in x.
 = random error component
where y is the “variable of interest”.
Assume that the mean value of the random error is
zero  the mean value of y, E(y), equals the
deterministic component of the model
7
8
First-Order (Straight Line) Probabilistic Model
Model Development
 0 and 1 are population parameters. They will only
 0 and 1 will generally be unknown.
The process of developing a model, estimating model
parameters, and using the model can be summarized in
these 5-steps:
1. Hypothesize the deterministic component of
the model that relates the mean, E(y) to the
independent variable x
be known if the population of all (x, y) measurements
are available.
 0 and 1, along with a specific value of the
independent variable x determine the mean value of
the dependent variable y.
E  y    0  1 x
2. Use sample data to estimate unknown model
parameters
find estimates: ˆ 0 or b0 , ˆ 1 or b1
9
Model Development (continued)
10
Example: Reaction time versus drug percentage
3. Specify the probability distribution of the random
error term and estimate the SD of this distribution
 ~ N  0,    will revisit this later
4. Statistically evaluate the usefulness of the model
5. Use model for prediction, estimation or other
purposes
11
Subject
Amount of Drug
(%)
x
Reaction Time
(seconds)
y
1
2
3
4
5
1
2
3
4
5
1
1
2
2
4
12
2
4.0
Reaction Time (sec.)
0.7
1.5
2.3
3.2
y  x   1
Errors or
residuals
(3,2)
(2,1)
0
1
2
3
4
intercept = -1
-1.0
-1.0
-0.2
-0.2
4.0
Example: Reaction time versus drug percentage
Reaction Time (sec.)
0.7
1.5
2.3
3.2
Example: Reaction time versus drug percentage
5
Drug (%)
0
1
Slope = (2-1)/(3-2) = 1/1 = 1
2
13
3
4
5
14
Drug (%)
Example: Reaction time versus drug percentage
Least Squares Line
Errors of prediction---vertical differences between
the observed and the predicted values of y
Also called regression line, or the least squares
prediction equation
x
y y  1  x
1
2
3
4
5
1
1
2
2
4
0
1
2
3
4
 y  y 
 y  y 
2
Method to find this line is called the method of least
squares
(1-0) = 1
1
(1-1) = 0
0
(2-2) = 0
0
(2-3) = -1
1
(4-4) = 0
0
Sum of
Sum of squared
errors = 0 errors (SSE) = 2
For our example, we have a sample of n = 5 pairs of
(x, y) values. The fitted line that we will calculate is
written as ŷ  b0  b1 x
ŷ is an estimator of the mean value of y, E  y  ;
b0 and b1 are estimators of  0 and 1
15
Least Squares Line (continued)
Define the sum of squares of the deviations of the y
values about their predicted values for all n data points
as:
n
n
SSE    yi  ˆy     yi   b0  b1 xi  
i 1
2
2
16
Formulas for the Least Squares Estimates
SD y
SS
Slope: b1  xy or b1  r
SS xx
SD x
  xi yi 
i 1
We want to find b0 and b1 to make the SSE a
minimum---termed least squares estimates
  x   y 
i
i
  xi2 
n
y-intercept: b0  y  b1 x 
ŷ  b0  b1 x is called the least squares line
sxx = SS xx    xi  x 
s xy = SS xy    xi  x  yi  y 
y
i
n
 b1
2
 x 
2
i
n
x
i
n
n = sample size
17
18
3
LS Calculations for Drug/Reaction Example
1
1
1
1
2
1
4
2
3
2
9
6
4
2
16
8
5
4
25
20
x
i
 15
SSxy  37 
y
i
 10
1510 
5
x
2
i
 55
 37  30  7
x y
i
i
LS Line for Drug/Reaction Example
4.0
xi yi
b1 
7
 0.7
10
10
15
  .7 
5
5
 2   .7  3
b0 
 2  2.1  .1
 37
SS xx  55 
15
ŷ  .1  .7 x
2
5
ŷ  .1  .7 x
Reaction Time (sec.)
0.7
1.5
2.3
3.2
xi2
x,y
-0.2
yi
 55  45  10
-1.0
xi
0
1
2
3
4
5
Drug (%)
19
20
LS Calculations for Drug/Reaction Example
Least Squares Line—Interpretation of ŷ  .1  .7 x
 y  ˆy 
Estimated intercept is negative
that the estimated mean reaction time is equal to
-0.1 seconds when the amount of drug is 0%.
x y ŷ  .1  .7 x
1
2
3
4
5
1
1
2
2
4
.6
1.3
2.0
2.7
3.4
 y  ˆy 
2
(1-.6) = .4
.16
(1-1.3) = -.3
.09
(2-2.0) = 0
.00
(2-2.7) = -.7
.49
(4-3.4) = .6
.36
Sum of
Sum of squared
errors = 0
errors (SSE) = 1.10
What does this mean since negative reaction times are
not possible?
Model parameters should be interpreted only within
the sampled range of the independent variable.
The LS line has a sum of errors = 0, but
SSE = 1.1 < 2.0 for visual model
21
22
Least Squares Line—Interpretation of ŷ  .1  .7 x
Coefficient of Determination
The slope of 0.7 implies that for every unit increase of
x, the mean value of y is estimated to increase by 0.7
units.
A measure of the contribution of x in predicting y
4.0
Assuming that x provides no information for the prediction of y,
the best prediction for the value of y is y
SS yy    yi  y 
2
-1.0
For every 1% increase in the amount of drug in the
bloodstream, the mean reaction time is estimated to
increase by 0.7 seconds over the sampled range of drug
amounts from 1% to 5%.
ˆy  y
Reaction Time (sec.)
-0.2
0.7
1.5
2.3
3.2
In the context of the problem:
0
23
1
2
3
Drug (%)
4
5
24
4
4.0
Coefficient of Determination (continued)
Reaction Time (sec.)
0.7
1.5
2.3
3.2
SSE    yi  ˆyi 
Coefficient of Determination (continued)
SS yy    yi  y 
2
2
--total sample variation around mean
i
SSE    yi  ˆyi  --unexplained sample variability after
2
fitting
SS yy  SSE --explained sample variability attributable to
linear relationship
-0.2
SS yy  SSE
-1.0
SSyy
0
1
2
3
4
explained
 proportion of total sample
total
variability explained by the
linear relationship
5
Drug (%)

25
26
Coefficient of Determination (continued)
r2 
SS yy  SSE
SS yy
 1
SSE
SS yy
}
Unexplained
variability
In simple linear regression r 2 is computed as the
square of the correlation coefficient, r
0  r2  1
Interpretation— r 2  .75 means that the sum of
squared deviations of the y values about their
predicted values has been reduced by 75% by the use
ŷ, instead of y , to predict y of the least squares
equation.
27
5
Download