The data in the motivating example can be modeled by
$$ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, 10. $$
Here $\beta_0$ is the basic quarterly sales, $\beta_1$ reflects the increase or decrease in quarterly sales per unit of student population (1000 students), and the $\varepsilon_i$'s are the unexpected variations or random errors. If $\beta_1$ is positive, it implies that as the student population increases, quarterly sales increase. Conversely, a negative $\beta_1$ implies that more students might result in lower quarterly sales. Therefore, estimating $\beta_0$ and $\beta_1$ is crucial in determining the relationship between the student population and quarterly sales. The best-known method for estimating $\beta_0$ and $\beta_1$ is the least squares method, proposed by Gauss in the eighteenth century. For the general setting,
$$ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, n. $$
The least squares method finds the estimates of $\beta_0$ and $\beta_1$, denoted $b_0$ and $b_1$, which minimize the objective function
$$ S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = \sum_{i=1}^{n} \varepsilon_i^2. $$
That is, for any other values $b_0^*$ and $b_1^*$, $S(b_0, b_1) \le S(b_0^*, b_1^*)$. The following explains why the least squares method works from two points of view.
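Before turning to those two points of view, here is a minimal Python sketch (not part of the original notes) that makes the objective function concrete; the function name `least_squares_objective` and the toy data are illustrative assumptions only.

```python
# A minimal sketch (assumed): evaluate the least squares objective
# S(beta0, beta1) = sum_i (y_i - beta0 - beta1 * x_i)^2 for given data.
def least_squares_objective(beta0, beta1, x, y):
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical toy data (not the motivating example's raw data).
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
print(least_squares_objective(0.0, 2.0, x, y))   # small value: this line fits well
print(least_squares_objective(0.0, 3.0, x, y))   # larger value: a worse line
```

Candidate pairs $(b_0, b_1)$ can be compared simply by comparing their objective values; the least squares estimate is the pair with the smallest value.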
Suppose
$$ y_i = 60 + 5 x_i + \varepsilon_i, \quad i = 1, \ldots, 10, $$
is the true model for the data in the motivating example. Also, assume all the random errors $\varepsilon_i$ are small, say no larger than 3 in absolute value. Then, heuristically,
$$ S(60, 5) = \sum_{i=1}^{10} (y_i - 60 - 5 x_i)^2 = \sum_{i=1}^{10} (60 + 5 x_i + \varepsilon_i - 60 - 5 x_i)^2 = \sum_{i=1}^{10} \varepsilon_i^2. $$
However,
$$ S(59, 5) = \sum_{i=1}^{10} (y_i - 59 - 5 x_i)^2 = \sum_{i=1}^{10} (60 + 5 x_i + \varepsilon_i - 59 - 5 x_i)^2 = \sum_{i=1}^{10} (1 + \varepsilon_i)^2 \ge \sum_{i=1}^{10} \varepsilon_i^2, $$
$$ S(60, 4) = \sum_{i=1}^{10} (y_i - 60 - 4 x_i)^2 = \sum_{i=1}^{10} (60 + 5 x_i + \varepsilon_i - 60 - 4 x_i)^2 = \sum_{i=1}^{10} (x_i + \varepsilon_i)^2 \ge \sum_{i=1}^{10} \varepsilon_i^2, $$
and
$$ S(59, 4) = \sum_{i=1}^{10} (y_i - 59 - 4 x_i)^2 = \sum_{i=1}^{10} (60 + 5 x_i + \varepsilon_i - 59 - 4 x_i)^2 = \sum_{i=1}^{10} (1 + x_i + \varepsilon_i)^2 \ge \sum_{i=1}^{10} \varepsilon_i^2. $$
These four relations suggest that the objective function $S(\beta_0, \beta_1)$ attains its minimum when the parameter estimates equal their true counterparts. That is, under small random errors (which usually occur in practice), the parameter estimates $b_0$ and $b_1$ might be quite close to the true values of the parameters, 60 and 5, given above.
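A small simulation can illustrate this heuristic. The sketch below (not part of the original notes) generates data from the true model $y_i = 60 + 5x_i + \varepsilon_i$ with errors bounded by 3 and compares the objective at the true parameters against the three nearby parameter pairs used above; the particular $x$ values are hypothetical, chosen only to be consistent with $\sum_{i=1}^{10} x_i = 140$ quoted later.

```python
# A sketch (assumed): simulate y_i = 60 + 5*x_i + eps_i with small errors
# and compare S at the true parameters with S at nearby parameter values.
import random

def S(beta0, beta1, x, y):
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))

random.seed(1)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]               # hypothetical x values
y = [60 + 5 * xi + random.uniform(-3, 3) for xi in x]  # small random errors

for beta0, beta1 in [(60, 5), (59, 5), (60, 4), (59, 4)]:
    print((beta0, beta1), round(S(beta0, beta1, x, y), 2))
# S(60, 5) is typically the smallest of the four values; for (60, 4) and
# (59, 4) it is guaranteed here because the x_i are much larger than the errors.
```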
The least squares method finds the “best” line, that is, the line to which the data points $(x_1, y_1), (x_2, y_2), \ldots, (x_{10}, y_{10})$ are closest.
Now we demonstrate the procedure for finding $b_0$ and $b_1$. In calculus, the maximum or minimum of a function of two variables $f(x, y)$ can be found by first solving
$$ \frac{\partial f(x, y)}{\partial x} = 0 \quad \text{and} \quad \frac{\partial f(x, y)}{\partial y} = 0, $$
and then checking whether the Hessian matrix (the matrix of second partial derivatives) is positive definite or negative definite. Therefore, we need to find the solutions of
$$ \frac{\partial S(\beta_0, \beta_1)}{\partial \beta_0} = 0 \quad \text{and} \quad \frac{\partial S(\beta_0, \beta_1)}{\partial \beta_1} = 0. $$
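For a concrete illustration of this calculus approach, the following sketch (not part of the original notes) sets both partial derivatives to zero and solves them symbolically for a small hypothetical data set; the data values and the use of the sympy library are assumptions for illustration.

```python
# A sketch (assumed): solve dS/dbeta0 = 0 and dS/dbeta1 = 0 symbolically
# for a small hypothetical data set, and check the Hessian.
import sympy as sp

beta0, beta1 = sp.symbols("beta0 beta1")
x = [1, 2, 3, 4]              # hypothetical data
y = [2.1, 3.9, 6.2, 7.8]
S = sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))
stationary = sp.solve([sp.diff(S, beta0), sp.diff(S, beta1)], [beta0, beta1])
print(stationary)             # the unique stationary point is the least squares estimate
print(sp.hessian(S, (beta0, beta1)).is_positive_definite)   # True: it is a minimum
```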
It is quite complicated to solve the above equations directly. However, it becomes much easier if we let
$$ \beta_0^* = \beta_0 + \beta_1 \bar{x}, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, $$
and thus
$$ S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = \sum_{i=1}^{n} \big(y_i - (\beta_0 + \beta_1 \bar{x}) - \beta_1 (x_i - \bar{x})\big)^2 = \sum_{i=1}^{n} \big(y_i - \beta_0^* - \beta_1 (x_i - \bar{x})\big)^2 = S^*(\beta_0^*, \beta_1). $$
Then we find $b_0^*$ and $b_1$ minimizing $S^*(\beta_0^*, \beta_1)$, and $b_0$ can be obtained from the equation $b_0 = b_0^* - b_1 \bar{x}$.
$$ \frac{\partial S^*(\beta_0^*, \beta_1)}{\partial \beta_0^*} = \frac{\partial}{\partial \beta_0^*} \sum_{i=1}^{n} \big(y_i - \beta_0^* - \beta_1 (x_i - \bar{x})\big)^2 = -2 \sum_{i=1}^{n} \big(y_i - \beta_0^* - \beta_1 (x_i - \bar{x})\big) = 0. $$
Since
$$ \sum_{i=1}^{n} \beta_1 (x_i - \bar{x}) = \beta_1 \sum_{i=1}^{n} (x_i - \bar{x}) = \beta_1 \Big( \sum_{i=1}^{n} x_i - n\bar{x} \Big) = \beta_1 \Big( \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i \Big) = 0, $$
it follows that
$$ -2 \sum_{i=1}^{n} \big(y_i - \beta_0^* - \beta_1 (x_i - \bar{x})\big) = -2 \sum_{i=1}^{n} (y_i - \beta_0^*) = -2 \Big( \sum_{i=1}^{n} y_i - n\beta_0^* \Big) = 0 $$
$$ \Longrightarrow \quad b_0^* = \frac{\sum_{i=1}^{n} y_i}{n} = \bar{y}. $$
$$ \frac{\partial S^*(b_0^*, \beta_1)}{\partial \beta_1} = \frac{\partial}{\partial \beta_1} \sum_{i=1}^{n} \big(y_i - \bar{y} - \beta_1 (x_i - \bar{x})\big)^2 = -2 \sum_{i=1}^{n} \big(y_i - \bar{y} - \beta_1 (x_i - \bar{x})\big)(x_i - \bar{x}) = 0 $$
$$ \Longrightarrow \quad \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) - \beta_1 \sum_{i=1}^{n} (x_i - \bar{x})^2 = 0 $$
$$ \Longrightarrow \quad b_1 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{s_{XY}}{s_{XX}}, $$
where
$$ s_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2, \qquad s_{XY} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}. $$
Thus,
$$ b_0 = b_0^* - b_1 \bar{x} = \bar{y} - b_1 \bar{x}. $$
The fitted regression equation (the fitted line) is
$$ \hat{y} = b_0 + b_1 x = \bar{y} + \frac{s_{XY}}{s_{XX}} (x - \bar{x}). $$
The fitted value for the $i$'th observation is $\hat{y}_i = b_0 + b_1 x_i$.
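The closed-form solution above translates directly into code. The following is a minimal Python sketch (not from the original notes); the helper names `least_squares_fit` and `fitted_values` are assumptions for illustration.

```python
# A sketch (assumed) of the closed-form least squares fit derived above:
# b1 = s_XY / s_XX and b0 = ybar - b1 * xbar.
def least_squares_fit(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - xbar) ** 2 for xi in x)
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = ybar - b1 * xbar
    return b0, b1

def fitted_values(b0, b1, x):
    # y_hat_i = b0 + b1 * x_i for each observation
    return [b0 + b1 * xi for xi in x]
```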
In the motivating example,
$$ \sum_{i=1}^{10} x_i = 140, \quad \sum_{i=1}^{10} y_i = 1300, \quad \sum_{i=1}^{10} x_i^2 = 2528, \quad \sum_{i=1}^{10} y_i^2 = 184730, \quad \bar{x} = 14, \quad \bar{y} = 130, \quad \sum_{i=1}^{10} x_i y_i = 21040. $$
Thus,
$$ s_{XY} = 21040 - 10 \cdot 14 \cdot 130 = 2840, \qquad s_{XX} = 2528 - 10 \cdot 14^2 = 568, $$
$$ b_1 = \frac{s_{XY}}{s_{XX}} = \frac{2840}{568} = 5, \qquad b_0 = \bar{y} - b_1 \bar{x} = 130 - 5 \cdot 14 = 60. $$
Therefore, $\hat{y} = 60 + 5x$ is the fitted regression equation.
That is, an increase in the student population of 1000 is associated with an increase of $5000 in expected quarterly sales, i.e., quarterly sales are expected to increase by $5 per student.
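As a quick numerical check, the sketch below (not part of the original notes) reproduces these figures from the summary statistics quoted above.

```python
# A sketch (assumed): reproduce the motivating-example estimates from the
# summary statistics alone (n = 10).
n = 10
sum_x, sum_y, sum_x2, sum_xy = 140, 1300, 2528, 21040

xbar, ybar = sum_x / n, sum_y / n        # 14.0, 130.0
s_xy = sum_xy - n * xbar * ybar          # 2840.0
s_xx = sum_x2 - n * xbar ** 2            # 568.0
b1 = s_xy / s_xx                         # 5.0
b0 = ybar - b1 * xbar                    # 60.0
print(s_xy, s_xx, b1, b0)
```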
Example 1: Consider the following data.

    x_i :  0   2   2   5   5   9   9   9   9   10
    y_i : -2   0   2   1   3   1   0   0   1   -1

Please fit the model $Y = \beta_0 + \beta_1 X + \varepsilon$.
[sol]:
$$ \sum_{i=1}^{10} x_i = 60, \quad \sum_{i=1}^{10} y_i = 5, \quad \sum_{i=1}^{10} x_i^2 = 482, \quad \sum_{i=1}^{10} y_i^2 = 21, \quad \bar{x} = 6, \quad \bar{y} = 0.5, \quad \sum_{i=1}^{10} x_i y_i = 32. $$
Thus,
$$ s_{XY} = 32 - 10 \cdot 6 \cdot 0.5 = 2 \quad \text{and} \quad s_{XX} = 482 - 10 \cdot 6^2 = 122, $$
$$ b_1 = \frac{s_{XY}}{s_{XX}} = \frac{2}{122} = \frac{1}{61} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{x} = 0.5 - \frac{6}{61} = \frac{49}{122}. $$
Therefore, $\hat{y} = \dfrac{49}{122} + \dfrac{x}{61}$ is the fitted regression equation.
The fitted value of the first observation is
$$ \hat{y}_1 = \frac{49}{122} + \frac{x_1}{61} = \frac{49}{122} + \frac{0}{61} = \frac{49}{122}. $$
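The exact fractions can be verified with the sketch below (not part of the original notes), which applies Python's `fractions` module to the data tabulated above.

```python
# A sketch (assumed): check Example 1 with exact rational arithmetic.
from fractions import Fraction

x = [0, 2, 2, 5, 5, 9, 9, 9, 9, 10]
y = [-2, 0, 2, 1, 3, 1, 0, 0, 1, -1]
n = len(x)

xbar = Fraction(sum(x), n)                                                # 6
ybar = Fraction(sum(y), n)                                                # 1/2
s_xy = sum(Fraction(xi * yi) for xi, yi in zip(x, y)) - n * xbar * ybar   # 2
s_xx = sum(Fraction(xi ** 2) for xi in x) - n * xbar ** 2                 # 122
b1 = s_xy / s_xx                                                          # 1/61
b0 = ybar - b1 * xbar                                                     # 49/122
print(b1, b0)                                                             # 1/61 49/122
```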
Sir Francis Galton (1822-1911) was responsible for the introduction of the word “regression”. Originally he used the term “reversion” in an unpublished address. He investigated the relationship between the child’s height (y) and parent’s height (x). The fitted regression equation he obtained was something like
$$ \hat{y} = \bar{y} + \frac{2}{3}\,(x - \bar{x}). $$
Example 2:
Suppose the model is
$$ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, 20, \quad \varepsilon_i \sim N(0, \sigma^2), $$
and
$$ \sum_{i=1}^{20} x_i = 1330, \quad \sum_{i=1}^{20} y_i = 1862.8, \quad \sum_{i=1}^{20} x_i^2 = 90662, \quad \sum_{i=1}^{20} y_i^2 = 173554.26, \quad \sum_{i=1}^{20} x_i y_i = 124206.9. $$
Find the least squares estimates and the fitted regression equation.
[solution:]
$$ s_{XX} = \sum_{i=1}^{20} x_i^2 - 20\,\bar{x}^2 = 90662 - 20 \left( \frac{1330}{20} \right)^2 = 2217, $$
$$ s_{XY} = \sum_{i=1}^{20} x_i y_i - 20\,\bar{x}\bar{y} = 124206.9 - 20 \left( \frac{1330}{20} \right) \left( \frac{1862.8}{20} \right) = 330.7. $$
Then, the least squares estimates are
$$ b_1 = \frac{s_{XY}}{s_{XX}} = \frac{330.7}{2217} \approx 0.149, \qquad b_0 = \bar{y} - b_1 \bar{x} = \frac{1862.8}{20} - 0.149 \cdot \frac{1330}{20} \approx 83.232. $$
The fitted regression equation is
$$ \hat{y} = 83.232 + 0.149\,x. $$
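The sketch below (not part of the original notes) reproduces these numbers from the summary statistics; note that carrying full precision for $b_1$ gives $b_0 \approx 83.22$, while rounding $b_1$ to 0.149 first, as above, gives 83.232.

```python
# A sketch (assumed): Example 2 computed from summary statistics only.
n = 20
sum_x, sum_y = 1330, 1862.8
sum_x2, sum_xy = 90662, 124206.9

xbar, ybar = sum_x / n, sum_y / n     # 66.5, 93.14
s_xx = sum_x2 - n * xbar ** 2         # 2217.0
s_xy = sum_xy - n * xbar * ybar       # ~330.7
b1 = s_xy / s_xx                      # ~0.1492
b0 = ybar - b1 * xbar                 # ~83.22 (83.232 if b1 is rounded to 0.149 first)
print(round(s_xx, 1), round(s_xy, 1), round(b1, 3), round(b0, 3))
```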
Exercise 14.1.1