Discrete Distributions

Multiple Regression

Multiple regression problems involve more than one regressor variable.

Ex. $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

(2 regressors, $x_1$ and $x_2$, and an error term $\varepsilon$)

Termed “linear” because y is a linear function of the unknown beta parameters.

Often used as approximating functions when the true functional relationship between $y$ and $x_1, x_2, \ldots, x_k$ is unknown, but over certain ranges of the independent variables the linear regression model is an adequate approximation.

Any regression model that is linear in the parameters (the betas) is a linear regression model, regardless of the shape of the surface that it generates.

Ex. $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$

Let $x_3 = x_1 x_2$. Then

$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$

Now the form of the equation has been linearized.
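A minimal Python sketch of this substitution. The function names and the coefficient values are made up for illustration and are not part of the notes. The interaction term simply becomes one more entry of the design row, and the prediction is still a plain dot product with the betas, which is what "linear in the parameters" means.

```python
def design_row(x1, x2):
    """Return [1, x1, x2, x3] with x3 = x1 * x2 (the substituted regressor)."""
    return [1.0, x1, x2, x1 * x2]


def predict(betas, row):
    """Linear in the parameters: a dot product of the betas and the design row."""
    return sum(b * x for b, x in zip(betas, row))


# Hypothetical coefficients beta_0..beta_3, chosen only for illustration.
betas = [1.0, 2.0, -0.5, 0.25]
row = design_row(3.0, 4.0)    # [1.0, 3.0, 4.0, 12.0]
y_hat = predict(betas, row)   # 1 + 6 - 2 + 3 = 8.0
print(row, y_hat)
```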

We use the method of least squares to estimate the regression coefficients. The goal is to minimize the error function:

$$L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \right)^2$$

(Solve the model equation for $\varepsilon$: $\varepsilon = Y - \beta_0 - \beta_1 x_1 - \beta_2 x_2 - \beta_3 x_3$ for our example.)

Partial derivatives are taken with respect to each regression coefficient and set = 0 to determine minimum of error function. This results in the least squares normal equations:

CIVL 7012/8012 Probabilistic Methods for Engineers 1

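$$
\begin{aligned}
n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} x_{i1} + \hat\beta_2 \sum_{i=1}^{n} x_{i2} + \cdots + \hat\beta_k \sum_{i=1}^{n} x_{ik} &= \sum_{i=1}^{n} y_i \\
\hat\beta_0 \sum_{i=1}^{n} x_{i1} + \hat\beta_1 \sum_{i=1}^{n} x_{i1}^2 + \hat\beta_2 \sum_{i=1}^{n} x_{i1} x_{i2} + \cdots + \hat\beta_k \sum_{i=1}^{n} x_{i1} x_{ik} &= \sum_{i=1}^{n} x_{i1} y_i \\
&\;\;\vdots \\
\hat\beta_0 \sum_{i=1}^{n} x_{ik} + \hat\beta_1 \sum_{i=1}^{n} x_{ik} x_{i1} + \hat\beta_2 \sum_{i=1}^{n} x_{ik} x_{i2} + \cdots + \hat\beta_k \sum_{i=1}^{n} x_{ik}^2 &= \sum_{i=1}^{n} x_{ik} y_i
\end{aligned}
$$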

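In matrix form:

$$
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}, \quad
\hat\beta = \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_k \end{bmatrix}
$$

where $n$ = number of observations and $k$ = number of regressors.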


For $\partial L / \partial \beta = 0$,

$$\hat\beta = (X^T X)^{-1} X^T Y$$

or, as it may also be written, $\hat\beta = (X'X)^{-1} X'Y$.
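A sketch of the matrix solution in plain Python, so that it runs without extra packages. The helper names (`transpose`, `matmul`, `solve`, `fit_least_squares`) and the small data set are made up for illustration. Rather than forming the inverse explicitly, the sketch solves the normal equations $(X^T X)\hat\beta = X^T Y$ by elimination, which gives the same $\hat\beta$.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(x_rows, y):
    """x_rows holds the regressor values per observation (no leading 1)."""
    X = [[1.0] + list(r) for r in x_rows]            # add the intercept column
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(a * b for a, b in zip(col, y)) for col in Xt]
    return solve(XtX, XtY)                           # beta-hat


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2.
x_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
beta_hat = fit_least_squares(x_rows, y)
print([round(b, 6) for b in beta_hat])
```

Because the made-up data contain no error term, the fit should recover the generating coefficients up to floating-point rounding.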

Example:

The data given below is a sample of trip generation data collected in a major city.

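| Zone Number | Home Based Trips/D.U. | Average Family Income ($ × 10^-3) | Net Residential Density (D.U./acre) |
|---|---|---|---|
| 1 | 4.2 | 37.4 | 13.9 |
| 2 | 3.0 | 26.0 | 49.2 |
| 3 | 3.9 | 42.0 | 15.2 |
| 4 | 4.5 | 51.2 | 16.2 |
| 5 | 5.6 | 68.7 | 11.5 |
| 6 | 5.8 | 51.2 | 6.5 |
| 7 | 4.4 | 58.0 | 5.7 |
| 8 | 7.0 | 65.7 | 4.9 |
| 9 | 5.9 | 65.7 | 7.6 |
| 10 | 3.1 | 32.8 | 28.5 |
| 11 | 4.6 | 42.0 | 10.4 |
| 12 | 6.6 | 51.2 | 2.6 |
| 13 | 6.1 | 58.8 | 2.4 |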

It is suspected that home based trips per dwelling unit (D.U.) may be modeled as a linear function of average family income and the natural logarithm of net residential density.

To determine the "best" relationship of this form, we make the following definitions:

$Y$ = home based trips per dwelling unit

$X_1$ = average family income × $10^{-3}$

$X_2$ = ln(net residential density)

We can write the prediction equation as:

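$$\hat Y = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2$$

With one row $[1, X_1, X_2]$ per zone ($X_2$ is ln(density) rounded to two decimals):

$$
X = \begin{bmatrix}
1 & 37.4 & 2.63 \\
1 & 26.0 & 3.90 \\
1 & 42.0 & 2.72 \\
1 & 51.2 & 2.79 \\
1 & 68.7 & 2.44 \\
1 & 51.2 & 1.87 \\
1 & 58.0 & 1.74 \\
1 & 65.7 & 1.59 \\
1 & 65.7 & 2.03 \\
1 & 32.8 & 3.35 \\
1 & 42.0 & 2.34 \\
1 & 51.2 & 0.96 \\
1 & 58.8 & 0.88
\end{bmatrix}, \quad
Y = \begin{bmatrix} 4.2 \\ 3.0 \\ 3.9 \\ 4.5 \\ 5.6 \\ 5.8 \\ 4.4 \\ 7.0 \\ 5.9 \\ 3.1 \\ 4.6 \\ 6.6 \\ 6.1 \end{bmatrix}
$$

$$
X^T X = \begin{bmatrix}
13 & 650.7 & 29.24 \\
650.7 & 34717.03 & 1368.03 \\
29.24 & 1368.03 & 74.83
\end{bmatrix}, \quad
X^T Y = \begin{bmatrix} 64.7 \\ 3405.17 \\ 134.035 \end{bmatrix}
$$

$$
(X^T X)^{-1} = \begin{bmatrix}
5.40906 & -0.06472 & -0.93043 \\
-0.06472 & 0.000877 & 0.009249 \\
-0.93043 & 0.009249 & 0.20784
\end{bmatrix}
$$

$$
\hat\beta = (X^T X)^{-1} X^T Y = \begin{bmatrix} 4.880148 \\ 0.03997 \\ -0.84645 \end{bmatrix}
$$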

The final trip generation prediction equation, then, is:

$$\text{Home based trips / D.U.} = 4.88 + 0.0400 \left( \frac{\text{average family income}}{1000} \right) - 0.846 \, \ln(\text{net residential density})$$







Multiple Correlation

To determine the suitability of the model form itself to the observed data, we use the coefficient of multiple determination, $R^2$. This statistic is defined as the fraction of the total variation which is explained by the regression model:

$$R^2 = \frac{SSR}{SST}, \qquad SST = SSR + SSE$$

where:

SSR = sum of squares of regression

SST = total sum of squares

SSE = sum of squares due to error or residual

For the ideal situation (no random error), the regression model would explain all of the variation, and $R^2 = 1$. As $R^2$ approaches 0, the model fit becomes very poor. We are measuring how well the model fits the data points.
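As an illustration, $R^2$ can be computed for the trip generation example. This sketch takes the data from the table and the fitted coefficients from the example (4.880148, 0.03997, −0.84645), and assumes $X_2$ is ln(density) rounded to two decimals. It uses $R^2 = 1 - SSE/SST$, which is the same as $SSR/SST$ because $SST = SSR + SSE$.

```python
import math

# Trip generation data from the example (zones 1 to 13).
trips = [4.2, 3.0, 3.9, 4.5, 5.6, 5.8, 4.4, 7.0, 5.9, 3.1, 4.6, 6.6, 6.1]
income = [37.4, 26.0, 42.0, 51.2, 68.7, 51.2, 58.0, 65.7, 65.7, 32.8, 42.0, 51.2, 58.8]
density = [13.9, 49.2, 15.2, 16.2, 11.5, 6.5, 5.7, 4.9, 7.6, 28.5, 10.4, 2.6, 2.4]

# Fitted coefficients quoted in the example (rounded as printed there).
beta_hat = [4.880148, 0.03997, -0.84645]

x2 = [round(math.log(d), 2) for d in density]
fitted = [beta_hat[0] + beta_hat[1] * a + beta_hat[2] * b for a, b in zip(income, x2)]

y_bar = sum(trips) / len(trips)
sse = sum((y - f) ** 2 for y, f in zip(trips, fitted))   # unexplained variation
sst = sum((y - y_bar) ** 2 for y in trips)               # total variation
r2 = 1 - sse / sst
print(round(sst, 2), round(r2, 2))
```

The result should be roughly 0.83, that is, the model explains a little over four fifths of the variation in trips per dwelling unit.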

To estimate error variance (often called “common variance”):

$$\hat\sigma^2 = \frac{SSE}{n - p}$$

where

$$SSE = Y^T Y - \hat\beta^T X^T Y$$

SSE = sum of squares of errors

($p$ = number of betas $= k + 1$, where $k$ is the number of independent variables (x’s); the error degrees of freedom are $n - p = n - k - 1$.)


Confidence intervals for the mean response:

Mean response: $\hat Y_0 = X_0^T \hat\beta$ for a given $X_0 = [1, x_{01}, \ldots, x_{0k}]^T$

$$s^2_{\hat Y_0} = \hat\sigma^2 \, X_0^T (X^T X)^{-1} X_0 \qquad (29)$$

A 100(1 − α)% confidence interval on the mean response at the point $(x_{01}, x_{02}, \ldots, x_{0k})$ is:

$$\hat Y_0 - t_{\alpha/2,(n-p)} \sqrt{\hat\sigma^2 \, X_0^T (X^T X)^{-1} X_0} \;\le\; E(Y \mid X_0) \;\le\; \hat Y_0 + t_{\alpha/2,(n-p)} \sqrt{\hat\sigma^2 \, X_0^T (X^T X)^{-1} X_0} \qquad (30)$$

For a 100(1 − α)% confidence interval on the β’s:

$$\hat\beta_j - t_{\alpha/2,(n-p)} \sqrt{\hat\sigma^2 C_{jj}} \;\le\; \beta_j \;\le\; \hat\beta_j + t_{\alpha/2,(n-p)} \sqrt{\hat\sigma^2 C_{jj}}$$

where $C_{jj}$ is the $jj$th element of the $(X^T X)^{-1}$ matrix, and $\hat\sigma^2$ is the estimate of error variance.

Let’s go back to the last example. Suppose we are concerned with the average number of home based trips per dwelling unit when the average family income is $72,600 and the net residential density is 3.3 d.u./acre. Let's predict the average home based trips per dwelling, and construct a 90% C.I. on this prediction.

$X_{1,\text{desired}} = 72.6$

$X_{2,\text{desired}} = \ln(3.3) = 1.19$

We can immediately construct the $a$ matrix:

$$X_0 = a = \begin{bmatrix} 1 \\ 72.6 \\ 1.19 \end{bmatrix}$$

While we’re at it, we can get $\hat Y$ from the prediction equation (already developed).

$$\hat Y = 4.880148 + 0.03997(72.6) - 0.84645(1.19)$$

$\hat Y = 6.77$ home based trips per dwelling unit (on the average).
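As a check on the arithmetic, the following Python sketch refits the model from the table by solving the normal equations and then evaluates the prediction at the desired point. It assumes $X_2$ is ln(density) rounded to two decimals, as in the example's X matrix; the helper `solve` is written for this illustration only.

```python
import math

# Trip generation data from the table (zones 1 to 13).
trips = [4.2, 3.0, 3.9, 4.5, 5.6, 5.8, 4.4, 7.0, 5.9, 3.1, 4.6, 6.6, 6.1]
income = [37.4, 26.0, 42.0, 51.2, 68.7, 51.2, 58.0, 65.7, 65.7, 32.8, 42.0, 51.2, 58.8]
density = [13.9, 49.2, 15.2, 16.2, 11.5, 6.5, 5.7, 4.9, 7.6, 28.5, 10.4, 2.6, 2.4]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Design matrix rows [1, X1, X2] with X2 = ln(density), two decimals.
X = [[1.0, x1, round(math.log(d), 2)] for x1, d in zip(income, density)]
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
XtY = [sum(r[i] * y for r, y in zip(X, trips)) for i in range(3)]
beta_hat = solve(XtX, XtY)

# Prediction at income $72,600 and density 3.3 D.U./acre.
a = [1.0, 72.6, round(math.log(3.3), 2)]
y_hat = sum(b * v for b, v in zip(beta_hat, a))
print([round(b, 4) for b in beta_hat], round(y_hat, 2))
```

The coefficients should agree with the prediction equation to the digits quoted there, and the prediction should come out near 6.77.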

Now, $(X^T X)^{-1}$ was already calculated in developing the prediction equation:

$$
(X^T X)^{-1} = \begin{bmatrix}
5.40906 & -0.06472 & -0.93043 \\
-0.06472 & 0.000877 & 0.009249 \\
-0.93043 & 0.009249 & 0.20784
\end{bmatrix}
$$

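Let's work under the radical in Eq. 30.

$$
a^T (X^T X)^{-1} a = \begin{bmatrix} 1 & 72.6 & 1.19 \end{bmatrix}
\begin{bmatrix}
5.40906 & -0.06472 & -0.93043 \\
-0.06472 & 0.000877 & 0.009249 \\
-0.93043 & 0.009249 & 0.20784
\end{bmatrix}
\begin{bmatrix} 1 \\ 72.6 \\ 1.19 \end{bmatrix}
$$

$$a^T (X^T X)^{-1} a = 0.31434$$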
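The quadratic form under the radical can be evaluated directly. This sketch uses the entries of $(X^T X)^{-1}$ as printed in the example, which are rounded to only a few significant digits, so the result agrees with the 0.31434 used in the notes only to about two decimals.

```python
# (X^T X)^-1 with entries rounded as printed in the example.
C = [
    [5.40906, -0.06472, -0.93043],
    [-0.06472, 0.000877, 0.009249],
    [-0.93043, 0.009249, 0.20784],
]
a = [1.0, 72.6, 1.19]   # [1, X1 desired, X2 desired]

Ca = [sum(c * v for c, v in zip(row, a)) for row in C]   # C @ a
q = sum(v * w for v, w in zip(a, Ca))                    # a^T @ C @ a
print(round(q, 3))
```

The large cancellation between terms is why carrying more digits in $(X^T X)^{-1}$ matters for this step.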



We now need to calculate $s^2$. Using Eq. 26:

$$SSE = Y^T Y - \hat\beta^T X^T Y$$

We've already calculated:

$$
\hat\beta^T X^T Y = \begin{bmatrix} 4.880148 & 0.03997 & -0.84645 \end{bmatrix}
\begin{bmatrix} 64.7 \\ 3405.17 \\ 134.035 \end{bmatrix} = 338.41
$$

$$Y = \begin{bmatrix} 4.2 & 3.0 & 3.9 & 4.5 & 5.6 & 5.8 & 4.4 & 7.0 & 5.9 & 3.1 & 4.6 & 6.6 & 6.1 \end{bmatrix}^T$$

(The measured Y values!)

$$Y^T Y = \sum_{i=1}^{n} Y_i^2 = 341.81$$

$SSE = 341.81 - 338.41$

$SSE = 3.40$

Now, $n = 13$, $k = 2$

d.f. $= n - k - 1 = 13 - 2 - 1 = 10$ (short cut!)

Using eq. (27) to find our estimate of common variance:

$$s^2 = \frac{SSE}{n - k - 1} = \frac{3.40}{10} = 0.340$$

And using eq. (29):

$$s^2_{\hat Y} = s^2 \left[ a^T (X^T X)^{-1} a \right] = 0.340 \times 0.31434 = 0.10688$$

$$s_{\hat Y} = \sqrt{s^2_{\hat Y}} = \sqrt{0.10688} = 0.3269$$

Suppose we wish to find a 90% confidence interval for our prediction $\hat Y$.

$1 - \alpha = 0.90$, so $\alpha = 0.10$ and $\alpha/2 = 0.05$, with d.f. $= 10$.

From the t-table we get: $t_{0.05,10} = 1.812$

Now we can use eq. (30)!

90% C.I. = $\hat Y \pm t_{0.05,10} \, s_{\hat Y}$

90% C.I. = $6.77 \pm 1.812 \times 0.3269$

Or

= $6.77 \pm 0.59$

90% C.I. = (6.18, 7.36)

ANOTHER EXAMPLE

Let’s find a 95% confidence interval on $\beta_1$ for the same example we’ve been working with. We’ve already found that $\hat\beta_1 = 0.03997$, and we also calculated $s^2 = 0.340$.

We further know that

$$
C = (X^T X)^{-1} = \begin{bmatrix}
5.40906 & -0.06472 & -0.93043 \\
-0.06472 & 0.000877 & 0.009249 \\
-0.93043 & 0.009249 & 0.20784
\end{bmatrix}
$$

Then $s^2_{\hat\beta_1} = s^2 C_{11} = 0.340 \times 0.000877$, or $s^2_{\hat\beta_1} = 0.000298$, with d.f. = 10 (as before).

For a 95% C.I., $\alpha = 1 - 0.95 = 0.05$ and $\alpha/2 = 0.025$.

From the t table: $t_{0.025,10} = 2.228$

95% C.I. = $\hat\beta_1 \pm t_{0.025,10} \sqrt{s^2_{\hat\beta_1}} = 0.03997 \pm 2.228 \sqrt{0.000298}$

95% C.I. = $0.03997 \pm 0.0385$

95% C.I. = (0.0015, 0.0785)
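Both intervals can be reproduced in a few lines of Python. This sketch starts from the quantities established in the example (SSE = 3.40 with 10 degrees of freedom, $a^T (X^T X)^{-1} a = 0.31434$, $C_{11} = 0.000877$) and takes the two critical t values from the t-table as quoted above; the variable names are illustrative.

```python
import math

sse, n, k = 3.40, 13, 2
s2 = sse / (n - k - 1)            # estimate of the common variance

# 90% interval on the mean response at a = [1, 72.6, 1.19].
q = 0.31434                       # a^T (X^T X)^-1 a from the example
y_hat = 6.77
t_90 = 1.812                      # t(0.05, 10) from the t-table
half_mean = t_90 * math.sqrt(s2 * q)
ci_mean = (y_hat - half_mean, y_hat + half_mean)

# 95% interval on beta_1.
b1, c11 = 0.03997, 0.000877
t_95 = 2.228                      # t(0.025, 10) from the t-table
half_b1 = t_95 * math.sqrt(s2 * c11)
ci_b1 = (b1 - half_b1, b1 + half_b1)

print(round(s2, 3), [round(v, 2) for v in ci_mean], [round(v, 4) for v in ci_b1])
```

Note that the variance estimate is SSE divided by the error degrees of freedom, so with SSE = 3.40 it is well below 1.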


Hypothesis Testing:

Test for significance of regression: tests whether a linear relationship exists between the dependent variable $y$ and the independent variables $x_1, x_2, \ldots, x_k$.

$H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$

$H_a$: $\beta_j \ne 0$ for at least one $j$.

Rejection of $H_0$ implies that at least one of the independent variables $x_1, x_2, \ldots, x_k$ contributes significantly to the model.

Recall: SST = SSR + SSE

$$SSE = Y^T Y - \hat\beta^T X^T Y$$

$$SSR = \hat\beta^T X^T Y - \frac{\left( \sum_{i=1}^{n} y_i \right)^2}{n}$$

$$SST = Y^T Y - \frac{\left( \sum_{i=1}^{n} y_i \right)^2}{n}, \qquad Y^T Y = \sum_{i=1}^{n} y_i^2$$

Total sum of squares = sum of squares due to regression + sum of squares due to error

T.S.

$$F_0 = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{MSR}{MSE}$$

Where MS = “mean square.”

Reject $H_0$ if $F_0 > F_{\alpha, k, (n-k-1)}$.
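Applied to the trip generation example, the test can be sketched as follows. The inputs $Y^T Y = 341.81$, $\hat\beta^T X^T Y = 338.41$ and $\sum y_i = 64.7$ come from that example; the significance level α = 0.05 and the table value $F_{0.05,2,10} = 4.10$ are assumptions added here for illustration.

```python
n, k = 13, 2
yty = 341.81        # Y^T Y
b_xty = 338.41      # beta-hat^T X^T Y
sum_y = 64.7        # sum of the observed y values

correction = sum_y ** 2 / n
ssr = b_xty - correction          # sum of squares due to regression
sse = yty - b_xty                 # sum of squares due to error
sst = yty - correction            # total sum of squares

msr = ssr / k
mse = sse / (n - k - 1)
f0 = msr / mse

f_crit = 4.10                     # F(0.05; 2, 10), assumed alpha = 0.05
print(round(f0, 1), f0 > f_crit)
```

Under these assumptions $F_0$ is far above the critical value, so $H_0$ would be rejected: at least one of the two regressors contributes significantly.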


Tests on Individual Regression Coefficients:

Determines the value of each of the independent variables in the regression model (should you add or delete?)

$H_0$: $\beta_j = 0$

$H_a$: $\beta_j \ne 0$

T.S.

$$t_0 = \frac{\hat\beta_j}{\sqrt{\hat\sigma^2 C_{jj}}}, \qquad \hat\sigma^2 = \frac{SSE}{n - p}, \quad \text{where } p = \text{number of betas}$$

$C_{jj}$ is the diagonal element of $(X'X)^{-1}$ corresponding to $\hat\beta_j$. (*Remember: the numbering starts at $\beta_0$.)

$H_0$ is rejected if $|t_0| > t_{\alpha/2,(n-k-1)}$.

If $H_0$ is not rejected, this indicates that $x_j$ can possibly be deleted from the model.
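A short sketch of this test for the trip generation example. It takes the coefficients, the diagonal of $(X^T X)^{-1}$, SSE = 3.40 and the 10 error degrees of freedom from that example; the choice α = 0.05 (so $t_{0.025,10} = 2.228$) is an assumption made here for illustration.

```python
import math

s2 = 3.40 / 10                                  # SSE / (n - p)
beta_hat = [4.880148, 0.03997, -0.84645]        # beta_0, beta_1, beta_2
c_diag = [5.40906, 0.000877, 0.20784]           # C_00, C_11, C_22
t_crit = 2.228                                  # t(0.025, 10), assumed alpha = 0.05

t_stats = [b / math.sqrt(s2 * c) for b, c in zip(beta_hat, c_diag)]
for j, t0 in enumerate(t_stats):
    print(f"beta_{j}: t0 = {t0:.2f}, reject H0: {abs(t0) > t_crit}")
```

With these inputs every $|t_0|$ exceeds the critical value, so neither income nor ln(density) would be a candidate for deletion at this level.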


Selection of Variables in Multiple Regression:

We want to find the “best” subset of regressor variables for a final model from a set of possible regressor variables.

Backward Elimination:

- Begins with all candidate variables in the model.

- The variable with the smallest partial F-statistic is deleted if this F-statistic is insignificant ($F < F_{out} = F_{\alpha, r, (n-p)}$).

- The algorithm terminates when no further variables can be deleted.

Set up elimination (ANOVA) table:

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Fo |
|---|---|---|---|---|
| Regression | Current SSR | k | MSR | MSR/MSE |
| SSR(β1 ∣ β2, β3, β4, β5, β0) | SSR − SSR w/o β1 | | | |
| SSR(β2 ∣ β1, β3, β4, β5, β0) | | | | |
| SSR(β3 ∣ β1, β2, β4, β5, β0) | | | | |
| SSR(β4 ∣ β1, β2, β3, β5, β0) | | | | |
| SSR(β5 ∣ β1, β2, β3, β4, β0) | | | | |
| Error | SSE | n−k−1 | MSE | |
| Total | SST | n−1 | | |
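The backward elimination steps can be sketched in plain Python. This is an illustration under stated assumptions, not the course's procedure: variables are dropped one at a time (so r = 1), the partial F for a variable is the increase in SSE when it is removed divided by the MSE of the current model, and the caller supplies the critical values $F_{out}$ by error degrees of freedom. The data are made up, with `x2` constructed to be unrelated to `y`; the two table values are $F_{0.05,1,5} = 6.61$ and $F_{0.05,1,6} = 5.99$.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def sse_of_fit(columns, y):
    """Fit y on an intercept plus the given columns; return the residual SSE."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    XtY = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, XtY)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))


def backward_eliminate(columns, y, f_out):
    """columns: dict name -> values; f_out: error d.f. -> critical F value."""
    kept = list(columns)
    while kept:
        full_sse = sse_of_fit([columns[v] for v in kept], y)
        df_error = len(y) - len(kept) - 1
        mse = full_sse / df_error
        # Partial F: extra SSE incurred by dropping the variable, over MSE.
        partial_f = {
            v: (sse_of_fit([columns[u] for u in kept if u != v], y) - full_sse) / mse
            for v in kept
        }
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] >= f_out(df_error):
            break                  # every remaining variable is significant
        kept.remove(weakest)
    return kept


# Made-up data: y depends on x1 only; x2 is unrelated by construction.
columns = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, 1, -1, -1, 1, 1, -1, -1],
}
y = [5.1, 7.9, 10.9, 14.1, 17.1, 19.9, 22.9, 26.1]
f_table = {5: 6.61, 6: 5.99}       # F(0.05; 1, d.f.) for the d.f. used here
kept = backward_eliminate(columns, y, lambda df: f_table[df])
print(kept)
```

In this constructed case `x2` has a partial F near zero and is removed in the first pass, after which `x1` is clearly significant and the loop stops.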

