
A Review of Some Key Linear Models Results

Copyright © 2016 Dan Nettleton (Iowa State University), Statistics 510

A General Linear Model (GLM)

Suppose $y = X\beta + \varepsilon$, where $y \in \mathbb{R}^n$ is the response vector, $X$ is an $n \times p$ matrix of known constants, $\beta \in \mathbb{R}^p$ is an unknown parameter vector, and $\varepsilon$ is a vector of random "errors" satisfying $E(\varepsilon) = 0$.


This GLM says simply that $y$ is a random vector satisfying $E(y) = X\beta$ for some $\beta \in \mathbb{R}^p$.

The distribution of y is left unspecified.

We know only that $y$ is random and its mean is in the column space of $X$; i.e., $E(y) \in \mathcal{C}(X)$.


Ordinary Least Squares (OLS) Estimation of $E(y)$

Because we know $E(y) \in \mathcal{C}(X)$, a natural estimator of $E(y)$ is the Ordinary Least Squares Estimator (OLSE), the unique point in $\mathcal{C}(X)$ that is closest to $y$ in Euclidean distance.

The OLSE of $E(y)$ is given by $\hat{y} \equiv P_X y$, where $P_X = X(X'X)^- X'$, because $P_X y \in \mathcal{C}(X)$ and
$$\|y - P_X y\|^2 < \|y - z\|^2 \quad \forall\, z \in \mathcal{C}(X) \setminus \{P_X y\}.$$
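As a quick numerical sketch of this definition (the design matrix, response values, and use of `np.linalg.pinv` as one particular generalized inverse are all illustrative assumptions, not part of the slides):

```python
import numpy as np

# Toy design matrix (n = 5, p = 2) and response; values are arbitrary.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# P_X = X (X'X)^- X'; np.linalg.pinv supplies the Moore-Penrose
# generalized inverse, which is one valid choice of (X'X)^-.
P_X = X @ np.linalg.pinv(X.T @ X) @ X.T

y_hat = P_X @ y  # OLSE of E(y)

# Cross-check against NumPy's least-squares solver: X b minimizes ||y - X b||.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(y_hat, X @ b)
print(y_hat)
```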


The Orthogonal Projection Matrix

$P_X = X(X'X)^- X'$ is known as the orthogonal projection matrix (a.k.a. the perpendicular projection operator) onto the column space of $X$ and has the following properties (checked numerically in the sketch after this list):

- $P_X$ is symmetric (i.e., $P_X = P_X'$).
- $P_X$ is idempotent (i.e., $P_X P_X = P_X$).
- $P_X X = X$ and $X' P_X = X'$.
- $\mathrm{rank}(X) = \mathrm{rank}(P_X) = \mathrm{tr}(P_X)$.
- $P_X = X(X'X)^- X'$ is the same matrix for all generalized inverses $(X'X)^-$ of $X'X$.
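A minimal numerical check of these properties, under illustrative assumptions: the design matrix below is made up and deliberately rank-deficient (so the generalized inverse matters), and `np.linalg.pinv` again stands in for one choice of $(X'X)^-$:

```python
import numpy as np

# Rank-deficient toy design: third column = first + second, so rank(X) = 2 < p = 3.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 1.0, 2.0]])

P_X = X @ np.linalg.pinv(X.T @ X) @ X.T

assert np.allclose(P_X, P_X.T)          # symmetric
assert np.allclose(P_X @ P_X, P_X)      # idempotent
assert np.allclose(P_X @ X, X)          # P_X X = X
assert np.allclose(X.T @ P_X, X.T)      # X' P_X = X'
# rank(X) = rank(P_X) = tr(P_X)
assert np.linalg.matrix_rank(X) == np.linalg.matrix_rank(P_X) == round(np.trace(P_X))
```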


The OLSE of Estimable Functions of $\beta$

For any $q \times n$ matrix $A$, the OLSE of $A\,E(y) = AX\beta$ is $A P_X y = AX(X'X)^- X' y$.

If there exists a matrix $A$ such that $C = AX$, then $C\beta = AX\beta$ is estimable, and the OLSE of $C\beta = AX\beta$ is denoted $C\hat{\beta}$, where $\hat{\beta}$ is any solution for $b$ in the Normal Equations $X'Xb = X'y$.


Solutions to the Normal Equations

The Normal Equations $X'Xb = X'y$ have $(X'X)^{-1}X'y$ as the unique solution for $b$ if $\mathrm{rank}(X) = p$.

The Normal Equations have infinitely many solutions for $b$ if $\mathrm{rank}(X) < p$.

$\hat{\beta} = (X'X)^- X' y$ is always a solution to the Normal Equations for any $(X'X)^-$, a generalized inverse of $X'X$.


Uniqueness of the OLSE of an Estimable $C\beta$

If $C\beta$ is estimable, then $C\hat{\beta}$ is the same for all solutions $\hat{\beta}$ to the Normal Equations.

In particular, the unique OLSE of $C\beta$ is
$$C\hat{\beta} = C(X'X)^- X' y = AX(X'X)^- X' y = A P_X y, \quad \text{where } C = AX.$$
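This invariance can be illustrated numerically. In the sketch below (the matrices `X`, `y`, and `A` are made-up assumptions), a second solution to the Normal Equations is built by adding a null-space vector of $X$ to the first, and both solutions give the same $C\hat{\beta}$:

```python
import numpy as np
from scipy.linalg import null_space

# Rank-deficient design (third column = first + second), so the
# Normal Equations have infinitely many solutions.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 1.0, 2.0]])
y = np.array([1.0, 1.2, 2.9, 3.1])

b1 = np.linalg.pinv(X.T @ X) @ X.T @ y   # one solution
b2 = b1 + 3.7 * null_space(X)[:, 0]      # another: add a null-space vector of X

# Both solve the Normal Equations X'Xb = X'y ...
assert np.allclose(X.T @ X @ b1, X.T @ y)
assert np.allclose(X.T @ X @ b2, X.T @ y)

# ... and any estimable C beta (C = AX for some A) has the same OLSE.
A = np.array([[1.0, 0.0, -1.0, 0.0]])    # arbitrary 1 x n matrix
C = A @ X
assert np.allclose(C @ b1, C @ b2)
```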


The Gauss-Markov Model (GMM)

Suppose $y = X\beta + \varepsilon$, where $y \in \mathbb{R}^n$ is the response vector, $X$ is an $n \times p$ matrix of known constants, $\beta \in \mathbb{R}^p$ is an unknown parameter vector, and $\varepsilon$ is a vector of random "errors" satisfying $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$ for some unknown variance parameter $\sigma^2 \in \mathbb{R}^+$.


The Gauss-Markov Theorem

The GMM is a special case of the GLM presented previously.

We have added the assumption $\mathrm{Var}(\varepsilon) = \sigma^2 I$; i.e., we assume the errors are uncorrelated and have constant variance.

All the results presented for the GLM hold for the GMM.

For the GMM we have an additional result provided by the Gauss-Markov Theorem: the OLSE of an estimable function $C\beta$ is the Best Linear Unbiased Estimator (BLUE) of $C\beta$, in the sense that the OLSE $C\hat{\beta}$ has the smallest variance among all linear unbiased estimators of $C\beta$.
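A small simulation sketch of the BLUE property (all numbers are illustrative assumptions; the competitor is the "endpoint slope" $(y_n - y_1)/(x_n - x_1)$, another linear unbiased estimator of the slope, chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
beta = np.array([1.0, 2.0])   # true intercept and slope

ols, endpoint = [], []
for _ in range(20000):
    y = X @ beta + rng.normal(size=x.size)
    b = np.linalg.pinv(X.T @ X) @ X.T @ y
    ols.append(b[1])                                  # OLSE of the slope
    endpoint.append((y[-1] - y[0]) / (x[-1] - x[0]))  # competing linear unbiased estimator

# Both are unbiased for the slope, but the OLSE has the smaller variance.
print(np.mean(ols), np.mean(endpoint))   # both near 2.0
print(np.var(ols), np.var(endpoint))     # ~0.1 for OLS vs ~0.125 for the endpoint slope
```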


Unbiased Estimation of $\sigma^2$

An unbiased estimator of $\sigma^2$ under the GMM is given by
$$\hat{\sigma}^2 \equiv \frac{y'(I - P_X)y}{n - r}, \quad \text{where } r = \mathrm{rank}(X).$$

Note that
$$y'(I - P_X)y = y'(I - P_X)'(I - P_X)y = \{(I - P_X)y\}'\{(I - P_X)y\} = \|(I - P_X)y\|^2 = \|y - \hat{y}\|^2 = \text{"Sum of Squared Errors" (SSE)}.$$
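A minimal numerical sketch of this estimator (same illustrative toy data as before; `np.linalg.matrix_rank` supplies $r$):

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

n, _ = X.shape
r = np.linalg.matrix_rank(X)
P_X = X @ np.linalg.pinv(X.T @ X) @ X.T

resid = (np.eye(n) - P_X) @ y   # (I - P_X) y = y - y_hat
sse = resid @ resid             # ||(I - P_X) y||^2, the SSE
sigma2_hat = sse / (n - r)      # unbiased estimator of sigma^2
print(sigma2_hat)
```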


Gauss-Markov Model with Normal Errors (GMMNE)

Suppose $y = X\beta + \varepsilon$, where $y \in \mathbb{R}^n$ is the response vector, $X$ is an $n \times p$ matrix of known constants, $\beta \in \mathbb{R}^p$ is an unknown parameter vector, and $\varepsilon \sim N(0, \sigma^2 I)$ for some unknown variance parameter $\sigma^2 \in \mathbb{R}^+$.


The GMMNE is a special case of the GMM.

We have added the assumption that $\varepsilon$ is multivariate normal.

The GMMNE implies $y \sim N(X\beta, \sigma^2 I)$.

The GMMNE is useful for drawing statistical inferences regarding estimable $C\beta$.


Throughout the remainder of these slides, we will assume the GMMNE holds, $C$ is a $q \times p$ matrix such that $C\beta$ is estimable, $\mathrm{rank}(C) = q$, and $d$ is a known $q \times 1$ vector.

These assumptions imply that $H_0: C\beta = d$ is a testable hypothesis.


The Distribution of $C\hat{\beta}$ and $\hat{\sigma}^2$

- $C\hat{\beta} \sim N(C\beta,\; \sigma^2 C(X'X)^- C')$.
- $\dfrac{(n-r)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-r} \iff \dfrac{\hat{\sigma}^2}{\sigma^2} \sim \dfrac{\chi^2_{n-r}}{n-r} \iff \hat{\sigma}^2 \sim \dfrac{\sigma^2}{n-r}\,\chi^2_{n-r}$.
- $C\hat{\beta}$ and $\hat{\sigma}^2$ are independent.
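These distributional facts can be checked by simulation. In the sketch below (illustrative assumptions about $X$, $\beta$, and $\sigma^2$), the simulated mean and variance of $(n-r)\hat{\sigma}^2/\sigma^2$ should land near the $\chi^2_{n-r}$ moments $n-r$ and $2(n-r)$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
beta, sigma2 = np.array([1.0, 1.0]), 2.0
n, r = X.shape[0], np.linalg.matrix_rank(X)
P_X = X @ np.linalg.pinv(X.T @ X) @ X.T

draws = []
for _ in range(20000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    sigma2_hat = y @ (np.eye(n) - P_X) @ y / (n - r)
    draws.append((n - r) * sigma2_hat / sigma2)

draws = np.array(draws)
print(draws.mean(), draws.var())   # should be close to n - r = 3 and 2(n - r) = 6
```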


The F Test of $H_0: C\beta = d$

The test statistic
$$F \equiv (C\hat{\beta} - d)'\,[\widehat{\mathrm{Var}}(C\hat{\beta})]^{-1}\,(C\hat{\beta} - d)/q = (C\hat{\beta} - d)'\,[\hat{\sigma}^2\, C(X'X)^- C']^{-1}\,(C\hat{\beta} - d)/q = \frac{(C\hat{\beta} - d)'\,[C(X'X)^- C']^{-1}\,(C\hat{\beta} - d)/q}{\hat{\sigma}^2}$$
has a non-central F distribution with non-centrality parameter
$$\frac{(C\beta - d)'\,[C(X'X)^- C']^{-1}\,(C\beta - d)}{2\sigma^2}$$
and degrees of freedom $q$ and $n - r$.


The F Test of $H_0: C\beta = d$ (continued)

The non-negative non-centrality parameter
$$\frac{(C\beta - d)'\,[C(X'X)^- C']^{-1}\,(C\beta - d)}{2\sigma^2}$$
is equal to zero if and only if $H_0: C\beta = d$ is true.

If $H_0: C\beta = d$ is true, the statistic $F$ has a central F distribution with $q$ and $n - r$ degrees of freedom ($F_{q,n-r}$).


The F Test of $H_0: C\beta = d$ (continued)

Thus, to test $H_0: C\beta = d$, we compute the test statistic $F$ and compare the observed value of $F$ to the $F_{q,n-r}$ distribution.

If $F$ is so large that it seems unlikely to have been a draw from the $F_{q,n-r}$ distribution, we reject $H_0$ and conclude $C\beta \neq d$.

The p-value of the test is the probability that a random variable with an $F_{q,n-r}$ distribution would be greater than or equal to the observed value of the test statistic $F$.
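A minimal end-to-end sketch of the F test (the data, $C$, and $d$ are illustrative assumptions; `scipy.stats.f.sf` supplies the upper-tail probability used as the p-value):

```python
import numpy as np
from scipy import stats

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

n, r = X.shape[0], np.linalg.matrix_rank(X)
XtX_ginv = np.linalg.pinv(X.T @ X)
beta_hat = XtX_ginv @ X.T @ y
sigma2_hat = y @ (np.eye(n) - X @ XtX_ginv @ X.T) @ y / (n - r)

# Illustrative testable hypothesis H0: C beta = d (here, slope = 1).
C = np.array([[0.0, 1.0]])
d = np.array([1.0])
q = C.shape[0]

diff = C @ beta_hat - d
F = diff @ np.linalg.inv(C @ XtX_ginv @ C.T) @ diff / q / sigma2_hat
p_value = stats.f.sf(F, q, n - r)   # P(F_{q, n-r} >= observed F)
print(F, p_value)
```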


The t Test of $H_0: c'\beta = d$ for Estimable $c'\beta$

The test statistic
$$t \equiv \frac{c'\hat{\beta} - d}{\sqrt{\widehat{\mathrm{Var}}(c'\hat{\beta})}} = \frac{c'\hat{\beta} - d}{\sqrt{\hat{\sigma}^2\, c'(X'X)^- c}}$$
has a non-central t distribution with non-centrality parameter
$$\frac{c'\beta - d}{\sqrt{\sigma^2\, c'(X'X)^- c}}$$
and $n - r$ degrees of freedom.


The t Test (continued)

The non-centrality parameter
$$\frac{c'\beta - d}{\sqrt{\sigma^2\, c'(X'X)^- c}}$$
is equal to zero if and only if $H_0: c'\beta = d$ is true.

If $H_0: c'\beta = d$ is true, the statistic $t$ has a central t distribution with $n - r$ degrees of freedom ($t_{n-r}$).


The t Test (continued)

Thus, to test $H_0: c'\beta = d$, we compute the test statistic $t$ and compare the observed value of $t$ to the $t_{n-r}$ distribution.

If $t$ is so far from zero that it seems unlikely to have been a draw from the $t_{n-r}$ distribution, we reject $H_0$ and conclude $c'\beta \neq d$.

The p-value of the test is the probability that a random variable with a $t_{n-r}$ distribution would be as far from 0 as, or farther than, the observed value of the t statistic.
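A matching sketch of the t test (again with illustrative data, $c$, and $d$; the two-sided p-value doubles the upper-tail probability from `scipy.stats.t.sf`):

```python
import numpy as np
from scipy import stats

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

n, r = X.shape[0], np.linalg.matrix_rank(X)
XtX_ginv = np.linalg.pinv(X.T @ X)
beta_hat = XtX_ginv @ X.T @ y
sigma2_hat = y @ (np.eye(n) - X @ XtX_ginv @ X.T) @ y / (n - r)

# Illustrative H0: c'beta = d (here, slope = 1).
c = np.array([0.0, 1.0])
d = 1.0

se = np.sqrt(sigma2_hat * c @ XtX_ginv @ c)   # standard error of c'beta_hat
t = (c @ beta_hat - d) / se
p_value = 2 * stats.t.sf(abs(t), n - r)       # two-sided p-value
print(t, p_value)
```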


A $100(1 - \alpha)\%$ Confidence Interval for Estimable $c'\beta$

$$\left[\, c'\hat{\beta} - t_{n-r,\,1-\alpha/2}\sqrt{\hat{\sigma}^2\, c'(X'X)^- c}\;,\;\; c'\hat{\beta} + t_{n-r,\,1-\alpha/2}\sqrt{\hat{\sigma}^2\, c'(X'X)^- c}\,\right]$$

$$c'\hat{\beta} \pm t_{n-r,\,1-\alpha/2}\sqrt{\hat{\sigma}^2\, c'(X'X)^- c}$$

estimate $\pm$ (distribution quantile) $\times$ (standard error of estimate)
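A sketch of the interval on the same illustrative data, with the quantile taken from `scipy.stats.t.ppf`:

```python
import numpy as np
from scipy import stats

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

n, r = X.shape[0], np.linalg.matrix_rank(X)
XtX_ginv = np.linalg.pinv(X.T @ X)
beta_hat = XtX_ginv @ X.T @ y
sigma2_hat = y @ (np.eye(n) - X @ XtX_ginv @ X.T) @ y / (n - r)

c = np.array([0.0, 1.0])      # illustrative estimable c'beta: the slope
alpha = 0.05
quantile = stats.t.ppf(1 - alpha / 2, n - r)
se = np.sqrt(sigma2_hat * c @ XtX_ginv @ c)

estimate = c @ beta_hat
print(estimate - quantile * se, estimate + quantile * se)  # 95% CI endpoints
```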


Form of the t Statistic for Testing $H_0: c'\beta = d$

$$t = \frac{\text{estimate} - d}{\text{standard error of estimate}} = \frac{\text{estimate} - d}{\sqrt{\widehat{\mathrm{Var}}(\text{estimator})}}$$

$$t^2 = \frac{(\text{estimate} - d)^2}{\widehat{\mathrm{Var}}(\text{estimator})} = (\text{estimate} - d)\left[\widehat{\mathrm{Var}}(\text{estimator})\right]^{-1}(\text{estimate} - d)/1$$
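For $q = 1$ the squared t statistic coincides with the F statistic; a minimal sketch confirming this numerically on the illustrative data:

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

n, r = X.shape[0], np.linalg.matrix_rank(X)
XtX_ginv = np.linalg.pinv(X.T @ X)
beta_hat = XtX_ginv @ X.T @ y
sigma2_hat = y @ (np.eye(n) - X @ XtX_ginv @ X.T) @ y / (n - r)

c, d = np.array([0.0, 1.0]), 1.0
t = (c @ beta_hat - d) / np.sqrt(sigma2_hat * c @ XtX_ginv @ c)

C, dvec = c.reshape(1, -1), np.array([d])
diff = C @ beta_hat - dvec
F = (diff @ np.linalg.inv(C @ XtX_ginv @ C.T) @ diff) / sigma2_hat  # q = 1

assert np.isclose(t ** 2, F)   # t^2 equals the q = 1 form of F
```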


Revisiting the F Statistic for Testing $H_0: C\beta = d$

$$F = (\text{estimate} - d)'\left[\widehat{\mathrm{Var}}(\text{estimator})\right]^{-1}(\text{estimate} - d)/q$$
$$= (C\hat{\beta} - d)'\,[\widehat{\mathrm{Var}}(C\hat{\beta})]^{-1}\,(C\hat{\beta} - d)/q$$
$$= (C\hat{\beta} - d)'\,[\hat{\sigma}^2\, C(X'X)^- C']^{-1}\,(C\hat{\beta} - d)/q$$
$$= \frac{(C\hat{\beta} - d)'\,[C(X'X)^- C']^{-1}\,(C\hat{\beta} - d)/q}{\hat{\sigma}^2}$$
