OLS Geometry

Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2009
February 3, 2009

Outline
Vector Spaces
OLS and Projections
The FWL Theorem
Applications
Vector Space Geometry
A vector space S is a set together with an addition and a scalar multiplication on S that satisfy certain properties: commutativity, associativity, etc.
The Euclidean space ℝⁿ is the vector space formed by all vectors in ℝⁿ with the usual definitions of vector addition and scalar multiplication.
Actually, we will impose more structure than what is required to form a vector space.
Some Definitions and Notation
Inner product: ⟨x, y⟩ ≡ x′y
Norm: ||x|| ≡ (x′x)^(1/2) = (Σᵢ₌₁ⁿ xᵢ²)^(1/2)
Orthogonality: x and y are orthogonal iff ⟨x, y⟩ = x′y = 0
Linear dependence: x₁, …, xₖ are linearly dependent if there exist xⱼ, 1 ≤ j ≤ k, and coefficients cᵢ such that xⱼ = Σᵢ≠ⱼ cᵢxᵢ
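As a numerical companion to these definitions, here is a minimal NumPy sketch (the vectors x and y are made-up examples, chosen to be orthogonal):

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([-4.0, 3.0])

inner = x @ y                    # inner product <x, y> = x'y
norm_x = np.sqrt(x @ x)          # norm ||x|| = (x'x)^(1/2)

print(inner)                     # 0.0
print(norm_x)                    # 5.0
print(np.isclose(inner, 0.0))    # True: x and y are orthogonal
```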
Vector geometry in ℝ²
Vector representation
Vector addition
Scalar multiplication
Angles, perpendicular and parallel vectors
A vector in ℝ²
Vector addition: the parallelogram rule
Subspaces of the Euclidean Space
A vector subspace is any subset of a vector space that is itself a vector space.
Span: S(x₁, …, xₖ) ≡ {z ∈ ℝⁿ | z = Σᵢ₌₁ᵏ bᵢxᵢ, bᵢ ∈ ℝ} is the Euclidean vector subspace spanned by x₁, …, xₖ, that is, the set of all linear combinations of x₁, …, xₖ.
Alternatively, with X = [x₁ ⋯ xₖ], S(X) ≡ {z ∈ ℝⁿ | z = Xγ} is the subspace generated by the columns of X (the span of X): all vectors that can be formed as linear combinations of the columns of X.
Orthogonal complement: S⊥(X) ≡ {w ∈ ℝⁿ | w′z = 0 for all z ∈ S(X)}: all vectors that are orthogonal to the columns of X.
Basis: a basis of V is a list of linearly independent vectors that spans V.
Dimension: the number of vectors in any basis.
Note: dim S(X) = ρ(X), the rank of X.
Result: for X of order n × k with dim S(X) = k, dim S⊥(X) = n − k.
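This dimension result can be checked numerically; a minimal sketch, assuming a generic full-column-rank X drawn at random, reads dim S(X) off the rank of X and dim S⊥(X) off the singular values of X′ (since S⊥(X) is the null space of X′):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2
X = rng.standard_normal((n, k))          # generic n x k, full column rank (a.s.)

dim_span = np.linalg.matrix_rank(X)      # dim S(X) = rank(X)

# S_perp(X) is the null space of X'; its dimension is n minus
# the number of non-zero singular values of X'
s = np.linalg.svd(X.T, compute_uv=False)
dim_perp = n - int(np.sum(s > 1e-10))

print(dim_span, dim_perp)                # 2 3, i.e. k and n - k
```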
X is a vector in ℝ².
S(X) is the subspace spanned by X and S⊥(X) is its orthogonal complement, each of dimension 1.
Variables and observations in the axes
The goal is to represent the data and the OLS estimator.
We need to change our notion of 'point'. A scatter plot takes every observation as a point.
Now we need to think of Y and the columns of X as K + 1 'points' in ℝⁿ: each column is a point.
Source: Bring, J. (1996), "A Geometric Approach to Compare Variables in a Regression Model," The American Statistician, 50(1), pp. 57-62.
What do you expect to happen with this picture if we add a third person? A fourth?
OLS Geometry
By definition, any point in S(X) can be expressed as Xβ, β ∈ ℝᵏ.
Least squares: given X and Y, find the point in S(X) that is as close as possible to Y.
The problem: min_β ||Y − Xβ|| ⇔ min_β ||Y − Xβ||².
Define: β̂ (the solution to the problem), Ŷ = Xβ̂, e = Y − Ŷ.
Some properties:
e is orthogonal to any point in S(X), in particular to X or Xβ̂.
β̂ = (X′X)⁻¹X′Y, from the orthogonality condition X′(Y − Xβ̂) = 0.
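A quick numerical check of these properties, with simulated data (n, k and the coefficients below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3
X = rng.standard_normal((n, k))
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(n)

# beta_hat = (X'X)^(-1) X'Y, computed via the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

Y_hat = X @ beta_hat
e = Y - Y_hat

# e is orthogonal to the columns of X (and hence to X beta_hat)
print(np.allclose(X.T @ e, 0.0))        # True
print(np.isclose(Y_hat @ e, 0.0))       # True
```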
Projections
A projection is a mapping that takes any point in ℝⁿ into a point in a subspace of ℝⁿ.
An orthogonal projection maps any point into the point of the subspace that is closest to it.
Ŷ = Xβ̂ = X(X′X)⁻¹X′Y = PX Y is the orthogonal projection of Y on S(X). PX = X(X′X)⁻¹X′ is the projection matrix that projects Y orthogonally onto S(X).
e = Y − Ŷ = Y − Xβ̂ = (I − X(X′X)⁻¹X′)Y = MX Y is the projection of Y onto the orthogonal complement of S(X), that is, S⊥(X). MX ≡ I − PX = I − X(X′X)⁻¹X′ is the projection matrix that projects Y orthogonally onto S⊥(X).
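The two projection matrices are easy to build explicitly; a small self-contained sketch (data simulated with NumPy; any Y and full-rank X would do):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 20, 2
X = rng.standard_normal((n, k))
Y = rng.standard_normal(n)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # PX: orthogonal projection onto S(X)
M = np.eye(n) - P                      # MX: projection onto S_perp(X)

Y_hat = P @ Y                          # fitted values
e = M @ Y                              # residuals

print(np.allclose(P + M, np.eye(n)))   # True: Y = PY + MY
print(np.allclose(Y, Y_hat + e))       # True
```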
Properties: easy to check algebraically, better to understand them
geometrically
MX and PX are symmetric matrices.
MX + PX = I. This suggests the orthogonal decomposition
Y = MX Y + PX Y
PX and MX are idempotent: PX PX = PX, MX MX = MX. Intuition: if a vector is already in S(X), projecting it on S(X) again has no effect.
PX MX = 0. Think about what you get from doing first one projection and then the other (in either order). PX and MX 'annihilate' each other; 0 is the only point that belongs to both S(X) and S⊥(X).
MX annihilates any point in S(X): MX Xβ = 0.
PX annihilates any point in S⊥(X): PX MX z = 0 for any z.
If A is a non-singular K × K matrix, PXA = PX.
ρ(X) = ρ(PX)
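All of these properties can be verified in a few lines; the sketch below reuses the explicit PX and MX construction, with a random matrix A standing in for an arbitrary non-singular K × K matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 20, 2
X = rng.standard_normal((n, k))

P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - P

print(np.allclose(P @ P, P), np.allclose(M @ M, M))   # idempotency
print(np.allclose(P @ M, 0.0))                        # PX MX = 0
print(np.allclose(M @ X, 0.0))                        # MX annihilates S(X)

# invariance: XA spans the same subspace as X, so P_{XA} = P_X
A = rng.standard_normal((k, k))                       # non-singular (a.s.)
XA = X @ A
P_XA = XA @ np.linalg.inv(XA.T @ XA) @ XA.T
print(np.allclose(P_XA, P))                           # True
print(np.linalg.matrix_rank(P) == np.linalg.matrix_rank(X))  # True
```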
Goodness of fit
From the orthogonal decomposition
Y = PY + MY.
Then
Y′Y = Y′PY + Y′MY                (1)
    = Y′P′PY + Y′M′MY            (2)
||Y||² = ||PY||² + ||MY||²       (3)
In ℝ² this is simply Pythagoras' theorem. Then
R² = ||PY||² / ||Y||² = cos²θ,
where θ is the angle formed by Y and PY. Actually, this is the uncentered R².
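A quick sketch of the uncentered R² and its cos²θ interpretation, again with simulated data:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 30, 2
X = rng.standard_normal((n, k))
Y = X @ np.array([1.0, 2.0]) + rng.standard_normal(n)

P = X @ np.linalg.inv(X.T @ X) @ X.T
PY = P @ Y

r2_uncentered = (PY @ PY) / (Y @ Y)             # ||PY||^2 / ||Y||^2

# cosine of the angle between Y and PY
cos_theta = (Y @ PY) / (np.linalg.norm(Y) * np.linalg.norm(PY))

print(np.isclose(r2_uncentered, cos_theta**2))  # True
```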
The Frisch-Waugh-Lovell Theorem
Consider the linear model: Y = Xβ + u
and partition it as follows: Y = X₁β₁ + X₂β₂ + u,
where X₁, X₂ are matrices of k₁ and k₂ explanatory variables. Then X = [X₁ X₂], β = (β₁′, β₂′)′ and k = k₁ + k₂.
M₁ ≡ I − X₁(X₁′X₁)⁻¹X₁′ projects any vector in ℝⁿ onto the orthogonal complement of the span of X₁.
Y* ≡ M₁Y and X₂* ≡ M₁X₂ are, respectively, the OLS residuals of regressing Y on X₁, and of regressing each column of X₂ on X₁.
Suppose we are interested in estimating β₂, and consider the following alternative methods:
Method 1: Proceed as usual and regress Y on X, obtaining the OLS estimator β̂ = (β̂₁′, β̂₂′)′ = (X′X)⁻¹X′Y. β̂₂ is the desired estimate.
Method 2: Regress Y* on X₂* and obtain the estimate β̃₂ = (X₂*′X₂*)⁻¹X₂*′Y*.
Let e₁ and e₂ be the residual vectors of the regressions in Methods 1 and 2, respectively.
Theorem (Frisch and Waugh, 1933; Lovell, 1963): β̂₂ = β̃₂ (first part) and e₁ = e₂ (second part).
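The theorem is easy to confirm numerically; a minimal sketch with simulated data (k₁ = 2, k₂ = 1 and the coefficients are made up):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k1, k2 = 100, 2, 1
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, k2))
X = np.hstack([X1, X2])
Y = X @ np.array([1.0, -1.0, 2.0]) + rng.standard_normal(n)

# Method 1: full regression of Y on [X1 X2]
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e1 = Y - X @ beta_hat

# Method 2: partial out X1, then regress residuals on residuals
M1 = np.eye(n) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
Y_star, X2_star = M1 @ Y, M1 @ X2
beta2_tilde = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ Y_star)
e2 = Y_star - X2_star @ beta2_tilde

print(np.allclose(beta_hat[k1:], beta2_tilde))  # first part: same beta_2
print(np.allclose(e1, e2))                      # second part: same residuals
```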
Proof (boring). Start with the orthogonal decomposition:
Y = PY + MY = X₁β̂₁ + X₂β̂₂ + MY
To prove the first part, premultiply by X₂′M₁ to get:
X₂′M₁Y = X₂′M₁X₁β̂₁ + X₂′M₁X₂β̂₂ + X₂′M₁MY
M₁X₁ = 0 (why?)
X₂′M₁M = X₂′M − X₂′P₁M = 0 (for the same reasons as before)
Then: X₂′M₁Y = X₂′M₁X₂β̂₂
So: β̂₂ = (X₂′M₁X₂)⁻¹X₂′M₁Y
To prove the second part, premultiply the orthogonal decomposition by M₁ and obtain:
M₁Y = M₁X₁β̂₁ + M₁X₂β̂₂ + M₁MY
Again, M₁X₁ = 0.
MY belongs to the orthogonal complement of [X₁ X₂], so further projecting it onto the orthogonal complement of X₁ (which is what premultiplying by M₁ does) has no effect; hence M₁MY = MY.
This leaves:
M₁Y − M₁X₂β̂₂ = MY
Y* − X₂*β̂₂ = MY
e₂ = e₁
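The two facts the proof leans on, M₁X₁ = 0 and M₁MY = MY, can also be checked directly; a short sketch with the same kind of simulated data as above, regenerated so the block runs on its own:

```python
import numpy as np

rng = np.random.default_rng(5)
n, k1, k2 = 100, 2, 1
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, k2))
X = np.hstack([X1, X2])
Y = X @ np.array([1.0, -1.0, 2.0]) + rng.standard_normal(n)

M1 = np.eye(n) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(M1 @ X1, 0.0))          # M1 annihilates S(X1)
print(np.allclose(M1 @ (M @ Y), M @ Y))   # MY already lies in S_perp(X1)
```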
Geometric Illustration of FWLT
Comments and Intuitions
Idea of 'controlling for X₁': either put it in the model, or first get rid of it by extracting its effect.
What if X₁ and X₂ are orthogonal?
Applications of the FWLT
Deviations from means
Detrending
Seasonal effects
Later on: multicollinearity, omitted variable bias, panel-data fixed-effects estimation, instrumental variables.
Deviations from means
Simple model with intercept:
Y = Xβ + u = β₁1 + [X₂ X₃ ⋯ X_K]β₋₁ + u,
where 1 ≡ (1, 1, …, 1)′, β₋₁ = (β₂, β₃, …, β_K)′, and Xₖ, k = 2, …, K, are the corresponding columns of X.
Two methods of estimating β₋₁:
Method 1: Regress Y on X = [1 X₂ ⋯ X_K].
Method 2: Get the residuals of projecting Xₖ, k = 2, …, K, on 1; call them Xₖ*. Do the same with Y and call the result Y*.
Note that P₁ = 1(1′1)⁻¹1′ = n⁻¹J, where J is an n × n matrix of ones. Then
P₁Xₖ = (1/n)JXₖ = (X̄ₖ, X̄ₖ, …, X̄ₖ)′,
so Xₖ* = M₁Xₖ = (I − P₁)Xₖ = Xₖ − (X̄ₖ, X̄ₖ, …, X̄ₖ)′, an n × 1 vector with typical element
Xᵢₖ* = Xᵢₖ − X̄ₖ.
So the second method consists in:
1. Re-expressing all variables as deviations from their sample means.
2. Running the standard regression on these 'residuals', without an intercept.
Question: what happens if we forget to re-express Y as deviations from its mean? Generalize this result.
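A sketch of this application with simulated data: the slopes from the demeaned, no-intercept regression match those from the regression with an intercept.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
X = rng.standard_normal((n, 2))                 # non-constant regressors
Y = 3.0 + X @ np.array([1.5, -0.7]) + rng.standard_normal(n)

# Method 1: regression with an intercept
X_full = np.hstack([np.ones((n, 1)), X])
beta_full = np.linalg.solve(X_full.T @ X_full, X_full.T @ Y)

# Method 2: demean everything, regress without an intercept
Xd = X - X.mean(axis=0)
Yd = Y - Y.mean()
beta_dev = np.linalg.solve(Xd.T @ Xd, Xd.T @ Yd)

print(np.allclose(beta_full[1:], beta_dev))     # True: same slope estimates
```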